Interaction: Typing

[This won’t make much sense unless you’ve read the original piece first.]
There are at least three issues packed up in this topic:

  • An implementation issue, about whether the middleware even permits the sending or receiving of messages that do not conform precisely to a particular encoding and type/schema.
  • The relationship between the type(s) of messages that can be exchanged and the description of the service.
  • The extent to which message types are accessible, manipulable, and negotiable. Another way of thinking about it is how early or late message types are bound.

[DIGRESSION: For different distributed computing frameworks, we need to distinguish between feasibility (is it possible?) and idiom (is it a natural way of working within the framework?). For example, I can use XML to encode a rigid, unchanging message structure based on a well-known DTD; I can also use Java RMI to exchange opaque blobs that I interpret using some private mechanism; neither of these reflects the natural, idiomatic use of the technology.]
When we talk about “untyped”, we must recognize that it’s a relative term. Eventually there has to be a semantic match: the request must be expressed in a form that the service can interpret, and the response must be comprehensible to the requester. In that sense, the client and the server must structure their messages using mutually compatible formats, schemas, types, whatever.
Some mechanisms (e.g. CORBA, RMI or COM) bind the message type at design time. They depend on the use of software tools which typically generate client and server stub code that is, literally, incapable of handling messages that do not conform to this type. The client cannot generate invalid messages, and the service application logic never even sees invalid messages; they are rejected at a lower level. There are XML-based mechanisms that support this kind of model, using XML simply as an object serialization format.
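To make the design-time model concrete, here is a minimal sketch of a stub/skeleton pair. Everything in it is an illustrative assumption: the names `QuoteStub`, `QuoteSkeleton` and `get_quote` are hypothetical, JSON stands in for whatever wire encoding a real framework would use, and the classes are written by hand where CORBA, RMI or COM tooling would generate them from an interface definition. Note also that Python's type hints are only advisory, whereas a generated stub in a statically typed language would be checked by the compiler.

```python
import json


class QuoteStub:
    """Client side: the typed method is the only way to build a message."""

    def __init__(self, transport):
        # transport is any callable that carries a wire string to the server
        self._transport = transport

    def get_quote(self, symbol: str) -> str:
        # The message layout is fixed here, at "design time".
        wire = json.dumps({"op": "get_quote", "symbol": symbol})
        return self._transport(wire)


class QuoteSkeleton:
    """Server side: rejects non-conforming messages below the application."""

    def __init__(self, impl):
        self._impl = impl  # the application logic

    def dispatch(self, wire: str) -> str:
        try:
            msg = json.loads(wire)
        except ValueError:
            return "REJECTED"
        conforms = (
            isinstance(msg, dict)
            and set(msg) == {"op", "symbol"}
            and msg["op"] == "get_quote"
            and isinstance(msg["symbol"], str)
        )
        if not conforms:
            return "REJECTED"  # the application logic never runs
        return self._impl(msg["symbol"])
```

Wiring `QuoteStub(skeleton.dispatch)` to a skeleton gives an in-process round trip; a message that is handed to `dispatch` directly and lacks the `symbol` field is turned away without the application logic ever being called.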
In some cases it may be desirable to defer type binding to run time. This is particularly true if a service is identified by a persistent identifier (such as a URI) that carries little or no type information. The most common example is a simple XML/HTTP web service: the client sends an XML message via an HTTP POST, and the service parses the message to determine whether it corresponds to a request that it can handle. In simple cases, the service understands only one type, and a request that doesn’t conform is rejected. Semantically this is similar to a classic RPC, although it is likely to be less efficient, and the failure modes are different.
We are not restricted to such simple cases, however. A service may, for example, delegate the request to another, more capable service; or it may invoke a translator to map the request into a form that it can understand. Such examples highlight an assumption that is not found in the RPC world: that type matching is not always a simple black-or-white, true-or-false proposition. This in turn requires [or is that too strong?] that the message be expressed in a language that supports some form of composition.
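The run-time alternative might look something like the following sketch, in which the service inspects each incoming XML message and decides what to do with it. The element names (`quoteRequest`, `getQuote`), the handler and translator tables, and the `delegate` stand-in are all hypothetical; a real service would receive the body from an HTTP POST and would forward delegated requests over the network.

```python
import xml.etree.ElementTree as ET


def handle_quote_request(root):
    # The one message type this service understands natively.
    reply = ET.Element("quote", symbol=root.findtext("symbol", default=""))
    return ET.tostring(reply, encoding="unicode")


def translate_legacy_quote(root):
    # Map an older <getQuote ticker="..."/> form onto the current one.
    new_root = ET.Element("quoteRequest")
    ET.SubElement(new_root, "symbol").text = root.get("ticker")
    return new_root


def delegate(body):
    # Stand-in for forwarding the request to a more capable service.
    return "<delegated/>"


HANDLERS = {"quoteRequest": handle_quote_request}
TRANSLATORS = {"getQuote": translate_legacy_quote}


def service(body: str) -> str:
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return "<fault>malformed XML</fault>"
    # The type is examined at run time, by application-level code.
    if root.tag in TRANSLATORS:
        root = TRANSLATORS[root.tag](root)
    handler = HANDLERS.get(root.tag)
    if handler is None:
        return delegate(body)
    return handler(root)
```

Here a request has four possible outcomes rather than two: handled directly, translated and then handled, delegated, or rejected as malformed. The translator works only because the message format lets the service take a request apart and rebuild it, which is one reading of the composition requirement above.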
[DIGRESSION: I was trying to imagine how one might do this in Java. Given a blob that represented a serialized Java object, one could deserialize it into something like a Java bean that supported introspection so that for each property you could obtain both accessor methods and type information that could be used by a classloader. This feels convoluted, but maybe someone has done it.]
The interface type is also involved in service descriptions. In some cases, such as Jini, a service description is based on the annotated interface type: it makes no sense to talk about discovering a service independent of its type. At the other extreme, a service is simply a network addressable end-point, with no type information specified or even available. And then there’s WSDL, the Web Services Description Language, which is rich enough (or perhaps over-engineered enough) to describe a spectrum of service types. Although the specifications suggest that it can be dynamically interpreted at run time, the complexity and (relative) rigidity of WSDL seem best suited to design-time use. (Curiously, there is no standard way of retrieving the WSDL corresponding to a web service URI, although there are some common practices.)
So why is all this important? Well, as distributed systems scale in various ways – number of services, number of replicated component services, number of cross-domain service dependencies, lifetimes of services, and so forth – there is increasing interest in services that are relatively loosely coupled, with more flexible, less brittle, often asynchronous interactions. (Most of this is additive, not alternative: existing RPC-style services have advantages that are still important.) Maybe it’s a throwback to the Internet mantra of “be liberal in what you accept”, maybe it’s influenced by agent-style anthropomorphism, maybe it’s the result of overloading the simple HTTP protocol to do distributed computing, maybe it’s just a recognition of a world in which version skew is a way of life. In any case, one way in which we decouple these components is by deferring type binding, from design time to run time, and by making the type of a message accessible to the application rather than being hidden below the marshalling and serialization infrastructure.
Is this a dichotomy or a spectrum? I can certainly identify a number of styles which I can order in various ways, so it feels like more than just an XOR. But that’s all for now.