Tuscany Databinding Framework

Overview

It is necessary to support the flow of any data type that is supported by both the client and the provider. With the ability to attach data transformation mediations to wires, this actually becomes a requirement to support any data type that can be mapped from client to provider and back again.

In any interchange there are just two things that are defined: the format of data that will be supplied by the client and the format of data that will be consumed by (delivered to) the provider. Neither client nor provider needs to be aware of the format of data on the other end or of what gyrations the fabric went through in order to make the connection. As part of making the connection, it is the fabric's job to make it as efficient as possible, factoring in the semantic meaning of the data, the policies that need to be applied, and what the different containers support.

All this flexibility just about requires we use the most generic type possible to hold the data being exchanged: a java.lang.Object or a (void*) depending on the runtime. The actual instance used would depend on the actual wire, some examples from Java land being:

POJO (for local pass by reference)
SDO (when supplied by the application)
Axiom OMElement (for the Axis2 binding)
StAX XMLStreamReader (for streamed access to an XML infoset)
ObjectInputStream (for cross-classloader serialization) and so forth.

Each container and transport binding just needs to declare which data formats it can support for each endpoint it manages. The wiring framework needs to know about these formats and about what transformations can be engaged in the wire pipeline.

For example, the Axis2 transport may declare that it can support Axiom and StAX for a certain port and the Java container may declare that it can only handle SDOs for an implementation that expects to be passed a DataObject. The wiring framework can resolve this by adding a StAX->SDO transform into the pipeline.
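
The resolution step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class FormatNegotiationSketch, the method resolve and the "from->to" key convention are hypothetical and do not come from the Tuscany code base. It handles only a shared format or a single bridging transform.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: none of these names are Tuscany APIs.
class FormatNegotiationSketch {

    /**
     * Returns the formats the data passes through: one format if both ends
     * share it, two formats if a single known transform bridges them, or
     * empty if neither applies. Transforms are keyed as "from->to".
     */
    static Optional<List<String>> resolve(List<String> sourceFormats,
                                          List<String> targetFormats,
                                          Set<String> transforms) {
        // Prefer a format both ends support: no transformation is needed.
        for (String source : sourceFormats) {
            if (targetFormats.contains(source)) {
                return Optional.of(List.of(source));
            }
        }
        // Otherwise look for one known transform that bridges the two ends.
        for (String source : sourceFormats) {
            for (String target : targetFormats) {
                if (transforms.contains(source + "->" + target)) {
                    return Optional.of(List.of(source, target));
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> axis2Port = List.of("Axiom", "StAX"); // declared by the transport
        List<String> javaImpl = List.of("SDO");            // declared by the container
        Set<String> known = Set.of("StAX->SDO");
        System.out.println(resolve(axis2Port, javaImpl, known));
    }
}
```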

The limitation here is whether a transformation can be constructed to match the formats on either end. If one exists then great, but as the number of formats increases, developing n-squared transforms becomes impractical. A better approach would be to pick the most common formats and require bindings and containers to support those at a minimum, with other point-to-point transforms being added as warranted. (Source: http://mail-archives.apache.org/mod_mbox/ws-tuscany-dev/200603.mbox/%3c4418A53D.2080606@apache.org%3e)

Business data are represented in different ways

SDO
JAXB
JavaBeans
DOM

Different Web Service stacks use different data representations

Axis1 uses DOM
Axis2 uses AXIOM
JAX-WS uses JAXB

Application developers should have the freedom to choose their preferred data representation and components with compatible data should be able to interoperate without the intervention of the business logic

Usage Scenarios

The data model

<schema targetNamespace="http://www.example.com/Customer" xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:cust="http://www.example.com/Customer">
<element name="customer" type="cust:Customer" />
<complexType name="Customer">
<sequence>
<element name="customerId" type="string" />
<element name="name" type="string" />
<element name="billingAddress" type="cust:Address" />
<element name="mailingAddress" type="cust:Address" />
</sequence>
</complexType>
<complexType name="Address">
<sequence>
<element name="street" type="string" />
<element name="city" type="string" />
<element name="state" type="string" />
<element name="zipCode" type="string" />
</sequence>
</complexType>
</schema>

Source Concrete Type	Source Declared Type	Target Concrete Type	Target Declared Type	Support
JAXBCustomer	java.lang.Object	SDOCustomer	java.lang.Object	Y
JAXBCustomer	customer.Customer	SDOCustomer	customer.Customer

> - Databinding technology is not reflected in the service contract but in
> the implementation contract. For example, different component
> implementations may choose different databinding technologies. Currently,
> we pass this information as part of Operation, which is part of the
> ServiceContract.

I can see the following paths below with potentially different data
contracts that require transformations:

1. component1.ref1 --> component2.svc1

    a) component1's implementation contract --> component1.ref1's
ServiceContract
    b) component1.ref1's ServiceContract --> component2.svc1's
ServiceContract
    c) component2.svc1's ServiceContract --> component2's implementation
contract

2. composite.service1 w/ binding1

a) binding1's contract (mandated by the transport/protocol stack, for
example, AXIOM for Axis2) --> composite.service1 ServiceContract (by the
interface definition under <service>)

3. composite.reference1 w/ binding1

a) composite.reference1's ServiceContract (by the interface definition
under the <reference>) --> Composite reference's binding contract (mandated
by the transport/protocol stack)

There are different cases:

1) Case 1: A "weak" interface implemented by a method expecting
databinding-specific data. The implementation has a contract which is not
the same as the ServiceContract for the service.

public interface GenericInterface {
Address getAddress(Customer customer);
}

Both Address and Customer are plain interfaces.

If the implementation code only works off the common interfaces, then no
transformation is required. If the impl code happens to cast the
interface to some hidden contract such as commonj.sdo.DataObject, then we
need the method in the impl class to express such requirements.

Another case is that we provide a componentType file for a POJO component to
indicate that it exposes service using WSDL. Then the ServiceContract for
the POJO component now is a WSDL service contract.

A similar case would be a JavaScript component using interface.java, so
the incoming data should conform to the Java interface. But the
JavaScript code might want to deal with all the data as XMLBeans.

For references and services with bindings, it becomes more obvious to see
the databinding requirement from the binding contracts. For example, the
binding.axis2 would only consume and provide data in AXIOM. The databinding
information will be provided by binding extensions and set to the binding
metadata.

2) Case 2: Two remotable interfaces with different databindings for the
reference and target service

Let's assume there are two remotable interfaces generated from the same WSDL
under two different databindings (SDO vs. JAXB):

public interface JAXBInterface {
JAXBAddress getAddress(JAXBCustomer customer);
}

public interface SDOInterface {
SDOAddress getAddress(SDOCustomer customer);
}

We now have two components: Component1 is implemented using SDO while
Component2 is implemented using JAXB. Component1 has a reference "ref1"
typed by SDOInterface (because component1 will use SDO data for the outbound
service call) while Component2 has a service "svc1" typed by JAXBInterface
(because component2 only accepts JAXB data).

Should we support the wiring from Component1.ref1-->Component2.svc1? (I
think it's reasonable, as the two interfaces can be mapped against each other
because both are representations of the same WSDL portType using different
databindings.)

Operation-level transformations

Operation:

InputType

OutputType

FaultTypes

Parameter-level transformations

Logical Type vs. Physical Type

The runtime's main job is to connect user components together so typically the actual type used would be determined by the user code that implements the source or target. The databinding framework's role here is to convert from the type used by the source to the type used by the target. The internal types used by the runtime should not influence this - which is an essential separation to maintain given the components and the wire connecting them need to work on different runtimes (implemented in different languages).

Where runtime types do matter is in the conversion between some serialized form and an in-memory representation and the two places where that occurs are in the configuration properties and in the binding implementations. To handle configuration properties (with the XPath requirement) we use DOM in the Java runtime; I believe the C++ runtime uses SDO. Each transport binding also tends to deserialize using a specific technology - for example, AXIOM for Axis2, JAXB for JAX-WS, Serializable for RMI and so the databinding framework is used to convert between the form generated by the binding and the form used by the component.

The logical type represents the data type the user thinks is flowing across a wire. This could be a Java type, an XML type, a CORBA type, or something else, depending on the /logical/ service contract defined in the assembly.

The physical type is the actual representation of that type that is flowed by the runtime. In the Java runtime this will always be a Java type (i.e. some subclass of Object). In some cases it will be the same as the logical type - e.g. when a Java component calls another Java component over a local wire using a Java interface then both logical and physical types will be the same. In many cases though they will be different - for example, if the service contract was WSDL then the logical type would be the XML type used by the WSDL.

Within the runtime the same logical type may have different physical forms. For example, the same XML document could be represented physically as a DOM, a StAX stream, an SDO, a JAXB object, or an AXIOM stream. The framework supports conversion between these different physical forms.
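
As a small illustration of one logical type having several physical forms, the sketch below reads the same XML text once as a DOM tree and once as a StAX stream, using only the JDK's XML APIs. The class name PhysicalFormsSketch and the sample document are assumptions made for this example. Namespaces are omitted for brevity, and the code does not go through the Tuscany framework.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: one logical customer document, two physical forms.
class PhysicalFormsSketch {

    static final String XML =
        "<customer><customerId>42</customerId><name>Jane</name></customer>";

    /** Physical form 1: a DOM tree held fully in memory. */
    static String nameViaDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("name").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read XML as DOM", e);
        }
    }

    /** Physical form 2: a StAX stream consumed one event at a time. */
    static String nameViaStax(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "name".equals(reader.getLocalName())) {
                        return reader.getElementText();
                    }
                }
                return null; // no <name> element found
            } finally {
                reader.close();
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read XML as StAX", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameViaDom(XML));
        System.out.println(nameViaStax(XML));
    }
}
```

Both methods answer the same logical question about the customer; only the in-memory representation differs.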

1. A component (A) consumes a service provided by another component (B). The implementation of A prefers SDO while the implementation of B prefers JAXB.

In SCA terms, A is wired to B using a reference.

Data is represented by an interface which is independent of the databinding
Data is represented by an interface or class which is databinding-specific (either generated or dynamic)

2. A component (A) consumes a web service using Axis2. The Axis2 engine expects to handle AXIOM objects.

3. A component is exposed as a service over a transport/protocol.

Interfaces for services and references are the contracts for SCA assembly.

1. Incoming data for component implementation

2. Interface mapping

interface.java <--> interface.java
interface.wsdl <--> interface.java
interface.wsdl <--> interface.wsdl
interface.* <--> other

Mapping from interface.wsdl to interface.java

JaxbAddress getAddress(JaxbCustomer customer)

SdoAddress getAddress(SdoCustomer customer)

What's a databinding?

A databinding represents a specific data format in the Tuscany runtime. Each databinding has a unique name which identifies the data format.

Typical databindings

XML/Java databinding frameworks
- SDO
- JAXB
- XMLBeans
- Castor
- Axiom
- JavaBeans
XML Parsing Technologies
- SAX (InputSource, ContentHandler)
- DOM (Node)
- StAX (XMLStreamReader/XMLStreamWriter/XMLEventReader/XMLEventWriter)
I/O
- InputStream/OutputStream
- Reader/Writer

What's a transformer?

A transformer converts data from a source databinding to a target databinding. Each transformer declares a weight that models the cost of the transformation.

Usage Scenarios

Data transformation between SCA components

Data transformation can be performed on wire for mappable and remotable interfaces
Data transformation can happen between interfaces defined using different IDLs such as Java or WSDL.

Data transformation for composite services

<interface.xxx> defines the outbound service contract (SC2) which can be wired to a target component, reference or service (SC3).
<binding.xxx> can optionally hint a service contract for the inbound data from the binding protocol layer.

Data transformation for composite references

<interface.xxx> defines the inbound service contract (SC2) which can be wired from a source component, reference or service (SC1).
<binding.xxx> can optionally hint a service contract (SC3) for the outbound data to the binding protocol layer.

Data transformation for property values

Property values are loaded from SCDLs as DOM documents
The DOM Document can be transformed into a Java object under a databinding, such as SDO or JAXB, so that the component implementation code can work with the databinding directly instead of DOM.
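
To make the idea concrete, here is a minimal sketch of turning a DOM property value into a plain Java object. It is not how Tuscany does it: the Address class, loadProperty and toAddress are hypothetical names. The hand-written mapping stands in for what a databinding such as SDO or JAXB would do generically.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: a property value loaded as DOM, then mapped to a Java object.
class PropertyValueSketch {

    /** Hypothetical Java type that the component implementation works with. */
    static final class Address {
        final String city;
        final String zipCode;

        Address(String city, String zipCode) {
            this.city = city;
            this.zipCode = zipCode;
        }
    }

    /** Stands in for the runtime loading a property value from SCDL as DOM. */
    static Element loadProperty(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Property value is not well-formed XML", e);
        }
    }

    /** Hand-written DOM-to-object mapping; a real databinding would do this generically. */
    static Address toAddress(Element element) {
        return new Address(childText(element, "city"), childText(element, "zipCode"));
    }

    private static String childText(Element parent, String name) {
        NodeList matches = parent.getElementsByTagName(name);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
    }

    public static void main(String[] args) {
        Element dom = loadProperty(
            "<address><street>1 Main St</street><city>Florence</city>"
            + "<state>FI</state><zipCode>50100</zipCode></address>");
        Address address = toAddress(dom);
        System.out.println(address.city + " " + address.zipCode);
    }
}
```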

Data Transformations

How to transform data across databindings

A databinding is a terminal for the data transformation
Three types of databindings depending on how the data is represented by the databinding
- Some databindings can feed the data for consumption
- Some databindings serve as a sink to receive data
- Some databindings can bridge the sink so that data coming into the sink can be consumed by others

Scenario 1: Source -> Source
Scenario 2: Source -> Sink
Scenario 3: Sink -> Source (Pipe)

How to use databindings?

Declare the data binding for the interfaces

Data binding requirements can be expressed as:
- SCDL extension

<interface.wsdl ...>
<db:databinding xmlns:db="http://tuscany.apache.org/xmlns/sca/databinding/1.0" name="commonj.sdo.DataObject" />
</interface.wsdl>

Java annotations for a remotable interface

@DataType can be applied to remotable interfaces at type and method level

@Remotable
@DataType(name="org.w3c.dom.Node")
public interface Interface2 {
    Node call(Node msg);

    @DataType(name="javax.xml.stream.XMLStreamReader")
    XMLStreamReader call1(XMLStreamReader msg);
}
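
The following sketch shows one way such annotations could be read back with reflection. Everything in it is an assumption made for illustration: the nested DataType annotation is a stand-in with a single name attribute, and the rule that a method-level annotation takes precedence over the type-level one is a guess about the intended semantics, not confirmed behaviour of the Tuscany runtime.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical sketch: stand-in annotation and lookup, not the Tuscany implementation.
class DataTypeLookupSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    @interface DataType {
        String name();
    }

    @DataType(name = "org.w3c.dom.Node")
    interface Interface2 {
        Object call(Object msg);

        @DataType(name = "javax.xml.stream.XMLStreamReader")
        Object call1(Object msg);
    }

    /**
     * Returns the databinding name for a method: the method-level annotation
     * if present (assumed precedence), otherwise the type-level one, otherwise null.
     */
    static String databindingFor(Class<?> iface, String methodName) {
        for (Method method : iface.getMethods()) {
            if (method.getName().equals(methodName)) {
                DataType onMethod = method.getAnnotation(DataType.class);
                if (onMethod != null) {
                    return onMethod.name();
                }
                DataType onType = iface.getAnnotation(DataType.class);
                return onType == null ? null : onType.name();
            }
        }
        throw new IllegalArgumentException("No such method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(databindingFor(Interface2.class, "call"));
        System.out.println(databindingFor(Interface2.class, "call1"));
    }
}
```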

What's behind the magic?

Load and build the data binding metadata

The XML or annotation is loaded/processed as the DataBindingDefinition model object
The builder will resolve the DataBindingDefinition and create DataBinding as the runtime context for transformation

Databinding transformer graph

The algorithm to calculate the transformation path

The transformers are registered and selected using the following algorithm.
- The data transformation capabilities for various databindings can be nicely modeled as a weighted, directed graph with the following rules. (Illustrated in the attached diagram).
- Each databinding is mapped to a vertex.
- If databinding A can be transformed to databinding B, then an edge will be added from vertex A to vertex B.
- The weight of the edge is the cost of the transformation from the source to the sink.
In the data interceptor on the wire, if we find out that the data needs to be transformed from databinding A to databinding E, we can apply Dijkstra's Shortest Path Algorithm to the graph and find the most efficient path. It can be A->E, or A->C->E, depending on the weights. If no path can be found, then the data cannot be mediated.
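
The path calculation can be sketched as below. This is a minimal, self-contained illustration of Dijkstra's algorithm over a weighted, directed graph of databinding names. The class TransformerGraphSketch and its methods are hypothetical and are not the registry used by the runtime. Weights are assumed to be non-negative, as the algorithm requires.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of the transformer graph; not the Tuscany registry.
class TransformerGraphSketch {

    // Adjacency map: source databinding -> (target databinding -> weight).
    private final Map<String, Map<String, Integer>> edges = new HashMap<>();

    /** Adds an edge for a transformer from source to target with the given cost. */
    void addTransformer(String source, String target, int weight) {
        edges.computeIfAbsent(source, key -> new HashMap<>()).put(target, weight);
    }

    /** Returns the cheapest chain of databindings, or an empty list if there is none. */
    List<String> shortestPath(String source, String target) {
        Map<String, Integer> cost = new HashMap<>();
        Map<String, String> previous = new HashMap<>();
        PriorityQueue<Map.Entry<String, Integer>> queue =
            new PriorityQueue<>(Map.Entry.<String, Integer>comparingByValue());
        cost.put(source, 0);
        queue.add(new AbstractMap.SimpleEntry<>(source, 0));

        while (!queue.isEmpty()) {
            Map.Entry<String, Integer> current = queue.poll();
            String vertex = current.getKey();
            if (current.getValue() > cost.get(vertex)) {
                continue; // stale entry: a cheaper route was already found
            }
            if (vertex.equals(target)) {
                break; // the cheapest route to the target is settled
            }
            Map<String, Integer> outgoing =
                edges.getOrDefault(vertex, Collections.<String, Integer>emptyMap());
            for (Map.Entry<String, Integer> edge : outgoing.entrySet()) {
                int candidate = current.getValue() + edge.getValue();
                if (candidate < cost.getOrDefault(edge.getKey(), Integer.MAX_VALUE)) {
                    cost.put(edge.getKey(), candidate);
                    previous.put(edge.getKey(), vertex);
                    queue.add(new AbstractMap.SimpleEntry<>(edge.getKey(), candidate));
                }
            }
        }

        if (!cost.containsKey(target)) {
            return Collections.emptyList(); // the data cannot be mediated
        }
        LinkedList<String> path = new LinkedList<>();
        for (String v = target; v != null; v = previous.get(v)) {
            path.addFirst(v);
        }
        return path;
    }

    public static void main(String[] args) {
        TransformerGraphSketch graph = new TransformerGraphSketch();
        graph.addTransformer("A", "E", 100); // direct but expensive
        graph.addTransformer("A", "C", 10);
        graph.addTransformer("C", "E", 20);
        System.out.println(graph.shortestPath("A", "E"));
    }
}
```

With these example weights the two-hop route through C costs 30 and is preferred over the direct transformer with weight 100.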
Transform data

Direct transformation

Multi-hop transformation

Deal with different IDLs

SCA allows the interfaces to be defined using various IDLs, for example, a Java interface or a WSDL portType
IDLs may have different ways to represent the input/output/fault data
The databinding framework is designed to support the transformation across IDLs
Some special databindings are internally used for this purpose:
- idl:input
- idl:output

Dealing with WSDL/XSD based IDLs

WrapperHandler
- Provide WrapperStyle WSDL wrapping/unwrapping support

SimpleTypeMapper: convert data between XSD simple types (as represented by the databinding, for example, OMElement with an OMText child) and Java objects
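
As a rough illustration of what wrapping and unwrapping mean for a wrapper-style operation, the sketch below uses plain DOM. It is not the WrapperHandler SPI: the class WrapperSketch and its methods are hypothetical, and namespaces are ignored. It assumes the wrapper element is simply named after the operation and holds the parameters as child elements.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch of wrapper-style wrapping/unwrapping with DOM.
class WrapperSketch {

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    /** Wraps the parameter elements in one element named after the operation. */
    static Element wrap(Document doc, String operationName, List<Element> parameters) {
        Element wrapper = doc.createElement(operationName);
        for (Element parameter : parameters) {
            wrapper.appendChild(parameter); // parameters must be created by the same doc
        }
        return wrapper;
    }

    /** Unwraps the child elements of the wrapper, in document order. */
    static List<Element> unwrap(Element wrapper) {
        List<Element> children = new ArrayList<>();
        for (Node n = wrapper.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add((Element) n);
            }
        }
        return children;
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element customer = doc.createElement("customer");
        Element wrapper = wrap(doc, "getAddress", List.of(customer));
        System.out.println(wrapper.getNodeName() + " wraps " + unwrap(wrapper).size() + " parameter(s)");
    }
}
```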

Extend the databinding framework

What can be extended?

The Tuscany databinding framework can be extended in two ways:

1. Add more databinding providers to support new formats to represent business data

2. Add more transformers to facilitate the data exchange across databindings

How to contribute a new databinding or transformer?

Databindings and transformers can be plugged into the Tuscany runtime following the Tuscany extensibility story. This can be achieved in the following steps:

1. Provide a Java class which implements the DataBinding interface. You can subclass the DataBindingExtension.
2. Provide a Java class which implements the Transformer interface. You can subclass the TransformerExtension.
3. Register your databindings and transformers as system components in the extension composite.

The DataBinding SPI:

/**
 * DataBinding represents a data representation, for example, SDO, JAXB and AXIOM
 */
public interface DataBinding {
    /**
     * A special databinding for input message of an operation
     */
    String IDL_INPUT = "idl:input";
    /**
     * A special databinding for output message of an operation
     */
    String IDL_OUTPUT = "idl:output";
    /**
     * A special databinding for fault message of an operation
     */
    String IDL_FAULT = "idl:fault";
    /**
     * The name of a databinding should be case-insensitive and unique
     * 
     * @return The name of the databinding
     */
    String getName();
    
    /**
     * Get the aliases for the databinding
     * 
     * @return An array of aliases
     */
    String[] getAliases();

    /**
     * Introspect and populate information to a DataType model
     * 
     * @param dataType The DataType model to be introspected and populated
     * @param annotations The java annotations
     * @return true if the databinding has recognized the given data type
     */
    boolean introspect(DataType dataType, Annotation[] annotations);

    /**
     * Introspect the data to figure out the corresponding data type
     * 
     * @param value The object to be checked
     * @return The DataType or null if the java type is not supported by this databinding
     */
    DataType introspect(Object value);

    /**
     * Provide a WrapperHandler for this databinding
     * @return A wrapper handler which can handle wrapping/unwrapping for this databinding
     */
    WrapperHandler getWrapperHandler();

    /**
     * Make a copy of the object for "pass-by-value" semantics
     * @param object The object to copy
     * @return copy of the object passed in as argument
     */
    Object copy(Object object);
    
    /**
     * Get the type mapper for simple types
     * @return The databinding-specific simple type mapper
     */
    SimpleTypeMapper getSimpleTypeMapper();
    
    /**
     * Get the handler that can handle exceptions/faults in the
     * databinding-specific way
     * 
     * @return An instance of the exception handler
     */
    ExceptionHandler getExceptionHandler();
}
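
For the copy operation, one possible approach for serializable Java objects is a round trip through Java serialization, which yields an independent deep copy. The sketch below is an assumption about how a databinding might provide "pass-by-value" semantics, not the implementation used by any Tuscany databinding. CopySketch is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pass-by-value copy via Java serialization.
class CopySketch {

    /** Serializes the object to bytes and reads it back as a new instance. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Object could not be copied", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(List.of("street", "city"));
        ArrayList<String> duplicate = copy(original);
        duplicate.add("state"); // changing the copy leaves the original untouched
        System.out.println(original.size() + " vs " + duplicate.size());
    }
}
```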

Transformer SPI

/**
 * A transformer provides the data transformation from source type to target type. The cost of the transformation is
 * modeled as weight.
 */
public interface Transformer {
    /**
     * Get the source type that this transformer transforms data from. The type is used as the key when the transformer
     * is registered with TransformerRegistry.
     *
     * @return A key identifying the source type
     */
    String getSourceDataBinding();

    /**
     * Get the target type that this transformer transforms data into. The type is used as the key when the transformer
     * is registered with TransformerRegistry.
     *
     * @return A key identifying the target type
     */
    String getTargetDataBinding();

    /**
     * Get the cost of the transformation. The weight can be used to choose the most efficient path if there are more
     * than one available from the source to the target.
     *
     * @return An integer representing the cost of the transformation
     */
    int getWeight();
}


/**
 * PullTransformer transforms data from one binding format to the other one which can be directly consumed
 *
 * @param <S> The source data type
 * @param <R> the target data type
 */
public interface PullTransformer<S, R> extends Transformer {
    /**
     * Transform source data into the result type.
     *
     * @param source The source data
     * @param context The context for the transformation
     * @return The transformed result
     */
    R transform(S source, TransformationContext context);
}
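
A PullTransformer implementation might look like the sketch below, which parses an XML string into a DOM node. To keep the example self-contained, it declares simplified local copies of the interfaces shown above and an empty stand-in for TransformationContext. The class String2Node, the databinding name "java.lang.String" and the weight value are assumptions chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch: local stand-ins for the SPI plus one example transformer.
class String2NodeSketch {

    /** Empty stand-in; the real context type is not shown in this document. */
    interface TransformationContext {
    }

    interface Transformer {
        String getSourceDataBinding();
        String getTargetDataBinding();
        int getWeight();
    }

    interface PullTransformer<S, R> extends Transformer {
        R transform(S source, TransformationContext context);
    }

    /** Transforms an XML string into the root element of a DOM document. */
    static final class String2Node implements PullTransformer<String, Node> {
        public String getSourceDataBinding() {
            return "java.lang.String"; // assumed databinding name
        }

        public String getTargetDataBinding() {
            return "org.w3c.dom.Node";
        }

        public int getWeight() {
            return 40; // arbitrary cost, chosen only for the example
        }

        public Node transform(String source, TransformationContext context) {
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(source)))
                    .getDocumentElement();
            } catch (Exception e) {
                throw new IllegalArgumentException("Source is not well-formed XML", e);
            }
        }
    }

    public static void main(String[] args) {
        Node node = new String2Node().transform("<name>Jane</name>", null);
        System.out.println(node.getNodeName() + " = " + node.getTextContent());
    }
}
```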

Register databindings and transformers

/**
 * Module activator for AXIOM databinding
 * 
 * @version $Rev: 529327 $ $Date: 2007-04-16 10:10:43 -0700 (Mon, 16 Apr 2007) $
 */
public class AxiomDataBindingModuleActivator implements ModuleActivator {

    public Map<Class, Object> getExtensionPoints() {
        return null;
    }

    public void start(ExtensionPointRegistry registry) {
        DataBindingExtensionPoint dataBindingRegistry = registry.getExtensionPoint(DataBindingExtensionPoint.class);
        dataBindingRegistry.register(new AxiomDataBinding());

        TransformerExtensionPoint transformerRegistry = registry.getExtensionPoint(TransformerExtensionPoint.class);
        transformerRegistry.registerTransformer(new Object2OMElement());
        transformerRegistry.registerTransformer(new OMElement2Object());
        transformerRegistry.registerTransformer(new OMElement2String());
        transformerRegistry.registerTransformer(new OMElement2XMLStreamReader());
        transformerRegistry.registerTransformer(new String2OMElement());
        transformerRegistry.registerTransformer(new XMLStreamReader2OMElement());
    }

    public void stop(ExtensionPointRegistry registry) {
    }

}
