Much data contains numeric values that are enumerations, where each value is associated with a logical string that provides a meaningful symbolic interpretation of it.


This proposal provides an alternative mechanism by introducing a new notion to DFDL: dfdlx:inputTypeCalc and dfdlx:outputTypeCalc, which are analogous to inputValueCalc and outputValueCalc except that they are associated with types rather than elements, and that they compose with preexisting parsing behaviours.


Suppose we have an existing type t1[A] and we want to define a new type t2[A] with the trivial identity transforms. We may do this by defining t2 as a new xsd simpleType with base A, and adding the dfdlx:repType annotation to specify the repType as t1.


Code Block
<xs:simpleType name="t2" dfdlx:repType="t1">
  <xs:restriction base="A" />
</xs:simpleType>




The KeySet-Value transforms are central to the support of enumerations. Abstractly, a KeySet-Value transform is defined by a set of (keyset, canonicalKey, value) tuples, where each canonicalKey is a member of the corresponding keyset, all values are unique, and all keysets are mutually disjoint. The transform is then defined by:

Code Block
data : { (keyset, canonicalKey, value) }
parse(x) = let (keyset, canonicalKey, value) ∈ data such that x ∈ keyset in value
unparse(x) = let (keyset, canonicalKey, value) ∈ data such that x = value in canonicalKey

This behaves similarly to a standard invertible key-value map, except that it is possible for multiple keys to map to the same value, in which case a single key is chosen as the inverse of said value.  
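The definition above can be sketched in a few lines of Python. This is only an illustrative model, not the proposed implementation: the names `Entry`, `check`, `parse`, `unparse`, and `DATA` are hypothetical, and keys are assumed to be integers and values strings.

```python
from typing import FrozenSet, List, Tuple

# One (keyset, canonicalKey, value) tuple per logical value.
Entry = Tuple[FrozenSet[int], int, str]


def check(data: List[Entry]) -> None:
    """Enforce the three constraints stated for a KeySet-Value transform."""
    seen_keys: set = set()
    seen_values: set = set()
    for keyset, canonical_key, value in data:
        if canonical_key not in keyset:
            raise ValueError(f"canonical key {canonical_key} is not in its keyset")
        if value in seen_values:
            raise ValueError(f"duplicate value {value!r}")
        if seen_keys & keyset:
            raise ValueError(f"keysets overlap on {sorted(seen_keys & keyset)}")
        seen_values.add(value)
        seen_keys |= keyset


def parse(data: List[Entry], x: int) -> str:
    """Map any key in a keyset to that keyset's value."""
    for keyset, _canonical_key, value in data:
        if x in keyset:
            return value
    raise ValueError(f"{x} is not a member of any keyset")


def unparse(data: List[Entry], x: str) -> int:
    """Map a value back to the single canonical key chosen as its inverse."""
    for _keyset, canonical_key, value in data:
        if x == value:
            return canonical_key
    raise ValueError(f"{x!r} is not a known value")


# Illustrative data only.
DATA: List[Entry] = [
    (frozenset({0}), 0, "Apple"),
    (frozenset({11, 13, 15}), 11, "Disused"),
]
check(DATA)
```

With this data, any of the keys 11, 13, or 15 parses to "Disused", while "Disused" always unparses to the canonical key 11, so a parse followed by an unparse is not guaranteed to reproduce the original key.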

This is specified in the schema by defining t2[B] as an xsd enumeration of type B. On each enumeration value, we use DFDL annotations to specify one or more keys (or repValues) to associate with it. There are two ways to specify repValues: the dfdlx:repValues annotation is a space-delimited list of values, and the dfdlx:repValueRanges annotation is a space-separated list of integers that is interpreted as “min1 max1 min2 max2 … minN maxN”, representing the union of all intervals [minK, maxK]. The repValue set of t2 is the union of the values specified by these two methods. For example:

Code Block
<xs:simpleType name="fruitEnumType" dfdlx:repType="tns:fruitRepType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Apple" dfdlx:repValues="0" />
    <xs:enumeration value="Banana" dfdlx:repValues="1" />
    <xs:enumeration value="Disused" dfdlx:repValues="11 13 15" />
    <xs:enumeration value="Illegal" dfdlx:repValues="12 14" dfdlx:repValueRanges="3 10 16 255" />
  </xs:restriction>
</xs:simpleType>





The canonical repValue is the first value specified by dfdlx:repValues or, if dfdlx:repValues is not present, the first value specified by dfdlx:repValueRanges.
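As a rough illustration of how the two annotations combine, the following Python sketch expands a repValues string and a repValueRanges string into a single key set and picks the canonical repValue by the rule just stated. The function name `expand_rep_values` is hypothetical, and the sketch assumes integer repValues; the proposal also allows other repTypes for dfdlx:repValues.

```python
from typing import List, Optional, Set, Tuple


def expand_rep_values(rep_values: Optional[str],
                      rep_value_ranges: Optional[str]) -> Tuple[Set[int], int]:
    """Return (repValue set, canonical repValue) for one annotated item."""
    values: List[int] = [int(tok) for tok in (rep_values or "").split()]
    bounds: List[int] = [int(tok) for tok in (rep_value_ranges or "").split()]
    if len(bounds) % 2 != 0:
        raise ValueError("repValueRanges needs an even number of integers")

    keys: Set[int] = set(values)
    # Pair the bounds up as (min1, max1), (min2, max2), ...
    for lo, hi in zip(bounds[0::2], bounds[1::2]):
        keys.update(range(lo, hi + 1))  # intervals are inclusive

    if values:
        canonical = values[0]   # first value of repValues
    elif bounds:
        canonical = bounds[0]   # otherwise first value of repValueRanges
    else:
        raise ValueError("neither repValues nor repValueRanges was given")
    return keys, canonical


# The "Illegal" enumeration value from the example above.
ILLEGAL_KEYS, ILLEGAL_CANONICAL = expand_rep_values("12 14", "3 10 16 255")
```

For the "Illegal" value this yields the keys 12, 14, 3 through 10, and 16 through 255, with 12 as the canonical repValue used on unparse.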

Union Transform

Suppose we have multiple types using a common repType, but with disjoint repValues. For instance, we might have a separate type for negative integers and non-negative integers. We can combine these into a single type using the xsd union construct:

Code Block
<xs:simpleType name="signedInt" dfdlx:repType="…">
  <xs:union memberTypes="negativeInt nonnegativeInt" />
</xs:simpleType>



Here, we require that the repType of every component type match the repType of the parent type. The repValue set of the parent type is the disjoint union of the repValue sets of the child types, and the inputTypeCalc/outputTypeCalc functions are defined piecewise by those of the component types.
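The piecewise definition can be pictured with a small Python sketch. It models only the parse direction (inputTypeCalc), and everything in it is an assumption made for illustration: the `Member` alias, the `union_transform` function, and the tiny repValue sets standing in for the negative and non-negative integer types.

```python
from typing import Callable, List, Set, Tuple

# A member type reduced to (repValue set, inputTypeCalc function).
Member = Tuple[Set[int], Callable[[int], int]]


def union_transform(members: List[Member]) -> Tuple[Set[int], Callable[[int], int]]:
    """Combine member transforms into one transform for the union type."""
    combined: Set[int] = set()
    for rep_values, _calc in members:
        overlap = combined & rep_values
        if overlap:
            raise ValueError(f"member repValue sets overlap on {sorted(overlap)}")
        combined |= rep_values

    def input_type_calc(rep: int) -> int:
        # Piecewise: delegate to whichever member owns this repValue.
        for rep_values, calc in members:
            if rep in rep_values:
                return calc(rep)
        raise ValueError(f"{rep} is not a repValue of any member type")

    return combined, input_type_calc


# Hypothetical stand-ins for negativeInt and nonnegativeInt.
NEGATIVE: Member = ({-2, -1}, lambda rep: rep)
NONNEGATIVE: Member = ({0, 1, 2}, lambda rep: rep)
SIGNED_REP_VALUES, SIGNED_INPUT_CALC = union_transform([NEGATIVE, NONNEGATIVE])
```

Because the member sets are disjoint, at most one member can claim a given repValue, so the order of the members does not affect the result.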


The final type of transform that this proposal considers is one defined by arbitrary DFDL expressions. These expressions are defined by means of explicit dfdlx:inputTypeCalc and dfdlx:outputTypeCalc annotations on the type. In addition, the repValue set must be explicitly defined by placing dfdlx:repValues and/or dfdlx:repValueRanges directly on the type.

Code Block
<xs:simpleType name="fruitLocalType" dfdlx:repType="tns:fruitRepType"
  dfdlx:inputTypeCalc="{ dfdlx:repTypeValue() - 2 }"
  dfdlx:outputTypeCalc="{ dfdlx:logicalTypeValue() + 2 }"
  dfdlx:repValues="12 14"
  dfdlx:repValueRanges="3 10 16 255">
  <xs:restriction base="xs:int" />
</xs:simpleType>



Note that, in the above example, a validator that is not DFDL-aware will mistakenly believe that all integers are legal values. This can be resolved by explicitly specifying the set of logical values using the xsd restriction mechanism:

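Code Block
<xs:simpleType name="fruitLocalType" dfdlx:repType="tns:fruitRepType"
  dfdlx:inputTypeCalc="{ dfdlx:repTypeValue() - 2 }"
  dfdlx:outputTypeCalc="{ dfdlx:logicalTypeValue() + 2 }"
  dfdlx:repValues="12 14"
  dfdlx:repValueRanges="3 10 16 255">
  <xs:union>
    <xs:simpleType>
      <xs:restriction base="xs:int">
        <xs:enumeration value="10"/>
        <xs:enumeration value="12"/>
      </xs:restriction>
    </xs:simpleType>
    <xs:simpleType>
      <xs:restriction base="xs:int">
        <xs:minInclusive value="1"/>
        <xs:maxInclusive value="8"/>
      </xs:restriction>
    </xs:simpleType>
    <xs:simpleType>
      <xs:restriction base="xs:int">
        <xs:minInclusive value="14"/>
        <xs:maxInclusive value="253"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>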

Note that the only effect of adding these restrictions on the logical type is in validation.
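The logical value set that such a restriction has to describe follows mechanically from the repValue set and the inputTypeCalc expression. The Python sketch below works this out for the values used in this example (repValues "12 14", repValueRanges "3 10 16 255", and an inputTypeCalc that subtracts 2). The helper `to_intervals` is hypothetical and exists only to print the result compactly.

```python
from typing import Iterable, List, Tuple


def to_intervals(values: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse a set of ints into sorted, inclusive (min, max) intervals."""
    intervals: List[Tuple[int, int]] = []
    for v in sorted(values):
        if intervals and v == intervals[-1][1] + 1:
            # Extend the current interval by one.
            intervals[-1] = (intervals[-1][0], v)
        else:
            intervals.append((v, v))
    return intervals


# repValues="12 14" together with repValueRanges="3 10 16 255"
rep_values = {12, 14} | set(range(3, 11)) | set(range(16, 256))

# inputTypeCalc from the example: repTypeValue() - 2
logical_values = {rep - 2 for rep in rep_values}

LOGICAL_INTERVALS = to_intervals(logical_values)
print(LOGICAL_INTERVALS)
```

Under these assumptions the logical values are 1 through 8, 10, 12, and 14 through 253, which is the set an xsd restriction would need to express for a validator that is not DFDL-aware.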


As an alternative, we add two annotations to xs:choice: dfdlx:choiceBranchKeyKind and dfdlx:choiceDispatchKeyKind.

When dfdlx:choiceBranchKeyKind is “byType”, each branch of the xs:choice must be a simple element with a transform. The choice then behaves as if each element specified dfdlx:choiceBranchKey as the set of repValues defined by the type of that element.

When dfdlx:choiceDispatchKeyKind is “byType”, we require all choice options to be simple elements that share a common repType. We then parse the repType, and use the resulting simple value as the choiceDispatchKey.
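One way to picture the combined effect of the two “byType” settings is as a lookup table from repValue to branch. The Python sketch below is a simplified model under stated assumptions: the branch names and their repValue sets are hypothetical, and `build_dispatch_table` and `dispatch` are illustrative names rather than part of the proposal.

```python
from typing import Dict, Set


def build_dispatch_table(branches: Dict[str, Set[int]]) -> Dict[int, str]:
    """Treat each branch type's repValue set as its choiceBranchKeys."""
    table: Dict[int, str] = {}
    for name, rep_values in branches.items():
        for key in rep_values:
            if key in table:
                raise ValueError(
                    f"repValue {key} is claimed by both {table[key]} and {name}")
            table[key] = name
    return table


def dispatch(table: Dict[int, str], rep_value: int) -> str:
    """Use the parsed repType value as the choiceDispatchKey."""
    if rep_value not in table:
        raise ValueError(f"no branch accepts repValue {rep_value}")
    return table[rep_value]


# Hypothetical, mutually disjoint repValue sets for three branches.
BRANCHES: Dict[str, Set[int]] = {
    "fruit": {0, 1},
    "localFruit": {3, 4, 5},
    "disused": {11, 13, 15},
}
TABLE = build_dispatch_table(BRANCHES)
```

Building the table up front also makes it easy to detect overlapping repValue sets, since two branches claiming the same key would make the dispatch ambiguous.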

For example:

Code Block
<xs:choice dfdlx:choiceBranchKeyKind="byType" dfdlx:choiceDispatchKeyKind="byType">
  <xs:element name="fruit" type="tns:fruitEnumType"/>
  <xs:element name="localFruit" type="tns:fruitLocalType"/>
  <xs:element name="disused" type="tns:fruitDisusedType"/>
</xs:choice>




Using with explicit raw elements

It may be desirable to include both the raw and logical values in the infoset. Traditionally, this use case has been accomplished using inputValueCalc and outputValueCalc annotations. This remains the case here. To support this use case, we expose the inputTypeCalc/outputTypeCalc functions to the DFDL expression language:

Code Block
<xs:sequence>
  <xs:element name="raw" type="tns:fruitRepType"
    dfdl:outputValueCalc="{ dfdlx:outputTypeCalcInt(tns:fruitEnumType, ../fruit) }"/>
  <xs:element name="fruit" type="tns:fruitEnumType"
    dfdl:inputValueCalc="{ dfdlx:inputTypeCalcString(tns:fruitEnumType, ../raw) }"/>
</xs:sequence>




A more complicated example would be using a raw element with a choice of logical elements:

The only additional mechanism is the dfdlx:outputTypeCalcNextSiblingInt/String functions, which take the value of the following sibling and apply the outputTypeCalc function associated with the element type of the following sibling.

Code Block
<xs:sequence>
  <xs:element name="raw" type="tns:fruitIntType"
    dfdl:outputValueCalc="{ dfdlx:outputTypeCalcNextSiblingInt() }" />
  <xs:choice dfdlx:choiceBranchKeyKind="byType"
    dfdlx:choiceDispatchKey="{ ../raw }">
    <xs:element name="fruit" type="tns:fruitType"
      dfdl:inputValueCalc="{ dfdlx:inputTypeCalc(tns:fruitType, ../raw) }" />
    <xs:element name="localFruit" type="tns:fruitLocalType"
      dfdl:inputValueCalc="{ dfdlx:inputTypeCalc(tns:fruitLocalType, ../raw) }" />
    <xs:element name="disused" type="tns:fruitDisusedType"
      dfdl:inputValueCalc="{ dfdlx:inputTypeCalc(tns:fruitDisusedType, ../raw) }" />
  </xs:choice>
</xs:sequence>





In principle, this could be accomplished more generically by allowing dfdlx:outputTypeCalc to take an arbitrary expression returning a path to a node, along with some form of next-sibling function (to allow for the fact that the next sibling does not have a constant name). However, for ease of implementation, only this more limited structure will be supported by this proposal.
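The behaviour of the next-sibling function on unparse can be modelled roughly as follows. This Python sketch is an assumption-laden illustration, not the proposed implementation: elements are reduced to (name, type name, value) tuples, and the `OUTPUT_TYPE_CALC` table holds hypothetical outputTypeCalc functions for two of the types used in the example.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical outputTypeCalc functions, keyed by type name.
OUTPUT_TYPE_CALC: Dict[str, Callable[[Any], int]] = {
    "fruitType": lambda logical: {"Apple": 0, "Banana": 1}[logical],
    "fruitLocalType": lambda logical: logical + 2,
}

# An infoset element modelled as (name, type name, value).
Element = Tuple[str, str, Any]


def output_type_calc_next_sibling_int(siblings: List[Element], index: int) -> int:
    """Apply the following sibling's outputTypeCalc to that sibling's value."""
    if index + 1 >= len(siblings):
        raise ValueError("there is no following sibling")
    _name, type_name, value = siblings[index + 1]
    # The function is selected by the sibling's type, not by its name,
    # which is why no constant element name is needed.
    return OUTPUT_TYPE_CALC[type_name](value)


# The raw element is at index 0; whichever choice branch is present follows it.
WITH_LOCAL: List[Element] = [("raw", "fruitIntType", None),
                             ("localFruit", "fruitLocalType", 10)]
WITH_FRUIT: List[Element] = [("raw", "fruitIntType", None),
                             ("fruit", "fruitType", "Banana")]
```

In this model the raw value is recomputed from whichever branch of the choice appears in the infoset, so the same expression on the raw element works for every branch.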

Summary of annotations

  • dfdlx:repType
    • Applies to xs:simpleType
    • Defines the representation type associated with the annotated type.
    • On parse, the DFDL processor first parses according to the repType, then applies any conversion specified by the annotated type.
    • On unparse, the DFDL processor first applies the conversion specified by the annotated type, then the unparse behavior specified by the repType
  • dfdlx:choiceBranchKeyKind
    • Applies to xs:choice
    • Values: byType, explicit, speculative, implicit
    • byType
      • Each choice option must be a simple element
      • All choice options must have a type with a common repType
      • The repValue sets of all options must be mutually disjoint
      • The choice dispatch will behave as if the choiceBranchKeys specified by an option were the repValue set of the option's type.
    • Explicit
      • Each choice option must directly specify a choiceBranchKey. These values will be used for direct dispatch
    • Speculative
      • Direct dispatch will not be used. Choice options will be parsed speculatively, and the first non-failing case will be used
      • Requires choiceDispatchKeyKind=speculative
    • Implicit
      • Current behavior
      • If choice options provide explicit choiceBranchKeys, then behave as if we were “explicit”
      • Otherwise, behave as if we were “speculative”
  • dfdlx:choiceDispatchKeyKind
    • Applies to xs:choice
    • Values: byType, explicit, speculative, implicit
    • byType
      • Each choice option must be a simple element
      • All choice options must have a type with a common repType
      • First, parse according to the common repType without consuming any input
      • Then, use the resulting value as the choiceDispatchKey
    • Explicit
      • Gets the choiceDispatchKey from the dfdlx:choiceDispatchKey annotation
    • Speculative
      • Direct dispatch will not be used. Choice options will be parsed speculatively, and the first non-failing case will be used
      • Requires choiceBranchKeyKind=speculative
    • Implicit
      • Current behavior
      • If dfdlx:choiceDispatchKey is present, then behave as if we were explicit
      • Otherwise, behave as if we were speculative


  • dfdlx:inputTypeCalc
    • Applies to xs:simpleType
    • Requires dfdlx:repType to also be present
    • Is a DFDL expression
    • On parse, first parse according to the repType, then set the value of this element to the result of evaluating the dfdlx:inputTypeCalc expression
    • The value of the repType may be accessed by the expression through the dfdlx:repTypeValue() function
  • dfdlx:outputTypeCalc
    • Applies to xs:simpleType
    • Requires dfdlx:repType to also be present
    • Is a DFDL expression
    • On unparse, first evaluate this expression, then unparse according to the repType as if the logical value were the result of evaluating this expression
    • The original logical value of this type may be accessed by the expression through the dfdlx:logicalTypeValue() function
  • dfdlx:repValues
    • Applies to xs:enumeration and xs:simpleType
    • A space separated list of values
    • Values must be of a type consistent with repType
    • When applied to xs:enumeration:
      • Defines a KeySet-Value transform, and associates the annotated enumeration value with the listed keys
      • Adds the listed values to the repValue set of the parent simpleType
    • When applied to xs:simpleType:
      • Adds the listed keys to the repValue set of the annotated type
      • This set will be used by xs:choice when choiceBranchKeyKind=byType
  • dfdlx:repValueRanges
    • Applies to xs:enumeration and xs:simpleType
    • Requires dfdlx:repType to be present and refer to an integer type
    • A space separated list of integers defining ranges of integers
    • Takes the form “min1 max1 min2 max2 … minN maxN”
    • Represents the set of integers described by the union of the intervals [minK, maxK]
    • Behaves as if all members of this set were included in the dfdlx:repValues annotation
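The rules for the “implicit” setting of dfdlx:choiceBranchKeyKind and dfdlx:choiceDispatchKeyKind listed above can be summarised as a small resolution step. The Python sketch below is illustrative only: `resolve_kinds` is a hypothetical name, and it assumes that the requirement pairing the two speculative settings is checked after “implicit” has been resolved, which the summary does not state explicitly.

```python
from typing import Tuple

KINDS = {"byType", "explicit", "speculative", "implicit"}


def resolve_kinds(branch_kind: str, dispatch_kind: str,
                  has_branch_keys: bool, has_dispatch_key: bool) -> Tuple[str, str]:
    """Replace "implicit" with the kind it behaves as, then check consistency."""
    for kind in (branch_kind, dispatch_kind):
        if kind not in KINDS:
            raise ValueError(f"unknown kind {kind!r}")

    if branch_kind == "implicit":
        # Explicit choiceBranchKeys on the options select direct dispatch.
        branch_kind = "explicit" if has_branch_keys else "speculative"
    if dispatch_kind == "implicit":
        # A choiceDispatchKey on the choice selects direct dispatch.
        dispatch_kind = "explicit" if has_dispatch_key else "speculative"

    # Assumption: each speculative setting requires the other to be speculative.
    if (branch_kind == "speculative") != (dispatch_kind == "speculative"):
        raise ValueError("speculative must be used for both kinds or for neither")
    return branch_kind, dispatch_kind
```

For example, a choice with no explicit keys anywhere resolves to speculative parsing for both kinds, which matches the current behavior that “implicit” is meant to preserve.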

Summary of Functions

  • dfdlx:inputTypeCalcInt(f: QName, x:Any)
    • f must be a constant QName resolving to a simpleType with a transform defined
    • The repType of said transform must match the type of x
    • The type f must be a restriction of xs:int
    • Returns the result of applying the inputTypeCalc function associated with f to x
  • dfdlx:inputTypeCalcString(f: QName, x:Any)
    • Behaves like dfdlx:inputTypeCalcInt(f: QName, x:Any) except that f must be a string type, and a string will be returned
  • dfdlx:outputTypeCalcInt(f: QName, x:Any)
    • f must be a constant QName resolving to a simpleType with a transform defined
    • The repType of said transform must be a restriction of xs:int
    • The type f must match the type of x
    • Returns the result of applying the outputTypeCalc function associated with f to x
  • dfdlx:outputTypeCalcString(f: QName, x:Any)
    • Behaves like dfdlx:outputTypeCalcInt(f: QName, x:Any) except that the repType associated with f must be a restriction of xs:string, and a string will be returned
  • dfdlx:outputTypeCalcNextSiblingInt()
    • The following sibling must be a simpleType whose repType is a restriction of xs:int
    • Returns the result of applying the outputTypeCalc function associated with the type of the following element to the value of the following element
  • dfdlx:outputTypeCalcNextSiblingString()
    • Behaves like dfdlx:outputTypeCalcNextSiblingInt, except the repType must be a restriction of xs:string, and a string will be returned
  • dfdlx:repTypeValue()
    • Can only be called from inside dfdlx:inputTypeCalc
    • Returns the value of the underlying repType
  • dfdlx:logicalTypeValue()
    • Can only be called from inside dfdlx:outputTypeCalc
    • Returns the logical value of this element.