...

Code Block
<xs:simpleType name="vehicleType" dfdlx:repType="tns:uint3">
   <xs:restriction base="xs:string">
     <xs:enumeration value="NoStatement" dfdlx:repValues="0"/> 
     <xs:enumeration value="truck"       dfdlx:repValues="1" />
     <xs:enumeration value="suv"         dfdlx:repValues="2" />
     <xs:enumeration value="bus"         dfdlx:repValues="3" />
     <xs:enumeration value="train"       dfdlx:repValues="4" />
     <xs:enumeration value="car"         dfdlx:repValues="5" />
     <!-- ILLEGAL 6 -->
     <!-- ILLEGAL 7 -->
  </xs:restriction>
</xs:simpleType>
 
<xs:simpleType name="uint3" dfdl:length="3" dfdl:lengthUnits="bits">
  <xs:restriction base="xs:unsignedInt"/>
</xs:simpleType>

...

Note that there must be one enumeration value for every possible value of the representation type integer. Otherwise, an attempt to map an unmapped integer (when parsing) or an unmapped string (when unparsing) is a processing error.

The comments in the schema above noting that values 6 and 7 are illegal are not visible to Daffodil. They are merely comments.

An attempt to parse an unmapped integer should produce the string ILLEGAL_N, where N is the unmapped integer value. Such a value is considered well-formed, but because it does not match any of the allowed enumeration values, it is considered invalid.

The prefix string ("ILLEGAL" in the example above) should be configurable via a Daffodil tunable, with "ILLEGAL" as the default value.

Actual enumeration values named "ILLEGAL", or prefixed that way, are not an error.

Unparsing requires valid data input. Hence an attempt to unparse a string such as "ILLEGAL_7" fails with a processing error.
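The proposed parse and unparse behavior described above can be illustrated with a minimal Python sketch. This is not Daffodil code: the names `parse_rep`, `unparse_string`, `is_valid`, and `ProcessingError` are hypothetical, the `illegal_prefix` parameter stands in for the proposed tunable, and the sketch assumes that integers 6 and 7 have no mapping.

```python
class ProcessingError(Exception):
    """Hypothetical stand-in for a DFDL processing error."""


# Mapping from the vehicleType example; 6 and 7 are assumed to be unmapped.
REP_TO_STRING = {
    0: "NoStatement",
    1: "truck",
    2: "suv",
    3: "bus",
    4: "train",
    5: "car",
}
STRING_TO_REP = {s: n for n, s in REP_TO_STRING.items()}


def parse_rep(n, illegal_prefix="ILLEGAL"):
    """Map a representation integer to its logical string.

    An unmapped integer yields '<prefix>_<n>', which is well-formed
    but is not one of the declared enumeration values.
    """
    return REP_TO_STRING.get(n, f"{illegal_prefix}_{n}")


def is_valid(value):
    """Validation check: the string must be a declared enumeration value."""
    return value in STRING_TO_REP


def unparse_string(value):
    """Map a logical string back to its integer; unmapped strings are errors."""
    try:
        return STRING_TO_REP[value]
    except KeyError:
        raise ProcessingError(f"no repValues mapping for {value!r}") from None
```

In this sketch, parsing the integer 7 succeeds and yields "ILLEGAL_7", validation then rejects that string, and unparsing "ILLEGAL_7" raises the error.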

All the dfdlx:repValues integers must be distinct.

...

All the enumeration strings must be distinct.

Only mappings between non-negative integers and strings are supported (integer to string when parsing, string to integer when unparsing).

The upper bound on the size of the enumerations is 16 bits (64K enumeration values), but may be enlarged in the future if needed. (The largest known as of this writing is 4096 entries, or 12 bits.)

The runtime implementation uses an array to map from integers to strings, and a hash table-like technique to map from strings back to integers, so as to achieve constant time for parse and unparse for each such enumerated-value element. 
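As a rough illustration of the lookup structure just described, the following Python sketch builds an array indexed by the representation integer and a hash table keyed by the string. It is only a sketch under stated assumptions, not the Daffodil runtime: `build_tables`, `int_to_string`, and `MAX_REP_BITS` are hypothetical names, and the checks simply restate the distinctness, non-negativity, and 16-bit constraints from the text.

```python
MAX_REP_BITS = 16  # upper bound stated in the text (64K enumeration values)


def build_tables(pairs):
    """Build (array, dict) lookup tables from (string, integer) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one enumeration value is required")
    strings = [s for s, _ in pairs]
    reps = [n for _, n in pairs]
    if len(set(strings)) != len(strings):
        raise ValueError("enumeration strings must be distinct")
    if len(set(reps)) != len(reps):
        raise ValueError("repValues integers must be distinct")
    if any(n < 0 or n >= (1 << MAX_REP_BITS) for n in reps):
        raise ValueError("repValues must be non-negative and fit in 16 bits")
    # Array for integer -> string; unmapped slots stay None.
    by_int = [None] * (max(reps) + 1)
    for s, n in pairs:
        by_int[n] = s
    # Hash table for string -> integer.
    by_string = {s: n for s, n in pairs}
    return by_int, by_string


def int_to_string(by_int, n):
    """Constant-time array lookup; returns None for an unmapped integer."""
    return by_int[n] if 0 <= n < len(by_int) else None


VEHICLES = [
    ("NoStatement", 0),
    ("truck", 1),
    ("suv", 2),
    ("bus", 3),
    ("train", 4),
    ("car", 5),
]
BY_INT, BY_STRING = build_tables(VEHICLES)
```

Both directions are then a single index or hash lookup per element, for example `int_to_string(BY_INT, 3)` and `BY_STRING["bus"]`.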

Implementation Note:

This feature is already implemented as of Daffodil 2.4.0, with the following exceptions:

  • the ability for the dfdlx:repValues property to be omitted and its value implied
    • Schemas should always specify the dfdlx:repValues property for now.
  • the ability to synthesize ILLEGAL_N from an unmapped integer.
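    • Currently a processing error occurs.
    • Schemas should define explicit ILLEGAL_N enumeration values for all otherwise unmapped integers, along with a pattern facet whose regular expression indicates that the string cannot begin with "ILLEGAL_", such as "[^(ILLEGAL)].*". Note that "[^(ILLEGAL)]" is a negated character class, so this example pattern actually rejects any value whose first character is "(", ")", "A", "E", "G", "I", or "L".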
      • This works because within a simple-type definition enumerations and patterns are ANDed. One of the enumerations must be satisfied,

...

      • AND one of the patterns (if there is more than one pattern).
      • This could be extended if other enumeration values should be considered well-formed but invalid. The regular expression can also exclude those, e.g., UNDEFINED, UNUSED, or other marker enumeration values.
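The way the enumeration and pattern facets combine in this workaround can be checked with a small Python sketch. It assumes that Python's `re.fullmatch` behaves like an XSD pattern for this particular expression (XSD patterns are implicitly anchored), and that the schema declares explicit ILLEGAL_6 and ILLEGAL_7 enumeration values. The names `ENUMERATIONS`, `PATTERN`, and `is_valid` are illustrative only.

```python
import re

# Enumeration values, including the explicit ILLEGAL_N entries of the workaround.
ENUMERATIONS = {
    "NoStatement", "truck", "suv", "bus", "train", "car",
    "ILLEGAL_6", "ILLEGAL_7",
}

# The pattern facet quoted in the text. The bracket expression is a negated
# character class over the characters ( I L E G A ), not a negated prefix.
PATTERN = re.compile(r"[^(ILLEGAL)].*")


def is_valid(value):
    """Facets of different kinds are ANDed: enumeration AND pattern."""
    return value in ENUMERATIONS and PATTERN.fullmatch(value) is not None
```

Here "truck" satisfies both facets, while "ILLEGAL_6" satisfies the enumeration (so a mapping exists and parsing succeeds) but fails the pattern, so it is invalid. Because the expression is a character class, a hypothetical value such as "Ambulance" would also fail the pattern, so a schema using this workaround should check that no legitimate enumeration value starts with one of those characters.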

Prior Features Removed

These dfdlx properties described in the prior proposal are removed. They are not in use.  

...