
When parsing some data formats, it is necessary to consider data that occurs at a later point in the bitstream. For instance, consider a simple fixed-length tagged union in which the tag occurs after the union. Conceptually, such a format may be described by:

<xs:choice dfdl:choiceDispatchKey="{ tag }">
  <xs:element name="a" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="1"/>
  <xs:element name="b" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="2"/>
</xs:choice>
<xs:element name="tag" type="xs:int" dfdl:length="8" />

An existing proposal (Proposal: DFDL base+offset feature - Enables describing TIFF) would allow for this by making it possible to put the <tag> element first in the infoset, despite it occurring later in the bitstream. However, that proposal imposes unnecessary complexity for such a use case. In particular, the schema must explicitly specify jumps forward and backward in the bitstream. Further, the full generality of that proposal involves considering additional issues surrounding unparsing (such as overlapping data).

For basic use cases such as the above, support can instead be added in a much simpler manner, by providing lookahead capabilities directly in DPath:

<xs:choice dfdl:choiceDispatchKey="{ daf:lookAhead(16,8) }">
  <xs:element name="a" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="1"/>
  <xs:element name="b" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="2"/>
</xs:choice>
<xs:element name="tag" type="xs:int" dfdl:length="8" />

Another potential solution would be to allow forward references in DPath expressions during parsing, provided the compiler can prove that such a forward reference is resolvable (e.g., the portion of content being skipped over is of constant length). However, doing so would add significant complexity to both Daffodil and DFDL.

This proposal is to add the daf:lookAhead function to DPath.

daf:lookAhead

  • daf:lookAhead(distance, bitSize) 
    • Reads bitSize bits without consuming them, where the first bit is located distance bits past the current location
  • Restrictions
    • distance >= 0
    • bitSize >= 0
    • distance + bitSize <= an implementation-defined limit, which must be no less than 512 bits
    • Cannot be called during unparse
    • Error if it looks past EOF
    • Undefined behavior if it looks past the document boundary when in streaming mode
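
The semantics and restrictions listed above can be sketched in Python. This is only an illustrative model, not Daffodil code: the function name look_ahead, the data/bit_pos parameters, the most-significant-bit-first bit order, the unsigned integer result, and the choice of exactly 512 for the limit are all assumptions made for the sketch.

```python
# Assumed value: the proposal only requires the limit to be at least 512 bits.
MAX_LOOKAHEAD_BITS = 512


def look_ahead(data: bytes, bit_pos: int, distance: int, bit_size: int) -> int:
    """Return bit_size bits starting distance bits past bit_pos.

    The caller's position (bit_pos) is not advanced; nothing is consumed.
    Bits are taken most-significant-bit first (an assumption of this sketch).
    """
    if distance < 0 or bit_size < 0:
        raise ValueError("distance and bitSize must be >= 0")
    if distance + bit_size > MAX_LOOKAHEAD_BITS:
        raise ValueError("distance + bitSize exceeds the implementation limit")

    start = bit_pos + distance
    end = start + bit_size
    if end > len(data) * 8:
        raise EOFError("lookAhead would read past the end of the data")

    value = 0
    for i in range(start, end):
        bit = (data[i // 8] >> (7 - i % 8)) & 1  # pick bit i, MSB first
        value = (value << 1) | bit
    return value
```

For example, with the three bytes 0x12 0x34 0x02 and the current position at bit 0, look_ahead(data, 0, 16, 8) skips the first 16 bits and returns the value of the third byte.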

Examples

The following two elements yield the same value:

  • <xs:element name="a" type="xs:unsignedInt" dfdl:length="3" dfdl:lengthUnits="bits" />
  • <xs:element name="a" type="xs:unsignedInt" dfdl:length="3" dfdl:lengthUnits="bits" dfdl:inputValueCalc="{ daf:lookAhead(0,3) }" />

The following example demonstrates using lookAhead to branch on a field that occurs later in the bitstream:

<xs:choice dfdl:choiceDispatchKey="{ daf:lookAhead(16,8) }">
  <xs:element name="a" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="1"/>
  <xs:element name="b" type="xs:int" dfdl:length="16" dfdl:choiceBranchKey="2"/>
</xs:choice>
<xs:element name="tag" type="xs:int" dfdl:length="8" />
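
As a rough illustration of how a parser might process the schema above, the following Python sketch peeks at the tag before parsing the union. It is not Daffodil code: peek_bits and parse_tagged_union are hypothetical helpers, and the sketch assumes lengths are in bits, a binary big-endian representation, and values read as unsigned integers for brevity.

```python
def peek_bits(data: bytes, bit_pos: int, distance: int, bit_size: int) -> int:
    """Hypothetical stand-in for daf:lookAhead: read bits without consuming them."""
    start = bit_pos + distance
    total_bits = len(data) * 8
    if start + bit_size > total_bits:
        raise EOFError("read past end of data")
    whole = int.from_bytes(data, "big")        # treat the input as one big integer
    shift = total_bits - (start + bit_size)    # bits to the right of the field
    return (whole >> shift) & ((1 << bit_size) - 1)


def parse_tagged_union(data: bytes) -> dict:
    pos = 0
    # choiceDispatchKey="{ daf:lookAhead(16,8) }": peek 8 bits located 16 bits ahead.
    key = peek_bits(data, pos, 16, 8)
    branch = {1: "a", 2: "b"}.get(key)         # choiceBranchKey values from the schema
    if branch is None:
        raise ValueError(f"no choice branch for dispatch key {key}")

    # Parse the selected 16-bit branch, then the 8-bit tag that follows it.
    value = peek_bits(data, pos, 0, 16)
    pos += 16
    tag = peek_bits(data, pos, 0, 8)
    pos += 8
    return {branch: value, "tag": tag}
```

With the input bytes 0x00 0x2A 0x01, the peeked key is 1, so branch a is chosen and parsed before the tag itself is consumed.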