This Proposal is still a Work In Progress

DFDL specifies the positions of elements within a data representation by means of lengths, alignments, and skips.

At one time, a feature was planned that would allow the location of an element to be specified relative to a base element. It was dropped simply to get the DFDL v1.0 specification finished. Other specifications and format description systems do allow element locations to be specified by explicit offsets.

The TIFF image file format is one example of a format that cannot be expressed in DFDL without additional features like these.

New Properties

  • dfdl:offsetUnits with values "bits" or "bytes"
  • dfdl:offsetKind with values "startToStart" or "endToStart" - indicates if the offset is from the start of the base element or the end of the base element.
  • dfdl:offsetBase - non-empty string containing a path to an earlier element. An initial / means the path is absolute; otherwise it is relative, and ".." refers to the parent. The path identifies the element from which the offset is measured.
  • dfdl:offset - constant non-negative integer or DFDL expression that evaluates to a non-negative integer. When added to the position of the start/end of the offsetBase, gives the starting position of the element. The starting position of the element is the start of the AlignmentFill region in the Data Syntax Grammar. It is a Processing Error if the expression evaluates to a negative value. (TBD: should this be SDE?)
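
How the four proposed properties combine into a starting position can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not part of the proposal: the names `Span`, `ProcessingError`, and `element_start_bit` are hypothetical, and positions are assumed to be tracked as bit positions with the base element occupying the half-open range from `start_bit` to `end_bit`.

```python
from dataclasses import dataclass

BITS_PER_BYTE = 8


class ProcessingError(Exception):
    """Stand-in for a DFDL Processing Error (hypothetical name)."""


@dataclass(frozen=True)
class Span:
    """Bit positions of the already-parsed base element: [start_bit, end_bit)."""
    start_bit: int
    end_bit: int


def element_start_bit(base: Span, offset: int, offset_kind: str, offset_units: str) -> int:
    """Return the bit position where the element's AlignmentFill region begins."""
    # The proposal makes a negative offset a Processing Error (SDE is still TBD).
    if offset < 0:
        raise ProcessingError(f"offset must be non-negative, got {offset}")

    # dfdl:offsetUnits scales the offset to bits.
    if offset_units == "bytes":
        scale = BITS_PER_BYTE
    elif offset_units == "bits":
        scale = 1
    else:
        raise ValueError(f"unknown offsetUnits: {offset_units!r}")

    # dfdl:offsetKind chooses which edge of the base element to measure from.
    if offset_kind == "startToStart":
        origin = base.start_bit
    elif offset_kind == "endToStart":
        origin = base.end_bit
    else:
        raise ValueError(f"unknown offsetKind: {offset_kind!r}")

    return origin + offset * scale
```

For a base element occupying bits 0 to 64, an offset of 8 bytes gives a start at bit 64 with "startToStart" and at bit 128 with "endToStart".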

Tunable Parameters

  • maximum offset - limits the buffering required

Discussion

  • This feature effectively enables random access to the data stream, within the limits imposed by the implementation.
  • This feature lets parsing jump forward, discover information that it saves in variables or the infoset, and then back up and parse earlier data.
  • The interaction with arrays must be considered: if the offset expression evaluates to the same value for every occurrence, parsing could end up making no forward progress.
  • Issue: Overlapping data. Is it allowed (user beware) to have overlapping data, that is, to use offsets to back up into the middle of data that has already been parsed? Is this an error that must be detected, or an error that need not be detected? Is it a Processing Error or a runtime SDE?
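
To make the last two points concrete, the following Python sketch shows one way an implementation might notice overlapping data and repeated array positions. It is an assumption-laden illustration, not proposed behavior: `ParsedRegions` and `first_repeated_start` are hypothetical names, spans are half-open bit ranges, and the sketch only detects the conditions. Whether either one is an error is exactly the open question above.

```python
class ParsedRegions:
    """Remembers which half-open bit ranges [start, end) have been parsed."""

    def __init__(self):
        self._spans = []

    def overlaps(self, start_bit, end_bit):
        """True if [start_bit, end_bit) intersects any recorded range."""
        return any(start_bit < e and s < end_bit for s, e in self._spans)

    def record(self, start_bit, end_bit):
        """Note that [start_bit, end_bit) has now been parsed."""
        self._spans.append((start_bit, end_bit))


def first_repeated_start(occurrence_starts):
    """Return the index of the first array occurrence whose starting
    position repeats an earlier occurrence's, or None if all differ."""
    seen = set()
    for index, start_bit in enumerate(occurrence_starts):
        if start_bit in seen:
            return index
        seen.add(start_bit)
    return None
```

A parser could call `overlaps` before jumping to an offset and `record` after each element. `first_repeated_start` flags an array whose offset expression keeps yielding the same position.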

Example

TBD - draw from TIFF. Consider unparsing, where the offsets will be computed from stored element values. How do we compute the value to store?
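
Until a DFDL example is written, the following Python sketch illustrates the kind of offset that TIFF requires. It reads the 8-byte TIFF header (byte-order mark, the magic number 42, and the byte offset of the first Image File Directory measured from the start of the file), then jumps to that offset and reads the 2-byte entry count. The function name `read_first_ifd` is hypothetical. In terms of this proposal, the IFD would presumably be located with dfdl:offsetBase pointing at the root element, dfdl:offsetKind "startToStart", dfdl:offsetUnits "bytes", and dfdl:offset taken from the header field. That mapping is an assumption, not something the proposal has settled.

```python
import struct


def read_first_ifd(data: bytes):
    """Return (ifd_offset, entry_count) for the first IFD of a TIFF file."""
    # Bytes 0-1 select the byte order for everything that follows.
    byte_order = data[0:2]
    if byte_order == b"II":
        endian = "<"  # little-endian
    elif byte_order == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")

    # Bytes 2-3 hold the magic number 42; bytes 4-7 hold the offset of the
    # first IFD, in bytes from the start of the file.
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a TIFF file: bad magic number")

    # The IFD begins with a 2-byte count of its 12-byte entries.
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    return ifd_offset, entry_count
```

The sketch covers parsing only. It does not address the unparsing question above of how the stored offset value is computed.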
