...

(Idea for the future: for debugging, it may be useful to compute these for every element, so that a user can be shown exactly where the representation of each element lies. This applies to both parsing and unparsing.)

To enable the unparser to continue unparsing after an OVC element is suspended, the DataOutputStream implementation supports what we call 'splitting' the stream into an original part and an added buffering data output stream, into which unparsing can continue. Once the OVC element's value has been computed, unparsing of the OVC element proceeds using the original DataOutputStream. When that unparsing is complete, the original DataOutputStream is finished and everything about it, including its ending bit position, is known, so it can be recombined with the buffering data output stream that was split off from it.
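The split-and-recombine idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Daffodil implementation: the `BitBuffer` class, its `split` and `finish` methods, and the one-byte length prefix standing in for the OVC element are all hypothetical names chosen for this example.

```python
class BitBuffer:
    """Hypothetical stand-in for a data output stream; holds bits as 0/1 ints."""

    def __init__(self, start_bit=None):
        self.start_bit = start_bit  # None means "starting bit position not yet known"
        self.bits = []
        self.follower = None        # buffering stream split off after this one
        self.finished = False

    def put_bits(self, value, n_bits):
        if self.finished:
            raise ValueError("stream already finished")
        # Most significant bit first (an assumption made for this sketch).
        for shift in range(n_bits - 1, -1, -1):
            self.bits.append((value >> shift) & 1)

    def split(self):
        """Create the buffering stream into which unparsing continues."""
        if self.follower is not None:
            raise ValueError("stream already split")
        self.follower = BitBuffer()
        return self.follower

    def finish(self):
        """Finish this stream; its ending bit position is now known, so recombine."""
        if self.start_bit is None:
            raise ValueError("starting bit position not yet known")
        self.finished = True
        if self.follower is None:
            return list(self.bits)
        self.follower.start_bit = self.start_bit + len(self.bits)
        return self.bits + self.follower.bits


def to_bytes(bits):
    """Pack a whole number of bytes' worth of bits into a bytes object."""
    assert len(bits) % 8 == 0
    return bytes(
        int("".join(str(b) for b in bits[i:i + 8]), 2)
        for i in range(0, len(bits), 8)
    )


original = BitBuffer(start_bit=0)
buffered = original.split()          # OVC element suspended; continue in the buffer
payload = b"hello"
for byte in payload:
    buffered.put_bits(byte, 8)       # later content is unparsed first

original.put_bits(len(payload), 8)   # OVC value (a length) is now computable
combined = original.finish()         # original finished; recombine with the buffer
```

After `finish()`, `buffered.start_bit` is 8 and `to_bytes(combined)` yields the length prefix followed by the payload. Note that until `finish()` runs, the buffering stream does not know where it starts, which is the root of the complications listed below.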

A few complicating factors

  • The length of the OVC element may not be known; hence, the starting bit position of this added buffering DataOutputStream may not be known until unparsing of the OVC element's suspension has completed.
  • Alignment: Because elements and model groups (terms generally) can have alignment, and text anywhere can have mandatory text alignment, the size of the required alignment fill region cannot be computed when the starting bit position is not known.
    • This implies that non-zero alignment requires its own split of the data output stream when the starting bit position is not known.
  • Bit order: Elements can have dfdl:bitOrder, and model groups can have text (e.g., dfdl:initiator). Text implies a bit order, because each charset encoding has a specified bit order. It is not meaningful for the bit order to change except on a byte boundary (8-bit boundary). So, if the starting bit position of a buffering data output stream is not known, the unparser cannot determine whether a bit order change is legal until that starting bit position has been determined.
    • This implies that a bit order change requires its own split of the data output stream when the starting bit position is not known.
  • Interior Alignment: The length of an element of complex type may depend on its starting bit position in the data output stream. This happens because terms (elements or model groups) may have alignment (or mandatory text alignment). These alignment regions may vary in size depending on where the term starts in the data output stream; hence, the length of a complex type may not be computable until its starting position is known and, recursively, the starting positions of all elements inside it are known.
    • This implies that expressions that compute the dfdl:contentLength or dfdl:valueLength of an element must potentially suspend until the starting bit positions become known so that the length of the alignment regions can be computed.
    • Circular deadlocks can occur if an OVC element needs the length of a later element, but the length of the later element depends (by way of this interior alignment issue) on the length of the OVC element.
      • Note: formats in which the OVC element is itself a variable-length element are expected to be rare, but they are possible.
  • Target length: Some elements have an explicit length which can be fixed, or given by a dfdl:length expression. When unparsing, this dfdl:length expression is evaluated to give a value known as the target length. This can differ from the value's implicit length in that the value may need to be padded to achieve the target length, or for xs:string only, the value may need to be truncated to fit within the target length.
    • TBD: For elements with explicit length, there is an element unused region at the end which may need to be filled (with dfdl:fillByte). For simple elements this would also be a difference between value and content length. For complex types. .......
    • There is commonly a circular dependency between an OVC element storing a length, and the element whose length it stores. Deadlock is avoided when unparsing because the value of the OVC element must depend only on the dfdl:valueLength (which excludes padding/filling), and so can be computed without reference to the target length of the element. The target length expression is then able to depend on the value of the OVC element and the circularity is avoided.
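The alignment and interior alignment points above can be made concrete with a small calculation. The following is a minimal sketch, assuming alignment is expressed in bits and each child is described by a hypothetical `(alignment_bits, content_length_bits)` pair; the function names are illustrative and are not taken from the Daffodil code base.

```python
def alignment_fill_bits(start_bit, alignment_bits):
    """Fill bits needed so the next term starts on a multiple of alignment_bits."""
    return (-start_bit) % alignment_bits


def complex_length_bits(start_bit, children):
    """Total length of a complex element, including interior alignment fill.

    children is a list of (alignment_bits, content_length_bits) pairs,
    a simplified model assumed for this sketch.
    """
    pos = start_bit
    for alignment_bits, length_bits in children:
        pos += alignment_fill_bits(pos, alignment_bits)  # fill depends on position
        pos += length_bits
    return pos - start_bit


# A 3-bit field followed by a 16-bit field aligned to a byte boundary.
children = [(1, 3), (8, 16)]

length_at_0 = complex_length_bits(0, children)  # 3 + 5 fill + 16
length_at_5 = complex_length_bits(5, children)  # 3 + 0 fill + 16
```

The same complex element is 24 bits long when it starts at bit 0 but only 19 bits long when it starts at bit 5. Neither value can be computed while the starting bit position is unknown, which is why a length expression may have to suspend.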

The following example reproduces the problem.

...