...

[draw.io diagram: dataOutputStream]

On the left we have the JVM stream (or a buffer) holding whole bytes, which we'll write in hex. On the right we have the frag byte, which, once filled up, flows into the whole-bytes part, at which point the frag byte is reset. At the bottom of the frag byte we have the current bit order (shown as MSBF) and the number of bits in the fragment (shown as 2). The data in the frag byte illustrates the bits that are significant, with X for the bits as yet unoccupied by unparsed data.
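To make the picture concrete, here is a minimal sketch of such a stream as a toy Python model. It is not the actual implementation: the class name `BitOutputStream` and its fields are hypothetical, and it handles MSBF only. It keeps the whole bytes, the frag byte, and the count of significant bits in the frag byte, and renders them the way the diagram does.

```python
class BitOutputStream:
    """Toy model: whole bytes plus one partially filled frag byte (MSBF only).

    All names here are hypothetical, chosen for illustration.
    """

    def __init__(self):
        self.whole_bytes = bytearray()  # completed bytes, shown in hex
        self.frag_byte = 0              # significant bits are left-justified (MSBF)
        self.frag_len = 0               # number of significant bits in frag_byte (0..7)

    def put_bits(self, value, nbits):
        """Append the low `nbits` bits of `value`, most significant bit first."""
        for shift in range(nbits - 1, -1, -1):
            bit = (value >> shift) & 1
            self.frag_byte |= bit << (7 - self.frag_len)
            self.frag_len += 1
            if self.frag_len == 8:
                # The frag byte is full: it flows into the whole bytes and is reset.
                self.whole_bytes.append(self.frag_byte)
                self.frag_byte = 0
                self.frag_len = 0

    def show(self):
        """Render as: whole bytes in hex | frag bits with X for unoccupied bits."""
        hex_part = self.whole_bytes.hex().upper() or "(empty)"
        bits = format(self.frag_byte, "08b")[: self.frag_len]
        frag = bits + "X" * (8 - self.frag_len)
        return f"{hex_part} | {frag} MSBF {self.frag_len}"


s = BitOutputStream()
s.put_bits(0b10, 2)
print(s.show())  # (empty) | 10XXXXXX MSBF 2
```

Writing a whole byte such as 255 into an empty (and therefore byte-aligned) instance of this model leaves the frag byte empty and puts `FF` in the whole-bytes part.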

...

We then get a Start-Element 'x' event. As 'x' is of simple type, it carries a value, here 255.

This is a whole byte, and the stream is currently byte aligned (because it is empty), so this data is output to the whole-bytes part of the data output stream:

...

Now we suspend the computation of element 'a' for later, but proceed to unparse element 'b' into the buffered stream. This results in:

 

[draw.io diagram: buffered50]

...

However, this is the subtle bug. It's not at a byte boundary in the buffered stream, because that stream didn't start on a byte boundary, that is, on a boundary that results in bit 17 being aligned within that stream. Rather, every byte in the buffered stream is off by 3 bits. We're at bit 4 in MSBF order in the frag byte.
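The arithmetic behind this can be sketched as follows. This is an illustration only: it assumes 0-based bit offsets and a buffered stream that begins 3 bits into a byte of the final output (consistent with "off by 3 bits" above), and both function names are hypothetical.

```python
def absolute_bit_offset(start_offset_bits, local_bits):
    """Position in the final output of a bit that is `local_bits` into a
    buffered stream which itself starts at `start_offset_bits` (0-based)."""
    return start_offset_bits + local_bits


def is_byte_aligned(bit_offset):
    """True when a 0-based bit offset falls on a byte boundary."""
    return bit_offset % 8 == 0


START = 3  # assumption: the buffered stream starts 3 bits into a byte

# Positions that look like byte boundaries inside the buffered stream...
for local in (0, 8, 16):
    absolute = absolute_bit_offset(START, local)
    # ...are not byte boundaries in the final output.
    print(local, absolute, is_byte_aligned(absolute))
```

Under these assumptions, the true byte boundaries fall at local offsets 5, 13, 21, and so on, which the buffered stream cannot see unless it knows its own starting position.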

...

Ideas for possible fixes include:

Starting Frag Byte

  • Add a starting frag byte to buffered streams.
    • This byte contains a partial byte to ensure that the whole bytes (buffered) are always on byte boundaries, and the frag byte(s) are always on byte boundaries.
    • This doesn't work, because the starting position of a buffered data output stream isn't necessarily known. It is known in this example because element 'a' has fixed length, but if element 'a' had variable length we would have no idea where the starting position is, or where any byte boundaries actually occur in the data.
  • We could fix the above if we insist that OVC elements have fixed length. DFDL doesn't require this, however, so it is not a particularly stable solution.

...

Split on Bit-order Change

  • Any time bit-order changes, split off yet another buffered data output stream. Record the bit order of every buffered output stream.
  • When eventually collapsing the data output streams together, check that the bit order in fact changes only on proper byte boundaries; otherwise issue a runtime SDE.
    • We probably need to save information on each buffered data output stream for diagnostic purposes in issuing this error, e.g., the element's ERD.
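
The check at collapse time could look roughly like the sketch below. It is only an illustration of the idea under stated assumptions: `BufferedStream`, `SchemaDefinitionError`, and `check_bit_order_changes` are hypothetical names, the element name stands in for the element's ERD, and every stream's length in bits is assumed to be known by the time the streams are collapsed.

```python
from dataclasses import dataclass


class SchemaDefinitionError(Exception):
    """Hypothetical stand-in for a runtime SDE."""


@dataclass
class BufferedStream:
    bit_order: str     # "MSBF" or "LSBF", recorded when the stream is split off
    length_bits: int   # assumed known once the streams are being collapsed
    element_name: str  # diagnostic info, standing in for the element's ERD


def check_bit_order_changes(streams, start_bit=0):
    """Walk the streams in output order and verify that the bit order only
    changes on a byte boundary. Returns the final 0-based bit position."""
    pos = start_bit
    prev = None
    for s in streams:
        if prev is not None and s.bit_order != prev.bit_order and pos % 8 != 0:
            raise SchemaDefinitionError(
                f"bit order changes from {prev.bit_order} to {s.bit_order} "
                f"at bit offset {pos} (element {s.element_name!r}), "
                "which is not a byte boundary"
            )
        pos += s.length_bits
        prev = s
    return pos


# A change that lands on a byte boundary (3 + 5 = 8 bits) passes the check.
ok = [
    BufferedStream("MSBF", 3, "a"),
    BufferedStream("MSBF", 5, "b"),
    BufferedStream("LSBF", 8, "x"),
]
print(check_bit_order_changes(ok))
```

Because the positions are only summed when the streams are collapsed, this check does not need the starting position of any individual buffered stream to be known when that stream is created.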