Unparsing: Data Output Stream Buffering

There is a bug in bit-order handling when unparsing. This bug came up in Link16 when surrounded by NACT envelopes because NACT is bigEndian MSBF, and Link16 is littleEndian LSBF.

This example recreates what the problem is.

Consider unparsing data described by this schema:

<dfdl:format lengthKind="explicit"
   dfdl:byteOrder="littleEndian" 
   dfdl:bitOrder="mostSignificantBitFirst" 
   dfdl:lengthUnits="bits"
   dfdl:alignment="1" dfdl:alignmentUnits="bits"/>

<element name="r" dfdl:lengthKind="implicit">
  <complexType>
   <element name="x" type="xs:int" dfdl:length="8"
   <element name="a" type="xs:int" dfdl:outputValueCalc="{ dfdl:valueLength(../d) }"
                     dfdl:length="5"/>
   <element name="b" type="xs:int" dfdl:length="3"/>
   <element name="c" type="xs:int" dfdl:length="4" dfdl:bitOrder="leastSignificantBitFirst"/>
   <element name="d" type="xs:string" dfdl:lengthKind="pattern" 
        ... string in a LSBF encoding like the 7-bit ascii packed one ..../>
   ...
 </complexType>
</element>

The important feature above is that at element 'c', the bit order changes from MSBF to LSBF. And this happens immediately after element 'a' which has dfdl:outputValueCalc, that must wait until we have the infoset element 'd' to compute its value.

This should be fine, because 'x' is 8 bits wide, 'a' is 5 bits wide, and 'b' is 3 bits wide, so we're on a byte boundary logically when this change occurs.

Now let's consider this infoset

<r>
  <x>255</x> <!-- hex FF binary 1111 1111-->
  <a>22</a> <!-- binary 10110 -->
  <b>7</b> <!-- binary 111 -->
  <c>6</c> <!-- binary 0101 -->
  <d>abc</d>
  ...
</r>

Now we are going to try to animate what happens as we unparse this based on the arrival of the infoset events to the unparser.

Our start state has a DataOutputStream which is 'direct' meaning attached to an actual Java JVM output stream. Because DFDL and Daffodil are very bit-oriented we implement bit-level I/O on top by accompanying this output-stream with a "frag byte", a single byte which contains from 0 to 7 bits which are the fragment of a byte that is not yet complete and ready to be written to the actual output stream. The frag byte never contains all 8 bits (except transiently) because when it has all 8 bits we write it to the byte stream and clear it.

We'll depict a data output stream like so. This shows a data output stream where 4 whole bytes + 2 additional bits have been written.

On the left we have the JVM stream (or a buffer) holding whole bytes which we'll write in hex. On the right we have the frag byte which will eventually, once filled up, flow into the whole bytes part, at which point the frag byte will be reset. At the bottom of the frag byte we have the current bit order (shown as MSBF), and the number of bits in the fragment (shown as 2). The data in the frag byte illustrates the bits that are significant, with X for the bits as yet unoccupied by unparsed data.

So let's look at unparsing the infoset we see in the XML above.

We start from an empty data output stream. We will get a Start-Element 'r' event. No I/O occurs as this root element has nothing about it in the representation. So per below, the data output stream remains empty as here. Note that the bit position (1 based) is 1, but nothing has been output at that position. That is the position where output will begin.

We then get a Start-Element 'x' event with the value 255.

This is a whole byte, and the stream is currently byte aligned (because it is empty), so this data is output to the whole-bytes part of the data output stream:

This leaves us at bit position 8.

Next to unparse is element 'a'. However, this has dfdl:outputValueCalc, which depends on the infoset event for element 'd', which we don't have yet. So now the unparser works its magic, and splits off a buffered data input stream. So below, the stream on the left is the direct stream, the buffered stream is on the right.

Because we know the length (in bits) of element 'a' to be 5 bits, we know the starting position of the buffered stream. Logically, the first bit in the buffered stream is at bit 14.

Now we suspend the computation of element 'a' for later, but proceed to unparse element 'b', into the buffered stream. This results in:

We see that the 3 bits representing the value 6, 111, are in the frag byte of the buffered stream. The buffered stream now contains 3 bits. That means it ends at bit 16, with position at bit 17.

Now we get the start event for 'c'. However, element 'c' uses the LSBF bit order.

This should be fine, because the logical position in the bit stream is 17, which is 1 mod 8, i.e., byte aligned.

However, this is the subtle bug. It's not at a byte boundary in the buffered stream because that didn't start on a byte boundary. Rather, every byte in the buffered stream is off by 3 bits. We're at bit 4 in MSBF order in the frag byte.

Let's look at the data in bits, using A for the bits not yet populated from element 'a':

1111 1111 | AAAA A111

That's the whole byte for element 'x', followed by the 5 bits for 'a' (unknown as yet), and the 3 bits for 'b'. Let's add to this sequence, the next byte, and using letter 'C' show which bits should be occupied by the value of element 'c'

1111 1111 | AAAA A111 | XXXX CCCC

Because element 'c' is LSBF, and we are starting a new byte, the bits for 'c' go in the least-significant bits of the third byte. The value is 6, which is 0101. So the bits should be:

1111 1111 | AAAA A111 | XXXX 0101

So, now considering our buffered data output stream above, we want to write out the bits of element 'c', but we can't yet flush the frag byte because we don't have all its bits yet. The byte boundary in the data is after the three 111 bits in the frag byte. So we have 5 bits of the next byte available to write in the frag byte, but we only have the data as yet to populate 1 of them.

So we actually need a second frag byte, or we have no place to put all the bits of 'c' in the data output stream.

But ultimately our algorithm and frag byte system, have broken down here. As the bits in frag byte 1 aren't contiguous any more. It contains 4 live bits and 4 unused, but the used bits are not adjacent. The second frag byte is supposed to be LSBF, but its shown here MSBF so we can see how the bits are assembled so they come out adjacent when combined with the bits in frag byte 1.

In addition, the next 4 bits to be output, should go into the unused bits of frag byte 1, not frag byte 2. Only once those 4 bits are output, at which point we can flush frag byte 1 to the whole bytes buffer, do we start writing to frag byte 2, which can become frag byte 1, and we then drop frag byte 2.

At this point I think it should be apparent that the algorithm is a confusing mess. The fact that frag bytes don't represent byte-aligned bytes of data is the assumption we had that ultimately is not being maintained. The design in place here is not up to the job of a buffered data output stream starting not on a byte boundary.

The solution to this problem is to split the data output stream again, on every bit order change.

Any time bit-order changes, split off yet another buffered data output stream.
Record the bit order of every buffered output stream.
When eventually collapsing the data output streams together, check that we end up changing bit orders on proper byte boundaries or issue a runtime SDE.
A detail is to save information on each buffered data output stream for diagnostic purposes in issuing this error. E.g., the Term's Runtime Data object.

To illustrate this, let's go back to where we have just unparsed element 'b' and we're about to unparse element 'c'.

If we recognize that element 'c' has a different bit order, and we split off another buffered stream we get:

Now, the middle stream is marked FINAL, because we are splitting due to bit order change, not due to some additional suspended material to be added on later. The right-most stream contains the representation of 'c', confident that this stream begins on a byte boundary.

Now we unparse element 'd'. This is a string, and without getting into the details, we're going to say that it contains 28 bits containing the characters 'abc' then a NUL. These characters are 7-bits wide each, and they are stored LSBF. These characters in hex are 'a' is 0x61, 'b' is 0x62, 'c' is 0x63, and NUL is 0x00. First we unparse the 'a' getting 7 bits of 1100001. The frag byte has room for 4 characters, and LSBF those are the lease significant 4 bits of this character, so 0001. So the data output stream gets 0001 added to its frag byte for a full byte of 00001110 or hex 0x0E. This would be moved to the whole bytes, and the remaining 2 bits of the character 11 placed in the frag byte. So our streams now look like this:

By the same technique we output the next 3 characters, 'b', 'c', and the NUL (7 bits each), and we get:

At this point we have unparsed element 'd', the details of that are elsewhere, but basically, when we receive the event for element 'd', we construct an infoset element node. Since element 'd' is used in expressions, we know we must capture four things: contentStartPosition, valueStartPosition, valueEndPosition, and contentEndPosition (tbd; names aren't right), and as we unparse the value of 'd' into the output stream (third one, right-most above), we record these positions.

The expression for computing element 'a' value depends on knowing the content-length of element 'd', so once we have the contentEndPosition, we can subtract the contentStartPosition and we know the content length. This allows the suspended computation of the value for element 'a', and the unparsing thereof, to be resumed.

So, that expression is evaluated, creates the value 28, and that value is stored in the infoset element for the instance of 'a'. Then the unparsing of that infoset element is done, and stored with the suspended computation/unparser is the data output stream to use, which is the left-most one above. Now 28 expressed as binary is 11100 (5 bits), so we write these bits to the data output stream.

The last thing the suspended computation and unparser for 'a' does, is "setFinal" on the data output stream. This tells the data output stream that no further data will be appended. What follows its contents is then logically the next data output stream (middle one above). The "setFinal" operation initiates a process called "collapsing" the data output stream buffers.

The data output streams are connected into a double-linked chain:

We've also emphasized that this first data output stream on the left is special. it is the "direct" output stream in that it is actually connected to a java.io.OutputStream, not just to a buffer. Bytes written to the "whole bytes" of the direct output stream actually are being written out by Daffodil.

So when a buffered data output stream is setFinal, it is simply marked as such, and this will be checked on later. But when the direct data output stream is marked final, then it is combined with the next data output stream. If that stream is final, then it will be combined with the next after that, and so on until either all the streams have been collapsed and all data written out, or a data stream that is not final is encountered.

So looking at the collapsing of the direct and first buffered data output stream, first the data of the buffered stream is written to the direct data output stream. Since the direct data output stream has a frag byte, that byte must be completed first. In this example, the buffered data output stream (middle above) doesn't have any whole bytes, but it does have a frag byte, and so bits from this frag byte are moved into the direct data output stream's frag byte.

In this example, the direct data output stream's frag byte has 5 bits, and the buffered data output stream contributes 3 more, making a whole byte:

Notice now that the middle stream has no contents any more. It's bit length is now 0. Since the frag byte of the direct stream is full, this is output, and the direct stream's frag byte is cleared.

Next the direct output stream and the buffered data output stream (left and middle) are "Morphed". The middle buffered data output stream will morph into being a direct output stream by taking up the actual java.io.OutputStream of the left-most stream. The formerly direct data output stream on the left is now "dead", and can be discarded (garbage collected, or returned to a pool).

Now, the middle data output stream (now the direct one), was FINAL before, and remains FINAL. That means we continue to look to the right and combine the direct data output stream into the next buffered data output stream. In this case, the right-most stream has 3 whole bytes, the direct stream has no frag byte, so those whole bytes are just output to the direct stream's whole bytes.

Next the frag byte contents from the right-most data stream are output to the direct stream. As the frag byte of the direct stream is empty, these bits just move over to the direct stream's frag byte.

Note that in the above, we're now collapsing the data output stream created by unparsing element 'c', and that element had bit order LSBF. When we copied the whole bytes, since there was no frag byte the bit-order change is legal. Had there been a frag byte on the direct stream, then changing bit orders would have had to create a runtime Schema Definition Error, as it is not meaningful to change bit order in the middle of a byte.

Now the last step is that the direct stream and buffered stream will again morph:

At this point, all the data output streams have been collapsed, and all subsequent unparsing will be going to this last remaining live data output stream. The collapsing ends the need for the data output streams to first buffer all output. So the overhead introduced by all the above machinery is no longer encountered, unless the need again arises due to another element with dfdl:outputValueCalc, another change of bit order, or the need for alignment fill regions or final unused regions of unknown length.

This algorithm can achieve streaming behavior from unparsing. Note that the delayed unparsing of element 'a' lasted until element 'd' was received and unparsed (into buffered data output stream). At that point the unparser outputs the data for elements 'a', 'b', 'c', and 'd' (most of it for 'd', 7 bits are left in a frag byte). The output of data has caught up to the streaming in of infoset events.

This discussion did not cover several other important aspects of the unparser algorithm:

suspended unparsers - for alignment fill, unused regions, and for expressions and dfdl:outputValueCalc.
- TBD: expressions involving variables yet to be set.
queueing of suspensions on the infoset, and infoset event detection including open/final infoset nodes.
capture of start/end positions
propagation of start/end positions and start positions, lengths of data output streams

Those will be covered in other pages.

Page tree

Unparsing: Data Output Stream Buffering