Comment: Update for PCAP

...

The headache being the relative path to it, and then the fact that the most natural thing to do is to put both IVC and OVC on it (though the OVC may not be needed in many cases).

...

Variables: The direction property, and forward reference from dfdl:setVariable value and dfdl:newVariableInstance defaultValue Expressions

Consider unparsing where a dfdl:setVariable expression refers to an element whose dfdl:outputValueCalc (OVC) forward-references a later element.

Code Block
<xs:element name="len" type="xs:int" dfdl:outputValueCalc="{ dfdl:valueLength(../data) + 10 }"/>

<xs:sequence>
  <xs:annotation><xs:appinfo ...>
    <dfdl:setVariable ref="var" value="{ ../len }" />
  </xs:appinfo></xs:annotation>

  <xs:element name="data" dfdl:length="{ $var }">
    ....
  </xs:element>
....
</xs:sequence>

In the above, when unparsing, the dfdl:outputValueCalc for len can be evaluated, but not immediately: its evaluation and unparsing must be delayed until the subsequent data element is available.

The next thing the unparser has to do, after delaying the unparsing of len, is set the variable var. This requires the value of len, which has been deferred.

We have real schemas (even PCAP, for example) where this occurs.

It is clear that variables need to be able to be evaluated at unparse time, and the expressions used with them to default or set their values need to be able to forward-reference into the infoset when they are evaluated at unparse time.

We have prototyped an extension property, direction, for dfdl:defineVariable, which allows one to declare the variable's dfdlx:direction as 'parseOnly', 'unparseOnly', or 'both'.
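
As a minimal sketch of the three settings, the declarations below use the dfdlx:direction spelling that appears in the PCAP example further down. The variable names (parseSideVar, unparseSideVar, sharedVar) are hypothetical and exist only for illustration, and the dfdlx prefix is assumed to be bound to the extension namespace elsewhere in the schema.

Code Block
<!-- Hypothetical names; only the dfdlx:direction values come from the text above -->

<!-- Intended to be read and set only while parsing -->
<dfdl:defineVariable name="parseSideVar" type="xs:int" dfdlx:direction="parseOnly"/>

<!-- Intended to be read and set only while unparsing -->
<dfdl:defineVariable name="unparseSideVar" type="xs:string" dfdlx:direction="unparseOnly"/>

<!-- Usable in both directions -->
<dfdl:defineVariable name="sharedVar" type="xs:int" dfdlx:direction="both"/>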

We believe using dfdl:setVariable is inconsistent with using dfdl:newVariableInstance with a default value, because the race conditions between reading a default value and setting the value are complex. Stylistically, if one intends to use a variable with dfdl:newVariableInstance, one should declare it with neither a default value nor an external value.

The following scenario comes from PCAP and illustrates use of dfdl:newVariableInstance and variables with forward-referencing defaultValue expressions:

Code Block

      <!-- Internally used by IPAddressGroup at unparse time
           These are IPAddressGroup's local variables. -->
      <dfdl:defineVariable name="remainingDottedAddr" type="xs:string" dfdlx:direction="unparseOnly"/>
      <dfdl:defineVariable name="priorRemainingDottedAddr" type="xs:string" dfdlx:direction="unparseOnly"/>

      <!-- Parameter for IPAddressGroup used at unparse time -->
      <dfdl:defineVariable name="ipAddressElement" type="xs:string" dfdlx:direction="unparseOnly"/>

<!-- 
A PCAP schema has two different IP addresses, IPSrc and IPDest.

They are defined in terms of a common group definition pcap:IPAddressGroup
which works like a common "subroutine" within the schema. The variable pcap:ipAddressElement is
a formal parameter of IPAddressGroup.
-->

  <xs:group name="IPSrcGrp">
      <xs:sequence>
        <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
          <!-- 
            Note how this defaultValue expression forward-references the
            element containing the string (e.g., 1.2.3.4). It is used at unparse time only,
            as the variable is itself declared 'unparseOnly'.
           -->
          <dfdl:newVariableInstance ref="pcap:ipAddressElement" defaultValue='{ IPSrc }'/><!-- example 1.2.3.4 -->
        </xs:appinfo></xs:annotation>
        <xs:element name="IPSrcString">
          <xs:complexType>
            <xs:group ref="pcap:IPAddressGroup"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
  </xs:group>


  <xs:group name="IPDestGrp">
      <xs:sequence>
        <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:newVariableInstance ref="pcap:ipAddressElement" defaultValue='{ IPDest }'/><!-- example 1.2.3.4 -->
        </xs:appinfo></xs:annotation>
        <xs:element name="IPDestString">
          <xs:complexType>
            <xs:group ref="pcap:IPAddressGroup"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
  </xs:group>


 <xs:group name="IPAddressGroup">
      <xs:annotation><xs:documentation><![CDATA[
   
      This group is a reusable subroutine of DFDL. It parses a string like 1.2.3.4 into 4 integers named Byte1, Byte2, Byte3, Byte4
      containing the "." separated integers. 

      Arguably, this is extreme for DFDL: an infoset containing the 4 bytes should be a good enough parsed representation of
      them. But this serves as a useful example regardless. 
      
      This is used to unparse IP addresses expressed in the dotted notation that is common. 

      There is one parameter. Users must bind the $pcap:ipAddressElement variable to the string to be so parsed
      using dfdl:newVariableInstance.

      ]]></xs:documentation></xs:annotation>
    <xs:sequence>
      <xs:sequence>
        <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:newVariableInstance ref="pcap:priorRemainingDottedAddr" 
             defaultValue='{ $pcap:ipAddressElement }'/><!-- example 1.2.3.4 -->
          <dfdl:newVariableInstance ref="pcap:remainingDottedAddr" 
             defaultValue='{ $pcap:priorRemainingDottedAddr }'/><!-- example 1.2.3.4 -->
        </xs:appinfo></xs:annotation>
        <xs:element name="Byte1" type="xs:unsignedByte" 
          dfdl:outputValueCalc="{
            xs:unsignedByte(fn:substring-before($pcap:remainingDottedAddr, '.'))
          }"/>
        <xs:sequence>
          <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:newVariableInstance ref="pcap:priorRemainingDottedAddr" 
              defaultValue='{ $pcap:remainingDottedAddr }'/><!-- example 1.2.3.4 -->
            <dfdl:newVariableInstance ref="pcap:remainingDottedAddr" 
              defaultValue='{ fn:substring-after($pcap:priorRemainingDottedAddr, ".") }'/><!-- example 2.3.4 -->
          </xs:appinfo></xs:annotation>
          <xs:element name="Byte2" type="xs:unsignedByte" 
           dfdl:outputValueCalc="{
             xs:unsignedByte(fn:substring-before($pcap:remainingDottedAddr, '.'))
           }"/>
          <xs:sequence>
            <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:newVariableInstance ref="pcap:priorRemainingDottedAddr" 
                defaultValue='{ $pcap:remainingDottedAddr }'/><!-- example 2.3.4 -->
              <dfdl:newVariableInstance ref="pcap:remainingDottedAddr" 
                defaultValue='{ fn:substring-after($pcap:priorRemainingDottedAddr, ".") }'/><!-- example 3.4 -->
            </xs:appinfo></xs:annotation>
            <xs:element name="Byte3" type="xs:unsignedByte" 
              dfdl:outputValueCalc="{
                xs:unsignedByte(fn:substring-before($pcap:remainingDottedAddr, '.'))
              }"/>
            <xs:sequence>
              <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
                <dfdl:newVariableInstance ref="pcap:priorRemainingDottedAddr" 
                  defaultValue='{ $pcap:remainingDottedAddr }'/><!-- example 3.4 -->
                <dfdl:newVariableInstance ref="pcap:remainingDottedAddr" 
                  defaultValue='{ fn:substring-after($pcap:priorRemainingDottedAddr, ".") }'/><!-- example 4 -->
              </xs:appinfo></xs:annotation>
              <xs:element name="Byte4" type="xs:unsignedByte" 
                dfdl:outputValueCalc="{
                  xs:unsignedByte($pcap:remainingDottedAddr)
                }"/>
            </xs:sequence>
          </xs:sequence>
        </xs:sequence>
      </xs:sequence>
    </xs:sequence>
  </xs:group>


<!-- These groups are then used like so -->

              ...
              <xs:sequence dfdl:hiddenGroupRef="pcap:IPSrcGrp"/>
              <!-- IPSrc will be of the usual IP address form: 1.2.3.4 --> 
              <xs:element name="IPSrc" type="xs:string" 
                dfdl:inputValueCalc="{ 
                  fn:concat(../IPSrcString/Byte1, '.', 
                            ../IPSrcString/Byte2, '.',
                            ../IPSrcString/Byte3, '.',
                            ../IPSrcString/Byte4) }"/>
              <xs:sequence dfdl:hiddenGroupRef="pcap:IPDestGrp"/>
              <xs:element name="IPDest" type="xs:string" 
                dfdl:inputValueCalc="{ 
                  fn:concat(../IPDestString/Byte1, '.',
                            ../IPDestString/Byte2, '.',
                            ../IPDestString/Byte3, '.',
                            ../IPDestString/Byte4) }"/>
             ....

The above DFDL schema enables the 4 bytes of IP Source Address and 4 bytes of IP Destination Address to be parsed into this logical XML infoset:

Data Bytes (hex) 01 02 03 04 05 06 07 08


Code Block
<IPSrc>1.2.3.4</IPSrc>

<IPDest>5.6.7.8</IPDest>

These will unparse back to the same 8 bytes.

This breaks the rule that when variables are evaluated, things they reference must have already been evaluated. Basically, when we delay evaluating the OVC for len we are suspending anything that depends on len having a value as well.

Parse-Time Forward Reference

...