(Work in progress)
See also the basic layering documentation: Proposal: Data Layering for Base64, Line-Folding, Compression, Etc
An extension of it for checksums, CRC, and parity: Proposal: Checksums, CRC, Parity - Layering Enhancements
A Basic Mechanism Exists
As part of related work, I added new test layers in the daffodil-test module. You can see this in PR: https://github.com/apache/daffodil/pull/643
There is an IPv4 layer example/test, and a checkDigits example/test.
This illustrates a way of doing "field extensibility" of layer transforms now, without any particular dynamic layer loading feature.
You pass the name of your layer transform using the property
dfdlx:layerTransform='{ "someLayer" }'
Because the value is an expression, it bypasses the validation check against the LayerTransformEnum defined in the dfdlx.xsd schema (in daffodil-propgen).
Then, somewhere in Scala/Java code, before compiling the schema, one must call LayerTransformFactory.register(SomeLayerTransformFactory), and the SomeLayerTransformFactory object has to be on the classpath.
In the case of my tests in daffodil-test, the Scala test driver code makes this call.
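The register-then-look-up pattern described above can be sketched in a few lines. This is only an illustration of the idea, not Daffodil's actual API: the names SketchLayerFactory, SketchLayerRegistry, and SketchRegistryDemo are hypothetical stand-ins invented for this example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a layer transform factory; not Daffodil's real type.
interface SketchLayerFactory {
    String name(); // the name used in dfdlx:layerTransform
}

// Hypothetical registry illustrating "register before compiling the schema".
final class SketchLayerRegistry {
    private static final Map<String, SketchLayerFactory> FACTORIES =
        new ConcurrentHashMap<>();

    private SketchLayerRegistry() {}

    // Called by application or test-driver code before schema compilation.
    static void register(SketchLayerFactory factory) {
        FACTORIES.put(factory.name(), factory);
    }

    // Called once the dfdlx:layerTransform expression has been evaluated to a name.
    static Optional<SketchLayerFactory> find(String name) {
        return Optional.ofNullable(FACTORIES.get(name));
    }
}

final class SketchRegistryDemo {
    public static void main(String[] args) {
        // The driver registers the factory first, as the test driver code does.
        SketchLayerRegistry.register(() -> "someLayer");
        System.out.println(SketchLayerRegistry.find("someLayer").isPresent());  // true
        System.out.println(SketchLayerRegistry.find("otherLayer").isPresent()); // false
    }
}
```

Because the driver refers to the factory in ordinary code, a missing factory class would be caught when the driver is compiled rather than when the schema is processed.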
This is likely sufficient for application developers who use Daffodil via its API from their own software.
However, it won't let you run a new layer transform from the Daffodil CLI tool.
Enabling CLI Loading of Layers
To make the CLI work:
- The enum check in dfdlx.xsd should be removed entirely, and the diagnostic message issued when a layer transform is not found should list all the currently registered layer transform names.
- When evaluation of the dfdlx:layerTransform property yields a name that does not correspond to a registered LayerTransformFactory, guess the name of the corresponding LayerTransformFactory object from the string value of the property (e.g., uppercase the first letter, then append "LayerTransformFactory", or just "TransformFactory" if the name already ends in "Layer") and try to dynamically load that class.
This mechanism would also eliminate the need for API code to explicitly call the LayerTransformFactory.register(...) method. However, it is good practice to call it anyway, so that a missing factory shows up as a Java linkage error (at build/compile time of the software) rather than as a dynamic runtime error.
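The name-guessing rule can be sketched as follows, as one possible reading of the proposal rather than a settled design. The class name LayerFactoryNameGuess and the packagePrefix parameter are assumptions made for this example; the proposal does not say which package such classes would live in.

```java
import java.util.Optional;

// Hypothetical helper; the class and method names are illustrative only.
final class LayerFactoryNameGuess {
    private LayerFactoryNameGuess() {}

    // Applies the naming rule from the proposal: uppercase the first letter,
    // then append "TransformFactory" if the name already ends in "Layer",
    // otherwise append "LayerTransformFactory".
    static String guessSimpleClassName(String layerName) {
        if (layerName == null || layerName.isEmpty()) {
            throw new IllegalArgumentException("layer name must be non-empty");
        }
        String capitalized =
            Character.toUpperCase(layerName.charAt(0)) + layerName.substring(1);
        String suffix =
            capitalized.endsWith("Layer") ? "TransformFactory" : "LayerTransformFactory";
        return capitalized + suffix;
    }

    // Tries to load the guessed class from the classpath.
    // ASSUMPTION: packagePrefix is supplied by the caller; the proposal leaves it open.
    static Optional<Class<?>> tryLoad(String packagePrefix, String layerName) {
        String fqcn = packagePrefix + "." + guessSimpleClassName(layerName);
        try {
            return Optional.<Class<?>>of(Class.forName(fqcn));
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found or not loadable: the caller reports the diagnostic.
            return Optional.empty();
        }
    }
}
```

Under this reading, "someLayer" maps to SomeLayerTransformFactory and "gzip" maps to GzipLayerTransformFactory. An empty result is the point at which the diagnostic listing the registered layer transform names would be issued.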
Create 'daffodil-layers' Module
The following existing (as of Daffodil 3.1.0) layering transforms should be removed from daffodil-runtime1 and put in a daffodil-layers module.
- aisPayloadArmoring
- fourByteSwap
- base64_MIME
- gzip
- lineFolded_IMF
- lineFolded_iCalendar
These LayerTransformEnum values should be removed, as they are not implemented:
- quotedPrintable
- compress
- base64
- base64url
This new daffodil-layers module should have the usual src/test/resources and src/test/scala subtrees of tests, showing each of the layers in use both in unit tests, where sensible, and in TDML-based tests that illustrate "real" usage.
Documentation
A web page on the Daffodil user site should contain links to the documentation for each layer transform that is part of the daffodil-layers jar.
Scaladoc(?) of each Layer Transform
- These links should be to the scaladoc pages for each of these transforms.
- Each transform's scaladoc should be where the transforms are thoroughly documented.
- ??? There are pros/cons here.... needs discussion
- Scaladoc is written in src/main/scala, so it would end up nearly repeating the example code that lives in the src/test/resources and src/test/scala directories.
Test Automation Plan
Testing during Daffodil 'sbt test' Build Cycle
Tests in daffodil-test will illustrate how dynamic loading of a layer from a jar works, but the jar in that case isn't a published jar.
This uses stripped down versions of the IPv4 header checksum, and OTH-Gold-style check-digits.
- Improvement: Be sure to name these FakeIPv4 or similar to avoid confusion with the real ethernet-ip schemas.
In addition, all tests built into the daffodil-layers module would also run on 'sbt test'.
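For reference, the computation that an IPv4 header checksum layer has to perform is the standard Internet checksum (RFC 1071) over the header, with the checksum field itself counted as zero. The following is a minimal sketch of that standard algorithm, not the code of the test layer; the class name Ipv4ChecksumSketch is hypothetical.

```java
// Hypothetical, standalone sketch of the RFC 1071 checksum as applied to an IPv4 header.
final class Ipv4ChecksumSketch {
    private Ipv4ChecksumSketch() {}

    // Sums the header as big-endian 16-bit words using ones' complement
    // addition, treating the checksum bytes at offsets 10 and 11 as zero,
    // and returns the complement of the folded sum.
    static int headerChecksum(byte[] header) {
        if (header.length < 20 || header.length % 2 != 0) {
            throw new IllegalArgumentException(
                "IPv4 header must have an even length of at least 20 bytes");
        }
        int sum = 0;
        for (int i = 0; i < header.length; i += 2) {
            if (i == 10) {
                continue; // skip the checksum field itself
            }
            sum += ((header[i] & 0xFF) << 8) | (header[i + 1] & 0xFF);
        }
        // Fold any carries above 16 bits back into the low 16 bits.
        while ((sum >>> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >>> 16);
        }
        return ~sum & 0xFFFF;
    }

    public static void main(String[] args) {
        // A 20-byte example header whose stored checksum is 0xb861.
        byte[] header = {
            0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
            0x40, 0x11, (byte) 0xb8, 0x61,
            (byte) 0xc0, (byte) 0xa8, 0x00, 0x01,
            (byte) 0xc0, (byte) 0xa8, 0x00, (byte) 0xc7
        };
        System.out.println(Integer.toHexString(headerChecksum(header))); // prints b861
    }
}
```

On parse, a layer of this kind can recompute the value and compare it with the stored field; on unparse, it can write the recomputed value into the field.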
Testing of User-like Situations - Outside the Daffodil Repository Tree
DFDL Schemas Regression Testing Rig
There is a general schema regression testing rig that enables developers to easily maintain a large tree of DFDL schemas for regression testing against Daffodil.
This tool/repo should be published.
TODO: Create Apache Daffodil project repository for this.
This uses git submodules so that it does not copy schemas.
This should then test the "real" schemas mentioned below.
Updating "Real" DFDL Schema Examples that use Layers
The ethernet-ip DFDL schema (on GitHub) can be made into a full exercise of IPv4 checksums.
- It could further be extended to do ICMP, TCP, UDP checksums also.
- The PCAP schema which uses this ethernet-ip schema would inherit this also.
The OpenDFDL (on GitHub; note: should move to DFDLSchemas) example checkDigits schema can be made into a complete example.
The DFDLSchemas (on GitHub) GPS-SPS example computes parity bits. This could also be made into a complete test example, though it is quite large.