...
The current Parser/Unparser objects are specific to daffodil-runtime1. Introduction of daffodil-runtime2 requires replacing Parser/Unparser above with RuntimeGenerator. This is an object which is the output of the Daffodil schema compiler and which encapsulates runtime-specific optimizations and behavior. The above list becomes:
- DSOM
- Gram
- RuntimeGenerator
- For daffodil-runtime1 - Parser/Unparser objects created by parser() and unparser() methods - these are the runtime-specific objects that actually carry out parsing/unparsing. There are also RuntimeData classes which store information used by parsers/unparsers, and Evaluatable objects which encapsulate compiled expressions for evaluation at runtime.
- For daffodil-runtime2 - Code generated by the generateCode() method. Creates codgen.ast.Generator objects. (TBD: or objects encapsulating those in some manner.)
The DSOM and Gram layers of the Daffodil schema compiler should be runtime-independent.
...
DSOM - Daffodil Schema Object Model
...
Only lengthKind 'explicit' or 'implicit' for simple types, and only lengthKind 'implicit' for complex types.
Only the types long, int, short, byte, unsignedLong, unsignedInt, unsignedShort, unsignedByte, float, double, string, and hexBinary are supported.
This leaves out decimal, integers wider than 64 bits, boolean, and the date/time-related types.
The dfdl:representation is always 'binary'. No text numbers are supported.
As the dfdl:binaryNumberRep is always 'binary', integers are fixed-length 2’s complement.
When added, note that occursCountKind="expression", and choices that use only dfdl:choiceDispatchKey and dfdl:choiceBranchKey, imply that no backtracking or discrimination is required.
Rationale: This, together with requiring only dfdl:occursCountKind='expression', means there are no points of uncertainty, so there is no backtracking.
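As an illustration of the fixed-length 2's complement integers described above, here is a minimal Java sketch of how such fields might be read. The class and method names are hypothetical and not part of Daffodil. The byte order is passed in as a parameter, which is an assumption, since this section does not say how byte order is chosen.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of Daffodil or its generated code.
final class FixedLengthIntegers {
    private FixedLengthIntegers() {}

    // xs:int: exactly 4 bytes, two's complement.
    static int readInt(byte[] data, int offset, ByteOrder order) {
        return ByteBuffer.wrap(data, offset, 4).order(order).getInt();
    }

    // xs:unsignedInt: the same 4 bytes, widened to long so the top bit is not treated as a sign.
    static long readUnsignedInt(byte[] data, int offset, ByteOrder order) {
        return Integer.toUnsignedLong(readInt(data, offset, order));
    }

    // xs:short: exactly 2 bytes, two's complement.
    static short readShort(byte[] data, int offset, ByteOrder order) {
        return ByteBuffer.wrap(data, offset, 2).order(order).getShort();
    }
}
```

Because every length is known up front, each read is a single bounded operation with no lookahead, which is what keeps the required runtime library small.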
...
Properties that turn out to be required but should not be (e.g., anything about text numbers or date/time) are bugs in Daffodil that should be reported. An include-file DFDL format definition should hide these from users so they are not distracting.
Phases
The restrictions above suggest dividing the implementation of Runtime 2 into two distinct phases:
- Phase 1: (aka Runtime2P1) No expressions. All lengths are fixed. All arrays have fixed length.
- Phase 2: (aka Runtime2P2) Adding the DFDL expression language, lengthKind 'explicit', occursCountKind 'expression'.
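The Phase 2 features can be pictured with a small Java sketch. It shows why an expression-driven array count and a dispatch-key choice need no backtracking: the parser always knows what comes next from data it has already read. Everything here is an assumption made for illustration, including the class name, the one-byte count and key fields, and the branch keys 1 and 2. It is not the code Runtime 2 generates.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; field layouts and branch keys are invented for illustration.
final class NoBacktrackingSketch {
    private NoBacktrackingSketch() {}

    // Array whose occurrence count comes from an earlier field,
    // in the spirit of occursCountKind="expression".
    static List<Integer> parseCountedArray(ByteBuffer in) {
        int count = Byte.toUnsignedInt(in.get()); // previously parsed count field
        List<Integer> items = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            items.add(in.getInt()); // fixed-length 4-byte element
        }
        return items;
    }

    // Choice resolved by a dispatch key: exactly one branch is selected,
    // so a failed branch is an error, never a reason to try another branch.
    static String parseChoice(ByteBuffer in) {
        int key = Byte.toUnsignedInt(in.get());
        switch (key) {
            case 1:
                return "int:" + in.getInt();
            case 2:
                return "short:" + in.getShort();
            default:
                throw new IllegalStateException("no branch for dispatch key " + key);
        }
    }
}
```

An unknown key raises an error immediately instead of falling through to a speculative attempt at another branch, which is the behavior the rationale above relies on.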
Goals
- Use Julian Feinauer's contributed code generation library so that Java, C++, and Python backends are possible for Runtime 2
- Initial focus is a backend where the Infoset is Java POJO objects. The POJO definitions are part of the generated code, which is output as one or multiple text files.
DPath expressions (when implemented) compile into native language expressions that navigate Infoset objects the way handwritten code would.
- Dependencies on Java garbage collection should be minimized and documented.
The runtime library should have a minimal footprint.
Selective linking can be assumed (even for Java; see GraalVM).
Satisfy the requirements that caused the PLC4X project to create their own MSpec data format language. (Alternatively, Daffodil with Runtime 2 should be a good target for MSpec compilation.)
With one exception: DFDL is still going to be XML Schema based. Changing the syntax of the DFDL language is out of scope, as that's a front-end project. This is a runtime/back-end project.
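To make the POJO Infoset and DPath goals above more concrete, here is a hedged Java sketch of what generated code could look like for a hypothetical schema with a header/payloadLength element followed by a hexBinary payload whose length is given by the expression { ../header/payloadLength }. The class names, field names, and schema shape are all assumptions; the actual output of the code generator is not defined in this section.

```java
import java.nio.ByteBuffer;

// Hypothetical generated Infoset POJOs; names and shape are invented for illustration.
final class HeaderPojo {
    int payloadLength;
}

final class MessagePojo {
    final HeaderPojo header = new HeaderPojo();
    byte[] payload;
}

// Hypothetical generated parser for the schema described in the lead-in.
final class MessageParserSketch {
    private MessageParserSketch() {}

    static MessagePojo parse(ByteBuffer in) {
        MessagePojo message = new MessagePojo();
        message.header.payloadLength = in.getInt();

        // The DPath expression { ../header/payloadLength } becomes ordinary
        // field navigation, as handwritten code would do it.
        message.payload = new byte[message.header.payloadLength];
        in.get(message.payload); // throws BufferUnderflowException if the data is too short
        return message;
    }
}
```

The expression is not interpreted at runtime in this sketch: it is a plain field access on the Infoset objects, so no expression evaluator needs to be linked into the runtime library.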
...