Info

This design has been superseded by a new design.

Principles of Operation

Many global group defs exist with the intention that they are ONLY going to be used as hidden groups. An example of this is presence-bit indicator flags. These are 1-bit elements that live in a hidden group because they indicate the presence or absence of an element in the data. These flags can be used via dfdl:occursCount and dfdl:occursCountKind='expression', or via flags, choices, and discriminators. Either way they are a common case of hidden groups.

...

Polymorphic Terms, which are sometimes hidden and sometimes not within the same schema, are expected to be far less common. Many schemas are expected to have no such groups; in those schemas every Term will be known to be hidden, or known to be not hidden.

Choices, and the choice branch maps used for unparsing, are related to the isHidden problem due to the transition plan below.

Implementation: Schema Compiler

The Daffodil schema compiler has Root.refMap, which records all the group refs that refer to a particular group def. From it we can tell whether all such refs are hidden group refs, none of them are, or they are a mixture. That tells the schema compiler whether a given global group def is always hidden, never hidden, or a mixture.

Furthermore, by looking at the transitive closure of the refMap, one can determine for every Term whether it always appears within a hidden group, never appears within a hidden group, or some of both.

This calculation can be done in a single walk of the DSOM tree structure. Its complexity is linear in the number of Term objects in the DSOM tree.

The algorithm is roughly:

Code Block
// on class Term (for every term in the schema)
lazy val optIsKnownHidden : Option[Boolean] = 
     case the term is a model group, and the parent of the model group is a global group def:
       using the ref map, consider all group refs referring to this group def
         if all are hidden group refs
           then check that each element of simple type is defaultable or has outputValueCalc; SDE otherwise.
                result is Some(true)
         else if none are hidden group refs then result is Some(false)
         else None
     case the term is any other kind of term and has a lexically enclosing model group:
       the result is the optIsKnownHidden of the lexically enclosing model group.
     case the term is the root element: Some(false)
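
The derivation above, including the transitive step through nested group references, can be sketched in self-contained Scala. This is only an illustration under stated assumptions: GroupDef, GroupRef, contentIsHidden, and contextOf are hypothetical stand-ins, not Daffodil DSOM classes; the refs list plays the role of the Root.refMap entry for a group def; group defs are assumed to be non-recursive; and an unreferenced group def is arbitrarily treated as None.

```scala
object HiddenSketch {
  // Hypothetical stand-in for a global group def. `refs` plays the role of
  // the Root.refMap entry for this definition.
  final class GroupDef(val name: String) {
    var refs: List[GroupRef] = Nil

    // Some(true): every path to this def is hidden; Some(false): none is;
    // None: mixed. An unreferenced def yields None (an assumption).
    // All refs must be registered before this lazy val is first forced.
    lazy val optIsKnownHidden: Option[Boolean] =
      refs.map(_.contentIsHidden).distinct match {
        case List(known) => known // all references agree
        case _           => None  // no references, or they disagree
      }
  }

  // Hypothetical stand-in for a group reference. `enclosing` is the global
  // group def that lexically contains this reference, or None when the
  // reference sits directly under the root element.
  final class GroupRef(target: GroupDef, isHiddenRef: Boolean, enclosing: Option[GroupDef]) {
    target.refs = this :: target.refs // register in the toy ref map

    // A hidden group ref hides its content outright; otherwise the content
    // inherits whatever is known about the lexical context.
    def contentIsHidden: Option[Boolean] =
      if (isHiddenRef) Some(true) else contextOf(enclosing)
  }

  // Hiddenness inherited from the lexical context; the root is never hidden.
  def contextOf(enclosing: Option[GroupDef]): Option[Boolean] =
    enclosing.fold(Option(false))(_.optIsKnownHidden)
}
```

In this sketch, a group def reached only through a non-hidden reference that itself lies inside an always-hidden group def still evaluates to Some(true), while a group def referenced both by a hidden and by a non-hidden reference from the root evaluates to None.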

This Term.optIsKnownHidden is also carried on TermRuntimeData structures for all Terms.

If Term.optIsKnownHidden is Some(true), then the schema compiler should check that each element of simple type is either defaultable or has dfdl:outputValueCalc. It is an SDE otherwise.

No check is required if Term.optIsKnownHidden is Some(false), and a check will occur at runtime for Term.optIsKnownHidden = None.
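
The three outcomes of the compile-time check can be sketched as a small decision function. The names here (ElementInfo, CheckResult, Ok, DeferToRuntime, CompileTimeSDE) are hypothetical and exist only for this illustration; they are not part of the Daffodil code base.

```scala
object HiddenCheckSketch {
  sealed trait CheckResult
  case object Ok extends CheckResult             // nothing to report at compile time
  case object DeferToRuntime extends CheckResult // hiddenness unknown until runtime
  final case class CompileTimeSDE(message: String) extends CheckResult

  // Hypothetical summary of the element properties the check needs.
  final case class ElementInfo(
    name: String,
    isSimpleType: Boolean,
    isDefaultable: Boolean,
    hasOutputValueCalc: Boolean,
    optIsKnownHidden: Option[Boolean])

  def check(e: ElementInfo): CheckResult = e.optIsKnownHidden match {
    case Some(false) => Ok             // never hidden: no check required
    case None        => DeferToRuntime // mixed: the check happens at runtime
    case Some(true) =>
      // Always hidden: a simple-type element must be defaultable or
      // computed via dfdl:outputValueCalc.
      if (e.isSimpleType && !e.isDefaultable && !e.hasOutputValueCalc)
        CompileTimeSDE(s"Hidden element ${e.name} must be defaultable or have dfdl:outputValueCalc")
      else Ok
  }
}
```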

TBD: it may be useful to compute a global attribute on the root that indicates whether there are any of these mixed-hidden Terms. If all Terms are known to be either hidden or non-hidden with no ambiguity, further optimizations may apply. Other runtime backends may even disallow these mixed-hidden Terms. For Runtime1, however, the runtime overhead of this implementation is expected to be so small that this may be unnecessary, and profiling studies should indicate whether further performance attention is needed.

Implementation: Runtime var DIElement.isHidden

This flag member is set on infoset elements at the time they are created (parsing) or spliced into the infoset (unparsing - streaming unparser).

We dynamically maintain a boolean PState/UState member, isInsideHiddenContext, at runtime.

ParseOrUnparseState has the member:

Code Block
var isInsideHiddenContext : Boolean = false

In the parse1 and unparse1 methods of Parser and Unparser respectively, we implement the following (the example shows the parser):

Code Block
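trd match {
  // A sequence group ref that is known at compile time to be hidden.
  case srd: SequenceGroupRef if srd.optIsKnownHidden.isDefined &&
                                srd.optIsKnownHidden.get == true =>
    if (state.isInsideHiddenContext) {
      // Already inside a hidden context; nothing to change.
      parse(state)
    } else {
      // Outermost hidden group ref: set the flag for the duration of the parse.
      state.isInsideHiddenContext = true
      parse(state)
      state.isInsideHiddenContext = false
    }
  case _ => parse(state)
}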
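
The same flag discipline can also be written as a save-and-restore helper, which handles nesting without the explicit if/else. The sketch below is an illustration only: State and withHiddenContext are hypothetical names, and restoring the flag in a finally clause is an assumption about the desired behavior when the body throws, not something this design specifies.

```scala
object HiddenContextSketch {
  // Hypothetical stand-in for PState/UState.
  final class State {
    var isInsideHiddenContext: Boolean = false
  }

  // Runs `body` with the flag set, then restores the previous value,
  // even if `body` throws. Nested calls leave the flag true until the
  // outermost call returns.
  def withHiddenContext[A](state: State)(body: => A): A = {
    val saved = state.isInsideHiddenContext
    state.isInsideHiddenContext = true
    try body
    finally state.isInsideHiddenContext = saved
  }
}
```

Because the previous value is restored rather than unconditionally set to false, a hidden group ref nested inside another hidden group ref does not clear the flag when the inner one finishes.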

When infoset elements are created by element combinators (parsing), or when they are accepted and spliced into the infoset by element combinators (unparsing), the combinators call DIElement.setIsHidden:

Code Block
elem.setIsHidden {
  if (erd.optIsKnownHidden.isDefined)
    erd.optIsKnownHidden.get // no checking needed. It should have been done at schema compile time.
  else {
    val res = state.isInsideHiddenContext
    if (res && erd.isSimpleType)
      if (!erd.isDefaultable && !erd.isOutputValueCalc)
        state.SDE(...must be defaultable or OVC...) // checking in runtime case.
    res
  }
}

Runtime Checking

When setIsHidden(true) is called on an element of simple type, the element should be checked to ensure it is either defaultable or has dfdl:outputValueCalc. If it is neither, it is a runtime SDE, as shown above.

Implementation: Choice Combinators

The ChoiceCombinator.unparser method computes a choiceBranchEventMap, and also determines statically which branch should be taken if the choice is hidden.

...

The ChoiceBranchEventMap is computed without regard for isHidden; that is, it assumes the current choice is not itself hidden. The contents of each branch of the choice may or may not include hidden content.

Test Plan/Design-for-Test

Tests should ensure that elements are properly hidden even when they are multiple group references away from a true dfdl:hiddenGroupRef.

Test schemas with groups that appear both hidden and non-hidden are required to ensure the runtime determination is exercised.

Transition Plan

Releases 2.5.0 and prior did not use this technique.

...

All such properties should be computed on global group definitions, not repeatedly for every group reference.

Algorithm TBD

  • possibleFirstChildElementInInfoset calls possibleFirstChildTerms
  • possibleFirstChildTerms calls possibleNextSiblingTerms
  • possibleNextSiblingTerms calls enclosingTerms (note terms plural)

...