TBD: As with namespace distinctions, where we warn when an element is distinguishable only by its namespace (which is not represented in, for example, JSON), we could also warn about anonymous choices or other constructs that make metadata mapping to Drill (or NiFi or ... ) harder.
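The namespace-only check mentioned above could be sketched roughly as follows. This is a minimal illustration, not Daffodil's implementation: the `(namespace, local_name)` pair representation, the function name `namespace_only_collisions`, and the example URNs are all assumptions made for the sketch.

```python
from collections import defaultdict


def namespace_only_collisions(elements):
    """Return local names shared by elements that live in different namespaces.

    `elements` is an iterable of (namespace, local_name) pairs. This
    representation is hypothetical and chosen only for illustration.
    """
    namespaces_by_name = defaultdict(set)
    for namespace, local_name in elements:
        namespaces_by_name[local_name].add(namespace)
    # A name is a problem only if more than one namespace uses it, because
    # a target format without namespaces (e.g. JSON) would merge them.
    return {
        name: sorted(namespaces)
        for name, namespaces in namespaces_by_name.items()
        if len(namespaces) > 1
    }


# Hypothetical sibling elements: two "record" elements differ only by namespace.
elements = [
    ("urn:example:a", "record"),
    ("urn:example:b", "record"),
    ("urn:example:a", "header"),
]

for name, namespaces in namespace_only_collisions(elements).items():
    print(f"warning: element '{name}' is distinguished only by namespace: {namespaces}")
```

A similar pass could flag anonymous choices or other hard-to-map constructs, but what exactly should be flagged is still open.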

| Type (of element unless noted) | Nillable (yes/no, * = don't care) | Dimension (scalar, optional, array, * = don't care) | Drill metadata |
|---|---|---|---|
| * | * | array | sub-table with added index column to hold position (note: name of index column should not collide) |
| date/time | * | * | TBD: are there date/time types corresponding? If so use them, if not use strings in ISO8601 format |
| string | | | Must map any DFDL infoset illegal string characters to Drill-allowed characters (analogous to what we do with XML-illegal characters for converting the DFDL infoset to XML). |
| string | * | scalar | String (non nullable). TBD: is empty string distinguished from null string in Drill? (ANSI SQL databases distinguish empty strings from null strings - DFDL also distinguishes these. Some other databases do not) |
| simple type | no | scalar | corresponding Drill type |
| simple type | yes | scalar | nullable corresponding Drill type (TBD: no distinction from string. Combine with string if there is no distinction) |
| simple type | no | optional | nullable corresponding Drill type (TBD: no distinction from string. Combine with string if there is no distinction) |
| simple type | yes | optional | nullable corresponding Drill type (note: the two concepts of optional and nullable are collapsed) (TBD: no distinction from string. Combine with string if there is no distinction) |
| simple type | no | array | sub table with index and non-nullable value column (TBD: no distinction from string. Combine with string if there is no distinction) |
| simple type | yes | array | sub table with index and nullable value column (TBD: no distinction from string. Combine with string if there is no distinction) |
| bounded size unsigned integers (excluding unsignedLong) | * | * | next larger size signed integer |
| unsignedLong | | | TBD: Do we have bignum? TBD: should we just restrict this to range of signed long type? TBD: just use string? |
| integer (unbounded) | | | TBD: Do we have a corresponding type? (if not use string) |
| decimal | | | TBD: Do we have a corresponding type? (If not use string) |
| complex (element with sequence or choice) | no | scalar | sub-map (is there such a thing?) |
| complex (element with sequence or choice) | no | optional or array | arrayMap |
| complex (element with sequence or choice) | yes | scalar, optional, or array | sub-table with index column and a map. |
| complex sequence | no | scalar | TBD: merge children into parent context? TBD: extend child element names with enclosing element name? TBD: name collisions? TBD: more than one child with same name? (non-array case) |
| complex sequence | yes | scalar | sub table |
| complex sequence | * | optional or array | sub table |
| complex choice | | | |
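The simple-type rows of the table can be read as a small decision rule. The sketch below encodes only what the table states: bounded unsigned integers widen to the next larger signed type, optional and nillable collapse into "nullable", and arrays become a sub-table with an index column. The function name `map_simple_element` and the dictionary it returns are hypothetical. The type names are XSD names rather than confirmed Drill type names, and `unsignedLong`, unbounded `integer`, and `decimal` pass through unchanged because their mapping is still TBD.

```python
# Bounded unsigned XSD integer types and the next larger signed type.
# unsignedLong is deliberately absent: its mapping is still an open question.
UNSIGNED_TO_SIGNED = {
    "unsignedByte": "short",
    "unsignedShort": "int",
    "unsignedInt": "long",
}


def map_simple_element(xsd_type, nillable, dimension):
    """Describe the metadata shape for a simple-typed element.

    The returned dict is a hypothetical representation for illustration,
    not a Drill API structure.
    """
    if dimension not in ("scalar", "optional", "array"):
        raise ValueError(f"unknown dimension: {dimension!r}")

    # Widen bounded unsigned integers; every other type name passes through.
    value_type = UNSIGNED_TO_SIGNED.get(xsd_type, xsd_type)

    if dimension == "array":
        # Arrays become a sub-table: an index column for position plus a
        # value column whose nullability follows the element's nillability.
        return {
            "kind": "sub-table",
            "columns": ["index", "value"],
            "value_type": value_type,
            "value_nullable": nillable,
        }

    # Scalar or optional: optional and nillable collapse into "nullable".
    return {
        "kind": "column",
        "type": value_type,
        "nullable": nillable or dimension == "optional",
    }


print(map_simple_element("unsignedShort", False, "optional"))
print(map_simple_element("string", True, "array"))
```

Because optional and nillable collapse into one flag, a consumer of this metadata cannot tell an absent element from a nil one, which is the trade-off the table notes.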








About Other Metadata

Besides Apache Drill, there are other systems with similar metadata organization.

  • Apache NiFi Records
  • Apache Avro: adds a restriction that unions cannot directly contain other unions.
    • Note that in a pure logical data system there is no need for unions that contain other unions, because such unions can always be flattened into a single union.
    • In DFDL, nested "choices" are not uncommon if one choice uses discriminators and another uses direct dispatch.
      • This is a common idiom when trying to get direct dispatch plus "a default" choice branch for when the value is not found as a choice branch key. 
      • DFDL may be amended with a default choice branch feature to help with this. 
  • Apache Pulsar: supports several kinds of complex type structures. Both NiFi and Avro are supported.
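
The flattening of nested unions mentioned for Avro can be illustrated with a short sketch. It assumes a union is modeled as a Python list of branch types, mirroring how Avro schemas write unions as JSON arrays. The function name `flatten_union` is hypothetical. Only the logical type structure is flattened. How a DFDL parser selects a branch (discriminators versus direct dispatch) is not modeled here.

```python
def flatten_union(branches):
    """Flatten nested unions into one list of branch types.

    A nested union is a list inside the list. Order of first appearance is
    preserved and repeated branches are dropped, since a flattened union
    should not list the same branch type twice.
    """
    flat = []
    for branch in branches:
        # Recurse into nested unions; treat anything else as a single branch.
        members = flatten_union(branch) if isinstance(branch, list) else [branch]
        for member in members:
            if member not in flat:
                flat.append(member)
    return flat


# An outer choice whose second branch is itself a choice with a further nested one.
nested = ["null", ["int", ["string", "null"]], "long"]
print(flatten_union(nested))
```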