
Apache Drill provides query capabilities against a variety of data systems.

Enabling Drill to read DFDL-described data would make any data that has a DFDL schema describing its format immediately queryable.

Metadata Mapping

TBD: does Drill support:

  • nullable complex types (a column containing a sub-table that is itself nullable)?
  • date/time/datetime types?
  • big int, big decimal?
  • nullable strings (distinguished from empty strings)?

TBD: should we be trying to simplify the metadata to make querying easier, or be ruthlessly uniform so that queries will be ugly but at least consistent?

TBD: should we be trying to handle XSD here (all of it) or just DFDL?

| type | nillable (yes/no, * = don't care) | dimension (scalar, optional, array, * = don't care) | Drill metadata |
| --- | --- | --- | --- |
| * | * | array | Sub-table with an added index column to hold position (note: the name of the index column should not collide with existing column names) |
| date/time | * | * | TBD: are there corresponding Drill date/time types? If so, use them; if not, use strings in ISO 8601 format |
| string | | | Must map any DFDL infoset string characters that are illegal in Drill to Drill-allowed characters (analogous to what we do with XML-illegal characters when converting the DFDL infoset to XML) |
| string | * | scalar | String (non-nullable). TBD: is the empty string distinguished from the null string in Drill? (ANSI SQL databases distinguish empty strings from null strings, and DFDL also distinguishes these. Some other databases do not.) |
| simple type | no | scalar | Corresponding Drill type |
| simple type | yes | scalar | Nullable corresponding Drill type (TBD: no distinction from string. Combine with string if there is no distinction) |
| simple type | no | optional | Nullable corresponding Drill type (TBD: no distinction from string. Combine with string if there is no distinction) |
| simple type | yes | optional | Nullable corresponding Drill type (note: the two concepts of optional and nullable are collapsed) (TBD: no distinction from string. Combine with string if there is no distinction) |
| simple type | no | array | Sub-table with index and non-nullable value column (TBD: no distinction from string. Combine with string if there is no distinction) |
| simple type | yes | array | Sub-table with index and nullable value column (TBD: no distinction from string. Combine with string if there is no distinction) |
| bounded-size unsigned integers (excluding unsignedLong) | * | * | Next larger size signed integer |
| unsignedLong | | | TBD: do we have bignum? TBD: should we just restrict this to the range of the signed long type? TBD: just use string? |
| integer (unbounded) | | | TBD: do we have a corresponding type? (If not, use string) |
| decimal | | | TBD: do we have a corresponding type? (If not, use string) |
| complex sequence | no | scalar | TBD: merge children into parent context? TBD: extend child element names with enclosing element name? TBD: name collisions? TBD: more than one child with the same name? (non-array case) |
| complex sequence | yes | scalar | Sub-table |
| complex sequence | * | optional or array | Sub-table |
| complex choice | | | |
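
The array rule in the table (a sub-table with an added index column whose name does not collide) could be sketched roughly as follows. This is only an illustration in Python, not Drill or Daffodil code: the function names, the default column name `index`, the `_1`, `_2`, ... suffix scheme, and the 1-based positions are all assumptions made for the example.

```python
from typing import Any, Dict, Iterable, List, Mapping, Sequence


def unique_index_name(existing: Iterable[str], base: str = "index") -> str:
    """Pick a column name derived from `base` that is not already in use.

    Hypothetical helper: appends _1, _2, ... until the name is free.
    """
    taken = set(existing)
    name = base
    suffix = 1
    while name in taken:
        name = f"{base}_{suffix}"
        suffix += 1
    return name


def array_to_subtable(
    rows: Sequence[Mapping[str, Any]], index_base: str = "index"
) -> List[Dict[str, Any]]:
    """Turn the occurrences of an array element into sub-table rows.

    Each occurrence is a mapping of column name to value. A position
    column is added, named so that it cannot collide with any column
    already present in the occurrences. Positions start at 1 (assumption).
    """
    existing = set()
    for row in rows:
        existing.update(row.keys())
    index_column = unique_index_name(existing, index_base)
    return [
        {index_column: position, **row}
        for position, row in enumerate(rows, start=1)
    ]


# An array of simple values is wrapped in a single value column first;
# None stands in for a nil occurrence (nillable = yes).
simple_array = [{"value": 10}, {"value": None}, {"value": 30}]
simple_subtable = array_to_subtable(simple_array)

# A complex array whose children already use the name "index".
complex_array = [{"index": "a", "x": 1}, {"index": "b", "x": 2}]
complex_subtable = array_to_subtable(complex_array)
```

In the second case the position column ends up under a different name than `index`, which is the collision the note in the table warns about.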

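The rule for bounded-size unsigned integers (map to the next larger size signed integer) can be checked with a small sketch. The XSD type names and bit widths are standard; the dictionary and function names are hypothetical, and the choice of actual Drill type names is deliberately left out because that part of the mapping is still open. unsignedLong is rejected here only because its mapping is TBD above.

```python
# Bit widths of the bounded unsigned XSD types covered by the rule.
UNSIGNED_BITS = {"unsignedByte": 8, "unsignedShort": 16, "unsignedInt": 32}

# Bit widths of the signed XSD types they widen to.
SIGNED_BITS = {"short": 16, "int": 32, "long": 64}

# Each unsigned type maps to the next larger signed type.
WIDENING = {
    "unsignedByte": "short",
    "unsignedShort": "int",
    "unsignedInt": "long",
}


def widen_unsigned(xsd_type: str) -> str:
    """Return the signed XSD type that a bounded unsigned type widens to."""
    if xsd_type == "unsignedLong":
        # No larger bounded signed type exists; the mapping is still TBD.
        raise ValueError("unsignedLong mapping is undecided")
    return WIDENING[xsd_type]


def fits_without_loss(xsd_type: str) -> bool:
    """Check that every value of the unsigned type fits in the widened type."""
    max_unsigned = 2 ** UNSIGNED_BITS[xsd_type] - 1
    max_signed = 2 ** (SIGNED_BITS[widen_unsigned(xsd_type)] - 1) - 1
    return max_unsigned <= max_signed


all_fit = all(fits_without_loss(t) for t in UNSIGNED_BITS)
```

The same arithmetic shows why unsignedLong needs separate treatment: its maximum value, 2^64 - 1, exceeds the signed long maximum of 2^63 - 1.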






