...

  • Data transformations may apply to either the key or the value of a record. Such transformations will have *Key and *Value variants that reuse common functionality from a shared base class.

  • We expect some common utilities for data transformations to take shape:

    • Caching the changes they make to Schema objects, possibly preserving only the last-seen one, since the source data Schema is unlikely to change often.

    • Copying Schema objects while optionally excluding the fields the transformation is modifying. Likewise, copying a Struct object to another Struct that has a different Schema, except for the fields being modified.

    • Where fields are being added and a field name is specified in configuration, we will want a consistent way to convey whether it should be created as a required or an optional field. We can use a leading '!' or '?' character for this purpose if the user wants to override the default determined by the transformation.

    • ConfigDef does not provide a Type.MAP, but for the time being we can piggyback on Type.LIST and represent a map as a list of key-value pairs, with each key separated from its value by ':'.
    • Where field names are expected, in some cases we should allow access to nested fields through the dotted syntax that is common for this purpose (and accordingly, we will need some utilities for accessing a field that may be nested).
    • Several such configs raise escaping considerations, so we will need utilities that assume a consistent escaping style (e.g. backslashes).
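
As a rough illustration of the configuration conventions listed above, here is a minimal sketch in plain Java. It is not part of any existing Connect API: the class name TransformConfigSketch and all of its methods are hypothetical. It assumes that '!' marks a field as required and '?' as optional (the proposal does not say which is which), and it uses nested java.util.Map values as a stand-in for Struct so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; none of these names exist in Kafka Connect.
final class TransformConfigSketch {

    /** A configured field name plus whether the field should be optional. */
    static final class FieldSpec {
        final String name;
        final boolean optional;

        FieldSpec(String name, boolean optional) {
            this.name = name;
            this.optional = optional;
        }
    }

    /**
     * Assumption: a leading '!' means required and a leading '?' means optional.
     * Without a prefix, the transformation's own default applies.
     */
    static FieldSpec parseFieldSpec(String spec, boolean optionalByDefault) {
        if (spec.startsWith("!")) return new FieldSpec(spec.substring(1), false);
        if (spec.startsWith("?")) return new FieldSpec(spec.substring(1), true);
        return new FieldSpec(spec, optionalByDefault);
    }

    /** Splits on every unescaped separator; a backslash escapes the next character. */
    static List<String> splitUnescaped(String input, char separator) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (escaped) {
                current.append(c);          // keep the escaped character literally
                escaped = false;
            } else if (c == '\\') {
                escaped = true;             // drop the backslash itself
            } else if (c == separator) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (escaped) throw new IllegalArgumentException("Dangling escape in: " + input);
        parts.add(current.toString());
        return parts;
    }

    /** Interprets a LIST-typed config value as a map of "key:value" entries. */
    static Map<String, String> parseMap(List<String> entries) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String entry : entries) {
            List<String> kv = splitUnescaped(entry, ':');
            if (kv.size() != 2) {
                throw new IllegalArgumentException("Expected key:value but got: " + entry);
            }
            result.put(kv.get(0), kv.get(1));
        }
        return result;
    }

    /** Resolves a dotted path against nested maps; returns null if any segment is missing. */
    static Object getNested(Map<String, Object> root, String dottedPath) {
        Object current = root;
        for (String segment : splitUnescaped(dottedPath, '.')) {
            if (!(current instanceof Map)) return null;
            current = ((Map<?, ?>) current).get(segment);
        }
        return current;
    }
}
```

Under these assumptions, parseFieldSpec("?ts", false) yields an optional field named ts, the list entry x\:y:z maps the key x:y to the value z, and the path a\.b addresses a single top-level field literally named a.b rather than a field b nested under a. Sharing one splitting routine between the map syntax and the dotted syntax is one way to keep the escaping style consistent across configs.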

...