...

Implementation Details (with Option #2)

  • Parse the dataset URI in the connector to determine the type of dataset
  • Rely on the HBase dataset URI being set up with the required mappings. Rely on Kite-HDFS partitioning for setting up the HBase partitioning strategy
  • KiteExtractor to support creating HBase datasets via the Kite SDK and reading records, piggybacking on the partitioning implementation of Kite-HDFS
  • KiteLoader to support creating HBase datasets via the Kite SDK and writing records. Unlike Kite-HDFS, which can create temporary datasets and merge them only when the job succeeds (commit phase), HBase gives us no such option, so we have to commit as we write. We are aware that if a job or task failure happens at this point, there can be partial commits and/or duplicates
  • If we support DFM, add the relevant DFM configs and code in KiteConnector - TBD
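
The first and fourth points above can be sketched roughly as follows. This is a minimal illustration only, not the connector's actual code: the class name DatasetUriSketch, the StorageType enum, and both helper methods are hypothetical, and the sketch assumes dataset URIs of the form "dataset:<storage>:<location>" (for example "dataset:hbase:zk1,zk2:2181/users"). It uses plain string handling rather than Kite SDK calls.

```java
import java.util.Locale;

// Hypothetical helper, not part of Kite or the connector: it only shows how
// the storage type could be derived from the dataset URI prefix.
final class DatasetUriSketch {

    // Storage back ends this sketch distinguishes (assumed set).
    enum StorageType { HDFS, HIVE, HBASE, UNKNOWN }

    private static final String PREFIX = "dataset:";

    // Returns the storage type named right after the "dataset:" prefix.
    static StorageType storageType(String uri) {
        if (uri == null) {
            throw new IllegalArgumentException("uri must not be null");
        }
        String trimmed = uri.trim();
        if (!trimmed.toLowerCase(Locale.ROOT).startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a dataset URI: " + uri);
        }
        String rest = trimmed.substring(PREFIX.length());
        int colon = rest.indexOf(':');
        String scheme = (colon < 0 ? rest : rest.substring(0, colon))
                .toLowerCase(Locale.ROOT);
        switch (scheme) {
            case "hdfs":
                return StorageType.HDFS;
            case "hive":
                return StorageType.HIVE;
            case "hbase":
                return StorageType.HBASE;
            default:
                return StorageType.UNKNOWN;
        }
    }

    // Per the notes above, HBase writes are committed as they happen;
    // there is no temporary dataset to merge in a commit phase.
    static boolean commitsAsItWrites(StorageType type) {
        return type == StorageType.HBASE;
    }
}
```

A loader could branch on commitsAsItWrites(...) to skip the temporary-dataset merge step for HBase, accepting the partial-commit risk described above.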

...