...

Overall, there are two ways to implement this functionality using the Kite SDK:

Option 1

Duplicate much of the code in KiteConnector and add a new, independent connector, KiteHbaseConnector. The major cons are the code duplication and the effort of supporting yet another connector.

 

Option 2

  • Use the current KiteConnector and either add an enum to select the type of dataset Kite will create underneath, or parse the URI given in the FromJobConfig and ToJobConfig to determine whether the dataset is Hive, HBase, or HDFS

 

Code Block
public enum DataSetType {
  HDFS,
  HBASE,
  HIVE
}

// Use this enum to determine which dataset Kite needs to create underneath.
@Input
public DataSetType datasetType;

or

// Parse this URI to figure out the dataset type.
@Input(size = 255, validators = {@Validator(DatasetURIValidator.class)})
public String uri;

 

 

  • Piggyback on config annotations (the conditions we have long intended to add) to show only the relevant configs. For instance, hdfsHostAndPort may not be relevant for Hive or HBase.


Pros:

  • No code duplication
  • No awkward build dependency of KiteHbaseConnector on KiteConnector, which could complicate independent connector upgrades
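
The URI-parsing variant of Option 2 could look roughly like the sketch below. It is illustrative only: DatasetTypeResolver, fromUri and showsHdfsHostAndPort are hypothetical names that do not exist in Sqoop or Kite, and the sketch assumes dataset URIs of the form dataset:<scheme>:..., where the scheme is hdfs, hive or hbase. It also assumes, following the hdfsHostAndPort remark above, that this config is shown only for HDFS datasets.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of Sqoop or the Kite SDK.
final class DatasetTypeResolver {

  enum DataSetType { HDFS, HBASE, HIVE }

  // Assumed prefix for every dataset URI handled by the connector.
  private static final String PREFIX = "dataset:";

  private DatasetTypeResolver() {}

  // Derives the dataset type from the storage scheme that follows "dataset:".
  static DataSetType fromUri(String uri) {
    if (uri == null || !uri.startsWith(PREFIX)) {
      throw new IllegalArgumentException("Not a dataset URI: " + uri);
    }
    String rest = uri.substring(PREFIX.length());
    int colon = rest.indexOf(':');
    if (colon <= 0) {
      throw new IllegalArgumentException("Missing storage scheme in: " + uri);
    }
    String scheme = rest.substring(0, colon).toLowerCase(Locale.ROOT);
    switch (scheme) {
      case "hdfs":
        return DataSetType.HDFS;
      case "hbase":
        return DataSetType.HBASE;
      case "hive":
        return DataSetType.HIVE;
      default:
        throw new IllegalArgumentException("Unsupported storage scheme: " + scheme);
    }
  }

  // Assumption: hdfsHostAndPort is only relevant when the dataset lives on HDFS.
  static boolean showsHdfsHostAndPort(DataSetType type) {
    return type == DataSetType.HDFS;
  }
}
```

With a helper like this, the connector would not need a separate datasetType input: a call such as DatasetTypeResolver.fromUri(uri) on the FromJobConfig or ToJobConfig URI would select the dataset type, and the same value could drive which configs are displayed.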

Implementation Details

  • Add support for HBase-related configs
  • Add support for creating an HBase dataset in Kite

...