Title : Sqoop Config Input as a Top Level Entity 

JIRA: https://issues.apache.org/jira/browse/SQOOP-1516

Summary

Configs are exposed in code via the Connectors and Drivers (the two CONFIGURABLES supported). They annotate the config classes with the "@Config" annotation, and that is how sqoop registers these entities into the repository during server startup. If a connector already exists in the sqoop repository (or during the upgrade path), the connector's upgrade API is invoked to update the attributes of the config object.

The current SQ_CONFIG stores the top level config entries per configurable. 

     +-------------------------------------+
     | SQ_CONFIG                           |
     +-------------------------------------+
     | SQ_CFG_ID: BIGINT PK AUTO-GEN       |
     | SQ_CONFIGURABLE: BIGINT             |FK SQ_CONFIGURABLE(SQC_ID)
     | SQ_CFG_NAME: VARCHAR(64)            |
     | SQ_CFG_TYPE: VARCHAR(32)            |"LINK"|"JOB"
     | SQ_CFG_INDEX: SMALLINT              |
     +-------------------------------------+
 

Currently we support 2 types of configs: LINK and JOB. The MConfigType enum encapsulates this information. Its value is what is stored in "SQ_CFG_TYPE" when a config is registered.

@InterfaceAudience.Private
@InterfaceStability.Unstable
public enum MConfigType {
  /** Unknown config type */
  OTHER,
  @Deprecated
  // NOTE: only exists to support the connector data upgrade path
  CONNECTION,
  /** link config type */
  LINK,
  /** Job config type */
  JOB;
}
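
As a minimal sketch of how the enum value and the "SQ_CFG_TYPE" column relate, the class below writes the enum name to the column and parses it back. ConfigTypeColumnSketch and its nested ConfigType enum are hypothetical stand-ins for illustration, not Sqoop classes, and falling back to OTHER for unrecognized values is an assumption rather than documented repository behavior.

```java
import java.util.Locale;

final class ConfigTypeColumnSketch {

  /** Stand-in for MConfigType, reduced to the values listed above. */
  enum ConfigType { OTHER, CONNECTION, LINK, JOB }

  /** Value written to SQ_CFG_TYPE for a given config type (the enum name). */
  static String toColumn(ConfigType type) {
    return type.name();
  }

  /**
   * Parses a stored SQ_CFG_TYPE value.
   * Assumption: missing or unrecognized values map to OTHER.
   */
  static ConfigType fromColumn(String stored) {
    if (stored == null) {
      return ConfigType.OTHER;
    }
    try {
      return ConfigType.valueOf(stored.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      return ConfigType.OTHER;
    }
  }

  public static void main(String[] args) {
    System.out.println(toColumn(ConfigType.LINK));
    System.out.println(fromColumn("JOB"));
    System.out.println(fromColumn("unknown"));
  }
}
```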

Each class annotated with @Config exposes a list of inputs via the "@Input" annotation and its attributes. The @Input annotated fields are stored in another table, SQ_INPUT, along with the supported attributes and their values. SQ_INPUT stores only the input keys and the attribute values. The actual values of the inputs are specific to each JOB and each LINK (refer to this wiki to understand sqoop entities), hence there are 2 additional tables, SQ_JOB_INPUT and SQ_LINK_INPUT, where those values are stored.

     +----------------------------+
     | SQ_INPUT                   |
     +----------------------------+
     | SQI_ID: BIGINT PK AUTO-GEN |
     | SQI_NAME: VARCHAR(64)      |
     | SQI_CONFIG: BIGINT         |FK SQ_CONFIG(SQ_CFG_ID)
     | SQI_INDEX: SMALLINT        |
     | SQI_TYPE: VARCHAR(32)      |"STRING"|"MAP"
     | SQI_STRMASK: BOOLEAN       |
     | SQI_STRLENGTH: SMALLINT    |
     | SQI_ENUMVALS: VARCHAR(100) |
     +----------------------------+
 
     +----------------------------+
     | SQ_LINK_INPUT              |
     +----------------------------+
     | SQ_LNKI_LINK: BIGINT PK    | FK SQ_LINK(SQ_LNK_ID)
     | SQ_LNKI_INPUT: BIGINT PK   | FK SQ_INPUT(SQI_ID)
     | SQ_LNKI_VALUE: LONG VARCHAR|
     +----------------------------+

     +----------------------------+
     | SQ_JOB_INPUT               |
     +----------------------------+
     | SQBI_JOB: BIGINT PK        | FK SQ_JOB(SQB_ID)
     | SQBI_INPUT: BIGINT PK      | FK SQ_INPUT(SQI_ID)
     | SQBI_VALUE: LONG VARCHAR   |
     +----------------------------+
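
The split between input definitions (SQ_INPUT) and per-link values (SQ_LINK_INPUT) can be sketched roughly as follows. Everything in this block is an illustrative assumption: the nested Input annotation and its sensitive/size attributes are stand-ins declared locally, ExampleLinkConfig and InputRow are hypothetical, and the "configName.fieldName" key format is assumed. It is not the repository's registration code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InputRegistrationSketch {

  /** Hypothetical stand-in for the @Input annotation. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface Input {
    boolean sensitive() default false;
    short size() default -1;
  }

  /** Hypothetical config class exposing two inputs. */
  static final class ExampleLinkConfig {
    @Input(size = 128) String connectionString;
    @Input(size = 40, sensitive = true) String password;
    String notAnInput; // not annotated, so never registered
  }

  /** Analogue of one SQ_INPUT row: key and attributes, but no value. */
  static final class InputRow {
    final long id;
    final String name;
    final int index;
    final String type;
    final boolean masked;
    final short length;

    InputRow(long id, String name, int index, String type, boolean masked, short length) {
      this.id = id;
      this.name = name;
      this.index = index;
      this.type = type;
      this.masked = masked;
      this.length = length;
    }
  }

  /** Collects the annotated fields of a config class as SQ_INPUT-like rows. */
  static List<InputRow> describeInputs(String configName, Class<?> configClass) {
    Field[] fields = configClass.getDeclaredFields();
    // Reflection does not guarantee field order, so sort by name for a stable index.
    Arrays.sort(fields, Comparator.comparing(Field::getName));
    List<InputRow> rows = new ArrayList<>();
    for (Field field : fields) {
      Input input = field.getAnnotation(Input.class);
      if (input == null) {
        continue;
      }
      String type = Map.class.isAssignableFrom(field.getType()) ? "MAP" : "STRING";
      int index = rows.size();
      rows.add(new InputRow(index + 1, configName + "." + field.getName(), index, type,
          input.sensitive(), input.size()));
    }
    return rows;
  }

  public static void main(String[] args) {
    List<InputRow> inputs = describeInputs("linkConfig", ExampleLinkConfig.class);
    for (InputRow row : inputs) {
      System.out.println(row.name + " " + row.type + " masked=" + row.masked);
    }
    // SQ_LINK_INPUT analogue: linkId -> (inputId -> value).
    Map<Long, Map<Long, String>> linkInputs = new HashMap<>();
    long inputId = inputs.get(0).id;
    linkInputs.computeIfAbsent(1L, k -> new HashMap<>()).put(inputId, "jdbc:mysql://host-a/db");
    linkInputs.computeIfAbsent(2L, k -> new HashMap<>()).put(inputId, "jdbc:mysql://host-b/db");
    System.out.println(linkInputs);
  }
}
```

The point of the sketch is that one input definition is shared, while each link (or job) keeps its own value keyed by the input's id.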


 


 

 

The following table lists the type and, in parentheses, the number of configs exposed by each of the configurables. Each config object is represented as a list, so a connector can expose, for example, a FROM-CONFIG with more than one config object in it.

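     +--------------+-------------------+--------------------+
     | CONFIGURABLE | LINK-CONFIG       | JOB-CONFIG         |
     +--------------+-------------------+--------------------+
     | CONNECTOR    | (1)               | (2)                |
     |              | LINK-CONFIG       | FROM-CONFIG        |
     |              | MLinkConfigList   | MFromConfigList    |
     |              |                   | TO-CONFIG          |
     |              |                   | MToConfigList      |
     +--------------+-------------------+--------------------+
     | DRIVER       | NONE              | (1)                |
     |              |                   | DRIVER-CONFIG      |
     |              |                   | MDriverConfigList  |
     +--------------+-------------------+--------------------+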

 

This design proposes an enhancement to the existing functionality (command line and REST APIs) to support reuse of config objects by providing hooks to perform RU (Read and Update) operations on the config input objects independently.


Requirements
  • Read and Update the Config Inputs by Type and By Job

 

Non Goals

 

  • Config

Design

Shell Commands

 

REST API

GET
v1/config/link?configurableId=?&direction (get all the config details for the given configurable)

POST
v1/config/link?configurableId=?&direction (post data for the link config object)
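
A client for the two calls above might build its requests roughly like this. This is only a sketch under stated assumptions: ConfigEndpointSketch and its methods are hypothetical, the server base URL is a placeholder, passing direction as a name=value pair is an assumption (the proposal does not spell out the value format), and the JSON content type and body are assumed as well. The requests are built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ConfigEndpointSketch {

  /** Builds the link-config URI; the "direction=VALUE" form is an assumption. */
  static URI linkConfigUri(String serverBase, long configurableId, String direction) {
    String base = serverBase.endsWith("/") ? serverBase : serverBase + "/";
    return URI.create(base + "v1/config/link?configurableId=" + configurableId
        + "&direction=" + direction);
  }

  /** GET: read all config details for the given configurable. */
  static HttpRequest readRequest(URI uri) {
    return HttpRequest.newBuilder(uri).GET().build();
  }

  /** POST: send data for the link config object (JSON body is an assumption). */
  static HttpRequest updateRequest(URI uri, String jsonBody) {
    return HttpRequest.newBuilder(uri)
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
        .build();
  }

  public static void main(String[] args) {
    // Placeholder server address and direction value, chosen for illustration only.
    URI uri = linkConfigUri("http://localhost:12000/sqoop", 1L, "FROM");
    System.out.println(readRequest(uri).method() + " " + uri);
    System.out.println(updateRequest(uri, "{}").method() + " " + uri);
  }
}
```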

Testing

 

 

 
