This document is a guide to the public-facing Sqoop Repository API as of the 1.99.5 release.
The API may evolve in future releases, so everything here describes its state in 1.99.5.
Background
Sqoop2 supports a persistent store for Sqoop entities such as the Configurables (Connector and Driver), the Configs exposed by the Connectors, Jobs, and Job runs. This persistent store is commonly referred to as the repository. Sqoop2 also exposes REST APIs and shell commands to perform CRUD operations on these entities: connectors and drivers, the connector configs that hold link and job information, and Sqoop jobs with their configs. The repository therefore keeps a history of the entity objects created and updated over time. To make the persistent store easy to access, Sqoop2 also exposes a simple Java-based repository API that different data stores can implement to store the entity objects.
At this point, only relational data stores are supported, since the entities are related to each other and those relations are easier to express in a relational data store. A non-relational store could implement the repository API in the future. The repository structure (the schema and its fields) has changed across Sqoop releases, and the APIs have changed with it to fit the new structures.
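As a rough illustration of how a store-agnostic repository API lets different data stores plug in, here is a minimal sketch. Every name in it (`LinkRecord`, `LinkStore`, `InMemoryLinkStore`) is hypothetical and chosen for this example only; none of them is a Sqoop class or signature. The in-memory backend stands in for a real data store, and a relational backend would implement the same interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RepositorySketch {

    /** Hypothetical, simplified stand-in for a stored link (not Sqoop's model class). */
    static final class LinkRecord {
        final long id;
        final String name;
        LinkRecord(long id, String name) { this.id = id; this.name = name; }
    }

    /** Hypothetical store-agnostic contract covering CRUD on links. */
    interface LinkStore {
        long create(String name);
        LinkRecord find(long id);              // returns null when the id is unknown
        void update(long id, String name);
        void delete(long id);
        List<LinkRecord> findAll();
    }

    /** In-memory backend; any other data store would implement the same interface. */
    static final class InMemoryLinkStore implements LinkStore {
        private final Map<Long, LinkRecord> rows = new LinkedHashMap<>();
        private long nextId = 1;

        public long create(String name) {
            long id = nextId++;
            rows.put(id, new LinkRecord(id, name));
            return id;
        }

        public LinkRecord find(long id) { return rows.get(id); }

        public void update(long id, String name) {
            if (!rows.containsKey(id)) {
                throw new IllegalArgumentException("No link with id " + id);
            }
            rows.put(id, new LinkRecord(id, name));
        }

        public void delete(long id) { rows.remove(id); }

        public List<LinkRecord> findAll() { return new ArrayList<>(rows.values()); }
    }

    public static void main(String[] args) {
        // Callers depend only on the interface, so the backend can be swapped.
        LinkStore store = new InMemoryLinkStore();
        long id = store.create("hdfs-link");
        store.update(id, "hdfs-link-renamed");
        System.out.println(store.find(id).name);
        store.delete(id);
        System.out.println(store.findAll().size());
    }
}
```

The point of the design is that calling code is written against the interface alone, which is what would allow a non-relational backend to be added later without touching callers.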
The rest of this document focuses on the main public-facing entities and repository APIs.
Sqoop Entities
Entity | Model | Relationship | Description |
---|---|---|---|
Configurable | Configurable.java | Top Level Entity | Entity that exposes config objects. A Configurable has an associated version that acts as an identifier for connector config upgrades. |
CONNECTOR | MConnector.java | | Is a type of Configurable. |
DRIVER | MDriver.java | | Is a type of Configurable. |
CONFIG | MConfig.java and @Config annotation | Top Level Entity | MConfigType lists the config types supported as of 1.99.5. |
INPUT | MInput.java and @Input annotation | Top Level Entity | Holds the key-value pairs for the given InputType. |
LINK | MLinkConfig.java | Has 1-n config-input objects | Represents the Sqoop connector's link information. A link encapsulates the details required to connect to the data source the connector represents. It has one main component, the LINK CONFIG. |
JOB | | Has 1-n config-input objects; has 1-n submissions | Represents the Sqoop job. It encapsulates all the configs required to run the job. A job has three main components: FROM, TO, and DRIVER. FROM and its FROM-CONFIG hold the config-input values required to extract data from the source. TO and its TO-CONFIG hold the config-input values required to load data into the destination. DRIVER and its DRIVER-CONFIG hold the config-input values required by the execution engine to run the job optimally. |
SUBMISSION | MSubmission.java | | Represents the job run details. Includes the job status, job counters, and metrics from the job execution engine. |
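
The relationships in the table above can be pictured with a small, deliberately simplified sketch. The classes below are hypothetical stand-ins written for this example, not the actual `M*` model classes, and the config names, input keys, and status strings are made up. The sketch only mirrors what the table states: connectors and drivers are configurables with a version, a config groups key-value inputs, a link carries link configs for a connector, and a job carries FROM, TO, and DRIVER configs plus its submissions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EntitySketch {

    /** CONFIG: a named group of INPUTs, each input being a key-value pair. */
    static final class Config {
        final String name;
        final Map<String, String> inputs = new LinkedHashMap<>();
        Config(String name) { this.name = name; }
        Config with(String key, String value) { inputs.put(key, value); return this; }
    }

    /** Configurable: exposes configs and carries a version used for upgrades. */
    static abstract class Configurable {
        final String version;
        Configurable(String version) { this.version = version; }
    }

    static final class Connector extends Configurable {
        final String name;
        Connector(String name, String version) { super(version); this.name = name; }
    }

    static final class Driver extends Configurable {
        Driver(String version) { super(version); }
    }

    /** LINK: connection details for the data source a connector represents. */
    static final class Link {
        final Connector connector;
        final List<Config> linkConfigs = new ArrayList<>();
        Link(Connector connector) { this.connector = connector; }
    }

    /** SUBMISSION: one job run; reduced to a status string in this sketch. */
    static final class Submission {
        final String status;
        Submission(String status) { this.status = status; }
    }

    /** JOB: FROM, TO and DRIVER configs, plus 1-n submissions. */
    static final class Job {
        final List<Config> fromConfigs = new ArrayList<>();
        final List<Config> toConfigs = new ArrayList<>();
        final List<Config> driverConfigs = new ArrayList<>();
        final List<Submission> submissions = new ArrayList<>();
    }

    /** Builds a job using made-up config names, input keys and statuses. */
    static Job sampleJob() {
        Job job = new Job();
        job.fromConfigs.add(new Config("fromJobConfig").with("table", "orders"));
        job.toConfigs.add(new Config("toJobConfig").with("outputDirectory", "/tmp/orders"));
        job.driverConfigs.add(new Config("throttlingConfig").with("numExtractors", "2"));
        job.submissions.add(new Submission("FAILED"));
        job.submissions.add(new Submission("SUCCEEDED"));
        return job;
    }

    public static void main(String[] args) {
        Connector connector = new Connector("example-connector", "1.99.5");
        Link link = new Link(connector);
        link.linkConfigs.add(new Config("linkConfig").with("uri", "jdbc:example://host/db"));

        Job job = sampleJob();
        System.out.println(link.connector.name + " has " + link.linkConfigs.size() + " link config(s)");
        System.out.println("job runs recorded: " + job.submissions.size());
    }
}
```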