Comment: removing some private api from docs. Adding deprecation plans


 

This page catalogues and describes the public-facing APIs exposed by Hive, to inform developers who wish to integrate their applications and frameworks with the Hive ecosystem. To date, the following APIs have been identified in the Hive project that are either considered public or are widely used in the public domain:

...

The APIs can be segmented into two conceptual categories: operation based APIs and query based APIs.

Operation based APIs

Operation based APIs expose many tightly scoped methods that each implement a very specific Hive operation. Such methods usually accept and return strongly typed values appropriate to their respective operation. The implementations usually target very specific layers or subsystems within Hive and are therefore likely to be efficient in use. However, the outcome of an operation may diverge from that of the equivalent HQL because different code paths may be invoked in each case. Operation based APIs suit processes that need to interact with Hive in a repetitive, declarative manner, and they provide a greater degree of compile-time checking.
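The distinction can be sketched in plain Java. The interfaces and classes below (`TableOperations`, `QueryEndpoint`, `InMemoryCatalog`, `ApiStylesDemo`) are hypothetical and are not part of Hive. They only illustrate how an operation based API exposes typed, narrowly scoped methods that the compiler can check, whereas a query based API accepts a statement as text and can only reject it at run time. In this toy both styles share one code path, which need not be the case in Hive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical operation based API: one typed method per operation.
interface TableOperations {
    void createTable(String database, String table);
    List<String> listTables(String database);
}

// Hypothetical query based API: a single entry point taking statement text.
interface QueryEndpoint {
    List<String> execute(String statement);
}

// Toy in-memory catalog implementing both styles (illustration only, not Hive code).
class InMemoryCatalog implements TableOperations, QueryEndpoint {
    private final List<String> qualifiedNames = new ArrayList<>();

    @Override
    public void createTable(String database, String table) {
        qualifiedNames.add(database + "." + table);
    }

    @Override
    public List<String> listTables(String database) {
        List<String> result = new ArrayList<>();
        String prefix = database + ".";
        for (String name : qualifiedNames) {
            if (name.startsWith(prefix)) {
                result.add(name.substring(prefix.length()));
            }
        }
        return result;
    }

    @Override
    public List<String> execute(String statement) {
        // Only "SHOW TABLES IN <db>" is understood; anything else fails at run time.
        String trimmed = statement.trim();
        String keyword = "SHOW TABLES IN ";
        if (trimmed.toUpperCase(Locale.ROOT).startsWith(keyword)) {
            return listTables(trimmed.substring(keyword.length()).trim());
        }
        throw new IllegalArgumentException("Unsupported statement: " + statement);
    }
}

class ApiStylesDemo {
    public static void main(String[] args) {
        InMemoryCatalog catalog = new InMemoryCatalog();
        catalog.createTable("sales", "orders");                       // arguments checked by the compiler
        System.out.println(catalog.listTables("sales"));              // operation based call
        System.out.println(catalog.execute("SHOW TABLES IN sales"));  // query based call
    }
}
```

A misspelled method name in the operation based style fails at compile time, while a misspelled keyword in the statement text is only detected when `execute` runs.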

...

Note (TODO)

Requires overview.

HiveServer2 API

...


Note (TODO)

Describe.

HCatalog CLI (Command Line)

Query based API. This is well documented on the wiki.

...


The Hive community has been working on deprecating the Hive CLI. The HCatalog CLI is similar to the Hive CLI and will also be deprecated.

Metastore (Java)

A Thrift operation based API with Java bindings, described by the IMetaStoreClient interface. The API decouples the metastore storage layer from other Hive internals. Because Hive itself uses this API internally, it is required to implement a comprehensive feature set, which makes it attractive to developers who find the other APIs lacking. It was not originally intended to be a public API, although it became public in version 1.0.0 (HIVE-3280), and there is a proposal that it be documented more fully (HIVE-9363). Anecdotally, its use outside of the Hive project is not currently recommended.

Note (TODO: API usage)

There are numerous ways of instantiating the metastore API, including HCatUtil.getHiveMetastoreClient() and new HiveMetaStoreClient(...). It may be useful to make some recommendations on the preferred approach.

Hive (Java)

...


Note (TODO)

I suspect its use is not encouraged. Seeking clarification on the motivations behind this class and thoughts on its use outside of Hive.

Driver (Java)

Query based API with Java endpoint. Refers to the org.apache.hadoop.hive.ql.Driver class.

Note (TODO)

Describe the role of Driver, when to use it, etc.

WebHCat (REST)

WebHCat is a REST operation based API for HCatalog. This is well documented on the wiki.

...


WebHCat is not actively maintained and will likely not be supported in future releases. For job submission, consider using Oozie or similar tools. For DDL, use JDBC.

Streaming Data Ingest (Java)

...

Operation based Java API focused on applying mutations (insert/update/delete) to records in transactional tables using Hive's ACID feature. Large volumes of mutations are applied atomically in a single long-lived transaction. Documented with package level Javadoc on the wiki. Scheduled for release in Hive version 2.0.0 (HIVE-10165).

hive-jdbc (JDBC)

Query based JDBC API.

...


This JDBC API is supported by Hive and covers most of the functionality in the JDBC specification.
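A minimal sketch of issuing a query through JDBC, using only `java.sql`. It assumes that the Hive JDBC driver is on the classpath and that a HiveServer2 instance is reachable. The host, port, database, and credentials shown are placeholders, and the class and method names (`HiveJdbcSketch`, `jdbcUrl`, `showTables`) are hypothetical helpers introduced for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class HiveJdbcSketch {
    // Builds a HiveServer2 JDBC URL; host, port and database are placeholders.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs SHOW TABLES and collects the first column of each returned row.
    static List<String> showTables(Connection connection) throws SQLException {
        List<String> tables = new ArrayList<>();
        try (Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery("SHOW TABLES")) {
            while (rows.next()) {
                tables.add(rows.getString(1));
            }
        }
        return tables;
    }

    public static void main(String[] args) throws SQLException {
        // Assumed endpoint and credentials; adjust to the actual deployment.
        String url = jdbcUrl("localhost", 10000, "default");
        try (Connection connection = DriverManager.getConnection(url, "hive", "")) {
            System.out.println(showTables(connection));
        }
    }
}
```

Because the query is passed as text, any mistake in the statement surfaces as an `SQLException` at run time rather than at compile time, which is characteristic of the query based category.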