...

  • The JDBC client (e.g. beeline) creates a HiveConnection by opening a transport connection (e.g. a TCP connection) and then making an OpenSession API call to obtain a SessionHandle. The session is created on the server side.
  • The HiveStatement is executed (following JDBC standards), and the Thrift client makes an ExecuteStatement API call. The call passes the SessionHandle to the server along with the query information.
  • The HS2 server receives the request and asks the driver (which is a CommandProcessor) to parse and compile the query. The driver kicks off a background job that talks to Hadoop and then immediately returns a response to the client; this is the asynchronous design of the ExecuteStatement API. The response contains an OperationHandle created on the server side.
  • The client uses the OperationHandle to poll HS2 for the status of the query execution.
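
The handshake described above (open a session, submit a statement, get a handle back immediately, then poll) can be illustrated with a small in-memory sketch. This is not Hive code: ToyServer, runQuery, fetchResult, the State enum and the simplified handle classes are hypothetical stand-ins invented for illustration, and the "query" is just a sleeping background task. The real HS2 server exposes these steps through its Thrift API and tracks more operation states than the two shown here.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Hs2FlowSketch {
    // Simplified operation states (assumption: only two are modelled here).
    public enum State { RUNNING, FINISHED }

    // Hypothetical stand-ins for the handles: opaque identifiers minted by the server.
    public static final class SessionHandle {
        final UUID id = UUID.randomUUID();
    }

    public static final class OperationHandle {
        final UUID id = UUID.randomUUID();
    }

    // Toy in-memory "server"; it only mimics the shape of the calls in the list above.
    public static final class ToyServer {
        private final Set<UUID> sessions = ConcurrentHashMap.newKeySet();
        private final Map<UUID, Future<String>> operations = new ConcurrentHashMap<>();
        private final ExecutorService background = Executors.newCachedThreadPool();

        // Step 1: the session is created on the server side and a handle is returned.
        public SessionHandle openSession() {
            SessionHandle handle = new SessionHandle();
            sessions.add(handle.id);
            return handle;
        }

        // Steps 2-3: start a background job and return an operation handle immediately.
        public OperationHandle executeStatement(SessionHandle session, String query) {
            if (!sessions.contains(session.id)) {
                throw new IllegalArgumentException("unknown session");
            }
            OperationHandle op = new OperationHandle();
            operations.put(op.id, background.submit(() -> {
                Thread.sleep(200); // stands in for compilation and the Hadoop job
                return "result of: " + query;
            }));
            return op;
        }

        // Step 4: the client polls with the operation handle.
        public State getOperationStatus(OperationHandle op) {
            return operations.get(op.id).isDone() ? State.FINISHED : State.RUNNING;
        }

        // Hypothetical helper that returns the finished job's output.
        public String fetchResult(OperationHandle op) throws Exception {
            return operations.get(op.id).get();
        }

        public void shutdown() {
            background.shutdown();
        }
    }

    // Client side: open a session, submit the query, poll until done, fetch the result.
    public static String runQuery(ToyServer server, String query) throws Exception {
        SessionHandle session = server.openSession();
        OperationHandle op = server.executeStatement(session, query);
        while (server.getOperationStatus(op) != State.FINISHED) {
            Thread.sleep(50); // polling interval chosen arbitrarily for the sketch
        }
        return server.fetchResult(op);
    }

    public static void main(String[] args) throws Exception {
        ToyServer server = new ToyServer();
        try {
            System.out.println(runQuery(server, "SELECT 1"));
        } finally {
            server.shutdown();
        }
    }
}
```

Because executeStatement returns before the background job finishes, the client thread is never blocked inside the call itself; it decides how often to poll and could cancel or time out between polls.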

Other resources

How to set up HS2: Setting Up HiveServer2

HS2 clients: HiveServer2 Clients

Cloudera blog on HS2: http://blog.cloudera.com/blog/2013/07/how-hiveserver2-brings-security-and-concurrency-to-apache-hive/

 

Source Code Description

This section is intended as a guide to help newcomers locate some basic components in the source code.

...

  • org.apache.hive.service.cli.SessionHandle class: session identifier. Instances of this class are returned from the server and used by the client as input for Thrift API calls.
  • org.apache.hive.service.cli.OperationHandle class: operation identifier. Instances of this class are returned from the server and used by the client to poll the execution status of an operation.

 
