HiveServer2 (HS2) is a service that enables clients to execute queries against Hive. HiveServer2 is the successor to HiveServer1, which has been deprecated. HS2 supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC.

HS2 is a single process running as a composite service, which includes the Thrift-based Hive service (TCP or HTTP) and a Jetty web server for the web UI.

The Thrift-based Hive service is the core of HS2 and is responsible for servicing Hive queries (e.g. from Beeline). Thrift is an RPC framework for building cross-platform services. Its stack consists of four layers: Server, Transport, Protocol, and Processor. You can find more details about the layers at: https://thrift.apache.org/docs/concepts.

How the HS2 implementation uses these layers is described below.

Server

HS2 uses a TThreadPoolServer (from Thrift) for TCP mode, or a Jetty server for the HTTP mode. 

The TThreadPoolServer allocates one worker thread per TCP connection. Each thread stays associated with its connection even while the connection is idle, so a large number of concurrent connections produces a large number of threads, which is a potential performance issue. In the future we may consider switching to another server type for TCP mode, for example TThreadedSelectorServer. Here is an article about a performance comparison between different Thrift Java servers.
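The thread-per-connection behavior can be illustrated with a minimal Python sketch. It is not Thrift's or HS2's implementation (TThreadPoolServer is a Java class that draws its workers from a pool); the Connection class and the function names are hypothetical, and in-memory queues stand in for sockets. It only shows that an idle connection still holds a live worker thread.

```python
import queue
import threading


class Connection:
    """Hypothetical stand-in for one client TCP connection."""

    def __init__(self):
        self.requests = queue.Queue()   # data arriving from the client
        self.responses = queue.Queue()  # data sent back to the client

    def close(self):
        self.requests.put(None)  # sentinel meaning "client disconnected"


def serve_connection(conn):
    """Worker loop: one thread stays bound to one connection until it closes."""
    while True:
        request = conn.requests.get()  # the thread is parked here while idle
        if request is None:
            return
        conn.responses.put("result of " + request)


def accept(conn):
    """Dedicate a new worker thread to a newly accepted connection."""
    worker = threading.Thread(target=serve_connection, args=(conn,), daemon=True)
    worker.start()
    return worker


def idle_thread_count(num_connections):
    """Open connections that never send anything and count the threads they hold."""
    connections = [Connection() for _ in range(num_connections)]
    workers = [accept(conn) for conn in connections]
    held = sum(worker.is_alive() for worker in workers)
    for conn in connections:
        conn.close()
    for worker in workers:
        worker.join()
    return held
```

Calling idle_thread_count(n) returns n: every connection costs one thread whether or not it carries any traffic. A selector-based server would instead multiplex idle connections onto a small number of threads.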

Transport

HTTP mode is required when a proxy is needed between the client and server (for example, for load balancing or security reasons), which is why it is supported alongside TCP mode. You can specify the transport mode of the Thrift service through the Hive configuration property "hive.server2.transport.mode".
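From the client side, the transport mode chosen on the server determines how the connection URL is written. The sketch below builds a JDBC URL for either mode. Several details are assumptions not stated on this page and should be checked against your Hive version: the mode values "binary" and "http", the default ports 10000 and 10001, the URL parameters transportMode and httpPath, and the default HTTP path "cliservice". The function name hs2_jdbc_url is hypothetical.

```python
def hs2_jdbc_url(host, transport_mode="binary", port=None,
                 database="default", http_path="cliservice"):
    """Build a HiveServer2 JDBC URL for the given transport mode.

    Assumed, not taken from this page: the mode names, the default ports,
    and the transportMode/httpPath URL parameters.
    """
    if transport_mode not in ("binary", "http"):
        raise ValueError("transport_mode must be 'binary' or 'http'")
    if port is None:
        # Assumed defaults: 10000 for TCP (binary), 10001 for HTTP.
        port = 10001 if transport_mode == "http" else 10000
    url = f"jdbc:hive2://{host}:{port}/{database}"
    if transport_mode == "http":
        # HTTP mode needs the client to name the transport and the endpoint path.
        url += f";transportMode=http;httpPath={http_path}"
    return url
```

For example, hs2_jdbc_url("hs2.example.com", "http") yields a URL that a proxy-fronted HS2 in HTTP mode would accept, assuming the server uses the same port and path.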

Protocol

The Protocol implementation is responsible for serialization and deserialization. We currently use TBinaryProtocol as our Thrift protocol for serialization. In the future we may consider other protocols, such as TCompactProtocol, based on further performance evaluation.
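To give a feel for what a protocol change would affect, here is a small sketch of how a single 32-bit integer could be encoded in each style: fixed-width big-endian, as TBinaryProtocol does, versus zigzag plus variable-length encoding, as TCompactProtocol does. It covers only the i32 case and ignores field headers and message framing; the function names are hypothetical and this is not the Thrift library's code.

```python
import struct


def binary_i32(value):
    """Fixed-width style: always 4 bytes, big-endian."""
    return struct.pack(">i", value)


def zigzag32(value):
    """Map signed 32-bit ints to unsigned so small magnitudes stay small."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def compact_i32(value):
    """Compact style: zigzag, then 7 bits per byte with a continuation bit."""
    n = zigzag32(value)
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # low 7 bits, high bit means "more follows"
        n >>= 7
    out.append(n)
    return bytes(out)
```

Small values shrink from four bytes to one in the compact style, while values near the 32-bit limits grow to five. Whether that trade-off helps HS2 depends on the actual data being sent, which is why a performance evaluation is needed first.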

Processor

The Processor implementation is the application logic that handles requests.
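As a rough illustration of what a processor does, the sketch below looks up the method named in an incoming message and dispatches it to a handler. It is a simplified model under stated assumptions, not HS2's code: the SketchProcessor class, the dictionary message format, and the handler bodies are hypothetical, and the method names OpenSession and ExecuteStatement are used only as illustrative labels.

```python
class SketchProcessor:
    """Hypothetical processor: maps method names to handler functions."""

    def __init__(self):
        self._process_map = {}

    def register(self, method_name, handler):
        self._process_map[method_name] = handler

    def process(self, message):
        """Dispatch one decoded request and return a reply dictionary."""
        name = message["method"]
        handler = self._process_map.get(name)
        if handler is None:
            return {"method": name, "error": "unknown method"}
        return {"method": name, "result": handler(**message.get("args", {}))}


def build_processor():
    """Wire up two illustrative handlers that share in-memory session state."""
    sessions = {}

    def open_session(user):
        session_id = len(sessions) + 1
        sessions[session_id] = user
        return session_id

    def execute_statement(session_id, statement):
        user = sessions[session_id]  # raises KeyError for an unknown session
        return f"accepted {statement!r} for {user}"

    processor = SketchProcessor()
    processor.register("OpenSession", open_session)
    processor.register("ExecuteStatement", execute_statement)
    return processor
```

In this model the server, transport, and protocol layers would deliver the decoded message to process() and send the returned reply back to the client.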

 

 

 
