
Live Long and Process (LLAP) functionality was added in Hive 2.0 (HIVE-7926 and associated tasks). HIVE-9850 links documentation, features, and issues for this enhancement.

...

Similar to the DataNode, LLAP daemons can be used by other applications as well, especially if a relational view on the data is preferred over file-centric processing. The daemon is also open through optional APIs (e.g., InputFormat) that can be leveraged by other data processing frameworks as a building block.

Last, but not least, fine-grained column-level access control, a key requirement for mainstream adoption of Hive, fits nicely into this model.

The diagram below shows an example execution with LLAP. The Tez AM orchestrates overall execution. The initial stage of the query is pushed into LLAP, and large shuffles are performed in their own containers. Multiple queries and applications can access LLAP concurrently.

...

Persistent daemon

To facilitate caching and JIT optimization, and to eliminate most of the startup costs, a daemon runs on the worker nodes on the cluster. The daemon handles I/O, caching, and query fragment execution.

  • These nodes are stateless. Any request to an LLAP node contains the data location and metadata. It processes local and remote locations; locality is the caller’s responsibility (YARN).
  • Recovery/resiliency. Failure and recovery are simplified because any data node can still be used to process any fragment of the input data. The Tez AM can thus simply rerun failed fragments on the cluster.
  • Communication between nodes. LLAP nodes are able to share data (e.g., fetching partitions, broadcasting fragments). This is realized with the same mechanisms used in Tez.
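
The stateless-request and rerun ideas above can be illustrated with a small sketch. This is not LLAP or Tez code: FragmentRequest, NodeFailure, and run_fragment are hypothetical names invented for illustration, and the node list is assumed to be ordered by the caller with its preferred (for example, local) nodes first. It only shows why a request that carries its own data location and metadata can be retried on any node.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class FragmentRequest:
    """Hypothetical self-contained request: the node needs no prior state."""
    fragment_id: str
    data_location: str  # e.g. a file path plus a byte range
    metadata: Dict[str, str] = field(default_factory=dict)


class NodeFailure(Exception):
    """Raised by the (hypothetical) executor when a node cannot run a fragment."""


def run_fragment(
    request: FragmentRequest,
    nodes: List[str],
    execute: Callable[[str, FragmentRequest], str],
) -> Tuple[str, str]:
    """Try nodes in the caller's preferred order and return (node, result).

    Because the request carries everything needed to process it, a failed
    fragment can simply be rerun on the next node without any recovery step.
    """
    for node in nodes:
        try:
            return node, execute(node, request)
        except NodeFailure:
            continue  # nothing to clean up: the failed node held no state
    raise RuntimeError(f"fragment {request.fragment_id} failed on all nodes")
```

In this sketch the caller plays the role the text assigns to the Tez AM: it chooses the node order (locality) and decides when to rerun, while each node only executes what the request describes.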

Execution Engine

LLAP works within existing, process-based Hive execution to preserve the scalability and versatility of Hive. It does not replace the existing execution model but rather enhances it.

...

Hive Contributor Meetup Presentation

Try Hive LLAP