Low Latency Analytical Processing (LLAP) functionality was added in Hive 2.0 (HIVE-7926 and associated tasks). HIVE-9850 links documentation, features, and issues for this enhancement.
...
- Asynchronous spindle-aware IO
- Pre-fetching and caching of column chunks
- Multi-threaded JIT-friendly operator pipelines
Also known as Live Long and Process, LLAP provides a hybrid execution model. It consists of a long-lived daemon which replaces direct interactions with the HDFS DataNode, and a tightly integrated DAG-based framework.
Functionality such as caching, pre-fetching, some query processing, and access control is moved into the daemon. Small/short queries are largely processed by this daemon directly, while any heavy lifting is performed in standard YARN containers.
...
- Eviction policy. The eviction policy is tuned for analytical workloads with frequent (partial) table-scans. Initially, a simple policy like LRFU is used. The policy is pluggable.
- Caching granularity. Column-chunks are the unit of data in the cache. This achieves a compromise between low-overhead processing and storage efficiency. The granularity of the chunks depends on the particular file format and execution engine (Vectorized Row Batch size, ORC stripe, etc.).
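To make the eviction bullet concrete, here is a minimal sketch of an LRFU-style cache. It is not LLAP's implementation: the class name `LRFUCache`, the logical clock, the decay parameter `lam`, and the string keys standing in for column chunks are all assumptions made for illustration. Each entry carries a combined recency and frequency (CRF) score that decays over time and is bumped on every access; the entry with the lowest current score is evicted.

```python
class LRFUCache:
    """Illustrative LRFU cache: evicts the entry with the lowest decayed CRF score."""

    def __init__(self, capacity, lam=0.1):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.lam = lam          # lam near 0 behaves like LFU, lam near 1 like LRU
        self.clock = 0          # logical time, advanced on every get/put
        self.entries = {}       # key -> [value, crf, last_access_time]

    def _decay(self, elapsed):
        # Weight of an access that happened `elapsed` ticks ago.
        return 0.5 ** (self.lam * elapsed)

    def _crf_now(self, entry):
        # CRF score of an entry, decayed to the current logical time.
        _, crf, last = entry
        return crf * self._decay(self.clock - last)

    def _touch(self, entry):
        # Record an access: decay the old score, then add 1 for this access.
        entry[1] = 1.0 + self._crf_now(entry)
        entry[2] = self.clock

    def get(self, key):
        self.clock += 1
        entry = self.entries.get(key)
        if entry is None:
            return None
        self._touch(entry)
        return entry[0]

    def put(self, key, value):
        self.clock += 1
        entry = self.entries.get(key)
        if entry is not None:
            entry[0] = value
            self._touch(entry)
            return
        if len(self.entries) >= self.capacity:
            # Evict the entry whose decayed score is currently the smallest.
            victim = min(self.entries, key=lambda k: self._crf_now(self.entries[k]))
            del self.entries[victim]
        self.entries[key] = [value, 1.0, self.clock]


if __name__ == "__main__":
    cache = LRFUCache(capacity=2, lam=0.1)
    cache.put("chunk-a", "A")
    cache.get("chunk-a")
    cache.get("chunk-a")        # chunk-a is now frequently used
    cache.put("chunk-b", "B")
    cache.put("chunk-c", "C")   # evicts chunk-b, although chunk-a is older
    print(sorted(cache.entries))
```

With a small `lam`, the frequently scanned entry survives even though it was touched less recently than the victim, which is the behaviour a pure LRU policy would not give for repeated partial table scans.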
A bloom filter is automatically created to provide Dynamic Runtime Filtering.
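The text above does not say how the bloom filter is built or applied, so the following is only a sketch of the general idea under a stated assumption: the filter is populated from the join keys of the small side of a join and then used to discard non-matching rows while scanning the large side. The names `BloomFilter` and `runtime_filter`, the bit-array size, and the SHA-256 based hashing are hypothetical choices, not Hive's.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte.
        data = repr(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def runtime_filter(small_side_keys, big_side_rows, key_of):
    """Build a filter from the small side, then prune rows of the big side."""
    bloom = BloomFilter()
    for key in small_side_keys:
        bloom.add(key)
    # Rows whose key is definitely absent from the small side are dropped early.
    return [row for row in big_side_rows if bloom.might_contain(key_of(row))]


if __name__ == "__main__":
    rows = [(k, "payload") for k in range(1000)]
    kept = runtime_filter([1, 2, 3], rows, key_of=lambda row: row[0])
    print(len(kept), "of", len(rows), "rows survive the filter")
```

Because a bloom filter can return false positives, the surviving rows still have to go through the real join; the filter only reduces how many rows reach it.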
Workload Management
YARN is used to obtain resources for different workloads. Once resources (CPU, memory, etc.) have been obtained from YARN for a specific workload, the execution engine can choose to delegate these resources to LLAP, or to launch Hive executors in separate processes. Resource enforcement via YARN has the advantage of ensuring that nodes do not get overloaded, either by LLAP or by other containers. The daemons themselves are under YARN’s control.
...
LLAP servers are a natural place to enforce access control at a more fine-grained level than “per file”. Since the daemons know which columns and records are processed, policies on these objects can be enforced. This is not intended to replace the current mechanisms, but rather to enhance and open them up to other applications as well.
Web Services
HIVE-9814 introduces the following web services:
JSON JMX data - /jmx
JVM Stack Traces of all threads - /stacks
XML Configuration from llap-daemon-site - /conf
HIVE-13398 introduces the following web services:
LLAP Status - /status
LLAP Peers - /peers
/status example
curl localhost:15002/status
{
  "status" : "STARTED",
  "uptime" : 139093,
  "build" : "2.1.0-SNAPSHOT from 77474581df4016e3899a986e079513087a945674 by gopal source checksum a9caa5faad5906d5139c33619f1368bb"
}
/peers example
curl localhost:15002/peers
{
  "dynamic" : true,
  "identity" : "718264f1-722e-40f1-8265-ac25587bf336",
  "peers" : [
    {
      "identity" : "940d6838-4dd7-4e85-95cc-5a6a2c537c04",
      "host" : "sandbox121.hortonworks.com",
      "management-port" : 15004,
      "rpc-port" : 15001,
      "shuffle-port" : 15551,
      "resource" : {
        "vcores" : 24,
        "memory" : 128000
      }
    }
  ]
}
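The /status and /peers responses shown above are plain JSON, so they are easy to consume from a script. The sketch below is one possible way to do that with the Python standard library. The helper names (`fetch_json`, `is_started`, `total_resources`) are hypothetical, the base URL `http://localhost:15002` is taken from the curl examples, and the code assumes the responses have the same fields as those examples; the units of `uptime` and `memory` are not stated on this page and are not interpreted here.

```python
import json
from urllib.request import urlopen


def fetch_json(base_url, path, timeout=5.0):
    """GET base_url + path (for example '/status') and decode the JSON body."""
    with urlopen(base_url + path, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_started(status):
    """True if a /status document reports the daemon as STARTED."""
    return status.get("status") == "STARTED"


def total_resources(peers):
    """Sum the vcores and memory advertised by every entry in a /peers document."""
    vcores = 0
    memory = 0
    peer_list = peers.get("peers", [])
    for peer in peer_list:
        resource = peer.get("resource", {})
        vcores += resource.get("vcores", 0)
        memory += resource.get("memory", 0)
    return {"peers": len(peer_list), "vcores": vcores, "memory": memory}


def summarize(base_url="http://localhost:15002"):
    """Query a running daemon; requires network access to its web port."""
    status = fetch_json(base_url, "/status")
    peers = fetch_json(base_url, "/peers")
    return {"started": is_started(status), **total_resources(peers)}
```

Calling `summarize()` against a live daemon returns a small dictionary with the started flag and the summed peer resources; `is_started` and `total_resources` can also be applied directly to saved responses.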
SLIDER on YARN Deployment
LLAP can be deployed via Slider, which bypasses node installation and related complexities (HIVE-9883).
LLAP Status
AMBARI-16149 introduces LLAP app status, available with Hive Server.
Example usage.
/current/hive-server2-hive2/bin/hive --service llapstatus --name {llap_app_name}
Resources
...