Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Added web services info

Table of Contents

Live Long and Process Low Latency Analytical Processing (LLAP) functionality was added in Hive 2.0 (HIVE-7926 and associated tasks). HIVE-9850 links documentation, features, and issues for this enhancement.

...

  • Asynchronous spindle-aware IO
  • Pre-fetching and caching of column chunks
  • Multi-threaded JIT-friendly operator pipelines

Also known as Live Long and Process, LLAP provides a hybrid execution model which .  It consists of a long-lived daemon replacing which replaces direct interactions with the HDFS DataNode, and a tightly integrated DAG-based framework.
Functionality such as caching, pre-fetching, some query processing and access control are moved into the daemon. Small Small/short queries are largely processed by this daemon directly, while any heavy lifting will be performed in standard YARN containers.

...

  • Eviction policy. The eviction policy is tuned for analytical workloads with frequent (partial) table-scans. Initially, a simple policy like LRFU is used. The policy is pluggable.
  • Caching granularity. Column-chunks are the unit of data in the cache. This achieves a compromise between low-overhead processing and storage efficiency. The granularity of the chunks depends on the particular file format and execution engine (Vectorized Row Batch size, ORC stripe, etc.).

A bloom filter is automatically created to provide Dynamic Runtime Filtering.

Workload Management

YARN is used to obtain resources for different workloads. Once resources (CPU, memory, etc.) have been obtained from YARN for a specific workload, the execution engine can choose to delegate these resources to LLAP, or to launch Hive executors in separate processes. Resource enforcement via YARN has the advantage of ensuring that nodes do not get overloaded, either by LLAP or by other containers. The daemons themselves is under YARN’s control.

...

LLAP servers are a natural place to enforce access control at a more fine-grained level than “per file”. Since the daemons know which columns and records are processed, policies on these objects can be enforced. This is not intended to replace the current mechanisms, but rather to enhance and open them up to other applications as well.

Web Services

HIVE-9814 introduces the following web services:

JSON JMX data - /jmx
JVM Stack Traces of all threads - /stacks
XML Configuration from llap-daemon-site - /conf 

HIVE-13398 introduces the following web services:

LLAP Status - /status
LLAP Peers - /peers 

 

/status example

Code Block
languagebash
curl localhost:15002/status

{
  "status" : "STARTED",
  "uptime" : 139093,
  "build" : "2.1.0-SNAPSHOT from 77474581df4016e3899a986e079513087a945674 by gopal source checksum a9caa5faad5906d5139c33619f1368bb"
}

 

/peers example 

Code Block
languagebash
curl localhost:15002/peers
{
  "dynamic" : true,
  "identity" : "718264f1-722e-40f1-8265-ac25587bf336",
  "peers" : [ 
 {
    "identity" : "940d6838-4dd7-4e85-95cc-5a6a2c537c04",
    "host" : "sandbox121.hortonworks.com",
    "management-port" : 15004,
    "rpc-port" : 15001,
    "shuffle-port" : 15551,
    "resource" : {
      "vcores" : 24,
      "memory" : 128000
    },
    "host" : "sandbox121.hortonworks.com"
  }, 
]
}

 

SLIDER on YARN Deployment

LLAP can be deployed via Slider, which bypasses node installation and related complexities (HIVE-9883).

LLAP Status 

AMBARI-16149 introduces LLAP app status, available with Hive Server.

Example usage.

/current/hive-server2-hive2/bin/hive --service llapstatus --name {llap_app_name}

Resources

LLAP Design Document

...