Live Long and Process (LLAP) functionality was added in Hive 2.0 (HIVE-7926 and associated tasks).
Hive has become significantly faster thanks to various features and improvements built by the community over the past two years, including Tez and cost-based optimization. Keeping the momentum, here are some examples of what we think will take us to the next level:
To achieve this, we are proposing a hybrid execution model that consists of a long-lived daemon, which replaces direct interactions with the HDFS DataNode, and a tightly integrated DAG-based framework. Similar to the DataNode, LLAP daemons can be used by other applications as well, especially if a relational view on the data is preferred over file-centric processing. We’re thus planning to open the daemon up through optional APIs (e.g., InputFormat) that other data processing frameworks can use as a building block. Last but not least, fine-grained column-level access control, a key requirement for mainstream adoption of Hive, fits nicely into this model.