Using HCatalog

Version information

HCatalog graduated from the Apache incubator and merged with the Hive project on March 26, 2013.
Hive version 0.11.0 is the first release that includes HCatalog.

...

Joe in data acquisition uses distcp to get data onto the grid.

No Format
hadoop distcp file:///file.dat hdfs://data/rawevents/20100819/data
hcat -e "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data'"
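For a recurring load, the date in the two commands above would normally be a parameter. The following is a minimal sketch of that idea, not part of the HCatalog tooling: the helper names `rawevents_location` and `add_partition_ddl` and the `ds` variable are illustrative assumptions. It only prints the DDL; the real `hadoop` and `hcat` calls are left as comments.

```shell
#!/bin/sh
# Sketch only: the helper names and the ds variable are illustrative.

# Print the HDFS location for one day's raw events.
rawevents_location() {
    printf 'hdfs://data/rawevents/%s/data\n' "$1"
}

# Print the DDL that registers that day's data as a partition.
add_partition_ddl() {
    printf "alter table rawevents add partition (ds='%s') location '%s'\n" \
        "$1" "$(rawevents_location "$1")"
}

ds=20100819
add_partition_ddl "$ds"

# To run the load for real (requires Hadoop and HCatalog on the PATH):
# hadoop distcp file:///file.dat "$(rawevents_location "$ds")"
# hcat -e "$(add_partition_ddl "$ds")"
```

Keeping the location and the DDL in separate helpers means the same path is used for both the copy and the partition registration.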

...

Without HCatalog, Joe must manually inform Sally when data is available, or Sally must poll HDFS.

No Format
A = load '/data/rawevents/20100819/data' as (alpha:int, beta:chararray, ...);
B = filter A by bot_finder(zeta) == 0;
...
store Z into 'data/processedevents/20100819/data';

With HCatalog, a JMS message is sent when the data is available, and the Pig job can then be started.

No Format
A = load 'rawevents' using org.apache.hive.hcatalog.pig.HCatLoader();
B = filter A by date == '20100819' and bot_finder(zeta) == 0;
...
store Z into 'processedevents' using org.apache.hive.hcatalog.pig.HCatStorer('date=20100819');

...

Without HCatalog, Robert must alter the table to add the required partition.

No Format
alter table processedevents add partition (date='20100819') location 'hdfs://data/processedevents/20100819/data';
select advertiser_id, count(clicks) from processedevents where date = '20100819' group by advertiser_id;

With HCatalog, Robert does not need to modify the table structure.

No Format
select advertiser_id, count(clicks) from processedevents where date = '20100819' group by advertiser_id;

...

WebHCat is a REST API for HCatalog. (REST stands for "representational state transfer", a style of API based on HTTP verbs.) The original name of WebHCat was Templeton. For more information, see the WebHCat manual.
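As a rough illustration of what a WebHCat call looks like, the sketch below builds request URLs under the `/templeton/v1` base path and shows, as a comment, how `curl` could fetch them. The host name, the user name `sally`, and the helper `webhcat_url` are assumptions made for this example; 50111 is WebHCat's default port, but an actual installation may use a different one.

```shell
#!/bin/sh
# Sketch only: host, port, and user name are assumed values.
WEBHCAT_HOST="${WEBHCAT_HOST:-localhost}"
WEBHCAT_PORT="${WEBHCAT_PORT:-50111}"   # default WebHCat port
WEBHCAT_USER="${WEBHCAT_USER:-sally}"   # hypothetical user name

# Build the URL for a WebHCat resource under the /templeton/v1 base path.
webhcat_url() {
    printf 'http://%s:%s/templeton/v1/%s?user.name=%s\n' \
        "$WEBHCAT_HOST" "$WEBHCAT_PORT" "$1" "$WEBHCAT_USER"
}

# URL that lists the partitions of the rawevents table in the default database.
webhcat_url ddl/database/default/table/rawevents/partition

# To send the request to a running WebHCat server:
# curl -s "$(webhcat_url ddl/database/default/table/rawevents/partition)"
```

The same helper can be pointed at the `status` resource to check whether the server is reachable before issuing DDL requests.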

Navigation Links

Next: HCatalog Installation

General: HCatalog Manual – WebHCat Manual – Hive Wiki Home – Hive Project Site
Old version of this document (HCatalog 0.5.0): Overview
