1. HBase Table format
Data in an external hbase table can be accessed in 2 ways: Cell format or Row format. In examples below, assume that an external hbase table HT has been created with one column family ‘cf’.
1. Cell Format
Syntax: hbase.”_CELL_”.HT
With this syntax, each cell of an hbase table is returned as a row in a virtual table with 5 columns.
-- Definition of Trafodion table HBASE."_CELL_".HT
(
ROW_ID VARCHAR(100) NOT NULL
, COL_FAMILY VARCHAR(100) NOT NULL
, COL_NAME VARCHAR(100) NOT NULL
, COL_TIMESTAMP LARGEINT NOT NULL
, COL_VALUE VARCHAR (1000) NOT NULL
)
PRIMARY KEY (ROW_ID ASC)
2. Row Format
Syntax: hbase.”_ROW_”.HT
With this syntax, all cells that are part of a row_id are returned as one row.
-- Definition of Trafodion table HBASE."_ROW_".HT
(
ROW_ID VARCHAR(100) NOT NULL
, COLUMN_DETAILS VARCHAR(10000) NOT NULL
)
PRIMARY KEY (ROW_ID ASC)
3. Column Length
By default, virtual columns have max length as shown above. To change it, following defaults could be used:
Control query default hbase_max_column_name_length ‘200’;
- This will change size for fields row_id, col_family, col_name
Control query default hbase_max_column_val_length ‘2000’;
- This will change size of field col_value
Control query default hbase_max_column_info_length ‘20000’
- This will change size of field column_details
2. Accessing data from HBase tables
Data can be accessed as a regular relational table in cell or row format.
1. Cell Format Access
Selecting all cells of table HT. Table has 3 cells, 2 for row_id 1 and 1 for row_id 2.
>>select * from hbase."_CELL_".ht;
ROW_ID COL_FAMILY COL_NAME COL_TIMESTAMP COL_VALUE
------------ ----------------- --------------- ----------------------- ---------------
1 cf a 1412546755474 100
1 cf b 1412546755479 200
2 cf c 1412546824687 300
--- 3 row(s) selected.
Applying expression on selected cells:
>>select left(row_id, 5), left(col_family || '.' || col_name, 6), left(col_value, 5) from hbase."_CELL_".ht;
(EXPR) (EXPR) (EXPR)
------ ------ ------
1 cf.a 100
1 cf.b 200
2 cf.c 300
--- 3 row(s) selected.
Applying predicates to limit output.
This example will select row_id 1 and column ‘b’
>>select left(row_id, 5), left(col_family || '.' || col_name, 6), left(col_value, 5) from hbase."_CELL_".ht where row_id = '1' and col_family = 'cf' and col_name = 'b';
(EXPR) (EXPR) (EXPR)
------ ------ ------
1 cf.b 200
--- 1 row(s) selected.
2. Row Format Access
Data in row format is returned as row_id followed by all cells in that row. Individual cell data from a row could be extracted or displayed using Trafodion functions.
Column_display function could be used to select and return all cells in a row in displayable format.
Each cell value is returned in “<col-fam>:<col-name> => <col-value>” format.
>>select left(row_id, 4) , column_display(column_details, 40) from hbase."_ROW_".HT;
(EXPR) (EXPR)
------ ----------------------------------------
1 cf:a => 100, cf:b => 200
2 cf:c => 300
--- 2 row(s) selected.
column_lookup function could be used to extract individual cells from a row.
This example will extract and return value of column ‘cf:a’ in all rows.
>>select left(row_id, 4), left(column_lookup(column_details, 'cf:a'), 10) from hbase."_ROW_".HT;
(EXPR) (EXPR)
------ ----------
1 100
--- 1 row(s) selected.
column_lookup in a predicate to compare a column value to another value.
This example will not return any rows as there are no rows where column ‘cf:a’ has a value less than 10.
>>select column_display(column_details, 40) from hbase."_ROW_".ht where column_lookup(column_details, 'cf:a') < '10';
--- 0 row(s) selected.
This example will return rows as the predicate qualifies.
>>select column_display(column_details, 40) from hbase."_ROW_".ht where column_lookup(column_details, 'cf:a') < '101';
(EXPR)
----------------------------------------
cf:a => 100, cf:b => 200
--- 1 row(s) selected.
3. Join between Trafodion and external HBase tables
Trafodion and Hbase tables could be joined using sql queries and predicates.
Trafodion table Trafodion.sch.traftab:
(
EMP_ID INT NO DEFAULT NOT NULL
, EMPNAME VARCHAR(100) NOT NULL
)
PRIMARY KEY (EMP_ID ASC)
This table contains 2 rows with employee id 1 for wayne, and employee id 2 for maya.
>>select * from trafodion.sch.traftab;
EMP_ID EMPNAME
----------- ---------------
1 wayne
2 maya
--- 2 row(s) selected.
External hbase table HBASETAB with column family ‘cf’.
Contains one row with row_id ‘1’ and one cell with location Palo Alto.
>>select left(row_id,4), column_display(column_details, 100) from hbase."_ROW_".hbasetab;
(EXPR) (EXPR)
------ ------------------------
1 cf:location => Palo Alto
--- 1 row(s) selected.
Join Trafodion and hbase tables to return employee name from traf table and location from hbase table.
First look at the query plan that will be used. In this example, a hash join will be done between the two tables.
>>explain options 'f' select t.emp_id, left(t.empname, 20), left (column_lookup(column_details, 'cf:location'), 10) from trafodion.sch.traftab t, hbase."_ROW_".hbasetab h where t.emp_id = cast(h.row_id as int);
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
3 . 4 root 1.00E+002
2 1 3 hybrid_hash_join 1.00E+002
. . 2 trafodion_scan TRAFTAB 1.00E+002
. . 1 hbase_scan HBASETAB 1.00E+002
--- SQL operation complete.
Now run the query to return one row with ID ‘1’.
>>select t.emp_id, left(t.empname, 20),
left (column_lookup(column_details, 'cf:location'), 10)
from trafodion.sch.traftab t, hbase."_ROW_".hbasetab h
where t.emp_id = cast(h.row_id as int);
EMP_ID (EXPR) (EXPR)
----------- -------------------- ----------
1 wayne Palo Alto
--- 1 row(s) selected.
4. IUD (Insert/Update/Delete) on Native Hbase tables
Data can be inserted, updated and deleted on external native hbase tables. This can be done from within a Trafodion transaction in cell or row formats.
1. Cell Format
Only inserts are allowed in cell format.
The example below Inserts a new column ‘z’ with value ‘101’ in hbase table ht. All columns of the cell format must be specified. This statement will insert one cell in the underlying hbase table ‘ht’.
Second query inserts 2 rows in one insert.
>>insert into hbase."_CELL_".ht values ('1', 'cf', 'z', -1, '101');
--- 1 row(s) inserted.
>>insert into hbase."_CELL_".ht values ('2', 'cf', 'y', -1, '202'),
('2', 'cf', 'w', -1, '303');
--- 2 row(s) inserted.
>>select * from hbase."_CELL_".ht;
ROW_ID COL_FAMILY COL_NAME COL_TIMESTAMP COL_VALUE
------------ ----------------- --------------- ----------------------- ---------------
1 cf z 1412546755474 101
2 cf w 1412546755474 303
2 cf y 1412546755474 202
--- 3 row(s) selected.
2. Row Format
Insert, Update and Delete operations are allowed in row format.
Insert using column_create function
This function is used to create a column value. It can then be used in a row format insert or update statement.
Syntax: column_create(‘<column-family>:<column-name>’, ‘<column-value>’)
- In this example, 2 rows are being inserted. First one has 1 column and second one has 2 columns:
>>insert into hbase."_ROW_".ht values
+> ('1', column_create('cf:1b', '102')),
+> ('2', column_create(('cf:2b', '201'),
+> ('cf:2c', '202')));
--- 2 row(s) inserted.
>>select left(row_id, 10) row_id, left(column_display(column_details), 40)
from hbase."_ROW_".ht;
ROW_ID (EXPR)
---------- ------------------------------
1 cf:1b => 102
2 cf:2b => 201, cf:2c => 202
--- 2 row(s) selected.
Update using column_details function
This function is used to update a row
- In this example, value for row ‘2’, column ‘cf:2b’ is updated from 201 to 301.
It is done after looking up column ‘cf:2c’.
>>update hbase."_ROW_".ht set column_details =
column_create ('cf:2b', '301')
where row_id = '2' and column_lookup(column_details, 'cf:2c') = '202';
--- 1 row(s) updated.
>>select left(row_id, 10) row_id, left(column_display(column_details), 40)
from hbase."_ROW_".ht;
ROW_ID (EXPR)
---------- ----------------------------------------
1 cf:1b => 102
2 cf:2b => 301, cf:2c => 202
--- 2 row(s) selected.
Delete operation
Delete all cells in row with rowid ‘1’
>>delete from hbase."_ROW_".ht where row_id = '1';
--- 1 row(s) deleted.
>>select left(row_id, 10) row_id,
left(column_display(column_details), 40)
from hbase."_ROW_".ht;
ROW_ID (EXPR)
---------- ----------------------------------------
2 cf:2b => 301, cf:2c => 202
--- 1 row(s) selected.
Delete all rows:
>>delete from hbase."_ROW_".ht;
--- 1 row(s) deleted.
>>select count(*) from hbase."_ROW_".ht;
(EXPR)
--------------------
0
--- 1 row(s) selected.
Transactional operation
- This example shows transactional operation on native hbase table.
Rows are inserted within a transaction where they are visible. A rollback operation then aborts it and all inserted rows are removed.
>>select * from hbase."_ROW_".ht;
--- 0 row(s) selected.
>>begin work;
--- SQL operation complete.
>>insert into hbase."_ROW_".ht values
('4', column_create(('cf:4a', '301'),
('cf:4b', '302')));
--- 1 row(s) inserted.
>>select left(row_id, 10) row_id,
left(column_display(column_details), 40)
from hbase."_ROW_".ht where row_id = '4';
ROW_ID (EXPR)
---------- ----------------------------------------
4 cf:4a => 301, cf:4b => 302
--- 1 row(s) selected.
>>rollback work;
--- SQL operation complete.
>>select * from hbase."_ROW_".ht;
--- 0 row(s) selected.
- In this example, rows are deleted within a transaction. If rolled back, they are undeleted. If committed, they remain deleted.
>>select count(*) from hbase."_ROW_".ht;
(EXPR)
--------------------
1
--- 1 row(s) selected.
>>begin work;
--- SQL operation complete.
>>delete from hbase."_ROW_".ht;
--- 1 row(s) deleted.
>>select count(*) from hbase."_ROW_".ht;
(EXPR)
--------------------
0
--- 1 row(s) selected.
>>rollback work;
--- SQL operation complete.
>>select count(*) from hbase."_ROW_".ht;
(EXPR)
--------------------
1
--- 1 row(s) selected.
>>begin work;
--- SQL operation complete.
>>delete from hbase."_ROW_".ht;
--- 1 row(s) deleted.
>>select count(*) from hbase."_ROW_".ht;
(EXPR)
--------------------
0
--- 1 row(s) selected.
>>commit work;
--- SQL operation complete.
>>select count(*) from hbase."_ROW_".ht;
(EXPR)
--------------------
0
--- 1 row(s) selected.