1.    HBase Table format

 

Data in an external hbase table can be accessed in 2 ways: Cell format or Row format. In examples below, assume that an external hbase table HT has been created with one column family ‘cf’.

1.       Cell Format

Syntax:  hbase.”_CELL_”.HT

With this syntax, each cell of an hbase table is returned as a row in a virtual table with 5 columns.

-- Definition of Trafodion table HBASE."_CELL_".HT

  (

    ROW_ID                           VARCHAR(100) NOT NULL

  , COL_FAMILY                    VARCHAR(100) NOT NULL

  , COL_NAME                      VARCHAR(100) NOT NULL

  , COL_TIMESTAMP            LARGEINT NOT NULL

  , COL_VALUE                      VARCHAR (1000) NOT NULL

  )

  PRIMARY KEY (ROW_ID ASC)

 

2.       Row Format

 Syntax: hbase.”_ROW_”.HT

With this syntax, all cells that are part of a row_id are returned as one row.

-- Definition of Trafodion table HBASE."_ROW_".HT

  (

    ROW_ID                             VARCHAR(100) NOT NULL

  , COLUMN_DETAILS           VARCHAR(10000) NOT NULL

  )

  PRIMARY KEY (ROW_ID ASC)


 3.       Column Length

By default, virtual columns have max length as shown above. To change it, following defaults could be used:

Control query default hbase_max_column_name_length ‘200’;

  • This will change size for fields row_id, col_family, col_name

Control query default hbase_max_column_val_length ‘2000’;

  • This will change size of field col_value

Control query default hbase_max_column_info_length ‘20000’

  • This will change size of field column_details

 

2.    Accessing data from HBase tables

Data can be accessed as a regular relational table in cell or row format.

 1.       Cell Format Access

Selecting all cells of table HT. Table has 3 cells, 2 for row_id 1 and 1 for row_id 2.

>>select * from hbase."_CELL_".ht;

ROW_ID     COL_FAMILY    COL_NAME    COL_TIMESTAMP      COL_VALUE

------------    -----------------    ---------------     -----------------------      ---------------

1                   cf                        a                      1412546755474         100

1                   cf                        b                      1412546755479         200

2                   cf                        c                      1412546824687        300                                                                                                                                                                          

--- 3 row(s) selected.

 

Applying expression on selected cells:

>>select left(row_id, 5), left(col_family || '.' || col_name, 6), left(col_value, 5) from hbase."_CELL_".ht;

(EXPR)  (EXPR)  (EXPR)

------  ------  ------

1       cf.a    100  

1       cf.b    200  

2       cf.c    300   

--- 3 row(s) selected.

 

Applying predicates to limit output.

This example will select row_id 1 and column ‘b’

>>select left(row_id, 5), left(col_family || '.' || col_name, 6), left(col_value, 5) from hbase."_CELL_".ht where row_id = '1' and col_family = 'cf' and col_name = 'b';                                                           

(EXPR)  (EXPR)  (EXPR)

------  ------  ------

1       cf.b    200  

--- 1 row(s) selected.

 

2.       Row Format Access

Data in row format is returned as row_id followed by all cells in that row.  Individual cell data from a row could be extracted or displayed using Trafodion functions.

Column_display function could be used to select and return all cells in a row in displayable format.

Each cell value is returned in “<col-fam>:<col-name> => <col-value>” format.

>>select left(row_id, 4) , column_display(column_details, 40) from hbase."_ROW_".HT;

(EXPR)  (EXPR)                                 

------  ----------------------------------------

1       cf:a => 100, cf:b => 200               

2       cf:c => 300                            

--- 2 row(s) selected. 

 

 column_lookup function could be used to extract individual cells from a row.

This example will extract and return value of column ‘cf:a’ in all rows.

>>select left(row_id, 4), left(column_lookup(column_details, 'cf:a'), 10)  from hbase."_ROW_".HT;

(EXPR)  (EXPR)   

------  ----------

1       100      

--- 1 row(s) selected.

 

column_lookup in a predicate to compare a column value to another value.

This example will not return any rows as there are no rows where column ‘cf:a’ has a value less than 10.

>>select column_display(column_details, 40) from hbase."_ROW_".ht where column_lookup(column_details, 'cf:a') < '10';   

--- 0 row(s) selected.

 

This example will return rows as the predicate qualifies.

>>select column_display(column_details, 40) from hbase."_ROW_".ht where column_lookup(column_details, 'cf:a') < '101';

(EXPR)                                 

----------------------------------------

cf:a => 100, cf:b => 200               

--- 1 row(s) selected.

 

3.    Join between Trafodion and external HBase tables

Trafodion and Hbase tables could be joined using sql queries and predicates.

Trafodion table Trafodion.sch.traftab:

  (

    EMP_ID                           INT NO DEFAULT NOT NULL

  , EMPNAME                      VARCHAR(100) NOT NULL

  )

  PRIMARY KEY (EMP_ID ASC)

This table contains 2 rows with employee id 1 for wayne, and employee id 2 for maya.

>>select * from trafodion.sch.traftab;

EMP_ID       EMPNAME

-----------  ---------------

          1  wayne                                                                                              

          2  maya                                                                                               

--- 2 row(s) selected.


External hbase table HBASETAB with column family ‘cf’.

Contains one row with row_id ‘1’ and one cell with location Palo Alto.

>>select left(row_id,4), column_display(column_details, 100) from hbase."_ROW_".hbasetab;

(EXPR)  (EXPR)

------  ------------------------

1       cf:location => Palo Alto                                                                           

--- 1 row(s) selected.


Join Trafodion and hbase tables to return employee name from traf table and location from hbase table.

First look at the query plan that will be used. In this example, a hash join will be done between the two tables. 

>>explain options 'f' select t.emp_id, left(t.empname, 20), left (column_lookup(column_details, 'cf:location'), 10)  from trafodion.sch.traftab t, hbase."_ROW_".hbasetab h where t.emp_id = cast(h.row_id as int);

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD

---- ---- ---- --------------------  --------  --------------------  ---------

3    .    4    root                                                  1.00E+002

2    1    3    hybrid_hash_join                                      1.00E+002

.    .    2    trafodion_scan                  TRAFTAB               1.00E+002

.    .    1    hbase_scan                      HBASETAB              1.00E+002

--- SQL operation complete.


Now run the query to return one row  with ID ‘1’.

>>select t.emp_id, left(t.empname, 20),

     left (column_lookup(column_details, 'cf:location'), 10) 

      from trafodion.sch.traftab t, hbase."_ROW_".hbasetab h

      where t.emp_id = cast(h.row_id as int);

 

EMP_ID       (EXPR)                (EXPR)   

-----------  --------------------  ----------

          1  wayne                 Palo Alto 

--- 1 row(s) selected.


4.    IUD (Insert/Update/Delete) on Native Hbase tables 

Data can be inserted, updated and deleted on external native hbase tables. This can be done from within a Trafodion transaction in cell or row formats.

1.       Cell Format

Only inserts are allowed in cell format. 

The example below Inserts a new column ‘z’ with value ‘101’ in hbase table ht. All columns of the cell format must be specified. This statement will insert one cell in the underlying hbase table ‘ht’.

Second query inserts 2 rows in one insert.

>>insert into hbase."_CELL_".ht values ('1', 'cf', 'z', -1, '101');

--- 1 row(s) inserted.

>>insert into hbase."_CELL_".ht values ('2', 'cf', 'y', -1, '202'),

('2', 'cf', 'w', -1, '303');

--- 2 row(s) inserted.

>>select * from hbase."_CELL_".ht;

  ROW_ID     COL_FAMILY    COL_NAME    COL_TIMESTAMP      COL_VALUE

------------    -----------------    ---------------     -----------------------      ---------------

1                   cf                        z                      1412546755474         101

2                   cf                        w                     1412546755474         303

2                   cf                        y                      1412546755474         202                                                                           

--- 3 row(s) selected.

 

2.       Row Format

Insert, Update and Delete operations are allowed in row format. 

Insert using column_create function

This function is used to create a column value. It can then be used in a row format insert or update statement.

  Syntax: column_create(‘<column-family>:<column-name>’, ‘<column-value>’) 

  • In this example, 2 rows are being inserted. First one has 1 column and second one has 2 columns:

>>insert into hbase."_ROW_".ht values

+>                              ('1', column_create('cf:1b', '102')),

+>                              ('2', column_create(('cf:2b', '201'),

+>                                                           ('cf:2c', '202')));

--- 2 row(s) inserted.

 

>>select left(row_id, 10) row_id, left(column_display(column_details), 40)

           from hbase."_ROW_".ht;

 

ROW_ID      (EXPR)                                 

----------  ------------------------------

1           cf:1b => 102              

2           cf:2b => 201, cf:2c => 202

--- 2 row(s) selected.

 

Update using column_details function

This function is used to update a row

  • In this example, value for row ‘2’, column ‘cf:2b’ is updated from 201 to 301.

It is done after looking up column ‘cf:2c’.

>>update  hbase."_ROW_".ht set column_details =

      column_create ('cf:2b', '301')

   where row_id = '2' and column_lookup(column_details, 'cf:2c') = '202'; 

--- 1 row(s) updated.


>>select left(row_id, 10) row_id, left(column_display(column_details), 40)

           from hbase."_ROW_".ht; 

ROW_ID      (EXPR)                                 

----------  ----------------------------------------

1           cf:1b => 102             

2           cf:2b => 301, cf:2c => 202

--- 2 row(s) selected.

 

  Delete operation

Delete all cells in row with rowid ‘1’

>>delete from hbase."_ROW_".ht where row_id = '1';

--- 1 row(s) deleted.

 

>>select left(row_id, 10) row_id,       

     left(column_display(column_details), 40)

           from hbase."_ROW_".ht; 

ROW_ID      (EXPR)                                 

----------  ----------------------------------------

2           cf:2b => 301, cf:2c => 202

--- 1 row(s) selected.

 

Delete all rows:

 >>delete from hbase."_ROW_".ht;

--- 1 row(s) deleted.


>>select count(*) from hbase."_ROW_".ht; 

(EXPR)             

--------------------

                   0

--- 1 row(s) selected.

 

Transactional operation

  • This example shows transactional operation on native hbase table.

Rows are inserted within a transaction where they are visible. A rollback operation then aborts it and all inserted rows are removed.

 

>>select * from hbase."_ROW_".ht; 

--- 0 row(s) selected.


>>begin work; 

--- SQL operation complete.


>>insert into hbase."_ROW_".ht values

       ('4', column_create(('cf:4a', '301'),

      ('cf:4b', '302'))); 

--- 1 row(s) inserted.


>>select left(row_id, 10) row_id,   

     left(column_display(column_details), 40)

           from hbase."_ROW_".ht where row_id = '4'; 

ROW_ID      (EXPR)                                 

----------  ----------------------------------------

4           cf:4a => 301, cf:4b => 302             

--- 1 row(s) selected.


>>rollback work; 

--- SQL operation complete.


>>select * from hbase."_ROW_".ht; 

--- 0 row(s) selected.

 

  • In this example, rows are deleted within a transaction. If rolled back, they are undeleted. If committed, they remain deleted.

 

>>select count(*) from hbase."_ROW_".ht;

(EXPR)             

--------------------

                   1

--- 1 row(s) selected.


>>begin work; 

--- SQL operation complete.

>>delete from hbase."_ROW_".ht;

--- 1 row(s) deleted.

>>select count(*) from hbase."_ROW_".ht;

(EXPR)             

--------------------

                   0

--- 1 row(s) selected.

>>rollback work;

--- SQL operation complete.

>>select count(*) from hbase."_ROW_".ht;

(EXPR)             

--------------------

                   1

--- 1 row(s) selected.

 

 

>>begin work; 

--- SQL operation complete.

>>delete from hbase."_ROW_".ht;

--- 1 row(s) deleted.

>>select count(*) from hbase."_ROW_".ht;

(EXPR)             

--------------------

                   0

--- 1 row(s) selected.

>>commit work;

--- SQL operation complete.

>>select count(*) from hbase."_ROW_".ht;

(EXPR)             

--------------------

                   0

--- 1 row(s) selected.

 

 

  • No labels