
Hive JDBC Driver

Table of Contents

The current JDBC interface for Hive only supports running queries and fetching results. Only a small subset of the metadata calls are supported.

To see how the JDBC interface can be used, see sample code.
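For a rough idea of the shape of such a client, here is a minimal sketch (not the official sample code): it assumes a HiveServer instance on localhost:10000 and a table named testHiveDriverTable, and it needs the Hive and Hadoop jars on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcQuery {
    // Driver class and connection URL as used throughout this page.
    static final String DRIVER = "org.apache.hadoop.hive.jdbc.HiveDriver";
    static final String URL = "jdbc:hive://localhost:10000/default";

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(DRIVER);
        } catch (ClassNotFoundException e) {
            // Driver jar not on the classpath; bail out with a hint.
            System.err.println("Hive JDBC driver not found: " + e.getMessage());
            return;
        }
        // Username and password are ignored by this version of the driver.
        Connection con = DriverManager.getConnection(URL, "", "");
        Statement stmt = con.createStatement();
        // Run a query and fetch results -- the part of the interface that is supported.
        ResultSet res = stmt.executeQuery("SELECT * FROM testHiveDriverTable");
        while (res.next()) {
            System.out.println(res.getString(1) + "\t" + res.getString(2));
        }
        con.close();
    }
}
```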

Integration with Pentaho

  1. Download Pentaho Report Designer from the Pentaho website.
  2. Overwrite report-designer.sh with the code provided below.

    Code Block
    
    #!/bin/sh
    
    HADOOP_CORE=$(ls $HADOOP_HOME/hadoop-*-core.jar)
    CLASSPATH=.:$HADOOP_CORE:$HIVE_HOME/conf
    
    for i in ${HIVE_HOME}/lib/*.jar ; do
      CLASSPATH=$CLASSPATH:$i
    done
    
    CLASSPATH=$CLASSPATH:launcher.jar
    
    echo java -XX:MaxPermSize=512m -cp $CLASSPATH -jar launcher.jar
    java -XX:MaxPermSize=512m -cp $CLASSPATH org.pentaho.commons.launcher.Launcher
  3. Build and start the hive server with instructions from HiveServer.
  4. Compile and run the Hive JDBC client code to load some data (I haven't figured out how to do this in report designer yet). See sample code for loading the data.
  5. Run the report designer (note step 2).

    Code Block
    $ sh report-designer.sh
    
  6. Select 'Report Design Wizard'.
  7. Select a template, say the 'fall template', and click next.
  8. Create a new data source - JDBC (custom), Generic database.
  9. Provide Hive JDBC parameters. Give the connection a name 'hive'.

    Code Block
       URL: jdbc:hive://localhost:10000/default
       Driver name: org.apache.hadoop.hive.jdbc.HiveDriver

    Username and password are empty.
  10. Click on 'Test'. The test should succeed.
  11. Edit the query: select 'Sample Query', click edit query, and click on the connection 'hive'. Create a new query. Write a query on the table testHiveDriverTable, for example: select * from testHiveDriverTable. Click next.
  12. Layout Step: Add PageOfPages to Group Items By. Add key and value as Selected Items. Click next, then Finish.
  13. Change the report header to 'hive-pentaho-report'. Change the type of the header to 'html'.
  14. Run the report and generate a PDF. You should get something like the report attached here.
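Step 4 above loads the sample data over JDBC before the report is designed. A minimal sketch of that loading step is below; it is not the official sample code, and it assumes a HiveServer instance on localhost:10000, a table named testHiveDriverTable, and a local data file /tmp/a.txt (all of these names are assumptions you should adjust). Note that this era of the driver runs DDL through executeQuery.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcLoad {
    // Connection URL from this page; adjust host/port/database as needed.
    static final String URL = "jdbc:hive://localhost:10000/default";

    // Builds the DDL for a two-column sample table (schema is an assumption).
    public static String createTableSql(String table) {
        return "CREATE TABLE " + table + " (key INT, value STRING)";
    }

    // Builds the LOAD DATA statement for a local file (path is an assumption).
    public static String loadDataSql(String path, String table) {
        return "LOAD DATA LOCAL INPATH '" + path + "' INTO TABLE " + table;
    }

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
        } catch (ClassNotFoundException e) {
            // Driver jar not on the classpath; nothing to do without it.
            System.err.println("Hive JDBC driver not found: " + e.getMessage());
            return;
        }
        // Username and password are ignored by this version of the driver.
        Connection con = DriverManager.getConnection(URL, "", "");
        Statement stmt = con.createStatement();
        stmt.executeQuery(createTableSql("testHiveDriverTable"));
        stmt.executeQuery(loadDataSql("/tmp/a.txt", "testHiveDriverTable"));
        con.close();
    }
}
```

Once this has run, the query in step 11 (select * from testHiveDriverTable) has rows to return.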

Integration with SQuirrel SQL Client

  1. Download, install and start the SQuirrel SQL Client from the SQuirrel SQL website: http://squirrel-sql.sourceforge.net/.
  2. Select 'Drivers -> New Driver...' to register the Hive JDBC driver. Enter the driver name and example URL:

    Code Block
       Name: Hive
       Example URL: jdbc:hive://localhost:10000/default
  3. Select 'Extra Class Path -> Add' to add the following jars from your local Hive and Hadoop distribution. You will need to build Hive from the trunk after the commit of HIVE-679 (https://issues.apache.org/jira/browse/HIVE-679).

    Code Block
       HIVE_HOME/build/dist/lib/*.jar
       HADOOP_HOME/hadoop-*-core.jar
  4. Select 'List Drivers'. This will cause SQuirrel to parse your jars for JDBC drivers, which might take a few seconds. From the 'Class Name' input box select the Hive driver:

    Code Block
       org.apache.hadoop.hive.jdbc.HiveDriver
  5. Click 'OK' to complete the driver registration.
  6. Select 'Aliases -> Add Alias...' to create a connection alias to your Hive server.
    1. Give the connection alias a name in the 'Name' input box.
    2. Select the Hive driver from the 'Driver' drop-down.
    3. Modify the example URL as needed to point to your Hive server.
    4. Leave 'User Name' and 'Password' blank and click 'OK' to save the connection alias.
  7. To connect to the Hive server, double-click the Hive alias and click 'Connect'.
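What SQuirrel does when it registers and lists the driver can be sketched in plain JDBC: load the driver class, then ask DriverManager which registered driver accepts the URL. This is an illustrative sketch, not part of SQuirrel; the URL and driver class name are the ones used throughout this page.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;

public class HiveDriverCheck {
    static final String DRIVER = "org.apache.hadoop.hive.jdbc.HiveDriver";
    static final String URL = "jdbc:hive://localhost:10000/default";

    // True only if the Hive JDBC jar is on the classpath (step 3's 'Extra Class Path').
    public static boolean driverOnClasspath() {
        try {
            Class.forName(DRIVER);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws SQLException {
        if (!driverOnClasspath()) {
            System.out.println("Driver not found; add the Hive and Hadoop jars to the classpath.");
            return;
        }
        // Loading the class registers the driver; this mirrors 'List Drivers'.
        Driver d = DriverManager.getDriver(URL);
        System.out.println("Registered driver: " + d.getClass().getName());
    }
}
```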


Also note that when a query is running, support for the 'Cancel' button is not yet available.