Hive Plugin Developer Kit

This page explains Apache Hive's Plugin Developer Kit, or PDK. The PDK lets developers build and test Hive plugins without setting up a Hive source build; only a Hive binary release is needed.

The PDK is planned for inclusion in the Hive 0.8.0 release; until then, please download this snapshot version of Apache Hive built by John Sichi.

Currently, the PDK targets only user-defined functions (including UDAFs and UDTFs), although it may be possible to use it to build other kinds of plugins, such as SerDes, input/output formats, storage handlers, and index handlers. The PDK's test framework currently supports automated testing of UDFs only.

Example Plugin

To demonstrate the PDK in action, the Hive release includes an examples/test-plugin directory. You can build the test plugin by changing to that directory and running:

ant -Dhive.install.dir=../..

This creates a build subdirectory containing the compiled plugin: pdk-test-udf-0.1.jar. It also creates a build/metadata directory containing add-jar.sql (which demonstrates the command for loading the plugin jar) and class-registration.sql (which demonstrates the commands for loading the UDFs from the plugin).

You can run the tests associated with the plugin via:

ant -Dhive.install.dir=../.. test

If all is well, you should see output like:

Buildfile: /hive-0.8.0-SNAPSHOT/examples/test-plugin/build.xml


    [junit] Running org.apache.hive.pdk.PluginTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 38.955 sec


Your Own Plugin

To create your own plugin, you can follow the patterns from the example plugin. Let's take a closer look at it. First, the build.xml:

<project name="pdktest" default="package">
  <property name="plugin.libname" value="pdk-test-udf"/>
  <property name="plugin.title" value="Hive PDK Test UDF Library"/>
  <property name="plugin.version" value="0.1"/>
  <property name="plugin.vendor" value="Apache Software Foundation"/>
  <property name="function.sql.prefix" value="tp_"/>
  <import file="${hive.install.dir}/scripts/pdk/build-plugin.xml"/>
</project>

All this buildfile does is define a few properties and then import a build script from the PDK, which does the rest (including defining the package and test targets used for building and testing the plugin). So for your own plugin, change the property values accordingly, and set hive.install.dir to the location where you've installed the Hive release.

The imported PDK buildfile assumes the following layout for your plugin source tree:

  • your-plugin-root
    • build.xml
    • src: contains Java source files
    • test: contains setup.sql, cleanup.sql, and any datafiles needed by your tests

For the example plugin, the datafile onerow.txt contains a single row of data; setup.sql creates a table named onerow and loads the datafile into it, whereas cleanup.sql drops the onerow table. The onerow table is convenient for testing UDFs.
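
As a rough illustration of the layout described above, the following shell sketch scaffolds a new plugin directory. Everything specific in it is an assumption made for the example rather than something taken from the PDK: the my-plugin directory name, the property values in build.xml, the my_ function prefix, the single-column table definition, the row value, and the relative path in the LOAD DATA statement. Compare the generated files against the ones in examples/test-plugin before relying on them.

```shell
#!/bin/sh
# Scaffold the directory layout the PDK buildfile expects.
# All names and file contents below are illustrative assumptions.
set -eu

PLUGIN_ROOT="${PLUGIN_ROOT:-my-plugin}"   # hypothetical plugin root

# src holds Java sources; test holds setup.sql, cleanup.sql, and datafiles.
mkdir -p "$PLUGIN_ROOT/src" "$PLUGIN_ROOT/test"

# build.xml modeled on the example plugin; the property values are placeholders.
# The quoted 'EOF' keeps the shell from expanding ${hive.install.dir}.
cat > "$PLUGIN_ROOT/build.xml" <<'EOF'
<project name="myplugin" default="package">
  <property name="plugin.libname" value="my-udf"/>
  <property name="plugin.title" value="My UDF Library"/>
  <property name="plugin.version" value="0.1"/>
  <property name="plugin.vendor" value="Example Vendor"/>
  <property name="function.sql.prefix" value="my_"/>
  <import file="${hive.install.dir}/scripts/pdk/build-plugin.xml"/>
</project>
EOF

# A datafile with a single row, mirroring the role of onerow.txt.
printf 'example\n' > "$PLUGIN_ROOT/test/onerow.txt"

# setup.sql: create the onerow table and load the datafile.
# The column definition and the relative path are assumptions.
cat > "$PLUGIN_ROOT/test/setup.sql" <<'EOF'
CREATE TABLE onerow (s STRING);
LOAD DATA LOCAL INPATH 'test/onerow.txt' OVERWRITE INTO TABLE onerow;
EOF

# cleanup.sql: drop the table created by setup.sql.
cat > "$PLUGIN_ROOT/test/cleanup.sql" <<'EOF'
DROP TABLE onerow;
EOF

# Show what was created.
find "$PLUGIN_ROOT" | sort
```

After adding Java sources under src, the plugin would be built and tested from the plugin root in the same way as the example, by passing -Dhive.install.dir with the path to your Hive installation.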
