Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migrated to Confluence 5.3

...

The PDK is planned for inclusion in the Hive 0.8.0 release; until thenthat is available, please download this snapshot version of Apache Hive built by John Sichi (from a recent snapshot build from Jenkins; make sure it includes HIVE-2244).

Currently, the PDK is only targeted at user defined functions (including UDAF's and UDTF'S), although it may be possible to use it for building other kinds of plugins such as serdes, input/output formats, storage handlers and index handlers. The PDK's test framework currently only supports automated testing of UDF's.

...

To demonstrate the PDK in action, the Hive release includes an examples/test-plugin directory. You can build the test plugin by changing to that directory and running

Code Block

ant -Dhive.install.dir=../..

...

You can run the tests associated with the plugin via

Code Block

ant -Dhive.install.dir=../.. test

If all is well, you should see output like

Code Block

Buildfile: /hive-0.8.0-SNAPSHOT/examples/test-plugin/build.xml

get-class-list:

test:
    [junit] Running org.apache.hive.pdk.PluginTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 38.955 sec

BUILD SUCCESSFUL

...

To create your own plugin, you can follow the patterns from the example plugin. Let's take a closer look at it. First, the build.xml:

Code Block

<project name="pdktest" default="package">
  <property name="plugin.libname" value="pdk-test-udf"/>
  <property name="plugin.title" value="Hive PDK Test UDF Library"/>
  <property name="plugin.version" value="0.1"/>
  <property name="plugin.vendor" value="Apache Software Foundation"/>
  <property name="function.sql.prefix" value="tp_"/>
  <import file="${hive.install.dir}/scripts/pdk/build-plugin.xml"/>
</project>

...

Now let's take a look at the source code for a UDF.

Code Block

package org.apache.hive.pdktest;

import org.apache.hive.pdk.HivePdkUnitTest;
import org.apache.hive.pdk.HivePdkUnitTests;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Example UDF for rot13 transformation.
 */
@Description(name = "rot13",
  value = "_FUNC_(str) - Returns str with all characters transposed via rot13",
  extended = "Example:\n"
  + "  > SELECT _FUNC_('Facebook') FROM src LIMIT 1;\n" + "  'Snprobbx'")
@HivePdkUnitTests(
    setup = "create table rot13_data(s string); "
    + "insert overwrite table rot13_data select 'Facebook' from onerow;",
    cleanup = "drop table if exists rot13_data;",
    cases = {
      @HivePdkUnitTest(
        query = "SELECT tp_rot13('Mixed Up!') FROM onerow;",
        result = "Zvkrq Hc!"),
      @HivePdkUnitTest(
        query = "SELECT tp_rot13(s) FROM rot13_data;",
        result = "Snprobbx")
    }
  )
public class Rot13 extends UDF {
  private Text t = new Text();

  public Rot13() {
  }

  public Text evaluate(Text s) {
    StringBuilder out = new StringBuilder(s.getLength());
    char[] ca = s.toString().toCharArray();
    for (char c : ca) {
      if (c >= 'a' && c <= 'm') {
        c += 13;
      } else if (c >= 'n' && c <= 'z') {
        c -= 13;
      } else if (c >= 'A' && c <= 'M') {
        c += 13;
      } else if (c >= 'N' && c <= 'Z') {
        c -= 13;
      }
      out.append(c);
    }
    t.set(out.toString());
    return t;
  }
}

...

  • support annotations for other plugin types
  • add more annotations for automatically validating function parameters at runtime (instead of requiring the developer to write imperative Java code for this)
  • add Eclipse support
  • move Hive builtins to use PDK for more convenient testing optimize test performance (invoking CLI over and over is very slow)(HIVE-2523)
  • command-line option for invoking a single testcase