...

  • trunk/conf - This directory contains the packaged hive-default.xml and hive-site.xml.
  • trunk/data - This directory contains some data sets and configurations used in the Hive tests.
  • trunk/ivy - This directory contains the Ivy files used by the build infrastructure to manage dependencies on different Hadoop versions.
  • trunk/lib - This directory contains the runtime libraries needed by Hive.
  • trunk/testlibs - This directory contains the junit.jar used by the junit target in the build infrastructure.
  • trunk/testutils (Deprecated)

Hive SerDe

What is a SerDe?

  • SerDe is a short name for "Serializer and Deserializer."
  • Hive uses SerDe (and FileFormat) to read and write table rows.
  • HDFS files --> InputFileFormat --> <key, value> --> Deserializer --> Row object
  • Row object --> Serializer --> <key, value> --> OutputFileFormat --> HDFS files

Note that the "key" part is ignored when reading and is always a constant when writing. In effect, the row object is stored in the "value".
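The read and write paths above can be illustrated with a minimal Java sketch. It is not Hive's actual SerDe interface: `KeyValue`, `ToyDelimitedSerDe`, the comma delimiter, and the empty-string constant key are all hypothetical names and choices made for illustration. The sketch only shows that the key is ignored on read, that a constant key is emitted on write, and that the row lives entirely in the value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the <key, value> pair handed over by a file format.
final class KeyValue {
    final String key;
    final String value;

    KeyValue(String key, String value) {
        this.key = key;
        this.value = value;
    }
}

// Toy SerDe (an assumption for illustration, not Hive's real API):
// a row is a list of string columns joined by a delimiter.
final class ToyDelimitedSerDe {
    // On write, the key is always this constant.
    static final String CONSTANT_KEY = "";

    private final String delimiter;

    ToyDelimitedSerDe(String delimiter) {
        this.delimiter = delimiter;
    }

    // Read path: <key, value> --> Deserializer --> row object.
    // The key is never inspected; only the value is parsed.
    List<String> deserialize(KeyValue record) {
        // Limit -1 keeps trailing empty columns instead of dropping them.
        return Arrays.asList(record.value.split(Pattern.quote(delimiter), -1));
    }

    // Write path: row object --> Serializer --> <key, value>.
    KeyValue serialize(List<String> row) {
        return new KeyValue(CONSTANT_KEY, String.join(delimiter, row));
    }
}

class SerDeSketch {
    public static void main(String[] args) {
        ToyDelimitedSerDe serde = new ToyDelimitedSerDe(",");

        // Whatever the key holds, the resulting row is the same.
        KeyValue input = new KeyValue("some-key-that-is-ignored", "1,alice,NYC");
        List<String> row = serde.deserialize(input);
        System.out.println(row);

        // Serializing the row back stores it entirely in the value.
        KeyValue output = serde.serialize(row);
        System.out.println(output.value);
    }
}
```

In a real deployment the row object and the value type are richer than strings, but the division of labor is the same: the file format produces and consumes the pairs, and the SerDe converts between the value and the row.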

...

Compiling and Running Hive

Note

Hive now uses Maven for its build; see the updated Hive Maven build instructions.

...

Ant to Maven

As of version 0.13, Hive uses Maven instead of Ant for its build. See the Hive Developer FAQ for updated instructions. The following instructions are not currently up to date.

Hive can be made to compile against different versions of Hadoop.

...