Note: Earliest version AvroSerde is available

The AvroSerde is available in Hive 0.9.1 and greater.

Overview -- Working with Avro from Hive

The AvroSerde allows users to read or write Avro data as Hive tables. The AvroSerde:

  • Infers the schema of the Hive table from the Avro schema. Starting in Hive 0.14, the Avro schema can also be inferred from the Hive table schema.
  • Reads all Avro files within a table against a specified schema, taking advantage of Avro's backwards compatibility.
  • Supports arbitrarily nested schemas.
  • Translates all Avro data types into equivalent Hive types. Most types map exactly, but some Avro types don't exist in Hive and are automatically converted by the AvroSerde.
  • Understands compressed Avro files.
  • Transparently converts the Avro idiom of handling nullable types as Union[T, null] into just T and returns null when appropriate.
  • Writes any Hive table to Avro files.
  • Has worked reliably against our most convoluted Avro schemas in our ETL process.
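
The schema inference, type translation, and nullable-union handling listed above can be illustrated with a small sketch. This is not the AvroSerde's implementation: `avro_to_hive` is a hypothetical helper, the type names follow the usual Avro-to-Hive correspondence (for example `long` to `bigint`, `enum` to `string`, `fixed` to `binary`), and named-type references and logical types are deliberately left out.

```python
import json

# Avro primitive types and the Hive types they are read as.
PRIMITIVES = {
    "null": "void",
    "boolean": "boolean",
    "int": "int",
    "long": "bigint",
    "float": "float",
    "double": "double",
    "bytes": "binary",
    "string": "string",
}


def avro_to_hive(schema):
    """Return a Hive type string for a parsed Avro schema (hypothetical helper)."""
    if isinstance(schema, str):
        # Primitive name; references to named types are not handled in this sketch.
        return PRIMITIVES[schema]
    if isinstance(schema, list):
        # A union. The nullable idiom [T, null] collapses to plain T.
        branches = [s for s in schema if s != "null"]
        if len(schema) == 2 and len(branches) == 1:
            return avro_to_hive(branches[0])
        # Simplification: any other union keeps all of its branches.
        return "uniontype<" + ",".join(avro_to_hive(s) for s in schema) + ">"
    kind = schema["type"]
    if kind == "record":
        fields = ",".join(
            f"{f['name']}:{avro_to_hive(f['type'])}" for f in schema["fields"]
        )
        return f"struct<{fields}>"
    if kind == "array":
        return f"array<{avro_to_hive(schema['items'])}>"
    if kind == "map":
        # Avro map keys are always strings.
        return f"map<string,{avro_to_hive(schema['values'])}>"
    if kind == "enum":
        return "string"  # Hive has no enum type
    if kind == "fixed":
        return "binary"  # Hive has no fixed-length bytes type
    # Wrapped form such as {"type": "string"}.
    return avro_to_hive(kind)


# Example record schema (made up for illustration).
schema = json.loads("""
{
  "type": "record",
  "name": "Episode",
  "fields": [
    {"name": "title", "type": "string"},
    {"name": "air_date", "type": ["null", "string"]},
    {"name": "ratings", "type": {"type": "array", "items": "long"}}
  ]
}
""")

# The fields of the top-level record become the table's columns.
columns = {f["name"]: avro_to_hive(f["type"]) for f in schema["fields"]}
print(columns)
```

The recursion mirrors the "arbitrarily nested schemas" point: records, arrays, and maps can contain one another to any depth, and each level is translated by the same function.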

...