Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Parquet (http://parquet.io/) is a columnar storage format for Hadoop.

Table of Contents

Summary of Hive Parquet support

...

Introduction to Parquet

Parquet is (http://parquet.io/)  is an ecosystem wide columnar format for Hadoop. At the time of this writing it supports:

Engines

  • Apache Hive
  • Apache Drill
  • Cloudera Impala
  • Apache Crunch
  • Apache Pig
  • Cascading

Data description

  • Apache Avro
  • Apache Thrift
  • Google Protocol Buffers

...