...

// Hive version 0.11 through 0.14:
hive --orcfiledump <location-of-orc-file>
 
// Hive version 1.1.0 and later:
hive --orcfiledump [-d] [--rowindex <col_ids>] <location-of-orc-file>
 
// Hive version 1.2.0 and later:
hive --orcfiledump [-d] [-t] [--rowindex <col_ids>] <location-of-orc-file>
 
// Hive version 1.3.0 and later:
hive --orcfiledump [-j] [-p] [-d] [-t] [--rowindex <col_ids>] <location-of-orc-file>

Adding -d to the command will cause it to dump the data in the ORC file rather than the metadata (Hive 1.1.0 and later).

Adding --rowindex with a comma-separated list of column IDs will cause it to print row indexes for the specified columns, where 0 is the top-level struct containing all of the columns and 1 is the first column ID (Hive 1.1.0 and later).

Adding -t to the command will print the time zone ID of the writer (Hive 1.2.0 and later).

Adding -j to the command will print the ORC file metadata in JSON format. To pretty-print the JSON metadata, add -p to the command (Hive 1.3.0 and later).

<location-of-orc-file> is the URI of the ORC file.
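
As a rough illustration of how the options above combine, the following shell sketch lists a few typical invocations. The file location and the run helper are hypothetical and exist only for this example; by default the helper prints each command instead of executing it, so the sketch can be tried without a Hive installation. Set RUN=1 to execute the commands against a real file.

```shell
#!/bin/sh
# Hypothetical ORC file location, for illustration only; substitute a real URI.
ORC_FILE="${ORC_FILE:-/tmp/example.orc}"

# Hypothetical helper: print the command unless RUN=1 is set, in which case run it.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Default behaviour: print the file metadata.
run hive --orcfiledump "$ORC_FILE"

# Dump the row data instead of the metadata (Hive 1.1.0 and later).
run hive --orcfiledump -d "$ORC_FILE"

# Print row indexes for the first and third columns (Hive 1.1.0 and later).
# Column ID 0 would refer to the top-level struct.
run hive --orcfiledump --rowindex 1,3 "$ORC_FILE"

# Print the metadata as pretty-printed JSON (Hive 1.3.0 and later).
run hive --orcfiledump -j -p "$ORC_FILE"
```

Which flags are accepted depends on the Hive version in use, as listed in the synopsis above.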

ORC Configuration Parameters

...