Versions Compared

Comment: make Row Group Index & Bloom Filter Index level-4 headings

...

Encoding  Stream Kind  Optional  Contents
DIRECT    PRESENT      Yes       Boolean RLE
          DATA         No        Byte RLE

Indexes

Row Group Index

The row group indexes consist of a ROW_INDEX stream for each primitive
column that has an entry for each row group. Row groups are controlled
by the writer and default to 10,000 rows. Each RowIndexEntry gives the
position of each stream for the column and the statistics for that row
group.
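
As a rough illustration of how per-row-group statistics allow a reader to skip data, the sketch below builds one index entry per 10,000 rows and selects the row groups whose minimum/maximum range overlaps a query range. The names `RowIndexEntry`, `build_row_index`, and `groups_to_read` are hypothetical, the `positions` field is a simplified stand-in (a single row offset rather than real per-stream positions), and none of this is the ORC reader API.

```python
from dataclasses import dataclass
from typing import List

ROWS_PER_GROUP = 10_000  # default row group size stated in the text above


@dataclass
class RowIndexEntry:
    # Hypothetical, simplified entry: real entries record a position for
    # each stream of the column; here we keep only the starting row.
    positions: List[int]
    minimum: int
    maximum: int


def build_row_index(values: List[int],
                    rows_per_group: int = ROWS_PER_GROUP) -> List[RowIndexEntry]:
    """Create one entry per row group with min/max statistics."""
    entries = []
    for start in range(0, len(values), rows_per_group):
        group = values[start:start + rows_per_group]
        entries.append(RowIndexEntry([start], min(group), max(group)))
    return entries


def groups_to_read(entries: List[RowIndexEntry], lo: int, hi: int) -> List[int]:
    """Return indexes of row groups whose [min, max] overlaps [lo, hi]."""
    return [i for i, e in enumerate(entries)
            if e.maximum >= lo and e.minimum <= hi]
```

With 25,000 sorted values, a query for the range 12,000 to 12,500 would touch only the second row group; the recorded position tells the reader where to seek instead of scanning from the start of the column.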

...

Because dictionaries are accessed randomly, there is not a position to
record for the dictionary and the entire dictionary must be read even
if only part of a stripe is being read.

Bloom Filter Index

Info: Version 1.2.0+: Bloom Filter

Bloom filters were added to ORC indexes in Hive 1.2.0. Predicate
pushdown can use them to prune more of the row groups that cannot
satisfy the filter condition.
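
The pruning idea can be sketched with a toy Bloom filter kept per row group. This is an illustration only: the class `BloomFilter`, the helper `candidate_row_groups`, the bit-set size, and the use of SHA-256 as the hash are all assumptions made for the example, and they do not reflect ORC's actual hashing or on-disk encoding.

```python
import hashlib
from typing import Iterable, Iterator, List


class BloomFilter:
    """Toy Bloom filter; sizes and hash choice are illustrative assumptions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # Python int used as a bit set

    def _indexes(self, value: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the value with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for i in self._indexes(value):
            self.bits |= 1 << i

    def might_contain(self, value: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> i) & 1 for i in self._indexes(value))


def candidate_row_groups(row_groups: Iterable[Iterable[str]],
                         wanted: str) -> List[int]:
    """Build one filter per row group and keep groups that may hold wanted."""
    kept = []
    for idx, rows in enumerate(row_groups):
        bloom = BloomFilter()
        for value in rows:
            bloom.add(value)
        if bloom.might_contain(wanted):
            kept.append(idx)
    return kept
```

A Bloom filter never reports a present value as absent, so a row group that fails the check can be skipped safely for an equality predicate; a row group that passes may still turn out not to contain the value and has to be read to find out.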

...