
...

Key                       Default      Notes

orc.compress              ZLIB         high-level compression (one of NONE, ZLIB, SNAPPY)
orc.compress.size         262,144      number of bytes in each compression chunk
orc.stripe.size           268,435,456  number of bytes in each stripe
orc.row.index.stride      10,000       number of rows between index entries (must be >= 1000)
orc.create.index          true         whether to create row indexes
orc.bloom.filter.columns  ""           comma-separated list of column names for which bloom filters should be created
orc.bloom.filter.fpp      0.05         false positive probability for bloom filter (must be > 0.0 and < 1.0)

For example, creating an ORC stored table without compression:

...

The goal of the column statistics is that for each column, the writer
records the count and, depending on the type, other useful fields. For
most of the primitive types, it records the minimum and maximum
values; for numeric types it additionally stores the sum.
From Hive 1.2.0 onwards, the column statistics also record whether
there are any null values within the row group by setting the hasNull
flag. The hasNull flag is used by ORC's predicate pushdown to better
answer 'IS NULL' queries.

message ColumnStatistics {
 // the number of values
 optional uint64 numberOfValues = 1;
 // At most one of these has a value for any column
 optional IntegerStatistics intStatistics = 2;
 optional DoubleStatistics doubleStatistics = 3;
 optional StringStatistics stringStatistics = 4;
 optional BucketStatistics bucketStatistics = 5;
 optional DecimalStatistics decimalStatistics = 6;
 optional DateStatistics dateStatistics = 7;
 optional BinaryStatistics binaryStatistics = 8;
 optional TimestampStatistics timestampStatistics = 9;
 optional bool hasNull = 10;
}
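To illustrate how a reader could use these statistics, here is a hedged Python sketch (not Hive's actual predicate-pushdown code) of row-group pruning with the hasNull flag and min/max values; the `can_skip_row_group` function and its argument shapes are hypothetical:

```python
def can_skip_row_group(predicate, stats):
    """Return True when the row-group statistics prove no row can match.

    stats: dict with 'min', 'max', and 'hasNull' as recorded per row group.
    predicate: ('IS NULL',) for an IS NULL test, or ('=', value) for equality.
    """
    op = predicate[0]
    if op == 'IS NULL':
        # if the row group recorded no nulls, IS NULL can never match
        return not stats['hasNull']
    if op == '=':
        value = predicate[1]
        # equality can only match inside the [min, max] range
        return value < stats['min'] or value > stats['max']
    return False  # unknown predicate: cannot safely prune
```

This mirrors the intent described above: hasNull lets the reader answer 'IS NULL' without scanning the row group at all.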

For integer types (tinyint, smallint, int, bigint), the column
statistics include the minimum, maximum, and sum. If the sum
overflows a long at any point during the calculation, no sum is
recorded.
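The overflow rule can be modeled as follows; this is an illustrative Python sketch under the stated semantics, not Hive's implementation (the class and field names are hypothetical):

```python
class IntegerStatistics:
    """Integer column statistics with overflow-aware sum tracking."""
    INT64_MIN, INT64_MAX = -(1 << 63), (1 << 63) - 1

    def __init__(self):
        self.minimum = None
        self.maximum = None
        self.sum = 0
        self.has_sum = True  # cleared permanently once the sum overflows

    def update(self, value):
        if self.minimum is None or value < self.minimum:
            self.minimum = value
        if self.maximum is None or value > self.maximum:
            self.maximum = value
        if self.has_sum:
            s = self.sum + value
            if s < self.INT64_MIN or s > self.INT64_MAX:
                # sum overflowed the long range: drop it for good
                self.has_sum = False
            else:
                self.sum = s
```

Note the sum is abandoned rather than wrapped, so a recorded sum is always exact.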

...

Encoding  Stream Kind  Optional  Contents
DIRECT    PRESENT      Yes       Boolean RLE
          DATA         No        Byte RLE

Indexes

Row Group Index

The row group indexes consist of a ROW_INDEX stream for each primitive
column that has an entry for each row group. Row groups are controlled
by the writer and default to 10,000 rows. Each RowIndexEntry gives the
position of each stream for the column and the statistics for that row
group.

...

Because dictionaries are accessed randomly, there is not a position to
record for the dictionary and the entire dictionary must be read even
if only part of a stripe is being read.

Bloom Filter Index

Info: Version 1.2.0+: Bloom Filter

Bloom filters are added to ORC indexes from Hive 1.2.0 onwards. Predicate pushdown can make use of bloom filters to better prune the row groups that do not satisfy the filter condition.

The bloom filter indexes consist of a BLOOM_FILTER stream for each
column specified through the 'orc.bloom.filter.columns' table property.
A BLOOM_FILTER stream records a bloom filter entry for each row group
(default of 10,000 rows) in a column. In the presence of a bloom filter
stream, predicate pushdown in ORC will make use of the bloom filter
indexes instead of the min/max stats from the row group indexes.

Each BloomFilterEntry stores the number of hash functions ('k') used and
the bitset backing the bloom filter. The bitset is serialized as repeated
longs, from which the number of bits ('m') for the bloom filter can be
derived: m = bitset.length * 64.

message BloomFilter {
 optional uint32 numHashFunctions = 1;
 repeated fixed64 bitset = 2;
}

message BloomFilterIndex {
 repeated BloomFilter bloomFilter = 1;
}
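The section above does not spell out how m and k are chosen from the row group size and 'orc.bloom.filter.fpp'. The standard bloom filter sizing formulas are the natural fit, and they are consistent with deriving m back from the serialized bitset length; the sketch below assumes those formulas (the function names are hypothetical):

```python
import math

def bloom_filter_size_bits(n, fpp):
    """Standard bloom filter sizing: m = -n * ln(p) / (ln 2)^2.

    n: expected number of entries (e.g. the row group size, 10,000).
    fpp: desired false positive probability ('orc.bloom.filter.fpp').
    """
    return math.ceil(-n * math.log(fpp) / (math.log(2) ** 2))

def num_hash_functions(n, m):
    """Optimal number of hash functions: k = (m / n) * ln 2."""
    return max(1, round((m / n) * math.log(2)))
```

With the defaults (n = 10,000 rows per row group, fpp = 0.05), this yields roughly 62,000 bits and k = 4 hash functions per row-group entry.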

The bloom filter internally uses two different hash functions to map a
key to a position in the bit set. For tinyint, smallint, int, bigint,
float, and double types, Thomas Wang's 64-bit integer hash function is
used. Floats are converted to their IEEE-754 32-bit representation
(using Java's Float.floatToIntBits(float)). Similarly, doubles are
converted to their IEEE-754 64-bit representation (using Java's
Double.doubleToLongBits(double)). All of these primitive types
are cast to the long base type before being passed to the hash function.
For string and binary types, the Murmur3 64-bit hash algorithm is used.
The 64-bit variant of Murmur3 considers only the most significant
8 bytes of the Murmur3 128-bit algorithm. The 64-bit hashcode generated
by the above algorithms is used as a base to derive 'k' different
hash functions. We use the idea from the paper "Less Hashing,
Same Performance: Building a Better Bloom Filter" by Kirsch et al. to
quickly compute the k hashcodes.

The algorithm for computing the k hashcodes and setting the bit
positions in the bloom filter is as follows:
1) Get the 64-bit base hashcode from Murmur3 or Thomas Wang's hash algorithm.
2) Split the base hashcode into two 32-bit hashcodes (say hash1 and hash2).
3) The k'th hashcode is obtained by (where k > 0):
combinedHash = hash1 + (k * hash2)
4) If combinedHash is negative, flip all the bits:
combinedHash = ~combinedHash
5) The bit set position is obtained by taking the modulo with m:
position = combinedHash % m
6) Set the position in the bit set. The upper bits of the position select
the long index within the bitset, and the 6 LSBs select the bit within
that long (Java's long shift uses only the low 6 bits of the shift count):
bitset[(int) (position >>> 6)] |= (1L << position);
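The steps above can be sketched in Python; this is an illustrative model of the algorithm using Java's 32-bit overflow semantics, not Hive's Java implementation, and the Murmur3/Wang base hash is taken as an input (the function names are hypothetical):

```python
def to_int32(x):
    """Wrap x to Java's 32-bit signed int range (two's complement)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

def positions(hash64, k, m):
    """Yield the k bit positions derived from a 64-bit base hash (steps 2-5)."""
    hash1 = to_int32(hash64 & 0xFFFFFFFF)          # low 32 bits
    hash2 = to_int32((hash64 >> 32) & 0xFFFFFFFF)  # high 32 bits
    for i in range(1, k + 1):
        combined = to_int32(hash1 + i * hash2)     # step 3, int overflow
        if combined < 0:
            combined = ~combined                   # step 4: flip all bits
        yield combined % m                         # step 5

def add_hash(hash64, k, bitset):
    """Set the k derived positions in the bitset (step 6)."""
    m = len(bitset) * 64
    for pos in positions(hash64, k, m):
        # upper bits pick the long, low 6 bits pick the bit within it
        bitset[pos >> 6] |= 1 << (pos & 0x3F)

def might_contain(hash64, k, bitset):
    """True unless some derived position is unset (a definite miss)."""
    m = len(bitset) * 64
    return all(bitset[pos >> 6] & (1 << (pos & 0x3F))
               for pos in positions(hash64, k, m))
```

A reader evaluating a predicate would call `might_contain` per row group: a False result proves the value is absent and the row group can be skipped, while True means it must still be scanned.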

Bloom filter streams are interlaced with row group indexes. This placement
makes it convenient to read the bloom filter stream and row index stream
together in a single read operation.
