...

The algorithm for computing k hash codes and setting the bit positions
in the bloom filter is as follows:
1) Get a 64-bit base hash code from Murmur3 or Thomas Wang's hash algorithm
2) Split the above hash code into two 32-bit hash codes (say hash1 and hash2)
3) The k'th hash code is obtained by (where k > 0)
combinedHash = hash1 + (k * hash2)
4) If combinedHash is negative, flip all the bits
combinedHash = ~combinedHash
5) The bit set position is obtained by taking combinedHash modulo m,
the number of bits in the bloom filter
position = combinedHash % m
6) Set the position in the bit set. The bits of position above the 6 least
significant bits (position >>> 6) identify the long index within the bitset,
and the 6 least significant bits identify the bit within that long, in
little endian order.
bitset[(int) (position >>> 6)] |= (1L << position);
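
The steps above can be sketched in Java as follows. This is a minimal
illustration, not the reference implementation. The class name
BloomFilterSketch and its method names are hypothetical, the base hash is
passed in by the caller instead of being computed with Murmur3 or Thomas
Wang's algorithm, and taking the low 32 bits as hash1 and the high 32 bits
as hash2 is an assumption, since the text does not say which half is which.

```java
// Hypothetical sketch of steps 2-6; names are illustrative only.
class BloomFilterSketch {
    private final long[] bitset; // backing storage, 64 bits per long
    private final int m;         // number of bits in the filter
    private final int k;         // number of hash functions

    BloomFilterSketch(int m, int k) {
        this.m = m;
        this.k = k;
        this.bitset = new long[(m + 63) >>> 6]; // round up to whole longs
    }

    // Steps 2-5: derive the i'th bit position from the 64-bit base hash.
    private int position(long hash64, int i) {
        int hash1 = (int) hash64;          // assumption: low 32 bits
        int hash2 = (int) (hash64 >>> 32); // assumption: high 32 bits
        int combinedHash = hash1 + (i * hash2); // int arithmetic may wrap
        if (combinedHash < 0) {
            combinedHash = ~combinedHash;  // flip all bits, now non-negative
        }
        return combinedHash % m;
    }

    // Step 6: set one bit per hash function.
    void add(long hash64) {
        for (int i = 1; i <= k; i++) {
            int position = position(hash64, i);
            bitset[position >>> 6] |= (1L << position);
        }
    }

    // Membership test: false means definitely absent, true means possibly present.
    boolean mightContain(long hash64) {
        for (int i = 1; i <= k; i++) {
            int position = position(hash64, i);
            if ((bitset[position >>> 6] & (1L << position)) == 0) {
                return false;
            }
        }
        return true;
    }
}
```

Note that (1L << position) does not need an explicit mask: when the left
operand is a long, Java uses only the 6 least significant bits of the shift
count, which is exactly the bit position within the selected long.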

Bloom filter streams are interlaced with row group indexes. This placement
makes it convenient to read the bloom filter stream and the row index stream
together in a single read operation.

...