...

  • Row ID - A monotonically increasing integer associated with every row in an SSTable. The index structure stores it in place of the key token or key offset because it compresses better.
  • Postings/posting-list - Sorted row IDs that match a given indexed value.
  • Token file - An index of Row ID -> partition key token for every row in the SSTable.
  • Offset file - An index of Row ID -> partition key offset in the data/primary-index file for every row in the SSTable.
  • Segment - The smallest unit of on-disk indexing structure, flushed during compaction to reduce memory pressure. Multiple segments of an index are written to the same physical file.
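
The relationship between these terms can be sketched in a few lines of Python. This is an in-memory illustration only, not the SAI on-disk layout: the sample rows, the `token_file` and `offset_file` lists, and the `lookup` helper are all hypothetical names chosen for this example.

```python
from collections import defaultdict

# Hypothetical rows in SSTable order:
# (partition key token, offset in the data file, indexed column value)
rows = [
    (-9_000_000_000_000_000_000, 0, "red"),
    (-42, 120, "blue"),
    (7, 260, "red"),
    (8_500_000_000_000_000_000, 410, "green"),
]

# The row ID is simply the position of the row in the SSTable: 0, 1, 2, ...
# Token file and offset file: Row ID -> token, Row ID -> offset.
token_file = [token for token, _, _ in rows]
offset_file = [offset for _, offset, _ in rows]

# Postings: indexed value -> sorted list of matching row IDs.
# Row IDs are appended in increasing order, so each list stays sorted.
postings = defaultdict(list)
for row_id, (_, _, value) in enumerate(rows):
    postings[value].append(row_id)


def lookup(value):
    """Resolve an indexed value to (token, offset) pairs via its row IDs."""
    return [(token_file[r], offset_file[r]) for r in postings.get(value, [])]


print(postings["red"])  # [0, 2]
print(lookup("red"))    # [(-9000000000000000000, 0), (7, 260)]
```

The posting list holds only small integers, while the wider token and offset values are kept once per row in the shared files.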

Index Format Version 1

SAI is optimised for storage. Tokens and offsets are stored once per SSTable, and column indexes look them up in the token and offset files by row ID. Offsets are compressed using Frame of Reference (FoR) encoding. Tokens are stored uncompressed because they use the full 8 bytes, which leaves FoR encoding nothing to save.
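
To illustrate why FoR encoding helps offsets but not tokens, here is a minimal sketch of the general technique. It assumes a single block, uses the block minimum as the reference, and does no actual bit packing; the function names and sample values are hypothetical and do not reflect the SAI implementation.

```python
def for_encode(values):
    """Frame of Reference for one block: store the minimum once,
    then each value as a small non-negative delta from it."""
    reference = min(values)
    deltas = [v - reference for v in values]
    bits_per_value = max(deltas).bit_length()  # width needed for the largest delta
    return reference, bits_per_value, deltas


def for_decode(reference, deltas):
    """Rebuild the original values from the reference and deltas."""
    return [reference + d for d in deltas]


# Hypothetical data-file offsets: close together, so the deltas are small.
offsets = [0, 120, 260, 410]
ref, bits, deltas = for_encode(offsets)
print(bits)  # 9 bits per value instead of 64
assert for_decode(ref, deltas) == offsets

# Hypothetical tokens spread over the signed 64-bit range: deltas stay wide.
tokens = [-9_000_000_000_000_000_000, -42, 7, 8_500_000_000_000_000_000]
ref, bits, deltas = for_encode(tokens)
print(bits)  # 64 bits per value, so nothing is saved
assert for_decode(ref, deltas) == tokens
```

In this sketch the offsets need 9 bits each after encoding, whereas the tokens still need all 64.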

...