...

As an example, there was a question on the user ML from Pinterest about a deployment with 250G of state that fits into memory (Jan 30, 2020).

...

For (1), potentially multiple unconfirmed versions have to be maintained (corresponding to multiple checkpoints, to avoid a full snapshot on failure; see unconfirmed checkpoints).

Removed keys

Currently, entries are just “unlinked” on removal. Therefore, the removal wouldn’t be included in an incremental snapshot, and the removed entry would still be present after recovery.

...

  1. An entry can be removed many times - keep (and write) it only once
  2. An entry can be re-added (and then re-removed)
  3. A snapshot with a removal can still be unconfirmed by JM by the time of the next checkpoint (see unconfirmed checkpoints)
  4. An entry key can be mutable, such as BinaryRowData (this doesn’t cause issues with non-incremental snapshots because the removal takes effect immediately, without any tracking)
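
The four points above can be illustrated with a minimal sketch of removal tracking. This is not Flink code: `RemovalTracker`, its method names, and the `keyCopier` parameter are hypothetical and exist only for illustration. The sketch assumes that confirming checkpoint N also covers every earlier checkpoint, and that each snapshot writes all removals that are not yet confirmed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical helper (not part of Flink): tracks removed keys for incremental snapshots. */
class RemovalTracker<K> {
    // Marker for removals that have not been written to any snapshot yet.
    private static final long PENDING = Long.MAX_VALUE;

    // Removed key -> id of the first checkpoint whose snapshot contained the removal.
    private final Map<K, Long> removed = new HashMap<>();

    // Copies a key so that later mutation of the original cannot corrupt the tracker (point 4).
    private final UnaryOperator<K> keyCopier;

    RemovalTracker(UnaryOperator<K> keyCopier) {
        this.keyCopier = keyCopier;
    }

    /** Records a removal; repeated removals of the same key are kept only once (point 1). */
    void onRemove(K key) {
        if (!removed.containsKey(key)) {
            removed.put(keyCopier.apply(key), PENDING);
        }
    }

    /** A re-added entry is live again, so its removal must no longer be written (point 2). */
    void onPut(K key) {
        removed.remove(key);
    }

    /** Returns the removals to write: new ones plus those from unconfirmed checkpoints (point 3). */
    List<K> snapshot(long checkpointId) {
        removed.replaceAll((k, v) -> v == PENDING ? checkpointId : v);
        return new ArrayList<>(removed.keySet());
    }

    /** Drops removals first written in a checkpoint with an id up to the confirmed one. */
    void confirm(long checkpointId) {
        removed.values().removeIf(v -> v <= checkpointId);
    }
}
```

In this sketch, a key removed before checkpoint 1 keeps being written in later snapshots until `confirm` is called with an id of 1 or higher, and a call to `onPut` for that key stops it from being written at all. Copying the key on removal costs an allocation per removed entry; whether that is acceptable for types such as BinaryRowData is left open here.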

...