Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Definition

Type that determines how commit will be stored and handled relative to its application to the target dataset.data will be laid out as file and stored, inside a def~table.


Image Added#todo verify


Following table summarizes the trade-offs between these two storage table types

Trade-offCopy On Write def~copy-on-write (COW)Merge On Read def~merge-on-read (MOR)
Data LatencyHigherLower
Update cost (I/O)Higher (rewrite entire dataset def~table parquet)Lower (append to `delta log`)Parquet File SizeSmaller (high update(I/0) cost)Larger (low update cost)
Write AmplificationHigherLower (depending on compaction strategy to the dataset def~table parquet)
Query/Read AmplificationLower/ZeroHigher (merging base and deltas on the fly)

Related concepts

  1. commitdef~commit
  2. Commit List
  3. Merge On Read def~merge-on-read (MOR)
  4. Copy On Write def~copy-on-write (COW)
  5. def~timeline

Status (draft)