Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Definition

Type that determines how commit will be stored and handled relative to its application to the target dataset.data will be laid out as file and stored, inside a def~table.


Image Added#todo verify


Following table summarizes the trade-offs between these two

...

table types

Trade-off
Copy On Write
def~copy-on-write (COW)
Merge On Read
def~merge-on-read (MOR)
Data LatencyHigherLower
Update cost (I/O)Higher (rewrite entire
dataset
def~table parquet)Lower (append to `delta log`)
Parquet File Size
Smaller (high update(I/0) cost)Larger (low update cost)
Write AmplificationHigherLower (depending on compaction strategy to the
dataset
def~table parquet)
Query/Read AmplificationLower/ZeroHigher (merging base and deltas on the fly)

Related concepts

  1. commitdef~commit
  2. Commit List
  3. Merge On Read def~merge-on-read (MOR)
  4. Copy On Write def~copy-on-write (COW)
  5. def~timeline

Status (draft)