Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Definition

Type that determines how a commit will be stored and handled relative to its application to the target dataset.

#todo verify

Following table summarizes the trade-offs between these two storage types

Trade-offCopy On Write (COW)Merge On Read (MOR)
Data LatencyHigherLower
Update cost (I/O)Higher (rewrite entire dataset parquet)Lower (append to `delta log`)
Parquet File SizeSmaller (high update(I/0) cost)Larger (low update cost)
Write AmplificationHigherLower (depending on compaction strategy to the dataset parquet)

Related concepts

  1. commit
  2. Commit List
  3. Merge On Read (MOR)
  4. Copy On Write (COW)

Status (draft)