You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 7 Next »

Definition

Type that determines how a commit will be stored and handled relative to its application to the target dataset.

#todo verify

Following table summarizes the trade-offs between these two storage types

Trade-offCopy On Write (COW)Merge On Read (MOR)
Data LatencyHigherLower
Update cost (I/O)Higher (rewrite entire dataset parquet)Lower (append to `delta log`)
Parquet File SizeSmaller (high update(I/0) cost)Larger (low update cost)
Write AmplificationHigherLower (depending on compaction strategy to the dataset parquet)

Related concepts

  1. commit
  2. Commit List
  3. Merge On Read (MOR)
  4. Copy On Write (COW)

Status (draft)



  • No labels