...

Hudi allows us to store multiple versions of data in a dataset over time while providing snapshot isolation. The number of versions retained is configurable and depends on how much storage one is willing to spend on older versions. Hudi is therefore able to expose the state of a table (dataset) as of a certain point (instant) in time. This could be done by adding read mechanisms that go back and read the version of the data that was written at a particular instant (see instant time).
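To make the idea concrete, here is a minimal sketch of a point-in-time read over versioned records. It is a toy in-memory model, not Hudi's API: the class `VersionedTable`, its methods `commit` and `read_as_of`, and the `max_versions` setting are all hypothetical names chosen for illustration. It assumes instants are fixed-width timestamp strings, so that string order matches time order, and that commits arrive in increasing instant order.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class VersionedTable:
    """Toy model (hypothetical, not Hudi code): each key keeps its recent versions."""

    max_versions: int = 3  # hypothetical retention setting, assumed >= 1
    _versions: dict = field(default_factory=dict)

    def commit(self, instant: str, records: dict) -> None:
        """Write a new version of each record, tagged with the commit instant."""
        for key, value in records.items():
            history = self._versions.setdefault(key, [])
            history.append((instant, value))
            # Drop the oldest versions beyond the retention limit.
            del history[:-self.max_versions]

    def read_as_of(self, instant: str) -> dict:
        """Return, for each key, the latest version written at or before `instant`."""
        result = {}
        for key, history in self._versions.items():
            instants = [written_at for written_at, _ in history]
            pos = bisect_right(instants, instant)
            if pos > 0:  # the key existed at that instant and is still retained
                result[key] = history[pos - 1][1]
        return result


table = VersionedTable()
table.commit("20200101000000", {"a": 1, "b": 10})
table.commit("20200102000000", {"a": 2})
table.commit("20200103000000", {"b": 30, "c": 5})

# State of the table between the second and third commits.
snapshot = table.read_as_of("20200102120000")
```

In this sketch, a read as of an instant older than the retained versions silently misses data, which mirrors the space tradeoff described above: how far back one can read is bounded by how many versions are kept.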

Background

Hudi provides three query (view) types for accessing data written in two storage types: Copy On Write (COW) and Merge On Read (MOR).

...