Definition

The type of action recorded by a timeline instant. An instant carries one of the following action types:

  • COMMITS - A commit denotes an atomic write of a batch of records into a dataset.
  • CLEANS - Background activity that removes older versions of files in the dataset that are no longer needed.
  • DELTA_COMMIT - A delta commit denotes an atomic write of a batch of records into a dataset of the Merge On Read (MOR) storage type, where some or all of the data may be written only to delta logs.
  • COMPACTION - Background activity that reconciles differential data structures within Hudi, e.g. moving updates from row-based delta log files to columnar file formats. Internally, compaction manifests as a special commit on the timeline.
  • ROLLBACK - Indicates that a commit or delta commit was unsuccessful and was rolled back, removing any partial files produced during that write.
  • SAVEPOINT - Marks certain file groups as “saved”, so that the cleaner will not delete them. This helps restore the dataset to a point on the timeline in disaster or data recovery scenarios.
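
The action types above can be modeled as a small enumeration. The following is a minimal sketch in Python, not Hudi's actual API: the `ActionType` string values, the `Instant` class, and the `writes_only` helper are hypothetical names introduced only for illustration. It groups the actions by how the list describes them (atomic writes versus background activity) and filters a sample timeline down to its write instants.

```python
from dataclasses import dataclass
from enum import Enum


class ActionType(Enum):
    """Action types from the list above; the string values are illustrative."""
    COMMITS = "commit"
    CLEANS = "clean"
    DELTA_COMMIT = "deltacommit"
    COMPACTION = "compaction"
    ROLLBACK = "rollback"
    SAVEPOINT = "savepoint"


# Actions the list describes as background activity.
BACKGROUND_ACTIONS = {ActionType.CLEANS, ActionType.COMPACTION}

# Actions the list describes as atomic writes of a batch of records.
WRITE_ACTIONS = {ActionType.COMMITS, ActionType.DELTA_COMMIT}


@dataclass(frozen=True)
class Instant:
    """Hypothetical stand-in for a timeline instant: a time plus an action."""
    timestamp: str
    action: ActionType


def writes_only(timeline):
    """Return the instants that are record writes, ordered by timestamp."""
    return sorted(
        (i for i in timeline if i.action in WRITE_ACTIONS),
        key=lambda i: i.timestamp,
    )


# Sample timeline with made-up timestamps, deliberately out of order.
timeline = [
    Instant("20190103", ActionType.CLEANS),
    Instant("20190102", ActionType.DELTA_COMMIT),
    Instant("20190101", ActionType.COMMITS),
    Instant("20190104", ActionType.ROLLBACK),
]
print([i.timestamp for i in writes_only(timeline)])
```

Keeping the write and background groupings as separate sets, rather than as flags on the enum, makes it easy to adjust the classification without touching the action type definitions.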
