Definition

Action type for a def~timeline instant.

Excerpt
Key instant action types include:
  • COMMITS - `action type` which denotes an atomic write of a batch of records from an external source into a def~table (see def~commit).
  • CLEANS - `action type` which denotes a background activity that removes older versions of files in the def~table that are no longer needed.
  • DELTA_COMMIT - `action type` which denotes an atomic write of a batch of records into a Merge On Read (MOR, def~merge-on-read) def~table-type of def~table, where some or all of the data may be written only to delta logs (see def~commit).
  • COMPACTION - `action type` which denotes a background activity that reconciles differential data structures within Hudi, e.g. merging updates from row-based delta log files onto columnar def~base-files. Internally, compaction manifests as a special def~commit on the timeline (see def~timeline).
  • ROLLBACK - `action type` which denotes that a def~timeline instant of `instant action type` commit/delta commit was unsuccessful and was rolled back, removing any partial files produced during that write.
  • SAVEPOINT - `action type` which marks certain file groups as “saved”, so that the cleaner will not delete them. It helps restore the def~table to a point on the timeline in disaster or data recovery scenarios.
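
The action types above can be modelled as a simple enumeration. The following is only an illustrative Python sketch, not Hudi's actual API: the names `ActionType`, `WRITE_ACTIONS`, `BACKGROUND_ACTIONS`, `is_write` and `can_be_rolled_back` are hypothetical, and the groupings are taken directly from the descriptions in the list.

```python
from enum import Enum


class ActionType(Enum):
    """Hypothetical enum mirroring the action types listed above (not Hudi's API)."""

    COMMITS = "atomic write of a batch of records into a table"
    CLEANS = "background removal of older file versions that are no longer needed"
    DELTA_COMMIT = "atomic write into a Merge On Read table, possibly only to delta logs"
    COMPACTION = "background merge of delta log updates onto columnar base files"
    ROLLBACK = "undo of an unsuccessful commit/delta commit, removing partial files"
    SAVEPOINT = "marks file groups as saved so the cleaner will not delete them"


# Groupings derived from the list above (assumption: nothing beyond what it states).
WRITE_ACTIONS = frozenset({ActionType.COMMITS, ActionType.DELTA_COMMIT})
BACKGROUND_ACTIONS = frozenset({ActionType.CLEANS, ActionType.COMPACTION})


def is_write(action: ActionType) -> bool:
    """True for actions described as atomic writes of a batch of records."""
    return action in WRITE_ACTIONS


def can_be_rolled_back(action: ActionType) -> bool:
    """Per the ROLLBACK entry, rollbacks apply to commit/delta commit instants."""
    return action in WRITE_ACTIONS


if __name__ == "__main__":
    for action in ActionType:
        kind = "write" if is_write(action) else "non-write"
        print(f"{action.name:<13} {kind:<10} {action.value}")
```

Keeping the write actions in one set makes the relationship stated in the ROLLBACK entry explicit: only instants produced by a commit or delta commit are candidates for rollback in this sketch.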

Related concepts

  1. def~timeline
  2. instant state
  3. def~instant-time
  4. def~table
  5. def~commit