Definition

Controls how datasets are exposed to queries.

Hudi supports the following views of stored data:

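  • read optimized view : Queries on this view see the latest snapshot of the dataset as of a given commit/compaction action. This view exposes only the columnar base files of the latest file slices, which gives raw columnar query performance.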
  • incremental view : Queries on this view see only the new data written to the dataset since a given commit/compaction. This view effectively provides change streams that enable incremental data pipelines.
  • realtime view : Queries on this view see the latest snapshot of the dataset as of a given delta commit action. This view provides near-real-time datasets (a few minutes behind) by merging the base and delta files of the latest file slice on the fly.
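
To make the difference between the views concrete, here is a minimal sketch in Python. It is a toy in-memory model, not Hudi's API: FileSlice, read_optimized_view, realtime_view and incremental_view are hypothetical names invented for illustration, and commit times are assumed to be fixed-width strings that sort chronologically.

```python
from dataclasses import dataclass, field


@dataclass
class FileSlice:
    """Toy stand-in for a file slice (hypothetical, not a Hudi class)."""
    base_commit: str   # commit/compaction time that produced the base file
    base: dict         # key -> value, contents of the columnar base file
    # Row-based delta records as (commit_time, key, value), in commit order.
    deltas: list = field(default_factory=list)


def read_optimized_view(fs: FileSlice) -> dict:
    # Reads only the base file, so deltas not yet compacted are invisible.
    return dict(fs.base)


def realtime_view(fs: FileSlice) -> dict:
    # Merges the base file with the delta records at query time;
    # a later delta overrides an earlier value for the same key.
    merged = dict(fs.base)
    for _, key, value in fs.deltas:
        merged[key] = value
    return merged


def incremental_view(fs: FileSlice, since: str) -> list:
    # Returns only the records written after the given commit time.
    return [(t, k, v) for t, k, v in fs.deltas if t > since]


fs = FileSlice(
    base_commit="001",
    base={"a": 1, "b": 2},
    deltas=[("002", "b", 20), ("003", "c", 3)],
)
print(read_optimized_view(fs))     # {'a': 1, 'b': 2}
print(realtime_view(fs))           # {'a': 1, 'b': 20, 'c': 3}
print(incremental_view(fs, "002")) # [('003', 'c', 3)]
```

In this toy model the read optimized result lags behind the realtime result until the deltas are compacted into a new base file, while the realtime query pays for the merge on every read.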

The following table summarizes the trade-offs between the different views.

Trade-off      | ReadOptimized                     | RealTime
Data Latency   | Higher                            | Lower
Query Latency  | Lower (raw columnar performance)  | Higher (merge columnar + row based delta)


Related concepts

  1. timeline instant
  2. dataset
  3. commit
  4. storage type

