Definition
Controls how datasets are exposed to queries.
Hudi supports the following views of stored data:
...
- read optimized view : Queries on this view see the latest snapshot of the dataset as of a given commit or compaction action. This view exposes only the base/columnar files in the latest file slices to queries and guarantees the same columnar query performance as a non-Hudi columnar dataset.
...
- incremental view : Queries on this view see only the new data written to the dataset since a given commit/compaction. This view effectively provides change streams to enable incremental data pipelines.
...
- realtime view : Queries on this view see the latest snapshot of the dataset as of a given delta commit action. This view provides near-real-time datasets (a latency of a few minutes) by merging the base and delta files of the latest file slice on the fly.
The following table summarizes the trade-offs between the read optimized and realtime views.
Trade-off | ReadOptimized | RealTime |
---|---|---|
Data Latency | Higher | Lower |
Query Latency | Lower (raw columnar performance) | Higher (merge columnar + row based delta) |
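
The difference between the three views can be illustrated with a small toy model. The sketch below is not Hudi's API: the `Record` and `FileSlice` classes, the integer `commit_time`, and the three view functions are hypothetical names introduced only for illustration, and a file slice is assumed to be one list of base records plus one list of delta records.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    key: str
    value: str
    commit_time: int  # hypothetical, monotonically increasing commit id


@dataclass
class FileSlice:
    # Records already compacted into the base (columnar) file.
    base: list = field(default_factory=list)
    # Records appended to row-based delta files since the last compaction.
    delta: list = field(default_factory=list)


def read_optimized_view(file_slice):
    """Expose only the base file: no merge cost, but delta writes stay invisible."""
    return {r.key: r.value for r in file_slice.base}


def realtime_view(file_slice):
    """Merge base and delta records at query time; the newest commit wins per key."""
    merged = {}
    all_records = file_slice.base + file_slice.delta
    for r in sorted(all_records, key=lambda rec: rec.commit_time):
        merged[r.key] = r.value
    return merged


def incremental_view(file_slice, since_commit):
    """Return only the records written after the given commit (a change stream)."""
    all_records = file_slice.base + file_slice.delta
    return [r for r in all_records if r.commit_time > since_commit]


# Toy data: two compacted records, then one update and one insert in the delta.
example = FileSlice(
    base=[Record("a", "v1", 1), Record("b", "v1", 1)],
    delta=[Record("a", "v2", 2), Record("c", "v1", 3)],
)

print(read_optimized_view(example))  # {'a': 'v1', 'b': 'v1'}
print(realtime_view(example))        # {'a': 'v2', 'b': 'v1', 'c': 'v1'}
print([r.key for r in incremental_view(example, since_commit=1)])  # ['a', 'c']
```

In this model the read optimized view skips the merge entirely, which is why its query latency is lower and its data latency higher: the update to `a` and the new key `c` only become visible there after compaction moves them into the base file.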
Related concepts
...