Hudi provides efficient upserts, by mapping a def~record-key + def~partition-path combination consistently to a def~file-id, via an indexing mechanism. This mapping between record key and file group/file id, never changes once the first version of a record has been written to a file group. In short, the mapped file group contains all versions of a group of records.
Excerpt | |||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Hudi provides efficient upserts, by mapping a def~record-key + def~partition-path combination consistently to a def~file-id, via an indexing mechanism. This mapping between record key and file group/file id, never changes once the first version of a record has been written to a file group. In short, the mapped file group contains all versions of a group of records. Hudi currently provides two choices for indexes : `BloomIndex` and `HBaseIndex` def~bloom-index and def~hbase-index, (with a few in the works :
Hudi Indices can be classified based on their ability to lookup records across partition.
HBase Index (global)Here, we just use HBase in a straightforward way to store the mapping above. The challenge with using HBase (or any external key-value store for that matter) is performing rollback of a commit and handling partial index updates. Bloom Index (non-global)This index is built by adding bloom filters with a very high false positive tolerance (e.g: 1/10^9), to the parquet file footers. The advantage of this index over HBase is the obvious removal of a big external dependency, and also nicer handling of rollbacks & partial updates since the index is part of the data file itself.At runtime, checking the Bloom Index for a given set of record keys effectively amounts to checking all the bloom filters within a given partition, against the incoming records, using a Spark join. Much of the engineering effort towards the Bloom index has gone into scaling this join by caching the incoming RDD[HoodieRecord] and dynamically tuning join parallelism, to avoid hitting Spark limitations like 2GB maximum for partition size. As a result, Bloom Index implementation has been able to handle single upserts upto 5TB, in a reliable manner. DAG with Range Pruning: draw.io Diagram | | ||||||||||||||||||||
border | true | ||||||||||||||||||||
viewerToolbar | true | ||||||||||||||||||||
fitWindow | false | ||||||||||||||||||||
diagramName | hoodie-bloom-index-dag | ||||||||||||||||||||
simpleViewer | false | ||||||||||||||||||||
width | 400 | ||||||||||||||||||||
diagramWidth | 1003 | revision | 1