...
- Pre-work
  - (Vinoth) Land all relevant PRs
  - (Vinoth) APIs (https://issues.apache.org/jira/browse/HUDI-4141)
    - FileGroup APIs in Java
    - Rust/C++ APIs for Timeline, Metadata, FileGroup Read/Write (https://issues.apache.org/jira/browse/HUDI-6486)
    - Internal APIs/Abstractions/Code Refactoring (https://issues.apache.org/jira/browse/HUDI-6243)
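The "FileGroup APIs in Java" item refers to first-class APIs over Hudi's file-group layout, where each file group holds a chain of file slices and each slice pairs one base file with the log files written on top of it. As a rough illustration only — the class and method names below are hypothetical sketches, not the actual HUDI-4141 API — a minimal model might look like:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the file-group layout: a FileGroup accumulates
// file slices; each slice pairs one base file with its log files.
class FileSlice {
    final String baseInstantTime;  // instant that created the base file
    final String baseFilePath;
    final List<String> logFilePaths = new ArrayList<>();

    FileSlice(String baseInstantTime, String baseFilePath) {
        this.baseInstantTime = baseInstantTime;
        this.baseFilePath = baseFilePath;
    }
}

class FileGroup {
    final String partitionPath;
    final String fileId;
    private final List<FileSlice> slices = new ArrayList<>();

    FileGroup(String partitionPath, String fileId) {
        this.partitionPath = partitionPath;
        this.fileId = fileId;
    }

    void addSlice(FileSlice slice) {
        slices.add(slice);
    }

    // Readers merge the latest slice: its base file plus its log files.
    FileSlice latestSlice() {
        return slices.get(slices.size() - 1);
    }
}
```

A snapshot reader built on such an API would resolve the latest slice per file group and merge the base file with its log files; an incremental reader would filter slices by instant time.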
- Design
  - (Vinoth) General-purpose, global timeline (no active vs. archived distinction) (HUDI-309, HUDI-6698)
  - (Vinoth) Non-blocking concurrency control: clustering + updates, inserts + inserts, for Spark + Flink.
  - (Vinoth) Spark SQL statements to complete the DB vision. (Vinoth has a list. ???)
  - (Vinoth) Lance file format + storing blobs/images. (Needs an epic)
  - (Vinoth) Backwards compatibility testing. Can a 1.0 reader read the 0.x format?
  - (Vinoth)
    - Multi-table transactions
    - MT/RLI on Parquet base files
  - Follow-ups on LSM Timeline (HUDI-6698)
  - Minimize configs and clean up defaults (https://issues.apache.org/jira/browse/HUDI-1239)
  - Meta Sync to Glue/HMS with reduced storage/API overhead (HUDI-2519, HUDI-5108, HUDI-6488); seamless incremental query, CDC query, RO/RT experience
  - Broader performance improvements (HUDI-3249)
  - SQL experience for timeline, metadata (HUDI-6498)
  - [???] Parquet rewriting at page level for Spark Rows (writer perf) (HUDI-4790)
  - Introduce HudiStorage APIs to abstract out Hadoop FileSystem (HUDI-6497)
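The HudiStorage item above is about programming against a narrow storage interface so the Hadoop FileSystem dependency sits behind a single adapter. The sketch below is a hypothetical illustration of that shape — the interface name matches the roadmap item, but the methods and the in-memory implementation (standing in for a Hadoop-backed adapter) are assumptions, not the HUDI-6497 design:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: engine/table code depends only on this interface,
// while a Hadoop-backed adapter (or any other backend) implements it.
interface HudiStorage {
    byte[] readAll(String path);
    void write(String path, byte[] data);
    boolean exists(String path);
    List<String> listByPrefix(String prefix);
}

// In-memory implementation, standing in for a Hadoop FileSystem adapter.
class InMemoryStorage implements HudiStorage {
    private final Map<String, byte[]> files = new HashMap<>();

    public byte[] readAll(String path) {
        return files.get(path);
    }

    public void write(String path, byte[] data) {
        files.put(path, data);
    }

    public boolean exists(String path) {
        return files.containsKey(path);
    }

    public List<String> listByPrefix(String prefix) {
        return files.keySet().stream()
                .filter(p -> p.startsWith(prefix))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

The payoff of this shape is that timeline and metadata code can be tested against the in-memory backend, and non-Hadoop backends (object stores, local FS) plug in without touching callers.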
- Implementation
- Open/Risk Items:
  - (Ethan/Danny) _hoodie_operation metafield; Spark/Flink interop.
  - (Vinoth) Are we happy with the DT <> MT sync mechanism? Does it need to be revisited? (HUDI-2461 + other issues with Flink OCC)
  - (Sagar) Are we happy with how log compaction is implemented? (https://issues.apache.org/jira/browse/HUDI-3580)
  - (Vinoth) Should we retain virtual keys support? (https://issues.apache.org/jira/browse/HUDI-2235)
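The virtual-keys question above is a trade-off between storage and read cost: with materialized keys the record key lives in a meta column, while with virtual keys it is recomputed from data fields on every read via a key generator. As a rough illustration — the interface and class names here are hypothetical, not Hudi's actual key-generator API — the recompute path looks like:

```java
import java.util.Map;

// Hypothetical sketch: with virtual keys, the record key is derived from
// data fields at read time instead of being stored in a meta column.
interface KeyGenerator {
    String recordKey(Map<String, String> row);
}

// Builds a composite key such as "id=42:ts=100" from the configured fields.
class CompositeKeyGenerator implements KeyGenerator {
    private final String[] fields;

    CompositeKeyGenerator(String... fields) {
        this.fields = fields;
    }

    public String recordKey(Map<String, String> row) {
        StringBuilder sb = new StringBuilder();
        for (String f : fields) {
            if (sb.length() > 0) {
                sb.append(':');
            }
            sb.append(f).append('=').append(row.get(f));
        }
        return sb.toString();
    }
}
```

Dropping virtual keys would save this per-row recompute on reads at the cost of storing the key column; keeping them saves storage but ties every reader to the key-generator configuration.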
...