...
- (Vinoth) Identify & land all critical outstanding PRs (those that solve critical issues or take us forward on our 1.0 path)
- (Vinoth) to identify. https://github.com/apache/hudi/pulls?q=is%3Apr+is%3Aopen+label%3Arelease-1.0.0
- (Vinoth) Land all relevant PRs
- (Sagar) Move master to 1.0.0
- (Sagar & Vinoth & Danny) Land storage format 1.0
- (Vinoth) Put up a 1.0 tech specs doc (HUDI-6706)
- (Vinoth) Scope this epic tight. https://issues.apache.org/jira/browse/HUDI-6242
- (Sagar) Make all the agreed upon format changes described here. (HUDI-6776)
- (Ethan) Standardization of serialization - log blocks, timeline meta files. (HUDI-6824, HUDI-6825, HUDI-6826, HUDI-6850)
- (Sagar) Base file format can be different within file groups (HUDI-6821)
- (Sagar) No Java classes show up in table properties. (HUDI-6780)
- (Danny) Introduce transition time into the active timeline (HUDI-1623, HUDI-6775)
- (Danny) Remove appends and fix logfile names (HUDI-6641)
- (Vinoth)
- Design:
- (Sagar) Multi-table transactions? (HUDI-6709)
- (Lin) Keys: UUIDs vs. what we do today. (HUDI-6701)
- (Vinoth) OCC/Time-Travel Read (+Write) (HUDI-4677)
- (Vinoth/Danny) Time-Travel read on NB CC & finalize NB CC design
- (Danny) TrueTime API implementation for Hudi (wait based, or filesystem/stateless based)
- (Vinoth/Shawn) Cloud native storage layout design (Udit's RFC-60)
- (Sagar) Logical partitioning/Index Functions API (Java, Native) and its integration into Spark/Presto/Trino. (HUDI-512)
- (Sagar) Schema Evolution and version tracking in MT. (HUDI-6778)
- (Sagar)
- Implementation
- (Lin) Finalize RFC-46/RecordMerger API, cross-platform support, only invoked for hoodie.merge.mode=custom? (complete HUDI-3217; HUDI-6702, HUDI-6765, HUDI-6784, HUDI-5249, HUDI-5807, HUDI-6767)
- (Ethan) Implement MoR snapshot query (positional/key based updates, deletes), partial updates, custom merges on new File Format code path.
- (Lin) Implement a uniform way to read incremental data files based on new timeline (https://issues.apache.org/jira/browse/HUDI-2750)
- (Ethan) Implement writers for positional updates, deletes, partial updates, ordering field-based merging. (HUDI-6653)
- (Ethan) Implement engine agnostic FileGroup Read APIs across Spark/Hive (HUDI-6785)
- (Vinoth) Implement DataFrame based write path; take HoodieData abstraction to completion and end-to-end row writing for Spark? All write operations work with rows end-to-end (HUDI-4857)
- (Sagar) Async indexer is in final shape (complete HUDI-2488)
- (Sagar) Secondary indexes (Bloom, RLI, VectorIndex, ..) on Spark read/write path. (HUDI-3907, HUDI-4128)
- (Sagar) Existing Optimistic Concurrency Control is in final shape (complete HUDI-1456)
- (Lin) Land Parquet keyed lookup code (???)
- (Danny) Land LSM Timeline in well-tested, performant shape (HUDI-309)
- (Danny) Flink/Non-blocking CC (HUDI-6640, HUDI-6495)
- (Danny) Change Timeline/FileSystemView to support snapshot, incremental, CDC, time-travel queries correctly.
- (???) Introduce TrueTime API or equivalent, to explain the foundations more clearly. (reuse HUDI-3057)
- <what other code refactoring is there to burn down?> (HUDI-2261, HUDI-6243, HUDI-3614, HUDI-4444, HUDI-4756)
- (Lin)
- (Sagar) Open/Risk Items:
- (Ethan) Are we happy with how log compaction is implemented? (https://issues.apache.org/jira/browse/HUDI-3580)
- (Vinoth) Should we retain virtual keys support? https://issues.apache.org/jira/browse/HUDI-2235
- (Ethan)
...