Issue Management


Everything is tracked in Apache JIRA, not ClickUp. We use this page to ground ourselves.

Board | Kickoff video

  1. Attach every issue you file to an Epic wherever possible, so that issues do not get scattered.
  2. An issue must be tagged with Fix Version/s: 1.0.0 to show up on the board.
  3. Vinoth will move issues from 1.0.0 to 1.1.0 if they do not seem important.
  4. Pending project management tasks:
    • Vinoth to create a "roadmap" in JIRA
    • Vinoth to go through each Epic in depth and clean up the tasks themselves.
    • Vinoth to scout for R.M.
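
To check what the board will pick up under rule 2, a JQL filter can be built and turned into a search link. This is only a minimal sketch: it assumes the JIRA project key is HUDI and that the standard JIRA REST search endpoint is reachable on issues.apache.org. The helper names `board_jql` and `search_url` are hypothetical and only build strings; nothing is sent over the network.

```python
from urllib.parse import urlencode

# Assumption: the standard JIRA REST v2 search endpoint on the Apache instance.
JIRA_SEARCH_URL = "https://issues.apache.org/jira/rest/api/2/search"


def board_jql(fix_version: str = "1.0.0", project: str = "HUDI") -> str:
    """Build a JQL filter for issues tagged with the given Fix Version.

    The project key "HUDI" is an assumption based on the issue IDs on this page.
    """
    return f'project = {project} AND fixVersion = "{fix_version}"'


def search_url(jql: str, max_results: int = 50) -> str:
    """Return a URL-encoded search link for the given JQL (no request is made)."""
    query = urlencode({"jql": jql, "maxResults": max_results})
    return f"{JIRA_SEARCH_URL}?{query}"


if __name__ == "__main__":
    # Issues that should show up on the 1.0.0 board.
    print(search_url(board_jql()))
    # Issues that were pushed out to 1.1.0 under rule 3.
    print(search_url(board_jql("1.1.0")))
```

The same JQL string can also be pasted directly into the JIRA issue search box.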

Execution Phase 1 (Aug 15-Sept 15)

  • (Vinoth) Identify & land all critical outstanding PRs (those that solve critical issues or take us forward on our 1.0 path)
    • Vinoth to identify.
    • [Sagar] Move master to 1.0.0
  • (Ethan & Vinoth & Danny) Land storage format 1.0 (Complete)
    • [Vinoth] Put up a 1.0 tech specs doc
    • Make all format changes described here: https://issues.apache.org/jira/browse/HUDI-6242
    • Standardization of serialization - log blocks, timeline meta files.
    • Change Timeline/FileSystemView to support snapshot, incremental, CDC, time-travel queries correctly.
    • Changes to support multiple base file formats within each file group.
    • Remove any Java classes from showing up in table properties. HUDI-5761
    • [Danny] Introduce transition time into the active timeline
    • [Danny] Land LSM Timeline in well-tested, performant shape (HUDI-309, HUDI-6626, this needs an epic ASAP???)
  • Design:
    • [Sagar] Multi-table transactions? (VC: we have a strawman, but it needs an RFC to validate correctness across phantom reads, self-joins, nested queries, and isolation levels)
    • [Lin] Keys: UUIDs vs. what we do today.
    • [Danny???] Time-Travel Read (+Write) (resolve HUDI-4500, HUDI-4677 and similar, address branch/merge use-cases)
    • [Ethan???] Logical partitioning/Index Functions API (Java, Native) and its integration into Spark/Presto/Trino. (HUDI-512)
    • [Shawn] Cloud native storage layout design (Udit's RFC-60)
    • [Sagar + ???] Schema Evolution and version tracking in MT.
    • [Vinoth] Lance file format + storing blobs/images.
  • Implementation
    • [Sagar] RFC-46/RecordMerger API: is this our final choice? Is it cross-platform? Is it only for hoodie.merge.mode=custom? (complete HUDI-3217)
    • [Sagar] Async indexer is in final shape (complete HUDI-2488)
    • [Lin] Land Parquet keyed lookup code (???)
    • [Danny] Flink/Non-blocking CC (HUDI-5672, HUDI-6640, HUDI-6495)
    • [???] Parquet Rewriting at Page Level for Spark Rows (Writer perf) (HUDI-4790)
    • [Ethan] Implement MoR snapshot query (positional/key based updates, deletes), partial updates, custom merges on new File Format code path.
    • [Ethan] Implement writers for positional updates, deletes, partial updates, ordering field based merging.
    • Existing Optimistic Concurrency Control is in final shape (complete HUDI-1456)
    • Implement a uniform way to fetch incremental data files based on new timeline (https://issues.apache.org/jira/browse/HUDI-2750)
    • <What other code refactoring is there to burn down?> (HUDI-2261, HUDI-6243, HUDI-3614, HUDI-4444, HUDI-4756)
  • (Sagar) Open/Risk Items:

Execution Phase 2 (Sept 15-Oct 30)

Packaging Phase (Nov 1-Nov 15) (Marked 1.1.0 for now)

  • Release (if still pending!)
  • Docs
  • Examples
  • Bundles & Packages (HUDI-3529)
  • Site updates
  • Deprecate/Cleanup cWiki

Below the line (Marked 1.1.0 for now)
