

Issue Management

Actual issue tracking is in Apache JIRA!  We use this page to ground ourselves.

  1. Please file any issue as an "Issue" and not as a "Sub-task" (sub-tasks cannot be added to Epics).
  2. Please attach issues to an Epic as much as possible, so they do not scatter around (see 1.0 Epics).
  3. Keep issues unassigned unless you are about to begin working on them.
  4. Issues must be tagged with Fix Version/s: 1.0.0 to show up on the board.
  5. Vinoth Chandar will move issues from 1.0.0 to 1.1.0 if they do not seem important.
  6. Pending project management tasks:
    • (Vinoth) to create a "roadmap" in JIRA
    • (Vinoth) to go through each Epic in depth and clean up the tasks themselves.
    • (Vinoth) to scout for R.M.

Roadmap to visualize which epics are in which phase.

Execution Phase 1 (Aug 15-Sept 15)

  • (Vinoth) Identify & land all critical outstanding PRs (those that solve critical issues and take us forward on our 1.0 path)
    • (Vinoth) to identify.
    • (Sagar) Move master to 1.0.0
  • (Ethan & Vinoth & Danny) Land storage format 1.0 (Complete)
    • (Vinoth) Put up a 1.0 tech specs doc
    • Make all format changes described in HUDI-6242: https://issues.apache.org/jira/browse/HUDI-6242
    • Standardize serialization across log blocks and timeline meta files.
    • Change Timeline/FileSystemView to support snapshot, incremental, CDC, and time-travel queries correctly (a read-path sketch follows this list).
    • Changes to support multiple base file formats within each file group.
    • Ensure no Java class names show up in table properties (HUDI-5761).
    • (Danny) Introduce transition time into the active timeline
    • (Danny) Land LSM Timeline in well-tested, performant shape (HUDI-309, HUDI-6626, this needs an epic ASAP???)
  • Design:
    • (Sagar) Multi-table transactions? (VC: we have a strawman, but it needs an RFC to validate correctness across phantom reads, self-joins, nested queries, and isolation levels)
    • (Lin) Keys: UUIDs vs. what we do today.
    • (Danny???) Time-Travel Read (+Write) (resolve HUDI-4500, HUDI-4677 and similar, address branch/merge use-cases)
    • (Ethan???) Logical partitioning/Index Functions API (Java, Native) and its integration into Spark/Presto/Trino. (HUDI-512)
    • (Shawn) Cloud native storage layout design (Udit's RFC-60)
    • (Sagar + ???) Schema Evolution and version tracking in MT.
    • (Vinoth) Lance file format + storing blobs/images.
  • Implementation
    • (Sagar) RFC-46/RecordMerger API: is this our final choice? Cross-platform? Only for hoodie.merge.mode=custom? (complete HUDI-3217; a write-config sketch follows this list)
    • (Sagar) Async indexer is in final shape (complete HUDI-2488)
    • (Lin) Land Parquet keyed lookup code (???)
    • (Danny) Flink/Non-blocking CC (HUDI-5672, HUDI-6640, HUDI-6495)
    • [???] Parquet Rewriting at Page Level for Spark Rows (Writer perf) (HUDI-4790)
    • (Ethan) Implement MoR snapshot query (positional/key-based updates, deletes), partial updates, and custom merges on the new file format code path.
    • (Ethan) Implement writers for positional updates, deletes, partial updates, and ordering-field-based merging.
    • Existing Optimistic Concurrency Control is in final shape (complete HUDI-1456)
    • Implement a uniform way to fetch incremental data files based on new timeline (https://issues.apache.org/jira/browse/HUDI-2750)
    • <What other code refactorings should we burn down?> (HUDI-2261, HUDI-6243, HUDI-3614, HUDI-4444, HUDI-4756)
  • (Sagar) Open/Risk Items:
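
To keep the query-correctness goals above concrete, here is a minimal Java/Spark sketch of the snapshot, time-travel, and incremental read paths that the new Timeline/FileSystemView changes must continue to serve correctly. It only uses today's datasource options; the table path and instant times are placeholders for illustration.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class HudiReadPaths {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("hudi-read-paths").getOrCreate();
        String basePath = "/tmp/hudi_trips";  // placeholder table path

        // Snapshot query: latest committed state of the table.
        Dataset<Row> snapshot = spark.read().format("hudi").load(basePath);

        // Time-travel query: table state as of an earlier instant.
        Dataset<Row> asOf = spark.read().format("hudi")
            .option("as.of.instant", "20230801000000000")  // placeholder instant time
            .load(basePath);

        // Incremental query: records committed after a given instant.
        Dataset<Row> incremental = spark.read().format("hudi")
            .option("hoodie.datasource.query.type", "incremental")
            .option("hoodie.datasource.read.begin.instanttime", "20230801000000000")
            .load(basePath);

        snapshot.show();
        asOf.show();
        incremental.show();
      }
    }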
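
For the RFC-46/RecordMerger question above, a rough sketch (not a settled design) of how a custom merger is wired in through the current Spark write options. The merger class name is hypothetical, and the merger config key reflects the pre-1.0 (0.13+) API, which hoodie.merge.mode=custom may supersede.

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;

    public class CustomMergerWrite {
      public static void write(Dataset<Row> df, String basePath) {
        Map<String, String> opts = new HashMap<>();
        opts.put("hoodie.table.name", "trips");                        // placeholder table name
        opts.put("hoodie.datasource.write.recordkey.field", "uuid");   // placeholder key field
        opts.put("hoodie.datasource.write.precombine.field", "ts");    // placeholder ordering field
        // Hypothetical custom merger class; the config key below is the 0.13+ one
        // and may change as RFC-46 / hoodie.merge.mode is finalized.
        opts.put("hoodie.datasource.write.record.merger.impls", "com.example.MyPartialUpdateMerger");

        df.write().format("hudi").options(opts).mode(SaveMode.Append).save(basePath);
      }
    }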

Execution Phase 2 (Sept 15-Oct 30)

Packaging Phase (Nov 1 - Nov 15) (Marked 1.1.0 for now)

  • Release (if still pending!)
  • Docs
  • Examples
  • Bundles & Packages (HUDI-3529)
  • Site updates
  • Deprecate/Cleanup cWiki

Below the line (Marked 1.1.0 for now)
