...
- HIP-1: CSV Source Support for Delta Streamer
- HIP-2: ORC Storage in Hudi
- HIP-3: Timeline Service with Incremental File System View Syncing
- HIP-4: Faster Hive incremental pull queries
Roadmap
<WIP>
...
This is a rough, non-exhaustive roadmap of what's to come in each area of Hudi, to give a general idea of
where we are headed.
Writing data & Indexing
- Support for indexing Parquet records to improve speed
- Indexing the log file, moving closer to scalable 1-min ingests
- Overhaul of
- Incrementalizing cleaning based on timeline metadata
Reading data
- Incremental Pull natively via Spark Datasource
- Real-time view support on Presto
- Hardening incremental pull via Real-time view
- Support for streaming-style batch programs via Beam/Structured Streaming integration
Storage
- ORC Support
- Support for collapsing and splitting file groups
- Custom strategies for data clustering
- Columnar stats collection to power better query planning
Usability
- Painless migration of historical data, with safe experimentation
- Hudi on Flink
- Hudi for ML/Feature stores
Metadata Management
- Standalone timeline server to serve DFS listings
...