...
In this RFC, we propose to implement Option 1. Support for Option 2 and Option 3 will be added in the future.
Dependency/Follow up
With multiple writers, commits are not guaranteed to complete in the order in which they started, even when the writers are conflict-free. Incremental reads depend on monotonically increasing timestamps to ensure that no insert or update is missed, and multiple writers would violate this guarantee. To support multiple writers, we will introduce a new notion to the hoodie timeline, END_COMMIT_TIME, alongside the START_COMMIT_TIME that is already used for performing commits. Incremental reads will depend on END_COMMIT_TIME to tail data and to publish it to downstream systems so that they can checkpoint it.
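The following is a minimal sketch of why checkpointing on the start time can skip a commit, and how checkpointing on the end time avoids it. It is not Hudi code: the `Commit` class, the `incremental_read` helper, and the integer timestamps are hypothetical, and it assumes END_COMMIT_TIME values increase in completion order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    start_time: int  # stands in for START_COMMIT_TIME (hypothetical integer clock)
    end_time: int    # stands in for END_COMMIT_TIME (hypothetical integer clock)


def incremental_read(completed, checkpoint, key):
    """Return completed commits whose `key` timestamp is strictly after `checkpoint`."""
    newer = [c for c in completed if getattr(c, key) > checkpoint]
    return sorted(newer, key=lambda c: getattr(c, key))


# Writer A starts first (t=1) but finishes last (t=5).
# Writer B starts later (t=2) but finishes first (t=3).
a = Commit(start_time=1, end_time=5)
b = Commit(start_time=2, end_time=3)

# Reader polls at t=4, when only B has completed, then again at t=6.

# Checkpointing on the start time: the checkpoint advances to 2,
# so A (start_time=1) is never returned by the second poll.
first_by_start = incremental_read([b], checkpoint=0, key="start_time")
checkpoint_by_start = max(c.start_time for c in first_by_start)
second_by_start = incremental_read([a, b], checkpoint_by_start, key="start_time")

# Checkpointing on the end time: the checkpoint advances to 3,
# and A (end_time=5) is picked up by the second poll.
first_by_end = incremental_read([b], checkpoint=0, key="end_time")
checkpoint_by_end = max(c.end_time for c in first_by_end)
second_by_end = incremental_read([a, b], checkpoint_by_end, key="end_time")
```

In this sketch `second_by_start` is empty, so commit A is silently missed, while `second_by_end` contains A.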
An Example of Concurrency control with Priority for clustering/compactions
- Clustering is scheduled (files f1,f2,f3 -> g1)
- Clustering moves to inflight
- Ingestion write is scheduled
- Ingestion write moves to inflight
- Ingestion has updates for f1
- Clustering finishes: after taking a lock and checking that no other commit has succeeded, it writes the REPLACE file with the mapping f1,f2,f3 -> g1, or puts this information in the consolidated metadata
- Ingestion tries to finish and acquires a lock
- Ingestion finds that clustering has finished in the meantime and detects a file-level conflict on the overlapping f1
- If a higher priority was given to the new writer over clustering, one possible implementation is as follows:
  - Write a new REPLACE metadata to reverse the mapping of f1,f2,f3 -> g1 that was done before.
  - NOTE that the entire mapping needs to be reversed, since records can go from M file groups to N file groups
  - Need to ensure that the previous version (before clustering) is not cleaned
  - During the merge of consolidated metadata or the REPLACE timeline, this "revert" scenario needs to be handled.
- Side effects
  - The clustering operation becomes redundant: its work is not used, and another one needs to be scheduled
  - Queries will ping-pong between layouts with different numbers of files
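The steps above can be sketched as a small model of a REPLACE timeline. This is an illustration under stated assumptions, not the proposed implementation: `ReplaceCommit`, `has_file_conflict`, `revert`, and `visible_file_groups` are hypothetical names, and the model assumes the pre-clustering file groups have not been cleaned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplaceCommit:
    replaced: frozenset  # file groups hidden by this commit
    added: frozenset     # file groups made visible by this commit


def has_file_conflict(clustering_inputs, ingestion_touched):
    """File-level conflict: both operations touched at least one common file group."""
    return bool(set(clustering_inputs) & set(ingestion_touched))


def revert(replace):
    """Reverse the entire mapping, since records can move from M to N file groups."""
    return ReplaceCommit(replaced=replace.added, added=replace.replaced)


def visible_file_groups(initial, timeline):
    """Merge the REPLACE timeline in order to get the file groups queries should see."""
    visible = set(initial)
    for commit in timeline:
        visible = (visible - commit.replaced) | commit.added
    return visible


# Clustering completes first with the mapping f1,f2,f3 -> g1.
clustering = ReplaceCommit(frozenset({"f1", "f2", "f3"}), frozenset({"g1"}))
timeline = [clustering]
after_clustering = visible_file_groups({"f1", "f2", "f3"}, timeline)

# Ingestion, which has updates for f1, acquires the lock and checks for conflicts.
conflict = has_file_conflict(clustering.replaced, {"f1"})

# Ingestion has the higher priority, so it writes a reversing REPLACE commit.
if conflict:
    timeline.append(revert(clustering))
after_revert = visible_file_groups({"f1", "f2", "f3"}, timeline)
```

Here `after_clustering` is `{"g1"}` and `after_revert` is `{"f1", "f2", "f3"}` again, which shows both side effects: the clustering work is discarded, and readers see the layout switch back and forth.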
Rollout/Adoption Plan
Once the proposed solution is implemented, users will be able to run jobs in parallel by simply launching multiple writers. Upgrading to the Hudi release that provides this feature will automatically upgrade the timeline layout.
Failures & Build rollback
TBD
Turning on Scheduling Operations
For backwards compatibility with current users, the scheduling of operations will be enabled by default for every job. Users need to turn scheduling off for their jobs and then turn it on only for the dedicated job.
Test Plan
Unit tests and Test suite integration