...

The new index does not need to be bootstrapped for an existing (historical) dataset. It is also useful when writes come from different Flink jobs.

The Compatibility

The operator coordinator was introduced in the Flink 1.11 release. To stay compatible with Flink versions lower than 1.11, we need to add a pipeline that does not use the operator coordinator:

input operator => the instance generator => fileID assigner => bucket writer => commit sink.

That is, the coordinator is replaced with the instance generator and the commit sink, just like in the original pipeline.
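The stage order of this coordinator-free pipeline can be sketched in plain Java, without any Flink dependency. This is only an illustration under stated assumptions: the class `CoordinatorFreePipelineSketch`, the `Tagged` record type, the `commitId` string, and the hash-based file ID routing are all hypothetical and are not taken from the Hudi or Flink code base. In particular, the sketch assumes that the instance generator's role is to stamp the incoming records with a commit identifier, and that the commit sink finalizes the commit once the write results arrive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CoordinatorFreePipelineSketch {

    /** Hypothetical record: a key tagged with a commit id and, once assigned, a file ID. */
    static final class Tagged {
        final String key;
        final String commitId;
        final String fileId; // null until the fileID assigner has run

        Tagged(String key, String commitId, String fileId) {
            this.key = key;
            this.commitId = commitId;
            this.fileId = fileId;
        }
    }

    /** Instance generator (assumed role): stamp every input record with one commit id. */
    static List<Tagged> instanceGenerator(List<String> keys, String commitId) {
        List<Tagged> out = new ArrayList<>();
        for (String key : keys) {
            out.add(new Tagged(key, commitId, null));
        }
        return out;
    }

    /** FileID assigner: route each record to a file ID (here, by hashing the key). */
    static List<Tagged> fileIdAssigner(List<Tagged> in, int numBuckets) {
        List<Tagged> out = new ArrayList<>();
        for (Tagged t : in) {
            int bucket = Math.floorMod(t.key.hashCode(), numBuckets);
            out.add(new Tagged(t.key, t.commitId, "file-" + bucket));
        }
        return out;
    }

    /** Bucket writer: "write" records per file ID and report the record count per file ID. */
    static Map<String, Integer> bucketWriter(List<Tagged> in) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Tagged t : in) {
            counts.merge(t.fileId, 1, Integer::sum);
        }
        return counts;
    }

    /** Commit sink: gather the write results and finalize the commit (no coordinator involved). */
    static String commitSink(String commitId, Map<String, Integer> writeResults) {
        int total = 0;
        for (int n : writeResults.values()) {
            total += n;
        }
        return "committed " + commitId + " with " + total
                + " records in " + writeResults.size() + " files";
    }

    public static void main(String[] args) {
        // input operator => instance generator => fileID assigner => bucket writer => commit sink
        List<String> input = Arrays.asList("a", "b", "c", "d");
        List<Tagged> stamped = instanceGenerator(input, "c1");
        List<Tagged> routed = fileIdAssigner(stamped, 2);
        Map<String, Integer> written = bucketWriter(routed);
        System.out.println(commitSink("c1", written));
    }
}
```

In a real job each stage would be a Flink operator rather than a static method; the point of the sketch is only that commit start and commit finalization live in ordinary pipeline stages at both ends, so nothing depends on the operator coordinator API.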

Implementation Plan

  1. Implement #step1 on the current code base and add a test framework to the hoodie-flink module, including UTs and ITs
  2. Refactor the HoodieFlinkWriteClient to abstract out the indexing/bucketing work. In hoodie-spark these are all sub-pipelines, but in Flink we should abstract them as an Interface/Operator
  3. Implement the code for #step2
  4. Implement the code for #step3
  5. Add a new index

...