...
To solve bottleneck 2, we propose splitting the StreamWriteOperator into two operators: the BucketAssigner and the BucketWriter.
The BucketAssigner
...
- We start a new instant when the Coordinator starts (instead of starting it on a new checkpoint, as in #step1 and #step2)
- The BucketWriter blocks and flushes the pending buffered data when a new checkpoint starts. It has an asynchronous thread that consumes the buffer and writes batches based on the buffered data size (in the first version, we flush the data out directly in #processElement)
- In the Coordinator, if the data within one checkpoint is written successfully (that is, a checkpoint success notification is received), check and commit the inflight instant, then start a new instant
This means that if a checkpoint succeeds but we do not receive the success event, the inflight instant will span two (or more) checkpoints.
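The instant lifecycle described above can be illustrated with a minimal sketch. It is not the actual Coordinator implementation: the class name InstantCoordinatorSketch, its method names, and the "instant-N" naming are all hypothetical and chosen only for illustration. It shows a new instant being opened at startup, committed only when a checkpoint success notification arrives, and therefore covering several checkpoints when a notification is missed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative, not the real coordinator API.
class InstantCoordinatorSketch {
    private long instantCounter = 0;
    // The instant that BucketWriters are currently writing into.
    private String inflightInstant;
    // Checkpoints started since the inflight instant was opened.
    private final List<Long> checkpointsInInstant = new ArrayList<>();
    private final List<String> committedInstants = new ArrayList<>();

    // A new instant is opened when the coordinator starts,
    // not when a checkpoint starts.
    void start() {
        inflightInstant = newInstant();
    }

    // Record that a checkpoint began while the current instant is inflight.
    void onCheckpointStarted(long checkpointId) {
        checkpointsInInstant.add(checkpointId);
    }

    // On a success notification: commit the inflight instant and open a new one.
    // Returns the checkpoints the committed instant covered. If an earlier
    // notification was never received, this list has more than one entry.
    List<Long> onCheckpointSuccess(long checkpointId) {
        List<Long> covered = new ArrayList<>(checkpointsInInstant);
        committedInstants.add(inflightInstant);
        checkpointsInInstant.clear();
        inflightInstant = newInstant();
        return covered;
    }

    String getInflightInstant() {
        return inflightInstant;
    }

    List<String> getCommittedInstants() {
        return committedInstants;
    }

    private String newInstant() {
        return "instant-" + (instantCounter++);
    }
}
```

In this sketch, if checkpoints 1 and 2 both start but only the notification for checkpoint 2 arrives, the single committed instant covers both checkpoints, which matches the spanning behaviour noted above.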
...