...

To solve bottleneck 2, we propose splitting the StreamWriteOperator into two operators: the BucketAssigner and the BucketWriter.

The BucketAssigner

...

  • We start a new instant when the Coordinator starts (instead of starting one on each new checkpoint, as in #step1 and #step2).
  • The BucketWriter blocks and flushes the pending buffered data when a new checkpoint starts. It has an asynchronous thread that consumes the buffer (in the first version, we flush the data directly in #processElement) and writes batches based on the buffered data size.
  • In the Coordinator, if the data within one checkpoint is written successfully (that is, a checkpoint success notification is received), check and commit the inflight instant, then start a new instant.
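The buffering behavior described for the BucketWriter can be sketched roughly as follows. This is a toy, single-threaded model in Python, not Hudi or Flink code: the class `BufferingWriter`, its methods, and the `max_buffer_bytes` threshold are hypothetical names chosen for illustration. It mirrors the first-version behavior, where the flush happens inline while processing an element rather than on an asynchronous thread.

```python
class BufferingWriter:
    """Toy stand-in for the BucketWriter's buffering (hypothetical, not Hudi code)."""

    def __init__(self, max_buffer_bytes):
        # Hypothetical size threshold that triggers a batch write.
        self.max_buffer_bytes = max_buffer_bytes
        self.buffer = []
        self.buffered_bytes = 0
        # Each flushed batch is recorded here instead of being written to storage.
        self.flushed_batches = []

    def process_element(self, record):
        # Buffer the record; flush inline once the buffer is large enough.
        self.buffer.append(record)
        self.buffered_bytes += len(record)
        if self.buffered_bytes >= self.max_buffer_bytes:
            self.flush()

    def on_checkpoint_start(self):
        # A new checkpoint starts: flush whatever is still pending.
        self.flush()

    def flush(self):
        # Write the pending records as one batch; an empty buffer is a no-op.
        if self.buffer:
            self.flushed_batches.append(list(self.buffer))
            self.buffer.clear()
            self.buffered_bytes = 0


writer = BufferingWriter(max_buffer_bytes=4)
writer.process_element(b"ab")   # 2 bytes buffered, below the threshold
writer.process_element(b"cd")   # 4 bytes buffered, size-based flush
writer.process_element(b"e")    # 1 byte buffered
writer.on_checkpoint_start()    # checkpoint-triggered flush of the remainder
```

In this model a batch is written either when the buffer reaches the size threshold or when a checkpoint begins, so no buffered data is left pending across a checkpoint boundary.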

That means that if a checkpoint succeeds but we do not receive the success event, the inflight instant will span two (or more) checkpoints.
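The instant lifecycle above, including the case of a missed success event, can be illustrated with a small simulation. This is a minimal sketch in Python under stated assumptions: `ToyCoordinator`, its method names, and the `instant-N` naming are hypothetical and do not reflect the actual Hudi or Flink APIs.

```python
class ToyCoordinator:
    """Toy model of the Coordinator's instant handling (hypothetical names)."""

    def __init__(self):
        self.next_id = 0
        self.committed = []  # list of (instant, checkpoint ids it covered)
        self.inflight = None
        self.checkpoints_in_inflight = []
        # A new instant is started as soon as the coordinator starts.
        self._start_new_instant()

    def _start_new_instant(self):
        self.next_id += 1
        self.inflight = f"instant-{self.next_id}"
        self.checkpoints_in_inflight = []

    def on_checkpoint(self, checkpoint_id):
        # Writers flush their data under the current inflight instant.
        self.checkpoints_in_inflight.append(checkpoint_id)

    def on_checkpoint_success(self, checkpoint_id):
        # Success notification received: commit the inflight instant,
        # then start a new one.
        self.committed.append((self.inflight, list(self.checkpoints_in_inflight)))
        self._start_new_instant()


coordinator = ToyCoordinator()
coordinator.on_checkpoint(1)
coordinator.on_checkpoint_success(1)  # instant-1 covers checkpoint 1 only
coordinator.on_checkpoint(2)          # success event for checkpoint 2 is never received
coordinator.on_checkpoint(3)
coordinator.on_checkpoint_success(3)  # instant-2 covers checkpoints 2 and 3
```

After this run, `coordinator.committed` holds `instant-1` covering checkpoint 1 and `instant-2` covering checkpoints 2 and 3, which is the spanning behavior described above.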

...