Page History

...

This is enough for aligned checkpoints, but unaligned checkpoints would introduce additional complexity. Since unaligned checkpoint barriers could jump over the pending records, if we instead wait for the EndOfPartition directly, since EndOfPartition could not jump over, the CheckpointCoordinator could not get notified in time and we might incur longer checkpoint periods during the finishing phase. This is similar for the aligned checkpoints with timeout. To cope with this issue, the upstream tasks would wait till the downstream tasks to process all the pending records before exit. The upstream tasks would emit a special event, namely EndOfUserRecordsEvent EndOfData, after all the records and the downstream tasks would respond with another special event, and the upstream tasks only exit after all the response events are received. During this period, the unaligned checkpoints or checkpoints with timeout could be done normally. Afterwards the EndOfPartition could reach the downstream CheckpointAligner quickly since there are no pending records.

...

The detail life cycle for the source stream tasks and the operators would become

Event	Stream Task Status	Operator Status	Final Checkpoint	Stop with Savepoint with Drain	Stop with Savepoint
	RUNNING	RUNNING	-	-	-
No more records or Received Savepoint Trigger			-	-	-
			-	finish task (cancel source thread for legacy source and suspend the mailbox for the new source)	finish task (cancel source thread for legacy source and suspend the mailbox for the new source)
			Advanced To MAX_WATERMARK and trigger all the event timers	Advanced To MAX_WATERMARK and trigger all the event timers	-
			Emit MAX_WATERMARK	Emit MAX_WATERMARK	-
	WAITING_FOR_FINAL_CP	FINISHED	call operator.endInput() & operator.finish()	call operator.endInput() & operator.finish()	-
			Emit

EndOfUserRecordsEvent

EndOfData[finished = true]

Emit

EndOfUserRecordsEvent

EndOfData[finished = true]

Emit

EndOfUserRecordsEvent

EndOfData[finished = false]
	when checkpoint triggered, emit Checkpoint Barrier	Emit Checkpoint Barrier	Emit Checkpoint Barrier
	Wait for Checkpoint / Savepoint Completed	Wait for Checkpoint / Savepoint Completed	Wait for Checkpoint / Savepoint Completed

Checkpoint Completed

Wait for downstream tasks

acknowledge EndOfUserRecordsEvent

acknowledge EndOfData

Wait for downstream tasks

acknowledge EndOfUserRecordsEvent

acknowledge EndOfData

Wait for downstream tasks

acknowledge EndOfUserRecordsEventDownstream Tasks acknowledge EndOfUserRecordsEvent

acknowledge EndOfData
Checkpoint Completed && EndOfData acknowledged	CLOSED	CLOSED	-	-
			Call operator.close()	Call operator.close()	Call operator.close()
			Emit EndOfPartitionEvent	Emit EndOfPartitionEvent	Emit EndOfPartitionEvent

Similarly, the status for the non-source tasks would become

Event	Stream Task Status	Operator Status	Final Checkpoint	Stop with Savepoint with Drain	Stop with Savepoint
	RUNNING	RUNNING	-	-	-
			-	-	-
			-	-	-
Aligned on MAX_WATERMARK			Advanced To MAX_WATERMARK and trigger all the event timers	Advanced To MAX_WATERMARK and trigger all the event timers	N/A (MAX_WATERMARK is not emitted in this case)
			Emit MAX_WATERMARK	Emit MAX_WATERMARK	N/A
Aligned On EndOfUserRecordsEvent	WAITING_FOR_FINAL_CP	FINISHED	call operator.endInput() & operator.finish()	call operator.endInput() & operator.finish()	-
			Emit

EndOfUserRecordsEvent

EndOfData[finished = true]

Emit

EndOfUserRecordsEvent

EndOfData[finished = true]

Emit

EndOfUserRecordsEvent

EndOfData[finished = false]
Aligned on Checkpoint Barrier	Emit CheckpointBarrier	Emit CheckpointBarrier	Emit CheckpointBarrier
	Wait for Checkpoint / Savepoint Completed	Wait for Checkpoint / Savepoint Completed	Wait for Checkpoint / Savepoint Completed

Checkpoint Completed

Wait for downstream tasks

acknowledge EndOfUserRecordsEvent

acknowledge EndOfData

Wait for downstream tasks

acknowledge EndOfUserRecordsEvent

acknowledge EndOfData

Wait for downstream tasks

acknowledge EndOfUserRecordsEventDownstream Tasks acknowledge EndOfUserRecordsEventCLOSEDCLOSED

acknowledge EndOfData
	Wait for EndOfPartitionEvent	Wait for EndOfPartitionEvent	Wait for EndOfPartitionEvent
Checkpoint completed/EndOfData acknowledged/EndOfPartition received	CLOSED	CLOSED
			Call operator.close()	Call operator.close()	Call operator.close()
			Emit EndOfPartitionEvent	Emit EndOfPartitionEvent	Emit EndOfPartitionEvent

Info

We need to wait for a checkpoint to complete, that started after the finish() method. However, we support concurrent checkpoints. Moreover there is no guarantee the notifyCheckpointComplete arrives or the order in which they will arrive. It should be enough though to wait for notification for any checkpoint that started after finish().

We should make sure though later checkpoints do not leave behind lingering resources.

Imagine a scenario where:
   1. task/operator received `finish()`
   2. checkpoint 42 triggered (not yet completed)
   3. checkpoint 43 triggered (not yet completed)
   4. checkpoint 44 triggered (not yet completed)
   5. notifyCheckpointComplete(43)


Our proposal is to shutdown the task immediately after seeing first `notifyCheckpointComplete(X)`, where X is any triggered checkpoint AFTER `finish()`. This should be fine, as:
   a) ideally there should be no new pending transactions opened after checkpoint 42
   b) even if operator/function is opening some transactions for checkpoint 43 and checkpoint 44 (`FlinkKafkaProducer`), those transactions after checkpoint 42 should be empty

After seeing 5. (notifyCheckpointComplete(43)) It should be good enough to:
   - commit transactions from checkpoint 42, (and 43 if they were created, depends on the user code)
   - close operator, aborting any pending transactions (for checkpoint 44 if they were opened, depends on the user code)

If checkpoint 44 completes afterwards, it will still be valid. Ideally we would recommend that after seeing `finish()` operators/functions should not be opening any new transactions, but that shouldn't be required.

...

Page tree

Versions Compared

Old Version 69

New Version 70

Key