...

Currently, tasks can be classified into two types according to how they are notified of a checkpoint: source tasks are triggered via RPC from the JobManager, while non-source tasks are triggered via barriers received from their precedent tasks. If checkpoints after some tasks have finished are considered, non-source tasks might also be triggered via RPC if they become the new "root" tasks. The implementation for non-source tasks would differ from that for source tasks: it would notify the CheckpointBarrierHandler instead of directly performing the snapshot, so that the CheckpointBarrierHandler can handle checkpoint subsumption correctly. Since triggerCheckpoint is currently implemented uniformly in the base StreamTask class and is only called for source StreamTask classes, we would refactor the class hierarchy to support different implementations of the triggerCheckpoint method for the two types of tasks.
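The split described above can be sketched roughly as follows. This is only an illustration of the intended shape, not the actual Flink code: all class and method names (SketchStreamTask, SketchSourceTask, SketchNonSourceTask, SketchBarrierHandler, onRpcTrigger, performSnapshot) are hypothetical stand-ins, and the handler here merely records the notification rather than aligning barriers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for CheckpointBarrierHandler: it only records
// which checkpoints were announced to it via RPC.
class SketchBarrierHandler {
    final List<Long> rpcTriggered = new ArrayList<>();

    void onRpcTrigger(long checkpointId) {
        rpcTriggered.add(checkpointId);
    }
}

// Hypothetical base class: triggerCheckpoint is abstract here, so each
// task type supplies its own reaction to an RPC trigger.
abstract class SketchStreamTask {
    final List<Long> snapshots = new ArrayList<>();

    abstract void triggerCheckpoint(long checkpointId);

    void performSnapshot(long checkpointId) {
        snapshots.add(checkpointId);
    }
}

// Source tasks perform the snapshot directly when triggered via RPC.
class SketchSourceTask extends SketchStreamTask {
    @Override
    void triggerCheckpoint(long checkpointId) {
        performSnapshot(checkpointId);
    }
}

// Non-source tasks hand the trigger to the barrier handler, which would
// decide when to snapshot and how to subsume older checkpoints.
class SketchNonSourceTask extends SketchStreamTask {
    final SketchBarrierHandler handler = new SketchBarrierHandler();

    @Override
    void triggerCheckpoint(long checkpointId) {
        handler.onRpcTrigger(checkpointId);
    }
}
```

In this sketch, an RPC trigger on a non-source task does not take a snapshot by itself; the decision is left to the handler, which is the point of the proposed refactoring.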

If not all of a task's precedent tasks are finished, the task would still be notified via checkpoint barriers from the running precedent tasks. However, some of the precedent tasks might already be finished, and the CheckpointBarrierHandler must take this case into account.



The basic algorithm to compute the tasks to trigger would iterate over the ExecutionGraph to find the new root running tasks. However, the algorithm could be optimized by iterating over the JobGraph instead. The detailed algorithm is shown in the Appendix.
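A minimal sketch of the basic (unoptimized) variant is given below. It assumes that a "new root running task" is a task that has not finished and whose precedent tasks have all finished, so that a source with no precedent tasks qualifies trivially. The graph is modeled as a plain map from task name to its precedent task names; TriggerTaskSketch and computeTasksToTrigger are hypothetical names, not part of the ExecutionGraph API, and the JobGraph-based optimization from the Appendix is not reproduced here.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TriggerTaskSketch {

    // precedents: task name -> names of its direct precedent tasks
    // finished:   names of the tasks that have already finished
    // Returns the running tasks that have no running precedent task.
    static Set<String> computeTasksToTrigger(
            Map<String, List<String>> precedents, Set<String> finished) {
        Set<String> toTrigger = new TreeSet<>();
        for (Map.Entry<String, List<String>> entry : precedents.entrySet()) {
            String task = entry.getKey();
            if (finished.contains(task)) {
                continue; // finished tasks are never triggered
            }
            // Assumption: a running task is a root once every precedent is finished.
            if (finished.containsAll(entry.getValue())) {
                toTrigger.add(task);
            }
        }
        return toTrigger;
    }
}
```

For a chain A -> B -> C in which only A has finished, this sketch returns B: C still receives barriers from B and therefore does not need an RPC trigger.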

...