Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Therefore, for checkpoints after some tasks are finished we need to trigger all the leaf new root tasks whose precedent tasks are all finished. Currently CheckpointCoordinator computes tasks to trigger in a separate timer thread, which runs concurrently with JobMaster’s main thread. The mechanism works well currently since it only needs to pick the current execution attempt for each execution vertex. However, to find all the leaf new root tasks needs to iterate over the whole graph and if JM changes tasks’ state during computation, it would be hard to guarantee the rightness of the algorithm, especially the JM might restarted the finished tasks on failover. To simplify the implementation we would proxy the computation of tasks to trigger to the main thread. 

The basic algorithm to compute the tasks to trigger would be iterating over the ExecutionGraph to find the leaf running new root running tasks. However, the algorithm could be optimized by iterating over the JobGraph instead. The detailed algorithm is shown in Appendix. 

...


Aligned Checkpoint

Unaligned Checkpoint

Received Trigger + all EndOfPartition

Wait till processed all barriers, and then report the snapshot

Wait till processed all EndOfPartition, and then report the snapshot. In this case we would take the snapshot the same as the aligned method.

Do not receive Trigger + some barriers

Wait till received barrier or EndOfPartition from each channel, then report the snapshot.

When receiving the first barrier, start spilling the buffers before the barrier or EndOfParitition for each channel.

Tasks Waiting For Checkpoint Complete Before Finish

...