Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Figure 1. An illustration of the the structure of the Checkpoint. One issue is that whether we need to keep the operator states collected from the finished tasks. 

Option 3. Allow tasks to finish &

...

Checkpoints do not

...

contain the

...

states from finished tasks

If we do not want to keep the final snapshots of the finished tasks instead, we need to explore how it changes the current behavior. Although in a checkpoint the state is snapshotted in the unit of task, it is finally reorganized by the unit of operator since there might be topology changes or rescaling. In other words, we need to view the checkpoint as composed of the current working progress of each operator, and tasks are only stateless runtime units to execute the remaining work for the operators. If we do not include the state from the finished tasks, it is equivalent to some operator discarding a part of finished work’s state in the checkpoint. Let ΔR represents the state of running task and ΔF represents the state of finished task when taking checkpoints, then the result of the operator’s execution after failover is

...