...
A full set of failure detection and handling will be discussed in Milestone Three. More explicitly, network failures between TMs are only partially discussed within the scope of Milestone One (the part affecting producers in Section2, Point2Network connection errors between TMs).
1. Task Failure Handling
...
However, JM may deploy a new sink before TM successfully fails the old one. We have to make sure only one view exists at any time, and the view is connected with the new attemptId/attemptNumber. In theory, the old execution can connect after the new execution. In reality, it can happen, but rarely. If only one view is allowed at a time, creating ResultSubpartitionView for the newer execution attempt should prevent the creation of ResultSubpartitionView from the older execution attempt to succeed.
Anchor network errors network errors
network errors | |
network errors |
Point2: Network connection errors between TM
...