...

There are four more points worth highlighting here:

1. Orphan/Zombie Tasks

<orphan>

A TM may disconnect from the JM while both the TM and the JM are still alive. In the sink failure case, the TM fails the sink task, and the JM redeploys the sink to a different TM slot requested from the RM.

...

3. Partial Records

A record can span multiple buffers. If a task reads a partial record and then fails, that partial record is lost after the restart. The remaining part, read from the upstream, cannot be deserialized successfully because the reader cannot determine where the record ends.
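The partial-record problem can be illustrated with a small sketch. It assumes a simple length-prefixed framing (a 4-byte length followed by the payload) and fixed-size buffers. This is a hypothetical format chosen for illustration, not necessarily the actual wire format, and the names `frame`, `split_into_buffers`, and `decode` are made up for this example. A reader that resumes at a buffer boundary in the middle of a record interprets payload bytes as a length header and cannot find the next record boundary.

```python
import struct


def frame(records):
    """Serialize each record as a 4-byte big-endian length plus payload (assumed format)."""
    return b"".join(struct.pack(">I", len(r)) + r for r in records)


def split_into_buffers(stream, buffer_size):
    """Cut the byte stream into fixed-size buffers, ignoring record boundaries."""
    return [stream[i:i + buffer_size] for i in range(0, len(stream), buffer_size)]


def decode(buffers):
    """Decode as many complete records as possible; return (records, leftover bytes)."""
    data = b"".join(buffers)
    records, pos = [], 0
    while pos + 4 <= len(data):
        (length,) = struct.unpack_from(">I", data, pos)
        if pos + 4 + length > len(data):
            break  # header claims more bytes than are available
        records.append(data[pos + 4:pos + 4 + length])
        pos += 4 + length
    return records, data[pos:]


records = [b"alpha", b"bravo-charlie", b"delta"]
buffers = split_into_buffers(frame(records), buffer_size=8)

# A reader that sees every buffer recovers all records.
full, _ = decode(buffers)

# A restarted reader that lost buffer 0 starts mid-record: the tail of
# "alpha" is misread as part of a length header, so nothing decodes.
resumed, leftover = decode(buffers[1:])

print(full)
print(resumed, len(leftover))
```

In this sketch the restarted reader has no way to tell that its first byte belongs to a record whose head was already consumed, which is why the upstream and downstream need to agree on how to handle a record that was only partially delivered.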

...

It is also possible that tasks downstream of the failed tasks miss barriers as well, but we postpone that discussion until later.

The proposed solution is to

...