Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. When SlotPool starts, BlockedTaskManagerChecker will be passed in to check whether a task manager is blocked.
  2. When receiving slot offers from blocked task managers (including task managers on blocked nodes), all offers should be rejected.
  3. When a task manager is newly blocked, BlocklistContext will perform the following on SlotPool:
    1. If the action is MARK_BLOCKED, release all free(no task assigned) slots on the blocked task manager(it needs to introduce a new method: SlotPoolService#releaseFreeSlotsOnTaskManager).
    2. If the action is MARK_BLOCKED_AND_EVACUATE_TASKS, release all slots on the blocked task manager(it needs to introduce a new method: SlotPoolService#releaseSlotsOnTaskManager). Note that we should not unregister the TM directly, because in the case of batch jobs, there may be unconsumed result partitions on it.
  4. When a slot state changes from reserved(task assigned) to free(no task assigned), it will check whether the corresponding task manager is blocked. If yes, release the slot.

We will introduce a new slot pool implementation: BlocklistSlotPool, which extends the DefaultDeclarativeSlotPool and overrides some methods to implement the blocklist-related functions described above, and the blocklist information will be passed in the constructor. This new implementation will be only used when the blocklist is enabled.

Code Block
titleSlotPoolService
public interface SlotPoolService {

   /**
     * Releases all slots belonging to the owning TaskExecutor if it has been registered.
     *
     * @param taskManagerId identifying the TaskExecutor
     * @param cause cause for failing the slots
     */
    void releaseSlotsOnTaskManager(ResourceID taskManagerId, Exception cause);

    /**
     * Releases all free slots belonging to the owning TaskExecutor if it has been registered.
     *
     * @param taskManagerId identifying the TaskExecutor
     * @param cause cause for failing the slots
     */
    void releaseFreeSlotsOnTaskManager(ResourceID taskManagerId, Exception cause);

    //...
}

...


Code Block
titleBlocklistSlotPool
public class BlocklistSlotPool extends DefaultDeclarativeSlotPool {
    // ...
}

...