...

Discussion thread: https://lists.apache.org/thread/ocssfxglpc8z7cto3k8p44mrjxwr67r9
Vote thread:
JIRA: FLINK-31439
Release:


Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

These two SlotManager implementations currently contain duplicated code (e.g., the delayed check in checkResourceRequirements) and similarly named components (e.g., TaskManagerTracker and TaskExecutorManager). This leads to duplicated feature development (such as "Make ResourceActions declarative").

This FLIP aims to unify the implementation of SlotManager. Since FineGrainedSlotManager has almost all the capabilities of DeclarativeSlotManager, we will implement the features that FineGrainedSlotManager still lacks and enable it by default.

...

Overview of FineGrainedSlotManager and DeclarativeSlotManager

Supported functionality

Functionality                                                   | FineGrainedSlotManager | DeclarativeSlotManager
----------------------------------------------------------------|------------------------|-----------------------
Allocate new task managers when resources are insufficient      | YES                    | YES
Release idle task managers if there are no tasks/resultPartitions | YES                  | YES
Keep some redundant task managers                               | NO                     | YES
Maximum limit on the number of slots                            | YES                    | YES
Filter out blocked resources                                    | YES                    | YES
Track requirements of multiple jobs                             | YES                    | YES
Fulfill requirements using an evenly spread strategy            | NO                     | YES
Reclaim inactive slots when a job finishes                      | NO                     | YES
Different slot resources in the same task manager               | YES                    | NO

Sub-components

DeclarativeSlotManager

...

The redundant task managers are used to speed up failover.

FineGrainedSlotManager reserves an interface for heterogeneous task managers, but currently there is only one implementation, which requests task managers with identical resources. Therefore, the redundant task managers will not take heterogeneity into account for now. This can be considered in detail when we decide to support heterogeneous task managers.

The logic in FineGrainedSlotManager should be:

  • Introduce redundantTaskManagerNum to DefaultResourceAllocationStrategy

  • Invoke tryFulFillRedundantResourceProfiles at the end of tryFulfillRequirements. It should use the remaining registeredResources and pendingFreeResources to fulfill the redundant slot requirements (defaultSlotResourceProfile * redundantTaskManagerNum * numSlotsPerWorker), and try to add new PendingTaskManagers to resultBuilder if the resources are not enough.

    void tryFulFillRedundantResourceProfiles(
            Collection<InternalResourceInfo> registeredResources,
            List<InternalResourceInfo> pendingFreeResources,
            ResourceAllocationResult.Builder resultBuilder) {
        // Fulfill the redundant slot requirements from the remaining resources;
        // add new PendingTaskManagers to resultBuilder if they are not enough.
    }
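The arithmetic behind the redundant slot requirement can be sketched as follows. This is only an illustration under simplifying assumptions, not the proposed implementation: the `Resource` class, `slotsThatFit`, and `newTaskManagersNeeded` are hypothetical names, resources are reduced to CPU millicores and memory in MB, and all task managers are assumed to be homogeneous.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these types exist in Flink. */
final class RedundantResourceSketch {

    /** Simplified resource: CPU in millicores and memory in MB (both assumed positive for slots). */
    static final class Resource {
        final int cpuMillis;
        final int memoryMb;

        Resource(int cpuMillis, int memoryMb) {
            this.cpuMillis = cpuMillis;
            this.memoryMb = memoryMb;
        }

        /** Number of slots of the given profile that fit into this free resource. */
        int slotsThatFit(Resource slot) {
            return Math.min(cpuMillis / slot.cpuMillis, memoryMb / slot.memoryMb);
        }
    }

    /**
     * Returns how many new (pending) task managers would be needed so that
     * redundantTaskManagerNum * numSlotsPerWorker default slots stay free.
     */
    static int newTaskManagersNeeded(
            List<Resource> registeredFree,
            List<Resource> pendingFree,
            Resource defaultSlot,
            int redundantTaskManagerNum,
            int numSlotsPerWorker) {
        int required = redundantTaskManagerNum * numSlotsPerWorker;
        int available = 0;
        // First use what is left on registered task managers ...
        for (Resource r : registeredFree) {
            available += r.slotsThatFit(defaultSlot);
        }
        // ... then what is left on already pending task managers.
        for (Resource r : pendingFree) {
            available += r.slotsThatFit(defaultSlot);
        }
        int missing = Math.max(0, required - available);
        // Ceiling division: each new task manager contributes numSlotsPerWorker slots.
        return (missing + numSlotsPerWorker - 1) / numSlotsPerWorker;
    }

    public static void main(String[] args) {
        Resource defaultSlot = new Resource(1000, 1024);
        List<Resource> registeredFree =
                Arrays.asList(new Resource(2000, 4096), new Resource(1000, 512));
        List<Resource> pendingFree = Arrays.asList(new Resource(3000, 3072));
        // 8 redundant slots required, 5 available, 3 missing -> prints 1
        System.out.println(
                newTaskManagersNeeded(registeredFree, pendingFree, defaultSlot, 2, 4));
    }
}
```

In this sketch the free resources are only counted, not reserved; an actual strategy would also need to deduct the redundant slots from the remaining resources so that they are not handed out twice.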


Move resource allocation/release logic from FineGrainedSlotManager to TaskManagerTracker

Currently, FineGrainedSlotManager is responsible for both slot allocation and resource request/release. This makes its logic complicated, so we will move the task-manager-related work from FineGrainedSlotManager to TaskManagerTracker, which already tracks task managers but does not yet handle requesting or releasing them.

After this change, the TaskManagerTracker will manage all task manager behavior:

  • Request new task managers
  • Track pending task managers
  • Track registered task managers
  • Release idle task managers
  • Enforce the maximum resource limit
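To illustrate how these responsibilities might sit together in one component, here is a minimal in-memory sketch. It is not the actual TaskManagerTracker interface: the class name, all method names, the string IDs, and the modelling of the resource limit as a maximum task manager count are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of a tracker that owns the whole task manager lifecycle. */
final class TaskManagerTrackerSketch {
    private final int maxTaskManagers; // assumed stand-in for the max resource limit
    private final Set<String> pending = new LinkedHashSet<>();
    private final Map<String, Boolean> registeredIdle = new LinkedHashMap<>(); // id -> idle?
    private int nextId = 0;

    TaskManagerTrackerSketch(int maxTaskManagers) {
        this.maxTaskManagers = maxTaskManagers;
    }

    /** Requests a new task manager unless the limit (pending + registered) is reached. */
    Optional<String> requestNewTaskManager() {
        if (pending.size() + registeredIdle.size() >= maxTaskManagers) {
            return Optional.empty();
        }
        String id = "tm-" + nextId++;
        pending.add(id);
        return Optional.of(id);
    }

    /** Moves a pending task manager to the registered set; it starts out idle. */
    boolean registerTaskManager(String id) {
        if (!pending.remove(id)) {
            return false;
        }
        registeredIdle.put(id, true);
        return true;
    }

    /** Updates the idle flag of a registered task manager; unknown IDs are ignored. */
    void setIdle(String id, boolean idle) {
        registeredIdle.replace(id, idle);
    }

    /** Releases every idle registered task manager and returns the released IDs. */
    List<String> releaseIdleTaskManagers() {
        List<String> released = new ArrayList<>();
        Iterator<Map.Entry<String, Boolean>> it = registeredIdle.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Boolean> entry = it.next();
            if (entry.getValue()) {
                released.add(entry.getKey());
                it.remove();
            }
        }
        return released;
    }

    int pendingCount() {
        return pending.size();
    }

    int registeredCount() {
        return registeredIdle.size();
    }
}
```

With all five responsibilities behind one object, the slot manager in this sketch would only need to ask for resources and report idleness, without tracking pending or registered workers itself.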

Try to reclaim inactive slots when a job terminates

...