Discussion thread | |
---|---|
Vote thread | |
JIRA | |
Release | |
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
We introduced FineGrainedSlotManager in FLIP-56 to allow users to set different resources for slot requests. The FineGrainedSlotManager is the same as DeclarativeSlotManager if the user does not configure the resource profiles of SlotSharingGroup.
These two SlotManager implementations now contain duplicated code (e.g., the delayed check of checkResourceRequirements) and similarly named components (e.g., TaskManagerTracker and TaskExecutorManager). As a result, features such as "Make ResourceActions declarative" have to be developed twice.
This FLIP aims to unify the implementation of SlotManager. Since FineGrainedSlotManager already has almost all the capabilities of DeclarativeSlotManager, we will implement the capabilities it still lacks and enable it by default.
Public Interfaces
For now, we will change the default value of ‘cluster.fine-grained-resource-management.enabled’ to true. The option will be kept so that users can fall back to DeclarativeSlotManager if they run into issues with FineGrainedSlotManager.
In the long term, DeclarativeSlotManager will be completely removed in the next release after the default value is changed.
Proposed Changes
Add the missing capabilities to FineGrainedSlotManager
Use different slot matching strategies to spread out slots
SlotMatchingStrategy was introduced by FLINK-12122. It is used to spread slots out across all registered TaskManagers. In FineGrainedSlotManager, this logic should be implemented as follows:
- Introduce SlotMatchingStrategy to DefaultResourceAllocationStrategy.
- Introduce a new interface method in SlotMatchingStrategy to find the expected instance:
  Optional<InstanceID> findMatchingSlot(Predicate<InstanceID> isResourceMatching, Collection<InstanceID> availableTaskManagers, Function<InstanceID, Number> instanceScoreLookup);
- Add totalProfile to DefaultResourceAllocationStrategy#InternalResourceInfo to calculate the score (totalProfile.subtract(availableProfile)) of an instance.
- DefaultResourceAllocationStrategy#tryFulfillRequirementsForJobWithResources invokes SlotMatchingStrategy to find the best TaskManager to allocate resources from.
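The following is a minimal sketch of how a spread-out strategy could implement the proposed method. It is an illustration under stated assumptions, not the Flink implementation: String stands in for InstanceID so the example compiles without Flink on the classpath, the names SlotMatchingStrategySketch and LeastUtilizationSketch are hypothetical, and the policy "lowest score wins" assumes the score reflects used resources as described above.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical, simplified stand-in for the proposed interface method.
// String replaces Flink's InstanceID so the sketch is self-contained.
interface SlotMatchingStrategySketch {
    Optional<String> findMatchingSlot(
            Predicate<String> isResourceMatching,
            Collection<String> availableTaskManagers,
            Function<String, Number> instanceScoreLookup);
}

// Assumed "spread out" policy: among the task managers that can hold the
// request, pick the one with the lowest score (least used resources).
final class LeastUtilizationSketch implements SlotMatchingStrategySketch {
    @Override
    public Optional<String> findMatchingSlot(
            Predicate<String> isResourceMatching,
            Collection<String> availableTaskManagers,
            Function<String, Number> instanceScoreLookup) {
        return availableTaskManagers.stream()
                // Drop task managers that cannot satisfy the request.
                .filter(isResourceMatching)
                // Prefer the least utilized of the remaining candidates.
                .min(Comparator.comparingDouble(
                        tm -> instanceScoreLookup.apply(tm).doubleValue()));
    }
}
```

Passing the score as a lookup function keeps the strategy independent of how InternalResourceInfo stores totalProfile and availableProfile; the caller decides how a profile difference is reduced to a single number.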
Keep some redundant task managers to speed up failover
The redundant task managers are used to speed up failover. The logic in FineGrainedSlotManager should be:
- Introduce redundantTaskManagerNum to DefaultResourceAllocationStrategy.
- Invoke tryFulFillRedundantResourceProfiles at the end of tryFulfillRequirements. It should use the remaining registeredResources and pendingFreeResources to fulfill the redundant slot requirements (defaultSlotResourceProfile * redundantTaskManagerNum * numSlotsPerWorker), and add new PendingTaskManagers to resultBuilder if those resources are not enough.
  tryFulFillRedundantResourceProfiles(Collection<InternalResourceInfo> registeredResources, List<InternalResourceInfo> pendingFreeResources, ResourceAllocationResult.Builder resultBuilder) {}
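To make the arithmetic behind the redundant requirement concrete, here is a rough sketch that reduces resource profiles to counts of default-sized slots. This is a simplification for illustration only: the real logic would operate on ResourceProfile values, and the class and method names below (RedundantResourceSketch, pendingTaskManagersToAdd) are hypothetical.

```java
import java.util.List;

final class RedundantResourceSketch {

    // Returns how many new pending task managers would be needed so that
    // redundantTaskManagerNum * numSlotsPerWorker default-sized slots stay free.
    // Assumption: each list entry is the number of free default-sized slots
    // left on one registered or pending task manager.
    static int pendingTaskManagersToAdd(
            List<Integer> freeSlotsOnRegistered,
            List<Integer> freeSlotsOnPending,
            int redundantTaskManagerNum,
            int numSlotsPerWorker) {
        if (numSlotsPerWorker <= 0) {
            throw new IllegalArgumentException("numSlotsPerWorker must be positive");
        }
        int required = redundantTaskManagerNum * numSlotsPerWorker;
        int free = freeSlotsOnRegistered.stream().mapToInt(Integer::intValue).sum()
                + freeSlotsOnPending.stream().mapToInt(Integer::intValue).sum();
        // Only the part not covered by remaining resources needs new workers.
        int missing = Math.max(0, required - free);
        // Round up: a partially used worker still has to be requested as a whole.
        return (missing + numSlotsPerWorker - 1) / numSlotsPerWorker;
    }
}
```

For example, with two redundant task managers of four slots each, one free slot on a registered task manager and one on a pending task manager, eight slots are required, six are missing, and two new pending task managers would be requested.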
Try to reclaim inactive slots when the job is terminated
As described in FLINK-21751, the task manager may report free slots to RM earlier than JM when a job finishes, which causes RM to reassign slots to the finished job. It’s hard to keep a strict order for TM/JM, so we need to try to reclaim inactive slots when the job is terminated.
- Introduce freeInactiveSlots to SlotStatusSyncer
- Try to reclaim inactive slots in FineGrainedSlotManager#clearResourceRequirements
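The reclaim step can be pictured with the simplified bookkeeping below. It is a sketch under assumptions, not the SlotStatusSyncer code: the class InactiveSlotSketch, its three-state model, and the definition of "inactive" (a slot assigned to a job but not yet in active use by it) are all hypothetical simplifications.

```java
import java.util.HashMap;
import java.util.Map;

final class InactiveSlotSketch {

    // Assumed simplified slot life cycle for illustration.
    enum State { FREE, INACTIVE, ACTIVE }

    private static final class Slot {
        String jobId;
        State state = State.FREE;
    }

    private final Map<String, Slot> slots = new HashMap<>();

    // Records that a slot was assigned to a job, either already active or not.
    void allocate(String slotId, String jobId, boolean active) {
        Slot slot = slots.computeIfAbsent(slotId, id -> new Slot());
        slot.jobId = jobId;
        slot.state = active ? State.ACTIVE : State.INACTIVE;
    }

    // Frees every inactive slot still assigned to the terminated job and
    // returns how many slots were reclaimed. Active slots are left alone.
    int freeInactiveSlots(String jobId) {
        int freed = 0;
        for (Slot slot : slots.values()) {
            if (slot.state == State.INACTIVE && jobId.equals(slot.jobId)) {
                slot.jobId = null;
                slot.state = State.FREE;
                freed++;
            }
        }
        return freed;
    }

    State stateOf(String slotId) {
        Slot slot = slots.get(slotId);
        return slot == null ? State.FREE : slot.state;
    }
}
```

Calling freeInactiveSlots when the resource requirements of a job are cleared means that a slot reassigned to an already finished job is returned to the pool instead of staying bound to it.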
Use FineGrainedSlotManager as default SlotManager
With these changes, FineGrainedSlotManager has the full capability of DeclarativeSlotManager, so we can change the default value of cluster.fine-grained-resource-management.enabled from false to true. The option will be preserved as a fallback in case users hit corner cases.
The DeclarativeSlotManager and related configs will be completely removed in the next release after the default value is changed.
Limitations
None