...
...
- If the JV of the SEV doesn’t contain a co-location constraint
- If the parallelism of the JV equals numberOfSlots
- Traverse all SEVs of the JV and assign SEVs[subtask_index] to ESSGs[subtask_index]. This strategy ensures that SEVs with the same index are assigned to the same ESSG.
- It also ensures that co-located subtasks will be in one ESSG, given that co-located job vertices share the same parallelism and will always be in the same SSG.
- For forward edges, this reduces remote data exchange between such JVs: all subtasks connected by a forward shuffle stay in the same slot, so their data exchange is local.
- If the parallelism of JV is less than the numberOfSlots in the current SSG.
- Get ESSGs[++eSSGIndex] as ESSG.
- Add the SEV into the target ESSG.
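The assignment rules above can be illustrated with a minimal sketch. It is not the Flink implementation: `JobVertex`, `assign_subtasks`, and the tuple representation of an SEV are hypothetical names, co-location constraints are ignored, and it assumes that `eSSGIndex` starts at -1 and wraps around modulo `numberOfSlots`, which the text above does not state.

```python
from dataclasses import dataclass


@dataclass
class JobVertex:
    """Hypothetical stand-in for a JV: a name and its parallelism."""
    name: str
    parallelism: int


def assign_subtasks(job_vertices, number_of_slots):
    """Assign each subtask (SEV) to one of `number_of_slots` groups (ESSGs).

    Returns a list of groups; each group is a list of
    (vertex_name, subtask_index) tuples.
    """
    groups = [[] for _ in range(number_of_slots)]
    essg_index = -1  # assumption: shared round-robin pointer, starts before 0

    for jv in job_vertices:
        if jv.parallelism > number_of_slots:
            raise ValueError("parallelism must not exceed number_of_slots")
        if jv.parallelism == number_of_slots:
            # Same subtask index -> same group, so forward edges stay local.
            for i in range(jv.parallelism):
                groups[i].append((jv.name, i))
        else:
            # Lower parallelism: continue round-robin from the last position.
            for i in range(jv.parallelism):
                essg_index = (essg_index + 1) % number_of_slots  # assumed wrap
                groups[essg_index].append((jv.name, i))
    return groups
```

With vertices of parallelism 4, 2, and 2 and four slots, the first vertex fills every group by index, and the two smaller vertices are spread over groups 0-1 and 2-3, so each group ends up with two subtasks.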
...
- Based on this information:
- When `current time > arrival time of the last slotRequest + slot.request.max-interval`, the SlotPool declares its resource requirements to the RM. (This applies to all jobs, both batch and streaming.)
- When `the number of available slots cached exceeds or equals the pending slots`, slot selection and allocation can be carried out. (This applies to streaming jobs only.)
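The two conditions can be written as simple predicates. This is only a sketch of the checks as stated above; the function and parameter names are hypothetical, and times are assumed to be in milliseconds.

```python
def should_declare_requirements(now_ms, last_request_ms, max_interval_ms):
    """True once no new slot request has arrived for longer than the interval.

    Mirrors: current time > arrival time of the last slotRequest
             + slot.request.max-interval (strictly greater).
    """
    return now_ms > last_request_ms + max_interval_ms


def enough_slots_cached(available_slots, pending_requests):
    """True when the cached available slots cover all pending requests.

    Per the text, this gate is applied to streaming jobs only.
    """
    return available_slots >= pending_requests
```

Note that the first comparison is strict, so a check made exactly at `last_request_ms + max_interval_ms` does not yet trigger the declaration.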
Limitation: The waiting mechanism (`the number of available slots cached exceeds or equals the pending slots`) does not take effect for batch jobs, so batch jobs do not have a global perspective for balancing the number of tasks when allocating slots to TMs. There are several main reasons:
...
- Expose two options to the user
- slot.sharing-strategy and cluster.evenly-spread-out-slots
- After discussion on the mailing list, these two options were unified into a single option: taskmanager.load-balance.mode.
- Related discussions can be found here:
- Change the strategy of slot.request.max-interval
- The old strategy is: when `current time > arrival time of the last slotRequest + slot.request.max-interval` and `the number of available slots cached exceeds or equals the pending slots`, slot selection and allocation can be carried out.
- The latest strategy is:
- For DeclarativeSlotPool#increaseResourceRequirementsBy: when `current time > arrival time of the last slotRequest + slot.request.max-interval`, the DeclarativeSlotPool declares its resource requirements to the RM.
- For DeclarativeSlotPool#decreaseResourceRequirementsBy: when `current time > arrival time of the last slotRequest + slot.request.max-interval`, the DeclarativeSlotPool declares its resource requirements to the RM.
- For DeclarativeSlotPoolBridge: when `the number of available slots cached exceeds or equals the pending slots`, slot selection and allocation can be carried out.
- The new strategy not only covers the functionality of the old strategy, but also reduces the number of RPCs from the SlotPool to the ResourceManager.
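To illustrate why delaying the declaration reduces RPCs, here is a minimal sketch of a debounced declarer. It is not Flink code: `DebouncedDeclarer`, `tick`, and the `send` callback (standing in for the RPC to the RM) are hypothetical, and polling with explicit timestamps is a simplification of whatever timer mechanism the real implementation uses.

```python
class DebouncedDeclarer:
    """Collects requirement changes and declares them once things go quiet."""

    def __init__(self, max_interval_ms, send):
        self.max_interval_ms = max_interval_ms
        self.send = send            # hypothetical stand-in for the RPC to the RM
        self.required = 0           # current total of required slots
        self.last_change_ms = None  # arrival time of the last change
        self.dirty = False          # True if a change has not been declared yet

    def increase(self, n, now_ms):
        self.required += n
        self._mark(now_ms)

    def decrease(self, n, now_ms):
        self.required -= n
        self._mark(now_ms)

    def _mark(self, now_ms):
        self.last_change_ms = now_ms
        self.dirty = True

    def tick(self, now_ms):
        """Declare once, if the interval has elapsed since the last change."""
        if self.dirty and now_ms > self.last_change_ms + self.max_interval_ms:
            self.send(self.required)
            self.dirty = False
            return True
        return False
```

In this sketch, three `increase` calls arriving within the interval result in a single call to `send` carrying the combined requirement, rather than three separate declarations.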
...