...

  • Coupling between resource requirements and operator chaining / slot sharing. If the SSGs are changed, either explicitly specified by users or due to changes of operator chaining / slot sharing strategies, the specified resource requirements also need to be adjusted.
  • User involvement against parallelism differences. For SSGs with operators of different parallelisms, the slots that do not contain subtasks of all the operators may have more resources than needed. To improve resource utilization against this issue, users would need to separate operators with different parallelisms into different SSGs.

Summary


| Granularity | Pros | Cons |
|---|---|---|
| Operator | Decoupling between resource requirements and operator chaining / slot sharing. Potential optimization against parallelism differences. | Too much user involvement. Hard to support hybrid resource requirements. Accumulative configuration error. |
| Task | Decoupling between resource requirements and slot sharing. Potential optimization against parallelism differences. Less user involvement and accumulative configuration error compared to the operator granularity. | Hard to support hybrid resource requirements. Still too much user involvement and accumulative configuration error. Expose operator chaining. |
| Slot Sharing Group | Flexible user involvement. Support hybrid resource requirements. Less accumulative configuration error. Simplify the system. | Coupling between resource requirements and operator chaining / slot sharing. User involvement against parallelism differences. |

The above table summarizes the advantages and disadvantages of the three design options.

...

StreamGraphGenerator (Java):

public class StreamGraphGenerator {
    // Proposed API, signature only; the method body is omitted here.
    public StreamGraphGenerator setSlotSharingGroupResource(Map<String, ResourceProfile> slotSharingGroupResources);
}

...

For now, we propose in FLINK-20863 to exclude network memory from ResourceProfile, to unblock the fine-grained resource management feature from the network memory controlling issue. It can be added back in the future if needed, as long as there’s a good way to specify the requirement.

Resource Matching

Currently, ResourceProfile::isMatching uses the following rules (hereinafter, loose matching) to decide whether a slot resource can fulfill a given resource requirement, in both SlotManager and SlotPool:

  • An unspecified requirement (ResourceProfile::UNKNOWN) can be fulfilled by any resource.
  • A specified requirement can be fulfilled by any resource that is greater than or equal to itself. Note that this rule currently has no effect, since no requirement is specified at the moment.
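The two loose matching rules can be sketched as a single predicate. This is a minimal illustration, not Flink's implementation: `SimpleProfile` is a hypothetical stand-in for ResourceProfile that models only two dimensions (CPU cores and memory), and "greater than or equal" is assumed to mean that every dimension is at least as large.

```java
/** Hypothetical, simplified stand-in for Flink's ResourceProfile (two dimensions only). */
final class SimpleProfile {
    /** Sentinel for an unspecified requirement, mirroring ResourceProfile::UNKNOWN. */
    static final SimpleProfile UNKNOWN = new SimpleProfile(-1.0, -1);

    final double cpuCores;
    final int memoryMb;

    SimpleProfile(double cpuCores, int memoryMb) {
        this.cpuCores = cpuCores;
        this.memoryMb = memoryMb;
    }

    /** Loose matching: can this slot resource fulfill the given requirement? */
    boolean isMatchingLoosely(SimpleProfile required) {
        // Rule 1: an unspecified requirement can be fulfilled by any resource.
        if (required == UNKNOWN) {
            return true;
        }
        // Rule 2 (assumed per-dimension): the slot must be >= the requirement everywhere.
        return cpuCores >= required.cpuCores && memoryMb >= required.memoryMb;
    }
}
```

For example, under these assumptions a slot of 2 cores and 1024 MB fulfills a requirement of 1 core and 512 MB, but not one of 1 core and 2048 MB.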

The loose matching rules were designed before dynamic slot allocation was introduced. Under the assumption that the resources of slots are decided when the TM is started and cannot be changed, the loose matching rules have the following advantages.

  • For standalone deployments, it allows slot requests to be fulfilled even though the slots of pre-launched TMs rarely have exactly the required resources.
  • For active resource manager deployments, it increases the chance of slots being reused, thus reducing the cost of starting new TMs for various resource requirements.

With dynamic slot allocation introduced in FLIP-56, the benefits of the loose matching rules have been significantly reduced. Since slots can be created dynamically after the TMs are started, with any desired resources as long as they are available, the only benefit the loose matching rules retain is avoiding the allocation of new slots when existing slots can be reused on the JM side. This benefit is insignificant, since there’s no need to start new TMs.

On the other hand, the loose matching rules also introduce some problems.

  • Reusing larger slots for fulfilling smaller requirements can harm resource utilization.
  • It’s not straightforward to always find a feasible matching (assuming there is one) when matching a set of requirements against a set of slots, as happens during job failovers or with the declarative slot allocation protocol.

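[Figure: requirements A and B matched against slots X and Y under the loose matching rules. Left: a feasible matching. Right: A is fulfilled by Y, leaving B and X unmatched.]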

The above figure demonstrates how the loose matching rules can fail to find a feasible matching. Assume there are two resource requirements, A and B, and two slots, X and Y. The number below each requirement or slot represents its amount of resource. A can be fulfilled by either X or Y, while B can only be fulfilled by Y. A feasible matching is shown on the left, where both requirements are fulfilled. However, the loose matching rules can also produce the matching shown on the right, where A is fulfilled by Y, leaving B and X unmatched.
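The failure mode can be reproduced with a small first-fit matcher. This is only a sketch under stated assumptions: the figure's actual numbers are not available here, so the amounts A=1, B=2, X=1, Y=2 are made up to satisfy the description, and `GreedyLooseMatcher` is a hypothetical helper rather than Flink's matching logic.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical first-fit matcher using the loose rule "slot >= requirement". */
final class GreedyLooseMatcher {
    /**
     * Assigns each requirement, in insertion order, to the first free slot that is
     * large enough. Returns requirement name -> slot name for matched requirements.
     */
    static Map<String, String> match(Map<String, Integer> requirements, Map<String, Integer> slots) {
        Map<String, String> assignment = new LinkedHashMap<>();
        List<String> freeSlots = new ArrayList<>(slots.keySet());
        for (Map.Entry<String, Integer> req : requirements.entrySet()) {
            Iterator<String> it = freeSlots.iterator();
            while (it.hasNext()) {
                String slot = it.next();
                if (slots.get(slot) >= req.getValue()) { // loose matching
                    assignment.put(req.getKey(), slot);
                    it.remove(); // the slot is no longer free
                    break;
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        Map<String, Integer> requirements = new LinkedHashMap<>();
        requirements.put("A", 1); // assumed amounts
        requirements.put("B", 2);

        Map<String, Integer> slotsXThenY = new LinkedHashMap<>();
        slotsXThenY.put("X", 1);
        slotsXThenY.put("Y", 2);

        Map<String, Integer> slotsYThenX = new LinkedHashMap<>();
        slotsYThenX.put("Y", 2);
        slotsYThenX.put("X", 1);

        System.out.println(match(requirements, slotsXThenY)); // {A=X, B=Y}
        System.out.println(match(requirements, slotsYThenX)); // {A=Y}  (B is left unmatched)
    }
}
```

Both runs use the same requirements and slots; only the order in which slots are considered differs, which is enough to lose the feasible matching.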

Given the reduced benefits of the loose matching rules and the problems they introduce, we propose in FLINK-20864 to replace them with the following exact matching rules.

  • An unspecified requirement (ResourceProfile::UNKNOWN) can only be fulfilled by a TM's default slot resource.
  • A specified requirement can only be fulfilled by a resource that is equal to itself.
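The exact matching rules can be sketched in the same style. Again this is an illustration under assumptions, not the FLINK-20864 implementation: `ExactProfile` is a hypothetical two-dimensional stand-in for ResourceProfile, and the TM's default slot resource is passed in explicitly as `tmDefault`.

```java
import java.util.Objects;

/** Hypothetical, simplified stand-in for Flink's ResourceProfile (two dimensions only). */
final class ExactProfile {
    /** Sentinel for an unspecified requirement, mirroring ResourceProfile::UNKNOWN. */
    static final ExactProfile UNKNOWN = new ExactProfile(-1.0, -1);

    final double cpuCores;
    final int memoryMb;

    ExactProfile(double cpuCores, int memoryMb) {
        this.cpuCores = cpuCores;
        this.memoryMb = memoryMb;
    }

    /** Exact matching of a slot against a requirement, given the TM's default slot resource. */
    static boolean isMatchingExactly(ExactProfile required, ExactProfile slot, ExactProfile tmDefault) {
        // Rule 1: an unspecified requirement is only fulfilled by the TM's default slot resource.
        if (required == UNKNOWN) {
            return slot.equals(tmDefault);
        }
        // Rule 2: a specified requirement is only fulfilled by an equal resource.
        return slot.equals(required);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ExactProfile)) {
            return false;
        }
        ExactProfile other = (ExactProfile) o;
        return Double.compare(cpuCores, other.cpuCores) == 0 && memoryMb == other.memoryMb;
    }

    @Override
    public int hashCode() {
        return Objects.hash(cpuCores, memoryMb);
    }
}
```

Because a requirement now accepts only one slot resource, a larger slot can no longer be taken by a smaller requirement, which removes the order dependence of the loose rules.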

Resource Deadlock

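[Figure: a job topology with tasks A, B, C, D and E, scheduled with only 2 available slots. Left: both slots are assigned to Pipeline Region 0. Right: the slots are assigned to A and C, resulting in a deadlock.]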

The above figure demonstrates a potential deadlock caused by scheduling dependencies. For the given topology, the scheduler initially requests 4 slots, for A, B, C and D. Assume only 2 slots are available. If both slots are assigned to Pipeline Region 0 (as shown on the left), A and B finish execution first, then C and D are executed, and finally E is executed. However, if the 2 slots are initially assigned to A and C (as shown on the right), then neither A nor C can finish execution, because B and D, which consume the data they produce, are missing.

Currently, with coarse-grained resource management, the scheduler guarantees that the requirements of one pipeline region are always fulfilled before it starts fulfilling the requirements of another. That means the deadlock case shown on the right of the above figure can never happen.

However, there’s no such guarantee in fine-grained resource management. Since resource requirements for SSGs can differ, there’s no control over which requirements are fulfilled first when there are not enough resources to fulfill all of them. Therefore, it’s not always possible to fulfill one pipeline region before another.

To solve this problem, FLINK-20865 proposes that, for fine-grained resource management, the scheduler defers requesting slots for other SSGs until the requirements of the current SSG are fulfilled, at the price of more scheduling time.
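The deferral idea can be illustrated with a toy model. This is not the FLINK-20865 design itself: `DeferredSlotRequester`, the integer resource units, and the single shared budget are all assumptions made for the sketch. Requirements are visited in scheduling order, and the next one is requested only after the current one has been fulfilled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model of deferring slot requests until the current one is fulfilled. */
final class DeferredSlotRequester {
    /**
     * Walks the requirements in iteration order (assumed to be the scheduling order)
     * and fulfills them one at a time from a shared budget of resource units.
     * Stops at the first requirement that does not fit: later ones remain deferred,
     * even if they are small enough to fit into the remaining budget.
     */
    static List<String> fulfillInOrder(Map<String, Integer> orderedRequirements, int availableUnits) {
        List<String> fulfilled = new ArrayList<>();
        int remaining = availableUnits;
        for (Map.Entry<String, Integer> req : orderedRequirements.entrySet()) {
            if (req.getValue() > remaining) {
                break; // current requirement is unfulfilled, so nothing else is requested
            }
            remaining -= req.getValue();
            fulfilled.add(req.getKey());
        }
        return fulfilled;
    }
}
```

With requirements A=2, B=2, C=1, D=1 (in that order) and a budget of 4 units, this model fulfills A and B, so the first pipeline region can run. With a budget of 3 units, only A is fulfilled and the remaining unit stays idle even though C would fit, which loosely reflects the cost of deferral.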

Rejected Alternatives

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.