...
1. Formula for determining if Absolute Preferred distribution is possible
Where allInstanceTags is a map of all client instance tags and has a signature of Map<String, Set<String>>
Partially Preferred Standby Task Distribution
...
1. Formula for determining if Partially Preferred distribution is possible
Where allInstanceTags is a map of all client instance tags and has a signature of Map<String, Set<String>>
Assuming active stateful task 0_0 is in Node-1, Partially Preferred standby task distribution will look like this:
...
Compatibility, Deprecation, and Migration Plan
N/A
Rejected Alternatives
...
The initial idea was to introduce two configurations in StreamsConfig: rack.id, which defines the rack of the Kafka Streams instance, and standby.task.assignor, a class that implements the RackAwareStandbyTaskAssignor interface. The signature of RackAwareStandbyTaskAssignor was the following:
```java
public interface RackAwareStandbyTaskAssignor {
    /**
     * Computes desired standby task distribution for a different {@link StreamsConfig#RACK_ID_CONFIG}s.
     * @param sourceTasks - Source {@link TaskId}s with a corresponding rack IDs that are eligible for standby task creation.
     * @param clientRackIds - Client rack IDs that were received during assignment.
     * @return - Map of the rack IDs to set of {@link TaskId}s. The return value can be used by {@link TaskAssignor}
     *           implementation to decide if the {@link TaskId} can be assigned to a client that is located in a given rack.
     */
    Map<String, Set<TaskId>> computeStandbyTaskDistribution(final Map<TaskId, String> sourceTasks,
                                                            final Set<String> clientRackIds);
}
```
By injecting a custom implementation of the RackAwareStandbyTaskAssignor interface, users could hint to Kafka Streams where to allocate certain standby tasks when more complex processing logic was required, for example when parsing a rack.id that combines multiple identifiers (as in the previous examples, which use cluster and zone tags).
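To make the rejected idea concrete, here is a minimal sketch of what such a custom implementation might have looked like. Everything in it is illustrative: the interface is a local stand-in that uses plain String task IDs instead of TaskId so the example compiles without Kafka Streams, the `<cluster>:<zone>` rack.id format is an assumption, and the eligibility rule (a rack may host a standby only if both its cluster and zone differ from the active task's) is just one possible policy.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Stand-in mirroring the rejected interface; task IDs are plain Strings here
// (an assumption) so the sketch compiles without Kafka Streams on the classpath.
interface RackAwareStandbyTaskAssignorSketch {
    Map<String, Set<String>> computeStandbyTaskDistribution(Map<String, String> sourceTasks,
                                                            Set<String> clientRackIds);
}

// Hypothetical implementation: a rack may host a standby for a task only if it
// differs from the active task's rack in both cluster and zone.
class ClusterZoneStandbyAssignor implements RackAwareStandbyTaskAssignorSketch {

    // Assumed rack.id format: "<cluster>:<zone>", e.g. "cluster1:zone-a".
    private static String[] parse(final String rackId) {
        final String[] parts = rackId.split(":", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected <cluster>:<zone>, got " + rackId);
        }
        return parts;
    }

    @Override
    public Map<String, Set<String>> computeStandbyTaskDistribution(final Map<String, String> sourceTasks,
                                                                   final Set<String> clientRackIds) {
        final Map<String, Set<String>> distribution = new HashMap<>();
        for (final String rackId : clientRackIds) {
            final String[] candidate = parse(rackId);
            final Set<String> eligible = new HashSet<>();
            for (final Map.Entry<String, String> task : sourceTasks.entrySet()) {
                final String[] source = parse(task.getValue());
                // Eligible only when both the cluster and the zone differ.
                if (!candidate[0].equals(source[0]) && !candidate[1].equals(source[1])) {
                    eligible.add(task.getKey());
                }
            }
            distribution.put(rackId, eligible);
        }
        return distribution;
    }
}
```

With active task 0_0 on rack `cluster1:zone-a`, this policy would mark only racks such as `cluster2:zone-b` as eligible for its standby. Having to write and maintain parsing logic like this is the kind of burden the configuration-based approach avoids.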
This idea was abandoned because it is easier and more user-friendly to let users control standby task allocation with configuration options alone than to force them to implement a custom interface.
- The second approach was to refactor the TaskAssignor interface to be more user-friendly and expose it as a public interface. Users could then implement custom TaskAssignor logic and set it via StreamsConfig. With this, Kafka Streams users would effectively be in control of active and standby task allocation.
Like the first alternative, this approach was rejected because it is more complex for users than configuration options.
Although there was general agreement that a pluggable TaskAssignor interface would be useful, it was decided to leave it out of this KIP's scope and prepare a separate KIP for that feature.