...

Code Block
languagejava
public static final String TASK_ASSIGNMENT_RACK_AWARENESS_CONFIG = "task.assignment.rack.awareness";
public static final String TASK_ASSIGNMENT_RACK_AWARENESS_DOC = "List of client tag keys used to distribute standby replicas across Kafka Streams instances." +
                                                                " When configured, Kafka Streams will make a best effort to distribute" +
                                                                " the standby tasks over each client tag dimension.";

...

Info
The standby task distribution algorithm is not specified in this KIP but is left as an implementation detail. However, every distribution algorithm must gracefully handle the case where the ideal standby task distribution is not possible; in that case, Kafka Streams must not fail the assignment but instead try to find the next most optimal distribution. An ideal distribution means that there is no repeated client dimension among the client assigned the active task and all clients assigned its standby tasks.

Benefits of tags vs single rack.id configuration

Defining multiple client.tag entries in combination with task.assignment.rack.awareness gives more flexibility, which otherwise would only have been possible with pluggable custom logic that the Kafka Streams user must provide (this is briefly described in the "Rejected Alternatives" section).

For instance, if we append multiple tags to form a single rack, the resulting distribution may not be the one the user desires when the infrastructure topology is more complex. Consider the following example, where multiple tags are appended to form a single rack.


Code Block
Node-1:
rack.id: K8s_Cluster1-eu-central-1a
num.standby.replicas: 1

Node-2:
rack.id: K8s_Cluster1-eu-central-1b
num.standby.replicas: 1

Node-3:
rack.id: K8s_Cluster1-eu-central-1c
num.standby.replicas: 1

Node-4:
rack.id: K8s_Cluster2-eu-central-1a
num.standby.replicas: 1

Node-5:
rack.id: K8s_Cluster2-eu-central-1b
num.standby.replicas: 1

Node-6:
rack.id: K8s_Cluster2-eu-central-1c
num.standby.replicas: 1


In the example above, we have three availability zones and two Kubernetes clusters. Our use case is to place the standby task in a different Kubernetes cluster and a different availability zone than the active task. For instance, if the active task is on Node-1 (K8s_Cluster1-eu-central-1a), the corresponding standby task should be either on Node-5 (K8s_Cluster2-eu-central-1b) or on Node-6 (K8s_Cluster2-eu-central-1c).

Unfortunately, without custom logic provided by the user, this would be very hard to achieve with a single rack.id configuration, because without any additional input Kafka Streams might just as well allocate the standby task for the active task either:

  • In the same Kubernetes cluster but a different AZ (Node-2, Node-3)
  • In a different Kubernetes cluster but the same AZ (Node-4)

On the other hand, with the combination of the new "client.tag.*" and "task.assignment.rack.awareness" configurations, the standby task distribution algorithm can determine the most optimal distribution by balancing the standby tasks over each client.tag dimension individually. This can be achieved simply by providing the necessary configurations to Kafka Streams, as sketched below.
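
As an illustration only (the tag keys cluster and zone and their values are example names, not prescribed by this KIP), the six-node topology from the example above could be expressed with client tags roughly as follows:

Code Block
Node-1:
client.tag.cluster: K8s_Cluster1
client.tag.zone: eu-central-1a

Node-2:
client.tag.cluster: K8s_Cluster1
client.tag.zone: eu-central-1b

Node-3:
client.tag.cluster: K8s_Cluster1
client.tag.zone: eu-central-1c

Node-4:
client.tag.cluster: K8s_Cluster2
client.tag.zone: eu-central-1a

Node-5:
client.tag.cluster: K8s_Cluster2
client.tag.zone: eu-central-1b

Node-6:
client.tag.cluster: K8s_Cluster2
client.tag.zone: eu-central-1c

With every node additionally configured with task.assignment.rack.awareness: cluster,zone and num.standby.replicas: 1, an assignor that balances each tag dimension individually can place the standby for an active task running on Node-1 on either Node-5 or Node-6, since those nodes differ from Node-1 in both the cluster and the zone dimension.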


Changes in HighAvailabilityTaskAssignor

...