Status
Current state: Accepted
Discussion thread: here
JIRA:
...
The clusterMetadata instance used by partition assignors contains replica information for every partition, where each replica's rack is included in its Node if the broker was configured with broker.rack. This KIP also adds rack id for each member's Subscription instance in GroupSubscription.
So a rack-aware partition assignor can match the rack id of the members with the rack id of the replicas to ensure that consumers are assigned partitions in the same rack if possible. In some cases, this may not be possible, for example, if there is a single consumer and one partition that doesn't have a replica in the consumer's rack. In this case, the partition is assigned despite the rack mismatch, which results in cross-rack traffic. The built-in assignors will prioritize balancing partitions over improving locality, so partitions may be allocated to a consumer in a different rack if there aren't sufficient partitions in the same rack as the consumer. The goal is to improve locality for cases where load is uniformly distributed across a large number of partitions.
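The matching described above can be illustrated with a minimal sketch. This is not the algorithm used by the built-in assignors; it is a simplified greedy loop, written under the assumption that members and partitions can be represented as plain maps. The class name RackAwareSketch, the method assign, and the map shapes are all hypothetical and do not exist in Kafka. The sketch only shows the stated priority: balance first, with locality as a tie-breaker.

```java
import java.util.*;

// Hypothetical illustration only; none of these names are part of the Kafka API.
final class RackAwareSketch {
    /**
     * Assigns each partition to one of the currently least-loaded members,
     * preferring a member whose rack hosts a replica of that partition.
     *
     * @param memberRacks  member id -> rack id (null when no rack is configured)
     * @param replicaRacks partition number -> racks of all replicas of that partition
     */
    static Map<String, List<Integer>> assign(Map<String, String> memberRacks,
                                             Map<Integer, Set<String>> replicaRacks) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        for (String member : memberRacks.keySet()) {
            assignment.put(member, new ArrayList<>());
        }
        if (assignment.isEmpty()) {
            return assignment; // no members, nothing to assign
        }
        for (Map.Entry<Integer, Set<String>> partition : new TreeMap<>(replicaRacks).entrySet()) {
            // Balance first: only members with the current minimum load are candidates.
            int minLoad = Integer.MAX_VALUE;
            for (List<Integer> owned : assignment.values()) {
                minLoad = Math.min(minLoad, owned.size());
            }
            String chosen = null;
            for (String member : assignment.keySet()) {
                if (assignment.get(member).size() != minLoad) {
                    continue;
                }
                String rack = memberRacks.get(member);
                boolean sameRack = rack != null && partition.getValue().contains(rack);
                if (chosen == null || sameRack) {
                    chosen = member;
                }
                if (sameRack) {
                    break; // locality is only a tie-breaker among least-loaded members
                }
            }
            // If no candidate shares a rack, the partition is still assigned (cross-rack traffic).
            assignment.get(chosen).add(partition.getKey());
        }
        return assignment;
    }
}
```

With a single member in one rack and a partition whose replicas are all in another rack, this sketch still assigns the partition to that member, which corresponds to the mismatched-rack case mentioned above.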
Rebalance to Improve Locality After Reassignments
Rack-aware partition assignment will use the racks of all partition replicas, including those marked offline or not in the ISR, to ensure that transient states don't result in sub-optimal assignments. But replica racks may change due to reassignments when replicas are added or removed. In this case, the existing assignment may no longer be optimal and the next rebalance may not happen for a long time. To improve locality in this case, the group leader will trigger a rebalance whenever it detects that the set of racks of partition replicas has changed in the metadata. This rebalance will be triggered only if the leader has client.rack configured. Since reassignments that change the set of replica racks of a partition are typically rare, this shouldn't result in frequent rebalances.
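The trigger condition can be sketched as a comparison between the replica racks seen at the last assignment and those in the latest metadata. The snippet below is an assumption-laden illustration, not the consumer's actual implementation: the class RackChangeSketch, the method shouldRebalance, and the use of string keys such as "topic-0" for partitions are hypothetical.

```java
import java.util.*;

// Hypothetical illustration only; none of these names are part of the Kafka API.
final class RackChangeSketch {
    /**
     * Returns true when the group leader should request a rebalance because the
     * set of replica racks differs from the one seen at the last assignment.
     *
     * @param clientRack             the leader's client.rack, empty if not configured
     * @param racksAtLastAssignment  partition -> set of replica racks at the last assignment
     * @param racksInLatestMetadata  partition -> set of replica racks in the current metadata
     */
    static boolean shouldRebalance(Optional<String> clientRack,
                                   Map<String, Set<String>> racksAtLastAssignment,
                                   Map<String, Set<String>> racksInLatestMetadata) {
        if (!clientRack.isPresent()) {
            return false; // rack changes are ignored when the leader has no client.rack
        }
        // Sets are compared by content, so a replica moving between brokers
        // in the same rack does not count as a change.
        return !racksAtLastAssignment.equals(racksInLatestMetadata);
    }
}
```

Comparing sets of racks rather than replica lists keeps the check insensitive to reassignments that leave every partition's rack set unchanged, which is consistent with the expectation that such rebalances stay infrequent.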
Compatibility, Deprecation, and Migration Plan
...
Kafka Streams introduced rack-aware task assignment in KIP-708. Flexible client tags were introduced to implement rack-awareness along with a rack-aware assignor for standby tasks. Tags are more flexible, but since we want to match existing rack configuration in the broker and consumers, it seems better to use rack id directly instead of adding prefixed tags. The next generation consumer group protocol proposed in KIP-848 also uses racks in the protocol.
...