Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

Status

Current state: Under Discussion Accepted

Discussion thread: here

JIRA:

Jira
serverASF JIRA
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyKAFKA-14352

...

The clusterMetadata instance used by partition assignors contains replica information for every partition, where each replica's rack is included in their Node if the broker was configured with broker.rack. This KIP also adds rack id for each member's Subscription  instance in GroupSubscription. So a rack-aware partition assignor can match the rack id of the members with the rack id of the replicas to ensure that consumers are assigned partitions in the same rack if possible. In some cases, this may not be possible, for example, if there is a single consumer and one partition which doesn't have a replica in the same rack. In this case the partition is assigned with mismatched racks and will result in cross-rack traffic. The built-in assignors will prioritize balancing partitions over improving locality, so in some cases, partitions may be allocated to a consumer in a different rack if there aren't sufficient partitions in the same rack as the consumer. The goal will be to improve locality for cases where load is uniformly distributed across a large number of partitions.

Rebalance to Improve Locality After Reassignments

Rack-aware partition assignment will use racks of all partition replicas including those marked offline or not in the ISR to ensure that transient states don't result in sub-optimal assignments. But replica racks may change due to reassignments when replicas are added or removed. In this case, the existing assignment may no longer be optimal and the next rebalance may not happen for a long time. To improve locality in this case, leader will trigger rebalance whenever it detects that the set of racks of partition replicas have changed in the metadata. This rebalance will be triggered only if the leader has client.rack  configured. Since reassignments that change the set of replica racks of a partition are rare typically, this shouldn't result in frequent rebalances.

Compatibility, Deprecation, and Migration Plan

...

Kafka Streams introduced rack-aware rack assignment in KIP-708. Flexible client tags were introduced to implement rack-awareness along with a rack-aware assignor for standby tasks. Tags are more flexible, but since we want to match existing rack configuration in the broker and consumers, it seems better to use rack id directly instead of adding prefixed tags. The next generation consumer group protocol proposed in KIP-848 also uses racks in the protocol.

...