...

  • AdminUtils.assignReplicasToBrokers is updated to create a rack aware assignment based on the broker to rack mapping passed into it. If the map is empty, it treats this as no rack information being available and produces the same assignment as the current implementation. When making the rack aware assignment, it tries to keep the following properties:
    • Even distribution of replicas among brokers
    • Even distribution of partition leadership among brokers
    • Assign to as many racks as possible. That means if the number of racks is greater than or equal to the number of replicas, each rack will have at most one replica. On the other hand, if the number of racks is less than the number of replicas (which should happen very infrequently), each rack should have at least one replica and no other guarantees are made on how the replicas will be distributed among racks. For example, if there are 2 racks and 4 replicas, a given rack can have 1, 2 or 3 replicas. This keeps the algorithm simple while preserving the other replica distribution properties and the fault tolerance provided by the racks.
  • Implementation detail of the rack aware assignment (see more in the pull request https://github.com/apache/kafka/pull/132):
    • Before doing the rack aware assignment, sort the broker list so that brokers are interlaced by rack. In other words, adjacent brokers in the sorted list should not be in the same rack if possible. For example, assuming 6 brokers mapping to 3 racks: 0 -> "rack1", 1 -> "rack1", 2 -> "rack2", 3 -> "rack2", 4 -> "rack3", 5 -> "rack3", the sorted broker list could be (0, 2, 4, 1, 3, 5).
    • Apply the same assignment algorithm to assign replicas, with the addition of skipping a broker if its rack is already used for the same partition
  • KafkaApis will initialize the RackLocator if configured and call the new AdminUtils.assignReplicasToBrokers API with the broker to rack mapping obtained from RackLocator. This will ensure a rack aware assignment. RackLocator.getRackInfo() will be called every time a topic needs to be created (when auto topic creation is enabled) to ensure the latest broker-rack mapping is used.
  • TopicCommand and ReassignPartitionsCommand will initialize the RackLocator (if enabled from the command line) and call the new AdminUtils.assignReplicasToBrokers API with the broker to rack mapping obtained from RackLocator. Rack aware assignment will be used for topic creation, adding partitions and partition reassignment.
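
The two implementation steps described above (interlacing brokers by rack, then skipping brokers whose rack is already used) can be illustrated with a minimal Python sketch. This is not the code in AdminUtils or in the pull request: the function names interlace_brokers_by_rack and assign_replicas are hypothetical, the start index is fixed rather than randomized, and the fallback for an empty broker to rack map is not covered. Once every rack holds a replica, the sketch simply takes the next unused broker, which is one possible reading of "no other guarantees".

```python
from collections import defaultdict
from itertools import zip_longest


def interlace_brokers_by_rack(broker_rack):
    """Order brokers so that neighbours in the list sit in different racks
    where possible (hypothetical helper, not the Kafka implementation)."""
    by_rack = defaultdict(list)
    for broker in sorted(broker_rack):
        by_rack[broker_rack[broker]].append(broker)
    # One column per rack; reading row by row takes one broker from each rack.
    columns = [by_rack[rack] for rack in sorted(by_rack)]
    ordered = []
    for row in zip_longest(*columns):
        ordered.extend(b for b in row if b is not None)
    return ordered


def assign_replicas(broker_rack, num_partitions, replication_factor, start_index=0):
    """Simplified rack aware assignment: round-robin leaders over the
    interlaced list, then walk forward and skip brokers in used racks."""
    brokers = interlace_brokers_by_rack(broker_rack)
    n = len(brokers)
    if replication_factor > n:
        raise ValueError("replication factor exceeds number of brokers")
    num_racks = len(set(broker_rack.values()))

    assignment = {}
    for partition in range(num_partitions):
        leader_pos = (start_index + partition) % n
        replicas = [brokers[leader_pos]]
        used_racks = {broker_rack[brokers[leader_pos]]}
        offset = 1
        while len(replicas) < replication_factor:
            candidate = brokers[(leader_pos + offset) % n]
            offset += 1
            if candidate in replicas:
                continue  # never place two replicas on the same broker
            rack = broker_rack[candidate]
            # Skip a broker in an already-used rack while unused racks remain.
            if rack in used_racks and len(used_racks) < num_racks:
                continue
            replicas.append(candidate)
            used_racks.add(rack)
        assignment[partition] = replicas
    return assignment


if __name__ == "__main__":
    mapping = {0: "rack1", 1: "rack1", 2: "rack2",
               3: "rack2", 4: "rack3", 5: "rack3"}
    print(interlace_brokers_by_rack(mapping))
    for partition, replicas in assign_replicas(mapping, 6, 3).items():
        print(partition, replicas)
```

With the 6 broker, 3 rack mapping from the example, the interlaced order is (0, 2, 4, 1, 3, 5), and with 6 partitions and 3 replicas each partition lands on three different racks while every broker leads exactly one partition.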

...