...

KIP-73 added quotas for replication, but it does not separate normal replication traffic from reassignment traffic. A user can specify the partition and the throttle rate, but the throttle is applied to all non-ISR replication traffic. This is undesirable: if a throttled node falls out of the ISR, the throttle further prevents it from catching up. KIP-455 will make brokers aware of pending reassignments, which makes it possible to separate these two kinds of replication. Moreover, the list of replicas to throttle no longer has to be specified manually, because the broker can determine at runtime which partitions need to be throttled based on the LeaderAndIsr request.

...

Config name | Type | Default | Valid values | Importance | Dynamic update mode
leader.reassignment.throttled.rate.max | Long | Long.MAX_VALUE-1 | [-1,...] | medium | per-broker
follower.reassignment.throttled.rate.max | Long | Long.MAX_VALUE-1 | [-1,...] | medium | per-broker

These new configs control how much reassignment traffic can take place on a broker, on the leader side and on the follower side respectively. Both are upper bounds, so the actual reassignment traffic can be lower, or even zero if there is nothing to reassign. Each one is itself capped by the corresponding leader.replication.throttled.rate or follower.replication.throttled.rate; specifying a higher value results in a configuration error.
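The cap described above can be sketched as a small validation step. This is only an illustration, not the broker's actual implementation: the class name `ReassignmentThrottleValidator`, the method `validate`, and the choice of `IllegalArgumentException` are hypothetical, and -1 is assumed to mean "not set", as in the valid-values column.

```java
// Hypothetical sketch: rejects a reassignment throttle that exceeds the
// replication throttle on the same side (leader or follower).
final class ReassignmentThrottleValidator {
    /** Assumed sentinel for "not set". */
    static final long UNSET = -1L;

    private ReassignmentThrottleValidator() {}

    /**
     * @param replicationRate  value of the replication throttle, or -1 if unset
     * @param reassignmentRate value of the reassignment throttle, or -1 if unset
     * @throws IllegalArgumentException if a rate is below -1, or if both rates
     *         are set and the reassignment rate exceeds the replication rate
     */
    static void validate(long replicationRate, long reassignmentRate) {
        if (replicationRate < UNSET || reassignmentRate < UNSET) {
            throw new IllegalArgumentException("rates must be -1 (unset) or non-negative");
        }
        boolean bothSet = replicationRate != UNSET && reassignmentRate != UNSET;
        // The cap only applies when there is a replication throttle to compare against.
        if (bothSet && reassignmentRate > replicationRate) {
            throw new IllegalArgumentException(
                "reassignment throttle " + reassignmentRate
                    + " exceeds replication throttle " + replicationRate);
        }
    }
}
```

The check is skipped when the replication throttle is unset, since in that case there is no upper bound to compare the reassignment throttle against.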

The possible configuration variations are:

  • replication.throttled.rate is set but reassignment.throttled.rate is not (or is -1): all replication, including reassignment, can use up to replication.throttled.rate bytes in total.
  • replication.throttled.rate and reassignment.throttled.rate are both set: reassignment can use bandwidth up to its configured limit, and other replication can use the remaining rate (so the general replication throttled rate = replication.throttled.rate - reassignment.throttled.rate).
  • replication.throttled.rate is not set but reassignment.throttled.rate is set: general replication has no bandwidth limit, while reassignment is limited to reassignment.throttled.rate.
  • neither replication.throttled.rate nor reassignment.throttled.rate is set (or both are -1): no throttling is applied to any replication.
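The four variations can be summarised as a small decision function. The following is a minimal sketch under stated assumptions, not code from Kafka: the class `EffectiveThrottle`, its fields, and the method `resolve` are hypothetical names, -1 is assumed to mean "not set", and an empty `OptionalLong` stands for "unthrottled".

```java
import java.util.OptionalLong;

// Hypothetical sketch: derives the effective limits for reassignment traffic and
// for other replication traffic from the two configured rates.
final class EffectiveThrottle {
    /** Assumed sentinel for "not set". */
    static final long UNSET = -1L;

    /** Limit for reassignment traffic; empty means unthrottled. */
    final OptionalLong reassignmentLimit;
    /** Limit for all other replication traffic; empty means unthrottled. */
    final OptionalLong otherReplicationLimit;
    /** True when both kinds of traffic draw from one common budget. */
    final boolean sharedBudget;

    private EffectiveThrottle(OptionalLong reassignmentLimit,
                              OptionalLong otherReplicationLimit,
                              boolean sharedBudget) {
        this.reassignmentLimit = reassignmentLimit;
        this.otherReplicationLimit = otherReplicationLimit;
        this.sharedBudget = sharedBudget;
    }

    static EffectiveThrottle resolve(long replicationRate, long reassignmentRate) {
        boolean replicationSet = replicationRate != UNSET;
        boolean reassignmentSet = reassignmentRate != UNSET;

        if (replicationSet && reassignmentSet) {
            // Reassignment gets its own limit; other replication gets the remainder.
            return new EffectiveThrottle(
                OptionalLong.of(reassignmentRate),
                OptionalLong.of(replicationRate - reassignmentRate),
                false);
        }
        if (replicationSet) {
            // Only the general throttle is set: all replication shares it.
            return new EffectiveThrottle(
                OptionalLong.of(replicationRate),
                OptionalLong.of(replicationRate),
                true);
        }
        if (reassignmentSet) {
            // Only reassignment is limited; other replication is unthrottled.
            return new EffectiveThrottle(
                OptionalLong.of(reassignmentRate), OptionalLong.empty(), false);
        }
        // Nothing is set: no throttling at all.
        return new EffectiveThrottle(OptionalLong.empty(), OptionalLong.empty(), false);
    }
}
```

For example, under these assumptions `resolve(100, 30)` yields a reassignment limit of 30 and a limit of 70 for other replication, whereas `resolve(100, -1)` yields a single shared budget of 100.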

It is useful to add this feature on both the leader and the follower side, because throttling only on the leader, for instance, makes it more complicated to calculate the throughput limit on the follower side. For example, with 2 reassigning partitions and an overall limit of 10MB/s configured, a broker that is replicating from those partitions could see 20MB/s of used bandwidth. Since the number of replicas on a single broker can scale up to the thousands, the complexity of calculating the resulting follower reassignment rate would increase proportionally.

...