...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

A related work is KIP-73, which added quotas for replication, but it doesn't separate normal replication traffic from reassignment traffic. A user can specify the partition and the throttle rate, but the throttle is applied to both ISR and non-ISR traffic. This is undesirable because if a throttled node falls out of the ISR, the throttle further prevents it from catching up. KIP-455 will make brokers aware of pending reassignments, so we'd be able to separate these two kinds of replication. Moreover, we won't have to manually specify a list of replicas to throttle, because the broker would be able to figure out which partitions need to be throttled based on the LeaderAndIsr request.

As part of this KIP we plan to add broker level dynamic configs that would define throttle rates, in bytes/second, for the reassignment related replication of the leaders and the followers. As a consequence it will be possible to limit only reassignment traffic, so that it won't consume too much network bandwidth.
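To make the intended behaviour concrete, here is a minimal sketch of a throttle that charges only reassignment traffic against a bytes/second budget and lets normal replication through untouched. It is an illustration under stated assumptions, not the broker's implementation: the class `ReassignmentThrottle`, its fields, and the fixed one-second window are all hypothetical, and the caller is assumed to already know whether a given transfer belongs to a reassignment.

```python
from dataclasses import dataclass


@dataclass
class ReassignmentThrottle:
    """Hypothetical byte budget per one-second window.

    Only traffic flagged as reassignment is charged against the budget;
    ordinary replication always passes.
    """
    rate_bytes_per_sec: int
    window_start: float = 0.0
    used: int = 0

    def allow(self, nbytes: int, is_reassignment: bool, now: float) -> bool:
        if not is_reassignment:
            # Normal replication is never limited by this throttle.
            return True
        if now - self.window_start >= 1.0:
            # Start a fresh window and reset the consumed budget.
            self.window_start = now
            self.used = 0
        if self.used + nbytes > self.rate_bytes_per_sec:
            # Budget exhausted for this window: the caller should delay.
            return False
        self.used += nbytes
        return True


# One limiter per direction, mirroring the separate leader and follower rates.
# The numbers are arbitrary example values.
leader_throttle = ReassignmentThrottle(rate_bytes_per_sec=10_000_000)
follower_throttle = ReassignmentThrottle(rate_bytes_per_sec=10_000_000)
```

Because the budget is a plain field, changing the rate on a live object is enough to model a dynamic config update taking effect during an ongoing reassignment.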

Goals

  • Reassignment traffic shouldn't affect other replication or client traffic, so that those keep a guaranteed throughput.
  • The throughput limit should be dynamically configurable by the administrator, so that the throttle of an ongoing reassignment can be adjusted on the fly.
  • It should be easy to update the quota programmatically, so that management systems built on Kafka can expose it easily.

...