Status

Current state: Under Discussion

Discussion thread: here

JIRA:

Released: 

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

KIP-73 added quotas for replication, but it doesn't separate normal replication traffic from reassignment traffic. A user can specify the partitions and the throttle rate, but the throttle is applied to all non-ISR replication traffic. This is undesirable because if a throttled node falls out of the ISR, the throttle further prevents it from catching up. KIP-455 will make brokers aware of pending reassignments, so we'd be able to separate these two kinds of replication. Moreover, we won't have to manually specify a list of replicas to throttle, because the broker can work out which partitions need to be throttled based on the LeaderAndIsr request.

As part of this KIP we plan to add broker-level dynamic configs that define throttle rates, in bytes per second, for the reassignment-related replication of the leaders and the followers. As a consequence, it will be possible to limit only reassignment traffic so that it doesn't consume too much network bandwidth.

Goals

  • Reassignment traffic shouldn't affect other replication or client traffic, so that those get guaranteed throughput.
  • The administrator should be able to configure this throughput limit dynamically, so that the throttle of an ongoing reassignment can be controlled on the fly.
  • It should be easy to update the quota programmatically, so that management systems built on Kafka can expose it easily.

Public Interfaces

  • Extra configs: two extra configs will be added, called leader.reassignment.throttled.rate and follower.reassignment.throttled.rate. Please see Proposed Changes for more info.
  • Existing configs: leader.replication.throttled.replicas and follower.replication.throttled.replicas would keep their existing behavior and wouldn't be deprecated.
  • Reassignment tool: the tool would use the new configs going forward instead of the old ones.

Proposed Changes

Behavior of Existing Configs

We would keep leader.replication.throttled.replicas and follower.replication.throttled.replicas because there are still scenarios where generic replication throttling is needed. One such case is a bootstrapping broker where a lot of follower partitions need to catch up with their leaders but are using up the bandwidth of the leader partitions on that broker. In this case it is still useful to set follower replication throttling so the leaders can stay in sync. (Furthermore, as a possible improvement to bootstrapping, it might be useful to allow only a subset of follower partitions to replicate at a time, but this is outside the scope of this KIP.)

New Configs

| Config name | Type | Default | Valid values | Importance | Dynamic update mode |
|---|---|---|---|---|---|
| total.replication.throttled.rate | | | | | |
| leader.reassignment.throttled.rate | Long | Long.MAX_VALUE | [1,...] | medium | per-broker |
| follower.reassignment.throttled.rate | Long | Long.MAX_VALUE | [1,...] | medium | per-broker |

It is useful to add this feature on both the leader and the follower side, because throttling only on the leader, for instance, makes it more complicated to calculate the throughput limit on the follower side. For instance, we may have 2 reassigning partitions with an overall limit of 10MB/s configured, and that would mean 20MB/s of used bandwidth on a broker which is replicating from those partitions. Since replicas can scale up to the thousands on a single broker, the complexity of calculating the resulting follower reassignment throughput would increase proportionally.
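The arithmetic behind this example can be sketched as follows. This is only an illustration, assuming that the reassigning partitions have their leaders on different brokers and that every leader broker is throttled at the same rate. The class and method names are hypothetical and not part of Kafka.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Kafka: estimates the worst-case inbound
// reassignment traffic on a follower broker when only the leaders are throttled.
final class LeaderOnlyThrottleEstimate {

    // Assumption: each leader broker may send up to its own throttle rate, so the
    // follower's worst case is the sum over the distinct leader brokers it fetches from.
    static long worstCaseFollowerInboundBytesPerSec(List<Integer> leaderBrokerIds,
                                                    long leaderRateBytesPerSec) {
        long distinctLeaders = leaderBrokerIds.stream().distinct().count();
        return Math.multiplyExact(distinctLeaders, leaderRateBytesPerSec);
    }

    public static void main(String[] args) {
        long tenMegabytes = 10L * 1024 * 1024;
        // Two reassigning partitions whose leaders live on brokers 1 and 2.
        long inbound = worstCaseFollowerInboundBytesPerSec(Arrays.asList(1, 2), tenMegabytes);
        System.out.println(inbound / (1024 * 1024) + " MB/s"); // prints "20 MB/s"
    }
}
```

With thousands of replicas per broker, the list of leader brokers changes with every reassignment, which is why a separate follower-side rate is simpler to reason about than deriving it from the leader-side limits.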

Behavior-wise they'd throttle the addingReplicas of the LeaderAndIsrRequest during reassignment. 
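
As a rough illustration of this rule, the sketch below decides whether a replica's traffic for a partition would count against the reassignment throttle. It assumes the broker remembers the addingReplicas it last received for each partition. The class, method and field names are hypothetical, and this is not Kafka's actual implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Kafka code: tracks the addingReplicas of partitions
// under reassignment and answers whether a given replica should be throttled.
final class ReassignmentThrottleSelector {

    // Key: "topic-partition" string; value: broker ids being added by the reassignment.
    private final Map<String, Set<Integer>> addingReplicasByPartition = new HashMap<>();

    // Called when a LeaderAndIsr update for a partition is processed.
    // An empty set means no reassignment is in progress for that partition.
    void onLeaderAndIsr(String topicPartition, Set<Integer> addingReplicas) {
        if (addingReplicas.isEmpty()) {
            addingReplicasByPartition.remove(topicPartition);
        } else {
            addingReplicasByPartition.put(topicPartition, new HashSet<>(addingReplicas));
        }
    }

    // Only replicas that the reassignment is adding are subject to the reassignment throttle;
    // all other replication traffic is left alone.
    boolean isReassignmentThrottled(String topicPartition, int replicaBrokerId) {
        return addingReplicasByPartition
                .getOrDefault(topicPartition, Collections.emptySet())
                .contains(replicaBrokerId);
    }
}
```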

To change these configs, the user must have the ALTER_CONFIGS privilege on the given cluster, as required by the incrementalAlterConfigs API, which will be the medium for applying the configuration. By adding the new configs we'd also like to remove the related ZooKeeper dependencies in the kafka-reassign-partitions.sh command, so the quota would be applied through the AdminClient API.

Reassignment Tool

The --throttle option's behavior in kafka-reassign-partitions.sh would change, as it would use the new configs going forward. If there is a need to reproduce the old behavior, that would still be possible by calling kafka-configs.sh manually before and after the reassignment to set the correct replication throttling.

Throttling Calculation

The quota calculation method introduced in KIP-73 wouldn't change in principle, but we will apply it only to the replicas being reassigned. This has the benefit that we don't need to change the recommendations in KIP-73.
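
To make the idea of applying a rate limit only to reassigning replicas concrete, here is a deliberately simplified sketch. It uses a single fixed time window and is not the KIP-73 or Kafka quota implementation; the class name, the window handling and the isReassigningReplica flag are all assumptions made for illustration.

```java
// Hypothetical, simplified sketch (not Kafka's quota code): a fixed-window byte
// counter that only accounts for traffic belonging to reassigning replicas.
final class SimpleReassignmentQuota {
    private final long limitBytesPerSec;
    private final long windowMs;
    private long windowStartMs;
    private long bytesInWindow;

    SimpleReassignmentQuota(long limitBytesPerSec, long windowMs, long nowMs) {
        this.limitBytesPerSec = limitBytesPerSec;
        this.windowMs = windowMs;
        this.windowStartMs = nowMs;
    }

    // Bytes are counted only for replicas being added by a reassignment;
    // regular replication traffic is ignored by this quota.
    void record(long bytes, boolean isReassigningReplica, long nowMs) {
        if (!isReassigningReplica) {
            return;
        }
        rollWindow(nowMs);
        bytesInWindow += bytes;
    }

    // True when the rate observed in the current window is above the configured limit.
    // Floating point avoids overflow when the limit is Long.MAX_VALUE (the default).
    boolean isExceeded(long nowMs) {
        rollWindow(nowMs);
        double observedBytesPerSec = bytesInWindow * 1000.0 / windowMs;
        return observedBytesPerSec > limitBytesPerSec;
    }

    // Starts a fresh window once the current one has fully elapsed.
    private void rollWindow(long nowMs) {
        if (nowMs - windowStartMs >= windowMs) {
            windowStartMs = nowMs;
            bytesInWindow = 0;
        }
    }
}
```

With the default of Long.MAX_VALUE the check never reports the quota as exceeded, which matches the intent of an effectively unthrottled default.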

Updating Reassignment Throttling

At this point this KIP doesn't aim to add a new AdminClient RPC call, as the config value can be changed through the IncrementalAlterConfigs API.

Compatibility, Deprecation, and Migration Plan

The only change that needs to be mentioned is the tooling change: the behavior of the --throttle option will change. If for some reason the old behavior is needed, it can be reproduced by calling kafka-configs.sh manually before and after the reassignment with the intended parameters.


