
Status

Current state: Adopted

Discussion thread: here

JIRA: KAFKA-8834, KAFKA-8835, KAFKA-9059

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

While a reassignment is in progress, the number of replicas for a partition being reassigned temporarily increases beyond the replication factor, because the new replicas are still trying to catch up and are not yet in the ISR. Once all new replicas are in the ISR, the old replicas are removed and the number of replicas again matches the replication factor. Until that point, however, the broker considers the partition "under-replicated", both in its metrics and in the topic command utility, even if the desired replication factor is satisfied throughout the reassignment. This is misleading. It also obscures actual replication problems, because some number of under-replicated partitions is expected whenever a reassignment is in progress, which makes URP metrics difficult to use for alerting. In this KIP, we propose to distinguish the URPs caused by reassignment.

In KIP-455, we gave the leader a way to detect a reassignment. Specifically, the LeaderAndIsr request now has separate fields for the replicas which are being added and those which are being removed. This allows us to compute a more useful metric value.

Proposed Changes

We propose to change the semantics of the "UnderReplicated" metric to take the AddingReplicas into account. Specifically, we will use the following formula:

isUnderReplicated == size(original assigned replicas) - size(isr) > 0

We count a partition as under-replicated if the current ISR is smaller than the original assigned replica set, i.e. the replica set excluding the replicas being added by the reassignment. AddingReplicas that have joined the ISR count toward its size, which makes this metric consistent with the UnderMinIsr criteria. Note that a reassignment may change the number of replicas, but URP satisfaction will not take this into account until the reassignment is complete.
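The formula above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and are not taken from the Kafka code base, and the broker IDs are made up for the example.

```java
import java.util.List;

// Hypothetical helper, not part of Kafka: illustrates the proposed URP check.
final class UnderReplicatedCheck {

    // Mirrors the formula: size(original assigned replicas) - size(isr) > 0.
    static boolean isUnderReplicated(List<Integer> originalReplicas, List<Integer> isr) {
        return originalReplicas.size() - isr.size() > 0;
    }

    public static void main(String[] args) {
        // Assumed scenario: a partition is being moved from brokers [1, 2, 3] to [4, 5, 6].
        List<Integer> original = List.of(1, 2, 3);

        // All original replicas are in sync while the new ones are still catching up.
        System.out.println(isUnderReplicated(original, List.of(1, 2, 3)));

        // An original replica has fallen out of the ISR and nothing has replaced it.
        System.out.println(isUnderReplicated(original, List.of(1, 2)));

        // Broker 3 has fallen out of the ISR, but adding replica 4 has caught up.
        System.out.println(isUnderReplicated(original, List.of(1, 2, 4)));
    }
}
```

Under these assumptions, only the second case is reported as under-replicated. The third case shows why an adding replica that has joined the ISR counts toward the ISR size.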

Similarly, we will change the behavior of the kafka topic command so that `--under-replicated-partitions` returns results consistent with the change above. Because the adding and removing replicas are not visible from the Metadata API, we will use the new ListPartitionReassignments API introduced in KIP-455.
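To illustrate why the topic command needs the reassignment listing, the sketch below assumes that the replica list reported for a partition already includes the replicas being added, so these have to be subtracted before the formula is applied. The class, method, and parameter names are hypothetical and do not reflect the actual tool code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual topic command implementation.
final class TopicCommandUrpSketch {

    // replicas: the full replica list reported for the partition (assumed to
    //           include any replicas that a reassignment is adding).
    // addingReplicas: taken from the reassignment listing; empty if no
    //           reassignment is in progress for the partition.
    static List<Integer> originalReplicas(List<Integer> replicas, List<Integer> addingReplicas) {
        List<Integer> original = new ArrayList<>(replicas);
        original.removeAll(addingReplicas);
        return original;
    }

    // Applies size(original assigned replicas) - size(isr) > 0.
    static boolean isUnderReplicated(List<Integer> replicas,
                                     List<Integer> addingReplicas,
                                     List<Integer> isr) {
        return originalReplicas(replicas, addingReplicas).size() - isr.size() > 0;
    }
}
```

With replicas `[1, 2, 3, 4, 5, 6]`, adding replicas `[4, 5, 6]`, and an ISR of `[1, 2, 3]`, this check does not flag the partition, whereas comparing the ISR against the full replica list would.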

Additionally, we are adding several new metrics to track the progress of an active reassignment. These are described below.

Public Interfaces

As described above, this KIP changes the semantics of `kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions`. Replicas which are being added as part of a reassignment will not count toward this value.

We will also add some additional metrics to improve monitoring for reassignments. The table below shows all of the changes from this KIP.

Metric	Is New	Type	Includes Current Assigned Replicas	Includes Reassigning Replicas
kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions	No	Gauge	Yes	No
kafka.server:type=ReplicaManager,name=ReassigningPartitions	Yes	Gauge	No	Yes
kafka.server:type=ReplicaManager,name=ReassignmentMaxLag	Yes	Gauge	No	Yes
kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesOutPerSec	Yes	Meter	No	Yes
kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesInPerSec	Yes	Meter	No	Yes

Note that the `ReassignmentBytesOutPerSec` and `ReassignmentBytesInPerSec` meters are broker-level metrics. We are not proposing any topic-level metrics for tracking reassignment progress.
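For monitoring tools that read these metrics over JMX, the names in the table can be used directly as JMX object names. The sketch below uses only the standard `javax.management` API. The class and constant names are hypothetical, and reading the gauge through an attribute called `Value` is an assumption about how the broker exposes gauges over JMX, so it should be checked against the running broker.

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper for building JMX object names from the metric names above.
final class ReassignmentMetricNames {
    static final String UNDER_REPLICATED_PARTITIONS =
        "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions";
    static final String REASSIGNING_PARTITIONS =
        "kafka.server:type=ReplicaManager,name=ReassigningPartitions";
    static final String REASSIGNMENT_BYTES_IN_PER_SEC =
        "kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesInPerSec";
    static final String REASSIGNMENT_BYTES_OUT_PER_SEC =
        "kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesOutPerSec";

    // Parses a metric name into an ObjectName, wrapping the checked exception.
    static ObjectName parse(String metricName) {
        try {
            return new ObjectName(metricName);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid metric name: " + metricName, e);
        }
    }

    // Reads a gauge over an existing JMX connection.
    // Assumption: the gauge is exposed through an attribute named "Value".
    static Object readGauge(MBeanServerConnection connection, String metricName) throws Exception {
        return connection.getAttribute(parse(metricName), "Value");
    }
}
```

An alert on genuine replication problems would then poll `UNDER_REPLICATED_PARTITIONS`, while reassignment progress would be tracked separately through `REASSIGNING_PARTITIONS`.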

ReassignmentMaxLag will be implemented separately as it requires some more consideration. The JIRA is linked at the top of the KIP.

Compatibility, Deprecation, and Migration Plan

The main concern from a compatibility perspective is the semantic change to the "UnderReplicated" metric. Users may have to make changes if this metric is used to track the reassignment state. However, we believe that continued misuse of this metric (i.e. not taking reassignment into account) is a more substantial problem.

Rejected Alternatives

We considered leaving the "UnderReplicated" metric with its current semantics and adding a new metric to represent the "under-synchronized" replicas. We ultimately rejected this because we felt it was necessary to address the misuse of the URP metric due to its surprising behavior during a reassignment.
