...

(2) Under-replicated partitions occur whenever the inSyncReplicas set is not equal to the allReplicasMap, which can happen when we have:

  • Repartitioning
  • Broker restarts
  • Transient network issues
  • Broker failures
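
As a minimal sketch of the condition described above (not the broker's actual code), the check can be written as a comparison between the assigned replica set and the in-sync replica set. The class name, method name, and the use of integer broker IDs are assumptions made for illustration only.

```java
import java.util.Set;

// Hypothetical illustration; names do not come from the Kafka code base.
final class UnderReplicationSketch {

    // A partition counts as under-replicated when its in-sync replica set
    // differs from the full set of assigned replicas.
    static boolean isUnderReplicated(Set<Integer> assignedReplicas,
                                     Set<Integer> inSyncReplicas) {
        return !inSyncReplicas.equals(assignedReplicas);
    }

    public static void main(String[] args) {
        Set<Integer> assigned = Set.of(1, 2, 3);

        // All replicas are in sync: not under-replicated.
        System.out.println(isUnderReplicated(assigned, Set.of(1, 2, 3)));

        // Broker 2 has dropped out of the ISR (restart, network issue, or
        // failure): under-replicated, with no indication of the cause.
        System.out.println(isUnderReplicated(assigned, Set.of(1, 3)));
    }
}
```

The second call returns the same result whether broker 2 is restarting or has failed, since the check sees only the two sets and not the reason they differ.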


The current categorization of topic partitions has a gap: UnderReplicatedPartitions does not tell operators whether the reduced ISR set is intentional (repartitioning or restarts) or a sign that something is wrong, such as a broker that has completely failed. This makes alerting on UnderReplicatedPartitions difficult. The alert may be too noisy to be effective, and increasing the number of samples needed to trigger it increases the time to detect failures.

...