...

Current state: Under Discussion

Discussion thread: here

JIRA: KAFKA-13484

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

When a partition goes offline, it is important to quickly determine which customers or services are impacted, since doing so can reduce the overall disruption. Many sources recommend alerting on the OfflinePartitionsCount metric reported by the active controller, but this metric is not tagged in any way. It can be beneficial, for example, to alert only on a subset of topics or to disable alerting for certain test topics. Further, because topics often correspond to distinct use cases (e.g. metrics, logs), the ability to associate an offline partition with a topic would provide new granularity for debugging, alerting, and SLO/SLA reporting. For these reasons, this KIP proposes tagging the current OfflinePartitionsCount metric with the name of the topic associated with the offline partition(s).

...