Status
Current state: "Under Discussion"
Discussion thread: here
JIRA: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
When running Kafka as a service, additional metrics are needed to support health checks so that issues can be identified early.
Public Interfaces
Broker-side metrics
Error rates
There are currently no metrics to report error rates in the broker. It will be useful to monitor errors returned to clients for each request type, so that alerts can be generated if requests fail consistently. Each error code for each request type will be measured separately to provide flexibility in terms of how errors are processed by downstream tools.
This metric will be a meter in the same group as existing request metrics such as RequestsPerSec.
MBean: kafka.network:type=RequestMetrics,name=ErrorsPerSec,request=api_key_name,error=error_code_name
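A monitoring tool could select these meters with a JMX name pattern. The following is a minimal sketch using only the standard javax.management.ObjectName class; the class name ErrorMetricNames, its helper methods, and the sample tag values Produce and NOT_LEADER_FOR_PARTITION are assumptions made for illustration and are not part of this proposal.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class ErrorMetricNames {
    // Builds the MBean name proposed above for one request type and one error code.
    // The tag values passed in are illustrative; the KIP only fixes the name layout.
    static ObjectName errorsPerSec(String request, String error)
            throws MalformedObjectNameException {
        return new ObjectName(
            "kafka.network:type=RequestMetrics,name=ErrorsPerSec,request="
                + request + ",error=" + error);
    }

    // Property-list pattern that matches every ErrorsPerSec meter,
    // regardless of the request and error tags.
    static ObjectName allErrorsPattern() throws MalformedObjectNameException {
        return new ObjectName("kafka.network:type=RequestMetrics,name=ErrorsPerSec,*");
    }

    public static void main(String[] args) throws Exception {
        ObjectName name = errorsPerSec("Produce", "NOT_LEADER_FOR_PARTITION");
        // apply() tests whether a concrete name matches the pattern.
        System.out.println(allErrorsPattern().apply(name));
        // Individual tags can be read back for per-error alerting.
        System.out.println(name.getKeyProperty("request"));
        System.out.println(name.getKeyProperty("error"));
    }
}
```

Because each error code is a separate tag, a downstream tool can alert on one specific error or aggregate across all of them by widening the pattern.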
Fetch down conversion rate
Down conversions are expensive since the whole response has to be read into memory for conversion. It will be useful to monitor the rate of down conversion.
This will be a meter in the same group as existing topic metrics such as TotalFetchRequestsPerSec.
MBean: kafka.server:type=BrokerTopicMetrics,name=FetchDownConversionsPerSec,topic=([-.\w]+)
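The topic tag in the name above is written as a regular expression. As a rough illustration of how a collector might pull the topic out of a reported MBean name, here is a sketch using java.util.regex; the class DownConversionMetricNames, the method topicOf, and the sample topic name are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DownConversionMetricNames {
    // The MBean name from the proposal, with literal dots escaped and the
    // topic captured by the same character class the KIP uses: [-.\w]+
    private static final Pattern NAME = Pattern.compile(
        "kafka\\.server:type=BrokerTopicMetrics,"
            + "name=FetchDownConversionsPerSec,topic=([-.\\w]+)");

    // Returns the topic if the whole string is a per-topic down conversion meter name.
    static Optional<String> topicOf(String mbeanName) {
        Matcher m = NAME.matcher(mbeanName);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String sample = "kafka.server:type=BrokerTopicMetrics,"
            + "name=FetchDownConversionsPerSec,topic=orders.v1";
        System.out.println(topicOf(sample).orElse("<no match>"));
    }
}
```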
ZooKeeper latency
It will be good to monitor latency of ZooKeeper requests so that any issues with ZooKeeper communication can be detected early.
This will be a histogram in the same group as existing ZooKeeper session metrics such as ZooKeeperSyncConnectsPerSec.
MBean: kafka.server:type=SessionExpireListener,name=ZooKeeperLatency
Client-side metrics
Client versions
We currently have an MBean for client version that gives the commit id and version, but this is not exposed as a metric. To plan upgrades and debug issues, it will be useful to have a gauge that reports the versions used by clients.
This will be a Gauge.
MBean: [kafka.admin.client|kafka.consumer|kafka.producer]:type=client-version,commit_id="CommitId",version="Version"
Proposed Changes
New Yammer metrics will be added on the broker side, and new Kafka metrics on the client side.
Compatibility, Deprecation, and Migration Plan
- What impact (if any) will there be on existing users?
These are new metrics and there will be no impact on existing users.