
...

Being able to centrally monitor and troubleshoot problems with Kafka clients is becoming increasingly important as the use of Kafka expands, both within organisations and in hosted Kafka services. The typical Kafka client user is now an application owner with little experience in operating Kafka clients, while the cluster operator has profound Kafka knowledge but little insight into the client application.

Troubleshooting Kafka problems is currently an organisationally complex issue, with different teams or even organisations running the client applications and the brokers. While some organisations may already have custom collection of existing client metrics in place, most do not, and metrics are typically not available when the problem needs to be analysed. Enabling metrics after the fact may not be possible without a code change to the application, or at least a restart, which typically means the required metrics data is lost.

While the broker already tracks request-level metrics for connected clients, there is a gap in end-to-end monitoring when it comes to visibility into client internals, such as queue sizes, internal latencies, error counts, and application behaviour (for example, message processing rate). These are Kafka client metrics, not application metrics.

...

One of the key goals of this KIP is to have the proposed metrics and telemetry interface generally available and enabled by default in all of the mainstream Kafka clients, allowing troubleshooting and monitoring as needed without interaction from cluster end-users. While metrics are to be enabled by default on the clients, the brokers still need to be configured with a metrics plugin, and metrics subscriptions must be configured on the cluster before any metrics are sent and collected.
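
The conditions in the paragraph above can be illustrated with a small, self-contained model. This is only a sketch of the gating logic as described here, not Kafka code: the `Client` and `Broker` classes, the `metrics_to_push` function, the prefix-matching rule, and the metric names are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    # Hypothetical model, not a Kafka API: is a metrics plugin configured?
    has_metrics_plugin: bool = False
    # Hypothetical: metric-name prefixes the cluster operator has subscribed to.
    subscriptions: list[str] = field(default_factory=list)


@dataclass
class Client:
    # Per the text, metrics are enabled by default on the client.
    metrics_enabled: bool = True


def metrics_to_push(client: Client, broker: Broker, available: list[str]) -> list[str]:
    """Return the metric names that would actually be sent and collected."""
    # Nothing is sent unless the client allows it and the broker has a plugin.
    if not client.metrics_enabled or not broker.has_metrics_plugin:
        return []
    # Only metrics matching at least one subscription are sent.
    return [m for m in available if any(m.startswith(p) for p in broker.subscriptions)]
```

In this model, a default client paired with an unconfigured broker sends nothing; metrics start flowing only once the operator has both installed a plugin and added a subscription, which is the point at which no action from the application owner is needed.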

User privacy is an important concern, and extra care is taken in this proposal not to expose any information that may compromise the privacy of the client user.

...