Status

Current state: Under Discussion

Discussion thread: here

JIRA: KAFKA-5061

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

It is currently not possible to monitor producer and consumer metrics for individual tasks in Kafka Connect because of JMX MBean naming conflicts. Standard Kafka producer and consumer clients include client.id in their metric names to disambiguate JMX MBeans when multiple instances run in the same JVM. In Kafka Connect, all producer and consumer instances created by a Worker inherit the same client ID from the Worker properties file. Since it is not possible to set a different client ID for each task, any Connector with more than one task running on a Worker node will generate JMX MBean naming conflicts. This makes it impossible to collect per-task Kafka client metrics across a Kafka Connect cluster.
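The conflict can be illustrated with plain JMX object names. The sketch below assumes a producer metrics name of the form kafka.producer:type=producer-metrics,client-id=<client.id>; the helper class, the method name, and the client ID values are hypothetical and chosen only for illustration.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class MBeanNameCollision {
    // Builds an MBean name from a client ID.
    // The name pattern is an assumption used for illustration only.
    static ObjectName producerMetricsName(String clientId) {
        try {
            return new ObjectName("kafka.producer:type=producer-metrics,client-id=" + clientId);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid client ID for an MBean name: " + clientId, e);
        }
    }

    public static void main(String[] args) {
        // Two tasks that inherit the same worker-level client ID map to the same MBean name.
        ObjectName task0 = producerMetricsName("connect-worker");
        ObjectName task1 = producerMetricsName("connect-worker");
        System.out.println("Shared client ID collides: " + task0.equals(task1));

        // Distinct client IDs (hypothetical values) map to distinct MBean names.
        ObjectName unique0 = producerMetricsName("connect-worker-my-connector-0");
        ObjectName unique1 = producerMetricsName("connect-worker-my-connector-1");
        System.out.println("Per-task client IDs collide: " + unique0.equals(unique1));
    }
}
```

Because a JMX MBean server identifies each MBean by its object name, two clients that produce equal names cannot both be observed separately.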

Public Interfaces

Propose adding one new property to the Kafka Connect worker.properties file, which requires changing WorkerConfig.java.

Name: unique.client.id
Doc: If true, the task ID is appended to the client.id used by each Source or Sink task. This avoids name conflicts on JMX MBeans and enables task-level client metrics.
Type: BOOLEAN
Default: false

Proposed Changes

Add a worker configuration option to automatically append the task ID to the client ID used by producer or consumer instances instantiated by a Worker. This ensures all JMX MBean names used within a Kafka Connect cluster are distinct.
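A minimal sketch of the intended behavior follows. It is not the implementation in the linked PR: the class and method names are hypothetical, and the "-" separator and the connector-name-plus-task-index form of the task ID are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class TaskClientIds {
    // Hypothetical helper: derives the client ID for one task from worker properties.
    // Assumes the task ID is rendered as "<connectorName>-<taskIndex>" and joined with "-".
    static String clientIdFor(Map<String, String> workerProps, String connectorName, int taskIndex) {
        String base = workerProps.getOrDefault("client.id", "");
        boolean unique = Boolean.parseBoolean(workerProps.getOrDefault("unique.client.id", "false"));
        if (!unique) {
            // Default (false): existing behavior, every task shares the worker's client ID.
            return base;
        }
        String taskId = connectorName + "-" + taskIndex;
        return base.isEmpty() ? taskId : base + "-" + taskId;
    }

    public static void main(String[] args) {
        Map<String, String> workerProps = new HashMap<>();
        workerProps.put("client.id", "connect-worker");
        workerProps.put("unique.client.id", "true");

        // With the option enabled, each task of the same connector gets its own client ID.
        System.out.println(clientIdFor(workerProps, "my-connector", 0));
        System.out.println(clientIdFor(workerProps, "my-connector", 1));
    }
}
```

With unique.client.id left at its default of false, the helper returns the worker's client ID unchanged, which mirrors the backward-compatible default described in this proposal.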

See the PR for the proposed implementation: https://github.com/apache/kafka/pull/5775

Compatibility, Deprecation, and Migration Plan

The default value is false, which keeps existing behavior unchanged. 

Rejected Alternatives

Several options were proposed in https://issues.apache.org/jira/browse/KAFKA-5061:

"Provide default client IDs based on the worker group ID + task ID (providing uniqueness for multiple connect clusters up to the scope of the Kafka cluster they are operating on)"

This option avoids a configuration change, but it does not maintain backward compatibility: it will alter metric names in existing clusters that have not explicitly overridden the client ID in configuration. NOTE: This would be the simplest option if backward compatibility is not a serious concern. A possible implementation is offered in this PR: https://github.com/apache/kafka/pull/6097

"Allow overriding client.id on a per-connector basis"

This would not allow for per-task monitoring, as all tasks created by a connector would share the same client ID. A related option is to offer connectors more direct control of the client ID, in configuration or code. This would require a much more complex change.
