Status
Current state: Voting Open (vote thread)
Discussion thread: here
JIRA: KAFKA-5061
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
It is currently not possible to monitor producer and consumer metrics for individual tasks in Kafka Connect due to JMX MBean naming conflicts. Standard Kafka producer and consumer clients use client.id in metric names to disambiguate JMX MBeans when multiple instances are running in the same JVM. In Kafka Connect, all producer and consumer instances created by a Worker inherit the same client ID from the Worker properties file. Since it is not possible to set a different client ID for each task, any Connector with more than one task running on a Worker node will generate JMX MBean naming conflicts. This makes it impossible to collect per-task Kafka client metrics across a Kafka Connect cluster.
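The collision can be illustrated with plain JMX object names. The sketch below is not Kafka Connect code: the class name, the helper method, and the client ID value "connect-worker" are assumptions made for illustration, and the object name pattern follows the documented consumer-metrics MBean naming. It shows that two tasks sharing one client.id map to the same MBean name, while distinct IDs do not.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Illustrative only: class and method names are hypothetical.
final class MBeanNameCollision {
    private MBeanNameCollision() {}

    // Builds a consumer-metrics MBean name keyed by client.id.
    static ObjectName consumerMetricsName(String clientId) {
        try {
            return new ObjectName("kafka.consumer:type=consumer-metrics,client-id=" + clientId);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid client id: " + clientId, e);
        }
    }

    public static void main(String[] args) {
        // Two tasks inheriting the same (assumed) worker-level client.id.
        ObjectName task0 = consumerMetricsName("connect-worker");
        ObjectName task1 = consumerMetricsName("connect-worker");
        System.out.println("shared id collides: " + task0.equals(task1));

        // Per-task IDs produce distinct MBean names.
        ObjectName a = consumerMetricsName("connector-consumer-conn1-0");
        ObjectName b = consumerMetricsName("connector-consumer-conn1-1");
        System.out.println("per-task ids collide: " + a.equals(b));
    }
}
```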
Public Interfaces
This KIP proposes updating the default client.id used by Kafka clients created by Kafka Connect worker tasks. The new default will include the current connector ID and task ID.
Proposed Changes
PR Available here: https://github.com/apache/kafka/pull/6097
Consumers created for sink tasks will have a default client.id of the form: connector-consumer-{connectorId}-{taskId}
For example, for connector "conn1" and task "2", the default client ID would become: connector-consumer-conn1-2
Producers created for source tasks will have a default client.id of the form: connector-producer-{connectorId}-{taskId}
Dead-letter queue producers created for sink tasks will have a default client.id of the form: connector-dlq-producer-{connectorId}-{taskId}
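The three naming patterns above can be sketched as a small helper. This is a minimal illustration, not the code from the linked PR: the class name ConnectClientIds and its method names are hypothetical, and only the string formats are taken from this proposal.

```java
// Hypothetical helper; names are illustrative and not part of Kafka Connect's API.
final class ConnectClientIds {
    private ConnectClientIds() {}

    // Default client.id for the consumer of a sink task.
    static String sinkConsumer(String connectorId, int taskId) {
        return "connector-consumer-" + connectorId + "-" + taskId;
    }

    // Default client.id for the producer of a source task.
    static String sourceProducer(String connectorId, int taskId) {
        return "connector-producer-" + connectorId + "-" + taskId;
    }

    // Default client.id for the dead-letter queue producer of a sink task.
    static String dlqProducer(String connectorId, int taskId) {
        return "connector-dlq-producer-" + connectorId + "-" + taskId;
    }

    public static void main(String[] args) {
        System.out.println(sinkConsumer("conn1", 2));   // connector-consumer-conn1-2
        System.out.println(sourceProducer("conn1", 2)); // connector-producer-conn1-2
        System.out.println(dlqProducer("conn1", 2));    // connector-dlq-producer-conn1-2
    }
}
```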
Compatibility, Deprecation, and Migration Plan
The change will affect any existing cluster where client.id has not been overridden in the worker configuration. Since the current default is not useful for JMX monitoring, the change should have minimal impact.
Any client IDs specified in the worker configuration via producer.client.id or consumer.client.id properties will remain unchanged, as those will take precedence.
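The precedence rule can be sketched as a lookup that falls back to the new default only when the worker configuration sets no explicit value. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, the worker configuration is modelled as a plain map, and the example value "my-consumer" is invented for the demonstration.

```java
import java.util.Map;

// Hypothetical sketch of the precedence rule; not the actual Kafka Connect implementation.
final class ClientIdPrecedence {
    private ClientIdPrecedence() {}

    // An explicit consumer.client.id in the worker properties wins over the per-task default.
    static String resolveConsumerClientId(Map<String, String> workerProps,
                                          String connectorId, int taskId) {
        String defaultId = "connector-consumer-" + connectorId + "-" + taskId;
        return workerProps.getOrDefault("consumer.client.id", defaultId);
    }

    public static void main(String[] args) {
        // No override: the new per-task default applies.
        System.out.println(resolveConsumerClientId(Map.of(), "conn1", 2));

        // Override present: the configured value is left unchanged.
        Map<String, String> overridden = Map.of("consumer.client.id", "my-consumer");
        System.out.println(resolveConsumerClientId(overridden, "conn1", 2));
    }
}
```

Note that a worker-level override is shared by every task on that worker, so clusters that set it keep their existing behaviour, including the shared MBean names.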
Rejected Alternatives
Add a worker configuration option to automatically append the task ID to the client ID used by producer or consumer instances instantiated by a Worker. This would ensure that all JMX MBean names used within a Kafka Connect cluster are distinct.
This approach was rejected because adding new configuration options is considered a larger change to public interfaces than necessary.
Allow overriding client.id on a per-connector basis
This option does not give the required level of granularity: individual tasks would still have name conflicts with other tasks created by the same connector. This could be avoided if the connector could control the client ID at the task level, but that would require a much more complex change.