...

An important part of deploying Kafka Connect is monitoring the health of the workers in a cluster and the connectors and tasks that have been deployed to the cluster. The Kafka Connect framework currently has only a few metrics, which capture the number of connectors and tasks for each worker, so we propose adding metrics that report more information about the connectors, tasks, and workers. All metrics reported by each worker are scoped by the activities within that worker.

Two things are out of scope for this proposal, though they may be addressed in future KIPs. First, this proposal expressly avoids changes to the Connect API, and therefore does not address how connector implementations can define their own connector-specific metrics. Second, Kafka Connect does not have any existing mechanism to aggregate the metrics reported by each worker, and therefore any such aggregation is out of scope for this KIP.

Public Interfaces

All of the following metrics will be added via Kafka's metrics library, like most of the metrics in the Kafka brokers and other components. The context of all metrics is limited to the worker where the metrics are being reported, and all metrics include the worker ID in the MBean attribute (similar to how Kafka producer and consumer metrics include the client ID).
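Because the worker ID is part of every MBean name, a monitoring tool can attribute each metric to its worker by reading that key. The following is a minimal sketch, assuming the names follow the usual JMX `domain:key=value,...` form. The helper `mbean_properties` and the sample worker, connector, and task values are hypothetical and not part of this proposal.

```python
def mbean_properties(mbean_name):
    """Split a JMX-style name 'domain:k1=v1,k2=v2' into (domain, {k: v}).

    Simplification: assumes no quoted values containing ',' or '='.
    """
    domain, _, prop_list = mbean_name.partition(":")
    props = dict(item.split("=", 1) for item in prop_list.split(","))
    return domain, props


# Hypothetical name; the worker, connector, and task values are made up.
name = (
    "kafka.connect:type=source-task-metrics,name=poll-time-percentage,"
    "worker=worker-1,connector=my-source,task=0"
)
domain, props = mbean_properties(name)
print(domain, props["worker"])
```

Grouping parsed names by `props["worker"]` keeps each worker's metrics separate, which matches the per-worker scope described above.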

Source Task Metrics

| Metric Name | Description | MBean attribute |
| --- | --- | --- |
| source record produce rate | The number of records per second produced (before transformation) by this task belonging to the named source connector in this worker. | kafka.connect:type=source-task-metrics,name=source-record-produce-rate,worker=([-.\w]+),connector=([-.\w]+),task=([\d]+) |
| source record produce total | The total number of records produced (before transformation) by this task belonging to the named source connector in this worker. | kafka.connect:type=source-task-metrics,name=source-record-produce-total,worker=([-.\w]+),connector=([-.\w]+),task=([\d]+) |
| source record write rate | The number of records per second output from the transformations and written to Kafka for this task belonging to the named source connector in this worker. This is after transformation and excludes any records filtered out by the transformations. | kafka.connect:type=source-task-metrics,name=source-record-write-rate,worker=([-.\w]+),connector=([-.\w]+),task=([\d]+) |
| source record write total | The total number of records output from the transformations and written to Kafka by this task belonging to the named source connector in this worker. This is after transformation and excludes any records filtered out by the transformations. | kafka.connect:type=source-task-metrics,name=source-record-write-total,worker=([-.\w]+),connector=([-.\w]+),task=([\d]+) |
| poll time percentage | The average percentage of time spent polling this task belonging to the named source connector in this worker. | kafka.connect:type=source-task-metrics,name=poll-time-percentage,worker=([-.\w]+),connector=([-.\w]+),task=([\d]+) |
| transform time percentage | The average percentage of time spent transforming source records for this task belonging to the named source connector in this worker. | kafka.connect:type=source-task-metrics,name=transform-time-percentage,worker=([-.\w]+),connector=([-.\w]+),task=([\d]+) |
| write time percentage | The average percentage of time spent converting and writing source records for this task belonging to the named source connector in this worker. | kafka.connect:type=source-task-metrics,name=write-time-percentage,worker=([-.\w]+),connector=([-.\w]+),task=([\d]+) |
| pause time percentage | The average percentage of time that this task belonging to the named source connector in this worker was paused. | kafka.connect:type=source-task-metrics,name=pause-time-percentage,worker=([-.\w]+),connector=([-.\w]+),task=([\d]+) |
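The MBean attribute patterns listed above can be used directly to recognize and decompose source task metric names. The sketch below is illustrative only. The character classes for worker, connector, and task are copied from the patterns above, while the `[-\w]+` class for the metric name and the helper names `parse_source_task_mbean` and `filtered_total` are assumptions. Since the produce metrics count records before transformation and the write metrics count records after it, their difference approximates the number of records dropped by transformations, ignoring records that have been produced but not yet written.

```python
import re

# Worker, connector, and task classes come from the MBean attribute patterns.
# The metric-name class [-\w]+ is an assumption made for this sketch.
SOURCE_TASK_MBEAN = re.compile(
    r"kafka\.connect:type=source-task-metrics,"
    r"name=(?P<name>[-\w]+),"
    r"worker=(?P<worker>[-.\w]+),"
    r"connector=(?P<connector>[-.\w]+),"
    r"task=(?P<task>[\d]+)"
)


def parse_source_task_mbean(mbean):
    """Return the name, worker, connector, and task number, or None if no match."""
    match = SOURCE_TASK_MBEAN.fullmatch(mbean)
    if match is None:
        return None
    parts = match.groupdict()
    parts["task"] = int(parts["task"])
    return parts


def filtered_total(produce_total, write_total):
    """Approximate count of records dropped by transformations.

    Records produced but not yet written are also included in the difference.
    """
    return produce_total - write_total


# Hypothetical worker, connector, and task values.
parsed = parse_source_task_mbean(
    "kafka.connect:type=source-task-metrics,name=source-record-write-rate,"
    "worker=worker-1,connector=my-source,task=3"
)
print(parsed)
```

A name whose task component is not numeric does not match and yields `None`, so malformed or unrelated MBean names can be skipped safely.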

...