...
All of the following will be added via Kafka's metrics library, like most of the metrics in the Kafka brokers and other components. The context of every metric is limited to the worker where it is reported: each metric is defined as an attribute on the specified MBean and is measured within the context of a single worker.
All metrics defined below are at the INFO recording level.
Connector Metrics
MBean name: kafka.connect:type=connector-metrics,connector=([-.\w]+)
...
MBean name: kafka.connect:type=sink-task-metrics,connector=([-.\w]+),task=([\d]+)
Metric/Attribute Name | Description | |
---|---|---|
sink-record-read-rate | The average per-second number of records read from Kafka for this task belonging to the named sink connector in this worker. This is before transformations are applied. | |
sink-record-send-rate | The average per-second number of records output from the transformations and sent to this task belonging to the named sink connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations. | |
sink-record-lag-max | The maximum lag, in number of records, by which the committed offsets trail the consumer's position for any topic partition. | |
partition-count | The number of topic partitions assigned to this task belonging to the named sink connector in this worker. | |
offset-commit-seq-no | The current sequence number for offset commits | |
offset-commit-completion-rate | The average per-second number of offset commits that completed successfully | |
offset-commit-completion-skip-rate | The average per-second number of offset commit completions that were received too late and were skipped/ignored | |
flush-max-time | The maximum time taken by this sink task to pre-commit/flush | |
flush-99p-time | The 99th percentile time spent by this sink task to pre-commit/flush | |
flush-95p-time | The 95th percentile time spent by this sink task to pre-commit/flush | |
flush-90p-time | The 90th percentile time spent by this sink task to pre-commit/flush | |
flush-75p-time | The 75th percentile time spent by this sink task to pre-commit/flush | |
flush-50p-time | The 50th percentile (median) time spent by this sink task to pre-commit/flush | |
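
The MBean name pattern for sink task metrics can be illustrated with a short Java sketch. It builds the name for one task and parses it back with the same regular expression. The class name SinkTaskMBeanNames, its helper methods, and the connector name "my-sink" are hypothetical and are not part of this proposal; the only change to the pattern is that the dot in the domain is escaped so that it matches literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, not part of Kafka Connect: illustrates the naming scheme only.
public class SinkTaskMBeanNames {
    // Pattern taken from the MBean name above, with the domain's dot escaped.
    public static final Pattern SINK_TASK_MBEAN = Pattern.compile(
        "kafka\\.connect:type=sink-task-metrics,connector=([-.\\w]+),task=([\\d]+)");

    /** Builds the MBean name string for one task of a sink connector. */
    public static String sinkTaskMBeanName(String connector, int task) {
        return "kafka.connect:type=sink-task-metrics,connector=" + connector + ",task=" + task;
    }

    /** Returns {connector, task} if the name matches the pattern, otherwise null. */
    public static String[] parse(String mbeanName) {
        Matcher m = SINK_TASK_MBEAN.matcher(mbeanName);
        return m.matches() ? new String[] {m.group(1), m.group(2)} : null;
    }

    public static void main(String[] args) throws MalformedObjectNameException {
        String name = sinkTaskMBeanName("my-sink", 0);
        // The same string is a valid JMX ObjectName with "connector" and "task" keys.
        ObjectName objectName = new ObjectName(name);
        String[] parts = parse(name);
        System.out.println(objectName.getKeyProperty("connector") + " / task " + parts[1]);
    }
}
```

A monitoring tool could use a name built this way to look up attributes such as sink-record-read-rate on a running worker's MBean server; that lookup is omitted here because it requires a live worker.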
...
Metric/Attribute Name | Description |
---|---|
rebalance-success-total | The total number of this worker's successful rebalances |
rebalance-success-percentage | The average percentage of this worker's rebalances that succeeded |
rebalance-failure-total | The total number of this worker's failed rebalances |
rebalance-failure-percentage | The average percentage of this worker's rebalances that failed |
rebalance-max-time | The maximum time spent by this worker to rebalance |
rebalance-99p-time | The 99th percentile time spent by this worker to rebalance during the last window (defaults to an hour) |
rebalance-95p-time | The 95th percentile time spent by this worker to rebalance during the last window (defaults to an hour) |
rebalance-90p-time | The 90th percentile time spent by this worker to rebalance during the last window (defaults to an hour) |
rebalance-75p-time | The 75th percentile time spent by this worker to rebalance during the last window (defaults to an hour) |
rebalance-50p-time | The 50th percentile (median) time spent by this worker to rebalance during the last window (defaults to an hour) |
time-since-last-rebalance | The time since the most recent rebalance in this worker |
task-failure-rate | The number of tasks that failed in this worker |
Configuration
The distributed and standalone worker configuration files will support the following properties. These exactly match the producer and consumer configurations of the same name. (The first three are already in the distributed worker configuration.)
Configuration Field | Type | Default | Importance | Description |
---|---|---|---|---|
metrics.sample.window.ms | long | 30000 | low | The window of time in milliseconds a metrics sample is computed over. Must be a non-negative number. |
metrics.num.samples | int | 2 | low | The number of samples maintained to compute metrics. Must be a positive number. |
metric.reporters | string | "" | low | A list of classes to use as metrics reporters. Implementing the MetricReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics. |
metrics.recording.level | string | "INFO" | low | The highest recording level for metrics. Must be either "INFO" or "DEBUG". |
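
As a rough illustration of how these four properties and their defaults behave, the following Java sketch reads them from worker-style properties text and checks the constraints stated in the table. It is not how the worker parses its configuration; the class WorkerMetricsConfig and its parse method are hypothetical names introduced only for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for the four metrics-related worker properties.
public class WorkerMetricsConfig {
    public final long sampleWindowMs;
    public final int numSamples;
    public final String reporters;
    public final String recordingLevel;

    public WorkerMetricsConfig(Properties props) {
        // Defaults are the ones listed in the configuration table.
        sampleWindowMs = Long.parseLong(props.getProperty("metrics.sample.window.ms", "30000"));
        numSamples = Integer.parseInt(props.getProperty("metrics.num.samples", "2"));
        reporters = props.getProperty("metric.reporters", "");
        recordingLevel = props.getProperty("metrics.recording.level", "INFO");

        if (sampleWindowMs < 0) {
            throw new IllegalArgumentException("metrics.sample.window.ms must be non-negative");
        }
        if (numSamples < 1) {
            throw new IllegalArgumentException("metrics.num.samples must be positive");
        }
        if (!recordingLevel.equals("INFO") && !recordingLevel.equals("DEBUG")) {
            throw new IllegalArgumentException("metrics.recording.level must be INFO or DEBUG");
        }
    }

    /** Parses properties-file text; an empty string yields all defaults. */
    public static WorkerMetricsConfig parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new WorkerMetricsConfig(props);
    }

    public static void main(String[] args) {
        // Only the recording level is overridden; the other three keep their defaults.
        WorkerMetricsConfig config = parse("metrics.recording.level=DEBUG\n");
        System.out.println(config.sampleWindowMs + " ms x " + config.numSamples
            + " samples, level " + config.recordingLevel);
    }
}
```

Because every property has a default, an existing worker configuration file that mentions none of them still yields a valid configuration in this sketch.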
Proposed Changes
We will add the relevant metrics and worker configuration properties as specified in the Public Interfaces section.
...
Existing Connect coordinator metrics will not be changed.
The metrics.sample.window.ms, metrics.num.samples, and metric.reporters configurations already exist in the distributed worker; these will also be added to the standalone worker. The metrics.recording.level configuration will be added to both the distributed and standalone worker configurations. All four of these configurations have sensible default values, so users do not need to add or override them in their existing configuration files.
Rejected Alternatives
None
...