
Status

Current state: "Under Discussion"

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Currently, Sink Connectors include only the put-batch-time metric, which measures the time to sink a batch of records into an external system.

In order to correlate these metrics and measure the complete end-to-end latency of a Sink Connector, additional measurements are needed:

  • record e2e latency: wall-clock time minus record timestamp, to evaluate how late records are processed.
  • convert time: latency to convert and transform a batch of records, to know how long conversion and transformation are taking.

With these metrics, it will be clearer how much latency the sink connector itself adds.

In the case of source connectors, convert time can be added to improve latency monitoring and reach parity with the sink connector metrics.

Public Interfaces

The following metrics would be added:

  • sink-record-e2e-latency-min [ms]

  • sink-record-e2e-latency-max [ms]

  • sink-record-e2e-latency-avg [ms]

  • sink-record-convert-time-max [ms]

  • sink-record-convert-time-avg [ms]

  • source-record-convert-time-max [ms]

  • source-record-convert-time-avg [ms]
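As a rough illustration (not Connect's actual metrics implementation, which uses Kafka's Sensor/Metrics API), the -min/-max/-avg variants above are simple reductions over the per-record latency samples collected in a window:

```java
import java.util.List;

// Sketch: how raw per-record latency samples (in ms) reduce to the
// -min/-max/-avg values exposed by the proposed metrics.
public class LatencyStats {
    public static double min(List<Long> samplesMs) {
        return samplesMs.stream().mapToLong(Long::longValue).min().orElse(0L);
    }

    public static double max(List<Long> samplesMs) {
        return samplesMs.stream().mapToLong(Long::longValue).max().orElse(0L);
    }

    public static double avg(List<Long> samplesMs) {
        return samplesMs.stream().mapToLong(Long::longValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        // Hypothetical e2e latency samples for three sink records.
        List<Long> samples = List.of(5L, 20L, 11L);
        System.out.printf("min=%.1f max=%.1f avg=%.1f%n",
                min(samples), max(samples), avg(samples));
    }
}
```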

Proposed Changes

Sink Connectors have 3 stages:

  • polling: gather consumer records

  • convert/transform: convert each record to a generic SinkRecord  and apply transformations

  • process: put record batches into an external system.

The process stage already has a latency metric: put-batch-time .

To measure record e2e latency sink-record-e2e-latency , it is proposed to measure the difference between the record timestamp and the current system time (wall-clock) at the beginning of the convert stage, since that is the first point where records are iterated.
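The calculation itself is a simple subtraction. A minimal sketch follows, where recordTimestampMs stands in for the consumer record's timestamp (the names here are illustrative, not Connect's actual API):

```java
// Sketch: record e2e latency = wall-clock time at the start of the convert
// stage minus the record's timestamp.
public class E2eLatency {
    public static long e2eLatencyMs(long recordTimestampMs, long nowMs) {
        return nowMs - recordTimestampMs;
    }

    public static void main(String[] args) {
        // Pretend the record's timestamp is 250 ms in the past.
        long recordTimestampMs = System.currentTimeMillis() - 250;
        long latency = e2eLatencyMs(recordTimestampMs, System.currentTimeMillis());
        System.out.println("sink-record-e2e-latency ~ " + latency + " ms");
    }
}
```

Note that a skewed or future-dated record timestamp (e.g. CreateTime set by a misconfigured producer) can make this value small or even negative.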

Convert latency sink-record-convert-time  measures the conversion and transformation time per record.

The polling stage can be monitored with the Consumer's fetch metrics, e.g. fetch-latency-avg/max .


On the other hand, Source Connectors have 3 stages:

  • polling: gather records from external system

  • transform/convert: apply transformations and convert records to ProducerRecord 
  • send: send records to Kafka topics

The polling stage already has a latency metric: poll-batch-time .

The source-record-convert-time  metric will measure the time to apply transformations and convert a SourceRecord  into a ProducerRecord .
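The convert-time measurement, for both the sink and source variants, amounts to timing the convert+transform step per record. A minimal sketch, where the upper-casing converter is a placeholder for the real SourceRecord-to-ProducerRecord (or SinkRecord) conversion:

```java
import java.util.Locale;
import java.util.function.Function;

// Sketch: timing a per-record convert+transform step with System.nanoTime().
public class ConvertTimer {
    public static <I, O> long timeConvertMs(Function<I, O> converter, I record) {
        long start = System.nanoTime();
        converter.apply(record); // the result would be handed to the next stage
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Placeholder converter; real code would run the configured
        // transformations and converter here.
        long ms = timeConvertMs(s -> ((String) s).toUpperCase(Locale.ROOT), "payload");
        System.out.println("record-convert-time ~ " + ms + " ms");
    }
}
```

System.nanoTime() is used rather than wall-clock time because it is monotonic and suited to measuring elapsed durations.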

The send stage can be monitored with the Producer's sender metrics, e.g. request-latency-avg/max .

Compatibility, Deprecation, and Migration Plan

N/A

Rejected Alternatives

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.