Status

Current state: Under Discussion

Discussion thread: here

JIRA: KAFKA-3715: Higher granularity Streams metrics

Motivation

  • This KIP proposes adding latency and throughput metrics for Kafka Streams at the granularity of each processor node, in addition to the global rate, which already exists. The idea is to let users toggle the recording of these metrics when needed for debugging. The RecordLevel for these granular metrics is debug, and a client can change the record level by setting “metrics.record.level” in the client config. (The introduction of RecordLevel and the client config changes are covered in a separate KIP.)
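
A minimal sketch of how a client might switch these metrics on. It assumes the “metrics.record.level” key named above accepts the value debug; the accepted values are defined in the separate KIP, so treat that value as an assumption. The class and method names are hypothetical and exist only for illustration.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: builds client properties
// with the granular (debug-level) metrics switched on.
class RecordLevelConfigExample {
    static Properties debugMetricsConfig() {
        Properties props = new Properties();
        // Key as named in this KIP; the value "debug" is an assumption.
        props.put("metrics.record.level", "debug");
        return props;
    }
}
```

The resulting Properties object would be merged with the rest of the client configuration before the Streams application is started.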


Public Interfaces

  • none

Proposed Changes

  • Enumeration of Sensors: This KIP proposes the introduction of the following sensors

    • Node punctuate time sensor: This sensor is associated with latency metrics depicting the average and max latency in the punctuate time of a node.

    • Node creation time sensor: This sensor is associated with latency metrics depicting the average and max latency in the creation time of a node.

    • Node destruction time sensor: This sensor is associated with latency metrics depicting the average and max latency in the destruction time of a node.

    • Node process time sensor: This sensor is associated with latency metrics depicting the average and max latency in the process time of a node.

    • Node throughput sensor: This sensor is associated with throughput metrics depicting the context forwarding rate of records through a node, i.e., how many records were forwarded downstream from this processor node.

    • Skipped records sensor in StreamTask: This sensor is associated with a count metric, which helps monitor whether streams are well synchronized. The metric measures the difference in the total record count and the number of added records between the last record time. This is useful during debugging, as this count should not be off by too much during normal operations.
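
The latency sensors listed above each report an average and a max, and record only when the configured level allows it. The following is a rough sketch of that behavior in plain Java. It is a hypothetical stand-in, not Kafka's Sensor API: the class NodeLatencySensor, its RecordLevel enum, and all method names are assumptions made for illustration.

```java
// Hypothetical stand-in for a per-node latency sensor; not Kafka's Sensor API.
class NodeLatencySensor {
    // Ordered from least to most verbose.
    enum RecordLevel { INFO, DEBUG }

    private final RecordLevel sensorLevel;     // level this sensor is tagged with
    private final RecordLevel configuredLevel; // level chosen in the client config
    private long count;
    private long totalNanos;
    private long maxNanos;

    NodeLatencySensor(RecordLevel sensorLevel, RecordLevel configuredLevel) {
        this.sensorLevel = sensorLevel;
        this.configuredLevel = configuredLevel;
    }

    // Record only if the configured level is at least as verbose as the sensor's level.
    boolean shouldRecord() {
        return configuredLevel.ordinal() >= sensorLevel.ordinal();
    }

    // Adds one latency sample, e.g. the time spent in a node's process call.
    void record(long latencyNanos) {
        if (!shouldRecord()) {
            return; // debug sensors are skipped when the config is at INFO
        }
        count++;
        totalNanos += latencyNanos;
        maxNanos = Math.max(maxNanos, latencyNanos);
    }

    long count() { return count; }

    double avgNanos() { return count == 0 ? 0.0 : (double) totalNanos / count; }

    long maxNanos() { return maxNanos; }
}
```

In this sketch, a sensor created with sensor level DEBUG and configured level INFO ignores every sample, which mirrors the intent of keeping the granular metrics off unless a user enables them for debugging.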

Compatibility, Deprecation, and Migration Plan

  • none

Rejected Alternatives

  • none