Status

Current state: Accepted

Discussion thread: here

JIRA: KAFKA-6263

...

Add the following metrics via a sensor:

  • kafka.server:type=group-coordinator-metrics,name=group-load-time-max

Type: SampledStat.Max

Value: 0 or greater over time; maximum time, in milliseconds, it took to load offsets and group metadata from the __consumer_offsets partitions loaded in the last 30 seconds.

...

  • kafka.server:type=group-coordinator-metrics,name=group-load-time-avg

Type: SampledStat.Avg

Value: 0 or greater over time; average time, in milliseconds, it took to load offsets and group metadata from the __consumer_offsets partitions loaded in the last 30 seconds.

Note: this average may look very low at times when a majority of the partitions are unused, which causes some load times to be 0 seconds.

...

  • kafka.server:type=transaction-coordinator-metrics,name=transaction-load-time-max

Type: SampledStat.Max

Value: 0 or greater over time; maximum time, in milliseconds, it took to load offsets and transaction state from the __transaction_state partitions loaded in the last 30 seconds.

...

  • kafka.server:type=transaction-coordinator-metrics,name=transaction-load-time-avg

Type: SampledStat.Avg

Value: 0 or greater over time; average time, in milliseconds, it took to load offsets and transaction state from the __transaction_state partitions loaded in the last 30 seconds.

Note: this average may look very low at times when a majority of the partitions are unused, which causes some load times to be 0 seconds.

Proposed Changes

For each of the group metadata manager and the transaction state manager, add a sensor that records the maximum and average number of milliseconds it took to load each partition. The max and average are computed over a running window covering the partitions that finished loading in the last 30 seconds.
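
To make the windowing behaviour concrete, here is a minimal, self-contained sketch of a max and average taken over a running 30-second window. It is only an illustration: it does not use Kafka's Sensor or SampledStat classes, and the class name LoadTimeWindow, its methods, and the choice to report 0 for an empty window are all assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical illustration of a running-window max/avg.
 * This is NOT Kafka's SampledStat implementation.
 */
class LoadTimeWindow {
    /** One finished partition load: when it finished and how long it took. */
    private static final class Sample {
        final long recordedAtMs;
        final long loadTimeMs;

        Sample(long recordedAtMs, long loadTimeMs) {
            this.recordedAtMs = recordedAtMs;
            this.loadTimeMs = loadTimeMs;
        }
    }

    private final long windowMs;
    private final Deque<Sample> samples = new ArrayDeque<>();

    LoadTimeWindow(long windowMs) {
        this.windowMs = windowMs;
    }

    /** Records a load that finished at nowMs. Assumes nowMs never decreases. */
    void record(long nowMs, long loadTimeMs) {
        expire(nowMs);
        samples.addLast(new Sample(nowMs, loadTimeMs));
    }

    /** Maximum load time in the window; 0 if the window is empty (assumption). */
    double max(long nowMs) {
        expire(nowMs);
        long max = 0L;
        for (Sample s : samples) {
            max = Math.max(max, s.loadTimeMs);
        }
        return (double) max;
    }

    /** Average load time in the window; 0 if the window is empty (assumption). */
    double avg(long nowMs) {
        expire(nowMs);
        if (samples.isEmpty()) {
            return 0.0;
        }
        long sum = 0L;
        for (Sample s : samples) {
            sum += s.loadTimeMs;
        }
        return (double) sum / samples.size();
    }

    /** Drops samples that are windowMs old or older. */
    private void expire(long nowMs) {
        while (!samples.isEmpty()
                && nowMs - samples.peekFirst().recordedAtMs >= windowMs) {
            samples.removeFirst();
        }
    }

    public static void main(String[] args) {
        LoadTimeWindow window = new LoadTimeWindow(30_000L);
        // Three unused partitions load in 0 ms; one busy partition takes 2000 ms.
        window.record(0L, 0L);
        window.record(0L, 0L);
        window.record(0L, 0L);
        window.record(1_000L, 2_000L);
        System.out.println("max=" + window.max(1_000L) + " avg=" + window.avg(1_000L));
    }
}
```

In this sketch, three partitions that load in 0 ms and one that takes 2000 ms give a max of 2000 ms but an average of only 500 ms, which is the effect the notes on the avg metrics describe. Once more than 30 seconds pass with no new loads, every sample has expired and both values fall back to 0.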

...