...
Add the following metrics via a sensor:
- kafka.coordinator.group:type=group-metadata-manager-metrics,name=group-load-time-max
Type: SampledStat.Max
Value: 0 or greater over time; maximum time, in milliseconds, it took to load offsets and group metadata from one __consumer_offsets partition in the last 30 seconds.
...
- kafka.coordinator.group:type=group-metadata-manager-metrics,name=group-load-time-avg
Type: SampledStat.Avg
Value: 0 or greater over time; average time, in milliseconds, it took to load offsets and group metadata from one __consumer_offsets partition in the last 30 seconds.
Note: this average may look very low when most of the partitions are unused, because loading an unused partition takes roughly 0 milliseconds.
...
- kafka.coordinator.group:type=transaction-state-manager-metrics,name=transaction-load-time-max
Type: SampledStat.Max
Value: 0 or greater over time; maximum time, in milliseconds, it took to load transaction state from one __transaction_state partition in the last 30 seconds.
...
- kafka.coordinator.group:type=transaction-state-manager-metrics,name=transaction-load-time-avg
Type: SampledStat.Avg
Value: 0 or greater over time; average time, in milliseconds, it took to load transaction state from one __transaction_state partition in the last 30 seconds.
Note: this average may look very low when most of the partitions are unused, because loading an unused partition takes roughly 0 milliseconds.
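To illustrate the note on the avg metrics, here is a small worked example with made-up numbers. The five load times are hypothetical and not taken from any real broker; they only show how unused partitions pull the average down while the max still exposes the slow partition.

```python
# Hypothetical load times (ms) for five partitions that finished loading
# within the same 30-second window; four of them were unused.
load_times_ms = [0, 0, 0, 0, 1200]

avg_ms = sum(load_times_ms) / len(load_times_ms)  # dragged down by the unused partitions
max_ms = max(load_times_ms)                       # still reflects the slow partition

print(f"avg={avg_ms} ms, max={max_ms} ms")  # prints "avg=240.0 ms, max=1200 ms"
```

For this reason the avg metric is probably best read together with the corresponding max metric.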
Proposed Changes
For each of the group metadata manager and the transaction state manager, add a sensor that reports the maximum and average number of milliseconds it took to load a partition. Both values are computed over a rolling window covering the partitions that finished loading in the last 30 seconds.
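The windowed max and average described above can be sketched roughly as follows. This is a simplified illustration, not Kafka's SampledStat implementation: the class name LoadTimeWindow, its methods, and the choice to report 0.0 for an empty window are all assumptions made for this example.

```python
from collections import deque

WINDOW_MS = 30_000  # 30-second window, as described in the proposal


class LoadTimeWindow:
    """Hypothetical helper: max/avg partition load time over a sliding window."""

    def __init__(self, window_ms=WINDOW_MS):
        self.window_ms = window_ms
        self._samples = deque()  # (finish_time_ms, load_time_ms), oldest first

    def record(self, load_time_ms, now_ms):
        """Record one partition load that finished at now_ms."""
        self._samples.append((now_ms, load_time_ms))

    def _expire(self, now_ms):
        # Drop loads that finished window_ms or more before now_ms.
        while self._samples and now_ms - self._samples[0][0] >= self.window_ms:
            self._samples.popleft()

    def max_ms(self, now_ms):
        self._expire(now_ms)
        # Assumption: an empty window reports 0.0.
        return max((v for _, v in self._samples), default=0.0)

    def avg_ms(self, now_ms):
        self._expire(now_ms)
        if not self._samples:
            return 0.0  # Assumption: an empty window reports 0.0.
        return sum(v for _, v in self._samples) / len(self._samples)


# Usage: two partitions finish loading 10 seconds apart.
window = LoadTimeWindow()
window.record(load_time_ms=100, now_ms=0)
window.record(load_time_ms=300, now_ms=10_000)
print(window.max_ms(now_ms=20_000), window.avg_ms(now_ms=20_000))
# 35 seconds in, the first load has aged out of the window.
print(window.max_ms(now_ms=35_000), window.avg_ms(now_ms=35_000))
```

One such window would be kept per manager, so the group metadata and transaction state load times are tracked independently.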
...