...
Name | Context | Type | Description |
---|---|---|---|
kafka.controller:type=KafkaController,name=TimedOutBrokerHeartbeatCount | Controller | Long | The number of broker heartbeats that timed out on this controller since the process was started. Note that only active controllers handle heartbeats, so only they will see increases in this metric. |
kafka.controller:type=KafkaController,name=EventQueueOperationsPerformedCount | Controller | Long | The total number of event queue operations that were performed. This includes deferred operations. |
kafka.controller:type=KafkaController,name=EventQueueOperationsTimedOutCount | Controller | Long | The total number of event queue operations that timed out before they could be performed. |
kafka.controller:type=KafkaController,name=NewActiveControllersCount | Controller | Long | Counts the number of times this node has seen a new controller elected. A transition to the "no leader" state is not counted here. If the same controller as before becomes active, that still counts. |
kafka.server:type=MetadataLoader,name=CurrentMetadataVersion | Broker and Controller | Integer | Outputs the feature level of the current effective metadata version. |
kafka.server:type=MetadataLoader,name=HandleLoadSnapshotCount | Broker and Controller | Long | The total number of times we have loaded a KRaft snapshot since the process was started. |
kafka.server:type=MetadataLoader,name=LatestSnapshotGeneratedBytes | Broker and Controller | Long | The total size in bytes of the latest snapshot that the node has generated. If none have been generated yet, this is the size of the latest snapshot that was loaded. If no snapshots have been generated or loaded, this is 0. |
kafka.server:type=MetadataLoader,name=LatestSnapshotGeneratedAgeMs | Broker and Controller | Long | The time in milliseconds since the node generated its latest snapshot. If none have been generated yet, this is approximately the time delta since the process was started. |
kafka.server:type=ForwardingManager,name=AdminQueueTimeMs | Broker | Histogram | A histogram describing the amount of time in milliseconds each admin request spends in the forwarding manager queue. This does not include the time that the request spends waiting for a response from the controller. |
kafka.server:type=ForwardingManager,name=AdminQueueLength | Broker | Integer | The current number of RPCs that are waiting in the forwarding manager queue, prior to being sent to the controller. |
kafka.server:type=ForwardingManager,name=AdminRemoteTimeMs | Broker | Histogram | A histogram describing the amount of time in milliseconds each request sent by the ForwardingManager spends waiting for a response. This does not include the time spent in the queue. |
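As an illustration of how the names in this table map onto JMX, here is a minimal sketch that builds an `ObjectName` from the domain, `type`, and `name` parts and reads one attribute from an `MBeanServer`. The class `MetricReader` and its methods are hypothetical helpers, not part of Kafka. The attribute to read (for example `Value` for a gauge) is an assumption about how the metric is exported and should be checked against the running process.

```java
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper for illustration only; not a Kafka class.
final class MetricReader {
    // Builds a name such as kafka.server:type=ForwardingManager,name=AdminQueueLength.
    static ObjectName metricName(String domain, String type, String name) {
        try {
            return new ObjectName(domain + ":type=" + type + ",name=" + name);
        } catch (JMException e) {
            throw new IllegalArgumentException("Malformed metric name", e);
        }
    }

    // Reads a single attribute; the attribute name is supplied by the caller
    // because it depends on the metric type (an assumption to verify).
    static Object read(MBeanServer server, ObjectName name, String attribute) {
        try {
            return server.getAttribute(name, attribute);
        } catch (JMException e) {
            throw new IllegalStateException(
                "Could not read " + attribute + " of " + name, e);
        }
    }
}
```

Inside a process that registers these MBeans, `MetricReader.read(ManagementFactory.getPlatformMBeanServer(), MetricReader.metricName("kafka.server", "ForwardingManager", "AdminQueueLength"), "Value")` would be one way to call it, again assuming the attribute is named `Value`.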
Implementation Notes
Lockless
To avoid excessive performance impact from these new metrics, none of them will require a lock to read, apart from any locks taken inside the Yammer library, the JMX implementation, and so on.
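One common way to get lock-free reads for counters like these is to back them with atomic variables. The following is a minimal sketch under that assumption. The class and method names are hypothetical, and this is not necessarily how the actual implementation stores its counts.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder for two of the counters; not Kafka's actual class.
final class LocklessControllerMetrics {
    // AtomicLong updates use compare-and-swap, so neither writers
    // nor readers acquire a lock.
    private final AtomicLong timedOutHeartbeats = new AtomicLong();
    private final AtomicLong operationsPerformed = new AtomicLong();

    void recordTimedOutHeartbeat() {
        timedOutHeartbeats.incrementAndGet();
    }

    void recordOperationPerformed() {
        operationsPerformed.incrementAndGet();
    }

    // Each read is a single volatile load.
    long timedOutHeartbeatCount() {
        return timedOutHeartbeats.get();
    }

    long operationsPerformedCount() {
        return operationsPerformed.get();
    }
}
```

A gauge registered with the metrics library would then simply return the result of the corresponding getter.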
Histogram details
Histograms have the standard Yammer MBean fields of 50thPercentile, 75thPercentile, 95thPercentile, 98thPercentile, 999thPercentile, 99thPercentile, Count, Max, Mean, Min, StdDev.
Rationale
TimedOutBrokerHeartbeats
...
This metric is useful to monitor how long it has been since the node last generated a snapshot. If this time grows too large, it may indicate a potential problem, since loading times might also become very large.
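Assuming this rationale refers to LatestSnapshotGeneratedAgeMs from the table, the age semantics described there (time since the last generated snapshot, falling back to roughly the time since process start) could be sketched as follows. `SnapshotAgeGauge` and its injected clock are hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch of the age computation; not Kafka's actual class.
final class SnapshotAgeGauge {
    private final LongSupplier clockMs;
    private final AtomicLong lastGeneratedMs;

    SnapshotAgeGauge(LongSupplier clockMs) {
        this.clockMs = clockMs;
        // Until a snapshot is generated, the age is measured from construction,
        // which stands in for process start in this sketch.
        this.lastGeneratedMs = new AtomicLong(clockMs.getAsLong());
    }

    // Called whenever this node finishes generating a snapshot.
    void onSnapshotGenerated() {
        lastGeneratedMs.set(clockMs.getAsLong());
    }

    // Lock-free read: current time minus the last generation time, never negative.
    long ageMs() {
        return Math.max(0L, clockMs.getAsLong() - lastGeneratedMs.get());
    }
}
```

In production code the clock would typically be something like `System::currentTimeMillis`. Injecting it makes the behaviour easy to test with a fake clock.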
QueueTimeMs
This metric is useful to monitor how much time metadata requests spend in the ForwardingManager queue waiting to be sent. If this gets too long, it may delay admin requests.
QueueLength
This metric is useful to monitor the length of the ForwardingManager queue. If this gets too long, it may delay admin requests.
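To make the two queue metrics concrete, here is a minimal sketch of a queue that stamps each entry on arrival, reports the time spent queued when the entry is taken out for sending, and exposes its length through an atomic counter. `TimedQueue` and its collaborators are hypothetical. The real ForwardingManager may track these values differently.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

// Hypothetical sketch; not the actual ForwardingManager queue.
final class TimedQueue<T> {
    private static final class Entry<E> {
        final E item;
        final long enqueuedMs;

        Entry(E item, long enqueuedMs) {
            this.item = item;
            this.enqueuedMs = enqueuedMs;
        }
    }

    private final ConcurrentLinkedQueue<Entry<T>> queue = new ConcurrentLinkedQueue<>();
    private final AtomicInteger length = new AtomicInteger();
    private final LongSupplier clockMs;
    private final LongConsumer queueTimeRecorder; // e.g. feeds a histogram

    TimedQueue(LongSupplier clockMs, LongConsumer queueTimeRecorder) {
        this.clockMs = clockMs;
        this.queueTimeRecorder = queueTimeRecorder;
    }

    void enqueue(T item) {
        queue.add(new Entry<>(item, clockMs.getAsLong()));
        // The counter may briefly lag the queue; that is acceptable for a gauge.
        length.incrementAndGet();
    }

    // Removes the next item to send and records how long it waited.
    // Returns null if the queue is empty.
    T dequeueForSend() {
        Entry<T> entry = queue.poll();
        if (entry == null) {
            return null;
        }
        length.decrementAndGet();
        queueTimeRecorder.accept(clockMs.getAsLong() - entry.enqueuedMs);
        return entry.item;
    }

    // Lock-free read of the current queue length.
    int queueLength() {
        return length.get();
    }
}
```

Note that the recorded time stops when the item leaves the queue, so it excludes the time spent waiting for a response, matching the split between queue time and remote time in the table.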
Compatibility, Deprecation, and Migration Plan
...