...

When a GetTelemetrySubscriptionsRequest is received for a previously unknown client instance id, the CLIENT_METRICS config cache is scanned for any configured metric subscriptions whose match selectors match those of the client. The matching configuration entries are compiled into a list of subscribed metrics, which is returned in GetTelemetrySubscriptionsResponse.RequestedMetrics along with the minimum configured collection interval. This can be improved in future versions by including a per-metric interval so that each subscribed metric is collected at its configured interval; in its current form, longer-interval metrics are included “for free” if there are shorter-interval metrics in the subscription set. A CRC32 checksum is also calculated over the compiled metrics. It is returned as the SubscriptionId in the response and stored in the per-client-instance cache on the broker to track configuration changes.
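
The compilation step above can be sketched roughly as follows. This is an illustrative Python sketch, not the broker implementation: the Subscription class, the exact-match selector logic, and the byte layout fed to CRC32 (sorted metric names joined by commas) are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Subscription:
    # Hypothetical shape of one CLIENT_METRICS config entry (assumption).
    metrics: tuple                # subscribed metric names or prefixes
    interval_ms: int              # configured collection interval
    match: dict = field(default_factory=dict)  # selector name -> required value


def compile_subscriptions(client_labels, subscriptions):
    """Return (requested_metrics, push_interval_ms, subscription_id)."""
    # Assumption: a selector matches when the client's label equals the value.
    matching = [
        s for s in subscriptions
        if all(client_labels.get(k) == v for k, v in s.match.items())
    ]
    # Union of all subscribed metrics, sorted so the checksum is stable.
    metrics = sorted({m for s in matching for m in s.metrics})
    # The minimum configured interval applies to the whole subscription set.
    interval_ms = min((s.interval_ms for s in matching), default=None)
    # Assumption: the checksum input is the comma-joined metric list.
    subscription_id = zlib.crc32(",".join(metrics).encode("utf-8"))
    return metrics, interval_ms, subscription_id
```

Because the checksum is derived only from the compiled metric list in this sketch, any configuration change that alters the list yields a different SubscriptionId.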

This client-instance-specific state is maintained in broker memory for up to MAX(60*1000, PushIntervalMs * 3) milliseconds and is used to enforce push-interval rate limiting. Client instance metrics state is not persisted across broker restarts, nor is it shared between brokers.
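
A minimal sketch of such a cache is shown below, in Python for illustration only. The class and method names are hypothetical, and two behaviours are assumptions rather than statements from this document: the retention period is measured from the last request seen for the instance, and a push is throttled when it arrives less than PushIntervalMs after the previous accepted push.

```python
import time

MIN_RETENTION_MS = 60 * 1000


def retention_ms(push_interval_ms):
    # MAX(60*1000, PushIntervalMs * 3) from the text above.
    return max(MIN_RETENTION_MS, push_interval_ms * 3)


class ClientInstanceCache:
    """In-memory only: nothing survives a restart or is shared with peers."""

    def __init__(self, clock=lambda: int(time.monotonic() * 1000)):
        self._clock = clock      # injectable millisecond clock, for testing
        self._entries = {}       # client_instance_id -> state dict

    def register(self, client_instance_id, push_interval_ms, subscription_id):
        self._entries[client_instance_id] = {
            "push_interval_ms": push_interval_ms,
            "subscription_id": subscription_id,
            "last_push_ms": None,
            "last_seen_ms": self._clock(),
        }

    def on_push(self, client_instance_id):
        """Return True if the push is accepted, False if it is throttled."""
        entry = self._entries[client_instance_id]  # KeyError if unknown
        now = self._clock()
        entry["last_seen_ms"] = now
        last = entry["last_push_ms"]
        if last is not None and now - last < entry["push_interval_ms"]:
            return False
        entry["last_push_ms"] = now
        return True

    def expire(self):
        """Drop entries idle for longer than their retention period."""
        now = self._clock()
        stale = [
            cid for cid, e in self._entries.items()
            if now - e["last_seen_ms"] > retention_ms(e["push_interval_ms"])
        ]
        for cid in stale:
            del self._entries[cid]
        return stale
```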


New broker metrics

The following broker metrics should be added:

  • ClientMetricsInstanceCount - current number of client metric instances being managed by this broker, i.e., the number of unique CLIENT_INSTANCE_IDs with an empty or non-empty subscription set.
  • ClientMetricsActiveInstanceCount - current number of active client metric instances being managed by this broker, i.e., the number of unique CLIENT_INSTANCE_IDs with a non-empty subscription set.
  • ClientMetricsNewInstanceCount - total number of GetTelemetrySubscriptionsRequests with a null CLIENT_INSTANCE_ID.
  • ClientMetricsUnknownInstanceCount - total number of metrics requests for unknown CLIENT_INSTANCE_IDs.
  • ClientMetricsThrottleCount - total number of PushTelemetryRequests throttled because they arrived at a higher rate than PushIntervalMs allows.
  • ClientMetricsPluginExportCount - the total number of metrics requests pushed to metrics plugins, i.e., the number of exportMetrics() calls.
  • ClientMetricsPluginExportTimeMs - the amount of time plugins spent handling pushed metrics, i.e., the amount of time spent in exportMetrics().
  • ClientMetricsPluginErrorCount - the total number of exceptions raised from a plugin's exportMetrics().
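
As an illustration of how the three plugin metrics relate to each other, the sketch below wraps a plugin call and updates the counters. It is written in Python for brevity and is not the broker code: the PluginExportMetrics class is hypothetical, the plugin is any object exposing an exportMetrics() method, and swallowing the plugin's exception after counting it is an assumption.

```python
import time
from collections import Counter


class PluginExportMetrics:
    """Hypothetical holder for the plugin-related broker metrics."""

    def __init__(self):
        self.counters = Counter()

    def export(self, plugin, payload):
        # Every call into the plugin counts as one export.
        self.counters["ClientMetricsPluginExportCount"] += 1
        start = time.perf_counter()
        try:
            plugin.exportMetrics(payload)
        except Exception:
            # Assumption: plugin failures are counted and not propagated.
            self.counters["ClientMetricsPluginErrorCount"] += 1
        finally:
            # Time spent in exportMetrics(), whether or not it failed.
            elapsed_ms = (time.perf_counter() - start) * 1000
            self.counters["ClientMetricsPluginExportTimeMs"] += elapsed_ms
```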


Client metrics and metric labels

...