...
When a GetTelemetrySubscriptionsRequest is received for a previously unknown client instance id, the CLIENT_METRICS config cache is scanned for any configured metric subscriptions whose match selectors match the client. The matching configuration entries are compiled into a list of subscribed metrics, which is returned in GetTelemetrySubscriptionsResponse.RequestedMetrics along with the minimum configured collection interval. In its current form this means longer-interval metrics are included “for free” if there are shorter-interval metrics in the subscription set; future versions could improve on this by including a per-metric interval so that each subscribed metric is collected at its configured interval. A CRC32 checksum is also calculated from the compiled metrics and returned as the SubscriptionId in the response. The same checksum is stored in the per-client-instance cache on the broker to track configuration changes.
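The compilation step described above can be sketched roughly as follows. This is an illustrative sketch only, not the broker implementation: the class, record, and parameter names (SubscriptionCompiler, ConfiguredSubscription, CompiledSubscription, defaultIntervalMs) are hypothetical, and the exact bytes fed into the CRC32 (sorted metric names, each followed by a zero byte) are an assumption, since the text only says the checksum is based on the compiled metrics.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Predicate;
import java.util.zip.CRC32;

class SubscriptionCompiler {

    /** One CLIENT_METRICS config entry (hypothetical shape). */
    record ConfiguredSubscription(Predicate<Map<String, String>> selector,
                                  List<String> metrics,
                                  int intervalMs) {}

    /** What would go into GetTelemetrySubscriptionsResponse (hypothetical shape). */
    record CompiledSubscription(List<String> requestedMetrics,
                                int pushIntervalMs,
                                int subscriptionId) {}

    static CompiledSubscription compile(List<ConfiguredSubscription> configured,
                                        Map<String, String> clientAttributes,
                                        int defaultIntervalMs) {
        // Sorted set: de-duplicates metrics and makes the checksum order-independent.
        TreeSet<String> metrics = new TreeSet<>();
        int interval = Integer.MAX_VALUE;
        boolean matched = false;

        for (ConfiguredSubscription s : configured) {
            if (s.selector().test(clientAttributes)) {
                matched = true;
                metrics.addAll(s.metrics());
                // The minimum configured interval wins for the whole subscription set.
                interval = Math.min(interval, s.intervalMs());
            }
        }
        if (!matched) {
            interval = defaultIntervalMs; // assumption: fallback when nothing matches
        }

        // Assumption: checksum covers the sorted metric names, zero-byte separated.
        CRC32 crc = new CRC32();
        for (String m : metrics) {
            crc.update(m.getBytes(StandardCharsets.UTF_8));
            crc.update(0);
        }
        return new CompiledSubscription(List.copyOf(metrics), interval, (int) crc.getValue());
    }
}
```

Because the metric names are sorted before checksumming, two brokers holding the same configuration entries in a different order would compute the same SubscriptionId in this sketch. Whether the real checksum has that property is not stated in the text.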
This per-client-instance state is maintained in broker memory for up to MAX(60*1000, PushIntervalMs * 3) milliseconds and is used to enforce push interval rate-limiting. Client instance metrics state is not persisted across broker restarts and is not shared between brokers.
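As a rough illustration of the retention and rate-limiting rules above, the sketch below keeps per-instance state in a plain in-memory map. All names here (ClientInstanceCache, Entry, shouldThrottle, expire) are hypothetical. Two behaviours are assumptions not stated in the text: the retention window is measured from the last request seen for the instance, and the first push after registration is never throttled.

```java
import java.util.HashMap;
import java.util.Map;

class ClientInstanceCache {
    static final long MIN_RETENTION_MS = 60 * 1000L;

    static final class Entry {
        final int subscriptionId;
        final long pushIntervalMs;
        long lastSeenMs;        // last request of any kind (assumption: retention starts here)
        long lastPushMs = -1;   // -1 means no accepted push yet

        Entry(int subscriptionId, long pushIntervalMs, long nowMs) {
            this.subscriptionId = subscriptionId;
            this.pushIntervalMs = pushIntervalMs;
            this.lastSeenMs = nowMs;
        }
    }

    // In-memory only: nothing survives a restart or is shared with other brokers.
    private final Map<String, Entry> entries = new HashMap<>();

    /** MAX(60*1000, PushIntervalMs * 3), in milliseconds. */
    static long retentionMs(long pushIntervalMs) {
        return Math.max(MIN_RETENTION_MS, pushIntervalMs * 3);
    }

    void put(String clientInstanceId, int subscriptionId, long pushIntervalMs, long nowMs) {
        entries.put(clientInstanceId, new Entry(subscriptionId, pushIntervalMs, nowMs));
    }

    /** Returns true if a push arriving at nowMs is faster than the allowed PushIntervalMs. */
    boolean shouldThrottle(String clientInstanceId, long nowMs) {
        Entry e = entries.get(clientInstanceId);
        if (e == null) {
            return false; // unknown instance: handled by a different code path
        }
        e.lastSeenMs = nowMs;
        boolean tooSoon = e.lastPushMs >= 0 && nowMs - e.lastPushMs < e.pushIntervalMs;
        if (!tooSoon) {
            e.lastPushMs = nowMs; // only accepted pushes move the rate-limit window
        }
        return tooSoon;
    }

    /** Drops entries that have been idle longer than their retention window. */
    void expire(long nowMs) {
        entries.values().removeIf(e -> nowMs - e.lastSeenMs > retentionMs(e.pushIntervalMs));
    }

    int size() {
        return entries.size();
    }
}
```

With a PushIntervalMs of 30000, for example, retentionMs returns 90000, while any interval of 20000 or less falls back to the 60000 ms floor.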
New broker metrics
The following broker metrics should be added:
- ClientMetricsInstanceCount - current number of client metric instances being managed by this broker, i.e., the number of unique CLIENT_INSTANCE_IDs with an empty or non-empty subscription set.
- ClientMetricsActiveInstanceCount - current number of active client metric instances being managed by this broker, i.e., the number of unique CLIENT_INSTANCE_IDs with a non-empty subscription set.
- ClientMetricsNewInstanceCount - total number of GetTelemetrySubscriptionsRequests with a null CLIENT_INSTANCE_ID.
- ClientMetricsUnknownInstanceCount - total number of metrics requests for unknown CLIENT_INSTANCE_IDs.
- ClientMetricsThrottleCount - total number of throttled PushTelemetryRequests due to a higher PushTelemetryRequest rate than the allowed PushIntervalMs.
- ClientMetricsPluginExportCount - the total number of metrics requests pushed to metrics plugins, i.e., the number of exportMetrics() calls.
- ClientMetricsPluginExportTimeMs - the amount of time plugins spent handling pushed metrics, i.e., the amount of time spent in exportMetrics().
- ClientMetricsPluginErrorCount - the total number of exceptions raised from plugin's exportMetrics().
Client metrics and metric labels
...