Comment: Minor clarifications

...

Discussion thread: here and now here

JIRA: KAFKA-15601

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

Code Block
GetTelemetrySubscriptionsRequestV0 {
    ClientInstanceId uuid                // UUID4 unique for this client instance.
                                         // Must be set to Null on the first request, and to the returned ClientInstanceId
                                         // from the first response for all subsequent requests to any broker.
}

GetTelemetrySubscriptionsResponseV0 {
    ThrottleTimeMs int32                 // The duration in milliseconds for which the request was throttled due to a quota violation,
                                         // or zero if the request did not violate any quota.
    ErrorCode int16                      // The error code, or 0 if there was no error.
    ClientInstanceId uuid                // Assigned client instance id if ClientInstanceId was Null in the request, else Null.
    SubscriptionId int32                 // Unique identifier for the current subscription set for this client instance.
    AcceptedCompressionTypes Array[int8] // The compression types the broker accepts for PushTelemetryRequest.CompressionType
                                         // as listed in MessageHeaderV2.Attributes.CompressionType. The array will be sorted in
                                         // preference order from higher to lower. The CompressionType of NONE will not be
                                         // present in the response from the broker, though the broker does support uncompressed
                                         // client telemetry if none of the accepted compression codecs are supported by the client.
    PushIntervalMs int32                 // Configured push interval, which is the lowest configured interval in the current subscription set.
    TelemetryMaxBytes int32              // The maximum bytes of binary data the broker accepts in PushTelemetryRequest.
    DeltaTemporality bool                // If True; monotonic/counter metrics are to be emitted as deltas to the previous sample.
                                         // If False; monotonic/counter metrics are to be emitted as cumulative absolute values.
    RequestedMetrics Array[string]       // Requested telemetry metrics prefix string match.
                                         // Empty array: No metrics subscribed.
                                         // Array[0] empty string: All metrics subscribed.
                                         // Array[..]: prefix string match.
}

PushTelemetryRequestV0 {
    ClientInstanceId uuid                // UUID4 unique for this client instance, as retrieved in the first GetTelemetrySubscriptionsRequest.
    SubscriptionId int32                 // SubscriptionId from the GetTelemetrySubscriptionsResponse for the collected metrics.
    Terminating bool                     // Client is terminating.
    CompressionType int8                 // Compression codec used for .Metrics (ZSTD, LZ4, Snappy, GZIP, None).
                                         // Same values as that of the current MessageHeaderV2.Attributes.
    Metrics binary                       // Metrics encoded in OpenTelemetry MetricsData v1 protobuf format.
}

PushTelemetryResponseV0 {
    ThrottleTimeMs int32                 // The duration in milliseconds for which the request was throttled due to a quota violation,
                                         // or zero if the request did not violate any quota.
    ErrorCode int16                      // The error code, or 0 if there was no error.
}
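
As an illustration of how a client might interpret two of the GetTelemetrySubscriptionsResponse fields, the sketch below applies the RequestedMetrics prefix-match rules and picks a codec from AcceptedCompressionTypes, falling back to uncompressed when nothing matches. The function names, the set of client-supported codecs, and the metric names in the usage note are hypothetical; the codec ids follow Kafka's record batch attributes (0 none, 1 gzip, 2 snappy, 3 lz4, 4 zstd). This is not the reference client implementation.

```python
# Codec ids as used in Kafka record batch attributes.
COMPRESSION_NONE = 0
COMPRESSION_NAMES = {1: "gzip", 2: "snappy", 3: "lz4", 4: "zstd"}


def choose_compression(accepted_types, client_supported):
    """Return the first broker-accepted codec id the client supports.

    accepted_types: AcceptedCompressionTypes, already sorted by the broker
        from most to least preferred.
    client_supported: set of codec ids this (hypothetical) client can encode.
    Falls back to COMPRESSION_NONE, which the broker accepts even though it
    is never listed in the response.
    """
    for codec in accepted_types:
        if codec in client_supported:
            return codec
    return COMPRESSION_NONE


def is_subscribed(metric_name, requested_metrics):
    """Apply the RequestedMetrics prefix-match rules to one metric name.

    An empty array yields no match, so nothing is subscribed. An empty
    string is a prefix of every name, so it subscribes all metrics.
    """
    return any(metric_name.startswith(prefix) for prefix in requested_metrics)
```

For example, with AcceptedCompressionTypes `[4, 3, 1]` and a client that only supports gzip and lz4, `choose_compression([4, 3, 1], {1, 3})` selects lz4 (id 3), because the broker lists it ahead of gzip.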

Metrics serialization format

...

Retrieve the broker-generated client instance id, which the application may use to map the client instance id to the application instance through log messages or other means.

The client instance ids returned correspond to the client_instance_id labels added by the broker to the metrics pushed from the clients. This should be sufficient information to enable correlation between the metrics available in the client, and the metrics pushed to the broker.

The following method is added to the Producer, Consumer, and Admin client interfaces:

...

The following new broker metrics should be added:

| Metric Name | Type | Group | Tags | Description |
|---|---|---|---|---|
| ClientMetricsInstanceCount | Gauge | ClientMetrics | version: broker's software version | Current number of client metric instances being managed by the broker, e.g., the number of unique CLIENT_INSTANCE_IDs with an empty or non-empty subscription set. |
| ClientMetricsSubscriptionRequestCount | Meter | ClientMetrics | version: broker's software version | Total number of GetTelemetrySubscriptionsRequests received by this broker. |
| ClientMetricsUnknownSubscriptionRequestCount | Meter | ClientMetrics | client version: client's software version | Total number of GetTelemetrySubscriptionsRequests with unknown CLIENT_INSTANCE_IDs. |
| ClientMetricsThrottleCount | Meter | ClientMetrics | client_instance_id | Total number of throttled PushTelemetryRequests due to a higher PushTelemetryRequest rate than the allowed PushIntervalMs. |
| ClientMetricsPluginExportCount | Meter | ClientMetrics | client_instance_id | The total number of metrics requests being pushed to metrics plugins, e.g., the number of exportMetrics() calls. |
| ClientMetricsPluginErrorCount | Meter | ClientMetrics | client_instance_id, reason (reason for the failure) | The total number of exceptions raised from the plugin's exportMetrics(). |
| ClientMetricsPluginExportTime | Histogram | ClientMetrics | client_instance_id | Amount of time the broker spends invoking the plugin's exportMetrics() call. |
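
To make the ClientMetricsThrottleCount definition concrete, the sketch below counts a push as throttled when it arrives sooner than PushIntervalMs after the previous accepted push from the same client instance. The class `PushRateTracker` and its fields are hypothetical, and the rule is a simplifying assumption; the actual broker logic may differ, for example in how it treats a terminating client's final push.

```python
class PushRateTracker:
    """Hypothetical broker-side helper that flags pushes arriving too early."""

    def __init__(self, push_interval_ms):
        self.push_interval_ms = push_interval_ms
        self.last_push_ms = {}    # client_instance_id -> time of last accepted push
        self.throttle_count = {}  # client_instance_id -> number of throttled pushes

    def on_push(self, client_instance_id, now_ms):
        """Return True if the push is accepted, False if it counts as throttled."""
        last = self.last_push_ms.get(client_instance_id)
        if last is not None and now_ms - last < self.push_interval_ms:
            # Arrived before the push interval elapsed: count it per client instance.
            count = self.throttle_count.get(client_instance_id, 0)
            self.throttle_count[client_instance_id] = count + 1
            return False
        self.last_push_ms[client_instance_id] = now_ms
        return True
```

Keeping the counter keyed by client instance id mirrors the client_instance_id tag on the metric, so a single misbehaving client is visible without affecting the counts of others.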

Security

Since client metric subscriptions are primarily aimed at the infrastructure operator managing the Kafka cluster, it should be sufficient to limit the config control operations to the CLUSTER resource.

...