
Status

Current state: "Under Discussion"

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Problem

I am an operator of a Tiered Storage Kafka cluster, and I would like to know whether interactions with the remote tier (uploads, building of auxiliary state, and deletions) are progressing at a steady pace.

For example, if uploads are not progressing at a constant rate, data will build up in local storage and I might need to take corrective action (like temporarily adding more storage). Likewise, if deletes are not progressing at a constant rate, this might indicate a problem with the retention settings of my topics, which I would like to remedy.

Similarly, if there are errors while building the auxiliary state of remote log segments or deleting remote log segments, it could indicate a problem with the Tiered Storage plugins, or the underlying storage.

Solution

To get this observability, we would like to expose remote-upload, remote-delete, and other metrics detailed in the table below from the point of view of Kafka. Since uploads and deletions are carried out by plugins, the metrics emitted by Kafka are best estimates of what it believes the state of the world to be. We propose that the new metrics be emitted at per-topic and per-partition granularity.

Other Tiered Storage metrics are emitted only at a topic level, but we would like to expose these at a partition level as well, because that will make it easy to track whether a slowdown is confined to a single broker or spread across the cluster.

Additionally, we would like to expose remote build auxiliary state error rate and remote delete error rate at a topic level.

We leave more detailed progress metrics to be emitted by the developers of Tiered Storage plugins.

Public Interfaces


New Metrics

MBean: kafka.server:type=BrokerTopicMetrics,name=TotalRemoteRecordsLag,topic=([-.\w]+),partition=([-.\w]+)
Description: The remote records lag of a topic-partition, defined as the number of records in non-active segments eligible for tiering but not yet uploaded to remote storage.

MBean: kafka.server:type=BrokerTopicMetrics,name=TotalRemoteBytesLag,topic=([-.\w]+),partition=([-.\w]+)
Description: The remote bytes lag of a topic-partition, defined as the number of bytes in non-active segments eligible for tiering but not yet uploaded to remote storage.

MBean: kafka.server:type=BrokerTopicMetrics,name=DeleteRemoteLag,topic=([-.\w]+),partition=([-.\w]+)
Description: The remote delete lag of a topic-partition, defined as the number of records in non-active segments marked for deletion but not yet deleted from remote storage.

MBean: kafka.server:type=BrokerTopicMetrics,name=RemoteDeleteRequestsPerSec,topic=([-.\w]+)
Description: The number of delete requests per second issued to remote storage for expired remote segments.

MBean: kafka.server:type=BrokerTopicMetrics,name=RemoteDeleteErrorsPerSec,topic=([-.\w]+)
Description: The number of delete requests per second for expired remote segments that resulted in errors.

MBean: kafka.server:type=BrokerTopicMetrics,name=BuildRemoteLogAuxStateRequestsPerSec,topic=([-.\w]+)
Description: The number of requests per second to rebuild the auxiliary state of a topic-partition.

MBean: kafka.server:type=BrokerTopicMetrics,name=BuildRemoteLogAuxStateErrorsPerSec,topic=([-.\w]+)
Description: The number of requests per second to rebuild the auxiliary state of a topic-partition that resulted in errors.

MBean: kafka.server:type=BrokerTopicMetrics,name=TotalRemoteLogSizeComputationTime,topic=([-.\w]+)
Description: The amount of time needed to compute the size of the remote log.

MBean: kafka.server:type=BrokerTopicMetrics,name=TotalRemoteLogSizeBytes,topic=([-.\w]+)
Description: The total size of a remote log in bytes.

MBean: kafka.server:type=BrokerTopicMetrics,name=TotalRemoteLogMetadataCount,topic=([-.\w]+)
Description: The total number of metadata entries for remote storage.
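
As an illustration of how these MBeans would be consumed, the snippet below reads one of the proposed gauges over JMX. This is a minimal sketch: the topic and partition values are placeholders, and it assumes the standard Yammer JMX reporter used by Kafka, which exposes a gauge through a single "Value" attribute.

    import java.lang.management.ManagementFactory;

    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class RemoteLagReader {
        public static void main(String[] args) throws Exception {
            // In-process for brevity; an operator tool would connect to the
            // broker's MBean server through a remote JMXConnector instead.
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName lag = new ObjectName(
                "kafka.server:type=BrokerTopicMetrics,name=TotalRemoteRecordsLag,"
                    + "topic=my-topic,partition=0");
            // Yammer gauges are exported with a single "Value" attribute.
            System.out.println(server.getAttribute(lag, "Value"));
        }
    }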

Proposed Changes

Constraints

The metric calculation should not acquire locks. If it did, it would contend with the archival, replication, and deletion paths of Tiered Storage.
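
For instance, a gauge backed by an atomic variable satisfies this constraint. The sketch below is illustrative only; the class and method names are hypothetical, not Kafka's actual code.

    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical sketch: the tiering path publishes the latest lag with a
    // single atomic write and the metrics reporter reads it with a single
    // atomic read, so neither side ever takes a lock.
    public class RemoteLagStats {
        private final AtomicLong remoteBytesLag = new AtomicLong();

        // Lock-free write, called from the archival/deletion paths.
        public void updateRemoteBytesLag(long lagBytes) {
            remoteBytesLag.set(lagBytes);
        }

        // Lock-free read, called by the metrics reporter.
        public long remoteBytesLag() {
            return remoteBytesLag.get();
        }
    }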

Details

The lag metrics should not be updated only on an archival/deletion cycle. If archiving or deletion fails to run for whatever reason, we should still see the latest state of records queuing up for archival/deletion.
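
One way to achieve this is to derive the lag from the current log state on every gauge read instead of caching a value written by the archival cycle. The sketch below illustrates the idea under assumed bookkeeping; the fields and update hooks are hypothetical, not Kafka's actual APIs.

    // Hypothetical sketch: the two volatile fields stand in for bookkeeping
    // the log layer already maintains about tiering progress.
    public class RemoteRecordsLagGauge {
        private volatile long lastTierableOffset; // end of newest non-active segment
        private volatile long lastTieredOffset;   // highest offset confirmed in remote storage

        // Updated when a segment rolls, independently of the archival cycle.
        public void onSegmentRoll(long endOffset) {
            lastTierableOffset = endOffset;
        }

        // Updated when an upload completes.
        public void onUploadComplete(long uploadedEndOffset) {
            lastTieredOffset = uploadedEndOffset;
        }

        // Computed on every read: the reported lag keeps growing as new
        // segments become eligible even if archival is stuck or failing.
        public long value() {
            return Math.max(0, lastTierableOffset - lastTieredOffset);
        }
    }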

Compatibility, Deprecation, and Migration Plan

These are new metrics and as such shouldn't have compatibility concerns.

Test Plan


Unit and integration tests.

Rejected Alternatives


  1. Emit the lag metrics only at a topic level - we rejected this alternative because it would not allow an operator to quickly understand whether a problem is isolated to individual brokers or widespread across the cluster.
  2. Update the lag metrics only on each archival/deletion cycle - we rejected this alternative because if the cycle does not run for whatever reason, the old metric values would keep being emitted even after new segments have become eligible for tiering.