Status

Current state: Voting (voting thread)

...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

KIP-664: Provide tooling to detect and abort hanging transactions provided tooling to get visibility into the transactional and idempotent producers that the broker keeps track of. This KIP proposes to add a ProducerIdCount metric that enables easy monitoring of transactional and idempotent producer counts on the broker.

Producer ids are used by idempotent and transactional producers. The brokers keep a small amount of metadata (e.g. producer id, epoch, sequence number, etc.) in memory for every partition that an idempotent producer has produced to. This metadata is maintained on every replica, and it is recovered from logs and snapshots when brokers restart. KIP-98 - Exactly Once Delivery and Transactional Messaging has details on producer ids and the related protocols and data structures.

For idempotent producers, a new producer id is created whenever a KafkaProducer is created. A badly written application may frequently create new KafkaProducer objects. This is not optimal in general, but for idempotent producers specifically, doing so pollutes broker memory with producer ids and related metadata. Even though the metadata for each producer id is small, creating too many producer ids could run brokers out of memory.

The ProducerIdCount metric can be used to set up alerts so that this pattern can be proactively detected and action can be taken before too many producer ids run the broker out of memory.

Public Interfaces

We propose adding a new broker metric:

Name: kafka.server:type=ReplicaManager,name=ProducerIdCount
Description: The total number of active transactional / idempotent producers in the broker.
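As an illustration of how the broker-level metric could feed an alert, the following is a minimal sketch that reads the gauge over JMX and compares it against a threshold. It is not part of the proposal. It assumes the gauge is exposed as a numeric "Value" attribute on the MBean named above, which is how Kafka's other gauges appear over JMX. The class name ProducerIdCountAlert, the threshold, and the StubGauge stand-in (registered locally so the sketch runs without a broker) are all hypothetical.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

class ProducerIdCountAlert {
    // MBean name as listed under Public Interfaces.
    static final String METRIC_NAME =
        "kafka.server:type=ReplicaManager,name=ProducerIdCount";

    // Reads the gauge. Assumption: the count is a numeric "Value" attribute.
    static long readProducerIdCount(MBeanServerConnection conn) {
        try {
            Object value = conn.getAttribute(new ObjectName(METRIC_NAME), "Value");
            return ((Number) value).longValue();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read " + METRIC_NAME, e);
        }
    }

    // True when the observed count is above the alert threshold.
    static boolean exceeds(long count, long threshold) {
        return count > threshold;
    }

    // Hypothetical stand-in for the broker's gauge, for local demonstration only.
    public interface StubGaugeMBean {
        long getValue();
    }

    public static class StubGauge implements StubGaugeMBean {
        private final long value;

        public StubGauge(long value) {
            this.value = value;
        }

        @Override
        public long getValue() {
            return value;
        }
    }

    // Registers (or replaces) the stand-in gauge on the local platform MBean server.
    static void registerStub(long value) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(METRIC_NAME);
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(new StubGauge(value), name);
        } catch (Exception e) {
            throw new IllegalStateException("Could not register stub gauge", e);
        }
    }

    public static void main(String[] args) {
        registerStub(1200L);
        long count = readProducerIdCount(ManagementFactory.getPlatformMBeanServer());
        long threshold = 1000L; // hypothetical value; tune per cluster
        if (exceeds(count, threshold)) {
            System.out.println("ALERT: ProducerIdCount=" + count + " exceeds " + threshold);
        }
    }
}
```

Against a real broker, the MBeanServerConnection would come from a remote JMX connector (for example via JMXConnectorFactory.connect) rather than the local platform server, and the threshold would depend on the broker's heap size and workload.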

Proposed Changes

Add the new metric to the ReplicaManager class.

Compatibility, Deprecation, and Migration Plan

  • No migration plan is needed because the metric is new.

Rejected Alternatives

Have a partition-level metric as well - this doesn't seem to be needed, as we can use KIP-664: Provide tooling to detect and abort hanging transactions for detailed debugging once alerted on the total producer id count on the broker.

Name the metric ProducerCount - this may be misleading, as producers without producer ids are not counted.

...