...

The motivations here are similar to KIP-854: Separate configuration for producer ID expiry. Idempotent producers became the default in Kafka with KIP-679: Producer will enable the strongest delivery guarantee by default unless otherwise specified at the client side. As a result, every producer instance is assigned a PID. The growing number of PIDs stored in Kafka brokers by ProducerStateManager exposes the broker to OOM errors if it has a high number of producers, or rogue or misconfigured client(s). When this happens, the broker hits OOM and goes offline. The only way to recover is to increase the heap.

KIP-854 added a separate config to expire PIDs independently of transactional IDs; however, the broker is still exposed to OOM if it accumulates a high number of PIDs before `producer.id.expiration.ms` is exceeded. Decreasing the value of `producer.id.expiration.ms` would impact all clients, which is not always desirable. It would be more beneficial to target only inefficient users and stop them from crowding the map of PIDs to their ProducerState held by ProducerStateManager.


This KIP proposes to throttle the number of PIDs at the leader of the partition by adding a new rate quota that is applied while handling the PRODUCE request. This way the broker can reject only risky users early in the process and protect itself without impacting well-behaved users.


Proposed Changes

We propose adding a new QuotaManager called ProducerIdQuotaManager at the PRODUCE request level in the Kafka API that limits the number of active PIDs per user (KafkaPrincipal). The number of active PIDs will be defined as a rate within a period of time (similar to the ControllerMutation quota).
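To make the idea of a per-user PID rate concrete, the following is a minimal sketch and not the KIP's actual ProducerIdQuotaManager. The class name `PidRateSketch`, the method `tryRecord`, and both constructor parameters are hypothetical, and the sketch assumes a simple sliding window, a principal identified by a plain string, and timestamps that never go backwards. It only illustrates counting distinct PIDs per user within a period of time.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; names and semantics are assumptions, not Kafka APIs.
final class PidRateSketch {
    private final int maxPidsPerWindow; // assumed quota: distinct PIDs allowed per window
    private final long windowMs;        // assumed window length in milliseconds

    // principal -> (PID -> time the PID was first counted), kept in insertion order
    private final Map<String, LinkedHashMap<Long, Long>> seen = new HashMap<>();

    PidRateSketch(int maxPidsPerWindow, long windowMs) {
        this.maxPidsPerWindow = maxPidsPerWindow;
        this.windowMs = windowMs;
    }

    /**
     * Returns true if a PRODUCE request from this principal and PID may proceed,
     * false if the principal already introduced too many PIDs in the current window.
     * Assumes nowMs is non-decreasing across calls.
     */
    synchronized boolean tryRecord(String principal, long producerId, long nowMs) {
        LinkedHashMap<Long, Long> pids =
            seen.computeIfAbsent(principal, p -> new LinkedHashMap<>());

        // Drop PIDs that were counted more than one window ago. Entries are in
        // insertion (time) order, so we can stop at the first one still in the window.
        Iterator<Map.Entry<Long, Long>> it = pids.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMs - it.next().getValue() >= windowMs) {
                it.remove();
            } else {
                break;
            }
        }

        if (pids.containsKey(producerId)) {
            return true; // PID already counted in this window
        }
        if (pids.size() >= maxPidsPerWindow) {
            return false; // a new PID would exceed the quota: throttle
        }
        pids.put(producerId, nowMs);
        return true;
    }
}
```

In this sketch a PID that ages out of the window is counted again if it reappears, and each principal is tracked independently, so one user exhausting its quota does not affect another. A real quota manager would also need to bound the memory used by the tracking structure itself.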

...