Status

Current state: Accepted

Discussion thread: here

Voting thread: here

JIRA: KAFKA-12620

...

With KIP-631, the controller no longer uses ZooKeeper for persistence and instead uses the KRaft metadata log. 

This KIP aims to solve both problems by introducing a new RPC and a new method of generating Producer ID blocks using the metadata log as storage.

Public Interfaces

New AllocateProducerIds RPC to be used by brokers to request a new block of IDs from the controller. The use of this RPC in ZK mode is enabled by selecting an IBP of 3.0-IV0 or higher. This RPC is always used in KRaft mode.


Code Block
AllocateProducerIdsRequest => BrokerId BrokerEpoch
  BrokerId => int32
  BrokerEpoch => int64


Code Block
AllocateProducerIdsResponse => ErrorCode ProducerIdStart ProducerIdLen
  ErrorCode => int16
  ProducerIdStart => int64
  ProducerIdLen => int32


ProducerIdStart is inclusive. E.g., if the controller generates 1000 IDs starting from zero, it will return ProducerIdStart=0 and ProducerIdLen=1000, which represents IDs 0 through 999.
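The start/length convention can be illustrated with a small sketch. The class name ProducerIdBlock and its helper methods are hypothetical and exist only for illustration; only the two field names come from the response schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProducerIdBlock:
    """Hypothetical broker-side view of the block carried in the response."""

    producer_id_start: int  # ProducerIdStart: first ID in the block (inclusive)
    producer_id_len: int    # ProducerIdLen: number of IDs in the block

    @property
    def last_producer_id(self) -> int:
        # The last usable ID is derived from the start and the length.
        return self.producer_id_start + self.producer_id_len - 1

    def contains(self, producer_id: int) -> bool:
        return self.producer_id_start <= producer_id <= self.last_producer_id


block = ProducerIdBlock(producer_id_start=0, producer_id_len=1000)
print(block.last_producer_id)  # 999
```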

Possible errors could be:

...

An authorization error will be considered fatal and should cause the broker to terminate. This indicates the broker is incorrectly configured to communicate with the controller. All other error types should be treated as transient and the broker should retry.

The AllocateProducerIds RPC should be rate limited using the existing throttling mechanism. This will help guard against malicious or malfunctioning clients.

...

A new metadata record to be used by the controller in the KRaft metadata log

Code Block
AllocateProducerIdsRecord => BrokerId BrokerEpoch ProducerIdEnd
  BrokerId => int32
  BrokerEpoch => int64
  ProducerIdEnd => int64


Proposed Changes

Controller

In both ZK and KRaft modes, the controller will now be responsible for generating new blocks of IDs and persisting the latest generated block. In ZK mode, the controller will use the existing ZNode and JSON format for persistence. In KRaft mode, the controller will commit a new record to the metadata log. Since the controller (in either mode) uses a single threaded event model, we can simply calculate the next block of IDs based on what is currently in memory. The controller will need to persist the generated PID block so it can be “consumed” and never used again.

It will be the responsibility of the controller to determine the PID block size. We will use a block size of 1000, as the current implementation does. We include the block start and length in the RPC to accommodate possible changes to the block size in the future.
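The controller-side allocation described above could look roughly like the following sketch. It assumes a single-threaded caller, so no locking is shown. The class name, the persist callback, and the reading of ProducerIdEnd as the last ID of the block (inclusive) are assumptions made for illustration, not part of this proposal.

```python
from typing import Callable, List, Tuple

BLOCK_SIZE = 1000  # block size stated in this KIP


class ProducerIdAllocator:
    """Hypothetical controller-side allocator driven by a single event thread."""

    def __init__(self, persist: Callable[[int, int, int], None],
                 next_producer_id: int = 0) -> None:
        # persist(broker_id, broker_epoch, producer_id_end) stands in for
        # writing the ZNode (ZK mode) or committing a record (KRaft mode).
        self._persist = persist
        self._next_producer_id = next_producer_id

    def allocate(self, broker_id: int, broker_epoch: int) -> Tuple[int, int]:
        start = self._next_producer_id
        end = start + BLOCK_SIZE - 1  # assumed inclusive
        # Persist before handing the block out so it is never reused.
        self._persist(broker_id, broker_epoch, end)
        self._next_producer_id = end + 1
        return start, BLOCK_SIZE  # (ProducerIdStart, ProducerIdLen)


records: List[Tuple[int, int, int]] = []
allocator = ProducerIdAllocator(lambda b, e, end: records.append((b, e, end)))
print(allocator.allocate(broker_id=1, broker_epoch=5))  # (0, 1000)
print(allocator.allocate(broker_id=2, broker_epoch=7))  # (1000, 1000)
```

Because the next start is derived from in-memory state and the block is persisted before it is returned, two brokers can never receive overlapping ranges in this sketch.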

The AllocateProducerIdsRecord will consume 20 bytes plus record overhead. Since the established upper bound on Producer ID is Long.MAX_VALUE, the required storage could theoretically be in the petabyte range (assuming 1000 IDs per block and no truncation of old records). However, in practice, we will truncate old metadata records, and such excessive producer ID generation is unlikely.
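The petabyte estimate can be checked with a short calculation. It counts only the 20 payload bytes per record (4 for BrokerId, 8 for BrokerEpoch, 8 for ProducerIdEnd) and ignores record overhead, so it is a lower bound on the theoretical worst case.

```python
LONG_MAX = 2**63 - 1       # Java's Long.MAX_VALUE
BLOCK_SIZE = 1000          # IDs per block
RECORD_BYTES = 4 + 8 + 8   # int32 + int64 + int64, excluding record overhead

blocks = LONG_MAX // BLOCK_SIZE
total_bytes = blocks * RECORD_BYTES
petabytes = total_bytes / 10**15

print(f"{blocks} blocks, about {petabytes:.0f} PB")
# 9223372036854775 blocks, about 184 PB
```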

...

Metadata snapshots will need to include the latest Producer ID block that was committed to the metadata log. Since we only need the latest block, the impact on the size of the snapshots is trivial.

Broker

The broker code becomes quite simple with this change. Rather than implementing the logic to allocate the next block of PIDs and dealing with the semantics of ZooKeeper, the broker will now defer to the controller for allocating blocks of PIDs. This is done by sending an AllocateProducerIdsRequest to the controller using an inter-broker channel. If the request fails, the broker will retry for certain transient errors. The broker should use the ID block returned by the RPC (as opposed to waiting to replicate the resulting AllocateProducerIdsRecord metadata record).
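A minimal sketch of the broker-side retry behavior follows. The two exception classes, the send_request callable, and the retry limit and backoff are all hypothetical. How a real broker classifies errors and how long it keeps retrying is not specified here.

```python
import time
from typing import Callable, Tuple


class TransientAllocationError(Exception):
    """Hypothetical: an error the broker should retry."""


class FatalAllocationError(Exception):
    """Hypothetical: an error the broker should not retry."""


def fetch_block(send_request: Callable[[], Tuple[int, int]],
                max_attempts: int = 5,
                backoff_seconds: float = 0.0) -> Tuple[int, int]:
    """Return (ProducerIdStart, ProducerIdLen), retrying transient errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            # Use the block from the RPC response directly.
            return send_request()
        except TransientAllocationError:
            if attempt == max_attempts:
                raise
            time.sleep(backoff_seconds)
        # FatalAllocationError is not caught and propagates to the caller.
    raise RuntimeError("max_attempts must be at least 1")


attempts = []


def flaky_request() -> Tuple[int, int]:
    # Simulated controller: fails twice, then returns a block.
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientAllocationError("controller not available")
    return (0, 1000)


print(fetch_block(flaky_request))  # (0, 1000)
```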

Since there is now a Kafka network request involved in the ID block generation, we should consider pre-fetching blocks so a client is never waiting on an InitProducerIdRequest for too long. We will likely share an existing broker-controller channel for this RPC, so we cannot guarantee low or consistent latencies.
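One way to picture pre-fetching is the sketch below. The class name, the fetch_block callable, and the prefetch_remaining threshold are assumptions for illustration. The fetch is done synchronously here to keep the example short, whereas a real broker would presumably issue it in the background.

```python
from typing import Callable, Optional, Tuple


class PrefetchingIdManager:
    """Hypothetical broker-side manager that fetches the next block early."""

    def __init__(self, fetch_block: Callable[[], Tuple[int, int]],
                 prefetch_remaining: int = 100) -> None:
        self._fetch_block = fetch_block
        self._prefetch_remaining = prefetch_remaining  # assumed threshold
        self._start, self._length = fetch_block()
        self._offset = 0
        self._next_block: Optional[Tuple[int, int]] = None

    def next_producer_id(self) -> int:
        if self._offset >= self._length:
            # Current block exhausted: switch to the pre-fetched block,
            # or fetch one now if pre-fetching has not happened yet.
            if self._next_block is None:
                self._next_block = self._fetch_block()
            self._start, self._length = self._next_block
            self._next_block = None
            self._offset = 0
        producer_id = self._start + self._offset
        self._offset += 1
        remaining = self._length - self._offset
        if self._next_block is None and remaining <= self._prefetch_remaining:
            # Running low: request the next block before it is needed.
            self._next_block = self._fetch_block()
        return producer_id


fetched = []


def fake_fetch() -> Tuple[int, int]:
    # Simulated controller handing out consecutive blocks of 10 IDs.
    start = len(fetched) * 10
    fetched.append(start)
    return (start, 10)


manager = PrefetchingIdManager(fake_fetch, prefetch_remaining=1)
ids = [manager.next_producer_id() for _ in range(25)]
print(ids[:3], ids[-1], len(fetched))  # [0, 1, 2] 24 3
```

With this arrangement a client request only waits on the controller if the current block runs out before the pre-fetch has completed.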

...

In order to upgrade from the ZK-based ID block generation, we will need to ensure that the ID blocks generated by the quorum controller do not overlap with those previously generated by ZK. This can be done by reading the latest producer ID block from ZK and generating an equivalent record in the metadata log. This will need to be incorporated into the overall KRaft upgrade plan once that is available.


Rejected Alternatives

Uncoordinated ID generation

One alternative would be to attempt to pre-allocate IDs in an uncoordinated manner using the broker ID. A scheme like this could probably be made to work, but might be hard to reconcile with previously ZK-generated ID ranges. It also means that some care would need to be taken to account for brokers that might be added in the future. This would also require that the brokers use some durable storage for their own ID ranges. Using the controller is not much more complicated than an uncoordinated approach (possibly it's even simpler), and using the controller makes it easier to ensure correctness.

Idempotent RPC 

Another design consideration was whether or not the AllocateProducerIdsRequest should be idempotent. If we supported this, the broker would need additional local state and the controller would need additional in-memory state and logic. This approach could help guard against bugs where a broker rapidly requests ID blocks, but it also opens up a new class of bugs where the same producer IDs are returned more than once (possibly to different brokers). Since the ID space is quite large, and this RPC will be rate limited, we can forgo this additional logic and the complexity that comes with it.