...

Each partition on a broker maintains a list of the 5 most recently appended sequence numbers and their corresponding offsets for each producerId (a.k.a. PID). If a producer fails to receive a successful response and retries the produce request, the broker can still return the offset of the successful append from this cache via `RecordMetadata`. Because the cache is kept per producerId per partition, it consumes a lot of memory on each broker instance, which causes OOM in production. This KIP aims to reduce the memory used for producer state and the complexity of managing it on the broker side.

...

When the sequence number reaches Int.MaxValue, the client can wrap around and start again from 0.

On the broker side,

  • Brokers will return DUPLICATE_SEQUENCE_NUMBER for any sequence that is within 1000 of the latest sequence number (accounting for overflow). In this case, the broker won't return the offset of the appended batch.

  • Brokers will return OUT_OF_ORDER_SEQUENCE for any sequence that is more than 1000 away from the latest sequence number.

Note: 1000 is an arbitrary number; we can pick one that makes sense.
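The broker-side rules above could be sketched roughly as follows. This is only an illustration, not the proposed implementation: the class and method names are hypothetical, it tracks a single sequence number rather than a batch range, and it assumes that the sequence immediately following the latest one is accepted and that "within 1000" means at most 1000 behind the latest sequence, measured modulo the sequence space 0..Int.MaxValue.

```java
// Hypothetical sketch of the broker-side check; names are illustrative only.
class SequenceCheck {
    enum Result { ACCEPT, DUPLICATE_SEQUENCE_NUMBER, OUT_OF_ORDER_SEQUENCE }

    // Sequence numbers range over 0..Integer.MAX_VALUE, i.e. 2^31 values.
    static final long SEQUENCE_SPACE = 1L << 31;

    // Arbitrary window from the proposal; assumed here to be inclusive.
    static final int DUPLICATE_WINDOW = 1000;

    // Next sequence after seq, wrapping from Integer.MAX_VALUE back to 0.
    static int nextSequence(int seq) {
        return seq == Integer.MAX_VALUE ? 0 : seq + 1;
    }

    // Classify an incoming sequence against the latest appended sequence.
    static Result classify(int latest, int incoming) {
        if (incoming == nextSequence(latest)) {
            return Result.ACCEPT; // assumption: the in-order successor is appended
        }
        // How far behind the latest sequence the incoming one is, accounting for overflow.
        long behind = Math.floorMod((long) latest - incoming, SEQUENCE_SPACE);
        if (behind <= DUPLICATE_WINDOW) {
            return Result.DUPLICATE_SEQUENCE_NUMBER; // no offset is returned in this case
        }
        return Result.OUT_OF_ORDER_SEQUENCE;
    }

    public static void main(String[] args) {
        System.out.println(classify(5, 6));
        System.out.println(classify(5, Integer.MAX_VALUE - 10));
        System.out.println(classify(5, 7));
    }
}
```

Computing the distance modulo the sequence space means a retry of a sequence sent just before the wraparound is still recognized as a duplicate after the latest sequence has wrapped to a small value.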

On the client side,

  • When a client receives DUPLICATE_SEQUENCE_NUMBER, it will discard the error and move on.

  • When a client receives OUT_OF_ORDER_SEQUENCE, it will handle it as before, by retrying the send request.
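The client-side handling could look something like the sketch below. The enum and method names are hypothetical and do not correspond to the real producer internals; the sketch also assumes that "discard and move on" means treating the batch as already appended, with no offset available.

```java
// Hypothetical sketch of the client-side reaction to each error; names are illustrative only.
class ClientErrorHandling {
    enum ProduceError { NONE, DUPLICATE_SEQUENCE_NUMBER, OUT_OF_ORDER_SEQUENCE }

    enum Action { COMPLETE_WITH_OFFSET, COMPLETE_WITHOUT_OFFSET, RETRY }

    // Decide what to do with a batch given the error in the produce response.
    static Action onResponse(ProduceError error) {
        switch (error) {
            case DUPLICATE_SEQUENCE_NUMBER:
                // Assumption: the batch was already appended, so drop the error and move on.
                return Action.COMPLETE_WITHOUT_OFFSET;
            case OUT_OF_ORDER_SEQUENCE:
                // Unchanged behavior: retry the send request.
                return Action.RETRY;
            default:
                return Action.COMPLETE_WITH_OFFSET;
        }
    }

    public static void main(String[] args) {
        for (ProduceError error : ProduceError.values()) {
            System.out.println(error + " -> " + onResponse(error));
        }
    }
}
```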

Compatibility, Deprecation, and Migration Plan

With this proposal, the restriction on max.in.flight.requests.per.connection can be removed. The broker won't provide an offset along with a DUPLICATE_SEQUENCE_NUMBER error response unless the producer sets max.in.flight.requests.per.connection to 1.

This feature may need to document that the user must set this config parameter to 1 if the offset is required in `RecordMetadata`.
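As an illustration, a producer that needs the offset in `RecordMetadata` might be configured along these lines. The helper class is hypothetical and uses plain string keys with java.util.Properties so that it has no Kafka dependency; enabling idempotence is assumed, since sequence numbers only apply to idempotent producers.

```java
import java.util.Properties;

// Hypothetical helper that builds producer properties for users who need offsets.
class OffsetRequiredProducerConfig {
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Assumption: the producer is idempotent, so sequence numbers are in use.
        props.setProperty("enable.idempotence", "true");
        // One in-flight request per connection, as required when the offset is needed.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder address.
        System.out.println(build("localhost:9092"));
    }
}
```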

When talking to an older broker without this feature, the client can restrict the number of in-flight requests to 5 internally.

A mechanism may be needed for a client to tell whether a broker supports the new duplicate detection logic, such as bumping the Produce API version, so that upon connecting the client can tell whether the broker it talks to is an old version (i.e. limited to 5 in-flight sequence numbers).

...