...
Old Discussion thread: https://lists.apache.org/thread.html/f44317eb6cd34f91966654c80509d4a457dbbccdd02b86645782be67@%3Cdev.kafka.apache.org%3E
JIRA: https://issues.apache.org/jira/browse/
Jira | ||||||
---|---|---|---|---|---|---|
|
PULL REQUEST: https://github.com/apache/kafka/pull/75288103
Motivation
Current log compaction is based on the server side view i.e. compacted based on record offset and the offset is by the order when the record was received on the broker side. So for the same key, only the highest offset record is kept after compaction so that Kafka is able to reconstruct the current state of the events in a "most recent snapshot" approach. The issue then occurs when the insertion order is not guaranteed, which causes the log compaction to keep the wrong state. This can be easily replicated when using a multi-threaded (or simply multiple) producer(s), or when sending the events asynchronously. The following is an example:
...