...
In the discussion of KIP-185: Make exactly once in order delivery per partition the default producer setting, the following point regarding the OutOfOrderSequenceException
was raised:
- The
OutOfOrderSequenceException
indicates that there has been data loss on the broker.. ie. a previously acknowledged message no longer exists. For most part, this should only occur in rare situations (simultaneous power outages, multiple disk losses, software bugs resulting in data corruption, etc.). - However, there is another perfectly normal scenario where data is removed: in particular, data could be deleted because it is old and crosses the retention threshold.
- Hence, if a producer remains inactive for longer than a topic's retention period, we could get an
OutOfOrderSequence
which is a false positive: the data is removed through valid processes, and this isn't an error.
We would like to eliminate the possibility of getting spurious OutOfOrderSequenceExceptions
– when you get it, it should always mean data loss and should be taken very seriously.
Design
Essentially, we want to distinguish between the case where a producer's state is removed from the broker because the retention time has elapsed, and when the state is lost due to some problem in the system.
One solution is described here:
- When the producer metadata is removed from the
ProducerStateManager
on the broker due to retention, the nextProduceRequest
from the client will arrive with the existing producer id and with a non-zero sequence. Currently this results in anOutOfOrderSequenceException
returned by the broker, since the broker can't find any metadata and gets a non-zero sequence. This isn't strictly correct, and we propose introducing a newUnknownProducerException
and returning this instead. - The client can treat the
UnknownProducerException
as a non-fatal error and just reinitialize the producer and continue on it's merry way in most cases. - However, the above solution opens a hole: if the first write from the producer is actually lost (maybe due to a simultaneous power outage, multiple disk failures, etc.), we would not detect it. In particular, the first write with sequence = 0 is written, but then the records are lost on the broker. The next write with sequence=N would get an
UknownProducerException
and with the protocol above would simply be retried. Hence the fact that a message was lost would never be raised to the application. - We can solve the situation in (3), by keeping track of the last ack'd offset on the producer, and also returning the log start offset in each
ProduceResponse
. With these two pieces of information, we can be sure that anUknownProducerException
is valid if the log start offset returned along with the error code is greater than the last ack'd offset. This means that the front of the log has been truncated, causing the producer to become unknown. In this case, there is no unwanted data loss and the last batch can simply be retried. If we get anUnkownProducerException
but the log start offset is not greater than the last ack'd offset, then the record has been not been lost due to the retention period elapsing, and this should be treated as a fatal error. - With the changes above, an
OutOfOrderSequenceException
would always mean real data loss. AnUnkownProducerException
may mean some data loss.
Level of Effort
- Client side changes to track the last ack'd offset and correctly interpret an
UnknownProducerException
and either retry it or raise it as an error – 1 day. - Broker side changes to raise the
UnkownProducerException
– 0.25 days. - Updates to the protocol to return the
logStartOffset
per partition (with KIP) - 2 days. - System tests + Debugging - 2 days
Total : 1.25 weeks.
it was realized that we don't have graceful handling when an idempotence-enabled producer is writing to a broker with a message format older than v2 (ie. the 0.11.0 message format).
In particular, if we enable idempotence, any produce requests to topics with an older message format will fail with an UnsupportedVersionException
. Thus if the idempotent producer was to be made the default, the out of the box producer would fail to produce when used with clusters which haven't upgraded the message format yet.
This is particularly problematic since the recommended upgrade path is to upgrade broker code while keeping the message format at the older version, then upgrade all clients, and only finally upgrade the message format on the server. With the current behavior, the middle step is actually untenable if we enable idempotence as the default.
Design
To solve this problem, we propose introducing a new idempotence mode in 1.0.0, ie. enable.idempotence=requested
. The other modes would be enable.idempotence=off
and enable.idempotence=required.
In the requested
mode, idempotence would be best effort, ie. you get it if the broker supports it, otherwise no error is raised. In required
mode, requests would fail if the broker did not support idempotence. In off
mode, there is no idempotence (similar to the false
value today).
The new enable.idempotence=requested
mode would be introduced in the 1.0.0 release. In addition, the 1.0.0 client will disable idempotence if the message format of the destination topic doesn't support it.. In required mode, the client will raise an error if it detects an incompatible message format.
This means we would need to upgrade the TopicMetadata
in the MetadataResponse version to include the version of the message format for the topic in question.
Additionally, if a 1.0.0 client with enable.idempotence=requested
is writing to an 0.11.0 broker, idempotence will be disabled.
Finally, a 1.0.0 producer would interpret enable.idempotence=true
as enable.idempotence=required
and enable.idempotence=false
as enable.idempotence=off
.
Level of Effort
- Update MetadataResponse to include the message format version - 1 day
- Use the message format version to determine whether a partition supports idempotence - 5 days.
- Client side changes to send the idempotent producer mode, interpret old configurations of
enable.idempotence
– 0.5 days.
Total: 6.5