Status

Current state: Accepted ([VOTE] KIP-129: Kafka Streams Exactly-Once Semantics)

Discussion thread: [DISCUSS] KIP-129: Kafka Streams Exactly-Once Semantics
JIRA: KAFKA-4923

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

The only public interface change proposed in this KIP is the addition of one new config to StreamsConfig:

 

processing.guarantee

Here are the possible values:


exactly_once: the processing of each record will be reflected exactly once in the application’s state even in the presence of failures.

at_least_once: the processing of each record will be reflected at least once in the application’s state even in the presence of failures.


Default: at_least_once
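
As an illustration, the config could be set like any other Streams property. The sketch below uses plain java.util.Properties with string keys so that it runs without the Kafka jars; the class name, the helper method, and the application id and broker address are hypothetical, and only the config name and its two values come from this KIP.

```java
import java.util.Properties;

final class ProcessingGuaranteeSketch {
    // Config name and allowed values as listed in this KIP.
    static final String PROCESSING_GUARANTEE = "processing.guarantee";
    static final String AT_LEAST_ONCE = "at_least_once";
    static final String EXACTLY_ONCE = "exactly_once";

    // Hypothetical helper: builds Streams properties with the requested guarantee
    // and rejects any value other than the two proposed here.
    static Properties streamsProps(String applicationId, String bootstrapServers, String guarantee) {
        if (!AT_LEAST_ONCE.equals(guarantee) && !EXACTLY_ONCE.equals(guarantee)) {
            throw new IllegalArgumentException("Unknown " + PROCESSING_GUARANTEE + ": " + guarantee);
        }
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty(PROCESSING_GUARANTEE, guarantee);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder application id and broker address.
        Properties props = streamsProps("eos-demo", "localhost:9092", EXACTLY_ONCE);
        System.out.println(PROCESSING_GUARANTEE + "=" + props.getProperty(PROCESSING_GUARANTEE));
    }
}
```

In a real application these properties would be passed to the Streams client in the usual way; leaving the property unset keeps the default of at_least_once.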

 

The tradeoff between these modes is that in exactly_once mode the output is wrapped in a transaction, so consumers running in read_committed mode will not see it until the task commits its output. The commit frequency is controlled by the existing config commit.interval.ms. This interval can be made arbitrarily small, down to processing each input record in its own transaction, though the more frequent the commits, the higher the transaction overhead.
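
The two sides of this tradeoff can be sketched as a pair of configurations: the Streams application chooses how often to commit, and a downstream consumer chooses whether to wait for committed output. The sketch assumes the consumer-side config is isolation.level with the value read_committed (the transactional consumer setting, which this KIP does not define), and the class and method names are hypothetical.

```java
import java.util.Properties;

final class CommitIntervalSketch {
    // Streams side: exactly_once output becomes visible to read_committed
    // consumers only when the task commits, so a smaller interval lowers
    // latency at the cost of more frequent transactions.
    static Properties streamsProps(long commitIntervalMs) {
        if (commitIntervalMs < 0) {
            throw new IllegalArgumentException("commit.interval.ms must be non-negative");
        }
        Properties props = new Properties();
        props.setProperty("processing.guarantee", "exactly_once");
        props.setProperty("commit.interval.ms", Long.toString(commitIntervalMs));
        return props;
    }

    // Consumer side (assumption: the isolation.level consumer config):
    // only committed transactional output is returned to the application.
    static Properties downstreamConsumerProps() {
        Properties props = new Properties();
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        // 100 ms is an arbitrary illustrative value, not a recommendation.
        System.out.println(streamsProps(100L));
        System.out.println(downstreamConsumerProps());
    }
}
```

With these settings, a read_committed consumer would see a given output record no sooner than the commit that follows it, so the commit interval acts as a rough upper bound on the added visibility delay.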

...

The proposed change should be backward compatible. Users can simply swap in the new jar as a runtime dependency and restart their application (after setting processing.guarantee to exactly_once) to get exactly-once semantics.

Rejected Alternatives

  • "producer per thread" design (including a discussion about pros/cons for both approaches)
    • https://docs.google.com/document/d/1CfOJaa6mdg5o7pLf_zXISV4oE0ZeMZwT_sG1QWgL4EE/edit
    • "Our overall proposal is, to implement KIP-129 as is as a “Stream EoS 1.0” version. The raised concerns are all valid, but hard to quantify at the moment. Implementing KIP-129, that provides a clean design, allows us to gain more insight in the performance implications. This enables us, to make an educated decision, if the “producer per task” model is performant enough (with or without any of the proposed improvements) or if a switch to a “producer per thread” model is mandatory. If we do decide to do the “producer per thread” design, the changes should be incremental and easily achievable."