Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Note: This KIP was earlier published with the title "Atomic commit of source connector records and offsets", which corresponded to KAFKA-10000. It has now been expanded to also include general EoS support described in KAFKA-6080.

Motivation

It's been possible for sink connectors to implement exactly-once delivery for a while now, but and in systems with features like unique key constraints, even relatively easy. However, given Kafka's lack of such features that allow for easy remediation and/or prevention of duplicates, the same cannot be said for source connectors, and without framework-level adaptations, exactly-once delivery for source connectors remains impossible.

There are two key reasons for this:

Source offset accuracy: The Connect framework periodically writes source task offsets to an internal Kafka topic at a configurable interval, once the source record that they correspond to has been successfully sent to Kafka. This provides at-least-once delivery guarantees for most source connectors, but allows for the possibility of duplicate writes in cases where source records are written to Kafka but the worker is interrupted before it can write the offsets for those records to Kafka.

...