Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. Bumps up the epoch of the PID, so that the any previous zombie instance of the producer is fenced off and cannot move forward with its transaction.
  2. Recovers (rolls forward or rolls back) any transaction left incomplete by the previous instance of the producer.

Limitations

  • Distributed mode is required; standalone mode will not be supported (yet) for exactly-once source connectors
  • Exactly-once source support can only be enabled for all source connectors or none; it cannot be toggled on a per-connector basis
  • In order to be viable for exactly-once delivery, connectors must assign source partitions to at most one task at a time (otherwise duplicate writes may occur)
  • In order to be viable for exactly-once delivery, connectors must use the Connect source offset API for tracking progress (otherwise duplicate writes or dropped records may occur)

Addressed failure/degradation scenarios

...

Rolling upgrades that enable exactly-once support on a cluster will be possible. Users can stop each then do the following for each worker, one by one: stop the worker, upgrade to a later version if necessary, set set exactly.once.source.enabled to  to true in   in the config, then restart, one by one.

This will have the effect that, until the leader is upgraded, no upgraded workers will start source tasks. This is necessary in order to guarantee that a task count record has been written to the config topic before starting a task with a transactional producer.

...

Hard downgrades can be performed in the same way as soft downgrades. Because source task offsets on upgraded workers are still written to the worker’s global offsets topic, even if a downgraded worker does not support per-connector offsets topics, it will still pick up on relatively-recent source offsets for its connectors. Some of these offsets may be out-of-date or not match older than the ones in the connector’s separate offsets topic, but the only consequence of this would be duplicate writes by the connector, which will be possible on any cluster without exactly-once support enabled. Without any writes to the global offsets topic, all records processed by a connector since a switch to a dedicated offsets topic would be re-processed after the downgrade and would likely result in a flood of duplicates. While technically permissible given that the user in this case will have knowingly switched to a version of the Connect framework that doesn't support exactly-once source connectors (and is therefore susceptible to duplicate delivery of records), the user experience in this case could be quite bad, so a little extra effort on the part of Connect to significantly reduce the fallout of downgrades in this case is warranted.

...