Comment: Add rationale for user-configurable transaction boundaries

...

Rejected because: this approach was significantly more complex, involved duplicating existing data, and provided little, if any, advantage over the current proposal.

Future Work

Finer control over offset commits

...

Non-configurable transaction boundaries

Summary: instead of allowing users to configure the transaction boundaries for their connector, define all transaction boundaries the same way (by interval, poll batch, connector definition, connector with a fallback of poll, etc.).

Rejected because: there is no one-size-fits-all strategy for transaction boundaries that can be expected to accommodate every reasonable combination of connector and use case. Defining transactions on the batches returned from SourceTask::poll would heavily limit throughput for connectors that frequently produce small record batches, since each small batch would pay the full cost of a transaction commit. Defining transactions on an interval would add a latency penalty to the records at the beginning of each transaction and, in the case of very large transactions, would inflate the memory requirements of downstream consumers (which would have to buffer the entire transaction locally within each topic-partition before beginning to process any of the records in it). Finally, some connectors may have no reasonable way to define their own transaction boundaries at all.
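The trade-off between poll-based and interval-based boundaries can be illustrated with a toy model. The sketch below is not Connect runtime code: the class TransactionBoundarySketch, its Batch type, and its helper methods are hypothetical names invented for illustration, and it assumes that a poll-based boundary commits one transaction per non-empty batch while an interval-based boundary commits the open transaction once a fixed number of milliseconds has passed since it was opened. It counts the transactions each strategy would commit and reports the largest interval-based transaction, which is a rough proxy for how much a downstream consumer might have to buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not part of the Kafka Connect API.
final class TransactionBoundarySketch {

    /** One batch as it might be returned by a single poll call. */
    static final class Batch {
        final long timestampMs;
        final int recordCount;

        Batch(long timestampMs, int recordCount) {
            this.timestampMs = timestampMs;
            this.recordCount = recordCount;
        }
    }

    /** Builds evenly spaced batches starting at time 0. */
    static List<Batch> evenBatches(int count, long spacingMs, int recordsPerBatch) {
        List<Batch> batches = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            batches.add(new Batch(i * spacingMs, recordsPerBatch));
        }
        return batches;
    }

    /** Poll-style boundary (assumed): one transaction per non-empty batch. */
    static int countPollTransactions(List<Batch> batches) {
        int transactions = 0;
        for (Batch b : batches) {
            if (b.recordCount > 0) {
                transactions++;
            }
        }
        return transactions;
    }

    /**
     * Interval-style boundary (assumed): commit the open transaction once
     * intervalMs has elapsed since it was opened.
     * Returns {transaction count, record count of the largest transaction}.
     */
    static int[] simulateInterval(List<Batch> batches, long intervalMs) {
        int transactions = 0;
        int largest = 0;
        int open = 0;       // records in the currently open transaction
        long openedAt = 0;  // time the current transaction was opened
        for (Batch b : batches) {
            if (b.recordCount == 0) {
                continue;
            }
            if (open > 0 && b.timestampMs - openedAt >= intervalMs) {
                transactions++;                    // commit the open transaction
                largest = Math.max(largest, open);
                open = 0;
            }
            if (open == 0) {
                openedAt = b.timestampMs;          // open a new transaction
            }
            open += b.recordCount;
        }
        if (open > 0) {                            // commit whatever is left
            transactions++;
            largest = Math.max(largest, open);
        }
        return new int[] {transactions, largest};
    }

    public static void main(String[] args) {
        // 100 small batches of 2 records each, one every 10 ms.
        List<Batch> batches = evenBatches(100, 10, 2);
        int[] interval = simulateInterval(batches, 250);
        System.out.println("poll: " + countPollTransactions(batches) + " transactions");
        System.out.println("interval: " + interval[0] + " transactions, largest "
                + interval[1] + " records");
    }
}
```

With this input, the poll-style strategy commits 100 two-record transactions, while the 250 ms interval strategy commits 4 transactions of 50 records each. Widening the interval reduces the commit count further but increases both the wait for the first record in each transaction and the size of the transaction a consumer has to hold, which is the tension the paragraph above describes.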

Future Work

Standalone mode support

Because the design for atomic writes of source records and their offsets relies on source offsets being stored in a Kafka topic, standalone mode, which stores source offsets in a local file, is not eligible. If there is sufficient demand, we may add this capability to standalone mode in the future.

...