Transactional Messaging in Kafka

Kafka currently provides at-least-once messaging guarantees. Duplicates can arise due to either producer retries or consumer restarts after failure. One way to provide exactly-once messaging semantics is to implement an idempotent producer. This has been covered at length in the proposal for an Idempotent Producer. An alternative and more general approach is to support transactional messaging. This can enable use-cases such as replicated logging for transactional data services in addition to the classic idempotent producer use-cases.

What is transactional messaging?

Producers can explicitly initiate transactional sessions, send (transactional) messages within those sessions and either commit or abort the transaction. The guarantees that we aim to achieve for transactions are perhaps best understood by enumerating the functional requirements.

A consumer's application should not be exposed to messages from uncommitted transactions.
The broker cannot lose any committed transactions.
There should be no duplicate messages within transactions.
Transaction ordering within a partition: A transaction-aware consumer should see transactions in the original transaction-order within each partition.
Interleaving: Each partition should be able to accept messages from both transactional and non-transactional producers.
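
As an illustration of the session lifecycle described above (initiate, send, then commit or abort), here is a minimal in-memory sketch. It is a toy model of the intended semantics only, not the proposed implementation: the class `TransactionalSession`, its method names, and the `visible_log` list are hypothetical names chosen for this example, and staging messages in the producer is an assumption made purely to keep the sketch short.

```python
class TransactionalSession:
    """Toy in-memory model of a producer's transactional session (hypothetical API)."""

    def __init__(self, visible_log):
        self.visible_log = visible_log  # what consumers are allowed to read
        self._staged = None             # None means no transaction is open

    def begin(self):
        if self._staged is not None:
            raise RuntimeError("a transaction is already open")
        self._staged = []

    def send(self, message):
        if self._staged is None:
            raise RuntimeError("send() called outside a transaction")
        self._staged.append(message)

    def commit(self):
        if self._staged is None:
            raise RuntimeError("no open transaction to commit")
        # All messages of the transaction become visible together.
        self.visible_log.extend(self._staged)
        self._staged = None

    def abort(self):
        if self._staged is None:
            raise RuntimeError("no open transaction to abort")
        # Staged messages are dropped and never exposed to consumers.
        self._staged = None


log = []
session = TransactionalSession(log)
session.begin()
session.send("debit:alice:10")
session.send("credit:bob:10")
print(log)   # [] (nothing is visible before the commit)
session.commit()
print(log)   # ['debit:alice:10', 'credit:bob:10']
```

In this model a consumer reading `visible_log` is never exposed to messages from an uncommitted or aborted transaction, which is the first requirement in the list.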

If interleaving of transactional and non-transactional messages is allowed, then the relative ordering of non-transactional and transactional messages will be based on the relative order of append (for non-transactional messages) and final commit (for the transactional messages).

In the above diagram, partitions p0 and p1 receive messages for transactions X1 and X2, as well as non-transactional messages. The timeline shows the time at which the messages arrive at the broker. Since X2 is committed first, each partition will expose messages from X2 before those from X1. Since the non-transactional messages arrived before the commits for X1 and X2, those messages will be exposed before messages from either transaction.
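
The ordering rule can be made concrete with a small simulation of a single partition. This is only a sketch of the rule as stated: the function `exposure_order`, the event tuple format, and the sample arrival sequence for p0 are all hypothetical and are not taken from the diagram.

```python
from collections import defaultdict


def exposure_order(events):
    """Return payloads in the order a transaction-aware consumer would see them.

    `events` lists what reaches one partition, in broker arrival order:
      ("msg", txn_id_or_None, payload)  a message; txn_id None = non-transactional
      ("commit", txn_id)                the transaction's final commit
      ("abort", txn_id)                 the transaction is abandoned
    """
    pending = defaultdict(list)  # txn id -> messages held back until commit
    exposed = []
    for event in events:
        kind = event[0]
        if kind == "msg":
            _, txn, payload = event
            if txn is None:
                exposed.append(payload)       # ordered by time of append
            else:
                pending[txn].append(payload)  # ordered by time of commit
        elif kind == "commit":
            exposed.extend(pending.pop(event[1], []))
        elif kind == "abort":
            pending.pop(event[1], None)       # never exposed
        else:
            raise ValueError("unknown event kind: %r" % (kind,))
    return exposed


# Hypothetical arrival sequence: X1 starts first, but X2 commits first.
p0 = [
    ("msg", "X1", "x1-a"),
    ("msg", "X2", "x2-a"),
    ("msg", None, "m1"),
    ("msg", "X1", "x1-b"),
    ("commit", "X2"),
    ("commit", "X1"),
]
print(exposure_order(p0))  # ['m1', 'x2-a', 'x1-a', 'x1-b']
```

The non-transactional message `m1` is exposed first because it was appended before either commit, and X2's message precedes X1's because X2 committed first, even though X1 began sending earlier.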

Furthermore, we have the following implementation requirements:

The implementation should be scalable. E.g., a dedicated log per transaction is unacceptable.
Performance:
1. The throughput of a transactional producer should be comparable to that of a non-transactional producer.
2. Acceptable latency. E.g., avoid copying the transactional data as much as possible.
3. Any implementation should not make the partition unavailable (say, due to locking) for an unreasonable period of time.
Client simplicity: Favor a scheme that lends itself to a simpler client-side implementation (even if it adds more complexity to the broker). For example, it is acceptable (but not ideal) for a consumer implementation to (internally) buffer and subsequently discard messages from uncommitted transactions. This would be needed if the chosen implementation allows the broker to materialize messages from uncommitted transactions in the data logs.

Finally, it is worth adding that any implementation should also provide the ability to associate each transaction's input state with the transaction itself. This is necessary to facilitate retries: if a transaction needs to be aborted and retried, then the entire input for that transaction needs to be replayed.
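
To show why the input must stay associated with the transaction, the following sketch retries a failed transaction by replaying its complete input from the start. It is a simplified model under stated assumptions: `run_with_retries`, its parameters, and the `flaky_double` helper are hypothetical, the input is assumed to fit in a list, and any exception is treated as a reason to abort and retry.

```python
def run_with_retries(inputs, process, commit, max_attempts=3):
    """Process `inputs` as one transaction, replaying all of them on each retry.

    inputs:  the transaction's complete input, retained until the commit succeeds
    process: maps one input item to one output message (may raise)
    commit:  publishes the full list of outputs in one step
    Returns the number of the attempt that committed.
    """
    for attempt in range(1, max_attempts + 1):
        outputs = []                     # partial outputs of earlier attempts are gone
        try:
            for item in inputs:          # replay from the start of the input
                outputs.append(process(item))
        except Exception:
            continue                     # abort this attempt, then retry
        commit(outputs)
        return attempt
    raise RuntimeError("transaction failed after %d attempts" % max_attempts)


calls = {"n": 0}


def flaky_double(x):
    """Hypothetical processing step that fails once, on its second call."""
    calls["n"] += 1
    if calls["n"] == 2:
        raise IOError("transient failure")
    return x * 2


published = []
attempts = run_with_retries([1, 2, 3], flaky_double, published.extend)
print(attempts, published)  # 2 [2, 4, 6]
```

The first attempt fails partway through and publishes nothing. The second attempt can only produce the complete output because the whole input `[1, 2, 3]` was still available to replay.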
