Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. A consumer's application should not be exposed to messages from uncommitted transactions.
  2. The broker cannot lose any committed transactions.
  3. There should be no duplicate messages within transactions.
  4. Transaction ordering within a partition: A transaction-aware consumer should see transactions in the original transaction-order within each partition.
  5. Interleaving: Each partition should be able to accept messages from both transactional and non-transactional producers

If interleaving of transactional and non-transactional messages is allowed, then the relative ordering of non-transactional and transactional messages will be based on the relative order of append (for non-transactional messages) and final commit (for the transactional messages).


 Image Added

So in the above diagram, partitions p0 and p1 receive messages for transactions X1 and X2, and non-transactional messages as well. The time-line is the time of arrival of the messages to the broker. Since X2 is committed first, each partition will expose messages from X2 before X1. Since the non-transactional messages arrived before the commits for X2 and X2, those messages will be exposed before messages from either transaction.

Furthermore, we have the following implementation requirements:

  1. The implementation should be scalable. E.g., a dedicated log per transaction is unacceptable.
  2. Performance:
    1. The throughput of a transactional producer should be comparable to that of a non-transactional producer.
    2. Acceptable latency. E.g., avoid copying the transactional data as much as possible.
    3. Any implementation should not make the partition unavailable (say, due to locking) for an unreasonable period of time.
  3. Client simplicity: Favor a scheme that lend to a simpler client-side implementation (even if it adds more complexity to the broker). For example, it is acceptable (but not ideal) for a consumer implementation to (internally) buffer and subsequently discard messages from uncommitted transactions. i.e., if the chosen implementation allows the broker to materialize messages from uncommitted transactions in the data logs.Interleaving: Each partition should be able to accept messages from both transactional and non-transactional producers.

Finally, it is worth adding that any implementation should also provide the ability to associate each transaction's input state with the transaction itself. This is necessary to facilitate retries for transactions - i.e., if a transaction needs to be aborted and retried, then the entire input for that transaction needs to be replayed.