...

In the diagram above, the sharp-edged boxes represent distinct machines. The rounded boxes at the bottom represent Kafka TopicPartitions, and the diagonally rounded boxes represent logical entities which run inside brokers.

Each arrow represents either an RPC or a write to a Kafka topic. These operations occur in the sequence indicated by the numbers next to each arrow. The sections below are numbered to match the operations in the diagram above, and each describes the operation in question.

1. Finding a transaction coordinator -- the GroupCoordinatorRequest

Since the transaction coordinator is at the center of assigning PIDs and managing transactions, the first thing a producer has to do is issue a GroupCoordinatorRequest to any broker to discover the location of its coordinator.

...

If the transaction.app.id configuration is set, this AppId is passed along with the InitPIDRequest, and the mapping to the corresponding PID is logged in the transaction log in step 2a. This enables us to return the same PID for the AppId to future instances of the producer, and hence enables recovering or aborting previously incomplete transactions.
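The AppId-to-PID mapping described above can be illustrated with a small toy model. This is only a sketch, not broker code: the class name `PidRegistry`, the method `init_pid`, the in-memory list standing in for the transaction log, and the sequential PID numbering are all hypothetical. It also assumes that a producer without an AppId simply receives a fresh PID that is not remembered.

```python
import itertools


class PidRegistry:
    """Toy model of the AppId -> PID mapping (hypothetical, not broker code)."""

    def __init__(self):
        # Assumption: PIDs are handed out sequentially, starting at 1.
        self._next_pid = itertools.count(1)
        self._pid_by_app_id = {}
        # Plain list standing in for the transaction log topic.
        self.transaction_log = []

    def init_pid(self, app_id=None):
        # Assumption: with no AppId there is nothing to remember,
        # so the producer just gets a fresh PID.
        if app_id is None:
            return next(self._next_pid)
        # A known AppId gets the PID that was logged for it earlier.
        if app_id in self._pid_by_app_id:
            return self._pid_by_app_id[app_id]
        # First time this AppId is seen: allocate a PID and log the mapping.
        pid = next(self._next_pid)
        self._pid_by_app_id[app_id] = pid
        self.transaction_log.append((app_id, pid))
        return pid
```

In this model, two producer instances that start with the same AppId receive the same PID, which is what would let a coordinator tie a restarted producer back to the transactions left incomplete by its predecessor.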

 In addition to returning the PID, the InitPIDRequest performs the following tasks:

...

5.3 Writing the final Commit or Abort Message

After all the commit or abort markers are written to the data logs, the transaction coordinator writes the final COMMITTED or ABORTED message to the transaction log, indicating that the transaction is complete (step 5.3 in the diagram). At this point, most of the messages pertaining to the transaction in the transaction log can be removed.
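The ordering in this step (markers first, final message second) can be sketched as follows. This is a minimal illustration under stated assumptions, not the coordinator's implementation: `CoordinatorSketch`, `marker_written`, the `"COMMIT"`/`"ABORT"` decision strings, and the list standing in for the transaction log are hypothetical names introduced here.

```python
# Maps the coordinator's decision to the final message named in the text.
FINAL_MESSAGE = {"COMMIT": "COMMITTED", "ABORT": "ABORTED"}


class CoordinatorSketch:
    """Toy model of step 5.3 (hypothetical, not the real coordinator)."""

    def __init__(self, decision, partitions):
        if decision not in FINAL_MESSAGE:
            raise ValueError(f"unknown decision: {decision!r}")
        self.decision = decision
        # Partitions whose data log has not yet received its marker.
        self.awaiting_marker = set(partitions)
        # Plain list standing in for the transaction log topic.
        self.transaction_log = []

    def is_complete(self):
        return bool(self.transaction_log)

    def marker_written(self, partition):
        """Record that the commit/abort marker reached one data log."""
        self.awaiting_marker.discard(partition)
        # The final message is written only once every marker is in place,
        # and only once, even if an acknowledgement is repeated.
        if not self.awaiting_marker and not self.is_complete():
            self.transaction_log.append(FINAL_MESSAGE[self.decision])
```

For a transaction spanning two partitions, the log stays empty after the first marker is acknowledged and gains a single COMMITTED (or ABORTED) entry after the second.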

...