...
- Use `DescribeProducers` to collect the set of ProducerIds which have transactions exceeding the max transaction timeout.
- Send `ListTransactions` to the available brokers to find the TransactionalIds associated with these ProducerIds.
- Finally, use `DescribeTransactions` to validate the transaction state and ensure it is safe to abort.
This proposal adds a command line tool which automates this analysis.
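The three analysis steps can be sketched over plain in-memory data. This is only an illustration, not the proposed tool: `ProducerState`, the function names, the shape of the listings, and the "safe to abort" criterion (the coordinator does not consider the partition part of an ongoing transaction) are all assumptions made for the example, and no real Kafka client calls are used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProducerState:
    """Hypothetical shape of one entry returned by DescribeProducers."""
    producer_id: int
    txn_start_ms: Optional[int]  # None when the producer has no open transaction


def find_hanging_candidates(producers, now_ms, max_txn_timeout_ms):
    """Step 1: ProducerIds whose open transaction exceeds the max timeout."""
    return {
        p.producer_id
        for p in producers
        if p.txn_start_ms is not None
        and now_ms - p.txn_start_ms > max_txn_timeout_ms
    }


def resolve_transactional_ids(candidates, listings):
    """Step 2: map candidate ProducerIds to TransactionalIds.

    `listings` holds one ListTransactions result per broker, each a list
    of (transactional_id, producer_id) pairs (assumed shape).
    """
    found = {}
    for broker_listing in listings:
        for transactional_id, producer_id in broker_listing:
            if producer_id in candidates:
                found[producer_id] = transactional_id
    return found


def safe_to_abort(coordinator_view, partition):
    """Step 3 (assumed criterion): the coordinator does not list this
    partition as part of an ongoing transaction."""
    return (coordinator_view["state"] != "Ongoing"
            or partition not in coordinator_view["partitions"])


producers = [ProducerState(1, 0), ProducerState(2, 950_000), ProducerState(3, None)]
candidates = find_hanging_candidates(producers, now_ms=1_000_000,
                                     max_txn_timeout_ms=900_000)
txn_ids = resolve_transactional_ids(candidates,
                                    [[("txn-a", 1)], [("txn-b", 2)]])
```

Here only producer 1 is flagged, because its transaction has been open longer than the assumed 15 minute limit; the result of `DescribeTransactions` for `txn-a` would then be passed to `safe_to_abort` before any abort is attempted.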
Recovery
The remaining problem to solve is how to safely abort a hanging transaction. We propose to extend the `WriteTxnMarkers` API so that it can be used by the Kafka AdminClient. Currently we use the coordinator epoch (which is the leader epoch of the associated `__transaction_state` partition) as a kind of concurrency control: partition leaders will not accept non-monotonic updates for a given `ProducerId`. We need to ensure that writes from the AdminClient do not interfere with this mechanism.
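To make the concurrency control concrete, the check on the partition leader can be modeled as follows. This is a simplified sketch, not broker code: the class and method names are hypothetical, and it assumes that a marker is rejected only when its coordinator epoch is strictly smaller than the last one accepted for the same `ProducerId`.

```python
class PartitionLeader:
    """Toy model of a partition leader's coordinator-epoch check."""

    def __init__(self):
        # Last coordinator epoch accepted per ProducerId.
        self._last_epoch = {}

    def try_write_marker(self, producer_id, coordinator_epoch):
        """Accept the marker only if the epoch does not go backwards."""
        last = self._last_epoch.get(producer_id, -1)
        if coordinator_epoch < last:
            return False  # non-monotonic update, rejected
        self._last_epoch[producer_id] = coordinator_epoch
        return True


leader = PartitionLeader()
accepted_first = leader.try_write_marker(producer_id=7, coordinator_epoch=5)
rejected_stale = leader.try_write_marker(producer_id=7, coordinator_epoch=4)
```

In this model, whatever epoch an AdminClient write carries becomes part of the leader's recorded state, which is why such writes must be designed so they neither get rejected as stale nor cause later markers from the real coordinator to be rejected.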
...