Status
Current state: Under Discussion
Discussion thread:
...
Motivation
While implementing the JDBC exactly-once sink, I found that the current abstraction (TwoPhaseCommitSinkFunction) doesn’t suit this use case. To avoid code duplication, I propose a new abstraction with the following goals in mind:
- accommodate the needs of the existing Kafka sinks
- accommodate the needs of the new JDBC sink:
  - commits are retried in case of transient failures instead of failing the job
  - rollbacks are retried
  - transactions started during this run need to be distinguished from those restored from the state; commit failures (with reason “unknown”) are ignored for the latter; this is a consequence of the lack of timeouts
  - when committing a group of transactions: an option to stop committing as soon as one commit fails; otherwise consistency can be violated (if the failure was transient, the failed commit and all further commits will be retried later)
  - transaction timeouts aren’t used to ignore commit failures, as most DBs don’t support them
  - state will probably need to include all to-commit transactions (as a union list)
  - minor API changes required
- accommodate the needs of other 2PC sinks in the future; these could be the existing file sink, WAL, or potential DynamoDB and Pulsar sinks
- and non-sinks (see this question)
- support batch jobs, in which sinks may no longer be running when the job finishes and pre-committed checkpoints still need to be committed
- improve testability; currently, TwoPhaseCommitSinkFunction requires a lot of mocking
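The commit-related JDBC requirements above (retry on failure, ignore “unknown” failures for restored transactions, optionally stop at the first failure) can be illustrated with a minimal sketch. All names here (GroupCommitSketch, Txn, Committer, CommitResult, commitAll) are hypothetical and are not part of the proposed API; the handling of an “unknown” failure for a transaction started during this run is also an assumption, since the list above does not specify it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these names come from the proposal. */
final class GroupCommitSketch {

    /** Outcome of a single commit attempt (assumed classification). */
    enum CommitResult { SUCCESS, TRANSIENT_FAILURE, UNKNOWN_FAILURE }

    /** Transaction handle; `restored` is true if it was recovered from state. */
    static final class Txn {
        final String id;
        final boolean restored;

        Txn(String id, boolean restored) {
            this.id = id;
            this.restored = restored;
        }
    }

    /** Hypothetical committer, e.g. a thin wrapper around a database driver. */
    interface Committer {
        CommitResult commit(Txn txn);
    }

    /**
     * Commits transactions in order and returns those to retry later.
     * With stopOnFirstFailure, the failed transaction and all transactions
     * after it are returned without attempting the later ones, so that
     * commit order is preserved on retry.
     */
    static List<Txn> commitAll(List<Txn> txns, Committer committer, boolean stopOnFirstFailure) {
        List<Txn> toRetry = new ArrayList<>();
        for (int i = 0; i < txns.size(); i++) {
            Txn txn = txns.get(i);
            CommitResult result = committer.commit(txn);
            if (result == CommitResult.SUCCESS) {
                continue;
            }
            if (result == CommitResult.UNKNOWN_FAILURE && txn.restored) {
                // Restored transaction: it may already have been committed
                // before the restart, so the failure is ignored.
                continue;
            }
            // Assumption: every other failure is scheduled for a retry.
            toRetry.add(txn);
            if (stopOnFirstFailure) {
                toRetry.addAll(txns.subList(i + 1, txns.size()));
                break;
            }
        }
        return toRetry;
    }

    public static void main(String[] args) {
        List<Txn> txns = List.of(new Txn("a", true), new Txn("b", false), new Txn("c", false));
        // Fake committer: "a" fails with unknown reason, "b" fails transiently.
        Committer fake = t -> t.id.equals("a") ? CommitResult.UNKNOWN_FAILURE
                : t.id.equals("b") ? CommitResult.TRANSIENT_FAILURE
                : CommitResult.SUCCESS;
        for (Txn t : commitAll(txns, fake, true)) {
            System.out.println("retry later: " + t.id);
        }
    }
}
```

Because the committer is a plain interface, the retry policy can be exercised with a fake implementation, as in `main`, without mocking any sink machinery.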
...
This enables independent customization of the various aspects and finer-grained testing.
Extracting the 2PC Resource also allows running it outside of a Sink (this might be needed for batch jobs to commit the final pre-committed transactions when Tasks are no longer running).
Serialization can be viewed as an implementation detail of StateHandler, though an API to build it, or some default implementation, should be provided.
...