...
What about RocksDB upgrades? If we bump the RocksDB version between Flink versions, do we support recovering from a native-format snapshot (incremental checkpoint)?
Proposal
...
Canonical Savepoint | Native Savepoint | Aligned Checkpoint | Unaligned Checkpoint |
Statebackend change |
Self-contained and relocatable |
State Processor API | (???) | (???) |
Schema evolution | (???) | (???) | (???) |
Flink minor (1.x → 1.y) version upgrade |
Flink bug/patch (1.14.x → 1.14.y) version upgrade | (change) | (change) |
Arbitrary job upgrade (changed graph shape/record types) |
Job upgrade w/o changing graph shape and record types |
Rescaling |
...
change |
...
Proposal 2
Canonical Savepoint | Native Savepoint | Aligned Checkpoint | Unaligned Checkpoint |
Statebackend change |
Self-contained and relocatable |
State Processor API | (???) | | (???) |
Schema evolution | (???) | (???) | (???) | ||||
Flink minor (1.x → 1.y) version upgrade | (change) | |||||||
Flink bug/patch (1.14.x → 1.14.y) version upgrade | (change) | (change) | ||||||
Arbitrary job upgrade (changed graph shape/record types) | (change) | |||||||
Job upgrade w/o changing graph shape and record types | (change) | (change) | ||||||
Rescaling |
...
- In Flink 1.15, the `--native` savepoint mode is added, but `--canonical` remains the default.
- In Flink 1.16, `--native` will become the new default.
Rejected Alternatives
Rejected proposal for checkpoint guarantees
Canonical Savepoint | Native Savepoint | Aligned Checkpoint | Unaligned Checkpoint | |
Statebackend change | ||||
Self-contained and relocatable | ||||
State Processor API | (???) | (???) | ||
Schema evolution | (???) | (???) | (???) | |
Flink minor (1.x → 1.y) version upgrade | ||||
Flink bug/patch (1.14.x → 1.14.y) version upgrade | (change) | (change) | ||
Arbitrary job upgrade (changed graph shape/record types) | ||||
Job upgrade w/o changing graph shape and record types | ||||
Rescaling |
The main aim of the first proposal was to unify the guarantees between the two types of savepoints and the two types of checkpoints. The only difference between native and canonical savepoints would be the ability to change the state backend, and officially there would be no difference between aligned and unaligned checkpoints. This would simplify the documentation, since the distinction between unaligned and aligned checkpoints would not need to be documented.
It was rejected because native savepoints and aligned checkpoints are essentially the same thing, so there is little sense in artificially weakening the guarantees of aligned checkpoints.