Status

Current state: Under Discussion

Discussion thread: tba

JIRA:

...

Discussion thread: https://lists.apache.org/thread/zw2crf0c7t7t4cb5cwcwjpvsb3r1ovz2
Vote thread: https://lists.apache.org/thread/tpyros2fl0howbtcb3fc54f7b7pjn1fw
JIRA: FLINK-25154
Release: 1.15


Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

If we do not claim the snapshot from which we are restoring, we should not reference any artefacts from it. The problem occurs only for incremental snapshots, because those are the only ones that reference increments from previous checkpoints. We can solve the issue by forcing the first checkpoint after a restore to be a “full checkpoint”.

NOTE (needs to be documented): Once the first checkpoint has completed successfully, the job no longer depends in any way on the snapshot used for restoring. Therefore, any number of jobs can be started from a single snapshot.
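The rule that the first checkpoint after a restore must be full can be sketched as a small decision function. This is only an illustrative toy model, not Flink code: the class `CheckpointPlanner`, its fields, and the string labels are hypothetical names introduced here. It assumes the only inputs are whether the restore was done without claiming the snapshot and how many checkpoints have completed since.

```python
from dataclasses import dataclass


@dataclass
class CheckpointPlanner:
    """Toy model (hypothetical): decides whether the next checkpoint may be incremental."""

    restored_without_claim: bool
    completed_since_restore: int = 0

    def next_checkpoint_kind(self) -> str:
        # Until one checkpoint has completed after a restore without claiming,
        # an increment would point back into a snapshot the job does not own,
        # so a full checkpoint is forced.
        if self.restored_without_claim and self.completed_since_restore == 0:
            return "full"
        return "incremental"

    def on_checkpoint_completed(self) -> None:
        # Only successful completions count; a failed attempt changes nothing.
        self.completed_since_restore += 1
```

In this sketch a failed first checkpoint leaves the counter at zero, so the next attempt is forced to be full again, which matches the requirement that the dependency on the restored snapshot ends only after a successful completion.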

NOTE: In this context, and throughout the document, “full checkpoints/snapshots” means checkpoints that do not cross-reference other snapshots. The term does not enforce any particular format. In particular, in the case of RocksDB, it could still be a set of SST files, which is what happens if we enable “incremental checkpoint” in RocksDB. We must make sure that all used files are uploaded independently. Thus, if we use the same files as the original snapshot, we must either re-upload or duplicate them.
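To illustrate what "uploaded independently" means, here is a minimal sketch under stated assumptions. The function names, the manifest layout, and the example paths are all hypothetical and do not correspond to Flink or RocksDB APIs. It plans a full checkpoint in which every live file ends up under the new checkpoint's own directory: files still shared with the restored snapshot are duplicated, and all others are uploaded.

```python
from typing import Dict, Iterable, List, Tuple

Action = Tuple[str, str, str]  # (kind, source, target)


def plan_full_checkpoint(
    live_files: Iterable[str],
    restored_locations: Dict[str, str],
    checkpoint_dir: str,
) -> Tuple[Dict[str, str], List[Action]]:
    """Return (manifest, actions) so that every live file lands under checkpoint_dir."""
    manifest: Dict[str, str] = {}
    actions: List[Action] = []
    for name in sorted(live_files):
        target = f"{checkpoint_dir}/{name}"
        if name in restored_locations:
            # Unchanged since the restore: copy it instead of referencing the
            # restored snapshot's file.
            actions.append(("duplicate", restored_locations[name], target))
        else:
            # Created after the restore: upload from local storage.
            actions.append(("upload", name, target))
        manifest[name] = target
    return manifest, actions


def is_self_contained(manifest: Dict[str, str], checkpoint_dir: str) -> bool:
    """A full checkpoint references nothing outside its own directory."""
    return all(loc.startswith(checkpoint_dir + "/") for loc in manifest.values())
```

For example, with live files `000012.sst` and `000015.sst`, where only the first came from the restored snapshot, the plan contains one `duplicate` action and one `upload` action, and the resulting manifest passes `is_self_contained`.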

...