...

  1. A cut can be consistent or inconsistent. It is prohibited to create a snapshot on an inconsistent cut.
  2. Restoring requires reading the WAL ahead for the last incremental snapshot:
    1. There are 2 records in the WAL for every consistent cut: IncrementalSnapshotStartRecord and IncrementalSnapshotFinishRecord.
    2. IncrementalSnapshotFinishRecord contains info about which transactions before IncrementalSnapshotStartRecord have to be excluded from the incremental snapshot.
    3. It is therefore important to read the WAL ahead, reach IncrementalSnapshotFinishRecord, and only after that apply entries since the previous Incremental Snapshot.
  3. In some circumstances it is impossible to create Incremental Snapshots anymore, and a full snapshot should be created (see below, limitations in Phase 1).
  4. Only one Incremental Snapshot can be created at a time; concurrent processes are not allowed.
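The exclusion rule above (transactions logged before the start record but listed in the finish record do not belong to the incremental snapshot) can be sketched as follows. `CutBounds` and `WalTx` are hypothetical stand-ins for illustration, not the real Ignite record classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of how IncrementalSnapshotFinishRecord's exclusion set
// filters transactions that were logged before IncrementalSnapshotStartRecord.
class CutBounds {
    record WalTx(long txId) {}

    /** Transactions seen before the start record, minus those the finish record
     *  marks as excluded, belong to the incremental snapshot. */
    static List<Long> snapshotTxs(List<WalTx> beforeStart, Set<Long> excludedByFinish) {
        List<Long> result = new ArrayList<>();
        for (WalTx tx : beforeStart)
            if (!excludedByFinish.contains(tx.txId()))
                result.add(tx.txId());
        return result;
    }
}
```

This is why the restore path must read the WAL ahead: the exclusion set only becomes known once the finish record is reached.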

...

Code Block
languagebash
// Create incremental snapshot.
// SNP - name of pre-existing full snapshot.
// [--retries N] - number of attempts to create the incremental snapshot in case of an inconsistent cut. Default is 3.
$ control.sh --snapshot create SNP --incremental [ --retries N ]
^ -- Incremental snapshot SNP_1640984400 created at 2022-01-01T00:00:00.

Under the hood this command:

  1. Makes checks:
    1. Base snapshot (at least its metafile) exists. Metafiles exist for all previous incremental snapshots.
    2. Validates that there are no missing WAL segments since the previous snapshot. SnapshotMetadata should contain info about the last WAL segment that contains snapshot data:
      1. If the snapshot is full: ClusterSnapshotRecord is written to the WAL; the segment number of this record is stored within the existing structure SnapshotMetadata.
      2. If the snapshot is incremental: the stored segment number is the segment that contains IncrementalSnapshotFinishRecord.
    3. Checks that the baseline topology is the same (relative to the base snapshot).
    4. Checks that the WAL is consistent (WAL was not disabled since the previous snapshot) - this info is stored in the MetaStorage.
  2. Starts a new Consistent Cut.
  3. On Consistent Cut finish:
    1. if the cut is consistent: logs IncrementalSnapshotFinishRecord with rolloverType=CURRENT_SEGMENT to enforce archiving the segment after logging the record.
    2. if the cut is inconsistent: skips logging IncrementalSnapshotFinishRecord and retries (starts a new Consistent Cut).
    3. fails if the retry attempts are exceeded.
  4. Awaits until the segment with IncrementalSnapshotFinishRecord has been archived and compacted.
  5. Collects WAL segments for the current incremental snapshot (from the previous snapshot to IncrementalSnapshotFinishRecord).
  6. Creates hardlinks to the compressed segments in the target directory.
  7. Writes meta files with a description of the new incremental snapshot:
    1. meta.smf:
      1. Pointer to IncrementalSnapshotFinishRecord.
    2. binary_meta, marshaller_data if they changed since the previous snapshot.
Code Block
languagebash
# Proposed directory structure
$ ls $IGNITE_HOME
db/
snapshots/
|-- SNP/
|---- db/
|---- increments/
|------ 0000000000000001/
|-------- node0.smf
|-------- db/
|---------- binary_meta/
|---------- marshaller/
|-------- wals/
|---------- 0000000000000000.wal.zip
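The retry behavior in steps 2-3 of the create flow above can be sketched as a simple loop. `CutRetry` and its `Supplier`-based cut runner are illustrative stand-ins, not Ignite internals.

```java
import java.util.function.Supplier;

// Sketch of the create flow's retry loop: run a Consistent Cut, retry while
// the cut is inconsistent, and fail once the attempts are exhausted.
class CutRetry {
    /** Returns true iff a consistent cut was achieved within maxRetries attempts. */
    static boolean createWithRetries(Supplier<Boolean> runCut, int maxRetries) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            if (runCut.get())
                return true;  // consistent: IncrementalSnapshotFinishRecord gets logged
            // inconsistent: the finish record is skipped, a new cut is started
        }
        return false;  // retry attempts exceeded: the snapshot command fails
    }
}
```

This mirrors the `--retries N` option of the create command: N failed cuts in a row fail the whole operation.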

Restore process

Code Block
languagebash
// Restore cluster on specific incremental snapshot
$ control.sh --snapshot restore SNP --increment 1

With the control.sh --snapshot restore command:

  1. User specifies the full snapshot name and the increment index.
  2. The base snapshot and the chain of incremental snapshots are resolved from them.
  3. Additionally to the full snapshot checks (already existing in SnapshotRestoreProcess) it checks the incremental snapshots:
    1. Checks that all WAL segments are present (from ClusterSnapshotRecord to the requested IncrementalSnapshotFinishRecord).
  4. After the full snapshot restore processes (prepare, preload, cacheStart) have finished, it starts another DistributedProcess - `walRecoveryProc`:
    1. Every node applies WAL segments since the base snapshot until it reaches the requested IncrementalSnapshotFinishRecord.
    2. Ignite should forbid concurrent operations (both read and write) on restored cache groups during WAL recovery.
    3. The process of applying data for snapshot cache groups (from the base snapshot) is similar to the GridCacheDatabaseSharedManager logical restore:
      1. disable WAL for the specified cache group
      2. find the `ClusterSnapshotRecord` related to the base snapshot
      3. start applying WAL updates with the striped executor (cacheGrpId, partId), applying the filter for versions in IncrementalSnapshotFinishRecord
      4. enable WAL for restored cache groups
      5. force a checkpoint and check the restore state (checkpoint status, etc.)
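The recovery filter applied in the WAL replay above can be sketched as a predicate over logical updates. `WalRecoveryFilter` and `Update` are simplified stand-ins for illustration, not Ignite's actual types.

```java
import java.util.Set;

// Illustrative stand-in for the WAL recovery filter: apply only updates that lie
// before IncrementalSnapshotFinishRecord and whose versions are not excluded by it.
class WalRecoveryFilter {
    record Update(long walIndex, long version) {}

    static boolean shouldApply(Update u, long finishRecordIndex, Set<Long> excludedVersions) {
        return u.walIndex() < finishRecordIndex && !excludedVersions.contains(u.version());
    }
}
```

Both conditions come from the restore steps: replay stops at the finish record, and versions listed in that record are skipped.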

Checking snapshot

Code Block
languagebash
// Check specific incremental snapshot
$ control.sh --snapshot check SNP --increment 1

With the control.sh --snapshot check command, the check includes the following steps on every baseline node:

  1. Check snapshot files are consistent:
    1. Snapshot structure is valid and metadata matches the actual snapshot files.
    2. All WAL segments are present (from ClusterSnapshotRecord to the requested IncrementalSnapshotFinishRecord).
  2. Check incremental snapshot data integrity:
    1. It parses WAL segments from the first incremental snapshot to the specified one (with the --increment param).
    2. For every partition it calculates hashes for entries and for entry versions.
      1. On the reduce phase it compares partitions hashes between primary and backup copies.
    3. For every pair of nodes that participated as primary nodes it calculates a hash of committed transactions. For example:
      1. There are two transactions:
        1. TX1, with 2 nodes participating in it as primary nodes: A and B
        2. TX2, with 2 nodes: A and C
      2. On node A it prepares 2 collections: TxHashAB = [hash(TX1)], TxHashAC = [hash(TX2)]
      3. On node B it prepares 1 collection: TxHashBA = [hash(TX1)]
      4. On node C it prepares 1 collection: TxHashCA = [hash(TX2)]
      5. On the reduce phase of the check it compares collections from all nodes and expects that:
        1. TxHashAB equals TxHashBA
        2. TxHashAC equals TxHashCA
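The pairwise hash exchange above can be sketched as a small map/reduce. `TxHashCheck` and `Tx` are hypothetical illustration types; the real check works over parsed WAL records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Sketch of the pairwise transaction-hash check: each node builds, per co-primary
// peer, the list of hashes of transactions they shared; the reduce phase expects
// mirrored collections (A->B vs B->A) to be equal.
class TxHashCheck {
    record Tx(long hash, Set<String> primaries) {}

    /** Map phase: node -> (peer -> hashes of transactions both were primary for). */
    static Map<String, Map<String, List<Long>>> mapPhase(List<Tx> txs) {
        Map<String, Map<String, List<Long>>> res = new TreeMap<>();
        for (Tx tx : txs)
            for (String node : tx.primaries())
                for (String peer : tx.primaries())
                    if (!node.equals(peer))
                        res.computeIfAbsent(node, k -> new TreeMap<>())
                           .computeIfAbsent(peer, k -> new ArrayList<>())
                           .add(tx.hash());
        return res;
    }

    /** Reduce phase: every node's collection must match its mirror on the peer. */
    static boolean reducePhase(Map<String, Map<String, List<Long>>> collected) {
        for (var e : collected.entrySet())
            for (var p : e.getValue().entrySet()) {
                List<Long> mirror = collected.getOrDefault(p.getKey(), Map.of()).get(e.getKey());
                if (!p.getValue().equals(mirror))
                    return false;
            }
        return true;
    }
}
```

A mismatch between mirrored collections means the two nodes disagree on the set of committed transactions inside the increment, i.e. the snapshot is inconsistent.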

Note that checking an incremental snapshot doesn't verify the data of the related full snapshot. Thus a full check of a snapshot consists of two steps:

  1. Check full snapshot
  2. Check incremental snapshot

Atomic caches

For Atomic caches it is required to restore data consistency between primary and backup nodes differently, with the ReadRepair feature. Consistent Cut relies on the transaction protocol's messages (Prepare, Finish), while the Atomic cache protocol doesn't have enough messages to synchronize different nodes.

...