...

Process overview

The overall process includes the following sequential steps:

  1. Check the integrity of the cache group snapshot: make sure that all partitions of the cache group are available in the cluster and that there are no conflicts in the saved cache configurations. Make sure that the target cache group doesn't exist (the user must manually destroy the cache before restoring).
  2. Restore the cache data files: partitions are restored locally on all nodes where the snapshot was taken, or copied between nodes according to the required partition distribution. Check and merge binary metadata on one of the snapshot nodes.
  3. Dynamically start the restored cache group(s).

If errors occur (I/O errors, node failure, etc.), the changes made to the cluster must be fully or partially reverted (depending on the type of error).
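The step sequence and its error handling can be sketched as follows. This is a minimal illustration, not the actual implementation: the `Step` interface, the `execute` method and the step names are hypothetical, and the sketch assumes the simplest policy (every started step is reverted in reverse order), whereas the real process may revert only part of the changes depending on the error type.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class RestoreProcessSketch {
    /** Hypothetical restore step: an action plus the way to undo it. */
    interface Step {
        String name();
        void run() throws Exception;
        void rollback();
    }

    /** Runs the steps in order; on the first failure reverts all started steps in reverse order. */
    static boolean execute(List<Step> steps, List<String> log) {
        Deque<Step> started = new ArrayDeque<>();
        for (Step s : steps) {
            // Register the step before running it, so a partially applied step is reverted too.
            started.push(s);
            try {
                s.run();
                log.add("done:" + s.name());
            } catch (Exception e) {
                log.add("failed:" + s.name());
                while (!started.isEmpty()) {
                    Step d = started.pop();
                    d.rollback();
                    log.add("rollback:" + d.name());
                }
                return false;
            }
        }
        return true;
    }

    /** Test helper: a step that optionally fails with a simulated I/O error. */
    static Step step(String name, boolean fails) {
        return new Step() {
            public String name() { return name; }
            public void run() throws IOException {
                if (fails) throw new IOException("simulated I/O error in " + name);
            }
            public void rollback() { /* undo the changes made by this step */ }
        };
    }

    /** Usage example: the second step fails, so the third one never starts. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        execute(Arrays.asList(step("verify", false), step("copy", true), step("start", false)), log);
        return log;
    }
}
```

In this sketch the `rollback` of a step must tolerate being called after a partial `run`, because the failed step is reverted along with the completed ones.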

...

The restore operation is rejected if the cache or cache group being restored is already present in the cluster. The user must manually destroy it and restart the operation.

Failover

To avoid the possibility of starting a node with inconsistent data, the partition files are first copied to a temporary directory, and this directory is then moved to its final location using an atomic move operation. When the node starts, the temporary directory is deleted (if it exists).
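The copy-then-atomic-move technique described above can be sketched with the standard Java NIO API. The class name, the `_tmp_restore_` prefix and the directory layout are assumptions made for illustration only. The sketch also assumes a flat snapshot directory and a temporary directory on the same file system as the target, which `ATOMIC_MOVE` requires.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

class AtomicRestoreSketch {
    /** Hypothetical prefix that marks an unfinished restore directory. */
    static final String TMP_PREFIX = "_tmp_restore_";

    /** Copies partition files into a temporary sibling directory, then renames it atomically. */
    static void restoreCacheDir(Path snapshotDir, Path targetDir) {
        Path tmpDir = targetDir.resolveSibling(TMP_PREFIX + targetDir.getFileName());
        try {
            Files.createDirectories(tmpDir);
            try (DirectoryStream<Path> files = Files.newDirectoryStream(snapshotDir)) {
                for (Path f : files)
                    Files.copy(f, tmpDir.resolve(f.getFileName()), StandardCopyOption.REPLACE_EXISTING);
            }
            // The target becomes visible only after all files have been copied.
            Files.move(tmpDir, targetDir, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** On node start: removes temporary directories left by an interrupted restore. */
    static void cleanupOnStart(Path workDir) {
        List<Path> leftovers = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(workDir, TMP_PREFIX + "*")) {
            dirs.forEach(leftovers::add);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        leftovers.forEach(AtomicRestoreSketch::deleteRecursively);
    }

    /** Deletes a directory tree, children first. */
    static void deleteRecursively(Path dir) {
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Usage example: cleans a stale temporary directory, then restores one partition file. */
    static boolean demo() {
        try {
            Path work = Files.createTempDirectory("restore-demo");
            Path snapshot = Files.createDirectories(work.resolve("snapshot"));
            Files.write(snapshot.resolve("part-0.bin"), new byte[] {1, 2, 3});
            Path stale = Files.createDirectories(work.resolve(TMP_PREFIX + "stale"));
            cleanupOnStart(work);
            Path target = work.resolve("cache-demo");
            restoreCacheDir(snapshot, target);
            return Files.exists(target.resolve("part-0.bin"))
                && !Files.exists(stale)
                && !Files.exists(work.resolve(TMP_PREFIX + "cache-demo"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the final directory name appears only after the rename, a node that crashes mid-copy leaves behind only a prefixed temporary directory, which the cleanup on start removes.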

If any of the nodes on which the snapshot was taken leaves the cluster during the restore process, the process is aborted and all changes are rolled back. To achieve this, a special cache launch mode was introduced: if the node exits during the exchange, the process is rolled back. // TBD

Whole cluster restore

// TBD

...