...

The Distributed Process is used to complete steps [1, 3]. To achieve step [2], a new SnapshotFutureTask must be developed.

...

Data to copy to snapshot

The following must be copied to snapshot:

  • cache partition files
  • cache configuration
  • binary meta information
  • marshaller meta information

Base copy strategy

Binary meta, marshaller meta, and cache configurations are still stored on-heap, so it is easy to collect this persistent user information and keep it consistent under the checkpoint write lock (no changes are allowed while the lock is held).
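The idea of collecting on-heap metadata under the checkpoint write lock can be sketched as follows. This is a minimal illustration, not Ignite's code: the class `MetaSnapshotSketch`, its methods, and the use of a plain `ReentrantReadWriteLock` as a stand-in for the checkpoint lock are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch; the names here are illustrative and are not Ignite classes. */
class MetaSnapshotSketch {
    /** Stand-in for the checkpoint lock: updaters hold the read lock, the snapshot holds the write lock. */
    private final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();

    /** Stand-in for on-heap binary meta (type id -> serialized meta). */
    private final Map<Integer, String> binaryMeta = new ConcurrentHashMap<>();

    /** A regular metadata update; it cannot run while the write lock is held. */
    void updateMeta(int typeId, String meta) {
        checkpointLock.readLock().lock();
        try {
            binaryMeta.put(typeId, meta);
        }
        finally {
            checkpointLock.readLock().unlock();
        }
    }

    /** Copies the metadata into a new on-heap collection while no updates are possible. */
    Map<Integer, String> copyUnderWriteLock() {
        checkpointLock.writeLock().lock();
        try {
            return new HashMap<>(binaryMeta);
        }
        finally {
            checkpointLock.writeLock().unlock();
        }
    }
}
```

Because the copy is a separate collection, updates made after the write lock is released do not affect what is later written to the snapshot directory.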

Another strategy must be used for cache partition files. The checkpoint process writes dirty pages from PageMemory to the cache partition files while another process simultaneously copies those files to the target directory. Each cache partition file is consistent only at checkpoint end. So, to avoid blocking transactions for a long time during the cluster-wide snapshot, the snapshot process should not wait for the checkpoint to end on each node. The process of copying cache partition files must do the following:

...

  1. The cache partition file is already copied, but the checkpoint has not ended yet – wait until the checkpoint ends, then start merging the cache partition file with its delta.
  2. The current checkpoint process has ended, but the cache partition file is still being copied – the next checkpoint process must read the old version of a page and copy it to the delta file before writing the dirty page.

Local snapshot task

The local snapshot task is an operation that executes on each local node independently. It copies all the persistence user files from the Ignite work directory to the target snapshot directory, with additional machinery to achieve consistency. This task is closely connected with the node checkpointing process because, for instance, cache partition files are only eventually consistent on disk while a checkpoint is in progress and become fully consistent only when the checkpoint ends.

The local snapshot operation on each cluster node is represented by SnapshotFutureTask.

The local snapshot task process

  1. A new checkpoint starts (forced by the node or a regular one).
  2. Under the checkpoint write lock – fix the cache partition length for each partition (copy from 0 to length).
  3. The task creates new on-heap collections with the marshaller meta and binary meta to copy.
  4. The task starts copying partition files.
  5. The checkpoint thread:
    1. If the checkpoint associated with the task is not finished – write a dirty page to the original partition file and to the delta file.
    2. If the checkpoint associated with the task is finished and the partition file is still being copied – read the original page from the original partition file and copy it to the delta file before writing the dirty page.
  6. If the partition file is copied – start merging the copied partition with its delta file.
  7. The task ends when all data is successfully copied to the target directory and all cache partition files are merged with their deltas.
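The interaction between the copying task and the checkpoint thread in steps 4 to 6 can be modelled with a small single-threaded simulation. This is only a sketch under stated assumptions: `PartitionSnapshotSketch` and its methods are hypothetical names, in-memory arrays stand in for the partition and delta files, the partition is assumed not to grow, and keeping the first saved version of a page after the checkpoint ends (`putIfAbsent`) is a choice made for this example rather than something the design above specifies.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of one partition's snapshot; not Ignite's SnapshotFutureTask. */
class PartitionSnapshotSketch {
    private final byte[][] partition;  // stand-in for the original partition file, one entry per page
    private final int length;          // page count fixed under the checkpoint write lock (step 2)
    private final byte[][] copy;       // stand-in for the partition file in the snapshot directory
    private final Map<Integer, byte[]> delta = new HashMap<>(); // stand-in for the delta file
    private int copiedPages;           // how many pages the copy task has processed so far
    private boolean checkpointFinished;

    PartitionSnapshotSketch(byte[][] partition) {
        this.partition = partition;
        this.length = partition.length;
        this.copy = new byte[length][];
    }

    /** Step 4: copies one more page; returns false once the whole file is copied. */
    boolean copyNextPage() {
        if (copiedPages == length)
            return false;
        copy[copiedPages] = partition[copiedPages].clone();
        copiedPages++;
        return true;
    }

    /** Marks the checkpoint associated with the task as finished. */
    void finishCheckpoint() {
        checkpointFinished = true;
    }

    /** Step 5: called by the checkpoint thread for every dirty page (assumes idx < length). */
    void writeDirtyPage(int idx, byte[] page) {
        if (!checkpointFinished)
            delta.put(idx, page.clone());                    // 5.1: the new page also goes to the delta
        else if (copiedPages < length)
            delta.putIfAbsent(idx, partition[idx].clone());  // 5.2: save the old page before overwriting
        partition[idx] = page.clone();
    }

    /** Step 6: overlays the delta on the copied file once copying and the checkpoint are both done. */
    byte[][] merge() {
        if (copiedPages < length || !checkpointFinished)
            throw new IllegalStateException("copy or checkpoint still in progress");
        delta.forEach((idx, page) -> copy[idx] = page);
        return copy;
    }
}
```

In this model the merged copy equals the partition content as of the end of the associated checkpoint, regardless of whether a page was copied before or after the checkpoint thread touched it.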

...