You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 6 Next »

IDIEP-43
Author
Sponsor
Created

  

Status

ACTIVE


Motivation

The most of open-source distributed systems provide `cluster snapshots` functionality, but the Apache Ignite doesn't have such one. Cluster snapshots will allow users to copy their data from an active cluster and load it later on another, such as copying data from a production system into a smaller QA or development system. 

Management

Create snapshot

[public] Java API

IgniteSnapshot
public interface IgniteSnapshot {
    /**
     * @return List of all known snapshots.
     */
    public List<String> getSnapshots();

    /**
     * Create a consistent copy of all persistence cache groups from the whole cluster.
     *
     * @param name Snapshot name.
     * @return Future which will be completed when a process ends.
     */
    public IgniteFuture<Void> createSnapshot(String name);
}

[public] JMX MBean

SnapshotMXBean
package org.apache.ignite.mxbean;

/**
 * Snapshot features MBean.
 */
@MXBeanDescription("MBean that provides access for snapshot features.")
public interface SnapshotMXBean {
    /**
     * Gets all created snapshots on the cluster.
     *
     * @return List of all known snapshots.
     */
    @MXBeanDescription("List of all known snapshots.")
    public List<String> getSnapshots();

    /**
     * Create the cluster-wide snapshot with given name.
     *
     * @param snpName Snapshot name to created.
     * @see IgniteSnapshot#createSnapshot(String) (String)
     */
    @MXBeanDescription("Create cluster-wide snapshot.")
    public void createSnapshot(@MXBeanParameter(name = "snpName", description = "Snapshot name.") String snpName);
}

[public] Command Line

control.sh --snapshot
# Starts cluster snapshot operation.
control.sh --snapshot ERIB_23012020

# Display all known cluster snapshots.
control.sh --snapshot -list

[internal] File Transmission

Internal API which allows to request and receive the required snapshot of cache groups from a remote. Used as a part of IEP-28: Rebalance peer-2-peer to send created local snapshot to the remote (demander) node.

IgniteSnapshotManager#createRemoteSnapshot
/**
 * @param parts Collection of pairs group and appropriate cache partition to be snapshot.
 * @param rmtNodeId The remote node to connect to.
 * @param partConsumer Received partition handler.
 * @return Future which will be completed when requested snapshot fully received.
 */
public IgniteInternalFuture<Void> createRemoteSnapshot(
    UUID rmtNodeId,
    Map<Integer, Set<Integer>> parts,
    BiConsumer<File, GroupPartitionId> partConsumer);

Restore snapshot (manually)

The snapshot procedure stores all internal files (binary meta, marshaller meta, cache group data files, and cache group configuration) the same directory structure way as the Apache Ignite does with preserving configured consistent node id.

To restore a cluster from snapshot user must manually copy all snapshot data files to the IGNITE_HOME/work  directory pay attention to consistent node ids.

Snashot directory structure
maxmuzaf@TYE-SNE-0009931 ignite % tree work
work
└── snapshots
    └── backup23012020
        ├── binary_meta
        │   ├── snapshot_IgniteClusterSnapshotSelfTest0
        │   ├── snapshot_IgniteClusterSnapshotSelfTest1
        │   └── snapshot_IgniteClusterSnapshotSelfTest2
        ├── db
        │   ├── snapshot_IgniteClusterSnapshotSelfTest0
        │   │   ├── cache-default
        │   │   │   ├── cache_data.dat
        │   │   │   ├── part-0.bin
        │   │   │   ├── part-2.bin
        │   │   │   ├── part-3.bin
        │   │   │   ├── part-4.bin
        │   │   │   ├── part-5.bin
        │   │   │   └── part-6.bin
        │   │   └── cache-txCache
        │   │       ├── cache_data.dat
        │   │       ├── part-3.bin
        │   │       ├── part-4.bin
        │   │       └── part-6.bin
        │   ├── snapshot_IgniteClusterSnapshotSelfTest1
        │   │   ├── cache-default
        │   │   │   ├── cache_data.dat
        │   │   │   ├── part-1.bin
        │   │   │   ├── part-3.bin
        │   │   │   ├── part-5.bin
        │   │   │   ├── part-6.bin
        │   │   │   └── part-7.bin
        │   │   └── cache-txCache
        │   │       ├── cache_data.dat
        │   │       ├── part-1.bin
        │   │       ├── part-5.bin
        │   │       └── part-7.bin
        │   └── snapshot_IgniteClusterSnapshotSelfTest2
        │       ├── cache-default
        │       │   ├── cache_data.dat
        │       │   ├── part-0.bin
        │       │   ├── part-1.bin
        │       │   ├── part-2.bin
        │       │   ├── part-4.bin
        │       │   └── part-7.bin
        │       └── cache-txCache
        │           ├── cache_data.dat
        │           ├── part-0.bin
        │           └── part-2.bin
        └── marshaller

17 directories, 30 files


Process overview


Proposed changes


Limitations


Risks and Assumptions

// Describe project risks, such as API or binary compatibility issues, major protocol changes, etc.

Discussion Links

http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Hot-cache-backup-td41034.html

Reference Links

  1. Apache Geode – Cache and Region Snapshots 
    https://geode.apache.org/docs/guide/16/managing/cache_snapshots/chapter_overview.html
  2. Apache Cassandra – Backing up and restoring data
    https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsBackupRestore.html

Tickets

// Links or report with relevant JIRA tickets.

  • No labels