
ID: IEP-109
Author:
Sponsor:
Created: 17.07.2023
Status: DRAFT


Motivation

IEP-43 introduced the persistent cache snapshot feature, which has been widely adopted by Ignite users.
However, Ignite snapshots support persistent caches only.

A cache dump is essentially a file that contains all entries of a cache group at the time of dump creation.
Meta information of the dumped caches and binary metadata are included in the dump as well.
A dump is consistent, which means that every entry that existed in the cluster at the moment of dump creation lands in the dump file.

Ignite must provide a dump restore feature. This process will create the cache group and caches from the saved configuration and put all saved entries into them. This will essentially restore the cache state to the moment of dump creation.

The dump feature can be used for both persistent and in-memory cache groups. Use cases are:

  • In-memory cluster restarts.
  • Version upgrade.
  • Disaster recovery.
  • DC/Hardware replacement.
  • Data motion.

Description

API

New control utility commands for dump creation and restore will be added.

Example:

create and restore dump example
> ./control.sh --dump --create DUMP --cache-group 123456790

> ./control.sh --dump --restore DUMP --cache-group 123456790,097654321
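
In addition to the control utility, dump creation could also be exposed through the public Java API. The sketch below shows one possible shape of such a call; the createDump method on the IgniteSnapshot facade is an assumption for illustration only, this IEP itself only mandates the control utility commands above.

create dump from Java (sketch)
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;

public class CreateDumpExample {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start("config/ignite-config.xml");

        // Hypothetical facade method mirroring './control.sh --dump --create DUMP'.
        // The exact method name and signature are assumptions, not the final API.
        ignite.snapshot().createDump("DUMP").get();
    }
}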

Creation

The dump creation flow is similar to persistent snapshot creation, so you may want to check IEP-43 for snapshot design details.

  1. On receiving a dump creation request, Ignite will start a distributed process.
  2. PME will be started on receiving the InitMessage.
  3. Under PME locks, set up an entry before-change listener.
    The listener must be invoked before an entry is actually changed.
  4. Save cache group metadata.
  5. Save binary and marshaller metadata.
  6. Iterate the CacheDataTree of the cache group and handle each entry.
    Each entry must be sent to a specific cache group handler that will write it to the corresponding file.
  7. Inside the entry listener:
    1. store the entry key that was handled by the listener.
    2. handle the entry as in step 6.

After the algorithm ends, each file will contain a consistent copy of the cache group; a simplified sketch of the listener/iterator interplay is given below.
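
The sketch below illustrates how the before-change listener and the CacheDataTree iterator cooperate to keep the dump consistent. Class and method names are illustrative only (not the actual Ignite internals), and synchronization between the listener and the iterator is intentionally omitted.

dump consistency sketch (Java)
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class DumpWriterSketch {
    /** Illustrative sink that appends one entry to the per-cache-group dump file. */
    interface DumpFileSink {
        void write(Object key, Object val);
    }

    /** Keys already written by the before-change listener (step 7.1). */
    private final Set<Object> changedKeys = ConcurrentHashMap.newKeySet();

    private final DumpFileSink sink;

    DumpWriterSketch(DumpFileSink sink) {
        this.sink = sink;
    }

    /** Invoked before an entry is changed while the dump is in progress (step 7). */
    void beforeEntryChange(Object key, Object curVal) {
        // Write the value as it was at the dump point, once per key.
        if (changedKeys.add(key))
            sink.write(key, curVal);
    }

    /** Invoked for each entry while iterating the CacheDataTree (step 6). */
    void onIteratedEntry(Object key, Object val) {
        // Skip keys already written by the listener: their current value may be newer than the dump point.
        if (!changedKeys.contains(key))
            sink.write(key, val);
    }
}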

Details:

  1. Consistency guarantees - to make sure there are no concurrent updates, dump creation starts under PME locks.
  2. Persistent pages contain a CRC to ensure data integrity while stored on disk.
    Dump data integrity must be protected by a CRC as well (see the sketch after this list).
  3. Metadata must be properly prepared and saved during dump creation:
    • StoredCacheData.
    • binary_meta.
    • marshaller.
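
To illustrate item 2, each dump record could carry a checksum of its serialized payload, verified on read. The record layout below is an assumption for illustration, not the actual dump file format.

CRC-protected dump record (Java sketch)
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

class CrcRecord {
    /** Prepends a CRC32 of the payload so the reader can verify integrity. */
    static byte[] wrap(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);

        return ByteBuffer.allocate(4 + payload.length)
            .putInt((int)crc.getValue())
            .put(payload)
            .array();
    }

    /** Verifies and strips the CRC; fails if the record was corrupted on disk. */
    static byte[] unwrap(byte[] record) {
        ByteBuffer buf = ByteBuffer.wrap(record);

        int expCrc = buf.getInt();

        byte[] payload = new byte[record.length - 4];
        buf.get(payload);

        CRC32 crc = new CRC32();
        crc.update(payload);

        if ((int)crc.getValue() != expCrc)
            throw new IllegalStateException("Dump entry CRC mismatch");

        return payload;
    }
}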

Restore

Prerequisites:

  • The restored cache group does not exist on the cluster.
  • The count of nodes in the cluster is the same as at the time of creation (this restriction can be eliminated in Phase 2 of the IEP implementation).
  • All nodes in the cluster have the dump locally.

High-level dump restore algorithm steps:

  1. Create the corresponding caches.
  2. Restore all saved metadata.
  3. Iterate saved entries and put them as local cache entries (see the sketch after this list).
  4. Wait until all nodes complete steps 2 and 3.
  5. Start the caches that belong to the restored data region.
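
A minimal sketch of step 3 is shown below. The iterator over saved entries and the DumpEntry view are assumptions for illustration; IgniteDataStreamer is the existing public streaming API, although the actual implementation may put entries locally on each node instead.

restore saved entries (Java sketch)
import java.util.Iterator;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;

class DumpRestoreSketch {
    /** Hypothetical view of one saved dump entry. */
    interface DumpEntry {
        Object key();
        Object value();
    }

    /** Streams the saved entries back into the recreated cache. */
    static void restoreCache(Ignite ignite, String cacheName, Iterator<DumpEntry> savedEntries) {
        try (IgniteDataStreamer<Object, Object> streamer = ignite.dataStreamer(cacheName)) {
            while (savedEntries.hasNext()) {
                DumpEntry e = savedEntries.next();

                streamer.addData(e.key(), e.value());
            }
        }
    }
}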

Rejected alternatives

There are a couple of alternatives for implementing backup copies of in-memory caches that were rejected during the initial analysis:

  1. Store full data region copy
    The idea of this approach is to store a full copy of the data region on disk.
    1. Pros:
      1. Mostly sequential writes.
    2. Cons:
      1. Restore must set up all Ignite internal structures over the restored off-heap data.
        Such code can be difficult to implement and maintain.
  2. On-demand persistence
    The idea of this approach is to reuse the PDS infrastructure and persistent snapshot code by introducing a new cache mode "PDS_ON_DEMAND".
    This mode will use the persistence code but with WAL and checkpoints disabled, so on snapshot creation a regular checkpoint will be performed.
    After the checkpoint, PDS files are ready to be copied to the snapshot folder.
    1. Pros:
      1. Code reuse.
    2. Cons:
      1. Additional configuration on the user side is required (set the new mode for the DataRegion).
      2. The whole Ignite codebase needs to be aware of the new mode - baseline, PDS, WAL, checkpoint, etc.
      3. PDS page stores contain additional data - storage overhead.
  3. shmem usage
    The idea of this approach is to use the shared memory feature of Linux.
    1. Pros:
      1. Easy to implement (questionable).
      2. The same approach is used by other vendors to implement in-memory cluster restarts.
    2. Cons:
      1. OS specific.
      2. Only for certain scenarios; doesn't cover all use cases.
  4. Use "snapshot" name for feature instead of dump.

Risks and Assumptions

  • DataRegionConfiguration#persistenceEnabled=false for in-memory caches by definition.
  • The way to restore a snapshot on a different topology must be further investigated.
  • Compaction of snapshots must be further investigated.
  • The consistent per-entry snapshot that will be implemented in this IEP can also be created for persistent caches.

Phases scopes

  • Phase 1
    • dump creation.
    • restore dump on the same topology.
    • control utility integration.
    • metrics, system views integration.
    • new SecurityPermission.
    • dump as a ZIP file.
  • Phase 2
    • restore a snapshot on a different topology.
  • Phase 3
    • snapshot compaction.

Discussion Links

// Links to discussions on the devlist, if applicable.

Reference Links

Tickets
