...

ConsistentCutRecord (Java)
/** WAL record that starts a Consistent Cut. */
public class ConsistentCutStartRecord extends WALRecord {
    /** Marker that initiates the Consistent Cut. */
    private final ConsistentCutMarker marker;
}

/** WAL record that finishes a Consistent Cut. */
public class ConsistentCutFinishRecord extends WALRecord {
    /** Transactions committed BEFORE the Consistent Cut (sent - received). */
    private final Set<GridCacheVersion> before;

    /** Transactions committed AFTER the Consistent Cut (excluded from it). */
    private final Set<GridCacheVersion> after;
}
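
The two records above can be read as delimiting the cut inside a node's WAL. The sketch below shows one plausible way recovery could use them. It is an illustration, not the Ignite implementation: WalKind, WalEntry and ConsistentCutFilter are hypothetical names, transaction ids are plain strings instead of GridCacheVersion, and it assumes that `after` lists transactions committed physically before the start record while `before` lists transactions committed between the start and finish records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical record kinds; a real WAL has many more. */
enum WalKind { TX_COMMIT, CUT_START, CUT_FINISH }

/** Hypothetical, simplified WAL entry (not an Ignite class). */
final class WalEntry {
    final WalKind kind;
    final String txId; // null for cut records

    WalEntry(WalKind kind, String txId) {
        this.kind = kind;
        this.txId = txId;
    }
}

final class ConsistentCutFilter {
    /**
     * Returns ids of transactions that belong to the cut, in WAL order.
     * Assumption: 'before' and 'after' are the sets from the finish record.
     */
    static List<String> committedBeforeCut(List<WalEntry> wal, Set<String> before, Set<String> after) {
        List<String> res = new ArrayList<>();
        boolean started = false;

        for (WalEntry e : wal) {
            switch (e.kind) {
                case CUT_START:
                    started = true;
                    break;
                case CUT_FINISH:
                    return res; // everything after the finish record is outside the cut
                case TX_COMMIT:
                    // Before the start record: include unless explicitly excluded.
                    // Between start and finish: include only if explicitly listed.
                    boolean include = started ? before.contains(e.txId) : !after.contains(e.txId);
                    if (include)
                        res.add(e.txId);
                    break;
            }
        }
        return res;
    }

    public static void main(String[] args) {
        List<WalEntry> wal = Arrays.asList(
            new WalEntry(WalKind.TX_COMMIT, "tx1"),
            new WalEntry(WalKind.TX_COMMIT, "tx2"),
            new WalEntry(WalKind.CUT_START, null),
            new WalEntry(WalKind.TX_COMMIT, "tx3"),
            new WalEntry(WalKind.CUT_FINISH, null));

        // prints [tx1, tx3]
        System.out.println(committedBeforeCut(wal, Collections.singleton("tx3"), Collections.singleton("tx2")));
    }
}
```

Keeping both sets in the finish record means the filter needs only a single forward pass over the WAL, at the cost of a slightly larger record.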

Unstable topology

There are some cases to handle for unstable topology:

...

  1. Set the flag ConsistentCutManager#inconsistent to true and persist it in the local MetaStorage:
    1. On receiving a new ConsistentCutVersion, check the flag and raise an exception.
    2. Clear the flag on recovering from a ClusterSnapshot, or after creating a ClusterSnapshot.
    + Simple handling at runtime.
    - A single node must be rebalanced after PITR with file-based (or other?) rebalance: recovery takes more time, and file-based rebalance still has unresolved issues(?).

  2. Do not disable WAL during rebalance:
    + WAL stays consistent with the data and can be used for PITR after rebalancing.
    - Too many inserts during rebalance may affect rebalance speed and IO utilization.

  3. Automatically create snapshots for rebalancing cache groups:
    + Guarantees consistency (the snapshot is created on PME, after rebalance).
    + Faster recovery.
    - Complex handling: Ignite would have to provide an additional tool for merging WAL and multiple snapshots created at different times.
    ? Is it possible to create a snapshot only on the single rebalanced node?
    ? Is it possible to sync the current PME (on node join) with starting a snapshot?

  4. Write a special Rebalance record to WAL with a description of the demanded partition:
    1. During restore, read this record and repeat the historical rebalance at this point; after the rebalance, resume recovery with the existing WALs.
    2. If the record describes a full rebalance, stop recovering from WAL and fall back to full rebalance.
      ? Is it possible to rebalance only specific cache groups and continue WAL recovery for the others?
      - Historical rebalance during recovery needs separate logic for extracting records from the WAL archives of other nodes.
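
The first option in the list (a persisted "inconsistent" flag) can be sketched as follows. This is a minimal illustration under stated assumptions, not Ignite code: ConsistentCutManagerSketch is a hypothetical class, a plain Map stands in for the local MetaStorage, and the key name and method names are invented for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the "inconsistent" flag handling; not an Ignite class. */
final class ConsistentCutManagerSketch {
    /** Hypothetical key under which the flag is persisted. */
    static final String INCONSISTENT_KEY = "consistent-cut.inconsistent";

    /** Stand-in for the local MetaStorage (assumption: survives node restart). */
    private final Map<String, Boolean> metaStorage;

    private volatile boolean inconsistent;

    ConsistentCutManagerSketch(Map<String, Boolean> metaStorage) {
        this.metaStorage = metaStorage;
        // Restore the flag persisted before a restart, if any.
        this.inconsistent = metaStorage.getOrDefault(INCONSISTENT_KEY, false);
    }

    /** Called when WAL can no longer be trusted for PITR, e.g. WAL was disabled for rebalance. */
    void markInconsistent() {
        inconsistent = true;
        metaStorage.put(INCONSISTENT_KEY, true);
    }

    /** Called on receiving a new Consistent Cut version. */
    void onNewCutVersion(long cutVer) {
        if (inconsistent)
            throw new IllegalStateException("Node is inconsistent, cannot handle Consistent Cut " + cutVer);
    }

    /** Called after recovering from a ClusterSnapshot or after creating one. */
    void onClusterSnapshot() {
        inconsistent = false;
        metaStorage.remove(INCONSISTENT_KEY);
    }

    public static void main(String[] args) {
        Map<String, Boolean> storage = new ConcurrentHashMap<>();
        ConsistentCutManagerSketch mgr = new ConsistentCutManagerSketch(storage);

        mgr.onNewCutVersion(1); // passes, the flag is not set
        mgr.markInconsistent();

        try {
            mgr.onNewCutVersion(2);
        }
        catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }

        mgr.onClusterSnapshot();
        mgr.onNewCutVersion(3); // passes again after a snapshot
    }
}
```

Because the flag is reloaded in the constructor, a node that restarts while inconsistent keeps rejecting new cuts until a snapshot clears it.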

Choosing algorithm for transactional consistency

There are some possible solutions to guarantee transactional consistency:

  1. Consistent Cut
    + Simple algorithm: it requires only a few read-write locks to sync threads and has little effect on performance.
    + Doesn't require much additional space: it only writes a few additional records to WAL.
    - Recovery takes time (every record from WAL is applied to the system), depending on how many operations need to be restored.
    - Requires additional storage to persist WAL archives.
    - Doesn't restore ATOMIC caches.
  2. Incremental physical snapshots (a collection of partition binary files changed since the previous snapshot; can be implemented as a delta or as a full copy).
    - High disk IO usage for preparing snapshots.
    - The current implementation of snapshots requires PME, which significantly affects performance.
    + Fast recovery that doesn't depend on the number of operations to restore (WAL-free).
  3. MVCC
    - Ignite failed to support MVCC; it is very hard to implement.

Of these options, Consistent Cut is preferable. Recovery may require optimizations, but several look feasible, for example:

  1. Use WAL compaction for archived files, which excludes physical records from WAL files;
  2. Apply DataEntry records from WAL in parallel using a striped executor (striped by cache group id and partition id);
  3. Use an index over WAL files for fast access to written Consistent Cuts.
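
The second optimization (parallel apply on a striped executor) can be sketched with plain java.util.concurrent. This is a hedged illustration, not Ignite's own executor: StripedWalApplier and its methods are hypothetical names, and the stripe is chosen by a simple hash of cache group id and partition id so that all entries of one partition are applied in WAL order on the same thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one single-threaded executor per stripe. */
final class StripedWalApplier {
    private final List<ExecutorService> stripes = new ArrayList<>();

    StripedWalApplier(int stripeCnt) {
        for (int i = 0; i < stripeCnt; i++)
            stripes.add(Executors.newSingleThreadExecutor());
    }

    /** Maps (cache group id, partition id) to a stripe; floorMod keeps the index non-negative. */
    static int stripeIndex(int grpId, int partId, int stripeCnt) {
        return Math.floorMod(31 * grpId + partId, stripeCnt);
    }

    /** Queues applying one entry; entries of the same partition keep their submission order. */
    void submit(int grpId, int partId, Runnable applyEntry) {
        stripes.get(stripeIndex(grpId, partId, stripes.size())).execute(applyEntry);
    }

    /** Stops accepting work and waits for queued entries; returns false on timeout or interrupt. */
    boolean awaitCompletion() {
        for (ExecutorService s : stripes)
            s.shutdown();

        try {
            for (ExecutorService s : stripes) {
                if (!s.awaitTermination(1, TimeUnit.MINUTES))
                    return false;
            }
            return true;
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        StripedWalApplier applier = new StripedWalApplier(4);

        // Entries for different partitions may run in parallel on different stripes.
        applier.submit(1, 0, () -> System.out.println("apply entry for grp=1, part=0"));
        applier.submit(1, 1, () -> System.out.println("apply entry for grp=1, part=1"));

        applier.awaitCompletion();
    }
}
```

Striping by partition preserves per-partition ordering without locks; ordering across partitions is not guaranteed, so this sketch assumes entries of different partitions can be applied independently.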

  1. On Distributed Snapshots, Ten H. Lai and Tao H. Yang, 29 May 1987