ConsistentCut splits the WAL timeline into two global areas, Before and After. It guarantees that every transaction committed Before on one node is also committed Before on every other node that participated in the transaction. This means that an Ignite node can safely recover itself to the Before state without any coordination with other nodes.

The border between the Before and After areas consists of two WAL records: ConsistentCutStartRecord and ConsistentCutFinishRecord.

It guarantees that the Before area consists of:
1. transactions that were committed before ConsistentCutStartRecord and weren't included into ConsistentCutFinishRecord#after();
2. transactions that were committed between ConsistentCutStartRecord and ConsistentCutFinishRecord and were included into ConsistentCutFinishRecord#before().
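
The two rules above can be read as a small membership check. The following is a minimal sketch, not Ignite code: the class CutSideSketch, the enum CommitPosition and the use of String as a stand-in for GridCacheVersion are all hypothetical, and it assumes that anything committed after ConsistentCutFinishRecord belongs to the After area.

```java
import java.util.Set;

/** Hypothetical helper, not part of Ignite: decides which side of the cut a committed transaction is on. */
final class CutSideSketch {
    /** Where the commit of a transaction sits in the WAL relative to the two cut records. */
    enum CommitPosition { BEFORE_START, BETWEEN_START_AND_FINISH, AFTER_FINISH }

    /**
     * @param txId Transaction id (String is a stand-in for GridCacheVersion).
     * @param pos Position of the commit in the WAL.
     * @param before Content of ConsistentCutFinishRecord#before().
     * @param after Content of ConsistentCutFinishRecord#after().
     * @return {@code true} if the transaction belongs to the Before area.
     */
    static boolean isBefore(String txId, CommitPosition pos, Set<String> before, Set<String> after) {
        switch (pos) {
            case BEFORE_START:
                // Rule 1: committed ahead of the start record, unless listed in after().
                return !after.contains(txId);

            case BETWEEN_START_AND_FINISH:
                // Rule 2: committed between the two records, only if listed in before().
                return before.contains(txId);

            default:
                // Assumption of this sketch: commits past the finish record are on the After side.
                return false;
        }
    }
}
```

With this reading, a transaction that is committed before the start record but listed in after() is excluded from the Before area, and one committed between the records is included only when before() names it explicitly.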

ConsistentCutRecord (Java)
/** WAL record that marks the start of a Consistent Cut on a node. */
public class ConsistentCutStartRecord extends WALRecord {
    /** Marker that inits Consistent Cut. */
    private final ConsistentCutMarker marker;
}


/** WAL record that marks the finish of a Consistent Cut on a node. */
public class ConsistentCutFinishRecord extends WALRecord {
    /**
     * Set of transactions committed BEFORE.
     */
    private final Set<GridCacheVersion> before;

    /**
     * Set of transactions committed AFTER.
     */
    private final Set<GridCacheVersion> after;
}

Algorithm

  1. Initial state:
    1. No concurrent ConsistentCut process is running.
  2. A user runs the command to create a new incremental snapshot:
    1. The Ignite node starts a DistributedProcess with a special message that holds a new ConsistentCutMarker (the goal is to notify every node in the cluster that an incremental snapshot is running).
  3. Creation of the incremental snapshot starts on a node on whichever of two events happens first:
    1. The node receives the ConsistentCutMarker by discovery.
    2. The node receives the ConsistentCutMarker in a transaction message (Prepare, Finish).
  4. On receiving the marker, every node:
    1. Checks whether ConsistentCut has already started for this marker, and skips the remaining steps if it has.
    2. Compares the local topVersion with the one received in the marker, and skips the remaining steps if they differ.
    3. In the message thread, atomically:
      1. creates a new ConsistentCut future;
      2. creates committingTxs, whose goal is to track COMMITTING+ transactions that aren't part of IgniteTxManager#activeTx;
      3. starts signing outgoing messages with the ConsistentCutMarker.
    4. In a background thread:
      1. Writes a ConsistentCutStartRecord to WAL with the received ConsistentCutMarker.
      2. Collects active transactions: the concatenation of IgniteTxManager#activeTx and committingTxs.
  5. While the DistributedProcess is alive, every node signs outgoing transaction messages:
    1. Prepare messages are signed with the ConsistentCutMarker (to trigger ConsistentCut on the remote node, if it has not started yet).
    2. Finish messages are signed with the ConsistentCutMarker (to trigger...) and with the transaction ConsistentCutMarker (to notify nodes which side of the cut this transaction belongs to).
    3. A Finish message is signed with the transaction ConsistentCutMarker on the node that commits first.
  6. For every collected active transaction, the node waits for the Finish message, extracts the transaction ConsistentCutMarker from it, and fills the before and after collections:
    1. if the received marker is null or differs from the local one, the transaction is on the before side;
    2. if the received marker equals the local one, the transaction is on the after side.
  7. After all collected transactions have finished, the node:
    1. Writes a ConsistentCutFinishRecord into WAL with the collections (before, after).
    2. Stops filling committingTxs.
    3. Completes the ConsistentCut future and notifies the initiator node that the local procedure has finished (via the DistributedProcess protocol).
  8. After all nodes have finished ConsistentCut:
    1. Every node stops signing outgoing transaction messages.
    2. The ConsistentCut future becomes null.
    3. The Ignite node is now in the initial state again.
  9. The initiator node checks that every node completed correctly and that topVer hasn't changed since the start.
    1. If any node completed exceptionally, or the topology changed, the incremental snapshot (IS) completes with an exception.
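
Steps 4, 6 and 7 can be illustrated with a simplified single-node model. This is only a sketch under stated assumptions, not the Ignite implementation: the class LocalCutSketch and all of its members are hypothetical, markers and transaction ids are plain strings, and WAL writes, threading details and the DistributedProcess protocol are left out.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical single-node model of steps 4, 6 and 7; none of these names are Ignite API. */
final class LocalCutSketch {
    private final long localTopVer;
    private String marker; // null means no cut is running (initial state)
    private final Set<String> pending = new HashSet<>();
    private final Set<String> before = new HashSet<>();
    private final Set<String> after = new HashSet<>();
    private boolean finished;

    LocalCutSketch(long localTopVer) {
        this.localTopVer = localTopVer;
    }

    /** Step 4: starts the cut on first receipt of a marker; returns true if the cut was started. */
    synchronized boolean onMarker(String rcvMarker, long markerTopVer, Set<String> activeTxs) {
        if (marker != null)
            return false; // 4.1: a cut has already started (the sketch allows one cut at a time)

        if (markerTopVer != localTopVer)
            return false; // 4.2: topology version differs

        marker = rcvMarker;
        pending.addAll(activeTxs);    // 4.4.2: collected active transactions
        finished = pending.isEmpty(); // nothing to wait for
        return true;
    }

    /** Step 6: classifies a collected transaction by the marker carried in its Finish message. */
    synchronized void onFinishMessage(String txId, String txMarker) {
        if (marker == null || !pending.remove(txId))
            return; // not one of the collected active transactions

        if (Objects.equals(txMarker, marker))
            after.add(txId);  // 6.2: same marker as the local one
        else
            before.add(txId); // 6.1: null or different marker

        finished = pending.isEmpty(); // step 7 may proceed once nothing is pending
    }

    synchronized boolean isFinished() { return finished; }

    synchronized Set<String> before() { return new HashSet<>(before); }

    synchronized Set<String> after() { return new HashSet<>(after); }
}
```

In this model, once isFinished() returns true the before() and after() sets hold what step 7 would write into ConsistentCutFinishRecord.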

...