...

  1. Initially all processes are white, the sent and received collections are empty, and LocalState is empty.
  2. After the system has run for some time, every node may have:
    1. Possibly empty sent and received collections.
    2. A possibly non-empty LocalState <-> sent + received. The state matches the events that changed it.
  3. Any process can start a snapshot (moreover, multiple processes may start it simultaneously):
    1. The node colors itself red.
    2. It commits its LocalState.
    3. It commits the sent and received collections for every IN and OUT channel. New collections are created for the next LocalState.
    4. It prepares a marker message: it is red and carries sent as its payload. The goal of the marker is to guarantee the order of messages (received_ij must be a subset of sent_ji).
  4. Every ordinary message between distributed processes is marked with the marker message. If there is no upcoming message to a node, the marker is sent on its own as an ordinary message.
  5. On receiving an ordinary message, a process has to check the marker first, before applying the message.
  6. If the received color differs from the local color, the node has to trigger the local snapshot procedure.
  7. The node handles sent from the received marker:
    1. It calculates ChannelState for the channel on which the message arrived as sent - received, where sent is extracted from the marker and received is calculated locally since the local snapshot.
  8. After receiving marker messages from all IN channels, the node prepares a snapshot:
    1. Local snapshot of node i: N_i = LocalState_i + Σ ChannelState_ij (sent - received)
  9. Every such local snapshot is a unit of the global snapshot:
    1. Note that the snapshot consists of committed LocalStates and messages between nodes.
    2. The committed sent and received collections are cleaned.
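
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the proposed implementation: the names Process, Marker, Message, start_snapshot and local_snapshot are hypothetical, LocalState is modeled as a running sum of integer payloads, messages are identified by a (sender, sequence) pair, and only a single snapshot round is covered.

```python
from dataclasses import dataclass
from typing import Optional

WHITE, RED = "white", "red"

@dataclass(frozen=True)
class Marker:
    color: str
    sent: frozenset  # ids the sender had sent on this channel when it cut

@dataclass(frozen=True)
class Message:
    msg_id: Optional[tuple]  # None for a marker-only message
    src: str
    payload: Optional[int]
    marker: Optional[Marker]

class Process:
    """Hypothetical node; LocalState is just a running sum of payloads."""

    def __init__(self, name, peers):
        self.name, self.peers = name, list(peers)
        self.color, self.state, self._seq = WHITE, 0, 0
        self.sent = {p: set() for p in self.peers}      # per OUT channel
        self.received = {p: set() for p in self.peers}  # per IN channel
        self.committed_state = None
        self.committed_sent, self.committed_received = {}, {}
        self.channel_state = {}  # peer -> ids in flight at the cut

    def start_snapshot(self):
        if self.color == RED:
            return
        self.color = RED
        self.committed_state = self.state
        self.committed_sent = {p: frozenset(s) for p, s in self.sent.items()}
        self.committed_received = {p: frozenset(s) for p, s in self.received.items()}
        # Fresh collections for the next LocalState.
        self.sent = {p: set() for p in self.peers}
        self.received = {p: set() for p in self.peers}

    def send(self, dst, payload=None):
        msg_id = None
        if payload is not None:
            self._seq += 1
            msg_id = (self.name, self._seq)
            self.sent[dst].add(msg_id)
        # A red process marks every outgoing message; payload=None sends the marker alone.
        marker = Marker(RED, self.committed_sent[dst]) if self.color == RED else None
        return Message(msg_id, self.name, payload, marker)

    def receive(self, msg):
        # Check the marker first, before applying the message.
        if msg.marker is not None:
            if msg.marker.color != self.color:
                self.start_snapshot()
            if msg.src not in self.channel_state:
                self.channel_state[msg.src] = (
                    msg.marker.sent - self.committed_received[msg.src])
        if msg.payload is not None:
            self.received[msg.src].add(msg.msg_id)
            self.state += msg.payload

    def snapshot_complete(self):
        return set(self.channel_state) == set(self.peers)

    def local_snapshot(self):
        return self.committed_state, dict(self.channel_state)

# Scenario: m2 is still in flight when A cuts, so it lands in B's ChannelState.
a, b = Process("A", ["B"]), Process("B", ["A"])
b.receive(a.send("B", 10))   # m1: sent and received before the cut
m2 = a.send("B", 5)          # m2: sent before the cut, delivered late
a.start_snapshot()
b.receive(a.send("B", 7))    # m3: red message, triggers B's local snapshot
b.receive(m2)                # white message applied by an already red node
a.receive(b.send("A"))       # marker-only message closes the channel B -> A
```

In this run B commits LocalState 10 and records m2 as the state of channel A -> B, while m3 is applied only after the cut.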

...

On receiving a message with a new CutVersion, a node sets it and commits LocalState and ChannelState in order to identify wrongly ordered events.

  1. LocalState maps to the local WAL (all committed transactions are part of LocalState).
  2. Channel:
    1. We can piggyback on Ignite transaction protocol messages (Prepare, Finish) sent through CommunicationSpi.
    2. If there is no transaction for a channel, we can rely on DiscoverySpi to start the local snapshot on non-participating nodes.
  3. ChannelState maps to `IgniteTxManager#activeTransactions`:
    1. The sent collection matches committed transactions for which the local node is near: they send FinishMessages to other nodes.
    2. The received collection matches committed transactions for which the local node isn't near: they receive FinishMessages from other nodes.
  4. `IgniteTxManager#activeTransactions` doesn't track:
    1. Committing transactions (COMMITTING+): they are removed from this collection before their commit starts.
      1. Track them additionally: add them to a separate collection before the commit starts, and remove them after the commit completes.
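
The additional tracking of committing transactions from the last item can be illustrated as follows. This is a minimal single-threaded sketch: TxTracker and its methods are hypothetical names, the dictionary only stands in for `IgniteTxManager#activeTransactions`, and nothing here reflects the actual Ignite API or its concurrency handling.

```python
class TxTracker:
    """Hypothetical stand-in for the tracking described above."""

    def __init__(self):
        self.active = {}         # stand-in for activeTransactions: version -> state
        self.committing = set()  # separate collection for COMMITTING+ transactions

    def begin(self, version):
        self.active[version] = "ACTIVE"

    def start_commit(self, version):
        # Register in the separate collection first, so the transaction is
        # never invisible between leaving `active` and finishing its commit.
        self.committing.add(version)
        del self.active[version]

    def finish_commit(self, version):
        self.committing.discard(version)

    def in_flight(self):
        # Everything a local snapshot has to account for at the cut.
        return set(self.active) | self.committing

tracker = TxTracker()
tracker.begin(1)
tracker.begin(2)
tracker.start_commit(1)
during_commit = tracker.in_flight()  # transaction 1 is still visible here
tracker.finish_commit(1)
after_commit = tracker.in_flight()
```

Adding to the separate collection before removing from the active one is the design choice that matters: a snapshot taken between the two steps still sees the transaction.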

...

  1. Initial state:
    1. Ignite WALs are in a consistent state relative to the previous full or incremental snapshot.
    2. Every Ignite node has a local ConsistentCut future equal to null (the node is WHITE).
    3. An empty collection committingTxs (Set<GridCacheVersion>) whose goal is to track COMMITTING+ transactions that aren't part of IgniteTxManager#activeTx. It shrinks automatically after a transaction is committed.
  2. An Ignite node initiates a global snapshot by starting a DistributedProcess (over discovery IO):
    1. It creates a new ConsistentCutMarker.
    2. It prepares a marker message that contains the marker and transmits this message to other nodes.
  3. Every node starts a local snapshot process after receiving the marker message (either by discovery or by communication with a transaction message):
    1. Atomically: creates a new ConsistentCut future (the node becomes RED), creates committingTxs, and starts signing outgoing messages with the ConsistentCutMarker.
    2. Writes a snapshot record to the WAL with the received ConsistentCutMarker (commits LocalState).
    3. Collects active transactions: the concatenation of IgniteTxManager#activeTx and committingTxs.
    4. Prepares 2 empty collections: before[sent - received] and after[exclude] the cut.
  4. While the global Consistent Cut is running, every node signs outgoing transaction messages:
    1. Prepare messages are signed with the ConsistentCutMarker (to trigger ConsistentCut on the remote node, if it has not started yet).
    2. Finish messages are signed with the ConsistentCutMarker (to trigger...) and the transaction ConsistentCutMarker (to notify nodes which side of the cut this transaction belongs to).
    3. Finish messages are signed on the node that commits first (near node for 2PC, backup or primary for 1PC).
  5. For every collected active transaction, the node waits for the Finish message to extract the ConsistentCutMarker and fills the before and after collections:
    1. If the received marker is null or differs from the local one, the transaction is on the before side.
    2. If the received marker equals the local one, the transaction is on the after side.
  6. After all transactions have finished:
    1. Writes a WAL record with ChannelState (before, after).
    2. Stops filling committingTxs.
    3. Completes the ConsistentCut future and notifies the initiator node that the local procedure has finished (via the DistributedProcess protocol).
  7. After all nodes have finished ConsistentCut, every node stops signing outgoing transaction messages: ConsistentCut becomes null (the node is WHITE again).
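
Steps 5 and 6 above, which sort the collected transactions onto the two sides of the cut, might look roughly like this. The sketch rests on assumptions: LocalCut and on_finish are hypothetical names, the ConsistentCutMarker is modeled as a plain integer, and WAL writes and the DistributedProcess notification are left out.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LocalCut:
    """Hypothetical local cut state; the marker is modeled as an int."""
    marker: int
    pending: set  # active transactions collected when the cut started
    before: set = field(default_factory=set)
    after: set = field(default_factory=set)

    def on_finish(self, tx: str, tx_marker: Optional[int]) -> None:
        """Classify a collected transaction by the marker on its Finish message."""
        if tx not in self.pending:
            return  # not collected at the cut, nothing to classify
        self.pending.discard(tx)
        if tx_marker == self.marker:
            self.after.add(tx)   # committed on a node that already runs this cut
        else:
            self.before.add(tx)  # marker is null or belongs to another cut

    @property
    def done(self) -> bool:
        # Once this is True the ChannelState (before, after) can be written.
        return not self.pending

cut = LocalCut(marker=7, pending={"tx1", "tx2", "tx3"})
cut.on_finish("tx1", None)  # no marker on the Finish message
cut.on_finish("tx2", 6)     # marker differs from the local one
cut.on_finish("tx3", 7)     # marker equals the local one
```

Here tx1 and tx2 end up on the before side and tx3 on the after side, after which the local procedure can complete.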

...