...
Code Block |
---|
language | java |
---|
title | ConsistentCutRecord |
---|
|
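/** WAL record that marks the start of a Consistent Cut. */
public class ConsistentCutStartRecord extends WALRecord {
    /** Marker that initiates the Consistent Cut. */
    private final ConsistentCutMarker marker;

    /** @param marker Marker that initiates the Consistent Cut. */
    public ConsistentCutStartRecord(ConsistentCutMarker marker) {
        this.marker = marker;
    }
}

/** WAL record that marks the finish of a Consistent Cut. */
public class ConsistentCutFinishRecord extends WALRecord {
    /** Transactions committed BEFORE the Consistent Cut (sent - received). */
    private final Set<GridCacheVersion> before;

    /** Transactions committed AFTER the Consistent Cut (excluded). */
    private final Set<GridCacheVersion> after;

    /**
     * @param before Transactions committed BEFORE the Consistent Cut.
     * @param after Transactions committed AFTER the Consistent Cut.
     */
    public ConsistentCutFinishRecord(Set<GridCacheVersion> before, Set<GridCacheVersion> after) {
        this.before = before;
        this.after = after;
    }
}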
|
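The role of the two sets in the finish record can be illustrated with a small sketch. It rests on an assumption that the text above does not spell out: a transaction whose commit is logged ahead of the start record belongs to the cut unless it is listed in `after`, and a transaction committed between the start and finish records belongs to the cut only if it is listed in `before`. The class `CutMembership`, the method `isBeforeCut` and the use of `String` in place of `GridCacheVersion` are hypothetical and exist only for illustration.

```java
import java.util.Set;

/** Hypothetical helper, not part of the design above. */
public final class CutMembership {
    private CutMembership() {}

    /**
     * Decides whether a transaction is on the BEFORE side of the cut.
     *
     * @param txId transaction id (String stands in for GridCacheVersion)
     * @param committedBeforeStart true if the commit was logged ahead of the start record
     * @param before the "before" set of the finish record
     * @param after the "after" set of the finish record
     * @return true if the transaction belongs to the cut
     */
    public static boolean isBeforeCut(String txId, boolean committedBeforeStart,
                                      Set<String> before, Set<String> after) {
        // Assumption: commits logged ahead of the start record are in the cut
        // unless the finish record explicitly excludes them.
        if (committedBeforeStart)
            return !after.contains(txId);

        // Assumption: commits logged between start and finish are in the cut
        // only if the finish record explicitly includes them.
        return before.contains(txId);
    }
}
```

Under this reading, a recovery procedure would need only the position of a commit relative to the start record plus the two sets to decide which transactions to apply.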
Unstable topology
There are some cases to handle for unstable topology:
- Client or non-baseline server node leaves – no need to handle.
- Server node leaves:
- all nodes finish their locally running ConsistentCut and mark it inconsistent
- Server node joins:
- all nodes finish their locally running ConsistentCut and mark it inconsistent
- the new node checks whether a rebalance was required for recovery. If so, how to handle it is TBD
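The cases above can be summarised as a small decision function. This is only a sketch: the class `TopologyChangePolicy`, its enums and the method name are hypothetical. The list does not mention a client node joining, so treating that case as ignorable is an assumption. The rebalance check on the joining node is left out because it is still TBD.

```java
/** Hypothetical sketch of the topology cases; all names are illustrative. */
public final class TopologyChangePolicy {
    public enum NodeKind { CLIENT, NON_BASELINE_SERVER, BASELINE_SERVER }

    public enum Event { LEFT, JOINED }

    public enum Action { IGNORE, FINISH_CUT_AS_INCONSISTENT }

    private TopologyChangePolicy() {}

    /** Maps a topology event to the reaction of a node that is running a local cut. */
    public static Action onTopologyChange(NodeKind kind, Event evt) {
        // A client leaving needs no handling; a client joining is not covered
        // by the list, so ignoring it is an assumption.
        if (kind == NodeKind.CLIENT)
            return Action.IGNORE;

        // A non-baseline server leaving needs no handling.
        if (kind == NodeKind.NON_BASELINE_SERVER && evt == Event.LEFT)
            return Action.IGNORE;

        // A server leaving or joining: finish the running cut and mark it inconsistent.
        return Action.FINISH_CUT_AS_INCONSISTENT;
    }
}
```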
TBD: how to avoid inconsistency between data and WAL after a rebalance. The options are:
...
- During restore, read this record and repeat the historical rebalance at that point; after the rebalance, resume recovery with the existing WALs.
- If the record describes a full rebalance, stop WAL recovery and fall back to a full rebalance.
? Is it possible to rebalance only specific cache groups and continue WAL recovery for the others?
- A historical rebalance during recovery needs separate logic for extracting records from the WAL archives of other nodes.
Links
- ON DISTRIBUTED SNAPSHOTS, Ten H. LAI and Tao H. YANG, 29 May 1987