
...

First solution: set the flag ConsistentCutManager#inconsistent to true and persist it in the local MetaStorage.
1. On receiving a new ConsistentCutMarker, check the flag and raise an exception if it is set.
2. Clear the flag when recovering from a ClusterSnapshot, or after creating a ClusterSnapshot.
+ Simple handling at runtime.
- A single node has to be rebalanced after PITR with file-based (or other?) rebalance: recovery takes more time, and file-based rebalance still has unresolved issues(?).
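The flag handling in the first solution can be sketched roughly as follows. This is a minimal illustration, not the actual Ignite code: `LocalMetaStorageSketch`, `ConsistentCutFlagSketch`, the key name and all method names are hypothetical, and the in-memory map only stands in for the persistent local MetaStorage.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the node-local MetaStorage (in memory here, persistent in reality). */
class LocalMetaStorageSketch {
    private final Map<String, Boolean> data = new HashMap<>();

    void write(String key, boolean val) { data.put(key, val); }

    boolean read(String key) { return data.getOrDefault(key, Boolean.FALSE); }
}

/** Illustrative model of the flag handling; not the real ConsistentCutManager. */
class ConsistentCutFlagSketch {
    /** Hypothetical MetaStorage key under which the flag is stored. */
    static final String INCONSISTENT_KEY = "consistent-cut.inconsistent";

    private final LocalMetaStorageSketch metaStorage;

    ConsistentCutFlagSketch(LocalMetaStorageSketch metaStorage) {
        this.metaStorage = metaStorage;
    }

    /** Sets the flag and stores it, so it is still visible after a restart. */
    void markInconsistent() { metaStorage.write(INCONSISTENT_KEY, true); }

    boolean inconsistent() { return metaStorage.read(INCONSISTENT_KEY); }

    /** Step 1: refuse to handle a new marker while the node is inconsistent. */
    void onMarkerReceived(long markerId) {
        if (inconsistent())
            throw new IllegalStateException("Node is inconsistent, cannot handle marker " + markerId);
    }

    /** Step 2: a restored or freshly created snapshot gives a consistent base again. */
    void onSnapshotRestoredOrCreated() { metaStorage.write(INCONSISTENT_KEY, false); }
}
```

Because the flag lives in the storage rather than in a field of the manager, a second manager instance created over the same storage (modelling a node restart) still sees it.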
Do not disable WAL during rebalance:
+ WAL stays consistent with the data, so it can be used for PITR after rebalancing.
- Too many inserts during rebalance may affect rebalance speed and IO utilization.
Automatically create snapshots for rebalancing cache groups:
+ Guarantees consistency (the snapshot is created on PME, after rebalance).
+ Faster recovery.
- Complex handling: Ignite would have to provide an additional tool for merging WAL and multiple snapshots created at different times.
? Is it possible to create a snapshot only on the single rebalanced node?
? Is it possible to sync the current PME (on node join) with starting a snapshot?
Write a special Rebalance record to WAL with a description of the demanded partition:
1. During restore, read this record and repeat the historical rebalance at this point; after the rebalance, resume recovery with the existing WALs.
2. If the record describes a full rebalance, stop recovering from WAL and fall back to full rebalance.
  ? Is it possible to rebalance only specific cache groups and continue WAL recovery for the others?
  - Historical rebalance during recovery needs separate logic for extracting records from the WAL archives of other nodes.
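The restore loop for the Rebalance-record option could look roughly like the sketch below. All names (`WalRebalanceRecoverySketch`, `WalEntry`, `RebalanceType`, `Outcome`) are hypothetical and do not correspond to existing Ignite WAL record types; the rebalance itself is only logged as an action, since its real implementation is exactly the open question noted above.

```java
import java.util.List;

class WalRebalanceRecoverySketch {
    enum RebalanceType { HISTORICAL, FULL }

    enum Outcome { RECOVERED_FROM_WAL, FALLBACK_TO_FULL_REBALANCE }

    /** Hypothetical WAL entry: either an ordinary data record or a rebalance record. */
    static final class WalEntry {
        final String data;             // non-null for ordinary data records
        final RebalanceType rebalance; // non-null for rebalance records
        final int partition;           // demanded partition, -1 for data records

        private WalEntry(String data, RebalanceType rebalance, int partition) {
            this.data = data;
            this.rebalance = rebalance;
            this.partition = partition;
        }

        static WalEntry data(String data) { return new WalEntry(data, null, -1); }

        static WalEntry rebalance(RebalanceType type, int partition) {
            return new WalEntry(null, type, partition);
        }
    }

    /** Replays the WAL in order and appends a description of each step to {@code actions}. */
    static Outcome recover(List<WalEntry> wal, List<String> actions) {
        for (WalEntry e : wal) {
            if (e.rebalance == null) {
                actions.add("apply " + e.data);
                continue;
            }

            if (e.rebalance == RebalanceType.FULL) {
                // Step 2: WAL after this point cannot be applied, give up on WAL recovery.
                actions.add("full rebalance p" + e.partition);
                return Outcome.FALLBACK_TO_FULL_REBALANCE;
            }

            // Step 1: repeat the historical rebalance, then keep replaying the WAL.
            actions.add("historical rebalance p" + e.partition);
        }

        return Outcome.RECOVERED_FROM_WAL;
    }
}
```

In this model a historical record only pauses WAL replay, while a full-rebalance record ends it, which mirrors the two numbered steps.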

...


Versions Compared

Old Version 39

New Version 40
