...
- First solution: set the flag ConsistentCutManager#inconsistent to true, and persist this flag within the local MetaStorage.
- On receiving a new ConsistentCutMarker (with its ConsistentCutVersion), check the flag and raise an exception.
- Clean the flag on recovering from a ClusterSnapshot, or after creating a ClusterSnapshot.
- + Simple handling at runtime.
- - Need to rebalance a single node after PITR with file-based (or other?) rebalance: more time for recovery, and file-based rebalance still has unresolved issues(?).
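The flag handling described above can be sketched as follows. This is a minimal illustration only: the class and method names (LocalMetaStorage, markInconsistent, onConsistentCutMarker, onClusterSnapshot) are assumed for the sketch and are not Ignite's actual API; a real implementation would use Ignite's MetaStorage and Consistent Cut machinery.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the node-local MetaStorage key-value store (illustrative). */
class LocalMetaStorage {
    private final Map<String, Boolean> store = new HashMap<>();

    void write(String key, boolean val) { store.put(key, val); }
    boolean read(String key)            { return store.getOrDefault(key, false); }
    void remove(String key)             { store.remove(key); }
}

/** Sketch of the proposed inconsistency-flag handling (names are illustrative). */
class ConsistentCutManager {
    static final String INCONSISTENT_KEY = "consistent-cut-inconsistent";

    private final LocalMetaStorage metaStorage;

    ConsistentCutManager(LocalMetaStorage ms) { this.metaStorage = ms; }

    /** Called when WAL is disabled during rebalance: set and persist the flag. */
    void markInconsistent() {
        metaStorage.write(INCONSISTENT_KEY, true);
    }

    /** Called on receiving a new ConsistentCutMarker: refuse while inconsistent. */
    void onConsistentCutMarker() {
        if (metaStorage.read(INCONSISTENT_KEY))
            throw new IllegalStateException("Consistent Cut is not possible: " +
                "node data is inconsistent until a ClusterSnapshot is created or restored.");
    }

    /** Called after creating or restoring a ClusterSnapshot: clean the flag. */
    void onClusterSnapshot() {
        metaStorage.remove(INCONSISTENT_KEY);
    }
}
```

Because the flag is persisted in the local MetaStorage, it survives node restarts, so a restarted node keeps rejecting Consistent Cuts until a snapshot is taken or restored.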
- Do not disable WAL during rebalance:
+ WAL stays consistent with data, so it can be used for PITR after rebalancing.
- Heavy insert load during rebalance may hurt rebalance speed and I/O utilization.
- Automatically create snapshots for rebalancing cache groups:
+ Guarantee of consistency (snapshot is created on PME, after rebalance).
+ Faster recovery.
- Complex handling: Ignite would need an additional tool for merging WAL and multiple snapshots created at different times.
? Is it possible to create a snapshot only on the single rebalanced node?
? Is it possible to sync the current PME (on node join) with starting the snapshot?
- Write a special Rebalance record to WAL with a description of the demanded partitions:
- During restore, read this record and repeat the historical rebalance at this point; after the rebalance, resume recovery with the existing WALs.
- If the record describes a full rebalance, stop recovering with WAL and fall back to full rebalance.
? Is it possible to rebalance only specific cache groups, and continue WAL recovery for others?
- Historical rebalance during recovery needs separate logic for extracting records from other nodes' WAL archives.
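The Rebalance-record approach above can be sketched as a record type plus a recovery branch. All names here (RebalanceRecord, RecoveryProcessor, the field names) are assumptions for illustration; a real implementation would extend Ignite's WAL record hierarchy.

```java
import java.util.List;

/** Sketch of a special WAL record describing demanded partitions (illustrative). */
class RebalanceRecord {
    enum Type { HISTORICAL, FULL }

    final Type type;
    final int grpId;           // cache group id
    final List<Integer> parts; // demanded partitions
    final long fromCntr;       // starting update counter for historical rebalance

    RebalanceRecord(Type type, int grpId, List<Integer> parts, long fromCntr) {
        this.type = type;
        this.grpId = grpId;
        this.parts = parts;
        this.fromCntr = fromCntr;
    }
}

/** Sketch of how recovery could branch on such a record. */
class RecoveryProcessor {
    /** Returns true if WAL-based recovery can continue after handling the record. */
    boolean onRebalanceRecord(RebalanceRecord rec) {
        switch (rec.type) {
            case HISTORICAL:
                // Repeat the historical rebalance for rec.parts starting from
                // rec.fromCntr, then resume recovery with the existing WALs.
                return true;

            case FULL:
                // A full rebalance was demanded at this point: stop recovering
                // with WAL and fall back to full rebalance from live nodes.
                return false;
        }
        return true;
    }
}
```

The boolean result captures the decision point described above: historical rebalance lets WAL recovery resume afterwards, while a full rebalance invalidates the remaining WAL for that group.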
...