...

  1. First solution: set the flag ConsistentCutManager#inconsistent to true and persist it in the local MetaStorage.
    1. On receiving a new ConsistentCutMarker, check the flag and raise an exception if it is set.
    2. Clear the flag on recovery from a ClusterSnapshot, or after creating a ClusterSnapshot.
    3. + Simple handling at runtime.
    4. - After PITR, a single node must be rebalanced with file-based (or other?) rebalance: recovery takes longer, and file-based rebalance still has unresolved issues(?).

  2. Do not disable WAL during rebalance:
    + WAL stays consistent with the data, so it can be used for PITR after rebalancing.
    - Too many inserts during rebalance may reduce rebalance speed and increase IO utilization.

  3. Automatically create snapshots for rebalancing cache groups:
    + Guarantee of consistency (snapshot is created on PME, after rebalance).
    + Faster recovery.
    - Complex handling: Ignite would have to provide an additional tool for merging WAL and multiple snapshots created at different times.
    ? Is it possible to create a snapshot on only the single rebalanced node?
    ? Is it possible to sync the current PME (on node join) with the start of the snapshot?

  4. Write a special Rebalance record to WAL with a description of the demanded partition:
    1. During restore, read this record and repeat the historical rebalance at this point; after the rebalance, resume recovery with the existing WALs.
    2. If the record describes a full rebalance, stop recovering from WAL and fall back to full rebalance.
      ? Is it possible to rebalance only specific cache groups and continue WAL recovery for the others?
      - Historical rebalance during recovery needs separate logic for extracting records from the WAL archives of other nodes.
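
The control flow of option 4 can be sketched roughly as below. This is only an illustration under stated assumptions: RebalanceAwareRecovery, WalRecord, Type and Result are hypothetical names invented for this sketch and are not part of the Ignite code base, and real WAL records, partition descriptions and the rebalance itself are reduced to strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of option 4; none of these types exist in Apache Ignite. */
class RebalanceAwareRecovery {
    /** Record kinds the sketch distinguishes (assumed, not the real WAL record types). */
    enum Type { DATA, HISTORICAL_REBALANCE, FULL_REBALANCE }

    /** Simplified WAL record: a type plus a free-form payload. */
    static final class WalRecord {
        final Type type;
        final String payload;

        WalRecord(Type type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    /** Recovery outcome: the actions taken, and whether a full rebalance is still needed. */
    static final class Result {
        final List<String> actions = new ArrayList<>();
        boolean fullRebalanceRequired;
    }

    /** Walks the WAL in order and reacts to Rebalance records as described in option 4. */
    static Result recover(List<WalRecord> wal) {
        Result res = new Result();

        for (WalRecord rec : wal) {
            switch (rec.type) {
                case DATA:
                    // Ordinary record: apply it as usual.
                    res.actions.add("apply:" + rec.payload);
                    break;

                case HISTORICAL_REBALANCE:
                    // Repeat the historical rebalance at this point, then keep reading the WAL.
                    res.actions.add("historical-rebalance:" + rec.payload);
                    break;

                case FULL_REBALANCE:
                    // Stop recovering from WAL and fall back to full rebalance.
                    res.actions.add("stop:" + rec.payload);
                    res.fullRebalanceRequired = true;
                    return res;
            }
        }

        return res;
    }

    public static void main(String[] args) {
        Result res = recover(Arrays.asList(
            new WalRecord(Type.DATA, "tx1"),
            new WalRecord(Type.HISTORICAL_REBALANCE, "part-3"),
            new WalRecord(Type.DATA, "tx2"),
            new WalRecord(Type.FULL_REBALANCE, "part-7"),
            new WalRecord(Type.DATA, "tx3")));

        System.out.println(res.actions);
        System.out.println(res.fullRebalanceRequired);
    }
}
```

In this sketch the record after the full-rebalance marker (tx3) is never applied, which mirrors step 2. The open question about continuing WAL recovery for other cache groups would require tracking the stop condition per cache group rather than for the whole node.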

...