Rebalancing an Apache Ignite cluster redistributes the primary and backup copies of data across the new set of peer nodes according to the configured affinity function. Imbalanced data increases the likelihood of data loss and can skew peer utilization during data requests. On the other hand, a balanced set of data copies evens out both the request load and the disk space consumed on each Ignite peer.
Currently, there are two types of Apache Ignite cluster rebalancing:
Regardless of which rebalance mode is used, `SYNC` or `ASYNC` (defined in the `CacheRebalanceMode` enum), the Apache Ignite rebalance implementation has a number of limitations caused by its memory-centric design. Although all cache data is sent between peers in batches (via `GridDhtPartitionSupplyMessage`), the receiving node still processes entries one by one. This approach has little impact in a pure in-memory Apache Ignite use case, but with the native persistence enabled it leads to additional fsyncs and WAL record logging.
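As an illustration, the rebalance mode is configured per cache. A minimal sketch, assuming the public `CacheConfiguration.setRebalanceMode` API; the cache name and key/value types are hypothetical:

```java
import org.apache.ignite.cache.CacheRebalanceMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class RebalanceModeExample {
    // Sketch: configure a cache to rebalance synchronously, so cache
    // operations on a joining node wait until rebalancing completes.
    static CacheConfiguration<Integer, String> cacheConfig() {
        CacheConfiguration<Integer, String> cfg =
                new CacheConfiguration<>("myCache"); // hypothetical cache name
        cfg.setRebalanceMode(CacheRebalanceMode.SYNC); // ASYNC is the default
        return cfg;
    }
}
```

With `SYNC`, calls against the cache block until the initial data load finishes; `ASYNC` lets the node serve requests while rebalancing proceeds in the background.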
By default, `setRebalanceThreadPoolSize` is set to `1` and `setRebalanceBatchSize` to `512K`, which means thousands of key-value pairs are processed by a single thread, one entry at a time. This leads to:
The rebalancing procedure doesn't utilize network and storage device throughput to the full extent.
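The two defaults above can be raised on the node configuration. A hedged sketch, assuming the `IgniteConfiguration` setters available in recent Ignite versions (older versions exposed these properties on `CacheConfiguration`); the chosen values are illustrative, not recommendations:

```java
import org.apache.ignite.configuration.IgniteConfiguration;

public class RebalanceTuningExample {
    // Sketch: raise rebalance parallelism and batch size above the
    // defaults discussed in the text (1 thread, 512 KB batches).
    static IgniteConfiguration igniteConfig() {
        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setRebalanceThreadPoolSize(4);      // default: 1 thread
        cfg.setRebalanceBatchSize(1024 * 1024); // bytes per supply message; default: 512 KB
        return cfg;
    }
}
```

Larger batches and more supplier threads can better saturate the network, though with persistence enabled the per-entry processing described above remains the bottleneck.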