...
When the supplier node receives the cache partition file demand request it will send the file over the CommunicationSpi. The cache partition file can be concurrently updated by checkpoint thread during its transmission. To guarantee the file consistency Сheckpointer must use Copy-on-Write [3] tehnique and save a copy of updated chunk into the temporary file.
While the demander node is in the partition file transmission state it must save sequentially all cache entries corresponding to the MOVING partition into a new temporary storage. These entries will be applied later one by one on the newly received cache partition file. All asynchronous operations will be enrolled to the end of temporary storage during storage reads until it becomes fully read. The file-based FIFO approach assumes to be used by this process.
The temporary storage is chosen to be WAL-based. The storage must support to:
Expected problems to be solved
The node is ready to become partition owner when partition data is rebalanced and cache indexes are ready. For the message-based cluster rebalancing approach indexes are rebuilding simultaneously with cache data loading. For the file-based rebalancing approach, the index rebuild procedure must be finished before the partition state is set to the OWNING state.
...
To provide basic recovery guarantees we must to:
...
Recovery from different stages:
The SSL must be disabled to take an advantage of Java NIO zero-copy file transmission using FileChannel#transferTo method. If we need to use SSL the file must be splitted on chunks the same way to send them over the socket channel with ByteBuffer. As the SSL engine generally needs a direct ByteBuffer to do encryption we can't avoid copying buffer payload from the kernel level to the application level.
...