...

  1. The demander node prepares the set of IgniteDhtDemandedPartitionsMap#full cache partitions to fetch;
  2. The demander node checks the compatibility version (for example, 2.8) and starts recording all incoming cache updates to a new special storage, the temporary WAL;
  3. The demander node sends the GridDhtPartitionDemandMessage to the supplier node;
  4. When the supplier node receives the GridDhtPartitionDemandMessage, it starts a new checkpoint process;
  5. The supplier node creates an empty temporary cache partition file with the .tmp postfix in the same cache persistence directory;
  6. The supplier node splits the whole cache partition file into virtual chunks of a predefined size (a multiple of the PageMemory page size);
    1. If the concurrent checkpoint thread determines the appropriate cache partition file chunk and tries to flush a dirty page to the cache partition file
      1. If the rebalance chunk has already been transferred
        1. Flush the dirty page to the file;
      2. If the rebalance chunk has not been transferred yet
        1. Write this chunk to the temporary cache partition file;
        2. Flush the dirty page to the file;
    2. The node starts sending each cache partition file chunk to the demander node one by one using FileChannel#transferTo
      1. If the current chunk was modified by the checkpoint thread, read it from the temporary cache partition file;
      2. If the current chunk has not been touched, read it from the original cache partition file;
  7. The demander node starts to listen for new incoming pipe connections from the supplier node on TcpCommunicationSpi;
  8. The demander node creates the temporary cache partition file with .tmp postfix in the same cache persistence directory;
  9. The demander node receives each cache partition file chunk one by one
    1. The node checks the CRC of each PageMemory page in the downloaded chunk;
    2. The node flushes the downloaded chunk at the appropriate cache partition file position;
  10. When the demander node receives the whole cache partition file
    1. The node initializes the received .tmp file as the corresponding cache partition file;
    2. The thread-per-partition begins to apply data entries from the beginning of the temporary WAL storage;
    3. All async operations corresponding to this partition file still write to the end of the temporary WAL;
    4. At the moment the temporary WAL storage is about to become empty
      1. Start the first checkpoint;
      2. Wait for the first checkpoint to end and own the cache partition;
      3. All operations are now switched to the partition file instead of writing to the temporary WAL;
      4. Schedule the temporary WAL storage deletion;
  11. The supplier node deletes the temporary cache partition file;
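
The supplier-side handling of steps 5, 6 and 11 can be sketched as a small copy-on-write tracker. This is only an illustration under stated assumptions: the class `ChunkedPartitionSnapshot`, its method names, and the use of a single lock shared by the checkpoint thread and the sender are hypothetical and are not taken from Ignite. For brevity the sketch reads chunks into a buffer instead of streaming them with FileChannel#transferTo.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.util.BitSet;

import static java.nio.file.StandardOpenOption.CREATE;
import static java.nio.file.StandardOpenOption.READ;
import static java.nio.file.StandardOpenOption.WRITE;

/** Hypothetical sketch of supplier-side chunk tracking; not an Ignite class. */
final class ChunkedPartitionSnapshot implements AutoCloseable {
    private final FileChannel part;  // original cache partition file
    private final FileChannel tmp;   // temporary .tmp file that keeps preserved chunks
    private final int chunkSize;     // assumed to be a multiple of the page size
    private final BitSet transferred = new BitSet();
    private final BitSet preserved = new BitSet();

    ChunkedPartitionSnapshot(Path partFile, Path tmpFile, int chunkSize) throws IOException {
        this.part = FileChannel.open(partFile, READ, WRITE);
        this.tmp = FileChannel.open(tmpFile, CREATE, READ, WRITE);
        this.chunkSize = chunkSize;
    }

    /** Checkpoint thread: preserve a not yet transferred chunk, then flush the dirty page. */
    synchronized void flushDirtyPage(long pageOff, ByteBuffer page) throws IOException {
        int chunk = (int) (pageOff / chunkSize);
        if (!transferred.get(chunk) && !preserved.get(chunk)) {
            long off = (long) chunk * chunkSize;
            ByteBuffer copy = ByteBuffer.allocate(chunkSize);
            readFully(part, copy, off);
            copy.flip();
            writeFully(tmp, copy, off);
            preserved.set(chunk);
        }
        writeFully(part, page, pageOff);
    }

    /** Sender: returns the chunk content as it was before any concurrent page flush. */
    synchronized ByteBuffer readChunk(int chunk) throws IOException {
        FileChannel src = preserved.get(chunk) ? tmp : part;
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        readFully(src, buf, (long) chunk * chunkSize);
        buf.flip();
        transferred.set(chunk);
        return buf;
    }

    private static void readFully(FileChannel ch, ByteBuffer buf, long pos) throws IOException {
        // Stops early at end of file, so the last chunk may be shorter than chunkSize.
        while (buf.hasRemaining() && ch.read(buf, pos + buf.position()) >= 0) { /* keep reading */ }
    }

    private static void writeFully(FileChannel ch, ByteBuffer buf, long pos) throws IOException {
        int start = buf.position();
        while (buf.hasRemaining())
            ch.write(buf, pos + (buf.position() - start));
    }

    @Override public void close() throws IOException {
        part.close();
        tmp.close();
    }
}
```

In this sketch a chunk is copied to the temporary file at most once, the first time a page inside it is about to be overwritten, so the sender always sees the chunk as it was when the transfer started. Deleting the temporary file after the transfer (step 11) is left to the caller.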

...

  • The new layer over the cache partition file must support direct use of the FileChannel#transferTo method over the CommunicationSpi pipe connection;
  • The process manager must support transferring the cache partition file one chunk at a time, in chunks of a predefined size (a multiple of the page size);
  • It must be possible to limit the connection bandwidth of the cache partition file transfer at runtime;
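
A minimal sketch of how these three requirements could fit together is shown below. The class `ThrottledChunkSender`, its constructor parameters and the pause-per-chunk throttling strategy are assumptions made for illustration only; they are not an Ignite API. The sketch also assumes a blocking target channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical chunked sender with a runtime-adjustable rate limit; not an Ignite class. */
final class ThrottledChunkSender {
    /** Bytes per second; zero or negative means unlimited. */
    private final AtomicLong bytesPerSec = new AtomicLong();
    private final int chunkSize;

    ThrottledChunkSender(int pageSize, int pagesPerChunk, long bytesPerSec) {
        // The chunk size is kept a multiple of the page size.
        this.chunkSize = Math.multiplyExact(pageSize, pagesPerChunk);
        this.bytesPerSec.set(bytesPerSec);
    }

    /** May be called from another thread while a transfer is running. */
    void setBytesPerSec(long limit) {
        bytesPerSec.set(limit);
    }

    /** Sends the whole file chunk by chunk and returns the number of bytes sent. */
    long send(FileChannel src, WritableByteChannel dst) throws IOException {
        long size = src.size();
        long pos = 0;
        while (pos < size) {
            long chunkStart = pos;
            long chunkEnd = Math.min(pos + chunkSize, size);
            long started = System.nanoTime();
            while (pos < chunkEnd) {
                // transferTo may move fewer bytes than requested, so loop until the chunk is done.
                long n = src.transferTo(pos, chunkEnd - pos, dst);
                if (n == 0 && pos >= src.size())
                    throw new IOException("partition file was truncated during transfer");
                pos += n;
            }
            pause(chunkEnd - chunkStart, System.nanoTime() - started);
        }
        return pos;
    }

    /** Sleeps for the rest of the time budget of one chunk, if a limit is set. */
    private void pause(long sentBytes, long elapsedNanos) {
        long limit = bytesPerSec.get(); // re-read per chunk so a runtime change takes effect
        if (limit <= 0)
            return;
        long budgetNanos = sentBytes * TimeUnit.SECONDS.toNanos(1) / limit;
        if (budgetNanos > elapsedNanos)
            LockSupport.parkNanos(budgetNanos - elapsedNanos);
    }
}
```

Because the limit is read once per chunk, a change made through `setBytesPerSec` applies from the next chunk onward; the resulting rate is approximate, since `parkNanos` may return early.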

Checkpointer

...

While the demander node is in the partition file transmission state, it must save all cache entries corresponding to the moving partition into a new temporary WAL storage. These entries will be applied later, one by one, to the received cache partition file. All asynchronous operations will be appended to the end of the temporary WAL storage while it is being read, until it has been fully read. A file-based FIFO approach is assumed for this process.
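
The file-based FIFO idea can be illustrated with a small length-prefixed record file that is appended at the tail while being read from the head. The class `TemporaryWalQueue`, its record format and its coarse locking are hypothetical choices for this sketch and do not describe the actual Ignite write-ahead-log manager.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;

import static java.nio.file.StandardOpenOption.CREATE;
import static java.nio.file.StandardOpenOption.READ;
import static java.nio.file.StandardOpenOption.TRUNCATE_EXISTING;
import static java.nio.file.StandardOpenOption.WRITE;

/** Hypothetical file-backed FIFO of length-prefixed records; not Ignite's WAL manager. */
final class TemporaryWalQueue implements AutoCloseable {
    private final FileChannel ch;
    private long readPos;   // offset of the next record to apply
    private long writePos;  // offset at which the next record is appended

    TemporaryWalQueue(Path file) throws IOException {
        ch = FileChannel.open(file, CREATE, READ, WRITE, TRUNCATE_EXISTING);
    }

    /** Appends one record to the end of the storage. */
    synchronized void append(byte[] rec) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + rec.length);
        buf.putInt(rec.length).put(rec).flip();
        while (buf.hasRemaining())
            writePos += ch.write(buf, writePos);
    }

    /** Returns the oldest unread record, or null once the storage has been fully read. */
    synchronized byte[] poll() throws IOException {
        if (readPos == writePos)
            return null;
        ByteBuffer len = ByteBuffer.allocate(Integer.BYTES);
        readFully(len, readPos);
        ByteBuffer rec = ByteBuffer.allocate(len.getInt(0));
        readFully(rec, readPos + Integer.BYTES);
        readPos += Integer.BYTES + rec.capacity();
        return rec.array();
    }

    /** True when every appended record has already been read. */
    synchronized boolean isDrained() {
        return readPos == writePos;
    }

    private void readFully(ByteBuffer buf, long pos) throws IOException {
        while (buf.hasRemaining())
            if (ch.read(buf, pos + buf.position()) < 0)
                throw new IOException("unexpected end of temporary WAL");
    }

    @Override public void close() throws IOException {
        ch.close();
    }
}
```

In a real implementation the check that the storage is drained and the switch of writers to the partition file would have to happen atomically, so that no update is appended after the last record has been applied; the sketch leaves that coordination out.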

The new write-ahead-log manager for writing temporary records must support:

...

In case of crash recovery, no additional actions are needed to keep the cache partition file consistent. A partition in the moving state is not recovered, so the single partition file being transferred will be lost, and only that file. Its uniqueness is guaranteed by the single-file-transmission process. The cache partition file will be fully loaded again by the next rebalance procedure.

...