...

  • Rotate cache group key - add a new encryption key on each node and set it for writing.
  • Schedule background re-encryption for archived data and clean up the old key when it completes.

...

To support multiple keys for reading encrypted data, a key identifier must be stored on each encrypted page and in each encrypted WAL record. The key identifier is a sequential counter and should be the same on all nodes.
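As an illustration of why the identifier is needed, here is a minimal sketch of a per-group key ring that writes with the active key and reads with whichever key the page names. The class and function names, the one-byte identifier prefix, and the XOR "cipher" are assumptions made to keep the example runnable; they are not the actual page layout or encryption scheme.

```python
from dataclasses import dataclass, field


@dataclass
class GroupKeyRing:
    """Keys of one cache group, indexed by a sequential key identifier."""
    keys: dict = field(default_factory=dict)  # key_id -> key bytes
    active_id: int = -1                       # identifier of the key used for writing

    def add(self, key_id: int, key: bytes) -> None:
        if key_id in self.keys:
            raise ValueError(f"key identifier {key_id} already exists")
        self.keys[key_id] = key

    def set_active(self, key_id: int) -> None:
        if key_id not in self.keys:
            raise KeyError(key_id)
        self.active_id = key_id


def xor(data: bytes, key: bytes) -> bytes:
    # Toy stand-in for a real cipher (assumption, for illustration only).
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def write_page(ring: GroupKeyRing, payload: bytes) -> bytes:
    # Always encrypt with the active key and record its identifier on the page.
    # The one-byte prefix is a hypothetical layout (limits ids to 0..255).
    return bytes([ring.active_id]) + xor(payload, ring.keys[ring.active_id])


def read_page(ring: GroupKeyRing, page: bytes) -> bytes:
    # Pick the key named by the page, which may be older than the active key.
    return xor(page[1:], ring.keys[page[0]])


ring = GroupKeyRing()
ring.add(0, b"old-key")
ring.set_active(0)
old_page = write_page(ring, b"hello")

ring.add(1, b"new-key")
ring.set_active(1)
new_page = write_page(ring, b"world")

# Both pages stay readable after rotation because each names its own key.
print(read_page(ring, old_page), read_page(ring, new_page))
```

Because the identifier travels with the data, a page written before rotation remains readable until it is re-encrypted and the old key is removed.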

  1. Start distributed process CACHE_GROUP_KEY_CHANGE_PREPARE, each node
    1. verifies that re-encryption is not in progress
    2. ensures that the new key identifier does not exist
    3. adds new key
  2. After successful completion of PREPARE, start distributed process CACHE_GROUP_KEY_CHANGE_FINISH, each node
    1. saves a logical WAL record (ENCRYPTION_STATUS_RECORD) with the current page count in partitions.
    2. stores the current page count as the total pages for background re-encryption on partitions.
    3. adds the mapping "WAL segment -> *old* key identifier" (to safely clean up this key in the future)
    4. sets the new key for writing
    5. starts background re-encryption
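The two-phase flow above can be modelled with a small in-memory simulation. This is only a sketch under stated assumptions: the `Node` class, its fields, and the `rotate` helper are hypothetical, nodes are plain objects rather than cluster members, and failure handling is reduced to raising an exception before the second phase starts.

```python
class Node:
    """Hypothetical in-memory model of one node's encryption state."""

    def __init__(self, name):
        self.name = name
        self.keys = {0: b"key-0"}   # key identifier -> key
        self.active_id = 0          # key currently used for writing
        self.reencrypting = False
        self.wal_segment = 0        # index of the current WAL segment
        self.wal_to_old_key = {}    # WAL segment -> old key identifier

    def prepare(self, new_id, new_key):
        # PREPARE: validate, then add the key without using it for writing.
        if self.reencrypting:
            raise RuntimeError(f"{self.name}: re-encryption is in progress")
        if new_id in self.keys:
            raise RuntimeError(f"{self.name}: key id {new_id} already exists")
        self.keys[new_id] = new_key

    def finish(self, new_id):
        # FINISH: remember which WAL segment may still need the old key,
        # switch writing to the new key, and start re-encryption.
        self.wal_to_old_key[self.wal_segment] = self.active_id
        self.active_id = new_id
        self.reencrypting = True


def rotate(nodes, new_id, new_key):
    # Phase 1 must succeed on every node before phase 2 starts anywhere.
    for node in nodes:
        node.prepare(new_id, new_key)
    for node in nodes:
        node.finish(new_id)


cluster = [Node("a"), Node("b"), Node("c")]
rotate(cluster, 1, b"key-1")
print([(n.name, n.active_id, n.wal_to_old_key) for n in cluster])
```

Splitting the change into two phases means no node starts writing with a key that some other node has not yet stored.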

Background re-encryption

The process applies to all existing partitions, including the index partition.

Scan all pages in the specified range (metaPageId + [offset -> total]):

  1. acquire page
    1. if the checkpoint is finished (after key change) and page is dirty - skip this page.
    2. if the checkpoint is not finished or page is not dirty
      1. lock page
      2. unlock page (dirty=true)
  2. release page
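A minimal sketch of the scan loop follows. It assumes that marking a page dirty is enough for the next checkpoint to write it with the active key, which is an interpretation of the steps above rather than something stated outright. The `Page` class, the `checkpoint_finished` callback, and the `checkpoint` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    dirty: bool = False
    key_id: int = 0  # identifier of the key the stored copy is encrypted with


def reencrypt_range(pages, offset, total, checkpoint_finished):
    """Mark pages in [offset, total) dirty; return how many were marked."""
    marked = 0
    for idx in range(offset, total):
        page = pages[idx]  # acquire page
        if checkpoint_finished() and page.dirty:
            # Already dirty after the key change: the next checkpoint
            # will write it with the new key anyway, so skip it.
            continue
        # lock page, then unlock with dirty=True (locking omitted in this model)
        page.dirty = True
        marked += 1
        # release page
    return marked


def checkpoint(pages, active_id):
    # Hypothetical checkpoint: dirty pages are written with the active key.
    for page in pages:
        if page.dirty:
            page.key_id = active_id
            page.dirty = False


pages = [Page() for _ in range(4)]
pages[1].dirty = True  # modified by a regular update after the key change
marked = reencrypt_range(pages, 0, len(pages), lambda: True)
checkpoint(pages, active_id=1)
print(marked, [p.key_id for p in pages])
```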

Re-encryption progress is stored in the metapage (int offset, int total) and is updated during the checkpoint.
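The progress record can be pictured as two 32-bit integers written at a fixed position in the metapage. In the sketch below the position, the big-endian byte order, and the function names are assumptions; only the (int offset, int total) pair comes from the description above.

```python
import struct

# Two signed 32-bit integers: offset, total (byte order is an assumption).
PROGRESS = struct.Struct(">ii")


def save_progress(meta: bytearray, pos: int, offset: int, total: int) -> None:
    """Write re-encryption progress into the metapage buffer at `pos`."""
    PROGRESS.pack_into(meta, pos, offset, total)


def load_progress(meta: bytearray, pos: int):
    """Read (offset, total) back, e.g. to resume the scan after a restart."""
    return PROGRESS.unpack_from(meta, pos)


meta_page = bytearray(64)      # hypothetical metapage buffer
save_progress(meta_page, 16, 128, 1024)
offset, total = load_progress(meta_page, 16)
print(offset, total)           # scan would resume from page `offset`
```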

The process aborts only when a partition is destroyed.

Cleanup old key

...

It is possible that the node will fail after adding a new key, but before setting it for writing (as an active key).
This node doesn't know whether the PREPARE phase was successful or not, therefore, it does not know which key is currently being used for writing.
By default, it will try to rejoin with the old key. If the join is rejected, it should be possible to set the correct key identifier manually using the system property or command-line tools.
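The recovery rule can be sketched as a small decision function. The function name and the `override_id` parameter are hypothetical stand-ins for whatever system property or command-line option supplies the manual value.

```python
def resolve_write_key_id(stored_active_id, known_ids, override_id=None):
    """Choose the key identifier a restarting node uses for writing.

    stored_active_id: identifier that was active before the failure (old key)
    known_ids:        identifiers of all keys stored on the node
    override_id:      manually supplied identifier, if any (hypothetical)
    """
    if override_id is None:
        # Default behaviour: try to rejoin with the old key.
        return stored_active_id
    if override_id not in known_ids:
        raise ValueError(f"key identifier {override_id} is not stored on this node")
    return override_id


# Node failed after adding key 1 but before setting it for writing.
print(resolve_write_key_id(0, {0, 1}))                 # first attempt: old key
print(resolve_write_key_id(0, {0, 1}, override_id=1))  # after a rejected join
```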

When a non-baseline node joins a cluster (with a baseline change), it cleans up all existing data, so this case should not be a problem.

...