...

  1. Start the distributed process CACHE_GROUP_KEY_CHANGE_PREPARE, in which each node:
    1. verifies that re-encryption is not in progress;
    2. ensures that the new key identifier does not already exist.
  2. After PREPARE completes successfully, start the distributed process CACHE_GROUP_KEY_CHANGE_FINISH, in which each node:
    1. adds the new key and sets it for writing;
    2. adds the mapping "WAL segment -> *old* key identifier" (so that the previous key can be safely cleaned up later);
    3. saves a logical WAL record (ENCRYPTION_STATUS_RECORD) with the current page count in partitions;
    4. stores the current page count as the total number of pages for background re-encryption on partitions;
    5. starts background re-encryption of existing data.
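
The two phases above can be sketched as per-node logic. This is a minimal illustration only: the class, its fields and its method names are hypothetical and are not Ignite's actual implementation. The WAL record write is represented by a comment, and page counts are reduced to a single number per cache group.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-node state for one cache group (illustration only). */
class GroupKeyChangeSketch {
    private final Map<Integer, byte[]> keys = new HashMap<>();          // key id -> key
    private final Map<Long, Integer> walSegToOldKey = new HashMap<>();  // WAL segment -> old key id
    private int activeKeyId;
    private boolean reencryptionInProgress;
    private long totalPagesToReencrypt;

    GroupKeyChangeSketch(int initialKeyId, byte[] initialKey) {
        keys.put(initialKeyId, initialKey);
        activeKeyId = initialKeyId;
    }

    /** PREPARE phase: validation only, no state is modified. */
    void prepare(int newKeyId) {
        if (reencryptionInProgress)
            throw new IllegalStateException("Re-encryption is in progress");
        if (keys.containsKey(newKeyId))
            throw new IllegalStateException("Key identifier already exists: " + newKeyId);
    }

    /** FINISH phase: switch the write key and start background re-encryption. */
    void finish(int newKeyId, byte[] newKey, long curWalSegment, long curPageCount) {
        int oldKeyId = activeKeyId;
        keys.put(newKeyId, newKey);
        activeKeyId = newKeyId;                       // new key is now used for writing
        walSegToOldKey.put(curWalSegment, oldKeyId);  // remember where the old key was last used
        // A real node would write the logical WAL record with curPageCount here.
        totalPagesToReencrypt = curPageCount;
        reencryptionInProgress = true;                // background task would be scheduled here
    }

    int activeKeyId() { return activeKeyId; }
    Integer oldKeyIdForSegment(long walSegment) { return walSegToOldKey.get(walSegment); }
    long totalPagesToReencrypt() { return totalPagesToReencrypt; }
    boolean reencryptionInProgress() { return reencryptionInProgress; }
}
```

Keeping PREPARE free of side effects means that a failed PREPARE leaves every node unchanged, so the process can simply be started again.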

After the FINISH phase is complete, a new encryption key for writing is set on all nodes, i.e. the key change process is formally completed.

Background re-encryption of existing data completes at some later point. The new "pagesLeftForReencryption" cache group metric can be used to track its progress ('0' means the process has finished).
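
As a rough illustration of how the metric could be interpreted, the helper below turns a "pages left" value into a completion check and a percentage. The class and method names are hypothetical. How the metric value is actually read is not shown here, so it is modelled as a plain LongSupplier, and the total page count is assumed to be known to the caller.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper for interpreting the pagesLeftForReencryption value. */
class ReencryptionProgress {
    /** Re-encryption has ended when no pages are left. */
    static boolean isFinished(LongSupplier pagesLeft) {
        return pagesLeft.getAsLong() == 0;
    }

    /** Percentage of pages already re-encrypted; an empty group counts as done. */
    static double percentDone(long totalPages, long pagesLeft) {
        if (totalPages <= 0)
            return 100.0;
        return 100.0 * (totalPages - pagesLeft) / totalPages;
    }
}
```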

Background re-encryption

The process applies to all existing partitions, including the index partition.

...

  1. re-encryption completed for cache group (and after that at least one checkpoint was completed)
  2. last WAL segment in which the key was used is removed
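
Assuming the two conditions above are the ones that gate cleanup of a previous key (the "WAL segment -> old key identifier" mapping suggests this, but the surrounding text is elided here), they could be combined as in the sketch below. All names are hypothetical, and "segment removed" is modelled as the segment index being lower than the oldest segment still retained.

```java
/** Hypothetical check for whether a previous encryption key may be dropped. */
class OldKeyCleanupSketch {
    static boolean canRemoveOldKey(boolean reencryptionDone,
                                   boolean checkpointAfterReencryption,
                                   long lastWalSegmentWithOldKey,
                                   long oldestRetainedWalSegment) {
        // Condition 1: re-encryption finished and at least one checkpoint followed it.
        boolean dataNoLongerNeedsKey = reencryptionDone && checkpointAfterReencryption;
        // Condition 2: the last WAL segment written with the old key has been removed.
        boolean walNoLongerNeedsKey = lastWalSegmentWithOldKey < oldestRetainedWalSegment;
        return dataNoLongerNeedsKey && walNoLongerNeedsKey;
    }
}
```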

Fault tolerance

If CACHE_GROUP_KEY_CHANGE_PREPARE has not been successfully completed on all nodes, the process is interrupted and must be restarted.
When the process restarts, a new key identifier is generated (an unused key will be overwritten).

...

Key rotation

The node join is rejected during the encryption key rotation, but this limitation may be revised in the future.

When a node joins the cluster (before or after key rotation), it receives the current encryption keys used for writing in the cache groups (it "rotates" the encryption key automatically).
If a received key is new to the node, the node sets it for writing and starts the background re-encryption process automatically.
Therefore, a node may leave the cluster during a key change, or may be absent and rejoin later (regardless of whether the baseline changes).
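
The join behaviour can be pictured with the following sketch, which compares the key identifier received from the cluster with the one the node currently writes with. The class and its members are hypothetical, and key material and the actual background task are left out.

```java
/** Hypothetical view of a joining node's state for one cache group. */
class JoiningNodeSketch {
    private int activeKeyId;
    private boolean reencryptionStarted;

    JoiningNodeSketch(int localKeyId) {
        activeKeyId = localKeyId;
    }

    /** Called with the identifier of the key the cluster currently uses for writing. */
    void onClusterKeyReceived(int clusterKeyId) {
        if (clusterKeyId != activeKeyId) {
            activeKeyId = clusterKeyId;   // set the new key for writing
            reencryptionStarted = true;   // existing data must be re-encrypted in the background
        }
    }

    int activeKeyId() { return activeKeyId; }
    boolean reencryptionStarted() { return reencryptionStarted; }
}
```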

Re-encryption

If a node stops or fails during re-encryption, it must continue re-encryption from the stored offset after restarting:

  1. If a checkpoint failed, the node restores physical records from the WAL, as usual.
  2. If a checkpoint was not invoked, re-encryption is started from the beginning using the saved logical WAL record.
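
A minimal sketch of the restart decision, under the assumption that the re-encryption offset is persisted at checkpoint time: a missing offset (modelled as null) stands for "no checkpoint was invoked", in which case the node starts again from the first page. The names are hypothetical, and WAL recovery itself is not modelled.

```java
/** Hypothetical helper that picks the page index to resume re-encryption from. */
class ReencryptionResumeSketch {
    /**
     * @param checkpointedOffset offset stored by the last checkpoint,
     *                           or null if no checkpoint was invoked
     * @return page index from which re-encryption continues
     */
    static long resumeOffset(Long checkpointedOffset) {
        // No stored offset: restart from the beginning, relying on the page
        // count saved in the logical WAL record for the total amount of work.
        return checkpointedOffset == null ? 0L : checkpointedOffset;
    }
}
```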

...