Motivation

Cache encryption key rotation is required when a key is compromised or when its crypto period (key validity period) ends. In addition, this feature is a prerequisite for encrypting and decrypting existing caches in the future.

Overview

The local partition re-encryption strategy is similar to partition snapshotting: create a snapshot of the partition encrypted with the new key, then swap the original partition file with the new one.

The cluster-wide process consists of the following steps:

Prepare changing the encryption key - send new key and start re-encryption task on each affinity node.
Finish changing the encryption key - swap partitions and replace cache encryption key in the metastore.
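
The two cluster-wide steps above can be pictured as a small state machine in which the "finish" step starts only after every affinity node has reported that "prepare" is done. The sketch below is an illustration under assumptions, not Ignite's distributed process implementation: the class KeyChangeTracker, its Phase enum and the onNodeFinished method are hypothetical names introduced for this example.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical tracker of the two cluster-wide phases; not an Ignite class. */
final class KeyChangeTracker {
    enum Phase { PREPARE, FINISH, DONE }

    /** All affinity nodes that take part in the key change. */
    private final Set<UUID> nodes;

    /** Nodes that have not yet reported completion of the current phase. */
    private final Set<UUID> pending;

    private Phase phase = Phase.PREPARE;

    KeyChangeTracker(Set<UUID> affinityNodes) {
        nodes = new HashSet<>(affinityNodes);
        pending = new HashSet<>(affinityNodes);
    }

    /**
     * Records that a node completed the current phase.
     * The next phase starts only after every node has reported.
     */
    Phase onNodeFinished(UUID nodeId) {
        if (phase == Phase.DONE || !pending.remove(nodeId))
            return phase; // Duplicate or unexpected report: nothing changes.

        if (pending.isEmpty()) {
            if (phase == Phase.PREPARE) {
                phase = Phase.FINISH;
                pending.addAll(nodes); // Wait for every node again.
            }
            else
                phase = Phase.DONE;
        }

        return phase;
    }

    Phase phase() {
        return phase;
    }
}
```

With two nodes, the tracker stays in PREPARE until both have reported, moves to FINISH, and reaches DONE after both report a second time.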

Prepare changing the encryption key

The initiator node generates new encryption key(s) for the cache group(s).
The distributed process starts a new cache encryption key change operation by sending an initial discovery message with the list of cache groups to re-encrypt and the encrypted keys.
The action configured for the distributed process starts a new local re-encryption task on each node.

Local re-encryption task

Start copying each partition file to the target directory, re-encrypting it along the way. The copies will contain dirty data because the checkpoint thread writes to the originals concurrently.
Collect all dirty pages that belong to the ongoing checkpoint and to the corresponding partition files, and apply them (with re-encryption) to the copied file right after the copy completes.
When local re-encryption of all required cache groups completes, send a message that this phase is finished on this node (in other words, the "prepare" distributed process is finished on the local node).
Continue to collect dirty pages and apply them, encrypted with the new key, to the copied partition until the "finish" phase starts.
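
The copy-then-patch idea behind the local task can be sketched as follows. This is a simplified in-memory illustration, not the actual implementation: the class PartitionReencryption and its methods are hypothetical, pages are modelled as byte arrays rather than file regions, the ciphers are passed in as plain functions, and the partition is assumed not to grow during the copy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

/** Hypothetical local re-encryption task for one partition; not an Ignite class. */
final class PartitionReencryption {
    /** Decrypts a page with the old key (supplied by the caller). */
    private final UnaryOperator<byte[]> decryptOld;

    /** Encrypts a page with the new key (supplied by the caller). */
    private final UnaryOperator<byte[]> encryptNew;

    /** Pages written by the checkpointer meanwhile: page index -> old-key ciphertext. */
    private final Map<Integer, byte[]> dirtyPages = new ConcurrentHashMap<>();

    PartitionReencryption(UnaryOperator<byte[]> decryptOld, UnaryOperator<byte[]> encryptNew) {
        this.decryptOld = decryptOld;
        this.encryptNew = encryptNew;
    }

    /** Assumed checkpoint hook: remembers every page written to the original file. */
    void onCheckpointWrite(int pageIdx, byte[] encryptedPage) {
        dirtyPages.put(pageIdx, encryptedPage.clone());
    }

    /** Copies all pages with re-encryption, then applies the dirty pages collected so far. */
    byte[][] run(byte[][] source) {
        byte[][] target = new byte[source.length][];

        for (int i = 0; i < source.length; i++)
            target[i] = reencrypt(source[i]);

        applyDirtyPages(target);

        return target;
    }

    /** Can be called repeatedly until the "finish" phase starts. */
    void applyDirtyPages(byte[][] target) {
        for (Integer idx : dirtyPages.keySet()) {
            byte[] page = dirtyPages.remove(idx);

            if (page != null)
                target[idx] = reencrypt(page);
        }
    }

    private byte[] reencrypt(byte[] page) {
        return encryptNew.apply(decryptOld.apply(page));
    }
}
```

A later checkpoint write to the same page simply overwrites the earlier entry in the map, so only the latest version of each page is applied to the copy.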

Finish changing the encryption key

After completion of the key change preparation process, a new distributed process is initiated to complete the key change.

The discovery event from the distributed process pushes a new exchange task to the exchange worker to start PME (partition map exchange). PME is required to prevent reordering of WAL records when the key changes and to simplify the initial design; this is expected to change in the future.

While updates are blocked, each node:

Forces a checkpoint (required for WAL consistency?).
Swaps all partition files:
1. Back up the original file.
2. Move the re-encrypted file into the place of the original one.
Changes the encryption key(s) in the metastore (updates the encryption key history: adds the new key and sets the current WAL pointer for the previous key).
Removes the partition backups created in sub-step 1.
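
The file swap and the backup cleanup can be sketched with java.nio.file. This is a minimal illustration under assumptions: the class PartitionSwap, the ".bak" suffix and the method names are hypothetical, and the sketch treats any backup that survives a restart as a sign that the swap did not complete, which matches the recovery rule for a crash during partition replacement.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for swapping partition files; not an Ignite class. */
final class PartitionSwap {
    /** Assumed naming convention for backup files. */
    static final String BACKUP_SUFFIX = ".bak";

    static Path backupOf(Path partition) {
        return partition.resolveSibling(partition.getFileName() + BACKUP_SUFFIX);
    }

    /** Sub-steps 1 and 2: back up the original, then move the re-encrypted copy into its place. */
    static void swap(Path original, Path reencrypted) throws IOException {
        Files.move(original, backupOf(original));
        Files.move(reencrypted, original);
    }

    /** Last step: called once the new key has been stored in the metastore. */
    static void removeBackup(Path original) throws IOException {
        Files.deleteIfExists(backupOf(original));
    }

    /** On node start: a leftover backup means the swap was interrupted, so restore it. */
    static void restoreIfNeeded(Path original) throws IOException {
        Path backup = backupOf(original);

        if (Files.exists(backup))
            Files.move(backup, original, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A production version would likely also need atomic renames and explicit syncing of the files to disk; both are omitted here for brevity.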

WAL

After changing the encryption key, new WAL records will be encrypted with the new key. However, it must be possible to read older WAL records (at least to support historical rebalance).

For each cache, instead of a single key, it is necessary to keep a history of keys in the form WALPointer -> key (the stored pointer is the maximum pointer for which the associated key is applicable).

When a WAL segment referenced by one or more WALPointers is removed, the corresponding key(s) should be removed as well.
When the WAL is cleared, the key history must be cleared too (except for the last key).
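
The key history can be sketched as a sorted map from the maximum WAL pointer a key covers to that key. This is a minimal illustration, not Ignite's actual data structure: the class KeyHistory and its methods are hypothetical, a plain long stands in for a WAL pointer, and keys are represented by integer ids.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical per-cache key history: max WAL pointer -> key id; not an Ignite class. */
final class KeyHistory {
    /** Previous keys, indexed by the last WAL position each key applies to. */
    private final TreeMap<Long, Integer> history = new TreeMap<>();

    /** Key used for all records written after the last pointer in the history. */
    private int currentKeyId;

    KeyHistory(int initialKeyId) {
        currentKeyId = initialKeyId;
    }

    /** Switches to a new key; the previous key stays valid up to walPtr inclusive. */
    void rotate(long walPtr, int newKeyId) {
        history.put(walPtr, currentKeyId);
        currentKeyId = newKeyId;
    }

    /** Returns the id of the key that encrypted the record at walPtr. */
    int keyFor(long walPtr) {
        Map.Entry<Long, Integer> e = history.ceilingEntry(walPtr);

        return e != null ? e.getValue() : currentKeyId;
    }

    /** Drops keys that only cover removed WAL, i.e. pointers below lowestKeptPtr. */
    void onWalTruncated(long lowestKeptPtr) {
        history.headMap(lowestKeptPtr, false).clear();
    }

    /** Number of keys still kept, including the current one. */
    int size() {
        return history.size() + 1;
    }
}
```

Looking up a key is a ceilingEntry call: the first history entry whose pointer is at or after the record's pointer holds the key, and records past the last entry use the current key. The current key is never removed by truncation, which corresponds to keeping the last key when the WAL is cleared.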

Recovery

Cancelling the re-encryption procedure means clearing all temporary data.

If a node crashes while the partitions are being replaced, the original partitions are restored from their backup copies when the node starts.
If a major topology change occurs during key rotation, the whole procedure is cancelled.
If a cache is stopped during re-encryption, the whole procedure is cancelled. Other minor topology changes should not affect the re-encryption procedure.

(TBD) When a baseline node with data joins the cluster and the cache group has a different key:
1. If historical rebalancing is not applicable, the encryption key is changed when the node joins and the partitions are cleared.
2. If historical rebalancing is applicable, existing data should be re-encrypted with the new key before(?) the node joins the cluster.

Process management

TBD

Public API changes

TBD

Monitoring

Re-encryption process state.

Input: cache id.
Output:
- List of 6-tuples:
  - Node ID.
  - Re-encryption process state.
  - Count of partitions to process.
  - Current partition index.
  - Current partition id.
  - Count of processed pages in the current partition.
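
One row of this output could be represented by a small immutable holder. The class below is only a sketch: ReencryptionState and its field names are hypothetical, and the state is kept as a plain string because the set of possible states is not defined in this document.

```java
import java.util.UUID;

/** Hypothetical holder for one row of the monitoring output; not an Ignite class. */
final class ReencryptionState {
    final UUID nodeId;

    /** State name; the possible values are not defined in this document. */
    final String state;

    final int partitionsToProcess;
    final int currentPartitionIndex;
    final int currentPartitionId;
    final long processedPagesInCurrentPartition;

    ReencryptionState(UUID nodeId, String state, int partitionsToProcess,
        int currentPartitionIndex, int currentPartitionId, long processedPagesInCurrentPartition) {
        this.nodeId = nodeId;
        this.state = state;
        this.partitionsToProcess = partitionsToProcess;
        this.currentPartitionIndex = currentPartitionIndex;
        this.currentPartitionId = currentPartitionId;
        this.processedPagesInCurrentPartition = processedPagesInCurrentPartition;
    }

    @Override public String toString() {
        return "ReencryptionState[node=" + nodeId + ", state=" + state +
            ", partition=" + currentPartitionIndex + "/" + partitionsToProcess +
            ", partitionId=" + currentPartitionId +
            ", processedPages=" + processedPagesInCurrentPartition + "]";
    }
}
```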

