...

MS SQL Server provides rotation of the database encryption key with background re-encryption of existing data [4]. Oracle and MySQL do not provide an out-of-the-box automatic procedure for rotating tablespace keys, although master key rotation is supported [5][6]. TDE is currently being developed for PostgreSQL, but support for tablespace key rotation is not planned [7].

Partition re-encryption strategies

At the moment, encryption occurs at the pagememory level, when a page is written to the pagestore or WAL.

Copy with re-encryption.

This strategy is similar to partition snapshotting: create a partition snapshot encrypted with the new key and then replace the original partition file with the new one.

In place re-encryption

Sequentially read all pages from the datastore, mark them as dirty, and log them to the WAL. The checkpointer then writes the pages encrypted with the new key.

This strategy requires changing the format of the encrypted page to store the identifier (number) of the encryption key (for recovery). Each encrypted page already has space reserved for the page CRC (4 bytes); this reserved space has the size of an encryption block (at least 8 bytes).
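The page format change can be illustrated with a minimal sketch. It assumes a reserved trailer of exactly one 8-byte block laid out as CRC (4 bytes), key identifier (1 byte), and 3 unused bytes. The class name, the offsets, and the one-byte identifier width are hypothetical choices for illustration, not the actual Ignite page layout.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical trailer layout: [encrypted data][crc:4][keyId:1][unused:3]. */
final class EncryptedPageTrailer {
    /** Assumed size of the reserved area: one 8-byte encryption block. */
    static final int RESERVED = 8;

    /** Assumed offset of the key identifier inside the reserved area. */
    static final int KEY_ID_OFF = 4;

    /** Appends the CRC of the encrypted data and the key identifier. */
    static byte[] seal(byte[] encryptedData, int keyId) {
        if (keyId < 0 || keyId > 0xFF)
            throw new IllegalArgumentException("Key id must fit in one byte: " + keyId);

        CRC32 crc = new CRC32();
        crc.update(encryptedData, 0, encryptedData.length);

        ByteBuffer buf = ByteBuffer.allocate(encryptedData.length + RESERVED);
        buf.put(encryptedData);
        buf.putInt((int) crc.getValue());
        buf.put((byte) keyId); // The remaining 3 bytes stay zero.

        return buf.array();
    }

    /** Reads the key identifier without decrypting the page. */
    static int keyId(byte[] page) {
        return page[page.length - RESERVED + KEY_ID_OFF] & 0xFF;
    }

    /** Verifies the stored CRC against the encrypted data. */
    static boolean crcMatches(byte[] page) {
        int dataLen = page.length - RESERVED;

        CRC32 crc = new CRC32();
        crc.update(page, 0, dataLen);

        return (int) crc.getValue() == ByteBuffer.wrap(page, dataLen, 4).getInt();
    }
}
```

Because the identifier sits outside the encrypted data in this sketch, recovery could pick the right key before attempting decryption.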

Comparison

...

Performance (rough estimate)

...

Implementation complexity (rough estimate)

...

Process description

In place re-encryption design.

The overall process consists of the following steps:

  • Rotate cache group key - add the new encryption key on each node and set it for writing.
  • Schedule background re-encryption for archived data and clean up the old key when it completes.

...

Detailed description

To support multiple keys for reading encrypted data, the key identifier must be stored on each encrypted page and in each encrypted WAL record. The key identifier is a sequential counter and should be the same on all nodes.
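A minimal sketch of how several keys could be kept per cache group and resolved by identifier is shown below. The `GroupKeyRing` class and its method names are hypothetical and do not correspond to an existing Ignite API. The only properties taken from the text are that identifiers are sequential and that one key at a time is used for writing.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-cache-group key ring; names are illustrative only. */
final class GroupKeyRing {
    /** Known keys by identifier. */
    private final Map<Integer, byte[]> keys = new HashMap<>();

    /** Identifier of the key used for writing; -1 until the first key is added. */
    private int activeId = -1;

    /** Adds a key under the next sequential identifier and sets it for writing. */
    synchronized int rotate(byte[] newKey) {
        int id = activeId + 1;

        keys.put(id, newKey.clone());
        activeId = id;

        return id;
    }

    /** Returns the identifier to store on newly written pages and WAL records. */
    synchronized int activeKeyId() {
        return activeId;
    }

    /** Resolves the key for an identifier read from a page or a WAL record. */
    synchronized byte[] keyFor(int id) {
        byte[] key = keys.get(id);

        if (key == null)
            throw new IllegalStateException("Unknown encryption key id: " + id);

        return key.clone();
    }

    /** Drops an old key once no page or WAL record refers to it. */
    synchronized boolean remove(int id) {
        if (id == activeId)
            throw new IllegalArgumentException("Cannot remove the active key: " + id);

        return keys.remove(id) != null;
    }
}
```

If every node starts from the same state and applies the same sequence of rotations, the counter produces the same identifiers everywhere, which matches the requirement above. How that ordering is guaranteed is outside the scope of this sketch.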

...

  • Background re-encryption may affect performance. The performance impact can be managed using the following properties:
    1. IGNITE_REENCRYPTION_THREAD_POOL_SIZE - number of threads used for re-encryption.
    2. IGNITE_REENCRYPTION_BATCH_SIZE - number of pages that are scanned during re-encryption under the checkpoint lock.
    3. IGNITE_REENCRYPTION_THROTTLE - delay in milliseconds between batches during a partition scan.
  • The WAL history may be too small to store all entries between checkpoints (this should be carefully tuned by properly setting the size of the WAL history and tuning the re-encryption performance).
  • The WAL history (for delta rebalancing) may be lost for all cache groups due to background re-encryption.
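The interaction between the batch size and the throttle can be sketched as a simple loop. This is an illustration under assumptions, not the actual implementation: the checkpoint lock is modelled as a plain `java.util.concurrent.locks.Lock`, marking a page dirty is modelled as a callback, and the class and method names are hypothetical.

```java
import java.util.concurrent.locks.Lock;
import java.util.function.IntConsumer;

/** Hypothetical batched partition scan with throttling between batches. */
final class ThrottledPageScan {
    /**
     * Marks pages [0, pageCnt) dirty in batches, holding the lock only per batch.
     *
     * @return Number of batches processed.
     */
    static int scan(int pageCnt, int batchSize, long throttleMs, Lock cpLock, IntConsumer markDirty) {
        if (batchSize <= 0)
            throw new IllegalArgumentException("Batch size must be positive: " + batchSize);

        int batches = 0;
        int from = 0;

        while (from < pageCnt) {
            int to = Math.min(pageCnt, from + batchSize);

            cpLock.lock(); // Assumed stand-in for the checkpoint lock.

            try {
                for (int idx = from; idx < to; idx++)
                    markDirty.accept(idx);
            }
            finally {
                cpLock.unlock();
            }

            batches++;
            from = to;

            if (throttleMs > 0 && from < pageCnt) {
                try {
                    Thread.sleep(throttleMs); // Lets regular load and the checkpointer proceed.
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();

                    return batches; // Stop scanning; the caller may resume later.
                }
            }
        }

        return batches;
    }
}
```

In such a sketch the batch size and throttle could be read with `Integer.getInteger("IGNITE_REENCRYPTION_BATCH_SIZE", ...)` and `Long.getLong("IGNITE_REENCRYPTION_THROTTLE", ...)`, with defaults chosen by the implementer. A smaller batch shortens each period under the lock, and a larger throttle reduces the rate at which dirty pages and WAL entries are produced.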

Copy with re-encryption design.

The cluster-wide process consists of the following steps:

  • Prepare changing the encryption key - send new key and start re-encryption task on each affinity node.
  • Finish changing the encryption key - swap partitions and replace cache encryption key in the metastore.

Prepare changing the encryption key

  1. The initiator node generates new encryption key(s) for the cache group(s) and starts a new distributed process for the cache encryption key change by sending an initial discovery message with the list of cache groups to re-encrypt and the encrypted keys.
  2. The action configured for the distributed process initiates a new local re-encryption task on each node.

Local re-encryption task

  1. Start copying each partition file (including the index) to the target directory, re-encrypting it along the way. These files will contain dirty data due to concurrent writes by the checkpoint thread.
  2. Collect all dirty pages related to the ongoing checkpoint process and the corresponding partition files, and apply them (with re-encryption) to the copied file right after the copy process ends.
  3. When local re-encryption of all required cache groups completes, send a message that this phase is finished on this node (in other words, the "prepare" distributed process is finished on the local node).
  4. Continue to collect dirty pages and apply them, encrypted with the new key, to the copied partition until the "finish" phase starts.

Finish changing the encryption key

After completion of the key change preparation process, a new distributed process is initiated to complete the key change.

The action configured for the distributed process initiates partition swapping on each node (this action may require suspending local or global operations if WAL records can be reordered during the key change).

  1. Acquire checkpoint lock.
  2. Swap all partition files:
    1. Backup original file.
    2. Move re-encrypted file to the place of the original one.
  3. Change the encryption key(s) in the metastore (update the encryption key history - add the new key and associate the current WAL pointer with the previous key).
  4. Cancel checkpoint updates for copied partitions.
  5. Release checkpoint lock.
  6. Force checkpoint.
  7. Remove partition backups (2a).
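The file-level part of the steps above (2a, 2b and 7), together with restoring a backup after a crash, can be sketched with `java.nio.file.Files`. The `PartitionSwap` class and the `.bak` suffix are hypothetical. The checkpoint lock, the metastore update and the forced checkpoint are deliberately left out of this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for swapping a partition file with its re-encrypted copy. */
final class PartitionSwap {
    /** Assumed naming scheme for the backup of the original file. */
    static Path backupPath(Path original) {
        return original.resolveSibling(original.getFileName() + ".bak");
    }

    /** Steps 2a and 2b: back up the original, then move the copy into its place. */
    static Path swap(Path original, Path reencrypted) {
        Path backup = backupPath(original);

        try {
            Files.move(original, backup, StandardCopyOption.REPLACE_EXISTING);
            Files.move(reencrypted, original);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        return backup;
    }

    /** Step 7: remove the backup once the forced checkpoint has completed. */
    static void removeBackup(Path original) {
        try {
            Files.deleteIfExists(backupPath(original));
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** On node start: a leftover backup means the swap did not finish, so restore it. */
    static boolean restoreIfBackupExists(Path original) {
        Path backup = backupPath(original);

        if (!Files.exists(backup))
            return false;

        try {
            Files.move(backup, original, StandardCopyOption.REPLACE_EXISTING);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        return true;
    }
}
```

In this sketch the presence of a backup file is the only marker of an unfinished swap, so the backup must not be removed before the new key is durably recorded. A real implementation would likely need an explicit marker in the metastore as well.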

WAL

After changing the encryption key, new WAL records will be encrypted with the new key. However, it must be possible to read older WAL records (at least to support historical rebalance).

For each cache, instead of a single key, it is necessary to keep a history of keys in the form WALPointer -> key (storing the maximum pointer for which the associated key is applicable).

When a WAL segment that WALPointer(s) refer to is removed, the corresponding key(s) should also be removed.
When the WAL is cleared, the key history must be cleared as well (except for the last key).
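The WALPointer -> key history can be sketched with a sorted map. This is an illustration under assumptions: a WAL pointer is modelled as a single `long` and keys are represented by integer identifiers, and the `WalKeyHistory` class is hypothetical.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical key history: maximum WAL pointer -> identifier of the key valid up to it. */
final class WalKeyHistory {
    /** Previous keys, keyed by the last WAL pointer they apply to (inclusive). */
    private final NavigableMap<Long, Integer> history = new TreeMap<>();

    /** Key used for all records after the last pointer in the history. */
    private int currentKeyId;

    WalKeyHistory(int initialKeyId) {
        currentKeyId = initialKeyId;
    }

    /** Key change: the previous key stays valid up to and including the given pointer. */
    void rotate(long lastPtrOfOldKey, int newKeyId) {
        history.put(lastPtrOfOldKey, currentKeyId);
        currentKeyId = newKeyId;
    }

    /** Finds the key for a WAL record: the entry with the smallest pointer >= walPtr. */
    int keyIdFor(long walPtr) {
        Map.Entry<Long, Integer> e = history.ceilingEntry(walPtr);

        return e == null ? currentKeyId : e.getValue();
    }

    /** WAL removed up to and including the pointer: drop keys that no record needs. */
    void onWalRemoved(long removedUpTo) {
        history.headMap(removedUpTo, true).clear();
    }

    /** Number of keys still tracked, including the current one. */
    int trackedKeys() {
        return history.size() + 1;
    }
}
```

The current key is stored outside the map, so clearing the whole WAL leaves exactly one key, as required above.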

Recovery

The re-encryption procedure does not start if there are LOST partitions in the cache group or any baseline node is missing (this is a limitation of the initial design and should be improved in the future).
The cache stop operation is rejected for cache groups in which re-encryption is in progress.

Canceling the re-encryption procedure means clearing all temporary data.

If a node crashes during the replacement of the partitions, the original backup copies of the partitions are restored when the node starts.
If a major topology change occurs during key rotation, the whole procedure is canceled.
Minor topology changes should not affect the re-encryption procedure.
If a partition is scheduled for eviction during re-encryption, the re-encryption of this partition is canceled.

Risks and assumptions

...

Process management

// TBD

Public API changes

...