Motivation

Master key rotation is required if the key has been compromised or at the end of the crypto period (key validity period).

The design assumes that an administrator provides the ability to get a new master key through EncryptionSPI from the underlying storage.

Goal: 

Implement the ability to rotate the master encryption key.

New processes: 

  1. Master key rotation.

  2. Master key rotation recovery start.

New administrator commands: 

  1. Master keys view: node -> master key hash

  2. Cache group keys view: node -> group name -> encryption key hash

Master Key rotation: 

Definitions

  • MK – master key. Encrypts cache group keys. The master key is stored in some key storage. Master keys are identified by name.

Prerequisites

The new master key must be available to EncryptionSPI on each server node. The cluster must be active.

Process management

Users can control the master key rotation process via a user interface (CLI, JMX, Java API).

  • JAVA API:
    ignite.encryption().changeMasterKey(String masterKeyName)  - starts master key rotation process.
    String ignite.encryption().getMasterKeyName()  - gets current master key name.

  • JMX:
    changeMasterKey(String masterKeyName)  - starts master key rotation process.
    String getMasterKeyName()   - gets current master key name.

  • CLI:
    # Starts master key rotation.
    control.sh --encryption change_master_key newMasterKeyName

    # Displays cluster's current master key name.
    control.sh --encryption get_master_key_name

Process description

The master key change process consists of two phases:

  1. Prepare master key change.
  2. Perform master key change.

Each phase is a distributed process.

Prepare master key change

The goal is to verify that all server nodes have the same master key. A server node begins the prepare phase with a MasterKeyChangeRequest that contains:

  1. New master key name.
  2. New master key digest.

Each server node executes the following actions:

  1. Obtains a digest of the new master key. If the digest is unavailable, the process completes with an error.
  2. Compares it with the one in the request. If it differs, the process completes with a master key digest consistency error.
  3. Stores the master key name and digest locally.
The coordinator starts the perform phase when the prepare phase is completed without errors.
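The per-node prepare check described above can be sketched as follows. This is a minimal illustration, not Ignite code: the class `PreparePhaseSketch`, its in-memory `keyStorage` map (a stand-in for whatever storage sits behind EncryptionSPI), and the choice of SHA-256 as the digest are all assumptions made for the example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the per-node prepare-phase check; not Ignite code. */
final class PreparePhaseSketch {
    /** Stand-in for the key storage behind EncryptionSPI: key name -> raw key bytes. */
    private final Map<String, byte[]> keyStorage = new HashMap<>();

    /** Name and digest remembered for the perform phase. */
    String preparedName;
    byte[] preparedDigest;

    void putKey(String name, byte[] key) {
        keyStorage.put(name, key.clone());
    }

    /** Digest algorithm is an assumption; the design does not name one. */
    static byte[] digest(byte[] key) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(key);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns null on success, or an error message to report to the coordinator. */
    String onPrepare(String reqName, byte[] reqDigest) {
        byte[] key = keyStorage.get(reqName);

        // Step 1: the digest must be obtainable on this node.
        if (key == null)
            return "Master key digest is unavailable: " + reqName;

        // Step 2: the local digest must match the one in the request.
        if (!MessageDigest.isEqual(digest(key), reqDigest))
            return "Master key digest consistency check failed: " + reqName;

        // Step 3: remember name and digest for the perform phase.
        preparedName = reqName;
        preparedDigest = reqDigest.clone();
        return null;
    }
}
```

In this sketch a non-null return value plays the role of the error that makes the whole process complete unsuccessfully; how errors are actually carried back to the coordinator is not specified here.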

Perform master key change

The coordinator node starts the perform phase with the MasterKeyChangeRequest that contains:

  1. New master key name.
  2. New master key digest.

Each server node executes the following actions:

  1. Checks that the cluster is active (WAL must be writable to correctly log changes and survive cluster restarts). Otherwise, the process completes with an error.
  2. Checks that the master key name and digest are the same as those stored in the prepare phase. Otherwise, logs the mismatch and cancels the process.
  3. Blocks creation of encrypted group keys.
  4. Re-encrypts all cache group keys with the new master key in a temporary data structure. No changes in MetaStore.
  5. Creates a WAL logical record (ChangeMasterKeyRecord ) that consists of:
    1. New master key name.
    2. Re-encrypted cache group keys.
  6. Writes cache group keys to MetaStore.
  7. Unblocks creation of encrypted group keys.
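The ordering of the perform-phase steps (block key creation, re-encrypt into a temporary structure, log, write, unblock) can be sketched as below. Everything here is hypothetical: `PerformPhaseSketch`, the in-memory `metaStore` map and `wal` list, and the `reencrypt` function passed in by the caller are stand-ins chosen for illustration, and rejecting key creation with an exception is only one possible way to "block" it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of the perform-phase ordering on one node; not Ignite code. */
final class PerformPhaseSketch {
    /** Stand-in for the WAL logical record: new key name plus re-encrypted group keys. */
    static final class KeyChangeRecord {
        final String masterKeyName;
        final Map<Integer, byte[]> groupKeys;

        KeyChangeRecord(String masterKeyName, Map<Integer, byte[]> groupKeys) {
            this.masterKeyName = masterKeyName;
            this.groupKeys = groupKeys;
        }
    }

    /** Stand-in for MetaStore: cache group id -> encrypted group key. */
    final Map<Integer, byte[]> metaStore = new HashMap<>();

    /** Stand-in for the WAL. */
    final List<KeyChangeRecord> wal = new ArrayList<>();

    String masterKeyName;

    private final AtomicBoolean keyChangeInProgress = new AtomicBoolean();

    /** Creating a group key is rejected while a master key change is running. */
    void addGroupKey(int grpId, byte[] encryptedKey) {
        if (keyChangeInProgress.get())
            throw new IllegalStateException("Master key change is in progress");
        metaStore.put(grpId, encryptedKey);
    }

    /** @param reencrypt decrypts with the old master key and encrypts with the new one. */
    void perform(String newName, UnaryOperator<byte[]> reencrypt) {
        if (!keyChangeInProgress.compareAndSet(false, true))   // block key creation
            throw new IllegalStateException("Master key change is already in progress");
        try {
            // Re-encrypt into a temporary structure; MetaStore is untouched so far.
            Map<Integer, byte[]> tmp = new HashMap<>();
            for (Map.Entry<Integer, byte[]> e : metaStore.entrySet())
                tmp.put(e.getKey(), reencrypt.apply(e.getValue()));

            wal.add(new KeyChangeRecord(newName, tmp));        // log before applying
            metaStore.putAll(tmp);                             // apply to MetaStore
            masterKeyName = newName;
        } finally {
            keyChangeInProgress.set(false);                    // unblock key creation
        }
    }
}
```

Writing the record before touching the store is what lets a restarted node finish the change from the log. The flag check in `addGroupKey` is not atomic with the write, so a real implementation would need stronger synchronization than this single-threaded sketch.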

Distributed process

A distributed process is a cluster-wide process that accumulates single-node results in order to complete.

The process consists of the following phases:

  1. The initial request starts the process. The InitMessage is sent via discovery.
  2. Each server node processes the initial request and sends its single-node result to the coordinator. The SingleNodeMessage is sent via communication.
  3. The coordinator accumulates all single-node results and completes the process. The FullMessage is sent via discovery.

Several processes of the same type can be started at the same time.

Guarantees:

  • Survives topology and coordinator changes (the SingleNodeMessage with a result will be redirected to the new coordinator).
  • The exec and the finish actions will be called only once.
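The coordinator side of such a process can be sketched as a small accumulator. The class `DistributedProcessSketch` and its method names are hypothetical, and treating a departed node as "no longer awaited" is an assumption about how topology changes might be handled, not something the design states.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.function.Consumer;

/** Hypothetical coordinator-side accumulator for single-node results; not Ignite code. */
final class DistributedProcessSketch<R> {
    /** Server nodes whose results are still awaited. */
    private final Set<UUID> remaining;

    /** Results collected so far, by node id. */
    private final Map<UUID, R> results = new HashMap<>();

    /** Finish action; stands in for sending the FullMessage. */
    private final Consumer<Map<UUID, R>> finish;

    private boolean finished;

    DistributedProcessSketch(Set<UUID> serverNodes, Consumer<Map<UUID, R>> finish) {
        this.remaining = new HashSet<>(serverNodes);
        this.finish = finish;
    }

    /** Called when a SingleNodeMessage arrives; duplicates and unknown nodes are ignored. */
    synchronized void onSingleNodeResult(UUID nodeId, R res) {
        if (finished || !remaining.remove(nodeId))
            return;
        results.put(nodeId, res);
        completeIfReady();
    }

    /** Assumption: a node that left the topology is no longer awaited. */
    synchronized void onNodeLeft(UUID nodeId) {
        if (!finished && remaining.remove(nodeId))
            completeIfReady();
    }

    /** Runs the finish action once, after the last awaited result. */
    private void completeIfReady() {
        if (remaining.isEmpty()) {
            finished = true;
            finish.accept(results);
        }
    }
}
```

The `finished` flag together with removal from `remaining` is what keeps the finish action from running more than once, even if a result is redelivered after a coordinator change.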

Process completion

The process completes when the perform phase has completed (all nodes have re-encrypted their keys).

Corner cases

Node was down during key rotation. MasterKeyChangeRecord not found.

If a node was unavailable during the master key rotation process, it will be unable to join the cluster because it has the old master key hash.

To update this node, the user should run Ignite with the system property IGNITE_MASTER_KEY_NAME_TO_CHANGE_BEFORE_STARTUP=newMasterKeyName.

Before joining the cluster, the node executes the following steps:

  1. Obtains the old master key.
  2. Obtains the new master key.
  3. Re-encrypts cache group keys with the new MK and stores them in the metastore.
  4. EncryptionSPI executes key rotation (implementation specific).

The node then tries to join the cluster.

Node was down during key rotation. MasterKeyChangeRecord found.

A node should not try to join the cluster before processing the ChangeMasterKeyRecord. Regardless of whether the key rotation finished successfully or not, the recovery will be performed from the record.

  1. If ChangeMasterKeyRecord  is found during node recovery, it is passed to EncryptionManager .
  2. When MetaStore becomes writable, EncryptionManager  writes the new cache group keys to it.

Node join during key rotation process

The node join is rejected, because it may lead to inconsistent master keys in the cluster.

Starting cache during key rotation process

Cache keys must not be created during the master key rotation process, so a node will throw an exception if a user starts a cache during the key rotation process. Moreover, if group keys were generated before the master key was changed, starting the cache will be rejected (the case where a client node starts the cache).

Node couldn’t complete the perform phase

The node treats this as a critical failure. The failure handler must stop the node to prevent inconsistent keys in the cluster.

Public Java API changes

EncryptionSpi

The concept of the masterKeyId will be added to the cache key encryption process in EncryptionSpi.

New methods will be introduced:

  • setMasterKeyName(String masterKeyName)  // Sets "current" master key name
  • String getMasterKeyName()  // Gets "current" master key name

The following methods will work with the master key that was set by setMasterKeyName:

  • byte[] masterKeyDigest() 
  • byte[] encryptKey(Serializable key) 
  • Serializable decryptKey(byte[] key) 

This is necessary so that Ignite can decrypt cache keys with the old master key and encrypt with the new one.
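To illustrate why a switchable "current" master key name is enough for re-encryption, here is a minimal in-memory sketch. It is not the Ignite EncryptionSpi: the class `NamedKeySpiSketch`, the `register` and `reencrypt` helpers, and the use of the JDK `AESWrap` cipher with 128-bit keys are assumptions made for the example; only the three method names `setMasterKeyName`, `encryptKey` and `decryptKey` mirror the list above.

```java
import java.io.Serializable;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

/** Hypothetical in-memory SPI with a switchable current master key; not Ignite code. */
final class NamedKeySpiSketch {
    /** Stand-in for the key storage: master key name -> master key. */
    private final Map<String, SecretKey> keys = new HashMap<>();

    private String masterKeyName;

    static SecretKey newAesKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    void register(String name, SecretKey key) {
        keys.put(name, key);
    }

    void setMasterKeyName(String name) {
        if (!keys.containsKey(name))
            throw new IllegalArgumentException("Unknown master key: " + name);
        masterKeyName = name;
    }

    String getMasterKeyName() {
        return masterKeyName;
    }

    /** Wraps a group key with the current master key. */
    byte[] encryptKey(Serializable key) {
        try {
            return cipher(Cipher.WRAP_MODE).wrap((Key) key);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Unwraps a group key with the current master key. */
    Serializable decryptKey(byte[] data) {
        try {
            return cipher(Cipher.UNWRAP_MODE).unwrap(data, "AES", Cipher.SECRET_KEY);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decrypts with the current (old) key, switches the name, encrypts with the new key. */
    byte[] reencrypt(byte[] encryptedGroupKey, String newMasterKeyName) {
        Serializable plain = decryptKey(encryptedGroupKey);
        setMasterKeyName(newMasterKeyName);
        return encryptKey(plain);
    }

    private Cipher cipher(int mode) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(mode, keys.get(masterKeyName));
        return c;
    }
}
```

With two registered keys, calling `setMasterKeyName("mk1")`, then `encryptKey(groupKey)`, then `reencrypt(encrypted, "mk2")` leaves the SPI pointing at `mk2` and yields bytes that only `mk2` can unwrap.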

Code changes

Meta Storage

Meta storage will store the name of the master key. The key name from meta storage takes priority over the key name from EncryptionSpi .

Node attribute

Currently, the joining node sends the MK hash for validation in node attributes. Attributes can't be modified at runtime, so the joining node will send the MK hash in JoiningNodeDiscoveryData .

Tickets

IGNITE-12186 (ASF JIRA)

New commands: 

  • Master key hashes. 

    • Input: nothing

    • Output: 

      • List of Tuples3 

        • Node ID 

        • Current key hash 

        • Previous key hash or null. 

  • Cache key hashes. 

    • Input: cache id.

    • Output:

      • List of Tuples3

        • Node ID

        • Current key hash

        • ...