...


Motivation

As the amount of stored information grows, stored data becomes ever more important to individuals, companies and governments. In addition to traditional storage criteria such as performance, capacity and reliability, security is becoming an important feature of storage systems, and some industries, such as banking and transportation, have mandatory legal requirements for it.

...

  • EncryptionManager : Module for encrypting and decrypting table data files[2]
  • KeyManagementClient : A minimum client interface to connect to a key management service (KMS) [3]
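To make the role of a KMS client more concrete, here is a minimal Java sketch of envelope encryption: a data key encrypts the payload, and the KMS wraps (encrypts) the data key with a master key. The names `KmsClient`, `InMemoryKmsClient` and `roundTrip` are hypothetical and exist only for this illustration; they are not the Iceberg interfaces listed above, nor a proposed Paimon API. AES-GCM is likewise an assumption chosen for the example.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class EnvelopeEncryptionSketch {

    /** Hypothetical minimal KMS client (an assumption, not a real library API). */
    interface KmsClient {
        byte[] wrapKey(byte[] plainKey) throws Exception;
        byte[] unwrapKey(byte[] wrappedKey) throws Exception;
    }

    /** Toy KMS that keeps the master key in memory. For illustration only. */
    static final class InMemoryKmsClient implements KmsClient {
        private final SecretKey masterKey;

        InMemoryKmsClient(SecretKey masterKey) {
            this.masterKey = masterKey;
        }

        @Override
        public byte[] wrapKey(byte[] plainKey) throws Exception {
            return encrypt(masterKey, plainKey);
        }

        @Override
        public byte[] unwrapKey(byte[] wrappedKey) throws Exception {
            return decrypt(masterKey, wrappedKey);
        }
    }

    static final int IV_LEN = 12;    // GCM nonce length in bytes
    static final int TAG_BITS = 128; // GCM authentication tag length in bits

    /** Encrypts with AES-GCM and returns the random IV followed by the ciphertext. */
    static byte[] encrypt(SecretKey key, byte[] plain) throws Exception {
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plain);
        byte[] out = new byte[IV_LEN + ct.length];
        System.arraycopy(iv, 0, out, 0, IV_LEN);
        System.arraycopy(ct, 0, out, IV_LEN, ct.length);
        return out;
    }

    /** Reverses encrypt(): reads the IV from the first IV_LEN bytes, then decrypts. */
    static byte[] decrypt(SecretKey key, byte[] ivAndCt) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, ivAndCt, 0, IV_LEN));
        return cipher.doFinal(ivAndCt, IV_LEN, ivAndCt.length - IV_LEN);
    }

    static SecretKey newAesKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    /** Writer encrypts data and wraps the data key; reader unwraps and decrypts. */
    static boolean roundTrip(String text) throws Exception {
        KmsClient kms = new InMemoryKmsClient(newAesKey());

        // Writer side: only the wrapped key would be stored next to the data.
        SecretKey dataKey = newAesKey();
        byte[] encryptedData = encrypt(dataKey, text.getBytes(StandardCharsets.UTF_8));
        byte[] wrappedKey = kms.wrapKey(dataKey.getEncoded());

        // Reader side: ask the KMS to unwrap the data key, then decrypt the data.
        SecretKey recovered = new SecretKeySpec(kms.unwrapKey(wrappedKey), "AES");
        byte[] plain = decrypt(recovered, encryptedData);
        return text.equals(new String(plain, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello paimon"));
    }
}
```

Because the data files only ever carry the wrapped key, swapping in a different KMS would mean providing another `KmsClient` implementation while the file-level encryption code stays unchanged.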

Hudi and Iceberg also have some shortcomings. Hudi's encryption relies on Spark and Parquet and cannot encrypt data in ORC format; Iceberg only provides basic interfaces without corresponding implementation classes.

As a data lake framework, Paimon needs to support data encryption in order to meet enterprise security standards. This document describes how to extend the current Paimon architecture to provide users with out-of-the-box encryption capabilities.


Goals

Engine independent

Encryption does not depend on the compute engine. When encryption is enabled, users can read and write Paimon tables with any engine (Flink, Spark, Java API).

Pluggable KMS

The key management service (KMS) is pluggable in the system,

...

[1] https://hudi.apache.org/docs/encryption/
[2] https://github.com/apache/iceberg/blob/c07f2aabc0a1d02f068ecf1514d2479c0fbdd3b0/api/src/main/java/org/apache/iceberg/encryption/EncryptionManager.java#L32
[3] https://github.com/apache/iceberg/blob/1e57760394583889f2cb7fb87d021471e8c46f0c/core/src/main/java/org/apache/iceberg/encryption/KeyManagementClient.java#L27
[4] https://en.wikipedia.org/wiki/Symmetric-key_algorithm

...