ID | IEP-97 |
Author | Anton Vinogradov |
Sponsor | |
Created | |
Status | ACTIVE |
Customers may want to transform (for example, compress or encrypt) user data at the network and memory layers.
Ignite supports Disk Compression and Transparent Data Encryption, but both transform the data at the persistence layer only.
To cover both layers (network and memory) and make the feature compatible with the existing data, it is proposed to transform/restore CacheObject's bytes on the fly.
A possible solution is to transform the byte arrays produced during the marshalling/unmarshalling phase. This covers both layers: messaging (network) and storage (in-memory and persistence).
GridBinaryMarshaller already transforms objects to bytes, so all we need is to transform and wrap these bytes.
The idea is to transform the given array and add a special prefix, GridBinaryMarshaller#TRANSFORMED == -3, at the beginning to make it distinguishable from untransformed data.
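As a rough illustration of the prefixing idea, the sketch below wraps an already transformed payload with a marker byte and checks for it when reading. The class and method names are hypothetical, the single-byte header is a simplification (the SPI described later reserves 2 bytes), and the sketch assumes that untransformed marshalled data never starts with -3.

```java
import java.util.Arrays;

/** Illustrative only: hypothetical helper, not part of Ignite. */
final class PrefixSketch {
    /** Marker value taken from the text (GridBinaryMarshaller#TRANSFORMED == -3). */
    static final byte TRANSFORMED = -3;

    /** Puts the marker byte in front of an already transformed payload. */
    static byte[] wrap(byte[] payload) {
        byte[] res = new byte[payload.length + 1];

        res[0] = TRANSFORMED;
        System.arraycopy(payload, 0, res, 1, payload.length);

        return res;
    }

    /** Tells transformed data apart from untransformed data by the first byte. */
    static boolean isTransformed(byte[] bytes) {
        return bytes.length > 0 && bytes[0] == TRANSFORMED;
    }

    /** Returns the payload that follows the marker byte. */
    static byte[] payload(byte[] wrapped) {
        return Arrays.copyOfRange(wrapped, 1, wrapped.length);
    }
}
```

With this layout, data written before the feature was enabled carries no marker and can be read as is.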
We need to cover all CacheObjects.
Most of them have the following structure:
    protected Object val; // Unmarshalled value.
    protected byte[] valBytes; // Marshalled value bytes.
All we need is to add the transformation during the marshalling/unmarshalling phase:
    protected byte[] valueBytesFromValue(CacheObjectValueContext ctx) throws IgniteCheckedException {
        byte[] bytes = ctx.kernalContext().cacheObjects().marshal(ctx, val);

        return CacheObjectTransformer.transformIfNecessary(bytes, ctx);
    }

    protected Object valueFromValueBytes(CacheObjectValueContext ctx, ClassLoader ldr) throws IgniteCheckedException {
        byte[] bytes = CacheObjectTransformer.restoreIfNecessary(valBytes, ctx);

        return ctx.kernalContext().cacheObjects().unmarshal(ctx, bytes, ldr);
    }

    public void prepareMarshal(CacheObjectValueContext ctx) throws IgniteCheckedException {
        if (valBytes == null)
            valBytes = valueBytesFromValue(ctx);
    }

    public void finishUnmarshal(CacheObjectValueContext ctx, ClassLoader ldr) throws IgniteCheckedException {
        if (val == null)
            val = valueFromValueBytes(ctx, ldr);
    }
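The document does not show the bodies of transformIfNecessary and restoreIfNecessary. The following is a minimal sketch of how such a pair could behave, assuming that the caller keeps the original bytes whenever the transformer declines by throwing, and that restoring is driven purely by the marker byte. The Transformer interface, the XOR toy transformer and the one-byte header are hypothetical stand-ins, not Ignite API.

```java
import java.util.Arrays;

/** Illustrative sketch only; all names here are hypothetical. */
final class TransformFallbackSketch {
    /** Marker value from the text (GridBinaryMarshaller#TRANSFORMED == -3). */
    static final byte TRANSFORMED = -3;

    /** Hypothetical transformer contract: transform may decline by throwing. */
    interface Transformer {
        byte[] transform(byte[] bytes) throws Exception;

        byte[] restore(byte[] bytes);
    }

    /** Toy transformer: XORs every byte and declines arrays shorter than 4 bytes. */
    static final Transformer XOR = new Transformer() {
        @Override public byte[] transform(byte[] bytes) throws Exception {
            if (bytes.length < 4)
                throw new Exception("Transformation is not suitable.");

            return xor(bytes);
        }

        @Override public byte[] restore(byte[] bytes) {
            return xor(bytes);
        }
    };

    private static byte[] xor(byte[] src) {
        byte[] res = new byte[src.length];

        for (int i = 0; i < src.length; i++)
            res[i] = (byte)(src[i] ^ 0x5A);

        return res;
    }

    /** Returns marker + transformed payload, or the original array if the transformer declines. */
    static byte[] transformIfNecessary(byte[] bytes, Transformer t) {
        if (t == null)
            return bytes; // Transformation is disabled.

        try {
            byte[] payload = t.transform(bytes);
            byte[] res = new byte[payload.length + 1];

            res[0] = TRANSFORMED;
            System.arraycopy(payload, 0, res, 1, payload.length);

            return res;
        }
        catch (Exception e) {
            return bytes; // Declined: store the data untransformed.
        }
    }

    /** Restores only data that starts with the marker; anything else is returned as is. */
    static byte[] restoreIfNecessary(byte[] bytes, Transformer t) {
        if (bytes.length == 0 || bytes[0] != TRANSFORMED)
            return bytes;

        return t.restore(Arrays.copyOfRange(bytes, 1, bytes.length));
    }
}
```

Because the read path looks only at the marker, transformed and untransformed entries can coexist in the same cache under this assumption.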
BinaryObject(Impl)s have a different structure:
    private Object obj; // Deserialized value. Value converted to the Java class instance.
    private byte[] arr; // Serialized bytes. Value!
(De)serialization is similar to (un)marshalling: both convert between bytes and a Java class instance, but they happen at different times and in different code layers.
(Un)marshalling happens when an object is put to or read from the cache, while (de)serialization happens when a binary object is built or deserialized, detached from any cache.
Fortunately, BinaryObjectImpl requires no marshalling: serialization already generates bytes that can be used as marshalled bytes.
However, if we are going to transform the data during the marshalling/unmarshalling phase, we need to add an additional data layer to BinaryObjectImpl:
    private Object obj; // Deserialized value. Value converted to the Java class instance.
    private byte[] arr; // Serialized bytes. Value!
    private byte[] valBytes; // Marshalled value bytes.
Here, valBytes == arr when the transformation is disabled.
It is not possible to simply replace arr with valBytes because, unlike in CacheObjectImpl, for example, arr is not just marshalled bytes. It is the object's value, required, for example, to provide hashCode/schemaId/typeId/objectField, so we must keep it as is.
So, BinaryObjectImpl requires valBytes to/from arr conversion:
    private byte[] arrayFromValueBytes(CacheObjectValueContext ctx) {
        return CacheObjectTransformer.restoreIfNecessary(valBytes, ctx);
    }

    private byte[] valueBytesFromArray(CacheObjectValueContext ctx) {
        return CacheObjectTransformer.transformIfNecessary(arr, start, arr.length, ctx);
    }

    public void finishUnmarshal(CacheObjectValueContext ctx, ClassLoader ldr) throws IgniteCheckedException {
        if (arr == null)
            arr = arrayFromValueBytes(ctx);
    }

    public void prepareMarshal(CacheObjectValueContext ctx) {
        if (valBytes == null)
            valBytes = valueBytesFromArray(ctx);
    }
Some customers may want to encrypt the data, some to compress it, while others just want to keep it as is.
So, we must provide a simple way to append any transformation.
The simplest way is to use Service Provider Interface (IgniteSpi):
    public interface CacheObjectTransformerSpi extends IgniteSpi {
        /** Additional space required to store the transformed data. */
        public int OVERHEAD = 2;

        /**
         * Transforms the data.
         *
         * @param bytes Byte array contains the data.
         * @param offset Data offset.
         * @param length Data length.
         * @return Byte array contains the transformed data started with non-filled area with {@link #OVERHEAD} size.
         * @throws IgniteCheckedException when transformation is not possible/suitable.
         */
        public byte[] transform(byte[] bytes, int offset, int length) throws IgniteCheckedException;

        /**
         * Restores the data.
         *
         * @param bytes Byte array contains the transformed data.
         * @param offset Data offset.
         * @param length Data length.
         * @return Byte array contains the restored data.
         */
        public byte[] restore(byte[] bytes, int offset, int length);
    }
This API is aware of the overhead required to store the transformed data and can work with byte arrays at custom offsets, which is necessary to guarantee performance.
Customers may implement this interface as they see fit and specify it in the configuration:
    IgniteConfiguration getConfiguration() {
        IgniteConfiguration cfg = ...

        cfg.setCacheObjectTransformerSpi(new XXXTransformerSpi());

        return cfg;
    }
However, most customers just want to transform the data, so they may extend the adapter, which offers a simplified API:
    public abstract class CacheObjectTransformerSpiAdapter extends IgniteSpiAdapter implements CacheObjectTransformerSpi {
        ...

        /**
         * Transforms the data.
         *
         * @param original Original data.
         * @return Transformed data.
         * @throws IgniteCheckedException when transformation is not possible/suitable.
         */
        protected abstract ByteBuffer transform(ByteBuffer original) throws IgniteCheckedException;

        /**
         * Restores the data.
         *
         * @param transformed Transformed data.
         * @return Restored data.
         */
        protected abstract ByteBuffer restore(ByteBuffer transformed);
    }
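The adapter's elided part has to bridge the byte[]-based SPI methods and the ByteBuffer-based ones. One way this bridging could look is sketched below as standalone Java without Ignite dependencies. The class names AdapterSketch and ShiftSketch are hypothetical, the checked exception is omitted for brevity, and it is an assumption that restore receives an offset that already points past the reserved OVERHEAD area.

```java
import java.nio.ByteBuffer;

/** Illustrative only: a hypothetical bridge between byte[] and ByteBuffer APIs. */
abstract class AdapterSketch {
    /** Space reserved at the start of the transformed array (value taken from the SPI above). */
    static final int OVERHEAD = 2;

    protected abstract ByteBuffer transform(ByteBuffer original);

    protected abstract ByteBuffer restore(ByteBuffer transformed);

    /** Wraps the requested slice without copying and leaves OVERHEAD bytes unfilled in the result. */
    public byte[] transform(byte[] bytes, int offset, int length) {
        ByteBuffer res = transform(ByteBuffer.wrap(bytes, offset, length));

        byte[] out = new byte[OVERHEAD + res.remaining()];

        res.get(out, OVERHEAD, res.remaining());

        return out;
    }

    /** Restores the slice [offset, offset + length) of the given array. */
    public byte[] restore(byte[] bytes, int offset, int length) {
        ByteBuffer res = restore(ByteBuffer.wrap(bytes, offset, length));

        byte[] out = new byte[res.remaining()];

        res.get(out);

        return out;
    }
}

/** Toy implementation used to exercise the bridge: shifts every byte by one. */
final class ShiftSketch extends AdapterSketch {
    @Override protected ByteBuffer transform(ByteBuffer original) {
        return shift(original, 1);
    }

    @Override protected ByteBuffer restore(ByteBuffer transformed) {
        return shift(transformed, -1);
    }

    private static ByteBuffer shift(ByteBuffer src, int delta) {
        ByteBuffer res = ByteBuffer.allocate(src.remaining());

        while (src.hasRemaining())
            res.put((byte)(src.get() + delta));

        res.flip();

        return res;
    }
}
```

Wrapping the source slice instead of copying it is what makes the offset and length parameters useful: the caller can pass a region of a larger array directly.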
Compression example:

    class CompressionTransformerSpi extends CacheObjectTransformerSpiAdapter {
        protected ByteBuffer transform(ByteBuffer original) throws IgniteCheckedException {
            int locOverhead = 4; // Original length.
            int totalOverhead = CacheObjectTransformerSpi.OVERHEAD + locOverhead;

            int origSize = original.remaining();
            int lim = origSize - totalOverhead;

            if (lim <= 0)
                throw new IgniteCheckedException("Compression is not profitable.");

            ByteBuffer compressed = byteBuffer(lim);

            // Leave room for the original length, compress, then write the length at the start.
            compressed.position(locOverhead);

            Zstd.compress(compressed, original, 1);

            compressed.flip();

            compressed.putInt(origSize);

            compressed.rewind();

            return compressed;
        }

        protected ByteBuffer restore(ByteBuffer transformed) {
            ByteBuffer restored = byteBuffer(transformed.getInt());

            Zstd.decompress(restored, transformed);

            restored.flip();

            return restored;
        }
    }
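The Zstd example above needs the zstd-jni library and the adapter's byteBuffer helper. For readers who want to try the same layout (a 4-byte original length followed by the compressed data, plus a "not profitable" check) with the JDK only, here is a rough sketch based on java.util.zip. The class name DeflateSketch is hypothetical, Deflater is swapped in for Zstd, and returning null instead of throwing IgniteCheckedException is a simplification.

```java
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Illustrative only: JDK-based stand-in for the Zstd example. */
final class DeflateSketch {
    /** Returns [4-byte original length][deflated data], or null if compression saves no space. */
    static ByteBuffer transform(ByteBuffer original) {
        int origSize = original.remaining();
        int lim = origSize - Integer.BYTES; // The result must be smaller than the input.

        if (lim <= 0)
            return null;

        byte[] src = new byte[origSize];
        original.duplicate().get(src); // Copy without moving the caller's position.

        Deflater deflater = new Deflater(Deflater.BEST_SPEED);

        try {
            deflater.setInput(src);
            deflater.finish();

            byte[] dst = new byte[lim];
            int n = deflater.deflate(dst);

            if (!deflater.finished())
                return null; // Compressed data does not fit: not profitable.

            ByteBuffer res = ByteBuffer.allocate(Integer.BYTES + n);

            res.putInt(origSize);
            res.put(dst, 0, n);
            res.flip();

            return res;
        }
        finally {
            deflater.end();
        }
    }

    /** Reads the original length, then inflates the rest into a buffer of that size. */
    static ByteBuffer restore(ByteBuffer transformed) {
        byte[] restored = new byte[transformed.getInt()];
        byte[] src = new byte[transformed.remaining()];

        transformed.get(src);

        Inflater inflater = new Inflater();

        try {
            inflater.setInput(src);

            int off = 0;

            while (off < restored.length && !inflater.finished()) {
                int n = inflater.inflate(restored, off, restored.length - off);

                if (n == 0)
                    break; // No progress: input is truncated or corrupted.

                off += n;
            }

            if (off != restored.length)
                throw new IllegalStateException("Unexpected restored size: " + off);

            return ByteBuffer.wrap(restored);
        }
        catch (DataFormatException e) {
            throw new IllegalStateException("Corrupted data.", e);
        }
        finally {
            inflater.end();
        }
    }
}
```

Storing the original length up front lets restore allocate the exact output buffer, which mirrors the role of locOverhead in the Zstd example.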
Encryption example (a toy byte-shift cipher, for illustration):

    class EncryptionTransformerSpi extends CacheObjectTransformerSpiAdapter {
        private static final int SHIFT = 42; // Secret!

        protected ByteBuffer transform(ByteBuffer original) throws IgniteCheckedException {
            ByteBuffer transformed = byteBuffer(original.remaining()); // Same capacity is required.

            while (original.hasRemaining())
                transformed.put((byte)(original.get() + SHIFT));

            transformed.flip();

            return transformed;
        }

        // Signature matches the adapter's abstract restore(ByteBuffer).
        protected ByteBuffer restore(ByteBuffer transformed) {
            ByteBuffer restored = byteBuffer(transformed.remaining()); // Same size.

            while (transformed.hasRemaining())
                restored.put((byte)(transformed.get() - SHIFT));

            restored.flip();

            return restored;
        }
    }
Transformation requires additional memory allocation and subsequent GC work.
Transformation requires additional CPU utilization.