...

ID: IEP-97
Author: Anton Vinogradov
Sponsor:
Created:
Status: DRAFT / active


Motivation

...

A possible solution is to wrap them, or to transform the byte arrays they provide, during the marshalling/unmarshalling phase. This covers both layers: messaging (network) and storage (in-memory and persistence).

Transformation

GridBinaryMarshaller already transforms objects to bytes. 

All we need to do is transform these bytes and wrap the result.

For example,

  • 42 will be transformed to [3, 42, 0, 0, 0], where 3 is a GridBinaryMarshaller#INT
  • "Test string" will be transformed to [9, 11, 0, 0, 0, 84, 101, 115, 116, 32, 115, 116, 114, 105, 110, 103], where 9 is a GridBinaryMarshaller#STRING

The idea is simply to transform the given array in some way and prepend a special prefix, GridBinaryMarshaller#TRANSFORMED == -3, so that it is distinguishable from untransformed data.

For example,

  • No-Op transformer will produce [-3, 3, 42, 0, 0, 0] or [-3, 9, 11, 0, 0, 0, 84, 101, 115, 116, 32, 115, 116, 114, 105, 110, 103].
  • Pseudo-Crypto transformer, which adds 1 to every original byte, will produce [-3, 4, 43, 1, 1, 1] or [-3, 10, 12, 1, 1, 1, 85, 102, 116, 117, 33, 116, 117, 115, 106, 111, 104]
  • Magic-Compressor will produce [-3, 7] or [-3, 17], where 7 and 17 are the result of a magic compression.
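The prefixing scheme above can be sketched in plain Java. This is a minimal illustration rather than Ignite code: the class and method names are hypothetical, and the TRANSFORMED constant simply mirrors the value -3 given above.

```java
import java.util.Arrays;

public class TransformPrefixSketch {
    /** Assumption: mirrors GridBinaryMarshaller#TRANSFORMED == -3 as stated in the text. */
    static final byte TRANSFORMED = -3;

    /** No-Op transformer: only prepends the TRANSFORMED prefix. */
    static byte[] noOp(byte[] marshalled) {
        byte[] res = new byte[marshalled.length + 1];

        res[0] = TRANSFORMED;
        System.arraycopy(marshalled, 0, res, 1, marshalled.length);

        return res;
    }

    /** Pseudo-Crypto transformer: adds 1 to every original byte, then prepends the prefix. */
    static byte[] plusOne(byte[] marshalled) {
        byte[] res = new byte[marshalled.length + 1];

        res[0] = TRANSFORMED;

        for (int i = 0; i < marshalled.length; i++)
            res[i + 1] = (byte)(marshalled[i] + 1);

        return res;
    }

    public static void main(String[] args) {
        byte[] marshalledInt = {3, 42, 0, 0, 0}; // 42 as marshalled in the example above.

        System.out.println(Arrays.toString(noOp(marshalledInt)));    // [-3, 3, 42, 0, 0, 0]
        System.out.println(Arrays.toString(plusOne(marshalledInt))); // [-3, 4, 43, 1, 1, 1]
    }
}
```

Because the first byte of untransformed data is a type flag such as INT (3) or STRING (9), a reader can tell from the first byte alone whether restoration is needed.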

CacheObjects

We need to cover all CacheObjects.

Most of them have the following structure:

Code Block
language: java
title: XXX extends CacheObjectAdapter
protected Object val; // Unmarshalled value.
protected byte[] valBytes; // Marshalled value bytes.

All we need is to add the transformation during the marshalling/unmarshalling phase:

Code Block
language: java
title: CacheObjectAdapter transformation
protected byte[] valueBytesFromValue(CacheObjectValueContext ctx) throws IgniteCheckedException {
    byte[] bytes = ctx.kernalContext().cacheObjects().marshal(ctx, val);

    return CacheObjectTransformerUtils.transformIfNecessary(bytes, ctx);
}


protected Object valueFromValueBytes(CacheObjectValueContext ctx, ClassLoader ldr) throws IgniteCheckedException {
    byte[] bytes = CacheObjectTransformerUtils.restoreIfNecessary(valBytes, ctx);

    return ctx.kernalContext().cacheObjects().unmarshal(ctx, bytes, ldr);
}


public void prepareMarshal(CacheObjectValueContext ctx) throws IgniteCheckedException {
    if (valBytes == null)
        valBytes = valueBytesFromValue(ctx);
}


public void finishUnmarshal(CacheObjectValueContext ctx, ClassLoader ldr) throws IgniteCheckedException {
    if (val == null)
        val = valueFromValueBytes(ctx, ldr);
}
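The internals of CacheObjectTransformerUtils are not shown in this proposal. The following is a minimal sketch of how transformIfNecessary/restoreIfNecessary might behave, assuming that a null result from the transformer means "keep the original bytes" and that the TRANSFORMED prefix is the only marker checked on restore. The Transformer interface, the SHIFT instance and the method signatures are hypothetical simplifications (the real methods take a CacheObjectValueContext).

```java
import java.nio.ByteBuffer;

public class TransformerUtilsSketch {
    /** Assumption: mirrors GridBinaryMarshaller#TRANSFORMED == -3. */
    static final byte TRANSFORMED = -3;

    /** Hypothetical stand-in for the transformer described in this proposal. */
    interface Transformer {
        /** @return Bytes starting with TRANSFORMED, or null when transformation is not suitable. */
        ByteBuffer transform(ByteBuffer original);

        /** @param transformed Buffer positioned right after the TRANSFORMED prefix. */
        ByteBuffer restore(ByteBuffer transformed);
    }

    /** Illustrative transformer: shifts every byte by 1, skips inputs shorter than 4 bytes. */
    static final Transformer SHIFT = new Transformer() {
        @Override public ByteBuffer transform(ByteBuffer original) {
            if (original.remaining() < 4)
                return null; // Not suitable: caller keeps the original bytes.

            ByteBuffer out = ByteBuffer.allocate(original.remaining() + 1);

            out.put(TRANSFORMED);

            while (original.hasRemaining())
                out.put((byte)(original.get() + 1));

            out.flip();

            return out;
        }

        @Override public ByteBuffer restore(ByteBuffer transformed) {
            ByteBuffer out = ByteBuffer.allocate(transformed.remaining());

            while (transformed.hasRemaining())
                out.put((byte)(transformed.get() - 1));

            out.flip();

            return out;
        }
    };

    /** Returns transformed bytes, or the original array when no transformation applies. */
    static byte[] transformIfNecessary(byte[] bytes, Transformer trans) {
        if (trans == null)
            return bytes; // Transformation is disabled.

        ByteBuffer res = trans.transform(ByteBuffer.wrap(bytes));

        return res == null ? bytes : toArray(res);
    }

    /** Restores bytes only when they start with the TRANSFORMED prefix. */
    static byte[] restoreIfNecessary(byte[] bytes, Transformer trans) {
        if (bytes.length == 0 || bytes[0] != TRANSFORMED)
            return bytes; // Untransformed data is returned as is.

        if (trans == null)
            throw new IllegalStateException("Transformed data found, but no transformer is configured.");

        // Skip the prefix so the transformer sees only its own payload.
        return toArray(trans.restore(ByteBuffer.wrap(bytes, 1, bytes.length - 1)));
    }

    /** Copies the remaining bytes of the buffer into a new array. */
    private static byte[] toArray(ByteBuffer buf) {
        byte[] arr = new byte[buf.remaining()];

        buf.get(arr);

        return arr;
    }
}
```

With this shape, transformed and untransformed values can coexist in the same cache, since the restore path decides per value based on the first byte.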

BinaryObjects

BinaryObject(Impl)s have a different structure:

Code Block
language: java
title: BinaryObjectImpl
private Object obj; // Deserialized value. Value converted to the Java class instance.
private byte[] arr; // Serialized bytes. Value!

(De)serialization is similar to (un)marshalling: both convert between bytes and a Java class instance, but they happen at different times and in different code layers.

(Un)marshalling happens when an object is put to or read from the cache, while (de)serialization happens when a binary object is built or deserialized, detached from any cache.

Fortunately, BinaryObjectImpl requires no separate marshalling step: serialization already produces bytes that can be used as the marshalled bytes.

However, if we are going to transform the data during the marshalling/unmarshalling phase, we need to add an additional data layer to BinaryObjectImpl:

Code Block
language: java
title: BinaryObjectImpl
private Object obj; // Deserialized value. Value converted to the Java class instance.
private byte[] arr; // Serialized bytes. Value!
private byte[] valBytes; // Marshalled value bytes.

Here valBytes == arr when the transformation is disabled.

It is not possible to simply replace arr with valBytes because, unlike in CacheObjectImpl, arr is not just marshalled bytes. It is the object's value, which is required, for example, to provide hashCode/schemaId/typeId/objectField, so we must keep it as is.

So, BinaryObjectImpl requires valBytes to/from arr conversion:

Code Block
language: java
title: BinaryObjectImpl (un)marshalling
private byte[] arrayFromValueBytes(CacheObjectValueContext ctx) {
    return CacheObjectTransformerUtils.restoreIfNecessary(valBytes, ctx);
}

private byte[] valueBytesFromArray(CacheObjectValueContext ctx) {
    return CacheObjectTransformerUtils.transformIfNecessary(arr, start, arr.length, ctx);
}


public void finishUnmarshal(CacheObjectValueContext ctx, ClassLoader ldr) throws IgniteCheckedException {
    if (arr == null)
        arr = arrayFromValueBytes(ctx);
}


public void prepareMarshal(CacheObjectValueContext ctx) {
    if (valBytes == null)
        valBytes = valueBytesFromArray(ctx);
}

Transformer

Some customers may want to encrypt the data, some to compress it, and others to keep it as is.

So, we must provide a simple way to plug in any transformation.

API

Code Block
language: java
title: Interface
public interface CacheObjectTransformerManager extends GridCacheSharedManager {
    /**
     * Transforms the data.
     *
     * @param original Original data.
     * @return Transformed data (starting with {@link GridBinaryMarshaller#TRANSFORMED} when restorable)
     * or {@code null} when transformation is not possible/suitable.
     */
    public @Nullable ByteBuffer transform(ByteBuffer original);

    /**
     * Restores the data.
     *
     * @param transformed Transformed data.
     * @return Restored data.
     */
    public ByteBuffer restore(ByteBuffer transformed);
}

Customers may implement this interface as needed and register the implementation via plugin configuration:

Code Block
language: java
title: Custom transformer
IgniteConfiguration getConfiguration() {
    IgniteConfiguration cfg = ...

    cfg.setPluginProviders(new XXXPluginProvider()); // Which provides some XXXCacheObjectTransformerManager()

    return cfg;
}

Examples

Compression example

Code Block
language: java
title: Compression
class CompressionTransformer extends CacheObjectTransformerAdapter {
    protected ByteBuffer transform(ByteBuffer original) throws IgniteCheckedException {
        int overhead = 5; // Transformed flag (1 byte) + original length (4 bytes).

        int origSize = original.remaining();
        int lim = origSize - overhead;

        if (lim <= 0)
            return null; // Compression is not profitable.

        ByteBuffer compressed = byteBuffer(overhead + (int)Zstd.compressBound(origSize));

        compressed.put(TRANSFORMED);
        compressed.putInt(origSize);

        int size = Zstd.compress(compressed, original, 1);

        if (size >= lim)
            return null; // Compression is not profitable.

        compressed.flip();

        return compressed;
    }

    protected ByteBuffer restore(ByteBuffer transformed) {
        ByteBuffer restored = byteBuffer(transformed.getInt()); // Original size written by transform().

        Zstd.decompress(restored, transformed);

        restored.flip();

        return restored;
    }
}
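For readers who want to try the "compress only when profitable" logic without Ignite or Zstd on the classpath, here is a self-contained sketch that swaps in the JDK's java.util.zip Deflater/Inflater. It is not the proposed implementation: the class name and the byte[]-based signatures are hypothetical, and only the layout (TRANSFORMED flag, 4-byte original length, compressed payload) follows the example above.

```java
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateTransformerSketch {
    /** Assumption: mirrors GridBinaryMarshaller#TRANSFORMED == -3. */
    static final byte TRANSFORMED = -3;

    /** Transformed flag (1 byte) + original length (4 bytes). */
    static final int OVERHEAD = 5;

    /** @return Flag + length + deflated payload, or null when compression is not profitable. */
    static byte[] transform(byte[] original) {
        int lim = original.length - OVERHEAD;

        if (lim <= 0)
            return null; // Too small: the overhead alone exceeds any gain.

        Deflater deflater = new Deflater(Deflater.BEST_SPEED);

        try {
            deflater.setInput(original);
            deflater.finish();

            byte[] payload = new byte[lim];
            int size = deflater.deflate(payload);

            // Output did not fit below the limit, so the result would not be smaller.
            if (!deflater.finished() || size >= lim)
                return null;

            ByteBuffer out = ByteBuffer.allocate(OVERHEAD + size);

            out.put(TRANSFORMED).putInt(original.length).put(payload, 0, size);

            return out.array();
        }
        finally {
            deflater.end();
        }
    }

    /** @return Original bytes restored from the output of transform(). */
    static byte[] restore(byte[] transformed) {
        ByteBuffer in = ByteBuffer.wrap(transformed);

        if (in.get() != TRANSFORMED)
            throw new IllegalArgumentException("Data is not transformed.");

        byte[] restored = new byte[in.getInt()];

        Inflater inflater = new Inflater();

        try {
            inflater.setInput(transformed, OVERHEAD, transformed.length - OVERHEAD);

            if (inflater.inflate(restored) != restored.length)
                throw new IllegalStateException("Unexpected restored length.");

            return restored;
        }
        catch (DataFormatException e) {
            throw new IllegalStateException("Corrupted payload.", e);
        }
        finally {
            inflater.end();
        }
    }
}
```

Calling transform on a highly compressible array, such as 1000 zero bytes, yields a much shorter prefixed array that restore turns back into the original, while transform on the 5-byte marshalled int from the earlier example returns null, so the value would be stored untransformed.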

Encryption example

Code Block
language: java
title: Encryption
class EncryptionTransformer extends CacheObjectTransformerAdapter {
    private static final int SHIFT = 42; // Secret!

    protected ByteBuffer transform(ByteBuffer original) throws IgniteCheckedException {
        ByteBuffer transformed = byteBuffer(original.remaining() + 1); // Original size + 1 byte for the TRANSFORMED flag.

        transformed.put(TRANSFORMED);

        while (original.hasRemaining())
            transformed.put((byte)(original.get() + SHIFT));

        transformed.flip();

        return transformed;
    }

    protected ByteBuffer restore(ByteBuffer transformed) {
        ByteBuffer restored = byteBuffer(transformed.remaining()); // Same size.

        while (transformed.hasRemaining())
            restored.put((byte)(transformed.get() - SHIFT));

        restored.flip();

        return restored;
    }
}

Risks and Assumptions

Transformation requires additional memory allocation and subsequent GC work.

...