ID: IEP-97
Author: Anton Vinogradov
Sponsor:
Created:
Status: active


Motivation

Customers may want to 

  • minimize (compress)
  • protect (encrypt)

user data at the network and memory layers.

Ignite supports Disk Compression and Transparent Data Encryption, but both transform data at the persistence layer only.

Description

To cover both layers (network and memory) and make the feature compatible with the existing data, it is proposed to transform/restore CacheObject's bytes on the fly.

A possible solution is to transform the byte arrays that CacheObjects produce during the marshalling/unmarshalling phase. This covers both layers: messaging (network) and storage (in-memory and persistence).

Transformation

All we need is to cover all CacheObjects.

CacheObjects

Most of them have the following structure:

Code Block
languagejava
titleXXX extends CacheObjectAdapter
 protected Object val; // Unmarshalled value.
 protected byte[] valBytes; // Marshalled value bytes.

All that is needed is to add the transformation during the marshalling/unmarshalling phase:

Code Block
languagejava
titleCacheObjectAdapter transformation
protected byte[] valueBytesFromValue(CacheObjectValueContext ctx) throws IgniteCheckedException {
    byte[] bytes = ctx.kernalContext().cacheObjects().marshal(ctx, val);

    return CacheObjectsTransformer.transformIfNecessary(bytes, ctx);
}


protected Object valueFromValueBytes(CacheObjectValueContext ctx, ClassLoader ldr) throws IgniteCheckedException {
    byte[] bytes = CacheObjectsTransformer.restoreIfNecessary(valBytes, ctx);

    return ctx.kernalContext().cacheObjects().unmarshal(ctx, bytes, ldr);
}

... 

public void prepareMarshal(CacheObjectValueContext ctx) throws IgniteCheckedException {
    if (valBytes == null)
        valBytes = valueBytesFromValue(ctx);
}

...

public void finishUnmarshal(CacheObjectValueContext ctx, ClassLoader ldr) throws IgniteCheckedException {
    if (val == null)
        val = valueFromValueBytes(ctx, ldr);
}
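
The source does not show what transformIfNecessary and restoreIfNecessary do internally. A minimal sketch of one possible shape follows. Everything in it is an assumption made for illustration: the class name TransformerSketch, the Transformer interface, the marker byte value, and the header layout (marker, version, original length) chosen to add up to 6 bytes. It also assumes that raw marshalled bytes never start with the marker byte.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical stand-in for CacheObjectsTransformer; illustrative only, not Ignite code. */
final class TransformerSketch {
    /** Assumed marker byte flagging a transformed payload (illustrative value). */
    static final byte TRANSFORMED = -3;

    /** Assumed header size: 1 marker byte + 1 version byte + 4 bytes of original length. */
    static final int OVERHEAD = 6;

    /** Hypothetical pluggable transformation. */
    interface Transformer {
        /** Returns the transformed bytes, or null if the transformation is not suitable. */
        byte[] transform(byte[] original);

        /** Returns the original bytes. */
        byte[] restore(byte[] transformed, int originalLength);
    }

    /** Toy transformer that shifts every byte; used only to exercise the sketch. */
    static final class Shift implements Transformer {
        @Override public byte[] transform(byte[] original) {
            byte[] res = new byte[original.length];

            for (int i = 0; i < res.length; i++)
                res[i] = (byte)(original[i] + 42);

            return res;
        }

        @Override public byte[] restore(byte[] transformed, int originalLength) {
            byte[] res = new byte[originalLength];

            for (int i = 0; i < res.length; i++)
                res[i] = (byte)(transformed[i] - 42);

            return res;
        }
    }

    static byte[] transformIfNecessary(byte[] bytes, Transformer t) {
        if (t == null)
            return bytes; // Transformation is disabled.

        byte[] body = t.transform(bytes);

        if (body == null)
            return bytes; // Not suitable: keep the raw bytes.

        // Header (marker, version 0, original length) followed by the transformed body.
        return ByteBuffer.allocate(OVERHEAD + body.length)
            .put(TRANSFORMED).put((byte)0).putInt(bytes.length).put(body).array();
    }

    static byte[] restoreIfNecessary(byte[] bytes, Transformer t) {
        if (t == null || bytes.length < OVERHEAD || bytes[0] != TRANSFORMED)
            return bytes; // Raw bytes pass through untouched.

        int origLen = ByteBuffer.wrap(bytes, 2, 4).getInt();

        return t.restore(Arrays.copyOfRange(bytes, OVERHEAD, bytes.length), origLen);
    }
}
```

With such a layout, untransformed data written before the feature was enabled stays readable, because restoreIfNecessary returns any payload without the marker as is.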

BinaryObjects

BinaryObject(Impl)s have a different structure:

Code Block
languagejava
titleBinaryObjectImpl
private Object obj; // Deserialized value.
private byte[] arr; // Serialized bytes.

(De)serialization is similar to (un)marshalling: it is the process of obtaining a Java class instance from bytes, or vice versa, but it happens at a different time and in a different code layer.

(Un)marshalling happens when an object is put to or read from a cache, whereas (de)serialization happens when a binary object is built or deserialized, detached from any cache.

Fortunately, BinaryObjectImpl requires no marshalling: serialization already generates bytes that can be used as the marshalled bytes.

But if we are going to transform the data during the marshalling/unmarshalling phase, we need to add an additional data layer to BinaryObjectImpl:

Code Block
languagejava
titleBinaryObjectImpl
private Object obj; // Deserialized value.
private byte[] arr; // Serialized bytes.
private byte[] valBytes; // Marshalled value bytes.

Here valBytes == arr when transformation is disabled.

It is not possible to simply replace arr with valBytes because, unlike in CacheObjectImpl for example, arr is not just marshalled bytes. It is the object's value, required, for example, to provide hashCode/schemaId/typeId/objectField.

So, BinaryObjectImpl requires conversion between valBytes and arr:

Code Block
languagejava
titleBinaryObjectImpl (un)marshalling
private byte[] arrayFromValueBytes(CacheObjectValueContext ctx) {
    return CacheObjectsTransformer.restoreIfNecessary(valBytes, ctx);
}

private byte[] valueBytesFromArray(CacheObjectValueContext ctx) {
    return CacheObjectsTransformer.transformIfNecessary(arr, start, arr.length, ctx);
}

...

public void finishUnmarshal(CacheObjectValueContext ctx, ClassLoader ldr) throws IgniteCheckedException {
    if (arr == null)
        arr = arrayFromValueBytes(ctx);
}

...

public void prepareMarshal(CacheObjectValueContext ctx) {
    if (valBytes == null)
        valBytes = valueBytesFromArray(ctx);
}
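
The relationship between arr and valBytes can be illustrated with a small self-contained sketch. The class DualLayerHolder and its factory methods are hypothetical and only mimic the lazy conversion shown above. The transformation is passed in as a pair of plain functions instead of the real CacheObjectsTransformer.

```java
import java.util.function.UnaryOperator;

/** Hypothetical holder mirroring the arr/valBytes pair; not the real BinaryObjectImpl. */
final class DualLayerHolder {
    private byte[] arr;      // Serialized bytes (needed for hashCode, typeId, field access).
    private byte[] valBytes; // Marshalled, possibly transformed, bytes.

    private final UnaryOperator<byte[]> transform;
    private final UnaryOperator<byte[]> restore;

    private DualLayerHolder(byte[] arr, byte[] valBytes,
        UnaryOperator<byte[]> transform, UnaryOperator<byte[]> restore) {
        this.arr = arr;
        this.valBytes = valBytes;
        this.transform = transform;
        this.restore = restore;
    }

    /** Sender side: the object was built locally, only arr is known. */
    static DualLayerHolder fromArray(byte[] arr,
        UnaryOperator<byte[]> transform, UnaryOperator<byte[]> restore) {
        return new DualLayerHolder(arr, null, transform, restore);
    }

    /** Receiver side: the object arrived from network or storage, only valBytes is known. */
    static DualLayerHolder fromValueBytes(byte[] valBytes,
        UnaryOperator<byte[]> transform, UnaryOperator<byte[]> restore) {
        return new DualLayerHolder(null, valBytes, transform, restore);
    }

    void prepareMarshal() {
        if (valBytes == null)
            valBytes = transform.apply(arr);
    }

    void finishUnmarshal() {
        if (arr == null)
            arr = restore.apply(valBytes);
    }

    byte[] array() {
        return arr;
    }

    byte[] valueBytes() {
        return valBytes;
    }
}
```

When both functions are UnaryOperator.identity(), which models disabled transformation, valueBytes() returns the same array instance as array(), so no extra memory is allocated.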

SPI (Service Provider Interface)

Some customers may want to encrypt the data, some to compress it, while others just keep it as is.

So we must provide a simple way to plug in any transformation.

API

The simplest way is to use a Service Provider Interface (IgniteSpi):

Code Block
languagejava
titleSPI
public interface CacheObjectTransformerSpi extends IgniteSpi {
    /** Additional space required to store the transformed data. */
    public int OVERHEAD = 6;

    /**
     * Transforms the data.
     *
     * @param bytes  Byte array containing the data.
     * @param offset Data offset.
     * @param length Data length.
     * @return Byte array containing the transformed data, preceded by an unfilled area of {@link #OVERHEAD} bytes.
     * @throws IgniteCheckedException when transformation is not possible/suitable.
     */
    public byte[] transform(byte[] bytes, int offset, int length) throws IgniteCheckedException;

    /**
     * Restores the data.
     *
     * @param bytes  Byte array ending with the transformed data.
     * @param offset Transformed data offset.
     * @param length Original data length.
     * @return Byte array containing the restored data.
     */
    public byte[] restore(byte[] bytes, int offset, int length);
}

This API is aware of the overhead used to store the transformed data and allows working with byte arrays at custom offsets, which is necessary to guarantee performance.
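
The offset and overhead contract described in the Javadoc can be sketched without any Ignite dependency. The class XorSpiSketch below is hypothetical and only mirrors the two method signatures. The XOR mask is a toy transformation chosen because it keeps the length unchanged.

```java
/** Hypothetical stand-in mirroring the proposed SPI signatures; not Ignite code. */
final class XorSpiSketch {
    /** Same value as the OVERHEAD constant in the proposed interface. */
    static final int OVERHEAD = 6;

    /** Toy mask, for illustration only. */
    private static final byte MASK = 0x5A;

    /** Reads length bytes starting at offset and leaves the first OVERHEAD bytes of the result unfilled. */
    static byte[] transform(byte[] bytes, int offset, int length) {
        byte[] res = new byte[OVERHEAD + length];

        for (int i = 0; i < length; i++)
            res[OVERHEAD + i] = (byte)(bytes[offset + i] ^ MASK);

        return res;
    }

    /** Reads the transformed data starting at offset and returns length restored bytes. */
    static byte[] restore(byte[] bytes, int offset, int length) {
        byte[] res = new byte[length];

        for (int i = 0; i < length; i++)
            res[i] = (byte)(bytes[offset + i] ^ MASK);

        return res;
    }
}
```

Leaving the prefix unfilled lets the caller write its own header into the returned array instead of copying the transformed data into a second, larger array.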

Any customer may implement this interface as needed:

Code Block
languagejava
titleCustom SPI
IgniteConfiguration getConfiguration() {
    IgniteConfiguration cfg = ...

    cfg.setCacheObjectTransformerSpi(new XXXTransformerSpi());

    return cfg;
}

Simplified API

However, most customers just want to transform the data, so they may extend an adapter that exposes a simpler API.

Code Block
languagejava
titleCacheObjectTransformerSpiAdapter
public abstract class CacheObjectTransformerSpiAdapter extends IgniteSpiAdapter implements CacheObjectTransformerSpi {
...
    /**
     * Transforms the data.
     *
     * @param original Original data.
     * @return Transformed data.
     * @throws IgniteCheckedException when transformation is not possible/suitable.
     */
    protected abstract ByteBuffer transform(ByteBuffer original) throws IgniteCheckedException;

    /**
     * Restores the data.
     *
     * @param transformed Transformed data.
     * @param length Original data length.
     * @return Restored data.
     */
    protected abstract ByteBuffer restore(ByteBuffer transformed, int length);
}
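
The elided part of the adapter presumably bridges the ByteBuffer methods to the byte array methods of the SPI. A minimal sketch of such a bridge follows, under stated assumptions: the class AdapterSketch is hypothetical, it does not extend IgniteSpiAdapter, and its byteBuffer(int) helper simply allocates a heap buffer, which may differ from the real helper used in the examples below.

```java
import java.nio.ByteBuffer;

/** Hypothetical bridge between the ByteBuffer API and the byte[] API; not Ignite code. */
abstract class AdapterSketch {
    static final int OVERHEAD = 6;

    protected abstract ByteBuffer transform(ByteBuffer original);

    protected abstract ByteBuffer restore(ByteBuffer transformed, int length);

    /** Assumed helper; the real adapter may allocate or reuse buffers differently. */
    protected ByteBuffer byteBuffer(int capacity) {
        return ByteBuffer.allocate(capacity);
    }

    /** Wraps the input without copying and reserves OVERHEAD bytes in front of the result. */
    public final byte[] transform(byte[] bytes, int offset, int length) {
        ByteBuffer transformed = transform(ByteBuffer.wrap(bytes, offset, length));

        byte[] res = new byte[OVERHEAD + transformed.remaining()];

        transformed.get(res, OVERHEAD, transformed.remaining());

        return res;
    }

    /** Wraps the transformed tail of the array and copies the restored bytes out. */
    public final byte[] restore(byte[] bytes, int offset, int length) {
        ByteBuffer restored = restore(ByteBuffer.wrap(bytes, offset, bytes.length - offset), length);

        byte[] res = new byte[restored.remaining()];

        restored.get(res);

        return res;
    }

    /** Toy subclass that shifts every byte; used only to exercise the bridge. */
    static final class Shift extends AdapterSketch {
        @Override protected ByteBuffer transform(ByteBuffer original) {
            ByteBuffer out = byteBuffer(original.remaining());

            while (original.hasRemaining())
                out.put((byte)(original.get() + 42));

            out.flip();

            return out;
        }

        @Override protected ByteBuffer restore(ByteBuffer transformed, int length) {
            ByteBuffer out = byteBuffer(length);

            while (transformed.hasRemaining())
                out.put((byte)(transformed.get() - 42));

            out.flip();

            return out;
        }
    }
}
```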

Compression example

Code Block
languagejava
titleCompressionSpi
class CompressionTransformerSpi extends CacheObjectTransformerSpiAdapter {
    protected ByteBuffer transform(ByteBuffer original) throws IgniteCheckedException {
        // The compressed data must be smaller than the original by more than the SPI overhead.
        int lim = original.remaining() - CacheObjectTransformerSpi.OVERHEAD;

        if (lim <= 0)
            throw new IgniteCheckedException("Compression is not profitable.");

        ByteBuffer compressed = byteBuffer(lim);

        Zstd.compress(compressed, original, 1);

        compressed.flip();             

        return compressed;
    }

    protected ByteBuffer restore(ByteBuffer transformed, int length) {
        ByteBuffer restored = byteBuffer(length);

        Zstd.decompress(restored, transformed);

        restored.flip();
              
        return restored;
    }
}
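
The "not profitable" check can be tried out with the JDK alone. The sketch below swaps Zstd for java.util.zip.Deflater so that it runs without third-party libraries. The class DeflateSketch is hypothetical, and returning null (instead of throwing IgniteCheckedException) is an assumption made to keep it self-contained.

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical JDK-only illustration of the profitability check; not Ignite code. */
final class DeflateSketch {
    static final int OVERHEAD = 6;

    /** Returns the compressed bytes, or null if compression does not save more than OVERHEAD bytes. */
    static byte[] compress(byte[] original) {
        int lim = original.length - OVERHEAD;

        if (lim <= 0)
            return null; // Too small to ever be profitable.

        Deflater deflater = new Deflater(Deflater.BEST_SPEED);

        try {
            deflater.setInput(original);
            deflater.finish();

            byte[] buf = new byte[lim];

            int size = deflater.deflate(buf);

            // If the output did not fit into lim bytes, the stream is not finished: not profitable.
            return deflater.finished() ? Arrays.copyOf(buf, size) : null;
        }
        finally {
            deflater.end();
        }
    }

    /** Restores exactly originalLength bytes. */
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();

        try {
            inflater.setInput(compressed);

            byte[] restored = new byte[originalLength];

            int n = 0;

            while (n < originalLength && !inflater.finished()) {
                int read = inflater.inflate(restored, n, originalLength - n);

                if (read == 0)
                    break; // No progress: input is exhausted or malformed.

                n += read;
            }

            if (n != originalLength)
                throw new IllegalStateException("Unexpected restored length: " + n);

            return restored;
        }
        catch (DataFormatException e) {
            throw new IllegalStateException(e);
        }
        finally {
            inflater.end();
        }
    }
}
```

Passing the original length to decompress matches the restore signature of the SPI, where the length is supplied by the caller and allows the output buffer to be sized exactly.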

Encryption example

Code Block
languagejava
titleEncryptionSpi
class EncryptionTransformerSpi extends CacheObjectTransformerSpiAdapter {
    private static final int SHIFT = 42; // Secret!

    protected ByteBuffer transform(ByteBuffer original) throws IgniteCheckedException {
        ByteBuffer transformed = byteBuffer(original.remaining()); // Same capacity is required.

        while (original.hasRemaining())
            transformed.put((byte)(original.get() + SHIFT));

        transformed.flip();

        return transformed;
    }

    protected ByteBuffer restore(ByteBuffer transformed, int length) {
        ByteBuffer restored = byteBuffer(length);

        while (transformed.hasRemaining())
            restored.put((byte)(transformed.get() - SHIFT));

        restored.flip();

        return restored;
    }
}
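
The byte shift above only demonstrates the API shape. For comparison, here is a sketch of the same transform/restore pair built on AES-GCM from the JDK. This is a swapped-in technique, not something the proposal specifies. The class AesGcmSketch, the choice of cipher mode, and the key handling are all assumptions. Note that the output is 28 bytes longer than the input (12-byte IV plus 16-byte tag), and the source does not state whether the adapter accepts a result larger than the original.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical AES-GCM transform/restore pair; not part of the proposal. */
final class AesGcmSketch {
    private static final int IV_LEN = 12;    // Recommended IV size for GCM, in bytes.
    private static final int TAG_BITS = 128; // Authentication tag size, in bits.

    private final SecretKeySpec key;
    private final SecureRandom rnd = new SecureRandom();

    /** @param rawKey AES key of 16, 24 or 32 bytes; key distribution is out of scope here. */
    AesGcmSketch(byte[] rawKey) {
        key = new SecretKeySpec(rawKey, "AES");
    }

    /** Returns IV followed by ciphertext and tag. */
    byte[] encrypt(byte[] original) {
        try {
            byte[] iv = new byte[IV_LEN];

            rnd.nextBytes(iv); // A fresh IV per value; it must never repeat for the same key.

            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");

            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));

            byte[] encrypted = cipher.doFinal(original);

            return ByteBuffer.allocate(IV_LEN + encrypted.length).put(iv).put(encrypted).array();
        }
        catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the IV from the first bytes, then decrypts and verifies the rest. */
    byte[] decrypt(byte[] transformed) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");

            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, transformed, 0, IV_LEN));

            return cipher.doFinal(transformed, IV_LEN, transformed.length - IV_LEN);
        }
        catch (GeneralSecurityException e) {
            throw new IllegalStateException(e); // Includes tag mismatch on tampered data.
        }
    }
}
```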

Risks and Assumptions

Transformation requires additional memory allocation and subsequent GC work.

Transformation requires additional CPU utilization.

Discussion Links

// Links to discussions on the devlist, if applicable.

Reference Links

// Links to various reference documents, if applicable.

Tickets

ASF JIRA query: project = Ignite AND labels IN (iep-97) ORDER BY status