Status
Current state: Under Discussion
Discussion thread: https://lists.apache.org/thread/tbhmkf44jhjf8lqmo7w2whynbgttf1o6
PR: https://github.com/apache/kafka/pull/12545
Motivation
Currently, Fetcher#parseRecord(TopicPartition, RecordBatch, Record) deserializes keys and values via Deserializer#deserialize(String topic, Headers headers, byte[] data). To do so, it first calls Utils.toArray(ByteBuffer) to convert the ByteBuffer into a byte[], which incurs a memory allocation and a memory copy. We could instead deserialize directly from the ByteBuffer, avoiding that allocation and copy in some cases.
If we add a default method Deserializer#deserialize(String, Headers, ByteBuffer) and invoke it from Fetcher#parseRecord(TopicPartition, RecordBatch, Record), StringDeserializer and ByteBufferDeserializer can avoid the extra allocation and copy. User-defined Deserializers that override the new method gain the same benefit.
Public Interfaces
We propose adding the default method Deserializer#deserialize(String, Headers, ByteBuffer).
Class | Method |
---|---|
Deserializer | `default T deserialize(String topic, Headers headers, ByteBuffer data)` (new default method; falls back to the existing byte[] variant) |
ByteBufferDeserializer | `@Override deserialize(String topic, Headers headers, ByteBuffer data)` |
StringDeserializer | `@Override deserialize(String topic, Headers headers, ByteBuffer data)` |
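To make the shape of the proposal concrete, here is a minimal, self-contained sketch of the new default method and a StringDeserializer override. It is simplified for illustration: the Headers parameter is omitted, and the copying fallback stands in for Utils.toArray; the actual Kafka signatures include Headers.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Simplified sketch of the proposed API (Headers omitted for brevity).
interface Deserializer<T> {
    T deserialize(String topic, byte[] data);

    // Proposed default: fall back to the byte[] variant by copying the
    // buffer (as Utils.toArray does), so existing implementations keep
    // working unchanged.
    default T deserialize(String topic, ByteBuffer data) {
        if (data == null)
            return deserialize(topic, (byte[]) null);
        byte[] bytes = new byte[data.remaining()];
        data.duplicate().get(bytes);   // allocation + copy
        return deserialize(topic, bytes);
    }
}

// StringDeserializer can decode straight from the buffer, skipping the
// intermediate byte[] entirely.
class StringDeserializer implements Deserializer<String> {
    @Override
    public String deserialize(String topic, byte[] data) {
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    @Override
    public String deserialize(String topic, ByteBuffer data) {
        return data == null ? null
            : StandardCharsets.UTF_8.decode(data.duplicate()).toString();
    }
}

public class Demo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8));
        // Decodes directly from the ByteBuffer, no byte[] allocated.
        System.out.println(new StringDeserializer().deserialize("t", buf));  // hello
    }
}
```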
Proposed Changes
- Add default method deserialize(String, Headers, ByteBuffer) to Deserializer.
- Invoke Deserializer#deserialize(String, Headers, ByteBuffer) instead of Deserializer#deserialize(String, Headers, byte[]) in Fetcher#parseRecord(TopicPartition, RecordBatch, Record).
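The effect of the Fetcher-side change can be sketched as follows. The method names oldPath/newPath are hypothetical stand-ins for the two call paths, not Kafka code: the old path copies the record's value buffer into a byte[] (as Utils.toArray does) before deserializing, while the new path hands the ByteBuffer through, letting a buffer-aware deserializer return a zero-copy view.

```java
import java.nio.ByteBuffer;

public class ZeroCopyDemo {
    // Old path (simplified): copy the buffer into a byte[] first, as
    // Utils.toArray does, then wrap it again for the deserializer's result.
    static ByteBuffer oldPath(ByteBuffer value) {
        byte[] copy = new byte[value.remaining()];
        value.duplicate().get(copy);   // allocation + copy
        return ByteBuffer.wrap(copy);
    }

    // New path (simplified): pass the ByteBuffer through; a buffer-aware
    // deserializer can return a view with no allocation or copy.
    static ByteBuffer newPath(ByteBuffer value) {
        return value.slice();          // zero-copy view
    }

    public static void main(String[] args) {
        ByteBuffer record = ByteBuffer.wrap(new byte[] {1, 2, 3});
        // Both paths yield the same bytes; only the new path avoids copying.
        System.out.println(oldPath(record).equals(newPath(record)));  // true
    }
}
```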
Compatibility, Deprecation, and Migration Plan
This proposal introduces no compatibility issues: we only add the default method deserialize(String, Headers, ByteBuffer), which falls back to the existing byte[] method, so existing Deserializer implementations continue to work unchanged.
Rejected Alternatives
Another solution considered was a pooled-buffer allocator (a PoolArea, as in Netty), but that approach is considerably more complex.