Status

Current state: Under Discussion

Discussion thread: https://lists.apache.org/thread/tbhmkf44jhjf8lqmo7w2whynbgttf1o6

PR: https://github.com/apache/kafka/pull/12545

Motivation

Currently, Fetcher#parseRecord(TopicPartition, RecordBatch, Record) deserializes keys and values with Deserializer#deserialize(String topic, Headers headers, byte[] data). To do so, it first calls Utils.toArray(ByteBuffer) to convert the ByteBuffer into a byte[], which costs an extra memory allocation and memory copy for each key and value it deserializes. Deserializing directly from the ByteBuffer instead of a byte[] avoids that allocation and copy in some cases.

If we add the default method Deserializer#deserialize(String, Headers, ByteBuffer) and use it in Fetcher#parseRecord(TopicPartition, RecordBatch, Record), StringDeserializer and ByteBufferDeserializer can avoid that allocation and copy. User-defined Deserializers that implement the new method get the same benefit.
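The difference between the two paths can be sketched with plain JDK classes. This is only an illustration, not Kafka code: BufferDecodeSketch and its methods are hypothetical names, toArray is a stand-in for Utils.toArray(ByteBuffer), and UTF-8 is assumed as the encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Kafka; it only contrasts the two decoding paths.
final class BufferDecodeSketch {

    // Stand-in for Utils.toArray(ByteBuffer): copies the remaining bytes into a new
    // array without moving the caller's position.
    static byte[] toArray(ByteBuffer buffer) {
        byte[] dest = new byte[buffer.remaining()];
        buffer.duplicate().get(dest);
        return dest;
    }

    // Copying path: one intermediate byte[] per payload, then the String.
    static String decodeWithCopy(ByteBuffer data) {
        return data == null ? null : new String(toArray(data), StandardCharsets.UTF_8);
    }

    // Direct path: heap buffers expose their backing array, so the String can be
    // built from it without the intermediate copy.
    static String decodeDirect(ByteBuffer data) {
        if (data == null) {
            return null;
        }
        if (data.hasArray()) {
            return new String(data.array(), data.arrayOffset() + data.position(),
                    data.remaining(), StandardCharsets.UTF_8);
        }
        // Off-heap or read-only buffers have no accessible array; fall back to copying.
        return new String(toArray(data), StandardCharsets.UTF_8);
    }
}
```

Both methods return the same String. The saving applies only when hasArray() is true, which is why the benefit holds "in some cases" rather than always.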

Public Interfaces

We propose adding the default method Deserializer#deserialize(String, Headers, ByteBuffer).

Class / Method

Deserializer

default T deserialize(String topic, Headers headers, ByteBuffer data) {
    return deserialize(topic, headers, Utils.toArray(data));
}

ByteBufferDeserializer

@Override
public ByteBuffer deserialize(String topic, Headers headers, ByteBuffer data) {
    return data;
}

StringDeserializer

@Override
public String deserialize(String topic, Headers headers, ByteBuffer data) {
    if (data == null) {
        return null;
    }

    try {
        if (data.hasArray()) {
            return new String(data.array(), data.position() + data.arrayOffset(), data.remaining(), encoding);
        } else {
            return new String(Utils.toArray(data), encoding);
        }
    } catch (UnsupportedEncodingException e) {
        throw new SerializationException("Error when deserializing ByteBuffer to string due to unsupported encoding " + encoding);
    }
}

Proposed Changes

  • Add the default method deserialize(String, Headers, ByteBuffer) to Deserializer;
  • In Fetcher#parseRecord(TopicPartition, RecordBatch, Record), invoke Deserializer#deserialize(String, Headers, ByteBuffer) instead of Deserializer#deserialize(String, Headers, byte[]).
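The call-site change in the second bullet could look roughly like the sketch below. It is not the actual Fetcher code: SketchDeserializer is a hypothetical, trimmed-down stand-in for Deserializer (Headers omitted), RecordingDeserializer and ParseRecordSketch exist only for illustration, and the sketch assumes the call site keeps handling null payloads itself.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical, trimmed-down stand-in for Kafka's Deserializer (Headers omitted for brevity).
interface SketchDeserializer<T> {
    T deserialize(String topic, byte[] data);

    // Mirrors the proposed default method: copy, then delegate to the byte[] overload.
    default T deserialize(String topic, ByteBuffer data) {
        byte[] copy = new byte[data.remaining()];
        data.duplicate().get(copy);
        return deserialize(topic, copy);
    }
}

// Illustration-only deserializer that remembers which overload the call site used.
final class RecordingDeserializer implements SketchDeserializer<String> {
    String lastOverload = "none";

    @Override
    public String deserialize(String topic, byte[] data) {
        lastOverload = "byte[]";
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    @Override
    public String deserialize(String topic, ByteBuffer data) {
        lastOverload = "ByteBuffer";
        // Decode a duplicate so the caller's position is left untouched.
        return StandardCharsets.UTF_8.decode(data.duplicate()).toString();
    }
}

final class ParseRecordSketch {
    // Before: the call site copies the payload into a byte[] and passes the copy on.
    static <T> T parseBefore(SketchDeserializer<T> deserializer, String topic, ByteBuffer payload) {
        if (payload == null) {
            return null;
        }
        byte[] bytes = new byte[payload.remaining()];
        payload.duplicate().get(bytes);
        return deserializer.deserialize(topic, bytes);
    }

    // After: the call site hands the buffer over and lets the deserializer decide whether to copy.
    static <T> T parseAfter(SketchDeserializer<T> deserializer, String topic, ByteBuffer payload) {
        return payload == null ? null : deserializer.deserialize(topic, payload);
    }
}
```

With this shape, the decision to copy moves from the call site into the deserializer, so only implementations that need a byte[] pay for one.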

Compatibility, Deprecation, and Migration Plan

  • This proposal has no compatibility issues: it only adds the default method deserialize(String, Headers, ByteBuffer), which is compatible with existing Deserializer implementations.
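As a rough illustration of why existing implementations keep working, the sketch below uses a deserializer that only implements the byte[] overload and is still callable with a ByteBuffer through the default method. CompatDeserializer and LegacyLongDeserializer are hypothetical names, and the interface is a simplified stand-in for Deserializer with Headers omitted.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for Kafka's Deserializer, reduced to what the example needs.
interface CompatDeserializer<T> {
    T deserialize(String topic, byte[] data);

    // Same idea as the proposed default: copy the remaining bytes and delegate.
    default T deserialize(String topic, ByteBuffer data) {
        byte[] copy = new byte[data.remaining()];
        data.duplicate().get(copy);
        return deserialize(topic, copy);
    }
}

// A "legacy" implementation written as if the ByteBuffer overload did not exist.
final class LegacyLongDeserializer implements CompatDeserializer<Long> {
    @Override
    public Long deserialize(String topic, byte[] data) {
        if (data == null) {
            return null;
        }
        if (data.length != 8) {
            throw new IllegalArgumentException("expected 8 bytes, got " + data.length);
        }
        return ByteBuffer.wrap(data).getLong();
    }
}
```

Calling new LegacyLongDeserializer().deserialize(topic, buffer) goes through the default method and behaves as before, at the cost of the same copy that is made today.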

Rejected Alternatives

Another solution I considered is buffer pooling with a PoolArena, as Netty does, but that solution is more complicated.
