Status
Current state: Under Discussion
Discussion thread: https://lists.apache.org/thread/tbhmkf44jhjf8lqmo7w2whynbgttf1o6
PR: https://github.com/apache/kafka/pull/12545
Motivation
Currently, Fetcher#parseRecord(TopicPartition, RecordBatch, Record) deserializes keys and values via Deserializer#deserialize(String topic, Headers headers, byte[] data). To do so, it first calls Utils.toArray(ByteBuffer) to convert the ByteBuffer into a byte[], which incurs a memory allocation and a memory copy. We could instead deserialize directly from the ByteBuffer, avoiding that allocation and copy in some cases.
If we add a default method Deserializer#deserialize(String, Headers, ByteBuffer) and invoke it from Fetcher#parseRecord(TopicPartition, RecordBatch, Record), StringDeserializer and ByteBufferDeserializer can avoid the extra allocation and copy. User-defined Deserializers that override the new method gain the same benefit.
Public Interfaces
We propose adding the default method Deserializer#deserialize(String, Headers, ByteBuffer).
Class | Method |
---|---|
Deserializer | `default T deserialize(String topic, Headers headers, ByteBuffer data)` (new default method; falls back to the existing byte[] variant) |
ByteBufferDeserializer | `@Override deserialize(String topic, Headers headers, ByteBuffer data)` |
StringDeserializer | `@Override deserialize(String topic, Headers headers, ByteBuffer data)` |
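To make the shape of the proposal concrete, here is a minimal, self-contained sketch of the new default method and a StringDeserializer override. It is simplified for illustration: the Headers parameter is omitted, and the copying fallback stands in for Utils.toArray; the actual Kafka signatures include Headers.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Simplified sketch of the proposed API (Headers omitted for brevity).
interface Deserializer<T> {
    T deserialize(String topic, byte[] data);

    // Proposed default: fall back to the byte[] variant by copying the
    // buffer (as Utils.toArray does), so existing implementations keep
    // working unchanged.
    default T deserialize(String topic, ByteBuffer data) {
        if (data == null)
            return deserialize(topic, (byte[]) null);
        byte[] bytes = new byte[data.remaining()];
        data.duplicate().get(bytes);   // allocation + copy
        return deserialize(topic, bytes);
    }
}

// StringDeserializer can decode straight from the buffer, skipping the
// intermediate byte[] entirely.
class StringDeserializer implements Deserializer<String> {
    @Override
    public String deserialize(String topic, byte[] data) {
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    @Override
    public String deserialize(String topic, ByteBuffer data) {
        return data == null ? null
            : StandardCharsets.UTF_8.decode(data.duplicate()).toString();
    }
}

public class Demo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8));
        // Decodes directly from the ByteBuffer, no byte[] allocated.
        System.out.println(new StringDeserializer().deserialize("t", buf));  // hello
    }
}
```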
Proposed Changes
- Add default method deserialize(String, Headers, ByteBuffer) to Deserializer.
- Invoke Deserializer#deserialize(String, Headers, ByteBuffer) instead of Deserializer#deserialize(String, Headers, byte[]) in Fetcher#parseRecord(TopicPartition, RecordBatch, Record).
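The effect of the Fetcher-side change can be sketched as follows. The method names oldPath/newPath are hypothetical stand-ins for the two call paths, not Kafka code: the old path copies the record's value buffer into a byte[] (as Utils.toArray does) before deserializing, while the new path hands the ByteBuffer through, letting a buffer-aware deserializer return a zero-copy view.

```java
import java.nio.ByteBuffer;

public class ZeroCopyDemo {
    // Old path (simplified): copy the buffer into a byte[] first, as
    // Utils.toArray does, then wrap it again for the deserializer's result.
    static ByteBuffer oldPath(ByteBuffer value) {
        byte[] copy = new byte[value.remaining()];
        value.duplicate().get(copy);   // allocation + copy
        return ByteBuffer.wrap(copy);
    }

    // New path (simplified): pass the ByteBuffer through; a buffer-aware
    // deserializer can return a view with no allocation or copy.
    static ByteBuffer newPath(ByteBuffer value) {
        return value.slice();          // zero-copy view
    }

    public static void main(String[] args) {
        ByteBuffer record = ByteBuffer.wrap(new byte[] {1, 2, 3});
        // Both paths yield the same bytes; only the new path avoids copying.
        System.out.println(oldPath(record).equals(newPath(record)));  // true
    }
}
```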
Compatibility, Deprecation, and Migration Plan
This proposal introduces no compatibility issues: we only add the default method deserialize(String, Headers, ByteBuffer), which falls back to the existing byte[] method, so existing Deserializer implementations continue to work unchanged.
Rejected Alternatives
Another solution considered was a pooled-buffer allocator (a PoolArea, as in Netty), but that approach is considerably more complex.