...
PR: https://github.com/apache/kafka/pull/12685
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
JIRA:
Motivation
Currently, KafkaProducer#doSend(ProducerRecord, Callback) uses Serializer#serialize(String, Headers, T) to serialize the key and the value. This takes three steps: Serializer#serialize(String, Headers, T) converts T into a byte[], Utils#wrapNullable(byte[]) wraps that byte[] in a ByteBuffer, and DefaultRecord#writeTo(DataOutputStream, int, long, ByteBuffer, ByteBuffer, Header[]) writes the ByteBuffer into MemoryRecordsBuilder.
Why don't we add a serializeToByteBuffer(String, Headers, T) method to Serializer and call Serializer#serializeToByteBuffer(String, Headers, T) from KafkaProducer#doSend(ProducerRecord, Callback) instead? If T is a ByteBuffer, or is backed by one, the serializer can return that buffer directly, which avoids allocating the intermediate byte[] and copying the data into it.
Additionally, I plan to ultimately replace byte[] with ByteBuffer in Serializer.
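The difference between the two paths can be illustrated with a minimal, self-contained sketch. It does not use the real Kafka classes: SketchSerializer and SketchByteBufferSerializer are hypothetical stand-ins, headers are omitted, and the default method body is only an assumption about how such a method could fall back to the existing byte[] path.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for Kafka's Serializer<T>; headers are omitted.
interface SketchSerializer<T> {
    byte[] serialize(String topic, T data);

    // Assumed shape of the proposed default method: wrap the byte[] result,
    // so existing serializers keep working without changes.
    default ByteBuffer serializeToByteBuffer(String topic, T data) {
        byte[] bytes = serialize(topic, data);
        return bytes == null ? null : ByteBuffer.wrap(bytes);
    }
}

// Stand-in for a serializer whose T is already a ByteBuffer.
final class SketchByteBufferSerializer implements SketchSerializer<ByteBuffer> {
    @Override
    public byte[] serialize(String topic, ByteBuffer data) {
        if (data == null) return null;
        byte[] copy = new byte[data.remaining()]; // extra allocation
        data.duplicate().get(copy);               // extra copy
        return copy;
    }

    @Override
    public ByteBuffer serializeToByteBuffer(String topic, ByteBuffer data) {
        // Only a lightweight view is created; the backing array is shared.
        return data == null ? null : data.duplicate();
    }
}

final class SerializeSketch {
    public static void main(String[] args) {
        ByteBuffer payload = ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8));
        SketchByteBufferSerializer serializer = new SketchByteBufferSerializer();

        // Current path: T -> byte[] -> ByteBuffer.
        ByteBuffer viaBytes = ByteBuffer.wrap(serializer.serialize("topic", payload));
        // Proposed path: T -> ByteBuffer directly.
        ByteBuffer direct = serializer.serializeToByteBuffer("topic", payload);

        System.out.println("same content: " + viaBytes.equals(direct));
        System.out.println("direct shares backing array: " + (direct.array() == payload.array()));
        System.out.println("byte[] path shares backing array: " + (viaBytes.array() == payload.array()));
    }
}
```

In this sketch both paths produce buffers with equal content, but only the byte[] path allocates a new array. A serializer that does not override the default method behaves exactly as it does today.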
Public Interfaces
We propose adding the default methods Serializer#serializeToByteBuffer(String, T), Serializer#serializeToByteBuffer(String, Headers, T), and Partitioner#partition(String, Object, ByteBuffer, Object, ByteBuffer, Cluster):
...