...
void setSchema(String topic, String schema);
In the interface Consumer<K, V>, we are going to add a new method as below
byte[] pollBuffer(Duration timeout)No change is needed for Consumer<K, V> because KIP-712 already introduced 'fetch.raw.bytes' so that the ingestion consumer can fetch the byte buffer directly.
Proposed Changes
We propose adding Parquet as the encoder and optionally compressor in Kafka producer client. When this feature is enabled, Parquet is used to encode the batch records segment and optionally compress. Parquet has the encoding and compression in a columnar-oriented way.
We divide consumers into two categories. Minimum change is needed for messaging consumer for reading the Parquet segment, and a new consuming method is proposed to add for ingestion consumerconsumers.
Messaging Consumer
In this scenario, the application expects one or more records with each poll. When the Kafka consumer client encounters the Parquet format, it invokes the Parquet reader library to unwrap the segment into records. The required change is similar to the producer side, as discussed above.
...