...

                  void setSchema(String topic, String schema);

No change is needed for Consumer<K, V> because KIP-712 already introduced 'fetch.raw.bytes', which lets the ingestion consumer fetch the byte buffer directly.


Proposed Changes

We propose adding Parquet as an encoder, and optionally as a compressor, in the Kafka producer client. When this feature is enabled, the producer uses Parquet to encode each batch's record segment and, optionally, to compress it. Parquet performs both encoding and compression in a column-oriented way.
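To illustrate what "column-oriented" means here, the following minimal sketch regroups row-oriented records into one value list per field, so that similar values sit next to each other before encoding and compression. It is only an illustration: the class and method names are hypothetical, it assumes every record carries the same fields, and it does not reproduce Parquet's actual encoding or file layout.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Kafka or Parquet.
final class ColumnarPivotSketch {

    /**
     * Regroups row-oriented records into one list of values per field name.
     * Assumes every row contains the same set of fields.
     */
    static Map<String, List<Object>> toColumns(List<Map<String, Object>> rows) {
        Map<String, List<Object>> columns = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            for (Map.Entry<String, Object> field : row.entrySet()) {
                // Append this row's value to the column for that field.
                columns.computeIfAbsent(field.getKey(), k -> new ArrayList<>())
                       .add(field.getValue());
            }
        }
        return columns;
    }
}
```

A column that repeats the same value across many records in a batch then becomes a run of identical neighbors, which is the kind of input that columnar encodings are generally designed to exploit.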

We divide consumers into two categories: messaging consumers and ingestion consumers. A messaging consumer needs only a minimal change to read the Parquet segment, and a new consuming method is proposed for ingestion consumers.

Messaging Consumer

In this scenario, the application expects one or more records with each poll. When the Kafka consumer client encounters the Parquet format, it invokes the Parquet reader library to unwrap the segment into records. The required change is similar to the producer side, as discussed above.
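The messaging-consumer path described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the KIP's implementation: SegmentFormat, SegmentReader, and FetchedBatch are hypothetical names, and SegmentReader stands in for the Parquet reader library rather than mirroring its real API.

```java
import java.util.List;

// Hypothetical sketch; none of these types exist in the Kafka client.
final class SegmentUnwrapSketch {

    /** Hypothetical tag describing how a fetched batch segment is encoded. */
    enum SegmentFormat { PLAIN, PARQUET }

    /** Hypothetical stand-in for the Parquet reader library. */
    interface SegmentReader {
        List<byte[]> readRecords(byte[] segment);
    }

    /** Hypothetical fetched batch: a format tag plus its payload. */
    static final class FetchedBatch {
        final SegmentFormat format;
        final List<byte[]> plainRecords; // used when format is PLAIN
        final byte[] segment;            // used when format is PARQUET

        FetchedBatch(SegmentFormat format, List<byte[]> plainRecords, byte[] segment) {
            this.format = format;
            this.plainRecords = plainRecords;
            this.segment = segment;
        }
    }

    /**
     * Returns the records of a batch. A Parquet segment is handed to the
     * reader to be unwrapped; any other batch is passed through unchanged.
     */
    static List<byte[]> unwrap(FetchedBatch batch, SegmentReader parquetReader) {
        if (batch.format == SegmentFormat.PARQUET) {
            return parquetReader.readRecords(batch.segment);
        }
        return batch.plainRecords;
    }
}
```

Because the format check sits below the poll boundary, the application still receives one or more records per poll and does not need to know whether the batch was Parquet-encoded.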

...