...

The message header format for magic byte = 1 now looks like this:

1 byte magic

1 byte compression-attributes

4 byte CRC32 of the payload
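As an illustration, the three header fields listed above could be packed and unpacked as in the sketch below. The big-endian byte order, the helper names `encode_header` and `decode_header`, and the use of the attributes byte to carry the codec id are assumptions made for the example; they are not taken from this page.

```python
import struct
import zlib

MAGIC_WITH_ATTRIBUTES = 1

# Assumption: fields are written in big-endian (network) byte order.
# B = 1 byte magic, B = 1 byte compression-attributes, I = 4 byte CRC32.
HEADER = struct.Struct(">BBI")


def encode_header(payload: bytes, attributes: int = 0) -> bytes:
    """Build the 6-byte header for a payload (hypothetical helper)."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return HEADER.pack(MAGIC_WITH_ATTRIBUTES, attributes, crc)


def decode_header(header: bytes):
    """Return (magic, attributes, crc) from the first 6 bytes."""
    return HEADER.unpack(header[:HEADER.size])


header = encode_header(b"hello", attributes=1)
magic, attributes, crc = decode_header(header)
```

The header is always 6 bytes long, and the CRC covers only the payload, not the magic or attributes bytes.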

...

The data a consumer receives for a topic may contain both compressed and uncompressed messages. The consumer iterator transparently decompresses compressed data and returns only uncompressed messages. Offset maintenance in the consumer gets a little tricky. In the ZooKeeper consumer, the consumed offset is updated each time a message is returned, and this offset must be a valid fetch offset for failure recovery to work correctly. Since data is stored in compressed format on the broker, the only valid fetch offsets are compressed message boundaries. Hence, for compressed data, the consumed offset advances one compressed message at a time. As a side effect, if a consumer fails partway through a compressed message, the messages it had already returned from that compressed message may be delivered again after recovery. For uncompressed data, the consumed offset advances one message at a time.
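The offset behaviour can be made concrete with a small simulation. This is a sketch only: `LogEntry`, `consume`, and the offsets in `LOG` are hypothetical, and it assumes the consumed offset moves to the next boundary once the last message inside a compressed message has been returned.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class LogEntry:
    start: int            # fetch offset of this entry on the broker
    size: int             # stored size in bytes
    messages: List[str]   # one item if uncompressed, several if compressed


# Hypothetical log: one plain message, one compressed message holding
# three messages, then another plain message.
LOG = [
    LogEntry(0, 100, ["a"]),
    LogEntry(100, 300, ["b1", "b2", "b3"]),
    LogEntry(400, 50, ["c"]),
]


def consume(log: List[LogEntry], fetch_offset: int) -> Iterator[Tuple[str, int]]:
    """Yield (message, consumed_offset) starting from a valid fetch offset."""
    consumed = fetch_offset
    for entry in log:
        if entry.start < fetch_offset:
            continue  # entry lies before the fetch offset
        last = len(entry.messages) - 1
        for i, message in enumerate(entry.messages):
            if i == last:
                # Only a stored-entry boundary is a valid fetch offset.
                consumed = entry.start + entry.size
            yield message, consumed


first_run = list(consume(LOG, 0))
# Suppose the consumer fails after returning "a", "b1" and "b2".
offset_at_failure = first_run[2][1]
replayed = [m for m, _ in consume(LOG, offset_at_failure)]
```

In this run the consumed offset is still 100 after "b2" is returned, so a restart fetches the whole compressed message again and "b1" and "b2" are seen twice.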

Backwards compatibility

A version 0.7 broker and consumer can understand messages with magic byte values 0 and 1, so brokers and consumers are backwards compatible.
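One way a reader could handle both magic values is to branch on the first byte, as sketched below. The sketch assumes that a magic byte 0 message has no attributes byte (1 byte magic followed directly by the 4 byte CRC32) and that fields are big-endian; neither detail is stated on this page, and `parse_message` is a hypothetical helper.

```python
import struct
import zlib


def parse_message(buf: bytes):
    """Return (magic, attributes, payload) for magic byte 0 or 1."""
    magic = buf[0]
    if magic == 0:
        # Assumption: the older format carries no attributes byte.
        attributes = 0
        header_len = 5
    elif magic == 1:
        attributes = buf[1]
        header_len = 6
    else:
        raise ValueError(f"unknown magic byte: {magic}")

    # The CRC32 occupies the last 4 bytes of the header in both layouts.
    (crc,) = struct.unpack(">I", buf[header_len - 4:header_len])
    payload = buf[header_len:]
    if zlib.crc32(payload) & 0xFFFFFFFF != crc:
        raise ValueError("CRC32 mismatch")
    return magic, attributes, payload


crc_bytes = struct.pack(">I", zlib.crc32(b"hi") & 0xFFFFFFFF)
old_message = bytes([0]) + crc_bytes + b"hi"
new_message = bytes([1, 1]) + crc_bytes + b"hi"
```

Treating a missing attributes byte as 0 means older messages are simply handled as uncompressed by the rest of the reader.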

Configuration changes

There are 2 new config parameters on the producer side:

| Config parameter | Description | Default |
| compression.codec | Controls the compression codec to be used by the producer (0: no compression, 1: GZIP compression, 2: Snappy compression, 3: LZ4 compression). | 0 |
| compressed.topics | Comma-separated list of topics for which compression should be enabled. Has no effect when compression.codec = 0. | empty |

 

| | compressed.topics=empty | compressed.topics="topicA,topicB" |
| compression.codec=0 | All topics are uncompressed since compression is disabled | All topics are uncompressed since compression is disabled |
| compression.codec=1 | All topics are compressed | Only the topics topicA and topicB are compressed |
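The combinations above reduce to a small decision rule, sketched here. The function names `parse_topics` and `is_compressed` are hypothetical and are not part of the producer API; the sketch also assumes that any non-zero codec behaves like codec 1 with respect to the topic list.

```python
def parse_topics(value: str) -> set:
    """Split a comma-separated topic list, ignoring blanks and whitespace."""
    return {topic.strip() for topic in value.split(",") if topic.strip()}


def is_compressed(topic: str, codec: int, compressed_topics: str = "") -> bool:
    """Decide whether messages for a topic would be compressed."""
    if codec == 0:
        return False  # compression disabled; the topic list is ignored
    topics = parse_topics(compressed_topics)
    # An empty list means every topic is compressed.
    return not topics or topic in topics


matrix = {
    (codec, topics): is_compressed("topicC", codec, topics)
    for codec in (0, 1)
    for topics in ("", "topicA,topicB")
}
```

Here `matrix` records the outcome for a topic that is not in the list: it is compressed only when the codec is non-zero and the list is empty.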

...

...

Compression codecs supported

Currently, the GZIP, Snappy and LZ4 compression codecs are supported.