Status

Current state: Accepted

Discussion thread: here

JIRA: here

GitHub PR: PR 1212

Released: 0.10.0.0

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Kafka's initial LZ4 compression implementation is not interoperable. It does not follow the standard LZ4 framing specification (see https://cyan4973.github.io/lz4/lz4_Frame_format.html). This makes it difficult for third-party clients to support LZ4 compression using off-the-shelf libraries. This KIP proposes to fix Kafka's LZ4 handling so that it conforms to the LZ4F specification and enables clients to interoperate on LZ4-compressed messages.

Specifically, KAFKA-1493 attempted to implement the interoperable LZ4F framing specification. There is a bug, however, that causes the frame header checksum to be calculated incorrectly. Fixing this single byte (referred to as HC in the specification) is the goal of this KIP.
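For reference, the LZ4 frame format defines HC as the second byte of the xxHash32 digest (seed 0) of the frame descriptor, that is, the bytes that follow the 4-byte magic number up to but not including HC itself. The following is a minimal, self-contained sketch of that computation. The class and method names are hypothetical, the FLG/BD values are sample bytes only, and this is not the KafkaLZ4* code.

```java
// Illustrative only: a plain-Java XXH32 plus the LZ4F header checksum (HC).
// Class and method names are hypothetical, not part of Kafka.
final class Lz4HeaderChecksum {
    private static final int PRIME1 = 0x9E3779B1;
    private static final int PRIME2 = 0x85EBCA77;
    private static final int PRIME3 = 0xC2B2AE3D;
    private static final int PRIME4 = 0x27D4EB2F;
    private static final int PRIME5 = 0x165667B1;

    // Reads a 32-bit little-endian integer starting at index i.
    private static int readIntLE(byte[] b, int i) {
        return (b[i] & 0xFF) | (b[i + 1] & 0xFF) << 8
             | (b[i + 2] & 0xFF) << 16 | (b[i + 3] & 0xFF) << 24;
    }

    // One XXH32 accumulator round.
    private static int round(int acc, int input) {
        return Integer.rotateLeft(acc + input * PRIME2, 13) * PRIME1;
    }

    // XXH32 over buf[off, off + len) with the given seed.
    static int xxh32(byte[] buf, int off, int len, int seed) {
        int p = off;
        int end = off + len;
        int h;
        if (len >= 16) {
            int v1 = seed + PRIME1 + PRIME2;
            int v2 = seed + PRIME2;
            int v3 = seed;
            int v4 = seed - PRIME1;
            while (p <= end - 16) {
                v1 = round(v1, readIntLE(buf, p));
                v2 = round(v2, readIntLE(buf, p + 4));
                v3 = round(v3, readIntLE(buf, p + 8));
                v4 = round(v4, readIntLE(buf, p + 12));
                p += 16;
            }
            h = Integer.rotateLeft(v1, 1) + Integer.rotateLeft(v2, 7)
              + Integer.rotateLeft(v3, 12) + Integer.rotateLeft(v4, 18);
        } else {
            h = seed + PRIME5;
        }
        h += len;
        // Remaining 4-byte words, then remaining single bytes.
        while (p + 4 <= end) {
            h = Integer.rotateLeft(h + readIntLE(buf, p) * PRIME3, 17) * PRIME4;
            p += 4;
        }
        while (p < end) {
            h = Integer.rotateLeft(h + (buf[p] & 0xFF) * PRIME5, 11) * PRIME1;
            p++;
        }
        // Final avalanche.
        h ^= h >>> 15;
        h *= PRIME2;
        h ^= h >>> 13;
        h *= PRIME3;
        h ^= h >>> 16;
        return h;
    }

    // HC = second byte of XXH32(descriptor, seed 0).
    static int headerChecksum(byte[] frame, int descriptorOffset, int descriptorLength) {
        return (xxh32(frame, descriptorOffset, descriptorLength, 0) >>> 8) & 0xFF;
    }

    public static void main(String[] args) {
        // Magic number 0x184D2204 (little-endian), then sample FLG and BD bytes.
        byte[] header = {0x04, 0x22, 0x4D, 0x18, 0x60, 0x40};
        // Offset 4 skips the magic number; only FLG and BD are hashed.
        int hc = headerChecksum(header, 4, 2);
        System.out.printf("HC = 0x%02X%n", hc);
    }
}
```

Passing offset 4 keeps the magic number out of the hash, so only the descriptor bytes contribute to HC, as the specification requires. A frame that carries the optional content-size field would pass a longer descriptor length.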

Public Interfaces

Changes to public interfaces:

None. Interface changes are for classes not currently marked as public in javadoc.

Proposed Changes

Old 0.8.2/0.9 clients (current behavior):

Old 0.8.2/0.9 brokers (current behavior):

New 0.10 clients (proposed behavior):

New 0.10 brokers (proposed behavior):

KafkaLZ4* code:

Compatibility, Deprecation, and Migration Plan

Rejected Alternatives

Alternative #1: Create a new compression type, "LZ4F". Rejected because this is really just a bugfix, not a new compression type. The number of compression types is limited by the number of bits available in the message attributes byte. We currently use 2 bits to cover the 4 compression types (None, Gzip, Snappy, LZ4). Adding a second type for a "fixed" LZ4 would require taking a third bit from the attributes byte. Further, explaining the difference between the LZ4 and LZ4F compression types to users is likely to be difficult.
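The bit budget behind this argument can be sketched as follows. This is an illustration only, assuming the 2-bit layout described above; the class, the constant names, and the mask value are hypothetical and are not taken from Kafka's source.

```java
// Illustrative only: models a 2-bit compression codec field in an attributes byte.
// Names and the mask value are assumptions based on the text, not Kafka's definitions.
final class CompressionAttributes {
    static final int CODEC_MASK = 0x03; // assumed: low 2 bits carry the codec id

    static final int NONE = 0;
    static final int GZIP = 1;
    static final int SNAPPY = 2;
    static final int LZ4 = 3;

    // Extracts the codec id from an attributes byte.
    static int codecOf(byte attributes) {
        return attributes & CODEC_MASK;
    }

    // True if the codec id can be stored without widening the mask.
    static boolean fitsInMask(int codecId) {
        return (codecId & CODEC_MASK) == codecId;
    }

    public static void main(String[] args) {
        System.out.println(fitsInMask(LZ4)); // the four existing ids fill the 2 bits
        System.out.println(fitsInMask(4));   // a fifth id (a separate "LZ4F") would not fit
    }
}
```

With two bits, the ids 0 through 3 are all taken, so a fifth codec id would need the mask widened to a third bit.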