...

(Above: Comparison with the other compression codecs supported by Kafka.)

As of August 2018, the draft implementation uses the Java binding for ZStandard 1.3.5.

Accompanying Issues

However, supporting ZStandard is not just a matter of adding a new compression codec; it introduces several related issues. We need to address those issues first:

...

Another issue worth raising is the dictionary feature. ZStandard offers a training mode, which yields a dictionary for compression and decompression. It dramatically improves the compression ratio of small, repetitive input (e.g., semi-structured JSON), which fits Kafka's use case well. (For a real-world benchmark, see here.) Although the details of how to adapt this feature to Kafka (example) should be discussed in a separate KIP, we need to leave room for it.
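To illustrate why a shared dictionary helps with small, repetitive messages, here is a minimal sketch. It does not use ZStandard or zstd-jni: it swaps in the preset-dictionary support of java.util.zip.Deflater from the JDK, and the dictionary is written by hand rather than produced by a training mode. The class name, the sample message and the dictionary contents are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration: DEFLATE with a preset dictionary, standing in
// for ZStandard's dictionary feature (which is trained, not hand-written).
public class DictionaryCompressionSketch {

    // Compress input; if dict is non-null, prime the compressor with it.
    public static byte[] compress(byte[] input, byte[] dict) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        if (dict != null) {
            deflater.setDictionary(dict);
        }
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress; the same dictionary must be supplied if one was used.
    public static byte[] decompress(byte[] compressed, byte[] dict, int originalLength)
            throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] out = new byte[originalLength];
        int total = 0;
        while (!inflater.finished() && total < out.length) {
            int n = inflater.inflate(out, total, out.length - total);
            if (n == 0) {
                if (inflater.needsDictionary()) {
                    inflater.setDictionary(dict);
                } else if (inflater.needsInput()) {
                    break; // truncated input; give up
                }
            }
            total += n;
        }
        inflater.end();
        return Arrays.copyOf(out, total);
    }

    public static void main(String[] args) throws DataFormatException {
        // A small semi-structured message, as a Kafka record value might be.
        byte[] message = "{\"user_id\":12345,\"event\":\"page_view\",\"url\":\"/home\"}"
                .getBytes(StandardCharsets.UTF_8);
        // Hand-written dictionary holding the parts shared by such messages.
        byte[] dict = "{\"user_id\":0,\"event\":\"page_view\",\"url\":\"/\"}"
                .getBytes(StandardCharsets.UTF_8);

        byte[] plain = compress(message, null);
        byte[] withDict = compress(message, dict);
        byte[] restored = decompress(withDict, dict, message.length);

        System.out.println("original bytes:        " + message.length);
        System.out.println("compressed, no dict:   " + plain.length);
        System.out.println("compressed, with dict: " + withDict.length);
        System.out.println("round trip ok:         " + Arrays.equals(message, restored));
    }
}
```

The point carried over to ZStandard is only the general one: both sides must hold the same dictionary, which is why the way a dictionary would be distributed in Kafka needs its own discussion.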

License

We can use zstd and its Java binding, zstd-jni, without any problem, but we need to include their licenses: the BSD and BSD 2-Clause licenses, respectively. Neither is on the list of prohibited licenses.

What we need to do is attach the licenses of the dependencies. A recent update to Apache Spark shows how to approach this problem. They use:

  • 'LICENSE' file: License of the project itself (i.e., Apache License) and the list of source dependencies and their licenses.
  • 'LICENSE-binary' file: The list of binary dependencies and their licenses.
  • 'license' directory: Contains the license files of source dependencies.
  • 'license-binary' directory: Contains the license files of binary dependencies.

Following this approach would be good, but we can take a slightly different approach, since Kafka does not have as many dependencies as Spark. (As of August 2018, the only dependency whose license we need to include is jersey.)

Public Interfaces

This feature requires modifications to both the configuration options and the binary log format.

...