Status
Current state: Accepted
Discussion thread: here
JIRA: KAFKA-7632
...
The table below shows the valid range of compression.level for each compression.type. (Note: snappy is marked as unavailable because it does not support compression levels.) The valid range and default value of the compression level are determined entirely by the compression library, so they may change in the future.
Compression Codec | Availability | Valid Range | Default |
---|---|---|---|
gzip | Yes | 1 (Deflater.BEST_SPEED) ~ 9 (Deflater.BEST_COMPRESSION) | 6 |
snappy | No | - | - |
lz4 | Yes | 1 ~ 17 | 9 |
zstd | Yes | -131072 ~ 22 | 3 |
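
To illustrate how the ranges in the table could be checked, here is a minimal Python sketch. It is not Kafka code: `LEVEL_RANGES`, `default_level`, and `validate_level` are hypothetical names, and the bounds are copied from the table above rather than queried from the compression libraries, so they would need updating if a library changes its range.

```python
# Hypothetical sketch: (min, max, default) per codec, copied from the table above.
# snappy is intentionally absent because it does not support a compression level.
LEVEL_RANGES = {
    "gzip": (1, 9, 6),
    "lz4": (1, 17, 9),
    "zstd": (-131072, 22, 3),
}


def default_level(codec):
    """Return the default level for a codec that supports levels."""
    if codec not in LEVEL_RANGES:
        raise ValueError(f"{codec} does not support a compression level")
    return LEVEL_RANGES[codec][2]


def validate_level(codec, level):
    """Return level unchanged if it is valid for codec, else raise ValueError."""
    if codec not in LEVEL_RANGES:
        raise ValueError(f"{codec} does not support a compression level")
    low, high, _ = LEVEL_RANGES[codec]
    if not low <= level <= high:
        raise ValueError(f"{codec} level must be in [{low}, {high}], got {level}")
    return level
```

With these assumptions, `validate_level("zstd", -5)` passes, while `validate_level("gzip", 10)` and `validate_level("snappy", 1)` both raise `ValueError`.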
Proposed Changes
This option impacts the following processes:
...
Apache Kafka 2.7.0, GraalVM Java 8 (21.1.0), replication factor = 3.
Result
codec | level | produced messages / sec | latency (ms) | size (bytes) | description |
---|---|---|---|---|---|
none | | 2,739.50 | 205.34 | 5,659,454,754 | |
gzip | 1 | 1,122.96 | 1,230.22 | 1,787,505,238 | min. level |
gzip | 6 | 717.71 | 2,041.24 | 1,644,280,629 | default level |
gzip | 9 | 608.54 | 2,413.66 | 1,643,517,758 | max. level |
lz4 | 1 | 1,694.69 | 603.46 | 2,211,346,795 | min. level |
lz4 | 9 | 1,199.93 | 878.85 | 2,184,022,257 | default level |
lz4 | 17 | 495.34 | 2,110.55 | 2,178,643,665 | max. level |
zstd | -5 | 7,653.45 | 156.88 | 1,997,500,892 | experimental level |
zstd | 1 | 6,317.52 | 68.55 | 1,521,783,958 | |
zstd | 3 | 4,760.54 | 286.79 | 1,494,620,615 | default level |
zstd | 12 | 988.95 | 863.89 | 1,458,150,768 | |
zstd | 18 | 85.20 | 2,017.92 | 1,492,015,424 | |
It shows the following:
- The codec is the main factor that determines the compressed size; the compression level has little impact on it. The largest improvement is gzip/1 vs. gzip/9 (8%), and the smallest is lz4/1 vs. lz4/17 (1.5%).
- Except for zstd/-5, lowering the compression level increases messages/sec and decreases latency. In particular, zstd/1 produces 32.7% more messages per second than zstd/3 (the current default), and gzip/1 produces 56.4% more than gzip/6 (the current default).
- For every compression codec, compressing with the minimum level (i.e., a speed-first strategy) resulted in the best messages/second rate.
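
The percentages quoted above can be re-derived from the result table. The following Python sketch is only a worked check of that arithmetic; the helper names `percent_smaller` and `percent_more` are hypothetical, and the inputs are the table values copied verbatim.

```python
def percent_smaller(baseline, other):
    """How much smaller `other` is than `baseline`, in percent."""
    return (baseline - other) / baseline * 100


def percent_more(baseline, other):
    """How much larger `other` is than `baseline`, in percent."""
    return (other / baseline - 1) * 100


# Compressed sizes in bytes, keyed by (codec, level), from the table above.
size = {
    ("gzip", 1): 1_787_505_238,
    ("gzip", 9): 1_643_517_758,
    ("lz4", 1): 2_211_346_795,
    ("lz4", 17): 2_178_643_665,
}

# Produced messages per second, keyed by (codec, level), from the table above.
rate = {
    ("gzip", 1): 1122.96,
    ("gzip", 6): 717.71,
    ("zstd", 1): 6317.52,
    ("zstd", 3): 4760.54,
}

gzip_size_gain = percent_smaller(size[("gzip", 1)], size[("gzip", 9)])
lz4_size_gain = percent_smaller(size[("lz4", 1)], size[("lz4", 17)])
zstd_rate_gain = percent_more(rate[("zstd", 3)], rate[("zstd", 1)])
gzip_rate_gain = percent_more(rate[("gzip", 6)], rate[("gzip", 1)])

print(f"gzip/1 -> gzip/9 size reduction: {gzip_size_gain:.2f}%")
print(f"lz4/1 -> lz4/17 size reduction: {lz4_size_gain:.2f}%")
print(f"zstd/1 vs zstd/3 messages/sec: +{zstd_rate_gain:.2f}%")
print(f"gzip/1 vs gzip/6 messages/sec: +{gzip_rate_gain:.2f}%")
```

The computed values land within a tenth of a percentage point of the figures quoted in the list; small differences come down to rounding.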
...
```shell
INCLUDE_TEST_JARS=true bin/kafka-run-class.sh kafka.TestLinearWriteSpeed --bytes 8192 --size 8192 --message-size 4096 --files 1 --compression {compression-codec} --level {compression-level} --log
```
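
To run the command above for every codec and level combination, the placeholders can be filled in programmatically. The Python sketch below only builds the command strings and does not execute them; `LEVELS` and `build_command` are hypothetical names, the levels are the ones listed in the result tables of this document, and the uncompressed baseline is left out because the source does not show how it was invoked.

```python
# Hypothetical sketch: levels per codec as they appear in the result tables.
LEVELS = {
    "gzip": [1, 6, 9],
    "lz4": [1, 9, 17],
    "zstd": [-5, 1, 3, 12, 18],
}


def build_command(codec, level):
    """Fill the {compression-codec} and {compression-level} placeholders."""
    args = [
        "INCLUDE_TEST_JARS=true",
        "bin/kafka-run-class.sh",
        "kafka.TestLinearWriteSpeed",
        "--bytes", "8192",
        "--size", "8192",
        "--message-size", "4096",
        "--files", "1",
        "--compression", codec,
        "--level", str(level),
        "--log",
    ]
    return " ".join(args)


# One command line per (codec, level) pair, in table order.
commands = [
    build_command(codec, level)
    for codec, levels in LEVELS.items()
    for level in levels
]

for command in commands:
    print(command)
```

Each printed line can be pasted into a shell from the Kafka source directory, or passed to a job runner of your choice.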
Result
codec | level | write speed (MB/sec) | description |
---|---|---|---|
none | | 19678.841 | |
gzip | 1 | 22007.042 | min. level |
gzip | 6 | 18425.707 | default level |
gzip | 9 | 19148.284 | max. level |
lz4 | 1 | 22776.967 | min. level |
lz4 | 9 | 20613.456 | default level |
lz4 | 17 | 19879.134 | max. level |
zstd | -5 | 19531.25 | experimental level |
zstd | 1 | 22910.557 | |
zstd | 3 | 19531.25 | default level |
zstd | 12 | 17477.628 | |
zstd | 18 | 21229.619 | |
The results were similar. In general, compression level 1 showed the best write speed for each codec; the exception is zstd/-5, which is a lower level than zstd/1 but wrote more slowly.
...