Currently, KafkaProducer estimates the compressed message size from the uncompressed message size. The estimate is based on a weighted average, over a sliding window, of the compression ratios of the most recent batches for each compression type. The formula is as follows:

Assume COMPRESSION_RATIO_N stands for the compression ratio of the Nth batch. The estimated compression ratio for the (N+1)th batch is:

ESTIMATED_COMPRESSION_RATIO = Σ(COMPRESSION_RATIO_N * DAMPING_FACTOR^(N - 1) * (1 - DAMPING_FACTOR)) + INITIAL_COMPRESSION_RATIO * DAMPING_FACTOR^N

When the (N+1)th batch is generated, this estimated compression ratio will be used (multiplied by a factor of 1.05 for contingency) to estimate the compressed size from the uncompressed size. When the estimated compressed size reaches the batch.size configuration, the batch will be closed and sent to the brokers.
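
The estimation and the batch-closing check described above can be sketched as follows. This is a minimal illustration, not the KafkaProducer code: it assumes the sum is read as an exponentially weighted moving average in which the most recent batch carries the largest weight, and the function names and the sample damping factor, initial ratio, and sizes are hypothetical values chosen for the example. Only the 1.05 contingency factor and the comparison against batch.size come from the text.

```python
def estimate_compression_ratio(observed_ratios, damping_factor, initial_ratio):
    """Estimate the ratio for the next batch from past batch ratios.

    Assumption: the weighted sum is evaluated recursively, oldest batch
    first, so each new observation pulls the estimate toward itself by
    (1 - damping_factor) and the initial ratio decays by damping_factor
    once per observed batch.
    """
    estimate = initial_ratio
    for ratio in observed_ratios:
        estimate = estimate * damping_factor + ratio * (1 - damping_factor)
    return estimate


def estimated_compressed_size(uncompressed_bytes, estimated_ratio, contingency=1.05):
    """Scale the uncompressed size by the estimated ratio and the 1.05 contingency factor."""
    return uncompressed_bytes * estimated_ratio * contingency


def batch_is_full(uncompressed_bytes, estimated_ratio, batch_size):
    """Return True once the estimated compressed size reaches batch.size."""
    return estimated_compressed_size(uncompressed_bytes, estimated_ratio) >= batch_size


if __name__ == "__main__":
    # Hypothetical inputs: two observed batches, damping factor 0.5, initial ratio 1.0.
    ratio = estimate_compression_ratio([0.5, 0.4], damping_factor=0.5, initial_ratio=1.0)
    print(ratio)
    print(batch_is_full(17000, 0.95, batch_size=16384))
```

Under this reading, a damping factor close to 1 makes the estimate react slowly to new batches, while a value close to 0 makes it track the most recent batch almost exactly.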
