...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

This work was done in collaboration with Edoardo Comar.

Motivation

With KIP-74, we now have a good way to limit the size of Fetch responses, but it may still be difficult for users to control overall memory usage since the consumer sends fetches in parallel to all the brokers that own partitions it is subscribed to. Currently we have:

- fetch.max.bytes: This controls the maximum amount of data the broker will return for one fetch response

- max.partition.fetch.bytes: This controls the maximum amount of data per partition the broker will return

Neither of these settings takes into account that the consumer sends requests to multiple brokers in parallel, so in practice the memory usage is, as stated in KIP-74: min(num_brokers * fetch.max.bytes, max.partition.fetch.bytes * num_partitions)
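
For example (with purely illustrative numbers): a consumer subscribed to 200 partitions spread across 10 brokers, with fetch.max.bytes = 50 MB and max.partition.fetch.bytes = 1 MB, may end up buffering up to min(10 * 50 MB, 200 * 1 MB) = 200 MB of fetched data at once, even though each individual fetch response is capped at 50 MB.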

 

To give users simpler control, it makes sense to add a new setting that properly limits the memory used by Fetch responses in the consumer, in a similar fashion to what we already have on the producer.

...

The following option will be added for consumers to configure (in ConsumerConfig.java):

  1. buffer.memory: Type: Long, Priority: High, Default: 100MB

    The total bytes of memory the consumer can use to buffer records received from the server and waiting to be processed (decompressed and deserialized).

    This setting differs from the total memory the consumer will use because some additional memory will be used for decompression (if compression is enabled) and deserialization, as well as for maintaining in-flight requests.

    Note that this setting must be at least as big as fetch.max.bytes.

In addition, we will lower the priority of max.partition.fetch.bytes to Low.
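
To illustrate how the new setting would be configured, here is a minimal sketch: the buffer.memory property name follows this proposal, while the bootstrap servers, group id and byte values are arbitrary placeholders, not recommendations.

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class BufferMemoryExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");
            props.put("group.id", "example-group");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            // Proposed setting: cap the memory used to buffer fetched records at 64 MB.
            props.put("buffer.memory", 64L * 1024 * 1024);

            // buffer.memory must be at least as large as fetch.max.bytes (32 MB here).
            props.put("fetch.max.bytes", 32 * 1024 * 1024);

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // Subscribe and poll as usual; under this proposal, records fetched
                // but not yet processed would be bounded by buffer.memory.
            }
        }
    }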

Proposed Changes

This KIP reuses the MemoryPool interface from KIP-72.
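
For context, a minimal sketch of how a pool-backed allocation might gate the reading of a fetch response: the tryAllocate/release methods are the ones defined by the KIP-72 MemoryPool interface, but the class and helper methods below are purely illustrative and not the actual consumer network code.

    import java.nio.ByteBuffer;
    import org.apache.kafka.common.memory.MemoryPool;

    public class PooledFetchSketch {

        // Illustrative helper: ask the pool for memory to hold a fetch response.
        // A null return means buffer.memory is currently exhausted, so the consumer
        // would delay reading that broker's response and retry on a later poll.
        static ByteBuffer allocateForResponse(MemoryPool pool, int responseSizeBytes) {
            return pool.tryAllocate(responseSizeBytes);
        }

        // Illustrative helper: once the records have been decompressed, deserialized
        // and handed to the application, the buffer is returned to the pool, freeing
        // that memory for subsequent fetch responses.
        static void doneWithResponse(MemoryPool pool, ByteBuffer buffer) {
            pool.release(buffer);
        }
    }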

...