Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

This new mode (name is enable.parallel.rebalance) will by default not be used by KafkaConsumer (e.g. will be by default equal to false). If the user chooses, he or she can activate this mode by setting the config's value to true. Note that when using this mode, offsets would no longer be given to the user in correct sequence (or order), but rather in a more haphazard manner. The mentality of this mode could be described as "give me all the offsets as fast as you can, I don't care about order." To illustrate, offsets could be sent to the user in this order:  1,2,3,6,4,7,5,8. Please note that with the enabling of this mode, there will be an increased burden on a computer's processor cores because an extra thread will be instantiated (per current design).

Proposed Changes

When rebalancing, instead of allowing the current consumer (consumer1) to restart from the last committed offset (70), we should allow it to start processing from the latest offset (120) that could be polled at the time of the consumer's recovery (assuming that the log has grown within the time that the process was down). Meanwhile, a new KafkaConsumer (consumer2) will be created to start processing from the last committed offset (70) and run in concurrency with consumer1. Please note that consumer2 will not terminate once it has reached offset 120, as it will continue running. If consumer1 crashes again, the respective lag that results will be taken up by consumer2 for processing. We could think of consumer2 as the safety net which catches any offsets which had slipped by consumer1 when it crashed.

...