...
When the user polls() for new offsets, if consumer2 has not finished, then only offsets from buffer2 will be sent to the user. After each poll(), buffer2 will be emptied of its contents after its records have been returned to the user to free up memory. (Note that if some offsets in buffer2 has not been committed yet, then those offsets will still be kept.) However, if consumer2 has terminated, then all offsets that has accumulated in buffer1 will also be sent (with both buffer2 and buffer1 being discarded, now that both are no longer used). In this manner, we could also guarantee ordering for the user.
te
Corner Cases
There is a possibility, however remote, that there could be a chained failure. In this case, either consumer1 or consumer2 will fail. If consumer1 were to fail, we could simply repeat our previous policy, and instantiate another thread to process the intervening lag. However, if consumer2 were to fail, no new processes will be created. This is because the offset range that consumer2 is processing is bounded, so this means that consumer2 does not have to deal with a continuous influx of new records – thus no "lag" technically to speak of. Note that if consumer1 crashes N times, N child processes will be created. Checkpointing will write the progress from these N child processes into N distinct logs, which will merge with one another once their respective process terminates.
Compatibility, Deprecation, and Migration Plan
...
If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.g