If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.


Second part

Add parallelism on the produce path. Today, producers rely on custom partitioners to choose which partition to write to. We could also give the producer a key range, so that a batch is split by key and the different key ranges are written concurrently. The starting offset of each split phantom partition will be strictly larger than the log end offset at the time of the split, and producer access will be temporarily restricted until the split is finished.
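The key-range idea above can be sketched roughly as follows. This is only an illustration, not part of the proposal: the function names, the CRC32 hash, the 16-bit hash space, and the `boundaries` parameter are all assumptions made for the example.

```python
import zlib
from bisect import bisect_right
from collections import defaultdict


def key_hash(key: bytes, space: int = 1 << 16) -> int:
    """Map a record key into a fixed hash space (assumed: CRC32 modulo `space`)."""
    return zlib.crc32(key) % space


def split_batch_by_key_range(records, boundaries, space=1 << 16):
    """Group (key, value) records into sub-batches, one per key range.

    `boundaries` is a sorted list of split points in the hash space.
    With boundaries [b1, b2] the ranges are [0, b1), [b1, b2), [b2, space).
    Each sub-batch could then be written concurrently by the producer.
    """
    sub_batches = defaultdict(list)
    for key, value in records:
        range_index = bisect_right(boundaries, key_hash(key, space))
        sub_batches[range_index].append((key, value))
    return dict(sub_batches)
```

Records with the same key always hash to the same range, so per-key ordering is preserved within a sub-batch even when the sub-batches themselves are written in parallel.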

To merge a split, one first has to move the partition leaders so that they are co-located on the same broker. Any produce request going to the primary partition's broker will then be written to the primary partition log instead of a fan-out partition log. Each fan-out partition also has a consumed offset, which indicates whether it still holds data that has not been consumed. Either way, we will advance the partition offset with this batch while adding a range commit to the offset log.
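A toy model of the merge routing described above might look like the following. It is a sketch under stated assumptions: `PartitionLog`, `SplitPartition`, and the `merging` flag are hypothetical names, logs are plain in-memory lists, and the range commit to the offset log is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    records: list = field(default_factory=list)
    consumed_offset: int = 0  # next offset a consumer would read (assumed meaning)

    @property
    def log_end_offset(self) -> int:
        return len(self.records)

    def has_unconsumed_data(self) -> bool:
        return self.consumed_offset < self.log_end_offset


@dataclass
class SplitPartition:
    primary: PartitionLog
    fan_outs: list
    merging: bool = False  # set once leaders are co-located and the merge starts

    def append(self, fan_out_index: int, record) -> PartitionLog:
        """Route a write: to the primary log while merging, else to a fan-out log."""
        target = self.primary if self.merging else self.fan_outs[fan_out_index]
        target.records.append(record)
        return target

    def merge_complete(self) -> bool:
        """In this model, the merge is done once every fan-out log is fully consumed."""
        return self.merging and not any(f.has_unconsumed_data() for f in self.fan_outs)
```

In this model a fan-out log stops receiving writes as soon as the merge begins, but it has to be retained until its consumed offset reaches its log end offset.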