Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

So consider a 5 node cluster, with a leader quota of 1MB/s, and a window of 10s, on a GbE network. The leader's throttle dominates, so the largest permissible replica.fetch.request.max.bytes would be 1MB/s * 10s = 10MB. Note that this calculation is independent of the number of brokers. However if we had a larger cluster, of 500 nodes, the network on the follower could become the bottleneck. Thus we would need to keep replica.fetch.request.max.bytes less than total-window-size * NetworkSpeed / #brokers =  10s * 100MB/s / 500 =  2MB.

Rejected Alternatives

There appear to be two sensible approaches to this problem: (1) omit partitions from fetch requests (follower) / fetch responses (leader) when they exceed their quota (2) delay them, as the existing quota mechanism does, using separate fetchers. Both appear to be valid approaches with slightly different design tradeoffs. The later was chosen to be more inline with the existing quota implementation. 

We also considered a more pessimistic approach which quota's the follower's fetch request, then applies an adjustment when the response returnsWe considered using the existing replica fetcher threads, implementing the throttle by restricting the number of bytes requested based on a quota. This mechanism has some distinct advantages, most notably it is conservative, meaning the throttle value will never be exceeded. However, whilst this approach should work, it requires an adjustment to be applied to the quota when the follower receives responses. This adjustment process is relatively complex adds some complexity when compared to the approach suggested herethe optimistic approaches. Thus this proposal was rejected .(This

...

is discussed in full here).