...

Another idea considered was to fetch the new leader on the client using the usual Metadata RPC call, once a produce or fetch request fails with NOT_LEADER_OR_FOLLOWER or FENCED_LEADER_EPOCH. To save time on the Produce path, the client would skip the retry delay after a failed request and instead retry immediately, as soon as a new leader is available for the partition on the client. This was rejected because a single metadata call can be slow, and there can be metadata propagation delays, so an immediate retry on the Produce path won't always be fruitful. Consider the total time taken on the produce path when the leader changes:

Total time for the alternative = Produce RPC (client to old leader) + time taken to refresh metadata to get the new leader + Produce RPC (client to new leader)
Total time for the proposed changes = Produce RPC (client to old leader, response has new leader) + Produce RPC (client to new leader)

The alternative clearly has an extra component: the time taken to refresh metadata to get the new leader. This time has a lower bound of a single Metadata RPC call, but degrades to many such calls if metadata propagation through the cluster is slow. For this reason, the proposed changes are preferred.
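
The comparison can be sketched numerically. The snippet below is only an illustration of the two totals. The function names, the latency values, and the assumption that both Produce RPCs take the same time are hypothetical and are not part of the proposal.

```python
def alternative_total_ms(produce_rpc_ms: int, metadata_rpc_ms: int,
                         metadata_attempts: int = 1) -> int:
    """Total produce-path time when the client refreshes metadata itself.

    metadata_attempts models propagation delay: at least one Metadata RPC
    is needed, and more are needed while brokers still report the old leader.
    """
    if metadata_attempts < 1:
        raise ValueError("at least one Metadata RPC is required")
    # Produce to old leader + metadata refresh + Produce to new leader.
    return produce_rpc_ms + metadata_attempts * metadata_rpc_ms + produce_rpc_ms


def proposed_total_ms(produce_rpc_ms: int) -> int:
    """Total produce-path time when the failed response carries the new leader."""
    # Produce to old leader (response names new leader) + Produce to new leader.
    return 2 * produce_rpc_ms


# Hypothetical latencies, in milliseconds, chosen only for illustration.
PRODUCE_MS = 5
METADATA_MS = 20

best_case_gap = alternative_total_ms(PRODUCE_MS, METADATA_MS) - proposed_total_ms(PRODUCE_MS)
slow_case_gap = alternative_total_ms(PRODUCE_MS, METADATA_MS, 3) - proposed_total_ms(PRODUCE_MS)
print(best_case_gap, slow_case_gap)
```

Under these assumed numbers, the gap between the two approaches is exactly the metadata refresh time, and it grows linearly with the number of Metadata RPC calls needed before the new leader becomes visible.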
