Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. The default value of tcp_syn_retries may be large, such as 6 for Linux. It means the default timeout value is 127 seconds for finishing the three-way handshake. A shorter timeout at the transportation level will help clients detect dead nodes faster. The existing configuration “request.timeout.ms” sets an upper-bound of the time used by both the transportation and application layer whose complexity varies. It’s risky to lower “request.timeout.ms” for detecting dead nodes quicker because of the involvement of the application layer logic.
  2. In some scenarios, the existing configuration "request.timeout.ms" is not able to time out the connections properly. For example, currently, the leastLoadedNode() provides a cached node with the criteria below. 
    1. Provide the connected node with least number of inflight requests
    2. If no connected node exists, provide the connecting node with the largest index in the cached list of nodes.
    3. If no connected or connecting node exists, provide the disconnected node which respects the reconnect backoff with the largest index in the cached list of nodes.

A node will remain the "connecting" status until the system timeout and close the socketuntil 2 ^ (tcp_sync_retries + 1) - 1 seconds elapsed, even if the requests binding to this node timed out. So the leastLoadedNode() might keep providing this same node and other nodes won't get a chance to process any requests. For example, when the user specifies a list of N bootstrap-servers and no connection has been built between the client and the servers, the least loaded node provider will poll all the server nodes specified by the user. If M servers in the bootstrap-servers list are offline, the client may take (127 * M) seconds to connect to the cluster. In the worst case when M = N - 1, the wait time can be several minutes.

...