...

The connection creation rate limits will use the same quota window configuration (quota.window.size.seconds, 1 second by default) as the existing produce/fetch quotas and the request rate quota (KIP-124). Since a limit on the connection creation rate is also a type of quota, this approach keeps it consistent with the existing quota implementations on the broker. If the connection creation rate on the broker exceeds the broker-wide limit, the broker will delay accepting a new connection by the amount of time that brings the rate within the limit. If a listener-specific limit is specified and the connection rate on that listener exceeds it, the broker will delay accepting new connections on that listener by the amount of time that brings the listener's rate within the listener limit. The maximum delay applied will be the quota window size (which also means that the minimum connection rate limit is effectively 1 connection creation / second).

Since the broker has to accept a connection in order to know its IP address, a per-IP connection creation rate limit has to be enforced after the connection is accepted. Dropping the connection right away, as the broker does for per-IP connection count limits (KIP-402), does not work as well for connection rate limiting, because the offending client is likely to reconnect immediately. At the same time, rate limiting only by delaying the processing of connections would not work when the connection creation rate is continuously higher than the configured limit, since that would create a large backlog. This case could be more common for clients that come through a proxy, where a large number of incoming connections may share the same IP. To address these two issues, our approach is as follows. If the connection creation rate limit is reached for a specific IP address, the broker will delay processing the connection by the amount of time that brings the rate within the limit or by 1 second, whichever is shorter. After the delay, if the per-IP connection creation rate quota is still violated, the connection will be cleaned up; otherwise, the connection will be accepted.

Metrics

No new metrics will be added. The existing metric (kafka.network:type=Acceptor,name=AcceptorBlockedPercent,listener={listenerName}), which tracks the amount of time the Acceptor is blocked from accepting connections, will now also include the time the Acceptor is blocked due to hitting the connection creation rate limit (in addition to the time blocked due to hitting the maximum limit on currently active connections).

...

The broker will track connection acceptance rates broker-wide, per listener, and per IP, via sensors that wrap the Rate metric with a MetricConfig. MetricConfig#quota will be set to the corresponding configured connection creation rate limit. When the Acceptor accepts a new connection, the broker-wide metric and the corresponding listener's metric will be incremented. When the actual connection creation rate exceeds either the broker-wide or the listener-specific quota, a quota violation exception will be thrown. On a quota violation, the broker will calculate the delay needed to bring the metric within quota, using the same formula implemented in ClientQuotaManager.throttleTime. The Acceptor thread will wait for the delay duration before accepting new connections. The maximum delay applied will be the quota window size (1 second by default).
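The delay calculation can be sketched as follows. This is a minimal illustration, not the broker's code: the class and method names are hypothetical, and it assumes the formula has the shape (observed rate - quota) / quota * window, which is the delay after which the same number of accepted connections, spread over the longer interval, averages out to the quota. The windowMs parameter stands in for whatever window length the Rate metric reports.

```java
// Hypothetical helper for illustration only; not a Kafka class or API.
final class ConnectionRateThrottle {
    private ConnectionRateThrottle() {}

    /**
     * Returns the delay in milliseconds that brings observedRate back to quota,
     * assuming no further connections are accepted while waiting.
     * The result is capped at the quota window size.
     */
    static long throttleTimeMs(double observedRate, double quota, long windowMs) {
        if (quota <= 0) {
            throw new IllegalArgumentException("quota must be positive");
        }
        if (observedRate <= quota) {
            return 0L; // within quota, no delay needed
        }
        double delay = (observedRate - quota) / quota * windowMs;
        return Math.min(Math.round(delay), windowMs);
    }
}
```

For example, with a limit of 10 connections per second and a 1 second window, an observed rate of 15 gives a 500 ms delay, while an observed rate of 40 would need 3000 ms and is capped at 1000 ms.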


Most of this logic will be added to the ConnectionQuotas class, which currently throttles the Acceptor thread to limit the number of active connections. The ConnectionQuotas class will be extended to enforce both the number of active connections and the connection creation rate. This proposal adds another condition under which the Acceptor thread waits, implemented as delaying the acceptance of a new connection based on whichever limit is reached first:

  • If the number of active connections is below the limit, but the broker hits the connection rate limit, the Acceptor will wait for the calculated delay that brings the connection creation rate metric within quota.
  • If there are no available active connection slots, the broker waits for a slot to become available, regardless of whether the connection rate exceeds the quota.
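The two cases above can be sketched as a single decision. This is an illustrative sketch under stated assumptions: AcceptorWait, waitBeforeAccept, and the WAIT_FOR_SLOT sentinel are hypothetical names, and rateThrottleMs is assumed to be the already computed rate-limit delay (0 when the rate is within quota).

```java
// Hypothetical sketch of the Acceptor wait decision; not the ConnectionQuotas API.
final class AcceptorWait {
    private AcceptorWait() {}

    /** Sentinel meaning "block until an active connection slot is released". */
    static final long WAIT_FOR_SLOT = -1L;

    /**
     * Decides how long the Acceptor waits before accepting the next connection.
     * A missing connection slot takes precedence over the rate-limit delay.
     */
    static long waitBeforeAccept(int activeConnections, int maxConnections, long rateThrottleMs) {
        if (activeConnections >= maxConnections) {
            return WAIT_FOR_SLOT; // no free slot: wait for one regardless of the rate
        }
        return Math.max(0L, rateThrottleMs); // slot available: wait only for the rate delay
    }
}
```

Returning a sentinel rather than a duration for the no-slot case reflects that the wait there ends when a connection closes, not after a computed time.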

When a quota violation happens because the limit for an IP address is reached, we will calculate delay = min(delay required to bring the rate within the quota, 1 second). We will reuse the mechanism implemented in KIP-306, where the broker delays the response for failed client authentication. When the delay passes, we will check for a quota violation again. If the quota is still violated, the connection will be cleaned up (dropped); otherwise, it will be added to one of the Processor queues for processing (accepted).
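The per-IP handling can be sketched in two steps: cap the delay, then re-check the quota once the delay has passed. The names below (PerIpConnectionThrottle, delayMs, afterDelay, Outcome) are hypothetical, and the re-check result is passed in as a boolean rather than read from a real sensor.

```java
// Hypothetical sketch of per-IP throttling; not the broker's actual implementation.
final class PerIpConnectionThrottle {
    private PerIpConnectionThrottle() {}

    /** Upper bound on the per-IP delay, as stated in the proposal (1 second). */
    static final long MAX_IP_DELAY_MS = 1000L;

    enum Outcome { ACCEPTED, DROPPED }

    /** delay = min(delay required to bring the rate within the quota, 1 second). */
    static long delayMs(long requiredDelayMs) {
        return Math.min(Math.max(0L, requiredDelayMs), MAX_IP_DELAY_MS);
    }

    /** Decision taken after the delay has passed and the quota is checked again. */
    static Outcome afterDelay(boolean quotaStillViolated) {
        return quotaStillViolated ? Outcome.DROPPED : Outcome.ACCEPTED;
    }
}
```

Capping the delay keeps a sustained burst from one IP from building an unbounded backlog of pending connections, while the delay itself slows down a client that would otherwise reconnect immediately after being dropped.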

Compatibility, Deprecation, and Migration Plan

...