Status

Current state: Accepted

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

An earlier JIRA issue introduced a ZooKeeper client wrapper called kafka.zookeeper.ZooKeeperClient that encourages pipelined requests to ZooKeeper. The client pipelines requests by performing a "scatter-gather" over the asynchronous calls provided by the underlying org.apache.zookeeper.ZooKeeper client: it sends a pipelined sequence of requests and then waits for all of their responses.

Left as is, the client risks imposing heavy load on ZooKeeper. ZooKeeper itself has only a coarse-grained throttling mechanism, its zookeeper.globalOutstandingLimit config, which defaults to 1000. This config is insufficient for two reasons:

  1. The limit is meant to protect ZooKeeper from memory pressure associated with a backlog of requests, not to control how much load an individual client imposes.
  2. The limit is applied across all connections, so even with this config one misbehaving client will affect the other clients.

We need a client-side throttling mechanism that gives administrators control over Kafka's impact on ZooKeeper. For this we propose a new broker config called zookeeper.max.in.flight.requests.

Public Interfaces

This KIP only adds a new broker configuration described below.

Proposed Changes

This KIP proposes a new broker config called zookeeper.max.in.flight.requests which represents the maximum number of unacknowledged requests the client will send to ZooKeeper before blocking.

This config must be set to at least 1.
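The blocking behavior described above can be illustrated with a counting semaphore. The following is a minimal sketch under stated assumptions, not the actual kafka.zookeeper.ZooKeeperClient code: the class name ThrottledPipeline, the method sendAll, and the thread pool standing in for ZooKeeper are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical illustration of a bounded scatter-gather; not Kafka's implementation. */
public class ThrottledPipeline {
    private final Semaphore inFlight;                          // one permit per unacknowledged request
    private final AtomicInteger current = new AtomicInteger(); // requests currently in flight
    private final AtomicInteger peak = new AtomicInteger();    // highest value of 'current' observed

    public ThrottledPipeline(int maxInFlightRequests) {
        // Mirrors the rule that the config must be set to at least 1.
        if (maxInFlightRequests < 1) {
            throw new IllegalArgumentException("max in-flight requests must be at least 1");
        }
        this.inFlight = new Semaphore(maxInFlightRequests);
    }

    /** Scatter: send requests asynchronously, blocking at the cap. Gather: wait for every response. */
    public <Req, Resp> List<Resp> sendAll(List<Req> requests,
                                          Function<Req, CompletableFuture<Resp>> asyncSend) {
        List<CompletableFuture<Resp>> pending = new ArrayList<>();
        for (Req request : requests) {
            try {
                inFlight.acquire(); // blocks while the cap of unacknowledged requests is reached
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting to send", e);
            }
            peak.accumulateAndGet(current.incrementAndGet(), Math::max);
            CompletableFuture<Resp> response;
            try {
                response = asyncSend.apply(request);
            } catch (RuntimeException e) {
                current.decrementAndGet();
                inFlight.release(); // the request never went out, so return its permit
                throw e;
            }
            // Return the permit as soon as the response (or an error) arrives.
            pending.add(response.whenComplete((resp, err) -> {
                current.decrementAndGet();
                inFlight.release();
            }));
        }
        List<Resp> responses = new ArrayList<>();
        for (CompletableFuture<Resp> f : pending) {
            responses.add(f.join()); // responses are gathered in request order
        }
        return responses;
    }

    public int peakInFlight() {
        return peak.get();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ExecutorService fakeZooKeeper = Executors.newFixedThreadPool(8); // stand-in for the server
        try {
            ThrottledPipeline pipeline = new ThrottledPipeline(3);
            List<String> paths = new ArrayList<>();
            for (int i = 0; i < 20; i++) {
                paths.add("/example/path-" + i);
            }
            List<Integer> lengths = pipeline.sendAll(paths, path ->
                CompletableFuture.supplyAsync(() -> {
                    pause(5); // simulated round trip
                    return path.length();
                }, fakeZooKeeper));
            // The peak never exceeds the cap of 3 passed to the constructor.
            System.out.println(lengths.size() + " responses, peak in flight = " + pipeline.peakInFlight());
        } finally {
            fakeZooKeeper.shutdown();
        }
    }
}
```

With a cap of 1 the loop degenerates into one request at a time, which is the synchronous behavior; larger caps allow more requests to overlap before the sender blocks.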

The default value is 10. We ran experiments measuring the impact of zookeeper.max.in.flight.requests on completion times for various ZooKeeper-intensive controller protocols. The default was chosen as the smallest value beyond which the experiments showed diminishing returns.

Compatibility, Deprecation, and Migration Plan

From a protocol standpoint, the change is fully backwards compatible. With a default value of 10, administrators may see higher load on ZooKeeper than they saw before the controller began using kafka.zookeeper.ZooKeeperClient. Those who wish to retain the existing ZooKeeper load should set zookeeper.max.in.flight.requests to 1.
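As an illustration, a broker that should keep sending one request at a time would carry the following line in its broker configuration file. The file is commonly named server.properties, but that name is an assumption about the deployment.

```properties
# Disable pipelining: at most one unacknowledged request to ZooKeeper at a time
zookeeper.max.in.flight.requests=1
```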

Rejected Alternatives

Several options for the default value were considered:

  1. Set the default value to be unbounded (Integer.MAX_VALUE), effectively stating that no such limit be applied and to pipeline as aggressively as possible. This was rejected because a config should not default to removing control over the system.
  2. Set the default value to 1, effectively disabling pipelining and maintaining Kafka's synchronous requests to ZooKeeper. This was rejected because we want users to see the benefits of pipelining.