Current state: Accepted
Discussion thread: here
JIRA: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
A ZooKeeper client wrapper called kafka.zookeeper.ZooKeeperClient was introduced to encourage pipelined requests to ZooKeeper. This client pipelines requests by performing a "scatter-gather" of the asynchronous calls provided by the underlying org.apache.zookeeper.ZooKeeper client. That is, the client sends a pipelined sequence of requests and then waits for all of their responses.
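The scatter-gather pattern described above can be illustrated with a minimal sketch. This is not the actual kafka.zookeeper.ZooKeeperClient code: the `AsyncBackend` interface and the `handleRequests` name are hypothetical stand-ins for an asynchronous, callback-based client, and requests and responses are plain strings for brevity.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

class ScatterGatherSketch {
    /** Hypothetical stand-in for an asynchronous client call that answers via a callback. */
    interface AsyncBackend {
        void send(String request, Consumer<String> callback);
    }

    /** Sends every request without waiting (scatter), then blocks until all responses arrive (gather). */
    static List<String> handleRequests(AsyncBackend backend, List<String> requests) {
        String[] responses = new String[requests.size()];
        CountDownLatch allDone = new CountDownLatch(requests.size());
        for (int i = 0; i < requests.size(); i++) {
            final int index = i;
            backend.send(requests.get(i), response -> {
                responses[index] = response; // keep responses in request order
                allDone.countDown();
            });
        }
        try {
            allDone.await(); // wait for every outstanding response
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for responses", e);
        }
        return Arrays.asList(responses);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Simulated backend: acknowledges each request on a worker thread.
        AsyncBackend backend = (request, callback) ->
            pool.execute(() -> callback.accept("ack:" + request));
        System.out.println(handleRequests(backend, Arrays.asList("a", "b", "c")));
        pool.shutdown();
    }
}
```

Note that nothing in this sketch limits how many requests are outstanding at once, which is the concern the rest of this proposal addresses.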
Left as is, the client risks imposing a heavy load on ZooKeeper. ZooKeeper itself has only a coarse-grained throttling mechanism, its zookeeper.globalOutstandingLimit config, which defaults to 1000. This config is insufficient for several reasons:
We need a client-side throttling mechanism to give administrators control over Kafka's impact on ZooKeeper. For this we propose a new broker config called zookeeper.max.in.flight.requests.
This KIP only adds a new broker configuration described below.
This KIP proposes a new broker config called zookeeper.max.in.flight.requests, which represents the maximum number of unacknowledged requests the client will send to ZooKeeper before blocking.
This config must be set to at least 1.
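One way to picture the blocking behavior is a counting semaphore sized to the config value. The sketch below is an illustration under stated assumptions rather than the broker's implementation: `BoundedInFlightSketch`, `AsyncBackend`, and `maxInFlightRequests` are hypothetical names, and the backend is any asynchronous, callback-based client.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

class BoundedInFlightSketch {
    /** Hypothetical stand-in for an asynchronous client call that answers via a callback. */
    interface AsyncBackend {
        void send(String request, Consumer<String> callback);
    }

    private final AsyncBackend backend;
    private final Semaphore inFlightPermits;

    BoundedInFlightSketch(AsyncBackend backend, int maxInFlightRequests) {
        if (maxInFlightRequests < 1) {
            throw new IllegalArgumentException("maxInFlightRequests must be at least 1");
        }
        this.backend = backend;
        this.inFlightPermits = new Semaphore(maxInFlightRequests);
    }

    /** Pipelines requests, but never has more than the configured number unacknowledged. */
    List<String> handleRequests(List<String> requests) {
        String[] responses = new String[requests.size()];
        CountDownLatch allDone = new CountDownLatch(requests.size());
        try {
            for (int i = 0; i < requests.size(); i++) {
                final int index = i;
                inFlightPermits.acquire(); // blocks once the limit is reached
                try {
                    backend.send(requests.get(i), response -> {
                        responses[index] = response;
                        inFlightPermits.release(); // acknowledged: free a slot
                        allDone.countDown();
                    });
                } catch (RuntimeException e) {
                    inFlightPermits.release(); // send failed synchronously: do not leak the permit
                    throw e;
                }
            }
            allDone.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for responses", e);
        }
        return Arrays.asList(responses);
    }
}
```

With a limit of 1, this degenerates to sending one request at a time and waiting for each response before sending the next; larger limits allow more pipelining at the cost of more concurrent load on the server.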
The default value is 10. We ran experiments showing the impact of zookeeper.max.in.flight.requests on completion times for various ZooKeeper-intensive controller protocols. The default was chosen as the smallest value beyond which the experiments showed diminishing returns.
From a protocol standpoint, the change is fully backwards compatible. With the default value of 10, administrators may see a higher load on ZooKeeper than they saw before the controller used kafka.zookeeper.ZooKeeperClient. Those who wish to retain the existing ZooKeeper load should set zookeeper.max.in.flight.requests to 1.
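For example, an administrator who wants to keep the previous load profile might add the following line to the broker's properties file. The file name shown in the comment is an assumption and depends on the deployment.

```properties
# Broker configuration (commonly server.properties; the file name is an assumption)
# Allow only one unacknowledged ZooKeeper request at a time.
zookeeper.max.in.flight.requests=1
```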
Several options for the default value were considered: