This page describes how Kafka is run as a production system, based on the production clusters at LinkedIn. Our configurations are listed below. There is nothing magical about most of them, and you may be able to improve on them.

Hardware

We are using dual quad-core Intel Xeon machines with 24GB of memory. In general this should not matter too much: we see only about 20% CPU utilization at peak, and the memory is probably more than is needed to cache the active segments of the log.

Disk throughput is important. We have eight 7200 rpm SATA drives in a RAID 10 array.

OS Settings

We use ext4 as the filesystem and run using software RAID 10.

We have raised the ulimit for the Kafka process to increase the maximum number of file descriptors, since we have many topics and many socket connections. No other tuning has been done.
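As a rough illustration of the file descriptor point, the following Python sketch reads the descriptor limit of the process that runs it and compares it against an estimate. The helper names and the estimate (open segment files plus socket connections plus a fixed overhead) are assumptions made for this example, not figures from our clusters. To say anything about the broker, the check would have to run under the same user and shell settings as the Kafka process.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def estimate_descriptors(open_segment_files, connections, overhead=100):
    """Hypothetical estimate: one descriptor per open log file and per socket,
    plus a fixed allowance for jars, logs, and other files."""
    return open_segment_files + connections + overhead


def has_headroom(required, soft_limit):
    """True if the soft limit covers the estimated descriptor count."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required


if __name__ == "__main__":
    soft, hard = fd_limits()
    needed = estimate_descriptors(open_segment_files=5000, connections=2000)
    print("soft limit:", soft, "hard limit:", hard, "estimated need:", needed)
    print("enough headroom:", has_headroom(needed, soft))
```

The input numbers in the usage section are placeholders; substitute counts observed on your own brokers.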

Java

Here are our command-line options. We are not sure how well tuned they are, but we haven't had problems:

java -server -Xms3072m -Xmx3072m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70
     -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:logs/gc.log -Djava.awt.headless=true
     -Dcom.sun.management.jmxremote -classpath <long list of jars>
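To make the heap settings above concrete, here is a small worked calculation in Python. It assumes the usual HotSpot interpretation: the old generation is the total heap minus the fixed-size young generation, and `CMSInitiatingOccupancyFraction` is a percentage of old-generation occupancy. Without `-XX:+UseCMSInitiatingOccupancyOnly`, the JVM may treat that percentage as a starting hint rather than a fixed trigger, so the threshold below is approximate.

```python
def old_generation_mb(heap_mb, young_mb):
    """Old generation size when the heap and young generation are both fixed."""
    return heap_mb - young_mb


def cms_trigger_mb(heap_mb, young_mb, occupancy_percent):
    """Approximate old-generation occupancy at which a CMS cycle starts."""
    return old_generation_mb(heap_mb, young_mb) * occupancy_percent / 100


if __name__ == "__main__":
    # Values taken from the command line above:
    # -Xmx3072m, -XX:NewSize=256m, -XX:CMSInitiatingOccupancyFraction=70
    print(old_generation_mb(3072, 256))   # 2816 MB of old generation
    print(cms_trigger_mb(3072, 256, 70))  # about 1971 MB
```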

Monitoring

We monitor the following things:

Server Stats

  • Number of fetch and produce requests per second
  • Average and max fetch and produce request time
  • Number of log flushes on the server
  • Average and max log flush time
  • Network throughput
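The server stats above are mostly rates and average/max summaries. A minimal sketch of how such values could be derived from raw samples follows; the function names and the sample inputs are hypothetical and do not reflect how the Kafka broker exposes its metrics.

```python
def rate_per_second(count_start, count_end, elapsed_seconds):
    """Requests per second between two readings of a cumulative counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (count_end - count_start) / elapsed_seconds


def summarize_times(times_ms):
    """Average and max of a list of request or flush times in milliseconds."""
    if not times_ms:
        return {"avg": 0.0, "max": 0.0}
    return {"avg": sum(times_ms) / len(times_ms), "max": max(times_ms)}


if __name__ == "__main__":
    # Two hypothetical readings of a produce-request counter, 60 seconds apart.
    print(rate_per_second(1000, 1600, 60))
    # Hypothetical fetch request times collected over the same interval.
    print(summarize_times([2.0, 4.0, 9.0]))
```

Tracking the max alongside the average matters because a few slow requests or flushes can hide behind a healthy-looking average.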

Application stats

These are monitored per application:

  • Number of produce and fetch requests sent
  • Queued messages not yet sent (we use the async producer)

Audit

The final check is an audit of data delivery correctness: we verify that every message that is sent is consumed by all consumers. The details are discussed in KAFKA-260.
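One simple way to approximate such an audit is to compare message counts per topic and time window between the producers and each consumer. The Python sketch below illustrates that idea only. It is not the design described in KAFKA-260, and the function names, the 10-minute window, and the event format (topic, timestamp in milliseconds) are all assumptions made for the example. Matching counts do not prove that every individual message arrived, but a mismatch points to loss or duplication.

```python
from collections import Counter

WINDOW_MS = 600_000  # hypothetical 10-minute audit window


def bucket(timestamp_ms, window_ms=WINDOW_MS):
    """Start of the time window that contains the timestamp."""
    return timestamp_ms - timestamp_ms % window_ms


def count_by_bucket(events, window_ms=WINDOW_MS):
    """Count (topic, timestamp_ms) events per (topic, window start)."""
    counts = Counter()
    for topic, timestamp_ms in events:
        counts[(topic, bucket(timestamp_ms, window_ms))] += 1
    return counts


def find_gaps(produced, consumed_by):
    """Per consumer, return windows whose consumed count differs from the
    produced count, with the difference (positive means messages missing)."""
    gaps = {}
    for consumer, consumed in consumed_by.items():
        diff = {
            key: sent - consumed.get(key, 0)
            for key, sent in produced.items()
            if consumed.get(key, 0) != sent
        }
        if diff:
            gaps[consumer] = diff
    return gaps


if __name__ == "__main__":
    sent = [("clicks", 1000), ("clicks", 2000), ("views", 700_000)]
    produced = count_by_bucket(sent)
    consumed_by = {
        "consumer-a": count_by_bucket(sent),
        "consumer-b": count_by_bucket([("clicks", 1000), ("views", 700_000)]),
    }
    print(find_gaps(produced, consumed_by))
```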

Zookeeper

Zookeeper is essential for the correct operation of Kafka. There are some things that must be done to keep Zookeeper running happily.
