Syslog Performance Test 2012-04-30
Who ran the test: Mike Percy <mpercy at cloudera dot com>
Test Setup
Overview
The Flume NG agent ran in a single JVM on its own physical machine. A separate client machine generated load against the Flume machine in syslog format. Flume wrote the data to a 9-node HDFS cluster running on separate hardware. No virtual machines were used in this test.
...
Data description
Syslog entries containing sequentially increasing integers plus padding
Event size: 300 bytes
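As an illustration of the data described above, here is a minimal Python sketch that builds such entries. The exact line format used in the test is not given here, so the priority value, the fixed timestamp, the host name `loadgen`, the tag `test`, the 10-digit sequence width, and counting the trailing newline toward the 300 bytes are all assumptions.

```python
def make_event(seq: int, size: int = 300) -> bytes:
    """Build one syslog-style line of exactly `size` bytes, newline included."""
    # Assumed header: priority 13 (user.notice), a fixed timestamp,
    # and a hypothetical host name and tag. None of these come from the test.
    head = f"<13>Apr 30 12:00:00 loadgen test: {seq:010d} ".encode("ascii")
    pad_len = size - len(head) - 1  # reserve one byte for the trailing newline
    if pad_len < 0:
        raise ValueError("size too small for header and sequence number")
    return head + b"x" * pad_len + b"\n"


# Sequentially increasing integers, each entry padded to the same size.
events = [make_event(i) for i in range(3)]
```

Because the sequence numbers increase by one per entry, a consumer could in principle check for gaps to detect lost events.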
Results
Throughput summary
Num flows | Min aggregate events/sec | Max aggregate events/sec | Min avg. single-flow events/sec | Max avg. single-flow events/sec |
---|---|---|---|---|
6 | 41982.34 | 54538.92 | 6997.06 | 9089.82 |
7 | 45639.21 | 51646.33 | 6519.89 | 7378.05 |
8 | 64748.63 | 66095.53 | 8093.58 | 8261.94 |
9 | 57358.73 | 65506.95 | 6373.19 | 7278.55 |
10 | 58557.15 | 66324.04 | 5855.72 | 6632.40 |
11 | 59519.33 | 62419.89 | 5410.85 | 5674.54 |
12 | 60105.21 | 69164.94 | 5008.77 | 5763.74 |
13 | 69450.87 | 70590.71 | 5342.37 | 5430.05 |
14 | 62674.97 | 64030.08 | 4476.78 | 4573.58 |
15 | 64499.65 | 72783.06 | 4303.64 | 4852.20 |
16 | 65064.07 | 72714.94 | 4066.50 | 4544.68 |
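The per-flow columns appear to be the aggregate figure divided by the number of flows. The short Python sketch below reproduces that arithmetic for a few rows copied from the table and converts an event rate into an approximate data rate. It assumes that every event is exactly 300 bytes, as stated in the data description, and that 1 MB is 10^6 bytes. The helper names are illustrative only.

```python
EVENT_SIZE_BYTES = 300  # from the data description

# (num_flows, min_aggregate, max_aggregate) in events/sec, copied from the table
rows = [
    (6, 41982.34, 54538.92),
    (8, 64748.63, 66095.53),
    (16, 65064.07, 72714.94),
]


def per_flow(aggregate: float, flows: int) -> float:
    """Average single-flow rate, assuming it is aggregate / number of flows."""
    return round(aggregate / flows, 2)


def mb_per_sec(events_per_sec: float, event_size: int = EVENT_SIZE_BYTES) -> float:
    """Approximate payload rate in MB/s (1 MB = 10**6 bytes)."""
    return events_per_sec * event_size / 1e6


for flows, lo, hi in rows:
    print(flows, per_flow(lo, flows), per_flow(hi, flows))
```

Under these assumptions, 70,000 events/sec corresponds to roughly 21 MB/s of event data.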
Conclusions
- As of the test date, Flume appears capable of sustaining approximately 70,000 events/sec on a single machine with no data loss.
- Throughput comes close to its maximum with one flow per CPU core. Adding more flows yields only marginal gains, likely up to 2x the number of physical cores on the system if hyper-threading is available.
...