Syslog Performance Test 2012-04-30

Who ran the test: Mike Percy <mpercy at cloudera dot com>

Test Setup

Overview
The Flume NG agent ran in a single JVM on its own physical machine. A separate client machine generated load against the Flume machine in syslog format. Flume wrote the data to a 9-node HDFS cluster running on its own separate hardware. No virtual machines were used in this test.

...

Data description
Syslog entries containing sequentially increasing integers plus padding
Event size: 300 bytes
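
A minimal sketch of how test data of this shape could be generated is shown here. It is not the load generator used in the test: the header layout, the host name `loadgen`, the tag `perftest`, the fixed timestamp, and the choice to count the trailing newline toward the 300 bytes are all assumptions made for illustration.

```python
from typing import Iterator


def make_event(seq: int, size: int = 300) -> bytes:
    """Build one syslog-style line of exactly `size` bytes, newline included."""
    # Hypothetical RFC 3164-style header; the original page does not give the layout.
    # <13> encodes facility "user" (1) and severity "notice" (5): 1 * 8 + 5.
    header = f"<13>Apr 30 00:00:00 loadgen perftest: {seq} ".encode("ascii")
    pad_len = size - len(header) - 1  # reserve one byte for the trailing newline
    if pad_len < 0:
        raise ValueError("size is too small to hold the header")
    return header + b"x" * pad_len + b"\n"


def generate(count: int, start: int = 0) -> Iterator[bytes]:
    """Yield `count` events carrying sequentially increasing integers."""
    for seq in range(start, start + count):
        yield make_event(seq)
```

Because each event carries its own sequence number, a receiver can check for gaps in the sequence to detect lost events.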

Results

Throughput summary

Num flows  Min aggregate events/sec  Max aggregate events/sec  Min avg. single-flow events/sec  Max avg. single-flow events/sec
6          41982.34                  54538.92                  6997.06                          9089.82
7          45639.21                  51646.33                  6519.89                          7378.05
8          64748.63                  66095.53                  8093.58                          8261.94
9          57358.73                  65506.95                  6373.19                          7278.55
10         58557.15                  66324.04                  5855.72                          6632.40
11         59519.33                  62419.89                  5410.85                          5674.54
12         60105.21                  69164.94                  5008.77                          5763.74
13         69450.87                  70590.71                  5342.37                          5430.05
14         62674.97                  64030.08                  4476.78                          4573.58
15         64499.65                  72783.06                  4303.64                          4852.20
16         65064.07                  72714.94                  4066.50                          4544.68
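
The single-flow columns appear to be the aggregate figures divided by the number of flows. The sketch below checks that reading against a few rows of the summary; the function name `per_flow_average` is illustrative, and the relationship itself is an inference from the numbers rather than a method stated by the author.

```python
def per_flow_average(aggregate_events_per_sec: float, num_flows: int) -> float:
    """Average events/sec per flow, assuming load is spread evenly across flows."""
    return round(aggregate_events_per_sec / num_flows, 2)


# (num flows, min aggregate, max aggregate) copied from the throughput summary.
rows = [
    (6, 41982.34, 54538.92),
    (8, 64748.63, 66095.53),
    (16, 65064.07, 72714.94),
]

for flows, agg_min, agg_max in rows:
    print(flows, per_flow_average(agg_min, flows), per_flow_average(agg_max, flows))
```

One row does not reproduce exactly this way: for 15 flows, 64499.65 / 15 is about 4299.98, while the summary lists 4303.64, so the per-flow figures may have been measured independently.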

Conclusions

  1. At the time of the test, Flume appears capable of sustaining approximately 70,000 events/sec on a single machine with no data loss.
  2. Throughput is close to its optimum with one flow per CPU core. Additional flows yield only marginal benefit, likely up to 2x the number of physical cores on the system if hyper-threading is available.
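
For capacity planning, the event rate can be converted into a data rate using the 300-byte event size from the data description. The sketch below does that arithmetic; it assumes every event is exactly 300 bytes and ignores protocol and HDFS overhead, so it is a rough estimate rather than a measured figure.

```python
EVENT_SIZE_BYTES = 300  # from the data description


def bytes_per_sec(events_per_sec: float, event_size: int = EVENT_SIZE_BYTES) -> float:
    """Payload data rate implied by an event rate and a fixed event size."""
    return events_per_sec * event_size


rate = bytes_per_sec(70_000)
print(f"{rate / 1e6:.1f} MB/s")        # decimal megabytes per second
print(f"{rate * 8 / 1e6:.1f} Mbit/s")  # megabits per second
```

At roughly 70,000 events/sec this works out to about 21 MB/s, or about 168 Mbit/s of event payload.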

...