Hardware specs
CPU: 2 x quad-core Intel Xeon L5630 @ 2133MHz with Hyper-Threading (8 physical cores)
Memory: 48GB
OS: SLES 11 SP1 (SUSE Linux Enterprise Server, 64-bit)

Flume configuration
Java version: 1.6.0_26 (HotSpot Server VM)
Java heap size: 2GB
Num. agents: 1
Num. parallel flows: varies (see results)
Source: SyslogTcpSource
Channel: MemoryChannel
Sink: HDFSEventSink with avro_event serialization and snappy serializer compression

Single-flow config

agent.channels.svc_0_chan.type = memory
agent.channels.svc_0_chan.capacity = 100000
agent.channels.svc_0_chan.transactionCapacity = 1000

agent.sources.svc_0_src.type = org.apache.flume.source.SyslogTcpSource
agent.sources.svc_0_src.port = 10001
agent.sources.svc_0_src.channels = svc_0_chan

agent.sinks.svc_0_sink.type = hdfs
agent.sinks.svc_0_sink.hdfs.path = hdfs://xxxxxx.cloudera.com/service/20120430/flow0
agent.sinks.svc_0_sink.hdfs.fileType = DataStream
agent.sinks.svc_0_sink.hdfs.rollInterval = 300
agent.sinks.svc_0_sink.hdfs.rollSize = 0
agent.sinks.svc_0_sink.hdfs.rollCount = 0
agent.sinks.svc_0_sink.hdfs.batchSize = 1000
agent.sinks.svc_0_sink.hdfs.txnEventMax = 1000
agent.sinks.svc_0_sink.hdfs.kerberosPrincipal = flume/_HOST@CLOUDERA.COM
agent.sinks.svc_0_sink.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume-xxxxxx.keytab
agent.sinks.svc_0_sink.serializer = avro_event
agent.sinks.svc_0_sink.serializer.compressionCodec = snappy
agent.sinks.svc_0_sink.channel = svc_0_chan
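
Since the number of parallel flows varies between runs, the single-flow config above can be expanded mechanically. The following is a minimal sketch, not the script used for the tests. It assumes each extra flow follows the same naming pattern (svc_N_src, svc_N_chan, svc_N_sink), listens on the next port after 10001, and writes to a sibling flowN directory; those conventions and the render_flows helper are hypothetical. It also emits the agent.sources / agent.channels / agent.sinks component lists that a complete Flume NG properties file needs, which the excerpt above does not show.

```python
# Hypothetical helper: expand the single-flow config into N parallel flows.
# The per-flow port numbering and flowN path suffix are assumptions.

FLOW_TEMPLATE = """\
agent.channels.{n}_chan.type = memory
agent.channels.{n}_chan.capacity = 100000
agent.channels.{n}_chan.transactionCapacity = 1000

agent.sources.{n}_src.type = org.apache.flume.source.SyslogTcpSource
agent.sources.{n}_src.port = {port}
agent.sources.{n}_src.channels = {n}_chan

agent.sinks.{n}_sink.type = hdfs
agent.sinks.{n}_sink.hdfs.path = {hdfs_base}/flow{i}
agent.sinks.{n}_sink.hdfs.fileType = DataStream
agent.sinks.{n}_sink.hdfs.rollInterval = 300
agent.sinks.{n}_sink.hdfs.rollSize = 0
agent.sinks.{n}_sink.hdfs.rollCount = 0
agent.sinks.{n}_sink.hdfs.batchSize = 1000
agent.sinks.{n}_sink.hdfs.txnEventMax = 1000
agent.sinks.{n}_sink.hdfs.kerberosPrincipal = flume/_HOST@CLOUDERA.COM
agent.sinks.{n}_sink.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume-xxxxxx.keytab
agent.sinks.{n}_sink.serializer = avro_event
agent.sinks.{n}_sink.serializer.compressionCodec = snappy
agent.sinks.{n}_sink.channel = {n}_chan
"""


def render_flows(num_flows, base_port=10001,
                 hdfs_base="hdfs://xxxxxx.cloudera.com/service/20120430"):
    """Return Flume properties text describing num_flows independent flows."""
    names = ["svc_{}".format(i) for i in range(num_flows)]
    # Component lists that activate the sources, channels and sinks.
    header = "\n".join([
        "agent.sources = " + " ".join(n + "_src" for n in names),
        "agent.channels = " + " ".join(n + "_chan" for n in names),
        "agent.sinks = " + " ".join(n + "_sink" for n in names),
    ])
    blocks = [
        FLOW_TEMPLATE.format(n=n, i=i, port=base_port + i, hdfs_base=hdfs_base)
        for i, n in enumerate(names)
    ]
    return header + "\n\n" + "\n".join(blocks)


if __name__ == "__main__":
    print(render_flows(2))
```

Each flow gets its own source, channel and sink, so flows share nothing except the agent JVM and its 2GB heap.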

Hadoop configuration
The HDFS sink was connected to a 9-node Hadoop cluster running CDH3u3 with MIT Kerberos v5 security enabled.

Visualization of test setup
[Image: diagram of the test setup]

Data description
Syslog entries containing sequentially increasing integers plus padding
Event size: 300 bytes
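
For readers who want to reproduce a similar load, here is a minimal sketch of a generator for such events. It is not the tool used in the tests. The exact line layout is an assumption: a bare `<13>` priority prefix (facility user, severity notice), the sequence number, then "x" padding, with the 300 bytes counted to include the trailing newline. The function names and the default host and port in send_events are hypothetical.

```python
import socket


def make_event(seq, size=300):
    """Build one newline-terminated syslog line of exactly `size` bytes.

    Assumed layout: "<13>" priority, the sequence number, a space,
    then 'x' padding up to the requested size.
    """
    prefix = "<13>{} ".format(seq).encode("ascii")
    pad_len = size - len(prefix) - 1  # reserve one byte for the newline
    if pad_len < 0:
        raise ValueError("size too small for sequence number {}".format(seq))
    return prefix + b"x" * pad_len + b"\n"


def events(count, start=0, size=300):
    """Yield `count` events carrying sequentially increasing integers."""
    for seq in range(start, start + count):
        yield make_event(seq, size)


def send_events(count, host="localhost", port=10001):
    """Stream events over TCP to a listening syslog source (assumed address)."""
    with socket.create_connection((host, port)) as conn:
        for event in events(count):
            conn.sendall(event)
```

Because the integers increase by one per event, gaps or duplicates in the HDFS output can be detected by checking the sequence after a run.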
