Flume configuration
Java version: 1.6.0u26 (Server Hotspot VM)
Java heap size: 2GB
Num. agents: 1
Num. parallel flows: varies (see results)
Source: SyslogTcpSource
Channel: MemoryChannel
Sink: HDFSEventSink with the avro_event serializer and snappy compression

Fragment of flume.conf config file

# The number of sources, channels, and sinks varied from test to test.
# In each case the flows are independent and therefore do not share threads, data, or resources.
# This example declares only 3 flows; in the tests the number of flows was varied from 6 to 16.
agent.sources = svc_0_src svc_1_src svc_2_src
agent.channels = svc_0_chan svc_1_chan svc_2_chan
agent.sinks = svc_0_sink svc_1_sink svc_2_sink

# example of one flow is below, i.e. "flow 0"
agent.channels.svc_0_chan.type = memory
agent.channels.svc_0_chan.capacity = 100000
agent.channels.svc_0_chan.transactionCapacity = 1000

agent.sources.svc_0_src.type = org.apache.flume.source.SyslogTcpSource
agent.sources.svc_0_src.port = 10001
agent.sources.svc_0_src.channels = svc_0_chan

agent.sinks.svc_0_sink.type = hdfs
agent.sinks.svc_0_sink.hdfs.path = hdfs://xxxxxx.cloudera.com/service/20120430/flow0
agent.sinks.svc_0_sink.hdfs.fileType = DataStream
agent.sinks.svc_0_sink.hdfs.rollInterval = 300
agent.sinks.svc_0_sink.hdfs.rollSize = 0
agent.sinks.svc_0_sink.hdfs.rollCount = 0
agent.sinks.svc_0_sink.hdfs.batchSize = 1000
agent.sinks.svc_0_sink.hdfs.txnEventMax = 1000
agent.sinks.svc_0_sink.hdfs.kerberosPrincipal = flume/_HOST@CLOUDERA.COM
agent.sinks.svc_0_sink.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume-xxxxxx.keytab
agent.sinks.svc_0_sink.serializer = avro_event
agent.sinks.svc_0_sink.serializer.compressionCodec = snappy
agent.sinks.svc_0_sink.channel = svc_0_chan

# ... define flow 1 ...
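
Because every flow repeats the same channel, source, and sink settings, a fragment for any flow count can be generated rather than written by hand. The following is a minimal sketch, not the script used for these tests. It assumes that flow i listens on port 10001 + i and writes to a flowi directory under the same HDFS path; only the flow 0 values come from the fragment above. The names FLOW_TEMPLATE and render_flows are hypothetical, and the redacted xxxxxx placeholders are kept as they appear.

```python
# Sketch: generate a flume.conf fragment with n independent flows.
# Assumptions (not from the original page): flow i uses port base_port + i
# and HDFS directory <hdfs_root>/flow<i>. Only flow 0 is given in the source.

FLOW_TEMPLATE = """\
agent.channels.svc_{i}_chan.type = memory
agent.channels.svc_{i}_chan.capacity = 100000
agent.channels.svc_{i}_chan.transactionCapacity = 1000

agent.sources.svc_{i}_src.type = org.apache.flume.source.SyslogTcpSource
agent.sources.svc_{i}_src.port = {port}
agent.sources.svc_{i}_src.channels = svc_{i}_chan

agent.sinks.svc_{i}_sink.type = hdfs
agent.sinks.svc_{i}_sink.hdfs.path = {hdfs_root}/flow{i}
agent.sinks.svc_{i}_sink.hdfs.fileType = DataStream
agent.sinks.svc_{i}_sink.hdfs.rollInterval = 300
agent.sinks.svc_{i}_sink.hdfs.rollSize = 0
agent.sinks.svc_{i}_sink.hdfs.rollCount = 0
agent.sinks.svc_{i}_sink.hdfs.batchSize = 1000
agent.sinks.svc_{i}_sink.hdfs.txnEventMax = 1000
agent.sinks.svc_{i}_sink.hdfs.kerberosPrincipal = flume/_HOST@CLOUDERA.COM
agent.sinks.svc_{i}_sink.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume-xxxxxx.keytab
agent.sinks.svc_{i}_sink.serializer = avro_event
agent.sinks.svc_{i}_sink.serializer.compressionCodec = snappy
agent.sinks.svc_{i}_sink.channel = svc_{i}_chan
"""


def render_flows(n, base_port=10001,
                 hdfs_root="hdfs://xxxxxx.cloudera.com/service/20120430"):
    """Return a config fragment declaring and defining n independent flows."""
    ids = range(n)
    # Top-level lists naming every source, channel, and sink of the agent.
    header = [
        "agent.sources = " + " ".join(f"svc_{i}_src" for i in ids),
        "agent.channels = " + " ".join(f"svc_{i}_chan" for i in ids),
        "agent.sinks = " + " ".join(f"svc_{i}_sink" for i in ids),
    ]
    # One copy of the per-flow settings for each flow index.
    flows = [
        FLOW_TEMPLATE.format(i=i, port=base_port + i, hdfs_root=hdfs_root)
        for i in ids
    ]
    return "\n".join(header) + "\n\n" + "\n".join(flows)


if __name__ == "__main__":
    # Print a fragment for the smallest flow count used in the tests.
    print(render_flows(6))
```

Calling render_flows(16) would produce the largest flow count mentioned above. Each flow gets its own channel, so the flows stay independent as described in the config comments.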

Hadoop configuration
The HDFS sink was connected to a 9-node Hadoop cluster running CDH3u3 with MIT Kerberos v5 security enabled.
