...
Flume configuration
Java version: 1.6.0u26 (Server Hotspot VM)
Java heap size: 2GB
Num. agents: 1
Num. parallel flows: varies (see results)
Source: SyslogTcpSource
Channel: MemoryChannel
Sink: HDFSEventSink with avro_event serialization and snappy serializer compression
Single-flow config
Fragment of flume.conf config file:
    # Number of sources, channels, and sinks varied depending on tests.
    # In each case, they are independent flows and therefore do not share threads, data, or resources.
    # This example only shows 3 flows. The # of flows were varied from 6 to 16.
    agent.sources = svc_0_src svc_1_src svc_2_src
    agent.channels = svc_0_chan svc_1_chan svc_2_chan
    agent.sinks = svc_0_sink svc_1_sink svc_2_sink

    # example of one flow is below, i.e. "flow 0"
    agent.channels.svc_0_chan.type = memory
    agent.channels.svc_0_chan.capacity = 100000
    agent.channels.svc_0_chan.transactionCapacity = 1000

    agent.sources.svc_0_src.type = org.apache.flume.source.SyslogTcpSource
    agent.sources.svc_0_src.port = 10001
    agent.sources.svc_0_src.channels = svc_0_chan

    agent.sinks.svc_0_sink.type = hdfs
    agent.sinks.svc_0_sink.hdfs.path = hdfs://xxxxxx.cloudera.com/service/20120430/flow0
    agent.sinks.svc_0_sink.hdfs.fileType = DataStream
    agent.sinks.svc_0_sink.hdfs.rollInterval = 300
    agent.sinks.svc_0_sink.hdfs.rollSize = 0
    agent.sinks.svc_0_sink.hdfs.rollCount = 0
    agent.sinks.svc_0_sink.hdfs.batchSize = 1000
    agent.sinks.svc_0_sink.hdfs.txnEventMax = 1000
    agent.sinks.svc_0_sink.hdfs.kerberosPrincipal = flume/_HOST@CLOUDERA.COM
    agent.sinks.svc_0_sink.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume-xxxxxx.keytab
    agent.sinks.svc_0_sink.serializer = avro_event
    agent.sinks.svc_0_sink.serializer.compressionCodec = snappy
    agent.sinks.svc_0_sink.channel = svc_0_chan

    # ... define flow 1 ...
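Because each flow repeats the same settings under a different index, a fragment for N flows could be generated rather than written by hand. The Python sketch below is illustrative only and is not the tooling used for these tests. It assumes that flow i listens on port 10001 + i and writes to a `flow<i>` directory under the same HDFS root; the source only shows these values for flow 0. The function name `render_flows` and its parameters are hypothetical, and the Kerberos principal and keytab settings are left out and would need to be added per sink.

```python
"""Generate a multi-flow flume.conf fragment (illustrative sketch)."""

# Per-channel and per-sink settings copied from the "flow 0" example.
CHANNEL_SETTINGS = {
    "type": "memory",
    "capacity": "100000",
    "transactionCapacity": "1000",
}
SINK_SETTINGS = {
    "hdfs.fileType": "DataStream",
    "hdfs.rollInterval": "300",
    "hdfs.rollSize": "0",
    "hdfs.rollCount": "0",
    "hdfs.batchSize": "1000",
    "hdfs.txnEventMax": "1000",
    "serializer": "avro_event",
    "serializer.compressionCodec": "snappy",
}


def render_flows(num_flows, agent="agent", base_port=10001,
                 hdfs_root="hdfs://xxxxxx.cloudera.com/service/20120430"):
    """Return flume.conf text defining `num_flows` independent flows."""
    names = ["svc_%d" % i for i in range(num_flows)]

    # Top-level lists naming every source, channel, and sink.
    lines = [
        "%s.sources = %s" % (agent, " ".join(n + "_src" for n in names)),
        "%s.channels = %s" % (agent, " ".join(n + "_chan" for n in names)),
        "%s.sinks = %s" % (agent, " ".join(n + "_sink" for n in names)),
    ]

    for i, name in enumerate(names):
        src, chan, sink = name + "_src", name + "_chan", name + "_sink"
        lines.append("")
        lines.append("# flow %d" % i)

        for key, value in CHANNEL_SETTINGS.items():
            lines.append("%s.channels.%s.%s = %s" % (agent, chan, key, value))

        src_prefix = "%s.sources.%s" % (agent, src)
        lines.append(src_prefix + ".type = org.apache.flume.source.SyslogTcpSource")
        # Assumption: ports increase by one per flow, starting at base_port.
        lines.append("%s.port = %d" % (src_prefix, base_port + i))
        lines.append("%s.channels = %s" % (src_prefix, chan))

        sink_prefix = "%s.sinks.%s" % (agent, sink)
        lines.append(sink_prefix + ".type = hdfs")
        # Assumption: each flow writes to its own flow<i> directory.
        lines.append("%s.hdfs.path = %s/flow%d" % (sink_prefix, hdfs_root, i))
        for key, value in SINK_SETTINGS.items():
            lines.append("%s.%s = %s" % (sink_prefix, key, value))
        lines.append("%s.channel = %s" % (sink_prefix, chan))

    return "\n".join(lines) + "\n"
```

Calling `print(render_flows(6))` would emit a fragment for the smallest flow count mentioned in the config comments; each flow gets its own source, channel, and sink, so nothing is shared between flows.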
Hadoop configuration
The HDFS sink was connected to a 9-node Hadoop cluster running CDH3u3 with MIT Kerberos v5 security enabled.
...