...

sinkName(reqArg1, reqArg2[, optArg1="default" [, optArg2=0]]{, kwarg1="default", kwarg2=0})

reqArg1 and reqArg2 are positional arguments and are required in all instances. The [ ] characters enclose optional positional arguments. All optional positional arguments have a default value and must be enumerated in order. Thus optArg1 and optArg2 are optional positional arguments, and their defaults are filled in if they are not present. The { } characters enclose optional keyword arguments. All keyword arguments are optional, have a default value, and can be enumerated in any order. Thus kwarg1 and kwarg2 are keyword arguments with defaults.
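
As a rough illustration of these rules, here is a minimal Python sketch of how a call written in this notation could be resolved into a full set of argument values. This is not Flume's actual argument parser: the resolve_args function and the REQUIRED, OPTIONAL and KEYWORDS tables are hypothetical names that simply mirror the sinkName signature above.

```python
# Hypothetical model of the sinkName(...) signature; not Flume's real parser.
REQUIRED = ["reqArg1", "reqArg2"]                     # always needed, in order
OPTIONAL = [("optArg1", "default"), ("optArg2", 0)]   # positional, in order, with defaults
KEYWORDS = {"kwarg1": "default", "kwarg2": 0}         # any order, with defaults


def resolve_args(positional, keywords=None):
    """Return a dict of every argument name mapped to its final value."""
    keywords = keywords or {}
    if len(positional) < len(REQUIRED):
        raise ValueError("missing required positional arguments")
    if len(positional) > len(REQUIRED) + len(OPTIONAL):
        raise ValueError("too many positional arguments")
    unknown = set(keywords) - set(KEYWORDS)
    if unknown:
        raise ValueError(f"unknown keyword arguments: {sorted(unknown)}")

    # Required arguments are taken strictly by position.
    resolved = dict(zip(REQUIRED, positional))

    # Optional positional arguments are filled left to right;
    # any that are absent fall back to their defaults.
    extras = positional[len(REQUIRED):]
    for index, (name, default) in enumerate(OPTIONAL):
        resolved[name] = extras[index] if index < len(extras) else default

    # Keyword arguments start from their defaults and may be overridden in any order.
    resolved.update(KEYWORDS)
    resolved.update(keywords)
    return resolved


if __name__ == "__main__":
    print(resolve_args(["a", "b"]))
    print(resolve_args(["a", "b", "x"], {"kwarg2": 5}))
```

Note that in this model optArg2 can only be supplied if optArg1 is supplied too, which is what "must be enumerated in order" implies.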

Let's take tailDir as an example. Here's the definition in the manual.

...

What is a good amount of time for collector rolling?

Agents and Collectors

How do end-to-end acks work and where can I add a "filter" decorator to drop events?

The acks are generated from checksums of the body of events. So if you augment your events with new attributes (regex, value), the acks will still work. However, if you filter out events between the agentSink and the collectorSink, the checksums won't add up.

You can, however, put filtering "after" the collector or do filtering "before" the agent.

Ok because value adds attributes and does not modify the body.
node : <source> | agentE2ESink("ip of collector");
collector: collectorSource | value("newattr","newvalue") collectorSink("hdfs://xxxx", ...);

Ok because the filter is applied before checksums are calculated.
node : <source> | filterOutEvents agentE2ESink("ip of collector");
collector: collectorSource | collectorSink("hdfs://xxxx", ...);

Ok because filter is after checksums are validated.
node : <source> | agentE2ESink("ip of collector");
collector: collectorSource | collector(xxx) { filterOutEvents escapedFormatDfs("hdfs://xxxx", ...) } ;

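Not ok because the filter drops events between the agent and the collectorSink, so events that were counted in the agent's checksums never reach the collector's checksum calculation and the checksums won't match.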
node : <source> | agentE2ESink("ip of collector");
collector: collectorSource | filterOutEvents collectorSink("hdfs://xxxx", ...);
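
The cases above can be modelled with a toy checksum in Python. This is only a sketch under stated assumptions: it sums the CRC32 of each event body, which is not necessarily the checksum scheme Flume uses, and the helper names (body_checksum, add_attribute, filter_out, acks_match) and the sample events are made up for illustration.

```python
import zlib


def body_checksum(events):
    """Combine per-event CRC32 values of the bodies only (assumed scheme)."""
    total = 0
    for event in events:
        total = (total + zlib.crc32(event["body"])) & 0xFFFFFFFF
    return total


def add_attribute(events, key, value):
    """Model of a decorator like value(...): adds an attribute, leaves the body alone."""
    return [
        {"body": e["body"], "attrs": {**e["attrs"], key: value}}
        for e in events
    ]


def filter_out(events, needle):
    """Model of a filtering decorator: drops events whose body contains needle."""
    return [e for e in events if needle not in e["body"]]


def acks_match(seen_by_agent, seen_by_collector):
    """True when both ends compute the same checksum over the event bodies."""
    return body_checksum(seen_by_agent) == body_checksum(seen_by_collector)


source_events = [
    {"body": b"ok 1", "attrs": {}},
    {"body": b"debug 2", "attrs": {}},
    {"body": b"ok 3", "attrs": {}},
]

if __name__ == "__main__":
    # Attributes added on the collector side: bodies are unchanged.
    print(acks_match(source_events,
                     add_attribute(source_events, "newattr", "newvalue")))

    # Filter before the agent: both ends see the same, already filtered stream.
    filtered = filter_out(source_events, b"debug")
    print(acks_match(filtered, filtered))

    # Filter between the agent and the collector's checksum step: streams differ.
    print(acks_match(source_events, filter_out(source_events, b"debug")))
```

Filtering after the checksums are validated is not modelled separately, because by then the comparison has already been made on the unfiltered stream.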

Plugins

I have a plugin that uses version xxx of Thrift and Flume is using version yyy.

...

Aaron Newton, a Cloudera alum, actually suggested the name for the Flume project, and it just seemed to fit.
