...

Step 1: Spin Up Single Node Vagrant VM

  1. Prior to going through this tutorial, make sure you have Metron properly installed. Please see QuickStart for Metron installation and validation instructions. Verify that the project has been built before creating the VM:

    cd metron-platform
    mvn clean package

  2. We will be using a single-VM setup for this exercise. Spin up the Metron Vagrant VM with the following steps:

    vagrant plugin install vagrant-hostmanager
    cd metron-deployment/vagrant/quick-dev-platform
    ./launch_dev_image.sh
    vagrant ssh

Step 2: Create a Kafka Topic for the New Data Source

Every data source whose events you are streaming into Metron must have its own Kafka topic. The ingestion tool of choice (for example, Apache NiFi) will push events into this Kafka topic.

  1. Create a Kafka topic called "squid":

    /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic squid --partitions 1 --replication-factor 1

  2. List all of the Kafka topics to ensure that the new topic exists:

    /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper localhost:2181 --list

You should see the following list of Kafka topics:

  • bro
  • enrichment
  • pcap
  • snort
  • squid
  • yaf
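As a quick sanity check, the output of the `--list` command can also be verified in a script. The helper below is a hypothetical illustration (not part of Metron or Kafka); it simply checks a captured `--list` output for the new topic name:

```python
def topic_exists(list_output: str, topic: str) -> bool:
    """Check whether a topic name appears in `kafka-topics.sh --list` output."""
    return topic in {line.strip() for line in list_output.splitlines()}

# Example: output captured from `kafka-topics.sh --zookeeper localhost:2181 --list`
output = "bro\nenrichment\npcap\nsnort\nsquid\nyaf\n"
print(topic_exists(output, "squid"))  # → True
```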

Step 3: Install Squid

...

  1. The first thing we need to do is decide whether we will use the Java-based parser or the Grok-based parser for the new telemetry. In this example we will use the Grok parser. The Grok parser is well suited to structured or semi-structured logs that are well understood (check) and telemetries with lower volumes of traffic (check).
  2. Next we need to define the Grok expression for our log. Refer to Grok documentation for additional details. In our case the pattern is:

    WDOM [^(?:http:\/\/|www\.|https:\/\/)]([^\/]+)

    SQUID_DELIMITED %{NUMBER:timestamp} %{SPACE:UNWANTED} %{INT:elapsed} %{IPV4:ip_src_addr} %{WORD:action}/%{NUMBER:code} %{NUMBER:bytes} %{WORD:method} http:\/\/\www.%{WDOM:url}\/ - %{WORD:UNWANTED}\/%{IPV4:ip_dst_addr} %{WORD:UNWANTED}\/%{WORD:UNWANTED}

    Notice the WDOM pattern (that is more tailored to Squid instead of using the generic Grok URL pattern) before defining the Squid log pattern. This is optional and is done for ease of use. Also, notice that we apply the UNWANTED tag for any part of the message that we don't want included in our resulting JSON structure. Finally, notice that we applied the naming convention to the IPV4 field by referencing the following list of field conventions.

  3. The last thing we need to do is to validate the Grok pattern to make sure it's valid. For our test we will be using a free Grok validator called Grok Constructor. A validated Grok expression should look like this:

  4. Now that the Grok pattern has been defined, we need to save it and move it to HDFS. Create a file called "squid" in the tmp directory and copy the Grok pattern into the file.
    touch /tmp/squid
    vi /tmp/squid
    # copy the Grok pattern above into the squid file
  5. Now put the squid file into the directory where Metron stores its Grok parsers. Existing Grok parsers that ship with Metron are staged under /apps/metron/patterns/.
    su - hdfs
    hdfs dfs -put /tmp/squid /apps/metron/patterns/
    exit
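To see what the SQUID_DELIMITED pattern extracts, the sketch below approximates it as a plain Python regex and applies it to a representative Squid access log line (the log line and the regex are illustrative; Metron itself compiles the Grok pattern in HDFS, not this regex). The named groups mirror the Grok field names:

```python
import re

# Plain-regex approximation of the SQUID_DELIMITED Grok pattern above.
SQUID_RE = re.compile(
    r"(?P<timestamp>\d+\.\d+)\s+"
    r"(?P<elapsed>\d+)\s+"
    r"(?P<ip_src_addr>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<action>\w+)/(?P<code>\d+)\s+"
    r"(?P<bytes>\d+)\s+"
    r"(?P<method>\w+)\s+"
    r"http://www\.(?P<url>[^/]+)/\s+-\s+"
    r"\w+/(?P<ip_dst_addr>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"\w+/\w+"
)

# A representative Squid access.log line (timestamp, elapsed, client IP,
# action/code, bytes, method, URL, hierarchy/destination IP, MIME type).
line = ("1461576382.642    161 127.0.0.1 TCP_MISS/200 103701 GET "
        "http://www.cnn.com/ - DIRECT/199.27.79.73 text/html")
m = SQUID_RE.match(line)
print(m.group("ip_src_addr"))  # → 127.0.0.1
print(m.group("url"))          # → cnn.com
```

Note how the UNWANTED portions of the line (the hierarchy tag and MIME type) are matched but never captured, which is exactly what the UNWANTED tag accomplishes in the Grok pattern.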
    

Step 5: Create a Parser Configuration for the New Squid Storm Parser Topology

Now that the Grok pattern is staged in HDFS, we need to define a parser configuration for the Metron Parsing Topology.

...

mkdir /usr/metron/0.1BETA/flux/squid
cp /usr/metron/0.1BETA/flux/yaf/remote.yaml /usr/metron/0.1BETA/flux/squid/remote.yaml
vi /usr/metron/0.1BETA/flux/squid/remote.yaml

...

The configurations are kept in Zookeeper, so the sensor configuration must be uploaded there after it has been created. A Grok parser configuration follows this format:

{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "sensor name",
  "parserConfig": {
    "grokPath": "grok pattern",
    "patternLabel": "grok label",
    ... other optional fields
  }
}

Create a Squid Grok parser configuration file at /usr/metron/0.1BETA/config/zookeeper/parsers/squid.json with the following contents:

{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "squid",
  "parserConfig": {
    "grokPath": "/apps/metron/patterns/squid",
    "patternLabel": "SQUID_DELIMITED",
    "timestampField": "timestamp"
  }
}
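Before pushing the config to Zookeeper, it can be worth sanity-checking that the file is valid JSON and contains the required fields from the format above. The helper below is a hypothetical illustration, not a Metron utility:

```python
import json

def validate_parser_config(text: str) -> list:
    """Return a list of problems with a Grok parser config (empty if OK)."""
    cfg = json.loads(text)
    errors = []
    if cfg.get("parserClassName") != "org.apache.metron.parsers.GrokParser":
        errors.append("parserClassName must be org.apache.metron.parsers.GrokParser")
    if not cfg.get("sensorTopic"):
        errors.append("sensorTopic is required")
    parser_cfg = cfg.get("parserConfig", {})
    for key in ("grokPath", "patternLabel"):
        if key not in parser_cfg:
            errors.append(f"parserConfig.{key} is required")
    return errors

squid_json = """{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "squid",
  "parserConfig": {
    "grokPath": "/apps/metron/patterns/squid",
    "patternLabel": "SQUID_DELIMITED",
    "timestampField": "timestamp"
  }
}"""
print(validate_parser_config(squid_json))  # → []
```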

A script is provided to upload configurations to Zookeeper.  Upload the new parser config to Zookeeper:

/usr/metron/0.1BETA/bin/zk_load_configs.sh --mode PUSH -i /usr/metron/0.1BETA/config/zookeeper -z node1:2181 

...

Step 6: Deploy the new Parser Topology

Now that we have the Squid parser topology defined, let's deploy it to our cluster.
  1. Deploy the new squid parser topology:
    sudo storm jar /usr/metron/0.1BETA/lib/metron-parsers-0.1BETA.jar org.apache.storm.flux.Flux --filter /usr/metron/0.1BETA/config/elasticsearch.properties --remote /usr/metron/0.1BETA/flux/squid/remote.yaml
  2. Go to the Storm UI. You should now see the new "squid" topology; ensure that the topology has no errors.

...