
In this blog post we will walk through what it takes to set up a new telemetry source in Metron.  For this example we will set up a new sensor, capture the sensor logs, pipe the logs to Kafka, pick up the logs with a Metron parsing topology, parse them, and run them through the Metron stream processing pipeline.

Our example sensor will be a Squid proxy.  Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more.  Squid logs are simple to explain and easy to parse, and the velocity of traffic coming from Squid is representative of a typical network-based sensor.  Hence, we feel it's a good telemetry source to use for this tutorial.
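To make that concrete, here is what a native Squid access.log entry looks like and how its whitespace-delimited fields can be pulled apart by hand (the log line and its values below are illustrative, not taken from a real deployment):

```shell
# An illustrative Squid access.log line: timestamp, elapsed ms, client IP,
# action/HTTP code, bytes, method, URL, user, hierarchy/peer, content type
line='1461576382.642    161 127.0.0.1 TCP_MISS/200 103701 GET http://www.cnn.com/ - DIRECT/199.27.79.73 text/html'

# Pull out a few fields with awk; the Grok pattern later in this post
# formalizes exactly this kind of positional extraction
ts=$(echo "$line"     | awk '{print $1}')
action=$(echo "$line" | awk '{split($4, a, "/"); print a[1]}')
code=$(echo "$line"   | awk '{split($4, a, "/"); print a[2]}')
url=$(echo "$line"    | awk '{print $7}')
echo "$ts $action $code $url"
# → 1461576382.642 TCP_MISS 200 http://www.cnn.com/
```
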

...

Step 1:

...

Download the Metron source code

There are two ways to acquire the Metron code for this code lab: download it from the USB stick distributed for this exercise, or have it imported automatically by running the code lab platform Vagrant scripts:

cd metron-deployment/vagrant/codelab-platform

./run.sh

If you have a local copy of the code lab image from the USB stick, the script will use the USB version; otherwise it will pull the image from Vagrant Atlas.  Be aware that the image is large, so downloading it can take a while.

Step 2: Build the Metron code (Optional)

If you are not running Metron from the USB stick, you need to download and build the code.  Please see here for full Metron installation and validation instructions.  Verify that the project has been built before creating the VM.  First, let's get Metron from Apache.

git clone https://git-wip-us.apache.org/repos/asf/incubator-metron.git

git tag -l

Now you will see a list of Metron releases.  You will see major releases, minor releases, and release candidates.  Refer to the Metron website for the current stable release recommended for download.  Once you select the Metron release, run the following commands to download it:

cd incubator-metron

git checkout tags/[MetronReleaseVersion]

Now that we have downloaded Metron we need to build it.  For the purposes of this exercise we will build without running through Metron's unit and integration test suites.  To do so run the following command:

mvn clean package -DskipTests

...

Step 2: Build the Metron dev environment

Now that we have downloaded Metron and checked out the desired version, we need to set up our environment.  There are a few choices, as described at https://github.com/apache/metron/tree/master/metron-deployment.  We'll choose CentOS 6 for this example.

cd metron/metron-deployment/development/centos6

vagrant up

This will build Metron (without running the tests), package up relevant project artifacts as RPMs, setup and install Ambari to install and manage the single-node Hadoop cluster, and finally install Metron. Once the Vagrant command is finished, you should have a fully-running and self-contained virtual environment with Metron running inside of it.

TASK [deployment-report : debug] ***********************************************

ok: [node1] => {

    "success": [

        "Apache Metron deployed successfully",

        "   Ambari          @ http://node1:8080",

        "   Zookeeper       @ node1:2181",

        "   Kafka           @ node1:6667",

        "For additional information, see https://metron.apache.org/'"

    ]

}


PLAY RECAP *********************************************************************

node1                      : ok=152  changed=64   unreachable=0    failed=0

Step 3: Installing a sample sensor

Log into the sensor node and install the Squid sensor.  If you are on the local Full Dev Vagrant development platform, your VM will be called node1; see https://github.com/apache/metron/tree/master/metron-deployment/development/centos6 for details.  If you are on an AWS environment, your sensor node will be tagged with the [sensors] tag.  You can look through the AWS console to find which node in your cluster has this tag.

cd metron-deployment/vagrant/codelab-platform/

vagrant ssh

...

For the CentOS 6 local development environment, log in as follows, with the password "vagrant" (all lowercase):

ssh root@node1

Once you log into the sensor node you can install the Squid sensor.  

...

This will run through the install, and the Squid sensor will be installed and started.  Now let's look at the Squid logs.

sudo su -

cd /var/log/squid

ls 

...

Now let's set up the following environment variables on node1 to make it easier to navigate and to carry the commands over from full-dev to an AWS or bare metal deployment.

source /etc/default/metron

export ZOOKEEPER=node1:2181

export BROKERLIST=node1:6667

export HDP_HOME="/usr/hdp/current"

export METRON_VERSION="0.4.0"

export METRON_HOME="/usr/metron/${METRON_VERSION}"

Note: It's worth checking the values of ZOOKEEPER and BROKERLIST before continuing.  If you are running in an environment with multiple ZooKeeper hosts or Kafka brokers, supply a comma-delimited list of host:port entries for each variable.
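For example, on a hypothetical cluster with a three-node ZooKeeper ensemble and two Kafka brokers (the host names below are made up purely for illustration), the variables would look like:

```shell
# Comma-delimited host:port lists; zk1-zk3 and kafka1-kafka2 are
# illustrative host names, not part of this tutorial's environment
export ZOOKEEPER=zk1:2181,zk2:2181,zk3:2181
export BROKERLIST=kafka1:6667,kafka2:6667
```
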

...

cat /var/log/squid/access.log | ${HDP_HOME}/kafka-broker/bin/kafka-console-producer.sh --broker-list $BROKERLIST --topic squid

${HDP_HOME}/kafka-broker/bin/kafka-console-consumer.sh --bootstrap-server $BROKERLIST --topic squid --from-beginning

...

Notice that I apply the UNWANTED tag to any part of the message that I don't want included in my resulting JSON structure.  Also notice that I applied the naming convention to the IPV4 field by referencing the following list of field conventions.  The last thing I need to do is validate my Grok pattern.  For our test we will be using a free Grok validator called Grok Constructor.  A validated Grok expression should look like this:
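For reference, the pattern will end up along these lines (a sketch modeled on the Squid Grok pattern that ships with Metron; treat the exact field list as an assumption and validate your own pattern as described above):

```
SQUID_DELIMITED %{NUMBER:timestamp}[^0-9]*%{INT:elapsed} %{IP:ip_src_addr} %{WORD:action}/%{NUMBER:code} %{NUMBER:bytes} %{WORD:method} %{NOTSPACE:url} - %{WORD:UNWANTED}\/%{IP:ip_dst_addr} %{WORD:UNWANTED}\/%{WORD:UNWANTED}
```

Each `%{TYPE:name}` capture maps one whitespace-delimited column of the log line to a named JSON field, and the UNWANTED captures are discarded from the output.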

 

 



Now that the Grok pattern has been defined we need to save it and move it to HDFS.  Existing Grok parsers that ship with Metron are staged under /apps/metron/patterns/

First we do a directory listing to see which patterns are available on the platform:

# hdfs dfs -ls /apps/metron/patterns/
Found 5 items
-rw-r--r--   3 hdfs hadoop      13427 2016-04-25 07:07 /apps/metron/patterns/asa
-rw-r--r--   3 hdfs hadoop       5203 2016-04-25 07:07 /apps/metron/patterns/common
-rw-r--r--   3 hdfs hadoop        524 2016-04-25 07:07 /apps/metron/patterns/fireeye
-rw-r--r--   3 hdfs hadoop       2552 2016-04-25 07:07 /apps/metron/patterns/sourcefire
-rw-r--r--   3 hdfs hadoop        879 2016-04-25 07:07 /apps/metron/patterns/yaf

Now we need to move our new Squid pattern into the same directory.  Create a file from the Grok pattern above:
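One minimal way to stage the pattern file, using /tmp/squid as a scratch path (the path and the abbreviated pattern line below are illustrative; use the full pattern you validated earlier):

```shell
# Write the validated Grok pattern to a local scratch file
# (the pattern line here is a truncated placeholder)
cat > /tmp/squid <<'EOF'
SQUID_DELIMITED %{NUMBER:timestamp}[^0-9]*%{INT:elapsed} %{IP:ip_src_addr} %{WORD:action}/%{NUMBER:code}
EOF

# Sanity-check the file before pushing it
grep -c 'SQUID_DELIMITED' /tmp/squid

# Then copy it into the shared patterns directory (requires the cluster):
# hdfs dfs -put /tmp/squid /apps/metron/patterns/
```
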

...

{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "sensor name",
  "parserConfig": {
    "grokPath": "grok pattern",
    "patternLabel": "grok label",
    ... other optional fields
  }
}
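As a concrete (hypothetical) instance of the skeleton above, a Grok-based sensor reading from a "squid" Kafka topic and using a pattern staged in HDFS might be configured like this; the grokPath and patternLabel are assumptions that must match your own pattern file:

```
{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "squid",
  "parserConfig": {
    "grokPath": "/apps/metron/patterns/squid",
    "patternLabel": "SQUID_DELIMITED"
  }
}
```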

There is a pre-packaged Squid Grok parser configuration file at ${METRON_HOME}/config/zookeeper/parsers/squid.json with the following contents:

...

vi ${METRON_HOME}/config/zookeeper/global.json

and update the JSON to contain at least the following:

{
  "es.clustername": "metron",
  "es.ip": "node1:9300",
  "es.date.format": "yyyy.MM.dd.HH",
  "parser.error.topic": "indexing",
  "fieldValidations" : [
    {
      "input" : [ "ip_src_addr", "ip_dst_addr" ],
      "validation" : "IP",
      "config" : {
        "type" : "IPV4"
      }
    }
  ]
}

...

curl -XPUT 'http://node1:9200/_template/squid_index' -d '
{
  "template": "squid_index*",
  "mappings": {
    "squid_doc": {
      "dynamic_templates": [
      {
        "geo_location_point": {
          "match": "enrichments:geo:*:location_point",
          "match_mapping_type": "*",
          "mapping": {
            "type": "geo_point"
          }
        }
      },
      {
        "geo_country": {
          "match": "enrichments:geo:*:country",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_city": {
          "match": "enrichments:geo:*:city",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_location_id": {
          "match": "enrichments:geo:*:locID",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_dma_code": {
          "match": "enrichments:geo:*:dmaCode",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_postal_code": {
          "match": "enrichments:geo:*:postalCode",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_latitude": {
          "match": "enrichments:geo:*:latitude",
          "match_mapping_type": "*",
          "mapping": {
            "type": "float"
          }
        }
      },
      {
        "geo_longitude": {
          "match": "enrichments:geo:*:longitude",
          "match_mapping_type": "*",
          "mapping": {
            "type": "float"
          }
        }
      },
      {
        "timestamps": {
          "match": "*:ts",
          "match_mapping_type": "*",
          "mapping": {
            "type": "date",
            "format": "epoch_millis"
          }
        }
      },
      {
        "threat_triage_score": {
          "mapping": {
            "type": "float"
          },
          "match": "threat:triage:*score",
          "match_mapping_type": "*"
        }
      },
      {
        "threat_triage_reason": {
          "mapping": {
            "type": "text",
            "fielddata": "true"
          },
          "match": "threat:triage:rules:*:reason",
          "match_mapping_type": "*"
        }
      },
      {
        "threat_triage_name": {
          "mapping": {
            "type": "text",
            "fielddata": "true"
          },
          "match": "threat:triage:rules:*:name",
          "match_mapping_type": "*"
        }
      }
      ],
      "properties": {
        "timestamp": {
          "type": "date",
          "format": "epoch_millis"
        },
        "source:type": {
          "type": "keyword"
        },
        "ip_dst_addr": {
          "type": "ip"
        },
        "ip_dst_port_port": {
          "type": "integer"
        },
        "ip_src_addr": {
          "type": "integerip"
        },
        "ip_src_addrport": {
          "type": "ipinteger"
        },
        "ip_src_portalert": {
          "type": "integernested"
        },
        "metron_alert" : {
                  "type" : "nested"
        },
        "guid": {
          "type": "keyword"
        }
      }
    }
  }
}
'
# Verify the template installs as expected 
curl -XGET 'http://node1:9200/_template/squid_index?pretty'

...

  1. Sets up default mappings for metron-specific types, e.g. timestamps.
  2. Sets up types for properties that will come from the parsed data, e.g. ip_src_addr.

If you're using the Full dev environment, you might want to stop some of the other parsers to free up resources.

for parser in bro snort yaf profiler pcap batch_indexing; do storm kill $parser; done

Now start the new squid parser topology:

...

Navigate to the squid parser topology in the Storm UI at http://node1:8744/index.html and verify the topology is up with no errors:

 



Now that we have a new running squid parser topology, generate some data to parse by running this command several times:

...

yellow open   yaf_index_2016.04.25.17     5   1      30750            0     17.4mb         17.4mb 

...


In order to verify that the messages were indexed correctly, you need a tool to explore the Elasticsearch indices; historically this was done with the Elasticsearch Head plugin:

Note

The Elasticsearch Head plugin is no longer available post 5.x. You have 3 options now:

  1. curl + REST API from the command line
  2. Google Chrome Head plugin
  3. The Kibana UI - see details here

 


If you installed the Head plugin, navigate to http://node1:9200/_plugin/head/; otherwise use one of the above-mentioned tools for data exploration.

...

Now let's see how to create a Kibana dashboard to visualize data in Metron.  First click on Visualize, select a squid index, and add the fields you want to display.



 


Then click on save to save the query and import it into the main Metron dashboard:

...