...

Prior to going through this tutorial, make sure you have Metron properly installed.  Please see here for Metron installation and validation instructions.  We will be using a single-VM setup for this exercise.  To set up the VM, do the following steps:

 

cd metron-deployment/vagrant/singlenode-vagrant
vagrant plugin install vagrant-hostmanager
vagrant up
vagrant ssh
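
Optionally, before continuing you can confirm that the VM actually came up by checking its state from the host (run from the same directory, outside the SSH session):

vagrant status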

...

Now that we have the sensor set up and generating logs, we need to figure out how to pipe these logs to a Kafka topic.  To do so, the first thing we need to do is set up a new Kafka topic for Squid.

 

cd /usr/hdp/current/kafka-broker/bin/

./kafka-topics.sh --zookeeper localhost:2181 --create --topic squid --partitions 1 --replication-factor 1

./kafka-topics.sh --zookeeper localhost:2181 --list
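
As an optional check, describing the new topic will show the partition count and replication factor that were just applied:

./kafka-topics.sh --zookeeper localhost:2181 --describe --topic squid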

...

tail /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid

./kafka-console-consumer.sh --zookeeper node1:2181 --topic squid --from-beginning
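
The consumer should echo back the raw Squid access.log lines that were pushed onto the topic.  A typical entry looks roughly like this (the values here are illustrative, not taken from the VM):

1461576382.642    161 127.0.0.1 TCP_MISS/200 103701 GET http://www.cnn.com/ - DIRECT/199.27.79.73 text/html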

...

WEBURL (?i)\b((?:https?:(?:/{1,3}|[a-z0-9%])|[a-z0-9.\-]+[.](?:com|net|org|edu|gov|mil|aero|asia|biz|cat|coop|info|int|jobs|mobi|museum|name|post|pro|tel|travel|xxx|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cs|cu|cv|cx|cy|cz|dd|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|Ja|sk|sl|sm|sn|so|sr|ss|st|su|sv|sx|sy|sz|tc|td|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)/)(?:[^\s()<>{}\[\]]+|\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\))+(?:\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’])|(?:(?<!@)[a-z0-9]+(?:[.\-][a-z0-9]+)*[.](?:com|net|org|edu|gov|mil|aero|asia|biz|cat|coop|info|int|jobs|mobi|museum|name|post|pro|tel|travel|xxx|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cs|cu|cv|cx|cy|cz|dd|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|Ja|sk|sl|sm|sn|so|sr|ss|st|su|sv|sx|sy|sz|tc|td|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)\b/?(?!@)))

 

SQUID_DELIMITED %{NUMBER:start_time} %{SPACE:UNWANTED}  %{INT:elapsed} %{IPV4:ip_src_addr} %{WORD:action}/%{NUMBER:code} %{NUMBER:bytes} %{WORD:method} %{WEBURL:url}
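
Against a raw access.log entry like the sample shown earlier, the SQUID_DELIMITED statement would pull out roughly the following fields (values are illustrative):

start_time: 1461576382.642
elapsed: 161
ip_src_addr: 127.0.0.1
action: TCP_MISS
code: 200
bytes: 103701
method: GET
url: http://www.cnn.com/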

...

We need to move our new Squid pattern into the same directory.  Create a file containing the Grok statements from above (both the WEBURL and SQUID_DELIMITED patterns):

touch /tmp/squid

vi /tmp/squid

Then move it to HDFS:

su - hdfs

hdfs dfs -put /tmp/squid /apps/metron/patterns/

exit
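
As an optional check, list the patterns directory to confirm the file actually landed in HDFS:

hdfs dfs -ls /apps/metron/patterns/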

Now that the Grok pattern is staged in HDFS we need to define a Storm Flux configuration for the Metron Parsing Topology.  The configs are staged under /usr/metron/0.1BETA/flux/ and each parsing topology has its own set of configs.  Each topology directory has a remote.yaml, designed to be run on AWS, and a test.yaml, designed to run locally on a single-node VM.  At the time this blog entry was published, the following configs were available:

/usr/metron/0.1BETA/flux/yaf/test.yaml

/usr/metron/0.1BETA/flux/yaf/remote.yaml

/usr/metron/0.1BETA/flux/sourcefire/test.yaml

/usr/metron/0.1BETA/flux/sourcefire/remote.yaml

/usr/metron/0.1BETA/flux/asa/test.yaml

/usr/metron/0.1BETA/flux/asa/remote.yaml

/usr/metron/0.1BETA/flux/fireeye/test.yaml

/usr/metron/0.1BETA/flux/fireeye/remote.yaml

/usr/metron/0.1BETA/flux/bro/test.yaml

/usr/metron/0.1BETA/flux/bro/remote.yaml

/usr/metron/0.1BETA/flux/ise/test.yaml

/usr/metron/0.1BETA/flux/ise/remote.yaml

/usr/metron/0.1BETA/flux/paloalto/test.yaml

/usr/metron/0.1BETA/flux/paloalto/remote.yaml

/usr/metron/0.1BETA/flux/lancope/test.yaml

/usr/metron/0.1BETA/flux/lancope/remote.yaml

/usr/metron/0.1BETA/flux/pcap/test.yaml

/usr/metron/0.1BETA/flux/pcap/remote.yaml

/usr/metron/0.1BETA/flux/enrichment/test.yaml

/usr/metron/0.1BETA/flux/enrichment/remote.yaml

/usr/metron/0.1BETA/flux/snort/test.yaml

/usr/metron/0.1BETA/flux/snort/remote.yaml

We now need to define a remote.yaml for Squid.  The easiest way to do this is to copy one of the existing Grok-based configs (YAF) and tailor it for Squid.

mkdir /usr/metron/0.1BETA/flux/squid

cp /usr/metron/0.1BETA/flux/yaf/remote.yaml /usr/metron/0.1BETA/flux/squid/remote.yaml

vi /usr/metron/0.1BETA/flux/squid/remote.yaml

Edit your config so it looks like this:

 

name: "squid-test"

config:

    topology.workers: 1

 

 

components:

    -   id: "parser"

        className: "org.apache.metron.parsing.parsers.GrokParser"

        constructorArgs:

            - "/apps/metron/patterns/squid"

            - "SQUID_DELIMITED"

        configMethods:

            -   name: "withTimestampField"

                args:

                    - "start_time" 

            -   name: "withMetronHDFSHome"

                args:

                    - ""

    -   id: "writer"

        className: "org.apache.metron.writer.KafkaWriter"

        constructorArgs:

            - "${kafka.broker}"

    -   id: "zkHosts"

        className: "storm.kafka.ZkHosts"

        constructorArgs:

            - "${kafka.zk}"

    -   id: "kafkaConfig"

        className: "storm.kafka.SpoutConfig"

        constructorArgs:

            # zookeeper hosts

            - ref: "zkHosts"

            # topic name

            - "${spout.kafka.topic.squid}"

            # zk root

            - ""

            # id

            - "${spout.kafka.topic.squid}"

        properties:

            -   name: "ignoreZkOffsets"

                value: false
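
            # fallback start offset when ZooKeeper has no stored offset (storm-kafka: -1 = latest, -2 = earliest)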

            -   name: "startOffsetTime"

                value: -1

            -   name: "socketTimeoutMs"

                value: 1000000

 

spouts:

    -   id: "kafkaSpout"

        className: "storm.kafka.KafkaSpout"

        constructorArgs:

            - ref: "kafkaConfig"

 

bolts:

    -   id: "parserBolt"

        className: "org.apache.metron.bolt.ParserBolt"

        constructorArgs:

            - "${kafka.zk}"

            - "${spout.kafka.topic.squid}"

            - ref: "parser"

            - ref: "writer"

 

streams:

    -   name: "spout -> bolt"

        from: "kafkaSpout"

        to: "parserBolt"

        grouping:

            type: SHUFFLE

 

 

 

 

...

Start the new squid parser topology:

storm jar /usr/metron/0.1BETA/lib/metron-parsers-0.1BETA.jar org.apache.storm.flux.Flux --filter /usr/metron/0.1BETA/config/elasticsearch.properties --remote /usr/metron/0.1BETA/flux/squid/remote.yaml
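
Flux fills in the ${...} placeholders in the YAML from the properties file passed with --filter.  For this topology that file is assumed to provide values along these lines (the broker and ZooKeeper values match the single-node VM used here; the exact contents of your file may differ):

kafka.broker=node1:6667
kafka.zk=node1:2181
spout.kafka.topic.squid=squid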

Navigate to the squid parser topology in the Storm UI at http://node1:8744/index.html and verify the topology is up with no errors.


Now that we have a new running squid parser topology, generate some data to parse by running this command several times:

tail /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid

Refresh the Storm UI and it should report data being parsed.


Then navigate to Elasticsearch at http://node1:9200/_cat/indices?v and verify that a squid index has been created:

health status index                     pri rep docs.count docs.deleted store.size pri.store.size
yellow open   yaf_index_2016.04.25.15     5   1       5485            0        4mb            4mb 
yellow open   snort_index_2016.04.26.12   5   1      24452            0     14.4mb         14.4mb 
yellow open   bro_index_2016.04.25.16     5   1       1295            0      1.9mb          1.9mb
yellow open   squid_index_2016.04.26.13   5   1          1            0      7.3kb          7.3kb 
yellow open   yaf_index_2016.04.25.17     5   1      30750            0     17.4mb         17.4mb
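
If you want to spot-check what was parsed, you can also pull a sample document back out of the new index (the index name pattern below is assumed from the listing above):

curl 'http://node1:9200/squid_index*/_search?pretty&size=1'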