The Setup

...

When you add a net new data source to Metron, the first step is to decide how to push the events from the new telemetry data source into Metron. You can use a number of data collection tools, and that decision is decoupled from Metron. An excellent tool for pushing data into Metron is Apache NiFi, which this section will describe how to use. The second step is to configure Metron to parse the telemetry data source so that downstream processing can be done on it. In this article we will walk you through how to perform both of these steps.

In the previous section, we described the following set of requirements for Customer Foo, who wanted to add the Squid telemetry data source into Metron.

  1. The proxy events from the Squid logs need to be ingested in real-time.
  2. The proxy logs must be parsed into a standardized JSON structure that Metron can understand.
  3. In real time, the Squid proxy events must be enriched so that the domain names are augmented with IP information.
  4. In real time, the IPs within the proxy events must be checked against threat intel feeds.
  5. If there is a threat intel hit, an alert must be raised.
  6. The end user must be able to see the new telemetry events and the alerts from the new data source.
  7. All of these requirements must be implemented easily without writing any new Java code.

...

  • KAFKA_HOST = host where a Kafka broker is installed
  • ZOOKEEPER_HOST = host where a ZooKeeper server is installed
  • PROBE_HOST = host where your sensors/probes are installed. If you don't have any sensors installed, pick a host where a Storm supervisor is running
  • SQUID_HOST = host where you want to install Squid. If you don't care, just install it on the PROBE_HOST
  • NIFI_HOST = host where you will install NiFi. You want this to be the same host on which you installed Squid.
  • HOST_WITH_ENRICHMENT_TAG = the host in your inventory hosts file that you put under the group "enrichment"
  • SEARCH_HOST = the host where Elasticsearch or Solr is running. This is the host in your inventory hosts file that you put under the group "search". Pick one of the search hosts.
  • SEARCH_HOST_PORT = the port on the search host where indexing is configured (e.g., 9300)
  • METRON_UI_HOST = the host where your Metron UI web application is running. This is the host in your inventory hosts file that you put under the group "web".
  • METRON_VERSION = the release of the Metron binaries you are working with (e.g., 0.2.0BETA-RC2)
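To make the commands later in this article copy-and-paste friendly, the placeholders above can be exported as shell variables. The hostnames below are purely illustrative assumptions, not values from any real cluster; substitute the hosts from your own inventory file:

```shell
# Illustrative values only -- replace with hosts from your own inventory.
export KAFKA_HOST=node1
export ZOOKEEPER_HOST=node1
export PROBE_HOST=node1
export SQUID_HOST=node1
export NIFI_HOST=node1
export HOST_WITH_ENRICHMENT_TAG=node1
export SEARCH_HOST=node1
export SEARCH_HOST_PORT=9300
export METRON_UI_HOST=node1
export METRON_VERSION=0.2.0BETA-RC2
```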

...

Every data source whose events you are streaming into Metron must have its own Kafka topic. The ingestion tool of choice (for example, Apache NiFi) will push events into this Kafka topic. The instructions are as follows:
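As a sketch, assuming a stock Kafka installation whose CLI tools are on the PATH (the script name and flags may differ slightly across Kafka distributions), creating and verifying the topic looks like this:

```shell
# Create a Kafka topic named "squid" for the new telemetry source
kafka-topics.sh --zookeeper $ZOOKEEPER_HOST:2181 --create \
    --topic squid --partitions 1 --replication-factor 1

# Confirm the topic now appears in the topic list
kafka-topics.sh --zookeeper $ZOOKEEPER_HOST:2181 --list
```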

...

  1. The first thing we need to do is decide whether we will use the Java-based parser or the Grok-based parser for the new telemetry. In this example we will use the Grok parser. The Grok parser is well suited for structured or semi-structured logs that are well understood (check) and for telemetry sources with lower volumes of traffic (check).
  2. Next we need to define the Grok expression for our log. Refer to the Grok documentation for additional details. In our case the pattern is:

     

    SQUID_DELIMITED %{NUMBER:timestamp}%{SPACE:UNWANTED} %{INT:elapsed}%{SPACE:UNWANTED}%{IPV4:ip_src_addr} %{WORD:action}/%{NUMBER:code} %{NUMBER:bytes} %{WORD:method} %{NOTSPACE:url} - %{WORD:UNWANTED}\/%{IPV4:ip_dst_addr} %{WORD:UNWANTED}\/%{WORD:UNWANTED}

     

  3. Notice that we apply the UNWANTED tag for any part of the message that we don't want included in our resulting JSON structure. Finally, notice that we applied the naming convention to the IPV4 field by referencing the following list of field conventions.

  4. The last thing we need to do is to validate the Grok pattern to make sure it's valid. For our test we will be using a free Grok validator called Grok Constructor. A validated Grok expression should look like this:

  5. Now that the Grok pattern has been defined, we need to save it and move it to HDFS. 
    1. ssh into $HOST_WITH_ENRICHMENT_TAG as root
    2. Create a file called "squid" in the tmp directory and copy the Grok pattern into the file.
      1. touch /tmp/squid
      2. Open up the squid file and add the Grok pattern defined above
    3. Put the squid file into the directory where Metron stores its Grok parsers. Existing Grok parsers that ship with Metron are staged under /apps/metron/patterns
      1. su - hdfs
      2. hadoop fs -rmr /apps/metron/patterns/squid
      3. hdfs dfs -put /tmp/squid /apps/metron/patterns/
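Before shipping the pattern to HDFS, it can help to sanity-check the shape of a squid log line locally. The sketch below uses a plain regex that only approximates the leading fields of the SQUID_DELIMITED Grok pattern (timestamp, elapsed, source IP, action/code); the log line itself is a made-up sample, not output from a real proxy:

```shell
# A made-up squid access.log line: epoch timestamp, elapsed ms, client IP,
# action/code, bytes, method, URL, hierarchy/destination IP, content type
line='1461576382.642    161 127.0.0.1 TCP_MISS/200 103701 GET http://example.com/ - DIRECT/93.184.216.34 text/html'

# Rough regex mirroring the first few fields of the SQUID_DELIMITED pattern
if echo "$line" | grep -qE '^[0-9]+\.[0-9]+ +[0-9]+ +([0-9]{1,3}\.){3}[0-9]{1,3} [A-Z_]+/[0-9]+ '; then
  echo "pattern matched"
else
  echo "pattern did not match"
fi
```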

...

This squid parser topology will ingest from the squid Kafka topic we created earlier and then parse the events with Metron's Grok framework using the Grok pattern we defined earlier. The result of the parsing is a standard Metron JSON structure that is then put on the "enrichment" Kafka topic for further processing.
But how do the squid events in access.log get into the "squid" Kafka topic so that the parser topology can parse them? We will do that using Apache NiFi.
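Assuming a Metron installation laid out under $METRON_HOME, launching the parser topology looks roughly like the sketch below. The script name matches the parser-topology launcher that ships with Metron, but its flags vary between releases, so treat this as an illustration rather than an exact invocation:

```shell
# Start the squid parser topology on Storm; it reads raw events from the
# "squid" Kafka topic and emits parsed JSON to the "enrichment" topic
$METRON_HOME/bin/start_parser_topology.sh \
    -z $ZOOKEEPER_HOST:2181 \
    -s squid
```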

Using Apache NiFi to Stream Data into Metron

Put simply, NiFi was built to automate the flow of data between systems, which makes it a fantastic tool for collecting, ingesting, and pushing data to Metron. The instructions below show how to install and configure NiFi and create the NiFi flow that pushes squid events into Metron.

Install, Configure, and Start Apache NiFi

The following shows how to install NiFi on the VM. Do the following as root:

  1. ssh into $NIFI_HOST
  2. Download NiFi
    cd /usr/lib
    wget http://public-repo-1.hortonworks.com/HDF/centos6/1.x/updates/1.2.0.0/HDF-1.2.0.0-91.tar.gz
    tar -zxvf HDF-1.2.0.0-91.tar.gz
  3. Edit the NiFi configuration to update the port of the NiFi web app: nifi.web.http.port=8089
    cd HDF-1.2.0.0/nifi
    vi conf/nifi.properties
    //update nifi.web.http.port to 8089
  4. Install NiFi as a service
    bin/nifi.sh install nifi
  5. Start the NiFi service
    service nifi start
  6. Go to the NiFi web UI: http://$NIFI_HOST:8089/nifi/
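Once the service is started, a quick reachability check of the web UI from the command line can save a trip to the browser. This assumes curl is installed and that you kept port 8089 from the configuration step above:

```shell
# Prints the HTTP status code of the NiFi UI; expect a 2xx/3xx code once NiFi
# has finished starting up (startup can take a minute or two)
curl -s -o /dev/null -w "%{http_code}\n" http://$NIFI_HOST:8089/nifi/
```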

Create a NiFi Flow to Stream Events to Metron

Now we will create a flow to capture events from squid and push them into Metron.

...