...

  • KAFKA_HOST = host where a Kafka broker is installed
  • ZOOKEEPER_HOST = host where a Zookeeper server is installed
  • PROBE_HOST = host where your sensors or probes are installed. If you don't have any sensors installed, pick the host where a Storm supervisor is running
  • SQUID_HOST = host where you want to install Squid. If you don't care, just install it on the PROBE_HOST
  • HOST_WITH_ENRICHMENT_TAG = This is the host in your inventory hosts file that you put under the group "enrichment" 

 

How to Parse the Squid Telemetry Data Source to Metron

...

  1. The first thing we need to do is decide whether to use the Java-based parser or the Grok-based parser for the new telemetry. In this example we will use the Grok parser. The Grok parser is well suited to structured or semi-structured logs that are well understood (check) and to telemetries with lower volumes of traffic (check).
  2. Next we need to define the Grok expression for our log. Refer to Grok documentation for additional details. In our case the pattern is:

    SQUID_DELIMITED %{NUMBER:timestamp} %{SPACE:UNWANTED}  %{INT:elapsed} %{IPV4:ip_src_addr} %{WORD:action}/%{NUMBER:code} %{NUMBER:bytes} %{WORD:method} %{NOTSPACE:url} - %{WORD:UNWANTED}\/%{IPV4:ip_dst_addr} %{WORD:UNWANTED}\/%{WORD:UNWANTED}

  3. Notice that we apply the UNWANTED tag to any part of the message that we don't want included in the resulting JSON structure. Also notice that we applied the naming convention to the IPV4 fields (ip_src_addr and ip_dst_addr) by referencing the following list of field conventions.

  4. The last thing we need to do is validate the Grok pattern. For our test we will use a free Grok validator called Grok Constructor. A validated Grok expression should look like this:

  5. Now that the Grok pattern has been defined, we need to save it and move it to HDFS.
    1. ssh into $HOST_WITH_ENRICHMENT_TAG as root.
    2. Create a file called "squid" in the tmp directory and copy the Grok pattern defined above into the file.

        touch /tmp/squid
        vi /tmp/squid    # copy the grok pattern above to the squid file

    3. Now put the squid file into the directory where Metron stores its Grok parsers. Existing Grok parsers that ship with Metron are staged under /apps/metron/patterns/.

        su - hdfs
        hadoop fs -rmr /apps/metron/patterns/squid
        hdfs dfs -put /tmp/squid /apps/metron/patterns/
        exit
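Before staging the pattern, it can help to see roughly what SQUID_DELIMITED extracts from a log line. The following is a minimal Python sketch, not part of Metron: it swaps each Grok primitive for a simplified regular expression (the GROK table holds approximations, not the exact Grok definitions) and drops every field tagged UNWANTED. The helper names and the sample log line are hypothetical and only serve as illustration.

```python
import re

# Simplified stand-ins for the Grok primitives used by SQUID_DELIMITED.
# Assumption: these approximate the real Grok definitions closely enough
# for a demonstration; they are not copied from the Grok pattern library.
GROK = {
    "NUMBER": r"[+-]?(?:\d+(?:\.\d+)?|\.\d+)",
    "INT": r"[+-]?\d+",
    "IPV4": r"(?:\d{1,3}\.){3}\d{1,3}",
    "WORD": r"\w+",
    "NOTSPACE": r"\S+",
    "SPACE": r"\s*",
}

# The pattern body from step 2, without the leading SQUID_DELIMITED label.
PATTERN = (
    r"%{NUMBER:timestamp} %{SPACE:UNWANTED}  %{INT:elapsed} "
    r"%{IPV4:ip_src_addr} %{WORD:action}/%{NUMBER:code} %{NUMBER:bytes} "
    r"%{WORD:method} %{NOTSPACE:url} - "
    r"%{WORD:UNWANTED}\/%{IPV4:ip_dst_addr} %{WORD:UNWANTED}\/%{WORD:UNWANTED}"
)

# Hypothetical Squid access log line (documentation-only IP addresses).
SAMPLE_LINE = (
    "1461576382.642    161 192.0.2.10 TCP_MISS/200 103701 GET "
    "http://www.example.com/ - DIRECT/203.0.113.5 text/html"
)


def grok_to_regex(pattern):
    """Expand %{TYPE:field} references into a compiled Python regex."""

    def expand(match):
        grok_type, field = match.group(1), match.group(2)
        body = GROK[grok_type]
        if field == "UNWANTED":
            # Match the text but do not capture it, mirroring the UNWANTED tag.
            return f"(?:{body})"
        return f"(?P<{field}>{body})"

    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, pattern))


def parse_squid_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = grok_to_regex(PATTERN).match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for name, value in parse_squid_line(SAMPLE_LINE).items():
        print(f"{name} = {value}")
```

Note that the literal spaces in the pattern matter: the sketch only matches when at least three spaces separate the timestamp from the elapsed time. A sketch like this does not replace validating the real pattern in a Grok validator, since the actual Grok primitives are stricter than the stand-ins used here.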
    

Step 4: Create a Parser configuration for the new Squid Storm Parser Topology

Now that the Grok pattern is staged in HDFS we need to define a parser configuration for the Metron Parsing Topology.  The configurations are kept in Zookeeper so the sensor configuration must be uploaded there after it has been created.  A Grok parser configuration follows this format:

...