In the previous section, Adding a New Telemetry Data Source, we walked through how to add a new Squid data source to Apache Metron. The inevitable next question is: how can I enrich the telemetry events in real time as they flow through the platform? Enrichment is critical when identifying threats or, as we like to call it, "finding the needle in the haystack." The customer's requirements are the following:

...

Metron Enrichment Framework Explained

Step 1: Setup and

...

Prerequisites

  1. Complete the instructions in Adding a new Telemetry Data Source.
  2. Make sure the following variables are configured based on your environment: 

     

    • KAFKA_HOST = The host where a Kafka broker is installed.
    • ZOOKEEPER_HOST = The host where a Zookeeper server is installed.
    • PROBE_HOST = The host where your sensors and probes are installed. If you don't have any sensors installed, pick the host where a Storm supervisor is running.
    • SQUID_HOST = The host where you want to install SQUID. If you don't care, just install SQUID on the PROBE_HOST.
    • NIFI_HOST = The host where you will install NIFI. You want this to be the same host on which you installed Squid.
    • HOST_WITH_ENRICHMENT_TAG = The host in your inventory hosts file that you put under the group "enrichment." 
    • SEARCH_HOST = The host where you have Elastic or Solr running. This is the host in your inventory hosts file that you put under the group "search". Pick one of the search hosts.
    • SEARCH_HOST_PORT  = The port of the search host where indexing is configured. (e.g., 9300)
    • METRON_UI_HOST = The host where your Metron UI web application is running. This is the host in your inventory hosts file that you put under the group "web."
    • METRON_VERSION = The release of the Metron binaries you are working with. (e.g., 0.2.0BETA-RC2)
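Before moving on, it can help to confirm that every variable in the list above is actually set in your shell. The following is a minimal sketch, not part of Metron: the function name missing_vars is hypothetical, and it only checks that each variable is non-empty, not that the host is reachable.

```shell
# Hypothetical helper (not shipped with Metron): print the name of every
# tutorial variable that is unset or empty in the current shell.
missing_vars() {
    for name in KAFKA_HOST ZOOKEEPER_HOST PROBE_HOST SQUID_HOST NIFI_HOST \
                HOST_WITH_ENRICHMENT_TAG SEARCH_HOST SEARCH_HOST_PORT \
                METRON_UI_HOST METRON_VERSION; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}
```

Running missing_vars with everything configured prints nothing; otherwise it lists the names you still need to export.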

...

  1. As root user, log into $HOST_WITH_ENRICHMENT_TAG.
  2. Copy and paste the following data into a file called "whois_ref.csv" on your virtual machine. This CSV file represents our enrichment source.


    google.com, "Google Inc.", "US", "Dns Admin",874306800000
    work.net, "", "US", "PERFECT PRIVACY, LLC",788706000000
    capitalone.com, "Capital One Services, Inc.", "US", "Domain Manager",795081600000
    cisco.com, "Cisco Technology Inc.", "US", "Info Sec",547988400000
    cnn.com, "Turner Broadcasting System, Inc.", "US", "Domain Name Manager",748695600000
    news.com, "CBS Interactive Inc.", "US", "Domain Admin",833353200000
    nba.com, "NBA Media Ventures, LLC", "US", "C/O Domain Administrator",786027600000
    espn.com, "ESPN, Inc.", "US", "ESPN, Inc.",781268400000
    pravda.com, "Internet Invest, Ltd. dba Imena.ua", "UA", "Whois privacy protection service",806583600000
    hortonworks.com, "Hortonworks, Inc.", "US", "Domain Administrator",1303427404000
    microsoft.com, "Microsoft Corporation", "US", "Domain Administrator",673156800000
    yahoo.com, "Yahoo! Inc.", "US", "Domain Administrator",790416000000
    rackspace.com, "Rackspace US, Inc.", "US", "Domain Admin",903092400000
    1and1.co.uk, "1 & 1 Internet Ltd","UK", "Domain Admin",943315200000

     

  3. The schema of this enrichment source is domain|owner|registeredCountry|registeredTimestamp. Make sure the CSV file does not end with an empty line, as that will result in a null pointer exception.
     
    We will use the whois_ref.csv file in step 5.
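Because a trailing empty line in the CSV causes the failure described above, one option is to strip such lines before loading. This is a sketch only: the function name strip_trailing_blank_lines and the output file name whois_ref.clean.csv are hypothetical, and the approach assumes a standard awk is available on the host.

```shell
# Hypothetical helper: print a file with any blank lines at the end removed.
# Blank lines in the middle of the file are preserved.
strip_trailing_blank_lines() {
    awk '
        # Hold blank (or whitespace-only) lines instead of printing them.
        /^[[:space:]]*$/ { pending = pending $0 "\n"; next }
        # A non-blank line flushes any held blank lines that preceded it.
        { printf "%s%s\n", pending, $0; pending = "" }
    ' "$1"
}
```

For example, strip_trailing_blank_lines whois_ref.csv > whois_ref.clean.csv writes a cleaned copy that you can inspect and then move into place.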

...

  1. As root user, log into $HOST_WITH_ENRICHMENT_TAG.
  2. Copy and paste the following into a file called "enrichment_config_temp.json" (make sure to replace ZOOKEEPER_HOST with your specific value).
    {
         "zkQuorum" : "$ZOOKEEPER_HOST:2181"
        ,"sensorToFieldList" : {
              "squid" : {
                 "type" : "ENRICHMENT"
                ,"fieldToEnrichmentTypes" : {
                     "domain_without_subdomains" : [ "whois" ]
                  }
              }
        }
    }
  3. Because copying and pasting from this blog can include invisible non-ASCII characters, run the following command to strip them out:

    iconv -c -f utf-8 -t ascii enrichment_config_temp.json -o enrichment_config.json

    We will use the enrichment_config.json file in step 5.
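To confirm that the conversion left no non-ASCII bytes behind, you can count them directly. The snippet below is a minimal sketch using standard tr and wc; the function name count_non_ascii_bytes is hypothetical.

```shell
# Hypothetical helper: count the bytes in a file that fall outside the
# 7-bit ASCII range (octal 000 through 177).
count_non_ascii_bytes() {
    # Delete every ASCII byte, then count whatever is left.
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}
```

Running count_non_ascii_bytes enrichment_config.json should print 0 for the cleaned file; a larger number for enrichment_config_temp.json indicates that the paste did pick up stray characters.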

...

  1. Now that we have the enrichment source and enrichment config defined, we can run the loader to move the data from the enrichment source to the Metron enrichment store and store the enrichment config in Zookeeper.

    /usr/metron/$METRON_VERSION/bin/flatfile_loader.sh -n enrichment_config.json -i whois_ref.csv -t enrichment -c t -e extractor_config.json
  2. This command loads your enrichment data into HBase and establishes a Zookeeper mapping. The data is populated into an HBase table called enrichment. To verify that the data was properly ingested into HBase, run the following commands:
    hbase shell
    scan 'enrichment'
  3. To check if the Zookeeper enrichment tag was properly populated, run the following:

    /usr/metron/$METRON_VERSION/bin/zk_load_configs.sh -m DUMP -z $ZOOKEEPER_HOST:2181
  4. Generate some data by using the Squid client to execute HTTP requests. (Do this about 20 times.)

    squidclient http://www.cnn.com
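Rather than typing the squidclient command 20 times by hand, you could wrap it in a small loop. This is a sketch under the assumption that squidclient is on your PATH; the function name generate_requests and its default count of 20 are hypothetical conveniences, not part of Squid or Metron.

```shell
# Hypothetical helper: request a URL through squidclient a number of times.
# Usage: generate_requests URL [COUNT]   (COUNT defaults to 20)
generate_requests() {
    url=$1
    count=${2:-20}
    i=0
    while [ "$i" -lt "$count" ]; do
        # Discard the response body; only the Squid access log entry matters here.
        squidclient "$url" > /dev/null
        i=$((i + 1))
    done
}
```

For example, generate_requests http://www.cnn.com issues the same request 20 times, and generate_requests http://www.cnn.com 5 issues it 5 times.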

...