In the previous section, Adding a New Telemetry Data Source, we walked through how to add a new Squid data source to Apache Metron. The inevitable next question is how to enrich the telemetry events in real time as they flow through the platform. Enrichment is critical when identifying threats or, as we like to call it, "finding the needle in the haystack." The customer's requirements are the following:
- The proxy events from the Squid logs must be ingested in real time.
- The proxy logs must be parsed into a standardized JSON structure that Metron can understand.
- In real time, the Squid proxy event must be enriched so that the domain names are enriched with the IP information.
- In real time, the IP within the proxy event must be checked against threat intel feeds.
- If there is a threat intel hit, an alert must be raised.
- The end user must be able to see the new telemetry events and the alerts from the new data source.
- All of these requirements must be implemented easily without writing any new Java code.
In this section, we walk you through requirement 3: enriching the Squid proxy events in real time.
Metron Enrichment Framework Explained
Step 1: Setup and Prerequisites
- You should have completed the instructions in Adding a New Telemetry Data Source.
- Make sure the following variables are configured based on your environment:
- KAFKA_HOST = The host where a Kafka broker is installed.
- ZOOKEEPER_HOST = The host where a Zookeeper server is installed.
- PROBE_HOST = The host where your sensor probes are installed. If you don't have any sensors installed, pick the host where a Storm supervisor is running.
- SQUID_HOST = The host where you want to install Squid. If you don't care, just install Squid on the PROBE_HOST.
- NIFI_HOST = The host where you will install NiFi. This should be the same host on which you installed Squid.
- HOST_WITH_ENRICHMENT_TAG = The host in your inventory hosts file that you put under the group "enrichment."
- SEARCH_HOST = The host where you have Elastic or Solr running. This is the host in your inventory hosts file that you put under the group "search". Pick one of the search hosts.
- SEARCH_HOST_PORT = The port of the search host where indexing is configured (e.g., 9300).
- METRON_UI_HOST = The host where your Metron UI web application is running. This is the host in your inventory hosts file that you put under the group "web."
- METRON_VERSION = The release of the Metron binaries you are working with (e.g., 0.2.0BETA-RC2).
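One way to make these settings available to the commands in the later steps is to export them as shell variables and confirm that none is empty. The sketch below is only an illustration: every host name is a placeholder rather than a value from this tutorial, and require_vars is a hypothetical helper, not part of Metron.

```shell
# Placeholder values (assumptions); replace each with the host from your own inventory.
export KAFKA_HOST="node1.example.com"
export ZOOKEEPER_HOST="node1.example.com"
export PROBE_HOST="node1.example.com"
export SQUID_HOST="node1.example.com"
export NIFI_HOST="node1.example.com"
export HOST_WITH_ENRICHMENT_TAG="node1.example.com"
export SEARCH_HOST="node1.example.com"
export SEARCH_HOST_PORT="9300"          # example port from the list above
export METRON_UI_HOST="node1.example.com"
export METRON_VERSION="0.2.0BETA-RC2"   # example release from the list above

# Hypothetical helper: report every named variable that is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

require_vars KAFKA_HOST ZOOKEEPER_HOST PROBE_HOST SQUID_HOST NIFI_HOST \
  HOST_WITH_ENRICHMENT_TAG SEARCH_HOST SEARCH_HOST_PORT METRON_UI_HOST \
  METRON_VERSION && echo "all variables set"
```

Running the check before each step catches a forgotten variable early, before a command is issued with an empty host or path segment.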
Step 2: Create a Mock Enrichment Source
Whois data is expensive so we will not be providing it. Instead, we wrote a basic whois scraper (out of context for this exercise) that produces a CSV format for whois data. To incorporate this data, complete the following steps:
- Log into $HOST_WITH_ENRICHMENT_TAG as root user.
- Cut and paste the following data into a file called "whois_ref.csv" on your virtual machine. This CSV file represents our enrichment source.
google.com, "Google Inc.", "US", "Dns Admin",874306800000
work.net, "", "US", "PERFECT PRIVACY, LLC",788706000000
capitalone.com, "Capital One Services, Inc.", "US", "Domain Manager",795081600000
cisco.com, "Cisco Technology Inc.", "US", "Info Sec",547988400000
cnn.com, "Turner Broadcasting System, Inc.", "US", "Domain Name Manager",748695600000
news.com, "CBS Interactive Inc.", "US", "Domain Admin",833353200000
nba.com, "NBA Media Ventures, LLC", "US", "C/O Domain Administrator",786027600000
espn.com, "ESPN, Inc.", "US", "ESPN, Inc.",781268400000
pravda.com, "Internet Invest, Ltd. dba Imena.ua", "UA", "Whois privacy protection service",806583600000
hortonworks.com, "Hortonworks, Inc.", "US", "Domain Administrator",1303427404000
microsoft.com, "Microsoft Corporation", "US", "Domain Administrator",673156800000
yahoo.com, "Yahoo! Inc.", "US", "Domain Administrator",790416000000
rackspace.com, "Rackspace US, Inc.", "US", "Domain Admin",903092400000
1and1.co.uk, "1 & 1 Internet Ltd","UK", "Domain Admin",943315200000
The schema of this enrichment source is domain|owner|registeredCountry|registeredTimestamp. Make sure the CSV file does not end with an empty line, as that will result in a null pointer exception.
We will use the whois_ref.csv file in step 5.
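Because an empty line in the CSV file causes the loader to fail, it can help to check the file before moving on. The following is a minimal sketch; check_no_blank_lines is a hypothetical helper name, and it treats any empty or whitespace-only line as a problem.

```shell
# Hypothetical helper: succeed only if the file has no empty or whitespace-only lines.
check_no_blank_lines() {
  # grep -n prints the line numbers of offending lines (sent to stderr here).
  if grep -n -E '^[[:space:]]*$' "$1" >&2; then
    echo "blank line(s) found in $1; remove them before loading" >&2
    return 1
  fi
  echo "ok: $1"
}

# Check the enrichment source if it exists in the current directory.
if [ -f whois_ref.csv ]; then
  check_no_blank_lines whois_ref.csv
fi
```

A file that ends with a single newline after the last record passes the check; a file with an extra empty line at the end does not.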
Step 3: Configure an Extractor Config File
- Configure an extractor config file that describes the enrichment source. Cut and paste the following into a file called "extractor_config_temp.json":
{
"config" : {
"columns" : {
"domain" : 0
,"owner" : 1
,"home_country" : 2
,"registrar": 3
,"domain_created_timestamp": 4
}
,"indicator_column" : "domain"
,"type" : "whois"
,"separator" : ","
}
,"extractor" : "CSV"
}
Because copying and pasting from this blog will include some non-ASCII invisible characters, run the following command to strip them out:
iconv -c -f utf-8 -t ascii extractor_config_temp.json -o extractor_config.json
We will use the extractor_config.json file in step 5.
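To confirm that the iconv step removed the invisible characters, you can count the non-ASCII bytes that remain in a file. This is a small sketch using standard tools; count_non_ascii_bytes is a hypothetical helper name.

```shell
# Hypothetical helper: print the number of non-ASCII bytes in a file.
count_non_ascii_bytes() {
  # Delete every byte in the ASCII range (octal 000-177) and count what is left.
  LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}

# Report on the cleaned file if it exists in the current directory.
if [ -f extractor_config.json ]; then
  echo "non-ASCII bytes: $(count_non_ascii_bytes extractor_config.json)"
fi
```

A count of 0 for the cleaned file means nothing outside the ASCII range is left. Running the same helper on the "_temp" file shows whether the paste introduced any such bytes in the first place.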
Step 4: Configure Element to Enrichment Mapping
We now need to configure what element of a tuple should be enriched with what enrichment type. This configuration will be stored in Zookeeper.
- As root user, log into $HOST_WITH_ENRICHMENT_TAG.
- Cut and paste the following into a file called "enrichment_config_temp.json" (make sure to set ZOOKEEPER_HOST with your specific value).
{
"zkQuorum" : "$ZOOKEEPER_HOST:2181"
,"sensorToFieldList" : {
"squid" : {
"type" : "ENRICHMENT"
,"fieldToEnrichmentTypes" : {
"domain_without_subdomains" : [ "whois" ]
}
}
}
}
Because copying and pasting from this blog will include some non-ASCII invisible characters, run the following command to strip them out:
iconv -c -f utf-8 -t ascii enrichment_config_temp.json -o enrichment_config.json
We will use the enrichment_config.json file in step 5.
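The pasted file contains the literal text $ZOOKEEPER_HOST, which the shell does not expand inside a file. One way to set the real value, sketched below, is a sed substitution; fill_zookeeper_host is a hypothetical helper name, and the sketch assumes the host name contains no "/" or "&" characters.

```shell
# Hypothetical helper: print a file with the literal text $ZOOKEEPER_HOST
# replaced by the given host name.
fill_zookeeper_host() {
  # $1 = input file, $2 = Zookeeper host name (must not contain "/" or "&")
  sed 's/\$ZOOKEEPER_HOST/'"$2"'/g' "$1"
}

# Preview the substituted config if the file and the variable are both available.
if [ -f enrichment_config_temp.json ] && [ -n "${ZOOKEEPER_HOST:-}" ]; then
  fill_zookeeper_host enrichment_config_temp.json "$ZOOKEEPER_HOST"
fi
```

The helper only prints the result, so you can review the zkQuorum value before saving it to the file you pass to the loader.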
Step 5: Run the Enrichment Loader
Now that we have the enrichment source and enrichment config defined, we can run the loader to move the data from the enrichment source to the Metron enrichment store and store the enrichment config in Zookeeper.
/usr/metron/$METRON_VERSION/bin/flatfile_loader.sh -n enrichment_config.json -i whois_ref.csv -t enrichment -c t -e extractor_config.json
This command loads your enrichment data in HBase and establishes a Zookeeper mapping. The data is populated into an HBase table called enrichment. To verify that the logs were properly ingested into HBase, run the following command:
hbase shell
scan 'enrichment'
To check if the Zookeeper enrichment tag was properly populated, run the following:
/usr/metron/$METRON_VERSION/bin/zk_load_configs.sh -m DUMP -z $ZOOKEEPER_HOST:2181
Generate some data by using the Squid client to execute HTTP requests. (Do this about 20 times.)
squidclient http://www.cnn.com
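Rather than typing the request by hand each time, you can repeat it in a loop. The following is a minimal sketch; run_n_times is a hypothetical helper name, and the loop is skipped when squidclient is not on the PATH.

```shell
# Hypothetical helper: run a command a fixed number of times.
run_n_times() {
  count="$1"
  shift
  i=0
  while [ "$i" -lt "$count" ]; do
    "$@"
    i=$((i + 1))
  done
}

# Issue 20 requests through the Squid client, discarding the response bodies.
if command -v squidclient >/dev/null 2>&1; then
  run_n_times 20 squidclient http://www.cnn.com >/dev/null
fi
```

Requesting some of the other domains listed in whois_ref.csv the same way gives the enrichment more than one entry to match.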
Step 6: View the New Enriched Telemetry Events in Metron UI
- Go to the Metron UI: http://METRON_UI_HOST:5000.
- Select the Dashboard Tab.
- Edit the Squid Event Details Panel that you created in the Add Telemetry Docs procedure by clicking on the edit icon. Metron displays the Discover page.
- Add the following new enrichment fields to the selected fields section (see the section highlighted in red):
- Click the Save Button to save the search under the same name, "Squid Event Details".
- Click on the Dashboard Page and delete the Squid Event Details panel, then re-add it.
The Squid Event Details panel should now have the new enriched fields.
Notice the enrichments:
Make sure you delete all Squid indexes. Re-ingest the data (see previous blog post) and the messages should be automatically enriched.
In the Metron UI, refresh the dashboard and view the data in the Squid panel:
Notice the enrichments here (whois.owner, whois.domain_created_timestamp, whois.registrar, whois.home_country).