Now that we have created a new telemetry, we can look at how to add enrichments to it. In this exercise we will add a whois enrichment to the Squid telemetry we set up in the previous entry. Whois data is expensive, so we will not be providing it. Instead, I wrote a basic whois scraper (out of scope for this exercise) that produces whois data in CSV format as follows:
... Inc.", "US", "Dns Admin",874306800000
work.net, "", "US", "PERFECT PRIVACY, LLC",788706000000
capitalone.com, "Capital One Services, Inc.", "US", "Domain Manager",795081600000
cisco.com, "Cisco Technology Inc.", "US", "Info Sec",547988400000
cnn.com, "Turner Broadcasting System, Inc.", "US", "Domain Name Manager",748695600000
news.com, "CBS Interactive Inc.", "US", "Domain Admin",833353200000
nba.com, "NBA Media Ventures, LLC", "US", "C/O Domain Administrator",786027600000
espn.com, "ESPN, Inc.", "US", "ESPN, Inc.",781268400000
pravda.com, "Internet Invest, Ltd. dba Imena.ua", "UA", "Whois privacy protection service",806583600000
hortonworks.com, "Hortonworks, Inc.", "US", "Domain Administrator",1303427404000
microsoft.com, "Microsoft Corporation", "US", "Domain Administrator",673156800000
yahoo.com, "Yahoo! Inc.", "US", "Domain Administrator",790416000000
rackspace.com, "Rackspace US, Inc.", "US", "Domain Admin",903092400000
1and1.co.uk, "1 & 1 Internet Ltd","UK", "Domain Admin",943315200000
Please cut and paste this data into a file called "whois_ref.csv" on your virtual machine.
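Before loading the file, it can help to confirm that it parses the way you expect. The following is a minimal Python sketch and not part of Metron. The column labels (`domain`, `owner`, `country`, `contact`, `created_ms`) are assumed here purely for illustration, and the last column is assumed to be a creation timestamp in epoch milliseconds. `skipinitialspace=True` matters because some fields have a space between the comma and the opening quote.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column labels; the CSV itself has no header row.
COLUMNS = ["domain", "owner", "country", "contact", "created_ms"]


def parse_whois_csv(text):
    """Parse whois reference rows into a list of dicts keyed by COLUMNS."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    records = []
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        record = dict(zip(COLUMNS, (field.strip() for field in row)))
        # Assumption: the last column is epoch milliseconds.
        record["created"] = datetime.fromtimestamp(
            int(record["created_ms"]) / 1000, tz=timezone.utc
        )
        records.append(record)
    return records


# Three rows copied from the listing above.
SAMPLE = '''work.net, "", "US", "PERFECT PRIVACY, LLC",788706000000
hortonworks.com, "Hortonworks, Inc.", "US", "Domain Administrator",1303427404000
1and1.co.uk, "1 & 1 Internet Ltd","UK", "Domain Admin",943315200000
'''

if __name__ == "__main__":
    for rec in parse_whois_csv(SAMPLE):
        print(rec["domain"], rec["owner"], rec["country"], rec["created"].date())
```

If a row comes back with more or fewer than five fields, the usual cause is a stray quote or an invisible character picked up during copy and paste.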
...
iconv -c -f utf-8 -t ascii extractor_config_temp.json -o extractor_config.json
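To see what that `iconv -c -f utf-8 -t ascii` call does, here is a rough Python equivalent, offered as an illustration rather than a replacement: every character that cannot be represented in ASCII is silently dropped. The function name `strip_non_ascii` and the `example-host` value are made up for this sketch.

```python
def strip_non_ascii(text):
    """Drop every character that has no ASCII encoding, keeping the rest."""
    return text.encode("ascii", errors="ignore").decode("ascii")


# A non-breaking space (U+00A0) and a zero-width space (U+200B) are typical
# invisible characters picked up when copying from a web page.
pasted = '{\u00a0"zkQuorum"\u200b : "example-host:2181" }'
cleaned = strip_non_ascii(pasted)
print(cleaned)
```

Note that the offending characters are removed, not replaced with a plain space. That is harmless between JSON tokens, but it would also delete any legitimate non-ASCII text inside a value.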
Update enrichment config
We also need a second config that loads the enrichment config into ZooKeeper:
{
  "zkQuorum" : "$ZOOKEEPER_HOME:2181"
 ,"sensorToFieldList" : {
    "squid" : {
      "type" : "ENRICHMENT"
     ,"fieldToEnrichmentTypes" : {
        "domain_without_subdomains" : [ "whois" ]
      }
    }
  }
}
Please cut and paste this config into a file called "enrichment_config_temp.json" on the virtual machine. Because copying and pasting from this blog will include some invisible non-ASCII characters, please strip them out by running:
...
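Once the characters are stripped, a quick way to check that the result is still valid JSON, and that it maps the field you intended, is a sketch like the one below. It is only a sanity check under stated assumptions, not a Metron tool: the helper name `enrichments_by_field` is hypothetical, and the config text is inlined so the example runs on its own.

```python
import json

# Inline copy of the enrichment config shown above so the sketch is
# self-contained; in practice you would read the stripped file from disk.
CONFIG_TEXT = '''
{
  "zkQuorum" : "$ZOOKEEPER_HOME:2181"
 ,"sensorToFieldList" : {
    "squid" : {
      "type" : "ENRICHMENT"
     ,"fieldToEnrichmentTypes" : {
        "domain_without_subdomains" : [ "whois" ]
      }
    }
  }
}
'''


def enrichments_by_field(config, sensor):
    """Return the field -> enrichment types mapping for one sensor."""
    sensor_cfg = config.get("sensorToFieldList", {}).get(sensor, {})
    return sensor_cfg.get("fieldToEnrichmentTypes", {})


config = json.loads(CONFIG_TEXT)  # raises json.JSONDecodeError on bad input
print(enrichments_by_field(config, "squid"))
```

An empty mapping for `squid` would suggest the sensor name or one of the nested keys was mangled during copy and paste.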
To demonstrate the enrichment capabilities of Metron, you need to drop all existing Squid indexes that were populated before enrichments were enabled. To do so, go back to the head plugin and delete the indexes like so:
No need to drop index
Make sure you delete all Squid indexes. Re-ingest the data (see previous blog post) and the messages should be automatically enriched. The new message should look as follows:
...