Enrichments

Enrichments add context to a streaming message. For example, if a message contains an external IP address, an enrichment can tag geo data onto that message. Similarly, if a message contains a domain name, an enrichment can tag a whois entry onto it. There are four primary benefits to adding context to a message via enrichments:

Correlation: if you know which user and asset the message is intended for and where it is coming from, it is easier to correlate it with other related messages
ML: having full context in the stream allows scoring against ML models in real time, as opposed to gathering the context in batch and then applying the model in batch
Accuracy: the underlying enrichment information changes constantly (users sign on and off, machines change IPs, etc.), so you want to enrich as close to the capture time as possible
Investigation: having full context for a given piece of metadata or alert means fewer consoles to fumble through and gets us closer to the 'single pane of glass' interface

Metron currently provides an extensible framework for plugging in enrichments. Each enrichment has two components: an enrichment data source and an enrichment bolt.

Before an enrichment capability can be enabled within Metron, the enrichment store (which for Metron is primarily HBase) has to be loaded with enrichment data. Enrichment data can either be bulk loaded from HDFS or be streamed into the enrichment store via a pluggable loading framework. The enrichment loader transforms the enrichment data into a JSON format that Metron understands. The loading framework can also age data out of the enrichment stores based on time.

Once the stores are loaded, an enrichment bolt that can interact with the enrichment store can be incorporated into the enrichment topology. Each enrichment bolt can enrich a specific field/tag within a Metron message. When a bolt recognizes that it is able to enrich a field, it reaches into the enrichment store, pulls out the enrichment, and tags the message with it. The enrichment is then stored in the bolt's in-memory cache.

Metron uses the underlying Storm routing capabilities to make sure that similar enrichment values are sent to the bolts that already have those values cached in memory. This avoids repeated lookups against the enrichment store and is a major contributor to Metron's scale and speed compared to big data streaming systems that lack this capability.
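The lookup, caching, and routing behavior described above can be sketched in a few lines. This is a minimal illustration and not Metron's actual implementation: the `EnrichmentBolt` class, the `route` function, the `enrichments.` key prefix, and the in-memory dictionary standing in for the enrichment store are all hypothetical names chosen for this example. The hash-based `route` function is similar in spirit to grouping tuples by field value in Storm.

```python
import zlib

# Hypothetical stand-in for the enrichment store (HBase in Metron).
ENRICHMENT_STORE = {
    "10.0.0.5": {"hostname": "build-server-01"},
    "10.0.0.9": {"hostname": "laptop-42"},
}


class EnrichmentBolt:
    """Enriches one field of a message, caching store lookups in memory."""

    def __init__(self, field, store):
        self.field = field
        self.store = store
        self.cache = {}        # enrichment values already fetched by this bolt
        self.store_reads = 0   # counts how often the store is actually hit

    def process(self, message):
        value = message.get(self.field)
        if value is None:
            return message  # nothing this bolt can enrich
        if value not in self.cache:
            # Cache miss: reach into the store once, then remember the result
            # (including a miss, stored as None).
            self.store_reads += 1
            self.cache[value] = self.store.get(value)
        enrichment = self.cache[value]
        if enrichment is None:
            return message
        enriched = dict(message)
        # Hypothetical key naming; the real message layout may differ.
        enriched["enrichments." + self.field] = enrichment
        return enriched


def route(value, bolt_count):
    """Pick a bolt index with a stable hash so equal values share a cache."""
    return zlib.crc32(value.encode("utf-8")) % bolt_count


bolts = [EnrichmentBolt("src_ip", ENRICHMENT_STORE) for _ in range(4)]


def enrich(message):
    bolt = bolts[route(message["src_ip"], len(bolts))]
    return bolt.process(message)


first = enrich({"src_ip": "10.0.0.5", "protocol": "tcp"})
second = enrich({"src_ip": "10.0.0.5", "protocol": "udp"})
```

Because both messages carry the same `src_ip`, they are routed to the same bolt, and only the first one triggers a read against the store; the second is served from that bolt's cache.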

The following enrichments are currently supported in Metron:

Enrichment	Description	Enrichment Store	Enrichment Source	Metron Message Field Name(s)	Loader Type	Refresh Rate	Metron Enrichment Architecture
GeoIP	Tags GeoIP data (lat-lon coordinates plus city/state/country) onto any external IP address. This can be applied to both alerts and metadata telemetries so that they can be mapped to a geographic location.	MySQL	Maxmind Geolite http://dev.maxmind.com/geoip/legacy/geolite/	src_ip, dst_ip	Bulk from HDFS	Once every 3 months	Geo Enrichment
Asset	Given an IP, determines the host name of the asset. Then, given the host name, returns everything else known about that asset from LDAP, AD, or enterprise inventory stores.	HBase	LDAP, AD, DNS logs, enterprise inventory stores	src_ip, dst_ip	Not yet provided. Roadmap item	Once every hour	Asset Enrichment
User	Given a session or an alert for a certain IP-application pair, identifies which user the session or alert belongs to.	HBase	LDAP, AD, proxy logs	src_ip + application	Not yet provided. Roadmap item	Once every 5 minutes	User Enrichment
	More to come....
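
As an illustration of the GeoIP row above, the following sketch tags geo data onto the `src_ip` and `dst_ip` fields of a message, but only for external addresses. It is a rough approximation under stated assumptions: the `geo_enrich` and `is_external` helpers, the `geo.` key prefix, and the placeholder geo record are hypothetical, and a plain dictionary stands in for the real enrichment store. "External" is approximated here with the standard library's `ipaddress` check for non-private addresses.

```python
import ipaddress

# Placeholder geo record for illustration only; not real geolocation data.
GEO_TABLE = {
    "8.8.8.8": {
        "country": "Exampleland",
        "city": "Sample City",
        "latitude": 0.0,
        "longitude": 0.0,
    },
}


def is_external(ip_text):
    """Treat any parseable, non-private address as external."""
    try:
        return not ipaddress.ip_address(ip_text).is_private
    except ValueError:
        return False  # not an IP address at all


def geo_enrich(message, geo_table, fields=("src_ip", "dst_ip")):
    """Return a copy of the message with geo data tagged onto external IPs."""
    enriched = dict(message)
    for field in fields:
        ip = message.get(field)
        if isinstance(ip, str) and is_external(ip) and ip in geo_table:
            # Hypothetical key naming; the real message layout may differ.
            enriched["geo." + field] = geo_table[ip]
    return enriched


message = {"src_ip": "8.8.8.8", "dst_ip": "10.0.0.5", "protocol": "tcp"}
result = geo_enrich(message, GEO_TABLE)
```

In this example only `src_ip` is tagged, because `dst_ip` is a private address and is therefore skipped.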
