Enrichments add context to a streaming message.  For example, if a message contains an external IP address, an enrichment could tag it with geolocation data.  Likewise, if a message contains a domain name, an enrichment could tag it with the corresponding whois entry.  There are four primary benefits to adding context to a message via enrichments:

  • Correlation: if you know which user and asset a message is intended for and where it is coming from, it is easier to correlate it with other related messages
  • ML: having full context at streaming time allows messages to be scored against ML models in real time, as opposed to gathering the context in batch and then applying the model in batch
  • Accuracy: the underlying enrichment information changes constantly (users sign on and off, machines change IPs, etc.), so you want to enrich as close to the capture time as possible
  • Investigation: having full context for a given piece of metadata or alert means fewer consoles to fumble through and gets us closer to the 'single pane of glass' interface

Metron currently provides an extensible framework for plugging in enrichments.  Each enrichment has two components: an enrichment data source and an enrichment bolt.

Before an enrichment capability can be enabled within Metron, the enrichment store (which for Metron is primarily HBase) has to be loaded with enrichment data.  Enrichment data can either be bulk loaded from HDFS or be streamed into the enrichment store via a pluggable loading framework.  The enrichment loader transforms the enrichment data into a JSON format that Metron understands.  The loading framework can also age data out of the enrichment stores based on time.

Once the stores are loaded, an enrichment bolt that can interact with the enrichment store can be incorporated into the enrichment topology.  Each enrichment bolt can enrich a specific field/tag within a Metron message.  When a bolt recognizes that it is able to enrich a field, it reaches into the enrichment store, pulls out the enrichment, and tags the message with it.  The enrichment is then stored within the bolt's in-memory cache.  Metron uses the underlying Storm routing capabilities to make sure that messages with the same enrichment value are sent to the bolt that already has that value cached in memory, so repeated values do not require another trip to the enrichment store.  This is what gives Metron its scale and speed advantage over big data streaming systems that do not have this capability.
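The flow described above can be illustrated with a minimal Python sketch. It is not Metron code: `EnrichmentStore`, `EnrichmentBolt`, `route`, the `ip_src_addr` field, the `enrichments.<field>.<name>` key naming, and the sample geo value are all hypothetical and chosen for illustration. A dictionary stands in for HBase, and a modulo hash stands in for Storm's routing, under the assumption that the routing is keyed on the value being enriched.

```python
import time
import zlib


class EnrichmentStore:
    """Hypothetical stand-in for the enrichment store (HBase in Metron)."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._rows = {}  # key -> (enrichment dict, load timestamp)

    def load(self, key, enrichment, now=None):
        self._rows[key] = (enrichment, time.time() if now is None else now)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        row = self._rows.get(key)
        if row is None:
            return None
        enrichment, loaded_at = row
        if now - loaded_at > self.ttl_seconds:
            del self._rows[key]  # age stale data out of the store
            return None
        return enrichment


class EnrichmentBolt:
    """Hypothetical bolt that enriches a single message field."""

    def __init__(self, field, store):
        self.field = field
        self.store = store
        self.cache = {}       # in-memory cache; a real one would bound its size
        self.store_reads = 0  # counts trips to the enrichment store

    def process(self, message):
        value = message.get(self.field)
        if value is None:
            return message  # nothing this bolt can enrich
        if value not in self.cache:
            self.store_reads += 1
            self.cache[value] = self.store.get(value)  # misses are cached too
        enrichment = self.cache[value]
        if enrichment is None:
            return message
        enriched = dict(message)
        for name, data in enrichment.items():
            enriched[f"enrichments.{self.field}.{name}"] = data  # assumed key naming
        return enriched


def route(value, bolt_count):
    """Deterministic hash, so equal values always reach the same bolt."""
    return zlib.crc32(value.encode("utf-8")) % bolt_count


store = EnrichmentStore(ttl_seconds=3600)
store.load("203.0.113.7", {"country": "US"})  # made-up sample enrichment
bolts = [EnrichmentBolt("ip_src_addr", store) for _ in range(3)]

for protocol in ("tcp", "udp"):
    message = {"ip_src_addr": "203.0.113.7", "protocol": protocol}
    bolt = bolts[route(message["ip_src_addr"], len(bolts))]
    print(bolt.process(message))
```

Because both messages carry the same IP, `route` sends them to the same bolt, and only the first one triggers a store read; the second is served from that bolt's cache. Routing on the enriched value rather than round-robin is what keeps each bolt's cache hit rate high in this sketch.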
