Metron Tutorial - Fundamentals Part 5: Threat Triage

In part 4, you learned how we can attach threat intelligence indicators to the messages that pass through the enrichment Storm topology. The problem, however, is that not all threat intelligence indicators are created equal. Some require immediate response, whereas others can be dealt with or investigated as time and availability permit. What we need is the ability to triage and rank threats by severity.

Now that we know what we should do, the next question is how to accomplish it; in other words, we must define exactly what we mean by "severity." Metron implements this by letting you associate possibly complex conditions with numeric scores. For each message, the set of conditions is evaluated, and the scores of the matching conditions are combined by a configurable aggregation function. The aggregated score is added to the message in the threat.triage.level field. Let's dig a bit deeper into this and work through an example.

Stellar Language

The heart of the problem is how one defines a "condition." In Metron, we provide a custom domain-specific language, Stellar, for defining conditions.

The query language supports the following:

Referencing fields in the enriched JSON
Simple boolean operations:
- and, &&
- not
- or, ||
Determining whether a field exists (via exists)
The ability to use parentheses to make order of operations explicit
A fixed set of functions which take strings and return a boolean. Currently:
- IN_SUBNET(ip, cidr1, cidr2, ...)
- IS_EMPTY(str)
- STARTS_WITH(str, prefix)
- ENDS_WITH(str, suffix)
- REGEXP_MATCH(str, pattern)
A fixed set of string to string transformation functions. Currently:
- TO_LOWER
- TO_UPPER
- TRIM

Consider, for example, the following JSON message:

...
  "src_ip_addr" : "192.168.0.1"
 ,"is_local" : true
...

Consider the query:

IN_SUBNET( src_ip_addr, '192.168.0.0/24') or src_ip_addr in [ '10.0.0.1', '10.0.0.2' ] or exists(is_local)

This evaluates to true precisely when at least one of the following is true for a message:

The value of the src_ip_addr field is in the 192.168.0.0/24 subnet
The value of the src_ip_addr field is 10.0.0.1 or 10.0.0.2
The field is_local exists
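
To make the semantics of that query concrete, here is a minimal Python sketch that mirrors its three clauses using only the standard library. It is not Stellar and not how Metron evaluates queries; the helper names in_subnet and matches are hypothetical stand-ins chosen for illustration.

```python
import ipaddress

def in_subnet(ip, *cidrs):
    """Rough stand-in for IN_SUBNET: true if ip falls in any of the CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)

def matches(message):
    """Mirrors the three-way `or` of the example query (hypothetical helper)."""
    src = message.get("src_ip_addr")
    return (
        (src is not None and in_subnet(src, "192.168.0.0/24"))
        or src in ("10.0.0.1", "10.0.0.2")
        or "is_local" in message  # exists(is_local): presence, not truthiness
    )

print(matches({"src_ip_addr": "192.168.0.1", "is_local": True}))  # True
print(matches({"src_ip_addr": "10.0.0.2"}))                       # True
print(matches({"src_ip_addr": "8.8.8.8"}))                        # False
```

Note that the last clause checks only whether the field is present, so a message with "is_local" : false would still match.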

Threat Triage Configuration

Now that we have the ability to define conditions, we need to associate these conditions with scores for each sensor. Since this is a per-sensor configuration, it fits nicely within the sensor enrichment configuration held in ZooKeeper, specifically in the threatIntel section, like so:

...
  ,"threatIntel" : {
...
           , "triageConfig" : {
                     "riskLevelRules" : {
                                 "condition1" : level1
                               , "condition2" : level2
...
                     ,"aggregator" : "MAX"

riskLevelRules is the set of condition-to-numeric-level mappings that define the threat triage for this particular sensor. aggregator is an aggregation function that takes all non-zero scores of the matching queries from riskLevelRules and aggregates them into a single score. The currently supported aggregation functions are:

MAX : The max of all of the associated values for matching queries
MIN : The min of all of the associated values for matching queries
MEAN : The mean of all of the associated values for matching queries
POSITIVE_MEAN : The mean of the positive associated values for the matching queries.
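
The four aggregators can be sketched in a few lines of Python. This is an illustration of the descriptions above, not Metron's implementation; the names AGGREGATORS, positive_mean, and aggregate are hypothetical, and the handling of the "nothing matched" and "no positive scores" cases is an assumption.

```python
from statistics import mean

def positive_mean(scores):
    """Mean of the strictly positive scores; 0 if there are none (assumption)."""
    positives = [s for s in scores if s > 0]
    return mean(positives) if positives else 0

# Hypothetical lookup table keyed by the aggregator names listed above.
AGGREGATORS = {
    "MAX": max,
    "MIN": min,
    "MEAN": mean,
    "POSITIVE_MEAN": positive_mean,
}

def aggregate(name, matched_scores):
    """Combine the scores of the matching rules; None if no rule matched (assumption)."""
    if not matched_scores:
        return None
    return AGGREGATORS[name](matched_scores)

matched = [5, 10]  # scores of the rules whose conditions matched
print(aggregate("MAX", matched))            # 10
print(aggregate("MIN", matched))            # 5
print(aggregate("MEAN", matched))           # 7.5
print(aggregate("POSITIVE_MEAN", matched))  # 7.5
```

In this sketch, MEAN and POSITIVE_MEAN differ only when some matching rule carries a non-positive score.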

Example

Part 4 left off with a working threat intelligence enrichment. Now, let's see if we can triage those threats for the squid data flowing through. In particular, let's score the threat alerts for the squid sensor data according to the following conditions:

If the threat intel enrichment type zeusList, as defined in part 4, is alerted, then we want to consider that an alert with a score of 5
If the URL is neither a .com nor a .net, then we want to consider that an alert with a score of 10

For each message we will assign the maximum score across all conditions as the triage score. This translates into the following configuration:

...
  ,"threatIntel" : {
...
           , "triageConfig" : {
                     "riskLevelRules" : {
                                 "exists(threatintels.hbaseThreatIntel.domain_without_subdomains.zeusList)" : 5
                               , "not(ENDS_WITH(domain_without_subdomains, '.com') or ENDS_WITH(domain_without_subdomains, '.net'))" : 10
                     }
                     ,"aggregator" : "MAX"

In order to apply this triage configuration, we must modify the configuration for the squid sensor in the enrichment topology. To do this, we should modify $METRON_HOME/config/zookeeper/enrichments/squid.json on node1. However, since the configuration in ZooKeeper may be out of sync with the configuration on disk, we must first make sure they are in sync by executing the following command:

$METRON_HOME/bin/zk_load_configs.sh -m PULL -z node1:2181 -f -o $METRON_HOME/config/zookeeper

We should ensure that the configuration for squid exists by inspecting it:

cat $METRON_HOME/config/zookeeper/enrichments/squid.json

Now we can edit the configuration. In $METRON_HOME/config/zookeeper/enrichments/squid.json, edit the section titled riskLevelRules and add the two rules above to the map:

"exists(threatintels.hbaseThreatIntel.domain_without_subdomains.zeusList)" : 5

"not(ENDS_WITH(domain_without_subdomains, '.com') or ENDS_WITH(domain_without_subdomains, '.net'))" : 10

Also, ensure that the aggregator field is set to MAX.
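
If you would rather script this edit than make it by hand, the following Python sketch shows one way to merge the rules into a configuration dictionary. It assumes the threatIntel / triageConfig / riskLevelRules layout shown earlier; the function name apply_triage_rules and the small example dictionary are hypothetical, and the real squid.json contains other sections not shown here.

```python
import json

def apply_triage_rules(config, rules, aggregator="MAX"):
    """Return a copy of an enrichment config with the triage rules merged in.

    Assumes the layout threatIntel -> triageConfig -> riskLevelRules.
    """
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    triage = updated.setdefault("threatIntel", {}).setdefault("triageConfig", {})
    triage.setdefault("riskLevelRules", {}).update(rules)
    triage["aggregator"] = aggregator
    return updated

rules = {
    "exists(threatintels.hbaseThreatIntel.domain_without_subdomains.zeusList)": 5,
    "not(ENDS_WITH(domain_without_subdomains, '.com') or ENDS_WITH(domain_without_subdomains, '.net'))": 10,
}

# Stand-in for the parsed contents of squid.json (illustrative only).
example = {"threatIntel": {"triageConfig": {"riskLevelRules": {}, "aggregator": "MAX"}}}
print(json.dumps(apply_triage_rules(example, rules), indent=2))
```

To use this against the real file, you would load it with json.load, pass the result through the function, and write it back with json.dump before pushing. A side benefit is that json.load fails loudly on malformed JSON, which catches typos before they reach ZooKeeper.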

After modifying the configuration, we can push it back to ZooKeeper and have the enrichment topology pick it up with live data via:

$METRON_HOME/bin/zk_load_configs.sh -m PUSH -z node1:2181 -i $METRON_HOME/config/zookeeper

Now, reload the data from part 4 via

tail /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid

If we then check the squid index using the Elasticsearch head plugin, we can see the threats triaged as we would expect:

Non-Threat Data

For URLs from cnn.com, we see no threat alert, so no triage level is set. Notice the lack of a threat.triage.level field.

Threat Data from alamman.com has a triage level of 5

Because alamman.com is a malicious host from the zeusList threat intel feed but is a .com address, it's assigned threat.triage.level of 5.

Threat Data from atmape.ru has a triage level of 10

Because atmape.ru is both a malicious host from the zeusList threat intel feed and a non-.com, non-.net address, both rules match and the MAX aggregator assigns it a threat.triage.level of 10.
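
The three outcomes above can be reproduced with a small Python sketch of the two squid rules and the MAX aggregator. This is a hypothetical re-implementation for illustration, not Metron code. It assumes that triage runs only on messages carrying a threat intel hit (inferred from the cnn.com case), and the field value "alert" is a placeholder since only the field's presence matters here.

```python
ZEUS_FIELD = "threatintels.hbaseThreatIntel.domain_without_subdomains.zeusList"

def triage_level(message):
    """Hypothetical version of the two squid rules combined with MAX."""
    # Assumption: messages without any threat intel hit are not triaged at all.
    if not any(key.startswith("threatintels.") for key in message):
        return None
    domain = message.get("domain_without_subdomains", "")
    scores = []
    # Rule 1: the zeusList enrichment field exists on the message.
    if ZEUS_FIELD in message:
        scores.append(5)
    # Rule 2: the domain ends in neither .com nor .net.
    if not (domain.endswith(".com") or domain.endswith(".net")):
        scores.append(10)
    return max(scores) if scores else None

messages = [
    {"domain_without_subdomains": "cnn.com"},
    {"domain_without_subdomains": "alamman.com", ZEUS_FIELD: "alert"},
    {"domain_without_subdomains": "atmape.ru", ZEUS_FIELD: "alert"},
]
for msg in messages:
    print(msg["domain_without_subdomains"], triage_level(msg))
# cnn.com None
# alamman.com 5
# atmape.ru 10
```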
