...
- SSH into the host $HOST_WITH_ENRICHMENT_TAG as root.
- Create a Squid Grok parser configuration file at /usr/metron/$METRON_VERSION/config/zookeeper/parsers/squid.json:
touch /usr/metron/$METRON_VERSION/config/zookeeper/parsers/squid.json
- Add the following contents:
{
"parserClassName": "org.apache.metron.parsers.GrokParser",
"sensorTopic": "squid",
"parserConfig": {
"grokPath": "/apps/metron/patterns/squid",
"patternLabel": "SQUID_DELIMITED",
"timestampField": "timestamp"
},
"fieldTransformations" : [
{
"transformation" : "STELLAR"
,"output" : [ "full_hostname", "domain_without_subdomains" ]
,"config" : {
"full_hostname" : "URL_TO_HOST(url)"
,"domain_without_subdomains" : "DOMAIN_REMOVE_SUBDOMAINS(full_hostname)"
}
}
]}
Notice the use of fieldTransformations in the parser configuration. Our Grok parser is set up to extract the URL, but what we really want is the domain, or even the domain without subdomains. To get it, we can use the Metron Transformation Language field transformation (the "STELLAR" transformation in the configuration above). The Metron Transformation Language is a domain-specific language that lets users define extra transformations to apply to the messages flowing through the topology. It supports a wide range of common network and string-related functions as well as function composition and list operations. In our case, we extract the hostname from the URL with the URL_TO_HOST function and remove the subdomains with DOMAIN_REMOVE_SUBDOMAINS, thereby adding two new fields, "full_hostname" and "domain_without_subdomains", to each message.
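To make the effect of the two transformations concrete, the following is a rough shell approximation of what they compute for a simple URL. It is only an illustration, not Metron's implementation: the function names url_to_host and domain_without_subdomains are hypothetical stand-ins, and the "keep the last two labels" rule is an assumption that does not hold for multi-part suffixes such as co.uk.

```shell
#!/usr/bin/env bash
# Illustrative stand-ins for URL_TO_HOST and DOMAIN_REMOVE_SUBDOMAINS.
# These are hypothetical helpers, NOT the Metron functions themselves.

# Approximates URL_TO_HOST: drop the scheme, path, query, userinfo, and port.
url_to_host() {
  local url="$1"
  local rest="${url#*://}"     # remove a leading "scheme://" if present
  rest="${rest%%/*}"           # remove the path
  rest="${rest%%\?*}"          # remove a query string that follows the host directly
  rest="${rest##*@}"           # remove "user:password@" if present
  printf '%s\n' "${rest%%:*}"  # remove ":port" if present
}

# Approximates DOMAIN_REMOVE_SUBDOMAINS by keeping the last two labels.
# Assumption: a single-label suffix (.com, .org); this is wrong for "co.uk".
domain_without_subdomains() {
  local host="$1"
  printf '%s\n' "$host" |
    awk -F. '{ if (NF >= 2) print $(NF-1) "." $NF; else print $0 }'
}

# Example: derive both new fields from one URL.
full_hostname="$(url_to_host 'http://www.example.com:8080/a/b?x=1')"
echo "full_hostname=$full_hostname"
echo "domain_without_subdomains=$(domain_without_subdomains "$full_hostname")"
```

As in the parser configuration, the second value is derived from the first, which is why "full_hostname" is listed before "domain_without_subdomains".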
4. All parser configurations are stored in ZooKeeper. Use the following script to upload the configurations to ZooKeeper:
/usr/metron/$METRON_VERSION/bin/zk_load_configs.sh --mode PUSH -i /usr/metron/$METRON_VERSION/config/zookeeper -z $ZOOKEEPER_HOST:2181
Note: You might receive the following warning messages when you execute the previous command. You can safely ignore these warning messages.
log4j:WARN No appenders could be found for logger (org.apache.curator.framework.imps.CuratorFrameworkImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
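Before running the push, it can help to confirm that the parser file is well-formed JSON, since a stray quote or bracket is easy to introduce when pasting. The sketch below assumes python3 is available on the host; validate_json_file is a hypothetical helper name, not part of Metron.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Metron): checks that a file parses as JSON.
# Assumes python3 is installed; json.tool exits non-zero on a parse error.
validate_json_file() {
  local file="$1"
  if python3 -m json.tool "$file" > /dev/null 2>&1; then
    echo "OK: $file"
  else
    echo "INVALID JSON: $file" >&2
    return 1
  fi
}

# Possible usage before the PUSH command shown above:
# validate_json_file "/usr/metron/$METRON_VERSION/config/zookeeper/parsers/squid.json"
```

The check only verifies JSON syntax; it does not confirm that the field names or the Grok pattern are what the parser expects.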
...
1. Create a file called squid.json at /usr/metron/$METRON_VERSION/config/zookeeper/indexing/:
touch /usr/metron/$METRON_VERSION/config/zookeeper/indexing/squid.json
2. Populate it with the following:
{
"elasticsearch": {
"index": "squid",
"batchSize": 5,
"enabled" : true
},
"hdfs": {
"index": "squid",
"batchSize": 5,
"enabled" : true
}
}
This file sets the batch size to 5 and the index name to squid for both the Elasticsearch and HDFS writers.
3. Push the configuration to ZooKeeper:
/usr/metron/$METRON_VERSION/bin/zk_load_configs.sh --mode PUSH -i /usr/metron/$METRON_VERSION/config/zookeeper -z $ZOOKEEPER_HOST:2181
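The indexing file from step 2 uses the same index name and batch size for both writers, so it could also be generated rather than typed by hand. The following is a minimal sketch under that assumption; write_indexing_config and its three arguments are hypothetical and not part of Metron.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Metron): writes an indexing configuration
# that uses one index name and one batch size for both writers.
# Arguments: <sensor name> <batch size> <output file>
write_indexing_config() {
  local sensor="$1" batch_size="$2" out_file="$3"
  cat > "$out_file" <<EOF
{
  "elasticsearch": {
    "index": "${sensor}",
    "batchSize": ${batch_size},
    "enabled": true
  },
  "hdfs": {
    "index": "${sensor}",
    "batchSize": ${batch_size},
    "enabled": true
  }
}
EOF
}

# Possible usage, matching the values used in this section:
# write_indexing_config squid 5 "/usr/metron/$METRON_VERSION/config/zookeeper/indexing/squid.json"
```

After generating the file this way, the same PUSH command is still needed to upload it to ZooKeeper.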
Step 6: Validate the Squid Message
...