...

cat /var/log/squid/access.log | ${HDP_HOME}/kafka-broker/bin/kafka-console-producer.sh --broker-list $BROKERLIST --topic squid

${HDP_HOME}/kafka-broker/bin/kafka-console-consumer.sh --zookeeper $ZOOKEEPER --topic squid --from-beginning
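
Both commands above rely on HDP_HOME, BROKERLIST and ZOOKEEPER being set in the current shell. A minimal sketch of a guard that fails early when one of them is empty might look like this; require_vars is a hypothetical helper name, not something shipped with Metron or Kafka.

```shell
# Hypothetical helper (not part of Metron or Kafka): fail early if any of the
# named environment variables is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage before running the producer and consumer commands:
# require_vars HDP_HOME BROKERLIST ZOOKEEPER || exit 1
```

The function only checks that the variables are non-empty; it does not verify that the broker or ZooKeeper addresses are reachable.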


Note: The following steps for manually creating the Grok expression, copying the pattern to HDFS, and creating the parser and indexing JSON configs for the sensor are no longer necessary in full dev. The files are installed by default, and you can simply start the squid topology as described below to achieve the end result of these steps.


This should ingest our Squid logs into Kafka. Now we are ready to tackle the Metron parsing topology setup. The first thing we need to do is decide whether to use a Java-based parser or a Grok-based parser for the new telemetry. In this example we will use the Grok parser, which is a good fit for structured or semi-structured logs that are well understood (check) and for telemetries with lower volumes of traffic (check). Next, we need to define the Grok expression for our log. Refer to the Grok documentation for additional details. In our case the pattern is:

...

${METRON_HOME}/bin/zk_load_configs.sh -m DUMP -z $ZOOKEEPER

Now, install an Elasticsearch template for your new sensor so that we can effectively query results in the Metron Alerts UI.

Note: This is a new step that is necessary as of the meta alerts feature and the Elasticsearch 5.6.2 upgrade.

Run the following commands from the CLI.

curl -XPUT 'http://node1:9200/_template/squid_index' -d '
{
  "template": "squid_index*",
  "mappings": {
    "squid_doc": {
      "dynamic_templates": [
      {
        "geo_location_point": {
          "match": "enrichments:geo:*:location_point",
          "match_mapping_type": "*",
          "mapping": {
            "type": "geo_point"
          }
        }
      },
      {
        "geo_country": {
          "match": "enrichments:geo:*:country",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_city": {
          "match": "enrichments:geo:*:city",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_location_id": {
          "match": "enrichments:geo:*:locID",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_dma_code": {
          "match": "enrichments:geo:*:dmaCode",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_postal_code": {
          "match": "enrichments:geo:*:postalCode",
          "match_mapping_type": "*",
          "mapping": {
            "type": "keyword"
          }
        }
      },
      {
        "geo_latitude": {
          "match": "enrichments:geo:*:latitude",
          "match_mapping_type": "*",
          "mapping": {
            "type": "float"
          }
        }
      },
      {
        "geo_longitude": {
          "match": "enrichments:geo:*:longitude",
          "match_mapping_type": "*",
          "mapping": {
            "type": "float"
          }
        }
      },
      {
        "timestamps": {
          "match": "*:ts",
          "match_mapping_type": "*",
          "mapping": {
            "type": "date",
            "format": "epoch_millis"
          }
        }
      },
      {
        "threat_triage_score": {
          "mapping": {
            "type": "float"
          },
          "match": "threat:triage:*score",
          "match_mapping_type": "*"
        }
      },
      {
        "threat_triage_reason": {
          "mapping": {
            "type": "text",
            "fielddata": "true"
          },
          "match": "threat:triage:rules:*:reason",
          "match_mapping_type": "*"
        }
      },
      {
        "threat_triage_name": {
          "mapping": {
            "type": "text",
            "fielddata": "true"
          },
          "match": "threat:triage:rules:*:name",
          "match_mapping_type": "*"
        }
      }
      ],
      "properties": {
        "timestamp": {
          "type": "date",
          "format": "epoch_millis"
        },
        "source:type": {
          "type": "keyword"
        },
        "ip_dst_addr": {
          "type": "ip"
        },
        "ip_dst_port": {
          "type": "integer"
        },
        "ip_src_addr": {
          "type": "ip"
        },
        "ip_src_port": {
          "type": "integer"
        },
        "alert": {
          "type": "nested"
        },
        "guid": {
          "type": "keyword"
        }
      }
    }
  }
}
'
# Verify that the template was installed as expected
curl -XGET 'http://node1:9200/_template/squid_index?pretty'
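
If you want to script the verification rather than read the JSON by eye, one option is to look at the HTTP status of the template lookup: Elasticsearch answers 200 when the template exists and 404 when it does not. The sketch below assumes the same node1:9200 host as above; check_template_status is a hypothetical helper name.

```shell
# Hypothetical helper: interpret the HTTP status code of the template lookup.
check_template_status() {
  case "$1" in
    200) echo "installed" ;;
    404) echo "missing" ;;
    *)   echo "unexpected:$1" ;;
  esac
}

# Usage (needs a reachable Elasticsearch node; -w prints only the status code):
# status=$(curl -s -o /dev/null -w '%{http_code}' 'http://node1:9200/_template/squid_index')
# check_template_status "$status"
```

A status of 200 only confirms that a template with that name exists; reviewing the pretty-printed output is still the way to confirm the mappings themselves.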

This template accomplishes two things:

  1. Sets up default mappings for Metron-specific types, e.g. timestamps.
  2. Sets up types for properties that will come from the parsed data, e.g. ip_src_addr.
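
To make the first point more concrete, the sketch below mimics how the "match" globs in the template pick a dynamic template by field name. Fields not listed under "properties" are checked against the dynamic templates in order, and the first match wins. Shell case patterns stand in for Elasticsearch's wildcard matching here, so treat this as an illustration rather than Elasticsearch's actual implementation; match_template and the sample field names passed to it are illustrative.

```shell
# Illustrative only: report which dynamic template from the squid_index
# template a field name would match. Patterns are listed in the same order
# as in the template, and the first match wins.
match_template() {
  case "$1" in
    enrichments:geo:*:location_point) echo "geo_location_point" ;;
    enrichments:geo:*:country)        echo "geo_country" ;;
    enrichments:geo:*:city)           echo "geo_city" ;;
    enrichments:geo:*:locID)          echo "geo_location_id" ;;
    enrichments:geo:*:dmaCode)        echo "geo_dma_code" ;;
    enrichments:geo:*:postalCode)     echo "geo_postal_code" ;;
    enrichments:geo:*:latitude)       echo "geo_latitude" ;;
    enrichments:geo:*:longitude)      echo "geo_longitude" ;;
    *:ts)                             echo "timestamps" ;;
    threat:triage:*score)             echo "threat_triage_score" ;;
    threat:triage:rules:*:reason)     echo "threat_triage_reason" ;;
    threat:triage:rules:*:name)       echo "threat_triage_name" ;;
    *)                                echo "none" ;;
  esac
}

match_template "enrichments:geo:ip_dst_addr:country"   # prints geo_country
match_template "adapter:geoadapter:begin:ts"           # prints timestamps
match_template "threat:triage:score"                   # prints threat_triage_score
match_template "url"                                   # prints none
```

A result of "none" means no dynamic template applies, in which case Elasticsearch falls back to its default dynamic mapping for that field.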

Now start the new squid parser topology:

...