...

The following steps guide you through how to add this new telemetry.

Step 1: Install the Squid Sensor

  1. ssh into $SQUID_HOST
  2. Install and start Squid:
    sudo yum install squid
    sudo service squid start
  3. With Squid started, look at the different log files that get created:
    sudo su -
    cd /var/log/squid
    ls

    You see that there are three types of logs available: access.log, cache.log, and squid.out. We are interested in access.log because that is the log that records the proxy usage.

  4. Initially the access.log is empty. Let's generate a few entries for the log, then list the new contents of the access.log:

    squidclient "http://www.aliexpress.com/af/shoes.html?ltype=wholesale&d=y&origin=n&isViewCP=y&catId=0&initiative_id=SB_20160622082445&SearchText=shoes"
    squidclient "http://www.help.1and1.co.uk/domains-c40986/transfer-domains-c79878"
    squidclient "http://www.pravda.ru/science/"
    squidclient "https://www.google.com/maps/place/Waterford,+WI/@42.7639877,-88.2867248,12z/data=!4m5!3m4!1s0x88059e67de9a3861:0x2d24f51aad34c80b!8m2!3d42.7630722!4d-88.2142563"
    squidclient "http://www.brightsideofthesun.com/2016/6/25/12027078/anatomy-of-a-deal-phoenix-suns-pick-bender-chriss"
    squidclient "https://www.microsoftstore.com/store/msusa/en_US/pdp/Microsoft-Band-2-Charging-Stand/productID.329506400"
    squidclient "http://www.autonews.com/article/20151115/RETAIL04/311169971/toyota-fj-cruiser-is-scarce-hot-and-high-priced"
    squidclient "https://tfl.gov.uk/plan-a-journey/"
    squidclient "https://www.facebook.com/Africa-Bike-Week-1550200608567001/"
    squidclient "http://www.ebay.com/itm/02-Infiniti-QX4-Rear-spoiler-Air-deflector-Nissan-Pathfinder-/172240020293?fits=Make%3AInfiniti%7CModel%3AQX4&hash=item281a4e2345:g:iMkAAOSwoBtW4Iwx&vxp=mtr"
    squidclient "http://www.recruit.jp/corporate/english/company/index.html"
    squidclient "http://www.lada.ru/en/cars/4x4/3dv/about.html"
    squidclient "http://www.help.1and1.co.uk/domains-c40986/transfer-domains-c79878"
    squidclient "http://www.aliexpress.com/af/shoes.html?ltype=wholesale&d=y&origin=n&isViewCP=y&catId=0&initiative_id=SB_20160622082445&SearchText=shoes"
    cat /var/log/squid/access.log

    In production environments you would configure your users' web browsers to point to the proxy server, but to keep this tutorial simple we use the client that is packaged with the Squid installation. After you use the client to simulate proxy requests, the Squid log entries should look as follows:

    1467011157.401 415 127.0.0.1 TCP_MISS/200 337891 GET http://www.aliexpress.com/af/shoes.html? - DIRECT/207.109.73.154 text/html
    1467011158.083 671 127.0.0.1 TCP_MISS/200 41846 GET http://www.help.1and1.co.uk/domains-c40986/transfer-domains-c79878 - DIRECT/212.227.34.3 text/html
    1467011159.978 1893 127.0.0.1 TCP_MISS/200 153925 GET http://www.pravda.ru/science/ - DIRECT/185.103.135.90 text/html
  5. Using the Squid log entries, we can determine the format of the log entries, which is:

    timestamp | time elapsed | remotehost | code/status | bytes | method | URL | rfc931 | peerstatus/peerhost | type
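
To make the field layout concrete, here is a minimal shell sketch that splits one access.log line on whitespace and prints each field by name. The function name parse_squid_line and the field labels are hypothetical, chosen only for illustration; they are not part of Squid or Metron, and the sketch assumes every line has exactly ten whitespace-separated fields, as the sample entries do.

```shell
#!/bin/sh
# Hypothetical helper (not part of Squid or Metron): print the ten
# whitespace-separated fields of one access.log line as name=value pairs.
parse_squid_line() {
  set -f        # disable globbing so "?" or "*" in a URL is not expanded
  set -- $1     # split the line on whitespace into $1 .. ${10}
  set +f        # re-enable globbing (assumes it was enabled beforehand)
  printf 'timestamp=%s\n' "$1"
  printf 'elapsed=%s\n' "$2"
  printf 'remotehost=%s\n' "$3"
  printf 'code_status=%s\n' "$4"
  printf 'bytes=%s\n' "$5"
  printf 'method=%s\n' "$6"
  printf 'url=%s\n' "$7"
  printf 'rfc931=%s\n' "$8"
  printf 'peer=%s\n' "$9"
  printf 'type=%s\n' "${10}"
}

# Sample entry taken from the log output shown above.
parse_squid_line '1467011159.978 1893 127.0.0.1 TCP_MISS/200 153925 GET http://www.pravda.ru/science/ - DIRECT/185.103.135.90 text/html'
```

Splitting on whitespace is enough for these sample lines because none of the fields contains a space; a parser for production data would need to handle malformed or truncated lines as well.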

Step 2: Create a Kafka Topic for the New Data Source

Every data source whose events you stream into Metron must have its own Kafka topic. The ingestion tool of choice (for example, Apache NiFi) pushes events into this Kafka topic. To create the topic:
  1. Log in to $KAFKA_HOST as root.
  2. Create a Kafka topic called squid:
    1. /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper $ZOOKEEPER_HOST:2181 --create --topic squid --partitions 1 --replication-factor 1
  3. List all of the Kafka topics to ensure that the new topic exists:
    1. /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper $ZOOKEEPER_HOST:2181 --list
  4. You should see the following list of Kafka topics:
  • bro
  • enrichment
  • pcap
  • snort
  • squid
  • yaf
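
If you want to script the check instead of reading the list by eye, the following sketch may help. The function topic_in_list is a hypothetical helper, not a Kafka or Metron command; it only tests whether a topic name appears as a whole line in whatever list is piped into it.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the topic name given as $1 appears as a
# whole line on standard input (for example, the output of the --list command).
topic_in_list() {
  grep -qxF -- "$1"   # -x: match the entire line, -F: treat the name literally
}

# Against a live cluster, pipe the --list command from step 3 into the helper:
#   kafka-topics.sh --zookeeper $ZOOKEEPER_HOST:2181 --list | topic_in_list squid

# Offline demonstration using the topic list shown above.
if printf '%s\n' bro enrichment pcap snort squid yaf | topic_in_list squid; then
  echo "squid topic exists"
fi
```

Matching the whole line avoids a false positive from a longer topic name that merely contains the word squid.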

...

Step 3: Create a Grok Statement to Parse the Squid Telemetry Event

...