Intro

One of the key design principles of Apache Metron is that it should be easily extensible. We envision many users using Metron as a platform and building custom capabilities on top of it, one of which will be adding new telemetry data sources. In this documentation, we walk you through how to add a new telemetry data source: Squid proxy logs.

Setting up the Use Case Scenario

Customer Foo has installed Metron TP1 and they are using the out-of-the-box data sources (PCAP, YAF/Netflow, Snort, and Bro). They love Metron! But now they want to add a new data source to the platform: Squid proxy logs.

Customer Foo's Requirements

The following are the customer's requirements for Metron with respect to this new data source:

  1. The proxy events from Squid logs need to be ingested in real time.
  2. The proxy logs must be parsed into a standardized JSON structure that Metron can understand.
  3. In real time, each Squid proxy event must be enriched so that its domain names are associated with the corresponding IP information.
  4. In real time, the IP within the proxy event must be checked against threat intel feeds.
  5. If there is a threat intel hit, an alert needs to be raised.
  6. The end user must be able to see the new telemetry events and the alerts from the new data source.
  7. All of these requirements will need to be implemented easily without writing any new Java code.

What is Squid?

Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently requested web pages. For more information on Squid, see Squid-cache.org.

How Metron Enriches a Squid Telemetry Event

When you make an outbound HTTP connection to https://www.cnn.com from a given host, Squid records the request as an entry in its access.log file.
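
To make the idea of a "standardized JSON structure" concrete, here is a minimal Python sketch that parses one line in Squid's default (native) access.log layout into a JSON document. The sample line is made up for illustration (it uses documentation-reserved addresses and example.com) and is not the entry from this walkthrough. The output field names are assumptions as well, not necessarily the names Metron emits, and Metron itself does this step through configuration rather than custom code.

```python
import json

# Illustrative line in Squid's native access.log layout (not real traffic):
# time elapsed client result/status bytes method URL user hierarchy/peer type
SAMPLE_LINE = (
    "1700000000.250 161 192.0.2.10 TCP_MISS/200 103701 GET "
    "http://www.example.com/ - DIRECT/198.51.100.7 text/html"
)


def parse_squid_line(line):
    """Parse one native-format access.log line into a flat dict.

    The output field names are hypothetical, chosen for this sketch.
    """
    parts = line.split()
    if len(parts) != 10:
        raise ValueError("expected 10 whitespace-separated fields")
    ts, elapsed, client, result, size, method, url, user, hier, ctype = parts

    # "TCP_MISS/200" -> cache result code and HTTP status
    action, _, status = result.partition("/")
    # "DIRECT/198.51.100.7" -> hierarchy code and peer address
    hierarchy, _, peer = hier.partition("/")

    return {
        "timestamp": int(round(float(ts) * 1000)),  # epoch milliseconds
        "elapsed": int(elapsed),                    # milliseconds
        "ip_src_addr": client,
        "action": action,
        "code": int(status),
        "bytes": int(size),
        "method": method,
        "url": url,
        "user": None if user == "-" else user,      # "-" means not available
        "hierarchy": hierarchy,
        "ip_dst_addr": None if peer == "-" else peer,
        "content_type": ctype,
    }


if __name__ == "__main__":
    print(json.dumps(parse_squid_line(SAMPLE_LINE), indent=2))
```

Splitting on whitespace is enough here because none of the native-format fields contain spaces. A customized Squid logformat would need a different parser.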

Metron then processes this telemetry event as it streams through the platform in real time: the event is parsed, enriched with IP information, and checked against threat intel feeds, as outlined in the requirements above.
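
The enrichment and threat intel steps can be pictured with a small Python sketch. This is only an illustration of the flow described in the requirements, not how Metron implements it: the lookup table, the threat list, and the field names `domain`, `domain_ip`, and `is_alert` are all hypothetical stand-ins, and the data is made up.

```python
from urllib.parse import urlparse

# Hypothetical stand-ins for an enrichment store and a threat intel feed.
DOMAIN_TO_IP = {"www.example.com": "198.51.100.7"}
THREAT_INTEL_IPS = {"198.51.100.7"}


def enrich(event, domain_to_ip):
    """Return a copy of the event with the URL's domain and its IP added."""
    enriched = dict(event)
    domain = urlparse(event["url"]).hostname
    enriched["domain"] = domain
    enriched["domain_ip"] = domain_to_ip.get(domain)  # None if unknown
    return enriched


def check_threat_intel(event, bad_ips):
    """Return a copy of the event flagged as an alert on a threat intel hit."""
    checked = dict(event)
    checked["is_alert"] = checked.get("domain_ip") in bad_ips
    return checked


if __name__ == "__main__":
    parsed = {"url": "http://www.example.com/", "ip_src_addr": "192.0.2.10"}
    result = check_threat_intel(enrich(parsed, DOMAIN_TO_IP), THREAT_INTEL_IPS)
    print(result)
```

Each step returns a new dict instead of mutating its input, which mirrors the idea of an event picking up fields as it moves through successive stages.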

Key Points

Some key points to highlight:

  • We will add an entirely new data source without writing any code. Metron strives for easy extensibility, and this is a good example of it.
  • This is a repeatable pattern for a majority of telemetry data sources.