Intro
One of the key design principles of Apache Metron is that it should be easily extensible. We envision many users treating Metron as a platform and building custom capabilities on top of it, one of which will be adding new telemetry data sources. This documentation walks you through adding a new telemetry data source: Squid proxy logs.
Setting up the Use Case Scenario
Customer Foo has installed Metron TP1 and they are using the out-of-the-box data sources (PCAP, YAF/Netflow, Snort, and Bro). They love Metron! But now they want to add a new data source to the platform: Squid proxy logs.
Customer Foo's Requirements
The following are the customer's requirements for Metron with respect to this new data source:
- The proxy events from Squid logs need to be ingested in real time.
- The proxy logs must be parsed into a standardized JSON structure that Metron can understand.
- In real time, each Squid proxy event needs to be enriched so that its domain names are accompanied by their IP information.
- In real time, the IP within the proxy event must be checked against threat intel feeds.
- If there is a threat intel hit, an alert needs to be raised.
- The end user must be able to see the new telemetry events and the alerts from the new data source.
- All of these requirements will need to be implemented easily without writing any new Java code.
What is Squid?
Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently requested web pages. For more information on Squid, see Squid-cache.org.
How Metron Enriches a Squid Telemetry Event
When you make an outbound HTTP connection to https://www.cnn.com from a given host, Squid records an entry for the request in a file called access.log.
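To make the shape of such an entry concrete, here is a minimal Python sketch that splits one access.log line into named fields. It assumes Squid's default native log format (timestamp, elapsed time, client address, result code/status, bytes, method, URL, user, hierarchy code/peer, content type). The sample line and the output key names are invented for illustration; they are not captured from a real system and are not Metron's actual schema, and Metron's own parser works differently.

```python
import json

# Illustrative line in Squid's default native format; the values are made up.
SAMPLE_LINE = (
    "1461576382.642    161 127.0.0.1 TCP_MISS/200 103701 "
    "GET http://www.cnn.com/ - DIRECT/192.0.2.10 text/html"
)


def parse_squid_line(line):
    """Split one native-format access.log line into a flat dict."""
    parts = line.split()
    if len(parts) != 10:
        raise ValueError("expected 10 fields, got %d" % len(parts))
    ts, elapsed, client_ip, result, size, method, url, user, hier, ctype = parts

    # "seconds.millis" -> epoch milliseconds, without float rounding.
    secs, _, millis = ts.partition(".")
    timestamp_ms = int(secs) * 1000 + int(millis.ljust(3, "0")[:3])

    # "TCP_MISS/200" -> cache action and HTTP status.
    action, _, status = result.partition("/")
    # "DIRECT/192.0.2.10" -> how the request was forwarded, and to which peer.
    hierarchy, _, peer_ip = hier.partition("/")

    return {
        "timestamp": timestamp_ms,
        "elapsed_ms": int(elapsed),
        "client_ip": client_ip,
        "action": action,
        "status": int(status),
        "bytes": int(size),
        "method": method,
        "url": url,
        "user": user,
        "hierarchy": hierarchy,
        "peer_ip": peer_ip,
        "content_type": ctype,
    }


event = parse_squid_line(SAMPLE_LINE)
print(json.dumps(event, indent=2))
```

The result is a flat JSON-style record, which is the general kind of structure the parsing requirement above calls for.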
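As this telemetry event streams through the platform in real time, Metron parses it into a standardized JSON structure, enriches the domain name with IP information, checks the IP against threat intel feeds, and raises an alert if there is a hit.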
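The stages described in the requirements (enrich the domain with IP information, then check that IP against threat intel) can be sketched conceptually in Python. This is only an illustration of the data flow, not how Metron implements it: the lookup tables, function names, and field names such as `domain_ip` and `is_alert` are hypothetical stand-ins, and the IP address comes from a reserved documentation range.

```python
import json
from urllib.parse import urlsplit

# Hypothetical stand-ins for an enrichment store and a threat intel feed.
DOMAIN_TO_IP = {"www.cnn.com": "192.0.2.10"}
THREAT_INTEL_IPS = {"192.0.2.10"}


def enrich(event):
    """Return a copy of the event with the domain and its IP added."""
    enriched = dict(event)
    domain = urlsplit(event["url"]).hostname
    enriched["domain"] = domain
    enriched["domain_ip"] = DOMAIN_TO_IP.get(domain)  # None if unknown
    return enriched


def check_threat_intel(event):
    """Return a copy of the event flagged when its IP is on the feed."""
    checked = dict(event)
    checked["is_alert"] = event.get("domain_ip") in THREAT_INTEL_IPS
    return checked


# An already-parsed proxy event (values invented for illustration).
event = {"url": "http://www.cnn.com/", "client_ip": "127.0.0.1", "method": "GET"}
result = check_threat_intel(enrich(event))
print(json.dumps(result, indent=2))
```

Each stage returns a new record rather than mutating its input, so the event can be inspected after every step as it moves through the pipeline.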
Key Points
Some key points to highlight:
- We will add a brand-new data source without writing any code. Metron strives for easy extensibility, and this is a good example of it.
- This is a repeatable pattern for the majority of telemetry data sources.