...

Clients, of course, are listed separately here.

Kafka Connect

Kafka has a built-in framework called Kafka Connect for writing sources and sinks that either continuously ingest data from external systems into Kafka or continuously export data from Kafka into external systems. The connectors for different applications or data systems are federated and maintained separately from the main code base. An externally hosted list of connectors is maintained by Confluent at the Confluent Hub.
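To make the two directions concrete, here is a minimal toy sketch of the source/sink idea. It does not use the Kafka Connect API (which is a Java framework); `InMemoryTopic`, `run_source`, and `ListSink` are hypothetical names invented for illustration, and a plain Python list stands in for a Kafka topic.

```python
from typing import Iterable, List


class InMemoryTopic:
    """Hypothetical stand-in for a Kafka topic: an append-only list of records."""

    def __init__(self) -> None:
        self.records: List[str] = []

    def append(self, record: str) -> int:
        self.records.append(record)
        return len(self.records) - 1  # offset of the record just appended

    def read_from(self, offset: int) -> List[str]:
        return self.records[offset:]


def run_source(lines: Iterable[str], topic: InMemoryTopic) -> None:
    """Source direction: copy records from an external system into the topic."""
    for line in lines:
        topic.append(line)


class ListSink:
    """Sink direction: copy records from the topic into an external system.

    The "external system" here is just a Python list. The sink remembers
    the offset it has reached so repeated polls deliver only new records.
    """

    def __init__(self) -> None:
        self.offset = 0
        self.delivered: List[str] = []

    def poll(self, topic: InMemoryTopic) -> None:
        batch = topic.read_from(self.offset)
        self.delivered.extend(batch)
        self.offset += len(batch)
```

A real connector would also deal with serialization, task parallelism, and fault-tolerant offset storage, all of which the framework handles and this sketch leaves out.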

Distributions & Packaging

Stream Processing

Hadoop Integration

  • Confluent HDFS Connector - A sink connector for the Kafka Connect framework for writing data from Kafka to Hadoop HDFS
  • Camus - LinkedIn's Kafka=>HDFS pipeline. This one is used for all data at LinkedIn, and works great.
  • Kafka Hadoop Loader - A different take on Hadoop loading functionality from what is included in the main distribution.
  • Flume - Contains Kafka source (consumer) and sink (producer)
  • KaBoom - A high-performance HDFS data loader

Database Integration

Search and Query

  • Elasticsearch - This project, Kafka Standalone Consumer, reads messages from Kafka, then processes and indexes them in Elasticsearch.

Management Consoles

...

  • Web Console - Displays information about your Kafka cluster including which nodes are up and what topics they host data for.
  • Kafka Offset Monitor - Displays the state of all consumers and how far behind the head of the stream they are.
  • Capillary - Displays the state and deltas of Kafka-based Apache Storm topologies. Supports Kafka >= 0.8. It also provides an API for fetching this information for monitoring purposes.
  • Doctor Kafka - Service for cluster auto healing and workload balancing.
  • Cruise Control - Fully automate the dynamic workload rebalance and self-healing of a Kafka cluster.
  • Burrow - Monitoring companion that provides consumer lag checking as a service without the need for specifying thresholds.
  • Chaperone - An audit system that monitors the completeness and latency of data streams.
  • Sematext - Integration for Kafka monitoring that collects and charts 200+ Kafka metrics.
  • Xinfra Monitor - A framework that monitors and exposes metrics showing availability and performance of Kafka clusters and mirrored pipelines.
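Several of the tools above report how far consumers are behind the head of the stream. As a rough sketch of the quantity involved (not the implementation of any listed tool), per-partition lag can be computed as the log end offset minus the consumer's committed offset. The function name `consumer_lag` is hypothetical, and treating a partition with no committed offset as offset 0 is an assumption made here for simplicity.

```python
from typing import Dict


def consumer_lag(end_offsets: Dict[int, int],
                 committed: Dict[int, int]) -> Dict[int, int]:
    """Return lag per partition: log end offset minus committed offset.

    Assumption: a partition with no committed offset counts as offset 0.
    Lag is clamped at 0 in case the two snapshots were taken at slightly
    different times and the committed offset appears ahead of the end.
    """
    return {
        partition: max(end - committed.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }
```

For example, with end offsets `{0: 100, 1: 50}` and committed offsets `{0: 90}`, partition 0 has a lag of 10 and partition 1 has a lag of 50.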

AWS Integration

Logging

Flume - Kafka plugins

...

Packing and Deployment

...