
Here is a list of tools we have been told about that integrate with Kafka outside the main distribution. We haven't tried them all, so they may not work!

Clients, of course, are listed separately here.

Kafka Connect

Kafka has a built-in framework called Kafka Connect for writing sources and sinks that either continuously ingest data into Kafka or continuously export data from Kafka into external systems. The connectors themselves, for different applications or data systems, are federated and maintained separately from the main code base. You can find a list of them here.
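
To make the source side concrete, here is a minimal sketch of the JSON body that registers a source connector with a Kafka Connect worker. It assumes the FileStreamSourceConnector example connector that ships with Kafka is available on the worker's plugin path. The connector name, file path, and topic are hypothetical placeholders, and the helper function is illustrative, not part of Kafka.

```python
import json


def file_source_connector(name, path, topic, tasks_max=1):
    """Build the JSON body for registering a file source connector.

    The name, path, and topic values are supplied by the caller; the ones
    used below are hypothetical placeholders.
    """
    return {
        "name": name,
        "config": {
            # Example connector bundled with Kafka; it reads lines from a file.
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            # Connect config values are strings.
            "tasks.max": str(tasks_max),
            "file": path,
            "topic": topic,
        },
    }


payload = file_source_connector("demo-file-source", "/tmp/input.txt", "demo-topic")
print(json.dumps(payload, indent=2))
```

In distributed mode, a body like this would typically be sent with an HTTP POST to the worker's /connectors REST endpoint (port 8083 by default). A sink connector is registered the same way, with a sink class and a `topics` setting in place of `topic`.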

Distributions & Packaging

Confluent Platform 1.0 - http://confluent.io/product/. Downloads - http://confluent.io/downloads/.
Cloudera Kafka source https://github.com/cloudera-labs/kafka/tree/cdh5-0.8.2_1.1.0 and release http://www.cloudera.com/content/cloudera/en/developers/home/cloudera-labs/apache-kafka.html
Hortonworks Kafka source ??? and release http://hortonworks.com/hadoop/kafka/
Stratio Kafka source ??? for Ubuntu http://repository.stratio.com/sds/1.1/ubuntu/13.10/binary/ and for RHEL http://repository.stratio.com/sds/1.1/RHEL/
IBM Message Hub - https://developer.ibm.com/messaging/message-hub/ - Kafka as-a-service in IBM's Bluemix PaaS

...

Storm - A stream-processing framework.
Samza - A YARN-based stream processing framework.
Storm Spout - Consume messages from Kafka and emit as Storm tuples
Kafka-Storm - Kafka 0.8, Storm 0.9, Avro integration
SparkStreaming - Kafka receiver supports Kafka 0.8 and above
Flink - Apache Flink has an integration with Kafka
IBM Streams - A stream processing framework with Kafka source and sink to consume and produce Kafka messages

Hadoop Integration

Kafka Connect sink - A sink for Kafka's connector framework.
Camus - LinkedIn's Kafka=>HDFS pipeline. This one is used for all data at LinkedIn, and works great.
Kafka Hadoop Loader - A different take on Hadoop loading functionality from what is included in the main distribution.
Flume - Contains Kafka Source (consumer) and Sink (producer)
KaBoom - A high-performance HDFS data loader

...

ElasticSearch - This project, Kafka Standalone Consumer, reads messages from Kafka, then processes and indexes them in ElasticSearch. There are also several Kafka Connect connectors for ElasticSearch.
Presto - The Presto Kafka connector allows you to query Kafka in SQL using Presto.
Hive - Hive SerDe that allows querying Kafka (Avro only for now) using Hive SQL

...

Mozilla Metrics Service - A Kafka and Protocol Buffers based metrics and logging system
Ganglia Integration
SPM for Kafka
Coda Hale Metric Reporter to Kafka

Packing and Deployment

...

Versions Compared

Old Version 36

New Version 37
