This page tracks the users of Spark. Feel free to add yourself to this list (you will need a wiki user account) and explain how you use Spark. Please add a short description (up to three bullet points) and a link to your organization or project.
Companies & Organizations
- UC Berkeley AMPLab - Big data research lab that initially launched Spark
  - We're building a variety of open source projects on Spark, including Shark, MLbase, and Spark Streaming, and developing new distributed systems techniques that improve the engine
  - We have both graduate students and a team of professional software engineers working on the stack
- Adatao, Inc. - Pervasive Data Science in the Enterprise
  - Team of ex-Googlers & Yahoos with large-scale infrastructure experience (including both flavors of MapReduce at Google & Yahoo) & PhDs in ML/Data Mining
  - Determined that Spark, among the many alternatives, answered the right problem statements with the right design
- Amrita Center for Cyber Security Systems and Networks
- Autodesk
- Baidu
- Bizo
  - Check out our talk on Spark at Bizo, given at the Spark user meetup
- Celtra
- Conviva - Experience Live
  - See our talk at AmpCamp on how we are using Spark to provide real-time video optimization
- Databricks
  - Formed by the creators of Apache Spark and Shark, Databricks is working to greatly expand these open source projects and transform big data analysis in the process. We're deeply committed to keeping all work on these systems open source.
- Digby
- Exabeam
- Falkonry
- Freeman Lab at HHMI
  - We are using Spark for analyzing and visualizing patterns in large-scale recordings of brain activity in real time
- GraphFlow, Inc.
- Groupon
- IBM Almaden
- Istanbul Sehir University
- Knoldus Software LLC
- Magine TV
- MediaCrossing - Digital Media Trading Experts in the New York and Boston areas
  - We are using Spark as a drop-in replacement for Hadoop MapReduce to get the right answers to our queries in much less time.
- NFLabs
- Nokia Solutions and Networks
- Ooyala, Inc. - Powering personalized video experiences across all screens
  - See our blog post on how we use Spark for Fast Queries
  - See our presentation on Cassandra, Spark, and Shark
- Peerialism
- PlanBMedia
- Premise
- Quantifind
- Sinnia
- Sohu
- Taobao (Alibaba)
  - We built one of the world's first Spark on YARN production clusters.
  - See our blog posts (in Chinese) about Spark at Taobao: http://rdc.taobao.org/?tag=spark
- TrendMicro
- TruEffect Inc
- Tuplejump
- UC Santa Cruz
- Yahoo!
Software and Research Projects
- Shark - Hive and SQL on top of Spark
- MLbase - Machine Learning project on top of Spark
- BlinkDB - a massively parallel, approximate query engine built on top of Shark and Spark
- GraphX - a graph processing & analytics framework on top of Spark
- Apache Mesos - Cluster management system that supports running Spark
- Tachyon - In-memory storage system that supports running Spark
- BigR - Native R (and other front-ends) for Big-Data-Science/Machine-Learning with open API on top of Spark+Hadoop
- Apache MRQL - A query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, and Spark
- OpenDL - A deep learning algorithm library based on the Spark framework. The project has just kicked off.