This page tracks the users of Spark. To add yourself to the list, please email user@spark.apache.org with your organization name, URL, and a short description.
Companies & Organizations
- UC Berkeley AMPLab - Big data research lab that initially launched Spark
- We're building a variety of open source projects on Spark, including Shark, MLbase, and Spark Streaming, and developing new distributed systems techniques that improve the engine
- We have both graduate students and a team of professional software engineers working on the stack
- Adatao, Inc. - Predictive Data Intelligence for All
- Team of ex-Googlers & Yahoos with large-scale infrastructure experience (including both flavors of MapReduce at Google & Yahoo) & PhDs in ML/Data Mining
- Determined that Spark, among the many alternatives, answered the right problem statements with the right design
- Amrita Center for Cyber Security Systems and Networks
- Atigeo – integrated Spark in xPatterns, our big data analytics platform, as a replacement for Hadoop MR
- Autodesk
- Baidu
- Bakdata – using Spark (and Shark) to perform interactive exploration of large datasets
- Bizo
- Check out our talk on Spark at Bizo, given at the Spark user meetup
- Celtra
- Conviva - Experience Live
- See our talk at AmpCamp on how we are using Spark to provide real-time video optimization
- Databricks
- Formed by the creators of Apache Spark and Shark, Databricks is working to greatly expand these open source projects and transform big data analysis in the process. We're deeply committed to keeping all work on these systems open source.
- Providing support for Apache Spark in partnership with Cloudera.
- Dianping.com
- Digby
- EURECOM
- Exabeam
- Faimdata
- Build eCommerce and data intelligence solutions for the retail industry on top of Spark/Shark/Spark Streaming
- Falkonry
- Freeman Lab at HHMI
- We are using Spark for analyzing and visualizing patterns in large-scale recordings of brain activity in real time
- Fundacion CTIC
- GraphFlow, Inc.
- Groupon
- IBM Almaden
- Istanbul Sehir University
- Knoldus Software LLC
- Magine TV
- MediaCrossing – Digital Media Trading Experts in the New York and Boston areas
- We are using Spark as a drop-in replacement for Hadoop Map/Reduce to get the right answer to our queries in a much shorter amount of time.
- NFLabs
- Nokia Solutions and Networks
- Ooyala, Inc. – Powering personalized video experiences across all screens
- See our blog post on how we use Spark for Fast Queries
- See our presentation on Cassandra, Spark, and Shark
- Peerialism
- PlanBMedia
- Premise
- Quantifind
- Sinnia
- Sohu
- Taboola – Powering "Content You May Like" around the web
- Taobao (Alibaba)
- We built one of the world's first Spark on YARN production clusters.
- See our blog posts (in Chinese) about Spark at Taobao: http://rdc.taobao.org/?tag=spark
- Techbase
- TrendMicro
- TruEffect Inc
- Tuplejump
- UC Santa Cruz
- Yahoo!
- Yandex
- Using Spark in Yandex Islands to process islands identified from a search robot
Software and Research Projects
- Shark - Hive and SQL on top of Spark
- MLbase - Machine Learning project on top of Spark
- BlinkDB - a massively parallel, approximate query engine built on top of Shark and Spark
- GraphX - a graph processing & analytics framework on top of Spark (GraphX has been merged into Spark)
- Apache Mesos - Cluster management system that supports running Spark
- Tachyon - In-memory storage system that supports running Spark
- BigR - Native R (and other front-ends) for Big-Data-Science/Machine-Learning with open API on top of Spark+Hadoop
- Apache MRQL - A query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, and Spark
- OpenDL - A deep learning algorithm library based on the Spark framework. The project has just kicked off.
- SparkR - R frontend for Spark