Powered By Spark

This page tracks the users of Spark. Feel free to add yourself to this list (you will need a wiki user account) and explain how you use Spark. Please add a short description (up to three bullet points) and a link to your organization or project.

Companies & Organizations

UC Berkeley AMPLab - Big data research lab that initially launched Spark
- We're building a variety of open source projects on Spark, including Shark, MLbase, and Spark Streaming, and developing new distributed systems techniques that improve the engine
- We have both graduate students and a team of professional software engineers working on the stack
Adatao, Inc. - Pervasive Data Science in the Enterprise
- Team of ex-Googlers and ex-Yahoos with large-scale infrastructure experience (including both flavors of MapReduce, at Google and Yahoo) and PhDs in ML/Data Mining
- Determined that Spark, among the many alternatives, answered the right problem statements with the right design
Amrita Center for Cyber Security Systems and Networks
Autodesk
Baidu
Celtra
Conviva - Experience Live
- See our talk at AmpCamp on how we are using Spark to provide real-time video optimization
Databricks
Digby
Exabeam
Falkonry
Freeman Lab at HHMI
- We are using Spark for analyzing and visualizing patterns in large-scale recordings of brain activity in real time
GraphFlow, Inc.
Groupon
Istanbul Sehir University
Knoldus Software LLC
Magine TV
MediaCrossing - Digital Media Trading Experts in the New York and Boston areas
- We are using Spark as a drop-in replacement for Hadoop MapReduce to get the right answers to our queries in much less time
NFLabs
Nokia Solutions and Networks
Ooyala, Inc. - Powering personalized video experiences across all screens
- See our blog post on how we use Spark for Fast Queries
- See our presentation on Cassandra, Spark, and Shark
Peerialism
PlanBMedia
Premise
Sohu
Taobao
TruEffect Inc
Tuplejump
UC Santa Cruz

Software and Research Projects

Shark - Hive and SQL on top of Spark
MLbase - Machine Learning project on top of Spark
BlinkDB - a massively parallel, approximate query engine built on top of Shark and Spark
GraphX - a graph processing & analytics framework on top of Spark
Apache Mesos - Cluster management system that supports running Spark
Tachyon - In-memory storage system that supports running Spark
BigR - Native R (and other front-ends) for Big-Data-Science/Machine-Learning with open API on top of Spark+Hadoop (soon to be open sourced)
Apache MRQL - A query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, and Spark
OpenDL - Deep learning training project based on Spark (just kicked off)
