This page tracks external software projects that supplement Apache Spark and add to its ecosystem. To add an item to this page, please send a note to user@spark.apache.org with the name of the project, a brief description, and a URL.

Spark Packages

Note: the Spark package index provides a more comprehensive, community-managed list of add-ons for Spark, along with installation instructions.

...

libraries and applications that work with Spark. You can add a package as long as you have a GitHub repository.

Infrastructure Projects

...

Applications Using Spark

...

  • Apache Mahout - Previously on Hadoop MapReduce, Mahout has switched to using Spark as the backend
  • Apache MRQL - A query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, and Spark
  • BlinkDB - A massively parallel, approximate query engine built on top of Shark and Spark
  • Spindle - Spark/Parquet-based web analytics query engine
  • Thunderain - a framework for combining stream processing with historical data (think Lambda architecture)
  • DF from Ayasdi - a Pandas-like data frame implementation for Spark