Comment: Added explanatory endorsement of Spark

...

  • UC Berkeley AMPLab - Big data research lab that initially launched Spark
  • Adatao, Inc. - Pervasive Data Science in the Enterprise
    • Team of ex-Googlers and ex-Yahoos with large-scale infrastructure experience (including both flavors of MapReduce, at Google and at Yahoo) and PhDs in ML/data mining
    • Determined that Spark, among the many alternatives, answered the right problem statements with the right design

Software Projects

  • MLbase - Machine Learning project on top of Spark
  • Shark - Hive and SQL on top of Spark
  • Apache Mesos - Cluster management system that supports running Spark
  • BigR - Native R (and other front ends) for big-data science and machine learning, with an open API on top of Spark+Hadoop (soon to be open sourced)