...

Spark Installation

Follow the instructions here to install the latest Spark: https://spark.apache.org/docs/latest/spark-standalone.html. In particular:

  1. Install Spark (either download a pre-built Spark distribution, or build the assembly from source).  Note that Spark has different distributions for different versions of Hadoop.  Make a note of the spark-assembly-*.jar location.
  2. Start the Spark cluster (master and workers).  Make a note of the Spark master URL, which is shown in the Spark master WebUI.
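
The two values to note in the steps above can be gathered with a small helper. The following is a minimal Python sketch, not part of the Spark distribution: the function names are hypothetical, the jar search is recursive because the jar's directory differs between pre-built and source builds, and the URL helper assumes the standalone default master port of 7077. The master WebUI remains the authoritative source for the actual URL.

```python
from pathlib import Path


def find_spark_assembly_jars(spark_home):
    """Return all spark-assembly-*.jar files found anywhere under spark_home."""
    # Search recursively: no fixed subdirectory is assumed, since the jar
    # location depends on how Spark was installed or built.
    return sorted(Path(spark_home).rglob("spark-assembly-*.jar"))


def standalone_master_url(host, port=7077):
    """Format a standalone master URL (assumes the default port unless given)."""
    return f"spark://{host}:{port}"


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the Spark installation directory as an argument.
    spark_home = sys.argv[1] if len(sys.argv) > 1 else "."
    for jar in find_spark_assembly_jars(spark_home):
        print(jar)
```

If more than one jar is printed, pick the one matching the Hadoop version of your distribution.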

...

Issue: java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode
Cause: Guava library version conflict between Spark and Hadoop.  See HIVE-7387 and SPARK-2420 for details.
Resolution: Temporarily remove the Guava jars from HADOOP_HOME until the JIRAs are resolved.

Issue: org.apache.spark.SparkException: Job aborted due to stage failure: Task 5.0:0 had a not serializable result: java.io.NotSerializableException: org.apache.hadoop.io.BytesWritable
Cause: The Spark serializer is not set to Kryo.
Resolution: Set spark.serializer to org.apache.spark.serializer.KryoSerializer, as described above.
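
Both issues above can be checked before running a job. The following is a minimal Python sketch under stated assumptions: the helper names are hypothetical, the Guava jars are assumed to match the pattern guava-*.jar somewhere under HADOOP_HOME, and the settings are passed in as a plain dictionary because how your deployment stores them is not covered here. The script only lists jars; it does not remove anything.

```python
import os
from pathlib import Path

KRYO_SERIALIZER = "org.apache.spark.serializer.KryoSerializer"


def find_guava_jars(hadoop_home):
    """Return all guava-*.jar files found anywhere under hadoop_home."""
    # Assumption: the conflicting jars follow the usual guava-<version>.jar naming.
    return sorted(Path(hadoop_home).rglob("guava-*.jar"))


def uses_kryo(settings):
    """Return True if spark.serializer is set to the Kryo serializer class."""
    return settings.get("spark.serializer") == KRYO_SERIALIZER


if __name__ == "__main__":
    hadoop_home = os.environ.get("HADOOP_HOME")
    if hadoop_home:
        for jar in find_guava_jars(hadoop_home):
            print("Guava jar found:", jar)

    # Hypothetical settings for illustration; substitute your own values.
    example_settings = {"spark.serializer": KRYO_SERIALIZER}
    print("Kryo configured:", uses_kryo(example_settings))
```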

...