...

Issue	Cause	Resolution
Error: Could not find or load main class org.apache.spark.deploy.SparkSubmit	Spark dependency not correctly set.	Add Spark dependency to Hive, see Step 3 above.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 5.0:0 had a not serializable result: java.io.NotSerializableException: org.apache.hadoop.io.BytesWritable	Spark serializer not set to Kryo.	Set spark.serializer to org.apache.spark.serializer.KryoSerializer, see Step 5 above.
[ERROR] Terminal initialization failed; falling back to unsupported java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected	Hive has upgraded to JLine 2, but jline 0.94 is still present in the Hadoop lib directory.	(1) Delete jline from the Hadoop lib directory (it is only pulled in transitively from ZooKeeper). (2) Set export HADOOP_USER_CLASSPATH_FIRST=true. (3) If this error occurs during mvn test, run mvn clean install on the root project and on the itests directory.
java.lang.SecurityException: class "javax.servlet.DispatcherType"'s signer information does not match signer information of other classes in the same package at java.lang.ClassLoader.checkCerts(ClassLoader.java:952)	Two versions of the servlet-api are in the classpath.	This should be fixed by HIVE-8905. If it still occurs, remove servlet-api-2.5.jar under hive/lib.
The Spark executor is killed repeatedly and Spark keeps retrying the failed stage. The YARN NodeManager log may contain a message similar to: WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=217989,containerID=container_1421717252700_0716_01_50767235] is running beyond physical memory limits. Current usage: 43.1 GB of 43 GB physical memory used; 43.9 GB of 90.3 GB virtual memory used. Killing container.	For Spark on YARN, the NodeManager kills a Spark executor that uses more memory than the configured "spark.executor.memory" + "spark.yarn.executor.memoryOverhead".	Increase "spark.yarn.executor.memoryOverhead" so that it covers the executor's off-heap memory usage.
A query fails with an error like: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. The Hive logs show: java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)	Occurs when testing on Mac machines. This is a general Snappy issue on Mac and is not unique to Spark, but it must be resolved for the Spark client to start.	Run this command before starting Hive or HiveServer2: export HADOOP_OPTS="-Dorg.xerial.snappy.tempdir=/tmp -Dorg.xerial.snappy.lib.name=libsnappyjava.jnilib $HADOOP_OPTS"
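
The memory-limit row above comes down to simple arithmetic: the container limit is the sum of "spark.executor.memory" and "spark.yarn.executor.memoryOverhead". The following is a minimal sketch of that check, not part of the original instructions. The 40 GB / 3 GB split is an assumption chosen only to match the 43 GB limit in the sample log, the variable names are hypothetical, and any rounding that YARN applies to container sizes is ignored.

```shell
#!/bin/sh
# Hypothetical values in MB; substitute the settings from your own cluster.
executor_memory_mb=40960    # assumed spark.executor.memory (40 GB)
memory_overhead_mb=3072     # assumed spark.yarn.executor.memoryOverhead (3 GB)
observed_usage_mb=44134     # approximately the 43.1 GB usage from the sample log

# Memory the executor container may use before the NodeManager kills it.
container_limit_mb=$((executor_memory_mb + memory_overhead_mb))

if [ "$observed_usage_mb" -gt "$container_limit_mb" ]; then
    # Usage exceeded the limit: report how much extra overhead would cover it.
    shortfall_mb=$((observed_usage_mb - container_limit_mb))
    echo "Over limit by ${shortfall_mb} MB; consider memoryOverhead >= $((memory_overhead_mb + shortfall_mb)) MB"
else
    shortfall_mb=0
    echo "Within limit: ${observed_usage_mb} MB of ${container_limit_mb} MB"
fi
```

In practice the overhead would likely be raised by more than the bare shortfall, since off-heap usage varies between tasks.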

Recommended Configuration

...

Versions Compared

Old Version 47

New Version 48
