Comment: For HIVE-9448 and HIVE-9337, added a Spark section with some background, and a "Remote Spark Driver" section with even more background. Filled in the newly-added configurations.

...

For the Windows operating system, Hive needs to pass the HIVE_HADOOP_CLASSPATH Java parameter while starting HiveServer2 using "-hiveconf hive.hadoop.classpath=%HIVE_LIB%". Users can set this parameter in hiveserver2.xml.

Spark

Apache Spark support was added in Hive 1.1 (HIVE-7292 and the merge-to-trunk JIRAs HIVE-9257, HIVE-9352, HIVE-9448). For information, see the design document Hive on Spark and Hive on Spark: Getting Started.

hive.spark.job.monitor.timeout
  • Default Value: 60 seconds
  • Added In: Hive 1.2.0 with HIVE-9337

Timeout for job monitor to get Spark job state.

Remote Spark Driver

The remote Spark driver is the application launched in the Spark cluster that submits the actual Spark job. It was introduced in HIVE-8528. It is a long-lived application that is initialized upon the current user's first query and runs until the user's session is closed. The following properties control the remote communication between the remote Spark driver and the Hive client that spawns it.

hive.spark.client.future.timeout
  • Default Value: 60 seconds
  • Added In: Hive 1.2.0 with HIVE-9337

Timeout for requests from Hive client to remote Spark driver.

hive.spark.client.connect.timeout
  • Default Value: 1000 milliseconds
  • Added In: Hive 1.2.0 with HIVE-9337

Timeout for remote Spark driver in connecting back to Hive client.

hive.spark.client.server.connect.timeout
  • Default Value: 60000 milliseconds
  • Added In: Hive 1.2.0 with HIVE-9337

Timeout for handshake between Hive client and remote Spark driver. Checked by both processes.

hive.spark.client.secret.bits
  • Default Value: 256
  • Added In: Hive 1.2.0 with HIVE-9337

Number of bits of randomness in the generated secret for communication between Hive client and remote Spark driver.  Rounded down to nearest multiple of 8.
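The rounding rule above can be illustrated with a short sketch. This is not Hive's implementation; the helper names effective_secret_bits and secret_length_bytes are hypothetical and only show the arithmetic of rounding down to a multiple of 8.

```python
def effective_secret_bits(configured_bits: int) -> int:
    """Round a configured bit count down to the nearest multiple of 8.

    Hypothetical helper: it mirrors the documented rounding rule only.
    """
    if configured_bits < 0:
        raise ValueError("bit count must be non-negative")
    return configured_bits - (configured_bits % 8)


def secret_length_bytes(configured_bits: int) -> int:
    """Number of whole bytes covered by the rounded-down bit count."""
    return effective_secret_bits(configured_bits) // 8


# The default of 256 is already a multiple of 8 and is left unchanged.
default_bits = effective_secret_bits(256)
# A value such as 250 is rounded down to 248 bits, i.e. 31 bytes.
rounded_bits = effective_secret_bits(250)
rounded_bytes = secret_length_bytes(250)
```

Under this rule, a value that is not a multiple of 8 yields slightly less randomness than configured, so choosing a multiple of 8 keeps the configured and effective values the same.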

hive.spark.client.rpc.threads
  • Default Value: 8
  • Added In: Hive 1.2.0 with HIVE-9337

Maximum number of threads for remote Spark driver's RPC event loop.

hive.spark.client.channel.log.level
  • Default Value: null
  • Added In: Hive 1.2.0 with HIVE-9337

Channel logging level for remote Spark driver.  One of DEBUG, ERROR, INFO, TRACE, WARN.  If unset, TRACE is chosen.
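As an illustration of how a few of the remote Spark driver properties above might be written into a Hadoop-style configuration file, the following sketch renders them as property entries. The helper to_hive_site_xml is hypothetical, the values are examples rather than recommendations (INFO is simply one of the listed log levels), and where the resulting file should be placed is left to the deployment.

```python
import xml.etree.ElementTree as ET


def to_hive_site_xml(properties: dict) -> str:
    """Render name/value pairs as <configuration><property> entries.

    Hypothetical helper for illustration; it only builds the XML text.
    """
    configuration = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(configuration, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(configuration, encoding="unicode")


# Example values: the first two match the documented defaults,
# the log level is an arbitrary choice from the allowed list.
remote_driver_settings = {
    "hive.spark.client.rpc.threads": 8,
    "hive.spark.client.secret.bits": 256,
    "hive.spark.client.channel.log.level": "INFO",
}

xml_text = to_hive_site_xml(remote_driver_settings)
print(xml_text)
```

Generating the entries from a dictionary keeps property names in one place, which makes typos in long keys such as hive.spark.client.channel.log.level easier to spot.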

Tez

Apache Tez was added in Hive 0.13.0 (HIVE-4660 and HIVE-6098).  For information see the design document Hive on Tez, especially the Installation and Configuration section.

...