Versions Compared

Comment: minor edits in new Blobstore section (thanks, Sergio Peña)

...

This is an HDFS root directory under which Hive's REPL DUMP command operates, creating dumps to replicate to other warehouses.

 

Blobstore (e.g., Amazon S3)

Starting in release 2.2.0, a set of configurations was added to enable read/write performance improvements when working with tables stored on blobstore systems, such as Amazon S3.

 

hive.blobstore.supported.schemes
  • Default value: s3,s3a,s3n
  • Added In: Hive 2.2.0 with HIVE-14270

List of supported blobstore schemes that Hive uses to apply special read/write performance improvements.
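
As a minimal sketch, assuming the property is set in hive-site.xml, the scheme list could be narrowed to a single scheme. The value s3a below is only an illustrative choice, not a recommendation from this page; omitting the property keeps the default of s3,s3a,s3n.

```xml
<!-- Illustrative value only: restrict the blobstore optimizations to s3a:// paths.
     Leaving this property out keeps the default list (s3,s3a,s3n). -->
<property>
  <name>hive.blobstore.supported.schemes</name>
  <value>s3a</value>
</property>
```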

hive.blobstore.optimizations.enabled
  • Default value: true
  • Added In: Hive 2.2.0 with HIVE-15121

This parameter is a global switch that enables a number of optimizations when running on blobstores.
Some of the optimizations, such as hive.blobstore.use.blobstore.as.scratchdir, are not used when this parameter is set to false.

hive.blobstore.use.blobstore.as.scratchdir
  • Default value: false
  • Added In: Hive 2.2.0 with HIVE-14270

Set this to true to enable the use of scratch directories directly on blob storage systems (it may cause performance penalties).
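
The relationship between this property and hive.blobstore.optimizations.enabled can be sketched as a hive-site.xml fragment. This assumes configuration through hive-site.xml; the values are illustrative, and the first property is shown only to make the dependency explicit, since true is already its default.

```xml
<!-- Global switch (default: true). If this is false, the scratch directory
     setting below is not used. -->
<property>
  <name>hive.blobstore.optimizations.enabled</name>
  <value>true</value>
</property>

<!-- Opt in to scratch directories on the blobstore itself (default: false).
     This may cause performance penalties. -->
<property>
  <name>hive.blobstore.use.blobstore.as.scratchdir</name>
  <value>true</value>
</property>
```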

hive.exec.input.listing.max.threads
  • Default value: 0 (disabled)
  • Added In: Hive 2.2.0 with HIVE-15881

Set this to the maximum number of threads that Hive will use to list file information from file systems, such as file size and number of files per table (a value greater than 1 is recommended for blobstores).
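
A hedged example of raising the thread count for a blobstore-backed warehouse, again assuming hive-site.xml. The value 8 is an arbitrary illustration; this page only states that a value greater than 1 is recommended for blobstores.

```xml
<!-- Illustrative value only: allow multiple threads for listing file information.
     The default of 0 disables this feature. -->
<property>
  <name>hive.exec.input.listing.max.threads</name>
  <value>8</value>
</property>
```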

...