Versions Compared

Comment: add hive.fetch.task.conversion (HIVE-2925), hive.fetch.task.aggr (HIVE-4002), and doc hive.fetch.task.conversion.threshold (HIVE-3990)

...

Where to insert into multilevel directories like "insert directory '/HIVEFT25686/chinna/' from table".

hive.fetch.task.conversion
  • Default Value: minimal
  • Added In: Hive 0.10.0 with HIVE-2925

Some SELECT queries can be converted to a single FETCH task, which minimizes latency. Currently the query must be single-sourced with no subquery, and it must not contain aggregations or DISTINCT (both of which incur a ReduceSink, RS), lateral views, or joins. Supported values are minimal and more.

1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
2. more : SELECT, FILTER, LIMIT only (TABLESAMPLE, virtual columns)
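As a rough illustration of the two values, the sketch below shows one query that should qualify under each setting. The table page_views, its partition column dt, and the columns user_id and url are hypothetical names chosen for the example; they do not come from the Hive documentation.

```sql
-- Hypothetical table: page_views, partitioned by dt (illustrative names only).

SET hive.fetch.task.conversion=minimal;
-- SELECT *, a filter on a partition column, and LIMIT:
-- should be eligible for fetch conversion under "minimal".
SELECT * FROM page_views WHERE dt = '2013-01-01' LIMIT 10;

SET hive.fetch.task.conversion=more;
-- Column projection plus a filter on a non-partition column:
-- should be eligible only under "more".
SELECT user_id, url FROM page_views WHERE url LIKE '%index%' LIMIT 10;
```

Running EXPLAIN on either query is one way to check whether the conversion applied: a converted query is expected to show only a fetch stage, with no map-reduce stage.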

hive.fetch.task.aggr
  • Default Value: false
  • Added In: Hive 0.12.0 with HIVE-4002 (description added in Hive 0.13.0 with HIVE-5793)

Aggregation queries with no group-by clause (for example, select count(*) from src) execute final aggregations in a single reduce task. If this parameter is set to true, Hive delegates the final aggregation stage to a fetch task, possibly decreasing the query time.
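A minimal sketch of enabling this for a session, reusing the src table from the example above. Whether it actually shortens the query depends on the data and the cluster, so treat it as something to measure, not a guaranteed improvement.

```sql
-- Ask Hive to run the final aggregation stage in a fetch task
-- instead of a single reduce task.
SET hive.fetch.task.aggr=true;

-- Aggregation with no GROUP BY clause, the case this parameter targets.
SELECT count(*) FROM src;
```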

hive.fetch.task.conversion.threshold
  • Default Value: -1
  • Added In: Hive 0.13.0 with HIVE-3990

Input threshold for applying hive.fetch.task.conversion. If the target table is native, the input length is calculated by summing the file lengths. If the table is not native, its storage handler can optionally implement the org.apache.hadoop.hive.ql.metadata.InputEstimator interface. A negative threshold means that hive.fetch.task.conversion is applied without any input length threshold.
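The following sketch shows how the threshold might be combined with hive.fetch.task.conversion. The 1 GiB figure is an arbitrary illustrative value, not a recommendation; because the input length is a sum of file lengths, the value is assumed here to be expressed in bytes.

```sql
-- Allow the broader class of queries to be converted to a fetch task.
SET hive.fetch.task.conversion=more;

-- Apply the conversion only when the estimated input is at most
-- 1073741824 bytes (1 GiB); illustrative value.
SET hive.fetch.task.conversion.threshold=1073741824;

-- Restore the documented default: no input length threshold.
SET hive.fetch.task.conversion.threshold=-1;
```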

hive.cache.expr.evaluation
  • Default Value: true
  • Added In: Hive 0.12.0 with HIVE-4209
  • Bug Fix: Hive 0.14.0 with HIVE-7314 (expression caching doesn't work when using UDF inside another UDF or a Hive function)

...