Introduction

Hive is used both for interactive queries and as part of scripts. The Hive variable substitution mechanism was designed to avoid some of the code that was getting baked into the scripting language on top of Hive. Examples such as:

Code Block
$ a=b
$ hive -e " describe $a "

are becoming commonplace. This is frustrating because it couples Hive closely to scripting languages. Hive's startup time of a couple of seconds also becomes significant when a script performs thousands of manipulations through repeated hive -e invocations.

Hive Variables combine the set capability you know and love with some limited yet powerful (evil laugh) substitution ability. For example:

Code Block
$ bin/hive -hiveconf a=b -e 'set a; set hiveconf:a; \
create table if not exists b (col int); describe ${hiveconf:a}'

results in:

Code Block
Hive history file=/tmp/edward/hive_job_log_edward_201011240906_1463048967.txt
a=b
hiveconf:a=b
OK
Time taken: 5.913 seconds
OK
col	int	
Time taken: 0.754 seconds
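The same substitution can be applied to statements kept in a script file rather than on the command line. The following is a minimal sketch, assuming a hive binary on the PATH and a temporary script file whose name is purely illustrative; it uses the --hiveconf and -f options of the Hive CLI.

```shell
#!/bin/sh
# Hypothetical script file; the path produced by mktemp is illustrative only.
script=$(mktemp)

# The quoted 'EOF' delimiter stops the shell from touching ${hiveconf:a},
# so the reference reaches Hive unchanged.
cat > "$script" <<'EOF'
set hiveconf:a;
create table if not exists b (col int);
describe ${hiveconf:a};
EOF

# Run only when a Hive CLI is available (assumed to be on the PATH).
if command -v hive >/dev/null 2>&1; then
  hive --hiveconf a=b -f "$script"
fi
```

Keeping the statements in one file means Hive starts once, and the value of a can change between runs without editing the script.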

For general information about Hive command line options, see Hive CLI.

Version information

The hiveconf option was added in version 0.7.0 (JIRA HIVE-1096). Version 0.8.0 added the options define and hivevar (JIRA HIVE-2020), which are equivalent and are not described here. They create custom variables in a namespace that is separate from the hiveconf, system, and env namespaces.

Using Variables

There are three namespaces for variables – hiveconf, system, and env. The hiveconf variables are set as normal:

Code Block
set x=myvalue

However, they are retrieved using:

Code Block
${hiveconf:x}
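When such a reference is typed as part of a shell command, quoting matters: inside double quotes the shell attempts its own expansion of ${...} before Hive sees the text. The following is a minimal sketch, assuming a POSIX-style shell and a hive binary on the PATH; the shell variable name hql is illustrative.

```shell
#!/bin/sh
# Single quotes pass ${hiveconf:x} through to Hive unchanged.
hql='set x=myvalue; set y=${hiveconf:x}; set y;'

# Had the statement been written directly inside double quotes, the shell
# would have tried to expand ${hiveconf:x} itself, and Hive would never
# have received the reference.

# Run only when a Hive CLI is available (assumed to be on the PATH).
if command -v hive >/dev/null 2>&1; then
  hive -e "$hql"
fi
```

Expanding "$hql" is safe here because the shell does not re-scan the value of a variable for further ${...} references.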

Annotated examples of usage from the test case ql/src/test/queries/clientpositive/set_processor_namespaces.q:

Code Block
set zzz=5;
--  sets zzz=5
set zzz;

set system:xxx=5;
set system:xxx;
-- sets a system property xxx to 5

set system:yyy=${system:xxx};
set system:yyy;
-- sets yyy with value of xxx

set go=${hiveconf:zzz};
set go;
-- sets go based on the value of zzz

set hive.variable.substitute=false;
set raw=${hiveconf:zzz};
set raw;
-- disables substitution, so raw is set to the literal text ${hiveconf:zzz}

set hive.variable.substitute=true;

EXPLAIN SELECT * FROM src where key=${hiveconf:zzz};
SELECT * FROM src where key=${hiveconf:zzz};
-- uses a variable in a query

set a=1;
set b=a;
set c=${hiveconf:${hiveconf:b}};
set c;
-- uses nested variables

set jar=../lib/derby.jar;

add file ${hiveconf:jar};
list file;
delete file ${hiveconf:jar};
list file;
-- uses a variable in file resource commands
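The test case above exercises the hiveconf and system namespaces but not env. As a hedged sketch: variables in the env namespace reflect the environment of the process that started Hive and are read with the same ${namespace:name} syntax; they are generally treated as read-only from within Hive. TARGET_TABLE is a hypothetical environment variable, and src is the test table used in the queries above.

```shell
#!/bin/sh
# TARGET_TABLE is a hypothetical environment variable used for illustration.
TARGET_TABLE=src
export TARGET_TABLE

# Single quotes keep ${env:TARGET_TABLE} literal so that Hive, not the shell,
# resolves it from the environment it was started with.
hql='set env:TARGET_TABLE; describe ${env:TARGET_TABLE};'

# Run only when a Hive CLI is available (assumed to be on the PATH).
if command -v hive >/dev/null 2>&1; then
  hive -e "$hql"
fi
```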

Disabling Variable Substitution

Variable substitution is on by default. If it causes problems with an existing script, disable it:

Code Block
set hive.variable.substitute=false;
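For an existing script that should not be edited, the setting can probably also be supplied when Hive is started. The following is a sketch only, under the assumption that hive.variable.substitute is accepted by --hiveconf like other configuration properties; the legacy script and its path are hypothetical, and its contents are borrowed from the test case above.

```shell
#!/bin/sh
# Hypothetical legacy script that contains a literal ${...} sequence.
legacy=$(mktemp)
cat > "$legacy" <<'EOF'
set raw=${hiveconf:zzz};
set raw;
EOF

# Assumption: the property is honoured when given on the command line,
# turning substitution off for this run without changing the script.
if command -v hive >/dev/null 2>&1; then
  hive --hiveconf hive.variable.substitute=false -f "$legacy"
fi
```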