Why Replace the Existing Hive CLI?
Hive CLI is a legacy tool which had two main use cases. The first is that it served as a thick client for SQL on Hadoop, and the second is that it served as a command line tool for Hive Server (the original Hive server, now often referred to as "HiveServer1"). Hive Server has been deprecated and removed from the Hive code base as of Hive 1.0.0 (HIVE-6977) and replaced with HiveServer2 (HIVE-2935), so the second use case no longer applies. For the first use case, Beeline provides or is supposed to provide equal functionality, yet is implemented differently from Hive CLI.
Ideally, Hive CLI should be deprecated, as the Hive community has long recommended using the Beeline plus HiveServer2 configuration. However, because of the wide use of Hive CLI, we instead propose replacing Hive CLI's implementation with Beeline plus embedded HiveServer2 so that the Hive community only needs to maintain a single code path. In this way, Hive CLI is just an alias to Beeline at both the shell script level and the code level. The goal is that no changes, or only minimal changes, are required from existing user scripts using Hive CLI.
Hive CLI Functionality Support
We use Beeline to implement the Hive CLI functionality. In case some existing Hive CLI features are not supported in the new Beeline-based CLI, the following command switches back to the deprecated Hive CLI tool.
No Format |
---|
export USE_DEPRECATED_CLI=true |
Note that the log4j configuration file has been changed to "beeline-log4j.properties".
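The fallback switch can also be scoped to a single invocation instead of being exported for the whole session. The following is a minimal sketch of that idea, not part of Hive itself: the function name `run_with_deprecated_cli` and the `HIVE_BIN` override are hypothetical and exist only so the wrapper can be tried without a Hive installation.

```shell
#!/bin/sh
# Run one command with USE_DEPRECATED_CLI=true set only for that command,
# leaving the caller's environment unchanged.
# HIVE_BIN is a hypothetical override (not a Hive setting); it defaults to
# the "hive" launcher found on PATH.
run_with_deprecated_cli() {
    USE_DEPRECATED_CLI=true "${HIVE_BIN:-hive}" "$@"
}
```

For example, `run_with_deprecated_cli -f /root/test.sql` would run that file through the old implementation while other calls to `hive` in the same script keep using the Beeline-based one.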
Hive CLI Options Support
To get help, run "hive -H" or "hive --help".
No Format |
---|
usage: hive |
-d,--define <key=value> Variable subsitution to apply to hive commands. e.g. -d A=B or --define A=B |
--database <databasename> Specify the database to use |
-e <quoted-query-string> SQL from command line |
-f <filename> SQL from files |
-H,--help Print help information |
--hiveconf <property=value> Use value for given property |
--hivevar <key=value> Variable subsitution to apply to hive commands. e.g. --hivevar A=B |
-i <filename> Initialization SQL file |
-S,--silent Silent mode in interactive shell |
-v,--verbose Verbose mode (echo executed SQL to the console) |
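As an illustration of how several of these options combine in a user script, the sketch below assembles a non-interactive invocation from `--database`, `--hivevar`, `-S`, and `-f`. The function name `run_report`, the variable name `tbl`, and the `HIVE_BIN` override are assumptions made for this example; setting `HIVE_BIN=echo` prints the assembled arguments instead of starting Hive.

```shell
#!/bin/sh
# Sketch: run a parameterized SQL file through the hive launcher.
# HIVE_BIN is a hypothetical override (not a Hive setting), useful for a
# dry run: HIVE_BIN=echo prints the arguments instead of running Hive.
run_report() {
    # $1 = database name, $2 = table name, $3 = path to a SQL file
    "${HIVE_BIN:-hive}" \
        --database "$1" \
        --hivevar "tbl=$2" \
        -S \
        -f "$3"
}
```

A SQL file used this way could refer to the variable as `${hivevar:tbl}`, for example `SELECT COUNT(*) FROM ${hivevar:tbl};`, and be run with `run_report default test2 /root/test.sql`.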
Hive CLI Interactive Shell Commands Support
Example for source command:
No Format |
---|
hive> source /root/test.sql; |
hive> show tables; |
numbers_bucketed |
test2 |
testavro2 |
Hive CLI Configuration Support
Configuration Name | Supported in New CLI |
---|---|
hive.cli.errors.ignore | Yes |
hive.cli.prompt | Yes |
hive.cli.pretty.output.num.cols | No |
hive.cli.print.current.db | No |
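When migrating existing scripts, it may help to check whether they rely on either of the two settings marked "No" in the table. The following is a rough sketch using `grep`; the function name is illustrative, and the pattern covers only the two property names listed in the table.

```shell
#!/bin/sh
# Print the lines (with line numbers) in the given files that mention a
# hive.cli.* setting the table marks as not supported in the new CLI.
# Exits non-zero when nothing is found, following grep's convention.
find_unsupported_cli_settings() {
    grep -n -E 'hive\.cli\.(pretty\.output\.num\.cols|print\.current\.db)' "$@"
}
```

Running `find_unsupported_cli_settings ~/.hiverc` or pointing it at a directory of `.sql` files gives a quick list of places to review before switching.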
Performance Impacts
Using JMH to measure the average time cost of retrieving a data set, we have the following results.
Benchmark | Mode | Samples | Score | Error | Units |
---|---|---|---|---|---|
o.a.h.b.c.CliBench.BeeLineDriverBench.testSQLWithInitialFile | avgt | 1 | 1713326099.000 | ± NaN | ns/op |
o.a.h.b.c.CliBench.CliDriverBench.testSQLWithInitialFile | avgt | 1 | 1852995786.000 | ± NaN | ns/op |
The lower the score the better, since we are measuring time cost. With a single sample per benchmark, there is no clear performance gap in terms of retrieving data.
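To make the raw ns/op scores easier to read, the arithmetic can be spelled out as below. This is only a unit conversion of the two scores reported above; because each benchmark has one sample and no error estimate, the percentage should not be read as a statistically meaningful difference. The function name is illustrative.

```shell
#!/bin/sh
# Scores copied from the JMH results above (ns/op, one sample each).
beeline_ns=1713326099
cli_ns=1852995786

# Convert both scores to seconds and express the gap relative to CliDriver.
summarize_scores() {
    awk -v b="$1" -v c="$2" 'BEGIN {
        printf "beeline_s=%.3f cli_s=%.3f diff_pct=%.1f\n",
               b / 1000000000, c / 1000000000, (c - b) * 100 / c
    }'
}

summarize_scores "$beeline_ns" "$cli_ns"
# prints: beeline_s=1.713 cli_s=1.853 diff_pct=7.5
```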