Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Create new Beeline Properties table; Add fetchSize to table

...

Command

Description

!<SQLLine command>

List of SQLLine commands available at http://sqlline.sourceforge.net/.

Example: !quit exits the Beeline client.

!delimiter

Set the delimiter for queries written in Beeline. Multi-character delimiters are allowed, but quotation marks, slashes, and -- are not allowed. Defaults to ;

Usage: !delimiter $$

Version: 3.0.0 (HIVE-10865)

Beeline

...

Properties

Property

Hive specific commands (same as Hive CLI commands) can be run from Beeline, when the Hive JDBC driver is used.

Use ";" (semicolon) to terminate commands. Comments in scripts can be specified using the "--" prefix.

Command
Description
fetchsize

Standard JDBC enables you to specify the number of rows fetched with each database round-trip for a query, and this number is referred to as the fetch size.
Setting the fetch size in Beeline overrides the JDBC driver's default fetch size and affects subsequent statements executed in the current session.

  1. A value of -1 instructs Beeline to use the JDBC driver's default fetch size (default)
  2. A value of zero or more is passed to the JDBC driver for each statement
  3. Any other negative value will throw an Exception

    Usage: !set fetchsize 200

Version: 4.0.0 (HIVE-22853)

Beeline Hive Commands

Hive specific commands (same as Hive CLI commands) can be run from Beeline, when the Hive JDBC driver is used.

Use ";" (semicolon) to terminate commands. Comments in scripts can be specified using the "--" prefix.

<ivyurl> <ivyurl>  <ivyurl> <ivyurl>  <ivyurl> <ivyurl>As of Hive 1.2.0, adds

Command

Description

reset

Resets the configuration to the default values.

reset <key>

Resets the value of a particular configuration variable (key) to the default value.
Note: If you misspell the variable name, Beeline will not show an error.

set <key>=<value>

Sets the value of a particular configuration variable (key).
Note: If you misspell the variable name, Beeline will not show an error.

set

Prints a list of configuration variables that are overridden by the user or Hive.

set -v

Prints all Hadoop and Hive configuration variables

reset

Resets the configuration to the default values.

reset <key>

Resets the value of a particular configuration variable (key) to the default value.
Note: If you misspell the variable name, Beeline will not show an error.

set <key>=<value>

Sets the value of a particular configuration variable (key).
Note: If you misspell the variable name, Beeline will not show an error.

set

Prints a list of configuration variables that are overridden by the user or Hive.

set -v

Prints all Hadoop and Hive configuration variables.

add FILE[S] <filepath> <filepath>*
add JAR[S] <filepath> <filepath>*
add ARCHIVE[S] <filepath> <filepath>*

Adds one or more files, jars, or archives to the list of resources in the distributed cache. See Hive Resources for more information.

add FILE[S]

<filepath> <filepath>*


add JAR[S]

<filepath> <filepath>*


add ARCHIVE[S]

<filepath> <filepath>*

Adds one or more files, jars, or archives to the list of resources in the distributed cache. See Hive Resources for more information.

add FILE[S] <ivyurl> <ivyurl>* 
add JAR[S] <ivyurl> <ivyurl>* 
add ARCHIVE[S] <ivyurl> <ivyurl>*
As of Hive 1.2.0, adds one or more files, jars or archives to the list of resources in the distributed cache using an Ivy URL using an Ivy URL of the form ivy://group:module:version?query_string. See Hive Resources for more information.

list FILE[S]
list JAR[S]
list ARCHIVE[S]

Lists the resources already added to the distributed cache. See Hive Resources for more information. (As of Hive 0.14.0: HIVE-7592).

list FILE[S] <filepath>*
list JAR[S] <filepath>*
list ARCHIVE[S] <filepath>*

Checks whether the given resources are already added to the distributed cache or not. See Hive Resources for more information.

delete FILE[S] <filepath>*
delete JAR[S] <filepath>*
delete ARCHIVE[S] <filepath>*

Removes the resource(s) from the distributed cache.

delete FILE[S] <ivyurl> <ivyurl>* 
delete JAR[S] <ivyurl> <ivyurl>* 
delete ARCHIVE[S] <ivyurl> <ivyurl>*

As of Hive 1.2.0, removes the resource(s) which were added using the <ivyurl> from the distributed cache. See Hive Resources for more information.

reloadAs of Hive 0.14.0, makes HiveServer2 aware of any jar changes in the path specified by the configuration parameter hive.reloadable.aux.jars.path (without needing to restart HiveServer2). The changes can be adding, removing, or updating jar files.

dfs <dfs command>

Executes a dfs command.

<query string>

Executes a Hive query and prints results to standard output.

...

Running with nohangup (nohup) and ampersand (&) will place the process in the background and allow the terminal to disconnect while keeping the Beeline process running.  

Code Block
languagetext
nohup beeline --silent=true --showHeader=true --outputformat=dsv -f query.hql </dev/null > /tmp/output.log 2> /tmp/error.log &

Fetch size

In order of precedence, the fetch size for beeline is determined using :

  1. If the beeline user does nothing, each query will use the fetch size received from HS2
  2. If the beeline user sets the fetchSize in the JDBC connection string, each query will use the fetch size specified there
  3. If the user wants to set the fetchSize in the session, they can with the syntax: !set fetchSize xxx
    1. Setting a fetchSize of 0 will direct the driver to use the fetch size provided from HS2
    2. Setting a fetchSize greater than 0 will set the driver fetch size to the specified value
    3. Setting a fetchSize of -1 directs beeline to use the default JDBC default behavior: use the connection string fetchSize and, if none is specified, fallback to the fetch size specified by HS2 (this is the default beeline fetchSize value)
    4. Setting a fetchSize of any other negative integer value is an error

Keeping in mind that whatever the client requests for a fetch size will be overruled on the HiveServer, for every FetchResults request, depending on the configured value for hive.server2.thrift.resultset.max.fetch.size. When a client requests a fetchSize larger than the max, a WARN message is emitted into the HS2 logs for further investigation and to direct clients to adjust their expectations (and configurations).

...

) will place the process in the background and allow the terminal to disconnect while keeping the Beeline process running.  

Code Block
languagetext
nohup beeline --silent=true --showHeader=true --outputformat=dsv -f query.hql </dev/null > /tmp/output.log 2> /tmp/error.log &

JDBC

HiveServer2 has a JDBC driver. It supports both embedded and remote access to HiveServer2. Remote HiveServer2 mode is recommended for production use, as it is more secure and doesn't require direct HDFS/metastore access to be granted for users.

...

In the order of precedence, Hive JDBC driver uses the following criteria to determine fetchSize for ResultSet.

  1. Fetch size is set based on what is received from HS2 (hive.server2.thrift.resultset.default.fetch.size) during the client session open sequence
  2. Fetch size is set in the JDBC connection string (not well documented: jdbc:hive2://localhost:10000;fetchSize=100)
  3. Fetch size is set by the application code via JDBC setFetchSize [1]

Keeping in mind that whatever the client requests for a fetch size will be overruled on the HiveServer, for every FetchResults request, depending on the configured value for hive.server2.thrift.resultset.max.fetch.size. When a client requests a fetchSize larger than the max, a WARN message is emitted into the HS2 logs for further investigation and to direct clients to adjust their expectations (and configurations).

[1] https://docs.oracle.com/javase/8/docs/api/java/sql/Statement.html#setFetchSize-int-

Python Client

A Python client driver is available on github. For installation instructions, see Setting Up HiveServer2: Python Client Driver.

...