Hive Data Types

Table of Contents

Column

...

Types

These are all the supported column types in Hive:

...

Timestamps are interpreted to be timezoneless and stored as an offset from the UNIX epoch. Convenience UDFs for conversion to and from timezones are provided (to_utc_timestamp, from_utc_timestamp).
All existing datetime UDFs (month, day, year, hour, etc.) work with the TIMESTAMP data type.

Info

title	Version

Timestamps were introduced in Hive 0.8.0 (HIVE-2272).

Decimal

...

Datatype

The Decimal datatype in Hive is based on Java's BigDecimal which is used for representing immutable arbitrary precision decimal numbers in Java. All regular number operations (e.g. +, -, *, /) and relevant UDFs (e.g. Floor, Ceil, Round, and many more) have been updated to handle decimal types. You can cast to/from decimal types like you would do with other numeric types. The persistence format of the decimal type supports both scientific and non-scientific notation. Therefore, regardless of whether your dataset contains data like 1E+44 (scientific notation) or 4004 (non-scientific notation) or a combination of both, decimal datatype can be used for it.

With the present Hive implementation, the Decimal datatype has unlimited precision limited to 38 digits. While this has absolutely no minimal impact on usability, it's worth noting for performance reasons. HIVE-3796 is tracking the progress of allowing users to specify scale and precision when creating tables with decimal datatype.

The Decimal datatype is discussed further in Floating point typesPoint Types below.

Info

title	Version

...

Decimal datatype was introduced in Hive 0.11.0 (HIVE-2693).

Union Types

Union types can at any one point hold exactly one of their specified data types. You can create an instance of this type using the create_union UDF:

...

No Format

SELECT create_union(0, key), create_union(if(key<100, 0, 1), 2.0, value), create_union(1, "a", struct(2, "b")) FROM src LIMIT 2;

{0:"238"}	{1:"val_238"}	{1:{"col1":2,"col2":"b"}}
{0:"86"}	{0:2.0}	{1:{"col1":2,"col2":"b"}}

Literals

Integral

...

Types

Integral literals are assumed to be INT by default, unless the number exceeds the range of INT in which case it is interpreted as a BIGINT, or if one of the following postfixes is present on the number.

Type	Postfix	Example
TINYINT	Y	100Y
SMALLINT	S	100S
BIGINT	L	100L

String

...

Types

String literals can be expressed with either single quotes (') or double quotes ("). Hive uses C-style escaping within the strings.

Floating

...

Point Types

Floating point literals are assumed to be DOUBLE. Scientific notation is not yet supported.

Decimal

...

Types

Info

title	Version

Decimal datatype was introduced in Hive 0.11.0 (HIVE-2693). See Decimal Datatype above.

Decimal literals provide precise values and greater precision and range for floating point numbers than the DOUBLE data type. Decimal data types store exact representations of numeric values, while DOUBLE data types store very close approximations of numeric values.

Decimal types (See Decimal datatype above.) They are needed for use cases in which the precision (very close) approximation of a DOUBLE is insufficient, such as financial applications, equality and inequality checks, and rounding operations. They are also needed for use cases that deal with numbers outside the DOUBLE range (approximately -10³⁰⁸ to 10³⁰⁸) or very close to zero (-10^-308 to 10^-308). The precision of a Decimal type is practically unlimited. For a general discussion of the limits of the DOUBLE type, see the Wikipedia article Double-precision floating-point format.

...

The precision of a Decimal type is limited to 38 digits in Hive. See HIVE-4271 and HIVE-4320 for comments about the reasons for choosing this limit.

Using Decimal Types

You can create a table in Hive that uses the Decimal type with the following syntax:

...

Code Block
select cast(t as boolean) from decimal_2;

Mathematical UDFs

Decimal also supports many mathematical UDFs with the same syntax as used in the case of DOUBLE.

Basic mathematical operations that can use decimal types include:

PlusPositive
Negative
Addition
Subtraction
Multiplication
Division
Average (avg)
Sum
Count
Modulus (pmod)

These rounding functions can also take decimal types:

...

But these mathematical UDFs are not currently supported:

Exp
Log (Lnln)
Log2
Log10
Sqrt

Power(decimal, n) only supports positive integer values for the exponent n.

Casting Decimal Values

Casting is supported between decimal values and any other primitive type such as integer, double, boolean, and so on.

Testing Decimal Types

...

Space shortcuts

Child pages

Versions Compared

Old Version 12

New Version 13

Key

Hive Data Types

Column

Types

Decimal

Datatype

Union Types

Literals

Integral

Types

String

Types

Floating

Point Types

Decimal

Types

Using Decimal Types

Mathematical UDFs

Casting Decimal Values

Testing Decimal Types

Space shortcuts

Child pages

Page History

Versions Compared

Old Version 12

New Version 13

Key

Hive Data Types

Column

Types

Decimal

Datatype

Union Types

Literals

Integral

Types

String

Types

Floating

Point Types

Decimal

Types

Using Decimal Types

Mathematical UDFs

Casting Decimal Values

Testing Decimal Types