Page History

...

Return Type	Name(Signature)	Description
N rows	explode(ARRAY)	Returns one row for each element from the array.
N rows	explode(MAP)	Returns one row for each key-value pair from the input map with two columns in each row: one for the key and another for the value. (As of Hive 0.8.0.)
	inline(ARRAY<STRUCT[,STRUCT]>)	Explodes an array of structs into a table. (As of Hive 0.10.)
Array Type	explode(array<TYPE> a)	For each element in a, generates a row containing that element.
tuple	json_tuple(jsonStr, k1, k2, ...)	Takes a set of names (keys) and a JSON string, and returns a tuple of values. This is a more efficient version of the `get_json_object` UDF because it can get multiple keys with just one call.
tuple	parse_url_tuple(url, p1, p2, ...)	This is similar to the `parse_url()` UDF but can extract multiple parts at once out of a URL. Valid part names are: HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, USERINFO, QUERY:<KEY>.
N rows	posexplode(ARRAY)	Behaves like `explode` for arrays, but includes the position of items in the original array by returning a tuple of `(pos, value)`. (As of Hive 0.13.0.)
N rows	stack(INT n, v_1, v_2, ..., v_k)	Breaks up v_1, ..., v_k into n rows. Each row will have k/n columns. n must be constant.

...

explode() takes in an array (or a map) as an input and outputs the elements of the array (map) as separate rows. UDTF's UDTFs can be used in the SELECT expression list and as a part of LATERAL VIEW.

An As an example use of using explode() in the SELECT expression list is as follows:Consider , consider a table named myTable that has a single column (myCol) and two rows:

Array<int> myCol
[100,200,300]
[400,500,600]

...

posexplode() is similar to explode but instead of just returning the elements of the array it returns the element as well as its position in the original array.

An As an example use of using posexplode() in the SELECT expression list is as follows:Consider , consider a table named myTable that has a single column (myCol) and two rows:

Array<int> myCol
[100,200,300]
[400,500,600]

...

A new json_tuple() UDTF is introduced in Hive 0.7. It takes a set of names (keys) and a JSON string, and returns a tuple of values using one function. This is much more efficient than calling GET_JSON_OBJECT to retrieve more than one key from a single JSON string. In any case where a single JSON string would be parsed more than once, your query will be more efficient if you parse it once, which is what JSON_TUPLE is for. As JSON_TUPLE is a UDTF, you will need to use the LATERAL VIEW syntax in order to achieve the same goal.

...

Code Block
select a.timestamp, get_json_object(a.appevents, '$.eventid'), get_json_object(a.appenvets, '$.eventname') from log a;

should be changed to:

Code Block
select a.timestamp, b.* from log a lateral view json_tuple(a.appevent, 'eventid', 'eventname') b as f1, f2;

...

Space shortcuts

Child pages

Versions Compared

Old Version 137

New Version 138

Key