Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: add link to Lateral View in json_tuple section, misc. edits

...

Return Type

Name(Signature)

Description

N rows

explode(ARRAY)

Returns one row for each element from the array.

N rows

explode(MAP)

Returns one row for each key-value pair from the input map with two columns in each row: one for the key and another for the value. (As of Hive 0.8.0.)

 

inline(ARRAY<STRUCT[,STRUCT]>)

Explodes an array of structs into a table. (As of Hive 0.10.)

Array Type

explode(array<TYPE> a)

For each element in a, generates a row containing that element.

tuple

json_tuple(jsonStr, k1, k2, ...)

Takes a set of names (keys) and a JSON string, and returns a tuple of values. This is a more efficient version of the get_json_object UDF because it can get multiple keys with just one call.

tuple

parse_url_tuple(url, p1, p2, ...)

This is similar to the parse_url() UDF but can extract multiple parts at once out of a URL. Valid part names are: HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, USERINFO, QUERY:<KEY>.

N rows

posexplode(ARRAY)

Behaves like explode for arrays, but includes the position of items in the original array by returning a tuple of (pos, value). (As of Hive 0.13.0.)

 N rows

stack(INT n, v_1, v_2, ..., v_k)

Breaks up v_1, ..., v_k into n rows. Each row will have k/n columns. n must be constant.

...

explode() takes in an array (or a map) as an input and outputs the elements of the array (map) as separate rows. UDTF's UDTFs can be used in the SELECT expression list and as a part of LATERAL VIEW.

An As an example use of using explode() in the SELECT expression list is as follows:Consider , consider a table named myTable that has a single column (myCol) and two rows:

Array<int> myCol

[100,200,300]

[400,500,600]

...

posexplode() is similar to explode but instead of just returning the elements of the array it returns the element as well as its position in the original array.

An As an example use of using posexplode() in the SELECT expression list is as follows:Consider , consider a table named myTable that has a single column (myCol) and two rows:

Array<int> myCol

[100,200,300]

[400,500,600]

...

A new json_tuple() UDTF is introduced in Hive 0.7. It takes a set of names (keys) and a JSON string, and returns a tuple of values using one function. This is much more efficient than calling GET_JSON_OBJECT to retrieve more than one key from a single JSON string. In any case where a single JSON string would be parsed more than once, your query will be more efficient if you parse it once, which is what JSON_TUPLE is for. As JSON_TUPLE is a UDTF, you will need to use the LATERAL VIEW syntax in order to achieve the same goal.

...

Code Block
select a.timestamp, get_json_object(a.appevents, '$.eventid'), get_json_object(a.appenvets, '$.eventname') from log a;

should be changed to:

Code Block
select a.timestamp, b.*
from log a lateral view json_tuple(a.appevent, 'eventid', 'eventname') b as f1, f2;

...