...
Return Type | Name(Signature) | Description |
---|---|---|
N rows | explode(ARRAY) | Returns one row for each element from the array. |
N rows | explode(MAP) | Returns one row for each key-value pair from the input map with two columns in each row: one for the key and another for the value. (As of Hive 0.8.0.) |
| inline(ARRAY<STRUCT[,STRUCT]>) | Explodes an array of structs into a table. (As of Hive 0.10.) |
Array Type | explode(array<TYPE> a) | For each element in a, generates a row containing that element. |
tuple | json_tuple(jsonStr, k1, k2, ...) | Takes a set of names (keys) and a JSON string, and returns a tuple of values. This is a more efficient version of the |
tuple | parse_url_tuple(url, p1, p2, ...) | This is similar to the |
N rows | posexplode(ARRAY) | Behaves like |
N rows | stack(INT n, v_1, v_2, ..., v_k) | Breaks up v_1, ..., v_k into n rows. Each row will have k/n columns. n must be constant. |
...
explode()
takes in an array (or a map) as an input and outputs the elements of the array (map) as separate rows. UDTF's UDTFs can be used in the SELECT expression list and as a part of LATERAL VIEW.
An As an example use of using explode()
in the SELECT expression list is as follows:Consider , consider a table named myTable that has a single column (myCol) and two rows:
Array<int> myCol |
---|
[100,200,300] |
[400,500,600] |
...
posexplode()
is similar to explode
but instead of just returning the elements of the array it returns the element as well as its position in the original array.
An As an example use of using posexplode()
in the SELECT expression list is as follows:Consider , consider a table named myTable that has a single column (myCol) and two rows:
Array<int> myCol |
---|
[100,200,300] |
[400,500,600] |
...
A new json_tuple() UDTF is introduced in Hive 0.7. It takes a set of names (keys) and a JSON string, and returns a tuple of values using one function. This is much more efficient than calling GET_JSON_OBJECT to retrieve more than one key from a single JSON string. In any case where a single JSON string would be parsed more than once, your query will be more efficient if you parse it once, which is what JSON_TUPLE is for. As JSON_TUPLE is a UDTF, you will need to use the LATERAL VIEW syntax in order to achieve the same goal.
...
Code Block |
---|
select a.timestamp, get_json_object(a.appevents, '$.eventid'), get_json_object(a.appenvets, '$.eventname') from log a; |
should be changed to:
Code Block |
---|
select a.timestamp, b.* from log a lateral view json_tuple(a.appevent, 'eventid', 'eventname') b as f1, f2; |
...