...
Operator | Operand types | Description
---|---|---
A[n] | A is an Array and n is an int | Returns the nth element in the array A. The first element has index 0; e.g., if A is an array comprising ['foo', 'bar'], then A[0] returns 'foo' and A[1] returns 'bar'.
M[key] | M is a Map<K, V> and key has type K | Returns the value corresponding to the key in the map; e.g., if M is a map comprising {'f' -> 'foo', 'b' -> 'bar', 'all' -> 'foobar'}, then M['all'] returns 'foobar'.
S.x | S is a struct | Returns the x field of S; e.g., for struct foobar {int foo, int bar}, foobar.foo returns the integer stored in the foo field of the struct.
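
The three operators above can be combined in a single query. The following is a minimal sketch, assuming a hypothetical table `complex_demo` whose name and columns are invented purely for illustration:

```sql
-- Hypothetical table with one column of each complex type.
CREATE TABLE complex_demo (
  tags   ARRAY<STRING>,
  labels MAP<STRING, STRING>,
  point  STRUCT<foo:INT, bar:INT>
);

SELECT tags[0],        -- first element of the array (indexes start at 0)
       labels['all'],  -- value stored under the key 'all'
       point.foo       -- the foo field of the struct
FROM complex_demo;
```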
...
Return Type | Aggregation Function Name (Signature) | Description
---|---|---
BIGINT | count(*), count(expr), count(DISTINCT expr[, expr...]) | count(*) - Returns the total number of retrieved rows, including rows containing NULL values; count(expr) - Returns the number of rows for which the supplied expression is non-NULL; count(DISTINCT expr[, expr]) - Returns the number of rows for which the supplied expression(s) are unique and non-NULL.
DOUBLE | sum(col), sum(DISTINCT col) | Returns the sum of the elements in the group or the sum of the distinct values of the column in the group.
DOUBLE | avg(col), avg(DISTINCT col) | Returns the average of the elements in the group or the average of the distinct values of the column in the group.
DOUBLE | min(col) | Returns the minimum value of the column in the group.
DOUBLE | max(col) | Returns the maximum value of the column in the group.
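
To illustrate how these functions behave per group, here is a hedged sketch of a grouped query. The table `page_view` and its columns `country`, `userid`, and `viewTime` are assumptions made for this example only. The query uses a single DISTINCT column, which keeps it within the restriction that aggregations in one query may not use different DISTINCT columns.

```sql
-- Assumed table: page_view(viewTime INT, userid BIGINT, country STRING, ...)
SELECT pv.country,
       count(*)                  AS total_rows,      -- includes rows with NULLs
       count(pv.userid)          AS non_null_users,  -- rows where userid is non-NULL
       count(DISTINCT pv.userid) AS distinct_users,  -- unique non-NULL userids
       sum(pv.viewTime)          AS total_view_time,
       avg(pv.viewTime)          AS avg_view_time,
       min(pv.viewTime)          AS min_view_time,
       max(pv.viewTime)          AS max_view_time
FROM page_view pv
GROUP BY pv.country;
```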
...
Dynamic-partition insert (or multi-partition insert) is designed to solve this problem by dynamically determining which partitions should be created and populated while scanning the input table. This feature is available only from version 0.6.0 (trunk now). In a dynamic-partition insert, the input column values are evaluated to determine which partition each row should be inserted into. If that partition has not been created yet, it is created automatically. With this feature you need only one insert statement to create and populate all necessary partitions. In addition, since there is only one insert statement, there is only one corresponding MapReduce job. This significantly improves performance and reduces the Hadoop cluster workload compared with the multiple-insert case.
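
As a rough sketch of what such a statement can look like, the example below uses two hypothetical tables, `page_view_stg` and `page_view`, defined here only for illustration. The `dt` partition is given statically, while `country` is left dynamic and takes its value from the last column of the SELECT list. The `hive.exec.dynamic.partition` setting is an actual Hive configuration property; whether it must be set explicitly depends on your version's default.

```sql
-- Hypothetical staging table (unpartitioned) and target table (partitioned).
CREATE TABLE page_view_stg (viewTime INT, userid BIGINT, page_url STRING, country STRING);
CREATE TABLE page_view (viewTime INT, userid BIGINT, page_url STRING)
PARTITIONED BY (dt STRING, country STRING);

-- Enable dynamic-partition inserts.
SET hive.exec.dynamic.partition=true;

-- dt is static; country is dynamic and is read from the last SELECT column.
-- One partition is created per distinct country value found in the input.
INSERT OVERWRITE TABLE page_view PARTITION (dt='2008-06-08', country)
SELECT pvs.viewTime, pvs.userid, pvs.page_url, pvs.country
FROM page_view_stg pvs;
```

Because at least one partition column is static here, the statement does not require the nonstrict mode (`hive.exec.dynamic.partition.mode=nonstrict`) that is needed when every partition column is dynamic.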
...