Table of Contents |
---|
EXPLAIN Syntax
Hive provides an EXPLAIN
command that shows the execution plan for a query. The syntax for this statement is as follows:
Code Block |
---|
EXPLAIN [EXTENDED|DEPENDENCY|AUTHORIZATION|LOCKS|VECTORIZATION] query
|
AUTHORIZATION
is supported from HIVE 0.14.0 via HIVE-5961.
...
With the FORMATTED
keyword, it will be returned in JSON format.
Code Block |
---|
"OUTPUTS":["hdfs://localhost:9000/tmp/.../-mr-10000"],"INPUTS":["default@srcpart","default@src","default@srcpart@ds=2008-04-08/hr=11","default@srcpart@ds=2008-04-08/hr=12","default@srcpart@ds=2008-04-09/hr=11","default@srcpart@ds=2008-04-09/hr=12"],"OPERATION":"QUERY","CURRENT_USER":"navis","AUTHORIZATION_FAILURES":["Permission denied: Principal [name=navis, type=USER] does not have following privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, name=default.src], [SELECT] on Object [type=TABLE_OR_VIEW, name=default.srcpart]]"]}
|
Tne LOCKS Clause
This is useful to understand what locks the system will acquire to run the specified query. Since HIVE-17683
For example
Code Block |
---|
EXPLAIN LOCKS UPDATE target SET b = 1 WHERE p IN (SELECT t.q1 FROM source t WHERE t.a1=5)
|
Will produce output like this.
Code Block |
---|
LOCK INFORMATION:
default.source -> SHARED_READ
default.target.p=1/q=2 -> SHARED_READ
default.target.p=1/q=3 -> SHARED_READ
default.target.p=2/q=2 -> SHARED_READ
default.target.p=2/q=2 -> SHARED_WRITE
default.target.p=1/q=3 -> SHARED_WRITE
default.target.p=1/q=2 -> SHARED_WRITE |
Code Block |
---|
EXPLAIN FORMATTED LOCKS <sql>
|
is also supported which will produce JSON encoded output.
The VECTORIZATION Clause
See HIVE-11394 for more details.
User-level Explain Output
Since HIVE-8600 in Hive 1.1.0, we support a user-level explain extended output for any query at the log4j INFO level after set hive.log.explain.output=true (default is false).
Since HIVE-18469 in Hive 3.1.0, the user-level explain extended output for any query will be shown in the WebUI / Drilldown / Query Plan after set hive.server2.webui.explain.output=true (default is false).
Since HIVE-9780 in Hive 1.2.0, we support a user-level explain for Hive on Tez users. After set hive.explain.user=true (default is false) if the following query is sent, the user can see a much more clearly readable tree of operations.
Since HIVE-11133 in Hive 3.0.0, we support a user-level explain for Hive on Spark users. A separate configuration is used for Hive-on-Spark, hive.spark.explain.user which is set to false by default.
Code Block |
---|
EXPLAIN select sum(hash(key)), sum(hash(value)) from src_orc_merge_test_part where ds='2012-01-03' and ts='2012-01-03+14:46:31' |
Code Block |
---|
Plan optimized by CBO.
Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE)
Stage-0
Fetch Operator
limit:-1
Stage-1
Reducer 2
File Output Operator [FS_8]
compressed:false
Statistics:Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
Group By Operator [GBY_6]
| aggregations:["sum(VALUE._col0)","sum(VALUE._col1)"]
| outputColumnNames:["_col0","_col1"]
| Statistics:Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
|<-Map 1 [SIMPLE_EDGE]
Reduce Output Operator [RS_5]
sort order:
Statistics:Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
value expressions:_col0 (type: bigint), _col1 (type: bigint)
Group By Operator [GBY_4]
aggregations:["sum(_col0)","sum(_col1)"]
outputColumnNames:["_col0","_col1"]
Statistics:Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
Select Operator [SEL_2]
outputColumnNames:["_col0","_col1"]
Statistics:Num rows: 500 Data size: 47000 Basic stats: COMPLETE Column stats: NONE
TableScan [TS_0]
alias:src_orc_merge_test_part
Statistics:Num rows: 500 Data size: 47000 Basic stats: COMPLETE Column stats: NONE
|