Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

In INSERT ... SELECT ... queries, the dynamic partition columns must be specified last among the columns in the select SELECT statement and in the same order of the directory hierarchy in which the table data are stored (i.e., normally in the order in which the partitions were declared in the CREATE TABLE statement)they appear in the PARTITION() clause.

  • all DP columns – only allowed in nonstrict mode. In strict mode, we should throw an error. e.g.,
    Code Block
       INSERT OVERWRITE TABLE T PARTITION (ds, hr) 
       SELECT key, value, ds, hr FROM srcpart WHERE ds is not null and hr>10;
    
  • mixed SP & DP columns. e.g.,
    Code Block
       INSERT OVERWRITE TABLE T PARTITION (ds='2010-03-03', hr) 
       SELECT key, value, /*ds,*/ hr FROM srcpart WHERE ds is not null and hr>10;
    
  • SP is a subpartition of a DP: should throw an error because partition column order determins directory hierarchy. We cannot change the hierarchy in DML. e.g.,
    Code Block
       -- throw an exception
       INSERT OVERWRITE TABLE T PARTITION (ds, hr = 11) 
       SELECT key, value, ds/*, hr*/ FROM srcpart WHERE ds is not null and hr=11;
    
  • multi-table insert. e.g.,
    Code Block
       FROM S
       INSERT OVERWRITE TABLE T PARTITION (ds='2010-03-03', hr) 
       SELECT key, value, ds, hr FROM srcpart WHERE ds is not null and hr>10
       INSERT OVERWRITE TABLE R PARTITION (ds='2010-03-03, hr=12)
       SELECT key, value, ds, hr from srcpart where ds is not null and hr = 12;
    
  • CTAS – syntax is a little bit different from CTAS on non-partitioned tables, since
    the schema of the target table is not totally derived from the select-clause.
    We need to specify the schema including partitioning columns in the
    create-clause. e.g.,
    Code Block
       CREATE TABLE T (key int, value string) PARTITIONED BY (ds string, hr int) AS 
       SELECT key, value, ds, hr+1 hr1 FROM srcpart WHERE ds is not null and hr>10;
    
    The above example shows the case of all DP columns in CTAS. If you want put
    some constant for some partitioning column, you can specify it in the
    select-clause. e.g,
    Code Block
       CREATE TABLE T (key int, value string) PARTITIONED BY (ds string, hr int) AS 
       SELECT key, value, "2010-03-03", hr+1 hr1 FROM srcpart WHERE ds is not null and hr>10;
    

...