Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • incremental reads with insert_overwrite
    • Lets say there is a pipeline setup using incremental reads. table1 -- (transformation) --> table2 
    • if there is an insert overwrite on table1, previous records are considered 'deleted'  in table1. But, incremental reads on table1 will not send deletions with current implementation. So table2 data will continue to have the previous records. This is not being tracked as part of this RFC, but an important feature to implement IMO.
  • scaling insert overwrite
    • Partitioner implementation today stores all file groups effected in memory. We may have to make the internal data structures of Partitioner spillable, if 'insert_overwrite_table' is done on a large table with large number of file groups.

...