Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Discussion thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-145-Support-SQL-windowing-table-valued-function-td45269.html

Vote thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-145-Support-SQL-windowing-table-valued-function-td455712nd-td46452.html

JIRA: 

Jira
serverASF JIRA
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyFLINK-19604

...

Windowing table-valued functions

Tumbling Windows

The table-valued function TUMBLE  assigns a window for each row of a relation based on a time attribute columnWe would like to propose 4 kinds of window TVF: tumbling, hopping, cumulative, session. The return value of TUMBLE is a relation window TVF is a relation that includes all columns of data as well as additional two columns named one column named window to indicate the assigned window. The window column is a composite ROW type consists of window_start and , window_end and window_time which indicate the assigned window columns. The window_time column field is the a time attribute of the record after window TVF, it alwsays equal to "window_end - 1"

Tumbling Windows

The table-valued function TUMBLE  assigns a window for each row of a relation based on a time attribute column. The return value of TUMBLE is a relation that includes all columns of data as well as additional a ROW type column named window to indicate the assigned window. The original row time attribute "timecol" will be a regular timestamp column after window TVF. All assigned windows have the same length, and that’s why tumbling sometimes is named as “fixed windowing”.

TUMBLE  that takes three required parameters and one optional parameter:

Code Block
languagesql

TUMBLE(data, DESCRIPTOR(timecol), size [, offset ])

...

The table-valued function HOP assigns windows that cover rows within the interval of size and shifting every slide based on a timestamp column. The return value of HOP is  is a relation that includes all columns of data as well as additional two columns named window_start and window_end and window_time which indicate the assigned window. The window_time column is the time attribute of the record after window TVF, it alwsays equal to "window_end - 1". The a ROW type column named window to indicate the assigned window. The original row time attribute "timecol" will be a regular timestamp column after window TVF. Windows assigned could have overlapping so hopping sometime is named as “sliding windowing”.

...

The table-valued function CUMULATE assigns windows that cover rows within an initial interval of step size, and expanding to one more step size (keep window start fixed) every step until to the max window size. The return value of CUMULATE is a relation that includes all columns of data as well as additional two columns named window_start and window_end and window_time which a ROW type column named window to indicate the assigned window. The window_time column is the time attribute of the record after window TVF, it alwsays equal to "window_end - 1". The original row time attribute "timecol" will be a regular timestamp column after window TVF. Windows assigned could have overlapping.

...

The table-valued function SESSION assigns windows that cover rows based on datetime. Within a session window, distances of rows are less than interval. Session window is applied per key. The returned return value of SESSION SE is a relation that includes all columns of data as well as additional three columns named window_start and window_end and window_time which indicate the assigned window. The window_time column is the time attribute of the record after window TVF, it alwsays equal to "window_end - 1"a ROW type column named window to indicate the assigned window. The original row time attribute "timecol" will be a regular timestamp column after window TVF.  

...

Explore more ability of Polymorphic Table Functions

It's very exciting to see the potential ability of PTF.  As we can see, in the future, maybe we can support the following features in SQL in a standard PTF way.

  • advanced operations supported in Table API (FLIP-29), e.g. drop_columns, user-defined-table-aggregate
  • user defined join operator
  • a shortcut TopN function
  • re-assign watermarks?
  • re-partition data, similar feature to Hive DISTRIBUTED BY syntax. 
  • ...