Motivation

...

  • Let Table API users easily leverage the powerful features of SQL, e.g. deduplication, Top-N, etc.
  • Provide convenient, frequently used functionality, e.g. a series of methods for null data handling (which becomes a problem when a table has hundreds of columns) and for data sampling and splitting (in ML it is very common to split a data set into separate subsets for training and validation)

Public Interfaces

...

Align SQL functionalities

deduplicate

Specification:

...

CREATE TEMPORARY VIEW T AS
SELECT * FROM TABLE(
   TUMBLE(TABLE table, DESCRIPTOR(rowtime), INTERVAL '10' MINUTES));

SELECT *
FROM (
   SELECT *,
     ROW_NUMBER() OVER (PARTITION BY window_start, window_end, col1
       ORDER BY col2 asc, col3 desc) AS rownum
   FROM T)
WHERE rownum <= 3
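
To make the semantics of the query above concrete, here is a minimal plain-Java sketch of what it computes: rows are grouped by (window_start, window_end, col1), each group is ordered by col2 ascending and then col3 descending, and only the first three rows of each group are kept. The class WindowTopNSketch, its Row type, and the topN method are hypothetical names used only for illustration; they are not part of Flink or of this proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WindowTopNSketch {

    /** Hypothetical row shape mirroring the columns used in the query. */
    public static final class Row {
        public final long windowStart;
        public final long windowEnd;
        public final String col1;
        public final int col2;
        public final int col3;

        public Row(long windowStart, long windowEnd, String col1, int col2, int col3) {
            this.windowStart = windowStart;
            this.windowEnd = windowEnd;
            this.col1 = col1;
            this.col2 = col2;
            this.col3 = col3;
        }
    }

    /** Keeps at most n rows per partition, ordered by col2 asc, col3 desc. */
    public static List<Row> topN(List<Row> rows, int n) {
        // PARTITION BY window_start, window_end, col1
        Map<List<Object>, List<Row>> partitions = new LinkedHashMap<>();
        for (Row r : rows) {
            List<Object> key = Arrays.<Object>asList(r.windowStart, r.windowEnd, r.col1);
            partitions.computeIfAbsent(key, k -> new ArrayList<>()).add(r);
        }

        // ORDER BY col2 asc, col3 desc
        Comparator<Row> order = Comparator.comparingInt((Row r) -> r.col2)
                .thenComparing(Comparator.comparingInt((Row r) -> r.col3).reversed());

        List<Row> result = new ArrayList<>();
        for (List<Row> partition : partitions.values()) {
            partition.sort(order);
            // WHERE rownum <= n
            result.addAll(partition.subList(0, Math.min(n, partition.size())));
        }
        return result;
    }
}
```

Calling WindowTopNSketch.topN(rows, 3) corresponds to the rownum <= 3 filter. Unlike the streaming query, this sketch works on a bounded list and ignores retractions and state cleanup.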

...

hint

Specification:

Table hint(String name, Map<String, String> options)

// The hint options defined in RelHint support both List and Map. Flink currently uses only Map, so there is no need to add the following interface for now. We could introduce it once Flink supports List.
Table hint(String name, String... parameters)
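
As a rough illustration of why a name plus a Map of options is enough for the first signature, the sketch below renders such a pair in the SQL hint comment form that Flink SQL uses for key-value hints, for example /*+ OPTIONS('k'='v') */. It assumes the proposed hint(name, options) is meant to mirror SQL hints. HintSketch and toSqlHint are hypothetical helpers written for this example, not existing Flink classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class HintSketch {

    /**
     * Renders a hint name and its key-value options as a SQL hint comment.
     * Hypothetical helper for illustration only; values are not escaped.
     */
    public static String toSqlHint(String name, Map<String, String> options) {
        StringJoiner joined = new StringJoiner(", ", "/*+ " + name + "(", ") */");
        for (Map.Entry<String, String> e : options.entrySet()) {
            joined.add("'" + e.getKey() + "'='" + e.getValue() + "'");
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("scan.startup.mode", "earliest-offset");
        System.out.println(toSqlHint("OPTIONS", options));
    }
}
```

With the proposed method, the equivalent Table API call would presumably look like table.hint("OPTIONS", options); the exact behavior is defined by the specification above, not by this sketch.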

...