Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Released: <Flink Version>

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

We can introduce richer merge strategies, one of which is already introduced is PartialUpdateMergeFunction, which completes non-NULL fields when merging. We can introduce more powerful merge strategies, such as support for pre-aggregated merges. Currently the pre-aggregation is used by many big data systems, e.g. Apache Doris, Apache Kylin, Druid, to reduce storage cost and accelerate aggregation query. By introducing pre-aggregated merge to Flink table store, it can acquire the same benefit.  Aggregate functions which we plan to  implement includes sum, max/min, count, replace_if_not_null, replace, concatenate, or/and.

...

Wiki Markup
--DDL
CREATE TABLE T (
    pk STRING PRIMARY KEY NOT ENFOCED,
    sum_field1 BIGINT,
    max_field1 BIGINT,
    default_field BIGINT ) 
WITH ( 
'merge-engine' = 'aggregation',
'aggregate-function' = '\{
    sum_field1:sum,
    max_field2:max
  \}'   \-\- sum up all sum_field1 with same pk; get max value of all max_field1 with same pk
);

-- DML
INSERT INTO T VALUES ('pk1', 1, 1, 1);
INSERT INTO T VALUES ('pk1', 1, 1, NULL);
-- verify
SELECT * FROM T;
=> output 'pk1', 2, 2, NULL

...