...
Released: <Flink Version>
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
We can introduce richer merge strategies, one of which is already introduced is PartialUpdateMergeFunction, which completes non-NULL fields when merging. We can introduce more powerful merge strategies, such as support for pre-aggregated merges. Currently the pre-aggregation is used by many big data systems, e.g. Apache Doris, Apache Kylin, Druid, to reduce storage cost and accelerate aggregation query. By introducing pre-aggregated merge to Flink table store, it can acquire the same benefit. Aggregate functions which we plan to implement includes sum, max/min, count, replace_if_not_null, replace, concatenate, or/and.
...
Wiki Markup |
---|
--DDL CREATE TABLE T ( pk STRING PRIMARY KEY NOT ENFOCED, sum_field1 BIGINT, max_field1 BIGINT, default_field BIGINT ) WITH ( 'merge-engine' = 'aggregation', 'aggregate-function' = '\{ sum_field1:sum, max_field2:max \}' \-\- sum up all sum_field1 with same pk; get max value of all max_field1 with same pk ); -- DML INSERT INTO T VALUES ('pk1', 1, 1, 1); INSERT INTO T VALUES ('pk1', 1, 1, NULL); -- verify SELECT * FROM T; => output 'pk1', 2, 2, NULL |
...