...
We can introduce richer merge strategies. One already exists: PartialUpdateMergeFunction, which completes non-NULL fields when merging. We can add more powerful strategies, such as pre-aggregated merges. Pre-aggregation is used by many big data systems, e.g. Apache Doris, Apache Kylin, and Druid, to reduce storage cost and accelerate aggregation queries. Introducing pre-aggregated merge to Flink Table Store would bring the same benefits. The aggregate functions we plan to implement include sum, max/min, count, last_non_null_value, last_value, listagg, and bool_or/bool_and.
Public Interfaces
Basic usage of pre-aggregated merge
...
The max/min aggregate functions support the DECIMAL, TINYINT, SMALLINT, INTEGER, BIGINT, FLOAT, DOUBLE, INTERVAL (INTERVAL YEAR TO MONTH, INTERVAL DAY TO SECOND), DATE, TIME, TIMESTAMP, and TIMESTAMP_LTZ data types.
The count/last_non_null_value/last_value aggregate functions support all data types.
...
The bool_and/bool_or aggregate functions support the BOOLEAN data type.
Default values of these aggregate functions
sum: default value of corresponding data type
count: 0
max/min/last_non_null_value/last_value: NULL
listagg: ""
bool_and: true
bool_or: false
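The defaults above can be read as the initial accumulator of each function. The following is a minimal sketch of that idea, not the Table Store implementation: `PreAggSketch` and `FieldAgg` are hypothetical names, sum is shown for BIGINT only (assuming its default is 0), NULL inputs are assumed to be skipped by every function except last_value, and the "," separator for listagg is an assumption as well.

```java
import java.util.function.BiFunction;

// Hypothetical sketch: one aggregator per field, seeded with its default value.
final class PreAggSketch {

    /** Folds inputs of type I into an accumulator of type A, starting from the default. */
    static final class FieldAgg<I, A> {
        final A defaultValue;
        final BiFunction<A, I, A> step;

        FieldAgg(A defaultValue, BiFunction<A, I, A> step) {
            this.defaultValue = defaultValue;
            this.step = step;
        }

        A aggregate(Iterable<I> inputs) {
            A acc = defaultValue;
            for (I in : inputs) {
                acc = step.apply(acc, in);
            }
            return acc;
        }
    }

    // sum: default 0 (assumed for BIGINT); NULL inputs skipped (assumption).
    static final FieldAgg<Long, Long> SUM =
            new FieldAgg<>(0L, (acc, in) -> in == null ? acc : Long.valueOf(acc + in));

    // max: default NULL; the first non-NULL input replaces it.
    static final FieldAgg<Long, Long> MAX = new FieldAgg<>(null, PreAggSketch::max);

    // listagg: default ""; the "," separator is an assumption.
    static final FieldAgg<String, String> LISTAGG =
            new FieldAgg<>("", (acc, in) -> in == null ? acc : acc.isEmpty() ? in : acc + "," + in);

    // bool_and: default true; bool_or: default false.
    static final FieldAgg<Boolean, Boolean> BOOL_AND =
            new FieldAgg<>(true, (acc, in) -> in == null ? acc : Boolean.valueOf(acc && in));
    static final FieldAgg<Boolean, Boolean> BOOL_OR =
            new FieldAgg<>(false, (acc, in) -> in == null ? acc : Boolean.valueOf(acc || in));

    // count: default 0; counts non-NULL inputs of any type (assumption).
    static <T> FieldAgg<T, Long> count() {
        return new FieldAgg<>(0L, (acc, in) -> in == null ? acc : Long.valueOf(acc + 1));
    }

    // last_value: default NULL; always takes the newest input, even if NULL.
    static <T> FieldAgg<T, T> lastValue() {
        return new FieldAgg<>(null, (acc, in) -> in);
    }

    // last_non_null_value: default NULL; keeps the newest non-NULL input.
    static <T> FieldAgg<T, T> lastNonNullValue() {
        return new FieldAgg<>(null, (acc, in) -> in != null ? in : acc);
    }

    private static Long max(Long acc, Long in) {
        if (in == null) return acc;
        if (acc == null) return in;
        return Math.max(acc, in);
    }
}
```

With no input at all, each aggregator returns its default, which is how the list above can be interpreted: for example, `PreAggSketch.MAX.aggregate(...)` over an empty list yields NULL, and `PreAggSketch.BOOL_AND` yields true.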
Changelog support
In most cases, the modifications to Table Store are INSERT changes. However, Table Store can also be converted into a retract stream, which may include retract messages (UPDATE/DELETE changes).
...
Aggregate functions that support UPDATE changes: sum, count.
Aggregate functions that support DELETE changes: sum, count.
More design work is needed before the other aggregate functions can support UPDATE/DELETE changes.
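One way to see why sum and count are the easy cases is that both have an inverse operation, so a retract message can simply be subtracted. The sketch below illustrates this under stated assumptions: `RetractSketch` and its local `Kind` enum are hypothetical (the enum only mirrors the usual INSERT / UPDATE_BEFORE / UPDATE_AFTER / DELETE change kinds), and an UPDATE is assumed to arrive as an UPDATE_BEFORE / UPDATE_AFTER pair. It is not the proposed implementation.

```java
// Hypothetical sketch of retract handling for invertible aggregates.
final class RetractSketch {

    /** Local stand-in for the change kind of a record (assumption, not a Table Store type). */
    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    /** UPDATE_BEFORE and DELETE withdraw a previously added value. */
    static boolean isRetract(Kind kind) {
        return kind == Kind.UPDATE_BEFORE || kind == Kind.DELETE;
    }

    /** sum: add on INSERT/UPDATE_AFTER, subtract on UPDATE_BEFORE/DELETE. */
    static long sum(long acc, Kind kind, long value) {
        return isRetract(kind) ? acc - value : acc + value;
    }

    /** count: increment on INSERT/UPDATE_AFTER, decrement on UPDATE_BEFORE/DELETE. */
    static long count(long acc, Kind kind) {
        return isRetract(kind) ? acc - 1 : acc + 1;
    }

    /**
     * max has no inverse: once the current maximum is retracted, the accumulator
     * alone cannot tell what the next-largest value was, so this sketch rejects it.
     */
    static long max(long acc, Kind kind, long value) {
        if (isRetract(kind)) {
            throw new UnsupportedOperationException("max cannot retract without extra state");
        }
        return Math.max(acc, value);
    }
}
```

For example, applying INSERT 5, INSERT 3, UPDATE_BEFORE 3, UPDATE_AFTER 10, DELETE 5 to `sum` starting from 0 leaves 10, and the same sequence leaves `count` at 1. The `max` case hints at why the remaining functions need further design.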
...