...
On the other hand, this operator would increase the end-to-end latency of the Flink job, as the latest records would only be emitted when flush() is triggered. In the worst case, where the latest record of a key is buffered right after a flush, the latency of that record would increase by max-flush-interval. Besides, each buffering operation incurs one read and one write state access, whose overhead might be mitigated if the in-memory state cache proposed in FLIP-325 is implemented and utilized.
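The buffer-and-flush semantics described above can be sketched in plain Java. This is a minimal illustration of the idea, not the actual operator implementation; the class and method names below are made up for this sketch:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: buffer only the latest record per key and emit
// everything when flush() fires (e.g. on a max-flush-interval timer).
public class BufferingSketch {
    // Stands in for keyed state; one read + one write access per element.
    private final Map<String, Long> latestPerKey = new LinkedHashMap<>();

    // Each incoming record overwrites the previously buffered value for
    // its key, so downstream only ever sees the latest value per key.
    public void processElement(String key, long value) {
        latestPerKey.put(key, value);
    }

    // flush() emits one record per buffered key and clears the buffer.
    // A record arriving just after a flush waits a full interval here,
    // which is the worst-case latency increase noted above.
    public List<String> flush() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : latestPerKey.entrySet()) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        latestPerKey.clear();
        return out;
    }
}
```

Note how two updates to the same key between flushes collapse into a single output record, which is the source of the reduced output traffic discussed in this FLIP.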
Built-in operators and functions that would be affected by this FLIP
...
The design proposed in this FLIP is backward compatible.
If this FLIP and FLIP-325 could be implemented, we propose to deprecate the table.exec.mini-batch.* configurations in Table/SQL programs. As we can see, table.exec.mini-batch brings two optimizations: reduced state backend access and a reduced number of output records. The design proposed in this FLIP and FLIP-325 can cover these optimizations with the following additional advantages:
Less heap memory usage. The operator only needs to store the aggregated value and one output record for each unique key, rather than the full list of the original elements.
Better cache hit rate, since hot keys do not have to be evicted from the cache periodically (as they are under mini-batch processing).
No increase in the duration of the synchronous checkpoint stage.
Applicable to operators in the DataStream API as well, instead of only the Table/SQL API.
...