...

On the other hand, this operator would increase the end-to-end latency of the Flink job, as the latest records would only be output when flush() is triggered. In the worst case, where the latest record of a key is buffered right after a flush, the latency of that record would increase by max-flush-interval. In addition, each buffering operation incurs one state read and one state write, whose overhead might be mitigated if the in-memory state cache proposed in FLIP-325 is implemented and utilized.
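The latency trade-off can be illustrated with a minimal Java sketch. It is not Flink code: the class LatestPerKeyBuffer and the methods buffer() and addedLatencyMs() are hypothetical names introduced here, the state is modelled as an in-memory map, and flushes are assumed to fire at fixed multiples of max-flush-interval.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the buffering operator; not part of the Flink API. */
final class LatestPerKeyBuffer {
    // Latest value per key. A newer record for the same key overwrites the older one.
    private final Map<String, Long> latest = new LinkedHashMap<>();

    /** One read-and-write access to the (here in-memory) state per record. */
    void buffer(String key, long value) {
        latest.put(key, value);
    }

    /** Emits the latest record of every buffered key and clears the buffer. */
    Map<String, Long> flush() {
        Map<String, Long> out = new LinkedHashMap<>(latest);
        latest.clear();
        return out;
    }

    /**
     * Extra latency of a record arriving at arrivalMs, assuming (for illustration only)
     * that flushes fire at exact multiples of intervalMs and that a record arriving
     * at a flush instant is buffered just after that flush.
     */
    static long addedLatencyMs(long arrivalMs, long intervalMs) {
        long nextFlushMs = (arrivalMs / intervalMs + 1) * intervalMs;
        return nextFlushMs - arrivalMs;
    }

    public static void main(String[] args) {
        LatestPerKeyBuffer op = new LatestPerKeyBuffer();
        op.buffer("a", 1);
        op.buffer("a", 2); // overwrites the earlier record for key "a"
        op.buffer("b", 7);
        System.out.println(op.flush());                // {a=2, b=7}
        System.out.println(addedLatencyMs(0, 1000));   // 1000 (worst case)
        System.out.println(addedLatencyMs(999, 1000)); // 1
    }
}
```

Three input records produce two output records, and a record buffered immediately after a flush waits a full interval before it is emitted.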

Built-in operators and functions that would be affected by this FLIP

...

The design proposed in this FLIP is backward compatible.


If this FLIP and FLIP-325 could be implemented, we propose to deprecate the table.exec.mini-batch.* configurations in Table/SQL programs. As we can see, table.exec.mini-batch brings two optimizations: reduced state backend access and reduced number of output records, and the design proposed in this FLIP and FLIP-325 can cover these optimizations with the following additional advantages:

  • Less heap memory usage. The operator only needs to store the aggregated value and one output record for each unique key, rather than the full list of the original elements.

  • Better cache hit rate, since hot keys do not have to be evicted from the cache periodically (as they are with mini-batch processing).

  • No increase in the duration of the synchronous checkpoint stage.

  • Applicable to operators in DataStream API as well, instead of only Table/SQL API.
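The heap-memory point in the list above can be made concrete with a small Java sketch. It is an illustration under stated assumptions rather than Flink code: BufferFootprint and its two methods are hypothetical names, each record is reduced to its key, and the aggregation is a simple count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical comparison of the two buffering strategies; not part of the Flink API. */
final class BufferFootprint {

    /** Mini-batch style: keep every original element of the bundle, grouped by key. */
    static int bufferedElementsWithMiniBatch(List<String> recordKeys) {
        Map<String, List<Integer>> bundle = new HashMap<>();
        for (String key : recordKeys) {
            bundle.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
        }
        int total = 0;
        for (List<Integer> elements : bundle.values()) {
            total += elements.size();
        }
        return total; // grows with the number of records
    }

    /** Proposed style: keep only one aggregated value per unique key. */
    static int bufferedElementsWithAggregation(List<String> recordKeys) {
        Map<String, Integer> aggregated = new HashMap<>();
        for (String key : recordKeys) {
            aggregated.merge(key, 1, Integer::sum);
        }
        return aggregated.size(); // grows with the number of unique keys
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a", "a", "a", "b");
        System.out.println(bufferedElementsWithMiniBatch(keys));   // 4
        System.out.println(bufferedElementsWithAggregation(keys)); // 2
    }
}
```

The gap between the two numbers widens as the same hot keys repeat within one flush interval.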

...