...
With this FLIP, we need to adjust the APIs around idleness, such that it's clear that only users should be able to make a source instance go idle. We need to emphasize the potential of late records and subsequent data loss if no counter-measures are taken.
Public Interfaces
- WatermarkOutput
Output, Input to rename StreamStatus
InputWatermarkOutput
Proposed Changes
The biggest change includes StreamStatus
albeit it's an internal class. We propose to adjust the definition (JavaDoc) to exclude records and rename the class to WatermarkStatus
and move it to the watermarkstatus package. Finally, we need to rename emitStreamStatus
and processStreamStatus
in a few public evolving interfaces (Input, Output). Here, we can add deprecated shortcuts to the top-level interfaces with the old name.
...
Since sources may offer users the ability to go IDLE
beyond the WatermarkStrategy#withIdleness
, we propose to also add WatermarkOutput#markActive
to quickly go active again without the need to explicitly emit a watermark first. In fact, we would change the contract of WatermarkOutput#emitWatermark
to not automatically go active: going back and forth should always be a cautious decision in the implementationsThis addition allows to reduce the number of late records when a source knows that records are available until it processed a batch and actually emit a watermark.
Compatibility, Deprecation, and Migration Plan
...
- User code using
Input
andOutput
would work without recompilation. Compiling would yield deprecation warnings.
...