...

  • CountTrigger(threshold) (non-purging): Fires after the number of elements in a window passes the given threshold.

  • EventTimeTrigger() (purging): Fires after the watermark passes the end of a window.

  • ProcessingTimeTrigger() (purging): Fires after processing time passes the end of a window.

  • ContinuousEventTimeTrigger(interval) (non-purging): Fires repeatedly with a given periodicity according to the watermark.

  • ContinuousProcessingTimeTrigger(interval) (non-purging): Fires repeatedly with a given periodicity according to processing time.

  • DeltaTrigger(delta-function) (non-purging): Fires once the delta between a start value and the newly added value exceeds a given threshold.

  • PurgingTrigger(trigger) (purging): Turns any trigger into a purging trigger.
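
The difference between purging and non-purging triggers can be illustrated with a small, self-contained model. This is only a sketch: `PurgingSketch`, `CountingTrigger` and `emittedPaneSizes` are hypothetical names, not Flink's Trigger API, and the model assumes that the count restarts after each firing and that a firing happens as soon as the count reaches the threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained model; it does not use Flink's Trigger API.
class PurgingSketch {
    enum Result { CONTINUE, FIRE, FIRE_AND_PURGE }

    /** Fires once the element count reaches the threshold, then restarts counting. */
    static final class CountingTrigger {
        private final int threshold;
        private final boolean purging;
        private int count = 0;

        CountingTrigger(int threshold, boolean purging) {
            this.threshold = threshold;
            this.purging = purging;
        }

        Result onElement() {
            count++;
            if (count < threshold) {
                return Result.CONTINUE;
            }
            count = 0; // assumption: counting restarts after every firing
            return purging ? Result.FIRE_AND_PURGE : Result.FIRE;
        }
    }

    /** Feeds elements into a single window and records the size of each emitted pane. */
    static List<Integer> emittedPaneSizes(int elements, CountingTrigger trigger) {
        List<Integer> window = new ArrayList<>();
        List<Integer> sizes = new ArrayList<>();
        for (int i = 0; i < elements; i++) {
            window.add(i);
            Result result = trigger.onElement();
            if (result != Result.CONTINUE) {
                sizes.add(window.size());
                if (result == Result.FIRE_AND_PURGE) {
                    window.clear(); // a purging trigger discards the window contents
                }
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Non-purging: each firing sees everything accumulated so far.
        System.out.println(emittedPaneSizes(6, new CountingTrigger(2, false)));
        // Purging: each firing sees only the elements since the previous firing.
        System.out.println(emittedPaneSizes(6, new CountingTrigger(2, true)));
    }
}
```

In this model, a non-purging trigger emits panes that keep growing, while wrapping the same logic as a purging trigger emits panes of constant size.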

Limitations of the current triggers: 

 

  1. There is no out-of-the-box way to combine two triggers. As an example, the user cannot specify a trigger that fires “after the watermark passes the end of the window and only if the number of elements is at least 5”. This could be useful for cases where data are anonymized using K-anonymity strategies.
  2. In a previous document we described the notion of “allowed lateness”. Allowed lateness is a setting of the WindowOperator: a window for which maxTimestamp + allowedLateness <= currentWatermark is considered late. Lateness affects two things: 1) elements that belong to late windows are dropped, and 2) the state of a window is dropped as soon as the watermark passes the end of the window plus the allowed lateness (garbage collection). Although we allow elements to be late, the current triggers provide no way to customize how a pipeline fires for elements that are late but not dropped. These firings are called “late firings”. In addition, there is no way to specify “early firings”, i.e. firings that happen before the watermark (or the processing time) passes the end of the window. To illustrate the above, the user cannot specify a policy that says: “fire in daily windows with allowed lateness of 12 hours, but give me early results every hour, and after the watermark passes the end of the window, fire every 10 elements”.
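
The lateness condition quoted above can be written down directly. The following is a minimal sketch with hypothetical names (`LatenessSketch`, `isWindowLate`); it assumes all values are epoch-style timestamps in milliseconds and is not taken from the WindowOperator source.

```java
// Hypothetical helper illustrating the lateness condition from the text.
class LatenessSketch {
    static final long HOUR_MS = 60L * 60L * 1000L;

    /** A window is late once maxTimestamp + allowedLateness <= currentWatermark. */
    static boolean isWindowLate(long maxTimestamp, long allowedLateness, long currentWatermark) {
        return maxTimestamp + allowedLateness <= currentWatermark;
    }

    public static void main(String[] args) {
        // Daily window [0h, 24h) with 12 hours of allowed lateness.
        long maxTimestamp = 24 * HOUR_MS - 1; // assumption: last timestamp that belongs to the window
        long allowedLateness = 12 * HOUR_MS;

        // Watermark at 30h: the window still accepts elements.
        System.out.println(isWindowLate(maxTimestamp, allowedLateness, 30 * HOUR_MS));
        // Watermark at 36h: elements are dropped and the window state can be garbage collected.
        System.out.println(isWindowLate(maxTimestamp, allowedLateness, 36 * HOUR_MS));
    }
}
```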

...

These two triggers target the first limitation mentioned above, i.e. forming composite triggers. Both take a variable-length list of child triggers as arguments: Any(trigger1, trigger2, …) fires if any of the child triggers fires, while All(trigger1, trigger2, …) fires once all of the child triggers fire. With these triggers, the user can specify a trigger that fires when the watermark passes the end of the window and only if there are at least 5 elements by writing:

...
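
As a rough model of the Any/All semantics described above, child triggers can be treated as predicates over a window's state. This is a sketch under that simplifying assumption, not the proposed DSL: `CompositeTriggerSketch`, `WindowState`, `any`, `all`, `afterEndOfWindow` and `countAtLeast` are all hypothetical names, and a real trigger would also have to track state between invocations.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model: each child trigger is a predicate over the window state.
class CompositeTriggerSketch {
    static final class WindowState {
        final long maxTimestamp;
        final long watermark;
        final int elementCount;

        WindowState(long maxTimestamp, long watermark, int elementCount) {
            this.maxTimestamp = maxTimestamp;
            this.watermark = watermark;
            this.elementCount = elementCount;
        }
    }

    /** Fires if at least one child would fire. */
    @SafeVarargs
    static Predicate<WindowState> any(Predicate<WindowState>... children) {
        List<Predicate<WindowState>> list = Arrays.asList(children);
        return state -> list.stream().anyMatch(child -> child.test(state));
    }

    /** Fires only if every child would fire. */
    @SafeVarargs
    static Predicate<WindowState> all(Predicate<WindowState>... children) {
        List<Predicate<WindowState>> list = Arrays.asList(children);
        return state -> list.stream().allMatch(child -> child.test(state));
    }

    /** Assumption: "passes the end of the window" means watermark >= maxTimestamp. */
    static Predicate<WindowState> afterEndOfWindow() {
        return state -> state.watermark >= state.maxTimestamp;
    }

    static Predicate<WindowState> countAtLeast(int n) {
        return state -> state.elementCount >= n;
    }

    public static void main(String[] args) {
        Predicate<WindowState> trigger = all(afterEndOfWindow(), countAtLeast(5));
        // Watermark has passed the end, but only 4 elements: no firing.
        System.out.println(trigger.test(new WindowState(100, 100, 4)));
        // Watermark has passed the end and 5 elements are present: fires.
        System.out.println(trigger.test(new WindowState(100, 100, 5)));
    }
}
```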

These target the second limitation, i.e. early and late firings. The *.afterEndOfWindow() trigger fires when the watermark (or the processing time) passes the end of the window, while *.afterFirstElement(delay) fires after the watermark (or the processing time) passes the timestamp of the first element since the last firing of a window, plus the specified “delay”. The latter triggering policy becomes more interesting when specifying early and late firings, where a window may have multiple such firings. Focusing on *.afterEndOfWindow(), these triggers allow the specification of an early trigger and, in the case of the EventTimeTrigger, the specification of a late one. Early firings are firings that happen before the watermark passes the end of the window, while late firings happen after the watermark has passed the end of the window, but before it passes the end of the window plus the allowed lateness. In processing time, there is no late trigger, as the notion of lateness does not exist. For more, see here.
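
The early and late phases described above can be sketched as a classification of the current watermark relative to a window. The names below (`FiringPhaseSketch`, `Phase`, `phaseOf`) are hypothetical, and the sketch assumes event time with timestamps in milliseconds. The on-time firing itself happens at the moment the watermark crosses the end of the window, so the function only tells which kind of additional firing would apply.

```java
// Hypothetical classification of a window's lifecycle in event time.
class FiringPhaseSketch {
    static final long HOUR_MS = 60L * 60L * 1000L;

    enum Phase {
        EARLY,   // watermark has not yet passed the end of the window: early firings apply
        LATE,    // watermark passed the end, but not end + allowed lateness: late firings apply
        EXPIRED  // window is late: elements are dropped and state is garbage collected
    }

    static Phase phaseOf(long maxTimestamp, long allowedLateness, long watermark) {
        if (watermark < maxTimestamp) {
            return Phase.EARLY;
        }
        if (watermark < maxTimestamp + allowedLateness) {
            return Phase.LATE;
        }
        return Phase.EXPIRED;
    }

    public static void main(String[] args) {
        // Daily window with 12 hours of allowed lateness.
        long maxTimestamp = 24 * HOUR_MS - 1; // assumption: last timestamp inside the window
        long allowedLateness = 12 * HOUR_MS;

        System.out.println(phaseOf(maxTimestamp, allowedLateness, 10 * HOUR_MS));
        System.out.println(phaseOf(maxTimestamp, allowedLateness, 30 * HOUR_MS));
        System.out.println(phaseOf(maxTimestamp, allowedLateness, 36 * HOUR_MS));
    }
}
```

With zero allowed lateness the LATE phase is empty in this model, which matches the processing-time case where no late trigger exists.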

...