Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Semantics of Sequences of Windows

Timestamp-based time windows are aligned on a fixed granularity, all other types of windows are not. We will see how this plays out and why it might be important in the following.

When you have several window operations in sequence you want to know which logical window an element belongs to at each state of processing. For example, in this code:

Code Block
languagescala
titleDataStream Example
linenumberstrue
val stream: DataStream[(Int, Int, Int)] = env.createStream(new KafkaSource(...))
val windowed: DataStream[(Int, Int, Int)] = stream
  .keyBy(0)
  .window(TimestampTime.of(5, SECONDS)
  .sumByKey(2)

  .keyBy(1)
  .window(TimestampTime.of(5, SECONDS))
  .minByKey(2)

we have two 5-second windows but the key by which the operations work is different. Since the timestamp for elements that result from the first window operations is the minimum of the timestamps inside the window the resulting element would also be in that same window. Since we have windows based on timestamps the windows are aligned. Therefore, the result of a window from t to t + 5 will also be in the window from t to t + 5 in the second window operation.

Technically speaking, if you have an element with timestamp t and windows of size duration (both in milliseconds) the start of the window that this element is in is given by t - (t % duration). The same applies to sliding time windows on timestamp time. 

 

Aligned Time Windows

 

General Windows

...