Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Code Block
languagejava
firstline1
public final class SlidingWindows extends Windows<TimeWindow> {
    /** The size of the windows in milliseconds. */
    public final long sizeMs;

    /** The grace period in milliseconds. */
    private final long graceMs;

	/** Base constructor, accepts both size and grace variables to create window */
	private SlidingWindows(final long sizeMs, final long graceMs){
		this.sizeMs = sizeMs;
		this.graceMs = graceMs;

	}

	/**Constructor just with size, sets grace to default */

	private SlidingWindows(final long sizeMs){
		this.sizeMs = sizeMs;
		this.graceMs = -1;

	}

	/**
     * Return a SlidingWindow of size given by user in milliseconds
     *
     * @param size The size of the window, > 0
     * @return a new window definition with default maintain duration of 1 day, size as given
     * @throws IllegalArgumentException if the window size is negative or zero
     */
	public static SlidingWindows of(final Duration size) throws IllegalArgumentException{}

	 /**
     * Reject out-of-order events that arrive more than {@code afterWindowEnd}
     * after the end of its window.
     *
     * @param afterWindowEnd The grace period to admit out-of-order events to a window.
     * @return updated builder window with same size and new grace
     * @throws IllegalArgumentException if {@code afterWindowEnd} is negative or can't be represented as {@code long milliseconds}
     */
	public SlidingWindows grace(final Duration afterWindowEnd) throws IllegalArgumentException{}
}

...

Creating a new class of Windows, SlidingWindows, to add aggregation flexibility and features for users. Building off the TimeWindow implementation, these sliding Sliding windows will have an inclusive start and end point and each window will be unique, meaning that each distinct set of records will only appear in one window. This is in contrast with hopping windows, which can mimic sliding windows with an advance of 1ms, but result in many overlapping windows, as shown below. This causes expensive aggregation calculations for each window, whereas for the sliding windows, aggregations are only done for each unique window, cutting down significantly on the amount of work required.

Figure 1: Windows with SlidingWindow Implementation: Window size 10ms, 9 7 windows for 5 4 records.

Image Removed


Image RemovedImage AddedImage Added







Figure 2: Windows with HoppingWindow with a 1ms advance: Window size 10ms, 17 10 windows for 5 4 records.

Image RemovedImage Added

With Sliding Windows, a new window is defined when a new record enters on the right, or one after a record leaves the window on the left. In this example, a new window is formed with C as the end point when it enters the window area, creating a window with the unique set {A, B, C}. Then, as the window slides along, a new window is defined 1ms after a record, A, leaves the window, creating a new window with the unique set of {B, C}. The next window to be created in this example would be 1ms after B leaves the window, creating a window with just {C}. In this implementation, there are two ways to define a distinct window using the sliding window semantics, one when a record enters the window from the right (Figure 3), and one when a record leaves the window on the left (Figure 4). When a record leaves the window, the next window is 1ms higher than that record's timestamp. 

Figure 3: New window defined after C enters

Image Removed

Figure 4: New window defined after A exits

...


Processing Windows

To process a new record, there are three major steps.


  1. Create the two new windows defined by this record
    1. Aggregate existing records that fall within these windows with the current record's value
  2. Find existing windows that the new record falls within
    1. Add the new record's value to these windows' aggregations
  3. Send the updated and newly created windows to the window store.

With Sliding Windows, a new record creates two new windows: one that ends at the record's timestamp, and one that starts 1ms after the record's timestamp. The first window will contain the new record, while the second will not. This is shown with record b in figure 3. These windows were chosen to ensure that for all the records, every combination of possible records given the window size will be created. This allows for easy aggregation when creating new windows, as any aggregation needed in a new window will have already been calculated in an existing window. For a new window that overlaps with existing records (record a in figure 3), these records will be aggregated and added to the new window's value.

Figure 3: New windows of size 10ms defined for b

Image Added

Any already created windows that the new record falls within will be updated to have an aggregation that includes the new records value. This requires doing a scan through the WindowStore to find these windows, similar to the SessionWindow scan required for aggregating SessionWindows, and updating the values. 

Out-of-order Records

Records that come out of order will be processed the same way as in-order records, as long as they fall within the grace period. Any Two new windows will be created by the late record, one ending at that record will still be created, and the existing windows that are changed by the late record will be updated. Any record that falls outside of the grace period (either user defined or default) will be discarded's timestamp, and one starting at 1ms past that record's timestamp. These windows will be updated with existing records that fall into them by using aggregations previously calculated. Additionally, any existing windows that the new record falls within will be updated with a new aggregation that includes this late record

Compatibility, Deprecation, and Migration Plan

Implementing a new feature, should be compatible with all existing features.

Rejected Alternatives

Other Types of Sliding Windows:

There are semantic differences between some implementations of Sliding Windows. We chose to have both ends of the window be inclusive, and to define windows based on a unique set of events. There are alternative implementations where either or both ends are not inclusive. Additionally, there are different ways to decide the start and end points of a window. We chose to define a new window based on a new record entering on the right, or one after a record leaves the window on the left. We chose this because it is intuitive with the window "sliding." An alternative implementation could define a window based on a new record entering on the left and 1ms before a record leaves on the right.

Other Implementation Options:

...

  • Creating a new window with the record's timestamp at the beginning, and a second window with 1ms before the record's timestamp at the end. This version of defining 2 new windows is less intuitive and is less compatible with looking at a time segment back in time.
  • Having either/both ends not inclusive. Traditionally, time windows are inclusive, and as hopping windows and tumbling windows are exclusive on the endpoint, this is a distinguishing feature. Changing exclusivity is just a matter of adding 1ms.
  • Having one window anchored and ending at the current time. By setting the retention time equal to the window size + grace period, this can be achieved, and allowing only this type of sliding window limits the historical views and therefore flexibility for use cases.

Additional Features:

  • Allow users to specify granularity of window definition (something other than 1ms). This is an option for the future, but allowing a user to specify 1 second or 1 minute granularity results in windows that are very similar to tumbling windows, straying from the purpose of sliding windows.


Other Implementation Options:

  • Making SlidingWindows an option within TimeWindow. This would add more requirements for users and blur the lines between hopping and sliding windows, while potentially making optimization harder.