Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Window function call ::= function_name(arg1, arg2, ... argN) OVER (frame_var AS)? 
( (PARTITION BY expr1, expr2, …. exprN)? (ORDER BY exprA, exprB, … exprN)? Frame_Spec? )

...

from emp select array_sum((from w select value w.emp.salary)) over w as (partition by emp.dept

Design

The following components were added:

...

 This expression is for a single window function call.

 Contents:

  • Function name
  • Function arguments
  • List of ‘partition by’ expressions
  • List of ‘order by’ expressions and order modifiers
  • Frame mode: (enum) RANGE | ROWS | GROUPS
  • Frame boundary start, end : (enum) CURRENT_ROW | UNBOUNDED_PRECEDING |  UNBOUNDED_FOLLOWING | BOUNDED_PRECEDING (+ boundary expression) | BOUNDED_FOLLOWING ( + boundary expression)
  • Frame exclusion kind: (enum) CURRENT_ROW | GROUP | TIES | NO_OTHERS
  • Frame variable and its field list mapping
  • Function call expression

Algebricks Operator

org.apache.hyracks.algebricks.core.algebra.operators.logical.WindowOperator

...

Window operator runtime consists of an abstract class org.apache.hyracks.algebricks.runtime.operators.win.AbstractWindowPushRuntime and 3 its subclasses:

  • WindowSimplePushRuntime – used for row_number(), rank(), dense_rank(), and some others – running aggregates that do not require information about partition length

  • WindowMaterializingPushRuntime – used for percentcume_rankdist(), ntile(), percent_rank() – running aggregates that require information about partition length

  • WindowNestedPlansPushRuntime – used for regular aggregates – this is the only implementation that supports frame computation

Open items

  • .  

    • WindowNestedPlansUnboundedPushRuntime - optimized version used when the frame is equivalent to the whole partition (unbounded preceding to unbounded following)
    • WindowNestedPlansRunningPushRuntime - optimized version used when the frame is unbounded preceding to current row / n following

Open items

  • Optimize performance of WindowNestedPlansPushRuntime: support reverse aggregation steps for sliding frames (x preceding to y following, etc)

  • Need to implement remaining window functions: lead(), lag(), first_value(), nth_value(), last_value(), cume_dist()

  • Need to create an optimizer rule that combines multiple window operators into a single one if their partition and frame definitions are the same
  • Optimize performance of WindowNestedPlansPushRuntime