Status
Current state: [One of "Under Discussion", "Accepted", "Rejected"]
Discussion thread:
JIRA or Github Issue:
Released: <Doris Version>
Google Doc: <If the design in question is unclear or needs to be discussed and reviewed, a Google Doc can be used first to facilitate comments from others.>
Motivation
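Lambda functions are very convenient: users can freely write the logic they want to achieve. Doris is now starting to support complex data types such as array, map, and struct, and lambda (higher-order) functions are a natural way to operate on the elements of these types.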
Related Research
Detailed Design
select array_map(x -> abs(x), [1,-2,-3]);
+-----------------------------------------------+
| array_map([x] -> abs(x(0)), ARRAY(1, -2, -3)) |
+-----------------------------------------------+
| [1, 2, 3]                                     |
+-----------------------------------------------+

select array_map((x,y) -> (x - y), [10,20,30], [10,25,50]);
+--------------------------------------------------------------------------+
| array_map([x, y] -> (x(0) - y(1)), ARRAY(10, 20, 30), ARRAY(10, 25, 50)) |
+--------------------------------------------------------------------------+
| [0, -5, -20]                                                             |
+--------------------------------------------------------------------------+

select array_filter([1,2,3,4],[1,0,1,0]);
+----------------------------------------------------+
| array_filter(ARRAY(1, 2, 3, 4), ARRAY(1, 0, 1, 0)) |
+----------------------------------------------------+
| [1, 3]                                             |
+----------------------------------------------------+

select array_map((x,y) -> (x >= y), [10,20,30], [10,25,50]);
+---------------------------------------------------------------------------+
| array_map([x, y] -> (x(0) >= y(1)), ARRAY(10, 20, 30), ARRAY(10, 25, 50)) |
+---------------------------------------------------------------------------+
| [1, 0, 0]                                                                 |
+---------------------------------------------------------------------------+

select array_filter(x -> (x > 0), array_map((x,y) -> (x >= y), [10,20,30], [10,25,50])) as res;
+------+
| res  |
+------+
| [1]  |
+------+
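The intended semantics of the queries above can be modeled in a few lines of Python. This is only a reference sketch for reasoning about expected results, not Doris code. The length check and the rewrite of the lambda form of array_filter into a map followed by a mask filter are assumptions made for illustration, and the helper name array_filter_lambda is hypothetical.

```python
from typing import Any, Callable, List, Sequence


def array_map(fn: Callable[..., Any], *arrays: Sequence[Any]) -> List[Any]:
    """Apply fn element-wise across one or more arrays of equal length."""
    if not arrays:
        raise ValueError("array_map needs at least one array")
    if len({len(a) for a in arrays}) != 1:
        # Assumption: mismatched lengths are treated as an error.
        raise ValueError("all input arrays must have the same length")
    return [fn(*row) for row in zip(*arrays)]


def array_filter(values: Sequence[Any], mask: Sequence[Any]) -> List[Any]:
    """Keep values[i] wherever mask[i] is non-zero."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    return [v for v, keep in zip(values, mask) if keep]


def array_filter_lambda(fn: Callable[[Any], Any], values: Sequence[Any]) -> List[Any]:
    """Hypothetical helper: the lambda form expressed as map + mask filter."""
    return array_filter(values, array_map(fn, values))


if __name__ == "__main__":
    # Comparison results are converted to 1/0 to mirror the SQL output.
    flags = array_map(lambda x, y: int(x >= y), [10, 20, 30], [10, 25, 50])
    print(flags)
    print(array_filter_lambda(lambda x: x > 0, flags))
```

Expressing the lambda form of array_filter through array_map is one way the two functions can be combined, so only array_map has to evaluate the lambda body.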
The FE needs a new type that is used when registering and looking up lambda functions.
A new Expr is needed to parse the lambda expression in the user's input. It can be separated into the lambda arguments and the real expression (the body).
The arguments x, y, ... can then be bound to array1, array2, ..., for example by recording the nested (element) type of each array and its id.
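The split-and-bind step can be sketched as follows. This is a simplified Python model, not the FE implementation: the names ArrayType, BoundArgument, split_lambda, and bind_arguments are hypothetical, the string-based splitting stands in for a real parser, and using the argument position as the id is an assumption based on the x(0), y(1) notation in the query output.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ArrayType:
    item_type: str  # nested element type, e.g. "INT"


@dataclass(frozen=True)
class BoundArgument:
    name: str       # lambda argument name, e.g. "x"
    index: int      # assumed id: position of the array it is bound to
    item_type: str  # nested type taken from that array


def split_lambda(text: str) -> Tuple[List[str], str]:
    """Split 'x -> body' or '(x, y) -> body' into argument names and body."""
    head, arrow, body = text.partition("->")
    if not arrow:
        raise ValueError("not a lambda expression: missing '->'")
    names = [n.strip() for n in head.strip().strip("()").split(",")]
    if not all(n.isidentifier() for n in names):
        raise ValueError(f"invalid lambda arguments: {head.strip()!r}")
    return names, body.strip()


def bind_arguments(names: Sequence[str],
                   array_types: Sequence[ArrayType]) -> List[BoundArgument]:
    """Bind the i-th lambda argument to the i-th array parameter."""
    if len(names) != len(array_types):
        raise ValueError("lambda argument count must match array count")
    return [BoundArgument(name, i, t.item_type)
            for i, (name, t) in enumerate(zip(names, array_types))]


if __name__ == "__main__":
    names, body = split_lambda("(x,y) -> (x - y)")
    print(names, body)
    print(bind_arguments(names, [ArrayType("INT"), ArrayType("INT")]))
```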
When the function arrives at the BE, it first needs to execute all children except children(0), the lambda expression. Because the lambda arguments x, y, ...
are bound to these parameters, the BE then needs to get the nested data column of each array result.
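The BE evaluation order described here can be sketched as follows. This is a simplified Python model, not the BE implementation: it assumes an array column is stored as an offsets list plus a flattened nested data list, and the names ArrayColumn and execute_array_map are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass
class ArrayColumn:
    offsets: List[int]  # offsets[i] is the end position of row i in data
    data: List[Any]     # nested elements of all rows, flattened

    def rows(self) -> List[List[Any]]:
        """Rebuild the per-row arrays from offsets and flat data."""
        out, start = [], 0
        for end in self.offsets:
            out.append(self.data[start:end])
            start = end
        return out


def execute_array_map(lambda_fn: Callable[..., Any],
                      array_children: Sequence[Callable[[], ArrayColumn]]) -> ArrayColumn:
    """Model of executing array_map(lambda, array1, array2, ...)."""
    if not array_children:
        raise ValueError("at least one array argument is required")
    # Step 1: execute every child except the lambda (children(0)).
    columns = [child() for child in array_children]
    first = columns[0]
    if any(col.offsets != first.offsets for col in columns):
        # Assumption: arrays in the same row must have equal lengths.
        raise ValueError("array arguments must have matching row lengths")
    # Step 2: bind each lambda argument to a nested data column and
    # evaluate the lambda body over the flattened elements.
    nested = [col.data for col in columns]
    result = [lambda_fn(*args) for args in zip(*nested)]
    # Step 3: the result keeps the row layout of the inputs.
    return ArrayColumn(list(first.offsets), result)


if __name__ == "__main__":
    a = ArrayColumn([3, 4], [10, 20, 30, 1])  # rows: [10,20,30], [1]
    b = ArrayColumn([3, 4], [10, 25, 50, 5])  # rows: [10,25,50], [5]
    out = execute_array_map(lambda x, y: x - y, [lambda: a, lambda: b])
    print(out.rows())
```

Working on the flattened nested columns means the lambda body is evaluated once over all elements of the block, and the input offsets can be reused for the output.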
Scheduling
First, implement the array_map and array_filter lambda functions. Combining the two already makes many interesting functions possible.
Second, implement a wider range of functions, e.g. arrayFirst, arrayLast, arrayFill, ...
[1] https://clickhouse.com/docs/zh/sql-reference/functions/higher-order-functions/
[2] https://prestodb.io/blog/2020/03/02/presto-lambda