Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Describe the problems you are trying to solveCurrently, when users invoke a function or call a procedure, they must specify all fields in order. When there are a large number of parameters, it is easy to make mistakes and cannot omit specifying non-mandatory fields. By using named parameters, you can selectively specify the required parameters, reducing the probability of errors and making it more convenient to use.

Public Interfaces

Proposed Changes

Describe the new thing you want to do in appropriate detail. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences. Use judgement based on the scope of the change.

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users?
  • If we are changing behavior how will we phase out the older behavior?
  • If we need special migration tools, describe them here.
  • When will we remove the existing behavior?

Test Plan

...

Interface change

Modify DataTypeHint to support specifying whether the parameter is optional.

@PublicEvolving

@Retention(RetentionPolicy.RUNTIME)

@Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD, ElementType.PARAMETER})

public @interfaceDataTypeHint {


   /**

   *   Used to indicate whether the parameter is optional or not.

   */

   booleanisOptional() defaulttrue;


}



For function developer and call procedure developer 

For UDF development or call developers, the UDX or procedure we develop can be roughly divided into two types:

  1. Class overloads multiple methods with different parameters and types. Users need to specify all the parameters when using.



// UDF Development

publicstaticclassNamedArgumentsTableFunctionextendsTableFunction<Object> {


   // Method overloads with different parameter sets

   // Example usage: SELECT * FROM TABLE(my_table_function(in1 => 'value1', in2 => 'value2'))

   @FunctionHint(

           input = {@DataTypeHint("STRING"), @DataTypeHint("STRING")},

           output = @DataTypeHint("STRING"),

           argumentNames = {"in1", "in2"})

   publicvoideval(Stringarg1, Stringarg2) {

       collect(arg1 + ", " + arg2);

   }


   // Example usage: SELECT * FROM TABLE(my_table_function(in1 => 'value1', in2 => 'value2', in3 => 'value3'))

   @FunctionHint(

           input = {@DataTypeHint("STRING"), @DataTypeHint("STRING"), @DataTypeHint("STRING")},

           output = @DataTypeHint("STRING"),

           argumentNames = {"in1", "in2", "in3"})

   publicvoideval(Stringarg1, Stringarg2, Stringarg3) {

       collect(arg1 + ", " + arg2);

   }

}



// Call Procedure Development

publicstaticclassNamedArgumentsProcedureimplementsProcedure {


   // Method overloads with different parameter sets

   // Example usage: CALL myNamedProcedure(in1 => 'value1', d => 100)

   @ProcedureHint(

           input = {@DataTypeHint("STRING"), @DataTypeHint("INT")},

           output = @DataTypeHint("STRING"),

           argumentNames = {"in1", "in2"})

   publicString[] call(ProcedureContextprocedureContext, Stringarg1, Integerarg2) {

       returnnewString[]{arg1 + ", " + arg2};

   }


   // Example usage: CALL myNamedProcedure(in1 => 100, in2 => 200)

   @ProcedureHint(

           input = {@DataTypeHint("INT"), @DataTypeHint("INT")},

           output = @DataTypeHint("STRING"),

           argumentNames = {"in1", "in2"})

   publicString[] call(ProcedureContextprocedureContext, Integerarg1, Integerarg2) {

       returnnewString[]{arg1 + ", " + arg2};

   }

}




  1. The class can only have one method, and users can optionally specify parameters.



// UDF Development

publicstaticclassNamedArgumentsTableFunctionextendsTableFunction<Object> {


   // Example usage: SELECT * FROM TABLE(my_table_function(in1 => 'value1', in2 => 'value2'))

   // Example usage: SELECT * FROM TABLE(my_table_function(in1 => 'value1', in2 => 'value2', in3 => 'value3'))

   @FunctionHint(

           input = {@DataTypeHint("STRING"), @DataTypeHint(value = "STRING", isOptional = false), @DataTypeHint(value = "STRING", isOptional = true)},

           output = @DataTypeHint("STRING"),

           argumentNames = {"in1", "in2", "in3"})

   publicvoideval(Stringarg1, Stringarg2, Stringarg3) {

       collect(arg1 + ", " + arg2 + "," + arg3);

   }

}



// Call Procedure Development

publicstaticclassNamedArgumentsProcedureimplementsProcedure {


   // Example usage: CALL myNamedProcedure(in1 => 'value1', in2 => 'value2')

   // Example usage: CALL myNamedProcedure(in1 => 'value1', in2 => 'value2', in3 => 'value3')

   @ProcedureHint(

           input = {@DataTypeHint(value = "STRING", isOptional = false), @DataTypeHint(value = "STRING", isOptional = false), @DataTypeHint(value = "STRING", isOptional = true)},


           output = @DataTypeHint("STRING"),

           argumentNames = {"c", "d", "e"})

   publicString[] call(ProcedureContextprocedureContext, Stringarg1, Stringarg2, Stringarg3) {

       returnnewString[]{arg1 + ", " + arg2 + "," + arg3};

   }

}

For users

Users can now use functions and call procedures with named parameters as follows:

```sql

-- for scalar function


SELECT my_scalar_function(param1 => ‘value1’, param2 => ‘value2’’) FROM []


-- for table function


SELECT  *  FROMTABLE(my_table_function(param1 => 'value1', param2 => 'value2'))


-- for agg function


SELECT my_agg_function(param1 => 'value1', param2 => 'value2') FROM []


-- for call procedure


CALL  procedure_name(param1 => ‘value1’, param2 => ‘value2’)


```

Proposed Changes

Implementing Named Parameters:

In FLIP-65, UDX has already supported the ability to parse argument names from FunctionHint and ProcedureHint.  Calcite has long supported the ability to use named optional and default arguments when calling functions (CALCITE-941), but some issues and limitations were discovered during the poc.

1. Passing Named Parameters to the Calcite:

Due to Calcite's behavior of reordering operands based on named arguments, the reordering relies on the implementation of the SqlOperandTypeCheck interface in order to implement the SqlOperandMetadata interface. Therefore, it is necessary to extend the TypeInferenceOperandChecker class for this specific implementation. 

2.Fixing Calcite Bugs Related to Named Parameters:

During POC verification, bugs were discovered in Calcite that caused issues during the validation phase. We need to modify the SqlValidatorImpl and SqlToRelConverter to address these problems.

Capabilities and Limitations

Capabilities

  • Multiple Parameter Lists:

Within one UDX and Procedure, you can declare multi parameter lists with different sets of parameters for overloaded functions.

  • Reflection-based Named Parameters:

When using the feature without explicitly specifying parameter hints, the function will automatically retrieve the parameter names using reflection, allowing them to be used as named parameters.

  • Optional Argument Names:

If the argument names specified in the hint do not match the actual parameter names, it will fall back to using default values.   When users use optional argument names, we need corresponding UDX methods without overloads to avoid function conflicts caused by assigning default values.

Limitations:

  • Variable arguments are not supported with named parameters.
  • The same UDX or call procedure class cannot simultaneously support optional arguments and multiple parameter lists.

Compatibility, Deprecation, and Migration Plan

Compatibility:

Since named parameters are a newly introduced feature, there are no compatibility issues. However, it's important to note that functions registered using the deprecated TableEnvironment.registerFunction API will not support named parameters.

Deprecation:

No deprecation is required for this feature.

Migration Plan:

No migration is needed as this is a new feature.

Test Plan

To validate this feature, we can develop custom UDFs or call procedures and use named parameters when invoking them.

Rejected Alternatives




Rejected Alternatives

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.