Status
Current state: Under Discussion
Discussion thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html#a32613
JIRA: FLINK-14090
...
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
...
This FLIP would add explicit temporary function support by renaming a few variables and potentially deprecating some APIs in favor of new APIs that reflect their nature of dealing with temporary functions.
...
However, they need to be done simultaneously to achieve the functionality in full stack.
Proposed Changes
We will first clarify the dimensions of functions.
| | System (can be used interchangeably with "builtin") | Catalog |
---|---|---|
Non-Temporary | system functions | catalog functions |
Temporary | temporary system functions | temporary catalog functions |
1. Support Two Types of Temporary Functions
...
- temporary system function that has no namespace and overrides built-in functions
- temporary catalog function that has catalog and database namespaces and overrides catalog functions.
...
Their DDLs are “CREATE/DROP TEMPORARY SYSTEM FUNCTION”.
The existing “registerScalar/Table/AggregateFunctions()” APIs will be deprecated in favor of the new APIs.
b) Temporary Catalog Functions
We will add a new member variable “Map<ObjectIdentifier, UserDefinedFunction> tempFunctions” to FunctionCatalog to hold those temporary catalog functions in a central place, and new APIs “registerTemporaryScalar/Table/AggregateFunction(ObjectIdentifier, UserDefinedFunction)”.
The lifespan of temp functions is not tied to that of catalogs and databases. Users can create temp catalog functions even if the catalogs/dbs in their fully qualified names do not exist.
Their DDLs are “CREATE/DROP TEMPORARY FUNCTION”.
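The point that temporary catalog functions live in one central map, independent of whether their catalog or database exists, can be illustrated with a minimal sketch. `QualifiedName` and `TempCatalogFunctionRegistry` are hypothetical names, and a plain `String` stands in for the function definition; none of this is Flink's actual API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for an identifier of the form catalog.database.name.
final class QualifiedName {
    final String catalog;
    final String database;
    final String name;

    QualifiedName(String catalog, String database, String name) {
        this.catalog = catalog;
        this.database = database;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof QualifiedName)) {
            return false;
        }
        QualifiedName other = (QualifiedName) o;
        return catalog.equals(other.catalog)
                && database.equals(other.database)
                && name.equals(other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(catalog, database, name);
    }
}

// Hypothetical registry; a String stands in for the function definition.
final class TempCatalogFunctionRegistry {
    private final Map<QualifiedName, String> tempFunctions = new LinkedHashMap<>();

    // No check that the catalog or database exists: the entry lives only in this map.
    void register(QualifiedName id, String definition) {
        tempFunctions.put(id, definition);
    }

    // Returns true if a function was actually removed.
    boolean drop(QualifiedName id) {
        return tempFunctions.remove(id) != null;
    }

    boolean contains(QualifiedName id) {
        return tempFunctions.containsKey(id);
    }
}
```

Because the registry never consults a catalog, registering a function under `no_such_catalog.no_such_db.f` succeeds, which matches the lifespan rule described above.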
Some other proposed SQL commands are:
"SHOW FUNCTIONS" - list names of temp and non-temp system/built-in functions, and names of temp catalog functions and catalog functions in the current catalog and db
"SHOW ALL FUNCTIONS" - list names of temp and non-temp system/built-in functions, and fully qualified names of temp catalog functions and catalog functions in all catalogs and dbs
"SHOW ALL TEMPORARY FUNCTIONS" - list fully qualified names of temp catalog functions in all catalogs and dbs
...
The lifespan of both types of temporary functions is limited to a session; they will be destroyed when the session ends.
Note: corresponding DDL and SQL commands are not part of this FLIP
2. Support Precise Function Reference
Because built-in system functions don’t have namespaces, a precise function reference in Flink must refer to either a temporary catalog function or a catalog function, both of which have namespaces.
The resolution order will be
- Temporary catalog functions
- Catalog functions
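The two-step order above can be sketched as follows. This is only an illustration under simplifying assumptions: `PreciseResolutionSketch` is a hypothetical class, fully qualified names are plain strings such as `cat.db.func`, and a `String` stands in for the resolved function definition.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of precise (fully qualified) function resolution.
final class PreciseResolutionSketch {
    // Keys are fully qualified names such as "cat.db.func".
    final Map<String, String> tempCatalogFunctions = new HashMap<>();
    final Map<String, String> catalogFunctions = new HashMap<>();

    Optional<String> resolve(String qualifiedName) {
        // 1. Temporary catalog functions shadow persistent ones.
        String temp = tempCatalogFunctions.get(qualifiedName);
        if (temp != null) {
            return Optional.of(temp);
        }
        // 2. Catalog functions.
        return Optional.ofNullable(catalogFunctions.get(qualifiedName));
    }
}
```

Built-in system functions never appear in this path because they have no namespace to qualify.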
3. Support Ambiguous Function Reference with a Redefined Resolution Order
For ambiguous function references, there are 4 types of functions to consider: temporary functions with and without namespaces, Flink built-in system functions, and catalog functions.
...
- Temporary system functions
- Flink built-in system functions
- Temporary catalog functions, in the current catalog and current database of the session
- Catalog functions, in the current catalog and current database of the session
...
Temp functions should rank above their corresponding persistent/built-in functions due to their temporary nature: users want to overwrite built-in or persistent functions with something temporary that is visible only to themselves and the session, without impacting other users. Conversely, 1) if users don’t intend to overwrite other functions, they can simply give the temporary functions different names, since the manipulation cost of temporary objects is so low, and 2) if built-in functions preceded temporary functions, there would be no way to reference the temp functions.
Flink built-in system functions should precede catalog functions, because 1) this always gives a deterministic resolution order for ambiguous references by invoking the built-in functions, and 2) catalog functions can always be precisely referenced with fully/partially qualified names. Conversely, if catalog functions preceded built-in functions, the built-in functions could never be referenced.
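The four-tier order for ambiguous references can be sketched as below. `AmbiguousResolutionSketch`, its field names, and the default catalog/database values are hypothetical; strings stand in for function definitions, and catalog-level maps are keyed by `catalog.db.name`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of ambiguous (unqualified) function resolution.
final class AmbiguousResolutionSketch {
    // Keyed by simple function name.
    final Map<String, String> tempSystemFunctions = new HashMap<>();
    final Map<String, String> builtInFunctions = new HashMap<>();
    // Keyed by "catalog.db.name".
    final Map<String, String> tempCatalogFunctions = new HashMap<>();
    final Map<String, String> catalogFunctions = new HashMap<>();

    // Assumed session defaults, for illustration only.
    String currentCatalog = "default_catalog";
    String currentDatabase = "default_database";

    Optional<String> resolve(String name) {
        String qualified = currentCatalog + "." + currentDatabase + "." + name;
        // Candidates are listed in resolution order; the first hit wins.
        for (String candidate : Arrays.asList(
                tempSystemFunctions.get(name),
                builtInFunctions.get(name),
                tempCatalogFunctions.get(qualified),
                catalogFunctions.get(qualified))) {
            if (candidate != null) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

In this sketch a catalog function that shares a name with a built-in function is unreachable through an unqualified name, which is why the text relies on qualified names for that case.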
...

    class FunctionIdentifier {

        // for temporary/non-temporary system functions
        String name;

        // for temporary/non-temporary catalog functions; empty for system functions
        ObjectIdentifier oi;

        Optional<ObjectIdentifier> getIdentifier() {}

        Optional<String> getName() {}

        Optional<FunctionIdentifier> of(ObjectIdentifier oi) {}

        Optional<FunctionIdentifier> of(String name) {}
    }

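
As a runnable illustration of an identifier that carries either a simple name or a qualified name, here is a minimal sketch. It assumes that exactly one of the two is set, which is one reading of the `Optional` getters in the outline; `FunctionIdentifierSketch`, `ofSystem`, `ofCatalog` and `isPrecise` are hypothetical names, and a `catalog.db.name` string stands in for `ObjectIdentifier`.

```java
import java.util.Optional;

// Hypothetical, simplified take on the identifier idea; not Flink's class.
final class FunctionIdentifierSketch {
    private final String name;          // non-null only for system functions
    private final String qualifiedName; // non-null only for catalog functions

    private FunctionIdentifierSketch(String name, String qualifiedName) {
        this.name = name;
        this.qualifiedName = qualifiedName;
    }

    static FunctionIdentifierSketch ofSystem(String name) {
        return new FunctionIdentifierSketch(name, null);
    }

    static FunctionIdentifierSketch ofCatalog(String catalog, String database, String name) {
        return new FunctionIdentifierSketch(null, catalog + "." + database + "." + name);
    }

    Optional<String> getName() {
        return Optional.ofNullable(name);
    }

    Optional<String> getQualifiedName() {
        return Optional.ofNullable(qualifiedName);
    }

    // A qualified identifier means the reference is precise rather than ambiguous.
    boolean isPrecise() {
        return qualifiedName != null;
    }
}
```

A lookup routine could branch on `isPrecise()` to choose between the precise and the ambiguous resolution paths.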
Changes to CallExpression and UnresolvedCallExpression
...

    private final Map<String, FunctionDefinition> tempSystemFunctions = new LinkedHashMap<>();
    private final Map<ObjectIdentifier, FunctionDefinition> tempFunctions = new LinkedHashMap<>();

    public void registerTemporarySystemScalarFunction(String name, ScalarFunction function) {
        // put into tempSystemFunctions
    }

    public void registerTemporarySystemTableFunction(String name, TableFunction function) {
        // put into tempSystemFunctions
    }

    public void registerTemporarySystemAggregateFunction(String name, AggregateFunction function) {
        // put into tempSystemFunctions
    }

    public void registerTemporaryScalarFunction(ObjectIdentifier oi, ScalarFunction function) {
        // put into tempFunctions
    }

    public void registerTemporaryTableFunction(ObjectIdentifier oi, TableFunction function) {
        // put into tempFunctions
    }

    public void registerTemporaryAggregateFunction(ObjectIdentifier oi, AggregateFunction function) {
        // put into tempFunctions
    }

    public void dropTemporarySystemFunction(String name) {}

    public void dropTemporaryFunction(FunctionIdentifier fi) {}

    public Optional<FunctionLookup.Result> lookupFunction(FunctionIdentifier fi) {
        // a function identifier with an object identifier is a precise reference
        if (fi.getIdentifier().isPresent()) {
            return resolvePreciseFunctionReference(fi.getIdentifier().get());
        } else {
            return resolveAmbiguousFunctionReference(fi.getName().get());
        }
    }

    private Optional<FunctionLookup.Result> resolvePreciseFunctionReference(ObjectIdentifier oi) {
        // resolve order:
        // 1. Temporary functions
        // 2. Catalog functions
    }

    private Optional<FunctionLookup.Result> resolveAmbiguousFunctionReference(String name) {
        // resolve order:
        // 1. Temporary system functions
        // 2. Builtin functions
        // 3. Temporary functions, in the current catalog/db
        // 4. Catalog functions, in the current catalog/db
    }

...