Authors: Jincheng Sun, Dian Fu, Aljoscha Krettek
Status
Current state: "Under Discussion"
Discussion thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-User-Defined-Function-for-Table-API-td31673.html
...
For Python Table API jobs, if an operator contains a Python user-defined function, it is given the operator's original resource plus the resource used by the Python process.
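The resource rule above can be sketched as follows. This is only an illustration under stated assumptions: the `ResourceSpec` class, its field names (`cpu_cores`, `memory_mb`) and the `operator_resource` helper are hypothetical and written for this example, not taken from Flink.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSpec:
    """Hypothetical resource description; the field names are assumptions."""
    cpu_cores: float
    memory_mb: int

    def merge(self, other: "ResourceSpec") -> "ResourceSpec":
        # Resources are combined by adding each dimension.
        return ResourceSpec(self.cpu_cores + other.cpu_cores,
                            self.memory_mb + other.memory_mb)


def operator_resource(origin: ResourceSpec,
                      python_process: ResourceSpec,
                      has_python_udf: bool) -> ResourceSpec:
    """Return the resource assigned to an operator.

    Operators that contain a Python user-defined function get the original
    resource plus the resource of the Python process; all other operators
    keep the original resource unchanged.
    """
    return origin.merge(python_process) if has_python_udf else origin
```

For instance, an operator that originally requests 1.0 CPU core and 512 MB, combined with a Python process that needs 0.5 core and 256 MB, would end up with 1.5 cores and 768 MB in this sketch.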
Compatibility, Deprecation, and Migration Plan
- This FLIP introduces a new feature, so there are no compatibility issues with previous versions.
Implementation Plan
- Support the basic functionality of Python ScalarFunction
- Support chaining Python ScalarFunctions
- Python Execution Environment Management. For example, multiple operators can reuse the same Python SDK Harness.
- Python Dependency Management. A Python UDF may depend on third-party dependencies, so we should provide a proper way to handle them.
- Add a series of Java and Python Coders for all supported data types. Data encoded with a Java coder should be decodable with the corresponding Python coder, and vice versa.
- Add Cython support for UDF execution.
- Add validation check for places where Python ScalarFunction cannot be used
- Support decorator syntax for Python functions
- Support the basic functionality of Python TableFunction
- Add rules to push down the Python ScalarFunctions contained in the join condition of the Correlate node
- Add a merge rule for Python Correlate nodes
- Support the basic functionality of Python AggregateFunction without DataView support
- Add validation check for places where Python AggregateFunction cannot be used
- Add ListView support
- Add MapView support
- Add user-defined metrics support
- Add documentation for Python user-defined functions
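
To make the ScalarFunction and decorator items in the plan above more concrete, here is a minimal, framework-free sketch of how a class-based scalar function and a decorator-declared one could coexist. Everything in it is an assumption made for illustration: the `ScalarFunction` base class, the `udf` decorator, its `input_types` and `result_type` parameters, and the string type names are stand-ins written for this example and are not the API defined by this FLIP.

```python
from typing import Any, Callable, Sequence


class ScalarFunction:
    """Hypothetical base class: maps one row of scalar inputs to one scalar output."""

    def eval(self, *args: Any) -> Any:
        raise NotImplementedError


class _WrappedScalarFunction(ScalarFunction):
    """Adapts a plain Python callable to the ScalarFunction interface."""

    def __init__(self, func: Callable[..., Any],
                 input_types: Sequence[str], result_type: str) -> None:
        self._func = func
        self.input_types = list(input_types)
        self.result_type = result_type

    def eval(self, *args: Any) -> Any:
        # Reject calls whose arity does not match the declared input types.
        if len(args) != len(self.input_types):
            raise TypeError(
                f"expected {len(self.input_types)} arguments, got {len(args)}")
        return self._func(*args)

    def __call__(self, *args: Any) -> Any:
        return self.eval(*args)


def udf(input_types: Sequence[str], result_type: str):
    """Hypothetical decorator that turns a plain function into a ScalarFunction."""
    def decorator(func: Callable[..., Any]) -> ScalarFunction:
        return _WrappedScalarFunction(func, input_types, result_type)
    return decorator


# Decorator style: the function body stays ordinary Python.
@udf(input_types=["BIGINT", "BIGINT"], result_type="BIGINT")
def add(a, b):
    return a + b


# Class style: subclass the base class and implement eval().
class Multiply(ScalarFunction):
    def eval(self, a, b):
        return a * b
```

In this sketch both styles yield objects that share one interface (`eval`), which is what would allow a planner or runtime to treat them uniformly. The decorator form additionally records the declared types next to the function.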