...

  • Pandas Support
    • Add toPandas and fromPandas interfaces to the Table API for converting between a Table and pandas.
    • Support using pandas UDFs directly in the Python Table API.

Modules

  • flink-python(maven module)
    • pyflink(python package)
      • table 
      • shell
    • flink-python-table: The place for all Python interface definitions and implementations, such as Table, Window, TableEnvironment, TableConfig, ConnectorDescriptor, DataType, TableSchema, TableSource, TableSink, etc. That is, all the user-facing interfaces in `flink-table-common` and `flink-table-api-java` should be available there.
    • flink-python-streaming(in the future)
      • streaming(in the future)
      • others...
  • flink-clients(maven module)

          Support for submitting Python Table API jobs in CliFrontend, such as `flink run -py wordcount.py`.

We need to add the following components in Flink JIRA:

  • API/Python - for Python API (already exists)
  • Runtime/Python - for Python function execution.
  • Table SQL/Python - for Python user-defined function execution
  • Python Shell - for interactive Python program
  • flink-python-shell

          For interactive development, similar to scala-shell. With flink-python-shell, users can write and run Python Table API programs (and Python DataStream API programs in the future).

  • flink-clients

...

Architecture

We will not develop Python operators the way `flink-python` and `flink-stream-python` did. To get the most out of the existing Java/Scala work (such as the Calcite-based optimizer), the Python Table API only needs to define the Python-side interface and delegate calls to the existing Java Table API implementation, which meets the needs of Python users with minimal effort. So our main job is to implement communication between the Python VM and the Java VM, as shown below:

...
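
The delegation idea in the Architecture section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the proposed implementation: `FakeJavaTable` is a hypothetical stand-in for a Java Table API object that would in practice live in the Java VM and be reached through a Python-to-Java bridge, and the method names on it are chosen for the example. The Python `Table` wrapper only defines the interface and forwards each call, so no query logic is reimplemented on the Python side.

```python
class FakeJavaTable:
    """Hypothetical stand-in for a Java-side Table object.

    In a real setup this object would live in the Java VM and be
    accessed through a bridge; here it only records the calls it gets.
    """

    def __init__(self, plan=()):
        self.plan = tuple(plan)

    def select(self, fields):
        # Return a new object, mirroring the immutable style of Table.
        return FakeJavaTable(self.plan + (("select", fields),))

    def filter(self, predicate):
        return FakeJavaTable(self.plan + (("filter", predicate),))


class Table:
    """Python-side wrapper: defines the interface, delegates the work."""

    def __init__(self, j_table):
        # Reference to the (here: fake) Java object that does the work.
        self._j_table = j_table

    def select(self, fields):
        return Table(self._j_table.select(fields))

    def filter(self, predicate):
        return Table(self._j_table.filter(predicate))


# Every Python call is forwarded; the "Java" side accumulates the plan.
t = Table(FakeJavaTable()).filter("a > 1").select("a, b")
recorded_plan = t._j_table.plan
```

Because the wrapper holds no logic of its own, planning and optimization stay entirely on the Java side, which is the point of the design described above.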