Authors: Dian Fu, Jincheng Sun

Status

Current state: "Under Discussion"

...

  • It lets users switch seamlessly between PyFlink and Pandas when processing data in Python: data can be processed with one execution engine and then handed to the other. For example, users who already have a Pandas DataFrame at hand and want to apply an expensive transformation to it can convert it to a PyFlink table and leverage the power of the Flink engine. Conversely, users can convert a PyFlink table to a Pandas DataFrame and transform it with the rich functionality of the Pandas ecosystem.
  • No intermediate connectors are needed when converting between them.

...

The signature of `from_pandas` is as follows:

class TableEnvironment(object):

    def from_pandas(self, pdf: pd.DataFrame, schema: Union[RowType, List[str], Tuple[str], List[DataType], Tuple[DataType]] = None, splits_num: int = None) -> Table:
        pass


The argument `schema` is used to specify the schema of the result table:

...

The signature of `to_pandas` is as follows:

class Table(object):

    def to_pandas(self) -> pd.DataFrame:
        pass
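
To illustrate how the two proposed methods could be combined, here is a minimal sketch of a round trip from a Pandas DataFrame to a table and back. The helper name `round_trip` and its parameter `column_names` are hypothetical and used only for illustration. The sketch also assumes that a list of strings passed as `schema` is interpreted as the column names of the resulting table.

```python
def round_trip(t_env, pdf, column_names):
    """Convert a Pandas DataFrame to a table and back again.

    `t_env` is expected to expose the proposed `from_pandas` method, and the
    table it returns is expected to expose the proposed `to_pandas` method.
    `round_trip` and `column_names` are hypothetical names for this sketch.
    """
    # Assumption: a list of strings given as `schema` names the table columns.
    table = t_env.from_pandas(pdf, schema=column_names)

    # Any Table API transformation could be applied to `table` at this point.

    # Collect the table back into a Pandas DataFrame.
    return table.to_pandas()
```

With a real `TableEnvironment` and `pd.DataFrame`, a call might look like `round_trip(t_env, pdf, ["id", "name"])`. Because the helper relies only on the two method names, it can also be exercised with simple stand-in objects.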

Proposed Changes

from_pandas

...