...

Note that even without the Pipeline class, users can still cover the use-cases targeted by Pipeline by explicitly writing the training logic and the inference logic separately with the Estimator/Transformer APIs. However, they would then have to construct the chain of Estimator/Transformer twice, once for training and once for inference.
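To make this observation concrete, here is a minimal sketch that contrasts the two approaches. It does not use the real Flink ML interfaces: Stage, Transformer, Estimator, MeanCenterer, Doubler and ToyPipeline are hypothetical, simplified stand-ins that operate on double[] instead of Flink Tables, and are only meant to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PipelineSketch {
    // Hypothetical stand-ins for the Flink ML stage types (assumption: the
    // real interfaces work on Tables and have richer signatures).
    interface Stage {}

    interface Transformer extends Stage {
        double[] transform(double[] input);
    }

    interface Estimator extends Stage {
        Transformer fit(double[] input); // returns the fitted model
    }

    // Toy estimator: learns the mean of the training data; its model subtracts it.
    static class MeanCenterer implements Estimator {
        public Transformer fit(double[] input) {
            final double mean = Arrays.stream(input).average().orElse(0.0);
            return x -> Arrays.stream(x).map(v -> v - mean).toArray();
        }
    }

    // Toy stateless transformer: doubles every value.
    static class Doubler implements Transformer {
        public double[] transform(double[] input) {
            return Arrays.stream(input).map(v -> v * 2.0).toArray();
        }
    }

    // Toy pipeline: the chain is declared once; fit() derives the inference chain.
    static class ToyPipeline implements Estimator {
        private final List<Stage> stages;

        ToyPipeline(List<Stage> stages) {
            this.stages = stages;
        }

        public Transformer fit(double[] input) {
            final List<Transformer> fitted = new ArrayList<>();
            double[] data = input;
            for (Stage stage : stages) {
                Transformer t = (stage instanceof Estimator)
                        ? ((Estimator) stage).fit(data)
                        : (Transformer) stage;
                fitted.add(t);
                data = t.transform(data); // feed the next stage
            }
            return x -> {
                double[] out = x;
                for (Transformer t : fitted) {
                    out = t.transform(out);
                }
                return out;
            };
        }
    }

    // Without a pipeline: the user wires the chain for training and again for inference.
    static double[] manual(double[] train, double[] test) {
        Transformer centerModel = new MeanCenterer().fit(train); // training wiring
        Doubler doubler = new Doubler();
        return doubler.transform(centerModel.transform(test));   // inference wiring
    }

    // With a pipeline: the chain is specified exactly once.
    static double[] withPipeline(double[] train, double[] test) {
        Estimator pipeline = new ToyPipeline(Arrays.asList(new MeanCenterer(), new Doubler()));
        return pipeline.fit(train).transform(test);
    }

    public static void main(String[] args) {
        double[] train = {1.0, 2.0, 3.0};
        double[] test = {4.0};
        System.out.println(Arrays.toString(manual(train, test)));
        System.out.println(Arrays.toString(withPipeline(train, test)));
    }
}
```

Both methods compute the same result. The difference is that in manual() the order of the stages is encoded in two places, so any change to the chain has to be applied to both.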

Design Principle

There are multiple ways to address the use-cases targeted by this design doc. Before presenting the proposed design, we explain the design principle it follows, so that the design choices are easier to understand.

As much as possible, the API design should allow users to address the new use-case while still enjoying the existing functionalities.

For example, the existing Pipeline class allows users to compose an Estimator from a linear chain of Estimator/Transformer, without requiring users to specify this linear chain twice (see Background Section for more detail).

Correspondingly, as we extend the Flink ML API to support DAGs of Estimator/Transformer, we believe the API should provide the analogous functionality:

  • Allow users to compose an Estimator from a DAG of Estimator/Transformer, without requiring users to specify this DAG twice.
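As a rough illustration of this goal, the sketch below declares a small DAG once, fits it, and reuses the same wiring for inference. It is not the API proposed by this FLIP: ToyGraph, Node, MeanCenterer, Adder and the string edge names are hypothetical, data is carried as double[] rather than Flink Tables, and nodes are assumed to be added in topological order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DagSketch {
    // Hypothetical multi-input stage types (assumption, not the Flink ML API).
    interface Stage {}

    interface Transformer extends Stage {
        double[] transform(double[]... inputs);
    }

    interface Estimator extends Stage {
        Transformer fit(double[]... inputs);
    }

    // Toy estimator: learns the mean of its first input; its model subtracts it.
    static class MeanCenterer implements Estimator {
        public Transformer fit(double[]... inputs) {
            final double mean = Arrays.stream(inputs[0]).average().orElse(0.0);
            return in -> Arrays.stream(in[0]).map(v -> v - mean).toArray();
        }
    }

    // Toy two-input transformer: element-wise sum of its two inputs.
    static class Adder implements Transformer {
        public double[] transform(double[]... in) {
            double[] out = new double[in[0].length];
            for (int i = 0; i < out.length; i++) {
                out[i] = in[0][i] + in[1][i];
            }
            return out;
        }
    }

    static final class Node {
        final Stage stage;
        final String output;
        final String[] inputs;

        Node(Stage stage, String output, String... inputs) {
            this.stage = stage;
            this.output = output;
            this.inputs = inputs;
        }
    }

    static final class ToyGraph {
        private final List<Node> nodes = new ArrayList<>(); // topological order assumed

        ToyGraph add(Stage stage, String output, String... inputs) {
            nodes.add(new Node(stage, output, inputs));
            return this;
        }

        // Fits each Estimator node once; returns a graph with identical edges
        // in which every node is a Transformer.
        ToyGraph fit(Map<String, double[]> sources) {
            Map<String, double[]> values = new HashMap<>(sources);
            ToyGraph fitted = new ToyGraph();
            for (Node n : nodes) {
                double[][] in = gather(values, n.inputs);
                Transformer t = (n.stage instanceof Estimator)
                        ? ((Estimator) n.stage).fit(in)
                        : (Transformer) n.stage;
                values.put(n.output, t.transform(in));
                fitted.add(t, n.output, n.inputs);
            }
            return fitted;
        }

        // Only valid on a fitted graph, where every node is a Transformer.
        Map<String, double[]> transform(Map<String, double[]> sources) {
            Map<String, double[]> values = new HashMap<>(sources);
            for (Node n : nodes) {
                values.put(n.output, ((Transformer) n.stage).transform(gather(values, n.inputs)));
            }
            return values;
        }

        private static double[][] gather(Map<String, double[]> values, String[] names) {
            double[][] in = new double[names.length][];
            for (int i = 0; i < names.length; i++) {
                in[i] = values.get(names[i]);
            }
            return in;
        }
    }

    static double[] run() {
        // The DAG is specified once: center "a", then add "b" to the centered values.
        ToyGraph graph = new ToyGraph()
                .add(new MeanCenterer(), "a_centered", "a")
                .add(new Adder(), "sum", "a_centered", "b");

        Map<String, double[]> train = new HashMap<>();
        train.put("a", new double[] {1.0, 2.0, 3.0});
        train.put("b", new double[] {10.0, 10.0, 10.0});
        ToyGraph model = graph.fit(train);

        Map<String, double[]> infer = new HashMap<>();
        infer.put("a", new double[] {4.0});
        infer.put("b", new double[] {1.0});
        return model.transform(infer).get("sum");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

The point of the sketch is that fit() returns an object carrying the same edges as the original graph, so the user never has to rebuild the DAG for inference.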


Public Interfaces

This FLIP proposes a number of changes and additions to the existing Flink ML APIs. We first describe the proposed API additions and changes, and then show the code of the interfaces and classes as they would look after these changes.

...