...

Code Block
languagejava
public class GraphNode {
    public final Stage<?> stage;
    public final TableId[] estimatorInputs;
    public final TableId[] algoOpInputs;
    public final TableId[] outputs;
}

...

Code Block
languagejava
/**
 * A Graph acts as an Estimator. A Graph consists of a DAG of stages, each of which could be an
 * Estimator, Model, Transformer or AlgoOperator. When `Graph::fit` is called, the stages are
 * executed in a topologically-sorted order. If a stage is an Estimator, its `Estimator::fit` method
 * will be called on the input tables (from the input edges) to fit a Model. Then the Model will be
 * used to transform the input tables and produce output tables to the output edges. If a stage is
 * an AlgoOperator, its `AlgoOperator::transform` method will be called on the input tables and
 * produce output tables to the output edges. The GraphModel fitted from a Graph consists of the
 * fitted Models and AlgoOperators, corresponding to the Graph's stages.
 */
@PublicEvolving
public final class Graph implements Estimator<Graph, GraphModel> {
    public Graph(
            List<GraphNode> nodes,
            TableId[] estimatorInputIds,
            TableId[] algoOpInputs,
            TableId[] outputs,
            TableId[] inputModelData,
            TableId[] outputModelData) {...}

    @Override
    public GraphModel fit(Table... inputs) {...}

    @Override
    public void save(String path) throws IOException {...}

    public static Graph load(String path) throws IOException {...}
}
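
The Javadoc above says that stages are executed in a topologically-sorted order. A minimal sketch of how such an order could be derived from each node's input and output table IDs follows. It is not the proposal's implementation: `TopoOrderSketch`, its nested `Node` class and the use of plain `int` values in place of `TableId` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Illustrative only: orders DAG nodes so that each runs after the producers of its inputs. */
public class TopoOrderSketch {

    /** Hypothetical stand-in for GraphNode; table IDs are plain ints here. */
    public static final class Node {
        final String name;
        final int[] inputs;
        final int[] outputs;

        public Node(String name, int[] inputs, int[] outputs) {
            this.name = name;
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /** Returns node names in an order in which all inputs of a node exist before it runs. */
    public static List<String> executionOrder(List<Node> nodes, int[] graphInputs) {
        Set<Integer> available = new HashSet<>();
        for (int id : graphInputs) {
            available.add(id);
        }
        List<Node> pending = new ArrayList<>(nodes);
        List<String> order = new ArrayList<>();
        while (!pending.isEmpty()) {
            boolean progressed = false;
            for (Iterator<Node> it = pending.iterator(); it.hasNext(); ) {
                Node node = it.next();
                if (allAvailable(node.inputs, available)) {
                    // The node can run now; its outputs become available to later nodes.
                    for (int out : node.outputs) {
                        available.add(out);
                    }
                    order.add(node.name);
                    it.remove();
                    progressed = true;
                }
            }
            if (!progressed) {
                throw new IllegalStateException("Cycle or missing input table");
            }
        }
        return order;
    }

    private static boolean allAvailable(int[] ids, Set<Integer> available) {
        for (int id : ids) {
            if (!available.contains(id)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Nodes are deliberately listed out of order; IDs 0 and 1 are the graph inputs.
        List<Node> nodes = Arrays.asList(
                new Node("stage3", new int[] {2, 3}, new int[] {4}),
                new Node("stage1", new int[] {0}, new int[] {2}),
                new Node("stage2", new int[] {1}, new int[] {3}));
        System.out.println(executionOrder(nodes, new int[] {0, 1}));
    }
}
```

The repeated-pass loop is quadratic in the number of nodes, which seems acceptable for graphs with a handful of stages. A production implementation would more likely use in-degree counting.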

...

Code Block
languagejava
/**
 * A GraphBuilder provides APIs to build Estimator/Model/AlgoOperator from a DAG of stages, each of
 * which could be an Estimator, Model, Transformer or AlgoOperator.
 */
@PublicEvolving
public final class GraphBuilder {
    private int maxOutputLength = 20;

    public GraphBuilder() {}

    /**
     * Specifies the upper bound (could be loose) of the number of output tables that can be
     * returned by the Model::getModelData and AlgoOperator::transform methods, for any stage
     * involved in this Graph.
     *
     * <p>The default upper bound is 20.
     */
    public GraphBuilder setMaxOutputLength(int maxOutputLength) {...}

    /**
     * Creates a TableId associated with this GraphBuilder. It can be used to specify the passing of
     * tables between stages, as well as the input/output tables of the Graph/GraphModel generated
     * by this builder.
     *
     * @return A TableId.
     */
    public TableId createTableId() {...}

    /**
     * Adds an AlgoOperator in the graph.
     *
     * <p>When the graph runs as Estimator, the transform() of the given AlgoOperator would be
     * invoked with the given inputs. Then when the GraphModel fitted by this graph runs, the
     * transform() of the given AlgoOperator would be invoked with the given inputs.
     *
     * <p>When the graph runs as AlgoOperator or Model, the transform() of the given AlgoOperator
     * would be invoked with the given inputs.
     *
     * @param algoOp An AlgoOperator instance.
     * @param inputs A list of TableIds which represents inputs to transform() of the given
     *     AlgoOperator.
     * @return A list of TableIds which represents the outputs of transform() of the given
     *     AlgoOperator.
     */
    public TableId[] addAlgoOperator(AlgoOperator<?> algoOp, TableId... inputs) {...}

    /**
     * Adds an Estimator in the graph.
     *
     * <p>When the graph runs as Estimator, the fit() of the given Estimator would be invoked with
     * the given inputs. Then when the GraphModel fitted by this graph runs, the transform() of the
     * Model fitted by the given Estimator would be invoked with the given inputs.
     *
     * <p>When the graph runs as AlgoOperator or Model, the fit() of the given Estimator would be
     * invoked with the given inputs, then the transform() of the Model fitted by the given
     * Estimator would be invoked with the given inputs.
     *
     * @param estimator An Estimator instance.
     * @param inputs A list of TableIds which represents inputs to fit() of the given Estimator as
     *     well as inputs to transform() of the Model fitted by the given Estimator.
     * @return A list of TableIds which represents the outputs of transform() of the Model fitted by
     *     the given Estimator.
     */
    public TableId[] addEstimator(Estimator<?, ?> estimator, TableId... inputs) {...}

    /**
     * Adds an Estimator in the graph.
     *
     * <p>When the graph runs as Estimator, the fit() of the given Estimator would be invoked with
     * estimatorInputs. Then when the GraphModel fitted by this graph runs, the transform() of the
     * Model fitted by the given Estimator would be invoked with modelInputs.
     *
     * <p>When the graph runs as AlgoOperator or Model, the fit() of the given Estimator would be
     * invoked with estimatorInputs, then the transform() of the Model fitted by the given Estimator
     * would be invoked with modelInputs.
     *
     * @param estimator An Estimator instance.
     * @param estimatorInputs A list of TableIds which represents inputs to fit() of the given
     *     Estimator.
     * @param modelInputs A list of TableIds which represents inputs to transform() of the Model
     *     fitted by the given Estimator.
     * @return A list of TableIds which represents the outputs of transform() of the Model fitted by
     *     the given Estimator.
     */
    public TableId[] addEstimator(
            Estimator<?, ?> estimator, TableId[] estimatorInputs, TableId[] modelInputs) {...}

    /**
     * When the graph runs as Estimator, AlgoOperator or Model, the setModelData() of the given
     * Model would be invoked with the given inputs.
     *
     * @param model A Model instance.
     * @param inputs A list of TableIds which represents inputs to setModelData() of the given
     *     Model.
     */
    public void setModelData(Model<?> model, TableId... inputs) {...}

    /**
     * When the graph runs as Estimator, AlgoOperator or Model, the getModelData() of the given
     * Model would be invoked.
     *
     * @param model A Model instance.
     * @return A list of TableIds which represents the outputs of getModelData() of the given Model.
     */
    public TableId[] getModelData(Model<?> model) {...}

    /**
     * Wraps nodes of the graph into an Estimator.
     *
     * <p>When the returned Estimator runs, and when the Model fitted by the returned Estimator
     * runs, the sequence of operations recorded by the {@code addAlgoOperator(...)}, {@code
     * addEstimator(...)}, {@code setModelData(...)} and {@code getModelData(...)} would be executed
     * as specified in the Java doc of the corresponding methods.
     *
     * @param inputs A list of TableIds which represents inputs to fit() of the returned Estimator
     *     as well as inputs to transform() of the Model fitted by the returned Estimator.
     * @param outputs A list of TableIds which represents outputs of transform() of the Model fitted
     *     by the returned Estimator.
     * @return An Estimator which wraps the nodes of this graph.
     */
    public Estimator<?, ?> buildEstimator(TableId[] inputs, TableId[] outputs) {...}

    /**
     * Wraps nodes of the graph into an Estimator.
     *
     * <p>When the returned Estimator runs, and when the Model fitted by the returned Estimator
     * runs, the sequence of operations recorded by the {@code addAlgoOperator(...)}, {@code
     * addEstimator(...)}, {@code setModelData(...)} and {@code getModelData(...)} would be executed
     * as specified in the Java doc of the corresponding methods.
     *
     * @param inputs A list of TableIds which represents inputs to fit() of the returned Estimator
     *     as well as inputs to transform() of the Model fitted by the returned Estimator.
     * @param outputs A list of TableIds which represents outputs of transform() of the Model fitted
     *     by the returned Estimator.
     * @param inputModelData A list of TableIds which represents inputs to setModelData() of the
     *     Model fitted by the returned Estimator.
     * @param outputModelData A list of TableIds which represents outputs of getModelData() of the
     *     Model fitted by the returned Estimator.
     * @return An Estimator which wraps the nodes of this graph.
     */
    public Estimator<?, ?> buildEstimator(
            TableId[] inputs,
            TableId[] outputs,
            TableId[] inputModelData,
            TableId[] outputModelData) {...}

    /**
     * Wraps nodes of the graph into an Estimator.
     *
     * <p>When the returned Estimator runs, and when the Model fitted by the returned Estimator
     * runs, the sequence of operations recorded by the {@code addAlgoOperator(...)}, {@code
     * addEstimator(...)}, {@code setModelData(...)} and {@code getModelData(...)} would be executed
     * as specified in the Java doc of the corresponding methods.
     *
     * @param estimatorInputs A list of TableIds which represents inputs to fit() of the returned
     *     Estimator.
     * @param modelInputs A list of TableIds which represents inputs to transform() of the Model
     *     fitted by the returned Estimator.
     * @param outputs A list of TableIds which represents outputs of transform() of the Model fitted
     *     by the returned Estimator.
     * @param inputModelData A list of TableIds which represents inputs to setModelData() of the
     *     Model fitted by the returned Estimator.
     * @param outputModelData A list of TableIds which represents outputs of getModelData() of the
     *     Model fitted by the returned Estimator.
     * @return An Estimator which wraps the nodes of this graph.
     */
    public Estimator<?, ?> buildEstimator(
            TableId[] estimatorInputs,
            TableId[] modelInputs,
            TableId[] outputs,
            TableId[] inputModelData,
            TableId[] outputModelData) {...}

    /**
     * Wraps nodes of the graph into an AlgoOperator.
     *
     * <p>When the returned AlgoOperator runs, the sequence of operations recorded by the {@code
     * addAlgoOperator(...)} and {@code addEstimator(...)} would be executed as specified in the
     * Java doc of the corresponding methods.
     *
     * @param inputs A list of TableIds which represents inputs to transform() of the returned
     *     AlgoOperator.
     * @param outputs A list of TableIds which represents outputs of transform() of the returned
     *     AlgoOperator.
     * @return An AlgoOperator which wraps the nodes of this graph.
     */
    public AlgoOperator<?> buildAlgoOperator(TableId[] inputs, TableId[] outputs) {...}

    /**
     * Wraps nodes of the graph into a Model.
     *
     * <p>When the returned Model runs, the sequence of operations recorded by the {@code
     * addAlgoOperator(...)} and {@code addEstimator(...)} would be executed as specified in the
     * Java doc of the corresponding methods.
     *
     * @param inputs A list of TableIds which represents inputs to transform() of the returned
     *     Model.
     * @param outputs A list of TableIds which represents outputs of transform() of the returned
     *     Model.
     * @return A Model which wraps the nodes of this graph.
     */
    public Model<?> buildModel(TableId[] inputs, TableId[] outputs) {...}

    /**
     * Wraps nodes of the graph into a Model.
     *
     * <p>When the returned Model runs, the sequence of operations recorded by the {@code
     * addAlgoOperator(...)}, {@code addEstimator(...)}, {@code setModelData(...)} and {@code
     * getModelData(...)} would be executed as specified in the Java doc of the corresponding
     * methods.
     *
     * @param inputs A list of TableIds which represents inputs to transform() of the returned
     *     Model.
     * @param outputs A list of TableIds which represents outputs of transform() of the returned
     *     Model.
     * @param inputModelData A list of TableIds which represents inputs to setModelData() of the
     *     returned Model.
     * @param outputModelData A list of TableIds which represents outputs of getModelData() of the
     *     returned Model.
     * @return A Model which wraps the nodes of this graph.
     */
    public Model<?> buildModel(
            TableId[] inputs,
            TableId[] outputs,
            TableId[] inputModelData,
            TableId[] outputModelData) {...}
}
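
The builder methods above record stages and hand back TableIds before any table exists. A minimal sketch of that recording pattern follows. It makes one loud assumption: because the real number of outputs of a stage is unknown until the stage runs, the sketch reserves `maxOutputLength` output IDs per added stage. The class `RecordingBuilderSketch`, the single `addStage` method and the `int` IDs are hypothetical and are not part of the proposed API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a builder that records stages and pre-allocates output table IDs. */
public class RecordingBuilderSketch {
    private int nextId = 0;
    private int maxOutputLength = 20;
    private final List<String> recordedStages = new ArrayList<>();

    public RecordingBuilderSketch setMaxOutputLength(int maxOutputLength) {
        this.maxOutputLength = maxOutputLength;
        return this;
    }

    /** Hands out a fresh ID; plays the role of createTableId() with ints instead of TableId. */
    public int createTableId() {
        return nextId++;
    }

    /**
     * Records a stage together with its input IDs and reserves maxOutputLength output IDs
     * (assumption: the upper bound is used because the true output count is not yet known).
     */
    public int[] addStage(String stageName, int... inputs) {
        int[] outputs = new int[maxOutputLength];
        for (int i = 0; i < outputs.length; i++) {
            outputs[i] = createTableId();
        }
        recordedStages.add(stageName + Arrays.toString(inputs));
        return outputs;
    }

    public List<String> recordedStages() {
        return recordedStages;
    }

    public static void main(String[] args) {
        RecordingBuilderSketch builder = new RecordingBuilderSketch().setMaxOutputLength(2);
        int input1 = builder.createTableId();
        int input2 = builder.createTableId();
        int output1 = builder.addStage("stage1", input1)[0];
        int output2 = builder.addStage("stage2", input2)[0];
        int output3 = builder.addStage("stage3", output1, output2)[0];
        System.out.println(output3 + " " + builder.recordedStages());
    }
}
```

Under this assumption a loose upper bound only costs unused IDs, which would be consistent with the setter's note that the bound "could be loose".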

...

Code Block
languagejava
GraphBuilder builder = new GraphBuilder();

// Creates nodes
AlgoOperator<?> stage1 = new TransformerA();
AlgoOperator<?> stage2 = new TransformerA();
Estimator<?, ?> stage3 = new EstimatorB();
// Creates inputs
TableId input1 = builder.createTableId();
TableId input2 = builder.createTableId();
// Feeds inputs to nodes and gets outputs.
TableId output1 = builder.addAlgoOperator(stage1, input1)[0];
TableId output2 = builder.addAlgoOperator(stage2, input2)[0];
TableId output3 = builder.addEstimator(stage3, output1, output2)[0];

// Specifies the ordered lists of inputs and outputs that will be used as the
// inputs/outputs of the corresponding Graph and GraphModel APIs.
TableId[] inputs = new TableId[] {input1, input2};
TableId[] outputs = new TableId[] {output3};

// Generates the Graph instance.
Estimator<?, ?> estimator = builder.buildEstimator(inputs, outputs);
// The fit method takes 2 tables which are mapped to input1 and input2.
Model<?> model = estimator.fit(...);
// The transform method takes 2 tables which are mapped to input1 and input2.
Table[] results = model.transform(...);

...

Code Block
languagejava
GraphBuilder builder = new GraphBuilder();

// Creates nodes
Estimator<?, ?> stage1 = new EstimatorA();
AlgoOperator<?> stage2 = new TransformerB();
// Creates inputs
TableId estimatorInput1 = builder.createTableId();
TableId estimatorInput2 = builder.createTableId();
TableId transformerInput1 = builder.createTableId();

// Feeds inputs to nodes and gets outputs.
TableId output1 = builder.addEstimator(stage1, new TableId[] {estimatorInput1, estimatorInput2}, new TableId[] {transformerInput1})[0];
TableId output2 = builder.addAlgoOperator(stage2, output1)[0];

// Specifies the ordered lists of estimator inputs, transformer inputs, outputs, input model data and
// output model data that will be used as the inputs/outputs of the corresponding Graph and GraphModel APIs.
TableId[] estimatorInputs = new TableId[] {estimatorInput1, estimatorInput2};
TableId[] transformerInputs = new TableId[] {transformerInput1};
TableId[] outputs = new TableId[] {output2};
TableId[] inputModelData = new TableId[] {};
TableId[] outputModelData = new TableId[] {};

// Generates the Graph instance.
Estimator<?, ?> estimator = builder.buildEstimator(estimatorInputs, transformerInputs, outputs, inputModelData, outputModelData);
// The fit method takes 2 tables which are mapped to estimatorInput1 and estimatorInput2.
Model<?> model = estimator.fit(...);
// The transform method takes 1 table which is mapped to transformerInput1.
Table[] results = model.transform(...);
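
The comments in the example above say that the tables passed to fit() map to estimatorInput1 and estimatorInput2, and the table passed to transform() maps to transformerInput1. A minimal sketch of that positional mapping follows. It assumes the i-th argument is bound to the i-th declared ID, uses `int` IDs and `String` table names as stand-ins for `TableId` and `Table`, and the class `InputBindingSketch` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: binds positional arguments to the IDs declared when the graph was built. */
public class InputBindingSketch {

    /** Maps the i-th table to the i-th ID; rejects calls with the wrong number of tables. */
    public static Map<Integer, String> bind(int[] ids, String... tables) {
        if (ids.length != tables.length) {
            throw new IllegalArgumentException(
                    "Expected " + ids.length + " tables but got " + tables.length);
        }
        Map<Integer, String> bound = new LinkedHashMap<>();
        for (int i = 0; i < ids.length; i++) {
            bound.put(ids[i], tables[i]);
        }
        return bound;
    }

    public static void main(String[] args) {
        int[] estimatorInputs = {0, 1}; // stand-ins for estimatorInput1, estimatorInput2
        int[] transformerInputs = {2}; // stand-in for transformerInput1

        // fit(...) would bind two tables; transform(...) would bind one.
        System.out.println(bind(estimatorInputs, "trainA", "trainB"));
        System.out.println(bind(transformerInputs, "serving"));
    }
}
```

Keeping separate ID lists for fit() and transform() is what lets the fitted model accept a different number of tables than the estimator, as in the second example.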

...