...

Code Block
languagejava
/** This interface represents a configurable {@link NonKeyedPartitionStream}. */
interface ProcessConfigurableAndNonKeyedPartitionStream<T> extends NonKeyedPartitionStream<T>,
ProcessConfigurable<ProcessConfigurableAndNonKeyedPartitionStream<T>> {}

/** This interface represents a configurable {@link KeyedPartitionStream}. */
interface ProcessConfigurableAndKeyedPartitionStream<K, T> extends KeyedPartitionStream<K,T>,
ProcessConfigurable<ProcessConfigurableAndKeyedPartitionStream<K, T>> {}

/** This interface represents a configurable {@link GlobalStream}. */
interface ProcessConfigurableAndGlobalStream<T> extends GlobalStream<T>,
ProcessConfigurable<ProcessConfigurableAndGlobalStream<T>> {}


/** This interface represents a configurable {@link TwoNonKeyedPartitionStreams}. */
interface ProcessConfigurableAndTwoNonKeyedPartitionStreams<T1, T2> extends TwoNonKeyedPartitionStreams<T1, T2>,
ProcessConfigurable<ProcessConfigurableAndTwoNonKeyedPartitionStreams<T1, T2>> {}

/** This interface represents a configurable {@link TwoKeyedPartitionStreams}. */
interface ProcessConfigurableAndTwoKeyedPartitionStreams<K, T1, T2> extends TwoKeyedPartitionStreams<K, T1, T2>,
ProcessConfigurable<ProcessConfigurableAndTwoKeyedPartitionStreams<K, T1, T2>> {}


/** This interface represents a configurable {@link TwoGlobalStreams}. */
interface ProcessConfigurableAndTwoGlobalStreams<T1, T2> extends TwoGlobalStreams<T1, T2>,
ProcessConfigurable<ProcessConfigurableAndTwoGlobalStreams<T1, T2>> {}

Regarding the second point, process, connectAndProcess, and other operations performed on DataStreams will return an instance of the corresponding interface. We will change the return types of the one-input/two-output/two-input process methods and of the toSink methods as shown in the table below.

| stream type | function | old return type | new return type |
|---|---|---|---|
| NonKeyedPartitionStream | process(OneInputStreamProcessFunction) | NonKeyedPartitionStream | ProcessConfigurableAndNonKeyedPartitionStream |
| NonKeyedPartitionStream | process(TwoOutputStreamProcessFunction) | TwoNonKeyedPartitionStreams | ProcessConfigurableAndTwoNonKeyedPartitionStreams |
| NonKeyedPartitionStream | connectAndProcess(NonKeyedPartitionStream, TwoInputStreamProcessFunction) | NonKeyedPartitionStream | ProcessConfigurableAndNonKeyedPartitionStream |
| NonKeyedPartitionStream | connectAndProcess(BroadcastStream, TwoInputStreamProcessFunction) | NonKeyedPartitionStream | ProcessConfigurableAndNonKeyedPartitionStream |
| NonKeyedPartitionStream | toSink(Sink) | void | ProcessConfigurable |
| KeyedPartitionStream | process(OneInputStreamProcessFunction, KeySelector) | KeyedPartitionStream | ProcessConfigurableAndKeyedPartitionStream |
| KeyedPartitionStream | process(OneInputStreamProcessFunction) | NonKeyedPartitionStream | ProcessConfigurableAndNonKeyedPartitionStream |
| KeyedPartitionStream | process(TwoOutputStreamProcessFunction, KeySelector, KeySelector) | TwoKeyedPartitionStreams | ProcessConfigurableAndTwoKeyedPartitionStreams |
| KeyedPartitionStream | process(TwoOutputStreamProcessFunction) | TwoNonKeyedPartitionStreams | ProcessConfigurableAndTwoNonKeyedPartitionStreams |
| KeyedPartitionStream | connectAndProcess(KeyedPartitionStream, TwoInputStreamProcessFunction) | NonKeyedPartitionStream | ProcessConfigurableAndNonKeyedPartitionStream |
| KeyedPartitionStream | connectAndProcess(KeyedPartitionStream, TwoInputStreamProcessFunction, KeySelector) | KeyedPartitionStream | ProcessConfigurableAndKeyedPartitionStream |
| KeyedPartitionStream | connectAndProcess(BroadcastStream, TwoInputStreamProcessFunction) | NonKeyedPartitionStream | ProcessConfigurableAndNonKeyedPartitionStream |
| KeyedPartitionStream | toSink(Sink) | void | ProcessConfigurable |
| GlobalStream | process(OneInputStreamProcessFunction) | GlobalStream | ProcessConfigurableAndGlobalStream |
| GlobalStream | process(TwoOutputStreamProcessFunction) | TwoGlobalStreams | ProcessConfigurableAndTwoGlobalStreams |
| GlobalStream | connectAndProcess(GlobalStream, TwoInputStreamProcessFunction) | GlobalStream | ProcessConfigurableAndGlobalStream |
| GlobalStream | toSink(Sink) | void | ProcessConfigurable |
| BroadcastStream | connectAndProcess(KeyedPartitionStream, TwoInputStreamProcessFunction) | NonKeyedPartitionStream | ProcessConfigurableAndNonKeyedPartitionStream |
| BroadcastStream | connectAndProcess(NonKeyedPartitionStream, TwoInputStreamProcessFunction) | NonKeyedPartitionStream | ProcessConfigurableAndNonKeyedPartitionStream |
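To illustrate why the new return types are useful, here is a minimal, self-contained sketch of the self-typed pattern behind ProcessConfigurableAndNonKeyedPartitionStream. Everything in it is a stand-in: the configuration methods withName and withParallelism, the String-based process signature, and the ToyStream class are assumptions made for illustration and are not taken from this proposal.

```java
public class ConfigurableStreamSketch {

    /** Hypothetical stand-in for ProcessConfigurable; the method names are assumptions. */
    interface ProcessConfigurable<T extends ProcessConfigurable<T>> {
        T withName(String name);

        T withParallelism(int parallelism);
    }

    /** Reduced stand-in for NonKeyedPartitionStream; a String replaces the process function. */
    interface NonKeyedPartitionStream<T> {
        ProcessConfigurableAndNonKeyedPartitionStream<T> process(String operationName);
    }

    /** Same shape as the interface proposed above: a stream that is also configurable. */
    interface ProcessConfigurableAndNonKeyedPartitionStream<T>
            extends NonKeyedPartitionStream<T>,
                    ProcessConfigurable<ProcessConfigurableAndNonKeyedPartitionStream<T>> {}

    /** Toy implementation that only records the configured values. */
    static final class ToyStream<T> implements ProcessConfigurableAndNonKeyedPartitionStream<T> {
        private String name;
        private int parallelism = 1;

        ToyStream(String name) {
            this.name = name;
        }

        @Override
        public ProcessConfigurableAndNonKeyedPartitionStream<T> process(String operationName) {
            // Each operation yields a new stream that can be configured on its own.
            return new ToyStream<>(operationName);
        }

        @Override
        public ProcessConfigurableAndNonKeyedPartitionStream<T> withName(String newName) {
            this.name = newName;
            return this;
        }

        @Override
        public ProcessConfigurableAndNonKeyedPartitionStream<T> withParallelism(int p) {
            this.parallelism = p;
            return this;
        }

        @Override
        public String toString() {
            return name + "[p=" + parallelism + "]";
        }
    }

    static String demo() {
        NonKeyedPartitionStream<String> source = new ToyStream<>("source");
        // Configuration calls keep the combined type, so they can be chained ...
        ProcessConfigurableAndNonKeyedPartitionStream<String> mapped =
                source.process("map").withName("my-map").withParallelism(4);
        // ... and the result is still a stream that accepts further operations.
        ProcessConfigurableAndNonKeyedPartitionStream<String> filtered = mapped.process("filter");
        return mapped + " -> " + filtered;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the type parameter of ProcessConfigurable is the combined interface itself, every configuration call returns a value that is both configurable and a stream, which is what allows configuration and further operations to be chained in any order.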

In order to support the configuration of fine-grained resources, the API layer must be able to set the specifications of each SlotSharingGroup. Therefore, we will move org.apache.flink.api.common.operators.SlotSharingGroup from the flink-core module to the flink-core-api module proposed in the umbrella FLIP.

Introduce Runtime Context

Unlike attributes such as the name of a process operation, some information (such as the current key) can only be obtained while the process function is being executed. In order to build a bridge between user functions and the execution engine, we provide a unified entry point called Runtime Context.

We divide all contextual information into multiple parts according to their functions:

...

Their definitions are as follows:

JobInfo

Code Block
languagejava
titleJobInfo.java


/** The {@link JobInfo} represents the meta information of the job. */
public interface JobInfo {
    /**
     * Get the name of current job.
     *
     * @return the name of current job
     */
    String getJobName();

    /** Get the {@link ExecutionMode} of current job. */
    ExecutionMode getExecutionMode();

    /** Execution mode of the current job. */
    enum ExecutionMode {
        STREAMING,
        BATCH
    }

    // We will gradually add more related methods here.
}

TaskInfo

Code Block
languagejava
titleTaskInfo.java
/** The {@link TaskInfo} represents the meta information of the task. */
public interface TaskInfo {
    /**
     * Get the parallelism of current task.
     *
     * @return the parallelism of this process function.
     */
    int getParallelism();

    /**
     * Get the max parallelism of current task.
     *
     * @return The max parallelism.
     */
    int getMaxParallelism();

    /**
     * Get the name of current task.
     *
     * @return The name of current task.
     */
    String getTaskName();

    // We will gradually add more related methods here.
}

ProcessingTimeManager

Code Block
languagejava
titleProcessingTimeManager.java
/**
 * This is responsible for managing runtime information related to the processing time of a process
 * function.
 */
public interface ProcessingTimeManager {
    /**
     * Register a processing timer for this process function. The onProcessingTimer method of this
     * function will be invoked as a callback when the timer expires.
     *
     * @param timestamp to trigger timer callback.
     */
    void registerProcessingTimer(long timestamp);

    /**
     * Get the current processing time.
     *
     * @return current processing time.
     */
    long currentProcessingTime();
}

Note that only processing timers can be registered here, because we no longer unify event time and processing time into a single time service. Simply put, event time can be implemented using the generalized watermark mechanism; more detailed reasons are explained in the umbrella FLIP.
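The register-then-callback contract of ProcessingTimeManager can be sketched with a toy, in-memory implementation. This is not how the runtime implements timers: the manually advanced clock, the ManualProcessingTimeManager class, and its advanceTo method are assumptions introduced only to make the behaviour observable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.LongConsumer;

public class ProcessingTimerSketch {

    /** Same two methods as the interface proposed above. */
    interface ProcessingTimeManager {
        void registerProcessingTimer(long timestamp);

        long currentProcessingTime();
    }

    /** Toy manager driven by a manually advanced clock (an assumption for this sketch). */
    static final class ManualProcessingTimeManager implements ProcessingTimeManager {
        private final PriorityQueue<Long> timers = new PriorityQueue<>();
        private final LongConsumer onTimer;
        private long now;

        ManualProcessingTimeManager(long start, LongConsumer onTimer) {
            this.now = start;
            this.onTimer = onTimer;
        }

        @Override
        public void registerProcessingTimer(long timestamp) {
            timers.add(timestamp);
        }

        @Override
        public long currentProcessingTime() {
            return now;
        }

        /** Moves the clock forward and fires every timer whose timestamp has been reached. */
        void advanceTo(long newTime) {
            now = Math.max(now, newTime);
            while (!timers.isEmpty() && timers.peek() <= now) {
                // Stands in for the engine invoking onProcessingTimer on the function.
                onTimer.accept(timers.poll());
            }
        }
    }

    static List<Long> demo() {
        List<Long> fired = new ArrayList<>();
        ManualProcessingTimeManager manager =
                new ManualProcessingTimeManager(1_000L, ts -> fired.add(ts));
        manager.registerProcessingTimer(manager.currentProcessingTime() + 500); // due at 1500
        manager.registerProcessingTimer(manager.currentProcessingTime() + 100); // due at 1100
        manager.advanceTo(1_200L); // only the 1100 timer is due
        manager.advanceTo(2_000L); // now the 1500 timer is due as well
        return fired;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Timers fire in timestamp order regardless of registration order, and each fires only once the clock has reached its timestamp.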

...

StateManager

Code Block
languagejava
titleStateManager.java
/** This is responsible for managing runtime information related to the state of a process function. */
public interface StateManager {
    /**
     * Get the key of current record.
     *
     * @return The key of the currently processed record for {@link KeyedPartitionStream}, or
     *     {@link Optional#empty()} for non-keyed streams.
     */
    <K> Optional<K> getCurrentKey();

    // The logic of getting states is omitted in this FLIP.
}

The logic of getting states is omitted, but we will introduce it in detail in other sub-FLIPs later.
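As an illustration of the Optional-based contract of getCurrentKey, the sketch below shows how user code might distinguish keyed from non-keyed execution. The forKey helper and the describe method are hypothetical and exist only to make the example runnable; the StateManager here is reduced to the single method shown above.

```java
import java.util.Optional;

public class CurrentKeySketch {

    /** Reduced copy of the proposed interface, limited to the key accessor. */
    interface StateManager {
        <K> Optional<K> getCurrentKey();
    }

    /** Hypothetical helper: a StateManager with a fixed key, or no key when null is passed. */
    static StateManager forKey(Object key) {
        return new StateManager() {
            @Override
            @SuppressWarnings("unchecked")
            public <K> Optional<K> getCurrentKey() {
                // The caller picks K; the cast is unchecked, so a wrong K only fails at use site.
                return Optional.ofNullable((K) key);
            }
        };
    }

    /** Branches on the presence of a key instead of checking for null. */
    static String describe(StateManager stateManager) {
        Optional<String> key = stateManager.getCurrentKey();
        return key.map(k -> "keyed record, key=" + k).orElse("non-keyed record");
    }

    public static void main(String[] args) {
        System.out.println(describe(forKey("user-42")));
        System.out.println(describe(forKey(null)));
    }
}
```

Returning Optional rather than null lets the same function body run on both keyed and non-keyed streams without a null check.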

WatermarkManager

Code Block
languagejava
titleWatermarkManager.java
/**
 * This is responsible for managing runtime information related to the watermark of a process function.
 */
public interface WatermarkManager {
    // The logic of processing watermark is omitted in this FLIP.
}

The logic of processing watermark is omitted, but we will introduce it in detail in other sub-FLIPs later.

RuntimeContext

Code Block
languagejava
titleRuntimeContext.java

/**
 * A RuntimeContext contains information about the context in which process functions are executed.
 * Each parallel instance of the function will have a context through which it can access contextual
 * information, such as the current key and execution mode. Through this context, we can also
 * interact with the execution layer, such as getting state, emitting watermark, registering timer, etc.
 */
public interface RuntimeContext {
    /** Get the {@link JobInfo} of this process function. */
    JobInfo getJobInfo();

    /** Get the {@link TaskInfo} of this process function. */
    TaskInfo getTaskInfo();

    /** Get the {@link ProcessingTimeManager} of this process function. */
    ProcessingTimeManager getProcessingTimeManager();

    /** Get the {@link StateManager} of this process function. */
    StateManager getStateManager();

    /** Get the {@link WatermarkManager} of this process function. */
    WatermarkManager getWatermarkManager();
}
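The following sketch shows how user code might read job and task metadata through a RuntimeContext. To stay self-contained it uses a reduced RuntimeContext with only the two info getters, and the fixedContext factory is a hypothetical test double, not part of the proposal.

```java
public class RuntimeContextSketch {

    interface JobInfo {
        String getJobName();

        ExecutionMode getExecutionMode();

        enum ExecutionMode {
            STREAMING,
            BATCH
        }
    }

    interface TaskInfo {
        int getParallelism();

        int getMaxParallelism();

        String getTaskName();
    }

    /** Reduced RuntimeContext: the manager getters are left out to keep the sketch short. */
    interface RuntimeContext {
        JobInfo getJobInfo();

        TaskInfo getTaskInfo();
    }

    /** What a process function could do with the context it receives. */
    static String describe(RuntimeContext ctx) {
        JobInfo job = ctx.getJobInfo();
        TaskInfo task = ctx.getTaskInfo();
        return job.getJobName() + "/" + task.getTaskName()
                + " mode=" + job.getExecutionMode()
                + " parallelism=" + task.getParallelism() + "/" + task.getMaxParallelism();
    }

    /** Hypothetical test double that returns fixed values. */
    static RuntimeContext fixedContext(
            String jobName,
            JobInfo.ExecutionMode mode,
            String taskName,
            int parallelism,
            int maxParallelism) {
        JobInfo job = new JobInfo() {
            @Override
            public String getJobName() {
                return jobName;
            }

            @Override
            public JobInfo.ExecutionMode getExecutionMode() {
                return mode;
            }
        };
        TaskInfo task = new TaskInfo() {
            @Override
            public int getParallelism() {
                return parallelism;
            }

            @Override
            public int getMaxParallelism() {
                return maxParallelism;
            }

            @Override
            public String getTaskName() {
                return taskName;
            }
        };
        return new RuntimeContext() {
            @Override
            public JobInfo getJobInfo() {
                return job;
            }

            @Override
            public TaskInfo getTaskInfo() {
                return task;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(
                describe(fixedContext("wordcount", JobInfo.ExecutionMode.BATCH, "tokenizer", 4, 128)));
    }
}
```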

...

OneInputStreamProcessFunction

Code Block
languagejava
titleOneInputStreamProcessFunction.java
/** This interface contains all logic related to processing records from a single input. */
public interface OneInputStreamProcessFunction<IN, OUT> extends ProcessFunction {
    // Omit logic of processing data and life-cycle methods. They can be found in FLIP-1.

    /**
     * Callback for processing timer.
     *
     * @param timestamp when this callback is triggered.
     * @param output to emit record.
     * @param ctx runtime context in which this function is executed.
     */
    default void onProcessingTimer(long timestamp, Collector<OUT> output, RuntimeContext ctx) {}
}

TwoInputStreamProcessFunction

Code Block
languagejava
titleTwoInputStreamProcessFunction.java
/** This interface contains all logic related to processing records from two inputs. */
public interface TwoInputStreamProcessFunction<IN1, IN2, OUT> extends ProcessFunction {
    // Omit logic of processing data and life-cycle methods. They can be found in FLIP-1.

    /**
     * Callback for processing timer.
     *
     * @param timestamp when this callback is triggered.
     * @param output to emit record.
     * @param ctx runtime context in which this function is executed.
     */
    default void onProcessingTimer(long timestamp, Collector<OUT> output, RuntimeContext ctx) {}
}

TwoOutputStreamProcessFunction

Code Block
languagejava
titleTwoOutputStreamProcessFunction.java
/** This interface contains all logic related to processing records and emitting them to two outputs. */
public interface TwoOutputStreamProcessFunction<IN, OUT1, OUT2> extends ProcessFunction {
    // Omit logic of processing data and life-cycle methods. They can be found in FLIP-1.

    /**
     * Callback for processing timer.
     *
     * @param timestamp when this callback is triggered.
     * @param output1 to emit record.
     * @param output2 to emit record.
     * @param ctx runtime context in which this function is executed.
     */
    default void onProcessingTimer(
            long timestamp, Collector<OUT1> output1, Collector<OUT2> output2, RuntimeContext ctx) {}
}
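As a rough illustration of the two-output timer callback, the sketch below flushes a buffered count to the first output and an audit line to the second when a timer fires. Several parts are assumptions: the Collector interface and its collect method, the empty RuntimeContext placeholder, and the buffer method, which stands in for the record-processing logic that this FLIP omits.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoOutputTimerSketch {

    /** Hypothetical minimal collector; the method name collect is an assumption. */
    interface Collector<T> {
        void collect(T record);
    }

    /** Empty placeholder for the RuntimeContext proposed above. */
    interface RuntimeContext {}

    /** Reduced copy of the proposed interface: only the timer callback is kept. */
    interface TwoOutputStreamProcessFunction<IN, OUT1, OUT2> {
        default void onProcessingTimer(
                long timestamp,
                Collector<OUT1> output1,
                Collector<OUT2> output2,
                RuntimeContext ctx) {}
    }

    /** Counts records and flushes the count when a processing timer fires. */
    static final class FlushOnTimer
            implements TwoOutputStreamProcessFunction<String, Integer, String> {
        private int buffered;

        /** Hypothetical stand-in for the elided record-processing method. */
        void buffer(String record) {
            buffered++;
        }

        @Override
        public void onProcessingTimer(
                long timestamp,
                Collector<Integer> counts,
                Collector<String> audit,
                RuntimeContext ctx) {
            counts.collect(buffered);
            audit.collect("flushed " + buffered + " records at " + timestamp);
            buffered = 0;
        }
    }

    static String demo() {
        List<Integer> counts = new ArrayList<>();
        List<String> audit = new ArrayList<>();
        FlushOnTimer function = new FlushOnTimer();
        function.buffer("a");
        function.buffer("b");
        // Stands in for the engine invoking the callback when the timer expires.
        function.onProcessingTimer(
                1_000L, c -> counts.add(c), line -> audit.add(line), new RuntimeContext() {});
        return counts + " | " + audit;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Since onProcessingTimer is a default method with an empty body, functions that never register timers do not need to override it.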

...