Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Note that the internal digest of an expression therefore might change between Flink versions after restoring from a JSON plan. However, the Web UI should be stable since we encode the operator description in the JSON plan.

Topology Versioning

ExecNode

We consider the ExecNode to Operator 1:n relationship in naming of Transformations. 

...

In future Flink versions, we might support that a new operator can subscribe to multiple uid()'s of previous operators during restore. This would make operator migration easier.

Connectors

For sources and sinks that might be composed of multiple operators, we need to propagate the uid suffix for providers. Currently, this is only necessary for DataStreamScanProvider and DataStreamSinkProvider.

// introduce new context
DataStreamSinkProvider.consumeDataStream(DataStream<RowData> dataStream, ProviderContext): DataStreamSink<?> // introduce new context
DataStreamScanProvider.produceDataStream(StreamExecutionEnvironment execEnv, ProviderContext): DataStream<RowData> // Creates one or more ExecNode-aware UIDs ProviderContext.generateUid(String): String // for example: // ProviderContext.generateUid("preprocessor-1") // leads to: // "13_stream-exec-sink-1_provider_preprocessor-1"

Testing Infrastructure

ExecNode Tests

...

We will have completeness tests that verify that every ExecNode version has corresponding tests via classpath scanning and annotation checking.

Connector / Format Tests

We can also use the testing infrastructure for connectors and formats. Esp. connector that use DataStreamSourceProvider and DataStreamSinkProvider need tests for correctly set uid(). This can be done via Completeness Tests.

Function Tests

For functions, we don't need annotations. All built-in functions are either declared in BuiltInFunctionDefinitions or FlinkSqlOperatorTable and we can use reflections for completeness tests, change detection tests, and restore tests.

...