...
Note that the internal digest of an expression therefore might change between Flink versions after restoring from a JSON plan. However, the Web UI should be stable since we encode the operator description in the JSON plan.
Topology Versioning
ExecNode
We consider the ExecNode to Operator 1:n relationship in naming of Transformations.
...
In future Flink versions, we might support that a new operator can subscribe to multiple uid()'s of previous operators during restore. This would make operator migration easier.
Connectors
For sources and sinks that might be composed of multiple operators, we need to propagate the uid suffix for providers. Currently, this is only necessary for DataStreamScanProvider and DataStreamSinkProvider.
// introduce new context
DataStreamSinkProvider.consumeDataStream(DataStream<RowData> dataStream, ProviderContext): DataStreamSink<?> // introduce new context
DataStreamScanProvider.produceDataStream(StreamExecutionEnvironment execEnv, ProviderContext): DataStream<RowData> // Creates one or more ExecNode-aware UIDs ProviderContext.generateUid(String): String // for example: // ProviderContext.generateUid("preprocessor-1") // leads to: // "13_stream-exec-sink-1_provider_preprocessor-1"
Testing Infrastructure
ExecNode Tests
...
We will have completeness tests that verify that every ExecNode version has corresponding tests via classpath scanning and annotation checking.
Connector / Format Tests
We can also use the testing infrastructure for connectors and formats. Esp. connector that use DataStreamSourceProvider and DataStreamSinkProvider need tests for correctly set uid(). This can be done via Completeness Tests.
Function Tests
For functions, we don't need annotations. All built-in functions are either declared in BuiltInFunctionDefinitions or FlinkSqlOperatorTable and we can use reflections for completeness tests, change detection tests, and restore tests.
...