Discussion thread: https://lists.apache.org/thread/gjgr3b3w2c379ny7on3khjgwyjp2gyq1
Vote thread: https://lists.apache.org/thread/vsn21hmmv1km2ybl3hq00p7nhrp1dnsw
JIRA

Release: 1.15

Motivation

This FLIP is closely related to FLIP-196, and the motivation given there applies here as well. Apart from the stability guarantees themselves, another problem has been visible over the past several Flink releases: aside from the initial set of @Public APIs, we have almost never graduated APIs (@Experimental → @PublicEvolving and @PublicEvolving → @Public), and so we have not given our users more reliable stability guarantees.

One reason for this pattern is that Flink developers would like to keep APIs moldable in case they are not perfect. At the beginning of the development of a new feature this makes sense. However, the community should also try to stabilize new APIs as soon as possible so that Flink users can rely on them. If we don't, users start building against @PublicEvolving and more weakly annotated APIs, which can lead to disappointment when those APIs change.

Proposed Changes

In order to improve the API graduation process (that is, to stabilize more APIs more quickly), I would like to pick up an idea that has been proposed here. The basic idea is to add a since entry to the annotation and to require that, after n releases, an API be graduated unless there is a very good reason not to. That reason needs to be documented explicitly so that its validity can be verified. This effectively inverts the graduation process from opt-in to opt-out with a good argument.

For the time until graduation (@Experimental → @PublicEvolving and @PublicEvolving → @Public), I propose a default of two release cycles per step. Assuming a 4-month release cadence, this gives us 8 months to reach the next stability level and 16 months in total to go from @Experimental to @Public. This might not be a lot of time, but it will force the community to focus on newly introduced APIs and lead to fewer half-baked stopgap solutions that become permanent.

Process-wise, we could add a test case that checks the stability annotations and compares the since field with the current version. If the graduation period counted from since has elapsed and the API has not been promoted, then there must be an entry for the current version that explains why.
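The check described above could look roughly like the following. This is a minimal sketch, not the proposed implementation: the nested Version enum is a hypothetical stand-in for FlinkVersion, the annotations are simplified copies with runtime retention, missedGraduations is given an empty default purely for brevity, and the class is considered due once two releases have passed since the since version.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

public class GraduationCheckSketch {

    // Hypothetical stand-in for FlinkVersion; declaration order is release order.
    public enum Version { V1_11, V1_12, V1_13, V1_14, V1_15 }

    // Default from the proposal: two release cycles per stability level.
    public static final int GRADUATION_PERIOD = 2;

    @Retention(RetentionPolicy.RUNTIME)
    public @interface GraduationMiss {
        Version graduation();
        String reason();
    }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface PublicEvolving {
        Version since();
        // Assumption: an empty default so that fresh APIs need no entry.
        GraduationMiss[] missedGraduations() default {};
    }

    /** Returns the simple names of overdue classes that do not explain the miss for `current`. */
    public static List<String> findViolations(Version current, Class<?>... candidates) {
        List<String> violations = new ArrayList<>();
        for (Class<?> candidate : candidates) {
            PublicEvolving annotation = candidate.getAnnotation(PublicEvolving.class);
            if (annotation == null) {
                continue; // not annotated, nothing to check
            }
            int releasesSince = current.ordinal() - annotation.since().ordinal();
            if (releasesSince < GRADUATION_PERIOD) {
                continue; // still within the graduation period
            }
            boolean explained = false;
            for (GraduationMiss miss : annotation.missedGraduations()) {
                if (miss.graduation() == current && !miss.reason().trim().isEmpty()) {
                    explained = true;
                }
            }
            if (!explained) {
                violations.add(candidate.getSimpleName());
            }
        }
        return violations;
    }

    @PublicEvolving(since = Version.V1_14)
    public static class Fresh {}

    @PublicEvolving(
            since = Version.V1_11,
            missedGraduations = @GraduationMiss(graduation = Version.V1_15, reason = "redesign pending"))
    public static class Explained {}

    @PublicEvolving(since = Version.V1_11)
    public static class Overdue {}

    public static void main(String[] args) {
        // prints [Overdue]
        System.out.println(findViolations(Version.V1_15, Fresh.class, Explained.class, Overdue.class));
    }
}
```

A real test would presumably discover annotated classes by scanning the classpath rather than taking them as arguments; that part is left out here.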

In addition to this process change, I also propose that we go over all user-facing APIs once and try to update their stability guarantees. I am sure that we will find a couple of APIs that should have been marked @Public long ago.

Extended annotation

In order to support automatic reminders about graduation candidates, we could change the stability annotation in the following way:

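import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

// Marks an evolving API and records since when it has carried this annotation.
@Target(ElementType.TYPE)
public @interface PublicEvolving {

   // Release since which the API has carried this stability level.
   FlinkVersion since();

   // One entry per release in which the API was due for graduation but was not promoted.
   GraduationMiss[] missedGraduations();
}

// Documents why a graduation did not happen in a given release.
public @interface GraduationMiss {
   FlinkVersion graduation();

   String reason();
}

// Usage
@PublicEvolving(
       since = FlinkVersion.V1_11_0,
       missedGraduations = {
           @GraduationMiss(graduation = FlinkVersion.V1_13_0, reason = "foobar"),
           @GraduationMiss(graduation = FlinkVersion.V1_14_0, reason = "barfoo")
       })
public class Foobar {}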


Compatibility, Deprecation, and Migration Plan

  • We would have to replace the existing stability annotations with the new ones.

Test Plan

  • We need a test that fails if an API object has missed its graduation without a good reason.
    • In order to not fail right away when introducing a new FlinkVersion, we could add an isStable field that activates the checks for this version.
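One possible reading of the isStable idea is sketched below. The Version enum and the enforcedVersion helper are hypothetical and not part of the proposal; the assumption is that graduation checks run against the most recent version whose flag is set, so that merely adding a new version constant does not break the build.

```java
public class FlinkVersionSketch {

    // Hypothetical stand-in for FlinkVersion with the proposed isStable flag.
    public enum Version {
        V1_13(true),
        V1_14(true),
        V1_15(false); // newly introduced, checks not yet activated

        private final boolean stable;

        Version(boolean stable) {
            this.stable = stable;
        }

        public boolean isStable() {
            return stable;
        }
    }

    /** Returns the latest version whose graduation checks are active, or null if there is none. */
    public static Version enforcedVersion() {
        Version latest = null;
        for (Version version : Version.values()) {
            if (version.isStable()) {
                latest = version;
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        // prints V1_14
        System.out.println(enforcedVersion());
    }
}
```

Flipping the flag on V1_15 to true would then make the graduation test start comparing since fields against 1.15.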

Follow ups

  • We should also add a similar process for deprecation. Transitivity would go in the opposite direction, by inspecting usages. It would need a reason and a target release for removal.
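As an illustration only, a deprecation counterpart could carry the two pieces of information named above. The PlannedRemoval annotation, its members, and the Version enum below are all hypothetical names invented for this sketch; the FLIP does not define them, and the usage inspection mentioned above is not covered.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class DeprecationSketch {

    // Hypothetical stand-in for FlinkVersion; declaration order is release order.
    public enum Version { V1_15, V1_16, V1_17 }

    // Hypothetical annotation: records why an API is deprecated and when it should be removed.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface PlannedRemoval {
        Version removeIn();
        String reason();
    }

    /** Returns true if the class is marked for removal in `current` or an earlier release. */
    public static boolean removalDue(Class<?> candidate, Version current) {
        PlannedRemoval planned = candidate.getAnnotation(PlannedRemoval.class);
        return planned != null && current.compareTo(planned.removeIn()) >= 0;
    }

    @Deprecated
    @PlannedRemoval(removeIn = Version.V1_17, reason = "superseded by a newer API")
    public static class OldApi {}

    public static void main(String[] args) {
        System.out.println(removalDue(OldApi.class, Version.V1_16)); // prints false
        System.out.println(removalDue(OldApi.class, Version.V1_17)); // prints true
    }
}
```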

Rejected Alternatives