  • JobResultStoreOptions#STORAGE_PATH: This parameter defines the storage path of the JobResultEntry files. To align it with similar configuration parameters, the default would be set to ${HighAvailabilityOptions#HA_STORAGE_PATH}/job-results-store/${HighAvailabilityOptions#HA_CLUSTER_ID}
  • JobResultStoreOptions#DELETE_ON_COMMIT: This parameter reproduces the current behavior, i.e. it deletes the entries as soon as the cleanup is finalized and committed. It would be enabled by default, so that no resources are left behind and no manual cleanup is required. As a consequence, enabling this flag also weakens the guarantees on failover stability.
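
The two parameters above can be illustrated with a minimal sketch. It is not Flink code: the class JobResultStoreSketch, its methods, and the in-memory map are hypothetical names introduced only for illustration, and the dirty/clean entry lifecycle is an assumption about how entries are committed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Illustrative sketch only; none of these names are Flink APIs. */
final class JobResultStoreSketch {

    private JobResultStoreSketch() {}

    /** Builds the proposed default: HA storage path + "/job-results-store/" + cluster ID. */
    static String defaultStoragePath(String haStoragePath, String haClusterId) {
        Objects.requireNonNull(haStoragePath, "haStoragePath");
        Objects.requireNonNull(haClusterId, "haClusterId");
        // Strip trailing slashes so the joined path never contains "//".
        String base = haStoragePath.replaceAll("/+$", "");
        return base + "/job-results-store/" + haClusterId;
    }

    /** An explicitly configured path takes precedence over the derived default. */
    static String resolveStoragePath(
            String configuredPath, String haStoragePath, String haClusterId) {
        if (configuredPath != null && !configuredPath.isEmpty()) {
            return configuredPath;
        }
        return defaultStoragePath(haStoragePath, haClusterId);
    }

    /** Hypothetical in-memory stand-in for the file-based store. */
    static final class InMemoryStore {
        private final boolean deleteOnCommit;
        // Maps a job ID to true (clean, cleanup committed) or false (dirty).
        private final Map<String, Boolean> cleanByJobId = new HashMap<>();

        InMemoryStore(boolean deleteOnCommit) {
            this.deleteOnCommit = deleteOnCommit;
        }

        /** Assumed lifecycle: an entry starts out dirty, before cleanup has run. */
        void createDirtyEntry(String jobId) {
            cleanByJobId.put(jobId, false);
        }

        /** Commits the cleanup: the entry is deleted or kept as clean, depending on the flag. */
        void markAsClean(String jobId) {
            if (!cleanByJobId.containsKey(jobId)) {
                throw new IllegalStateException("No entry for job " + jobId);
            }
            if (deleteOnCommit) {
                cleanByJobId.remove(jobId);
            } else {
                cleanByJobId.put(jobId, true);
            }
        }

        boolean hasEntry(String jobId) {
            return cleanByJobId.containsKey(jobId);
        }
    }
}
```

In this sketch, a store created with deleteOnCommit set to true retains nothing once the cleanup is committed, so a process restarted afterwards finds no record that the job already finished. That is one way to read the weaker failover guarantees mentioned above.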


Compatibility, Deprecation, and Migration Plan

This change removes the “semi-public” RunningJobsRegistry. This only affects users with custom implementations of HA services. Regular users relying on the ZooKeeper or Kubernetes implementations of HA services (which are provided by Flink) won’t notice the change.

Test Plan

The proposed change should be tested with unit, integration, and end-to-end (e2e) tests covering common failover scenarios.

Rejected Alternatives

Other Implementations of JobResultStore

Other HA-specific implementations, such as a KubernetesJobResultStore or a ZooKeeperJobResultStore, were not considered for the following reasons:

  • Both HA implementations (ZooKeeper and Kubernetes) are bundled with an object store (e.g. S3, GCS) that holds the actual data. The metadata was stored in ZooKeeper or Kubernetes ConfigMaps because they support read-after-write consistency guarantees, which S3, for instance, did not provide in the past. Nowadays, object stores provide this guarantee.
  • The file-based approach makes it easier for a third party to clean up job results.
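
To illustrate the second point, the following sketch shows how an external tool could clean up job results with plain file operations. It is only a sketch under assumptions: the class JobResultCleanupSketch and its method are hypothetical, entries are assumed to be regular files directly inside one directory, and a local file system stands in for an object store.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;

/** Illustrative sketch only; not part of Flink. */
final class JobResultCleanupSketch {

    private JobResultCleanupSketch() {}

    /**
     * Deletes every regular file in the given directory whose last-modified time
     * is before the cutoff, and returns the number of deleted files.
     */
    static int deleteEntriesOlderThan(Path storageDir, Instant cutoff) {
        int deleted = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(storageDir)) {
            for (Path entry : entries) {
                boolean isOld =
                        Files.getLastModifiedTime(entry).toInstant().isBefore(cutoff);
                if (Files.isRegularFile(entry) && isOld) {
                    Files.delete(entry);
                    deleted++;
                }
            }
        } catch (IOException e) {
            // Surface I/O problems without forcing callers to handle a checked exception.
            throw new UncheckedIOException(e);
        }
        return deleted;
    }
}
```

A job scheduled outside of Flink could call such a method periodically with a retention cutoff. With an object store, the equivalent would typically be done with the store's own tooling.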
