Status

Current state: Under Discussion

Discussion thread: here (<- link to https://mail-archives.apache.org/mod_mbox/flink-dev/)

JIRA: [FLINK-29110] Support to mount a dynamically-created pvc for JM and TM in standalone mode with StatefulSet.

...

Page properties


Discussion thread
Vote thread
JIRA: FLINK-29110

Release


Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

When flink-kubernetes-operator deploys a Flink cluster in standalone mode[1], it currently creates both the JobManager and the TaskManagers as Kubernetes Deployments.

...

Deploying the JobManager and TaskManagers as StatefulSets instead of Deployments lets Kubernetes automatically create and mount a PVC for each JobManager and TaskManager pod, and preserves the binding between each PVC and its pod[2].

Public Interfaces

The public interface is the FlinkDeployment custom resource descriptor (CRD), see below.

Proposed Changes

FlinkDeployment CRD

CR example with volumeClaimTemplate:

kind: FlinkDeployment
metadata:
  namespace: default
  name: basic-example
spec:
  image: flink:1.14.3
  flinkVersion: v1_14
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  jobManager:
    replicas: 1
    resource:
      memory: "2048m"
      cpu: 1
    volumeClaimTemplates: # only needed for standalone clusters
      - metadata:
          name: log
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "lvm"
          resources:
            requests:
              storage: 10Gi
    podTemplate:
      apiVersion: v1
      kind: Pod
      metadata:
        name: job-manager-pod-template
      spec:
        containers:
          - name: flink-main-container
            volumeMounts:
              - name: log
                mountPath: /opt/flink/log
  taskManager:
    replicas: 4 # only needed for standalone clusters
    resource:
      memory: "2048m"
      cpu: 1
    volumeClaimTemplates: # only needed for standalone clusters
      - metadata:
          name: log
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "lvm"
          resources:
            requests:
              storage: 10Gi
    podTemplate:
      apiVersion: v1
      kind: Pod
      metadata:
        name: task-manager-pod-template
      spec:
        containers:
          - name: flink-main-container
            volumeMounts:
              - name: log
                mountPath: /opt/flink/log
  mode: standalone 

...

TaskManagerSpec.java:

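import io.fabric8.kubernetes.api.model.PersistentVolumeClaim;
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.model.annotation.SpecReplicas;

import java.util.ArrayList;
import java.util.List;

// Resource is assumed to be the operator's own spec class in the same package.
public class TaskManagerSpec {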
    /** Resource specification for the TaskManager pods. */
    private Resource resource;

    /** Number of TaskManager replicas. If defined, takes precedence over parallelism. */
    @SpecReplicas private Integer replicas;

    /**
     * Volume claim templates for the TaskManager StatefulSet. They are used to mount custom PVCs
     * and apply to standalone mode only.
     */
    private List<PersistentVolumeClaim> volumeClaimTemplates = new ArrayList<>();

    /** TaskManager pod template. It will be merged with FlinkDeploymentSpec.podTemplate. */
    private Pod podTemplate;
}

StandaloneFlinkService

To support mounting dynamically created PVCs, the operator deploys the Flink JM and TM as StatefulSets instead of Deployments, which maintains a one-to-one correspondence between PVCs and pods.

All cluster creation and deletion logic that previously operated on Deployment resources now operates on StatefulSet resources through the fabric8 Kubernetes client.
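
The one-to-one correspondence relies on the naming scheme Kubernetes applies to StatefulSets: pods are named `<statefulSetName>-<ordinal>`, and each claim created from a volume claim template is named `<templateName>-<podName>`. The following is a minimal sketch of that scheme only, not operator code. The class `StatefulSetPvcNames` and the StatefulSet name `basic-example-taskmanager` are hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mirrors the Kubernetes StatefulSet naming scheme, not operator code. */
final class StatefulSetPvcNames {

    /** StatefulSet pods are named "<statefulSetName>-<ordinal>". */
    static String podName(String statefulSetName, int ordinal) {
        return statefulSetName + "-" + ordinal;
    }

    /** Claims created from a template are named "<templateName>-<podName>". */
    static String pvcName(String templateName, String statefulSetName, int ordinal) {
        return templateName + "-" + podName(statefulSetName, ordinal);
    }

    /** Lists the claim names for ordinals 0 .. replicas - 1. */
    static List<String> pvcNames(String templateName, String statefulSetName, int replicas) {
        List<String> names = new ArrayList<>();
        for (int ordinal = 0; ordinal < replicas; ordinal++) {
            names.add(pvcName(templateName, statefulSetName, ordinal));
        }
        return names;
    }

    public static void main(String[] args) {
        // "basic-example-taskmanager" is a hypothetical StatefulSet name (assumption).
        pvcNames("log", "basic-example-taskmanager", 4).forEach(System.out::println);
    }
}
```

Because the claim name is derived from the pod's stable ordinal, a pod that is killed and recreated with the same ordinal is bound to the same claim again.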

Compatibility, Deprecation, and Migration Plan

The volumeClaimTemplates field of the CRD may be null, which keeps the CRD compatible with release 1.1.0 and earlier versions.
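
One way to honor this on the operator side is to treat a missing field the same as an empty list, so that custom resources written for earlier releases behave as before. The sketch below shows that idea under stated assumptions. The class `VolumeClaimTemplateCompat` and its methods are hypothetical helpers, not part of the operator, and the element type is left generic instead of depending on the fabric8 `PersistentVolumeClaim` class.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative only: hypothetical null-safe handling of an optional template list. */
final class VolumeClaimTemplateCompat {

    /** Returns an empty list when the field is absent, so callers need no null checks. */
    static <T> List<T> orEmpty(List<T> templates) {
        return templates == null ? Collections.<T>emptyList() : templates;
    }

    /** A spec with no templates (null or empty) requests no per-pod PVCs. */
    static boolean requestsPersistentVolumeClaims(List<?> templates) {
        return !orEmpty(templates).isEmpty();
    }

    public static void main(String[] args) {
        // A CR from an earlier release carries no volumeClaimTemplates field at all.
        System.out.println(requestsPersistentVolumeClaims(null));
        // A CR that declares one template, represented here by a placeholder string.
        System.out.println(requestsPersistentVolumeClaims(Arrays.asList("log")));
    }
}
```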

Test Plan

We can test dynamic PVC creation by creating a Flink standalone cluster in a real Kubernetes cluster, then killing one TaskManager pod and waiting for it to recover and successfully mount its previously existing PVC.

...

After the CR is deleted, all created PVCs are retained; they can then be deleted manually and permanently.

Rejected Alternatives

Using a ReadWriteMany PVC shared by all TM pods in the current native or standalone mode.

Or using another operator, such as flink-on-k8s-operator, to mount a one-to-one PVC for each TM.

References

...