...

  • Give users full control over k8s pod template (no mapping/whitelisting)
  • Layering/merging of pod templates (the operator itself could also apply cluster-wide defaults)

kind: FlinkDeployment
metadata:
  name: flink-wordcount-example
  namespace: flink-operator
  annotations:
  labels:
    environment: development
spec:
  image: example:latest
  flinkVersion: "1.14"
  flinkConfiguration:
    # everything for flink-conf.yaml
    state.savepoints.dir: file:///checkpoints/flink/savepoints
  podTemplate:
    # everything that a k8s pod template supports
    # defaults for both job and task manager pods
  jobManager:
    resources:
      requests:
        memory: "200Mi"
        cpu: "0.1"
    replicas: 1
    podTemplate:
      # everything that a k8s pod template supports,
      # layered over common template
  taskManager:
    taskSlots: 2
    resources:
      requests:
        memory: "200Mi"
        cpu: "0.1"
    podTemplate:
      # everything that a k8s pod template supports,
      # layered over common template
  # job can be omitted for a plain session cluster
  job:
    jarURI: "file:///wordcount-operator-example-1.0.0-SNAPSHOT.jar"
    parallelism: 3
    entryClass: "org.apache.flink.WordCount"
    args: ""
    cancelMode: savepoint   # one of: savepoint, none
    restoreMode: savepoint  # one of: savepoint, checkpoint, none
  logging:
    # customize logging config in flink container
  status:
    # ...
    # information about the actual state,
    # including latest savepoint/checkpoint etc.
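
To make the "layered over common template" idea concrete, here is a minimal sketch of one possible merge rule, assuming pod templates are handled as nested maps: nested maps merge key by key, and every other value (scalars, lists) in the more specific template replaces the common one. The class PodTemplateLayering, the merge method and the sample keys are hypothetical and only illustrate the concept. The actual merge semantics of the operator are not specified here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Flink or the operator): layers one pod template over another. */
final class PodTemplateLayering {

    private PodTemplateLayering() {}

    /**
     * Returns a new map in which values from overlay take precedence over base.
     * Nested maps are merged recursively; any other value (scalar, list) is replaced wholesale.
     * This replace-on-conflict rule is an assumption made for illustration.
     */
    static Map<String, Object> merge(Map<String, Object> base, Map<String, Object> overlay) {
        Map<String, Object> result = new LinkedHashMap<>(base);
        for (Map.Entry<String, Object> entry : overlay.entrySet()) {
            Object baseValue = result.get(entry.getKey());
            Object overlayValue = entry.getValue();
            if (baseValue instanceof Map && overlayValue instanceof Map) {
                // Both sides are maps: merge them key by key.
                result.put(entry.getKey(), merge(asMap(baseValue), asMap(overlayValue)));
            } else {
                // Otherwise the more specific template wins.
                result.put(entry.getKey(), overlayValue);
            }
        }
        return result;
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object value) {
        return (Map<String, Object>) value;
    }

    public static void main(String[] args) {
        // Common template: defaults for both job and task manager pods (sample values).
        Map<String, Object> common = Map.of(
                "metadata", Map.of("labels", Map.of("team", "data")),
                "spec", Map.of("serviceAccountName", "flink"));
        // Task manager template, layered over the common one (sample values).
        Map<String, Object> taskManager = Map.of(
                "metadata", Map.of("labels", Map.of("component", "taskmanager")),
                "spec", Map.of("nodeSelector", Map.of("disk", "ssd")));

        // The result keeps the common labels and service account and adds the
        // task manager specific label and node selector.
        System.out.println(merge(common, taskManager));
    }
}
```

Under this rule, the same method could apply cluster-wide operator defaults first, then the common podTemplate, then the jobManager or taskManager template, by chaining calls from least to most specific.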

Java Operator SDK

The Flink operator should be built using the java-operator-sdk. The Java Operator SDK is the state-of-the-art approach for building a Kubernetes operator in Java. Like Flink, it uses the Fabric8 k8s client, and it is open source under the Apache 2.0 license.

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users? 
  • If we are changing behavior, how will we phase out the older behavior?
  • If we need special migration tools, describe them here.
  • When will we remove the existing behavior?

Strictly speaking, no migration is necessary because this is a completely new standalone component. Compatibility remains to be seen and will depend on any changes required to the Flink Kubernetes integration.

Test Plan

Describe in a few sentences how the FLIP will be tested. We are mostly interested in system tests (since unit tests are specific to implementation details). How will we know that the implementation works as expected? How will we know nothing broke?

Rejected Alternatives

...

Using Go to implement the operator

While Go is often a natural fit for implementing k8s operators, and there are already some open-source examples of Flink operators implemented in Go, we still feel that Java is more suitable for this new component.

Main reasons for choosing Java over Go

  • Direct access to Flink Client libraries for submitting and managing jobs and for handling errors
  • Most Flink developers have strong Java experience, while there are only a few Go experts
  • Easier to integrate with existing build system and tooling
  • Required k8s clients and tools for building an operator are also available in Java

References