...

CRD to express Flink application (for details see CRD CR example section below)
- External jar artifact fetcher support (s3, https etc.) via init container
  - similar to https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#pod-template
- creates an empty session cluster, no application/job management
- the session cluster can be used to control jobs externally (like submission via REST API)
- Supports all Flink configuration properties
- Docker image
- Upgrade policy (savepoint, stateless)
- Restore policy (savepoint, latest externalized checkpoint, stateless)
- Pod template for jobmanager and taskmanager
  - full control over k8s pod template (unrestricted k8s pod configuration, no mapping/whitelisting)
  - layering/merging of pod templates (operator itself could also apply cluster wide defaults)
- Support explicit session cluster (no job management) and application mode
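The layering of pod templates mentioned above could work roughly as follows. This is only a sketch under assumptions: the class `PodTemplateLayering` and its `merge` method are hypothetical names, and pod templates are modeled as plain nested maps rather than Kubernetes client types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: layers one pod template (modeled as a nested map) over another. */
final class PodTemplateLayering {

    private PodTemplateLayering() {}

    /**
     * Returns a new map holding base overlaid with override.
     * Nested maps are merged recursively; any other override value replaces the base value.
     */
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> base, Map<String, Object> override) {
        Map<String, Object> result = new LinkedHashMap<>(base);
        for (Map.Entry<String, Object> entry : override.entrySet()) {
            Object baseValue = result.get(entry.getKey());
            Object overrideValue = entry.getValue();
            if (baseValue instanceof Map && overrideValue instanceof Map) {
                // Both sides are maps: descend and merge key by key.
                result.put(entry.getKey(),
                        merge((Map<String, Object>) baseValue, (Map<String, Object>) overrideValue));
            } else {
                // Scalars and lists: the more specific layer wins.
                result.put(entry.getKey(), overrideValue);
            }
        }
        return result;
    }
}
```

With this approach the common template would be merged first with cluster-wide defaults and then with the jobmanager or taskmanager template. Lists such as containers are replaced wholesale in this sketch; merging them by name would be a separate design decision.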
Create & deploy new Flink application
- Empty state
- From savepoint
Upgrade Flink application with or without savepoint on any CR change, including:
- Flink configuration change
- Job jar change
- Docker image change
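One way to picture the upgrade flow is a function that compares the last reconciled spec with the desired spec and picks an action from the upgrade policy. The sketch below is illustrative only: `UpgradeSketch`, `UpgradePolicy`, and `Action` are hypothetical names, and the spec is treated as an opaque object compared by equality.

```java
import java.util.Objects;

/** Hypothetical sketch of how a spec change could be turned into an upgrade action. */
final class UpgradeSketch {

    /** Mirrors the upgrade policy options named in this proposal (savepoint, stateless). */
    enum UpgradePolicy { SAVEPOINT, STATELESS }

    /** Illustrative actions; the real operator may structure this differently. */
    enum Action { NONE, SUSPEND_WITH_SAVEPOINT_THEN_REDEPLOY, CANCEL_THEN_REDEPLOY }

    private UpgradeSketch() {}

    static Action plan(Object lastReconciledSpec, Object desiredSpec, UpgradePolicy policy) {
        if (Objects.equals(lastReconciledSpec, desiredSpec)) {
            // Nothing changed in the CR spec, so the running job is left alone.
            return Action.NONE;
        }
        // Any change (configuration, jar, image) triggers a redeploy per the chosen policy.
        return policy == UpgradePolicy.SAVEPOINT
                ? Action.SUSPEND_WITH_SAVEPOINT_THEN_REDEPLOY
                : Action.CANCEL_THEN_REDEPLOY;
    }
}
```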
Pause/Resume Flink application
- the job will not continue its data processing
- the job will not be deleted from the cluster
- the job will release its resources back to the cluster (can be used by other jobs)
- Stops job with savepoint, tracks savepoint/last checkpoint in CR status for resume.
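Resuming from the state tracked in the CR status might look like the following. This is a sketch under assumptions: `ResumeSketch`, `Status`, and the field names are hypothetical, and the modes follow the restore options listed in this proposal (savepoint, checkpoint, none).

```java
import java.util.Optional;

/** Hypothetical sketch: choose the restore path for a resumed job from tracked status. */
final class ResumeSketch {

    enum RestoreMode { SAVEPOINT, CHECKPOINT, NONE }

    /** Assumed subset of the CR status; either field may be null if nothing was recorded. */
    static final class Status {
        final String lastSavepointPath;
        final String lastCheckpointPath;

        Status(String lastSavepointPath, String lastCheckpointPath) {
            this.lastSavepointPath = lastSavepointPath;
            this.lastCheckpointPath = lastCheckpointPath;
        }
    }

    private ResumeSketch() {}

    /** Returns the path to restore from, or empty when the job should start without state. */
    static Optional<String> restorePath(RestoreMode mode, Status status) {
        switch (mode) {
            case SAVEPOINT:
                return Optional.ofNullable(status.lastSavepointPath);
            case CHECKPOINT:
                return Optional.ofNullable(status.lastCheckpointPath);
            default:
                return Optional.empty();
        }
    }
}
```

An empty result in savepoint or checkpoint mode means no state was recorded; whether that should fail the resume or fall back to an empty state is left open here.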
Delete Flink application
Integrate with Flink Kubernetes HA module [4]
- When selected, operator can obtain latest checkpoint from config map and does not depend on a potentially unavailable Flink job REST API
- This should be the default, but not a hard dependency
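Reading the latest checkpoint from the HA config map could be sketched as below. The key layout is an assumption, not something this proposal specifies: the sketch assumes entries whose keys are `checkpointID-` followed by a number, and the real layout should be verified against the Flink version in use. The config map is modeled as a plain `Map<String, String>`; `HaConfigMapSketch` is a hypothetical name.

```java
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical sketch: find the highest checkpoint ID among config map entries. */
final class HaConfigMapSketch {

    /** Assumed key prefix for checkpoint entries; verify against the actual HA config map. */
    static final String ASSUMED_KEY_PREFIX = "checkpointID-";

    private HaConfigMapSketch() {}

    static OptionalLong latestCheckpointId(Map<String, String> configMapData) {
        return configMapData.keySet().stream()
                .filter(key -> key.startsWith(ASSUMED_KEY_PREFIX))
                .map(key -> key.substring(ASSUMED_KEY_PREFIX.length()))
                // Skip entries whose suffix is not a plain number.
                .filter(suffix -> suffix.matches("\\d+"))
                .mapToLong(Long::parseLong)
                .max();
    }
}
```

Because the lookup only touches config map data, it would not depend on the Flink job REST API being reachable.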
Support Flink UI ingress
CI/CD with operator Docker image artifact; publish the image on Docker Hub

...

In the long run it might make sense to support both deployment modes in the operator. Initially, however, we should focus the development effort on a single approach, for example starting with support for [2], since we could reuse its code in a Java-based implementation.

CRD

...

CR Example

kind: FlinkDeployment
metadata:
  name: flink-wordcount-example
  namespace: flink-operator
  annotations:
  labels:
    environment: development
spec:
  image: example:latest
  flinkVersion: "1.14"
  flinkConfiguration:
    # everything for flink-conf.yaml
    state.savepoints.dir: file:///checkpoints/flink/savepoints
  podTemplate:
    # everything that a k8s pod template supports
    # defaults for both, job and task manager pods
  jobManager:
    resources:
      requests:
        memory: "200Mi"
        cpu: "0.1"
    replicas: 1
    podTemplate:
      # everything that a k8s pod template supports
      # layered over common template
  taskManager:
    taskSlots: 2
    resources:
      requests:
        memory: "200Mi"
        cpu: "0.1"
    podTemplate:
      # everything that a k8s pod template supports,
      # layered over common template
  # job can be optional for plain session cluster
  job:
    jarURI: "file:///wordcount-operator-example-1.0.0-SNAPSHOT.jar"
    parallelism: 3
    entryClass: "org.apache.flink.WordCount"
    args: ""
    cancelMode: (savepoint, none)
    restoreMode: (savepoint, checkpoint, none)
  logging:
    # customize logging config in flink container
status:
  ...
  # information about the actual state
  # including latest savepoint/checkpoint etc.

...

The Flink operator should be built using the java-operator-sdk. The Java Operator SDK is the state-of-the-art approach for building a Kubernetes operator in Java. It uses the Fabric8 k8s client, as Flink does, and it is open source under the Apache 2.0 license.

...
