...

Subsequently, all other places would have to be adjusted to rely on or refer to the apache/flink-docker code or docs. Eventually, all other purely Docker-related places can be removed completely: flink-contrib/docker-flink, flink-container/docker and docs/ops/deployment/docker.md.

Docker utils

A script (or family of scripts) can be added to apache/flink-docker to implement various standard actions which usually have to run in a Dockerfile or a container entry point. Some examples:

  • for Dockerfile:
    • install Flink (version, scala version)
          RUN flink_docker_utils install_flink --flink-version 0.10.0 --scala-version 2.11
  • for entry point:
    • start JM (job/session) or TM

...


    • set a configuration option in flink-conf.yaml
          flink_docker_utils configure "option.name" "value"
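
As an illustration of what the configure action could do internally, here is a minimal sketch. The function name set_config_option and its argument order are hypothetical and not part of the proposal. It also assumes that flink-conf.yaml is a flat file with one "key: value" entry per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the proposal): set "key: value" in a flat
# YAML config file. An existing entry for the key is dropped and the new
# entry is appended, so the key ends up in the file exactly once.
set_config_option() {
  local conf_file="$1" key="$2" value="$3"
  local tmp
  touch "$conf_file"
  tmp="$(mktemp)"
  # Keep every line that does not start with "<key>:" (plain prefix match).
  awk -v k="${key}:" 'index($0, k) != 1' "$conf_file" > "$tmp"
  printf '%s: %s\n' "$key" "$value" >> "$tmp"
  # Write back in place so that owner and permissions of the file are kept.
  cat "$tmp" > "$conf_file"
  rm -f "$tmp"
}

# Example call (the path is an assumption about the image layout):
#   set_config_option /opt/flink/conf/flink-conf.yaml "option.name" "value"
```

Appending instead of editing in place keeps the sketch short. A real implementation may prefer to preserve the position of the entry and any comments next to it.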

Entry point

The default entry point script can accept a command to start a JM (job/session) or a TM. Additionally, users can customise how the process is started, e.g. by setting environment variables. This would already allow users to run the default Docker image in various modes without creating a custom image, like:

...

  • the mainstream docker hub image
  • how to run it in various modes 
    • session JM
    • single job JM (plus packing job artifacts)
    • TM
    • other options
    • environment variables
      • FLINK_PROPERTIES to add more Flink config options to flink-conf.yaml (once Flink supports configuration via environment variables, we can consider deprecating FLINK_PROPERTIES)
      • ENABLE_BUILT_IN_PLUGINS
      • Custom jar paths (pointing e.g. to custom locations in mounted docker volumes)
      • Custom logging conf
  • how to extend it to build a custom image
    • install python/hadoop
    • install optional /lib dependencies from /opt or externally
    • install /plugins
    • add user job jar for single job mode
  • add docs with examples of running compose/swarm
    • give script examples (mention job/session)
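
To make the run modes and the FLINK_PROPERTIES variable from the list above more concrete, here is a rough sketch of the relevant parts of an entry point script. The command names (jobmanager, standalone-job, taskmanager), the CONF_FILE default and both function names are assumptions for illustration. The actual entry point may be structured differently.

```shell
#!/usr/bin/env bash
# Sketch only: the default path and the command names below are assumptions.
CONF_FILE="${CONF_FILE:-/opt/flink/conf/flink-conf.yaml}"

# Append the content of FLINK_PROPERTIES (one "key: value" per line)
# to flink-conf.yaml, if the variable is set and not empty.
apply_flink_properties() {
  if [ -n "${FLINK_PROPERTIES:-}" ]; then
    printf '%s\n' "${FLINK_PROPERTIES}" >> "${CONF_FILE}"
  fi
}

# Print the Flink start script (run in the foreground) for a given command.
# Returns a non-zero status for an unknown command.
resolve_command() {
  case "$1" in
    jobmanager)     echo "jobmanager.sh start-foreground" ;;      # session JM
    standalone-job) echo "standalone-job.sh start-foreground" ;;  # single job JM
    taskmanager)    echo "taskmanager.sh start-foreground" ;;     # TM
    *)              echo "unknown command: $1" >&2; return 1 ;;
  esac
}

# A real entry point would then do roughly the following:
#   apply_flink_properties
#   exec "$FLINK_HOME/bin/"$(resolve_command "$1") "${@:2}"
```

With such an entry point, the same image can be started in any of the listed modes by passing a different command, and extra options can be supplied through FLINK_PROPERTIES without building a custom image.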

Also, existing docs/scripts for other components relying on docker images have to be adjusted to reference and adopt approaches described in the docker docs of apache/flink-docker. Eventually, docker/compose/swarm/bluemix scripts can be removed in favour of examples in docs (discuss with community).

Logging

Currently, if the deployment scripts start the Flink process in the foreground, the logs are written only to the console and nothing is appended to the usual local log files. Logging to the console makes sense when running a Docker container, as this is the usual Docker way. The problem is that the web UI cannot display the logs because it relies on those local files. Here we have two options:

...

We can modify the log4j-console.properties to also output logs into the usual files when the Flink process is started in the foreground.

...

Here we also have to check that this works well with the container space limits (rolling files of a fixed size etc).

Custom logging

We could also provide an environment variable which contains custom logging properties. The entry point script would use it to overwrite the logging properties file of the base image, similar to FLINK_PROPERTIES for config options in flink-conf.yaml.

Another option is to expose an environment variable pointing to another location of logging properties, e.g. in a mounted volume.

Start a background Flink process and output logs from the files to the console, like in the Kubernetes standalone session doc example.
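
The two custom logging variants of this section (inline properties and a custom file location) could be handled in the entry point roughly as sketched below. The variable names FLINK_LOGGING_PROPERTIES and FLINK_LOGGING_CONF, the function name and the default path are all hypothetical. The proposal does not fix any of them.

```shell
#!/usr/bin/env bash
# Sketch only: all names and the default path below are assumptions.
DEFAULT_LOG_CONF="${DEFAULT_LOG_CONF:-/opt/flink/conf/log4j-console.properties}"

# Print the path of the logging properties file that should be used.
prepare_logging_conf() {
  if [ -n "${FLINK_LOGGING_CONF:-}" ] && [ -f "${FLINK_LOGGING_CONF}" ]; then
    # Variant 2: use a file at a custom location, e.g. in a mounted volume.
    echo "${FLINK_LOGGING_CONF}"
  else
    if [ -n "${FLINK_LOGGING_PROPERTIES:-}" ]; then
      # Variant 1: overwrite the file of the base image with inline content.
      printf '%s\n' "${FLINK_LOGGING_PROPERTIES}" > "${DEFAULT_LOG_CONF}"
    fi
    echo "${DEFAULT_LOG_CONF}"
  fi
}
```

In this sketch the custom location takes precedence over the inline properties. This ordering is a design choice of the example, not something the proposal prescribes.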

Compatibility, Deprecation, and Migration Plan

In general, there should be no compatibility issues because the mainstream Docker Hub image is not supposed to change a lot. The Docker utility scripts and the new docs are mostly a refactoring and an extension of what already exists.

As discussed on the mailing list, once the user documentation is good enough, we are going to remove the existing docker/compose/swarm/bluemix scripts in:

Implementation steps

  • Document the official docker hub image and examples of how to run it (as of now)
  • Document examples of how to extend the official docker hub image (as of now)
  • Extend entry point script with job cluster mode and user job artefacts, document it
  • Modify the log4j-console.properties to also output logs into the files for WebUI
  • Make custom logging properties possible
  • Remove the existing docker/compose/swarm/bluemix scripts in flink-contrib/docker-flink and flink-container/docker

Test Plan

Initially, testing will be mostly manual. Later, we can consider more extensive Docker CI tests.

...