Status

Current state: "Under Discussion"
Discussion thread:
Vote thread:
JIRA: FLINK-17160
Release: 1.11
...


Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Table of Contents

Motivation

The integration with docker in Flink is currently addressed in many places which often intersect, repeat each other or apply different approaches. This makes the whole topic hard to follow for users and maintainers. This FLIP suggests how to unify it: have one place which contains the Dockerfile, all necessary scripts and docs, kept consistent with each other and free of repetitions or contradictions.

...

There are also custom Dockerfiles for test and development purposes. They are kept out of the scope of this FLIP.

...

The changes can affect the entry point of the existing docker hub image. Additionally, a custom Dockerfile entry point and a custom user Dockerfile, which extends the official image, can use the new documented bash utilities to perform standard actions.

Proposed Changes

The idea is to keep all docker-related resources in apache/flink-docker. It already has a detailed Dockerfile which is well suited for common use cases or at least serves as a good starting point. The suggestion is to make it extensible for other concerns which are currently addressed in other places. This mainly means refactoring the existing code and introducing more docs as a first step. This effort should enable further improvements and follow-ups for the docker integration with Flink.

This would subsequently mean adjusting all other places to rely on or refer to the apache/flink-docker code or docs. Eventually, all other purely docker-related places can be removed completely: flink-contrib/docker-flink, flink-container/docker and docs/ops/deployment/docker.md.

Docker utils

A script (or family of scripts) can be added to apache/flink-docker to implement various standard actions which are usually required in a Dockerfile or in the container entry point (see the example Dockerfile after this list). Some examples:

  • for Dockerfile:
    • install Flink (version, scala version)
          RUN flink_docker_utils install_flink --flink-version  0.10.0 --scala-version 2.11
    • install python version 2 or 3
          RUN flink_docker_utils install_python 2
    • install hadoop shaded (version)
          RUN flink_docker_utils install_shaded_hadoop (2.8.3_10.0)
    • possibly more shortcuts to install opt dependencies to lib, like
          RUN flink_docker_utils install_queryable_state
  • for entry point:
    • start JM (job/session) or TM

...

    • set a configuration option in flink-conf.yaml

...


...

    • set RPC address
          flink_docker_utils set_rpc_address "address"

...

          flink_docker_utils set_web_ui_port 8081
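
To illustrate how these utilities could compose, here is a sketch of a custom Dockerfile built on top of the official image. It is only a sketch of the proposal above: the flink_docker_utils commands do not exist yet and the flink:1.11 base tag is an assumption, not an agreed interface.

    # hypothetical custom image extending the official one with the proposed utilities
    FROM flink:1.11
    # install optional components
    RUN flink_docker_utils install_python 3
    RUN flink_docker_utils install_shaded_hadoop 2.8.3_10.0
    # pre-configure flink-conf.yaml; the same helpers could also be called from a custom entry point
    RUN flink_docker_utils set_rpc_address jobmanager
    RUN flink_docker_utils set_web_ui_port 8081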

Entry point

The default entry point script can accept a command to start a JM (job/session) or a TM. Additionally, users can customise how the process is started, e.g. by setting environment variables. This would already allow users to run the default docker image in various modes without creating a custom image, like:

docker run --env ENABLE_BUILT_IN_PLUGINS=true --env-file ./env.list flink session_jobmanager
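
For completeness, a sketch of starting a TaskManager the same way. The taskmanager command name is an assumption (the FLIP only states that the entry point accepts a command for JM or TM), and FLINK_PROPERTIES is the environment variable described in the user docs section below for adding config options to flink-conf.yaml:

    # hypothetical: start a TaskManager that connects to a JobManager reachable as "jobmanager"
    docker run --env FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager" flink taskmanager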

User docs

User docs can be extended in markdown files of apache/flink-docker. The docs should explain:

  • the mainstream docker hub image
  • how to run it in various modes 
    • session JM
    • single job JM (plus packing job artifacts)
    • TM
    • other options
    • environment variables
      • FLINK_PROPERTIES to add more Flink config options to flink-conf.yaml (once Flink supports configuration via environment variables, we can consider deprecating FLINK_PROPERTIES)
      • ENABLE_BUILT_IN_PLUGINS
      • Custom jar paths (pointing e.g. to custom locations in mounted docker volumes)
      • Custom logging conf
  • how to extend it to build a custom image
    • install python/hadoop
    • install optional /lib dependencies from /opt or externally
    • install /plugins
    • add user job jar for single job mode
    • how to use flink_docker_utils to create a custom entry point
  • add docs with examples of running compose/swarm (see the sketch after this list)
    • give script examples (mention job/session)
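
As a starting point for such compose/swarm examples, here is a sketch of a session cluster started with plain docker commands (a compose or swarm file would express the same topology). The taskmanager command, the network name and the use of FLINK_PROPERTIES to set jobmanager.rpc.address are illustrative assumptions only:

    # user-defined network so the TaskManager can resolve the JobManager by name
    docker network create flink-network

    # session JobManager, exposing the web UI
    docker run -d --name jobmanager --network flink-network -p 8081:8081 \
        --env FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager" \
        flink session_jobmanager

    # one TaskManager joining the session cluster
    docker run -d --network flink-network \
        --env FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager" \
        flink taskmanager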

Also, existing docs/scripts for other components relying on docker images have to be adjusted to reference and adopt approaches described in the docker docs of apache/flink-docker. Eventually, docker/compose/swarm/bluemix scripts can be removed in favour of examples in docs (discuss with community).

Logging

Currently, if the deployment scripts start the Flink process in the foreground, the logs are output only to the console and are not appended to the usual local files. Outputting logs to the console makes sense when running in a docker container as this is the usual docker way. The problem is that the web UI cannot display the logs because it relies on those local files. Here we have two options:

...

We can modify log4j-console.properties to also output logs into the usual files when the Flink process is started in the foreground

...

Here we also have to check that this respects container disk space limits (e.g. by rolling over a fixed number of files).
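
As an illustration of this option, a minimal sketch of how log4j-console.properties could combine a console appender with a bounded rolling file appender. It assumes the log4j 1.x properties syntax and the log.file system property passed by the Flink scripts; appender names and size limits are placeholders, and a Log4j 2 setup would look different:

    log4j.rootLogger=INFO, console, rolling

    # console appender: logs go to stdout, the usual docker way
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

    # rolling file appender: keeps the local files the web UI reads, bounded for container disk limits
    log4j.appender.rolling=org.apache.log4j.RollingFileAppender
    log4j.appender.rolling.File=${log.file}
    log4j.appender.rolling.MaxFileSize=100MB
    log4j.appender.rolling.MaxBackupIndex=10
    log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
    log4j.appender.rolling.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n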

Custom logging

We could also provide an environment variable which contains custom logging properties; the entry point script would use it to overwrite the logging properties file of the base image, similar to how FLINK_PROPERTIES works for config options in flink-conf.yaml.

Another option is to expose an environment variable pointing to another location of the logging properties, e.g. in a mounted volume.

Alternatively, the Flink process can be started in the background and the logs from the files can be output to the console, like in the Kubernetes standalone session documentation example.
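
For illustration, a sketch of the two customisation approaches described above (an environment variable carrying the properties and a mounted file); the CUSTOM_LOG4J_PROPERTIES variable name and the /opt/flink/conf path inside the image are assumptions used only for this example:

    # option 1: pass the properties content via a (hypothetical) environment variable,
    # which the entry point would write over conf/log4j-console.properties
    docker run --env CUSTOM_LOG4J_PROPERTIES="$(cat my-log4j-console.properties)" flink session_jobmanager

    # option 2: mount the properties file over the location expected in the image
    docker run -v $(pwd)/my-log4j-console.properties:/opt/flink/conf/log4j-console.properties flink session_jobmanager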

Compatibility, Deprecation, and Migration Plan

In general, there should be no compatibility issues because we are planning mostly refactoring and docs extension: the mainstream docker hub image is not supposed to change a lot, and the docker utility scripts and new docs only restructure and add material.

As discussed in the mailing list, once the user documentation is good enough, we are going to remove the existing docker/compose/swarm/bluemix scripts in flink-contrib/docker-flink and flink-container/docker.

Implementation steps

  • Document the official docker hub image and examples of how to run it (as of now)
  • Document examples of how to extend the official docker hub image (as of now)
  • Remove flink-contrib/docker-flink
  • Extend entry point script and docs with job cluster mode and user job artefacts
  • Remove flink-container/docker

Tentative improvements:

  • Modify the log4j-console.properties to also output logs into the files for WebUI
  • Make logging properties configurable
  • Split stdout/stderr file container logs

Test Plan

Firstly, mostly manual testing. Later we can think of more extensive docker CI tests.

Future road map

We can still make more improvements to the user experience with docker in Flink:

  • Investigate how to support developers in building a custom image for a snapshot version, e.g. for a certain commit in the Flink repo
  • Overwrite Flink options in flink-conf.yaml with environment variables when the Flink process starts
  • Refactor the Flink bash scripts into one thin script which uses a Java bootstrap utility to prepare and configure the started Flink process (similar to BashJavaUtils for the memory setup)

For more details, see also this FLIP's discussion thread and the more detailed proposal doc.

Rejected Alternatives

None so far.