Status

Current state: "Under Discussion"

Discussion thread:

JIRA:

Released: Not released

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Docker integration in Flink is currently addressed in many places which often intersect, repeat each other, or apply different approaches. This makes the whole topic really hard to follow for users and maintainers. This FLIP suggests how to unify it: one place which contains the Dockerfile, all the necessary scripts, and docs that are consistent with each other, without repetitions or contradictions.

Current state of Dockerfiles

Currently, different aspects of the Flink docker integration are scattered across a lot of places in our repository and docs:

flink-contrib/docker-flink

This was the first ever Flink contribution on docker integration. It shows how to build a docker image with different Flink/Scala/Hadoop versions. The module addresses running Flink in standalone session mode. Example scripts for running it with docker compose and swarm were also introduced, along with an example of the integration with IBM Bluemix. The compose example is also copied into a separate docker example repo.

flink-container/docker

This module was introduced later to address running a Flink job in standalone mode. It is similar to the previous module and duplicates some of its parts. Additionally, the image can be built with Python and with the job artefacts. The module also documents how to pack a user job with Flink into the docker container and run it with docker compose. Its Dockerfile is also used in a sibling module to run it on Kubernetes.

apache/flink-docker (ML discussion thread)

This is the latest official Dockerfile for Flink. It is used to build the official Flink docker images for Docker Hub. It installs neither Python nor Hadoop. Its standard entry point sets the RPC address and the RPC/Web UI ports to constants.

docs/ops/deployment/docker.md

This is part of the official Flink documentation. It refers to all the other places mentioned above in an attempt to put everything together.

In most cases, the Flink process is started within the container with the start-foreground command, which logs only to the console and not to files. This breaks the logs in the Web UI. The Kubernetes standalone session doc example uses another approach: it starts the Flink process in the background and forwards the logs from the files to the console, which does not break the logs in the Web UI.

Other components relying on docker images

There are also other places which rely on those listed above:

There are also custom Dockerfiles for tests and development purposes. They are kept out of the scope of this FLIP.

Public Interfaces

The changes can affect the entry point of the existing Docker Hub image. Additionally, a custom Dockerfile entry point and a custom user Dockerfile which extends the official image can use the new documented bash utilities to perform standard actions.

Proposed Changes

The idea is to keep all docker-related resources in apache/flink-docker. It already has a detailed Dockerfile which is well suited for common use cases, or at least serves as a good starting point. The suggestion is to make it extensible for the other concerns which are currently addressed elsewhere.

This would subsequently mean adjusting all other places to rely on or refer to the apache/flink-docker code or docs. Eventually, all other purely docker-related places can be completely removed: flink-contrib/docker-flink, flink-container/docker and docs/ops/deployment/docker.md.

Docker utils

A script (or a family of scripts) can be added to apache/flink-docker to implement the various standard actions which are usually required in a Dockerfile or a container entry point. Some examples:

  • for Dockerfile:
    • install Flink (version, Scala version)
          RUN flink_docker_utils install_flink --flink-version 0.10.0 --scala-version 2.11
    • install Python version 2 or 3
          RUN flink_docker_utils install_python 2
    • install shaded Hadoop (version)
          RUN flink_docker_utils install_shaded_hadoop 2.8.3-10.0
    • possibly more shortcuts to install opt dependencies into lib, like
          RUN flink_docker_utils install_queryable_state
  • for entry point:
    • start JM (job/session) or TM
          flink_docker_utils start_jobmaster

          flink_docker_utils start_session_jobmanager

          flink_docker_utils start_taskmanager
    • set a configuration option in flink-conf.yaml
          flink_docker_utils configure "option.name" "value"
    • set the RPC address
          flink_docker_utils set_rpc_address "address"
    • set various standard RPC/Web UI ports
          flink_docker_utils set_web_ui_port 8081
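
Putting the Dockerfile-side utilities together, a custom image could then be built roughly like this (a sketch: the flink_docker_utils commands follow the examples above and are not final):

    FROM flink:latest
    # install optional components on top of the base image
    RUN flink_docker_utils install_python 3
    RUN flink_docker_utils install_shaded_hadoop 2.8.3-10.0
    RUN flink_docker_utils install_queryable_state
    # bake a configuration option into the image
    RUN flink_docker_utils configure "taskmanager.numberOfTaskSlots" "2"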

Entry point

The default entry point script can accept a command to start a JM (job/session) or a TM. Additionally, users can pass other arguments to customise starting the process, such as setting ports. This would already allow users to run the default docker image in various modes without creating a custom image, e.g.:

docker run flink session_jobmanager --webui_port 8081
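
For illustration, such an entry point could dispatch the command roughly as follows (a bash sketch under the naming assumptions above, not the final implementation):

    #!/bin/bash
    # docker-entrypoint.sh (sketch): the first argument selects what to start
    command="$1"
    shift

    case "$command" in
        session_jobmanager)
            # remaining arguments, e.g. --webui_port 8081, customise the process
            exec flink_docker_utils start_session_jobmanager "$@"
            ;;
        jobmaster)
            exec flink_docker_utils start_jobmaster "$@"
            ;;
        taskmanager)
            exec flink_docker_utils start_taskmanager "$@"
            ;;
        *)
            # unknown command: run it as-is, the usual docker convention
            exec "$command" "$@"
            ;;
    esac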

User docs

The user docs can be extended in the markdown files of apache/flink-docker. The docs should explain:

  • the mainstream docker hub image
  • how to run it in various modes 
    • session JM
    • single job JM (plus packing job artifacts)
    • TM
    • other options
  • how to extend it to build a custom image
    • install python/hadoop
    • install optional /lib dependencies from /opt or externally
    • install /plugins
    • add user job jar for single job mode
  • how to use flink_docker_utils to create a custom entry point
  • add docs with examples of running compose/swarm (see the compose sketch after this list)
    • give script examples (mention job/session)
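
For instance, the compose docs could include a minimal session cluster along these lines (a sketch: the image tag, the commands and the JOB_MANAGER_RPC_ADDRESS variable are assumptions based on the entry point proposal above):

    version: "2.2"
    services:
      jobmanager:
        image: flink:latest
        command: session_jobmanager
        ports:
          - "8081:8081"
      taskmanager:
        image: flink:latest
        command: taskmanager
        depends_on:
          - jobmanager
        environment:
          - JOB_MANAGER_RPC_ADDRESS=jobmanager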

Also, the existing docs/scripts for the other components relying on docker images have to be adjusted to reference and adopt the approaches described in the docker docs of apache/flink-docker. Eventually, the docker/compose/swarm/bluemix scripts can be removed in favour of examples in the docs (to discuss with the community).

Logging

Currently, if the deployment scripts start the Flink process in the foreground, the logs are written only to the console; nothing is appended to the usual local files. Outputting logs to the console makes sense when running a docker container, as this is the usual docker way. The problem is that the Web UI cannot display the logs because it relies on those local files. Here we have two options:

  • Modify log4j-console.properties to also output logs into the usual files when the foreground Flink process is started (a sketch of this option follows the list)
  • Start the background Flink process and output logs from the files to the console, as in the Kubernetes standalone session doc example
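
As an illustration of the first option, the console logging configuration could gain a file appender along these lines (a sketch in Log4j 1 properties syntax; the ${log.file} property mirrors Flink's default log4j.properties, while the appender name is illustrative):

    # log to the console, the usual docker way
    log4j.rootLogger=INFO, console, file
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

    # additionally log to the file the Web UI reads; ${log.file} is set by the Flink scripts
    log4j.appender.file=org.apache.log4j.FileAppender
    log4j.appender.file.file=${log.file}
    log4j.appender.file.layout=org.apache.log4j.PatternLayout
    log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n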

Compatibility, Deprecation, and Migration Plan

In general, there should be no compatibility issues because the mainstream Docker Hub image is not supposed to change a lot. The docker utility scripts and the new docs are just refactoring and additions. We have to discuss on the mailing list the removal of the existing docker/compose/swarm/bluemix scripts in:

Test Plan

Firstly, mostly manual testing. Later, we can think about more extensive docker CI tests.

Rejected Alternatives

None so far.
