Status

Current state: Under Discussion

Discussion thread: here

JIRA: KAFKA-15445

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Docker is widely adopted among developers because it simplifies deployment. Popular Apache projects such as Hadoop, Spark, and Flink already publish Docker images, but Apache Kafka currently has no official one. This KIP proposes publishing an Apache Kafka Docker image through the ASF Docker profile.

This KIP covers only the Docker image for JVM-based Apache Kafka.

POC

We ran an initial POC of our image and compared it with other popular Apache Kafka Docker images:

| Image Name | Image size (Uncompressed) | Startup total time (in seconds) | Time to start the Kafka server inside container (in seconds) | Memory Used | Java version | Apache Kafka version |
|---|---|---|---|---|---|---|
| Proposed apache/kafka with Java 21 | 457.86 MB | 3.885 | 0.816 | 382.4 MB | 21 | 3.6.0 |
| Proposed apache/kafka with Java 17 | 460.76 MB | 2.174 | 0.679 | 326 MB | 17 | 3.5.1 |
| bitnami/kafka | 517.16 MB | 3.469 | 0.889 | 344 MB | 17 | 3.5.1 |
| confluentinc/cp-kafka | 876.4 MB | 3.341 | 0.701 | 390.6 MB | 11 | 7.5.0-ccs |
| ubuntu/kafka + ubuntu/zookeeper (image doesn't support KRaft and source code is not public) | 359.03 MB + 359.03 MB | 1.425 + 0.636 | 0.827 | 335 MB + 78 MB | 11 | 3.1.0 |

Note: ubuntu/kafka is smaller because it uses Apache Kafka 3.1.0 and Java 11. With the same Apache Kafka and Java versions, our image size was 367 MB.

Public Interfaces

  • A new artifact, the Apache Kafka Docker image, will be published with every Apache Kafka release.
  • Sample docker-compose.yml

  • Quick start examples

  • Add public documentation

Proposed Changes

  • A Docker image will be published as an additional artifact for every Apache Kafka release.
  • The Docker image will contain JVM-based Kafka and will support Linux on both AMD and ARM architectures.
  • The existing Kafka system test framework will be extended to run against the new Docker image for each tag.
  • The Apache CI/CD pipeline will be extended to publish each new Docker image to public Docker Hub through the ASF Docker profile.

Scala Version

Kafka currently ships two flavours of binaries, built against Scala 2.12 and Scala 2.13.
We will publish the Docker image only for Scala 2.13:

  • Less maintenance overhead.
  • It is the Scala version recommended by Apache Kafka.

  • Anyone who needs a Docker image with Scala 2.12 can build one by changing the Apache Kafka tar file referenced in the provided Dockerfiles to the Scala 2.12 build.

Java Version

  • We will support only the latest Java version supported by Apache Kafka.
  • Apache Kafka is already being built and tested with Java 21.
  • By release 3.7.0 (the next release), Apache Kafka will have Java 21 as one of its officially supported versions.
  • Hence, our Docker image will use Java 21.

What if users want Docker Image with a different Java version?

  • Users who need a different Java version can build their own Docker image from the Dockerfiles we provide. Our documentation will state which base image to use for each Java version.
  • We will update the Java major version as part of minor Apache Kafka releases. Users who ship broker plugins alongside the broker should therefore use custom images so that a Java upgrade does not break their code.

Docker Base Image

Running Apache Kafka requires only a JRE, not a JDK, in the Docker image.
We have decided to use eclipse-temurin:<JRE-version> as the base image for the Docker image.

  • It is a widely used Java JRE image maintained by a reputable organisation.
  • Flink relies on eclipse-temurin base images, which increases our confidence in their reliability.

  • The image supports both arm64 and amd64.

  • Because it contains only the JRE, it is lightweight, with an image size of 263 MB.

NOTE: Java 21 was released on Sep 19. As of Oct 4, eclipse-temurin has not yet provided a Docker image for JRE 21. We anticipate that the JRE 21 image will be available before our Apache Kafka Docker image release.

Image Naming

Image naming should:

  1. Transparently communicate the packaged Kafka version.

  2. Maintain the above point in the event of CVEs/bugs requiring a dedicated Docker release.

Given these constraints, image tags can follow this format:
<image-name>:<kafka-version>-<optional-suffix>

  • For example, for Kafka version 3.7.0 the image name with its tag would be apache/kafka:3.7.0
  • If a CVE is found after 3.7.0, the tag of the fixed image depends on which path in the Release Process section is followed:
    • apache/kafka:3.7.1 => Docker release together with an Apache Kafka bug-fix release
    • apache/kafka:3.7.0-1 => Docker-image-only release, hence the added suffix
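
The tag format above can be illustrated with a small sketch. This is not part of the proposed tooling: the helper names (`ImageTag`, `format_tag`, `parse_tag`) are hypothetical, and the sketch assumes a three-part numeric Kafka version, which the KIP does not state explicitly.

```python
import re
from typing import NamedTuple, Optional


class ImageTag(NamedTuple):
    """Parts of <image-name>:<kafka-version>-<optional-suffix>."""
    image_name: str
    kafka_version: str
    suffix: Optional[str]  # None when the optional suffix is absent


# Assumption: the Kafka version is always three dot-separated numbers.
_TAG_PATTERN = re.compile(
    r"^(?P<image>[^:]+):(?P<version>\d+\.\d+\.\d+)(?:-(?P<suffix>.+))?$"
)


def format_tag(image_name: str, kafka_version: str,
               suffix: Optional[str] = None) -> str:
    """Build an image reference; the suffix is appended only if given."""
    tag = f"{image_name}:{kafka_version}"
    return f"{tag}-{suffix}" if suffix else tag


def parse_tag(reference: str) -> ImageTag:
    """Split an image reference back into name, Kafka version, and suffix."""
    match = _TAG_PATTERN.match(reference)
    if match is None:
        raise ValueError(f"unexpected image reference: {reference!r}")
    return ImageTag(match["image"], match["version"], match["suffix"])
```

With this sketch, `parse_tag("apache/kafka:3.7.0-1")` still reports Kafka version `3.7.0`, which is the point of the naming rule: the packaged Kafka version stays readable even for a Docker-only release.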

Compatibility, Deprecation, and Migration Plan

  • Existing Apache Kafka users are not affected, because the JVM-based Docker image is a new feature.

Test Plan

  • Testing the functionality of the Apache Kafka packaged in the image

    • The image will contain the official tarball released by Apache Kafka.

    • This tarball is already tested as part of the Apache Kafka release.

    • Hence no extra testing is required for the Apache Kafka packaged in the image.

  • Testing the Docker image, i.e. the integration of Apache Kafka with Docker

    • Dockerizing Apache Kafka requires additional steps, such as passing user-supplied configs into the properties file in the container and passing credentials.

    • Sanity tests will be added to verify that the Docker image functions correctly.
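
To make the "passing configs into the properties file" step concrete, here is a minimal sketch of one way it could work. The KIP does not define the mechanism: the `KAFKA_` environment-variable prefix, the underscore-to-dot mapping, and both function names are assumptions made purely for illustration.

```python
from typing import Dict, Mapping

# Hypothetical prefix; the KIP does not specify how configs are passed in.
ENV_PREFIX = "KAFKA_"


def env_to_properties(env: Mapping[str, str]) -> Dict[str, str]:
    """Map KAFKA_FOO_BAR=value to the property foo.bar=value (assumed rule)."""
    properties = {}
    for name, value in env.items():
        if name.startswith(ENV_PREFIX) and len(name) > len(ENV_PREFIX):
            key = name[len(ENV_PREFIX):].lower().replace("_", ".")
            properties[key] = value
    return properties


def render_properties(properties: Mapping[str, str]) -> str:
    """Render key=value lines, sorted by key so the output is deterministic."""
    return "".join(f"{key}={properties[key]}\n" for key in sorted(properties))
```

A sanity test of the kind described above could then start a container with a few such variables and assert that the generated properties file contains the expected lines, for example `node.id=1` for `KAFKA_NODE_ID=1`.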

Release Process

A Docker image can be released in the following two scenarios:

  1. Docker image release during an AK release

    1. The Release Manager (RM) will already have generated the Apache Kafka Release Candidate artifacts and pushed them to the Apache SFTP server hosted at home.apache.org using the release.py script.
    2. Run the script to build the Docker image (using the Release Candidate tarball URL from the previous step) and test the image locally.
    3. Push the Docker image to a Docker Hub repository (e.g. the Release Manager's) so that the RC Docker image can be evaluated.

    4. Start the vote on the RC, which will cover the Docker image as well as the Docker sanity test report.

    5. If a Docker-image-specific issue is detected, the community will evaluate whether it is a release blocker.

    6. Once the vote passes, the image will be pushed to apache/kafka with the version as its tag.

    7. The steps for the Docker image release will be included in the Release Process doc of Apache Kafka.

    8. For example, for AK release 3.7.0 the released image will be apache/kafka:3.7.0 (the image contains AK 3.7.0).
  2. Docker image release after an AK release (e.g. a CVE in the base image)

    1. [Preferred Approach] Approach 1: Docker image release only
      1. This approach is followed when only the Docker image needs to be released (e.g. a CVE in the base image).
      2. Run the script to build the Docker image (using the URL of the already publicly released AK tarball) and test the image locally.

      3. Once the Docker image artifact is ready, the community will review it and hold a vote on the Docker image release alone.

      4. The image will then be pushed to apache/kafka with a tag that communicates the Kafka version.

      5. There is no AK release; the released image will be apache/kafka:3.7.0-1 (the image contains AK 3.7.0).
    2. Approach 2: Complete Apache Kafka release along with the Docker image release
      1. Here we also do a bug-fix release of the AK artifacts.
      2. The steps are the same as for a normal AK release.
      3. For example, AK 3.7.1 is released and the image will be apache/kafka:3.7.1 (the image contains AK 3.7.1).
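
As a sketch of how a tag could be chosen under Approach 1, the helper below picks the next Docker-only tag for a given Kafka version. The KIP only shows the `-1` suffix; treating the suffix as an incrementing integer for later Docker-only releases is an assumption, and `next_docker_only_tag` is a hypothetical name.

```python
from typing import Iterable


def next_docker_only_tag(kafka_version: str,
                         published_tags: Iterable[str]) -> str:
    """Return <kafka-version>-<n>, where n is one above the highest
    numeric suffix already published for that Kafka version.

    Assumption: suffixes are plain integers starting at 1.
    """
    prefix = f"{kafka_version}-"
    highest = 0
    for tag in published_tags:
        if tag.startswith(prefix):
            suffix = tag[len(prefix):]
            if suffix.isdigit():
                highest = max(highest, int(suffix))
    return f"{kafka_version}-{highest + 1}"
```

Under Approach 2 no such helper is needed, since the tag is simply the new Kafka version (for example 3.7.1).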

Ownership of the Docker Images' Release

  • Suggestion: The Docker image release should be owned by the Release Manager.

  • As per the current release process, only PMC members should be allowed to push to the apache/kafka Docker repository.

  • If the RM is not a PMC member, they will need help from a PMC member to release the image.

Rejected Alternatives

NA
