Ozone Docker images

Docker is used heavily in Ozone development, with three main use cases:

dev:
- We use Docker to start local pseudo-clusters (Docker provides a unified environment, but no image creation is required).
test:
- We create Docker images from the dev branches to test Ozone in Kubernetes and other container orchestrator systems.
- We provide apache/ozone images for each release to make it easier to evaluate Ozone. These images are not intended for production use.
production:
- We document how you can create your own Docker image for your production cluster.

Let's look at each use case in more detail:

Development

The Ozone artifact contains example docker-compose directories that make it easier to start an Ozone cluster on your local machine.

From distribution:

cd compose/ozone
docker-compose up -d

Note: If docker-compose V2 has difficulty parsing the docker config files, use docker-compose version 2.17.3 or above.

After a local build:

cd hadoop-ozone/dist/target/ozone-*/compose
docker-compose up -d

These environments are important tools for starting different types of Ozone clusters at any time.

To make sure that the compose files stay up to date, we also provide acceptance test suites that start the cluster and check its basic behaviour.

The acceptance tests are part of the distribution, and you can find the test definitions in the ./smoketest directory.

You can start the tests from any compose directory. For example:

cd compose/ozone
./test.sh

Implementation details

The ./compose tests are based on the apache/hadoop-runner Docker image. The image itself doesn't contain any Ozone jar files or binaries, just the helper scripts to start Ozone.

hadoop-runner provides a fixed environment to run Ozone anywhere, but the Ozone distribution itself is mounted from the enclosing directory:

(Example docker-compose fragment)

scm:
  image: apache/hadoop-runner:jdk11
  volumes:
    - ../..:/opt/hadoop
  ports:
    - 9876:9876

The containers are configured through environment variables. Because the same variables must be set for every container, we maintain the list in a separate file:

scm:
  image: apache/hadoop-runner:jdk11
  #...
  env_file:
    - ./docker-config

The docker-config file contains the list of the required environment variables:

OZONE-SITE.XML_ozone.om.address=om
OZONE-SITE.XML_ozone.om.http-address=om:9874
OZONE-SITE.XML_ozone.scm.names=scm
OZONE-SITE.XML_ozone.enabled=True
#...

As you can see, we use a naming convention. Based on the name of the environment variable, the appropriate Hadoop config XML (ozone-site.xml in our case) is generated by a script included in the hadoop-runner base image.
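The naming convention can be illustrated with a minimal Python sketch. This is not the script shipped in hadoop-runner; the function names (parse_env_lines, to_hadoop_xml) and the exact matching pattern are assumptions made for illustration. It assumes each variable has the form <CONFIG FILE NAME>_<property name>=<value> and that the file name is lower-cased to get the target XML file.

```python
import re
from collections import defaultdict
from xml.sax.saxutils import escape

# Assumed convention: <CONFIG FILE NAME>_<property name>=<value>,
# e.g. OZONE-SITE.XML_ozone.scm.names=scm
LINE_PATTERN = re.compile(r"^([A-Za-z0-9-]+\.XML)_([^=]+)=(.*)$")


def parse_env_lines(lines):
    """Group KEY=VALUE lines by target config file name (lower-cased)."""
    configs = defaultdict(dict)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # not a config-file variable, ignore it
        file_name, prop, value = match.groups()
        configs[file_name.lower()][prop] = value
    return dict(configs)


def to_hadoop_xml(properties):
    """Render a dict of properties as Hadoop-style configuration XML."""
    parts = ["<configuration>"]
    for name, value in properties.items():
        parts.append(
            "  <property><name>%s</name><value>%s</value></property>"
            % (escape(name), escape(value))
        )
    parts.append("</configuration>")
    return "\n".join(parts)


example = [
    "OZONE-SITE.XML_ozone.om.address=om",
    "OZONE-SITE.XML_ozone.scm.names=scm",
    "#...",
]
configs = parse_env_lines(example)
print(to_hadoop_xml(configs["ozone-site.xml"]))
```

Grouping by file name first means one pass over the environment could, under the same assumption, produce several config files, not only ozone-site.xml.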

The entrypoint of the hadoop-runner image contains a helper shell script which triggers this transformation and can perform additional actions (e.g. initialize scm/om storage, download required keytabs, etc.) based on environment variables.

Test/Staging

The docker-compose based approach is recommended only for local testing, not for multi-node clusters. To use containers on a multi-node cluster we need a container orchestrator such as Kubernetes.

Kubernetes example files are included in the kubernetes folder.

Please note: all the provided images are based on the hadoop-runner image, which contains all the tools required for testing in staging environments. For production we recommend creating your own hardened image with your own base image.

Test the release

The release can be tested by deploying any of the example clusters:

cd kubernetes/examples/ozone
kubectl apply -f .

Please note that in this case the latest released container image will be downloaded from Docker Hub.

Test the development build

To test a development build you can create your own image and upload it to your own docker registry:

mvn clean install -f pom.ozone.xml -DskipTests -Pdocker-build,docker-push -Ddocker.image=myregistry:9000/name/ozone

The configured image will be used in all the generated Kubernetes resource files (the image: keys are adjusted during the build):

cd kubernetes/examples/ozone
kubectl apply -f .
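Before applying the resources, it can be useful to confirm which image they reference. The following Python sketch is one way to do that; it is not part of the Ozone build, the helper name find_images is hypothetical, and it assumes the image: keys appear one per line in the resource files.

```python
import re

# Matches lines such as "image: myregistry:9000/name/ozone" or
# "- image: 'apache/ozone'" (assumes one image key per line).
IMAGE_LINE = re.compile(r"^\s*(?:-\s*)?image:\s*[\"']?([^\"'\s#]+)")


def find_images(yaml_text):
    """Return the set of image names referenced in the given YAML text."""
    images = set()
    for line in yaml_text.splitlines():
        match = IMAGE_LINE.match(line)
        if match:
            images.add(match.group(1))
    return images


# Illustrative fragment only, not a real generated resource file.
sample = """
containers:
  - name: scm
    image: myregistry:9000/name/ozone
"""
print(sorted(find_images(sample)))
```

Running the function over each file in the example directory and checking that only the expected registry path is returned gives a quick sanity check that the build adjusted the image: keys as intended.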

Production

We strongly recommend using your own image in your production cluster and adjusting the base image, umask, security settings, and user settings according to your own requirements.

You can use the source of our development images as an example:

Most of the elements are optional helper functions, but to use the provided example Kubernetes resources you may need the scripts from here:

- The two Python scripts convert environment variables to real Hadoop XML config files.
- The start.sh script executes the Python scripts (and other initialization) based on environment variables.

Containers

Ozone-related container images and source locations:

| Container                | Repository                                    | Branch               | Base          | Available tags    |
|--------------------------|-----------------------------------------------|----------------------|---------------|-------------------|
| apache/ozone             | https://github.com/apache/hadoop-docker-ozone | ozone-...            | hadoop-runner | 0.3.0,0.4.0,...   |
| apache/hadoop-runner     | https://github.com/apache/hadoop              | docker-hadoop-runner | centos        | jdk11,jdk8,latest |
| apache/ozone:build (WIP) | https://github.com/apache/hadoop-docker-ozone | ozone-build          |               |                   |
