...
To run an Ozone cluster you have multiple options:
- Start a cluster with prebuilt docker images from Docker Hub.
- Build Ozone from the source code and start a cluster with docker (also useful for development).
- Build Ozone from the source code and start it with the startup scripts, without docker.
We will describe all of these scenarios in the next sections.
Starting Ozone cluster with docker (using prebuilt images)
The easiest way to start an Ozone cluster is to use the prebuilt docker images uploaded to Docker Hub.
Warning: Please note that the docker images are not provided by the Apache project (yet, see ...).
...
Save the following docker-compose.yaml file:

```yaml
version: "3"
services:
  namenode:
    image: flokkr/hadoop:ozone
    hostname: namenode
    ports:
      - 50070:50070
      - 9870:9870
    environment:
      ENSURE_NAMENODE_DIR: /data/namenode
    env_file:
      - ./docker-config
    command: ["/opt/hadoop/bin/hdfs","namenode"]
  datanode:
    image: flokkr/hadoop:ozone
    ports:
      - 9864
    env_file:
      - ./docker-config
    command: ["/opt/hadoop/bin/hdfs","datanode"]
  ksm:
    image: flokkr/hadoop:ozone
    ports:
      - 9874:9874
    env_file:
      - ./docker-config
    command: ["/opt/hadoop/bin/hdfs","ksm"]
  scm:
    image: flokkr/hadoop:ozone
    ports:
      - 9876:9876
    env_file:
      - ./docker-config
    command: ["/opt/hadoop/bin/hdfs","scm"]
```
And the configuration in the docker-config file:
```properties
CORE-SITE.XML_fs.defaultFS=hdfs://namenode:9000
OZONE-SITE.XML_ozone.ksm.address=ksm
OZONE-SITE.XML_ozone.scm.names=scm
OZONE-SITE.XML_ozone.enabled=True
OZONE-SITE.XML_ozone.scm.datanode.id=/data/datanode.id
OZONE-SITE.XML_ozone.scm.block.client.address=scm
OZONE-SITE.XML_ozone.container.metadata.dirs=/data/metadata
OZONE-SITE.XML_ozone.handler.type=distributed
OZONE-SITE.XML_ozone.scm.client.address=scm
HDFS-SITE.XML_dfs.namenode.rpc-address=namenode:9000
HDFS-SITE.XML_dfs.namenode.name.dir=/data/namenode
LOG4J.PROPERTIES_log4j.rootLogger=INFO, stdout
LOG4J.PROPERTIES_log4j.appender.stdout=org.apache.log4j.ConsoleAppender
LOG4J.PROPERTIES_log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
LOG4J.PROPERTIES_log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
```
Save both files to a new directory and run the containers with:

```bash
docker-compose up -d
```
You can check the status of the components:
```bash
docker-compose ps
```
You can check the output of the servers with:
```bash
docker-compose logs
```
As the web UI ports are forwarded to the host machine, you can check the web UIs:
* Storage Container Manager: http://localhost:9876/
* Key Space Manager: http://localhost:9874/
* Namenode: http://localhost:9870/
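Only services whose `ports` entry uses the `HOST:CONTAINER` form (like the scm's `9876:9876`) are published on a fixed localhost port; a bare container port (like the datanode's `9864`) is published on a random host port chosen by docker, which is what allows several scaled datanodes to coexist. A minimal sketch of that distinction (the helper is illustrative, not part of docker-compose):

```python
# Sketch: how a docker-compose `ports` entry determines whether a service
# gets a fixed host port or a randomly assigned one.
def published_host_port(entry):
    """Return the fixed host port for a compose `ports` entry,
    or None when docker assigns a random host port."""
    entry = str(entry)
    if ":" in entry:
        host, _container = entry.split(":", 1)
        return int(host)
    return None  # bare container port -> random host port

# Entries taken from the docker-compose.yaml above:
print(published_host_port("9876:9876"))  # 9876: scm UI is fixed
print(published_host_port("9864"))       # None: datanode port is random
```

Use `docker-compose ps` to see which random host port a datanode actually received.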
You can start multiple datanodes with:
```bash
docker-compose scale datanode=3
```
You can test the commands from the OzoneShell page after opening a new shell in one of the containers:
```bash
docker-compose exec datanode bash
```
Notes:
- The containers are configured by environment variables; the env definitions are moved out to an external file only to avoid duplication.
- For a more detailed explanation of the configuration variables, see the OzoneConfiguration page.
- The flokkr base image contains a simple script to convert environment variables to files, based on a naming convention. All of the environment variables of the form FILENAME_key=value are written as key=value entries into the named configuration file.
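The naming convention can be sketched as follows: the part before the first underscore names the target file, the rest is the configuration key. This is a hedged illustration of the idea; the real flokkr launcher script may differ in details:

```python
# Sketch of the flokkr-style env-var naming convention used in docker-config:
# e.g. CORE-SITE.XML_fs.defaultFS=hdfs://namenode:9000 means
# "put fs.defaultFS=hdfs://namenode:9000 into core-site.xml".
def parse_env_entry(entry):
    name, value = entry.split("=", 1)
    filename, key = name.split("_", 1)  # split on the FIRST underscore only
    return filename.lower(), key, value

def group_by_file(entries):
    """Group env entries into {filename: {key: value}} dicts."""
    files = {}
    for entry in entries:
        filename, key, value = parse_env_entry(entry)
        files.setdefault(filename, {})[key] = value
    return files

sample = [
    "OZONE-SITE.XML_ozone.enabled=True",
    "OZONE-SITE.XML_ozone.scm.names=scm",
    "LOG4J.PROPERTIES_log4j.rootLogger=INFO, stdout",
]
print(group_by_file(sample))
```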
Starting Ozone cluster with docker from the source build
Warning: Only do this if ...
...
Check the child pages for more detailed instructions.
- First, it uses a much smaller base image which doesn't contain Hadoop.
- Second, the real Hadoop should be built from the source, and the dist directory should be mapped into the container.
With this method you can start a full cluster on your local machine from your own build.
Build Ozone
To build Ozone, please check out the hadoop sources from github or the apache git repository. Then check out the ozone branch, HDFS-7240, and build it.
```bash
git checkout HDFS-7240
mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Pdist -Dtar -DskipShade
```
Note: -DskipShade just makes compilation faster and is not really required.
This will create a directory under hadoop-dist/target which can be mapped into the docker containers.
Start the cluster
```bash
cd dev-support/compose/ozone
docker-compose up
```
For more docker-compose commands, please check the previous section.
Starting Ozone cluster with shell scripts from the source build (without docker)
This is the traditional method. You need a build as defined in the previous section.
You can start it by going to the hadoop-dist/target/hadoop-3.1.0-alpha directory and starting the cluster.
Configuration
There is detailed documentation about the configuration of an Ozone cluster. But if you just want to get started, save the following snippet to etc/hadoop/ozone-site.xml (inside your hadoop distribution):
```xml
<configuration>
  <property><name>ozone.ksm.address</name><value>localhost</value></property>
  <property><name>ozone.scm.datanode.id</name><value>/tmp/datanode.id</value></property>
  <property><name>ozone.scm.names</name><value>localhost</value></property>
  <property><name>ozone.handler.type</name><value>distributed</value></property>
  <property><name>ozone.container.metadata.dirs</name><value>/tmp/metadata</value></property>
  <property><name>ozone.scm.block.client.address</name><value>localhost</value></property>
  <property><name>ozone.scm.client.address</name><value>localhost</value></property>
  <property><name>ozone.enabled</name><value>True</value></property>
</configuration>
```
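If you generate such a file from a script, the Hadoop site-file layout is simple enough to emit with the standard library alone. A hedged sketch (the `to_hadoop_xml` helper is illustrative, not part of Ozone; property names are taken from the snippet above):

```python
# Sketch: generate a Hadoop-style site XML file (configuration/property/
# name/value layout) from a plain dict, using only the standard library.
import xml.etree.ElementTree as ET

def to_hadoop_xml(props):
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")

xml_text = to_hadoop_xml({
    "ozone.enabled": "True",
    "ozone.scm.names": "localhost",
})
print(xml_text)
```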
Run
The first step is to run start-dfs.sh from $HADOOP/sbin. Once HDFS is running, please verify it is fully functional by running some commands like:
```bash
hdfs dfs -mkdir /usr
hdfs dfs -ls /
```
Once you are sure that HDFS is running, start Ozone. To start Ozone, you need to start SCM and KSM. Currently we assume that both KSM and SCM run on the same node; this will change in the future.
```bash
./hdfs --daemon start scm
./hdfs --daemon start ksm
```
If you would like to start HDFS and Ozone together, you can do that by running a single command:
```bash
$HADOOP/sbin/start-ozone.sh
```
This command will start HDFS and then start the ozone components.
Once you have Ozone running, you can use the ozone shell commands (see the OzoneShell page) to create volumes, buckets and keys.