Ozone CI runs on GitHub Actions. This page collects a few practical uses of GitHub Actions.

Executing one test multiple times

Let's say you have an intermittent test that needs to be analyzed. You can do the following to run the test multiple times:

(1) Schedule a new github actions build

Add a new GitHub Actions workflow file to your master branch. For example: .github/workflows/single-test.yml

name: it-client
on:
  schedule:
   - cron: '*/30 * * * *'
env:
  MAVEN_OPTS: -Dhttp.keepAlive=false -Dmaven.wagon.http.pool=false -Dmaven.wagon.http.retryHandler.class=standard -Dmaven.wagon.http.retryHandler.count=3
jobs:
  it-client:
    name: it-client
    runs-on: ubuntu-18.04
    steps:
      - uses: actions/checkout@v2
        with:
          ref: leader-election-problem
      - name: Cache for maven dependencies
        uses: actions/cache@v2
        with:
          path: ~/.m2/repository
          key: maven-repo-${{ hashFiles('**/pom.xml') }}-8-single
          restore-keys: |
            maven-repo-${{ hashFiles('**/pom.xml') }}-8
            maven-repo-${{ hashFiles('**/pom.xml') }}
            maven-repo-
      - name: Execute tests
        run: hadoop-ozone/dev-support/checks/integration.sh -Dtest=TestOzoneRpcClient
        env:
          ITERATIONS: 20
      - name: Summary of failures
        run: cat target/integration/summary.txt
        if: always()
      - name: Archive build results
        uses: actions/upload-artifact@v2
        if: always()
        with:
          name: it-client
          path: target/integration
      - name: Delete temporary build artifacts before caching
        run: |
          #Never cache local artifacts
          rm -rf ~/.m2/repository/org/apache/ozone/hdds*
          rm -rf ~/.m2/repository/org/apache/ozone/ozone*
        if: always()

Important parameters:

  • Scheduling: modify the cron line (this example runs twice per hour: */30 * * * *)
  • Branch: Which of your branches should be tested (in my case this is leader-election-problem)
  • Test name: which test should be executed (put it after integration.sh). This can be one or more tests (like -Dtest=TestCommitWatcher,TestWatchForCommit) or a profile (like -P=it-client)
  • Number of iterations: number of times the test should be run in the same scheduled run - tweak this depending on the schedule and time required for running the test once
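
One way to pick the iteration count is to check how many executions fit between two scheduled runs. The sketch below is only illustrative arithmetic: the function name max_iterations and the overhead and per-test durations are assumptions, not values measured on the Ozone build.

```shell
# Hypothetical helper (not part of the Ozone scripts): estimate how many
# iterations fit into one schedule interval.
#   $1 = minutes between scheduled runs (30 for '*/30 * * * *')
#   $2 = assumed fixed overhead per run in minutes (checkout, compile, ...)
#   $3 = measured minutes for a single execution of the test
max_iterations() {
  echo $(( ($1 - $2) / $3 ))
}

# Example with assumed numbers: 30 minute interval, 10 minutes overhead,
# 1 minute per test execution. Prints 20.
max_iterations 30 10 1
```

If the result is smaller than the ITERATIONS value you want, either lower ITERATIONS or schedule the workflow less often.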

(2) IMPORTANT: Commit the previous file to the master branch of your fork. The cron schedule is only picked up from the workflow definition on master, but the job itself checks out and tests the branch given by ref:

Usually I do something like this:

git checkout -b elek-master origin/master
vim .github/workflows/single-test.yml
...

git add .
git commit -m "Commit new workflow definition"

#push local elek-master branch to the master branch of the elek remote (which is my fork)
git push elek elek-master:master
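
The local:remote form of the push is the important part. The following sketch replays the same steps against a throwaway bare repository standing in for the fork, then lists what ended up on its master branch. The temporary paths, the demo identity and the one-line workflow file are placeholders, not part of the real setup.

```shell
# Sketch only: a local bare repository plays the role of the "elek" fork.
demo_push() (
  set -e
  work=$(mktemp -d)
  git init -q --bare "$work/fork.git"
  git init -q "$work/clone"
  cd "$work/clone"
  git remote add elek "$work/fork.git"
  git checkout -q -b elek-master
  mkdir -p .github/workflows
  # placeholder content instead of the full workflow definition
  echo "name: it-client" > .github/workflows/single-test.yml
  git add .
  git -c user.name=demo -c user.email=demo@example.com \
      commit -q -m "Commit new workflow definition"
  # local branch elek-master -> branch master on the remote
  git push -q elek elek-master:master
  # list the files now present on the master branch of the stand-in fork
  git --git-dir="$work/fork.git" ls-tree -r --name-only master
  rm -rf "$work"
)

demo_push
```

The listing should contain .github/workflows/single-test.yml, which confirms that the workflow file landed on master even though the local branch has a different name.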

(3) Logging

You might need to update the logging level. For integration tests this can be done by editing: hadoop-ozone/integration-test/src/test/resources/log4j.properties

For example, these two lines turn on DEBUG logging for two specific Ratis classes:

log4j.logger.org.apache.ratis.grpc.server.GrpcLogAppender=DEBUG
log4j.logger.org.apache.ratis.server.impl.RaftServerImpl=DEBUG

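But you can also turn on DEBUG logging for a whole package, for example all classes of the Ratis gRPC package (use org.apache.ratis to cover all Ratis classes):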

log4j.logger.org.apache.ratis.grpc=DEBUG
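
If you toggle loggers often, the edit can be scripted. This is a minimal sketch with a hypothetical helper name (enable_debug). It assumes the properties file ends with a newline and simply appends the line when it is not already present.

```shell
# Hypothetical helper: append a DEBUG logger line unless it is already there.
#   $1 = path to log4j.properties
#   $2 = logger name (a class or a package)
enable_debug() {
  line="log4j.logger.$2=DEBUG"
  # -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF "$line" "$1" 2>/dev/null || echo "$line" >> "$1"
}

# Usage, run from the root of the Ozone source tree on your test branch:
# enable_debug hadoop-ozone/integration-test/src/test/resources/log4j.properties \
#     org.apache.ratis.server.impl.RaftServerImpl
```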


Again: logging changes should be pushed to your test branch, the schedule should be pushed to your master.

Results will be visible under the Actions tab of your fork.


Log in to the build environment

If something goes wrong only on the GitHub Actions machine, you can log in to the machine during the build. The easiest way to do this is with tmate. Add a new step to your GitHub Actions workflow definition:

- name: Setup tmate session
  run: |
      sudo apt-get update
      sudo apt-get install -y tmate openssh-client
      echo -e 'y\n'|ssh-keygen -q -t rsa -N "" -f ~/.ssh/id_rsa
      tmate -S /tmp/tmate.sock new-session -d
      tmate -S /tmp/tmate.sock wait tmate-ready
      tmate -S /tmp/tmate.sock display -p '#{tmate_ssh}'
      tmate -S /tmp/tmate.sock display -p '#{tmate_web}'

After pushing it, you can find the login details in the GitHub Actions log.


Use ssh or the web UI to log in to the build environment.

Please note that this is 100% insecure. Anybody can log in to the machine. Don't do this when the GITHUB_TOKEN is used in the workflow.

Please also note that tmate runs as a background process, so the ssh session will be terminated when the build is done. You can add a sleep to avoid this (see the last line):

name: build-branch
on:
  - push
jobs:
  build:
    name: compile
    runs-on: ubuntu-18.04
    steps:
      - uses: actions/checkout@master
      - name: Setup tmate session
        run: |
            sudo apt-get update
            sudo apt-get install -y tmate openssh-client
            echo -e 'y\n'|ssh-keygen -q -t rsa -N "" -f ~/.ssh/id_rsa
            tmate -S /tmp/tmate.sock new-session -d
            tmate -S /tmp/tmate.sock wait tmate-ready
            tmate -S /tmp/tmate.sock display -p '#{tmate_ssh}'
            tmate -S /tmp/tmate.sock display -p '#{tmate_web}'
      - run: hadoop-ozone/dev-support/checks/build.sh
      - run: sleep 100000
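
A fixed sleep keeps the machine busy until the timeout even if you are done earlier. As a possible variation, the sketch below waits for a marker file with an upper time limit. The function name wait_for_marker and the path /tmp/continue are assumptions chosen for illustration only.

```shell
# Hypothetical alternative to a fixed sleep: block until a marker file
# appears or a timeout expires, whichever comes first.
#   $1 = marker file path
#   $2 = timeout in seconds
wait_for_marker() {
  waited=0
  while [ ! -e "$1" ] && [ "$waited" -lt "$2" ]; do
    sleep 1
    waited=$((waited + 1))
  done
}

# In the workflow this could replace "sleep 100000", for example:
# wait_for_marker /tmp/continue 100000
```

With this in place, running touch /tmp/continue from the tmate session lets the build finish right away.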