
Workflow

Different steps are required depending on the stage you would like to test. This section explains which commands to run to reproduce a failure at each stage.

Build

A build failure like the one shown below can be reproduced by copying the failed command, starting with ci/build.py, and running it on your local machine from the root of your MXNet source directory. This step does NOT require a GPU or CUDA dependencies.

...

In this case, you would run ci/build.py --build --platform ubuntu_build_cuda /work/runtime_functions.sh build_ubuntu_gpu_cuda8_cudnn5, which produces output like the following image:

[Screenshot of the build output]
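
The replay described above can be wrapped in a small shell helper. This is a minimal sketch rather than part of the MXNet CI tooling: the function name reproduce_build and the MXNET_ROOT variable are hypothetical, and the command string is the one quoted above. The helper only checks that it is being run against the root of a checkout before invoking the copied command.

```shell
# Sketch: replay a failed CI build command locally.
# MXNET_ROOT is a hypothetical variable pointing at your MXNet checkout;
# it defaults to the current directory.

# Command copied verbatim from the failed CI log, starting at ci/build.py.
FAILED_CMD="ci/build.py --build --platform ubuntu_build_cuda /work/runtime_functions.sh build_ubuntu_gpu_cuda8_cudnn5"

reproduce_build() {
    mxnet_root="${MXNET_ROOT:-$PWD}"
    # The command refers to ci/build.py relative to the repository root,
    # so refuse to run from anywhere else.
    if [ ! -f "$mxnet_root/ci/build.py" ]; then
        echo "ci/build.py not found in $mxnet_root; run this from the MXNet root" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    # FAILED_CMD is left unquoted on purpose so it splits into arguments.
    (cd "$mxnet_root" && $FAILED_CMD)
}

# Uncomment to start the build:
# reproduce_build
```

Keeping the command in a single variable makes it easy to paste in whichever command failed in your own CI run.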

Test

Reproducing test failures requires an additional step because the MXNet binaries are not present in your local workspace.

Dependencies

These dependencies have to be generated before a test can be executed. In CI they are provided by the stash commands, which are indicated by the message "Restore files previously stashed".

[Screenshot of the Jenkins log]

In this case, the stash is labelled mkldnn_gpu. The easiest way to map this to a build step is to open the Jenkinsfile and search for pack_lib('mkldnn_gpu'. You will find a block like the following:

    'GPU: MKLDNN': {
      node('mxnetlinux-cpu') {
        ws('workspace/build-mkldnn-gpu') {
          init_git()
          sh "ci/build.py --build --platform ubuntu_build_cuda /work/runtime_functions.sh build_ubuntu_gpu_mkldnn"
          pack_lib('mkldnn_gpu', mx_mkldnn_lib)
        }
      }
    },

This means that the build step you are looking for is called "GPU: MKLDNN". Execute the steps described in the Build section above, using the ci/build.py command from this block, before continuing.
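
The Jenkinsfile lookup described above can also be done from the command line. The following is a minimal sketch: find_build_step is a hypothetical helper name, and the six lines of leading context are an assumption based on the shape of the block shown above, so adjust the number if the step you are looking for is laid out differently.

```shell
# Sketch: print the Jenkinsfile lines leading up to the pack_lib call
# for a given stash label. find_build_step is a hypothetical helper.
find_build_step() {
    stash_label="$1"
    jenkinsfile="${2:-Jenkinsfile}"
    # -F: treat the pattern as a literal string (it contains parentheses)
    # -n: show line numbers
    # -B 6: include 6 lines before the match (assumed to reach the step name)
    grep -n -F -B 6 "pack_lib('${stash_label}'" "$jenkinsfile"
}

# Example, run from the root of your MXNet workspace:
# find_build_step mkldnn_gpu
```

The output should contain both the name of the build step and the ci/build.py command to run for it.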

Test execution

After the binaries have been generated successfully, take the failed command from the screenshot above and execute it in the root of your MXNet workspace. In this case, you would run ci/build.py --nvidiadocker --build --platform ubuntu_gpu /work/runtime_functions.sh unittest_ubuntu_python2_gpu. Note the --nvidiadocker parameter in this example: it indicates that the test requires a GPU and can therefore only be run on an Ubuntu machine with Nvidia-Docker and a GPU installed. The result of this execution should look as follows:

[Screenshot of the test output]
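
The two stages can be chained so that the test only starts once the binaries exist. This is a minimal sketch under one assumption: that the failed test is the one that restored the mkldnn_gpu stash, so the build command from the Jenkinsfile block above is the matching one. Both command strings are copied from this page; reproduce_test is a hypothetical helper name.

```shell
# Sketch: regenerate the binaries, then run the failed test.
# Pairing these two commands assumes the failed test restored the
# mkldnn_gpu stash. reproduce_test is a hypothetical helper.
BUILD_CMD="ci/build.py --build --platform ubuntu_build_cuda /work/runtime_functions.sh build_ubuntu_gpu_mkldnn"
TEST_CMD="ci/build.py --nvidiadocker --build --platform ubuntu_gpu /work/runtime_functions.sh unittest_ubuntu_python2_gpu"

reproduce_test() {
    # Must be called from the root of the MXNet workspace.
    # The variables are left unquoted on purpose so they split into arguments.
    # The test step is skipped if the build step fails.
    $BUILD_CMD && $TEST_CMD
}

# Uncomment to run both steps (the test step needs a GPU and Nvidia-Docker):
# reproduce_test
```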


Troubleshooting

In case you run into any issues, please try the following steps:

Cleaning the workspace

Run make clean in the root of your MXNet workspace to remove all artefacts. Afterwards, repeat the build step to regenerate them. The result should look as follows:

[Screenshot of the make clean output]


Stepping into the container

It is possible to step into the container to run commands manually. The instructions for doing so are shown after a container fails to execute.

TODO: Show an example