Note: Please feel free to comment on the Google Doc. We will merge the finalized proposal back here.

https://docs.google.com/document/d/1wrUBv4ksKVCA67x4hOj4LbX2xydz86PZkqYfo0heB6c/edit?usp=sharing

Problem

The ring Reduce communication pattern used by NCCL (Figure 1a) and the parameter server Reduce (Figure 1b) currently used in MXNet are not optimal for small batch sizes on p3.16xlarge instances with 8 GPUs.
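To make the two patterns concrete, here is a minimal toy sketch in plain Python. It is not NCCL or MXNet code: the function names `ring_reduce` and `parameter_server_reduce` are hypothetical, each "GPU" is just a number in a list, and the ring is modeled without the chunking and pipelining a real implementation would typically use. It only illustrates the shape of each pattern: a chain of dependent hops in the ring versus many transfers converging on one root in the parameter server.

```python
from typing import List, Tuple


def ring_reduce(values: List[int]) -> Tuple[int, int]:
    """Toy ring Reduce (assumption: unchunked, one value per device).

    The partial sum starts on device 0 and is forwarded to the next
    device, which adds its own value. Returns (total, sequential_hops).
    """
    partial = values[0]
    hops = 0
    for value in values[1:]:
        partial += value  # next device adds its contribution
        hops += 1         # each hop must wait for the previous one
    return partial, hops


def parameter_server_reduce(values: List[int], root: int = 0) -> Tuple[int, int]:
    """Toy parameter server Reduce (assumption: a single root device).

    Every non-root device sends its value directly to the root, which
    sums them. Returns (total, transfers_into_root).
    """
    total = values[root]
    transfers = 0
    for device, value in enumerate(values):
        if device == root:
            continue
        total += value    # root accumulates the incoming value
        transfers += 1    # all transfers share the root's links
    return total, transfers


if __name__ == "__main__":
    gpus = [1, 2, 3, 4, 5, 6, 7, 8]  # one illustrative value per GPU
    print("ring:", ring_reduce(gpus))
    print("parameter server:", parameter_server_reduce(gpus))
```

With 8 simulated devices, both functions produce the same sum. The ring needs 7 dependent hops, while the parameter server variant funnels 7 transfers into a single root.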

...