...

Both methods are shown in Figure 4.

(a) Single tree algorithm (tree.pdf)        (b) Multiple tree algorithm (multi-tree.pdf)

Figure 4. Proposed Reduce and Broadcast algorithms.

...

(c) Reduce        (d) Broadcast
Figure 5. Block diagram of the proposed addition. Changes to the old initialization (InitMergeBuffersAndComm), Reduce and Broadcast are illustrated.

...

When to switch between Single and Multiple tree



Figure 6. VGG-16 performance as a function of MXNET_KVSTORE_BIGARRAY_BOUND, using a batch size of 4 per GPU.
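
A threshold-based switch of the kind this section's heading suggests could be sketched as follows. This is only an illustration under stated assumptions, not the proposed implementation: the helper names `read_bound` and `choose_algorithm`, the placeholder default value, and the direction of the rule (arrays at or above the bound use multiple trees) are all hypothetical. Only the variable name MXNET_KVSTORE_BIGARRAY_BOUND comes from the figure caption.

```python
import os

# Placeholder default, chosen for illustration only; it is NOT a value
# taken from this document or from any library.
ASSUMED_DEFAULT_BOUND = 1_000_000


def read_bound(env=None):
    """Read the threshold from an environment-like mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    return int(env.get("MXNET_KVSTORE_BIGARRAY_BOUND", ASSUMED_DEFAULT_BOUND))


def choose_algorithm(num_elements, bound):
    """Assumed rule: arrays with at least `bound` elements use multiple trees."""
    return "multiple_tree" if num_elements >= bound else "single_tree"


if __name__ == "__main__":
    bound = read_bound()
    for size in (1_000, 10_000_000):
        print(size, choose_algorithm(size, bound))
```

Sweeping the bound while holding the model and batch size fixed, as Figure 6 does for VGG-16, is one way to pick a value for such a threshold empirically.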

...


Model           Vs. Parameter Server (in comm.h)    Vs. NCCL (in kvstore_nccl.h)
Resnet-50       1.19                                1.33
VGG-16          5.89                                1.06
Inception-v3    1.15                                1.34
AlexNet         6.60                                1.42

Figure 7. End-to-end training results on synthetic data showing speed-up vs. NCCL on fp32 and fp16.

...