...

Trees are generated sequentially, as described above. To discourage later trees from reusing links that earlier trees already use, we apply a multiplicative penalty term, MXNET_KVSTORE_TREE_LINK_USAGE_PENALTY (default = 0.7), whenever a link has been used. The penalty is multiplied into the corresponding entry of the initial link topology adjacency matrix, in which 3 represents a double NVLink connection and 2 represents a single NVLink connection.
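The penalty step can be illustrated with a minimal sketch. This is not the MXNet implementation: the 3-GPU topology, the helper name penalize_used_links, the list of links used by the first tree, and the choice to scale both directions of a link are all assumptions made for illustration. Only the default value 0.7 and the 3/2 link weights come from the text above.

```python
# Default of MXNET_KVSTORE_TREE_LINK_USAGE_PENALTY, as stated in the text.
PENALTY = 0.7


def penalize_used_links(adjacency, used_links, penalty=PENALTY):
    """Return a copy of `adjacency` with every used link scaled by `penalty`.

    Hypothetical helper: `adjacency` is a square list-of-lists of link
    weights and `used_links` is a list of (gpu_a, gpu_b) pairs.
    """
    updated = [row[:] for row in adjacency]  # leave the input untouched
    for a, b in used_links:
        # Assumption: links are undirected, so both entries are scaled.
        updated[a][b] *= penalty
        updated[b][a] *= penalty
    return updated


# Hypothetical 3-GPU topology: 3 = double NVLink, 2 = single NVLink.
topology = [
    [0, 3, 2],
    [3, 0, 2],
    [2, 2, 0],
]

# Suppose the first tree used links 0-1 and 1-2 (illustrative choice).
first_tree_links = [(0, 1), (1, 2)]
after_first_tree = penalize_used_links(topology, first_tree_links)

for row in after_first_tree:
    print(row)
```

In this sketch the used double link 0-1 drops from 3 to about 2.1 and the used single link 1-2 drops from 2 to about 1.4, while the unused link 0-2 keeps its weight of 2, so a later tree that prefers heavier links is steered toward links that have seen less use.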

...

Figure 7. VGG-16 performance as a function of MXNET_KVSTORE_GPUARRAYTREE_ARRAY_BOUND, using batch size 4 per GPU. Panels: (a) parameter sweep of MXNET_KVSTORE_GPUARRAYTREE_ARRAY_BOUND; (b) 1 Push-Pull before Wait; (c) 150 Push-Pulls before Wait. These figures show that beyond 1M-10M float32 values, multi-tree begins to outperform a single tree.

...