...

  • You use only one GPU. Single-GPU training involves no communication, so BytePS gives you no benefit.
  • Your distributed training already achieves (near-)linear scaling. That means your task is bottlenecked by computation rather than communication, and BytePS only optimizes the communication.
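To check whether the second case applies, you can estimate your scaling efficiency, i.e. how close your measured distributed throughput is to the ideal of single-GPU throughput times the number of GPUs. Below is a minimal sketch of that calculation; the function name and the throughput numbers are hypothetical, not part of BytePS.

```python
def scaling_efficiency(single_gpu_throughput, n_gpus, distributed_throughput):
    """Fraction of ideal (linear) scaling achieved.

    A value near 1.0 means communication is not the bottleneck,
    so BytePS is unlikely to help much.
    """
    ideal = single_gpu_throughput * n_gpus
    return distributed_throughput / ideal

# Hypothetical throughputs in samples/sec.
eff = scaling_efficiency(100.0, 8, 520.0)
print(f"{eff:.0%}")  # 65% — well below linear, so communication is likely a bottleneck
```

If this ratio is already close to 100%, switching the communication layer to BytePS will yield little speedup.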

...

Contact

In the above, "we" refers to the BytePS team. The primary developers of BytePS are currently Yibo Zhu, Yimin Jiang, and Chang Lan. They can be reached via the following email address. We also thank the other developers for their contributions.

...