Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

TVM , MKLDNN and nGraph use customized data formats. Interaction between these backends with MXNet requires data format conversion. TVM, MKLDNN, TensorRT and nGraph fuses operators. Integration with these backends should happen in the granularity of subgraphs instead of in the granularity of operators. To fuse operators, it's obvious that we need to divide a graph into subgraphs so that the operators in a subgraph can be fused into a single operator. To handle customized data formats, we should partition a computation graph into subgraphs as well. Each subgraph contains only TVM, MKLDNN or nGraph operators. In this way, MXNet converts data formats only when entering such a subgraph, and the operators inside a subgraph handle format conversion themselves if necessary. This makes interaction of TVM and MKLDNN with MXNet much easier. Neither the MXNet executor nor the MXNet operators need to deal with customized data formats. Even though invoking these libraries from MXNet requires similar steps, the partitioning rule and the subgraph execution of these backends can be different. As such, we define the following interface for backends to customize graph partitioning and subgraph execution inside an operator. More details can be found at PR 12157 and Subgraph API.


MXNet Horovod Integration

Apache MXNet now supports distributed training using Horovod framework. Horovod is an open source distributed framework created at Uber. It leverages efficient inter-GPU communication to distribute and aggregate model parameters across multiple workers thus allowing efficient use of network bandwidth and scaling of training of deep learning models. To learn more about MXNet-Horovod integration, check out this blog.

JVM Memory Management

The MXNet Scala and Java API uses native memory to manage NDArray, Symbol, Executor, DataIterators using MXNet's internal C APIs. The C APIs provide appropriate interfaces to create, access and free these objects. MXNet Scala has corresponding Wrappers and APIs that have pointer references to the native memory. Before this project, JVM users (e.g. Scala, Clojure, or Java) of MXNet have to manage MXNet objects manually using the dispose pattern. There are a few usability problems with this approach:

...