This document describes the MXNet-TensorRT runtime integration feature in detail. It covers advanced techniques, includes a roadmap reflecting the current state of the feature and its future directions, and provides up-to-date benchmarks. If you'd like a quick overview of the feature, with a tutorial describing a simple use case, please refer to this MXNet-hosted tutorial. For more information, you may also visit the original design proposal page.

Table of Contents

Why is TensorRT integration useful?

...

Jira: MXNET-1086 (ASF JIRA)


Decouple NNVM to ONNX from NNVM to TensorRT in MXNet

Jira: MXNET-1252 (ASF JIRA)

The current nnvm_to_onnx classes are tightly coupled to TensorRT. We could extract all of the TensorRT-specific functionality and have a proper separation between nnvm_to_onnx and onnx_to_tensorrt. When structuring nnvm_to_onnx, we should use an object hierarchy to convert to specific ONNX opsets, which would help us maintain compatibility with different toolsets. We should create a base class that performs the generic ONNX conversions. We should then create specialized classes that inherit from the base ONNX class and handle the differences between opsets. We should also create unit tests on a per-op basis to make sure we're not introducing regressions.
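The proposed hierarchy could look roughly like the sketch below. It is written in Python for brevity, although the real nnvm_to_onnx code is C++, and it is not the existing implementation. All class and method names (`OnnxNode`, `BaseOnnxConverter`, `Opset1Converter`, `Opset5Converter`, `convert_*`) are hypothetical, and `OnnxNode` is a simplified stand-in rather than a type from the onnx library. The opset difference used for illustration is that ONNX `Reshape` takes the target shape as an attribute in opset 1 and as a second input tensor from opset 5 onwards.

```python
from dataclasses import dataclass, field


@dataclass
class OnnxNode:
    """Simplified stand-in for an ONNX node (hypothetical, not an onnx library type)."""
    op_type: str
    inputs: list
    outputs: list
    attrs: dict = field(default_factory=dict)


class BaseOnnxConverter:
    """Generic conversions shared by every opset (hypothetical class)."""
    opset = None

    def convert(self, op_name, inputs, output, params):
        # Dispatch to a per-op handler; subclasses add or override handlers.
        handler = getattr(self, "convert_" + op_name, None)
        if handler is None:
            raise NotImplementedError(
                f"{op_name} is not supported for opset {self.opset}")
        return handler(inputs, output, params)

    def convert_relu(self, inputs, output, params):
        # Relu is converted the same way here for every opset, so it lives in the base class.
        return [OnnxNode("Relu", list(inputs), [output])]


class Opset1Converter(BaseOnnxConverter):
    opset = 1

    def convert_reshape(self, inputs, output, params):
        # Opset 1: the target shape is an attribute of the Reshape node.
        return [OnnxNode("Reshape", list(inputs), [output],
                         {"shape": list(params["shape"])})]


class Opset5Converter(BaseOnnxConverter):
    opset = 5

    def convert_reshape(self, inputs, output, params):
        # Opset 5 and later: the target shape is a second input tensor,
        # supplied here by a Constant node.
        shape_name = output + "_shape"
        const = OnnxNode("Constant", [], [shape_name],
                         {"value": list(params["shape"])})
        reshape = OnnxNode("Reshape", list(inputs) + [shape_name], [output])
        return [const, reshape]


def test_reshape_per_opset():
    # Per-op unit test: one op, checked against each opset-specific converter.
    params = {"shape": (0, -1)}
    old = Opset1Converter().convert("reshape", ["data"], "out", params)
    new = Opset5Converter().convert("reshape", ["data"], "out", params)
    assert old[0].attrs["shape"] == [0, -1]
    assert [n.op_type for n in new] == ["Constant", "Reshape"]
    assert new[1].inputs == ["data", "out_shape"]


test_reshape_per_opset()
```

With this layout, a TensorRT-specific backend would only consume the nodes produced by whichever converter matches the opset it supports, and each per-op test guards a single conversion against regressions.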


Currently supported operators:

...