Benchmarked on a c5n.4xlarge instance running Ubuntu 16.04 LTS, with NaiveEngine and Cython enabled. All figures are per-call latencies in microseconds.
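The source does not show the measurement harness. As a minimal sketch of how a per-call latency in microseconds could be obtained, the helper below times a callable with the standard `timeit` module. The name `per_call_us` and the stand-in workload are assumptions made for illustration. With MXNet installed, one would pass something like `lambda: mx.np.zeros((3, 4))` instead.

```python
import timeit


def per_call_us(fn, number=10_000, repeat=5):
    """Return the best-of-`repeat` mean latency of fn() in microseconds.

    Hypothetical helper: not part of MXNet or of the original benchmark.
    """
    # timeit.repeat returns one total wall-clock time (seconds) per round.
    totals = timeit.repeat(fn, number=number, repeat=repeat)
    # Take the fastest round to reduce noise, then convert to us per call.
    return min(totals) / number * 1e6


if __name__ == "__main__":
    # Stand-in workload (assumption): a pure-Python 3x4 zero matrix.
    latency = per_call_us(lambda: [[0.0] * 4 for _ in range(3)])
    print(f"{latency:.2f} us per call")
```

Taking the minimum over several rounds is a common choice for micro-benchmarks because it is the least affected by scheduler noise. Whether the original numbers used a minimum or a mean is not stated.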


| Workload | Current FFI (µs) | TVM FFI (µs) |
|---|---|---|
| zeros((3, 4)) | 27.2 | 4.9 |
| zeros((3, 4), dtype='float64') | 28.3 | 5.6 |
| zeros((3, 4), ctx="cpu(0)", dtype='float64') | 26.4 | 6.7 |
| tensordot(a, b, ((1, 0), (0, 1))) | 31.8 | 13.3 |
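To make the relative improvement explicit, the short sketch below computes the speedup (current latency divided by TVM FFI latency) and the absolute time saved per call. The latency values are copied from the benchmark figures above. The helper name `summarize` is an assumption made for illustration.

```python
# Per-call latencies in microseconds: (current FFI, TVM FFI),
# copied from the benchmark figures in this section.
LATENCY_US = {
    "zeros((3, 4))": (27.2, 4.9),
    "zeros((3, 4), dtype='float64')": (28.3, 5.6),
    "zeros((3, 4), ctx='cpu(0)', dtype='float64')": (26.4, 6.7),
    "tensordot(a, b, ((1, 0), (0, 1)))": (31.8, 13.3),
}


def summarize(latencies):
    """Map each workload to (speedup, microseconds saved per call)."""
    summary = {}
    for name, (current_us, tvm_us) in latencies.items():
        speedup = round(current_us / tvm_us, 2)
        saved_us = round(current_us - tvm_us, 1)
        summary[name] = (speedup, saved_us)
    return summary


if __name__ == "__main__":
    for name, (speedup, saved_us) in summarize(LATENCY_US).items():
        print(f"{name}: {speedup:.2f}x faster, {saved_us:.1f} us saved")
```

On these numbers the speedup ranges from about 2.4x for `tensordot` to about 5.6x for the simplest `zeros` call, while the absolute saving stays in a narrow band of roughly 18 to 23 µs per call.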


References

[1] [RFC] MXNet Imperative Op Invocation Overhead https://github.com/apache/incubator-mxnet/issues/17097
[2] [PR] Fix collect_params().zero_grad() in gluon numpy interface https://github.com/apache/incubator-mxnet/pull/16716
[3] [DGL] https://docs.dgl.ai/en/latest/developer/ffi.html
[4] [COMMENT] https://github.com/apache/incubator-mxnet/issues/17097#issuecomment-568325041