...

Setup

EC2 Instance: p3.8xlarge

CUDNN: 7.4.2

CUDA: 10.0

Commit Hash: b3b952f9d5490ee2707209ab866e6c3f094e2046 (PoC changes were made on top of this commit, and MXNet was built from source)

...

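| Model | Batch Size | Original Model (Samples/sec) | Mixed Precision Model (Samples/sec) | Original Model with Implicit Type Conversion (MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION=1) (Samples/sec) |
|---|---|---|---|---|
| imagenet1k-resnet-152 | 1 | 85 | 72 | 72 |
| imagenet1k-resnet-152 | 2 | 140 | 140 | 142 |
| imagenet1k-resnet-152 | 4 | 240 | 270 | 228 |
| imagenet1k-resnet-152 | 8 | 320 | 470 | 261 |
| imagenet1k-resnet-152 | 16 | 405 | 680 | 315 |
| resnet50_v1 | 1 | 215 | 165 | 205 |
| resnet50_v1 | 2 | 370 | 330 | 365 |
| resnet50_v1 | 4 | 560 | 600 | 545 |
| resnet50_v1 | 8 | 760 | 980 | 635 |
| resnet50_v1 | 16 | 935 | 1400 | 790 |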
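
To make the table above easier to compare, here is a minimal Python sketch that computes each variant's throughput relative to the original model. The figures are copied from the table, assuming each row reads as batch size followed by three throughput values. The names `RESULTS` and `speedups` are illustrative and not part of MXNet.

```python
# Throughput in samples/sec, copied from the benchmark table.
# Layout: batch size -> (original, mixed precision, implicit type conversion)
RESULTS = {
    "imagenet1k-resnet-152": {
        1: (85, 72, 72),
        2: (140, 140, 142),
        4: (240, 270, 228),
        8: (320, 470, 261),
        16: (405, 680, 315),
    },
    "resnet50_v1": {
        1: (215, 165, 205),
        2: (370, 330, 365),
        4: (560, 600, 545),
        8: (760, 980, 635),
        16: (935, 1400, 790),
    },
}


def speedups(results):
    """Return {model: {batch: (mixed / original, implicit / original)}}."""
    return {
        model: {
            batch: (mixed / original, implicit / original)
            for batch, (original, mixed, implicit) in rows.items()
        }
        for model, rows in results.items()
    }


if __name__ == "__main__":
    for model, rows in speedups(RESULTS).items():
        for batch, (mixed_ratio, implicit_ratio) in rows.items():
            print(
                f"{model} batch={batch}: "
                f"mixed precision {mixed_ratio:.2f}x, "
                f"implicit conversion {implicit_ratio:.2f}x"
            )
```

Read this way, the mixed precision model is slower than the original at batch size 1 for both networks and pulls ahead from batch size 4 onward, reaching roughly 1.68x for imagenet1k-resnet-152 and 1.50x for resnet50_v1 at batch size 16.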


FAQ

Will the arg_params and aux_params be cast to fp16?

...