...

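Latency is measured at batchsize=1 (ms, lower is better). Throughput is measured at batchsize=128 (fps, higher is better), except where a different batch size is noted in the cell.

| Category | Model | Latency: no mkldnn | Latency: release 1.3 + mkldnn | Latency: speedup | Throughput: no mkldnn | Throughput: release 1.3 + mkldnn | Throughput: speedup |
|---|---|---|---|---|---|---|---|
| CNN/classification | ResNet-50 v1 | 97.19 | 18.94 | 5.13 | 10.29 | 132.05 | 12.84 |
| CNN/classification | ResNet-50 v2 | 98.69 | 18.93 | 5.21 | 9.94 | 127.17 | 12.79 |
| CNN/classification | Inception v3 | 175.17 | 26.34 | 6.65 | 5.74 | 110.00 | 19.16 |
| CNN/classification | Inception v4 | 330.93 | 66.96 | 4.94 | 3.04 | 59.28 | 19.47 |
| CNN/classification | DenseNet | 111.66 | 53.31 | 2.09 | 8.52 | 121.79 | 14.30 |
| CNN/classification | MobileNet | 38.56 | 7.32 | 5.27 | 24.87 | 380.54 | 15.30 |
| CNN/classification | VGG16 | 406.50 | 40.08 | 10.14 | 2.91 | 69.84 | 23.96 |
| CNN/classification | AlexNet | 64.60 | 4.33 | 14.90 | 26.58 | 689.86 | 25.96 |
| CNN/classification | inception-resnet v2 | 181.10 | 111.28 | 1.63 | 5.48 | 69.39 | 12.66 |
| CNN/object detection | Faster R-CNN | 1175.74 | 95.15 | 12.36 | 0.85 | 10.51 | 12.36 |
| CNN/object detection | SSD-VGG16 | 721.03 | 127.48 | 5.66 | 1.43 (batchsize=224) | 27.35 (batchsize=224) | 19.13 |
| CNN/object detection | SSD-MobileNet | | 100.75 | | | 57.73 (batchsize=256) | |
| RNN | GNMT | 683.43 | 100.30 | 6.81 | 1.46 (batchsize=64) | 9.97 (batchsize=64) | 6.83 |
| GAN | DCGAN | 8.94 | 0.22 | 41.36 | 109.13 | 4059.74 | 37.20 |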
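The table does not state how the speedup columns are derived. The sketch below assumes the usual convention, which is consistent with the reported numbers: latency speedup is the baseline time divided by the MKL-DNN time, and throughput speedup is the MKL-DNN rate divided by the baseline rate. The function names are hypothetical and used only for illustration.

```python
def latency_speedup(baseline_ms, mkldnn_ms):
    """Speedup for a lower-is-better metric (assumed: baseline / optimized)."""
    return baseline_ms / mkldnn_ms


def throughput_speedup(baseline_fps, mkldnn_fps):
    """Speedup for a higher-is-better metric (assumed: optimized / baseline)."""
    return mkldnn_fps / baseline_fps


# ResNet-50 v1 values copied from the table above.
resnet50_v1_latency = latency_speedup(97.19, 18.94)
resnet50_v1_throughput = throughput_speedup(10.29, 132.05)

print(f"latency speedup:    {resnet50_v1_latency:.2f}x")
print(f"throughput speedup: {resnet50_v1_throughput:.2f}x")
```

Because the published latency and throughput figures are already rounded to two decimals, a speedup recomputed this way can differ from the table in the last digit.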

  • Performance gain from operator fusion by subgraph

...

| Category | Model | ... |
| CNN/classification | ... |
| CNN/object detection | ... |
| RNN | ... |

  • Inference Accuracy

The models are taken from the Gluon model zoo with pre-trained parameters. Top-1 and top-5 accuracy are verified with the MKL-DNN backend.
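For readers unfamiliar with the metric, top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes. The following is a minimal pure-Python sketch of that definition. It is not the script used to produce the results on this page, and `topk_accuracy` is a hypothetical helper name.

```python
def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k top-scoring classes.

    scores: list of per-class score lists, one list per sample
    labels: list of true class indices, one per sample
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with 3 samples and 3 classes (illustrative values only).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]

top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
print(top1, top2)
```

In practice the scores would be the network outputs for a validation set, once with and once without the MKL-DNN backend, so the two accuracy figures can be compared.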

...