Inference Performance

These performance results were collected on an AWS EC2 C5.18xlarge instance, using 1 socket and 1 processor.
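The page does not describe the measurement harness, so the following is only a minimal sketch of how per-batch latency (ms) and throughput (samples per second, i.e. fps for images) are commonly derived from wall-clock timing. The `benchmark` helper, its `iterations` and `warmup` parameters, and the `fake_model` stand-in are all hypothetical names introduced for illustration; they are not part of MXNet or of the original test scripts.

```python
import time


def benchmark(run_batch, batch_size, iterations=20, warmup=5):
    """Time a callable run_batch(batch_size) (hypothetical interface).

    Returns (latency_ms, throughput): average milliseconds per batch and
    samples processed per second.
    """
    # Warm-up runs are excluded from timing (assumed practice, not from the source).
    for _ in range(warmup):
        run_batch(batch_size)

    start = time.perf_counter()
    for _ in range(iterations):
        run_batch(batch_size)
    elapsed = time.perf_counter() - start

    latency_ms = elapsed / iterations * 1000.0
    throughput = batch_size * iterations / elapsed
    return latency_ms, throughput


def fake_model(batch_size):
    """Stand-in workload that scales with batch size; not a real network."""
    return sum(i * i for i in range(2000 * batch_size))


latency_ms, _ = benchmark(fake_model, batch_size=1)
_, throughput = benchmark(fake_model, batch_size=8)
print(f"latency: {latency_ms:.3f} ms, throughput: {throughput:.1f} samples/s")
```

In a real run, `fake_model` would be replaced by a forward pass of the network under test, with batch size 1 for the latency column and a larger batch size for the throughput column.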

Category	Model	Latency, batch size = 1 (ms, lower is better)			Throughput, batch size = 128 (fps, higher is better)
		no mkldnn	release 1.3 + mkldnn	speedup	no mkldnn	release 1.3 + mkldnn	speedup
CNN/classification	ResNet-50 v1	97.19	18.94	5.13	10.29	132.05	12.84
	ResNet-50 v2	98.69	18.93	5.21	9.94	127.17	12.79
	Inception v3	175.17	26.34	6.65	5.74	110.00	19.16
	Inception v4	330.93	66.96	4.94	3.04	59.28	19.47
	DenseNet	111.66	53.31	2.09	8.52	121.79	14.30
	MobileNet	38.56	7.32	5.27	24.87	380.54	15.30
	VGG16	406.50	40.08	10.14	2.91	69.84	23.96
	AlexNet	64.60	4.33	14.90	26.58	689.86	25.96
	Inception-ResNet v2	181.10	111.28	1.63	5.48	69.39	12.66
CNN/object detection	Faster R-CNN	1175.74	95.15	12.36	0.85	10.51	12.36
	SSD-VGG16	721.03	127.48	5.66	1.43 (batch size = 224)	27.35 (batch size = 224)	19.13
	SSD-MobileNet		100.75			57.73 (batch size = 256)
RNN	GNMT	683.43	100.30	6.81	1.46 (batch size = 64)	9.97 (batch size = 64)	6.83
GAN	DCGAN	8.94	0.22	41.36	109.13	4059.74	37.20
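
The speedup columns appear to be simple ratios of the two measurements beside them. The sketch below assumes that reading: latency speedup is baseline divided by MKL-DNN (lower is better), and throughput speedup is MKL-DNN divided by baseline (higher is better). The function names are illustrative. Recomputing from the rounded table values can differ from the published speedup in the last digit, presumably because the published figures were derived from unrounded measurements.

```python
def latency_speedup(baseline_ms, mkldnn_ms):
    """Latency: lower is better, so speedup = baseline / optimized."""
    return baseline_ms / mkldnn_ms


def throughput_speedup(baseline_fps, mkldnn_fps):
    """Throughput: higher is better, so speedup = optimized / baseline."""
    return mkldnn_fps / baseline_fps


# ResNet-50 v1 row from the table above.
lat = latency_speedup(97.19, 18.94)
thr = throughput_speedup(10.29, 132.05)
print(f"latency speedup ~ {lat:.2f}x, throughput speedup ~ {thr:.2f}x")
```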

Inference Accuracy

The models are taken from the Gluon model zoo with pre-trained parameters. Top-1 and top-5 accuracy were verified with the MKL-DNN backend.
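
For readers unfamiliar with the metric, here is a minimal pure-Python sketch of top-k accuracy: a sample counts as correct when its true label is among the k highest-scoring classes. The `top_k_accuracy` function and the toy scores are illustrative assumptions, not the evaluation code used for these results; a real check would use model outputs over the validation set with k=1 and k=5.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores.

    scores: list of per-class score lists, one list per sample.
    labels: list of integer class indices, one per sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy data: 4 samples, 3 classes (made up for illustration).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 0, 0]

print(top_k_accuracy(scores, labels, k=1))  # 0.5
print(top_k_accuracy(scores, labels, k=2))  # 0.75
```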

...
