Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

2.4k citations · 2.0k Mendeley readers

Abstract

The rising popularity of intelligent mobile devices and the daunting computational cost of deep-learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating-point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post-quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
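The integer-only scheme the abstract refers to represents each real value r by an integer q through an affine mapping r ≈ S·(q − Z), where S is a floating-point scale and Z an integer zero-point, chosen so that the real value 0 is exactly representable. A minimal NumPy sketch of this idea is below; the function names are illustrative, not taken from the paper's implementation, and the range calibration here is a simple min/max pass over the data:

```python
import numpy as np

def quantize_affine(r, num_bits=8):
    """Affine (asymmetric) quantization: r ~ scale * (q - zero_point).

    The quantization range is forced to include 0 so that the real
    value 0.0 maps exactly to the integer zero_point. Assumes r is
    not all-zero (otherwise scale would be 0).
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    rmin = min(float(r.min()), 0.0)
    rmax = max(float(r.max()), 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))  # keep Z in integer range
    q = np.clip(np.round(r / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    """Recover an approximation of the original real values."""
    return scale * (q.astype(np.float32) - zero_point)

# Round-trip example: the reconstruction error is bounded by the scale.
r = np.array([-1.0, 0.0, 0.5, 1.5], dtype=np.float32)
q, scale, zp = quantize_affine(r)
r_hat = dequantize_affine(q, scale, zp)
```

During integer-only inference, matrix multiplies are then carried out on the integer q values, with the scales folded into a single fixed-point rescaling step; the sketch above only shows the quantize/dequantize mapping itself.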




Citation (APA)

Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., … Kalenichenko, D. (2018). Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 2704–2713). IEEE Computer Society. https://doi.org/10.1109/CVPR.2018.00286

Readers over time

[Chart omitted: annual reader counts, 2017–2025, vertical axis 0–600.]

Readers' Seniority

PhD / Post grad / Masters / Doc: 678 (74%)
Researcher: 184 (20%)
Lecturer / Post doc: 28 (3%)
Professor / Associate Prof.: 25 (3%)

Readers' Discipline

Computer Science: 692 (67%)
Engineering: 307 (30%)
Mathematics: 19 (2%)
Physics and Astronomy: 18 (2%)
