16 February 2018
Quantization and training of object detection networks with low-precision weights and activations
As convolutional neural networks have demonstrated state-of-the-art performance in object recognition and detection, there is a growing need to deploy these systems on resource-constrained mobile platforms. However, the computational burden and energy consumption of inference for these networks are significantly higher than most low-power devices can afford. To address these limitations, this paper proposes a method to train object detection networks with low-precision weights and activations. The probability density functions of the weights and activations of each layer are first estimated directly using piecewise Gaussian models. Then, the optimal quantization intervals and step sizes for each convolution layer are determined adaptively according to the distribution of its weights and activations. Because the most computationally expensive convolutions can be replaced by efficient fixed-point operations, the proposed method can drastically reduce computational complexity and memory footprint. Evaluated on the Tiny YOLO (you only look once) and YOLO architectures, the proposed method achieves accuracy comparable to that of their 32-bit counterparts. As an illustration, the proposed 4-bit and 8-bit quantized versions of the YOLO model achieve a mean average precision (mAP) of 62.6% and 63.9%, respectively, on the PASCAL Visual Object Classes (VOC) 2012 test dataset, whereas the mAP of the 32-bit full-precision baseline model is 64.0%.
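The idea of choosing a quantization step size from the distribution of a layer's weights can be illustrated with a minimal sketch. This is not the authors' method: the abstract describes fitting piecewise Gaussian models to the weights and activations, whereas the sketch below simply searches a grid of clipping ranges and keeps the step size with the lowest empirical mean squared error. The function names (`quantize`, `choose_step`), the synthetic Gaussian weights, and the grid size are all assumptions made for illustration.

```python
import random


def quantize(values, step, bits):
    """Uniform symmetric quantizer: round each value to the nearest multiple
    of `step`, then clip to the signed `bits`-bit integer range."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    out = []
    for v in values:
        q = max(qmin, min(qmax, round(v / step)))  # integer code
        out.append(q * step)                       # dequantized value
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def choose_step(values, bits, candidates=50):
    """Hypothetical helper: try `candidates` clipping ranges between 0 and
    max|value| and return the (step, error) pair with the lowest MSE.
    This empirical search stands in for the paper's density-based choice."""
    max_abs = max(abs(v) for v in values)
    qmax = 2 ** (bits - 1) - 1
    best_step, best_err = None, float("inf")
    for i in range(1, candidates + 1):
        clip = max_abs * i / candidates   # candidate clipping threshold
        step = clip / qmax                # step size implied by that threshold
        err = mse(values, quantize(values, step, bits))
        if err < best_err:
            best_step, best_err = step, err
    return best_step, best_err


# Synthetic stand-in for one layer's weights (assumed Gaussian, sigma = 0.05).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(2000)]

step4, err4 = choose_step(weights, bits=4)
step8, err8 = choose_step(weights, bits=8)
quantized4 = quantize(weights, step4, bits=4)
```

In this sketch the quantized weights are integer codes multiplied by a single per-layer step size, which is what would let a convolution run in fixed-point arithmetic with one rescaling at the end. Running the search per layer gives each layer its own step size, in the spirit of the per-layer adaptation the abstract describes.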
© 2018 SPIE and IS&T 1017-9909/2018/$25.00
Bo Yang, Jian Liu, Li Zhou, Yun Wang, and Jie Chen "Quantization and training of object detection networks with low-precision weights and activations," Journal of Electronic Imaging 27(1), 013020 (16 February 2018).
Received: 10 August 2017; Accepted: 23 January 2018; Published: 16 February 2018
