Image annotation by deep neural networks with attention shaping
21 July 2017
Proceedings Volume 10420, Ninth International Conference on Digital Image Processing (ICDIP 2017); 104201W (2017) https://doi.org/10.1117/12.2281747
Event: Ninth International Conference on Digital Image Processing (ICDIP 2017), 2017, Hong Kong, China
Abstract
Image annotation is the task of assigning semantic labels to an image. Recently, deep neural networks with visual attention have been applied successfully to many computer vision tasks. In this paper, we show that the conventional attention mechanism is easily misled by the salient class: across attention iterations, the attended region always contains part of the image area that depicts the salient class. To address this, we propose a novel attention shaping mechanism, which aims to maximize the non-overlapping area between consecutive attention processes by taking into account the history of previous attention vectors. Several weighting policies are studied to utilize the history information in different ways. On two benchmark datasets, PASCAL VOC2012 and MIRFlickr-25k, the average precision is improved by up to 10% in comparison with state-of-the-art annotation methods.
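The abstract does not give the formulation of attention shaping, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes attention is a softmax over per-region scores and that the history of previous attention vectors is used to subtract a penalty from regions that were already attended. The function names, the policy names ("uniform", "decay", "last"), and the `strength` and `decay` parameters are all hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw region scores into an attention vector that sums to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def history_weights(n, policy="uniform", decay=0.5):
    """Hypothetical weighting policies over n past attention vectors (oldest first)."""
    if n == 0:
        return []
    if policy == "uniform":
        raw = [1.0] * n                                  # all past steps count equally
    elif policy == "decay":
        raw = [decay ** (n - 1 - i) for i in range(n)]   # recent steps count more
    elif policy == "last":
        raw = [0.0] * (n - 1) + [1.0]                    # only the previous step counts
    else:
        raise ValueError(f"unknown policy: {policy}")
    total = sum(raw)
    return [r / total for r in raw]


def shaped_attention(scores, history, policy="uniform", strength=5.0):
    """Assumed shaping rule: lower the score of regions attended in earlier steps."""
    weights = history_weights(len(history), policy)
    attended = [
        sum(w * past[i] for w, past in zip(weights, history))
        for i in range(len(scores))
    ]
    return softmax([s - strength * a for s, a in zip(scores, attended)])


def overlap(a, b):
    """Shared attention mass between two attention vectors (0 = disjoint, 1 = identical)."""
    return sum(min(x, y) for x, y in zip(a, b))


# Region 0 plays the role of the salient class: unshaped attention returns to it every step.
scores = [4.0, 1.0, 0.5, 0.2]
first = softmax(scores)
second = shaped_attention(scores, [first])
print(overlap(first, first), overlap(first, second))
```

In this toy setting the unshaped second step would repeat the first exactly, while the shaped step moves most of its attention mass to regions that were not attended before, which lowers the overlap between consecutive steps.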
© (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Kexin Zheng, Shaohe Lv, Fang Ma, Fei Chen, Chi Jin, Yong Dou, "Image annotation by deep neural networks with attention shaping", Proc. SPIE 10420, Ninth International Conference on Digital Image Processing (ICDIP 2017), 104201W (21 July 2017); doi: 10.1117/12.2281747; https://doi.org/10.1117/12.2281747
PROCEEDINGS
7 PAGES


