25 March 2003 Selective visual attention in object detection processes
Object detection is an enabling technology that plays a key role in many application areas, such as content-based media retrieval. We propose attentive cognitive vision systems in which the focus of attention is directed towards the most relevant target. The most promising information is interpreted in a sequential process that dynamically makes use of knowledge and enables spatial reasoning on the local object information. The presented work proposes an innovative application of attention mechanisms for object detection that is general in its treatment of information and action selection. The attentive detection system uses a cascade of increasingly complex classifiers for the stepwise identification of regions of interest (ROIs) and recursively refined object hypotheses. The coarsest classifiers determine first approximations of a region of interest in the input image, while more complex classifiers are applied to the refined ROIs to give more confident estimates. Objects are modelled by local appearance-based representations and in terms of posterior distributions of the object samples in eigenspace. The discrimination function used to distinguish between objects is modelled by a radial basis function (RBF) network, which has been compared with alternative networks and shown to be consistent and superior to other artificial neural networks for appearance-based object recognition. The experiments were conducted on the automatic detection of brand objects in Formula One broadcasts within the European Commission's cognitive vision project DETECT.
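The coarse-to-fine cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the thresholds, and the use of a nearest-prototype Gaussian activation in place of a trained RBF network are all assumptions made for illustration. Each stage projects candidate patches into an eigenspace of a given dimension, scores them against object prototypes, and passes only the surviving candidates on to the next, more detailed stage.

```python
import numpy as np

def fit_eigenspace(patches, n_components):
    """Estimate a PCA eigenspace from flattened patches (one patch per row)."""
    mean = patches.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(patches, mean, basis):
    """Project patches onto the eigenspace spanned by the rows of `basis`."""
    return (patches - mean) @ basis.T

def rbf_score(z, centers, width):
    """Highest Gaussian RBF activation over the object prototypes `centers`.

    Assumption: a plain prototype score stands in for a trained RBF network.
    """
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2)).max(axis=1)

def cascade_detect(patches, stages):
    """Run candidate patches through a coarse-to-fine cascade.

    `stages` is a list of (mean, basis, centers, width, threshold) tuples,
    ordered from the coarsest (fewest eigenspace dimensions) to the finest.
    Returns the indices of the patches that survive every stage.
    """
    alive = np.arange(len(patches))
    for mean, basis, centers, width, threshold in stages:
        if alive.size == 0:
            break  # nothing left to refine
        z = project(patches[alive], mean, basis)
        alive = alive[rbf_score(z, centers, width) >= threshold]
    return alive
```

In this sketch the cheap early stages use few eigenspace dimensions and discard most candidates, so the more expensive later stages only evaluate the refined regions of interest. A stage could be built, for example, by calling `fit_eigenspace` on training patches and using the projected object samples as `centers`; the choice of `width` and `threshold` per stage is left open here.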
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Lucas Paletta, Anurag Goyal, and Christian Greindl, "Selective visual attention in object detection processes", Proc. SPIE 5015, Applications of Artificial Neural Networks in Image Processing VIII, (25 March 2003).


