This paper presents an audiovisual face detection system that combines microphone-based sound localization with
image processing algorithms. The system integrates sound localization by Time Delay of Arrival with the iterative
application of Adaptive Background Segmentation to perform robust, real-time face detection on a stream of
webcam images.
Experimental results using an array of 24 microphones and a fixed-view webcam show that the audiovisual face
detection system achieves a face detection success rate of 97.5%, with a convergence time of 0.82 seconds and a
display frame rate of 5.8 Hz, on a 2.5 GHz Pentium IV.
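
To illustrate the sound localization step, the following is a minimal sketch of Time Delay of Arrival estimation for a single pair of microphones. It is not the implementation evaluated in this paper: the function names `estimate_delay` and `bearing_from_delay`, the cross-correlation peak estimator, the far-field assumption, and the speed of sound of 343 m/s are all assumptions made for illustration.

```python
import numpy as np


def estimate_delay(sig_a, sig_b, sample_rate):
    """Estimate how many seconds sig_b lags sig_a (hypothetical helper).

    Uses the peak of the full cross-correlation; a positive result means
    the sound reached microphone B after microphone A.
    """
    corr = np.correlate(sig_b, sig_a, mode="full")
    # Index 0 of the full correlation corresponds to lag -(len(sig_a) - 1).
    lag_samples = int(np.argmax(corr)) - (len(sig_a) - 1)
    return lag_samples / sample_rate


def bearing_from_delay(delay_s, mic_spacing_m, speed_of_sound=343.0):
    """Convert a pairwise delay to a far-field bearing in degrees.

    Assumes a distant source, so the path difference is
    mic_spacing_m * sin(bearing). The ratio is clipped so that noisy
    delays cannot push arcsin outside its domain.
    """
    ratio = np.clip(speed_of_sound * delay_s / mic_spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


# Synthetic check: a white-noise source delayed by 5 samples at 16 kHz.
rng = np.random.default_rng(0)
source = rng.standard_normal(1024)
delayed = np.concatenate([np.zeros(5), source[:-5]])
delay = estimate_delay(source, delayed, sample_rate=16000)
bearing = bearing_from_delay(delay, mic_spacing_m=0.2)
```

With a larger array, one pairwise delay per microphone pair could be combined, for example by least squares, to estimate the speaker position that then guides the visual search.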