Comparison of weighting strategies in early and late fusion approaches to audio-visual person authentication

Harin Sellahewa; Naseer Al-Jawad; Andrew C. Morris; Dalei Wu; Jacques Koreman; Sabah A. Jassim

doi:10.1117/12.667214

2 May 2006

Proceedings Volume 6250, Mobile Multimedia/Image Processing for Military and Security Applications; 62500C (2006) https://doi.org/10.1117/12.667214
Event: Defense and Security Symposium, 2006, Orlando (Kissimmee), Florida, United States

Abstract

Person authentication can be strongly enhanced by combining different modalities. This holds for face and voice signals, which can be obtained with minimal inconvenience to the user. However, features from each modality can be combined at different levels of processing, and for face and voice signals the benefit of fusion depends strongly on how they are combined. The aim of this work is to investigate the optimal strategy for combining voice and face modalities for signals of varying quality. The experimental data are taken from a newly acquired database, recorded using a PDA, which contains audio-visual recordings in different conditions. Voice features are mel-frequency cepstral coefficients, while the face signal is parameterised using wavelet coefficients in certain subbands. Results are presented for both early (feature-level) and late (score-level) fusion. At each level, different fixed and variable weightings are used, both to weight between frames within each modality and to weight between modalities, where the weights are based on some measure of signal reliability, such as the accuracy of automatic face detection or the audio signal-to-noise ratio. In addition, the contribution of different areas of the face to authentication is explored to determine a regional weighting for the face coefficients.
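
The difference between the two fusion levels named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the weighting formulas, so the convex combination of the two modalities, the linear mapping from signal-to-noise ratio to a voice weight, and all function names and thresholds below are assumptions made only for illustration.

```python
# Illustrative sketch only. The function names, the 0-30 dB range and the
# convex-combination rule are assumptions, not taken from the paper.

def snr_to_voice_weight(snr_db, low_db=0.0, high_db=30.0):
    """Map an audio SNR in dB to a voice weight in [0, 1] (assumed linear)."""
    fraction = (snr_db - low_db) / (high_db - low_db)
    return min(1.0, max(0.0, fraction))


def late_fusion(voice_score, face_score, voice_weight):
    """Score-level fusion: weighted sum of the two classifier scores."""
    return voice_weight * voice_score + (1.0 - voice_weight) * face_score


def early_fusion(voice_features, face_features, voice_weight):
    """Feature-level fusion: scale each feature vector, then concatenate."""
    face_weight = 1.0 - voice_weight
    scaled_voice = [voice_weight * x for x in voice_features]
    scaled_face = [face_weight * x for x in face_features]
    return scaled_voice + scaled_face


# A noisy recording (low SNR) shifts the decision towards the face modality.
weight = snr_to_voice_weight(15.0)
fused_score = late_fusion(voice_score=0.8, face_score=0.2, voice_weight=weight)
fused_vector = early_fusion([2.0, 4.0], [6.0], voice_weight=weight)
```

In this sketch a single reliability-derived weight is shared by both fusion levels. A fixed weighting corresponds to passing a constant `voice_weight`, and a variable weighting corresponds to recomputing it per recording or per frame.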

Citation

Harin Sellahewa, Naseer Al-Jawad, Andrew C. Morris, Dalei Wu, Jacques Koreman, and Sabah A. Jassim "Comparison of weighting strategies in early and late fusion approaches to audio-visual person authentication", Proc. SPIE 6250, Mobile Multimedia/Image Processing for Military and Security Applications, 62500C (2 May 2006); https://doi.org/10.1117/12.667214

PROCEEDINGS
12 PAGES

KEYWORDS

Biometrics

Databases

Personal digital assistants

Wavelets

Mouth

Video

Visualization
