8 June 2001 Subjective analysis of a HMM-based visual speech synthesizer
Jay J. Williams, Aggelos K. Katsaggelos, Dean C. Garstecki
Proceedings Volume 4299, Human Vision and Electronic Imaging VI; (2001) https://doi.org/10.1117/12.429527
Event: Photonics West 2001 - Electronic Imaging, 2001, San Jose, CA, United States
Abstract
Emerging broadband communication systems promise a future of multimedia telephony. Adding visual information to telephone conversations, for example, would be most beneficial to people with impaired hearing, who could use it for speech reading, with the visual information derived from the speech signal carried by existing narrowband communication systems. A Hidden Markov Model (HMM)-based visual speech synthesizer is designed to improve speech understanding. The key elements in the application of HMMs to this problem are: a) the decomposition of the overall modeling task into key stages; and b) the judicious determination of the components of the observation vector for each stage. The main contribution of this paper is the development of a novel correlation HMM model that is able to integrate independently trained acoustic and visual HMMs for speech-to-visual synthesis. This model allows increased flexibility in choosing model topologies for the acoustic and visual HMMs. It also reduces the amount of required training data compared to early integration modeling techniques. Results from objective and subjective analysis show that an HMM correlating model can significantly decrease audio-visual synchronization errors and increase speech understanding.
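To make the idea of linking separately trained acoustic and visual models more concrete, the following toy sketch may help. It is not the authors' correlation HMM: it assumes a discrete-observation acoustic HMM decoded with the standard Viterbi algorithm, and it replaces the paper's correlating model with a simple, hypothetical lookup table `corr[a][v]` (an assumed probability of visual state `v` given acoustic state `a`). All numbers, state meanings, and the names `viterbi`, `acoustic_to_visual`, `corr`, and `mouth_opening` are illustrative assumptions.

```python
import math


def _log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state path of a discrete-observation HMM (log domain)."""
    n_states = len(start_p)
    # score[s] = best log-probability of any state path ending in state s
    score = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + _log(trans_p[r][s]))
            ptr.append(best)
            score.append(prev[best] + _log(trans_p[best][s]) + _log(emit_p[s][o]))
        back.append(ptr)
    # Backtrack from the best final state to recover the full path.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def acoustic_to_visual(acoustic_path, corr):
    """Map each acoustic state to its most probable visual state.

    corr[a][v] is a HYPOTHETICAL table standing in for the correlating model.
    """
    return [max(range(len(corr[a])), key=lambda v: corr[a][v]) for a in acoustic_path]


# Toy acoustic HMM (assumed values): state 0 ~ closed-lip sound, state 1 ~ open vowel.
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3], [0.4, 0.6]]
emit_p = [[0.9, 0.1], [0.2, 0.8]]  # emit_p[state][quantized acoustic symbol]
obs = [0, 0, 1, 1]                 # quantized acoustic symbols, one per frame

# The visual model has three states, so the two topologies need not match.
corr = [[0.80, 0.15, 0.05], [0.10, 0.20, 0.70]]
mouth_opening = [0.0, 0.5, 1.0]    # assumed visual parameter per visual state

acoustic_path = viterbi(obs, start_p, trans_p, emit_p)
visual_path = acoustic_to_visual(acoustic_path, corr)
visual_track = [mouth_opening[v] for v in visual_path]
```

In this sketch the acoustic and visual sides share nothing except the `corr` table, which loosely mirrors the abstract's point that the two HMMs can be trained independently and can use different topologies. A hard per-frame lookup ignores visual dynamics, which a real correlating model would need to capture.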
© (2001) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jay J. Williams, Aggelos K. Katsaggelos, and Dean C. Garstecki "Subjective analysis of a HMM-based visual speech synthesizer", Proc. SPIE 4299, Human Vision and Electronic Imaging VI, (8 June 2001); https://doi.org/10.1117/12.429527
CITATIONS
Cited by 5 scholarly publications.
KEYWORDS
Visualization
Acoustics
Signal to noise ratio
Visual process modeling
Laser induced plasma spectroscopy
Information visualization
Data modeling
