26 March 2001 Multicomponent FM demodulation of speech based on the short-time Fourier transform (STFT) phase
Abstract
Speech is a signal produced when a combination of frication and a quasi-periodic train of glottal pulses excites the vocal tract and causes it to resonate. Information is encoded on the signal as the vocal tract changes configuration, resulting in rapid changes in the resonant frequencies. We develop methods, based on differentiation of the short-time Fourier transform (STFT) phase, which effectively demodulate the speech signal and produce accurate, high-resolution time-frequency estimates of both the resonances and the signal excitation. The methods effectively condense the STFT surface along curves representing the instantaneous frequencies of the vocal tract resonances and the channel group delay function.
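The phase-differentiation idea can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the time derivative of the STFT phase is approximated by the phase difference between two STFTs taken one sample apart, and the function names (`stft`, `channelized_instantaneous_frequency`), the Hann window, and the window and hop lengths are all hypothetical choices made for illustration.

```python
import numpy as np

def stft(x, win_len=256, hop=64):
    """Hann-windowed STFT; returns an array of shape (frames, win_len // 2 + 1)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def channelized_instantaneous_frequency(x, fs, win_len=256, hop=64):
    """Approximate d(phase)/dt in every STFT bin, in Hz.

    Assumption: the derivative is replaced by the phase advance over one
    sample, computed from two STFTs whose frames are offset by one sample.
    """
    s0 = stft(x[:-1], win_len, hop)  # frames starting at sample n
    s1 = stft(x[1:], win_len, hop)   # frames starting at sample n + 1
    # The angle of the cross-spectrum is the wrapped phase advance per sample.
    dphi = np.angle(s1 * np.conj(s0))
    return dphi * fs / (2.0 * np.pi), np.abs(s0)
```

For a steady tone, the estimate in the strongest bin should land close to the true frequency even when that frequency falls between bin centers, which is the sense in which the STFT surface is condensed onto instantaneous-frequency curves. A group delay estimate could be sketched analogously by differencing the phase across adjacent frequency bins instead of across time.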
© (2001) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Douglas J. Nelson, "Multicomponent FM demodulation of speech based on the short-time Fourier transform (STFT) phase", Proc. SPIE 4391, Wavelet Applications VIII, (26 March 2001); doi: 10.1117/12.421230; https://doi.org/10.1117/12.421230
PROCEEDINGS
12 PAGES


