7 June 1995 Pitch-based methods for speech detection and automatic frequency recovery
There are many applications for which it is desirable to reliably detect the presence of speech. Examples include speech compression, voice-activated devices and machine speech recognition. In this paper, a method of speech detection is developed which uses a frequency-domain pitch-based signal-to-noise ratio (SNR) estimate. This method takes full advantage of the spectral structure of pitch, which is the primary speech excitation function. The primary output of the detection algorithm is a decision that speech is present or not present. In addition, the algorithm provides an estimate of the speech SNR which may be used to estimate signal quality. This SNR estimate is important for applications such as estimating the reliability of machine-based recognition processes. Additional advantages of this method are that it is independent of signal gain and that it works well under adverse conditions such as poor SNR and the presence of interference. A by-product of the pitch-based detection process is a method for automatic recovery of the frequency offset of mistuned analog speech. Mistuning is a condition which can arise in the demodulation of single-sideband amplitude-modulated (SSB-AM) speech if the precise carrier is not used in the demodulation process. This can cause severe problems, since speech becomes nearly unintelligible if it is mistuned by more than 100 Hz. The methods presented here use a double complex correlation of the complex speech spectrum to recover the carrier offset. This process provides significantly better resolution than more conventional correlation processes based on the speech power spectrum.
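The abstract does not give the details of the SNR estimate, so the following is only a minimal sketch of the general idea of a frequency-domain, pitch-based SNR measure, not the authors' algorithm. It assumes a simple harmonic-comb ratio: for each candidate pitch, the mean power in bins near the pitch harmonics is divided by the mean power in the remaining bins of the same band. The function name `harmonic_snr_db`, the pitch search range, the number of harmonics, the 1 Hz search step and the 10 dB decision threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def harmonic_snr_db(frame, fs, f0_min=80.0, f0_max=400.0, n_harm=8):
    """Illustrative comb-based SNR estimate in dB (assumed scheme, not from the paper)."""
    n = len(frame)
    # Power spectrum of the Hann-windowed frame.
    power = np.abs(np.fft.rfft(frame * np.hanning(n))) ** 2
    df = fs / n  # frequency spacing of the FFT bins in Hz
    best = -np.inf
    # Search candidate pitch values on a 1 Hz grid (hypothetical step size).
    for f0 in np.arange(f0_min, f0_max, 1.0):
        centers = np.round(np.arange(1, n_harm + 1) * f0 / df).astype(int)
        centers = centers[centers + 1 < len(power)]
        if centers.size == 0:
            continue
        # Mark each harmonic bin and its two neighbours as "harmonic".
        mask = np.zeros(len(power), dtype=bool)
        for c in centers:
            mask[max(c - 1, 0):c + 2] = True
        band = slice(1, centers[-1] + 2)  # skip DC, stop after the last harmonic
        harm = power[band][mask[band]].mean()
        noise = power[band][~mask[band]].mean()
        # A ratio of two powers is unchanged when the frame is rescaled.
        best = max(best, harm / (noise + np.finfo(float).tiny))
    return 10.0 * np.log10(best)

# Synthetic demonstration: eight harmonics of a 200 Hz pitch in white noise.
SPEECH_THRESHOLD_DB = 10.0  # hypothetical decision threshold
rng = np.random.default_rng(0)
fs, n = 8000, 1024
t = np.arange(n) / fs
noise = rng.standard_normal(n)
voiced = sum(np.sin(2 * np.pi * k * 200.0 * t) for k in range(1, 9)) + noise

snr_voiced = harmonic_snr_db(voiced, fs)
snr_noise = harmonic_snr_db(noise, fs)
snr_scaled = harmonic_snr_db(10.0 * voiced, fs)  # same frame with 20 dB more gain

print("voiced frame:", snr_voiced > SPEECH_THRESHOLD_DB)
print("noise frame: ", snr_noise > SPEECH_THRESHOLD_DB)
```

Because the measure is a ratio of powers taken from the same spectrum, rescaling the input leaves it unchanged, which is one simple way a detector can be independent of signal gain as the abstract describes.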
© (1995) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Douglas J. Nelson and Joseph Pencak "Pitch-based methods for speech detection and automatic frequency recovery", Proc. SPIE 2563, Advanced Signal Processing Algorithms, (7 June 1995);
