Psychoacoustic speech feature optimization through adaptive generalized scale transforms
3 September 2008
We present a method for improving small-scale, text-independent automatic speaker identification systems. A small-scale identification system is one with a relatively small number of enrolled speakers (20 or fewer). The improvement is obtained through adaptive frequency warping. Most modern speaker identification systems employ a short-time speech feature extraction method that relies on frequency-warped cepstral representations. One of the most popular frequency warpings is based on the mel scale. While the mel scale provides a substantial boost in recognition performance for large-scale systems, it is suboptimal for small-scale systems. Our experiments show that the proposed methodology has the potential to reduce the error rate of small-scale systems by 24% compared with the mel-scale approach.
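For readers unfamiliar with frequency warping, the following minimal Python sketch illustrates the general idea only. The abstract does not specify the paper's generalized scale transform, so the one-parameter family `warp` and its `break_hz` parameter are hypothetical stand-ins chosen for illustration, as are all function names. The mel formula used is the common `2595 * log10(1 + f / 700)` variant. Setting `break_hz` to 700 reproduces mel spacing, and very large values approach a linear frequency axis.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale formula: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def warp(f_hz, break_hz, f_max):
    """Hypothetical one-parameter warp (an assumption, not the paper's transform).

    Normalized so that warp(0) == 0 and warp(f_max) == f_max.
    """
    return f_max * math.log1p(f_hz / break_hz) / math.log1p(f_max / break_hz)


def band_centers(n_bands, break_hz, f_max):
    """Center frequencies (Hz) of n_bands spaced uniformly on the warped axis."""
    scale = math.log1p(f_max / break_hz)
    centers = []
    for k in range(1, n_bands + 1):
        u = k / (n_bands + 1)  # uniform position on the warped axis, in (0, 1)
        # Invert the warp to map the warped position back to Hz.
        centers.append(break_hz * math.expm1(u * scale))
    return centers


if __name__ == "__main__":
    # Compare filterbank center placement for a few warping parameters.
    for b in (200.0, 700.0, 5000.0):
        rounded = [round(c) for c in band_centers(8, b, 8000.0)]
        print(f"break_hz={b:>6}: {rounded}")
```

An adaptive scheme in this spirit would select the warping parameter that minimizes identification error on held-out enrollment data, rather than fixing it at the mel value in advance.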
© (2008) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Robert M. Nickel, "Psychoacoustic speech feature optimization through adaptive generalized scale transforms", Proc. SPIE 7074, Advanced Signal Processing Algorithms, Architectures, and Implementations XVIII, 70740U (3 September 2008); doi: 10.1117/12.795465; https://doi.org/10.1117/12.795465

