28 April 2010 Experimental study on GMM-based speaker recognition
Speaker recognition plays an important role in biometric security. To improve recognition performance, many pattern recognition techniques have been explored in the literature. Among these, the Gaussian Mixture Model (GMM) has proved to be an effective statistical model for speaker recognition and is used in most state-of-the-art speaker recognition systems. The GMM represents the 'voice print' of a speaker by modeling the spectral characteristics of that speaker's speech signals. In this paper, we implement a speaker recognition system consisting of preprocessing, feature extraction based on Mel-Frequency Cepstral Coefficients (MFCCs), and GMM-based classification. We test the system on the TIDIGITS data set (325 speakers) and on our own recordings of more than 200 speakers; the system achieves a 100% correct recognition rate. We also test the system in a scenario where the training samples are in one language and the test samples are in a different language; the system again achieves a 100% correct recognition rate, which indicates that it is language independent.
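The classification stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal-covariance GMMs, assumes each speaker model is already trained, and uses made-up two-dimensional "features" in place of real MFCC vectors. The names `diag_gmm_logpdf`, `score_utterance`, `identify_speaker`, and the toy models are all hypothetical. The sketch scores every frame of a test utterance against each speaker's GMM and picks the speaker with the highest total log-likelihood.

```python
import math


def diag_gmm_logpdf(x, weights, means, variances):
    """Log-density of one feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        # Log of the Gaussian normalization constant for a diagonal covariance.
        log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        # Log of the exponential term (scaled squared distance to the mean).
        log_exp = -0.5 * sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, var))
        component_logs.append(math.log(w) + log_norm + log_exp)
    # Log-sum-exp over components, for numerical stability.
    m = max(component_logs)
    return m + math.log(sum(math.exp(c - m) for c in component_logs))


def score_utterance(frames, model):
    """Total log-likelihood of all frames, treating frames as independent."""
    weights, means, variances = model
    return sum(diag_gmm_logpdf(f, weights, means, variances) for f in frames)


def identify_speaker(frames, models):
    """Return the name of the speaker model with the highest total score."""
    return max(models, key=lambda name: score_utterance(frames, models[name]))


# Hypothetical pre-trained models: (weights, means, variances) per speaker.
# Real systems would estimate these from MFCC features with EM.
models = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

# Toy stand-in for the feature vectors of one test utterance.
test_frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
print(identify_speaker(test_frames, models))
```

Summing per-frame log-likelihoods is the usual maximum-likelihood decision rule for GMM speaker identification; whether the paper uses exactly this rule, or diagonal covariances, is not stated in the abstract.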
© (2010) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Wenxing Ye, Dapeng Wu, Antonio Nucci, "Experimental study on GMM-based speaker recognition", Proc. SPIE 7708, Mobile Multimedia/Image Processing, Security, and Applications 2010, 770804 (28 April 2010); doi: 10.1117/12.849201; https://doi.org/10.1117/12.849201
