25 August 2004 Generalized dimensions applied to speaker identification
This paper describes an application of fractal dimensions to speech processing and speaker identification. Several dimensions can be used to characterize speech signals, such as the box dimension and the correlation dimension. We are mainly concerned with the generalized dimensions of speech signals, as they provide more information than any individual dimension. Generalized dimensions of arbitrary order are used for speaker identification in this work. From the experimental data, an artificial phase space is generated, and a smooth correlation integral is obtained through a straightforward and accurate analysis. Using the dimension D(2) derived from the correlation integral, the generalized dimension D(q) of arbitrary order q is calculated. Experiments applying the generalized dimensions to speaker identification have been carried out using PKU-SRSC, a Chinese-language speech corpus dedicated to speaker recognition and recorded by Peking University. The results are compared with a baseline speaker identification system that uses MFCC features. The experimental results indicate the usefulness of fractal dimensions in characterizing a speaker's identity.
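As a rough illustration of the phase-space and correlation-integral steps mentioned in the abstract, the sketch below uses the standard time-delay embedding and the Grassberger-Procaccia estimate of D(2). The abstract does not give the authors' exact procedure, so this is an assumption rather than their method: the function names, the embedding dimension `dim`, the delay `tau`, and the choice of `radii` are all hypothetical, and the step from D(2) to D(q) is not reproduced here.

```python
import numpy as np


def delay_embed(x, dim, tau):
    """Time-delay embedding: each row is one point in the reconstructed phase space."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("signal too short for this dim and tau")
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)


def correlation_integral(points, radii):
    """C(r): fraction of distinct point pairs whose Euclidean distance is below r."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep each unordered pair once (upper triangle, diagonal excluded).
    pair_dist = dist[np.triu_indices(len(points), k=1)]
    return np.array([(pair_dist < r).mean() for r in radii])


def estimate_d2(x, dim, tau, radii):
    """Estimate D(2) as the least-squares slope of log C(r) against log r."""
    radii = np.asarray(radii, dtype=float)
    c = correlation_integral(delay_embed(x, dim, tau), radii)
    mask = c > 0  # log C(r) is undefined where no pairs fall within r
    if mask.sum() < 2:
        raise ValueError("need at least two radii with C(r) > 0")
    slope, _ = np.polyfit(np.log(radii[mask]), np.log(c[mask]), 1)
    return slope
```

In practice the slope is only meaningful over the range of radii where log C(r) is approximately linear in log r, so `radii` would need to be chosen per signal. The pairwise distance matrix grows quadratically with the number of points, so a speech frame would likely be analyzed in short segments.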
© (2004) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Limin Hou, Shuozhong Wang, "Generalized dimensions applied to speaker identification", Proc. SPIE 5404, Biometric Technology for Human Identification, (25 August 2004); doi: 10.1117/12.542828; https://doi.org/10.1117/12.542828

