20 March 2013 A speaker change detection method based on coarse searching
Proceedings Volume 8768, International Conference on Graphic and Image Processing (ICGIP 2012); 87681S (2013) https://doi.org/10.1117/12.2010844
Event: 2012 International Conference on Graphic and Image Processing, 2012, Singapore, Singapore
The conventional speaker change detection (SCD) method based on the Bayesian Information Criterion (BIC) is widely used. However, its performance depends on the choice of penalty factor, and it is computationally expensive. The two-step SCD method is less time-consuming but produces more detection errors. The performance limitation of the conventional method originates from its use of two adjacent data windows. We propose a strategy that inserts an interval between the two fixed-size data windows in each analysis window. The dissimilarity value between the two data windows is treated as the probability that the speaker identity changes within the interval. The analysis window is then slid along the audio with a large step to locate the areas where speaker change points may appear. Afterwards, only these areas are examined to locate the change points precisely; areas where a speaker change point is unlikely to appear are discarded. The proposed method is computationally efficient and more robust to noise and to the choice of penalty factor than the conventional method. Evaluated on a corpus of China Central Television (CCTV) news, the proposed method achieves a 74.18% reduction in calculation time and a 22.24% improvement in F1-measure compared with the conventional approach.
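The coarse-search stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`log_var_sum`, `delta_bic`, `coarse_search`), the window, gap and step sizes, and the decision rule (flag the interval when ΔBIC is positive) are all assumptions made for illustration. The dissimilarity used here is the standard ΔBIC between two single-Gaussian models, simplified to diagonal covariances so that the example needs only the Python standard library; the abstract does not specify these details.

```python
import math

def log_var_sum(frames):
    """Sum of per-dimension log variances, i.e. the log-determinant
    of a diagonal covariance matrix estimated from the frames."""
    n, d = len(frames), len(frames[0])
    total = 0.0
    for k in range(d):
        col = [f[k] for f in frames]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        total += math.log(var + 1e-6)  # small floor avoids log(0)
    return total

def delta_bic(x, y, penalty=1.0):
    """Delta-BIC between two blocks of feature frames (lists of vectors).
    Positive values favour modelling x and y as two different sources.
    Assumption: diagonal-covariance Gaussians, 2*d parameters per model."""
    n1, n2, d = len(x), len(y), len(x[0])
    n = n1 + n2
    gain = 0.5 * (n * log_var_sum(x + y)
                  - n1 * log_var_sum(x)
                  - n2 * log_var_sum(y))
    return gain - penalty * 0.5 * (2 * d) * math.log(n)

def coarse_search(feats, win=100, gap=50, step=50, penalty=1.0):
    """Slide a [window | gap | window] analysis window by a large step and
    return the (start, end) frame ranges of gaps flagged as candidate
    change regions. All sizes are hypothetical defaults, in frames."""
    span = 2 * win + gap
    candidates = []
    for s in range(0, len(feats) - span + 1, step):
        left = feats[s:s + win]
        right = feats[s + win + gap:s + span]
        if delta_bic(left, right, penalty) > 0:
            candidates.append((s + win, s + win + gap))
    return candidates
```

Only the returned candidate regions would then be passed to a fine-grained search for the exact change point; everything else is skipped, which is where the reduction in computation would come from. Because the two data windows are separated by the gap, a window that straddles a change point can also trigger a flag on a neighbouring gap, so a follow-up stage would still need to resolve the precise location.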
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xue-yuan Zhang, Qian-hua He, Yan-xiong Li, Jun He, "A speaker change detection method based on coarse searching", Proc. SPIE 8768, International Conference on Graphic and Image Processing (ICGIP 2012), 87681S (20 March 2013); doi: 10.1117/12.2010844; https://doi.org/10.1117/12.2010844
