Paper
17 April 2019 Combination of GMM-UBM and DTW for voice command authentication system
Evelyn Kurniawati, Sasiraj Somarajan
Proceedings Volume 11071, Tenth International Conference on Signal Processing Systems; 110710D (2019) https://doi.org/10.1117/12.2520442
Event: Tenth International Conference on Signal Processing Systems, 2018, Singapore, Singapore
Abstract
In this paper, we present a combination of statistical and template-based pattern matching to solve the problem of authentication with very short command words. The same features are used in both methods to reduce the computational load. The first method uses GMM-UBM (Gaussian Mixture Model with Universal Background Model), which is well known in the speaker recognition field but lacks the ability to model the temporal aspect of speech. The second method remedies this with classical DTW (Dynamic Time Warping) on the cepstrum features. Two schemes for combining the models are explored: first, a layered design in which the DTW distance is calculated only if GMM-UBM accepts the speaker, and second, weighting the DTW distance by the confidence of the GMM-UBM result. With these combinations, improvements in EER of 23% and 17%, respectively, were observed, each with differing characteristics across the 3 error types that are investigated. The experiment was conducted on the evaluation set of Part 2 of the RSR2015 database, which contains short words meant for command and control tasks. Performance analysis is done using the Detection Error Tradeoff (DET) curve and the Equal Error Rate (EER).
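The two combination schemes described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the length normalization of the DTW distance, the sigmoid mapping from a GMM-UBM log-likelihood ratio to a confidence value, the weighting rule, and all function and parameter names (dtw_distance, layered_decision, weighted_decision, gmm_llr, scale) are assumptions made for illustration. The GMM-UBM score is taken as an input rather than computed.

```python
import math

def dtw_distance(a, b):
    """Classical DTW between two sequences of feature vectors.

    Uses Euclidean frame distance and divides the accumulated cost by
    len(a) + len(b) (an assumed normalization, not taken from the paper).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)

def layered_decision(gmm_llr, template, test, gmm_threshold, dtw_threshold):
    """Scheme 1 (layered): run DTW only if GMM-UBM accepts the speaker."""
    if gmm_llr < gmm_threshold:
        return False  # rejected by GMM-UBM; DTW is never computed
    return dtw_distance(template, test) <= dtw_threshold

def weighted_decision(gmm_llr, template, test, gmm_threshold, dtw_threshold, scale=1.0):
    """Scheme 2 (weighted): scale the DTW distance by GMM-UBM confidence.

    Assumed rule: confidence is a sigmoid of the score margin, and the
    distance is multiplied by 2 * (1 - confidence), so a score exactly at
    the GMM threshold leaves the distance unchanged.
    """
    confidence = 1.0 / (1.0 + math.exp(-scale * (gmm_llr - gmm_threshold)))
    weighted = dtw_distance(template, test) * 2.0 * (1.0 - confidence)
    return weighted <= dtw_threshold

# Toy one-dimensional "cepstral" sequences, purely illustrative.
template = [(0.0,), (1.0,), (2.0,)]
stretched = [(0.0,), (1.0,), (1.0,), (2.0,)]   # same pattern, slower
different = [(5.0,), (5.0,), (5.0,)]
```

In the layered scheme the DTW cost is paid only for trials that pass the first stage, whereas the weighted scheme always computes DTW but lets a confident GMM-UBM score relax or tighten the effective distance. Under the assumed weighting rule, a very confident GMM-UBM score can push a poor DTW match below the threshold, which is one way the two schemes could end up with different error characteristics.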
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Evelyn Kurniawati and Sasiraj Somarajan "Combination of GMM-UBM and DTW for voice command authentication system", Proc. SPIE 11071, Tenth International Conference on Signal Processing Systems, 110710D (17 April 2019); https://doi.org/10.1117/12.2520442
KEYWORDS
Speaker recognition
Speech recognition
Error analysis
Statistical analysis