17 March 2017 Speaker gender identification based on majority vote classifiers
Proceedings Volume 10341, Ninth International Conference on Machine Vision (ICMV 2016); 103410A (2017); doi: 10.1117/12.2268741
Event: Ninth International Conference on Machine Vision, 2016, Nice, France
Abstract
Speaker gender identification is considered among the most important tools in several multimedia applications, namely automatic speech recognition, interactive voice response systems and audio browsing systems. The performance of a gender identification system is closely linked to the selected feature set and the employed classification model. Typical techniques select the best-performing classification method or search for the optimal tuning of a single classifier's parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs and other temporal and frequency-domain descriptors. Five classification models were evaluated: decision tree, discriminant analysis, naive Bayes, support vector machine and k-nearest neighbor. The three best-performing classifiers among the five are then combined by majority voting over their scores. Experiments were performed on three datasets spoken in three languages (English, German and Arabic) in order to validate the language independence of the proposed scheme. Results confirm that the presented system reaches a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the features used, combined with mid-level statistics.
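The combination step described in the abstract (keep the three best of five classifiers, then take a majority vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the accuracy values and the per-sample predictions below are hypothetical, and the sketch votes over hard class labels, whereas the paper may combine classifier scores differently.

```python
from collections import Counter


def top_k_classifiers(validation_accuracy, k=3):
    """Return the names of the k classifiers with the highest validation accuracy."""
    ranked = sorted(validation_accuracy, key=validation_accuracy.get, reverse=True)
    return ranked[:k]


def majority_vote(predictions, selected):
    """Return, for each sample, the label predicted by most of the selected classifiers."""
    n_samples = len(predictions[selected[0]])
    voted = []
    for i in range(n_samples):
        votes = Counter(predictions[name][i] for name in selected)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Hypothetical validation accuracies for the five models named in the abstract.
validation_accuracy = {
    "decision_tree": 0.88,
    "discriminant_analysis": 0.90,
    "naive_bayes": 0.85,
    "svm": 0.95,
    "knn": 0.92,
}

# Hypothetical per-sample label predictions ("M" or "F") on four test utterances.
predictions = {
    "decision_tree": ["M", "F", "F", "M"],
    "discriminant_analysis": ["M", "F", "M", "M"],
    "naive_bayes": ["F", "F", "F", "F"],
    "svm": ["M", "F", "M", "F"],
    "knn": ["M", "M", "M", "F"],
}

selected = top_k_classifiers(validation_accuracy)
final_labels = majority_vote(predictions, selected)
print(selected)
print(final_labels)
```

With an odd number of voters and two classes, a tie cannot occur, which is one practical reason to vote over three classifiers rather than four.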
© (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Eya Mezghani, Maha Charfeddine, Henri Nicolas, Chokri Ben Amar, "Speaker gender identification based on majority vote classifiers", Proc. SPIE 10341, Ninth International Conference on Machine Vision (ICMV 2016), 103410A (17 March 2017); doi: 10.1117/12.2268741; https://doi.org/10.1117/12.2268741
PROCEEDINGS
5 PAGES


KEYWORDS
Feature extraction

System identification

Classification systems

Multimedia

Performance modeling

Speech recognition

Binary data
