25 October 1994 Hybrid vector quantization/neural tree network classifiers for speaker recognition
Kevin R. Farrell, Richard J. Mammone
Abstract
A new classification system for text-independent speaker recognition is presented. Text-independent speaker recognition systems generally model each speaker with a single classifier. Traditional methods use unsupervised training algorithms, such as vector quantization (VQ), to model each speaker. Such methods base their decision on the distortion between an observation and the speaker model. Recently, supervised training algorithms, such as neural networks, have been successfully applied to speaker recognition. Here, each speaker is represented by a neural network. Because they are trained discriminatively, neural networks capture the differences between speakers and use these differences as the criterion for decision making. Hence, the output of a neural network can be considered an interclass measure. The VQ classifier, on the other hand, uses a distortion that is independent of the other speaker models and can be considered an intraclass measure. Since these two measures are based on different criteria, they can be effectively combined to yield improved performance. This paper uses data fusion concepts to combine the outputs of the neural tree network and VQ classifiers. The combined system is evaluated for text-independent speaker identification and verification and is shown to outperform either classifier used individually.
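The idea of combining an intraclass distortion with an interclass network output can be illustrated with a minimal sketch. The abstract does not state which fusion rule the paper uses, so the code below assumes a simple linear opinion pool (a weighted sum of the two classifiers' scores). The function names, the weight `alpha`, the softmax-style mapping from distortions to scores, and the toy data are all hypothetical. The neural tree network itself is not implemented; its per-speaker outputs are taken as given.

```python
import math

def vq_distortion(frames, codebook):
    """Average squared Euclidean distance from each frame to its nearest codeword."""
    total = 0.0
    for x in frames:
        total += min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook)
    return total / len(frames)

def distortions_to_scores(distortions):
    """Map distortions (lower is better) to scores that sum to 1 (assumed mapping)."""
    lowest = min(distortions)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - lowest)) for d in distortions]
    norm = sum(weights)
    return [w / norm for w in weights]

def fuse(vq_scores, nn_scores, alpha=0.5):
    """Linear opinion pool: convex combination of the two score lists (assumed rule)."""
    return [alpha * v + (1.0 - alpha) * n for v, n in zip(vq_scores, nn_scores)]

def identify(frames, codebooks, nn_scores, alpha=0.5):
    """Return the index of the speaker with the highest fused score, plus all scores."""
    distortions = [vq_distortion(frames, cb) for cb in codebooks]
    fused = fuse(distortions_to_scores(distortions), nn_scores, alpha)
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused

# Toy data: two feature frames, one VQ codebook per speaker, and hypothetical
# per-speaker outputs standing in for the neural tree network.
frames = [[0.0, 0.0], [1.0, 1.0]]
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # speaker 0
    [[5.0, 5.0]],              # speaker 1
]
nn_scores = [0.6, 0.4]
best, fused = identify(frames, codebooks, nn_scores)
```

For verification rather than identification, the fused score of the claimed speaker would be compared against a decision threshold instead of taking the maximum over all speakers.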
© (1994) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Kevin R. Farrell and Richard J. Mammone "Hybrid vector quantization/neural tree network classifiers for speaker recognition", Proc. SPIE 2277, Automatic Systems for the Identification and Inspection of Humans, (25 October 1994); https://doi.org/10.1117/12.191879
KEYWORDS
Speaker recognition
Data fusion
Neural networks
Systems modeling
Distortion
Quantization
Data modeling