September 1, 2008

Phone-based cepstral polynomial SVM system for speaker recognition

Citation

S. S. Kajarekar, “Phone-based cepstral polynomial SVM system for speaker recognition,” in Proc. 9th Annual Conference of the International Speech Communication Association 2008 (INTERSPEECH 2008), pp. 845–848.

Abstract

We have been using a phone-based cepstral system with polynomial features in NIST evaluations for the past two years. This system uses three broad phone classes, three states per class, and third-order polynomial features obtained from MFCC features. In this paper, we present a complete analysis of the system. We start from a simpler system that does not use phones or states and show that the addition of phones gives a significant improvement. We show that adding state information does not provide improvement on its own but provides a significant improvement when used with phone classes. We complete the system by applying nuisance attribute projection (NAP) and score normalization. We show that splitting features after a joint NAP over all phone classes results in a significant improvement. Overall, we obtain about 25% performance improvement with polynomial features based on phones and states, and obtain a system with performance comparable to a state-of-the-art SVM system.
Index Terms: Speaker recognition, feature extraction, pattern recognition.
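
The feature pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The function names, the toy dimensions, and the choice to average per-frame expansions within each (phone class, state) cell are all assumptions made for illustration; the abstract does not say how frames are pooled or how the NAP subspace is estimated, so `U` is simply taken as a given matrix with orthonormal columns.

```python
from itertools import combinations_with_replacement

import numpy as np


def poly_expand(frame, degree=3):
    """All monomials of the frame's entries up to `degree`, plus the constant 1."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(frame)), d):
            terms.append(float(np.prod(frame[list(idx)])))
    return np.array(terms)


def utterance_vector(frames, phone_class, state, n_classes=3, n_states=3, degree=3):
    """Pool polynomial features separately for each (phone class, state) cell.

    Assumption: pooling is a plain mean over the frames assigned to a cell.
    Cells with no frames are left as zeros.
    """
    dim = len(poly_expand(frames[0], degree))
    out = np.zeros((n_classes, n_states, dim))
    for c in range(n_classes):
        for s in range(n_states):
            mask = (phone_class == c) & (state == s)
            if mask.any():
                out[c, s] = np.mean(
                    [poly_expand(f, degree) for f in frames[mask]], axis=0
                )
    # Concatenate all cells into one vector so NAP can be applied jointly.
    return out.reshape(-1)


def nap_project(x, U):
    """Remove the component of x lying in the span of U's orthonormal columns."""
    return x - U @ (U.T @ x)
```

With 3 phone classes, 3 states, and D-dimensional frames, the concatenated vector has 9 * C(D + 3, 3) entries. In this sketch, NAP is applied once to that joint vector, and the result can then be reshaped back to `(n_classes, n_states, dim)` to obtain per-class feature blocks, mirroring the abstract's "splitting features after a joint NAP".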

