August 1, 2011

Factor analysis back ends for MLLR transforms in speaker recognition

Citation

N. Scheffer, Y. Lei, and L. Ferrer, “Factor analysis back ends for MLLR transforms in speaker recognition,” in Proc. Interspeech, 2011, pp. 257–260.

Abstract

The purpose of this work is to show how recent developments in cepstral-based systems for speaker recognition can be leveraged for the use of Maximum Likelihood Linear Regression (MLLR) transforms. Speaker recognition systems based on MLLR transforms have been shown to be greatly beneficial in combination with standard systems, but most of the advances in speaker modeling techniques have been implemented for cepstral features. We show how these advances, based on Factor Analysis, such as eigenchannel and ivector, can be easily employed to achieve very high accuracy. We show that they outperform the current state-of-the-art MLLR-SVM system that SRI submitted during the NIST SRE 2010 evaluation. The advantages of leveraging the new approaches are manifold: the ability to process a large amount of data, to work in a reduced dimensional space, and to import any advances made for cepstral systems to the MLLR features, as well as the potential for system combination at the ivector level.

Index Terms: speaker verification, MLLR, factor analysis

