Author: Mitchell McLaren
-
Application of Convolutional Neural Networks to Language Identification in Noisy Conditions
This paper proposes two novel frontends for robust language identification (LID) using a convolutional neural network (CNN) trained for automatic speech recognition (ASR).
-
Simplified VTS-Based I-Vector Extraction in Noise-Robust Speaker Recognition
In this work, we propose an efficient simplification scheme, named sVTS, in order to show that the VTS approach gives improvements in large scale applications compared to state-of-the-art systems.
-
A Novel Scheme for Speaker Recognition Using a Phonetically-Aware Deep Neural Network
We propose a novel framework for speaker recognition in which extraction of sufficient statistics for the state-of-the-art i-vector model is driven by a deep neural network (DNN) trained for automatic speech recognition (ASR).
-
Quality Measure Functions for Calibration of Speaker Recognition Systems in Various Duration Conditions
This paper investigates the effect of utterance duration on the calibration of a modern i-vector speaker recognition system with probabilistic linear discriminant analysis (PLDA) modeling.
-
Recent Developments in Voice Biometrics: Robustness and High Accuracy
We highlight SRI’s innovations that resulted from the IARPA Biometrics Exploitation Science & Technology (BEST) and the DARPA Robust Automatic Transcription of Speech (RATS) programs, as well as SRI’s approach for codec-degraded speech.
-
Improving Language Identification Robustness to Highly Channel-Degraded Speech through Multiple System Fusion
We describe a language identification system developed for robustness to noise conditions such as those encountered under the DARPA RATS program, which is focused on multi-channel audio collected in high-noise conditions.
-
Modulation features for noise robust speaker identification
In this paper, we present a robust acoustic feature on top of robust modeling techniques to further improve speaker identification performance.
-
A Noise-Robust System for NIST 2012 Speaker Recognition Evaluation
This paper presents SRI’s submission along with a careful analysis of the approaches that provided gains for this challenging evaluation, including a multiclass voice-activity detection system, the use of noisy data in system training, and the fusion of subsystems using acoustic characterization metadata.
-
Adaptive Gaussian Backend for Robust Language Identification
This paper proposes adaptive Gaussian backend (AGB), a novel approach to robust language identification (LID).
-
Improving Speaker Identification Robustness to Highly Channel-Degraded Speech Through Multiple System Fusion
This article describes our submission to the speaker identification (SID) evaluation for the first phase of the DARPA Robust Automatic Transcription of Speech (RATS) program.