Author: Mitchell McLaren
-
Exploring the role of phonetic bottleneck features for speaker and language recognition
Using bottleneck features extracted from a deep neural network (DNN) trained to predict senone posteriors has resulted in new, state-of-the-art technology for language and speaker identification.
-
Improving robustness against reverberation for automatic speech recognition
In this work, we explore the role of robust acoustic features, motivated by human speech perception studies, in building ASR systems that are robust to reverberation effects.
-
Study of senone-based deep neural network approaches for spoken language recognition
This paper compares different approaches for using deep neural networks (DNNs) trained to predict senone posteriors for the task of spoken language recognition (SLR).
-
Speech-based assessment of PTSD in a military population using diverse feature classes
We analyzed recordings of the Clinician-Administered PTSD Scale (CAPS) interview from military personnel diagnosed as PTSD positive versus negative.
-
Mitigating the effects of non-stationary unseen noises on language recognition performance
We introduce a new dataset for the study of the effect of highly non-stationary noises on language recognition (LR) performance.
-
Improved speaker recognition using DCT coefficients as features
We recently proposed the use of coefficients extracted from the 2D discrete cosine transform (DCT) of log Mel filter bank energies to improve speaker recognition over the traditional Mel frequency cepstral coefficients (MFCC) with appended deltas and double deltas (MFCC/deltas).
-
SoftSAD: Integrated frame-based speech confidence for speaker recognition
In this paper we propose softSAD: the direct integration of speech posteriors into a speaker recognition system instead of using speech activity detection (SAD).
-
Advances in deep neural network approaches to speaker recognition
In this work, we report the same achievement in DNN-based SID performance on microphone speech. We consider two approaches to DNN-based SID: one that uses the DNN to extract features, and another that uses the DNN during feature modeling.
-
Application of Convolutional Neural Networks to Speaker Recognition in Noisy Conditions
This paper applies a convolutional neural network (CNN) trained for automatic speech recognition (ASR) to the task of speaker identification (SID).
-
Spoken Language Recognition Based on Senone Posteriors
This paper explores in depth a recently proposed approach to spoken language recognition based on the estimated posteriors for a set of senones representing the phonetic space of one or more languages.
-
A Deep Neural Network Speaker Verification System Targeting Microphone Speech
We recently proposed the use of deep neural networks (DNNs) in place of Gaussian mixture models (GMMs) in the i-vector extraction process for speaker recognition.
-
Trial-Based Calibration for Speaker Recognition in Unseen Conditions
This work presents Trial-Based Calibration (TBC), a novel, automated calibration technique robust to both unseen and widely varying conditions.