Speech & natural language publications
-
Automatic Speech Transcription for Low-Resource Languages — The Case of Yoloxóchitl Mixtec (Mexico)
In the present study, we focus exclusively on progress in developing speech recognition for the language of interest, Yoloxóchitl Mixtec (YM), an Oto-Manguean language spoken by fewer than 5000 speakers…
-
The 2016 Speakers in the Wild Speaker Recognition Evaluation
This article provides details of the SITW speaker recognition challenge and analysis of evaluation results. We provide an analysis of some of the top performing systems submitted during the evaluation and…
-
Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech
Current state-of-the-art automatic speech recognition systems are sensitive to changing acoustic conditions, which can cause significant performance degradation.
-
Exploring the role of phonetic bottleneck features for speaker and language recognition
Using bottleneck features extracted from a deep neural network (DNN) trained to predict senone posteriors has resulted in new, state-of-the-art technology for language and speaker identification.
-
A Phonetically Aware System for Speech Activity Detection
In this paper, we focus on a dataset of highly degraded signals, developed under the DARPA Robust Automatic Transcription of Speech (RATS) program.
-
Analyzing the effect of channel mismatch on the SRI language recognition evaluation 2015 system
We present the work done by our group for the 2015 language recognition evaluation (LRE) organized by the National Institute of Standards and Technology (NIST).
-
Noise and reverberation effects on depression detection from speech
This study compares the effect of noise and reverberation on depression prediction using standard mel-frequency cepstral coefficients and damped oscillator cepstral coefficients, features designed for noise robustness.
-
The MERL/SRI System for the 3rd CHiME challenge using beamforming, robust feature extraction and advanced speech recognition
This paper introduces the MERL/SRI system designed for the 3rd CHiME speech separation and recognition challenge (CHiME-3).
-
Improving robustness against reverberation for automatic speech recognition
In this work, we explore the role of robust acoustic features, motivated by human speech perception studies, in building ASR systems that are robust to reverberation effects.
-
Time-frequency convolutional networks for robust speech recognition
This work presents a modified CDNN architecture that we call the time-frequency convolutional network (TFCNN), in which two parallel layers of convolution are performed on the input feature space: convolution…
-
Study of senone-based deep neural network approaches for spoken language recognition
This paper compares different approaches for using deep neural networks (DNNs) trained to predict senone posteriors for the task of spoken language recognition (SLR).
-
Speech-based assessment of PTSD in a military population using diverse feature classes
We analyzed recordings of the Clinician-Administered PTSD Scale (CAPS) interview from military personnel diagnosed as PTSD positive versus negative.