Speech & natural language publications
-
Analysis and prediction of heart rate using speech features from natural speech
We predict heart rate (HR) from speech using the SRI BioFrustration Corpus. In contrast to previous studies, we use continuous spontaneous speech as input.
-
Toward human-assisted lexical unit discovery without text resources
This work addresses lexical unit discovery for languages without (usable) written resources.
-
Multi-microphone speech recognition integrating beamforming, robust feature extraction, and advanced DNN/RNN backend
This paper gives an in-depth presentation of the multi-microphone speech recognition system we submitted to the 3rd CHiME speech separation and recognition challenge and its extension.
-
Conversational In-Vehicle Dialog Systems: The past, present, and future
We review research and development activities for in-vehicle dialog systems, examine findings, discuss key challenges, and share our visions for voice-enabled interaction and intelligent assistance for smart vehicles over the…
-
Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition
This work investigates the performance of traditional deep neural networks under varying acoustic conditions and evaluates their performance with speech recorded under realistic background conditions that are mismatched with respect…
-
Minimizing Annotation Effort for Adaptation of Speech-Activity Detection Systems
This paper focuses on the problem of selecting the best possible subset of available audio data given a budgeted time for annotation.
-
The SRI CLEO Speaker-State Corpus
We introduce the SRI CLEO (Conversational Language about Everyday Objects) Speaker-State Corpus of speech, video, and biosignals.
-
On the Issue of Calibration in DNN-Based Speaker Recognition Systems
This article is concerned with the issue of calibration in the context of Deep Neural Network (DNN) based approaches to speaker recognition. We propose a hybrid alignment framework, which stems…
-
The SRI System for the NIST OpenSAD 2015 Speech Activity Detection Evaluation
In this paper, we present the SRI system submission to the NIST OpenSAD 2015 speech activity detection (SAD) evaluation. We present results on three different development databases that we created…
-
Automatic Speech Transcription for Low-Resource Languages — The Case of Yoloxóchitl Mixtec (Mexico)
In the present study, we focus exclusively on progress in developing speech recognition for the language of interest, Yoloxóchitl Mixtec (YM), an Oto-Manguean language spoken by fewer than 5000 speakers…
-
The 2016 Speakers in the Wild Speaker Recognition Evaluation
This article provides details of the SITW speaker recognition challenge and analysis of evaluation results. We provide an analysis of some of the top-performing systems submitted during the evaluation and…
-
Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech
Current state-of-the-art automatic speech recognition systems are sensitive to changing acoustic conditions, which can cause significant performance degradation.