Speech & natural language publications
-
The SRI AVEC-2014 Evaluation System
We explore a diverse set of features based only on spoken audio to understand which features correlate with self-reported depression scores according to the Beck depression rating scale.
-
A Deep Neural Network Speaker Verification System Targeting Microphone Speech
We recently proposed the use of deep neural networks (DNNs) in place of Gaussian mixture models (GMMs) in the i-vector extraction process for speaker recognition.
-
Application of Convolutional Neural Networks to Speaker Recognition in Noisy Conditions
This paper applies a convolutional neural network (CNN) trained for automatic speech recognition (ASR) to the task of speaker identification (SID).
-
Evaluating Robust Features on Deep Neural Networks for Speech Recognition in Noisy and Channel Mismatched Conditions
In this work, we present a study exploring both conventional DNNs and deep convolutional neural networks (CNNs) for noise- and channel-degraded speech recognition tasks using the Aurora4 dataset.
-
Recent Improvements in SRI’s Keyword Detection System for Noisy Audio
We present improvements to a keyword spotting (KWS) system that operates in highly adverse channel conditions with very low signal-to-noise ratio levels.
-
Spoken Language Recognition Based on Senone Posteriors
This paper explores in depth a recently proposed approach to spoken language recognition based on the estimated posteriors for a set of senones representing the phonetic space of one or…
-
Content Matching for Short Duration Speaker Recognition
We show how content matching can be effectively done at the statistics level to enable the use of standard verification backends. While no significant improvements were observed for the general…
-
Identifying User Demographic Traits through Virtual-World Language Use
The paper presents approaches for identifying real-world demographic attributes based on language use in the virtual world.
-
Articulatory Features from Deep Neural Networks and Their Role in Speech Recognition
This paper presents a deep neural network (DNN) to extract articulatory information from the speech signal and explores different ways to use such information in a continuous speech recognition task.
-
Application of Convolutional Neural Networks to Language Identification in Noisy Conditions
This paper proposes two novel frontends for robust language identification (LID) using a convolutional neural network (CNN) trained for automatic speech recognition (ASR).
-
Trial-Based Calibration for Speaker Recognition in Unseen Conditions
This work presents Trial-Based Calibration (TBC), a novel, automated calibration technique robust to both unseen and widely varying conditions.
-
Highly Accurate Phonetic Segmentation Using Correction Models and System Fusion
We investigate techniques for boosting the accuracy of automatic phonetic segmentation based on HMM acoustic-phonetic models.