Speech & natural language publications
-
Crowdsourcing Emotional Speech
We describe the methodology for the collection and annotation of a large corpus of emotional speech data through crowdsourcing.
-
Language Diarization for Semi-supervised Bilingual Acoustic Model Training
In this paper, we investigate several automatic transcription schemes for using raw bilingual broadcast news data in semi-supervised bilingual acoustic model training.
-
Tackling Unseen Acoustic Conditions in Query-by-Example Search Using Time and Frequency Convolution for Multilingual Deep Bottleneck Features
This paper revisits two neural network architectures developed for noise- and channel-robust ASR, and applies them to building a state-of-the-art multilingual QbE system.
-
Noise-robust Exemplar Matching for Rescoring Query-by-Example Search
This paper describes a two-step approach to the keyword spotting task in which a query-by-example search is followed by noise-robust exemplar matching rescoring.
-
Analysis of Phonetic Markedness and Gestural Effort Measures for Acoustic Speech-Based Depression Classification
In this paper we analyze articulatory measures to gain further insight into how articulation is affected by depression.
-
Calibration Approaches for Language Detection
In this paper, we focus on situations in which either (1) the system-modeled languages are not observed during use or (2) the test data contains OOS languages that are unseen…
-
Leveraging Deep Neural Network Activation Entropy to Cope with Unseen Data in Speech Recognition
This work aims to estimate the propagation of such distortion in the form of network activation entropy, which is measured over a short-time running window on the activation from each…
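As a rough illustration of the measure described above, the sketch below computes per-frame entropy of a layer's activations and smooths it with a short running window. The function name, window length, and the softmax normalization are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def activation_entropy(activations, win=10):
    """Hypothetical sketch: entropy of DNN layer activations over a running window.

    activations: array of shape (frames, units) holding one layer's outputs.
    Returns a per-frame entropy curve smoothed over `win` frames.
    """
    # Softmax-normalize each frame so the activations form a distribution.
    e = np.exp(activations - activations.max(axis=1, keepdims=True))
    p = e / e.sum(axis=1, keepdims=True)

    # Shannon entropy per frame (small epsilon avoids log(0)).
    ent = -(p * np.log(p + 1e-12)).sum(axis=1)

    # Average over a short-time running window.
    kernel = np.ones(win) / win
    return np.convolve(ent, kernel, mode="same")
```

Higher entropy under mismatched input would then signal that the network's activations are being distorted by unseen conditions.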
-
Improving Robustness of Speaker Recognition to New Conditions Using Unlabeled Data
We describe our SRICON-UAM team system submission for the NIST 2016 SRE, then benchmark these approaches on several distinctly different databases.
-
Inferring Stance from Prosody
Speech conveys many things beyond content, including aspects of stance and attitude that have received little study.
-
Hybrid Convolutional Neural Networks for Articulatory and Acoustic Information Based Speech Recognition
This work explores using deep neural networks (DNNs) and convolutional neural networks (CNNs) for mapping speech data into its corresponding articulatory space. Our speech-inversion results indicate that the CNN models…
-
Joint Modeling of Articulatory and Acoustic Spaces for Continuous Speech Recognition Tasks
This paper investigates using deep neural networks (DNNs) and convolutional neural networks (CNNs) for mapping speech data into its corresponding articulatory space.
-
Speech Recognition in Unseen and Noisy Channel Conditions
This work investigates robust features, feature-space maximum likelihood linear regression (fMLLR) transform, and deep convolutional nets to address the problem of unseen channel and noise conditions in speech recognition.