Speech & natural language publications
-
Rich system combination for keyword spotting in noisy and acoustically heterogeneous audio streams
We address the problem of retrieving spoken information from noisy and heterogeneous audio archives using a rich system combination with a diverse set of noise-robust modules and audio characterization.
-
Articulatory trajectories for large-vocabulary speech recognition
We present a neural network model that estimates articulatory trajectories from speech signals; the model was trained using synthetic speech signals generated by Haskins Laboratories’ task-dynamic model of speech…
-
A noise robust i-vector extractor using vector Taylor series for speaker recognition
We propose a novel approach for noise-robust speaker recognition, where the model of distortions caused by additive and convolutive noises is integrated into the i-vector extraction framework.
-
SRIUBC-Core: Multiword Soft Similarity Models for Textual Similarity
We explored the use of neural probabilistic language models and a TF-IDF weighted variant of Explicit Semantic Analysis.
-
A Procedure for Estimating Gestural Scores from Speech Acoustics
This paper demonstrates that the proposed iterative approach is superior to conventional acoustically-referenced dynamic time warping procedures and provides reliable gestural annotation for speech datasets.
-
Virtual World Language Use and Real World Identity: Sociolinguistic Findings from the VERUS Project
This presentation describes the findings of the VERUS project in using virtual world language usage behavior to identify real world demographic attributes.
-
Multi-system fusion of extended context prosodic and cepstral features for paralinguistic speaker trait classification
This paper focuses on the identification of seven speaker trait categories from the Interspeech Speaker Trait Challenge: likeability, intelligibility, openness, conscientiousness, extraversion, agreeableness, and neuroticism.
-
Discriminatively trained phoneme confusion model for keyword spotting
This work proposes the use of discriminative training to construct a phoneme confusion model, which expands the phonemic index of a KWS system by adding phonemic variation to handle the…
-
Socio-linguistic decisions and gender mapping across real and virtual world cultures
This study examines a large corpus of online gaming chat and avatar names to explore gender differences in virtual world (VW) language use.
-
A unified approach for audio characterization and its application to speaker recognition
Knowledge of the nuisance characteristics present in the signal can be used to improve performance of the system. In some cases, the nature of these nuisance characteristics is known a…
-
Effects of audio and ASR quality on cepstral and high-level speaker verification systems
We evaluate the effect that improved audio quality has on speaker verification performance, using a recently released full-bandwidth version of microphone data from the SRE2010 evaluation.
-
Adaptive and Discriminative Modeling for Improved Mispronunciation Detection
In this work, we extend our approach with the use of model adaptation and discriminative modeling techniques, inspired by methods that have been effective in the area of speaker identification.