Speech & natural language publications
-
Weighting Schemes for Audio-visual Fusion in Speech Recognition
In this work we demonstrate an improvement in the state-of-the-art large vocabulary continuous speech recognition (LVCSR) performance, under clean and noisy conditions, by the use of visual information, in addition…
-
The Meeting Project at ICSI
In collaboration with colleagues at UW, OGI, IBM, and SRI, we are developing technology to process spoken language from informal meetings.
-
To “Errrr” is Human: Ecology and Acoustics of Speech Disfluencies
The ecological and acoustic evidence provide insights about human language production in real-world contexts. Such evidence can also guide methods for the processing of spontaneous speech in automatic speech recognition…
-
Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and…
-
Machine Learning Techniques for the Identification of Cues for Stop Place
This paper is situated in a long line of phonetic studies that seek to determine and qualify the acoustic cues humans use to identify stop place. The present study draws…
-
Finding Consensus in Speech Recognition: Word Error Minimization and Other Applications of Confusion Networks
We describe a new framework for distilling information from word lattices to improve the accuracy of speech recognition and obtain a more perspicuous representation of a set of alternative hypotheses.
-
An Efficient Repair Procedure For Quick Transcriptions
The procedure we propose in this paper aims to cleanse such quick transcriptions so that they align better with the acoustic evidence and thus provide for better acoustic models…
-
Prosodic Features for Automatic Text-Independent Evaluation of Degree of Nativeness for Language Learners
Predicting the degree of nativeness of a student's utterance is an important issue in computer-aided language learning.
-
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialogue…
-
The SRI EduSpeak(TM) System: Recognition and Pronunciation Scoring for Language Learning
The EduSpeak(TM) system is a software development toolkit that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.
-
Effects of Speech Recognition-based Pronunciation Feedback on Second-Language Pronunciation Ability
This study's goal was to determine whether receiving a particular type of feedback on nativeness of second-language accent positively influenced pronunciation over time.
-
Word-Level Rate of Speech Modeling Using Rate-Specific Phones and Pronunciations
We propose to use rate-specific phone models and pronunciations for ROS modeling at the word level. Words are given three types of pronunciations -- fast, slow, and medium -- consisting…