Speech & natural language publications
-
Automatic Detection of Sentence Boundaries and Disfluencies based on Recognized Words
We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Several model combination approaches are…
-
Collection and Detailed Transcription of a Speech Database for Development of Language Learning Technologies
We describe the methodologies for collecting and annotating a Latin-American Spanish speech database. We use the annotated database to investigate rater reliability, the effect of each phone on overall perceived…
-
Crosslinguistic Disfluency Modeling: A Comparative Analysis of Swedish and American English Human-Human and Human-Machine Dialogues
We report results from a cross-language study of disfluencies (DFs) in Swedish and American English human-machine and human-human dialogs. We focus on differences suggestive of how speakers utilize DFs in…
-
Lexical, Prosodic, and Syntactic Cues for Dialog Acts
This paper presents a preliminary investigation into the realization of a particular class of dialog acts that play an essential structuring role in dialog: the backchannels, or acknowledgement tokens. We…
-
Modeling Dynamic Prosodic Variation for Speaker Verification
In this work, we take a first step toward capturing suprasegmental patterns for automatic speaker verification. Prosody modeling improves the verification performance of a cepstrum-based Gaussian mixture model system (as…
-
Speech Trends and Predictions, or Do We Need Text?
The rapid growth in the number of companies devoted to speech recognition applications attests to the technology's growth in performance. This brief report explores the further potential for speech technology.
-
The Past, Present, and Future of Speech Processing
This article provides a succinct review of the history and current status of the field of speech processing research and describes future contributions speech processing will make to society.
-
Nonlinear Discriminant Feature Extraction for Robust Text-Independent Speaker Recognition
We study a nonlinear discriminant analysis (NLDA) technique that extracts a speaker-discriminant feature set. Our approach is to train a multilayer perceptron (MLP) to maximize the separation between speakers by…
-
Dialog Act Modeling for Conversational Speech
We describe an integrated approach for statistical modeling of discourse structure for natural conversational speech. Our model is based on 42 "dialog acts", which were hand-labeled in 1155 conversations from…
-
Entropy-based Pruning of Backoff Language Models
A criterion for pruning parameters from N-gram backoff language models is developed, based on the relative entropy between the original and the pruned model. It is shown that the relative…
-
The Development of SRI’s 1997 Broadcast News Transcription System
This paper describes SRI's 1997 broadcast news transcription system used for the 1997 DARPA H4 evaluations. Our system had several novel components. We briefly describe these features and give comparative…
-
Speech Technology and Language Learning: Some Examples from VILTS, the Voice Interactive Language Training System
In this paper we describe the development of the Voice Interactive Language Training System (VILTS) and our experience in exploring the potential of speech technology in service to language learning.