Speech & natural language publications
-
The SRI March 2000 Hub-5 Conversational Speech Transcription System
We describe SRI's large vocabulary conversational speech recognition system as used in the March 2000 NIST Hub-5E evaluation.
-
Prosody-Based Automatic Segmentation of Speech into Sentences and Topics
Using decision tree and hidden Markov modeling techniques, we combine prosodic cues with word-based approaches, and evaluate performance on two speech corpora, Broadcast News and Switchboard. Results show that the…
-
Language Modelling for Multilingual Speech Translation
As with acoustic modelling, sparse training data is one of the main problems in language modelling tasks. We ideally want to have enough properly matched data to train models for…
-
Phonetic Consequences of Speech Disfluency
Analyses of American English show that disfluency affects a variety of phonetic aspects of speech, including segment durations, intonation, voice quality, vowel quality, and coarticulation patterns. These effects provide clues…
-
Data-Driven Subclassification of Disfluent Repetitions Based on Prosodic Features
This study delves into the acoustic and prosodic information of repetitions, one of the most common disfluencies. A hierarchical clustering of prosodic features reveals three subsets of repetitions, each reflecting…
-
Robust Text-Independent Speaker Identification over Telephone Channels
This paper addresses the issue of closed-set text-independent speaker identification from samples of speech recorded over the telephone. It focuses on the effects of acoustic mismatches between training and testing…
-
Finding Consensus Among Words: Lattice-based Word Error Minimization
We describe a new algorithm for finding the hypothesis in a recognition lattice that is expected to minimize the word error rate (WER). Our approach thus overcomes the mismatch between…
-
Modeling the Prosody of Hidden Events for Improved Word Recognition
We investigate a new approach for using speech prosody as a knowledge source for speech recognition. The idea is to penalize word hypotheses that are inconsistent with prosodic features such…
-
Combining Words and Prosody for Information Extraction from Speech
In this work we demonstrate the use of prosodic cues, alone and in combination with words, for segmentation and name finding. In experiments, we find that prosodic cues alone…
-
Combining Words and Speech Prosody for Automatic Topic Segmentation
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topic units. The approach combines hidden Markov models, statistical language models,…
-
Efficient Lattice Representation and Generation
We describe two new techniques for reducing word lattice sizes without eliminating hypotheses.
-
Automatic Detection of Sentence Boundaries and Disfluencies based on Recognized Words
We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Several model combination approaches are…