Speech & natural language publications
-
A Prosody-based Approach to End-of-Utterance Detection That Does Not Require Speech Recognition
In this paper we demonstrate that the improvements due to the prosodic knowledge can be realized largely without alignment information, i.e., without requiring a speech recognizer. A prosodic end-of-utterance detector…
-
Novel Approaches to Arabic Speech Recognition: Report from the 2002 Johns Hopkins Summer Workshop
This paper reports on our project at the 2002 Johns Hopkins Summer Workshop, which focused on the recognition of dialectal Arabic.
-
Training a Prosody-Based Dialog Act Tagger from Unlabeled Data
Here we investigate the use of unlabeled data for training HMM-based dialog act taggers. Three techniques are shown to be effective for bootstrapping a tagger from very small amounts of…
-
Prosodic Knowledge Sources for Automatic Speech Recognition
We investigate three models, each exploiting a different level of prosodic information, in rescoring N-best hypotheses according to how well recognized words correspond to prosodic features of the utterance.
-
The Robustness of an Almost-Parsing Language Model Given Errorful Training Data
An almost-parsing language model has been developed that provides a framework for tightly integrating multiple knowledge sources. Lexical features and syntactic constraints are integrated into a uniform linguistic structure (called…
-
“TalkPrinting”: Improving Speaker Recognition by Modeling Stylistic Features
This paper describes "TalkPrinting", a program of research aimed at adding stylistic features to conventional speaker recognition systems.
-
What Will People Say? Speech System Design and Language/Cultural Differences
This paper evaluates the effectiveness of three speech system design strategies in Pashto, a little-studied language of Afghanistan and Pakistan, drawing comparisons with English where possible.
-
Modeling Word-Level Rate-of-Speech Variation in Large Vocabulary Conversational Speech Recognition
We propose to use a set of parallel rate-specific acoustic and pronunciation models. Rate switching is permitted at word boundaries, to allow within-sentence speech rate variation, which is common in…
-
Automatic Dialog Act Labeling With Minimal Supervision
We investigate the problem of automatically tagging dialog acts when hand-labeled training data is scarce. The tagging paradigm employed is a hidden Markov model in which dialog acts are states…
-
SRILM – An Extensible Language Modeling Toolkit
SRILM is a collection of C libraries, executable programs, and helper scripts designed to allow both production of and experimentation with statistical language models for speech recognition and other applications.
-
Automatic Punctuation and Disfluency Detection in Multi-Party Meetings Using Prosodic and Lexical Cues
We investigate automatic approaches to finding "hidden" spontaneous speech events, such as sentence boundaries and disfluencies, in multi-party meetings.
-
Prosody-Based Automatic Detection of Annoyance and Frustration in Human-Computer Dialog
We investigate the use of prosody for the detection of frustration and annoyance in natural human-computer dialog.