Towards a Framework for Continuous Planning and Execution
This paper reports on the first phase of the Continuous Planning and Execution Framework (CPEF), a system that employs sophisticated plan generation, execution, monitoring, and repair capabilities to solve complex…
Discriminative Training of Minimum Cost Speaker Verification Systems
This paper presents a new training procedure for speaker verification systems. Results are presented from the 1997 NIST Speaker Recognition Evaluation corpus indicating that the VCF performance can be improved…
Automatic Detection of Discourse Structure for Speech Recognition and Understanding
We describe a new approach for statistical modeling and detection of discourse structure for natural conversational speech. Our model is based on 42 "Dialog Acts" (DAs) (question, answer, backchannel, agreement,…
Using Information Extraction to Improve Information Retrieval
The authors describe an approach to applying a particular kind of Natural Language Processing (NLP) system to the TREC routing task in Information Retrieval (IR).
Structure and Performance of a Dependency Language Model
We present a maximum entropy language model that incorporates both syntax and semantics via a dependency grammar.
Speech: A Privileged Modality
In this article, we use our interaction model to demonstrate that during multimodal fusion, speech should be a privileged modality, driving the interpretation of a query, and that in certain…
HMM State Clustering Across Allophone Class Boundaries
We present a novel approach to hidden Markov model (HMM) state clustering based on the use of broad phone classes and an allophone class entropy measure. Our algorithm allows clustering…
A Study of Multilingual Speech Recognition
This paper describes our work in developing multilingual (Swedish and English) speech recognition systems in the ATIS domain. The acoustic component of the multilingual systems is realized through sharing Gaussian…
A Prosody-Only Decision-Tree Model for Disfluency Detection
We have developed a disfluency detection method using decision tree classifiers that use only local and automatically extracted prosodic features. Because the model doesn't rely on lexical information, it is…
Automatic Pronunciation Scoring of Specific Phone Segments for Language Instruction
The aim of the work described in this paper is to develop methods for automatically assessing the pronunciation quality of specific phone segments uttered by students learning a foreign language.
Diagrammatic Methods for Deriving and Relating Temporal Neural Network Algorithms
We present an alternative approach based on a set of simple block diagram manipulation rules. The approach provides a common framework to derive popular algorithms including backpropagation and backpropagation-through-time, without…
A Lognormal Tied Mixture Model of Pitch for Prosody-Based Speaker Recognition
In this work, we develop a statistical model of pitch that allows unbiased estimation of pitch statistics from pitch tracks which are subject to doubling and/or halving.