Author: SRI International
-
Modeling Linguistic Segment and Turn Boundaries for N-best Rescoring of Spontaneous Speech
We present an N-best rescoring algorithm that removes the effect of segmentation mismatch. Furthermore, we show that explicit language modeling of hidden linguistic segment boundaries is improved by including turn-boundary events in the model.
-
Multimodal Interfaces for Internet
In this paper, we present a Java-enabled application with a multimodal (pen and voice) interface over the web. Our implementation approach was to add Java to the set of languages accepted by the Open Agent Architecture (OAA), a framework for rapidly prototyping complex applications, and particularly suited to those with multimodal interfaces.
-
Using Differential Constraints to Reconstruct Complex Surfaces from Stereo
Stereo reconstruction algorithms often fail to properly deal with complex surfaces, because there is not enough image information. We propose to guide the reconstruction process using a priori information about the differential geometry of the object surfaces.
-
Model Transformation for Robust Speaker Recognition from Telephone Data
In the context of automatic speaker recognition, we propose a model transformation technique that renders speaker models more robust to acoustic mismatches and to data scarcity by appropriately increasing their variances.
-
Handset-Dependent Background Models for Robust Text-Independent Speaker Recognition
This paper studies the effects of handset distortion on telephone-based speaker recognition performance. Results on the 1996 NIST Speaker Recognition Evaluation corpus show that using handset-matched background models reduces false acceptances (at a 10% miss rate) by more than 60% over previously reported (handset-independent) approaches.
-
Neural-Network Based Measures of Confidence for Word Recognition
This paper proposes a probabilistic framework to define and evaluate confidence measures for word recognition. We describe a novel method to combine different knowledge sources and estimate the confidence in a word hypothesis, via a neural network.
-
HTTP://WWW.SPEECH.SRI.COM/DEMOS/ATIS.HTML
This paper presents a speech-enabled WWW demonstration based on the Air Travel Information System (ATIS) domain. SRI’s speech recognition technology and natural language understanding are fully integrated in a Java application using the DECIPHER(TM) speech recognition system and the Open Agent Architecture(TM).
-
Acoustic Modeling for the SRI Hub4 Partitioned Evaluation Continuous Speech Recognition System
We describe the development of the SRI system evaluated in the 1996 DARPA continuous speech recognition (CSR) Hub4 partitioned evaluation (PE). The task for the Hub4 evaluation was to recognize speech from broadcast television and radio shows.
-
Hub4 Language Modeling Using Domain Interpolation and Data Clustering
In SRI’s language modeling experiments for the Hub4 domain, three basic approaches were pursued: interpolating multiple models estimated from Hub4 and non-Hub4 training data, adapting the language model (LM) to the focus conditions, and adapting the LM to different topic types.
-
The Display Of Cultural Knowledge In Cultural Transmission: Models Of Participation From The Pacific Island Of Kosrae
-
WebWatcher: A Tour Guide for the World Wide Web
We explore the notion of a tour guide software agent for assisting users browsing the World Wide Web. This paper describes a simple but operational tour guide, called WebWatcher, which has given over 5000 tours to people browsing CMU’s School of Computer Science Web pages.
-
Designing Student-Computer Interactions For Guided Inquiry Learning Environments