Speech & natural language publications
-
Improvements in MLLR-Transform-based Speaker Recognition
We previously proposed the use of MLLR transforms derived from a speech recognition system as speaker features in a speaker verification system. In this paper we report recent improvements to…
-
A Study of Intentional Voice Modifications for Evading Automatic Speaker Recognition
We investigate the effect of intentional voice modifications on a state-of-the-art speaker recognition system. The investigation includes data collection, where normal and changed voices are collected from subjects conversing by…
-
Generalized Linear Kernels for One-Versus-All Classification: Application to Speaker Recognition
In this paper, we examine the problem of kernel selection for one-versus-all (OVA) classification of multiclass data with support vector machines (SVMs). We focus specifically on the problem of training…
-
The Contribution of Cepstral and Stylistic Features to SRI’s 2005 NIST Speaker Recognition Evaluation System
Recent work in speaker recognition has demonstrated the advantage of modeling stylistic features in addition to traditional cepstral features, but to date there has been little study of the relative…
-
Speech Recognition Engineering Issues in Speech-to-Speech Translation System Design for Low Resource Languages and Domains
This paper, using case studies of creating speech translation systems between English and languages such as Pashto and Farsi, describes some of the practical issues and the solutions that were…
-
Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings
This paper investigates a scheme for joint segmentation and classification of dialog acts (DAs) of the ICSI Meeting Corpus based on hidden-event language models and a maximum entropy classifier for…
-
Combining Prosodic, Lexical and Cepstral Systems for Deceptive Speech Detection
We report on machine learning experiments to distinguish deceptive from nondeceptive speech in the Columbia-SRI-Colorado (CSC) corpus. Specifically, we propose a system combination approach using different models and features for…
-
Cross-Domain and Cross-Language Portability of Acoustic Features Estimated by Multilayer Perceptrons
In this paper we investigate how portable acoustic features estimated by multilayer perceptrons (MLPs) are across domains and languages. We show that even without retraining, English-trained MLP features can provide a significant boost to recognition…
-
Detecting Action Items in Multi-party Meetings: Annotation and Initial Experiments
This paper presents the results of an initial investigation and experiments into automatic action item detection from transcripts of multi-party human-human meetings.
-
Shallow Discourse Structure for Action Item Detection
We investigated automatic action item detection from transcripts of multi-party meetings.
-
Recent Innovations in Speech-to-Text Transcription at SRI-ICSI-UW
We summarize recent progress in automatic speech-to-text transcription at SRI, ICSI, and the University of Washington. The work encompasses all components of speech modeling found in a state-of-the-art recognition system,…
-
A Multimodal Discourse Ontology for Meeting Understanding
In this paper, we present a multimodal discourse ontology that serves as a knowledge representation and annotation framework for the discourse understanding component of an artificial personal office assistant.