Speech & natural language publications
-
Nonparametric feature normalization for SVM-based speaker verification
We investigate several feature normalization and scaling approaches for use in speaker verification based on support vector machines.
-
Open-vocabulary spoken term detection using graphone-based hybrid recognition systems
We address the problem of retrieving out-of-vocabulary (OOV) words/queries from audio archives for the spoken term detection (STD) task. In this work, we employ hybrid recognition systems which contain both words…
-
Name-aware Speech Recognition for Interactive Question Answering
In this work we show how interactivity in a voice-enabled question answering application may improve speech recognition. We allow the user to provide a target named entity before asking the…
-
An Iterative Unsupervised Learning Method for Information Distillation
In this work, we propose an iterative unsupervised sentence extraction method to answer open-ended natural language queries about an event. The approach consists of finding the subset of sentences that…
-
Exploiting dialogue act tagging and prosodic information for action item identification
In this paper we investigate the use of dialogue act tagging to improve the identification of action item descriptions, and the use of prosodic information to improve the identification of action item agreements.
-
Extracting Question/Answer Pairs in Multi-party Meetings
In this paper we introduce a new task for multi-party meetings: extracting question/answer pairs. We propose a method based on discriminative classification of individual sentences as questions and answers via…
-
Meeting Adjourned: Off-line Learning Interfaces for Automatic Meeting Understanding
We explore interfaces for presenting this information to users after a meeting is completed, using two post-meeting interfaces that display information from topics and action items respectively.
-
Meeting Structure Annotation
We describe a generic set of tools for representing, annotating, and analysing multi-party discourse, including: an ontology of multimodal discourse, a programming interface for that ontology, and NOMOS – a flexible and…
-
Voice-Based Speaker Recognition Combining Acoustic and Stylistic Features
We present a survey of the state of the art in voice-based speaker identification research. We describe the general framework of a text-independent speaker verification system, and, as an example,…
-
Error-Driven Generalist+Experts (EDGE): a Multi-Stage Ensemble Framework for Text Categorization
We introduce a multi-stage ensemble framework, Error-Driven Generalist+Experts (EDGE), for improved classification on large-scale text categorization problems.
-
Automatic Labeling Inconsistencies Detection and Correction for Sentence Unit Segmentation in Conversational Speech
In this work, we present various methods to detect labeling inconsistencies in the ICSI meeting corpus. We show that by automatically detecting and removing the inconsistent examples from the training…
-
Detecting nonnative speech using speaker recognition approaches
Detecting whether a talker is speaking his native language is useful for speaker recognition, speech recognition, and intelligence applications. We study the problem of detecting nonnative speakers of American English,…