Speech & natural language publications

January 1, 2008

Recognizing Arabic speakers with English phones

We investigate the question of whether phone recognition models trained on large English databases can be used for speaker recognition in another language.

January 1, 2008

Improving NER in Arabic using a morphological tagger

By Dayne Freitag

We discuss a named entity recognition system for Arabic, and show how we incorporated the information provided by MADA, a full morphological tagger which uses a morphological analyzer.

January 1, 2008

Automatic Annotation of Dialogue Structure from Simple User Interaction

By John Niekrasz

We investigate, through the transformation of human annotations into hypothetical idealized user interactions, the relative utility of various modes of user interaction and techniques for their interpretation.

January 1, 2008

Meeting Adjourned: Off-line Learning Interfaces for Automatic Meeting Understanding

By John Niekrasz

We explore interfaces for presenting this information to users after a meeting is completed, using two post-meeting interfaces that display information from topics and action items respectively.

December 1, 2007

OOV Detection by Joint Word/Phone Lattice Alignment

By Dimitra Vergyri

We propose a new method for detecting out-of-vocabulary (OOV) words for large vocabulary continuous speech recognition (LVCSR) systems. Our method is based on performing a joint alignment between independently generated…

December 1, 2007

Integrating Several Annotation Layers for Statistical Information Distillation

We present a sentence extraction algorithm for Information Distillation, a task where for a given templated query, relevant passages must be extracted from massive audio and textual document sources.

December 1, 2007

Morph-Based Speech Recognition and Modeling of Out-of-Vocabulary Words Across Languages

We explore the use of morph-based language models in large-vocabulary continuous speech recognition systems across four so-called “morphologically rich” languages: Finnish, Estonian, Turkish, and Egyptian Colloquial Arabic. The morphs are…

December 1, 2007

Reranking Machine Translation Hypotheses With Structured and Web-based Language Models

In this paper, we investigate the use of linguistically motivated and computationally efficient structured language models for reranking N-best hypotheses in a statistical machine translation system.

December 1, 2007

Building A Highly Accurate Mandarin Speech Recognizer

We describe a highly accurate large-vocabulary continuous Mandarin speech recognizer, a collaborative effort among four research organizations. Particularly, we build two acoustic models (AMs) with significant differences but with similar…

October 1, 2007

Capturing a Taxonomy of Failures During Automatic Interpretation of Questions Posed in Natural Language

In this paper, we present a study – conducted in the context of the Halo Project – cataloging the types of failures that occur when capturing knowledge from natural language.

October 1, 2007

Capturing and Answering Questions Posed to a Knowledge-Based System

As part of the ongoing project, Project Halo, our goal is to build a system capable of answering questions posed by novice users to a formal knowledge base. In our…

October 1, 2007

Extending Boosting for Large Scale Spoken Language Understanding

We propose three methods for extending the Boosting family of classifiers motivated by the real-life problems we have encountered. Our results indicate that it is possible to obtain the same…
