Horacio Franco

May 1, 2011

Robust speech representation of voiced sounds based on synchrony determination with PLLs

We propose to include synchrony effects, known to exist in the auditory system, to represent voiced parts of the speech signal in a robust way.

July 1, 2010

EduSpeak®: A Speech Recognition and Pronunciation Scoring Toolkit for Computer-Aided Language Learning Applications

SRI International’s EduSpeak® system is an SDK that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.

April 1, 2009

Recent advances in SRI’s IraqComm Iraqi Arabic-English speech-to-speech translation system

We summarize recent progress on SRI’s IraqComm™ Iraqi Arabic-English two-way speech-to-speech translation system.

September 1, 2008

MUESLI: Multiple utterance error correction for a spoken language interface

We propose a method for using all available information to help correct recognition errors in tasks that use constrained grammars of the kind used in the domain of Command and Control (CC) systems.

August 1, 2007

IraqComm: A Next Generation Translation System

This paper describes the IraqComm translation system that mediates and translates spontaneous conversations between an English speaker and a speaker of colloquial Iraqi Arabic.

September 1, 2005

Robust Feature Compensation in Nonstationary and Multiple Noise Environments

We extend the POF algorithm to select noisy-to-clean feature mappings more accurately: different combinations of speech and noise have combination-specific mappings, which are selected depending on the observation.

May 1, 2004

Voicing Feature Integration in SRI’s Decipher LVCSR System

We augment the Mel cepstral (MFCC) feature representation with voicing features from an independent front end.

January 1, 2004

Limited-Domain Speech-to-Speech Translation between English and Pashto

This paper describes a prototype system for near-real-time spontaneous, bidirectional translation between spoken English and Pashto.

September 1, 2003

Development of Phrase Translation Systems for Handheld Computers: from Concept to Field

We describe the development and conceptual evolution of handheld spoken phrase translation systems, beginning with an initial unidirectional system for translation of English phrases, and later extending to a limited bidirectional phrase translation system.

June 1, 2003

Iterative Statistical Language Model Generation for Use with an Agent-Oriented Natural Language Interface

We describe a method for developing a statistical language model (SLM) with high keyword spotting accuracy for a natural language interface (NLI). The NLI is based on the Adaptive Agent Oriented Software Architecture (AAOSA).

January 1, 2003

Modeling Word-Level Rate-of-Speech Variation in Large Vocabulary Conversational Speech Recognition

We propose to use a set of parallel rate-specific acoustic and pronunciation models. Rate switching is permitted at word boundaries, to allow within-sentence speech rate variation, which is common in conversational speech.

March 1, 2002

DynaSpeak: SRI’s Scalable Speech Recognizer for Embedded and Mobile Systems

We introduce SRI’s new speech recognition engine, DynaSpeak™, which is characterized by its scalability and flexibility, high recognition accuracy, memory and speed efficiency, adaptation capability, efficient grammar optimization, support for natural language parsing functionality, and operation based on integer arithmetic.

Author: Horacio Franco