Author: Horacio Franco
Robust speech representation of voiced sounds based on synchrony determination with PLLs
We propose to include synchrony effects, known to exist in the auditory system, to represent voiced parts of the speech signal in a robust way.
EduSpeak®: A Speech Recognition and Pronunciation Scoring Toolkit for Computer-Aided Language Learning Applications
SRI International’s EduSpeak® system is an SDK that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.
Recent advances in SRI’s IraqComm Iraqi Arabic-English speech-to-speech translation system
We summarize recent progress on SRI’s IraqComm™ Iraqi Arabic-English two-way speech-to-speech translation system.
MUESLI: Multiple utterance error correction for a spoken language interface
We propose a method for using all available information to help correct recognition errors in tasks that use constrained grammars of the kind used in the domain of Command and Control (CC) systems.
IraqComm: A Next Generation Translation System
This paper describes the IraqComm translation system that mediates and translates spontaneous conversations between an English speaker and a speaker of colloquial Iraqi Arabic.
Robust Feature Compensation in Nonstationary and Multiple Noise Environments
We extend the POF algorithm to select noisy-to-clean feature mappings more accurately: each combination of speech and noise has its own combination-specific mapping, which is selected depending on the observation.
Voicing Feature Integration in SRI’s Decipher LVCSR System
We augment the Mel cepstral (MFCC) feature representation with voicing features from an independent front end.
Limited-Domain Speech-to-Speech Translation between English and Pashto
This paper describes a prototype system for near-real-time spontaneous, bidirectional translation between spoken English and Pashto.
Development of Phrase Translation Systems for Handheld Computers: from Concept to Field
We describe the development and conceptual evolution of handheld spoken phrase translation systems, beginning with an initial unidirectional system for translation of English phrases, and later extending to a limited bidirectional phrase translation system.
Iterative Statistical Language Model Generation for Use with an Agent-Oriented Natural Language Interface
We describe a method for developing a statistical language model (SLM) with high keyword spotting accuracy for a natural language interface (NLI). The NLI is based on the Adaptive Agent Oriented Software Architecture (AAOSA).
Modeling Word-Level Rate-of-Speech Variation in Large Vocabulary Conversational Speech Recognition
We propose to use a set of parallel rate-specific acoustic and pronunciation models. Rate switching is permitted at word boundaries, to allow within-sentence speech rate variation, which is common in conversational speech.
DynaSpeak: SRI’s Scalable Speech Recognizer for Embedded and Mobile Systems
We introduce SRI’s new speech recognition engine, DynaSpeak™, which is characterized by its scalability and flexibility, high recognition accuracy, memory and speed efficiency, adaptation capability, efficient grammar optimization, support for natural language parsing functionality, and operation based on integer arithmetic.