Author: Victor Abrash
-
Classification of Lexical Stress Using Spectral and Prosodic Features for Computer-assisted Language Learning Systems
We present a system for detection of lexical stress in English words spoken by English learners. This system was designed to be part of the EduSpeak® computer-assisted language learning (CALL) software.
-
Lexical Stress Classification for Language Learning Using Spectral and Segmental Features
We present a system for detecting lexical stress in English words spoken by English learners. The system uses both spectral and segmental features to detect three levels of stress for each syllable in a word.
-
SRILM at sixteen: Update and outlook
We review developments in the SRI Language Modeling Toolkit (SRILM) since 2002, when a previous paper on SRILM was published.
-
EduSpeak®: A Speech Recognition and Pronunciation Scoring Toolkit for Computer-Aided Language Learning Applications
SRI International’s EduSpeak® system is an SDK that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.
-
Robust Feature Compensation in Nonstationary and Multiple Noise Environments
We extend the POF algorithm to select noisy-to-clean feature mappings more accurately: different combinations of speech and noise have combination-specific mappings, which are selected depending on the observation.
-
Development of Phrase Translation Systems for Handheld Computers: from Concept to Field
We describe the development and conceptual evolution of handheld spoken phrase translation systems, beginning with an initial unidirectional system for translation of English phrases, and later extending to a limited bidirectional phrase translation system.
-
DynaSpeak: SRI’s Scalable Speech Recognizer for Embedded and Mobile Systems
We introduce SRI’s new speech recognition engine, DynaSpeak™, which is characterized by its scalability and flexibility, high recognition accuracy, memory and speed efficiency, adaptation capability, efficient grammar optimization, support for natural language parsing functionality, and operation based on integer arithmetic.
-
The SRI EduSpeak™ System: Recognition and Pronunciation Scoring for Language Learning
The EduSpeak™ system is a software development toolkit that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.
-
Mixture Input Transformations for Adaptation of Hybrid Connectionist Speech Recognizers
In this paper, we propose a new algorithm to train mixtures of transformation networks (MTNs) in the hybrid connectionist recognition framework. We apply the new algorithm to nonnative speaker adaptation, and present recognition results for the 1994 WSJ Spoke 3 development set.
-
Connectionist Speaker Normalization and Adaptation
We explore supervised speaker adaptation and normalization in the MLP component of a hybrid hidden Markov model/multilayer perceptron version of SRI’s DECIPHER™ speech recognition system. Our approach combines both adaptation and normalization in a single, consistent manner, works with limited adaptation data, and is text-independent.
-
Incorporating linguistic features in a hybrid HMM/MLP speech recognizer
We propose two schemes for incorporating distinctive speech features (sonorant, fricative, nasal, vocalic, and voiced) into the MLP component of our system. We show a small improvement in recognition performance on a 160-word speaker-independent continuous-speech Japanese conference room reservation database.
-
Modeling Consistency in a Speaker Independent Continuous Speech Recognition System
In this paper we discuss a Gender Dependent Neural Network (GDNN) which can be tuned for each gender, while sharing most of the speaker independent parameters. We use a classification network to help generate gender-dependent phonetic probabilities for a statistical (HMM) recognition system.