Author: Horacio Franco

September 1, 2001

Improved Maximum Mutual Information Estimation Training of Continuous Density HMMs

We derive a new set of equations for MMIE based on a quasi-Newton algorithm, without relying on the extended Baum-Welch (EBW) algorithm. We find that adopting a generalized form of the MMIE criterion, the H-criterion, improves both convergence speed and recognition performance.
October 1, 2000

Prosodic Features for Automatic Text-Independent Evaluation of Degree of Nativeness for Language Learners

Predicting the degree of nativeness of a student’s utterance is an important issue in computer-aided language learning.
October 1, 2000

Consonant Discrimination in Elicited and Spontaneous Speech: A Case for Signal-Adaptive Front Ends in ASR

This work investigates an approach for adding such transient information back into a speech recognizer without losing the robustness of the standard acoustic models. We demonstrate a set of phonetically motivated acoustic features that discriminate a preliminary test set of highly ambiguous voiceless stops in CV contexts.
August 1, 2000

Effects of Speech Recognition-based Pronunciation Feedback on Second-Language Pronunciation Ability

This study’s goal was to determine whether receiving a particular type of feedback on nativeness of second-language accent positively influenced pronunciation over time.
August 1, 2000

The SRI EduSpeak™ System: Recognition and Pronunciation Scoring for Language Learning

The EduSpeak™ system is a software development toolkit that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.
May 1, 2000

The SRI March 2000 Hub-5 Conversational Speech Transcription System

We describe SRI’s large vocabulary conversational speech recognition system as used in the March 2000 NIST Hub-5E evaluation.
January 1, 2000

Rate-dependent Acoustic Modeling for Large Vocabulary Conversational Speech Recognition

In this paper, we evaluate our approach on a large-vocabulary conversational speech recognition (LVCSR) task over the telephone, with several minimal pair comparisons based on different baseline systems.
August 1, 1998

Collection and Detailed Transcription of a Speech Database for Development of Language Learning Technologies

We describe the methodologies for collecting and annotating a Latin-American Spanish speech database. We use the annotated database to investigate rater reliability, the effect of each phone on overall perceived nonnativeness, and the frequency of specific pronunciation errors.
September 1, 1997

Automatic Pronunciation Scoring of Specific Phone Segments for Language Instruction

The aim of the work described in this paper is to develop methods for automatically assessing the pronunciation quality of specific phone segments uttered by students learning a foreign language.
April 1, 1997

Automatic Pronunciation Scoring for Language Instruction

In this paper we show that we can significantly improve HMM-based scores by using average phone segment posterior probabilities. Correlation between machine and human scores went up from r=0.50 with likelihood-based scores to r=0.88 with posterior-based scores.
September 1, 1995

Connectionist Speaker Normalization and Adaptation

We explore supervised speaker adaptation and normalization in the MLP component of a hybrid hidden Markov model/multilayer perceptron version of SRI’s DECIPHER™ speech recognition system. Our approach combines both adaptation and normalization in a single, consistent manner, works with limited adaptation data, and is text-independent.
January 1, 1994

Incorporating Linguistic Features in a Hybrid HMM/MLP Speech Recognizer

We propose two schemes for incorporating distinctive speech features (sonorant, fricative, nasal, vocalic, and voiced) into the MLP component of our system. We show a small improvement in recognition performance on a 160-word speaker-independent continuous-speech Japanese conference room reservation database.