Author: Horacio Franco
-
Improved Maximum Mutual Information Estimation Training of Continuous Density HMMs
We derive a new set of equations for MMIE based on a quasi-Newton algorithm, without relying on EBW. We find that by adopting a generalized form of the MMIE criterion, the H-criterion, both convergence speed and recognition performance can be improved.
-
Prosodic Features for Automatic Text-Independent Evaluation of Degree of Nativeness for Language Learners
Predicting the degree of nativeness of a student’s utterance is an important issue in computer-aided language learning.
-
Consonant Discrimination in Elicited and Spontaneous Speech: A Case for Signal-Adaptive Front Ends in ASR
This work investigates an approach to add back such transient information to a speech recognizer, without losing the robustness of the standard acoustic models. We demonstrate a set of phonetically-motivated acoustic features that discriminate a preliminary test set of highly ambiguous voiceless stops in CV contexts.
-
The SRI EduSpeak(TM) System: Recognition and Pronunciation Scoring for Language Learning
The EduSpeak(TM) system is a software development toolkit that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.
-
Effects of Speech Recognition-based Pronunciation Feedback on Second-Language Pronunciation Ability
This study’s goal was to determine whether receiving a particular type of feedback on nativeness of second-language accent positively influenced pronunciation over time.
-
The SRI March 2000 Hub-5 Conversational Speech Transcription System
We describe SRI’s large vocabulary conversational speech recognition system as used in the March 2000 NIST Hub-5E evaluation.
-
Rate-dependent Acoustic Modeling for Large Vocabulary Conversational Speech Recognition
In this paper, we evaluate our approach on a large-vocabulary conversational speech recognition (LVCSR) task over the telephone, with several minimal pair comparisons based on different baseline systems.
-
Collection and Detailed Transcription of a Speech Database for Development of Language Learning Technologies
We describe the methodologies for collecting and annotating a Latin-American Spanish speech database. We use the annotated database to investigate rater reliability, the effect of each phone on overall perceived nonnativeness, and the frequency of specific pronunciation errors.
-
Automatic Pronunciation Scoring of Specific Phone Segments for Language Instruction
The aim of the work described in this paper is to develop methods for automatically assessing the pronunciation quality of specific phone segments uttered by students learning a foreign language.
-
Automatic Pronunciation Scoring for Language Instruction
In this paper we show that we can significantly improve HMM-based scores by using average phone segment posterior probabilities. Correlation between machine and human scores went up from r=0.50 with likelihood-based scores to r=0.88 with posterior-based scores.
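As a rough illustration of the two quantities this abstract mentions, the sketch below computes a posterior-based score for an utterance and the Pearson correlation r between machine and human scores. It is not the paper's implementation: the function names, the choice to average log posteriors within each segment, and the input layout (one posterior per frame for the intended phone, plus segment boundaries) are all assumptions made for this example.

```python
import math


def segment_posterior_score(frame_posteriors, segments):
    """Average over phone segments of the mean log posterior per segment.

    frame_posteriors: list of P(intended phone | frame), one value per frame
                      (hypothetical input layout, assumed for this sketch).
    segments: list of (start, end) frame indices, end exclusive.
    """
    per_segment = []
    for start, end in segments:
        frames = frame_posteriors[start:end]
        # Normalizing by segment length keeps long phones from dominating.
        per_segment.append(sum(math.log(p) for p in frames) / len(frames))
    return sum(per_segment) / len(per_segment)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In use, one would compute `segment_posterior_score` for each utterance and pass the resulting machine scores, together with the human ratings for the same utterances, to `pearson_r`.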
-
Connectionist Speaker Normalization and Adaptation
We explore supervised speaker adaptation and normalization in the MLP component of a hybrid hidden Markov model/multilayer perceptron version of SRI’s DECIPHER™ speech recognition system. Our approach combines both adaptation and normalization in a single, consistent manner, works with limited adaptation data, and is text-independent.
-
Incorporating Linguistic Features in a Hybrid HMM/MLP Speech Recognizer
We propose two schemes for incorporating distinctive speech features (sonorant, fricative, nasal, vocalic, and voiced) into the MLP component of our system. We show a small improvement in recognition performance on a 160-word speaker-independent continuous-speech Japanese conference room reservation database.