Author: Dimitra Vergyri
-
Automatic Diacritization of Arabic for Acoustic Modeling in Speech Recognition
In this paper we investigate different procedures that enable us to use non-diacritized training data by automatically inserting the missing diacritics into the transcriptions.
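A minimal sketch of one way such insertion could be automated, assuming a hypothetical dictionary of candidate diacritized forms and a hypothetical forced-alignment scorer; neither is taken from the paper, and the actual procedures may differ:

```python
# Illustrative only: expand each non-diacritized word into candidate
# diacritizations (hypothetical `candidates` lookup) and keep the variant
# combination that best matches the acoustics (hypothetical `align_score`).
from typing import Callable, Dict, List
import itertools
import numpy as np

def diacritize_transcript(words: List[str],
                          candidates: Dict[str, List[str]],
                          features: np.ndarray,
                          align_score: Callable[[List[str], np.ndarray], float]) -> List[str]:
    """Pick the candidate diacritization with the best forced-alignment score."""
    options = [candidates.get(w, [w]) for w in words]   # fall back to the bare form
    best, best_score = list(words), -np.inf
    for combo in itertools.product(*options):           # all diacritization variants
        score = align_score(list(combo), features)
        if score > best_score:
            best, best_score = list(combo), score
    return best
```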
-
Voicing Feature Integration in SRI’s Decipher LVCSR System
We augment the Mel cepstral (MFCC) feature representation with voicing features from an independent front end.
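A minimal sketch of the general idea of appending a per-frame voicing measure to MFCCs, assuming librosa for feature extraction and a normalized autocorrelation peak as the voicing proxy; this is an illustration, not the Decipher front end:

```python
# Sketch: MFCCs plus one voicing feature per frame. The voicing measure here
# (autocorrelation peak in the typical pitch-lag range) is an assumption for
# illustration, not the independent front end described in the paper.
import numpy as np
import librosa

def mfcc_plus_voicing(y, sr, n_mfcc=13, frame_len=400, hop=160):
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                n_fft=frame_len, hop_length=hop)          # (n_mfcc, T)
    frames = librosa.util.frame(y, frame_length=frame_len, hop_length=hop)  # (frame_len, T')
    T = min(mfcc.shape[1], frames.shape[1])
    voicing = np.zeros(T)
    lo, hi = sr // 400, sr // 60          # lags covering roughly 60-400 Hz pitch
    for t in range(T):
        f = frames[:, t] - frames[:, t].mean()
        ac = np.correlate(f, f, mode="full")[frame_len - 1:]
        if ac[0] > 0:
            voicing[t] = ac[lo:hi].max() / ac[0]   # normalized autocorrelation peak
    return np.vstack([mfcc[:, :T], voicing[None, :]])  # (n_mfcc + 1, T)
```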
-
Cross-dialectal Acoustic Data Sharing for Arabic Speech Recognition
In this paper we describe the use of acoustic data from Modern Standard Arabic (MSA) to improve the recognition of Egyptian Conversational Arabic (ECA).
-
Limited-Domain Speech-to-Speech Translation between English and Pashto
This paper describes a prototype system for near-real-time, spontaneous, bidirectional translation between spoken English and Pashto.
-
Development of Phrase Translation Systems for Handheld Computers: from Concept to Field
We describe the development and conceptual evolution of handheld spoken phrase translation systems, beginning with an initial unidirectional system for translation of English phrases, and later extending to a limited bidirectional phrase translation system.
-
Novel Approaches to Arabic Speech Recognition: Report from the 2002 Johns Hopkins Summer Workshop
This paper reports on our project at the 2002 Johns Hopkins Summer Workshop, which focused on the recognition of dialectal Arabic.
-
Prosodic Knowledge Sources for Automatic Speech Recognition
We investigate three models, each exploiting a different level of prosodic information, in rescoring N-best hypotheses according to how well recognized words correspond to prosodic features of the utterance.
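A minimal sketch of the rescoring step, where `prosody_score` is a hypothetical stand-in for any of the three prosodic models and the combination weight would be tuned on held-out data:

```python
# Sketch: rescore an N-best list by adding a weighted prosodic score to each
# hypothesis's recognizer score; not the paper's specific models.
from typing import Callable, List, Tuple

def rescore_nbest(nbest: List[Tuple[str, float]],
                  prosody_score: Callable[[str], float],
                  weight: float = 0.1) -> str:
    """nbest: (hypothesis, ASR log score) pairs. Returns the hypothesis with
    the best combined score after adding the weighted prosodic log score."""
    return max(nbest, key=lambda h: h[1] + weight * prosody_score(h[0]))[0]
```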
-
Building an ASR System for Noisy Environments: SRI’s 2001 SPINE Evaluation System
We describe SRI’s recognition system as used in the 2001 DARPA Speech in Noisy Environments (SPINE) evaluation. The SPINE task involves recognition of speech in simulated military environments.
-
Weighting Schemes for Audio-visual Fusion in Speech Recognition
In this work we demonstrate an improvement in state-of-the-art large vocabulary continuous speech recognition (LVCSR) performance, under clean and noisy conditions, by using visual information in addition to the traditional audio features.
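A minimal sketch of one common fusion scheme (log-linear combination of stream log-likelihoods with a tunable stream weight); the paper evaluates specific weighting schemes, which are not reproduced here:

```python
# Sketch: combine per-frame, per-state audio and visual log-likelihoods with
# exponents lam and 1 - lam. In practice lam would be tuned, e.g. lowered for
# the audio stream as the acoustic SNR drops (assumption for illustration).
import numpy as np

def fuse_loglikes(audio_loglike: np.ndarray,
                  visual_loglike: np.ndarray,
                  lam: float) -> np.ndarray:
    return lam * audio_loglike + (1.0 - lam) * visual_loglike
```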
-
An Efficient Repair Procedure For Quick Transcriptions
The procedure we propose in this paper aims to cleanse such quick transcriptions so that they align better with the acoustic evidence and thus yield better acoustic models for automatic speech recognition (ASR).
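A minimal sketch of one simple cleansing strategy, given purely as an illustration and not as the paper's procedure: force-align each quick-transcribed segment with an existing acoustic model and keep only segments whose alignment score clears a threshold.

```python
# Sketch: drop segments whose transcript fits the audio poorly, so mismatched
# quick transcriptions do not pollute acoustic-model training. `align_score`
# is a hypothetical helper returning an average per-frame alignment
# log-likelihood from forced alignment.
from typing import Callable, Iterable, List, Tuple
import numpy as np

def filter_segments(segments: Iterable[Tuple[str, np.ndarray]],
                    align_score: Callable[[str, np.ndarray], float],
                    threshold: float) -> List[Tuple[str, np.ndarray]]:
    """segments: (transcript, features) pairs; keep the well-aligned ones."""
    return [(txt, feats) for txt, feats in segments
            if align_score(txt, feats) >= threshold]
```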