Speech & natural language publications
-
On Using MLP Features in LVCSR
One of the major research thrusts in the speech group at ICSI is to use Multi-Layer Perceptron (MLP) based features in automatic speech recognition (ASR). This paper presents a study…
-
Automatic Diacritization of Arabic for Acoustic Modeling in Speech Recognition
In this paper we investigate different procedures that enable us to use training data transcribed without diacritics by automatically inserting the missing diacritics into the transcription.
-
Comparing and Combining Generative and Posterior Probability Models: Some Advances in Sentence Boundary Detection in Speech
We compare and contrast two different models for detecting sentence-like units in continuous speech. Both models combine lexical, syntactic, and prosodic information.
-
Identifying Agreement and Disagreement in Conversational Speech: Use of Bayesian Networks to Model Pragmatic Dependencies
We describe a statistical approach for modeling agreements and disagreements in conversational interaction.
-
Managing uncertainty in dialogue information state for real time understanding of multi-human meeting dialogue
Our ultimate aim is to model human-human dialogue (to the extent that it is feasible) in real time, providing useful services (e.g., relevant document retrieval) and answering queries about the dialogue…
-
Modeling NERFs for Speaker Recognition
We introduce a new type of feature to capture long-range patterns associated with individual speakers or with speaking styles. NERFs, or Nonuniform Extraction Region Features, are defined based on regions…
-
Voicing Feature Integration in SRI’s Decipher LVCSR System
We augment the Mel cepstral (MFCC) feature representation with voicing features from an independent front end.
-
Application of the Modified Group Delay Function to Speaker Identification and Discrimination
In this paper, we explore new methods by which speakers can be identified and discriminated, using features derived from the Fourier transform phase. A Gaussian mixture model (GMM) based speaker…
-
Cross-dialectal Acoustic Data Sharing for Arabic Speech Recognition
In this paper we describe the use of acoustic data from Modern Standard Arabic (MSA) to improve the recognition of Egyptian Conversational Arabic (ECA).
-
TRAPping Conversational Speech: Extending TRAP/Tandem Approaches to Conversational Telephone Speech Recognition
In this paper we report experiments with a reduced conversational speech task that led to the adoption of a number of engineering decisions for the design of an acoustic front…
-
The Use of a Linguistically Motivated Language Model in Conversational Speech Recognition
In this paper we show that a linguistically motivated language model can be used effectively and efficiently in all stages of a complex, multi-pass conversational telephone speech recognition system.
-
Improving Automatic Sentence Boundary Detection with Confusion Networks
We extend existing methods for automatic sentence boundary detection by leveraging multiple recognizer hypotheses in order to provide robustness to speech recognition errors.