Author: Horacio Franco
-
Wideband Spectral Monitoring Using Deep Learning
We present a system that performs spectral monitoring of a 666.5 MHz wide band, located within a 6 GHz range of Radio Frequency (RF) bandwidth, using state-of-the-art deep learning approaches.
-
Voices Obscured in Complex Environmental Settings (VOiCES) corpus
This work is a multi-organizational effort led by SRI International and Lab41 intended to push forward the state of the art in distant-microphone signal processing and speech recognition.
-
Tackling Unseen Acoustic Conditions in Query-by-Example Search Using Time and Frequency Convolution for Multilingual Deep Bottleneck Features
This paper revisits two neural network architectures developed for noise- and channel-robust ASR and applies them to building a state-of-the-art multilingual query-by-example (QbE) system.
-
Noise-robust Exemplar Matching for Rescoring Query-by-Example Search
This paper describes a two-step approach to keyword spotting in which a query-by-example search is followed by noise-robust exemplar matching rescoring.
-
Leveraging Deep Neural Network Activation Entropy to Cope with Unseen Data in Speech Recognition
This work estimates how distortion caused by unseen data propagates through a network, in the form of activation entropy: entropy is measured over a short-time running window on the activations of each neuron in a given hidden layer, and these measurements are then combined into a summary entropy.
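As a rough illustration only (not the paper's exact recipe), a minimal NumPy sketch of a per-neuron running-window entropy measure could look like the following; the window length, histogram binning, final averaging, and the helper name get_hidden_activations are all assumptions.

    import numpy as np

    def running_activation_entropy(activations, win=50, n_bins=20):
        # activations: (n_frames, n_neurons) outputs of one hidden layer.
        # Returns an array of shape (n_windows, n_neurons) of entropies.
        n_frames, n_neurons = activations.shape
        window_entropies = []
        for start in range(0, n_frames - win + 1, win):
            chunk = activations[start:start + win]      # (win, n_neurons)
            ent = np.empty(n_neurons)
            for j in range(n_neurons):
                # Histogram each neuron's activations within the window
                # and take the entropy of the resulting distribution.
                counts, _ = np.histogram(chunk[:, j], bins=n_bins)
                p = counts / counts.sum()
                p = p[p > 0]
                ent[j] = -np.sum(p * np.log(p))
            window_entropies.append(ent)
        return np.array(window_entropies)

    # Summary entropy for an utterance: average over windows and neurons.
    # layer_act = get_hidden_activations(utterance)   # hypothetical helper
    # summary_entropy = running_activation_entropy(layer_act).mean()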
-
Joint modeling of articulatory and acoustic spaces for continuous speech recognition tasks
This paper investigates using deep neural networks (DNNs) and convolutional neural networks (CNNs) to map speech data into its corresponding articulatory space.
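A minimal sketch of this kind of frame-level acoustic-to-articulatory regression, assuming 39-dimensional acoustic frames and a small set of articulatory trajectories as targets (both stand-ins, not the paper's actual features or network):

    import torch
    import torch.nn as nn

    # Stand-in data: acoustic frames -> articulatory trajectories.
    acoustic = torch.randn(2000, 39)
    articulatory = torch.randn(2000, 6)

    # Simple feed-forward mapping network (sizes are illustrative).
    net = nn.Sequential(nn.Linear(39, 256), nn.ReLU(),
                        nn.Linear(256, 256), nn.ReLU(),
                        nn.Linear(256, 6))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for epoch in range(20):
        pred = net(acoustic)
        loss = nn.functional.mse_loss(pred, articulatory)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # The predicted articulatory trajectories can then be used alongside
    # the acoustic features as an additional input stream to a recognizer.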
-
Speech recognition in unseen and noisy channel conditions
This work investigates robust features, the feature-space maximum likelihood linear regression (fMLLR) transform, and deep convolutional nets to address the problem of unseen channel and noise conditions in speech recognition.
-
Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition
This work investigates how traditional deep neural networks perform under varying acoustic conditions, evaluating them on speech recorded under realistic background conditions that are mismatched with respect to the training data.
-
Unsupervised Learning of Acoustic Units Using Autoencoders and Kohonen Nets
This work investigates learning acoustic units in an unsupervised manner from real-world speech data by using a cascade of an autoencoder and a Kohonen net.
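The cascade could be sketched roughly as follows, with a small PyTorch autoencoder and the MiniSom library standing in for the Kohonen net; the feature dimensions, training schedule, and 8x8 map size are illustrative assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn
    from minisom import MiniSom   # pip install minisom

    # Toy stand-in for acoustic feature frames: (n_frames, n_dims).
    frames = torch.randn(5000, 39)

    # 1) Autoencoder: learn a compact encoding of each frame.
    class AE(nn.Module):
        def __init__(self, dim_in=39, dim_code=16):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(dim_in, 64), nn.Tanh(),
                                     nn.Linear(64, dim_code), nn.Tanh())
            self.dec = nn.Sequential(nn.Linear(dim_code, 64), nn.Tanh(),
                                     nn.Linear(64, dim_in))
        def forward(self, x):
            code = self.enc(x)
            return self.dec(code), code

    ae = AE()
    opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
    for epoch in range(20):
        recon, _ = ae(frames)
        loss = nn.functional.mse_loss(recon, frames)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # 2) Kohonen net (self-organizing map) over the learned codes:
    # each map cell acts as a candidate acoustic unit.
    with torch.no_grad():
        codes = ae.enc(frames).numpy()
    som = MiniSom(8, 8, codes.shape[1], sigma=1.0, learning_rate=0.5)
    som.train_random(codes, 10000)
    units = [som.winner(c) for c in codes]   # grid cell index per frame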
-
Time-frequency convolutional networks for robust speech recognition
This work presents a modified CDNN architecture that we call the time-frequency convolutional network (TFCNN), in which two parallel convolutional layers operate on the input feature space, one across time and one across frequency, each followed by its own pooling layer.
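A rough PyTorch sketch of the parallel time/frequency convolution idea; the filter shapes, channel counts, and pooling widths here are illustrative assumptions, not the TFCNN's actual configuration.

    import torch
    import torch.nn as nn

    class TFConvFrontEnd(nn.Module):
        # Two parallel convolutions over a (frequency, time) feature map,
        # one spanning time and one spanning frequency, each with its own
        # pooling; the two streams are concatenated for the layers above.
        def __init__(self, n_filters=64):
            super().__init__()
            # Convolve along the time axis only.
            self.time_conv = nn.Conv2d(1, n_filters, kernel_size=(1, 5))
            self.time_pool = nn.MaxPool2d(kernel_size=(1, 3))
            # Convolve along the frequency axis only.
            self.freq_conv = nn.Conv2d(1, n_filters, kernel_size=(8, 1))
            self.freq_pool = nn.MaxPool2d(kernel_size=(3, 1))

        def forward(self, x):            # x: (batch, 1, n_freq, n_frames)
            t = self.time_pool(torch.relu(self.time_conv(x)))
            f = self.freq_pool(torch.relu(self.freq_conv(x)))
            # Flatten both streams and concatenate before the fully
            # connected layers of the acoustic model.
            return torch.cat([t.flatten(1), f.flatten(1)], dim=1)

    # feats = torch.randn(4, 1, 40, 101)   # 40-band filterbank windows
    # out = TFConvFrontEnd()(feats)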
-
Improving robustness against reverberation for automatic speech recognition
In this work, we explore the role of robust acoustic features, motivated by human speech perception studies, in building ASR systems that are robust to reverberation effects.
-
Classification of Lexical Stress Using Spectral and Prosodic Features for Computer-assisted Language Learning Systems
We present a system for detecting lexical stress in English words spoken by English learners, designed to be part of the EduSpeak® computer-assisted language learning (CALL) software.