Speech & natural language publications
-
Toward Fail-Safe Speaker Recognition: Trial-Based Calibration with a Reject Option
In this work, we extend the TBC method, proposing a new similarity metric for selecting training data that results in significant gains over the one proposed in the original work.
-
Resilient Data Augmentation Approaches to Multimodal Verification in the News Domain
Building on multimodal embedding techniques, we show that data augmentation via two distinct approaches improves results: entity linking and cross-domain local similarity scaling.
-
Natural Language Access: When Reasoning Makes Sense
We argue that to use natural language effectively, we must have both a deep understanding of the subject domain and a general-purpose reasoning capability.
-
Wideband Spectral Monitoring Using Deep Learning
We present a system to perform spectral monitoring of a wide band of 666.5 MHz, located within a range of 6 GHz of Radio Frequency (RF) bandwidth, using state-of-the-art deep…
-
Dual orexin and MCH neuron-ablated mice display severe sleep attacks and cataplexy
These results indicate a functional interaction between orexin and MCH neurons in vivo that suggests the synergistic involvement of these neuronal populations in the sleep/wakefulness cycle.
-
Mapping Individual to Group Level Collaboration Indicators Using Speech Data
To address the challenge of mapping characteristics of individuals’ speech to information about the group, we coded behavioral and learning-related indicators of collaboration at the individual level.
-
Robust Speaker Recognition from Distant Speech under Real Reverberant Environments Using Speaker Embeddings
This article focuses on speaker recognition from speech acquired with a single distant or far-field microphone in an indoor environment.
-
Analysis of Complementary Information Sources in the Speaker Embeddings Framework
In this study, we aim to analyze how speaker recognition systems based on speaker embeddings behave with different front-end features, including the standard MFCC, as well as PNCC,…
-
Structure-based lead optimization to improve antiviral potency and ADMET properties of phenyl-1H-pyrrole-carboxamide entry inhibitors targeted to HIV-1 gp120
We are continuing our concerted effort to optimize our first lead entry antagonist, NBD-11021, which targets the Phe43 cavity of the HIV-1 envelope glycoprotein gp120, to improve antiviral potency and…
-
How to train your speaker embedding extractor
In this study, we aim to explore some of the fundamental requirements for building a good speaker embeddings extractor. We analyze the impact of voice activity detection, types of degradation,…
-
Voices Obscured in Complex Environmental Settings (VOiCES) corpus
This work is a multi-organizational effort led by SRI International and Lab41, intended to advance the state of the art in distant-microphone approaches to signal processing and speech recognition.
-
Approaches to multi-domain language recognition
Approaches found to provide robustness in multi-domain language identification (LID) include a domain-and-language-weighted Gaussian backend classifier, duration-aware calibration, and a source-normalized multi-resolution neural network backend.