Speech technology and research lab
Communicating with, and through, computer applications
The Speech Technology and Research (STAR) Laboratory brings together a multidisciplinary mix of engineers, computer scientists and linguists. Together, our experts build systems for a wide range of applications including signal processing; data indexing and mining; and computer-aided learning. SRI’s speech and language technologies allow us to interact more naturally with computing applications and provide a wealth of actionable information about our intentions, health, and emotional state.
Core technologies and applications
Real-world impact
-
SRI’s AI-driven voice analysis could help screen for mental health conditions
Researchers at SRI are developing tools to help clinicians keep a close eye on depression, PTSD, and other mental health issues.
-
SRI is developing textiles that record audio
Turning piezoelectric materials and lithium-ion batteries into thread, innovators will weave fabrics that record sound.
-
Nuance Partners with SCIENTIA Puerto Rico
SRI spin-out Nuance Communications to expand access to its Dragon Medical One for the island’s physicians and nurses
Featured researchers
-
Dimitra Vergyri
Director, Speech Technology and Research Laboratory (STAR)
-
Horacio Franco
Chief Scientist, Speech Technology and Research Laboratory
-
Aaron Lawson
Assistant Laboratory Director, Speech Technology and Research Laboratory
-
Martin Graciarena
Technical Manager, Speech Technology and Research Laboratory
-
Mitchell McLaren
Senior Computer Scientist, Speech Technology and Research Laboratory
-
Harry Bratt
Senior Computer Scientist, Speech Technology and Research Laboratory
Platforms
Publications
-
Toward Fail-Safe Speaker Recognition: Trial-Based Calibration with a Reject Option
In this work, we extend the trial-based calibration (TBC) method, proposing a new similarity metric for selecting training data that results in significant gains over the one proposed in the original work.
-
Resilient Data Augmentation Approaches to Multimodal Verification in the News Domain
Building on multimodal embedding techniques, we show that two distinct data augmentation approaches, entity linking and cross-domain local similarity scaling, improve results.
-
Natural Language Access: When Reasoning Makes Sense
We argue that to use natural language effectively, we must have both a deep understanding of the subject domain and a general-purpose reasoning capability.