September 8, 2021

John Niekrasz

Technical Manager, Artificial Intelligence Center

John Niekrasz is a Technical Manager in the Advanced Analytics group of SRI’s Artificial Intelligence Center. His research centers on automated discourse analysis in support of technologies that understand the purposes and intentions behind language use, particularly in informal and conversational genres. His published research spans automated essay scoring, speech summarization, linguistic pragmatics, conversational communication, discourse segmentation, information extraction, dialogue systems, and human-computer interfaces.

Prior to joining SRI in 2010, Niekrasz was at Stanford’s Center for the Study of Language and Information, where he carried out research for the DARPA Personal Assistant that Learns (PAL) program. There, under the SRI-led Cognitive Assistant that Learns and Organizes (CALO) project, he developed a personal meeting assistant and studied methods for automatically extracting action items and decisions during spoken meetings. His subsequent doctoral research at the University of Edinburgh was conducted as part of the EU-funded Augmented Multi-party Interaction (AMI) project, which investigated automated content linking and indexing to support spoken conversations in real time.

Niekrasz holds a Ph.D. in linguistics from the University of Edinburgh and a B.S. in symbolic systems from Stanford University. His dissertation examined algorithms for automatically summarizing communicative activities in multi-party face-to-face conversations.

Recent publications

October 8, 2022

Accelerating Human Authorship of Information Extraction Rules

We simulate the process of corpus review and word list creation, showing that several simple interventions greatly improve recall as a function of simulated labor.
August 1, 2016

Feature Derivation for Exploitation of Distant Annotation via Pattern Induction against Dependency Parses

We consider the use of distant supervision for biological information extraction, and introduce two understudied corpora of this form, the Biological Expression Language (BEL) Large Corpus and the Pathway Logic…
January 1, 2016

An Annotated Corpus and Method for Analysis of Ad-Hoc Structures Embedded in Text

We describe a method for identifying and performing functional analysis of structured regions that are embedded in natural language documents, such as tables or key-value lists.
January 1, 2016

Assessing Problem-Solving Process At Scale

This paper describes a hybrid approach to assessing process at scale in the context of the use of computational thinking practices during programming.
October 1, 2013

Unsupervised Discovery and Extraction of Semi-Structured Regions in Text Via Self-Information

We present initial work that uses significant patterns to generate extraction rules, and conclude with a discussion of future directions of our work.
January 1, 2012

A corpus of online discussions for research into linguistic memes

We describe a 460-million word corpus of online discussions.
