Author: Ajay Divakaran

April 2, 2019

Lucid Explanations Help: Using a Human-AI Image-Guessing Game to Evaluate Machine Explanation Helpfulness

We propose a Twenty-Questions style collaborative image retrieval game as a method of evaluating the efficacy of explanations (visual evidence or textual justification) in the context of Visual Question Answering.
September 7, 2018

Zero-Shot Object Detection

We introduce and tackle the problem of zero-shot object detection, which aims to detect object classes that are not observed during training.
May 1, 2014

Emotion Detection in Speech Using Deep Networks

We propose a novel staged hybrid model for emotion detection in speech. Hybrid models exploit the strength of discriminative classifiers along with the representational power of generative models.
December 1, 2013

Dynamic Pooling for Complex Event Recognition

Complex events are defined as events composed of several characteristic behaviors, whose temporal configuration can change from sequence to sequence.
October 1, 2013

Semantic Pooling for Complex Event Detection

We propose a semantic pooling approach to tackle this issue. Unlike the conventional pooling over the entire video or specific spatial regions of a video, we employ a discriminative approach to acquire abstract semantic “regions” for pooling.
August 1, 2013

Affect Analysis in Natural Human Interaction Using Joint Hidden Conditional Random Fields

We present a novel approach for multi-modal affect analysis in human interactions that is capable of integrating data from multiple modalities while also taking into account temporal dynamics.
July 1, 2013

Affect Analysis in Natural Human Interaction Using Joint Hidden Conditional Random Fields

We present a novel approach for multi-modal affect analysis in human interactions that is capable of integrating data from multiple modalities while also taking into account temporal dynamics.
April 1, 2013

On the Applicability of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video

This paper explores how unsupervised audio segmentation systems such as speaker diarization can be adapted to automatically identify low-level sound concepts similar to annotator-defined concepts, and how these concepts can be used for audio indexing.
October 1, 2012

Multimedia Event Recounting with Concept Based Representation

We conduct a pilot study of the multimedia event recounting problem, which answers the question of why a video is recognized as a given event, i.e., what evidence the decision is based on.
May 1, 2010

Weapon Identification Across Varying Acoustic Conditions Using an Exemplar Embedding Approach

In this paper, we present a first study of using an exemplar embedding approach to automatically detect and classify firearm type across different recording conditions.
May 1, 2010

Wide Area Active Collaborative Tracking of Waterborne Vessels

We describe a real-time wide area surveillance system (WA-ACTV) for the automatic tracking of vessels using a network of PTZ cameras.
April 1, 2009

Automatic Food Documentation and Volume Computation Using Digital Imaging and Electronic Transmission

Improving methodology even modestly would advance our knowledge about the influence of food intake on health.