Author: Ajay Divakaran
-
Lucid Explanations Help: Using a Human-AI Image-Guessing Game to Evaluate Machine Explanation Helpfulness
We propose a Twenty-Questions style collaborative image retrieval game as a method of evaluating the efficacy of explanations (visual evidence or textual justification) in the context of Visual Question Answering.
-
Zero-Shot Object Detection
We introduce and tackle the problem of zero-shot object detection, which aims to detect object classes that are not observed during training.
-
Emotion Detection in Speech Using Deep Networks
We propose a novel staged hybrid model for emotion detection in speech. Hybrid models exploit the strength of discriminative classifiers along with the representational power of generative models.
-
Dynamic Pooling for Complex Event Recognition
Complex events are defined as events composed of several characteristic behaviors, whose temporal configuration can change from sequence to sequence.
-
Semantic Pooling for Complex Event Detection
We propose a semantic pooling approach to complex event detection. Unlike conventional pooling over the entire video or specific spatial regions of a video, we employ a discriminative approach to acquire abstract semantic “regions” for pooling.
-
Affect Analysis in Natural Human Interaction Using Joint Hidden Conditional Random Fields
We present a novel approach for multi-modal affect analysis in human interactions that is capable of integrating data from multiple modalities while also taking into account temporal dynamics.
-
On the Applicability of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video
This paper explores how unsupervised audio segmentation systems such as speaker diarization can be adapted to automatically identify low-level sound concepts similar to annotator-defined concepts, and how these concepts can be used for audio indexing.
-
Multimedia Event Recounting with Concept Based Representation
We conduct a pilot study of the multimedia event recounting problem, which answers the question of why a video is recognized as a given event, i.e., what evidence the decision is based on.
-
Weapon Identification Across Varying Acoustic Conditions Using an Exemplar Embedding Approach
In this paper we present a first study of using an exemplar embedding approach to automatically detect and classify firearm type across different recording conditions.
-
Wide Area Active Collaborative Tracking of Waterborne Vessels
We describe a real-time wide area surveillance system (WA-ACTV) for the automatic tracking of vessels using a network of PTZ cameras.
-
Automatic Food Documentation and Volume Computation Using Digital Imaging and Electronic Transmission
Improving methodology even modestly would advance our knowledge about the influence of food intake on health.