Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation

SRI authors: ,

Citation

Y. Yao, Divakaran,A., Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation, IEEE International Conference on Computer Vision and Pattern Recognition, Long Beach, CA, June 16-21, 2019.

Abstract

We propose novel Stacked Spatio-Temporal Graph Convolutional Networks (Stacked-STGCN) for action segmentation, i.e., predicting and localizing a sequence of actions over long videos. We extend the Spatio-Temporal Graph Convolutional Network (STGCN) originally proposed for skeleton-based action recognition to enable nodes with different characteristics (e.g., scene, actor, object, action, etc.), feature descriptors with varied lengths, and arbitrary temporal edge connections to account for large graph deformation commonly associated with complex activities. We further introduce the stacked hourglass architecture to STGCN to leverage the advantages of an encoder-decoder design for improved generalization performance and localization accuracy. We explore various descriptors such as frame-level VGG, segment-level I3D, RCNN-based object, etc. as node descriptors to enable action segmentation based on joint inference over comprehensive contextual information. We show results on CAD120 (which provides pre-computed node features and edge weights for fair performance comparison across algorithms) as well as a more complex real-world activity dataset, Charades. Our Stacked-STGCN in general achieves 4.0% performance improvement over the best reported results in F1 score on CAD120 and 1.3% in mAP on Charades using VGG features.


Read more from SRI

  • A photo of Mary Wagner

    Recognizing the life and work of Mary Wagner 

    A cherished SRI colleague and globally respected leader in education research, Mary Wagner leaves behind an extraordinary legacy of groundbreaking work supporting children and youth with disabilities and their families.

  • Testing XRGo in a robotics laboratory

    Robots in the cleanroom

    A global health leader is exploring how SRI’s robotic telemanipulation technology can enhance pharmaceutical manufacturing.

  • SRI research aims to make generative AI more trustworthy

    Researchers have developed a new framework that reduces generative AI hallucinations by up to 32%.