Citation
Xiao, J., Cheng, H., Han, F., & Sawhney, H. (2009). Geo-Based Aerial Surveillance Video Processing For Scene Understanding And Object Tracking. International Journal of Pattern Recognition and Artificial Intelligence, 23(07), 1285-1307.
Abstract
This paper presents an approach to extract semantic layers from aerial surveillance videos for scene understanding and object tracking. The input videos are captured by low flying aerial platforms and typically consist of strong parallax from non-ground-plane structures as well as moving objects. Our approach leverages the geo-registration between video frames and reference images (such as those available from Terraserver and Google satellite imagery) to establish a unique geo-spatial coordinate system for pixels in the video. The geo-registration process enables Euclidean 3D reconstruction with absolute scale unlike traditional monocular structure from motion where continuous scale estimation over long periods of time is an issue. Geo-registration also enables correlation of video data to other stored information sources such as GIS (Geo-spatial Information System) databases. In addition to the geo-registration and 3D reconstruction aspects, the other key contributions of this paper also include: (1) providing a reliable geo-based solution to estimate camera pose for 3D reconstruction, (2) exploiting appearance and 3D shape constraints derived from geo-registered videos for labeling of structures such as buildings, foliage, and roads for scene understanding, and (3) elimination of moving object detection and tracking errors using 3D parallax constraints and semantic labels derived from geo-registered videos. Experimental results on extended time aerial video data demonstrates the qualitative and quantitative aspects of our work.
Keywords: Aerial video processing, scene segmentation, layer extraction and labeling, moving object detection and tracking