The US increasingly relies on surveillance video to determine when activities of interest occur in a surveilled location. The growth in video volume places a heavy burden on the analyst workforce charged with evaluating streaming video or performing forensic analysis on archived video. This paper presents a video summarization pipeline that reduces the volume of video analysts must watch by condensing the video into shorter, presumably important clips. The pipeline applies object recognition and tracking to generate clips composed of object bounding boxes across time, segments these clips into unique trajectories, trains a stacked sparse autoencoder, and then generates a summary based on reconstruction error within the autoencoder, where high error indicates an object trajectory that is unique relative to previously observed trajectories. The paper then compares the pipeline's performance on research datasets with its performance on more realistic DoD surveillance datasets.
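The core scoring idea described above (flag trajectories whose autoencoder reconstruction error is high relative to previously seen data) can be sketched as follows. This is an illustrative simplification, not the paper's implementation: it uses a single-hidden-layer autoencoder in NumPy rather than a stacked sparse autoencoder, synthetic 8-dimensional "trajectory feature" vectors rather than real tracker output, and a hypothetical mean-plus-3-sigma threshold on training error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for trajectory feature vectors: "common" trajectories
# cluster around one pattern; "novel" ones do not resemble the training data.
common = rng.normal(0.0, 0.1, size=(200, 8)) + np.linspace(0.0, 1.0, 8)
novel = rng.normal(0.0, 1.0, size=(5, 8))

# Minimal one-hidden-layer autoencoder (a simplified stand-in for the
# paper's stacked sparse autoencoder), trained by gradient descent.
n_in, n_hid = 8, 3
W1 = rng.normal(0.0, 0.1, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0.0, 0.1, (n_hid, n_in)); b2 = np.zeros(n_in)

def forward(X):
    H = np.tanh(X @ W1 + b1)          # encoder
    return H, H @ W2 + b2             # decoder (linear output)

lr = 0.05
for _ in range(2000):
    H, Xhat = forward(common)
    err = Xhat - common               # gradient of MSE w.r.t. the output
    gW2 = H.T @ err / len(common)
    gb2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1.0 - H**2)  # backprop through tanh
    gW1 = common.T @ dH / len(common)
    gb1 = dH.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

def recon_error(X):
    """Per-trajectory mean squared reconstruction error."""
    _, Xhat = forward(X)
    return ((X - Xhat) ** 2).mean(axis=1)

# Trajectories whose error exceeds a threshold derived from the training
# errors are treated as unique and kept for the summary.
train_err = recon_error(common)
threshold = train_err.mean() + 3.0 * train_err.std()
summary_mask = recon_error(novel) > threshold
```

Under these assumptions, the autoencoder reconstructs common trajectories well, so novel trajectories stand out with errors far above the threshold; in the real pipeline the same scoring would be applied to features extracted from tracked object trajectories.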
K. Pitstick, J. Hansen, M. Klein, E. Morris, and J. Vazquez-Trejo, "Applying video summarization to aerial surveillance," Proc. SPIE 10635, Ground/Air Multisensor Interoperability, Integration, and Networking for Persistent ISR IX, 106350D (presented at SPIE Defense + Security: 16 April 2018; published: 4 May 2018); https://doi.org/10.1117/12.2314877.