This paper describes a system for mining surveillance video. Our main contributions are: a high-level query language for expressing queries about spatial and temporal relations of background regions and moving entities, and about human activities; a compiler that maps high-level queries into a set of novel Petri net filters, which employ computer vision algorithms to answer the components of a query; and a powerful graphical interface that lets users formulate queries visually.
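To make the Petri net filter idea concrete, here is a minimal sketch (not the paper's actual query language or filter design) of a net that recognizes the temporal query "an entity enters a region and then stops inside it" from a stream of low-level vision events. All place, transition, and event names are invented for illustration.

```python
# Hypothetical Petri net "filter": tokens move from place to place as
# vision events arrive; the query is satisfied once a token reaches "done".
class PetriNetFilter:
    def __init__(self, transitions):
        # transitions: event label -> (input places, output places)
        self.transitions = transitions
        self.marking = {"start": 1}  # one token in the start place

    def fire(self, event):
        """Consume one vision event; fire the matching transition if enabled."""
        if event not in self.transitions:
            return
        inputs, outputs = self.transitions[event]
        if all(self.marking.get(p, 0) > 0 for p in inputs):
            for p in inputs:
                self.marking[p] -= 1
            for p in outputs:
                self.marking[p] = self.marking.get(p, 0) + 1

    def accepts(self):
        return self.marking.get("done", 0) > 0

# Query: "object enters the region, then stops inside it"
net = PetriNetFilter({
    "enter_region": (["start"], ["inside"]),
    "stop":         (["inside"], ["done"]),
})
# The first "stop" is ignored: no token in "inside" yet, so order matters.
for ev in ["stop", "enter_region", "stop"]:
    net.fire(ev)
print(net.accepts())  # True: the sequence enter -> stop was observed
```

Ordering constraints fall out of the token flow: a transition can only fire once its input places hold tokens, which is what makes Petri nets a natural target for compiling temporal queries.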
With the increased demand for access to distributed information and communication via the Internet, wireless PDA devices, and cellular phones, there is a tremendous need for information providers to adapt to various user demands, not only in terms of the diversity of content, but also in how that content is delivered. The diversity of user, network, and client capabilities often makes traditional approaches, such as the storage of multiple representations, impractical.
In the case of video, one of the primary considerations is the ability to provide content over low-bandwidth and noisy networks as well as higher-capacity networks. In this paper, we focus on novel techniques for adapting video for delivery over IP. We provide insight into the existing problems and an overview of how our video analysis work is being applied in these areas.
Automated classification of digital video is emerging as an important piece of the puzzle in the design of content management systems for digital libraries. The ability to classify videos into classes such as sports, news, movies, or documentaries increases the efficiency of indexing, browsing, and retrieval of video in large databases. In this paper, we discuss the extraction of features that enable identification of sports videos directly from the compressed domain of MPEG video. These features include detecting the presence of action replays, determining the amount of scene text in video, and calculating various statistics on camera and/or object motion. The features are derived from the macroblock, motion, and bit-rate information that is readily accessible from MPEG video with minimal decoding, leading to substantial gains in processing speed. Full decoding of selected frames is required only for text analysis. A decision tree classifier built using these features is able to identify sports clips with an accuracy of about 93 percent.
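As a rough illustration of how such a classifier might combine compressed-domain features, here is a tiny hand-written decision tree. The feature names and thresholds are invented for the example and are not the tree learned in the paper.

```python
# Illustrative sketch: a hand-rolled decision tree over the kinds of
# MPEG-derived features the abstract mentions (action replays, scene-text
# coverage, motion statistics). Thresholds are hypothetical.
def classify_clip(features):
    """Return 'sports' or 'non-sports' from compressed-domain features."""
    if features["replay_count"] >= 1:            # action replays strongly suggest sports
        return "sports"
    if features["text_fraction"] > 0.05:         # persistent overlays, e.g. a scoreboard
        if features["avg_motion_magnitude"] > 4.0:  # sustained camera/object motion
            return "sports"
    return "non-sports"

clip = {"replay_count": 2, "text_fraction": 0.01, "avg_motion_magnitude": 1.2}
print(classify_clip(clip))  # sports
```

In practice such a tree would be induced from labeled training clips rather than written by hand; the point here is only that each internal node tests one cheap compressed-domain feature.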
Video segmentation plays an integral role in many multimedia applications, such as digital libraries, content management systems, and various other video browsing, indexing, and retrieval systems. Many algorithms for segmentation of video have appeared within the past few years. Most of these algorithms perform well on cuts, but yield poor performance on gradual transitions or special effect edits. A complete video segmentation system must also achieve good performance on special effect edit detection. In this paper, we compare the performance of our Video Trails-based algorithms with that of other special effect edit-detection algorithms in the literature. We present results from experiments on TV programs, ranging from commercials to news magazine programs, that contain the diverse special effect edits we have introduced.
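For context, the standard baseline against which such systems are measured is simple cut detection by comparing consecutive frame histograms; this sketch shows that baseline (it is not the Video Trails algorithm), using synthetic frame data.

```python
# Minimal shot-boundary (cut) detector: flag a cut wherever the grey-level
# histogram changes sharply between consecutive frames. Gradual transitions
# and special effects defeat this, which motivates the more robust methods
# the abstract discusses.
def histogram(frame, bins=8, max_val=256):
    h = [0] * bins
    for px in frame:
        h[px * bins // max_val] += 1
    return h

def detect_cuts(frames, threshold):
    """Return indices of frames whose histogram differs sharply from the previous one."""
    cuts = []
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            cuts.append(i)
        prev = cur
    return cuts

# Two synthetic "shots": three dark frames, then three bright frames.
frames = [[10] * 100] * 3 + [[200] * 100] * 3
print(detect_cuts(frames, threshold=50))  # [3]
```

A dissolve spread over many frames produces only small per-frame histogram differences, so no single step crosses the threshold, which is exactly the failure mode on gradual transitions noted above.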
We examine the potential role of probabilistic analysis in the integration of sensor-based trajectory planning and motion/structure estimation. In this article we report three formalisms illustrating this approach.
We describe algorithms for measuring the thickness of neuron dendritic processes and the shape of neuron cell bodies. The design of these tools follows a 'semi-automatic' approach: image processing tools that would fail when applied to the whole image can produce very useful results if the user confines them by hand to small parts of the image, and if the user is given the opportunity to undo or correct the results interactively.
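The semi-automatic idea can be sketched as follows: rather than thresholding the whole image, a simple operator is applied only to a short cross-section the user has dragged across one dendrite. The intensity values and threshold below are invented for illustration.

```python
# Hedged sketch of user-confined measurement: thickness is the width of the
# bright run in a 1-D intensity profile taken across a single dendrite.
# Applied image-wide, a fixed threshold would fail; confined to a small,
# hand-picked cross-section, it is often good enough, and the user can
# reject or redo the measurement.
def thickness_at(profile, threshold):
    """Width (in pixels) of the above-threshold dendrite run in a cross-section."""
    return sum(1 for v in profile if v >= threshold)

# A cross-section the user dragged across one process: background ~20, dendrite ~180.
profile = [22, 19, 175, 182, 190, 178, 21, 18]
print(thickness_at(profile, threshold=100))  # 4
```

The interactive undo/correct step mentioned in the abstract is what makes such crude operators acceptable: a bad threshold on one cross-section costs the user one click, not a whole-image failure.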