Analyzing composite behaviors involving objects from multiple categories in surveillance videos is challenging because of the complicated relationships between humans and objects. This paper presents a novel behavior analysis framework for video surveillance systems based on a hierarchical dynamic Bayesian network (DBN). The model extracts objects' behaviors and their relationships by representing behaviors through their spatio-temporal characteristics. The DBN recognizes object behaviors at multiple levels: object features at the low level, objects and their relationships at the middle level, and events at the high level, where an event refers either to the behavior of a single type of object or to a behavior involving several types of objects, such as "a person getting in a car." Furthermore, to reduce complexity, a simple model selection criterion is introduced, by which the appropriate model is picked from a pool of candidate models. Experiments demonstrate that the proposed framework can efficiently recognize and semantically describe composite object and human activities in surveillance videos.
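The idea of picking one model from a pool of candidates can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so the code below assumes the Bayesian information criterion (BIC) and uses discrete hidden Markov models, the simplest form of DBN, as stand-ins for the candidate behavior models. The model names, the observation symbols (0 = person far from car, 1 = near, 2 = overlapping), and all probabilities are hypothetical values chosen for illustration only.

```python
import math

def forward_loglik(obs, prior, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    n = len(prior)
    alpha = [prior[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate the normalized state belief one step, then weight by emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik

def num_free_params(prior, trans, emit):
    """Free parameters of a discrete HMM with n states and m symbols."""
    n, m = len(prior), len(emit[0])
    return (n - 1) + n * (n - 1) + n * (m - 1)

def select_model(obs, candidates):
    """Return the name of the candidate with the lowest BIC (assumed criterion)."""
    best_name, best_bic = None, float("inf")
    for name, (prior, trans, emit) in candidates.items():
        loglik = forward_loglik(obs, prior, trans, emit)
        k = num_free_params(prior, trans, emit)
        bic = -2.0 * loglik + k * math.log(len(obs))
        if bic < best_bic:
            best_name, best_bic = name, bic
    return best_name

# Hypothetical candidate pool: each entry is (prior, transition, emission).
candidates = {
    "getting_in_car": (
        [1.0, 0.0],
        [[0.6, 0.4], [0.0, 1.0]],
        [[0.7, 0.25, 0.05], [0.05, 0.25, 0.7]],
    ),
    "walking_past": (
        [1.0, 0.0],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.8, 0.15, 0.05], [0.6, 0.35, 0.05]],
    ),
}

approach = [0, 0, 1, 2, 2]   # person moves toward the car and overlaps it
stay_far = [0, 0, 0, 0, 0]   # person remains far from the car
print(select_model(approach, candidates))
print(select_model(stay_far, candidates))
```

Because both hypothetical candidates have the same number of parameters, the BIC penalty cancels here and the choice reduces to comparing likelihoods; the penalty only matters when the pool mixes models of different sizes.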
This paper presents a novel approach using a context-sensitive Bayesian network for natural scene modeling and
classification. In contrast to the common approach of using semantic features, we learn the major spatial arrangement
(spatial and context information) of scenes and relationships between local semantic concepts and global scene meanings
using a contextual Bayesian network. Images' scene probabilities are inferred in a two-level process based on
characteristic objects in the image as well as spatial arrangements of key entities through the Bayesian network. We
demonstrate the promise of this Bayesian network approach on a set of natural scenes, comparing it with existing state of