Metadata interoperability is crucial for various kinds of surveillance applications and systems, e.g. metadata mining in
multi-sensor environments, metadata exchange in networked camera systems or information fusion in multi-sensor and
multi-detector environments. Different metadata formats have been proposed to foster metadata interoperability, but they
show significant limitations. ViPER, CVML and MPEG Visual Surveillance MAF support only the visual modality,
CVML's frame based approach leads to inefficient representation, and MPEG-7's comprehensiveness handicaps its
efficient usage for a specific application. To overcome these limitations we propose the Surveillance Application
Metadata (SAM) model, capable of describing online and offline analysis results as a set of time lines containing events.
A set of sensors, detectors, recorded media items and object instances is described centrally and linked from the event
descriptions. The time lines can be related to a subset of sensors and detectors for any modality and different levels of
abstraction. Hierarchical classification schemes are used for many purposes, such as types of properties and their values,
event types, object classes, coordinate systems etc. in order to allow for application specific adaptations without
modifying the data model while ensuring the controlled use of terms. The model supports efficient representation of
dense spatio-temporal information such as object trajectories. SAM is not bound to a specific serialization but can be
mapped to different existing formats within the limitations evoked by the target format. SAM specifications and
examples have been made available
The application of the mean shift algorithm to color image segmentation has been proposed in 1997 by Comaniciu and Meer. We apply the mean shift color segmentation to image sequences, as the first step of a moving object segmentation algorithm. Previous work has shown that it is well suited for this task, because it provides better temporal stability of the segmentation result than other approaches. The drawback is higher computational cost. For speed up of processing on image sequences we exploit the fact that subsequent frames are similar and use the cluster centers of previous frames as initial estimates, which also enhances spatial segmentation continuity. In contrast to other implementations we use the originally proposed CIE LUV color space to ensure high quality segmentation results. We show that moderate quantization of the input data before conversion to CIE LUV has little influence on the segmentation quality but results in significant speed up. We also propose changes in the post-processing step to increase the temporal stability of border pixels. We perform objective evaluation of the segmentation results to compare the original algorithm with our modified version. We show that our optimized algorithm reduces processing time and increases the temporal stability of the segmentation.
We present a case study of establishing a description infrastructure for an audiovisual content-analysis and retrieval system. The description infrastructure consists of an internal metadata model and access tool for using it. Based on an analysis of requirements, we have selected, out of a set of candidates, MPEG-7 as the basis of our metadata model.
The openness and generality of MPEG-7 allow using it in broad range of applications, but increase complexity and hinder interoperability. Profiling has been proposed as a solution, with the focus on selecting and constraining description tools. Semantic constraints are currently only described in textual form. Conformance in terms of semantics can thus not be evaluated automatically and mappings between different profiles can only be defined manually. As a solution, we propose an approach to formalize the semantic constraints of an MPEG-7 profile using a formal vocabulary expressed in OWL, which allows automated processing of semantic constraints.
We have defined the Detailed Audiovisual Profile as the profile to be used in our metadata model and we show how some of the semantic constraints of this profile can be formulated using ontologies. To work practically with the metadata model, we have implemented a MPEG-7 library and a client/server document access infrastructure.
Retrieval in current multimedia databases is usually limited to browsing and searching based on low-level visual features and explicit textual descriptors. Semantic aspects of visual information are mainly described in full text attributes or mapped onto specialized, application specific description schemes. Result lists of queries are commonly represented by textual descriptions and single key frames. This approach is valid for text documents and images, but is often insufficient to represent video content in a meaningful way. In this paper we present a multimedia retrieval framework focusing on video objects, which fully relies on the MPEG-7 standard as information base. It provides a content-based retrieval interface which uses hierarchical content-based video summaries to allow for quick viewing and browsing through search results even on bandwidth limited Web applications. Additionally semantic meaning about video content can be annotated based on domain specific ontologies, enabling a more targeted search for content. Our experiences and results with these techniques will be discussed in this paper.
We propose a search & retrieval (S&R) tool, which supports the combination of a text search with content-based search for video and image content. This S&R system allows the formulation of complex queries allowing the arbitrary combination of content-based and text-based query elements with logical operators. The system will be implemented as a client/server system. The entire S&R system is designed in such a way that the client system can be either a web application accessing the server over the Internet or a native client with local access to the server. The S&R tool is embedded into a system called MECiTV - Media Collaboration for iTV. Within MECiTV a complete authoring environment for iTV content will be developed. The proposed S&R tool will enable iTV authors and content producers to efficiently search for already existing material in order to reduce costs for iTV productions.