The growing traffic density in cities fuels the desire for collision assessment systems on public transportation. For this application, video analysis is broadly accepted as a cornerstone. For trams, localizing the tramway tracks is an essential ingredient of such a system, because it allows a safety margin to be estimated for crossing traffic participants. Tramway-track detection is challenging because of the urban environment, with its clutter, sharp curves and occlusions of the track. In this paper, we present a novel and generic system to detect the tramway track ahead of the tram. The system incorporates an inverse perspective mapping and a-priori geometric knowledge of the rails to find possible track segments. The contribution of this paper is a new track reconstruction algorithm based on graph theory. To this end, we define track segments as vertices in a graph, in which edges represent feasible connections. This graph is then converted to a max-cost arborescence graph, and the best path is selected according to its location and additional temporal information, based on a maximum a-posteriori estimate. The proposed system clearly outperforms a railway-track detector. Furthermore, the system performance is validated on 3,600 manually annotated frames. The results are promising: straight tracks are found in more than 90% of the images, and complete curves are still detected in 35% of the cases.
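The graph formulation in this abstract can be illustrated with a small sketch. It is a simplified stand-in, not the paper's algorithm: instead of building a max-cost arborescence and applying a maximum a-posteriori selection, it picks the highest-scoring chain of segments in an acyclic graph by dynamic programming. The segment names, edge weights and the `best_path` helper are hypothetical.

```python
def best_path(edges, start):
    """Return (score, path) of the maximum-weight path starting at `start`.

    edges maps a segment name to a list of (next_segment, weight) pairs.
    The graph is assumed to be acyclic (segments only connect forward).
    """
    memo = {}

    def solve(vertex):
        if vertex not in memo:
            best = (0.0, [vertex])  # stopping here is always allowed
            for nxt, weight in edges.get(vertex, []):
                score, path = solve(nxt)
                if weight + score > best[0]:
                    best = (weight + score, [vertex] + path)
            memo[vertex] = best
        return memo[vertex]

    return solve(start)


# Hypothetical track segments; weights express how plausible a connection is.
edges = {
    "s0": [("s1", 0.9), ("s2", 0.4)],
    "s1": [("s3", 0.8)],
    "s2": [("s3", 0.95)],
    "s3": [],
}
score, path = best_path(edges, "s0")
print(path)
```

In this toy graph the chain through `s1` wins because its summed connection weights are higher, even though the last edge of the alternative chain has the largest single weight.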
We explore an automatic real-time change detection system to assist military personnel during transport and surveillance, by detecting changes in the environment with respect to a previous operation. Such changes may indicate the presence of Improvised Explosive Devices (IEDs), which can then be bypassed. While driving, images of the scene are acquired by the camera and stored with their GPS positions. At the same time, the best matching reference image (from a previous patrol) is retrieved and registered to the live image. Next, a change mask is generated by differencing the reference and live images, followed by an adaptive thresholding technique. Post-processing steps such as Markov Random Fields, local texture comparisons and change tracking further improve the time- and space-consistency of changes and suppress noise. The resulting changes are visualized as an overlay on the live video content. The system has been extensively tested on 28 videos, containing over 10,000 manually annotated objects. The system is capable of detecting small test objects of 10 cm³ at a range of 40 meters. Although the system shows acceptable performance in multiple cases, the performance degrades under certain circumstances, for which extensions are discussed.
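The differencing and thresholding step described in this abstract can be sketched as follows. The abstract does not specify the adaptive threshold, so the rule used here (mean plus `k` standard deviations of the difference image) is an assumption, as are the function name `change_mask` and the toy images. The images are assumed to be already registered.

```python
from statistics import mean, pstdev


def change_mask(reference, live, k=2.0):
    """Binary change mask for two registered grayscale images (lists of rows).

    The threshold adapts to the image pair: mean + k * std of the absolute
    difference. This rule is an illustrative assumption, not the paper's.
    """
    diff = [[abs(l - r) for l, r in zip(live_row, ref_row)]
            for live_row, ref_row in zip(live, reference)]
    flat = [d for row in diff for d in row]
    threshold = mean(flat) + k * pstdev(flat)
    return [[1 if d > threshold else 0 for d in row] for row in diff]


# Toy 4x4 images: small noise everywhere, one strongly changed pixel.
reference = [[10, 10, 10, 10],
             [10, 10, 10, 10],
             [10, 10, 10, 10],
             [10, 10, 10, 10]]
live = [[11, 10, 9, 10],
        [10, 90, 10, 10],
        [10, 10, 10, 11],
        [10, 10, 10, 10]]
mask = change_mask(reference, live)
```

Only the pixel at row 1, column 1 exceeds the threshold here. The post-processing steps mentioned in the abstract (Markov Random Fields, texture comparison, tracking) would operate on such a mask and are not shown.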
Many proposed video content analysis algorithms for surveillance applications are computationally very intensive, which limits their integration into a complete system running on a single processing unit (e.g. a PC). To build flexible, low-cost prototyping systems, a distributed system with scalable processing power is therefore required. This paper discusses requirements for surveillance systems, considering two example applications. From these requirements, specifications for a prototyping architecture are derived. An implementation of the proposed architecture is presented, enabling the mapping of multiple software modules onto a number of processing units (PCs). The architecture enables fast prototyping of new algorithms for complex surveillance applications without considering resource constraints.
Partners of the CANDELA project are realizing a system for real-time image processing for traffic and video-surveillance applications. This system performs segmentation, labels the extracted blobs and tracks them through the scene. We also address the problem of evaluating the results of such processes. We are developing a tool to generate and manage the results of the performance evaluation of video content analysis (VCA) systems. This evaluation is done by comparing the results of the global application and its components with a manually generated ground-truth file. Both manually and automatically generated description files are formatted in XML. This descriptive markup language is then processed to assemble the appropriate parts of the document and to process this metadata. For scientific purposes, this tool will provide an objective measure of improvement and a means to choose between competing methods. In addition, it is a powerful tool for algorithm designers to measure the progress of their work at the different levels of the processing chain. For industrial purposes, this tool will assess the accuracy of the VCA, with an obvious marketing impact. We present the definition of the evaluation tool, its metrics and specific implementations designed for our applications.
A principal challenge for reducing the cost of complex systems-on-chip is to pursue more generic systems for a broad range of products. For this purpose, we explore three new architectural concepts for state-of-the-art video applications. First, we discuss a reusable scalable hardware architecture employing a hierarchical communication network fitting with the natural hierarchy of the application. In a case study, we show that MPEG streaming in DTV occurs at high level, while subsystems communicate at lower levels. The second concept is a software design that scales over a number of processors to enable reuse over a range of VLSI process technologies. We explore this via an H.264 decoder implementation that scales nearly linearly over up to eight processors by applying data partitioning. The third concept is resource-scalability, which is required to satisfy real-time constraints in a system with a high amount of shared resources. An example complexity-scalable MPEG-2 encoder scales the required cycle budget with a factor of three, in parallel with a smooth degradation of quality.
A principal challenge for reducing the cost of designing complex systems-on-chip is to pursue more generic systems for a broad range of products. For this purpose, we explore three new architectural concepts for state-of-the-art video applications. First, we discuss a reusable scalable hardware architecture employing a hierarchical communication network fitting with the natural hierarchy of the application. In a case study, we show that MPEG streaming in DTV occurs at high level, while subsystems communicate at lower levels. The second concept is a software design that scales over a number of processors to enable reuse over a range of VLSI process technologies. We explore this via an H.264 decoder implementation scaling nearly linearly over up to eight processors by applying data partitioning. The third topic is resource-scalability, which is required to satisfy real-time constraints in a system with a high amount of shared resources. An example complexity-scalable MPEG-2 coder scales the required cycle budget with a factor of three, in parallel with a smooth degradation of quality.
Proc. SPIE. 5022, Image and Video Communications and Processing 2003
KEYWORDS: Consumer electronics, Image processing, Video, Data processing, Very large scale integration, Video processing, Data communications, Electronic imaging, Computer architecture, System on a chip
Due to the increasing significance of development costs in the competitive domain of high-volume consumer electronics, generic solutions are required to enable reuse of the design effort and to increase the potential market volume. As a result, Systems-on-Chip (SoCs) contain a growing number of fully programmable media processing devices, as opposed to application-specific systems, which offered the most attractive solutions due to a high performance density. The following motivates this trend. First, SoCs are increasingly dominated by their communication infrastructure and embedded memory, thereby making the cost of the functional units less significant. Moreover, the continuously growing design costs require generic solutions that can be applied over a broad product range. Hence, powerful programmable SoCs are becoming increasingly attractive. However, to enable power-efficient designs that are also scalable over the advancing VLSI technology, parallelism should be fully exploited. Both task-level and instruction-level parallelism can be provided by means of, e.g., a VLIW multiprocessor architecture.
To provide the above-mentioned scalability, we propose to partition the data over the processors, instead of the traditional functional partitioning. An advantage of this approach is the inherent locality of data, which is extremely important for communication-efficient software implementations. Consequently, a software implementation is discussed that enables, e.g., SD-resolution H.264 decoding with a two-processor architecture, whereas High-Definition (HD) decoding can be achieved with an eight-processor system executing the same software. Experimental results show that the data communication is reduced considerably, by up to 65%, directly improving the overall performance. Apart from a considerable improvement in memory bandwidth, this novel concept of partitioning offers a natural approach for optimally balancing the load of all processors, thereby further improving the overall speedup.
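As a rough illustration of data partitioning, the sketch below divides the macroblock rows of a picture into contiguous strips, one per processor, so that the same code runs unchanged on two or eight processors. The abstract does not state how the data is actually partitioned, so the strip layout, the `partition_rows` helper and the picture sizes (36 macroblock rows for SD, 68 for HD) are assumptions, and dependencies between neighbouring macroblocks are ignored.

```python
def partition_rows(num_rows, num_procs):
    """Split `num_rows` macroblock rows into `num_procs` contiguous strips.

    Strip sizes differ by at most one row, which keeps the load balanced.
    Returns a list of range objects, one per processor.
    """
    base, extra = divmod(num_rows, num_procs)
    strips = []
    start = 0
    for proc in range(num_procs):
        size = base + (1 if proc < extra else 0)  # first `extra` strips get one more row
        strips.append(range(start, start + size))
        start += size
    return strips


# Assumed picture heights in macroblock rows (16 lines per row).
sd_strips = partition_rows(36, 2)   # SD picture on two processors
hd_strips = partition_rows(68, 8)   # HD picture on eight processors
print([len(s) for s in sd_strips], [len(s) for s in hd_strips])
```

Each processor works mostly on its own strip, which is the data locality the abstract refers to; only rows at strip boundaries would need data from a neighbouring processor.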
In the field of consumer electronics, the advent of new features such as Internet, games, video conferencing, and mobile communication has triggered the convergence of television and computer technologies. This requires a generic media-processing platform that enables simultaneous execution of very diverse tasks, such as high-throughput stream-oriented data processing and highly data-dependent irregular processing with complex control flows. As a representative application, this paper presents the mapping of a Main Visual profile MPEG-4 decoder for High-Definition (HD) video onto a flexible architecture platform. A stepwise approach is taken, going from the decoder application toward an implementation proposal. First, the application is decomposed into separate tasks with self-contained functionality, clear interfaces, and distinct characteristics. Next, a hardware-software partitioning is derived by analyzing the characteristics of each task, such as the amount of inherent parallelism, the throughput requirements, the complexity of control processing, and the reuse potential over different applications and different systems. Finally, a feasible implementation is proposed that includes, amongst others, a very-long-instruction-word (VLIW) media processor, one or more RISC processors, and some dedicated processors. The mapping study of the MPEG-4 decoder proves the flexibility and extensibility of the media-processing platform. This platform enables an effective HW/SW co-design yielding a high performance density.
The architecture for block-based video applications (e.g. MPEG/JPEG coding, graphics rendering) is usually based on a processor engine, connected to an external background SDRAM memory where reference images and data are stored. In this paper, we reduce the required memory bandwidth for MPEG coding by up to 67% by identifying the optimal block configuration and applying embedded data compression up to a factor of four. It is shown that independent compression of fixed-sized data blocks with a fixed compression ratio can decrease the memory bandwidth for a limited set of compression factors only. To achieve this result, we exploit the statistical properties of the burst-oriented data exchange to memory. It has been found that embedded compression is particularly attractive for bandwidth reduction when a compression ratio of 2 or 4 is chosen. This moderate compression factor can be obtained with a low-cost compression scheme such as DPCM, with a small, acceptable loss of quality.
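One way to see why only some compression factors reduce bandwidth is a simplified burst model: if memory is accessed in whole bursts, a compressed block still costs a full burst for any partly filled remainder. The sketch below uses that model only; the block size of 128 bytes, the burst size of 32 bytes and the `bursts_needed` helper are assumptions for illustration and do not come from the paper, whose analysis of burst statistics is more detailed.

```python
import math


def bursts_needed(block_bytes, ratio, burst_bytes):
    """Number of whole memory bursts needed to transfer one compressed block."""
    compressed_bytes = block_bytes / ratio
    return math.ceil(compressed_bytes / burst_bytes)


# Assumed sizes: 128-byte data blocks, 32-byte memory bursts.
BLOCK_BYTES = 128
BURST_BYTES = 32
for ratio in (1, 1.5, 2, 3, 4):
    print(ratio, bursts_needed(BLOCK_BYTES, ratio, BURST_BYTES))
```

Under these assumed sizes, a ratio of 3 needs as many bursts as a ratio of 2, so the extra compression buys no bandwidth, while ratios of 2 and 4 map exactly onto whole bursts.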
A distributed multimedia computing system consists of sources, processing units and presentation devices, in which each component operates autonomously. This independence can be exploited to optimize the performance of individual components. Flexibility in performance improvement is enhanced by using independent clock domains. This paper presents a Video I/O model for such a multimedia system. This model provides an asynchronous communication interface for independent clock domains, with the ability to synchronize a video display to one of the video sources. The communication interface has been used for the design of I/O modules in a multimedia system, which are briefly outlined.