In this paper we present an approach for tracking people across non-overlapping cameras. The approach is based on a multi-dimensional feature vector and its covariance, which define an appearance model for every blob detected in the camera network. The model integrates relative position, color, and texture descriptors of each detected object. Objects are associated across non-overlapping cameras by matching the appearance of detected objects against past observations. When tracking is available within each camera, the accuracy of this association can be further improved by matching the appearance models of several targets against detected regions. For this purpose we present an automatic clustering technique that builds a multi-valued appearance model from a collection of covariance matrices. The proposed approach requires neither geometric nor colorimetric calibration of the cameras. We illustrate the method by tracking people in relatively crowded scenes captured by a collection of indoor cameras at a mass-transportation site, and we discuss both the successes of the proposed approach and the challenges that remain.
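To make the appearance model concrete, the following sketch builds a covariance descriptor from per-pixel features (relative position, color, and gradient-based texture) and compares two descriptors with a generalized-eigenvalue distance. The specific feature choices and the distance are common formulations for covariance-based appearance matching and are our assumptions here, not necessarily the exact design used in the paper:

```python
import numpy as np

def covariance_descriptor(patch):
    """Covariance appearance model for one detected blob.

    `patch` is an (H, W, 3) RGB array. The per-pixel feature vector
    stacks relative position, color, and gradient magnitudes as a
    simple texture cue (illustrative choice of features).
    """
    h, w, _ = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    gray = patch.mean(axis=2)
    gx = np.gradient(gray, axis=1)        # horizontal gradient (texture)
    gy = np.gradient(gray, axis=0)        # vertical gradient (texture)
    feats = np.stack([
        xs / max(w - 1, 1),               # relative x position
        ys / max(h - 1, 1),               # relative y position
        patch[..., 0], patch[..., 1], patch[..., 2],  # color channels
        np.abs(gx), np.abs(gy),           # texture descriptors
    ], axis=-1).reshape(-1, 7)
    return np.cov(feats, rowvar=False)    # 7x7 covariance matrix

def covariance_distance(c1, c2):
    """Distance between two covariance descriptors: square root of the
    sum of squared logs of the generalized eigenvalues of (c1, c2)."""
    lam = np.linalg.eigvals(np.linalg.solve(c2, c1)).real
    lam = np.clip(lam, 1e-12, None)       # guard against numerical noise
    return np.sqrt(np.sum(np.log(lam) ** 2))
```

Association then reduces to computing this distance between the descriptor of a newly detected blob and the stored descriptors of past observations, and accepting the closest match below a threshold.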
Security systems increasingly rely on Automated Video Surveillance (AVS) technology. In particular, digital video lends itself to internet and local communications, remote monitoring, and computer processing. AVS systems can take over many of the tedious and repetitive tasks currently performed by trained security personnel, and AVS technology has already made significant progress toward automating basic security functions such as motion detection, object tracking, and event-based video recording. However, these automated functions still suffer from problems that need to be addressed further, notably a high false-alarm rate and loss of track under total or partial occlusion when operating across a wide range of conditions (day, night, sunshine, cloud, fog, range, viewing angle, clutter, etc.). Current surveillance systems work well only within a narrow range of operational parameters and therefore need to be hardened against a much wider range of conditions.

In this paper, we present a multi-spectral fusion approach that performs accurate pedestrian segmentation under varying operational conditions. Our fusion method combines the "best" detection results from the visible images with the "best" from the thermal images. Motion-detection results in the visible images are easily corrupted by noise and shadows, whereas objects in the thermal image are relatively stable but may be missing parts of the objects that thermally blend with the background. Our method retains the "best" object components from each modality and de-emphasizes the rest.
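One simple way such a mask-level fusion could be realized is sketched below: thermal detections serve as stable cores, and visible-mask pixels are kept only where they touch the (dilated) thermal support, so isolated visible-only blobs such as noise and shadows are dropped while limbs that blend thermally with the background are recovered from the visible channel. The fusion rule and the `dilate` helper are our assumptions for illustration, not the paper's exact algorithm:

```python
import numpy as np

def dilate(mask, r=2):
    """Binary dilation by shifting the mask (a minimal stand-in for a
    proper morphological dilation; assumes blobs are away from borders)."""
    out = mask.copy()
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out |= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    return out

def fuse_masks(visible_fg, thermal_fg, r=2):
    """Fuse visible and thermal foreground masks.

    Thermal detections are kept as stable cores; visible foreground
    pixels are accepted only where they overlap the dilated thermal
    support, which discards visible-only noise and shadow regions.
    """
    support = dilate(thermal_fg, r)
    return thermal_fg | (visible_fg & support)
```

In practice the radius `r` controls how far the visible mask may extend a thermal detection before being treated as unrelated clutter.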