Compressed video is the digital raw material provided by video-surveillance systems and used for archiving and
indexing purposes. Multimedia standards therefore have a direct impact on such systems. While MPEG-2 used to be the
coding standard, MPEG-4 (part 2) has now replaced it in most installations, and MPEG-4 AVC/H.264 solutions are now
being released. Fine-grained analysis of the rich and complex MPEG-4 streams is a challenging problem, which this paper addresses.
The system we designed is based on five modules: low-resolution decoder, motion estimation generator, object motion
filtering, low-resolution object segmentation, and cooperative decision. Our contributions are the statistical
analysis of the spatial distribution of the motion vectors, the computation of DCT-based confidence maps, the automatic
detection of motion activity in the compressed file, and a coarse indexing by dedicated descriptors. The robustness and
accuracy of the system are evaluated on a large corpus (hundreds of hours of indoor and outdoor videos with pedestrians and
vehicles). Performance is benchmarked objectively against five metrics, which make it possible to estimate
the share of the error due to each module, for different implementations. This evaluation establishes that our system analyses
up to 200 frames (720x288) per second on a 2.66 GHz CPU.
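
To put the reported throughput in perspective, the short sketch below converts it into a pixel rate and into a number of streams that could be analysed in real time on one CPU. The 25 frames-per-second stream rate is an assumption made here for illustration (the text does not state the frame rate of the surveillance streams), and the function names are hypothetical.

```python
def pixel_throughput(frames_per_second: int, width: int, height: int) -> int:
    """Pixels analysed per second at the given frame rate and frame size."""
    return frames_per_second * width * height


def realtime_streams(frames_per_second: float, stream_fps: float = 25.0) -> int:
    """Number of streams analysable in real time on one CPU.

    stream_fps defaults to 25, an assumed camera frame rate that is
    not given in the text.
    """
    return int(frames_per_second // stream_fps)


# Figures reported in the text: up to 200 frames of 720x288 per second.
print(pixel_throughput(200, 720, 288))  # 41472000
print(realtime_streams(200))            # 8
```

Under that assumption, the peak rate corresponds to roughly eight simultaneous streams per CPU; a different camera frame rate changes this figure proportionally.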