Military Operations in Urban Terrain (MOUT) require the capability to perceive and analyze the situation around a
patrol in order to recognize potential threats. Continuous monitoring of the surrounding area is essential for reacting
appropriately to a given situation, and one relevant task is the detection of objects that can pose a threat.
Robust detection of persons is especially important, as threats in MOUT scenarios usually arise from persons. This
task can be supported by image processing systems. However, depending on the scenario, person detection in MOUT can
be challenging: persons are often occluded in complex outdoor scenes, and detection also suffers from low
image resolution. Furthermore, person detection systems for MOUT must meet several requirements, such as the
detection of non-moving persons, who may be part of an ambush. Existing detectors therefore have to operate on
single images with low detection thresholds in order not to miss any person. This, in turn, leads to a comparatively
high number of false positive detections, which renders an automatic vision-based threat detection system ineffective. In
this paper, a hybrid detection approach is presented that combines a discriminative and a generative model. The
objective is to increase the accuracy of existing detectors by integrating a separate hypothesis confirmation and
rejection step built from these two models. This enables the overall detection system to exploit both the
discriminative power of the models and their capability to detect partly hidden objects. The approach is evaluated on
benchmark data sets generated from real-world image sequences captured during MOUT exercises. The extension yields a
significant reduction of the false positive rate.
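
The two-stage idea described above (a low-threshold base detector followed by a separate confirmation and rejection step) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the method evaluated in the paper: the names `Hypothesis` and `verify_hypotheses`, the box convention, the score range of [0, 1], and the weighted-sum fusion rule are all hypothetical, and the two scoring functions are stand-ins for whatever discriminative and generative models are actually used.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical box convention: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Hypothesis:
    box: Box
    detector_score: float  # score from the low-threshold base detector


def verify_hypotheses(
    hypotheses: List[Hypothesis],
    discriminative_score: Callable[[Box], float],
    generative_score: Callable[[Box], float],
    weight: float = 0.5,
    accept_threshold: float = 0.5,
) -> List[Hypothesis]:
    """Keep only hypotheses whose fused score reaches accept_threshold.

    Both scoring callables are assumed to return values in [0, 1].
    The weighted-sum fusion is an illustrative assumption, not the
    combination rule of the paper.
    """
    confirmed = []
    for hyp in hypotheses:
        fused = (weight * discriminative_score(hyp.box)
                 + (1.0 - weight) * generative_score(hyp.box))
        if fused >= accept_threshold:
            confirmed.append(hyp)  # confirmed hypothesis
        # otherwise the hypothesis is rejected as a likely false positive
    return confirmed


# Usage with stub scores standing in for real models (made-up values).
candidates = [
    Hypothesis((10, 20, 40, 80), 0.20),
    Hypothesis((200, 50, 30, 60), 0.15),
]
disc = {candidates[0].box: 0.9, candidates[1].box: 0.2}
gen = {candidates[0].box: 0.7, candidates[1].box: 0.1}
confirmed = verify_hypotheses(candidates, disc.__getitem__, gen.__getitem__)
```

With these stub values, only the first candidate passes the verification step; the second is rejected even though the base detector proposed it.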