This paper describes the methodology we adopted to develop a set of algorithms for the automatic recognition and localisation of sites observed through an infrared (IR) camera carried by an airborne platform. The sites considered are solid buildings such as houses and power stations; they must be distinctive enough to allow reliable recognition, although they may include planar subparts such as roads and green fields. To achieve this recognition, 3D site models are recomputed from CAD models to which selected attributes are added. The chosen models are sets of polyhedral facets, which may equally be processed as derived sets of vertices or edges. Polyhedral models are particularly well suited to the general properties of infrared images. Geometrical information is exploited from the very beginning of the segmentation process: image processing procedures extract the visual features that best fit the selected model constituents. First, a 2D image graph is backprojected into a 3D graph by means of the model (prediction); projection back onto the 2D image space then verifies the generated 3D hypotheses, until matching and localisation are completed. The input is assumed to consist of sporadic monocular images from an infrared camera. Nevertheless, radar images, when available, are supplied concurrently. With a simple data fusion process, radar information greatly improves the detection of emerging sites and the focusing of attention on limited areas of the infrared image, from which the effective recognition is performed. A first implementation of the system, relying on edge-based models, is currently nearing completion. Extended use of models allowing feature cooperation is planned, and other features such as interest points and regions are already taken into account.
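The prediction-verification cycle described above can be illustrated with a minimal sketch: a polyhedral model is held as a set of 3D edges, candidate poses play the role of the 3D hypotheses generated by prediction, and verification projects the model edges back onto the 2D image plane to measure how well they explain the observed segments. All names here (`project`, `verify_pose`, `recognise`) and the restriction to translational pose hypotheses are illustrative assumptions, not the system's actual implementation.

```python
import math

def project(point3d, focal=1.0):
    """Pinhole projection of a 3D point (camera frame) onto the 2D image plane."""
    x, y, z = point3d
    return (focal * x / z, focal * y / z)

def translate(point3d, t):
    """Apply a candidate translation t (a simplified pose hypothesis)."""
    return tuple(p + d for p, d in zip(point3d, t))

def verify_pose(model_edges, observed_segments, t, tol=0.05):
    """Verification step: project every model edge under the hypothesised pose t
    and return the fraction whose endpoints land within tol of an observed
    2D segment."""
    matched = 0
    for a, b in model_edges:
        pa, pb = project(translate(a, t)), project(translate(b, t))
        for qa, qb in observed_segments:
            if math.dist(pa, qa) < tol and math.dist(pb, qb) < tol:
                matched += 1
                break
    return matched / len(model_edges)

def recognise(model_edges, observed_segments, candidate_poses, threshold=0.5):
    """Prediction-verification loop: enumerate pose hypotheses, keep the one
    whose projected model best explains the observed 2D segments, and accept
    it only if enough model edges are matched."""
    best = max(candidate_poses,
               key=lambda t: verify_pose(model_edges, observed_segments, t))
    score = verify_pose(model_edges, observed_segments, best)
    return (best, score) if score >= threshold else (None, score)
```

For example, with a square facet at depth 5 as the model and observed segments synthesised from the true pose, `recognise` selects the correct hypothesis and rejects the others; a real system would replace the exhaustive pose enumeration with hypotheses derived from backprojecting the 2D image graph through the model.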