In the context of aerial imagery, one of the first steps toward coherent processing of the information contained in multiple images is geo-registration, which consists of assigning geographic 3D coordinates to the pixels of the image. This enables accurate alignment and geo-positioning of multiple images, detection of moving objects, and fusion of data acquired from multiple sensors. Existing approaches to this problem require, in addition to a precise characterization of the camera sensor, high-resolution referenced images or terrain elevation models, which are usually not publicly available or are out of date. Building upon the idea of developing technology that does not need a reference terrain elevation model, we propose a geo-registration technique that applies variational methods to obtain a dense and coherent surface elevation model that replaces the reference model. The surface elevation model is built by interpolation of scattered 3D points, which are obtained in a two-step process following a classical stereo pipeline: first, coherent disparity maps between image pairs of a video sequence are estimated, and then image point correspondences are back-projected. The proposed variational method enforces continuity of the disparity map not only along epipolar lines (as done by previous geo-registration techniques) but also across them, over the full 2D image domain. In the experiments, aerial images from synthetic video sequences have been used to validate the proposed technique.
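The interpolation step of the pipeline above can be sketched as follows. This is only an illustration of building a dense elevation model from scattered back-projected 3D points; the synthetic terrain, point counts, and use of `scipy.interpolate.griddata` are assumptions for the example, not the variational method of the paper.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical scattered 3D points (x, y, z), standing in for the points
# obtained by back-projecting stereo correspondences (values illustrative).
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 100.0, size=(500, 2))       # ground-plane positions
z = 10.0 + 0.05 * xy[:, 0] + 0.02 * xy[:, 1]      # smooth synthetic terrain

# Dense grid on which the surface elevation model is sampled.
gx, gy = np.meshgrid(np.linspace(5, 95, 64), np.linspace(5, 95, 64))
dem = griddata(xy, z, (gx, gy), method='linear')

print(dem.shape)  # (64, 64)
```

Linear interpolation over the Delaunay triangulation of the scattered points yields the dense model; points outside the convex hull of the samples are left as NaN and would need extrapolation or in-filling in practice.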
Sophisticated strategies have recently been proposed for the detection of moving objects in non-stabilized camera setups. These strategies model both background and foreground using spatio-temporal non-parametric estimation. However, as no appropriate methods for dynamic kernel bandwidth estimation are available, high-quality results cannot be obtained in all situations. Here, an automatic and efficient kernel bandwidth estimation strategy for spatio-temporal modeling is proposed. Background kernel bandwidths are estimated via a novel statistical analysis of spatially weighted data distributions, whereas foreground kernel bandwidths are estimated using a mean-shift-based analysis of previously detected foreground regions.
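To make the role of the kernel bandwidth concrete, the following sketch evaluates a per-pixel non-parametric (Gaussian-kernel) background likelihood from a temporal sample history, with a bandwidth derived from the median absolute difference of consecutive samples. The bandwidth rule and all values are illustrative assumptions, not the spatially weighted analysis proposed in the paper.

```python
import numpy as np

def kde_background_prob(samples, x, h):
    """Non-parametric background likelihood of intensity x, given the
    temporal samples of a pixel and a Gaussian kernel bandwidth h."""
    d = (x - samples) / h
    return float(np.mean(np.exp(-0.5 * d * d) / (h * np.sqrt(2.0 * np.pi))))

def mad_bandwidth(samples):
    """Illustrative bandwidth from the median absolute difference of
    consecutive samples (a common rule in KDE background subtraction;
    not the paper's estimator)."""
    diffs = np.abs(np.diff(samples))
    return max(float(np.median(diffs)) / (0.68 * np.sqrt(2.0)), 1e-3)

history = np.array([100.0, 101.0, 99.0, 100.0, 102.0, 98.0, 100.0])
h = mad_bandwidth(history)
p_bg = kde_background_prob(history, 100.0, h)   # near the history: high
p_fg = kde_background_prob(history, 180.0, h)   # far from it: near zero
print(p_bg > p_fg)  # True
```

A bandwidth that is too small makes the model reject legitimate background variation (noise flagged as foreground), while one that is too large absorbs true foreground into the background, which is why the paper's adaptive estimation matters.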
To index, search, browse and retrieve relevant material, indexes describing the video content are required. Here, a new and fast strategy that detects both abrupt and gradual transitions is proposed. A pixel-based analysis is applied to detect abrupt transitions and, in parallel, an edge-based analysis is used to detect gradual transitions. Both analyses are reinforced with a motion analysis in a second step, which significantly simplifies the threshold selection problem while preserving the computational requirements. The main advantage of the proposed system is its ability to work in real time, and the experimental results show high recall and precision values.
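The pixel-based branch for abrupt transitions can be sketched as plain inter-frame differencing with a threshold. This is a deliberately minimal illustration; the threshold value and the motion-based refinement step of the paper are omitted, and the frames are synthetic.

```python
import numpy as np

def abrupt_cuts(frames, thresh=30.0):
    """Flag frame i as an abrupt transition when the mean absolute
    pixel difference to frame i-1 exceeds `thresh` (illustrative value)."""
    cuts = []
    for i in range(1, len(frames)):
        d = np.mean(np.abs(frames[i].astype(float) - frames[i - 1].astype(float)))
        if d > thresh:
            cuts.append(i)
    return cuts

# Synthetic sequence: two constant-gray "shots" joined by an abrupt cut.
shot_a = [np.full((32, 32), 40, dtype=np.uint8)] * 3
shot_b = [np.full((32, 32), 200, dtype=np.uint8)] * 3
print(abrupt_cuts(shot_a + shot_b))  # [3]
```

Gradual transitions (fades, dissolves) defeat this detector because per-frame differences stay small, which is why the edge-based branch runs in parallel in the proposed system.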
Here, a new and efficient strategy is introduced for the detection and segmentation of moving objects in video sequences. Other strategies use a mixture of Gaussians to detect static and dynamic areas within the images, so that moving objects are segmented. For this purpose, all these strategies use a fixed number of Gaussians per pixel. Typically, more than two or three Gaussians are needed to obtain good results when images contain noise and movement not related to objects of interest. Nevertheless, the use of more than one Gaussian per pixel involves a high computational cost and, in many cases, adds no advantage over single-Gaussian segmentation. This paper proposes a novel automatic moving object segmentation strategy that uses an adaptive, variable number of Gaussians to reduce the overall computational cost: an automatic strategy is applied to each pixel to determine the minimum number of Gaussians required for its classification. Taking into account the temporal context that identifies each reference image pixel as background (static) or moving (dynamic), either the full set of Gaussians or just one Gaussian is used. Pixels classified with the full set are called MGP (Multiple Gaussian Pixels), while those classified with just one Gaussian are called SGP (Single Gaussian Pixels); the computational reduction achieved depends on the size of this last set. Pixels with a dynamic reference are always MGP: they are Dynamic-MGP (DMGP) when they belong to dynamic areas of the image, but if the classification shows that the pixel matches one of the Gaussians in the set, the pixel is labeled static and called Static-MGP (SMGP). Usually, these last ones are noise pixels, although they may belong to areas with movement not related to objects of interest. Finally, pixels with a static reference that still match their associated Gaussian are SGP and belong to the static background of the image; if they do not match the associated Gaussian, they are changed to either SMGP or DMGP. In addition, any pixel can maintain its status, and an SMGP can change to DMGP or SGP. A state diagram shows the transition scheme and its characterization, allowing the reduction in the computational cost of the segmentation process to be forecast. Tests have shown that the proposed strategy implies only a limited loss of accuracy in the segmentations obtained, when compared with strategies that use a fixed number of Gaussians per pixel, while achieving very large reductions in the overall computational cost of the segmentation process.
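The SGP/MGP decision can be sketched as follows: a pixel with a static reference is first tested against its single associated Gaussian, and only on a mismatch is the full mixture evaluated. The match rule, thresholds, and mixture values below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def matches(x, mean, var, k=2.5):
    """Illustrative match test: within k standard deviations of the Gaussian."""
    return abs(x - mean) <= k * np.sqrt(var)

def classify_pixel(x, mixture, static_ref):
    """mixture: list of (mean, var) background Gaussians; mixture[0] is the
    Gaussian associated with a static-reference (SGP) pixel."""
    if static_ref and matches(x, *mixture[0]):
        return "SGP"                      # single-Gaussian test sufficed
    for mean, var in mixture:             # full-set (MGP) evaluation
        if matches(x, mean, var):
            return "SMGP"                 # matches background -> static MGP
    return "DMGP"                         # no match -> dynamic MGP

mix = [(100.0, 16.0), (150.0, 25.0)]
print(classify_pixel(102.0, mix, static_ref=True))   # SGP
print(classify_pixel(148.0, mix, static_ref=True))   # SMGP
print(classify_pixel(30.0, mix, static_ref=True))    # DMGP
```

The saving comes from the first branch: in mostly static scenes the majority of pixels exit after a single Gaussian test instead of evaluating the whole mixture.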