The unseen object detection problem can be cast as a semantic matching problem: a semantic matcher takes two images as input, the request image and the test image. The request image represents the object class to be found in the test image. In this paper, we propose a new region proposal based semantic matcher that follows the ideas of R-CNN. Our Body CNN generates proposals, as in classical Faster R-CNN, while the Head CNN compares each proposal with a request descriptor extracted from the request image by the Request Descriptor CNN. All three CNNs (Head, Body and Request Descriptor) are trained together, end to end, for seen-class object detection by request and are then applied to both seen and unseen classes. We trained and tested our network on the Pascal VOC dataset.
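To illustrate the comparison step, here is a minimal sketch of matching region-proposal descriptors against a request descriptor. Cosine similarity and the `match_proposals` threshold are assumptions made for illustration; the abstract does not specify how the Head CNN actually scores proposals:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_proposals(proposal_descriptors, request_descriptor, threshold=0.5):
    """Return indices of proposals whose descriptors are similar enough
    to the request descriptor (illustrative threshold, not from the paper).

    proposal_descriptors: list of feature vectors, one per region proposal.
    request_descriptor: feature vector extracted from the request image.
    """
    return [i for i, d in enumerate(proposal_descriptors)
            if cosine_similarity(d, request_descriptor) >= threshold]
```

In the paper's setting, the descriptors would come from the Body and Request Descriptor CNNs; here they are plain lists so the matching logic stands on its own.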
The paper addresses the problem of enhancing low-quality 3D terrain models. We propose an approach based on convolutional neural networks (CNNs), namely the Pix2Pix method, which uses generative adversarial networks for image-to-image translation. We represent 3D terrain models as heightmaps so that classical CNNs can be applied. The network was trained on a synthetic dataset of 150,000 images and heightmaps of different landscapes. Our model achieved a relative mean absolute difference of 0.459% on the synthetic test dataset. In addition, we demonstrate landscape generation on real data from Google Maps using our model.
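A relative mean absolute difference could be computed as sketched below; the exact normalization used in the paper is not stated in the abstract, so normalizing by the target heightmap's value range is an assumption:

```python
def relative_mad(predicted, target):
    """Relative mean absolute difference between two heightmaps given as
    flat lists of height values, as a percentage of the target's value
    range (the normalization choice is an assumption for illustration)."""
    n = len(target)
    mad = sum(abs(p - t) for p, t in zip(predicted, target)) / n
    value_range = max(target) - min(target)
    return 100.0 * mad / value_range
```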
In this paper we propose a new algorithm for image filtering using a morphological thickness map. Compared to other smoothing methods, such as anisotropic diffusion, comparative filters, and guided and rolling guidance filters, the benefit of our method is that it natively works with an image structure, the thickness map, so it does not depend on varying levels of image noise or on lighting conditions and effects. We present the idea behind the method, the algorithm itself, and various experimental results. The output of our filtering algorithm can be widely applied in image processing tasks such as image segmentation, motion analysis, invariant feature transformation, and data compression.
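The abstract does not define how the thickness map drives the filter, so the following is only a hedged sketch of the general idea of structure-guided smoothing: a hypothetical per-pixel box average whose radius grows with the local thickness value. It is not the authors' algorithm:

```python
def thickness_guided_filter(image, thickness, max_radius=2):
    """Illustrative sketch only: box-average each pixel with a radius
    proportional to the local thickness-map value. Both arguments are
    2D lists of equal shape; max_radius is a hypothetical parameter."""
    h, w = len(image), len(image[0])
    t_max = max(max(row) for row in thickness) or 1
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # thicker structures tolerate stronger smoothing in this sketch
            r = round(max_radius * thickness[y][x] / t_max)
            vals = [image[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out
```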
The paper proposes a semantic segmentation algorithm based on Convolutional Neural Networks (CNNs) for the problem of presenting multispectral sensor-derived images in Enhanced Vision Systems (EVS). A CNN architecture based on residual SqueezeNet with deconvolutional layers is presented. To create an in-domain training dataset for the CNN, a semi-automatic scenario using photogrammetric techniques is described. Experimental results are shown for problem-oriented images obtained by the TV and IR sensors of an EVS prototype in a set of flight experiments.
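The deconvolutional layers mentioned above upsample coarse encoder feature maps back to image resolution. A minimal pure-Python sketch of a single-channel transposed convolution (the core operation, not the paper's actual network) shows the scatter-and-accumulate mechanics:

```python
def transposed_conv2d(feature, kernel, stride=2):
    """Minimal 2D transposed convolution (deconvolution) for one channel:
    each input value scatters a scaled copy of the kernel into the
    upsampled output, overlaps being summed."""
    fh, fw = len(feature), len(feature[0])
    kh, kw = len(kernel), len(kernel[0])
    oh = (fh - 1) * stride + kh   # output is larger than the input
    ow = (fw - 1) * stride + kw
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(fh):
        for x in range(fw):
            for j in range(kh):
                for i in range(kw):
                    out[y * stride + j][x * stride + i] += feature[y][x] * kernel[j][i]
    return out
```

With a learned kernel larger than 1x1, the summed overlaps interpolate between the scattered values, which is what lets such layers produce dense segmentation maps.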
In this paper, we propose a background stabilization method for arbitrary camera movement. We investigate state-of-the-art algorithms for feature point detection and introduce a composite LBP descriptor to describe the feature points, together with an algorithm for matching feature points across a sequence of images. In addition, we propose an algorithm for constructing an affine transformation from an old frame in the sequence to a new one for the tasks of stabilization and image stitching.
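The composite descriptor itself is not specified in the abstract; as background, the classic single-pixel LBP building block it presumably composes can be sketched as follows:

```python
def lbp_code(patch):
    """Basic 8-neighbour local binary pattern code for the centre of a
    3x3 patch: each neighbour >= centre contributes one bit, neighbours
    taken clockwise from the top-left corner."""
    center = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, v in enumerate(neighbours):
        if v >= center:
            code |= 1 << bit
    return code
```

Because each bit records only a sign comparison against the centre pixel, the code is invariant to monotonic brightness changes, which is why LBP-style descriptors suit matching across frames with varying exposure.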