Image registration is the problem of aligning two or more images of the same scene or object. The case when the images are taken with different sensors - multimodal image registration - has applications in medical imaging and remote sensing. Unfortunately, many existing image registration methods rely on crude assumptions (e.g., that the image intensities are linearly correlated), which makes them inapplicable to accurate multimodal registration. One approach to this task is to use deep learning to capture the complex intensity dependencies between images of different modalities. However, while deep learning methods produce good results, most of them are trained end-to-end and do not utilize the accumulated body of knowledge about image registration using “classic” information-theoretic and statistical methods. In this paper we consider a specific case of multimodal image registration: registration of optical and synthetic aperture radar (SAR) images. We use a classic feature-based registration pipeline: first, corresponding feature points are found; then RANSAC is used as the transform estimator. Within this pipeline we compare the effectiveness of various feature point detection and correspondence methods, both neural network-based and traditional. We find that a Siamese network outperforms, though only slightly, the classic cross-entropy-based method for finding correspondences. Finally, we propose a hybrid method and show that it outperforms both the “classic” method and an end-to-end network by a significant margin.
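The RANSAC transform-estimation step of such a feature-based pipeline can be sketched as follows. This is a minimal NumPy illustration with an affine model; the function name and all parameters are ours, not the paper's, and a real pipeline would estimate a transform appropriate to the sensor geometry.

```python
import numpy as np

def estimate_affine_ransac(src, dst, n_iters=200, thresh=1.0, seed=0):
    """Estimate a 2D affine transform from matched feature points with RANSAC.

    src, dst: (N, 2) arrays of corresponding point coordinates.
    Returns (M, inlier_mask) where dst ~= src @ M[:, :2].T + M[:, 2].
    """
    rng = np.random.default_rng(seed)
    n = len(src)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(n, size=3, replace=False)  # minimal sample for affine
        X = np.hstack([src[idx], np.ones((3, 1))])
        try:
            # Solve [src 1] @ P = dst for the 3x2 parameter matrix P
            P = np.linalg.solve(X, dst[idx])
        except np.linalg.LinAlgError:
            continue  # degenerate (collinear) sample
        pred = np.hstack([src, np.ones((n, 1))]) @ P
        inliers = np.linalg.norm(pred - dst, axis=1) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers with least squares for the final estimate
    X = np.hstack([src[best_inliers], np.ones((best_inliers.sum(), 1))])
    P, *_ = np.linalg.lstsq(X, dst[best_inliers], rcond=None)
    return P.T, best_inliers
```

The minimal sample of three correspondences fixes an affine transform exactly; the final least-squares refit over all inliers reduces the noise of the minimal-sample estimate.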
In this paper we compare the accuracy of several stereo matching algorithms using problem-oriented metrics we developed earlier for obstacle detection. For the comparison we chose the most computationally efficient open-source algorithms, suitable for use in autonomous systems with limited processing power. The quality of the algorithms was compared on the public KITTI Stereo Evaluation 2015 dataset. The hypothesis that the problem-oriented metric of stereo matching quality would lead to a different ranking than the universal metric was not confirmed. At the same time, our measurements of the algorithms' execution time differed significantly from those stated on the KITTI portal.
Proc. SPIE. 11041, Eleventh International Conference on Machine Vision (ICMV 2018)
KEYWORDS: Image processing algorithms and systems, Light sources, Cameras, Sensors, Image segmentation, Dielectrics, Image acquisition, 3D modeling, Algorithm development, Color image segmentation, RGB color model
In this work we discuss the known algorithms for linear colour segmentation based on a physical approach and propose a new modification of the segmentation algorithm. The algorithm is based on a region adjacency graph framework without a pre-segmentation stage. The proposed edge weight functions are derived from a linear image model with normal noise. A projective colour space transform is introduced as a novel pre-processing technique for better handling of shadow and highlight areas. The resulting algorithm is tested on a benchmark dataset consisting of images of 19 natural scenes selected from Barnard’s DXC-930 SFU dataset and 12 natural scene images newly published for common use. The dataset is provided with pixel-by-pixel ground truth colour segmentation for every image. Using this dataset, we show that the proposed modifications lead to qualitative advantages over other model-based segmentation algorithms, and we also show the positive effect of each proposed modification. The source code and datasets for this work are freely available at http://github.com/visillect/segmentation.
This paper addresses the problem of fusing optical (visible and thermal domain) and radar images for the purpose of visualization. These types of images typically contain a lot of complementary information, and their joint visualization can be more useful and convenient for a human user than a set of individual images. To solve the image fusion problem, we propose a novel algorithm that exploits some peculiarities of human color perception and is based on grey-scale structural visualization. The benefits of the presented algorithm are exemplified by satellite imagery.
This paper presents a method for automatic compensation of radial distortion in video from an unknown camera. The proposed algorithm estimates the distortion parameters by analyzing a sequence of video frames. It does not require any calibration objects, but is based on the assumption that the original scene contained straight lines. The method seeks the radial distortion correction that makes lines look generally straighter. To estimate the overall curvature of the lines, we propose to use the fast Hough transform, without actually detecting the lines in the image. The proposed algorithm has been tested on real data.
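The core idea, choosing the distortion parameter that makes line points most collinear after correction, can be illustrated with a one-parameter division model. The model choice, the grid search, and all names here are our assumptions for illustration; the paper's method measures straightness via the fast Hough transform rather than from explicitly given line points.

```python
import numpy as np

def undistort(points, k):
    """One-parameter division model: p_undist = p / (1 + k * |p|^2)."""
    r2 = np.sum(points**2, axis=1, keepdims=True)
    return points / (1 + k * r2)

def straightness_cost(k, lines):
    """Sum of squared line-fit residuals over all lines after undistortion."""
    cost = 0.0
    for pts in lines:
        u = undistort(pts, k)
        u = u - u.mean(axis=0)
        # Smallest singular value of centered points measures
        # deviation from the best-fit straight line.
        s = np.linalg.svd(u, compute_uv=False)
        cost += s[-1] ** 2
    return cost

def estimate_k(lines, k_grid=np.linspace(-0.5, 0.5, 1001)):
    """Brute-force search for the parameter that straightens the lines."""
    costs = [straightness_cost(k, lines) for k in k_grid]
    return k_grid[int(np.argmin(costs))]
```

At the true distortion parameter the corrected points become exactly collinear, so the cost attains its minimum there; in practice a continuous optimizer would replace the grid search.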
In this paper we study the multiple reflection effect in a fold of material with regard to the color constancy problem. Namely, we consider light source chromaticity estimation using the perceived material color. We measured the relative spectra of reflected light source emission at different positions under folds. The experiment was performed on 105 fabric samples. Using this data, we discuss the applicability of different spectral models for describing the observed chromaticity deviation in different areas of the fold. The obtained experimental data has been released in open access.
We study a technique for improving the visualization quality of noisy multispectral images. The contrast form visualization approach is considered, which guarantees a non-zero contrast in the output image whenever there is a difference between the spectra of the object and the background in the input image. The improvement is based on weighting the channels according to an estimate of their noise level, using the noise variance estimation method proposed earlier by the authors. We show that this approach reduces noise in color visualization of real multispectral images. The low-noise visualizations are demonstrated to be more comprehensible to a human on examples from a publicly available dataset of Earth surface images.
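The channel-weighting step can be sketched as follows. This shows only the inverse-variance weighting idea with names of our own; it is not the full contrast form visualization method, which involves a spectral projection beyond this sketch.

```python
import numpy as np

def weighted_combine(channels, noise_vars):
    """Combine multispectral channels into one visualization channel,
    down-weighting noisy channels by inverse estimated noise variance.

    channels: (C, H, W) array of spectral channels.
    noise_vars: length-C sequence of estimated noise variances.
    """
    w = 1.0 / np.asarray(noise_vars, dtype=float)
    w = w / w.sum()  # normalize weights to sum to one
    return np.tensordot(w, channels, axes=1)
```

A channel with large estimated noise variance then contributes little to the combined image, which is the mechanism behind the noise reduction described above.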
In this paper we propose a novel method for localization based on matching two stereo pairs. It is based on minimizing the sum of squared distances between each 3D point and the four corresponding 3D rays. The method shows good results for practical localization purposes. Moreover, it is robust to the presence of feature point correspondences with zero disparity, which is usually a problem for classical methods. The algorithm is tested against the classical method and has linear complexity with respect to the number of feature point correspondences.
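The point-to-ray least-squares idea at the heart of the method can be illustrated for a single 3D point: the sum of squared point-to-ray distances is quadratic in the point, so the minimizer solves a small linear system. This is only a sketch of that building block, not the paper's full localization algorithm.

```python
import numpy as np

def fit_point_to_rays(origins, directions):
    """Return the 3D point minimizing the sum of squared distances to rays.

    Each ray is given by an origin o and a direction d; the squared distance
    from point x to the ray is |(I - d d^T)(x - o)|^2 for unit d.
    Setting the gradient to zero gives the normal equations solved below.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector orthogonal to the ray
        A += P
        b += P @ np.asarray(o, dtype=float)
    return np.linalg.solve(A, b)
```

The system is solvable whenever the rays are not all parallel, and the cost of accumulating the normal equations is linear in the number of rays, consistent with the linear complexity noted above.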
We describe a fast method for road detection in images from a vehicle cabin camera. Straight sections of roadway are detected using the Fast Hough Transform and dynamic programming. We assume that the location of the horizon line in the image and the road pattern are known. The developed method is fast enough to detect the roadway in each frame of the video stream in real time and may be further accelerated by the use of tracking.
Demosaicing is the process of reconstructing a full-color image from the Bayer mosaic used in digital cameras for image formation. This problem is usually treated as an interpolation problem. In this paper, we propose to consider demosaicing as the problem of solving an underdetermined system of algebraic equations using regularization methods. We consider regularization with the standard l1/2-, l1- and l2-norms and their effect on image reconstruction quality. The experimental results show that the proposed technique can both be used in existing methods and serve as the basis for new ones.
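Restoring one channel of a Bayer mosaic can be sketched as a regularized underdetermined system; for the l2 (Tikhonov) case the solution is a single linear solve. This is a toy dense implementation of our own, assuming an RGGB layout and a finite-difference gradient regularizer; the l1/2- and l1-norm cases would require iterative solvers instead.

```python
import numpy as np

def bayer_mask(h, w, channel):
    """Boolean mask of sample locations for one channel of an RGGB mosaic."""
    m = np.zeros((h, w), dtype=bool)
    if channel == "r":
        m[0::2, 0::2] = True
    elif channel == "b":
        m[1::2, 1::2] = True
    else:  # green occupies the two remaining sites
        m[0::2, 1::2] = True
        m[1::2, 0::2] = True
    return m

def restore_channel(mosaic, mask, lam=0.1):
    """Solve min_x |S x - y|^2 + lam * |D x|^2 for one channel,
    where S samples the observed pixels and D is the image gradient."""
    h, w = mosaic.shape
    n = h * w
    # Sampling matrix S: one row per observed pixel
    idx = np.flatnonzero(mask.ravel())
    S = np.zeros((len(idx), n))
    S[np.arange(len(idx)), idx] = 1.0
    # Finite-difference operator D (horizontal and vertical gradients)
    rows = []
    for i in range(h):
        for j in range(w):
            if j + 1 < w:
                r = np.zeros(n); r[i*w + j] = -1; r[i*w + j + 1] = 1
                rows.append(r)
            if i + 1 < h:
                r = np.zeros(n); r[i*w + j] = -1; r[(i+1)*w + j] = 1
                rows.append(r)
    D = np.array(rows)
    y = mosaic[mask]
    # Normal equations of the regularized least-squares problem
    x = np.linalg.solve(S.T @ S + lam * D.T @ D, S.T @ y)
    return x.reshape(h, w)
```

The regularizer makes the underdetermined system uniquely solvable: the unobserved pixels are filled in by the smoothness prior while the observed samples are fitted in the least-squares sense.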