In this work, we consider the problem of detecting quadrilateral document borders in images captured by a mobile device’s camera. State-of-the-art algorithms for quadrilateral document border detection are not designed for cases in which one of the document borders is completely out of the frame, obscured, or of low contrast. We propose an algorithm that processes the image correctly in such cases. It is built on the classical contour-based algorithm, which we modify using the document’s aspect ratio, known a priori. We demonstrate that this modification reduces the number of incorrect detections by 34% on the open dataset MIDV-500.
This paper presents a method for metric rectification of planar objects that preserves angles and length ratios. The inner structure of an object is assumed to follow the laws of the Manhattan World, i.e. the majority of line segments are aligned with two orthogonal directions of the object. For this purpose, we introduce a method that estimates the positions of the two vanishing points corresponding to the main object directions. It is based on an original optimization function over line segments that estimates a vanishing point position. To calculate the rectification homography from the two vanishing points, we propose a new method based on estimating the camera rotation that makes the camera axis perpendicular to the object plane. The proposed method can be applied to the rectification of various objects such as documents or building facades. Also, since the camera rotation is estimated, the method can be employed to estimate object orientation (for example, during surgery with radiographs of osteosynthesis implants). The method was evaluated on the MIDV-500 dataset, which contains projectively distorted images of documents with complex backgrounds. According to the experimental results, the accuracy of the proposed method is better than or equal to the state of the art if the background occupies no more than half of the image. The runtime of the method is around 3 ms on a Core i7-3610QM CPU.
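The step from two vanishing points to a rectifying homography can be illustrated with the textbook construction H = K Rᵀ K⁻¹. This is a minimal sketch, not the authors' implementation: it assumes a pinhole camera with known intrinsics (focal length `f` in pixels and principal point `(cx, cy)`, both hypothetical parameters), takes the vanishing points as homogeneous 3-vectors, and all function names are illustrative.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def rectifying_homography(vp1, vp2, f, cx, cy):
    """Homography that rotates the camera to face the object plane.

    vp1, vp2: vanishing points of the two object directions (homogeneous).
    f, cx, cy: assumed pinhole intrinsics (not given in the abstract).
    """
    k = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]
    # Back-project the vanishing points to 3D directions in the camera frame.
    d1 = normalize(mat_vec(k_inv, vp1))
    d2_raw = normalize(mat_vec(k_inv, vp2))
    # The plane normal is orthogonal to both in-plane directions.
    d3 = normalize(cross(d1, d2_raw))
    # Re-orthogonalise the second axis in case the estimates are noisy.
    d2 = cross(d3, d1)
    # Rows d1, d2, d3 form R^T, where R has the object axes as columns.
    r_t = [d1, d2, d3]
    return mat_mul(mat_mul(k, r_t), k_inv)


# Example call with made-up vanishing points (pixel coordinates, w = 1).
H = rectifying_homography([1500.0, 200.0, 1.0], [300.0, -2000.0, 1.0],
                          f=1000.0, cx=320.0, cy=240.0)
```

After applying `H`, both vanishing points are sent to infinity (third homogeneous coordinate zero) when the two back-projected directions are orthogonal, which is what makes the in-plane directions parallel again in the rectified image.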
The paper considers the problem of cropping images obtained by projective transformation of source images. The problem is highly relevant to the analysis of projectively distorted images. We propose two cropping algorithms based on estimating pixel stretching under the transformation. The algorithms use the ratio of pixel neighborhood areas and the ratio of their chord lengths, respectively. The methods are compared by estimating the relative areas of the cropped background. The experiment uses a real dataset containing projectively distorted images of the pages of Russian civil passports. The method based on the chord length ratio shows better results on highly distorted images.
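The two stretching measures can be sketched numerically for a given homography. The code below is an illustration under stated assumptions, not the authors' algorithm: the pixel neighborhood is taken to be an axis-aligned square around the pixel center, the "chords" are taken to be the horizontal and vertical segments through that center, and all function names are hypothetical.

```python
import math


def project(h, x, y):
    """Apply a 3x3 homography (nested lists) to the point (x, y)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def polygon_area(pts):
    """Shoelace formula for the area of a simple polygon."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0


def area_ratio(h, x, y, size=1.0):
    """Area of the transformed square neighborhood over its source area."""
    r = size / 2.0
    corners = [(x - r, y - r), (x + r, y - r), (x + r, y + r), (x - r, y + r)]
    return polygon_area([project(h, px, py) for px, py in corners]) / (size * size)


def chord_ratios(h, x, y, size=1.0):
    """Length ratios of the transformed horizontal and vertical chords."""
    r = size / 2.0
    chords = {"horizontal": ((x - r, y), (x + r, y)),
              "vertical": ((x, y - r), (x, y + r))}
    ratios = {}
    for name, (a, b) in chords.items():
        pa, pb = project(h, *a), project(h, *b)
        ratios[name] = math.hypot(pb[0] - pa[0], pb[1] - pa[1]) / size
    return ratios
```

The area ratio is a single number and can hide strongly anisotropic stretching, whereas the chord ratios report stretching per direction; this is one plausible reason the two criteria can behave differently on highly distorted images.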
Registration of images of different nature is an important technique used in image fusion, change detection, efficient information representation, and other computer vision problems. Solving this task with feature-based approaches is usually more complex than registering several optical images, because traditional feature descriptors (SIFT, SURF, etc.) perform poorly when the images are of different nature. In this paper, we consider the problem of registering SAR and optical images. We train a neural network to build feature point descriptors and use the RANSAC algorithm to align the found matches. Experimental results are presented that confirm the method’s effectiveness.
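The role of RANSAC in aligning noisy matches can be shown with a deliberately simplified sketch. It assumes a translation-only motion model, which is a simplification chosen for brevity; the abstract does not state which transformation model is fitted, and a real SAR-to-optical pipeline would likely need an affine or projective model. The function name, tolerance, and iteration count are hypothetical.

```python
import random


def ransac_translation(matches, tol=1.0, iterations=100, seed=0):
    """Estimate a 2D shift from point matches that contain outliers.

    matches: list of ((x0, y0), (x1, y1)) pairs.
    Returns the refined shift (dx, dy) and the list of inlier matches.
    """
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        # A translation is fully determined by a single match.
        (x0, y0), (x1, y1) = rng.choice(matches)
        dx, dy = x1 - x0, y1 - y0
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - dx) <= tol
                   and abs(m[1][1] - m[0][1] - dy) <= tol]
        if len(inliers) > len(best):
            best = inliers
    # Refine the estimate by averaging over the consensus set.
    n = len(best)
    dx = sum(b[0] - a[0] for a, b in best) / n
    dy = sum(b[1] - a[1] for a, b in best) / n
    return (dx, dy), best
```

The same hypothesise-and-verify loop carries over to richer models; only the minimal sample size and the model-fitting step change.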
Keypoint detection is an important tool of image analysis, and among the many contemporary keypoint detection algorithms, YAPE is known for its computational performance, which allows its use in mobile and embedded systems. One of its shortcomings is its high sensitivity to local contrast, which leads to a high detection density in high-contrast areas and missed detections in low-contrast ones. In this work, we study the contrast sensitivity of YAPE and propose a modification that compensates for this property on images with a wide local contrast range (Yet Another Contrast-Invariant Point Extractor, YACIPE). As a model example, we consider the traffic sign recognition problem, where some signs are well lit, whereas others are in shadow and thus have low contrast. We show that the number of traffic signs on which no keypoints were detected is 40% lower for the proposed modification than for the original algorithm.
The paper describes a technology for automating the evaluation of grain quality in the grain tank of a combine harvester. A special recognition algorithm analyzes photographic images taken by the camera and provides automatic estimates of the total mass fraction of broken grains and the presence of non-grains. The paper also presents the operating details of the tank prototype and determines the accuracy of the designed algorithms.
Computing image patch descriptors for correspondence problems relies heavily on hand-crafted feature transformations, e.g. SIFT, SURF. In this paper, we explore a Siamese pairing of fully connected neural networks for learning discriminative local feature descriptors. The resulting ANN computes 128-D descriptors and demonstrates a consistent speedup compared to such state-of-the-art methods as SIFT and FREAK, both on PCs and in embedded systems. We use the L2 distance to reflect descriptor similarity during both training and testing. In this way, the feature descriptors we propose can be easily compared to their hand-crafted counterparts. We also created a dataset augmented with synthetic data for learning local features, and it is available online. The augmentations provide training data that let our descriptors generalise well under scaling, rotation, shift, Gaussian noise, and illumination changes.
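Because similarity is measured with the plain L2 distance, learned descriptors can be matched exactly like hand-crafted ones. The following is a minimal sketch of brute-force nearest-neighbour matching; it uses short toy vectors in place of 128-D descriptors, and the function names are illustrative rather than taken from the paper.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(query, reference):
    """For each query descriptor, find the closest reference descriptor.

    Returns a list of (query_index, reference_index, distance) tuples.
    """
    matches = []
    for i, q in enumerate(query):
        dists = [l2_distance(q, r) for r in reference]
        j = min(range(len(reference)), key=dists.__getitem__)
        matches.append((i, j, dists[j]))
    return matches


# Toy example: 2-D vectors stand in for 128-D descriptors.
pairs = match_descriptors([[0.0, 0.0], [1.0, 1.0]], [[1.0, 0.9], [0.1, 0.0]])
```

In practice a ratio test or a distance threshold would usually be added to reject ambiguous matches; that filtering is omitted here.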
We study a technique for improving the visualization quality of noisy multispectral images. We consider the contrast form visualization approach, which guarantees a non-zero contrast in the output image whenever the spectra of the object and the background differ in the input image. The improvement is based on weighting the channels according to an estimate of the noise level. The noise variance estimate needed for the weighting is obtained with a method proposed earlier by the authors. We show that this approach reduces noise in color visualizations of real multispectral images. On examples from the validation dataset, which consists of publicly available images of the Earth's surface, the low-noise visualizations are demonstrated to be more comprehensible to a human.
We describe an original low-cost hardware setup for efficient testing of stereo vision algorithms. The method combines a special hardware setup with a mathematical model; it is easy to construct and precise in the applications of interest to us. For a known scene, we derive its analytical representation, called the virtual scene. Using a four-point correspondence between the real scene and the virtual one, we compute the extrinsic camera parameters and project the virtual scene onto the image plane, which gives the ground truth for the depth map. Another result presented in this paper is a new depth map quality metric. Its main purpose is to tune stereo algorithms for a particular problem, e.g. obstacle avoidance.
This work considers the tracking of a UAV (unmanned aerial vehicle) on the basis of onboard observations of natural landmarks, including azimuth and elevation angles. It is assumed that the UAV's cameras are able to capture the angular position of reference points and to measure the angles of the line of sight. Such measurements involve the real position of the UAV in implicit form, and therefore a nonlinear filter such as the extended Kalman filter (EKF) or others must be used in order to employ these measurements for UAV control. It was recently shown that a modified pseudomeasurement method may be used to control a UAV on the basis of observations of reference points assigned in advance along the UAV path. However, the use of such a set of points requires a cumbersome recognition procedure and a huge volume of on-board memory. Natural landmarks that serve as such reference points and can be determined on-line can significantly reduce the on-board memory requirements and the computational difficulties. The principal difference of this work is the use of 3D reference point coordinates, which makes it possible to determine the position of the UAV more precisely and thereby to guide it along the path with higher accuracy, which is extremely important for the successful performance of autonomous missions. The article proposes a new RANSAC for ISOMETRY algorithm and the use of recently developed estimation and control algorithms for tracking a given reference path under external perturbations and noisy angular measurements.
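The statement that angular measurements contain the UAV position only implicitly can be made concrete with the measurement function itself. The sketch below assumes a particular convention that the abstract does not specify: a right-handed world frame with z pointing up, azimuth measured from the x axis in the horizontal plane, and elevation measured from that plane. The function name is hypothetical.

```python
import math


def bearing_angles(uav, landmark):
    """Azimuth and elevation (radians) of a landmark seen from the UAV.

    uav, landmark: (x, y, z) positions in the same world frame.
    Convention (an assumption): azimuth from the x axis, elevation from
    the horizontal plane, z pointing up.
    """
    dx, dy, dz = (l - u for l, u in zip(landmark, uav))
    azimuth = math.atan2(dy, dx)
    elevation = math.atan2(dz, math.hypot(dx, dy))
    return azimuth, elevation


# A landmark on the ground seen from 100 m altitude.
az, el = bearing_angles((0.0, 0.0, 100.0), (100.0, 0.0, 0.0))
```

Both angles are nonlinear functions of the unknown UAV position, which is why a linear Kalman filter cannot use them directly and an EKF or a pseudomeasurement reformulation is needed.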
The real-time object detection task is considered as part of a project devoted to the development of an autonomous ground robot. This problem has been successfully solved with the Random Ferns algorithm, which belongs to the keypoint-based methods and uses fast machine learning algorithms for the keypoint matching step. As objects in the real world are not always planar, in this article we describe experiments on applying this algorithm to non-planar objects. We also introduce a method for fast detection of a special class of non-planar objects: those that can be decomposed into planar parts (e.g. the faces of a box). This decomposition needs one detector for each side, which may significantly reduce detection speed. The proposed approach copes with this by omitting the steps repeated for each detector and organizing a special queue of detectors. This makes the algorithm three times faster than the naive one.
This study presents a new approach to the problem of detecting vehicle passes in a vision-based automatic vehicle classification system. Essential non-affinity image variations and signals from an induction loop are the events that can be considered as detectors of an object's presence. We propose several vehicle detection techniques based on image processing and induction loop signal analysis. We also suggest a combined method based on multi-sensor analysis to improve vehicle detection performance. Experimental results in complex outdoor environments show that the proposed multi-sensor algorithm is effective for vehicle detection.
In this paper, we consider the problem of estimating an object's velocity from a video stream by comparing three new velocity estimation methods: the vertical edge algorithm, a modified Lucas-Kanade method, and the feature points algorithm. As an applied example, we chose the task of automatically evaluating vehicle velocity from a video stream on toll roads. We took several videos from cameras mounted on toll roads and annotated them to determine the true velocity. The proposed methods are compared with each other in terms of how correctly they detect velocity. The relevance of this paper lies in the practical implementation of these methods, which overcomes the difficulties encountered in realization.
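The basic idea behind a feature-point velocity estimate can be sketched as follows. This is an illustration only and not one of the three methods from the paper: it assumes the tracked points are already matched between two consecutive frames, and it assumes a constant ground scale `metres_per_pixel` (a hypothetical calibration parameter that ignores perspective).

```python
import math
from statistics import median


def estimate_speed(prev_pts, next_pts, metres_per_pixel, fps):
    """Estimate speed in m/s from point tracks on two consecutive frames.

    prev_pts, next_pts: matched (x, y) pixel positions of the same points.
    metres_per_pixel: assumed constant ground scale (a simplification).
    fps: frame rate of the video stream.
    """
    displacements = [math.hypot(x1 - x0, y1 - y0)
                     for (x0, y0), (x1, y1) in zip(prev_pts, next_pts)]
    # The median is robust to a minority of badly tracked points.
    return median(displacements) * metres_per_pixel * fps


# Three consistent tracks and one mistracked point.
speed = estimate_speed([(0, 0), (5, 5), (10, 0), (20, 20)],
                       [(10, 0), (15, 5), (16, 8), (50, 60)],
                       metres_per_pixel=0.05, fps=25.0)
```

A real deployment on a toll road would replace the constant scale with a calibrated mapping from image to road coordinates, since the scale changes with distance from the camera.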