We present differentiable implementations of several common image processing algorithms: the Canny edge detector, Niblack thresholding, and the Harris corner detector. The implementations take the form of fully convolutional networks whose structure follows the original algorithms exactly. Expressing the algorithms in this form allows their parameters to be tuned by gradient descent. We tuned the parameters for the edge detection problem, and the tuned implementation obtains better results on the BSDS-500 dataset. As part of the implementations, we introduce a generalization of pooling that allows an arbitrary structuring element. We also analyze the resulting neural network architectures and show their connections with contemporary approaches.
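To illustrate what pooling over an arbitrary structuring element can look like, here is a minimal NumPy sketch of a stride-1 max pooling restricted to the active cells of a binary footprint. The function name `pool_with_footprint`, its parameters, the odd-sized footprint, and the padding with negative infinity are all assumptions made for this example; it is not the authors' implementation and does not include the gradient computation.

```python
import numpy as np

def pool_with_footprint(image, footprint, reduce=np.max, pad_value=-np.inf):
    """Stride-1 pooling over the active cells of a binary footprint.

    Hypothetical helper for illustration; assumes a 2-D image and an
    odd-sized footprint centered on each pixel.
    """
    image = np.asarray(image, dtype=float)
    footprint = np.asarray(footprint, dtype=bool)
    fh, fw = footprint.shape
    ph, pw = fh // 2, fw // 2
    # Pad so that every output pixel has a full neighborhood.
    padded = np.pad(image, ((ph, ph), (pw, pw)), constant_values=pad_value)
    h, w = image.shape
    # One shifted copy of the image per active footprint cell.
    shifted = [padded[dy:dy + h, dx:dx + w]
               for dy in range(fh) for dx in range(fw) if footprint[dy, dx]]
    # Reduce across the shifted copies (max by default).
    return reduce(np.stack(shifted), axis=0)
```

With a full square footprint this reduces to ordinary stride-1 max pooling, while a cross-shaped or disk-shaped footprint restricts the maximum to the chosen neighbors.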
The paper considers the problem of cropping images obtained by projective transformation of source images. The problem is highly relevant to the analysis of projectively distorted images. We propose two cropping algorithms based on estimating how much pixels are stretched under the transformation. One algorithm uses the ratio of pixel neighborhood areas, and the other uses the ratio of their chord lengths. The methods are compared by estimating the relative area of the cropped background. The experiment uses a real dataset of projectively distorted images of Russian civil passport pages. The method based on the chord length ratio shows better results on highly distorted images.
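One standard way to quantify local area stretching under a projective transformation is the absolute Jacobian determinant of the map, which for a homography matrix H at a point (x, y) equals |det(H)| / |w|^3, where w = h31*x + h32*y + h33. The sketch below computes this ratio and provides helpers to compare it against the area of a mapped small square. The function names are hypothetical, and the abstract does not state that the authors compute the area ratio this way.

```python
import numpy as np

def apply_homography(H, pts):
    """Map an (N, 2) array of points through the 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

def area_ratio(H, x, y):
    """Local area scale factor of H at (x, y): |det(H)| / |w|^3."""
    w = H[2, 0] * x + H[2, 1] * y + H[2, 2]
    return abs(np.linalg.det(H) / w ** 3)

def polygon_area(poly):
    """Shoelace area of a polygon given as an (N, 2) array of vertices."""
    poly = np.asarray(poly, dtype=float)
    poly = poly - poly[0]  # shift to the first vertex for numerical stability
    x, y = poly[:, 0], poly[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
```

Pixels whose ratio falls below or above a chosen threshold could then be treated as too compressed or too stretched; the threshold and the cropping rule itself are left open here.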
The paper considers the problem of estimating a transform connecting two images of the same planar object. A RANSAC-based method is proposed for calculating the parameters of a projective transform that uses point and line correspondences simultaneously. A series of experiments was performed on synthesized data. The results show that the convergence rate of the algorithm is significantly higher when the actual lines are used instead of the intersection points of those lines. When both lines and feature points are used, the convergence rate is shown not to depend on the ratio between lines and feature points in the input dataset.
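The model-fitting step inside such a scheme can be sketched with the direct linear transform, since both kinds of correspondence give constraints that are linear in the entries of H: a point pair satisfies x' × (H x) = 0, and a line pair satisfies l × (Hᵀ l') = 0. The code below is a minimal sketch under these assumptions; the function names are hypothetical, the RANSAC loop and coordinate normalization are omitted, and the abstract does not confirm that the authors solve the system this way.

```python
import numpy as np

def skew(v):
    """Cross-product matrix: skew(v) @ u equals np.cross(v, u)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def homography_from_points_and_lines(point_pairs, line_pairs):
    """Estimate H (up to scale) from homogeneous point and line pairs.

    point_pairs: iterable of (x, x_prime) with x_prime ~ H @ x
    line_pairs:  iterable of (l, l_prime) with l ~ H.T @ l_prime
    """
    rows = []
    for x, xp in point_pairs:
        # skew(x') @ H @ x = 0, written for the row-major vector of H.
        rows.append(np.kron(skew(xp), np.asarray(x, dtype=float)[None, :]))
    for l, lp in line_pairs:
        # skew(l) @ H.T @ l' = 0, written for the row-major vector of H.
        rows.append(np.kron(np.asarray(lp, dtype=float)[None, :], skew(l)))
    A = np.vstack(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 3)
```

Each correspondence contributes two independent equations, so at least four are needed in total. A set of exactly two points and two lines is a known degenerate configuration, so a minimal sample should use a different mix.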
This work studies the calculation of a projective transformation, a task that arises in machine vision problems. The details of the calculation and the identified peculiarities of mathematical library implementations are carefully analyzed. Different approaches are compared in terms of both performance and accuracy, using both artificially generated and real data.
Document capture with a smartphone camera is here to stay. Interactive applications for document capture and enhancement have filled mobile application stores. However, setting predictions aside and judging only from the experience of using such applications, they are not yet ready to compete with stationary scanners when high quality and reliability are required. This paper analyzes the problem of document detection in an image and evaluates the quality of existing mobile applications. Based on this analysis, we present a new reliable algorithm for document capture that detects boundary segments and constructs a segment graph to fit a rectangular projective model. The algorithm achieves about 95% document detection quality and outperforms all of the reviewed algorithms implemented in mobile applications.