In this paper we describe a stitching protocol that makes it possible to obtain high-resolution images of long monochromatic objects with periodic structure. The protocol can be applied to long documents or to man-made objects in satellite images of uninhabitable regions such as the Arctic. Such objects can be very long, while modern camera sensors have limited resolution and cannot capture the whole object in sufficient quality for further processing, e.g. by an OCR system. The idea of the proposed method is to acquire a video stream that covers the full object in high resolution and to apply image stitching. We expect the scanned object to have straight boundaries and periodic structure, which allows us to introduce regularization into the stitching problem and to adapt the algorithm to the limited computational power of mobile and embedded CPUs. Using the detected boundaries and structure, we estimate the homography between frames and use this information to reduce the complexity of stitching. We demonstrate our algorithm on a mobile device and show an image processing speed of 2 fps on a Samsung Exynos 5422 processor.
This paper explores a layer-by-layer training method for neural networks that use approximate calculations and/or low-precision data types. The proposed method improves recognition accuracy using standard training algorithms and tools. At the same time, it speeds up neural network calculations through fast approximate calculations and compact data types. We consider 8-bit fixed-point arithmetic as an example of such an approximation for image recognition problems. Finally, we show a significant accuracy increase for the considered approximation along with a processing speedup.
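To make the 8-bit fixed-point approximation mentioned above concrete, the following is a minimal sketch of quantizing values to signed 8-bit fixed point and computing a dot product on the integers. It does not reproduce the layer-by-layer training method itself. The function names and the choice of 6 fractional bits (`FRAC_BITS`) are assumptions made for illustration only.

```python
def quantize_q8(values, frac_bits):
    """Convert floats to signed 8-bit fixed point with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    out = []
    for v in values:
        q = int(round(v * scale))
        out.append(max(-128, min(127, q)))  # saturate to the int8 range
    return out


def fixed_point_dot(q_a, q_b, frac_bits):
    """Dot product of two quantized vectors, accumulated in a wide integer."""
    acc = sum(a * b for a, b in zip(q_a, q_b))
    # Each product carries 2 * frac_bits fractional bits.
    return acc / float(1 << (2 * frac_bits))


FRAC_BITS = 6  # hypothetical format: 1 sign bit, 1 integer bit, 6 fractional bits
weights = [0.5, -0.25, 0.125]
inputs = [1.0, 0.5, -0.75]

q_weights = quantize_q8(weights, FRAC_BITS)
q_inputs = quantize_q8(inputs, FRAC_BITS)
approx = fixed_point_dot(q_weights, q_inputs, FRAC_BITS)
exact = sum(w * x for w, x in zip(weights, inputs))
print(q_weights, approx, exact)
```

The values here were chosen so that they are exactly representable. Real network weights generally are not, and the resulting rounding error is the accuracy loss that a training procedure for approximate arithmetic has to compensate for.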
In this paper, we propose an expansion of convolutional neural network (CNN) input features based on the Hough Transform. We perform morphological contrasting of the source image followed by the Hough Transform, and then use the result as input for some of the convolutional filters. Thus, the CNN's computational complexity and number of units are not affected. Morphological contrasting and the Hough Transform are the only additional computational expenses of the introduced input feature expansion. The proposed approach is demonstrated on a CNN with a very simple structure. We consider two image recognition problems: object classification on CIFAR-10 and printed character recognition on a private dataset with symbols taken from Russian passports. Our approach yields a noticeable accuracy improvement at little computational cost, which can be extremely important in industrial recognition systems or in difficult problems that use CNNs, such as pressure ridge analysis and classification.
In this paper, we introduce a slant detection method based on Fast Hough Transform calculation and demonstrate its application in an industrial system for Russian passport recognition. About 1.5% of documents of this kind are printed in slanted or italic fonts. This reduces the recognition rate, because optical recognition systems are normally designed to process regular, upright fonts. Our method uses the Fast Hough Transform to analyse vertical strokes of characters, which are extracted using the x-derivative of a text line image. To improve the quality of the detector, we also introduce field grouping rules. The resulting algorithm achieves high detection quality. Almost all errors of the considered approach occur on passports with nonstandard fonts, while the slant detector itself works properly.
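The idea of analysing vertical strokes through line sums over the x-derivative can be sketched as follows. This is a brute-force illustration, not the Fast Hough Transform, which computes such line sums far more efficiently. The scoring criterion (sum of squared line sums) and all function names are assumptions made for this example, and the image is assumed to have at least two rows.

```python
def x_derivative(img):
    """Absolute horizontal difference; responds to the edges of vertical strokes."""
    return [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)] for row in img]


def slant_score(edges, shift):
    """Sum of squared line sums for lines that move `shift` pixels in x
    between the top row and the bottom row."""
    h, w = len(edges), len(edges[0])
    score = 0
    for x0 in range(-abs(shift), w + abs(shift)):
        total = 0
        for y in range(h):
            x = x0 + round(shift * y / (h - 1))
            if 0 <= x < w:
                total += edges[y][x]
        score += total * total
    return score


def detect_slant(img, max_shift):
    """Return the shift in [-max_shift, max_shift] whose lines best align with strokes."""
    edges = x_derivative(img)
    return max(range(-max_shift, max_shift + 1), key=lambda s: slant_score(edges, s))


# A single stroke, one pixel wide, leaning so that x decreases as y grows.
slanted = [[1 if x == 8 - y else 0 for x in range(12)] for y in range(5)]
# The same stroke drawn upright.
upright = [[1 if x == 5 else 0 for x in range(12)] for y in range(5)]

print(detect_slant(slanted, 6), detect_slant(upright, 6))
```

The score is largest when whole strokes fall onto single lines, so a nonzero best shift suggests slanted text. A practical detector would also need a decision threshold and, as in the abstract, rules for combining evidence across fields.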
An iterative algorithm is proposed for blind multi-image deblurring of binary images. Binarity is the only prior restriction imposed on the image. The image formation model assumes convolution with an arbitrary kernel and addition of a constant value. The penalty functional is composed using the binarity constraint for regularization. The algorithm estimates the original image and the distortion parameters by alternately reducing two parts of this functional. Experimental results for natural (non-synthetic) data are presented.
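One possible way to write down such a model and functional is sketched below. The abstract does not give the exact formulas, so the symbols $x$, $y_k$, $h_k$, $c_k$, $\lambda$, the noise term $n_k$, and in particular the quartic form of the binarity term are assumptions chosen for illustration.

```latex
% Assumed image formation model for K observed images:
% unknown binary image x, blur kernel h_k, constant offset c_k, noise n_k.
y_k = h_k * x + c_k + n_k, \qquad k = 1, \dots, K

% Assumed penalty functional: a data-fidelity part plus a binarity part
% that vanishes only when every pixel x_i is 0 or 1.
J(x, \{h_k, c_k\}) =
  \sum_{k=1}^{K} \bigl\| h_k * x + c_k - y_k \bigr\|_2^2
  + \lambda \sum_{i} x_i^2 \, (1 - x_i)^2
```

Under this reading, an alternating scheme would update the distortion parameters $h_k, c_k$ with $x$ fixed and then update $x$ with the parameters fixed.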
Computing image patch descriptors for correspondence problems relies heavily on hand-crafted feature transformations, e.g. SIFT and SURF. In this paper, we explore a Siamese pairing of fully connected neural networks for learning discriminative local feature descriptors. The resulting ANN computes 128-D descriptors and demonstrates a consistent speedup compared to such state-of-the-art methods as SIFT and FREAK, on PCs as well as in embedded systems. We use the L2 distance to reflect descriptor similarity during both training and testing. In this way, the feature descriptors we propose can be easily compared to their hand-crafted counterparts. We also created a dataset augmented with synthetic data for learning local features, and it is available online. The augmentations provide training data that helps our descriptors generalise well to scaling, rotation, shift, Gaussian noise, and illumination changes.
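As an illustration of how the L2 distance can serve both training and testing, here is a minimal sketch. The abstract does not name the training loss, so the margin-based contrastive loss below is an assumption (a common choice for Siamese networks), and the function names and the `margin` parameter are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(desc_a, desc_b, is_match, margin=1.0):
    """Assumed training loss: pull matching pairs together and push
    non-matching pairs at least `margin` apart."""
    d = l2_distance(desc_a, desc_b)
    if is_match:
        return d * d
    return max(0.0, margin - d) ** 2


def nearest_descriptor(query, candidates):
    """Test-time matching: index of the candidate closest to `query` in L2."""
    return min(range(len(candidates)), key=lambda i: l2_distance(query, candidates[i]))


# Toy 2-D descriptors stand in for the 128-D ones described above.
query = [0.0, 0.0]
candidates = [[3.0, 4.0], [0.1, 0.0], [1.0, 1.0]]
best = nearest_descriptor(query, candidates)
print(best, l2_distance(query, candidates[best]))
```

Because matching relies only on the L2 distance, the same `nearest_descriptor` routine would work unchanged for hand-crafted descriptors that are also compared with L2, such as SIFT.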
Neural network calculations for image recognition problems can be very time-consuming. In this paper we propose three methods of increasing neural network performance on SIMD architectures. SIMD extensions, which are available on a number of modern CPUs, offer a way to speed up neural network processing. In our experiments, we use ARM NEON as an example of a SIMD architecture. The first method uses the half-precision float data type for matrix computations. The second method uses a fixed-point data type for the same purpose. The third method considers a vectorized implementation of activation functions. For each method we set up a series of experiments for convolutional and fully connected networks designed for image recognition tasks.
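The numerical effect of the half-float data type can be emulated without SIMD hardware. The sketch below is not ARM NEON code and says nothing about speed. It only rounds values to IEEE 754 half precision using Python's `struct` format `e` to show the kind of error such a data type introduces in a dot product. The function names are assumptions.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dot_half(a, b):
    """Dot product with inputs rounded to half precision and the
    accumulator rounded to half precision after every step."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_half(acc + to_half(x) * to_half(y))
    return acc


weights = [0.1, 0.2, 0.3, 0.4]
inputs = [1.0, 2.0, 3.0, 4.0]

exact = sum(w * x for w, x in zip(weights, inputs))
approx = dot_half(weights, inputs)
error = abs(exact - approx)
print(exact, approx, error)
```

Half precision keeps roughly three significant decimal digits, so the error is small for short sums like this one. For long accumulations, it may be preferable to keep the accumulator in a wider type.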