A skew-estimation method using straight lines in document images is presented. Unlike conventional approaches that exploit properties of text, we formulate skew estimation as a line-based estimation problem and focus on robust, accurate line detection. To be precise, we adopt a block-based edge detector followed by a progressive line detector so as to gather cues from a variety of sources such as text lines, boundaries of figures/tables, vertical/horizontal separators, and boundaries of text blocks. Extensive experiments on datasets of skewed images and competition results show that the proposed method works robustly and yields accurate skew estimates.
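As a minimal illustration of how detected lines can vote for a skew angle (not the paper's implementation; the angle folding, bin width, and length weighting below are our own assumptions):

```python
import numpy as np

def estimate_skew(lines, bin_deg=0.1):
    """Estimate skew from detected line segments (sketch; parameters ours).

    lines: (N, 4) array of endpoints (x0, y0, x1, y1). Each segment votes for
    its angle, folded to [-45, 45) degrees so near-vertical separators and
    near-horizontal text lines vote for the same skew, weighted by length.
    """
    lines = np.asarray(lines, dtype=float)
    dx = lines[:, 2] - lines[:, 0]
    dy = lines[:, 3] - lines[:, 1]
    length = np.hypot(dx, dy)
    angle = np.degrees(np.arctan2(dy, dx))       # in (-180, 180]
    angle = (angle + 45.0) % 90.0 - 45.0         # fold to [-45, 45)
    edges = np.arange(-45.0, 45.0 + bin_deg, bin_deg)
    hist, edges = np.histogram(angle, bins=edges, weights=length)
    k = int(np.argmax(hist))
    return 0.5 * (edges[k] + edges[k + 1])
```

Length weighting lets long, reliable structures (table borders, separators) dominate short noisy segments.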
Conventional image-stitching methods were developed under the assumption that (1) the optical center of the camera is fixed (the fixed-optical-center case) or (2) the camera captures a planar target (the plane-target case). Hence, users must know or test which condition better fits a given set of images and then select the right algorithm, or try multiple stitching algorithms. We propose a unified framework for the image stitching and rectification problem that handles both cases. To be precise, we model each camera pose with six parameters (three for rotation and three for translation) and develop a cost function that reflects the registration errors on a reference plane. The designed cost function is effectively minimized via the Levenberg–Marquardt algorithm. When the relative camera motions between the images are found to be large, the proposed method rectifies the images and composes the result from the rectified images; otherwise, the algorithm simply builds a visually pleasing result by selecting a viewpoint. Experimental results on synthetic and real images show that our method successfully performs stitching and metric rectification.
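The minimization relies on Levenberg–Marquardt. As a self-contained sketch of that optimizer, here it is with a numeric Jacobian and a toy 2-D alignment residual standing in for the paper's six-parameter camera cost (all names are ours):

```python
import numpy as np

def levenberg_marquardt(residual, p0, n_iter=50, lam=1e-3):
    """Minimize 0.5 * ||residual(p)||^2 with a forward-difference Jacobian."""
    p = np.asarray(p0, dtype=float)
    for _ in range(n_iter):
        r = residual(p)
        J = np.empty((r.size, p.size))
        for j in range(p.size):              # numeric Jacobian, column by column
            dp = np.zeros_like(p)
            dp[j] = 1e-6
            J[:, j] = (residual(p + dp) - r) / 1e-6
        A, g = J.T @ J, J.T @ r
        damped = A + lam * np.diag(np.diag(A)) + 1e-12 * np.eye(p.size)
        p_new = p - np.linalg.solve(damped, g)
        if residual(p_new) @ residual(p_new) < r @ r:
            p, lam = p_new, lam * 0.5        # improvement: accept, trust more
        else:
            lam *= 10.0                      # no improvement: damp harder
    return p

# toy registration residual: recover a 2-D rotation + translation
def rot(th):
    c, s = np.cos(th), np.sin(th)
    return np.array([[c, -s], [s, c]])

pts = np.array([[0., 0.], [1., 0.], [0., 1.], [2., 1.], [1., 2.]])
target = pts @ rot(0.1).T + np.array([1.0, -2.0])
res = lambda p: (pts @ rot(p[0]).T + p[1:] - target).ravel()
p_hat = levenberg_marquardt(res, np.zeros(3))
```

The adaptive damping is what interpolates between gradient descent (large `lam`) and Gauss–Newton (small `lam`).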
A generalized wavelet-domain image fusion method is presented that imposes a weight on each wavelet coefficient to improve the conventional wavelet-domain approach. The weights are controlled in the least-squares sense to enhance details while suppressing excessive high-frequency components. In experiments with IKONOS and QuickBird satellite data, we demonstrate that the proposed method performs comparably to or better than conventional methods in terms of various objective quality metrics.
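A minimal sketch of weighted detail injection in the wavelet domain, using a single-level Haar transform and a single scalar weight in place of the paper's least-squares-controlled per-coefficient weights (function names are ours):

```python
import numpy as np

def haar2(x):
    """Single-level 2-D Haar DWT of an even-sized image -> (LL, LH, HL, HH)."""
    a, b = x[0::2, :], x[1::2, :]
    lo, hi = (a + b) / 2.0, (a - b) / 2.0                      # rows
    ll, lh = (lo[:, 0::2] + lo[:, 1::2]) / 2.0, (lo[:, 0::2] - lo[:, 1::2]) / 2.0
    hl, hh = (hi[:, 0::2] + hi[:, 1::2]) / 2.0, (hi[:, 0::2] - hi[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def ihaar2(ll, lh, hl, hh):
    """Inverse of haar2 (perfect reconstruction)."""
    lo = np.empty((ll.shape[0], ll.shape[1] * 2)); hi = np.empty_like(lo)
    lo[:, 0::2], lo[:, 1::2] = ll + lh, ll - lh
    hi[:, 0::2], hi[:, 1::2] = hl + hh, hl - hh
    x = np.empty((lo.shape[0] * 2, lo.shape[1]))
    x[0::2, :], x[1::2, :] = lo + hi, lo - hi
    return x

def fuse(ms, pan, w=0.7):
    """Blend pan detail subbands into the MS image; keep the MS approximation.
    The paper tunes per-coefficient weights in a least-squares sense; a single
    scalar w stands in here as a placeholder."""
    ll_m, lh_m, hl_m, hh_m = haar2(ms)
    _, lh_p, hl_p, hh_p = haar2(pan)
    blend = lambda m, p: (1.0 - w) * m + w * p
    return ihaar2(ll_m, blend(lh_m, lh_p), blend(hl_m, hl_p), blend(hh_m, hh_p))
```

Keeping the LL band from the multispectral image preserves its spectral content while the weighted details sharpen it.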
When the horizon or other long edges in a photo are skewed, the photo may look unstable unless the skew is an artistic intention, so we may wish to correct it. For the correction of faint as well as strong skewed horizons, we propose a skew-estimation method for natural images. We first apply a long-block-based edge detector that can construct edge maps even when an edge is faint and/or the background is cluttered. We also propose a robust line-detection method on the generated edge map, based on the progressive probabilistic Hough transform followed by refinement steps. For each detected line we define a weight, and we estimate the image skew from the weighted votes of the lines. Since all pixels in the long blocks contribute to the edge-map construction, the proposed method can find noisy or faint lines while rejecting curved or short ones. Experimental results show that the most salient angle corresponds to the image skew in most cases, and the skews are successfully corrected.
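A sketch of the long-block idea for faint edges: averaging all pixels of a long block cancels noise while a faint, roughly horizontal edge survives in the per-block profile (the block width and the simple gradient test below are our assumptions, not the paper's detector):

```python
import numpy as np

def long_block_edge_profile(img, block_w=64):
    """Edge evidence from long horizontal blocks (sketch; block width ours).

    Each row is averaged over a long block, so zero-mean noise shrinks by
    1/sqrt(block_w) while a horizontal step edge keeps its height; the
    strongest vertical gradient of each profile marks the candidate edge row.
    """
    h, w = img.shape
    n = w // block_w
    profiles = img[:, :n * block_w].reshape(h, n, block_w).mean(axis=2)
    grad = np.abs(np.diff(profiles, axis=0))     # vertical gradient, (h-1, n)
    rows = np.argmax(grad, axis=0)               # strongest edge row per block
    return rows, grad
```

A short or curved line cannot stay inside one row of every long block, which is why such structures are naturally rejected.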
This paper proposes an algorithm for detecting pillars or posts in video captured by a single camera mounted in front of the rear-view mirror of a car. The main purpose of the algorithm is to complement a weakness of current ultrasonic parking-assist systems, which cannot accurately locate the position of pillars or recognize narrow posts. The proposed algorithm consists of three steps: straight-line detection, line tracking, and estimation of the 3D positions of pillars. In the first step, strong lines are found by the Hough transform. The second step combines detection and tracking, and the third calculates the 3D position of each line by analyzing the trajectory of its relative positions together with the camera parameters. Experiments on synthetic and real images show that the proposed method successfully locates and tracks the positions of pillars, which helps the ultrasonic system correctly locate pillar edges. We believe the proposed algorithm can also serve as a basic element of vision-based autonomous driving systems.
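A minimal sketch of the third step under a simplifying assumption of purely lateral camera motion with a calibrated focal length (the paper analyzes the full trajectory with the camera parameters; the names and the least-squares fit here are ours):

```python
import numpy as np

def pillar_depth(cols, baselines, f):
    """Least-squares depth of a vertical line from its tracked image columns.

    cols: pixel column of the line at each frame; baselines: lateral camera
    displacement of each frame relative to frame 0; f: focal length in pixels.
    Pinhole model: the column shift between frames is f * b / Z, so the slope
    k = f / Z is fitted by least squares and inverted for the depth Z.
    """
    cols = np.asarray(cols, dtype=float)
    b = np.asarray(baselines, dtype=float)
    dx = cols - cols[0]              # column shift relative to frame 0
    k = (b @ dx) / (b @ b)           # fit dx = k * b
    return f / k
```

Fitting over the whole trajectory, rather than one frame pair, averages out per-frame tracking jitter.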
We propose a new autofocus method for digital cameras based on separating color components of the incoming light and measuring their disparity. To separate the color components, we place two apertures covered with red and blue color filters. This yields two monochromatic images whose disparities are proportional to the distances of objects from the camera. We also propose a new measure for finding the disparity of these color components, because conventional disparity measures show low accuracy on a pair of images from different color channels. The measure is based on the observation that the overlap of two images with disparity contains many weak gradients, whereas the overlap with no disparity contains a small number of strong gradients. One of the two images is shifted from left to right, the measure is computed at each position, and the position with the maximum measure is taken as the disparity; the direction and distance of focus are then computed from the estimated disparity and the camera parameters. The proposed method has been implemented in a commercial compact camera and shown to find the focus over a wide range of distances and illumination conditions.
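One plausible instantiation of the gradient observation and the shift search (the exact measure in the paper may differ; this sketch scores gradient concentration so that one strong edge beats two weak ones of equal total strength):

```python
import numpy as np

def gradient_concentration(overlap):
    """Scores higher when the gradient energy sits in few strong edges."""
    g = np.abs(np.diff(overlap, axis=1)).ravel()
    return (g @ g) / max(g.sum(), 1e-12)

def estimate_disparity(red, blue, max_shift=10):
    """Return the horizontal shift of the blue image that best aligns it with
    the red image: shift left to right, score each overlap, keep the maximum."""
    best_score, best_shift = -np.inf, 0
    m = max_shift                                 # crop wrap-around borders
    for s in range(-max_shift, max_shift + 1):
        shifted = np.roll(blue, s, axis=1)
        score = gradient_concentration(red[:, m:-m] + shifted[:, m:-m])
        if score > best_score:
            best_score, best_shift = score, s
    return best_shift
```

When the shift aligns the two channels, their edges add constructively into one strong gradient; any misalignment splits it into two weaker ones, lowering the score.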
This paper proposes a new colorization method based on chrominance blending. The blending weights are computed with the random walker algorithm, a soft-segmentation technique that provides sharp probability transitions at object boundaries. As a result, the proposed method reduces color bleeding and improves colorization performance compared with conventional methods.
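A sketch of random-walker chrominance blending on a 1-D pixel chain (the paper works on 2-D images; a chain keeps the linear system small while still showing the sharp probability transition at a luminance edge — all names and the weight form are our assumptions):

```python
import numpy as np

def random_walker_blend(lum, seeds, beta=10.0):
    """Random-walker chrominance blending on a 1-D pixel chain.

    lum: luminance per pixel; seeds: {pixel index: chrominance} with exactly
    two scribbles. Edge weights exp(-beta * dI^2) discourage the walker from
    crossing strong luminance edges, so the blending probability (and hence
    the blended chrominance) changes sharply at object boundaries.
    """
    n = len(lum)
    w = np.exp(-beta * np.diff(np.asarray(lum, dtype=float)) ** 2)
    L = np.zeros((n, n))                      # chain graph Laplacian
    for i, wi in enumerate(w):
        L[i, i] += wi
        L[i + 1, i + 1] += wi
        L[i, i + 1] -= wi
        L[i + 1, i] -= wi
    s_idx = sorted(seeds)
    u_idx = [i for i in range(n) if i not in seeds]
    p = np.zeros(n)
    p[s_idx[0]] = 1.0                         # prob. of reaching the first seed
    rhs = -L[np.ix_(u_idx, s_idx)] @ p[s_idx]
    p[u_idx] = np.linalg.solve(L[np.ix_(u_idx, u_idx)], rhs)
    c = np.array([seeds[i] for i in s_idx], dtype=float)
    return p * c[0] + (1.0 - p) * c[1], p
```

Because the probability stays near 0 or 1 away from boundaries and flips only at strong luminance edges, the blended chrominance does not bleed across objects.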
We propose a new method that classifies wafer images according to their defect types for automatic defect classification in semiconductor fabrication processes. Conventional image classifiers using global properties cannot be applied to this problem, because the defects usually occupy very small regions of the images. Hence, the defects must first be segmented, and the shape of each segment and the features extracted from its region are then used for classification; in other words, we need a classification-after-segmentation approach that exploits features from the small regions corresponding to the defects. However, segmenting scratch defects with conventional methods is difficult because of the shrinking-bias problem. We therefore propose a new Markov random field (MRF)-based method for segmenting wafer images, and then design an AdaBoost-based classifier that uses the features extracted from the segmented local regions.
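For context, a generic sketch of two-label MRF segmentation by iterated conditional modes; note that the paper designs a tailored MRF precisely because this standard Ising-smoothed energy shrinks thin structures such as scratches:

```python
import numpy as np

def icm_segment(img, mu0, mu1, lam=0.5, n_iter=10):
    """Two-label MRF segmentation by synchronous iterated conditional modes.

    Energy per pixel: (img - mu_label)^2 + lam * (# of 4-neighbours carrying
    the other label). Shown only to illustrate the energy being minimized; it
    still suffers the shrinking bias the paper's tailored MRF avoids.
    """
    lab = (np.abs(img - mu1) < np.abs(img - mu0)).astype(int)
    for _ in range(n_iter):
        pad = np.pad(lab, 1, mode="edge")
        n1 = (pad[:-2, 1:-1] + pad[2:, 1:-1]          # neighbours labelled 1
              + pad[1:-1, :-2] + pad[1:-1, 2:])
        e0 = (img - mu0) ** 2 + lam * n1              # cost of choosing label 0
        e1 = (img - mu1) ** 2 + lam * (4 - n1)        # cost of choosing label 1
        lab = (e1 < e0).astype(int)
    return lab
```

The smoothness term `lam` cleans up isolated noise pixels, but for a one-pixel-wide scratch every pixel has mostly background neighbours, which is exactly the shrinking-bias failure mode.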
An algorithm is proposed for detecting scene changes in video sequences. It is based on comparing several features that represent the characteristics of frames. More specifically, feature extraction is confined to blocks that contain strong edges, rather than the whole image as in conventional algorithms, in order to concentrate on important colors and objects rather than the background. Several non-overlapping blocks of predefined size containing strong edges are first found in the frame. Then three kinds of features are extracted from the pixels in these blocks: the color histogram of the block pixels, the sum of absolute differences (SAD) between corresponding blocks of the current and previous frames as in video coding, and the number of active blocks, i.e., blocks whose edge strength exceeds a given threshold. Dissolves and wipes are detected by comparing the block histograms, cuts are detected by the SAD, and fade-ins/outs are detected by the number of active blocks. A comparison on several test sequences shows that the color histogram of strong-edge blocks is a promising feature for detecting wipes and dissolves. Cut-detection performance using the SAD of strong-edge blocks is shown to be comparable to conventional feature-based algorithms, and fade-ins/outs are easily detected with high precision by counting the active blocks.
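The strong-edge-block selection and the block SAD can be sketched as follows (the block size, number of blocks, and gradient-magnitude edge strength are our assumptions):

```python
import numpy as np

def strong_edge_blocks(frame, bs=8, k=8):
    """Indices of the k non-overlapping bs x bs blocks with strongest edges."""
    h, w = frame.shape
    gy, gx = np.gradient(frame)
    energy = np.hypot(gx, gy)
    H, W = h // bs, w // bs
    strength = energy[:H * bs, :W * bs].reshape(H, bs, W, bs).sum(axis=(1, 3))
    top = np.argsort(strength.ravel())[::-1][:k]
    return [(i // W, i % W) for i in top]

def block_sad(prev, cur, blocks, bs=8):
    """Sum of absolute differences over the chosen blocks, as in video coding."""
    total = 0.0
    for bi, bj in blocks:
        a = prev[bi * bs:(bi + 1) * bs, bj * bs:(bj + 1) * bs]
        b = cur[bi * bs:(bi + 1) * bs, bj * bs:(bj + 1) * bs]
        total += np.abs(a - b).sum()
    return total
```

A cut would then be declared when the block SAD between consecutive frames jumps above a threshold, while a gradual transition keeps it moderate over many frames.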
We propose an algorithm for tracking facial feature points based on the block-matching algorithm (BMA) with a new window shape that accounts for feature-point characteristics and scale/angle changes of the face. The window is the set of pixels on eight radial lines at 0°, 45°, … from the feature point, i.e., it has the shape of a cross plus a 45°-rotated cross. This window shape is shown to be more efficient than the conventional rectangular window for tracking facial feature points, because the points and their neighborhoods usually do not belong to a rigid body. Since the feature points usually lie on edges of luminance or color change, at least one of the radial lines crosses the edge, which gives a distinct measure for tracking the point. The radial-line window also requires less computation than the rectangular window and is more readily adjusted for scale and angle changes. To estimate scale changes, the facial region is segmented in each frame using normalized color, and the number of pixels in the facial region is compared across frames.
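A sketch of the radial-line window and its use in SAD block matching (the window radius and search range are our assumptions):

```python
import numpy as np

def radial_offsets(radius):
    """Pixel offsets along 8 radial lines (0°, 45°, ..., 315°): a cross plus
    a 45°-rotated cross, as in the proposed window."""
    dirs = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]
    offs = [(0, 0)]
    for dy, dx in dirs:
        offs += [(r * dy, r * dx) for r in range(1, radius + 1)]
    return np.array(offs)

def track_point(prev, cur, p, radius=5, search=6):
    """Block matching over the radial-line window: SAD between the window
    sampled at p in the previous frame and at each candidate position in the
    current frame; the candidate with minimum SAD is the new location."""
    offs = radial_offsets(radius)
    y, x = p
    ref = prev[y + offs[:, 0], x + offs[:, 1]]
    best_sad, best_q = np.inf, p
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            q = (y + dy, x + dx)
            cand = cur[q[0] + offs[:, 0], q[1] + offs[:, 1]]
            sad = np.abs(ref - cand).sum()
            if sad < best_sad:
                best_sad, best_q = sad, q
    return best_q
```

The window covers only 8·radius + 1 pixels instead of the (2·radius + 1)² of a square window, and scaling it is just a change of `radius`.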