We describe a document image segmentation algorithm to classify a scanned document into different regions
such as text/line drawings, pictures, and smooth background. The proposed scheme is relatively independent
of variations in text font style, size, intensity polarity and of string orientation. It is intended for use in an
adaptive system for document image compression. The principal parts of the algorithm are the generation of
the foreground and background layers and the application of hierarchical singular value decomposition (SVD)
to smoothly fill the blank regions of both layers so that a high compression ratio can be achieved.
The algorithm was evaluated for both effectiveness and computational efficiency on several test images,
where it outperformed the other techniques compared.
In this paper, we present a novel, efficient flicker noise reduction method for single images scanned by overhead line sensors.
The flicker noise here is perceived as horizontal bands which are not necessarily periodic. We view the flicker pattern as
the noise of the row cumulative histogram along the vertical direction, and propose two novel cumulative histogram filtering
approaches to smooth the artifact: using different Gaussian variances and padding the image. The proposed
algorithm is then used to reduce the flicker noise in our scanned color images. The computational complexity of the
proposed algorithm is further analyzed. The algorithm operates on single images; it does not rely on the frequency of
the alternating current, nor does it require the horizontal bands to be periodic. Experimental results show the superior performance of
the proposed method in comparison to other existing methods.
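The idea of smoothing a per-row statistic along the vertical direction can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: it uses the row mean as a stand-in for the full row cumulative histogram, a single Gaussian variance, and reflect padding at the image borders. The function names, the default `sigma`, and the additive correction are all assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_with_reflect_padding(values, sigma):
    """Gaussian-smooth a 1-D sequence, mirroring it at both ends."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            # Reflect out-of-range indices back into [0, n).
            while j < 0 or j >= n:
                j = -j - 1 if j < 0 else 2 * n - 1 - j
            acc += w * values[j]
        smoothed.append(acc)
    return smoothed

def reduce_banding(image, sigma=2.0):
    """Shift each row so its mean follows the vertically smoothed row means.

    Assumption: row means replace the row cumulative histograms of the text.
    """
    row_means = [sum(row) / len(row) for row in image]
    target = smooth_with_reflect_padding(row_means, sigma)
    return [[min(255.0, max(0.0, p + (t - m))) for p in row]
            for row, m, t in zip(image, row_means, target)]

def row_mean_spread(image):
    """Difference between the brightest and darkest row means."""
    means = [sum(row) / len(row) for row in image]
    return max(means) - min(means)

# Synthetic example: a flat gray image with every other row brightened.
banded = [[100 + (10 if r % 2 else 0)] * 4 for r in range(8)]
cleaned = reduce_banding(banded, sigma=2.0)
```

Because the correction depends only on how row statistics vary from one row to the next, the sketch needs neither periodic bands nor knowledge of the mains frequency, which is the property the abstract emphasizes.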
In recent years, various gesture recognition systems have been studied for use in television and video games.
In such systems, motion areas ranging from 1 to 3 meters deep have been evaluated. However, with the burgeoning
popularity of small mobile displays, gesture recognition systems capable of operating at much shorter ranges have
become necessary. The problems related to such systems are exacerbated by the fact that the camera's field of view is
unknown to the user during operation, which imposes several restrictions on his/her actions.
To overcome the restrictions imposed by such mobile camera devices, and to create a more flexible gesture
recognition interface, we propose a hybrid hand gesture system that provides two types of gesture recognition modules
and uses a dedicated switching module to select the most appropriate one. The two
recognition modules of this system are shape analysis using a boosting approach (detection-based approach) and
motion analysis using image frame differences (motion-based approach).
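As a rough illustration of motion analysis by image frame differences, the sketch below thresholds the absolute difference between two grayscale frames and summarizes the result. It is a generic frame-differencing example, not the motion-based module of the proposed system; the function names and the threshold value of 25 are hypothetical, and the switching module is not modeled.

```python
def frame_difference_mask(previous, current, threshold=25):
    """Mark pixels whose intensity changed by more than the threshold."""
    return [[1 if abs(c - p) > threshold else 0
             for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]

def motion_ratio(mask):
    """Fraction of pixels flagged as moving."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total

def motion_centroid(mask):
    """Mean (row, column) of the moving pixels, or None if nothing moved."""
    points = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not points:
        return None
    return (sum(r for r, _ in points) / len(points),
            sum(c for _, c in points) / len(points))

# Two 4x4 frames: two pixels in column 2 brighten between frames.
frame_a = [[0] * 4 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
frame_b[1][2] = 200
frame_b[2][2] = 200

mask = frame_difference_mask(frame_a, frame_b)
```

Tracking the centroid over successive frame pairs gives a coarse motion direction, which is the kind of cue a motion-based recognizer can use without knowing the shape of the hand.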
We evaluated this system using sample users and classified the resulting errors into three categories: errors that
depend on the recognition module, errors caused by incorrect module identification, and errors resulting from user
actions. In this paper, we show the results of our investigations and explain the problems related to short-range gesture
Logos are considered valuable intellectual property and a key component of the goodwill of a business. In
this paper, we propose a natural scene logo recognition method which is segmentation-free and capable of
processing images extremely rapidly and achieving high recognition rates. The classifiers for each logo are trained
jointly, rather than independently. In this way, common features can be shared across multiple classes for better
generalization. To deal with the wide range of aspect ratios among logos, a set of salient regions of interest
(ROIs) is extracted to describe each class. A Class Conditional Entropy Maximization criterion ensures that the
selected ROIs are both individually informative and pairwise weakly dependent. Experimental results on a
large logo database demonstrate the effectiveness and efficiency of our proposed method.
The quality of camera-based whiteboard images depends heavily on the lighting environment and on how clearly the
content is written. Specular reflection and low contrast frequently reduce the readability of captured whiteboard images.
This paper proposes a novel method to enhance camera-based whiteboard images. The images are enhanced
by removing specular highlights to improve visibility and by emphasizing the content to improve
the readability of the whiteboards. The method can be practically embedded in mobile devices with image
Distortion correction methods for digital camera document images of thick volumes or curved papers have become important
for camera-based document recognition technologies. In this paper we propose a novel distortion correction method for
digital camera document images based on "shape from parallel geodesics." The method relies on two observations:
parallel lines corresponding to character strings or ruled lines of tables on the extended surface become parallel geodesics on
a curved paper surface, and a smoothly curved paper can be modeled by a ruled surface, that is, a surface swept by rulings.
The input image, which results from a perspective transformation, contains the projections of these geodesics and rulings. The presented
method extracts the projected geodesics, estimates the projected rulings in the input image, estimates the ruled surface
that models the curved paper, and generates the corrected image, in this order. The projected rulings are estimated from a
condition derived solely from the parallelism of the geodesics, with no requirement of equal spacing. This method can
estimate the ruled surface model directly by numerical operations of differentiation, integration and matrix inversion
without any iterative calculation. We also report on experiments that show the effectiveness of the proposed method.