30 March 1995
Divide-and-conquer approach to Japanese text segmentation
This paper presents a robust text segmentation algorithm for printed Japanese documents. A divide-and-conquer approach is proposed to handle a wide variety of image qualities and print styles. The approach adapts its processing strategy to the text quality: a method that draws on diverse knowledge sources is used to segment degraded text, while a fast, simple method is used for good-quality text. Because the algorithm adaptively selects the method for each scenario, the segmentation is efficient in both speed and accuracy. The segmenter has three modules: image preprocessing, line segmentation, and character segmentation. The preprocessor uses statistics of the image's connected components to estimate character size globally and uses the projection profile to determine image quality. If the image is noisy, the line segmenter applies a "thresholding and smoothing" step before line extraction. During character segmentation, the character segmenter first tries to locate components that contain touching characters. If touching characters exist, an algorithm that combines profile-based splitting with classifier-based multiple-hypothesis processing is invoked to perform the segmentation.
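The abstract does not give implementation details, so the following is only a minimal sketch of the general projection-profile idea, not the authors' algorithm. It assumes a binary image stored as a list of rows with 1 for ink and 0 for background. All function names, the 1.5x width ratio used to flag a possibly touching pair, and the use of vertical-profile runs in place of true connected components are illustrative assumptions. The classifier-based multiple-hypothesis step is not modeled at all.

```python
from statistics import median


def horizontal_profile(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def vertical_profile(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(col) for col in zip(*image)]


def runs_above(profile, threshold=0):
    """Return (start, end) pairs, end exclusive, where profile > threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(image, threshold=0):
    """Text lines are taken to be row runs with ink above the threshold.

    Raising the threshold is a crude stand-in for the thresholding step
    the abstract mentions for noisy images (an assumption).
    """
    return runs_above(horizontal_profile(image), threshold)


def estimate_char_width(line_image):
    """Estimate character width as the median width of the ink runs.

    The paper uses connected-component statistics; column runs are a
    simplification used here for illustration.
    """
    widths = [end - start for start, end in runs_above(vertical_profile(line_image))]
    return median(widths) if widths else 0


def split_touching(line_image, char_width, ratio=1.5):
    """Split runs wider than ratio * char_width at their thinnest column.

    The ratio is a hypothetical parameter. A real system would generate
    several cut hypotheses and rank them with a classifier.
    """
    profile = vertical_profile(line_image)
    boxes = []
    for start, end in runs_above(profile):
        width = end - start
        if width <= ratio * char_width or width < 3:
            boxes.append((start, end))
            continue
        # Candidate cut: the interior column with the least ink.
        cut = min(range(start + 1, end - 1), key=lambda x: profile[x])
        boxes.extend([(start, cut), (cut, end)])
    return boxes
```

A typical use would be to call `segment_lines` on the page, slice out each line's rows, estimate the character width for that line, and pass both to `split_touching`. In this sketch the cut column is assigned to the right-hand box, which is an arbitrary choice.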
Stephen W. Lam, Qunfeng Liao, Sargur N. Srihari, "Divide-and-conquer approach to Japanese text segmentation," Proc. SPIE 2422, Document Recognition II, (30 March 1995); https://doi.org/10.1117/12.205824