15 December 2003 A general framework for multicharacter segmentation and its application in recognizing multilingual Asian documents
Abstract
In this paper we propose a general framework for character segmentation in complex multilingual documents, which attempts to combine the traditionally separate segmentation and recognition processes into a cooperative system. The framework contains three basic steps: Dissection, Local Optimization, and Global Optimization, which are designed to fuse various properties of the segmentation hypotheses hierarchically into a composite evaluation that decides the final recognition results. Experimental results show that this framework is general enough to be applied to a variety of documents. Finally, a sample system based on this framework that recognizes Chinese, Japanese, and Korean documents is described, and its experimental performance is reported.
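The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the projection-profile dissection, the `max_merge` limit, the `width_score` function (a stand-in for a real recognizer's confidence), and the dynamic-programming search are all assumptions chosen to show the general dissect, score locally, then optimize globally pattern.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]

def dissect(profile: List[int]) -> List[Span]:
    """Dissection (assumed form): split a vertical projection profile
    into runs of inked columns, returned as half-open (start, end) spans."""
    pieces, start = [], None
    for x, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = x
        elif ink == 0 and start is not None:
            pieces.append((start, x))
            start = None
    if start is not None:
        pieces.append((start, len(profile)))
    return pieces

def local_hypotheses(pieces: List[Span], score: Callable[[Span], float],
                     max_merge: int = 3) -> Dict[Tuple[int, int], float]:
    """Local optimization (assumed form): score every merge of up to
    max_merge consecutive pieces as one candidate character."""
    hyps = {}
    for i in range(len(pieces)):
        for j in range(i + 1, min(i + max_merge, len(pieces)) + 1):
            hyps[(i, j)] = score((pieces[i][0], pieces[j - 1][1]))
    return hyps

def global_best(n: int, hyps: Dict[Tuple[int, int], float]) -> List[Tuple[int, int]]:
    """Global optimization (assumed form): dynamic programming over the
    hypothesis lattice to find the piece grouping with the highest total score."""
    best = [float("-inf")] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            if (i, j) in hyps and best[i] + hyps[(i, j)] > best[j]:
                best[j] = best[i] + hyps[(i, j)]
                back[j] = i
    path, j = [], n
    while j > 0:
        path.append((back[j], j))
        j = back[j]
    return path[::-1]

def width_score(span: Span, nominal: int = 4) -> float:
    """Hypothetical scorer: penalize deviation from a nominal character width.
    A real system would use recognition confidence here."""
    return -abs((span[1] - span[0]) - nominal)

profile = [1, 1, 0, 1, 1, 0, 3, 3, 3, 3, 0, 2, 2, 2, 2]
pieces = dissect(profile)
grouping = global_best(len(pieces), local_hypotheses(pieces, width_score))
```

With this toy profile, the two narrow leading pieces are merged into one candidate character while the two wider pieces stay separate, which mirrors how the left and right components of a single Chinese, Japanese, or Korean character might be rejoined after over-segmentation.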
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Di Wen, Xiaoqing Ding, "A general framework for multicharacter segmentation and its application in recognizing multilingual Asian documents", Proc. SPIE 5296, Document Recognition and Retrieval XI, (15 December 2003); doi: 10.1117/12.528951; https://doi.org/10.1117/12.528951
PROCEEDINGS
8 PAGES

