3 March 2014 Text vectorization based on character recognition and character stroke modeling
Proceedings Volume 9027, Imaging and Multimedia Analytics in a Web and Mobile World 2014; 902707 (2014)
Event: IS&T/SPIE Electronic Imaging, 2014, San Francisco, California, United States
In this paper, a text vectorization method is proposed that uses OCR (Optical Character Recognition) and character stroke modeling. It is based on the observation that, for a particular character, the font glyphs may have different shapes but often share the same stroke structure. Like many other methods, the proposed algorithm consists of two procedures: dominant point determination and data fitting. The first partitions the outlines into segments, and the second fits a curve to each segment. In the proposed method, the dominant points are classified as “major” (specifying stroke structures) and “minor” (specifying serif shapes). A set of rules (parameters) is determined offline, specifying for each character the number of major and minor dominant points and, for each dominant point, the detection and fitting parameters (projection directions, boundary conditions, and smoothness). For minor points, multiple sets of parameters can be used for different fonts. During operation, OCR is performed and the parameters associated with the recognized character are selected. Both major and minor dominant points are detected through a maximization process specified by the parameter set. For minor points, an additional step can be performed to test the competing hypotheses and detect degenerate cases.
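The abstract does not spell out the maximization, so the following Python sketch is only one plausible reading of the dominant point step, not the authors' implementation. It assumes that each rule is a projection direction and that a dominant point is the outline point with the largest projection onto that direction. The `RULES` table, its entry for "L", and the function names `detect_dominant_point` and `split_outline` are all hypothetical; boundary conditions, smoothness, minor points, and the curve fitting procedure are left out.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical offline rule table: recognized character -> projection
# directions, one per dominant point. The values are illustrative only.
RULES: Dict[str, List[Point]] = {
    "L": [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)],
}


def detect_dominant_point(outline: Sequence[Point], direction: Point) -> int:
    """Return the index of the outline point with the largest projection
    (dot product) onto the given direction. Ties go to the first index."""
    dx, dy = direction
    return max(range(len(outline)),
               key=lambda i: outline[i][0] * dx + outline[i][1] * dy)


def split_outline(outline: Sequence[Point], char: str) -> List[List[Point]]:
    """Partition a closed outline into segments at the dominant points
    selected by the rules for the recognized character."""
    n = len(outline)
    cuts = sorted({detect_dominant_point(outline, d) for d in RULES[char]})
    segments = []
    # Walk consecutive cut pairs; the last segment wraps around to the first cut.
    for start, end in zip(cuts, cuts[1:] + [cuts[0] + n]):
        segments.append([outline[i % n] for i in range(start, end + 1)])
    return segments
```

Under these assumptions, `split_outline(outline, ocr_result)` would be called once per glyph outline after OCR has returned the character, and each returned segment (which shares its endpoints with its neighbors) would then be passed to a separate curve fitting routine.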
© (2014) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zhigang Fan, Bingfeng Zhou, Francis Tse, Yadong Mu, and Tao He "Text vectorization based on character recognition and character stroke modeling", Proc. SPIE 9027, Imaging and Multimedia Analytics in a Web and Mobile World 2014, 902707 (3 March 2014).


