4 February 2013 Generation of PDF with vector symbols from scanned document
This paper presents an algorithm for generating PDF files with vector symbols from scanned documents. The multi-stage technique segments the document into text/drawing areas and background, converts symbols to lines and Bezier curves, and stores the compressed background and foreground. We concentrate on symbol conversion, which comprises segmentation of symbol bodies with resolution enhancement, contour tracing, and approximation. The presented method outperforms competing solutions and achieves the best compression rate/quality ratio. Scaling the initial document to other sizes, as well as several printing/scanning-to-PDF iterations, exposes the advantages of the proposed approach to handling document images. A numerical vectorization quality metric was developed. OCR software results and a user opinion survey confirm the high quality of the proposed method.
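The abstract does not say how traced contours are approximated by Bezier curves, so the following is only an illustrative sketch of one common approach, not the authors' method: a least-squares fit of a single cubic Bezier segment to an ordered run of contour points, with the endpoints pinned and chord-length parameterization assumed. The function names `bezier_point`, `chord_length_params`, and `fit_cubic_bezier` are hypothetical.

```python
import math

def bezier_point(ctrl, t):
    """Evaluate a cubic Bezier curve (4 control points) at parameter t."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = ctrl
    s = 1.0 - t
    b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    return (b0 * x0 + b1 * x1 + b2 * x2 + b3 * x3,
            b0 * y0 + b1 * y1 + b2 * y2 + b3 * y3)

def chord_length_params(points):
    """Assign each point a parameter in [0, 1] proportional to path length."""
    dists = [math.dist(p, q) for p, q in zip(points, points[1:])]
    total = sum(dists)
    if total == 0.0:
        # All points coincide: fall back to uniform spacing.
        return [i / (len(points) - 1) for i in range(len(points))]
    ts, acc = [0.0], 0.0
    for d in dists:
        acc += d
        ts.append(acc / total)
    return ts

def fit_cubic_bezier(points):
    """Least-squares fit of one cubic Bezier to ordered contour points.

    Assumption (not from the paper): the first and last points are kept as
    the curve endpoints, and only the two inner control points are solved for.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    p0, p3 = points[0], points[-1]
    ts = chord_length_params(points)
    a11 = a12 = a22 = 0.0
    c1, c2 = [0.0, 0.0], [0.0, 0.0]
    for (qx, qy), t in zip(points, ts):
        s = 1.0 - t
        b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
        # Residual after removing the contribution of the fixed endpoints.
        rx = qx - b0 * p0[0] - b3 * p3[0]
        ry = qy - b0 * p0[1] - b3 * p3[1]
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        c1[0] += b1 * rx
        c1[1] += b1 * ry
        c2[0] += b2 * rx
        c2[1] += b2 * ry
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # Too few distinct interior samples: emit a straight segment.
        p1 = (p0[0] + (p3[0] - p0[0]) / 3, p0[1] + (p3[1] - p0[1]) / 3)
        p2 = (p0[0] + 2 * (p3[0] - p0[0]) / 3, p0[1] + 2 * (p3[1] - p0[1]) / 3)
    else:
        # Solve the 2x2 normal equations for the inner control points.
        p1 = ((a22 * c1[0] - a12 * c2[0]) / det, (a22 * c1[1] - a12 * c2[1]) / det)
        p2 = ((a11 * c2[0] - a12 * c1[0]) / det, (a11 * c2[1] - a12 * c1[1]) / det)
    return [p0, p1, p2, p3]
```

In a vectorization pipeline of this kind, a fit like this would typically be applied per contour segment, with a segment split and refitted whenever the maximum deviation between `bezier_point` and the traced points exceeds a tolerance; nearly straight runs collapse to line segments.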
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ilya V. Kurilin, Ilia V. Safonov, Michael N. Rychagov, Hokeun Lee, Sang Ho Kim, Donchul Choi, "Generation of PDF with vector symbols from scanned document", Proc. SPIE 8653, Image Quality and System Performance X, 86530R (4 February 2013); doi: 10.1117/12.2000527; https://doi.org/10.1117/12.2000527

