30 March 1995 Script determination in document images
Abstract
We have developed techniques for distinguishing which language is represented in an image of text. This work is restricted to an important subset of the world's languages, using techniques that should be applicable across even more comprehensive samples. The method first classifies the script into two broad classes: European and Asian. This classification is based on the spatial relationships of fiducial points related to the upward concavities in character structures. Script identification within the Asian class (Japanese, Chinese, Korean) is performed by analysis of the optical density distribution of the text images.
© (1995) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Larry Spitz, "Script determination in document images", Proc. SPIE 2422, Document Recognition II, (30 March 1995); doi: 10.1117/12.205831; https://doi.org/10.1117/12.205831
PROCEEDINGS
9 PAGES

