3 April 1997 Producing good font attribute determination using error-prone information
This paper describes a method for estimating font attributes in an OCR system using detectors of individual attributes that are themselves error-prone. For an OCR system to preserve the appearance of a scanned document, it needs accurate detection of font attributes. However, OCR environments contain noise and other sources of error that tend to make font attribute detection unreliable. Certain assumptions about font use can greatly enhance accuracy. Attributes such as boldness and italics are more likely to change between neighboring words, while attributes such as serifness are less likely to change within the same paragraph. Furthermore, the document as a whole tends to have a limited number of sets of font attributes. These assumptions allow context to be used more effectively, giving better results than either the raw detector output or simpler methods that would oversmooth the data.
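The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea, not the author's method. It assumes a hypothetical per-word detector score in [0, 1] for one attribute and uses a simple two-state dynamic program (Viterbi-style) in which `switch_cost` is an assumed penalty for changing the attribute between neighboring words. A low penalty suits attributes such as boldness that may change from word to word; a high penalty suits attributes such as serifness that rarely change within a paragraph. The third assumption, that a document uses only a limited number of attribute sets, is not modeled here.

```python
from typing import List


def smooth_attribute(scores: List[float], switch_cost: float) -> List[bool]:
    """Pick the on/off labeling that minimizes detector disagreement
    plus a penalty for each change between neighboring words.

    scores[i]   -- hypothetical detector confidence that word i has the attribute
    switch_cost -- assumed penalty per change of state between neighbors
    """
    if not scores:
        return []

    def local_cost(score: float, state: int) -> float:
        # Disagreement between the detector score and the chosen state.
        return (1.0 - score) if state else score

    # cost[s]: best total cost of any labeling of the words so far ending in state s.
    cost = [local_cost(scores[0], 0), local_cost(scores[0], 1)]
    back = []  # back[i][s]: best previous state when word i+1 is in state s
    for score in scores[1:]:
        new_cost, pointers = [], []
        for state in (0, 1):
            stay = cost[state]
            switch = cost[1 - state] + switch_cost
            if stay <= switch:
                prev, best = state, stay
            else:
                prev, best = 1 - state, switch
            new_cost.append(best + local_cost(score, state))
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheaper final state.
    state = 0 if cost[0] <= cost[1] else 1
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    labels.reverse()
    return [bool(s) for s in labels]


# Made-up scores for illustration only.
bold_scores = [0.1, 0.2, 0.9, 0.8, 0.1]     # a genuine two-word bold run
serif_scores = [0.9, 0.8, 0.3, 0.9, 0.85]   # one noisy reading in a serif paragraph

bold_labels = smooth_attribute(bold_scores, switch_cost=0.3)
serif_labels = smooth_attribute(serif_scores, switch_cost=2.0)
```

With these made-up scores, the low penalty keeps the two-word bold run intact, while the high penalty overrides the single low serif reading. Applying the high penalty to boldness would erase the real bold run, which is the kind of oversmoothing the abstract warns against.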
© (1997) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Robert Cooperman, "Producing good font attribute determination using error-prone information", Proc. SPIE 3027, Document Recognition IV, (3 April 1997); doi: 10.1117/12.270079; https://doi.org/10.1117/12.270079
