1 August 1992 Document understanding using layout styles of title page images
An important problem in the application of compound document architectures is the input of data from raster images. One technique is to use visual, syntactic cues found in the layout of the raster document to infer its logical structure or semantics. Another is to use context derived from characters recognized within a given block of raster data. Both character- and image-based information are considered here. A well-constrained environment is defined for use in developing rules that can be applied to basic book title page understanding. This paper identifies the attributes of title page layout objects which aid in mapping them into the fields of a simple bibliographic format. Using as input the raster images of the title page and the verso of the title page, along with the ASCII output of a generic character recognition engine from these same images, a system of rules is defined for generating a marked-up text wherein key bibliographic fields may be identified.
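The abstract does not list the rules themselves, so the following is only a minimal sketch of the general idea: layout attributes of a block (position, relative text size, which page it is on) are combined with its recognized text to assign a bibliographic field, and the result is emitted as marked-up text. The `Block` type, the field names, the thresholds, and every rule below are hypothetical illustrations, not the authors' rule system.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A layout object from a page image (hypothetical representation)."""
    text: str      # ASCII output of the character recognition engine
    top: float     # vertical position: 0.0 = top of page, 1.0 = bottom
    height: float  # text height relative to the largest text on the page
    page: str      # "title" for the title page, "verso" for its reverse


def classify(block: Block) -> str:
    """Assign a bibliographic field using assumed layout and text cues."""
    text = block.text.strip()
    lower = text.lower()
    if block.page == "verso":
        # Character-based cues: keywords typical of the verso (assumed).
        if "copyright" in lower or "©" in text:
            return "copyright"
        if "isbn" in lower:
            return "isbn"
        return "other"
    # Image-based cue: the largest text in the upper half is the title.
    if block.height >= 0.8 and block.top < 0.5:
        return "title"
    # Character-based cue: a leading "by" suggests an author statement.
    if lower.startswith("by ") or lower.startswith("edited by "):
        return "author"
    # Image-based cue: text near the bottom is taken as the imprint.
    if block.top > 0.75:
        return "publisher"
    return "other"


def mark_up(blocks: list[Block]) -> str:
    """Emit one tagged line per block, title page first, top to bottom."""
    ordered = sorted(blocks, key=lambda b: (b.page != "title", b.top))
    lines = []
    for b in ordered:
        field = classify(b)
        lines.append(f"<{field}>{b.text.strip()}</{field}>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        Block("Example Press", 0.9, 0.3, "title"),
        Block("A History of Maps", 0.2, 1.0, "title"),
        Block("by Jane Doe", 0.45, 0.4, "title"),
        Block("Copyright 1990 Example Press", 0.3, 0.2, "verso"),
    ]
    print(mark_up(sample))
```

The sample data is invented. A real system would need many more rules and some way to resolve conflicts between layout cues and recognized text.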
© (1992) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Louis H. Sharpe, Basil Manns, "Document understanding using layout styles of title page images", Proc. SPIE 1661, Machine Vision Applications in Character Recognition and Industrial Inspection, (1 August 1992); doi: 10.1117/12.130273; https://doi.org/10.1117/12.130273

