30 March 1995 Measuring document image skew and orientation
Abstract
Several approaches have previously been taken to identify document image skew. At issue are efficiency, accuracy, and robustness. We work directly with the image, maximizing a function of the number of ON pixels in a scanline. Image rotation is simulated either by vertical shear or by accumulating pixel counts along sloped lines. Taking differences of pixel sums on adjacent scanlines reduces isotropic background noise from non-text regions. To find the skew angle, this function is evaluated at a succession of angles. The angles are chosen hierarchically, typically with a coarse sweep followed by a fine angular bifurcation. To increase efficiency, measurements are made on subsampled images that have been pre-filtered to maximize sensitivity to image skew. Results are given for a large set of images, including images with multiple and unaligned text columns, graphics, and large-area halftones. The measured intrinsic angular error is inversely proportional to the number of sampling points on a scanline. This method does not indicate when text is upside-down, and it also requires sampling the function at 90 degrees of rotation to measure text skew in landscape mode. However, text orientation can be determined (as one of four directions) by noting that Roman characters in all languages have many more ascenders than descenders, and by using morphological operations to identify such pixels. Only a small amount of text is required for accurate statistical determination of orientation, and images without text are identified as such.
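The search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the scoring function is the sum of squared differences between pixel counts on adjacent sloped lines (the abstract says only that differences on adjacent scanlines are used), and it approximates the "fine angular bifurcation" by repeatedly halving the step around the best coarse angle. The names `profile_score`, `find_skew`, and `synthetic_lines`, and all parameter defaults, are hypothetical. Subsampling and pre-filtering are omitted.

```python
import math


def profile_score(on_pixels, height, width, angle_deg):
    """Score one candidate angle by accumulating ON pixels along sloped lines.

    Assumption: the score is the sum of squared differences between counts
    on adjacent sloped lines; the abstract does not give the exact function.
    """
    slope = math.tan(math.radians(angle_deg))
    # Sloped lines can start above or below the image, so pad the bin range.
    pad = int(abs(slope) * width) + 1
    sums = [0] * (height + 2 * pad)
    for x, y in on_pixels:
        sums[int(round(y - slope * x)) + pad] += 1
    return sum((b - a) ** 2 for a, b in zip(sums, sums[1:]))


def find_skew(on_pixels, height, width,
              sweep_range=5.0, sweep_step=1.0, min_step=0.01):
    """Coarse sweep over +/- sweep_range degrees, then step-halving refinement."""
    def score(angle):
        return profile_score(on_pixels, height, width, angle)

    # Coarse sweep at a fixed angular step.
    n = int(round(sweep_range / sweep_step))
    best = max((i * sweep_step for i in range(-n, n + 1)), key=score)
    best_score = score(best)

    # Fine search: probe both sides of the current best, halving the step.
    step = sweep_step / 2
    while step >= min_step:
        for angle in (best - step, best + step):
            s = score(angle)
            if s > best_score:
                best, best_score = angle, s
        step /= 2
    return best


def synthetic_lines(width, height, angle_deg, spacing=20, thickness=3):
    """Build a toy 'page' of solid bars skewed by angle_deg, as (x, y) ON pixels."""
    slope = math.tan(math.radians(angle_deg))
    pixels = []
    for y0 in range(spacing, height - spacing, spacing):
        for x in range(width):
            for t in range(thickness):
                y = int(round(y0 + t + slope * x))
                if 0 <= y < height:
                    pixels.append((x, y))
    return pixels
```

Calling `find_skew(synthetic_lines(400, 200, 2.3), 200, 400)` should return an angle close to 2.3 degrees. In this sketch a positive angle means that text lines drift toward larger y as x increases; that sign convention is a choice made here, not one taken from the paper.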
© (1995) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Dan S. Bloomberg, Gary E. Kopec, Lakshmi Dasari, "Measuring document image skew and orientation", Proc. SPIE 2422, Document Recognition II, (30 March 1995); doi: 10.1117/12.205832; https://doi.org/10.1117/12.205832
PROCEEDINGS
15 PAGES

