4 February 2013 Preprocessing document images by resampling is error prone and unnecessary
Integrity tests are proposed for image processing algorithms that should yield essentially the same output under 90-degree rotations, edge-padding, and monotonic gray-scale transformations of scanned documents. The tests are demonstrated on built-in functions of the Matlab Image Processing Toolbox. Only the routine that reports the area of the convex hull of foreground components fails the rotation test. Ensuring that preprocessing operations based on resampling an image, such as size and skew normalization, are error-free requires more radical treatment. Even if faultlessly implemented, resampling is generally irreversible and may introduce artifacts. Fortunately, advances in storage and processor technology have all but eliminated any advantage of preprocessing or compressing document images by resampling them. Using floating-point coordinate transformations instead of resampling images yields accurate run-length, moment, slope, and other geometric features.
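The kind of integrity test described above can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the Matlab code used in the paper. The feature functions (`foreground_area`, `bbox_area`), the check helpers, the fixed threshold of 128, and the reading of "edge-padding" as adding a border of background pixels are all assumptions made for this example. The last function illustrates the idea of transforming foreground pixel coordinates in floating point rather than resampling the image.

```python
import numpy as np

def foreground_area(img, threshold=128):
    """Number of dark (foreground) pixels; a hypothetical routine under test."""
    return int(np.count_nonzero(img < threshold))

def bbox_area(img, threshold=128):
    """Area of the foreground bounding box; a hypothetical routine under test."""
    rows, cols = np.nonzero(img < threshold)
    if rows.size == 0:
        return 0
    return int((rows.max() - rows.min() + 1) * (cols.max() - cols.min() + 1))

def check_rotation_and_padding(feature, img, background=255):
    """The feature should not change under 90-degree rotations or a background border."""
    ref = feature(img)
    rotation_ok = all(feature(np.rot90(img, k)) == ref for k in (1, 2, 3))
    # Assumption: "edge-padding" is read here as padding with background pixels.
    padded = np.pad(img, 3, mode="constant", constant_values=background)
    return {"rotation": rotation_ok, "padding": feature(padded) == ref}

def check_gray_scale(feature, img, threshold=128.0):
    """The feature should not change when a strictly increasing gray-level map
    is applied to both the image and the threshold."""
    def remap(v):
        return 255.0 * np.sqrt(np.asarray(v, dtype=float) / 255.0)
    return feature(remap(img), threshold=remap(threshold)) == feature(img, threshold=threshold)

def rotated_coordinates(img, angle_rad, threshold=128):
    """Rotate foreground pixel coordinates in floating point; the image is not resampled."""
    rows, cols = np.nonzero(img < threshold)
    x, y = cols.astype(float), rows.astype(float)
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return c * x - s * y, s * x + c * y

# Small synthetic test image: a dark L-shaped stroke on a white background.
img = np.full((6, 8), 255, dtype=np.uint8)
img[1:5, 2] = 0
img[4, 2:6] = 0

for feature in (foreground_area, bbox_area):
    print(feature.__name__, check_rotation_and_padding(feature, img),
          check_gray_scale(feature, img))
```

Because `np.rot90` only permutes pixels, any discrepancy reported by the rotation check points to the routine under test rather than to interpolation. Features such as moments or slopes can then be computed from the output of `rotated_coordinates` without ever producing a resampled image.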
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
George Nagy, "Preprocessing document images by resampling is error prone and unnecessary", Proc. SPIE 8658, Document Recognition and Retrieval XX, 86580U (4 February 2013); doi: 10.1117/12.2006115; https://doi.org/10.1117/12.2006115

