1 August 1992 Contextual analysis of machine-printed addresses
The assignment of a nine-digit ZIP Code (ZIP + 4 Code) to the digital image of a machine-printed address block is a problem of central importance in automated mail sorting. The problem is especially difficult because most addresses do not contain ZIP + 4 Codes, and the information that must be read to match an address to one of the 28 million entries in the ZIP + 4 file is often erroneous, incomplete, or missing altogether. This paper discusses a system that interprets a machine-printed address and assigns a ZIP + 4 Code using a constraint satisfaction approach. Words in an address block are first segmented and parsed to assign probable semantic categories. Word images are then recognized by a combination of digit, character, and word recognition algorithms. The control structure uses a constraint satisfaction problem-solving approach to match the recognition results to an entry in the ZIP + 4 file. It is shown how this technique can both determine correct responses and compensate for incomplete or erroneous information. Experimental results demonstrate the success of this system. In a recent test on over 1000 machine-printed address blocks, the ZIP + 4 encode rate was over 73 percent. By comparison, the success rate of current postal OCRs is about 45 percent. Additionally, the word recognition algorithm recognizes over 92 percent of the input images (over 98 percent in the top 10 choices).
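The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the field names, the toy `DIRECTORY`, the function `assign_zip4`, and the rule for relaxing a constraint are all assumptions made for illustration. Each parsed field contributes a list of top-choice recognition candidates, and each list acts as a constraint that narrows the set of directory entries. A missing field imposes no constraint, and a constraint that would eliminate every remaining entry is treated as a recognition error and dropped.

```python
from typing import Dict, List, Optional

# Hypothetical toy stand-in for the ZIP + 4 file (the real file has
# about 28 million entries); all field names are assumptions.
DIRECTORY = [
    {"number": "100", "street": "MAIN", "city": "BUFFALO", "zip5": "14201", "plus4": "1001"},
    {"number": "100", "street": "MAPLE", "city": "BUFFALO", "zip5": "14201", "plus4": "2002"},
    {"number": "250", "street": "MAIN", "city": "BUFFALO", "zip5": "14201", "plus4": "1005"},
    {"number": "100", "street": "MAIN", "city": "AMHERST", "zip5": "14226", "plus4": "3003"},
]


def assign_zip4(candidates: Dict[str, List[str]],
                directory: List[Dict[str, str]]) -> Optional[str]:
    """Return 'zip5-plus4' if exactly one entry satisfies the constraints.

    candidates maps a field name to its top-choice recognition results.
    Constraints are applied in the order the fields are given.
    """
    survivors = list(directory)
    for field, choices in candidates.items():
        if not choices:
            # Missing information: this field imposes no constraint.
            continue
        filtered = [entry for entry in survivors if entry[field] in choices]
        if filtered:
            survivors = filtered
        # Otherwise the constraint conflicts with everything that is left;
        # assume the field was misread and relax it (illustrative rule).
    if len(survivors) == 1:
        entry = survivors[0]
        return f"{entry['zip5']}-{entry['plus4']}"
    return None  # no entry or an ambiguous match: do not encode


# Incomplete input: the city word could not be read at all.
incomplete = {
    "zip5": ["14201"],
    "street": ["MAIN", "MARINE"],
    "number": ["100"],
    "city": [],
}

# Erroneous input: the five-digit ZIP Code was misread.
erroneous = {
    "zip5": ["14207"],
    "street": ["MAPLE"],
    "number": ["100"],
    "city": ["BUFFALO"],
}

# Ambiguous input: two entries remain, so no code is assigned.
ambiguous = {"street": ["MAIN"], "number": ["100"]}

print(assign_zip4(incomplete, DIRECTORY))
print(assign_zip4(erroneous, DIRECTORY))
print(assign_zip4(ambiguous, DIRECTORY))
```

Because constraints are relaxed in the order they are visited, the result of this sketch can depend on field order. A fuller treatment would rank candidates by recognition confidence and search over which constraints to relax.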
© (1992) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Peter B. Cullen, Tin Kam Ho, Jonathan J. Hull, Michal Prussak, Sargur N. Srihari, "Contextual analysis of machine-printed addresses", Proc. SPIE 1661, Machine Vision Applications in Character Recognition and Industrial Inspection, (1 August 1992); doi: 10.1117/12.130293; https://doi.org/10.1117/12.130293


