22 December 1999 Imaged document information location and extraction using an optical correlator
Today, the paper document is fast becoming a thing of the past. With the rapid development of fast, inexpensive computing and storage devices, many government and private organizations are archiving their documents in electronic form (e.g., personnel records, medical records, and patents). Many of these organizations are converting their paper archives to electronic images, which are then stored in a computer database. As a result, there is a need to organize this data efficiently into comprehensive and accessible information resources and to provide rapid access to the information contained within these imaged documents. To meet this need, Litton PRC and Litton Data Systems Division are developing a system, the Imaged Document Optical Correlation and Conversion System (IDOCCS), to provide a total solution to the problem of managing and retrieving textual and graphic information from imaged document archives. At the heart of IDOCCS, optical correlation technology provides a means for the search and retrieval of information from imaged documents. IDOCCS can be used to rapidly search for key words or phrases within the imaged document archives and has the potential to determine the types of languages contained within a document. In addition, IDOCCS can automatically compare an input document with the archived database to determine whether it is a duplicate, thereby reducing the overall resources required to maintain and access the document database. Embedded graphics on imaged pages can also be exploited; for example, imaged documents containing an agency's seal or logo can be singled out. In this paper, we present a description of IDOCCS as well as preliminary performance results and theoretical projections.
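The abstract does not describe the correlator's internals. As a rough digital analogue of searching an imaged page for a key word, the sketch below computes a cross-correlation through the Fourier domain and reports where the correlation peaks. It is an illustration under stated assumptions, not the IDOCCS design: the function names `correlate_fft` and `locate` are hypothetical, the page and the key-word template are assumed to be 2-D grayscale NumPy arrays, and the correlation is circular (it wraps around the page edges).

```python
import numpy as np


def correlate_fft(page, template):
    """Circular cross-correlation of `page` with `template` via the FFT.

    Hypothetical helper for illustration; not taken from the paper.
    """
    page = np.asarray(page, dtype=float)
    template = np.asarray(template, dtype=float)
    # Zero-pad the mean-subtracted template to the page size so that
    # both spectra have the same shape.
    padded = np.zeros_like(page)
    h, w = template.shape
    padded[:h, :w] = template - template.mean()
    # Correlation theorem: corr = IFFT(FFT(page) * conj(FFT(template))).
    spectrum = np.fft.fft2(page - page.mean()) * np.conj(np.fft.fft2(padded))
    return np.real(np.fft.ifft2(spectrum))


def locate(page, template):
    """Return (row, col) of the strongest correlation peak.

    The peak marks the top-left corner of the best-matching region.
    """
    corr = correlate_fft(page, template)
    row, col = np.unravel_index(np.argmax(corr), corr.shape)
    return int(row), int(col)
```

Calling `locate(page, template)` with an image of a key word as `template` returns the position on `page` where that image matches best. A practical system would also threshold the peak height to decide whether the word is present at all; that step is omitted here.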
© (1999) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Bruce W. Stalcup, Phillip W. Dennis, Robert Barry Dydyk, "Imaged document information location and extraction using an optical correlator", Proc. SPIE 3967, Document Recognition and Retrieval VII, (22 December 1999); doi: 10.1117/12.373497; https://doi.org/10.1117/12.373497


Non-Manhattan layout extraction algorithm
Proceedings of SPIE (March 20 2013)
Table analysis for multiline cell identification
Proceedings of SPIE (December 20 2000)
Font group identification using reconstructed fonts
Proceedings of SPIE (January 24 2011)
Concept-based retrieval of biomedical images
Proceedings of SPIE (May 18 2003)
