28 January 2008 Large scale parallel document image processing
Proceedings Volume 6815, Document Recognition and Retrieval XV; 68150S (2008) https://doi.org/10.1117/12.765482
Event: Electronic Imaging, 2008, San Jose, California, United States
Abstract
Building a system that allows searching a very large database of document images requires professionalization of hardware and software, e-science, and web access. In astrophysics there is ample experience in dealing with large data sets, owing to an increasing number of measurement instruments. The digitization of historical documents of the Dutch cultural heritage poses a similar problem. This paper discusses the use of a system developed at the Kapteyn Institute of Astrophysics for the processing of large data sets, applied to the problem of creating a very large searchable archive of connected cursive handwritten texts. The system is adapted to the specific needs of processing document images. The work shows that interdisciplinary collaboration can be beneficial in the context of machine learning, data processing, and professionalization of image processing and retrieval systems.
© (2008) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Tijn van der Zant, Lambert Schomaker, and Edwin Valentijn "Large scale parallel document image processing", Proc. SPIE 6815, Document Recognition and Retrieval XV, 68150S (28 January 2008); https://doi.org/10.1117/12.765482
CITATIONS
Cited by 2 scholarly publications.
KEYWORDS
Databases

Image processing

Data processing

Computing systems

Astrophysics

Data archive systems

Cultural heritage
