16 February 2010 Recognizing characters of ancient manuscripts
Proceedings Volume 7531, Computer Vision and Image Analysis of Art; 753106 (2010)
Event: IS&T/SPIE Electronic Imaging, 2010, San Jose, California, United States
For printed Latin text, the main issues of Optical Character Recognition (OCR) systems are solved. For degraded handwritten document images, however, even basic preprocessing steps such as binarization yield poor results with state-of-the-art methods. This paper investigates ancient Slavonic manuscripts from the 11th century. To minimize the consequences of false character segmentation, a binarization-free approach based on local descriptors is proposed. In addition, local information allows the recognition of partially visible or washed-out characters. The proposed algorithm consists of two steps: character classification and character localization. First, Scale Invariant Feature Transform (SIFT) features are extracted and classified using Support Vector Machines (SVM). Afterwards, the interest points are clustered according to their spatial information. The characters are thereby localized and finally recognized by a weighted voting scheme over the pre-classified local descriptors. Preliminary results show that the proposed system can handle highly degraded manuscript images with background clutter (e.g. stains, tears) and faded characters.
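The localization and voting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each interest point already carries a class label and a confidence weight from an upstream classifier (for example SIFT descriptors scored by an SVM), and it substitutes a simple distance-threshold clustering for whatever spatial clustering the paper uses. The function names, the `radius` parameter, and the sample labels are all hypothetical.

```python
from collections import defaultdict
from math import hypot


def cluster_points(points, radius):
    """Group point indices so that points within `radius` of each other
    (directly or through a chain of neighbours) share a cluster.
    Assumption: a plain distance threshold stands in for the paper's
    spatial clustering, which the abstract does not specify."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if hypot(dx, dy) <= radius:
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i in range(len(points)):
        clusters[find(i)].append(i)
    return list(clusters.values())


def weighted_vote(labels, weights, members):
    """Return the label with the largest summed weight among `members`."""
    scores = defaultdict(float)
    for i in members:
        scores[labels[i]] += weights[i]
    return max(scores, key=scores.get)


def recognize(points, labels, weights, radius):
    """Localize characters as cluster centroids and label each one by a
    weighted vote over its pre-classified interest points."""
    results = []
    for members in cluster_points(points, radius):
        cx = sum(points[i][0] for i in members) / len(members)
        cy = sum(points[i][1] for i in members) / len(members)
        results.append(((cx, cy), weighted_vote(labels, weights, members)))
    return sorted(results)


# Hypothetical input: five interest points belonging to two characters.
points = [(10, 10), (12, 11), (11, 13), (50, 10), (52, 12)]
labels = ["a", "a", "b", "v", "b"]       # per-descriptor class guesses
weights = [0.9, 0.8, 0.6, 0.7, 0.4]      # per-descriptor confidences

print([label for _, label in recognize(points, labels, weights, radius=5)])
```

In this toy input, a single misclassified descriptor in each cluster is outvoted by its neighbours, which is the intuition behind recognizing a character from several local descriptors rather than from one segmented glyph.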
© (2010) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Markus Diem and Robert Sablatnig, "Recognizing characters of ancient manuscripts", Proc. SPIE 7531, Computer Vision and Image Analysis of Art, 753106 (16 February 2010).
