7 January 1999 Robust baseline-independent algorithms for segmentation and reconstruction of Arabic handwritten cursive script
Author Affiliations +
Abstract
The problem of cursive script segmentation is an essential one for handwritten character recognition. This is specially true for Arabic text where cursive is the only mode even for typewritten font. In this paper, we present a generalized segmentation approach for handwritten Arabic cursive scripts. The proposed approach is based on the analysis of the upper and lower contours of the word. The algorithm searchers for local minima points along the upper contour and local maxima points along the lower contour of the word. These points are then marked as potential letter boundaries (PLB). A set of rules, based on the nature of Arabic cursive scripts, are then applied to both upper and lower PLB points to eliminate some of the improper ones. A matching process between upper and lower PLBs is then performed in order to obtain the minimum number of non-overlapping PLB for each word. The output of the proposed segmentation algorithm is a set of labeled primitives that represent the Arabic word. In order to reconstruct the original word from its corresponding primitives and diacritics, a novel binding and dot assignment algorithm is introduced. The algorithm achieved correct segmentation rate of 97.7% when tested on samples of loosely constrained handwritten cursive script words consisting of 7922 characters written by 14 different writers.
© (1999) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Khaled Mostafa, Ahmed M. Darwish, "Robust baseline-independent algorithms for segmentation and reconstruction of Arabic handwritten cursive script", Proc. SPIE 3651, Document Recognition and Retrieval VI, (7 January 1999); doi: 10.1117/12.335805; https://doi.org/10.1117/12.335805
PROCEEDINGS
11 PAGES


SHARE
Back to Top