17 January 2005 New statistical method for machine-printed Arabic character recognition
Proceedings Volume 5676, Document Recognition and Retrieval XII; (2005)
Event: Electronic Imaging 2005, San Jose, California, United States
Although about 300 million people worldwide use Arabic characters to write several different languages, Arabic OCR has not been researched as thoroughly as OCR for other widely used scripts such as Latin or Chinese. In this paper, a new statistical method is developed to recognize machine-printed Arabic characters. First, the entire Arabic character set is pre-classified into 32 subsets according to character form (Isolated, Final, Initial, Medial), the special zones a character occupies (divided according to the headline and the baseline of a text line), and component information (with or without secondary parts such as diacritical marks, movements, etc.). Then 12 types of directional features are extracted from character profiles. After dimension reduction by linear discriminant analysis (LDA), the features are passed to a modified quadratic discriminant function (MQDF), which serves as the final classifier. Finally, similar characters are discriminated before the recognition results are output. With properly selected parameters, encouraging experimental results on test sets demonstrate the validity of the proposed approach.
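The LDA-then-MQDF stage described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no parameter values, so the synthetic Gaussian data, the feature dimension, the number of retained eigenvectors `k`, and the choice of the minor-eigenvalue constant `delta` (here the mean of the discarded eigenvalues) are all assumptions made for illustration. The discriminant follows the commonly used MQDF form in which only the `k` leading eigenvectors of each class covariance are kept and the remaining eigenvalues are replaced by a constant. Pre-classification, the 12 directional features, and the similar-character step are not covered.

```python
import numpy as np

def fit_lda(X, y, n_components):
    """Return a (d, n_components) projection from within/between-class scatter."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Whiten the within-class scatter, then diagonalize the between-class scatter.
    w_vals, w_vecs = np.linalg.eigh(Sw)
    whiten = w_vecs / np.sqrt(np.maximum(w_vals, 1e-10))
    b_vals, b_vecs = np.linalg.eigh(whiten.T @ Sb @ whiten)
    top = np.argsort(b_vals)[::-1][:n_components]
    return whiten @ b_vecs[:, top]

def fit_mqdf(Z, y, k):
    """Per class: mean, k leading eigenpairs, and a constant for the minor eigenvalues."""
    assert 0 < k < Z.shape[1], "k must leave at least one minor eigenvalue"
    classes = np.unique(y)
    params = []
    for c in classes:
        Zc = Z[y == c]
        vals, vecs = np.linalg.eigh(np.cov(Zc, rowvar=False))
        order = np.argsort(vals)[::-1]
        vals, vecs = vals[order], vecs[:, order]
        # Assumption: delta is the mean of the discarded (minor) eigenvalues.
        delta = max(vals[k:].mean(), 1e-12)
        params.append((Zc.mean(axis=0), np.maximum(vals[:k], 1e-12), vecs[:, :k], delta))
    return classes, params

def predict_mqdf(Z, classes, params):
    """Assign each row of Z to the class with the smallest MQDF value."""
    d = Z.shape[1]
    scores = []
    for mu, vals, vecs, delta in params:
        diff = Z - mu
        proj = diff @ vecs                       # coordinates on the leading axes
        resid = np.maximum((diff ** 2).sum(1) - (proj ** 2).sum(1), 0.0)
        k = len(vals)
        scores.append((proj ** 2 / vals).sum(1) + resid / delta
                      + np.log(vals).sum() + (d - k) * np.log(delta))
    return classes[np.argmin(np.stack(scores, axis=1), axis=1)]

def demo():
    """Train and test on synthetic data (hypothetical stand-in for real features)."""
    rng = np.random.default_rng(0)
    means = np.array([[0, 0, 0, 0, 0, 0],
                      [6, 0, 0, 0, 0, 0],
                      [0, 6, 0, 0, 0, 0]], dtype=float)

    def sample(n):
        X = np.vstack([rng.normal(m, 1.0, size=(n, 6)) for m in means])
        return X, np.repeat(np.arange(len(means)), n)

    X_train, y_train = sample(100)
    X_test, y_test = sample(50)
    W = fit_lda(X_train, y_train, n_components=2)   # dimension reduction
    classes, params = fit_mqdf(X_train @ W, y_train, k=1)
    pred = predict_mqdf(X_test @ W, classes, params)
    return float((pred == y_test).mean())
```

Calling `demo()` returns the test accuracy on the synthetic data. In a real system the rows of `X_train` would be the directional feature vectors, and one such classifier would presumably be trained per pre-classified subset, though the abstract does not say so explicitly.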
© (2005) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Hua Wang, Xiaoqing Ding, Jianming Jin, and M. Halmurat, "New statistical method for machine-printed Arabic character recognition", Proc. SPIE 5676, Document Recognition and Retrieval XII, (17 January 2005).
