Over the past decades, text recognition technologies have focused largely on noncursive, isolated scripts. A text recognition system for the cursive Pashto script would be a valuable contribution, allowing traditional, cultural, and educational Pashto literature to be converted into machine-readable form. We propose the use of deep learning architectures based on transfer learning for the recognition of Pashto ligatures. For recognition analysis and evaluation, the ligature images in the dataset are preprocessed with data augmentation techniques, namely negatives, contours, and rotations, to increase both the variation of each sample and the size of the original dataset. Rich feature representations are extracted automatically from the Pashto ligature images by the deep convolutional layers of convolutional neural network (CNN) architectures, using a fine-tuning approach. The pretrained CNN architectures AlexNet, GoogleNet, and VGG (VGG-16 and VGG-19) are used for classification by feeding the extracted features to a fully connected layer and a softmax layer. The proposed deep transfer learning approach achieves high recognition rates for Pashto ligatures on the benchmark FAST-NU Pashto dataset. Accuracies of 97.24%, 97.46%, and 99.03% are achieved using the AlexNet, GoogleNet, and VGGNet architectures, respectively.
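The three augmentations named in the abstract (negatives, contours, and rotations) can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract gives no rotation angles, thresholds, or contour method, so the 90-degree rotation, the threshold of 128, the bright-foreground convention, the 4-neighbourhood contour rule, and every function name below are assumptions chosen only for illustration.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of pixel values in 0..255,
# with bright pixels treated as foreground (ink).
Image = List[List[int]]


def negative(img: Image) -> Image:
    """Invert intensities, so 0 becomes 255 and 255 becomes 0."""
    return [[255 - p for p in row] for row in img]


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise (the angle is an assumption)."""
    return [list(row) for row in zip(*img[::-1])]


def contour(img: Image, threshold: int = 128) -> Image:
    """Keep only foreground pixels that touch the background.

    Hypothetical rule: a pixel is on the contour if it is foreground and at
    least one of its 4 neighbours is background or outside the image.
    """
    h, w = len(img), len(img[0])

    def is_fg(r: int, c: int) -> bool:
        # Pixels outside the image count as background.
        return 0 <= r < h and 0 <= c < w and img[r][c] >= threshold

    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if is_fg(r, c) and not all(is_fg(nr, nc) for nr, nc in neighbours):
                out[r][c] = 255
    return out


def augment(img: Image) -> List[Image]:
    """Return the original image plus its negative, contour, and rotated variants."""
    return [img, negative(img), contour(img), rotate90(img)]
```

Applied to every ligature image, a scheme like this would multiply the dataset size by four while keeping each sample's class label unchanged. In practice an image library would replace these pure-Python loops.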
Optical character recognition
Data modeling
RGB color model
Convolution
Feature extraction
Analytical research
Network architectures