A Holistic Technique for an Arabic OCR System
Abstract: Analytical approaches in Optical Character Recognition (OCR) systems can suffer from a significant number of segmentation errors, especially when dealing with cursive languages such as Arabic, where characters frequently overlap. Holistic approaches, which treat whole words as single units, were introduced as an effective way to avoid such segmentation errors. The main challenge for these approaches remains their computational complexity, especially in large-vocabulary applications. In this paper, we introduce a computationally efficient, holistic Arabic OCR system. A lexicon reduction approach based on clustering similarly shaped words is used to reduce recognition time. Using global word-level Discrete Cosine Transform (DCT) based features in combination with local block-based features, our proposed approach generalizes to new font sizes that were not included in the training data. Evaluation results on different test sets from modern and historical Arabic books are promising compared with state-of-the-art Arabic OCR systems.
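To make the ideas in the abstract concrete, the following is a minimal sketch of how global word-level DCT features, local block-based features, and cluster-based lexicon reduction could fit together. It is not the authors' implementation: the function names (`dct_matrix`, `word_features`, `recognize`), the choice of mean ink density as the block feature, the number of retained coefficients, the grid size, and the Euclidean distance are all assumptions made for illustration. Word images are assumed to be already segmented and normalized to a fixed size.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index
    m = np.arange(n)[None, :]   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)     # scale the DC row so the matrix is orthonormal
    return c

def word_features(img, n_coeffs=4, grid=(2, 4)):
    """Concatenate global DCT features with local block features.

    Assumption: the block feature is the mean pixel value per grid cell;
    the paper's actual block features may differ.
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    # Global part: low-frequency corner of the 2D DCT of the whole word image.
    coeffs = dct_matrix(h) @ img @ dct_matrix(w).T
    global_part = coeffs[:n_coeffs, :n_coeffs].ravel()
    # Local part: one value per cell of a coarse grid over the image.
    rows = np.array_split(np.arange(h), grid[0])
    cols = np.array_split(np.arange(w), grid[1])
    local_part = np.array([img[np.ix_(r, c)].mean() for r in rows for c in cols])
    return np.concatenate([global_part, local_part])

def recognize(feat, centroids, cluster_members, lexicon_feats):
    """Lexicon reduction: search only the cluster whose centroid is nearest."""
    nearest = int(np.argmin(np.linalg.norm(centroids - feat, axis=1)))
    candidates = cluster_members[nearest]
    dists = [np.linalg.norm(lexicon_feats[i] - feat) for i in candidates]
    return candidates[int(np.argmin(dists))]
```

In this sketch the cost of recognizing a word is one comparison per cluster centroid plus one per word in the selected cluster, rather than one per lexicon entry, which is the kind of saving a lexicon reduction step is meant to provide.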
Nashwan, F.M.A.; Rashwan, M.A.A.; Al-Barhamtoshy, H.M.; Abdou, S.M.; Moussa, A.M. A Holistic Technique for an Arabic OCR System. J. Imaging 2018, 4, 6.
Nashwan FMA, Rashwan MAA, Al-Barhamtoshy HM, Abdou SM, Moussa AM. A Holistic Technique for an Arabic OCR System. Journal of Imaging. 2018; 4(1):6.
Nashwan, Farhan M.A.; Rashwan, Mohsen A.A.; Al-Barhamtoshy, Hassanin M.; Abdou, Sherif M.; Moussa, Abdullah M. 2018. "A Holistic Technique for an Arabic OCR System." J. Imaging 4, no. 1: 6.