Open Access Article
Sensors 2016, 16(3), 346; doi:10.3390/s16030346

Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research

1
Institute for Information Technology and Communications (IIKT), Otto-von-Guericke-University Magdeburg, D-39016 Magdeburg, Germany
2
Faculty of Computers and Information, Menoufia University-MUFIC, Menoufia 32721, Egypt
3
Department of Computer, Umm Al-Qura University, Makkah 21421, Saudi Arabia
*
Authors to whom correspondence should be addressed.
Academic Editor: Vittorio M. N. Passaro
Received: 16 December 2015 / Revised: 26 February 2016 / Accepted: 29 February 2016 / Published: 11 March 2016
(This article belongs to the Section Physical Sensors)

Abstract

Document analysis tasks such as pattern recognition, word spotting, or segmentation require comprehensive databases for training and validation. Both the variation in writing style and the word list used matter when training samples should reflect the input of a specific area of application. However, generating training samples is expensive in manpower and time, particularly if complete text pages with complex ground truth are required. This is why such databases are scarce, especially for Arabic, the second most popular language. Moreover, Arabic handwriting recognition involves different preprocessing, segmentation, and recognition methods, each of which requires particular ground truth or samples for optimal training and validation, and these are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates the corresponding detailed ground truth. We use these syntheses to validate a new, segmentation-based system that recognizes handwritten Arabic words. We found that a modification of the Active Shape Model based character classifiers that we proposed earlier improves the word recognition accuracy. Further improvements are achieved by using a vocabulary of the 50,000 most common Arabic words for error correction.
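The abstract does not say how the vocabulary is applied during error correction. As a minimal sketch only, assuming the recognizer outputs a single candidate string and that correction means replacing it with the closest vocabulary entry by Levenshtein distance, the idea could look like the following. The function names, the `max_distance` threshold, and the three-word vocabulary are hypothetical illustrations, not the authors' method or data.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def correct_word(recognized: str, vocabulary: Iterable[str],
                 max_distance: int = 2) -> str:
    """Return the closest vocabulary word within max_distance edits.

    If no vocabulary word is close enough, the recognizer output is
    returned unchanged. max_distance is an assumed threshold.
    """
    best_word, best_distance = recognized, max_distance + 1
    for candidate in vocabulary:
        distance = edit_distance(recognized, candidate)
        if distance < best_distance:
            best_word, best_distance = candidate, distance
    return best_word


# Hypothetical stand-in for the 50,000-word vocabulary.
toy_vocabulary = ["كتاب", "كاتب", "مكتب"]

# A recognizer output with one wrong final letter.
corrected = correct_word("كتال", toy_vocabulary)
```

A linear scan over 50,000 words is workable for a sketch; a real system would more likely weight candidates by classifier confidence or word frequency, which this example does not attempt.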
Keywords: Arabic handwritings; optical character recognition (OCR); handwriting synthesis; digital pens; word segmentation; feature extraction and analysis; Active Shape Model; recognition and interpretation; intelligent systems
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Dinges, L.; Al-Hamadi, A.; Elzobi, M.; El-etriby, S. Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research. Sensors 2016, 16, 346.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Sensors, EISSN 1424-8220, published by MDPI AG, Basel, Switzerland.