Search Results (15)

Search Parameters:
Keywords = Arabic OCR

15 pages, 2124 KiB  
Article
Toward Building a Domain-Based Dataset for Arabic Handwritten Text Recognition
by Khawlah Alhefdhi, Abdulmalik Alsalman and Safi Faizullah
Electronics 2025, 14(12), 2461; https://doi.org/10.3390/electronics14122461 - 17 Jun 2025
Viewed by 412
Abstract
The problem of automatic recognition of handwritten text has recently been widely discussed in the research community. Handwritten text recognition is considered a challenging task for cursive scripts, such as Arabic-language scripts, due to their complex properties. Although the demand for automatic text recognition is growing, especially to assist in digitizing archival documents, limited datasets are available for Arabic handwritten text compared to other languages. In this paper, we present novel work on building the Real Estate and Judicial Documents dataset (REJD dataset), which aims to facilitate the recognition of Arabic text in millions of archived documents. This paper also discusses the use of Optical Character Recognition and deep learning techniques, aiming to serve as the initial version in a series of experiments and enhancements designed to achieve optimal results. Full article

15 pages, 1403 KiB  
Article
BERTopic for Enhanced Idea Management and Topic Generation in Brainstorming Sessions
by Asma Cheddak, Tarek Ait Baha, Youssef Es-Saady, Mohamed El Hajji and Mohamed Baslam
Information 2024, 15(6), 365; https://doi.org/10.3390/info15060365 - 20 Jun 2024
Cited by 9 | Viewed by 5487
Abstract
Brainstorming is an important part of the design thinking process since it encourages creativity and innovation through bringing together diverse viewpoints. However, traditional brainstorming practices face challenges such as the management of large volumes of ideas. To address this issue, this paper introduces a decision support system that employs the BERTopic model to automate the brainstorming process, which enhances the categorization of ideas and the generation of coherent topics from textual data. The dataset for our study was assembled from a brainstorming session on “scholar dropouts”, where ideas were captured on Post-it notes, digitized through an optical character recognition (OCR) model, and enhanced using data augmentation with a language model, GPT-3.5, to ensure robustness. To assess the performance of our system, we employed both quantitative and qualitative analyses. Quantitative evaluations were conducted independently across various parameters, while qualitative assessments focused on the relevance and alignment of keywords with human-classified topics during brainstorming sessions. Our findings demonstrate that BERTopic outperforms traditional LDA models in generating semantically coherent topics. These results demonstrate the usefulness of our system in managing the complex nature of Arabic language data and improving the efficiency of brainstorming sessions. Full article
(This article belongs to the Special Issue Artificial Intelligence (AI) for Economics and Business Management)

24 pages, 4818 KiB  
Article
Recognition of Arabic Air-Written Letters: Machine Learning, Convolutional Neural Networks, and Optical Character Recognition (OCR) Techniques
by Khalid M. O. Nahar, Izzat Alsmadi, Rabia Emhamed Al Mamlook, Ahmad Nasayreh, Hasan Gharaibeh, Ali Saeed Almuflih and Fahad Alasim
Sensors 2023, 23(23), 9475; https://doi.org/10.3390/s23239475 - 28 Nov 2023
Cited by 18 | Viewed by 4171
Abstract
Air writing is one of the essential fields that the world is turning to, which can benefit from the world of the metaverse, as well as the ease of communication between humans and machines. The research literature on air writing and its applications shows significant work in English and Chinese, while little research is conducted in other languages, such as Arabic. To fill this gap, we propose a hybrid model that combines feature extraction with deep learning models and then uses machine learning (ML) and optical character recognition (OCR) methods and applies grid and random search optimization algorithms to obtain the best model parameters and outcomes. Several machine learning methods (e.g., neural networks (NNs), random forest (RF), K-nearest neighbours (KNN), and support vector machine (SVM)) are applied to deep features extracted from deep convolutional neural networks (CNNs), such as VGG16, VGG19, and SqueezeNet. Our study uses the AHAWP dataset, which consists of diverse writing styles and hand sign variations, to train and evaluate the models. Preprocessing schemes are applied to improve data quality by reducing bias. Furthermore, OCR methods are integrated into our model to isolate individual letters from continuous air-written gestures and improve recognition results. The results of this study showed that the proposed model achieved the best accuracy of 88.8% using NN with VGG16. Full article
(This article belongs to the Section Optical Sensors)

17 pages, 4419 KiB  
Article
A Deep Learning Approach for Arabic Manuscripts Classification
by Lutfieh S. Al-homed, Kamal M. Jambi and Hassanin M. Al-Barhamtoshy
Sensors 2023, 23(19), 8133; https://doi.org/10.3390/s23198133 - 28 Sep 2023
Cited by 5 | Viewed by 3110
Abstract
For centuries, libraries worldwide have preserved ancient manuscripts due to their immense historical and cultural value. However, over time, both natural and human-made factors have led to the degradation of many ancient Arabic manuscripts, causing the loss of significant information, such as authorship, titles, or subjects, rendering them as unknown manuscripts. Although catalog cards attached to these manuscripts might contain some of the missing details, these cards have degraded significantly in quality over the decades within libraries. This paper presents a framework for identifying these unknown ancient Arabic manuscripts by processing the catalog cards associated with them. Given the challenges posed by the degradation of these cards, simple optical character recognition (OCR) is often insufficient. The proposed framework uses deep learning architecture to identify unknown manuscripts within a collection of ancient Arabic documents. This involves locating, extracting, and classifying the text from these catalog cards, along with implementing processes for region-of-interest identification, rotation correction, feature extraction, and classification. The results demonstrate the effectiveness of the proposed method, achieving an accuracy rate of 92.5%, compared to 83.5% with classical image classification and 81.5% with OCR alone. Full article
(This article belongs to the Section Physical Sensors)

33 pages, 3518 KiB  
Review
Analysis of Recent Deep Learning Techniques for Arabic Handwritten-Text OCR and Post-OCR Correction
by Rayyan Najam and Safiullah Faizullah
Appl. Sci. 2023, 13(13), 7568; https://doi.org/10.3390/app13137568 - 27 Jun 2023
Cited by 24 | Viewed by 9000
Abstract
Arabic handwritten-text recognition applies an OCR technique and then a text-correction technique to extract the text within an image correctly. Deep learning is a current paradigm utilized in OCR techniques. However, no study investigated or critically analyzed recent deep-learning techniques used for Arabic handwritten OCR and text correction during the period of 2020–2023. This analysis fills this noticeable gap in the literature, uncovering recent developments and their limitations for researchers, practitioners, and interested readers. The results reveal that CNN-LSTM-CTC is the most suitable architecture among Transformer and GANs for OCR because it is less complex and can hold long textual dependencies. For OCR text correction, applying DL models to generated errors in datasets improved accuracy in many works. In conclusion, Arabic OCR has the potential to further apply several text-embedding models to correct the resultant text from the OCR, and there is a significant gap in studies investigating this problem. In addition, there is a need for more high-quality and domain-specific OCR Arabic handwritten datasets. Moreover, we recommend the practical development of a space for future trends in Arabic OCR applications, derived from current limitations in Arabic OCR works and from applications in other languages; this will involve a plethora of possibilities that have not been effectively researched at the time of writing. Full article

27 pages, 2712 KiB  
Review
A Survey of OCR in Arabic Language: Applications, Techniques, and Challenges
by Safiullah Faizullah, Muhammad Sohaib Ayub, Sajid Hussain and Muhammad Asad Khan
Appl. Sci. 2023, 13(7), 4584; https://doi.org/10.3390/app13074584 - 4 Apr 2023
Cited by 46 | Viewed by 16235
Abstract
Optical character recognition (OCR) is the process of extracting handwritten or printed text from a scanned or printed image and converting it to a machine-readable form for further data processing, such as searching or editing. Automatic text extraction using OCR helps to digitize documents for improved productivity and accessibility and for preservation of historical documents. This paper provides a survey of the current state-of-the-art applications, techniques, and challenges in Arabic OCR. We present the existing methods for each step of the complete OCR process to identify the best-performing approach for improved results. This paper follows the keyword-search method for reviewing the articles related to Arabic OCR, including the backward and forward citations of the article. In addition to state-of-the-art techniques, this paper identifies research gaps and presents future directions for Arabic OCR. Full article
(This article belongs to the Special Issue Digital Image Processing: Advanced Technologies and Applications)

21 pages, 1415 KiB  
Article
Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory
by Zohreh Khosrobeigi, Hadi Veisi, Ehsan Hoseinzade and Hanieh Shabanian
Appl. Sci. 2022, 12(22), 11760; https://doi.org/10.3390/app122211760 - 19 Nov 2022
Cited by 10 | Viewed by 6542
Abstract
Optical Character Recognition (OCR) is a system for converting images that contain text into editable text and is applied to various languages such as English, Arabic, and Persian. While these languages have similarities, their fundamental differences can create unique challenges. In Persian, continuity between characters, the existence of semicircles, dots, oblique, and left-to-right characters such as English words in the context are some of the most important challenges in designing Persian OCR systems. Our proposed framework, Bina, is designed in a special way to address the issue of continuity by utilizing a Convolutional Neural Network (CNN) and deep bidirectional Long Short-Term Memory (BLSTM), a type of LSTM network that has access to both past and future context. A huge and diverse dataset, including about 2M samples of both Persian and English contexts, consisting of various fonts and sizes, is also generated to train and test the performance of the proposed model. Various configurations are tested to find the optimal structure of CNN and BLSTM. The results show that Bina successfully outperformed the state-of-the-art baseline algorithm by achieving about 96% accuracy in the Persian and 88% accuracy in the Persian and English contexts. Full article
(This article belongs to the Special Issue Computer Vision and Pattern Recognition Based on Deep Learning)

20 pages, 2854 KiB  
Article
Novel Perspectives for the Management of Multilingual and Multialphabetic Heritages through Automatic Knowledge Extraction: The DigitalMaktaba Approach
by Sonia Bergamaschi, Stefania De Nardis, Riccardo Martoglia, Federico Ruozzi, Luca Sala, Matteo Vanzini and Riccardo Amerigo Vigliermo
Sensors 2022, 22(11), 3995; https://doi.org/10.3390/s22113995 - 25 May 2022
Cited by 17 | Viewed by 3338
Abstract
The linguistic and social impact of multiculturalism can no longer be neglected in any sector, creating the urgent need of creating systems and procedures for managing and sharing cultural heritages in both supranational and multi-literate contexts. In order to achieve this goal, text sensing appears to be one of the most crucial research areas. The long-term objective of the DigitalMaktaba project, born from interdisciplinary collaboration between computer scientists, historians, librarians, engineers and linguists, is to establish procedures for the creation, management and cataloguing of archival heritage in non-Latin alphabets. In this paper, we discuss the currently ongoing design of an innovative workflow and tool in the area of text sensing, for the automatic extraction of knowledge and cataloguing of documents written in non-Latin languages (Arabic, Persian and Azerbaijani). The current prototype leverages different OCR, text processing and information extraction techniques in order to provide both a highly accurate extracted text and rich metadata content (including automatically identified cataloguing metadata), overcoming typical limitations of current state of the art approaches. The initial tests provide promising results. The paper includes a discussion of future steps (e.g., AI-based techniques further leveraging the extracted data/metadata and making the system learn from user feedback) and of the many foreseen advantages of this research, both from a technical and a broader cultural-preservation and sharing point of view. Full article
(This article belongs to the Collection Sensors and Communications for the Social Good)

20 pages, 6349 KiB  
Article
Exploiting Script Similarities to Compensate for the Large Amount of Data in Training Tesseract LSTM: Towards Kurdish OCR
by Saman Idrees and Hossein Hassani
Appl. Sci. 2021, 11(20), 9752; https://doi.org/10.3390/app11209752 - 19 Oct 2021
Cited by 5 | Viewed by 4441
Abstract
Applications based on Long Short-Term Memory (LSTM) require large amounts of data for their training. Tesseract LSTM is a popular Optical Character Recognition (OCR) engine that has been trained and used in various languages. However, its training becomes obstructed when the target language is not well resourced. This research suggests a remedy for the problem of scant data in training Tesseract LSTM for a new language by exploiting a training dataset for a language with a similar script. The target of the experiment is Kurdish. It is a multi-dialect language and is considered less-resourced. We choose Sorani, one of the Kurdish dialects, that is mostly written in Persian-Arabic script. We train Tesseract using an Arabic dataset, and then we use a considerably small amount of text in Persian-Arabic to train the engine to recognize Sorani texts. Our dataset is based on a series of court case documents in the Kurdistan Region of Iraq. We also fine-tune the engine using 10 Unikurd fonts. We use Lstmeval and Ocreval to evaluate the outputs. The result indicates the achievement of 95.45% accuracy. We also test the engine using texts outside the context of court cases. The accuracy of the system remains close to what was found earlier, indicating that the script similarity could be used to overcome the lack of large-scale data. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

17 pages, 10112 KiB  
Article
CNN-Based Page Segmentation and Object Classification for Counting Population in Ottoman Archival Documentation
by Yekta Said Can and M. Erdem Kabadayı
J. Imaging 2020, 6(5), 32; https://doi.org/10.3390/jimaging6050032 - 14 May 2020
Cited by 13 | Viewed by 4587
Abstract
Historical document analysis systems gain importance with the increasing efforts in the digitalization of archives. Page segmentation and layout analysis are crucial steps for such systems. Errors in these steps will affect the outcome of handwritten text recognition and Optical Character Recognition (OCR) methods, which increases the importance of the page segmentation and layout analysis. Degradation of documents, digitization errors, and varying layout styles are the issues that complicate the segmentation of historical documents. The properties of Arabic scripts such as connected letters, ligatures, diacritics, and different writing styles make it even more challenging to process Arabic script historical documents. In this study, we developed an automatic system for counting registered individuals and assigning them to populated places by using a CNN-based architecture. To evaluate the performance of our system, we created a labeled dataset of registers obtained from the first wave of population registers of the Ottoman Empire held between the 1840s and 1860s. We achieved promising results for classifying different types of objects and counting the individuals and assigning them to populated places. Full article
(This article belongs to the Special Issue Recent Advances in Historical Document Processing)

23 pages, 863 KiB  
Article
Generative vs. Discriminative Recognition Models for Off-Line Arabic Handwriting
by Moftah Elzobi and Ayoub Al-Hamadi
Sensors 2018, 18(9), 2786; https://doi.org/10.3390/s18092786 - 24 Aug 2018
Cited by 4 | Viewed by 3252
Abstract
The majority of handwritten word recognition strategies are constructed on learning-based generative frameworks from letter or word training samples. Theoretically, constructing recognition models through discriminative learning should be the more effective alternative. The primary goal of this research is to compare the performances of discriminative and generative recognition strategies, which are described by generatively-trained hidden Markov modeling (HMM), discriminatively-trained conditional random fields (CRF) and discriminatively-trained hidden-state CRF (HCRF). With learning samples obtained from two dissimilar databases, we initially trained and applied an HMM classification scheme. To enable HMM classifiers to effectively reject incorrect and out-of-vocabulary segmentation, we enhance the models with adaptive threshold schemes. Aside from proposing such schemes for HMM classifiers, this research introduces CRF and HCRF classifiers in the recognition of offline Arabic handwritten words. Furthermore, the efficiencies of all three strategies are fully assessed using two dissimilar databases. Recognition outcomes for both words and letters are presented, with the pros and cons of each strategy emphasized. Full article

2 pages, 150 KiB  
Editorial
Document Image Processing
by Laurence Likforman-Sulem and Ergina Kavallieratou
J. Imaging 2018, 4(7), 84; https://doi.org/10.3390/jimaging4070084 - 22 Jun 2018
Cited by 2 | Viewed by 4320
(This article belongs to the Special Issue Document Image Processing)
19 pages, 5686 KiB  
Article
Open Datasets and Tools for Arabic Text Detection and Recognition in News Video Frames
by Oussama Zayene, Sameh Masmoudi Touj, Jean Hennebert, Rolf Ingold and Najoua Essoukri Ben Amara
J. Imaging 2018, 4(2), 32; https://doi.org/10.3390/jimaging4020032 - 31 Jan 2018
Cited by 10 | Viewed by 10442
Abstract
Recognizing texts in video is more complex than in other environments such as scanned documents. Video texts appear in various colors, unknown fonts and sizes, often affected by compression artifacts and low quality. In contrast to Latin texts, there are no publicly available datasets which cover all aspects of the Arabic Video OCR domain. This paper describes a new well-defined and annotated Arabic-Text-in-Video dataset called AcTiV 2.0. The dataset is dedicated especially to building and evaluating Arabic video text detection and recognition systems. AcTiV 2.0 contains 189 video clips serving as a raw material for creating 4063 key frames for the detection task and 10,415 cropped text images for the recognition task. AcTiV 2.0 is also distributed with its annotation and evaluation tools that are made open-source for standardization and validation purposes. This paper also reports on the evaluation of several systems tested under the proposed detection and recognition protocols. Full article
(This article belongs to the Special Issue Document Image Processing)

11 pages, 1550 KiB  
Article
A Holistic Technique for an Arabic OCR System
by Farhan M. A. Nashwan, Mohsen A. A. Rashwan, Hassanin M. Al-Barhamtoshy, Sherif M. Abdou and Abdullah M. Moussa
J. Imaging 2018, 4(1), 6; https://doi.org/10.3390/jimaging4010006 - 27 Dec 2017
Cited by 23 | Viewed by 8046
Abstract
Analytical based approaches in Optical Character Recognition (OCR) systems can endure a significant amount of segmentation errors, especially when dealing with cursive languages such as the Arabic language with frequent overlapping between characters. Holistic based approaches that consider whole words as single units were introduced as an effective approach to avoid such segmentation errors. Still, the main challenge for these approaches is their computational complexity, especially when dealing with large vocabulary applications. In this paper, we introduce a computationally efficient, holistic Arabic OCR system. A lexicon reduction approach based on clustering similar shaped words is used to reduce recognition time. Using global word level Discrete Cosine Transform (DCT) based features in combination with local block based features, our proposed approach managed to generalize for new font sizes that were not included in the training data. Evaluation results for the approach using different test sets from modern and historical Arabic books are promising compared with state-of-the-art Arabic OCR systems. Full article
(This article belongs to the Special Issue Document Image Processing)

25 pages, 7261 KiB  
Article
Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research
by Laslo Dinges, Ayoub Al-Hamadi, Moftah Elzobi and Sherif El-etriby
Sensors 2016, 16(3), 346; https://doi.org/10.3390/s16030346 - 11 Mar 2016
Cited by 11 | Viewed by 8318
Abstract
Document analysis tasks such as pattern recognition, word spotting or segmentation, require comprehensive databases for training and validation. Not only variations in writing style but also the used list of words is of importance in the case that training samples should reflect the input of a specific area of application. However, generation of training samples is expensive in the sense of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is a lack of such databases, especially for Arabic, the second most popular language. However, Arabic handwriting recognition involves different preprocessing, segmentation and recognition methods. Each requires particular ground truth or samples to enable optimal training and validation, which are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates corresponding detailed ground truth. We use these syntheses to validate a new, segmentation based system that recognizes handwritten Arabic words. We found that a modification of an Active Shape Model based character classifier, which we proposed earlier, improves the word recognition accuracy. Further improvements are achieved by using a vocabulary of the 50,000 most common Arabic words for error correction. Full article
(This article belongs to the Section Physical Sensors)