Search Results (14)

Search Parameters:
Keywords = Arabic Speech-To-Text

24 pages, 2410 KiB  
Article
UA-HSD-2025: Multi-Lingual Hate Speech Detection from Tweets Using Pre-Trained Transformers
by Muhammad Ahmad, Muhammad Waqas, Ameer Hamza, Sardar Usman, Ildar Batyrshin and Grigori Sidorov
Computers 2025, 14(6), 239; https://doi.org/10.3390/computers14060239 - 18 Jun 2025
Cited by 1 | Viewed by 673
Abstract
The rise of social media has improved communication but also amplified the spread of hate speech, creating serious societal risks. Automated detection remains difficult due to subjectivity, linguistic diversity, and implicit language. While prior research focuses on high-resource languages, this study addresses the underexplored multilingual challenges of Arabic and Urdu hate speech. It makes four key contributions. First, we created a multilingual, manually annotated binary and multi-class dataset (UA-HSD-2025) sourced from X, which contains the five most important multi-class categories of hate speech. Second, we created detailed annotation guidelines to make the dataset robust. Third, we explore two strategies for handling multilingual data: a translation-based approach and a joint multilingual approach. The translation-based approach converts all input text into a single target language before applying a classifier. In contrast, the joint multilingual approach employs a unified model trained to handle multiple languages simultaneously, so it can classify text across languages without translation. Finally, we ran 54 experiments covering machine learning with TF-IDF features, deep learning with pre-trained word embeddings such as FastText and GloVe, and pre-trained language models with contextual embeddings. Our language model (XLM-R) outperformed traditional supervised learning approaches, achieving 0.99 accuracy in binary classification for the Arabic, Urdu, and joint-multilingual datasets, and 0.95, 0.94, and 0.94 accuracy in multi-class classification for the joint-multilingual, Arabic, and Urdu datasets, respectively.
(This article belongs to the Special Issue Recent Advances in Social Networks and Social Media)

18 pages, 373 KiB  
Article
Machine Learning- and Deep Learning-Based Multi-Model System for Hate Speech Detection on Facebook
by Amna Naseeb, Muhammad Zain, Nisar Hussain, Amna Qasim, Fiaz Ahmad, Grigori Sidorov and Alexander Gelbukh
Algorithms 2025, 18(6), 331; https://doi.org/10.3390/a18060331 - 1 Jun 2025
Cited by 2 | Viewed by 696
Abstract
Hate speech is a complex topic that transcends language, culture, and social spheres. Recently, the spread of hate speech on social media sites like Facebook has added a new layer of complexity to online safety and content moderation. This study seeks to mitigate this problem by developing an Arabic script-based tool for automatically detecting hate speech in Roman Urdu, an informal script commonly used in South Asian digital communication. Roman Urdu is relatively complex because it has no standardized spellings, leading to syntactic variations that increase the difficulty of hate speech detection. To tackle this problem, we combine six machine learning (ML) and four deep learning (DL) models with a dataset of Facebook comments that was preprocessed (tokenization, stopword removal, etc.) and vectorized (TF-IDF, word embeddings). The ML algorithms used in this study are LR, SVM, RF, NB, KNN, and GBM. The DL architectures are CNN, RNN, LSTM, and GRU, used to increase classification accuracy further. The experimental results show that the deep learning models outperform the traditional ML approaches by a significant margin, with CNN and LSTM achieving accuracies of 95.1% and 96.2%, respectively. As far as we are aware, this is the first work that investigates QLoRA for fine-tuning large models for offensive language detection in Roman Urdu.
(This article belongs to the Special Issue Linguistic and Cognitive Approaches to Dialog Agents)

18 pages, 585 KiB  
Article
Improving Diacritical Arabic Speech Recognition: Transformer-Based Models with Transfer Learning and Hybrid Data Augmentation
by Haifa Alaqel and Khalil El Hindi
Information 2025, 16(3), 161; https://doi.org/10.3390/info16030161 - 20 Feb 2025
Viewed by 1619
Abstract
Diacritical Arabic (DA) refers to Arabic text with diacritical marks that guide pronunciation and clarify meanings, making their recognition crucial for accurate linguistic interpretation. These diacritical marks (short vowels) significantly influence meaning and pronunciation, and their accurate recognition is vital for the effectiveness of automatic speech recognition (ASR) systems, particularly in applications requiring high semantic precision, such as voice-enabled translation services. Despite its importance, leveraging advanced machine learning techniques to enhance ASR for diacritical Arabic has remained underexplored. A key challenge in developing DA ASR is the limited availability of training data. This study introduces a transformer-based approach leveraging transfer learning and data augmentation to address these challenges. Using a cross-lingual speech representation (XLSR) model pretrained on 53 languages, we fine-tune it on DA and integrate connectionist temporal classification (CTC) with transformers for improved performance. Data augmentation techniques, including volume adjustment, pitch shift, speed alteration, and hybrid strategies, further mitigate data limitations, significantly reducing word error rates (WER). Our methods achieve a WER of 12.17%, outperforming traditional ASR systems and setting a new benchmark for DA ASR. These findings demonstrate the potential of advanced machine learning to address longstanding challenges in DA ASR and enhance its accuracy. Full article
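The word error rate (WER) quoted in this abstract is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names `edit_distance` and `word_error_rate` are illustrative assumptions, and whitespace tokenization is assumed.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref tokens seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
# Two diacritized forms of the same consonant skeleton count as different words.
print(word_error_rate("كَتَبَ الولد", "كُتِبَ الولد"))
```

Because tokens are compared as exact strings, a diacritic error counts as a full word substitution, which is one reason diacritized transcription tends to be scored more harshly than undiacritized transcription.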

20 pages, 1420 KiB  
Article
A Survey of Grapheme-to-Phoneme Conversion Methods
by Shiyang Cheng, Pengcheng Zhu, Jueting Liu and Zehua Wang
Appl. Sci. 2024, 14(24), 11790; https://doi.org/10.3390/app142411790 - 17 Dec 2024
Cited by 3 | Viewed by 3677
Abstract
Grapheme-to-phoneme conversion (G2P) is the task of converting letters (grapheme sequences) into their pronunciations (phoneme sequences). It plays a crucial role in natural language processing, text-to-speech synthesis, and automatic speech recognition systems. This paper provides a systematic overview of G2P conversion from several perspectives. It first presents the conversion methods, with detailed discussion of those based on deep learning. For each method, the key ideas, advantages, disadvantages, and representative models are summarized. The paper then covers learning strategies and multilingual G2P conversion. It also summarizes the commonly used monolingual and multilingual datasets, including Mandarin, Japanese, and Arabic, among others. Two tables compare the performance of various methods on the relevant datasets. After this general overview, the paper concludes with the current issues and future directions of deep learning-based G2P conversion.
(This article belongs to the Collection Trends and Prospects in Multimedia)

20 pages, 4970 KiB  
Article
Revealing the Next Word and Character in Arabic: An Effective Blend of Long Short-Term Memory Networks and ARABERT
by Fawaz S. Al-Anzi and S. T. Bibin Shalini
Appl. Sci. 2024, 14(22), 10498; https://doi.org/10.3390/app142210498 - 14 Nov 2024
Cited by 1 | Viewed by 1363
Abstract
Arabic raw audio datasets were first gathered to produce the corresponding signal spectrum, from which Mel-Frequency Cepstral Coefficients (MFCCs) were extracted. The pronunciation dictionary, language model, and acoustic model were then derived from the MFCC features. These outputs were fed into Baidu's Deep Speech model (an ASR system) to obtain the text corpus. Baidu's Deep Speech model was used to identify the global optimum rapidly while keeping word and character error rates low, performing well in isolated and end-to-end speech recognition. The goal of this work is to predict the next word and character in sequence, a natural language processing (NLP) task. This work combines the pre-trained Arabic language model ARABERT with Long Short-Term Memory (LSTM) networks to predict the next word and character in Arabic text. We used the pre-trained ARABERT embeddings to improve the model's capacity and capture semantic relationships within the language, and we trained LSTM + CNN and Markov models on Arabic text data to assess the efficacy of this model. Python libraries such as TensorFlow, Pickle, Keras, and NumPy were used to build the model. We assessed the model's performance on new Arabic text using accuracy, word error rate, character error rate, BLEU score, and perplexity. The results show that the combined LSTM + ARABERT and Markov models outperformed the baseline models in predicting the next word or character in Arabic text. Next-word prediction accuracy was 64.9% for LSTM, 74.6% for ARABERT + LSTM, and 78% for the Markov chain model; next-character prediction accuracy was 72% for LSTM, 72.22% for LSTM + CNN, and 73% for ARABERT + LSTM.
This work contributes to Arabic natural language processing and points to future work on more precise next-word and next-character prediction, which could be useful for text generation and machine translation applications.
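The abstract does not say how its Markov chain model is built, so the following is only a minimal sketch of the simplest variant, a first-order (bigram) model that predicts the most frequent follower of a word. The function names and the toy corpus are hypothetical, and whitespace tokenization is assumed.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for every word, how often each other word follows it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            model[current][following] += 1
    return model


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus (hypothetical); a real model would be trained on a large Arabic text corpus.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram_model(corpus)
print(predict_next_word(model, "the"))    # "cat" follows "the" twice, "dog" once
print(predict_next_word(model, "zebra"))  # unseen word, so None
```

The same counting scheme applies to next-character prediction if each sentence is iterated character by character instead of being split into words.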

24 pages, 22050 KiB  
Article
SOD: A Corpus for Saudi Offensive Language Detection Classification
by Afefa Asiri and Mostafa Saleh
Computers 2024, 13(8), 211; https://doi.org/10.3390/computers13080211 - 20 Aug 2024
Viewed by 1772
Abstract
Social media platforms like X (formerly known as Twitter) are integral to modern communication, enabling the sharing of news, emotions, and ideas. However, they also facilitate the spread of harmful content, and manual moderation of these platforms is impractical. Automated moderation tools, predominantly developed for English, are insufficient for addressing online offensive language in Arabic, a language rich in dialects and informally used on social media. This gap underscores the need for dedicated, dialect-specific resources. This study introduces the Saudi Offensive Dialectal dataset (SOD), consisting of over 24,000 tweets annotated across three levels: offensive or non-offensive, with offensive tweets further categorized as general insults, hate speech, or sarcasm. A deeper analysis of hate speech identifies subtypes related to sports, religion, politics, race, and violence. A comprehensive descriptive analysis of the SOD is also provided to offer deeper insights into its composition. Using machine learning, traditional deep learning, and transformer-based deep learning models, particularly AraBERT, our research achieves a significant F1-Score of 87% in identifying offensive language. This score improves to 91% with data augmentation techniques addressing dataset imbalances. These results, which surpass many existing studies, demonstrate that a specialized dialectal dataset enhances detection efficacy compared to mixed-language datasets. Full article
(This article belongs to the Special Issue Natural Language Processing (NLP) and Large Language Modelling)

18 pages, 1858 KiB  
Article
Arabic Toxic Tweet Classification: Leveraging the AraBERT Model
by Amr Mohamed El Koshiry, Entesar Hamed I. Eliwa, Tarek Abd El-Hafeez and Ahmed Omar
Big Data Cogn. Comput. 2023, 7(4), 170; https://doi.org/10.3390/bdcc7040170 - 26 Oct 2023
Cited by 24 | Viewed by 4664
Abstract
Social media platforms have become the primary means of communication and information sharing, facilitating interactive exchanges among users. Unfortunately, these platforms also witness the dissemination of inappropriate and toxic content, including hate speech and insults. While significant efforts have been made to classify toxic content in the English language, the same level of attention has not been given to Arabic texts. This study addresses this gap by constructing a standardized Arabic dataset specifically designed for toxic tweet classification. The dataset is annotated automatically using Google’s Perspective API and the expertise of three native Arabic speakers and linguists. To evaluate the performance of different models, we conduct a series of experiments using seven models: long short-term memory (LSTM), bidirectional LSTM, a convolutional neural network, a gated recurrent unit (GRU), bidirectional GRU, multilingual bidirectional encoder representations from transformers, and AraBERT. Additionally, we employ word embedding techniques. Our experimental findings demonstrate that the fine-tuned AraBERT model surpasses the performance of other models, achieving an impressive accuracy of 0.9960. Notably, this accuracy value outperforms similar approaches reported in recent literature. This study represents a significant advancement in Arabic toxic tweet classification, shedding light on the importance of addressing toxicity in social media platforms while considering diverse languages and cultures. Full article
(This article belongs to the Special Issue Advances in Natural Language Processing and Text Mining)

20 pages, 6480 KiB  
Article
Arabic Mispronunciation Recognition System Using LSTM Network
by Abdelfatah Ahmed, Mohamed Bader, Ismail Shahin, Ali Bou Nassif, Naoufel Werghi and Mohammad Basel
Information 2023, 14(7), 413; https://doi.org/10.3390/info14070413 - 16 Jul 2023
Cited by 7 | Viewed by 2439
Abstract
The Arabic language has always been an immense source of attraction to various people from different ethnicities by virtue of the significant linguistic legacy that it possesses. Consequently, a multitude of people from all over the world are yearning to learn it. However, people from different mother tongues and cultural backgrounds might experience some hardships regarding articulation due to the absence of some particular letters only available in the Arabic language, which could hinder the learning process. As a result, a speaker-independent and text-dependent efficient system that aims to detect articulation disorders was implemented. In the proposed system, we emphasize the prominence of “speech signal processing” in diagnosing Arabic mispronunciation using the Mel-frequency cepstral coefficients (MFCCs) as the optimum extracted features. In addition, long short-term memory (LSTM) was also utilized for the classification process. Furthermore, the analytical framework was incorporated with a gender recognition model to perform two-level classification. Our results show that the LSTM network significantly enhances mispronunciation detection along with gender recognition. The LSTM models attained an average accuracy of 81.52% in the proposed system, reflecting a high performance compared to previous mispronunciation detection systems. Full article

16 pages, 6207 KiB  
Article
Object Recognition System for the Visually Impaired: A Deep Learning Approach using Arabic Annotation
by Nada Alzahrani and Heyam H. Al-Baity
Electronics 2023, 12(3), 541; https://doi.org/10.3390/electronics12030541 - 20 Jan 2023
Cited by 13 | Viewed by 5109
Abstract
Object detection is an important computer vision technique that has increasingly attracted the attention of researchers in recent years. The literature to date in the field has introduced a range of object detection models. However, these models have largely been English-language-based, and there is only a limited number of published studies that have addressed how object detection can be implemented for the Arabic language. As far as we are aware, the generation of an Arabic text-to-speech engine to utter objects’ names and their positions in images to help Arabic-speaking visually impaired people has not been investigated previously. Therefore, in this study, we propose an object detection and segmentation model based on the Mask R-CNN algorithm that is capable of identifying and locating different objects in images, then uttering their names and positions in Arabic. The proposed model was trained on the Pascal VOC 2007 and 2012 datasets and evaluated on the Pascal VOC 2007 testing set. We believe that this is one of a few studies that uses these datasets to train and test the Mask R-CNN model. The performance of the proposed object detection model was evaluated and compared with previous object detection models in the literature, and the results demonstrated its superiority and ability to achieve an accuracy of 83.9%. Moreover, experiments were conducted to evaluate the performance of the incorporated translator and TTS engines, and the results showed that the proposed model could be effective in helping Arabic-speaking visually impaired people understand the content of digital images. Full article
(This article belongs to the Special Issue Applications of Neural Networks for Speech and Language Processing)

16 pages, 2468 KiB  
Article
Contact-Induced Change in an Endangered Language: The Case of Cypriot Arabic
by Spyros Armostis and Marilena Karyolemou
Languages 2023, 8(1), 10; https://doi.org/10.3390/languages8010010 - 26 Dec 2022
Cited by 4 | Viewed by 7219
Abstract
Cypriot Arabic (CyAr) is a severely endangered Semitic variety spoken by Cypriot Maronites. It belongs to the group of “peripheral varieties” of Arabic that were separated from the core Arabic-speaking area and came into contact with non-Semitic languages. Although there has been a renewed interest since the turn of the century for the study of CyAr, some aspects of its structure are still not well known. In this paper, we present and analyze a number of developments in CyAr induced by contact with Cypriot Greek. Our methodology for investigating such phenomena makes a novel contribution to the description of this underrepresented variety, as it was based not only on existing linguistic descriptions and text corpora in the literature, but mainly on a vast corpus of naturalistic oral speech data from the Archive of Oral Tradition of CyAr. Our analysis revealed the complexity of investigated contact phenomena and the differing degrees of integration of borrowings into the lexico-grammatical system of CyAr. Full article
(This article belongs to the Special Issue Investigating Language Contact and New Varieties)
23 pages, 4407 KiB  
Article
Empirical Comparison between Deep and Classical Classifiers for Speaker Verification in Emotional Talking Environments
by Ali Bou Nassif, Ismail Shahin, Mohammed Lataifeh, Ashraf Elnagar and Nawel Nemmour
Information 2022, 13(10), 456; https://doi.org/10.3390/info13100456 - 27 Sep 2022
Cited by 3 | Viewed by 2117
Abstract
Speech signals carry various kinds of information about the speaker, such as age, gender, accent, language, health, and emotions. Emotions are conveyed through modulations of facial and vocal expressions. This paper conducts an empirical comparison between classical classifiers, namely Gaussian Mixture Model (GMM), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Artificial Neural Network (ANN), and deep learning classifiers, namely Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Gated Recurrent Unit (GRU), in addition to the i-vector approach, for a text-independent speaker verification task in neutral and emotional talking environments. The deep models undergo hyperparameter tuning using the Grid Search optimization algorithm. The models are trained and tested using a private Arabic Emirati Speech Database, the Ryerson Audio–Visual Database of Emotional Speech and Song (RAVDESS), and the public Crowd-Sourced Emotional Multimodal Actors (CREMA) database. Evaluation was carried out using Equal Error Rate (EER) and Area Under the Curve (AUC) scores. Experimental results illustrate that deep architectures do not necessarily outperform classical classifiers. The findings reveal that, among the classical classifiers, the GMM model yields the lowest EER values and the best AUC scores across all datasets. In addition, the i-vector model surpasses all the fine-tuned deep models (CNN, LSTM, and GRU) on both evaluation metrics in neutral as well as emotional speech. The GMM also outperforms the i-vector model on the Emirati and RAVDESS databases.
(This article belongs to the Special Issue Signal Processing Based on Convolutional Neural Network)
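The Equal Error Rate (EER) used in this abstract is, by its usual definition, the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (genuine speakers rejected). The following is a minimal sketch of that definition over a finite set of verification scores, not the authors' evaluation code; the function name and the sample scores are hypothetical, and higher scores are assumed to mean "more likely the claimed speaker".

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by scanning every observed score as a threshold."""
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # A trial is accepted when its score is at least the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical verification scores for illustration only.
genuine = [0.9, 0.8, 0.7, 0.6]
impostor = [0.1, 0.2, 0.3, 0.65]
print(equal_error_rate(genuine, impostor))
```

With a finite score set the two error rates may never be exactly equal, so this sketch averages them at the closest threshold; evaluation toolkits often interpolate along the ROC curve instead.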

22 pages, 1693 KiB  
Review
Arabic Automatic Speech Recognition: A Systematic Literature Review
by Amira Dhouib, Achraf Othman, Oussama El Ghoul, Mohamed Koutheair Khribi and Aisha Al Sinani
Appl. Sci. 2022, 12(17), 8898; https://doi.org/10.3390/app12178898 - 5 Sep 2022
Cited by 27 | Viewed by 11128
Abstract
Automatic Speech Recognition (ASR), also known as Speech-To-Text (STT) or computer speech recognition, has been an active field of research recently. This study aims to chart this field by performing a Systematic Literature Review (SLR) to give insight into the ASR studies proposed, especially for the Arabic language. The purpose is to highlight the trends of research about Arabic ASR and guide researchers with the most significant studies published over ten years from 2011 to 2021. This SLR attempts to tackle seven specific research questions related to the toolkits used for developing and evaluating Arabic ASR, the supported type of the Arabic language, the used feature extraction/classification techniques, the type of speech recognition, the performance of Arabic ASR, the existing gaps facing researchers, along with some future research. Across five databases, 38 studies met our defined inclusion criteria. Our results showed different open-source toolkits to support Arabic speech recognition. The most prominent ones were KALDI, HTK, then CMU Sphinx toolkits. A total of 89.47% of the retained studies cover modern standard Arabic, whereas 26.32% of them were dedicated to different dialects of Arabic. MFCC and HMM were presented as the most used feature extraction and classification techniques, respectively: 63% of the papers were based on MFCC and 21% were based on HMM. The review also shows that the performance of Arabic ASR systems depends mainly on different criteria related to the availability of resources, the techniques used for acoustic modeling, and the used datasets. Full article
(This article belongs to the Special Issue Automatic Speech Recognition)

24 pages, 3512 KiB  
Article
Mispronunciation Detection and Diagnosis with Articulatory-Level Feedback Generation for Non-Native Arabic Speech
by Mohammed Algabri, Hassan Mathkour, Mansour Alsulaiman and Mohamed A. Bencherif
Mathematics 2022, 10(15), 2727; https://doi.org/10.3390/math10152727 - 2 Aug 2022
Cited by 15 | Viewed by 4472
Abstract
A high-performance versatile computer-assisted pronunciation training (CAPT) system that provides the learner immediate feedback as to whether their pronunciation is correct is very helpful in learning correct pronunciation and allows learners to practice this at any time and with unlimited repetitions, without the presence of an instructor. In this paper, we propose deep learning-based techniques to build a high-performance versatile CAPT system for mispronunciation detection and diagnosis (MDD) and articulatory feedback generation for non-native Arabic learners. The proposed system can locate the error in pronunciation, recognize the mispronounced phonemes, and detect the corresponding articulatory features (AFs), not only in words but even in sentences. We formulate the recognition of phonemes and corresponding AFs as a multi-label object recognition problem, where the objects are the phonemes and their AFs in a spectral image. Moreover, we investigate the use of cutting-edge neural text-to-speech (TTS) technology to generate a new corpus of high-quality speech from predefined text that has the most common substitution errors among Arabic learners. The proposed model and its various enhanced versions achieved excellent results. We compared the performance of the different proposed models with the state-of-the-art end-to-end technique of MDD, and our system had a better performance. In addition, we proposed using fusion between the proposed model and the end-to-end model and obtained a better performance. Our best model achieved a 3.83% phoneme error rate (PER) in the phoneme recognition task, a 70.53% F1-score in the MDD task, and a detection error rate (DER) of 2.6% for the AF detection task. Full article
(This article belongs to the Special Issue Recent Advances in Artificial Intelligence and Machine Learning)

13 pages, 335 KiB  
Article
A Comparative Study of Arabic Part of Speech Taggers Using Literary Text Samples from Saudi Novels
by Reyadh Alluhaibi, Tareq Alfraidi, Mohammad A. R. Abdeen and Ahmed Yatimi
Information 2021, 12(12), 523; https://doi.org/10.3390/info12120523 - 15 Dec 2021
Cited by 10 | Viewed by 4359
Abstract
Part of Speech (POS) tagging is one of the most common techniques used in natural language processing (NLP) applications and corpus linguistics. Various POS tagging tools have been developed for Arabic. These taggers differ in several aspects, such as their modeling techniques, tag sets, and training and testing data. In this paper we conduct a comparative study of five Arabic POS taggers, namely Stanford Arabic, CAMeL Tools, Farasa, MADAMIRA, and Arabic Linguistic Pipeline (ALP), and examine their performance using text samples from Saudi novels. The testing data was extracted from different novels that represent different types of narration. The main result indicates that the ALP tagger performs better than the others in this particular case, and that Adjective is the most frequently mistagged POS type compared to Noun and Verb.