This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Open Access Article
Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications
by Ibtihel Mansour 1, Mohamed Hamroun 2,3,*, Sonia Lajmi 4,5, Ryma Abassi 1 and Damien Sauveron 2
1 Innov’COM, Sup’Com, University of Carthage, Ariana 2083, Tunisia
2 XLIM, UMR CNRS 7252, University of Limoges, Avenue Albert Thomas, 87060 Limoges, France
3 3iL Ingénieurs, 43 Rue de Sainte Anne, 87000 Limoges, France
4 MIRACL Laboratory, Technopole of Sfax, University of Sfax, P.O. Box 242, Sfax 3031, Tunisia
5 Faculty of Computing and Information, Al-Baha University, Al-Baha 65779, Saudi Arabia
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2025, 9(11), 281; https://doi.org/10.3390/bdcc9110281
Submission received: 19 August 2025 / Revised: 30 October 2025 / Accepted: 4 November 2025 / Published: 8 November 2025
Abstract
Deaf and hearing-impaired individuals rely on sign language, a visual communication system using hand shapes, facial expressions, and body gestures. Sign languages vary by region; for example, Arabic Sign Language (ArSL) is notably different from American Sign Language (ASL). This project focuses on creating an Arabic Sign Language Recognition (ArSLR) system tailored for healthcare, aiming to bridge communication gaps resulting from a lack of sign-proficient professionals and limited region-specific technological solutions. Our research addresses limitations in sign language recognition systems by introducing a novel framework centered on ResNet50ViT, a hybrid architecture that combines ResNet50’s robust local feature extraction with the global contextual modeling of Vision Transformers (ViT). We also explored a tailored Vision Transformer variant (SignViT) for Arabic Sign Language as a comparative model. Our main contribution is the ResNet50ViT model, which significantly outperforms existing approaches, specifically targeting the challenge of capturing sequential hand movements, which traditional CNN-based methods struggle with. We utilized an extensive dataset incorporating both static (36 signs) and dynamic (92 signs) medical signs. Through targeted preprocessing techniques and optimization strategies, we achieved significant performance improvements over conventional approaches. In our experiments, the proposed ResNet50ViT achieved a remarkable 99.86% accuracy on the ArSL dataset, setting a new state of the art and demonstrating the effectiveness of integrating ResNet50’s hierarchical local feature extraction with the Vision Transformer’s global contextual modeling. For comparison, a fine-tuned Vision Transformer (SignViT) attained 98.03% accuracy, confirming the strength of transformer-based approaches but underscoring the clear performance gain enabled by our hybrid architecture.
We expect that RAFID will help deaf patients communicate better with healthcare providers without needing human interpreters.
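The abstract describes a hybrid design in which a ResNet50 backbone supplies local features and a Vision Transformer models global context. A common way to wire such hybrids is to flatten the CNN's final feature map into a token sequence for the transformer. The sketch below illustrates only that tokenization bookkeeping under standard assumptions (a ResNet50-style stride of 32 and 2048 output channels, and a ViT-style width of 768 with a [CLS] token); the paper's actual ResNet50ViT layer sizes are not stated here, so these numbers are illustrative, not the authors' configuration.

```python
# Illustrative sketch: how a CNN feature map becomes transformer tokens
# in a hybrid CNN+ViT model. All layer sizes below are assumptions.

def cnn_to_tokens(height, width, cnn_stride=32, cnn_channels=2048,
                  d_model=768, use_cls_token=True):
    """Return (num_tokens, token_dim) after flattening a CNN feature map
    into a transformer token sequence.

    A ResNet50-style backbone downsamples an H x W input by a factor of 32
    and outputs cnn_channels-dimensional features; each spatial position
    becomes one token, linearly projected from cnn_channels to d_model.
    """
    fh, fw = height // cnn_stride, width // cnn_stride   # feature-map grid
    num_tokens = fh * fw + (1 if use_cls_token else 0)   # optional [CLS] token
    return num_tokens, d_model

# A 224x224 input frame yields a 7x7 grid -> 49 patch tokens + 1 [CLS] token,
# each projected to the transformer width.
print(cnn_to_tokens(224, 224))  # (50, 768)
```

Compared with a plain ViT, which patchifies raw pixels, feeding CNN features to the transformer lets the attention layers operate on already-abstracted local patterns, which is one plausible reason hybrid models can outperform pure transformers on modest-sized datasets.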
Share and Cite
MDPI and ACS Style
Mansour, I.; Hamroun, M.; Lajmi, S.; Abassi, R.; Sauveron, D.
Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications. Big Data Cogn. Comput. 2025, 9, 281.
https://doi.org/10.3390/bdcc9110281
AMA Style
Mansour I, Hamroun M, Lajmi S, Abassi R, Sauveron D.
Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications. Big Data and Cognitive Computing. 2025; 9(11):281.
https://doi.org/10.3390/bdcc9110281
Chicago/Turabian Style
Mansour, Ibtihel, Mohamed Hamroun, Sonia Lajmi, Ryma Abassi, and Damien Sauveron.
2025. "Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications" Big Data and Cognitive Computing 9, no. 11: 281.
https://doi.org/10.3390/bdcc9110281
APA Style
Mansour, I., Hamroun, M., Lajmi, S., Abassi, R., & Sauveron, D.
(2025). Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications. Big Data and Cognitive Computing, 9(11), 281.
https://doi.org/10.3390/bdcc9110281