Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications

Mansour, Ibtihel; Hamroun, Mohamed; Lajmi, Sonia; Abassi, Ryma; Sauveron, Damien

doi:10.3390/bdcc9110281

This is an early access version, the complete PDF, HTML, and XML versions will be available soon.

Open AccessArticle

Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications

by

Ibtihel Mansour

¹,

Mohamed Hamroun

^2,3,*

,

Sonia Lajmi

^4,5,

Ryma Abassi

¹

and

Damien Sauveron

²

¹

Innov’COM, Sup’Com, University of Carthage, Ariana 2083, Tunisia

²

XLIM, UMR CNRS 7252, University of Limoges, Avenue Albert Thomas, 87060 Limoges, France

³

3iL Ingénieurs, 43 Rue de Sainte Anne, 87000 Limoges, France

⁴

MIRACL Laboratory, Technopole of Sfax, University of Sfax, P.O. Box 242, Sfax 3031, Tunisia

⁵

Faculty of Computing and Information, Al-Baha University, Al-Baha 65779, Saudi Arabia

^*

Author to whom correspondence should be addressed.

Big Data Cogn. Comput. 2025, 9(11), 281; https://doi.org/10.3390/bdcc9110281 (registering DOI)

Submission received: 19 August 2025 / Revised: 30 October 2025 / Accepted: 4 November 2025 / Published: 8 November 2025

Download Versions Notes

Abstract

Deaf and hearing-impaired individuals rely on sign language, a visual communication system using hand shapes, facial expressions, and body gestures. Sign languages vary by region. For example, Arabic Sign Language (ArSL) is notably different from American Sign Language (ASL). This project focuses on creating an Arabic Sign Language Recognition (ArSLR) System tailored for healthcare, aiming to bridge communication gaps resulting from a lack of sign-proficient professionals and limited region-specific technological solutions. Our research addresses limitations in sign language recognition systems by introducing a novel framework centered on ResNet50ViT, a hybrid architecture that synergistically combines ResNet50’s robust local feature extraction with the global contextual modeling of Vision Transformers (ViT). We also explored a tailored Vision Transformer variant (SignViT) for Arabic Sign Language as a comparative model. Our main contribution is the ResNet50ViT model, which significantly outperforms existing approaches, specifically targeting the challenges of capturing sequential hand movements, which traditional CNN-based methods struggle with. We utilized an extensive dataset incorporating both static (36 signs) and dynamic (92 signs) medical signs. Through targeted preprocessing techniques and optimization strategies, we achieved significant performance improvements over conventional approaches. In our experiments, the proposed ResNet50-ViT achieved a remarkable 99.86% accuracy on the ArSL dataset, setting a new state-of-the-art, demonstrating the effectiveness of integrating ResNet50’s hierarchical local feature extraction with Vision Transformer’s global contextual modeling. For comparison, a fine-tuned Vision Transformer (SignViT) attained 98.03% accuracy, confirming the strength of transformer-based approaches but underscoring the clear performance gain enabled by our hybrid architecture. We expect that RAFID will help deaf patients communicate better with healthcare providers without needing human interpreters.

Keywords: ArSL; recognition; deep learning; fine-tuning; CNN; vision transformer

Share and Cite

MDPI and ACS Style

Mansour, I.; Hamroun, M.; Lajmi, S.; Abassi, R.; Sauveron, D. Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications. Big Data Cogn. Comput. 2025, 9, 281. https://doi.org/10.3390/bdcc9110281

AMA Style

Mansour I, Hamroun M, Lajmi S, Abassi R, Sauveron D. Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications. Big Data and Cognitive Computing. 2025; 9(11):281. https://doi.org/10.3390/bdcc9110281

Chicago/Turabian Style

Mansour, Ibtihel, Mohamed Hamroun, Sonia Lajmi, Ryma Abassi, and Damien Sauveron. 2025. "Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications" Big Data and Cognitive Computing 9, no. 11: 281. https://doi.org/10.3390/bdcc9110281

APA Style

Mansour, I., Hamroun, M., Lajmi, S., Abassi, R., & Sauveron, D. (2025). Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications. Big Data and Cognitive Computing, 9(11), 281. https://doi.org/10.3390/bdcc9110281

Article Menu

Hybrid Deep Learning Models for Arabic Sign Language Recognition in Healthcare Applications

Abstract

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI