Dual-Attention EfficientNet Hybrid U-Net for Segmentation of Rheumatoid Arthritis Hand X-Rays

Madallah Alruwaili; Mahmood A. Mahmood; Murtada K. Elbashir

doi:10.3390/diagnostics15243105

¹ Department of Computer Engineering and Networks, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Aljouf, Saudi Arabia

² Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka 72388, Aljouf, Saudi Arabia

* Author to whom correspondence should be addressed.

Diagnostics 2025, 15(24), 3105; https://doi.org/10.3390/diagnostics15243105 (registering DOI)

This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics

Abstract

Background: Accurate segmentation in radiographic imaging remains difficult because of heterogeneous contrast, acquisition artifacts, and fine-scale anatomical boundaries. Objective: This paper presents a Hybrid Attention U-Net that pairs an EfficientNet-B3 encoder with a lightweight decoder whose CBAM and scSE modules provide complementary channel-wise and spatial recalibration for sharper boundary recovery. Methods: Preprocessing applies percentile windowing, N4 bias correction, per-image normalization, and geometric standardization, together with sparse geometric augmentations, to reduce domain shift and keep the pipeline viable. Results: For hand X-ray segmentation, the model achieves Dice = 0.8426, IoU of approximately 0.78, pixel accuracy = 0.9058, ROC-AUC = 0.9074, and PR-AUC = 0.8452; it converges quickly in the early epochs and remains stable in the late epochs. Controlled ablation shows that the EfficientNet-B3 encoder is the main driver of overlap quality and that smaller batches (bs = 16) consistently outperform larger batches, an effect attributed to gradient noise and implicit regularization. Qualitative overlays complement the quantitative gains, showing more distinct cortical profiles and less background leakage. Conclusions: The model is computationally moderate, end-to-end trainable, and readily extended to multi-class problems through a softmax head and class-balanced objectives, making it a strong, deployable option for musculoskeletal radiograph segmentation and an effective baseline for future clinical translation studies.
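For readers unfamiliar with the overlap metrics reported above, the following is a minimal, dependency-free sketch of how Dice, IoU, and pixel accuracy are commonly computed for a pair of binary masks. It is not the authors' evaluation code: the function names `flatten_mask` and `overlap_metrics`, the smoothing constant `eps`, and the toy masks are all assumptions made for illustration, and a real pipeline would typically operate on array tensors rather than nested lists.

```python
from typing import Dict, List, Sequence


def flatten_mask(mask: Sequence[Sequence[int]]) -> List[int]:
    """Flatten a 2D binary mask (rows of 0/1 values) into a flat list of 0/1."""
    return [int(bool(v)) for row in mask for v in row]


def overlap_metrics(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    eps: float = 1e-7,  # hypothetical smoothing term; avoids 0/0 on empty masks
) -> Dict[str, float]:
    """Compute Dice, IoU, and pixel accuracy for two binary masks."""
    p = flatten_mask(pred)
    t = flatten_mask(target)
    if not p or len(p) != len(t):
        raise ValueError("masks must be non-empty and have the same pixel count")

    # Confusion-matrix counts over all pixels.
    tp = sum(1 for a, b in zip(p, t) if a == 1 and b == 1)
    fp = sum(1 for a, b in zip(p, t) if a == 1 and b == 0)
    fn = sum(1 for a, b in zip(p, t) if a == 0 and b == 1)
    tn = sum(1 for a, b in zip(p, t) if a == 0 and b == 0)

    dice = (2 * tp + eps) / (2 * tp + fp + fn + eps)
    iou = (tp + eps) / (tp + fp + fn + eps)
    pixel_accuracy = (tp + tn) / len(p)
    return {"dice": dice, "iou": iou, "pixel_accuracy": pixel_accuracy}


# Toy example (illustrative masks, not data from the paper).
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
scores = overlap_metrics(pred_mask, true_mask)
# tp = 2, fp = 1, fn = 1, tn = 2
# dice is approximately 0.667, iou approximately 0.5, pixel_accuracy approximately 0.667
print(scores)
```

Dice weights true positives twice, so it is never lower than IoU on the same mask pair. Pixel accuracy also counts true negatives and can therefore look high when the background dominates the image.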

Keywords: rheumatoid arthritis; hand radiographs; X-ray segmentation; hybrid U-Net; EfficientNet; dual attention (CBAM, scSE); pseudo-mask supervision
