This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Hybrid State–Space and Vision Transformer Framework for Fetal Ultrasound Plane Classification in Prenatal Diagnostics

1
Faculty of Informatics, Kaunas University of Technology, 51368 Kaunas, Lithuania
2
Applied College, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
*
Author to whom correspondence should be addressed.
Diagnostics 2025, 15(22), 2879; https://doi.org/10.3390/diagnostics15222879
Submission received: 1 October 2025 / Revised: 10 November 2025 / Accepted: 12 November 2025 / Published: 13 November 2025
(This article belongs to the Special Issue Advances in Fetal Imaging)

Abstract

Background and Objective: Accurate classification of standard fetal ultrasound planes is a critical step in prenatal diagnostics, enabling reliable biometric measurements and anomaly detection. Conventional deep learning approaches, particularly convolutional neural networks (CNNs) and transformers, often face challenges such as domain variability, noise artifacts, class imbalance, and poor calibration, which limit their clinical utility. This study proposes a hybrid state–space and vision transformer framework designed to address these limitations by integrating sequential dynamics and global contextual reasoning. Methods: The proposed framework comprises five stages: (i) preprocessing for ultrasound harmonization using intensity normalization, anisotropic diffusion filtering, and affine alignment; (ii) hybrid feature encoding with a state–space model (SSM) for sequential dependency modeling and a vision transformer (ViT) for global self-attention; (iii) multi-task learning (MTL) with anatomical regularization leveraging classification, segmentation, and biometric regression objectives; (iv) gated decision fusion for balancing local sequential and global contextual features; and (v) calibration strategies using temperature scaling and entropy regularization to ensure reliable confidence estimation. The framework was comprehensively evaluated on three publicly available datasets: FETAL_PLANES_DB, HC18, and a large-scale fetal head dataset. Results: The hybrid framework consistently outperformed baseline CNN, SSM-only, and ViT-only models across all tasks. On FETAL_PLANES_DB, it achieved an accuracy of 95.8%, a macro-F1 of 94.9%, and an ECE of 1.5%. On the Fetal Head dataset, the model achieved 94.1% accuracy and a macro-F1 score of 92.8%, along with superior calibration metrics. For HC18, it achieved a Dice score of 95.7%, an IoU of 91.7%, and a mean absolute error of 2.30 mm for head circumference estimation. 
Cross-dataset evaluations confirmed the model’s robustness and generalization capability. Ablation studies further demonstrated the critical role of SSM, ViT, fusion gating, and anatomical regularization in achieving optimal performance. Conclusions: By combining state–space dynamics and transformer-based global reasoning, the proposed framework delivers accurate, calibrated, and clinically meaningful predictions for fetal ultrasound plane classification and biometric estimation. The results highlight its potential for deployment in real-time prenatal screening and diagnostic systems.
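The gated decision fusion stage described above balances the SSM's local sequential features against the ViT's global contextual features. The abstract does not specify the gating mechanism, so the sketch below is a common formulation assumed for illustration: a learned sigmoid gate that blends the two feature vectors per dimension (the function name `gated_fusion` and the parameters `W_g`, `b_g` are hypothetical, not from the paper).

```python
import numpy as np

def gated_fusion(h_ssm, h_vit, W_g, b_g):
    """Blend SSM (sequential) and ViT (global) features with a learned gate.

    The gate g in (0, 1) is computed from the concatenated features; g close
    to 1 favors the SSM branch, g close to 0 favors the ViT branch.
    """
    z = np.concatenate([h_ssm, h_vit])
    g = 1.0 / (1.0 + np.exp(-(W_g @ z + b_g)))  # element-wise sigmoid gate
    return g * h_ssm + (1.0 - g) * h_vit
```

With untrained (zero) gate weights the gate is 0.5 everywhere, so the fusion reduces to a simple average of the two branches; training the gate lets the model learn which branch to trust per feature.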
Keywords: fetal ultrasound; prenatal diagnostics; state–space models; vision transformers; multi-task learning
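The calibration stage uses temperature scaling to make the reported confidences match observed accuracy (reflected in the low ECE figures above). A minimal sketch of temperature scaling, assuming a standard softmax classifier head (the function name and example logits are illustrative, not from the paper):

```python
import numpy as np

def temperature_scale(logits, T):
    """Softmax over logits divided by a temperature T.

    T = 1 recovers the ordinary softmax; T > 1 softens overconfident
    predictions without changing the predicted class (argmax is preserved).
    """
    scaled = logits / T
    e = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return e / e.sum()
```

In practice T is fit on a held-out validation set by minimizing negative log-likelihood, then applied at inference time.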

Share and Cite

MDPI and ACS Style

Tehsin, S.; Alshaya, H.; Bouchelligua, W.; Nasir, I.M. Hybrid State–Space and Vision Transformer Framework for Fetal Ultrasound Plane Classification in Prenatal Diagnostics. Diagnostics 2025, 15, 2879. https://doi.org/10.3390/diagnostics15222879

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.