Article

The Efficacy of Semantics-Preserving Transformations in Self-Supervised Learning for Medical Ultrasound

Blake VanBerlo, Jesse Hoey, Alexander Wong, and Robert Arntfield

1 David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada
2 Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
3 Schulich School of Medicine and Dentistry, Western University, London, ON N6A 3K7, Canada
* Author to whom correspondence should be addressed.
Bioengineering 2025, 12(8), 855; https://doi.org/10.3390/bioengineering12080855
Submission received: 17 June 2025 / Revised: 26 July 2025 / Accepted: 4 August 2025 / Published: 8 August 2025
(This article belongs to the Special Issue Mathematical Models for Medical Diagnosis and Testing)

Abstract

Data augmentation is a central component of joint embedding self-supervised learning (SSL). Approaches that work for natural images may not always be effective in medical imaging tasks. This study systematically investigated the impact of data augmentation and preprocessing strategies in SSL for lung ultrasound. Three data augmentation pipelines were assessed: (1) a baseline pipeline commonly used across imaging domains, (2) a novel semantics-preserving pipeline designed for ultrasound, and (3) a distilled set of the most effective transformations from both pipelines. Pretrained models were evaluated on multiple classification tasks: B-line detection, pleural effusion detection, and COVID-19 classification. Experiments revealed that semantics-preserving data augmentation yielded the best performance for COVID-19 classification, a diagnostic task requiring global image context. Cropping-based methods yielded the best performance on the B-line and pleural effusion object classification tasks, which require strong local pattern recognition. Lastly, semantics-preserving ultrasound image preprocessing improved downstream performance on multiple tasks. Guidance regarding data augmentation and preprocessing strategies was synthesized for developers working with SSL in ultrasound.
Keywords: data augmentation; machine learning; self-supervised learning; transfer learning; ultrasound

Share and Cite

MDPI and ACS Style

VanBerlo, B.; Hoey, J.; Wong, A.; Arntfield, R. The Efficacy of Semantics-Preserving Transformations in Self-Supervised Learning for Medical Ultrasound. Bioengineering 2025, 12, 855. https://doi.org/10.3390/bioengineering12080855

