Scoping Review of Recent Trends and Challenges in Artificial Intelligence Based Medical Ultrasound Denoising

Degu, Mizanu Zelalem; Madhusoodanan, Midhila; Chippa, Medha; Hareendranathan, Abhilash

doi:10.3390/aimed1030018


Department of Radiology and Diagnostic Imaging, University of Alberta, Edmonton, AB T6G 2R3, Canada

* Authors to whom correspondence should be addressed.

AI Med. 2026, 1(3), 18; https://doi.org/10.3390/aimed1030018 (registering DOI)

Submission received: 12 March 2026 / Revised: 15 June 2026 / Accepted: 21 June 2026 / Published: 26 June 2026

Abstract

(1) Background: Ultrasound (US) imaging is widely used in clinical diagnosis but is often degraded by speckle noise, which reduces image quality and can hinder interpretation. Deep learning (DL) has emerged as a promising approach for US denoising, yet its clinical applicability remains unclear. (2) Methods: A scoping review of studies published in the last four years on DL-based US denoising was conducted following PRISMA-ScR guidelines. Searches were performed in IEEE Xplore, PubMed, ScienceDirect, Scopus, Web of Science, and Google Scholar. Data were extracted on anatomy, noise type, learning paradigm, network architecture, datasets, evaluation metrics, and performance outcomes. (3) Results: From 951 records retrieved, 36 studies were included. Most focused on breast, fetal, cardiac, and abdominal US. Convolutional neural networks (CNNs), particularly U-Net, were the most common approach, while generative adversarial networks, vision transformers, and variational autoencoders were less explored. Reported peak signal-to-noise ratio ranged from 30 to 45 dB and structural similarity index measure from 0.85 to 0.97. Most studies (34 out of 36) relied on synthetic noise, 2D images, and paired datasets, with limited evaluation on real clinical images. (4) Conclusion: Supervised CNN-based methods dominate US image denoising, but clinical translation is limited by reliance on synthetic data. Non-paired and no-ground-truth learning approaches remain underexplored despite their suitability for US imaging. Progress is further hindered by inconsistent evaluation protocols, limited robustness assessment on clinical tasks, and restricted dataset access. Future work should focus on standardized clinically meaningful evaluation, openly available datasets, and clinical validation to improve reliability and generalizability of DL-based US denoising methods.

Keywords:

ultrasound image; ultrasound denoising; deep learning; speckle noise reduction

1. Introduction

Over the last few decades, diagnostic ultrasound (US) has been increasingly used for non-invasive assessment of soft tissues. It is widely used in clinical practice for applications such as abdominal and pelvic imaging, obstetrics and gynecology, cardiology (echocardiography), musculoskeletal evaluation, vascular assessment and in image-guided procedures. Unlike expensive imaging modalities like computed tomography (CT) or magnetic resonance imaging (MRI), US is relatively low-cost and can be used for real-time assessment of both soft tissue and bone [1,2,3]. Additionally, unlike X-ray-based modalities, US employs high-frequency acoustic pulses to probe tissues and captures echo signals without exposing patients to ionizing radiation, making it particularly suitable for vulnerable populations such as pregnant women, children, and critically ill patients requiring repeated imaging [4,5]. The widespread clinical adoption of US has been further accelerated by advancements in portability and miniaturization. Point-of-care US has become integral to emergency medicine, critical care, internal medicine, anesthesia, and rural healthcare, supporting rapid triage, intervention guidance, and real-time physiological monitoring [6].

Despite these obvious advantages, US image quality is limited by various noise artifacts. US imaging relies on the transmission of short acoustic pulses in the frequency range of 2–20 MHz through tissues of varying acoustic impedance [7,8]. As the US beam propagates, it undergoes reflection, refraction, absorption, and attenuation, which can lead to various noise and artifacts including speckle noise, electronic noise, clutter, reverberation, and shadowing [9,10]. The effects of these noises can often be compounded, which drastically degrades image fidelity and makes visual interpretation challenging [9,11]. Figure 1 illustrates common types of US noise across different anatomical regions, demonstrating how they degrade visual quality, blur anatomical structures, and obscure tissue boundaries. As illustrated in Figure 2, in addition to the artifacts arising from the physical properties of US, image acquisition factors such as probe pressure, probe angle, acquisition depth, gain settings, and system-specific processing steps applied to the raw signal also produce wide variability in the image quality [12,13].

Over the past decades, US denoising has progressed from hand-crafted statistical models to data-driven deep learning (DL) frameworks. Early statistical methods, such as the Lee filter [14], Frost filter [15], speckle reducing anisotropic diffusion (SRAD) [16], total variation (TV) regularization, and non-local means (NLM) filtering [17,18], have been employed to reduce US image noise while improving texture and preserving edges. These methods are often limited by sensitivity to parameter tuning, high computational cost, and reliance on hand-crafted features and statistical assumptions [19,20]. Data-driven DL models such as convolutional neural networks (CNNs), U-Net, generative adversarial networks (GANs), and the Vision Transformer (ViT) have been used for US enhancement by learning complex mappings and anatomically consistent transformations directly from US data, without relying on hand-crafted features or statistical assumptions. More recently, self-supervised methods like Noise2Noise (N2N) [21] have addressed the lack of clean ground-truth data by enabling models to learn the underlying signal directly from pairs of independent noisy observations. Instead of relying on noise-free reference images, which are impractical to obtain in clinical settings, these approaches exploit the statistical nature of noise so that the network can recover consistent anatomical structures while suppressing random variations [21].
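The Noise2Noise idea described above can be illustrated with a deliberately small sketch. It is not the method of [21] or of any reviewed study: the "network" is a single scalar gain fitted by least squares, the noise is additive zero-mean Gaussian (an assumption made for simplicity, and not a speckle model), and the names `fit_gain` and `noise2noise_demo` are hypothetical. The point is only that fitting against an independent noisy target gives almost the same solution as fitting against the clean target, because the zero-mean target noise averages out.

```python
import random


def fit_gain(inputs, targets):
    """Closed-form least-squares gain w minimizing sum((w * x - y) ** 2)."""
    num = sum(x * y for x, y in zip(inputs, targets))
    den = sum(x * x for x in inputs)
    return num / den


def noise2noise_demo(n_samples=20000, sigma=0.2, seed=0):
    """Fit the same toy denoiser against clean and against noisy targets."""
    rng = random.Random(seed)
    # Hypothetical clean signal values in [0, 1).
    clean = [rng.random() for _ in range(n_samples)]
    # Two independent noisy observations of the same clean signal.
    noisy_in = [s + rng.gauss(0.0, sigma) for s in clean]
    noisy_tgt = [s + rng.gauss(0.0, sigma) for s in clean]
    w_clean = fit_gain(noisy_in, clean)      # supervised: clean target
    w_noisy = fit_gain(noisy_in, noisy_tgt)  # Noise2Noise-style: noisy target
    return w_clean, w_noisy


if __name__ == "__main__":
    w_clean, w_noisy = noise2noise_demo()
    print(f"gain with clean targets: {w_clean:.4f}")
    print(f"gain with noisy targets: {w_noisy:.4f}")
```

The two gains agree closely for large sample counts. With an L2 loss this equivalence relies on the target noise being zero-mean given the input, which is why the multiplicative nature of speckle needs separate consideration in practice.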

Figure 2. US images are acquired with different parameters and probe settings. (a) Pancreas images obtained at low and high transducer frequencies, illustrating the trade-off between penetration depth and image resolution [22]. (b) Lung images acquired using linear and curvilinear probes, demonstrating differences in field of view and depth coverage [23]. (c) Liver images obtained with low and high gain settings, showing changes in brightness, signal amplification, and noise [24].

Previous reviews on US image denoising and de-speckling primarily emphasized classical filtering approaches, such as SRAD, wavelet denoising, and NLM methods, with comparatively limited coverage of learning-based techniques [25]. More recent surveys on medical image denoising [26] tend to focus on general-purpose DL methods across multiple imaging modalities, including CT, MRI, and X-ray, without addressing the unique challenges specific to US. In many cross-modality reviews, US appears only as a minor subsection rather than as a dedicated topic. The review by Gupta et al. [27] discussed classical speckle reduction filters such as spatial domain smoothing, transform-based filters, and partial differential equation (PDE)-based diffusion. A more recent survey by Sivaanpu et al. [13] examined 97 studies and offered a broad overview spanning classical, transformer, and hybrid techniques. This survey provides a taxonomy of methods, discusses the nature of US noise, and tabulates the strengths and weaknesses of different denoising approaches, consolidating two decades of research. Furthermore, survey-level analyses by Kaur et al. [28] and Sagheer et al. [29] spanning multiple modalities, including US, MRI, CT, and X-ray, discussed the unique characteristics of US noise, such as multiplicative speckle, tissue-dependent scattering, and acquisition variability.

While existing surveys have provided valuable overviews, earlier reviews primarily focused on traditional filtering and frequency-domain transform techniques, with limited discussion of recent DL-based methods and their clinical implications. More recent surveys broadly categorized conventional and DL-based approaches but emphasized methodological progression rather than critical analysis of emerging learning paradigms, clinical realism, and translational applicability. In particular, there remains limited synthesis of recent US-specific DL denoising methods, including self-supervised learning (SSL), diffusion-based frameworks, hybrid CNN-transformer models, and generative approaches. Prior reviews also provide limited discussion of dataset realism, reproducibility, external validation, and downstream clinical relevance, while insufficiently addressing the difficulty of obtaining clean US ground-truth data for supervised learning (SL). Accordingly, this review presents a focused synthesis of recent DL-based US denoising methods, datasets, evaluation strategies, and persistent research gaps, with emphasis on generalizability, reproducibility, and clinical translation.

2. Materials and Methods

This review was conducted using the Rayyan review tool for study screening and selection, and the review protocol is registered on the Open Science Framework (OSF). The study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines (Table S1).

2.1. Search Strategy and Data Sources

A structured literature search was conducted between 6 and 8 October 2025 across major scientific databases that predominantly index peer-reviewed journals and established conference proceedings in medical imaging and machine learning (ML). The databases were IEEE Xplore, PubMed, ScienceDirect, Scopus, Web of Science, and Google Scholar (for supplementary search and snowballing). Search terms were designed to capture both the ultrasound imaging and DL components of the topic. Each database was queried sequentially using individual keyword phrases, including “US denoising”, “US image enhancement”, “Medical image enhancement”, “Medical image denoising”, “US de-speckling”, and “Speckle reduction”. The same search strategy was applied consistently across all databases, and the results of the individual searches were aggregated.

The complete search strings employed to query the selected databases in this study are provided below.

IEEE Xplore (All Metadata): ((“All Metadata”:“ultrasound” OR “All Metadata”:“ultrasonography” OR “All Metadata”:“sonography” OR “All Metadata”:“ultrasonic imaging”) AND (“All Metadata”:“deep learning” OR “All Metadata”:“machine learning”) AND (“All Metadata”:“US denoising” OR “All Metadata”:“ultrasound denoising” OR “All Metadata”:“US image enhancement” OR “All Metadata”:“ultrasound image enhancement” OR “All Metadata”:“medical image enhancement” OR “All Metadata”:“medical image denoising” OR “All Metadata”:“US de-speckling” OR “All Metadata”:“despeckling” OR “All Metadata”:“speckle reduction” OR “All Metadata”:“speckle suppression” OR “All Metadata”:“speckle noise reduction”)) AND Publication Year ≥ 2022.
PubMed (Title/Abstract): ((ultrasound [Title/Abstract] OR ultrasonography [Title/Abstract] OR sonography [Title/Abstract] OR “ultrasonic imaging” [Title/Abstract]) AND (“deep learning” [Title/Abstract] OR “machine learning” [Title/Abstract]) AND (“US denoising” [Title/Abstract] OR “ultrasound denoising” [Title/Abstract] OR “US image enhancement” [Title/Abstract] OR “ultrasound image enhancement” [Title/Abstract] OR “medical image enhancement” [Title/Abstract] OR “medical image denoising” [Title/Abstract] OR “US de-speckling” [Title/Abstract] OR despeckling [Title/Abstract] OR “speckle reduction” [Title/Abstract] OR “speckle suppression” [Title/Abstract] OR “speckle noise reduction” [Title/Abstract])) AND Publication Date ≥ 2022.
Scopus (TITLE-ABS-KEY): TITLE-ABS-KEY ((ultrasound OR ultrasonography OR sonography OR “ultrasonic imaging”) AND (“deep learning” OR “machine learning”) AND (“US denoising” OR “ultrasound denoising” OR “US image enhancement” OR “ultrasound image enhancement” OR “medical image enhancement” OR “medical image denoising” OR “US de-speckling” OR despeckling OR “speckle reduction” OR “speckle suppression” OR “speckle noise reduction”)) AND PUBYEAR ≥ 2022.
Web of Science (Topic Search): TS = ((ultrasound OR ultrasonography OR sonography OR “ultrasonic imaging”) AND (“deep learning” OR “machine learning”) AND (“US denoising” OR “ultrasound denoising” OR “US image enhancement” OR “ultrasound image enhancement” OR “medical image enhancement” OR “medical image denoising” OR “US de-speckling” OR despeckling OR “speckle reduction” OR “speckle suppression” OR “speckle noise reduction”)) AND Publication Year ≥ 2022.
ScienceDirect (Title, Abstract, Keywords): (ultrasound OR ultrasonography OR sonography OR “ultrasonic imaging”) AND (“deep learning” OR “machine learning”) AND (“US denoising” OR “ultrasound denoising” OR “US image enhancement” OR “ultrasound image enhancement” OR “medical image enhancement” OR “medical image denoising” OR “US de-speckling” OR despeckling OR “speckle reduction” OR “speckle suppression” OR “speckle noise reduction”) AND Year ≥ 2022.
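All five search strings above share the same boolean core: three groups of terms, combined with OR inside each group and AND between groups. A minimal sketch of assembling that core programmatically is given below. The helper names (`quote`, `or_group`, `build_query`) are hypothetical, straight quotes are used in place of the typographic quotes shown above, and the database-specific field tags and year filters are deliberately left out because their exact syntax differs per database.

```python
# Term groups copied from the search strings listed above.
IMAGING = ["ultrasound", "ultrasonography", "sonography", "ultrasonic imaging"]
LEARNING = ["deep learning", "machine learning"]
TASK = [
    "US denoising", "ultrasound denoising", "US image enhancement",
    "ultrasound image enhancement", "medical image enhancement",
    "medical image denoising", "US de-speckling", "despeckling",
    "speckle reduction", "speckle suppression", "speckle noise reduction",
]


def quote(term):
    """Wrap multi-word phrases in double quotes; leave single words bare."""
    return f'"{term}"' if " " in term else term


def or_group(terms):
    """Join the terms of one concept with OR inside parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_query(groups):
    """Join concept groups with AND to form the shared boolean core."""
    return " AND ".join(or_group(g) for g in groups)


if __name__ == "__main__":
    print(build_query([IMAGING, LEARNING, TASK]))
```

Keeping the term lists in one place makes it easier to confirm that every database received the same set of terms, which is the consistency property the search strategy relies on.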

2.1.1. Inclusion Criteria

Studies were included if they met all the following criteria:

Focus on diagnostic US imaging;
Use DL or ML methods for US image denoising or speckle reduction;
Published in a peer-reviewed journal;
Published in 2022 or later.

The review was restricted to studies published from 2022 onward with the intention of focusing on recent developments in DL-based ultrasound denoising. This decision aimed to capture contemporary research that is more likely to reflect the current methodological directions in medical image restoration, including transformer-based architecture, diffusion-inspired models, and self-supervised learning strategies.

2.1.2. Exclusion Criteria

Studies were excluded if they met any of the following criteria:

Focus on non-US images;
Focus on downstream tasks (classification, segmentation, and detection);
Preprints;
Published in a non-peer-reviewed journal.

2.2. Study Screening and Selection

The screening process followed the PRISMA-ScR guidelines and was conducted in multiple stages. First, the titles were screened to remove clearly irrelevant studies. Second, duplicate records were removed. Abstracts of the remaining articles were then screened to identify studies explicitly addressing DL-based US image denoising or speckle reduction. Finally, full-text screening was performed to confirm eligibility based on the predefined inclusion criteria.

2.3. Data Extraction

A structured data charting process was employed to systematically extract and organize information from all included studies. The data extraction form was piloted on a subset of studies and calibrated by the review team prior to use to ensure consistency. The following variables were extracted from each study.

Anatomy (Breast/Fetal/Cardiac/Abdominal/Musculoskeletal/Others);
Imaging dimensionality (2D/3D/Videos);
Target noise type (Speckle, Gaussian noise);
Learning paradigm (Supervised/Self-Supervised/Unsupervised);
DL architecture (CNN/U-Net/GAN/Transformer);
Summary of the proposed denoising methodology;
Training data characteristics;
Evaluation metrics used for quantitative assessment;
Summary of performance results;
Limitations of the study;
Code and dataset availability.

2.4. Data Handling and Summary

Extracted data were categorized by learning paradigm, network architecture, application domain, and data source. Descriptive statistics and frequency counts were used to summarize quantitative variables, while qualitative information such as methodology details and clinical application were organized in tables. Trends and patterns across studies were identified to provide an overview of the current landscape in DL-based US denoising.

2.5. Limitations

This scoping review is limited to English-language publications, which may have excluded relevant studies in other languages. Additionally, the findings are primarily descriptive, and the interpretations are based on the data reported in the included studies, which may introduce subjectivity.

3. Results and Discussion

3.1. Study Selection

The study identification, screening, eligibility assessment, and inclusion process followed the PRISMA-ScR guidelines and is summarized in Figure 3. A total of 951 records were identified through database searching, and after removing duplicates, 801 unique records remained for screening. Title and abstract screening led to the exclusion of 629 records. Full texts of 96 reports were sought for retrieval, of which 93 were successfully assessed for eligibility. Following full-text evaluation, 57 studies were excluded for reasons including being out of scope, review articles, content duplication, or non-English publication, resulting in the inclusion of 36 studies in the final review.

3.2. Characteristics of the Studies

The studies exhibited extensive diversity in terms of learning paradigms, network architectures, application domains, and data sources. Studies focused on B-mode US image denoising, with clinical applications spanning breast, fetal, cardiac, thyroid, liver, nerve, carotid artery, and general abdominal US. In terms of learning paradigms, most studies used SL approaches, while a small subset of recent studies implemented SSL or unsupervised learning (USL). The quantitative evaluation of US image denoising performance in the reviewed studies relies on a combination of reference-based and no-reference metrics, including mean squared error (MSE), root mean squared error (RMSE), mean structural similarity index measure (MSSIM), equivalent number of looks (ENL), contrast-to-noise ratio (CNR), signal-to-noise ratio (SNR), feature similarity index measure (FSIM), edge preservation index (EPI), natural image quality evaluator (NIQE), perception-based image quality evaluator (PIQE), figure of merit (FOM), improvement in signal-to-noise ratio (ISNR), speckle index (SI), signal-to-reconstruction error ratio (SRE), and universal image quality index (UIQ). These metrics are employed to assess different aspects of denoising outcomes, such as noise suppression, structural fidelity, perceptual quality, edge preservation, and contrast characteristics. A summary of the key characteristics of the studies, including learning paradigm, architecture type, dataset domain, and reported evaluation metrics, is provided in Table 1.

Table 1. Consolidated summary of DL-based US denoising and analysis studies, categorized by ML approach, architectural framework, anatomy and performance evaluation metrics.

Studies	Machine Learning Paradigm	DL Architecture	Dataset Domain (Anatomy)	Metrics
Cui et al. [30], Soy et al. [31], Chi et al. [32], Kavand et al. [33], Jha et al. [34], El-Hag et al. [35], Reddy et al. [36]	SL	CNN	Breast, Thyroid, Ovary (PCOS), Carotid Artery, General US	PSNR, SSIM, MSE, RMSE, NIQE, PIQE, ENL, AGM, SSI, EI
Khalifa et al. [37], Devi et al. [38], Hsu et al. [39], Satish et al. [40], Monkam et al. [41], Goudarzi et al. [42]	SL	U-Net	Breast, Liver, Lung, Fetal (Cardiac/Head), Carotid Artery, General US, Chicken Breast, Bovine Liver	PSNR, SSIM, MSE, EPI, ENL, CNR, SNR, AGM
Saranya et al. [43], Slimi et al. [44], Bhute et al. [45]	SL	DAE	Breast, General US	PSNR, SSIM, MSE
Jiménez-Gaona et al. [46], Sivaanpu et al. [47], Liu et al. [48], Gan et al. [49]	SL	GAN	Breast, Fetal Head, General US	PSNR, SSIM, MSSIM, MSE, RMSE, FOM, FSIM
Chen et al. [50], Oliveira et al. [51], Li et al. [52]	SL	CAE	Breast, Lung, Nerve, Cardiac, Fetal Head	PSNR, SSIM, RMSE
Jiang et al. [53], Mahmoudi et al. [54]	SL	DnCNN	General US, Carotid Artery	PSNR, SSIM
Chen et al. [55], Sivaanpu et al. [56], Bu et al. [57]	SL	Hybrid CNN + Transformer	Fetal Head, Breast, Dental, Cardiac Phantom	PSNR, SSIM, RMSE, MSE, NIQE, ENL, SNR, CNR, ISNR, SI
Vimala et al. [58]	SL	LPRNN (CNN + RNN)	Breast	MSE
Slimi et al. [59], Yu et al. [60], Sun et al. [61]	SSL	DAE/U-Net	Breast, Thyroid, Abdominal, General US	PSNR, SSIM
Zhang et al. [62]	USL	CNN/U-Net	Nerve	PSNR, SSIM, FSIM, EPI, CNR, SRE, UIQ, MSR
Chen et al. [63], Wei et al. [64], Basile et al. [65]	USL	VAE + U-Net	Liver, Breast, Abdominal, Heart, Mediastinum	PSNR, SSIM, MSSIM, ENL, MSE, CNR, SNR

Across nearly all application domains, SL approaches were predominant, with the highest concentration observed in breast US imaging, as illustrated in Figure 4. This finding is consistent with the relative availability of curated datasets and established evaluation benchmarks in breast imaging, which facilitate the use of paired or pseudo-ground-truth data for supervised training. SL-based methods were also widely applied in carotid artery, fetal head, cardiac, liver, lung, and abdominal imaging, indicating that the supervised paradigm remains the default choice when annotated data are accessible. In contrast, SSL and USL approaches were far less represented and confined to a limited subset of domains, such as liver, nerve, thyroid, and abdominal US. The relatively sparse adoption of SSL and USL across organ-specific datasets highlights an ongoing reliance on supervised formulations despite widespread acknowledgement of ground-truth limitations in US image denoising. As shown in Table 1, the reported DL architectures included conventional CNN-based denoisers, U-Net, denoising autoencoder (DAE) and convolutional autoencoder (CAE) based models, GAN frameworks, denoising CNN (DnCNN), hybrid networks, and transformer models. Supervised CNNs and U-Nets were applied broadly across breast, liver, lung, cardiac, fetal, and carotid artery image denoising, while transformer-based and hybrid architectures were more commonly applied to specialized domains such as nerve and cardiac US image denoising.
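The predominance of SL can be quantified directly from Table 1 by counting the studies listed in each row. The sketch below does this with a simple tally; the row tuples were transcribed by hand from Table 1 (paradigm, architecture label, number of studies in that row), and the names `TABLE1_ROWS` and `count_by_paradigm` are hypothetical.

```python
from collections import Counter

# (learning paradigm, architecture label, number of studies in the Table 1 row)
TABLE1_ROWS = [
    ("SL", "CNN", 7),
    ("SL", "U-Net", 6),
    ("SL", "DAE", 3),
    ("SL", "GAN", 4),
    ("SL", "CAE", 3),
    ("SL", "DnCNN", 2),
    ("SL", "Hybrid CNN + Transformer", 3),
    ("SL", "LPRNN (CNN + RNN)", 1),
    ("SSL", "DAE/U-Net", 3),
    ("USL", "CNN/U-Net", 1),
    ("USL", "VAE + U-Net", 3),
]


def count_by_paradigm(rows):
    """Sum the number of studies per learning paradigm."""
    totals = Counter()
    for paradigm, _architecture, n_studies in rows:
        totals[paradigm] += n_studies
    return totals


if __name__ == "__main__":
    totals = count_by_paradigm(TABLE1_ROWS)
    n_all = sum(totals.values())
    for paradigm, n in totals.most_common():
        print(f"{paradigm}: {n}/{n_all} ({100 * n / n_all:.1f}%)")
```

On these rows the tally gives 29 supervised, 3 self-supervised, and 4 unsupervised studies, which sums to the 36 included studies.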

Recent studies have begun investigating the SSL and USL paradigms to reduce the dependence on clean reference images. N2N-based approaches [42,64,65] demonstrated that denoising models can be trained using paired noisy acquisitions without requiring fully clean targets. Similarly, a diffusion-based denoising framework [63] has emerged, adopting U-Net as the underlying architecture. Such approaches are particularly relevant for clinical US imaging, where ground-truth acquisition is inherently challenging.

3.3. Study Transparency and Validation Characteristics

The reviewed studies demonstrated variability in reproducibility, validation, and translational reporting practices. Most studies relied on publicly available datasets, although some used private institution-specific and phantom-generated data, with limited transparency regarding acquisition settings and dataset composition. External validation was performed in a considerable proportion (20/36) of the studies (see Appendix A). However, the external evaluation datasets were single, internally acquired samples, potentially limiting generalizability across clinical environments.

Combinations of publicly available and private datasets for evaluating denoising performance were commonly utilized, suggesting the presence of both inter-dataset and intra-dataset variability. However, most studies did not explicitly analyze or report the impact of clinically relevant factors such as scanner differences, acquisition protocols, and operator dependency on denoising performance within consistent anatomical settings.

Code and data availability remained limited across the reviewed studies. Only a small subset of studies [47,53,57,62] provided openly accessible source code or implementation details, while several others offered resources only upon request. Although many studies utilized publicly available benchmark datasets, none of the studies that employed internally acquired clinical datasets made their full datasets publicly accessible.

In addition, only about one quarter of the reviewed studies [30,32,34,36,40,41,51,56,62] evaluated their denoising approaches in the context of downstream tasks such as detection, segmentation, and classification, even though the broader objective of US image enhancement and denoising is to support subsequent clinical interpretation and computer-aided analysis.

Furthermore, all reviewed studies adopted retrospective evaluation protocols, and none conducted prospective or real-world clinical testing. Consequently, the robustness and resilience of the proposed models under live clinical conditions, including variations in operators, scanners, acquisition protocols, patient populations, and real-time imaging artifacts remain largely unverified. A structured assessment of methodological quality, transparency, validation, and translational characteristics for all included studies is provided in Appendix A (Table A1).

3.4. Training Data and Noise Modeling Strategy

The reviewed studies exhibit considerable variability in training data composition and noise modeling strategies. A substantial proportion of studies relied on synthetic or simulated speckle noise, often generated using multiplicative noise models applied to clean US images. Several studies employed publicly available US datasets, particularly for breast, nerve, thyroid, cardiac, and carotid artery applications, while most of the others used private or institution-specific datasets. Moreover, all reviewed studies focused on 2D US image denoising. No study explicitly addressed the sequential and spatiotemporal characteristics of clinical US acquisition, which is typically performed as continuous probe sweeps that naturally produce spatiotemporal video data.
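As a rough illustration of what a multiplicative noise model applied to a clean image looks like, the sketch below corrupts each pixel as p * (1 + n), with n drawn from a zero-mean Gaussian. This is one common simplification and an assumption here: the reviewed studies differ in noise distribution and variance, and the function name `add_speckle`, the default variance, and the [0, 1] intensity range are all hypothetical choices.

```python
import random
from statistics import pstdev


def add_speckle(image, variance=0.05, seed=None):
    """Return a copy of `image` (rows of floats in [0, 1]) with synthetic speckle.

    Each pixel p becomes p * (1 + n), n ~ N(0, variance), clipped to [0, 1].
    Because the noise scales with p, brighter regions are perturbed more.
    """
    rng = random.Random(seed)
    sigma = variance ** 0.5
    noisy = []
    for row in image:
        noisy.append(
            [min(1.0, max(0.0, p * (1.0 + rng.gauss(0.0, sigma)))) for p in row]
        )
    return noisy


# Signal dependence: the same noise draw perturbs a bright patch more than a dark one.
flat_dark = [[0.2] * 500]
flat_bright = [[0.8] * 500]
dark_std = pstdev(add_speckle(flat_dark, variance=0.01, seed=1)[0])
bright_std = pstdev(add_speckle(flat_bright, variance=0.01, seed=1)[0])

if __name__ == "__main__":
    print(f"std in dark patch:   {dark_std:.4f}")
    print(f"std in bright patch: {bright_std:.4f}")
```

The larger spread in the bright patch is the signal dependence that distinguishes speckle from additive noise. It is also why a model trained only on such synthetic corruption may not transfer to real acquisitions, where the noise statistics are shaped by the scanner and its processing chain.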

3.5. Evaluation Metrics

PSNR and SSIM were the most frequently reported quantitative evaluation metrics. These full-reference measures were commonly used to assess reconstruction fidelity when paired or simulated ground-truth data were available. In addition, several studies reported US-specific metrics such as CNR and ENL to better reflect speckle suppression and contrast enhancement. On the other hand, a smaller subset of studies employed no-reference perceptual metrics, including NIQE and PIQE, particularly in scenarios where clean reference images were unavailable. Other metrics, such as entropy-based, edge-based, or gradient-based measures, were reported sporadically and were often specific to individual studies. The list of evaluation metrics reported across studies is summarized in Table 1.
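To make the distinction between full-reference and region-based metrics concrete, the sketch below implements PSNR, ENL, and CNR on flat sequences of pixel values. These follow commonly used textbook definitions, which is an assumption: the reviewed studies do not necessarily share them (CNR in particular has several variants, and some studies report it in dB). SSIM is omitted because a faithful implementation needs windowed local statistics.

```python
import math
from statistics import fmean, pvariance


def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB; needs a reference image (full-reference)."""
    mse = fmean([(r - t) ** 2 for r, t in zip(reference, test)])
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def enl(region):
    """Equivalent number of looks: mean^2 / variance of a homogeneous region."""
    var = pvariance(region)
    return math.inf if var == 0 else fmean(region) ** 2 / var


def cnr(region_a, region_b):
    """Contrast-to-noise ratio: |mean_a - mean_b| / sqrt(var_a + var_b)."""
    spread = math.sqrt(pvariance(region_a) + pvariance(region_b))
    if spread == 0:
        return math.inf
    return abs(fmean(region_a) - fmean(region_b)) / spread


if __name__ == "__main__":
    clean = [0.0, 0.0, 0.0, 0.0]
    denoised = [0.1, 0.1, 0.1, 0.1]
    print(f"PSNR: {psnr(clean, denoised):.2f} dB")
    print(f"ENL:  {enl([1.0, 3.0]):.2f}")
    print(f"CNR:  {cnr([1.0, 3.0], [5.0, 7.0]):.3f}")
```

ENL and CNR need no clean reference, only manually or automatically chosen regions, which is why they remain usable on real clinical images. The price is that their values depend on how those regions are selected.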

3.6. Descriptive Summary of Reported Quantitative Metrics

Studies employing synthetic speckle noise often evaluate performance under multiple noise levels and variance settings. To maintain clarity and avoid overrepresentation of individual experimental conditions, the reported values were aggregated within each study. These aggregated metrics are presented solely as descriptive summaries of the values reported in the literature and are not intended for direct cross-study performance comparison, as substantial heterogeneity exists across datasets, noise models, anatomical targets, preprocessing methods, training protocols, and evaluation settings.

The reviewed studies employed a combination of publicly available and privately acquired US datasets. As shown in Table 2, a substantial portion of the literature relied on open datasets, including BUSI [66], US-CASE [67], DDTI [68], BUSID [69], Breast US dataset (Dataset B) [70], UNS [71], CAMUS [72], MedPix [73], PCOS [74], CBIS-DDSM [75], INBreast [76], HC18 [77], CCA [78], BUS-BRA [79], and US-4 [80]. These public datasets are widely used to support benchmarking and reproducibility. In contrast, Table 3 summarizes studies using private clinical datasets obtained from hospitals and medical centers. While these datasets may better reflect real clinical imaging conditions, differences in acquisition protocols and restricted data accessibility further limit direct quantitative comparability across studies.

Table 2. Summary of performance reported by studies on public or open clinical US datasets.

Study	Dataset	PSNR (dB)	SSIM	Other Metrics Score
Slimi et al. [59]	BUS-BRA	33.82	0.7625	-
Saranya et al. [43]	PICMUS	44.48	0.935	-
Khalifa et al. [37]	Breast US	40.72	0.940	-
Cui et al. [30]	BUSID	-	-	ENL = 5.71, AGM = 38.57, NIQE = 4.25, PIQE = 31.83
	BUSI	-	-	ENL = 2.71, AGM = 33.24, NIQE = 4.74, PIQE = 50.61
	CCA	-	-	ENL = 0.76, AGM = 40.27, NIQE = 4.36, PIQE = 64.39
	US-CASE	-	-	ENL = 3.50, AGM = 65.18, NIQE = 5.38, PIQE = 50.57
Chen et al. [63]	US-CASE	35.19	0.90	-
Slimi et al. [44]	BUS-BRA	20.60	0.81	-
Chi et al. [32]	DDTI	36.82	0.93	-
Jiménez-Gaona et al. [46]	BUSI	39.79	0.96	-
Wei et al. [64]	BUSI	40.03	-	SSI = 0.80
Chen et al. [55]	UNS	32.82	0.9358	SSI = 0.79
	CAMUS	35.29	0.9317	SSI = 0.78
Kavand et al. [33]	BUI + MedPix	30.50	0.97	UIQ = 0.54
Jha et al. [34]	PCOS	72.96	0.99	UIQ = 0.23
Sivaanpu et al. [56]	HC18	-	0.965	ENL = 7.26, NIQE = 4.61, MSE = 13.905, SRE = 32.61, UIQ = 0.04
Sivaanpu et al. [47]	HC18	33.86	0.91	ISNR = 23.57 dB
	BUSI	34.16	0.90	ISNR = 18.52 dB
El-Hag et al. [35]	BUSI	28.72	0.77	NIQE = 4.50, MSE = 157.3, SNR = 40.95 dB
Bhute et al. [45]	BUSI	23.64	0.92	MSE = 0.0048
Bu et al. [57]	HC18	40.62	0.98	RMSE = 2.33
Hsu et al. [39]	BUSI + US-4	42.27	0.99	-
Reddy et al. [36]	INBreast + CBIS-DDSM	64.44	-	NIQE = 0.08, MSE = 0.22
Vimala et al. [58]	CBIS-DDSM	-	-	MSE = 13
	INBreast	-	-	MSE = 8.3
Monkam et al. [41]	HC18	-	-	ENL = 15.71, CNR = 1.10, SNR = 39.32 dB, SRE = 27.46
	BUSI	-	-	ENL = 17.04, CNR = 4.20, SNR = 34.54 dB, SRE = 17.04
	CCA	-	-	SNR = 40.87, CNR = 2.59, AGM = 35.92, ENL = 23.02

Table 3. Summary of performance reported by studies on private clinical US datasets.

Study	Dataset	PSNR (dB)	SSIM	Other Metrics Score
Saranya et al. [43]	Fetus	44.48	0.935	-
Chen et al. [60]	(abdominal)	32.22	0.89	-
Sun et al. [61]	Thyroid	32.89	0.88	-
Soy et al. [31]	Synthetic US	34.38	0.93	MSE = 0.0021
Devi et al. [38]	Clinical US	32.22	0.88	MSE = 0.0008, UIQ = 0.65
Sivaanpu et al. [56]	Heart Phantom	-	-	CNR = 18.78 dB, MSR = 3.85
Basile et al. [65]	Abdominal	-	-	ENL = 55.89, MSE = 0.004, SSI = 0.33, CNR = 4.21 dB, SNR = 8.57 dB
Jiang et al. [53]	Breast	23.13	0.81	-
Liu et al. [48]	breast, heart, lymph node	38.13	-	RMSE = 3.25, UIQ = 0.98
Satish et al. [40]	Fetal cardiac	29.07	0.86	-
Goudarzi et al. [42]	Heart	37.27	0.90	MSE = 0.006
	Chicken breast	37.11	0.91	MSE = 0.008
	Bovine liver	31.28	0.88	MSE = 0.017
Li et al. [52]	Fetal Heart	34.31	0.88	RMSE = 5.10
Gan et al. [49]	Liver	-	-	NIQE = 0.58, PIQE = 0.79, RMSE = 0.39

Figure 5a illustrates the range and distribution of PSNR and SSIM values reported in the reviewed studies. Most PSNR values were reported to be between approximately 30 dB and 45 dB, with several studies reporting higher values beyond this range. Reported SSIM values were commonly observed between approximately 0.85 and 0.97. The spread of the reported values reflects the substantial diversity of datasets, anatomical targets, imaging protocols, noise conditions, preprocessing strategies, and evaluation settings adopted across the literature. Variations in metric definitions, data normalization, and calculation procedures may also contribute to the observed differences among studies, limiting direct comparability of the reported metrics. An unusually high PSNR value was also reported in a PCOS-related study; however, as the evaluation was limited to a single anatomical domain, the generalizability of this result across different ultrasound imaging settings remains uncertain. Therefore, the reported quantitative metrics should be interpreted as descriptive summaries of the literature rather than direct indicators of relative method performance.
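One way in which data normalization alone can shift reported PSNR follows from the standard definition of the metric. The derivation below is a generic illustration and not a claim about how any reviewed study computed its values; the symbols MAX (peak intensity) and MSE, and the subscript [0,1] for quantities computed on data normalized to the unit range, are notation introduced here.

```latex
\begin{align}
\mathrm{PSNR} &= 10 \log_{10}\!\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right) \\
% Consistent case: rescaling intensities by c maps MAX -> c MAX and MSE -> c^2 MSE,
% so PSNR is unchanged when the peak matches the data range.
% Mismatched case: MSE computed on [0,1] data but MAX taken as 255.
\mathrm{PSNR}_{\text{mismatch}}
  &= 10 \log_{10}\!\left(\frac{255^2}{\mathrm{MSE}_{[0,1]}}\right)
   = \mathrm{PSNR}_{[0,1]} + 20 \log_{10}(255)
   \approx \mathrm{PSNR}_{[0,1]} + 48.13\ \mathrm{dB}
\end{align}
```

When the peak value matches the intensity range of the data, PSNR is invariant to rescaling. A mismatch between the two adds a constant offset, so values from studies that do not state their normalization and peak convention cannot be placed on a common scale.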

Similarly, Figure 5b presents the reported no-reference US image quality metrics, including ENL, EPI, NIQE, and CNR, extracted from the reviewed studies. The reported values span a broad range across the literature, consistent with the heterogeneity of imaging modalities, anatomical targets, acquisition conditions, and evaluation methodologies. As with PSNR and SSIM, differences in experimental design, dataset characteristics, and metric computation limit direct comparison across studies. These distributions provide an overview of how image quality has been quantified in prior studies and highlight the considerable variation in evaluation practices within ultrasound image enhancement research.
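To illustrate how two of these metrics are typically computed, the sketch below implements ENL and CNR on small regions of interest. Definitions vary between studies; this version assumes ENL is the squared mean divided by the variance of a homogeneous region, and CNR is the absolute mean difference divided by the root of the summed variances. The pixel values and function names are hypothetical.

```python
import math
from statistics import fmean, pvariance


def enl(region):
    """Equivalent number of looks of a homogeneous region: mean^2 / variance."""
    mu = fmean(region)
    var = pvariance(region, mu)
    if var == 0:
        return math.inf  # perfectly flat region
    return mu * mu / var


def cnr(target, background):
    """Contrast-to-noise ratio: |mean difference| / sqrt(sum of variances)."""
    mu_t, mu_b = fmean(target), fmean(background)
    spread = pvariance(target, mu_t) + pvariance(background, mu_b)
    if spread == 0:
        return math.inf
    return abs(mu_t - mu_b) / math.sqrt(spread)


def cnr_db(target, background):
    """CNR expressed in decibels (20 * log10 of the linear ratio)."""
    return 20.0 * math.log10(cnr(target, background))


# Hypothetical ROI intensities (assumption, for illustration only).
background_roi = [100, 110, 90, 105, 95]
target_roi = [150, 160, 140, 155, 145]

print("ENL (background):", enl(background_roi))
print("CNR (linear):", cnr(target_roi, background_roi))
print("CNR (dB):", round(cnr_db(target_roi, background_roi), 2))
```

Because both metrics depend on which regions are selected and on whether CNR is reported as a linear ratio or in dB, values from different studies are comparable only when these choices are stated.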

3.7. Methodological Trends

Across the reviewed studies, U-Net-based architectures and their variants were among the most frequently adopted approaches. Over 10 studies employed U-Net either as a standalone denoising network or as a backbone integrated with other methodological components. Its widespread use likely reflects its implementation simplicity and its effectiveness in capturing multi-scale contextual information while preserving fine spatial details during denoising. CNN-based models, including DAE, CAE, and DnCNN, also remained foundational across the literature, particularly within supervised learning paradigms. More broadly, CNNs dominated the domain, with more than seven studies using CNN-based architectures as primary denoising networks, while convolutional components were incorporated into nearly all proposed methods except ViT-based approaches. GAN-based approaches were frequently employed to enhance perceptual quality and texture realism, though these methods often required careful loss balancing and complex training procedures. In recent years (2024–2025), variational autoencoder (VAE), transformer-based, and hybrid CNN-transformer architectures have emerged, aiming to capture global contextual information, long-range dependencies, and probabilistic representations, especially for more complex domains such as cardiac, nerve, and fetal US image denoising. As shown in Figure 6, the trends indicate a gradual evolution from conventional CNN and U-Net models toward hybrid, transformer, and VAE architectures, reflecting both methodological innovation and adaptation to practical challenges in clinical US datasets, while U-Net, CNN, and hybrid CNN-ViT models remain widely used.

3.8. Identified Gaps

The structured mapping of the literature highlighted several persistent research gaps in DL-based US image denoising. A major limitation is the heavy reliance on simulated or synthetic noise models, with many studies trained on paired synthetic noise datasets. While these approaches facilitate controlled experimentation, they may not fully capture the complex and spatially varying characteristics of real clinical US speckle, potentially limiting clinical generalizability. Importantly, this challenge is not solely a data availability issue but also reflects the inherent difficulty of obtaining clean US ground truth, since speckle is intrinsically linked to the US image formation process and tissue scattering physics. Consequently, supervised models trained on synthetic paired data may achieve strong quantitative performance while remaining insufficiently robust under realistic clinical acquisition conditions. In this context, SSL and USL approaches, including N2N and diffusion-based methods, have emerged in a limited number of studies, particularly in settings where obtaining clean reference images is impractical. These methods are designed to learn mappings without explicit ground truth supervision, making them relevant for US imaging scenarios with limited or noisy annotations. Nevertheless, despite the inherent challenges associated with acquiring clean US images, such approaches remain relatively underrepresented compared to conventional supervised CNN-based methods.
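The difference between paired synthetic training data and training without clean targets can be sketched as follows. This is a simplified illustration that assumes a purely multiplicative, unit-mean gamma speckle model; the reviewed studies may use other noise models, and the function names, the `looks` parameter, and the flat test region are hypothetical. The second helper follows the Noise2Noise idea [21] of pairing two independently corrupted versions of the same scene. Here both are simulated from a known clean image, whereas in practice they would have to come from repeated real acquisitions.

```python
import random
from statistics import fmean, pvariance


def add_speckle(clean, looks, rng):
    """Multiply each pixel by gamma noise with mean 1 and variance 1/looks."""
    if looks <= 0:
        raise ValueError("looks must be positive")
    return [p * rng.gammavariate(looks, 1.0 / looks) for p in clean]


def supervised_pair(clean, looks, rng):
    """(noisy input, clean target): requires a clean reference image."""
    return add_speckle(clean, looks, rng), list(clean)


def noise2noise_pair(clean, looks, rng):
    """(noisy input, noisy target): two independent speckle realizations."""
    return add_speckle(clean, looks, rng), add_speckle(clean, looks, rng)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = [100.0] * 20000  # hypothetical flat "tissue" region

noisy, target = supervised_pair(clean, looks=4, rng=rng)
noisy_a, noisy_b = noise2noise_pair(clean, looks=4, rng=rng)

# Under this model the noisy mean stays near the clean value and the
# empirical ENL (mean^2 / variance) stays near the chosen number of looks.
empirical_enl = fmean(noisy) ** 2 / pvariance(noisy)
print("mean of noisy region:", round(fmean(noisy), 1))
print("empirical ENL:", round(empirical_enl, 2))
```

A model trained only on pairs from `supervised_pair` sees exactly the noise statistics chosen by the simulator, which is the source of the generalizability concern raised above.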

The review also revealed limited investigation of model robustness and generalizability. Although many studies utilized combinations of publicly available and private datasets, suggesting the presence of both inter-dataset and intra-dataset variability, most studies did not explicitly analyze the impact of scanner differences, acquisition protocols, operator dependency, and patient population variability on denoising performance within clinically consistent settings. Similarly, external validation was often limited, restricting comprehensive assessment of robustness across diverse real-world imaging conditions. Emerging architectures such as hybrid CNN-transformer models and VAE-based frameworks demonstrated promising performance within their respective experimental settings; however, their reliability and generalizability across devices, institutions, and clinical domains remain insufficiently explored.

In addition, the translational and clinical relevance of many proposed methods remains insufficiently validated. Although US image enhancement and denoising are ultimately intended to support clinical interpretation and downstream computer-aided analysis, only a limited number of studies incorporated expert evaluation or assessed performance in downstream tasks such as lesion detection, segmentation, classification, diagnostic assessment, or image-guided interventions. Furthermore, all reviewed studies relied on retrospective evaluation protocols without prospective or real-time clinical testing, leaving the robustness, reliability, and practical applicability of these methods under live clinical conditions largely unverified.
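One way to make downstream evaluation concrete is to compare a segmentation overlap score before and after denoising. The sketch below uses the Dice coefficient on flat binary masks. The masks are hypothetical, and the segmentation step that would produce them is outside the scope of the example.

```python
def dice(pred_mask, true_mask):
    """Dice overlap between two binary masks given as flat 0/1 lists."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical masks (assumption): an expert annotation and the outputs of
# the same segmentation model run on the noisy and on the denoised image.
expert_mask = [0, 1, 1, 1, 1, 0, 0, 0]
mask_from_noisy = [0, 0, 1, 1, 0, 0, 1, 0]
mask_from_denoised = [0, 1, 1, 1, 0, 0, 0, 0]

dice_noisy = dice(mask_from_noisy, expert_mask)
dice_denoised = dice(mask_from_denoised, expert_mask)
downstream_gain = dice_denoised - dice_noisy

print(f"Dice on noisy input:    {dice_noisy:.3f}")
print(f"Dice on denoised input: {dice_denoised:.3f}")
print(f"Downstream gain:        {downstream_gain:.3f}")
```

Reporting such a task-level change alongside PSNR and SSIM indicates whether a denoiser helps the analysis it is meant to support, rather than only improving pixel-level similarity.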

Reproducibility also remains a persistent challenge. Many studies did not provide open-source implementations or sufficiently detailed methodological protocols. Although publicly available benchmark datasets were frequently used, institution-specific clinical datasets were generally not released, limiting independent validation and fair comparison across methods. Thus, while the studies demonstrate substantial methodological innovation, these gaps point to the need for standardized benchmarking frameworks, multi-center validation studies, improved transparency in data and code sharing, and clinically grounded evaluation strategies to achieve robust and generalizable US denoising solutions.

4. Conclusions

This review presented a comprehensive and systematic analysis of recent DL-based approaches for US image denoising and speckle reduction. By examining 36 studies published between 2022 and 2025, the review mapped the current methodological landscape in terms of learning paradigms, architectural designs, dataset usage, noise modeling strategies, and evaluation metrics. The findings indicate that supervised learning approaches, particularly CNN and U-Net-based architectures, remain the most frequently adopted methods across the reviewed literature, largely driven by the availability of curated datasets and synthetic noise generation strategies. On the other hand, emerging approaches, including GANs, VAEs, N2N, and diffusion-based models such as diffusion U-Net, have received limited attention despite their potential advantage for US, where obtaining a clean ground truth image is impractical.

Our review also shows a lack of standardized evaluation protocols, with a continued dependence on PSNR and SSIM despite their limited clinical interpretability, insufficient evaluation of temporal consistency in US video sequence data, and limited assessment of performance on downstream clinical tasks such as lesion detection, segmentation, and diagnosis. Additionally, the review found limited or inconsistently reported evidence of robustness under dataset shift (e.g., across different datasets, devices, and acquisition protocols), and restricted access to datasets and source code.

Accordingly, future research should place greater emphasis on openly accessible datasets and source code, standardized and transparent reporting practices, and clinically meaningful evaluation frameworks that incorporate both perceptual quality assessment and downstream clinical tasks such as lesion detection, segmentation, and diagnosis. More rigorous robustness assessment under dataset and device variability, together with further investigation of temporal consistency in US video sequences and non-paired or no-ground-truth training approaches, is also needed. Finally, large-scale cross-dataset validation, multi-center clinical assessment, and prospective real-time clinical studies remain essential for developing reliable, generalizable, and clinically applicable DL-based US image denoising solutions.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/aimed1030018/s1, Table S1: Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) Checklist. Reference [81] is cited in the supplementary materials.

Author Contributions

Conceptualization, M.Z.D., A.H., M.M. and M.C.; methodology, M.Z.D., A.H., M.M. and M.C.; validation, M.Z.D., A.H., M.M. and M.C.; formal analysis, M.Z.D.; investigation, M.Z.D., A.H., M.M. and M.C.; resources, M.Z.D., A.H., M.M. and M.C.; data curation, M.Z.D., M.M. and M.C.; writing—original draft preparation, M.Z.D.; writing—review and editing, A.H., M.M. and M.C.; visualization, M.Z.D., A.H., M.M. and M.C.; supervision, A.H.; project administration, A.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Alberta Innovates AICE Concepts Grant awarded to Hareendranathan. The funding agency had no involvement in the study design, data collection, analysis, manuscript writing, or submission decision.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

We would like to acknowledge Stephanie Wichuk for providing language editing support for the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

US	Ultrasound
PRISMA-ScR	Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews
CNN	Convolutional neural network
GAN	Generative adversarial network
VAE	Variational autoencoder
CT	Computed tomography
MRI	Magnetic resonance imaging
NLM	Non-local means
PDE	Partial differential equation
SL	Supervised learning
SSL	Self-supervised learning
USL	Unsupervised learning
MSE	Mean squared error
RMSE	Root mean squared error
MSSIM	Mean structural similarity index
ENL	Equivalent number of looks
CNR	Contrast-to-noise ratio
SNR	Signal-to-noise ratio
FSIM	Feature similarity index measure
EPI	Edge preservation index
NIQE	Natural image quality evaluator
PIQE	Perception-based image quality evaluator
FOM	Figure of merit
ISNR	Improvement in signal-to-noise ratio
SI	Speckle index
SRE	Signal-to-reconstruction error
UIQ	Universal image quality
SSIM	Structural similarity index measure
PSNR	Peak signal-to-noise ratio

Appendix A. Methodological Quality and Validation Characteristics of Included Studies

Table A1. Structured assessment of methodological quality, transparency, validation, and translational characteristics of the included studies. The table summarizes key indicators relevant to reproducibility, generalizability, and clinical applicability, including external validation, use of publicly available datasets, code availability, clinical testing protocol, multicenter data utilization, and downstream task evaluation. ✓ indicates that the characteristic was reported, available, or performed; Χ indicates that the characteristic was not reported, unavailable, not used, or not performed. Retro. denotes retrospective evaluation.

Study	External Validation	Public Dataset	Code Availability	Clinical Testing Protocol	Multi-Center Data	Downstream Task Evaluation
Bu et al. [57]	✓	✓	✓	Retro.	Χ	Χ
Sivaanpu et al. [47]	✓	✓	✓	Retro.	Χ	Χ
Hsu et al. [39]	✓	✓	✓	Retro.	Χ	Χ
Jiang et al. [53]	✓	✓	Χ	Retro.	Χ	Χ
El-Hag et al. [35]	✓	✓	✓	Retro.	Χ	Χ
Chi et al. [32]	✓	✓	✓	Retro.	Χ	Χ
Khalifa et al. [37]	✓	✓	✓	Retro.	Χ	Χ
Mahmoudi et al. [54]	✓	✓	Χ	Retro.	Χ	Χ
Goudarzi et al. [42]	✓	Χ	Χ	Retro.	Χ	Χ
Devi et al. [38]	✓	Χ	Χ	Retro.	Χ	Χ
Chen et al. [63]	✓	✓	Χ	Retro.	Χ	Segmentation
Chen et al. [55]	✓	✓	Χ	Retro.	Χ	Χ
Reddy et al. [36]	✓	✓	Χ	Retro.	Χ	Classification
Zhang et al. [62]	✓	✓	✓	Retro.	Χ	Segmentation
Vimala et al. [58]	✓	✓	Χ	Retro.	Χ	Χ
Wei et al. [64]	✓	✓	✓	Retro.	Χ	Χ
Oliveira et al. [51]	✓	✓	Χ	Retro.	Χ	Classification
Soy et al. [31]	Χ	Χ	Χ	Retro.	Χ	Χ
Slimi et al. [59]	✓	✓	Χ	Retro.	Χ	Χ
Gan et al. [49]	✓	✓	Χ	Retro.	Χ	Χ
Jha et al. [34]	✓	✓	Χ	Retro.	Χ	Classification
Cui et al. [30]	✓	✓	✓	Retro.	✓	Segmentation
Li et al. [52]	✓	✓	✓	Retro.	Χ	Χ
Sun et al. [61]	Χ	Χ	Χ	Retro.	Χ	Χ
Yu et al. [60]	✓	✓	✓	Retro.	Χ	Χ
Sivaanpu et al. [56]	✓	✓	Χ	Retro.	Χ	Segmentation
Bhute et al. [45]	✓	✓	Χ	Retro.	Χ	Χ
Liu et al. [48]	✓	✓	Χ	Retro.	✓	Χ
Kavand et al. [33]	✓	✓	✓	Retro.	Χ	Χ
Chen et al. [50]	✓	✓	✓	Retro.	Χ	Χ
Satish et al. [40]	Χ	Χ	Χ	Retro.	Χ	Classification
Monkam et al. [41]	✓	✓	✓	Retro.	✓	Detection
Jiménez-Gaona et al. [46]	✓	✓	✓	Retro.	Χ	Χ
Saranya et al. [43]	✓	✓	Χ	Retro.	Χ	Χ
Slimi et al. [44]	✓	✓	Χ	Retro.	Χ	Χ
Basile et al. [65]	✓	Χ	✓	Retro.	Χ	Χ

References

1. Wolstenhulme, S. Peter Hoskins, Kevin Martin and Abigail Thrush (eds). Diagnostic Ultrasound: Physics and Equipment. Ultrasound 2020, 28, 62.
2. Szabo, T. Diagnostic Ultrasound Imaging—Inside Out; Elsevier: Amsterdam, The Netherlands, 2004.
3. Mahesh, M. The Essential Physics of Medical Imaging, Third Edition. Med. Phys. 2013, 40, 077301.
4. Torloni, M.R.; Vedmedovska, N.; Merialdi, M.; Betrán, A.P.; Allen, T.; González, R.; Platt, L.D. Safety of ultrasonography in pregnancy: WHO systematic review of the literature and meta-analysis. Ultrasound Obstet. Gynecol. 2009, 33, 599–608.
5. Quarato, C.M.I.; Lacedonia, D.; Salvemini, M.; Tuccari, G.; Mastrodonato, G.; Villani, R.; Fiore, L.A.; Scioscia, G.; Mirijello, A.; Saponara, A.; et al. A Review on Biological Effects of Ultrasounds: Key Messages for Clinicians. Diagnostics 2023, 13, 855.
6. Atkinson, P.R.; Milne, J.; Diegelmann, L.; Lamprecht, H.; Stander, M.; Lussier, D.; Pham, C.; Henneberry, R.; Fraser, J.M.; Howlett, M.K.; et al. Does Point-of-Care Ultrasonography Improve Clinical Outcomes in Emergency Department Patients with Undifferentiated Hypotension? An International Randomized Controlled Trial from the SHoC-ED Investigators. Ann. Emerg. Med. 2018, 72, 478–489.
7. Sarvazyan, A.P.; Urban, M.W.; Greenleaf, J.F. Acoustic waves in medical imaging and diagnostics. Ultrasound Med. Biol. 2013, 39, 1133–1146.
8. Grogan, S.P.; Mount, C.A. Ultrasound Physics and Instrumentation. In StatPearls; StatPearls Publishing: Treasure Island, FL, USA, 2025.
9. Pinton, G.F.; Trahey, G.E.; Dahl, J.J. Sources of image degradation in fundamental and harmonic ultrasound imaging using nonlinear, full-wave simulations. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 2011, 58, 754–765.
10. Goodsitt, M.M.; Carson, P.L.; Witt, S.; Hykes, D.L.; Kofler, J.M., Jr. Real-time B-mode ultrasound quality control test procedures. Report of AAPM Ultrasound Task Group No. 1. Med. Phys. 1998, 25, 1385–1406.
11. Moore, C.L.; Copel, J.A. Point-of-care ultrasonography. N. Engl. J. Med. 2011, 364, 749–757.
12. Yahya, N.; Kamel, N.S.; Malik, A.S. Subspace-based technique for speckle noise reduction in ultrasound images. Biomed. Eng. Online 2014, 13, 154.
13. Sivaanpu, A.; Punithakumar, K.; Zheng, R.; Nguyen, K.T.; Noga, M.; Ta, D.; Lou, E.H.M.; Le, L.H. Speckle Noise Reduction Techniques in Ultrasound Imaging: A comprehensive review of the last two decades (2005–2024). Comput. Methods Programs Biomed. 2025, 274, 109150.
14. Lee, J.S. Digital Image Enhancement and Noise Filtering by Use of Local Statistics. IEEE Trans. Pattern Anal. Mach. Intell. 1980, PAMI-2, 165–168.
15. Frost, V.S.; Stiles, J.A.; Shanmugan, K.S.; Holtzman, J.C. A model for radar images and its application to adaptive digital filtering of multiplicative noise. IEEE Trans. Pattern Anal. Mach. Intell. 1982, PAMI-4, 157–166.
16. Yongjian, Y.; Acton, S.T. Speckle reducing anisotropic diffusion. IEEE Trans. Image Process. 2002, 11, 1260–1270.
17. Pizurica, A.; Philips, W. Estimating the probability of the presence of a signal of interest in multiresolution single- and multiband image denoising. IEEE Trans. Image Process. 2006, 15, 654–665.
18. Coupé, P.; Hellier, P.; Kervrann, C.; Barillot, C. Nonlocal means-based speckle filtering for ultrasound images. IEEE Trans. Image Process. 2009, 18, 2221–2229.
19. Duarte-Salazar, C.A.; Castro-Ospina, A.E.; Becerra, M.A.; Delgado-Trejos, E. Speckle Noise Reduction in Ultrasound Images for Improving the Metrological Evaluation of Biomedical Applications: An Overview. IEEE Access 2020, 8, 15983–15999.
20. Wu, S.; Zhu, Q.; Xie, Y. Evaluation of various speckle reduction filters on medical ultrasound images. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2013, 2013, 1148–1151.
21. Lehtinen, J.; Munkberg, J.; Hasselgren, J.; Laine, S.; Karras, T.; Aittala, M.; Aila, T. Noise2Noise: Learning Image Restoration without Clean Data. In Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, Stockholm, Sweden, 10–15 July 2018; pp. 2965–2974. Available online: https://proceedings.mlr.press/v80/lehtinen18a.html (accessed on 20 June 2026).
22. Zander, D.; Hüske, S.; Hoffmann, B.; Cui, X.W.; Dong, Y.; Lim, A.; Jenssen, C.; Löwe, A.; Koch, J.B.H.; Dietrich, C.F. Ultrasound Image Optimization (“Knobology”): B-Mode. Ultrasound Int. Open 2020, 6, E14–E24.
23. Mandal, S.; Connolly, B.; Suh, E.; Moxham, J.; Hart, N. P132 Comparative study of the Curvilinear Ultrasound Probe (CUP) vs the Linear Ultrasound Probe (LUP) to measure Rectus Femoris Cross Sectional Area (RFcsa). Thorax 2011, 66, A120–A121.
24. Wierman, E. Understanding Gain in Ultrasound; E.L. Medical Imaging: Loveland, CO, USA, 2019; Volume 2026.
25. Pizurica, A.; Wink, A.M.; Vansteenkiste, E.; Philips, W.; Roerdink, B.T. A Review of Wavelet Denoising in MRI and Ultrasound Brain Imaging. Curr. Med. Imag. Rev. 2006, 2, 247–260.
26. Tian, C.; Fei, L.; Zheng, W.; Xu, Y.; Zuo, W.; Lin, C.W. Deep learning on image denoising: An overview. Neural Netw. 2020, 131, 251–275.
27. Gupta, N.; Shukla, A.P.; Agarwal, S. Despeckling of Medical Ultrasound Images: A Technical Review. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 2016, 8, 11–19.
28. Kaur, A.; Dong, G. A Complete Review on Image Denoising Techniques for Medical Images. Neural Process. Lett. 2023, 55, 7807–7850.
29. Mohd Sagheer, S.V.; George, S.N. A review on medical image denoising algorithms. Biomed. Signal Process. Control 2020, 61, 102036.
30. Cui, W.; Pan, Z.; Li, X.; Tang, Y.; Sun, S. Physical imaging model-guided deep variational despeckling framework for ultrasound images. Knowl. Based Syst. 2025, 329, 114409.
31. Soy, A.; Prakash, V.V. Medical Image Denoising using Deep Convolutional Autoencoders for Ultrasound. In Proceedings of the 2025 International Conference on Automation and Computation (AUTOCOM), Dehradun, India, 4–6 March 2025; pp. 262–267.
32. Chi, J.; Miao, J.; Chen, J.H.; Wang, H.; Yu, X.; Huang, Y. DSTAN: A Deformable Spatial-temporal Attention Network with Bidirectional Sequence Feature Refinement for Speckle Noise Removal in Thyroid Ultrasound Video. J. Imag. Inform. Med. 2024, 37, 3264–3281.
33. Kavand, A.; Bekrani, M. Speckle noise removal in medical ultrasonic image using spatial filters and DnCNN. Multimed. Tools Appl. 2024, 83, 45903–45920.
34. Jha, M.; Gupta, R.; Saxena, R. Noise cancellation of polycystic ovarian syndrome ultrasound images using robust two-dimensional fractional fourier transform filter and VGG-16 model. Int. J. Inf. Technol. 2024, 16, 2497–2504.
35. El-Hag, N.A.; El-Hoseny, H.M.; Harby, F. DNN-driven hybrid denoising: Advancements in speckle noise reduction. J. Opt. 2025, 54, 3126–3135.
36. Reddy, N.; Chitteti, C.; Yesupadam, S.; Desanamukula, V.; Vellela, S.S.; Bommagani, N. Enhanced Speckle Noise Reduction in Breast Cancer Ultrasound Imagery Using a Hybrid Deep Learning Model. Ingénierie Des. Systèmes D. Inf. 2023, 24, 1063–1071.
37. Khalifa, M.; Hamza, H.M.; Hosny, K.M. De-speckling of medical ultrasound image using metric-optimized knowledge distillation. Sci. Rep. 2025, 15, 23703.
38. Devi, P.N.; Senthil, K.M.; Meenakshipriya, B.; Selvavignesh, S.; Balaji, S.T.; Ananya, K.S.; Arunkumar, R.; Athithyaa, B.K. Denoising of Medical Ultrasound Images Using Deep Learning with Channel and Spatial Attention Based Modified U-Net. In Proceedings of the 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kamand, India, 24–28 June 2024; pp. 1–5.
39. Hsu, W.T.; Agbodike, O.; Chen, J. Attentive U-Net with Physics-Informed Loss for Noise Suppression in Medical Ultrasound Images. In Proceedings of the 2024 10th International Conference on Applied System Innovation (ICASI), Kyoto, Japan, 17–21 April 2024; pp. 409–411.
40. Satish, S.; Herald Anantha Rufus, N.; Antony Freeda Rani, M.; Senthil Rama, R. U-Net-Based Denoising Autoencoder Network for De-Speckling in Fetal Ultrasound Images. In Fourth International Conference on Image Processing and Capsule Networks; Springer Nature: Singapore, 2023; pp. 323–338.
41. Monkam, P.; Lu, W.; Jin, S.; Shan, W.; Wu, J.; Zhou, X.; Tang, B.; Zhao, H.; Zhang, H.; Ding, X.; et al. US-Net: A lightweight network for simultaneous speckle suppression and texture enhancement in ultrasound images. Comput. Biol. Med. 2023, 152, 106385.
42. Goudarzi, S.; Rivaz, H. Deep Ultrasound Denoising Without Clean Data; SPIE: Bellingham, WA, USA, 2023.
43. Senthamizh Selvi, R.; Suruthi, S.; Samyuktha Shrruthi, K.R.; Varsha, B.; Saranya, S.; Babu, B. Ultrasound Image Denoising Using Cascaded Median Filter and Autoencoder. In Proceedings of the 2023 4th International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 20–22 September 2023; pp. 296–302.
44. Slimi, T.; Ferjaoui, R.; Khalifa, A.B. Ultrasound Imaging Enhancement Using Denoising AutoEncoders. In Proceedings of the 2025 IEEE 22nd International Multi-Conference on Systems, Signals & Devices (SSD), Monastir, Tunisia, 17–20 February 2025; pp. 209–214.
45. Bhute, S.; Mandal, S.; Guha, D. Speckle Noise Reduction in Ultrasound Images using Denoising Auto-encoder with Skip connection. In Proceedings of the 2024 IEEE South Asian Ultrasonics Symposium (SAUS), Gujarat, India, 27–29 March 2024; pp. 1–4.
46. Jiménez-Gaona, Y.; Rodríguez-Alvarez, M.J.; Escudero, L.; Sandoval, C.; Lakshminarayanan, V. Ultrasound breast images denoising using generative adversarial networks (GANs). Intell. Data Anal. 2024, 28, 1661–1678.
47. Sivaanpu, A.; Punithakumar, K.; Thanikasalam, K.; Noga, M.; Zheng, R.; Ta, D.; Lou, E.H.M.; Le, L.H. A Lightweight Ultrasound Image Denoiser Using Parallel Attention Modules and Capsule Generative Adversarial Network. Inform. Med. Unlocked 2024, 50, 101569.
48. Liu, J.; Li, C.; Liu, L.; Chen, H.; Han, H.; Zhang, B.; Zhang, Q. Speckle noise reduction for medical ultrasound images based on cycle-consistent generative adversarial network. Biomed. Signal Process. Control 2023, 86, 105150.
49. Gan, J.; Wang, L.; Liu, Z.; Wang, J. Multi-scale ultrasound image denoising algorithm based on deep learning model for super-resolution reconstruction. In Proceedings of the 2023 4th International Conference on Control, Robotics and Intelligent System, Guangzhou, China, 25–27 August 2023; pp. 6–11.
50. Chen, Y.; Guo, Z. TranSpeckle: An edge-protected transformer for medical ultrasound image despeckling. IET Image Process. 2023, 17, 4014–4027.
51. Oliveira-Saraiva, D.; Mendes, J.; Leote, J.; Gonzalez, F.A.; Garcia, N.; Ferreira, H.A.; Matela, N. Make It Less Complex: Autoencoder for Speckle Noise Removal—Application to Breast and Lung Ultrasound. J. Imag. 2023, 9, 217.
52. Li, Y.; Zeng, X.; Dong, Q.; Wang, X. RED-MAM: A residual encoder-decoder network based on multi-attention fusion for ultrasound image denoising. Biomed. Signal Process. Control 2023, 79, 104062.
53. Jiang, M.; You, C.; Wang, M.; Zhang, H.; Gao, Z.; Wu, D.; Tan, T. Controllable Deep Learning Denoising Model for Ultrasound Images Using Synthetic Noisy Image; Springer Nature: Cham, Switzerland, 2024; pp. 297–308.
54. Mahmoudi Mehr, O.; Mohammadi, M.R.; Soryani, M. Deep Learning-Based Ultrasound Image Despeckling by Noise Model Estimation. Iran. J. Electr. Electron. Eng. 2023, 19, 1–13.
55. Chen, Y.; Guo, Z.; Yuan, J.; Li, X.; Yu, H. Dual-TranSpeckle: Dual-pathway transformer based encoder-decoder network for medical ultrasound image despeckling. Comput. Biol. Med. 2024, 173, 108313.
56. Sivaanpu, A.; Punithakumar, K.; Zheng, R.; Noga, M.; Ta, D.; Lou, E.H.M.; Le, L.H. Speckle Noise Reduction for Medical Ultrasound Images Using Hybrid CNN-Transformer Network. IEEE Access 2024, 12, 168607–168625.
57. Bu, Z.; Zhou, G.; Chen, Y. A Complementary Global and Local Knowledge Network for Ultrasound Denoising with Fine-Grained Refinement. arXiv 2024, arXiv:2310.03402.
58. Vimala, B.B.; Srinivasan, S.; Mathivanan, S.K.; Muthukumaran, V.; Babu, J.C.; Herencsar, N.; Vilcekova, L. Image Noise Removal in Ultrasound Breast Images Based on Hybrid Deep Learning Technique. Sensors 2023, 23, 1167.
59. Slimi, T.; Djeha, A.; Khalifa, A.B. Medical Ultrasound Image Improvement Based on Denoising Convolutional Autoencoder. In Proceedings of the 2025 IEEE 22nd International Multi-Conference on Systems, Signals & Devices (SSD), Monastir, Tunisia, 17–20 February 2025; pp. 715–720.
60. Yu, C.; Ren, F.; Bao, S.; Yang, Y.; Xu, X. Self-supervised ultrasound image denoising based on weighted joint loss. Digit. Signal Process. 2025, 162, 105151.
61. Sun, C.; Chi, J.; Yu, H.; Wu, B.; Li, Z.; Huang, Y. Self-Supervised Denoising of Thyroid Ultrasound Images Using SE-Module Enhanced U-Net with FPN. In Proceedings of the 2025 37th Chinese Control and Decision Conference (CCDC), Xiamen, China, 16–19 May 2025; pp. 4212–4217.
62. Zhang, T.-T.; Shu, H.; Lam, K.-Y.; Chow, C.-Y.; Li, A. Feature decomposition and enhancement for unsupervised medical ultrasound image denoising and instance segmentation. Appl. Intell. 2023, 53, 9548–9561.
63. Chen, N.; Zhang, Y.; Fan, C.; Zhao, W.; Wang, C.; Wang, H. DiffusionClusNet: Deep Clustering-Driven Diffusion Models for Ultrasound Image Enhancement. IEEE Trans. Consum. Electron. 2025, 71, 1495–1503.
64. Wei, P.; Wang, L.; Gan, J.; Shi, X.; Shang, M. Incorporation of Structural Similarity Index and Regularization Term into Neighbor2Neighbor Unsupervised Learning Model for Efficient Ultrasound Image Data Denoising. Appl. Sci. 2024, 14, 7988.
65. Basile, M.; Gibiino, F.; Cavazza, J.; Semplici, P.; Cocco, M.; Marcelloni, F.; Bechini, A.; Vanello, N. Unsupervised Learning of Speckle Removal from Real Ultrasound Acquisitions without Clean Data. In Proceedings of the 2024 IEEE International Symposium on Medical Measurements and Applications (MeMeA), Eindhoven, The Netherlands, 26–28 June 2024; pp. 1–6.
66. Al-Dhabyani, W.; Gomaa, M.; Khaled, H.; Fahmy, A. Dataset of breast ultrasound images [Dataset]. Data Brief. 2020, 28, 104863.
67. Geertsma, T.S. Ultrasound Cases [Dataset]. Available online: https://www.ultrasoundcases.info (accessed on 20 June 2026).
68. Lina, P.; Carlos, V.; Fabián, N.; Oscar, D.; Emma, M.; Eduardo, R. An open access thyroid ultrasound image database [Dataset]. Proc. SPIE 2015, 9287, 92870W.
69. Rodrigues, P.S. Breast Ultrasound Image [Dataset]; Mendeley Data: London, UK, 2017.
70. Yap, M.H.; Pons, G.; Martí, J.; Ganau, S.; Sentís, M.; Zwiggelaar, R.; Davison, A.K.; Martí, R. Automated Breast Ultrasound Lesions Detection Using Convolutional Neural Networks. IEEE J. Biomed. Health Inform. 2018, 22, 1218–1226.
71. Morelia, A.L. Ultrasound Nerve Segmentation [Dataset]. 2016. Available online: www.kaggle.com/c/ultrasound-nerve-segmentation (accessed on 20 June 2026).
72. Leclerc, S.; Smistad, E.; Pedrosa, J.; Ostvik, A.; Cervenansky, F.; Espinosa, F.; Espeland, T.; Berg, E.A.R.; Jodoin, P.M.; Grenier, T.; et al. Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography. IEEE Trans. Med. Imaging 2019, 38, 2198–2210.
73. National Library of Medicine. MedPix [Dataset]. 2021. Available online: https://medpix.nlm.nih.gov/home (accessed on 20 June 2026).
74. Indirani, A. PCOS Dataset [Dataset]; Figshare LLP: London, UK, 2024.
75. Sawyer-Lee, R.; Gimenez, F.; Hoogi, A.; Rubin, D. Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM) [Dataset]. Cancer Imaging Arch. 2016.
76. Moreira, I.C.; Amaral, I.; Domingues, I.; Cardoso, A.; Cardoso, M.J.; Cardoso, J.S. INbreast: Toward a full-field digital mammographic database [Dataset]. Acad. Radiol. 2012, 19, 236–248.
77. van den Heuvel, T.L.A.; de Bruijn, D.; de Korte, C.L.; van Ginneken, B. Automated measurement of fetal head circumference using 2D ultrasound images. PLoS ONE 2018, 13, e0200412.
78. Momot, A. Common Carotid Artery Ultrasound Images [Dataset], Mendeley Data. 2022. Available online: https://data.mendeley.com/datasets/d4xt63mgjm/1 (accessed on 20 June 2026).
79. Gómez-Flores, W.; Gregorio-Calas, M.J.; Coelho de Albuquerque Pereira, W. BUS-BRA: A breast ultrasound dataset for assessing computer-aided diagnosis systems [Dataset]. Med. Phys. 2024, 51, 3110–3123.
80. Chen, Y.; Zhang, C.; Liu, L.; Feng, C.; Dong, C.; Luo, Y.; Wan, X. USCL: Pretraining Deep Ultrasound Image Diagnosis Model Through Video Contrastive Representation Learning. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2021: 24th International Conference, Strasbourg, France, 27 September–1 October 2021; Part VIII, pp. 627–637.
81. Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.J.; Horsley, T.; Weeks, L.; et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann. Intern. Med. 2018, 169, 467–473.

Figure 1. Examples of common noise and artifacts in US imaging and their impact on image quality. (a) Motion artifacts in cardiac US leading to blurred structures. (b) Speckle noise obscuring the bone boundary in wrist US. (c) Speckle noise and acoustic shadowing reducing the visibility of thyroid nodule margins. Images were collected from projects conducted at the NIDUS Laboratory, including cardiac, wrist, and thyroid US studies.

Figure 3. PRISMA-ScR flow diagram summarizing the literature search and study selection process for US image denoising. A total of 951 records were identified, 801 were screened, 93 underwent full-text eligibility assessment, and 36 studies were included.

Figure 4. Distribution of ML approaches across anatomical regions in the included studies. The number of studies is shown for supervised learning (SL), self-supervised learning (SSL), and unsupervised learning (USL) methods.

Figure 5. Distribution of performance scores across studies. (a) Score distribution for frequently used reference-based evaluation metrics. (b) Score distribution for common no-reference evaluation metrics.

Figure 6. Trends in the network architectures used by US image denoising studies over the years.

Share and Cite

MDPI and ACS Style

Degu, M.Z.; Madhusoodanan, M.; Chippa, M.; Hareendranathan, A. Scoping Review of Recent Trends and Challenges in Artificial Intelligence Based Medical Ultrasound Denoising. AI Med. 2026, 1, 18. https://doi.org/10.3390/aimed1030018

