Denoising Degraded PCOS Ultrasound Images Using an Enhanced Denoising Diffusion Probabilistic Model

Peng, Jincheng; Guo, Zhenyu; Chen, Xing; Zhou, Ming

doi:10.3390/electronics14204061

¹ Yangtze Delta Region Academy of Beijing Institute of Technology, Jiaxing 314019, China

² Beijing Key Laboratory of Millimeter Wave and Terahertz Techniques, Beijing Institute of Technology, Beijing 100081, China

³ School of Engineering, University of Southern Queensland, Toowoomba, QLD 4350, Australia

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Electronics 2025, 14(20), 4061; https://doi.org/10.3390/electronics14204061 (registering DOI)

Submission received: 14 August 2025 / Revised: 2 October 2025 / Accepted: 11 October 2025 / Published: 15 October 2025

(This article belongs to the Special Issue Edge AI for Biomedical Applications: Innovations in Sensing, Computing and Security)

Abstract

Currently, for polycystic ovary syndrome (PCOS), diagnostic methods are mainly divided into hormonal indicators and ultrasound imaging. However, ultrasound images are often affected by noise and artifacts during the imaging process. This significantly degrades image quality and increases the difficulty of diagnosis. This paper proposes a PCOS ultrasound image denoising method based on an improved DDPM. During the forward diffusion process of the original model, Gaussian noise is progressively added using a cosine-based scheduling strategy. In the reverse diffusion process, a conditional noise predictor is introduced and combined with the original ultrasound image information to iteratively denoise and recover a clear image. Additionally, we fine-tuned and optimized the model to better suit the requirements of PCOS ultrasound image denoising. Experimental results show that our model outperforms state-of-the-art methods in both noise suppression and structural fidelity. It delivers a fully automated PCOS-ultrasound denoising pipeline whose diffusion-based restoration preserves clinically salient anatomy, improving the reliability of downstream assessments.

Keywords:

PCOS ultrasound image processing; denoising diffusion probabilistic model (DDPM); deep learning; image denoising

1. Introduction

Polycystic ovary syndrome (PCOS) is a common endocrine disorder characterized by hormonal imbalance, with an increasing prevalence among women of reproductive age worldwide. Currently, approximately 6% to 12% of women of childbearing age are affected by PCOS globally. The clinical manifestations of PCOS typically include menstrual irregularities (oligomenorrhea, amenorrhea, or dysfunctional uterine bleeding), hirsutism, obesity, infertility, elevated circulating free testosterone levels, a luteinizing hormone/follicle-stimulating hormone (LH/FSH) ratio > 2–3, ovulatory dysfunction, and hyperandrogenemia, often accompanied by polycystic ovarian morphology [1]. In addition to these clinical features, PCOS is associated with a broad spectrum of pathophysiological changes, including neuroendocrine abnormalities, dysregulation of glucose and lipid metabolism, and local ovarian dysfunction. Women with PCOS are at significantly increased risk for developing type 2 diabetes, hypertension, coronary heart disease, endometrial cancer, and gestational hypertension, all of which severely impact fertility, family stability, and quality of life. Therefore, advancing research on PCOS is of great significance for protecting women’s health.

Currently, research on the diagnosis of PCOS mainly focuses on two aspects: (1) biochemical analysis of patients’ hormone levels, and (2) auxiliary diagnosis based on ultrasound imaging in combination with the Rotterdam diagnostic criteria. The latter includes clinical and/or biochemical signs of hyperandrogenism, ovulatory dysfunction (oligomenorrhea or anovulation), and ultrasonographic features of polycystic ovaries (≥12 follicles measuring 2–9 mm in diameter and/or ovarian volume > 10 mL) [2]. Compared to hormonal assays, ultrasound imaging has become a crucial early diagnostic tool for clinicians and radiologists due to its non-invasive nature, real-time imaging capability, and cost-effectiveness. Ultrasound enables clear visualization of follicle location, size, and number, which, when combined with clinical findings such as hirsutism and male-pattern breast development, significantly improves diagnostic accuracy.

However, due to the technical limitations of ultrasound equipment, PCOS ultrasound images are more susceptible to noise and artifacts compared to CT and MRI, with speckle noise being particularly problematic. Speckle noise appears as a large number of randomly distributed granular artifacts, which obscure image boundaries, reduce tissue contrast, and may even mask clinically relevant features. In clinical practice, ultrasound imaging of PCOS faces several specific challenges: (1) The presence of air between the ultrasound probe and the skin leads to scattering of echo signals, resulting in severe speckle noise, which particularly hinders the identification of small follicles (~2 mm in diameter), thereby increasing the difficulty of visual interpretation and subsequent segmentation [3]. (2) Blurred follicle boundaries, low image contrast, and subtle grayscale differences significantly impair accurate feature extraction, thus hampering precise follicle identification and counting. (3) Severe overlap of tissue structures and complex background regions further complicate the differentiation of anatomical features, increasing the complexity and uncertainty of diagnosis. Therefore, there is an urgent need for effective ultrasound denoising methods that remove speckle noise while preserving texture and edges. High-quality images would improve analysis and support more accurate and efficient PCOS diagnosis.

Traditional image denoising methods, such as the bilateral filtering technique proposed by Tomasi et al. [4], perform image smoothing by simultaneously considering the grayscale differences and spatial distances between pixels using a dual-weighted template. While bilateral filtering can remove noise and partially preserve texture and edge information, it often leads to image detail blurring, thereby limiting its effectiveness in detail preservation. To address these limitations, Buades et al. [5] introduced the non-local means (NLM) filtering algorithm. Unlike conventional methods, NLM filtering suppresses noise by analyzing the similarity between entire image patches rather than individual pixels. NLM has demonstrated excellent results in removing additive Gaussian noise and has been widely applied in various medical imaging modalities such as MRI and CT. However, due to the multiplicative nature of speckle noise in ultrasound images, the standard NLM algorithm exhibits limited performance when processing such noise. To overcome this limitation, Deledalle et al. [6] proposed an adaptive patch size approach that dynamically adjusts patch sizes according to local image characteristics, thereby enhancing the algorithm’s adaptability to different image regions and improving the suppression of speckle noise. Coupe et al. [7] further developed the optimized Bayesian non-local means (OBNLM) filter, which employs the Pearson distance within a Bayesian framework to measure patch similarity, offering greater accuracy compared to the Euclidean distance used in traditional NLM. Yet, the OBNLM method still relies on assumptions about the underlying noise distribution, which may not always hold for the complex, spatially varying noise present in ultrasound images. Building on these advances, Dabov et al. [8] proposed a novel denoising method known as block-matching and 3D filtering (BM3D). 
The core concept of BM3D involves grouping multiple similar patches in the image into a three-dimensional block via block matching, exploiting non-local similarities among local patches to significantly enhance sparsity in the transform domain and improve denoising capability. Although BM3D excels in structural preservation and denoising performance for Gaussian noise, it is less adaptable to non-Gaussian noise found in real ultrasound images and is computationally intensive. Moreover, BM3D’s reliance on accurate patch matching can be hampered by the strong, non-additive speckle noise commonly observed in ultrasound imaging, further limiting its practical denoising effectiveness in this domain.

In recent years, the widespread application of deep learning models such as convolutional neural networks (CNNs), generative adversarial networks (GANs), and U-net architectures in image processing has provided new perspectives for ultrasound image denoising. These models can automatically learn complex features from images through end-to-end training, effectively remove noise, and simultaneously preserve image details and structures. Worku Jifara et al. [9] proposed a medical image denoising method based on a deep feedforward convolutional neural network, employing a residual learning strategy to obtain denoised images by subtracting the learned noise from the noisy input. When this method is directly applied to multiplicative noise, its performance is generally unsatisfactory. Yang et al. [10] introduced a deep learning model combining Wasserstein GAN (WGAN) with perceptual loss for low-dose CT image denoising. This approach preserves structural details while ensuring denoising performance and overcomes the image blurring issue caused by traditional mean squared error (MSE) loss functions. WGAN is a generative network model that can be used for multiplicative noise denoising. However, due to issues such as fixed hyperparameter settings and mode collapse, its performance is often unsatisfactory. Zhang et al. [11] proposed a generative adversarial network based on residual dense connectivity and weighted joint loss, achieving promising denoising results on ultrasound images. However, these GAN-based methods require strict Nash equilibrium conditions to avoid issues such as gradient explosion and mode collapse. To fundamentally address the limitations of GAN-based models, Jonathan Ho et al. [12] improved the mathematical formulation of the original diffusion probabilistic model by defining a Markov chain of diffusion steps, leading to the development of the denoising diffusion probabilistic model (DDPM). 
Subsequent studies have shown that diffusion models [13] have rapidly surpassed GANs as a promising approach for generating high-quality data, achieving remarkable results in novel computer vision tasks such as image synthesis, super-resolution, colorization, and image translation, thereby guiding image processing research into new directions. For instance, Peng et al. [14] proposed LW-DDPM, which reduces MRI image sampling cost and enhances the scalability of diffusion models by designing lightweight attention modules. Krishna [15] trained a large-scale generative model in the lung CT domain using a denoising diffusion probabilistic model (DDPM) combined with a classifier-free sampling strategy. Jiang et al. [16] proposed a method called Lung-DDPM for thoracic CT image synthesis, which can efficiently generate high-fidelity 3D synthetic CT images. Li [17] introduced a novel denoising method called the conditional denoising diffusion probabilistic model (c-DDPM), leveraging the advantages of DDPM for quality enhancement and restoration of ultra-low-dose CT lung nodule images.

In this study, we propose a denoising method for PCOS ultrasound images based on an improved denoising diffusion probabilistic model (DDPM). The proposed approach incorporates diffusion priors into the maximum a posteriori (MAP) framework to address the challenge of denoising medical ultrasound images. We retain the original DDPM’s fixed forward process. During the forward diffusion process, Gaussian noise is gradually added to perturb the data distribution using a cosine scheduling strategy. In the reverse diffusion process, a conditional noise predictor is introduced and combined with the information from the original noisy ultrasound image. The denoising is then achieved through an iterative reverse process, progressively restoring a clean image. Furthermore, we refine and optimize the model’s depth, attention mechanisms, and loss to better accommodate the specific requirements of PCOS ultrasound image denoising. Experimental results demonstrate that our method can effectively remove speckle noise from PCOS ultrasound images while preserving the texture and edge details of the original noisy images, achieving state-of-the-art generative performance and producing high-quality images.

2. Materials and Methods

2.1. Ultrasound Noise Degradation Modeling

The noise in ultrasound images primarily consists of two types: speckle noise and additive Gaussian noise. Among them, speckle noise is the dominant noise type in ultrasound imaging. It primarily results from coherent interference effects that occur as ultrasound waves emitted by the transducer propagate through tissues containing numerous sub-wavelength scatterers, such as cells and collagen fibers. The phase differences among these scattered waves lead to constructive (enhancing) or destructive (diminishing) interference, which manifests as randomly distributed granular textures in the image [18]. This speckle noise can obscure or distort the true information of the tissue, significantly degrading image quality and clarity, and thus adversely affecting the accuracy of medical diagnosis.

Although additive Gaussian noise is not the predominant noise in ultrasound images, it mainly originates from practical equipment such as electronic amplifiers, analog-to-digital converters (A/D converters), and other circuit components, and should not be neglected. In constructing the noise degradation model for ultrasound images, we also take this type of noise into account. Especially under conditions of low signal-to-noise ratio (SNR) and low contrast, the combination of multiplicative speckle noise and additive Gaussian noise can significantly degrade ultrasound image quality, severely impacting physicians’ ability to identify and diagnose pathological tissues.

In view of this, we adopt an image degradation approach to model noise, aiming to better simulate the complex noise conditions encountered in real clinical environments and thus enhance the robustness and effectiveness of noise suppression algorithms in practical diagnostic scenarios. In particular, we consider the statistical characteristics of the noise term in the multiplicative degradation model, establishing the relationship for multiplicative speckle noise as follows:

I_{noiseMul} = I_{true} \times n

(1)

where n typically follows a Rayleigh, K, or Gamma distribution. Here, I_{noiseMul} denotes the noisy image, and I_{true} represents the underlying clean image to be estimated. Similarly, the additive noise relationship is given by:

I_{noiseAdd} = I_{true} + n

(2)

where n generally follows a Gaussian distribution. The total noise degradation model can be expressed as:

x = η_{1} + y \cdot η_{2}

(3)

where x is the noisy image, y is the unknown noise-free image, and η_{1} and η_{2} are the additive and multiplicative noise terms, respectively. This image degradation mechanism enables the simulated noise levels to closely resemble the typical noise characteristics commonly observed in clinical ultrasound images.
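The mixed degradation in Equation (3) can be illustrated with a minimal sketch. It assumes Gamma-distributed speckle with unit mean for the multiplicative term and zero-mean Gaussian noise for the additive term; the function name `degrade`, the shape parameter, and the standard deviation are hypothetical choices for illustration, not the settings used in the paper.

```python
import random

def degrade(clean, speckle_shape=10.0, gaussian_sigma=0.05, seed=0):
    """Apply x = eta1 + y * eta2 to a 2-D image with values in [0, 1].

    eta2: multiplicative speckle drawn from Gamma(shape=k, scale=1/k), mean 1.
    eta1: additive Gaussian noise with zero mean and std gaussian_sigma.
    Both parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    noisy = []
    for row in clean:
        noisy_row = []
        for y in row:
            eta2 = rng.gammavariate(speckle_shape, 1.0 / speckle_shape)
            eta1 = rng.gauss(0.0, gaussian_sigma)
            # Clip to the valid intensity range after applying both noise terms.
            noisy_row.append(min(1.0, max(0.0, eta1 + y * eta2)))
        noisy.append(noisy_row)
    return noisy

clean = [[0.2, 0.5, 0.8], [0.8, 0.5, 0.2]]
noisy = degrade(clean)
```

A smaller `speckle_shape` gives a wider Gamma distribution and therefore heavier speckle, while `gaussian_sigma` controls the additive component independently.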

2.2. DDPM Model

Diffusion probabilistic models (DPM) [19] are a class of generative models constructed based on the principles of non-equilibrium thermodynamics. The core idea is to gradually transform the data distribution into a Gaussian noise distribution via a forward diffusion process, and then learn the reverse process to progressively denoise and recover the original data distribution. The denoising diffusion probabilistic model (DDPM) improves upon the original DPM’s mathematical framework and has been widely applied in image generation tasks. The DDPM process consists of two parameterized Markov chains: a forward process and a reverse process.

The forward process of DDPM is defined as follows:

q (x_{t} | x_{t - 1}) = N (x_{t}; \sqrt{1 - β_{t}} x_{t - 1}, β_{t} I)

(4)

where x_{t} is the noisy data at time step t, and β_{t} is the noise variance schedule over T diffusion steps.
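As a rough illustration of Equation (4), the sketch below applies the forward step repeatedly to a toy four-pixel signal. It also uses the standard closed form x_t = sqrt(ᾱ_t) x_0 + sqrt(1 - ᾱ_t) ε, with ᾱ_t the cumulative product of (1 - β_s), which follows from composing the Gaussian steps. The linear schedule endpoints are assumptions chosen for demonstration only.

```python
import math
import random

def forward_step(x_prev, beta_t, rng):
    """One step of Eq. (4): x_t ~ N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    scale = math.sqrt(1.0 - beta_t)
    std = math.sqrt(beta_t)
    return [scale * v + std * rng.gauss(0.0, 1.0) for v in x_prev]

def alpha_bar(betas):
    """Cumulative products of (1 - beta_s): the signal variance kept after t steps."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def forward_jump(x0, abar_t, rng):
    """Closed-form sample of x_t given x_0, equivalent to t chained forward steps."""
    return [
        math.sqrt(abar_t) * v + math.sqrt(1.0 - abar_t) * rng.gauss(0.0, 1.0)
        for v in x0
    ]

# Illustrative linear schedule (endpoints are assumptions, not the paper's settings).
T = 1000
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
abar = alpha_bar(betas)

rng = random.Random(0)
x = [0.5] * 4  # a toy "image" flattened to four pixels
for beta_t in betas:
    x = forward_step(x, beta_t, rng)
```

After all T steps `abar[-1]` is close to zero, so `x` is approximately a standard Gaussian sample regardless of the starting image.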

The reverse process is the inverse of the forward process, where denoising is achieved by iteratively sampling in the reverse direction, gradually transforming Gaussian noise into a complete image drawn from p (x_{0}). Evaluating the exact reverse distribution p (x_{t - 1} | x_{t}) would require the entire training dataset. DDPM therefore approximates this distribution using a neural network parameterized by θ, and, leveraging Bayes’ theorem, it can be expressed as:

p_{θ} (x_{t - 1} | x_{t}) = N (x_{t - 1}; μ_{θ} (x_{t}, t), Σ_{θ} (x_{t}, t))

(5)

where μ_{θ} and Σ_{θ} denote the mean and variance predicted by the neural network.

2.3. Improved DDPM Denoising Model

Traditional denoising diffusion models are typically based on unconditional or simply conditioned inputs, using the DDPM reverse transition p_{θ} (x_{t - 1} | x_{t}) to progressively denoise from pure Gaussian noise and generate a random, clean image x_{0}. Building on the original T-step DDPM and inspired by [20], we enhance the modeling of data conditional distributions so that DDPM can be effectively applied to ultrasound image denoising tasks.

In the improved conditional diffusion model, the forward process remains consistent with the unconditional model: Gaussian noise is added over T steps to transform a clean image into a fully noisy image x_{T}. In the reverse diffusion process, a conditional noise prediction network ε_{θ} is introduced. Starting from x_{T}, the model iteratively denoises over T steps, mapping the initial Gaussian noise to a clean image with complex data distribution, ultimately reconstructing the corresponding clean ultrasound image x_{0}.

Specifically, our conditional noise prediction network is based on a modified U-net [21] architecture. We further refine the U-net by adjusting the number of layers and the attention mechanism, as illustrated in Figure 1. Both the encoder and the decoder are constructed using convolutional residual blocks, squeeze-and-excitation (SE) attention modules, down-sampling and up-sampling convolutions, and skip connections between encoder and decoder layers. Each residual block consists of two normalization layers and a SiLU activation layer (unlike ReLU, SiLU is a smooth activation function that passes small negative values instead of zeroing them, which better preserves information in the negative domain). In addition, we replace the original encoder attention module with an SE-attention mechanism, which reduces the number of model parameters and improves computational efficiency, yielding a more lightweight design.
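To make the squeeze-and-excitation idea concrete, the following is a minimal sketch of a generic SE block on a tiny feature map. It is not the authors' implementation: the weights, the reduction ratio, and the omission of biases are assumptions for illustration. The gate for each channel is computed from a global average, so the module needs only two small dense layers.

```python
import math

def se_attention(feature_map, w_reduce, w_expand):
    """Squeeze-and-excitation over a feature map shaped [C][H][W].

    w_reduce: [C // r][C] weights of the bottleneck layer (r = reduction ratio).
    w_expand: [C][C // r] weights mapping back to one gate per channel.
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck with ReLU, then expansion with a sigmoid gate.
    hidden = [max(0.0, sum(w * s for w, s in zip(ws, squeezed))) for ws in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(ws, hidden))))
        for ws in w_expand
    ]
    # Scale: reweight every channel by its gate, a value in (0, 1).
    scaled = [
        [[g * v for v in row] for row in channel]
        for g, channel in zip(gates, feature_map)
    ]
    return scaled, gates

# Toy example: 2 channels of size 2 x 2, reduction ratio r = 2 (hypothetical weights).
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
w_reduce = [[0.5, 0.5]]
w_expand = [[1.0], [-1.0]]
scaled, gates = se_attention(features, w_reduce, w_expand)
```

In this toy case the first channel is emphasized and the second is attenuated, which is the channel reweighting behavior the SE module contributes to each encoder stage.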

To ensure that the generated denoised image x_{0} retains the same information as the input noisy image y, the degraded noisy image y is concatenated with the current diffusion state x_{t} along the channel dimension and fed into the convolutional layers during the forward pass. The timestep t is encoded into a timestep embedding t_{e} using a Transformer-style sinusoidal positional encoding, which is then passed to each residual block. The U-net ε_{θ} predicts the noise contained in x_{t}, and the loss is computed between the predicted noise and the noise that was actually added; conditioning on y enables the network to learn contextual and semantic features from the noisy image and to perform conditional sampling. The conditional reverse distribution of x_{t - 1} is formulated as:

p_{θ} (x_{t - 1} | x_{t}, y) = N (x_{t - 1}; μ_{θ} (x_{t}, y, t), β_{t} I)

(6)

where the mean is predicted by the U-net denoising network and β_{t} is the fixed noise variance at timestep t. Expressing the mean in terms of the noise predicted by ε_{θ} gives the reverse sampling step:

x_{t - 1} = \frac{1}{\sqrt{α_{t}}} \left( x_{t} - \frac{1 - α_{t}}{\sqrt{1 - \overline{α}_{t}}} ε_{θ} (x_{t}, y, t) \right) + σ_{t} z

(7)

where α_{t} = 1 - β_{t}, \overline{α}_{t} = \prod_{s = 1}^{t} α_{s}, σ_{t}^{2} = β_{t}, and z is standard Gaussian noise. The final denoised ultrasound image x_{0} is obtained through iterative reverse diffusion.
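The iterative reverse diffusion can be sketched as follows, using the standard DDPM form of Equation (7), in which the factor 1/sqrt(α_t) multiplies the whole bracketed term, with α_t = 1 - β_t and σ_t = sqrt(β_t). The trained network ε_θ is replaced by a placeholder predictor that simply treats the conditioning image y as the clean image, and no noise is added on the final step; both are assumptions made so the example runs on its own.

```python
import math
import random

def cumulative_alpha(betas):
    """alpha-bar_t = product of (1 - beta_s) for s = 1..t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def reverse_step(x_t, y, t, betas, abar, predict_noise, rng):
    """One conditional reverse step; t is 1-indexed, so betas[t - 1] is beta_t."""
    beta_t = betas[t - 1]
    alpha_t = 1.0 - beta_t
    eps = predict_noise(x_t, y, t)  # stands in for eps_theta(x_t, y, t)
    coef = (1.0 - alpha_t) / math.sqrt(1.0 - abar[t - 1])
    sigma_t = math.sqrt(beta_t) if t > 1 else 0.0  # no noise on the last step
    return [
        (x - coef * e) / math.sqrt(alpha_t) + sigma_t * rng.gauss(0.0, 1.0)
        for x, e in zip(x_t, eps)
    ]

def sample(y, betas, predict_noise, seed=0):
    """Run the reverse chain from Gaussian noise x_T down to x_0."""
    rng = random.Random(seed)
    abar = cumulative_alpha(betas)
    x = [rng.gauss(0.0, 1.0) for _ in y]  # x_T: pure Gaussian noise
    for t in range(len(betas), 0, -1):
        x = reverse_step(x, y, t, betas, abar, predict_noise, rng)
    return x

def make_placeholder_predictor(betas):
    """Hypothetical predictor: inverts x_t = sqrt(abar) * x_0 + sqrt(1 - abar) * eps
    while pretending that the conditioning image y is the clean image x_0."""
    abar = cumulative_alpha(betas)
    def predict_noise(x_t, y, t):
        a = abar[t - 1]
        return [(x - math.sqrt(a) * v) / math.sqrt(1.0 - a) for x, v in zip(x_t, y)]
    return predict_noise

# Short illustrative schedule (50 steps, assumed endpoints) and a toy 3-pixel image.
betas = [1e-4 + (0.02 - 1e-4) * t / 49 for t in range(50)]
y = [0.2, 0.5, 0.8]
x0 = sample(y, betas, make_placeholder_predictor(betas))
```

With this placeholder the chain simply returns y, which confirms that the update is self-consistent; a trained network would instead predict the noise so that the chain converges toward a clean image conditioned on y.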

2.4. Noise Addition Strategy and Loss Function

The original DDPM model employs a linear noise schedule. However, a linear schedule introduces excessive noise in the early stages, causing the data to diffuse too rapidly, as shown in Figure 2, which makes the reverse reconstruction process more challenging. In the later stages, as the data has already become nearly random noise, the incremental noise is insufficient, resulting in smaller changes and slower diffusion. This leads to inefficient use of diffusion or reverse diffusion steps. Therefore, we adopt a cosine noise schedule by setting the noise addition timetable via hyperparameters. This approach helps ensure a smoother noise addition process, reduces overall damage to image features, facilitates better noise diffusion in the forward process, preserves essential semantic information in the images, and improves the fidelity of the synthesized results.
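The difference between the two schedules can be checked numerically. The sketch below assumes the commonly used cosine formulation, ᾱ_t = f(t)/f(0) with f(t) = cos²(((t/T + s)/(1 + s)) · π/2), offset s = 0.008, and clipping of β_t at 0.999. The paper does not state its exact formula or hyperparameters, so these values and the linear endpoints are assumptions.

```python
import math

def cosine_alpha_bar(T, s=0.008):
    """alpha-bar_t for t = 0..T under the cosine schedule (offset s is assumed)."""
    f = [
        math.cos(((t / T) + s) / (1.0 + s) * math.pi / 2.0) ** 2
        for t in range(T + 1)
    ]
    return [v / f[0] for v in f]

def cosine_betas(T, s=0.008, max_beta=0.999):
    """Per-step variances beta_t = 1 - abar_t / abar_{t-1}, clipped for stability."""
    abar = cosine_alpha_bar(T, s)
    return [min(1.0 - abar[t] / abar[t - 1], max_beta) for t in range(1, T + 1)]

def linear_alpha_bar(T, beta_start=1e-4, beta_end=0.02):
    """alpha-bar_t for t = 1..T under a linear beta schedule (assumed endpoints)."""
    out, prod = [], 1.0
    for t in range(T):
        prod *= 1.0 - (beta_start + (beta_end - beta_start) * t / (T - 1))
        out.append(prod)
    return out

T = 1000
cos_abar = cosine_alpha_bar(T)[1:]  # drop t = 0 so both lists cover t = 1..T
lin_abar = linear_alpha_bar(T)
betas = cosine_betas(T)

# Fraction of the original signal variance remaining halfway through diffusion.
cos_mid = cos_abar[T // 2 - 1]
lin_mid = lin_abar[T // 2 - 1]
```

Under these assumptions `cos_mid` is several times larger than `lin_mid`, meaning the cosine schedule retains far more image structure at the midpoint, which matches the smoother noise addition described above.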

The original DDPM utilizes the L1 loss function. Compared to L1 loss, L2 loss generally achieves faster convergence. However, when the prediction error is large, the gradient produced by the L2 loss can be excessive. In our denoising training process, we employ the smooth L1 loss [22], which combines the advantages of both L1 and L2 losses and is less sensitive to outliers. At the early stage of training, when the difference between the predicted and true values is large, smooth L1 loss provides larger gradients to accelerate convergence. As the model approaches the optimal solution, the gradient naturally decreases, which helps stabilize convergence and avoids oscillation near the optimum. Thus, the smooth L1 loss function effectively controls the gradient magnitude throughout the training process. The smooth L1 loss function is defined as follows:

\mathrm{SmoothL1}(x) = \begin{cases} 0.5 x^{2}, & |x| < 1 \\ |x| - 0.5, & |x| \geq 1 \end{cases}

(8)

where x denotes the difference between the predicted value and the ground truth. This loss function ensures both robustness to outliers and stable convergence during model training.
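Equation (8) and its gradient can be written out directly. The sketch below is a plain-Python illustration with hypothetical function names; the mean reduction over elements is an assumption, since the paper does not specify how the per-element losses are aggregated.

```python
def smooth_l1(x):
    """Eq. (8): quadratic for |x| < 1, linear otherwise."""
    return 0.5 * x * x if abs(x) < 1.0 else abs(x) - 0.5

def smooth_l1_grad(x):
    """Derivative of Eq. (8): equals x near zero, saturates at +/-1 for large errors."""
    if abs(x) < 1.0:
        return x
    return 1.0 if x > 0 else -1.0

def smooth_l1_loss(pred, target):
    """Mean smooth L1 loss over paired predicted and true noise values."""
    return sum(smooth_l1(p - t) for p, t in zip(pred, target)) / len(pred)

# Small errors are penalized quadratically, large errors linearly.
small = smooth_l1(0.5)    # inside the quadratic region
large = smooth_l1(-3.0)   # inside the linear region
loss = smooth_l1_loss([0.0, 2.0], [0.5, 0.0])
```

The gradient is bounded by 1 in magnitude for large errors (as with L1) and shrinks linearly to zero near the optimum (as with L2), which is the behavior the text attributes to this loss.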

3. Results

Dataset and Experimental Environment

In this study, a publicly available ultrasound image dataset of polycystic ovary syndrome (PCOS), downloaded from the Kaggle platform [23], was utilized for model training and testing. The dataset originates from a six-month longitudinal follow-up of 80 PCOS patients, documenting ultrasound images captured at different time points. The original dataset comprises 1924 training images and 1932 testing images, encompassing a wide variety of cases. This rich diversity makes the dataset suitable for research on classification models in PCOS diagnosis and supports the application of explainable artificial intelligence in medical image analysis.

We validated the proposed method using the Polycystic Ovary Ultrasound Images Dataset from Telkom University Dataverse [24] (2021), a real clinical ultrasound dataset. It comprises ovarian ultrasound images, each annotated by a specialist as either polycystic ovary syndrome (PCOS) or Normal. Dataset size and organization: 54 publicly available JPEG images. Annotations and availability: Binary labels (PCOS/Normal) were assigned by specialists; no patient-level clinical variables or imaging device parameters are provided. Notably, raw ultrasound images collected across hospitals and devices typically exhibit a sector-shaped field of view covering the ovary and its surrounding region. However, the peripheral regions of these images often lack diagnostic value and may introduce irrelevant information in subsequent image processing tasks. Considering the limitations of GPU memory and computational time in the experimental environment, we performed cropping and resizing operations during data preprocessing. The original ultrasound images were standardized to resolutions of 128 × 128 and 256 × 256 pixels, retaining only the key regions containing the ovaries. This preprocessing step not only reduces interference from irrelevant information and enhances the accuracy of model training and denoising, but also meets the computational and input size requirements of the denoising tasks in this study.

To further improve the model’s generalization ability and enhance data diversity, data augmentation techniques were applied, including brightness enhancement, brightness suppression, contrast adjustment, image rotation, and flipping. Ultimately, a curated PCOS ultrasound image dataset was constructed for this study. As shown in Figure 3, both the original and processed PCOS ultrasound images are presented.
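The augmentation operations listed above can be sketched on a small grayscale array as follows. The function names and the specific factors are hypothetical; the paper does not report the rotation angles or the brightness and contrast ranges that were used.

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]

def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def adjust_brightness(img, factor):
    """Scale intensities; factor > 1 brightens, factor < 1 darkens."""
    return [[min(1.0, max(0.0, v * factor)) for v in row] for row in img]

def adjust_contrast(img, factor):
    """Stretch (factor > 1) or compress (factor < 1) intensities around the mean."""
    mean = sum(sum(row) for row in img) / (len(img) * len(img[0]))
    return [
        [min(1.0, max(0.0, mean + factor * (v - mean))) for v in row]
        for row in img
    ]

image = [[0.1, 0.2], [0.3, 0.4]]
augmented = [
    flip_horizontal(image),
    rotate_90(image),
    adjust_brightness(image, 1.2),  # brightness enhancement (assumed factor)
    adjust_brightness(image, 0.8),  # brightness suppression (assumed factor)
    adjust_contrast(image, 1.5),    # contrast adjustment (assumed factor)
]
```

Each transform returns a new array of the same size, so the augmented copies can be added to the training set alongside the originals.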

The experiments in this study were conducted on both Google Colab and a dedicated server environment. The models were primarily implemented using the PyTorch 1.13.1 framework, and the denoising diffusion probabilistic model (DDPM) was constructed based on the Hugging Face Diffusers library. During training, the batch size was set to 20, with an initial learning rate of 0.0002. The Adam optimizer was employed, and the noise scheduling parameters (β₁, β₂) of the diffusion model followed a cosine schedule. The training process was run for 300 epochs, and the total number of diffusion steps was set to 1000.

To evaluate the performance of our denoising model, we adopted two common image quality assessment metrics: peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) [25]. PSNR is one of the most widely used metrics for lossy transformation tasks such as image denoising, compression, and deblurring. The reconstruction error is typically measured by the mean squared error (MSE). Given a ground-truth noise-free image I_{origin} and a denoised image I_{denoise}, the MSE is defined as:

MSE = \frac{1}{N M} \sum_{i = 1}^{N} \sum_{j = 1}^{M} (I_{origin} (i, j) - I_{denoise} (i, j))^{2}

(9)

where N and M denote the width and height of the image, respectively. Accordingly, the PSNR for I_{denoise} is defined as:

PSNR = 10 \times \log_{10} \left( \frac{MAX_{I}^{2}}{MSE} \right)

(10)

where MAX_{I} is the maximum possible pixel value of the image (255 for 8-bit images).
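Equations (9) and (10) translate directly into code. The following sketch assumes 8-bit images stored as nested lists, so the peak value defaults to 255; the function names are illustrative.

```python
import math

def mse(reference, estimate):
    """Eq. (9): mean squared error between two equally sized 2-D images."""
    n_pixels = len(reference) * len(reference[0])
    total = sum(
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    )
    return total / n_pixels

def psnr(reference, estimate, max_value=255.0):
    """Eq. (10): peak signal-to-noise ratio in dB (infinite for identical images)."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

reference = [[0, 0], [0, 0]]
estimate = [[1, 1], [1, 1]]
error = mse(reference, estimate)
score = psnr(reference, estimate)
```

Because PSNR is a logarithmic function of the MSE, halving the MSE raises the PSNR by about 3 dB, which helps when reading differences between methods in the results tables.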

The structural similarity index measure (SSIM) evaluates the similarity between two images from the perspectives of contrast, luminance, and structural information. In this study, SSIM is used to compare the similarity between the denoised and the original images; a higher SSIM value indicates greater similarity. Table 1 summarizes the PSNR and SSIM scores obtained by different denoising methods, all of which are implemented using publicly available code. Figure 4 presents the qualitative comparison of denoising results on PCOS ultrasound images produced by various models, with the image resolution set to 128 × 128 pixels. The top row of the figure displays the original images, noisy images, and the denoised results generated by bilateral filtering, wavelet [26], DnCNN [27], FFDNet [28], DDPM, and the proposed improved denoising diffusion probabilistic model. The bottom row shows enlarged views of the regions highlighted by white boxes, allowing for a more detailed visual comparison.

As illustrated in Figure 4 and Table 1, bilateral filtering, as an early traditional denoising technique, leaves considerable granular noise artifacts in the output images, which negatively affect the preservation of anatomical structures. Although wavelet-based denoising effectively suppresses some noise, it causes significant loss of texture details, making it unsuitable for maintaining the pathological invariance of ultrasound images.

In contrast, the three deep learning-based methods demonstrate superior denoising performance on raw ultrasound images without physical correction, enhancing the overall visual quality. DnCNN establishes a mapping between speckle-free and speckle-noisy images via deep learning, thereby achieving notable denoising results. While DnCNN retains more original texture details, noticeable speckle noise remains in the output. FFDNet further reduces speckle noise compared to DnCNN, but the resulting images exhibit over-smoothed textures, leading to weakened or even lost structural details.

The DDPM-based denoising model preserves the overall image contours and restores fine details more effectively than traditional methods. Although the diffusion model is not specifically designed for ultrasound denoising, it outperforms GAN/CNN-based methods when dealing with complex mixed noise (multiplicative + additive). However, its visual quality is still not optimal compared to our proposed approach. Our improved denoising diffusion probabilistic model demonstrates a clear advantage in retaining the original texture details of the images, yielding more uniform texture distribution and the highest similarity to the ground truth images.

To further evaluate the effectiveness of various deep learning-based denoising methods, we compared the heatmaps of ultrasound images denoised by different approaches, as shown in Figure 5. From the heatmaps, it can be observed that the proposed denoising diffusion probabilistic model (DDPM) effectively preserves the edges and texture structures of the original noisy images after denoising. Moreover, it achieves smoother texture transitions and avoids jagged structural distortions, resulting in superior visual consistency. In contrast, it is evident that images processed by DnCNN still exhibit noticeable noise in the magnified region of the third row in Figure 5. Both FFDNet and DDPM reduce noise to some extent; however, FFDNet often produces overly smoothed textures and may leave residual noise. Our method not only suppresses noise but also retains more image details. Overall, our approach outperforms other deep learning-based methods in terms of texture detail preservation, structural smoothness, and visual quality.

To further investigate the denoising performance of our model compared with the original DDPM, we analyzed the denoising effects and training loss functions at different training epochs, as depicted in Figure 6, using high-resolution images of 256 × 256 to observe texture details during training. As shown in Figure 6, after 50 training epochs, both the DDPM and our model produce images with relatively low resolution and some blurriness, indicating suboptimal denoising performance. This is mainly because the denoising network is not yet fully trained and can only provide a rough prediction of random noise, resulting in blurred texture details and synthetic artifacts. Nevertheless, our method shows a significant improvement over the original model. After 100 epochs, the image quality improves but still does not reach a satisfactory level. In experiments with 180, 250, and 300 epochs, the denoised images become visually more pleasing, but the preservation of texture details varies. It is evident that, in the original model, the white area at the bottom appears overly smoothed, while our method retains more natural texture. When the number of training epochs exceeds 250, the denoising quality tends to stabilize.

As shown in Figure 7, we conducted a comparative evaluation on the polycystic ovary ultrasound images dataset (all images resized to 256 × 256 pixels). The two rows show degraded inputs under two conditions: additive noise with signal-to-noise ratio (SNR) < 10 dB (top), and additive–speckle mixed noise (bottom). For each case, columns from left to right display the original image, the noisy input (“Add noise”/“Add noise & speckle noise”), and reconstructions produced by CTformer [29], Restormer [30], DDPM, and our method (Ours). Across both noise settings, our approach better preserves follicular boundary sharpness, suppresses background interference, and restores complete cyst contours, thereby improving the visibility of small follicles.

Meanwhile, the quantitative evaluation metrics in Table 2 further support the effectiveness of the proposed method, which improves on the original DDPM in PSNR and FID under both noise settings. Restormer attains the best scores in Table 2 and the most refined visual results. Our method achieves performance comparable to Restormer in both overall contour preservation and detail restoration. Lastly, we validated denoising performance at 512 × 512 resolution. The results in Figure 8 show that our method clearly surpasses the original DDPM: the denoised image is smoothed while preserving its overall textural characteristics.

To verify the effectiveness of the proposed method and to assess the stability and generalization performance of the model, we conducted ablation experiments on the improved components of the network. We recorded the PSNR every 10 epochs during training and analyzed the PSNR curves for the original DDPM, the model using only cosine scheduling, the model using only the smooth L1 loss, and our full method. As shown in the left panel of Figure 9, our approach achieves improved denoising performance. Additionally, we compared the training loss curves of the DDPM with the standard L1 loss and our method with the smooth L1 loss. As illustrated in the right panel of Figure 9, the smooth L1 loss starts from a lower initial loss and converges significantly faster. Moreover, the smooth L1 loss is less sensitive to outliers and better controls gradient changes during training, resulting in a smoother loss curve.
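The two quantities tracked in this ablation, the loss and the PSNR, can be sketched in a few lines. The smooth L1 form below is the widely used Huber-type definition (quadratic below a threshold `beta`, linear above it); the value `beta=1.0` and the `data_range=1.0` normalization are assumptions for illustration, since the settings used in the experiments are not restated here.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error."""
    return float(np.mean(np.abs(pred - target)))

def smooth_l1_loss(pred, target, beta=1.0):
    """Quadratic for |error| < beta, linear beyond it (Huber-type smooth L1)."""
    diff = np.abs(pred - target)
    per_elem = np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return float(np.mean(per_elem))

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Two small errors and one outlier in the predicted noise.
target = np.zeros(3)
pred = np.array([0.1, -0.2, 3.0])

print(l1_loss(pred, target))         # mean of 0.1, 0.2, 3.0
print(smooth_l1_loss(pred, target))  # mean of 0.005, 0.02, 2.5

# A uniform error of 0.1 on a [0, 1] image gives MSE = 0.01, i.e. 20 dB.
print(psnr(np.zeros((8, 8)), np.full((8, 8), 0.1)))
```

With this definition the smooth L1 value is never larger than the L1 value for the same errors, so part of the lower initial loss in Figure 9 follows from the definition itself; the gradient behavior is where the two differ in practice, since small errors produce proportionally small gradients instead of a constant-magnitude one.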

4. Discussion

We present a fully automated denoising model tailored for polycystic ovary syndrome (PCOS) ultrasound imaging. By integrating targeted image preprocessing with learned restoration, the method enhances follicle and cyst delineation. In controlled comparisons on low-quality ultrasound data corrupted with both multiplicative and additive noise (signal-to-noise ratio, SNR < 10 dB), the restored images are visibly crisper, highlight diagnostically salient structures, and permit discrimination of subtle differences between affected and unaffected cases.

We validated the proposed method on the Polycystic Ovary Ultrasound Images Dataset. The denoised images preserved key morphological features essential for clinical assessment—including follicle size distribution, circularity/ellipticity, ovarian volume, and stromal echogenicity—and markedly sharpened inter-follicular boundaries, enabling reliable delineation of complete follicular contours to support diagnosis. These high-quality reconstructions also provide a solid foundation for subsequent clinical analyses.

Failure modes persist in a minority of cases. We observe occasional over-smoothing that blunts fine cystic edges and suppresses speckles, yielding a plastic-like aspect; hallucinated lumen-like structures, striations, or regular textures; amplification of physical artefacts (e.g., reverberation, side-lobes, mirror, comet-tail); halo/ringing at small cyst boundaries; and, rarely, local contrast inversion or gray-level ordering disruptions. Even so, the method generally increases image contrast, reduces the risk of missing very small follicles, and maintains diagnostic utility under extremely low SNR conditions by automatically recovering perceptual clarity and mitigating misleading detections.

This work lays a foundation for downstream segmentation, classification, and computer-aided recognition on high-quality reconstructions. As a next step, we plan a prospective evaluation on 100–150 clinical cases spanning multiple scanners and patient subtypes, with targeted stress-testing of the identified failure scenarios to further assess clinical fitness and guide iterative refinement.

5. Conclusions

In this study, we proposed a novel denoising method for polycystic ovary syndrome (PCOS) ultrasound images based on an improved denoising diffusion probabilistic model (DDPM). The proposed approach introduces Gaussian noise to the original data distribution in a stepwise manner, controlled by a cosine noise schedule. During the reverse diffusion phase, a conditional noise predictor is incorporated and combined with prior information from the ultrasound images. Through an iterative reverse process, the model progressively denoises the images, ultimately restoring clear and high-quality ultrasound images. Experimental results demonstrate that our method achieves state-of-the-art denoising performance, effectively generating ultrasound images with high-quality textures and well-preserved edge details.
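For readers who want to see the forward process concretely, the sketch below builds a cosine noise schedule and draws a noised sample in closed form. It follows the commonly used cosine schedule of Nichol and Dhariwal; the offset `s=0.008` and the clipping value `max_beta=0.999` are that formulation's usual defaults and are assumptions here, not settings confirmed by this paper. The step count T = 1000 matches the timestep listed in Table 1.

```python
import numpy as np

def cosine_alpha_bar(T, s=0.008):
    """Target cumulative signal fraction alpha_bar_t for t = 0..T (cosine form)."""
    t = np.arange(T + 1)
    f = np.cos(((t / T) + s) / (1.0 + s) * np.pi / 2.0) ** 2
    return f / f[0]

def cosine_betas(T, s=0.008, max_beta=0.999):
    """Per-step noise variances beta_1..beta_T derived from alpha_bar, clipped for stability."""
    ab = cosine_alpha_bar(T, s)
    betas = 1.0 - ab[1:] / ab[:-1]
    return np.clip(betas, 0.0, max_beta)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I); also return the noise."""
    eps = rng.normal(size=x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

T = 1000
betas = cosine_betas(T)
# Recompute alpha_bar from the clipped betas; index 0 means "no noise added yet".
alpha_bar = np.concatenate([[1.0], np.cumprod(1.0 - betas)])

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(64, 64))  # hypothetical normalized image
x_mid, eps_mid = q_sample(x0, T // 2, alpha_bar, rng)
x_end, _ = q_sample(x0, T, alpha_bar, rng)

print(alpha_bar[T // 2], alpha_bar[T])
```

The noise `eps` returned by `q_sample` is the regression target for the noise predictor during training. In a conditional setup such as the one described above, the degraded ultrasound image would be supplied to the predictor alongside `x_t`; that wiring is omitted from this sketch.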

However, this study has certain limitations. Our method still shows a performance gap relative to current state-of-the-art DDPM networks for image generation. Specifically, the model consumes a large amount of GPU memory during both training and inference, and its inference is relatively slow. Additionally, there is still room for improvement in detail restoration when denoising low-resolution images. Future research may focus on local structural features within the images, such as corner points, edge intersections, and regions with abrupt grayscale changes, to preserve structural consistency more effectively during denoising. This would further enhance the generalization ability and denoising performance of the model.

Author Contributions

Conceptualization, J.P.; methodology, J.P.; software, J.P. and Z.G.; validation, J.P., X.C. and Z.G.; formal analysis, J.P., X.C. and Z.G.; investigation, J.P., X.C., M.Z. and Z.G.; data curation, J.P., M.Z. and Z.G.; writing—original draft preparation, J.P. and Z.G.; writing— review and editing, J.P. and Z.G.; visualization, J.P., X.C. and Z.G.; project administration, M.Z.; funding acquisition, J.P. and M.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Jiaxing Municipal Industrial Development Tackling Project under Grant 2025AC023.

Data Availability Statement

The PCOS Ultrasound Images dataset used in this study is publicly available on Kaggle (https://www.kaggle.com/datasets/anaghachoudhari/pcos-detection-using-ultrasound-images, accessed on 20 May 2025). The validated dataset is publicly accessible through the Telkom University Dataverse.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Azziz, R.; Carmina, E.; Chen, Z.; Dunaif, A.; Laven, J.S.E.; Legro, R.S.; Lizneva, D.; Natterson-Horowitz, B.; Teede, H.J. Polycystic ovary syndrome. Nat. Rev. Dis. Primers 2016, 2, 16057.
2. Goodarzi, M.O.; Dumesic, D.A.; Chazenbalk, G.; Azziz, R. Polycystic ovary syndrome: Etiology, pathogenesis and diagnosis. Nat. Rev. Endocrinol. 2011, 7, 219–231.
3. Noble, J.A.; Boukerroui, D. Ultrasound image segmentation: A survey. IEEE Trans. Med. Imaging 2006, 25, 987–1010.
4. Tomasi, C.; Manduchi, R. Bilateral filtering for gray and color images. In Proceedings of the Sixth International Conference on Computer Vision (IEEE Cat. No. 98CH36271), Bombay, India, 7 January 1998; pp. 839–846.
5. Buades, A.; Coll, B.; Morel, J.M. A non-local algorithm for image denoising. In Proceedings of the 2005 IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–25 June 2005.
6. Deledalle, C.A.; Duval, V.; Salmon, J. Non-local methods with shape-adaptive patches. J. Math. Imaging Vis. 2012, 43, 103–120.
7. Coupe, P.; Hellier, P.; Kervrann, C.; Barillot, C. Nonlocal means-based speckle filtering for ultrasound images. IEEE Trans. Image Process. 2009, 18, 2221–2229.
8. Dabov, K.; Foi, A.; Katkovnik, V.; Egiazarian, K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 2007, 16, 2080–2095.
9. Jifara, W.; Jiang, F.; Rho, S.; Cheng, M.; Liu, S. Medical image denoising using convolutional neural network: A residual learning approach. J. Supercomput. 2019, 75, 704–718.
10. Yang, Q.; Yan, P.; Zhang, Y.; Yu, H.; Shi, Y.; Mou, X.; Kalra, M.K.; Zhang, Y.; Sun, L.; Wang, G. Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss. IEEE Trans. Med. Imaging 2018, 37, 1348–1357.
11. Zhang, L.; Zhang, J. Ultrasound image denoising using generative adversarial networks with residual dense connectivity and weighted joint loss. PeerJ Comput. Sci. 2022, 8, e873.
12. Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 2020, 33, 6840–6851.
13. Dhariwal, P.; Nichol, A. Diffusion models beat GANs on image synthesis. Adv. Neural Inf. Process. Syst. 2021, 34, 8780–8794.
14. Peng, J.; Chen, G.; Saruta, K.; Terata, Y. 2D brain MRI image synthesis based on lightweight denoising diffusion probabilistic model. Med. Imaging Process Technol. 2023, 7, 2518.
15. Krishna, A.; Wang, G.; Mueller, K. Multi-Conditioned Denoising Diffusion Probabilistic Model (mDDPM) for Medical Image Synthesis. arXiv 2024, arXiv:2409.04670.
16. Jiang, Y.; Lemaréchal, Y.; Bafaro, J.; Abi-Rjeile, J.; Joubert, P.; Després, P.; Manem, V. Lung-ddpm: Semantic layout-guided diffusion models for thoracic ct image synthesis. arXiv 2025, arXiv:2502.15204.
17. Li, Q.; Li, C.; Yan, C.; Li, X.; Li, H.; Zhang, T.; Song, H.; Schaffert, R.; Yu, W.; Fan, Y.; et al. Ultra-low dose CT image denoising based on conditional denoising diffusion probabilistic model. In Proceedings of the 2022 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Suzhou, China, 14–16 October 2022; IEEE: Piscataway, NJ, USA, 2022.
18. Michailovich, O.V.; Tannenbaum, A. Despeckling of medical ultrasound images. IEEE Trans. Ultrason. Ferroelectr. Freq. Control. 2006, 53, 64–78.
19. Sohl-Dickstein, J.; Weiss, E.; Maheswaranathan, N.; Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. In Proceedings of the International Conference on Machine Learning PMLR, Lille, France, 6–11 July 2015; pp. 2256–2265.
20. Saharia, C.; Chan, W.; Chang, H.; Lee, C.; Ho, J.; Salimans, T.; Fleet, D.; Norouzi, M. Palette: Image-to-image diffusion models. In Proceedings of the ACM SIGGRAPH 2022 Conference Proceedings, Vancouver, BC, Canada, 7–11 August 2022; pp. 1–10.
21. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241.
22. Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. In Proceedings of the 6th International Conference on Learning Representations (ICLR 2018), Vancouver, BC, Canada, 30 April–3 May 2018; pp. 1–26.
23. Choudhari, A. PCOS Detection Using Ultrasound Images. Available online: https://www.kaggle.com/datasets/anaghachoudhari/pcos-detection-using-ultrasound-images (accessed on 20 May 2025).
24. Wisesty, U.N.; Thufailah, I.F.; Dewi, R.M.; Adiwijaya, J.; Jondri. Study of Segmentation Technique and Stereology to Detect PCO Follicles on USG Images. J. Comput. Sci. 2018, 14, 351–359.
25. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–618.
26. Zhong, S.; Cherkassky, V. Image denoising using wavelet thresholding and model selection. In Proceedings of the 2000 International Conference on Image Processing (Cat. No. 00CH37101), 10–13 September 2000; IEEE: Vancouver, BC, Canada, 2000; Volume 3, pp. 262–265.
27. Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155.
28. Zhang, K.; Zuo, W.; Zhang, L. FFDNet: Toward a fast and flexible solution for CNN-based image denoising. IEEE Trans. Image Process. 2018, 27, 4608–4622.
29. Wang, D.; Fan, F.; Wu, Z.; Liu, R.; Wang, F.; Yu, H. CTformer: Convolution-free Token2Token dilated vision transformer for low-dose CT denoising. Phys. Med. Biol. 2023, 68, 065012.
30. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.-H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5728–5739.

Figure 1. Diagram of the DDPM-based ultrasound image denoising network model. The figure on the right shows the detailed architecture of the U-net ε_θ.

Figure 2. Effect diagrams of the forward process of adding noise using different methods.

Figure 3. The original and processed PCOS ultrasound images from the PCOS ultrasound image dataset.

Figure 4. Comparison of denoising results for PCOS ultrasound images using different models (image resolution: 128 × 128). From left to right, the images are: the original image, the noisy image, bilateral filtering, wavelet, DnCNN, FFDNet, DDPM, and finally our proposed method.

Figure 5. Comparison of heatmaps for ultrasound images processed by different denoising models.

Figure 6. Comparison of denoising results between DDPM and our model at different training epochs, with the image resolution set to 256 × 256.

Figure 7. Comparative denoising results on the Polycystic Ovary Ultrasound Images Dataset, including our approach.

Figure 8. Comparison of denoising results between DDPM and our model, with the image resolution set to 512 × 512.

Figure 9. Comparison of training PSNR for different improved components of the network model, and comparison of training losses between DDPM with L1 loss and our method with smooth L1 loss.

Table 1. Evaluation metrics for ultrasound image denoising using different models. Significance is measured relative to the baseline (DnCNN). *: p < 0.05, **: p < 0.01.

Methods	Timestep	PSNR (dB)	SSIM	FID	Inference Time	Significance
Original image	/	∞	1	0	-	-
Bilateral filtering	/	18.56	0.4068	296.1803	-	-
Wavelet	/	23.29	0.6550	274.0399	-	-
DnCNN	/	25.31	0.7289	177.7236	1 s	Baseline
FFDNet	/	26.47	0.7935	122.2012	1–2 s	*
DDPM	1000	28.05	0.8254	90.9101	31 s	**
Ours	1000	29.62	0.8612	55.3037	18 s	**

Table 2. Evaluation metrics for ultrasound image denoising across models under additive and speckle noise.

Methods	PSNR (dB), additive noise	PSNR (dB), speckle noise	SSIM, additive noise	SSIM, speckle noise	FID, additive noise	FID, speckle noise	LPIPS, additive noise	LPIPS, speckle noise	Parameters
CTformer	26.71	23.15	0.692	0.598	209.13	254.28	0.347	0.413	1.92 M
Restormer	30.45	24.17	0.785	0.719	131.63	186.44	0.168	0.232	26.3 M
DDPM	27.78	22.39	0.728	0.646	209.51	234.98	0.235	0.307	53.2 M
Ours	28.74	23.82	0.716	0.668	171.80	203.16	0.246	0.271	37.9 M

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
