Ground-Truth-Free 3D Seismic Denoising Based on Diffusion Models: Achieving Effective Constraints Through Embedded Self-Supervised Noise Modeling

Zhang, Zhonghan; Qin, Guihe; Liang, Yanhua; Sun, Minghui; Wang, Yingqing; Song, Jiaru

doi:10.3390/rs17061061


by Zhonghan Zhang ^1,2, Guihe Qin ^1,2,*, Yanhua Liang ^1,2, Minghui Sun ^1,2, Yingqing Wang ^1,2 and Jiaru Song ^1,2

¹ College of Computer Science and Technology, Jilin University, Changchun 130012, China

² Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(6), 1061; https://doi.org/10.3390/rs17061061

Submission received: 27 January 2025 / Revised: 9 March 2025 / Accepted: 14 March 2025 / Published: 17 March 2025

(This article belongs to the Special Issue New Technologies, Methods and Studies for Seismic and Radar Subsurface Exploration)


Abstract

Three-dimensional (3D) seismic data, essential for revealing subsurface structures and exploring oil and gas resources, are often contaminated by noise with an unknown prior distribution. Existing denoising research faces great challenges due to the scarcity of ground truth and the difficulty in obtaining prior knowledge of noise distributions. Moreover, few algorithms are specifically designed to leverage the unique spatial structural information inherent in 3D seismic data, leading to inefficient utilization of this valuable information during denoising. To address these issues, we propose Self-Supervised Seismic Denoising using the Denoising Diffusion Probabilistic Model (SSDn-DDPM), an algorithm specifically tailored for 3D seismic data that utilizes diffusion generative models for self-supervised blind denoising. The algorithm begins with self-supervised modeling of seismic noise to estimate its distribution. Subsequently, spatial structural information of 3D seismic data is leveraged to improve the accuracy of noise distribution estimation. Furthermore, the algorithm integrates the noise distribution estimation network into the diffusion model to further guide and refine the sampling process, thereby optimizing computational complexity and improving detail representation. Finally, it performs self-supervised 3D seismic noise suppression using the diffusion probabilistic model. In the experimental section, we comprehensively compare the proposed algorithm with six different types of seismic denoising methods. Various comparative experiments demonstrate that the proposed algorithm achieves exceptional denoising performance on 3D seismic data, even without ground truth or any prior knowledge about the noise distribution.

Keywords: self-supervised learning; active seismic surveys; diffusion model; 3D seismic processing; noise suppression; seismic noise modeling

1. Introduction

Seismic noise suppression is a classic problem in seismic data processing, and high Signal-to-Noise-Ratio (SNR) seismic data are crucial for enhancing the quality of subsequent processing. Although many studies have proposed various effective methods for improving the SNR in seismic data, the advancement of exploration into deeper formations has resulted in increased complexity and diversity in deep subsurface seismic data. Therefore, it is necessary to constantly search for more effective seismic noise suppression methods. In this paper, we explore a novel algorithm for 3D seismic noise suppression that differs significantly from previous algorithms in implementation and greatly improves noise suppression effectiveness.

According to the mathematical characteristics of seismic signals, various traditional seismic data denoising methods have been developed. These classical methods can be broadly categorized into filtering-based [1,2,3,4,5,6,7,8,9], sparse transform-based [10,11,12,13,14,15,16,17,18,19,20,21], and rank-reduction-based approaches [22,23,24,25]. Amid the rapid advancement in deep learning technology, deep learning-based methods have gained significant traction in seismic denoising. Later in this paper, we compare the proposed method with a representative algorithm from each of the three traditional categories. For deep learning methods, we also compare it with several representative algorithms that cover different learning tasks and algorithm frameworks.

Filtering-based seismic denoising methods exploit the separability of signals and noise in the time or transform domain by designing appropriate filters and selecting suitable parameters to separate signals and noise. Typical filtering-based seismic denoising methods include median filtering [1,2], time-frequency peak filtering [3], bandpass filtering [4], nonlocal means [5], and prediction filtering [6]. Generally, filtering-based denoising methods offer good interpretability and can handle various noise distributions by selecting appropriate filters. However, due to variations in the ability of filters to separate noise, these methods have limitations when dealing with complex seismic noise. Prediction filtering approaches rely on the continuity assumption of seismic reflection signals, which posits that seismic signals exhibit self-correlation. These approaches can preserve the original features of seismic signals, especially reflection information. Notably, the noise estimation component of our proposed method shares underlying assumptions with these approaches. In the case of median filtering methods, Structure-Oriented Space-Varying Median Filtering (SOSVMF) [2] is a noteworthy approach that incorporates a sparsity constraint during the iterative process to better preserve signal energy and enhance amplitude preservation.

Decomposition-based seismic denoising methods are considered a generalized form of filtering that achieves noise suppression through adaptive signal decomposition. Classical mode decomposition algorithms mainly include empirical mode decomposition [7,8] and variational mode decomposition [9]. These methods generally exhibit good robustness while effectively preserving reflection information within signals. However, they also face challenges such as high computational complexity, limited generalizability, and sensitivity to parameter selection.

Sparse transform-based seismic denoising methods are based on compressive sensing theory [26]. Classical methods include the Fourier transform [10], Radon transform [11], wavelet transform [12], seislet transform [13], curvelet transform [14,15], contourlet transform [16], and shearlet transform [17]. The basis functions used to represent signals are typically limited in number, which introduces errors. To achieve higher denoising accuracy, sparse dictionary learning has been applied to seismic denoising. One of the most prominent methods is the K-Singular Value Decomposition (K-SVD) [18,19], which iteratively updates the dictionary and sparse coefficients to minimize reconstruction error. K-SVD employs singular value decomposition for dictionary construction, resulting in high computational costs and unsuitability for high-dimensional data. The Sequential Generalized K-means (SGK) [20,21] algorithm achieves performance comparable to K-SVD while significantly reducing computational complexity, making it more suitable for high-dimensional seismic data.

The rank-reduction-based approach posits that seismic signals are low-rank and that random noise increases the rank of the data matrix. Multichannel Singular Spectrum Analysis (MSSA) [22,23] is a classical rank-reduction method that reduces the rank of a constructed Hankel matrix to remove noise from seismic data. Damped Rank Reduction (DRR) [24,25] focuses on optimizing the rank reduction of an MSSA-transformed Hankel matrix, thereby improving denoising performance.

With the development of deep learning algorithms, numerous network models have been applied to seismic denoising tasks. Some researchers have introduced denoising convolutional neural network (DnCNN) [27] models for seismic denoising [28,29,30,31,32]. Some scholars have integrated convolutional neural networks with transformers, leveraging transformers to extract data features more effectively and achieve impressive results in seismic noise suppression [33,34,35,36,37]. Generative Adversarial Networks (GANs) [38], as excellent generative models, have also been employed for seismic denoising [39]. SeisGAN [40], tailored to the characteristics of seismic data, implements a specialized processing workflow for supervised seismic data denoising. The diffusion model [41,42] is an excellent generative model developed in recent years and has been applied in various fields [43,44,45]. Seismologists have also attempted to apply diffusion models to address seismic denoising problems [46,47]. However, these methods typically suffer from high computational complexity. The Principal Component Analysis Denoising Diffusion Probabilistic Model (PCADDPM) [48] utilizes Principal Component Analysis (PCA) to reduce computational complexity, achieving supervised denoising for 2D seismic sections. However, when PCA processes low-SNR data, saturation of the norm factor may occur, hindering effective differentiation across varying noise intensities.

Many seismic denoising algorithms originating from natural image processing are supervised and require noiseless data. However, obtaining noiseless ground-truth seismic data is much more challenging than in natural image processing. Some researchers have therefore focused on developing unsupervised noise suppression methods. Some methods [49,50] utilize autoencoders to learn sparse features of seismic signals through encoding and decoding processes. Others [51,52,53] utilize the statistically based Noise2Noise (N2N) models [54,55,56,57,58], which assume high coherence in seismic signals while treating random noise as statistically independent and unpredictable. Still others [59,60] apply Deep Image Prior (DIP) [61], processing noisy seismic data directly through randomly initialized networks and achieving excellent denoising results.

This paper proposes Self-Supervised Seismic Denoising using the Denoising Diffusion Probabilistic Model (SSDn-DDPM) to address seismic denoising problems. To enhance noise suppression, improve detail representation, and address the absence of ground truth in seismic denoising tasks, the algorithm employs a self-supervised noise estimation model to predict the noise distribution, thereby optimizing the diffusion model. Additionally, to make more effective use of the internal features of seismic data, the algorithm is explicitly optimized for high-dimensional samples, enabling it to leverage structural similarities across all dimensions of 3D seismic data, thereby achieving more effective noise suppression. To comprehensively illustrate the characteristics and advantages of the proposed algorithm, the experimental section includes comparative experiments with representative algorithms selected from each of the aforementioned categories: SOSVMF [2] representing filtering-based methods, SGK [20] representing sparse dictionary learning methods, and DRR [24,25] representing rank-reduction approaches. In the deep learning domain, SeisGAN [40] is chosen as the representative supervised method, DIP [60] as the unsupervised representative, and PCADDPM [48] as the representative of diffusion models. Comprehensive experiments demonstrate that the proposed method offers a significant advantage in suppressing noise and preserving structural details in 3D seismic data.

This article presents the following contributions based on the concepts outlined above:

1. Considering the difficulty of obtaining noiseless data from seismic sensors and the need for high-quality training data to constrain diffusion models, we design a self-supervised 3D seismic noise estimation model to predict the noise distribution, which is embedded into the diffusion model to further guide and improve the sampling process, reducing computational complexity while enhancing detail representation.
2. Based on the spatial structural characteristics of 3D seismic data, we propose a specialized sampling method that effectively enhances the internal coherence of the data signals. This effective utilization of the spatial information in 3D seismic data allows statistics-based denoising algorithms to more accurately estimate the seismic noise distribution, thereby improving the denoising performance.
3. We implement self-supervised blind denoising of 3D seismic data based on a diffusion generative model. The algorithm only needs to observe individual 3D seismic data without prior knowledge, parameter tuning, or ground truth to achieve denoising while effectively recovering subsurface structural details.

2. Methodology

2.1. Self-Supervised 3D Seismic Noise Modeling

This subsection introduces a self-supervised method for predicting 3D seismic noise distribution. The method does not rely on noiseless seismic data nor does it require prior knowledge of noise characteristics for training. A fully trained model estimates the noise component within the seismic data, enabling noise modeling and supporting subsequent denoising processes.

Generally, seismic data contaminated by random noise can be described by the following process:

x = c + n,

(1)

where x denotes the noisy seismic data, and c and n represent the clean seismic signal and random noise, respectively. The goal of the method presented in this subsection is to estimate the random noise n by solving for c from x. We represent the proposed method as a function $P (\cdot)$ and employ the $L_{2}$ loss, defined as $L (P (x), x) = {(P (x) - x)}^{2}$, to measure the prediction error. The problem to be solved can be formulated as the following optimization process:

\underset{θ}{arg min} L (P_{θ}) = E_{c, x} {∥P_{θ} (x) - c∥}^{2} .

(2)

Compared to image processing, a particularly challenging problem in seismic noise suppression is the difficulty of obtaining noiseless data. In image processing, due to the ease of image acquisition, a theoretically noiseless labeled image c can be obtained through multi-image stacking. However, using this approach to obtain c in real seismic data processing is difficult. The most straightforward way to overcome this problem is to use only the observed noisy data x for denoising training, which involves solving the following optimization problem:

\underset{θ}{arg min} L (P_{θ}) = E_{x} {∥P_{θ} (x) - x∥}^{2} .

(3)

Consider the difference in $L$ between Equations (2) and (3) as follows:

\begin{matrix} E_{x} {∥P (x) - x∥}^{2} = E_{x, c} {∥P (x) - c∥}^{2} + E_{x, c} {∥c - x∥}^{2} \\ + 2 E_{x, c} [(P (x) - c) (c - x)], \end{matrix}

(4)
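Equation (4) is an algebraic identity, so it can be checked numerically. The following is a minimal sketch under stated assumptions: the sinusoidal signal, the noise level, and the predictor `0.8 * x` are hypothetical choices made only for illustration and do not come from this paper. It shows that a predictor that reads x directly generally leaves a non-zero cross term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical clean signal c and zero-mean noise n, so that x = c + n (Equation (1)).
c = np.sin(np.linspace(0.0, 8.0 * np.pi, 20000))
n = 0.5 * rng.standard_normal(c.shape)
x = c + n

def decompose(p_of_x):
    """Return the left-hand side of Equation (4) and its three right-hand terms."""
    lhs = np.mean((p_of_x - x) ** 2)                    # loss against the noisy target
    supervised = np.mean((p_of_x - c) ** 2)             # first term: supervised loss
    noise_var = np.mean((c - x) ** 2)                   # second term: noise variance
    cross = 2.0 * np.mean((p_of_x - c) * (c - x))       # third term: cross term
    return lhs, supervised, noise_var, cross

# Illustrative predictor that depends on x itself: a shrunken copy of the input.
lhs, supervised, noise_var, cross = decompose(0.8 * x)
print(lhs, supervised + noise_var + cross, cross)
```

Because this predictor passes part of the noise in x straight to its output, the cross term is clearly negative here, which is why training against x alone is not equivalent to supervised training without a further constraint.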

where the first term matches the supervised loss in Equation (2) and the second term represents the noise variance. Here, we focus on the third term:

E_{c} [E_{x | c} [(P (x) - c) (c - x)]] .

(5)

Without further constraints, the inner expectation over x cannot be factored into a product of expectations, because $P (x)$ and x are generally correlated. Inspired by the concept of J-invariance [56], we design a specialized sampling strategy. As shown in Figure 1c, for the entire sampling block, consider $J = \{v_{1}, \dots, v_{n}\}$. During training, a specific $v_{j} \in J$ is selected for slicing as the target, and the input volume is chosen according to the following equation:

[v_{1}^{e}, v_{2}^{e}, \dots, v_{m}^{e}] = \underset{v_{i}^{e} \in J ∖ v_{j}}{arg min} \sum_{i = 1}^{n} O (v_{i}^{e}, v_{j}),

(6)
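The selection rule in Equation (6) can be sketched as a nearest-neighbor search that never returns the target itself. The function name `select_input_volumes` and the array layout (volumes stacked along the first axis) are assumptions made for this illustration; the paper does not specify an implementation.

```python
import numpy as np

def select_input_volumes(block, j, m):
    """Pick the m volumes closest to block[j] in Euclidean distance, excluding j.

    block: array of shape (n, ...) holding the n volumes of one sampling block.
    Returns the sorted indices of the selected volumes.
    """
    n = block.shape[0]
    flat = block.reshape(n, -1)
    dist = np.linalg.norm(flat - flat[j], axis=1)   # O(v_i, v_j) for every i
    dist[j] = np.inf                                # the target never enters its own input
    chosen = np.argsort(dist, kind="stable")[:m]    # m smallest distances
    return np.sort(chosen)

# Five constant 2x2 volumes with values 0, 1, 2, 3, 10; the target is index 2.
block = np.array([0.0, 1.0, 2.0, 3.0, 10.0])[:, None, None] * np.ones((5, 2, 2))
print(select_input_volumes(block, j=2, m=2))
```

Setting the distance of the target to infinity is one simple way to keep the input set disjoint from $v_{j}$, which is the property the J-invariance argument relies on.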

where $O (\cdot)$ denotes the Euclidean distance, m denotes the selectable sampling range, and e denotes partitions that are always disjoint from $v_{j}$. By using this method, the denoising function $P_{J} (x_{*, *, j^{e}})$, where $P_{J} : R^{p^{2} \times n} \to R^{p^{2} \times 1}$, satisfies the property that the output of $P_{J}$ in J does not depend on the input to $P_{J}$ in J. This design ensures that $P_{J} {(x)}_{j} | c$ and $x_{j} | c$ are mutually independent. Thus, Equation (5) can be written as

\sum_{j = 1}^{n} E_{c} (E_{x | c} [P_{J} {(x)}_{j} - c_{j}]) (E_{x | c} [c_{j} - x_{j}]) .

(7)

For random noise where $E [n] = 0$, we have $E [x | c] = c$. The value of Equation (7) is therefore 0, and the third term of Equation (4) vanishes. It can be seen that for a denoising model trained in this self-supervised manner, the noise in each volume is independent. According to Equation (4), the training will be equivalent, in expectation, to supervised training of $P$ plus a constant. This implies that optimizing $L$ through self-supervised training will also optimize the supervised loss with ground truth. The denoiser for $v_{j}$ only accesses volumes other than $v_{j}$. Since the noise in these volumes is random and unrelated to the noise predicted for $v_{j}$, the denoiser will learn to preserve the effective signal while suppressing random noise fluctuations. Thus, denoising becomes the optimization process:

\underset{θ}{arg min} L (P_{J, θ}) = E_{x} {∥P_{J, θ} (x_{*, *, j^{e}}) - x_{*, *, j}∥}^{2} .

(8)
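The claim that the cross term vanishes for a J-invariant predictor can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions that are not taken from the paper: all slices share exactly the same clean pattern, the noise is independent Gaussian noise, and the predictor is simply the mean of the other slices rather than a trained network.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical block: 9 slices sharing one clean pattern, each with independent noise.
n_slices, size, sigma = 9, 20000, 0.5
clean = np.sin(np.linspace(0.0, 6.0 * np.pi, size))
c = np.tile(clean, (n_slices, 1))
x = c + sigma * rng.standard_normal(c.shape)

j = 4
others = np.delete(np.arange(n_slices), j)
pred = x[others].mean(axis=0)                    # J-invariant: never reads x[j]

self_sup = np.mean((pred - x[j]) ** 2)           # Equation (8): noisy target
supervised = np.mean((pred - c[j]) ** 2)         # Equation (2): clean target
noise_var = np.mean((c[j] - x[j]) ** 2)          # second term of Equation (4)
cross = 2.0 * np.mean((pred - c[j]) * (c[j] - x[j]))
print(self_sup, supervised + noise_var, cross)
```

In this setting the self-supervised loss differs from the supervised loss only by the noise variance, up to sampling error, which matches the "supervised training plus a constant" statement above.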

The choice of denoiser is flexible, with U-Net, a more general architecture, serving as the base network. In Figure 1a, we illustrate the specific network structure, data flow format, and implementation details of our proposed method. As shown in the figure, the proposed model has a typical encoder–decoder structure. Skip connections [62] are employed to minimize information loss during section reconstruction. Residual connections are utilized within both the encoder and decoder layers to enhance feature extraction, improve the stability and efficiency of network training, and better capture complex features in 3D seismic data. The self-attention mechanism is incorporated into specific layers to enhance feature representation and improve the distinction between noise and signal, thereby improving denoising performance and preserving effective signals. The network input consists of multichannel seismic sections, where the number of channels is determined by the number of nearest-neighbor volumes selected. The network outputs a single-channel denoised section, representing the predicted central volume corresponding to the multichannel volume input. This network architecture effectively extracts complex features within seismic data and provides good noise prediction performance while maintaining computational efficiency.

We design a specialized sampling strategy to ensure that the training process satisfies J-invariance [56], thereby enabling self-supervised noise estimation based on the internal features of 3D seismic data. The entire seismic noise modeling process only requires noisy 3D seismic data, and the algorithm automatically extracts internal information from the noisy data and performs self-supervised training. This process does not rely on external clean data to provide additional supervision to the model.

The detailed algorithm implementation is described in pseudo-code in Algorithm 1. As can be seen, the noise prediction process only uses S, i.e., noisy seismic data, without additional noiseless data, allowing the algorithm to be trained in a self-supervised manner to obtain the corresponding noiseless data, thus enabling self-supervised noise prediction.

Algorithm 1 Self-supervised noise prediction

Require: S (3D noisy seismic data)

1: $v, d \leftarrow \mathrm{SerpentineSampling} (S)$ (introduced in Section 2.2)
2: for $j = 1$ to n step 1 do
3:   for $i = 1$ to m step 1 do
4:     $v_{j}^{e} \leftarrow [\underset{v_{i}^{e} \in J ∖ v_{j}}{arg min} \sum_{i = 1}^{n} O (v_{i}^{e}, v_{j})]$
5:   end for
6:   $x_{*, *, j}, x_{*, *, j^{e}} \leftarrow \mathrm{slice} (v_{j}, v_{j}^{e}, d)$ (slice by direction)
7:   $loss \leftarrow {∥P_{J, θ} (x_{*, *, j^{e}}) - x_{*, *, j}∥}^{2}$
8: end for
9: Train the denoiser P by minimizing the loss
10: $out \leftarrow \mathrm{RevertSerpentineSampling} (P (v))$
11: return out
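The training loop of Algorithm 1 can be mimicked end to end with a deliberately small stand-in. In the sketch below, an ordinary least-squares map replaces the U-Net denoiser, the two adjacent slices replace the distance-based selection of Equation (6), and the serpentine sampling steps are omitted; these are all simplifications assumed for illustration. The function name `train_and_denoise` is hypothetical.

```python
import numpy as np

def train_and_denoise(slices):
    """Fit one linear map from the two adjacent slices to the middle slice, then apply it.

    slices: array (n, h, w) of noisy slices, assumed to be already ordered by the sampler.
    Returns the predicted interior slices, shape (n - 2, h, w).
    """
    n, h, w = slices.shape
    flat = slices.reshape(n, -1)
    # Inputs never contain the target slice, mirroring x_{*,*,j^e} versus x_{*,*,j}.
    A = np.stack([flat[:-2].ravel(), flat[2:].ravel()], axis=1)
    b = flat[1:-1].ravel()                              # noisy targets only
    weights, *_ = np.linalg.lstsq(A, b, rcond=None)     # minimizes ||A w - b||^2
    return (A @ weights).reshape(n - 2, h, w)

rng = np.random.default_rng(2)
clean = np.sin(np.linspace(0.0, 4.0 * np.pi, 32))[:, None] * np.ones((1, 32))
noisy = clean[None] + 0.5 * rng.standard_normal((12, 32, 32))

denoised = train_and_denoise(noisy)
mse_in = np.mean((noisy[1:-1] - clean) ** 2)
mse_out = np.mean((denoised - clean) ** 2)
print(mse_in, mse_out)
```

No clean data enter the fit, yet the prediction error against the clean pattern drops below that of the noisy input in this toy case. The clean pattern is used only afterwards, to measure the result.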

It is important to highlight that the sampling and reverting operations in Steps 1 and 10 of Algorithm 1 ensure good structural similarity for subsequent processing of the 3D seismic volume while preserving the stochastic nature of the noise, which is crucial for self-supervised noise suppression. The sampling process is illustrated in Figure 1b. The sampling block is based on 3D serpentine sampling, which is elaborated in Section 2.2.

The fully trained network is used for preliminary noise prediction of 3D seismic data. The model only needs to observe a single 3D seismic volume to estimate its noise distribution. The diffusion model in Section 2.3 embeds and utilizes the self-supervised noise prediction model described in this subsection, with the prediction results guiding the diffusion model in the conditional sampling process.

2.2. Three-Dimensional Serpentine Sampling

Three-dimensional seismic exploration is a detailed investigative method for promising or complex geological structures. The acquired data contain spatial information, and the detected geological bodies exhibit continuity and integrity in three dimensions [63,64]. The proposed algorithm leverages the continuity of structural information to remove noise. The additional dimension of information aids the algorithm in fully understanding the data. Therefore, when processing 3D data, we do more than just perform 2D slice sampling. Instead, this study considers the structural similarity of the 3D structure in all three directions, preserving spatial information. At the same time, a specialized serpentine sampling mechanism enhances spatial correlation within the data, aiming to better utilize the continuity of geological bodies across all three dimensions.

Figure 1b gives a simplified visualization of the sampling process. Specifically, the sampler moves along a serpentine path through the three dimensions of the seismic data. When the data along an inline direction are thoroughly sampled, the current sampling point moves one sampling step along the xline direction to begin sampling the next inline. After completing data sampling within a plane, the current sampling point moves one step along the sample direction to continue sampling along the inline direction in the next plane. The sampled 3D data volume is then sequentially placed into a sampling block for subsequent processing.
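One way to realize such a path is a boustrophedon traversal in which the travel direction flips at the end of every line and every plane. The sketch below works on a grid of sampling positions. The function name `serpentine_order`, the axis order (sample, xline, inline), and the exact turning rule are assumptions for illustration, since the text describes the path only in words.

```python
import numpy as np

def serpentine_order(n_sample, n_xline, n_inline):
    """Return an (N, 3) array of (sample, xline, inline) positions along a serpentine path."""
    path = []
    xline_forward = True
    inline_forward = True
    for s in range(n_sample):
        xs = range(n_xline) if xline_forward else range(n_xline - 1, -1, -1)
        for x in xs:
            il = range(n_inline) if inline_forward else range(n_inline - 1, -1, -1)
            for i in il:
                path.append((s, x, i))
            inline_forward = not inline_forward   # turn around at the end of each line
        xline_forward = not xline_forward         # turn around at the end of each plane
    return np.array(path)

path = serpentine_order(3, 4, 5)
steps = np.abs(np.diff(path, axis=0)).sum(axis=1)
print(len(path), steps.max())
```

Every consecutive pair of positions differs by exactly one grid step, so neighbors in the resulting sequence are also neighbors in the volume. A plain raster scan does not have this property, because it jumps back to the start of the next line.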

Geological bodies exhibit continuity and integrity in three dimensions, and our sampling process ensures that the sampled data move along a continuous path within the 3D seismic data. Compared to simple linear scanning, this sampling method ensures that adjacent data within the sampling block have appropriate structural similarity and avoids abrupt structural changes caused by scanning. Sufficient structural similarity is crucial to the algorithm utilizing statistical theory for denoising.

During the training process, we dynamically select the appropriate slicing direction within the sampled data block, rather than fixing it beforehand. We categorize the slicing directions as either along the inline direction or the xline direction. As illustrated in Figure 1c, the slicing and sampling directions are consistently orthogonal to each other. This approach ensures statistical similarity and sufficient diversity within the data to facilitate effective training. In other words, the slicing process adheres to the principle of maintaining structural similarity while preserving the random noise distribution. Slicing along the same direction as the sampling direction could introduce trace misalignment errors, which should be avoided. Furthermore, if slicing is consistently performed along a specific direction during sampling, it can lead to cumulative errors that affect the algorithm’s accuracy.

Each set of seismic data obtained through this method exhibits structural similarity while maintaining independent noise distributions, which provides a statistical theoretical foundation for noise attenuation using J-invariance. More importantly, by performing 3D sampling of the seismic data followed by direction-based data selection within the sampling blocks, the sampling and selection processes consider the correlation of information across all three dimensions, thus preserving the 3D characteristics of the seismic data. Due to the continuity of 3D serpentine sampling, the algorithm also maintains the spatial features of the data, which is particularly crucial for 3D seismic data processing.

2.3. Self-Supervised Denoising Diffusion Model

This subsection introduces a self-supervised seismic noise suppression algorithm that uses the Denoising Diffusion Probabilistic Model (DDPM) [41], called SSDn-DDPM. Diffusion models generate data through a progressive reverse sampling process, typically starting from random noise. However, due to the significant distribution gap between random noise and real data, relying on an unconstrained reverse generation process may cause the generated results to deviate from the actual signal. This deviation may compromise signal fidelity in denoising tasks, making it difficult to recover the original data accurately and affecting the credibility of the denoising results. Therefore, introducing ground-truth data as a supervisory constraint can guide the reverse sampling process, suppressing random deviations and producing results closer to the ground-truth distribution, thereby significantly improving denoising performance.

However, a prominent challenge in seismic data is the significant lack of noiseless data as ground truth, making it challenging to provide effective supervision directly from actual seismic data. To address this issue, we utilize the noise prediction model trained in Section 2.1, constructing prior information that approximates the actual noise distribution. By embedding the noise distribution into the diffusion model, the noise-based generative constraint indirectly guides the reverse sampling process, encouraging the model to suppress noise components while preserving signal features during generation. This constraint, based on noise modeling, does not rely on noiseless data, enabling self-supervised seismic data denoising and effectively alleviating the dependence on ground-truth data in traditional supervised learning methods. An abstract illustration of the forward and reverse processes of SSDn-DDPM is shown in Figure 2. The entire algorithm does not require noiseless ground truth and can be trained by observing a single noisy 3D seismic volume.

The original data $x_{0}$ satisfy the initial distribution, i.e., $x_{0} \sim q (x_{0})$. The diffusion process can be modeled as a Markov chain, which is expressed as follows:

q (x_{1 : T} | x_{0}) = \prod_{t = 1}^{T} q (x_{t} | x_{t - 1}) .

(9)

For the latent variables $x_{1} \dots x_{T}$ in the Markov chain, the relationship between adjacent $x_{t}$ and $x_{t - 1}$ is given by

x_{t} = \sqrt{1 - β_{t}} x_{t - 1} + \sqrt{β_{t}} ϵ, ϵ \sim N (0, I),

(10)

and since $ϵ$ follows a Gaussian distribution, the relationship between the distributions of $x_{t}$ and $x_{t - 1}$ can be expressed as

q (x_{t} | x_{t - 1}) = N (x_{t}; \sqrt{1 - β_{t}} x_{t - 1}, β_{t} I) .

(11)

Defining $α_{t} = 1 - β_{t}$, Equation (10) can be rewritten as

\begin{aligned}
x_{t} & = \sqrt{α_{t}} x_{t - 1} + \sqrt{1 - α_{t}} ϵ \\
& = \sqrt{α_{t} α_{t - 1}} x_{t - 2} + \sqrt{α_{t} (1 - α_{t - 1})} ϵ + \sqrt{1 - α_{t}} ϵ, \quad ϵ \sim N (0, I) .
\end{aligned}

(12)

According to the addition property of the Gaussian distribution, the coefficients of $ϵ$ can be combined into $\sqrt{1 - α_{t} α_{t - 1}}$. By applying mathematical induction, the following result is derived:

x_{t} = \sqrt{{\bar{α}}_{t}} x_{0} + \sqrt{1 - {\bar{α}}_{t}} ϵ, ϵ \sim N (0, I),

(13)

where ${\bar{α}}_{t} = \prod_{i = 1}^{t} α_{i}$. Given $x_{0}$ and based on $\bar{α}$, the distribution of $x_{t}$ can be sampled from the following expression:

q (x_{t} | x_{0}) = N (x_{t}; \sqrt{{\bar{α}}_{t}} x_{0}, (1 - {\bar{α}}_{t}) I) .

(14)
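The closed form in Equations (13) and (14) can be compared against the step-by-step chain of Equation (10). The linear $β_{t}$ schedule, the number of steps, and the helper name `q_sample` in this sketch are assumptions made for illustration and are not settings reported in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 50
betas = np.linspace(1e-4, 0.05, T)       # hypothetical linear noise schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)           # cumulative product of alpha_i up to each t

def q_sample(x0, t, eps):
    """Draw x_t from q(x_t | x_0) via Equation (13); t is 1-based."""
    return np.sqrt(alpha_bar[t - 1]) * x0 + np.sqrt(1.0 - alpha_bar[t - 1]) * eps

# Step-by-step chain from Equation (10), with a fresh noise draw at every step.
x0 = np.full(100000, 2.0)
x = x0.copy()
for t in range(1, T + 1):
    x = np.sqrt(alphas[t - 1]) * x + np.sqrt(betas[t - 1]) * rng.standard_normal(x.shape)

# One-shot sample from the closed form.
closed = q_sample(x0, T, rng.standard_normal(x0.shape))
print(x.mean(), closed.mean(), x.var(), closed.var())
```

Both routes give samples whose mean and variance agree with $\sqrt{{\bar{α}}_{T}} x_{0}$ and $1 - {\bar{α}}_{T}$ up to sampling error, so training can draw $x_{t}$ for any t directly instead of iterating the chain.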

According to Equation (14), we can sample $q (x_{t} | x_{0})$ in the diffusion process. The posterior conditional probability $q (x_{t - 1} | x_{t}, x_{0})$ of the diffusion process also needs to be calculated, as given by Bayes’ theorem:

q (x_{t - 1} | x_{t}, x_{0}) = \frac{q (x_{t} | x_{t - 1}, x_{0}) q (x_{t - 1} | x_{0})}{q (x_{t} | x_{0})} .

(15)

According to Equations (11) and (14), $q (x_{t - 1} | x_{t}, x_{0})$ satisfies the following relation:

\begin{aligned}
q (x_{t - 1} | x_{t}, x_{0}) \propto \exp \Big( - \frac{1}{2} \Big[ & \Big( \frac{α_{t}}{β_{t}} + \frac{1}{1 - {\bar{α}}_{t - 1}} \Big) x_{t - 1}^{2} \\
& - \Big( \frac{2 \sqrt{α_{t}} x_{t}}{β_{t}} + \frac{2 \sqrt{{\bar{α}}_{t - 1}} x_{0}}{1 - {\bar{α}}_{t - 1}} \Big) x_{t - 1} + C (x_{t}, x_{0}) \Big] \Big) .
\end{aligned}

(16)

Assuming that $q (x_{t - 1} | x_{t}, x_{0})$ follows a Gaussian distribution with mean $\tilde{μ} (x_{t}, x_{0})$ and variance ${\tilde{β}}_{t}$, we have

q (x_{t - 1} | x_{t}, x_{0}) = N (x_{t - 1}; \tilde{μ} (x_{t}, x_{0}), {\tilde{β}}_{t} I) .

(17)

From Equation (16), the following can be derived:

{\tilde{β}}_{t} = \frac{1 - {\bar{α}}_{t - 1}}{1 - {\bar{α}}_{t}} β_{t},

(18)

and

{\tilde{μ}}_{t} (x_{t}, x_{0}) = \frac{\sqrt{α_{t}} (1 - {\bar{α}}_{t - 1})}{1 - {\bar{α}}_{t}} x_{t} + \frac{\sqrt{{\bar{α}}_{t - 1}} β_{t}}{1 - {\bar{α}}_{t}} x_{0} .

(19)
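Equations (18) and (19) follow from completing the square in Equation (16): the coefficient of $x_{t - 1}^{2}$ is the posterior precision, and the linear coefficient divided by twice the precision is the posterior mean. The sketch below implements the two closed forms and recomputes them from the coefficients of Equation (16). The function name `posterior_params` and the schedule values are hypothetical.

```python
import numpy as np

def posterior_params(x_t, x0, t, betas):
    """Mean and variance of q(x_{t-1} | x_t, x_0) from Equations (18) and (19).

    t is 1-based and must satisfy t >= 2 so that alpha_bar_{t-1} is defined.
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    a_t, b_t = alphas[t - 1], betas[t - 1]
    ab_t, ab_prev = alpha_bar[t - 1], alpha_bar[t - 2]
    var = (1.0 - ab_prev) / (1.0 - ab_t) * b_t                   # Equation (18)
    mean = (np.sqrt(a_t) * (1.0 - ab_prev) / (1.0 - ab_t)) * x_t \
         + (np.sqrt(ab_prev) * b_t / (1.0 - ab_t)) * x0          # Equation (19)
    return mean, var

betas = np.linspace(1e-4, 0.02, 10)      # hypothetical schedule
x_t = np.array([0.3, -1.2])
x0 = np.array([1.0, 0.5])
mean, var = posterior_params(x_t, x0, t=5, betas=betas)

# Same quantities obtained by completing the square in Equation (16).
alpha_bar = np.cumprod(1.0 - betas)
precision = (1.0 - betas[4]) / betas[4] + 1.0 / (1.0 - alpha_bar[3])
linear = np.sqrt(1.0 - betas[4]) * x_t / betas[4] + np.sqrt(alpha_bar[3]) * x0 / (1.0 - alpha_bar[3])
print(var, 1.0 / precision, mean, linear / precision)
```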

After the diffusion process, the original data

x_{0}

can be noised to a Gaussian distribution. For the reverse process, the original data need to be recovered from Gaussian noise, and the reverse process is still a Markov process [65]:

p_{θ} (x_{0 : T}) = p (x_{T}) \prod_{t = 1}^{T} p_{θ} (x_{t - 1} | x_{t}) .

(20)

In the reverse diffusion process, a parameterized network $p_{θ}$ needs to be designed to estimate the reverse process. Assume the conditional probability $p_{θ} (x_{t - 1} | x_{t})$ has mean $μ_{θ}$ and variance $Σ_{θ}$. The inputs to $μ_{θ}$ and $Σ_{θ}$ are both $x_{t}$ and t. The following relation can be obtained:

p_{θ} (x_{t - 1} | x_{t}) = N (x_{t - 1}; μ_{θ} (x_{t}, t), Σ_{θ} (x_{t}, t)) .

(21)

Equation (20) is the reverse diffusion, which is expected to gradually recover the distribution of $x_{0}$ from $x_{T}$. Determining the values of $μ_{θ} (x_{t}, t)$ and $Σ_{θ} (x_{t}, t)$ requires performing maximum likelihood estimation on $p_{θ}$. Based on the rules of mathematical expectation and the definition of the KL divergence, $log (p_{θ} (x_{0}))$ can be expressed as follows:

\begin{matrix} log (p_{θ} (x_{0})) = E_{q_{(x_{1 : T} | x_{0})}} [log \frac{p_{θ} (x_{0 : T})}{q (x_{1 : T} | x_{0})}] \\ + D_{KL} [q (x_{1 : T} | x_{0}) | | p_{θ} (x_{1 : T} | x_{0})] . \end{matrix}

(22)
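For readers who want the intermediate steps, Equation (22) can be obtained as follows. This is the standard variational argument written in the notation of this section, not a derivation reproduced from the paper. It uses only the facts that $log (p_{θ} (x_{0}))$ does not depend on $x_{1 : T}$ and that $p_{θ} (x_{0 : T}) = p_{θ} (x_{0}) p_{θ} (x_{1 : T} | x_{0})$.

```latex
\begin{aligned}
\log p_{θ}(x_{0})
  &= E_{q(x_{1:T} | x_{0})} \left[ \log p_{θ}(x_{0}) \right] \\
  &= E_{q(x_{1:T} | x_{0})} \left[ \log \frac{p_{θ}(x_{0:T})}{p_{θ}(x_{1:T} | x_{0})} \right] \\
  &= E_{q(x_{1:T} | x_{0})} \left[ \log \frac{p_{θ}(x_{0:T})}{q(x_{1:T} | x_{0})}
     + \log \frac{q(x_{1:T} | x_{0})}{p_{θ}(x_{1:T} | x_{0})} \right] \\
  &= E_{q(x_{1:T} | x_{0})} \left[ \log \frac{p_{θ}(x_{0:T})}{q(x_{1:T} | x_{0})} \right]
     + D_{KL} \left[ q(x_{1:T} | x_{0}) \,\|\, p_{θ}(x_{1:T} | x_{0}) \right] .
\end{aligned}
```

The third line multiplies and divides by $q (x_{1 : T} | x_{0})$ inside the logarithm, and the last line recognizes the second expectation as a KL divergence.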

Since the KL divergence is non-negative, the following inequality holds:

log (p_{θ} (x_{0})) \geq E_{q_{(x_{1 : T} | x_{0})}} [log \frac{p_{θ} (x_{0 : T})}{q (x_{1 : T} | x_{0})}],

(23)

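where the right-hand side is the lower bound for $log (p_{θ} (x_{0}))$. Let $L = - log (p_{θ} (x_{0}))$. Negating both sides of Equation (23) gives a quantity $L_{VLB}$ that satisfies $L \leq L_{VLB}$ and is conventionally referred to as the variational lower bound (VLB). It serves as the optimization objective and can be written as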

L_{VLB} = E_{q (x_{1 : T} | x_{0})} [log \frac{q (x_{1 : T} | x_{0})}{p_{θ} (x_{0 : T})}] .

(24)

Minimizing $L_{VLB}$ corresponds to performing the maximum likelihood estimation of $p_{θ}$. According to Fubini’s theorem and further decomposition, Equation (24) can be derived as follows:

\begin{aligned}
L_{VLB} & = E_{q (x_{0 : T})} \Big[ log \frac{q (x_{1 : T} | x_{0})}{p_{θ} (x_{0 : T})} \Big] \\
& = E_{q} \Big[ \underbrace{D_{KL} (q (x_{T} | x_{0}) \| p_{θ} (x_{T}))}_{L_{T}} \\
& \quad + \sum_{t = 2}^{T} \underbrace{D_{KL} (q (x_{t - 1} | x_{t}, x_{0}) \| p_{θ} (x_{t - 1} | x_{t}))}_{L_{t - 1}} \\
& \quad \underbrace{- log p_{θ} (x_{0} | x_{1})}_{L_{0}} \Big] .
\end{aligned}

(25)

We examine the composition of

L_{VLB}

. In

L_{T}

,

p_{θ} (x_{T})

is the starting point of the reverse process, which is Gaussian noise.

q (x_{T} | x_{0})

represents the diffusion process and does not contain learnable parameters. Therefore,

L_{T}

can be ignored as a constant. For

L_{0}

, the model specifies that the last step of the reverse process

p_{θ} (x_{0} | x_{1})

is an independent discrete decoder derived from a Gaussian distribution, which also does not require learnable parameters [41]. Thus, minimizing

L_{VLB}

only requires focusing on

L_{t - 1}

. The DDPM sets the variance as a constant related to

β_{t}

, so trainable parameters exist only in the mean. By substituting Equations (17) and (21) and expanding the KL divergence,

L_{t - 1}

can be written as follows:

L_{t - 1} = E [\frac{1}{2 σ_{t}^{2}} {∥(\frac{\sqrt{α_{t}} (1 - {\bar{α}}_{t - 1})}{1 - {\bar{α}}_{t}} x_{t} + \frac{\sqrt{{\bar{α}}_{t - 1}} β_{t}}{1 - {\bar{α}}_{t}} x_{0}) - μ_{θ} (x_{t}, t)∥}^{2}] + C .

(26)

For

μ_{θ}

in Equation (26), its input consists of

x_{t}

and the time encoding t. The output depends on our modeling objective. According to

{\tilde{μ}}_{t} (x_{t}, x_{0})

in Equation (19),

x_{0}

is unknown so we can let

μ_{θ}

directly predict the distribution of

x_{0}

. This design differs from the original DDPM and is inspired by DALL·E 2 [66]. The optimization objective designed in this way is beneficial for subsequent self-supervised training based on J-invariance. Based on this, let

F_{θ}

represent the specific form of the network prediction target. The mean of the reverse conditional distribution should be optimized toward the following target:

\begin{matrix} μ_{θ} (x_{t}, t) & = {\tilde{μ}}_{t} (x_{t}, F_{θ} (x_{t})) \\ & = \frac{\sqrt{α_{t}} (1 - {\bar{α}}_{t - 1})}{1 - {\bar{α}}_{t}} x_{t} + \frac{\sqrt{{\bar{α}}_{t - 1}} β_{t}}{1 - {\bar{α}}_{t}} F_{θ} (x_{t}) . \end{matrix}

(27)
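As a quick numerical illustration of Equation (27), the following minimal sketch computes the two coefficients that weight the current state and the predicted signal in the reverse mean. The linear schedule, the step count, and the function names are assumptions chosen for illustration and are not taken from the paper's implementation.

```python
import math

def alpha_bars(betas):
    """Cumulative products of alpha_t = 1 - beta_t; index 0 holds alpha_bar_0 = 1."""
    out = [1.0]
    for beta in betas:
        out.append(out[-1] * (1.0 - beta))
    return out

def posterior_mean_coeffs(betas, t):
    """Coefficients of x_t and of the predicted x_0 in Eq. (27), for 1 <= t <= T."""
    a_bar = alpha_bars(betas)
    beta_t = betas[t - 1]
    alpha_t = 1.0 - beta_t
    c_xt = math.sqrt(alpha_t) * (1.0 - a_bar[t - 1]) / (1.0 - a_bar[t])
    c_x0 = math.sqrt(a_bar[t - 1]) * beta_t / (1.0 - a_bar[t])
    return c_xt, c_x0

# Hypothetical linear schedule, used only for illustration.
T = 100
betas = [1e-4 + (2e-2 - 1e-4) * i / (T - 1) for i in range(T)]
c_xt, c_x0 = posterior_mean_coeffs(betas, t=50)
```

A useful sanity check is that a noise-free state scaled by \sqrt{{\bar{α}}_{t}}, combined with an exact prediction of the signal, is mapped to the signal scaled by \sqrt{{\bar{α}}_{t - 1}}. At t = 1 the coefficient of the current state vanishes, so the last reverse step returns the network prediction itself.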

Substituting Equation (27) into Equation (26) and simplifying yields the expression for

L_{t - 1}

:

L_{t - 1} = E [\frac{{\bar{α}}_{t - 1} β_{t}^{2}}{2 σ_{t}^{2} {(1 - {\bar{α}}_{t})}^{2}} {∥x_{0} - F_{θ} (x_{t})∥}^{2}] .

(28)

In actual model training, the leading coefficient, which depends only on the hyperparameters, can be dropped, leading to a more stable training process and improved training results. Thus, the optimization objective is given by

\underset{θ}{arg min} L (F_{θ}) = E_{x} {∥F_{θ} (x_{t}) - x_{0}∥}^{2} .

(29)

Equation (29) describes the optimization objective for the complete reverse diffusion process from

x_{T}

to

x_{0}

. However, directly applying Equation (29) to the self-supervised 3D seismic denoising task presents challenges. For

x_{0}

, there is no directly accessible ground-truth noiseless data in a self-supervised task. For

x_{T}

, the distribution of

p (x_{T})

is Gaussian noise. Starting the inference from Gaussian noise is inefficient. Moreover, denoising tasks call for a conditionally constrained reverse diffusion process, so that the denoised data do not deviate significantly from the underlying true signal within the original data.

To overcome these two challenges and achieve self-supervised conditional diffusion denoising for 3D seismic data, this paper proposes a solution that integrates statistical theory with the diffusion model, enabling conditional denoising without ground truth. By utilizing the noise function

P

obtained from the self-supervised noise estimation method in Section 2.1, we perform noise distribution matching within the Markov chain. This allows us to bypass a portion of the noise-reversal latent process, initiating the reverse process from an intermediate state and effectively constraining the diffusion model with information derived from the original data, acting as a proxy for ground truth.

As illustrated in Figure 2, the reverse Markov process can be expressed as follows:

p_{θ} (x_{0 : I}) = p (x_{I}) \prod_{t = 1}^{I} p_{θ} (x_{t - 1} | x_{t}),

(30)

where

p (x_{I})

denotes the posterior distribution in an intermediate state of the Markov chain.

Therefore, compared to DDPM, we need to solve for

p (x_{I})

to determine the initial node of the reverse diffusion process in the Markov chain. In Section 2.1, we implemented self-supervised noise estimation for 3D seismic data. By applying the fully trained noise estimation model

P_{J}

from Equation (8), a preliminary prior noise estimation of the noisy section can be obtained:

\tilde{n} = x_{j} - \sqrt{{\bar{α}}_{t}} P_{J} (x_{j^{e}}),

(31)

where

P_{J} (x_{j^{e}})

generates an estimate of the true value for section

x_{j}

.

\sqrt{{\bar{α}}_{t}}

is the pre-defined noise schedule representing the proportion of the true signal at state t in the Markov chain.

According to Equation (13), in the diffusion process, the pre-assigned added noise at node t in the Markov chain is

n (t) = \sqrt{1 - {\bar{α}}_{t}} ϵ

, where

ϵ \sim N (0, I)

. Solving for t allows us to determine the position in the Markov chain of the diffusion process where the noise estimated by

P_{J}

is added. We use Kernel Density Estimation (KDE) with a Gaussian kernel to estimate the probability density function of the noise. Then, by calculating its Kullback–Leibler (KL) divergence, we frame solving for t as the following optimization problem:

\underset{t}{arg min} D_{KL} (K (x_{j} - \sqrt{{\bar{α}}_{t}} P_{J} (x_{j^{e}})) | | K (\sqrt{1 - {\bar{α}}_{t}} ϵ)) .

(32)

Since t is a discrete integer index and

{\bar{α}}_{t}

is a pre-assigned monotonic sequence, the optimization problem can be solved by a simple linear search. When the optimal

I = t

can be found such that the KL divergence between the two distributions is minimized, it implies that the noise component in the data is sufficiently close to the noise added at a certain node in the Markov chain during the diffusion process. As a node

x_{I}

in the reverse diffusion process shares the same noise schedule, there must exist at least one posterior sample that is sufficiently close to

x_{j}

in the complete reverse diffusion process. Thus, by matching

x_{j}

to

p (x_{I} | x_{j})

, the reverse diffusion process can start directly from the intermediate state

p (x_{I} | x_{j})

of the Markov chain, rather than from the Gaussian distribution

p (x_{T})

. This not only accelerates sampling but also transforms the unconditional denoising process of the diffusion model into conditional generative denoising, constrained by the estimate of the true signal predicted by

P_{J}

. As a result, the sampling outcome is no longer an unconditional generation but a further refinement of the denoising result of

P_{J}

.
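The linear search in Equation (32) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the data are a one-dimensional sinusoid with noise of standard deviation 0.3, the clean signal stands in for the output of the noise estimation network, the reference noise uses deterministic Gaussian quantiles instead of random draws, and the schedule, KDE bandwidth, and evaluation grid are hypothetical values.

```python
import math
import random
from statistics import NormalDist

def kde_on_grid(samples, grid, bandwidth):
    """Gaussian-kernel density estimate on a grid, normalised to sum to 1."""
    dens = [sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
            for g in grid]
    total = sum(dens)
    return [d / total for d in dens]

def kl_divergence(p, q, floor=1e-12):
    """Discrete KL(p || q); the floor avoids log(0) where q has empty tails."""
    return sum(pi * math.log(pi / max(qi, floor)) for pi, qi in zip(p, q) if pi > 0.0)

def match_timestep(noisy, predicted, alpha_bar, eps, grid, bandwidth):
    """Linear search over t for the step whose scheduled noise best fits the residual."""
    best_t, best_kl = None, float("inf")
    for t in range(1, len(alpha_bar)):
        signal_scale = math.sqrt(alpha_bar[t])
        noise_scale = math.sqrt(1.0 - alpha_bar[t])
        residual = [x - signal_scale * p for x, p in zip(noisy, predicted)]
        scheduled = [noise_scale * e for e in eps]
        kl = kl_divergence(kde_on_grid(residual, grid, bandwidth),
                           kde_on_grid(scheduled, grid, bandwidth))
        if kl < best_kl:
            best_t, best_kl = t, kl
    return best_t

# Toy data (all values hypothetical): a sinusoid plus zero-mean noise of std 0.3.
n = 128
z = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]  # deterministic N(0, 1) quantiles
noise = [0.3 * v for v in z]
random.Random(0).shuffle(noise)  # decorrelate the noise from the signal
clean = [math.sin(2.0 * math.pi * i / 16.0) for i in range(n)]
noisy = [c + e for c, e in zip(clean, noise)]

# Hypothetical short linear schedule; alpha_bar[0] = 1 by convention.
T = 60
betas = [1e-3 + (2e-2 - 1e-3) * i / (T - 1) for i in range(T)]
alpha_bar = [1.0]
for b in betas:
    alpha_bar.append(alpha_bar[-1] * (1.0 - b))

grid = [-1.5 + 0.025 * k for k in range(121)]
# The clean signal is used as an oracle stand-in for the network prediction.
I = match_timestep(noisy, clean, alpha_bar, z, grid, bandwidth=0.05)
matched_std = math.sqrt(1.0 - alpha_bar[I])
```

In this toy setting the matched step is the one whose scheduled noise level is close to the standard deviation of the injected noise, which is the behaviour the matching criterion is meant to produce.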

When Equation (32) finds the optimal I, the corresponding noise

ϵ

can be obtained using the noise estimation from

P_{J}

. Due to the lack of valid samples, this self-supervised denoising method needs to prevent the solution of the diffusion model

F

from collapsing into the solution space of the noise estimation function

P

. To ensure that

F

correctly learns to estimate the noise distribution rather than simply learning the explicit noise model, we apply a spatial random shuffling operation to the noise predicted by

P

. Therefore,

\tilde{ϵ}

is calculated using the following equation:

\tilde{ϵ} = Shuffle [\frac{x_{j} - \sqrt{{\bar{α}}_{I}} P_{J} (x_{j^{e}})}{\sqrt{1 - {\bar{α}}_{I}}}] .

(33)

Since the noise components are assumed to be independent, applying the spatial random shuffling operation to the residual noise predicted by

P

does not alter the distribution of

ϵ

. This operation encourages

F

to learn from the implicit noise distribution estimated by

P

, guiding further optimization of the denoising result. The prior of

x_{t}

in the diffusion process can be derived from

\tilde{ϵ}

, as given by the following equation:

{\tilde{x}}_{t} = \sqrt{{\bar{α}}_{t}} P_{J} (x_{j^{e}}) + \sqrt{1 - {\bar{α}}_{t}} \tilde{ϵ} .

(34)
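Equations (33) and (34) can be illustrated with a short sketch on a flattened section. The sample values, the two noise-schedule values, and the function names are hypothetical, and a real section would be flattened before shuffling and reshaped afterwards.

```python
import math
import random

def shuffled_unit_noise(noisy, predicted, alpha_bar_I, rng):
    """Eq. (33): rescale the residual by the matched noise level, then shuffle it."""
    scale = math.sqrt(1.0 - alpha_bar_I)
    eps = [(x - math.sqrt(alpha_bar_I) * p) / scale for x, p in zip(noisy, predicted)]
    rng.shuffle(eps)  # in-place permutation; the set of values is unchanged
    return eps

def diffuse_prediction(predicted, eps_tilde, alpha_bar_t):
    """Eq. (34): noisy input built from the prediction and the shuffled noise."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * p + b * e for p, e in zip(predicted, eps_tilde)]

# Hypothetical flattened section; the numbers are illustrative only.
rng = random.Random(0)
predicted = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]  # stand-in for the P_J output
noisy = [p + rng.gauss(0.0, 0.2) for p in predicted]
alpha_bar_I = 0.96
eps_tilde = shuffled_unit_noise(noisy, predicted, alpha_bar_I, rng)
x_tilde_t = diffuse_prediction(predicted, eps_tilde, alpha_bar_t=0.90)
```

Because shuffling is a permutation, the empirical distribution of the noise values is preserved exactly, while their spatial alignment with the signal is broken.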

Similar to the J-invariance optimization concept introduced in Section 2.1, the training of the conditional constrained diffusion model in this subsection also lacks ground-truth data. We adopt the same sampling strategy as in Section 2.1 to construct

F

as a J-invariant function

F_{J}

, achieving conditional self-supervised denoising without ground truth. Consequently, the training target should be seismic sections from another volume, excluding the input. This requirement for self-supervised denoising without ground truth is also one of the reasons why our diffusion model optimization target is the signal rather than the noise. Therefore, the overall optimization objective of the algorithm is formulated as follows:

\begin{matrix} \underset{θ}{arg min} L (F_{J} θ) = E_{x} {∥F_{J} θ ({\tilde{x}}_{*, *, j^{e}}, t) - x_{*, *, j}∥}^{2} . \end{matrix}

(35)

When training is complete, according to Equations (21) and (27),

x_{t - 1}

can be sampled using the following expression:

\begin{matrix} x_{t - 1} & = \frac{\sqrt{α_{t}} (1 - {\bar{α}}_{t - 1})}{1 - {\bar{α}}_{t}} x_{t} + \frac{\sqrt{{\bar{α}}_{t - 1}} β_{t}}{1 - {\bar{α}}_{t}} F_{J} θ (x_{t}, t) \\ + σ_{t} z, z \sim N (0, I) . \end{matrix}

(36)

Following Equation (36), the reverse diffusion process can be iteratively performed from

x_{j}

to

x_{0}

. The number of iterations depends on the optimal matching stage of the noise contained in

x_{j}

within the Markov chain.

The network architecture is essentially the same as the noise estimation network used in Section 2.1. The difference lies in the addition of a time-encoding module within the residual block to mark the specific timestep of both the forward and reverse diffusion processes. This is implemented by encoding the timestep into a vector via sinusoidal positional embedding [67], which is then added to the corresponding features.
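A minimal sketch of a sinusoidal timestep embedding of the kind described above is given below. The embedding dimension, the maximum period of 10000, and the sine-then-cosine layout follow a common convention and are assumptions; the paper does not state its exact configuration.

```python
import math

def timestep_embedding(t, dim, max_period=10000.0):
    """Sinusoidal embedding of an integer timestep; dim is assumed to be even."""
    half = dim // 2
    # Geometrically spaced frequencies from 1 down towards 1 / max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.sin(t * w) for w in freqs] + [math.cos(t * w) for w in freqs]

emb = timestep_embedding(t=25, dim=8)
```

In a residual block, such a vector would typically pass through a small learned projection before being added to the feature maps; that projection is omitted here.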

The algorithm is described in pseudo-code in Algorithms 2 and 3. As shown in Algorithm 2, the proposed method achieves self-supervised diffusion process training for 3D seismic data without requiring noiseless ground truth. When combined with the reverse diffusion process described in Algorithm 3, the overall algorithm requires only a single noisy 3D seismic volume to obtain denoised results. This achieves self-supervised conditional constrained seismic data denoising based on the diffusion model.

Algorithm 2 Self-supervised training of SSDn-DDPM

Require: S (3D Noisy Seismic Data), $P_{J}$ (Pre-trained Noise Estimation Function), T (Diffusion Noise Schedule)

1: while not converged do
2:     sample $x_{j^{e}}$ and $x_{j}$ from S (Sampling according to J-invariance in Section 2.1)
3:     $ϵ \sim N (0, I)$
4:     $\tilde{n} \leftarrow x_{j} - \sqrt{{\bar{α}}_{t}} P_{J} (x_{j^{e}})$
5:     for $t = 1$ to T step 1 do
6:         $\underset{t}{arg min} D_{KL} (K (x_{j} - \sqrt{{\bar{α}}_{t}} P_{J} (x_{j^{e}})) | | K (\sqrt{1 - {\bar{α}}_{t}} ϵ))$
7:         if the optimal solution t is found then
8:             $I \leftarrow t$; break
9:         end if
10:     end for
11:     $\tilde{ϵ} \leftarrow Shuffle [\frac{x_{j} - \sqrt{{\bar{α}}_{I}} P_{J} (x_{j^{e}})}{\sqrt{1 - {\bar{α}}_{I}}}]$
12:     $t \sim Uniform (\{1, \dots, T\})$
13:     ${\tilde{x}}_{t} \leftarrow \sqrt{{\bar{α}}_{t}} P_{J} (x_{j^{e}}) + \sqrt{1 - {\bar{α}}_{t}} \tilde{ϵ}$
14:     $\underset{θ}{arg min} L (F_{J} θ) = E_{x} {∥F_{J} θ ({\tilde{x}}_{t}) - x_{j}∥}^{2}$
15: end while
16: return I, $F_{J} θ$

Algorithm 3 Sampling of SSDn-DDPM

Require: S (3D Noisy Seismic Data), $P_{J}$ (Pre-trained Noise Estimation Function), $F_{J} θ$ (Pre-trained $x_{j}$ Estimation Function), T (Diffusion Noise Schedule), I (Optimal Noise Matching)

1: $x_{I}$ ← sample $x_{j}$ from S
2: for $t = I$ to 1 step -1 do
3:     $z \sim N (0, I)$ if $t > 1$, else $z = 0$
4:     $x_{t - 1} \leftarrow \frac{\sqrt{α_{t}} (1 - {\bar{α}}_{t - 1})}{1 - {\bar{α}}_{t}} x_{t} + \frac{\sqrt{{\bar{α}}_{t - 1}} β_{t}}{1 - {\bar{α}}_{t}} F_{J} θ (x_{t}, t) + σ_{t} z$
5: end for
6: $S_{0} \leftarrow Reconstruct (x_{0})$
7: return $S_{0}$
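The sampling loop of Algorithm 3 can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the authors' code: the trained network is replaced by an oracle that returns the clean signal, the schedule and the matched step are hypothetical, and the variance is taken to be the standard posterior variance because the text only states that it is a constant related to the noise schedule. The final reconstruction of the 3D volume from sections is omitted.

```python
import math
import random

def reverse_sample(x_I, I, betas, predict_x0, rng):
    """Run the update of Eq. (36) from the matched step I down to step 1."""
    alpha_bar = [1.0]
    for b in betas:
        alpha_bar.append(alpha_bar[-1] * (1.0 - b))
    x = list(x_I)
    for t in range(I, 0, -1):  # t = I, I - 1, ..., 1
        beta_t = betas[t - 1]
        c_xt = math.sqrt(1.0 - beta_t) * (1.0 - alpha_bar[t - 1]) / (1.0 - alpha_bar[t])
        c_x0 = math.sqrt(alpha_bar[t - 1]) * beta_t / (1.0 - alpha_bar[t])
        # Assumption: sigma_t^2 is the posterior variance of the forward process.
        sigma_t = math.sqrt(beta_t * (1.0 - alpha_bar[t - 1]) / (1.0 - alpha_bar[t]))
        x0_hat = predict_x0(x, t)
        # No noise is injected at the final step (t = 1).
        x = [c_xt * xi + c_x0 * hi + (sigma_t * rng.gauss(0.0, 1.0) if t > 1 else 0.0)
             for xi, hi in zip(x, x0_hat)]
    return x

# Toy trace (hypothetical values) and an oracle standing in for the trained network.
clean = [math.sin(0.3 * i) for i in range(32)]
rng = random.Random(0)
noisy = [c + rng.gauss(0.0, 0.1) for c in clean]
betas = [1e-4 + (2e-2 - 1e-4) * i / 99 for i in range(100)]

def oracle(x, t):
    return clean

denoised = reverse_sample(noisy, I=10, betas=betas, predict_x0=oracle, rng=rng)
```

With an oracle predictor the last step returns the prediction itself, since the coefficient of the current state is zero at t = 1. The number of loop iterations equals the matched step, which is why starting from an intermediate state shortens sampling.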

3. Results

In this section, we conduct a comprehensive and objective comparison experiment between the proposed SSDn-DDPM algorithm and various existing seismic denoising algorithms on multiple datasets. The seismic data used in the experiments include three types of synthetic seismic data with varying noise intensities and two types of real seismic records. To ensure comprehensiveness and objectivity, the selected comparison methods cover different approaches in the field of seismic denoising, representing notable and effective algorithms from each category. Specifically, these include the filtering-based SOSVMF algorithm [2,68], the sparse dictionary learning-based SGK algorithm [20], the rank-reduction-based DRR algorithm [24,25], the supervised deep learning-based SeisGAN [40], the unsupervised DIP [60], and the diffusion model-based PCADDPM algorithm [48]. The fundamental principles of these algorithms were introduced in Section 1. In this section, a comprehensive and detailed analysis of the experimental results is carried out, including a qualitative analysis of the denoised data and a quantitative analysis of the synthetic seismic data with ground truth, using various denoising evaluation metrics.

3.1. Model Configurations

Based on extensive experiments, we found that most of the parameters in the noise estimation model and the denoising diffusion model had a limited impact on the denoising performance of the algorithm as long as they were selected within a reasonable range. We mainly adopted standard configurations to ensure the stability of the model. For the few key parameters that significantly affected the algorithm performance, such as the size of the sampling block, the diffusion steps, and the noise scheduling strategy, we provide an in-depth discussion in Section 4.1, analyzing their mechanisms and optimization strategies in greater detail. The specific parameter settings are shown in Table 1, and the detailed network structural design is illustrated in Figure 1.

As shown in Figure 1a, we adopted 2D convolution in the network. The spatial relationships were confirmed and preserved through the 3D serpentine sampling and slicing process. Since we focused more on the underlying features than on the global structural features, it was unnecessary to explicitly model spatial relationships using 3D convolution. Additionally, compared to 3D convolution, 2D convolution has lower computational complexity, which is particularly advantageous for processing large-scale seismic data. Therefore, after a comprehensive evaluation, this paper adopted a 2D convolution-based network structure. The experimental results demonstrate that this approach achieves effective denoising performance while maintaining an acceptable computational and memory cost.

3.2. Synthetic Seismic Data

The ground truth of the synthetic seismic data is based on the seismic simulation of paleokarst collapse systems. Paleokarst systems provide favorable conditions for hydrocarbon migration. Conducting denoising research on them aids reservoir characterization, oil and gas production forecasting, seismic event prediction, and other engineering applications. Here, the dataset we selected was CigKarst [69,70], which was created by the Computational Interpretation Group (CIG) for deep learning-based paleokarst interpretation in 3D seismic images. The simulated paleokarst systems in CigKarst typically manifest as reflection depressions or discontinuities, which pose challenges to seismic denoising algorithms based on local statistics. Furthermore, due to the complexity of paleokarst systems, high-SNR seismic data are crucial for accurate identification and characterization. This study selected the 3D data volume from CigKarst as the noiseless ground-truth data, as shown in Figure 3, and added noise of varying intensities to obtain three sets of 3D seismic data suitable for quantitative analysis. As shown in Table 2, Table 3 and Table 4, the Peak Signal-to-Noise Ratio (PSNR) for the noisy synthetic data reached 24, 20, and 17 dB, respectively. Analyzing the denoising performance of algorithms under different noise intensities is helpful in exploring their generalization, robustness, and adaptability in complex real-world scenarios. The proposed algorithm is designed for blind denoising. When applied to data with varying noise intensities, it does not require any parameter adjustments and adaptively achieves good denoising results without significant performance fluctuations.

Table 2, Table 3 and Table 4 show the quantitative evaluation metrics for each algorithm applied to different noise levels. To comprehensively evaluate algorithm performance, we calculated the SNR, PSNR, Structural Similarity Index (SSIM), Mean Squared Error (MSE), and Cosine Similarity (CS). SNR measures the overall ratio of effective signal to noise in the denoised data. For seismic data, peak signal values typically represent effective reflections. PSNR evaluates the ability of the denoising algorithm to suppress noise in wave signals, which is particularly important for seismic exploration. SSIM measures the structural similarity of the denoised data and analyzes the ability of different algorithms to preserve effective signals. MSE measures the point-wise difference in the denoised data, evaluating the impact of accumulated local errors on signal restoration. Additionally, CS focuses on measuring the overall consistency of the signal, reflecting the denoising algorithm’s ability to preserve structural features and overall signal trends. It can be seen that SSDn-DDPM achieved lower MSE values than the competing algorithms while attaining higher SNR and PSNR values and maintaining excellent SSIM and CS values. These quantitative comparisons indicate that SSDn-DDPM effectively suppresses noise and simultaneously recovers signal details, excelling both in preserving point-wise local details and recovering overall structural information. Moreover, the denoising performance of the proposed algorithm remained more stable across varying noise levels, demonstrating its superior generalization and robustness.
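For reference, the point-wise metrics can be computed as in the sketch below. The definitions are common conventions and are assumptions here, in particular the use of the maximum absolute clean amplitude as the peak in PSNR; the paper does not state its exact formulas, and SSIM is omitted because it requires windowed statistics.

```python
import math

def mse(clean, denoised):
    """Mean squared point-wise error."""
    return sum((c - d) ** 2 for c, d in zip(clean, denoised)) / len(clean)

def snr_db(clean, denoised):
    """Ratio of signal energy to residual error energy, in dB."""
    signal = sum(c * c for c in clean)
    error = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    return 10.0 * math.log10(signal / error)

def psnr_db(clean, denoised):
    """Peak signal-to-noise ratio in dB; the peak is the largest clean amplitude (assumed)."""
    peak = max(abs(c) for c in clean)
    return 10.0 * math.log10(peak * peak / mse(clean, denoised))

def cosine_similarity(clean, denoised):
    """Cosine of the angle between the two traces viewed as vectors."""
    dot = sum(c * d for c, d in zip(clean, denoised))
    norms = math.sqrt(sum(c * c for c in clean)) * math.sqrt(sum(d * d for d in denoised))
    return dot / norms

# Hypothetical four-sample trace with an error of 0.1 at every sample.
clean = [1.0, -1.0, 1.0, -1.0]
denoised = [0.9, -1.1, 1.1, -0.9]
scores = (mse(clean, denoised), snr_db(clean, denoised),
          psnr_db(clean, denoised), cosine_similarity(clean, denoised))
```

For a 3D volume the same formulas apply after flattening the clean and denoised volumes into vectors of equal length.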

Figure 4 illustrates the denoising results for synthetic seismic records. As shown in Figure 4a–h, each denoising method achieved noticeable denoising performance on lightly contaminated seismic data. Comparatively, the data processed by SSDn-DDPM appeared to be closest to the original record, as SSDn-DDPM effectively suppressed noise and better preserved data details. DRR, SGK, SeisGAN, and DIP exhibited some detail loss in restoring the paleokarst collapse structures compared to SOSVMF, PCADDPM, and SSDn-DDPM, especially in the horizontal slice. While SOSVMF and PCADDPM demonstrated excellent noise suppression, they did not outperform SSDn-DDPM. As a 2D denoising algorithm, PCADDPM lacks horizontal slice information, which is a notable limitation. In Figure 4i–p, similar to the lightly noisy data, SSDn-DDPM achieved the best overall denoising performance. As shown in Figure 4q–x, for the seismic data with heavy noise contamination, DRR and SSDn-DDPM effectively suppressed noise. Under such severe noise contamination, some other methods either failed, leaving significant residual noise, or damaged the effective signal to an unacceptable degree.

By comparing the denoising performance under various noise levels, as shown in Figure 4, it was found that some methods exhibited variations in denoising effectiveness and signal preservation as the noise distribution changed. This stems from the fact that these methods are non-blind, and despite meticulous parameter tuning, their performance fluctuated with varying noise distributions. Some deep learning-based methods, such as SeisGAN, require supervised training. Although SeisGAN is an excellent seismic denoising algorithm, it inevitably suffers from generalizability issues when faced with different noise distributions. In contrast, our proposed SSDn-DDPM can be trained in a self-supervised manner and is a blind denoising method, thereby eliminating the need for laborious parameter tuning and enabling effective noise suppression across different noise distributions. Furthermore, unlike supervised methods, SSDn-DDPM can be trained in a self-supervised manner on various data types without requiring ground truth, allowing it to be applied to a broader range of scenarios. Therefore, SSDn-DDPM demonstrated more stable denoising performance across different noise distributions, showcasing its superior generalizability and robustness.

The single-trace comparison presented in Figure 5 further illustrates this point. With varying noise distributions, some methods showed obvious differences in fitting the valid signal. Specifically, algorithms that effectively recovered the signal shape in lightly noisy data showed significant deviations when the noise distribution changed. However, SSDn-DDPM consistently maintained a good fit for the valid signal despite variations in noise intensity.

To visually display the 3D seismic data, we present the 64th inline slice, 64th xline slice, and 90th horizontal slice. Since PCADDPM is a 2D denoising algorithm, we concatenated its denoising results for the corresponding sections into a 3D seismic data volume for consistent presentation. It should be noted that PCADDPM’s results only contain values for the displayed sections and thus lack a horizontal slice. Other methods, apart from the displayed sections, contain complete 3D data volumes. This visualization approach is also adopted for the field seismic data presented later.

Figure 6 presents the corresponding noise residuals for synthetic seismic records with light noise contamination. For the noise removed in Figure 6h, the residual morphologies of PCADDPM and SSDn-DDPM are noticeably better than those of other methods. However, in the xline section, PCADDPM shows a more significant leakage of strong reflection signals than SSDn-DDPM.

Figure 7 presents the local similarity maps for synthetic seismic records with moderate noise contamination. The maps show less signal leakage in the data denoised by DRR and SSDn-DDPM, indicating their superior ability to preserve effective signals during noise suppression.

Figure 8 and Figure 9 show enlarged views of the inline section and horizontal slice of seismic records with heavy noise contamination, respectively. These enlarged views provide a clearer visualization of the denoising details. As shown in Figure 8c,i, DRR and SSDn-DDPM exhibited better denoising performance under heavy noise contamination. In the areas indicated by arrows in Figure 8, SSDn-DDPM not only effectively removed noise but also restored structural details more accurately. This phenomenon is more evident in the horizontal slice views in Figure 9. In the regions indicated by arrows, SSDn-DDPM demonstrated superior performance in preserving details. In addition to achieving the best noise suppression, SSDn-DDPM generated denoised data that closely resemble the structure in Figure 9a, indicating its ability to remove noise effectively while preserving the detailed information of the original data.

3.3. Field Seismic Data

In this subsection, we compare the denoising performance of different algorithms on field 3D seismic data. The chosen datasets, Stratton 3D and F3 Netherlands, are publicly available. The Stratton 3D seismic data were collected and made available by the Bureau of Economic Geology at the University of Texas at Austin, Austin, TX, USA. The Stratton 3D data suffer from relatively light noise contamination, and we extracted its deeper 3D data volume, which contains more noticeable noise, as experimental data. The original F3 Netherlands seismic data contain more substantial noise. We extracted the 3D data volume corresponding to the fault structures for experiments. For F3 Netherlands, we thank dGB Earth Sciences for making the data available as an OpendTect project via their TerraNubis portal (terranubis.com). These field 3D seismic datasets contain complex geological structures and diverse noise distributions, posing challenges for effective noise suppression. Combined with the synthetic seismic data, these datasets can serve as adequate benchmarks for comparing the methods and elucidating their characteristics.

Figure 10 shows the denoising results of different algorithms on the Stratton 3D data. It can be seen that DRR, SOSVMF, DIP, and SSDn-DDPM effectively suppressed noise. Due to the limitation of generalization, SeisGAN and PCADDPM did not achieve satisfactory denoising performance. While SGK removed noise, it also lost considerable data detail. Among the effective denoising algorithms, SSDn-DDPM preserved data details more comprehensively. Figure 11 displays the corresponding noise residuals, indicating that there was obvious structural information leakage in the denoising results of SGK and SeisGAN. The noise residuals of other methods did not show apparent signal leakage, indicating that the signal was effectively preserved.

Figure 12 shows the denoising results of different algorithms on the F3 Netherlands data. From the subfigure comparison, it is evident that the proposed SSDn-DDPM effectively suppressed noise and best preserved data details, which is particularly evident in the horizontal slice. Due to the characteristics of the F3 Netherlands data, preserving effective signals during denoising was challenging. As shown in Figure 13 and Figure 14, most methods, even those that perform well on other datasets, inevitably exhibited noticeable structural information leakage in their noise residuals for the F3 Netherlands data. SSDn-DDPM, while effectively denoising, showed the least structural information leakage, demonstrating its superior ability to preserve effective signals, which is crucial in seismic signal processing.

Figure 15 illustrates enlarged views of the F3 Netherlands seismic data. In the areas indicated by arrows in the enlarged views, it can be seen that SSDn-DDPM restored the subsurface structural details more effectively.

4. Discussion

4.1. Parameter Sensitivity

In this section, we conduct a sensitivity analysis of key parameters to explore their specific effects on the model. This section focuses on critical parameters that are closely related to the noise estimation model and the denoising diffusion model, including sampling block size, diffusion steps, and noise schedule.

4.1.1. Sampling Block Size

Our algorithm is based on local statistical characteristics. Specifically, for seismic data, it leverages the continuity of linear events. Therefore, during sampling, we aim to preserve the continuity of events as much as possible, allowing our denoising network to exploit this similarity and remove irrelevant noise. Based on this principle, we adopted a small step size in the horizontal direction to ensure that adjacent samples within the sampling block maintain linear event correlation in the original 3D data.

To determine the specific values of the sampling stride, we experimented with several small step sizes in the horizontal direction, as shown in Table 5. Ultimately, we found that a horizontal step size of 1 yielded the best results. However, setting the horizontal step size to 1 incurs a huge computational cost. To address this issue, we increased the step size along the time axis based on the following two considerations:

1.: Seismic data fundamentally represent the spatial arrangement of geophone recordings. Since the vertical sampling interval is much smaller than the lateral geophone spacing, the vertical resolution of seismic data is higher than the horizontal resolution, leading to greater spatial redundancy in the vertical direction. In contrast, the horizontal resolution is more expensive and should generally be preserved as much as possible.
2.: Due to sedimentary compaction, most geological formations exhibit horizontal stratification, except in certain unique cases. This results in a higher lateral statistical correlation of seismic events compared to the vertical direction.

Based on these considerations, we balanced denoising performance and computational cost by setting the vertical step size to 32, ultimately determining a sampling stride of

(1, 1, 32)

. Our sampling block size was designed in coordination with the sampling stride. Sampling blocks that are too small will require more iterations to cover the entire dataset, leading to increased computational costs. Therefore, we set the sampling block size to

(120, 120, 128)

. This choice may involve a trade-off due to computational resource limitations; nonetheless, it achieved excellent denoising results.
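The effect of the stride on the number of sampling blocks can be estimated with a short calculation. The sketch below assumes a plain sliding window without padding and a hypothetical volume size; it does not reproduce the serpentine sampling used in the paper.

```python
def blocks_per_axis(size, block, stride):
    """Number of window positions along one axis, without padding."""
    if size < block:
        return 0
    return (size - block) // stride + 1

def count_blocks(volume_shape, block_shape, stride):
    """Total number of sampling blocks for a sliding window over a 3D volume."""
    total = 1
    for size, block, step in zip(volume_shape, block_shape, stride):
        total *= blocks_per_axis(size, block, step)
    return total

# Hypothetical volume of 128 x 128 traces with 256 time samples.
volume = (128, 128, 256)
block = (120, 120, 128)
n_paper_stride = count_blocks(volume, block, (1, 1, 32))  # 9 * 9 * 5
n_unit_stride = count_blocks(volume, block, (1, 1, 1))    # 9 * 9 * 129
```

For this hypothetical volume, raising the vertical step from 1 to 32 reduces the number of blocks from 10449 to 405 while leaving the horizontal coverage unchanged.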

4.1.2. Noise Schedule

In the initial stage of the experiments, we referred to the commonly used linear noise schedule, in which

β_{t}

increases linearly within the range of diffusion steps. Although the linear noise schedule is a classic setting, it posed certain challenges in our task. The primary reason is that the noise distribution provided by the noise estimation model guides the sampling process of the denoising diffusion model, allowing it to sample from the intermediate state

x_{I}

of the Markov chain back to

x_{0}

. Since the amplitude of the linear noise increment is the same in both the early and late stages of diffusion, the noise features in most seismic data to be denoised are quickly overwhelmed by the linearly increasing noise in the early stages of diffusion.

In our method, this phenomenon caused the matched position in the Markov chain to be too close to

x_{0}

, significantly reducing the practical denoising steps involved in the sampling process. To intuitively understand this phenomenon, we visualize

x_{t}

generated under different noise schedules in Figure 16, which shows that the linear noise schedule resulted in only a few practical steps in the sampling process. The experimental results in Figure 17 also demonstrate that this phenomenon led to insufficient detail recovery, causing the final denoised results to exhibit signal blurring and amplitude loss.

We adopted a parameter freezing noise schedule to address this issue, as shown in Figure 16b. Specifically, during the early stage of diffusion, we froze

β_{t}

at a small constant value to slow the growth rate of noise intensity, allowing the matched node

x_{I}

in the Markov chain to undergo more iterations, fully recovering the data details. The specific setting was as follows: during the first 200 steps,

β_{t}

was fixed at 5 × 10⁻⁵, delaying premature data contamination by noise. Subsequently, from Steps 200 to 1000,

β_{t}

was increased linearly to 1 × 10⁻². The experimental results in Figure 17 show that this noise schedule enhanced the model’s ability to recover fine details while maintaining denoising quality.
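The parameter-freezing schedule described above can be written down directly. The sketch below also builds a linear schedule for comparison; because the endpoints of the linear baseline are not given in the text, reusing the same start and end values is an assumption made purely for illustration, as is the noise-level threshold of 0.1.

```python
import math

def frozen_then_linear_betas(total_steps=1000, frozen_steps=200,
                             beta_frozen=5e-5, beta_end=1e-2):
    """beta_t is held constant for the first steps, then ramps linearly to beta_end."""
    betas = [beta_frozen] * frozen_steps
    ramp = total_steps - frozen_steps
    for k in range(1, ramp + 1):
        betas.append(beta_frozen + (beta_end - beta_frozen) * k / ramp)
    return betas

def linear_betas(total_steps=1000, beta_start=5e-5, beta_end=1e-2):
    """Linear baseline; the endpoints are assumed, not taken from the paper."""
    return [beta_start + (beta_end - beta_start) * i / (total_steps - 1)
            for i in range(total_steps)]

def steps_below_noise_std(betas, max_std):
    """Count steps whose scheduled noise level sqrt(1 - alpha_bar_t) is below max_std."""
    count, alpha_bar = 0, 1.0
    for beta in betas:
        alpha_bar *= 1.0 - beta
        if math.sqrt(1.0 - alpha_bar) < max_std:
            count += 1
    return count

frozen = frozen_then_linear_betas()
linear = linear_betas()
n_frozen = steps_below_noise_std(frozen, 0.1)
n_linear = steps_below_noise_std(linear, 0.1)
```

Under these assumptions the frozen schedule keeps several times more steps in the low-noise regime than the linear one, so lightly contaminated data are matched to a later node and receive more reverse iterations.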

4.1.3. Diffusion Steps

In the proposed algorithm, the number of diffusion steps determines the discrete time steps in the Markov chain through which the noisy data are progressively restored to clean data. A larger number of diffusion steps allows for a smoother noise removal process during sampling, leading to higher-quality results. However, excessive diffusion steps significantly increase inference time, reducing computational efficiency and yielding diminishing returns in performance improvement. To analyze the sensitivity of the diffusion step parameter, we conducted experiments under the parameter-freezing noise schedule, with the results presented in Table 6.

The experimental results indicate that as the number of diffusion steps increased, the denoising performance initially improved significantly. However, when the diffusion steps were further increased from 600 to 1000, the performance gains became much smaller, and the SNR even appeared to decrease; that is, the benefit of adding diffusion steps gradually weakened. Based on the experimental results and practical considerations, we ultimately selected 1000 diffusion steps, a commonly used setting. This choice ensured a balance between denoising performance and computational efficiency.

4.2. Computational Cost

In this section, we present the average computational time of each algorithm in Table 7 and the average memory usage of each algorithm in Table 8. The experimental environment employed a single NVIDIA RTX 3090 GPU, a Core i9-11900 CPU, and 64 GB of RAM.

As can be seen, the runtime of our method was not outstanding because, as a diffusion-model-based approach, it requires multiple inference steps for each patch. The values reported in Table 7 correspond to processing the entire 3D seismic dataset. For a single seismic section, however, our method takes an average of 3 s, which is acceptable for a non-real-time data processing program. Furthermore, to reduce the computational cost of inference, our method leverages a noise prediction model; without this model, the processing time per patch would increase to 14 s.

Regarding memory consumption, our algorithm remained within the mainstream range. In environments equipped with dedicated GPUs, there is generally sufficient GPU memory to run the proposed model smoothly.

Our method involves two stages during training: self-supervised 3D seismic noise modeling, which requires an average training time of 418 s and an average memory usage of 3103 MiB, and training the denoising diffusion model, which requires an average training time of 3.7 h and an average memory usage of 3970 MiB.

4.3. The Importance of Seismic Noise Modeling

Seismic noise is modeled using the self-supervised 3D seismic noise estimation model. The seismic noise model determines the state in the Markov chain where the noise estimated by P_{J} is added. Figure 18 illustrates the matching degree between the noise-added data sampled from different states in the Markov chain and the noisy input. It can be observed that our method matched the state in the Markov chain that was most similar to the noisy input, further demonstrating the effectiveness of noise matching in the algorithm.

If the seismic noise modeling module is removed, the noisy input data will fail to match the most appropriate state in the Markov chain and will instead start sampling from the initial state or an arbitrary state in the chain. Starting sampling without matching the correct state introduces significant discrepancies between the noise-added data and the noisy input, as shown in Figure 18. This undermines the effectiveness of ground-truth constraints during the denoising process, leading to deviations in the denoised results. Figure 19 presents the denoising results corresponding to sampling from different states. As shown in Figure 19a, sampling from the matched state yielded denoising results that were closest to the noiseless ground truth in Figure 19f. Conversely, denoising results obtained from other states exhibited structural differences, with the most significant deviations occurring in Figure 19e, where sampling was started from the initial state.

We believe that without the self-supervised 3D seismic noise estimation model, the algorithm essentially degrades into an unconstrained denoising diffusion model. The denoising task, however, requires accurate restoration of effective signals, and in this algorithm it is seismic noise modeling that provides the necessary constraints for the diffusion model, which highlights its importance.

Table 9 presents the sampling times for the single-slice reverse diffusion process when starting from different states. It can be observed that sampling from the matched state through seismic noise modeling significantly optimizes the algorithm’s computational complexity.
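To illustrate why starting from a matched state shortens sampling, the sketch below uses a simplified Gaussian stand-in for the matching step. It is not the KDE-based matching used by the proposed algorithm. It assumes the standard DDPM forward marginal x_t = √ᾱ_t·x_0 + √(1 − ᾱ_t)·ε, so that a state t corresponds to a noise-to-signal level of √((1 − ᾱ_t)/ᾱ_t). The noise level `0.1`, the function name, and the choice to start the β ramp from the frozen value are all hypothetical.

```python
import numpy as np

# Schedule values quoted in the noise-schedule discussion; starting the
# ramp at the frozen value is an assumption of this sketch.
betas = np.concatenate([np.full(200, 5e-5), np.linspace(5e-5, 1e-2, 800)])
alpha_bar = np.cumprod(1.0 - betas)

# Noise-to-signal level implied by each state of the forward chain.
implied_sigma = np.sqrt((1.0 - alpha_bar) / alpha_bar)

def match_state(sigma):
    """Return the 0-based index of the state whose implied noise level
    is closest to the estimated noise standard deviation `sigma`."""
    return int(np.argmin(np.abs(implied_sigma - sigma)))

t_match = match_state(0.1)          # hypothetical estimated noise level
reverse_steps_matched = t_match + 1  # reverse steps when starting at the match
reverse_steps_full = len(betas)      # reverse steps when starting at the far end
```

Under these assumptions the reverse process only has to traverse the states below the matched one, which is the source of the savings reported in Table 9.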

4.4. The Importance of Three-Dimensional Serpentine Sampling

Three-dimensional serpentine sampling enhances the structural similarity between the data processed by the self-supervised algorithm, thereby improving noise perception and enabling more effective noise suppression. To demonstrate the importance of 3D serpentine sampling, we conducted ablation experiments by replacing it with random sampling and linear-scanning sampling.

Figure 20 shows the results of the ablation experiments. It can be seen that the algorithm using 3D serpentine sampling achieved the best results across all three datasets. In contrast, Figure 20c demonstrates that using random sampling led to a complete loss of similarity between data, resulting in the worst denoising performance, with a failure to restore certain effective signals. The denoising results with linear-scanning sampling, shown in Figure 20d, were significantly better than those with random sampling. However, due to its relatively limited structural similarity compared to 3D serpentine sampling, more errors were introduced in the denoising results. This can be observed in the enlarged views, where the recovery of structural details is clearly inferior to the results shown in Figure 20e using 3D serpentine sampling, particularly in the regions indicated by arrows.

We conducted an ablation analysis using three datasets with ground truth, as shown in Table 10. The denoising quantitative results were consistent with the qualitative observations. Three-dimensional serpentine sampling, which preserves the highest structural similarity, achieved the best performance, followed by linear-scanning sampling with moderate structural similarity, and finally random sampling, which lacks structural similarity. This further validates the significance of 3D serpentine sampling.
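The contrast between the three sampling orders can be illustrated with a generic serpentine (boustrophedon) traversal of a 3D grid of block positions. This is only a sketch of the general idea, under the assumption that "serpentine" means reversing the scan direction at the end of every row and slab so that consecutive samples are always spatial neighbours. The function names and grid sizes are hypothetical, and this should not be read as the authors' exact traversal.

```python
import numpy as np

def serpentine_order_3d(nz, ny, nx):
    """Visit every (z, y, x) grid position, reversing direction at the end
    of each row and each slab so successive positions are adjacent."""
    order = []
    y_forward, x_forward = True, True
    for z in range(nz):
        ys = range(ny) if y_forward else range(ny - 1, -1, -1)
        for y in ys:
            xs = range(nx) if x_forward else range(nx - 1, -1, -1)
            order.extend((z, y, x) for x in xs)
            x_forward = not x_forward   # next row runs the other way
        y_forward = not y_forward       # next slab runs the other way
    return order

def linear_order_3d(nz, ny, nx):
    """Plain raster scan: every row starts again from x = 0."""
    return [(z, y, x) for z in range(nz) for y in range(ny) for x in range(nx)]

def max_jump(order):
    """Largest Manhattan distance between consecutive positions."""
    pts = np.array(order)
    return int(np.abs(np.diff(pts, axis=0)).sum(axis=1).max())

serp = serpentine_order_3d(3, 4, 5)
lin = linear_order_3d(3, 4, 5)
```

In this toy setting the serpentine order never jumps by more than one grid cell, whereas the raster scan jumps back across the grid at every row and slab boundary, and a random order has no neighbourhood guarantee at all.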

4.5. The Importance of the Diffusion Model

The denoising results of the complete algorithm with the diffusion model exhibited a higher SNR, less leakage of effective signals, and finer structural recovery compared to using only self-supervised 3D seismic noise estimation. To validate this, we conducted ablation experiments, and the results are shown in Figure 21.

The complete algorithm incorporating the diffusion model demonstrated superior noise suppression performance across all three datasets. The residual results reveal that removing the diffusion model resulted in more signal leakage, especially in the regions indicated by arrows. The quantitative results of the ablation experiment, presented in Table 11, further corroborate this observation, demonstrating that the complete algorithm achieved better performance across various quantitative metrics.

4.6. Simple Numerical Models

Compared with complex data, simple numerical models have more intuitive signal components and noise characteristics. This helps eliminate interference from unknown factors and makes it more reliable to measure how well a denoising algorithm recovers the signal. Their simple structure also makes the denoising effect easier to observe directly, which is especially useful for confirming which types of noise are affected during denoising.

We constructed a simple numerical model, as shown in Figure 22a. This simple numerical model simulates pre-stack data using Ricker wavelets as the wavelet basis and calculates the seismic data morphology corresponding to the formation model using the time-distance curve formula. To ensure diverse geological features, the formation model consists of a six-layer structure, including four horizontal layers and two dipping layers, with one horizontal layer and one dipping layer containing fault structures.
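The two ingredients named above, a Ricker wavelet and the hyperbolic time-distance curve t(x) = √(t₀² + x²/v²), can be combined as in the following minimal sketch. It builds a single pre-stack gather with flat reflectors only; the dipping layers and faults of the six-layer model are omitted. All parameter values (dominant frequency, sampling interval, velocities, offsets) and function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def ricker(f0, dt, half_samples=32):
    """Zero-phase Ricker wavelet with dominant frequency f0 (Hz)."""
    t = (np.arange(2 * half_samples + 1) - half_samples) * dt
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def synth_gather(t0s, vels, offsets, nt=512, dt=0.002, f0=30.0):
    """Pre-stack gather: one hyperbolic event per (t0, velocity) pair."""
    offsets = np.asarray(offsets, dtype=np.float64)
    spikes = np.zeros((nt, offsets.size))
    for t0, v in zip(t0s, vels):
        # Time-distance curve for a flat reflector.
        tx = np.sqrt(t0 ** 2 + (offsets / v) ** 2)
        idx = np.round(tx / dt).astype(int)
        valid = idx < nt                      # drop arrivals beyond the record
        spikes[idx[valid], np.nonzero(valid)[0]] += 1.0
    w = ricker(f0, dt)
    # Convolve every trace (column) with the wavelet.
    return np.apply_along_axis(lambda tr: np.convolve(tr, w, mode="same"),
                               0, spikes)

offsets = np.array([0.0, 500.0, 1000.0])     # metres, hypothetical
gather = synth_gather(t0s=[0.4], vels=[2000.0], offsets=offsets)
```

Adding noise to such a gather gives a controlled test case in which the clean signal is known exactly.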

After constructing the model, we added noise to verify the proposed method’s denoising effect on simple numerical models. The experimental results are shown in Figure 22c. It can be seen that the proposed method achieved excellent denoising results on the simple numerical model.

4.7. Applicable Noise Types

We added different types of noise to the simple numerical models to confirm which types are affected during the denoising process: impulse noise, coherent noise represented by surface waves, regional coherent noise related to interference in a specific area, random trace noise, and mixed noise of various types. Considering the characteristics of seismic data, we restricted the experiments to noise types that are primarily related to seismic data and excluded types that are common in natural images but uncommon in seismic data.

The experimental results are shown in Figure 23. Because our algorithm cannot effectively remove surface wave interference, we enlarged that case separately for a more detailed discussion. For the other noise types, as seen in Figure 23, our method suppressed the noise effectively. This is because the underlying logic of our algorithm is to exploit the autocorrelation of valid signals within seismic data to remove noise that is locally statistically uncorrelated. The seismic noise types shown in Figure 23a–d all have a certain degree of local randomness, which allows our method to denoise them effectively.

However, our method is ineffective against surface wave interference, as shown in Figure 24d. Although surface waves are typically considered a form of coherent noise in seismic data processing, they exhibit distinct wave characteristics. As can also be seen in Figure 24g, for seismic data contaminated by both surface wave interference and random noise, our method removed the random noise while preserving both the reflection wave and surface wave signals.

This phenomenon defines the boundaries of our algorithm’s application. The proposed algorithm is suitable for removing random seismic noise and other noise with randomness, but it is not applicable to structural interference waves. Nevertheless, it is important to emphasize that, as described in Equation (1), our algorithm is specifically designed for random noise suppression in seismic data. Its ability to remove other types of noise can be regarded as an exploration of how far its capabilities extend, but it should not be used as a basis for judging its effectiveness as a random noise suppression method.

4.8. Applicability of Post-Stack and Pre-Stack Data

In general, the proposed method is applicable to both post-stack and pre-stack data. The method is based on the internal similarity features of the data. Even in pre-stack data, structural continuity governed by hyperbolic moveout patterns persists. Therefore, the proposed method can be directly applied to both types of data without significant modifications to the model architecture or algorithm flow, and its self-supervised denoising mechanism has universal applicability.

Post-stack data undergo processing such as normal moveout correction and stacking, eliminating normal moveout and significantly enhancing the continuity of coherent events between traces. Pre-stack data, on the other hand, contain normal moveout, with events distributed in a hyperbolic shape with offset, and the continuity between traces is weaker than in post-stack data. Our method mainly relies on the internal similarity of the data, especially the continuity of reflection events in seismic data, to achieve self-supervised denoising. In pre-stack data, there is still inter-trace similarity, and although the continuity of reflection events is slightly weaker than in post-stack data, our method is still applicable.

We designed experiments to illustrate this. Random noise of the same scale was added to numerical models of pre-stack and post-stack data, generating noisy pre-stack and post-stack data with the same subsurface structure, as shown in Figure 25b,f. The same algorithm framework was applied without any adjustments, and the corresponding denoised data are shown in Figure 25c,g. Our method effectively suppressed noise in both pre-stack data and post-stack data, and qualitative analysis indicated no significant difference in denoising performance between them. Furthermore, we conducted a quantitative analysis of denoising performance, with the results shown in Table 12. The results show that although the method is applicable to both post-stack data and pre-stack data, the quantitative denoising results for pre-stack data were slightly inferior to those for post-stack data due to the presence of normal moveout in pre-stack data, which weakens the continuity of reflection events across traces.

Furthermore, we conducted experiments on real pre-stack data to verify the applicability and effectiveness of the proposed method across different types of seismic data. The real pre-stack data used in the experiments were selected from the SEG Advanced Modeling Corporation’s SEAM Open Data. The experimental results are shown in Figure 26, where it can be observed that the random noise in the seismic records was significantly suppressed after denoising. Notably, no obvious structural information leakage was observed in the noise residual, indicating that the proposed method not only effectively suppressed noise but also preserved the detailed features of the original seismic signal while avoiding signal damage. This further demonstrates the superiority of the method in maintaining the characteristics of the original signal.

In summary, the proposed method can be applied to post-stack and pre-stack data without any modifications. However, due to the differences in the continuity of reflection events between the two data types, denoising performance may show some variation. Post-stack data, having undergone normal moveout correction and stacking, exhibit more continuous reflection events, resulting in better denoising performance. Pre-stack data, despite having a weaker continuity of linear events, still possess sufficient local correlation for the proposed method to extract and distinguish internal noise from structural features. Qualitative denoising performance shows no significant difference between the two data types, with only minor discrepancies in quantitative metrics.

4.9. Generalizability

In our method, the algorithm suppresses noise based on the local similarity within seismic data. The overall design of our approach aims to minimize reliance on structural information in seismic data and instead focuses on underlying features, thereby enhancing the generalizability of noise suppression. As a result, regardless of the geological structures in the input data, the algorithm extracts its local amplitude signal features for processing. This ensures that as long as the underlying features remain similar, our algorithm can generalize effectively.

The underlying features of seismic signals refer to the intrinsic data characteristics independent of macro-scale geological structures. They are primarily determined by the fundamental properties of the signal at the local scale. Compared to structural information, such as stratigraphic morphology or reflector geometry, which are influenced by geological structures and lithological characteristics, underlying features more directly reflect the universal properties of seismic signals, including basic vibration patterns and noise statistical characteristics, which generally remain consistent across different regions and datasets. Therefore, by prioritizing underlying features, our method demonstrates generalizability when applied to seismic data from different regions.

We designed experiments to illustrate this, using pre-stack synthetic data, post-stack synthetic data, and the CigKarst dataset. In these experiments, the training and testing data were shuffled to illustrate the generalizability of our method across data modalities. We first used the model trained on the post-stack data to reduce noise in the pre-stack seismic data. We then swapped the training and test data, and also used the model trained on the simple pre-stack synthetic data to reduce noise in the CigKarst dataset. Figure 27 shows the experimental results. The structures of the training data and testing data in Figure 27a,b do not correspond, yet good denoising results were obtained for models trained and tested on different data modalities, provided that the noise distribution was similar. The denoising effects are comparable to those of the normally trained models in Figure 27d. This indicates that our algorithm exhibits strong generalizability in seismic data: the trained model can be reused across various strata models without additional retraining.

We summarize the quantitative metrics of the simple numerical model’s results in Table 13. For the objective evaluation metrics of the denoised data, the PSNR is comparable to that of the normally trained model. This indicates that our algorithm is still a reliable seismic signal-denoising approach in generalized applications.

It is worth noting that the quantitative results of the generalization experiments slightly exceeded our expectations. As shown in Table 13, when the algorithm trained on the pre-stack dataset was applied to the post-stack data, the quantitative denoising metrics surpassed those obtained by training directly on the post-stack dataset. Specifically, the SNR increased by 0.304 dB, the PSNR improved by 0.042 dB, and the SSIM increased by 0.004. Our analysis suggests that this is because pre-stack data contain richer structural information than post-stack data. This additional information allows the model trained on pre-stack data to better capture noise characteristics when applied to a test set containing post-stack data, improving denoising performance.
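For readers who want to put differences of a few tenths of a decibel in context, the sketch below computes SNR and PSNR using common textbook definitions. These are assumptions for illustration: the definitions used for Table 13 are given earlier in the paper and may differ, for example in how the peak value is normalised. The function names are hypothetical.

```python
import numpy as np

def snr_db(clean, denoised):
    """SNR in dB: signal energy over residual energy."""
    clean = np.asarray(clean, dtype=np.float64)
    resid = clean - np.asarray(denoised, dtype=np.float64)
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(resid ** 2))

def psnr_db(clean, denoised):
    """PSNR in dB, taking the peak as max |clean| (an assumption)."""
    clean = np.asarray(clean, dtype=np.float64)
    mse = np.mean((clean - np.asarray(denoised, dtype=np.float64)) ** 2)
    peak = np.max(np.abs(clean))
    return 10.0 * np.log10(peak ** 2 / mse)

def energy_ratio(delta_db):
    """Factor by which residual energy changes for a given dB difference."""
    return 10.0 ** (delta_db / 10.0)

# Toy check: a constant signal of amplitude 2 with a constant error of 0.2.
clean = np.full(100, 2.0)
denoised = clean + 0.2
toy_snr = snr_db(clean, denoised)
toy_psnr = psnr_db(clean, denoised)
ratio_for_reported_gain = energy_ratio(0.304)
```

Under the SNR definition above, a gain of 0.304 dB corresponds to the residual energy shrinking by a factor of about 1.07, which is a modest but measurable improvement.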

If the noise distribution within the test data differs significantly from that of the training data used for the existing model due to differences in noise environments, we can also use the self-supervised training strategy to obtain good denoising results.

In exceptional cases where the training data cannot reflect the noise distribution in the test data, a conventional training dataset becomes ineffective, and most deep learning-based algorithms struggle to achieve satisfactory results. However, the algorithm proposed in this paper possesses self-supervised properties, requiring only a single 3D seismic dataset as input. In this scenario, the algorithm can be treated as a black box, utilizing the intrinsic prior knowledge of the data for self-supervised learning and ultimately outputting the corresponding denoised results, without requiring consideration of its internal mechanisms.

It should be emphasized that the above scenario is an extreme stress test for the model. When the training set completely fails to describe the underlying features of the test data, i.e., when the training dataset fails, most supervised learning-based methods will also fail. However, our method can still produce usable results at the cost of increased computational time. In regular situations, as described earlier, directly applying a trained model for seismic data denoising is sufficient to satisfy most practical needs.

5. Conclusions

In this paper, we propose a 3D seismic noise suppression algorithm called SSDn-DDPM, which requires only a single 3D seismic volume for blind denoising. The proposed algorithm employs a self-supervised noise estimation model to predict the 3D seismic noise distribution. The predicted noise distribution is then fitted to a noise model and incorporated as a conditional input into the diffusion Markov chain via KDE matching. Furthermore, to fully leverage the unique spatial structure inherent in 3D seismic data, the algorithm incorporates a specialized sampling strategy, effectively utilizing structural information to estimate the unstructured noise distribution more accurately. Ultimately, the proposed algorithm integrates statistics-based denoising theory into a diffusion generative model, achieving self-supervised noise suppression and effectively recovering subsurface structural details.

To fully demonstrate the effectiveness of the proposed method, we categorize seismic denoising methods based on their fundamental principles and select representative algorithms from each category for comprehensive comparison experiments. The experimental data include synthetic seismic data with varying noise intensities and two real 3D seismic datasets. Experimental results demonstrate that the proposed method outperforms competing methods in both noise removal and structural detail recovery for 3D seismic data, even without prior knowledge of the noise distribution or access to ground truth. This highlights the feasibility and broad research potential of applying diffusion probabilistic models to self-supervised denoising of 3D seismic data.

Author Contributions

All authors made significant contributions to this work. Conceptualization, Z.Z. and G.Q.; Data curation, Y.W. and J.S.; Formal analysis, Y.W. and J.S.; Funding acquisition, G.Q.; Investigation, Z.Z., Y.W. and J.S.; Methodology, Z.Z. and G.Q.; Project administration, G.Q. and Y.L.; Resources, G.Q.; Software, Z.Z. and Y.L.; Supervision, G.Q. and M.S.; Validation, Z.Z., G.Q. and M.S.; Visualization, Z.Z., Y.W. and J.S.; Writing—original draft, Z.Z.; Writing—review and editing, Z.Z., G.Q., Y.L. and M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Jilin Provincial Natural Science Foundation under grant number 20210101166JC and Program of Science and Technology Development Plan of Jilin Province of China under grant number 20240101374JC.

Data Availability Statement

The experimental data used in this research, CigKarst, Stratton 3D and F3 Netherlands, are publicly available. The synthetic seismic dataset CigKarst is available at https://zenodo.org/records/4285733, accessed on 15 March 2025. The real seismic dataset Stratton 3D is available at https://wiki.seg.org/wiki/Stratton_3D_survey, accessed on 15 March 2025. The real seismic dataset F3 Netherlands is available at https://terranubis.com/datainfo/F3-Demo-2020, accessed on 15 March 2025.

Conflicts of Interest

The authors declare no conflicts of interest.

Ulyanov, D.; Vedaldi, A.; Lempitsky, V. Deep image prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9446–9454. [Google Scholar] [CrossRef]
Mao, X.; Shen, C.; Yang, Y.B. Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Volume 29. [Google Scholar] [CrossRef]
Brown, A.R. Interpretation of Three-Dimensional Seismic Data; Society of Exploration Geophysicists: Houston, TX, USA; American Association of Petroleum Geologists: Tulsa, OK, USA, 2011. [Google Scholar]
Yilmaz, O. Seismic Data Analysis: Processing, Inversion and Interpretation of Seismic Data; Society of Exploration Geophysicists: Houston, TX, USA, 2001; Volume 463. [Google Scholar]
Sohl-Dickstein, J.; Weiss, E.; Maheswaranathan, N.; Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. In Proceedings of the International Conference on Machine Learning, Lille, France, 7–9 July 2015; pp. 2256–2265. [Google Scholar] [CrossRef]
Ramesh, A.; Dhariwal, P.; Nichol, A.; Chu, C.; Chen, M. Hierarchical text-conditional image generation with clip latents. arXiv 2022, arXiv:2204.06125. [Google Scholar] [CrossRef]
Vaswani, A. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar] [CrossRef]
Wang, H.; Chen, Y.; Saad, O.M.; Chen, W.; Oboué, Y.A.S.I.; Yang, L.; Fomel, S.; Chen, Y. A MATLAB code package for 2D/3D local slope estimation and structural filtering. Geophysics 2022, 87, F1–F14. [Google Scholar] [CrossRef]
Wu, X.; Yan, S.; Qi, J.; Zeng, H. Deep learning for characterizing paleokarst collapse features in 3-D seismic images. J. Geophys. Res. Solid Earth 2020, 125, e2020JB019685. [Google Scholar] [CrossRef]
Wu, X.; Yan, S.; Qi, J.; Zeng, H. cigKast: A data of 3D synthetic seismic volumes with labeled paleokarsts for deep-learning-based paleokarst interpretation. JGR Solid Earth 2020, 125, e2020JB019685. [Google Scholar] [CrossRef]

Figure 1. Architecture of SSDn-DDPM. (a) The network architectures used in the noise estimation model and the diffusion model. (b) Illustration of the 3D serpentine sampling. (c) Illustration of the diffusion model training process.

Figure 2. Illustration of the forward and reverse processes of SSDn-DDPM.

Figure 3. Noiseless original synthetic seismic record.

Figure 4. Denoising comparison for synthetic seismic records: (a–h) show denoising results under light noise contamination, (i–p) under moderate noise contamination, and (q–x) under heavy noise contamination. Within each noise level, the panels show, in order, the noisy data followed by the denoised results of DRR, SOSVMF, SGK, SeisGAN, DIP, PCADDPM, and SSDn-DDPM.

Figure 5. Seismic single-trace comparison for synthetic records with (a) light, (b) moderate, and (c) heavy noise contamination.

Figure 6. Comparison of removed noise in synthetic seismic records with light noise contamination. The removed noise corresponds to (a) the original record, (b) DRR, (c) SOSVMF, (d) SGK, (e) SeisGAN, (f) DIP, (g) PCADDPM, and (h) SSDn-DDPM.

Figure 7. Local similarity maps of synthetic seismic records with moderate noise contamination using (a) DRR, (b) SOSVMF, (c) SGK, (d) SeisGAN, (e) DIP, (f) PCADDPM, and (g) SSDn-DDPM.

Figure 8. Enlarged denoising comparison of synthetic seismic inline sections with heavy noise contamination. (a) The original record, (b) noisy data, and denoised data corresponding to (c) DRR, (d) SOSVMF, (e) SGK, (f) SeisGAN, (g) DIP, (h) PCADDPM, and (i) SSDn-DDPM. In each subfigure, the left panel shows the full section, the top-right panel shows the enlarged details of the red-boxed region, and the bottom-right panel shows the enlarged details of the blue-boxed region. Green arrows highlight regions with significant feature differences.

Figure 9. Denoising comparison of synthetic seismic horizontal slices with heavy noise contamination. (a) The original record, (b) noisy data, and denoised data corresponding to (c) DRR, (d) SOSVMF, (e) SGK, (f) SeisGAN, (g) DIP, (h) PCADDPM, and (i) SSDn-DDPM. Green arrows highlight regions with significant feature differences.

Figure 10. Denoising comparison in the Stratton 3D seismic data. (a) The original record, and denoised data corresponding to (b) DRR, (c) SOSVMF, (d) SGK, (e) SeisGAN, (f) DIP, (g) PCADDPM, and (h) SSDn-DDPM.

Figure 11. Comparison of removed noise in the Stratton 3D seismic data. The removed noise corresponds to (a) DRR, (b) SOSVMF, (c) SGK, (d) SeisGAN, (e) DIP, (f) PCADDPM, and (g) SSDn-DDPM.

Figure 12. Denoising comparison in the F3 Netherlands seismic data. (a) The original record, and denoised data corresponding to (b) DRR, (c) SOSVMF, (d) SGK, (e) SeisGAN, (f) DIP, (g) PCADDPM, and (h) SSDn-DDPM.

Figure 13. Comparison of removed noise in the F3 Netherlands seismic data. The removed noise corresponds to (a) DRR, (b) SOSVMF, (c) SGK, (d) SeisGAN, (e) DIP, (f) PCADDPM, and (g) SSDn-DDPM.

Figure 14. Local similarity maps of the F3 Netherlands seismic data using (a) DRR, (b) SOSVMF, (c) SGK, (d) SeisGAN, (e) DIP, (f) PCADDPM, and (g) SSDn-DDPM.

Figure 15. Enlarged denoising comparison of the F3 Netherlands seismic data. (a) The original record, and denoised data corresponding to (b) DRR, (c) SOSVMF, (d) SGK, (e) SeisGAN, (f) DIP, (g) PCADDPM, and (h) SSDn-DDPM. In each subfigure, the left panel shows the full section, the top-right panel shows the enlarged details of the red-boxed region, and the bottom-right panel shows the enlarged details of the blue-boxed region. Green arrows highlight regions with significant feature differences.

Figure 16. Illustration of noise schedule differences. (a) Linear noise schedule. (b) Parameter-freezing noise schedule.

Figure 17. Denoised results with different noise schedules. (a) Denoised result using the parameter-freezing noise schedule. (b) Denoised result using the linear noise schedule. (c) Denoising residual using the parameter-freezing noise schedule. (d) Denoising residual using the linear noise schedule. Green arrows highlight regions with significant feature differences.

Figure 18. Illustration of the matching degree between noise-added data sampled from different states in the Markov chain and the noisy input. (a) Data generated from the matched state, specifically the 165th state. (b–e) Data generated from the 300th, 400th, 500th, and 1000th states, respectively. (f) True noisy input. Closer matches indicate better results.

Figure 19. Ablation experiment removing the seismic noise estimation model. (a) Denoising result obtained with the seismic noise estimation model, sampling from the matched state. (b–e) Denoising results obtained after removing the seismic noise estimation model, sampling from the 300th, 400th, 500th, and 1000th states, respectively. (f) Noiseless ground truth.

Figure 20. Ablation experiment replacing 3D serpentine sampling with random sampling and linear-scanning sampling. (I) CigKarst results with moderate noise contamination. (II) Stratton 3D results. (III) F3 Netherlands results. (a) Original record. (b) Noisy data. (c) Denoising results using random sampling. (d) Denoising results using linear-scanning sampling. (e) Denoising results using 3D serpentine sampling. Each subfigure, from top left to bottom right, includes: the full section, enlarged details of the red-boxed region, enlarged details of the blue-boxed region, and the residual corresponding to the full section. Green arrows highlight regions with significant feature differences.

Figure 21. Ablation experiment removing the diffusion model. (I) CigKarst results with moderate noise contamination. (II) Stratton 3D results. (III) F3 Netherlands results. (a) Noisy data. (b) Denoising results using only the noise distribution estimation model. (c) Complete SSDn-DDPM denoising results. Each subfigure includes, from top left to bottom right, the full section, enlarged details of the red-boxed region, enlarged details of the blue-boxed region, and the residual corresponding to the full section. Green arrows highlight regions with significant feature differences.

Figure 22. Denoised results of the simple numerical model. (a) Simple numerical model. (b) Noisy simple numerical model. (c) Denoised results of the simple numerical model.

Figure 23. Denoised results under different types of noise contamination. (I) Noisy data. (II) Denoised results. (III) Ground truth of noise. (IV) Denoising residual. (a) Impulse noise. (b) Regional noise. (c) Random trace noise. (d) Mixed noise.

Figure 24. Denoising results for data with surface wave interference. (a) Pre-stack data. (b) Surface wave interference. (c) Pre-stack data affected by surface wave interference. (d) Denoised pre-stack data affected by surface wave interference. (e) Surface wave and random noise interference. (f) Pre-stack data affected by surface wave and random noise interference. (g) Denoised pre-stack data affected by surface wave and random noise interference.

Figure 25. Denoised results of the pre-stack and post-stack numerical models. (a) Post-stack model. (b) Noisy post-stack model. (c) Denoised result of the noisy post-stack model. (d) Denoising residual of the noisy post-stack model. (e) Pre-stack model. (f) Noisy pre-stack model. (g) Denoised result of the noisy pre-stack model. (h) Denoising residual of the noisy pre-stack model.

Figure 26. Denoised results of real pre-stack data. (a) Real pre-stack data. (b) Denoised result. (c) Denoising residual.

Figure 27. Comparison of denoising results for model generalization. (I) Pre-stack data test results. (II) Post-stack data test results. (III) CigKarst test results. (a) Training data. (b) Test data. (c) Ground truth of test data. (d) Denoised results under normal usage of the algorithm. (e) Denoised results under generalized usage of the algorithm.

Table 1. Model parameters.

	Noise Prediction Model	Denoising Diffusion Model
Sampling block size	$120 \times 120 \times 128$	$120 \times 120 \times 128$
Sampling stride	$1 \times 1 \times 32$	$1 \times 1 \times 32$
Channel multiplier	1, 2, 4, 8, 8	1, 2, 4, 8, 8
Activation	Swish	Swish
Batch size	32	32
Optimizer	Adam	Adam
Learning rate	1 × 10⁻⁴	1 × 10⁻⁴
Diffusion steps	-	1000
Noise schedule	-	(5 × 10⁻⁵, 1 × 10⁻²)
Noise schedule freezing ratio	-	0.2
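
As a rough illustration of the diffusion settings in Table 1, the sketch below builds a linear variance schedule from the listed endpoints (5 × 10⁻⁵, 1 × 10⁻²) over 1000 steps and applies the standard DDPM closed-form forward step. The function names are hypothetical, and reading the two listed values as the first and last β of a linear schedule is an assumption. The parameter-freezing variant (freezing ratio 0.2) is not reproduced here because the table does not define it.

```python
import numpy as np

def linear_beta_schedule(num_steps=1000, beta_start=5e-5, beta_end=1e-2):
    """Linearly spaced variances beta_1..beta_T (endpoints assumed from Table 1)."""
    return np.linspace(beta_start, beta_end, num_steps, dtype=np.float64)

def cumulative_alpha(betas):
    """alpha_bar_t = prod_{s<=t} (1 - beta_s): the fraction of signal variance kept."""
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Draw x_t ~ q(x_t | x_0) in closed form for a 0-based step index t."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

betas = linear_beta_schedule()
alpha_bar = cumulative_alpha(betas)

# Hypothetical toy block standing in for a seismic sub-volume.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 4, 8))
x_mid = forward_diffuse(x0, 499, alpha_bar, rng)
```

Because `alpha_bar` decreases monotonically, later states carry less of the original signal, which is why the state from which reverse sampling starts matters for both quality and run time.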

Table 2. SNR, PSNR, SSIM, MSE, and CS results for synthetic seismic data with light noise contamination. [SNR (dB), PSNR (dB)].

	SNR	PSNR	SSIM	MSE	CS
Noisy	4.846	24.038	0.749	0.640	0.867
DRR [24,25]	11.160	30.352	0.951	0.149	0.961
SOSVMF [2,68]	13.346	32.538	0.960	0.090	0.976
SGK [20]	11.600	30.792	0.927	0.135	0.967
SeisGAN [40]	8.119	27.070	0.913	0.318	0.917
DIP [60]	14.612	33.803	0.979	0.067	0.982
PCADDPM [48]	13.595	26.580	0.942	0.131	0.980
SSDn-DDPM (Ours)	16.186	35.377	0.983	0.047	0.988
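
For readers who want to compute comparable scores on their own data, the following is a minimal sketch of four of the reported metrics using common textbook definitions. The exact formulas and normalization used for Tables 2–4 are not restated in this section, so the peak value in PSNR and the reading of CS as cosine similarity are assumptions, and the function names are hypothetical. SSIM is omitted because it requires a windowed implementation.

```python
import numpy as np

def mse(clean, est):
    """Mean squared error between the reference and the estimate."""
    return float(np.mean((clean - est) ** 2))

def snr_db(clean, est):
    """Signal-to-noise ratio in dB: reference energy over residual energy."""
    return float(10.0 * np.log10(np.sum(clean ** 2) / np.sum((clean - est) ** 2)))

def psnr_db(clean, est):
    """Peak SNR in dB. Assumption: the peak is the largest absolute reference amplitude."""
    peak = np.max(np.abs(clean))
    return float(10.0 * np.log10(peak ** 2 / mse(clean, est)))

def cosine_similarity(clean, est):
    """Cosine of the angle between the flattened volumes (assumed meaning of CS)."""
    a, b = clean.ravel(), est.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy volume and a slightly perturbed estimate.
rng = np.random.default_rng(0)
clean = rng.standard_normal((8, 8, 16))
est = clean + 0.1 * rng.standard_normal(clean.shape)
scores = {
    "SNR": snr_db(clean, est),
    "PSNR": psnr_db(clean, est),
    "MSE": mse(clean, est),
    "CS": cosine_similarity(clean, est),
}
```

All four functions require a noiseless reference, so they apply to the synthetic records only. The field data comparisons rely on removed-noise sections and local similarity maps instead.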

Table 3. SNR, PSNR, SSIM, MSE, and CS results for synthetic seismic data with moderate noise contamination. [SNR (dB), PSNR (dB)].

	SNR	PSNR	SSIM	MSE	CS
Noisy	−1.170	20.895	0.523	2.560	0.658
DRR [24,25]	10.576	31.248	0.940	0.171	0.955
SOSVMF [2,68]	9.613	21.760	0.876	0.213	0.945
SGK [20]	10.143	29.348	0.917	0.189	0.952
SeisGAN [40]	6.171	23.398	0.837	0.820	0.801
DIP [60]	12.843	32.046	0.955	0.101	0.973
PCADDPM [48]	9.698	28.454	0.890	0.752	0.951
SSDn-DDPM (Ours)	14.030	33.445	0.973	0.077	0.980

Table 4. SNR, PSNR, SSIM, MSE, and CS results for synthetic seismic data with heavy noise contamination. [SNR (dB), PSNR (dB)].

	SNR	PSNR	SSIM	MSE	CS
Noisy	−5.388	17.523	0.356	6.763	0.473
DRR [24,25]	9.590	29.822	0.924	0.214	0.943
SOSVMF [2,68]	4.268	22.619	0.594	0.731	0.833
SGK [20]	7.845	27.500	0.853	0.321	0.915
SeisGAN [40]	3.160	21.145	0.724	1.820	0.673
DIP [60]	10.200	29.391	0.905	0.186	0.951
PCADDPM [48]	6.564	23.578	0.766	2.995	0.901
SSDn-DDPM (Ours)	10.414	30.063	0.931	0.177	0.954

Table 5. Effect of horizontal sampling stride on denoising results. [SNR (dB), PSNR (dB)].

	$8 \times 8$	$4 \times 4$	$2 \times 2$	$1 \times 1$
SNR	10.225	11.609	13.422	14.030
PSNR	25.701	28.907	31.560	33.445

Table 6. Effect of diffusion steps on denoising results. [SNR (dB), PSNR (dB)].

	Noisy	200	400	600	800	1000
SNR	0.099	12.203	13.414	14.054	14.291	14.276
PSNR	15.816	26.222	27.821	28.922	29.247	29.371

Table 7. Processing time comparison of denoising methods. [Time (s)].

	SOSVMF	SGK	DRR	SeisGAN	DIP	PCADDPM (2D)	SSDn-DDPM
Average Processing Time	70,528.14	2113.11	9841.07	225.62	233.96	11.33	1638.81

Table 8. Memory consumption comparison of denoising methods. [MiB].

	SOSVMF	SGK	DRR	SeisGAN	DIP	PCADDPM (2D)	SSDn-DDPM
Average Memory Consumption	389.2	819.2	1758.4	213.4	11,889.3	67.6	1790.9

Table 9. Sampling times when starting from different states. [Time (s)].

	165th (Matched State)	500th	1000th
Time	3.017	7.649	14.612
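
The idea of starting the reverse process from a matched state can be sketched as follows. This is not the paper's matching procedure, which is not given in this section. It is one common rule, assuming the noisy input behaves like y = x₀ + σn with unit-scale signal, so that scaling y by √ᾱ_t reproduces the forward state x_t when σ = √((1 − ᾱ_t)/ᾱ_t). The schedule endpoints come from Table 1, while `match_state` and `sigma_hat` are hypothetical names and values.

```python
import numpy as np

# Assumed linear schedule with the endpoints and step count listed in Table 1.
betas = np.linspace(5e-5, 1e-2, 1000)
alpha_bar = np.cumprod(1.0 - betas)

def match_state(sigma, alpha_bar):
    """Return the 0-based state whose noise-to-signal ratio is closest to sigma."""
    # Noise-to-signal amplitude ratio of every state in the Markov chain.
    ratio = np.sqrt((1.0 - alpha_bar) / alpha_bar)
    return int(np.argmin(np.abs(ratio - sigma)))

sigma_hat = 0.5  # hypothetical noise standard deviation from a noise estimator
t_start = match_state(sigma_hat, alpha_bar)
print(f"start reverse sampling from state {t_start + 1} of {len(betas)}")
```

Starting from an earlier matched state shortens the reverse chain, which is consistent with Table 9: sampling from the 165th state takes 3.017 s, against 14.612 s from the 1000th.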

Table 10. SNR, PSNR, SSIM, MSE, and CS results for different sampling methods. [SNR (dB), PSNR (dB)].

	Noise Level	SNR	PSNR	SSIM	MSE	CS
Noisy	Light	4.846	24.038	0.749	0.640	0.867
	Moderate	−1.170	20.895	0.523	2.560	0.658
	Heavy	−5.388	17.523	0.356	6.763	0.473
Random Sampling	Light	6.105	25.297	0.894	0.479	0.873
	Moderate	3.683	22.874	0.792	0.837	0.759
	Heavy	2.872	22.064	0.694	1.009	0.701
Linear-Scanning Sampling	Light	7.405	26.597	0.922	0.355	0.907
	Moderate	6.236	25.464	0.906	0.465	0.880
	Heavy	9.145	29.451	0.911	0.238	0.938
3D Serpentine Sampling	Light	16.186	35.377	0.983	0.047	0.988
	Moderate	14.030	33.445	0.973	0.077	0.980
	Heavy	10.414	30.063	0.931	0.177	0.954

Table 11. SNR, PSNR, SSIM, MSE, and CS results w/o the diffusion model. [SNR (dB), PSNR (dB)].

	Noise Level	SNR	PSNR	SSIM	MSE	CS
Noisy	Light	4.846	24.038	0.749	0.640	0.867
	Moderate	−1.170	20.895	0.523	2.560	0.658
	Heavy	−5.388	17.523	0.356	6.763	0.473
w/o Diffusion Model	Light	10.611	32.239	0.906	0.089	0.973
	Moderate	9.466	31.528	0.871	0.115	0.967
	Heavy	6.479	28.435	0.739	0.230	0.937
Complete SSDn-DDPM	Light	16.186	35.377	0.983	0.047	0.988
	Moderate	14.030	33.445	0.973	0.077	0.980
	Heavy	10.414	30.063	0.931	0.177	0.954

Table 12. Denoising results of post-stack and pre-stack data. [SNR (dB), PSNR (dB)].

	Post-Stack Data		Pre-Stack Data
	Noisy	Denoised	Noisy	Denoised
SNR	−0.539	24.551	−0.596	24.080
PSNR	17.818	39.344	17.783	38.827

Table 13. Comparison of denoising results between generalized and conventional model applications. [SNR (dB), PSNR (dB), SSIM].

Training Data	Test Data: Post-Stack Data			Test Data: Pre-Stack Data
	SNR	PSNR	SSIM	SNR	PSNR	SSIM
Post-Stack Data	24.551	39.344	0.971	23.853	37.526	0.959
Pre-Stack Data	24.855	39.386	0.975	24.080	38.827	0.963

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

