Diffusion Model for DAS-VSP Data Denoising

Zhu, Donglin; Fu, Lei; Kazei, Vladimir; Li, Weichang

doi:10.3390/s23208619

Aramco Americas—Houston Research Center, Houston, TX 77084, USA

* Author to whom correspondence should be addressed.

Sensors 2023, 23(20), 8619; https://doi.org/10.3390/s23208619

Submission received: 20 September 2023 / Revised: 5 October 2023 / Accepted: 19 October 2023 / Published: 21 October 2023

(This article belongs to the Special Issue Distributed Acoustic Sensing and Sensors)

Abstract

Distributed acoustic sensing (DAS) has emerged as a transformational technology for seismic data acquisition. However, noise remains a major impediment, necessitating advanced denoising techniques. This study pioneers the application of diffusion models, a type of generative model, to the denoising of DAS vertical seismic profile (VSP) data. The diffusion network is trained on a newly generated synthetic dataset that accommodates variations in the acquisition parameters. The trained model is applied to suppress noise in synthetic and field DAS-VSP data. The results demonstrate the model’s effectiveness in removing various noise types with minimal signal leakage, outperforming conventional methods. This research highlights the potential of diffusion models for DAS processing.

Keywords:

distributed acoustic sensing (DAS); vertical seismic profiling (VSP); denoising; diffusion model

1. Introduction

Distributed acoustic sensing (DAS) has seen widespread adoption as an emerging technology for seismic signal recording, presenting notable advantages over conventional geophones [1]. Factors such as reduced costs, a wider detection range, and enhanced spatial resolution render DAS an attractive choice in modern seismic studies [2,3]. Despite these benefits, the use of DAS, and of DAS vertical seismic profile (VSP) data in particular, introduces its own set of challenges.

One of the predominant challenges encountered in DAS-VSP data processing is the presence of noise from a diverse range of sources. Environmental noise, horizontal and fading noise, poor coupling, and flow interference constitute significant impediments that can severely impact the accuracy and efficacy of subsequent imaging, interpretation, and analysis of seismic data [1,4]. The mitigation of these noise types and the improvement of the signal-to-noise ratio (SNR) have thus become critical aspects of DAS-VSP data processing, and the development and application of efficient denoising methods represents a crucial research area in contemporary seismic studies. Here, we focus on the elimination of “zigzag” noise [5], which is typical for DAS data and mainly attributed to coupling (Figure 1).

Various denoising methods have been deployed for traditional seismic data processing. Techniques such as the wavelet transform [6], band-pass filtering [7], and f-x deconvolution [8] have demonstrated considerable potential in enhancing the SNR in the presence of random noise. For high-amplitude erratic noise such as swell noise, robust singular spectrum analysis [9] and its adaptive version [10] recover signals from the low-rank components of the transformed data while removing erratic noise components via soft-thresholding. Horizontal noise in DAS data is typically addressed by stacking all or a significant portion of the DAS record and subtracting this estimate from each trace in the record [5,11,12]. However, the efficacy of these conventional methods varies considerably with the type of noise. In particular, they have shown limited success in dealing with fading noise and poor coupling [4,5,13]. Poor coupling in DAS data is known to lead to zigzag noise in correlated data [5], which is the noise type targeted in this study (Figure 1).
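The stack-and-subtract treatment of horizontal noise mentioned above can be illustrated with a minimal NumPy sketch. It assumes a record stored as an array of shape (samples, traces); the function name `remove_horizontal_noise` and the use of the median as the stack are assumptions made for this illustration and are not taken from [5,11,12].

```python
import numpy as np

def remove_horizontal_noise(record, use_median=True):
    """Subtract a common-mode (horizontal) noise estimate from every trace.

    record: 2D array of shape (n_samples, n_traces).
    The stack over traces at each time sample is the noise estimate.
    """
    record = np.asarray(record, dtype=float)
    if use_median:
        stack = np.median(record, axis=1)
    else:
        stack = np.mean(record, axis=1)
    return record - stack[:, None]

# Toy record: identical noise on all traces plus one isolated event.
rng = np.random.default_rng(0)
n_samples, n_traces = 200, 64
common = rng.standard_normal(n_samples)
record = np.tile(common[:, None], (1, n_traces))
record[50, 10] += 5.0  # event present on a single trace only
cleaned = remove_horizontal_noise(record)
```

In this toy case the median stack equals the common noise, so the isolated event survives the subtraction. On real data, events that are nearly flat across traces would also be attenuated, which is one reason such methods depend on the noise type.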

The limitations of these conventional methods necessitate the exploration of alternative, more advanced strategies. The rise of deep learning has presented new opportunities in this context, with convolutional neural networks (CNNs) showing remarkable potential in seismic denoising applications [14,15,16]. CNN-based denoising methods use deep architectures and data-driven training to handle the intricacies of seismic noise. More recent work [17] combining a physics surrogate constraint with a CNN-type architecture has demonstrated effectiveness in suppressing severe coherent noise such as ground roll. CNN-based denoising methods have also proven capable of handling complex noise in DAS data [18,19].

However, CNN-based methods are not without limitations. Most ML-based methods rely on supervised learning, which means that ground truth is needed for training (Figure 1). The specific noise types in DAS-VSP data are hard to simulate. One way to obtain noise data is to extract it from field data recorded before the first signal arrivals. However, when the noise is not easily separated from the signal in space and time, the extracted noise inevitably contains seismic signal. Moreover, when denoised field data derived from conventional methods serve as training datasets, the CNN models may inherit the limitations of these methods, making it difficult for the model to exceed the performance of the techniques used to create the training data.

Given these constraints, researchers have begun exploring alternative deep learning strategies, particularly generative models such as diffusion models [20]. Diffusion models present several advantages over other generative models, like generative adversarial networks (GANs). They are simpler to train and do not suffer from issues like mode collapse or the generation of low-quality outputs, which are common in GANs [21,22]. Durall et al. [23] demonstrated the capability of diffusion models for seismic processing from demultiple to interpolation. However, despite their potential, diffusion models remain underexplored in the context of DAS-VSP processing.

This study introduces the use of diffusion models for DAS-VSP noise suppression, presenting a pioneering approach in this domain. First, we generate a volume of synthetic DAS-VSP training data using forward modeling. We manipulate various parameters in this process, such as source location and the main frequency of the wavelet, creating a rich, diverse dataset for training the diffusion model.

Subsequently, we apply the trained diffusion model to suppress specific types of noise present in DAS-VSP data. The results from both the synthetic and field experiments show that our proposed workflow effectively suppresses various types of noise with minimal impact on the effective DAS signals. Furthermore, the diffusion model tolerates a wider variety of noise types, demonstrating its robustness and versatility.

This study positions diffusion models as a promising architecture for future research in DAS-VSP noise suppression. By overcoming the limitations of conventional methods and the restrictions of existing CNN-based techniques, the use of diffusion models represents a significant step forward in seismic data processing. As we continue to refine this method and expand its applications, we expect to see a substantial improvement in the quality of DAS-VSP data, contributing significantly to the broader field of seismic studies.

2. Methods

This study proposes a denoising workflow for DAS-VSP data leveraging the power of diffusion models [20], more specifically, the denoising diffusion probabilistic models (DDPMs) [24]. Diffusion models use a Markov chain, which progressively transmutes one distribution into another, an idea employed in non-equilibrium statistical physics [25] and sequential Monte Carlo [26].

In this context, a DDPM is a parameterized Markov chain, trained through variational inference, to generate samples which, after a finite time, match the data. The transitions of this chain are designed to reverse a diffusion process, that is, a Markov chain that adds noise to the data in small increments until the original signal is completely masked. When the diffusion consists of small amounts of Gaussian noise, the transitions of the reverse chain can also be taken as conditional Gaussians, which allows for a notably straightforward neural network parameterization.

The process is illustrated in Figure 2 and can be divided into two parts: the forward process and the reverse process. We further extend this by including a section on the conditional diffusion models.

2.1. Forward Process

The forward process has no learnable parameters. It starts from a sample of the real data distribution, $x_0 \sim q(x_0)$, and adds a small amount of Gaussian noise at each of $T$ steps, so that the distribution of the final sample approaches an isotropic Gaussian as the number of steps increases. This process results in a sequence of samples $x_1, x_2, \dots, x_T$, each perturbed with additional noise. The mean and variance of the noise are governed by a factor $\beta_t$ within $[0, 1]$ that follows an ascending sequence, $\beta_1 < \beta_2 < \dots < \beta_T$, indicating an increment in the noise added over time. The forward process $q(x_t \mid x_{t-1})$ is defined by:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t; \sqrt{1-\beta_t}\,x_{t-1}, \beta_t I\right), \tag{1}$$

where $\mathcal{N}$ denotes the normal distribution and $I$ represents the identity matrix. Following the Markov chain principle, the joint probability distribution of $x_{1:T}$ given $x_0$ is:

$$q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1}). \tag{2}$$
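To make Equations (1) and (2) concrete, the following NumPy sketch applies the forward transition step by step. The linear schedule and its endpoints (1e-4 and 0.02) are assumptions borrowed from common DDPM practice, not values confirmed by this article, and the function names are illustrative.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Ascending noise schedule beta_1 < ... < beta_T (assumed endpoints)."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_step(x_prev, beta_t, rng):
    """Draw x_t from q(x_t | x_{t-1}) as in Equation (1)."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

def forward_chain(x0, betas, rng):
    """Apply Equation (1) for t = 1..T, following the product in Equation (2)."""
    x = x0
    for beta_t in betas:
        x = forward_step(x, beta_t, rng)
    return x

rng = np.random.default_rng(0)
x0 = np.full((64, 64), 3.0)          # a deliberately non-Gaussian "image"
betas = linear_beta_schedule(1000)
xT = forward_chain(x0, betas, rng)   # close to N(0, I) after T steps
```

After 1000 steps the constant input is essentially replaced by zero-mean, unit-variance noise, which is the behaviour the forward process is designed to have.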

2.2. Reverse Process

The reverse process of the diffusion model is essentially the denoising process. In this process, we aim to recover the original distribution of $x_0$ from the standard Gaussian distribution $x_T \sim \mathcal{N}(0, I)$. If $q(x_{t-1} \mid x_t)$ can be obtained step by step, it is still a Gaussian distribution, assuming that $q(x_t \mid x_{t-1})$ follows a Gaussian distribution and $\beta_t$ is small enough [27]. Every step of the reverse process $p_\theta(x_{t-1} \mid x_t)$ can be defined by:

$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1}; \mu_\theta(x_t, t), \Sigma_\theta(x_t, t)\right), \tag{3}$$

where $\theta$ represents the parameters learned by the neural network. The joint distribution of the full reverse process is:

$$p_\theta(x_{0:T}) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t). \tag{4}$$

The neural network used in the reverse process to model the distribution $p_\theta(x_{t-1} \mid x_t)$ is the U-Net [28], a classic choice for image processing that preserves the image shape and combines the convolutional layers of a standard autoencoder architecture with skip connections to regularize the outputs. The U-Net is enhanced with Resblocks [29], which build a differencing capability directly into the network and thus let it focus on the noise, and with attention layers [30], which widen the receptive field of the network. Adding Resblocks and attention mechanisms to the U-Net leads to improved performance by allowing the training of deeper models, improving feature learning, and enhancing focus on the most relevant parts of the input [23,24,28].
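A single reverse step of Equation (3) can be sketched as follows. The sketch assumes the noise-prediction form of the mean and the fixed variance $\beta_t I$ used in the DDPM paper [24]; neither is spelled out in this section. The `oracle_eps` function is a hypothetical stand-in for the trained U-Net that cheats by using the known clean sample, purely so that the loop can be run and checked.

```python
import numpy as np

def reverse_step(x_t, t, eps_pred, betas, alpha_bars, rng):
    """One ancestral sampling step x_t -> x_{t-1} (t is a 0-based index).

    Assumes mean = (x_t - beta_t / sqrt(1 - abar_t) * eps) / sqrt(alpha_t)
    and variance beta_t * I, as in the DDPM formulation.
    """
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    mean = (x_t - beta_t / np.sqrt(1.0 - alpha_bars[t]) * eps_pred) / np.sqrt(alpha_t)
    if t == 0:
        return mean  # no noise is added on the final step
    return mean + np.sqrt(beta_t) * rng.standard_normal(x_t.shape)

def oracle_eps(x_t, t, x0, alpha_bars):
    """Hypothetical stand-in for the U-Net: exact noise given a known x0."""
    return (x_t - np.sqrt(alpha_bars[t]) * x0) / np.sqrt(1.0 - alpha_bars[t])

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 50)      # assumed schedule, short for the demo
alpha_bars = np.cumprod(1.0 - betas)
x0 = rng.standard_normal((8, 8))         # "clean" sample known only to the oracle
x = rng.standard_normal((8, 8))          # x_T ~ N(0, I)
for t in reversed(range(len(betas))):
    eps_pred = oracle_eps(x, t, x0, alpha_bars)
    x = reverse_step(x, t, eps_pred, betas, alpha_bars, rng)
```

With a perfect noise predictor the chain ends exactly at the clean sample; a trained network only approximates this, which is where denoising errors and signal leakage can enter.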

2.3. Conditional Diffusion Models

As a significant extension of the DDPM, conditional diffusion models [31,32] have been introduced to guide the diffusion process using a conditioning variable, $y$. The forward process is written with $y$ as an additional argument, while the Gaussian transition keeps the same form as in Equation (1):

$$q(x_t \mid x_{t-1}, y) = \mathcal{N}\left(x_t; \sqrt{1-\beta_t}\,x_{t-1}, \beta_t I\right). \tag{5}$$

Similarly, the reverse process now also takes the conditioning variable into account:

$$p_\theta(x_{t-1} \mid x_t, y) = \mathcal{N}\left(x_{t-1}; \mu_\theta(x_t, y, t), \Sigma_\theta(x_t, y, t)\right). \tag{6}$$

2.4. Training and Loss Function

The diffusion model is trained, and the corresponding parameters $\theta$ obtained, by minimizing the negative log-likelihood:

$$L = \mathbb{E}_{q(x_0)}\left[-\log p_\theta(x_0)\right]. \tag{7}$$

After applying the variational lower bound (VLB) [33,34] and the reparameterization trick [34], the loss function simplifies to:

$$L_t = \mathbb{E}_{t, x_0, \epsilon}\left[\left\| \epsilon - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right) \right\|^2\right], \tag{8}$$

where $\epsilon_\theta$ is the noise predicted by the network, $\epsilon$ represents the Gaussian noise, $\alpha_t = 1 - \beta_t$, and $\bar{\alpha}_t = \prod_{i=1}^{t} \alpha_i$.
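The argument of $\epsilon_\theta$ in Equation (8) is a sample of $x_t$ drawn directly from $x_0$. As a brief worked step, and assuming independent standard Gaussian noise terms $\epsilon_t$ at each step (with $\bar{\epsilon}_{t-1}$ denoting their merged, still standard Gaussian, combination), repeated substitution of Equation (1) gives:

```latex
\begin{aligned}
x_t &= \sqrt{\alpha_t}\,x_{t-1} + \sqrt{1-\alpha_t}\,\epsilon_t \\
    &= \sqrt{\alpha_t \alpha_{t-1}}\,x_{t-2}
       + \sqrt{1-\alpha_t \alpha_{t-1}}\,\bar{\epsilon}_{t-1} \\
    &\;\;\vdots \\
    &= \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,
       \qquad \epsilon \sim \mathcal{N}(0, I).
\end{aligned}
```

The second line uses the fact that the sum of two independent zero-mean Gaussians with variances $\alpha_t(1-\alpha_{t-1})$ and $1-\alpha_t$ is Gaussian with variance $1-\alpha_t\alpha_{t-1}$. This closed form is what allows a random timestep to be sampled during training without simulating the whole chain.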

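When training the conditional diffusion model, the conditioning variable $y$ is passed to the network as an additional input, and the loss function becomes:

$$L_t = \mathbb{E}_{t, x_0, y, \epsilon}\left[\left\| \epsilon - \epsilon_\theta\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t,\ y\right) \right\|^2\right]. \tag{9}$$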
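A Monte Carlo estimate of Equation (9) for one batch can be sketched as below. The sketch assumes that $x_0$ is a clean patch and $y$ is the corresponding noisy patch, which is a natural reading for denoising but is an assumption here; `eps_model` is a hypothetical placeholder for the conditional U-Net, and the mean over elements is used in place of the squared norm (the two differ only by a constant factor).

```python
import numpy as np

def conditional_loss(eps_model, x0, y, alpha_bars, rng):
    """Estimate Equation (9) on a batch.

    x0: clean patches, shape (batch, H, W)       (assumed role)
    y:  conditioning patches, same shape         (assumed role)
    eps_model(x_t, t, y) -> predicted noise, same shape as x_t
    """
    batch = x0.shape[0]
    t = rng.integers(0, len(alpha_bars), size=batch)   # random timestep per sample
    eps = rng.standard_normal(x0.shape)                # target Gaussian noise
    a = alpha_bars[t].reshape(batch, 1, 1)
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps     # closed-form forward sample
    eps_pred = eps_model(x_t, t, y)
    return np.mean((eps - eps_pred) ** 2)

rng = np.random.default_rng(0)
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))  # assumed schedule
x0 = rng.standard_normal((16, 32, 32))
y = x0 + 0.3 * rng.standard_normal(x0.shape)

def zero_model(x_t, t, y):
    """Untrained placeholder that always predicts zero noise."""
    return np.zeros_like(x_t)

loss = conditional_loss(zero_model, x0, y, alpha_bars, rng)
```

A predictor that always returns zero gives a loss close to 1, the variance of the target noise, which is a useful baseline against which a training curve can be judged.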

This approach enables the model to generate samples that closely match a specific data distribution dictated by the conditioning variables. The conditioning variables add complexity and make training more demanding, but they yield more precise results. The hyperparameters of the model were largely adopted from [23,24] and empirically fine-tuned for best performance. We observed that larger patch sizes substantially improved performance; to balance GPU load against dataset size, we stopped at 512 × 512 patches. The final hyperparameters are listed in Table 1. The model was trained on three NVIDIA A100 GPUs.

2.5. Synthetic Training Data Generation

Based on an analysis of the dominant signal frequency, velocity, wavelet, and noise types of the DAS-VSP data, we constructed synthetic data for training and testing the proposed conditional diffusion model. The synthetic dataset consisted of two parts, the clean set and the noisy set. To keep the synthetic forward modeling simple, we modeled acoustic wave propagation. The acoustic solver simulated the pressure wavefield, which we treated as a P-wave potential. The first vertical derivative of the pressure gave data equivalent to the displacement, and the second derivative yielded the strain proxy. Given that the simulations were performed on a 6.25 m grid, we applied a box filter over four grid points to obtain an analog of a 25 m gauge length (GL) for the synthetic data, which was close to that of one of our target real datasets. While elastic solvers routinely used in DAS data simulation and inversion, e.g., [35,36,37,38], lead to higher fidelity in DAS data amplitudes, the acoustic solver allows for computationally efficient generation of the large datasets necessary for training the network.
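The conversion from simulated pressure to a DAS-like strain proxy can be sketched in NumPy as follows. The array layout (time samples by depth receivers), the finite-difference stencil, and the function name are assumptions made for this illustration; only the second vertical derivative, the 6.25 m spacing, and the four-point box filter come from the text.

```python
import numpy as np

def pressure_to_das_proxy(pressure, dz=6.25, gauge_points=4):
    """Turn a pressure gather into a gauge-length-averaged strain proxy.

    pressure: array of shape (n_samples, n_depths), depth along axis 1.
    Returns an array of shape (n_samples, n_depths - 2 - (gauge_points - 1)).
    """
    # Second vertical derivative by central second differences (strain proxy).
    strain = (pressure[:, 2:] - 2.0 * pressure[:, 1:-1] + pressure[:, :-2]) / dz**2
    # Box filter over gauge_points grid points: 4 x 6.25 m = 25 m gauge length.
    kernel = np.ones(gauge_points) / gauge_points
    return np.apply_along_axis(
        lambda trace: np.convolve(trace, kernel, mode="valid"), 1, strain
    )

# Check on a field that is quadratic in depth: its second derivative is 2.
z = np.arange(20) * 6.25
pressure = np.tile(z**2, (5, 1))
proxy = pressure_to_das_proxy(pressure)
```

The `"valid"` convolution drops edge channels instead of padding them, which avoids introducing artificial amplitudes at the top and bottom of the well.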

The velocity model is based on the SEAM Arid model [39]. The SEAM Arid model is built from the Barrett model, designed to represent unconventional reservoirs in Texas, and a near-surface model generated to represent the land data challenges typically faced in exploration in the Middle East. It has been used extensively for studies on deep learning-based inversion of seismic data [40], acquisition design for VSP [41] and surface seismic [42], and evaluation of structural uncertainty in challenging desert environments [43]. We created 135 shot gathers of size 2000 (samples) by 1098 (traces) with offsets ranging from 0 to 3.5 km by moving the shot location from left to right in the model in Figure 3 in increments of 50 m.

The specific parameters of forward modeling are shown in Table 2. The Klauder wavelet represents a realistic seismic vibrator sweep autocorrelation function [39]. Two sampling intervals widely used in seismic exploration, 1 and 2 ms, were used for the simulations. The grid size of 6.25 m is native to the SEAM Arid model. Central source frequencies between 15 and 55 Hz were picked as representative of the frequency content common in seismic monitoring with VSP. We used the synthetic shot gathers to generate a clean set with a 512 by 512 moving window. The moving window randomly captured 20 patches from each shot gather, which gave 2700 noise-free samples. After checking that the data patches captured the first arrivals of the simulated records, the samples were randomly separated into training (70% of samples), validation (20% of samples), and testing (10% of samples) datasets.
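The patch extraction and the 70/20/10 split described above can be sketched as follows. The function names are illustrative, uniform random window positions are an assumption, and the check that each patch contains first arrivals is not reproduced here.

```python
import numpy as np

def random_patches(gather, n_patches=20, size=512, rng=None):
    """Cut n_patches random size-by-size windows from one shot gather."""
    rng = np.random.default_rng() if rng is None else rng
    n_t, n_x = gather.shape
    rows = rng.integers(0, n_t - size + 1, size=n_patches)
    cols = rng.integers(0, n_x - size + 1, size=n_patches)
    return np.stack([gather[r:r + size, c:c + size] for r, c in zip(rows, cols)])

def split_indices(n_total, fractions=(0.7, 0.2, 0.1), rng=None):
    """Randomly split sample indices into training, validation and testing sets."""
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(n_total)
    n_train = int(round(fractions[0] * n_total))
    n_val = int(round(fractions[1] * n_total))
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

rng = np.random.default_rng(0)
gather = np.zeros((2000, 1098), dtype=np.float32)   # placeholder for one shot gather
patches = random_patches(gather, rng=rng)           # 20 patches of 512 x 512
n_total = 135 * 20                                  # 135 gathers x 20 patches = 2700
train_idx, val_idx, test_idx = split_indices(n_total, rng=rng)
```

With 2700 samples this split yields 1890 training, 540 validation, and 270 testing patches.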

The generation of the noise types found in field DAS-VSP records is crucial for training the diffusion model effectively. One common type of noise is the zigzag noise ($N$), defined by [44] as:

$$N = \sum_{i=1}^{n_{\max}} A_0\, x^{T_0 + (i-1)T} * W, \tag{10}$$

where $T_0$ represents the first break time, $A_0$ represents the first break amplitude, $T$ is the noise period, $n_{\max}$ represents the maximum number of noise periods, $W$ is the wavelet, and $x$ represents the attenuation parameter. For the synthetic data, we only considered the typical noise types that are hard to eliminate from DAS-VSP data by conventional methods, such as the zigzag noise shown in Figure 4c. The noise generation parameters are also specified in Table 2.
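One possible reading of Equation (10) for a single trace is sketched below. It interprets each term as a spike placed at time $T_0 + (i-1)T$ with amplitude $A_0 x^{T_0 + (i-1)T}$, and the whole spike train as convolved with the wavelet $W$. This interpretation, the Ricker wavelet used as a stand-in for $W$, and all parameter values are assumptions for illustration; they are not taken from [44] or Table 2.

```python
import numpy as np

def ricker(freq, dt, half_len=0.1):
    """Ricker wavelet with an odd number of samples (stand-in for W)."""
    n = int(round(half_len / dt))
    t = np.arange(-n, n + 1) * dt
    arg = (np.pi * freq * t) ** 2
    return (1.0 - 2.0 * arg) * np.exp(-arg)

def zigzag_trace(n_samples, dt, t0, period, n_max, a0, x, wavelet):
    """Single-trace sketch of Equation (10) under the stated interpretation."""
    spikes = np.zeros(n_samples)
    for i in range(1, n_max + 1):
        time = t0 + (i - 1) * period          # arrival time of the i-th period
        idx = int(round(time / dt))
        if idx < n_samples:
            spikes[idx] += a0 * x ** time     # attenuated amplitude
    return np.convolve(spikes, wavelet, mode="same")

dt = 0.001                                    # 1 ms sampling
wavelet = ricker(freq=30.0, dt=dt)
trace = zigzag_trace(n_samples=2000, dt=dt, t0=0.2, period=0.3,
                     n_max=4, a0=1.0, x=0.5, wavelet=wavelet)
```

In a full gather the first break time and the period would vary from trace to trace, which is what produces the zigzag pattern across depth; that spatial dependence is left out of this sketch.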

Once the clean acoustic DAS dataset was simulated, it was combined with the noise to create the synthetic dataset used for training. By injecting the zigzag noise (and possibly other types of noise) into the simulated wavefield, we created a noisy dataset that the model then learned to denoise. This process trained the model to handle the kind of data that it would encounter in real-world scenarios.

3. Results

3.1. Test on the Synthetic Dataset

To assess the efficacy of our proposed method, we initially used the synthetic testing dataset (Figure 5). Figure 5a displays the clean synthetic data produced via Deepwave [45] and DAS conversion. The 2D synthetic DAS-VSP data, embedded with diverse noise types, are depicted in Figure 5b and have an input SNR of 8 dB. Figure 5c illustrates the denoised result obtained with our proposed diffusion model, while Figure 5d shows the extracted noise. The SNR improved notably, increasing to 24 dB. The denoised gather in Figure 5c is barely distinguishable from the clean data in Figure 5a, which underscores our method’s capability to attenuate noise while retaining the signal. However, minor signal leakage was observed when examining the removed noise in Figure 5d. This leakage appeared to be inevitable, as the signal from some upgoing reflections resembled the noise when superimposed with downgoing waves.
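For reference, an SNR in decibels can be computed as in the sketch below. The article does not state its exact SNR formula, so the common energy-ratio definition used here is an assumption, as is the function name.

```python
import numpy as np

def snr_db(clean, estimate):
    """SNR in dB: energy of the clean signal over energy of the residual."""
    clean = np.asarray(clean, dtype=float)
    residual = np.asarray(estimate, dtype=float) - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))

clean = np.ones((10, 10))
noisy = clean + 0.1          # residual amplitude is 10% of the signal
value = snr_db(clean, noisy)
```

A residual whose amplitude is one tenth of the signal corresponds to 20 dB, so the reported change from 8 dB to 24 dB amounts to roughly a 40-fold reduction in residual energy under this definition.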

To further evaluate the predicted results quantitatively on the synthetic dataset, we used the Structural Similarity Index (SSIM) [46] as a metric. The SSIM measures the similarity between the denoised output (or the noisy data) and the ground truth. The SSIM ranges from −1 to 1, and it equals 1 when two images are identical. For the data shown in Figure 5, the SSIM improved from 0.67 to 0.81.
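The structure of the SSIM can be illustrated with a simplified, single-window version computed over the whole image. The standard SSIM [46] averages the same expression over local windows, so this global variant is a deliberate simplification and will not reproduce the values reported above; the constants `k1` and `k2` follow the commonly used defaults.

```python
import numpy as np

def global_ssim(a, b, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM over the full image (simplified illustration)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    c1 = (k1 * data_range) ** 2          # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2          # stabilizes the contrast/structure term
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = np.mean((a - mu_a) * (b - mu_b))
    numerator = (2.0 * mu_a * mu_b + c1) * (2.0 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator

image = np.linspace(-1.0, 1.0, 100).reshape(10, 10)
same = global_ssim(image, image, data_range=2.0)
flipped = global_ssim(image, -image, data_range=2.0)
```

Identical inputs give a value of 1, while a polarity-reversed copy gives a negative value, which shows how the index can reach the lower part of its range.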

3.2. Test on the Field Data

The two field DAS-VSP datasets in the study come from the Citronelle dataset [11] in the U.S., and the Groß Schönebeck site [47] in Germany. Both datasets are publicly available with more processing results available for the Groß Schönebeck site. For this study, we applied the diffusion model directly to two field DAS-VSP datasets for denoising. While the diffusion model itself was trained using synthetic data, this represents the first demonstration of its application to real-world DAS-VSP data for noise removal.

The Citronelle DAS-VSP dataset was acquired in Citronelle Field, Alabama in 2016. The test site is a Triassic fluvial-dominated sandstone reservoir located at approximately 10,000 feet depth. The dataset was obtained using DAS technology along two ~1 km deep vertical monitor wells with a VSP geometry. Controlled injection of ~22,000 tons of CO₂ created a CO₂ plume extending ~300 m. The high-resolution DAS-VSP dataset allows detailed imaging and characterization of the reservoir architecture, fracture networks, fluid distributions, and CO₂ plume behavior before, during, and after injection. This dataset provides an invaluable resource to study time-lapse subsurface changes associated with CO₂ injection in a geologic reservoir using DAS-VSP.

Comparisons between the input data shown in Figure 6a, denoised output (Figure 6b), and predicted noise (Figure 6c) provide several key observations. The green arrows indicate the areas where the noise was successfully suppressed in the denoised output, compared to the raw input data. This included attenuation of strong zigzag and checkerboard noise patterns. The blue arrows pinpoint the regions where primary reflection signals were preserved during denoising. The continuity and relative amplitudes of these events were maintained from input to output. The purple arrow shows an example where the first arrival direct wave was properly calibrated and aligned between the input and denoised data. This demonstrates the model’s ability to retain important signal components. The red arrow highlights an area of signal leakage where remnants of the reflection event persisted in the predicted noise. This indicates the model needs further tuning to completely remove noise without signal loss.

The Groß Schönebeck area is part of a large geothermal project in Germany where 4.2 km deep wireline DAS was acquired for the first time [47]; after major processing efforts [48], the data led to successful imaging results in a 3D VSP setup [49]. The public data retrieved for the Groß Schönebeck dataset include correlated, stacked noisy data and data after adaptive deconvolution.

The diffusion model removed the noise even in this case, allowing for picking around the top of the well, which significantly simplified the subsequent processing for the near-surface data (Figure 7). Adaptive deconvolution, which aims to make the data sparser and the spectrum flatter, is performed trace by trace and results in substantial amplitude changes and imbalance, yet it removes a significant portion of the zigzag noise caused by cable reverberations [48]. For the deeper part of the well, the reverberations were significantly slower than the direct arrivals and spatially separated from them, and deconvolution worked well. However, deconvolution failed where the reverberations had a speed comparable to that of the first arrivals and were not spatially separated from the direct arrivals (Figure 8); this was particularly detrimental to the top part of the well, which could not be picked due to intersecting events (red arrow in Figure 8b). To further assess the results generated by the proposed method, we applied a bandpass filter to the Groß Schönebeck dataset to extract a higher-SNR bandwidth (10–30 Hz) and compared the filtered data with those derived from the diffusion model (Figure 9). While the bandpass filter (Figure 9d) did remove some of the zigzag noise, it also removed a significant portion of the signal, introducing several issues including reduced temporal resolution (Figure 9e).
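A 10–30 Hz bandpass of the kind used for this comparison can be sketched with a zero-phase frequency-domain mask. The article does not specify the filter design, so the brick-wall FFT mask, the function name, and the 1 ms sampling interval in the example are assumptions; a tapered design such as a Butterworth filter is more typical in practice because it reduces ringing.

```python
import numpy as np

def fft_bandpass(data, dt, f_low, f_high):
    """Zero-phase brick-wall bandpass along the time axis (axis 0)."""
    n = data.shape[0]
    freqs = np.fft.rfftfreq(n, d=dt)
    spectrum = np.fft.rfft(data, axis=0)
    keep = (freqs >= f_low) & (freqs <= f_high)
    spectrum[~keep] = 0.0                      # zero everything outside the band
    return np.fft.irfft(spectrum, n=n, axis=0)

dt = 0.001                                     # assumed 1 ms sampling
t = np.arange(1000) * dt
low = np.sin(2 * np.pi * 20.0 * t)             # inside the 10-30 Hz band
high = np.sin(2 * np.pi * 60.0 * t)            # outside the band
data = np.tile((low + high)[:, None], (1, 3))  # three identical traces
filtered = fft_bandpass(data, dt, 10.0, 30.0)
```

The example also shows the drawback noted in the text: everything outside the band is discarded regardless of whether it is signal or noise, so bandwidth and temporal resolution are lost along with the noise.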

4. Discussion

4.1. Diffusion Time-Step Analysis

Diffusion models function by simulating a forward diffusion process that begins from a target distribution—in this study, the noise-free DAS-VSP samples—and gradually infuses noise until reaching a simple distribution, such as a standard Gaussian distribution. This forward process is computationally straightforward, but the reverse procedure of transitioning from the simple to the target distribution is more intricate and requires iterative training.

The diffusion process is typically partitioned into discrete timesteps. The number of timesteps can significantly influence the model performance. We investigated the impact of the number of timesteps, ranging from 1 to 1000, on the efficacy of our proposed diffusion model.

A single timestep (T = 1) configuration essentially requires the diffusion model to encapsulate the entire distribution in one instance. This task is substantial as the transformation from the simple to the target distribution is highly complex, leading to the potential for distortion or anomalies in the output. When the timestep quantity is set to 1 for both forward and reverse diffusion processes, the model assumes the role of a single step denoising model, adding and then trying to recover noise in one action. The U-Net, in this case, serves as the architecture for the reverse process, tasked with mapping the noisy data back to the original data. While the diffusion process is simplified, the U-Net’s task is more intricate, given that it must reverse the noise effects in one go. Hence, the diffusion model with a single timestep could be viewed as a single-step denoising U-Net.

Introducing 10 timesteps (T = 10) allows the diffusion model more ‘room’ to progressively transform the simple distribution into the target distribution. While the sample quality may improve compared to a single timestep, the output could still vary noticeably from the target distribution. This situation signifies a trade-off between computational complexity and sample quality.

Expanding to 100 timesteps (T = 100), the model can facilitate more gradual transitions from the simple to the target distribution. Consequently, the quality of the generated samples could surpass those of models with fewer timesteps. However, the increased computational requirement and potential risk of overfitting are important considerations.

Finally, with 1000 timesteps (T = 1000), the model is given ample opportunity to make very incremental transitions from the simple to the target distribution. The generated samples could potentially be of high quality, virtually indistinguishable from the target distribution. However, the computational expense escalates significantly, and careful regularization could be necessary to avoid overfitting. Figure 10 shows the diffusion process with T = 1000. The noise is gradually removed as the timestep increases.

There is a trade-off between the number of timesteps and computational efficiency. For simpler tasks, a smaller number of timesteps may suffice. Certain derivatives of diffusion models, like the denoising diffusion implicit model (DDIM) [50], can achieve performance with fewer timesteps that is comparable to what they achieve with more.
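The idea of sampling with fewer timesteps can be illustrated with a deterministic DDIM-style update on a sub-sequence of the training timesteps. This is a generic sketch of the DDIM rule with zero added noise, not the configuration used in this study; `eps_model` is a hypothetical placeholder for the trained network, and the oracle in the example uses the known clean sample only so that the loop can be verified.

```python
import numpy as np

def ddim_sample(x_T, eps_model, alpha_bars, n_steps):
    """Deterministic DDIM-style sampling on n_steps of the original timesteps."""
    T = len(alpha_bars)
    steps = np.linspace(T - 1, 0, n_steps).round().astype(int)
    x = x_T
    for i, t in enumerate(steps):
        eps = eps_model(x, t)
        # Predict the clean sample implied by the current noise estimate.
        x0_pred = (x - np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alpha_bars[t])
        # Jump to the next retained timestep (or to the clean sample at the end).
        a_prev = alpha_bars[steps[i + 1]] if i + 1 < n_steps else 1.0
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps
    return x

rng = np.random.default_rng(0)
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))  # assumed schedule
x0 = rng.standard_normal((8, 8))

def oracle(x_t, t):
    """Hypothetical perfect noise predictor built from the known x0."""
    return (x_t - np.sqrt(alpha_bars[t]) * x0) / np.sqrt(1.0 - alpha_bars[t])

x_T = rng.standard_normal((8, 8))
sample = ddim_sample(x_T, oracle, alpha_bars, n_steps=10)
```

Here 10 network evaluations replace the 1000 of the full chain. With a real network the quality obtained at a given step count has to be checked empirically.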

4.2. Advantages and Limitations

The injection of noise at every timestep can be considered a form of data augmentation. This process adds varying levels of noise to each training image at each timestep, creating an altered version for the model to learn from. This inherent data augmentation property can prove beneficial, especially when working with a limited dataset.

The comparisons with classic DAS denoising methods presented here are limited to deconvolution and bandpass filtering, but multiple recent studies suggest that machine learning-based methods generally outperform classic denoising methods [15,19].

While diffusion models have several advantages, such as robustness to varying noise levels and an ability to handle complex distributions, they also pose challenges like increased computational demands. This increased complexity stems from the sequential nature of the diffusion process, compared to models like the U-Net. Furthermore, the quality of results still heavily depends on the training data quality and variety in supervised learning applied to train the diffusion models.

The computational expense of the current diffusion model implementation poses a potential limitation, particularly due to the large timestep setting. At the current stage, inference for a single shot gather takes about a minute while training requires several days. However, there are ways to reduce the training and inference time requirements by using transfer learning and adopting more computationally efficient diffusion model architectures actively researched in the machine learning community.

5. Conclusions

We proposed a DAS-VSP denoising workflow based on the conditional diffusion model, which was trained on a relatively small synthetic dataset of DAS-VSP shot gathers. The model’s performance was evaluated using synthetic and field DAS-VSP datasets. When compared to traditional methods, the proposed model offers a more robust and general solution for suppressing typical noise in DAS-VSP data.

The field of generative modeling is currently undergoing rapid and significant advancements. It is our hope that this pioneering research contributes to the ongoing development of this field, paving the way for further applications of the generative modeling techniques, such as diffusion models, in seismic and DAS-related research.

Author Contributions

Conceptualization, D.Z., L.F. and V.K.; methodology, D.Z., L.F. and V.K.; resources, D.Z. and V.K.; data curation, D.Z. and L.F.; writing—original draft preparation, D.Z., L.F. and V.K.; writing—review and editing, D.Z., L.F., V.K. and W.L.; supervision and project administration, W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The Citronelle DAS-VSP dataset can be found at https://edx.netl.doe.gov/dataset/citronelle-2013-das-vsp. The Groß Schönebeck dataset can be found at https://doi.org/10.5880/GFZ.4.8.2021.001. Deepwave can be found at https://doi.org/10.5281/zenodo.8189232.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Mateeva, A.; Lopez, J.; Potters, H.; Mestayer, J.; Cox, B.; Kiyashchenko, D.; Wills, P.; Grandi, S.; Hornman, K.; Kuvshinov, B.; et al. Distributed acoustic sensing for reservoir monitoring with vertical seismic profiling. Geophys. Prospect. 2014, 62, 679–692.
2. Fernández-Ruiz, M.; Soto, M.; Williams, E.; Martín-López, S.; Zhan, Z.; González-Herráez, M.; Martins, H. Distributed acoustic sensing for seismic activity monitoring. APL Photonics 2020, 5, 030901.
3. Fang, G.; Li, Y.; Zhao, Y.; Martin, E. Urban Near-Surface Seismic Monitoring Using Distributed Acoustic Sensing. Geophys. Res. Lett. 2020, 47, e2019GL086115.
4. Chen, J.; Ning, J.; Chen, W.; Wang, X.; Wang, W.; Zhang, G. Distributed acoustic sensing coupling noise removal based on sparse optimization. Interpretation 2019, 7, T373–T382.
5. Willis, M.E.; Wu, X.; Palacios, W.; Ellmauthaler, A. Understanding cable coupling artifacts in wireline-deployed DAS VSP data. In SEG Technical Program Expanded Abstracts 2019; Society of Exploration Geophysicists: Tulsa, OK, USA, 2019; pp. 5310–5314.
6. Deighan, A.J.; Watts, D.R. Ground-roll suppression using the wavelet transform. Geophysics 1997, 62, 1896–1903.
7. Stein, R.A.; Bartley, N.R. Continuously time-variable recursive digital band-pass filters for seismic signal processing. Geophysics 1983, 48, 702–712.
8. Gülünay, N. FXDECON and complex Wiener prediction filtering. In SEG Technical Program Expanded Abstracts 1986; Society of Exploration Geophysicists: Tulsa, OK, USA, 1986; pp. 279–281.
9. Chen, K.; Sacchi, M.D. Robust f-x projection filtering for simultaneous random and erratic seismic noise attenuation. Geophys. Prospect. 2017, 65, 650–668.
10. Li, W.; Chen, K.; Ahmed, F.; Jeong, W. Rank revealing and vector optimization methods for adaptive robust denoising. In SEG Technical Program Expanded Abstracts 2019; Society of Exploration Geophysicists: Tulsa, OK, USA, 2019; pp. 4695–4699.
11. Daley, T.M.; Miller, D.E.; Dodds, K.; Cook, P.; Freifeld, B.M. Field Testing of Modular Borehole Monitoring with Simultaneous Distributed Acoustic Sensing and Geophone Vertical Seismic Profile at Citronelle, Alabama. Geophys. Prospect. 2016, 64, 1318–1334.
12. Ellmauthaler, A.; Willis, M.E.; Wu, X.; Leblanc, M. Noise sources in fiber-optic distributed acoustic sensing VSP data. In EAGE Extended Abstracts; European Association of Geoscientists & Engineers: Utrecht, The Netherlands, 2017; Volume 2017, pp. 1–5.
13. Cai, Z.; Yu, G.; Zhang, Q.; Zhao, Y.; Chen, Y.; Jin, Y.; Zhao, H. Comparative research between DAS-VSP and conventional VSP data. In Proceedings of the 2016 Workshop: Rock Physics and Borehole Geophysics, Beijing, China, 28–30 August 2016; pp. 81–84.
14. Yu, S.; Ma, J.; Wang, W. Deep learning for denoising. Geophysics 2019, 84, V333–V350.
15. Jin, G.; Kazei, V.; Lellouch, A.; Li, W.; Titov, A.; Tribaldos, V.R. Distributed acoustic sensing in geophysics—Introduction. Geophysics 2023, 88, 1–4.
16. Zhao, Y.; Li, Y.; Dong, X.; Yang, B. Low-frequency noise suppression method based on improved DnCNN in desert seismic data. IEEE Geosci. Remote Sens. Lett. 2019, 16, 811–815.
17. Pham, N.; Li, W. Physics-constrained deep learning for ground roll attenuation. Geophysics 2021, 87, V15–V27.
18. Zhao, Y.; Li, Y.; Wu, N. Distributed acoustic sensing vertical seismic profile data denoiser based on convolutional neural network. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5900511.
19. Yang, L.; Fomel, S.; Wang, S.; Chen, X.; Chen, W.; Saad, O.M.; Chen, Y. Denoising of distributed acoustic sensing data using supervised deep learning. Geophysics 2023, 88, WA91–WA104.
20. Sohl-Dickstein, J.; Weiss, E.A.; Maheswaranathan, N.; Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. arXiv 2015, arXiv:1503.03585.
21. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27, 2672–2680.
22. Marano, G.C.; Rosso, M.M.; Aloisio, A.; Cirrincione, G. Generative Adversarial Networks Review in Earthquake-related Engineering Fields. Bull. Earthq. Eng. 2023, 1–52.
23. Durall, R.; Ghanim, A.; Fernandez, M.; Ettrich, N.; Keuper, J. Deep diffusion models for seismic processing. Comput. Geosci. 2023, 177, 105377.
Ho, J.; Jain, A.; Abbeel, P. Denoising Diffusion Probabilistic Models. arXiv 2020, arXiv:2006.11239. [Google Scholar]
Jarzynski, C. Nonequilibrium Equality for Free Energy Differences. Phys. Rev. Lett. 1997, 78, 2690. [Google Scholar] [CrossRef]
Neal, R. Annealed importance sampling. arXiv 1998, arXiv:physics/9803008.
Feller, W. On the theory of stochastic processes, with particular reference to applications. In Proceedings of the First Berkeley Symposium on Mathematical Statistics and Probability; University of California Press: Berkeley, CA, USA, 1949.
Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv 2015, arXiv:1505.04597.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv 2015, arXiv:1512.03385.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is All you Need. arXiv 2017, arXiv:1706.03762.
Dhariwal, P.; Nichol, A. Diffusion Models Beat GANs on Image Synthesis. arXiv 2021, arXiv:2105.05233.
Saharia, C.; Ho, J.; Chan, W.; Salimans, T.; Fleet, D.J.; Norouzi, M. Image Super-Resolution via Iterative Refinement. arXiv 2021, arXiv:2104.07636.
Blei, D.M.; Kucukelbir, A.; McAuliffe, J.D. Variational inference: A review for statisticians. J. Am. Stat. Assoc. 2017, 112, 859–877.
Kingma, D.P.; Welling, M. Auto-encoding variational bayes. arXiv 2013, arXiv:1312.6114.
Kazei, V.; Osypov, K. Inverting distributed acoustic sensing data using energy conservation principles. Interpretation 2021, 9, SJ23–SJ32.
Kazei, V.; Osypov, K.; Alfataierge, E.; Bakulin, A. Amplitude-based DAS logging: Turning DAS VSP amplitudes into subsurface elastic properties. In SEG Technical Program Expanded Abstracts 2021; Society of Exploration Geophysicists: Tulsa, OK, USA, 2021; pp. 412–416.
Egorov, A.; Correa, J.; Bóna, A.; Pevzner, R.; Tertyshnikov, K.; Glubokovskikh, S.; Puzyrev, V.; Gurevich, B. Elastic full-waveform inversion of vertical seismic profile data acquired with distributed acoustic sensors. Geophysics 2018, 83, R273–R281.
Podgornova, O.; Bettinelli, P.; Liang, L.; Le Calvez, J.; Leaney, S.; Perez, M.; Soliman, A. Full-Waveform Inversion of Fiber-Optic VSP Data from Deviated Wells. In Proceedings of the SPWLA 63rd Annual Logging Symposium, Stavanger, Norway, 11–15 June 2022.
Oristaglio, M. SEAM update: The Arid model—Seismic exploration in desert terrains. Lead. Edge 2015, 34, 466–468.
Kazei, V.; Ovcharenko, O.; Plotnitskii, P.; Peter, D.; Silvestrov, I.; Bakulin, A.; Zwartjes, P.; Alkhalifah, T. Elastic near-surface model estimation from full waveforms by deep learning. In SEG Technical Program Expanded Abstracts 2020; Society of Exploration Geophysicists: Tulsa, OK, USA, 2020; pp. 3872–3876.
Kazei, V.; Liang, H.; AlDawood, A. Acquisition and near-surface impacts on VSP mini-batch FWI and RTM imaging in desert environment. Lead. Edge 2023, 42, 165–172.
Bakulin, A.; Silvestrov, I. Quantitative evaluation of 3D land acquisition geometries with arrays and single sensors: Closing the loop between acquisition and processing. Lead. Edge 2023, 42, 310–320.
Silvestrov, I.; Egorov, A.; Bakulin, A. Evaluating imaging uncertainty associated with the near surface and added value of vertical arrays using Bayesian seismic refraction tomography. J. Geophys. Eng. 2023, 20, 751–762.
Yu, G.; Cai, Z.; Chen, Y.; Wang, X.; Zhang, Q.; Li, Y.; Wang, Y.; Liu, C.; Zhao, B.; Greer, J. Borehole seismic survey using multimode optical fibers in a hybrid wireline. Measurement 2018, 125, 694–703.
Richardson, A. Deepwave. Zenodo 2023.
Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
Henninges, J.; Martuganova, E.; Stiller, M.; Norden, B.; Krawczyk, C.M. DAS-VSP Data from the Feb. 2017 Survey at the Groß Schönebeck Site, Germany; GFZ Data Services: Potsdam, Germany, 2021.
Martuganova, E.; Stiller, M.; Bauer, K.; Henninges, J.; Krawczyk, C.M. Cable reverberations during wireline distributed acoustic sensing measurements: Their nature and methods for elimination. Geophys. Prospect. 2021, 69, 1034–1054.
Martuganova, E.; Stiller, M.; Norden, B.; Henninges, J.; Krawczyk, C.M. 3D deep geothermal reservoir imaging with wireline distributed acoustic sensing in two boreholes. Solid Earth 2022, 13, 1291–1307.
Song, J.; Meng, C.; Ermon, S. Denoising Diffusion Implicit Models. arXiv 2020, arXiv:2010.02502.

Figure 1. Seismic data denoising task. The noisy seismic shot gather (input) is mapped to the clean seismic shot gather (label).

Figure 2. Workflow of the proposed diffusion model for DAS-VSP data denoising. The forward process has no trainable parameters (it only adds noise), whereas the reverse process contains learnable parameters.
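
The parameter-free forward process in Figure 2 can be illustrated with a minimal sketch. It assumes the standard DDPM closed form of Ho et al., $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$, with a linear variance schedule; the schedule endpoints, the number of steps, and the function names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear variance schedule (assumed; endpoints are common DDPM defaults)."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I) in one step."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention per step

x0 = rng.standard_normal((64, 64))       # stand-in for a clean data patch
x_early, _ = forward_diffuse(x0, 10, alpha_bar, rng)    # still close to x0
x_late, _ = forward_diffuse(x0, 999, alpha_bar, rng)    # close to pure noise
```

Nothing in this step is learned: only the reverse process, which predicts the added noise from `x_t`, carries trainable weights.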

Figure 3. SEAM Arid model slice used for data generation, the same as in [41]. Red triangles at the top of the model mark the positions of the sources used for dataset generation. The DAS fiber spans the whole well length (thin black line), while the geophone setup was focused around the target area (solid black area), leading to superior resolution there.

Figure 4. (a) Clean data, (b) data with noise, (c) noise.

Figure 5. Denoising synthetic noisy data: (a) clean synthetic data generated using deepwave and DAS conversion, (b) the synthetic DAS-VSP data with noise, (c) the denoised result using our proposed diffusion model, (d) the removed noise.

Figure 6. The denoising result for the Citronelle dataset. (a) field data input, (b) diffusion model denoised data, (c) predicted noise. The green arrows indicate that the noise is successfully suppressed. The blue arrows indicate the preserved signals. The purple arrow indicates the calibration of the first arrival. The red arrow indicates the signal leakage.

Figure 7. The denoising result for the Groß Schönebeck dataset. (a) field data input, (b) diffusion model denoised result, (c) predicted noise. The green arrows indicate that the noise is successfully suppressed. The blue arrows indicate the preserved signals. The red arrow indicates the signal leakage.

Figure 8. Comparison with deconvolution. (a) field data input, (b) adaptive deconvolution result, (c) diffusion model denoised result. The red and green arrows indicate noisy areas in the data. The green arrow indicates the area where denoising is successful with both methods. The red arrow indicates an area where the deconvolution does not mitigate the noise but the diffusion model does. The purple arrows indicate where the deconvolution changes the amplitude, whereas the diffusion model preserves it.

Figure 9. Comparison with bandpass filter. (a) field data input, (b) diffusion model denoised result, (c) residual between (a,b), (d) bandpass filter result, (e) residual between (a,d). The red arrow indicates aliasing introduced by the bandpass filter. The blue arrow points out the elimination of the first arrival by the bandpass filter, while it is preserved by the diffusion model. The green arrow denotes that the bandpass filter does not mitigate the noise, in contrast to the diffusion model. The purple arrow highlights that the bandpass filter induces significant signal leakage and reduces the resolution.
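
The residual panels (c) and (e) in Figure 9 are differences between the input and each filtered result. A minimal sketch of such a comparison baseline is given below, assuming a zero-phase brick-wall band-pass applied along the time axis. The pass band, the function name, and the brick-wall design are illustrative assumptions; the filter actually used for the figure is not specified here, and practical filters usually taper the band edges.

```python
import numpy as np

def bandpass_fft(data, dt, f_lo, f_hi):
    """Zero-phase brick-wall band-pass along the time axis (axis 0)."""
    n = data.shape[0]
    freqs = np.fft.rfftfreq(n, d=dt)             # frequency of each rFFT bin in Hz
    spec = np.fft.rfft(data, axis=0)
    keep = (freqs >= f_lo) & (freqs <= f_hi)
    spec[~keep] = 0.0                            # zero everything outside the band
    return np.fft.irfft(spec, n=n, axis=0)

# Toy gather: one trace with a 30 Hz "signal" and a 120 Hz "noise" component.
dt = 0.001
t = np.arange(1000) * dt
signal = np.sin(2.0 * np.pi * 30.0 * t)
noise = np.sin(2.0 * np.pi * 120.0 * t)
gather = (signal + noise)[:, None]               # shape (time samples, traces)

filtered = bandpass_fft(gather, dt, f_lo=15.0, f_hi=55.0)
residual = gather - filtered                     # what the filter removed
```

Inspecting `residual` for coherent events is the same check the figure performs: any signal that appears there has leaked out of the filtered result.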

Figure 10. Diffusion process at different time steps.

Table 1. Detailed hyperparameters in the diffusion model.

Hyperparameters	Value
Patch size	512 × 512
Kernel size	3 × 3
Batch size	64
Epoch	500,000
Initial learning rate	$2 \times 10^{-5}$
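
For reproducibility, the values in Table 1 can be collected in a small configuration object. The sketch below is illustrative only: the class name, field names, and the `patches_needed` helper are hypothetical, and the helper assumes non-overlapping tiling with padding, which the table does not state.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class DiffusionTrainConfig:
    """Values copied from Table 1; field names are illustrative."""
    patch_size: tuple = (512, 512)
    kernel_size: tuple = (3, 3)
    batch_size: int = 64
    epochs: int = 500_000
    initial_lr: float = 2e-5

def patches_needed(n_rows, n_cols, cfg):
    """Patches required to cover a record (assumes non-overlapping tiles with padding)."""
    rows = math.ceil(n_rows / cfg.patch_size[0])
    cols = math.ceil(n_cols / cfg.patch_size[1])
    return rows * cols

cfg = DiffusionTrainConfig()
n_tiles = patches_needed(2000, 1200, cfg)   # hypothetical record of 2000 x 1200 samples
```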

Table 2. Specific parameters of forward modeling and noise generation.

Parameters	Value
Wavelet	Klauder
Time interval (s)	0.001, 0.002
Grid size (m)	6.25
Boundary condition	PML
Frequency [min, max] (Hz)	[15, 55]
Maximum period [min, max] ($n_{max}$)	[20, 40]
Noise attenuation parameter [min, max] (x)	[0.05, 0.7]
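
To make the Table 2 entries concrete, the sketch below builds a Klauder wavelet, which is the autocorrelation of a linear sweep, and draws one random parameter set. It assumes that the [15, 55] Hz row gives the sweep band and that the ranged parameters are sampled uniformly. The sweep length, the wavelet half-width, and the function names are hypothetical choices, not values from the paper.

```python
import numpy as np

def klauder_wavelet(f_min, f_max, dt, sweep_len=4.0, half_width=0.2):
    """Klauder wavelet: autocorrelation of a linear sweep, cropped around zero lag."""
    t = np.arange(0.0, sweep_len, dt)
    rate = (f_max - f_min) / sweep_len                    # sweep rate in Hz/s
    sweep = np.sin(2.0 * np.pi * (f_min * t + 0.5 * rate * t**2))
    acf = np.correlate(sweep, sweep, mode="full")         # zero lag at index len(t) - 1
    mid = len(t) - 1
    n = int(round(half_width / dt))
    wavelet = acf[mid - n : mid + n + 1]
    return wavelet / np.max(np.abs(wavelet))              # unit peak at zero lag

def sample_parameters(rng):
    """Draw one parameter set from the Table 2 ranges (uniform sampling is assumed)."""
    return {
        "dt": float(rng.choice([0.001, 0.002])),          # time interval (s)
        "n_max": int(rng.integers(20, 41)),               # maximum period, 20 to 40 inclusive
        "x": float(rng.uniform(0.05, 0.7)),               # noise attenuation parameter
    }

rng = np.random.default_rng(0)
params = sample_parameters(rng)
wavelet = klauder_wavelet(15.0, 55.0, dt=0.001)
```

Because the wavelet is an autocorrelation, it is symmetric and zero-phase, with its peak at the center sample.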

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhu, D.; Fu, L.; Kazei, V.; Li, W. Diffusion Model for DAS-VSP Data Denoising. Sensors 2023, 23, 8619. https://doi.org/10.3390/s23208619

