Abstract
Abstract
Missing seismic data in reflection seismology, which frequently arise from operational and natural limitations, directly degrade the quality of subsequent imaging and undermine the reliability of geological interpretation. Traditional techniques for reconstructing seismic data depend heavily on parameter choices and prior assumptions. Although these methods work well for partially missing traces, reconstructing entire shot gathers remains a difficult task that has not been thoroughly studied. In recent years, data-driven approaches that learn and generalize patterns from large volumes of data have become increasingly common in seismic data reconstruction research. Building on earlier work, this study proposes an enhanced technique that can reconstruct entire shot gathers as well as partially missing traces. During model training, we first apply a moveout-window selective slicing method for reconstructing missing traces; by building training datasets within a high signal-to-noise ratio (SNR) window, this method strengthens the model's learning capacity. We also present a technique for reconstructing missing shot data in the receiver domain. Finally, a dual-domain reconstruction strategy is used to recover the seismic data in situations where both domains are missing data simultaneously.
1. Introduction
Natural constraints and human factors inevitably result in missing data in seismic surveys. To ensure that the final dataset satisfies the requirements of subsequent processing and interpretation, interpolation algorithms are commonly used to restore the missing data [1]. Conventional seismic data reconstruction techniques fall broadly into three categories [2]: (1) Wave-equation-based reconstruction, which fills data gaps by simulating seismic wave propagation through the subsurface with physical equations. Its reconstruction accuracy, however, depends strongly on the quality of the subsurface velocity model, so obtaining a precise a priori velocity model remains a significant obstacle for this category of techniques [3]. (2) Sparse-transform-based reconstruction, which recovers missing traces by exploiting the sparsity of seismic data in a transform domain. Because many transforms can be used, including the Fourier, wavelet, and Radon transforms, the reconstruction results are often transform-dependent [4,5,6]. Moreover, certain transforms describe particular geological phenomena better than others, so researchers frequently invest considerable effort testing different transforms to determine which is best for a given dataset [7,8,9]. (3) Low-rank matrix completion, whose main goal is recovering complete data from partially observed entries. This class of approaches uses a low-rank approximation scheme: at each iteration, a low-rank matrix approximates the current estimate (including predicted values at missing locations), and this approximation serves as the input for the next iteration, gradually converging toward the true solution [10,11,12].
Although these techniques have shown encouraging results in the interpolation of seismic data, each has associated drawbacks.
With the development of artificial intelligence in recent years, seismologists have increasingly adopted deep learning methods for seismic data processing, inversion, interpolation, and reconstruction [13,14,15]. The neural network architectures most frequently used for seismic data reconstruction include Recurrent Neural Networks (RNNs), which excel at processing sequential data [16,17]; Generative Adversarial Networks (GANs), which are known for producing highly realistic data [18,19,20]; and Convolutional Neural Networks (CNNs), which are adept at capturing spatially local correlations. CNNs have become the most popular and effective option in modern data-driven seismic data reconstruction because of their strong ability to extract and represent spatial features. Seismologists quickly embraced the U-Net architecture for geophysical applications after its groundbreaking success in medical image segmentation [21]. By combining an encoder–decoder design with skip connections, the network efficiently captures multi-scale information in seismic profiles, from large-scale structures to subtle phase alignments. Residual Neural Networks (ResNets), which learn the residual mapping between input and output data, were introduced to ease the training of very deep networks [22]; their strong reconstruction and denoising performance on seismic data has been verified experimentally [23,24,25]. The U-Net architecture, a typical convolutional neural network, is also frequently used to reconstruct seismic data with randomly missing traces [26], and its suitability for reconstructing data with continuously missing traces has since been investigated further.
Furthermore, several modified U-Net-based architectures have been developed and applied to seismic data reconstruction. For example, the U-Net++ architecture has been successfully used to reconstruct seismic data and, owing to its deeper network and denser skip connections, has shown better performance than the standard U-Net [27,28]. Incorporating an attention mechanism into the U-Net design improves the accuracy and efficiency of seismic data reconstruction by dynamically and adaptively selecting the feature information most important to the task at hand through its bottom-up feedback loop [29,30].
The aforementioned methods are generally applied to missing-trace interpolation; the reconstruction of missing shot data remains underexplored. To achieve intelligent reconstruction of missing shot data, an adaptively created training-set technique has been proposed, in which a Residual Neural Network (ResNet) is trained and validated on common-shot gathers before being smoothly transferred to common-receiver gathers [31]. The restoration of seismic data with irregular and regular missing patterns has been further explored using a 3D Convolutional Neural Network based on the U-Net architecture, which overcomes the drawbacks of traditional methods that often perform poorly when there are large gaps or high missing ratios [32]. Recent studies have also investigated the effectiveness of U-Net in reconstructing 3D seismic data in a variety of domains, including the t-x, f-x, frequency, time, and 3D spatial domains, with three-dimensional approaches yielding the best reconstruction performance [33]. Despite the encouraging outcomes of these deep learning-based techniques, they have two main drawbacks. First, current U-Net-based methods consistently treat seismic data as images for network training, ignoring the inherent spatiotemporal characteristics of prestack seismic data, particularly the vital time and velocity information it contains. Second, the majority of current techniques use deep learning to perform standalone reconstruction of either missing shot gathers or missing traces; the computational efficiency of true three-dimensional techniques is often limited, and their efficacy in reconstructing missing shot data has not been sufficiently demonstrated.
This study proposes a dual-domain reconstruction technique based on the U-Net++ neural network, integrating two improvements to address the limited generalization ability of current seismic data reconstruction methods. The first improvement introduces a moveout-window selective slicing method during model training to address the reconstruction of missing traces; by defining a high signal-to-noise ratio (SNR) interval, it builds a well-curated training set that improves the model's learning capacity through high-quality data samples. The second improvement is a method for reconstructing missing shot data in the receiver domain. To handle the more difficult case of simultaneously missing shot gathers and traces, a dual-domain reconstruction strategy is used to successfully recover the incomplete data. This study aims to provide new insights into seismic data reconstruction under various missing-data scenarios and to explore a highly adaptable deep learning-based reconstruction method.
2. Methods
2.1. The Irregularity Problem of Seismic Data in Different Domains
Missing seismic signals from shot points or receivers are unavoidable in seismic data acquisition, and the reconstruction of such irregular seismic data is the central focus of this study. Data collected by receivers positioned along a seismic line can be sorted by receiver number or shot number, as shown in Figure 1. Both kinds of data gaps appear in the receiver domain (WR) or the shot domain (WS), and together they constitute the fundamental problem of missing seismic traces. The goal of data reconstruction is to use deep learning techniques to recover the complete signal from these partial observations.
Figure 1.
Seismic waveform recording (sorted by shot point number, WS: Shot domain data; WR: Receiver domain data.).
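The shot-domain and receiver-domain sortings in Figure 1 contain the same samples, merely reorganized. A minimal NumPy sketch (array sizes are hypothetical, not from the paper) shows that resorting a 3D cube from common-shot to common-receiver gathers is just a transpose of the first two axes:

```python
import numpy as np

# Hypothetical 3D seismic volume: (shots, receivers, time samples).
n_shots, n_receivers, n_time = 4, 6, 100
data = np.random.rand(n_shots, n_receivers, n_time)

# Shot-domain gathers WS: one 2D panel per shot -> ws[s] has shape (receivers, time).
ws = data

# Receiver-domain gathers WR: resorting the same samples by receiver number
# is a transpose of the first two axes -> wr[r] has shape (shots, time).
wr = data.transpose(1, 0, 2)

# The trace recorded at shot s and receiver r is identical in both sortings.
assert np.array_equal(ws[2, 5], wr[5, 2])
```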
It is possible to think of the entire seismic signal as a function of the data recorded by all receivers and the time samples. As shown in Figure 1, the seismic data can be expressed as WR (receiver-domain data) or WS (shot-domain data) when sorted into the receiver domain or the shot domain; Wk denotes either representation. The incomplete observation Q is obtained by applying a binary sampling operator to the complete data:

Q = Ck ∘ Wk,  k ∈ {R, S},

where Ck = (cm) denotes the sampling matrix whose m-th column vector cm is binary-valued (0 or 1) with the same dimension as the complete signal, and ∘ is the element-wise product. To create an approximation of the fully sampled seismic data, the deep learning-based seismic signal reconstruction method learns data features from Q. Through repeated optimization, the training asymptotically approaches the ground truth Wk as it converges toward a stable solution. This process can be formally defined as:
Li = (1/S) Σ_{m=1}^{S} ‖ U(qm; θ) − wm ‖F²

where wm denotes the complete signal, qm the incomplete (missing) signal, U(qm; θ) the network prediction with parameters θ, S the batch size, Li the single-layer loss function, and ‖·‖F the Frobenius norm.
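The binary column-wise sampling described above can be sketched directly in NumPy. This is a minimal illustration (array sizes and missing ratio are hypothetical): the observed data Q is the element-wise product of the complete gather W with a mask whose columns are all ones (recorded trace) or all zeros (missing trace):

```python
import numpy as np

rng = np.random.default_rng(0)

# Complete gather W: rows = time samples, columns = traces (hypothetical sizes).
w = rng.standard_normal((256, 64))

# Sampling matrix C: each column c_m is all ones (trace recorded) or all zeros
# (trace missing), so the observed data is the element-wise product Q = C * W.
missing = rng.random(64) < 0.3          # roughly 30% of traces dropped
c = np.ones_like(w)
c[:, missing] = 0.0

q = c * w                               # incomplete observation fed to the network

assert np.allclose(q[:, ~missing], w[:, ~missing])  # kept traces untouched
assert np.all(q[:, missing] == 0.0)                 # missing traces zeroed
```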
Under deep supervision, the total loss in Equation (6) is obtained by summing the weighted losses from each decoder level, with equal weights applied to all levels:

Ltotal = Σ_{i=1}^{N} wi · Li,   (6)

and Equation (7) states that the goal of model training is to minimize this weighted total loss:

θ* = argmin_θ Ltotal,   (7)

where N represents the number of neural network layers, wi denotes the weight of each layer, Li is the loss function at the current layer, and Ltotal is the overall loss function.
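The weighted total loss under deep supervision reduces to a short helper. A minimal sketch, assuming equal weights wi = 1/N across decoder levels as the text describes (the example loss values are hypothetical):

```python
# Deep-supervision total loss: a weighted sum of the per-level losses L_i,
# with equal weights w_i = 1/N across the N supervised decoder levels.

def total_loss(level_losses):
    n = len(level_losses)
    weights = [1.0 / n] * n              # equal weight for every decoder level
    return sum(w * l for w, l in zip(weights, level_losses))

# Example with four decoder outputs (hypothetical loss values):
assert abs(total_loss([0.4, 0.3, 0.2, 0.1]) - 0.25) < 1e-12
```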
The Adam optimizer with β1 = 0.9, β2 = 0.9999, and a weight decay of 10⁻⁵ was used during training. The initial learning rate was set to 10⁻⁴ with a cosine annealing scheduler (cycle of 50 epochs, minimum learning rate of 10⁻⁶). Training terminated either upon exceeding the maximum of 100 epochs or by early stopping, triggered when the validation loss failed to decrease (drop < 10⁻⁶) for 20 consecutive epochs. A 64 × 64 patch-based training and prediction approach was employed for large-scale seismic gathers. During the prediction phase, the patches were stitched together using a 25% overlap ratio, with weighted-average fusion in overlapping regions, to rebuild the entire gather.
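The learning-rate schedule and stopping rule above can be expressed compactly. This is a framework-agnostic sketch of the stated hyperparameters (cycle of 50 epochs, 10⁻⁴ to 10⁻⁶, patience of 20 epochs with a 10⁻⁶ threshold); the function names are our own, not from the paper:

```python
import math

def cosine_annealing_lr(epoch, cycle=50, lr_max=1e-4, lr_min=1e-6):
    """Learning rate at a given epoch under cosine annealing with warm restarts."""
    t = epoch % cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / cycle))

def should_stop(val_losses, patience=20, min_delta=1e-6):
    """Early stopping: no improvement larger than min_delta for `patience` epochs."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return best_before - recent_best < min_delta

assert abs(cosine_annealing_lr(0) - 1e-4) < 1e-12                        # cycle start: max LR
assert abs(cosine_annealing_lr(25) - (1e-6 + 0.5 * (1e-4 - 1e-6))) < 1e-12  # midpoint
assert not should_stop([1.0, 0.5, 0.4])                                  # still improving
assert should_stop([0.5] + [0.4] * 25)                                   # flat for 25 epochs
```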
The loss function measures the reconstruction error used to obtain the optimal network parameters. The network is considered to have effectively reconstructed the complete signal when the error drops sufficiently or converges within a predetermined threshold, accomplishing the goal of reconstructing seismic data. The quality of the reconstructed seismic data can then be assessed using metrics such as Signal-to-Noise Ratio (SNR), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index (SSIM). Algorithm 1 summarizes the reconstruction process.
| Algorithm 1: Dual-Neural Network Training Process |
| Input: W (Complete data); WR (Receiver domain seismic data); WS (Shot domain seismic data); CR, CS (Trace and shot missing masks); L (Loss function); N = (4, 5, 6, 7, 8) (Candidate set for network depths); Pin, Ptar, Pr (Missing patches, target patches and reconstructed patches); t (Parameters of traditional slicing methods); T (Moveout window parameters {v1, t1, v2, t2}); WS_in (Reconstructed trace data sorted into the receiver domain) |
| Output: MR* (Optimal seismic trace reconstruction model); MS* (Optimal shot reconstruction model); Wr (Reconstructed data) |
| 1. Data preprocessing and domain transformation: |
| 2. -Complete seismic data |
| 3. -Sort into receiver domain seismic data |
| 4. -Sort into shot domain seismic data |
| 5. -Both missing trace and missing shot data |
| 6. Training of the reconstruction model based on missing trace data: |
| 7. for n ∈ N do -The current number of network layers being attempted |
| 8. MR = MR(n) -Initialize the trace reconstruction model with n layers |
| 9. Pin = (Q, t) -Slice the missing data by traditional method |
| 10. Ptar = (WR, T) -To set the Moveout window and extract the patches inside of it, use (v1, t1, v2, t2) |
| 11. Pr = MR(n)(Pin) -To obtain the reconstructed patch, move forward along the network’s depth |
| 12. loss = L(Pr, Ptar) -Calculate the loss function |
| 13. end for |
| 14. update (MR(n), loss) -Evaluate and document the performance of the current deep model |
| 15. MR* = MR(nR*) -Obtain the optimal missing trace reconstruction model |
| 16. WRr = (MR*, T) -Reconstruct the missing trace data by applying the optimal model |
| 17. Training of the reconstruction model based on missing shot data: |
| 18. WS_in = WRr -Convert the reconstructed trace data into receiver domain data |
| 19. for n ∈ N do -The current number of network layers being attempted |
| 20. MS = MS(n) -Initialize the shot reconstruction model with n layers |
| 21. Pin = (WS_in, t) -Slice the missing data by traditional method |
| 22. Ptar = (WS, T) -To set the Moveout window and extract the patches inside of it, use (v1, t1, v2, t2) |
| 23. Pr = MS(n)(Pin) -To obtain the reconstructed patch, move forward along the network’s depth |
| 24. loss = L(Pr, Ptar) -Calculate the loss function |
| 25. end for |
| 26. update (MS(n), loss) -Evaluate and document the performance of the current deep model |
| 27. MS* = MS(nS*) -Obtain the optimal missing shot reconstruction model |
| 28. WSr = (MS*, T) -Reconstruct the missing shot data by applying the optimal model |
| 29. Post-processing of data |
| 30. Wr = (WRr, WSr) -Integrate the results of the two stages and conduct post-processing |
| 31. Return: MR*, MS*, Wr |
2.2. Modified Method for Dual-Domain Seismic Data Reconstruction
2.2.1. U-Net++ Model for Seismic Data Reconstruction
Existing deep learning-based techniques for reconstructing seismic data frequently build on the U-Net architecture, but they still struggle to extract fine-grained features and capture global contextual information. The U-Net++ design successfully addresses these drawbacks. Its densely nested skip connections allow for more precise boundary recovery in seismic data, as seen in Figure 2. The U-Net++ encoder uses a series of convolutional and pooling layers to accomplish feature down-sampling and abstraction. Convolutional layers apply small kernels (e.g., 3 × 3) that slide locally across the input feature maps; activated by non-linear functions such as ReLU, they extract local spatial information such as the edge structure of seismic events and local waveform morphology. Each kernel generates a response map corresponding to a particular feature pattern. The convolutional outputs are then down-sampled by the pooling layer (usually 2 × 2 max-pooling), which suppresses redundant information while retaining the most prominent features within local regions. This process halves the spatial dimensions of the feature maps, rapidly expands the receptive field, and gradually converts high-resolution information into higher-level semantic representations. Several cascaded "convolution–pooling" blocks make up the encoder as a whole, creating a hierarchical feature-extraction pathway. As the network depth increases, the spatial resolution of the feature maps gradually decreases while their semantic representational capacity strengthens (from local textures to global structures, for example), yielding a low-spatial-resolution, high-semantic-content feature representation. In addition, U-Net++ implements a multi-level feature fusion approach through its densely stacked skip connections.
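The "convolution–pooling" block described above can be demonstrated with plain NumPy (a didactic sketch, not the paper's implementation): a zero-padded 3 × 3 filtering pass, a ReLU activation, and a 2 × 2 max-pool that halves both spatial dimensions of a 64 × 64 patch:

```python
import numpy as np

def conv2d_same(x, k):
    """Naive zero-padded 3x3 filtering, as one encoder kernel would apply."""
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def maxpool2x2(x):
    """2x2 max-pooling: halves both spatial dimensions, keeps the strongest response."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:2 * h, :2 * w].reshape(h, 2, w, 2).max(axis=(1, 3))

x = np.random.rand(64, 64)                  # one 64x64 input patch
k = np.random.rand(3, 3)                    # one 3x3 kernel
feat = np.maximum(conv2d_same(x, k), 0.0)   # convolution + ReLU
pooled = maxpool2x2(feat)

assert pooled.shape == (32, 32)             # spatial dims halved by pooling
```

Stacking four such blocks yields the 16-fold down-sampling that constrains the patch dimensions discussed in Section 2.2.2.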
The encoder gradually extracts features at various scales: deeper layers expand the receptive field to capture macroscopic geological structures (such as stratigraphic morphology and structural outlines), while shallow layers capture local waveform details (such as amplitude variations and noise patterns). Through the dense connections, these features are passed in parallel to every matching decoder level. To ensure that the reconstructed results are both locally consistent with the known data and conform to the overall geological context, the decoder simultaneously integrates structural constraints from deep layers with fine-grained information from shallow layers during the reconstruction of missing data. As a result, it achieves high-precision reconstruction that combines geological plausibility with data fidelity [34,35,36].
Figure 2.
Diagram of forward propagation in the U-Net++ network (Modified from [37]): (a) U-Net++ network structure; (b) Convolution module structure.
2.2.2. Modified Slicing Method for Building Training Set
In the majority of neural networks, the quality of the training set has a direct impact on the deep learning results. In contrast to earlier studies that mostly used image-domain approaches to construct training datasets, this work introduces a selective slicing method during the formation of training samples. The rationale is twofold. First, different seismic wave types in prestack data show distinct spatiotemporal characteristics: direct and refracted waves usually show linear moveout patterns, reflection events follow hyperbolic trajectories, and surface waves exhibit dispersive "fan" or "broom" patterns. Because of these distinctive features, relevant signals can be enhanced through selective reconstruction. Second, there are notable regional differences in the quality of seismic data; for instance, shallow sections frequently exhibit prominent lateral continuity and a high signal-to-noise ratio, while deeper portions typically suffer from degraded data quality and progressively obscured lateral continuity.
In order to improve the depiction of reflection events, a selective slicing approach is suggested for training set building. In particular, the process is set up as follows:
- The Moveout hyperbolic window is bounded by two hyperbolas of the form

t(x) = sqrt(t0² + (x / v)²),

where x represents the offset distance, t0 denotes the two-way travel time of the reflected wave, and v refers to an estimated equivalent layer velocity. The parameter pairs (v1, t1) and (v2, t2) delineate, by means of the two hyperbolas, the region containing high-quality reflection data, thereby achieving the goal of identifying high-quality data.
- Random Slicing: Sampling points are randomly chosen at a predetermined ratio to act as the center points of training patches within the window delineated by velocity and time. Because the deepest layer of the network used in this study performs 16-fold down-sampling relative to the original data, and because U-Net and its variants impose strict requirements on patch dimensions, all input patches must be integer multiples of 16 in each dimension to satisfy the network's input conditions. A boundary constraint mechanism is applied throughout the random sampling procedure to guarantee that the extracted patches stay within the bounds of the original data, and a zero-padding procedure serves as a secondary control to ensure that patch dimensions follow the established guidelines. Combining this strategy with the U-Net++ architecture enables more accurate feature learning and significantly improves reconstruction accuracy.
- Setting Missing Percentage: For the training set with missing traces, scenarios with trace-missing ratios from 10% to 80% were generated at 10-percentage-point intervals, each assigned an equal probability of 0.125. For the training set with missing shots, practical considerations led to scenarios with one to four missing shot gathers, each assigned an equal probability of 0.25. Once the missing ratio has been drawn, traces inside the patch are selected at random and their amplitudes set to zero. The randomly corrupted patch and its corresponding original patch then form a paired training sample.
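The three steps above can be sketched together in NumPy. This is an illustrative sketch under assumed geometry (offset range, window velocities/times, and patch size are hypothetical): a boolean mask selects the region between the two moveout hyperbolas, and a helper zeroes a randomly drawn fraction of traces to build a training pair:

```python
import numpy as np

def moveout_window_mask(offsets, times, v1, t1, v2, t2):
    """Boolean mask of samples lying between the two moveout hyperbolas
    t_j(x) = sqrt(t_j**2 + (x / v_j)**2), j = 1, 2 (upper and lower bounds)."""
    x, t = np.meshgrid(offsets, times, indexing="ij")
    upper = np.sqrt(t1**2 + (x / v1)**2)
    lower = np.sqrt(t2**2 + (x / v2)**2)
    return (t >= upper) & (t <= lower)

def drop_traces(patch, ratio, rng):
    """Zero out a random fraction of traces (columns) to form a training pair."""
    n_traces = patch.shape[1]
    n_drop = int(round(ratio * n_traces))
    idx = rng.choice(n_traces, size=n_drop, replace=False)
    corrupted = patch.copy()
    corrupted[:, idx] = 0.0
    return corrupted

rng = np.random.default_rng(1)
offsets = np.linspace(0, 3000, 64)            # metres (hypothetical geometry)
times = np.linspace(0, 4, 64)                 # seconds
mask = moveout_window_mask(offsets, times, v1=2000.0, t1=0.5, v2=3000.0, t2=2.5)

patch = rng.standard_normal((64, 64))
ratio = rng.choice([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])  # equal probability
corrupted = drop_traces(patch, ratio, rng)

assert mask.shape == (64, 64)
assert np.count_nonzero(np.all(corrupted == 0.0, axis=0)) == int(round(ratio * 64))
```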
2.3. Data Reconstruction Process
The reconstruction framework comprises two main stages: training and reconstruction. During the training phase, the training set is created by processing data from the shot domain and the receiver domain, respectively, using the slicing method outlined in Section 2.2.2. These data are then fed into the U-Net++ network. Backpropagation minimizes the total loss function, the sum of the four separate loss functions at various levels, until predetermined stopping criteria are satisfied, such as the maximum number of iterations or the error threshold. The result of this process is the final trained model. The reconstruction phase divides the target data into patches of uniform shape; for consistent model application, the patch size must precisely match the dimensions used during training. When a patch extends beyond the edge of the original data, the borders are extended by zero-padding. The trained model reconstructs each patch separately and then smoothly reassembles them into the final data. To reconstruct missing shot data, the data must first be converted into the receiver domain before being fed into the reconstruction model; once reconstruction is finished, the data is converted back into the shot domain for further accuracy evaluation and processing. For details, see Figure 3.
Figure 3.
Workflow of dual-domain seismic data reconstruction based on U-Net++ neural network.
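The patch-wise prediction and reassembly step can be sketched as follows (a simplified sketch with hypothetical sizes; here uniform weights stand in for the weighted-average fusion, so overlaps are averaged per pixel):

```python
import numpy as np

def stitch_patches(shape, patches, origins, patch_size=64):
    """Reassemble overlapping reconstructed patches by per-pixel averaging:
    accumulate values and coverage counts, then normalise."""
    acc = np.zeros(shape)
    weight = np.zeros(shape)
    for p, (i, j) in zip(patches, origins):
        acc[i:i + patch_size, j:j + patch_size] += p
        weight[i:i + patch_size, j:j + patch_size] += 1.0
    weight[weight == 0] = 1.0              # avoid division by zero in uncovered areas
    return acc / weight

# Tile a 128x128 gather with 64x64 patches and a 25% overlap (stride 48).
data = np.random.rand(128, 128)
stride = 48
origins = [(i, j)
           for i in range(0, 128 - 64 + 1, stride)
           for j in range(0, 128 - 64 + 1, stride)]
patches = [data[i:i + 64, j:j + 64] for i, j in origins]
out = stitch_patches(data.shape, patches, origins)

covered = np.zeros(data.shape, dtype=bool)
for i, j in origins:
    covered[i:i + 64, j:j + 64] = True
assert np.allclose(out[covered], data[covered])   # exact patches average back exactly
```

In practice each patch would be a model prediction rather than a copy of the input, and a tapered weight map can replace the uniform counts to suppress seams.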
3. Experiment
3.1. Experimental Setup
The Marmousi2 synthetic seismic dataset is used to test the proposed dual-domain reconstruction method's capacity to restore missing data [38]. The Marmousi2 model has a grid size of 5 m × 5 m, as seen in Figure 4. A Ricker wavelet with a dominant frequency of 15 Hz and a time delay of 0.1 s was used in the forward modeling, which was simulated with a finite-difference approach based on the acoustic wave equation. A representative section between 4000 and 12,000 m, containing both complex and comparatively flat geological features, was chosen for simulation. A total of 201 shot gathers were simulated at a 40 m shot point interval. Receivers deployed bilaterally at 10 m intervals along the surface formed 600 receiver channels. Four seconds of wavefield data were acquired at a 16 ms sampling interval. Shot points were placed between 4000 and 12,000 m, while receivers were placed between 1000 and 15,000 m.
Figure 4.
Marmousi2 forward model (Black triangles indicate the range of simulated shots; red pentagrams mark the test single shot in two geological structures.).
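The source wavelet used in the forward modeling follows the standard Ricker formula w(t) = (1 − 2a)·exp(−a) with a = (π·f0·(t − delay))². A short sketch with the stated parameters (15 Hz dominant frequency, 0.1 s delay, 16 ms sampling):

```python
import numpy as np

def ricker(t, f0=15.0, delay=0.1):
    """Ricker wavelet with dominant frequency f0 (Hz) and time delay (s)."""
    tau = t - delay
    a = (np.pi * f0 * tau) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

dt = 0.016                                    # 16 ms sampling interval
t = np.arange(0.0, 4.0, dt)                   # 4 s of recording
w = ricker(t, f0=15.0, delay=0.1)

assert np.argmax(w) == int(round(0.1 / dt))   # peak sits at the 0.1 s delay
assert abs(ricker(np.array([0.1]))[0] - 1.0) < 1e-12  # unit amplitude at zero lag
```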
To objectively assess the reconstruction model's performance, the Signal-to-Noise Ratio (SNR) is used as the assessment metric, treating the difference between the reconstructed and original data as noise. It is calculated as:

SNR = 10 · log10( E(S²) / E((D − S)²) ),

where D stands for the reconstructed data, E for the mathematical expectation, and S for the original data. A higher SNR value indicates better reconstruction performance.
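This metric translates directly into a few lines of NumPy (a residual that is an exact 10% of the signal yields 20 dB, which makes a convenient sanity check):

```python
import numpy as np

def snr_db(original, reconstructed):
    """SNR treating the reconstruction residual as noise:
    SNR = 10 * log10( E[S^2] / E[(D - S)^2] )."""
    noise = reconstructed - original
    return 10.0 * np.log10(np.mean(original**2) / np.mean(noise**2))

rng = np.random.default_rng(0)
s = rng.standard_normal(10_000)

assert snr_db(s, s + 0.1 * s) > snr_db(s, s + 0.5 * s)   # smaller residual, higher SNR
assert abs(snr_db(s, s + 0.1 * s) - 20.0) < 1e-9         # residual = 0.1*S -> 20 dB
```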
3.2. Establishment of Dataset and Experimental Arrangement
The seismic data is initially divided based on the standards listed in Table 1 in order to create the training, validation, and test sets. The flexibility and reliability of the suggested method are then tested under various data-loss scenarios by training dual-neural network models (a trace-reconstruction model and a shot-reconstruction model) to handle the two distinct missing-data scenarios—missing traces and missing shots. In order to fill in the gaps in both domains, these two models are finally integrated.
Table 1.
Experimental data settings.
3.2.1. Missing Trace Reconstruction Model Testing
For the missing-trace scenario, traces were randomly removed from the test dataset at 20%, 40%, 60%, and 80%, as shown in Figure 5a–d. The corrupted data were then fed into the trained model to obtain the reconstructed results displayed in Figure 5e–h. Finally, residual sections were produced by subtracting the reconstructed data from the original data, as shown in Figure 5i–l. Analysis of the reconstructed data and the associated residual sections shows that the results generated by the proposed method are consistent with the original data. Major reflection events and fine-scale features are restored with high fidelity; at lower missing ratios, such as 20% and 40%, the reconstructed data exhibits almost no difference from the original. At larger missing ratios of 60% and 80%, some amplitude disparities become noticeable, most prominently at 80% trace loss. Even in the 80% scenario, however, the main structural characteristics are successfully recovered, although finer details are considerably blurred. The computed Signal-to-Noise Ratio (SNR) values for the four missing-ratio scenarios (see Table 2) all show a notable improvement, further supporting the reliability of the proposed approach.
Figure 5.
Diagrams of missing parts, reconstructions and differences at different proportions: (a–d) missing seismic trace data at 20%, 40%, 60% and 80%, respectively; (e–h) reconstructed data for traces missing at 20%, 40%, 60% and 80%, respectively; (i–l) residuals between the reconstructed and original data at 20%, 40%, 60%, and 80% missing, respectively.
Table 2.
SNR of missing and reconstructed data.
3.2.2. Missing Shot Data Reconstruction Model Testing
To assess the dependability of the proposed shot data reconstruction method, two common-shot gathers with different structural complexity, located at 4600 m and 7000 m in the synthetic dataset, are reconstructed and examined (see Figure 6). As part of the missing-shot reconstruction procedure, the shot-domain data are first converted into the receiver domain. In Figure 6, (a) shows the original data, (b) the missing shot data, and (c) the reconstruction produced by the proposed approach. The difference section in (d) shows that the majority of missing traces have been successfully recovered throughout the receiver domain, with only the shallow high-energy area (0–1 s) exhibiting discernible amplitude inconsistencies. Although local artifacts may appear in the receiver-domain reconstruction, they have distinguishable features and are greatly reduced in later processing stages such as normal moveout (NMO) correction and stacking, ensuring the reliability of the final stacked section.
Figure 6.
Reconstruction results of the receiver domain: (a) Original data; (b) Missing data; (c) Reconstructed data; (d) Residual data.
As illustrated in Figure 7, the reconstructed data are converted back to the shot domain to verify the reconstruction performance for missing single shot gathers in geological formations of differing complexity. The original data are shown in (a) and (d), and the reconstructed single-shot data in (b) and (e). The principal reflection events are successfully recovered within 3 s, and the fine-scale features within this time frame also show a respectable degree of recovery. As seen in Figure 7c,f, subtracting the reconstructed result from the original data reveals amplitude differences and a detectable loss of fine-scale characteristics; however, these differences do not significantly affect the final stacked section. The domain transformation procedure is thought to cause these problems, since it may magnify errors made during receiver-domain reconstruction, producing noticeable amplitude disparities and loss of fine-scale features in the final shot-domain results. Overall, the proposed method performs better in flat areas, most likely because the reflection patterns there are more consistent and regular. In contrast, complicated geological features often produce dispersed and irregular seismic responses, which reduce the accuracy and clarity of fine-scale feature recovery.
Figure 7.
The reconstruction test results in the case of 4600 m (a–c) and 7000 m (d–f) missing shot data: At 4600 m: (a) Original data; (b) Reconstruct the shot data; (c) Residual data. At 7000 m: (d) Original data; (e) Reconstruct the shot data; (f) Residual data.
3.2.3. Dual-Domain Reconstruction Test
Dual-domain reconstruction is a methodology designed to handle seismic data with simultaneous trace and shot-gather losses. The strategy proceeds as follows: first, the missing seismic traces are reconstructed, without addressing the missing shot gathers at this stage. Upon completion of trace reconstruction, the repaired data are transformed into the receiver domain, and the missing shot gathers are reconstructed there. Finally, the reconstructed shot gathers are inserted back into the restored data. The specific workflow is illustrated in Figure 8. By making full use of both reconstruction models, this method greatly improves the accuracy and completeness of the reconstructed data.
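The four-step strategy above reduces to a short orchestration function. A minimal sketch in which `trace_model` and `shot_model` are stand-ins for the two trained U-Net++ models (the identity functions used in the example are purely for illustration):

```python
import numpy as np

def dual_domain_reconstruct(data, trace_model, shot_model):
    """Dual-domain strategy sketch: (1) repair missing traces in the shot domain,
    (2) resort to the receiver domain, (3) repair missing shot gathers there,
    (4) resort back to the shot domain."""
    traces_fixed = trace_model(data)                 # shot-domain trace repair
    receiver_dom = traces_fixed.transpose(1, 0, 2)   # (shot, rec, t) -> (rec, shot, t)
    shots_fixed = shot_model(receiver_dom)           # receiver-domain shot repair
    return shots_fixed.transpose(1, 0, 2)            # back to the shot domain

# With identity "models" the pipeline must return the input unchanged.
cube = np.random.rand(5, 8, 100)                     # (shots, receivers, time)
out = dual_domain_reconstruct(cube, lambda x: x, lambda x: x)
assert np.array_equal(out, cube)
```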
Figure 8.
Dual-domain missing reconstruction strategy.
To confirm the methodology's dependability, a test was carried out using 100 shot gathers from the 4000–8000 m segment of the synthetic dataset, which includes both smooth and intricate geological features. After adding 10% random noise to the synthetic data, 50% of the traces and 5% of the shot gathers were randomly removed to mimic actual acquisition conditions. In Figure 9, (a) shows the artificially corrupted data, (b) the original complete data, and (c) the reconstructed data. The notable lateral reflection events in the reconstructed shot gather have been successfully recovered, especially in the shallow area around 1.2 s, although Figure 9d shows some amplitude inconsistencies. The primary coherent events over the deeper interval (2–3 s) are adequately recovered and closely resemble the original data. Overall, the reconstructed shot data show excellent consistency with the original data. The results in Figure 9 demonstrate that the proposed dual-domain seismic data reconstruction method is reliable both for reconstructing missing shot gathers in the receiver domain and for restoring missing traces.
Figure 9.
Experimental results of dual-domain missing-data reconstruction: (a) Data with 50% of the traces and 5% of the shot gathers missing; (b) Original data; (c) Dual-domain reconstructed data; (d) Residual data.
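The corrupted input for this test can be generated as follows. The exact corruption procedure is not specified in detail, so this sketch makes assumptions: the 10% noise is taken as zero-mean Gaussian noise scaled to 10% of the peak amplitude, and missing traces/shots are simulated by zeroing.

```python
import numpy as np

def corrupt(data, trace_frac=0.5, shot_frac=0.05, noise_frac=0.10, seed=0):
    """Illustrative recreation of the test setup: add random noise scaled
    to a fraction of the peak amplitude, then zero out random traces and
    random whole shot gathers."""
    rng = np.random.default_rng(seed)
    noisy = data + noise_frac * np.abs(data).max() * rng.standard_normal(data.shape)

    n_shots, n_recv, _ = data.shape
    dead_shots = rng.choice(n_shots, size=int(round(shot_frac * n_shots)),
                            replace=False)
    noisy[dead_shots] = 0.0

    # Independent random trace decimation within each surviving shot gather.
    for s in range(n_shots):
        if s in dead_shots:
            continue
        dead_traces = rng.choice(n_recv, size=int(round(trace_frac * n_recv)),
                                 replace=False)
        noisy[s, dead_traces] = 0.0
    return noisy, dead_shots
```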
As illustrated in Figure 10 and Figure 11, velocity analysis and stacking were carried out on the corrupted, reconstructed, and original data under identical processing conditions to assess the performance of the dual-domain reconstructed data in stacked sections. As shown in Figure 10, the velocity spectrum of the corrupted data at CDP 461 exhibits significantly scattered energy due to reduced fold coverage, whereas the spectrum of the dual-domain reconstructed data is remarkably similar to that of the original data, with minimal difference in the concentration of energy clusters. Furthermore, the dual-domain reconstructed super gather closely resembles the original data, confirming the accuracy and reliability of the dual-domain reconstruction strategy.
Figure 10.
Velocity analysis and super gathers at CDP 461: (a) Data with 50% of the seismic traces and 5% of the shot gathers missing; (b) Dual-domain reconstructed data; (c) Original data.
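The "concentration of energy clusters" compared here comes from a velocity spectrum. The paper does not specify its velocity-analysis implementation; a standard choice is the NMO semblance measure, sketched below (function names and smoothing window are illustrative assumptions).

```python
import numpy as np

def semblance(gather, offsets, dt, velocities, win=5):
    """Standard NMO semblance velocity spectrum (illustrative sketch).
    gather: (nt, nx) CMP gather; offsets in m; dt in s."""
    nt, nx = gather.shape
    t0 = np.arange(nt) * dt
    spec = np.zeros((nt, len(velocities)))
    k = np.ones(win) / win  # short time-smoothing window
    for j, v in enumerate(velocities):
        # Hyperbolic travel time for every zero-offset time and offset.
        t = np.sqrt(t0[:, None] ** 2 + (offsets[None, :] / v) ** 2)
        idx = np.clip(np.round(t / dt).astype(int), 0, nt - 1)
        corrected = gather[idx, np.arange(nx)[None, :]]        # (nt, nx)
        num = np.sum(corrected, axis=1) ** 2                   # stacked energy
        den = nx * np.sum(corrected ** 2, axis=1) + 1e-12      # total energy
        spec[:, j] = (np.convolve(num, k, mode="same")
                      / np.convolve(den, k, mode="same"))
    return spec
```

Missing traces reduce the fold entering `num` and `den`, which is exactly why the corrupted data in Figure 10 show scattered, weakly focused energy clusters.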
Figure 11.
Stacked section from 4000 to 8000 m: (a) Data with 50% of the seismic traces and 5% of the shot gathers missing; (b) Dual-domain reconstructed data; (c) Original data; (d) Residual data. (The red circles A and B mark the detailed feature comparisons among the three cases, and the red triangle marks the location of the missing shot gather).
As is clear from the stacked sections in Figure 11, the differences among the three scenarios are mostly concentrated within the intervals of 0.7–1.5 s and 2.5–3.5 s (regions A and B in the figure). In regions A and B, the stacked section of the corrupted data exhibits much poorer reflection continuity, with hardly perceptible reflection features. The dual-domain reconstructed data align closely with the original data, successfully restoring the missing traces and recovering fine-scale characteristics in these regions with high fidelity. A side-by-side comparison shows that shot gather No. 75 has discontinuous reflections at around 1.2 s (Figure 11, Region B). Consistent with the degraded shot gather reconstruction performance in structurally complex regions discussed in Section 3.2.2, this artifact may be attributed to complex local geology, which impairs receiver-domain reconstruction quality. Nevertheless, the overall stacked section remains geologically consistent with the Marmousi2 model boundaries depicted in Figure 4.
In conclusion, the proposed dual-domain reconstruction approach largely accomplishes the desired goals: in scenarios with missing data in both domains, it recovers both major reflection features and fine-scale details with high precision, achieving performance very close to that of the original complete data.
4. Analysis and Discussion
Two aspects are examined to verify the efficacy of the proposed method: first, the Signal-to-Noise Ratio (SNR) improvement attained by the selective patching strategy; and second, the effect of the order of trace reconstruction and shot gather reconstruction in the dual-domain workflow on the final reconstruction accuracy.
4.1. The Impact of the Slicing Method on Reconstruction Performance
To verify the superiority of the model trained with the selective slicing method, reconstruction tests are conducted on forward-modeled shot gathers at two geologically complex locations: 5000 m (Figure 12a–c) and 7120 m (Figure 12d–f). Random noise with an amplitude of 40% of the maximum signal is added to these data. The seismic data with 50% randomly missing traces and noise contamination are shown in Figure 12a. Table 3 summarizes the training parameters of all models.
Figure 12.
Reconstructed data at positions with 50% of the traces missing: At 4600 m: (a) Missing data with noise; (b) Reconstruction by the traditional method; (c) Reconstruction by our method. At 7120 m: (d) Missing data with noise; (e) Reconstruction by the traditional method; (f) Reconstruction by our method.
Table 3.
Model Training Parameters.
Traditional deep learning techniques for seismic data reconstruction mostly adopt a single-channel training-set construction method carried over from image processing: raw seismic data are cut into patches of a predetermined size with a continuous stride. For shot gathers whose dimensions are not integer multiples of the patch size, zero-padding or border extension is applied on one or both sides to simplify partitioning. As seen in Figure 12b, the results of the traditional method show that although the SNR is improved from 1.24 to 3.35, using all available data for training does not yield optimal outcomes. The fundamental reason is that the training patches contain a considerable amount of low signal-to-noise ratio (SNR) data, such as deep sections or regions above the first arrivals, which significantly degrades training efficacy.
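The conventional patching scheme just described can be sketched as follows; patch size and stride values are illustrative, not the settings of Table 3.

```python
import numpy as np

def slice_patches(gather, patch=64, stride=32):
    """Conventional single-channel patching: zero-pad the gather so both
    axes are compatible with the stride, then cut overlapping patches."""
    nt, nx = gather.shape
    # Pad so a whole number of strides fits along each axis.
    pad_t = (-(nt - patch) % stride) if nt > patch else patch - nt
    pad_x = (-(nx - patch) % stride) if nx > patch else patch - nx
    g = np.pad(gather, ((0, pad_t), (0, pad_x)))
    patches = [g[i:i + patch, j:j + patch]
               for i in range(0, g.shape[0] - patch + 1, stride)
               for j in range(0, g.shape[1] - patch + 1, stride)]
    return np.stack(patches)
```

Because every region of the padded gather is sliced indiscriminately, the training set inevitably includes deep low-SNR patches and near-empty patches above the first arrivals.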
The training set was instead built with the selective slicing method described in Section 2.2.2, which prioritizes data inside hyperbolic Moveout windows containing the most distinctive seismic features. This method concentrates on high-SNR segments to improve learning efficiency and reconstruction accuracy. Although the number of training samples drops from 135,480 to 88,440, the retained patches are mostly of higher quality. As illustrated in Figure 12c, the reflection events of the missing traces are largely recovered. Moreover, compared with the conventional strategy of using all data for training, the new method more successfully retrieves meaningful signals previously obscured by noise, recovering fine details within the 0–2.5 s range that the conventional approach reconstructed poorly. However, reconstruction of deeper sections (below 2.5 s) remains limited. Notably, the SNR increases from 3.35 to 5.92 relative to the conventional method. By removing low-quality patches outside the designated windows, the proposed method improves reconstruction of the target interval (0–2.5 s) in both geological settings.
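The window test behind the selective slicing can be sketched as a filter on patch centers. The exact window definition belongs to Section 2.2.2, so the bounding hyperbolae and the reference velocity below are assumptions for illustration.

```python
import numpy as np

def in_moveout_window(t, x, t0_min, t0_max, v):
    """Keep samples between two hyperbolae t(x) = sqrt(t0^2 + (x/v)^2);
    t0_min/t0_max bound the zero-offset time window and v is an assumed
    reference velocity (the exact window of Section 2.2.2 may differ)."""
    t_lo = np.sqrt(t0_min ** 2 + (x / v) ** 2)
    t_hi = np.sqrt(t0_max ** 2 + (x / v) ** 2)
    return (t >= t_lo) & (t <= t_hi)

def selective_patches(patches, centers, t0_min, t0_max, v):
    """Retain only patches whose center (time, offset) falls inside the
    high-SNR Moveout window."""
    keep = [p for p, (t, x) in zip(patches, centers)
            if in_moveout_window(t, x, t0_min, t0_max, v)]
    return np.stack(keep) if keep else np.empty((0,) + patches[0].shape)
```

Discarding the out-of-window patches is what shrinks the sample count (here from 135,480 to 88,440) while raising the average quality of what the network sees.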
We also investigated whether the choice of the Moveout window significantly affects the reconstruction results. A sensitivity analysis was carried out, as shown in Figure 13. Comparing the Signal-to-Noise Ratio (SNR) of the reconstructed results (Figure 13d–f) after varying the velocity and time intervals (Figure 13a–c) shows that these variations do not significantly affect reconstruction quality, with all cases achieving comparably high recovery levels.
Figure 13.
Sensitivity test of the Moveout window on reconstruction results: (a–c) Selection of the moveout-window ranges; (d–f) Reconstructed results and their corresponding Signal-to-Noise Ratios (SNR).
To verify its efficacy and practical applicability, we applied the proposed reconstruction approach to field seismic data and compared its performance with a conventional Curvelet-transform-based reconstruction method (results in Figure 14). Under simulated scenarios of continuous multi-trace missing data (Figure 14a–c), the proposed method recovers valid signals in the missing zones more completely (Figure 14g–i) than the conventional approach (Figure 14d–f), significantly improving the structural continuity and spatial coherence of the reconstructed data while exhibiting notable noise suppression. These findings provide further evidence that the proposed approach generalizes reliably to real seismic data reconstruction tasks.

Figure 14.
Field seismic data with missing traces and the corresponding reconstruction results: (a–c) field data with 5, 10, and 20 consecutively missing traces; (d–f) reconstruction results of the Curvelet method; (g–i) reconstruction results of the method proposed in this paper.
4.2. The Impact of the Dual-Domain Reconstruction Order on the Results
The order in which missing shot gathers and missing traces are reconstructed in the dual-domain method may have a major impact on the final result. This is especially important when trace losses are substantial, since these losses also appear as significant gaps in the receiver domain. The quality of the receiver-domain reconstruction directly affects the subsequent shot gather reconstruction; inaccuracies at this stage may propagate errors or jeopardize the reliability of the final recovered shot data.
Moreover, when missing shot gathers and missing traces occur simultaneously, their locations may overlap: the shot gathers to be reconstructed may coincide with receiver-domain gaps. Such overlap makes the reconstruction process even more unpredictable and may compromise the accuracy of the result.
Figure 15 shows the dual-domain reconstruction results for five missing shots, various percentages of missing traces, and randomly distributed noise with an amplitude of 10% of the maximum signal. When the proportion of missing traces is relatively low (Figure 15a–c), the technique successfully recovers the principal reflection events and some fine-scale characteristics within the 0.5–3 s interval, displaying good reconstruction performance. As illustrated in Figure 15d, when the percentage of missing traces reaches 50%, notable discontinuities appear in the reflection events within the 1.5–3 s interval. Additionally, the energy of the shot gather reconstructed in this case is noticeably lower.
Figure 15.
Dual-domain reconstruction of seismic traces with different missing rates when five shot gathers are absent: (a) 5% of the seismic traces missing; (b) 10% of the seismic traces missing; (c) 20% of the seismic traces missing; (d) 50% of the seismic traces missing.
As noted above, the strategy of reconstructing shot gathers first and traces second produces notable discontinuities in the final result when the percentage of missing traces reaches 50%, as illustrated in Figure 16a. Consequently, the same incomplete data were processed by first reconstructing traces and then shot gathers, as shown in Figure 16b. Comparative analysis shows that the latter sequence effectively restores the majority of reflection features, especially in the areas marked by red circles. Finally, the Signal-to-Noise Ratio (SNR) was computed for the four missing-data cases, as shown in Table 4. The results show that substantially higher SNR values, and hence higher-quality dual-domain reconstructed data, are obtained with the trace-first, shot-second reconstruction sequence.
Figure 16.
Single-shot gather reconstruction with 5 shot gathers and 50% of the seismic traces missing: (a) Shot gathers reconstructed first, then traces; (b) Traces reconstructed first, then shot gathers. (The red circles mark the detailed feature comparisons between the two cases).
Table 4.
Signal-to-noise ratio of dual-domain reconstructed data under different missing ratios.
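The SNR values used to compare the two reconstruction orders are presumably computed with the definition standard in interpolation studies; a minimal sketch under that assumption:

```python
import numpy as np

def snr_db(clean, estimate):
    """Reconstruction-quality metric commonly used in seismic
    interpolation: SNR = 10*log10(||clean||^2 / ||clean - estimate||^2),
    in dB. Higher is better; identical arrays give +inf."""
    err = np.linalg.norm(clean - estimate) ** 2
    return 10.0 * np.log10(np.linalg.norm(clean) ** 2 / err)
```

Evaluating `snr_db` against the original complete data for each missing-rate case and each reconstruction order reproduces the kind of comparison tabulated in Table 4.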
As noted in this study, the impact of the reconstruction order on result quality has only undergone preliminary validation and does not yet represent a definitive conclusion; it will be examined further in future work.
5. Conclusions
This paper proposes a dual-domain reconstruction technique to handle scenarios with both partially missing traces and entirely missing shot gathers, addressing the problem of irregular seismic data arising from various causes in field operations. After validation with simulated data, the following conclusions are reached:
- (i)
- The selective slicing method based on the Moveout window enables faster model training and improved reconstruction accuracy. The dual-domain reconstruction method, built on dual neural networks, achieves high reconstruction accuracy while preserving data integrity.
- (ii)
- In the shot domain, the dual-domain reconstruction method shows a remarkable capacity to attenuate noise and recover reflection events from missing traces with high precision. In the receiver domain, the reconstructed missing shot data successfully recover the principal reflection events and achieve moderate restoration of fine-scale characteristics in the shallow regions.
- (iii)
- The processed dual-domain reconstructed data faithfully restore subsurface structural information in the stacked sections. Moreover, the optimal workflow of the dual-domain strategy, reconstructing missing traces before missing shot gathers, has been identified and verified.
In summary, the dual-domain seismic data reconstruction based on U-Net++ has demonstrated promising performance in simulated data experiments. This methodology offers a novel and effective approach for addressing dual-domain data missing in real-world seismic exploration caused by various operational constraints. Future research can further investigate its applicability and effectiveness in practical field data reconstruction scenarios.
Author Contributions
Conceptualization, methodology, W.F., B.L. and E.L.; writing—review and editing, P.Z., X.F. and T.Z.; methodology, data curation, writing—original draft preparation, E.L. and F.Z.; supervision, T.H., Z.Z., C.W. and P.J. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China (grant number 42104099), the Jiangsu Provincial Youth Elite Scientists Sponsorship Program (2025), and the Jiangsu Province Industry-University-Research Cooperation Project (BY2025).
Data Availability Statement
The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.
Acknowledgments
The authors express their gratitude to all the teachers in the research group for their revision suggestions and to the anonymous reviewers for their useful advice.
Conflicts of Interest
Authors Tiantian Hu and Pengcheng Jiang were employed by the company Nanjing Shanhai Engineering Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
- Li, Z.; Wang, J.; Sun, M.; Li, Z.; Cao, G.; Li, X.; Yong, P. Strategy of seismic data regularized reconstruction. J. China Univ. Pet. (Ed. Nat. Sci.) 2018, 42, 40–49. [Google Scholar]
- Yi, J.; Zhang, M.; Li, Z.; Li, K. Review of deep learning seismic data reconstruction methods. Prog. Geophys. 2023, 38, 361–381. [Google Scholar]
- Sun, M.; Li, Z.; Li, Z.; Li, Q.; Li, C.; Zhang, H. Reconstruction of seismic data with weighted MCA based on compressed sensing. Chin. J. Geophys. 2019, 62, 1007–1021. (In Chinese) [Google Scholar] [CrossRef]
- Ma, J.W.; Plonka, G.; Chauris, H. A New Sparse Representation of Seismic Data Using Adaptive Easy-Path Wavelet Transform. IEEE Geosci. Remote Sens. Lett. 2010, 7, 540–544. [Google Scholar] [CrossRef]
- Naghizadeh, M.; Innanen, K.A. Seismic data interpolation using a fast generalized Fourier transform. Geophysics 2011, 76, V1–V10. [Google Scholar] [CrossRef]
- Trad, D.O.; Ulrych, T.J.; Sacchi, M.D. Accurate interpolation with high-resolution time-variant Radon transforms. Geophysics 2002, 67, 644–656. [Google Scholar] [CrossRef]
- Herrmann, F.J.; Hennenfent, G. Non-parametric seismic data recovery with curvelet frames. Geophys. J. Int. 2008, 173, 233–248. [Google Scholar] [CrossRef]
- Liu, Y.; Fomel, S. OC-seislet: Seislet transform construction with differential offset continuation. Geophysics 2010, 75, WB235–WB245. [Google Scholar] [CrossRef]
- Shao, J.; Wang, Y.B.; Chang, X. Radon domain interferometric interpolation of sparse seismic data. Geophysics 2021, 86, WC89–WC104. [Google Scholar] [CrossRef]
- Gao, J.J.; Sacchi, M.D.; Chen, X.H. A fast reduced-rank interpolation method for prestack seismic volumes that depend on four spatial dimensions. Geophysics 2013, 78, V21–V30. [Google Scholar] [CrossRef]
- Ma, J.W. Three-dimensional irregular seismic data reconstruction via low-rank matrix completion. Geophysics 2013, 78, V181–V192. [Google Scholar] [CrossRef]
- Spitz, S. Seismic Trace Interpolation in the F-X Domain. Geophysics 1991, 56, 785–794. [Google Scholar] [CrossRef]
- Cao, J.J.; Cai, Z.C.; Liang, W.Q. A novel thresholding method for simultaneous seismic data reconstruction and denoising. J. Appl. Geophys. 2020, 177, 104027. [Google Scholar] [CrossRef]
- Cui, J.; Yang, P.; Wang, H.; Bian, C.; Hu, Y.; Pan, H. Research on automatic picking of seismic velocity spectrum based on deep learning. Chin. J. Geophys. 2022, 65, 4832–4845. (In Chinese) [Google Scholar] [CrossRef]
- Song, C.; Alkhalifah, T. Wavefield Reconstruction Inversion via Physics-Informed Neural Networks. arXiv 2021, arXiv:2104.06897. [Google Scholar] [CrossRef]
- Huangfu, M. Reconsitution of Irregular Seismic Data Based on Recurrent Neural Network. Master’s Thesis, Harbin Institute of Technology, Harbin, China, 2021. [Google Scholar]
- Yoon, D.; Yeeh, Z.; Byun, J. Seismic Data Reconstruction Using Deep Bidirectional Long Short-Term Memory with Skip Connections. IEEE Geosci. Remote Sens. Lett. 2021, 18, 1298–1302. [Google Scholar] [CrossRef]
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets; MIT Press: Cambridge, MA, USA, 2014. [Google Scholar]
- Oliveira, D.A.B.; Ferreira, R.S.; Silva, R.; Brazil, E.V. Interpolating Seismic Data with Conditional Generative Adversarial Networks. IEEE Geosci. Remote Sens. Lett. 2018, 15, 1952–1956. [Google Scholar] [CrossRef]
- Wei, Q.; Li, X.Y.; Song, M.P. De-aliased seismic data interpolation using conditional Wasserstein generative adversarial networks. Comput. Geosci. 2021, 154, 104801. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition; IEEE: New York, NY, USA, 2016. [Google Scholar]
- Gao, Z.M.; Chen, H.L.; Li, Z.; Ma, B.L. Multiscale Residual Convolution Neural Network for Seismic Data Denoising. IEEE Geosci. Remote Sens. Lett. 2024, 21, 7502505. [Google Scholar] [CrossRef]
- Tang, S.H.; Ding, Y.S.; Zhou, H.W.; Zhou, H. Reconstruction of Sparsely Sampled Seismic Data via Residual U-Net. IEEE Geosci. Remote Sens. Lett. 2022, 19, 7500605. [Google Scholar] [CrossRef]
- Wang, B.F.; Zhang, N.; Lu, W.K.; Wang, J.L. Deep-learning-based seismic data interpolation: A preliminary result. Geophysics 2019, 84, V11–V20. [Google Scholar] [CrossRef]
- Liu, Q.; Fu, L.H.; Zhang, M. Deep-seismic-prior-based reconstruction of seismic data using convolutional neural networks. Geophysics 2021, 86, V131–V142. [Google Scholar] [CrossRef]
- Wu, G.; Liu, Y.; Liu, C.; Zheng, Z.S.; Cui, Y. Seismic data interpolation using deeply supervised U-Net plus plus with natural seismic training sets. Geophys. Prospect. 2023, 71, 227–244. [Google Scholar] [CrossRef]
- Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support; Springer International Publishing: Cham, Switzerland, 2018. [Google Scholar]
- Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; Mcdonagh, S.; Hammerla, N.Y.; Kainz, B. Attention U-Net: Learning Where to Look for the Pancreas. arXiv 2018, arXiv:1804.03999. [Google Scholar] [CrossRef]
- Zhu, Y.F.; Cao, J.J.; Yin, H.J.; Zhao, J.T.; Gao, K.F. Seismic Data Reconstruction based on Attention U-net and Transfer Learning. J. Appl. Geophys. 2023, 219, 105241. [Google Scholar] [CrossRef]
- Wang, B.F.; Zhang, N.; Lu, W.K.; Geng, J.H.; Huang, X.Y. Intelligent Missing Shots’ Reconstruction Using the Spatial Reciprocity of Green’s Function Based on Deep Learning. IEEE Trans. Geosci. Remote Sens. 2020, 58, 1587–1597. [Google Scholar] [CrossRef]
- Chai, X.; Tang, G.; Wang, S.; Lin, K.; Peng, R. Deep Learning for Irregularly and Regularly Missing 3-D Data Reconstruction. IEEE Trans. Geosci. Remote Sens. 2021, 59, 6244–6265. [Google Scholar] [CrossRef]
- Tian, Y.; Li, X.; Li, Z. 3D Seismic Data Reconstruction Based on UNet in Different Domains. Chin. J. Eng. Geophys. 2024, 21, 506–517. [Google Scholar]
- Iyyanar, G.; Gunasekaran, K.; George, M. Hybrid Approach for Effective Segmentation and Classification of Glaucoma Disease Using UNet and CapsNet. Rev. d’Intell. Artif. 2024, 38, 613. [Google Scholar] [CrossRef]
- Jiang, H.; Zhao, J. Multi-lesion Segmentation of Fundus Images using Improved UNet. IAENG Int. J. Comput. Sci. 2024, 51, 1587. [Google Scholar]
- Jiao, J.; Zeng, X.C.; Liu, H.; Yu, P.; Lin, T.; Zhou, S. Three-Dimension Inversion of Magnetic Data Based on Multi-Constraint UNet++. Appl. Sci. 2024, 14, 5730. [Google Scholar] [CrossRef]
- Li, M.; Yan, X.S.; Hu, C.Y. A self-supervised missing trace interpolation framework for seismic data reconstruction. Earth Sci. Inform. 2024, 17, 5991–6017. [Google Scholar] [CrossRef]
- Martin, G.S.; Wiley, R.; Marfurt, K.J. Marmousi2: An elastic upgrade for Marmousi. Lead. Edge 2006, 25, 156–166. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.