1. Introduction
Against the backdrop of growing global demand for detailed exploration of complex geological structures, diffraction imaging is receiving increasing attention. Compared with conventional reflection-based seismic imaging, diffraction imaging can overcome the Fresnel-zone limit and achieve super-resolution results. Diffraction separation is the foundation of diffraction imaging and is therefore of critical importance. Current diffraction-separation techniques often require preprocessing steps; for instance, plane-wave destruction filtering and common-reflection-surface stacking require post-stack data, while dip-filtering methods require common-image gathers. In contrast, the apex-shifted hyperbolic Radon transform (ASHRT) can perform separation directly on shot gathers, which not only streamlines the separation process but also benefits subsequent processing steps and quality control.
In the 1970s, the Radon transform was first applied to the field of seismic exploration by Claerbout. After more than five decades of development, the Radon transform has been widely utilized to address various challenges in seismic exploration. In standard seismic data processing workflows, using the Radon transform for multiple suppression and post-stack noise attenuation has become an effective means to enhance the quality of seismic data processing. In fact, in recent years, the Radon transform has not only been employed for noise removal and multiple suppression but has also evolved into different variants to solve other problems, such as diffraction and reflection wave separation, simultaneous source separation, de-aliasing, and seismic data reconstruction.
To improve the resolution of the Radon transform, an efficient frequency-domain sparse algorithm was proposed by Sacchi and Ulrych [1]. Over the years, inspired by this idea, researchers have introduced various methods to enhance resolution, which can be mainly categorized into time-domain algorithms [2], frequency-domain algorithms [3,4], and hybrid frequency-time-domain algorithms [5,6], among others.
The first variant of the Radon transform is the parabolic Radon transform, which was introduced by Hampson in 1986. Hampson incorporated the concept of Tikhonov regularization to stabilize the inversion of the parabolic Radon transform [7]. Subsequently, other variants of the Radon transform were proposed, including the hyperbolic Radon transform [8], the apex-shifted hyperbolic Radon transform [9], and the anisotropic Radon transform [10], among others.
The hybrid-domain ASHRT was proposed by Trad et al. in 2003. This method combines the higher resolution of time-domain Radon transforms with the computational efficiency of frequency-domain Radon transforms. Trad et al. performed sparsity-constrained inversion in the time domain while introducing the Stolt operator to bypass the time-varying character of ASHRT and compute it rapidly. This approach significantly reduces the computational cost of ASHRT, making it practical for the massive volumes of seismic data common today. The inversion algorithm itself has also been studied. For instance, in 2013, Lu adopted the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) to solve sparse problems under specific norm constraints. Compared to the Iteratively Reweighted Least Squares (IRLS) method used by Trad, FISTA shows better overall performance for such norm-constrained problems, further enhancing the computational efficiency and sparsity of the Radon transform. In 2017, Gholami and Aghamiry proposed a non-Gaussian-noise version of the iteratively re-weighted and refined least-squares algorithm, which is more suitable for seismic data processing than IRLS [11].
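To make the FISTA idea mentioned above concrete, the following is a minimal sketch of FISTA applied to an L1-regularized least-squares problem. It is not the implementation used in the cited works: the dense matrix `A` stands in for a Radon operator, and the names `fista`, `soft_threshold`, `lam`, and `step` are illustrative assumptions. Pure Python is used only to keep the example dependency-free.

```python
import math


def soft_threshold(x, t):
    """Proximal operator of t*|x|: shrink x toward zero by t."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def matvec(A, x):
    """Forward operator: A applied to the model vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Adjoint operator: A^T applied to the data vector y."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def fista(A, d, lam, step, n_iter=200):
    """Minimise 0.5*||A m - d||_2^2 + lam*||m||_1 with FISTA.

    `step` is assumed to satisfy step <= 1/L, where L is the largest
    eigenvalue of A^T A (the caller must supply a valid value).
    """
    m = [0.0] * len(A[0])
    y = m[:]          # extrapolated point
    t = 1.0           # momentum parameter
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(A, y), d)]
        grad = rmatvec(A, residual)
        # Gradient step followed by soft thresholding (the "shrinkage" step).
        m_new = [soft_threshold(yi - step * gi, step * lam)
                 for yi, gi in zip(y, grad)]
        # Momentum update that distinguishes FISTA from plain ISTA.
        t_new = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        y = [mn + (t - 1.0) / t_new * (mn - mo) for mn, mo in zip(m_new, m)]
        m, t = m_new, t_new
    return m


# Toy check with an identity operator: the solution is the soft-thresholded data.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model = fista(identity, [3.0, 0.2, -1.5], lam=0.5, step=1.0)
print(model)
```

With an identity operator the small coefficient (0.2) is set exactly to zero while the large ones are only shrunk, which is the sparsity-promoting behaviour that such solvers exploit in the Radon domain.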
ASHRT can be considered a compressive sensing method. Within this field, the use of norm constraints for sparsity optimization has been a subject of ongoing research. Since the theoretically optimal sparsity constraint, the L0 norm, is difficult to apply in practical computation, many researchers have investigated alternatives to it. Currently, the most widely adopted approach is to replace the L0 norm with the L1 norm. However, for problems requiring higher sparsity, the L1 norm may not provide sufficient performance. To address this, Zhang et al. proposed in 2010 a modified L1 norm (the capped L1 norm) as a substitute for the L0 norm [12], and other researchers used the Lq (0 < q < 1) quasi-norm to replace the L0 norm [13,14]. In recent years, research interest has shifted toward hybrid norm schemes. In 2013, the L1-2 norm, which theoretically offers better sparsity performance than the L1 norm, was proposed by Esser et al. [15]. In 2018, a generalized lp/lq norm was employed by Jia et al. to design sparse filters [16].
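The penalties discussed above can be compared on a small example. The sketch below uses common textbook forms of each penalty; the function names and the parameters `cap` and `q` are illustrative assumptions and are not taken from the cited papers.

```python
import math


def l0(m, tol=1e-12):
    """Number of non-zero entries (the L0 'norm')."""
    return sum(1 for x in m if abs(x) > tol)


def l1(m):
    """Sum of absolute values."""
    return sum(abs(x) for x in m)


def l2(m):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in m))


def capped_l1(m, cap):
    """Capped L1: each entry contributes at most `cap`."""
    return sum(min(abs(x), cap) for x in m)


def lq_power(m, q):
    """Sum of |x|^q for 0 < q < 1 (the q-th power of the Lq quasi-norm)."""
    return sum(abs(x) ** q for x in m)


def l1_minus_l2(m):
    """L1-2 penalty: zero for any vector with at most one non-zero entry."""
    return l1(m) - l2(m)


# Two vectors with the same energy (L2 norm of 2): one spike versus a flat spread.
sparse = [0.0, 0.0, 2.0, 0.0]
dense = [1.0, 1.0, 1.0, 1.0]
for name, penalty in [("L0", l0), ("L1", l1),
                      ("capped L1", lambda m: capped_l1(m, cap=1.0)),
                      ("Lq, q=0.5", lambda m: lq_power(m, q=0.5)),
                      ("L1-2", l1_minus_l2)]:
    print(name, penalty(sparse), penalty(dense))
```

Both vectors carry the same energy, yet every penalty is smaller for the single spike, which is why minimizing any of them favours a focused Radon panel. They differ in how strongly they separate the two cases and in how easy they are to optimize.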
Since the utilization of diffracted wave information by Krey et al. in the mid-20th century [17], various methods for diffracted wave separation have been developed. Several mainstream approaches currently exist in the field.
The first is the plane-wave destruction (PWD) method proposed by Fomel in 2002 [18], which separates diffracted waves using a local plane-wave model and prediction-error filtering. To mitigate the impact of noise on PWD, Yu et al. (2017) introduced Tikhonov regularization [19], incorporating constraints into the iterative solution process to reduce solution non-uniqueness and enhance stability. Also in 2017, Kong et al. employed a predictive filtering approach to address the phase reversal issue encountered in PWD methods [20], thereby ensuring separation fidelity.
The second is the common-reflection-surface (CRS) method, introduced by Hubral in 1996 [21] and later applied to diffracted wave separation by Dell and Gajewski et al. [22,23]. In 2018, Waldeland et al. used two types of structure tensors to predict both dip and curvature. This approach enables the three CRS parameters to be extracted from stacked data, which significantly reduces computation time [24].
The third is the dip-angle filtering method, proposed by Sava et al. [25] and subsequently refined by Biondo [26] and Li Zhengwei [27] for diffracted wave separation and imaging. In 2019, Li and Zhang constructed vertical time-shift gathers using time migration on dip-angle gathers. In these gathers, diffraction and reflection waves exhibit distinct kinematic differences, enabling separation based on such discrepancies. Subsequently, they discovered that the information within the vertical time-shift gathers could be used to correct the phase of diffraction waves. In 2020, they proposed an improved dip-angle gather method suitable for 3D fault imaging. In these new dip-angle gathers, the differences between reflection and diffraction waves become more pronounced, allowing more thorough separation of the two wave types. Additionally, the morphology of diffraction waves in time slices can provide guidance on the orientation and azimuth of faults [28,29,30,31].
The fourth is the singular value decomposition (SVD) method, which separates diffracted and reflected waves based on their differing coherency by decomposing a matrix into simpler matrices for data compression [32].
The fifth is the low-rank approximation method, which operates under the premise that reflected waves possess low-rank characteristics while diffracted waves act as sparse perturbations. In 2023, Chen et al. employed low-rank approximation techniques to optimize wavefield propagation operators for precise diffracted wave imaging [33].
In the application of ASHRT for diffracted wave separation, Ibrahim et al. proposed the asymptote and apex-shifted hyperbolic Radon transform (AASHRT) based on ASHRT in 2015. This method enables better focusing of both diffracted and reflected waves [34]. In the same year, Karimpouli et al. incorporated phase information into ASHRT according to the energy characteristics of diffracted waves, achieving promising diffracted wave separation results in synthetic data [35]. In 2017, Gong et al. introduced a stretched version of the Stolt operator to accommodate vertical velocity variations and conducted successful diffracted-wave separation experiments [36]. In 2018, Li et al. enhanced ASHRT using the Hilbert transform, proposing a dual-branch ASHRT to address the polarity reversal of diffracted waves on either side of the apex in the plane-wave domain, successfully separating diffracted waves in that domain [37]. In 2021, Chiu et al. proposed a method for automatically deriving mask operators in the Radon domain. Combined with norm-constrained optimization inversion, this approach effectively suppresses multiples and achieves diffracted wave separation while significantly reducing data processing time [38].
In addition to applications in de-aliasing, simultaneous source separation, multiple suppression, conventional denoising, diffracted wave separation, and seismic data interpolation, previous research has also explored other aspects of ASHRT. For example, Sabbione et al. utilized ASHRT for denoising microseismic records in 2013 [39]; Seher et al. improved ASHRT in 2020 by incorporating phase-shift extrapolation operators that account for frequency and vertical velocity variations to achieve separation of low-frequency signals and noise [40]; and in 2024, Cheng et al. employed deep learning methods to implement ASHRT [41].
In summary, this study focuses on improving ASHRT-based diffraction wave separation. Because of its interpolated spectral mapping and fixed velocity, the Stolt-based ASHRT often exhibits weak sparsity, which makes it inadequate for diffraction wave separation under complex geological conditions. Using the phase-shift (PS) operator circumvents the interpolation mapping issue and yields sparser Radon transform results that are better suited to diffraction wave separation in complex seismic exploration. Moreover, compared to other separation techniques, ASHRT can perform separation directly on shot gathers, offering a simpler workflow and lower data requirements.
3. Numerical Experiments
A simple numerical simulation dataset was used to validate the advantages of the PS-based ASHRT. The data were numerically simulated in a non-uniform half-space model, with a time sampling interval of 0.0044 s, a record length of 1.76 s, a trace interval of 5 m, and 321 traces. The seismic record and the Radon domain results transformed using the two operators (PS and Stolt) are shown in Figure 4.
As can be seen from the basic principle of ASHRT for diffraction wave separation illustrated in Figure 1, the strength of sparsity in the Radon domain is crucially related to the effectiveness of diffraction wave separation. A sparser Radon domain leads to cleaner separation of diffraction waves. Figure 4 shows that the sparsity of the PS-based ASHRT is significantly stronger than that of the Stolt-based ASHRT.
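The role of the apex coordinate can be illustrated with the usual apex-shifted hyperbola, t(x) = sqrt(tau^2 + (x - x_apex)^2 / v^2). The sketch below is only a kinematic illustration under that assumed parameterization. The trace geometry (321 traces at 5 m) follows the synthetic example, while the velocity, the intercept times, the apex positions, and the function names are hypothetical.

```python
import math


def apex_shifted_traveltime(x, tau, x_apex, v):
    """Traveltime of a hyperbola with apex time tau at lateral position x_apex."""
    return math.sqrt(tau * tau + ((x - x_apex) / v) ** 2)


def apex_of_event(xs, times):
    """Return the trace position where an event arrives earliest (its apex)."""
    earliest = min(range(len(times)), key=times.__getitem__)
    return xs[earliest]


# Trace positions: 321 traces with a 5 m interval, as in the synthetic example.
xs = [5.0 * i for i in range(321)]

# Hypothetical events: a reflection whose apex sits at the shot position (800 m)
# and a diffraction whose apex sits above a scatterer at 1100 m.
v = 2000.0  # m/s, assumed constant
reflection = [apex_shifted_traveltime(x, 0.4, 800.0, v) for x in xs]
diffraction = [apex_shifted_traveltime(x, 0.6, 1100.0, v) for x in xs]

print(apex_of_event(xs, reflection), apex_of_event(xs, diffraction))
```

An apex-shifted transform scans over the apex position as well as the intercept time, so each of these events ideally collapses to a point at its own apex. The sparser that focused panel is, the less the two event types overlap when one of them is muted.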
Because the energy of diffraction waves is typically one to three orders of magnitude weaker than that of reflection waves, the fidelity of the data before and after transformation is particularly important. In general, stronger sparsity often implies lower fidelity. However, because the Stolt operator introduces errors through interpolated mapping, the PS operator improves on it in both fidelity and sparsity.
As shown in Figure 5 and Figure 6, the PS-based ASHRT exhibits higher fidelity than the Stolt-based ASHRT and does not introduce the noise artifacts associated with the Stolt operator (indicated by the red arrows in Figure 5 and Figure 6).
Figure 7 presents a single-trace comparison between the original seismic data and the data reconstructed via inverse transformation using the two operators. It can be observed that, compared to the Stolt operator, the inverse transformation result of the PS operator more closely approximates the original data, demonstrating higher fidelity, less noise, and fewer disturbances.
Fidelity is also quantified by the signal-to-noise ratio defined in Equation (24).
According to Equation (24), the calculated signal-to-noise ratios for the aforementioned model under the two operators are 17.45 dB (Stolt operator) and 25.72 dB (PS operator), respectively. From a quantitative analysis perspective, the PS operator demonstrates a significant improvement in fidelity compared to the Stolt operator.
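For readers who want to reproduce such a fidelity figure, the sketch below uses a commonly used reconstruction signal-to-noise ratio, 10 log10(||d||^2 / ||d - d_rec||^2). This form is an assumption made for illustration and may differ from the paper's Equation (24); the function name `snr_db` is likewise hypothetical.

```python
import math


def snr_db(original, reconstructed):
    """Reconstruction SNR in dB: signal energy over residual energy.

    Assumed form: 10*log10(sum(d^2) / sum((d - d_rec)^2)).
    """
    signal = sum(a * a for a in original)
    noise = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    if noise == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal / noise)


# A trace reconstructed with a uniform 10% amplitude loss.
trace = [1.0, 0.0, -1.0, 0.0]
rebuilt = [0.9 * s for s in trace]
print(snr_db(trace, rebuilt))

# Residual-energy ratio implied by the reported 17.45 dB and 25.72 dB values,
# valid only if the assumed definition above applies.
print(10.0 ** ((25.72 - 17.45) / 10.0))
```

Under this assumed definition, the 8.27 dB gap between the two reported values would correspond to a residual energy about 6.7 times smaller for the PS operator.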
In conclusion, under the constant-velocity assumption, the PS operator exhibits markedly higher sparsity and fidelity than the Stolt operator.
Furthermore, compared to the Stolt operator, which relies on a constant-velocity assumption, the PS operator offers the additional advantage of accommodating vertically varying velocities. Similar to other migration methods, the PS operator can utilize a smoothed velocity model to focus hyperbolic events with different velocities at different times. Earlier researchers attempted to modify the Stolt operator—for example, by creating a stretched version to accommodate vertical velocity variations. However, even these modified operators still exhibit notable limitations. For instance, they typically require velocity to change gradually with depth. Moreover, the fidelity of these variable-velocity Stolt operators is generally inferior to that of the constant-velocity version (often due to the trade-off with high sparsity). Therefore, these variants will not be discussed in further detail here.
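The way a phase-shift operator absorbs vertical velocity changes can be sketched for a single frequency-wavenumber component. This is a generic phase-shift extrapolation under assumed conventions (sign of the exponent, stepping in depth rather than time, evanescent energy discarded), not the PS-based ASHRT operator itself. The function names and all numerical values are hypothetical.

```python
import cmath
import math


def vertical_wavenumber(omega, kx, v):
    """kz from the dispersion relation kz^2 = (omega/v)^2 - kx^2.

    Returns None for evanescent components (kz^2 <= 0).
    """
    kz_squared = (omega / v) ** 2 - kx ** 2
    return math.sqrt(kz_squared) if kz_squared > 0.0 else None


def phase_shift(amplitude, omega, kx, velocities, dz):
    """Extrapolate one (omega, kx) component through layers of thickness dz.

    Each layer has its own velocity, so the velocity may change at every step.
    """
    for v in velocities:
        kz = vertical_wavenumber(omega, kx, v)
        if kz is None:
            return 0j  # evanescent energy is discarded in this sketch
        amplitude *= cmath.exp(1j * kz * dz)  # one phase rotation per step
    return amplitude


omega = 2.0 * math.pi * 30.0   # 30 Hz
kx = 0.01                      # rad/m
layered = phase_shift(1.0, omega, kx, velocities=[1500.0, 2000.0, 2500.0], dz=10.0)
print(abs(layered), cmath.phase(layered))
```

Because the velocity enters separately at every step, a velocity that varies with depth needs no change to the algorithm. The price is one phase rotation per step for every component, whereas a constant-velocity spectral mapping handles all samples in one pass.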
4. Processing of Field Seismic Data
Compared to synthetic data, actual seismic data are more complex and exhibit a lower signal-to-noise ratio. A marine seismic dataset with pronounced diffraction energy is used to demonstrate the suitability of the PS operator for complex seismic data. This dataset has a trace interval of 12.5 m, a time sampling interval of 0.002 s, and a record length of 10 s. The dataset has undergone normalization processing.
Compared to Figure 8b, Figure 8c exhibits significantly higher sparsity, with energy clusters being more concentrated and the reverse hyperbolic energy clusters suppressed. For the separation of diffraction and reflection waves, Figure 8c demonstrates a clear advantage, with a more distinct separation boundary and better separation effectiveness.
The aforementioned marine seismic data were processed for diffraction and reflection wave separation using the two operators, with the separation results presented in Figure 9 and Figure 10, respectively. Comparing the two, the PS-based ASHRT shows markedly better performance than the Stolt-based ASHRT in separating both reflection and diffraction waves, achieving cleaner separation and more distinct energy.
The PS operator is suitable not only for marine seismic data but also for conventional land seismic data. A land seismic dataset is used to demonstrate this applicability. The dataset has a trace interval of 40 m, a time sampling interval of 0.004 s, and a record length of 6 s, and it has been normalized. It is worth noting that these data were acquired in a complex geological setting and contain substantial noise, having undergone only basic denoising. The separation of diffraction and reflection waves was performed on these data using the Stolt operator and the PS operator, respectively, with the results presented in Figure 10 and Figure 11.
As with the marine seismic data, the PS operator also demonstrates clear advantages in processing land seismic data. The two examples above illustrate the adaptability and effectiveness of the PS-based ASHRT.
Fidelity is also a crucial metric in diffraction wave separation. Methods like ASHRT are based on the “focus–mute–defocus” principle, which raises a significant industry concern: the signal distortion introduced when data are focused and then defocused. The fidelity results of the two aforementioned real seismic datasets after processing with the two operators are presented in Table 1.
To more clearly demonstrate the differences between the two operators, the inverse focusing results and residuals for both operators are presented in Figure 12, Figure 13, Figure 14, Figure 15 and Figure 16. Specifically, the inverse focusing results and residuals for the land seismic data are shown in Figure 13 and Figure 14, while those for the marine seismic data are shown in Figure 15 and Figure 16.
The imaging results of the PS operator are relatively sensitive to the velocity model, and artifacts often arise in complex velocity models. In constant-velocity models, the artifacts from the PS operator are significantly reduced. We therefore propose the following approach. When velocity variations are minor, a constant low velocity can be used to approximate the entire model. When velocity variations are substantial, the time-stretched version of the Stolt operator can be used: the time axis is smoothly stretched, and a constant velocity is then applied to perform ASHRT. This strategy greatly reduces the artifact problem associated with the PS operator and enhances the stability of ASHRT. Only the stacking velocity is required for the velocity model, because when ASHRT is used to separate diffraction and reflection waves, it is sufficient in practice to ensure that the reflection waves are adequately focused. Even if the diffraction waves are not perfectly focused, the reflection waves can still be effectively separated, which in turn allows the diffraction waves to be extracted. The same principle can also guide the selection of the apex shift range during separation.
5. Conclusions and Discussion
In this study, the PS operator was employed to replace the Stolt operator for implementing ASHRT, thereby circumventing the low sparsity and low fidelity associated with linearly interpolated spectral mapping in the Stolt operator. The feasibility of the PS operator was validated using synthetic data. Furthermore, experiments on separating diffraction and reflection waves were conducted using the PS-based ASHRT on both marine and land seismic data. The experimental results demonstrate that the PS-based ASHRT holds a clear advantage over the Stolt-based ASHRT for diffraction wave separation, enabling effective separation of diffraction and reflection waves in shot gathers.
Although PS-based ASHRT demonstrates significant advantages over Stolt-based ASHRT in terms of fidelity and sparsity, it is not without its limitations. In terms of computational cost, the Stolt operator holds a substantial advantage over the PS operator. This is because the PS operator requires phase-shift calculations for each time sample, whereas the Stolt operator only needs to perform a single linear interpolation to compute all time samples. Consequently, as the number of time samples increases, the computational cost difference between the two becomes more pronounced. For applications involving large-scale seismic data, PS-based ASHRT still faces the challenge of high computational costs. These costs are reflected not only in computation time but also in memory requirements. The intermediate variables required for PS operator calculations have a dimensionality one level higher than those of the Stolt operator. While the additional memory cost of processing a single-shot gather may seem negligible, it becomes substantially higher in parallel or cluster computing environments.
Regarding adaptability to vertical velocity variations, PS-based ASHRT theoretically offers better performance compared to the time-stretched version of Stolt-based ASHRT. When vertical velocity changes are gradual, the PS operator can still exhibit better sparsity than the time-stretched Stolt operator. However, when velocity variations are severe, the PS operator can also generate artifacts, leading to incomplete separation. In such cases, the difference in sparsity between the two methods becomes less meaningful for diffraction wave separation.
Due to computational constraints, we did not perform full-scale wavefield separation across an entire large seismic dataset, so limitations of PS-based ASHRT that appear only at full scale cannot be ruled out. However, we randomly selected individual shot gathers from different large seismic datasets for related experiments. The experimental results were consistent with the findings presented in this study, indicating that PS-based ASHRT demonstrates satisfactory stability and applicability.
It is noteworthy that ASHRT (not limited to PS-based ASHRT) can achieve favorable results in seismic records where the events generally resemble hyperbolas. However, under certain specific conditions—such as significant static correction errors, excessively strong surface wave energy, unattenuated abnormal amplitudes, or highly irregular acquisition geometry—the sparsity of ASHRT can be substantially reduced, and separation results cannot be guaranteed in such cases. Moreover, for diffraction waves whose apexes lie close to those of reflection waves, removal based solely on 2D ASHRT is challenging, but it can be accomplished using 3D ASHRT by leveraging velocity differences.
In summary, while PS-based ASHRT offers greater theoretical advantages and better separation performance, its current application is constrained by computational costs. Significant challenges remain for its use with large-scale seismic data and in the implementation of 3D ASHRT.