Near-Surface Geological Structure Seismic Wave Imaging Using the Minimum Variance Spatial Smoothing Beamforming Method

Erecting underground structures in regions with unidentified weak layers, cavities, and faults is highly dangerous and potentially disastrous. An efficient and accurate near-surface exploration method is thus of great significance for guiding construction. In near-surface detection, imaging methods suffer from artifacts that the complex structure caused and a lack of efficiency. In order to realize a rapid, accurate, robust near-surface seismic imaging, a minimum variance spatial smoothing (MVSS) beamforming method is proposed for the seismic detection and imaging of underground geological structures under a homogeneous assumption. Algorithms such as minimum variance (MV) and spatial smoothing (SS), the coherence factor (CF) matrix, and the diagonal loading (DL) methods were used to improve imaging quality. Furthermore, it was found that a signal advance correction helped improve the focusing effect in near-surface situations. The feasibility and imaging quality of MVSS beamforming are verified in cave models, layer models, and cave-layer models by numerical simulations, confirming that the MVSS beamforming method can be adapted for seismic imaging. The performance of MVSS beamforming is evaluated in the comparison with Kirchhoff migration, the DAS beamforming method, and reverse time migration. MVSS beamforming has a high computational efficiency and a higher imaging resolution. MVSS beamforming also significantly suppresses the unnecessary components in seismic signals such as S-waves, surface waves, and white noise. Moreover, compared with basic delay and sum (DAS) beamforming, MVSS beamforming has a higher vertical resolution and adaptively suppresses interferences. The results show that the MVSS beamforming imaging method might be helpful for detecting near-surface underground structures and for guiding engineering construction.


Introduction
Most underground geotechnical engineering is carried out near the surface. Unfortunately, the near-surface geological conditions are highly complex due to the abundance of joints, faults, boulders, caves, and so on. Encountering unexplored weak formations and caves can cause a variety of geological disasters, such as uneven settlement or collapse, which can inflict immense economic losses and even death [1,2]. According to the report [3] of the current status of sinkhole collapses in the karst area in China, more than 1500 karst collapsing events have been recorded and these events formed more than 45,000 sinkholes. More than 75% sinkholes were triggered by human activities. These activities are mainly around the near-surface, including mine drainage, foundation engineering, and tunnel constructions. Subsidence sinkholes result from both subsurface dissolution and the downward gravitational movement of the undermined overlying material. These sinkholes, which are invisible from the surface, are the most important from a hazard and engineering perspective [4]. Hence, to provide guidance for construction and design efforts, geophysical prospecting methods are often used in engineering to detect underground structures as an important supplement to drilling and excavation. However, the near-surface is complex and traditional migration imaging methods may suffer from types of artifacts and interferences. High-precision prospecting in near-surface detection is quite challenging since surface waves and S-waves are powerful near the surface. There is an urgent need to deal with the artifacts and interferences. Furthermore, the detection capability is limited by the construction site and time frame [5]. Therefore, the development of accurate geophysical prospecting methods for near-surface detection is of great significance to geotechnical engineering.
Seismic methods are among the most commonly used geophysical techniques for near-surface detection with the advantages of a high efficiency, a high accuracy, and a low cost; moreover, seismic techniques are nondestructive [6]. Seismic reflection is extensively used in both academic and practical engineering [7]. One of the main goals of seismic data processing is seismic imaging. For instance, seismic migration imaging [8] is used to map underground structures. Although advanced imaging methods have emerged in recent decades, such as reverse time migration [9,10] and Gaussian beam migration [11][12][13], Kirchhoff migration [14] is still the most popular approach due to its ability to provide an image of sufficient quality and at an affordable computational cost. Zhang et al. [15] processed a model example and seismic field data to demonstrate the validity of prestack Kirchhoff time migration. Yuan et al. [16] applied Kirchhoff prestack time migration to seismic data of coal seam reflections and obtained better images than with poststack time migration. Wang et al. [17] used tomographic travel-time inversion and prestack Kirchhoff depth migration-based migration velocity analysis (MVA) and obtained a precise, highresolution migration velocity model. Liu and Zhang [18] proposed a novel approach that attached a prediction shaping filter to Kirchhoff prestack time migration to mitigate the stretching effect and demonstrated their method with a numerical example and field data. Zhang et al. [7] applied Kirchhoff poststack migration to a case of seismic scattered wave imaging and obtained reliable imaging results in both the synthetic data and field data. Nevertheless, in near-surface seismic exploration, where the depth of the target area is usually less than a hundred meters, Kirchhoff migration still suffers from the following problems: (1) Kirchhoff migration relies on the preprocessing of signals to deal with noise and interference such as white noise, S-wave signals, and multiples; (2) Kirchhoff migration is affected by high-energy surface waves, which appear as massive artifacts in the imaging results. The surface wave can be muted from the observed data, but it requires a manual cost; and (3) Kirchhoff migration is limited by the degradation in the imaging resolution that occurs when the wavelength is not much smaller than the size of the imaging area in near-surface detection.
In the medical field, the widely applied ultrasound beamforming method, which detects human body structures using acoustic properties, gives us another possible way to acquire seismic signals and solve the abovementioned problems. The delay and sum (DAS) beamforming algorithm is the most popular beamforming technique for imaging human body structures because of its real-time imaging capability [19,20]. In recent decades, various adaptive beamforming methods have been proposed to improve the resolution and stability of beamformers in medical ultrasound imaging. The most common approaches are based on the minimum variance (MV) beamformer devised by Capon [21], and numerous MV beamforming algorithms have been developed to continue improving the imaging quality [22][23][24][25]. Asl [26] proposed an MV beamforming method combined with adaptive coherence weighting and achieved an excellent performance in an application to medical ultrasound imaging. Ma et al. [27] introduced the multiple delay and sum with enveloping method to efficiently suppress sidelobes and artifacts. Most recently, Chen et al. [28] proposed the multioperator MV adaptive beamformer to promote real-time imaging. Accordingly, because the transmission and receiver concepts in medical imaging and seismic reflection are similar, the medical beamforming method can be adapted for seismic imaging.
In this study, we propose a minimum variance spatial smoothing (MVSS) beamforming method for near-surface reflection seismic exploration in a homogeneous assumption. Synthetic near-surface geological models are established to carry out a numerical simulation, and the image quality of the proposed MVSS beamforming method is compared with that of Kirchhoff migration and basic DAS beamforming. Moreover, the robustness to interferences and noise, robustness to other wave components and a delay time correction to enhance the focus effect are presented. Finally, the computational efficiency, a comparison between MVSS beamforming and RTM (one of the best imaging methods for seismic data), and the potential application of the CF matrix in time domain prestack migration methods are discussed.

Methods
The MVSS method was proposed in this paper to improve the signal-to-noise ratio (SNR), to enhance the focusing effect, and to improve the resolution. The core concept of MVSS beamforming is to assign a weight to each receiver. When higher weights are assigned to receivers with higher SNRs, the imaging quality is improved. The receiver array is divided into several subarrays to improve the robustness when obtaining the weights. In each subarray, to improve the resolution of the image and increase the SNR, we designed a weight for each receiver according to the covariance matrix of signals. After the imaging results of each subarray are superimposed, the obtained image is multiplied by the signal coherence matrix to reduce the influences of sidelobe interference and focusing errors. The proposed method is based on a homogeneous subsurface assumption, and it is a time domain raytracing method, which means the propagation path of waves is static. Notably, MVSS beamforming reduces to basic DAS beamforming without spatial smoothing, neglecting to calculate the weight of each receiver, and excluding the coherence factor (CF) matrix. Though DAS beamforming and Kirchhoff migration are coded in different ideas, they share similar principles, such as they are both prestack time migration and raytracing methods. Background velocity plays an important role in Kirchhoff migration and other migration methods, but it does not in DAS beamforming. DAS beamforming can be understood as an extremely simplified Kirchhoff migration, in which we assume that the wave propagates along a straight line and the subsurface is homogeneous. Figure 1 shows the workflow of MVSS beamforming, which is divided into three parts: (1) signal delay; (2) superposition with calculated weights from minimum variance matrix in subarrays; and (3) imaging and processing, including time-depth conversion and shot stacking. The detailed mathematics can be found in Appendix A.

MVSS Beamforming Imaging
First, the signal delay times are calculated according to the distance from the target point to each receiver and the background velocity. At the top of Figure 1 is a target point in the detection area. The waves reflected from the target point are received by the array, which is arranged in a straight line. The signals recorded by the receivers have different arrival times because the receivers are situated at different distances from the target point. Thus, delay processing (τ 1 . . . τ 5 ) is applied to these signals so that the fluctuations from this target point are aligned on the time axis.
Second, the signals from different receivers are superimposed with an MVSS beamformer. Calculating the superposition (∑) with weights (ω 1 . . . ω 5 ) of the delayed signals amplifies the signals from the target point while suppressing the reflections from other scattering points in the imaging area. The weights are calculated from a minimum variance matrix of subarrays with diagonal loading to enhance robustness. The weights are designed to minimize the output interference-plus-noise power while maintaining a distortionless response to the target signal. Then, the CF is used as a weight matrix to enhance the image, which can obviously amplify the reflection signals. Finally, after performing time-depth conversion and stacking the images of all shots, we obtained the beamforming result. Underground structures can be determined by the amplitudes in the image, where a higher amplitude corresponds to a greater possibility of a reflection interface or a stronger reflection.

A Simple Example
We took a circular cave model (Figure 2a) as an example to illustrate the workflow of MVSS beamforming. The seismic data of the 10th shot obtained by finite-different time-domain (FDTD) numerical simulations (Liu et al., 2018 [29]) are shown in Figure 2b. The receivers were arranged from 0 m to 30 m with an interval of 0.2 m. There were 151 receivers in total.
For example, we took point A (16 m, 7 m) as the target point ( Figure 2). The delay time was calculated as Equation (A1), and the results are shown in Figure 3. The 151 black dots represent the delay time of each receiver, and the 20 red circles represent the delay time of each shot. The first source shared the same spatial position and delay time with the 36th receiver. We moved the target point and repeated the delay time calculation using Equation (A1).
As per the definition of a subarray in Equation (A11), we divided the array of 151 receivers into 76 subarrays, and each subarray was composed of 75 receivers. Taking the first subarray as an example, we calculated the weight of each receiver in the subarray (Figure 4b) according to the DL and MV method using Equation (A13). We multi-plied the 75 delayed signals (Figure 4a) with the weight of each receiver ( Figure 4b) and then added them together to obtain the amplitudes on the axis line x = x A (Figure 4c). Two reflection interfaces were observed from the curve. The delay time was reduced by t = t m + t n = 0.0186 s to exclude the delay time of point A itself. Before the 0.0186 s was excluded, the delay time is not absolutely correct. If we see point A as the target point, the delay time of the receiver which is above point A should be 0. In delay time calculations according to Equation (A1), the reference point is not point A (16,7) but it is the point on x axis (16,0). This results in the 0.0186 s of delay time of point A itself. So, the 0.0186 s should be excluded from the delay time and the correct time-axis position of point A can be obtained. This can also be found in Equation (A3) as (t − τ m ). Then, we repeated the process above with each subarray to obtain the amplitude curve on the axis line x = x A (Figure 4d). In this certain case, the amplitude of a reflected wave was 0.005 times that of a surface wave in seismic signal. Eventually, the amplitude was 0.05 times that of a surface wave on the superposed curve for all subarrays. Then, we picked the peak amplitude and the corresponding estimated time (0.0093 s) as the amplitude result of point A.  We repeated the delaying, weighting, and superposition processes with the subarrays at every point in the imaging area to obtain the preliminary image for this shot (Figure 5a). Then, the CF (Figure 5b) was obtained according to Equation (A15). The CF matrix was used as another weight to enhance the SNR by multiplying it with the preliminary image. In this step, surface wave artifacts were suppressed because the adopted methods were developed for reflected waves to only amplify the reflected signals with focus. Other interferences were suppressed for the same reason. Then, we obtained the final image of the 10th shot.  The time-domain image was converted into a depth-domain image according to the background velocity. The same processing scheme, including beamforming imaging and time-depth conversion, was applied to the remaining shots. The final image was a stacked result of all shots ( Figure 6).

Kirchhoff Migration
Kirchhoff migration was adopted as a related work in this paper. The methodology is mainly from the authors of [30]. The involved work migrates a single shot record using prestack time migration. The algorithm is a simple travel time path summation with few options. Travel time from source to scatter point (also known as a target point in beamforming methods) is approximated by a Dix equation using the RMS velocity from the model at the lateral position halfway between the source and receiver and at the vertical travel time of the scatter point. Similarly, from the scatter point to a receiver, a Dix equation using the RMS velocity at halfway between the scatter point and the receiver is used. There is no topographic compensation. The source and receivers are assumed to be on the same horizontal plane.

Reverse Time Migration
Reverse time migration (RTM) is one of the best imaging methods for seismic data. RTM is a depth migration method. It takes the seismic records as input and reconstructs the underground wave field using the inverse time wave field continuation, then obtains the image amplitude according to the cross-correlation with the source wave field. There is no high-frequency approximation assumption of ray tracing methods or the angle limitation of one-way wave migration. Thus, RTM has a high imaging quality but also costs more in terms of computational resources.
RTM was adopted as a related work in this paper. The adopted reverse time migration program involved was from the authors of [30]. The algorithm dose is a prestack reverse time migration of a shot record. The migration computes the forward-in-time propagation of the source field and the reverse-time propagation of the input shot (receiver field). Thus, two wavefields (source field and receiver field) are simulated by finite-difference time stepping. The migrated shot comes from the cross correlation of these two fields at equal times. This requires that the source wavefield must be available at each time step of the receiver field and this is a major complexity.

Numerical Experiments
Six synthetic models were used to validate the proposed imaging method. For this purpose, we constructed an area with a height of 20 m and a width of 30 m (Figure 7) to simulate near-surface seismic exploration. The seismic data came from numerical simulations based on staggered-grid finite differences in the time domain [29,31]. A perfectly matched layer (PML) was used to absorb reflections on the left, right, and bottom boundaries of the model. The upper boundary of the model was a free boundary. According to the near-surface geological conditions, the background P-wave velocity was set as 1500 m/s, the S-wave velocity was set as 700 m/s, and the density was set as 1.8 g/cm 3 . A 600 Hz Ricker wavelet was used as the source at the ground surface. Twenty shots were deployed with an interval of 1 m, and 151 receivers were evenly arranged at a 0.2 m intervals across the top of the model.
The signal time length was 0.05 s. According to the size of the area and background velocity, the estimated P-wave travel time of a vertically down and up path was 0.026 s and the travel time associated with the far-right source, reflection at bottom left, and farright receiver pair was 0.045 s ( Figure 8). Therefore, the target P-wave signal integrity was guaranteed, which meant that a 0.05 s signal would be long enough to cover all the reflected objectives and the reflection of most part of the area could be received by multiple receivers. All parameters in the numerical experiments are shown in Table 1. Note that the sampling frequency was 2 × 10 5 which is consistent with the time step of the numerical simulation. In the numerical experiments, adapting a shorter recorded signal can help save computational resources. However, in real cases, the recorded signal can be longer in order to obtain more information. The receivers are velocity sensors that collect vibrations perpendicular to the ground surface. The geological models are produced by a random near-surface model generation algorithm ( Figure 9). For example, horizontal parallel layers (1), folds (2), random fluctuations (3), faults (4) and caves (5) were incorporated into the models. More details on the model generation methods can be found in Appendix B. The structural details of each model are shown in Table 2. The models in group A contained cave-type geological anomalies. The models in group B were composed of multiple strata. The models in group C were combinations of cavities and multiple strata. The medium parameters are shown in Table 3.

Cave Models
The imaging results of model A-1 are shown in Figure 10. In the MVSS beamforming imaging (Figure 10b), the cave roof and floor can be observed, but the right and left sides of the cave are missing. A slight artifact caused by surface waves is observed at the top of the image (Mark 1). The ceiling and floor can be observed, but the walls on both sides are missing. The reason is that the roof is convex and reflects the waves to both sides. Some of these reflected waves exceed the range of the receiver array, resulting in a lack of information from the roof. In contrast, the concave floor can better reflect all arriving waves toward the receiver array, forming a better image. The imaging results of the three methods show that the imaged cave is slightly larger in the vertical direction than the cave actually is. Figure 10c shows the imaging result via Kirchhoff migration. There is a massive artifact (Mark 1) at the top of the image caused by surface waves. Some minor artifacts can be observed between the top of the image and the cave (Mark 2). Arcuate noise signals caused by S-waves are present on both sides (Mark 4). The upper roof is formed as a dot (Mark 3).   Figure 10d shows the DAS beamforming imaging result. The artifacts caused by surface waves are still present at the top of the image (Mark 1). The roof of the cave is also formed as a dot shape (Mark 3). The artifact caused by S-waves is again observed (Mark 4) and is stronger than that in the Kirchhoff migration results.
Compared with RTM (Figure 10e), the advantage for suppressing the artifact caused by the surface wave and S-waves remains in MVSS beamforming. The cave location in MVSS beamforming is more accurate than that in RTM.
The imaging results of model A-2 are shown in Figure 11. In the MVSS beamforming imaging result shown in Figure 11b, the locations of the two caves are well determined, and the roofs and floors of both caves can be observed. In addition to the same surface wave artifacts detected in model A-1, some interferences were found beneath the floors of the caves (Mark 1) due to the superposition of different S-waves.  Compared with MVSS beamforming, in addition to the same interferences and artifacts in model A-1, the two floors overlap (Mark 1) in the Kirchhoff migration result (Figure 11c). This overlap appears again in the DAS beamforming result. This finding indicates that MVSS beamforming has a higher horizontal resolution than the other two methods. These imaging results further demonstrate that the imaging of these caves did not suffer from signal interference because the propagation paths of the waves reflected from the cave boundaries do not overlap with each other. Besides, the S-wave artifacts appears in Kirchhoff migration and DAS beamforming (Mark 2).
Compared with those in model A-1, the interferences in the Kirchhoff migration and DAS beamforming results in model A-2 are more intensive due to the more complicated structure (Mark 3). However, these interferences are still much weaker in the MVSS beamforming result, which means that MVSS beamforming exhibits better S-wave suppression.
The imaging results of model A-3 are shown in Figure 12, in which two caves are aligned vertically in the middle of the area (a beaded cave model). The surface wave artifacts and S-wave interferences are similar to those in model A-1 and model A-2. In addition, there is some overlap between the roof of the shallower cave and the floor of the deeper cave (Mark 1), which means that the three methods have similar vertical resolutions in this cave model. Compared with the MVSS beamforming imaging result, the Kirchhoff migration imaging result suffers from more intensive and complicated interferences on both sides of the image caused by S-waves. These S-wave interferences are also relatively intensive in the DAS beamforming result.

Layer Models
The imaging results of model B-1 are shown in Figure 13. In the MVSS beamforming imaging result (Figure 13b), two interfaces can be clearly observed. There is some light interference on both sides of the image (Mark 2) and between the interfaces (Mark 1). Furthermore, some parts of the deeper interface are missing (Mark 3).
In the Kirchhoff migration imaging result (Figure 13c), the interfaces between the layers can be observed clearly, but artifacts appear between the two layers and on both sides of the image (Mark 1), and some arc-shaped interferences can be observed on both sides of the image (Mark 2). The DAS beamforming imaging result is shown in Figure 13d; the interfaces can be observed, but arc-shaped interferences and a massive surface wave artifact at the top of the image can also be observed. Comparing the three imaging methods reveals interferences between the different interfaces in the Kirchhoff migration and DAS beamforming imaging results. In the MVSS beamforming image, the amplitude of the surface wave artifact is weak, and the interferences between the interfaces are suppressed. The thickness of the interface between the two layers is approximately 0.4 m. The best resolution given the wavelength of the wavelet is 0.24 m. In contrast, the interfaces are thicker (approximately 1.6 m on the deeper interface) in the Kirchhoff migration and DAS beamforming images, which means that the vertical resolution of MVSS is higher in this layer model. However, artifacts caused by the superposition of S-wave energy still exist and caused interference between the interfaces. Kirchhoff migration and DAS beamforming suffer from similar intersecting interferences (Mark 4) caused by S-waves. However, at shallow depths, these intersecting interferences are replaced by vertical bars in the Kirchhoff migration image.
Compared with the MVSS beamforming imaging results obtained for the cave models, the interfaces can be readily determined, but the amplitudes of the interfaces in this layer model seem weaker than the amplitudes of those in the cave models (using the same imaging and display parameters in every model). In addition, the layers are missing on two sides of the image because the shots range horizontally from 5 m to 25 m.
The imaging results of model B-2 are shown in Figure 14. Figure 14b shows the imaging result of MVSS beamforming. The interfaces between the layers can be observed, and the location of a fault can be inferred from the shapes of the interfaces, but the details of the fault are difficult to discern. Lightly surface wave artifact can be observed (Mark 1).Dislocations in the shallow interface can be observed. For the deeper interface, although the upper and lower stratigraphic boundaries can be seen, the dislocations cannot. Slight interferences can be observed on the sides of the image and beneath the interface (Mark 2). Figure 14c shows the imaging result of Kirchhoff migration. The artifact caused by surface waves can be observed at the top of the image. At the same time, interferences can be observed above the shallower interface and beneath the deeper interface.  Figure 14d shows the imaging result of DAS beamforming. Likewise, the artifact caused by surface waves can be observed at the top of the image (Mark 1). In addition, artifacts caused by S-waves are present on both sides of the image. We also detected interferences between the two interfaces and beneath the deeper interface. The interferences above the shallower interface appear with different shapes in the Kirchhoff migration and DAS beamforming imaging results (Mark 3).
Compared with the RTM (Figure 14e), there were less interferences beneath the interfaces in RTM but MVSS beamforming showed a higher vertical resolution. Still, the surface wave caused an intensive artifact at the top in the RTM result.
Compared the imaging results of the three methods, the artifacts caused by direct waves and surface waves are weaker in the MVSS beamforming result. Similarly, the artifacts on both sides of the image and beneath the deeper interface are also weaker. The dislocations across the deeper interface are missing because the reflection angles generated at such dislocations are large, and thus, the signal could not be recorded by the receiver array.
In summary, compared with the imaging results in model B-1, the shallower interface is complete, and the fault dislocation is well determined. However, the dislocation of the deeper interface is missing because the reflected waves were not received by the receiver array.
Both model B-3 ( Figure 15) and model B-4 ( Figure 16) contained low-velocity areas near the dislocation. The imaging results of model B-3 are not very different from those of model B-2. The weak layer results in more S-wave interferences above the shallower interface (Mark 1). The reason is that when seismic waves arrive at the weak layer, the reflected waves are not recorded by the receivers. Thus, because of the shadow created by the weak fault layer, the layer beneath the shallower interface is distorted (Mark 2).
Compared with Kirchhoff migration and DAS beamforming, MVSS beamforming still suppressed the S-wave interferences more effectively and provided a better contrast.
In model B-4, the fault angle is shallower than that in model B-3. The shallow part of the weak layer can be observed, but the deep part could not because the imaging results were disturbed by the weak layer. Arcuate interferences (Mark 1) and artifacts along the fault (Mark 2) can be observed. Furthermore, the imaging results were disturbed by the complicated structure due to the fault dislocation (Mark 2). However, the position of the deeper interface was not influenced by the weak fault layer.  Comparing MVSS beamforming with Kirchhoff migration and DAS beamforming reveals that the artifacts (Mark 2) in the latter two methods were more intensive than those in the former. For the deeper interface, Kirchhoff migration provided a better contrast than MVSS beamforming.

Cave-Layer Hybrid Models
The imaging results of model C-1 are shown in Figure 17. The imaging result of MVSS beamforming is shown in Figure 17b. The shapes and locations of the interfaces and caves can be clearly observed. There were some interferences between the layer interfaces and the cave boundaries. On both sides of the layer beneath the shallower interface, some artifacts caused by S-waves can be observed. The imaging result of Kirchhoff migration is shown in Figure 17c, indicating that some interferences appeared near (Mark 1) and between (Mark 2) the layer interfaces. The DAS beamforming imaging result is shown in Figure 17d, revealing multiple arcuate interferences between the two interfaces (Mark 2) and above the shallow interface (Mark 3).
When the three imaging methods are compared, the imaging results of Kirchhoff migration and DAS beamforming still exhibit artifacts of direct waves and surface waves, and there are more interferences between the layers and near the interfaces. Nevertheless, the MVSS method achieved the best resolution and most effectively suppressed the direct and surface waves.
Compared with those for model B-1, the imaging results for model C-1 display more interferences between the interfaces and on both sides of the caves because of the increased stacking of S-waves from different reflection interfaces. In addition, compared with model A-1, it can be noted that the image of the cave in model C-1 is more precise.
Compared with the RTM (Figure 17e), in the composed model of cave and fold, MVSS beamforming still made a better vertical resolution. There was a strong artifact caused by a surface wave in the RTM result.
Model C-2 comprises a combination of a fault, multiple layers, and a cave (model A-1 and model B-2). The imaging results are shown in Figure 18, where Figure 18b shows the imaging result of the MVSS beamforming method. The shallower interface of the layer can be clearly observed, and the cave between the layers can be detected as well. However, the deeper interface is not complete at widths of 10 m to 20 m (Mark 4).
Comparing the three methods makes it evident that the artifacts caused by surface waves are still noticeable in the Kirchhoff migration and DAS beamforming imaging results. Furthermore, the incompleteness of the deeper interface is as notable as it is in the MVSS beamforming imaging result, while the interferences between the layers and on both sides of the image (Mark 2) are weaker in the MVSS beamforming imaging results. However, the disturbances on the deeper interface (Mark 3) are similar among all three methods. Compared with model B-2, a part of the deeper interface of the layer is blurred in model C-2. There are two main reasons for this blurriness. The structure is particularly complicated in the middle of the area due to the coexistence of the fault and cave, which causes the wave to reflect back and forth between those structures; this produces many multiples. Thus, the reflected waves from the deeper interface are seriously disturbed by multiples from the complex structure in the middle of the model. Moreover, compared with the features of model A-1, the cave's roof and floor are not clear (Mark 1) because the waves are disturbed by the fault.

Robustness to Other Wave Components
This numerical experiment was carried out based on elastic wave propagation theory, in which the signals received contain various components, such as P-P, P-S, and S-S waves, surface waves, and direct waves ( Figure 19). For these imaging methods, we employed the P-P signal. Other components in the signal were considered to be noise or interference in imaging, and S-waves and surface waves were significant causes of imaging artifacts.   (Figure 20a,b) and model A-3 (Figure 20c,d). The curves indicate the following: (1) The amplitude in the shallow parts of the Kirchhoff curve and DAS curve is high (greater than the reflection amplitude). This is the result of a surface wave. However, in the MVSS curve, the surface wave amplitude is suppressed. The surface wave is suppressed because the MVSS method amplifies the reflected wave by using the CF matrix. (2) From the Hilbert transform, it can be noted that the MVSS beamforming method can recognize two close interfaces, while the Kirchhoff and DAS methods might combine these interfaces as one. Thus, MVSS beamforming has a better vertical resolution than the other two methods.
(3) In the MVSS curve, the reflection amplitudes are greater than those in the Kirchhoff migration and DAS beamforming curves; this is also partly due to the suppression of surface waves. This means that MVSS beamforming offers a better contrast than the other two imaging methods.  These results confirm that the MVSS beamforming method can significantly suppress artifacts caused by direct waves and surface waves. The suppression of surface waves in reflection seismic exploration has long been a topic of interest [32]. Strong surface waves are significant causes of artifacts in imaging. Hence, to determine how surface waves impact the imaging results, we performed image processing by excluding direct waves and surface waves and then compared the results with the original imaging results. Figure 21b shows the Kirchhoff migration imaging results excluding direct waves and surface waves. Compared with Figure 21a, the surface wave artifact at the top of the image disappears, and the interference across the shallower interface is weaker. However, the artifacts on both sides of the interfaces and beneath the deeper interface still exist. Figure 21e shows the DAS imaging result excluding direct waves and surface waves. Compared with Figure 21d, the surface wave artifact at the top of the image disappears, but the S-wave artifacts across the interfaces still exist. Figure 21h shows the MVSS beamforming result excluding direct waves and surface waves. Compared with Figure 21g, the artifact at the top of the image disappears, but the interferences beneath the deeper interface still exist. These comparisons suggest that the main consequence of direct waves and surface waves is the presence of artifacts at the top of the image. S-waves (or converted waves) are another main factor causing artifacts and interference in seismic imaging. S-wave signals and converted waves (reflected S-waves converted from P-waves at interfaces) have similar in-phase axes as P-wave signals. However, the velocities of S-waves and converted waves do not match the estimated velocity (P-wave velocity), so the use of S-waves (or converted waves) can cause errors. Figure 21c shows the Kirchhoff migration imaging result ignoring S-waves. Compared with Figure 21a, artifacts at the top of the image still exist, but the artifacts are thinner and weaker. Moreover, the interferences across the shallower interface and beneath the deeper interface both disappear. Figure 21f shows the DAS beamforming result ignoring S-waves. Compared with Figure 21d, the artifacts at the top of the image still exist but are weaker, and the interferences beneath the deeper interface and on both sides of the shallower interface disappear. Finally, Figure 21i shows the MVSS beamforming result ignoring S-waves. Compared with Figure 21g, the artifacts at the top of the image still exist, albeit with weaker energy, and the interference beneath the deeper interface disappears. In addition, compared with Figure 21f, the artifacts on both sides of the interfaces disappear.
From this discussion, we have found the following: 1.
The horizontal artifacts at the top of the images are caused by surface waves, while the artifacts beneath the interfaces are caused by S-waves, probably P-S waves; 2.
When S-waves are eliminated, the direct wave still exists, but the surface wave does not. This greatly weakens the artifacts at the top of the image;

3.
The ability of MVSS beamforming to suppress surface wave artifacts at the top of the image and the S-wave artifacts on both sides of the interfaces is superior. However, the S-wave artifacts beneath the interfaces still affect MVSS beamforming as much as they do the other methods.

Robustness to Random Noise
The signals in the numerical experiments were clear synthetic signals (Figure 22a). The SNR is defined as follows, where P signal and P noise are the power levels of the signal and noise, respectively, and A signal and A noise are the amplitudes of the signal and noise, respectively: SNR(dB) = 10 log 10 P signal P noise = 20 log 10 A signal 2 A noise 2 (1) We added Gaussian noise to the signal (Figure 22b) to make the SNR 0 dB (the power of noise equals the power of the target reflection signals) and then generated images using Kirchhoff migration, DAS beamforming, and MVSS beamforming.
In the Kirchhoff migration result with the noisy signal (Figure 22d), the interference caused by white noise can be observed as dots in the image. In the imaging result of DAS beamforming for the signal with Gaussian noise (Figure 22f), compared with the imaging result containing clear signals (Figure 22e), the white noise in the signal is detectable in the imaging result as well. Figure 22h shows the MVSS beamforming result for the signal with Gaussian noise. Compared with the Kirchhoff migration and DAS beamforming results, the influence of white noise is much less apparent for the MVSS beamforming results. However, relative to the imaging result with a clear signal (Figure 22h), although the noise is well suppressed, the brightness of the reflection interface decreases. The reason is that the energy of all signals increases when white noise is added, which narrows the energy gap separating the reflection signal and other signals from empty space. As a consequence, the energy of the reflective interface is relatively reduced. This phenomenon corresponds to the principle of keeping the effective signal gain unchanged while minimizing the energy of the whole image in the MVSS beamforming method.

Focus Enhancing by a Signal Advance Correction
In real-world applications, the velocity assigned to the background is inaccurate. Although the phenomenon we discuss in this section was not apparent in our numerical experiments, it does exist in practice. Therefore, another model was designed at a smaller scale with a higher velocity to observe more obvious phenomena and to verify how MVSS beamforming works when an inaccurate background velocity is provided. Figure   In the signals received in a near-surface situation, the time from zero to the first arrival is t. However, t is not the most appropriate value to use because the time employed in the algorithm is the peak-to-peak time or the time from first arrival to first arrival ( Figure 24). Thus, we fixed all signals by half a cycle to correct this phenomenon, called 'defocusing', to match the wave peak times, and the signal was advanced by a quarter of a cycle, half a cycle, and three-quarters of a cycle of the source wavelet, producing a set of images using MVSS beamforming. The results show that the imaging algorithm works best when the signal is advanced by half a cycle. However, the location of the objective becomes shallower than the real location as a result. To better observe the signal correction described above, we implemented the imaging algorithm in a model with smaller point anomalies and a high background velocity and excluded both surface waves (direct waves) and S-waves. The results with no processing, a 1/4-cycle advance, a 1/2-cycle advance, and a 3/4-cycle advance are shown in Figure 25. Upward-concave artifacts can be detected near the point target in the image without processing; this is the phenomenon known as defocusing. The reason for this defocusing is the mismatch between the delay information and the reflection signals. With an increase in the signal correction amount, the best focusing effect was achieved when the correction value was 1/2 of a cycle. When the correction value exceeds this value, the imaging defocuses in the other direction (concave-downward); the reason for this is again the mismatch between the delay information and the reflection signals. This phenomenon also appears in the 30 m × 20 m model, although it is not particularly obvious due to the lower frequency and slower wave velocity. The red line in Figure 26 is a reference line at a depth of 5 m (Figure 26a), and the red box marks the comparison of different time advance corrections. After the signal advance correction was applied, the focusing effect improved, and the focusing effect was best when the correction value was 1/2 of a cycle (Figure 26d). When the correction value increased to 3/4 of a cycle (Figure 26e), the image began to defocus to the other side (shallower side). It is worth noting that due to the signal advance correction, all the imaging targets moved to a shallower position. Therefore, this processing scheme will enhance the focusing effect while causing the target position to be slightly shifted.

Computational Efficiency
In terms of the size of our numerical experiment (20 shots, 151 receivers, 1001 time grids per signal), the calculation time needed to implement each imaging method ranged from several seconds to minutes, as summarized in Table 4. DAS beamforming was the most efficient method, with a fast imaging time of 3.9 s, followed by MVSS beamforming at 201.1 s, and Kirchhoff migration took the longest at 520.1 s. If the time delay was calculated according to the background velocity and stored prior to imaging, the calculation times of DAS and MVSS beamforming were further shortened. In the Kirchhoff migration adopted in this paper, the travel times from source to scatter point and from scatter point to receiver were approximated by a Dix equation using the RMS velocity. On the other hand, the travel times were calculated based on the geometry relationship between the source, scatter point, and receiver. To some extent, DAS beamforming is a simplified option of Kirchhoff migration, which results in the computational efficiency difference. Conceivably, if the condition of the medium at each point is known, the Kirchhoff migration method should theoretically have the best imaging results. In this experiment, the geological information was known approximately (background velocity), so the accuracy advantage of Kirchhoff migration was almost negligible, but the efficiency advantage of beamforming persisted. Beamforming methods assume that the wave propagates in a straight line and that the seismic velocity in the area is the same. This assumption is also available for Kirchhoff migration. However, in the adopted algorithm it still took time to estimate the travel time by a Dix equation using the RMS velocity desired from a homogeneous background model. This meant that the travel time was almost same as that in DAS beamforming, in which the travel time was simply calculated according to geometry and background velocity, but the calculation time was quite different. Therefore, the efficiency of DAS beamforming was as high as 130 times that of Kirchhoff migration. For this reason, we would like to apply beamforming imaging to detect underground anomalies in a scene with a single background velocity. The computational efficiency and accuracy of an algorithm have always been contradictory, but in near-surface exploration, the prior information is simple, which is suitable for beamforming methods to improve efficiency.

Coherence Factor Matrix
As stated in 2.5, the coherence factor (CF) matrix is an effective method to reduce artifacts and interferences in the image. The CF matrix is obtained in the delayed signal. Since the CF matrix is used as a weight to amplify the reflection wave and suppress the artifacts for each shot, it could be theoretically applied into other prestack time domain migration methods. Figure 27 here presents the results when the CF matrix was applied into Kirchhoff migration and DAS beamforming. The results shows that the CF matrix can obviously reduce the artifact caused by surface wave and interference on two sides of the images. However, there was also a contrast reduction at the lower floor in Kirchhoff. This might be a result of different travel time (delay time) calculation algorithm between MVSS beamforming and Kirchhoff migration.

Conclusions
This research introduced a medical imaging method, the minimum variance spatial smoothing (MVSS) beamforming imaging method, for near-surface seismic detection to achieve rapid and accurate imaging. To increase the robustness of the algorithm, the DL method was employed to add spatial white noise to the recorded wavefield and to adjust the weight calculation in the proposed method. The CF method was used to improve the image quality by amplifying the amplitudes of points with a strong focusing effect and depressing those with a weak focusing effect. We observed that reducing the delay time by one-half of a wavelet cycle provided the best image focusing effect when the wavelet length was not short enough for the imaging area in a near-surface situation. The method was tested on multiple synthetic near-surface geological models and compared with DAS beamforming, Kirchhoff migration, and reverse time migration. The results indicate that the MVSS beamforming method has a high computational efficiency while maintaining a high imaging quality, a high imaging resolution and a high contrast. The MVSS beamforming method provides the superior suppression of artifacts caused by surface waves as well as other imaging artifacts or interferences caused by S-waves, surface waves, and white noise. Meanwhile, we suggest that future works should focus on adapting inhomogeneous velocity models into beamforming imaging.
In signal superposition, we suppose that the number of receivers is M and that the signal of the receiver array is x(t), which includes the signal from target point s(t) as well as noise i(t) and interference v(t): where a is the direction vector of the beam and t is the time sequence of the signal. The beamformer b(t) is defined as: where w is the beamforming weight vector, H represents the conjugate transpose, and w m * represents a conjugate matrix of w m .
is the signal of each receiver after a time delay. τ m is the delay value of each receiver. The signal is similar to a vertically incident plane wave at each receiver after time delay processing. In this case, the direction vector a is a vector composed entirely of 1: where T represents matrix transposition. When w changes with the received signal to improve the image quality, b(t) becomes an adaptive beamformer. The weight vector w (Capon, 1969) [21] can be found from the maximum of the signal-to-interference-plus-noise ratio (SINR) [33]: where σ 2 s is the signal power, R i+n (t) is the interference-plus-noise covariance, and a is the direction vector in Equation (A5). The Maximizing Equation (A6) is equivalent to minimizing the output interference-plus-noise power while maintaining a distortionless response to the desired signal. In practice, the exact interference-plus-noise R i+n (t) is unavailable, so the sample covariance matrix R(t) is used in N receivers: Then, this issue can be equivalent to: min w w H Rw , subject to w H a = 1 (A8) By using the Lagrange multiplier method, the analytic formula is: where w MV is the MV beamforming weight vector and a is a known vector according to Equation (A5). The next step is to obtain the estimated covariance matrix R of the signal data. The spatial smoothing method is used to obtain a good estimation. The receiver array is divided into several overlapping subarrays, and the covariance matrix of these subarrays is averaged to obtain a robust covariance matrix estimation. This is also defined as spatial smoothing processing [22]:R where M is the total number of receivers and L is the length of the subarray. As shown in Figure A2, there are M receivers in total, and they are divided into M − L + 1 subarrays. Each of the subarrays is L receivers long. From left to right, subarray No. 1 to subarray No. M − L + 1 covers all receivers.  The size of each subarray must satisfy L ≤ M/2 (Asl, 2009 [26]) to ensure that the estimated covariance matrix is invertible (full rank). However, reducing L increases the robustness but reduces the resolution, and setting L = 1 makes the beamformer a DAS beamformer with uniform weights. Therefore, there is a trade-off between the robustness and resolution. The critical value L = M/2 is taken to achieve the highest possible imaging resolution.
Diagonal loading (DL) [34] is adopted to improve the robustness of the algorithm by adding spatial white noise to the recorded wavefield and to adjust the weight calculation [25] using: to replaceR(t), where δ represents the DL factor and I represents the identity matrix. The DL factor should satisfy δ 1/L. Then, we replace R in (8) with R dl , and the weight of the beamformer becomes: The beamformer can be written as: A weighting factor modified from the CF map is presented as: CF ranges from 0 to 1: when CF(t) = 1, it is a perfect focused point at time t, and when CF(t) = 0, the signals of different receivers are not coherent. After multiplication by the amplitude beamformer b(t), the final image of the Nth shot I t (N) is obtained as:

Appendix B.2. Fold
A Gaussian curve is used to form folds in the model because a Gaussian curve can satisfactorily present folds when the PML layer and fault dislocation are cut ( Figure A4).
where a is the fold depth, b is the horizontal location of the fold and c is a parameter controlling the width of the fold. Figure A4. Gaussian curve used to control the fold shape.
The interfaces between the layers are often not totally smooth. The curve y g (x) is used as a baseline to create fluctuations. The fluctuation y f is composed of random floating values with 0 as the mean. Now, the interfaces y i between layers ( Figure A5) can be obtained by Equation (A20). Since the Gaussian curve and fluctuations are randomly generated, the interfaces can be either parallel or nonparallel ( Figure A5). y i (x) = y n + y g (x) + y f (x) (A21) Figure A5. Interface with fluctuations. Now, we can obtain a model with horizontal parallel layers, folds, random fluctuations, faults, and caves incorporated ( Figure A8). Figure A8. Model with folds, random fluctuations, faults, and caves.