Two-Dimensional Full-Waveform Joint Inversion of Surface Waves Using Phases and Z/H Ratios

: Surface-wave dispersion and the Z/H ratio are important parameters used to resolve the Earth’s structure, especially for S-wave velocity. Several previous studies have explored using joint inversion of these two datasets. However, all of these studies used a 1-D depth-sensitivity kernel, which lacks precision when the structure is laterally heterogeneous. Adjoint tomography (i.e., full-waveform inversion) is a state-of-the-art imaging method with a high resolution. It can obtain better-resolved lithospheric structures beyond the resolving ability of traditional ray-based travel-time tomography. In this study, we present a systematic investigation of the 2D sensitivities of the surface wave phase and Z/H ratio using the adjoint-state method. The forward-modeling experiments indicated that the 2D phase and Z/H ratio had different sensitivities to the S-wave velocity. Thus, a full-waveform joint-inversion scheme of surface waves with phases and a Z/H ratio was proposed to take advantage of their complementary sensitivities to the Earth’s structure. Both applications to synthetic data sets in large- and small-scale inversions demonstrated the advantage of the joint inversion over the individual inversions, allowing for the creation of a more uniﬁed S-wave velocity model. The proposed joint-inversion scheme offers a computationally efﬁcient and inexpensive alternative to imaging ﬁne-scale shallow structures beneath a 2D seismic array.


Introduction
Surface waves are essential for resolving crust and upper mantle structures at various length scales. Surface waves have dispersive characteristics; that is, a surface wave travels with different velocities at different frequencies in the wave-propagation process. This feature has been well studied and widely used to image Earth structures, especially for the S-wave velocity model. Teleseismic surface-wave tomography was used in earlier studies, and it offered a good resolution for the upper mantle [1][2][3]. However, it is difficult to constrain structures in the shallow to middle crust due to the low frequency of such waves. In recent decades, ambient-noise tomography was developed, which can extract the Green's function of a surface wave from ambient-noise data, and it significantly improved the surface-wave-dispersion data coverage (mostly less than a 40 s period). Short-to intermediate-period surface waves have been widely used for imaging regional-or localscale crustal structures [4][5][6][7].
However, the surface-wave-dispersion curve is only sensitive to the integral feature of the velocity structure of a depth range, resulting in ambiguities during inversion. Studies have shown that the amplitude information of a surface wave can also help improve the velocity structure. The Z/H ratio (or ellipticity) of a Rayleigh wave is always defined as the amplitude ratio of the vertical and horizontal components of the fundamental mode of the Rayleigh wave. The Z/H ratio of a Rayleigh wave is only controlled by the structure

Calculation of 2D Rayleigh Wave Phase and Z/H-Ratio Kernel
Tanimoto and Rivera [9] calculated the 1D depth-sensitivity kernel of the Rayleigh wave dispersion and the Z/H ratio either theoretically or numerically by perturbing the model parameters at various depths. The features of the 1D kernel include that both the dispersion and Z/H ratio are sensitive to changes in the S-wave velocity and the density, but they are less sensitive to the P-wave velocity. For the dispersion, the S-wave velocity sensitivity is positive at different depths, suggesting that the dispersion provides more constraints on the integral velocity feature. In contrast, the S-wave velocity sensitivity of the Z/H ratio changes from positive near a free surface to negative at deeper depths, implying that the Z/H ratio resolves the velocity contrasts better. To study the features of the 2D kernel, we derived equations for the 2D full-wave sensitivity kernels of the phases and the Z/H ratio based on the adjoint-state theory.

D Full-Wave Sensitivity Kernels of the Traveltime Equation
First, we briefly review the travel-time and amplitude-sensitivity kernels for singlecomponent seismograms. Following the full-wave finite-frequency theory, the perturbations in the overall misfit δχ are linearly related to the relative perturbations in the density (δln ρ), P-wave velocity (δln α), and S-wave velocity (δln β) : In general, two primary measurements are used in adjoint tomography: the phase delay and the amplitude reduction. For the phase-delay measurements, the travel-time misfit perturbations (δt) can be represented as follows: where δt is the travel-time anomaly; δs(t) is the perturbation of the waveform displacement, s(t) − s(t); the tilde represents the synthetics based on the reference model; and t 1 and t 2 are the minimum and maximum values of the time window, respectively. ∂ is the time derivative, so δt is equal to the lag time at the maximum of the cross-correlogram between the observed and synthetic waveform. K p α , K p β , and K p ρ are the travel-time Fréchet kernels for the density, P-wave velocity, and S-wave velocity, respectively, which can be calculated using the adjoint-state method. The adjoint source of the cross-correlation travel-time is: where w(t, ω) is the time window over which the measurement of the cross-correlation traveltime is made, s (t, ω) is the waveform recorded by a station, the dot represents the time derivative, and ω is the central frequency of the Gaussian filter.

D Full-Wave Sensitivity Kernels of the Z/H Ratio Equation
For the amplitude reduction measurements, the natural logarithm of the amplitude reduction (δln A) can be expressed as follows: Here, A is defined as the ratio between (1) the amplitude of the cross-correlogram of the observed and synthetic waveforms at lag time δt and (2) the maximum amplitude of the auto-correlogram of the synthetic waveform. K amplitude-reduction Fréchet kernels for the density, P-wave velocity, and S-wave velocity, respectively. The Z/H ratio (η) of a Rayleigh wave is always defined as the Z-component's amplitude (A Z ) over the H-component's amplitude (A H ) at a certain frequency ω: The perturbation of the Z/H ratio can be expressed as the difference between the Z-component and H-component of the amplitude perturbation: If we consider the fact that the variation in the measurements is only caused by the variation in the S-wave velocity (β), then where K η β , K Z β , and K H β are the sensitivity kernels to the S-wave velocity of the Z/H ratio, the Z-component's amplitude, and the H-component's amplitude, respectively.
This equation holds true for any volume V, so the Z/H-ratio kernel of the S-wave velocity can be expressed as: Similarly, we can obtain the following equations for P-wave velocity (α) and density (ρ): Thus, to generate the kernels above, we only need to calculate the sensitivity kernels of the Z-component's amplitude and the H-component's amplitude, which can also be solved by using the adjoint-state method. The adjoint source of the amplitude is: where w(t, ω) is the time window over which the measurement of the amplitude is made, s(t, ω) is the waveform recorded by a station, and ω is the central frequency of the Gaussian filter. The definition of Z/H ratio in this study is noteworthy. Generally, the previous studies [7,27] commonly used the maximum amplitude of the envelope to calculate Z/H ratio as follows: where s z (ω) and s H (ω) are the Z-component and H-component of the seismogram processed using a Gaussian filter at a certain frequency ω, respectively, and E is their envelope.
In this study, we used the following equation defined in [42]: where t 1 and t 2 are the start and end of the measurement window, respectively. With respect to physical properties, both definitions in Equations (12) and (13) are robust and acceptable, and could result in very similar Z/H ratios for synthetic tests. In the real data application, they will be different, depending on the S/N level of the waveform data. Determining which is better would require a more detailed discussion regarding a different real data set. Otherwise, it should be noted that we need to guarantee that the same definition is used in the data processing and kernel calculation.

Sensitivity Calculation and Analysis
To illustrate the procedure for obtaining the travel-time and Z/H-ratio-sensitivity kernels, we used the 2D spectral-element method (SPECFEM2D) to calculate the 2D sensitivity kernels via the adjoint-state method. The numerical parameters of the modeling in the computation are shown in Table 1. The model measures 800 km in the horizontal direction, with 320 uniform mesh nodes (quadrangles in 2D), and 100 km in the vertical direction, with 40 uniform nodes. A total of (4 × 320 + 1) * (4 × 40 + 1) = 206,241 unique grid points were used for the 2D spectral-element discretization with a four-degree polynomial (Komatitsch et al., 2005). The average grid spacing of the mesh was 2.5 km, which was sufficient to accurately simulate the surface waves at periods of greater than~8 s . For simplicity and to minimize the possible interference of other arrivals on the Rayleigh waves, the P-wave velocity, the S-wave velocity, and the density were all held constant (Vp = 6000 m/s, vs. = 3500 m/s, and ρ = 2800 g/m 3 ). The time interval of the simulation was set to 0.02 s to satisfy the numerical stability, and the time step was set to 12,000. We placed a Gaussian-type source-time function with a dominant frequency of 0.2 Hz at location (650 km, 0 km) to generate the 2D synthetic waveform of the surface wave, which was computed using SPECFEM2D. A station was set at location (150 km, 0 km).  Figure 1 illustrates how to obtain the single-component phase sensitivity of the kernels (K pZ β , K pH β ) to δVs on a vertical profile and horizontal profile in 2D media for surface waves with a period of 20 s. First, we obtained the seismograms of the Z-component ( Figure 1a) and H-component (Figure 1b) recorded by the station, which were processed using a Gaussian filter with a central frequency of 0.5 Hz. Then, we obtained the adjoint source of the Z-component ( Figure 1c) and H-component (Figure 1d) at the station using Equation (3). We performed one adjoint simulation by injecting the back adjoint source at the station into the model, which created adjoint wavefield s † . Finally, the phase-sensitivity kernels of the Z-component ( Figure 1e) and H-component ( Figure 1f) at periods of 20 s were obtained using the time integration of the interaction of s † and the forward wavefield s. To study the features of the phase kernel in the depth and lateral directions at different periods, the Z-component kernels at three other periods of 10 s (Figure 1g   to δVs on a vertical profile and horizontal profile in 2D media, as well as their differential kernels (K − K ), which is called the Z/H-ratio kernel (K ) in this paper. The procedure for obtaining the amplitude kernels was very similar to that for the phase kernels, except that the adjoint source was generated using different measurements followed by Equation (11). The amplitude-sensitivity kernels of the Z-component and H-component with periods of 20 s obtained using the adjoint-state method are shown in Figure 2e   to δVs on a vertical profile and horizontal profile in 2D media, as well as their differential kernels (K pZ β − K pH β ), which is called the Z/H-ratio kernel (K η β ) in this paper. The procedure for obtaining the amplitude kernels was very similar to that for the phase kernels, except that the adjoint source was generated using different measurements followed by Equation (11). The amplitude-sensitivity kernels of the Z-component and H-component with periods of 20 s obtained using the adjoint-state method are shown in Figure 2e   The experiment described above shows that both the 2D-phase and Z/H-ratio kernels provide more information than the 1D sensitivity kernels. The 1D kernels only had sensitivities in the vertical direction because they did not consider the source direction. In contrast, the 2D kernels varied with distance from the station and with depth, which indicated that they had strong resolving powers in both the depth and lateral directions.
Both kernels had different features in the vertical and lateral directions. The phase kernel was concentrated along the propagation path, with a much broader sensitivity range than the Z/H-ratio kernels, while the Z/H ratio sensitivity was only concentrated near the station. The reason for this is that the amplitude-sensitivity kernels of the Z-component and the H-component had the same pattern on the source side and a common propagation path, and the use of the component-differential calculation could remove the Note that only the kernels near the station (the region outlined by the dashed line in (g)) were plotted.
The experiment described above shows that both the 2D-phase and Z/H-ratio kernels provide more information than the 1D sensitivity kernels. The 1D kernels only had sensitivities in the vertical direction because they did not consider the source direction. In contrast, the 2D kernels varied with distance from the station and with depth, which indicated that they had strong resolving powers in both the depth and lateral directions.
Both kernels had different features in the vertical and lateral directions. The phase kernel was concentrated along the propagation path, with a much broader sensitivity range than the Z/H-ratio kernels, while the Z/H ratio sensitivity was only concentrated near the station. The reason for this is that the amplitude-sensitivity kernels of the Z-component and the H-component had the same pattern on the source side and a common propagation path, and the use of the component-differential calculation could remove the observational errors in the source and propagation path. This feature was consistent with previous studies of the Z/H ratio, especially for 1D cases; that is, the Z/H ratio was only sensitive to the structures beneath the station.
To compare the sensitivity to the S-wave velocity in depth for these two data sets, we reduced the spatial dimension of the phase and the Z/H-ratio kernels shown in Figures 1g-j and 2g-j from 2D to 1D by laterally summing the values on the horizontal plane in each vertical grid and plotting them as a function of depth. Figure 3 shows the reduced 1D kernels at periods of 10 s, 20 s, 30 s, and 40 s for the phase (Figure 3a) and Z/H ratio (Figure 3b). The results showed that their sensitivity kernels were different but complementary. The Rayleigh wave phase was more sensitive to the deeper structure than the Z/H ratio at the same period, while the Z/H ratio had a strong sensitivity to the shallow structure (e.g., the sedimentary layers). Moreover, the phase-sensitivity kernel was much broader than the Z/H-ratio kernel for the same period, which meant the phase information could be used to solve the smoothed velocity (background model), while the Z/H ratio could deal with the lateral heterogeneity (e.g., velocity anomaly or undulated interface) better. observational errors in the source and propagation path. This feature was consistent with previous studies of the Z/H ratio, especially for 1D cases; that is, the Z/H ratio was only sensitive to the structures beneath the station.
To compare the sensitivity to the S-wave velocity in depth for these two data sets, we reduced the spatial dimension of the phase and the Z/H-ratio kernels shown in Figure 1gj and Figure 2g-j from 2D to 1D by laterally summing the values on the horizontal plane in each vertical grid and plotting them as a function of depth. Figure 3 shows the reduced 1D kernels at periods of 10 s, 20 s, 30 s, and 40 s for the phase (Figure 3a) and Z/H ratio ( Figure 3b). The results showed that their sensitivity kernels were different but complementary. The Rayleigh wave phase was more sensitive to the deeper structure than the Z/H ratio at the same period, while the Z/H ratio had a strong sensitivity to the shallow structure (e.g., the sedimentary layers). Moreover, the phase-sensitivity kernel was much broader than the Z/H-ratio kernel for the same period, which meant the phase information could be used to solve the smoothed velocity (background model), while the Z/H ratio could deal with the lateral heterogeneity (e.g., velocity anomaly or undulated interface) better. The effect of the velocity anomaly on the Z/H ratio was obvious and easily reachable from Figure 2. Here, we designed another experiment to demonstrate the effect of undulated interface on the Z/H-ratio kernels as shown in Figure 4. We chose a typical structure containing interface variation, which followed a co-sinusoidal function with a maximum amplitude of 5.0 km. The results showed that the Z/H-ratio sensitivity was significantly affected by the undulated interface, and the maximum difference between the undulated interface and the flat interface was about 50%. The experiments described above suggested that the Z/H-ratio kernel was much more sensitive to the model with velocity anomalies and interface variations. This meant that the Z/H ratio could help us to constrain a shallow crust with a complex lateral heterogeneity in which the surface-wave phase cannot be resolved, as well as to improve the accuracy of the inversion. The effect of the velocity anomaly on the Z/H ratio was obvious and easily reachable from Figure 2. Here, we designed another experiment to demonstrate the effect of undulated interface on the Z/H-ratio kernels as shown in Figure 4. We chose a typical structure containing interface variation, which followed a co-sinusoidal function with a maximum amplitude of 5.0 km. The results showed that the Z/H-ratio sensitivity was significantly affected by the undulated interface, and the maximum difference between the undulated interface and the flat interface was about 50%. The experiments described above suggested that the Z/H-ratio kernel was much more sensitive to the model with velocity anomalies and interface variations. This meant that the Z/H ratio could help us to constrain a shallow crust with a complex lateral heterogeneity in which the surface-wave phase cannot be resolved, as well as to improve the accuracy of the inversion.

Joint Inversion of the Rayleigh Wave Phase and the Z/H Ratio
As was demonstrated by the sensitivity analysis described in the previous section, the phase and the Z/H ratio of a Rayleigh wave had different sensitivities to the S-wave velocity structure, and they provided complementary constraints on the structure. Thus, joining the two data sets will help reduce the non-uniqueness of the inversion.
For adjoint tomography of the Rayleigh wave phase [32], we sought to minimize the travel-time misfits between the observed waveform and the synthetic waveform generated using SPECFEM2D. In this study, the frequency-dependent travel-time misfits can be written as: where m is the model vector, is the misfit function for the Rayleigh wave phase, is the number of periods, is the total number of misfit measurements in the i-th period, ∆ ( ) is the frequency-dependent phase delay between the jth pair of observed data and the synthetic data at frequency ω in the ith period, and , ( ) is the corresponding measurement uncertainty. Similarity, the frequency-dependent Z/H ratio misfit is defined as follows:

Joint Inversion of the Rayleigh Wave Phase and the Z/H Ratio
As was demonstrated by the sensitivity analysis described in the previous section, the phase and the Z/H ratio of a Rayleigh wave had different sensitivities to the S-wave velocity structure, and they provided complementary constraints on the structure. Thus, joining the two data sets will help reduce the non-uniqueness of the inversion.
For adjoint tomography of the Rayleigh wave phase [32], we sought to minimize the travel-time misfits between the observed waveform and the synthetic waveform generated using SPECFEM2D. In this study, the frequency-dependent travel-time misfits can be written as: where m is the model vector, χ RP is the misfit function for the Rayleigh wave phase, N c is the number of periods, M i is the total number of misfit measurements in the i-th period, ∆T j (ω) is the frequency-dependent phase delay between the jth pair of observed data and the synthetic data at frequency ω in the ith period, and σ T,j (ω) is the corresponding measurement uncertainty. Similarity, the frequency-dependent Z/H ratio misfit is defined as follows: where χ ZH is the misfit function for the Rayleigh wave's Z/H ratio, and ∆zh j (ω) is the Z/H ratio variation measured between the jth pair of the observed data and synthetic data with an uncertainty of σ zh,j .
Thus, the objective function of the joint inversion based on the adjoint-state method is defined as follows: where ω RP and ω ZH are the weights used to balance the two data types to prevent the results from being dominated by either one. The joint-inversion scheme is described in Figure 5.
where is the misfit function for the Rayleigh wave's Z/H ratio, and ∆ ℎ ( ) is the Z/H ratio variation measured between the jth pair of the observed data and synthetic data with an uncertainty of , . Thus, the objective function of the joint inversion based on the adjoint-state method is defined as follows: where and are the weights used to balance the two data types to prevent the results from being dominated by either one. The joint-inversion scheme is described in Figure 5.

Inversion Experiments
The complementary sensitivity of the phase and Z/H ratio to the vs. perturbation makes the joint inversion an important method of constraining S-wave velocity structures. Here, we designed two synthetic experiments to demonstrate the successful usage of this method on the large-and small-scale inversions.

Large-Scale Inversion Using Ambient Noise/Earthquake Surface Waves
Seismic linear arrays with dense station spacing have been deployed around the globe, which makes it possible to apply joint inversion to ambient-noise data or earthquake surface waves recorded by linear seismic arrays.
We set up a target model (Figure 6a) with alternating 6% box-shaped high-and low-Vs anomalies in the background model (also used as the initial model in this experiment; see material properties in Table 1), and no Vp or density perturbation was introduced into the target model. The size of the model and the number of uniform mesh nodes were the same as the parameter settings described in Section 2.3. All of the boxes with anomalies

Inversion Experiments
The complementary sensitivity of the phase and Z/H ratio to the vs. perturbation makes the joint inversion an important method of constraining S-wave velocity structures. Here, we designed two synthetic experiments to demonstrate the successful usage of this method on the large-and small-scale inversions.

Large-Scale Inversion Using Ambient Noise/Earthquake Surface Waves
Seismic linear arrays with dense station spacing have been deployed around the globe, which makes it possible to apply joint inversion to ambient-noise data or earthquake surface waves recorded by linear seismic arrays.
We set up a target model (Figure 6a) with alternating 6% box-shaped high-and low-Vs anomalies in the background model (also used as the initial model in this experiment; see material properties in Table 1), and no Vp or density perturbation was introduced into the target model. The size of the model and the number of uniform mesh nodes were the same as the parameter settings described in Section 2.3. All of the boxes with anomalies had lengths of 40 km, while the widths changed from 8 km in the upper row to 15 km in the middle row to 30 km in the lower row. These different sizes were designed to assess the spatial resolutions of the varying lengths for a Rayleigh wave with different periods. The source-time function was Gaussian, and the dominant frequency of the source was 0.2 Hz. Fifty stations were spaced 10 km apart and were located from 150 to 650 km, which could also be considered as a virtual source generating an empirical Green's function of the surface wave.
had lengths of 40 km, while the widths changed from 8 km in the upper row to 15 km in the middle row to 30 km in the lower row. These different sizes were designed to assess the spatial resolutions of the varying lengths for a Rayleigh wave with different periods. The source-time function was Gaussian, and the dominant frequency of the source was 0.2 Hz. Fifty stations were spaced 10 km apart and were located from 150 to 650 km, which could also be considered as a virtual source generating an empirical Green's function of the surface wave. We performed all of the forward and adjoint simulations using SPECFEM2D. Then, we filtered the synthetic and observed waveforms at four central period bands (10, 20, 30, and 40 s) using a Gaussian filter, which could cover the period band of 5 to 50 s quite well. Following the workflow of the joint inversion described in Figure 5, we measured the We performed all of the forward and adjoint simulations using SPECFEM2D. Then, we filtered the synthetic and observed waveforms at four central period bands (10, 20, 30, and 40 s) using a Gaussian filter, which could cover the period band of 5 to 50 s quite well. Following the workflow of the joint inversion described in Figure 5, we measured the frequencydependent travel-time and amplitude misfit between the synthetic and observed waveform and obtained the corresponding adjoint sources of the phase and amplitude, respectively. After that, we calculated the phase-sensitivity kernel and the amplitude kernel of the Z-component and H-component by injecting the adjoint sources at the stations based on the adjoint-state method. In addition, the Z/H-ratio kernels were calculated by subtracting the Z-component and H-component amplitude kernels. Finally, all of the event kernels were summed, smoothed, and preconditioned to obtain the final misfit gradient for updating the model. A 2D Gaussian function was used to smooth the summed kernel as a regularization procedure (Tape et al., 2010). The horizontal and vertical widths of the Gaussian function used to smooth the gradient were chosen as 30 km and 10 km, respectively, in the first 10 iterations. Then, they were reduced to smaller values of 10 km and 5 km, reflecting the incorporation of the shorter-period surface-wave measurements. Next, the smoothed kernel was preconditioned using a preconditioner, which was numerically calculated through the interaction between the forward and adjoint acceleration fields; namely, the pseudo-Hessian (Luo, 2012).
where s and s † are the forward and adjoint displacements, respectively, and N s is the number of sources. Finally, we used the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method [43] (see detail shown in Appendix A) to update the model, and a line-search method was used to determine the optimal step length for updating the model. Once the optimal step length was determined from the minimum of the total function for all of the bands, the model parameter m was updated as follows: where m i+1 and m i are the models of the (i + 1)th iteration and the ith iteration, respectively; and ν i and di are the optimal step length and the search direction (i.e., summed event kernels) of the ith iteration, respectively. To demonstrate the advantage of our joint-inversion framework, we also conducted two additional separate inversions using either only the Rayleigh-wave phase data or only the Z/H-ratio data. The separate inversions also began with the background model and used the same inversion parameters as the joint inversion, including the smoothing width and step lengths. Both data sets were mostly sensitive to the shear-wave velocity. We only inverted for the vs. structures. In total, we conducted three inversions: (1) adjoint tomography of the Rayleigh-wave phase data (i.e., phase inversion); (2) adjoint tomography of the Rayleigh-wave Z/H-ratio data (i.e., Z/H-ratio inversion); and (3) the joint inversion using the two data sets (i.e., joint inversion). Figure 6b,c show the gradients of the vs. for the first iteration of the phase and Z/H-ratio inversions. The gradients show that the two Rayleigh wave measurements had different and complimentary sensitivities to the crustal structure. The phase inversion was more sensitive to the deeper structure than the Z/H-ratio inversion, while the Z/H-ratio inversion had a strong sensitivity to the shallow structure. The total misfit evolution of the phase and Z/H-ratio inversions and the joint inversion are presented in Figure 7a,b. We can see that the joint inversion had a slower convergence rate and slightly larger misfits than those of the separate inversions, which was reasonable, since the joint-inversion scheme tried to fit both data sets simultaneously. The joint inversion converged after 32 iterations when the misfit changed over the last iteration because both the phase and Z/H-ratio data were less than 3%. Figure 7c To compare the results of the three types of inversions, we show their final vs. models in Figure 8. The Z/H-ratio information for the surface was mostly sensitive to the vs. structures at shallow depths (<30 km), which was limited by the period range of the Rayleigh wave in this study. Thus, the recovered vs. model (Figure 8b) was better resolved in the shallow regions, and the upper and middle rows show the correct pattern with the anomaly close to target model. Compared with the Z/H-ratio inversion, the phase inversion was sensitive to much deeper vs. structures and could constrain the broader range of the vs. structures at depths of ~20 km to 60 km (Figure 8a). The pattern of the anomaly in the middle and lower rows is almost recovered; however, the recovered pattern is rather smoothed and lacks constraints in the shallow structure, which was caused by the sensitivity characteristic of the phases shown in Figure 3a. Benefiting from the sensitivity to the vs. structures concentrated near the station, the Z/H-ratio inversion helped to better resolve the structures in the shallow areas that were not well resolved by the phase inversion. Thus, the addition of the Z/H ratio in the joint inversion helped to constrain the crustal structure at shallow depths. At greater depths, the joint inversion had a better resolution than the phase inversion with the correct amplitude recovery. These tests demonstrated that the joint inversion combined the complementary sensitivities of the surface wave phases and Z/H ratio, and was capable of building a more-unified vs. model, thus outperforming the inversions based on the individual data sets. To compare the results of the three types of inversions, we show their final vs. models in Figure 8. The Z/H-ratio information for the surface was mostly sensitive to the vs. structures at shallow depths (<30 km), which was limited by the period range of the Rayleigh wave in this study. Thus, the recovered vs. model (Figure 8b) was better resolved in the shallow regions, and the upper and middle rows show the correct pattern with the anomaly close to target model. Compared with the Z/H-ratio inversion, the phase inversion was sensitive to much deeper vs. structures and could constrain the broader range of the vs. structures at depths of~20 km to 60 km (Figure 8a). The pattern of the anomaly in the middle and lower rows is almost recovered; however, the recovered pattern is rather smoothed and lacks constraints in the shallow structure, which was caused by the sensitivity characteristic of the phases shown in Figure 3a. Benefiting from the sensitivity to the vs. structures concentrated near the station, the Z/H-ratio inversion helped to better resolve the structures in the shallow areas that were not well resolved by the phase inversion. Thus, the addition of the Z/H ratio in the joint inversion helped to constrain the crustal structure at shallow depths. At greater depths, the joint inversion had a better resolution than the phase inversion with the correct amplitude recovery. These tests demonstrated that the joint inversion combined the complementary sensitivities of the surface wave phases and Z/H ratio, and was capable of building a more-unified vs. model, thus outperforming the inversions based on the individual data sets.

Small-Scale Inversion Using Active Sources
Another suitable application of this method is retrieving model parameters and characterizing the near-surface region, mainly the shear-wave velocity model using active sources, for engineering applications, geotechnical studies, and physical modeling tests. In this section, we used more realistic synthetic experiments to demonstrate the promising applications of this full-waveform joint-inversion method to small-scale inversions, with an improved resolution.
The 600 × 100 m target model for elastic S-wave and P-wave velocities used for the numerical experiments in this section is shown in Figure 9. With this model, we used 100 and 20 uniform mesh nodes in the horizontal and vertical directions, respectively, resulting in a total of (4 * 100 + 1) * (4 * 20 + 1) = 32,481 unique grid points. The target elastic model included heterogeneous Vp (Figure 9a) and vs. models (Figure 9b) with a homogeneous density of 1000 kg/m 3 . Both the Vp and vs. increased with depth from a low-velocity layer, with the exception of one low-velocity zone (LVZ) and two high-velocity zones (HVZs) in the middle. We assumed an initial model with a homogeneous Vp of 1000 m/s, a homogeneous vs. of 500 m/s, and a homogeneous density of 1000 kg/m 3 . We set up 50 shots located from 50 to 550 m at a depth 0.5 m below the model's surface, with an equal

Small-Scale Inversion Using Active Sources
Another suitable application of this method is retrieving model parameters and characterizing the near-surface region, mainly the shear-wave velocity model using active sources, for engineering applications, geotechnical studies, and physical modeling tests. In this section, we used more realistic synthetic experiments to demonstrate the promising applications of this full-waveform joint-inversion method to small-scale inversions, with an improved resolution.
The 600 × 100 m target model for elastic S-wave and P-wave velocities used for the numerical experiments in this section is shown in Figure 9. With this model, we used 100 and 20 uniform mesh nodes in the horizontal and vertical directions, respectively, resulting in a total of (4 * 100 + 1) * (4 * 20 + 1) = 32,481 unique grid points. The target elastic model included heterogeneous Vp (Figure 9a) and vs. models (Figure 9b) with a homogeneous density of 1000 kg/m 3 . Both the Vp and vs. increased with depth from a low-velocity layer, with the exception of one low-velocity zone (LVZ) and two high-velocity zones (HVZs) in the middle. We assumed an initial model with a homogeneous Vp of 1000 m/s, a homogeneous vs. of 500 m/s, and a homogeneous density of 1000 kg/m 3 . We set up 50 shots located from 50 to 550 m at a depth 0.5 m below the model's surface, with an equal  Synthetic data were computed for the target model and the initial model and were simulated using source-time functions with a dominant 10 Hz Ricker wavelet. Then, we conducted the joint inversion following the same inversion procedures described in Figure 5. To avoid the local minima and reduce the non-linearity of the waveform misfit function, we adopted a hierarchical strategy, in which we successively inverted from low to high frequency progressively through 20, 30, 40, 50, 60, and 80 Hz. We also conducted three types of inversions to further demonstrate the advantage of the joint inversion over the separate inversions. For each inversion, we simultaneously inverted both the S-wave and P-wave velocities.
First, we used the objective function in Equation (13) to update the model via the phases. The result of using the Rayleigh wave phase is shown in the inverted S-wave (Figure 10a) and P-wave (Figure 11a) velocity models. A better result for the vs. final model was expected, since the surface waves were more sensitive to this elastic parameter. Neither the S-nor the P-wave velocity model converged to a model near the target shown in Figure 9. The most probable reason for this was that the inversion suffered from strong cycle-skipping problems, which easily occur in surface waveform FWI, with incorrect updates leading to a secondary minimum. Next, we replaced the phases with the Z/H ratio in the objective function. That is, we minimized the discrepancy defined by the Z/H ratio in Equation (14). Figures 10b and 11b show the resulting vs. and Vp models. Because the envelope information using the Z/H ratio calculation in this paper mitigated the cycleskipping effects, working only with the Z/H ratio led to a reasonably successful recovery of the vs. model of the shallow structure, but it did not constrain the deeper structure. Moreover, the capacity of the Z/H ratio to constrain the shallow crust with a complex lateral heterogeneity helped us obtain a significantly better low-velocity layer where the surface wave phase could not resolve the structure. The S-wave velocity structure was resolved, albeit with a low resolution, and only the very top portion of the P-wave velocity model showed signs of being updated in the desirable direction. The Z/H ratio inversions Synthetic data were computed for the target model and the initial model and were simulated using source-time functions with a dominant 10 Hz Ricker wavelet. Then, we conducted the joint inversion following the same inversion procedures described in Figure 5. To avoid the local minima and reduce the non-linearity of the waveform misfit function, we adopted a hierarchical strategy, in which we successively inverted from low to high frequency progressively through 20, 30, 40, 50, 60, and 80 Hz. We also conducted three types of inversions to further demonstrate the advantage of the joint inversion over the separate inversions. For each inversion, we simultaneously inverted both the S-wave and P-wave velocities.
First, we used the objective function in Equation (13) to update the model via the phases. The result of using the Rayleigh wave phase is shown in the inverted S-wave ( Figure 10a) and P-wave (Figure 11a) velocity models. A better result for the vs. final model was expected, since the surface waves were more sensitive to this elastic parameter. Neither the S-nor the P-wave velocity model converged to a model near the target shown in Figure 9. The most probable reason for this was that the inversion suffered from strong cycle-skipping problems, which easily occur in surface waveform FWI, with incorrect updates leading to a secondary minimum. Next, we replaced the phases with the Z/H ratio in the objective function. That is, we minimized the discrepancy defined by the Z/H ratio in Equation (14). Figures 10b and 11b show the resulting vs. and Vp models. Because the envelope information using the Z/H ratio calculation in this paper mitigated the cycleskipping effects, working only with the Z/H ratio led to a reasonably successful recovery of the vs. model of the shallow structure, but it did not constrain the deeper structure. Moreover, the capacity of the Z/H ratio to constrain the shallow crust with a complex lateral heterogeneity helped us obtain a significantly better low-velocity layer where the surface wave phase could not resolve the structure. The S-wave velocity structure was resolved, albeit with a low resolution, and only the very top portion of the P-wave velocity model showed signs of being updated in the desirable direction. The Z/H ratio inversions appeared to converge at the price of losing resolution by discarding the phase information. Finally, we proposed a full-waveform joint inversion of the phase and Z/H ratio in the same multiscale framework. Figures 10c and 11c show the results obtained using the joint-inversion philosophy. The addition of the Z/H ratio to the joint inversion helped constrain the S-wave velocity structure in the near-surface region, so the low-velocity layer was correctly corrected and converged to the target model. Moreover, the mitigation of the cycle-skipping problems contributed to the Z/H ratio and could help the phase information better constrain the principal features of the deeper structure. The LVZ and HVZ in the middle were well covered with correct amplitudes. In addition, the interfaces between the velocity zones were better defined in the vs. model. In the Vp model, although only the very top portion of the P-wave velocity model was acceptably imaged, it can be seen that the inclusion of the phase information in the joint inversion provided more additional structure to the P-wave velocity model compared with the inversion results obtained using on obtained using only the Z/H ratio. appeared to converge at the price of losing resolution by discarding the phase information. Finally, we proposed a full-waveform joint inversion of the phase and Z/H ratio in the same multiscale framework. Figures 10c and 11c show the results obtained using the joint-inversion philosophy. The addition of the Z/H ratio to the joint inversion helped constrain the S-wave velocity structure in the near-surface region, so the low-velocity layer was correctly corrected and converged to the target model. Moreover, the mitigation of the cycle-skipping problems contributed to the Z/H ratio and could help the phase information better constrain the principal features of the deeper structure. The LVZ and HVZ in the middle were well covered with correct amplitudes. In addition, the interfaces between the velocity zones were better defined in the vs. model. In the Vp model, although only the very top portion of the P-wave velocity model was acceptably imaged, it can be seen that the inclusion of the phase information in the joint inversion provided more additional structure to the P-wave velocity model compared with the inversion results obtained using on obtained using only the Z/H ratio.   In Figure 12, we show how well the waveform data were fit under phase and joint inversion as described in this paper. The left column shows the vertical displacement components (Uz), and the right column shows the horizontal components (Uh) for three traces of stations at X = 150, 300, and 450 m from one shot gather located at X = 50 m. The black solid lines represent the target data, red dashed lines represent the initial predictions, and blue solid lines and green solid lines represent the final models obtained by phase inversion and joint inversion, respectively. We can see that the phase inversion suffered from a cycle-skipping problem, creating local minima in the FWI objective function, so the final traces were badly matched to the target traces (Figure 12a). Joint inversion incorporating Z/H ratio improved the vs. model by mitigating cycle-skipping issues, so the traces of joint inversion were matched to almost-perfect agreement with the target traces (Figure 12b). In Figure 12, we show how well the waveform data were fit under phase and joint inversion as described in this paper. The left column shows the vertical displacement components (Uz), and the right column shows the horizontal components (Uh) for three traces of stations at X = 150, 300, and 450 m from one shot gather located at X = 50 m. The black solid lines represent the target data, red dashed lines represent the initial predictions, and blue solid lines and green solid lines represent the final models obtained by phase inversion and joint inversion, respectively. We can see that the phase inversion suffered from a cycle-skipping problem, creating local minima in the FWI objective function, so the final traces were badly matched to the target traces (Figure 12a). Joint inversion incorporating Z/H ratio improved the vs. model by mitigating cycle-skipping issues, so the traces of joint inversion were matched to almost-perfect agreement with the target traces (Figure 12b).

Discussion
As the extension of 2D surface-wave adjoint tomography method [32], the computing of 2D joint inversion also kept the efficiency. Taking the small-scale inversion as an example, the model was 600 m in length and 100 m in depth with 100 × 20 elements, and the simulation contained 20,000 time steps with a 0.0003 s time interval. As shown in Table 2, the phase inversion included 50 forward-modelings and calculations of kernels; the number of processors for each source was 4, leading to a total cost of 200 CPU hours. The computation time was 6.25 h for 46 iterations. For joint inversion, the CPU hours cost was doubled (400 CPU hours total) due to the addition of the Z/H-ratio kernel calculation. However, the actual computation also showed relative efficiency, and 38 iterations could be completed in just 5.8 h, which was acceptable compared with the computational cost in the 3D case. On the basis of the time cost, the 2D joint inversion can be very efficient when computational resources are limited.

Discussion
As the extension of 2D surface-wave adjoint tomography method [32], the computing of 2D joint inversion also kept the efficiency. Taking the small-scale inversion as an example, the model was 600 m in length and 100 m in depth with 100 × 20 elements, and the simulation contained 20,000 time steps with a 0.0003 s time interval. As shown in Table 2, the phase inversion included 50 forward-modelings and calculations of kernels; the number of processors for each source was 4, leading to a total cost of 200 CPU hours. The computation time was 6.25 h for 46 iterations. For joint inversion, the CPU hours cost was doubled (400 CPU hours total) due to the addition of the Z/H-ratio kernel calculation. However, the actual computation also showed relative efficiency, and 38 iterations could be completed in just 5.8 h, which was acceptable compared with the computational cost in the 3D case. On the basis of the time cost, the 2D joint inversion can be very efficient when computational resources are limited. The synthetic examples above provided us with two actual situations for the application of joint inversion. For large-scale inversion, we could use earthquake surface waves recorded by linear seismic arrays. In reality, it is difficult to find enough earthquakes along the profile. Fortunately, the approach has been developed over decades, in which the Z/H ratio can be extracted from ambient-noise cross-correlation data [7], which makes it the most suitable situation for applying the joint-inversion method. The tomographic experiment in Section 4.1 shows that both Rayleigh wave phase and Z/H ratio had sensitivity to S-wave velocity in the crustal structure, but their sensitivity patterns were different. Rayleigh wave phase velocity dispersion was sensitive to the average velocity of the depth range sampled. The Z/H ratio was more sensitive to the shallower structure compared to the Rayleigh wave phase velocity dispersion at the same frequency; thus, the Z/H ratio was more helpful to constrain the shallow structure, such as the sedimentary basin. The joint inversion combined the complementary sensitivities of the phases and Z/H ratio and was capable of building a more unified crustal structure.
For small-scale inversion, this method could be used effectively in retrieving model parameters and characterizing the near-surface region using active-source surface waves for engineering applications, geotechnical studies, and physical modeling tests. The tomographic experiment in Section 4.2 indicated that joint inversion has another important advantage; that is, it can deal with the cycle-skipping problem, which easily occurs in FWI of surface waves in near-surface cases. This contributes to envelope-information usage in the Z/H ratio calculation, which avoids incorrect vs. model updates, leading to a secondary minimum. Due to limited sensitivity to the P-wave velocity model, it was found that the P-wave model also seemed to be not well constrained by the joint inversion. This can be improved by inclusion of body-wave data sets, such as the receiver function for large-scale inversions and seismic refractions for small-scale inversions, which is sensitive to the P-wave velocity.
Before application to real data, there are some issues that require more study. One is the difficulty of extracting stable Z/H-ratio measurements from a single source-station pair. Similarly, using amplitude information from an ambient-noise cross-correlation is not always reliable. Another issue is the 3D structural effect on both phase and Z/H-ratio kernels, which were not properly modeled in the 2D case. For example, the ray path may not follow the profile. Thus, the requirements for the linear profile need to be considered before a field experiment is conducted. In addition, the real data FWI is indeed more difficult in near-surface application because of many practical problems, such as waveform data-signal processing problems in signal-to-noise ratio improvement, the effect of rugged topography beneath real irregular receiver array acquisition, a very complicated nearsurface effect, the random or man-made noise effect, and heterogeneous distribution of ambient-noise sources. Ignoring any issue in the real data waveform inversion could cause the inversion result to be unexplained or unstable. This paper focused on the theoretical and technical aspects, and the real data application will be discussed and published in future works.

Conclusions
In this study, we developed a full-waveform joint-inversion scheme for the Rayleigh wave phase and the Z/H ratio simultaneously. First, we developed an approach for calculating the 2D sensitivity kernels of the Rayleigh wave phases and the Z/H ratio based on the adjoint-state method. A numerical experiment was carried out to investigate the sensitivities of the Rayleigh wave phase and the Z/H ratio to the S-wave velocity structure. The experiments showed that they were both mostly sensitive to the shear-wave velocity, but their sensitivity patterns were different. The phase kernel was concentrated along the propagation path, with a much broader sensitivity range than the Z/H-ratio kernels. The Z/H ratio was more sensitive to the shallower structure compared to the Rayleigh wave phase-velocity dispersion at the same frequency. In addition, the Z/H ratio could deal with the lateral heterogeneity better, which was more helpful in constraining shallow structures, such as sedimentary basins or near-surface structures.
The joint-inversion method was applied in two synthetic experiments involving large-and small-scale inversions. By comparing the tomographic results of the adjoint tomography of the Rayleigh wave phases, the adjoint tomography of the Z/H ratio, and the joint inversion, we demonstrated that the joint inversion outperformed the separate inversions because it combined the complementary sensitivities of both, resulting in a more S-wave velocity unified model. Also, joint inversion could deal with the cycleskipping problem in FWI in near-surface cases. The proposed joint-inversion scheme offers a computationally efficient and inexpensive alternative to imaging fine-scale crustal structures or near-surface structures beneath a 2D seismic array.  Data Availability Statement: The data of this study is available from the authors upon request.

Recursion
(1) Set q = g k , i = k − 1, j = min(k, l); (2) Perform j times: λ i = s T i q y T i s i , q = q − λ i y i , i = i − 1; (3) Set γ = s T k−1 y k−1 y T k−1 y k−1 , r = γDq, i = k − j; (4) Perform j times: µ = y T i r y T i s i , r = r + s i (λ i − µ), i = i + 1; (5) End with result p k = −r; L-BFGS's relatively modest memory usage rests on the fact that if k is the current iteration number and l is the memory value, then vector pairs prior to {sk−l,yk−l} are no longer needed and can be removed from storage. The scaling factor γ, which accounts for differences between the true Hessian and the approximation thereto, is essential to the good performance of the algorithm.