Research on Soil Moisture Inversion Method for Canal Slope of the Middle Route Project of the South to North Water Transfer Based on GNSS-R and Deep Learning

: The soil moisture from the South-to-North Water Diversion Middle Route Project is assessed in this study. Complex and variable geological conditions complicate the prediction of soil moisture in the study area. To achieve this aim, we carried out research on soil moisture inversion methods for channel slopes in the study area using massive monitoring data from multiple GNSS observatories on channel slopes, incorporating GNSS-R techniques and deep learning algorithms. To address the issue of low accuracy in linear inversion when using a single satellite, this study proposes a multi-satellite and multi-frequency data fusion technique. Furthermore, three soil moisture inversion models, namely, the linear model, BP neural network model, and GA-BP neural network model, are established by incorporating deep learning techniques. In comparison with single-satellite data inversion, with the data fusion technique proposed in this study, the correlation is improved by 12.7%, the root mean square error is reduced by 0.217, the mean square error is decreased by 0.884, and the mean absolute error is decreased by 0.243 with the linear model. With the BP neural network model, the correlation is increased by 15.4%, the root mean square error is decreased by 0.395, the mean square error is decreased by 0.465, and the mean absolute error is reduced by 0.353. Moreover, with the GA-BP neural network model, the correlation is improved by 6.3%, the root mean square error is decreased by 1.207, the mean square error is decreased by 0.196, and the mean absolute error is reduced by 0.155. The results indicate that performing data fusion by using multiple satellites and multi-frequency bands is a feasible approach for improving the accuracy of soil moisture inversion. These research ﬁndings provide new technical means for the risk analysis of deformation disasters in the expansive soil channel slopes of the South-to-North Water Diversion Middle Route Project


Introduction
To address the issue of the uneven spatial distribution of water resources, as there are more water resources in the southern regions of China and less in the northern regions, the Chinese government has planned and implemented the South-to-North Water Diversion Middle Route Project. The South-to-North Water Diversion Middle Route Project is one of the world's major water conservancy projects. Its objective is to address the water scarcity issue in Northern China by diverting the abundant water resources from the southern regions of the country to the water-deficient areas in the north. The South-to-North Water Diversion Middle Route Project in China has a total length of 1432 km. The moisture. In order to validate the feasibility of the proposed method, this study employed the deep excavation of expansive soil channels in the South-to-North Water Diversion Project as the study area. By using deep learning techniques, models were established to correlate the phase, amplitude, and other relevant features with soil moisture. The results demonstrated that the soil moisture estimations obtained with the fused data from the proposed multi-satellite multi-frequency fusion technique outperformed those obtained from single-satellite single-frequency data inversion in terms of accuracy. This confirmed the effectiveness of the data fusion approach in improving the accuracy of soil moisture estimation when applied to the study area of the deep excavation of expansive soil channels in the South-to-North Water Diversion Project. In this paper, we use multi-satellite and multi-band data fusion techniques to process the GNSS-R observation data in order to obtain more comprehensive observation information, and use deep learning techniques to establish high-precision inversion models, which provide a new technical route for soil moisture inversion of deep excavated expansive soil channel slopes in the South-to-North Water Diversion Middle Route Project.

GNSS-R Fundamentals
The GNSS-R reflectometry technique involves a dual-base radar that allows one to obtain surface roughness features and geophysical parameters, i.e., by using GNSS to measure the delay (time delay or phase delay) between the direct signal and the signal reflected from the surface mirror; then, based on the geometric positional relationships between GNSS satellites, receivers, and mirror reflection points, the surface features can be inverted [32]. When using geodetic receivers, the environmental noise level remains constant, so the signal-to-noise ratio directly corresponds to the strength of the GNSS signal that is received. Figure 1 depicts the direct and reflected signals received by the GNSS antenna, where the direct signal exhibits much higher intensity than that of the reflected signal. As shown in Figure 1, the interference between the direct signal and the reflected signal (or multipath signal) results in an overlay effect, causing oscillations, particularly at low satellite elevations. In most environments, the amplitude of the reflected signal is much smaller than that of the direct signal. Therefore, the signal-to-noise ratio is controlled by the direct signal, and the desired multipath effects can be extracted by separating this oscillation pattern.
rithm based on least squares to combine data from multiple satellites in the same frequency band. Furthermore, an entropy-based method is applied to fuse data from different frequency bands. By utilizing the fused data for GNSS-R technology, this study mitigates signal gaps and improves the quality of observational data, thereby enhancing the estimation accuracy of soil moisture. The data fusion approach helps fill in missing information and reduces the variability in the observations, leading to more reliable and precise estimations of soil moisture. In order to validate the feasibility of the proposed method, this study employed the deep excavation of expansive soil channels in the Southto-North Water Diversion Project as the study area. By using deep learning techniques, models were established to correlate the phase, amplitude, and other relevant features with soil moisture. The results demonstrated that the soil moisture estimations obtained with the fused data from the proposed multi-satellite multi-frequency fusion technique outperformed those obtained from single-satellite single-frequency data inversion in terms of accuracy. This confirmed the effectiveness of the data fusion approach in improving the accuracy of soil moisture estimation when applied to the study area of the deep excavation of expansive soil channels in the South-to-North Water Diversion Project. In this paper, we use multi-satellite and multi-band data fusion techniques to process the GNSS-R observation data in order to obtain more comprehensive observation information, and use deep learning techniques to establish high-precision inversion models, which provide a new technical route for soil moisture inversion of deep excavated expansive soil channel slopes in the South-to-North Water Diversion Middle Route Project.

GNSS-R Fundamentals
The GNSS-R reflectometry technique involves a dual-base radar that allows one to obtain surface roughness features and geophysical parameters, i.e., by using GNSS to measure the delay (time delay or phase delay) between the direct signal and the signal reflected from the surface mirror; then, based on the geometric positional relationships between GNSS satellites, receivers, and mirror reflection points, the surface features can be inverted [32]. When using geodetic receivers, the environmental noise level remains constant, so the signal-to-noise ratio directly corresponds to the strength of the GNSS signal that is received. Figure 1 depicts the direct and reflected signals received by the GNSS antenna, where the direct signal exhibits much higher intensity than that of the reflected signal. As shown in Figure 1, the interference between the direct signal and the reflected signal (or multipath signal) results in an overlay effect, causing oscillations, particularly at low satellite elevations. In most environments, the amplitude of the reflected signal is much smaller than that of the direct signal. Therefore, the signal-to-noise ratio is controlled by the direct signal, and the desired multipath effects can be extracted by separating this oscillation pattern.  The relationship between the SNR multipath amplitude and SNR is established by identifying the effect of the gain pattern of the receiving antenna on the recorded signal strength, and at any moment, the SNR and satellite altitude θ can be expressed by Equation (1) [33]: where A d and A m represent the amplitudes of the direct signal and multipath signal, respectively, which indicate the contributions of the multipath signal to the SNR. ψ denotes the phase difference between the two signals. A c is expressed as the composite signal amplitude of the two signals, i.e., the signal-to-noise ratio (SNR). Figure 2 shows the trend of SNR variation and altitude angle variation in the L1 band of the G02 satellite on 1 January 2021 at station GP01.
The relationship between the SNR multipath amplitude and is established by identifying the effect of the gain pattern of the receiving antenna on the recorded signal strength, and at any moment, the and satellite altitude θ can be expressed by Equation (1) [33]: where represent the amplitudes of the direct signal and multipath signal, respectively, which indicate the contributions of the multipath signal to the SNR. denotes the phase difference between the two signals.
is expressed as the composite signal amplitude of the two signals, i.e., the signal-to-noise ratio (SNR). Figure 2 shows the trend of SNR variation and altitude angle variation in the L1 band of the G02 satellite on 1 January 2021 at station GP01. As seen in Equation (1) and Figure 2, the change in the amplitude of the direct signal or multipath signal with respect to the phase leads to a corresponding change in the SNR amplitude, and the effect of the antenna gain pattern indicates that A d ≫ A m . Thus, the overall amplitude of the SNR is mainly driven by the direct signal [34], while the multipath signal produces a small-amplitude, high-frequency oscillation in the direct signal and, thus, affects the SNR. This oscillation is more pronounced at lower satellite azimuth angles [35].
To determine the multipath amplitude of the SNR, it is necessary to separate the contribution of the multipath signal to the SNR from the amplitude of the direct signal . This can be achieved by fitting a low-order polynomial to the SNR time series to estimate the direct signal and subtracting it from the original SNR data. The residual sequence, ℎ , represents the multipath component and can be expressed with Equation (2): In the equation, represents the amplitude, denotes the carrier wavelength, represents the phase, and h is the distance from the phase center of the receiving antenna to the reflecting surface, which is also known as the effective antenna height.
Due to the inability to obtain a complete periodic segment of the SNR's residual sequence in Figure 2, it is generally challenging to address it with a fast Fourier transform. As seen in Equation (1) and Figure 2, the change in the amplitude of the direct signal or multipath signal with respect to the phase leads to a corresponding change in the SNR amplitude, and the effect of the antenna gain pattern indicates that A d A m . Thus, the overall amplitude of the SNR is mainly driven by the direct signal [34], while the multipath signal produces a small-amplitude, high-frequency oscillation in the direct signal and, thus, affects the SNR. This oscillation is more pronounced at lower satellite azimuth angles [35].
To determine the multipath amplitude of the SNR, it is necessary to separate the contribution of the multipath signal to the SNR from the amplitude of the direct signal dSNR direct . This can be achieved by fitting a low-order polynomial to the SNR time series to estimate the direct signal and subtracting it from the original SNR data. The residual sequence, dSNR multipath , represents the multipath component and can be expressed with Equation (2): In the equation, A m represents the amplitude, λ denotes the carrier wavelength, ψ represents the phase, and h is the distance from the phase center of the receiving antenna to the reflecting surface, which is also known as the effective antenna height.
Due to the inability to obtain a complete periodic segment of the SNR's residual sequence in Figure 2, it is generally challenging to address it with a fast Fourier transform. However, the Lomb-Scargle algorithm can effectively extract weak periodic signals from non-uniform sequences [36]. Therefore, Lomb-Scargle spectral analysis is applied to the SNR's residual sequence to obtain the highest frequency, leading to the determination of the most effective vertical reflection height, h. During the fitting process for obtaining ψ and A m , the effective antenna height, h, is often treated as a fixed constant. However, in practical measurements, variations in satellite trajectories and environmental conditions around the receiver station can cause changes in h. In long-term observation sequences, the median of the effective antenna height is closest to the vertical distance from the receiver antenna to the reflecting surface. Therefore, this study adopts the median value of the effective antenna height in the long-term observation sequence as a fixed value for h in the fitting of the feature parameters. The SNR reflection component exhibits periodic oscillations with the satellite elevation angle, approximating a cosine function. Therefore, a nonlinear least squares algorithm is employed to perform cosine fitting on the resampled data to obtain the reflection signal's amplitude parameter A m and phase parameter ψ. Finally, the soil moisture is inverted by using the amplitude parameter A m and phase parameter ψ.

Data Fusion Methods
This study proposes a novel data fusion method that utilizes multiple data processing algorithms for data preprocessing to enhance the inversion process. The method automatically selects satellites with a high correlation among the amplitude, phase, and soil moisture. An adaptive fusion algorithm based on least squares is then applied to merge data from multiple satellites in the same frequency band. Furthermore, an entropy-based fusion method is employed to merge amplitude and phase data from different frequency bands. By using this method, signal gaps are reduced, and the limitations of the limited observation information from a single satellite and varying data quality are addressed, resulting in improved data quality and enhanced accuracy in soil moisture inversion.
Before performing data fusion, to reduce the significant differences in amplitude caused by different satellites, the amplitude sequence is first arranged in ascending order. The average value of the top 20% of the sequence is selected as the baseline for normalization, as shown in Equation (3): For the phase, the initial phase of each satellite signal arriving at the ground is different. In order to clearly derive the phase variation caused by humidity changes for the comparison of the phase characteristics of different satellites, etc., the phase time series of each satellite track needs to be zeroed, i.e., the minimum value is set to zero. When zeroing according to Equation (4), first, the average of the lowest 20% of observations for each track (satellite) is calculated, and then this average is subtracted from the phase time series.
In the above equation, A 20% represents the average of the top 20% of the largest values in the amplitude sequence, and ψ 20% represents the average of the bottom 20% of the smallest values in the phase sequence. By applying the aforementioned processing, noise and errors caused by vegetation, terrain, and other factors can be removed from the time series, which is beneficial for soil moisture inversion. This step helps enhance the accuracy of soil moisture retrieval by mitigating the impacts of various sources of interference.
After normalizing the data, the same frequency band data from multiple satellites are fused by using an adaptive fusion algorithm based on the least squares method. The adaptive fusion algorithm based on the least squares method aims to minimize the total variance with respect to the true value by adjusting the weights of each datum, thus achieving more accurate fusion results. For the SNR observations provided by multiple satellites, the phase and amplitude values of each satellite are obtained after processing. These phase and amplitude data are then fused to obtain a more precise estimation. In this algorithm, the least squares method is used to solve for the optimal weighting coefficients that minimize the sum of squared errors between the fused result and the true value. Specifically, assuming that there are n satellites providing observations, and after processing, the phase data x 1 , x 2 , . . . , x n is obtained, along with their corresponding weight coefficients w 1 , w 2 , . . . , w n , the objective is to solve the optimal weight coefficients that bring the weighted result closest to the true value y. Then, the problem can be transformed into the following problem of minimizing an objective function: Equation (5) is derived so that the derivative is zero to solve for the optimal weight coefficients w 1 , w 2 , . . . , w n . The optimal weighting factor can be expressed as Equation (6): where X = [x 1 , x 2 , . . . , x n ], y is the true value, and w = [w 1 , w 2 , . . . , w n ] is the weight coefficient. Specifically, for the ith weighting factor, The optimal weighting coefficients are calculated according to Equation (7), and they are used to weigh the observations to obtain a more accurate estimate.
Finally, the entropy method is used to fuse data from different frequency bands acquired by GNSS receivers for fusion in order to obtain higher-quality observation data and improve the inversion accuracy. Data fusion with the entropy method involves multivariate data fusion based on the principle of information entropy. The core idea is that a greater information entropy indicates a greater uncertainty of the index and a smaller weight; a smaller information entropy indicates a lower uncertainty of the index and a larger weight.
The observed values of each indicator are quantified according to certain rules, and then the information entropy and weight of each indicator are calculated; the final fusion result is calculated through the information entropy principle and the weighted average principle. Specifically, the entropy value method is calculated as follows: (1) The normalized values of each column in the data are calculated and scaled to a range of [0, 1]; the formula is shown in Equation (8): In Equation (8), x represents the original data, x min denotes the minimum value of the data, and x max corresponds to the maximum value of the data.
(2) The weight of the ith sample under the jth indicator is calculated for that indicator; the formula is shown in Equation (9): (3) The entropy value is calculated for each column of data; the definition of entropy is used to compute the entropy value for each column of data. Entropy represents the uncertainty or information content of the data, and the formula for calculating the entropy value is shown in Equation (10): (4) The weights of each datum are calculated; the formula for calculating the weight w j for the jth indicator is shown in Equation (11): Remote Sens. 2023, 15, 4340 8 of 34 (5) The formula for performing data fusion is shown in Equation (12): The fusion algorithm automatically selects satellites with high correlation between amplitude-phase data and soil moisture, fuses data from multiple satellites in multiple segments, reduces signal loss, improves the quality of observation data, and obtains highprecision inversion models. The artificial neural network algorithm, as the name suggests, is an algorithmic network composed of artificial neurons that mimics the way neural transmission occurs in the human brain. It possesses strong capabilities for nonlinear mapping, self-organization, adaptation, memory, and prediction, making it well suited for solving complex logical operations and nonlinear problems [37]. Neural networks can be used for tasks such as classification, clustering, and prediction. They require a sufficient amount of historical data, and by training on this data, a network can learn the underlying knowledge within the data. The BP neural network, which is a widely used and a classical artificial neural network, possesses the aforementioned capabilities, along with characteristics such as strong plasticity, simplicity, and powerful learning abilities. Today, it is used in extensive applications across various fields [38].
The BP neural network is the fundamental form of a neural network, and its output is obtained through forward propagation, while the error is propagated back through the network by using a backpropagation method. A BP neural network emulates the activation and propagation processes of human neurons. Considering a three-layer neural network as an example, a BP neural network consists of three layers: the input layer, the hidden layer, and the output layer. The input layer receives data, and the output layer outputs data. Each neuron in the previous layer is connected to neurons in the next layer, collecting information from the previous layer and transmitting it to the next layer through activation. The structure of a BP neural network is depicted in Figure 3.  The fusion algorithm automatically selects satellites with high correlation between amplitude-phase data and soil moisture, fuses data from multiple satellites in multiple segments, reduces signal loss, improves the quality of observation data, and obtains highprecision inversion models.

BP Neural Network
The artificial neural network algorithm, as the name suggests, is an algorithmic network composed of artificial neurons that mimics the way neural transmission occurs in the human brain. It possesses strong capabilities for nonlinear mapping, self-organization, adaptation, memory, and prediction, making it well suited for solving complex logical operations and nonlinear problems [37]. Neural networks can be used for tasks such as classification, clustering, and prediction. They require a sufficient amount of historical data, and by training on this data, a network can learn the underlying knowledge within the data. The BP neural network, which is a widely used and a classical artificial neural network, possesses the aforementioned capabilities, along with characteristics such as strong plasticity, simplicity, and powerful learning abilities. Today, it is used in extensive applications across various fields [38].
The BP neural network is the fundamental form of a neural network, and its output is obtained through forward propagation, while the error is propagated back through the network by using a backpropagation method. A BP neural network emulates the activation and propagation processes of human neurons. Considering a three-layer neural network as an example, a BP neural network consists of three layers: the input layer, the hidden layer, and the output layer. The input layer receives data, and the output layer outputs data. Each neuron in the previous layer is connected to neurons in the next layer, collecting information from the previous layer and transmitting it to the next layer through activation. The structure of a BP neural network is depicted in Figure 3.  Here, i is the number of input layer neurons, j is the number of hidden layer neurons, k is the number of output layer neurons, w is the weight, and b is the "bias". Each circle is a neuron.
The BP algorithm includes the following two processes: (1) Forward propagation of information, where the feature signal is passed forward along the input layer and passed to the output layer nodes through the hidden layer's neurons. The output nodes do not directly output the signal but need to undergo a series of nonlinear changes. The obtained output signal is analyzed for error with the target output signal, and if the error is too large, it is transferred to the error backpropagation process. (2) In the backward propagation of error, the error obtained from the forward propagation of the signal is reversed from the output layer to the entire neural network; the error is divided equally among the nodes in each layer when it passes through the hidden layer and the input layer, and the network weights are updated so that the error decreases layer by layer along the reversed neural network and so on until the forward propagation of the signal reaches the desired output. The threshold and weights corresponding to the actual output are determined at this time, and the training of the neural network can be stopped. Specifically, assuming a three-layer BP neural network with M input layer nodes, N hidden layer nodes, and O output layer nodes and by using a sigmoid function as the activation function, the main steps of BP neural network training are as follows.
(1) The input variables net i are computed for the ith node of the hidden layer of the neural network; the equation is shown in Equation (13): where the meaning of the variable x i denotes the input parameter of the jth node of the input layer, j = 1, . . . , M; the meaning of the variable w ij denotes the neural network's weight parameter between the ith node of the hidden layer and the jth node of the input layer; the meaning of the variable θ i denotes the threshold parameter of the ith node of the hidden layer. (2) The output variable y i is computed for the ith node of the hidden layer of the neural network; the equation is shown in Equation (14): where g(x) is the excitation function of the hidden layer. A sigmoid function expressed by Equation (15) is used in this study: (3) The input variable net k is calculated for the kth node of the output layer of the neural network; the equation is shown in Equation (16): where the meaning of the variable w ki denotes the weight parameter between the kth node of the output layer and the ith node of the hidden layer, i = 1, . . . , q; the variable a k denotes the threshold parameter of the kth node of the output layer, k = 1, . . . , L; (4) The output variable o k is computed for the kth node of the output layer of the neural network; the equation is shown in Equation (17): The error E is calculated with Equation (18): where Y k is the desired output. (6) The weight is updated with Equation (19): where η is the learning rate. (7) The threshold is updated with Equation (20): (8) It is determined whether the iteration of the algorithm is finished, and if not, one returns to step (2).
A flowchart of the BP neural network is shown in Figure 4.
The error E is calculated with Equation (18): where is the desired output. (6) The weight is updated with Equation (19): where is the learning rate. (7) The threshold is updated with Equation (20): (8) It is determined whether the iteration of the algorithm is finished, and if not, one returns to step (2).
A flowchart of the BP neural network is shown in Figure 4.

GA-BP Neural Network
While BP neural networks have strong learning capabilities and robustness, they can suffer from some limitations. Because their search mechanism is that of gradient descent,

GA-BP Neural Network
While BP neural networks have strong learning capabilities and robustness, they can suffer from some limitations. Because their search mechanism is that of gradient descent, without prior knowledge, the initial values and weights of the network are random, making them prone to being trapped in local minima instead of finding the global minimum. Consequently, a network may fail to obtain the optimal solution, and its learning and memory can be unstable. If training samples are added, the pre-trained network needs to be retrained from the beginning without leveraging the previous knowledge of weights and thresholds. This increases the learning burden and reduces the learning efficiency [39]. To address these issues, the genetic algorithm (GA) can be used to optimize a BP neural network. By incorporating the GA, it is possible to quickly obtain the optimal neural network parameters, accelerate the learning process, and enhance the progress of network inversion.
Genetic algorithms are used to construct a fitness function based on the objective function of a problem, evaluate and perform genetic operations, select a population consisting of multiple solutions (each solution corresponds to a chromosome), and reproduce it over multiple generations to obtain the individual with the best fitness value as the optimal solution to the problem [40]. The specific steps are as follows: (1) Chromosome encoding: A real-number encoding strategy is used to implement the encoding of the chromosomes of the genetic algorithm. The S-order real matrix is set to [−1, 1], based on which the parameters, such as the connection weights between nodes in each layer of the BP neural network and node thresholds in the hidden layer and output layer, are encoded and solved for optimality [41]. Compared with binary coding, real-number coding does not require decoding at a later stage, the coding length is shorter, and the accuracy of the parameter search is high [42]. (2) Initializing the population: The initial population of W = (w 1 ; w 2 ; . . . ; w p ) is randomly generated, and the number of individuals in the population is set to P. Individuals w i , w 1 ; w 2 ; . . . ; w s are generated with a linear interpolation function for one chromosome of the algorithm. (3) Calculation of the population individuals' fitness values: The sum of the squared training errors is used for the calculation of the population individuals' fitness values. (4) Selection: By using the roulette wheel method, the selection probability can be calculated with Equation (21): where f i is the fitness function, and p is the population size. (5) Crossover: The crossover operation of gene w q at position j and the crossover operation of gene w s at position j are performed according to Equation (22): where b is a random number in the range of [0,1]. (6) Mutation: The jth gene of the ith individual undergoes population variation, and the operation of which can be described by Equations (23) and (24): where w max and w min are the maximum and minimum values of gene w ij , respectively; G max is the maximum number of evolutions; g is the current iteration number; r is a random number in the range of [0, 1]; r 2 is a random number. (7) Obtaining new populations: Steps (4) to (6) are repeated until the optimal solution is output.
A flowchart of GA-BP neural network training is shown in Figure 5. A flowchart of GA-BP neural network training is shown in Figure 5.

Experimental Area
The study area was the head canal section of the South-North Water Diversion Project in China, which was located at the head of the Tao Fork Canal in Danyang Village, Jiu Chong Town, Xi Chuan County, Nanyang City, Henan Province, terminating at the junction of Fangcheng County and Ye County, with the end pile number 185 + 545 and a total length of 185.545 km, of which the channel length was 176.718 km and the building length was 8.827 km. Here, there were 58.411 km of deep excavation canals, with a maximum depth of 47.5 m and an opening width of 373.22 m; there were 33.689 km of fill canals, with a maximum fill height of 17 m; there were 149.476 km of swelling soil canals, accounting for 84.5% of the total canal length and including 56.729 km of weak swelling soil canals, 84.37 km of medium swelling soil canals, and 8.377 km of strong swelling soil canals. The experimental area is shown in Figure 6.

Experimental Area
The study area was the head canal section of the South-North Water Diversion Project in China, which was located at the head of the Tao

Experimental Data
The data adopts multi-system data from three GNSS automated measurement stations (GP01, GP02, GP03) in the experimental area, and uses continuous observation satellite data from three stations for a total of 150 days from December 16, 2020 to May 14, 2021 to conduct a multi-system combination GNSS-R high-resolution soil moisture inversion study. A soil moisture sensor probe was buried at a depth of approximately 7.5cm about 1m next to the GNSS receiver device, and the soil moisture communication device and GNSS receiving device were integrated. To validate the performance of the method, in situ soil moisture data measured near the station sites were used as reference data for comparison. The modeling data consisted of 125 days of satellite data and the corresponding soil moisture priors collected from December 16, 2020 to April 19, 2021. The data for verifying the model accuracy, on the other hand, comprised satellite data and soil moisture priors collected from April 20, 2021 to May 14, 2021. All GNSS receiver data can be used, and the Beiyun receiver used in this study. The GNSS station map and soil moisture meter map is shown in Figure 7, the GNSS monitoring receiver is shown in Figure 8, and the parameter configuration is shown in Table 1.

Experimental Data
The data adopts multi-system data from three GNSS automated measurement stations (GP01, GP02, GP03) in the experimental area, and uses continuous observation satellite data from three stations for a total of 150 days from 16 December 2020 to 14 May 2021 to conduct a multi-system combination GNSS-R high-resolution soil moisture inversion study. A soil moisture sensor probe was buried at a depth of approximately 7.5 cm about 1 m next to the GNSS receiver device, and the soil moisture communication device and GNSS receiving device were integrated. To validate the performance of the method, in situ soil moisture data measured near the station sites were used as reference data for comparison. The modeling data consisted of 125 days of satellite data and the corresponding soil moisture priors collected from 16 December 2020 to 19 April 2021. The data for verifying the model accuracy, on the other hand, comprised satellite data and soil moisture priors collected from 20 April 2021 to 14 May 2021. All GNSS receiver data can be used, and the Beiyun receiver used in this study. The GNSS station map and soil moisture meter map is shown in Figure 7, the GNSS monitoring receiver is shown in Figure 8, and the parameter configuration is shown in Table 1.         Figure 9 shows a flowchart of the soil moisture inversion technique used in this study. As can be seen in the figure, the technical route of this study can be divided into three steps:

Experimental Technical Program
(1) preprocessing of the GNSS-R soil moisture inversion data to extract the characteristic parameters of the amplitude and phase of the reflected signal from the observation data acquired with the original GNSS receiver; (2) the use of the multi-satellite multi-frequency data fusion technique to fuse the acquired characteristic parameter data to obtain more accurate observation data and improve the inversion accuracy; and (3) a linear model, BP neural network model, and GA-BP neural network model are developed to invert the soil moisture and compare the inversion accuracy of single-satellite data with that of the fused data.  Figure 9 shows a flowchart of the soil moisture inversion technique used in this study. As can be seen in the figure, the technical route of this study can be divided into three steps: (1) preprocessing of the GNSS-R soil moisture inversion data to extract the characteristic parameters of the amplitude and phase of the reflected signal from the observation data acquired with the original GNSS receiver; (2) the use of the multi-satellite multi-frequency data fusion technique to fuse the acquired characteristic parameter data to obtain more accurate observation data and improve the inversion accuracy; and (3) a linear model, BP neural network model, and GA-BP neural network model are developed to invert the soil moisture and compare the inversion accuracy of single-satellite data with that of the fused data.

Extraction of the Reflected Signal's Feature Parameters
The GNSS receiver's observation data are collected in the carrier phase and pseudorange format, while GNSS-R soil moisture retrieval requires the utilization of the satellite elevation angle and signal-to-noise ratio (SNR). To obtain these parameters, it is necessary to calculate them from the GNSS observation and navigation files with the relevant equations.
In this study, the signal-to-noise ratio data were extracted from the navigation file with the satellite altitude angle data by using teqc software. At low satellite altitude angles, the signal-to-noise ratio had a serious multipath effect and showed periodic oscillations. With the gradual increase in the satellite altitude angle, the antenna had a larger gain, and the signal-to-noise ratio tended to be stable. In order to extract the reflected signal data from the GNSS SNR data, a low-order polynomial fit to the SNR data was used to separate the contribution of the multipath effect to the SNR from the amplitude of the direct signal and to remove the direct component. In addition, in order to standardize the SNR data in dB/Hz units, which are normally converted into values in linear units (Volts/Volts), the linearization formula shown in Equation (25) was used [43]: Figure 10a shows a plot of the signal-to-noise ratio versus the satellite altitude angle. The signal-to-noise ratio data are shown in blue, and the direct radiation signal data from the low-order polynomial fit are shown in red. Figure 10b shows the linearized reflected signal after removing the direct signal.

Extraction of the Reflected Signal's Feature Parameters
The GNSS receiver's observation data are collected in the carrier phase and pseudorange format, while GNSS-R soil moisture retrieval requires the utilization of the satellite elevation angle and signal-to-noise ratio (SNR). To obtain these parameters, it is necessary to calculate them from the GNSS observation and navigation files with the relevant equations.
In this study, the signal-to-noise ratio data were extracted from the navigation file with the satellite altitude angle data by using teqc software. At low satellite altitude angles, the signal-to-noise ratio had a serious multipath effect and showed periodic oscillations. With the gradual increase in the satellite altitude angle, the antenna had a larger gain, and the signal-to-noise ratio tended to be stable. In order to extract the reflected signal data from the GNSS SNR data, a low-order polynomial fit to the SNR data was used to separate the contribution of the multipath effect to the SNR from the amplitude of the direct signal and to remove the direct component. In addition, in order to standardize the SNR data in dB/Hz units, which are normally converted into values in linear units (Volts/Volts), the linearization formula shown in Equation (25) was used [43]: Figure 10a shows a plot of the signal-to-noise ratio versus the satellite altitude angle. The signal-to-noise ratio data are shown in blue, and the direct radiation signal data from the low-order polynomial fit are shown in red. Figure 10b shows the linearized reflected signal after removing the direct signal. The research area is a slope with a gentle slope, with a maximum slope of around 15 degrees, which affects signal reception. In order to eliminate the influence of surface environmental factors, the Savitzky-Golay algorithm was introduced to preprocess the multipath components [44] in order to remove the effects of noise and coarse differences. Figure 11 shows the reflected signal after processing with the SG algorithm. Lomb-Scargle spectrum analysis (LSP) was used to estimate the temporal variations in the principal frequency of the signal-to-noise ratio interferogram to obtain the principal frequency; this was obtained as the effective reflector height h according to Equation (26). Figure 12 shows the LSP spectrum analysis of station GP02 and satellite G01 on 1 January 2021.
where is the main frequency, is the carrier wavelength, the wavelength of the L1 band is 24.42 cm, and that of the L2 band is 19.03 cm. The research area is a slope with a gentle slope, with a maximum slope of around 15 degrees, which affects signal reception. In order to eliminate the influence of surface environmental factors, the Savitzky-Golay algorithm was introduced to preprocess the multipath components [44] in order to remove the effects of noise and coarse differences. Figure 11 shows the reflected signal after processing with the SG algorithm. The research area is a slope with a gentle slope, with a maximum slope of around 15 degrees, which affects signal reception. In order to eliminate the influence of surface environmental factors, the Savitzky-Golay algorithm was introduced to preprocess the multipath components [44] in order to remove the effects of noise and coarse differences. Figure 11 shows the reflected signal after processing with the SG algorithm. Lomb-Scargle spectrum analysis (LSP) was used to estimate the temporal variations in the principal frequency of the signal-to-noise ratio interferogram to obtain the principal frequency; this was obtained as the effective reflector height h according to Equation (26). Figure 12 shows the LSP spectrum analysis of station GP02 and satellite G01 on 1 January 2021.
where is the main frequency, is the carrier wavelength, the wavelength of the L1 band is 24.42 cm, and that of the L2 band is 19.03 cm. Lomb-Scargle spectrum analysis (LSP) was used to estimate the temporal variations in the principal frequency of the signal-to-noise ratio interferogram to obtain the principal frequency; this was obtained as the effective reflector height h according to Equation (26). Figure 12 shows the LSP spectrum analysis of station GP02 and satellite G01 on 1 January 2021.
where f is the main frequency, λ is the carrier wavelength, the wavelength of the L1 band is 24.42 cm, and that of the L2 band is 19.03 cm. After obtaining the h values, the characteristic parameters of the amplitude and delayed phase were obtained by fitting the nonlinear least squares in Equation (2) [45]. Table  2 shows some of the amplitude and phase data obtained by fitting station GP02.

Data Fusion
Before data fusion, in order to reduce the excessive differentiation of the feature elements caused by different satellites, the inverse-derived data were first normalized by using Equations (3) and (4) for the amplitude and phase data, respectively. Table 3 shows the normalized data of some characteristic elements of station GP02. Figure 13 shows the normalized feature elements of some satellites of station GP02.  After obtaining the h values, the characteristic parameters of the amplitude and delayed phase were obtained by fitting the nonlinear least squares in Equation (2) [45]. Table 2 shows some of the amplitude and phase data obtained by fitting station GP02.

Data Fusion
Before data fusion, in order to reduce the excessive differentiation of the feature elements caused by different satellites, the inverse-derived data were first normalized by using Equations (3) and (4) for the amplitude and phase data, respectively. Table 3 shows the normalized data of some characteristic elements of station GP02. Figure 13 shows the normalized feature elements of some satellites of station GP02.  The normalized characteristic elements were subjected to a correlation analysis with soil moisture. The results of the correlation analysis are shown in Table 4.  The normalized characteristic elements were subjected to a correlation analysis with soil moisture. The results of the correlation analysis are shown in Table 4. According to the correlation results in Table 4, it can be seen that for points GP01 and GP03, the phase and amplitude of the L1 and L2 bands of satellite G06 had a strong correlation with soil moisture; for point GP02, the phase and amplitude of satellite G19 had a strong correlation with soil moisture in the L1 band, while the phase and amplitude of satellite G22 had a strong correlation with soil moisture in the L2 band. Therefore, in this study, satellite G06 was selected for subsequent single-satellite inversion experiments for points GP01 and GP03, satellite G19 was selected for subsequent single-satellite inversion experiments for point GP02 in the L1 band, and satellite G22 was selected for subsequent single-satellite inversion experiments for point GP02 in the L2 band. At the same time, the four satellites with the highest correlations were automatically selected by using the fusion algorithm for the joint multi-satellite inversion experiments with the corresponding satellites at the corresponding points. Table 5 shows the fusion of the single-satellite data with multi-satellite multi-band data on 16 December 2020 for the data on the characteristic parameters.

Soil Moisture Inversion Results
In order to verify the effectiveness of the fusion algorithm, and considering that the deep learning algorithm had self-learning and adaptive abilities for solving highdimensional nonlinear problems, models of three methods were established for a comparative analysis. The three models included the conventional linear regression model, the BP neural network model, and the GA-BP neural network model. We trained using the data from the first 125 days (for linear inversion experiments, we used 125 days of data to establish the model; for neural network experiments, we divide the data into training and validation sets in an 8:2 ratio to establish the model) and then used the 25 day data that did not participate in the training as the test set to verify the accuracy of the trained model, in order to verify its generalization ability and real performance. The linear model was modeled separately for the amplitude and phase with soil moisture, and it has been verified by many studies that the inversion effect of unifying the amplitude and phase as x-value inputs when using a deep learning algorithm is more accurate than when using single-feature element inversion [45]. Therefore, in this study, the amplitude, phase, and frequency were used as input x-values for the deep learning network, and soil moisture values were used as output y-values for training. Figure 14 shows the linear model that was built at site GP01. Figure 15 shows the linear model that was built at site GP02. Figure 16 shows the linear model that was built at site GP03. Table 6       As shown in Figure 14-16 and Table 6, the linear model that was developed did reflect the relationship between the characteristic elements and soil moisture.   As shown in Figures 14-16 and Table 6, the linear model that was developed did reflect the relationship between the characteristic elements and soil moisture. The highest singlesatellite inversion correlation for station GP01 was the phase inversion of the L1 band, with a correlation of 58.  Table 6 show that the fused data had an improved correlation and model fit, and the root mean square error and mean absolute error decreased, which verified the effectiveness of the multi-satellite multi-band fusion technique with a linear model.
In this experiment, the training process of the BP neural network was set to 6000 epochs. One epoch is the process of importing the entire dataset for complete training, with 16 hidden layers and a learning rate of 0.02. In the GA algorithm of GABP neural network, the population size is set to 500, the mutation rate is 0.09, the crossover rate is 0.1, and the number of iterations is 500; in the BP neural network, the number of epochs is set to 1000, and the learning rate is 0.01. Figure 17 shows the loss value curve of the training set and validation set during the training process of BP neural network, and Figure 18 shows the loss value curve of the training set and validation set during the training process of GABP neural network. As shown in the Figure, there is no overfitting during the training process.  Table 6 show that the fused data had an improved correlation and model fit, and the root mean square error and mean absolute error decreased, which verified the effectiveness of the multi-satellite multi-band fusion technique with a linear model. In this experiment, the training process of the BP neural network was set to 6000 epochs. One epoch is the process of importing the entire dataset for complete training, with 16 hidden layers and a learning rate of 0.02. In the GA algorithm of GABP neural network, the population size is set to 500, the mutation rate is 0.09, the crossover rate is 0.1, and the number of iterations is 500; in the BP neural network, the number of epochs is set to 1000, and the learning rate is 0.01. Figure 17 shows the loss value curve of the training set and validation set during the training process of BP neural network, and Figure 18 shows the loss value curve of the training set and validation set during the training process of GABP neural network. As shown in the Figure, there is no overfitting during the training process.  After the training, we used untrained data from 20 April 2021 to 14 May 2021 as test data to verify the accuracy of the model. Figure 19 shows the results of soil moisture values in the linear model test set. Figure 20 shows the soil moisture results of the BP neural network test set. Figure 21 shows the soil moisture results of the GA-BP neural network model test set. After the training, we used untrained data from April 20, 2021 to May 14, 2021 as test data to verify the accuracy of the model. Figure 19 shows the results of soil moisture values in the linear model test set. Figure 20 shows the soil moisture results of the BP neural network test set. Figure 21 shows the soil moisture results of the GA-BP neural network model test set.

Discussion
In this study, five metrics-the root mean square error (RMSE), model goodness of fit (R2), correlation (r), mean absolute error (MAE): and mean squared error (MSE)-were used to evaluate the models' accuracy.
Root mean square error (RMSE): The square root of the ratio of the square of the deviation of the observed value from the true value to the number of observations N. This reflects the extent to which the measured data deviate from the true value. The formula is shown in Equation (27): where is the predicted value of soil moisture, and is the true value of soil moisture. Model goodness of fit (R 2 ): The percentage of the variance in the dependent variable y that can be explained by the independent variable x. The formula is shown in Equation

Discussion
In this study, five metrics-the root mean square error (RMSE), model goodness of fit (R2), correlation (r), mean absolute error (MAE): and mean squared error (MSE)-were used to evaluate the models' accuracy.
Root mean square error (RMSE): The square root of the ratio of the square of the deviation of the observed value from the true value to the number of observations N. This reflects the extent to which the measured data deviate from the true value. The formula is shown in Equation (27): where y i is the predicted value of soil moisture, and Y i is the true value of soil moisture.
Model goodness of fit (R 2 ): The percentage of the variance in the dependent variable y that can be explained by the independent variable x. The formula is shown in Equation (28)-(31): Correlation (r): This measures the correlation between the predicted and actual values. The formula is shown in Equation (32): Mean absolute error (MAE): This is the average of the absolute error between the predicted value and the actual value. The formula is shown in Equation (33): Mean Squared Error (MSE): This is a commonly used measure of the difference between the predicted values of a model and the actual observed values to assess how well the model fits on the given data. The formula is shown in Equation (34): Table 7 shows the results for the root mean square error (RMSE), model goodness of fit (R 2 ), correlation (r), mean absolute error (MAE), and mean squared error (MSE) between the predicted and true values of soil moisture for the linear model.
As shown in Figure 19 and Table 7, the accuracy of the model built by using the data from the L2 band was higher than that when using the L1 band in the model of amplitude and soil moisture at station GP01, with the correlation between the predicted and true values of L2 band being 76.0%, the root mean square error being 2.144, the goodness of fit being 0.578, the mean square error being 4.597, and the mean absolute error being 1.608. The correlation between the predicted and true values of the fused data was 87.7%, the root mean square error was 1.927, the goodness of fit was 0.769, the mean square error being 3.713, and the mean absolute error was 1.365. It was calculated that the correlation between the predicted and true values of the fused data improved by 12.7%, the root mean square error decreased by 0.217, the goodness of fit improved by 0.191, the mean square error decreased by 0.884, and the mean absolute error decreased by 0.243 in comparison with the single-satellite data on the L2 band. In the model, the accuracy of the model built with data from the L2 band was higher than that built with data from the L1 band. The correlation between the predicted and true values of the L2 band was 88.9%, the root mean square error was 1.921, the goodness of fit was 0.790, the mean square error being 3.690, the mean absolute error was 1.560, and the correlation between the predicted and true values of fused data was 98.4%; the root mean square error was 1.028, the goodness of fit was 0.968, the mean square error being 1.057, and the mean absolute error was 0.790. It was calculated that the correlation between the predicted and true values of the fused data was improved by 9.5%, the root mean square error was reduced by 0.893, the goodness of fit was improved by 0.178, the mean square error decreased by 2.633, and the mean absolute error was reduced by 0.770 in comparison with the single-satellite data in the L2 band. Similarly, it could be calculated that the correlation between the predicted and true values of the fused data in the amplitude and soil moisture model for station GP02 was improved by 7.6%, the root mean square error was reduced by 0.476, the fit was improved by 0.251, the mean square error decreased by 0.719, and the mean absolute error was reduced by 0.449 in comparison with the single-satellite data for the L2 band. In the phase and soil moisture model for station GP02, the correlation between the predicted and true values of the fused data improved by 7.6%, the root mean square error decreased by 0.472, the fit improved by 0.127, the mean square error decreased by 0.969, and the mean absolute error decreased by 0.458 in comparison with the single-satellite data for the L2 band. In the amplitude and soil moisture model for station GP03, the correlation between the predicted and true values of the fused data improved by 9.3%, the root mean square error decreased by 0.636, the fit improved by 0.151, the mean square error decreased by 0.940, and the mean absolute error decreased by 0.463 in comparison with the single-satellite data for the L1 frequency band. In the phase and soil moisture model for station GP03, the correlation between the predicted and true values of the fused data improved by 6.5%, the root mean square error decreased by 0.18, the fit improved by 0.087, the mean square error decreased by 0.240, and the mean absolute error decreased by 0.162 in comparison with the single-satellite data for the L1 band.  Table 8 shows the results for the root mean square error (RMSE), model goodness of fit (R 2 ), correlation (r), mean absolute error (MAE), and mean squared error (MSE) between the predicted and true values of soil moisture from the BP neural network model.
As shown in Figure 20 and Table 8, the accuracy of the model built by using the data from the L2 band in the BP neural network model of station GP01 was higher than the accuracy of that built with the data from the L1 band. The correlation between the predicted and true values for the L2 band was 84.2%, the root mean square error was 2.114, the goodness of fit was 0.447, the mean square error was 4.469, and the mean absolute error was 1.615. The correlation between the predicted and true values for the fused data was 96.4%, the root mean square error was 0.907, the goodness of fit was 0.898, the mean square error was 0.823, and the mean absolute error was 0.602. It was calculated that the correlation between the predicted and true values of the fused data was improved by 12.2%, the root mean square error was reduced by 1.207, the goodness of fit was improved by 0.122, the mean square error was decreased by 3.646, and the mean absolute error was reduced by 1.013 in comparison with the single-satellite data in the L2 band. In the BP neural network model for station GP02, the accuracy of the model built with the data from the L2 band was higher than that of the model built with data from the L1 band. The correlation between the predicted and true values for the L2 band was 81.1%, the root mean square error was 0.787, the goodness of fit was 0.594, the mean square error was 0.619, and the mean absolute error was 0.651. The correlation between the predicted and true values for the fused data was 96.5%, the root mean square error was 0.392, the goodness of fit was 0.899, the mean square error was 0.154, and the mean absolute error was 0.298. It was calculated that the correlation between the predicted and true values of the fused data was improved by 15.4%, the root mean square error was reduced by 0.395, the goodness of fit was improved by 0.305, the mean square error was decreased by 0.465, and the mean absolute error was reduced by 0.353 in comparison with the single-satellite data for the L2 band. In the BP neural network model for station GP03, the accuracy of the model built with the data from the L1 band was higher than that of the model built with data from the L2 band. The correlation between the predicted and true values of the L1 band was 70.1%, the root mean square error was 0.826, the goodness of fit was 0.671, the mean square error was 0.682, and the mean absolute error was 0.635. The correlation between the predicted and true values for the fused data was 75.9%, the root mean square error was 0.599, the goodness of fit was 0.720, the mean square error was 0.359, and the mean absolute error was 0.432. Compared with the single-satellite data for the L1 band, the correlation between the predicted and true values of the fused data was improved by 5.8%, the root mean square error was reduced by 0.227, the goodness of fit was improved by 0.058, the mean square error was decreased by 0.323, and the mean absolute error was reduced by 0.203.  Table 9 shows the results for the root mean square error (RMSE), model goodness of fit (R 2 ), correlation (r), mean absolute error (MAE), and mean squared error (MSE) between the predicted and true values of soil moisture for the GA-BP neural network model.
As shown in Figure 21 and Table 9, the accuracy of the model built with the data from the L2 band in the GA-BP neural network model for station GP01 was higher than that of the model built with data from the L1 band. The correlation between the predicted and true values for the L2 band was 89.1%, the root mean square error was 1.078, the goodness of fit was 0.856, the mean square error was 1.162, and the mean absolute error was 0.688, and the correlation between the predicted and true values of the fused data was 95.4%. The correlation between the predicted and true values of the fused data was 95.4%, the root mean square error was 0.983, the goodness of fit was 0.880, the mean square error was 0.966, and the mean absolute error was 0.533. It was calculated that the correlation between the predicted and true values of the fused data improved by 6.3%, the root mean square error decreased by 1.207, the goodness of fit improved by 0.024, the mean square error was decreased by 0.196, and the mean absolute error decreased by 0.155 in comparison with the single-satellite data for the L2 band. In the GA-BP neural network model for station GP02, the accuracy of the model built with the data from the L2 band was higher than that of the model built with the data from the L1 band. The correlation between the predicted and true values for the L2 band was 88.5%, the root mean square error was 0.199, the goodness of fit was 0.874, the mean square error was 0.040, and the mean absolute error was 0.151. The correlation between the predicted and true values of the fused data was 94.2%, the root mean square error was 0.154, the goodness of fit was 0.885, the mean square error was 0.024, and the mean absolute error was 0.096. It was calculated that the correlation between the predicted and true values of the fused data was improved by 5.3%, the root mean square error was reduced by 0.045, the goodness of fit was improved by 0.011, the mean square error was decreased by 0.016, and the mean absolute error was reduced by 0.055 in comparison with the single-satellite data for the L2 band. In the GA-BP neural network model for station GP03, the accuracy of the model built with the data from the L1 band was higher than that of the model built with the data from the L2 band. The correlation between the predicted and true values for the L1 band was 82.2%, the root mean square error was 0.409, the goodness of fit was 0.590, the mean square error was 0.167, and the mean absolute error was 0.308. The correlation between the predicted and true values of the fused data was 84.8%, the root mean square error was 0.342, the goodness of fit was 0.713, the mean square error was 0.117, and the mean absolute error was 0.250. It was calculated that the correlation between the predicted and true values of the fused data was improved by 2.6%, the root mean square error was reduced by 0.067, the goodness of fit was improved by 0.123, the mean square error was decreased by 0.050, and the mean absolute error was reduced by 0.058 in comparison with the single-satellite data for the L1 band. We analyzed the soil moisture inversion error, calculated the absolute soil moisture inversion error (the difference between the inversion error and the true value), and analyzed the interval distribution pattern. Table 10 shows the maximum, median, and minimum true values of soil moisture, the predicted values of soil moisture for each inversion model, and the absolute errors between the predicted values and the true values. Figure 22 shows a statistical histogram of the frequency of absolute errors in soil moisture accounted for by the three model inversions at station GP02. As shown in the figure, the absolute error distribution of the linear model is between 0.5 and 1.5, the BP neural network model has an absolute error distribution of −0.5 to 0.5, the GABP neural network model has an absolute error distribution of −0.25 to 0.25, and overall, the three models conform to a normal distribution.   Figure 22 shows a statistical histogram of the frequency of absolute errors in soil moisture accounted for by the three model inversions at station GP02. As shown in the figure, the absolute error distribution of the linear model is between 0.5 and 1.5, the BP neural network model has an absolute error distribution of −0.5 to 0.5, the GABP neural network model has an absolute error distribution of −0.25 to 0.25, and overall, the three models conform to a normal distribution.  From the above experimental results, it can be seen that the accuracy of the linear model, BP neural network model, and GA-BP neural network model built by fusing multisatellite multi-band data by using the technique proposed in this study was higher than that of the model built from single-satellite data, which fully proved the feasibility and effectiveness of the method proposed in the study. The values predicted with the GA-BP neural network model were closer to the true values received by the sensors. Figure 23 shows the plot of soil moisture values and error in the true values for the inversion of the three models, where yellow is the predicted value and error in the true values for the GA-BP neural network model, green is the predicted value and error in the true values for the BP neural network model, red is the predicted value and error in the true values for the linear model built with the amplitude of the characteristic elements, and blue is the predicted value and error in the true values for the linear model built with the phase of the characteristic elements.
From the above experimental results, it can be seen that the accuracy of the linear model, BP neural network model, and GA-BP neural network model built by fusing multisatellite multi-band data by using the technique proposed in this study was higher than that of the model built from single-satellite data, which fully proved the feasibility and effectiveness of the method proposed in the study. The values predicted with the GA-BP neural network model were closer to the true values received by the sensors. Figure 23 shows the plot of soil moisture values and error in the true values for the inversion of the three models, where yellow is the predicted value and error in the true values for the GA-BP neural network model, green is the predicted value and error in the true values for the BP neural network model, red is the predicted value and error in the true values for the linear model built with the amplitude of the characteristic elements, and blue is the predicted value and error in the true values for the linear model built with the phase of the characteristic elements. In this study, based on GNSS-R technology and deep learning method, we carried out a study on soil moisture inversion for the channel slope of the deep excavated expansive soil canal section of the South-to-North Water Diversion Middle Line Project in China, and provided a systematic solution for soil moisture inversion in the study area, which provided scientific data support for analyzing the deformation mechanism of the channel slope in the study area, and the main conclusions of the study are as follows:

Conclusions
In this study, based on GNSS-R technology and deep learning method, we carried out a study on soil moisture inversion for the channel slope of the deep excavated expansive soil canal section of the South-to-North Water Diversion Middle Line Project in China, and provided a systematic solution for soil moisture inversion in the study area, which provided scientific data support for analyzing the deformation mechanism of the channel slope in the study area, and the main conclusions of the study are as follows: (1) In order to improve the accuracy of soil moisture inversion by GNSS-R technology, a multi-satellite and multi-band data fusion technique is proposed based on the least squares adaptive fusion algorithm and entropy value method, which provides a solution to the problems of limited observation information and low inversion accuracy of soil moisture in single-satellite inversion. Combining the fused data with linear models, BP neural network models, and GA-BP neural network models for soil moisture inversion experiments, it can be seen that compared with single satellite retrieval, the root mean square deviation of the three models decreased by 0.893, 1.207, and 1.207, respectively, which indicates that it is feasible and reliable to use the multi-satellite multi-band data fusion technology proposed in this study to retrieve soil moisture in the study area. (2) The inversion analysis of soil moisture near the three GNSS stations was carried out using linear model, BP neural network model, and GA-BP neural network model, respectively, and the results showed that the results of inversion of soil moisture using GA-BP neural network were better than the other two models, and the correlation degree of the three sites is as low as 84.8% and as high as 95.4%, which indicates that the comprehensive use of multi-satellite multi-band data fusion technology and GA-BP neural network model inversion of soil moisture can achieve good results. It provides a new technical path for the soil moisture inversion of deep excavated expansive soil channel slopes in the South-to-North Water Diversion Project. (3) The GNSS-R soil moisture inversion process is affected by terrain conditions and soil roughness. The application scenario of this paper is the slope of the channel of the South-to-North Water Diversion Middle Line Project, and the study area has a low vegetation coverage, so the influence of vegetation on soil moisture is not considered. In the future, we will further optimize the soil moisture inversion model based on GNSS-R and deep learning, focusing on the influence of vegetation on the inversion results, to achieve a more realistic soil moisture inversion and to expand the application scenarios of the research results.