Theoretical Development of Plant Root Diameter Estimation Based on GprMax Data and Neural Network Modelling

Abstract: In situ, non-destructive, quantitative observation of plant roots is difficult. Traditional detection methods are time-consuming and labor-intensive, and they also destroy the root environment. Ground-penetrating radar (GPR), as a non-destructive detection method, has great potential for estimating root parameters. In this paper, we use GprMax software to perform forward modeling of plant roots under different soil dielectric constants, and analyze roots with different dielectric constants and different diameters as detected by a 1.5 GHz antenna. First, root systems of increasing diameter were scanned under different values of root and soil dielectric constant. Second, from the scanning results, two time points $T_1$ and $T_2$, at which the radar wave enters and penetrates the root system, were defined, and the correlation between root diameter $D$ and the time interval $\Delta T$ between $T_1$ and $T_2$ was analyzed. Finally, a least squares regression model and a back propagation (BP) neural network model for root diameter estimation were established, and the estimation performance of the two models was compared and evaluated. The results show that root diameter (12–48 mm) is highly correlated with the time interval. When the dielectric constants of the root and soil are known, both models predict accurately, but the neural network model is more stable: the residual between the predicted and actual values is mainly concentrated in the range [−1.5 mm, 1.5 mm], and the average prediction error percentage is 3.62%. When the dielectric constants of the root and soil are unknown, the accuracy of both models decreases, but the neural network model remains more stable than the least squares model: its residual is mainly concentrated in the range [−5.3 mm, 5.0 mm], and the average prediction error percentage is 10.19%. This study uses GprMax to simulate root system detection and reveals the theoretical potential of GPR technology for non-destructive estimation of root diameter. It also shows that, in field exploration, sampling and measuring the dielectric constants of the root and soil at the experimental site beforehand would effectively improve the accuracy of the root diameter prediction. This research is based on simulation experiments, so further simulation followed by laboratory and field testing with non-uniform roots and soil is warranted.


Introduction
Plant roots are important media to ensure the carbon-nitrogen-water cycle of terrestrial ecosystems [1]. The root system obtains water and nutrients from the soil to support the growth and energy conversion of above-ground vegetation [2], and its fixation of soil organic matter effectively reduces the occurrence and harm of soil erosion [3]. The research of plant root systems helps explain the basic principles of plant evolution and spread, effectively protecting species diversity and biogeochemical cycles [4], and is also of great reference value for the formulation of natural environmental protection measures [5]. Traditional root research methods, such as excavation methods, monoliths, etc. [6,7], are limited in their application and development due to their heavy workload, destructive sampling approach and unachievable repeated measurements [8]. With the innovation and development of technology, some in situ non-destructive observation methods have become the hotspots of plant root parameter estimation and distribution modeling research [9], such as minirhizotrons [10], X-ray Computed Tomography (CT) [11], magnetic resonance imaging (MRI) [12], ground-penetrating radar (GPR) method [13][14][15][16], etc. Among them, GPR is a non-intrusive geophysical technology that uses high-frequency electromagnetic waves to locate underground targets [17]. It has fast detection speed, simple and flexible operation, and is widely used in the field of non-destructive detection [18]. The working principle of GPR is to transmit high-frequency electromagnetic waves into the ground through the transmitting antenna, and then use the receiving antenna to receive the electromagnetic waves reflected by the underground dielectric discontinuous medium, so as to obtain the change of two-way travel time and energy intensity of electromagnetic waves from transmission to reception [19]. Liang et al., Zhang et al., and Wu et al. 
have given a more detailed introduction to the working methods of GPR [20][21][22].
At present, the research on root detection by GPR mainly focuses on root depth, diameter, growth direction, water content, biomass, distribution range and three-dimensional structure reconstruction [23][24][25]. Root diameter detection is a basic and important research direction. There is a strong positive correlation between root diameter and root biomass [26], so root diameter can be used to estimate root biomass, but the premise is that the detection result of root diameter should be as accurate as possible. Similarly, the correct estimation of root diameter will bring convenience to the three-dimensional structure reconstruction of underground roots, which requires high-resolution GPR, that is, to ensure that GPR can distinguish different reflection interfaces in vertical section [27].
In field exploration, increasing the frequency of electromagnetic waves can obtain higher-resolution root diameter scanning results, but it will also enhance the effects of various disturbance factors. At present, some researchers have proposed that the factors affecting GPR detection include the dielectric properties of root and soil, water content, and clay content of soil [28]. Among all the influencing factors, the comparison of dielectric properties between root and surrounding soil is a decisive factor affecting GPR performance [29,30]. The stronger the contrast of dielectric properties between root and soil, the greater the reflection coefficient of electromagnetic waves, which makes root detection easier [31]. In addition, surface vegetation, puddles, and soil heterogeneity will all affect the performance of GPR [28]. In order to reduce the effect of disturbance factors in the field detection environment, controlled experiments are carried out first in most cases. Guo et al. [32] successfully detected roots with a diameter of 2 cm buried 30 cm deep in the ground by using 900 MHz GPR, and improved the root biomass estimation model based on the amplitude and time interval of the reflected wave. Cui et al. [33] established an estimation model of root diameter parameters under controlled experiments and achieved good estimation results. For field exploration, Yamase et al. successfully detected the yellow poplar root with a diameter greater than 5 mm using the 900 MHz GPR [34], while Zhang et al. scanned the underground root with a diameter less than 1 cm using the 1600 MHz, 900 MHz, and 400 MHz GPR, and the results showed that the root recognition rates were all more than 50% [35]. Generally speaking, it is feasible to use GPR to detect roots in the field, but the success rate of root detection and the accuracy of root diameter estimation needs to be improved.
The main purpose of this study is to establish a non-destructive estimation model of plant root diameter, so that underground root diameter can be estimated from GPR scanning results. The work includes: (1) defining the time interval parameter and analyzing its correlation with root diameter using the simulated detection results of GprMax v.3.1.5 (Available online: http://www.gprmax.com/, accessed on 29 March 2021); (2) establishing a least squares model and a BP neural network model of root diameter using the correlation between root diameter and time interval; (3) testing the estimation performance of the two models when the root and soil dielectric constants are known and when they are unknown. The models established in this study will facilitate the non-destructive estimation of underground root diameter and provide methods and guidance for the determination of root biomass in the future.

The Theoretical Basis of GprMax Forward Modeling
GprMax v.3.1.5 (Available online: http://www.gprmax.com/, accessed on 29 March 2021) is open source software that simulates electromagnetic wave propagation for GPR [36]. Many researchers have demonstrated the feasibility and accuracy of GprMax for simulated detection through theory and practice [37][38][39][40][41]. Forward simulation with GprMax models the radar response of roots and soil when all variables of the simulation model are known, so that the influence of each variable on the detection of roots by GPR can be explored [42]. In this research, GprMax was used to simulate the scanning process of GPR to obtain the data samples needed for the research. In GprMax, the simulation geometric model was constructed using instructions that carry position coordinate information, and Paraview v5.9 (Available online: https://www.paraview.org/, accessed on 29 March 2021) [30] was used to check whether the constructed geometric model met the expected requirements. For the establishment of the geometric model, please refer to Appendix A.

Time Interval Parameter
Scanning the root system with GPR yielded both A-scan and B-scan images. In the B-scan, the result of root scanning is a hyperbola (for the description of A-scan and B-scan and the principles of hyperbola imaging of root scanning, see Appendix B), and the vertex of the hyperbola indicates the position of the root system [43]. Therefore, by extracting the A-scan at the vertex of the hyperbola, the reflection of the radar wave propagating along the root diameter direction can be obtained. The receiving antenna recorded the wave reflected by the root system, and the A-scan plots the resulting positive and negative fluctuation of the field strength. In this A-scan, the two time points at which the radar wave enters and penetrates the root system were defined as $T_1$ and $T_2$ (see Appendix C for a detailed description of $T_1$ and $T_2$). $T_1$ and $T_2$ not only record these two time points, but also delimit the time interval $\Delta T$ during which the reflected wave field strength is negative:

$$\Delta T = T_2 - T_1 \tag{1}$$
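As an illustration of how such an interval could be read off a trace, the following is a minimal sketch. It assumes that the two time points (written `t1` and `t2` here) are the zero crossings that bound the negative lobe of the root reflection; the exact definition is given in Appendix C, which is not reproduced here, so this is not necessarily the procedure used in the study. The function name, the `search_from_ns` parameter (used to skip the direct wave), and the linear interpolation are all assumptions.

```python
import numpy as np

def negative_lobe_interval(time_ns, field, search_from_ns=0.0):
    """Return (t1, t2, dt) for the deepest negative lobe of an A-scan trace.

    Assumption: t1 and t2 are the zero crossings on either side of the most
    negative sample found at or after search_from_ns.
    """
    time_ns = np.asarray(time_ns, dtype=float)
    field = np.asarray(field, dtype=float)

    # Locate the most negative sample inside the search window.
    window = np.flatnonzero(time_ns >= search_from_ns)
    k = window[np.argmin(field[window])]
    if field[k] >= 0:
        raise ValueError("no negative lobe in the search window")

    # Walk outwards while the field stays negative.
    left = k
    while left > 0 and field[left - 1] < 0:
        left -= 1
    right = k
    while right < len(field) - 1 and field[right + 1] < 0:
        right += 1

    def zero_crossing(i, j):
        # Linear interpolation between two samples of opposite sign.
        fi, fj = field[i], field[j]
        return time_ns[i] - fi * (time_ns[j] - time_ns[i]) / (fj - fi)

    t1 = zero_crossing(left - 1, left) if left > 0 else time_ns[left]
    t2 = zero_crossing(right, right + 1) if right < len(field) - 1 else time_ns[right]
    return t1, t2, t2 - t1

# Synthetic trace (not simulation output): one negative lobe from 2 ns to 3 ns.
t = np.linspace(0.0, 5.0, 501)
trace = np.where((t >= 2.0) & (t <= 4.0), -np.sin(np.pi * (t - 2.0)), 0.0)
print(negative_lobe_interval(t, trace, search_from_ns=1.0))
```

On real or simulated A-scans, the search window would need to start after the direct wave, since that arrival also contains negative lobes.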

Forward Modeling Experiment Design
In order to clarify the relationship between root diameter and $\Delta T$, GprMax was used to design a simulation experiment. The contrast between the dielectric properties of the root and soil has been shown to be an important factor affecting the effectiveness of GPR detection. When the relative dielectric constants of root and soil are too close, the reflected signal at the receiving antenna is weak or even undetectable, which makes root detection difficult. Using GPR, Liu et al. [44] found a close relationship between wave velocity and water content of soil in the root zone. In their experimental design, the dielectric constants of the soil were chosen to be 3.70, 6.35, 8.28, 9.98, and 13.67, and those of the root were chosen to be 7.59, 9.21, 13.06, 17.81, and 23.53. These values were obtained from field measurements in Inner Mongolia, China, and provided a reference for setting the dielectric constants of the root and soil in this study. Therefore, in the exploratory experiment on the correlation between root diameter and $\Delta T$, the soil dielectric constant was set to 3.70 and the root dielectric constant was set to 23.53, giving a strong contrast between the two and hence a clear root scan hyperbola.
The simulation geometric model for exploring the correlation between root diameter and $\Delta T$ is shown in Figure 1, which is a vertical section of root scanning with GPR. The lower left corner of the image is the origin of coordinates, the x axis is the horizontal distance, and the y axis is the vertical depth, which together define a rectangular area of 7.92 m × 0.28 m. In the vertical direction, the gray area below 0.25 m represents soil, and the white area above 0.25 m represents air. A pair of small gray and blue rectangles 0.04 m apart was placed on the surface at 0.25 m to simulate the transmitting and receiving antennas of the GPR. In the horizontal direction, a cylindrical root was placed every 0.5 m, for a total of 15 roots. The radius starts from 2 mm and increases in 2 mm steps, as shown by the red circles in Figure 1 (the drawing of some roots is omitted for simplicity). The upper vertex of each circle is located on the horizontal line at 0.15 m, so that the radar wave takes the same time to reach the top of each root. The blue border around Figure 1 represents the perfectly matched layer (PML), which absorbs the electromagnetic waves that reach the boundary of the rectangle. The transmitting and receiving antennas move synchronously to the right along the ground surface from the initial position on the left in Figure 1 to complete the simulation scan.
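To make the geometry concrete, the sketch below generates a gprMax-style input file for this layout from Python. The domain size, soil surface height, antenna separation, root spacing, root radii, top-of-root depth, center frequency, and dielectric constants come from the text. Everything else is an assumption made for illustration: the 2 mm cell size, the conductivities, the time window, the Ricker waveform, the Hertzian dipole source, the antenna start positions and step, and the x position of the first root. The actual input file of the study is described in Appendix A and may differ.

```python
def build_root_scan_input(eps_soil=3.70, eps_root=23.53, n_roots=15,
                          first_x=0.46, spacing=0.5, top_y=0.15, cell=0.002):
    """Return the text of a 2D gprMax input file for the root-scan geometry."""
    lines = [
        "#title: root diameter B-scan (illustrative sketch)",
        f"#domain: 7.92 0.28 {cell}",            # one cell thick in z: 2D model
        f"#dx_dy_dz: {cell} {cell} {cell}",      # assumed spatial step
        "#time_window: 8e-9",                    # assumed
        f"#material: {eps_soil} 0.001 1 0 soil", # conductivity assumed
        f"#material: {eps_root} 0.01 1 0 root",  # conductivity assumed
        "#waveform: ricker 1 1.5e9 src_pulse",   # 1.5 GHz center frequency
        "#hertzian_dipole: z 0.04 0.25 0 src_pulse",  # start position assumed
        "#rx: 0.08 0.25 0",                      # 0.04 m from the source
        "#src_steps: 0.02 0 0",                  # assumed trace spacing
        "#rx_steps: 0.02 0 0",
        f"#box: 0 0 0 7.92 0.25 {cell} soil",    # soil below 0.25 m, air above
    ]
    for k in range(n_roots):
        radius = 0.002 * (k + 1)        # 2 mm, 4 mm, ..., 30 mm
        x = first_x + k * spacing       # one root every 0.5 m
        y = top_y - radius              # keeps every root top at 0.15 m
        lines.append(
            f"#cylinder: {x:.3f} {y:.3f} 0 {x:.3f} {y:.3f} {cell} {radius:.3f} root"
        )
    return "\n".join(lines)

print(build_root_scan_input())
```

Placing the cylinder center at `top_y - radius` is what keeps the arrival time at the top of every root the same, regardless of diameter.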

Model Training Data
When using GPR to scan the root system in the field, the dielectric constants of root and soil are unknown. The parameters needed to predict the root diameter can only be extracted from the A-scan and B-scan images of the root scan. However, the scanning results differ markedly with the contrast between the dielectric constants of root and soil [32,45]. Therefore, when establishing the model, the scanning results under different values of root and soil dielectric constant should be analyzed comprehensively. Wu et al. [46] established a model of soil volumetric moisture content and dielectric constant, and determined that the dielectric constant of unsaturated water-bearing soil and sandy soil was between 6 and 11. Owing to the influence of water content, the dielectric constant of dry roots was 4.5, while that of water-saturated roots can reach 22 and above [47]. The results of these studies provided a reference for the root and soil dielectric constants in this article. In the acquisition scheme of model training data, 5 different soil dielectric constants and 10 different root dielectric constants were selected and combined; the specific values are shown in Figure 2. In Figure 2, the x axis represents the relative dielectric constant of soil ($\varepsilon_s$), which increases from 4 in steps of 2, and the y axis represents the relative dielectric constant of root ($\varepsilon_r$), which increases from 7 in steps of 2. The value in each rectangle represents the relative dielectric constant difference $\Delta\varepsilon$ between root and soil, and the black part marks the discarded dielectric constant combinations.
The relative dielectric constant difference $\Delta\varepsilon$ between root and soil is defined as:

$$\Delta\varepsilon = \varepsilon_r - \varepsilon_s \tag{2}$$

According to the values of the root and soil dielectric constants in Figure 2, 50 dielectric constant combinations of root and soil can be obtained. Under normal circumstances, the water content of living roots is greater than that of soil. In the present study, when $\Delta\varepsilon$ is less than 3, the hyperbola of the root scanned by GprMax simulation is not clear, so the combinations of root and soil dielectric constants with $\varepsilon_r < \varepsilon_s$ or $\Delta\varepsilon < 3$ were discarded. The black portion in Figure 2 marks the discarded combinations, leaving 40 effective combinations. For each effective combination of root and soil dielectric constants, 7 simulated cylindrical roots were set, with the radius starting at 5 mm and increasing in 3 mm steps. The center frequency of the radar was set to 1.5 GHz. Scanning 7 roots of increasing radius with GprMax under 40 combinations of root and soil dielectric constants yields a total of 280 root hyperbola samples. The A-scan at each hyperbola vertex was then extracted to record $T_1$ and $T_2$ and to calculate $\Delta T$, giving the training data set of the model. The least squares model and the neural network model were established based on this training data. In establishing the neural network model, the training data were further divided into a training set, a validation set, and a test set.
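The selection rule above can be checked with a few lines of Python. This is only a bookkeeping sketch using the values stated in the text (soil values 4 to 12 and root values 7 to 25 in steps of 2, radii from 5 mm in 3 mm steps); the variable and function names are illustrative.

```python
soil_eps = [4, 6, 8, 10, 12]              # 5 soil dielectric constants
root_eps = list(range(7, 26, 2))          # 10 root dielectric constants: 7..25
radii_mm = [5 + 3 * k for k in range(7)]  # 5, 8, ..., 23 mm

def valid_combinations(roots, soils, min_diff=3):
    """Keep (root, soil) pairs whose difference root - soil is at least min_diff."""
    return [(er, es) for es in soils for er in roots if er - es >= min_diff]

combos = valid_combinations(root_eps, soil_eps)
samples = [(er, es, r) for (er, es) in combos for r in radii_mm]

print(len(root_eps) * len(soil_eps), len(combos), len(samples))
```

The condition `er - es >= 3` covers both discard rules at once, because any pair with the root constant below the soil constant also has a difference below 3.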

Model Test Data
In order to compare the prediction performance of the least squares model and the neural network model for root diameter, the same test data must be input into both. In this study, the soil dielectric constant was set to 4, and the root dielectric constant was increased from 7 to 24 in steps of 1, giving a total of 18 combinations of root and soil dielectric constants. Under each combination, 5 cylindrical roots with diameters of 12 mm, 18 mm, 26 mm, 36 mm, and 48 mm were set. A total of 90 root scan hyperbolas were obtained by performing simulation scans with GprMax. The A-scan through each hyperbola vertex was then plotted to record $T_1$, $T_2$, and $\Delta T$, and the results were input into the two established models as test data, so that the performance of the two estimation models could be compared.

Least Squares Regression Model
The least squares method is a commonly used method for data fitting [48]. A detailed description of the least squares method can be found in Appendix D. In this study, the quadratic function, which gave better fitting results, was selected; that is, the basis functions are $\varphi_0(x) = 1$, $\varphi_1(x) = x$, $\varphi_2(x) = x^2$, and the corresponding fitting function is:

$$S(x) = a_0 + a_1 x + a_2 x^2 \tag{3}$$

After solving for the fitting coefficients $a_0^*$, $a_1^*$, $a_2^*$, the fitting equation $S^*(x) = a_0^* + a_1^* x + a_2^* x^2$ was obtained, and the least squares fitting model of time interval and root diameter was established. In this paper, $x$ in the basis functions $\varphi(x)$ is the time interval $\Delta T$, and $S^*(x)$ is the root diameter $D$.
For roots with the same diameter $D$, the $\Delta T$ obtained by root system scanning differs under different dielectric constants of root and soil. Therefore, each combination of root and soil dielectric constants produces its own fitting curve $S^*(x)$. In Section 2.4.1, the root simulation scanning results of 40 combinations of root and soil dielectric constants were obtained, so 40 fitting curves of root diameter against time interval can be plotted. When using the least squares model to predict the root diameter, if the dielectric constants of root and soil are known, the corresponding fitting curve $S^*(x)$ can be found, and the root diameter can then be predicted. If the dielectric constants of root and soil are unknown, the corresponding fitting curve $S^*(x)$ cannot be identified. Therefore, in this paper, the average of the values predicted by the 40 fitting curves was taken as the prediction result of root diameter:

$$\bar{D} = \frac{1}{40} \sum_{k=1}^{40} S_k^*(\Delta T) \tag{4}$$

where $\bar{D}$ is the average of the prediction results of the 40 fitting curves, and $S_k^*(\Delta T)$ is the root diameter predicted by the $k$-th fitting curve when the input is $\Delta T$.
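The quadratic fit and the averaging over curves can be sketched with NumPy as follows. This is an illustration under assumptions, not the fitting code of the study: `np.polyfit` stands in for the least squares solution described in Appendix D, the function names are hypothetical, and the data in the example are synthetic numbers chosen to lie on known quadratics.

```python
import numpy as np

def fit_quadratic(dt_ns, diameter_mm):
    """Least squares fit of D = a0 + a1*dT + a2*dT**2; returns (a0, a1, a2)."""
    # np.polyfit returns the highest power first, so reverse the order.
    return np.polyfit(dt_ns, diameter_mm, deg=2)[::-1]

def predict(coeffs, dt_ns):
    """Evaluate one fitted curve at the time interval dt_ns."""
    a0, a1, a2 = coeffs
    return a0 + a1 * dt_ns + a2 * dt_ns ** 2

def predict_unknown(all_coeffs, dt_ns):
    """Average the predictions of all fitted curves (unknown dielectric constants)."""
    return float(np.mean([predict(c, dt_ns) for c in all_coeffs]))

# Synthetic example: two "dielectric combinations", each an exact quadratic.
dt = np.array([0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4])
curve_a = fit_quadratic(dt, 2.0 + 30.0 * dt + 5.0 * dt ** 2)
curve_b = fit_quadratic(dt, 4.0 + 40.0 * dt + 1.0 * dt ** 2)

print(predict(curve_a, 1.0))                     # known combination: one curve
print(predict_unknown([curve_a, curve_b], 1.0))  # unknown: mean over curves
```

In the setting of the paper, the list passed to `predict_unknown` would hold the 40 coefficient triples, one per effective combination of root and soil dielectric constants.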

BP Neural Network Model
The back propagation (BP) neural network is a computing model composed of a large number of interconnected neurons. Through learning and training, it calculates an output that approximates the expected value for a given input [49]. The application of neural networks in the field of non-destructive detection is novel and feasible [50]. Liu et al. used neural networks to determine the location of underground targets from GPR data [51]. In this study, we also use a neural network to predict the diameter of the underground root system. The single-output neural network structure is shown in Figure 3. The input received by the h-th neuron in the hidden layer in Figure 3 is:

$$\alpha_h = \sum_{i=1}^{d} v_{ih} x_i \tag{5}$$

where $x_i$ is the input from the i-th neuron, $v_{ih}$ is the connection weight between the i-th neuron in the input layer and the h-th neuron in the hidden layer, and $d$ is the number of neurons in the input layer.
When the input of the hidden layer is greater than the threshold $\gamma_h$, the neuron is activated and propagates the signal to the output layer. The input received by the j-th neuron in the output layer is:

$$\beta_j = \sum_{h=1}^{q} w_{hj} b_h \tag{6}$$

where $w_{hj}$ is the connection weight between the h-th neuron in the hidden layer and the j-th neuron in the output layer, $b_h$ is the output of the h-th neuron in the hidden layer, and $q$ is the number of neurons in the hidden layer. When the total input received by a neuron in the output layer is greater than its threshold, an output is generated. The neural network can be used to fit the data and establish a mapping relationship between the input and output of the data set.
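The forward pass of such a single-output network can be written in a few lines of NumPy. The sketch below is a generic illustration, not the network of the study: the names (`V` for input-to-hidden weights, `gamma` for hidden thresholds, `w` for hidden-to-output weights, `theta` for the output threshold), the sigmoid activation in the hidden layer, and the linear output are all assumptions.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation (an assumed choice for the hidden layer)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, V, gamma, w, theta):
    """Forward pass of a one-hidden-layer, single-output network.

    x     : input vector, shape (d,)
    V     : input-to-hidden weights, shape (d, q)
    gamma : hidden-layer thresholds, shape (q,)
    w     : hidden-to-output weights, shape (q,)
    theta : output threshold (scalar)
    """
    alpha = x @ V               # input received by each hidden neuron
    b = sigmoid(alpha - gamma)  # hidden outputs after subtracting the threshold
    beta = b @ w                # input received by the output neuron
    return beta - theta         # linear output, as is common for regression

# Illustrative shapes: 5 inputs and 10 hidden neurons, random untrained weights.
rng = np.random.default_rng(0)
x = rng.normal(size=5)
V = rng.normal(size=(5, 10))
print(forward(x, V, np.zeros(10), rng.normal(size=10), 0.0))
```

Training would then adjust `V`, `gamma`, `w`, and `theta` so that the output approaches the labelled root diameter; that step is omitted here.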
In the training of the neural network model, the Levenberg-Marquardt (LM) back propagation algorithm was selected. This method involves three types of data sets: a training set, a validation set, and a test set. The training set was used to calculate the gradient and to update the connection weights and thresholds to complete the training of the model. The validation set was used for error estimation: if the error on the training set decreases but the error on the validation set increases, training is stopped to prevent over-fitting [52]. The test set was used to preliminarily evaluate the generalization ability of the model. $T_1$, $T_2$, and $\Delta T$ of the 280 root samples in the training data in Section 2.4.1 were collected and randomly divided into a training set of 70% (196), a validation set of 15% (42), and a test set of 15% (42) (this test set is different from the test data in Section 2.4.2. The test set was used to preliminarily determine the prediction performance of the trained neural network model, while the test data in Section 2.4.2 were used to compare the prediction performance of the established least squares model with that of the neural network model under the same inputs). We set $\varepsilon_r$, $\varepsilon_s$, $T_1$, $T_2$, and $\Delta T$ as inputs, root diameter $D$ as the label, and the number of hidden layer neurons to 10 to train the neural network model.
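The 70/15/15 split and the validation-based stopping rule can be sketched as follows. This is a minimal illustration, not the training code of the study: the random seed, the function names, and the `patience` parameter (how many consecutive non-improving validation checks are tolerated) are assumptions, and the Levenberg-Marquardt update itself is not shown.

```python
import numpy as np

def split_indices(n=280, seed=0):
    """Randomly split sample indices into 70% train, 15% validation, 15% test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_train = round(0.70 * n)
    n_val = round(0.15 * n)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

def early_stop_epoch(val_errors, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of validation errors.

    Training stops once the validation error has failed to improve for
    `patience` consecutive epochs; best_epoch is where it was lowest.
    """
    best, best_epoch, bad = float("inf"), -1, 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, best_epoch, bad = err, epoch, 0
        else:
            bad += 1
            if bad >= patience:
                return epoch, best_epoch
    return len(val_errors) - 1, best_epoch

train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))
print(early_stop_epoch([5.0, 4.0, 3.0, 3.1, 3.2, 3.3]))
```

The weights kept at the end would be those from `best_epoch`, not from the epoch at which training stopped.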

Performance Assessment and Comparison
In this paper, two models were used to estimate the root diameter. With the same test samples as input, the prediction results of the two models were collected, and their prediction performance was compared by calculating parameters such as the residual and the root mean square error (RMSE) between the predicted value and the actual value [53]. The distribution of the residuals was analyzed with boxplots (the calculation of the residuals and RMSE is described in detail in Appendix E).
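These quantities can be computed as in the sketch below. The exact formulas are given in Appendix E, which is not reproduced here, so the definitions used (residual as predicted minus actual, residual percentage relative to the actual value, and the mean of the absolute residual percentage as the average prediction error percentage) are assumptions, as are the function name and the example numbers.

```python
import numpy as np

def evaluate(predicted_mm, actual_mm):
    """Return residuals, residual percentages, RMSE and mean absolute % error."""
    predicted_mm = np.asarray(predicted_mm, dtype=float)
    actual_mm = np.asarray(actual_mm, dtype=float)
    residual = predicted_mm - actual_mm          # assumed sign convention
    residual_pct = 100.0 * residual / actual_mm
    return {
        "residual": residual,
        "residual_pct": residual_pct,
        "rmse": float(np.sqrt(np.mean(residual ** 2))),
        "mean_abs_pct": float(np.mean(np.abs(residual_pct))),
        # Quartiles of the residuals, as drawn in a boxplot.
        "quartiles": np.percentile(residual, [25, 50, 75]),
    }

# Made-up predictions for three roots of 12, 18 and 26 mm (not study results).
result = evaluate([13.0, 17.0, 26.0], [12.0, 18.0, 26.0])
print(result["rmse"], result["mean_abs_pct"])
```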

The Correlation between Root Diameter and $\Delta T$
The simulation scanning results of the 15 cylindrical roots of increasing radius set in the forward modeling experimental design in Section 2.3 are shown in Figure 4. The first hyperbola in Figure 4 is the scan result of a root with a diameter of 4 mm, and the second is that of a root with a diameter of 8 mm. The hyperbola of the 8 mm root is relatively clear, and the two reflections of the radar wave from the root can be distinguished, while the scanning result of the 4 mm root is relatively vague. This indicates that a vertical resolution finer than 8 mm can be obtained with the 1.5 GHz radar frequency used in this study. As the root diameter increases further, the scanned hyperbola becomes clearer. Since the tops of the roots lie on the same horizontal line, the time at which the electromagnetic wave reaches the top surface of each root is basically the same, about 2 ns on the time axis. The propagation distance of the electromagnetic wave inside the root increases with root size, which lengthens the time between the two reflected waves and delays the time at which the radar wave reaches the bottom surface of the root. The A-scan passing through each hyperbola vertex was extracted from Figure 4, the time points $T_1$ and $T_2$ were recorded from the A-scan, and the time interval $\Delta T$ was calculated. The results are listed in Table 1. When the root dielectric constant is 23.53 and the soil dielectric constant is 3.70, $\Delta T$ increases with root diameter (Table 1), which indicates a positive correlation between the time interval $\Delta T$ and root diameter $D$.
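The physical reason for this positive correlation can be illustrated with the idealized two-way travel time of a plane wave crossing a lossless root of diameter $D$ and relative dielectric constant $\varepsilon_r$, namely $t = 2D\sqrt{\varepsilon_r}/c$. The sketch below evaluates this textbook relation. It is not the definition of the time interval used in this paper, which is read from the A-scan waveform, so the two quantities should not be expected to coincide; the function names are illustrative.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in vacuum, metres per nanosecond

def two_way_time_ns(diameter_mm, eps_root):
    """Idealized two-way travel time (ns) through a root of the given diameter."""
    velocity = C_M_PER_NS / math.sqrt(eps_root)  # wave speed inside the root
    return 2.0 * (diameter_mm / 1000.0) / velocity

def diameter_from_time_mm(time_ns, eps_root):
    """Invert the relation: diameter (mm) from an idealized two-way time."""
    return 1000.0 * time_ns * C_M_PER_NS / (2.0 * math.sqrt(eps_root))

for d in (12, 18, 26, 36, 48):
    print(d, round(two_way_time_ns(d, 23.53), 3))
```

The relation is linear in diameter and scales with the square root of the root dielectric constant, which is consistent with the observation that the same diameter yields different time intervals under different dielectric constants.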

Establishment of Least Squares Model
The contrast in dielectric properties between root and soil affects the detection performance of GPR. Therefore, we adopted the experimental scheme of Section 2.4 to expand the data samples, and then analyzed the correlation between root diameter and time interval under different contrasts between the dielectric properties of root and soil.
According to the root scanning results in Section 2.4, the fitting results of root diameter against time interval under 40 combinations of root and soil dielectric constants were obtained (Figure 5). The distribution of curves in Figure 5 shows that when $\Delta T$ is small, the curves differ little, but as $\Delta T$ increases, the spread between the curves grows. For example, when $\Delta T$ is 0.3 ns, the predictions of the 40 curves are about 13–23 mm, while when $\Delta T$ is 1.2 ns, the predictions are about 42–63 mm. Therefore, under different combinations of root and soil dielectric constants, the least squares prediction models differ somewhat, but the overall relationship between $\Delta T$ and $D$ is similar.

Back Propagation (BP) Neural Network Model
After the neural network model training was completed, regression analysis was carried out on each data set and the fitting curve was drawn (Figure 6). For each type of data set in Figure 6, the fitting curve basically coincides with the ideal estimated distribution. The correlation coefficient and RMSE of each data set are listed in Table 2.

Comparison of Prediction Results under the Condition of Known Dielectric Constants of Root and Soil
After the establishment of the least squares model and the neural network model, the time intervals $\Delta T$ extracted from the test samples in Section 2.4.2 were input into the two models to estimate the root diameter. With the dielectric constants of root and soil known, the root diameters predicted by the two models from $\Delta T$ are shown in Figure 7. The black circles in Figure 7 represent the input root diameters, which fall into 5 groups: 12 mm, 18 mm, 26 mm, 36 mm, and 48 mm. The red circles are the root diameters estimated by the prediction model. The residual error and the residual percentage between the estimated value and the actual value were calculated for the two models, and their boxplots are shown in Figure 8. As shown in Figure 8a, the residual span of the least squares model is about 3.7 mm, and that of the neural network model is about 3.2 mm, which indicates that the overall distribution range of the neural network residuals is smaller. The range between the lower quartile ($Q_{25\%}$) and the upper quartile ($Q_{75\%}$) of the residual distribution is approximately [−0.5 mm, 1 mm] for the least squares model and approximately [−1 mm, 1 mm] for the neural network model, which indicates that the residual distribution of the least squares model is more concentrated. Figure 8b shows that the residual percentage span of the least squares model is about [−5%, 11%], and that of the neural network model is about [−10%, 5%], so the two spans are similar in width. By calculation, the RMSE of the least squares model is 0.881, and the RMSE of the neural network model is 0.973. Both models therefore have their advantages. The interquartile span of the least squares model is smaller, and its RMSE is smaller than that of the neural network model. However, the overall residual distribution range of the neural network model is smaller, its prediction performance is more stable, and the median of its residual distribution is closer to 0. In summary, with known dielectric constants of root and soil, both the least squares model and the neural network model estimate root diameter well. However, considering its smaller overall residual range and more stable prediction performance, the neural network model is more suitable for predicting root diameter. The root diameters of the test samples were within [12 mm, 48 mm]; the residuals predicted by the neural network model were within [−1.5 mm, 1.5 mm], the residual percentage span was about [−10%, 5%], and the average prediction error percentage, calculated from the absolute value of the prediction error, was 3.62%.

Comparison of Prediction Results under the Condition of Unknown Dielectric Constants of Root and Soil
Under the condition of unknown dielectric constants of root and soil, the test samples generated in Section 2.4.2 were also input into the least square model and neural network model respectively, so as to compare the prediction effects of the two models. For the least square model, the predicted value of root diameter can be obtained according to Equation (4). The prediction results of the two models are shown in Figure 9.  For the least square model, after the average value of 40 prediction curves was obtained, the error of the prediction results was larger than that in Figure 7a, but the overall prediction trend conforms to the distribution of the prediction curves of the least square model (Figure 9). Similarly, compared with Figure 7b, the coincidence degree between the estimated value and the actual value of the neural network model decreases, and the estimation error increases. Therefore, under the condition of unknown dielectric constant of root and soil, the prediction effect of both models is worse. As shown in Figure 10, the distribution boxplot of the residual and the percentage of the residual of the root diameter estimation of the two models under the unknown dielectric constants of the root and soil are plotted. It can be seen from Figure 10a that under the condition of the unknown dielectric constant of the root and soil, the residual span of the prediction result of the least square model is about 11 mm, and that of the prediction result of the neural network model is about 10 mm. The residual distribution intervals of the two models are roughly the same, but the median of the neural network model is closer to 0. 
It can be seen from Figure 10b that the residual percentage span of the least squares model is about [−38%, 30%], and that of the neural network model is about [−30%, 21%], which indicates that the residual percentage distribution of the neural network model is narrower and its median is closer to 0. Based on these two points, the prediction performance of the neural network model is considered better. Its prediction residuals are distributed within [−5.3 mm, 5.0 mm], and the average prediction error percentage, calculated from the absolute value of the prediction error, is 10.19%.
The trained neural network model was used to predict the root diameter parameters under the condition of unknown root and soil dielectric constants, and the results showed that the prediction performance was more stable than the least square model. Altogether, the stability performance of the neural network model is better than that of the least square model in the case of known or unknown dielectric constants of roots and soils. So the neural network model is considered to be more suitable for the estimation of root diameter parameters.

Discussion
In this paper, we describe two root diameter prediction models based on simulation data, namely the least square model and the neural network model, and compare their prediction performance with known and unknown relative dielectric constants of root and soil. The simulation results showed that the neural network model predicts better than the least square model regardless of whether the relative dielectric constants of root and soil are known. Therefore, the neural network model is more suitable for estimating root diameter parameters. Based on this conclusion, the prediction performance of the neural network model was compared under known and unknown relative dielectric constants of root and soil. The results show that, given the relative dielectric constants of root and soil, the prediction error of the neural network model is in the range of [−1.5 mm, 1.5 mm], and the prediction residual percentage is in the range of [−10%, 5%]. Calculated from the absolute values of the prediction errors, the average prediction error percentage is 3.62%. However, when the relative dielectric constants of root and soil are unknown, the prediction error of the neural network model increases more than threefold, and the prediction accuracy is seriously reduced. Therefore, we recommend sampling and measuring the dielectric constants of root and soil during field exploration, so that the root diameter can be estimated with the higher accuracy obtained when these constants are known.
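To illustrate why an unknown root permittivity degrades the diameter estimate, the sketch below uses an idealized straight-ray relation between two-way travel time and diameter. This relation is an assumption made only for illustration. It is not the regression or neural network model of this paper, whose time interval is defined from zero crossings of the reflected field, and the function names and numerical values are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def interval_from_diameter(diameter_m, root_permittivity):
    """Idealized two-way travel time (s) straight across a root of the given diameter."""
    return 2.0 * diameter_m * math.sqrt(root_permittivity) / C

def diameter_from_interval(interval_s, assumed_permittivity):
    """Invert the idealized relation using an assumed relative permittivity of the root."""
    return C * interval_s / (2.0 * math.sqrt(assumed_permittivity))

# Hypothetical case: a 30 mm root whose true relative permittivity is 20
dt = interval_from_diameter(0.030, 20.0)

# Diameter (mm) recovered when the permittivity is known, and when it is guessed too low
print(diameter_from_interval(dt, 20.0) * 1000)
print(diameter_from_interval(dt, 12.0) * 1000)
```

Under this simplification the same time interval maps to different diameters for different assumed permittivities, which is consistent with the recommendation to measure the dielectric constants on site.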
In this study, the simulated scanning environment is free of interference factors, so the neural network root diameter estimation model established in this environment achieved good estimation performance. However, this study still has some limitations. Firstly, because the prediction performance of the model has not been evaluated with real detection data, our next task is to carry out a large number of embedded experiments and field detection experiments. Secondly, although real roots are not cylinders and real soil is inhomogeneous, the scanning results obtained by using GPR to detect specific tree species in a specific environment are clear hyperbolas [16,54]. The clearer the hyperbola image is, the more accurately the time interval ∆ of the radar wave passing through the vertex of the hyperbola can be determined. Because the time interval is an important input parameter of the neural network model, a more accurate time interval yields better prediction accuracy. Therefore, we expect the neural network model to show good prediction performance under specific experimental conditions. In addition, this study did not consider the influence of root growth direction and the angle between the root and the GPR scanning line on the detection results, although Liu et al. pointed out that the root direction and the relative geometry between the root and the GPR measurement direction affect the intensity and shape of root reflection signals [55]. Therefore, in future work, we will also analyze the influence of root angle on the scanning results based on simulation and embedded experiments.
The estimation of plant root diameter is basic and important research. Accurate estimation of root diameter will facilitate the study of root expansion. For example, fine roots are difficult to detect in practice. However, Zhang et al. pointed out that the spatial distribution of fine roots can be predicted from the spatial distribution of coarse roots [56], because fine roots occur nearly continuously along coarse roots [57]. If the distribution of coarse root diameters is known, the spatial distribution of fine roots can be predicted and even root biomass can be estimated [6]. In addition, root diameter is an important parameter for 3D structural reconstruction of roots [58,59]. Combined with the changing trend of root diameter, the root system can be restored more realistically [16,60]. Although the neural network root diameter estimation model is based on simulation experiments, it provides a new idea for root diameter estimation, and the neural network approach is well suited to practical application. Specifically, data collected in both controlled and field experiments can be used as supplementary training data, so that the neural network model can be continuously corrected and improved to produce more accurate estimates in more complex detection environments. This shows the advantage of using the neural network method to establish a root diameter estimation model and lays a foundation for further research on root systems.

Conclusions
This research is based on GprMax simulation. From the results of root scanning, the time points and time intervals of two radar wave reflections caused by the root system were extracted, the correlation between root diameter and time interval was analyzed, and the estimation model of root diameter parameters was established. Through theoretical experiments, this paper draws the following conclusions for roots of 12–48 mm diameter in uniform soil:
(1) When the dielectric constants of root and soil are known, the average prediction error percentage of the least square model is 3.81%, and that of the neural network model is 3.62%. When the dielectric constants of root and soil are unknown, the average prediction error percentage of the least square model is 11.32%, and that of the neural network model is 10.19%.
(2) The prediction stability of the neural network model is better than that of the least square model with both known and unknown dielectric constants of root and soil, so the neural network model is considered to predict better and to be more suitable for estimating root diameter parameters. However, with unknown dielectric constants of root and soil, the prediction error of the neural network model lies within [−5.3 mm, 5.0 mm], which is about 3 times that with known dielectric constants, and the prediction accuracy drops seriously.
(3) Comparing the prediction results of the neural network model under known and unknown dielectric constants of root and soil, we recommend sampling and measuring the dielectric constants of root and soil in field exploration, so that the root diameter can be estimated with the higher accuracy obtained when these constants are known.
The establishment of a root diameter parameter estimation model expands the means of root non-destructive detection. Root diameter is an important parameter in root research, so the correct estimation of root diameter lays a foundation for the future study of root biomass measurement, distribution range and three-dimensional reconstruction. This research is based on simulation experiments, so further simulation followed by laboratory and field testing is warranted using non-uniform roots and soil.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.
in the Figure A1. The blue border around Figure A1 simulates the PML, which is used to absorb the electromagnetic waves that the antenna emits toward the rectangular boundary.

Figure A1. Geometric model constructed by GprMax viewed in Paraview. Gray represents soil, the small gray and blue rectangles represent the transmitting and receiving antennas of the radar, the red circle represents the root system, and the blue borders around the model represent the perfectly matched layer (PML).

B. The imaging principle of root scanning hyperbola
A-scan and B-scan images can be obtained by simulating the scanning of the root system with GPR. When the GPR emits an electromagnetic wave into the ground, the wave is partially reflected at the interface between root and soil because of the difference in their dielectric properties, and the rest of the wave continues to propagate downward until it is fully attenuated (in Figure A1, the receiving antenna only received the electromagnetic wave reflected by the root, because the other electromagnetic waves were absorbed by the PML). The receiving antenna receives the electromagnetic wave reflected by the root and records the change of electric field intensity of the reflected wave in the time domain, thus forming a time versus field strength curve called an A-scan (the black curves in Figure A2).
When the transmitting and receiving antennas emit electromagnetic waves into the ground once at the initial position on the left side of Figure A1, the A-scan recorded by the receiving antenna is the first curve in Figure A2. When the GPR moves to the right in equal steps and emits electromagnetic waves into the ground after each movement, a group of A-scans of the receiver field strength changes caused by electromagnetic waves reflected by the root is recorded. The graph formed by merging this group of A-scans is the B-scan (Figure A2). Throughout the scanning process, when the GPR was directly above the root system, it was closest to the root, and the two-way travel time of the radar wave from transmission to reception was the shortest. When the GPR was located at the initial and final positions at either end, it was furthest from the root, and the two-way travel time was the longest. Therefore, the distance between the GPR and the root first decreased and then increased during the movement, so the onset time of the field strength fluctuation caused by the root in the radar receiver also first decreased and then increased. By connecting the starting and ending time points of the radar wave reflection caused by the root system across the group of A-scans with red dotted lines, the hyperbolic graph of root system scanning shown in Figure A2 is obtained.
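The hyperbolic shape described above can be sketched numerically. The example below is a minimal illustration, assuming a point reflector, a co-located transmitter and receiver, and homogeneous soil. The function name, the root position and depth, and the soil permittivity are hypothetical values chosen for illustration and are not taken from the simulations in this paper.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)

def two_way_travel_time(antenna_x, root_x, root_depth, soil_permittivity):
    """Two-way travel time (s) from an antenna at antenna_x to a point reflector."""
    velocity = C / np.sqrt(soil_permittivity)  # wave speed in the soil
    path = np.sqrt((np.asarray(antenna_x, dtype=float) - root_x) ** 2 + root_depth ** 2)
    return 2.0 * path / velocity

# Hypothetical scan line: 81 antenna positions spaced 2 mm apart,
# with the reflector below the middle of the line at 0.10 m depth
x = np.arange(81) * 0.002
t = two_way_travel_time(x, root_x=0.08, root_depth=0.10, soil_permittivity=9.0)

# The travel time is shortest directly above the reflector and grows toward both ends
apex_index = int(np.argmin(t))
```

Plotting `t` against `x` with the time axis pointing downward gives the hyperbola traced by the reflection onsets in a B-scan.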

C. The definition of time interval parameters
In Figure A1, the initial coordinates of the transmitting antenna are (0.04 m, 0.15 m) and the initial coordinates of the receiving antenna are (0.05 m, 0.15 m). Starting from the initial position on the left side of Figure A1, both antennas were translated to the right in steps of 0.002 m, with one scan of the root after each movement. After 80 translations, the symmetrical scanning results of the root system were obtained (Figure A3). In Figure A3, red indicates positive field strength and blue indicates negative field strength. A darker color indicates a greater magnitude of the field strength. The distribution of color and amplitude results from the excitation source being a Gaussian waveform. When the electromagnetic wave penetrates into and out of the root, the dielectric properties along its path change twice (soil to root, then root to soil), resulting in two radar wave reflections, where the field strength fluctuates from positive to negative and then returns to positive (Figure A3).
An A-scan was extracted from Figure A3 at the position where the transmitting and receiving antennas were symmetrically distributed directly above the root (Figure A4). At this position the radar wave passed through the cylindrical root approximately vertically. During the positive and negative fluctuation of field strength caused by the radar wave reflected by the root, the two time points corresponding to zero field strength were defined as $T_1$ and $T_2$ (Figure A3, Figure A4). $T_1$ and $T_2$ not only record the two time points when radar waves enter and pass through the root system, but also determine the time interval $\Delta T$ during which the reflected wave field strength is negative:

$$\Delta T = T_2 - T_1$$

Figure A3. B-scan generated by simulation scanning of root using GprMax. Red indicates positive field strength, blue indicates negative field strength, and white indicates zero field strength. The upper part is the surface reflected wave and the lower part is the result of root scanning. Two time points with zero field strength were marked as $T_1$ and $T_2$.

Figure A4. A-scan of radar wave crossing root diameter vertically downward along root diameter direction. Two time points with zero field strength were marked as $T_1$ and $T_2$.
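The extraction of the two zero-field-strength time points from an A-scan can be sketched as follows. This is a minimal illustration and not the procedure used by the authors: it assumes the A-scan is available as sampled arrays, that the user supplies a time window containing only the root reflection (so that the surface reflection is excluded), and that linear interpolation between samples is adequate. The function name and the synthetic test signal are hypothetical.

```python
import numpy as np

def root_time_interval(time, field, t_start, t_end):
    """Return (t1, t2, t2 - t1) for the negative lobe of the field inside a window.

    t1 is the first positive-to-negative zero crossing in [t_start, t_end];
    t2 is the next negative-to-positive zero crossing after it.
    """
    time = np.asarray(time, dtype=float)
    field = np.asarray(field, dtype=float)
    mask = (time >= t_start) & (time <= t_end)
    t, f = time[mask], field[mask]

    def crossing(k):
        # Linear interpolation of the zero between samples k and k + 1
        return t[k] + f[k] / (f[k] - f[k + 1]) * (t[k + 1] - t[k])

    down = np.flatnonzero((f[:-1] > 0) & (f[1:] <= 0))  # positive to negative
    up = np.flatnonzero((f[:-1] < 0) & (f[1:] >= 0))    # negative to positive
    if down.size == 0:
        raise ValueError("no positive-to-negative crossing in the window")
    later = up[up > down[0]]
    if later.size == 0:
        raise ValueError("no negative-to-positive crossing after the first one")
    t1, t2 = crossing(down[0]), crossing(later[0])
    return t1, t2, t2 - t1

# Synthetic stand-in for an A-scan (arbitrary units): negative between 1 and 3
time = np.linspace(0.0, 4.0, 400)
field = np.cos(np.pi * time / 2.0)
t1, t2, interval = root_time_interval(time, field, 0.0, 4.0)
```

On a real or simulated A-scan the window would be chosen around the root reflection at the apex of the hyperbola.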

D. Least square method
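The least square method requires that the weighted quadratic sum

$$\sum_{i=1}^{n} w_i \left[ f(x_i) - y_i \right]^2$$

be minimized, where $w_i$ is the weight of the $i$-th data point $(x_i, y_i)$ and $f$ is the fitting function. By solving formula (5), the fitting coefficients can be obtained, and then the fitting function can be determined.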
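A weighted least squares fit of this kind can be sketched with NumPy. The polynomial form of the fitting function, the function name and the sample data below are assumptions for illustration only, since the fitting function of Equation (4) is not reproduced in this appendix.

```python
import numpy as np

def weighted_polyfit(x, y, weights, degree=1):
    """Polynomial coefficients (highest power first) that minimise
    sum_i weights[i] * (f(x[i]) - y[i])**2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sqrt_w = np.sqrt(np.asarray(weights, dtype=float))
    design = np.vander(x, degree + 1)  # columns: x**degree, ..., x, 1
    # Scaling each row by sqrt(w_i) turns the weighted problem into an ordinary one
    coeffs, *_ = np.linalg.lstsq(design * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coeffs

# Hypothetical data lying exactly on the line y = 50 x + 2
x = np.array([0.2, 0.4, 0.6, 0.8])
y = 50.0 * x + 2.0
coeffs = weighted_polyfit(x, y, weights=np.ones_like(x), degree=1)
```

With equal weights this reduces to the ordinary least squares fit.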

E. Indicators of model evaluation
In this paper, the residual error and root mean square error (RMSE) of the two models were calculated as follows. For the least square model, the residual error between the estimated root diameter on the regression curve and the actual root diameter was calculated first:

$$e_i = \hat{D}(\Delta T_i) - D_i$$

where $e_i$ is the residual error between the estimated value and the actual value of the least square model, $\Delta T_i$ is the $i$-th time interval parameter entered, $\hat{D}(\Delta T_i)$ is the estimated value of root diameter of the least square model when $\Delta T_i$ is entered, and $D_i$ is the actual value of root diameter.

After obtaining the residual errors of all test data, the distribution of residual error was analyzed by boxplots, and the RMSE was calculated:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} e_i^{2}}$$

where RMSE is the root mean square error of the least squares model and $n$ is the number of test data.

For the neural network model, it is necessary to use training data to train the model, establish the mapping relationship between input and output, and then use the correlation coefficient ($R$) and the RMSE to evaluate the training effect of the model. The closer the correlation coefficient is to 1, the closer the positive linear relationship between the time interval and the root diameter. The smaller the RMSE, the higher the accuracy of the root diameter estimated by the neural network model. After training a satisfactory neural network model, the root diameter parameter could be estimated. The evaluation indicators of the estimation performance of the neural network model are consistent with those of the least square model, that is, residual distribution and RMSE.

When comparing the prediction effects of the least square model and the neural network model, the residual percentage of the prediction model should be calculated, so as to draw boxplots of the residual percentage to compare and evaluate the prediction effects of the two models:

$$P_i = \frac{e_i}{D_i} \times 100\% \tag{A10}$$

where $P_i$ is the residual percentage of the model.
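The evaluation indicators of this appendix can be computed as in the sketch below. It takes the residual as the estimated value minus the actual value, which is an assumed sign convention, and the function name and sample diameters are hypothetical. The mean absolute residual percentage corresponds to the average prediction error percentage reported in the main text.

```python
import numpy as np

def evaluation_metrics(estimated, actual):
    """Residuals, RMSE, residual percentages and mean absolute residual percentage."""
    estimated = np.asarray(estimated, dtype=float)
    actual = np.asarray(actual, dtype=float)
    residual = estimated - actual            # assumed sign: estimated minus actual
    rmse = np.sqrt(np.mean(residual ** 2))   # root mean square error
    residual_pct = residual / actual * 100.0
    return {
        "residual": residual,
        "rmse": rmse,
        "residual_pct": residual_pct,
        "mean_abs_pct": np.mean(np.abs(residual_pct)),
    }

# Hypothetical estimated and actual root diameters in mm
metrics = evaluation_metrics([13.0, 23.0, 50.0], [12.0, 24.0, 48.0])
```

The `residual` and `residual_pct` arrays are the quantities that would be summarized in the boxplots.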