Estimation of Heavy Metals in Agricultural Soils Using Vis-NIR Spectroscopy with Fractional-Order Derivative and Generalized Regression Neural Network

: With the development of industrialization and urbanization, heavy metal contamination in agricultural soils tends to accumulate rapidly and harm human health. Visible and near-infrared (Vis-NIR) spectroscopy provides the feasibility of fast monitoring of the variation of heavy metals. This study explored the potential of fractional-order derivative (FOD), the optimal band combination algorithm and different mathematical models in estimating soil heavy metals with Vis-NIR spectroscopy. A total of 80 soil samples were collected from an agriculture area in Suzi river basin, Liaoning Province, China. The spectra for mercury (Hg), chromium (Cr), and copper (Cu) of the samples were obtained in the laboratory. For spectral preprocessing, FODs were allowed to vary from 0 to 2 with an increment of 0.2 at each step, and the optimal band combination algorithm was applied to the spectra after FOD. Then, four mathematical models, namely, partial least squares regression (PLSR), adaptive neural fuzzy inference system (ANFIS), random forest (RF) and generalized regression neural network (GRNN), were used to estimate the concentration of Hg, Cr and Cu. Results showed that high-order FOD had an excellent effect in highlighting hidden information and separating minor absorbing peaks, and the optimal band combination algorithm could remove the inﬂuence of spectral noise caused by high-order FOD. The incorporation of the optimal band combination algorithm and FOD is able to further mine spectral information. Furthermore, GRNN made an obvious improvement to the estimation accuracy of all studied heavy metals compared to ANFIS, PLSR, and RF. In summary, our results provided more feasibility for the rapid estimation of Hg, Cr, Cu and other heavy metal pollution areas in agricultural soils.


Introduction
Soil contamination has increased significantly worldwide with the rapid development of industrialization and urbanization [1][2][3]. Due to their contamination being covert, persistent and irreversible, heavy metals are among the most hazardous contamination types that threaten the health of animals and human beings throughout the food chain [4][5][6][7][8]. Soil heavy metals, especially mercury (Hg), chromium (Cr), lead (Pb), cadmium (Cd) and copper (Cu), tend to be accumulated in soils because of anthropogenic activities and their own easy migration properties as common industrial pollutants [9][10][11][12][13][14]. Thus, it is crucial to closely monitor the concentration of heavy metals. However, the mechanisms of soil evolution are complex and difficult to understand [15], and the soil properties may vary obviously on a small scale [16]. Traditional field sampling is always accompanied by sample preparation and chemical analysis, which is difficult to apply widely [17]. Alternatively, because of its low cost and environmental friendliness, visible and near-infrared (Vis-NIR)  (80) in study area with Landsat 8 OLI image after true color compositing by bands 4 (red), 3 (green), and 2 (blue).
Sampling campaigns were carried out in October 2013, and 80 surface soil samples (0-20 cm) were collected in the study area. The coordinates of the samples sites were recorded through a handheld global positioning system, as shown in Figure 1b. Sampling points were randomly set in the cultivated lands along the river. Each sample contained five mixed subsamples, and non-soil materials were removed during collection. All samples were sealed in plastic bags and transported to the laboratory. After the samples were air-dried, the soil samples were passed through a 2 mm sieve. Each sample was divided into two parts: one part was used for spectral measurements, and the other was used for property analysis.

Chemical Analysis and Statistic
The concentration of Cr and Cu in the soil samples was measured through inductively coupled plasma atomic emission spectroscopy according to HJ 781-2016, and atomic fluorescence spectrometry was used to determine the concentration of Hg. For different heavy metal elements, the concentration was arranged in ascending order. The training set and validation set were separated as follows: the second value was selected, and then one value was selected every three values as the validation, and the remaining values were used as the training set. Each metal element was divided into a training set and validation set in a 3:1 ratio according to the method. The training set includes 60 samples, and the validation set includes 20 samples.

Spectral Measurement and Pretreatment
Spectral measurements of soil were performed with an ASD FieldSpec ® 3 field spectroradiometer (Analytical Spectral Devices, Inc., Boulder, CO, USA). The device has a spectral resolution of 3 nm over the 350-1000 nm region and 10 nm over the 1000-2500 nm region, and records reflectance at 1 nm intervals. Spectral measurements were conducted in a dark room. The soil samples were uniformly put into black containers with a  (80) in study area with Landsat 8 OLI image after true color compositing by bands 4 (red), 3 (green), and 2 (blue).
Sampling campaigns were carried out in October 2013, and 80 surface soil samples (0-20 cm) were collected in the study area. The coordinates of the samples sites were recorded through a handheld global positioning system, as shown in Figure 1b. Sampling points were randomly set in the cultivated lands along the river. Each sample contained five mixed subsamples, and non-soil materials were removed during collection. All samples were sealed in plastic bags and transported to the laboratory. After the samples were air-dried, the soil samples were passed through a 2 mm sieve. Each sample was divided into two parts: one part was used for spectral measurements, and the other was used for property analysis.

Chemical Analysis and Statistic
The concentration of Cr and Cu in the soil samples was measured through inductively coupled plasma atomic emission spectroscopy according to HJ 781-2016, and atomic fluorescence spectrometry was used to determine the concentration of Hg. For different heavy metal elements, the concentration was arranged in ascending order. The training set and validation set were separated as follows: the second value was selected, and then one value was selected every three values as the validation, and the remaining values were used as the training set. Each metal element was divided into a training set and validation set in a 3:1 ratio according to the method. The training set includes 60 samples, and the validation set includes 20 samples.

Spectral Measurement and Pretreatment
Spectral measurements of soil were performed with an ASD FieldSpec ® 3 field spectroradiometer (Analytical Spectral Devices, Inc., Boulder, CO, USA). The device has a spectral resolution of 3 nm over the 350-1000 nm region and 10 nm over the 1000-2500 nm region, and records reflectance at 1 nm intervals. Spectral measurements were conducted in a dark room. The soil samples were uniformly put into black containers with a diameter of 10 cm, and the surface of soil samples was scraped flat. A 50 W halogen lamp was used as the light source, and the incident angle was 30 • . The optical fiber probe was perpendicular to the surface of each sample at a distance of 5 cm. Before spectral measurements, the device was preheated for 30 min and optimized with a white panel. After every 10 soil samples, Remote Sens. 2021, 13, 2718 4 of 24 the whiteboard calibration was also carried out. Each sample was tested ten times, and the average value was obtained as the actual spectral reflection data.
After spectral measurements, the original spectra were treated by using the Savitzky-Golay filter [42]. A low order and narrow window size make the smoothing less effective, whereas a high order and loose window size can cause the over-smoothing phenomenon [21]. It is necessary to select moderate parameters, thus the second order and a window size of 11 points were determined [23]. After applying the Savitzky-Golay filter, marginal bands were eliminated, and the region of 400-2400 nm was used for subsequent analysis.

Spectral Feature Reduction
Vis-NIR spectroscopy data have a lot of redundant information when predicting a certain metal element in soils [43]. To avoid the situation of "dimensionality disaster", dimensionality reduction and optimal band combination have been widely used in predicting various elements in soils [29,[44][45][46][47][48][49][50]. In the study, five types of two-band spectral index are used to explore the suitable band combination for Hg, Cr, and Cu. The spectral indices include ratio index (RI), normalized difference index (NDI), difference index (DI), product index (PI), and sum index (SI). In the subsequent analyses, the five spectral indices were calculated in the training set for the optimal band combination algorithm. Additionally, the five spectral indices were calculated in both the training set and the validation set as the input of the model. The calculation details of the indices are listed as Equations (1)-(5). All combinations were calculated from the wavelength over 400-2400 nm, and the correlation coefficients with different heavy metal elements were also obtained. For certain heavy metals, the combination with the largest absolute value of correlation coefficient was selected as the optimal index of the corresponding spectral index.
where B m and B n are the spectral values at bands m and n in the range of 400-2400 nm, respectively.

Fractional-Order Derivatives
Compared with integer derivatives, FODs have better ability to extract the detailed information in the spectral data [44]. FODs are generally expressed in three definitions: Riemann-Liouville (R-L), Grunwald-Letnikov (G-L), and Caputo [51,52]. R-L is widely used to compute analytic solutions of simple functions. However, the R-L derivative of the constant is not equal to 0, which limits the application of R-L. The numerical implementation of G-L is equivalent to the convolution operation, which makes it suitable for digital signal processing. Caputo is suitable for dealing with initial value problems of fractional differential equations, so it is often used in engineering [51]. Due to its simple description and applicability to the spectrum, G-L was used for calculation of FODs in this study. The FOD of function f (x) is shown in Equation (6) [47].
where v is the order. PLSR integrates the characteristic of principal component analysis, canonical correlation analysis and linear regression analysis; therefore, it can avoid the problem of strong correlation between variables [53,54]. Because of this advantage, PLSR plays an increasingly important role in the analysis of soil Vis-NIR spectrum [55,56]. Before establishing models, PLSR performed principal component decomposition of the spectral matrix X and the concentration matrix Y, as Equations (7) and (8): where T and P are the scoring matrix and loading matrix of X, respectively; U and Q are the scoring matrix and loading matrix of X, respectively; E is the error matrix of X, and F is the error matrix of Y. Then, linear regression is established for T and U, as Equation (9): where B is the regression coefficient matrix. The formula of predicting unknown samples is shown as Equation (10): where x is the spectra of unknown samples, and y is the concentration of unknown samples. The optimal number of latent variables was determined through the leave-one-out cross validation procedure.
As an ensemble learning method, RF [57] is developed from a classification and regression tree (CART) model [24,58]. RF is an ensemble classifier, which is composed of a group of decision tree classifiers; the formula is shown as Equation (11): where θ k is the random vector that is independently identically distributed, and K is the number of decision trees. RF has advantages in processing high dimensional and multiple linear data, and it is not sensitive to overfitting. Therefore, when many trees are added, RF do not over-fit, but produce a limited generalization error and large amount of computation [59,60]. Three primary parameters of RF include the number of trees to be grown (ntree), the number of predictor variables used to split the nodes at each partitioning (mtry) and the minimum size of the leaf (nodesize). In this study, we used the randomForest package to establish models [61]. The default ntree defined in the package is 500, the standard for regression analysis of nodesize is 5, and mtry is one-third of the total number of inputs. ANFIS is the network structure that implements the fuzzy inference system and utilizes hybrid-learning rules [62]. The structure of ANFIS includes an antecedent part and conclusion part, and the two parts are connected to each other by fuzzy rules in the network form [63]. The ANFIS has five layers; the output of the first layer is shown as Equation (12): where µ A is a membership function, x is the input of node i, and O 1,i is a membership value. The output of the second layer is an algebraic product of the input signals, as Equation (13): In the third layer, the ratio of the ith rule's firing strength to the total summation of all rules' firing strength is calculated in each node, as Equation (14): In the fourth layer, the output of each node is shown as Equation (15): where f i is the corresponding function of each node. The overall output is computed in the fifth layer, as Equation (16): The ANFIS algorithm embedded in the MATLAB fuzzy inference toolbox was used to develop models. The Subtractive Clustering method was applied for generating FIS to fuzzify the input data. Furthermore, a hybrid learning algorithm was used to train fuzzy inference system, and the coverage threshold was fixed to 0.05. The training stopped after the error remained stable.
GRNN [64] is a memory-based network that is used to approximate the linear and nonlinear regression between independent variables and dependent variables [65]. GRNN has four layers, including an input layer, model layer, summation layer and output layer. In the input layer, the number of neurons is equal to the dimension of the input vector in the samples. The neuron can directly transmit the input into the model layer. The number n of samples is equal to the number of neurons in the model layer, and the transfer function of the neuron i is shown as Equation (17): where X is the input of the network, X i is the corresponding training sample, and σ is the smoothing parameter. In the summation layer, there are two types of neurons. The summation is performed through two methods: arithmetic summation and weighted summation; the formulas are shown in Equations (18) and (19): where y ij is the jth element of the ith output sample. The number of neurons in the output layer is equal to the dimension k of the output vector. The output of neuron i can be described as Equation (20): GRNN can be thought as the second-class radial basis function (RBF) network, having the feature of fast and straightforward estimators. The smooth factor is the parameter of GRNN; the error increases with the increase in smooth factor, and the network generalization ability decreases with the decrease in smooth factor. In the study, the optimum smooth factor was gained as trial and error, according to the mean squared error of estimation [66].

Model Accuracy Evaluation
In this study, the models were established based on the training set, and its accuracy was evaluated through comparing the predicted concentration and measured concentration Remote Sens. 2021, 13, 2718 7 of 24 of the validation set. The division of the training set and validation set was presented in Section 2.2, and they will be described in detail in Section 3.1. Furthermore, three parameters, the determination coefficient (R 2 ), root mean square error (RMSE) and residual prediction deviation (RPD), were utilized as the indicators to evaluate model accuracy. The calculations of R 2 , RMSE and RPD are as Equations (21)-(23).
where t i is the heavy metal concentration of the ith sample in the validation set, p i is the predicted heavy metal concentration of the ith sample in the validation set, t is the average heavy metal concentration, and n is the total number of the samples in the validation set; SD is the standard deviation of the heavy metal concentration in the validation set.

Descriptive Statistics for Heavy Metals
The descriptive statistics of three heavy metals is shown in Table 1. According to China's Soil Environmental Quality Control Standards [67], no sample exceeded the risk screen value of Hg and Cr, but 15 samples exceeded the risk screen value of Cu. Furthermore, 86.25% of Hg, 75% of Cr, and 100% of Cu exceeded the background values. Through the pollution ratio of the training set and validation set, it can be seen that the heavy metal concentration distribution of the training set and validation set is similar. Therefore, the two sets for Hg, Cr, and Cu were comparable and representative of the entire population.

Fractional-Order Derivative Spectrum
The spectra of 80 soil samples were processed by FOD, and the results are demonstrated in Figure 2. The original spectrum was relatively smooth. The spectral reflectance changed gently and showed an upward trend. Three absorption peaks appeared at approximately 1410 nm, 1910 nm, and 2210 nm, and the absorption characteristics were not significant enough. As the order of FOD increases, fractional differential values approached 0, indicating that the baseline drift and mixed overlapping peaks were gradually removed. The spectral intensity was also reduced, thereby more detailed information of the spectrum Remote Sens. 2021, 13, 2718 8 of 24 was mined. This was proven by the presence of more characteristic peaks in the high fractional-order spectrum. Furthermore, when the order of FOD achieved 2, the spectrum fluctuated too frequently, indicating that some noise obscures the spectral information. strated in Figure 2. The original spectrum was relatively smooth. The spectral reflectance changed gently and showed an upward trend. Three absorption peaks appeared at approximately 1410 nm, 1910 nm, and 2210 nm, and the absorption characteristics were not significant enough. As the order of FOD increases, fractional differential values approached 0, indicating that the baseline drift and mixed overlapping peaks were gradually removed. The spectral intensity was also reduced, thereby more detailed information of the spectrum was mined. This was proven by the presence of more characteristic peaks in the high fractional-order spectrum. Furthermore, when the order of FOD achieved 2, the spectrum fluctuated too frequently, indicating that some noise obscures the spectral information.

Correlation between Heavy Metals and Optimal Spectral Indices
In order to reduce the redundant information of the spectrum with the optimal band combination algorithm, the correlation coefficient between heavy metal concentration and spectral indices was utilized. Hg is shown as an example in Figures A1-A5. As order of FOD increases, the correlation coefficients of all indices were augmented significantly, and the distribution pattern of the correlation coefficient was finer. The bands and correlation Remote Sens. 2021, 13, 2718 9 of 24 coefficients of the best spectral indices for different fractional-order spectrum are listed in Tables 2-4. It can be seen that the correlation pattern among different fractional-order spectra varies from case to case. For three heavy metals, better correlation coefficients are achieved by high-order spectrum (i.e., from 1-order to 2-order). Therefore, the optimal bands of high-order spectrum are statistically analyzed. The wavelengths at 1011 nm, 1173 nm, 1295 nm, 1710 nm, 1969 nm, and 2087 nm were selected by multiple high-order spectra and thus, are particularly important for the estimation of Hg. The wavelengths at 547 nm, 714 nm, 866 nm, 955 nm, 1137 nm, 2346 nm, and 2342 nm were selected by multiple high-order spectra and thus, are particularly important for the estimation of Cr. The wavelengths at 667 nm, 923 nm, 980 nm, 1015 nm, 1129 nm, and 1562 nm were selected by multiple high-order spectra and thus, are particularly important for the estimation of Cu.

Model Estimation and Comparisons
In this study, FOD and four modeling methods were used to establish the estimation models of Hg, Cr, and Cu. The model's accuracy was evaluated through comparing the predicted concentration and measured concentration of the validation set. The Hg concentration estimation model's accuracy is listed in Table 5. The Cr concentration estimation model's accuracy is listed in Table 6. The Cu concentration estimation model's accuracy is listed in Table 7.     Four modeling methods and FOD were applied in the modeling process. In the estimation of different heavy metals, the performances of four methods varied greatly. In addition, the models based on the fractional-order spectrum showed better estimation performances than those based on the integer-order spectrum and the original spectrum. The analyses of variance (ANOVA) were performed on three heavy metals. RMSE was taken as the dependent variable, and FOD and modeling methods were taken as the independent variables, as shown in Tables 8-10. The results showed that FOD and the modeling methods of three heavy metals passed the significance test. For Hg and Cr, the modeling methods had an influence on the accuracy of estimation (p < 0.05); FOD exerted a larger effect on the accuracy of estimation (p < 0.01). For Cu, both FOD and the modeling methods had a significant effect on the accuracy of estimation (p < 0.01).

Comparison of Results for Fractional-Order Derivatives
In order to intuitively demonstrate the influence of FOD on the estimation of heavy metals, the RPDs of models were drawn in line charts (Figure 3). The model with 1.8-order and the GRNN method achieved the best performance in the estimation of Hg, the model with 1.6-order and the GRNN method achieved the best performance in the estimation of Cr, and the model with 1.2-order and the RF method achieved the best performance in the estimation of Cu. Among 12 lines, 4 lines peaked at 1.8-order, 3 lines peaked at 1.6-order, 3 lines peaked at 1.4-order, and 2 lines peaked at 1.2-order. To further elucidate the effect of FOD, the error bar of three heavy metals was formed based on the RPD of models, as shown in Figure 4. For Hg and Cr, the accuracy of models varied little between 0-order and 1-order. For three heavy metals, the accuracy of the models using any method improved greatly between 1.2-order and 1.8-order, and decreased greatly at 2-order. In addition, the variances of model accuracy for Hg and Cr were larger between 1.4-order and 1.8-order.
shown in Figure 4. For Hg and Cr, the accuracy of models varied little between 0-order and 1-order. For three heavy metals, the accuracy of the models using any method improved greatly between 1.2-order and 1.8-order, and decreased greatly at 2-order. In addition, the variances of model accuracy for Hg and Cr were larger between 1.4-order and 1.8-order.

Comparison of Results for Mathematical Models
The models were divided into four categories according to modeling methods three heavy metals. The boxplot of RPD is shown in Figure 5. For Hg, the mean R models with GRNN reached 1.26, which was 0.05 higher than those with ANFI

Comparison of Results for Mathematical Models
The models were divided into four categories according to modeling methods for the three heavy metals. The boxplot of RPD is shown in Figure 5. For Hg, the mean RPD of models with GRNN reached 1.26, which was 0.05 higher than those with ANFIS, 0.15 higher than those with PLSR, and 0.08 higher than those with RF. The models with GRNN had the largest difference between maximum and minimum values of RPD. For Cr, the mean RPD of models with GRNN reached 1.31, which was 0.07 higher than those with ANFIS, 0.09 higher than those with PLSR, and 0.06 higher than those with RF. For Cu, the mean RPD of models with GRNN reached 1.50, which was 0.12 higher than those with ANFIS, 0.18 higher than those with PLSR, and 0.16 higher than those with RF. The scatterplots of optimal models of Hg, Cr and Cu are shown in Figure 6. Although the optimal model was implemented by RF in the estimation of Cr, the overall accuracy of Cr models was still dominated by GRNN. When estimating the concentration of Hg, Cr and Cu, compared with ANFIS, PLSR, and RF, the models with GRNN were significantly improved in the quartile values of RPD. derivatives.

Comparison of Results for Mathematical Models
The models were divided into four categories according to modeling methods for the three heavy metals. The boxplot of RPD is shown in Figure 5. For Hg, the mean RPD of models with GRNN reached 1.26, which was 0.05 higher than those with ANFIS, 0.15 higher than those with PLSR, and 0.08 higher than those with RF. The models with GRNN had the largest difference between maximum and minimum values of RPD. For Cr, the mean RPD of models with GRNN reached 1.31, which was 0.07 higher than those with ANFIS, 0.09 higher than those with PLSR, and 0.06 higher than those with RF. For Cu, the mean RPD of models with GRNN reached 1.50, which was 0.12 higher than those with ANFIS, 0.18 higher than those with PLSR, and 0.16 higher than those with RF. The scatterplots of optimal models of Hg, Cr and Cu are shown in Figure 6. Although the optimal model was implemented by RF in the estimation of Cr, the overall accuracy of Cr models was still dominated by GRNN. When estimating the concentration of Hg, Cr and Cu, compared with ANFIS, PLSR, and RF, the models with GRNN were significantly improved in the quartile values of RPD.

Discussion
In this study, the concentration of three heavy metals in most samples was below the risk screen value. However, most samples had Hg and Cr concentration that surpassed the local background value, and all samples had Cu concentration that surpassed the local background value. In unpolluted natural soils, the heavy metals are at trace concentrations [69]. The enrichment of heavy metals indicates the ecological environment has been

Discussion
In this study, the concentration of three heavy metals in most samples was below the risk screen value. However, most samples had Hg and Cr concentration that surpassed the local background value, and all samples had Cu concentration that surpassed the local background value. In unpolluted natural soils, the heavy metals are at trace concentrations [69]. The enrichment of heavy metals indicates the ecological environment has been disturbed by anthropogenic activities [70,71]. This is related to industrial activities, especially the non-standard design of mine tailing ponds, which leads to the leakage of heavy metal elements.
In the study of spectral derivatives, some researchers have reported that the first derivative produced higher model accuracy than the second derivative when estimating heavy metals and soil organic matter [72][73][74]. This phenomenon is related to the fact that the first derivative can enhance the spectral information while maintaining the continuity and integrity of spectral information, whereas the second derivative introduces additional spectral noise [20,72]. In Figure 2, the second derivative spectrum fluctuated too frequently, indicating that the spectral information was disturbed by spectral noise. However, in this study, the accuracy of models using the first derivative was similar to those using the second derivative in the estimation of Cr and Cu, and the accuracy of models using the second derivative was higher than those using the first derivative in the estimation of Hg. How did this phenomenon happen in our experiment? After spectral derivatives, the spectra were processed by the optimal band combination algorithm. The algorithm selects the optimal bands, and carries on the mathematical transformation to the reflectance. In this process, the additional spectral noise caused by the high-order derivatives is removed. Furthermore, compared with low-order derivatives, high-order derivatives have the advantages of highlighting spectral information and separating the minor absorption peaks. Therefore, the accuracy of the models using the second derivative was similar to that of the first derivative, or even greatly improved. In addition to the removal of noise, the optimal band combination algorithm also has the ability to reduce the impact of irrelevant wavelength and cope with the overlapping absorption of soil components through considering wavelength interactions [48,75,76]. According to the spectral indices counted in Tables [56,77]. The spectral characteristics of soil organic matter (SOM) and clay are closely related to the spectral response of N-H, C-H and O-H covalent bonds [78]. This is consistent with the absorption characteristics obtained. According to previous studies, due to the strong absorption capacity of SOM, iron oxides and clay, the Vis-NIR spectroscopy monitoring of heavy metals is through association with proxies (i.e., SOM, iron and clay) [79,80]. Therefore, Vis-NIR spectroscopy estimation for Hg, Cr and Cu can be performed indirectly through these components or functional groups.
It is obvious that the spectra after the first derivative are quite different from those after the second derivative, and most of the subtle information changes are ignored. Tables 2-4 demonstrate how the correlation of spectral indices varies with the order of spectral derivatives. The characteristic bands of spectral indices have great variation in 1-order and 2-order. This showed that conventional spectral derivatives lost certain useful spectral information, and this loss of information is amplified by the optimal band combination algorithm. Additionally, the correlation coefficient of most spectral indices achieved the highest value between 1.2-order and 1.8-order, suggesting that FOD is able to detect more detailed spectral features compared with conventional spectral derivatives. Figure 3 shows the variation in model accuracy after FOD and the four modeling methods. It can be seen that the models using FOD achieved better accuracy in the estimation of three heavy metals, regardless of the modeling method used, and the optimal models appeared at the 1.8-order, 1.6-order, and 1.2-order. Consistent with the optimal models, the maximum values of mean RPD for the models were also achieved at 1.8-order, 1.6-order and 1.2-order, which corresponded to Hg, Cr and Cu, respectively ( Figure 4). This further indicates the advantages of high-order derivatives and the optimal band combination algorithm. Highorder derivatives can highlight the hidden features in the spectrum, and the additional noise introduced by the high-order derivatives can be removed by the optimal band combination algorithm. The incorporation of the optimal band combination algorithm and FOD can enhance the estimation accuracy of heavy metals.
In the study, the ANFIS, PLSR, GRNN, and RF methods were applied to compare the estimation accuracy of the three heavy metal concentrations. According to Tables 8-10, the model accuracy listed in Tables 5-7 is influenced not only by FOD but also by modeling methods, and the more intuitive comparison of modeling methods is shown by boxplots ( Figure 5). PLSR obtained the lowest mean value of model accuracy in the estimation of the three heavy metals. PLSR is only suitable for linear relationships [16,74,81]. However, there are various external or internal factors that can enhance the non-linear relationships between the Vis-NIR spectrum and soil components, such as measurement environment and characteristics of components [82,83]. Thus, compared with PLSR, the other three modeling methods are more suitable for estimation of heavy metals in soils due to their substantial non-linear prediction ability. Among ANFIS, RF and GRNN, GRNN showed the best performance in the estimation of heavy metal concentration. The GRNN has excellent capacity on non-linear mapping and learning speed; in particular, it has greater advantages in mapping unstable data to more accurate output [39,40,66]. Additionally, the range of target property is the fundamental factor of model accuracy [84]. The CV values reached 62.35%, 34.45%, and 28.58% for Hg, Cr, and Cu (Table 1), indicating heavy metal concentration in this study varied in a wide range with a high degree of dispersion. This is related to the outperformance of the models using GRNN.
In addition, the study also has some limitations. When dividing the dataset, more sample selection methods should be considered to enhance the stability of the experiment. For example, stratified sampling and Kennard-Stone sampling can be used to avoid taking the output property into account [85]. In the experiment, due to the high CV values of Hg, Cr and Cu in the samples, the models did not achieve high accuracy [84]. The representativeness of soil samples is significant for the estimation of heavy metals in soils using Vis-NIR spectroscopy. In future studies, this should be avoided by taking more soil samples. Furthermore, because of the optimal band combination algorithm, only five features were taken as input variables. This is related to the failure of RF to achieve better performance. Additionally, it is necessary to note the potential limitation of GRNN. It cannot ignore irrelevant inputs, thereby it is not likely to be suitable enough if existing more than five or six non-redundant inputs [66]. Thus, methods other than GRNN should be considered when using the full spectrum for modeling. The optimal results can be obtained by combining GRNN with appropriate dimensionality reduction methods, such as the optimal band combination algorithm.
The study of soil components is a multi-factor problem [86]. In the particular area, the lack of high-resolution spatial variability data always leads to substantial errors [87]. Using Vis-NIR spectroscopy can effectively reduce the workload of sampling, so as to obtain more spatial variability data. Thus, the incorporation of FOD, the optimal band combination algorithm and GRNN is a significant method for rapid evaluation of heavy metal concentration in agricultural soils.

Conclusions
In this study, the potential of Vis-NIR spectroscopy to estimate the concentration of Hg, Cr, and Cu in soils was evaluated. Through the incorporation of FOD, the optimal band combination algorithm and modeling methods, the optimal models of Hg, Cr, and Cu were obtained. The optimal validation accuracy of Hg (R 2 = 0.70, RPD = 1.86) and Cu (R 2 = 0.65, RPD = 1.73) was achieved by the 1.8-order and 1.2-order reflectance GRNN model, and the optimal estimated accuracy of Cr (R 2 = 0.69, RPD = 1.83) was achieved by the 1.6-order reflectance RF model. The optimal band combination algorithm is able to avoid the influence of spectral noise caused by high-order derivatives. Compared with conventional derivatives, FOD can identify the subtler spectral characteristics of heavy metals due to its gradual change in treatment of the spectrum. The high-order FOD is able to highlight hidden information and the separate minor absorbing peak. In addition, the incorporation of the optimal band combination algorithm and high-order FOD can further mine spectral information, ignoring noise. The optimal performance was achieved by the 1.8-order, 1.6-order, and 1.2-order spectra for Hg, Cr, and Cu, correspondingly. For estimation of heavy metals in soils, the modeling methods with the ability to solve nonlinear problems are more suitable. When using the appropriate dimensionality reduction method, GRNN provides an obvious improvement to the estimation accuracy of all studied heavy metals compared to ANFIS, PLSR, and RF. Thus, the incorporation of the optimal band combination algorithm, FOD, and GRNN for the rapid spectral estimation of Hg, Cr, and Cu concentration was proven to be feasible. Additionally, this study is vital for the application of Vis-NIR spectroscopy technology to the investigation of other heavy metal contaminants in soils. In the future, we will conduct more studies to detect the influence of GRNN and FOD on Vis-NIR spectroscopy estimation of soil heavy metals through large soil spectral libraries.
Author Contributions: All of the authors contributed to the study. X.X. conceived and designed the experiments, analyzed the data and wrote the manuscript. X.X. and Y.Z. processed the data. L.R., C.H., D.L. and F.A. contributed greatly to data collection. S.C. reviewed the manuscript. All authors have read and agreed to the published version of the manuscript. The correlations coefficients between Hg and the RI, DI, NDI, PI and SI indices of the 546 spectra after FOD are shown in Figure A1, Figure A2, Figure A3, Figure A4 and Figure