Application of a Hyperspectral Remote Sensing Model for the Inversion of Nickel Content in Urban Soil

Abstract: Hyperspectral remote sensing technology can provide a rapid and nondestructive method for detecting soil nickel (Ni) content. To select a highly effective method for estimating soil Ni content with a hyperspectral remote sensing technique, 88 soil samples were collected in Urumqi, northwest China, and their Ni contents and hyperspectral data were obtained. First, 12 spectral transformations were applied to the original spectral data. Then, Pearson's correlation coefficient analysis (PCC) and the competitive adaptive re-weighted sampling (CARS) method were used to select important wavelengths. Finally, partial least squares regression (PLSR), random forest regression (RFR), and support vector machine regression (SVMR) were used to establish hyperspectral inversion models of soil Ni content from the important wavelengths. The coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and residual prediction deviation (RPD) were selected to evaluate the inversion performance of the models. The results indicated that applying the PCC and CARS methods to the original and transformed wavebands can effectively improve the correlations between the spectral data and the Ni content of the soil in the study area. The random forest regression model based on the first-order differentiation of the reciprocal (RTFD-RFR) was the most stable and had the best inversion performance, with the highest predictive ability (R² = 0.866, RMSE = 1.321, MAE = 0.986, RPD = 2.210) for determining the Ni content in the soil. The RTFD-RFR method can be used for the inversion of Ni content in urban soil, and the results of the study can provide technical support for the hyperspectral estimation of the Ni content of urban soil.


Introduction
Nickel (Ni) is classified as a group 2B carcinogen by the International Agency for Research on Cancer (IARC) [1], and is potentially harmful to the safety of the entire urban ecosystem. In China, studies have shown that soil Ni contamination in the Pearl River Delta [2], Qingyuan City [3], Xiongan New Area in Beijing [4], Kashgar City in Xinjiang [5], and the Yellow River Basin [6] is associated with ecological and health risks, and that the carcinogenic risk index is higher for children than for adults. Other studies [7,8] found that human activities are potentially a significant source of Ni in soil. Given these potential ecological and health risks, it is imperative to closely monitor Ni contamination in areas with high emissions.
The traditional method for determining the Ni content in soil requires field sampling followed by laboratory analysis, which is time-consuming, costly, and inefficient [9,10]. Hyperspectral remote sensing technology has been applied to predict the physical, chemical, and biological properties of soil because it offers rapid, accurate, nondestructive, low-cost, and dynamic monitoring over a large area [11-13]. These properties include soil moisture [14,15], hydrocarbon content [16], nitrogen content [17], organic matter content [18], electrical conductivity [19], and salt content [20,21]. In recent years, hyperspectral remote sensing technology has shown good results in the prediction of Ni content in soil [22,23]. For instance, Xia et al. [24] used hyperspectral models to estimate the soil Ni content in the agricultural production areas of Zhejiang Province, China, and found that the PLSR model had the optimal predictive ability, with an R² of 0.77, RMSE of 5.51, and RPD of 1.94. Zhao et al. [25] reported that the predictive accuracy of the PLSR model (R² = 0.846, RMSEP = 15.795) was better than that of the SMLR model for the inversion of soil Ni content in Handan City, Hebei Province, China. Guo et al. [26] showed that the SD-MLR model had the best stability and accuracy (R² = 0.842, RMSE = 4.474) and could accurately predict the Ni content in soils of an iron mining area in Beijing. It has also been pointed out that the BPNN model is very effective in inverting soil Ni content in Beijing and in Yantai City, Shandong Province, China [27,28].
Due to the impact of strong human activities in cities, the level of Ni in urban soil is higher than that in farmland and natural soil [29]. Meanwhile, the most appropriate estimation model for soil Ni differs among soil types with different physical and chemical properties. Further, there is no unified standard for selecting appropriate models to estimate the Ni content of urban soil. So far, few published studies have addressed the hyperspectral estimation of Ni content in urban soil. Therefore, it is important to analyze the feasibility of the hyperspectral inversion of Ni content in urban soil.
In this study, the main goals were as follows: (a) identify the important spectral wavelengths of Ni in urban soil; (b) evaluate the efficiency of different spectral transformation methods for the inversion of Ni content; (c) select an optimum hyperspectral prediction model for the Ni content in urban soil from the partial least squares regression (PLSR), random forest regression (RFR), and support vector machine regression (SVMR) models. The results are intended to address the existing problems in the hyperspectral inversion of Ni content in urban soil.

Description of the Study Area
The experimental field (87°28′-87°37′ E and 43°48′-44°04′ N) was selected in the central part of Urumqi, the capital of Xinjiang and the core city on the Silk Road Economic Belt. It is situated on the southern edge of the Junggar Basin, in the arid northwest of China, and is one of the important metropolitan cities in NW China (Figure 1). The main soil type is grey desert soil [29], and the study area has a continental arid climate. The annual average temperature, precipitation, and evaporation are about 6.7 °C, 280 mm, and 2730 mm, respectively. In recent years, accelerated urbanization, the expanding scale of industrial and agricultural production, and increased emissions from production and daily life have led to Ni contamination of the soil in the study area [29,30].

Sample Collection and Analysis
A total of 88 topsoil samples were collected from the study area in April 2021, as illustrated in Figure 1. At each sample site, five sub-samples were taken from the topsoil layer (0-20 cm) within a 100 m × 100 m area and then mixed to form a composite soil sample weighing more than 500 g. All the soil samples were returned to the laboratory, naturally air dried, and passed through a 20-mesh sieve. Each sample was divided into two portions: one for the determination of the Ni content and another for the hyperspectral measurement. The soil Ni content was determined as described in "HJ 803-2016" [31], using an Inductively Coupled Plasma Mass Spectrometer (ICP-MS 7800). All the soil samples were tested repeatedly, and the determined consistency of the Ni measurements was 95.6%.

Spectrometric Determination
The spectra of the collected soil samples were measured using a FieldSpec® 3 portable spectrometer (Analytical Spectral Devices, ASD, Boulder, CO, USA). The data acquisition interval was 1 nm, with a spectral measurement range from 350 to 2500 nm. First, the instrument was preheated. Second, a 40 cm × 40 cm white board was placed on a 2 m × 2 m black cardboard for calibration to obtain the absolute reflectance before determining the spectral data. Finally, the soil samples were kept in a natural state on the black cardboard with the sensor probe held perpendicular, 15 cm above the soil surface, and the sensor probe was re-optimized against the white board every 5 min. A total of 15 replicate measurements were taken on each soil sample, giving 15 spectral curves per sample.

Experimental Data
Spectral Analysis

Spectral Data Pre-Processing
A total of 15 spectral curves were measured for each soil sample. The 15 curves were averaged using ViewSpecPro (Version 5.6) software, and the arithmetic mean was taken as the original reflectance spectrum of the soil sample. Due to the influence of the surrounding environment and the spectral instrument itself, the spectral bands within 350-399 nm, 1350-1430 nm, 1781-1970 nm, and 2401-2500 nm were excluded before constructing the hyperspectral models, which left a total of 1730 bands.
Finally, the original spectra of the soil samples were smoothed using the Savitzky-Golay (S-G) method. The S-G filter is a weighted filtering of the data, which has the advantage of retaining information about the variation of the signal more effectively while removing noise from the spectral curves and improving the smoothness of the spectrum. Figure 2 illustrates the reflectance curves of the original spectra and of the spectra processed by S-G smoothing.
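The pre-processing steps above (averaging the 15 replicates, dropping the four excluded ranges, then S-G smoothing) can be sketched as follows. This is only an illustrative sketch: the function name `preprocess`, the window length, and the polynomial order are assumptions, since the text does not state the S-G settings.

```python
import numpy as np
from scipy.signal import savgol_filter

# 1 nm sampling from 350 to 2500 nm, as stated in the text (2151 bands).
wavelengths = np.arange(350, 2501)

# Wavelength ranges excluded in the text (inclusive bounds, in nm).
EXCLUDED = [(350, 399), (1350, 1430), (1781, 1970), (2401, 2500)]


def preprocess(replicates, window_length=11, polyorder=2):
    """Average replicate spectra, drop excluded bands, and smooth.

    replicates: array of shape (n_replicates, 2151) for one soil sample.
    window_length and polyorder are assumed values, not from the paper.
    """
    mean_spectrum = np.asarray(replicates, dtype=float).mean(axis=0)

    keep = np.ones(wavelengths.shape, dtype=bool)
    for lo, hi in EXCLUDED:
        keep &= ~((wavelengths >= lo) & (wavelengths <= hi))

    # Same order as the text: exclusion first, smoothing last.
    smoothed = savgol_filter(mean_spectrum[keep], window_length, polyorder)
    return wavelengths[keep], smoothed
```

With these inclusive bounds, 421 of the 2151 bands are dropped, which leaves the 1730 bands reported in the text.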

Spectral Transformation
The original spectral response signals of Ni in the soil are weak, and it is difficult to identify the important wavelengths directly from the original spectral data [32]. In order to enhance the spectral information related to Ni in the soil samples, the original spectral reflectance data (R) were subjected to the following transformations: first-order differentiation (FD), second-order differentiation (SD), first-order differentiation of the reciprocal (RTFD), second-order differentiation of the reciprocal (RTSD), first-order differentiation of the logarithm (LTFD), second-order differentiation of the logarithm (LTSD), root mean square first-order differentiation (RMSFD), root mean square second-order differentiation (RMSSD), logarithmic first order differentiation of the reciprocal (ATFD), logarithmic second order differentiation of the reciprocal (ATSD), logarithmic first order differentiation of the reciprocal (RLFD), and logarithmic second order differentiation of the reciprocal (RLSD).
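A few of these transformations can be sketched with finite differences. This sketch covers only the six whose definitions are unambiguous in the text (FD, SD, RTFD, RTSD, LTFD, LTSD). The function name `transforms`, the use of `np.gradient` as the differentiation scheme, and the base-10 logarithm are assumptions.

```python
import numpy as np


def transforms(R, wl):
    """Return a dict of transformed spectra.

    R:  reflectance, shape (n_samples, n_bands), strictly positive.
    wl: wavelengths in nm, shape (n_bands,).
    """
    R = np.asarray(R, dtype=float)

    def d1(y):
        # First derivative with respect to wavelength (assumed scheme).
        return np.gradient(y, wl, axis=-1)

    def d2(y):
        # Second derivative as the derivative of the first derivative.
        return d1(d1(y))

    recip = 1.0 / R
    log_r = np.log10(R)  # log base is an assumption

    return {
        "R": R,
        "FD": d1(R), "SD": d2(R),
        "RTFD": d1(recip), "RTSD": d2(recip),
        "LTFD": d1(log_r), "LTSD": d2(log_r),
    }
```

The remaining six transformations would follow the same pattern once their exact formulas are fixed.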

Important Wavelengths Selection
First, Pearson's correlation coefficient analysis (PCC) was performed between the soil Ni content and the 13 forms of soil spectral data (the original spectrum and its 12 transformations), and the bands with larger correlation coefficients were screened out as important wavelengths for hyperspectral prediction modeling.
Second, important wavelengths were extracted from the original and the 12 types of transformed spectral data using Competitive Adaptive Re-weighted Sampling (CARS), which further removes wavelengths with low correlation [33,34].
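The PCC screening step can be sketched as a band-wise Pearson correlation followed by a cutoff on |r|. The function name `pcc_select` is hypothetical; the default cutoff of 0.272 is the value quoted in the model-establishment section of this paper. CARS is not sketched here.

```python
import numpy as np


def pcc_select(X, y, threshold=0.272):
    """Select bands whose |Pearson r| with y exceeds the threshold.

    X: spectra, shape (n_samples, n_bands); y: Ni content, shape (n_samples,).
    Returns (indices of selected bands, r for every band).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Constant bands give 0/0 = NaN, which never passes the threshold.
    with np.errstate(invalid="ignore", divide="ignore"):
        denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
        r = (Xc * yc[:, None]).sum(axis=0) / denom
        selected = np.flatnonzero(np.abs(r) > threshold)
    return selected, r
```

Running this once per transformed data set gives a separate list of important wavelengths for each transformation.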

Modelling of Hyperspectral Inversion
In order to consider both the Ni content vector and the spectral vector, the soil samples were randomly split into a modeling dataset (70 samples) and a validation dataset (18 samples). The modeling dataset was used to build the hyperspectral prediction models, while the validation dataset was used to test their accuracy. The partial least squares regression (PLSR), random forest regression (RFR), and support vector machine regression (SVMR) models were compared to select the optimum hyperspectral prediction model.
The PLSR algorithm considers both the spectral information (x) and the corresponding reference values (y) of the samples during modeling and transforms the original spectral data into mutually orthogonal, uncorrelated new variables via a linear transformation, thereby eliminating multicollinearity in the data [35].
The RFR technique is an ensemble data mining technique designed to produce accurate predictions that do not overfit the data. Each tree in the RFR ensemble is built on a bootstrapped training sample, and only a small fraction of the predictor variables is considered at each split; this ensures that the trees are decorrelated from each other [36]. The RFR technique is easy to run because it requires only three input parameters: two 'random_state' values and 'n_estimators'. The three input parameters are used to partition the modeling set and validation set and to determine the optimal partitioning of each tree node. Additionally, studies have shown that the three input parameters provide accurate results [32].
The SVMR model is a non-linear predictive model based on a kernel-based machine learning approach. It uses a specific kernel function to map the spectrum matrix into a high-dimensional feature space and establishes a hyperplane as a decision surface that separates the samples according to the margin-maximization principle, which in turn provides an inverse prediction of the heavy metal content of soil [37].

Model Validation
The coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and residual prediction deviation (RPD) were used to evaluate the prediction accuracy of the hyperspectral prediction models. A robust model has a higher R² and RPD and a lower RMSE and MAE [38]. When R² < 0.5, the prediction model has no predictive ability; when 0.5 ≤ R² < 0.7, the model has a preliminary predictive capability; and when R² ≥ 0.7, the model has a good predictive capability [39]. When RPD ≥ 2.0, the model has a good predictive ability; when 1.4 ≤ RPD < 2.0, the model has an initial predictive capability; and when RPD < 1.4, the model has a poor predictive capability. In general, a lower RMSE and MAE indicate better model prediction accuracy [40].
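The four metrics can be computed as in the sketch below. The text does not give explicit formulas, so the definitions used here are assumptions: R² as one minus the ratio of residual to total sum of squares, and RPD as the standard deviation of the measured values divided by the RMSE. The RPD definition agrees with the reported figures, since the validation-set SD of 2.92 mg/kg divided by the best RMSE of 1.321 gives about 2.21.

```python
import numpy as np


def evaluate(y_true, y_pred):
    """Return R2, RMSE, MAE and RPD for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred

    rmse = np.sqrt(np.mean(resid ** 2))
    mae = np.mean(np.abs(resid))
    # Assumed definition: 1 - SS_res / SS_tot.
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    # Assumed definition: sample SD of measured values over RMSE.
    rpd = np.std(y_true, ddof=1) / rmse

    return {"R2": r2, "RMSE": rmse, "MAE": mae, "RPD": rpd}
```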

Statistical Analysis of Ni Content in Soil
The statistical results of the Ni content of the soil samples in Urumqi are given in Table 1. Standard deviation (SD) and coefficient of variation (CV) analyses were used to measure data dispersion, where the CV complements the SD. Table 1 shows that the Ni content in the soil samples ranged from 10.00 to 29.00 mg/kg, with an average value of 18.52 mg/kg. The average Ni contents of the modeling set and validation set were 18.71 mg/kg and 17.78 mg/kg, respectively. The SD values of the modeling set and validation set were 3.59 and 2.92 mg/kg, respectively, and the CV values were 0.19 and 0.16, respectively. The average, SD, and CV values of the Ni content in the modeling set were thus similar to those of the validation set. This indicates that the division of soil samples was reasonable and can be used for subsequent model construction.
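As a quick check on the reported dispersion statistics, the CV values follow from the quoted means and SDs if CV is taken as SD divided by the mean (the usual definition, assumed here):

```python
# Mean and SD (mg/kg) of each subset, as quoted from Table 1 in the text.
stats = {"modeling": (18.71, 3.59), "validation": (17.78, 2.92)}

# CV = SD / mean, rounded to two decimals as in the text.
cv = {name: round(sd / mean, 2) for name, (mean, sd) in stats.items()}
print(cv)  # {'modeling': 0.19, 'validation': 0.16}
```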

Correlation between Ni Content and Reflectance Data of Soil
The PCC analysis was performed between the Ni content and the 13 forms of soil spectral data to identify the correlation between the Ni content and the spectral data of the soil samples. The degree of correlation was expressed by the Pearson coefficient (R), and significance was tested at the P < 0.01 level (two-sided).
In Figure 3, the original spectral data showed a weak correlation with the Ni content, with nine important wavelengths selected. The correlation analysis of the Ni content and the spectral data processed by differential transformations indicated that both positive and negative correlation coefficients had extreme values, and the positive and negative correlations of the filtered important wavelengths were more uniformly distributed. Thus, both the original spectrum and all the transformed spectra can filter out characteristic bands for data modeling, and the number of the important wavelengths descended in the order of: RTFD( 617 (17). It can be shown that the number of important wavelengths was less than the Pearson's correlation coefficient analysis expected for the original spectrum.

Establishment and Analysis of Hyperspectral Prediction Model
Partial least squares regression (PLSR), random forest regression (RFR), and support vector machine regression (SVMR) models were constructed for predicting the Ni content of soil in this study. Modeling was carried out in Python, and the "random_state" of the three models was set to two. Due to the randomness of the RFR model, its parameters ("n_estimators" and another "random_state") affect the predictive performance of the model. Considering model performance, model running time, sample number, and other factors, these RFR parameters ("n_estimators" and another "random_state") were set in the range from 1 to 99. In order to consider both the soil Ni content vector and the spectral vector, the soil samples were split into a modeling set and a validation set for modeling and verification, respectively. The modeling set and validation set contained 70 and 18 samples, respectively. According to the correlation coefficient between the Ni content and the spectral data, wavelengths whose correlation coefficients had absolute values of more than 0.272 under the processed spectral reflectance data were taken as important wavelengths. Then, the important wavelengths were selected as the independent variables (x), and the Ni content of the soil was selected as the dependent variable (y). The hyperspectral prediction models for the Ni content were established with the PLSR, RFR, and SVMR algorithms, respectively.
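One way to explore the RFR parameters over the stated range of 1 to 99 is a simple grid sweep, sketched below. The text does not say how the final values were chosen. Scoring each combination by R² on the validation set is therefore an assumption, and doing so tends to give optimistic scores because the same 18 samples are used for both selection and evaluation. The function name `sweep_rfr` is hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score


def sweep_rfr(X_tr, y_tr, X_va, y_va,
              n_estimators_grid=range(1, 100), seeds=range(1, 100)):
    """Return (best R2, n_estimators, random_state) over the grid.

    The selection criterion (validation R2) is an assumption.
    """
    best = (-np.inf, None, None)
    for n in n_estimators_grid:
        for seed in seeds:
            model = RandomForestRegressor(n_estimators=n, random_state=seed)
            model.fit(X_tr, y_tr)
            score = r2_score(y_va, model.predict(X_va))
            if score > best[0]:
                best = (score, n, seed)
    return best
```

The full 99 × 99 grid requires 9801 model fits, so a coarser grid may be preferable for a first pass.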

The Analysis of PLSR Model
The basic statistics related to the stability and accuracy of the PLSR model are given in Table 2. For the PLSR models based on the important wavelengths selected by PCC, R² ranged from 0.012 to 0.603, RMSE from 1.838 to 3.032, MAE from 1.248 to 1.520, and RPD from 0.963 to 1.589. The accuracy of the PLSR model based on the original spectrum (R) was lower than that of the models based on the 12 transformations (RMSFD, RMSSD, LTFD, LTSD, RLFD, RLSD, RTFD, RTSD, ATFD, ATSD, FD, SD). The FD-PLSR model had the best predictive capability (R² = 0.603, RMSE = 1.838, MAE = 1.249, RPD = 1.589). For the PLSR models based on the important wavelengths selected by CARS, the ranges of the R², RMSE, MAE, and RPD values were 0.388-0.736, 1.498-2.282, 1.144-1.309, and 1.280-1.949, respectively. The predictive accuracy of the PLSR model improved to different degrees for the 10 transformations (RMSFD, RMSSD, LTFD, RLFD, RLSD, RTFD, RTSD, ATFD, ATSD, SD) processed by CARS. The prediction accuracy of the CARS-RMSFD-PLSR model was the highest (R² = 0.736, RMSE = 1.498, MAE = 1.144, RPD = 1.949). Overall, for the PLSR models, CARS was superior to PCC, and the CARS-RMSFD-PLSR model was better than the FD-PLSR model. A map of the spatial distribution (Figure 4) illustrates the relationship between the measured and the CARS-RMSFD-PLSR predicted Ni content of the soil in the study area. The results indicated that the accuracy of the PLSR estimation model was not high.

The Analysis of RFR Model
In Table 3, the ranges of the R², RMSE, MAE, and RPD values for the RFR models based on the important wavelengths selected by PCC were 0.036-0.866, 1.321-9.096, 0.986-2.563, and 0.321-2.210, respectively. R² exceeded 0.5 in every case except the original spectrum (R), so the RFR model has good predictive ability. The accuracy of the RFR models based on the transformed spectra was higher than that of the model based on the original spectrum (R). The best inverse prediction model was the RTFD-RFR model (R² = 0.866, RMSE = 1.321, MAE = 0.986, RPD = 2.210). For the RFR models based on the important wavelengths selected by CARS, R² ranged from 0.408 to 0.837, RMSE from 1.388 to 5.037, MAE from 0.958 to 1.910, and RPD from 0.580 to 2.104. Except for three transformations (RLSD, ATSD, SD), the prediction accuracy of the models based on the remaining nine transformations declined to different degrees after CARS. The prediction accuracy of the CARS-ATSD-RFR model was the highest (R² = 0.837, RMSE = 1.388, MAE = 0.958, RPD = 2.104). For the RFR models, PCC was superior to CARS, and the RTFD-RFR model was better than the CARS-ATSD-RFR model.
A map of the spatial distribution (Figure 5) illustrates the relationship between the measured and the RTFD-RFR predicted Ni content of the soil in the study area. As shown in Figure 5, the estimated Ni content (Figure 5B), based on the RTFD-RFR method, showed very similar distribution patterns to the laboratory-measured Ni content (Figure 5A). The results illustrated in Figure 5 further indicate that the estimation model based on RTFD-RFR has a high degree of accuracy for estimating the Ni content in urban soil.

The Analysis of SVMR Model
The basic statistics related to the stability and accuracy of the SVMR model are given in Table 4. For the SVMR models based on the important wavelengths selected by PCC, R² ranged from 0.071 to 0.648, RMSE from 1.730 to 2.963, MAE from 1.408 to 2.415, and RPD from 0.985 to 1.688. The accuracy of the SVMR model based on the original spectrum (R) was lower than that of the models based on the 12 transformations (RMSFD, RMSSD, LTFD, LTSD, RLFD, RLSD, RTFD, RTSD, ATFD, ATSD, FD, SD). The ATSD-SVMR model had the best predictive capability (R² = 0.648, RMSE = 1.730, MAE = 1.408, RPD = 1.688). For the SVMR models based on the important wavelengths selected by CARS, the ranges of the R², RMSE, MAE, and RPD values were 0.106-0.630, 1.774-2.757, 1.353-2.120, and 1.059-1.646, respectively. The prediction accuracy of the CARS-RLSD-SVMR model was the highest (R² = 0.630, RMSE = 1.774, MAE = 1.353, RPD = 1.646). For the SVMR models, PCC was better than CARS, and the ATSD-SVMR model was better than the CARS-RLSD-SVMR model.
A map of the spatial distribution (Figure 6) illustrates the relationship between the predicted and measured Ni content of the soil in the study area. As shown in Figure 6, the predictive accuracy of the SVMR estimation model was not high.

Discussion of Optimal Prediction Models
The hyperspectral inversion of soil Ni content is influenced by two aspects. On the one hand, estimating Ni content in soil using hyperspectral remote sensing is a cost-efficient method, but it is challenging due to the effects of natural environmental conditions and soil properties [41]. On the other hand, high data dimensionality is a common problem in hyperspectral data processing [32], so the inversion accuracy of the constructed model is biased by redundant spectra and noise [42]. Wang et al. [40] have pointed out that spectral transformation is an effective approach for identifying the highly correlated spectral bands. Yuan et al. [33] reported that the CARS method can effectively eliminate redundant information, greatly improve the correlation, and increase the efficiency. Chen et al. [43] constructed a model based on the random forest regression technique that can effectively predict the soil Ni content in Taiwan Province, China (R² = 0.63, RMSE = 150.02, MAPE = 31.93). Zhang et al. [44] have pointed out that the CWT-RBF hyperspectral inversion model (R² = 0.88, RMSE = 2.41, RPD = 3.91) is an effective method for predicting the soil Ni content in Gulin County, Sichuan Province, China. Hou et al. [14] concluded that the PLSR model (R² = 0.879, RMSE = 1.292) is an ideal model for predicting the soil Ni content of Zoucheng, Shandong Province, China. Yang et al. [45] found that the partial least squares regression model has the highest predictive ability for the inversion of soil Ni content in the Shizishan mining area in Tongling City, Anhui Province.
In this study, the inversion accuracy of the Ni content in soil can be ranked as follows: R²(RTFD-RFR) > R²(CARS-ATSD-RFR) > R²(CARS-RMSFD-PLSR) > R²(ATSD-SVMR) > R²(CARS-RLSD-SVMR) > R²(FD-PLSR). Therefore, in terms of the inversion accuracy of the Ni content in soil, the RFR method performs significantly better than the PLSR and SVMR methods. As shown in Tables 2-4, the fitness, stability, and accuracy of the inversion models are changed to different degrees by the processing methods applied to the original spectral data (R). The best inverse prediction model is the RTFD-RFR model (R² = 0.866, RMSE = 1.321, MAE = 0.986, RPD = 2.210), which has a better ability to invert the soil heavy metal content in the study area. The scatter plots of the Ni content modeled by the RTFD-RFR and R-RFR models are shown in Figure 7.
pointed out that spectral transformation is an effective approach for identifyin highly correlated spectral bands.Yuan et al. [33] believed that using the CARS m can effectively eliminate redundant information, greatly improve the correlation increase the efficiency.Chen et al. [43] have constructed a model, based on the ran forest regression technique, that can effectively predict the soil Ni content in Ta Province, China (R 2 = 0.63, RMSE = 150.02,MAPE = 31.93).Zhang et al. [44] have po out that the CWT-RBF hyperspectral inversion model (R 2 = 0.88, RMSE = 2.41, RPD = is a great method to predict the soil Ni content in Gulin County, Sichuan Province, C Hou et al. [14] have analyzed and concluded that the PLSR model (R 2 = 0.879, RM 1.292) is an ideal model for predicting the soil Ni content of Zoucheng, Shandong ince, China.Yang et al. [45] have found that the partial least squares regression mod the highest predictive ability for the inversion of soil Ni content in the Shizishan m area in Tongling city, Anhui Province.
In this study, the inversion accuracy of the Ni content in soil can be ranked a lows: R 2 RTFD-RFR > R 2 CARS-ATSD-RFR > R 2 CARS-RMSFD-PLSR > R 2 ATSD-SVMR > R 2 CARS-RLSD-SVMR > R 2 F Therefore, combined with the performance of the inversion accuracy of the Ni cont soil, the inversion accuracy of the RFR method is significantly better than that of the and SVMR methods.As shown in Tables 2-4  The R 2 calculated by the RFR model based on the RTFD transformation of th portant spectral wavelengths is significantly higher than that modelled from the or spectral data (R), and both the RMSE and MAE are significantly decreased.From F 7, the accuracy of the RFR model based on R was not high, and the R 2 was 0.036 modelling prediction accuracy of the RTFD transformation was improved signific and the predicted and measured values presented strong agreement with each o with an R 2 of 0.866, which was improved by 0.830 compared with the R-RFR mode measured values of the Ni content of the soil samples in this study had a small ran fluctuation and no extreme points, but whether they were modelled with the or spectrum or with the transformed important spectral wavelengths, the predicted v were slightly higher when the measured values were low, and slightly lower whe measured values were high, which suggests that the inaccuracy may have come fro The R 2 calculated by the RFR model based on the RTFD transformation of the important spectral wavelengths is significantly higher than that modelled from the original spectral data (R), and both the RMSE and MAE are significantly decreased.From Figure 7, the accuracy of the RFR model based on R was not high, and the R 2 was 0.036.The modelling prediction accuracy of the RTFD transformation was improved significantly, and the predicted and measured values presented strong agreement with each other, with an R 2 of 0.866, which was improved by 0.830 compared with the R-RFR model.The measured values of the Ni content of the soil samples in 
this study had a small range of fluctuation and no extreme points, but whether they were modelled with the original spectrum or with the transformed important spectral wavelengths, the predicted values were slightly higher when the measured values were low, and slightly lower when the measured values were high, which suggests that the inaccuracy may have come from the extreme points; however, due to the limited number of sample points in this study, there were no extreme sample points, which was a limitation of the study.
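The accuracy indices used to compare the models can be sketched as follows. This is a minimal illustration that assumes common definitions: R² as one minus the ratio of the residual to the total sum of squares, and RPD as the sample standard deviation of the measured values divided by the RMSE. The exact formulas used in this study may differ (for example, R² taken as the squared correlation coefficient), and the function name `evaluate` is hypothetical.

```python
import math


def evaluate(measured, predicted):
    """Return R2, RMSE, MAE and RPD for paired measured/predicted values.

    Assumed definitions (not confirmed by the paper):
      R2  = 1 - SS_res / SS_tot
      RPD = sample standard deviation of measured values / RMSE
    """
    n = len(measured)
    residuals = [m - p for m, p in zip(measured, predicted)]
    mean_measured = sum(measured) / n

    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in residuals) / n
    sd = math.sqrt(ss_tot / (n - 1))

    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "MAE": mae,
        "RPD": sd / rmse,
    }


# Toy values (hypothetical), not data from this study.
metrics = evaluate([1.0, 2.0, 3.0, 4.0, 5.0], [1.5, 2.5, 3.5, 4.5, 5.5])
print(metrics)
```

A higher R² and RPD together with a lower RMSE and MAE indicate a better inversion model under these definitions.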
Overall, this work describes a fast and convenient method for estimating the Ni content in soil, which provides an effective way of predicting the Ni content in urban soil. The results obtained in this study are in agreement with those of previous studies [11,38–40], which implies that the Ni content of soil can be assessed using hyperspectral remote sensing technology with reasonable accuracy. The results support the use of a remote sensing approach for characterizing the Ni content in urban soil.

Conclusions
To find an optimal inversion model for predicting soil Ni content, hyperspectral prediction models were constructed based on the important wavelengths and the Ni content of field soil samples. The results of this study lead to the following conclusions:
1. Spectral transformation combined with Pearson's correlation coefficient analysis and the CARS method can markedly reduce the interference of the environmental background and improve the correlations between the spectral reflectance data and the Ni content of soil. However, the spectral reflectance data correlate differently with the Ni content under different spectral processing methods. The first-order differentiation of the reciprocal (RTFD) enhances the spectral features most significantly.
2. The RTFD-RFR model is more stable and has the best inversion effects, with the highest predictive ability (R² = 0.866, RMSE = 1.321, MAE = 0.986, RPD = 2.210) for determining the Ni content in soil in the research region. The RTFD-RFR model can be used as a means of predicting the Ni content in urban soil.
Overall, the results of this study demonstrate the possibility of directly applying hyperspectral remote sensing approaches to estimating the Ni content in urban soil. This method can provide technical support for the hyperspectral inversion and rapid detection of soil Ni content. However, this study did not combine the hyperspectral data with remote sensing imagery; this combination needs to be verified in subsequent studies.

Figure 1. Location of experimental field and sample sites.

Figure 2. The spectral reflectance curves of soil of the original spectra (a) and the spectra processed by the Savitzky-Golay smoothing (b).

Figure 4. Comparison of Ni distribution maps of field measured values (A) and predicted values by PLSR (B).

Figure 5. Comparison of Ni distribution maps of field measured values (A) and predicted values by RFR (B).

Figure 6. Comparison of Ni distribution contour maps of field measured values (A) and predicted values by SVMR (B).

Figure 7. Measured and RTFD-RFR predicted values of Ni content in soil.

Table 1. Statistical values of Ni content in soil in Urumqi.

Table 2. Statistics of accuracy parameters of PLSR model for soil Ni content in Urumqi.

Table 3. Statistics of accuracy parameters of RFR model for Ni content of soils in Urumqi.

Table 4. Statistics of accuracy parameters of SVMR model for Ni content of soils in Urumqi.
