Repaid Identification and Prediction of Cadmium–Lead Cross-Stress of Different Stress Levels in Rice Canopy Based on Visible and Near-Infrared Spectroscopy

Accurate detection of cadmium (Cd) and lead (Pb)-induced cross-stress on crops is essential for agricultural, ecological environment, and food security. The feasibility to diagnose and predict Cd–Pb cross-stress in agricultural soil was explored by measuring the visible and near-infrared reflectance of rice leaves. In this study, two models were developed—namely a diagnostic model and a prediction model. The diagnostic model was established based on visible and near-infrared reflectance spectroscopy (VNIRS) datasets with Support Vector Machine (SVM), followed by leave-one-out cross-validation (LOOCV). A partial least-squares (PLS) regression, as the prediction model was employed to predict the foliar concentration of Cd and Pb contents. To accurately calibrate the two models, a rigorous greenhouse experiment was designed and implemented, with 4 levels of treatments on each of the Cd and Pb stress on rice. Results show that with the appropriate pre-processing, the diagnostic model can identify 79% of Cd and 85% of Pb stress of any levels. The significant bands that have been used mainly distributed between 681–776 nm and 1224–1349 nm for Cd stress and 712–784 nm for Pb stress. The prediction model can estimate Cd with coefficient of determination of 0.7, but failed to predict Pb accurately. The results illustrated the feasibility to diagnose Cd stress accurately by measuring the visible and near-infrared reflectance of rice canopy in a cross-contamination soil environment. This study serves as one step forward to heavy metal pollutant detection in a farmland environment.


Introduction
Over recent decades, the accumulation of heavy metals in agricultural soil has been an important issue worldwide related to environmental pollution and human risk [1]. Distinguishing the types and levels of heavy metals in agricultural soil is of crucial importance for food safety and health risks, which does help to prevent and curb heavy metal pollution. Therefore, it is necessary to detect and quantify heavy metal contaminations in agricultural soil.
Rice was employed to be an indicator of heavy metal pollution in farmland, due to its extensive planting in the vast areas of China and Southern India. Thanks to serious pollution, Cd and Pb were selected to be the polluted types [43,44]. To effectively control the environmental variables, such as temperature, humidity, and light, the experiment was carried out in a glass greenhouse with a stable temperature of 25°-30°C and a humidity of 60-80%. The greenhouse has a ceiling that can be opened and closed by a switch. 576 pots of rice were planted in hydroponics and will be assigned to 96 groups, with 6 pots per group. The stress levels of heavy metals could be accurately controlled during the experiment. The nutrient solution recommended by the International Rice Research Institute was used as the basic rice hydroponic solution, while cadmium chloride (CdCl2) and lead nitrate (Pb(NO3)2) were used as the solute to regulate Cd and Pb contents in nutrient solution. solid plastic boxes with painted black were used as the containers for nutrient solution, and black foam boards with holes were used as fixed supports for rice seedlings above the boxes. These nutrient solutions with different Cd-Pb concentrations were replaced every ten days to ensure the rice was not under the other stress due to poor nutrient ingredients.
We designed a Cd-Pb cross stressed experiment with four different stress levels by referencing the controlled concentrations in farmland in GB15618-2018 of China [45]. The four different stress levels were 0mg/L, 2mg/L, 5mg/L and 8mg/L of Cd and 0mg/L, 50mg/L, 100mg/L and 500mg/L of Pb, and then, there were, totally, 16 groups with different combinations. The detailed concentrations, corresponding abbreviations and the combinations are showed in Figure. 1. In the following statement, the abbreviations were going to be used to represent the corresponding treatments. The experimental process is showed in Figure. 2.

Spectrum and Cd-Pb Contents in Leaf Measurement
The spectral reflectance data of the rice canopy were measured inside the greenhouse with ceiling opened, by a FieldSpec3 portable spectroradiometer (Analytical Spectral Devices, Inc., USA) at 10:00-14:00 local time, under cloud-free or near-cloudless weather conditions [17,46]. The spectral instrument covers the VNIRS region from 350 nm to 2500 nm with two different spectral resolutions: 3 nm for 350-1000 nm and 10 nm for 1000-2500nm, respectively.
At the late booting stages of rice, each group of rice were sampled for the reflective spectra of their canopy. At this stage, when measured from nadir direction at 30 cm height, there are roughly 90% of foliar tissue and 10% dark background within the field-of-view. Spectra were acquired with 8 averages of slightly different centering points of the samples. Spectral measurements were calibrated with a white standard reference panel, resulting in 96 reflective spectra for the 96 groups of rice canopy.
After the spectral measurement, these rice leaves were collected in groups and refrigerated in the laboratory with the temperature of 16°-18°C. In the lab, rice leaves were ground to homogenate and digested for later use. The linear regression equation of the relationship between absorbance and concentration was found by blank test. Under the same conditions, an equal amount of sample solution was put into the graphite furnace to measure the absorbance value; the above linear equation was taken to obtain the sample Cd/Pb contents. The Cd/Pb contents in the sample was calculated as follows: where C1 and C0 are the Cd/Pb contents with a unit of ng/mL in the digestive solution and the blank control solution, respectively, while v is the total volume of the sample digestive solution with a unit of Ml and m is the mass of the sample with a unit of g.

Spectrum and Cd-Pb Contents in Leaf Measurement
The spectral reflectance data of the rice canopy were measured inside the greenhouse with ceiling opened, by a FieldSpec3 portable spectroradiometer (Analytical Spectral Devices, Inc., USA) at 10:00-14:00 local time, under cloud-free or near-cloudless weather conditions [17,46]. The spectral instrument covers the VNIRS region from 350 nm to 2500 nm with two different spectral resolutions: 3 nm for 350-1000 nm and 10 nm for 1000-2500nm, respectively.
At the late booting stages of rice, each group of rice were sampled for the reflective spectra of their canopy. At this stage, when measured from nadir direction at 30 cm height, there are roughly 90% of foliar tissue and 10% dark background within the field-of-view. Spectra were acquired with 8 averages of slightly different centering points of the samples. Spectral measurements were calibrated with a white standard reference panel, resulting in 96 reflective spectra for the 96 groups of rice canopy.
After the spectral measurement, these rice leaves were collected in groups and refrigerated in the laboratory with the temperature of 16-18 • C. In the lab, rice leaves were ground to homogenate and digested for later use. The linear regression equation of the relationship between absorbance and concentration was found by blank test. Under the same conditions, an equal amount of sample solution was put into the graphite furnace to measure the absorbance value; the above linear equation was taken to obtain the sample Cd/Pb contents. The Cd/Pb contents in the sample was calculated as follows: where C 1 and C 0 are the Cd/Pb contents with a unit of ng/mL in the digestive solution and the blank control solution, respectively, while v is the total volume of the sample digestive solution with a unit of Ml and m is the mass of the sample with a unit of g.

Spectral Pre-Processing
To reduce the random error of sampling randomly, the spectral means of eight randomly selected measured spots in same stress pre-processing were calculated in the MATLAB 2015a platform, and Remote Sens. 2020, 12, 469 5 of 17 then the calculated values were used as the spectrum for the corresponding stress treatments of single acquisition. As a result, there were a total of 96 spectra, each group containing six. To decrease instrument noises and to improve the signal-to-noise ratio, the spectral bands (1351-1460 nm, 1801-2030 nm and 2351-2500 nm) were removed from calculated reflectance spectra, and the remaining hyperspectral datasets were processed by the following pretreatments: Derivative pretreatment, including first derivative and second derivative, eliminated the interferences of background noise, resolved overlapping spectra, and minimized additive baseline drift of raw spectral reflectance [18]. Savitzky-Golay smoothing [47] was applied to remove random noise and to increase spectral data quality. Normalization and standardization decreased redundant information and extracted the spectral difference, which did help to remove unnoticeable weight. Besides the above single pre-processing methods, we tried to combine two or more single pretreatment methods together, and took the priority of pre-processing methods into consideration to find appropriate combinations to improve diagnostic accuracies [19]. As a result, the study employed 10 different pretreatments. Table 1 displayed the pre-processing methods employed in this study and their corresponding abbreviations. In the combined pre-processing methods, the order of abbreviations represented the priority of pre-processing methods. Take NorSG1D as an example. The abbreviation represented first normalization, then Savitzky-Golay smoothing followed by 1st-Derivative. Table 1. Pre-processing methods and their corresponding abbreviations.

Spectral Dimension Reduction
(Two-way analysis of variance) ANOVA was performed to extract significant bands from spectral variables for Cd-Pb cross-stress. Analysis of variance was used to test the difference between multiple means to determine whether a stressed factor had a significant effect on the spectral changes. ANOVA was applied to discriminate the factors of spectral change in a certain band, and these factors included Cd stress, Pb stress, and conjunct stress. These bands with spectral changes that were only related to one stressed factor were selected to be used as the significant bands. By comparing the significance value with a preset confidence (p = 0.05), the corresponding bands with lower value than the preset significance value were selected as significant band for subsequent diagnosis and prediction.

Diagnosis of Heavy Metals
SVM transfers training data into a higher-dimensional feature space using a kernel function and then computes separating hyperplanes as a result of achieving maximum separation between the classes [48]. It is often used to diagnose and classify the hyperspectral datasets with a small scale. In this study, an SVM with the linear kernel function was employed to diagnose the levels of Cd-Pb cross-stress. By pre-training, the optimal combination of parameters was determined.

Prediction of Cd-Pb Contents
PLSR is often preferred to quantitatively derive information from hyperspectral reflectance spectra [8]. When calculating the principal components, PLSR is more interpretable, and it is the more efficient algorithm than PCR in relating the selected significant bands to heavy metals contents. PLSR associates the partial least-squares (PLS) method with a classical multivariate linear regression to explain the correlation between the selected significant bands from VNIRS datasets and the Cd-Pb contents: where X R M,N is a matrix of M spectra samples with N spectral bands,ŷ ∈ R M,1 is the vector of the predicted Cd or Pb values of the M rice leaf samples,b ∈ R N,1 is the vector of the estimated PLSR regression coefficients (b-coefficients) and b 0 is the intercept [49]. Significant bands in different pretreatment methods were used as independent variables to predict Cd-Pb contents in the leaves of rice. For each prediction, two thirds of hyperspectral samples were randomly selected as calibration dataset, and one third remaining samples were treated as test dataset. The model with the best evaluation performance was selected as the final model. SVM is also widely used for its robust prediction tasks. This paper adopts SVMs with radial basis kernel function. The hidden node is the inner product of the input sample and a support vector, and the output node is the linear combination of the hidden layer output. The input variables of the prediction model are the pre-processed significant bands. The output target value is the predicting concentration of contaminations. Finally, accuracy of prediction was calculated against those of PLSR. Method with higher performance is employed in this study.

Diagnostic Models Evaluation
Leave-one-out cross-validation was employed to evaluate the diagnostic performance of SVM model in the study, whereby each hyperspectral dataset was diagnosed by remaining datasets [50]. By this way, it provided nearly unbiased predictions even with the limitation of sample numbers [51]. For 96 hyperspectral samples total, the diagnostic model was run 96 times, and the mean of 96 diagnostic results was regarded as the accuracies of diagnostic model.

Predictive Models Evaluation
The prediction models, created between the heavy metal (Cd and Pb) contents and spectral variables, were evaluated based on the coefficient of determination (r 2 ), the root mean squared error (RMSE) and the ratio of the interquartile distance of measured values to RMSE (RPIQ) [52,53], which were given by: where y i and y i are the measured and predicted values of the Cd/Pb contents, and y is the average measured value and N is the hyperspectral sample number. By analyzing the predicted performance of all significant bands, the predicted results with the highest r 2 would be used as the final prediction model. IQ is the interquartile distance, and SEP represents the standard error of prediction.

Leaf Cd-Pb Contents
The statistical descriptions of leaf Cd-Pb contents for the whole data sets, calibration data sets, and validation data sets are shown in Table 2. These statistical characters, such as mean and standard deviation (SD), were similar to three data sets with a small difference. These characteristic similarities indicated that the randomly selected calibration and validation data sets could effectively represent the whole data sets.

Significant Bands
The numbers of significant bands varied with different pretreatment methods for Cd and Pb, as displayed in Figure 3.
performance of all significant bands, the predicted results with the highest would be used as the final prediction model. IQ is the interquartile distance, and SEP represents the standard error of prediction.

Leaf Cd-Pb Contents
The statistical descriptions of leaf Cd-Pb contents for the whole data sets, calibration data sets, and validation data sets are shown in Table 2. These statistical characters, such as mean and standard deviation (SD), were similar to three data sets with a small difference. These characteristic similarities indicated that the randomly selected calibration and validation data sets could effectively represent the whole data sets.

Significant Bands
The numbers of significant bands varied with different pretreatment methods for Cd and Pb, as displayed in Figure. 3. The number of significant bands selected for Pb stress was no more than 10 in any pretreatment with the most bands of nine, while the most pre-processing methods for Cd stress were less than 10 with the most bands of 13. The minimum number of significant bands for Cd and Pb were 3 and 4, respectively, which could not be selected by the same pre-processing method.
For Cd, half of pretreatments could select the significant bands with no more than 5, three of which have selected the 3, including NorStaSG2D, NorSGSta2D, and SGSta2D. Like Cd, there were three pre-processing methods that selected the minimum number of four for Pb stress, such as NorStaSG1D, StaSGNor1D, and SGNor1D, as illustrated in Figure 3.

Spectral Response to Single Contamination
In Figure 4, the ranges of significant bands for Cd (a) and Pb (b), and the foliar spectral responses to different level of Cd (c) and Pb (d) in these spectral ranges were plotted.
with the most bands of 13. The minimum number of significant bands for Cd and Pb were 3 and 4, respectively, which could not be selected by the same pre-processing method.
For Cd, half of pretreatments could select the significant bands with no more than 5, three of which have selected the 3, including NorStaSG2D, NorSGSta2D, and SGSta2D. Like Cd, there were three pre-processing methods that selected the minimum number of four for Pb stress, such as NorStaSG1D, StaSGNor1D, and SGNor1D, as illustrated in Figure. 3.

Spectral Response to Single Contamination
In Figure. 4, the ranges of significant bands for Cd (a) and Pb (b), and the foliar spectral responses to different level of Cd (c) and Pb (d) in these spectral ranges were plotted. Significant bands detected with ANOVA for Cd-stressed are mainly distributed in the two extents of 681-776 nm and 1224-1349 nm, while that for Pb-stressed were concentrated distribution in the range from 712 nm to 784 nm. The Cd-Pb spectral response showed reverse patterns. With the increasing concentration of Cd, the foliar reflectance generally increases except the extent at 681-699 nm, while with the increasing concentration of Pb, the reflectance generally drops. However, this pattern is not applicable to the reflectance of unpolluted plants. Apart from the main ranges of significant bands, plotted in Figure. 4, there were some bands to scatter at 1461 nm for Cd and 815 nm, 1280-1281 nm, 1476 nm, and 1601 nm. The foliar reflectance characteristics of these scatted significant bands showed the same pattern with the corresponding contaminative heavy metals.

Diagnosis of Cd-Pb Cross-Stress with Different Stress Levels
All pre-processing methods were employed one by one for the selection of significant bands and for the further diagnosis of the specific stress levels that the rice was subject to. Based on the preset diagnostic labels, the selected significant bands were input to the SVM diagnostic model to distinguish specific stress levels and other stress levels, and the overall accuracies were the output diagnostic accuracies. Take the diagnosis of zero-Pb stress level (C-ZPb) as an example. As shown in Table 3, there was a total of 96 samples, including 24 zero-Pb stress level samples and 72 other stress levels samples. Based on pretreatment of NorStaSG1D, 21 zero-Pb stress level samples and 64 other stress level samples were diagnosed correctly, and total 85 samples were diagnosed with the correct stress level and as a result, the overall accuracy was 0.89. Significant bands detected with ANOVA for Cd-stressed are mainly distributed in the two extents of 681-776 nm and 1224-1349 nm, while that for Pb-stressed were concentrated distribution in the range from 712 nm to 784 nm. The Cd-Pb spectral response showed reverse patterns. With the increasing concentration of Cd, the foliar reflectance generally increases except the extent at 681-699 nm, while with the increasing concentration of Pb, the reflectance generally drops. However, this pattern is not applicable to the reflectance of unpolluted plants. Apart from the main ranges of significant bands, plotted in Figure 4, there were some bands to scatter at 1461 nm for Cd and 815 nm, 1280-1281 nm, 1476 nm, and 1601 nm. The foliar reflectance characteristics of these scatted significant bands showed the same pattern with the corresponding contaminative heavy metals.

Diagnosis of Cd-Pb Cross-Stress with Different Stress Levels
All pre-processing methods were employed one by one for the selection of significant bands and for the further diagnosis of the specific stress levels that the rice was subject to. Based on the preset diagnostic labels, the selected significant bands were input to the SVM diagnostic model to distinguish specific stress levels and other stress levels, and the overall accuracies were the output diagnostic accuracies. Take the diagnosis of zero-Pb stress level (C-ZPb) as an example. As shown in Table 3, there was a total of 96 samples, including 24 zero-Pb stress level samples and 72 other stress levels samples. Based on pretreatment of NorStaSG1D, 21 zero-Pb stress level samples and 64 other stress level samples were diagnosed correctly, and total 85 samples were diagnosed with the correct stress level and as a result, the overall accuracy was 0.89. Table 3. Diagnostic details of C-ZPb with NorStaSG1D pre-processing (C-ZPb is zero Pb concentration of cross stress, and NorStaSG1D is a pre-processing method with normalization, standardization, Savitzky-Golay smoothing, and 1st-Derivative, successively). The complete diagnostic accuracies were showed in Table 4. When only a single pretreatment was employed, the accuracies reached 0.60. When three or more pretreatment methods were combined, the diagnostic accuracies boosted to more than 0.8. The bolds represented that the diagnostic accuracy in any stress level was not less than 0.75 and 0.85 for Cd and Pb, respectively. All diagnostic accuracies reached 0.75 for all pretreatments. The accuracies of Cd-stressed levels ranged from 0.75 to 0.90, with the highest of 0.90 for C-HCd. The accuracies of Pb-stressed levels ranged from 0.79 to 0.92. The highest accuracy was 0.91 for C-HPb.

Sample Number Correct Diagnosis Incorrect Diagnosis Accuracy
For Cd stress in cross-stress, all pretreatments that reached accuracies of not less than 0.75 no matter what Cd-stressed levels the rice was subject to. Especially there were three pretreatments with an accuracy of not less than 0.79, including NorStaSG1D, SGNor2D, and SGSta1D. Based on the pretreatments of SGNor2D, all accuracies reached 0.81 in any Cd stress level. When diagnosing whether the rice was subject to zero-Cd stress level (C-ZCd), accuracies ranged from 0.81 to 0.86 from different pre-processing reflective spectra. the highest accuracy was 0.86 based on the four pretreatments, including NorSG1D, NorStaSG1D, SGNor1D, and SGSta1D.; for low-Cd stress level (C-LCd), the overall diagnostic accuracies were distributed in the range from 0.75 to 0.81, and the highest accuracy was 0.81 based on the significant bands of SGNor2D. Half of pretreatments reached diagnostic accuracies of not less than 0.77, which was a greatly acceptable result; for medium-Cd stress level (C-MCd), the accuracies exceeded 0.80except SGSta2D at 0.79. its highest value reached 0.90 in the pretreatment of SGNor1D;; for high-Cd stress level (C-HCd), half of pretreatments reached a diagnostic accuracy of no less than 0.85, which indicated that any of them could diagnose C-HCd from four stress levels in cross-stress. There were three pretreatments that reached 0.90, including NorStaSG1D, SGNor1D, and SGSta1D.
For Pb stress in cross-stress, among all pretreatments, there were six pretreatments that reached accuracies of not less than 0.85, no matter what Pb-stressed levels the rice was subject to. Especially for NorSGSta1D and SGSta1D, the diagnostic accuracies were not less than 0.88 in any Pb stress level. When diagnosing whether the rice was subject to zero-Pb stress level (C-ZPb), and the diagnostic accuracies ranged from 0.86 to 0.91 from different pre-processing reflective spectra, four of which reached 0.88. The highest was 0.91 based on the pretreatment of NorSGSta1D and SGSta1D. for medium-Pb stress level (C-MPb), the overall diagnostic accuracies were distributed to the extent of 0.81 to 0.90. with more than half of the pretreatments, these corresponding diagnostic accuracies were not less than 0.85, and the highest accuracy reached 0.90 in NorSGSta1D and SGSta1D. All diagnostic accuracies for C-HPb reached 0.85,and apart from SGNor2D at 0.85, the remaining all exceeded 0.85. The highest reached 0.92 with the pretreatment of NorSG1D.
When considering comprehensively diagnostic accuracies of Cd-stressed and Pb-stressed in different stress levels, there were four pretreatments that reached satisfying diagnostic accuracies-namely NorStaSG1D, NorSGSta1D, SGNor1D, and SGSta1D. With especially SGSta1D, the diagnostic accuracies of the four Cd-stressed levels were 0.86, 0.79, 0.85 and 0.90, respectively, while the diagnostic accuracies of the four Pb-stressed levels were 0.91, 0.90, 0.88 and 0.90, respectively. These satisfying diagnostic accuracies indicated that the feasibility of using VNIRS of rice canopy to diagnose the Cd-stressed and Pb-stressed levels in cross-stress.

Predictions Cd and Pb Contents
Compared with SVM, PLSR performed better with higher accuracy of prediction. Therefore, in this section, results of PLSR are listed. See Section 4.2 for the comparison between the two methods.
Based on PLSR modeling and r 2 evaluating, two predicted models with the highest r 2 value were selected as the Cd (a) and Pb (b) contents prediction model, respectively, as shown in Figure 5.
not less than 0.85, and the highest accuracy reached 0.90 in NorSGSta1D and SGSta1D. All diagnostic accuracies for C-HPb reached 0.85,and apart from SGNor2D at 0.85, the remaining all exceeded 0.85. The highest reached 0.92 with the pretreatment of NorSG1D.
When considering comprehensively diagnostic accuracies of Cd-stressed and Pb-stressed in different stress levels, there were four pretreatments that reached satisfying diagnostic accuraciesnamely NorStaSG1D, NorSGSta1D, SGNor1D, and SGSta1D. With especially SGSta1D, the diagnostic accuracies of the four Cd-stressed levels were 0.86, 0.79, 0.85 and 0.90, respectively, while the diagnostic accuracies of the four Pb-stressed levels were 0.91, 0.90, 0.88 and 0.90, respectively. These satisfying diagnostic accuracies indicated that the feasibility of using VNIRS of rice canopy to diagnose the Cd-stressed and Pb-stressed levels in cross-stress.

Predictions Cd and Pb Contents
Compared with SVM, PLSR performed better with higher accuracy of prediction. Therefore, in this section, results of PLSR are listed. See section 4.2 for the comparison between the two methods.
Based on PLSR modeling and r 2 evaluating, two predicted models with the highest r 2 value were selected as the Cd (a) and Pb (b) contents prediction model, respectively, as shown in Figure. 5. The predictions of Cd contents reached satisfying results. The of prediction was 0.70. However, both of our PLS and SVM models failed to predict the Pb concentration accurately, with a low of 0.13 and a severely biased residual. The results indicated that it was only feasible to predict foliar Cd concentration quantitatively, and the prediction of Pb contents needs further investigation.

Comparisons of Pre-Processing Methods
The pre-processing methods NorStaSG1D and SGSta1D reached greatly satisfying diagnostic accuracies. All of them were above 0.80, except for C-LCd at 0.79, as shown in Figure. 6. Therefore, the abovementioned two pre-processing methods should be given the priority for diagnosing Cd-Pb cross-stress. Some significant bands, selected from the dataset after pre-processing NorStaSG1D and SGSta1D, are likely to be used as sensitive indices to quickly detect the Cd and Pb stress in leaves, which does help to apply remote sensing technology in large-scale heavy metal pollution detecting. The predictions of Cd contents reached satisfying results. The r 2 of prediction was 0.70. However, both of our PLS and SVM models failed to predict the Pb concentration accurately, with a low r 2 of 0.13 and a severely biased residual. The results indicated that it was only feasible to predict foliar Cd concentration quantitatively, and the prediction of Pb contents needs further investigation.

Comparisons of Pre-Processing Methods
The pre-processing methods NorStaSG1D and SGSta1D reached greatly satisfying diagnostic accuracies. All of them were above 0.80, except for C-LCd at 0.79, as shown in Figure 6. Therefore, the abovementioned two pre-processing methods should be given the priority for diagnosing Cd-Pb cross-stress. Some significant bands, selected from the dataset after pre-processing NorStaSG1D and SGSta1D, are likely to be used as sensitive indices to quickly detect the Cd and Pb stress in leaves, which does help to apply remote sensing technology in large-scale heavy metal pollution detecting. Figure 6. Diagnostic accuracies of different types and stress levels with NorStaSG1D (Left) and SGSta1D (Right). (NorStaSG1D is a pre-processing method with normalization, standardization, Savitzky-Golay smoothing, and 1st-Derivative, successively, while SGSta1D is a pre-processing method with Savitzky-Golay smoothing, standardization, and 1st-Derivative, successively) We have found that the combination of pre-processing methods made significant difference in diagnostic accuracies for Cd-Pb cross-stress, which coincides with Shi's results, as shown in the reference [19]. For example, compared with the pre-processing method of NorSG, the combined method of NorSG1D showed a raised trend, no matter what types and stress levels the rice was subject to, as shown in Figure. 7. (NorSG is a pre-processing method with normalization and Savitzky-Golay smoothing, successively, while NorSG1D is a pre-processing method with normalization, Savitzky-Golay smoothing, and 1st-Derivative, successively) For almost all pretreatments, the corresponding diagnostic accuracies became higher with the increase of stressed levels. It is likely that the higher stressed levels of Cd-Pb cross-stress, the more obvious the change in some rice biochemical parameters (chlorophyll, water content, nitrogen content, etc.), which can be more easily detected and identified by VNIRS. These quantified stress levels can be gradually reduced to investigate the reliability of VNIRS in diagnosing low-level heavy metal pollution [13]. (NorStaSG1D is a pre-processing method with normalization, standardization, Savitzky-Golay smoothing, and 1st-Derivative, successively, while SGSta1D is a pre-processing method with Savitzky-Golay smoothing, standardization, and 1st-Derivative, successively).
We have found that the combination of pre-processing methods made significant difference in diagnostic accuracies for Cd-Pb cross-stress, which coincides with Shi's results, as shown in the reference [19]. For example, compared with the pre-processing method of NorSG, the combined method of NorSG1D showed a raised trend, no matter what types and stress levels the rice was subject to, as shown in Figure 7. We have found that the combination of pre-processing methods made significant difference in diagnostic accuracies for Cd-Pb cross-stress, which coincides with Shi's results, as shown in the reference [19]. For example, compared with the pre-processing method of NorSG, the combined method of NorSG1D showed a raised trend, no matter what types and stress levels the rice was subject to, as shown in Figure. 7. (NorSG is a pre-processing method with normalization and Savitzky-Golay smoothing, successively, while NorSG1D is a pre-processing method with normalization, Savitzky-Golay smoothing, and 1st-Derivative, successively) For almost all pretreatments, the corresponding diagnostic accuracies became higher with the increase of stressed levels. It is likely that the higher stressed levels of Cd-Pb cross-stress, the more obvious the change in some rice biochemical parameters (chlorophyll, water content, nitrogen content, etc.), which can be more easily detected and identified by VNIRS. These quantified stress levels can be gradually reduced to investigate the reliability of VNIRS in diagnosing low-level heavy metal pollution [13]. (NorSG is a pre-processing method with normalization and Savitzky-Golay smoothing, successively, while NorSG1D is a pre-processing method with normalization, Savitzky-Golay smoothing, and 1st-Derivative, successively).
For almost all pretreatments, the corresponding diagnostic accuracies became higher with the increase of stressed levels. It is likely that the higher stressed levels of Cd-Pb cross-stress, the more obvious the change in some rice biochemical parameters (chlorophyll, water content, nitrogen content, etc.), which can be more easily detected and identified by VNIRS. These quantified stress levels can be gradually reduced to investigate the reliability of VNIRS in diagnosing low-level heavy metal pollution [13].
We also investigated the prevailing pre-processing methods of using VNIRS to estimate Cd-Pb concentrations in brief, as shown in Table 5.  [12,13,42,56], but part of them were new conclusions, such as 880 nm and 1461 nm for Cd and 784 nm and 1601 nm for Pb. There is a closed relation between the reflectance of 670-760 nm and content of chlorophyll and nitrogen [57], some of which were same to the Cd-Pb-stressed significant bands, indicating that Cd stress and Pb stress also affected the rice photosynthesis. In addition, the Cd stresses also affected the N-H stretch, which was sensitive in the range from 1005-1015 nm [58]. These results may service the wavelength selection of hyperspectral sensors in the future.
Spectral pre-processing methods, such as SG, 1D, and average, et al. [4,12,13,56] were employed to reduce noise levels. GA and OSC are relatively less commonly used to derive stable hyperspectral signatures [12,42]. PCA is employed for spectral dimension reduction. Besides Table 5, two innovative methods, a genetic algorithm-based stacking algorithm [38] and a memory-based learning approach [39], may be worth exploring to diagnose and predict heavy metals with VNIRS datasets.

Comparisons of Estimated Models
Besides the prediction of PLSR, SVM was also conducted to estimate the Cd-Pb contents, but the latter performed not better than the former. Some evaluated parameters were showed in Table 6. Compared with SVM, PLSR predicted better with lower RMSE and RPIQ. Due to the poor performance of SVM in predicting, only PLSR was used to predict the Cd-Pb contents in rice canopy. Recently, Tsakiridis and Tziolas proposed two novel methods, namely a genetic algorithm-based stacking algorithm [38] and a memory-based learning approach [39], which presented an increase in estimated accuracies and a decrease in the RMSE about 6.47%. These innovative solutions could be considered to predict heavy metals contents and to evaluate the predictive performance in the future, especially for Pb contents prediction, which might improve the estimated accuracies and the robustness of predictive models. Table 6. Evaluated parameters of support vector machine (SVM) prediction and partial least squares regression (PLSR) predication.

Models
Factor To conform to the performance of predicted results, we investigated some research about estimating the Cd contents and Pb contents by VINRS in recent years. The details were displayed in Table 7.  [56] Almost all Cd-Pb diagnosis are based on soil spectra measured in the laboratory, and the values of r 2 range from 0.21 to 0.86, with great fluctuation. There is one study of the spectral monitoring part in the leaf only, without the strict conditions of the greenhouse. Although the highest value of r 2 is above our article, the study includes 36 samples with a single Cd stress only. In our study, there were 96 samples, which is nearly three times the samples number in the reference [55], and our study is on Cd-Pb cross-stress with four different stressed levels. The highest r 2 of Pb contents prediction was 0.73 in the above research [59], but the study involves amounts of soil sampling and wet chemical analysis. Therefore, our prediction of Cd contents was acceptable, but that of Pb stress need to be improved in the future work.

Investigations of Limits and Future Study
The study only explored the feasibility of diagnosing and predicting Cd and Pb due to limited experimental site and samples. However, there are often cross-stress of various heavy metals, such as Zn, Cu, Hg, Cr, Ni, etc. in farmland soil [11,14,41]. Therefore, it is necessary to explore possibilities of using VNIRS to diagnose and predict quantitatively the certain stress from multiple stressed types.
It is worthy to further explore modeling methods and observatory angles. In this study, we have used PLS and SVM solely to estimate the Cd-Pb concentrations in rice foliar based on the canopy reflectance. More sophisticated modeling methods, such as random forest and Lasso regression, are worthy exploring in the future studies. In addition, this study diagnosed and predicted the Cd-Pb cross-stress only from canopy hyperspectral reflectance, while other studies found that the diagnostic accuracies could be improved by combining the hyperspectral reflectance of rice plants and soil [34]. This is also interesting to verify on our target pollutant and try to use combined spectra to estimate, one step toward the reality.

Conclusions
In this paper, a Cd-Pb cross-stress experiment of heavy metals was designed and the feasibility of using rice canopy spectra to diagnose and predict Cd and Pb in cross-stress was explored.
It is found that the significant bands were mainly distributed in the extent of 681-776 nm and 1224-1349 nm for Cd stress and 712-784 nm for Pb stress. With the increasing concentration, the foliar reflectance generally increases for Cd stress, while it drops for Pb stress.
On one hand, all diagnostic accuracies reached 0.75 under any Cd-Pb-stressed level combination, and with the proper spectral pretreatment, the accuracy may exceed 0.85 with very few exceptions. On the other hand, the r 2 was 0.70 and 0.13 for Cd and Pb contents prediction. We conclude from our limited sample size in a greenhouse condition that from the VNIRS of rice canopy it is possible to diagnose in which single or combined stress the rice crop is suffering, and it is possible to predict the foliar Cd concentration from rice canopy. However, it is worth noting that this is only one example for a trend, while a solid conclusion still needs further studies in the field and with a sufficient sample size.
Author Contributions: S.Z. writes the draft and rectifies the manuscript, and T.F. makes an investigation, rectify the manuscript, and funds the project. J.L. and Y.H. rectify the draft and make parts of visualization works. S.W. and Y.L. contribute to data measurement. Y.C. gives constructive suggestions and provides the software. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.