Analysis of Chlorophyll Concentration in Potato Crop by Coupling Continuous Wavelet Transform and Spectral Variable Optimization

Abstract: The analysis of chlorophyll concentration based on spectroscopy is of great importance for monitoring the growth state and guiding the precision nitrogen management of potato crops in the field. A suitable data processing and modeling method could improve the stability and accuracy of chlorophyll analysis. To develop such a method, we collected the modelling data by conducting field experiments at the tillering, tuber-formation, tuber-bulking, and tuber-maturity stages in 2018. Chlorophyll analysis models were established using the partial least squares (PLS) algorithm based on original reflectance, standard normal variate reflectance, and wavelet features (WFs) under different decomposition scales ($2^1$–$2^{10}$, Scales 1–10), which were optimized by the competitive adaptive reweighted sampling (CARS) algorithm. The performances of the various models were compared. The WFs under Scale 3 had the strongest correlation with chlorophyll concentration, with a correlation coefficient of −0.82. In the model calibration process, the optimal model was the Scale3-CARS-PLS, which was established based on the sensitive WFs under Scale 3 selected by CARS, with the largest coefficient of determination of the calibration set ($R^2_c$) of 0.93 and the smallest $R^2_c - R^2_{cv}$ value of 0.14. In the model validation process, the Scale3-CARS-PLS model had the largest coefficient of determination of the validation set ($R^2_v$) of 0.85 and the smallest root-mean-square error of validation (RMSEV) of 2.77 mg/L, demonstrating good prediction capability for chlorophyll concentration. Finally, the analysis performance of the Scale3-CARS-PLS model was measured using the testing data collected in 2020; the $R^2$ and RMSE values were 0.69 and 3.36 mg/L, showing excellent applicability. Therefore, the Scale3-CARS-PLS model could be used to analyze chlorophyll concentration. This study indicated the best decomposition scale of the continuous wavelet transform and provided an important support method for chlorophyll analysis in potato crops.


Introduction
Potato (Solanum tuberosum) is the world's fourth-largest food crop following rice, wheat, and maize [1,2]. Chlorophyll, as the essential photosynthetic pigment of potato leaves, reflects growth information about plant health [3] and photosynthetic rate [4], and its content is also significantly [...]

In 2020, the testing experiment was conducted at the Shang Zhuang Experiment Station of China Agricultural University in Beijing, China (40°08′12″ N, 116°10′44″ E), as shown in Figure 1. The cultivar of the potato crop was Dutch. Owing to the epidemic in the spring of 2020, the potato crop was planted on 5 June 2020, almost two months later than in 2018, and spectral data were collected on 11 July, 21 July, 30 July, and 12 August. N application and field management practices were similar to those of the 2018 experiments, and the sampling and data collection methods were the same as in 2018. Although the growth stages of this experiment might not exactly match the stages in 2018, the collected data could be used to test the effectiveness of the proposed methods for analyzing chlorophyll concentration in the potato canopy. In total, 160 samples were collected from four growth stages as the testing dataset in this paper. Details about the potato growth stages and sampling dates are given in Table 1.
Regarding spectral measurements and leaf sampling, one potato plant was randomly selected in each plot; its canopy spectrum was measured three times, and the average was taken as the canopy spectrum of the sample. The reflectance spectra were measured using an ASD FieldSpec-HandHeld-2 spectrometer (Analytical Spectral Devices, Boulder, CO, USA), which has a wavelength range of 325-1075 nm, a step interval of 1 nm, a spectral resolution < 3 nm, an integration time ≥ 8.5 ms, and a standard field-of-view of 25°. There were 751 wavelength variables per spectrum. During data collection, the ASD device was held directly above the sample plant canopy at a vertical distance of about 30 cm. From this geometry, the sensor footprint on the potato plant canopy was about 0.02 m². The spectral reflectance was corrected against a standard calibration whiteboard (Spectralon Standard Correction Board, Labsphere Co., Ltd., North Sutton, NH, USA) every 10 min to eliminate the interference of variations in solar-illumination intensity on the spectral data.

Chlorophyll Content Measurement
Three leaves in each sample plant canopy were randomly collected and put into a fresh-keeping bag, which was numbered and stored in a portable thermal insulation box. The chlorophyll concentration was then determined in the laboratory using standard chemical methods [44]. Each potato leaf was cut into pieces, and about 0.04 g of the pieces from each leaf was placed in 25 mL of a mixture of acetone and anhydrous ethanol (volume ratio 2:1) to extract the chlorophyll. The extraction solution was kept in darkness for 24 h. The absorbance of the extraction solution at 645 and 663 nm was then measured using a visible-infrared spectrophotometer (UV-752, Shimadzu, Kyoto, Japan), which covers the wavelength range of 200-1000 nm with a step interval of 0.1 nm, a single-beam optical system, a tungsten lamp and deuterium lamp as the light source, and a spectral bandwidth of 4 nm. Chlorophyll concentration was calculated by Equations (1)–(3), where $A_{645}$ and $A_{663}$ are the absorbance at 645 and 663 nm, respectively; $C_a$ and $C_b$ are the concentrations of chlorophyll-a and chlorophyll-b, respectively; and $C_t$ is the total chlorophyll concentration, whose unit is mg/L in this study.
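The chlorophyll equations themselves are not reproduced in this copy of the text. Purely as an illustration of how the two absorbances map to $C_a$, $C_b$, and $C_t$, the sketch below uses the widely cited Arnon (1949) coefficients for acetone extracts. These coefficients are an assumption: the values in [44] for the acetone and ethanol mixture used here may differ, and the function name is hypothetical.

```python
def arnon_chlorophyll(a645, a663):
    """Return (C_a, C_b, C_t) in mg/L from absorbance at 645 and 663 nm.

    Coefficients are the classic Arnon (1949) values, assumed here for
    illustration only; they are not taken from this study.
    """
    c_a = 12.7 * a663 - 2.69 * a645   # chlorophyll-a
    c_b = 22.9 * a645 - 4.68 * a663   # chlorophyll-b
    c_t = c_a + c_b                   # total chlorophyll
    return c_a, c_b, c_t


# Example with made-up absorbance readings.
c_a, c_b, c_t = arnon_chlorophyll(0.30, 0.80)
print(round(c_a, 3), round(c_b, 3), round(c_t, 3))
```

Whatever coefficient set is used, $C_t$ is the sum of $C_a$ and $C_b$, so it can equivalently be written as a single linear combination of $A_{645}$ and $A_{663}$.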

Data Analysis
The main data-processing steps are shown in Figure 2. The first part transformed the original reflectance spectra (Ref) into SNV reflectance (SNV), obtained by standard normal variate correction, and into wavelet features (WFs), obtained by continuous wavelet transform (CWT). The second part established the analysis models, including PLS models based on the full set of spectral wavelengths and CARS-PLS models based on the sensitive wavelength variables selected by the CARS algorithm. The third part compared the chlorophyll analysis performance of the various models.
Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 22

SNV Correction
SNV is an established method that removes both additive and multiplicative effects in spectral data [45,46]. In SNV, each spectrum is centered and then scaled by its own standard deviation, as in Equation (4):

$$z_i = \frac{x_i - \mu}{\sigma} \quad (4)$$

where $x_i$ is the reflectance at $i$ nm, $\mu$ is the average reflectance of the spectrum, $\sigma$ is the standard deviation of the spectrum, and $z_i$ is the reflectance at $i$ nm after SNV. In this work, the reflectance spectra corrected by SNV are denoted as SNV reflectance (SNV).
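As a minimal sketch of the SNV correction described above, the following function centers one spectrum and scales it by its own standard deviation. It assumes the sample standard deviation (n − 1 denominator), which is a common choice but is not stated in the text; the function name is illustrative.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center one spectrum, then scale by its own std."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (assumed convention)
    return [(x - mu) / sigma for x in spectrum]


# A spectrum and a copy distorted by a gain (multiplicative) and an
# offset (additive) give the same SNV result.
raw = [0.10, 0.12, 0.08, 0.45, 0.50]
distorted = [2.0 * x + 0.05 for x in raw]
print([round(z, 4) for z in snv(raw)])
print([round(z, 4) for z in snv(distorted)])
```

The two printed rows agree, which is the sense in which SNV removes additive and multiplicative effects on a per-spectrum basis.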

CWT
Mathematically, CWT is a linear operation that convolves the reflectance spectrum with a scaled and shifted mother wavelet. The transform is given by Equation (5):

$$W_f(a,b) = \langle f, \psi_{a,b} \rangle = \int_{-\infty}^{+\infty} f(\lambda)\,\psi_{a,b}(\lambda)\,d\lambda, \qquad \psi_{a,b}(\lambda) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{\lambda - b}{a}\right) \quad (5)$$

where $\psi(\lambda)$ is the mother wavelet function, $f(\lambda)$ is the reflectance spectrum, and $W_f(a,b)$ is the wavelet coefficient (denoted as $WF_{a,b}$) for the scaling factor $a$ and the shifting factor $b$. The scaling factor indicates the width of the scaled mother wavelet. The scaling factors used in this study were the dyadic scales $2^n$ ($n = 1, 2, \ldots, 10$), denoted as Scale 1, Scale 2, ..., Scale 10, sequentially. The shifting factor is the central wavelength of the shifted mother wavelet.
The physical and chemical components of crops have characteristic spectral absorption features. The shifting factor $b$ can be used to locate the peak or valley of an absorption feature, and the scaling factor $a$ can be matched to the width of an absorption feature. A crop leaf reflectance spectrum in the 325-1075 nm range consists of a background continuum on which a number of absorption features attributable to pigments, water, and dry matter are superimposed [30]. Previous research has suggested that the shape of the absorption features is similar to that of a Gaussian function [47] or a combination of multiple Gaussian functions [48]. Thus, the second derivative of the Gaussian, also known as the Mexican Hat, was used as the mother wavelet function in this study.
All CWT operations were carried out using the IDL 6.3 Wavelet Toolkit (ITT Visual Information Solutions, Boulder, CO, USA). The CWT transformed the one-dimensional SNV spectra into two-dimensional wavelet power maps indexed by the scaling (frequency scale) and shifting (spectral wavelength) factors. According to previous literature, the scaling factor from 1 to 3 belongs to low frequency, the scaling factor from 4 to 7 belongs to middle frequency, and the scaling factor from 8 to 10 belongs to high frequency [30][31][32][33][34]. The sensitive spectral variables for potato chlorophyll could then be selected from these wavelet coefficients.
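The transform can be sketched as a direct discrete sum over a sampled spectrum. This is an illustration only, not the IDL Wavelet Toolkit implementation: the unnormalised Mexican Hat form, the $1/\sqrt{a}$ factor, the truncation of the wavelet at five scale widths, and measuring the shift in sample positions (1 nm steps) are all assumptions, and the function names are hypothetical.

```python
import math


def mexican_hat(t):
    """Mexican Hat mother wavelet (unnormalised negative 2nd derivative of a Gaussian)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_scale(spectrum, a):
    """Wavelet coefficients of `spectrum` at scale `a` for every shift b (in samples)."""
    n = len(spectrum)
    half = int(math.ceil(5 * a))  # the wavelet is negligible beyond |t| > 5
    coeffs = []
    for b in range(n):
        acc = 0.0
        for k in range(max(0, b - half), min(n, b + half + 1)):
            acc += spectrum[k] * mexican_hat((k - b) / a)
        coeffs.append(acc / math.sqrt(a))
    return coeffs


# Synthetic Gaussian feature centered at sample 50 with a width of 4 samples.
spectrum = [math.exp(-((k - 50) ** 2) / (2 * 4.0 ** 2)) for k in range(101)]

# Dyadic scales 2^1 .. 2^5 (the study uses 2^1 .. 2^10 on 751-point spectra).
power_map = {n: cwt_scale(spectrum, 2.0 ** n) for n in range(1, 6)}
peak_shift = max(range(101), key=lambda b: power_map[2][b])
print(peak_shift)  # shift with the largest coefficient at scale 2^2
```

In this toy case the largest coefficient at scale $2^2$ sits at the center of the feature, illustrating how the shift locates a feature while the scale responds to its width.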

CARS
CARS, proposed by imitating the "survival of the fittest" principle of Darwin's theory of evolution, is an efficient strategy for selecting sensitive variables according to the absolute values of the regression coefficients (|α|) [43]. The steps of CARS can be summarized as follows [49,50]. First, the |α| values are computed and used as indices to evaluate the importance of each variable. Second, N subsets are selected by N Monte Carlo sampling runs based on the |α| of each variable. Third, a two-step procedure involving an exponentially decreasing function (EDF) and adaptive reweighted sampling (ARS) is used to select sensitive variables. In this step, the EDF is used to remove the variables whose regression coefficients are relatively small in each sampling run. Following this EDF-based enforced removal, ARS further eliminates variables in a competitive way. Finally, the above three steps are repeated over the sampling runs, the standard error of cross-validation is obtained for each subset, and the optimal subset of variables is then selected.
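The EDF and ARS steps can be sketched as follows, assuming the |α| values from a PLS fit are already available. The EDF form $r_i = a\,e^{-ki}$ with $a = (p/2)^{1/(N-1)}$ and $k = \ln(p/2)/(N-1)$ follows the original CARS formulation as generally described; all function names are hypothetical, and the PLS fitting and cross-validation that a full CARS run needs are omitted.

```python
import math
import random


def edf_ratio(i, n_runs, n_vars):
    """Fraction of variables kept at sampling run i (1-based): r_i = a * exp(-k * i)."""
    k = math.log(n_vars / 2.0) / (n_runs - 1)
    a = (n_vars / 2.0) ** (1.0 / (n_runs - 1))
    return a * math.exp(-k * i)


def cars_step(abs_coefs, active, i, n_runs, n_vars, rng):
    """One CARS run: enforced removal by EDF, then adaptive reweighted sampling."""
    n_keep = min(len(active), max(2, round(n_vars * edf_ratio(i, n_runs, n_vars))))
    # EDF: keep the n_keep variables with the largest |alpha|.
    ranked = sorted(active, key=lambda j: abs_coefs[j], reverse=True)[:n_keep]
    # ARS: sample with replacement, probability proportional to |alpha|,
    # and keep the distinct variables that were drawn.
    weights = [abs_coefs[j] for j in ranked]
    drawn = rng.choices(ranked, weights=weights, k=len(ranked))
    return sorted(set(drawn))


# All 751 wavelengths are kept in the first run and only 2 in the last.
print(round(751 * edf_ratio(1, 50, 751)), round(751 * edf_ratio(50, 50, 751)))

rng = random.Random(0)
coefs = [0.9, 0.05, 0.7, 0.01, 0.3, 0.02]  # made-up |alpha| values
print(cars_step(coefs, list(range(6)), 25, 50, 6, rng))
```

In a complete implementation, the PLS model would be refitted on the surviving variables after each run, and the subset with the lowest cross-validation error would be retained.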

PLS Method
The PLS regression method proposed by Geladi [51] was used to solve multicollinearity problems among variables. PLS regression simultaneously performs principal component decomposition on the spectral reflectance matrix and the leaf chlorophyll concentration matrix [52], which are correlated with each other during the decomposition. A linear regression model was then established between them to analyze the chlorophyll concentration of potato leaves. To prevent model overfitting, internal cross-validation was performed by leave-one-out cross-validation (LOOCV), and the optimal number of latent variables was selected based on the largest coefficient of determination of the cross-validation set ($R^2_{cv}$). The program package of the SNV, CARS, and PLS algorithms is available at http://www.libpls.net/index.php.
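The selection of the number of latent variables by LOOCV can be sketched with a small single-response PLS (PLS1, NIPALS-style) written from scratch. This is not the libPLS code used in the study; it is a minimal illustration under the assumption of mean-centered, unscaled data, and every function name is hypothetical.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_comp):
    """Fit PLS1 by NIPALS; returns (x_mean, y_mean, [(weights, loadings, q), ...])."""
    n, p = len(X), len(X[0])
    x_mean = [sum(r[j] for r in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[r[j] - x_mean[j] for j in range(p)] for r in X]  # centered spectra
    f = [v - y_mean for v in y]                            # centered response
    comps = []
    for _ in range(n_comp):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(_dot(w, w))
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]                    # scores
        tt = _dot(t, t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = _dot(f, t) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps used in fitting."""
    x_mean, y_mean, comps = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = _dot(e, w)
        y_hat += q * t
        e = [a - t * l for a, l in zip(e, load)]
    return y_hat


def loocv_r2(X, y, n_comp):
    """R^2 of leave-one-out predictions for a given number of latent variables."""
    preds = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        preds.append(pls1_predict(model, X[i]))
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, preds))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def best_n_components(X, y, max_comp):
    """Pick the number of latent variables with the largest cross-validated R^2."""
    scores = {k: loocv_r2(X, y, k) for k in range(1, max_comp + 1)}
    return max(scores, key=scores.get), scores


# Toy data: the third column is the sum of the first two (collinear).
c1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
c2 = [1.0, -1.0, 2.0, -2.0, 0.0, 1.0]
X = [[a, b, a + b] for a, b in zip(c1, c2)]
y = [2 * a - b + 1 for a, b in zip(c1, c2)]
print(best_n_components(X, y, 2)[0])
```

The toy matrix is deliberately collinear, which ordinary least squares on the raw columns cannot handle but PLS can, because it regresses on a small number of latent variables instead.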

CWT-CARS-PLS
A new spectral data analysis method named CWT-CARS-PLS is proposed in this study. Variable selection by CARS removes uninformative variables and enhances PLS model performance; thus, CARS combined with PLS regression (CARS-PLS) is an effective algorithm for establishing a quantitative analysis model. CWT transforms the one-dimensional SNV spectra into two-dimensional wavelet coefficients, and through this decomposition it can reduce the high-frequency noise of the spectral data and extract valuable spectral variables. CWT combined with CARS-PLS (CWT-CARS-PLS) can therefore identify sensitive WFs and establish a high-performance analysis model. The proposed CWT-CARS-PLS algorithm is briefly introduced in Figure 3. All data calculations, including SNV correction, PLS, CARS-PLS, and CWT-CARS-PLS, were completed using MATLAB R2018a software.

Model Evaluation Indicators
To establish the analysis model, the modelling dataset was divided into a calibration set and a validation set using the sample-set partitioning based on joint X-Y distance (SPXY) algorithm. This algorithm accounts for differences among samples in both the independent and the dependent variables [53,54].
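A minimal sketch of an SPXY-style split is given below. It assumes the usual formulation: the Euclidean distance between spectra and the absolute difference between reference values are each divided by their maximum, summed into a joint distance, and samples are then picked by the Kennard-Stone max-min rule. The function name and toy data are hypothetical.

```python
import math


def spxy_split(X, y, n_cal):
    """Return (calibration indices, validation indices) by joint X-Y distance."""
    n = len(X)
    dx = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    dy = [[abs(y[i] - y[j]) for j in range(n)] for i in range(n)]
    max_dx = max(max(row) for row in dx)
    max_dy = max(max(row) for row in dy)
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)] for i in range(n)]

    # Start with the two samples that are farthest apart.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda ij: d[ij[0]][ij[1]])
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]

    # Repeatedly add the sample farthest from its nearest selected neighbour.
    while len(selected) < n_cal:
        nxt = max(remaining, key=lambda k: min(d[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Toy example: five one-band "spectra" with matching reference values.
X = [[0.0], [1.0], [2.0], [3.0], [10.0]]
y = [0.0, 1.0, 2.0, 3.0, 10.0]
cal, val = spxy_split(X, y, 3)
print(sorted(cal), sorted(val))
```

Because the procedure starts from the most distant pair, the extreme samples tend to land in the calibration set, so the calibration range usually covers the validation range.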
The calibration set (200 samples) was used to train the PLS model. The validation set (114 samples) was used to verify the performance of the established analysis model. The performance of the PLS model was evaluated with the coefficient of determination ($R^2$) and the root-mean-square error (RMSE) as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - y_i^*)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - y_i^*)^2}$$

where $y_i$ and $y_i^*$ are the measured and predicted chlorophyll concentrations for sample $i$, respectively, $\bar{y}$ is the average value of measured chlorophyll, and $n$ is the number of samples in the calibration or validation set. The difference ($R^2_c - R^2_{cv}$) between the $R^2$ of the calibration set ($R^2_c$) and the $R^2$ of cross-validation ($R^2_{cv}$) can be used as an indicator of model stability, and a smaller $R^2_c - R^2_{cv}$ value implies a more stable model. Furthermore, the $R^2$ of the validation set ($R^2_v$) and the RMSE of the validation set (RMSEV) can be used to evaluate the accuracy of the PLS model, and a higher $R^2_v$ and smaller RMSEV indicate a better model with stronger predictive capability.
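The two indicators can be computed directly from paired measured and predicted values, as in this short sketch. The function name and the example numbers are illustrative and are not data from the study.

```python
import math


def r2_rmse(y_true, y_pred):
    """Coefficient of determination and root-mean-square error."""
    n = len(y_true)
    y_bar = sum(y_true) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - y_bar) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


measured = [10.0, 20.0, 30.0]   # made-up chlorophyll values, mg/L
predicted = [12.0, 18.0, 30.0]
r2, rmse = r2_rmse(measured, predicted)
print(round(r2, 2), round(rmse, 3))  # 0.96 1.633
```

Applying the same function to calibration, cross-validation, and validation predictions gives $R^2_c$, $R^2_{cv}$, $R^2_v$, and RMSEV, from which the stability indicator $R^2_c - R^2_{cv}$ follows by subtraction.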

Results

Statistics on Chlorophyll Concentration of Modeling Data
Chlorophyll concentrations were measured from S1 to S4. The average value at each stage was calculated and used to indicate the dynamic changes during potato growth. The results are shown in Figure 4. Chlorophyll concentration increased from 28.12 mg/L at S1 to its highest value of 31.04 mg/L at S2, and then decreased gradually to 15.36 mg/L at S4.
The results of partitioning the dataset with the SPXY algorithm are shown in Table 2, which gives the statistical description of the sample set for each growth stage and for the combination of data from all four stages. Samples from all growth stages were combined to represent the changes in chlorophyll concentration. The modelling dataset for the chlorophyll concentration analysis model consisted of calibration and validation sets with 200 and 114 samples, respectively. The maximum value of the calibration set (41.20 mg/L) was larger than that of the validation set (37.46 mg/L), and the minimum value of the calibration set (7.66 mg/L) was smaller than that of the validation set (8.20 mg/L). The division by SPXY was therefore reasonable, and the calibration set represented the entire dataset well.
Figure 5a shows the Ref curves of the potato crop canopy. Serious scattering effects were observed in the Ref spectra among samples because of the different collection times and light reflection paths. After SNV correction, the noise caused by the scattering effects was largely eliminated, and the dispersion among spectral curves was markedly reduced, as shown in Figure 5b. Accordingly, the SNV spectra were used for the subsequent continuous wavelet transformation and modeling analysis.
Furthermore, we examined the dynamic changes between different stages based on the average SNV spectrum of each stage. Figure 6 shows the reflectance of each stage. Their trends were similar in the visible (400-760 nm) and near-infrared (761-1000 nm) regions. In the visible region, the minimum reflectance appeared near 400 and 680 nm because of strong absorption by pigments. Approaching the near-infrared region, the reflectance increased sharply from 711 nm to 760 nm because of the reflective cavities in the spongy structure of the mesophyll. Although strong reflection formed a plateau in 761-1000 nm, a weak reflectance valley appeared near 970 nm because of the weak absorption by leaf water.
However, significant changes were observed in some specific bands during growth. Within 530-640 nm, the SNV spectral reflectance increased with growth. The average SNV reflectance at S4 was significantly lower than that at the other stages, whereas the average SNV reflectances of S2 and S3 were very close. Within 740-880 nm, the SNV spectral reflectance decreased gradually. Small reflectance peaks were observed near 763 nm at the S2-S4 stages. In the 910-960 nm bands, the average value at S1 was significantly lower than those at the other stages.

Analysis of Wavelet Coefficient Curves under Different Decomposition Scales
The SNV spectral curves were decomposed into wavelet coefficients by CWT under 10 decomposition scales. The CWT results for some of the samples are shown in Figure 7. We observed that with increasing scale, the magnitude of the wavelet coefficients gradually increased and the high-frequency noise was gradually reduced. Thus, the spectral curves were smoothed, and some characteristic absorption peaks were amplified under suitable decomposition scales, as shown in Scales 1-6 (Figure 7). However, when the decomposition scale was too large, the spectral curve became excessively smoothed and the specific characteristic absorption peaks disappeared, which was not conducive to quantitative analysis, as shown in Scales 7-10 (Figure 7).

Figure 8 shows the correlation coefficient curves between the chlorophyll concentration and Ref and SNV. Compared with Ref, the correlation between SNV and chlorophyll concentration was higher overall, illustrating that SNV correction reduced the noise of the original spectra and improved the analysis performance of the spectral data. Furthermore, the correlation between the chlorophyll concentration and SNV was analyzed. Within the ranges of 387-509, 519-633, and 744-844 nm, the absolute values of the correlation coefficient (|r|) were higher than 0.6. The peak value of the positive correlation occurred at 678 nm, where r was 0.411. The peak value of the negative correlation occurred at 702 nm, where r was −0.715. Within 845-917 nm, the positive correlation gradually decreased before becoming a negative correlation, and then |r| gradually increased.
To further understand how the spectra changed with potato growth, correlation analysis was conducted between SNV and chlorophyll concentration at each stage from S1 to S4. Figure 9 shows the correlation coefficient curves. The chlorophyll concentration was positively correlated with the reflectance spectra within the ranges of 400-500 and 650-700 nm, whereas a negative correlation existed within 510-630 and 701-750 nm. Furthermore, four band regions were highly correlated: 400-510, 521-610, 701-740, and 761-920 nm. Overall, the correlation coefficients of S1-S4 differed significantly within 400-600, 601-620, and 700-902 nm. In contrast, the trends of the correlation coefficient curves of S2 and S3 were very similar.

Correlation Analysis between Chlorophyll and Wavelet Features
The correlation coefficients between the chlorophyll concentration and the wavelet coefficients were calculated for decomposition Scales 1-10 to draw the correlation coefficient distribution map shown in Figure 10. The correlation coefficient of each pixel in the map is represented by its color, which helps in selecting the highly correlated WFs. We observed that the correlation coefficients varied with decomposition scale (scaling factor) and wavelength location (shifting factor).

Figure 10. Correlation coefficient map between wavelet features and growth stages.
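A correlation map of this kind can be sketched as follows. This is a minimal illustration, assuming the wavelet features are stored as an array of shape (samples, scales, wavelengths) and that the coefficient is the ordinary Pearson r; the function names and array layout are hypothetical and not taken from the paper.

```python
import numpy as np


def correlation_map(wavelet_features, chlorophyll):
    """Pearson r between chlorophyll and every (scale, wavelength) feature.

    wavelet_features: array of shape (n_samples, n_scales, n_wavelengths)
    chlorophyll:      array of shape (n_samples,)
    Returns an array of shape (n_scales, n_wavelengths).
    Assumes no feature is constant across samples (otherwise r is undefined).
    """
    features = np.asarray(wavelet_features, dtype=float)
    target = np.asarray(chlorophyll, dtype=float)
    f_centered = features - features.mean(axis=0)
    t_centered = target - target.mean()
    # Sum over the sample axis to get the covariance numerator per pixel.
    numerator = np.tensordot(t_centered, f_centered, axes=(0, 0))
    denominator = np.sqrt((f_centered ** 2).sum(axis=0) * (t_centered ** 2).sum())
    return numerator / denominator


def strongest_feature(r_map):
    """Return (scale index, wavelength index, r) of the largest |r| in the map."""
    flat_index = np.nanargmax(np.abs(r_map))
    scale_idx, band_idx = np.unravel_index(flat_index, r_map.shape)
    return int(scale_idx), int(band_idx), float(r_map[scale_idx, band_idx])
```

Ranking by the absolute value matters here because the strongest relationships reported in this section are negative.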

Comparison of Correlation Coefficient
The highest correlation coefficients of Ref, SNV, and WFs are shown in Table 3. The strongest correlation of SNV (|r| = 0.75) was higher than that of Ref (|r| = 0.50), which revealed that SNV correction could effectively remove the noise of spectral data. For WFs, the strength of the correlation gradually increased from Scale 1 to Scale 3 and then gradually decreased. The strongest correlation was found in Scale 3 at 524 nm (r = −0.82), and Ref had the weakest correlation (r = −0.50), at 698 nm.
Moreover, the correlation coefficients of WFs in Scales 1-6 were higher in absolute value than those of SNV, illustrating that CWT could enhance the correlation with chlorophyll concentration by decomposing spectral data. By contrast, the correlation coefficients of WFs in Scales 7-10 were lower than those of SNV, revealing that decomposing spectral data at too large a scale was no longer helpful for quantitative analysis.

Sensitive Chlorophyll Variables Selected Using CARS
For the CWT-CARS-PLS, the sensitive WFs in each decomposition scale were selected, and a chlorophyll analysis PLS model was established for every scale. For comparison with CWT-CARS-PLS, the sensitive wavelengths were selected from Ref and SNV data to establish the Ref-CARS-PLS and SNV-CARS-PLS models, respectively. The LOOCV was always used to obtain the optimal number of principal components (PCs) in establishing the PLS models. The numbers of variables and PCs of the various PLS models are shown in Table 4. Among the chlorophyll analysis models, the Scale5-CARS-PLS model had the most variables (227) and the Scale1-CARS-PLS model had the fewest (31), whereas the Scale3-CARS-PLS model required the fewest PCs (three). The locations of the sensitive variables selected from Ref, SNV, and WFs in Scales 1-10 by the CARS algorithm are shown in Figure 11. All sensitive wavelengths selected by CARS were distributed in the visible and near-infrared regions.
Among the calibration models established from the various sensitive variables, the Scale3-CARS-PLS model had the best predictive accuracy, as shown in Figure 12a. Furthermore, the sensitive WFs of Scale3-CARS-PLS were interpreted in terms of leaf information. The Scale3-CARS-PLS model used 57 variables. Among them, the WFs located in the visible region could reflect the leaf pigment. The WFs located in the near-infrared region could reflect the leaf structure and other leaf substances; for instance, the WF at 929 nm reflected leaf fat, the WF at 973 nm (near 970 nm) reflected leaf water content, and the WF at 985 nm reflected leaf starch.
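The use of LOOCV to choose the number of PCs can be sketched as below. This is only an illustration under stated assumptions: it uses a textbook NIPALS PLS1 fit and picks the component count with the lowest leave-one-out RMSE, which may differ from the software and selection rule the authors used. All function names are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS; return (coef, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # nothing left to explain
            break
        w = w / norm
        t = Xk @ w  # score vector
        tt = t @ t
        p = Xk.T @ t / tt
        q = yk @ t / tt
        Xk = Xk - np.outer(t, p)  # deflate X
        yk = yk - q * t  # deflate y
        weights.append(w)
        loadings.append(p)
        y_loadings.append(q)
    if not weights:
        return np.zeros(X.shape[1]), float(y_mean)
    W = np.array(weights).T
    P = np.array(loadings).T
    coef = W @ np.linalg.solve(P.T @ W, np.array(y_loadings))
    return coef, float(y_mean - x_mean @ coef)


def loocv_rmse(X, y, n_components):
    """Leave-one-out RMSE of a PLS model with the given number of components."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef, intercept = pls1_fit(X[keep], y[keep], n_components)
        errors[i] = y[i] - (X[i] @ coef + intercept)
    return float(np.sqrt(np.mean(errors ** 2)))


def select_n_components(X, y, max_components):
    """Return (best component count, list of LOOCV RMSE values)."""
    scores = [loocv_rmse(X, y, k) for k in range(1, max_components + 1)]
    return int(np.argmin(scores)) + 1, scores
```

Refitting the model once per left-out sample is affordable for datasets of a few hundred spectra, which is the order of magnitude reported for the testing set here.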

Comparison of the Performance of PLS and CARS-PLS Models
The chlorophyll analysis models were established using the CARS-PLS method, and the modeling results ($R^2_c$ and $R^2_c - R^2_{cv}$) are shown in Figure 12. To highlight the advantages of selecting sensitive variables by CARS, the analysis models were also established using the PLS method. Figure 12a shows that for all variable categories, the $R^2_c$ of CARS-PLS was higher than that of PLS, illustrating that CARS could effectively eliminate uninformative variables and improve model accuracy. The $R^2_c - R^2_{cv}$ of CARS-PLS was lower than that of PLS, as shown in Figure 12b, which revealed that CARS could reduce model complexity and enhance model stability.
Furthermore, the $R^2_c$ of SNV-CARS-PLS was higher than that of Ref-CARS-PLS. The $R^2_c$ of the CWT-CARS-PLS models established based on WFs was higher than those of the models based on Ref and SNV. For CWT-CARS-PLS, the $R^2_c$ gradually increased from Scale 1 to Scale 3 and then gradually decreased. Based on the value of $R^2_c - R^2_{cv}$, the stability of SNV-CARS-PLS was stronger than that of Ref-CARS-PLS. For the CWT-CARS-PLS models, stability gradually strengthened from Scale 1 to Scale 3 and then gradually weakened. The stability of the CARS-PLS models based on Scales 1-6 was stronger than those of Ref-CARS-PLS and SNV-CARS-PLS, which was consistent with the correlation analysis in Section 3.3. The above results demonstrated that the CWT could deeply identify spectral data to improve model performance.
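The stability indicator used above can be computed as in the following sketch. It assumes the coefficient of determination is defined as one minus the ratio of residual to total sum of squares, which the text does not state explicitly; the function names are illustrative.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def stability_gap(y_true, y_calibration_pred, y_cv_pred):
    """R2 of calibration minus R2 of cross-validation; smaller means more stable."""
    return r_squared(y_true, y_calibration_pred) - r_squared(y_true, y_cv_pred)
```

A model that fits the calibration set well but degrades under cross-validation yields a large gap, which is why the indicator is read as a sign of overfitting.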

Validation of Chlorophyll Analysis Models
The validation results of various chlorophyll analysis models are shown in Figure 13. As with the calibration models, the CARS-PLS models had a higher coefficient of determination of the validation set ($R^2_v$) than the PLS models, and the CARS-PLS models had a smaller RMSEV than the PLS models, primarily because the invalid variables were removed by the CARS algorithm. Furthermore, CWT-CARS-PLS models under Scales 2-6 had higher

Testing of the Developed Scale3-CARS-PLS Model
The testing set data collected in 2020 were used to test the stability and applicability of the developed Scale3-CARS-PLS model. The chlorophyll concentration of the testing set ranged from 8.81 mg/L to 39.59 mg/L, with an average of 19.18 mg/L. This range was narrower than that of the modeling set, which ranged from 7.66 mg/L to 41.20 mg/L.
The canopy reflectance spectra of the testing dataset (160 samples) were corrected by standard normal variate to obtain the SNV reflectance. The CWT was then performed on the SNV reflectance, the CARS algorithm was used to select the sensitive WFs under Scale 3, and these WFs were substituted into the Scale3-CARS-PLS model to predict chlorophyll concentration. To highlight the performance of Scale3-CARS-PLS, the reflectance spectra of the testing dataset were substituted into the Ref-PLS and Ref-CARS-PLS models, and the SNV reflectance data were substituted into the SNV-CARS-PLS model. A 1:1 scatter plot was created, as shown in Figure 15, to visually demonstrate the chlorophyll concentration prediction results. Ref-CARS-PLS performed better than Ref-PLS, which showed that CARS could eliminate valueless variables and improve the analysis ability of the model. The performance of SNV-CARS-PLS was further enhanced because the SNV pre-processing corrected the scattering effect. The Scale3-CARS-PLS model showed the highest $R^2$ (0.69) and the smallest RMSE (3.36 mg/L), which illustrated that the Scale3-CARS-PLS model possessed good analysis capability and that the spectral analysis method had good applicability. Figure 15d shows that the chlorophyll values were evenly distributed on both sides of the 1:1 line, further illustrating that the proposed Scale3-CARS-PLS model had good stability.
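The SNV correction and the RMSE used to score the test predictions can be sketched as below. This assumes the usual row-wise SNV definition, in which each spectrum is centered by its own mean and scaled by its own sample standard deviation; the choice of the sample (n − 1) standard deviation and the function names are assumptions, not details given in the paper.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) separately.

    spectra: array of shape (n_samples, n_wavelengths)
    """
    spectra = np.asarray(spectra, dtype=float)
    row_mean = spectra.mean(axis=1, keepdims=True)
    row_std = spectra.std(axis=1, ddof=1, keepdims=True)  # sample std (assumed)
    return (spectra - row_mean) / row_std


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted concentrations."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

Because SNV uses only statistics of the spectrum being corrected, it can be applied to the 2020 testing spectra without reference to the modeling set.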


Discussion
Spectroscopy is a rapid and non-destructive method of gathering crop-pigment information [55,56]. In this study, the spectral characteristic response and chlorophyll concentration change at different stages were analyzed and discussed. Results demonstrated that the average reflectance was close in S2 and S3, and that the correlation curves between the reflectance and chlorophyll concentration of S2 and S3 had similar trends. According to potato phenology, new tubers form on the stolons after the plant flowers at S2, and the tubers expand at S3. Consequently, nutrients are transferred from the aboveground stems and leaves to the underground tubers during these periods. This phenomenon may explain why the plants at these two stages had more similar physiological status and spectral responses than those at the other stages [57].

Abilities of Denoising and Sensitive-Variable Mining of CWT at Different Decomposition Scales
SNV can effectively reduce scattering noise to enhance the analysis performance of spectral data [45,46]. After SNV correction, dispersion among spectral curves was significantly reduced (Figure 5), and the correlation between spectral data and chlorophyll concentration was enhanced (as shown in Figure 8). Accordingly, SNV spectra were used for further CWT and modeling analysis.
After CWT, the spectral reflectance was transformed into the wavelet coefficients, as shown in Figure 7. As the decomposition scale increased from 1 to 6, the wavelet coefficient curve was smoothed, and some characteristic absorption peaks were amplified. At larger scales, the curve was excessively smoothed, resulting in the disappearance of the characteristic absorption locations. This observation was consistent with previous literature reporting that WFs in the middle- and low-frequency scales could capture the absorption characteristics of the physical and chemical substances of crops [33,58] and effectively eliminate the high-frequency noise of spectral data [36,59]. High-frequency WFs could remove the absorption features and could not efficiently analyze the physiological and biochemical compositions [60].
The absolute value of the highest correlation coefficient between chlorophyll concentration and WFs under Scales 1-6 was higher than that of SNV (0.75), illustrating that the CWT could enhance the correlation with chlorophyll concentration by decomposing spectral data. Previous studies [30,34,59-61] have reported the same result; for example, Wang [59] indicated that the correlation between wavelet coefficients and pigments was significantly higher than that of vegetation indices and sensitive wavelengths. Furthermore, as the decomposition scale increased from 1 to 3, the absolute value of the highest correlation coefficient of WFs increased from 0.78 to 0.82 and then gradually decreased to 0.70, illustrating that high-frequency WFs were not conducive to quantitative analysis [32,33,60,61]. WFs under Scale 3 exhibited the strongest correlation with chlorophyll concentration.
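The decomposition of one spectrum into wavelet features at dyadic scales can be sketched as below. The mother wavelet is an assumption: a Mexican hat is used here purely for illustration, since this section does not name the wavelet, and the scale is expressed in units of spectral samples rather than nanometers. The function names are hypothetical.

```python
import numpy as np


def mexican_hat(t):
    """Unnormalized Mexican hat (second derivative of a Gaussian, sign-flipped)."""
    return (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)


def cwt_dyadic(spectrum, n_scales=10):
    """Continuous wavelet transform at scales 2**1 ... 2**n_scales.

    spectrum: 1-D array sampled at evenly spaced wavelengths
    Returns an array of shape (n_scales, n_wavelengths); row k holds Scale k + 1.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    positions = np.arange(spectrum.size)
    # offsets[b, i] = i - b, the distance of sample i from shift position b
    offsets = positions[None, :] - positions[:, None]
    coefficients = np.empty((n_scales, spectrum.size))
    for k in range(n_scales):
        scale = 2.0 ** (k + 1)
        kernel = mexican_hat(offsets / scale) / np.sqrt(scale)
        coefficients[k] = kernel @ spectrum
    return coefficients
```

The direct matrix form is chosen over convolution so that the largest scales, whose support exceeds the length of the spectrum, need no special handling.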

Uninformative Variable Elimination by CARS Algorithm
Given that a spectrometer collects reflectance data in near-contiguous spectral bands, the selection of sensitive wavelengths or variables is one of the key steps in chlorophyll analysis to address multicollinearity, overfitting, and redundancy [62]. Thus, wavelengths and WFs need to be selected by effective algorithms to remove the uninformative variables and to enhance model performance [59,63]. The CARS algorithm, developed based on the model population analysis strategy [64], considers the contribution of each variable to the analysis model to select informative spectral variables. Relative to the PLS models, the number of input variables of the CARS-PLS models was reduced significantly, and the CARS-PLS models possessed better prediction ability, as shown in Figures 12a and 13. Overfitting frequently occurred during the modeling process, caused by the increasing number of model variables, which affected the stability and accuracy of the PLS model [65]. Accordingly, internal cross-validation was performed in this study. The difference between the coefficients of determination of the calibration and cross-validation sets ($R^2_c - R^2_{cv}$) was used as an indicator of model stability [66]. As shown in Figure 12b, the $R^2_c - R^2_{cv}$ values of the CARS-PLS models were lower than those of the PLS models, further illustrating that CARS can effectively eliminate redundant variables and improve model stability.
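The variable-reduction step of CARS can be sketched as below. This follows the commonly described formulation of the algorithm (an exponentially decreasing function that sets the fraction of variables retained in each sampling run, followed by adaptive reweighted sampling on the absolute PLS regression coefficients) and is an assumption rather than a description of the authors' implementation. The PLS fitting and the final choice of the subset with the lowest cross-validation error are omitted, and the function names are hypothetical.

```python
import numpy as np


def edf_keep_ratio(n_variables, n_runs):
    """Fraction of variables retained in runs 1..n_runs.

    The ratio starts at 1 (all variables) and decays exponentially to
    2 / n_variables (two variables) in the last run.
    """
    k = np.log(n_variables / 2.0) / (n_runs - 1)
    a = (n_variables / 2.0) ** (1.0 / (n_runs - 1))
    runs = np.arange(1, n_runs + 1)
    return a * np.exp(-k * runs)


def cars_reduce_step(abs_coefficients, keep_ratio, rng):
    """One reduction step; returns the indices of the surviving variables.

    abs_coefficients: |PLS regression coefficient| per original variable,
                      with zeros for variables already eliminated
    keep_ratio:       fraction of the original variables to retain in this run
    """
    abs_coefficients = np.asarray(abs_coefficients, dtype=float)
    n_keep = max(2, int(round(keep_ratio * abs_coefficients.size)))
    # Enforced elimination: keep only the n_keep largest weights.
    top = np.argsort(abs_coefficients)[::-1][:n_keep]
    weights = abs_coefficients[top] / abs_coefficients[top].sum()
    # Adaptive reweighted sampling: draw with replacement, favoring large weights.
    sampled = rng.choice(top, size=n_keep, replace=True, p=weights)
    return np.unique(sampled)
```

Sampling with replacement means that a variable with a small weight may drop out even if it survived the enforced elimination, which is what makes the selection competitive.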

Chlorophyll Content Analysis Capability of WFs under Different Decomposition Scales
We further analyzed the performance of the various CARS-PLS models (Figure 14). For WFs under Scale 3, 57 sensitive WFs were selected by the CARS algorithm, and their locations were distributed across the visible (37 variables) and near-infrared (20 variables) regions, as shown in Figure 11. Previous studies have reported that the spectral data in the visible region can be used to analyze pigment content [67]. Moreover, the spectral data in the near-infrared region can reflect information on other substances and on crop-canopy structure, which can improve the robustness of the chlorophyll analysis model [68].
Moreover, previous studies reported the detection of chlorophyll concentration in crops based on spectral wavelengths and/or spectral indices. Sun [28] selected 11 sensitive wavelengths for analyzing the chlorophyll concentration of potato leaves, with a model $R^2_v$ of 0.77. Tao [69] screened the red edge position using the linear extrapolation method for estimating the chlorophyll concentration of potato, with an $R^2_c$ of 0.87. By comparison, the $R^2_c$ and $R^2_v$ of the analysis model developed by coupling the CWT and CARS methods in this paper were 0.93 and 0.86, respectively. These results demonstrated that CWT could deeply identify spectral data to improve model performance, and that the sensitive WFs under Scale 3 possessed the best prediction capability for the chlorophyll concentration of potato crops.

Generalizability of This Study to Future Works
A comprehensive analysis of the testing results showed that spectral data could be processed using CWT. Sensitive variables were selected using CARS, which was suitable for model-variable optimization and prediction-capability improvement. Finally, the analysis performance of the Scale3-CARS-PLS model was tested using another variety of potato crop; the $R^2$ and RMSE were 0.69 and 3.36 mg/L, respectively, as shown in Figure 15, which demonstrated that the Scale3-CARS-PLS model possessed good stability and applicability. Previous studies reported that the chlorophyll concentration is significantly correlated with the concentration of nitrogen [5,70]. Therefore, this study could provide theoretical support for precision nitrogen management in the potato field, and a method reference for large-scale remote sensing analysis of potato chlorophyll concentration.
However, this method was based on specific spectral data for potato crops, and its applicability to other datasets or potato varieties may be limited [22,71]. Therefore, more datasets from wide-ranging potato varieties, planting patterns, and experimental fields should be collected to develop a stable and accurate analysis model using the CWT-CARS-PLS method.

Conclusions
We presented an effective method for analyzing the chlorophyll concentration of potato plants through canopy spectroscopy. The dynamic responses of canopy spectra at different growth stages were analyzed. The spectral characteristics were found to differ significantly between S1, S2-S3, and S4, whereas the SNV spectral reflectance curves in S2 and S3 were similar. The performances of Ref, SNV, WFs under different decomposition scales, CARS-PLS, and CWT-CARS-PLS in analyzing chlorophyll concentration were compared based on the model results. The CARS-PLS models established from WFs obtained by CWT exhibited the best analysis ability and reliability. The Scale3-CARS-PLS model had fewer variables, the smallest $R^2_c - R^2_{cv}$ value, the largest $R^2_v$, and the smallest RMSEV for chlorophyll analysis. The analysis performance of the Scale3-CARS-PLS model was tested using another variety of potato crop with a satisfactory result. Based on spectral data, the WFs under Scale 3 showed excellent chlorophyll-content prediction capability. Thus, the proposed CWT-CARS-PLS was a potentially accurate and efficient method of analyzing the chlorophyll concentration of potato crops. This study could provide a method reference for large-scale remote sensing analysis of chlorophyll concentration and theoretical support for precision nitrogen management of potato crops.