1. Introduction
Winter wheat is a vital grain crop in China, with its yield directly impacting national food security. Chlorophyll, an essential pigment in photosynthesis, plays a crucial role in capturing and transferring light energy, thereby serving as a fundamental indicator of crop growth and development [
1]. As the primary medium through which plants absorb light energy and convert it into chemical energy, chlorophyll content significantly influences photosynthetic efficiency and overall crop productivity. Variations in chlorophyll levels not only reflect the physiological status of the crop but also closely correlate with nitrogen nutrition and environmental stress factors [
2]. The soil plant analysis development (SPAD) meter provides rapid, non-destructive measurements of relative chlorophyll content through a unitless index based on leaf transmittance at specific wavelengths. Although SPAD values represent an instrument-dependent proxy rather than absolute chlorophyll concentration, they are widely adopted in agricultural practice due to their strong correlation with chlorophyll content and nitrogen status, making them valuable for guiding fertilization decisions and crop management [
3]. However, handheld SPAD meters are limited to point measurements and are impractical for large-scale field monitoring. In agricultural practice, the ability to rapidly and accurately estimate SPAD values across spatial scales through remote sensing is of great scientific significance for precision farmland management and crop monitoring. Despite numerous studies developing hyperspectral-based SPAD estimation models, three critical gaps persist: (1) most studies rely on pre-defined vegetation indices without systematic optimization of spectral band combinations under specific experimental conditions; (2) the comparative effectiveness of different spectral preprocessing methods for enhancing SPAD-reflectance correlations remains inadequately characterized across diverse crop conditions; and (3) comprehensive comparisons among advanced machine learning architectures for SPAD estimation using optimized spectral features are lacking. This study directly addresses these gaps through systematic spectral band optimization, rigorous preprocessing method comparison, and comprehensive machine learning algorithm evaluation.
Conventional chlorophyll measurement methods are generally destructive, as they necessitate repeated sampling of crop leaves, which is both labor-intensive and time-consuming. These constraints impede large-scale, continuous, and spatially precise monitoring, making these methods insufficient for meeting the requirements of modern precision agriculture [
4]. Hyperspectral remote sensing, characterized by its rich spectral resolution and multiple narrow bands, enables detailed acquisition of canopy spectral data. Its numerous contiguous narrow bands allow precise detection of chlorophyll-specific absorption features in the red and red-edge regions, which are directly correlated with leaf chlorophyll content and SPAD values [
5]. This fine spectral resolution enables differentiation of subtle variations in chlorophyll absorption that broader-band multispectral sensors cannot detect. It offers a non-destructive and efficient alternative for monitoring plant physiological and biochemical traits. Extensive research has demonstrated the applicability of hyperspectral technology in estimating chlorophyll content across various plant species [
6,
7,
8,
9]. Recent studies have shown significant correlations between hyperspectral reflectance-derived vegetation indices and the growth status of winter wheat [
10]. Furthermore, using multiple spectral indices as inputs to multivariate regression or machine learning models, rather than relying on a single index, has been shown to enhance the predictive accuracy of crop physiological parameters under field conditions where complementary indices can capture different aspects of canopy biochemistry and structure [
11].
Canopy spectral reflectance is often affected by measurement instrument limitations, methodological inconsistencies, and environmental factors, including canopy structure, ground background, leaf water content, and atmospheric absorption [
12,
13,
14,
15]. Canopy hyperspectral data, spanning wavelengths from 350 to 2500 nm, often contain redundant or interfering information. Variations in environmental conditions, growth stages, and crop physiological states introduce distinctive spectral signatures [
16]. Applying uniform spectral wavelengths across diverse conditions may lead to suboptimal utilization of spectral information and potential interference from unrelated factors, such as leaf area index (LAI), carotenoids, and background scattering effects, thereby affecting SPAD values estimation accuracy [
17]. Hyperspectral data preprocessing serves three critical purposes in SPAD estimation: (1) noise reduction from atmospheric absorption and instrumental artifacts, (2) enhancement of subtle spectral features related to chlorophyll absorption characteristics, and (3) minimization of confounding factors such as canopy structure variations and background soil reflectance [
14]. Different preprocessing techniques address these objectives through distinct mechanisms. First-derivative (FD) transformation has been demonstrated to enhance chlorophyll-related spectral features while suppressing linear background trends and reducing illumination effects, making it particularly effective for vegetation biochemical parameter estimation [18,19]. Multiplicative scatter correction (MSC) addresses scattering variations caused by surface irregularities, whereas Savitzky–Golay (SG) smoothing reduces high-frequency noise while preserving spectral shape [20]. However, optimal preprocessing methods may vary depending on crop type, growth stage, canopy structure, and environmental conditions. For instance, Shen et al. applied 20 preprocessing techniques to SPAD estimation modeling across different growth stages of winter wheat and found that combinations of wavelet denoising, first-order differentiation, and principal component analysis significantly improved model performance, although their effectiveness varied by phenological stage [20]. This variability calls for a systematic comparison of multiple preprocessing methods under the specific experimental conditions at hand, rather than reliance on a single technique identified in other contexts. Consequently, this study compares FD, second-derivative (SD), MSC, and SG transformations to identify the most effective preprocessing strategy for enhancing SPAD-reflectance correlations in winter wheat under our specific irrigation and growth conditions.
Selecting feature wavelengths for vegetation indices can reduce computational complexity and enhance the efficiency and interpretability of SPAD estimation models [
21]. Miao et al. optimized two-band and three-band combinations to construct anthocyanin estimation models for winter wheat, finding that the model based on the optimal three-band combination following first-order differential transformation achieved the highest accuracy [
22]. Zhang et al. demonstrated that mutual information-enhanced two-dimensional correlation spectroscopy can effectively identify yield-sensitive characteristic wavelengths of winter wheat at different growth stages. These wavelengths are strongly associated with key physiological parameters and yield formation, and their selection reduces redundant spectral dimensions while preserving critical physiological information, thereby enhancing the model’s predictive performance [
23].
Machine learning (ML) algorithms address key challenges in hyperspectral SPAD estimation, including modeling complex non-linear relationships between multiple spectral features and physiological parameters, handling high-dimensional feature spaces with potential multicollinearity, and accounting for confounding factors present in canopy-level field measurements [
24,
25]. In this study, we conduct a systematic comparative evaluation of six ML algorithms with distinct architectural characteristics to identify which modeling approaches are most effective for SPAD estimation using optimized spectral indices. Different ML algorithms exhibit varying suitability depending on data characteristics due to their distinct model structures [
26].
Random forest (RF) serves as a robust baseline: it aggregates multiple decision trees to handle high-dimensional feature spaces effectively while mitigating overfitting through feature randomness, and it is well established for crop parameter estimation with moderate sample sizes [27]. Multilayer perceptron (MLP) models employ multilayer nonlinear transformations to capture complex relationships between spectral features and physiological parameters [28]. We also include more advanced architectures, namely long short-term memory (LSTM), gated recurrent unit (GRU), deep recurrent neural network (Deep-RNN), and convolutional neural network (CNN) models, to evaluate whether they provide meaningful improvements over these simpler baselines. We acknowledge that some of these architectures may not be theoretically optimal for our data structure: our dataset comprises independent observations at different growth stages rather than longitudinal time series tracking individual plants, which limits the applicability of recurrent models designed for temporal sequence modeling [
29,
30]. Similarly, CNNs are typically most effective for spatially or spectrally continuous data, whereas our inputs consist of discrete vegetation indices [
31]. Nevertheless, we include these algorithms in our comparative evaluation to empirically assess whether they offer performance advantages when applied to optimized spectral index inputs. This comparison allows us to determine whether the added complexity of advanced architectures is justified by improved predictive performance, or whether simpler models (RF, MLP) provide a more appropriate trade-off between accuracy, interpretability, and computational efficiency for this application. Our dataset of 900 samples, while substantial for an agricultural field experiment, is relatively small by deep learning standards; we therefore apply cross-validation and regularization to mitigate the risk of overfitting.
This study aims to develop a reliable model for estimating SPAD values in winter wheat, with the objective of enhancing the accuracy and stability of hyperspectral estimation techniques. Specifically, we test the following hypotheses: (1) spectral preprocessing methods (FD, SD, MSC, SG) significantly improve the correlation between canopy reflectance and SPAD values compared to the original spectra; (2) optimized two-band vegetation indices, systematically selected through correlation analysis across the full hyperspectral range, outperform published vegetation indices in SPAD estimation accuracy; and (3) machine learning algorithms can effectively integrate multiple optimized spectral indices to model complex relationships between canopy-level reflectance and SPAD values. We compare diverse architectures, including established models well suited to tabular data (RF, MLP) and advanced architectures (LSTM, GRU, Deep-RNN, CNN), to empirically assess whether increased architectural complexity improves SPAD estimation performance, acknowledging that some architectures may not be theoretically optimal for our non-temporal data structure. To test these hypotheses, five spectral vegetation indices (VIs) were selected and optimized using Pearson correlation analysis to identify the most effective combination of indices. Subsequently, various combinations of these optimized VIs were used as input features for multiple machine learning algorithms to construct SPAD value estimation models for winter wheat. Machine learning approaches were employed because canopy-level reflectance involves complex, non-linear relationships between multiple spectral features and SPAD values that cannot be adequately captured by simple empirical models. The resulting model demonstrates improved accuracy in monitoring winter wheat growth, thereby offering technical support for crop growth assessment and precision field management.
2. Materials and Methods
2.1. Overview of the Research Area
The experimental site is situated in Taigu District, Jinzhong City, Shanxi Province, China (112°34′19.96″ E, 37°25′19.81″ N), as illustrated in
Figure 1. This region experiences a temperate continental climate, characterized by an average annual temperature of 13.48 °C and an annual precipitation of approximately 326.9 mm. In Shanxi Province, the winter wheat growing season typically commences in early October. During the 2024 growing cycle, the crop entered the regreening stage from late February to April, followed by the jointing stage in early to mid-April. The flowering stage occurred from late April to early May, succeeded by the grain filling stage in mid-May, and the crop reached maturity by late May. The total growth period spanned approximately 250 days.
2.2. Experimental Design
The experiment was conducted at the Experimental Station of Shanxi Agricultural University from October 2023 to June 2024. In compliance with established agricultural experimental design standards, the experimental plots were designated as moisture-controlled units, each covering an area of 6 m² (3 m × 2 m). To enhance the variability of winter wheat growth conditions, two wheat cultivars, namely Changmai 6878 and Zhongmai 175, were sown in each plot. Nitrogen, phosphorus, and potassium fertilizers were used as the base fertilizer and were applied once before sowing. The fertilization standard was 150 kg·hm⁻² for nitrogen (N), 120 kg·hm⁻² for phosphorus (P₂O₅), and 120 kg·hm⁻² for potassium (K₂O). The nitrogen source was urea, the phosphorus source was calcium superphosphate, and the potassium source was potassium sulfate. Other field management was consistent with that of local farmers. The inter-varietal row spacing was set at 15 cm, with an intra-row spacing of 10 cm. The inclusion of two cultivars was designed to increase dataset diversity and phenotypic variability rather than to investigate cultivar-specific effects. All data from both cultivars were pooled for model development so that the resulting SPAD estimation models would be robust and generalizable across different genetic backgrounds.
To create various soil-plant-water conditions, five irrigation treatments were implemented for each cultivar: W0 (no irrigation), W1 (irrigation at jointing stage), W2 (irrigation at jointing + flowering stages), W3 (irrigation at jointing + grain filling stages), and W4 (irrigation at jointing + flowering + grain filling stages). The irrigation volume applied to each plot was 0.36 m³ per irrigation event. The experiment was arranged with three replicates per treatment, yielding a total of 15 experimental plots. The experimental design of the test field is shown in
Figure 2.
2.3. Data Acquisition
2.3.1. SPAD Values Measurement
In this study, SPAD values of winter wheat canopy leaves were measured in each experimental plot at five key growth stages (jointing, heading, flowering, grain filling, and maturity) using a handheld chlorophyll meter (SPAD-502, Minolta Camera Co., Ltd., Osaka, Japan) [32]. Spatial correspondence between SPAD and spectral measurements was ensured through the following protocol: sampling points within each plot were first marked with permanent stakes, then hyperspectral measurements were acquired by positioning the spectrometer probe vertically (1 m height, 25° field of view, approximately 0.2 m² footprint) above each marked point, immediately followed by SPAD measurements on flag leaves of plants within the same spectral measurement footprint. For each sampling point, three flag leaves were selected from three distinct plants. SPAD measurements were taken at three specific positions on each flag leaf: the apex, mid-section, and base. To ensure consistency in repeated measurements, the exact measurement points on the test plants were systematically marked. The sampling intensity is consistent with protocols reported in comparable hyperspectral crop monitoring studies [
33,
34] and was designed to balance statistical representativeness with practical field constraints. The distribution of samples across five irrigation treatments, two cultivars, and five phenological stages ensured adequate phenotypic diversity for model development and validation.
2.3.2. Acquisition of Canopy Hyperspectral Data
Canopy hyperspectral data and SPAD values were collected concurrently on the same day. The canopy hyperspectral reflectance of winter wheat was measured using a FieldSpec 3 spectroradiometer (Analytical Spectral Devices, ASD, Boulder, CO, USA), covering a spectral range of 350–2500 nm. Data acquisition was performed under clear and windless weather conditions to minimize environmental interference. For each experimental plot, sampling areas were selected based on visual assessment confirming uniform plant height (coefficient of variation < 10%), consistent canopy density, and absence of lodging, pest damage, or bare soil patches to ensure the representativeness and reliability of the spectral reflectance data.
Measurements were conducted between 10:00 and 14:00 to ensure stable illumination conditions. For each plot, three representative sampling points were selected, and three spectral measurements were taken per point. During data collection, the spectrometer probe was kept vertically at a height of approximately 1 m above the winter wheat canopy, with a field of view angle of 25°. A standard white reference panel was used for calibration prior to each measurement session.
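The viewing geometry described above can be checked with a short calculation. The sketch below computes the circular ground footprint of a nadir-pointing probe from its height and field of view. It assumes an idealized conical field of view over a flat canopy surface, and the function name `footprint` is purely illustrative.

```python
import math

def footprint(height_m: float, fov_deg: float) -> tuple:
    """Return (diameter_m, area_m2) of the circular footprint of a nadir probe.

    Assumes an ideal conical field of view over a flat surface.
    """
    radius = height_m * math.tan(math.radians(fov_deg) / 2.0)
    return 2.0 * radius, math.pi * radius ** 2

# Probe geometry stated in the text: 1 m above the canopy, 25 degree field of view.
diameter, area = footprint(height_m=1.0, fov_deg=25.0)
print(f"diameter = {diameter:.2f} m, area = {area:.2f} m^2")
```

Under these assumptions the footprint is roughly 0.44 m across and about 0.15 m² in area, which is of the same order as the approximately 0.2 m² footprint reported for the sampling protocol.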
2.4. Data Processing
In this study, Microsoft Excel 2021 was employed to organize the data, categorizing it into distinct sample datasets corresponding to different growth stages. This process resulted in a total of 900 hyperspectral samples and 900 SPAD values. Hyperspectral data preprocessing, including the removal of spectral outliers and correction of spectral anomalies, was conducted using ViewSpecPro 6.0 software. Subsequently, FD, SD, MSC, and SG transformations were applied to the hyperspectral data using Unscrambler X 10.4 software with results exported as ASCII text files (.txt format). This study excluded spectral curves in the ranges of 350–399 nm, 1350–1405 nm, 1750–1950 nm, and 2320–2500 nm, as significant noise was observed in these spectral ranges due to severe atmospheric absorption. Additionally, data generated by failed spectrometer measurements were also excluded [
35]. Using MATLAB 2022b, preprocessed spectra (imported from .txt files) and SPAD values (imported from Excel .csv files) were used to optimize vegetation indices by analyzing correlations between SPAD values and spectral reflectance, enabling selection of optimal spectral band combinations. Model training and validation were conducted in Python 3.8 using scikit-learn and TensorFlow libraries, with optimized indices and SPAD values imported as .csv files. Subsequently, Origin 2022 software was employed for data visualization and plotting.
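The band exclusion and the four transformations can be sketched in a few lines of NumPy. This is only an illustration of the general techniques, not the Unscrambler X implementation used in the study: the function names, the Savitzky–Golay window and polynomial order, the reflective edge handling, and the random placeholder spectra are all assumptions.

```python
import numpy as np

# Wavelength grid of the spectroradiometer (1 nm steps, 350-2500 nm).
wavelengths = np.arange(350, 2501, dtype=float)

# Noisy ranges excluded in the text (inclusive bounds, in nm).
EXCLUDED = [(350, 399), (1350, 1405), (1750, 1950), (2320, 2500)]

def keep_mask(wl):
    """Boolean mask that is False inside any excluded range."""
    mask = np.ones(wl.shape, dtype=bool)
    for lo, hi in EXCLUDED:
        mask &= ~((wl >= lo) & (wl <= hi))
    return mask

def first_derivative(spectra, wl):
    """FD: d(reflectance)/d(wavelength) by finite differences."""
    return np.gradient(spectra, wl, axis=1)

def second_derivative(spectra, wl):
    """SD: the first derivative applied twice."""
    return first_derivative(first_derivative(spectra, wl), wl)

def msc(spectra):
    """MSC: regress each spectrum on the mean spectrum, then remove
    the fitted offset and divide by the fitted slope."""
    ref = spectra.mean(axis=0)
    out = np.empty_like(spectra, dtype=float)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)
        out[i] = (s - intercept) / slope
    return out

def savgol_smooth(spectra, half_window=5, order=2):
    """SG smoothing: local least-squares polynomial fit in a sliding window.
    Window size, order, and reflective edge padding are assumptions."""
    x = np.arange(-half_window, half_window + 1)
    design = np.vander(x, order + 1, increasing=True)
    kernel = np.linalg.pinv(design)[0]  # weights giving the fitted value at x = 0
    padded = np.pad(spectra, ((0, 0), (half_window, half_window)), mode="reflect")
    return np.array([np.convolve(row, kernel[::-1], mode="valid") for row in padded])

# Placeholder reflectance (4 random spectra), used only to show the call pattern.
rng = np.random.default_rng(0)
raw = rng.uniform(0.05, 0.6, (4, wavelengths.size))
mask = keep_mask(wavelengths)
fd_kept = first_derivative(raw, wavelengths)[:, mask]
print(int(mask.sum()), fd_kept.shape)
```

In this sketch the derivatives are computed on the full spectrum and the excluded ranges are masked afterwards, so that finite differences are never taken across a gap. Whether the original workflow applied the two steps in this order is not stated in the text.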
2.5. Selection of Vegetation Indices
VIs play a crucial role in the estimation of SPAD values and are widely applied in the field of crop physiology monitoring. Among the numerous hyperspectral vegetation indices available, those exhibiting strong correlations with SPAD values were selected and optimized in this study. By leveraging the high spectral resolution of hyperspectral data, which comprise a dense arrangement of narrow spectral bands, optimized vegetation indices were developed through an enhanced band selection algorithm that systematically identified the two optimal spectral bands for each index.
The selected published hyperspectral vegetation indices, along with the corresponding SPAD values of winter wheat, were input into Unscrambler X 10.4 and SPSS 23 for correlation analysis. This process enabled the identification of vegetation indices exhibiting strong statistical relationships with SPAD values, thereby serving as a basis for model development. Subsequently, MATLAB was used to generate correlation matrix plots between the optimized vegetation indices and SPAD values, facilitating the identification of the most sensitive spectral bands for SPAD values estimation. The specific vegetation indices used in this study are detailed in
Table 1.
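The idea of optimizing a two-band index by exhaustive correlation analysis can be illustrated as follows. This is a minimal sketch under stated assumptions: it uses a simple ratio index (band i divided by band j) as a stand-in for the index forms in Table 1, it runs on synthetic data, and the function names are hypothetical rather than taken from the MATLAB workflow used in the study.

```python
import numpy as np

def pearson_columns(features, target):
    """Pearson r between each column of `features` and the vector `target`."""
    f = features - features.mean(axis=0)
    t = target - target.mean()
    denom = np.sqrt((f ** 2).sum(axis=0) * (t ** 2).sum())
    with np.errstate(invalid="ignore", divide="ignore"):
        r = (f * t[:, None]).sum(axis=0) / denom
    return np.nan_to_num(r)  # constant columns have undefined r; report 0

def best_ratio_pair(spectra, spad):
    """Search all band pairs (i, j) for the ratio spectra[:, i] / spectra[:, j]
    with the largest absolute Pearson correlation against SPAD."""
    n_bands = spectra.shape[1]
    r_matrix = np.zeros((n_bands, n_bands))
    for j in range(n_bands):
        ratio = spectra / spectra[:, [j]]       # every band divided by band j
        r_matrix[:, j] = pearson_columns(ratio, spad)
    np.fill_diagonal(r_matrix, 0.0)              # a band divided by itself is constant
    i, j = np.unravel_index(np.argmax(np.abs(r_matrix)), r_matrix.shape)
    return int(i), int(j), float(r_matrix[i, j]), r_matrix

# Synthetic example: SPAD is driven by the ratio of band 3 to band 7.
rng = np.random.default_rng(0)
spectra = rng.uniform(0.1, 0.9, (200, 12))
spad = 30 + 10 * spectra[:, 3] / spectra[:, 7] + rng.normal(0, 0.1, 200)
i, j, r, r_matrix = best_ratio_pair(spectra, spad)
print(i, j, round(r, 3))
```

The returned `r_matrix` corresponds to the kind of band-by-band correlation matrix plot described above. Ratio indices require non-zero denominators, so applying this sketch to derivative spectra would need additional handling of values at or near zero.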
2.6. Model Construction Method
Following the transformation of canopy hyperspectral reflectance data, six machine learning algorithms (RF, LSTM, MLP, Deep-RNN, GRU, and CNN) were used to develop models for estimating SPAD values in winter wheat. The dataset consisted of 900 hyperspectral samples collected from wheat canopies under various cultivars and treatment conditions. To ensure consistent and comparable model evaluation across all algorithms, the complete dataset was partitioned once using random sampling with a fixed random seed, allocating 70% (630 samples) to the training set and 30% (270 samples) to the validation set. This identical train-validation split was applied uniformly to all six machine learning models, ensuring that all models were trained and evaluated on the same datasets. Furthermore, an identical set of input features was employed across all modeling approaches to maintain consistent input dimensionality, thus enabling a fair and directly comparable evaluation of their predictive performance.
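A single, reproducible 70/30 partition of this kind can be sketched as below. The seed value and the function name `split_indices` are assumptions made for illustration; the text states only that a fixed random seed was used.

```python
import numpy as np

def split_indices(n_samples, train_fraction=0.7, seed=42):
    """Shuffle sample indices once with a fixed seed and cut them into
    training and validation index arrays."""
    rng = np.random.default_rng(seed)  # seed value is an assumption
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

train_idx, val_idx = split_indices(900)
print(len(train_idx), len(val_idx))
```

Reusing the same two index arrays for every model is what makes the comparison fair: each algorithm sees exactly the same 630 training samples and is scored on exactly the same 270 held-out samples.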
RF, an ensemble learning method based on the aggregation of multiple decision trees, performs regression on high-dimensional data by combining feature randomness with averaging of the individual tree predictions. Although particularly effective for small- to medium-sized structured datasets, RF exhibits limitations when applied to complex time series or unstructured data [
42]. In this study, hyperparameters were optimized using a grid search method. The optimal configuration consisted of 1200 decision trees, a maximum tree depth of 8, a minimum internal node split size of 3, and a minimum leaf node size of 2.
LSTM is a specialized recurrent neural network (RNN) designed to overcome the long-term dependency problem of conventional RNNs. Its architecture incorporates three gating mechanisms: input, forget, and output gates [
43]. In this study, LSTM was applied in an exploratory manner to process vegetation indices arranged sequentially by correlation strength. The number of neurons and training epochs was optimized through grid search, resulting in an LSTM model with 50 neurons and 50 training iterations.
MLP is a feedforward neural network consisting of an input layer, one or more hidden layers, and an output layer. It updates network weights using the backpropagation algorithm [
44]. For this study, grid search optimization determined the following architecture: the first hidden layer comprised 100 neurons, the second hidden layer comprised 50 neurons, with an initial learning rate of 0.001 and a maximum of 1100 training iterations.
Deep-RNN models extend traditional RNNs by stacking multiple recurrent layers to increase network depth, thereby improving their capacity to model complex temporal patterns in physiological data [
45]. The optimal Deep-RNN configuration identified in this study included three hidden layers with 64, 32, and 16 neurons, respectively, a learning rate of 0.001, and a single output layer.
GRU, a streamlined variant of LSTM, retains long-term memory capabilities while reducing computational complexity and improving training efficiency [
46]. In this study, a three-layer GRU network was constructed, with each layer consisting of 128 neurons. To mitigate overfitting, a dropout layer with a dropout rate of 0.2 was incorporated after each GRU layer. The output layer employed a fully connected Dense layer, and the model was trained for a maximum of 400 epochs.
CNNs are a class of deep learning models well suited to data with local structure, such as images, spectra, and ordered sequences. By combining local convolution operations in the convolutional layers with dimensionality reduction in the pooling layers, CNNs can extract key response features from hyperspectral vegetation index sequences [
47]. In this study, CNN was applied in an exploratory manner by arranging vegetation indices as a one-dimensional vector ordered by correlation strength and applying 1D convolutional operations to detect local patterns among adjacent indices. This represents a non-standard application, as discrete vegetation indices lack the spatial or spectral continuity that typically justifies convolutional architectures. This approach was included for comprehensive algorithm comparison. In this study, the final CNN configuration was determined with 128 neurons in the Dense layer, a dropout rate of 0.2, a learning rate of 0.001, and 50 training epochs.
In the training and construction of all models, this study employed a systematic grid-search approach to comprehensively optimize hyperparameters. By exhaustively evaluating different parameter combinations, the optimal configuration for each algorithm on the dataset used in this study was determined, thereby ensuring an optimal balance between model complexity and generalization capability. To further enhance the reliability and robustness of hyperparameter selection, a five-fold cross-validation strategy was implemented exclusively on the training set (630 samples) during the hyperparameter optimization phase. The training set was divided into five mutually exclusive subsets, which were used sequentially for internal training and validation to identify the optimal parameter combinations from the grid search. This cross-validation process minimized random biases introduced by data partitioning and effectively improved the generalization ability of the models. After hyperparameter optimization was completed, the final models were trained on the entire training set using the optimal parameters identified through cross-validation. The independent validation set (270 samples), which was held out entirely during the hyperparameter tuning process, was used solely for final model performance evaluation, ensuring an unbiased assessment of the predictive accuracy of the winter wheat SPAD estimation models.
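The tuning protocol described above (grid search scored by five-fold cross-validation on the training set only, refitting on the full training set, then a single evaluation on the held-out set) can be sketched as follows. To keep the example self-contained, a closed-form ridge regression stands in for the six models of the study, and the data, the parameter grid, and all function names are illustrative assumptions.

```python
import numpy as np
from itertools import product

def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept
    (a simple stand-in model, not one of the models used in the study)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def grid_search_cv(X, y, grid, k=5, seed=0):
    """Return the parameter combination with the lowest mean k-fold RMSE."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), k)
    best_params, best_score = None, np.inf
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        scores = []
        for i in range(k):
            val = folds[i]
            trn = np.concatenate([folds[j] for j in range(k) if j != i])
            w, b = fit_ridge(X[trn], y[trn], **params)
            scores.append(rmse(y[val], X[val] @ w + b))
        if np.mean(scores) < best_score:
            best_params, best_score = params, float(np.mean(scores))
    return best_params, best_score

# Synthetic placeholder data: 900 samples, five index-like features.
rng = np.random.default_rng(1)
X = rng.normal(size=(900, 5))
y = 45 + X @ np.array([3.0, -2.0, 0.5, 0.0, 1.0]) + rng.normal(scale=1.0, size=900)
order = rng.permutation(900)
train, hold = order[:630], order[630:]

# Tune on the training set only, refit on all of it, score once on the hold-out.
best, cv_rmse = grid_search_cv(X[train], y[train], {"alpha": [0.01, 1.0, 100.0]})
w, b = fit_ridge(X[train], y[train], **best)
holdout_rmse = rmse(y[hold], X[hold] @ w + b)
print(best, round(cv_rmse, 3), round(holdout_rmse, 3))
```

The key design point is that the 270 held-out samples never enter `grid_search_cv`, so the final score is not influenced by the hyperparameter choice.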
2.7. Model Evaluation and Performance Assessment
This study utilizes three evaluation metrics, R², RMSE, and RE, to comprehensively assess the accuracy and robustness of the SPAD value estimation models [48]. Specifically, R² quantifies the proportion of variance in the measured SPAD values that is explained by the model, with values approaching 1 indicating superior model performance and predictive capability. RMSE measures the average magnitude of the deviations between predicted and observed values, with smaller RMSE values denoting greater predictive precision. RE expresses the prediction error as a percentage of the observed values, making it particularly suitable for evaluating model performance across varying SPAD value ranges. A lower RE value indicates greater model stability and consistency.
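By jointly analyzing these three metrics, the model’s goodness of fit, the absolute error magnitude, and the relative deviation can be comprehensively evaluated, providing a robust assessment of both the model’s reliability and its practical application value. The formulas used to calculate R², RMSE, and RE are presented as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

$$\mathrm{RE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{y_i} \times 100\%$$

In these formulas, $n$ is the sample size, $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the actual values.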
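For reference, the three metrics can be computed as in the sketch below. R² and RMSE follow their standard definitions. The form used for RE, the mean absolute error relative to each observed value and expressed as a percentage, is an assumption based on the verbal description of RE given above, and the function names are illustrative.

```python
import numpy as np

def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2))

def rmse(y, y_hat):
    """Root mean square error, in the units of the observations."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def relative_error(y, y_hat):
    """Mean absolute error relative to each observation, in percent
    (assumed definition; requires non-zero observed values)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs(y - y_hat) / y) * 100.0)

observed = [40.0, 50.0, 60.0]    # toy SPAD readings
predicted = [42.0, 49.0, 57.0]   # toy model outputs
print(round(r2(observed, predicted), 2),
      round(rmse(observed, predicted), 2),
      round(relative_error(observed, predicted), 2))
# prints: 0.93 2.16 4.0
```

Because RE divides by each observed value, it weights errors on low SPAD readings (for example, senescent leaves at maturity) more heavily than RMSE does, which is why the two metrics can rank models differently.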
3. Results
3.1. Descriptive Statistics
Figure 3 illustrates the temporal variation in SPAD values of winter wheat over the course of the experimental period. As shown, the patterns of SPAD values dynamics exhibited slight differences across irrigation treatments during crop development. Overall, SPAD values initially increased, reaching a peak at the heading stage, followed by a gradual decline. During the flowering and grain filling stages, SPAD values remained relatively stable, whereas a more pronounced decrease occurred during the maturation stage. Despite these temporal trends, differences among treatments were generally not statistically significant, and the overall variation patterns appeared inconsistent. Following the initial irrigation, only the W1 treatment exhibited higher SPAD values compared to the W0 control. At the heading stage, no statistically significant differences were observed among the W1, W2, W3, and W4 treatments.
To illustrate the temporal dynamics of SPAD values across different growth stages of winter wheat, a descriptive statistical plot of the SPAD value measurements is provided in
Figure 4, and the corresponding statistical characteristics for each phenological stage are summarized in
Table 2. The SPAD values of canopy leaves ranged from 1.21 to 67.80, with the average values exhibiting an initial increase followed by a gradual decline throughout the observation period. The highest average SPAD value of 52.37 was observed at the heading stage (T2), as shown in
Table 2. Although SPAD values during the flowering stage (T3) with a mean of 50.78 were slightly lower than those at heading, certain canopy leaves still exhibited values approaching the peak maximum of 62.10, indicating substantial chlorophyll content during this developmental phase.
By the grain filling stage (T4), SPAD values demonstrated a consistent downward trend. During the maturity stage (T5), SPAD values ranged from 1.21 to 67.80 with an average of 29.33. Across all growth stages, stage-mean SPAD values therefore ranged from 29.33 at maturity (T5) to 52.37 at heading (T2), as detailed in
Table 2.
From a statistical distribution perspective, the SPAD data demonstrated distinct stage-specific characteristics as presented in
Table 2. During the heading stage, a slightly positive kurtosis (Kur = 0.12) indicated a marginally leptokurtic distribution, that is, one slightly more peaked around the mean and with slightly heavier tails than a normal distribution; because the value is close to zero, SPAD values at this stage were distributed approximately normally across plants. In contrast, the other growth stages displayed negative kurtosis values, indicative of platykurtic distributions that are flatter and have lighter tails than a normal distribution. The coefficient of variation (CV) was highest during the maturity stage, reflecting increased heterogeneity in chlorophyll content at this developmental stage. Across all stages, skewness values ranged between −1 and 1, indicating that the SPAD distributions were not strongly skewed.
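The distribution statistics discussed above can be reproduced with a few lines of NumPy. The sketch assumes moment-based (population) skewness, excess kurtosis (zero for a normal distribution), and a CV based on the sample standard deviation; the estimators actually used for Table 2 are not stated in the text, and the function name `describe` is illustrative.

```python
import numpy as np

def describe(values):
    """Mean, CV (%), skewness, and excess kurtosis of a 1-D sample."""
    x = np.asarray(values, dtype=float)
    mean = x.mean()
    dev = x - mean
    m2 = np.mean(dev ** 2)
    return {
        "mean": float(mean),
        "cv_percent": float(x.std(ddof=1) / mean * 100.0),
        "skewness": float(np.mean(dev ** 3) / m2 ** 1.5),
        # Excess kurtosis: > 0 is leptokurtic, < 0 is platykurtic.
        "excess_kurtosis": float(np.mean(dev ** 4) / m2 ** 2 - 3.0),
    }

stats = describe([1.0, 2.0, 3.0, 4.0, 5.0])  # toy values, not SPAD data
print(stats)
```

For the evenly spaced toy sample the skewness is zero and the excess kurtosis is negative (−1.3), the platykurtic case: a flat distribution with lighter tails than a normal distribution.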
3.2. Correlation Analysis Between Canopy Hyperspectral Reflectance and SPAD Values Under Different Transformations
In this study, four spectral transformations FD, SD, MSC, and SG were applied to the original spectrum. The results are illustrated in
Figure 5. As shown, the correlation coefficients between the SG-transformed spectral reflectance and SPAD values closely mirror those of the original spectrum. In contrast, the MSC transformation yields a correlation coefficient trend that is largely opposite to that of the original spectrum, while the FD and SD transformations produce correlation curves that differ entirely from the original.
Notably, these spectral transformations amplify the underlying features embedded in the original spectrum, thereby enhancing the visibility of correlations between spectral data and SPAD values. The correlation coefficients between different spectral transformations and SPAD values are summarized as follows: the original spectrum (R) exhibits a correlation range from −0.58 to 0.22, with the maximum correlation observed at 684 nm. The first derivative (FD) transformed spectrum shows a correlation coefficient range from −0.70 to 0.56, peaking at 630 nm. The second derivative (SD) transformation yields a range from −0.54 to 0.61, with the highest correlation at 695 nm. The multiplicative scatter correction (MSC) transformed spectrum demonstrates a correlation range of −0.53 to 0.52, with the peak correlation at 1292 nm. The Savitzky–Golay (SG) transformed spectrum exhibits a correlation range from −0.58 to 0.22, with a peak at 684 nm, similar to the original spectrum.
Among the various spectral preprocessing transformations, FD exhibited the strongest correlation with SPAD values, achieving a peak absolute correlation coefficient of |r| = 0.70 at 634 nm. SD followed closely, with a peak |r| of 0.67 at 696 nm, whereas MSC and SG showed markedly weaker correlations (peak |r| = 0.52 and 0.34, respectively). These correlation patterns are illustrated in
Figure 5b–e. The advantage of FD is reflected in the differences in peak correlation: +0.08 relative to SD, +0.23 relative to MSC, and +0.41 relative to SG. These consistent and graded differences suggest that FD is more effective than the other methods at suppressing spectral noise while enhancing diagnostically relevant spectral features. Based on these results, FD was selected for subsequent spectral index optimization.
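The band-wise screening described above can be sketched as follows. This is an illustrative example only: it assumes a central-difference approximation of the first derivative and a plain Pearson coefficient, and the function names and toy spectra are hypothetical rather than taken from the study's processing chain.

```python
import math

def first_derivative(wavelengths, reflectance):
    """Central-difference first derivative, one value per interior band."""
    return [
        (reflectance[i + 1] - reflectance[i - 1])
        / (wavelengths[i + 1] - wavelengths[i - 1])
        for i in range(1, len(wavelengths) - 1)
    ]

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0

def peak_band(wavelengths, spectra, spad):
    """Return (wavelength, r) of the FD band with the largest |r|."""
    fd = [first_derivative(wavelengths, s) for s in spectra]
    best = None
    for j, wl in enumerate(wavelengths[1:-1]):
        r = pearson([row[j] for row in fd], spad)
        if best is None or abs(r) > abs(best[1]):
            best = (wl, r)
    return best

# Toy data: four hypothetical canopy spectra sampled at five bands (nm).
wavelengths = [600, 610, 620, 630, 640]
spectra = [
    [0.10, 0.10, 0.12, 0.13, 0.15],
    [0.10, 0.10, 0.11, 0.14, 0.15],
    [0.10, 0.10, 0.13, 0.15, 0.14],
    [0.10, 0.10, 0.12, 0.16, 0.16],
]
spad = [30, 40, 50, 60]
print(peak_band(wavelengths, spectra, spad))
```

Ranking bands by |r| rather than by signed r treats strong negative and strong positive relationships equally, which matches the way peak correlations are compared across transformations in this section.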
3.3. Correlation Analysis of SPAD Measurements and Vegetation Indices in Winter Wheat
In this study, Pearson correlation analysis was performed to quantify the relationships between SPAD values and five vegetation indices, as summarized in
Table 3. The maximum absolute correlation coefficients between winter wheat SPAD values and each vegetation index are presented. For the raw vegetation indices (RVI–CIred-edge), correlation coefficients ranged from −0.48 to 0.47. In contrast, the FD transformed vegetation indices (FDRVI–FDCIred-edge) exhibited considerably stronger correlations, with coefficients ranging from −0.69 to 0.54. These findings indicate that first-derivative preprocessing significantly enhances the correlation between SPAD values and vegetation indices.
Specifically, the FDCIred-edge index demonstrated the highest correlation with SPAD, with a maximum coefficient of −0.69. The FDGNDVI followed with a maximum correlation of 0.54, while the FDRVI and FDPRVI indices reached maximum correlations of 0.43 and −0.48, respectively, ranking below FDCIred-edge and FDGNDVI.
Among all evaluated indices, FDCIred-edge consistently showed the strongest association with SPAD values, underscoring its superior ability to capture hyperspectral characteristics related to chlorophyll content. This advantage not only reduces spectral redundancy but also improves model robustness and computational efficiency. Based on descending correlation strength, the indices are ranked as follows:
3.4. Construction of SPAD Estimation Models for Winter Wheat with Different Sets of Vegetation Indices as Inputs
Based on the above analysis, SPAD estimation models were developed by incrementally incorporating vegetation indices as input variables, ranked according to their correlation strength with SPAD. The modeling accuracy associated with different input combinations is presented in
Table 4.
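The incremental input strategy described above can be sketched as follows. The example ranks indices by absolute correlation and builds the top-1, top-2, and larger input sets; it uses the four peak correlations quoted in Section 3.3, and the function name `incremental_input_sets` is a hypothetical helper, not part of the study's code.

```python
def incremental_input_sets(correlations):
    """Rank indices by |r| (descending) and return the top-1, top-2, ... sets."""
    ranked = sorted(
        correlations, key=lambda name: abs(correlations[name]), reverse=True
    )
    return [ranked[:k] for k in range(1, len(ranked) + 1)]

# Peak correlations quoted in Section 3.3 for four FD-based indices.
# The fifth index is left out because its value is not quoted in the text.
fd_index_r = {
    "FDCIred-edge": -0.69,
    "FDGNDVI": 0.54,
    "FDRVI": 0.43,
    "FDPRVI": -0.48,
}

for inputs in incremental_input_sets(fd_index_r):
    print(len(inputs), inputs)
```

Each resulting set would then be passed to a model as its input features, so that accuracy can be compared across input counts as in Table 4.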
Among the six machine learning models evaluated, R2 values ranged from 0.47 to 0.61, RMSE from 8.07 to 9.57, and RE from 24.79% to 36.42%. The GRU model achieved the highest performance with three input variables (R2 = 0.61, RMSE = 8.23, RE = 27.38%), followed by RF with four inputs (R2 = 0.60, RMSE = 8.07, RE = 27.49%) and CNN with three inputs (R2 = 0.60, RMSE = 8.35, RE = 27.63%). Deep-RNN achieved R2 = 0.58 with three inputs, MLP achieved R2 = 0.58 with three inputs, while LSTM showed the lowest performance (R2 = 0.49, RMSE = 9.49–9.57, RE = 35.98–36.42%). The detailed performance metrics across different input configurations are shown in
Table 4.
The RF model ranked second in overall performance, attaining an R2 of 0.60 with four input variables, along with an RMSE of 8.07 and an RE of 27.49%. The CNN model demonstrated comparable performance using three input variables, although its overall accuracy was slightly lower, with RMSE and RE reaching 8.35 and 27.63%, respectively. The Deep-RNN model also performed well with three input variables, achieving an R2 of 0.58, an RMSE of 8.11, and an RE of 24.79%, representing a significant improvement over its baseline R2 of 0.50.
Similarly, the MLP model achieved a maximum R2 of 0.58; however, its error metrics were inferior to those of the Deep-RNN model. Specifically, the MLP model yielded an RMSE that was 0.46 higher and an RE that was 2.14 percentage points greater, indicating larger prediction errors. Across the different numbers of input variables, the MLP model exhibited reductions in RMSE ranging from 0.02 to 0.48 and reductions in RE ranging from 0.85 to 6.58 percentage points. These results further support the conclusion that the configuration incorporating three input variables is optimal for both the Deep-RNN and MLP models.
In contrast, the LSTM model demonstrated relatively limited predictive capability. Its maximum R2 reached only 0.49, with RMSE values ranging from 9.49 to 9.57 and RE values varying between 35.98% and 36.42%, collectively reflecting substantially lower accuracy and stability.
Overall, the results indicate that the GRU model exhibits strong robustness across different input configurations in predicting SPAD values, consistently showing close agreement with observed measurements. Its RMSE values ranged from 8.23 to 8.75, and RE from 27.38% to 33.33%. The optimal predictive performance was achieved when the three variables most strongly correlated with SPAD were used as inputs, suggesting that the GRU-based SPAD estimation model offers the best trade-off between predictive accuracy and model simplicity.
3.5. Winter Wheat SPAD Values and Spectral Index Feature Band Selection
In this study, the correlation matrix method was employed to determine the optimal wavelength combinations for five spectral vegetation indices. For each index, all possible two-band wavelength combinations within the hyperspectral range were computed and individually correlated with SPAD values, and a correlation heat map was plotted, as shown in
Figure 6. In the heat map, color gradients from blue to red represent correlation values ranging from −1 to +1. Optimal wavelength combinations were identified as those producing the maximum absolute correlation coefficient (|r|) for each index, with the specific wavelength pairs (λ1, λ2) corresponding to peak correlation values subsequently listed in
Table 5.
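The exhaustive two-band search behind the correlation matrix can be sketched as follows. This is a minimal illustration, assuming the common normalized-difference form (R1 − R2)/(R1 + R2) for NDSI; the function names and the three-band toy spectra are hypothetical, and a real hyperspectral search would run over every available band pair.

```python
import math
from itertools import permutations

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0

def ndsi(r1, r2):
    """Normalized difference of two band values (assumed NDSI form)."""
    return (r1 - r2) / (r1 + r2)

def best_band_pair(wavelengths, spectra, spad, index=ndsi):
    """Return (lambda1, lambda2, r) for the band pair maximizing |r|."""
    best = (None, None, 0.0)
    # Ordered pairs are used because some index forms are not symmetric.
    for i, j in permutations(range(len(wavelengths)), 2):
        values = [index(s[i], s[j]) for s in spectra]
        r = pearson(values, spad)
        if abs(r) > abs(best[2]):
            best = (wavelengths[i], wavelengths[j], r)
    return best

# Toy data: four hypothetical samples at three bands (nm).
wavelengths = [630, 747, 871]
spectra = [
    [0.10, 0.15, 0.40],
    [0.10, 0.19, 0.30],
    [0.10, 0.23, 0.45],
    [0.10, 0.30, 0.35],
]
spad = [30, 40, 50, 60]
print(best_band_pair(wavelengths, spectra, spad))
```

For a symmetric form such as the normalized difference, swapping λ1 and λ2 only flips the sign of r, so the heat map is mirrored about its diagonal; other index forms can pass a different `index` function.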
The optimal wavelength combinations for each spectral index were determined based on the wavelength positions (λ1 and λ2) at which the correlation coefficients with SPAD values were maximized. The maximum absolute correlation coefficients between winter wheat SPAD values and the various spectral indices are summarized in
Table 5. For indices derived from the original reflectance spectrum (NDSI–DI), the optimal correlation coefficient ranges from 0.669 to 0.699. After applying FD preprocessing, the spectral indices (FDNDSI–FDDI) exhibit improved correlation coefficients, ranging from 0.683 to 0.776. Overall, the use of FD preprocessing significantly enhances the correlation between SPAD values and the spectral indices.
Specifically, for the FDNDSI, the maximum correlation coefficient reached 0.77 when λ1 = 734 nm and λ2 = 630 nm. For the FDSASI, the lowest correlation among FD-based indices was 0.68 under the wavelength combination of 871 nm and 630 nm. For both the FDCSI and FDDI indices, the same wavelength pair of 747 nm and 630 nm produced correlation coefficients of 0.78 and 0.77, respectively. Among all spectral indices derived from first-order derivative processed spectra, the FDCSI displayed the strongest correlation with SPAD, achieving a coefficient of 0.78 with the optimal wavelength pair of 747 nm and 630 nm.
Among these, the FDCSI demonstrates the strongest correlation with SPAD values, suggesting that it most effectively captures relevant hyperspectral features. This not only reduces data redundancy but also enhances model robustness and computational efficiency. The ranking of the indices by correlation coefficient, from highest to lowest, is as follows:
3.6. Construction of a Winter Wheat SPAD Values Estimation Model Based on Optimizing the Number of Vegetation Indices Using Different Inputs
Based on the above analysis, the optimized VIs were ranked according to their correlation with SPAD values, and an increasing number of top-ranked indices were incrementally selected as input variables for constructing the SPAD values estimation model. The modeling accuracy corresponding to different input combinations is presented in
Table 6. The results demonstrate that model accuracy improves as more optimized VIs are used as inputs. However, once the number of input indices exceeds a certain threshold, the R2 value tends to stabilize, indicating diminishing returns.
Model performance with optimized vegetation indices ranged from R2 = 0.55 to 0.72, RMSE = 7.35 to 8.89, and RE = 24.91% to 31.93%, as shown in
Table 6. GRU achieved optimal performance with three inputs, yielding R2 = 0.72, RMSE = 7.37, and RE = 24.90%. Deep-RNN achieved the highest R2 value of 0.70 with four inputs, with corresponding RMSE = 7.35 and RE = 25.71%. CNN and LSTM both performed best with three inputs. For CNN, R2 = 0.70, RMSE = 7.40, and RE = 24.91%; for LSTM, R2 = 0.69, RMSE = 7.51, and RE = 26.19%. RF and MLP showed lower performance, with a maximum R2 = 0.60 for both models.
Figure 7 shows scatter plots of measured vs. predicted SPAD values for all models using three optimized indices.
The CNN model also performed strongly with three spectral indices, achieving an R2 of 0.70, an RMSE of 7.40, and an RE of 24.91%. The LSTM model achieved an R2 of 0.69, representing an improvement over the baseline value of 0.62. The Deep-RNN model reached an R2 of 0.70 with four input variables, along with an RMSE of 7.35 and an RE of 25.71%, demonstrating improved error performance relative to the LSTM model. Compared with other input configurations, the Deep-RNN model achieved reductions of 0.6 in RMSE and 1.38% in RE, indicating that its optimal performance is obtained using four input variables. In contrast, both the CNN and LSTM models exhibited optimal predictive performance when three input variables were used.
The RF and MLP models showed comparatively lower predictive accuracy. The RF model achieved a maximum R2 of 0.60, with RMSE values ranging from 8.39 to 8.89. The MLP model also achieved a maximum R2 of 0.60, with RE values ranging from 27.02% to 28.78%. When three indicators were used as inputs, the RF model achieved an R2 of 0.60, an RMSE of 8.39, and an RE of 27.91%, while the MLP model reached an R2 of 0.60, an RMSE of 8.35, and an RE of 27.02%.
A comparison between predicted and measured SPAD values using the GRU model with varying input counts further confirmed its robustness. The GRU model consistently achieved good agreement with measured values, with RMSE values ranging from 7.37 to 7.78 and RE from 24.90% to 27.58%. The best estimation performance occurred when three VIs were used as inputs, indicating that the GRU-based SPAD values estimation model offers a superior balance between accuracy and model simplicity.
Using five optimized VIs as candidate input variables, six machine learning algorithms (RF, LSTM, MLP, Deep-RNN, GRU, and CNN) were employed to construct SPAD values estimation models. To further evaluate the models’ fitting performance, scatter plots of measured versus predicted SPAD values were generated for each model using the three-VI input configuration. The optimal estimation results for each model were compared against a 1:1 reference line, as illustrated in
Figure 7.
The predictive performance of all six machine learning algorithms was assessed under consistent input conditions to ensure comparability. Model accuracy was evaluated based on three key indicators: R2, RMSE, and RE. Priority was given to models demonstrating the highest R2, along with the lowest RMSE and RE, to identify the most accurate and robust SPAD values estimation method.
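The three accuracy indicators can be computed as in the following sketch. The definitions of R2 and RMSE are standard; the definition of RE used here (mean absolute relative error, in percent) is an assumption, since the formula is not restated in this section, and the observed and predicted values are hypothetical.

```python
import math

def r2_score(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def relative_error(observed, predicted):
    """Mean absolute relative error in percent (assumed definition of RE)."""
    n = len(observed)
    return 100.0 * sum(abs(p - o) / o for o, p in zip(observed, predicted)) / n

# Hypothetical measured and predicted SPAD values, for illustration only.
observed = [40.0, 50.0, 60.0]
predicted = [42.0, 50.0, 57.0]
print(round(r2_score(observed, predicted), 3))
print(round(rmse(observed, predicted), 3))
print(round(relative_error(observed, predicted), 2))
```

Higher R2 together with lower RMSE and RE indicates closer agreement between predicted and measured values, which is the ordering criterion applied when comparing the six models.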
4. Discussion
Analyzing the correlation between canopy spectral characteristics and SPAD values in winter wheat constitutes a fundamental prerequisite for developing reliable SPAD estimation models. In this study, Pearson correlation analyses were initially conducted between canopy spectral reflectance features, spectral position parameters, optimized vegetation indices, and SPAD values throughout the entire growth period. The results (
Figure 5) indicate that different spectral transformations exhibit varying degrees of correlation with SPAD values under diverse treatment conditions, with certain wavelength ranges demonstrating highly significant relationships. These findings are consistent with those reported by Cui et al., suggesting that the integration of multiple spectral transformation methods can effectively enhance the correlation between spectral reflectance and SPAD values. This underscores the importance and necessity of adopting such an approach for accurate SPAD values estimation in winter wheat [
34]. As shown in
Figure 5, the FD transformation yields the strongest correlation between spectral data and SPAD values. This improvement can be attributed to the FD transformation’s capacity to enhance subtle features within the original spectral data. Further examination of the correlation results presented in
Table 5 reveals that vegetation indices optimized through FD preprocessing exhibit highly significant correlations with SPAD values, with the FDCSI index demonstrating superior correlation compared to other optimized indices. Therefore, this study proposes an optimization approach for vegetation indices based on canopy spectral reflectance combined with spectral position characteristics. By integrating various combinations of optimized vegetation indices as input parameters and employing diverse modeling techniques, an efficient SPAD estimation model for winter wheat is developed.
To assess the influence of optimized VIs on SPAD values estimation models, this study analyzed canopy spectral reflectance characteristics and their correlation with SPAD values. Based on prior research experience, spectral bands in the ranges of 350–399 nm, 1351–1399 nm, 1801–1950 nm, and 2451–2500 nm were excluded before preprocessing to reduce noise from water vapor absorption and instrument-related interference [
35]. Previous findings by Hong et al. demonstrated that FD processing can effectively suppress noise, enhance spectral features, and improve model performance [
49]. Accordingly, multiple preprocessing techniques, including FD, SD, MSC, and SG, were applied. All methods influenced both the magnitude and trend of the original spectral curve, with FD preprocessing notably altering spectral reflectance values and enhancing spectral distinctions.
Correlation analysis revealed that various spectral preprocessing techniques enhanced the relationship between winter wheat canopy spectral reflectance and SPAD measurements to differing extents [
34]. Among these methods, FD preprocessing demonstrated the most significant improvement in correlation with SPAD values (peak |r| = 0.70, compared to 0.61 for raw spectra). The spectral bands most strongly correlated with SPAD values following preprocessing were identified as 630 nm, 734 nm, 747 nm, and 871 nm. These wavelengths are consistent with hyperspectral bands previously reported as sensitive indicators of chlorophyll content and crop status, highlighting their importance for SPAD values estimation [
33].
This study systematically compared the effectiveness of spectral indices in estimating SPAD values for winter wheat. The results demonstrate that optimized VIs outperform conventional vegetation indices, primarily due to the optimized selection and combination of sensitive spectral bands. Experimental analyses revealed that, compared with raw spectral reflectance, the correlations between SPAD values and first-derivative (FD) spectral indices were substantially enhanced following spectral transformation. Consistently, Tang et al. [
21] also reported that spectral indices processed using FD transformation exhibited strong correlations with various physiological parameters. Among the conventional vegetation indices, FDCIred-edge exhibited the strongest correlation with SPAD, with a maximum absolute correlation coefficient of 0.69. In contrast, among optimized VIs, FDCSI demonstrated the highest performance, with an optimal band combination at 747 nm and 630 nm, achieving a correlation coefficient of 0.78, an increase of 0.09 over the best-performing vegetation index. These findings indicate that optimized VIs offer clear advantages in improving SPAD estimation accuracy and more precisely reflecting the chlorophyll content of winter wheat leaves. The underlying mechanism can be attributed to the enhanced synergistic effects of optimized two-dimensional band combinations, which improve the ability of these indices to capture and characterize spectral information, thereby strengthening their association with SPAD. Consequently, optimized VIs were selected as the primary basis for subsequent modeling and analysis in this study.
It is important to acknowledge the relationship between our canopy-level optimization approach and the established physiological basis of SPAD measurements. The SPAD-502 chlorophyll meter operates by measuring leaf transmittance at 650 nm (red, chlorophyll absorption peak) and 940 nm (near-infrared, reference wavelength), wavelengths that are physiologically meaningful and optimal for direct leaf-level measurements [
50]. However, our study addresses canopy-level SPAD estimation from reflectance measurements under field conditions, which presents different challenges. At the canopy scale, reflectance is influenced by canopy architecture, multiple scattering, soil background, and shadow effects, all of which are absent from leaf-level transmittance measurements. Our optimization approach identified wavelength combinations that maximize SPAD-reflectance correlation while accounting for these canopy-level complexities. The optimized bands (e.g., 747 nm and 630 nm for FDCSI) achieved substantially higher correlation (r = 0.78) than traditional indices, demonstrating that canopy-level remote sensing benefits from wavelength optimization while remaining consistent with established SPAD physiological principles.
An important finding of this study is that the performance gains of complex deep learning architectures over simpler baseline models are modest. The GRU model achieved the highest validation R2 of 0.72, which is 0.12 above the RF and MLP models (both R2 = 0.60) and only 0.02 above the CNN model (R2 = 0.70). This modest advantage must be weighed against the increased computational complexity, reduced interpretability, and greater data requirements of recurrent architectures. Our dataset of 900 samples, while substantial for agricultural field experiments, is relatively limited by deep learning standards, which typically require thousands to tens of thousands of samples for reliable generalization. Although we implemented cross-validation and regularization strategies to mitigate overfitting, the marginal performance improvements suggest that the added complexity of deep learning may not be justified for SPAD estimation applications with dataset sizes typical of field agronomic experiments. For practical agricultural applications, simpler models such as RF or MLP may offer more reliable, interpretable, and computationally efficient alternatives. The deep learning approaches evaluated here should be viewed as exploratory comparisons demonstrating that increased model complexity does not necessarily translate to proportional performance gains, which is itself a valuable methodological insight for the crop monitoring community. Future work with larger multi-site, multi-season datasets would be needed to fully evaluate whether deep learning architectures can provide more substantial advantages for SPAD estimation.
The findings suggest that using three optimized VIs provides a more balanced and informative input set for SPAD values estimation. Models using fewer VIs may suffer from information insufficiency and saturation effects, reducing their predictive accuracy. A single VI cannot fully represent vegetation growth conditions due to its limited bandwidth and information content. In contrast, incorporating multiple optimized VIs derived from different wavelength combinations allows for a more comprehensive representation of complementary spectral features related to different canopy biochemical and biophysical properties, leading to improved model robustness [
51].
The comparative evaluation of six machine learning algorithms revealed that deep learning architectures, including GRU, LSTM, Deep-RNN, and CNN, achieved only modest performance improvements over simpler baseline models such as RF and MLP, with R2 gains ranging from 0.09 to 0.12 (R2 = 0.69–0.72 versus 0.60). While our dataset of 900 samples is substantial for agricultural field experiments, it remains relatively limited by deep learning standards. To mitigate overfitting risks, we implemented a validation framework consisting of five-fold cross-validation, systematic hyperparameter optimization, and a dropout regularization rate of 0.2, alongside independent validation on a 30% hold-out dataset.
The modest performance gains, combined with reasonable validation set performance characterized by R2 values between 0.69 and 0.72, suggest that these models are not experiencing severe overfitting but are rather approaching the practical limits of the current dataset size. These findings provide critical insights, indicating that for SPAD estimation using field-scale datasets of this magnitude, the added complexity of deep learning may yield diminishing returns compared to simpler and more interpretable models like RF. This empirical comparison serves as a valuable reference for the agricultural remote sensing community in guiding algorithm selection for similar diagnostic applications.
This study adopted a two-stage design to separate the choice of analytical framework from rigorous model validation. In the first stage, we conducted a preliminary comparison of multiple spectral preprocessing methods, including FD, SD, MSC, and SG, through correlation analysis of the full dataset to determine the most effective transformation for enhancing SPAD-related spectral features. This comparison addressed a fundamental analytical question, namely which spectral transformation best reveals SPAD-related information within winter wheat canopy spectra; it was not used to train predictive models or to optimize specific model parameters. The results indicated FD as the most promising approach, achieving a peak absolute Pearson correlation coefficient of 0.70 with SPAD values, higher than the other methods. This finding guided our decision to focus subsequent modeling efforts on FD-preprocessed spectra and FD-based optimized vegetation indices.
In the second stage, all subsequent model development strictly maintained the independence of training and validation processes. The dataset was randomly partitioned into a training set comprising 70% of the data (630 samples) and a validation set comprising the remaining 30% (270 samples). All preprocessing methods, wavelength optimization, vegetation index construction, model training, and hyperparameter tuning were performed on the training set, with five-fold cross-validation applied to the training data. The validation set remained entirely independent throughout the modelling process and was used solely for final performance evaluation.
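The partitioning scheme described above can be sketched as follows: a random 70/30 split of 900 samples, followed by five-fold cross-validation restricted to the training portion. The function name `split_and_folds`, the fixed seed, and the interleaved fold assignment are illustrative assumptions rather than the study's actual implementation.

```python
import random

def split_and_folds(n_samples=900, train_fraction=0.7, n_folds=5, seed=0):
    """Return train indices, validation indices, and CV (fit, held_out) pairs."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(n_samples))
    rng.shuffle(indices)

    n_train = round(n_samples * train_fraction)
    train, validation = indices[:n_train], indices[n_train:]

    # Folds are drawn from the training set only, so the validation set
    # never influences model fitting or hyperparameter tuning.
    folds = [train[k::n_folds] for k in range(n_folds)]
    cv_splits = []
    for k in range(n_folds):
        held_out = folds[k]
        fit = [i for j, fold in enumerate(folds) if j != k for i in fold]
        cv_splits.append((fit, held_out))
    return train, validation, cv_splits

train, validation, cv_splits = split_and_folds()
print(len(train), len(validation))             # 630 270
print([len(h) for _, h in cv_splits])          # [126, 126, 126, 126, 126]
```

With 630 training samples, each fold holds out 126 samples and fits on the remaining 504, while the 270 validation samples are reserved for the final evaluation.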
This study has several limitations that warrant clarification. Firstly, the dataset comprises only 900 samples collected from a single experimental site during the 2023–2024 growing season. Although the experimental design incorporated variability through five irrigation treatments, two varieties, and five growth stages, differences in environmental factors such as soil type, climatic conditions, and management practices still limit the model’s generalizability across different regions. Secondly, SPAD values are indirect indicators of chlorophyll content and are influenced by leaf structure and nitrogen status. Furthermore, while the ground-based hyperspectral instrument employed (FieldSpec 3) offers high spectral fidelity, it remains constrained to point or small plot scales and lacks the spatial scalability required for field-level monitoring. UAV hyperspectral imaging technology offers a viable pathway to overcome spatial coverage limitations by enabling rapid, large-scale data acquisition. Existing research indicates that establishing correlations between ground spectral measurements and UAV observations can effectively spatialize crop field variability [
52]. To address these issues, future research should focus on systematically evaluating the transferability of this spectral optimization framework to UAV platforms, resolving platform-specific challenges such as atmospheric effects and radiometric calibration, thereby enabling rapid, large-scale monitoring using UAV imagery. Furthermore, comprehensive validation across multiple locations and seasons, different genotypes, and environmental conditions is needed to determine the optimal dataset size for developing a general model. Integration of yield data is also essential to deepen the understanding of the relationship between SPAD values and crop productivity.
5. Conclusions
This study systematically analyzed the hyperspectral reflectance characteristics of winter wheat canopies under varying irrigation conditions. By concurrently collecting field SPAD measurements and canopy hyperspectral reflectance data, it investigated the response relationship between spectral features and SPAD values under multiple spectral transformation methods. Through optimization of spectral band combinations, vegetation indices strongly correlated with SPAD values were identified. Based on these indices, six machine learning prediction models (RF, LSTM, MLP, Deep-RNN, GRU, and CNN) were constructed. Ultimately, a hyperspectral estimation framework for SPAD values integrating spectral features and optimized vegetation indices was established.
The main conclusions of this study are as follows:
(1) All four spectral transformation methods applied to the original canopy reflectance data resulted in varying degrees of modification to the spectral characteristics. Among these methods, the FD transformation exhibited the strongest correlation with SPAD values. Specifically, the FD transformation chlorophyll spectral index (FDCSI) demonstrated the highest correlation coefficient.
(2) Through hyperspectral band screening, several wavelengths were identified as significantly correlated with SPAD values, including 590 nm, 591 nm, 629 nm, 630 nm, 631 nm, 734 nm, 747 nm, and 871 nm. These bands were used as feature combinations for the construction and optimization of vegetation indices.
(3) When using different numbers of optimized vegetation indices as model inputs, it was found that inputting the top three indices with the highest correlation coefficients yielded accurate SPAD value estimations. Among the six tested models, the GRU model exhibited the best performance, achieving an R2 of 0.72, RMSE of 7.46, and RE of 25.28%.
These findings provide a robust methodological basis for the non-destructive estimation of chlorophyll content in winter wheat and offer valuable insights for precision agricultural monitoring using hyperspectral remote sensing and deep learning techniques.