UAV Hyperspectral Estimation of Malus sieversii Canopy SPAD Index Using Transformer-LSTM

Zhang, Zhicong; Jiang, Zhicheng; Liu, Wenxin; Han, Yaxin; Wu, Yunhao; Cui, Dong; Yang, Haijun

doi:10.3390/horticulturae12060743


by Zhicong Zhang ¹,², Zhicheng Jiang ¹,², Wenxin Liu ¹,², Yaxin Han ¹,², Yunhao Wu ¹,², Dong Cui ¹,²,* and Haijun Yang ¹,²

¹ College of Resources and Environment, Yili Normal University, Yining 835000, China

² Institute of Resources and Ecology, Yili Normal University, Yining 835000, China

* Author to whom correspondence should be addressed.

Horticulturae 2026, 12(6), 743; https://doi.org/10.3390/horticulturae12060743

Submission received: 28 April 2026 / Revised: 6 June 2026 / Accepted: 8 June 2026 / Published: 18 June 2026

(This article belongs to the Special Issue Applications of Artificial Intelligence and Hyperspectral Imaging in Smart Digital Horticulture)


Abstract

Canopy SPAD index is a practical indicator for evaluating the photosynthetic status and health of Malus sieversii, an endangered wild apple resource in Xinjiang. To develop a rapid and non-destructive monitoring approach, 255 canopy samples were collected across the flower fading stage, fruit stage, and fruit mature stage using synchronized UAV hyperspectral imaging and ground SPAD measurements. Spectral preprocessing, feature-band selection, regression modeling, and SHAP interpretation were evaluated using training-set optimization and independent test-set validation. SG-FD produced the strongest preprocessing response, with a maximum absolute correlation coefficient of 0.70. SiPLS reduced 220 effective bands to 84 wavelengths; subsequent CARS, GA, and SPA screening retained 28, 8, and 12 wavelengths, respectively. The SiPLS-CARS-based Transformer-LSTM model achieved the best performance, with R² = 0.91 and RMSE = 2.12 in training and R² = 0.86 and RMSE = 2.47 in testing. SHAP results indicated that red-edge wavelengths and visible sensitive bands contributed most to prediction. The proposed UAV hyperspectral and Transformer-LSTM framework provides an interpretable proof-of-concept method for canopy SPAD index estimation in Malus sieversii and supports non-destructive monitoring of wild fruit forest health.

Keywords: deep learning; UAV hyperspectral imaging; feature band selection; SHAP analysis; SiPLS-CARS

1. Introduction

Malus sieversii is one of the important ancestral species of modern cultivated apple. It is mainly distributed in the Ili River Valley of the Tianshan Mountains in Xinjiang, and plays an important role in studies on apple origin, fruit tree germplasm conservation, and regional ecological security [1,2]. In recent years, due to human disturbance, poor natural regeneration, and the continuous increase in pests and diseases, the area of Xinjiang wild apple forests has been shrinking. Problems such as stand degradation and community aging have become increasingly serious, and some trees have already shown branch dieback or even whole-tree mortality [3]. Therefore, for the conservation and degradation early warning of this endangered wild fruit tree resource, it is important to develop a rapid, non-destructive health diagnostic method suitable for canopy-scale and regional-scale monitoring.

The SPAD value is a commonly used indicator for characterizing the relative chlorophyll status and photosynthetic activity of leaves, and it can reflect plant nutritional status and physiological activity. Previous studies have shown that accurate acquisition of canopy-scale chlorophyll-related parameters can improve the timeliness and spatial representativeness of vegetation health monitoring [4]. From a physiological perspective, changes in chlorophyll are closely related to plant nitrogen supply, carbon–nitrogen resource allocation, and photosynthetic processes [5,6]. Leaf chlorophyll level is also often used as an important proxy for estimating leaf photosynthetic capacity [7]. However, although traditional spectrophotometric methods and handheld chlorophyll meters have high measurement accuracy, their sampling coverage is limited. It is difficult to meet the needs of continuous monitoring of mountainous wild fruit forests under complex terrain, heterogeneous canopy conditions, and multiple phenological stages.

UAV hyperspectral remote sensing can simultaneously acquire continuous spectral information and high-spatial-resolution canopy information, providing an effective approach for non-destructive estimation of vegetation SPAD values or chlorophyll-related parameters. Zhu et al. used UAV hyperspectral data to estimate chlorophyll content in maize and wheat, and pointed out that spatial scale, crop phenotype, and phenological stage all affected model estimation accuracy [8]. Yin et al. conducted chlorophyll mapping of potato using multi-temporal UAV imagery and found that model performance differed clearly among different growth stages [9]. These studies indicate that UAV hyperspectral technology has a good application basis in chlorophyll monitoring. However, most existing studies have focused on field crops or agricultural systems under relatively stable management conditions. In contrast, Malus sieversii is a naturally distributed mountainous wild tree species, with irregular canopy structure, obvious background mixing, and large spectral response differences among phenological stages. Therefore, directly applying existing crop or commercial fruit tree models may not fully characterize the variation in canopy SPAD values of this species. Compared with existing SPAD inversion studies that mainly focus on field crops or managed fruit orchards, the present study addresses a more heterogeneous and conservation-oriented scenario involving an endangered wild fruit tree population. Its innovation lies in extending UAV hyperspectral SPAD estimation to naturally distributed Malus sieversii forests and constructing an interpretable canopy-scale workflow that can support early health diagnosis and conservation management of wild fruit forest resources.

At the methodological level, raw hyperspectral data are usually affected by noise, scattering effects, band redundancy, and strong collinearity among adjacent bands. Therefore, appropriate data processing methods are needed to enhance the useful information related to the target trait. Previous studies have shown that spectral preprocessing and feature band selection can improve spectral quality, reduce redundant variables, and retain sensitive bands closely related to plant pigments or canopy structure [10,11,12,13]. Meanwhile, deep learning models have shown strong potential in handling nonlinear relationships and complex feature interactions in high-dimensional spectral data. Ye et al. developed a hyperspectral deep learning attention model for predicting lettuce chlorophyll content, confirming the role of the attention mechanism in identifying key bands [14]. Yue et al. combined hyperspectral data with a deep learning network for the joint estimation of leaf area index and chlorophyll content, showing that multi-level feature representation can help improve model accuracy [15]. For Xinjiang wild apple, multi-phenological canopy spectra contain not only local continuous changes in spectral shape, but also long-range relationships among visible, red-edge, and near-infrared bands. Therefore, the Transformer + LSTM model is theoretically more suitable for this task: the Transformer can capture global dependencies among different bands, while LSTM can learn local continuous variations in the spectral sequence. Their combination is more likely to adapt to the nonlinear SPAD response under the complex canopy background of Malus sieversii.

Based on the above understanding, this study proposes the following scientific hypothesis: UAV hyperspectral data can effectively characterize the variation in canopy SPAD index of Malus sieversii; after retaining key sensitive bands and reducing redundant information, the Transformer + LSTM model, which combines global dependency modeling and local sequential feature extraction, may improve the estimation of the canopy SPAD index of Malus sieversii under the current proof-of-concept dataset. Around this hypothesis, this study selected the Saha Wild Fruit Forest in Xinyuan County, Ili, Xinjiang, as the study area. UAV hyperspectral imagery and ground-measured canopy SPAD data were synchronously collected during the flower fading stage, fruit stage, and fruit mature stage. This study focuses on three scientific questions: (1) whether the canopy SPAD index of Malus sieversii shows stable sensitive spectral responses in the visible, red-edge, and near-infrared regions; (2) whether appropriate spectral information compression can reduce redundancy while retaining complementary information useful for SPAD estimation; (3) whether Transformer + LSTM can better characterize the nonlinear relationship between spectral features and SPAD values under multi-growth-stage and complex canopy conditions. Through these analyses, this study aims to provide a methodological basis for rapid and non-destructive monitoring of the canopy SPAD index in Malus sieversii, and to offer preliminary technical support for the health assessment and conservation management of endangered wild fruit tree resources.

2. Materials and Methods

2.1. Study Overview

The technical workflow of this study mainly includes data collection, spectral preprocessing, feature selection, model construction, and result interpretation, as shown in Figure 1. During this process, all comparisons of preprocessing methods, correlation analysis, and feature selection are only conducted on the training set. The test set does not participate in these steps and is used only for independent evaluation after the model training is completed.

2.2. Study Area

The Ili River Valley, situated in the western segment of the Tianshan Mountains in northwestern Xinjiang, stands out as China’s sole “moist enclave” within an otherwise arid landscape. The research site was established within the Saha Wild Fruit Forest, located in Kalabula Township, Xinyuan County, Xinjiang (43°27′–43°34′ N, 83°28′–83°35′ E), as shown in Figure 2. Its central zone occupies the southern foothills of Saha Mountain, a low-elevation mountainous area spanning 1100–1600 m above sea level. This forest is one of the few remaining remnants of the Tertiary-era warm-temperate broadleaf forests [16] and serves as a natural laboratory for investigating the evolutionary history, genetic diversification, biogeographic origins, and paleoclimatic context of temperate fruit species across Central Asia and beyond. Climatically, the region falls under a mid-temperate continental semi-humid regime [17], characterized by an average annual temperature of 6–8 °C and yearly precipitation of 500–800 mm, most of which occurs during spring and summer. In this investigation, in situ SPAD measurements from wild apple leaves were synchronized with UAV-acquired hyperspectral imagery at three key growth stages: flower fading stage (13 May 2025), fruit stage (8 June 2025), and fruit mature stage (27 August 2025).

2.3. UAV Hyperspectral Image Acquisition

Near-canopy hyperspectral images of wild apple trees in the study area were collected using a DJI M350 RTK platform equipped with the GaiaSky-mini3-VN hyperspectral imaging system developed by Dualix (Wuxi, China). This sensor provides 224 bands in the 400–1000 nm range, and the 70th channel (B70) is centered at 553.58 nm, as shown in Table 1. Among the 224 bands, 220 effective bands were retained for later analysis, while the four marginal bands located at the beginning and end of the sensor spectral range were excluded because spectral responses in the edge regions are generally less stable and more susceptible to low signal-to-noise ratio, calibration uncertainty, and illumination interference. Therefore, only bands with relatively stable spectral response were used for spectral preprocessing, correlation analysis, feature selection, and model construction. Before each flight, background calibration and white-panel calibration were completed, and the 220 bands were calibrated again before takeoff. Image acquisition followed pre-set waypoints, with an integration time of 1 ms, a hover time of 12 s, a flight altitude of 80 m, and a ground speed of 5 m s⁻¹. The spectrometer was mounted vertically downward with a field of view of 30°. Forward overlap and side overlap were both set to 60%, and waypoint overlap and line overlap were also kept at 60%.

2.4. Ground Data Acquisition and Sample Division

Canopy-level SPAD readings for Malus sieversii were collected synchronously with UAV hyperspectral image acquisition. In the field, sample trees were selected according to crown visibility in UAV hyperspectral images, accessibility for ground measurement, and representativeness of canopy growth status. Trees with severe mechanical damage, obvious pest or disease symptoms, heavy crown overlap with neighboring trees, strong shadow interference, or poor correspondence between RTK coordinates and UAV imagery were excluded. To reduce sampling bias, the selected sample trees were spatially distributed across the study area rather than concentrated in a single local plot.

Each sample tree was geolocated using real-time kinematic GPS, referenced to the WGS-84 datum, to ensure spatial matching between ground measurements and UAV imagery. For each selected tree, 12 fully developed and disease-free leaves were measured, with three leaves collected from each cardinal direction of the canopy: north, south, east, and west. Two non-veinal SPAD readings were taken on the central lamina of each leaf using a SPAD-502 Plus (Konica Minolta; Tokyo, Japan) chlorophyll meter. The two readings were first averaged to obtain the leaf-level SPAD value, and the mean value of the 12 leaves was then calculated as the canopy SPAD index of the corresponding sample tree.
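The two-step averaging described above (two readings per leaf, then 12 leaves per tree) can be sketched as follows. This is only an illustration: the function name `canopy_spad`, the dictionary layout, and the example readings are hypothetical and are not taken from the study's data.

```python
from statistics import mean


def canopy_spad(readings_by_direction):
    """Average paired SPAD readings per leaf, then average the 12 leaves.

    readings_by_direction maps a cardinal direction to a list of three
    leaves, each given as a (first_reading, second_reading) tuple.
    """
    leaf_values = []
    for leaves in readings_by_direction.values():
        for first, second in leaves:
            # Leaf-level SPAD: mean of the two non-veinal readings.
            leaf_values.append((first + second) / 2.0)
    if len(leaf_values) != 12:
        raise ValueError("expected 12 leaves (3 per cardinal direction)")
    # Canopy SPAD index: mean of the 12 leaf-level values.
    return mean(leaf_values)


# Made-up readings for one tree, for illustration only.
example_tree = {
    "north": [(40.0, 42.0), (41.0, 43.0), (39.0, 41.0)],
    "south": [(44.0, 46.0), (45.0, 47.0), (43.0, 45.0)],
    "east": [(42.0, 44.0), (42.0, 42.0), (44.0, 44.0)],
    "west": [(40.0, 40.0), (41.0, 41.0), (42.0, 42.0)],
}
tree_spad = canopy_spad(example_tree)
```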

A total of 255 canopy samples were collected across three growth stages, namely flower fading stage, fruit stage, and fruit mature stage, with 85 samples obtained at each stage. To ensure comparable growth-stage composition between the training and test sets, a stratified random splitting strategy by growth stage was adopted. Specifically, within each stage, 68 samples were randomly assigned to the training set and 17 samples were assigned to the independent test set. This resulted in 204 training samples and 51 test samples. The training set was used for preprocessing comparison, correlation analysis, feature selection, model training, and hyperparameter optimization, whereas the test set was used only for final independent evaluation. The statistical characteristics of canopy SPAD values for the full dataset and the representative training-test split are shown in Table 2.
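A minimal sketch of a stage-stratified split with the sample counts given above (85 per stage, 17 held out per stage) is shown below. The function name, the stage labels, and the seed are assumptions for illustration; the study does not state how its split was implemented.

```python
import random
from collections import defaultdict


def stratified_split(stages, n_test_per_stage=17, seed=0):
    """Randomly hold out a fixed number of samples within each growth stage."""
    rng = random.Random(seed)
    by_stage = defaultdict(list)
    for index, stage in enumerate(stages):
        by_stage[stage].append(index)

    train_idx, test_idx = [], []
    for stage in sorted(by_stage):
        members = by_stage[stage][:]
        rng.shuffle(members)
        test_idx.extend(members[:n_test_per_stage])
        train_idx.extend(members[n_test_per_stage:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical stage labels: 85 samples per stage, 255 in total.
stages = ["flower_fading"] * 85 + ["fruit"] * 85 + ["fruit_mature"] * 85
train_idx, test_idx = stratified_split(stages, n_test_per_stage=17, seed=0)
```

Calling the same function with several different seeds would give repeated repartitions with an unchanged training-test ratio.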

To evaluate model stability and reduce the uncertainty associated with a single random split, five repeated stratified random splitting experiments were further conducted. In each repetition, samples were randomly repartitioned within each growth stage while maintaining the same training-test ratio. For each split, the complete modeling workflow, including preprocessing, feature selection, model training, and independent test-set evaluation, was repeated. The mean and standard deviation of R² and RMSE across the five repetitions were calculated for both the training and test sets and reported in Table 6.

2.5. Data Processing

2.5.1. Image Processing

After UAV hyperspectral images were collected, the data went through radiometric correction, atmospheric correction, geometric correction, image mosaicking, and format conversion before further analysis. The mosaicking procedure included aerial triangulation, dense point cloud generation, DEM generation, and orthophoto production.

Radiometric correction was conducted in SpecView 2.9.3.41 using sensor calibration parameters, dark-current correction, and white reference panel data collected before each flight. Atmospheric correction was performed using an empirical-line correction procedure based on reference panel measurements acquired under the same illumination conditions as the UAV imagery. This procedure converted the raw digital number values into canopy reflectance and reduced the influence of illumination variation during image acquisition. Geometric correction and format conversion were then carried out in HiRegistrator, and image mosaicking was performed in Metashape 2.2.1, as shown in Figure 3. ENVI 5.6 + IDL was finally used to extract canopy spectra of the sample trees. Based on the RTK coordinates of each tree center, a 3 × 3 pixel region of interest (ROI) was defined, and the average spectral value of all pixels within the ROI was extracted.
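The study extracted ROI spectra in ENVI + IDL. Purely as an illustration of the 3 × 3 averaging step, the sketch below does the same with NumPy on a synthetic reflectance cube. The function name and the random cube are assumptions, as is the premise that each tree center has already been converted from RTK coordinates to a pixel row and column.

```python
import numpy as np


def roi_mean_spectrum(cube, row, col, half_window=1):
    """Mean spectrum of a (2*half_window+1)^2 pixel window centred on (row, col).

    cube has shape (rows, cols, bands); half_window=1 gives a 3 x 3 ROI.
    """
    r0, r1 = row - half_window, row + half_window + 1
    c0, c1 = col - half_window, col + half_window + 1
    if r0 < 0 or c0 < 0 or r1 > cube.shape[0] or c1 > cube.shape[1]:
        raise ValueError("ROI falls outside the image")
    window = cube[r0:r1, c0:c1, :]
    # Flatten the spatial window and average pixel-wise for every band.
    return window.reshape(-1, cube.shape[2]).mean(axis=0)


# Synthetic cube standing in for a mosaicked image with 220 bands.
rng = np.random.default_rng(0)
cube = rng.random((20, 20, 220))
spectrum = roi_mean_spectrum(cube, row=10, col=10)
```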

2.5.2. Hyperspectral Preprocessing

Raw UAV hyperspectral reflectance data are commonly affected by sensor noise, illumination variation, canopy structural differences, and scattering effects. Therefore, spectral preprocessing was performed before feature selection and model construction. In this study, Savitzky–Golay smoothing (SG) was first applied to reduce high-frequency noise [18]. Based on the SG-smoothed spectra, multiplicative scatter correction (MSC) [19], first derivative transformation (FD) [20], and reciprocal transformation (RE) [21] were further applied to compare their ability to enhance SPAD-sensitive spectral information. The SG filter was implemented using a window length of 19 and a second-order polynomial. These preprocessing methods were evaluated only on the training set to avoid information leakage from the independent test set.
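The four preprocessing variants can be sketched as below, assuming SciPy's `savgol_filter` with the window length and polynomial order stated above. The helper names, the use of the mean spectrum as the MSC reference, the finite-difference derivative, and the synthetic spectra are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.signal import savgol_filter


def sg_smooth(spectra, window_length=19, polyorder=2):
    """Savitzky-Golay smoothing along the band axis."""
    return savgol_filter(spectra, window_length=window_length,
                         polyorder=polyorder, axis=1)


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum."""
    reference = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        # Fit row = intercept + slope * reference, then undo both terms.
        slope, intercept = np.polyfit(reference, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected, reference


def first_derivative(spectra, wavelengths):
    """First derivative of reflectance with respect to wavelength."""
    return np.gradient(spectra, wavelengths, axis=1)


def reciprocal(spectra):
    """Reciprocal transformation (1 / reflectance)."""
    return 1.0 / spectra


# Synthetic, strictly positive spectra: 10 samples x 220 bands.
rng = np.random.default_rng(0)
wavelengths = np.linspace(400.0, 1000.0, 220)
base = 0.30 + 0.15 * np.sin(wavelengths / 90.0)
raw = (np.outer(rng.uniform(0.8, 1.2, size=10), base)
       + rng.normal(0.0, 0.005, size=(10, 220)))

smoothed = sg_smooth(raw)
sg_msc, msc_reference = msc(smoothed)
sg_fd = first_derivative(smoothed, wavelengths)
sg_re = reciprocal(smoothed)
```

To stay consistent with the training-only evaluation described above, the MSC reference would be computed on training spectra and then passed to `msc` through `reference` for any other spectra.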

2.6. Selection of Hyperspectral Characteristic Parameters

2.6.1. Extraction of Characteristic Wavelengths

Hyperspectral data consist of numerous contiguous wavelength bands, which are typically highly correlated with one another. Accordingly, selecting characteristic wavelengths can reduce spectral redundancy, simplify the model, and enhance predictive accuracy. The synergy interval partial least squares (SiPLS) algorithm partitions the full spectral range into N non-overlapping, contiguous subintervals and systematically evaluates all feasible combinations of these subintervals. The best interval combination is selected based on the lowest cross-validated root mean square error (RMSECV), thereby eliminating irrelevant spectral features while preserving maximal discriminative information [22].

To simultaneously lower the dimensionality of input spectra and boost predictive performance, SiPLS was integrated with three established wavelength selection strategies: competitive adaptive reweighted sampling (CARS) [23], genetic algorithm (GA) [24], and successive projections algorithm (SPA) [25]. Each method was applied to refine the SiPLS-derived interval subsets, identifying a compact yet highly informative set of characteristic wavelengths.

2.6.2. Pearson Correlation Analysis

Pearson correlation coefficients (PCCs) were used to quantify the strength and direction of the linear relationship between each spectral band and the canopy SPAD index, with values ranging from −1 to 1. The correlation analysis was conducted only on the training set. For each band, the significance of the Pearson correlation coefficient was tested using two-tailed Student’s t-tests. Correlations with p < 0.05 were considered statistically significant, whereas correlations with p < 0.01 were considered extremely significant. Therefore, the number of extremely significant bands reported in Table 4 refers to the number of wavelengths satisfying p < 0.01 [26].
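A band-wise version of this analysis can be sketched with `scipy.stats.pearsonr`, which returns the coefficient together with a two-sided p-value. The function name and the synthetic training matrix below are illustrative assumptions.

```python
import numpy as np
from scipy import stats


def band_correlations(X, y):
    """Pearson r and two-sided p-value between each band (column) and SPAD."""
    r = np.empty(X.shape[1])
    p = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        r[j], p[j] = stats.pearsonr(X[:, j], y)
    return r, p


# Synthetic training data: 204 samples, 6 bands, band 2 carries the signal.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(204, 6))
y_train = 45.0 + 4.0 * X_train[:, 2] + rng.normal(scale=1.0, size=204)

r, p = band_correlations(X_train, y_train)
max_abs_r = float(np.max(np.abs(r)))
n_extremely_significant = int(np.sum(p < 0.01))  # bands with p < 0.01
```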

2.7. Model Construction and Evaluation Indices

2.7.1. Model Construction

In this study, we established a comparative modeling framework comprising classical machine learning and modern deep learning approaches for non-invasive estimation of the canopy SPAD index in Malus sieversii. Specifically, the machine learning models included random forest (RF), partial least squares regression (PLSR), and k-nearest neighbors (KNN), while the deep learning models consisted of a Transformer–LSTM hybrid and a CNN–LSTM–Attention architecture. Hyperparameter optimization was conducted using only the training set, and the independent test set was not involved in preprocessing selection, feature selection, parameter tuning, or early stopping. For RF, KNN, and PLSR, candidate hyperparameters were compared using five-fold cross-validation on the training set, and the parameter combination with the lowest validation RMSE was retained. For the deep learning models, the number of layers, hidden units, attention heads, dropout rate, batch size, learning rate, and maximum number of epochs were determined according to validation loss and convergence stability. Dropout and early stopping were used to reduce the risk of overfitting under limited sample conditions. Training was terminated when the validation loss did not decrease for 20 consecutive epochs, and the model with the best validation performance was retained for final evaluation.
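The early-stopping rule (stop after 20 consecutive epochs without a lower validation loss, keep the best epoch) can be sketched in plain Python as follows. The class name and the loss sequence are hypothetical. In a real training loop the model weights would also be saved whenever a new best loss is recorded.

```python
class EarlyStopping:
    """Track validation loss and signal when patience is exhausted."""

    def __init__(self, patience=20):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = None
        self.stale_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.stale_epochs = 0  # improvement resets the counter
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Made-up validation losses: improvement for 3 epochs, then a plateau.
val_losses = [1.0, 0.8, 0.7] + [0.75] * 30
stopper = EarlyStopping(patience=20)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break
```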

All experiments were implemented on the Python–PyTorch platform. The main software environment used for model construction and analysis was Python 3.10.12, PyTorch 2.1.0, NumPy 1.24.3, pandas 2.0.3, scikit-learn 1.3.2, Matplotlib 3.7.2, and SHAP 0.44.1. Computations were performed on a Lenovo ThinkStation P3 workstation equipped with an Intel Core i9-14900K processor, 128 GB RAM, and a 2 TB solid-state drive. Random seeds were fixed across repeated runs to improve computational reproducibility.

Random forest (RF) is an ensemble learning method based on decision trees. It handles high-dimensional feature spaces effectively, and exhibits strong feature selection capability and resistance to overfitting. For the dataset used in this study, the model was configured with the following tuned hyperparameters: 60 decision tree estimators, a maximum depth of 5, a minimum of 8 samples required to split an internal node, a minimum leaf size of 8, a single feature evaluated at each node split, and a reproducible random state set to 5.
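Assuming the model was built with scikit-learn's `RandomForestRegressor`, the stated settings would map onto its parameters roughly as below. The mapping of each description to a parameter name is an interpretation, and the training data here are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rf = RandomForestRegressor(
    n_estimators=60,       # 60 decision trees
    max_depth=5,           # maximum tree depth
    min_samples_split=8,   # samples required to split an internal node
    min_samples_leaf=8,    # minimum leaf size
    max_features=1,        # one feature evaluated at each split
    random_state=5,        # fixed seed for reproducibility
)

# Synthetic stand-in for 204 training samples with 28 selected wavelengths.
rng = np.random.default_rng(0)
X_rf = rng.normal(size=(204, 28))
y_rf = 45.0 + 3.0 * X_rf[:, 0] + rng.normal(scale=0.5, size=204)

rf.fit(X_rf, y_rf)
rf_pred = rf.predict(X_rf[:5])
```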

Partial least squares regression (PLSR) integrates multivariate linear regression with dimensionality reduction, making it particularly well-suited for spectral datasets characterized by severe multicollinearity. Although the sample size in this study exceeded the number of spectral bands, the 220 bands exhibited strong inherent intercorrelations. In such a scenario, leveraging PLSR to derive orthogonal latent variables remained a methodologically sound and effective strategy.

K-nearest neighbors (KNN) served as a baseline model for comparative evaluation. Relying on the principle that spectrally similar samples tend to share comparable SPAD values, the algorithm estimates the SPAD of an unknown sample by computing the mean of the SPAD values from its k nearest neighbors in the training set—identified using Euclidean distance. The optimal k value was selected through five-fold cross-validation.
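Selecting k by five-fold cross-validation could look like the following sketch with scikit-learn's `GridSearchCV`. The candidate values of k, the shuffled fold scheme, and the synthetic data are assumptions, since the text does not list the search grid.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsRegressor

candidate_k = [3, 5, 7, 9, 11]  # hypothetical search grid
search = GridSearchCV(
    KNeighborsRegressor(metric="euclidean"),  # uniform weights: plain mean of neighbors
    param_grid={"n_neighbors": candidate_k},
    scoring="neg_root_mean_squared_error",
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)

# Synthetic training data standing in for selected wavelengths and SPAD.
rng = np.random.default_rng(0)
X_knn = rng.normal(size=(204, 28))
y_knn = 45.0 + 3.0 * X_knn[:, 0] + rng.normal(scale=0.5, size=204)

search.fit(X_knn, y_knn)
best_k = search.best_params_["n_neighbors"]
cv_rmse = -search.best_score_  # scoring is negated RMSE
```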

The Transformer–LSTM model consists of a Transformer encoder and a bidirectional Long Short-Term Memory (BiLSTM) network, designed to jointly capture the global dependency relationships among hyperspectral bands and the local sequential characteristics of spectral data. The input to the model is the spectral sequence obtained after feature selection. First, the input spectra are projected into a 64-dimensional feature space through a linear embedding layer. The embedded features are then fed into a two-layer Transformer encoder, where each layer uses multi-head attention with four heads based on scaled dot-product attention. The hidden dimension of the feed-forward network is set to 128, and the dropout rate is configured as 0.2. Subsequently, the output of the Transformer encoder is further processed by a bidirectional LSTM network with 64 hidden units to learn the continuous variation patterns within the spectral band sequence. Finally, a fully connected layer is used to generate the predicted SPAD value. During model training, the Adam optimizer was adopted with an initial learning rate of 0.001, while the mean squared error loss function (MSELoss) was employed as the optimization objective. The batch size was set to 16, and the maximum number of training epochs was 200. In addition, an early stopping strategy was introduced to mitigate the risk of overfitting under limited-sample conditions. Training was terminated when the validation loss failed to decrease for 20 consecutive epochs, and the model with the best validation performance was retained.
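A minimal PyTorch sketch of an architecture matching this description is given below. Several details are assumptions not stated in the text: each selected band is treated as one token with a scalar input, no positional encoding is added, the BiLSTM output is mean-pooled over the band axis before the final layer, and the encoder is read as using four attention heads. The class name and the random batch are hypothetical.

```python
import torch
from torch import nn


class TransformerLSTM(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=2,
                 ff_dim=128, lstm_hidden=64, dropout=0.2):
        super().__init__()
        # Assumption: each band reflectance is embedded as a separate token.
        self.embed = nn.Linear(1, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=ff_dim,
            dropout=dropout, batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lstm = nn.LSTM(d_model, lstm_hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * lstm_hidden, 1)

    def forward(self, spectra):
        # spectra: (batch, n_bands) -> tokens: (batch, n_bands, d_model)
        tokens = self.embed(spectra.unsqueeze(-1))
        encoded = self.encoder(tokens)
        sequence, _ = self.lstm(encoded)
        # Assumption: mean-pool the BiLSTM outputs across bands.
        return self.head(sequence.mean(dim=1)).squeeze(-1)


torch.manual_seed(0)
model = TransformerLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = nn.MSELoss()

# One illustrative training step on a random batch of 16 spectra x 28 bands.
batch_x = torch.rand(16, 28)
batch_y = torch.rand(16) * 20.0 + 30.0
model.train()
optimizer.zero_grad()
loss = loss_fn(model(batch_x), batch_y)
loss.backward()
optimizer.step()
```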

The CNN–LSTM–Attention model integrates a one-dimensional convolutional neural network (1D-CNN), an LSTM network, and an attention mechanism. The CNN component consists of two one-dimensional convolutional layers, with 64 filters in the first layer and 128 filters in the second layer. Both layers employ a kernel size of 3, a stride of 1, and same-padding to preserve the sequence length. Each convolutional layer is followed by a ReLU activation function and a MaxPooling1D layer with a pooling size of 2, enabling effective local feature extraction, dimensionality reduction, and noise suppression. The extracted convolutional features are subsequently fed into an LSTM network with 64 hidden units to capture the sequential dependency relationships among hyperspectral bands. A self-attention mechanism is further introduced after the LSTM layer, employing Scaled Dot-product Attention to enhance the representation of key spectral bands closely associated with SPAD values. For model training, the Adam optimizer was utilized with a learning rate of 0.001, and the Mean Squared Error loss function (MSELoss) was adopted as the objective function. The batch size was set to 16, and the maximum number of training epochs was fixed at 200. Moreover, a dropout rate of 0.2 and an Early Stopping strategy were incorporated to reduce the risk of model overfitting.

2.7.2. Evaluation Indices

The goodness-of-fit between observed and predicted canopy SPAD values was assessed using two statistical metrics, the coefficient of determination (R²) and the root mean square error (RMSE), as shown in Table 3. Higher R² values indicate stronger explanatory power, whereas RMSE quantifies the typical magnitude of prediction residuals, so lower RMSE values correspond to better predictive performance [27].
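Assuming the standard definitions of both metrics, they can be computed as in the sketch below. The function names and the four example values are illustrative only.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(observed, predicted):
    """Root mean square error of the prediction residuals."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


# Made-up SPAD values: residuals are all +/-1, the observed mean is 43.
observed_spad = [40.0, 42.0, 44.0, 46.0]
predicted_spad = [41.0, 41.0, 45.0, 45.0]
demo_r2 = r_squared(observed_spad, predicted_spad)   # 1 - 4/20 = 0.8
demo_rmse = rmse(observed_spad, predicted_spad)      # sqrt(4/4) = 1.0
```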

2.8. Model Interpretability Tool: SHAP

To enhance the interpretability of the optimal model’s prediction process, this study incorporated SHAP (Shapley Additive Explanations). SHAP quantifies the marginal contribution of each feature to individual predictions and simultaneously characterizes feature importance, the direction of its effect (positive or negative), and how its influence varies across samples. In this work, SHAP heatmaps, feature-importance bar plots, and beeswarm plots were used to visualize the contributions of key bands in the optimal model and to identify the sensitive features that had the greatest influence on canopy SPAD estimation of Malus sieversii [28].
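To make the idea of a marginal contribution concrete, the sketch below computes exact Shapley values for a toy three-feature predictor by enumerating all feature coalitions. This does not reproduce the SHAP library or the explainer used in the study. The predictor, the input, and the all-zero baseline are hypothetical, and exhaustive enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input."""
    n = len(x)

    def coalition_value(subset):
        # Features outside the coalition are set to their baseline value.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                gain = coalition_value(members | {i}) - coalition_value(members)
                phi[i] += weight * gain
    return phi


def toy_predictor(v):
    # Hypothetical model: two linear terms plus one interaction term.
    return 40.0 + 10.0 * v[0] - 5.0 * v[1] + 2.0 * v[0] * v[2]


x_demo = [1.0, 2.0, 3.0]
baseline_demo = [0.0, 0.0, 0.0]
phi_demo = shapley_values(toy_predictor, x_demo, baseline_demo)
```

In this toy case the interaction term is split equally between the two features involved, and the contributions sum to the difference between the prediction at `x_demo` and the prediction at the baseline.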

3. Results

3.1. Canopy SPAD Values of Malus sieversii and Their Correlation with Spectral Preprocessing Methods

A screening model for spectral preprocessing methods was developed using the training set, and internal cross-validation was used to compare the performance of different preprocessing approaches. As shown in Table 4, RMSECV and the R² of the internal validation set were used to evaluate the relative performance of the preprocessing methods at the screening stage. Among the four methods, SG-FD showed the highest maximum absolute correlation coefficient and the largest number of extremely significant bands. It also achieved the lowest RMSECV and the highest internal-validation R². SG-RE ranked second in terms of correlation strength and internal validation performance, followed by SG-MSC and SG.

Table 4. Comprehensive comparison of different preprocessing methods.

Preprocessing Method	Max |r|	Number of Extremely Significant Bands	RMSECV	R² of Internal Validation Set
SG	0.50	120	2.32	0.79
SG-MSC	0.53	145	2.03	0.87
SG-RE	0.56	170	1.89	0.93
SG-FD	0.70	180	1.76	0.95

Figure 4a shows the canopy spectral reflectance curves of Malus sieversii across the three growth stages: flower fading stage, fruit stage, and fruit mature stage. The reflectance curves showed typical vegetation spectral characteristics. In the visible region, reflectance remained relatively low, with a green reflectance peak near 550 nm and an absorption feature around 680–700 nm. In the red-edge region, reflectance increased rapidly, while in the near-infrared region, the reflectance remained relatively high. Differences among the three growth stages were mainly observed in the near-infrared region, where reflectance was highest in the fruit mature stage, intermediate in the fruit stage, and lowest in the flower fading stage.

Figure 4b presents the correlation patterns between SPAD values and spectra after different preprocessing methods. The four preprocessing methods showed different correlation distributions. After SG smoothing, the correlation was relatively continuous across the full spectral range. After SG-MSC correction, stronger correlations were mainly observed in part of the visible and red-edge regions. After SG-RE transformation, the correlation direction was generally reversed compared with the original reflectance pattern. After SG-FD transformation, several high-correlation bands appeared near the red-edge region, especially around 708.04, 716.25, and 727.20 nm. Overall, SG-FD showed the strongest comprehensive performance among the tested preprocessing methods.

3.2. Selection of Characteristic Wavelengths for Canopy SPAD Estimation

3.2.1. Selection of Characteristic Intervals Based on SiPLS

Figure 5 shows the process of characteristic interval selection using SiPLS. In this study, the full spectral range was divided into 10 non-overlapping sub-blocks. Four sub-blocks were selected for model construction, and six latent variables were used in each PLS sub-model. Under this parameter combination, the SiPLS model obtained the lowest RMSECV. The selected sub-blocks were the second, third, fourth, and sixth intervals. The corresponding wavelength ranges were 466.17–518.90 nm, 521.55–575.04 nm, 577.73–631.78 nm, and 691.66–746.36 nm. A total of 84 wavelengths were retained for subsequent feature selection and modeling.

3.2.2. Feature Interval Selection Combining CARS with SiPLS

To further reduce the number of variables selected by SiPLS, CARS was applied to the 84 SiPLS-selected wavelengths. The CARS algorithm was run with 100 iterations of random sampling. As shown in Figure 6a, the number of retained wavelength variables decreased rapidly during the early sampling iterations and then gradually stabilized. Figure 6b shows the change in RMSECV during the CARS selection process. The minimum RMSECV occurred at the 30th sampling iteration. Figure 6c presents the regression coefficient distribution of the selected wavelengths at this iteration.

Finally, 28 characteristic wavelengths were retained by the SiPLS-CARS method: 466.17, 468.79, 471.40, 474.02, 479.27, 481.89, 484.52, 487.16, 497.71, 500.35, 502.99, 553.58, 556.26, 569.67, 572.36, 591.21, 602.01, 604.71, 607.41, 612.82, 615.53, 623.65, 626.36, 708.04, 716.25, 727.20, 729.93, and 740.89 nm.

3.2.3. Feature Interval Selection by Combining GA with SiPLS

GA was further used to screen characteristic wavelengths from the SiPLS-selected spectral intervals. The number of selected variables was varied from 5 to 25, and the GA was tracked over iterations 0 to 14. As shown in Figure 7a, RMSECV initially decreased as the number of variables increased and reached its minimum when eight variables were retained. When the number of variables continued to increase, RMSECV increased and then tended to stabilize. Figure 7b shows the convergence process of the GA. The RMSECV decreased rapidly in the early iterations and then changed only slightly in later iterations. The final SiPLS-GA wavelength subset contained eight characteristic wavelengths: 607.41, 691.66, 697.12, 713.52, 718.98, 735.41, 738.15, and 740.89 nm.

3.2.4. Feature Interval Selection Combining SPA with SiPLS

The 84 wavelengths initially selected by SiPLS were further refined using SPA. To identify the optimal subset size, SPA was configured to evaluate candidate models with 1–15 variables. As shown in Figure 8, RMSECV decreased sharply as the number of variables increased and reached its minimum of 1.913 at 12 wavelengths. Beyond this point, RMSECV increased, suggesting that additional variables contributed redundant information; therefore, 12 wavelengths were retained. The final SiPLS-SPA wavelength set comprised 466.17, 481.89, 500.35, 521.55, 537.53, 566.98, 583.12, 612.82, 629.07, 699.85, 718.98, and 740.89 nm.

3.3. Model Construction and Comparison

Table 5 summarizes the prediction performance of different model and feature-selection combinations. Overall, the model performance varied clearly among different wavelength-selection strategies. Compared with the 84 wavelengths selected by SiPLS alone, the secondary feature-selection methods further changed the prediction accuracy of different models. Among all combinations, the SiPLS-CARS feature subset produced the best overall performance. Under this feature subset, the Transformer–LSTM model achieved the highest test-set accuracy, followed by CNN–LSTM–Attention and RF. The models based on SiPLS-SPA and SiPLS-GA also showed improved performance compared with the basic SiPLS input, but their overall performance was lower than that of the SiPLS-CARS-based models.
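
For reference, the two accuracy metrics can be computed as in the following minimal sketch. It assumes that R² is the coefficient of determination (one minus the ratio of residual to total sum of squares) rather than the squared correlation coefficient. The sample values are hypothetical and are not taken from the dataset.

```python
import math
from typing import Sequence


def rmse(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between measured and predicted SPAD values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values for illustration only.
measured = [40.0, 45.0, 50.0, 55.0]
predicted = [42.0, 44.0, 49.0, 56.0]
toy_rmse = rmse(measured, predicted)     # sqrt(7 / 4)
toy_r2 = r_squared(measured, predicted)  # 1 - 7 / 125
```

Under this definition, both metrics depend on the same residual sum of squares, so for a fixed test set a higher R² implies a lower RMSE.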

Figure 9 shows the fitting results between measured and predicted SPAD values for the six superior model combinations. Among them, SiPLS-CARS + Transformer–LSTM showed the closest distribution of points around the 1:1 line. SiPLS-CARS + CNN–LSTM–Attention also showed good fitting performance, although the point dispersion was slightly greater. The fitting line of SiPLS-CARS + RF showed a gentler slope, indicating larger deviation in part of the high-SPAD range. For the Transformer–LSTM and CNN–LSTM–Attention models based on SiPLS-GA and SiPLS-SPA, the fitted lines showed greater deviation from the 1:1 line than the SiPLS-CARS-based Transformer–LSTM model.

Because Table 5 already reports all model and feature-selection combinations based on one representative random split, Table 6 focuses only on representative combinations with relatively good performance. These include the best SiPLS-CARS + Transformer–LSTM model, comparative models under the same SiPLS-CARS feature subset, and Transformer–LSTM models using the more compact SiPLS-SPA and SiPLS-GA feature subsets. To further evaluate model stability, five repeated stratified random splitting experiments were conducted for these representative combinations. In each repetition, the training and test sets were randomly repartitioned within each growth stage while maintaining the same training-test ratio. The complete modeling workflow was repeated independently for each split. The mean and standard deviation of R² and RMSE for both the training and test sets are reported in Table 6.
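
The repeated stratified splitting procedure can be sketched as follows. The stage labels, the test fraction of 0.3, the seeds, and the per-split scores are hypothetical placeholders, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the exact settings are not restated in this section.

```python
import random
import statistics
from collections import defaultdict
from typing import List, Sequence, Tuple


def stratified_split(
    stages: Sequence[str], test_fraction: float, seed: int
) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx), drawing the test set within each growth stage."""
    rng = random.Random(seed)
    by_stage = defaultdict(list)
    for idx, stage in enumerate(stages):
        by_stage[stage].append(idx)
    train_idx, test_idx = [], []
    for stage in sorted(by_stage):
        members = by_stage[stage][:]
        rng.shuffle(members)  # random repartition inside this stage only
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


def mean_sd(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation, as reported in a 'mean ± SD' table."""
    return statistics.mean(values), statistics.stdev(values)


# Toy labels: 10 samples per stage (hypothetical, not the study's counts).
stages = ["flower fading"] * 10 + ["fruit"] * 10 + ["fruit mature"] * 10
splits = [stratified_split(stages, 0.3, seed) for seed in range(5)]

# Hypothetical per-split test scores, used only to show the aggregation step.
toy_mean, toy_sd = mean_sd([0.84, 0.85, 0.86, 0.85, 0.86])
```

Because sampling is performed separately within each stage, every repetition keeps the same stage composition in the training and test sets while changing which samples fall into each.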

As shown in Table 6, the SiPLS-CARS + Transformer–LSTM model maintained the best overall performance across the five repeated random splits. The standard deviations of both R² and RMSE were relatively small, indicating that the model performance was stable under different random partitions. Compared with CNN–LSTM–Attention and RF under the same SiPLS-CARS feature set, Transformer–LSTM showed higher accuracy and lower prediction error on both the training and test sets. In addition, the SiPLS-CARS-based models showed more stable test-set performance than the models based on the more strongly compressed SiPLS-SPA and SiPLS-GA feature subsets.

Table 5. Prediction performance of different models for canopy SPAD estimation.

Wavelength Selection Method	Number of Variables	Modeling Method	Training R²	Training RMSE	Test R²	Test RMSE
SiPLS	84	Transformer–LSTM	0.84	2.56	0.72	3.46
SiPLS	84	CNN–LSTM–Attention	0.70	3.81	0.69	3.67
SiPLS	84	RF	0.76	3.41	0.62	3.95
SiPLS	84	KNN	0.62	4.12	0.58	4.96
SiPLS	84	PLSR	0.72	3.56	0.62	3.94
SiPLS-CARS	28	Transformer–LSTM	0.91	2.12	0.86	2.47
SiPLS-CARS	28	CNN–LSTM–Attention	0.89	2.36	0.83	2.73
SiPLS-CARS	28	RF	0.87	2.41	0.80	2.97
SiPLS-CARS	28	KNN	0.88	2.38	0.73	3.38
SiPLS-CARS	28	PLSR	0.86	2.64	0.72	2.94
SiPLS-SPA	12	Transformer–LSTM	0.84	2.54	0.76	2.78
SiPLS-SPA	12	CNN–LSTM–Attention	0.84	2.67	0.72	2.96
SiPLS-SPA	12	RF	0.81	2.71	0.68	3.50
SiPLS-SPA	12	KNN	0.79	3.52	0.64	3.81
SiPLS-SPA	12	PLSR	0.80	2.93	0.65	3.66
SiPLS-GA	8	Transformer–LSTM	0.89	2.34	0.77	3.15
SiPLS-GA	8	CNN–LSTM–Attention	0.87	2.50	0.75	3.28
SiPLS-GA	8	RF	0.85	2.66	0.71	3.53
SiPLS-GA	8	KNN	0.81	3.01	0.64	3.94
SiPLS-GA	8	PLSR	0.84	2.76	0.69	3.65

Table 6. Results of five repeated random splitting experiments (mean ± SD).

Model	Feature Selection	Training R²	Training RMSE	Test R²	Test RMSE
Transformer–LSTM	SiPLS-CARS	0.90 ± 0.01	2.18 ± 0.10	0.85 ± 0.01	2.50 ± 0.09
CNN–LSTM–Attention	SiPLS-CARS	0.88 ± 0.02	2.43 ± 0.13	0.82 ± 0.02	2.76 ± 0.15
RF	SiPLS-CARS	0.86 ± 0.02	2.48 ± 0.14	0.79 ± 0.02	3.01 ± 0.18
Transformer–LSTM	SiPLS-SPA	0.83 ± 0.02	2.66 ± 0.17	0.75 ± 0.03	2.96 ± 0.21
Transformer–LSTM	SiPLS-GA	0.88 ± 0.02	2.42 ± 0.16	0.76 ± 0.02	2.91 ± 0.19

3.4. Interpretability Analysis of Transformer–LSTM Inversion Model Based on SHAP Method

The SHAP method was used to analyze the SiPLS-CARS + Transformer–LSTM model, which was identified as the optimal model in this study. The 28 wavelengths selected by SiPLS-CARS were used as input variables. SHAP heatmaps, feature-importance bar plots, and beeswarm plots were generated to evaluate the contribution of each wavelength to model prediction. To highlight the main results, the top 20 wavelengths ranked by the mean absolute SHAP value are shown in Figure 10.

Figure 10a presents the SHAP heatmap of the selected wavelengths. The wavelengths 708.04, 727.20, 716.25, and 729.93 nm ranked highest in the heatmap, while the influence of the remaining wavelengths decreased with their ranking. The direction and magnitude of SHAP values varied among samples, indicating that the contribution of each wavelength was not completely uniform across all samples.

Figure 10b shows the global importance ranking of the top 20 wavelengths based on the mean absolute SHAP value. The wavelength 708.04 nm showed the highest contribution, followed by 716.25, 740.89, 727.20, and 729.93 nm. Other wavelengths showed relatively lower contributions.
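
The global ranking by mean absolute SHAP value can be reproduced from a sample-by-feature matrix of SHAP values, as in the minimal sketch below. The band names and SHAP values are hypothetical toy inputs, not results from this study, and the function name is illustrative.

```python
from typing import List, Sequence, Tuple


def rank_by_mean_abs_shap(
    shap_values: Sequence[Sequence[float]],
    feature_names: Sequence[str],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Rank features by the mean of |SHAP| over samples (rows = samples)."""
    n_samples = len(shap_values)
    importance = [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(len(feature_names))
    ]
    ranked = sorted(zip(feature_names, importance), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical toy SHAP matrix: 3 samples x 3 bands.
toy_names = ["band_A", "band_B", "band_C"]
toy_shap = [
    [0.9, -0.2, 0.1],
    [-1.1, 0.3, -0.1],
    [1.0, -0.1, 0.2],
]
ranking = rank_by_mean_abs_shap(toy_shap, toy_names, top_k=2)
```

Taking the absolute value before averaging prevents positive and negative contributions from canceling, which is why a wavelength whose effect changes sign among samples can still rank highly.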

Figure 10c presents the SHAP beeswarm plot. The core wavelengths showed wider SHAP value distributions than the lower-ranked wavelengths, indicating greater variation in their contribution to prediction among samples. The color distribution also showed that high and low feature values had different effects on the predicted SPAD values for several wavelengths. The dominance of 708.04, 716.25, 727.20, 729.93, and 740.89 nm suggests that the red-edge transition zone carried the most physiologically relevant information for canopy SPAD estimation. These wavelengths are located near the transition from strong red-light absorption by chlorophyll to high near-infrared reflectance associated with leaf internal structure, so their SHAP contributions are consistent with changes in chlorophyll-related canopy photosynthetic status. Visible bands selected by SiPLS-CARS, such as 553.58, 569.67, 602.01, and 607.41 nm, may provide complementary information on pigment absorption and green-red reflectance variation. Therefore, the SHAP results provide both statistical explanation of the model and physiological support for the selected wavelength subset.

4. Discussion

4.1. Effect of Spectral Preprocessing on Canopy SPAD Estimation of Malus sieversii

Spectral preprocessing strongly affected the extraction of SPAD-sensitive information from Malus sieversii canopy spectra. In the results, SG-FD showed better overall performance than SG, SG-MSC, and SG-RE. This indicates that, for the canopy spectra collected in this study, enhancing spectral shape variation was more useful than using smoothing, scatter correction, or reciprocal transformation alone.

This result is closely related to the characteristics of Malus sieversii canopies under natural forest conditions. Unlike commercial apple orchards, Malus sieversii grows in mountainous wild fruit forests, where trees are not arranged in regular rows and canopy structures are often uneven. The spectral signal is therefore more easily affected by crown overlap, background vegetation, local shadows, and differences among growth stages. Under such conditions, the original reflectance intensity may contain both physiological information and non-physiological interference. SG-FD can reduce part of the smooth background trend and highlight local spectral variation, making the spectral response related to SPAD values more distinguishable.
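
The removal of a smooth background trend by SG-FD can be illustrated with a minimal sketch. It assumes a 5-point window and a quadratic polynomial, for which the Savitzky-Golay first-derivative weights are (−2, −1, 0, 1, 2)/10. The window length and polynomial order used in this study are not restated here, and the reflectance values are hypothetical.

```python
from typing import List, Sequence


def sg_first_derivative(reflectance: Sequence[float], spacing: float = 1.0) -> List[float]:
    """First derivative from a 5-point quadratic Savitzky-Golay filter.

    Only interior bands are returned (two bands are lost at each edge).
    `spacing` is the band interval; the window size is an assumed setting.
    """
    weights = (-2, -1, 0, 1, 2)  # least-squares slope weights, normalised by 10 below
    derivative = []
    for centre in range(2, len(reflectance) - 2):
        window = reflectance[centre - 2 : centre + 3]
        slope = sum(w * v for w, v in zip(weights, window)) / (10 * spacing)
        derivative.append(slope)
    return derivative


# Hypothetical spectrum and the same spectrum with a constant brightness offset.
spectrum = [0.10, 0.12, 0.15, 0.19, 0.24, 0.30, 0.37]
offset_spectrum = [value + 0.05 for value in spectrum]

fd_original = sg_first_derivative(spectrum)
fd_offset = sg_first_derivative(offset_spectrum)
```

Because the weights sum to zero, a constant offset such as a uniform brightness difference between crowns leaves the derivative unchanged, so `fd_original` and `fd_offset` agree up to floating-point error.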

Previous studies also support the importance of derivative transformation and red-edge-related spectral variation in chlorophyll or SPAD estimation. Ta et al. [29] reported that first-order derivative spectra were sensitive to chlorophyll variation in apple leaves. Yang et al. [30] found that sample heterogeneity affected SPAD modeling performance. Zhang et al. [31] showed that derivative transformation could enhance subtle spectral differences for chlorophyll estimation. Wang et al. [32] also confirmed the value of red-edge-related spectral metrics in canopy SPAD retrieval. Compared with these studies, the present work focuses on Malus sieversii, an endangered wild apple species growing in a more complex and less managed environment. Therefore, the good performance of SG-FD suggests that derivative-based preprocessing is suitable for extracting SPAD-related information from heterogeneous wild-fruit-forest canopy spectra.

4.2. Significance of Feature-Band Selection Results

The feature-selection results show that SPAD estimation of Malus sieversii requires both dimensionality reduction and information preservation. SiPLS first reduced the full spectral range to several sensitive intervals, but the 84 retained wavelengths still contained adjacent and correlated variables. After secondary selection, CARS retained a smaller but still sufficiently rich wavelength subset, while GA and SPA produced more compact subsets.

The better performance of SiPLS-CARS indicates that the optimal feature set was not the one with the fewest wavelengths, but the one that kept enough complementary spectral information. This is important for Malus sieversii because its canopy spectra are influenced by multiple factors, including leaf pigment status, canopy structure, illumination differences, and growth-stage variation. A very small number of wavelengths may not fully represent these combined effects. By retaining 28 wavelengths, SiPLS-CARS provided a balance between reducing redundancy and preserving useful spectral information.

The selected wavelengths were mainly distributed in the visible and red-edge regions. This distribution is consistent with the general spectral response of SPAD-related traits, but it also reflects the need to use multiple spectral intervals rather than a single sensitive band. In wild apple canopies, the spectral response of the canopy SPAD index is likely expressed through the combined behavior of several wavelength regions. Therefore, the advantage of SiPLS-CARS can be understood as the reconstruction of a compact spectral feature set that still maintains multi-band complementarity. The SHAP ranking further confirms this interpretation. The highest SHAP contributions were concentrated around the red-edge region, especially 708.04–740.89 nm, which is sensitive to the balance between chlorophyll absorption in the red region and multiple scattering in the near-infrared region. This indicates that the Transformer–LSTM model did not rely only on numerical correlations, but captured spectral regions with clear physiological relevance to SPAD variation. The visible wavelengths retained in the SiPLS-CARS subset may provide complementary information about pigment absorption and canopy background effects, which is particularly important for wild apple trees under heterogeneous illumination and crown conditions.

Similar conclusions have been reported in studies on canopy SPAD estimation and hyperspectral feature selection. Huang et al. [33] showed that growth stage and tree species can affect canopy SPAD monitoring. Wei et al. [34] found that shadow and background pixels influence chlorophyll estimation for trees. Chen et al. [35] reported that selected narrow-band combinations can outperform full-band inputs in tree chlorophyll estimation. Zhang et al. [36] also indicated that chlorophyll-sensitive information is often distributed across multiple spectral regions. These studies support the view that feature selection should not only reduce the number of variables, but also retain spectral information that is biologically and statistically useful.

4.3. Applicability Differences Among Models for Canopy SPAD Estimation of Malus sieversii

The model comparison showed that both traditional machine learning and deep learning methods could be used for canopy SPAD estimation of Malus sieversii, but their applicability differed among feature-selection strategies. Traditional models such as RF, PLSR, and KNN provided usable prediction results after feature selection, especially under the SiPLS-CARS input. However, the deep learning models, particularly Transformer–LSTM, showed better overall performance and stability.

The advantage of Transformer–LSTM may be related to the structure of hyperspectral data. Hyperspectral bands are continuous and highly correlated, and useful SPAD-related information may appear not only in local spectral regions but also in the relationships among distant bands. The Transformer component is suitable for capturing global inter-band dependencies, while LSTM can further describe sequential variation along the spectral profile. This combination is therefore suitable for multi-band spectral modeling under complex canopy conditions. In contrast, PLSR is mainly linear, KNN depends strongly on local sample distribution, and RF may have limited ability to represent continuous spectral dependencies across distant wavelength regions.

The particular growth environment of Malus sieversii should also be considered when interpreting model performance. Commercial apple orchards usually have more regular planting patterns, more consistent canopy management, and relatively uniform observation conditions. In contrast, Malus sieversii populations in wild fruit forests grow under natural conditions, with irregular tree crowns, mixed backgrounds, variable illumination, and stronger differences among individual trees. These factors may increase the nonlinearity and heterogeneity of canopy spectral responses. Therefore, a model capable of learning complex feature interactions is more suitable for this species than a model relying only on simple linear relationships or local similarity.

Previous studies have reported that growth stage, sample quality, and modeling strategy can affect chlorophyll or SPAD estimation in fruit trees and crops. Wang et al. [37] found that the optimal model for apple canopy chlorophyll estimation may vary among growth stages. Jiang et al. [38] showed that incorporating phenological information can improve the robustness of chlorophyll prediction. Shi et al. [39] reported that data quality control affects canopy biochemical detection. Yu et al. [40] suggested that growth-stage-specific modeling can improve chlorophyll estimation in fruit trees. Different from these studies, the present work used a unified modeling strategy across three growth stages of Malus sieversii. The stable performance of Transformer–LSTM under repeated random splitting suggests that this model has good adaptability to growth-stage variation and canopy heterogeneity.

In addition, a stratified random splitting strategy by growth stage was adopted during dataset partitioning. This strategy ensured that the training and test sets had comparable growth-stage composition and reduced evaluation bias caused by imbalanced stage distribution. The repeated random splitting results presented in Table 6 provide statistical support for evaluating model stability. The SiPLS-CARS + Transformer–LSTM model showed relatively small variations in both R² and RMSE across the five repetitions, indicating that its performance was less affected by a single random partition. The low standard deviation of the test R² (0.01) and test RMSE (0.09) further indicates that the model maintained similar predictive ability when different samples from the same growth stages were assigned to the training or test sets. This may be related to the fact that the SiPLS-CARS subset retained 28 wavelengths and therefore preserved more complementary spectral information than the more strongly compressed SiPLS-SPA and SiPLS-GA subsets. By contrast, the SiPLS-SPA and SiPLS-GA models showed larger fluctuations in test-set performance, suggesting that excessive wavelength compression may increase sensitivity to sample partitioning under limited sample conditions. This comparison suggests that retaining a moderate number of informative wavelengths is beneficial not only for prediction accuracy but also for generalization stability under repeated partitioning. Together with dropout and early stopping, the repeated experiments helped reduce the risk of overinterpreting one representative split. Therefore, the repeated partitioning results support the internal stability of the proposed workflow, although they should not be interpreted as external validation across years, regions, or independent datasets. Moreover, because the dataset contained only 255 samples and was collected within one growing season, these results should be regarded as proof-of-concept evidence rather than full validation across all possible ecological conditions.

4.4. Innovations, Limitations, and Future Directions

The main contribution of this study lies in presenting a proof-of-concept application of UAV hyperspectral imaging and deep learning for canopy SPAD estimation of Malus sieversii, an endangered wild apple species. Compared with studies on crops or commercial orchards, this study focuses on a naturally distributed wild fruit forest, where canopy structure and spectral background are more complex. The proposed workflow integrates spectral preprocessing, feature-band selection, model comparison, repeated stability evaluation, and SHAP-based interpretation. This provides a preliminary technical route for non-destructive monitoring of canopy SPAD status in Malus sieversii.

The results also highlight the spectral particularity of Malus sieversii. The selected sensitive wavelengths were mainly located in the visible and red-edge regions, but the final model performance depended on the combined contribution of multiple wavelengths rather than a single band. This suggests that the canopy SPAD response of wild apple trees is not expressed by one isolated spectral feature. Instead, it is related to the joint variation of several spectral regions under the influence of natural canopy structure, growth stage, and background conditions. This characteristic distinguishes Malus sieversii from more uniform commercial fruit-tree systems and partly explains why feature selection and nonlinear modeling were both important in this study.

This study still has several limitations. Existing studies have shown that canopy shadow, structural differences, and observation conditions can affect chlorophyll or SPAD inversion in woody plants. Zhang et al. [41] reported that canopy shadow can interfere with chlorophyll extraction in apple trees. Tsuchiya et al. [42] showed that the optimal preprocessing method may vary with model structure. Hou et al. [43] indicated that optimized spectral indices have application potential in apple canopy SPAD estimation. Recent UAV-based studies have also shown that integrating spectral information with texture or structural features can improve SPAD or chlorophyll estimation in horticultural crops and woody plants [44,45]. These findings are relevant to the present study because the current model used only spectral variables, whereas canopy structure and local shadow conditions are important sources of spectral variation in natural wild fruit forests. In addition, a recent hyperspectral study using CARS-type feature selection and deep learning further indicated that feature fusion and repeated validation are useful for improving model reliability under limited sample conditions [46]. Compared with these studies, the present work was based on samples from one region and one growing season. Therefore, the generalizability of the model across different years, regions, and environmental conditions still needs further testing.

Another limitation is that the current model inputs were mainly spectral variables. Canopy structural features, textural information, and shadow-related indicators were not explicitly incorporated. These variables may be useful for improving model robustness in natural wild fruit forests. In addition, although five repeated random splitting experiments were conducted to evaluate model stability, the number of repetitions and the total sample size remain limited.

It should also be emphasized that this study used SPAD measurements as a non-destructive indicator of relative chlorophyll status rather than chemically extracted chlorophyll concentration. No laboratory spectrophotometric validation or species-specific calibration curve was established between SPAD values and actual chlorophyll concentration for Malus sieversii. Therefore, the results should be interpreted as canopy SPAD index estimation rather than direct estimation of absolute chlorophyll content.

Future studies should expand the sample size across different years, regions, and forest conditions to further test the transferability of the proposed method. Additional features such as canopy structural parameters, texture metrics, shadow indicators, and optimized spectral indices could also be integrated with hyperspectral bands. Moreover, combining SPAD measurements with laboratory chlorophyll extraction would help establish a species-specific calibration relationship for Malus sieversii and improve the physiological interpretation of UAV hyperspectral inversion results.

5. Conclusions

This study took Malus sieversii as the research object and integrated unmanned aerial vehicle (UAV) hyperspectral imaging data with synchronous ground-measured SPAD values. It systematically evaluated the performance of different spectral preprocessing methods, feature-band selection algorithms, and regression models for estimating the canopy SPAD index of Malus sieversii. The main conclusions drawn are as follows:

(1): The choice of spectral preprocessing technique strongly influenced the accuracy of canopy-level SPAD prediction in Malus sieversii. Of the four approaches evaluated (SG, SG-MSC, SG-RE, and SG-FD), the SG-FD combination delivered the most reliable predictive capability, as shown by a stronger correlation with SPAD, a lower RMSECV, and more consistent internal validation results. These results suggest that SG-FD helps isolate the spectral features linked to SPAD-related canopy physiological variation in Malus sieversii.
(2): SiPLS effectively condensed the original hyperspectral variables into several key sensitive spectral intervals. Following additional feature compression using CARS, GA, and SPA, notable differences emerged across the feature selection strategies. Among them, the 28 characteristic wavelengths identified by the SiPLS-CARS hybrid approach yielded the best modeling performance, indicating that accurate canopy-scale SPAD estimation for Malus sieversii depends not only on eliminating redundant spectral information, but also on retaining a sufficient set of complementary, diagnostically informative wavelengths.
(3): Among all model combinations, the SiPLS-CARS + Transformer–LSTM architecture achieved the best overall performance, yielding R² and RMSE values of 0.91 and 2.12 on the training set, and 0.86 and 2.47 on the test set, respectively. This model outperformed the conventional machine learning models (RF, PLSR, and KNN) and the CNN–LSTM–Attention deep learning baseline. The results indicate that a hybrid deep-learning framework integrating global contextual modeling (via Transformer) with sequential feature extraction along the spectral profile (via LSTM) is suitable for capturing the complex nonlinear relationships inherent in canopy-level hyperspectral data of Malus sieversii. Meanwhile, the five repeated random splitting experiments showed that model performance was stable across partitions, indicating that the combination of Transformer–LSTM and SiPLS-CARS feature selection has potential for multi-temporal estimation of canopy SPAD values in Malus sieversii, although further validation on independent datasets is still needed.
(4): SHAP analysis showed that the features with the largest contributions were mainly distributed in the red-edge region and part of the visible-light-sensitive region, which is generally consistent with the typical spectral response pattern of chlorophyll. This indicates that the optimal model not only had strong predictive ability, but also showed a certain degree of spectral-physiological interpretability.
(5): In summary, the technical framework developed in this study, which integrates UAV-based hyperspectral imaging, spectral feature-band selection, and deep learning, provides a feasible methodological foundation for rapid, non-destructive monitoring of the canopy SPAD index in Malus sieversii. It may also serve as a methodological reference for safeguarding and evaluating the physiological health of genetically valuable, endangered wild fruit tree species. However, its generalizability across different growing seasons, geographic regions, and more complex ecological contexts remains to be further validated.

Author Contributions

Conceptualization, D.C.; methodology, Z.Z.; software, Z.Z.; validation, D.C. and H.Y.; formal analysis, Z.Z.; investigation, Z.Z.; resources, D.C.; data curation, Z.Z., Y.H., Y.W. and Z.J.; writing—original draft preparation, Z.Z.; writing—review and editing, D.C. and W.L.; visualization, Z.Z.; supervision, D.C.; project administration, Z.J.; funding acquisition, D.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Xinjiang Talent Development Fund (XJRC-2025-KJ-PY-KJLJ-063); the Classification Mapping and Key Parameter Inversion of Malus sieversii in the Ili River Valley for the Autonomous Region’s Postgraduate Practical Innovation Project (XJ2026G256); the Ili Kazakh Autonomous Prefecture Key Research and Technology Development Program (YZD2024A04); the Open Fund of the Institute of Resources and Ecology, Yili Normal University (2024XJPTZD017); and the Xinjiang Yili Forest Ecosystem Observation and Research Station (CARS-25).

Data Availability Statement

The raw UAV hyperspectral images used in this study are related to rare wild apple resources and will continue to be used in the author’s ongoing master’s thesis; therefore, the raw data are not publicly available at this stage. To improve transparency and reproducibility, the processed canopy spectra, SPAD values, and sample-level modeling data may be made available by the corresponding author upon reasonable request, subject to institutional approval and data-use conditions.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT 5.4 and Youdao 11.2.10.0 Translate for language translation and polishing. The authors reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Cornille, A.; Antolín, F.; Garcia, E.; Vernesi, C.; Fietta, A.; Brinkkemper, O.; Kirleis, W.; Schlumbaum, A.; Roldán-Ruiz, I. A multifaceted overview of apple tree domestication. Trends Plant Sci. 2019, 24, 770–782.
Zhang, Y.M.; Zhang, D.Y.; Li, W.J.; Li, Y.M.; Zhang, C.; Guan, K.Y.; Pan, B.R. Characteristics and Utilization of Plant Diversity and Resources in Central Asia. Reg. Sustain. 2020, 1, 1–10.
Rong, X.Y.; Wu, N.; Yin, B.F.; Zhou, X.B.; Zhu, B.J.; Li, Y.G.; Aanderud, Z.T.; Zhang, Y.M. Degradation of wild fruit forests created less diverse and diffuse bacterial communities decreased bacterial diversity, enhanced fungal pathogens and altered microbial assembly in the Tianshan Mountain, China. Plant Soil 2024, 501, 23–38.
Peng, Y.; Nguy-Robertson, A.; Arkebauer, T.; Gitelson, A.A. Assessment of Canopy Chlorophyll Content Retrieval in Maize and Soybean: Implications of Hysteresis on the Development of Generic Algorithms. Remote Sens. 2017, 9, 226.
Luo, J.; Zhou, J.J.; Masclaux-Daubresse, C.; Wang, N.; Wang, H.; Zheng, B. Morphological and Physiological Responses to Contrasting Nitrogen Regimes in Populus cathayana is Linked to Resources Allocation and Carbon/Nitrogen Partition. Environ. Exp. Bot. 2019, 162, 247–255.
Porcar-Castell, A.; Tyystjärvi, E.; Atherton, J.; Van der Tol, C.; Flexas, J.; Pfündel, E.E.; Moreno, J.; Frankenberg, C.; Berry, J.A. Linking chlorophyll a fluorescence to photosynthesis for remote sensing applications: Mechanisms and challenges. J. Exp. Bot. 2014, 65, 4065–4095.
Croft, H.; Chen, J.M.; Luo, X.Z.; Bartlett, P.; Chen, B.; Staebler, R.M. Leaf Chlorophyll Content as a Proxy for Leaf Photosynthetic Capacity. Glob. Change Biol. 2017, 23, 3513–3524.
Zhu, W.; Sun, Z.; Yang, T.; Li, J.; Peng, J.; Zhu, K.; Li, S.; Gong, H.; Lyu, Y.; Li, B.; et al. Estimating Leaf Chlorophyll Content of Crops via Optimal Unmanned Aerial Vehicle Hyperspectral Data at Multi-Scales. Comput. Electron. Agric. 2020, 178, 105786.
Yin, H.; Huang, W.; Li, F.; Yang, H.; Li, Y.; Hu, Y.; Yu, K. Multi-temporal UAV Imaging-Based Mapping of Chlorophyll Content in Potato Crop. PFG-J. Photogramm. Remote Sens. Geoinf. Sci. 2023, 91, 91–106.
Song, D.; Gao, D.; Sun, H.; Qiao, L.; Zhao, R.; Tang, W.; Li, M. Chlorophyll Content Estimation Based on Cascade Spectral Optimizations of Interval and Wavelength Characteristics. Comput. Electron. Agric. 2021, 189, 106413.
Tuerxun, N.; Zheng, J.; Wang, R.; Wang, L.; Liu, L. Hyperspectral Estimation of Chlorophyll Content in Jujube Leaves: Integration of Derivative Processing Techniques and Dimensionality Reduction Algorithms. Front. Plant Sci. 2023, 14, 1260772.
Zhang, Y.; Han, X.; Yang, J. Selection of Optimal Spectral Features for Leaf Chlorophyll Content Estimation. Sci. Rep. 2024, 14, 25598.
Li, Y.; Xu, X.; Wu, W.; Zhu, Y.; Yang, G.; Yang, X.; Meng, Y.; Jiang, X.; Xue, H. Hyperspectral Estimation of Chlorophyll Content in Grape Leaves Based on Fractional-Order Differentiation and Random Forest Algorithm. Remote Sens. 2024, 16, 2174.
Ye, Z.; Tan, X.; Dai, M.; Chen, X.; Zhong, Y.; Zhang, Y.; Ruan, Y.; Kong, D. A Hyperspectral Deep Learning Attention Model for Predicting Lettuce Chlorophyll Content. Plant Methods 2024, 20, 22.
Yue, J.; Wang, J.; Zhang, Z.; Li, C.; Yang, H.; Feng, H.; Guo, W. Estimating Crop Leaf Area Index and Chlorophyll Content Using a Deep Learning-Based Hyperspectral Analysis Method. Comput. Electron. Agric. 2024, 227, 109653.
Wang, M.-T.; Xue, Z.-F.; Tao, Y.; Kan, Z.-H.; Zhou, X.-B.; Liu, H.-L.; Zhang, Y.-M. Spatiotemporal Patterns of Leaf Nutrients of Wild Apples in a Wild Fruit Forest Plot in the Ili Valley, China. BMC Plant Biol. 2024, 24, 684.
Linghu, W.; Lu, Z.; Wang, Y.; Gao, G. The Effects of Globose Scale (Sphaerolecanium prunastri) Infestation on the Growth of Wild Apricot (Prunus armeniaca) Trees. Forests 2023, 14, 2032.
Amigo, J.M.; Santos, C. Chapter 2.1—Preprocessing of hyperspectral and multispectral images. In Hyperspectral Imaging; Amigo, J.M., Ed.; Data Handling in Science and Technology; Elsevier: Amsterdam, The Netherlands, 2019; Volume 32, pp. 37–53.
Wu, Y.; Yuan, S.; Zhu, J.; Tang, Y.; Tang, L. Estimation of Wheat Leaf Water Content Based on UAV Hyper-Spectral Remote Sensing and Machine Learning. Agriculture 2025, 15, 1898.
Wang, N.; Li, S.; Qi, X.; Liu, M.; Yang, J.; Zhou, J.; Yu, L.; Yu, F.; Chen, C.; Wang, Y. Accurate Inversion of Rice LAI Using UAV-Based Hyperspectral Data: Integrating Days After Transplanting and Meteorological Factors. Agriculture 2025, 15, 2335.
Song, Q.; Zhang, W. Estimation and spatial distribution of soil organic carbon content in farmland using unmanned aerial vehicle hyperspectral remote sensing technology. Sci. Rep. 2026, 16, 5480.
Li, Y.; Fang, T.; Zhu, S.; Huang, F.; Chen, Z.; Wang, Y. Detection of olive oil adulteration with waste cooking oil via Raman spectroscopy combined with iPLS and SiPLS. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2018, 189, 37–43.
Li, H.; Liang, Y.; Xu, Q.; Cao, D. Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration. Anal. Chim. Acta 2009, 648, 77–84.
Sanaeifar, A.; Dill-Macky, R.; Curland, R.D.; Reynolds, S.; Rouse, M.N.; Kianian, S.; Yang, C. High-Throughput UAV Hyperspectral Remote Sensing Pinpoints Bacterial Leaf Streak Resistance in Wheat. Remote Sens. 2025, 17, 2799.
Araújo, M.C.U.; Saldanha, T.C.B.; Galvão, R.K.H.; Yoneyama, T.; Chame, H.C.; Visani, V. The successive projections algorithm for variable selection in spectroscopic multicomponent analysis. Chemom. Intell. Lab. Syst. 2001, 57, 65–73. [Google Scholar] [CrossRef]
Li, M.; Wang, W.; Li, H.; Yang, Z.; Li, J. Monitoring of vegetation chlorophyll content in photovoltaic areas using UAV-mounted multispectral imaging. Front. Plant Sci. 2025, 16, 1643945. [Google Scholar] [CrossRef]
Zhang, X.; Yu, H.; Yan, J.; Meng, X. Study on the Detection of Chlorophyll Content in Tomato Leaves Based on RGB Images. Horticulturae 2025, 11, 593. [Google Scholar] [CrossRef]
Miller, T. Explanation in artiffcial intelligence: Insights from the social sciences. Artif. Intell. 2019, 267, 1–38. [Google Scholar] [CrossRef]
Ta, N.; Chang, Q.; Zhang, Y. Estimation of Apple Tree Leaf Chlorophyll Content Based on Machine Learning Methods. Remote Sens. 2021, 13, 3902. [Google Scholar] [CrossRef]
Yang, X.; Yang, R.; Ye, Y.; Yuan, Z.; Wang, D.; Hua, K. Winter wheat SPAD estimation from UAV hyperspectral data using cluster-regression methods. Int. J. Appl. Earth Obs. Geoinf. 2021, 105, 102618. [Google Scholar] [CrossRef]
Zhang, Y.; Chang, Q.; Chen, Y.; Liu, Y.; Jiang, D.; Zhang, Z. Hyperspectral Estimation of Chlorophyll Content in Apple Tree Leaf Based on Feature Band Selection and the CatBoost Model. Agronomy 2023, 13, 2075. [Google Scholar] [CrossRef]
Wang, Q.; Chen, X.; Meng, H.; Miao, H.; Jiang, S.; Chang, Q. UAV Hyperspectral Data Combined with Machine Learning for Winter Wheat Canopy SPAD Values Estimation. Remote Sens. 2023, 15, 4658. [Google Scholar] [CrossRef]
Huang, Y.; Li, D.; Liu, X.; Ren, Z. Monitoring canopy SPAD based on UAV and multispectral imaging over fruit tree growth stages and species. Front. Plant Sci. 2024, 15, 1435613. [Google Scholar] [CrossRef]
Wei, S.; Yin, T.; Yuan, B.; Ow, G.L.F.; Yusof, M.L.M.; Gastellu-Etchegorry, J.-P.; Whittle, A.J. Estimation of chlorophyll content for urban trees from UAV hyperspectral images. Int. J. Appl. Earth Obs. Geoinf. 2024, 126, 103617. [Google Scholar] [CrossRef]
Chen, Z.L.; Wang, X.F.; Qiao, S.J.; Liu, H.; Shi, M.M.; Chen, X.J.; Jiang, H.Y.; Zou, H.M. A leaf chlorophyll content estimation method for Populus deltoides (Populus deltoides Marshall) using ensembled feature selection framework and unmanned aerial vehicle hyperspectral data. Forests 2024, 15, 1971. [Google Scholar] [CrossRef]
Zhang, Y.M.; Ru, G.X.; Zhao, Z.L.; Wang, D.C. Hyperspectral prediction models of chlorophyll content in paulownia leaves under drought stress. Sensors 2024, 24, 6309. [Google Scholar] [CrossRef]
Wang, J.X.; Zhang, Y.; Han, F.; Shi, Z.P.; Zhao, F.; Zhang, F.Z.; Pan, W.Z.; Zhang, Z.Y.; Cui, Q.L. Estimation of canopy chlorophyll content of apple trees based on UAV multispectral remote sensing images. Agriculture 2025, 15, 1308. [Google Scholar] [CrossRef]
Jiang, C.B.; Cheng, Y.; Li, Y.F.; Peng, L.; Dong, G.S.; Lai, N.; Geng, Q.L. Phenology-aware machine learning framework for chlorophyll estimation in cotton using hyperspectral reflectance. Remote Sens. 2025, 17, 2713. [Google Scholar] [CrossRef]
Shi, Z.L.; Wang, L.L.; Yang, Z.L.; Li, J.Z.; Cai, L.W.; Huang, Y.P.; Zhang, H.Y.; Han, L.J. Unmanned aerial vehicle-based hyperspectral imaging integrated with a data cleaning strategy for detection of corn canopy biomass, chlorophyll, and nitrogen contents at plant scale. Remote Sens. 2025, 17, 895. [Google Scholar] [CrossRef]
Yu, M.Y.; Fan, W.F.; Zeng, J.K.; Li, Y.; Wang, L.F.; Wang, H.; Bao, J.P. Growth stage-specific modeling of chlorophyll content in korla pear leaves by integrating spectra and vegetation indices. Agronomy 2025, 15, 2218. [Google Scholar] [CrossRef]
Zhang, C.J.; Chen, Z.B.; Chen, R.Q.; Zhang, W.J.; Zhao, D.; Yang, G.J.; Xu, B.; Feng, H.K.; Yang, H. Retrieving the chlorophyll content of individual apple trees by reducing canopy shadow impact via a 3D radiative transfer model and UAV multispectral imagery. Plant Phenomics 2025, 7, 100015. [Google Scholar] [CrossRef]
Tsuchiya, Y.; Yoshida, K.; Ishiguro, Y.; Kawaki, J.; Yamashita, H.; Ikka, T.; Sonobe, R. Optimizing chlorophyll content prediction in tea leaves via spectral transformations and deep learning. BMC Plant Biol. 2025, 26, 26. [Google Scholar] [CrossRef]
Hou, K.; Hou, K.Y.; Shi, Z.Y.; Lou, W.; Xiao, B.; Li, X. Hyperspectral estimation of apple canopy SPAD values based on optimized spectral indices and CEO-LSSVM. Agronomy 2026, 16, 490. [Google Scholar] [CrossRef]
Wang, R.J.; Tuerxun, N.; Zheng, J.H. Improved estimation of SPAD values in walnut leaves by combining spectral, texture, and structural information from UAV-based multispectral image. Sci. Hortic. 2024, 328, 112940. [Google Scholar] [CrossRef]
Qi, Q.M.; Lu, J.S.; Zhang, J.Y.; Zheng, G.J.; Zhang, Q.Y.; Zhang, F.; Chen, F.D.; Fang, W.M.; Chen, S.M.; Guan, Z.Y. Enhanced UAV-based SPAD values estimation in tea chrysanthemum: An optimized and interpretable machine learning approach integrating spectral and textural information. Smart Agric. Technol. 2025, 12, 101449. [Google Scholar] [CrossRef]
Wu, X.M.; Zhong, L.W.; Ding, R.; Wang, C.H.; Chen, H.C.; Zhong, S.H.; Gu, R. Non-destructive estimation of SPAD and biomass in Lamiophlomis rotata using hyperspectral imaging and deep learning with DRSA-CARS feature selection. Front. Plant Sci. 2025, 16, 1640779. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Overall workflow of the study.

Figure 2. Overview of the study area.

Figure 3. Workflow of hyperspectral image mosaicking.

Figure 4. (a) Spectral curves at different stages; (b) Correlation analysis between SPAD values and reflectance after different spectral transformations.

Figure 5. SiPLS screening process. (a) Bar chart of the root mean square error of cross-validation (RMSECV) for SiPLS; (b) Feature-band interval screening for SiPLS.

Figure 6. CARS screening process. (a) Changes in the number of wavelength variables; (b) Changes in RMSECV; (c) Trends of variable regression coefficients.

Figure 7. GA screening process diagram. (a) GA variable curve analysis; (b) GA iteration process.

Figure 8. SPA variable selection process.

Figure 9. Measured versus predicted SPAD values for the six best-performing models.

Figure 10. Interpretability analysis of the Transformer-LSTM model. (a) Heatmap; (b) Feature importance bar chart; (c) SHAP beeswarm plot.

Table 1. Main parameters of the hyperspectral sensor.

Parameter	Value
Spectral Range	400–1000 nm
Spectral Resolution	5.5 nm
Number of Spatial Channels	1024
Number of Spectral Channels	448 (1X), 224 (2X)
Spectral Sampling Interval	2.7 nm (224 channels); 1.4 nm (448 channels)
Image Resolution	1024 × 1003
Imaging Lens	16 mm, 25 mm
Image Bit Depth	12 bit
Operating Voltage	12 V

Table 2. Statistical characteristics of canopy SPAD values in the full dataset and representative training-test split.

Sample Set	Sample Number	Maximum SPAD Value	Minimum SPAD Value	Average SPAD Value	Standard Deviation	Coefficient of Variation (%)
Total Sample Set	255	57.775	27.800	41.152	6.896	16.758
Training Set	204	57.775	27.800	40.937	6.961	17.004
Test Set	51	53.000	30.900	42.011	6.627	15.775
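
The coefficient of variation in Table 2 is the standard deviation expressed as a percentage of the mean. The short sketch below recomputes it from the rounded means and standard deviations reported in the table and checks the split proportion. The function name `coefficient_of_variation` and the dictionary layout are illustrative assumptions, not part of the original workflow.

```python
# Values copied from Table 2: (sample number, mean SPAD, standard deviation, reported CV %).
table2 = {
    "Total Sample Set": (255, 41.152, 6.896, 16.758),
    "Training Set": (204, 40.937, 6.961, 17.004),
    "Test Set": (51, 42.011, 6.627, 15.775),
}


def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage (hypothetical helper)."""
    return std_dev / mean * 100.0


for name, (n, mean, std_dev, reported_cv) in table2.items():
    cv = coefficient_of_variation(std_dev, mean)
    # Small differences in the third decimal are expected because the
    # table reports the mean and standard deviation already rounded.
    print(f"{name}: n={n}, recomputed CV={cv:.3f}%, reported CV={reported_cv:.3f}%")

# Share of samples assigned to the training set.
train_fraction = table2["Training Set"][0] / table2["Total Sample Set"][0]
print(f"Training fraction: {train_fraction:.2f}")
```

The recomputed values agree with the reported ones to within rounding, and the 204:51 division corresponds to a 4:1 training-to-test ratio.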

Table 3. Accuracy metrics used for model evaluation.

Accuracy Indicators	Equation
Coefficient of determination	$R^{2} = 1 - \frac{\sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i = 1}^{n} (y_{i} - \bar{y})^{2}}$ (1)
Root mean square error	$RMSE = \sqrt{\frac{\sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}{n}}$ (2)

Note: n represents the total number of samples; i = 1, 2, 3, …, n; y_i and ŷ_i denote the measured and predicted SPAD values of the i-th sample, respectively; and ȳ represents the mean measured SPAD value.
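
As an illustration of Equations (1) and (2), the following minimal sketch computes both metrics with the Python standard library only. The function names and the three sample values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from this study.

```python
import math


def r_squared(measured, predicted):
    """Coefficient of determination, Equation (1)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))
    ss_tot = sum((y - mean_measured) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error, Equation (2)."""
    n = len(measured)
    squared_errors = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))
    return math.sqrt(squared_errors / n)


# Hypothetical SPAD values, for illustration only.
measured = [30.0, 40.0, 50.0]
predicted = [32.0, 40.0, 48.0]

# Residual sum of squares = 4 + 0 + 4 = 8; total sum of squares = 100 + 0 + 100 = 200.
print(f"R2 = {r_squared(measured, predicted):.2f}")   # 1 - 8/200 = 0.96
print(f"RMSE = {rmse(measured, predicted):.3f}")      # sqrt(8/3), about 1.633
```

Because the denominator of Equation (1) uses the mean of the measured values within the evaluated set, R² computed on the training and test sets is referenced to different means, which is one reason the two sets are reported separately.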
