Article

Modeling Spectral–Temporal Information for Estimating Cotton Verticillium Wilt Severity Using a Transformer-TCN Deep Learning Framework

1
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2
University of Chinese Academy of Sciences, Beijing 100049, China
3
Xinjiang Production and Construction Corps Oasis Eco-Agriculture Key Laboratory, College of Agriculture, Shihezi University, Shihezi 832003, China
*
Author to whom correspondence should be addressed.
Current address: National Engineering Research Center of Satellite Remote Sensing Applications, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China.
Remote Sens. 2026, 18(8), 1105; https://doi.org/10.3390/rs18081105
Submission received: 3 March 2026 / Revised: 29 March 2026 / Accepted: 31 March 2026 / Published: 8 April 2026

Highlights

What are the main findings?
  • A leaf-level hyperspectral time-series dataset was established across VW development.
  • A novel Transformer-TCN model was designed for intelligent VW severity estimation.
What are the implications of the main findings?
  • Temporal information was shown to markedly improve VW severity inversion accuracy.
  • Blue band-related index BRI was identified as a robust temporal marker of VW progression.

Abstract

Hyperspectral remote sensing provides essential biochemical and structural information for crop disease monitoring, yet its application to cotton Verticillium wilt has largely focused on single-period evaluations or multi-temporal classifications. Such approaches overlook the progressive nature of this vascular disease, whose pigment, water, and mesophyll responses evolve over time, making temporal hyperspectral information critical for reliable severity estimation but still insufficiently utilized. To overcome this limitation, we conducted daily time-series observations on cotton leaves and collected 2895 hyperspectral reflectance measurements and 770 high-resolution RGB images together with disease severity records, generating a temporally dense spectral-severity dataset spanning symptom-free to severe stages. Five categories of disease-related vegetation indices were derived and organized into 5-day spectral–temporal slices. Based on these features, we introduce a dual-branch Transformer-TCN model that integrates global temporal dependencies captured by self-attention with local temporal variations resolved by dilated causal convolutions for severity inversion. The model delivers the strongest performance with an R2 of 0.8813, exceeding multiple single-structure and hybrid time-series alternatives by 0.0446–0.1407 in R2, equivalent to a relative improvement of 5.33–19.00%. Temporal spectral features also outperform their non-temporal counterparts, highlighting that disease progression dynamics captured by time-series spectra are critical for reliable severity retrieval. Feature contribution analysis indicates that the blue–red index (BRI) provides the highest contribution, consistent with the single-index time-series modeling results. Photosynthesis- and water-related indices provide secondary but complementary support.
Collectively, our results demonstrate that the dual-branch Transformer-TCN model can capture complex spectral–temporal relationships between cotton Verticillium wilt and disease severity, providing methodological support for crop disease monitoring and evaluation.

1. Introduction

Cotton Verticillium wilt (VW), a soil-borne fungal disease primarily caused by the pathogens Verticillium dahliae and Verticillium albo-atrum, is one of the most destructive vascular diseases affecting cotton production worldwide, leading to annual yield losses of up to 30% [1]. Accurate and timely monitoring of disease severity (DS) is therefore essential for disease management and precision agriculture decision-making [2]. However, traditional field-based rating approaches provide fast assessment but suffer from evaluator subjectivity, whereas laboratory-based diagnostic techniques offer higher diagnostic reliability yet are labor-intensive, time-consuming, and often require destructive sampling. Consequently, remote sensing technologies have emerged as an effective alternative for rapid, non-destructive, and continuous disease monitoring.
Pathogen infection alters pigment concentration, leaf water content, cellular structure, and stress physiology, generating characteristic reflectance changes across visible and near-infrared wavelengths [3]. Hyperspectral sensing can detect these biochemical and structural disturbances with high sensitivity, providing a strong theoretical basis for crop disease monitoring. In particular, spectrally derived vegetation indices, designed to emphasize specific biochemical and biophysical properties, provide physically interpretable proxies for tracking disease-induced physiological and biochemical responses. As a result, hyperspectral observations have been widely applied to disease detection, early warning, and severity estimation, demonstrating their ability to capture subtle variations induced by pathogen infection [4,5,6].
Cotton Verticillium wilt is a progressive host–pathogen interaction in which physiological disturbances develop over time [7,8,9], producing time-dependent variations in both leaf function and spectral expression. Early infection induces only minor disturbances due to limited pathogen load, whereas progressive vascular blockage in later stages leads to pronounced changes in pigment concentration, photosynthetic activity, water status, and mesophyll structure [10], with the magnitude of these responses differing across growth and infection stages [11,12,13]. These physiological transitions translate directly into dynamic spectral behavior: visible reflectance shifts arise early due to pigment degradation, whereas near-infrared alterations emerge later as cellular structure becomes increasingly impaired and water transport is disrupted, producing delayed and nonlinear spectral responses [14,15,16,17]. Consequently, the sensitivity of spectral features to disease severity varies among observation periods, and key monitoring features as well as model performance have been shown to shift substantially across infection stages [18,19], with some studies reporting the highest detection accuracy near the boll-setting period [20]. Because physiological traits and hyperspectral signatures evolve during infection, single-date observations capture only a limited snapshot of disease expression. In contrast, temporal sequences characterize how spectral responses develop across successive stages and thus provide a more complete depiction of disease progression. While the sensitivities of individual features vary across disease stages, leading to limited cross-temporal robustness of single-period observations, integrating multi-day temporal information can effectively reduce reliance on any specific time point and enhance the stability and credibility of severity estimation [21,22].
Temporal remote sensing information has increasingly been used to capture the dynamic progression of crop diseases, revealing temporal patterns that single-date observations cannot provide. Most existing studies have applied these temporal signals to improve disease identification or severity classification [19]. For example, Su et al. [23] constructed temporal variation features of vegetation indices from multi-temporal Unmanned Aerial Vehicle (UAV) multispectral data and achieved high-accuracy classification of wheat stripe rust severity levels. Jing et al. [24] incorporated temporal variation through a differential modeling strategy and significantly enhanced the capacity of vegetation indices to characterize wheat stripe rust severity, increasing the coefficient of determination by 38.8%. In cotton disease studies, Wu et al. [25] extracted key parameters from Normalized Difference Vegetation Index (NDVI) phenological curves to track the continuous occurrence of cotton root rot, while Nie et al. [26] constructed temporal feature vectors from multi-date satellite imagery, achieving an accuracy of 81.73% in identifying Verticillium wilt-infected areas. Overall, these studies consistently demonstrate that incorporating temporal information plays a critical role in disease identification. However, the complex spectral–temporal dependencies associated with progressive vascular diseases remain insufficiently exploited, as most existing approaches rely on traditional machine learning frameworks with limited deep temporal representation capacity [27]. Moreover, current applications are still largely oriented toward multi-temporal classification or stage-specific assessment rather than precise quantitative severity inversion [14], even though accurate severity quantification is essential for supporting disease control decision making, tracking epidemic progression, and enabling targeted intervention to reduce management costs.
Deep learning has substantially advanced crop disease monitoring by enabling automatic feature extraction and improved generalization compared with conventional machine learning approaches [28,29,30]. To date, most studies have been dominated by convolutional neural network (CNN) models, which have achieved strong performance in disease image classification and severity grading tasks [31,32,33,34]. With the growing availability of temporal remote sensing data, several studies have begun to incorporate time-series architectures, such as convolutional neural networks (CNNs) combined with long short-term memory networks (LSTMs) and bidirectional long short-term memory networks (BiLSTMs), namely CNN-LSTM and CNN-BiLSTM, to capture disease progression dynamics [35]. However, temporal deep learning applications in crop disease monitoring remain relatively limited, and recurrent architectures often struggle to model long-range temporal dependencies effectively. Transformer architectures, driven by multi-head self-attention mechanisms, excel at capturing global temporal relationships and long-range dependencies in sequential data [36,37]. Meanwhile, Temporal Convolutional Networks (TCNs) employ dilated causal convolutions to efficiently model local temporal patterns while maintaining stable gradient propagation and computational efficiency, and have demonstrated strong performance in healthcare time-series analysis and other dynamic monitoring tasks [38,39,40]. The complementary strengths of Transformers in global dependency modeling and TCNs in local temporal feature extraction provide a promising foundation for more effective spectral index time-series analysis. Nevertheless, their joint application to hyperspectral temporal modeling for cotton Verticillium wilt severity estimation remains insufficiently explored.
Overall, current studies highlight two key limitations: the temporal information embedded in spectral features remains insufficiently exploited, and existing approaches are still largely constrained to multi-temporal classification or stage-specific assessment rather than accurate quantitative severity estimation. These limitations underscore the need for a modeling framework capable of fully leveraging fine-scale temporal variation in spectral indices to enhance the robustness and reliability of disease severity monitoring across infection stages. To address this gap, this study proposes a Transformer-TCN model based on time-series spectral index slices for accurate Verticillium wilt severity estimation. By structuring vegetation indices into multi-day temporal segments and integrating the complementary strengths of Transformer self-attention for global temporal dependency modeling and TCN dilated convolutions for efficient local temporal feature extraction, the proposed framework captures deep spectral–temporal patterns that traditional machine learning and recurrent architectures fail to represent. This design enhances the accuracy, interpretability, and stage-robust performance of hyperspectral disease monitoring, offering an effective, non-destructive and intelligent solution for supporting precision disease management in cotton production. The main objectives and contributions of this study are summarized as follows:
(1)
A leaf-scale temporal spectral-severity dataset covering the full course of cotton Verticillium wilt was established from daily field measurements, including 2895 leaf hyperspectral reflectance records and 770 synchronized high-resolution RGB images.
(2)
A dual-branch Transformer-TCN model with global–local temporal fusion was proposed. The Transformer branch captured global dependencies, the TCN branch extracted local details through dilated convolutions, and the fusion branch enabled intelligent quantitative severity inversion.
(3)
Under a unified dataset and experimental protocol, nine representative deep learning models were systematically benchmarked to validate the accuracy and robustness advantages of the proposed Transformer-TCN framework.
(4)
Feature-importance analysis was conducted for time-series spectral indices and contrasted with single-date indices, elucidating the superiority and reliability of temporal features for severity inversion in terms of contribution and cross-stage consistency.

2. Materials and Methods

2.1. Study Area and Data Collection

2.1.1. Study Site and Leaf Time-Series Observation Design

Field experiments were conducted at the national Verticillium wilt experimental field of the Shihezi Institute of Agricultural Sciences (44.33°N, 86.05°E), Xinjiang, China (Figure 1). Xinjiang is a major cotton-producing region, contributing about 23.1% of global cotton output and 90.2% of China's cotton production [41]. The region has a typical temperate continental climate with strong solar radiation. The experimental field has been planted with multiple candidate cotton germplasm lines for many years. Each year, a unified field inoculation is applied to maintain stable and consistent disease pressure. The field was managed in a closed manner during the observation period. Irrigation and fertilization followed local agronomic practice to avoid drought stress and other non-disease stresses that could confound disease monitoring.
Cotton was sown from 15 to 20 April 2023. Ridge film mulching was used with on-film hole sowing and subsurface drip irrigation. The row spacing pattern was 10 + 66 + 10 + 66 cm [42]. Irrigation was applied every 8 to 10 days, with 10 to 12 applications during the growing season. Fertilization used mono-ammonium phosphate and urea as the main sources. Potassium was supplied mainly as available potassium (KCl), with potassium dihydrogen phosphate (KH2PO4) applied at late stages, and all fertilizers were delivered via drip irrigation after dissolution.
Leaf time-series observations were conducted from 4 August to 10 September 2023, spanning 38 d. This period corresponds to the peak outbreak window of cotton Verticillium wilt in Xinjiang. It also covers key stages from early flowering and full flowering to the boll stage. These stages mark the transition from vegetative growth to reproductive growth, when carbon assimilation and allocation are high and yield formation is sensitive to stress. Observations ended after the first open bolls were observed in the field. At that stage, plants enter physiological maturity and natural senescence. Leaf activity declines, tissue structure stabilizes, and both disease development and spectral changes become slower.
To capture leaf-level disease time series, leaves showing symptoms unrelated to Verticillium wilt were first excluded or avoided at the beginning of the field campaign. After this screening, multiple main-stem leaves with no visible symptoms were randomly selected for daily tracking. In total, 29 main-stem leaves were continuously monitored in situ. Each leaf was observed once per day under field conditions. Each daily session included repeated acquisitions, and the average was used for analysis. Infection and lesion expansion rates differed among leaves. The resulting time-series dataset therefore includes different disease onset speeds and different disease course lengths. Eighteen leaves showed the full trajectory from symptom-free to lesion expansion and then to yellowing and abscission. The remaining leaves progressed more slowly and did not reach abscission by the end of the campaign. Overall, this study collected leaf-level hyperspectral reflectance and synchronous RGB images for healthy and infected leaves, covering the full phenotype evolution from the initial symptom-free stage to lesion expansion and, for some leaves, to yellowing and abscission. Previous studies have shown that full-stage coverage allows deep learning models to learn disease progression patterns across infection phases and extract shared features that are stable over time [43].

2.1.2. Leaf Spectral Reflectance and Image Acquisition

Leaf spectra were measured in situ using a Spectral Evolution® PSR+ 3500 handheld spectroradiometer (Spectral Evolution, Lawrence, MA, USA), with a wavelength range of 350–2500 nm. The spectral resolution is 3.5 nm at 350–1000 nm, about 10 nm at 1500 nm, and about 7 nm at 2100 nm, with sampling at 1 nm intervals. During measurement, the leaf remained attached to the plant. It was gently flattened against a horizontally placed matte black board to reduce background effects. Measurements were taken at a near-nadir viewing geometry. The probe-to-leaf distance was about 9–15 cm and was adjusted to keep the whole leaf within the 25° field of view. A white reference panel in the same scene was used for calibration every 10–15 min to account for changes in incident light. A schematic diagram of the leaf-level spectral measurement setup is shown in Figure 2. In total, 2895 individual leaf-level spectral measurements were obtained under clear or nearly clear sky conditions with stable illumination.
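In principle, the white-panel calibration described above amounts to a ratio of target signal to reference-panel signal; a minimal sketch (the helper name and the ideal Lambertian panel factor are assumptions, and the instrument software performs the equivalent step in practice):

```python
import numpy as np

def calibrate_reflectance(target_dn, reference_dn, panel_reflectance=1.0):
    """Convert raw spectroradiometer readings to reflectance using a white
    reference panel measured in the same scene (hypothetical helper; the
    vendor software performs an equivalent ratio internally)."""
    target_dn = np.asarray(target_dn, dtype=float)
    reference_dn = np.asarray(reference_dn, dtype=float)
    # Ratio of target signal to panel signal, scaled by the panel's known
    # reflectance factor (1.0 for an ideal Lambertian panel).
    return panel_reflectance * target_dn / reference_dn
```

Re-measuring the panel every 10–15 min, as done here, refreshes `reference_dn` so that slow changes in incident light do not bias the ratio.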
An RGB image was captured immediately after each spectral measurement, also in near-nadir view, using an iPhone 14 Pro (Apple Inc., Cupertino, CA, USA) with a 48 MP main camera (Figure 3). The image size is 8064 × 6048 pixels. A total of 770 RGB images were collected and subsequently used for background separation, lesion annotation, and quantitative estimation of disease severity.

2.2. Data Processing and Sample Construction

2.2.1. Quantification of Disease Severity and Description of Leaf Symptom Evolution

A quantitative disease severity index was derived from RGB images by computing the proportion of lesion pixels within the leaf area. Following common practices in plant disease phenotyping [17], a binary leaf mask was extracted and lesion regions were identified within the mask. The ratio of lesion pixels to total leaf pixels was then used as the severity value for each observation. This index characterizes the temporal evolution of disease symptoms and served as the supervised label for time-series modeling.
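The lesion-ratio computation reduces to a pixel count over binary masks; a minimal sketch (mask generation itself, e.g. thresholding or manual annotation, is outside this snippet, and the function name is illustrative):

```python
import numpy as np

def severity_from_masks(leaf_mask, lesion_mask):
    """Disease severity as the fraction of leaf pixels flagged as lesion.
    Both inputs are boolean arrays of the same shape; how the masks are
    produced (segmentation, annotation) is left to the preceding step."""
    leaf_mask = np.asarray(leaf_mask, dtype=bool)
    # Only lesion pixels inside the leaf mask are counted.
    lesion_mask = np.asarray(lesion_mask, dtype=bool) & leaf_mask
    leaf_pixels = leaf_mask.sum()
    if leaf_pixels == 0:
        raise ValueError("empty leaf mask")
    return lesion_mask.sum() / leaf_pixels
```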
Field observations showed a consistent symptom sequence. At mild infection, leaves showed subtle symptoms, mainly localized interveinal yellowing. As disease progressed, palisade and spongy mesophyll tissues were visibly damaged, leading to deformation of cellular structure. Lesions expanded gradually and were often accompanied by margin scorch and curling. At severe stages, lesions spread to nearly half of the leaf area and leaf shape became strongly distorted. In extreme cases, the whole leaf turned yellow, curled, and eventually died. Typical time-series transitions from healthy to wilted and dead leaves, together with the corresponding disease severity annotations for each observation, are illustrated in Figure 3b. This study captured in situ leaf changes from symptom-free to lesion expansion and, for some leaves, to abscission.

2.2.2. Spectral Preprocessing and Vegetation Index Construction

To reduce noise at the ends of the reflectance curve and to avoid strong atmospheric water absorption regions, the original 350–2500 nm spectra were trimmed to 340–1820 nm and 1950–2420 nm for analysis. Quality control was applied to repeated measurements, and spectra affected by strong external interference were removed. The remaining spectra were smoothed using a Savitzky–Golay filter. After removing abnormal spectra, the mean spectral reflectance of all retained single scans was calculated. This mean spectrum represents the overall reflectance level of the dataset and supports subsequent index calculation and analysis. In total, 770 averaged leaf spectra were obtained and used for subsequent analysis.
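A hedged sketch of the trimming and smoothing steps, assuming SciPy's `savgol_filter`; the filter window length and polynomial order here are illustrative, as the paper does not report its settings:

```python
import numpy as np
from scipy.signal import savgol_filter

def preprocess_spectrum(wavelengths, reflectance,
                        keep=((340, 1820), (1950, 2420)),
                        window=11, polyorder=3):
    """Smooth a 1 nm-sampled reflectance spectrum with a Savitzky-Golay
    filter, then trim to the analysis windows used in this study
    (340-1820 nm and 1950-2420 nm, skipping strong water absorption)."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    smoothed = savgol_filter(np.asarray(reflectance, dtype=float),
                             window_length=window, polyorder=polyorder)
    # Keep only wavelengths inside the retained analysis intervals.
    mask = np.zeros_like(wavelengths, dtype=bool)
    for lo, hi in keep:
        mask |= (wavelengths >= lo) & (wavelengths <= hi)
    return wavelengths[mask], smoothed[mask]
```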
Pathogen infection affects pigments, water status, cell structure, and the photosynthetic system. These changes alter visible to near-infrared reflectance and the red-edge pattern [7,9]. Based on leaf reflectance, 34 vegetation indices commonly used for plant disease monitoring were calculated. They were grouped into five categories: Xanthophyll Cycle & Fluorescence Indices, Pigment Indices, Biochemical Absorption Indices, Structural Indices, and Water Content Indices (Table A1). Compared with raw reflectance, vegetation indices can help stabilize spectral representations and facilitate more reliable characterization of disease progression [44]. These indices were used as model inputs to represent the time-series changes in leaf spectral responses along disease progression. To illustrate the temporal variation patterns of spectral responses and disease progression, the representative index BRI was selected as an example. Daily boxplots of BRI and disease severity were generated to characterize the overall sample-level temporal distributions throughout the monitoring period (Figure 4).

2.2.3. Construction of Time-Series Slice Samples

To use daily leaf observations under a fixed-length input setting, a sliding-window strategy was applied to build time-series slice samples. Daily vegetation index sequences were first organized for each monitored leaf. A fixed-length window was then moved forward by one day within each leaf record to extract all consecutive sub-sequences, and each sub-sequence was treated as one time-series slice sample. In this study, the window length was set to 5 days. The window length affects how well disease-related time-series features are expressed and how well a model can learn global temporal dependence within a slice. Previous studies indicate that spectral responses linked to cell structure disruption and photosynthetic damage often show stable and repeatable changes within 1–5 days after inoculation or infection [45]. A very short window such as 3 days may not cover this stable response interval and may weaken learning of within-slice temporal dependence. A much longer window can smooth short-term changes and reduce monitoring timeliness. Time-series prediction studies also report that sequences of around 5 days can reduce prediction error while keeping good time sensitivity [46]. Based on these considerations of physiological time scale and model stability, a 5-day slice length was used. The label of each slice was defined as the lesion coverage on the last day of the slice, which represents the disease severity at the window endpoint. This design was adopted because disease monitoring aims to quantify the current disease status at a specific observation time, which is directly relevant to diagnosis, severity grading, early warning, and management decisions. Under this setting, the preceding days within the slice provide temporal context on the recent progression of disease-related spectral responses, whereas the estimation target remains the disease severity at the window endpoint.
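The sliding-window construction can be sketched as follows (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def make_slices(index_series, severity_series, window=5):
    """Build fixed-length time-series slice samples from one leaf's daily
    record. Each slice holds `window` consecutive days of vegetation-index
    vectors; its label is the severity observed on the slice's last day."""
    index_series = np.asarray(index_series, dtype=float)      # (days, n_indices)
    severity_series = np.asarray(severity_series, dtype=float)
    X, y = [], []
    # Advance the window start by one day at a time.
    for start in range(len(index_series) - window + 1):
        X.append(index_series[start:start + window])
        y.append(severity_series[start + window - 1])  # label at window endpoint
    return np.stack(X), np.asarray(y)
```

Applied per leaf, this yields overlapping 5-day slices, so a leaf tracked for n days contributes n − 4 samples.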

2.3. Model Development

This section describes the architecture of the Transformer-TCN model, a deep learning framework developed for leaf-level cotton Verticillium wilt monitoring using hyperspectral index time-series data. The following subsections detail the design and functionality of its major components, including the Temporal Convolutional Network (TCN), the Transformer encoder, and the feature fusion module.

2.3.1. Overview of the Proposed Transformer-TCN Method

The proposed Transformer-TCN framework was developed for cotton Verticillium wilt severity estimation from 5-day vegetation-index time-series features. Its core objective is to jointly model complementary temporal information at local and global scales, thereby improving the characterization of disease-related dynamic patterns. As illustrated in Figure 5, the framework consists of three coordinated components: a TCN branch, a Transformer encoder branch, and a feature fusion module.
The TCN branch is designed to extract local temporal dynamics from neighboring time steps. By employing dilated one-dimensional convolutions, it progressively enlarges the temporal receptive field and enables effective modeling of short-term fluctuations, local transitions, and multiscale dynamic variations within the vegetation-index sequences. In parallel, the Transformer encoder branch is used to capture global temporal dependencies and the overall evolution pattern of the sequence. Through the self-attention mechanism, it directly models interactions among different time steps within the 5-day input window and is therefore well suited for representing long-range temporal relationships. The outputs from the two branches are subsequently concatenated and refined through a convolution-based fusion module, which enhances the complementarity between local and global temporal features and generates a unified representation for the final regression layer. By integrating local temporal patterns with global sequential dependencies, the proposed Transformer-TCN model provides a more complete characterization of the nonlinear relationship between vegetation-index time-series dynamics and cotton Verticillium wilt severity than either single-branch structure alone.
Let B denote the batch size and C the input dimension, where C is determined by the number of spectral indices multiplied by the 5-day sequence length. Each input sample is processed in parallel by the TCN and Transformer branches, and the resulting features are then fused, compressed, and mapped to a single output corresponding to the disease severity on the last day of the input window. This end-to-end design enables direct prediction from multi-index temporal features without additional manual feature engineering. When the input feature dimension is 170 and the batch size is 8, the detailed layer-wise parameter settings of the Transformer-TCN model are summarized in Table 1.

2.3.2. Temporal Convolutional Network Branch

The TCN branch employs dilated one-dimensional convolutions with residual connections to model temporal dependencies within the 5-day vegetation index sequences. By using exponentially increasing dilation factors across stacked temporal blocks, the receptive field is effectively expanded while maintaining computational efficiency. Residual connections facilitate stable gradient propagation and help preserve low-level temporal information during training. This branch is particularly suited for capturing localized variations in disease-related indices and supports modeling of short-term disease progression dynamics.
Further, the TCN constitutes a sequence modeling paradigm that replaces recurrent computations with convolutional operators, with its key principle being efficient learning of temporal dependencies through causal and dilated convolutions. Causal convolution enforces a strict temporal constraint by ensuring that each output is determined solely by the current and past inputs, thereby preventing information leakage from future observations. Dilated convolution introduces spaced filter taps within the kernel, enabling rapid expansion of the receptive field without a substantial increase in parameter count or computational cost, and thus accommodating both short-term fluctuations and longer-range cumulative effects within the sequence.
Given an input spectral sequence X = (x_1, x_2, …, x_b) and a convolutional filter f: {0, 1, …, k − 1} → ℝ, the dilated causal convolution that combines causal and dilated convolutions can be formulated as follows:
G(b) = (X ∗_d f)(b) = Σ_{i=0}^{k−1} f(i) · x_{b − d·i}
Here, d denotes the dilation rate that controls the spacing between adjacent filter elements. The shifted index b − d·i enlarges the effective receptive field while preserving causality, such that the response at each position depends only on the current and preceding spectral elements. To strengthen deep representation learning, residual connections are introduced between successive dilated causal convolution layers. The residual block is expressed as [47]:
X_h = σ(T(X_{h−1}) + X_{h−1})
where X_{h−1} denotes the input from the (h − 1)th residual block, T(·) represents the transformation operation, which typically consists of a causal convolution followed by weight normalization and dropout, and σ(·) denotes a nonlinear activation function.
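A minimal NumPy sketch of a dilated causal convolution of this form (zero-padding before the sequence start is an implementation choice, not specified in the paper):

```python
import numpy as np

def dilated_causal_conv(x, f, d=1):
    """Dilated causal convolution: the output at position b sums
    f(i) * x[b - d*i] over filter taps i = 0..k-1, so it depends only on
    the current and past inputs. Out-of-range positions are zero-padded."""
    x = np.asarray(x, dtype=float)
    f = np.asarray(f, dtype=float)
    out = np.zeros_like(x)
    for b in range(len(x)):
        for i in range(len(f)):
            j = b - d * i
            if j >= 0:            # causality: never reach into the future
                out[b] += f[i] * x[j]
    return out
```

With dilation d = 1 the filter sees adjacent days; doubling d at each stacked block expands the receptive field exponentially at constant parameter count.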

2.3.3. Transformer Encoder Branch

Within the proposed Transformer-TCN framework for Verticillium wilt severity inversion, the Transformer encoder branch was incorporated to strengthen representation learning of long-range dependencies and global co-variation within 5-day vegetation index sequences, thereby characterizing the cross-time effects induced by the progressive disease course. Relative to convolutional operators that emphasize local neighborhoods and recurrent architectures that aggregate information sequentially, the Transformer establishes explicit global interactions across the entire 5-day window, enabling the identification of critical temporal nodes, cross-index coordination patterns, and stage-related trend relationships during disease evolution. The multi-head mechanism further learns complementary temporal dependence structures in parallel subspaces, improving the diversity and robustness of spectral–temporal representations under complex index dynamics. To stabilize optimization in deep encoding layers and mitigate performance degradation, residual connections were adopted to facilitate effective information propagation.
The core of the Transformer is multi-head self-attention. The input sequence is first linearly projected into Query (Q), Key (K), and Value (V) representations, and attention weights are obtained by computing similarity between Query and Key. After normalization, these weights are used to form a weighted sum of the Value vectors, thereby aggregating information from the entire window into the representation at each time step. Multi-head attention performs the same operation in parallel subspaces to jointly learn distinct forms of temporal dependence and cross-index association. The self-attention output is then passed through a feedforward network for nonlinear transformation, and residual connections together with layer normalization are applied to stabilize optimization and preserve efficient information flow. The corresponding formulations of single-head attention (SA) and multi-head attention (MHSA) are given below [48].
SA(Q, K, V) = softmax(QKᵀ / √d_k) · V
MHSA(Q, K, V) = concat(head_1, …, head_h) W_O
head_i = SA(QW_i^Q, KW_i^K, VW_i^V)
Here, Q, K, and V are tensors produced by learned linear projections of the spectral inputs, and d_k denotes the corresponding embedding dimension. The term QKᵀ quantifies inter-band relationships by evaluating pairwise similarity across spectral features. In multi-head attention, each head uses an independent set of trainable projection matrices (W_i^Q, W_i^K, W_i^V) to form a distinct subspace, and the head-wise outputs are then concatenated and passed through the output projection W_O to yield the final multi-head attention representation.
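These attention operations can be sketched in NumPy as follows (a didactic single-sequence version with explicit per-head weight matrices; the actual model uses batched, trainable layers):

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))   # rows sum to 1
    return weights @ V

def multi_head_attention(X, W_q, W_k, W_v, W_o):
    """Multi-head self-attention over a (time, d_model) sequence X.
    W_q, W_k, W_v are lists of per-head projection matrices; head outputs
    are concatenated and mixed by the output projection W_o."""
    heads = [self_attention(X @ wq, X @ wk, X @ wv)
             for wq, wk, wv in zip(W_q, W_k, W_v)]
    return np.concatenate(heads, axis=-1) @ W_o
```

Each row of the attention weight matrix is a distribution over the 5 time steps, so every output step aggregates information from the whole window.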

2.3.4. Feature Fusion and Output

Features generated by the TCN and Transformer branches are concatenated along the feature dimension and passed through a one-dimensional convolutional layer to compress the joint representation. The resulting latent vector is then fed into a fully connected layer to produce a single continuous output representing the end-of-window disease severity for each leaf sample. This fusion strategy enables effective integration of local and global temporal information within a computationally efficient framework suitable for leaf-level monitoring.
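The fusion head can be illustrated with a minimal NumPy sketch, treating the one-dimensional convolution as a kernel-size-1 (per-time-step) linear map; all widths and weights below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 5                      # 5-day window length
f_tcn, f_trans = 16, 16    # per-branch feature widths (illustrative)

tcn_feat = rng.normal(size=(T, f_tcn))       # local-pattern branch output
trans_feat = rng.normal(size=(T, f_trans))   # global-dependency branch output

# 1) Concatenate the two branches along the feature dimension.
joint = np.concatenate([tcn_feat, trans_feat], axis=-1)   # (T, 32)

# 2) Compress with a kernel-size-1 convolution (a per-step linear map + ReLU).
W_conv = rng.normal(size=(f_tcn + f_trans, 8))
latent = np.maximum(joint @ W_conv, 0.0)                  # (T, 8)

# 3) Fully connected layer -> one continuous severity value per sample.
w_fc = rng.normal(size=(latent.size,))
severity = float(latent.ravel() @ w_fc)
print(joint.shape, latent.shape)
```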

2.4. Benchmark Design and Evaluation

2.4.1. Comparative Models

To comprehensively evaluate the performance of the proposed Transformer-TCN (Former-TCN) model for leaf-level Verticillium wilt prediction, eight representative deep learning architectures were implemented for comparison. The benchmark included single-structure models (CNN, LSTM, Transformer, and TCN) and hybrid configurations (CNN-LSTM, CNN-TCN, Former-CNN, and Former-LSTM) that leverage complementary feature extraction mechanisms. This comparative analysis was designed to assess the capability of Transformer-TCN in jointly modeling local temporal patterns and global sequential dependencies for disease severity estimation.
(1)
CNN extracts localized temporal features using one-dimensional convolutional filters applied to the concatenated index time-series. ReLU activation and fully connected layers are used to transform the extracted features for disease severity prediction [49].
(2)
LSTM utilizes gating mechanisms and cell-state transitions to selectively retain informative signals and capture temporal dependencies within the 5-day sequences. The architecture comprises stacked LSTM layers followed by a fully connected output layer [50].
(3)
Transformer models long-range dependencies across all input indices through a multi-head self-attention mechanism. Residual connections, layer normalization, and feedforward networks facilitate the extraction of global sequential patterns, which are subsequently pooled and mapped to the output layer [51,52].
(4)
TCN captures local temporal variations using dilated convolutions with residual connections. By progressively increasing the dilation factors across layers, the TCN efficiently models dependencies over multiple time steps while maintaining stable gradient propagation [47].
(5)
CNN-LSTM first extracts local temporal features through convolutional layers with ReLU activation, after which the feature sequences are processed by LSTM layers to learn temporal dependencies prior to final regression [53].
(6)
CNN-TCN combines convolutional feature extraction with TCN blocks to integrate local and multi-scale temporal patterns before prediction [40].
(7)
Former-CNN employs Transformer encoder layers to model global sequential dependencies, after which CNN modules are used to refine localized feature variations [54].
(8)
Former-LSTM first extracts global representations using a Transformer encoder and then applies LSTM layers to capture temporal dynamics within the 5-day input window [55].
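The dilated causal convolutions used by the TCN baseline (and by the TCN branch of the proposed model) can be sketched as follows; the kernel and dilation schedule are illustrative:

```python
import numpy as np

def dilated_causal_conv1d(x, w, dilation):
    """Causal 1-D convolution: output at t uses x[t], x[t-d], x[t-2d], ...
    Left zero-padding keeps the output strictly non-anticipative."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[t + pad - j * dilation] for j in range(k))
                     for t in range(len(x))])

# Stacking layers with dilations 1, 2, 4 grows the receptive field to
# 1 + (k - 1) * (1 + 2 + 4) = 8 steps for a kernel of size k = 2.
x = np.arange(8, dtype=float)
w = np.array([0.5, 0.5])
y = x
for d in (1, 2, 4):
    y = dilated_causal_conv1d(y, w, d)
print(y)
```

Exponentially increasing dilation is what lets a shallow stack cover multi-step dependencies without violating temporal causality.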

2.4.2. Evaluation Metrics

To quantitatively evaluate the model performance for disease severity estimation, four regression metrics were adopted, including the coefficient of determination (R2), the root mean squared error (RMSE), the ratio of performance to deviation (RPD), and the ratio of performance to interquartile range (RPIQ).
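For completeness, the four metrics take their standard forms, where y_i are observed severities, ŷ_i the predictions, ȳ the observed mean, SD the standard deviation, and Q1, Q3 the quartiles of the observed values:

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
\mathrm{RPD} = \frac{\mathrm{SD}}{\mathrm{RMSE}},
\qquad
\mathrm{RPIQ} = \frac{Q_3 - Q_1}{\mathrm{RMSE}}
```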
In addition to accuracy evaluation, model interpretability was assessed using the Shapley Additive Explanation (SHAP) framework [56]. Absolute SHAP values for each vegetation index were calculated to derive the global importance ranking of the model, while the sample-level SHAP distributions were visualized to examine the variability and stability of feature contributions across different temporal slice observations.
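As a sketch of the global-importance computation, mean absolute SHAP values per feature can be aggregated and ranked as follows; the SHAP matrix here is synthetic, not the study's actual attributions:

```python
import numpy as np

# Hypothetical SHAP matrix: one row per (sample, temporal-slice) observation,
# one column per vegetation index. Values are synthetic stand-ins.
rng = np.random.default_rng(2)
index_names = ["BRI", "LIC", "VOG2", "GM", "PRI2", "NDWI2"]
shap_values = rng.normal(scale=[0.30, 0.29, 0.27, 0.26, 0.25, 0.10],
                         size=(500, len(index_names)))

# Global importance: mean absolute SHAP value per index, ranked descending.
global_importance = np.abs(shap_values).mean(axis=0)
ranking = sorted(zip(index_names, global_importance),
                 key=lambda kv: kv[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The same per-sample matrix also supplies the dispersion information used to judge the stability of each feature's contribution.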

2.4.3. Implementation Configuration

All models were implemented and benchmarked in Python 3.12 using the PyTorch deep learning framework. Experiments were conducted on a workstation equipped with an Intel Core i7-10700 CPU at 2.90 GHz, 64 GB RAM, and an NVIDIA GeForce RTX 3060 GPU with 12 GB memory. For all models, the inputs were time-series vegetation indices constructed with a 5-day window length, and the output was the continuous disease rate at the corresponding time step. All models were trained using the Adam optimization algorithm with a fixed learning rate of 3 × 10−4 and a mini-batch size of 8, the configuration found to balance convergence efficiency and training stability in the hyperparameter analysis of Section 4.1. The dataset was partitioned into training, validation, and test sets at a 7:2:1 ratio. This standardized experimental setting ensures that differences in predictive performance primarily reflect architectural characteristics rather than differences in preprocessing or training configuration, thereby providing a robust basis for evaluating model design, feature representation, and generalization capability in leaf-level disease monitoring.
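A 7:2:1 split of the 2895 samples can be sketched as below; the text does not specify whether the split was random or stratified, so the random permutation here is an assumption:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2895                       # number of hyperspectral samples in the dataset
idx = rng.permutation(n)       # assumed: a simple random shuffle

n_train, n_val = int(0.7 * n), int(0.2 * n)
train_idx = idx[:n_train]
val_idx = idx[n_train:n_train + n_val]
test_idx = idx[n_train + n_val:]
print(len(train_idx), len(val_idx), len(test_idx))
```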

3. Results and Analysis

3.1. Comparison of Time-Series Deep Learning Architectures

To systematically evaluate the performance of different deep learning models in modeling the time-series features derived from the combined set of all spectral indices, we constructed and compared nine models, including single-structure models, convolutional–temporal fusion models, and attention-based models (Table 2 and Figure 6). The results show that the proposed Transformer-TCN model achieved the highest prediction accuracy, reaching an R2 of 0.8813 and significantly outperforming all comparison models. Overall, the fusion models outperformed their respective single-branch counterparts. Specifically, Former-CNN, Former-LSTM, and CNN-TCN achieved R2 values of 0.8367, 0.8272, and 0.8192, respectively, all exceeding those of their corresponding individual models.
Among the single-architecture baselines, Transformer and TCN achieved similar accuracy, with R2 values of 0.7910 and 0.7898, respectively, outperforming LSTM at 0.7406 and CNN at 0.7491. This result suggests that the causal convolution-based TCN structure effectively models local temporal dependencies using dilated convolutions, whereas the Transformer captures the overall relationships among different time points via a global self-attention mechanism. In contrast, under the temporal window setting adopted in this study, the traditional recurrent LSTM structure did not fully realize its advantage in modeling long-term temporal dependencies.
Further comparisons reveal that simple dual-branch fusion models yield some performance improvement, but the magnitude of the gain depends on the effectiveness of the time-series modeling mechanism. For instance, CNN-LSTM shows improvement over LSTM but remains significantly lower than CNN-TCN and Former-LSTM. This indicates that relying merely on recurrent structures for temporal aggregation is insufficient to adequately capture the evolution of disease indices. In contrast, Transformer-TCN combines global attention with local convolutional time modeling, achieving a synergistic representation of cross-scale time-series features. This structure simultaneously captures both overall trend relationships and local changes among different time steps, resulting in the best performance for modeling cotton Verticillium wilt, which exhibits stage-based and temporally progressive characteristics [57,58]. Overall, these results indicate that hybrid models integrating global temporal dependency modeling and local temporal convolution are more suitable for fine-grained disease severity prediction, particularly when applied to structured temporal features.

3.2. Contribution of Spectral Indices Based on SHAP Analysis

To gain deeper insight into the spectral drivers of the Transformer-TCN model, the Shapley Additive Explanation method was applied to quantify the contribution of each spectral index to disease severity estimation. The global importance ranking based on absolute SHAP values is presented in Figure 7, and the corresponding distribution patterns are shown in Figure 8.
The importance pattern reveals a clear dominance of pigment-related indices, particularly those associated with the blue spectral region. BRI and LIC show the highest contributions, with absolute SHAP values of 0.294 and 0.292, respectively, followed by VOG2 and GM. These indices collectively constitute the primary drivers of the model predictions. A secondary contribution tier is formed by several photosynthesis-related and water-sensitive indices. PRI2 remains among the more influential features with a SHAP value of 0.255, and NDWI2 also shows a notable contribution. Additional indices such as PRI570 and WBI2 provide moderate support to the model. In contrast, most structural and biochemical absorption indices exhibit comparatively lower importance scores.
The SHAP distribution further indicates that the leading pigment indices maintain consistent directional effects on model output, whereas lower-ranked indices display more dispersed contribution patterns. Overall, the results demonstrate that the predictive strength of the Transformer-TCN model is mainly governed by pigment-sensitive spectral information, especially blue band-related features, with selected photosynthetic and water-related indices providing complementary support.
To further characterize the temporal attribution patterns of the 5-day input slices, the absolute SHAP values of all spectral indices were compared across the five time steps (T0–T4) (Figure 9). The results show that feature contributions were not uniformly distributed within the temporal window, but instead exhibited clear dynamic variation along the time dimension. Overall, SHAP responses were generally stronger at T3 and T4, whereas T1 showed comparatively weaker contributions, indicating that the model relied more heavily on the later observations closer to the labeling day while still retaining useful trend information from the beginning of the window. Among all feature groups, pigment-related indices remained the dominant drivers of temporal modeling. In particular, the blue band-related index BRI maintained consistently high SHAP values across multiple time steps and showed especially strong contributions at T3–T4, indicating that blue band-related pigment information served as the most stable and persistent signal throughout the 5-day sequence. Photosynthesis- and water-related indices displayed a more stage-dependent pattern, with several PRI-derived features as well as WBI2 and NDWI2 showing enhanced contributions mainly at the later time steps, especially T3 or T4. This suggests that photosynthetic regulation and water-stress responses were more strongly captured as the disease progressed toward the end of the window, but their temporal contributions were less stable than those of the blue band-related pigment features. In contrast, most biochemical and structural indices, such as NDNI and RNDVI, maintained relatively low SHAP values across the temporal window, indicating that they played a comparatively limited role in sequence-based disease severity estimation.
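The per-time-step aggregation behind this comparison can be sketched by averaging absolute SHAP values over samples and indices; the attribution tensor below is synthetic:

```python
import numpy as np

# Hypothetical SHAP tensor: (samples, time steps T0-T4, vegetation indices).
rng = np.random.default_rng(4)
shap_t = rng.normal(size=(300, 5, 34))

# Mean absolute contribution of each time step, aggregated over samples/indices.
step_importance = np.abs(shap_t).mean(axis=(0, 2))   # shape (5,)
for t, imp in enumerate(step_importance):
    print(f"T{t}: {imp:.3f}")
```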

3.3. Effect of Feature Dimensionality on Time-Series Modeling

Notably, in the previously reported results based on the full set of 34 time-series index features, the Former-TCN model achieved the highest accuracy with an R2 of 0.88. To further examine the contribution of individual indices, single-index 5-day time-series features were used as model inputs while keeping the network architecture and training strategy unchanged.
Although the full-feature time-series model using 34 indices achieved the highest accuracy, models driven by a single index still maintained competitive performance (Figure 10). Among them, blue band-related indices, particularly BRI with an R2 of 0.7980, showed the strongest predictive capability, reaching more than 90% of the accuracy achieved by the full-feature model. Fluorescence-related and water-related indices exhibited pronounced internal variability. Biochemical indices generally yielded lower accuracy.
This pattern is broadly consistent with the SHAP ranking, with blue band-related indices emerging as the most influential predictors. While temporally sensitive single indices can independently track disease progression, the SHAP results show that the best performance is obtained when multiple complementary spectral features are integrated. Overall, the single-index time-series results indicate that temporal information itself possesses strong representational power, although integrating multiple indices still provides the most reliable performance.

3.4. Contribution of Temporal Information to Model Performance

To further assess the contribution of temporal information to cotton Verticillium wilt monitoring, models were trained using non-temporal features derived from the full set of 34 spectral indices while keeping the same nine deep learning architectures. The prediction performance without temporal inputs was then systematically compared. As shown in Table 3, under non-temporal input conditions, the overall prediction accuracy of all models declined markedly compared with the time-series setting, with R2 values ranging from 0.6668 to 0.7650. Notably, the proposed Transformer-TCN model still achieved the best performance under the non-temporal scenario with an R2 of 0.7650; however, this value remained substantially lower than its time-series counterpart of 0.8813. These results indicate that non-temporal index features alone are insufficient to capture the progressive physiological responses and staged progression characteristics of Verticillium wilt. Instead, temporal information plays a critical role in improving prediction accuracy. This phenomenon is consistent with previous studies, which concluded that in ground-based spectral observations, frequent repeat measurements are more crucial than single high-density sampling [44], further supporting the importance of time-series data in disease monitoring.

4. Discussion

4.1. Ablation Study and Hyperparameter Analysis

The predictive performance of deep neural networks is highly sensitive to hyperparameter configuration. To systematically evaluate the influence of training settings on the Transformer-TCN model, a multidimensional grid search was performed. Model performance was evaluated using the coefficient of determination R2 for cotton Verticillium wilt severity estimation. The objective was to identify a parameter configuration that ensures both high predictive accuracy and stable training behavior for time-series disease monitoring.
The learning rate and batch size are two key hyperparameters governing model optimization. To evaluate their combined effects, the learning rate was varied from 3 × 10−2 to 3 × 10−5, while batch sizes of 4, 8, 16, and 32 were tested, resulting in 16 hyperparameter combinations, the results of which are presented in Figure 11. Overall, the Transformer-TCN model showed pronounced sensitivity to different parameter settings. Among all tested combinations, the best and most stable performance was achieved at a learning rate of 3 × 10−4. Under this setting, the model reached its highest accuracy when the batch size was 8, yielding an R2 of 0.8813. In contrast, when the batch size was set to 4, 16, and 32, the R2 values decreased to 0.8663, 0.8213, and 0.8075, respectively, indicating that model performance initially improved and then declined as the batch size increased.
When the learning rate was reduced to 3 × 10−5, the overall accuracy remained at a relatively lower level, with R2 ranging from approximately 0.7297 to 0.7905, and tended to decline as batch size increased. In contrast, a learning rate of 3 × 10−3 maintained a higher accuracy range, with R2 between 0.8403 and 0.8516, although it did not exceed the performance achieved at 3 × 10−4. When the learning rate further increased to 3 × 10−2, the model exhibited substantially greater performance variability across batch sizes, with R2 varying from 0.6949 to 0.8498, indicating increased sensitivity to parameter settings.
Considering both predictive accuracy and result stability, the optimal hyperparameter configuration for the Transformer-TCN model was determined to be a learning rate of 3 × 10−4 and a batch size of 8, yielding an R2 of 0.8813. This configuration achieved the best overall performance in this study and was therefore adopted in subsequent experiments to ensure consistency and reproducibility.
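The 4 × 4 grid search procedure can be sketched as below; the objective here is a synthetic stand-in that simply peaks at the configuration reported in the text, not an actual training run:

```python
import itertools
import math

# Grid mirroring the search described in the text.
learning_rates = [3e-2, 3e-3, 3e-4, 3e-5]
batch_sizes = [4, 8, 16, 32]

def mock_validation_r2(lr, bs):
    """Synthetic score peaking at lr=3e-4, bs=8 (stand-in for a training run)."""
    return 0.88 - 0.05 * abs(math.log10(lr / 3e-4)) - 0.02 * abs(math.log2(bs / 8))

best = max(itertools.product(learning_rates, batch_sizes),
           key=lambda cfg: mock_validation_r2(*cfg))
print(best)  # -> (0.0003, 8)
```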
In addition, ablation experiments were conducted to further evaluate the independent contributions of the Transformer and TCN branches to disease severity estimation. As shown in Table 4, when only the TCN branch was retained, the model achieved a test-set R2 of 0.7248 with an RMSE of 0.4683. When only the Transformer branch was retained, the performance improved to an R2 of 0.7886 and an RMSE of 0.4503. When both branches were incorporated, the model performance increased further, reaching an R2 of 0.8813 and an RMSE of 0.3375. These results demonstrate that both the TCN and Transformer branches make substantial contributions to severity inversion. The TCN branch is more effective in capturing local temporal dynamics, whereas the Transformer branch is better suited to modeling global temporal dependencies. Their integration enables simultaneous characterization of local temporal variations and overall evolutionary trends, thereby providing a more complete representation of disease-related temporal responses and markedly improving estimation performance.

4.2. Analysis of Temporal Spectral Responses

After cotton Verticillium wilt infection, the pathogen colonizes the xylem, leading to vessel blockage and toxin secretion, which disrupt the normal transport of water and nutrients and further induce water deficit, stomatal regulation changes, and reduced photosynthetic activity, accompanied by declining chlorophyll and carotenoid contents and progressive tissue structural damage [59,60,61]. These physiological disturbances trigger coordinated biochemical changes that generate measurable spectral responses across specific wavelength regions [5].
The SHAP results quantitatively show that pigment-sensitive indices dominate model predictions, with blue band-related pigment features contributing most strongly and red-edge-related pigment features following. This confirms that blue-region spectral variability provides the primary information source for severity estimation and agrees with earlier findings that blue, red, and red-edge wavelengths respond strongly to disease stress [62,63,64,65]. The physiological basis for this dominance lies in the coupled sensitivity of blue wavelengths to chlorophyll and carotenoid absorption. Previous studies have shown that the 500–520 nm region is jointly influenced by chlorophyll and carotenoid absorption, while the 430–453 nm region corresponds to the major absorption peaks of chlorophyll a and chlorophyll b and is also modulated by carotenoid absorption [66,67]. In cotton Verticillium wilt, blockage of vascular transport is accompanied by persistent pigment-system disruption and photosynthetic suppression, and the associated chlorophyll degradation and carotenoid variation evolve continuously with disease progression. Therefore, the blue spectral region can more directly characterize the persistent pigment changes induced by Verticillium wilt and generate more continuous and temporally consistent signals, which explains why BRI and LIC ranked first and second in the SHAP analysis. Meanwhile, VOG2 and GM, which ranked third and fourth, represent red-edge and red-edge-adjacent information, indicating that red-edge-related chlorophyll features also made important contributions. The red edge fundamentally represents the sharp transition between strong chlorophyll absorption in the visible region and strong internal structural scattering in the near-infrared region [68]. Its position is mainly controlled by chlorophyll concentration, whereas its shape is additionally influenced by structural changes [69,70,71]. 
Therefore, for cotton Verticillium wilt, red-edge-related features can reflect chlorophyll decline but are also affected by the progressively intensified structural damage during disease development. Because structural changes are generally more stage-dependent and lag behind pigment changes [72,73], the temporal disease-response consistency of red-edge-related features is slightly weaker than that of blue band-related features directly tracking persistent pigment variation. Blue-to-red ratio indices have similarly demonstrated robust Verticillium wilt monitoring capability across multiple crops [74,75,76,77], and our findings advance this understanding by demonstrating their consistent predictive strength across the entire time-series progression of Verticillium wilt in cotton.
Photosynthesis-related and water-sensitive indices formed the secondary contribution tier in the Verticillium wilt time-series monitoring framework, following the dominant pigment-sensitive features in the blue spectral region. The contributions of different photosynthesis-related indices showed pronounced variation, reflecting the complex and dynamic regulation of photosynthetic activity during disease progression. These indices are also highly sensitive to instantaneous illumination conditions and short-term physiological adjustments, which introduce temporal variability into their spectral responses and lead to divergent contributions across different indices [78,79,80]. Water-related indices showed moderate yet variable importance. Under pathogen stress, cotton leaves regulate stomatal conductance and redistribute internal water to maintain temporary turgor stability [3,81]. Because leaf water status is governed by such dynamic adjustment processes rather than strictly following the trajectory of lesion expansion, water-related spectral signals often exhibit non-monotonic fluctuations over time. This variability limits the stability of their contribution relative to pigment- or photosynthesis-sensitive features. In contrast, most structural and biochemical absorption indices displayed comparatively low importance. Although structural degradation is part of Verticillium wilt development, its relationship with lesion expansion is more indirect and often lags behind pigment-related changes. Consequently, structural and biochemical indices provide limited incremental information for time-series disease severity estimation. Overall, these findings confirm that pigment-sensitive information, particularly from the blue spectral region, provides the dominant signal for cotton Verticillium wilt monitoring, with photosynthetic and water-related indices offering secondary support.

4.3. Limitations and Perspectives

The Transformer-TCN framework developed in this study was primarily established on in situ point-based spectral measurements at the leaf scale, with continuous spectral feature sequences serving as the principal input. Spatial information, such as lesion distribution patterns and differences among leaf positions, was not explicitly incorporated into the current framework. The integration of such spatial information would help further characterize lesion distribution patterns and canopy structural characteristics, thereby strengthening the ability of the model to represent disease-related spatial context and improving its applicability to disease monitoring at the canopy, plot, and regional scales. Future work should therefore combine hyperspectral imagery acquired from UAV and satellite platforms to jointly integrate spatial and spectral features and establish a spatial-spectral representation framework for larger-scale crop disease monitoring.
In addition, the Transformer-TCN framework effectively integrates local detail features and global dependency information embedded in vegetation-index time series, and achieved strong performance in estimating the severity of cotton Verticillium wilt. This methodological characteristic indicates broader potential for extension to other crop diseases and forest disease monitoring tasks. However, because spectral response mechanisms vary across crops, disease types, and habitat conditions, the adaptability, transferability, and generalization capacity of the framework under more diverse disease scenarios remain to be systematically evaluated. Future studies should therefore rely on richer and more balanced datasets to assess the stability and scalability of this method across cross-crop, cross-disease, and cross-scene settings, while also exploring intelligent modeling strategies tailored to small-sample conditions to further improve applicability in complex real-world environments.

5. Conclusions

Cotton Verticillium wilt poses a major threat to yield, and leaf-scale severity inversion hinges on extracting both global and local disease signatures from hyperspectral time-series data. In this study, we construct a spectral–temporal severity dataset comprising five categories of vegetation indices and integrate the complementary strengths of a Transformer and a temporal convolutional network to develop a Transformer-TCN architecture for accurate leaf-level severity estimation. Specifically, the Transformer branch uses multi-head self-attention to model global dependencies across time steps, whereas the TCN branch employs dilated causal convolutions to capture local temporal fluctuations and fine-scale patterns. A fusion branch is then used to jointly integrate the global and local spectral–temporal representations. Comparative experiments with eight representative deep learning models indicate that the proposed Transformer-TCN consistently achieves the best performance, with an R2 of 0.8813. In addition, it significantly outperforms results derived from non-temporal spectral indices, highlighting the critical role of temporal information in characterizing disease responses and improving inversion stability. Feature contribution analysis further reveals that pigment-sensitive information dominates model performance, with the blue band-related index BRI showing the highest contribution. Photosynthesis- and water-related indices provide secondary yet complementary information, consistent with the single-index time-series modeling results. Overall, by modeling hyperspectral temporal dynamics, this work enables high-accuracy intelligent estimation of leaf-scale Verticillium wilt severity and provides a technical basis for advancing time-series hyperspectral disease monitoring and supporting science-based field management.

Author Contributions

Y.G.: Writing—review & editing, Writing—original draft, Conceptualization, Methodology, Investigation, Software, Visualization; C.H.: Conceptualization, Validation, Supervision, Funding acquisition, Writing—review & editing; X.Z.: Supervision, Conceptualization, Writing—review & editing, Validation; Z.Z.: Supervision, Formal analysis, Data curation. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (42571423) and Xinjiang Production and Construction Corps Science and Technology Program Project (No. 2025AB065 and No. 2025DA007). Changping Huang was supported by Youth Innovation Promotion Association, CAS (Y2021047).

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that would have influenced the work reported in this paper.

Appendix A

Table A1. Hyperspectral vegetation indices utilized in this study.

Structural indices
Green NDVI: GNDVI = (R750 − R540 + R570)/(R750 + R540 − R570) (Gitelson and Merzlyak [82])
Healthy Index: HI = (R534 − R698)/(R534 + R698) − R704/2 (Mahlein et al. [83])
Red-edge NDVI: RNDVI = (R750 − R705)/(R750 + R705) (Barnes et al. [84])

Xanthophyll cycle and fluorescence indices
Reflectance Curvature Index: CUR = (R675 × R690)/R683² (Zarco-Tejada et al. [85])
Fluorescence Ratio Index: FRI = R690/R630 (Zarco-Tejada et al. [85])
Fluorescence Curvature Index: FCI = R683²/(R675 × R691) (Zarco-Tejada et al. [85])
Photochemical Reflectance Index (515): PRI515 = (R515 − R531)/(R515 + R531) (Hernández-Clemente et al. [86])
Photochemical Reflectance Index (570): PRI570 = (R570 − R531)/(R570 + R531) (Gamon et al. [87])
Photochemical Reflectance Index (600): PRI600 = (R600 − R531)/(R600 + R531) (Gamon et al. [87])
Photochemical Reflectance Index: PRI1 = R685/R655; PRI2 = R680/R630 (Meroni et al. [88])

Pigment indices
Anthocyanin (Gitelson): AntGitelson = (1/R550 − 1/R700) × R780 (Gitelson et al. [89])
Blue Index: B = R450/R490 (Calderón et al. [90])
Blue/Red Index: BRI = R450/R690 (Zarco-Tejada et al. [91])
Blue Fraction: BF = R400/R410 (Zarco-Tejada et al. [74])
Chlorophyll Index Red Edge: CI = R750/R710 (Haboudane et al. [92])
Carotenoid Reflectance Index (550_515): CRI550_515 = (1/R515) − (1/R550) (Gitelson et al. [93])
Carotenoid Reflectance Index (700_515): CRI700_515 = (1/R515) − (1/R700) (Gitelson et al. [93])
Carter Index: CTRI1 = R695/R420 (Carter [94])
Reflectance Band Ratio Index: DCabCxc = R672/(R550 × 3 × R708) (Datt [95])
Gitelson and Merzlyak Index: GM = R750/R700 (Gitelson and Merzlyak [96])
Lichtenthaler Index: LIC = R440/R690 (Lichtenthaler [97])
Modified Chlorophyll Absorption Reflectance Index: MCARI = [(R701 − R671) − 0.2 × (R701 − R549)]/(R701/R670) (Daughtry [16])
Chlorophyll b: PSDNb = (R800 − R635)/(R800 + R635) (Blackburn [98])
Transformed Chlorophyll Absorption in Reflectance Index: TCARI = 3 × [(R700 − R670) − 0.2 × (R700 − R550) × (R700/R670)] (Haboudane et al. [92])
Vogelmann Index: VOG1 = R740/R720; VOG2 = (R734 − R747)/(R715 + R720) (Vogelmann et al. [99])

Biochemical absorption indices
Modified Chlorophyll Absorption Reflectance Index (1510): MCARI1510 = [(R700 − R1510) − 0.2 × (R700 − R550)]/(R700/R1510) (Herrmann et al. [100])
Normalized Difference Nitrogen Index: NDNI = [log(1/R1510) − log(1/R1680)]/[log(1/R1510) + log(1/R1680)] (Serrano et al. [101])

Water content indices
Water Stress and Canopy Temperature: WSCT = (R970 − R850)/(R970 + R850) (Babar et al. [102])
Water Band Index: WBI1 = R970/R900 (Penuelas et al. [103]); WBI2 = R1150/R1450 (Sapes et al. [104])
Normalized Difference Water Index: NDWI1 = (R835 − R1610)/(R835 + R1610); NDWI2 = (R860 − R1195)/(R860 + R1195) (Gao [105])
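A few of the Table A1 indices can be computed from a reflectance spectrum as follows; the spectrum here is a synthetic logistic red-edge curve used purely for illustration:

```python
import numpy as np

# Synthetic reflectance spectrum sampled at 1 nm from 400 to 1700 nm.
wavelengths = np.arange(400, 1701)
reflectance = 0.05 + 0.4 / (1.0 + np.exp(-(wavelengths - 720) / 30.0))

def R(nm):
    """Reflectance at the band closest to the requested wavelength (nm)."""
    return float(reflectance[np.argmin(np.abs(wavelengths - nm))])

# A few Table A1 indices (BRI, LIC, GM, NDWI2).
BRI = R(450) / R(690)
LIC = R(440) / R(690)
GM = R(750) / R(700)
NDWI2 = (R(860) - R(1195)) / (R(860) + R(1195))
print(BRI, LIC, GM, NDWI2)
```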

References

  1. Zhu, D.; Zhang, X.; Zhou, J.; Wu, Y.; Zhang, X.; Feng, Z.; Wei, F.; Zhao, L.; Zhang, Y.; Shi, Y.; et al. Genome-Wide Analysis of Ribosomal Protein GhRPS6 and Its Role in Cotton Verticillium Wilt Resistance. Int. J. Mol. Sci. 2021, 22, 1795.
  2. Zhu, H.; Lin, C.; Liu, G.; Wang, D.; Qin, S.; Li, A.; Xu, J.-L.; He, Y. Intelligent agriculture: Deep learning in UAV-based remote sensing imagery for crop diseases and pests detection. Front. Plant Sci. 2024, 15, 1435016.
  3. Yang, M.; Huang, C.; Kang, X.; Qin, S.; Ma, L.; Wang, J.; Zhou, X.; Lv, X.; Zhang, Z. Early Monitoring of Cotton Verticillium Wilt by Leaf Multiple "Symptom" Characteristics. Remote Sens. 2022, 14, 5241.
  4. Wu, N.; Gao, P.; Wu, J.; Zhao, Y.; Xu, X.; Zhang, C.; Alexandersson, E.; Yang, J.; Xiao, Q.; He, Y. Rapid detection and visualization of physiological signatures in cotton leaves under Verticillium wilt stress. Artif. Intell. Agric. 2025, 15, 757–769.
  5. Yang, M.; Kang, X.; Qiu, X.; Ma, L.; Ren, H.; Huang, C.; Zhang, Z.; Lv, X. Method for early diagnosis of verticillium wilt in cotton based on chlorophyll fluorescence and hyperspectral technology. Comput. Electron. Agric. 2024, 216, 108497.
  6. Gao, Y.; Huang, C.; Zhang, X.; Zhang, Z.; Chen, B. Vertical stratification-enabled early monitoring of cotton Verticillium wilt using in-situ leaf spectroscopy via machine learning models. Front. Plant Sci. 2025, 16, 1599877.
  7. Mahlein, A.K.; Kuska, M.T.; Behmann, J.; Polder, G.; Walter, A. Hyperspectral Sensors and Imaging Technologies in Phytopathology: State of the Art. Annu. Rev. Phytopathol. 2018, 56, 535–558.
  8. Tian, L.; Xue, B.; Wang, Z.; Li, D.; Yao, X.; Cao, Q.; Zhu, Y.; Cao, W.; Cheng, T. Spectroscopic detection of rice leaf blast infection from asymptomatic to mild stages with integrated machine learning and feature selection. Remote Sens. Environ. 2021, 257, 112350.
  9. Abdelghafour, F.; Sivarajan, S.R.; Abdelmeguid, I.; Ryckewaert, M.; Roger, J.-M.; Bendoula, R.; Alexandersson, E. Including measurement effects and temporal variations in VIS-NIRS models to improve early detection of plant disease: Application to Alternaria solani in potatoes. Comput. Electron. Agric. 2023, 211, 107947.
  10. Li, W.; Liu, L.; Li, J.; Yang, W.; Guo, Y.; Huang, L.; Yang, Z.; Peng, J.; Jin, X.; Lan, Y. Spectroscopic detection of cotton Verticillium wilt by spectral feature selection and machine learning methods. Front. Plant Sci. 2025, 16, 1519001.
  11. Bai, Y.; Nie, C.; Yu, X.; Gou, M.; Liu, S.; Zhu, Y.; Jiang, T.; Jia, X.; Liu, Y.; Nan, F.; et al. Comprehensive analysis of hyperspectral features for monitoring canopy maize leaf spot disease. Comput. Electron. Agric. 2024, 225, 109350.
  12. Chen, B.; Wang, J.; Li, T.; Lin, H.; Hang, H.; Wang, F.; Wang, Q.; Ma, Q. Effects of Verticillum Wilt on Leaf Microstructure, Photosynthesis of Cotton. Cotton Sci. 2017, 29, 570–578.
  13. Jing, X.; Zou, Q.; Bai, Z.F.; Huang, W.J. Research progress of crop diseases monitoring based on reflectance and chlorophyll fluorescence data. Acta Agron. Sin. 2021, 47, 2067–2079. [Google Scholar] [CrossRef]
  14. Li, W.; Guo, Y.; Yang, W.; Huang, L.; Zhang, J.; Peng, J.; Lan, Y. Severity Assessment of Cotton Canopy Verticillium Wilt by Machine Learning Based on Feature Selection and Optimization Algorithm Using UAV Hyperspectral Data. Remote Sens. 2024, 16, 4637. [Google Scholar] [CrossRef]
  15. Zhang, N.; Zhang, X.; Shang, P.; Ma, R.; Yuan, X.; Li, L.; Bai, T. Detection of Cotton Verticillium Wilt Disease Severity Based on Hyperspectrum and GWO-SVM. Remote Sens. 2023, 15, 3373. [Google Scholar] [CrossRef]
  16. Daughtry, C. Estimating Corn Leaf Chlorophyll Concentration from Leaf and Canopy Reflectance. Remote Sens. Environ. 2000, 74, 229–239. [Google Scholar] [CrossRef]
  17. Chen, B.; Li, S.; Wang, K.; Zhou, G.; Bai, J. Evaluating the severity level of cotton Verticillium using spectral signature analysis. Int. J. Remote Sens. 2012, 33, 2706–2724. [Google Scholar] [CrossRef]
  18. Sapes, G.; Schroeder, L.; Scott, A.; Clark, I.; Juzwik, J.; Montgomery, R.A.; Guzmán, Q.J.A.; Cavender-Bares, J. Mechanistic links between physiology and spectral reflectance enable previsual detection of oak wilt and drought stress. Proc. Natl. Acad. Sci. USA 2024, 121, e2316164121. [Google Scholar] [CrossRef]
  19. Tian, L.; Ustin, S.L.; Xue, B.; Zarco-Tejada, P.J.; Jin, Y.; Yao, X.; Zhu, Y.; Cao, W.; Cheng, T. Visualizing the pre-visual: Rice blast infection signals revealed. Remote Sens. Environ. 2025, 328, 114905. [Google Scholar] [CrossRef]
  20. Ma, R.; Zhang, N.; Zhang, X.; Bai, T.; Yuan, X.; Bao, H.; He, D.; Sun, W.; He, Y. Cotton Verticillium wilt monitoring based on UAV multispectral-visible multi-source feature fusion. Comput. Electron. Agric. 2024, 217, 108628. [Google Scholar] [CrossRef]
  21. Zhao, S.; Zhu, X.; Tan, X.; Tian, J. Spectrotemporal fusion: Generation of frequent hyperspectral satellite imagery. Remote Sens. Environ. 2025, 319, 114639. [Google Scholar] [CrossRef]
  22. Wang, J.; Yang, M.; Zheng, Z.; Gui, Y.; Zhou, J.; Zhang, C.; Zhao, L.; Gong, M.; Huang, C.; Zhang, Z. Modeling Temporal Resistance Assessment of Cotton to Verticillium Wilt Using Airborne Hyperspectral Data and Disease Progression Rates. Remote Sens. 2025, 17, 3701. [Google Scholar] [CrossRef]
  23. Su, B.; Liu, Y.; Huang, Y.; Wei, R.; Cao, X.; Han, D. Analysis for stripe rust dynamics in wheat population using UAV remote sensing. Trans. Chin. Soc. Agric. Eng. 2021, 37, 127–135. [Google Scholar]
  24. Jing, X.; Du, K.Q.; Duan, W.A.; Zou, Q.; Zhao, T.T.; Li, B.Y.; Ye, Q.X.; Yan, L.S. Quantifying the effects of stripe rust disease on wheat canopy spectrum based on eliminating non-physiological stresses. Crop J. 2022, 10, 1284–1291. [Google Scholar] [CrossRef]
  25. Wu, M.; Yang, C.; Song, X.; Hoffmann, W.C.; Huang, W.; Niu, Z.; Wang, C.; Li, W.; Yu, B. Monitoring cotton root rot by synthetic Sentinel-2 NDVI time series using improved spatial and temporal data fusion. Sci. Rep. 2018, 8, 2016. [Google Scholar] [CrossRef] [PubMed]
  26. Nie, J.; Jiang, J.; Li, Y.; Li, J.; Chao, X.; Ercisli, S. Efficient Detection of Cotton Verticillium Wilt by Combining Satellite Time-Series Data and Multiview UAV Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 13547–13557. [Google Scholar] [CrossRef]
  27. Nazarenko, E.; Varkentin, V.; Polyakova, T. Features of Application of Machine Learning Methods for Classification of Network Traffic (Features, Advantages, Disadvantages). In Proceedings of the 2019 International Multi-Conference on Industrial Engineering and Modern Technologies (FarEastCon), Vladivostok, Russia, 1–4 October 2019. [Google Scholar] [CrossRef]
  28. Kamilaris, A.; Prenafeta-Boldú, F.X. Deep learning in agriculture: A survey. Comput. Electron. Agric. 2018, 147, 70–90. [Google Scholar] [CrossRef]
  29. Shrestha, A.; Mahmood, A. Review of Deep Learning Algorithms and Architectures. IEEE Access 2019, 7, 53040–53065. [Google Scholar] [CrossRef]
  30. Hu, J.; Peng, D.; Chen, J.M.; Huete, A.R.; Yu, L.; Lou, Z.; Cheng, E.; Yang, X.; Zhang, B. High-precision inversion of vegetation parameters in the AI era: Integrating hyperspectral remote sensing and deep learning. Innovation 2025, 6, 100868. [Google Scholar] [CrossRef]
  31. Shuai, L.; Li, Z.; Chen, Z.; Luo, D.; Mu, J. A research review on deep learning combined with hyperspectral Imaging in multiscale agricultural sensing. Comput. Electron. Agric. 2024, 217, 108577. [Google Scholar] [CrossRef]
  32. Deng, J.; Hong, D.; Li, C.; Yao, J.; Yang, Z.; Zhang, Z.; Chanussot, J. RustQNet: Multimodal deep learning for quantitative inversion of wheat stripe rust disease index. Comput. Electron. Agric. 2024, 225, 109245. [Google Scholar] [CrossRef]
  33. Boulent, J.; Foucher, S.; Théau, J.; St-Charles, P.-L. Convolutional Neural Networks for the Automatic Identification of Plant Diseases. Front. Plant Sci. 2019, 10, 941. [Google Scholar] [CrossRef] [PubMed]
  34. Dhaka, V.S.; Meena, S.V.; Rani, G.; Sinwar, D.; Kavita; Ijaz, M.F.; Woźniak, M. A Survey of Deep Convolutional Neural Networks Applied for Prediction of Plant Leaf Diseases. Sensors 2021, 21, 4749. [Google Scholar] [CrossRef]
  35. Abdalla, A.; Wheeler, T.A.; Dever, J.; Lin, Z.; Arce, J.; Guo, W. Assessing fusarium oxysporum disease severity in cotton using unmanned aerial system images and a hybrid domain adaptation deep learning time series model. Biosyst. Eng. 2024, 237, 220–231. [Google Scholar] [CrossRef]
  36. Zhao, J.; Chu, F.; Xie, L.; Che, Y.; Wu, Y.; Burke, A.F. A survey of transformer networks for time series forecasting. Comput. Sci. Rev. 2026, 60, 100883. [Google Scholar] [CrossRef]
  37. Wen, Q.; Zhou, T.; Zhang, C.; Chen, W.; Ma, Z.; Yan, J.; Sun, L. Transformers in Time Series: A Survey. arXiv 2022, arXiv:2202.07125. [Google Scholar]
  38. Son, H. Toward a proposed framework for mood recognition using LSTM Recurrent Neuron Network. Procedia Comput. Sci. 2017, 109, 1028–1034. [Google Scholar] [CrossRef]
  39. Liu, J.; Li, Q.; Yang, H.; Han, Y.; Jiang, S.; Chen, W. Sequence Fault Diagnosis for PEMFC Water Management Subsystem Using Deep Learning With t-SNE. IEEE Access 2019, 7, 92009–92019. [Google Scholar] [CrossRef]
  40. Wang, Y.; Chen, P. Network traffic prediction based on transformer and temporal convolutional network. PLoS ONE 2025, 20, e0320368. [Google Scholar] [CrossRef]
  41. Feng, L.; Wan, S.; Zhang, Y.; Dong, H. Xinjiang cotton: Achieving super-high yield through efficient utilization of light, heat, water, and fertilizer by three generations of cultivation technology systems. Field Crops Res. 2024, 312, 109401. [Google Scholar] [CrossRef]
  42. Yao, H.; Zhang, Y.; Yi, X.; Zhang, X.; Zhang, W. Cotton responds to different plant population densities by adjusting specific leaf area to optimize canopy photosynthetic use efficiency of light and nitrogen. Field Crops Res. 2016, 188, 10–16. [Google Scholar] [CrossRef]
  43. Zhang, Y.; Zhang, D.; Zhang, Y.; Cheng, F.; Zhao, X.; Wang, M.; Fan, X. Early detection of verticillium wilt in eggplant leaves by fusing five image channels: A deep learning approach. Plant Methods 2024, 20, 173. [Google Scholar] [CrossRef]
  44. Sereda, I.; Danilov, R.; Kremneva, O.; Zimin, M.; Podushin, Y. Development of Methods for Remote Monitoring of Leaf Diseases in Wheat Agrocenoses. Plants 2023, 12, 3223. [Google Scholar] [CrossRef]
  45. Zhang, K.; Yan, F.; Liu, P. The application of hyperspectral imaging for wheat biotic and abiotic stress analysis: A review. Comput. Electron. Agric. 2024, 221, 109008. [Google Scholar] [CrossRef]
  46. Chaiyana, A.; Khiripet, N.; Ninsawat, S.; Siriwan, W.; Shanmugam, M.S.; Virdis, S.G.P. Early prediction of cassava mosaic disease onset based on remote sensing and climatic data. Comput. Electron. Agric. 2025, 230, 109836. [Google Scholar] [CrossRef]
  47. Wu, Z. TCN-Driven Volatility-Robust Forecasting in Minute-Resolution Cryptocurrency Markets. In Proceedings of the 2025 International Conference on Economic Management and Big Data Application, Shenzhen, China, 25–27 July 2025; Association for Computing Machinery: New York, NY, USA, 2025; pp. 389–395. [Google Scholar]
  48. Liu, Y.; Wu, Y.-H.; Sun, G.; Zhang, L.; Chhatkuli, A.; Van Gool, L. Vision Transformers with Hierarchical Attention. Mach. Intell. Res. 2024, 21, 670–683. [Google Scholar] [CrossRef]
  49. Shi, S.; Xu, L.; Gong, W.; Chen, B.; Chen, B.; Qu, F.; Tang, X.; Sun, J.; Yang, J. A convolution neural network for forest leaf chlorophyll and carotenoid estimation using hyperspectral reflectance. Int. J. Appl. Earth Obs. Geoinf. 2022, 108, 102719. [Google Scholar] [CrossRef]
  50. Zhao, R.; Tang, W.; Liu, M.; Wang, N.; Sun, H.; Li, M.; Ma, Y. Spatial-spectral feature extraction for in-field chlorophyll content estimation using hyperspectral imaging. Biosyst. Eng. 2024, 246, 263–276. [Google Scholar] [CrossRef]
  51. Deng, J.; Zhang, X.; Yang, Z.; Zhou, C.; Wang, R.; Zhang, K.; Lv, X.; Yang, L.; Wang, Z.; Li, P.; et al. Pixel-level regression for UAV hyperspectral images: Deep learning-based quantitative inverse of wheat stripe rust disease index. Comput. Electron. Agric. 2023, 215, 108434. [Google Scholar] [CrossRef]
  52. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 6000–6010. [Google Scholar]
  53. Barona López, L.I.; Ferri, F.M.; Zea, J.; Valdivieso Caraguay, Á.L.; Benalcázar, M.E. CNN-LSTM and post-processing for EMG-based hand gesture recognition. Intell. Syst. Appl. 2024, 22, 200352. [Google Scholar] [CrossRef]
  54. Xia, Y.; Xiong, Y.; Wang, K. A transformer model blended with CNN and denoising autoencoder for inter-patient ECG arrhythmia classification. Biomed. Signal Process. Control 2023, 86, 105271. [Google Scholar] [CrossRef]
  55. Li, Y.; Ren, Q.; Jin, H.; Han, M. LSTN: Long Short-Term Traffic Flow Forecasting with Transformer Networks. In Proceedings of the 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 21–25 August 2022; pp. 4793–4800. [Google Scholar] [CrossRef]
  56. Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  57. Zhu, Y.; Zhao, M.; Li, T.; Wang, L.; Liao, C.; Liu, D.; Zhang, H.; Zhao, Y.; Liu, L.; Ge, X.; et al. Interactions between Verticillium dahliae and cotton: Pathogenic mechanism and cotton resistance mechanism to Verticillium wilt. Front. Plant Sci. 2023, 14, 1174281. [Google Scholar] [CrossRef]
  58. Zhang, Y.; Zhou, J.; Zhao, L.; Feng, Z.; Wei, F.; Bai, H.; Feng, H.; Zhu, H. A review of the pathogenicity mechanism of Verticillium dahliae in cotton. J. Cotton Res. 2022, 5, 3. [Google Scholar] [CrossRef]
  59. Zhang, D.D.; Dai, X.F.; Klosterman, S.J.; Subbarao, K.V.; Chen, J.Y. The secretome of Verticillium dahliae in collusion with plant defence responses modulates Verticillium wilt symptoms. Biol. Rev. 2022, 97, 1810–1822. [Google Scholar] [CrossRef]
  60. Chen, J.-Y.; Xiao, H.-L.; Gui, Y.-J.; Zhang, D.-D.; Li, L.; Bao, Y.-M.; Dai, X.-F. Characterization of the Verticillium dahliae Exoproteome Involves in Pathogenicity from Cotton-Containing Medium. Front. Microbiol. 2016, 7, 1709. [Google Scholar] [CrossRef] [PubMed]
  61. Yadeta, K.A.; Thomma, B.P.H.J. The xylem as battleground for plant hosts and vascular wilt pathogens. Front. Plant Sci. 2013, 4, 97. [Google Scholar] [CrossRef]
  62. Zhang, J.C.; Huang, Y.B.; Pu, R.L.; Gonzalez-Moreno, P.; Yuan, L.; Wu, K.H.; Huang, W.J. Monitoring plant diseases and pests through remote sensing technology: A review. Comput. Electron. Agric. 2019, 165, 104943. [Google Scholar] [CrossRef]
  63. Lassalle, G. Monitoring natural and anthropogenic plant stressors by hyperspectral remote sensing: Recommendations and guidelines based on a meta-review. Sci. Total Environ. 2021, 788, 147758. [Google Scholar] [CrossRef]
  64. Feng, S.; Zhao, D.X.; Guan, Q.; Li, J.P.; Liu, Z.Y.; Jin, Z.Y.; Li, G.M.; Xu, T.Y. A deep convolutional neural network-based wavelength selection method for spectral characteristics of rice blast disease. Comput. Electron. Agric. 2022, 199, 107199. [Google Scholar] [CrossRef]
  65. Shafik, W.; Tufail, A.; Namoun, A.; De Silva, L.C.; Apong, R.A.A.H.M. A Systematic Literature Review on Plant Disease Detection: Motivations, Classification Techniques, Datasets, Challenges, and Future Trends. IEEE Access 2023, 11, 59174–59203. [Google Scholar] [CrossRef]
  66. Gitelson, A.; Merzlyak, M.; Zur, Y.; Stark, R.; Gritz, U. Non-destructive and remote sensing techniques for estimation of vegetation status. In Proceedings of the 3rd European Conference on Precision Agriculture, Montpelier, France, 18–20 June 2001; Volume 1. [Google Scholar]
  67. Delegido, J.; Vergara, C.; Verrelst, J.; Gandía, S.; Moreno, J. Remote Estimation of Crop Chlorophyll Content by Means of High-Spectral-Resolution Reflectance Techniques. Agron. J. 2011, 103, 1834–1842. [Google Scholar] [CrossRef]
  68. Njoku, E. The red edge in arid region vegetation: 340–1060 nm spectra. In Proceedings of the 4th Annual JPL Airborne Geoscience Workshop, Washington, DC, USA, 25–29 October 1993. [Google Scholar]
  69. Horler, D.N.H.; Dockray, M.; Barber, J. The red edge of plant leaf reflectance. Int. J. Remote Sens. 1983, 4, 273–288. [Google Scholar] [CrossRef]
  70. Filella, I.; Penuelas, J. The red edge position and shape as indicators of plant chlorophyll content, biomass and hydric status. Int. J. Remote Sens. 1994, 15, 1459–1470. [Google Scholar] [CrossRef]
  71. Liang, S.-Z.; Shi, P.; Ma, W.-D.; Xing, Q.-G.; Yu, L.-J. Relational analysis of spectra and red-edge characteristics of plant leaf and leaf biochemical constituent. Chin. J. Eco-Agric. 2010, 18, 804–809. [Google Scholar] [CrossRef]
  72. Tamary, E.; Nevo, R.; Naveh, L.; Levin-Zaidman, S.; Kiss, V.; Savidor, A.; Levin, Y.; Eyal, Y.; Reich, Z.; Adam, Z. Chlorophyll catabolism precedes changes in chloroplast structure and proteome during leaf senescence. Plant Direct 2019, 3, e00127. [Google Scholar] [CrossRef]
  73. Harris, J.B.; Schaefer, V.G. Some Correlated Events in Aging Leaf Tissues of Tree Tomato and Tobacco. Bot. Gaz. 1981, 142, 43–54. [Google Scholar] [CrossRef]
  74. Zarco-Tejada, P.J.; Camino, C.; Beck, P.S.A.; Calderon, R.; Hornero, A.; Hernandez-Clemente, R.; Kattenborn, T.; Montes-Borrego, M.; Susca, L.; Morelli, M.; et al. Previsual symptoms of Xylella fastidiosa infection revealed in spectral plant-trait alterations. Nat. Plants 2018, 4, 432–439. [Google Scholar] [CrossRef] [PubMed]
  75. Watt, M.S.; Poblete, T.; de Silva, D.; Estarija, H.J.C.; Hartley, R.J.L.; Leonardo, E.M.C.; Massam, P.; Buddenbaum, H.; Zarco-Tejada, P.J. Prediction of the severity of Dothistroma needle blight in radiata pine using plant based traits and narrow band indices derived from UAV hyperspectral imagery. Agric. For. Meteorol. 2023, 330, 109294. [Google Scholar] [CrossRef]
  76. Camino, C.; Calderón, R.; Parnell, S.; Dierkes, H.; Chemin, Y.; Román-Écija, M.; Montes-Borrego, M.; Landa, B.B.; Navas-Cortes, J.A.; Zarco-Tejada, P.J.; et al. Detection of Xylella fastidiosa in almond orchards by synergic use of an epidemic spread model and remotely sensed plant traits. Remote Sens. Environ. 2021, 260, 112420. [Google Scholar] [CrossRef]
  77. Poblete, T.; Camino, C.; Beck, P.S.A.; Hornero, A.; Kattenborn, T.; Saponari, M.; Boscia, D.; Navas-Cortes, J.A.; Zarco-Tejada, P.J. Detection of Xylella fastidiosa infection symptoms with airborne multispectral and thermal imagery: Assessing bandset reduction performance from hyperspectral analysis. ISPRS J. Photogramm. Remote Sens. 2020, 162, 27–40. [Google Scholar] [CrossRef]
  78. Rogers, C.A.; Chen, J.M.; Zheng, T.; Croft, H.; Gonsamo, A.; Luo, X.; Staebler, R.M. The Response of Spectral Vegetation Indices and Solar-Induced Fluorescence to Changes in Illumination Intensity and Geometry in the Days Surrounding the 2017 North American Solar Eclipse. J. Geophys. Res. Biogeosci. 2020, 125, e2020JG005774. [Google Scholar] [CrossRef]
  79. Schickling, A.; Matveeva, M.; Damm, A.; Schween, J.; Wahner, A.; Graf, A.; Crewell, S.; Rascher, U. Combining Sun-Induced Chlorophyll Fluorescence and Photochemical Reflectance Index Improves Diurnal Modeling of Gross Primary Productivity. Remote Sens. 2016, 8, 574. [Google Scholar] [CrossRef]
  80. Peng, Y.; Zeng, A.; Zhu, T.; Fang, S.; Gong, Y.; Tao, Y.; Zhou, Y.; Liu, K. Using remotely sensed spectral reflectance to indicate leaf photosynthetic efficiency derived from active fluorescence measurements. J. Appl. Remote Sens. 2017, 11, 026034. [Google Scholar] [CrossRef]
  81. Pascual, I.; Azcona, I.; Morales, F.; Aguirreolea, J.; Sanchez-Diaz, M. Photosynthetic response of pepper plants to wilt induced by Verticillium dahliae and soil water deficit. J. Plant Physiol. 2010, 167, 701–708. [Google Scholar] [CrossRef]
  82. Gitelson, A.A.; Merzlyak, M.N. Remote estimation of chlorophyll content in higher plant leaves. Int. J. Remote Sens. 1997, 18, 2691–2697. [Google Scholar] [CrossRef]
  83. Mahlein, A.-K.; Rumpf, T.; Welke, P.; Dehne, H.-W.; Plümer, L.; Steiner, U.; Oerke, E.-C. Development of spectral indices for detecting and identifying plant diseases. Remote Sens. Environ. 2013, 128, 21–30. [Google Scholar] [CrossRef]
  84. Barnes, E.M.; Clarke, T.; Richards, S.E.; Colaizzi, P.D.; Haberland, J.; Kostrzewski, M.; Waller, P.M.; Choi, C.Y.; Riley, E.; Thompson, T.L.; et al. Coincident detection of crop water stress, nitrogen status and canopy density using ground-based multispectral data. In Proceedings of the Fifth International Conference on Precision Agriculture and Other Resource Management, Bloomington, MN, USA, 16–19 July 2000. [Google Scholar]
  85. Zarco-Tejada, P.J.; Miller, J.R.; Mohammed, G.H.; Noland, T.L.; Sampson, P.H. Chlorophyll Fluorescence Effects on Vegetation Apparent Reflectance: II. Laboratory and Airborne Canopy-Level Measurements with Hyperspectral Data. Remote Sens. Environ. 2000, 74, 596–608. [Google Scholar] [CrossRef]
  86. Hernández-Clemente, R.; Navarro-Cerrillo, R.M.; Suárez, L.; Morales, F.; Zarco-Tejada, P.J. Assessing structural effects on PRI for stress detection in conifer forests. Remote Sens. Environ. 2011, 115, 2360–2375. [Google Scholar] [CrossRef]
  87. Gamon, J.A.; Peñuelas, J.; Field, C.B. A narrow-waveband spectral index that tracks diurnal changes in photosynthetic efficiency. Remote Sens. Environ. 1992, 41, 35–44. [Google Scholar] [CrossRef]
  88. Meroni, M.; Rossini, M.; Guanter, L.; Alonso, L.; Rascher, U.; Colombo, R.; Moreno, J. Remote sensing of solar-induced chlorophyll fluorescence: Review of methods and applications. Remote Sens. Environ. 2009, 113, 2037–2051. [Google Scholar] [CrossRef]
  89. Gitelson, A.; Gritz, Y.; Merzlyak, M. Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. J. Plant Physiol. 2003, 160, 271–282. [Google Scholar] [CrossRef]
  90. Calderón, R.; Navas-Cortés, J.A.; Lucena, C.; Zarco-Tejada, P.J. High-resolution airborne hyperspectral and thermal imagery for early detection of Verticillium wilt of olive using fluorescence, temperature and narrow-band spectral indices. Remote Sens. Environ. 2013, 139, 231–245. [Google Scholar] [CrossRef]
  91. Zarco-Tejada, P.J.; González-Dugo, V.; Berni, J.A.J. Fluorescence, temperature and narrow-band indices acquired from a UAV platform for water stress detection using a micro-hyperspectral imager and a thermal camera. Remote Sens. Environ. 2012, 117, 322–337. [Google Scholar] [CrossRef]
  92. Haboudane, D.; Miller, J.R.; Tremblay, N.; Zarco-Tejada, P.J.; Dextraze, L. Integrated narrow-band vegetation indices for prediction of crop chlorophyll content for application to precision agriculture. Remote Sens. Environ. 2002, 81, 416–426. [Google Scholar] [CrossRef]
  93. Gitelson, A.A.; Keydan, G.P.; Merzlyak, M.N. Three-band model for noninvasive estimation of chlorophyll, carotenoids, and anthocyanin contents in higher plant leaves. Geophys. Res. Lett. 2006, 33, L11402. [Google Scholar] [CrossRef]
  94. Carter, G.A. Ratios of leaf reflectances in narrow wavebands as indicators of plant stress. Int. J. Remote Sens. 1994, 15, 697–703. [Google Scholar] [CrossRef]
  95. Datt, B. Remote Sensing of Chlorophyll a, Chlorophyll b, Chlorophyll a+b, and Total Carotenoid Content in Eucalyptus Leaves. Remote Sens. Environ. 1998, 66, 111–121. [Google Scholar] [CrossRef]
  96. Gitelson, A.A.; Merzlyak, M.N. Signature Analysis of Leaf Reflectance Spectra: Algorithm Development for Remote Sensing of Chlorophyll. J. Plant Physiol. 1996, 148, 494–500. [Google Scholar] [CrossRef]
  97. Lichtenthaler, H.K. Vegetation stress: An introduction to the stress concept in plants. J. Plant Physiol. 1996, 148, 4–14. [Google Scholar] [CrossRef]
  98. Blackburn, G.A. Spectral indices for estimating photosynthetic pigment concentrations: A test using senescent tree leaves. Int. J. Remote Sens. 1998, 19, 657–675. [Google Scholar] [CrossRef]
  99. Vogelmann, J.E.; Rock, B.N.; Moss, D.M. Red edge spectral measurements from sugar maple leaves. Int. J. Remote Sens. 1993, 14, 1563–1575. [Google Scholar] [CrossRef]
  100. Herrmann, I.; Karnieli, A.; Bonfil, D.J.; Cohen, Y.; Alchanatis, V. SWIR-based spectral indices for assessing nitrogen content in potato fields. Int. J. Remote Sens. 2010, 31, 5127–5143. [Google Scholar] [CrossRef]
  101. Serrano, L.; Peñuelas, J.; Ustin, S.L. Remote sensing of nitrogen and lignin in Mediterranean vegetation from AVIRIS data: Decomposing biochemical from structural signals. Remote Sens. Environ. 2002, 81, 355–364. [Google Scholar] [CrossRef]
  102. Babar, M.A.; Reynolds, M.P.; Van Ginkel, M.; Klatt, A.R.; Raun, W.R.; Stone, M.L. Spectral reflectance to estimate genetic variation for in-season biomass, leaf chlorophyll, and canopy temperature in wheat. Crop Sci 2006, 46, 1046–1057. [Google Scholar] [CrossRef]
  103. Penuelas, J.; Pinol, J.; Ogaya, R.; Filella, I. Estimation of plant water concentration by the reflectance Water Index WI (R900/R970). Int. J. Remote Sens. 1997, 18, 2869–2875. [Google Scholar] [CrossRef]
  104. Sapes, G.; Lapadat, C.; Schweiger, A.K.; Juzwik, J.; Montgomery, R.; Gholizadeh, H.; Townsend, P.A.; Gamon, J.A. Canopy spectral reflectance detects oak wilt at the landscape scale using phylogenetic discrimination. Remote Sens. Environ. 2022, 273, 112961. [Google Scholar] [CrossRef]
  105. Gao, B.-C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 1996, 58, 257–266. [Google Scholar] [CrossRef]
Figure 1. Experimental area of the cotton disease nursery at the Cotton Research Institute, Shihezi Academy of Agricultural Sciences, Shihezi, Xinjiang, China.
Figure 2. Schematic diagram of the in situ spectral measurement setup for attached cotton leaves.
Figure 3. Field-collected cotton leaf dataset. (a) Continuous spectral measurements of main-stem leaves. (b) High-resolution RGB images with corresponding disease severity (DS).
Figure 4. Time-series boxplots of BRI and disease severity across the monitoring period, showing the temporal distributions of daily observations over all sampled leaves. The yellow lines indicate the median values at each monitoring date.
Figure 5. The overall flowchart of the proposed Transformer-TCN (Former-TCN) framework.
Figure 6. Scatter plots of observed versus predicted cotton Verticillium wilt severity for different deep learning architectures.
Figure 7. Contribution of spectral indices based on SHAP values for the Transformer-TCN model.
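Figure 7 ranks the spectral indices by their SHAP values [56]. The Shapley value of a feature is its average marginal contribution over all coalitions of the other features; the paper applies the SHAP library to the trained Transformer-TCN, but the attribution principle can be shown exactly on a toy surrogate. The three feature names, weights, and baseline values below are hypothetical, chosen only to mirror the paper's finding that BRI dominates:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, features, background):
    """Exact Shapley values of model f, enumerating all feature coalitions.
    Features absent from a coalition are set to their background value."""
    names = list(features)
    n = len(names)
    phi = {}
    for i in names:
        others = [j for j in names if j != i]
        val = 0.0
        for r in range(n):
            for S in combinations(others, r):
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                x_with = {j: features[j] if (j in S or j == i) else background[j] for j in names}
                x_without = {j: features[j] if j in S else background[j] for j in names}
                val += weight * (f(x_with) - f(x_without))
        phi[i] = val
    return phi

# Hypothetical linear surrogate: severity rises with BRI and falls with PRI.
f = lambda x: 2.0 * x["BRI"] - 1.0 * x["PRI"] + 0.5 * x["WBI"]
features   = {"BRI": 0.8, "PRI": 0.2, "WBI": 0.5}
background = {"BRI": 0.5, "PRI": 0.5, "WBI": 0.5}
phi = shapley_values(f, features, background)
# For a linear model the Shapley value reduces to weight * (x - baseline):
# BRI -> 2.0 * 0.3 = 0.6, PRI -> -1.0 * (-0.3) = 0.3, WBI -> 0.0
```

The efficiency property holds by construction: the attributions sum to f(features) − f(background), which is what makes SHAP summaries like Figure 7 additive decompositions of the model output.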
Figure 8. Distributional patterns of SHAP contributions for key spectral indices in the Transformer-TCN model.
Figure 9. Temporal SHAP consistency of spectral indices across the 5-day input window for the Transformer-TCN model, showing dynamic attribution patterns of physiological, pigment, biochemical, structural, and water-related features.
Figure 10. R2 performance comparison of single-index time-series features across five functional index categories using the proposed Transformer-TCN model. (Teal green: Xanthophyll Cycle & Fluorescence Indices; orange: Pigment Indices; purple: Biochemical Absorption Indices; yellow-green: Structural Indices; blue: Water Content Indices).
Figure 11. R2 of the proposed Transformer-TCN under different learning rates and batch sizes.
Table 1. Parameter statistics for the Transformer-TCN model.
| No. | Module | Layer Name | Input Size | Output Size | Description |
|---|---|---|---|---|---|
| 1 | Input | Input | [8, 170] | [8, 170] | Raw feature |
| 2 | Input | Reshape | [8, 170] | [8, 170, 1] | 1D sequence |
| 3 | TCN Branch (2 × TCN) | TCN Block | [8, 170, 1] | [8, 64, 1] | 2-layer TCN |
| 4 | TCN Branch | Flatten | [8, 64, 1] | [8, 64] | TCN flattening |
| 5 | Transformer Branch (4 × Encoder) | Conv1d | [8, 170, 1] | [8, 32, 1] | 32-D embedding |
| 6 | Transformer Branch | Permute | [8, 32, 1] | [8, 1, 32] | Reshaping |
| 7 | Transformer Branch | FormerEncoder | [8, 1, 32] | [8, 1, 32] | 4-layer encoder |
| 8 | Transformer Branch | Permute | [8, 1, 32] | [8, 32, 1] | Reshaping |
| 9 | Transformer Branch | Concat | [8, 32, 1] | [8, 64, 1] | Concatenation |
| 10 | Transformer Branch | Flatten | [8, 64, 1] | [8, 64] | Former flattening |
| 11 | Feature Fusion | Concat | [8, 64] | [8, 128] | Concatenation |
| 12 | Feature Fusion | Permute | [8, 128] | [8, 128, 1] | Reshaping |
| 13 | Feature Fusion | Conv1d | [8, 128, 1] | [8, 32, 1] | Channel fusion |
| 14 | Feature Fusion | Flatten | [8, 32, 1] | [8, 32] | Flattening |
| 15 | Output | Linear | [8, 32] | [8, 1] | Prediction |
| 16 | TCN hyperparameters | hidden_dim = [32, 64], kernel_size = 3 | | | |
| 17 | Transformer hyperparameters | hidden_dim = 32, heads = 2, layers = 4 | | | |
| 18 | Training hyperparameters | batch_size = 8, learning_rate = 3 × 10⁻⁴, epochs = 50 | | | |
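The TCN branch in Table 1 rests on dilated causal convolutions. A scalar-channel NumPy sketch of that operation (a toy version without the channel mixing and residual connections of the real TCN block, and assuming the common exponential dilation schedule of 1 and 2 for a 2-layer, kernel-size-3 stack) illustrates the two properties that matter for time-series severity estimation: the output at step t never sees future steps, and stacking dilations widens the receptive field:

```python
import numpy as np

def dilated_causal_conv1d(x, w, dilation):
    """Causal 1-D convolution: y[t] uses x[t], x[t-d], x[t-2d], ... only.
    x: (T,) sequence; w: (k,) kernel with w[-1] applied to the current step."""
    T, k = len(x), len(w)
    pad = (k - 1) * dilation                 # left-pad so the output stays causal
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[t + j * dilation] for j in range(k))
                     for t in range(T)])

x = np.array([0.1, 0.3, 0.6, 0.8, 1.0])      # a 5-step window, like the 5-day slices
w = np.array([0.25, 0.25, 0.5])              # kernel_size = 3, as in Table 1

# Two stacked layers with dilations 1 and 2 give a receptive field of
# 1 + (3 - 1) * (1 + 2) = 7 steps, enough to cover the 5-day input window.
h = dilated_causal_conv1d(x, w, dilation=1)
y = dilated_causal_conv1d(h, w, dilation=2)

# Causality check: perturbing the last input leaves earlier outputs unchanged.
x2 = x.copy(); x2[-1] += 10.0
h2 = dilated_causal_conv1d(x2, w, dilation=1)
print(np.allclose(h[:-1], h2[:-1]))          # True
```

Because the padding is entirely on the left, the sequence length is preserved at every layer, which is why the TCN block in Table 1 keeps the temporal axis intact until the final Flatten.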
Table 2. Prediction performance of different deep learning architectures using time-series spectral features for cotton VW monitoring.
| Model | R2 | RMSE | RPD | RPIQ |
|---|---|---|---|---|
| CNN | 0.7491 | 0.4781 | 1.9965 | 3.6012 |
| LSTM | 0.7406 | 0.5282 | 1.9633 | 3.6581 |
| Transformer | 0.7910 | 0.4418 | 2.1875 | 3.7673 |
| TCN | 0.7898 | 0.4755 | 2.1811 | 4.0639 |
| CNN-LSTM | 0.8046 | 0.4821 | 2.2620 | 4.2525 |
| CNN-TCN | 0.8192 | 0.4658 | 2.3521 | 4.6810 |
| Former-CNN | 0.8367 | 0.3906 | 2.4745 | 4.2614 |
| Former-LSTM | 0.8272 | 0.4017 | 2.4058 | 4.1432 |
| Former-TCN | 0.8813 | 0.3375 | 2.9025 | 5.4366 |
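The four columns of Table 2 follow standard regression conventions. Assuming the conventional definitions RPD = SD(observed)/RMSE and RPIQ = IQR(observed)/RMSE (the paper reports these metrics without restating the formulas here), they can be computed as:

```python
import numpy as np

def eval_metrics(y_true, y_pred):
    """R2, RMSE, RPD, and RPIQ for a severity regression, assuming the
    conventional definitions RPD = SD/RMSE and RPIQ = IQR/RMSE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    r2 = 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    q1, q3 = np.percentile(y_true, [25, 75])
    return {"R2": r2,
            "RMSE": rmse,
            "RPD": y_true.std(ddof=1) / rmse,   # ratio of performance to deviation
            "RPIQ": (q3 - q1) / rmse}           # ratio of performance to IQR

# Synthetic example: noisy predictions of a 0-4 severity score.
rng = np.random.default_rng(42)
y = rng.uniform(0.0, 4.0, size=200)
y_hat = y + rng.normal(0.0, 0.3, size=200)
m = eval_metrics(y, y_hat)
```

Higher RPD and RPIQ mean the residual error is small relative to the spread of observed severities; for instance, Former-TCN's RPD of 2.9025 in Table 2 indicates an RMSE of roughly a third of the observed standard deviation.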
Table 3. Prediction performance of cotton Verticillium wilt monitoring using different deep learning models under non-temporal inputs.
| Model | R2 | RMSE | RPD | RPIQ |
|---|---|---|---|---|
| CNN | 0.6668 | 0.6029 | 1.7324 | 2.9257 |
| LSTM | 0.6990 | 0.5730 | 1.8227 | 3.0781 |
| Transformer | 0.7246 | 0.5481 | 1.9056 | 3.2181 |
| TCN | 0.7126 | 0.5599 | 1.8653 | 3.1501 |
| CNN-LSTM | 0.7001 | 0.5720 | 1.8260 | 3.0836 |
| CNN-TCN | 0.6902 | 0.5813 | 1.7968 | 3.0343 |
| Former-CNN | 0.7201 | 0.5621 | 1.8901 | 2.8589 |
| Former-LSTM | 0.7309 | 0.5418 | 1.9276 | 3.2553 |
| Former-TCN | 0.7650 | 0.5150 | 2.0629 | 3.1202 |
Table 4. Ablation analysis of different components in the Transformer-TCN method for disease severity estimation.
| CNN | Transformer | R2 | RMSE | RPD | RPIQ |
|---|---|---|---|---|---|
| × | ✓ | 0.7248 | 0.4683 | 1.9064 | 3.2781 |
| ✓ | × | 0.7886 | 0.4503 | 2.1750 | 4.0739 |
| ✓ | ✓ | 0.8813 | 0.3375 | 2.9025 | 5.4366 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Gao, Y.; Huang, C.; Zhang, X.; Zhang, Z. Modeling Spectral–Temporal Information for Estimating Cotton Verticillium Wilt Severity Using a Transformer-TCN Deep Learning Framework. Remote Sens. 2026, 18, 1105. https://doi.org/10.3390/rs18081105

