Article

A Novel Spectral Vegetation Index for Improved Detection of Soybean Cyst Nematode (SCN) Infestation Using Hyperspectral Data

1 School of Earth Systems and Sustainability, Southern Illinois University, Carbondale, IL 62901, USA
2 School of Agricultural Sciences, Southern Illinois University, Carbondale, IL 62901, USA
* Author to whom correspondence should be addressed.
Crops 2025, 5(5), 58; https://doi.org/10.3390/crops5050058
Submission received: 17 June 2025 / Revised: 22 August 2025 / Accepted: 25 August 2025 / Published: 29 August 2025

Abstract

Soybean cyst nematode (SCN) is a pathogen with serious impacts on soybean yields, yet traditional field-based assessment is labor-intensive and often ineffective for early intervention, and existing spectral vegetation indices (VIs) cannot accurately detect SCN-infested plants. This study aimed to develop an improved detection method using hyperspectral data. A greenhouse experiment was designed to collect 100 hyperspectral datasets from 20 soybean plants inoculated with four SCN egg levels (0–10,000) from the 68th to the 97th day after planting. Based on spectral similarity and inoculation levels, three stress classes were defined as proxies for actual plant stress: healthy (0 eggs), moderate (1000 and 5000 eggs), and severe (10,000 eggs). These classifications are based on predefined inoculation thresholds and spectral trends, which may not fully align with direct physiological stress measurements due to inherent variability in individual plant responses. Through analysis of variance (ANOVA), principal component analysis (PCA), feature selection, and classification comparison, a new spectral VI, called SCNVI, was proposed using the bands at 338 nm and 665 nm. The SCNVI coupled with eXtreme Gradient Boosting (XGBoost) achieved a classification accuracy of 70% for three classes and outperformed the 12 traditional VIs. These findings suggest that integrating the SCNVI with the XGBoost algorithm has the potential to improve the detection of SCN infestation, though further validation in field environments is required to confirm its practical applicability.

1. Introduction

Soybean (Glycine max L.) is one of the most important crops worldwide, with particular significance in the United States of America. It is one of the largest suppliers of animal protein feed and the second-largest contributor to vegetable oil production [1]. However, soybean cyst nematode (SCN, Heterodera glycines) often reduces soybean yield and causes economic loss [2]. SCN infestation, which involves microscopic nematodes feeding on soybean roots, leads to the formation of cysts that can house hundreds of eggs. These eggs hatch and further damage the plants, impacting their water and nutrient absorption and thus reducing yields. Consequently, understanding the spatial distribution and patterns of SCN and developing an effective detection method becomes important for sustaining high-yield soybean production [3].
Traditional detection methods, such as manual soil sampling and root inspection [1,4], are laborious, time-consuming and prone to failure in early SCN detection [5]. This is because visible symptoms of SCN infestation typically appear only after significant crop damage [6]. Moreover, these methods might not accurately represent SCN population variability due to their reliance on limited and random sampling [7].
A cost-effective alternative is remote sensing-based detection, in which SCN infestation is captured by analyzing the differing spectral reflectance characteristics of healthy and infested plants. Remote sensing methods include ground-based, airborne, and space-borne techniques that acquire spectral reflectance information through imaging [1,8]. Ground-based remote sensing uses sensors installed on vehicles or portable devices to capture images and spectral reflectance from soybean fields. Airborne remote sensing uses fixed-wing aircraft and unmanned aerial vehicles (UAVs, or drones) equipped with sensors to gather data at close range. Space-borne remote sensing collects images of soybean fields from satellites orbiting the Earth at long range. These remote sensing methods provide a non-invasive and efficient way to collect data on crop health, including detecting SCN infestation.
Despite their advantage of large-area coverage, satellite or space-borne images such as Landsat images lack the ability to detect early infestation and crop damage caused by disease and pests because of their coarse spatial and temporal resolution [9]. In contrast, airborne and UAV multispectral images, owing to their fine spatial resolutions, have been applied to the detection of SCN [10]. Specifically, airborne images have spatial resolutions that are often finer than 1 m × 1 m, and they have the potential to detect SCN-induced stress in soybean [1,8]. However, these methods lack the ability to detect subtle changes, such as early or small-area SCN infestations, because of their limited number of spectral bands.
Given these challenges, there is a growing interest in developing more efficient and accurate early detection methods [11]. Hyperspectral remote sensing can capture reflectance data across hundreds of contiguous narrow bands spanning the regions from visible to shortwave infrared (400–2500 nm), offering high spatial resolution and non-invasive insights into plant health [12]. The rich spectral data facilitates the detection of subtle physiological and biochemical changes in plants caused by SCN, which allows for early-stage identification even before symptoms are visible [13]. There are two types of hyperspectral systems: imaging-based and non-imaging-based. Compared to imaging systems, non-imaging hyperspectral sensors focus only on spectral reflectance curves, simplifying data acquisition and analysis while eliminating challenges such as mixed pixels, variable illumination, and atmospheric distortions [14]. These sensors also facilitate faster and more streamlined processing, which makes them ideal for real-time field diagnostics [15].
Moreover, vegetation indices (VIs) derived from multispectral and hyperspectral data can further enhance sensitivity to stress-induced changes in plants. Various VIs have been widely used to distinguish between healthy and infested soybean plants [1,11,15,16,17]. For example, Bajwa et al. [15] compared a total of 13 VIs and found that the performance in detecting SCN-infested soybean plants varied greatly among different VIs and time periods after planting. Moreover, Kulkarni et al. [16,17] used NDVI, GNDVI, and WDRVI to detect the dynamics of soybean plants infected by SCN and predict the effect of SCN on soybean yield. Overall, the contributions of the VIs to improving the detection of soybean SCN infestations varied greatly and were site-specific. However, compared with original bands, VIs provided greater potential for early detection and management of crop diseases like SCN [18].
This study aimed to develop and evaluate a spectral vegetation index (SCNVI) for SCN infestation detection using non-imaging hyperspectral data and machine learning methods in a controlled greenhouse setting. The SCN-specific VI (SCNVI) was developed using key wavelengths identified through statistical analysis and feature selection methods, including one-way ANOVA, PCA, linear discriminant analysis (LDA; [19]), Select From Model with a support vector machine (SFM + SVM; [20]), SFM with random forest (SFM + RF; [21]), SFM with eXtreme Gradient Boosting (SFM + XGBoost; [22]), recursive feature elimination with SVM (RFE + SVM; [23]), RFE with RF (RFE + RF; [24]), and RFE with XGBoost (RFE + XGBoost; [25]). The ability of the SCNVI to separate infested plants from non-infested ones was validated by comparison with 12 widely used VIs.

2. Materials and Methods

An experimental design was first used for the collection of hyperspectral data from non-inoculated and inoculated soybean plants in a greenhouse, and the statistical characteristics of the hyperspectral data were analyzed to investigate the separability of three stress levels: healthy, moderate stress, and severe stress. A one-way ANOVA of the three classes was then performed to explore statistically significant differences in spectral reflectance values among the classes, and PCA was carried out to select a total of 40 spectral bands that dominantly contributed to the principal component 1 (PC1) and PC2. Moreover, from the 40 spectral bands obtained, the top 10 spectral bands were further selected based on their importance scores and classification accuracies using seven methods, including LDA, SFM + SVM, SFM + RF, SFM + XGBoost, RFE + SVM, RFE + RF, and RFE + XGBoost. The selected 10 bands were used to create various candidate VIs by band ratioing, band differencing, band subtraction, band addition, and logarithm transformation through classification and comparison [26]. Finally, a new SCN-specific VI was obtained based on the best performance of three-class classification and validated by comparison with 12 widely used VIs.

2.1. Experimental Design and Collection of Hyperspectral Data

Leaf-level hyperspectral data were collected using an ASD FieldSpec HandHeld 2 spectroradiometer (Malvern Panalytical Ltd., Malvern, UK) from a total of 20 soybean plants in a controlled greenhouse setting. The instrument has an adjustable integration time to optimize the signal-to-noise ratio and reduce saturation. Prior to data collection, the instrument was optimized using a Spectralon white reference panel to establish baseline reflectance values. The instrument captured wavelengths from the ultraviolet (UV) to the near-infrared region, ranging from 325 nm to 1075 nm at a spectral resolution of 1 nm. The resulting 751 distinct bands provided a detailed representation of the plant’s physiological and biochemical responses to SCN stress at various stages. To simulate different levels of SCN stress, varying numbers of SCN eggs were introduced into the soil at the time of planting. Four groups of soybean plants were established by inoculating the soil with different levels of SCN eggs: 0, 1000, 5000 and 10,000 eggs per plant. Each group consisted of five plants. Because the spectral reflectance values of the 1000- and 5000-egg treatments differed only slightly, these two were combined into one class, called moderate stress. Although SCN egg inoculation levels were used as class labels, it is important to note that actual stress responses may vary due to individual plant variability, making these labels approximations of true physiological stress.
Leaf spectral reflectance was measured weekly from the 68th to the 97th day after planting. The data collection period was strategically chosen to align with the R1 (beginning bloom) to R5 (beginning seed) reproductive stages of soybean. This is a critical window when the plant’s demand for water and nutrients is at its peak for pod and seed development [27]. This timeframe coincides with the point at which cumulative damage from multiple generations of SCN on the roots severely impairs nutrient absorption, creating a significant physiological stress that is detectable by hyperspectral sensors. A total of 100 sample spectral datasets, each consisting of spectral data from 751 bands, were obtained. There were 25 spectral datasets for each of the four inoculation groups. The controlled environment minimized external variability, ensuring that the observed spectral difference values were primarily attributable to the health status of the plants.

2.2. Spectral Preprocessing and Denoising

Raw hyperspectral reflectance measurements contain high-frequency noise arising from instrument electronics and minor environmental variability. This noise can obscure subtle spectral signatures associated with stress. To mitigate this issue while preserving biologically relevant detail, we applied wavelet-based denoising to each reflectance spectrum prior to conducting any further analyses. The wavelet approach was selected because it adaptively suppresses noise while effectively preserving sharp spectral features, such as red-edge shifts and ultraviolet responses. Wavelet-based denoising has been proven to be an effective technique in signal processing and remote sensing applications, due to its ability to achieve adaptive spatial smoothing without oversmoothing salient features [28,29]. It has also been successfully applied to hyperspectral image denoising and vegetation monitoring [30].
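As an illustration of the denoising step, the following is a minimal sketch of wavelet shrinkage applied to a single reflectance spectrum. The wavelet family, decomposition depth, and threshold rule used in the study are not stated here, so the Haar wavelet, three decomposition levels, soft thresholding, and the universal threshold with a median-based noise estimate are all assumptions made for this example, and the function names are hypothetical.

```python
import math
import statistics

def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert one level of the orthonormal Haar transform."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero by `threshold`."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)

def denoise_spectrum(reflectance, levels=3):
    """Denoise one spectrum by Haar wavelet shrinkage (assumed settings)."""
    n = len(reflectance)
    # Pad by repeating the last value so the length divides by 2**levels.
    pad = (-n) % (2 ** levels)
    padded = list(reflectance) + [reflectance[-1]] * pad

    approx, details = padded, []
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append(detail)  # details[0] holds the finest scale

    # Noise level estimated from the finest-scale detail coefficients.
    sigma = statistics.median(abs(d) for d in details[0]) / 0.6745
    threshold = sigma * math.sqrt(2.0 * math.log(len(padded)))

    # Rebuild from the coarsest scale back to the finest.
    for detail in reversed(details):
        shrunk = [soft_threshold(d, threshold) for d in detail]
        approx = haar_inverse(approx, shrunk)
    return approx[:n]
```

In practice a smoother wavelet would likely preserve features such as the red edge better than Haar; the sketch only shows the decompose, threshold, and reconstruct sequence.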

2.3. Characterizing the Spectral Reflectance of Healthy and Stressed Plants

The statistics of spectral reflectance values from healthy and stressed plants were calculated and analyzed across the entire range from 325 nm to 1075 nm, and the spectral regions in which plants under different stress levels could be clearly distinguished from each other were first identified. A one-way ANOVA between the healthy and two stressed groups (moderate stress and severe stress) was then conducted at a significance level of 0.05. The ANOVA yields an F-statistic, which is the ratio of the between-class variance to the within-class variance. If the classes are drawn from populations with the same mean, the between-class variance should be comparable to or smaller than the within-class variance. Therefore, greater ratios highlight the spectral bands and regions where reflectance values differ significantly across the class means.
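The F-statistic described above can be sketched for a single band as follows. This is a minimal illustration with a hypothetical function name; it computes only the ratio itself, and the corresponding p-value (which requires the F distribution) is left out.

```python
def one_way_f_statistic(groups):
    """F = between-class mean square / within-class mean square.

    `groups` is a list of lists, one list of reflectance values per class
    (for example healthy, moderate stress, severe stress) at one band.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-class sum of squares: spread of class means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-class sum of squares: spread of samples around their class mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within
```

Applying the function band by band and keeping the bands whose F-statistic is significant at the chosen level gives the kind of ANOVA filter described in this section.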

2.4. Band Reduction of Hyperspectral Data

Owing to its fine spectral resolution, the hyperspectral data provides a detailed representation of subtle spectral signatures associated with different soybean stresses and diseases. To reduce the high dimensionality of the spectral data, a PCA was conducted to extract extensive spectral information into principal components (PCs). Then, the factor loadings measuring the correlations of each PC with the original bands and implying the contributions of the original bands to each PC were utilized to identify important spectral bands that could improve classification accuracy. The PCA thus facilitated the selection of important bands to balance dimensionality reduction with the preservation of critical spectral information. Based on the largest values of factor loadings in PC1 and PC2, a total of 40 significant spectral bands were selected from the ANOVA-filtered 371 bands. A total of 25 bands with the highest absolute loadings from PC1 were selected to retain broad stress-related signatures, such as those associated with chlorophyll degradation, while 15 bands from PC2 were included to capture more subtle but ecologically important features, such as red-edge shifts (680–730 nm) indicative of changes in photosynthetic efficiency and UV-range reflectance (330–400 nm) linked to lignin accumulation. This significant reduction in dimensionality facilitated a more focused analysis by highlighting the spectral bands that were most informative for SCN detection.
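The loading-based band ranking can be sketched as follows. This is an illustrative approximation only: it recovers the first principal axis by power iteration on the covariance matrix and ranks bands by the absolute value of their eigenvector weights, which are proportional to, but not identical with, the correlation-type factor loadings described above. The function names, the starting vector, and the iteration count are assumptions.

```python
import math

def first_pc_weights(spectra, iterations=200):
    """Approximate the first principal axis of `spectra` (rows = samples)."""
    n, p = len(spectra), len(spectra[0])
    means = [sum(row[j] for row in spectra) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in spectra]
    # Sample covariance matrix between bands.
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
        for a in range(p)
    ]
    # Power iteration from an arbitrary non-uniform start vector.
    v = [float(j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance at all in the data
            break
        v = [x / norm for x in w]
    return v

def top_bands(wavelengths, weights, k):
    """Return the k wavelengths with the largest absolute weights."""
    ranked = sorted(zip(wavelengths, weights), key=lambda t: abs(t[1]), reverse=True)
    return [wavelength for wavelength, _ in ranked[:k]]
```

For the second component, the same procedure could be repeated after removing the first component from the covariance matrix; a numerical library would normally be used instead of this hand-written loop.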

2.5. Methods for Selection of Optimal Wavelengths

In this study, the selection of optimal bands (wavelengths) was conducted through a combination of statistical and machine learning (ML) methods. These methods were designed to enhance the predictive accuracy of hyperspectral data in detecting SCN. A total of seven methods were used, including LDA, SFM + SVM, SFM + RF, SFM + XGBoost, RFE + SVM, RFE + RF, and RFE + XGBoost. LDA employs Fisher’s linear discriminant method to evaluate the importance of a band by maximizing the ratio of between-class variance ($S_b$) to within-class variance ($S_w$), ensuring the optimal separation of data points into predefined categories [31]. The importance score for a band is calculated as follows:
$$J(w) = \frac{w^{T} S_b w}{w^{T} S_w w}$$
Here, $w$ represents the weighting vector for the bands. Bands that achieve a higher ratio of between-class variance ($S_b$) to within-class variance ($S_w$) are assigned greater importance scores. This method is effective for datasets where the primary goal is to maximize class separability. This approach is most effective when data meet assumptions of normality and homogeneity of variance. LDA is valued for its computational simplicity, making it an efficient feature selection tool in situations where computational resources are limited.
SFM + RF and RFE + RF apply the Gini index. It is a measure of impurity reduction in decision tree-based algorithms, which helps to calculate the importance scores of spectral bands. The Gini index is computed as follows:
$$G = 1 - \sum_{i=1}^{C} p_i^{2}$$
where $p_i$ is the proportion of samples belonging to class $i$. For each band $f_i$, the overall importance is determined as follows:
$$I(f_i) = \sum_{t=1}^{T} \Delta Gini(f_i, t)$$
where $t$ denotes the decision tree node in which band $f_i$ is used, and $\Delta Gini(f_i, t)$ represents the reduction in impurity at node $t$ due to splitting on band $f_i$. Bands that provide the greatest reduction in impurity (i.e., with higher $\Delta Gini$) are deemed more important.
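The Gini index and the impurity reduction at a single split can be sketched as follows. This is a minimal illustration with hypothetical function names; it weights each child node by its share of the samples, which is the usual convention in tree learners and is assumed here rather than taken from the study.

```python
from collections import Counter

def gini(labels):
    """Gini index G = 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_reduction(band_values, labels, threshold):
    """Impurity reduction from splitting one band at `threshold`."""
    left = [y for x, y in zip(band_values, labels) if x <= threshold]
    right = [y for x, y in zip(band_values, labels) if x > threshold]
    n = len(labels)
    # Child impurities are weighted by the fraction of samples they receive.
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children
```

Summing such reductions over every node that splits on a given band, across all trees, gives an importance score of the form shown in the second equation.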
SFM + SVM and RFE + SVM use the SVM framework to calculate band importance based on the optimization problem:
$$\text{minimize } \frac{1}{2}\lVert w \rVert^{2} \quad \text{subject to } y_i\left(w^{T} x_i + b\right) \ge 1 \text{ for all } i$$
where $w$ represents the band coefficients in the SVM model, $x_i$ is the feature vector of the $i$-th sample, and $y_i$ is its class label. Bands with larger absolute coefficients ($\lvert w \rvert$) are considered more important. This method excels in identifying bands that contribute most to separating healthy and infested plants.
SFM + XGBoost and RFE + XGBoost evaluate band importance by analyzing the gradient of the loss function. At each boosting iteration, the gradient is calculated as follows:
$$g_m = \frac{\partial L\left(y, F_{m-1}(x)\right)}{\partial F_{m-1}(x)}$$
where $L$ is the loss function, and $F_{m-1}$ represents the model’s prediction at the previous iteration. Bands that contribute to larger gradient magnitudes are assigned higher importance scores, as they effectively reduce the loss during training.
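To make the gradient expression concrete, the sketch below evaluates it for one sample under a binary logistic loss, where the derivative with respect to the previous raw prediction reduces to the predicted probability minus the label. The choice of loss is an assumption made for illustration (the three-class problem in this study would use a multi-class loss), and the function names are hypothetical. A finite-difference check is included to confirm the analytic form.

```python
import math

def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))

def log_loss(y, score):
    """Binary logistic loss for label y in {0, 1} and raw score F_{m-1}(x)."""
    p = sigmoid(score)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def log_loss_gradient(y, score):
    """Analytic gradient of the loss with respect to the raw score."""
    return sigmoid(score) - y

def numeric_gradient(y, score, eps=1e-6):
    """Central finite difference, used only to check the analytic gradient."""
    return (log_loss(y, score + eps) - log_loss(y, score - eps)) / (2.0 * eps)
```

For a positive sample with a raw score of 0, the gradient is -0.5, so the next boosting step pushes the prediction for that sample upward.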
The SFM + SVM approach applies SFM in combination with the SVM, which is particularly adept at handling high-dimensional datasets and finding the optimal hyperplane that separates different classes [32]. SVM’s robustness in managing non-linear boundaries makes it highly effective for complex datasets. By incorporating SFM, this method retains only the most important features, optimizing the predictive power of the model.
Both SFM + RF and SFM + XGBoost evaluate feature importance and prioritize the most critical wavelengths for SCN detection. SFM + RF ranks features by their contribution to impurity reduction in decision trees, which is particularly effective for high-dimensional datasets with potential overfitting [33]. SFM + XGBoost, on the other hand, builds an additive model, progressively focusing on minimizing an arbitrary differentiable loss function, which makes it adept at handling datasets with complex feature interactions [34]. Both methods are highly efficient at selecting meaningful features, although SFM + XGBoost can be sensitive to noisy data, which may either enhance or hinder performance depending on the dataset.
RFE + RF, RFE + XGBoost, and RFE + SVM employ an iterative process that continuously eliminates the least significant features, allowing the models to concentrate on the most essential ones [35]. This recursive elimination approach ensures that the models refine their predictions by focusing on indispensable features, especially in datasets with significant feature redundancy. RFE + RF and RFE + XGBoost are particularly effective at reducing overfitting by narrowing down the feature set to only the most crucial predictors, while RFE + SVM adds the ability to handle non-linear feature interactions, making it especially useful for complex datasets.
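The recursive elimination loop can be sketched as follows. The callable `score_bands` is a hypothetical stand-in for refitting a model (SVM, RF, or XGBoost) on the remaining bands and returning one importance score per band; it is not part of any particular library, and dropping one band per round is an assumption.

```python
def recursive_feature_elimination(bands, score_bands, n_keep):
    """Drop the least important band each round until n_keep remain.

    `score_bands(remaining)` must return a dict mapping each remaining band
    to its importance, recomputed on the current subset (hypothetical hook).
    """
    remaining = list(bands)
    while len(remaining) > n_keep:
        scores = score_bands(remaining)  # refit and re-score on the subset
        weakest = min(remaining, key=lambda band: scores[band])
        remaining.remove(weakest)
    return remaining
```

Because the scores are recomputed after every removal, a band that looked redundant next to a correlated neighbor can gain importance once that neighbor has been eliminated, which is the main difference from a one-shot ranking such as SFM.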
Overall, the selection of these methods was guided by the specific characteristics of the hyperspectral data and the analytical objectives. LDA is most suitable for datasets where class separation is linear, while methods such as SFM + XGBoost and RFE + SVM are better suited to manage non-linear relationships and intricate feature interactions within the data. By applying these feature selection techniques, the optimal spectral bands for detecting SCN stress were chosen with an emphasis on accuracy, efficiency, and robustness. These methods significantly enhance the accurate identification and classification of plant health conditions, contributing to advancements in precision agriculture and hyperspectral data analysis.

2.6. Traditional Vegetation Indices Derived from Hyperspectral Data

The selection of VIs in Table 1 for this research was driven by the need to effectively analyze various aspects of vegetation health and stress related to SCN infestation using hyperspectral data. Each chosen VI provides unique insights into plant health and stress, utilizing different spectral bands to capture specific physiological and biochemical properties of vegetation. EVI [36] was chosen because it improves the sensitivity of NDVI [37] to high-biomass regions and reduces the influence of atmospheric distortion and bare soil. By including the blue band to correct for soil and atmospheric scattering effects, EVI offers more reliable and detailed vegetation data, which is crucial for accurately identifying areas affected by SCN. MSAVI2 [38] is specifically designed to minimize the influence of soil brightness, which is particularly useful in areas with sparse vegetation cover, a common case in fields affected by SCN. This index helps in accurately determining vegetation cover in such fields, enhancing the detection of stressed vegetation without the confounding effects of the underlying soil.
NDREI [39] utilizes the red-edge spectral region, which is sensitive to changes in chlorophyll content and serves as an indicator of plant stress and health. Since SCN stress affects plant vitality by hindering nutrient uptake, NDREI is invaluable for early detection of these physiological changes before they become apparent in the visible spectrum. TVI [40] was chosen due to its effectiveness in enhancing vegetation signals even in highly saturated areas. It uses a combination of green and red bands to assess plant vigor and health, making it suitable for monitoring changes in vegetation health over time, including the subtle effects of SCN stress. SATVI [41] incorporates adjustments for soil brightness, making it highly effective in areas with mixed vegetation and soil backgrounds. This capability is crucial for accurately assessing vegetation health in fields with uneven SCN damage where exposed soil might otherwise skew traditional indices.
MCARI [42] is tailored to highlight changes in the chlorophyll content of leaves, which directly correlates with plant health and productivity. Since SCN affects plant growth by attacking the root system, monitoring chlorophyll content with MCARI provides insights into the overall health and metabolic state of the plant. CCCI [43] is adept at estimating canopy chlorophyll content, which can indicate the level of stress or disease in a plant. For SCN monitoring, CCCI helps in distinguishing between healthy and stressed plants based on how the disease affects chlorophyll levels, offering a reliable metric for assessing the extent and impact of stress.
Each of these indices was selected not only for its individual capabilities but also for how their combined use can provide a comprehensive overview of plant health across different stages of growth and varying degrees of SCN stress. These strategic choices allow for a nuanced analysis of stress impacts, facilitating targeted agricultural interventions and improving management of SCN in soybean crops.
Table 1. The commonly used VIs, defined based on corresponding hyperspectral bands.
| Index | Name | Formula | Reference |
|---|---|---|---|
| WBI | Water Band Index | $\rho_{970} / \rho_{900}$ | [44] |
| NRI | Nitrogen Reflectance Index | $(\rho_{570} - \rho_{670}) / (\rho_{570} + \rho_{670})$ | [45] |
| EVI | Enhanced Vegetation Index | $2.5 \times \frac{NIR - Red}{NIR + 6 \times Red - 7.5 \times Blue + 1}$ | [36] |
| MSAVI2 | Modified Soil-Adjusted Vegetation Index 2 | $\frac{2 \times NIR + 1 - \sqrt{(2 \times NIR + 1)^{2} - 8 \times (NIR - Red)}}{2}$ | [38] |
| NDREI | Normalized Difference Red Edge Index | $\frac{NIR - RedEdge}{NIR + RedEdge}$ | [39] |
| TVI | Triangular Vegetation Index | $0.5 \times [120 \times (NIR - Green) - 200 \times (Red - Green)]$ | [40] |
| SATVI | Soil Adjusted Total Vegetation Index | $\frac{NIR - Red}{NIR + Red + 0.5} \times 1.5 - \frac{Red}{0.5}$ | [41] |
| MCARI | Modified Chlorophyll Absorption in Reflectance Index | $[(RedEdge - Red) - 0.2 \times (RedEdge - Green)] \times \frac{RedEdge}{Red}$ | [42] |
| CCCI | Canopy Chlorophyll Content Index | $\frac{NIR - RedEdge}{NIR + RedEdge} / \frac{NIR - Red}{NIR + Red}$ | [43] |
| NDVI | Normalized Difference Vegetation Index | $(NIR - Red) / (NIR + Red)$ | [37] |
| GNDVI | Green Normalized Difference Vegetation Index | $(NIR - Green) / (NIR + Green)$ | [46] |
| SAVI | Soil-Adjusted Vegetation Index | $\frac{(1 + 0.5)(NIR - Red)}{NIR + Red + 0.5}$ | [47] |
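Several of the indices in Table 1 can be computed directly from band reflectance values, as in the minimal sketch below. The table does not state which wavelengths stand for the NIR, Red, Green, and Blue bands, so the functions take the reflectance values as arguments and leave that choice to the caller; the function names are hypothetical.

```python
import math

def ndvi(nir, red):
    return (nir - red) / (nir + red)

def gndvi(nir, green):
    return (nir - green) / (nir + green)

def savi(nir, red, soil_factor=0.5):
    # soil_factor is the 0.5 adjustment constant used in Table 1.
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)

def evi(nir, red, blue):
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

def msavi2(nir, red):
    root = math.sqrt((2.0 * nir + 1.0) ** 2 - 8.0 * (nir - red))
    return (2.0 * nir + 1.0 - root) / 2.0

def wbi(r970, r900):
    return r970 / r900
```

For example, with illustrative reflectance values of 0.5 in the NIR and 0.1 in the red, `ndvi(0.5, 0.1)` is about 0.667 and `savi(0.5, 0.1)` is about 0.545.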

2.7. A New Vegetation Index Derived from Hyperspectral Data

In addition to the commonly used VIs in Table 1, more than 200 new candidate VIs were created from the top 10 bands selected using the seven methods and then compared for the detection of SCN stress. The new VIs were created by band differencing, band ratioing, band addition, band multiplication, natural logarithm transformation, and their combinations [48]. The accuracy comparison of three-class classification (healthy, moderate stress, and severe stress) showed that the following VI achieved the best accuracy:
$$SCNVI = \ln\left(1 + R_{338\,nm} \times R_{665\,nm}\right)$$
where $R_{338\,nm}$ and $R_{665\,nm}$ are the reflectance values at wavelengths 338 nm and 665 nm, respectively. The proposed SCNVI was designed to mathematically amplify its response to the plant stress caused by SCN infestation. The multiplication of $R_{338\,nm}$ and $R_{665\,nm}$ potentially enhanced the sensitivity of the VI to subtle early-stage stress (Figure 1). The natural logarithm yields a steep slope over the short range of $Z$ values (where $Z$ is the product of $R_{338\,nm}$ and $R_{665\,nm}$), reducing the impact of sensor noise while emphasizing relative stress severity over absolute reflectance values.
Unlike NDVI, which relies on red and near-infrared (NIR) bands and becomes saturated under the condition of high-density canopy structures and biomass [37], the SCNVI combines one UV band and one red band. The red band captures the characteristics of plant leaves and the ability to photosynthesize, while the addition of the UV band helps to mitigate saturation effects in healthy plants and increases sensitivity to reflectance changes in stressed plants. The UV band at 338 nm detects structural and biochemical stress caused by SCN infestation, such as lignin accumulation and phenolic compound synthesis. These changes are plant defense mechanisms triggered by UV exposure and pathogen attacks [49,50]. As cell walls degrade or secondary metabolites accumulate, UV reflectance increases, providing early warning of stress before visible symptoms appear [50]. Moreover, the red band at 665 nm is located at the chlorophyll-a absorption peak, which reflects photosynthetic health. Healthy plants exhibit low reflectance in the red band, while SCN infestation disrupts chlorophyll synthesis, leading to degradation and increasing reflectance, which is the direct marker of photosynthetic impairment [49]. The red band is negatively correlated with chlorophyll content, making it a sensitive indicator of photosynthetic decline [49]. Combining the UV band and red band, SCNVI captures holistic plant health changes [51]. Thus, the proposed SCNVI provides the potential to improve the three-class classification. In addition, in the logarithmic transformation, adding a value of 1 ensures positive input values, avoiding undefined results when reflectance values approach 0.
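The SCNVI formula can be sketched in a few lines. The reflectance scale (fractions between 0 and 1 versus percentages) is not stated here, so the example assumes fractional reflectance, and the sample values in the usage note are hypothetical rather than measured.

```python
import math

def scnvi(r338, r665):
    """SCNVI = ln(1 + R_338nm * R_665nm) for non-negative reflectance values."""
    # log1p(z) computes ln(1 + z) accurately when the product z is small.
    return math.log1p(r338 * r665)
```

With hypothetical values, `scnvi(0.10, 0.05)` is smaller than `scnvi(0.20, 0.10)`, matching the expectation that higher UV and red reflectance in stressed plants raises the index, and `scnvi(0.0, 0.0)` returns 0 rather than an undefined value.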
Figure 1 presents the distribution of SCNVI values across three stress levels: healthy, moderate stress, and severe stress. The x-axis represents the plant stress categories, while the y-axis represents SCNVI values. Overall, healthy samples are, for the most part, tightly clustered at low SCNVI values, indicating no structural damage or chlorophyll degradation. In contrast, severely stressed samples generally show the highest SCNVI values, reflecting significant stress from UV-induced structural changes and photosynthetic decline. Moderately stressed samples are located in the middle of the range, exhibiting minor stress and subtle changes. However, the separation among the three classes is not perfect because the spectral reflectance values of some samples overlap, which introduces uncertainty and limits the classification ability of the SCNVI.
Moreover, the SCNVI is strongly correlated with the widely used VIs, including the soil adjusted total vegetation index (SATVI) (r = 0.89), NDVI (r = 0.88), MSAVI2 (r = 0.84), and SAVI (r = 0.83), implying that it is well suited for assessing vegetation vigor, biomass, and overall structural health (Figure 2). Additionally, its moderate correlations with nitrogen-sensitive VIs such as NRI (r = 0.65) and GNDVI (r = 0.61) indicate that it can provide insights into nitrogen absorption and photosynthetic efficiency. However, its lower correlations with chlorophyll-related (MCARI, r = 0.19) and water-sensitive (WBI, r = 0.29) VIs show that SCNVI is less focused on chlorophyll variation and water status but captures broader stress response in soybean plants. This unique profile is attributable to the inclusion of the 338 nm UV band, which detects the accumulation of biochemical defense compounds such as lignin and phenolic compounds. By integrating this early biochemical stress signal with the photosynthetic decline captured by the red band, the SCNVI provides a more comprehensive evaluation of SCN infestation. This justifies its novelty and superior performance compared to indices that target only one aspect of plant health. Therefore, SCNVI is effective in predicting plant stress and canopy structure, especially stress related to SCN infestation.

2.8. Accuracy Assessment

In this study, both the distinction between healthy and stressed soybean plants and the classification of three categories were achieved using hyperspectral data and seven classifiers. The hyperspectral data were collected five times from a total of 20 plants, leading to a total of 100 datasets for three classes: healthy (25 datasets from 5 plants), moderate stress (50 datasets from 10 plants), and severe stress (25 datasets from 5 plants). The dataset was split, with 70% used for training and 30% used for testing. Prior to model training, reflectance values were preprocessed using mean imputation for missing values (SimpleImputer, strategy = mean) and standardized via z-score normalization using StandardScaler to ensure consistent feature scaling. A confusion matrix was employed to assess the accuracy of classification and to calculate the producer’s, user’s, and overall accuracies as measures of model performance. Moreover, the Kappa statistic, weighted precision, weighted recall, weighted F1, and the Matthews correlation coefficient (MCC) were calculated to further evaluate model performance.
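The accuracy measures named above can be derived from a confusion matrix as in the following minimal sketch. The function names are hypothetical, rows are taken to be reference classes and columns predicted classes, and the MCC uses the standard multi-class generalization; weighted precision, recall, and F1 are omitted for brevity.

```python
import math

def confusion_matrix(y_true, y_pred, classes):
    """Rows = reference classes, columns = predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix

def accuracy_report(matrix):
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]  # reference totals
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # predicted totals
    correct = sum(matrix[i][i] for i in range(k))

    overall = correct / n
    # Producer's accuracy: correct / reference total (per-class recall).
    producers = [matrix[i][i] / row_tot[i] if row_tot[i] else 0.0 for i in range(k)]
    # User's accuracy: correct / predicted total (per-class precision).
    users = [matrix[i][i] / col_tot[i] if col_tot[i] else 0.0 for i in range(k)]

    cross = sum(r * c for r, c in zip(row_tot, col_tot))
    chance = cross / n ** 2
    kappa = (overall - chance) / (1.0 - chance) if chance != 1.0 else 0.0

    mcc_den = math.sqrt(
        (n ** 2 - sum(c * c for c in col_tot)) * (n ** 2 - sum(r * r for r in row_tot))
    )
    mcc = (correct * n - cross) / mcc_den if mcc_den else 0.0
    return {"overall": overall, "producers": producers, "users": users,
            "kappa": kappa, "mcc": mcc}
```

With only 30 test samples in a 70/30 split of 100 datasets, each misclassified sample shifts the overall accuracy by about 3.3 percentage points, which is worth keeping in mind when comparing classifiers.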

3. Results

3.1. Characteristics of Healthy and Stressed Plants in Hyperspectral Bands

Figure 3 shows several crucial insights into the spectral behavior of soybean plants across different levels of SCN stress. There was an overall consistent trend: in a given spectral region, the spectral reflectance values increased as the severity of SCN stress increased, with the size of the difference depending on the spectral region. Across the entire spectral range from 325 nm to 1075 nm, there was a clear distinction between the plants with zero eggs inoculated and the egg-inoculated plants, with higher SCN egg inoculation levels corresponding to higher reflectance. Plants inoculated with no eggs consistently showed the lowest reflectance, while those inoculated with 10,000 eggs exhibited the highest reflectance. This trend was particularly pronounced in the ultraviolet (UV) and visible regions, where the difference in spectral reflectance between the non-inoculated and egg-inoculated plants was most striking, suggesting significant physiological changes in the plant as the SCN egg inoculation level increased. The dramatic increase in reflectance occurred in the UV and visible portions of the spectrum, indicating stress responses and chlorophyll degradation [52,53].
In this study, stress levels were defined based on the initial SCN egg inoculation levels. Plants inoculated with no eggs were assumed to be healthy, while those inoculated with 1000 and 5000 eggs showed very similar spectral signatures and were grouped into a single category representing moderate stress. Plants inoculated with 10,000 eggs showed the most pronounced spectral changes and were classified as severely stressed. This three-level classification of healthy, moderate stress, and severe stress was used for all subsequent analyses.
The UV region of 335 nm to 400 nm was highly sensitive to early plant stress caused by SCN (Figure 3b), where reflectance values increased substantially as the stress severity rose, with the severely stressed plants having the greatest reflectance values in the UV region. In contrast, healthy plants absorbed UV light due to the presence of protective pigments like flavonoids, which shielded the plant from UV radiation. As the stress became more intense, the pigments were degraded, leading to higher reflectance values. This pattern highlighted the UV spectral region as an important spectral interval for detecting early stress before the visible symptoms of stress manifest [54]. The difference in reflectance between healthy and stressed plants was most obvious between 330 and 360 nm, where the increased reflectance in stressed plants signaled the breakdown of UV-absorbing pigments, making the UV region a valuable indicator of early SCN stress [53].
In the visible region (Figure 3c), the differences in spectral reflectance between healthy and stressed plants were also pronounced. The healthy plants showed lower reflectance in the red region (600–700 nm) because they absorbed more red light for photosynthesis. As SCN stress increased, the plant’s chlorophyll levels declined, resulting in reduced photosynthetic efficiency and higher reflectance in the red region, especially around 680 nm. This trend was most obvious in severely stressed plants, where the degradation of chlorophyll led to a sharp rise in reflectance. In the green region (500–600 nm), the stressed plants also reflected more light, indicating stress-related reductions in chlorophyll content. The visible region can therefore serve as an important diagnostic tool for identifying the early stages of chlorophyll degradation and photosynthetic impairment due to SCN stress [52,53]. Moreover, in both the UV and visible regions, the spectral reflectance of the plants with mild and moderate stress (the 1000- and 5000-egg treatments, which were merged into the single moderate class) was clearly greater than that of the healthy plants and much smaller than that of the severely stressed plants. However, the difference in spectral reflectance between the mildly and moderately stressed plants was small across the UV and visible regions.
The near-infrared (NIR) region is closely related to the structural integrity of plant tissues and their water content. Within the interval of 700 to 900 nm, there was no consistent trend in the spectral reflectance difference between the healthy and stressed plants (Figure 3d). Up to 740 nm, the healthy plants had lower reflectance than all the stressed plants. From 740 nm to 900 nm, their reflectance was greater than that of the mildly stressed plants but still smaller than that of the moderately and severely stressed plants. Beyond 900 nm, the healthy plants exhibited slightly higher reflectance due to their intact cellular structures and ample water content. The SCN stress led to degradation of plant cell walls and a decrease in water content, resulting in lower reflectance after 900 nm. The differences in NIR reflectance were less pronounced than those in the UV and visible regions, but they still provided valuable information about the extent of cellular and structural damage caused by SCN stress [52].

3.2. ANOVA Analysis

In this study, an analysis of variance (ANOVA) was used to evaluate whether mean spectral reflectance differed significantly among the three stress categories: healthy, moderate, and severe. The ANOVA identified the spectral bands in which the between-group variance was greater than the within-group variance, indicating that these bands were sensitive to the physiological changes caused by SCN stress. Bands with a p-value of <0.05 and an F-value of >3.09 were deemed significantly different among the groups (Figure 4), marking the spectral regions in which the mean reflectance values of the three classes differed.
The p-values between 336 nm and 705 nm remained consistently smaller than 0.05, particularly in the visible spectrum (460–700 nm), where p-values were smaller than 0.02 and the F-values were greater than 4.5. This provided strong evidence that reflectance values in these bands differed significantly among the categories. Overall, both the UV and visible regions showed significant separation between classes, underscoring their importance in differentiating the stressed plants from the healthy ones. The UV and visible regions are known for their sensitivity to early plant stress, making them critical ranges for early SCN detection. As SCN stress increased, the stressed plants reflected more UV and visible light due to the degradation of UV-absorbing pigments, such as flavonoids, that protect the plant under normal conditions and ameliorate the reduction of chlorophyll content and structural damage [55,56]. This finding aligns with the conclusions from previous studies [55,56,57,58].
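The band-wise test used in this section can be sketched as a one-way ANOVA F statistic computed separately for each band. The example below is a minimal illustration under stated assumptions: the band names and reflectance values are hypothetical, and the toy groups contain only three samples each, so the 3.09 cutoff quoted in the text (which appears consistent with the 5% critical value for 2 and 97 degrees of freedom, i.e., three classes and 100 datasets) is applied here only to show the filtering step.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups of reflectance values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: values around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

F_THRESHOLD = 3.09  # cutoff quoted in the text; illustrative for the toy data

# Hypothetical reflectance (%) per band for healthy, moderate, severe groups.
bands = {
    "band_a": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]],
    "band_b": [[4.0, 5.0, 6.0], [4.0, 5.0, 6.0], [4.0, 5.0, 6.0]],
}
f_values = {name: one_way_f(groups) for name, groups in bands.items()}
sensitive = [name for name, f in f_values.items() if f > F_THRESHOLD]
```

In this toy case only `band_a`, whose group means differ, passes the filter; `band_b` has identical group means and an F value of zero.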

3.3. Principal Component Analysis

The results of the PCA showed that PC1 explained 86.78% of the total variance from the original hyperspectral dataset, while PC2 accounted for an additional 10.72%; thus, PC1 and PC2 captured 97.50% of the total variance from the original bands. Based on the factor loadings of PCA (the coefficients of correlation between the components PC1 and PC2 and the original bands), a total of 40 bands were selected and are shown in Figure 5.
In the context of hyperspectral remote sensing, PC1 typically captures the overall brightness or average reflectance across the spectrum. This dominant variation is often driven by environmental factors such as illumination intensity, canopy structure, or overall pigment concentration, which affect reflectance over broad wavelength ranges. Consequently, many bands in PC1 have high positive loadings, reflecting their strong co-variation with this dominant brightness trend. In our results, the correlation coefficients of the bands with PC1 were positive and quickly increased from 0.62 to 0.80 for bands 325 nm to 350 nm, continued to increase to almost 0.99 at band 520 nm, and then slightly fluctuated between 0.99 and 1.0 for bands 520 nm to 686 nm before decreasing thereafter. In contrast, PC2 captured more subtle and orthogonal patterns not captured by PC1. It reflected localized physiological variations rather than overall brightness and was sensitive to physiological responses such as red-edge shifts or UV reflectance changes associated with stress. The correlation coefficients of the bands with PC2 were negative before 520 nm, with their absolute values continuously decreasing from 0.53 to 0; they then became positive and quickly increased from 0 to 0.45 at band 573 nm, then decreased to −0.1 at 676 nm and subsequently increased to 0.65. These non-linear trends in PC2 loadings emphasize its role in capturing localized physiological variation not aligned with global reflectance intensity. This result confirmed that while PC1 dominates the overall variance structure, PC2 contributes biologically meaningful variation, reinforcing the rationale for selecting bands from both components to support more robust SCN detection. Specifically, PC2 is sensitive to variations in the red-edge region (680–730 nm), which directly indicate changes in chlorophyll content and photosynthetic efficiency.
Since SCN infestation disrupts nutrient uptake and degrades chlorophyll, the resulting decline in photosynthetic health is captured by shifts in this spectral region [59]. Furthermore, PC2 is also correlated with changes in UV-range reflectance (330–400 nm). Increased UV reflectance is often associated with plant defense mechanisms, such as the accumulation of lignin and phenolic compounds triggered by pathogen attacks [12].
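The notion of factor loadings as band-to-component correlations can be made concrete with a small sketch. It assumes PCA on standardized bands (i.e., on the correlation matrix) and uses power iteration to obtain only the first component; the three bands and their values are hypothetical, and this is not a reproduction of the analysis behind Figure 5.

```python
import math

def zscore(col):
    """Standardize a list of values with the population standard deviation."""
    m = sum(col) / len(col)
    sd = math.sqrt(sum((v - m) ** 2 for v in col) / len(col))
    return [(v - m) / sd for v in col]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    za, zb = zscore(a), zscore(b)
    return sum(x * y for x, y in zip(za, zb)) / len(a)

def first_pc(corr, iters=1000):
    """Leading eigenvalue and eigenvector of a correlation matrix (power iteration)."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigval, v

# Hypothetical reflectance at three bands (one list per band, five samples each).
bands = [
    [0.10, 0.12, 0.15, 0.20, 0.26],
    [0.20, 0.23, 0.31, 0.38, 0.52],
    [0.50, 0.47, 0.52, 0.46, 0.49],
]
z = [zscore(b) for b in bands]
corr = [[pearson(a, b) for b in bands] for a in bands]

eigval, vec = first_pc(corr)
explained = eigval / len(bands)                  # share of total variance on PC1
loadings = [c * math.sqrt(eigval) for c in vec]  # band-to-PC1 correlations
scores = [sum(vec[j] * z[j][i] for j in range(len(bands))) for i in range(5)]
```

Each entry of `loadings` equals the Pearson correlation between the PC1 scores and the corresponding band, which is the sense in which loadings are read as correlation coefficients in the text.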

3.4. Feature Selection by Seven Methods

The top 10 bands selected and ranked by each of the seven methods are shown in Figure A1 (Appendix A) and summarized in Table 2. The importance values varied among the methods but indicated the contribution of each band to SCN detection. The selected bands mainly fell in the visible region from 511 nm to 672 nm, with several falling in the UV and red-edge regions, including 337 nm, 338 nm, 346 nm, 699 nm, 702 nm, and 705 nm. However, the ranking of the selected bands was not completely consistent across the methods.

3.5. Comparison of Classification Accuracy Based on Top 10 Bands

In Figure A2 (Appendix A), the binary and three-class classification accuracies of the selected top 10 bands using seven methods were compared based on the validation dataset. The binary classification consisted of healthy and infested plants, while the three-class classification included healthy, moderate infestation, and severe infestation. The binary classification accuracy ranged from 65% to 70%, while three-class accuracy mostly varied from 45% to 55% for all the methods and bands, reflecting the decrease in classification accuracy from two-class to three-class classification and the difficulty of distinguishing multiple infestation levels due to subtle changes and overlapping of spectral reflectance. Three exceptions were SFM + RF with band 346 nm and RFE + XGBoost with bands 512 nm and 518 nm, leading to an accuracy of about 60% for three-class classification.

3.6. Performance and Accuracy Comparison of Proposed SCNVI

An ANOVA was conducted to evaluate whether there were significant differences in SCNVI values across the three plant stress levels: healthy (0 eggs), moderate stress (1000–5000 eggs), and severe stress (10,000 eggs). The null hypothesis assumed that the mean SCNVI values were equal among the three stress levels, while the alternative hypothesis proposed that at least one group differed significantly. The results revealed that SCNVI values differed significantly among the groups (F = 5.36, p = 0.006). However, pairwise comparisons showed no significant difference between healthy plants and those under moderate stress (p = 0.52), indicating substantial overlap in values. In contrast, plants under severe stress exhibited significantly greater SCNVI values than both the healthy and moderate stress groups (p = 0.002).
In this study, the proposed SCNVI was compared with 12 widely used VIs for detecting the three plant stress levels, using the seven classification methods mentioned above and a validation dataset. The most accurate classification results were obtained using XGBoost (Figure 6). The XGBoost classifier was configured with standard parameters (n_estimators = 100, max_depth = 6, learning_rate = 0.3, subsample = 1, colsample_bytree = 1, reg_lambda = 1) to evaluate the predictive utility of SCNVI. The classification accuracy on the training data varied only slightly, from 95% to 97%. However, all the traditional VIs led to an accuracy lower than 45% on the testing data, whereas the proposed SCNVI performed markedly better, achieving a test accuracy of 70%. Compared with the traditional VIs, the proposed SCNVI increased the accuracy by 67%. Moreover, when the 20 spectral datasets collected on the 68th day after planting (the earliest collection date) were excluded from the classification, that is, when SCNVI_80 inputs were used, the test accuracy dropped from 70% to 41.7%. This result emphasized the critical importance of early data collection, specifically at 68 days after planting. During this stage, the unique spectral signatures associated with plant stress were more pronounced, making it a pivotal time for data collection. The omitted spectral data captured essential information about early stress.
The effects of adding more VIs on three-class classification were examined using the XGBoost algorithm based on both training and testing datasets (Figure 7). The proposed SCNVI was first used alone for the classification of healthy, moderately stressed, and severely stressed plants. The other VIs were then added one by one in descending order of the individual test accuracy obtained previously. The classification accuracy on the training data increased when the second VI was added and stabilized thereafter. For the validation data, using the proposed SCNVI alone led to the greatest classification accuracy (70%), and adding the second VI reduced the accuracy to 63.3%. The addition of more VIs did not improve the classification, with the accuracy fluctuating between 50% and 60%, which implied that adding more VIs introduced noise or redundant information.
The confusion matrix in Table 3 showed that the proposed SCNVI achieved an overall classification accuracy of 70% with a Kappa statistic of 0.491, indicating moderate agreement between the predicted and actual stress levels. This VI performed well in identifying both moderately and severely stressed soybean plants, as evidenced by a producer’s accuracy of 86.7% and a user’s accuracy of 80.0%, respectively. However, the healthy class was classified relatively poorly, with a producer’s accuracy of 44.4%, mainly because five of the nine healthy test samples were incorrectly classified as moderately stressed, which shows the difficulty of using the SCNVI to separate healthy plants from moderately stressed plants.
We compared the best-performing VIs selected individually for each of the six classification methods and found that none outperformed the SCNVI in terms of overall accuracy and Kappa statistic. Consequently, we focused our evaluation on the proposed SCNVI due to its biological interpretability and consistent performance across models. Tables A1–A6 (Appendix A) show confusion matrices from six other methods using SCNVI as input, further indicating its robustness across different algorithm types.
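The accuracy measures quoted above follow directly from the confusion matrix counts. The sketch below recomputes them from the counts listed in Table A6 (Appendix A), which give the same overall accuracy and Kappa value as those quoted here; the function name and the row/column convention (rows as reference classes, columns as predictions) are choices made for this illustration.

```python
import math

def confusion_metrics(cm):
    """cm[i][j] = number of samples with reference class i predicted as class j."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(row) for row in cm]                             # total reference (TR)
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # total prediction (TP)
    correct = sum(cm[i][i] for i in range(k))

    overall = correct / n
    producers = [cm[i][i] / row_tot[i] if row_tot[i] else 0.0 for i in range(k)]
    users = [cm[i][i] / col_tot[i] if col_tot[i] else 0.0 for i in range(k)]

    # Cohen's kappa: agreement beyond what the marginal totals give by chance.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2
    kappa = (overall - expected) / (1 - expected)

    # Multi-class Matthews correlation coefficient.
    num = correct * n - sum(r * c for r, c in zip(row_tot, col_tot))
    den = math.sqrt((n ** 2 - sum(c * c for c in col_tot)) *
                    (n ** 2 - sum(r * r for r in row_tot)))
    mcc = num / den if den else 0.0
    return overall, producers, users, kappa, mcc

# Counts as listed in Table A6 (rows: healthy, moderate stress, severe stress).
cm = [[4, 5, 0],
      [1, 13, 1],
      [1, 1, 4]]
oa, pa, ua, kappa, mcc = confusion_metrics(cm)
```

Producer’s accuracy divides each diagonal count by its row total (how many reference samples of a class were found), while user’s accuracy divides by the column total (how many predictions of a class were right).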

4. Discussion

4.1. New Hyperspectral Data-Derived VI for Detection of SCN Infestation

Soybean is one of the most important crops in both the USA and the world due to its contributions to the supply of animal protein feed and vegetable oil [1]. However, SCN infestation often leads to a great reduction in soybean yield and economic loss [2]. Although substantial research has been conducted, there is still a lack of an effective method for early detection of SCN [1,8,9,10,25]. In this study, a new spectral vegetation index, SCNVI, was proposed based on the 338 nm UV band and the 665 nm red band from hyperspectral non-imaging data. The results suggested that, compared with the traditional and widely used VIs, the proposed SCNVI, coupled with the SFM + XGBoost algorithm, led to the highest classification accuracy for three classes (healthy, moderately stressed, and severely stressed plants).
Biophysically, the UV 338 nm band is associated with the plant’s active biochemical defense mechanisms. Pathogen attacks trigger the synthesis of secondary metabolites like phenolic compounds and the reinforcement of cell walls with lignin [52,53], which increases reflectance in the UV spectrum [12]. Moreover, the visible band at 665 nm is a primary chlorophyll-a absorption peak. SCN-induced disruption of nutrient uptake leads to chlorophyll degradation, which increases reflectance at this wavelength and directly indicates a decline in photosynthetic efficiency [54]. Therefore, the increase in spectral reflectance from the stressed plants in both the UV band and the red band made it possible to distinguish the stressed plants from the healthy plants. While neither of these responses is unique to SCN, their simultaneous and pronounced occurrence gives the SCNVI its specificity.
The newly proposed SCNVI integrates UV and red reflectance to amplify co-occurring stress signals. The multiplicative term enhances sensitivity to simultaneous structural and photosynthetic stress. Moreover, the logarithmic transformation of the product of the reflectance values from these two bands results in a function that is characterized by a steep slope, non-linearly enhancing the sensitivity of the proposed SCNVI to SCN stress and thus the ability to use it for early detection of SCN infestation. Most traditional VIs (e.g., NDVI, EVI) use the bands in the visible and near-infrared regions and focus only on the monitoring of changes in chlorophyll or biomass. When traditional VIs are utilized to assess vegetation health, they are not sensitive enough to capture the subtle changes in plant structural and photosynthetic stress. The SCNVI addresses this limitation by incorporating UV reflectance, which is a novel feature in VI design, to detect biochemical stress.
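The non-linear sensitivity of a log-of-product index can be illustrated numerically. The sketch below is not the published SCNVI formula: it assumes, purely for illustration, a natural logarithm of the plain product of reflectance at 338 nm and 665 nm, with hypothetical reflectance values on a 0 to 1 scale. The function name `log_product_index` is likewise invented here, and the scaling, offsets, and log base of the actual index may differ.

```python
import math

def log_product_index(r_uv, r_red):
    """Illustrative index: natural log of the product of two band reflectances.

    Assumption: this stands in for an index built from the 338 nm and 665 nm
    bands; it is not claimed to be the exact SCNVI definition.
    """
    if r_uv <= 0 or r_red <= 0:
        raise ValueError("reflectance must be positive")
    return math.log(r_uv * r_red)

# Hypothetical reflectance pairs (338 nm, 665 nm); not measured values.
healthy = log_product_index(0.02, 0.04)
stressed = log_product_index(0.03, 0.05)   # +0.01 in each band
bright = log_product_index(0.32, 0.34)
brighter = log_product_index(0.33, 0.35)   # same +0.01 step at high reflectance

low_step = stressed - healthy    # index change at low baseline reflectance
high_step = brighter - bright    # index change at high baseline reflectance
```

Because the slope of the logarithm is largest for small arguments, the same absolute rise in reflectance moves such an index much more when the baseline reflectance is low, which is the sense in which the text describes a steep, non-linear response to stress.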

4.2. Assessment and Comparison of SCNVI with Other Studies

Although substantial research related to detection of SCN infestation using remote sensing data has been conducted, most of the existing studies focus on finding spectral variables that could explain the variation of SCN eggs [16,17,55,56,57,58]. Jjagwe et al. combined aerial multispectral imagery with ML models and achieved a strong correlation (r = 0.75) between SCN impact and the normalized difference red edge index (NDRE) [55]. Their study demonstrated the effectiveness of vegetation indices in improving SCN detection at the field scale. A more direct comparison can be drawn with studies that employed non-imaging spectroradiometers combined with machine learning to classify plant diseases at the leaf scale. Recent studies have shown high detection accuracy for various plant diseases. For instance, Furlanetto et al. used a Fieldspec spectrometer and linear discriminant analysis (LDA) to differentiate multiple severity levels of Asian soybean rust, achieving a validation accuracy of 82.5% [56]. Conrad et al. used the SVM model on NIR spectra to distinguish between control, mock-inoculated, and inoculated rice plants one day after inoculation, achieving a three-class accuracy of 73.3% [57]. Similarly, Mishra et al. developed spectral disease indices (SDIs) for southern corn rust (SCR) detection and severity classification using leaf reflectance spectra and achieved accuracies of 87% and 70% for SCR detection and severity classification, respectively [58].
In this study, the testing dataset showed that the proposed hyperspectral band-based SCNVI achieved an overall classification accuracy of 70% with a Kappa value of 0.491, a producer’s accuracy from 44.4% to 86.7%, and a user’s accuracy from 66.7% to 80.0% for the classification of three classes (healthy, moderately stressed, and severely stressed plants). The findings are comparable with those from previous studies [56,57,58]. Moreover, it is important to note that SCN is a below-ground root pathogen with more subtle and systemic effects, which can be inherently more challenging to detect from leaf-level spectra. However, five of the nine healthy test samples were incorrectly classified as moderately stressed, which revealed the challenge of distinguishing healthy plants from moderately stressed plants and implied that the proposed SCNVI has difficulty capturing subtle changes because of overlapping spectral reflectance values. Compared with previous studies by Bajwa et al. [15], which reported a high classification accuracy of 97% for healthy plants and 58% for stressed plants, and by Krishna and Prema [59], which achieved an accuracy of 80% to 82%, the overall accuracy obtained in this study was lower. However, the previous studies dealt with the binary classification of healthy vs. stressed plants, while our classification involved three classes.

4.3. Limitations and Future Research

Although this study showed the potential of using the hyperspectral data-derived SCNVI for the detection of SCN infestation, there are several limitations. Firstly, the number of training and testing samples used in this study was relatively small; future studies need to examine the proposed SCNVI with a much larger and more diverse set of samples to ensure model robustness. Secondly, the non-imaging hyperspectral data used in this study were collected in a greenhouse. It is still unknown whether the selected bands (338 nm and 665 nm) and the correspondingly derived SCNVI have similar characteristics to those from hyperspectral images, especially when affected by environmental variability such as lighting, soil background, and canopy structure. In the future, hyperspectral images such as UAV-derived hyperspectral images should be used to investigate the capacity of the proposed SCNVI for the detection of SCN infestation. Subsequent work has begun to address this gap by adapting the index to UAV multispectral data, reformulating it as SCNVI_UAV to accommodate the absence of the UV band and using available bands. The adapted index was applied to time-series UAV imagery, and hierarchical clustering of the SCNVI_UAV trajectories was used to delineate distinct stress response clusters in the field. These spectrally derived clusters exhibited a promising alignment with ground-truthed SCN population metrics; end-of-season egg counts and seasonal changes in egg accumulation differed significantly across clusters (p < 0.01), indicating that higher stress levels inferred from SCNVI_UAV corresponded with accelerated SCN reproduction [60]. While these findings suggest that the conceptual basis of the SCNVI may translate to field contexts, further work is needed to evaluate its robustness across diverse environmental conditions.
Thirdly, due to the high cost and high dimensionality of hyperspectral data, the application of the proposed SCNVI to large areas such as at regional and national scales is limited. Thus, there is a need to expand the SCNVI to space-borne multispectral images. Moreover, another critical limitation of this study is the assumption that inoculated SCN egg levels directly correlate with plant stress levels. While plants were grouped into three stress categories based on inoculation amounts, actual plant responses to SCN infection may vary due to genetic, physiological, or environmental factors. Therefore, the spectral reflectance used in this study represents induced or added stress potential rather than confirmed physiological stress. This can lead to high uncertainty in classification models, especially in cases where plants with identical SCN inoculation have different stress signatures. This limitation makes ground truthing difficult, as spectral labels are not based on direct physiological measurements or symptom validation but are inferred from treatment levels. Such a challenge is unique to SCN, which differs from other pests or diseases where visual symptoms or pathogen loads are more closely related to stress levels. To address this, future studies should incorporate direct physiological ground-truthing. This includes measuring leaf chlorophyll content, nitrogen status and chlorophyll fluorescence to build more accurate and biologically grounded classification models. Finally, while deep learning models like convolutional neural networks (CNNs) have shown significant promise for hyperspectral data analysis, their application is best suited for large datasets to prevent severe overfitting [61]. As this study was a preliminary investigation with a limited sample size, the use of traditional machine learning algorithms was a more methodologically sound approach to avoid poor model generalization.

5. Conclusions

Developing an effective method for detecting SCN infestation, especially for early detection, is critical in reducing loss of soybean yields due to pest- and disease-induced damages. In this study, a new spectral vegetation index, SCNVI, was proposed based on the selected 338 nm UV band and 665 nm red band from hyperspectral data through spectral data statistical analysis, band selection, and classification comparison. The results showed the following: (1) Severely stressed plants had significantly higher spectral reflectance values than healthy and moderately stressed plants in both the UV and visible regions. The spectral reflectance values from the moderately stressed plants were higher than those from the healthy plants, but their spectral differences were slight; (2) mean reflectance values differed significantly among the three classes overall, but the difference was not significant between the healthy and moderately stressed plants; (3) most of the selected top 10 bands by the seven methods fell in the region from 511 nm to 672 nm with several in the UV and red-edge regions, such as 338 nm and 699 nm; (4) based on the testing data, most of the combinations of the top 10 bands with seven classifiers led to an accuracy of 70% for the binary classification of healthy versus infested plants, but the accuracy was lower than 60% for three-class classification; and (5) the proposed SCNVI, coupled with XGBoost, resulted in a more accurate classification of three classes (70%), and compared with 12 traditional VIs, it increased the accuracy by 67%, showing a stronger capacity for early detection of SCN infestation. While these results are promising, they represent a foundational step. Therefore, we conclude that the SCNVI shows the potential for enhancing SCN detection. 
Moreover, a follow-up study adapting the SCNVI for UAV-based sensors suggests its framework is suitable for field-scale applications, as clusters derived from the adapted index correlated significantly with SCN egg population changes in the field [60]. This indicates a promising method for developing practical, large-area SCN monitoring tools. However, its true efficacy and practical utility for soybean production must first be confirmed through large-scale field validation under real-world agricultural conditions.

Author Contributions

Conceptualization: Y.W.; methodology: Y.W.; data collection: Y.W.; data analysis and validation: Y.W.; writing—original draft preparation: Y.W.; writing—review and editing: Y.W.; supervision: R.L. and J.B.; funding acquisition: R.L., A.F. and J.S. contributed to revising the manuscript and provided valuable insights from different perspectives. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Illinois Soybean Association (ISA) and the Illinois Soybean Center at Southern Illinois University.

Data Availability Statement

The data is available and can be obtained by directly contacting the author.

Acknowledgments

The authors would like to thank Joseph Kalinzi and Xian Liu for their field work and spatial analysis.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Figure A1. Multi-method band selection results: (a) Top 10 selected bands by LDA; (b) Top 10 selected bands by SFM + RF; (c) Top 10 selected bands by SFM + SVM; (d) Top 10 selected bands by SFM + XGBoost; (e) Top 10 selected bands by RFE + RF; (f) Top 10 selected bands by RFE + SVM; (g) Top 10 selected bands by RFE + XGBoost.
Figure A2. Classification accuracy comparison using seven methods: (a) Test accuracy comparison using LDA; (b) Test accuracy comparison using SFM + RF; (c) Test accuracy comparison using SFM + SVM; (d) Test accuracy comparison using SFM + XGBoost; (e) Test accuracy comparison using RFE + RF; (f) Test accuracy comparison using RFE + SVM; (g) Test accuracy comparison using RFE + XGBoost.
Table A1. Confusion matrix and performance metrics of three-class classification using the proposed SCNVI and LDA (TR: total reference, TP: total prediction, PA: producer’s accuracy, UA: user’s accuracy, OA: overall accuracy, KS: Kappa statistics, WP: weighted precision, WR: weighted recall, WF1: weighted F1, and MCC: Matthews correlation coefficient).
| Metric | Healthy | Moderate Stress | Severe Stress | TR | PA | UA | F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Healthy | 0 | 9 | 0 | 9 | 0.0% | 0.0% | 0.0% |
| Moderate Stress | 0 | 15 | 0 | 15 | 100.0% | 55.6% | 71.5% |
| Severe Stress | 0 | 3 | 3 | 6 | 50.0% | 100.0% | 66.7% |
| TP | 0 | 27 | 3 | 30 | | | |
| OA | 60.0% | | | | | | |
| KS | 0.245 | | | | | | |
| WP | 47.8% | | | | | | |
| WR | 60.0% | | | | | | |
| WF1 | 49.0% | | | | | | |
| MCC | 0.389 | | | | | | |
Table A2. Confusion matrix and performance metrics of three-class classification using the proposed SCNVI and SFM+RF (TR: total reference, TP: total prediction, PA: producer’s accuracy, UA: user’s accuracy, OA: overall accuracy, KS: Kappa statistic, WP: weighted precision, WR: weighted recall, WF1: weighted F1, and MCC: Matthews correlation coefficient).
| Metric | Healthy | Moderate Stress | Severe Stress | TR | PA | UA | F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Healthy | 3 | 3 | 3 | 9 | 33.3% | 33.3% | 33.3% |
| Moderate Stress | 3 | 10 | 2 | 15 | 66.7% | 71.4% | 68.9% |
| Severe Stress | 3 | 1 | 2 | 6 | 33.3% | 28.6% | 30.8% |
| TP | 9 | 14 | 7 | 30 | | | |
| OA | 50.0% | | | | | | |
| KS | 0.206 | | | | | | |
| WP | 51.4% | | | | | | |
| WR | 50.0% | | | | | | |
| WF1 | 50.6% | | | | | | |
| MCC | 0.207 | | | | | | |
Table A3. Confusion matrix of three-class classification results using the proposed SCNVI and SFM-SVM (TR: total reference, TP: total prediction, PA: producer’s accuracy, UA: user’s accuracy, OA: overall accuracy, and KS: Kappa statistics).
| Metric | Healthy | Moderate Stress | Severe Stress | TR | PA | UA | F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Healthy | 0 | 9 | 0 | 9 | 0.0% | 0.0% | 0.0% |
| Moderate Stress | 0 | 15 | 0 | 15 | 100.0% | 50.0% | 66.7% |
| Severe Stress | 0 | 6 | 0 | 6 | 0.0% | 0.0% | 0.0% |
| TP | 0 | 30 | 0 | 30 | | | |
| OA | 50.0% | | | | | | |
| KS | 0.000 | | | | | | |
| WP | 25.0% | | | | | | |
| WR | 50.0% | | | | | | |
| WF1 | 33.3% | | | | | | |
| MCC | 0.000 | | | | | | |
Table A4. Confusion matrix of three-class classification results using the proposed SCNVI and RFE-RF (TR: total reference, TP: total prediction, PA: producer’s accuracy, UA: user’s accuracy, OA: overall accuracy, and KS: Kappa statistics).
| Metric | Healthy | Moderate Stress | Severe Stress | TR | PA | UA | F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Healthy | 3 | 3 | 3 | 9 | 33.3% | 33.3% | 33.3% |
| Moderate Stress | 3 | 10 | 2 | 15 | 66.7% | 71.4% | 68.9% |
| Severe Stress | 3 | 1 | 2 | 6 | 33.3% | 28.6% | 30.8% |
| TP | 9 | 14 | 7 | 30 | | | |
| OA | 50.0% | | | | | | |
| KS | 0.206 | | | | | | |
| WP | 51.4% | | | | | | |
| WR | 50.0% | | | | | | |
| WF1 | 50.6% | | | | | | |
| MCC | 0.207 | | | | | | |
Table A5. Confusion matrix of three-class classification results using the proposed SCNVI and RFE-SVM (TR: total reference, TP: total prediction, PA: producer’s accuracy, UA: user’s accuracy, OA: overall accuracy, KS: Kappa statistic, WP: weighted precision, WR: weighted recall, WF1: weighted F1, and MCC: Matthews correlation coefficient).

| Metric | Healthy | Moderate Stress | Severe Stress | TR | PA | UA | F1-Score |
|---|---|---|---|---|---|---|---|
| Healthy | 0 | 9 | 0 | 9 | 0.0% | 0.0% | 0.0% |
| Moderate Stress | 0 | 15 | 0 | 15 | 100.0% | 50.0% | 66.7% |
| Severe Stress | 0 | 6 | 0 | 6 | 0.0% | 0.0% | 0.0% |
| TP | 0 | 30 | 0 | 30 | | | |
| OA | 50.0% | | | | | | |
| KS | 0.000 | | | | | | |
| WP | 25.0% | | | | | | |
| WR | 50.0% | | | | | | |
| WF1 | 33.3% | | | | | | |
| MCC | 0.000 | | | | | | |
Table A6. Confusion matrix of three-class classification results using the proposed SCNVI and RFE-XGBoost (TR: total reference, TP: total prediction, PA: producer’s accuracy, UA: user’s accuracy, OA: overall accuracy, KS: Kappa statistic, WP: weighted precision, WR: weighted recall, WF1: weighted F1, and MCC: Matthews correlation coefficient).

| Metric | Healthy | Moderate Stress | Severe Stress | TR | PA | UA | F1-Score |
|---|---|---|---|---|---|---|---|
| Healthy | 4 | 5 | 0 | 9 | 44.4% | 66.7% | 53.0% |
| Moderate Stress | 1 | 13 | 1 | 15 | 86.7% | 68.4% | 76.0% |
| Severe Stress | 1 | 1 | 4 | 6 | 66.7% | 80.0% | 73.0% |
| TP | 6 | 19 | 5 | 30 | | | |
| OA | 70.0% | | | | | | |
| KS | 0.491 | | | | | | |
| WP | 70.0% | | | | | | |
| WR | 70.0% | | | | | | |
| WF1 | 69.0% | | | | | | |
| MCC | 0.505 | | | | | | |

References

  1. Arjoune, Y.; Sugunaraj, N.; Peri, S.; Nair, S.V.; Skurdal, A.; Ranganathan, P.; Johnson, B. Soybean Cyst Nematode Detection and Management: A Review. Plant Methods 2022, 18, 110. [Google Scholar] [CrossRef]
  2. Markell, S.; Malvick, D. Soybean Disease Management Guide; University Extension: Fargo, ND, USA, 2021. [Google Scholar]
  3. University of Minnesota Extension. Soybean Cyst Nematode Management Guide. 2021. Available online: https://extension.umn.edu/soybean-pest-management/soybean-cyst-nematode-management-guide (accessed on 21 August 2024).
  4. Tylka, G.; Marett, C. Known Distribution of the Soybean Cyst Nematode, Heterodera glycines, in the United States and Canada through 2023. Plant Health Prog. 2025, 26, 51–53. [Google Scholar] [CrossRef]
  5. Ye, W. Soybean Cyst Nematode (Heterodera glycines) Problems in Soybean (Glycine max L.) Crops and Its Management. Adv. Agric. 2017, 2017, 1–8. [Google Scholar]
  6. Andres, H.; Grabau, Z.J. Soybean Cyst Nematode, Heterodera glycines (Ichinohe, 1952) (Chromadorea: Rhabdita: Heteroderidae); EENY-815/IN1441; UF/IFAS Extension, University of Florida: Gainesville, FL, USA, 2025; Available online: https://edis.ifas.ufl.edu/publication/IN1441 (accessed on 21 August 2025).
  7. Markell, S. SCN Population Variability in Soybean Fields. Plant Health Prog. 2021, 22, 200–202. [Google Scholar]
  8. Nutter, F.W., Jr.; Tylka, G.L.; Guan, J.; Moreira, A.J.D.; Marett, C.C.; Rosburg, T.R.; Basart, J.P.; Chong, C.S. Use of remote sensing to detect soybean cyst nematode-induced plant stress. J. Nematol. 2002, 34, 222–231. [Google Scholar] [PubMed]
  9. Zhang, J.; Huang, Y.; Pu, R.; Gonzalez-Moreno, P.; Yuan, L.; Wu, J.; Huang, J. Monitoring Plant Diseases Using Remote Sensing. Remote Sens. 2019, 11, 155. [Google Scholar]
  10. Cui, D.; Zhang, Q.; Li, M.; Hartman, G.L.; Zhao, Y. Image Processing Methods for Quantitatively Detecting Soybean Rust from Multispectral Images. Biosyst. Eng. 2010, 107, 186–193. [Google Scholar] [CrossRef]
  11. Santos, L.B.; Bastos, L.M.; de Oliveira, M.F.; Soares, P.L.M.; Ciampitti, I.A.; da Silva, R.P. Identifying Nematode Damage on Soybean through Remote Sensing and Machine Learning Techniques. Agronomy 2022, 12, 2404. [Google Scholar] [CrossRef]
  12. Mahlein, A.K. Plant Disease Detection by Imaging Sensors—Parallels and Specific Demands for Precision Agriculture and Plant Phenotyping. Plant Dis. 2016, 100, 241–251. [Google Scholar] [CrossRef]
  13. Wan, G.; He, J.; Meng, X.; Liu, G.; Zhang, J.; Ma, F.; Zhang, Q.; Wu, D. Hyperspectral Imaging Technology for Nondestructive Identification of Quality Deterioration in Fruits and Vegetables: A Review. Crit. Rev. Food Sci. Nutr. 2025, 2, 1–30. [Google Scholar] [CrossRef]
  14. Lassalle, G. Monitoring natural and anthropogenic plant stressors by hyperspectral remote sensing: Recommendations and guidelines based on a meta-review. Sci. Total Environ. 2021, 788, 147758. [Google Scholar] [CrossRef]
  15. Bajwa, S.G.; Rupe, J.C.; Mason, J. Soybean Disease Monitoring with Leaf Reflectance. Remote Sens. 2017, 9, 127. [Google Scholar] [CrossRef]
  16. Kulkarni, R.; Hartman, G.L.; Domier, L.L. Use of Vegetation Indices to Detect Soybean Cyst Nematode Stress. Crop Sci. 2008, 48, 1461–1470. [Google Scholar]
  17. Kulkarni, R.; Hartman, G.L.; Domier, L.L. WDRVI-Based Assessment of Soybean Yield Loss from SCN. Agron. J. 2008, 100, 221–227. [Google Scholar]
  18. Zhao, H.; Yang, C.; Guo, W.; Zhang, L.; Zhang, D. Automatic Estimation of Crop Disease Severity Levels Based on Vegetation Index Normalization. Remote Sens. 2020, 12, 1930. [Google Scholar] [CrossRef]
  19. McLachlan, G.J. Discriminant Analysis and Statistical Pattern Recognition; John Wiley & Sons: Hoboken, NJ, USA, 2005. [Google Scholar]
  20. Bommert, A.; Sun, X.; Bischl, B.; Rahnenführer, J.; Lang, M. Benchmark for Filter Methods for Feature Selection in High-Dimensional Classification Data. Comput. Stat. Data Anal. 2020, 143, 106839. [Google Scholar] [CrossRef]
  21. Li, J.; Cheng, K.; Wang, S.; Morstatter, F.; Trevino, R.P.; Tang, J.; Liu, H. Feature Selection: A Data Perspective. ACM Comput. Surv. 2018, 50, 94. [Google Scholar] [CrossRef]
  22. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
  23. Chandrashekar, G.; Sahin, F. A Survey on Feature Selection Methods. Comput. Electr. Eng. 2014, 40, 16–28. [Google Scholar] [CrossRef]
  24. Kursa, M.B.; Rudnicki, W.R. Feature Selection with the Boruta Package. J. Stat. Softw. 2010, 36, 1–13. [Google Scholar] [CrossRef]
  25. Zhang, A.; Dong, Z.; Kang, X. Feature Selection Algorithms of Airborne LiDAR Combined with Hyperspectral Images Based on XGBoost. Chin. J. Lasers 2019, 46, 0404003. [Google Scholar] [CrossRef]
  26. Paul, N.; Sunil, G.C.; Horvath, D.; Sun, X. Deep Learning for Plant Stress Detection: A Comprehensive Review of Technologies, Challenges, and Future Directions. Comput. Electron. Agric. 2025, 229, 109734. [Google Scholar] [CrossRef]
  27. Pedersen, P.; Kumudini, S.; Board, J.; Conley, S.; Naeve, S.; Grau, C.; Oplinger, E. Soybean Growth and Development; Iowa State University Extension: Ames, IA, USA, 2004. [Google Scholar]
  28. Rasti, B.; Scheunders, P.; Ghamisi, P.; Licciardi, G.; Chanussot, J. Noise Reduction in Hyperspectral Imagery: Overview and Application. Remote Sens. 2018, 10, 482. [Google Scholar] [CrossRef]
  29. Hao, Y.; Liu, P.; Li, J.; Wang, L.; Li, W. Wavelet-Based Threshold Denoising for Imaging Hyperspectral Spectrometer. Int. J. Agric. Biol. Eng. 2014, 7, 83–90. [Google Scholar]
  30. Yang, C.; Everitt, J.H.; Du, Q.; Luo, B.; Chanussot, J. Hyperspectral Image Analysis for Vegetation Monitoring: A Review. Photogramm. Eng. Remote Sens. 2011, 77, 1121–1136. [Google Scholar]
  31. Zhang, G.; Jia, X. Feature Selection Using Kernel-Based Local Fisher Discriminant Analysis for Hyperspectral Image Classification. In Proceedings of the 2011 IEEE International Geoscience and Remote Sensing Symposium, Vancouver, BC, Canada, 24–29 July 2011; pp. 1728–1731. [Google Scholar]
  32. Zhang, R.; Ma, J. Feature Selection for Hyperspectral Data Based on Recursive Support Vector Machines. Int. J. Remote Sens. 2009, 30, 3669–3677. [Google Scholar] [CrossRef]
  33. Pullanagari, R.R.; Kereszturi, G.; Yule, I. Integrating Airborne Hyperspectral, Topographic, and Soil Data for Estimating Pasture Quality Using Recursive Feature Elimination with Random Forest Regression. Remote Sens. 2018, 10, 1117. [Google Scholar] [CrossRef]
  34. Tian, J.; Jiang, Y.; Zhang, J.; Wang, Z.; Rodríguez-Andina, J.J.; Luo, H. High-Performance Fault Classification Based on Feature Importance Ranking–XGBoost Approach with Feature Selection of Redundant Sensor Data. Curr. Chin. Sci. 2022, 2, 243–251. [Google Scholar] [CrossRef]
  35. Colkesen, I.; Kavzoglu, T. Performance Evaluation of Rotation Forest for SVM-Based Recursive Feature Elimination Using Hyperspectral Imagery. In Proceedings of the 2016 8th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), Los Angeles, CA, USA, 21–24 August 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–5. [Google Scholar]
  36. Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the Radiometric and Biophysical Performance of the MODIS Vegetation Indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
  37. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring Vegetation Systems in the Great Plains with ERTS. In Proceedings of the Third ERTS Symposium; NASA: Washington, DC, USA, 1974; pp. 309–317. [Google Scholar]
  38. Qi, J.; Chehbouni, A.; Huete, A.R.; Kerr, Y.H.; Sorooshian, S. A Modified Soil Adjusted Vegetation Index. Remote Sens. Environ. 1994, 48, 119–126. [Google Scholar] [CrossRef]
  39. Gitelson, A.; Merzlyak, M.N. Spectral Reflectance Changes Associated with Chlorophyll Content in Higher Plants: Red-Edge Spectral Indices. Int. J. Remote Sens. 1994, 15, 2169–2189. [Google Scholar]
  40. Broge, N.H.; Leblanc, E. Comparing Vegetation Indices for Crop Monitoring Using Simulated Reflectance Data. Remote Sens. Environ. 2001, 76, 156–172. [Google Scholar] [CrossRef]
  41. Marsett, R.C.; Qi, J.; Heilman, P.; Biedenbender, S.H.; Watson, M.C.; Amer, S.; Weltz, M.; Goodrich, D.; Marsett, R. Remote Sensing for Grassland Management in the Arid Southwest. Remote Sens. Environ. 2006, 101, 399–413. [Google Scholar]
  42. Daughtry, C.S.T.; Walthall, C.L.; Kim, M.S.; de Colstoun, E.B.; McMurtrey, J.E. Estimating Corn Leaf Chlorophyll Concentration from Leaf and Canopy Reflectance. Remote Sens. Environ. 2000, 74, 229–239. [Google Scholar] [CrossRef]
  43. Barnes, E.M.; Clarke, T.R.; Richards, S.E.; Colaizzi, P.D.; Haberland, J.; Kostrzewski, M.; Waller, P.; Choi, C.; Riley, E.; Thompson, T.; et al. Coincident Detection of Crop Water Stress, Nitrogen Status and Canopy Density Using Ground-Based Multispectral Data. In Proceedings of the Fifth International Conference on Precision Agriculture, Bloomington, MN, USA, 16–19 July 2000. [Google Scholar]
  44. Peñuelas, J.; Filella, I.; Biel, C.; Serrano, L.; Save, R. The Reflectance at the 950–970 nm Region as an Indicator of Plant Water Status. Int. J. Remote Sens. 1993, 14, 1887–1905. [Google Scholar] [CrossRef]
  45. Filella, I.; Serrano, L.; Serra, J.; Peñuelas, J. Evaluating Wheat Nitrogen Status with Canopy Reflectance Indices and Discriminant Analysis. Crop Sci. 1995, 35, 1400–1405. [Google Scholar] [CrossRef]
  46. Gitelson, A.A.; Kaufman, Y.J.; Merzlyak, M.N. Use of a Green Channel in Remote Sensing of Global Vegetation from EOS-MODIS. Remote Sens. Environ. 1996, 58, 289–298. [Google Scholar] [CrossRef]
  47. Huete, A.R. A Soil-Adjusted Vegetation Index (SAVI). Remote Sens. Environ. 1988, 25, 295–309. [Google Scholar] [CrossRef]
  48. Yang, C.; Bai, J.; Sun, H.; Bi, R.; Song, L.; Wang, C.; Zhao, Y.; Yang, W.; Xiao, L.; Zhang, M.; et al. A New Method for Rapid Construction of Multi-Band Vegetation Index. Int. J. Appl. Earth Obs. Geoinf. 2025, 140, 104601. [Google Scholar] [CrossRef]
  49. Carter, G.A.; Knapp, A.K. Leaf Optical Properties in Higher Plants: Linking Spectral Characteristics to Stress and Chlorophyll Content. Am. J. Bot. 2001, 88, 677–684. [Google Scholar] [CrossRef]
  50. Kataria, S.; Jajoo, A.; Guruprasad, K.N. Impact of Increasing Ultraviolet-B (UV-B) Radiation on Photosynthetic Processes. J. Photochem. Photobiol. B Biol. 2014, 137, 55–66. [Google Scholar] [CrossRef]
  51. Pineda, M.; Barón, M.; Pérez-Bueno, M.-L. Thermal Imaging for Plant Stress Detection and Phenotyping. Remote Sens. 2021, 13, 68. [Google Scholar] [CrossRef]
  52. Hückelhoven, R. Cell Wall-Associated Mechanisms of Disease Resistance and Susceptibility. Annu. Rev. Phytopathol. 2007, 45, 101–127. [Google Scholar] [CrossRef] [PubMed]
  53. Voigt, C.A. Callose-Mediated Resistance to Pathogenic Intruders in Plant Defense-Related Papillae. Front. Plant Science. 2014, 5, 168. [Google Scholar] [CrossRef]
  54. Falcioni, R.; Gonçalves, J.V.F.; de Oliveira, K.M.; de Oliveira, C.A.; Reis, A.S.; Crusiol, L.G.T.; Furlanetto, R.H.; Antunes, W.C.; Cezar, E.; de Oliveira, R.B. Chemometric Analysis for the Prediction of Biochemical Compounds in Leaves Using UV-VIS-NIR-SWIR Hyperspectroscopy. Plants 2023, 12, 3424. [Google Scholar] [CrossRef]
  55. Jjagwe, J.; Ochola, D.; Mwale, E.; Asea, G.; Okello, D.K.; Mukankusi, C. Remote Sensing and Machine Learning Approaches for Detecting Soybean Cyst Nematode Infestation at Field Scale. Remote Sens. 2024, 16, 765. [Google Scholar]
  56. Furlanetto, R.H.; Nanni, M.R.; Mizuno, M.S.; Crusiol, L.G.T.; da Silva, C.R. Identification and Classification of Asian Soybean Rust Using Leaf-Based Hyperspectral Reflectance. Int. J. Remote Sens. 2021, 42, 4177–4198. [Google Scholar] [CrossRef]
  57. Conrad, A.O.; Li, W.; Lee, D.Y.; Wang, G.L.; Rodriguez-Saona, L.; Bonello, P. Machine Learning-Based Presymptomatic Detection of Rice Sheath Blight Using Spectral Profiles. Plant Phenomics 2020, 2020, 8954085. [Google Scholar] [CrossRef]
  58. Mishra, P.; Asaari, M.S.M.; Herrero-Langreo, A.; Lohumi, S.; Diezma, B.; Scheunders, P. Close-Range Hyperspectral Imaging of Plants: A Review. Biosyst. Eng. 2020, 193, 139–151. [Google Scholar] [CrossRef]
  59. Krishna, G.; Prema, D. Early Detection of Plant Stress Using Leaf-Level Hyperspectral Data and Machine Learning Approaches. Comput. Electron. Agric. 2020, 175, 105580. [Google Scholar]
  60. Wang, Y. Detection of Soybean SCN Infestation Using Multi-Scale Remote Sensing. Master’s Thesis, Southern Illinois University, Carbondale, IL, USA, 2025. Available online: https://opensiuc.lib.siu.edu/theses/3376/ (accessed on 20 August 2025).
  61. Song, Y.; Zhang, J.; Liu, Z.; Xu, Y.; Quan, S.; Sun, L.; Bi, J.; Wang, X. Deep Learning for Hyperspectral Image Classification: A Comprehensive Review and Future Predictions. Inf. Fusion 2025, 123, 103285. [Google Scholar] [CrossRef]
Figure 1. Distribution of SCNVI values across stress levels (healthy, moderate stress, and severe stress).
Figure 2. The correlations of SCNVI with the selected VIs.
Figure 3. Wavelet-denoised spectral reflectance curves of soybean samples at different levels of SCN egg inoculation: (a) entire spectral region (from 325 nm to 1125 nm); (b) ultraviolet region (325 to 400 nm); (c) visible region (400 to 700 nm); and (d) near-infrared region (700 to 1125 nm).
Figure 4. Significant bands identified by ANOVA, based on a p-value of <0.05 and an F-value of 3.09.
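The band screening summarized in Figure 4 can be illustrated with a one-way ANOVA computed band by band. The sketch below is a minimal pure-Python version, not the authors' code. The function names, band names, class sizes, and reflectance values are synthetic placeholders. It treats 3.09 as the F threshold, which is close to the 5% critical value for 2 and 97 degrees of freedom (three classes, 100 samples); that reading of the caption is an assumption.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for a list of groups (lists of floats).

    Assumes at least two groups and nonzero within-group variance.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def synthetic_band(mean, n):
    """Deterministic placeholder reflectances centred on `mean` (n divisible by 5)."""
    return [mean + 0.01 * ((i % 5) - 2) for i in range(n)]

F_THRESHOLD = 3.09  # value quoted in the Figure 4 caption; its role as critical F is assumed

# Synthetic placeholder data: groups are healthy, moderate, severe (sizes assumed)
bands = {
    "band_a": [synthetic_band(0.05, 25), synthetic_band(0.06, 50), synthetic_band(0.07, 25)],
    "band_b": [synthetic_band(0.45, 25), synthetic_band(0.45, 50), synthetic_band(0.45, 25)],
}

# Keep only bands whose class means differ more than the threshold allows
significant = [name for name, groups in bands.items() if anova_f(groups) > F_THRESHOLD]
print(significant)  # ['band_a']
```

Because the critical F-value depends on the degrees of freedom, the threshold would need to be recomputed for a different number of classes or samples.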
Figure 5. PCA factor loadings: correlation coefficients between the first two components (PC1 and PC2) and the original bands.
Figure 6. Comparison of training and testing accuracy for the three-class classification between the proposed VI and the widely used VIs.
Figure 7. The effects of adding more vegetation indices (VIs) on classification accuracy.
Table 2. Top 10 selected bands by different methods (UV bands are shown in purple, blue bands in blue, green bands in green, red bands in red, and red-edge bands in yellow).

| Rank | LDA | SFM + RF | SFM + SVM | SFM + XGBoost | RFE + RF | RFE + SVM | RFE + XGBoost |
|---|---|---|---|---|---|---|---|
| 1 | 515 | 337 | 511 | 671 | 516 | 512 | 519 |
| 2 | 516 | 338 | 569 | 570 | 515 | 510 | 510 |
| 3 | 514 | 346 | 513 | 702 | 511 | 514 | 513 |
| 4 | 517 | 510 | 512 | 665 | 517 | 517 | 514 |
| 5 | 665 | 570 | 567 | 672 | 513 | 515 | 517 |
| 6 | 702 | 567 | 517 | 670 | 510 | 518 | 518 |
| 7 | 569 | 670 | 671 | 699 | 519 | 519 | 511 |
| 8 | 568 | 695 | 672 | 667 | 518 | 516 | 516 |
| 9 | 666 | 511 | 666 | 666 | 514 | 511 | 512 |
| 10 | 699 | 571 | 519 | 338 | 512 | 513 | 515 |
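One way to read Table 2 is to count how many of the seven selection methods place a given band in their top 10. The sketch below is illustrative only: the dictionary transcribes the table (wavelengths in nm), and the helper `methods_selecting` is a hypothetical name. It checks the two bands used by the SCNVI, 338 nm and 665 nm.

```python
from collections import Counter

# Top-10 bands (nm) per selection method, transcribed from Table 2
top10 = {
    "LDA": [515, 516, 514, 517, 665, 702, 569, 568, 666, 699],
    "SFM + RF": [337, 338, 346, 510, 570, 567, 670, 695, 511, 571],
    "SFM + SVM": [511, 569, 513, 512, 567, 517, 671, 672, 666, 519],
    "SFM + XGBoost": [671, 570, 702, 665, 672, 670, 699, 667, 666, 338],
    "RFE + RF": [516, 515, 511, 517, 513, 510, 519, 518, 514, 512],
    "RFE + SVM": [512, 510, 514, 517, 515, 518, 519, 516, 511, 513],
    "RFE + XGBoost": [519, 510, 513, 514, 517, 518, 511, 516, 512, 515],
}

# Number of methods that rank each band in their top 10
counts = Counter(band for bands in top10.values() for band in bands)

def methods_selecting(band):
    """List the methods whose top-10 list contains the given band."""
    return [method for method, bands in top10.items() if band in bands]

for band in (338, 665):
    print(band, counts[band], methods_selecting(band))
# 338 2 ['SFM + RF', 'SFM + XGBoost']
# 665 2 ['LDA', 'SFM + XGBoost']
```

In this tally, SFM + XGBoost is the only method whose top-10 list contains both SCNVI bands.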
Table 3. Confusion matrix of three-class classification results using the proposed SCNVI and SFM-XGBoost (TR: total reference, TP: total prediction, PA: producer’s accuracy, UA: user’s accuracy, OA: overall accuracy, KS: Kappa statistic, WP: weighted precision, WR: weighted recall, WF1: weighted F1, and MCC: Matthews correlation coefficient).

| Metric | Healthy | Moderate Stress | Severe Stress | TR | PA | UA | F1-Score |
|---|---|---|---|---|---|---|---|
| Healthy | 4 | 5 | 0 | 9 | 44.4% | 66.7% | 53.0% |
| Moderate Stress | 1 | 13 | 1 | 15 | 86.7% | 68.4% | 76.0% |
| Severe Stress | 1 | 1 | 4 | 6 | 66.7% | 80.0% | 73.0% |
| TP | 6 | 19 | 5 | 30 | | | |
| OA | 70.0% | | | | | | |
| KS | 0.491 | | | | | | |
| WP | 70.0% | | | | | | |
| WR | 70.0% | | | | | | |
| WF1 | 69.0% | | | | | | |
| MCC | 0.505 | | | | | | |
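The per-class columns of Table 3 follow from the same confusion matrix. The sketch below is a minimal Python illustration under the assumption that rows are reference classes and columns are predicted classes; the function name `per_class_metrics` is hypothetical. Producer's accuracy corresponds to recall, user's accuracy to precision, and the weighted averages use each class's share of reference samples.

```python
def per_class_metrics(cm):
    """Return per-class (PA, UA, F1) tuples plus weighted precision, recall, F1.

    Assumes cm[i][j] counts samples of reference class i predicted as class j.
    """
    ref = [sum(row) for row in cm]           # TR: reference totals
    pred = [sum(col) for col in zip(*cm)]    # TP: prediction totals
    n = sum(ref)

    rows = []
    for i in range(len(cm)):
        pa = cm[i][i] / ref[i] if ref[i] else 0.0     # producer's accuracy (recall)
        ua = cm[i][i] / pred[i] if pred[i] else 0.0   # user's accuracy (precision)
        f1 = 2 * pa * ua / (pa + ua) if (pa + ua) else 0.0
        rows.append((pa, ua, f1))

    # Weight each class by its number of reference samples
    wr, wp, wf1 = (
        sum(ref[i] * rows[i][k] for i in range(len(cm))) / n for k in range(3)
    )
    return rows, wp, wr, wf1

# Confusion matrix from Table 3 (rows: healthy, moderate stress, severe stress)
table_3 = [[4, 5, 0], [1, 13, 1], [1, 1, 4]]
rows, wp, wr, wf1 = per_class_metrics(table_3)
for name, (pa, ua, f1) in zip(("Healthy", "Moderate", "Severe"), rows):
    print(f"{name}: PA={pa:.1%}, UA={ua:.1%}, F1={f1:.1%}")
print(f"WP={wp:.1%}, WR={wr:.1%}, WF1={wf1:.1%}")
```

The PA and UA values reproduce the table. The F1 scores computed this way (53.3%, 76.5%, and 72.7%) differ slightly from the tabulated 53.0%, 76.0%, and 73.0%, which may reflect rounding to two decimal places before conversion to percentages.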
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Wang, Y.; Li, R.; Bond, J.; Fakhoury, A.; Schoof, J. A Novel Spectral Vegetation Index for Improved Detection of Soybean Cyst Nematode (SCN) Infestation Using Hyperspectral Data. Crops 2025, 5, 58. https://doi.org/10.3390/crops5050058