Article

Deep Learning for Soybean Cyst Nematode Detection: A Comparison of Vision Transformer and CNN with Multispectral Imaging

1 Department of Food, Agricultural, and Biological Engineering, The Ohio State University, Columbus, OH 43201, USA
2 Department of Plant Pathology, The Ohio State University, Columbus, OH 43201, USA
* Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(5), 757; https://doi.org/10.3390/rs18050757
Submission received: 17 December 2025 / Revised: 29 January 2026 / Accepted: 25 February 2026 / Published: 2 March 2026

Highlights

What are the main findings?
  • Near-infrared (NIR) spectral band is a strong discriminator between non-detected and SCN-infested areas.
  • Vision Transformer (ViT) and customized CNN achieved an accuracy of 77.5% in detecting SCN-infested areas.
What are the implications of the main findings?
  • Demonstrates the effectiveness of using multispectral and deep-learning architectures in SCN-infested fields.
  • The proposed framework can enable more precise spatial and temporal monitoring of SCN infestations.

Abstract

Soybean cyst nematode (SCN) is the most economically devastating pathogen of soybean in North America. Even at low to moderate infestation levels, SCN can cause 20–30% yield loss without producing any visible aboveground symptoms. In severely infested fields, yield reductions can reach 60–70% and, in extreme cases, exceed 80%. Prior research on identifying SCN infestations has primarily relied on traditional machine-learning methods applied to Unmanned Aerial System (UAS)-based multispectral imagery, with limited success. This study hypothesizes that deep-learning (DL) methods can more effectively capture the subtle spectral and spatial signatures in multispectral images of SCN stress. To address this gap, we evaluate the performance of advanced DL architectures, including Vision Transformer (ViT) and a customized Convolutional Neural Network (CNN), for detecting SCN infestation in soybean fields using multispectral UAS imagery. Spectral analysis of the multispectral imagery revealed that the near-infrared (NIR) band is a strong discriminator between non-detected and SCN-infested areas. The DL models trained and tested across multiple growth stages showed promising results. The four-timestamp ViT model (3 June, 29 July, 19 August, and 2 September) achieved an F1-score of 0.74, while the five-timestamp SCN–CNN model (3 June, 22 July, 29 July, 19 August, and 2 September) achieved an F1-score of 0.75. Although overall performance was comparable, ViT demonstrated more stable performance across varying training and test data distributions. These findings highlight the effectiveness of DL architectures to automatically extract subtle, complex plant features from multispectral imagery throughout the growing season. Compared with manual, time-consuming soil-sampling techniques, the proposed framework enables more precise spatial and temporal monitoring of SCN infestations across fields.

1. Introduction

Soybean is one of the most important crops globally, serving as a primary source of vegetable oil for human consumption and industrial applications, including inks and biodiesel, as well as a major contributor to animal protein feed systems [1]. For the past six decades, soybean production area has steadily increased, and demand is anticipated to continue rising as the global population grows [2]. Soybean production also plays a significant role in the global economy, with the soybean sector contributing over $115 billion annually in the USA alone [3]. Thus, meeting future soybean demand is critically important and hinges on addressing key biotic constraints, among which pathogens, especially nematodes, remain highly consequential.
Nematodes are a significant concern in large-scale soybean production, with the soybean cyst nematode (SCN), Heterodera glycines, recognized as the most destructive pathogen of soybean in the USA [3]. SCN causes more yield loss than any other soybean disease, roughly twice as much as the next most damaging disease [4]. Between 1996 and 2014, it contributed to approximately 36% of pathogen-related soybean losses [3]. SCN infestations typically reduce yield by 20–30%, often without visible aboveground symptoms [5], and severe infestations can result in 60–80% or greater yield losses [6]. Economic losses are estimated to exceed $1.5 billion annually in the United States [3] and over $120 million in China [6]. The nematode can persist in soil for years without a host, infects soybean roots by secreting effectors through its stylet that break down cell walls, and establishes feeding sites near the vascular cylinder for nutrient extraction [6]. Combined with its capacity to spread through any soil-moving process and to remain visually undetectable during early infection, these traits underscore the critical need for timely, field-scale detection and management [3,7]. When symptoms are visible, they may include severe growth restriction, stunting, and a yellowish appearance. Strategies to control SCN include using resistant soybean varieties, rotating with non-host crops like alfalfa, oats, corn, sorghum, or wheat, and maintaining clean equipment to prevent contamination and further spread [3,7].
Remote sensing has increasingly been explored as a tool for detecting foliar diseases and specific pathogens or causal agents in soybeans [8,9]. Given SCN’s substantial impact, nematodes have received notable attention in the remote sensing literature. Early work demonstrated that reflectance measurements from satellite, airborne, and ground-based sensors could explain up to 60% of the variability in initial SCN population densities within soybean quadrats [10]. Additional airborne multispectral studies found significant correlations between SCN densities and indices such as the green normalized difference vegetation index (GNDVI), though detection remained challenging at low infestation levels [11,12]. Other studies using spectroradiometer data identified red, green, blue, and NIR wavebands as key discriminators for SCN-related stress, with the best detection occurring 105–120 days after planting [13].
Recent technological advances in sensors and aerial platforms have expanded the use of unmanned aerial systems (UAS) for detecting SCN and root lesion nematodes (Pratylenchus brachyurus). UAS multispectral data have been used to correlate nematode populations with specific spectral bands, with moderate R2 values (>0.3) observed in the red, 586 nm, and green spectral regions [14,15]. Machine-learning models (i.e., logistic regression, random forest, and conditional inference tree) trained on multispectral UAS images together with georeferenced point measurements of nematode infestation in soil have achieved moderate accuracy (>70%) in detecting SCN-symptomatic plants, with spectral bands such as green and near-infrared (NIR) serving as key predictors [16]. Additional UAS-based studies identified blue-band reflectance, GNDVI, NDRE, and the Green Chlorophyll Index (GCI) as strong indicators of SCN stress; these same indices also had the highest correlation with soybean yield (r = 0.59–0.75) [17]. Collectively, these studies demonstrate the value of multispectral vegetation indices for identifying SCN-related plant stress.
Despite these advancements, most previous work has relied on traditional ML methods and vegetation index–based feature engineering, which may overlook the complex spectral–spatial patterns associated with early or subtle SCN infestations. The limited application of deep learning (DL) in this domain is likely due to data availability and the technical barriers to deploying advanced models. DL methods, including convolutional neural networks (CNNs) and transformer-based architectures such as Vision Transformers (ViT), can automatically extract intricate features from selected vegetative indices. These approaches have demonstrated strong performance in emerging precision agriculture applications, such as weed detection [18] and plant disease identification [19,20,21], yet their potential for SCN detection remains underexplored.
In early SCN infection, nematodes establish feeding sites within roots, which can disrupt water and nutrient uptake without causing visible foliar symptoms. Physiologically, this stress can reduce chlorophyll synthesis and alter leaf water content, which may manifest as subtle changes in reflectance before visible yellowing occurs [13], making early detection challenging. This study investigates the utility of UAS multispectral imagery combined with advanced DL architectures for detecting subtle physiological responses expressed through spectral changes in soybean plants under SCN stress. Leveraging these technologies could allow us to capture hidden SCN damage that traditional visual scouting might overlook. The study objectives are to: (i) analyze the spectral signatures associated with SCN-infested soybean at multiple growth stages using UAS multispectral imagery, and (ii) implement and compare Vision Transformer (ViT) and Convolutional Neural Networks (CNNs) models to evaluate their effectiveness in SCN identification and mapping. This work demonstrates the potential of contemporary DL methods to improve SCN monitoring, thereby supporting more timely and effective nematode management strategies in soybean production.

2. Materials and Methods

2.1. Study Area

The experimental soybean field (~3.5 acres) is located at the Waterman Agricultural and Natural Resources Laboratory near the main campus of The Ohio State University (OSU) in Columbus, Ohio, USA (40.015 N, 83.038 W). The soil is predominantly Crosby silt loam, characteristic of the Southern Ohio Till Plain, with an estimated slope of 2–6% [22]. The field is classified as prime farmland when adequately drained, but is somewhat poorly drained. Mean annual precipitation ranges from 91.5 to 112 cm (36 to 44 in), with a mean annual air temperature of 9 to 12 °C (48 to 54 °F) and a frost-free period of approximately 145 to 180 days (Figure 1). Field plots have been reserved for SCN studies, and for several years, SCN-susceptible soybean cultivars have been planted to maintain high population densities between research trials. The field has been infested long enough to be considered naturally SCN-infested, making it valuable for research purposes (Figure 2).

2.2. Data Collection

2.2.1. Collection of UAS Multispectral Imagery

UAS data were collected using a DJI Matrice 200 V2 (DJI Technology Co., Shenzhen, China) equipped with a MicaSense RedEdge-MX multispectral camera (Parrot Co., now acquired by AgEagle Aerial Systems, Wichita, KS, USA). This camera captures imagery in five spectral bands: Blue (B), Green (G), Red (R), Red Edge (RE), and Near Infrared (NIR). Flights were conducted at an altitude of 30 m with 75% front and side overlap, primarily on sunny days with no to minimal cloud cover. Over the 2022 soybean growing season, eleven UAS flights were performed at approximately one- to two-week intervals on the following dates: 3, 10, 17, and 24 June; 1, 11, 22, and 29 July; 8 and 19 August; and 2 September. Raw images from each flight were pre-processed in Pix4Dmapper software version 4.8.4 (Pix4D, Prilly, Switzerland) to generate five-band orthomosaics for each date. The resulting composites were subsequently georeferenced in ArcGIS Pro 3.2 (ESRI, Redlands, CA, USA) to ensure precise spatial alignment across all flight dates and with the ground-truth data.

2.2.2. Soybean Cyst Nematode Population Sampling

SCN populations were monitored at 100 sampling locations distributed evenly across the field in a grid-like pattern (Figure 2). Each location was marked with flags and wooden stakes to maintain consistent soil sampling throughout the season. At the end of the 2022 growing season, precise GPS coordinates for all sampling points were recorded using a Trimble real-time kinematic (RTK) GPS system (Trimble Inc., Westminster, CO, USA) and verified against georeferenced UAS imagery. SCN counts were collected approximately every 14 days on 25 April; 9, 23 May; 7, 21 June; 5, 19 July; 1, 16, 29 August; 12, 25 September of 2022. Because UAS flights did not occur on the exact soil sampling dates, each UAS image was paired with the nearest SCN sampling date within a 3–4-day window (Figure 3). This time lag was not intentional, but rather an artifact of logistical challenges, including personnel schedules and equipment availability.
At each of the 100 flagged locations, soil samples were collected biweekly following the same monitoring schedule. For each sampling event, a composite sample consisting of 15–20 soil cores (15–25 cm depth) was collected using a 2.54-cm-diameter soil probe (Model LS; Oakfield Apparatus, Fond du Lac, WI, USA). Cores were collected between soybean rows to avoid disturbing the plants. Samples were stored at 4 °C until processing. From each composite sample, a 100 cm3 subsample was used to extract SCN cysts with a semiautomatic elutriator (University of Georgia Science Instrumental Shop, Athens, GA, USA). SCN eggs were then extracted and quantified according to standard procedures [23] and reported as eggs per 100 cm3 of soil.

2.2.3. Soybean Management and Yield

Soybeans were planted on 23 April 2022, at 180,000 seeds per acre. Standard commercial production practices, including herbicide application, fungicide treatments, and fertilization, were implemented as needed in all soybean trials, following recommendations in the Ohio Agronomy Guide [24]. To assess yield, approximately 25 soybean plants were collected from a 2 m2 area at each sampling point and hand-threshed using a small bundle thresher (SBT-RSG; Almaco, Nevada, IA, USA). Seed weight was recorded, and moisture content was measured with a portable moisture tester (Harvest Hand; Dickey-John Corporation, Auburn, IL, USA). Final seed weight was adjusted to 13% moisture, and yield was expressed as grams per square meter (g/m2).
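The 13% moisture adjustment can be sketched numerically; the dry-matter conservation formula below is the standard agronomic convention, assumed here rather than quoted from the text, and the function name is illustrative.

```python
def adjust_to_standard_moisture(wet_weight_g, measured_moisture_pct,
                                target_moisture_pct=13.0):
    """Convert a measured seed weight to its equivalent at a standard
    moisture basis (13% here) by conserving dry matter."""
    dry_matter = wet_weight_g * (100.0 - measured_moisture_pct) / 100.0
    return dry_matter / ((100.0 - target_moisture_pct) / 100.0)

# Example: 500 g of seed measured at 16% moisture from a 2 m^2 sample
adjusted_g = adjust_to_standard_moisture(500.0, 16.0)
yield_g_per_m2 = adjusted_g / 2.0  # expressed per square meter
```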

2.3. Exploratory Data Analysis

2.3.1. Ground Truth Data Observations

An initial exploratory analysis was performed to guide the optimal use and interpretation of the available data. Boxplots were generated for each SCN sampling date to assess temporal variation in population counts across the growing season (Figure 4). SCN populations showed relatively low variability from June through August; however, counts during this period were heavily skewed towards zero and near-zero values. In particular, observations exceeding 500 eggs/100 cm3 of soil were far less frequent than those below this threshold. Population counts also fluctuated substantially between sampling dates at the same location. Several dates, including 9 May, 7 June, 29 August, and 25 September, recorded extreme values exceeding 3000 eggs/100 cm3 of soil.
To further investigate these temporal fluctuations, pairwise correlation coefficients were calculated between soybean yield and SCN population counts at each quadrat location (Figure 5). For each sampling date, total SCN counts were summed and compared with yield to quantify the strength of association. In addition, using ArcGIS Pro 3.2, SCN population distributions were mapped across the field for each sampling date to examine spatial patterns and potential population movement over time. An example from 5 July 2022, when SCN showed a modest positive correlation with yield (r = 0.21), is provided in Figure 6.

2.3.2. Spectral Data and Data Selection

SCN is a soil-dwelling parasite that can negatively impact plant health and growth. To evaluate its impact on soybeans, we compared multispectral signatures between SCN-infested and non-detected areas across the field. SCN counts from soil samples ranged from 0 to over 4000 eggs/100 cm3, with the majority of observations near zero (i.e., not detected). To balance the dataset and simplify subsequent modeling, SCN populations were grouped into two classes: non-detected (zero SCN count) and SCN-infested (SCN detected and quantified). To isolate the spectral signatures specific to soybean plants and distinguish them from non-plant (likely bare soil) pixels, a k-means clustering approach based on the Excess Green Index (ExG) was used (Figure 7 and Equation (1)). This clustering approach was used over a fixed spectral threshold because it provides a more robust separation of plant and non-plant pixels under variable field conditions. Following clustering, plant pixels were identified based on their high NIR reflectance, a well-established indicator of healthy vegetation [25].
ExG = 2G − R − B
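A minimal sketch of the ExG-based k-means segmentation could look like the following; the band-array keys and the use of the higher-ExG cluster as "plant" are illustrative assumptions (the study identifies the plant cluster by its high NIR reflectance).

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_plant_pixels(bands):
    """Separate plant from non-plant pixels with 2-cluster k-means on
    the Excess Green Index (ExG = 2G - R - B, Equation (1)).
    `bands` is a dict of 2-D reflectance arrays keyed 'B', 'G', 'R'.
    Returns a boolean mask that is True for plant pixels."""
    exg = 2.0 * bands['G'] - bands['R'] - bands['B']
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
        exg.reshape(-1, 1))
    # Take the cluster with the higher mean ExG as vegetation.
    flat = exg.reshape(-1)
    plant_cluster = int(flat[labels == 1].mean() > flat[labels == 0].mean())
    return (labels == plant_cluster).reshape(exg.shape)
```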
After separating plant and non-plant pixels, we compared the reflectance values of the five spectral bands between the non-detected and SCN-infested categories. As an initial step, t-tests were conducted to evaluate whether mean spectral reflectance differed significantly between the two groups. Although the tests indicated statistically significant differences across all bands, the extremely large sample size (over one million pixel values derived from 100 different 3 m × 3 m soybean field grid images) made even very small differences appear highly significant. To assess the practical significance of these differences, rather than relying solely on p-values, we calculated effect sizes using a Cohen’s d test, which quantifies the magnitude of the difference between the two categories relative to the pooled standard deviation (Equation (2)).
Cohen's d = (mean1 − mean2) / √[ ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) ]
In Equation (2), mean1 and mean2 refer to the means of the spectral values for the first and second categories, n1 and n2 refer to the sizes of these spectral value sets, and s1 and s2 represent their standard deviations. A Cohen’s d value greater than 0.2 indicates a practically significant difference between the categories. This analysis was performed for each band and capture date to determine whether spectral signatures could reliably detect the impact of SCN on plants.
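Equation (2) translates directly into code; the helper name below is illustrative.

```python
import numpy as np

def cohens_d(x1, x2):
    """Pooled-standard-deviation effect size (Equation (2)):
    d = (mean1 - mean2) / sqrt(((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2))."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2 = x1.size, x2.size
    s1, s2 = x1.std(ddof=1), x2.std(ddof=1)
    pooled_sd = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2)
                        / (n1 + n2 - 2))
    return (x1.mean() - x2.mean()) / pooled_sd
```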
Spectral data analysis was performed in Python (Version 3.11.11) using standard data science and machine-learning packages, such as NumPy, Pandas, scikit-learn, and StatsModels. These packages were also utilized to investigate spectral variations and perform statistical analysis.

2.3.3. Data Utilized for Model Building

As stated previously, SCN population counts were consolidated into two classes: non-detected and SCN-infested, which served as the target labels for model development. Preliminary DL models trained using multiple SCN population categories performed poorly, indicating that multispectral images alone lacked the sensitivity to detect fine-grained variation in SCN population. Hence, a simplified two-class framework was used.
Rather than using raw spectral bands, five vegetation indices were used to train the models, producing normalized values between −1 and 1 that reduced variability across timestamps and effectively reflected differences in land cover types (Table 1). Because SCN population counts exhibited substantial spatial and temporal fluctuations, initial model training focused on two timestamps that were strongly associated with known SCN-infested regions, namely 19 August and 2 September. Additional timestamps were then incrementally incorporated to evaluate how aggregating multispectral imagery across multiple dates influenced SCN detection performance. This stepwise approach enabled direct comparison of model accuracy using progressively aggregated temporal inputs (Table 2). Using this multi-timestamp approach can also reduce the impact of potential image artifacts, such as field textures, illumination, or segmentation patterns, on the model’s learning process, as images at different timestamps exhibit varying soybean canopy features due to differences in soybean growth stages.
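The exact five indices come from Table 1 (not reproduced here); as a hedged illustration, normalized-difference indices of the kind cited earlier (e.g., GNDVI and NDRE) are band ratios that are naturally bounded in [−1, 1] for non-negative reflectance:

```python
import numpy as np

def normalized_difference(a, b, eps=1e-9):
    """Generic normalized-difference index (a - b) / (a + b), bounded in
    [-1, 1] for non-negative reflectance; eps avoids division by zero."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return (a - b) / (a + b + eps)

# Illustrative indices (Table 1's exact list is not restated here;
# these are common choices from the SCN remote-sensing literature):
def ndvi(nir, red):       return normalized_difference(nir, red)
def gndvi(nir, green):    return normalized_difference(nir, green)
def ndre(nir, red_edge):  return normalized_difference(nir, red_edge)
```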
The sequence of timestamps was selected based on visual comparisons of the ground-truth SCN population patterns, insights from agronomic expertise, and historical observations of soybean yield reductions in SCN-affected areas. Dates showing the strongest correspondence with known infestation zones were prioritized for inclusion. This stepwise approach enabled evaluation of how model performance changed as temporal information increased, resulting in DL models trained with multispectral data from 2, 3, 4, 5, 6, and then 7 timestamps.
This iterative process was implemented for both deep-learning architectures: the Vision Transformer (ViT) and the Convolutional Neural Network (CNN) (hereafter, SCN–CNN), resulting in a total of 14 models. To ensure reliable training–testing splits and minimize sampling bias, 15 different random seeds were used to simulate diverse initialization and partitioning scenarios. The selected seed values are 4, 5, 12, 14, 15, 18, 29, 32, 36, 55, 70, 76, 82, 87, and 95. This strategy enhanced model robustness and generalizability by exposing each architecture to multiple training distributions. The datasets were divided into 80% training and 20% testing sets.
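A sketch of this seeding protocol, assuming scikit-learn's `train_test_split`; the added stratification is an assumption, since the text does not state how class balance was preserved across partitions.

```python
from sklearn.model_selection import train_test_split

# The 15 seed values stated in the text
SEEDS = [4, 5, 12, 14, 15, 18, 29, 32, 36, 55, 70, 76, 82, 87, 95]

def seeded_splits(X, y, test_size=0.2):
    """Yield one 80/20 train-test partition per random seed, exposing
    each architecture to 15 different training distributions."""
    for seed in SEEDS:
        yield seed, train_test_split(
            X, y, test_size=test_size, random_state=seed, stratify=y)
```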

2.3.4. Vision-Transformer (ViT) Architecture

A transformer-based model (ViT) was implemented to detect SCN-infested regions in the field using multispectral imagery (Figure 8) [26]. The model takes images as input with a 224 × 224 × 5 pixel size, where the dimensions represent the height, width, and number of spectral channels. The five channels correspond to the vegetation indices listed in Table 1. Initially, each image is divided into smaller, user-defined patches of size 16 × 16 × 5, generating 196 patches for a 224 × 224 × 5 image. These patches are flattened and projected into a one-dimensional vector using a linear projection layer, resulting in fixed-length embeddings. A special classification token (CLS) is added at the beginning of each image embedding to enable patch-level relationship aggregation during classification.
To retain the spatial context of the patches, positional embeddings are added to each of the 196 patch embeddings, allowing the model to learn positional relationships across the image. The sequence is then passed through a transformer encoder, which consists of three primary layers: a normalization layer, a multi-head attention layer, and a multi-layer perceptron (MLP). The self-attention mechanism assesses the significance of each embedded patch, allowing the model to learn long-range dependencies between patches; this is an advantage over CNNs, which primarily learn from local neighborhoods. The encoder output is fed into an MLP head comprising dense layers with sigmoid activation. To account for phenological differences across timestamps, a categorical variable representing the soybean growth stage is encoded through a dense layer (size = 8) and concatenated with the ViT MLP head. The soybean growth stage variable is an integer ranging from 0 to 15, with 0 corresponding to VE/VC and 15 to R8. Details of the growth stages and their corresponding integer labels are provided in Table S1. Two additional dense layers (128 and 64 units) follow this concatenation, culminating in a binary classification layer that predicts whether a pixel represents a non-detected or an SCN-infested area.
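The tokenization step described above (16 × 16 × 5 patches from a 224 × 224 × 5 image) can be sketched in NumPy; the learned linear projection to 64-D embeddings, the CLS token, and the positional embeddings are omitted from this sketch.

```python
import numpy as np

def extract_patches(image, patch=16):
    """Split an H x W x C image into non-overlapping patch x patch
    tiles and flatten each, as in the ViT tokenization step. For a
    224 x 224 x 5 input this yields 196 vectors of length 1280, which
    a learned linear projection would then map to 64-D embeddings."""
    h, w, c = image.shape
    n_h, n_w = h // patch, w // patch
    tiles = (image[:n_h * patch, :n_w * patch]
             .reshape(n_h, patch, n_w, patch, c)   # block rows/cols
             .transpose(0, 2, 1, 3, 4)             # group by patch
             .reshape(n_h * n_w, patch * patch * c))
    return tiles

tokens = extract_patches(np.zeros((224, 224, 5)))
```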

2.3.5. SCN–CNN Architecture

The performance of the ViT model was compared with that of a simpler CNN-based architecture designed for SCN detection (SCN–CNN) (Figure 9). Like ViT, the SCN–CNN architecture accepts input images of size 224 × 224 × 5. The network consists of three convolutional blocks, each comprising a convolutional layer, batch normalization, and max pooling. Convolutional layers extract low- to high-level image features (e.g., edges and textures) required for pattern recognition, while batch normalization stabilizes and speeds up training by normalizing activations. Max pooling decreases spatial dimensions by retaining only the most important features, reducing computational cost.
The three convolutional blocks are chosen because they effectively capture image features from low to high levels (by increasing the number of filters), help prevent overfitting, and are computationally efficient. Features extracted from the convolutional blocks are flattened and passed to dense layers, with a 50% dropout rate to prevent overfitting. Similar to ViT, the soybean growth stage is fed into the dense layer and concatenated with the previously mentioned dense layers of the SCN–CNN model. This concatenated layer is further connected to two additional dense layers (size 32 and 64). The final dense classification layer produces a probability for each of the two classes: non-detected and SCN-infested.
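A minimal Keras sketch of the SCN–CNN, assuming the layer sizes stated in the text and in Section 2.3.7; the padding mode, the ordering of the dense layers around the concatenation, and the growth-stage embedding size are assumptions where the text is silent.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_scn_cnn():
    """Sketch of the SCN-CNN: three conv blocks (32/64/128 filters,
    each conv + batch norm + 2x2 max pooling), flatten + 50% dropout,
    with the integer growth stage embedded through a small dense layer
    and concatenated before the final two-class classifier."""
    img = layers.Input(shape=(224, 224, 5))
    x = img
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, strides=1, activation='relu')(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D(2)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation='relu')(x)
    x = layers.Dropout(0.5)(x)

    stage = layers.Input(shape=(1,))            # growth stage, 0-15
    s = layers.Dense(8, activation='relu')(stage)

    h = layers.Concatenate()([x, s])
    h = layers.Dense(32, activation='relu')(h)
    h = layers.Dense(64, activation='relu')(h)
    out = layers.Dense(2, activation='softmax')(h)

    model = tf.keras.Model([img, stage], out)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```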

2.3.6. Model Performance Evaluation

The performance of the ViT and SCN–CNN models was evaluated using accuracy, precision, recall, and F1-score metrics (Figure 10). These metrics quantify how effectively each model identifies SCN-infested and non-detected areas relative to the ground truth (Equations (3)–(6)). Higher values of all four metrics indicate better model performance.
True Positive (TP) and True Negative (TN) represent correctly identified SCN-infested and non-detected regions, respectively. A False Positive (FP) occurs when the model incorrectly categorizes a region as SCN-infested when it is actually non-detected, while a False Negative (FN) indicates that the model failed to detect an actual SCN infestation. Accuracy reflects the overall correctness of the model, while precision and recall reflect the model’s ability to minimize false positives and to capture all positive instances, respectively. The F1-score provides a balanced harmonic mean of precision and recall.
Accuracy = (TP + TN) / (TP + FP + TN + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 × (Precision × Recall) / (Precision + Recall)
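Equations (3)–(6) translate directly into code from the confusion-matrix counts; the function name is illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 (Equations (3)-(6))
    computed from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {'accuracy': accuracy, 'precision': precision,
            'recall': recall, 'f1': f1}
```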

2.3.7. Model Configuration

The ViT and SCN–CNN models were tested across multiple configurations to identify optimal training settings. For ViT, the final architecture used a patch size of 16, a projection dimension of 64, four transformer heads, and eight transformer layers. The SCN–CNN model consisted of three convolutional layers with 32, 64, and 128 filters (ReLU activation), followed by dense layers with 64 and 32 units. The convolutional layers used kernel sizes of 3, a pool size of 2, and a stride of 1. Both models were trained for up to 50 epochs with early stopping, which halted training if validation loss did not improve for 10 consecutive epochs. The Adam optimizer was used with sparse categorical cross-entropy as the loss function for all models.
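The early-stopping rule described above corresponds to a standard Keras callback; monitoring `val_loss` and restoring the best weights are assumptions, since the text does not name the monitored quantity.

```python
import tensorflow as tf

# Halt training once the validation metric has not improved for
# 10 consecutive epochs, keeping the best checkpoint seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=10, restore_best_weights=True)

# model.fit(..., epochs=50, callbacks=[early_stop])  # up to 50 epochs
```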
Training time ranged from 10 to 40 min, depending on the number of input timestamps, with the ViT and SCN–CNN models showing comparable runtimes. Model performance during training and validation is illustrated in the Supplementary Materials (Figures S3 and S4). All models were implemented using the PyTorch (v2.7.1) and TensorFlow (v2.13) libraries in Python (v3.11.11) on a workstation with 64 GB of RAM and an Intel(R) Xeon(R) Silver 4114 CPU operating at 2.20 GHz.

3. Results

3.1. Spectral Analysis of Non-Detected and SCN-Infested Regions

Spectral value distributions for the blue, green, red, red-edge, and NIR bands were analyzed for each UAS timestamp. Reflectance values extracted from the pre-processed imagery were grouped into two classes (non-detected and SCN-infested) to assess whether SCN presence produced distinct spectral patterns. During the early growing season (June and July), the two classes were difficult to distinguish, indicating that early-season multispectral data contributed little to detecting SCN infestation (Figure 10). As the season progressed, however, class separation became more pronounced, likely due to the emergence of visible SCN-related stress and reduced plant vigor. Notably, among the five spectral bands, the NIR band demonstrated the strongest ability to differentiate non-detected and SCN-infested plants (Figure 11). Later-season imagery showed that non-detected pixels, i.e., healthy soybean plants, tended to exhibit lower red and blue reflectance and higher red-edge reflectance compared with SCN-infested regions. These temporal shifts highlight the increasing diagnostic value of multispectral imagery as soybean plants advance through their growth stages.
To further differentiate the two classes based on plant and non-plant pixels, the K-means clustering method (Section 2.3.2) was used (Figure 12). Early in the season, SCN-infested areas exhibited higher average reflectance values across all five spectral bands for both plant and non-plant pixels. However, this pattern shifted beginning with the 22 July imagery, which corresponds to the R3 soybean growth stage (Figures S1 and S2 and additional figures in Supplementary Materials). From this date onward, the NIR band consistently showed higher reflectance in non-detected areas for both plant and non-plant pixels. Using SCN infestation as a proxy for plant stress, this shift aligns with expected spectral behavior: healthy vegetation shows stronger NIR reflectance than stressed or unhealthy plants.
The practical differences between non-detected and SCN-infested spectral values were quantified using Cohen’s d (Figure 13). For early-season imagery (June to 15 July), non-plant pixels showed larger effect sizes than plant pixels, with d values generally between 0.2 and 0.5. In contrast, from 22 July through 2 September, plant pixels exhibited effect sizes comparable to, and at times higher than, those of non-plant pixels, with most values ranging from 0.1 to 0.3. Across all timestamps, the NIR band consistently showed the strongest practical significance in distinguishing the two classes. The blue, green, and red bands also performed well except on 17 June, when their effect sizes were notably smaller. The red-edge band showed moderate significance early in the season but declined substantially after mid-July, rendering it less useful for late-season discrimination.

3.2. ViT and SCN–CNN Model Performance for Classification of SCN-Infested Areas

The DL models were trained using multispectral imagery from different timestamp combinations to classify non-detected and SCN-infested areas. Across 15 random seed initializations for each timestamp combination, some models performed poorly, often predicting nearly all samples as a single class due to imbalanced or unrepresentative training splits. These low-performing models were excluded, and only the top five models (across five random seeds) for each timestamp configuration were used for analysis in ViT and SCN–CNN model evaluations.

3.2.1. ViT Model Results

Among the top-performing ViT models, maximum accuracies of 77.5%, 70%, 75%, 69%, 69%, and 66% were achieved for models trained on 2, 3, 4, 5, 6, and 7 timestamps, respectively. The mean accuracy and standard deviation of the top models were as follows: 71% (±4) for 2 timestamps, 66% (±3) for 3 timestamps, 68% (±4) for 4 timestamps, 66% (±2) for 5 timestamps, 65% (±2) for 6 timestamps, and 64% (±1) for 7 timestamps (see Supplementary Figure S5 for more information). The two-timestamp model (seed value = 55) achieved the highest precision (0.86), meaning that 86% of the regions it flagged as infested were truly SCN-infested. However, its recall was relatively low (0.59), meaning many SCN-infested areas went undetected, which could delay or misdirect SCN management efforts.
Conversely, the six-timestamp model (seed = 4) achieved a high recall of 0.79 but at a lower precision of 0.57, reflecting more accurate detection of true infestations but a higher rate of false positives. Given this trade-off, we assessed the F1-score, which provides a balanced evaluation. The four-timestamp model produced the best overall F1-score (0.74), with precision and recall values of 0.73 and 0.76, respectively (Figure 14). This configuration offered the most reliable balance between correctly identifying SCN-infested areas and minimizing false alarms. The complete ViT model distributions across all 15 random seeds are also shown in Figure S6 of the Supplementary Materials.
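The trade-off discussed above follows directly from the harmonic-mean definition of the F1-score; as a quick check, the reported four-timestamp precision and recall reproduce the 0.74 figure:

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall,
    penalizing imbalance between the two."""
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(0.73, 0.76), 2))  # four-timestamp ViT -> 0.74
print(round(f1_score(0.86, 0.59), 2))  # two-timestamp ViT (high precision, low recall) -> 0.7
```

Because the harmonic mean is dominated by the smaller of the two values, the high-precision/low-recall two-timestamp model scores below the more balanced four-timestamp configuration.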
To further evaluate model behavior across different infestation intensities, predictions from the best-performing models were grouped into four SCN population categories based on ground truth data: No SCN (not detected, zero eggs/100 cm3), Low SCN (0–100 eggs/100 cm3 of soil), Medium SCN (101–500 eggs/100 cm3 of soil), and High SCN (>500 eggs/100 cm3 of soil). Importantly, these categories were used only for post hoc evaluation (not for model training) to simplify interpretation and identify which population ranges were most frequently misclassified (Figure 15).
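The post hoc grouping can be expressed as a simple threshold map over the ground-truth egg counts (interpreting the Low class as counts in (0, 100], since zero counts belong to No SCN):

```python
def scn_category(eggs_per_100cm3: float) -> str:
    """Map a ground-truth SCN egg count (eggs/100 cm3 of soil)
    to its post hoc evaluation category."""
    if eggs_per_100cm3 == 0:
        return "No SCN"       # not detected
    if eggs_per_100cm3 <= 100:
        return "Low SCN"      # (0, 100]
    if eggs_per_100cm3 <= 500:
        return "Medium SCN"   # (100, 500]
    return "High SCN"         # > 500
```

These labels are applied only when scoring predictions of the binary models, never during training, which keeps the evaluation consistent with the two-class problem the networks actually solved.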
Results showed that the High SCN category, although poorly classified by two- and three-timestamp models, exhibited steadily improving accuracy beginning with the four-timestamp model. In contrast, the accuracy of the No SCN category declined as more timestamps were included, suggesting that models with many timestamps tend to overpredict SCN presence and generate more false positives. Thus, while incorporating temporal information is valuable, too many timestamps can lead to unnecessary false alarms.
Overall, models trained with more than three, but fewer than six, timestamps provided the best balance between detecting true infestations and minimizing false positives. Notably, the four-timestamp model with an F1-score of 0.74 showed accuracies of 74%, 67%, 75%, and 100% for the No SCN, Low SCN, Medium SCN, and High SCN categories, respectively. These results highlight the model’s strong capability to identify moderate to severe SCN infestations in the field.

3.2.2. SCN–CNN Model Results

Among the top-performing SCN–CNN models, the highest accuracies for the 2-, 3-, 4-, 5-, 6-, and 7-timestamp configurations were 77.5%, 67%, 67.5%, 73%, 67%, and 64%, respectively. The mean accuracy and standard deviation of the top models were as follows: 72% (±4) for 2 timestamps, 59% (±7) for 3 timestamps, 63% (±3) for 4 timestamps, 67% (±5) for 5 timestamps, 64% (±3) for 6 timestamps, and 61% (±2) for 7 timestamps (see Supplementary Figure S7). Although the two-timestamp model achieved the highest overall accuracy (77.5%), its precision (0.62) and recall (0.67) indicated limited reliability.
In contrast, the five-timestamp model (seed = 14) achieved both high precision (0.75) and high recall (0.76), demonstrating strong capability in correctly classifying both SCN-infested and non-detected areas. Its F1-score (0.75) was the highest among all SCN–CNN models (Figure 16) and closely matched the performance of the best four-timestamp ViT model (precision = 0.73; recall = 0.76). Most other SCN–CNN models achieved accuracies between 0.6 and 0.7 but consistently showed precision and recall below 0.67, underscoring their limited discriminative performance. The complete model performance distributions across 15 random seeds are shown in Figure S8 in the Supplementary Materials.
When model predictions were reclassified into four SCN population categories, the SCN–CNN showed strong performance in identifying high-infestation areas, with accuracies generally above 70% across timestamp configurations (Figure 17). Accuracy for the low and medium SCN categories was lower but improved slightly with the addition of a fourth timestamp. Except for the seven-timestamp model, classification accuracy for the non-detected SCN category typically ranged from 60% to 80%, indicating that SCN–CNN maintained relatively low false-alarm rates.

3.2.3. Interplay of Architecture, Temporal Data, and Partitioning

Overall, the ViT and SCN–CNN models exhibited comparable performance, though ViT consistently achieved slightly higher accuracy across most timestamp configurations (Figure 18). The SCN–CNN model outperformed ViT only in the five-timestamp scenario; in all other cases, it performed similarly to or less effectively than ViT. Additionally, ViT showed less variability across random seeds, indicating more stable performance under different training–testing splits compared with the SCN–CNN model.
Interestingly, the random seeds associated with the highest accuracy differed between the two model architectures, indicating that the same data can favor one model and disadvantage the other, reflecting inherent differences in how each architecture learns from the input data. For example, the two-timestamp ViT model achieved its highest accuracy (77.5%) with random seed 36, whereas the SCN–CNN model reached the same accuracy with random seed 76. Under seed 36, SCN–CNN achieved only 67.5% accuracy, and under seed 76, ViT achieved 61.3%. In contrast, both models performed well with random seed 14 in the five-timestamp configurations: SCN–CNN reached 73% accuracy while ViT achieved 69%. These results highlight that model performance is influenced not only by architecture and timestamp selection but also by the underlying data partitioning.

3.2.4. Model Performance Visualization over Each Timestamp of a Model

Since the models were trained using multiple timestamps, their performance was also evaluated separately for each date. Model predictions were overlaid on orthomosaics to visualize spatial agreement and identify areas where classification was most accurate. For the top-performing two-timestamp ViT model (overall accuracy: 77.5%), timestamp-specific accuracies were 77.3% on 19 August and 77.8% on 2 September (Figure 19). Although overall accuracies were similar, precision and recall varied considerably. On 19 August, precision was 0.5 and recall was 0.8, whereas on 2 September, precision increased to 0.83 and recall decreased to 0.63. The lower precision on 19 August resulted from misclassifying many low SCN population regions, even though the model accurately captured medium SCN infestations, achieving an F1-score of 1 for that category.
A comparable analysis was performed for the five-timestamp SCN–CNN model (overall accuracy of 73%) (Figure 20). For some timestamps, such as 3 June and 29 July, the model classified all samples into a single category (either SCN-infested or non-detected). This occurred because ground-truth observations for those dates were highly imbalanced, with more than 70% of the samples belonging to a single class. As a result, the model achieved deceptively high accuracy despite failing to differentiate between classes. Attempts to retrain the model using class-balanced data did not resolve this issue, underscoring the importance of evaluating accuracy at each timestamp rather than relying solely on overall performance metrics.
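The per-timestamp check described above amounts to stratifying the pooled test predictions by acquisition date before scoring, so a class-imbalanced date cannot hide behind the overall metric; a minimal sketch (array names are illustrative):

```python
import numpy as np

def per_date_accuracy(dates, y_true, y_pred):
    """Accuracy computed separately for each acquisition date,
    rather than pooled over all timestamps."""
    dates, y_true, y_pred = map(np.asarray, (dates, y_true, y_pred))
    return {str(d): float((y_pred[dates == d] == y_true[dates == d]).mean())
            for d in np.unique(dates)}

# Illustrative: a date where the model predicts a single class still
# receives its own score instead of being averaged away.
print(per_date_accuracy(["jun03", "jun03", "sep02", "sep02"],
                        [0, 1, 0, 1],
                        [1, 1, 0, 1]))
```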

3.2.5. Model Performance on Unseen Timestamp Data

The robustness of the model was evaluated by generating inferences using the top-performing two-timestamp SCN–CNN model on data from 29 July 2022. Although the model was trained on 19 August and 2 September, it still performed well on the unseen data, reaching an accuracy of 68%. The precision and recall scores were approximately 0.60, reflecting satisfactory performance.

4. Discussion

Using DL with UAS-based multispectral imagery and soil-sampled SCN population counts from the 2022 soybean growing season, we developed models capable of mapping likely SCN-infested areas within a field. Both the ViT and SCN–CNN architectures performed similarly, achieving high F1-scores of 0.74 (four-timestamp ViT) and 0.75 (five-timestamp SCN–CNN) for binary detection of ‘not detected’ versus ‘SCN-infested’ patches.

4.1. Contribution of Multispectral Imagery to SCN Detection

Despite the spatial and temporal variability in SCN population, multispectral data revealed trends consistent with plant stress physiology. Healthy vegetation typically exhibits strong NIR reflectance due to the leaf’s internal structure [27,28,29]. Starting around the R3 stage [30], when soybeans shift from vegetative to reproductive development [31], SCN stress likely disrupts water and nutrient transport, reducing plant vigor and altering spectral reflectance [32].
Our results showed that NIR reflectance consistently distinguished SCN-infested from non-detected plants later in the season. Subtle temporal shifts were also observed in the green, red, and blue bands, likely driven by changing chlorophyll content in stressed plants. These patterns align partly with past studies. For instance, Santos et al. (2022) [16] emphasized that green and NIR bands are key predictors when using machine-learning algorithms. Arantes et al. (2023) [15] found that red-edge and NIR bands best explained root lesion nematode counts in soybean, which complements our findings for SCN (no root lesion nematodes were present in our study). However, they found that the red spectral band explained the most variability in SCN counts, whereas this band was less important in our findings. Likewise, Jjagwe et al. (2024) [17] identified the blue band as one of the key indicators of SCN populations using UAS and aerial multispectral imagery; in our study, the blue band contributed, but not nearly as much as NIR. While band-level importance varied slightly, the consistent utility of NIR across studies—including ours—underscores its relevance for SCN detection. Because SCN is soil-dwelling, future research could also explore soil reflectance characteristics using hyperspectral sensors to enhance sub-canopy stress detection.

4.2. Deep-Learning Model Comparisons and Limitations

In this study, ViT and SCN–CNN models were trained using five vegetation indices (NDVI, ExG, NDRE, EVI, and GNDVI) rather than raw spectral bands to normalize variation across acquisition dates and growth stages. Although early-season imagery alone yielded limited performance in SCN detection, the models captured subtle variations in vegetation indices indicative of early plant stress. Healthy plants typically exhibit strong absorption in the blue and red regions due to high chlorophyll content, along with high NIR reflectance driven by leaf structure and water content [27]. In contrast, early SCN-infested plants may show reduced chlorophyll content, compromised leaf internal structure, and altered water status, resulting in slightly higher reflectance in visible bands (such as green and red) and lower NIR reflectance as stress progresses [13,17]. Vegetation indices leverage these spectral regions to capture early stress signals. For example, NDVI reflects overall canopy vigor, NDRE is sensitive to pigment changes, and GNDVI may highlight chlorophyll and water-related stress [27]. By learning these latent spectral patterns, the DL models can identify early stress signatures that may not be visually apparent, while reducing the need for manual feature engineering. Similar to our study, Jjagwe et al. (2024) [17] explored the Green Chlorophyll Index (GCI), NDRE, and GNDVI to identify the best index for differentiating plant vigor and found a strong correlation between SCN impact and NDRE (r = 0.75), highlighting the relevance of index-based approaches for capturing nematode-induced stress.
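The five indices can be derived per-pixel from the calibrated band reflectances; the sketch below uses the standard published formulations (the paper's exact coefficients, e.g., for EVI, are assumptions here):

```python
import numpy as np

def vegetation_indices(blue, green, red, red_edge, nir):
    """Compute NDVI, ExG, NDRE, EVI, and GNDVI from reflectance arrays.
    Standard formulations assumed; eps guards against zero denominators."""
    eps = 1e-9
    ndvi  = (nir - red) / (nir + red + eps)
    exg   = 2.0 * green - red - blue
    ndre  = (nir - red_edge) / (nir + red_edge + eps)
    evi   = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0 + eps)
    gndvi = (nir - green) / (nir + green + eps)
    return {"NDVI": ndvi, "ExG": exg, "NDRE": ndre, "EVI": evi, "GNDVI": gndvi}
```

Stacking these per-pixel maps for each acquisition date yields the multi-channel, multi-timestamp tensors that the ViT and SCN–CNN models consume.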
Methodologically, our study advances SCN detection beyond traditional statistical and ML approaches by applying DL methods to multispectral imagery. Early works primarily relied on linear regression and correlation analyses using broad spectral bands or simple indices (e.g., GNDVI), but detection was challenging at low infestation levels and generally most effective later in the season [10,11]. Recent comparable UAS studies using ML approaches (e.g., stepwise regression or tree-based algorithms) on multispectral imagery reported accuracies exceeding 70% [16], but were largely restricted to single growth-stage images and to specific bands derived through manual feature engineering [14,15,17]. As in our results, performance improved at mid- to late-season growth stages in these studies as well [14,17]. Our findings follow a similar pattern: both ViT and SCN–CNN models showed limited success in early-season detection, but performance improved substantially as the season progressed.
Related work using high-resolution spectroradiometer data has demonstrated the potential of specific wavebands for disease discrimination, while also revealing significant constraints [13,33]. For example, one study [13] reported difficulty distinguishing SCN from other diseases, such as Sudden Death Syndrome (SDS), due to spectral overlap at the leaf level. Another study [33], conducted under controlled greenhouse conditions using hyperspectral spectroradiometer data and XGBoost classification, achieved binary classification accuracies of 65–70% and three-class accuracies of 45–55% for the SCN population. That study found significantly higher ultraviolet (UV) and visible reflectance in severely stressed plants, while healthy plants exhibited lower red and green reflectance and no consistent NIR differences. It also underscored that, although hyperspectral data offer rich spectral detail, they face challenges in identifying biotic stress and lack the spatial context and scalability provided by canopy-level imaging. In contrast, our DL models demonstrate slightly higher performance in binary classification using multispectral imaging data in an open field environment (>70%), by evaluating underlying causes of detection variability, such as phenological progression and temporal context.
Our results indicate that while DL offers superior feature extraction capabilities, SCN detection remains constrained by the biological latency of symptom expression. Similar to earlier findings [11,13], our models encountered ‘noise’ and reduced accuracy when utilizing early-season imagery alone. However, by leveraging multi-timestamp fusion, we partially overcame the limitations of single-date analysis commonly used in previous studies [14,15]. Combining two timestamps from mid- and late-season growth stages generally yielded the highest classification performance, perhaps enabling the models to capture the cumulative physiological impact of SCN infection over time. Notably, the SCN–CNN architecture occasionally outperformed the ViT model, suggesting that the CNN’s inductive bias towards locality (nearby pixels are related) and translation invariance may be useful for agricultural datasets with limited sample sizes and high temporal variability. While ViT was more stable overall, both architectures performed best when multi-timestamp (multi-growth-stage) data were combined, highlighting the benefit of temporal fusion for detecting SCN presence.
Finally, due to the moderate spatial resolution of the multispectral images, it was not possible to derive clear, human-interpretable visualizations of early SCN infection features across individual spectral bands from the model’s learned representations. Spectral or spatial patterns associated with early SCN stress were not evident through direct visual inspection of individual bands or pixels. In addition to ViT and customized CNN, we evaluated the effectiveness of residual networks for SCN detection under identical experimental conditions. Using a dataset split with a random seed of 36, the ViT model achieved an accuracy of 77.5%, whereas the ResNet model achieved 57%, with additional random seeds yielding similar or lower performance for the ResNet model. These results suggest that, within the scope of this study, ViT-based architectures are better suited for SCN detection than conventional CNN-based residual networks. Future work should explore alternative DL architectures and incorporate higher-resolution hyperspectral or optical imagery, alongside explainable AI techniques (e.g., saliency maps or feature attribution methods), to improve both detection performance and the interpretability of the spectral characteristics associated with early SCN infection.

4.2.1. Model Learning Across SCN Population Ranges

Post hoc evaluation using four SCN population classes showed that adding timestamps beyond four generally reduced the accuracy of the lower SCN population categories (≤100 eggs/100 cm3 of soil), which made up the bulk of the data. This decline in performance likely stems from several interacting factors: (i) Early-season imagery—these images introduced confounding spectral noise or variability unrelated to SCN, such as soil background effects and canopy development differences. (ii) Later-season imagery—natural yellowing and canopy thinning at later soybean growth stages increased false positives for the “No SCN” class, as natural senescence progressions can mimic SCN-induced stress signatures. (iii) Amplified noise—including more timestamps increased overall feature dimensionality and model complexity, which created noise rather than providing strong discriminatory learning patterns.
Conversely, accuracy for the high SCN category (exceeding 500 eggs/100 cm3 of soil) continuously improved with additional timestamps, suggesting that severe infestations produce distinct and persistent spectral signatures across growth stages. Although models were trained only on binary labels, these four-class evaluations revealed meaningful patterns in how models learn SCN severity and where misclassifications tend to occur. Overall, these findings highlight the importance of balancing temporal coverage with spectral clarity to avoid noise while maintaining robust detection of true infestations.

4.2.2. Optimal Time Period and Growth Stage for SCN Detection

The timestamps associated with the highest ViT and SCN–CNN performance—19 August and 2 September—correspond to approximately 118 and 132 days after planting, within the R6 growth stage. This aligns fairly well with Bajwa et al. (2017) [13], who reported an optimal detection window of 105–120 days after planting, when SCN-related reductions in chlorophyll become spectrally detectable. That study used spectroradiometer data alongside discrimination models to identify the vegetation indices and spectral bands most responsive to SCN in soybeans, and reported that SCN-infested plants showed the highest reflectance in the red bands, indicating decreased chlorophyll content. Although our findings differed slightly, the suggested optimal timeframe for SCN detection is consistent with ours.
During the R3–R6 stages, soybeans are more susceptible to stress (moisture, light, temperature) and pest-induced yield losses than at any other developmental period [31]. The rapid nutrient and dry matter accumulation from the R2 stage results in a peak pod weight by the R6 stage. This process depends heavily on healthy roots, as they provide the necessary water and nutrients. After R6, root development slows, and leaf yellowing begins, progressing to browning and senescence, culminating in full maturity by R8. Given these physiological developments, optimal spectral recognition of SCN infestation at the R6 stage is logical, as SCN-infested plants exhibit lower leaf and pod development than healthier plants.

4.3. Data Limitations

Several data-related challenges also influenced model performance. First, a 3–4-day temporal mismatch between some UAS flights and SCN soil sampling dates may have introduced noise; narrowing this gap could improve model learning. Unfortunately, we were limited to the available datasets and therefore could not verify or quantify the impact of this time lag on detection results. Second, SCN population counts fluctuated substantially at the same sampling locations over time, with no linear progression across the season, indicating high spatial and temporal heterogeneity even within individual soil sample locations. SCN sampling uses soil cores that penetrate only a very narrow section of the soil profile; thus, a sampling location may be marked as “not detected” for a particular date while SCN is still present, perhaps at low densities. Despite the very thorough SCN sampling protocol, potential noise in the data due to sampling error should be considered.
Compounding this, the overall population distribution was heavily right-skewed, with most observations near zero. The scarcity of consistently SCN-free areas throughout the season limited the availability of clean negative samples. Ideally, larger fields with areas of persistently zero SCN populations across all sampling dates would provide a clearer contrast for model training, but identifying such fields requires intensive soil sampling and is logistically challenging. The study was conducted during a single growing season in a single experimental field (3.5 acres), limiting the generalizability of the findings. Multi-year validation with data from multiple regions would be necessary to confirm the robustness of the models under diverse environmental and management conditions. Furthermore, the current 8:2 random split strategy introduces potential data leakage because repeated observations from the same sampling sites across dates may share spatial and textural patterns unrelated to SCN stress. This could inflate performance metrics by allowing the model to learn field-specific characteristics rather than true stress signals. Future work should implement grouped or spatially independent validation strategies to mitigate this risk and provide a more rigorous assessment of model performance.
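The grouped validation suggested above keeps every repeated observation of a sampling site on one side of the split, so the model cannot exploit site-specific patterns shared across dates; a minimal sketch (site identifiers are illustrative, and scikit-learn's GroupShuffleSplit would serve the same purpose):

```python
import numpy as np

def grouped_train_test_split(site_ids, test_frac=0.2, seed=0):
    """Hold out whole sampling sites, so repeated observations of a site
    never appear in both train and test (limits spatial leakage)."""
    site_ids = np.asarray(site_ids)
    rng = np.random.default_rng(seed)
    sites = np.unique(site_ids)
    rng.shuffle(sites)
    n_test = max(1, int(round(test_frac * len(sites))))
    test_sites = set(sites[:n_test].tolist())
    test_mask = np.array([s in test_sites for s in site_ids])
    return ~test_mask, test_mask
```

Unlike the 8:2 random split used here, this partitioning scores the model on sites it has never seen at any date, giving a stricter estimate of generalization.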

4.4. Employing UAS for Future SCN Management

Current SCN management relies on (i) active soil sampling to determine the presence of SCN and monitor population levels, (ii) rotating to non-host crops to keep numbers low, and (iii) using and rotating SCN-resistant soybean cultivars [34]. Given that there are currently no in-season treatments (such as pesticides) for SCN, rapid identification of likely SCN-infested regions in the field is the best place to start for a farmer, who can then direct soil sampling collection at these regions and gain awareness of infestation levels. One or two UAS multispectral flights during the R3–R6 growth stage could help farmers pinpoint areas for prioritizing soil sampling for SCN evaluation. Even with ~75% mapping accuracy, this approach could substantially reduce the labor and cost required for broad-scale soil testing. After confirming SCN presence, the farmer can use a combination of mapped SCN populations and yield-monitoring data to guide decisions on cultivar selection, rotation planning, and long-term field management. If SCN levels are low or not yet yield-limiting, farmers can plan monitoring strategies; if high, more aggressive rotational or varietal management may be warranted.

5. Conclusions

This study demonstrated the potential of multispectral UAS imagery combined with DL (ViT or CNN) models to detect SCN-affected regions within soybean fields. Spectral analysis showed that the NIR band was consistently the most informative for distinguishing non-detected from SCN-infested areas. Early in the season, SCN-infested regions exhibited slightly higher median NIR reflectance, but once the crop entered reproductive stages, the pattern reversed—with healthy (non-detected) areas showing higher NIR values. This temporal shift underscores the importance of considering soybean growth stage when interpreting SCN-related spectral responses. DL models trained across multiple timestamps further highlighted the value of incorporating phenological variation. The four-timestamp ViT model (3 June, 29 July, 19 August, 2 September) achieved an F1-score of 0.74, while the five-timestamp SCN–CNN model (3 June, 22 July, 29 July, 19 August, 2 September) reached 0.75. Although performance was comparable, the ViT exhibited greater stability across different training–testing splits, suggesting stronger generalization. Overall, these findings demonstrate that DL models can automatically extract meaningful plant features from complex multispectral data at different growth stages and detect SCN presence with field-scale precision. Compared to traditional scouting, this developed framework enhances spatial and temporal monitoring of SCN infestations and provides a foundation for more efficient, targeted soil sampling and SCN management strategies.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/rs18050757/s1. Table S1: Description and categorical label defined for each soybean growth stage. Figure S1: Variation in spectral value (containing both soybean plant and non-plant) distribution between non-infested and SCN-infested classes for the UAS images collected on (a) 3 June 2022, (b) 1 July 2022, (c) 22 July 2022, (d) 29 July 2022, and (e) 2 September 2022. Figure S2: Spectral value ranges of soybean plant and non-plant pixels between non-infested and SCN-infested classes for the UAS images collected on (a) 3 June 2022, (b) 1 July 2022, (c) 22 July 2022, (d) 29 July 2022, and (e) 2 September 2022. Figure S3: Training and validation performance of the top five ViT models at different timestamps of input. The model names are represented as RS-X, where X denotes the random seed used for the training and test data splits. Figure S4: Training and validation performance of the top five SCN–CNN models at different timestamps of input. The model names are represented as RS-X, where X denotes the random seed used for the training and test data splits. Figure S5: Performance distribution of the top five random seed ViT models across accuracy, precision, recall, and F1 score, including mean and standard deviation bars. Each point across the X-axis represents the performance metrics for a specific timestamp used in the ViT model. # indicates the number of timestamp images. Figure S6: ViT model performance distribution across test datasets observed at all 15 random seeds. Figure S7: Performance distribution of the top five random seed SCN–CNN models across accuracy, precision, recall, and F1 score, including mean and standard deviation bars. Each point across the X-axis represents the performance metrics for a specific timestamp used in the SCN–CNN model. Figure S8: SCN–CNN model performance distribution across test datasets observed at all 15 random seeds.

Author Contributions

Conceptualization, S.K. (Sushma Katari), N.B. and S.K. (Sami Khanal); methodology, S.K. (Sushma Katari) and N.B.; software, S.K. (Sushma Katari) and N.B.; validation, S.K. (Sushma Katari); formal analysis, S.K. (Sushma Katari) and N.B.; investigation, S.K. (Sushma Katari) and N.B.; resources, S.K. (Sami Khanal) and H.D.L.-N.; data curation, K.K. and A.P.; writing—original draft preparation, S.K. (Sushma Katari) and N.B.; writing—review and editing, S.K. (Sushma Katari), N.B., S.K. (Sami Khanal) and H.D.L.-N.; visualization, S.K. (Sushma Katari); supervision, S.K. (Sami Khanal); project administration, S.K. (Sami Khanal); funding acquisition, S.K. (Sami Khanal). All authors have read and agreed to the published version of the manuscript.

Funding

This project is funded by the OSU Graduate School Fellowship programs, including Diversity University, ENGIE-AXIUM, and EmPOWERment Fellowships, Ohio Soybean Council (OSC 26-R-06, OSC 24-R-36), and USDA AFRI GRANT (13713059).

Data Availability Statement

Data available upon request.

Acknowledgments

The authors thank Zak Ralston and the Soybean Pathology and Nematology team at The Ohio State University for their valuable technical support.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kalaitzandonakes, N.; Kaufman, J.; Zahringer, K. The Economics of Soybean Disease Control; CAB International: Boston, MA, USA, 2019. [Google Scholar]
  2. Hartman, G.L.; West, E.D.; Herman, T.K. Crops That Feed the World 2. Soybean—Worldwide Production, Use, and Constraints Caused by Pathogens and Pests. Food Sec. 2011, 3, 5–17. [Google Scholar] [CrossRef]
  3. Arjoune, Y.; Sugunaraj, N.; Peri, S.; Nair, S.V.; Skurdal, A.; Ranganathan, P.; Johnson, B. Soybean Cyst Nematode Detection and Management: A Review. Plant Methods 2022, 18, 110. [Google Scholar] [CrossRef]
  4. Bradley, C.A.; Allen, T.W.; Sisson, A.J.; Bergstrom, G.C.; Bissonnette, K.M.; Bond, J.; Byamukama, E.; Chilvers, M.I.; Collins, A.A.; Damicone, J.P.; et al. Soybean Yield Loss Estimates Due to Diseases in the United States and Ontario, Canada, from 2015 to 2019. Plant Health Prog. 2021, 22, 483–495. [Google Scholar] [CrossRef]
  5. Wang, J.; Niblack, T.L.; Tremain, J.A.; Wiebold, W.J.; Tylka, G.L.; Marett, C.C.; Noel, G.R.; Myers, O.; Schmidt, M.E. Soybean Cyst Nematode Reduces Soybean Yield Without Causing Obvious Aboveground Symptoms. Plant Dis. 2003, 87, 623–628. [Google Scholar] [CrossRef]
  6. Peng, D.; Jiang, R.; Peng, H.; Liu, S. Soybean Cyst Nematodes: A Destructive Threat to Soybean Production in China. Phytopathol. Res. 2021, 3, 19. [Google Scholar] [CrossRef]
  7. Chen, S.; Kurle, J.; Malvick, D.; Potter, B.; Orf, J. Soybean Cyst Nematode Management Guide. Available online: https://extension.umn.edu/soybean-pest-management/soybean-cyst-nematode-management-guide (accessed on 2 December 2025).
  8. Castelão Tetila, E.; Brandoli Machado, B.; Belete, N.A.; Guimarães, D.A.; Pistori, H. Identification of Soybean Foliar Diseases Using Unmanned Aerial Vehicle Images. IEEE Geosci. Remote Sens. Lett. 2017, 14, 2190–2194. [Google Scholar] [CrossRef]
  9. Prince Czarnecki, J.M.; Samiappan, S.; Bheemanahalli, R.; Huang, Y.; Shammi, S.A. A Brief History of Remote Sensing of Soybean. Agron. J. 2025, 117, e70004. [Google Scholar] [CrossRef]
Figure 1. Average daily precipitation (cm), average daily air temperature (°C), average daily relative humidity (%), and average daily soil temperature (°C) at the selected soybean field during the 2022 experimental period. Precipitation values correspond to the left y-axis; temperature and humidity values correspond to the right y-axis. These data were collected from the Ohio State University CFAES Weather System at Columbus Station.
Figure 2. (a) A regional map showing the broader location of the subject field in Columbus, Ohio, in the Midwest region of the United States. (b) A multispectral orthomosaic image of the SCN-infested soybean field at Waterman Agricultural and Natural Resources Laboratory, displayed using RGB bands. The red squares mark the respective 3-m-by-3-m quadrat location for each of the 100 soil sample locations tested for SCN populations throughout the 2022 field season. The SCN population was measured at each quadrat approximately every two weeks.
Figure 3. The soil sampling dates are aligned with the nearest UAS flight images. These matches are indicated together in red boxes, with blue representing the soil sampling dates and green representing the UAS flight dates. In other words, each date for soil sample collection related to the SCN population of the 100 georeferenced soil sample locations corresponds to a multispectral orthomosaic image of the entire field.
Figure 4. The variation in SCN population (# refers to the number of eggs/100 cm3 of soil) throughout the entire field for each sampling date. Each data point corresponds to the population measurement collected from one of the 100 quadrats across the field (i.e., the distribution for each sampling date includes 100 individual population counts). Data were gathered at multiple soil sampling dates, each corresponding to a specific soybean growth stage from Vegetative (VE) through Reproductive (R8), as noted in parentheses.
Figure 5. Variation in SCN population counts and their correlation with yield observed at each soil sampling date. The left blue Y-axis represents the periodic changes in SCN counts. The right red Y-axis represents the variation in pairwise correlation values.
Figure 6. An orthomosaic image illustrating the soil sampling locations and their SCN counts (# refers to the number of eggs/100 cm3 of soil) observed on 5 July 2022, overlaid on the 1 July 2022 UAS image. This is an example in which SCN population variation showed a relatively high correlation with soybean yield (0.21).
Figure 7. Sample results of k-means clustering showing the classification of soybean plant and non-plant pixels from quadrats within the multispectral UAS images. The four dated graphics demonstrate the k-means clustering approach, which was applied to the Excess Green Index (ExG). After clustering, plant pixels were identified based on high NIR-band reflectance, an established indicator of healthy vegetation.
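The plant/non-plant separation step described in Figure 7 can be sketched as a two-cluster k-means on per-pixel ExG values. The sketch below is illustrative only: the helper names, the tiny hand-made pixel values, and the 1-D k-means implementation are assumptions, not the study's actual pipeline (which would also apply the NIR-reflectance check after clustering).

```python
import numpy as np

def exg(green, red, blue):
    # Excess Green Index used as the clustering feature
    return 2.0 * green - red - blue

def kmeans_1d_two_clusters(values, n_iter=20):
    """Minimal 1-D k-means (k = 2); returns a boolean mask of the
    higher-center cluster (candidate plant pixels)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()  # initialize centers at the extremes
    for _ in range(n_iter):
        labels = np.abs(values - hi) < np.abs(values - lo)  # True -> near hi
        new_lo = values[~labels].mean() if (~labels).any() else lo
        new_hi = values[labels].mean() if labels.any() else hi
        if np.isclose(new_lo, lo) and np.isclose(new_hi, hi):
            break  # centers converged
        lo, hi = new_lo, new_hi
    return labels

# Hypothetical quadrat: three soil-like pixels (low ExG), three canopy-like pixels (high ExG)
g = np.array([0.10, 0.12, 0.11, 0.40, 0.42, 0.45])
r = np.array([0.09, 0.10, 0.10, 0.08, 0.07, 0.08])
b = np.array([0.08, 0.09, 0.09, 0.05, 0.05, 0.06])
mask = kmeans_1d_two_clusters(exg(g, r, b))
print(mask)  # the three canopy-like pixels are flagged as plant
```

In practice a library implementation (e.g., scikit-learn's KMeans) would be used on full orthomosaic quadrats; this 1-D version only shows the idea of separating the ExG distribution into two clusters.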
Figure 8. A schematic of the ViT architecture utilized for detecting non-detected and SCN-infested regions using multispectral imagery. * represents CLS (Classification) token position, which is a special token used at the beginning of the input sequence to capture the global representations.
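The tokenization stage that Figure 8 depicts, including the prepended CLS token, can be illustrated with a short NumPy sketch. All dimensions here (32 × 32 chips, 5 bands, 8 × 8 patches, 64-dim embeddings) and the random weights are hypothetical placeholders, not the study's ViT configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_patches(img, p):
    """Split an (H, W, C) image into flattened non-overlapping p x p patches."""
    h, w, c = img.shape
    blocks = img.reshape(h // p, p, w // p, p, c).swapaxes(1, 2)
    return blocks.reshape(-1, p * p * c)  # (num_patches, p*p*C)

img = rng.normal(size=(32, 32, 5))      # hypothetical 5-band multispectral chip
patches = to_patches(img, 8)            # (16, 320)

W_embed = rng.normal(size=(320, 64))    # linear patch-projection weights
tokens = patches @ W_embed              # (16, 64) patch embeddings

cls = rng.normal(size=(1, 64))          # learnable CLS token (the * in Figure 8)
seq = np.vstack([cls, tokens])          # sequence of 1 + 16 tokens
seq = seq + rng.normal(size=seq.shape)  # add positional embeddings
print(seq.shape)                        # (17, 64)
```

The transformer encoder then attends over this 17-token sequence, and the output at the CLS position feeds the classification head.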
Figure 9. A schematic of the SCN–CNN architecture consisting of basic convolutional, batch normalization, and max pooling layers that were utilized for binary classification (non-detected and SCN-infested) of multispectral imagery.
Figure 10. The classes of the confusion matrix that are used in binary classification of DL models. Positive refers to non-detected and negative refers to SCN-infested regions.
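The performance metrics reported later follow directly from the confusion-matrix classes in Figure 10. A minimal sketch, with hypothetical counts chosen only to illustrate the arithmetic (not the study's actual confusion matrix):

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts.
    Here 'positive' = non-detected and 'negative' = SCN-infested."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for illustration only
acc, prec, rec, f1 = binary_metrics(tp=31, fp=9, fn=9, tn=31)
print(acc, prec, rec, f1)  # all 0.775 for this symmetric example
```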
Figure 11. Variation in spectral value (containing both soybean plant and non-plant) distribution between non-detected and SCN-infested classes collected on (a) 16 July 2022 and (b) 19 August 2022.
Figure 12. Spectral values of soybean plant and non-plant pixels between non-detected and SCN-infested classes collected on (a) 16 June 2022 and (b) 19 August 2022.
Figure 13. Cohen’s d effect sizes for soybean plant and non-plant pixels across all five multispectral bands in distinguishing non-detected and SCN-infested classes. Circle size corresponds to the magnitude of Cohen’s d for each UAS collection date, spectral band, and pixel category.
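The effect sizes in Figure 13 use Cohen's d with a pooled standard deviation. A minimal sketch follows; the sample reflectance values are invented solely to exercise the formula and do not come from the study's data.

```python
import math

def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled SD."""
    na, nb = len(a), len(b)
    mean_a = sum(a) / na
    mean_b = sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)  # sample variance
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (mean_a - mean_b) / pooled_sd

# Hypothetical NIR reflectances: non-detected vs. SCN-infested plant pixels
non_detected = [0.52, 0.55, 0.50, 0.54]
infested = [0.44, 0.46, 0.43, 0.45]
print(round(cohens_d(non_detected, infested), 2))
```

A positive d here indicates higher NIR reflectance in the non-detected group, consistent with NIR being the strongest discriminating band.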
Figure 14. The variation in performance metrics, such as accuracy, precision, recall, and F1-score, observed for the different timestamp-based ViT models (top five). # indicates the number of timestamp images.
Figure 15. Accuracy of the ViT model across four SCN population categories—No SCN, low SCN, medium SCN, and high SCN—derived from reclassifying the model’s two predicted classes using ground-truth observations. Although the model was trained using only two classes, this post hoc categorization provides deeper insight into which SCN population ranges are most frequently misclassified. # indicates the number of timestamp images.
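The post hoc regrouping behind Figure 15 (and Figure 17) can be sketched as follows: bin each quadrat's ground-truth egg count into one of four categories, then score the binary predictions within each bin. The threshold values, labels, and toy data below are hypothetical placeholders; the study's actual cutoffs are not reproduced here.

```python
from collections import defaultdict

def scn_category(eggs_per_100cm3, low=500, medium=2500):
    """Bin an SCN egg count into four categories.
    Thresholds are illustrative placeholders, not the study's cutoffs."""
    if eggs_per_100cm3 == 0:
        return "No SCN"
    if eggs_per_100cm3 <= low:
        return "low SCN"
    if eggs_per_100cm3 <= medium:
        return "medium SCN"
    return "high SCN"

def per_category_accuracy(counts, y_true, y_pred):
    """Accuracy of binary predictions, regrouped by ground-truth SCN category."""
    hits, totals = defaultdict(int), defaultdict(int)
    for count, t, p in zip(counts, y_true, y_pred):
        cat = scn_category(count)
        totals[cat] += 1
        hits[cat] += int(t == p)
    return {cat: hits[cat] / totals[cat] for cat in totals}

# Toy example: 0 = non-detected, 1 = SCN-infested
counts = [0, 120, 3000, 900]
y_true = [0, 1, 1, 1]
y_pred = [0, 0, 1, 1]
print(per_category_accuracy(counts, y_true, y_pred))
```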
Figure 16. Variation in accuracy, precision, recall, and F1-score across the top five SCN–CNN models trained with different timestamp combinations. # indicates the number of timestamp images.
Figure 17. Accuracy of the SCN–CNN model across four categories: No SCN, low SCN, medium SCN, and high SCN population, as derived by reclassifying the model’s two predicted classes using ground-truth observations. Although the model was trained using only two categories, this post hoc grouping provides insights into which population ranges are most frequently misclassified. # indicates the number of timestamp images.
Figure 18. Comparison of ViT and SCN–CNN model accuracies across different timestamp configurations.
Figure 19. Best performing two-timestamp ViT model predictions overlaid on the UAS image captured on 19 August 2022. Yellow text represents the model-predicted probability of detecting the particular class (positive: non-detected, negative: SCN-infested).
Figure 20. Best performing five-timestamp SCN–CNN model predictions overlaid on the UAS image captured on 19 August 2022. Yellow text represents the model-predicted probability of detecting the particular class (positive: non-detected, negative: SCN-infested).
Table 1. Vegetation indices used as inputs to the DL models. The # column simply numbers the listed indices.

#	Vegetation Index	Formula
1	Normalized Difference Vegetation Index (NDVI)	NDVI = (NIR − Red) / (NIR + Red)
2	Excess Green Index (ExG)	ExG = 2 × Green − Red − Blue
3	Normalized Difference Red Edge (NDRE)	NDRE = (NIR − RedEdge) / (NIR + RedEdge)
4	Enhanced Vegetation Index (EVI)	EVI = 2.5 × (NIR − Red) / (NIR + 6 × Red − 7.5 × Blue + 1)
5	Green Normalized Difference Vegetation Index (GNDVI)	GNDVI = (NIR − Green) / (NIR + Green)
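The indices in Table 1 are simple per-pixel band arithmetic and can be sketched directly. The reflectance values in the example are hypothetical, and the EVI coefficients shown are the standard ones (2.5, 6, 7.5, 1); the study's exact implementation details may differ.

```python
def ndvi(nir, red):
    # Normalized Difference Vegetation Index
    return (nir - red) / (nir + red)

def ndre(nir, red_edge):
    # Normalized Difference Red Edge
    return (nir - red_edge) / (nir + red_edge)

def gndvi(nir, green):
    # Green Normalized Difference Vegetation Index
    return (nir - green) / (nir + green)

def exg(green, red, blue):
    # Excess Green Index
    return 2.0 * green - red - blue

def evi(nir, red, blue):
    # Enhanced Vegetation Index with standard coefficients
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

# Hypothetical reflectances for a healthy-vegetation pixel
nir, red_edge, green, red, blue = 0.45, 0.30, 0.12, 0.08, 0.05
print(round(ndvi(nir, red), 3))  # -> 0.698
```

The same functions work elementwise on NumPy band rasters, which is how they would feed whole-image index layers into the DL models.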
Table 2. The consecutive timestamps that were used as input data for the DL models.

# of Timestamps	Selected Timestamps
2	19 August; 2 September
3	29 July; 19 August; 2 September
4	3 June; 29 July; 19 August; 2 September
5	3 June; 22 July; 29 July; 19 August; 2 September
6	3 June; 17 June; 22 July; 29 July; 19 August; 2 September
7	3 June; 17 June; 1 July; 22 July; 29 July; 19 August; 2 September
Note: The # sign refers to the number of timestamps included in the respective grouping. Each grouping retains the timestamps of the previous grouping and adds the newly included timestamp(s).

Katari, S.; Bevers, N.; KC, K.; Peart, A.; Lopez-Nicora, H.D.; Khanal, S. Deep Learning for Soybean Cyst Nematode Detection: A Comparison of Vision Transformer and CNN with Multispectral Imaging. Remote Sens. 2026, 18, 757. https://doi.org/10.3390/rs18050757
