Article

YOLOv11n-DSU: A Study on Grading and Detection of Multiple Cucumber Diseases in Complex Field Backgrounds

College of Mechanical and Electrical Engineering, Yunnan Agricultural University, Kunming 650201, China
* Author to whom correspondence should be addressed.
Agriculture 2026, 16(2), 140; https://doi.org/10.3390/agriculture16020140
Submission received: 3 December 2025 / Revised: 30 December 2025 / Accepted: 31 December 2025 / Published: 6 January 2026
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

Abstract

Cucumber downy mildew, angular leaf spot, and powdery mildew represent three predominant foliar diseases that substantially compromise cucumber yield and quality. To address the challenges posed by the irregular morphology, prominent multi-scale characteristics, and ambiguous lesion boundaries of cucumber foliar diseases in complex field environments—which often lead to insufficient detection accuracy—along with the existing models’ difficulty in balancing high precision with lightweight deployment, this study presents YOLOv11n-DSU (a lightweight hierarchical detection model engineered using the YOLOv11n architecture). The proposed model integrates three key enhancements: a Deformable Convolution (DEConv) module for optimized feature extraction from irregular lesions, a Spatial and Channel Synergistic Attention (SCSA) mechanism for adaptive feature refinement, and a Unified Intersection over Union (Unified-IoU) loss function to improve localization accuracy. Experimental evaluations demonstrate substantial performance gains, with mean Average Precision at 50% IoU threshold (mAP50) and mAP50–95 increasing by 7.9 and 10.9 percentage points, respectively, and precision and recall improving by 6.1 and 10.0 percentage points. Moreover, the computational complexity is markedly reduced to 5.8 Giga Floating Point Operations (GFLOPs). Successful deployment on an embedded platform confirms the model’s practical viability, exhibiting robust real-time inference capabilities and portability. This work provides an accurate and efficient solution for automated disease grading in field conditions, enabling real-time and precise severity classification, and offers significant potential for advancing precision plant protection and smart agricultural systems.

1. Introduction

Cucumber is a widely cultivated vegetable crop of global importance, whose yield and quality are crucial for the sustainable development of China’s protected vegetable industry. However, under greenhouse cultivation conditions, disease infections significantly constrain cucumber productivity and quality. Among these diseases, downy mildew, angular leaf spot, and powdery mildew are the most prevalent and destructive foliar diseases. Downy mildew [1], caused by Pseudoperonospora cubensis, is characterized by angular yellow spots on the leaf surface and purplish-gray mold formation on the underside under high-humidity conditions. Angular leaf spot [2], induced by Pseudomonas syringae, initially appears as water-soaked angular lesions that subsequently develop into necrotic spots with a tendency for perforation. Powdery mildew [3], primarily caused by Podosphaera xanthii, produces characteristic white powdery mold layers on leaf surfaces. These three diseases often occur individually or as mixed infections, severely impairing leaf photosynthetic function and leading to premature senescence and leaf yellowing. In severe cases, complete crop loss may occur. Complex field backgrounds including weeds, shadows, and overlapping plants often cause model misidentification and severity grading errors. Due to their non-invasive nature and high efficiency, image-based disease detection methods have become an important research trend in smart agriculture [4,5,6].
In the field of cucumber disease severity grading, researchers have explored diverse technical pathways. Early studies predominantly relied on traditional image processing and machine learning methods. These approaches utilized image segmentation [7], multi-feature fusion [8], and machine learning algorithms such as Support Vector Machines [9] for disease identification and grading, demonstrating the feasibility of machine vision for this task [10]. Cao et al. [11] integrated image mask learning, contrastive learning, and image-text cross-modal learning, achieving 95% accuracy in few-shot, unlabeled cucumber disease recognition. However, such methods are often sensitive to factors like varying illumination and leaf occlusion in complex backgrounds, resulting in limited robustness. With the widespread adoption of deep learning, Convolutional Neural Network (CNN)-based recognition and segmentation methods have become mainstream, significantly improving accuracy and generalization capability [12]. Research efforts have bifurcated: one stream focuses on modifying and leveraging classical network architectures. For instance, Wang et al. [13] implemented cucumber disease classification based on an improved VGG network, enhancing the grading accuracy for cucumber leaf diseases in complex backgrounds. The other stream addresses the challenge of field adaptability through lightweight model design. Tang et al. [14] employed a lightweight architecture for a six-level tomato early blight grading task, achieving 94.1% mAP50 and a detection speed of 15.67 FPS. Liu et al. [15] constructed a lightweight CNN named MassNet, with the entire system achieving a grading efficiency of 93%. Zhang et al. [16] tackled the issues of excessive parameters and limited feature representation in the AlexNet model for plant disease recognition by proposing a GPDCNN model that incorporates dilated convolution and global pooling. This model maintained high accuracy while significantly reducing computational complexity, demonstrating superior recognition performance on a dataset of six cucumber leaf diseases. Meanwhile, the development of efficient, unified, end-to-end detection frameworks tailored for specific agricultural scenarios has emerged as a significant research trend. Such studies typically build upon established object detection architectures like YOLO, designing lightweight and unified networks at the architectural level to meet the deployment constraints of edge devices. For instance, Anandakrishnan et al. [17] developed the Tiny-YOLOv9 model specifically for UAV-based plant protection, achieving high-precision real-time detection across multiple crop foliar diseases. Nguyen et al. [18] enhanced the feature extraction capability of YOLOv11n by designing the αSiLU activation function, thereby achieving consistent performance improvements on tomato and cucumber disease datasets. This progression signifies that disease detection technology is evolving from optimizing models for singular scenarios towards constructing generalizable, adaptable system-level toolkits. Furthermore, Shu-fei et al. [19] proposed a YOLOv5s-SE-DW model for recognizing three cucumber diseases in natural environments, attaining a maximum mAP of 80.9% while compressing the model size to 9.45 MB and reducing computational cost to 11.8 GFLOPs. In recent years, research has increasingly focused on precise severity grading in natural environments. Yao et al. [20] achieved an average accuracy exceeding 94% for grading the severity of cucumber downy mildew and anthracnose. Ozguven et al. [21] utilized a Faster R-CNN model for the detection and severity assessment of cucumber powdery mildew, attaining an accuracy of 94.86% on a set of 175 images.
In summary, intelligent detection and severity classification of cucumber diseases have achieved a significant transition from traditional image processing to deep learning approaches, with notable improvements in recognition accuracy. However, persistent technical bottlenecks remain in key areas, including adaptation to complex field environments, fine-grained severity classification under co-occurring disease conditions, and lightweight model deployment. Regarding the choice of technical approach, instance segmentation-based methods, while capable of providing pixel-level accuracy, are often characterized by high model complexity and computational cost, making them unsuitable for real-time processing and deployment on resource-constrained embedded devices in field settings. In contrast, the object detection paradigm, which localizes and classifies individual lesion instances, enables rapid identification and preliminary severity assessment of leaves affected by multiple diseases and severity levels at a lower computational and annotation cost. To address these challenges, this study focuses on downy mildew, angular leaf spot, and powdery mildew under complex field conditions. Employing the lightweight YOLOv11n as the baseline model, three core improvements are introduced: the Deformable Convolution (DEConv) module to enhance multi-scale feature extraction from irregular lesions; the Spatial and Channel-wise Synergistic Attention (SCSA) module to suppress interference from complex backgrounds; and the Unified Intersection over Union (Unified-IoU) loss function to optimize the consistency between bounding box regression and classification. The resulting model, named YOLOv11n-DSU, is designed to achieve real-time, accurate identification and precise severity assessment of the three target diseases, thereby providing a reliable technical solution to support precision pesticide application, cost control, and quality assurance in agricultural production.

2. Materials and Methods

2.1. Data Acquisition

To construct a dataset suitable for grading detection of cucumber downy mildew, angular leaf spot, and powdery mildew, field image collection was conducted from 26 to 30 July 2025, at a cucumber plantation in Weijiaying, Bajie Subdistrict, Anning City, Kunming, Yunnan Province. The collection period coincided with the mid-to-late growth stage of cucumbers, characterized by frequent occurrences of all three diseases and diverse lesion morphology, covering the complete disease progression from initial infection to severe stages. Image acquisition was performed primarily using smartphones under natural lighting conditions. Photographs were taken from a distance of 30–60 cm at multiple angles, including top-down and oblique perspectives. To enhance model generalization in complex environments, the collection process intentionally incorporated varied scenarios such as strong sunlight, overcast diffuse light, and partial shading conditions, as illustrated in Figure 1. A total of 7129 raw images were initially obtained. Following a preliminary selection process to exclude severely blurred, overexposed, or invalid samples, 6037 images were retained. The final dataset includes images of healthy leaves as well as leaves affected by different severity grades of downy mildew, angular leaf spot, and powdery mildew.
The dataset used in this study was collected from cultivation greenhouses within a single geographical region. Although the dataset has been enriched through the introduction of diverse lighting conditions, background variations, and data augmentation techniques for sample diversification, it inherently lacks coverage across different climatic zones, cucumber cultivars, and long-term phenotypic disease spectra. A key design consideration of this work is to frame the model’s learning objective around recognizing the visual pathological characteristics of the diseases themselves and their severity levels, rather than memorizing irrelevant features associated with specific growing locations or varieties. This task definition intentionally forces the model to learn more generalizable patterns inherent to lesion morphology, thereby laying an algorithmic foundation for future adaptation to other environments and cultivars.

2.2. Data Augmentation and Final Dataset

Following the initial image selection, data augmentation was performed on the 6037 validated images to enhance the model’s adaptability to complex field environments. By incorporating rotation, exposure adjustment, and noise injection, the dataset was effectively expanded to a total of 6832 images. The final dataset distribution is summarized in Table 1.
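For readers who wish to reproduce the augmentation pipeline, the sketch below illustrates the three operations named above using OpenCV. The parameter values (rotation angle, exposure gain, noise level) are illustrative assumptions rather than the settings used in this study, and the corresponding bounding-box label transforms required for rotated detection images are omitted for brevity.

import cv2
import numpy as np

def augment(image: np.ndarray, angle: float = 15.0,
            exposure: float = 1.2, noise_sigma: float = 8.0) -> np.ndarray:
    # Rotation about the image centre
    h, w = image.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    out = cv2.warpAffine(image, m, (w, h), borderMode=cv2.BORDER_REFLECT)
    # Exposure adjustment (simple gain; >1 brightens, <1 darkens)
    out = cv2.convertScaleAbs(out, alpha=exposure, beta=0)
    # Additive Gaussian noise
    noise = np.random.normal(0.0, noise_sigma, out.shape)
    return np.clip(out.astype(np.float32) + noise, 0, 255).astype(np.uint8)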
The criteria for disease severity classification were defined by the following technical standards for cucumber disease resistance identification: NY/T 1857.1-2010 [22], NY/T 1857.6-2010 [23], and NY/T 1857.2-2010 [24]. The specific classification standards are provided in Table 2.
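As a concrete illustration of how a grading table of this kind is applied, the sketch below maps a lesion-to-leaf area ratio onto the 0/1/3/5/7/9 grade scale referenced in the results. The threshold values are hypothetical placeholders; the authoritative bands are those given in Table 2 per the NY/T 1857 standards.

def severity_grade(lesion_area_ratio: float) -> int:
    # Illustrative thresholds only; the authoritative bands follow
    # NY/T 1857.1-2010, NY/T 1857.2-2010, and NY/T 1857.6-2010 (Table 2).
    if lesion_area_ratio == 0.0:
        return 0   # healthy leaf
    if lesion_area_ratio <= 0.05:
        return 1
    if lesion_area_ratio <= 0.10:
        return 3
    if lesion_area_ratio <= 0.25:
        return 5
    if lesion_area_ratio <= 0.50:
        return 7
    return 9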

2.3. Multi-Disease Severity Classification Model Based on YOLOv11n

This study employs YOLOv11n as the baseline model for severity classification of cucumber downy mildew, angular leaf spot, and powdery mildew. YOLOv11, released by Ultralytics, represents one of the latest iterations in the YOLO series, maintaining the philosophy of single-stage, efficient design while incorporating comprehensive optimizations in network architecture, training strategies, and loss functions. YOLOv12, another high-performance variant that has recently gained attention in both the academic community and engineering practice, demonstrates notable improvements in feature fusion and cross-scale perception. The models referenced in this study—“YOLOv11n” and the “YOLOv12n” used in comparative experiments—feature the suffix “n” denoting the “nano” version of the series, characterized by a reduced parameter count and high computational efficiency, making them particularly suitable for deployment on edge computing devices. While the YOLO series has gained widespread recognition in object detection for its compact architecture and efficient inference, its direct application to coordinated detection of these three diseases poses substantial challenges. First, the diseases exhibit distinct morphological characteristics: angular chlorotic lesions in downy mildew, water-soaked spots progressing to perforations in angular leaf spot, and white powdery patches in powdery mildew, all demanding enhanced feature extraction capabilities from the model. Second, the multi-scale nature of disease manifestations is prominent, where early-stage lesions with minimal area and subtle features are particularly susceptible to feature attenuation during extraction, leading to insufficient detection accuracy for small targets. Furthermore, interfering factors in field environments, including varying illumination conditions and leaf occlusion, along with complex scenarios of disease co-infection, significantly increase identification difficulty. Finally, the visual transitions between different severity levels are often subtle, with particularly ambiguous boundaries distinguishing moderate from severe infection stages, thereby imposing higher requirements on the model’s capacity for fine-grained severity classification.
To address the aforementioned challenges, this paper incorporates three key structural enhancements into the YOLOv11n framework: First, a Deformable Convolution (DEConv) module is introduced, which employs adaptively deformed sampling locations to enhance multi-scale lesion feature perception while maintaining parameter efficiency. Second, a Spatial and Channel Synergistic Attention (SCSA) mechanism is designed, leveraging the synergistic effect of channel recalibration and spatial context modeling to achieve precise focus on lesion regions and effectively suppress complex background interference. Third, the Unified Intersection over Union (Unified-IoU) loss function is adopted to dynamically optimize bounding box regression, significantly improving detection performance for lesions with ambiguous boundaries and similar morphological characteristics. The overall architecture of the enhanced algorithm is illustrated in Figure 2.
The principal improvements of this study consist of the following three aspects:

2.3.1. DEConv Convolutional Module

A Deformable Convolution (DEConv) module [25] is integrated into the backbone feature extraction component of the YOLOv11n network, replacing the original standard convolutional structure to enhance the model’s capability in extracting features from irregularly shaped lesions and geometric variations.
As illustrated in Figure 3, the proposed DEConv module effectively addresses the challenges posed by the high diversity and complexity in the shape, scale, and texture of lesions caused by cucumber downy mildew, angular leaf spot, and powdery mildew. Through deformable convolutions in the main path, the DEConv module enables convolutional kernels to dynamically adapt to the angular contours characteristic of downy mildew and angular leaf spot, as well as the precise boundaries of irregularly clustered patches in powdery mildew. This achieves adaptive coverage of the receptive field over key regions of irregular lesions. Furthermore, the attention path employs a channel attention mechanism to selectively enhance discriminative features, such as the mold layer on the abaxial surface of downy mildew, brown scar tissues in angular leaf spot, and the distinctive powdery texture of powdery mildew. The fusion of these two pathways strengthens the discriminative capacity for complex morphological and multi-scale variations among the three diseases while maintaining model lightweightness. This design overcomes the limitations of traditional fixed convolutional kernels and improves the modeling performance for the intricate morphological and multi-scale characteristics of downy mildew, angular leaf spot, and powdery mildew.
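To make the two-path design concrete, the following conceptual sketch pairs a deformable main path (built on torchvision’s DeformConv2d) with a squeeze-and-excitation-style channel-attention path. It illustrates the mechanism described above rather than reproducing the exact DEConv module of [25]; all layer widths are illustrative.

import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableBlock(nn.Module):
    # Conceptual sketch: deformable main path + channel-attention path.
    def __init__(self, c_in: int, c_out: int, k: int = 3):
        super().__init__()
        # Offset predictor: 2 offsets (dx, dy) per kernel position
        self.offset = nn.Conv2d(c_in, 2 * k * k, k, padding=k // 2)
        self.deform = DeformConv2d(c_in, c_out, k, padding=k // 2)
        # Channel attention path (squeeze-and-excitation style)
        self.att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(c_out, c_out // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(c_out // 4, c_out, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.deform(x, self.offset(x))   # sampling grid adapts to lesion shape
        return y * self.att(y)               # reweight discriminative channels

x = torch.randn(1, 64, 80, 80)
print(DeformableBlock(64, 128)(x).shape)     # torch.Size([1, 128, 80, 80])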

2.3.2. SCSA Module

A Spatial and Channel Synergistic Attention (SCSA) module [26] is introduced to enhance the model’s capacity for capturing and reconstructing features in lesion regions. In cucumber disease images, lesion areas often exhibit challenges such as color distortion, texture blurring, and ambiguous boundaries with healthy tissues. The SCSA module addresses these issues through synergistic modeling in both spatial and channel dimensions, specifically improving the representation quality of key characteristics including lesion color, texture, and contour. The architecture of the SCSA module is depicted in Figure 4.
The SCSA module comprises two core submodules: Shared Multi-semantic Spatial Attention (SMSA) and Progressive Channel Self-Attention (PCSA). The operational workflow is as follows: The SMSA submodule first employs multi-scale 1D convolutions to extract multi-receptive-field spatial features from ambiguous lesion regions—such as the venation-constrained polygonal margins in downy mildew, diffuse transition zones of light brown spots in angular leaf spot, and blended boundaries between powdery coatings and leaf surfaces in powdery mildew. This process captures multi-level semantic information ranging from local details to global morphology, thereby providing spatial structural priors for subsequent channel attention. A progressive feature compression strategy is then applied to efficiently transmit spatial information into the channel dimension, preserving critical spatial context while reducing computational complexity, and guiding the channel attention to more accurately correct color deviations in lesions caused by illumination or occlusion. The PCSA submodule, leveraging the spatial priors provided by SMSA, dynamically adjusts channel weights through input-adaptive similarity computation, thereby enhancing the discriminability in color and semantics between diseased and healthy tissues. Finally, SMSA and PCSA are connected in series to form a lightweight synergistic attention framework, enabling the decoupling and deep integration of spatial structures and channel-wise features. This design collectively improves the structural clarity and color fidelity of lesion regions in images, suppresses redundancy in core lesion areas while enhancing detail reconstruction in marginal zones, and effectively highlights the polygonal contours of downy mildew, incipient perforation trends in angular leaf spot, and diffusion boundaries of powdery coatings in powdery mildew. The module significantly enhances the representational capacity of morphological and color characteristics, providing robust feature support for the accurate identification and severity grading of the three diseases.
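The following simplified sketch captures the serial SMSA-then-PCSA data flow described above. It is a reduced illustration under assumed design choices (pooled 1D profiles, a sigmoid channel gate) and omits the grouping and progressive compression details of the published SCSA module [26].

import torch
import torch.nn as nn

class SimplifiedSCSA(nn.Module):
    # Reduced illustration of the SMSA -> PCSA pipeline; not the exact module.
    def __init__(self, c: int, k_sizes=(3, 5, 7)):
        super().__init__()
        # SMSA: multi-scale depthwise 1D convolutions along H and W
        self.h_convs = nn.ModuleList(
            nn.Conv1d(c, c, k, padding=k // 2, groups=c) for k in k_sizes)
        self.w_convs = nn.ModuleList(
            nn.Conv1d(c, c, k, padding=k // 2, groups=c) for k in k_sizes)
        self.gate = nn.Sigmoid()
        # PCSA: lightweight channel reweighting over pooled descriptors
        self.q = nn.Linear(c, c)
        self.k = nn.Linear(c, c)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # SMSA: spatial attention from pooled 1D profiles
        xh = x.mean(dim=3)                               # (B, C, H) height profile
        xw = x.mean(dim=2)                               # (B, C, W) width profile
        ah = sum(conv(xh) for conv in self.h_convs) / len(self.h_convs)
        aw = sum(conv(xw) for conv in self.w_convs) / len(self.w_convs)
        x = x * self.gate(ah).unsqueeze(3) * self.gate(aw).unsqueeze(2)
        # PCSA: channel gate guided by the spatially refined features
        d = x.mean(dim=(2, 3))                           # (B, C) channel descriptor
        sim = torch.sigmoid(self.q(d) * self.k(d))       # input-adaptive similarity
        return x * sim.view(b, c, 1, 1)

print(SimplifiedSCSA(64)(torch.randn(1, 64, 80, 80)).shape)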

2.3.3. Unified-IoU Loss Function

In the intelligent detection of cucumber diseases, lesion regions exhibit remarkable morphological diversity: downy mildew lesions display polygonal distribution constrained by leaf veins, angular leaf spot progresses from water-soaked spots to irregular yellowish-brown patches, and powdery mildew forms white powdery layers with blurred boundaries. These complex characteristics pose significant challenges for traditional IoU-series loss functions during optimization, including gradient saturation, scale sensitivity, and imbalanced attention allocation. To address these limitations, this study introduces a Unified-IoU (UIoU) loss function. It is important to note that UIoU is not a completely novel component but rather a systematic integration and task-oriented customization of advanced concepts—including dynamic scale sensitivity adjustment, task-aligned weighting, and geometric constraints—specifically tailored for the cucumber disease detection task. The core innovation lies in: (1) constructing a unified framework that integrates dynamic scaling, adaptive weighting, Fourier shape descriptors, and angular constraints to enable synergistic optimization of multiple components; and (2) directly coupling classification confidence with localization quality through confidence-guided loss computation and cosine annealing scheduling, thereby ensuring training stability and consistency of optimization objectives. The UIoU incorporates three innovative mechanisms to achieve dynamic optimization of prediction boxes across varying quality levels:
1.
Dynamic Scale Scaling Mechanism
The Unified-IoU (UIoU) loss function incorporates a dynamic bounding box scaling strategy, which performs synchronous transformation on both the predicted bounding box (P) and ground truth bounding box (G) through a scale factor α:
$$P' = \alpha \cdot P$$
$$G' = \alpha \cdot G$$
Here, P′ represents the transformed predicted bounding box, and G′ denotes the transformed ground truth bounding box. When α > 1, the bounding boxes are enlarged, thereby enhancing the learning emphasis on low-quality samples. Conversely, when α < 1, the bounding boxes are scaled down, intensifying the optimization focus on high-quality predictions. This mechanism is particularly effective for addressing challenges such as the polygonal lesions in cucumber downy mildew and the ambiguous boundary regions characteristic of powdery mildew infections.
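A minimal sketch of this scaling step follows. A pure coordinate rescaling about the origin would leave the IoU of the pair unchanged, so the sketch assumes the common formulation in which each box is scaled about its own center; under that reading, enlarging both boxes (α > 1) raises the IoU of poorly aligned pairs and thereby shifts training emphasis toward low-quality samples.

import torch

def scale_boxes(boxes: torch.Tensor, alpha: float) -> torch.Tensor:
    # Scale xyxy boxes about their own centers by a factor alpha (assumed
    # formulation; scaling about the origin would not change the IoU).
    cx = (boxes[:, 0] + boxes[:, 2]) / 2
    cy = (boxes[:, 1] + boxes[:, 3]) / 2
    hw = (boxes[:, 2] - boxes[:, 0]) * alpha / 2   # scaled half-width
    hh = (boxes[:, 3] - boxes[:, 1]) * alpha / 2   # scaled half-height
    return torch.stack([cx - hw, cy - hh, cx + hw, cy + hh], dim=1)

p = torch.tensor([[0.0, 0.0, 2.0, 2.0]])   # predicted box
g = torch.tensor([[1.0, 1.0, 3.0, 3.0]])   # ground truth box
# IoU(p, g) = 1/7 ≈ 0.143; with alpha = 1.5 the scaled pair has
# IoU = 4/14 ≈ 0.286, so enlargement amplifies this low-quality match.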
2.
Adaptive Weight Allocation Mechanism
Built upon the task-aligned loss theory, a dynamic sample weighting function w is formulated as follows:
$$w_{\mathrm{dynamic}} = \left|\mathrm{IoU} - \mu_t\right|^{\gamma} \cdot \mathbb{I}\left(\mathrm{IoU} > \tau_t\right)$$
Within this formulation, IoU (Intersection over Union) measures the overlap between predicted and ground truth bounding boxes, $\gamma$ serves as the weight exponent governing sensitivity to IoU deviations, and $\mathbb{I}$ represents the indicator function that equals 1 when the specified condition is met and 0 otherwise. The mean IoU threshold $\mu_t$ and lower-bound IoU threshold $\tau_t$ are dynamically adjusted throughout the training process. This mechanism enables the model to intelligently allocate attention according to both training phase and sample quality: during initial training stages, it prioritizes moderately difficult samples to accelerate convergence, while in later stages it shifts focus to high-quality samples to enhance localization precision, thereby achieving adaptive optimization across different training phases.
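A short worked example (with illustrative values $\gamma = 2$, $\mu_t = 0.5$, $\tau_t = 0.4$, following the Appendix A pseudocode) shows how the weight reacts to sample quality:

for iou in (0.35, 0.55, 0.90):
    w = abs(iou - 0.5) ** 2 * (1.0 if iou > 0.4 else 0.0)
    print(f"IoU={iou:.2f} -> w={w:.4f}")
# IoU=0.35 -> w=0.0000   (below tau_t: excluded from weighting)
# IoU=0.55 -> w=0.0025   (near mu_t: small weight)
# IoU=0.90 -> w=0.1600   (far from mu_t: emphasized)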
3.
Multidimensional Geometric Constraints Mechanism
A loss function L UIoU incorporating multi-dimensional geometric information is formulated as follows:
$$L_{\mathrm{UIoU}} = \lambda_1 L_{\mathrm{IoU}} + \lambda_2 L_{\mathrm{Shape}} + \lambda_3 L_{\mathrm{Angle}}$$
where $\lambda_1$, $\lambda_2$, and $\lambda_3$ denote the weighting coefficients for the respective loss terms; $L_{\mathrm{IoU}}$ represents the conventional IoU loss; the shape constraint term $L_{\mathrm{Shape}}$ captures irregular lesion contours using Fourier descriptors; and the angular constraint term $L_{\mathrm{Angle}}$ enhances detection capability for linear lesions. To control computational overhead, Fourier descriptors approximate lesion contours using lower-order harmonic components, while angular constraints are implemented via lightweight vector operations. These computations introduce negligible additional costs during the forward pass in the training phase and are entirely removed during the inference phase, thus preserving real-time detection efficiency. Ablation studies confirm that the overall computational cost (in GFLOPs) of the final model remains lower than that of the baseline, demonstrating that the introduced enhancements achieve both lightweight design and improved effectiveness. This multi-constraint framework collectively enhances the modeling capacity for the complex morphological variations observed in various cucumber diseases.
To achieve a smooth transition during the training process, a cosine annealing scheduling strategy is adopted. Let the total number of training epochs be T and the current training epoch be epoch. The scaling factor is defined as:
$$\mathrm{ratio}(\mathrm{epoch}) = 0.75 \cdot \cos\!\left(\frac{\pi \cdot \mathrm{epoch}}{T}\right) + 1.25$$
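Evaluating this schedule at a few epochs makes its shape concrete (a quick sketch; T = 500 follows the usage example in Appendix A):

import math

T = 500
for epoch in (0, 250, 500):
    ratio = 0.75 * math.cos(math.pi * epoch / T) + 1.25
    print(f"epoch {epoch:3d}: ratio = {ratio:.3f}")
# epoch   0: ratio = 2.000  (aggressive scaling while alignment is coarse)
# epoch 250: ratio = 1.250
# epoch 500: ratio = 0.500  (fine-grained refinement near convergence)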
This strategy ensures rapid convergence during the initial training phase and fine-grained optimization in later stages. Simultaneously, a confidence-guided mechanism is introduced, where the classification confidence of the predicted bounding box is denoted as $c \in [0, 1]$. The final Unified-IoU loss expression is formulated as:
$$L_{\mathrm{UIoU}} = (1 - c) \cdot \left(1 - \mathrm{IoU}_{\mathrm{scaled}}\right)$$
The core advantage of Unified-IoU lies in its innovative integration of dynamic scale scaling, adaptive weight allocation, and multi-dimensional geometric constraints, forming a specialized bounding box optimization framework tailored for cucumber disease characteristics. Through the dynamic scale scaling mechanism, the alignment between predicted and ground truth bounding boxes is notably enhanced. In complex scenarios with multiple co-existing diseases, Unified-IoU leverages the synergistic interaction of its three mechanisms to improve detection accuracy for individual diseases, thereby achieving comprehensive performance improvement in detection tasks. To enhance the reproducibility of the method, the complete computational workflow of Unified-IoU is summarized as Python pseudocode in Appendix A.

2.4. Experimental Setup and Evaluation Metrics

The experiments were conducted on a computing system equipped with a Windows 11 operating system, an Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40 GHz, and an NVIDIA RTX A5000 GPU with 24 GB VRAM. The deep learning framework used was PyTorch-2.0.1 with Python-3.9.7 as the programming language, PyCharm (2024.3.1) as the development environment, and CUDA 11.7 as the computational platform. All algorithms were executed under this consistent configuration.
During the training process, identical hyperparameters were maintained across all experiments. The specific parameter settings are detailed in Table 3.
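For orientation, a baseline training run of this kind can be launched with the Ultralytics API as sketched below. The dataset YAML name and the batch size shown are placeholder assumptions; the actual hyperparameters are those listed in Table 3, and the full YOLOv11n-DSU model additionally requires the custom modules described in Section 2.3.

from ultralytics import YOLO

# Minimal sketch of a baseline run; "cucumber_disease.yaml" and batch=16
# are hypothetical placeholders (see Table 3 for the real settings).
model = YOLO("yolo11n.pt")            # YOLOv11n baseline weights
model.train(
    data="cucumber_disease.yaml",     # 16 classes: severity grades of 3 diseases + healthy
    imgsz=640,
    epochs=500,
    batch=16,
    device=0,
)
metrics = model.val()                 # reports P, R, mAP50, and mAP50-95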
To balance the requirements of both accuracy and real-time performance for severity grading of cucumber downy mildew, angular leaf spot, and powdery mildew, this study selects Precision (P), Recall (R), mean Average Precision (mAP), number of model parameters (Params), and computational complexity (GFLOPs) as evaluation metrics for model performance [27]. The corresponding formulas are provided as follows:
$$P = \frac{N_{TP}}{N_{TP} + N_{FP}} \times 100\%$$
$$R = \frac{N_{TP}}{N_{TP} + N_{FN}} \times 100\%$$
$$AP = \int_{0}^{1} P(R)\,\mathrm{d}R$$
$$mAP = \frac{1}{n} \sum_{i=1}^{n} AP_i$$
In this study, $N_{TP}$ denotes the number of correctly classified disease severity grades across multiple diseases; $N_{FP}$ represents the number of misclassified disease severity grades; $N_{FN}$ indicates the number of undetected diseased leaf instances; AP (Average Precision) is defined as the area under the Precision-Recall curve; mAP (mean Average Precision) quantifies the average detection accuracy across different disease severity classifications; and the total number of distinct classes $n$ is 16.
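A quick numerical check of the precision and recall formulas with made-up counts:

# Illustrative counts only; they are not results from this study.
n_tp, n_fp, n_fn = 90, 13, 15
precision = n_tp / (n_tp + n_fp) * 100   # 90/103 ≈ 87.4%
recall = n_tp / (n_tp + n_fn) * 100      # 90/105 ≈ 85.7%
print(f"P = {precision:.1f}%, R = {recall:.1f}%")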

3. Results and Analysis

3.1. Ablation Study

Ablation studies serve to effectively validate performance improvements in cucumber multi-disease severity classification achieved by the proposed modular enhancements. To systematically evaluate the individual and combined contributions of each component, this study employs YOLOv11n as the baseline model and assesses the effects of the DEConv module, SCSA mechanism, and Unified-IoU loss function on the detection of cucumber downy mildew, angular leaf spot, and powdery mildew. The results of the ablation experiments are summarized in Table 4.
Specifically, mAP50 denotes the mean average precision across all severity levels of cucumber downy mildew, angular leaf spot, and powdery mildew at a fixed IoU threshold of 50%. Meanwhile, mAP50–95 represents the average detection precision computed over multiple IoU thresholds ranging from 50% to 95% with a step size of 5%, evaluating the model’s robustness across varying localization criteria for the three diseases.
Based on the ablation study results presented in Table 4, Experiment 2, which employed the DEConv module to replace the original convolutional structure, achieved improvements of 4.7%, 4.4%, 4.3%, and 3.3% in precision, recall, mAP50, and mAP50–95, respectively. Despite an increase in computational cost, the parameter count remained unchanged, indicating that DEConv effectively enhances feature extraction capability for irregular lesions through improved geometric modeling. In Experiment 3, the exclusive integration of the SCSA mechanism resulted in precision and mAP50 improvements of 2.4% and 0.5%, respectively, while recall experienced a minor decline. This suggests that although the SCSA module strengthens focus on critical regions, it may simultaneously suppress certain background features, reflecting the selective nature of spatial-channel synergistic attention in complex disease lesion scenarios. In Experiment 4, where only the Unified-IoU loss was employed, the model exhibited a decline in mAP50–95 and only marginal improvements in other metrics. This observation indicates that Unified-IoU acts as a strongly constrained optimization objective that inherently assumes the model possesses a robust capacity to represent lesion geometry. However, the standard convolutional structure of YOLOv11n has limited capability in extracting such geometric features, leading to a mismatch between the optimization objective and the model’s representational capacity. Consequently, when used in isolation, Unified-IoU cannot realize its full potential and tends to introduce instability during training. Experiment 5, which combined the DEConv and SCSA modules, elevated mAP50 to 90.8%, validating the complementary strengths of convolutional structures and attention mechanisms in feature extraction and fusion. The integration of DEConv with Unified-IoU in Experiment 6 demonstrated remarkable progress in recall, achieving an 8.0% improvement over the baseline, confirming that the synergistic use of deformable convolution and dynamic loss functions enhances detection capability for small-scale disease samples. In Experiment 7, the combination of SCSA and Unified-IoU underperformed the baseline YOLOv11n across all metrics. This finding corroborates the preceding analysis: without the geometric feature enhancement afforded by DEConv, the SCSA module primarily amplifies visually salient regions without fundamentally enriching the geometric representation within the feature space. Consequently, when Unified-IoU enforces its strong optimization constraints based on these geometrically impoverished features, it can steer the model toward a suboptimal or erroneous convergence point, leading to the observed concurrent degradation in both localization and classification performance. Finally, Experiment 8, incorporating all three modules, achieved optimal performance across all evaluation metrics, with mAP50 reaching 93.5% and mAP50–95 attaining 88.3%. This result clearly illustrates the effectiveness of the hierarchical yet decoupled design among the modules: DEConv enhances the network’s geometric modeling capacity for irregular lesions; SCSA enables more precise semantic focusing within the feature space; and Unified-IoU performs high-accuracy bounding-box regression while ensuring alignment with classification confidence, demonstrating significant synergistic optimization effects that comprehensively improve detection performance while effectively controlling model complexity.

3.2. Analysis of Module Decision Mechanisms and Feature Flow

The ablation studies presented in Table 4 quantitatively measure the contribution of each module to the final performance metrics at a macroscopic level. To further elucidate the underlying mechanisms, a “feature flow” based theoretical analysis is constructed. The DEConv module operates at the foundational stage of feature extraction. Its deformable convolutional kernels dynamically adapt to the actual contours of irregular lesions—such as the polygonal shapes in downy mildew and water-soaked appearances in angular leaf spot—thereby overcoming the geometric representation limitations inherent in fixed kernels. This provides more accurate primary geometric features for subsequent processing stages and directly explains the significant improvement in the recall rate for early-stage and small-area lesions. The SCSA module implements synergistic attention across channel and spatial dimensions, amplifying the feature response to disease-specific textures. This enables the model to concentrate computational resources on pathological regions. This qualitative analysis aligns with the observed phenomenon of steadily increasing precision. The Unified-IoU loss function ensures that the high-quality features produced by the DEConv and SCSA modules are accurately decoded into detection outcomes. It guides bounding boxes to tightly fit lesion shapes through multi-dimensional geometric constraints and ensures consistency between classification confidence and localization quality via a confidence-coupled mechanism. This is identified as the core reason for the model’s greatest gain in the mAP50–95 metric.
In summary, the synergy among the modules is not merely additive but constitutes a systematic solution formed across three levels—feature extraction, selection, and optimization—specifically designed to address the three core challenges of “feature deformation, background interference, and the mismatch between localization and classification.”

3.3. Comparative Analysis of Model Classification Performance Before and After Improvement

To visually compare the performance of the improved model in disease severity grading, Figure 5 presents normalized confusion matrices derived from the test set, contrasting the classification results of the baseline YOLOv11n model and the proposed YOLOv11n-DSU model. The horizontal and vertical axes represent the ground-truth labels and model predictions, respectively, while the color intensity along the diagonal reflects the correct classification rate for each category.
Figure 5 demonstrates that the YOLOv11n-DSU model achieves comprehensive performance improvements in the identification of downy mildew, angular leaf spot, and powdery mildew.
Regarding disease-specific performance, the timely detection of early-stage, small-sized lesions is critical for precision disease management. Table 5 demonstrates substantial improvements across different severity levels. For early-stage angular leaf spot detection, the mAP50 for Grade 1 lesions significantly increased from 75.1% to 90.8% (a gain of 15.7 percentage points), while mAP50–95 improved from 64.1% to 83.5%, indicating enhanced capability in capturing subtle features of incipient lesions. In downy mildew identification, the detection accuracy for Grade 1 and Grade 9 lesions reached 94.2% and 94.6%, representing improvements of 10.7 and 7.6 percentage points over the baseline, respectively, demonstrating excellent discriminative capacity throughout the disease progression from initial infection to severe stages. This directly demonstrates that the synergistic interaction between the DEConv module—enhancing geometric feature perception—and the Unified-IoU loss—optimizing bounding box regression for small targets—effectively addresses the challenge of capturing the subtle characteristics of early-stage lesions, thereby significantly reducing the missed detection rate. For fine-grained classification, YOLOv11n-DSU exhibited outstanding performance in distinguishing early to mid-stage characteristics of powdery mildew. The mAP50 values for Grade 3 and Grade 5 lesions both achieved 96.6%, with improvements of 7.4 and 9.1 percentage points over the original model, while their mAP50–95 reached 91.4% and 90.4%, respectively, confirming the model’s accuracy in identifying characteristic changes during critical developmental phases. However, more modest improvements were observed for moderately severe cases, including Grade 5 and Grade 7 downy mildew and Grade 7 powdery mildew, where mAP50–95 increased by only 5.6, 5.2, and 4.5 percentage points, respectively, with accuracy maintained between 83.7% and 85.5%. This limitation may stem from the higher complexity and frequent overlap of lesion features at these severity levels. The experimental results comprehensively validate the effectiveness and robustness of the proposed improvements. The model maintains high accuracy while preserving its lightweight design, making it particularly suitable for deployment on resource-constrained embedded devices.

3.4. Comparative Experiments on Loss Functions

To validate the efficacy and performance advantages of the proposed Unified-IoU loss function in cucumber disease object detection, a comparative study was conducted against prevailing IoU-based loss functions, including GIoU, DIoU, CIoU, EIoU, and SIoU. All experiments were performed using the identical YOLOv11n-DEConv-SCSA model architecture under consistent hyperparameter settings, evaluating the severity grading performance for cucumber downy mildew, angular leaf spot, and powdery mildew.
As presented in Table 6, Unified-IoU achieves optimal performance across all evaluation metrics, attaining precision, recall, mAP50, and mAP50–95 values of 87.2%, 85.7%, 93.5%, and 88.3%, respectively, surpassing all competing loss functions. Notably, Unified-IoU demonstrates substantial advantages in two critical metrics—recall and mAP50–95—exceeding CIoU and EIoU by 5.6 and 7.9 percentage points, respectively. While EIoU and SIoU deliver moderate performance on certain individual metrics, their overall effectiveness remains considerably inferior to Unified-IoU. These results collectively demonstrate that Unified-IoU is particularly well-suited for precise detection and severity classification of cucumber diseases.
To further investigate the optimization characteristics of different loss functions, Figure 6 presents a comparative analysis of the convergence behaviors for bounding box regression loss (box_loss), classification loss (cls_loss), and distributional focal loss (dfl_loss) during the training process [28]. The results reveal that Unified-IoU exhibits accelerated convergence during initial training phases while maintaining a more stable optimization trajectory throughout the training process. Notably, the box_loss curve remains at the lowest level among all compared methods, demonstrating enhanced bounding box regression capability. The cls_loss maintains a stable descending trajectory during middle and late training stages, indicating sustained optimization of classification performance. Meanwhile, the dfl_loss shows relatively minor variations across different loss functions. These findings collectively indicate that Unified-IoU not only achieves superior detection accuracy but also demonstrates advantages in both training efficiency and optimization stability.
To further analyze the necessity of each constraint within the Unified-IoU loss function, a causal attribution is conducted based on its design motivation and the aforementioned experimental results. First, the dynamic scaling mechanism balances the optimization weights for lesions of different scales. Experimental results (see Table 5) show that the improved model achieves the most significant detection accuracy gains on Grade 1 (early-stage) lesions across all disease types, validating that this mechanism effectively enhances regression capability for small-scale targets. Second, the adaptive weighting mechanism improves model generalization. Comparing Experiment 4 and Experiment 8 in Table 4, when the model is equipped with strong feature extraction (DEConv) and focusing capability (SCSA), Unified-IoU raises mAP50–95 to 88.3%, demonstrating that this mechanism works synergistically with high-level features to achieve more stable optimization. Finally, the geometric constraint mechanism directly addresses the irregular morphology of cucumber lesions. In Table 6, Unified-IoU exhibits superior performance in mAP50–95 under high IoU thresholds, outperforming other loss functions, which substantiates the indispensable role of shape and angular constraints for achieving pixel-level precise localization. In summary, each component of Unified-IoU is designed to address a specific bottleneck, and its contribution is ultimately validated by the overall performance metrics.

3.5. Comparative Experiments

This section presents a comparative evaluation of YOLOv11n-DSU against mainstream object detection networks, including YOLOv5n, YOLOv8n, YOLOv11n, YOLOv12n, Vision Transformer, and Faster R-CNN, to validate its efficacy. The experimental results are summarized in Table 7.
Based on the experimental data presented in Table 7, the YOLOv11n-DSU model demonstrates superior performance across multiple key metrics. The proposed model achieves precision, recall, and mAP50 values of 87.2%, 85.7%, and 93.5%, respectively, outperforming all comparative models, while also attaining 88.3% in mAP50–95. Furthermore, the model exhibits remarkable computational efficiency, requiring only 5.8 GFLOPs and 5.5 MB of parameter storage. When compared against mainstream lightweight models, YOLOv11n-DSU shows substantial improvements in detection accuracy: it surpasses YOLOv5n by 15.9, 15.1, and 16.0 percentage points in precision, recall, and mAP50, respectively; exceeds YOLOv8n by 3.5, 8.5, and 5.7 percentage points in the same metrics; and outperforms YOLOv11n by 6.1, 10.0, and 7.9 percentage points. In comparisons with non-YOLO architectures, the model demonstrates enhanced performance over Vision Transformer by 7.9, 14.6, and 11.5 percentage points in precision, recall, and mAP50, respectively. The advantage is even more pronounced against Faster R-CNN, with corresponding improvements of 14.2, 17.4, and 16.2 percentage points across these three critical metrics.
Comprehensive evaluation demonstrates that YOLOv11n-DSU achieves substantial improvement in detection accuracy while maintaining superior computational efficiency.

3.6. Lightweight Model Deployment and Detection System

To enable efficient deployment of the cucumber downy mildew, angular leaf spot, and powdery mildew recognition model on embedded devices, the trained YOLOv11n-DSU model was successfully deployed on a Luban Cat5 development board. The deployment workflow, illustrated in Figure 7, began with converting the PyTorch-trained model weights into the ONNX intermediate format, which was subsequently transformed into the RKNN format compatible with the RK3588 chip architecture using the RKNN Toolkit. During the conversion, the model underwent key optimizations—including quantization, graph structure optimization, and tensor rearrangement—which compressed the model size and enhanced inference efficiency on the edge device while maintaining reasoning accuracy.
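The ONNX-to-RKNN step of this workflow can be sketched as follows, assuming the RKNN-Toolkit2 Python API. The file names, normalization values, and calibration image list are placeholder assumptions, not the exact export settings used in this study.

from rknn.api import RKNN

# Sketch of the ONNX -> RKNN conversion for the RK3588 NPU.
rknn = RKNN()
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]],
            target_platform="rk3588")
rknn.load_onnx(model="yolov11n_dsu.onnx")   # previously exported via torch.onnx.export
rknn.build(do_quantization=True, dataset="calib_images.txt")  # INT8 calibration list
rknn.export_rknn("yolov11n_dsu.rknn")
rknn.release()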
This system constructs a dedicated graphical user interface for cucumber downy mildew, angular leaf spot, and powdery mildew detection based on the Qt5 (5.15.2) platform. The interface adopts a modular architecture comprising four core functionalities: image acquisition, disease severity grading, path configuration, and result visualization. The system utilizes the OpenCV (4.11.0) module for image acquisition and preprocessing, while leveraging the RKNN API for efficient model inference. In practical operation, users can perform a series of tasks through the interface, including leaf image capture, detection task initiation, storage path setup, and result visualization. During detection, the system displays real-time annotated leaf images with disease severity indicators alongside detailed analytical results. The operational interface of the system is illustrated in Figure 8.
To validate the deployment performance, the YOLOv11n-DSU model underwent comprehensive evaluation, with its on-device performance compared against the original YOLOv11n model on the embedded platform. Figure 9 presents representative detection results from both models. Experimental results demonstrate that while maintaining its lightweight design, the YOLOv11n-DSU model not only achieves significant improvement in detection accuracy but also exhibits notable acceleration in inference speed, along with superior visualization clarity in the detection outputs.

4. Conclusions

The proposed YOLOv11n-DSU model achieves the high-precision detection of cucumber downy mildew, angular leaf spot, and powdery mildew severity under complex field conditions while effectively balancing model lightweight design and real-time inference capability. Experimental results demonstrate that the improved model attains 93.5% mAP50 and 88.3% mAP50–95, with precision and recall improving by 6.1% and 10.0%, respectively, compared to the original YOLOv11n model. Furthermore, the computational load and model size are reduced by 10.77% and 3.51%, respectively; notably, the detection capability for early-stage and small-sized lesions is markedly improved. These findings indicate that the optimized model effectively meets the requirements for practical agricultural field applications.
In terms of architectural improvements, an efficient and synergistic lightweight detection model, named YOLOv11n-DSU, is constructed in this work. This study innovatively integrates three collaborative modules: the DEConv module enhances feature extraction for irregular lesions through deformable convolutions, significantly improving the model’s adaptability to complex lesion morphologies. The SCSA module achieves adaptive feature calibration via spatial and channel attention mechanisms, thereby enhancing the semantic representation of critical lesion regions. The Unified-IoU loss function unifies the optimization objectives of bounding box regression and confidence assessment, improving the alignment between localization accuracy and classification confidence. Ablation studies demonstrate that the synergistic integration of these three components yields substantially better results than individual modules, achieving comprehensive detection performance improvements while maintaining the model’s lightweight design.
The YOLOv11n-DSU model demonstrates promising potential for deployment in embedded agricultural detection systems. The study accomplishes an end-to-end validation spanning from algorithmic design to practical field potential. Specifically, the proposed model achieves a 7.9 percentage-point improvement in mAP50 over the baseline, while simultaneously reducing parameter count and computational cost by 4.0% and 10.8%, respectively. Leveraging this high-precision performance and lightweight architecture, and to validate deployment performance on actual edge devices, the improved YOLOv11n-DSU model was deployed and tested on a Luban Cat5 platform (equipped with an RK3588 chip). After INT8 quantization, the model size was reduced to 5.5 MB. Experimental measurements show that when processing 640 × 640 input images, the end-to-end average inference speed reaches 22.5 FPS, with a peak memory usage of 586 MB during system operation. These results demonstrate the model’s potential for practical deployment and operation on resource-constrained edge devices: the model can be directly implemented on field-deployed intelligent inspection devices or UAV platforms, enabling real-time detection and severity grading of cucumber diseases. This capability contributes to optimized pesticide application, enhances disease management efficacy, and ultimately advances intelligent and sustainable cucumber production.
The YOLOv11n-DSU model demonstrates particularly strong performance in detecting early-stage and small-scale lesions, extending its application value beyond mere disease identification. By offering an objective, high-throughput quantitative analysis tool, it addresses the long-standing challenge in cucumber disease-resistant variety breeding—which traditionally relies on labor-intensive and subjective manual assessment—and provides a novel technical pathway to support the sustainable development of smart agriculture.

5. Discussion

While the proposed YOLOv11n-DSU model demonstrates strong overall performance in cucumber disease recognition, notable limitations persist under extremely complex field scenarios. First, the object detection paradigm adopted in this study essentially approximates the pixel area of lesions through the area of bounding boxes, which are then classified according to national standards. This approximation is effective when lesions are relatively regular in shape and discretely distributed, and its accuracy can be quantified by the Jaccard index (IoU) between predicted boxes and true lesion regions. However, when lesions exhibit highly irregular morphology or are densely clustered, rectangular boxes inevitably include substantial healthy tissue, leading to area overestimation—an inherent limitation compared with instance segmentation methods in principle. The choice of this approach represents an engineering trade-off among model lightweightness, inference real-time performance, annotation cost, and evaluation accuracy. Experimental results indicate that, under the current dataset and application requirements, this approximation method already provides reliable severity grading outcomes.
Second, the challenge of generalizing a model trained on a single-region dataset must be addressed. On one hand, from a mechanistic perspective, the proposed architectural improvements are designed to encourage the learning of more generalizable features: the DEConv module extracts geometric invariants of lesions; the SCSA module enhances the discriminative contrast between diseased and healthy tissues; and the Unified-IoU loss optimizes for universal bounding-box matching. These designs steer the model toward learning the intrinsic visual patterns of the diseases, forming the inherent foundation for its generalization potential. On the other hand, to bridge the gap from a “laboratory model” to a “field-ready generic tool”, a two-stage technical framework is proposed: In the first stage, the model’s recognition boundaries are expanded by constructing cross-domain datasets, utilizing synthetic data, and applying domain generalization algorithms. In the second stage, the model presented here serves as a powerful pretrained backbone; when deployed in a new environment, only a small number of samples and lightweight fine-tuning are required to rapidly and cost-effectively obtain a high-accuracy adapted model.
To address these technical constraints, future research will primarily focus on two key directions: One direction is to explore lightweight instance segmentation models that pursue higher evaluation accuracy with acceptable efficiency loss. The other is to develop detection systems capable of automatically adapting to different geographical regions and cultivars, the core of which lies in advancing a rapid model migration methodology based on domain adaptation and data augmentation. The performance breakthrough achieved in early-stage lesion detection in this work significantly enhances the model’s practical value. It can serve as an objective, efficient, and quantitative analysis tool for assessing disease resistance in cucumber varieties, thereby better supporting precision agriculture management. These efforts will advance the realization of more intelligent and precise automated crop disease management systems, thereby providing robust technical support for modern scientific cultivation practices.

Author Contributions

Conceptualization, X.T. and P.W.; methodology, P.W. and Z.S.; resources, P.W., Z.L., Y.T. and L.M.; formal analysis, P.W.; data curation, P.W.; writing—original draft preparation, P.W.; writing—review and editing, X.T., Y.Z. and Z.S.; visualization, P.W.; project administration, J.S. and Y.Z.; funding acquisition, Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Yunnan Provincial Joint Special Fund for Agricultural Basic Research (Project No. 202401BD070001-069). The APC was funded by Yonghua Zhang.

Data Availability Statement

All data generated or analyzed during this study are included in this published article. The original datasets collected and analyzed during the current study are not publicly available due to privacy concerns but are available from the corresponding author on reasonable request.

Acknowledgments

The authors would like to express their sincere gratitude to all colleagues and technicians who provided administrative and technical support during this research. We also thank those who contributed experimental materials as in-kind donations. During the preparation of this manuscript, Microsoft Word was used for writing, PyCharm was employed for building and validating experimental models, and Qt5 was utilized for data visualization. The authors have thoroughly reviewed and edited all content and assume full responsibility for the publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Python Pseudocode for the Unified-IoU Loss Function.
import torch
import numpy as np
class UnifiedIoULoss:
    def __init__(self, total_epochs = 500, lambda1 = 1.0, lambda2 = 0.5, lambda3 = 0.2):
        self.total_epochs = total_epochs
        self.lambda1 = lambda1
        self.lambda2 = lambda2
        self.lambda3 = lambda3
        self.gamma = 2.0
        self.iou_mu = 0.5
        self.iou_tau = 0.4
        
    def __call__(self, pred_boxes, true_boxes, conf_scores, current_epoch):
        N = pred_boxes.shape[0]
        
        ratio = 0.75 * np.cos(np.pi * current_epoch/self.total_epochs) + 1.25
        
        alpha = torch.rand(N, 1) * (ratio - 1/ratio) + 1/ratio
        alpha = alpha.to(pred_boxes.device)
        
        pred_scaled = pred_boxes * alpha
        true_scaled = true_boxes * alpha
        
        iou = self.calculate_iou(pred_scaled, true_scaled)
        
        self.update_dynamic_thresholds(iou)
        
        w_dynamic = torch.abs(iou - self.iou_mu).pow(self.gamma)
        w_dynamic = w_dynamic * (iou > self.iou_tau).float()
        
        L_iou = 1.0 - iou
        L_shape = self.shape_constraint(pred_scaled, true_scaled)
        L_angle = self.angle_constraint(pred_scaled, true_scaled)
        
        L_geometric = (
            self.lambda1 * L_iou +
            self.lambda2 * L_shape +
            self.lambda3 * L_angle
        )
        
        L_weighted = w_dynamic * L_geometric
        L_final = (1.0 - conf_scores) * L_weighted
        
        return torch.mean(L_final)
    
    def calculate_iou(self, boxes1, boxes2):
        # Intersection corners for (x1, y1, x2, y2) boxes.
        x1 = torch.max(boxes1[:, 0], boxes2[:, 0])
        y1 = torch.max(boxes1[:, 1], boxes2[:, 1])
        x2 = torch.min(boxes1[:, 2], boxes2[:, 2])
        y2 = torch.min(boxes1[:, 3], boxes2[:, 3])

        inter = torch.clamp(x2 - x1, min=0) * torch.clamp(y2 - y1, min=0)

        area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
        area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])
        union = area1 + area2 - inter + 1e-7  # epsilon avoids division by zero

        return (inter / union).unsqueeze(1)
    
    def shape_constraint(self, pred_boxes, true_boxes, k=5):
        # Shape term: compare the first k Fourier descriptors of the box contours.
        N = pred_boxes.shape[0]
        device = pred_boxes.device
        L_shape = torch.zeros(N, 1, device=device)

        for i in range(N):
            pred_contour = self.box_to_contour(pred_boxes[i])
            true_contour = self.box_to_contour(true_boxes[i])

            # Combine (x, y) samples into complex numbers so the FFT runs
            # along the contour, yielding standard Fourier descriptors.
            pred_fourier = torch.fft.fft(torch.complex(pred_contour[:, 0], pred_contour[:, 1]))[:k]
            true_fourier = torch.fft.fft(torch.complex(true_contour[:, 0], true_contour[:, 1]))[:k]

            L_shape[i] = torch.mean(torch.abs(pred_fourier - true_fourier) ** 2)

        return L_shape
    
    def angle_constraint(self, pred_boxes, true_boxes):
        # Diagonal vectors (width, height) of each box.
        pred_vec = torch.stack([
            pred_boxes[:, 2] - pred_boxes[:, 0],
            pred_boxes[:, 3] - pred_boxes[:, 1]
        ], dim=1)

        true_vec = torch.stack([
            true_boxes[:, 2] - true_boxes[:, 0],
            true_boxes[:, 3] - true_boxes[:, 1]
        ], dim=1)

        pred_vec_norm = pred_vec / (torch.norm(pred_vec, dim=1, keepdim=True) + 1e-7)
        true_vec_norm = true_vec / (torch.norm(true_vec, dim=1, keepdim=True) + 1e-7)

        # Penalize misaligned box diagonals via cosine similarity.
        cos_sim = torch.sum(pred_vec_norm * true_vec_norm, dim=1)
        L_angle = 1.0 - torch.abs(cos_sim)

        return L_angle.unsqueeze(1)
    
    def box_to_contour(self, box, num_points=100):
        # Sample points clockwise along the four edges of an (x1, y1, x2, y2) box.
        x1, y1, x2, y2 = box.unbind()
        t = torch.linspace(0, 1, num_points // 4, device=box.device)

        top = torch.stack([x1 + (x2 - x1) * t, y1 * torch.ones_like(t)], dim=1)
        right = torch.stack([x2 * torch.ones_like(t), y1 + (y2 - y1) * t], dim=1)
        bottom = torch.stack([x2 - (x2 - x1) * t, y2 * torch.ones_like(t)], dim=1)
        left = torch.stack([x1 * torch.ones_like(t), y2 - (y2 - y1) * t], dim=1)

        return torch.cat([top, right, bottom, left], dim=0)
    
    def update_dynamic_thresholds(self, iou):
        # Track the batch-mean IoU with an exponential moving average; the
        # masking threshold follows at 80% of the running mean.
        alpha = 0.9
        current_mu = torch.mean(iou).item()
        self.iou_mu = alpha * self.iou_mu + (1 - alpha) * current_mu
        self.iou_tau = 0.8 * self.iou_mu


def usage_example():
    batch_size = 32
    total_epochs = 500
    current_epoch = 250

    # Build valid (x1, y1, x2, y2) boxes so widths and heights are non-negative.
    xy = torch.rand(batch_size, 2)
    wh = torch.rand(batch_size, 2)
    true_boxes = torch.cat([xy, xy + wh], dim=1)
    pred_boxes = true_boxes + 0.05 * torch.randn(batch_size, 4)
    conf_scores = torch.rand(batch_size, 1)

    loss_fn = UnifiedIoULoss(total_epochs=total_epochs)
    loss_value = loss_fn(pred_boxes, true_boxes, conf_scores, current_epoch)

    print(f"Unified-IoU: {loss_value.item():.4f}")


if __name__ == "__main__":
    usage_example()
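
For completeness, the following minimal sketch shows how the loss above could drive gradient updates inside a training loop. The single-layer "detector", its random feature input, and the shared ground-truth box are illustrative placeholders only; they are not the paper's training pipeline, which uses the YOLOv11n-DSU network with the configuration listed in Table 3.

import torch

# Illustrative only: a dummy linear "detector" stands in for a real model.
torch.manual_seed(0)
detector = torch.nn.Linear(16, 5)  # outputs 4 box coordinates + 1 confidence logit
optimizer = torch.optim.SGD(detector.parameters(), lr=0.01, momentum=0.937)
loss_fn = UnifiedIoULoss(total_epochs=500)

features = torch.rand(32, 16)                                    # stand-in image features
true_boxes = torch.tensor([[0.2, 0.2, 0.8, 0.8]]).repeat(32, 1)  # one shared ground-truth box

for epoch in range(3):  # a few steps, for illustration
    out = detector(features)
    pred_boxes = out[:, :4]
    conf_scores = torch.sigmoid(out[:, 4:5])
    loss = loss_fn(pred_boxes, true_boxes, conf_scores, current_epoch=epoch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: Unified-IoU loss = {loss.item():.4f}")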

References

1. Vatter, T.; Barceló, M.; Gjakoni, P.; Segarra, G.; Trillas, M.I.; Aranjuelo, I.; Kefauver, S.C.; Araus, J.L. Comparing high-cost and lower-cost remote sensing tools for detecting pre-symptomatic downy mildew (Pseudoperonospora cubensis) infections in cucumbers. Comput. Electron. Agric. 2024, 218, 108736.
2. Gadhi, M.A.; Nazir, T.; Majeed, M.Z.; Jatoi, G.H.; Jie, R.; Qiu, D. In-vitro and in-vivo assessment of biological control potential of nematode symbiont Xenorhabdus nematophila against Pseudomonas syringae, the causative agent of angular leaf spot of cucumber. J. Phytopathol. 2024, 172, e13351.
3. Li, X.; Gao, Y.; Ahmad, N.; Bu, F.; Tian, M.; Jia, K.; Sun, W.; Li, C.; Zhao, C. Ficus carica Linn leaves extract induces cucumber resistance to Podosphaera xanthii by inhibiting conidia and regulating enzyme activity. Physiol. Mol. Plant Pathol. 2024, 133, 102339.
4. Dolatabadian, A.; Neik, T.X.; Danilevicz, M.F.; Upadhyaya, S.R.; Batley, J.; Edwards, D. Image-based crop disease detection using machine learning. Plant Pathol. 2025, 74, 18–38.
5. Wang, S.; Li, Q.; Yang, T.; Li, Z.; Bai, D.; Tang, C.; Pu, H. LSD-YOLO: Enhanced YOLOv8n algorithm for efficient detection of lemon surface diseases. Plants 2024, 13, 2069.
6. Bao, W.; Fan, T.; Hu, G.; Liang, D.; Li, H. Detection and identification of tea leaf diseases based on AX-RetinaNet. Sci. Rep. 2022, 12, 2183.
7. Zhang, J.H.; Kong, F.T.; Wu, J.Z.; Han, S.Q.; Zhai, Z.F. Automatic image segmentation method for cotton leaves with disease under natural environment. J. Integr. Agric. 2018, 17, 1800–1814.
8. Wang, Z.; Xu, X.; Zhu, L.; Bin, Y.; Wang, G.; Yang, Y.; Shen, H.T. Evidence-Based Multi-Feature Fusion for Adversarial Robustness. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 47, 8923–8937.
9. Kashef, R. A boosted SVM classifier trained by incremental learning and decremental unlearning approach. Expert Syst. Appl. 2021, 167, 114154.
10. Khan, R.U.; Khan, K.; Albattah, W.; Qamar, A.M. Image-based detection of plant diseases: From classical machine learning to deep learning journey. Wirel. Commun. Mob. Comput. 2021, 2021, 5541859.
11. Cao, Y.; Sun, G.; Yuan, Y.; Chen, L. Small-sample cucumber disease identification based on multimodal self-supervised learning. Crop Prot. 2025, 188, 107006.
12. Wang, B.; Pei, W.; Xue, B.; Zhang, M. A multiobjective genetic algorithm to evolving local interpretable model-agnostic explanations for deep neural networks in image classification. IEEE Trans. Evol. Comput. 2022, 28, 903–917.
13. Wang, C.; Du, P.; Wu, H.; Li, J.; Zhao, C.; Zhu, H. A cucumber leaf disease severity classification method based on the fusion of DeepLabV3+ and U-Net. Comput. Electron. Agric. 2021, 189, 106373.
14. Tang, X.; Sun, Z.; Yang, L.; Chen, Q.; Liu, Z.; Wang, P.; Zhang, Y. YOLOv11-AIU: A lightweight detection model for the grading detection of early blight disease in tomatoes. Plant Methods 2025, 21, 118.
15. Liu, F.; Zhang, Y.; Du, C.; Ren, X.; Huang, B.; Chai, X. Design and Experimentation of a Machine Vision-Based Cucumber Quality Grader. Foods 2024, 13, 16.
16. Zhang, S.; Zhang, S.; Zhang, C.; Wang, X.; Shi, Y. Cucumber leaf disease identification with global pooling dilated convolutional neural network. Comput. Electron. Agric. 2019, 162, 422–430.
17. Anandakrishnan, J.; Sangaiah, A.K.; Son, N.K.; Kumari, S.; Arif, M.L.; Abd Rahman, M.A. UAV-Based Deep Learning with Tiny-YOLOv9 for Revolutionizing Paddy Rice Disease Detection. In Proceedings of the 2024 IEEE International Conference on Smart Internet of Things (SmartIoT), Shenzhen, China, 14–16 November 2024; pp. 16–21.
18. Nguyen, D.T.; Bui, T.D.; Ngo, T.M.; Ngo, U.Q. Improving YOLO-Based Plant Disease Detection Using αSILU: A Novel Activation Function for Smart Agriculture. AgriEngineering 2025, 7, 271.
19. Li, S.F.; Li, K.Y.; Qiao, Y.; Zhang, L.X. Cucumber disease detection method based on visible light spectrum and improved YOLOv5 in natural scenes. Spectrosc. Spectr. Anal. 2023, 43, 2596–2600.
20. Yao, H.; Wang, C.; Zhang, L.; Li, J.; Liu, B.; Liang, F. A cucumber leaf disease severity grading method in natural environment based on the fusion of TRNet and U-Net. Agronomy 2023, 14, 72.
21. Ozguven, M.M. Deep learning algorithms for automatic detection and classification of mildew disease in cucumber. Fresenius Environ. Bull. 2020, 29, 7081–7087.
22. NY/T 1857.1-2010; Technical Regulations for Identification of Disease Resistance of Cucumber. Chinese Academy of Agricultural Sciences, Institute of Vegetables and Flowers: Beijing, China, 2010.
23. NY/T 1857.6-2010; Technical Regulations for Identification of Disease Resistance of Cucumber. Chinese Academy of Agricultural Sciences, Institute of Vegetables and Flowers: Beijing, China, 2010.
24. NY/T 1857.2-2010; Technical Regulations for Identification of Disease Resistance of Cucumber. Chinese Academy of Agricultural Sciences, Institute of Vegetables and Flowers: Beijing, China, 2010.
25. Li, J.; He, X.; Chen, X.; Kong, D.; Huang, T.; Song, P. HDFA-YOLO: A real-time steel surface defect detection model based on backbone lightweight design and multi-scale feature fusion. Measurement 2025, 258, 119390.
26. Si, Y.; Xu, H.; Zhu, X.; Zhang, W.; Dong, Y.; Chen, Y.; Li, H. SCSA: Exploring the synergistic effects between spatial and channel attention. arXiv 2024, arXiv:2407.05128.
27. Zhou, Y.T.; Cao, K.Y.; Li, D.; Piao, J.C. Fine-YOLO: A Simplified X-ray Prohibited Object Detection Network Based on Feature Aggregation and Normalized Wasserstein Distance. Sensors 2024, 24, 3588.
28. He, L.H.; Zhou, Y.Z.; Liu, L.; Zhang, Y.Q.; Ma, J.H. Research on the directional bounding box algorithm of YOLO11 in tailings pond identification. Measurement 2025, 253, 117674.
Figure 1. Field image acquisition of cucumber downy mildew, angular leaf spot, and powdery mildew under varied lighting conditions.
Figure 2. Architecture of the enhanced YOLOv11n network.
Figure 3. Architecture of the DEConv module.
Figure 4. Architecture of the spatial and channel synergistic attention (SCSA) module.
Figure 5. Normalized confusion matrices: baseline YOLOv11n versus enhanced YOLOv11n-DSU.
Figure 6. Loss convergence curves.
Figure 7. Model deployment workflow.
Figure 8. Cucumber disease grading and detection system interface.
Figure 9. Comparison of detection performance.
Table 1. Detailed Information of the Final Dataset.

| Data Subset    | Total Labels | Total Images |
|----------------|--------------|--------------|
| Training Set   | 23,008       | 5466         |
| Validation Set | 2876         | 683          |
| Test Set       | 2876         | 683          |
| Total          | 28,760       | 6832         |
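
For orientation, the counts in Table 1 correspond to an approximate 80/10/10 split of the 6832 images. A minimal sketch of such a split follows; the file-naming scheme and random seed are hypothetical and not the authors' preprocessing script.

import random

paths = [f"images/img_{i:05d}.jpg" for i in range(6832)]  # hypothetical file names
random.seed(42)
random.shuffle(paths)

n_train, n_val = 5466, 683  # counts from Table 1
train_set = paths[:n_train]
val_set = paths[n_train:n_train + n_val]
test_set = paths[n_train + n_val:]
print(len(train_set), len(val_set), len(test_set))  # 5466 683 683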
Table 2. Cucumber Disease Severity Classification and Sample Distribution.

| Disease Category  | Disease Grade | Diagnostic Characteristics                               | Training Set | Validation Set | Test Set | Total |
|-------------------|---------------|----------------------------------------------------------|--------------|----------------|----------|-------|
| Downy Mildew      | Grade 0       | Asymptomatic                                             | 1488         | 186            | 186      | 1860  |
| Downy Mildew      | Grade 1       | Lesion area < 10% of leaf surface                        | 1492         | 187            | 187      | 1866  |
| Downy Mildew      | Grade 3       | Lesion area covering 10–25% of leaf surface              | 1308         | 164            | 164      | 1636  |
| Downy Mildew      | Grade 5       | Lesion area covering 25–50% of leaf surface              | 1370         | 171            | 171      | 1712  |
| Downy Mildew      | Grade 7       | Lesion area covering 50–75% of leaf surface              | 1248         | 156            | 156      | 1560  |
| Downy Mildew      | Grade 9       | Lesion area > 75% of leaf surface                        | 1252         | 157            | 157      | 1566  |
| Angular Leaf Spot | Grade 0       | Asymptomatic                                             | 1488         | 186            | 186      | 1860  |
| Angular Leaf Spot | Grade 1       | Necrotic spots present without expansion                 | 1485         | 186            | 185      | 1856  |
| Angular Leaf Spot | Grade 3       | Lesion area < 20% of leaf surface                        | 1813         | 226            | 227      | 2266  |
| Angular Leaf Spot | Grade 5       | Lesion area covering 20–33% of leaf surface              | 1824         | 228            | 228      | 2280  |
| Angular Leaf Spot | Grade 7       | Lesion area covering 33–67% of leaf surface              | 1610         | 201            | 201      | 2012  |
| Angular Leaf Spot | Grade 9       | Lesion area > 67% of leaf surface                        | 1426         | 178            | 178      | 1782  |
| Powdery Mildew    | Grade 0       | Asymptomatic                                             | 1488         | 186            | 186      | 1860  |
| Powdery Mildew    | Grade 1       | Lesion area < 33% of leaf surface                        | 1490         | 186            | 186      | 1862  |
| Powdery Mildew    | Grade 3       | Lesion area covering 33–67% of leaf surface              | 1325         | 165            | 166      | 1656  |
| Powdery Mildew    | Grade 5       | Lesion area > 67% of leaf surface                        | 1350         | 169            | 169      | 1688  |
| Powdery Mildew    | Grade 7       | Dense powdery layer with marginal browning               | 1202         | 150            | 150      | 1502  |
| Powdery Mildew    | Grade 9       | Necrotic area > 67% of leaf surface with severe browning | 1325         | 166            | 165      | 1656  |
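
Where an implementation needs to turn a measured lesion-area ratio into the grades of Table 2, the mapping is a simple threshold ladder. A minimal sketch for downy mildew follows; the function name and the assumption that the lesion-area ratio has already been computed (e.g., from a segmentation mask) are illustrative only.

def downy_mildew_grade(lesion_ratio: float) -> int:
    """Map lesion area / leaf area (in [0, 1]) to the downy mildew grade in Table 2."""
    if lesion_ratio == 0:
        return 0  # asymptomatic
    if lesion_ratio < 0.10:
        return 1
    if lesion_ratio < 0.25:
        return 3
    if lesion_ratio < 0.50:
        return 5
    if lesion_ratio < 0.75:
        return 7
    return 9  # lesion area > 75% of leaf surface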
Table 3. Training Configuration for Cucumber Disease Severity Grading.

| Parameter             | Value     |
|-----------------------|-----------|
| Total Epochs          | 500       |
| Batch Size            | 16        |
| Input Size            | 640 × 640 |
| Optimizer             | SGD       |
| Momentum              | 0.937     |
| Initial Learning Rate | 0.01      |
| Weight Decay          | 0.0005    |
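
For reference, if training were run through the standard Ultralytics interface (the text does not name the exact toolchain), the configuration in Table 3 would correspond to a call along the following lines; the model and dataset YAML names are hypothetical.

from ultralytics import YOLO

model = YOLO("yolov11n-dsu.yaml")  # hypothetical config for the modified network
model.train(
    data="cucumber_disease.yaml",  # hypothetical dataset description file
    epochs=500,
    batch=16,
    imgsz=640,
    optimizer="SGD",
    momentum=0.937,
    lr0=0.01,
    weight_decay=0.0005,
)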
Table 4. Ablation Study Results.

| Exp. No. | DEConv | SCSA | Unified-IoU | P/%  | R/%  | mAP50/% | mAP50–95/% | GFLOPs | Weights |
|----------|--------|------|-------------|------|------|---------|------------|--------|---------|
| 1        | ×      | ×    | ×           | 81.1 | 75.7 | 85.6    | 77.4       | 6.5    | 5.7     |
| 2        | √      | ×    | ×           | 85.8 | 80.1 | 89.9    | 80.7       | 24.2   | 5.7     |
| 3        | ×      | √    | ×           | 83.5 | 73.2 | 86.1    | 78.4       | 6.3    | 5.6     |
| 4        | ×      | ×    | √           | 85.9 | 79.7 | 89.7    | 79.2       | 6.3    | 5.5     |
| 5        | √      | √    | ×           | 86.8 | 80.1 | 90.8    | 81.4       | 24.2   | 5.7     |
| 6        | √      | ×    | √           | 83.5 | 83.7 | 90.9    | 78.8       | 24.2   | 5.7     |
| 7        | ×      | √    | √           | 78.5 | 71.9 | 82.6    | 71.5       | 6.3    | 5.6     |
| 8        | √      | √    | √           | 87.2 | 85.7 | 93.5    | 88.3       | 5.8    | 5.5     |

Note: √ indicates the module is used; × indicates it is not.
Table 5. Comparison of Severity-wise mAP Performance: Baseline vs. Enhanced YOLOv11n.

| Disease Category  | Disease Grade | mAP50/% (11n) | mAP50/% (11n-DSU) | mAP50–95/% (11n) | mAP50–95/% (11n-DSU) |
|-------------------|---------------|---------------|-------------------|------------------|----------------------|
| Downy Mildew      | Grade 0       | 87.9          | 93.9              | 81.9             | 91.6                 |
| Downy Mildew      | Grade 1       | 83.5          | 94.2              | 78.7             | 90.4                 |
| Downy Mildew      | Grade 3       | 82.8          | 92.0              | 77.0             | 88.8                 |
| Downy Mildew      | Grade 5       | 85.6          | 89.0              | 78.9             | 84.5                 |
| Downy Mildew      | Grade 7       | 85.3          | 89.6              | 78.5             | 83.7                 |
| Downy Mildew      | Grade 9       | 87.0          | 94.6              | 80.7             | 88.7                 |
| Angular Leaf Spot | Grade 0       | 87.9          | 93.9              | 81.9             | 91.6                 |
| Angular Leaf Spot | Grade 1       | 75.1          | 90.8              | 64.1             | 83.5                 |
| Angular Leaf Spot | Grade 3       | 87.7          | 93.7              | 76.6             | 88.1                 |
| Angular Leaf Spot | Grade 5       | 83.0          | 93.8              | 73.3             | 89.2                 |
| Angular Leaf Spot | Grade 7       | 86.8          | 94.4              | 78.5             | 89.0                 |
| Angular Leaf Spot | Grade 9       | 91.2          | 96.5              | 82.5             | 91.5                 |
| Powdery Mildew    | Grade 0       | 87.9          | 93.9              | 81.9             | 91.6                 |
| Powdery Mildew    | Grade 1       | 87.9          | 94.8              | 81.9             | 91.5                 |
| Powdery Mildew    | Grade 3       | 89.2          | 96.6              | 82.9             | 91.4                 |
| Powdery Mildew    | Grade 5       | 87.5          | 96.6              | 77.4             | 90.4                 |
| Powdery Mildew    | Grade 7       | 90.8          | 91.6              | 81.0             | 85.5                 |
| Powdery Mildew    | Grade 9       | 81.8          | 94.1              | 72.5             | 85.1                 |
| All               | —             | 85.6          | 93.5              | 77.4             | 88.3                 |
Table 6. Performance Comparison of Different Loss Functions.

| Loss Function | P/%  | R/%  | mAP50/% | mAP50–95/% |
|---------------|------|------|---------|------------|
| CIoU          | 86.7 | 80.1 | 90.8    | 81.4       |
| GIoU          | 86.8 | 80.8 | 91.3    | 79.9       |
| DIoU          | 86.1 | 81.2 | 90.9    | 79.4       |
| EIoU          | 87.8 | 80.1 | 91.4    | 80.4       |
| SIoU          | 84.1 | 81.5 | 90.5    | 78.9       |
| Unified-IoU   | 87.2 | 85.7 | 93.5    | 88.3       |
Table 7. Performance Comparison of YOLO Models for Multi-Disease Severity Grading in Cucumbers.

| Model              | P/%  | R/%  | mAP50/% | mAP50–95/% | GFLOPs | Weights |
|--------------------|------|------|---------|------------|--------|---------|
| YOLOv5n            | 71.3 | 70.6 | 77.5    | 66.9       | 5.8    | 4.7     |
| YOLOv8n            | 83.7 | 77.2 | 87.8    | 77.6       | 6.8    | 5.7     |
| YOLOv11n           | 81.1 | 75.7 | 85.6    | 77.4       | 6.5    | 5.7     |
| YOLOv12n           | 87.9 | 86.4 | 93.4    | 81.6       | 24.2   | 5.7     |
| Vision Transformer | 79.3 | 71.1 | 82.0    | 71.4       | 4.1    | 3.8     |
| Faster R-CNN       | 73.0 | 68.3 | 77.3    | 69.4       | 6.1    | 5.2     |
| YOLOv11n-DSU       | 87.2 | 85.7 | 93.5    | 88.3       | 5.8    | 5.5     |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
