Article

Dual-Phase Severity Grading of Strawberry Angular Leaf Spot Based on Improved YOLOv11 and OpenCV

College of Engineering, China Agricultural University, 17 Qinghua East Road, Haidian District, Beijing 100083, China
* Author to whom correspondence should be addressed.
Plants 2025, 14(11), 1656; https://doi.org/10.3390/plants14111656
Submission received: 9 April 2025 / Revised: 20 May 2025 / Accepted: 28 May 2025 / Published: 29 May 2025
(This article belongs to the Section Crop Physiology and Crop Production)

Abstract

Phyllosticta fragaricola-induced angular leaf spot causes substantial economic losses in global strawberry production, necessitating advanced severity assessment methods. This study proposed a dual-phase grading framework integrating deep learning and computer vision. The enhanced You Only Look Once version 11 (YOLOv11) architecture incorporated a Content-Aware ReAssembly of FEatures (CARAFE) module for improved feature upsampling and a squeeze-and-excitation (SE) attention mechanism for channel-wise feature recalibration, resulting in the YOLOv11-CARAFE-SE model for the severity assessment of strawberry angular leaf spot. Furthermore, an OpenCV threshold segmentation algorithm operating on the H channel of the HSV color space achieved accurate lesion segmentation. A disease severity grading standard for strawberry angular leaf spot was established based on the ratio of lesion area to leaf area. In addition, specialized software for the assessment of disease severity was developed based on the improved YOLOv11-CARAFE-SE model and OpenCV-based algorithms. Experimental results show that, compared with the baseline YOLOv11, performance is significantly improved: box mAP@0.5 increased by 1.4% to 93.2%, mask mAP@0.5 increased by 0.9% to 93.0%, inference time was shortened by 0.4 ms to 0.9 ms, and the computational load was reduced by 1.94% to 10.1 GFLOPS. Moreover, the two-stage grading framework achieved an average accuracy of 94.2% in grading selected strawberry angular leaf spot samples, supporting real-time field diagnostics and high-throughput phenotypic analysis for resistance breeding programs. This work demonstrates the feasibility of rapidly estimating the severity of strawberry angular leaf spot and establishes a robust technical framework for strawberry disease management under field conditions.

1. Introduction

As a globally cultivated fruit crop valued for its rich content of sugars, vitamins, and minerals, strawberry (Fragaria × ananassa) has garnered significant research attention in agronomy, genomics, and nutrition sciences due to its susceptibility to pathogens that severely impact its yield and nutritional quality [1,2]. In strawberry production, disease is a major threat to quality and yield. Common strawberry diseases include powdery mildew, gray mold, anthracnose, and root rot [3]. Among these, angular leaf spot is one of the most common and significant diseases of strawberry. In the early stage of the disease, numerous black spots appear on the leaf margins, followed by water-soaked, red-brown irregular lesions on the lower leaf surface; in the later stage, the lesions gradually expand and coalesce across the leaf surface [4]. If strawberry angular leaf spot is not controlled in a timely manner, it can cause growing-point necrosis and plant death, resulting in a decline in quality, a reduction in yield, and economic losses. The timely detection of diseases and real-time precision spraying are effective means of disease control. Disease severity has traditionally been assessed by visual inspection by trained personnel; however, this method is laborious, costly, time-consuming, and prone to human error. Therefore, a more effective and high-throughput method for field assessment of the disease is urgently required [5,6].
Computer vision has been widely implemented in agricultural research, notably in crop phenotyping and disease evaluation [7,8,9]. R. Meena Prakash et al. [10] proposed a crop disease detection and classification method based on transfer learning and an optimized convolutional neural network (CNN). Similarly, R. Thyagaraj et al. [11] proposed a plant leaf disease classification method based on an improved support vector machine (SVM): through image preprocessing, segmentation, and feature extraction combined with an SVM classifier, high-precision classification of plant leaf diseases was achieved, with a test accuracy of 95.0%. Additionally, traditional computer vision techniques remain relevant; for instance, S. Arivazhagan et al. [12] utilized K-means clustering to segment diseased leaf regions, achieving 94.0% accuracy in classifying plant leaf diseases using texture features. More recently, J. G. A. Barbedo [13] combined K-means clustering with SVM to identify plant diseases from lesion spots under field conditions, reporting accuracies of up to 95.0% and highlighting the enduring utility of classical methods in practical agricultural settings.
In recent years, with the rapid development of deep learning, innovative technologies have been continuously integrated into the field of image recognition. Deep learning offers the advantages of fast recognition speed and high accuracy [14,15]. Deep learning-based object detection algorithms for plant diseases can be divided into two-stage models, represented by the region-based convolutional neural network (R-CNN) series, and one-stage models, represented by the YOLO series [16]. These techniques have been widely used in the identification of pests and diseases. Shafik et al. [17] proposed a new hybrid convolutional neural network (Inception–Xception CNN) for the identification of plant diseases; the network combines the advantages of the Inception and Xception architectures and achieves high-precision recognition of plant leaf diseases through multi-scale feature extraction and deep supervised learning, with excellent performance reported on multiple plant disease datasets. Khan et al. [18] proposed a real-time apple leaf disease detection system based on deep learning; by improving the Faster R-CNN model and combining the convolutional block attention module (CBAM) with the ultra-lightweight dynamic upsampling operator (DySample), the detection accuracy and real-time performance of the model were significantly improved, and the system was successfully applied to the real-time detection of apple leaf diseases. Roy et al. [19] proposed an enhanced YOLOv4 model based on DenseNet that achieves efficient detection of mango growth stages in complex environments by optimizing feature propagation and reuse mechanisms and combining an improved PANet structure to retain fine-grained information. Cardellicchio et al. [20] designed a single-stage detector based on improved YOLOv5 to identify phenotypic traits of tomato plants (such as nodes, fruits, and flowers) and showed high detection accuracy on complex datasets characterized by small targets, high similarity, and close color matching. Olisah et al. [21] proposed a multi-input convolutional neural network ensemble classifier (MCE), optimized with a pre-trained VGG16 model, that can effectively identify the subtle features of blackberry maturity. Chen et al. [22] proposed an improved strawberry maturity detection algorithm based on the CES-YOLOv8 network structure; by replacing part of the C2f modules in the YOLOv8 backbone with the ConvNeXt V2 module and introducing an ECA attention mechanism, its feature representation ability was further improved, and the accuracy of the improved model in complex environments increased by 4.8%.
This study proposed a dual-phase severity classification method for strawberry angular leaf spot that uses an improved YOLOv11 to segment strawberry leaves and OpenCV threshold segmentation to segment the lesion regions. The specific objectives of this study are as follows: (1) a dual-phase diagnostic framework is proposed, integrating YOLOv11-based leaf segmentation with threshold-based disease spot segmentation, which significantly reduces background interference while maintaining computational efficiency; (2) an optimized YOLOv11 model achieves pixel-level segmentation of strawberry leaves and accurately classifies healthy versus diseased ones; (3) an OpenCV threshold segmentation algorithm efficiently detects lesions; (4) a quantitative grading system is established based on the ratio of diseased to total leaf area and validated on established datasets; (5) a PyQt5 (V5.15.11) software application incorporates the enhanced YOLOv11-CARAFE-SE model for practical disease diagnosis and phenotypic analysis in precision breeding.
To the best of our knowledge, this study represents the first application of a cascaded segmentation framework—integrating YOLOv11 with threshold segmentation—to the severity grading of strawberry angular leaf spot, enabling automated disease severity assessment in complex field scenarios.

2. Results

2.1. Model Training

To compare the performance of various models, this study developed six primary segmentation models based on annotated strawberry disease images: YOLOv8, YOLOv10, YOLOv11, YOLOv11-SE, YOLOv11-CARAFE, and YOLOv11-CARAFE-SE, with YOLOv11-CARAFE-SE serving as the enhanced model. Figure 1 illustrates the performance evaluation curves of the YOLOv11-CARAFE-SE model. Specifically, the figure includes the precision–recall curve and recall–confidence curve for box segmentation, as well as the precision–recall curve and recall–confidence curve for mask segmentation. The recall–confidence relationship demonstrates robust performance, maintaining a recall of 98% for all classes even at the highest confidence thresholds, thereby indicating reliable predictions with a high certainty. Specifically, YOLOv8 and YOLOv10 were utilized to evaluate performance improvements within YOLOv11, while YOLOv11-SE and YOLOv11-CARAFE were employed to assess the contributions of the SE and CARAFE modules to the YOLOv11 model. Figure 2a presents the training loss values of each model on the strawberry diseased leaf segmentation task as the number of epochs increases, and Figure 2b depicts the corresponding validation loss values as a function of epochs on the validation set. To facilitate a more granular analysis of the learning dynamics during the terminal phase of training, a magnified view of both training and validation loss curves, specifically from epoch 190 to 200, is presented. Observation of this magnified segment reveals that while the training loss of the proposed YOLOv11-CARAFE-SE model is marginally elevated compared to some baseline models, this discrepancy is minimal. Critically, this is accompanied by a competitive validation loss and, most importantly, a demonstrably superior mAP. This confluence of metrics suggests that the enhanced model has likely acquired more generalizable feature representations and exhibits a greater robustness against overfitting to the training data. Overall, the loss functions of all the models gradually decrease with increasing epochs, ultimately stabilizing; however, after stabilization, the differences in loss values among the models become relatively minor.

2.2. Ablation Experiments

To further assess the effectiveness of the individual modules within the YOLOv11-CARAFE-SE model, this study selected the YOLOv11n model as the baseline for comparison experiments. Figure 2 illustrates the loss curves of the various models, demonstrating that all converged after 200 epochs. An ablation study was conducted on the test set by sequentially integrating each module into the baseline network, in conjunction with a detailed analysis of the dataset. The corresponding results are summarized in Table 1. The comparative data indicate that the YOLOv11 model yields significant improvements relative to the YOLOv8 model across multiple performance metrics. In comparison with the original YOLOv11n model, the inclusion of the SE module notably enhances recall, albeit with a potential reduction in precision; importantly, it results in a significant improvement in mAP@0.5. Conversely, the CARAFE module exerts a minimal influence on recall while enhancing precision and markedly boosting mAP@0.5. Ultimately, by integrating both the SE and CARAFE modules, the enhanced YOLOv11-CARAFE-SE model preserves a stable precision, with box precision increasing slightly from 88.1% to 88.3%, while mask precision exhibits a negligible decrease from 88.4% to 88.2%. In contrast, recall is significantly improved, with box recall rising from 86.0% to 87.2% and mask recall from 86.3% to 87.3%. Moreover, mAP@0.5 demonstrates a marked enhancement, with box mAP@0.5 increasing from 91.8% to 93.2% and mask mAP@0.5 from 92.1% to 93.0%. Additionally, the inference time is reduced from 1.3 ms to 0.9 ms, and the computational load is reduced from 10.3 GFLOPS to 10.1 GFLOPS.

2.3. Impact of Different Attention Mechanisms on the Model

This study incorporated three attention mechanisms—SE, CBAM, and Context Aggregation—into the original YOLOv11 model and conducted comparative experiments, as presented in Table 2. All the models were evaluated using identical parameter configurations and under consistent conditions. According to the experimental results, when all the models were constrained to a computational load of 10.4 GFLOPS, each attention mechanism enhanced the box mAP@0.5. Notably, the improvement was most pronounced with the SE mechanism, whereas both CBAM and Context Aggregation resulted in a decrease in mask mAP@0.5. Overall, the experimental findings indicate that the SE attention mechanism demonstrates superior performance in this scenario.

2.4. Effects of Different Upsampling Methods on the Model

The original YOLOv11 model employs nearest neighbor interpolation for the upsampling. The introduced CARAFE and DySample modules are enhancements built upon nearest neighbor upsampling and bilinear interpolation, respectively. To enable a more rigorous comparison, the upsampling method in the original YOLOv11 model was altered from nearest neighbor interpolation to bilinear interpolation. According to the experimental results presented in Table 3, the enhanced model incorporating the CARAFE module demonstrated the most significant performance improvements across all the evaluated metrics—achieving an mAP of 93.1%—while also reducing GFLOPS to a minimum of 10.1.

2.5. Performance of the Improved Model in Strawberry Angular Leaf Spot Leaf Segmentation

To validate the effectiveness of the improved model, this study compared the detection outcomes of the original YOLOv11 model and the optimized YOLOv11-CARAFE-SE model under real-world conditions. Figure 3 presents three representative detection outcomes as examples. Strawberry leaves that were mis-segmented by the original YOLOv11 model were correctly segmented by the improved YOLOv11-CARAFE-SE model. The experimental results indicated that, in complex natural backgrounds, the YOLOv11-CARAFE-SE model extracts disease features more accurately, achieving higher segmentation accuracy and prediction confidence than the original YOLOv11 model. By incorporating the CARAFE and SE modules into YOLOv11, their synergistic effect enhanced the model’s feature extraction capabilities, thereby significantly improving its detection performance.

2.6. Disease Spot Segmentation Based on OpenCV and Disease Severity Classification

Figure 4 illustrates the flowchart of the disease spot segmentation process implemented using OpenCV. After validating the method’s effectiveness, this study performed disease spot segmentation on a training set comprising 2225 leaf images, resulting in images that exclusively display the diseased portions, along with their corresponding area ratio data. Given that data augmentation was applied to simulate real-world variations in shooting angles, lighting intensity, and other factors, and considering that segmentation was performed only on the leaf with the highest prediction confidence in each captured image, the leaf identified as having the highest prediction confidence may differ between the original image and its four augmented versions. To ensure the reliability of the disease severity classification thresholds, inconsistent experimental data were excluded from the analysis. This procedure yielded 485 high-consistency samples that were used to establish the severity classification thresholds.
A histogram (Figure 5) was constructed to illustrate the frequency distribution of the proportion of lesion area to total leaf area (disease severity ratio). The disease severity ratio was defined within the range of [0, 1], with an interval of 0.05. Additionally, to provide a clearer visualization of the data distribution trend, a kernel density estimation (KDE) curve was overlaid, with the bandwidth set to 0.8. Analysis of the KDE curve indicated a decreasing trend in sample count as the disease severity ratio increased, suggesting that leaves with mild disease symptoms were more prevalent, whereas severely infected leaves were relatively few. Furthermore, distinct changes in the slope of the KDE curve were observed at disease severity ratios of 0.10, 0.35, and 0.55, signifying notable variations in the rate of decline; consequently, these points were selected as classification thresholds for disease severity levels. Further observations revealed that leaves within the same disease severity category exhibited consistent morphological and color characteristics. Leaves with mild infections typically retained a higher degree of greenness and had fewer lesions, whereas those with severe infections often exhibited symptoms such as chlorosis and wilting. Moreover, the number of severely diseased leaves was relatively low, possibly due to natural shedding or the removal of heavily infected leaves during field management.
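To make this threshold-selection procedure concrete, the following minimal Python sketch builds the 0.05-bin histogram and overlays a kernel density estimate over the lesion-area ratios; the input file name, the use of scipy.stats.gaussian_kde, and the way the bandwidth factor of 0.8 is passed are illustrative assumptions rather than the study's actual implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde
import matplotlib.pyplot as plt

# ratios: disease severity ratio (lesion area / leaf area) for each of the
# 485 high-consistency samples; loaded here from a hypothetical file.
ratios = np.loadtxt("severity_ratios.csv")

# Histogram over [0, 1] with the 0.05 bin width described in the text.
bins = np.arange(0.0, 1.0001, 0.05)
counts, edges = np.histogram(ratios, bins=bins)

# Kernel density estimate of the distribution; the bandwidth factor here
# stands in for the paper's reported setting of 0.8.
kde = gaussian_kde(ratios, bw_method=0.8)
xs = np.linspace(0.0, 1.0, 500)
density = kde(xs)

# Plot histogram and KDE; slope changes around 0.10, 0.35, and 0.55 were
# used in the study as severity grading thresholds.
plt.bar(edges[:-1], counts, width=0.05, align="edge", alpha=0.5, label="histogram")
plt.plot(xs, density * len(ratios) * 0.05, label="KDE (scaled to counts)")
for t in (0.10, 0.35, 0.55):
    plt.axvline(t, linestyle="--", color="gray")
plt.xlabel("Disease severity ratio")
plt.ylabel("Sample count")
plt.legend()
plt.show()
```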
Based on these findings, a classification standard for strawberry leaf disease severity was established (see Table 4), aiming to provide a scientific basis for disease assessment and monitoring, thereby optimizing disease control strategies.
In this study, the 485 valid samples containing angular leaf spot were selected as the validation target. Initially, disease severity was manually pre-classified based on the observed symptoms. Subsequently, automatic classification was performed for each severity level using the OpenCV-based algorithm. The predicted disease severity was then validated against the manually labeled severity to compute the classification accuracy. The results are presented in Table 5, where “Correct Grading” denotes the number of images correctly classified, “Sample” represents the total number of images in each category, and “Accuracy” is defined as the ratio of the former to the latter.

3. Discussion

This study proposed a dual-phase classification approach, integrating YOLOv11-CARAFE-SE for leaf detection and segmentation with OpenCV-based threshold segmentation for disease spot identification, which achieved the automated severity assessment of strawberry angular leaf spot. The task addressed here consists of three main parts: leaf segmentation, disease spot segmentation, and disease severity classification. The results showed that the YOLOv11-CARAFE-SE model achieved excellent performance in accurately detecting and segmenting diseased strawberry leaves under field conditions, providing a robust foundation for the subsequent disease spot segmentation. The OpenCV-based threshold segmentation method successfully differentiated disease spots from healthy tissue on the segmented leaves, enabling the establishment of four distinct severity levels based on the proportion of diseased area to total leaf area. Based on this dual-phase approach, excellent disease severity classification results were achieved, representing a significant advancement in automated plant disease assessment. Significant improvements were achieved in mitigating background interference and reducing the computational load; however, limitations persist in the disease spot segmentation process, primarily due to the influence of lighting conditions.
Specifically, under uniform illumination in diffuse lighting conditions, the HSV color space is generally considered more robust to changes in light intensity compared to RGB. However, it exhibits certain limitations under complex, non-uniform, and extreme lighting variations, such as backlighting and shadows. Although the vast majority of actual field photography occurs under uniform lighting, during the performance testing of our algorithm in this study, we considered some extreme cases. For instance, when shooting specifically under backlit conditions, green leaves often appear yellowish, resulting in a decrease in their H (hue) value, making them difficult to distinguish from areas with mild disease. This issue can be reasonably addressed by adjusting the shooting angle to avoid direct backlighting. When shooting under shadowed conditions, the H value remains relatively stable, while the S (saturation) and V (value) values decrease. Therefore, in our threshold segmentation algorithm, the threshold ranges for S and V were adjusted to be relatively wide, as long as it did not significantly compromise the segmentation performance. Nevertheless, these potential issues remind us that future research could incorporate algorithms such as adaptive thresholding and low-light image enhancement to better cope with extreme conditions.
At present, the research on strawberry diseases and insect pests is primarily focused on detecting various disease types, with relatively few studies addressing the assessment of disease severity [23]. For example, Nguyen et al. [24] proposed a strawberry leaf disease classification method based on multi-task U-Net for the detection of gray mold, powdery mildew, tip burn, and healthy leaves, where the model using a VGG16 backbone demonstrated the highest effectiveness, achieving a classification accuracy of 99.18%. Nguyen et al. [25] developed a model based on visual transformers that achieved the classification recognition of seven types of strawberry diseases through data augmentation and transfer learning techniques, reaching an accuracy of 92.7%. Karki et al. [26] studied the performance of different pre-trained models using transfer learning to identify various strawberry diseases in deep convolutional neural networks. The target diseases included angular leaf spot disease, anthracnose, gray mold, and powdery mildew on fruits and leaves. The results showed that ResNet-50 achieved the highest accuracy, reaching 94.4%. Kumar et al. [27] proposed a model that combined convolutional neural networks and support vector machines, featuring three convolutional layers, three max pooling layers, and a fully connected layer with ReLU for feature extraction. The CNN was used to identify discriminative features, which were then classified using an SVM classifier. The classification accuracy for strawberry leaf diseases reached 95%.
However, the existing disease identification technology has not yet addressed the core requirements of high-throughput phenotyping and precision breeding strategies for plant resistance research. By breeding different strawberry varieties and studying the distribution of strawberry angular leaf spot, it is possible to screen varieties with disease-resistant traits. Therefore, the core task of this study is to identify the diseased regions and classify the degree of disease for strawberry angular leaf spot. The accuracy of the proposed method is higher than that of most of the studies listed in Table 6, which summarizes representative research on disease severity classification using deep learning and computer vision methods. In a recent study, Vats et al. [28] combined CNN analysis capabilities with federated learning to detect different severity levels of tea plant diseases, achieving an accuracy of up to 97%. Liu et al. [29] utilized DeepLabV3+, PSPNet, and UNet to assess apple Alternaria leaf blotch severity across four levels (0: healthy, 1: mild, 2: moderate, 3: severe), achieving a 92.8% accuracy. Liu et al. [30] developed a deep learning-based application whose underlying model used MobileNetV2-DeepLabV3 for leaf segmentation in the first stage and ResNet50-DeepLabV3 for lesion segmentation in the second stage; it achieved a mean Intersection over Union (MIoU) of 98.65% for leaf segmentation and an 86.08% MIoU for lesion segmentation.
The results indicate that the developed algorithm shows considerable promise for accurately classifying the severity of strawberry angular leaf spots. Nevertheless, it is important to note that this study is preliminary; future research will therefore focus on developing algorithms that maintain their effectiveness even when applied to datasets more representative of diverse, real-world field conditions. For instance, firstly, our data collection, primarily focused on the early vegetative growth stage, was based on evidence suggesting this is when young, rapidly expanding leaves are most susceptible to angular leaf spot, making the data valuable for understanding early disease characteristics and for breeding programs. However, this targeted approach means we did not systematically evaluate the disease’s appearance across distinctly different vegetation stages, which can vary and impact the model’s robustness. Secondly, while our general daylight data acquisition proved effective, we did not specifically target potentially optimal but narrower observation windows, such as early morning with dew or after high humidity, which might enhance the visibility of certain bacterial symptoms. Thirdly, other vegetation conditions—like plant density, leaf wetness, or wind-induced movement—can significantly impact image quality and were not exhaustively controlled or analyzed in this initial phase. The subsequent phase of this study will address these challenges. Future work will focus on developing a portable device equipped with a built-in RGB camera that captures images in real time and transmits them wirelessly to a microcomputer. Following model processing, the device will provide real-time evaluations of disease severity. A further development is to integrate the device into a mobile robot deployed in orchards. By evaluating the disease severity of strawberry leaves, the robot can precisely control the pesticide dosage. The device should be lightweight and low-cost. Furthermore, a remote detection and consultation system is planned to broaden the application of deep learning-based image segmentation technology in agriculture, thereby providing essential support for promoting agricultural green development, protecting the environment, and ensuring food safety.

4. Materials and Methods

4.1. Datasets

The experimental dataset was constructed from two data sources. The primary source consisted of 227 field-collected strawberry leaf images captured by our research team. These images were acquired during daylight hours (typically between 9:00 a.m. and 4:00 p.m.) to ensure adequate natural illumination and to represent typical field observation conditions. Data collection for this initial study focused primarily on strawberry plants at the early vegetative growth stage, as this is often when angular leaf spot becomes more prevalent, making the data more valuable for guiding the breeding of disease-resistant varieties. To enhance sample diversity and improve the model’s generalizability, we included 410 angular leaf spot images from a publicly available dataset provided by Afzaal et al. of the Artificial Intelligence Lab at the Department of Computer Science and Engineering, Chungbuk National University, South Korea [31]. This dataset contains agricultural field images captured under natural lighting conditions, ensuring high reliability. A key advantage stems from its collection in real-world fields and greenhouses, which inherently introduces significant variability, including diverse backgrounds, complex field conditions, and varying illumination. The dataset was formatted in YOLO format and annotated for strawberry leaf segmentation using AnyLabeling (V0.4.10) software, which supports assisted annotation. To facilitate model training, the labeled images were divided into training and validation sets at a 7:3 ratio. Furthermore, following a 2:1 ratio between the size of the pre-augmentation validation set and the test set, 79 diseased images and 13 normal images (92 images in total) were selected as the test set. Additionally, various data augmentation techniques were applied to improve the YOLOv11 model’s generalizability, diversify the training data, and mitigate overfitting; these included brightness adjustment, horizontal flipping, the addition of noise, and image translation. Finally, the experimental dataset comprised 3185 strawberry leaf images, including 2665 diseased images and 520 healthy images, as shown in Table 7.
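As a concrete illustration of the four augmentation operations listed above, the OpenCV/NumPy sketch below generates a brightened, flipped, noisy, and translated copy of an input image; the gain, noise level, shift distances, and file names are placeholder assumptions, and the corresponding YOLO segmentation labels would also need to be transformed for the flip and translation, which is omitted here.

```python
import cv2
import numpy as np

def augment(image: np.ndarray) -> dict:
    """Return the four augmented variants described in the text.
    All parameter values (gain, noise level, shift) are illustrative."""
    h, w = image.shape[:2]

    # 1) Brightness adjustment: scale pixel intensities and clip to [0, 255].
    brighter = cv2.convertScaleAbs(image, alpha=1.2, beta=10)

    # 2) Horizontal flip.
    flipped = cv2.flip(image, 1)

    # 3) Additive Gaussian noise.
    noise = np.random.normal(0, 10, image.shape).astype(np.float32)
    noisy = np.clip(image.astype(np.float32) + noise, 0, 255).astype(np.uint8)

    # 4) Translation by (tx, ty) pixels with an affine warp.
    tx, ty = int(0.05 * w), int(0.05 * h)
    M = np.float32([[1, 0, tx], [0, 1, ty]])
    shifted = cv2.warpAffine(image, M, (w, h))

    return {"brightness": brighter, "flip": flipped, "noise": noisy, "shift": shifted}

if __name__ == "__main__":
    img = cv2.imread("leaf.jpg")  # hypothetical input image
    for name, aug in augment(img).items():
        cv2.imwrite(f"leaf_{name}.jpg", aug)
```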

4.2. Strawberry Leaf Segmentation Method Based on Improved YOLOv11

4.2.1. YOLOv11

Given that the direct application of OpenCV-based image processing to raw field-collected images is susceptible to environmental variations such as illumination fluctuations, occlusions, and background noise [32,33], this study employed an improved YOLOv11 model to segment individual strawberry leaves. YOLO (You Only Look Once), introduced by Joseph Redmon et al. in 2016 [34], is an object detection system that utilizes a single neural network. Building upon object detection, YOLOv5 [35] introduced segmentation functionality. Through iterative optimization, YOLOv11 achieves the following key innovations in both its model architecture and training strategies.
(1)
The C3k2 module is an enhanced design derived from the traditional C3 module. It provides enhanced feature extraction capabilities by integrating variable convolutional kernels and channel separation strategies. In the shallow layers of the network, when the c3k parameter is set to False, the C3k2 module becomes functionally equivalent to the standard C2f module. When the c3k parameter is set to True, the Bottleneck module is replaced with the C3 module, as illustrated in Figure 6a;
(2)
The proposal of the C2PSA mechanism integrates a multi-head attention mechanism within the C2 framework. This mechanism is cascaded after the spatial pyramid fast pooling (SPPF) module, as illustrated in Figure 6b;
(3)
The classification detection head within the original decoupled head has been enhanced by incorporating two depthwise separable convolutions (DWConvs). This modification significantly reduces both the parameter count and the computational complexity, as shown in Figure 6c (a minimal sketch of a depthwise separable convolution is given after this list);
(4)
Significant modifications were made to the model’s depth and width parameters. Furthermore, YOLOv11 offers multiple variants with different scaling factors, allowing the flexibility to meet diverse requirements. In this experiment, the YOLOv11n model was chosen as the base model for further improvements due to its lower parameter count and faster inference speed, making it particularly well suited for deployment in embedded agricultural equipment scenarios. To facilitate the comparison of models, we incorporated YOLOv8 [36], YOLOv9 [37], and YOLOv10 [38].
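To illustrate why the depthwise separable convolutions mentioned in item (3) reduce the parameter count, the following PyTorch sketch shows a generic DWConv block; it is a textbook formulation provided for illustration only and is not taken from the YOLOv11 source code.

```python
import torch
import torch.nn as nn

class DWConv(nn.Module):
    """Depthwise separable convolution: a depthwise conv (one filter per
    input channel) followed by a 1x1 pointwise conv that mixes channels."""
    def __init__(self, c_in: int, c_out: int, k: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, k, padding=k // 2, groups=c_in, bias=False)
        self.pointwise = nn.Conv2d(c_in, c_out, 1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Parameter comparison for 256 -> 256 channels with a 3x3 kernel:
# standard convolution:       256 * 256 * 3 * 3 = 589,824 weights
# depthwise separable (above): 256 * 3 * 3 + 256 * 256 = 67,840 weights (about 8.7x fewer)
```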

4.2.2. SE Attention

In the detection and segmentation of diseased strawberry leaves, the complex background environment may cause substantial interference. Moreover, the lesions characteristic of strawberry angular leaf spot occupy only a small fraction of the image, thereby reducing the accuracy of disease detection and leaf segmentation. To address this challenge, incorporating attention mechanisms is a promising approach to help the model isolate target regions containing critical information from a multitude of irrelevant background areas. In this study, the SE attention mechanism [39] was employed; to further assess its effectiveness, three distinct attention mechanisms were compared: SE, CBAM [40], and Context Aggregation [41]. As illustrated in Figure 7, SE attention is a typical channel attention mechanism. Let the input feature map be U ∈ R^(C × H × W). The attention generation process is as follows (a minimal code sketch is provided after the three steps below).
(1)
Global statistics extraction: the channel description vector is obtained by applying mean pooling across spatial dimensions, as shown in Equation (1) [39]:
$$z_C = F_{sq}(u_C) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_C(i, j)$$
where $z_C$ represents the result of global average pooling for the $C$-th channel, $H$ and $W$ denote the height and width of the feature map, and $u_C(i, j)$ signifies the feature value at position $(i, j)$ in the $C$-th channel. The outcome is a vector $z = [z_1, z_2, \ldots, z_C]$ of length $C$, which encapsulates the global information of each channel.
(2)
Dynamic channel calibration: the gating mechanism is used to learn the nonlinear relationship between channels, as shown in Equation (2) [39]:
$$s = \sigma\left(W_2\,\delta\left(W_1 z\right)\right)$$
where $W_1 \in \mathbb{R}^{\frac{C}{r} \times C}$ is the dimension-reduction weight matrix, $W_2 \in \mathbb{R}^{C \times \frac{C}{r}}$ is the dimension-expansion weight matrix, $\delta$ denotes the ReLU activation, $\sigma$ denotes the sigmoid function, and the reduction ratio $r$ controls the number of parameters.
(3)
Feature recalibration: the learned channel weights are applied to the original feature map, as shown in Equation (3) [39]:
$$\tilde{x}_C(i, j) = s_C \cdot u_C(i, j)$$
where $s_C$ denotes the learned weight for the $C$-th channel, $u_C$ represents the $C$-th channel of the input feature map, and $\tilde{x}_C$ is the recalibrated channel. The output-weighted feature map is $\tilde{X} \in \mathbb{R}^{C \times H \times W}$.
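The three steps above translate directly into a compact module. The following PyTorch sketch implements Equations (1) to (3); the reduction ratio r = 16 and the use of fully connected layers for W1 and W2 follow the original SE paper and are assumptions with respect to the configuration used in this article.

```python
import torch
import torch.nn as nn

class SEAttention(nn.Module):
    """Squeeze-and-Excitation block implementing Equations (1)-(3)."""
    def __init__(self, channels: int, r: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)       # Eq. (1): global average pooling
        self.excite = nn.Sequential(                 # Eq. (2): W1, ReLU (delta), W2, sigmoid
            nn.Linear(channels, channels // r, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = u.shape
        z = self.squeeze(u).view(b, c)               # z in R^C
        s = self.excite(z).view(b, c, 1, 1)          # channel weights s in (0, 1)
        return u * s                                 # Eq. (3): channel-wise recalibration

# Example: recalibrate a feature map from the Neck before it enters the Head.
x = torch.randn(2, 256, 40, 40)
print(SEAttention(256)(x).shape)  # torch.Size([2, 256, 40, 40])
```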

4.2.3. CARAFE Module

CARAFE [42] is an upsampling method proposed to enhance feature maps in convolutional neural networks. Upsampling is commonly used to produce higher-resolution feature maps, which enables the network to capture more detailed information. In the context of the recognition of strawberry angular leaf spot disease in this study—where many lesions are small and challenging to detect—the improved upsampling method can effectively enhance the model’s performance. CARAFE leverages the underlying content information at each spatial location to predict reassembly kernels and subsequently reassembles the features within a predefined local neighborhood. By incorporating content-aware information, CARAFE deploys adaptive and optimized reassembly kernels at various spatial locations, thereby outperforming mainstream upsampling operators. The DySample upsampling method is a technique that leverages a dynamic sampling strategy to enhance the detailed representation of low-resolution feature maps. In the experiments, DySample was introduced for comparative analysis [43].
CARAFE consists of two main components, the kernel prediction module and the content-aware reassembly module, which operate in two sequential steps. In the first step, a reassembly kernel is predicted for each target location; in the second, the predicted kernel is employed to reassemble the features. Given an input feature map of dimensions H × W × C and an upsampling factor σ, a new feature map of dimensions σH × σW × C is produced. Specifically, the kernel prediction module generates location-specific kernels based on the input feature content, which are subsequently applied by the content-aware reassembly module to reassemble the feature map. Figure 8 illustrates the fundamental framework of CARAFE.
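For readers who prefer code to prose, the sketch below reproduces the two CARAFE steps (kernel prediction followed by content-aware reassembly) in PyTorch; the compressed channel width, encoder kernel size, and reassembly kernel size follow the defaults of the original CARAFE paper and are assumptions with respect to the exact settings used here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CARAFE(nn.Module):
    """Simplified CARAFE upsampler: kernel prediction + content-aware reassembly."""
    def __init__(self, c: int, scale: int = 2, k_up: int = 5, k_enc: int = 3, c_mid: int = 64):
        super().__init__()
        self.scale, self.k_up = scale, k_up
        self.compress = nn.Conv2d(c, c_mid, 1)                      # channel compressor
        self.encode = nn.Conv2d(c_mid, scale ** 2 * k_up ** 2,      # content encoder
                                k_enc, padding=k_enc // 2)
        self.shuffle = nn.PixelShuffle(scale)                       # spread kernels over the upsampled grid

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Step 1: predict one normalized k_up x k_up reassembly kernel per target pixel.
        kernels = F.softmax(self.shuffle(self.encode(self.compress(x))), dim=1)  # (b, k_up^2, sh, sw)
        # Step 2: reassemble each target pixel as a weighted sum over the k_up x k_up
        # neighborhood of its source location in the input feature map.
        patches = F.unfold(x, self.k_up, padding=self.k_up // 2)                 # (b, c*k_up^2, h*w)
        patches = patches.view(b, c * self.k_up ** 2, h, w)
        patches = F.interpolate(patches, scale_factor=self.scale, mode="nearest")
        patches = patches.view(b, c, self.k_up ** 2, self.scale * h, self.scale * w)
        return (kernels.unsqueeze(1) * patches).sum(dim=2)                       # (b, c, sh, sw)

# Example: upsample a 40x40 feature map to 80x80.
feat = torch.randn(1, 128, 40, 40)
print(CARAFE(128)(feat).shape)  # torch.Size([1, 128, 80, 80])
```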

4.2.4. Proposed Model

The greenhouse cultivation environment is highly complex. Common challenges in leaf segmentation include similar textures between the target and background, target occlusion, and similarities among different target types. To enhance the model’s accuracy, this study optimized the YOLOv11 framework by integrating the SE and CARAFE modules, thereby improving the overall performance. Through extensive experimentation, the improved YOLOv11-CARAFE-SE model was developed by inserting an SE module between the Neck and Head and replacing the conventional upsampling module with the CARAFE module. Due to the fundamental similarities in strawberry leaf and angular leaf spot disease morphology across varieties, coupled with the enhanced feature learning capabilities imparted by these modifications, the improved YOLOv11-CARAFE-SE model is designed for robust and broad applicability in detecting angular leaf spot across different strawberry cultivars. The overall architecture of the improved YOLOv11-CARAFE-SE model is illustrated in Figure 9.

4.3. OpenCV-Based Lesion Segmentation Method and Disease Severity Grading

Accurate disease grading in complex field environments presents significant challenges due to background interference and variations in leaf appearance (Figure 10a). To address this, this research first employs YOLO to extract individual leaf regions, effectively isolating them from the background (Figure 10b). However, precise lesion segmentation remains difficult because the disease spots are small, irregular, and numerous, which deep learning networks struggle to detect accurately [44]. Additionally, manual annotation for training such models is labor-intensive and costly (Figure 10c). Given the distinct color contrast between lesions and healthy leaf tissue, we leverage an OpenCV-based threshold segmentation method to efficiently extract lesion areas without the need for extensive training data, ensuring both accuracy and computational efficiency.
Diseased leaf images were selected from the dataset, and an enhanced YOLOv11 segmentation model was employed to generate high-quality binarized mask images. In the resulting masks, white pixels accurately delineated the leaf regions, whereas black pixels indicated the background. By applying a bitwise AND operation to the mask and the original image, a segmented color image of the diseased leaf was obtained. By leveraging the model’s robustness in complex scenarios—including multi-scale feature fusion and adaptive noise suppression—the segmentation boundaries remained sharp and accurate even under challenging conditions, such as uneven lighting or sensor noise, thereby eliminating the need for additional post-processing steps.
The processed leaf images were converted from the RGB color space to the HSV color space, which facilitated threshold segmentation in the subsequent steps. The HSV color space consists of three components: hue (H), saturation (S), and value (V). The HSV color space, especially its H component, is relatively robust to changes in light intensity under diffuse, uniform illumination, a property not as strongly observed in the RGB color space. Thus, threshold setting primarily focused on adjusting the H component. Extensive experiments, guided by empirical testing and the research objectives, were conducted to determine the optimal threshold ranges, as summarized in Table 8.
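The masking and thresholding steps described in this subsection can be sketched as follows with OpenCV; the numeric HSV bounds in the code are placeholders, since the thresholds actually used are those reported in Table 8.

```python
import cv2
import numpy as np

def segment_lesions(image_bgr: np.ndarray, leaf_mask: np.ndarray) -> np.ndarray:
    """Return a binary lesion mask for one leaf.
    `leaf_mask` is the binarized mask predicted by the YOLOv11-CARAFE-SE model
    (255 on the leaf, 0 on the background). The HSV bounds below are
    placeholders; the study's thresholds are listed in Table 8."""
    # Keep only the leaf region (bitwise AND of the mask and the original image).
    leaf = cv2.bitwise_and(image_bgr, image_bgr, mask=leaf_mask)

    # Convert to HSV; in OpenCV, H lies in [0, 179], while S and V lie in [0, 255].
    hsv = cv2.cvtColor(leaf, cv2.COLOR_BGR2HSV)

    # Threshold mainly on H (non-green hues), with deliberately wide S and V ranges.
    lower = np.array([0, 40, 40])     # placeholder lower bound
    upper = np.array([34, 255, 255])  # placeholder upper bound
    lesion_mask = cv2.inRange(hsv, lower, upper)

    # Restrict the detected lesions to the leaf area.
    return cv2.bitwise_and(lesion_mask, leaf_mask)
```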
The proportion of the diseased area relative to the total leaf area was employed as the grading metric, and strawberry angular leaf spot was accordingly classified into distinct severity levels. After validating the effectiveness of the lesion segmentation algorithm, the entire dataset was analyzed, and classification thresholds were established based on the distribution characteristics of this ratio. The calculation is given by Equation (4) [45]:
$$P = \frac{S_D}{S_L}$$
where $S_L$ denotes the total area of the strawberry leaf, $S_D$ denotes the segmented diseased area, and $P$ is the proportion of the diseased area to the total leaf area.
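A minimal implementation of Equation (4) and of the subsequent threshold-based grading is shown below; the breakpoints 0.10, 0.35, and 0.55 are those identified in Section 2.6, while the level labels in the comments are illustrative because Table 4 is not reproduced here.

```python
import numpy as np

def severity_ratio(leaf_mask: np.ndarray, lesion_mask: np.ndarray) -> float:
    """Equation (4): P = S_D / S_L, computed from pixel counts of the two binary masks."""
    s_l = float(np.count_nonzero(leaf_mask))    # total leaf area S_L (pixels)
    s_d = float(np.count_nonzero(lesion_mask))  # diseased area S_D (pixels)
    return s_d / s_l if s_l > 0 else 0.0

def severity_level(p: float) -> int:
    """Map the ratio P to a severity level using the thresholds 0.10, 0.35, and 0.55
    identified from the KDE analysis; the level labels are illustrative."""
    if p <= 0.10:
        return 1  # mild
    elif p <= 0.35:
        return 2  # moderate
    elif p <= 0.55:
        return 3  # severe
    return 4      # very severe
```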
The severity classification of strawberry angular leaf spot was conducted in dual phases. In the first phase, each input image was processed by a pre-trained YOLO model for binary classification. If disease-free leaves were detected, they were directly classified as healthy samples; conversely, an infected leaf image was further processed by the trained segmentation model to generate a prediction mask, which was subsequently fused with the original image and preprocessed. In the second phase, for the positive samples identified by the YOLO model, an OpenCV-based threshold segmentation algorithm was applied to compute the ratio of the diseased area to the total leaf area, which was then used to determine the severity level. The research flowchart is illustrated in Figure 11. The pre-screening function of the YOLO model effectively reduced the computational burden.

4.4. A Detection Platform for Strawberry Angular Leaf Spot Severity Based on PyQt5

To implement the improved YOLOv11-CARAFE-SE model in practical applications, this study developed an efficient and user-friendly software application based on PyQt5 for assessing the severity of strawberry angular leaf spot, as depicted in Figure 12. The software’s user interface was designed using PyQt5 to create an intuitive graphical user interface (GUI), and the developed model, along with its runtime environment, was packaged using the PyInstaller tool. Upon launching the software, users could select images from a designated folder for analysis. By clicking the “Start Detection” button, the software initiated either single-image or batch detection. The detection results are displayed on the interface, with the corresponding disease severity level automatically generated. Furthermore, users have the option to review the detection results for each image individually. Finally, by clicking the “Export to Excel” button, the results could be exported in Excel format.
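The interaction flow of the software (select images, run detection, export an Excel report) can be outlined with the PyQt5 skeleton below; the widget layout, the detect() placeholder, and the pandas-based Excel export are assumptions used for illustration and do not reproduce the released application.

```python
import sys
import pandas as pd
from PyQt5.QtWidgets import (QApplication, QWidget, QPushButton, QLabel,
                             QFileDialog, QVBoxLayout)

def detect(path: str) -> dict:
    """Placeholder for the YOLOv11-CARAFE-SE + OpenCV grading pipeline."""
    return {"image": path, "severity_level": 0, "lesion_ratio": 0.0}

class SeverityApp(QWidget):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("Strawberry Angular Leaf Spot Severity")
        self.paths, self.results = [], []
        self.status = QLabel("Select images to begin.")
        open_btn = QPushButton("Open Images")
        run_btn = QPushButton("Start Detection")
        export_btn = QPushButton("Export to Excel")
        open_btn.clicked.connect(self.open_images)
        run_btn.clicked.connect(self.run_detection)
        export_btn.clicked.connect(self.export_excel)
        layout = QVBoxLayout(self)
        for widget in (self.status, open_btn, run_btn, export_btn):
            layout.addWidget(widget)

    def open_images(self):
        self.paths, _ = QFileDialog.getOpenFileNames(
            self, "Select images", "", "Images (*.jpg *.png)")
        self.status.setText(f"{len(self.paths)} image(s) selected.")

    def run_detection(self):
        self.results = [detect(p) for p in self.paths]
        self.status.setText(f"Processed {len(self.results)} image(s).")

    def export_excel(self):
        if self.results:
            pd.DataFrame(self.results).to_excel("severity_results.xlsx", index=False)
            self.status.setText("Results exported to severity_results.xlsx.")

if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = SeverityApp()
    window.show()
    sys.exit(app.exec_())
```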

4.5. Equipment

The entire process of model training and validation was conducted on a personal computer (CPU: Intel® Core™ i9 14900K @6.00 GHz; GPU: NVIDIA GeForce RTX 4080 16G). The environment was configured as shown in Table 9. The training environment was built using PyTorch (V2.5.0), with the training parameters shown in Table 10, including an input image size of 640 × 640 pixels, a batch size of 16, and a training duration of 200 epochs. The maximum learning rate was set to 0.001, and the optimizer used was Adaptive Moment Estimation (Adam) [46]. A weight decay of 0.0005 was applied to mitigate overfitting. The training process was executed with 32 threads to enhance computational efficiency. The final hyperparameters were established through an iterative process of adjustment, with each candidate configuration evaluated on a dedicated validation set; the configuration reported here yielded the best validation performance.
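Assuming the Ultralytics training interface was used, the reported hyperparameters correspond to a call of the following form; the model and dataset YAML file names are placeholders (the authors' model YAML is available from the Mendeley repository cited in the Data Availability Statement).

```python
from ultralytics import YOLO

# Load the modified segmentation architecture from its YAML definition;
# "yolov11-carafe-se-seg.yaml" is a placeholder filename.
model = YOLO("yolov11-carafe-se-seg.yaml")

# Train with the hyperparameters reported in this section.
model.train(
    data="strawberry_leaf.yaml",  # placeholder dataset definition
    imgsz=640,
    batch=16,
    epochs=200,
    optimizer="Adam",
    lr0=0.001,
    weight_decay=0.0005,
    workers=32,
    device=0,
)

# Evaluate box/mask precision, recall, and mAP on the validation split.
metrics = model.val()
```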

4.6. Model Evaluation

To comprehensively evaluate the performance of the model, a set of evaluation metrics—precision (P), recall (R), mean average precision (mAP), and inference time—was selected. Among them, P measures the proportion of samples predicted as positive that are indeed true positives, as defined in Equation (5) [47]. R represents the proportion of actual positive samples that are correctly identified by the model, as shown in Equation (6) [47]. Average precision (AP) provides a comprehensive assessment based on both precision and recall by calculating the area under the precision–recall (P-R) curve. mAP is obtained by averaging the AP values across all classes, as expressed in Equation (7) [47]. In particular, mAP@0.5 represents the average precision at an IoU threshold of 0.5, while mAP@0.5:0.95 is calculated by averaging the AP over multiple IoU thresholds, ranging from 0.5 to 0.95, thereby providing a more comprehensive evaluation of the model’s performance. A higher mAP indicates superior performance. Furthermore, in the context of image segmentation, the four metrics, P, R, mAP@0.5, and mAP@0.5:0.95, are computed separately at both the box level, which denotes the approximate location and category of the object, and the mask level, which delineates the pixel-level boundary of the target. Inference time is defined as the duration required for the model to perform inference on a single image; a lower inference time indicates a faster processing speed [48].
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$\mathrm{mAP} = \frac{1}{C} \sum_{i=1}^{C} AP_i$$
where TP, FP, FN, and TN stand for true positives, false positives, false negatives, and true negatives, respectively; $AP_i$ is the average precision for the $i$-th class, and $C$ is the total number of classes.
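For completeness, Equations (5) to (7) map onto a few lines of code; the counts and per-class AP values in the example are hypothetical, and the AP values themselves would be obtained as the area under each class's precision-recall curve.

```python
import numpy as np

def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0   # Eq. (5)

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0   # Eq. (6)

def mean_ap(ap_per_class: list) -> float:
    return float(np.mean(ap_per_class))               # Eq. (7)

# Example with hypothetical counts for the "diseased leaf" class:
print(precision(tp=87, fp=12), recall(tp=87, fn=13))
# mAP over the two classes (healthy, diseased), AP values hypothetical:
print(mean_ap([0.94, 0.92]))
```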

5. Conclusions

This research addressed the challenge of accurately evaluating strawberry angular leaf spot disease severity in complex environments by developing a dual-phase classification method integrating YOLOv11 with OpenCV-based algorithms. The improved YOLOv11-CARAFE-SE model enhanced the detection performance (box mAP@0.5 increased from 91.8% to 93.2%, mask mAP@0.5 from 92.1% to 93.0%) while significantly reducing the inference time by 30.8% (from 1.3 ms to 0.9 ms) and computational requirements from 10.3 to 10.1 GFLOPS. The second phase, introducing an OpenCV-based threshold segmentation algorithm with a disease severity classification standard, bridges the gap between detection and practical disease assessment. Furthermore, this dual-phase grading framework could achieve an average accuracy of 94.2% in detecting selected strawberry angular leaf spot samples. These technical advances enable broader deployment on accessible hardware in field conditions. Beyond its technical contributions, this methodology advances sustainable agriculture by facilitating the breeding of disease-resistant varieties, ultimately contributing to reduced economic losses and improved food security, while demonstrating how deep learning and computer vision techniques can be effectively integrated for agricultural applications.

Author Contributions

Y.-X.X.: Conceptualization, Methodology, Software, Verification, Formal Analysis, Investigation, Writing—Draft Preparation, Writing—Review and Editing, Project Management. X.-H.Y.: Software, Formal Analysis, Investigation. Q.Y.: Verification, Investigation, Writing—Draft Preparation, Writing—Review and Editing. Q.-Y.Z.: Conceptualization, Methodology, Writing—Review and Editing, Supervision. W.-H.S.: Conceptualization, Methodology, Resources, Writing—Review and Editing, Supervision, Project Management, Fundraising. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 32371991).

Data Availability Statement

Data are available on request due to privacy. The YAML configuration file for the proposed YOLOv11-CARAFE-SE model, used for training and evaluation, is publicly available in the Mendeley Data repository at https://doi.org/10.17632/R8K5ZZTJCM.1.

Acknowledgments

The authors are grateful for the constructive comments and suggestions provided by all of the authors cited in this article and by the anonymous reviewers.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Edger, P.P.; Poorten, T.J.; VanBuren, R.; Hardigan, M.A.; Colle, M.; McKain, M.R.; Smith, R.D.; Teresi, S.J.; Nelson, A.D.L.; Wai, C.M.; et al. Origin and evolution of the octoploid strawberry genome. Nat. Genet. 2019, 51, 541–547. [Google Scholar] [CrossRef] [PubMed]
  2. Giampieri, F.; Forbes-Hernandez, T.Y.; Gasparrini, M.; Alvarez-Suarez, J.M.; Afrin, S.; Bompadre, S.; Quiles, J.L.; Mezzetti, B.; Battino, M. Strawberry as a health promoter: An evidence based review. Food Funct. 2015, 6, 1386–1398. [Google Scholar] [CrossRef] [PubMed]
  3. Dholi, P.K.; Khatiwada, P.; Basnet, B.; Bhandari, S. An Extensive Review of Strawberry (Fragaria × ananassa) Diseases and Integrated Management Approaches: Current Understanding and Future Directions. Fundam. Appl. Agric. 2023, 8, 655–667. [Google Scholar] [CrossRef]
  4. Montarry, J.; Mimee, B.; Danchin, E.G.J.; Koutsovoulos, G.D.; Ste-Croix, D.T.; Grenier, E. Recent Advances in Population Genomics of Plant-Parasitic Nematodes. Phytopathology 2021, 111, 40–48. [Google Scholar] [CrossRef] [PubMed]
  5. Zhang, W.; Zhu, K.; Wang, Z.; Zhang, H.; Gu, J.; Liu, L.; Yang, J.; Zhang, J. Brassinosteroids function in spikelet differentiation and degeneration in rice. J. Integr. Plant Biol. 2019, 61, 943–963. [Google Scholar] [CrossRef]
  6. Romero-Oraá, R.; García, M.; Oraá-Pérez, J.; López-Gálvez, M.I.; Hornero, R. Effective Fundus Image Decomposition for the Detection of Red Lesions and Hard Exudates to Aid in the Diagnosis of Diabetic Retinopathy. Sensors 2020, 20, 6549. [Google Scholar] [CrossRef]
  7. Shakoor, N.; Lee, S.; Mockler, T.C. High throughput phenotyping to accelerate crop breeding and monitoring of diseases in the field. Curr. Opin. Plant Biol. 2017, 38, 184–192. [Google Scholar] [CrossRef]
  8. Mochida, K.; Koda, S.; Inoue, K.; Hirayama, T.; Tanaka, S.; Nishii, R.; Melgani, F. Computer vision-based phenotyping for improvement of plant productivity: A machine learning perspective. Gigascience 2019, 8, giy153. [Google Scholar] [CrossRef]
  9. Mahlein, A.-K.; Kuska, M.T.; Thomas, S.; Wahabzada, M.; Behmann, J.; Rascher, U.; Kersting, K. Quantitative and qualitative phenotyping of disease resistance of crops by hyperspectral sensors: Seamless interlocking of phytopathology, sensors, and machine learning is needed! Curr. Opin. Plant Biol. 2019, 50, 156–162. [Google Scholar] [CrossRef]
  10. Prakash, R.M.; Vimala, M.; Ramalakshmi, K.; Prakash, M.B.; Krishnamoorthi, A.; Kumari, R.S.S. Crop Disease Detection and Classification with Transfer learning and hyper-parameters optimized Convolutional neural network. In Proceedings of the 2022 Third International Conference on Intelligent Computing Instrumentation and Control Technologies (ICICICT), Kannur, India, 11–12 August 2022; pp. 1608–1613. [Google Scholar]
  11. Thyagaraj, R.; Satheesha, T.Y.; Bhairannawar, S. Plant Leaf Disease Classification Using Modified SVM With Post Processing Techniques. In Proceedings of the 2023 International Conference on Applied Intelligence and Sustainable Computing (ICAISC), Dharwad, India, 16–17 June 2023; pp. 1–4. [Google Scholar]
  12. Selvaraj, A.; Shebiah, N.; Ananthi, S.; Varthini, S. Detection of unhealthy region of plant leaves and classification of plant leaf diseases using texture features. Agric. Eng. Int. CIGR J. 2013, 15, 211–217. [Google Scholar]
  13. Arnal Barbedo, J.G. Plant disease identification from individual lesions and spots using deep learning. Biosys. Eng. 2019, 180, 96–107. [Google Scholar] [CrossRef]
  14. Wang, P.; Fan, E.; Wang, P. Comparative analysis of image classification algorithms based on traditional machine learning and deep learning. Pattern Recognit. Lett. 2021, 141, 61–67. [Google Scholar] [CrossRef]
  15. Archana, R.; Jeevaraj, P.S.E. Deep learning models for digital image processing: A review. Artif. Intell. Rev. 2024, 57, 11. [Google Scholar] [CrossRef]
  16. Attri, I.; Awasthi, L.K.; Sharma, T.P.; Rathee, P. A review of deep learning techniques used in agriculture. Ecol. Inform. 2023, 77, 102217. [Google Scholar] [CrossRef]
  17. Shafik, W.; Tufail, A.; Liyanage De Silva, C.; Awg Haji Mohd Apong, R.A. A novel hybrid inception-xception convolutional neural network for efficient plant disease classification and detection. Sci. Rep. 2025, 15, 3936. [Google Scholar] [CrossRef]
  18. Khan, A.I.; Quadri, S.M.K.; Banday, S.; Shah, J.L. Deep diagnosis: A real-time apple leaf disease detection system based on deep learning. Comput. Electron. Agric. 2022, 198, 107093. [Google Scholar] [CrossRef]
  19. Roy, A.M.; Bhaduri, J. Real-time growth stage detection model for high degree of occultation using DenseNet-fused YOLOv4. Comput. Electron. Agric. 2022, 193, 106694. [Google Scholar] [CrossRef]
  20. Cardellicchio, A.; Solimani, F.; Dimauro, G.; Petrozza, A.; Summerer, S.; Cellini, F.; Renò, V. Detection of tomato plant phenotyping traits using YOLOv5-based single stage detectors. Comput. Electron. Agric. 2023, 207, 107757. [Google Scholar] [CrossRef]
  21. Olisah, C.C.; Trewhella, B.; Li, B.; Smith, M.L.; Winstone, B.; Whitfield, E.C.; Fernández, F.F.; Duncalfe, H. Convolutional neural network ensemble learning for hyperspectral imaging-based blackberry fruit ripeness detection in uncontrolled farm environment. Eng. Appl. Artif. Intell. 2024, 132, 107945. [Google Scholar] [CrossRef]
  22. Chen, Y.K.; Xu, H.B.; Chang, P.Y.; Huang, Y.Y.; Zhong, F.L.; Jia, Q.; Chen, L.X.; Zhong, H.Q.; Liu, S. CES-YOLOv8: Strawberry Maturity Detection Based on the Improved YOLOv8. Agronomy 2024, 14, 1353. [Google Scholar] [CrossRef]
  23. Shi, T.; Liu, Y.; Zheng, X.; Hu, K.; Huang, H.; Liu, H.; Huang, H. Recent advances in plant disease severity assessment using convolutional neural networks. Sci. Rep. 2023, 13, 2336. [Google Scholar] [CrossRef] [PubMed]
  24. Nguyen, D.K.; Choi, Y.S.; Lee, J.H.; Tran, M.T.; Xin, X. An effective deep learning model for classifying diseases on strawberry leaves and estimating their severity based on the multi-task U-Net. Multimed. Tools Appl. 2024, 1–22. [Google Scholar] [CrossRef]
  25. Nguyen, H.T.; Tran, T.D.; Nguyen, T.T.; Pham, N.M.; Nguyen Ly, P.H.; Luong, H.H. Strawberry disease identification with vision transformer-based models. Multimed. Tools Appl. 2024, 83, 73101–73126. [Google Scholar] [CrossRef]
  26. Karki, S.; Basak, J.K.; Tamrakar, N.; Deb, N.C.; Paudel, B.; Kook, J.H.; Kang, M.Y.; Kang, D.Y.; Kim, H.T. Strawberry disease detection using transfer learning of deep convolutional neural networks. Sci. Hortic. 2024, 332, 113241. [Google Scholar] [CrossRef]
  27. Kumar, R.R.; Chauhan, R.; Dhondiyal, S.A.; Singh, A. Deep Learning-Driven Diagnosis A CNN-SVM Hybrid Approach for Automated Detection of Strawberry Leaf Diseases. In Proceedings of the 2024 IEEE 3rd World Conference on Applied Intelligence and Computing (AIC), Gwalior, India, 27–28 July 2024; pp. 1467–1470. [Google Scholar]
  28. Vats, S.; Kukreja, V.; Mehta, S. Tea Leaf Disease Detection: Federated Learning CNN Used for Accurate Severity Analysis. In Proceedings of the 2024 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (IATMSI), Gwalior, India, 14–16 March 2024; pp. 1–6. [Google Scholar]
  29. Liu, B.-Y.; Fan, K.-J.; Su, W.-H.; Peng, Y. Two-Stage Convolutional Neural Networks for Diagnosing the Severity of Alternaria Leaf Blotch Disease of the Apple Tree. Remote Sens. 2022, 14, 2519. [Google Scholar] [CrossRef]
  30. Liu, W.; Chen, Y.; Lu, Z.; Lu, X.; Wu, Z.; Zheng, Z.; Suo, Y.; Lan, C.; Yuan, X. StripeRust-Pocket: A Mobile-Based Deep Learning Application for Efficient Disease Severity Assessment of Wheat Stripe Rust. Plant Phenomics 2024, 6, 0201. [Google Scholar] [CrossRef]
  31. Afzaal, U.; Bhattarai, B.; Pandeya, Y.R.; Lee, J. An Instance Segmentation Model for Strawberry Diseases Based on Mask R-CNN. Sensors 2021, 21, 6565. [Google Scholar] [CrossRef]
  32. Kamilaris, A.; Prenafeta-Boldú, F.X. Deep learning in agriculture: A survey. Comput. Electron. Agric. 2018, 147, 70–90. [Google Scholar] [CrossRef]
  33. Bargoti, S.; Underwood, J. Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards. J. Field Robot. 2017, 34, 1039–1060. [Google Scholar] [CrossRef]
  34. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  35. Wang, J.; Chen, Y.; Dong, Z.; Gao, M. Improved YOLOv5 network for real-time multi-scale traffic sign detection. Neural Comput. Appl. 2021, 35, 7853–7865. [Google Scholar] [CrossRef]
  36. Ultralytics. YOLOv8: A State-of-the-Art Object Detection Model. Available online: https://github.com/ultralytics/ultralytics (accessed on 20 August 2023).
  37. Wang, C.-Y.; Yeh, I.-H.; Liao, H.-Y.M. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. In Proceedings of the Computer Vision—ECCV 2024: 18th European Conference, Milan, Italy, 29 September–4 October 2024; Proceedings, Part XXXI. Springer Nature: Cham, Switzerland, 2024; pp. 1–21. [Google Scholar]
  38. Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J.; Ding, G. YOLOv10: Real-Time End-to-End Object Detection. arXiv 2024, arXiv:2405.14458. [Google Scholar]
  39. Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
  40. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
  41. Gao, P.; Lu, J.; Li, H.; Mottaghi, R.; Kembhavi, A. Container: Context Aggregation Network. arXiv 2021, arXiv:2106.01401. [Google Scholar]
  42. Wang, J.; Chen, K.; Xu, R.; Liu, Z.; Loy, C.C.; Lin, D. CARAFE: Content-Aware ReAssembly of FEatures. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3007–3016. [Google Scholar]
  43. Liu, W.; Lu, H.; Fu, H.; Cao, Z. Learning to Upsample by Learning to Sample. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 6027–6037. [Google Scholar]
  44. Liu, Y. An Improved Faster R-CNN for Object Detection. In Proceedings of the 2018 11th International Symposium on Computational Intelligence and Design (ISCID), Hangzhou, China, 8–9 December 2018; Volume 2, pp. 119–123. [Google Scholar]
  45. Mahlein, A.-K. Plant Disease Detection by Imaging Sensors—Parallels and Specific Demands for Precision Agriculture and Plant Phenotyping. Plant Dis. 2016, 100, 241–251. [Google Scholar] [CrossRef] [PubMed]
  46. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference for Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
  47. Padilla, R.; Netto, S.L.; da Silva, E.A.B. A Survey on Performance Metrics for Object-Detection Algorithms. In Proceedings of the 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), Niterói, Brazil, 1–3 July 2020; pp. 237–242. [Google Scholar]
  48. Sze, V.; Chen, Y.-H.; Yang, T.-J.; Emer, J.S. Efficient Processing of Deep Neural Networks: A Tutorial and Survey. Proc. IEEE 2017, 105, 2295–2329. [Google Scholar] [CrossRef]
Figure 1. Performance evaluation curves of the YOLOv11-CARAFE-SE model, demonstrating high accuracy and reliability. (a) Box precision–recall curve; (b) box recall–confidence curve; (c) mask precision–recall curve; (d) mask recall–confidence curve.
Figure 2. Training loss curves (a) and validation loss curves (b) for strawberry leaf segmentation using the six proposed YOLOv11-based models. The loss of the improved YOLOv11-CARAFE-SE model clearly stabilizes in the later training epochs; a magnified view of epochs 190–200 is included to allow closer inspection of this stabilization.
Figure 3. Three segmentation examples comparing the performance of YOLOv11 and YOLOv11-CARAFE-SE in strawberry angular leaf spot leaf segmentation, demonstrating how YOLOv11-CARAFE-SE correctly segments areas where the baseline YOLOv11 model errs. The diseased area marked with a red circle is a false positive (FP): it was predicted but has no corresponding ground truth.
Figure 4. The complete process for calculating strawberry disease severity based on YOLOv11 and OpenCV.
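As a concrete illustration of the first phase in Figure 4, the following minimal Python sketch obtains per-leaf binary masks from an Ultralytics-style segmentation checkpoint; the weight file name, the image path, and the mask post-processing are illustrative assumptions rather than the authors' released code.

```python
import cv2
import numpy as np
from ultralytics import YOLO  # assumes the Ultralytics package is installed

# Placeholder names for a trained segmentation checkpoint and a field image.
model = YOLO("yolov11_carafe_se_seg.pt")
image = cv2.imread("field_image.jpg")

result = model(image)[0]                 # results for the single input image
leaf_masks = []
if result.masks is not None:
    for mask in result.masks.data.cpu().numpy():   # one mask per detected leaf
        mask = cv2.resize(mask.astype(np.uint8),    # bring the mask to image size
                          (image.shape[1], image.shape[0]),
                          interpolation=cv2.INTER_NEAREST)
        leaf_masks.append(mask)

print(f"{len(leaf_masks)} leaf instance(s) segmented")
# Each leaf mask then goes to the HSV lesion segmentation (Table 8) and the
# resulting area ratio to the severity grading (Table 4).
```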
Figure 5. Frequency distribution of the disease area ratio and the classification of severity levels: the disease area ratio spans [0, 1] with an interval of 0.05, and thresholds of 0.1, 0.35, and 0.55 divide the data into four levels.
Figure 6. Three key innovations in YOLOv11 model architecture: C3k2 module (a), C2PSA module (b), and DWConv module (c).
Figure 7. The specific structure of SE Attention.
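To clarify the channel-recalibration structure shown in Figure 7, a compact PyTorch sketch of a squeeze-and-excitation block is given below; the reduction ratio of 16 is the common default from Hu et al. [39] and is an assumption, not necessarily the value used in YOLOv11-CARAFE-SE.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention (after Hu et al. [39])."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # squeeze: global spatial average
        self.fc = nn.Sequential(                 # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.pool(x).view(b, c)       # (B, C) channel descriptor
        w = self.fc(w).view(b, c, 1, 1)   # per-channel weights in (0, 1)
        return x * w                      # recalibrate feature maps channel-wise

x = torch.randn(2, 64, 40, 40)
print(SEBlock(64)(x).shape)  # torch.Size([2, 64, 40, 40])
```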
Figure 8. The specific structure of the CARAFE Module.
Figure 9. The proposed YOLOv11-CARAFE-SE network structure diagram: Backbone (a), Neck (b), Head (c), SPPF (d), and CBS (e).
Figure 10. A difficult example of strawberry angular leaf spot labeling: (a) the original image, (b) the segmented leaf, and (c) the ideal annotation of the lesion area, highlighting the difficulty of manual annotation.
Figure 11. The research flowchart of this study: a CNN segments individual strawberry leaves, OpenCV image processing segments the lesion regions, and the disease severity grade is output.
Figure 12. A PyQt5-based software application designed for strawberry angular leaf spot disease classification.
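For orientation only, the snippet below sketches the skeleton of a PyQt5 application of the kind shown in Figure 12: a button opens a leaf image and a label reports the result. The class name, widget layout, and the placeholder where the grading pipeline would be called are illustrative assumptions, not the paper's software.

```python
import sys
from PyQt5.QtWidgets import (QApplication, QFileDialog, QLabel,
                             QPushButton, QVBoxLayout, QWidget)

class GradingWindow(QWidget):
    """Minimal GUI skeleton: choose an image, display a (placeholder) grade."""

    def __init__(self):
        super().__init__()
        self.setWindowTitle("Strawberry angular leaf spot grading (sketch)")
        self.result_label = QLabel("No image analyzed yet")
        open_button = QPushButton("Open leaf image")
        open_button.clicked.connect(self.open_image)
        layout = QVBoxLayout(self)
        layout.addWidget(open_button)
        layout.addWidget(self.result_label)

    def open_image(self):
        path, _ = QFileDialog.getOpenFileName(self, "Select image", "",
                                              "Images (*.png *.jpg)")
        if path:
            # The real software would run the detection and grading pipeline here.
            self.result_label.setText(f"Selected: {path} (grading pipeline goes here)")

if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = GradingWindow()
    window.show()
    sys.exit(app.exec_())
```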
Table 1. Comparative results of ablation experiments on the test set.
Methods | Box Precision /% | Box Recall /% | Box mAP@0.5 /% | Mask Precision /% | Mask Recall /% | Mask mAP@0.5 /% | Inference Time /ms | GFLOPS
YOLOv8 | 86.1 | 84.6 | 91.1 | 86.2 | 84.7 | 91.0 | 1.1 | 12.0
YOLOv9 | 88.1 | 81.2 | 91.0 | 88.5 | 80.5 | 90.3 | 1.9 | 53.2
YOLOv10 | 92.5 | 82.2 | 91.7 | 91.5 | 82.9 | 91.9 | 0.9 | 10.6
YOLOv11 | 88.1 | 86.0 | 91.8 | 88.4 | 86.3 | 92.1 | 1.3 | 10.3
YOLOv11-SE | 86.6 | 87.6 | 92.3 | 86.6 | 87.6 | 92.6 | 1.2 | 10.3
YOLOv11-CARAFE | 89.2 | 86.0 | 93.0 | 89.2 | 86.0 | 93.0 | 0.9 | 10.1
YOLOv11-CARAFE-SE | 88.3 | 87.2 | 93.2 | 88.2 | 87.3 | 93.0 | 0.9 | 10.1
Table 2. Comparative experiment for different attention mechanisms.
Methods | Box Precision /% | Box Recall /% | Box mAP@0.5 /% | Mask Precision /% | Mask Recall /% | Mask mAP@0.5 /% | GFLOPS
YOLOv11 (No Attention) | 88.4 | 86.0 | 91.9 | 88.7 | 86.2 | 92.0 | 10.3
SE | 86.7 | 87.6 | 92.5 | 86.7 | 87.6 | 92.6 | 10.4
CBAM | 91.2 | 82.0 | 92.3 | 91.1 | 81.9 | 91.8 | 10.4
Context Aggregation | 90.1 | 84.0 | 92.0 | 91.4 | 83.8 | 91.8 | 10.4
Table 3. Comparative experiment for different upsampling methods.
Methods | Box Precision /% | Box Recall /% | Box mAP@0.5 /% | Mask Precision /% | Mask Recall /% | Mask mAP@0.5 /% | GFLOPS
YOLOv11 (nearest) | 88.3 | 85.8 | 91.8 | 88.7 | 86.0 | 92.0 | 10.3
YOLOv11 (bilinear) | 90.9 | 80.5 | 91.6 | 90.8 | 80.5 | 91.7 | 10.3
CARAFE | 89.3 | 86.1 | 93.1 | 89.3 | 86.1 | 93.1 | 10.1
DySample | 89.1 | 83.9 | 91.8 | 89.4 | 84.1 | 92.3 | 10.4
Table 4. Strawberry angular leaf spot disease severity classification standard proposed in this study, defining four disease levels based on thresholds of 0.1, 0.35, and 0.55.
Severity Level | Symptoms | Disease Area Ratio | Data Quantity
Level 1 | Small water-soaked spots visible on the back of the leaf | (0, 0.10] | 138
Level 2 | Spot area expands, leaf edges appear dried and dehydrated | (0.10, 0.35] | 217
Level 3 | Large disease spots appear but do not completely merge to cover the entire leaf | (0.35, 0.55] | 69
Level 4 | Most of the leaf area is covered with red-brown lesions, which merge into a large patch | (0.55, 1] | 61
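A minimal sketch of how the Table 4 thresholds might be applied in code is given below; the function name and the explicit handling of a healthy leaf (level 0, cf. Table 6) are illustrative choices rather than code from the paper's software.

```python
def grade_severity(disease_area_ratio: float) -> int:
    """Map a lesion-area/leaf-area ratio to a severity level per Table 4."""
    if disease_area_ratio <= 0:
        return 0  # healthy leaf (illustrative handling, cf. Table 6)
    elif disease_area_ratio <= 0.10:
        return 1  # small water-soaked spots
    elif disease_area_ratio <= 0.35:
        return 2  # expanding spots, dried leaf edges
    elif disease_area_ratio <= 0.55:
        return 3  # large spots, not yet fully merged
    else:
        return 4  # red-brown lesions covering most of the leaf


print(grade_severity(0.27))  # -> 2
```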
Table 5. Performance of the overall verification, showing the grading accuracy achieved for each severity level.
Severity Level | Correct Grading | Samples | Accuracy (%)
Level 1 | 134 | 139 | 96.4
Level 2 | 192 | 208 | 92.3
Level 3 | 70 | 75 | 93.3
Level 4 | 61 | 63 | 96.8
Total | 457 | 485 | 94.2
Table 6. Comparison of models proposed by different research teams for plant disease severity grading.
References | Plants | Model | Disease Types/Levels | Accuracy
Nguyen et al. [24] | Strawberry | MT-UNet (VGG16 backbone) | Gray Mold, Powdery Mildew, Tip Burn, Healthy | 98.9%
Nguyen et al. [25] | Strawberry | Vision Transformer | Anthracnose Fruit Rot, Flower Blight, Gray Mold, Leaf Spot Disease, Powdery Mildew on Leaves, Powdery Mildew on Fruits | 92.7%
Karki et al. [26] | Strawberry | ResNet-50 | Angular Leaf Spot, Anthracnose, Gray Mold, and Powdery Mildew on Both Fruit and Leaves | 94.4%
Kumar et al. [27] | Strawberry | CNN-SVM | Powdery Mildew, Leaf Scorch, Leaf Blight | 95.0%
Vats et al. [28] | Tea | CNN | (1_V Low) 1–20%, (2_Low) 21–40%, (3_Med) 41–60%, (4_High) 61–80%, (5_V High) 81–100% | 97.0%
Liu et al. [29] | Apple | DeepLabV3+, PSPNet, UNet | 0 (Healthy), 1 (Mild), 2 (Moderate), 3 (Severe) | 92.8%
Liu et al. [30] | Wheat | MobileNetV2-DeepLabV3+ + ResNet50-DeepLabV3+ | IoU score based on health category (IoU-H) | 86.08%
Proposed method | Strawberry | YOLOv11-based | 0 (Healthy), 1 (0, 10%], 2 (10%, 35%], 3 (35%, 55%], 4 (55%, 100%] | 94.2%
Table 7. Strawberry angular leaf spot disease dataset.
Angular Leaf Spot | Training Images | Val Images | Test Images | Total Images
Diseased | 1865 (×5) | 800 (×5) | 79 | 2744
Healthy | 360 (×5) | 160 (×5) | 13 | 533
Table 8. The optimal HSV color space threshold values determined for this segmentation task.
Class | H_max | H_min | S_max | S_min | V_max | V_min
Diseased | 37 | 12 | 103 | 0 | 244 | 100
Healthy | 64 | 38 | 255 | 100 | 200 | 57
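The sketch below shows one way the Table 8 ranges could be applied with OpenCV's cv2.inRange to obtain a lesion mask and the disease area ratio for a single segmented leaf; the function and variable names are illustrative, and the diseased-class values are taken from the reconstructed table, so they should be treated as indicative.

```python
import cv2
import numpy as np

# Ranges following Table 8 for diseased tissue (H, S, V order as used by OpenCV);
# the printed table was reconstructed from a flattened layout, so the exact
# saturation bounds in particular should be treated as indicative.
LOWER = np.array([12, 0, 100])    # H_min, S_min, V_min
UPPER = np.array([37, 103, 244])  # H_max, S_max, V_max

def disease_area_ratio(bgr_image: np.ndarray, leaf_mask: np.ndarray) -> float:
    """Lesion-to-leaf pixel ratio for one segmented leaf (illustrative helper)."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    lesion = cv2.inRange(hsv, LOWER, UPPER)                      # lesion-colored pixels
    lesion_in_leaf = cv2.bitwise_and(lesion, lesion, mask=leaf_mask)
    leaf_pixels = cv2.countNonZero(leaf_mask)
    return cv2.countNonZero(lesion_in_leaf) / leaf_pixels if leaf_pixels else 0.0
```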
Table 9. Configuration of the training environment.
Name | Information
CPU | Intel® Core™ i9 14900K @ 6.00 GHz
GPU | NVIDIA GeForce RTX 4080 16 GB
Operating System | Windows 11
Deep Learning Framework | PyTorch 2.5.0
Programming Language | Python 3.12.7
Integrated Development Environment | VS Code 1.92
Package Management Tool | Anaconda 2.5.2
Table 10. Experimentally determined optimal training parameter settings for YOLOv11-based deep learning models.
Hyperparameter | Value
Input image size | 640 × 640
Batch size | 16
Epochs | 200
Maximum learning rate | 0.001
Optimizer | AdamW
Weight decay | 0.0005
Thread count | 32
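Assuming the models were trained with the Ultralytics framework reported in Table 9, the Table 10 settings could be passed to a training call as sketched below; the model definition file and dataset YAML names are placeholders, and registering the custom CARAFE and SE modules in the framework is omitted here.

```python
from ultralytics import YOLO

# Placeholder file names: a model definition for the modified architecture and a
# dataset description in YOLO segmentation format.
model = YOLO("yolov11-carafe-se-seg.yaml")

model.train(
    data="strawberry_leaf_seg.yaml",
    imgsz=640,              # input image size
    batch=16,               # batch size
    epochs=200,
    lr0=0.001,              # maximum (initial) learning rate
    optimizer="AdamW",
    weight_decay=0.0005,
    workers=32,             # thread count
)
```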
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
