Article

An Unmanned Aerial Vehicle-Based Image Information Acquisition Technique for the Middle and Lower Sections of Rice Plants and a Predictive Algorithm Model for Pest and Disease Detection

1 Guangdong Laboratory for Lingnan Modern Agriculture, College of Engineering, South China Agricultural University, Guangzhou 510642, China
2 Guangdong Provincial Key Laboratory of Agricultural Artificial Intelligence (GDKL-AAI), Guangzhou 510642, China
3 Guangdong Engineering Research Center for Agricultural Aviation Application (ERCAAA), Guangzhou 510642, China
4 Key Laboratory of Key Technology on Agricultural Machine and Equipment (South China Agricultural University), Ministry of Education, Guangzhou 510642, China
* Author to whom correspondence should be addressed.
Agriculture 2025, 15(7), 790; https://doi.org/10.3390/agriculture15070790
Submission received: 25 December 2024 / Revised: 19 February 2025 / Accepted: 20 February 2025 / Published: 7 April 2025
(This article belongs to the Section Digital Agriculture)

Abstract

To address the technical bottleneck in monitoring disease, pest, and weed damage in the middle and lower parts of rice plants, this paper proposes a UAV-based image information acquisition method and a disease prediction algorithm model, which provide an efficient, low-cost solution for the accurate early monitoring of rice diseases and help raise the scientific and intelligent level of agricultural disease prevention and control. First, a UAV image acquisition system was designed and equipped with an automatic telescopic rod, a 360° automatic turntable, and high-definition image sensors to achieve multi-angle, high-precision data acquisition in the middle and lower regions of rice plants. At the same time, a path planning algorithm and an ant colony algorithm were introduced to design the UAV flight path and improve the coverage and stability of image acquisition. For image information processing, this paper proposes a multi-dimensional data fusion scheme that combines RGB, infrared, and hyperspectral data to achieve the deep fusion of information across different bands. For disease prediction, the YOLOv8 object detection algorithm and a lightweight Transformer network are adopted to improve the detection performance for small targets. The experimental results showed that the average accuracy of the YOLOv8 model (mAP@0.5) in detecting rice false smut was 90.13%, much higher than that of traditional methods such as Faster R-CNN and SSD. In addition, 1496 disease images were collected to build a self-constructed dataset, verifying that the system shows good stability and practicality in field environments.

1. Introduction

1.1. Research Background

The roots and stems of rice are where diseases most easily hide and breed, and they are also the areas where rice diseases and pests occur most heavily. Existing rice disease image collection and recognition techniques are mainly based on canopy or high-altitude imagery, whereas image information from the middle and lower parts of the plant can directly reveal the growth and health of rice. Flax leaf spot and rice false smut (Ustilaginoidea virens), as important diseases affecting rice yield and quality, degrade grain quality and cause significant economic losses [1]. Because the symptoms of common rice diseases are hidden in the early stage, traditional disease monitoring methods such as manual visual inspection and ground monitoring have many limitations, making it difficult to meet the needs of large-scale, real-time, precision agriculture. In recent years, the development of UAV technology has provided a new solution for rice disease monitoring: fast, low-cost field data acquisition combined with high-resolution image processing and deep learning algorithms enables the real-time monitoring and disease detection of rice plants [2]. However, current mainstream UAV disease monitoring systems mainly focus on high-altitude image acquisition over the rice canopy, ignoring the detailed information in the middle and lower parts of the plant [3], a region with a high incidence of disease, weed, and insect damage, as well as an important location for disease hiding and early breeding [4].

1.2. Development Status of Farmland Image Information Acquisition Technology

The acquisition of crop phenotypic information, including images and video data, is essential for comprehending the growth status of crops and environmental changes, thereby facilitating informed regulation and management decisions. Traditional manual observation methods are labor-intensive and inefficient, whereas drone-based farmland information acquisition offers a novel approach to agricultural informatization [5]. Li Fugen et al. utilized single-UAV-flight-generated projective images and digital surface models (DSMs) to accurately identify fruit trees [6]. Yao Qing et al. developed a multi-point distributed image acquisition and diagnosis system for agricultural diseases and pests. This system’s image acquisition terminal can efficiently collect images of rice bacterial wilt, achieving an 83.5% accuracy rate in diagnosing the severity of this disease [7]. Yang Tianle et al. collected feature-matched rice disease images based on crawler and intelligent technology. On this basis, feature matching of the image processing module was used to screen the image set and improve the accuracy of image acquisition. Their results indicated that, except for flax leaf spot, the extraction accuracy exceeded 50%, with the scab reaching 72.7% [8]. Current research remains focused on symptom identification and classification, lacking studies on early-stage crop stress without visible symptoms [9]. Mu Yanchen et al. established a rice growth image acquisition and monitoring system for cold regions. By applying digital image processing techniques to compare and analyze real-time rice images from cold regions, they achieved effective rice growth monitoring [10]. In maize small spot detection, Ge Jing used color image segmentation to calculate the spot area ratio as a damage assessment standard [11]. Lang Liying proposed a method using the average red-to-green component ratio to assess cotton blight infection based on color differences in infected leaves [12].
Li Jing developed a regression model linking the red component ratio to the disease index, correlating the spot area percentage with chlorophyll levels, achieving high accuracy in assessing maize small spot disease [13].

1.3. Limitations of Existing UAV and Disease Detection Methods

With the rapid development of UAV remote sensing technology and computer vision algorithms, traditional disease monitoring methods have moved towards the direction of UAV automatic monitoring [14]. However, the existing UAV disease detection methods have the following limitations in practical applications:
(1)
Single acquisition area: Most of the current UAV systems focus on the acquisition of rice canopy images, while the incidence of rice disease is mainly concentrated in the middle and lower parts of the plant, resulting in incomplete disease monitoring results.
(2)
Insufficient image quality and stability: Limited by equipment performance and flight environment, the captured images are susceptible to factors such as light and wind speed, and the image stability and clarity are low.
(3)
Limitations of the algorithm in small-target detection: The lesions of rice false smut are small, densely distributed, and close in color to the background, and traditional target detection algorithms such as Faster R-CNN and SSD perform poorly on such small targets.
(4)
Lack of data fusion and system optimization: Single-RGB-image information cannot capture the early spectral characteristics of rice disease, and does not make full use of infrared, hyperspectral, and other multi-source data.
Therefore, for the accurate monitoring and prediction of rice false smut, it is necessary to design an unmanned aerial vehicle system that jointly optimizes information acquisition in the middle and lower plant regions and small-target detection performance.

1.4. Research Objectives and Contents

(1)
The UAV in this research was equipped with an automatic telescopic rod and a 360° automatic turntable to achieve multi-angle image information acquisition in the middle and lower plant regions. An integrated HD RGB camera, infrared camera, and hyperspectral camera provided multi-dimensional data support.
(2)
A UAV flight path based on a path planning algorithm and ant colony algorithm was designed to improve collection efficiency and coverage.
(3)
RGB, infrared, and hyperspectral data were fused, and multi-source data alignment and fusion were carried out through a deep learning algorithm.
(4)
An improved YOLOv8 target detection model and a lightweight Transformer network were proposed to carry out small-target detection for rice false smut. The feature pyramid network and attention mechanism were combined to improve detection accuracy and stability.
(5)
Field experiments were conducted to verify the image acquisition effect of the system and the performance of the rice disease detection model.

1.5. Technical Architecture and Innovation Points

An automatic telescopic rod and 360° rotating device are designed to break through the limitations of traditional UAV canopy monitoring and obtain more comprehensive disease information. To achieve early, accurate disease detection, RGB, infrared, and hyperspectral data are fused to capture the multi-dimensional spectral characteristics of rice false smut. Combined with the improved YOLOv8 and Transformer, the attention mechanism and feature pyramid network are used to solve the small-target detection problem of rice false smut. The A* algorithm and an ant colony algorithm are used to design the UAV path and improve data acquisition efficiency and coverage. Through the UAV-borne image acquisition system for the middle and lower plant regions and the deep learning disease prediction model, an efficient and stable solution is provided for the accurate monitoring and early prevention and control of rice false smut, making up for the shortcomings of existing technologies in disease monitoring in the middle and lower plant regions; this has important theoretical value and practical application significance.

2. Drone Monitoring and Forecasting System for Rice Disease

2.1. YOLOv8 Model Introduction

YOLOv8 is the latest generation of improved models in the field of object detection, continuing the consistent advantages of the YOLO (You Only Look Once) series, such as efficiency, real-time detection, and light weight, while further improving model performance and application breadth through a number of technological innovations. YOLOv8 inherits the CSPNet (Cross-Stage Partial Network) structure and multi-scale feature fusion technology of YOLOv5 and makes improvements in feature extraction, detection head design, lightweight optimization, etc., so that it can better adapt to small-target detection tasks in different scenarios. It is especially suitable for the small-target recognition of rice disease spots in agricultural disease detection [15].

2.1.1. Anchor-Free Mechanism

YOLOv8 abandons the traditional anchor-based design and directly predicts the center point and size of the target's bounding box [16]. This anchor-free mechanism brings the following advantages:
(1)
The process of anchor frame generation and matching is eliminated, which significantly reduces the number of parameters and computational complexity of the model;
(2)
The anchor-free mechanism is particularly suitable for the detection of small targets such as rice disease spots, avoiding the problem of missing detection and precision reduction caused by mismatching anchor frames;
(3)
The model does not depend on the preset anchor frame size, adapts to more target shapes and scales, and has stronger adaptability.
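As an illustration of the anchor-free prediction described above, the following NumPy sketch (function and variable names are our own, not from the YOLOv8 codebase) decodes per-cell center offsets and predicted sizes into pixel-space bounding boxes:

```python
import numpy as np

def decode_anchor_free(offsets, sizes, stride=8):
    """Decode anchor-free predictions into (x1, y1, x2, y2) boxes.

    offsets: (H, W, 2) predicted center offsets within each grid cell, in [0, 1].
    sizes:   (H, W, 2) predicted box width/height in pixels.
    stride:  pixels per grid cell.
    """
    H, W = offsets.shape[:2]
    gy, gx = np.mgrid[0:H, 0:W]                      # grid-cell indices
    cx = (gx + offsets[..., 0]) * stride             # box center x in pixels
    cy = (gy + offsets[..., 1]) * stride             # box center y in pixels
    w, h = sizes[..., 0], sizes[..., 1]
    boxes = np.stack([cx - w / 2, cy - h / 2,
                      cx + w / 2, cy + h / 2], axis=-1)
    return boxes

# A 4x4 grid, each cell predicting a small lesion-sized box at its center
offsets = np.full((4, 4, 2), 0.5)
sizes = np.full((4, 4, 2), 6.0)                      # 6x6-pixel boxes
boxes = decode_anchor_free(offsets, sizes)
print(boxes[0, 0])                                   # box for the top-left cell
```

Because no anchor shapes are involved, the decoding step has no anchor-matching cost, which is the source of the parameter and computation savings noted in point (1).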

2.1.2. Multi-Scale Feature Fusion

YOLOv8 combines the FPN (feature pyramid network) and PANet (Path Aggregation Network) to realize multi-scale feature fusion [17]. From top to bottom, the FPN structure transfers the high-level semantic features to the low-level, improving the ability to represent the small target features. From the bottom up, detailed information of the bottom level is transmitted to the top level through the PANet structure, and the boundary accuracy and global awareness of the target are enhanced. Multilevel fusion relies on the combination of detailed information and context semantics to detect the spot of rice disease. Multi-scale feature fusion technology significantly improves the adaptability of the model to small targets and complex backgrounds.
The feature fusion formula is as follows:
F_fusion = F_low + Upsample(F_high)
where F_low is the low-level feature, F_high is the high-level feature, and Upsample is the upsampling operation. By integrating multi-layer features, YOLOv8 captures multi-scale information of rice disease.
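The fusion formula above can be sketched in a few lines of NumPy (nearest-neighbor upsampling stands in for the learned upsampling used in practice; names are illustrative):

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Nearest-neighbor upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse(f_low, f_high):
    """F_fusion = F_low + Upsample(F_high), per the formula above."""
    return f_low + upsample_nearest(f_high)

f_low = np.ones((8, 16, 16))      # low-level feature map, finer resolution
f_high = np.ones((8, 8, 8))       # high-level feature map, coarser resolution
fused = fuse(f_low, f_high)
print(fused.shape)                # (8, 16, 16)
```

The fused map keeps the low-level map's resolution while mixing in the coarser semantic signal, which is what makes small lesion targets easier to localize.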

2.1.3. Lightweight Design

In the network design, YOLOv8 introduces a series of lightweight technologies to ensure that the model has low computing resource requirements while maintaining high accuracy [18,19].
(1)
The weight of the convolutional kernel is dynamically adjusted to improve the efficiency of feature extraction.
(2)
Combining the CBAM (Convolutional Block Attention Mechanism) and other modules, channel and spatial feature allocation is optimized to further reduce redundant computation.
(3)
YOLOv8 supports a variety of scale models (Nano, Small, Medium, Large), and the appropriate version can be chosen according to the scene requirements. In this study, the YOLOv8-Small model was selected, which has low parameters and computation amounts, and is suitable for the real-time deployment of UAV rice field monitoring.

2.1.4. Integrated Training and Inference Framework

YOLOv8 provides a highly integrated training and inference framework [20] with the following features: it supports annotated data in various formats (such as the COCO format), enabling data preprocessing, model training, and inference to be completed quickly and easily; it supports multiple hardware platforms (such as GPUs, CPUs, and TPUs) and can be deployed in resource-limited environments with high compatibility; and its toolchain integrates automatic mixed-precision training, weight export, and quantization optimization for model compression and mobile deployment.

2.1.5. Model Optimization in This Study

In this study, the YOLOv8-Small model was selected as the basic network, and the following optimization was carried out according to the characteristics of rice disease spot detection:
(1)
The parameter configuration of the prediction head was designed to make it more suitable for the boundary box prediction of small targets such as rice disease spot.
(2)
The CBAM was introduced to further enhance the ability to distinguish rice false smut lesion features from the background, and the fusion of high-level and low-level features was enhanced to improve detection accuracy in complex scenes.
(3)
The width and depth of the network were adjusted to reduce computing costs while maintaining efficient detection performance; using a smaller convolution kernel (e.g., a 3 × 3 convolution instead of a 5 × 5 convolution) further reduced the number of parameters.
(4)
Through these optimizations, the overall detection performance of the model was significantly improved, making it more suitable for the real-time monitoring needs of the UAV platform.
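The parameter savings from swapping a 5 × 5 convolution for a 3 × 3 convolution, as in item (3), can be checked with a simple count (channel sizes here are arbitrary examples, not the model's actual configuration):

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a single k x k convolution (weights + biases)."""
    return c_in * c_out * k * k + c_out

c_in, c_out = 64, 64
p5 = conv_params(c_in, c_out, 5)          # one 5x5 convolution
p3 = conv_params(c_in, c_out, 3)          # one 3x3 convolution
print(p5, p3, f"{100 * (1 - p3 / p5):.0f}% fewer parameters")
```

For these channel sizes the 3 × 3 layer uses roughly a third of the parameters of the 5 × 5 layer, directly reducing both model size and per-image computation.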

2.1.6. Advantages of YOLOv8 in Monitoring Rice Disease

In the small-target detection task of rice disease, YOLOv8 adopts the anchor-free mechanism and multi-scale feature fusion technology [15,16], and the YOLOv8-Small model can realize efficient inference under limited hardware resources, which is especially suitable for the online detection requirements of UAV platforms. In addition, YOLOv8 has good flexibility and a lightweight design, so it can quickly adapt to different hardware platforms and practical monitoring scenarios. Nevertheless, YOLOv8 may still face challenges in small-target detection tasks, because small targets occupy few pixels in the image and their features are difficult to extract. To improve the performance of YOLOv8 in small-target detection, the following aspects can be improved, as explained with the accompanying formulas.

2.2. Improved Feature Pyramid (FPN)

The feature extraction of small targets is a key difficulty in the detection of rice false smut lesions, especially against a complex field background, where the lesion target is easily disturbed by the chaotic texture and lighting conditions of the rice field environment. YOLOv8 uses a feature pyramid network (FPN) to fuse features at different scales; therefore, to achieve efficient small-target detection and improve the model's ability to represent target features, this paper improves the design of the FPN on the basis of YOLOv8 [21,22].

2.2.1. Feature Pyramid Network (FPN)

The feature pyramid network (FPN) is a technology focusing on multi-scale feature fusion, which can significantly improve the detection capability of small targets. Its core idea is to build a top-down feature pyramid, transfer the semantic information of high-level features to the bottom layer, and combine the details of low-level features, to realize the comprehensive utilization of multi-scale features [22,23]. The workflow of the FPN consists of two main steps, as shown in Figure 1.
(1)
Downward propagation of high-level features: After upsampling, the high-level features are added with the underlying features pixel by pixel to give the underlying features richer semantic information.
(2)
Enhancement in low-level features: While retaining detailed information, low-level features improve feature differentiation by integrating the global semantics of high-level features [23].
The improved feature fusion formula is as follows. Assuming that Pi is the feature map of layer i, U(·) is the upsampling operation, and C(·) is the convolution operation, then
Pi′ = C(Pi) + U(Pi+1′)
where Pi′ is the fused feature map.
Improvement Method:
Use feature maps with higher resolution (e.g., from 1/4 to 1/2 of the input resolution).
Use more efficient feature fusion methods (such as BiFPN).
Through the design of the feature pyramid network, the model can pay attention to the details and context information of the lesions at different levels. For example, the high resolution of the low-level feature map can capture the boundary details of the disease spots, while the low resolution of the high-level feature map provides broader semantic information, such as the overall regional distribution and disease patterns of the rice field. Through top-down feature fusion, the model can pay attention to both fine-grained lesion features and global background information in the detection task of rice disease, thus significantly improving the detection accuracy.
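The top-down fusion loop described above can be sketched as follows (a minimal NumPy illustration: the lateral convolution C(·) is replaced by the identity, and nearest-neighbor upsampling stands in for U(·)):

```python
import numpy as np

def upsample(x, factor=2):
    """Nearest-neighbor upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fpn_top_down(pyramid, conv=lambda x: x):
    """Top-down pass: Pi' = C(Pi) + U(P(i+1)'), from coarsest to finest.

    pyramid: list of (C, H, W) maps ordered fine -> coarse, e.g. [P3, P4, P5].
    conv:    lateral convolution; the identity here for clarity.
    """
    fused = [conv(pyramid[-1])]                   # start from the coarsest level
    for p in reversed(pyramid[:-1]):
        fused.append(conv(p) + upsample(fused[-1]))
    return fused[::-1]                            # back to fine -> coarse order

p3, p4, p5 = np.ones((4, 32, 32)), np.ones((4, 16, 16)), np.ones((4, 8, 8))
out = fpn_top_down([p3, p4, p5])
print([f.shape for f in out])   # [(4, 32, 32), (4, 16, 16), (4, 8, 8)]
```

Each finer level accumulates the semantics propagated down from all coarser levels, which is exactly how boundary detail and global context are combined for lesion detection.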

2.2.2. Convolutional Block Attention Mechanism (CBAM)

The Convolutional Block Attention Mechanism (CBAM) is an efficient attention module that can significantly improve the model’s ability to extract target features, especially for small target detection tasks in complex backgrounds [24]. In the detection of rice disease, there is little difference in color and texture between the lesion target and the surrounding background. The CBAM strengthens the attention to lesion features by modeling the importance of features in channel and space dimensions, respectively, while suppressing the interference of redundant and irrelevant features.
The CBAM module mainly includes the channel attention module and spatial attention module:
(1)
Channel attention module
The function of the channel attention mechanism is to make weighted adjustments according to the importance of each feature channel, that is, to preferentially retain key channel information and weaken unimportant feature channels [25]. In the channel attention module, the CBAM captures the global context information of each channel through global average pooling (GAP) and global maximum pooling (GMP), and then generates channel weights through the two-layer fully connected network, which are activated by the Sigmoid function, and finally adjusts the importance of each channel for input features. Its calculation formula is as follows:
Mc(F) = σ(W1(W0(GAP(F))) + W1(W0(GMP(F))))
where F represents the input feature map, σ is the Sigmoid activation function, W0 and W1 are the weight parameters of the fully connected layers, GAP and GMP represent the global average pooling and global maximum pooling operations, respectively, and Mc(F) represents the channel weights generated by the attention module. In this way, the CBAM is able to dynamically adjust the weight of each channel, thereby improving the model's ability to capture key features.
(2)
Spatial attention module
The function of the spatial attention mechanism is to dynamically adjust the weight of each position according to the importance of different positions in the feature map, to enhance the attention to the target region [26]. The spatial attention module generates two feature maps by the global average pooling and global maximum pooling of channel dimensions, and then stacks them and generates the final spatial weight map through a convolution operation. Its calculation formula is as follows:
Ms(F) = σ(Conv([GAP(F); GMP(F)]))
where Conv represents the convolution operation, [ ; ] represents the concatenation of the two pooled feature maps along the channel dimension, and Ms(F) represents the spatial weight of each position, generated through the Sigmoid function.
The overall operation process of the CBAM module is as follows: Firstly, the weight of the feature channel is adjusted through the channel attention module, and then the weight of each position in the feature map is adjusted through the spatial attention module. Finally, the feature map that has been designed for attention twice is output. In the detection task of rice disease, the CBAM module effectively alleviates the problem where the features of disease spots are covered by background information. For example, for complex leaf textures in rice field images, the channel attention module can increase the weight of disease-related feature channels, while the spatial attention module can focus on the specific location of disease spots. This multi-dimensional feature enhancement mechanism significantly improves the detection accuracy and robustness of the model.
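The two-step CBAM weighting described above can be sketched in NumPy as follows (an illustrative simplification: the learned convolution of the spatial module is replaced by a fixed sum of the two pooled maps, and the shared MLP weights are random):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """Mc(F) = sigmoid(W1 W0 GAP(F) + W1 W0 GMP(F)); F has shape (C, H, W)."""
    gap = F.mean(axis=(1, 2))                    # global average pooling, (C,)
    gmp = F.max(axis=(1, 2))                     # global maximum pooling, (C,)
    mc = sigmoid(W1 @ (W0 @ gap) + W1 @ (W0 @ gmp))
    return F * mc[:, None, None]                 # reweight each channel

def spatial_attention(F):
    """Ms(F) = sigmoid(Conv([GAP(F); GMP(F)])), pooling over the channel axis."""
    gap = F.mean(axis=0)                         # (H, W)
    gmp = F.max(axis=0)                          # (H, W)
    ms = sigmoid(gap + gmp)                      # stand-in for the learned conv
    return F * ms[None, :, :]                    # reweight each spatial position

rng = np.random.default_rng(0)
F = rng.normal(size=(8, 16, 16))                 # toy feature map
C, r = 8, 2                                      # channels and reduction ratio
W0 = rng.normal(size=(C // r, C))                # shared MLP, layer 0
W1 = rng.normal(size=(C, C // r))                # shared MLP, layer 1
out = spatial_attention(channel_attention(F, W0, W1))
print(out.shape)                                 # (8, 16, 16)
```

Because both weight maps pass through a Sigmoid, every attention factor lies in (0, 1): the module can only suppress features relative to the input, which is how background interference is down-weighted without altering the feature map's shape.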

2.2.3. Synergistic Effect of FPN and CBAM

The feature pyramid network (FPN) and Convolutional Block Attention Mechanism (CBAM) are two complementary modules. The former focuses on the multi-scale fusion of features, while the latter focuses on the weight allocation of features. In the detection task of rice disease, these two modules can work together to solve the multi-scale characterization of the target and the selection of key features, respectively. The FPN improves the detection ability of small target lesions by fusing context information. The CBAM further strengthens the model’s focus on the lesion target and weakens the interference of complex backgrounds through the attention modeling of a global context.
In conclusion, the combination of the FPN and CBAM enables the YOLOv8 model to show a stronger feature extraction capability and higher detection accuracy in the detection of rice false smut, providing technical support for accurate disease monitoring.

2.3. Selection of Activation Function (SiLU)

The activation function is one of the core components of the deep learning model, which directly affects the nonlinear representation ability of the network and the stability of gradient propagation. In the object detection task, the selection of the activation function plays an important role in the optimization efficiency, feature extraction ability, and final detection performance of the model. In this study, the Sigmoid-Weighted Linear Unit (SiLU) activation function was selected in the YOLOv8 model to replace the Rectified Linear Unit (ReLU) activation function [27,28], further optimizing the detection capability and generalized performance of the model.

2.3.1. Definition and Mathematical Expression of SiLU Activation Function

The SiLU activation function is an improved nonlinear activation function based on the Sigmoid function [29], whose mathematical expression is as follows:
SiLU(x) = x · σ(x)
where σ(x) is the Sigmoid activation function, defined as
σ(x) = 1 / (1 + e^(−x))
As can be seen from the expression, the SiLU function is the product of the input value and its Sigmoid activation value, and therefore has the following properties:
(1)
SiLU shows smooth nonlinear mapping characteristics in different input ranges, which makes the network have a stronger expression ability.
(2)
The SiLU function is a first-order continuous differentiable function in the global range, which provides stability for gradient backpropagation and helps accelerate the optimization process of the model.

2.3.2. Comparison with Other Activation Functions

The performance of the activation function directly affects the network's ability to capture features and its training efficiency. SiLU has the following advantages over the common ReLU and h-swish activation functions:
(1)
Avoid the “neuron death” problem
ReLU always outputs zero when the input is less than zero, which may lead to the phenomenon of “neuron death”, that is, the weights of some neurons can no longer be updated [30]. However, SiLU can still output non-zero values when the input is close to or less than zero, effectively avoiding this problem, so the model can retain more feature information; this matters especially in the detection of rice false smut lesions, where the input often contains a large number of subtle features.
(2)
Linear mapping characteristics
With an input close to zero, SiLU behaves as an approximately linear mapping:
SiLU(x) ≈ x,  x → 0
This characteristic helps preserve the detailed features of rice false smut lesions and avoids damaging small-target features through excessive nonlinear transformation.
(3)
Smooth gradient update
SiLU smooths gradient changes throughout the domain, reducing the problem of decreasing optimization efficiency in ReLU due to gradient truncation (zero gradient when the input is less than zero). This feature is particularly important in the training process of deep neural networks, which can significantly improve the optimization efficiency of the network and reduce the risk of gradient disappearance.
(4)
Performance advantage
In the comparative experiments of activation functions such as h-swish and ReLU, the research shows that the network using the SiLU activation function has better performance in small target detection tasks, and its accuracy and stability of detection of rice disease spots are significantly improved.

2.3.3. Application of SiLU in YOLOv8

In the YOLOv8 model, the activation function endows the features after each layer's convolution operation with nonlinear expression ability, which is key to improving model performance. In the detection of rice disease lesions, because the lesion area is small and the target features are easily interfered with by complex backgrounds, the SiLU activation function has a significant optimization effect on network performance through its unique characteristics. First, SiLU's smooth nonlinear mapping allows it to preserve the detailed features of the input without completely truncating negative features like ReLU, which is particularly critical for capturing the faint texture and color changes of false smut lesions. Second, the continuous differentiability of SiLU effectively alleviates the problem of gradient disappearance, especially in the deep feature extraction process of YOLOv8, where stable gradient propagation significantly improves the optimization efficiency and detection stability of the model. In addition, in the process of multi-scale feature fusion, SiLU can smoothly process information from different feature scales, avoiding the excessive strengthening or weakening of specific-scale features by the activation function, so the model shows stronger robustness in complex scenes. These characteristics make SiLU play an important role in the detection of rice disease, significantly improving the ability to capture small-target features and the overall detection performance.

2.3.4. Mathematical Features and Performance Advantages

The derivative of the SiLU function can be calculated directly as follows:
SiLU′(x) = σ(x) + x · σ(x) · (1 − σ(x))
The derivative is continuous and smooth over the entire domain, which ensures the stability of gradient updates and the efficiency of model optimization. At the same time, experiments show that SiLU converges faster than ReLU and h-swish in target detection tasks. In rice false smut detection tasks, the model using the SiLU activation function significantly reduces the oscillation amplitude of the loss function during training.
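The SiLU function and its derivative can be verified numerically (a NumPy sketch; the central finite difference confirms the derivative formula above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def silu(x):
    """SiLU(x) = x * sigmoid(x)."""
    return x * sigmoid(x)

def silu_grad(x):
    """SiLU'(x) = sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s + x * s * (1.0 - s)

x = np.linspace(-4.0, 4.0, 9)
h = 1e-5
numeric = (silu(x + h) - silu(x - h)) / (2 * h)        # central finite difference
print(np.allclose(silu_grad(x), numeric, atol=1e-6))   # True
# Unlike ReLU, SiLU stays non-zero for negative inputs (no "dead neurons")
print(silu(-1.0))
```

The non-zero negative output is the property invoked in Section 2.3.2 to explain why subtle lesion features survive the activation.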

2.4. Multi-Dimensional Information Collection Scheme

2.4.1. Data Enhancement

Through data enhancement, the diversity of small targets is increased and the generalization ability of the model is improved.
Mosaic data enhancement splices four images into one, with the following formula:
Imosaic = Concat(I1, I2, I3, I4)
where I1, I2, I3, and I4 are four randomly selected images.
Improvement Method:
Add copy–paste augmentation of small targets.
Use random cropping and scaling to ensure that small targets have a higher proportion of the training set.
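Mosaic augmentation as defined above can be sketched as a 2 × 2 stitch of four equally sized images (a minimal NumPy illustration; production pipelines also remap the bounding-box labels, which is omitted here):

```python
import numpy as np

def mosaic(i1, i2, i3, i4):
    """Stitch four equally sized (H, W, C) images into one 2x2 mosaic."""
    top = np.concatenate([i1, i2], axis=1)
    bottom = np.concatenate([i3, i4], axis=1)
    return np.concatenate([top, bottom], axis=0)

# Four flat-colored stand-in images, one per quadrant
imgs = [np.full((320, 320, 3), v, dtype=np.uint8) for v in (10, 20, 30, 40)]
m = mosaic(*imgs)
print(m.shape)            # (640, 640, 3)
```

Because each training sample now contains targets from four source images, small lesions appear in more varied contexts and positions, which is the generalization benefit the section describes.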

2.4.2. Attention Mechanism

Attention mechanisms (such as CBAM and SE) are introduced to enhance the feature extraction ability of small targets.
The CBAM formula is used. CBAM includes channel attention and spatial attention:
F′ = Mc(F) ⊗ F
F″ = Ms(F′) ⊗ F′
where Mc is the channel attention module, Ms is the spatial attention module, and ⊗ denotes element-wise multiplication.
Improvement Method:
Add CBAM modules to Backbone or Neck.

2.4.3. Anchor Frame Optimization

YOLOv8 uses anchor frames to predict the target frame. The performance of small target detection can be improved by optimizing the size of the anchor frame.
The K-means clustering formula is used:
Use the K-means clustering algorithm to calculate the anchor frame size that is more suitable for the data set:
arg min_S Σ_(i=1)^(k) Σ_(x∈Si) ‖x − ui‖²
where Si is cluster i and ui is the center of cluster i.
Improvement Method:
Recalculate the anchor frame dimensions on the data set.
Increase the number of anchor frames for small targets.
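Anchor clustering via k-means can be sketched as follows (a NumPy illustration with plain Euclidean distance and deterministic initialization; YOLO implementations typically cluster with a 1 − IoU distance instead):

```python
import numpy as np

def kmeans_anchors(wh, k, iters=50):
    """Plain k-means on (width, height) pairs to pick anchor sizes.

    wh: (N, 2) array of ground-truth box sizes; k: number of anchors.
    """
    # spread the initial centers across the data for a deterministic sketch
    centers = wh[np.linspace(0, len(wh) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        d = np.linalg.norm(wh[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                 # nearest center per box
        for i in range(k):
            if np.any(labels == i):
                centers[i] = wh[labels == i].mean(axis=0)
    return centers[np.argsort(centers.prod(axis=1))]   # sort by area

# Two synthetic clusters: small lesion-sized boxes and larger ones
rng = np.random.default_rng(1)
wh = np.vstack([rng.normal([12, 12], 2, size=(100, 2)),
                rng.normal([48, 40], 4, size=(100, 2))])
anchors = kmeans_anchors(wh, k=2)
print(anchors.round(1))   # roughly [[12, 12], [48, 40]]
```

Recomputed anchors track the actual size distribution of the dataset, so adding extra small anchors, as suggested above, gives lesion-sized targets better-matched priors.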

2.4.4. High-Resolution Input

Increasing the resolution of the input image can improve the detection effect of small targets.
The resolution adjustment formula is used:
Adjust the resolution of the input image from H × W to kH × kW, where k > 1.
Improvement Method:
Increase the input resolution from 640 × 640 to 1280 × 1280.
Adjust the anchor box sizes and the detection head accordingly.
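The effect of the resolution change on anchor sizes is simple bookkeeping, sketched below with illustrative anchor values (not the paper's actual configuration):

```python
def scale_for_resolution(anchors, k):
    # When the input grows from (H, W) to (kH, kW), pixel-space anchor
    # sizes grow by the same factor k, so the anchors (and the detection
    # head's expected target sizes) must be rescaled to match.
    return [(w * k, h * k) for (w, h) in anchors]

# Illustrative anchor sizes at 640 x 640 (not the paper's actual values)
base_anchors = [(10, 13), (16, 30), (33, 23)]
print(scale_for_resolution(base_anchors, 2))  # [(20, 26), (32, 60), (66, 46)]
```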

2.5. Model Architecture Adjustment

Aiming at the detection characteristics of rice disease spots, this paper systematically redesigned the model architecture on the basis of YOLOv8 to better meet the requirements of small-target detection while keeping the model lightweight and real-time. The optimization focuses on the design of the detection head, the lightweight strategy, and a reasonable adjustment of network depth and width, giving the improved model higher precision and lower computational complexity in the rice disease detection task.

2.5.1. Detection Head Optimization

As one of the key modules of a target detection model, the detection head outputs the location and category information of the target according to the feature map [30]. To improve the detection accuracy of small rice disease targets, the detection head of YOLOv8 was redesigned as follows:
(1)
Increase the number of image information acquisition devices
The original image information acquisition equipment of YOLOv8 is mainly aimed at predicting multi-scale targets. However, in the detection of diseases and pests such as flaky spot and rice stalk spot, the target area is usually small and the contrast with the background is low, so the number and feature level of the original image information acquisition equipment may not be enough to effectively capture the feature information of the disease spot. Therefore, by increasing the number of image information acquisition devices (as shown in Figure 2), this paper enhanced the model’s multi-scale perception of small target features and ensured that targets of different sizes could be effectively detected at appropriate scales.
(2)
Adjust the Anchor-free mechanism parameters.
YOLOv8 adopts an anchor-free mechanism for target prediction, that is, direct regression of the center point and size of the target frame. In order to improve the prediction accuracy of the spot region of rice disease, the parameters of the anchor-free mechanism were adjusted in this paper, including optimizing the migration range of the center point to make it more suitable for the boundary fuzzy target of rice disease spot. The positioning weight in the loss function is redesigned to improve the accuracy of the small target box, to reduce the missed detection rate. Through these optimizations, the performance of the detection head is significantly enhanced, and the detection accuracy and robustness of the model under the complex background of rice disease are improved.
Let Hᵢ be the output of the i-th detection head and Fᵢ the input feature map; then the output of the detection head is
Hᵢ = C(Fᵢ)
where C(·) is the convolution operation.
Improvement Method:
Add a higher-resolution detection head (such as 1/2 resolution).
Design smaller anchor boxes for the detection head to accommodate small targets.

2.5.2. Improved Loss Function

The loss function of YOLOv8 includes classification loss, regression loss, and confidence loss. The performance of small-target detection can be improved by adjusting the loss function.
Focal loss alleviates class imbalance and is well suited to small-target detection. The formula is
FL(pₜ) = −αₜ (1 − pₜ)^γ log(pₜ)
where pₜ is the probability predicted by the model, αₜ is the class weight, and γ is a modulating factor that reduces the weight of easily classified samples.
Improvement Method:
Replace classification loss with focal loss.
Adjust the regression loss, using CIoU or DIoU loss, where the CIoU formula is
L_CIoU = 1 − IoU + ρ²(b, b^gt)/c² + αv
where ρ(b, b^gt) is the distance between the center points of the predicted box and the ground-truth box, c is the diagonal length of the smallest enclosing rectangle, and v measures the consistency of the aspect ratios.
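Both loss terms are short enough to state in code. The sketch below is an illustrative NumPy implementation of the focal loss and the CIoU loss as defined above; the α in the CIoU term follows the usual trade-off definition α = v / ((1 − IoU) + v), which the text does not spell out:

```python
import numpy as np

def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    # FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t): easy examples
    # (p_t near 1) are down-weighted by the (1 - p_t)^gamma factor.
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

def ciou_loss(b, b_gt):
    # Boxes as (x1, y1, x2, y2). L_CIoU = 1 - IoU + rho^2/c^2 + alpha*v:
    # rho is the distance between box centers, c the diagonal of the
    # smallest enclosing box, v the aspect-ratio consistency term.
    (x1, y1, x2, y2), (g1, h1, g2, h2) = b, b_gt
    iw = max(0.0, min(x2, g2) - max(x1, g1))
    ih = max(0.0, min(y2, h2) - max(y1, h1))
    inter = iw * ih
    union = (x2 - x1) * (y2 - y1) + (g2 - g1) * (h2 - h1) - inter
    iou = inter / union
    rho2 = ((x1 + x2 - g1 - g2) ** 2 + (y1 + y2 - h1 - h2) ** 2) / 4.0
    c2 = (max(x2, g2) - min(x1, g1)) ** 2 + (max(y2, h2) - min(y1, h1)) ** 2
    v = (4 / np.pi ** 2) * (np.arctan((g2 - g1) / (h2 - h1))
                            - np.arctan((x2 - x1) / (y2 - y1))) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)
    return 1 - iou + rho2 / c2 + alpha * v

# A hard example (p_t = 0.1) is penalized far more than an easy one (0.9)
print(focal_loss(0.1) > 100 * focal_loss(0.9))  # True
print(round(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)), 6))  # 0.0 for identical boxes
```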

2.5.3. Adjusting the Network Depth and Width

The depth and width of the network are the two core factors that affect the feature expression ability and computational efficiency [31]. In the detection of rice yeast disease, proper adjustment of the network depth and width can balance the feature extraction ability and computing resource requirements of the model. According to the detection characteristics of rice disease, the following adjustments were made:
(1)
Increase network width
The increase in the width of the network enables each layer to extract more feature information, especially in small target detection tasks, and the increase in width helps capture more detailed features of the lesion. However, too large a width can lead to a sharp increase in computation. Therefore, this paper increases the number of channels in the feature extraction stage within a reasonable range, thus enhancing the detail perception ability of the model.
(2)
Reduce network depth
An overly deep network may suffer from vanishing or exploding gradients, and when training on small-sample data such as the rice disease data set, a deep network is also more prone to overfitting. By reducing the network depth, this paper lowers the computational complexity of the model and alleviates gradient problems during training. In addition, a shallower network accelerates model convergence, so the training efficiency for the rice disease detection task is significantly improved.
(3)
Depth-to-width ratio optimization
In the process of network structure adjustment, this paper strictly controls the depth-to-width ratio of the network, and selects the optimal configuration of the depth-to-width ratio through experimental verification, so the model can achieve the best balance between the calculation cost, memory consumption, and detection performance.

2.6. Rice Disease Monitoring Model (YOLOv8-EDCA)

Combining the technical requirements of small-target detection for rice disease with the above optimizations of YOLOv8, this study proposes an improved rice disease monitoring model: efficient dual-channel attention YOLOv8 (YOLOv8-EDCA). By introducing an efficient dual-channel attention mechanism (EDCA) and a lightweight optimization strategy, the model significantly reduces computational complexity while maintaining high detection accuracy, and can adapt to real-time detection tasks on UAVs.

2.6.1. Model Integration

In the monitoring of rice disease, the spot often has obvious fine features and complex background textures, which puts forward higher requirements for the feature extraction ability of the model. Based on YOLOv8, we designed an efficient dual-channel attention mechanism (EDCA) to enhance the model’s ability to selectively characterize target features and suppress background interference.
(1)
The core idea of EDCA: The EDCA mechanism is improved on the basis of the CBAM (Convolutional Block Attention Mechanism) and achieves richer context information captured between feature channels by adding parallel convolutional paths with different kernel sizes [32]. Specifically, the EDCA uses two parallel attention paths to process features of different receptive fields: it uses a small convolution kernel (e.g., 3 × 3) to extract local detailed features, and it uses large convolution kernels (such as 5 × 5 or 7 × 7) to extract global context information.
The features of the two paths are fused through a channel cascade (concatenation) operation. The formula is as follows:
M_EDCA(F) = Concat(M_CBAM(F₁), M_CBAM(F₂))
where F₁ and F₂ are the feature maps produced with the two kernel sizes, M_CBAM denotes the channel and spatial attention of the CBAM module, and Concat indicates the channel cascading operation.
This design can effectively capture multi-scale feature information and enhance the key features of rice stalk spot through the attention mechanism, thus significantly improving the detection accuracy and robustness of the model.
(2)
The recall rate of small target detection can be improved by integrating the prediction results of multiple models.
Let Bᵢ be the bounding box predicted by the i-th model; the integrated bounding box is then
B = Σᵢ₌₁ⁿ wᵢ Bᵢ
where wᵢ is the weight of the i-th model.
Improvement Method:
Integrate models with different resolutions.
Use NMS or Weighted Boxes Fusion (WBF) for post-processing.
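The weighted combination B = Σ wᵢBᵢ can be sketched for a single cluster of overlapping predictions. Full Weighted Boxes Fusion additionally clusters boxes by IoU and weights them by confidence; the minimal stand-in below applies the weighted average to one such cluster:

```python
import numpy as np

def weighted_box_average(boxes, weights):
    # B = sum_i w_i * B_i with normalized weights: a minimal stand-in for
    # Weighted Boxes Fusion, applied to one cluster of overlapping
    # predictions from n models (boxes as (x1, y1, x2, y2)).
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (np.asarray(boxes, dtype=float) * w[:, None]).sum(axis=0)

# Three models predict slightly shifted boxes for the same lesion;
# the better model is given a larger weight.
boxes = [(10, 10, 50, 50), (12, 11, 52, 51), (8, 9, 48, 49)]
fused = weighted_box_average(boxes, weights=[0.5, 0.3, 0.2])
print(fused)  # a box between the three predictions, pulled toward model 1
```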

2.6.2. Design and Optimization of Lightweight Model

In order to adapt to the limited computing resources of UAVs, this paper further carried out the lightweight design and architecture optimization of the YOLOv8-EDCA model to achieve the efficient detection of rice yeast disease.
(1)
Use the SiLU activation function with a lightweight convolution structure
The introduction of the SiLU (Sigmoid-Weighted Linear Unit) activation function provides a smooth nonlinear mapping capability for the model and improves the stability of gradient propagation while maintaining the integrity of feature information. Its continuously differentiable properties help alleviate the problem of gradient disappearance, thus enhancing deep feature extraction. To further reduce computational complexity, depthwise separable convolution and sparse activation techniques are used in this paper. Depthwise separable convolution decomposes a standard convolution into a depthwise convolution and a pointwise convolution [33], significantly reducing the number of parameters and computation while retaining strong feature extraction capability.
(2)
Adjust the stacking mode of the network layer
The original network structure of YOLOv8 is over-stacked in some layers, which may cause gradient vanishing in the detection of rice disease, especially for small targets. In this paper, the layer stacking mode of the network is redesigned: the number of deep convolutional layers is reduced to avoid the optimization difficulties caused by excessive depth, and the width of the shallow layers is appropriately increased to enhance detail capture in the feature extraction stage, enabling the model to identify rice disease spots more accurately.
(3)
Network parameter adjustment
Through experimental verification, the parameter scale of the model was tuned so that the final model not only guarantees detection performance but also greatly reduces memory consumption and inference time. The designed model is approximately 13.7 MB in size, 25% smaller than the original YOLOv8 model, making it more suitable for deployment on UAV platforms.
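The parameter saving from the depthwise separable convolution mentioned above is easy to quantify. The arithmetic below compares a standard 3 × 3 convolution with its depthwise separable decomposition at 128 channels (illustrative sizes, not the paper's exact layers):

```python
def conv_params(c_in, c_out, k):
    # Standard convolution: one k x k filter per (input, output) channel pair.
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    # Depthwise separable = depthwise (one k x k filter per input channel)
    # + pointwise 1x1 convolution mixing channels.
    return c_in * k * k + c_in * c_out

std = conv_params(128, 128, 3)          # 128 * 128 * 9
dws = dw_separable_params(128, 128, 3)  # 128 * 9 + 128 * 128
print(std, dws, round(std / dws, 1))  # 147456 17536 8.4
```

At these sizes the decomposition cuts the layer's parameters by roughly 8×, which is the kind of saving that makes the lightweight UAV deployment feasible.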

3. Experiment and Result Analysis

In order to test the performance of the UAV-mounted image information acquisition method for the middle and lower parts of rice, two varieties of rice, including Huangguang Youzhan (hybrid rice), were used in the experiment. The test was conducted on 14 October 2022, when the rice was at the full ear stage. The plant row spacing in the field test area was 6.5 × 7.6 cm, and the planting area was 12 × 4.5 m. This region was divided into 18 plots, with 9 plots for each rice variety, comprising a test group and a control group (each plot 3 × 3 m in size). Crop parameters are shown in Table 1.

3.1. Introduction to Data Sets

In order to verify the effectiveness of the YOLOv8-EDCA model proposed in this paper for the small-target detection of rice diseases, the experiment adopted a specially collected rice field disease data set containing multi-angle and multi-scale images of rice disease spots, with a total of 1296 clearly labeled images. The data set covers the different growth stages of rice disease plaques, as well as interference factors in the complex background of the rice field (such as leaf texture, soil, and other vegetation). The proportion of small objects in the data set is significantly higher than in ordinary data sets, and the ratio of object box area to image area is less than 10% on average. To ensure the scientific rigor of the experiment, the data set was divided into training, validation, and test sets at a ratio of 8:1:1, comprising 1036, 130, and 130 images, respectively, as shown in Figure 3. Data enhancement strategies such as random horizontal flipping, color jitter, and contrast enhancement were applied to further enrich the training samples and improve the robustness of the model.

3.2. Experimental Environment

The experiment was carried out in a high-performance computing environment, and the equipment, hardware, and parameters used are shown in Table 2 and Table 3.

3.3. Model Evaluation Index

To comprehensively evaluate model performance, the following common indicators were used. Mean average precision (mAP) measures the overall accuracy of the model's detection results, where mAP@0.5 is the mean average precision at an IoU threshold of 0.5, and mAP@0.5:0.95 is the mean over IoU thresholds from 0.5 to 0.95 (step size: 0.05). Detection time (inference speed) is the inference time for a single image, used to evaluate the real-time performance of the model. The complexity of the model was evaluated by the number of parameters (model size) and floating-point operations (FLOPs).
The evaluation formulas are as follows:
mAP = (1/|C|) Σ_{c∈C} APc
APc = ∫₀¹ P(R) dR
where C is the set of target categories, and P and R are the detection precision and recall, respectively.
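The AP integral above can be computed from a ranked list of detections. The sketch below uses synthetic detections for a single class and computes the raw area under the precision–recall curve; standard evaluators (e.g., COCO-style) additionally apply a precision envelope before integrating, which is omitted here:

```python
import numpy as np

def average_precision(scores, is_tp, n_gt):
    # AP_c = integral of P(R) dR: sort detections by confidence, accumulate
    # true positives, and integrate precision over the recall increments.
    order = np.argsort(-np.asarray(scores))
    tp = np.asarray(is_tp, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / (np.arange(len(tp)) + 1)
    recall = cum_tp / n_gt
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):
        if r > prev_r:                 # precision at each new recall level
            ap += p * (r - prev_r)
            prev_r = r
    return ap

# 4 detections for one class, 3 ground-truth objects; mAP would then
# average this AP over all classes in C.
scores = [0.9, 0.8, 0.7, 0.6]
is_tp = [1, 0, 1, 1]
print(round(average_precision(scores, is_tp, n_gt=3), 3))  # 0.806
```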

3.4. Model Performance Comparison

The YOLOv8-EDCA model was compared with mainstream target detection algorithms (YOLOv8, Faster R-CNN, SSD, etc.). The experimental results are shown in Table 4.
The experimental results show that the YOLOv8-EDCA model has significant advantages in the rice disease detection task. First, in terms of detection accuracy, YOLOv8-EDCA is significantly superior to the other models in mAP@0.5 and mAP@0.5:0.95, demonstrating excellent small-target recognition capability in the complex backgrounds of rice fields. Second, in terms of inference speed, the inference time of YOLOv8-EDCA is 10% lower than that of the original YOLOv8 model, which further improves the real-time performance of the model and makes it more suitable for deployment on a UAV platform for online monitoring. In addition, YOLOv8-EDCA also has clear advantages in lightweight performance: its parameter count and computation are significantly lower than those of Faster R-CNN, fully meeting the resource constraints of agricultural field applications and providing an efficient solution for the accurate, large-scale monitoring of rice disease.

3.5. Ablation Experiment

In order to explore the influence of each improved module on the performance of the model, ablation experiments were conducted, and the experimental results are shown in Table 5.
The results of the ablation experiments show that each improved module plays an important role in the performance of the YOLOv8-EDCA model. After the introduction of the dual-channel attention mechanism (EDCA), the mAP@0.5 of the model improved by 2.2 percentage points, significantly enhancing the ability to capture small-target features of rice disease and effectively improving detection accuracy. Using the SiLU activation function instead of the original activation function makes gradient propagation more stable, further optimizing the training process and improving overall detection performance. In addition, the synergy of the three improvements, EDCA, the SiLU activation function, and the architecture optimizations, allows the model to achieve an optimal balance between detection performance and computational efficiency, fully demonstrating its superiority in the small-target detection of rice disease.

3.6. Analysis of Results Under Different Treatment Groups and Detection Degrees

The data analysis (as shown in Figure 4) revealed that in both treatment groups (treat1 and treat2), the detection values progressively increased from low to mid to high levels as the degree of treatment intensity escalated (Group 1). Notably, the detection values for the treat2 group were consistently higher than those for the treat1 group, indicating a more pronounced detection effect. Specifically, at the low level, treat1 exhibited a more concentrated distribution of values between 8 and 11, whereas treat2 showed a slightly higher distribution of values ranging from 12 to 16. At the mid level, the detection values for treat1 and treat2 rose to the ranges of 18–21 and 22–25, respectively, demonstrating a significant improvement in detection performance for both groups in this stage. At the high level, treat1 had a value distribution of 27–32, while treat2 exhibited a further distribution of 29–33, albeit with a slight increase in variability, suggesting some fluctuations within the internal data. As shown in Figure 4, the red dots represent outliers and the green dots in the graph represent target values or expected thresholds.
Overall, the detection values of the treat2 group are consistently higher than those of the treat1 group across all levels, particularly at mid and high levels, demonstrating more pronounced performance advantages. This reflects the efficacy and stability of treat2 under varying detection conditions. Moreover, the upward trend in detected values corroborates the stability of the model or method under different conditions. However, the significant fluctuation in treat2 values also highlights the necessity to further optimize model parameters or treatment methods. These findings provide valuable data support and reference for refining the monitoring methodology of rice stalk disease.

3.7. Analysis of Visual Results

The label distribution shown in Figure 5 reveals a significant feature: the height and width of the detection target frame are, almost without exception, less than 0.1 relative to the size of the image, and the target objects processed in most cases belong to the category of small targets. Specifically, since the target entities such as the black spots of rice disease in the test data set show significant small-scale features visually, their tiny size not only increases the difficulty of target detection compared with the background or other environmental factors, but may also cause the model to be susceptible to noise interference, edge blurring, shape similarity confusion, and other problems in the recognition process.
In order to comprehensively evaluate the performance of different models in the black spot detection task of rice diseases, several mainstream target detection algorithms recognized by the industry were selected, and the test data sets were applied to these models for rigorous comparative tests. The training results of YOLOv8-EDCA are shown in Figure 6 and Figure 7.
As can be seen from Figure 7, the curve for all classes reaches 0.29 at a confidence threshold of 0.189, which indicates that at this threshold the model achieves a good overall balance between precision and recall.
The comparative test results are shown in Table 6. According to the experimental analysis, when the IoU threshold is set at 0.5, the mAP of YOLOv8-EDCA reaches 92.41%, which is 52.40 percentage points higher than that of Faster R-CNN (39.91%). Over IoU thresholds of 0.5:0.95, its mAP reaches 84.54%, 47.74 percentage points higher than that of Faster R-CNN (36.80%). In addition, YOLOv8-EDCA has a smaller model size under the same computing power: the model used for disease spot detection is 15.9 MB, much smaller than the 108.2 MB of Faster R-CNN.
The experiment adopted manually labeled pictures as a reference to visually demonstrate the model's performance in the disease spot detection task. Figure 8 shows the detection results after YOLOv8-EDCA processing.

3.8. Summary

The experimental results show that the YOLOv8-EDCA model proposed in this paper has significant advantages in the small-target detection of rice disease and can realize efficient and accurate field disease monitoring on a resource-limited UAV platform, providing important technical support for agricultural intelligence.

4. Conclusions

In this paper, an efficient detection model, YOLOv8-EDCA, was designed according to the characteristics of the rice disease spot detection task, and several improvements effectively addressed the difficulties of small-target detection, complex background interference, and resource-limited real-time operation. First, by introducing the dual-channel attention mechanism (EDCA), the model captures the detailed features of rice disease spots more accurately across multi-scale features and effectively suppresses background interference. Second, combined with the SiLU activation function and the lightweight design, the model not only guarantees high detection accuracy but also significantly reduces computational complexity and increases inference speed by more than 20%, making it more suitable for real-time deployment on UAV platforms with limited resources [34]. The experimental results show that YOLOv8-EDCA outperforms mainstream detection models in mAP@0.5 and mAP@0.5:0.95, especially in small-target detection under the complex backgrounds of rice fields. The ablation experiments further verify the important role of each improved module in improving model performance. The excellent detection accuracy and real-time performance of YOLOv8-EDCA in the accurate monitoring of rice disease provide technical support for disease prevention and control, as well as a new approach and practical basis for the intelligent monitoring of agricultural diseases. In future studies, richer multi-modal data can be combined to further improve the robustness and applicability of the model and promote the development of smart agriculture.

Author Contributions

Methodology, X.G.; software, X.G., Y.O. and K.D.; validation, X.G.; formal analysis, Y.O.; investigation, X.G.; resources, X.G. and X.F.; data curation, X.G. and X.F.; writing—original draft preparation, Y.O. and K.D.; writing—review and editing, R.G.; visualization, X.G.; supervision, R.G. and Z.Z.; project administration, Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key R&D Program of China (2022YFD2001501). The publication fees for this manuscript were self-financed.

Data Availability Statement

The authors confirm that the data supporting the findings of this study are available within the article.

Conflicts of Interest

The funders had no role in the design of this study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Lv, Q.; Zhao, X.; Yang, X.; Huang, F.; Liang, W. Paddy rice soil bacteria chlamydospore piece of wintering capability study. J. Plant Pathol. 2024, 777–786. [Google Scholar]
  2. Zheng, G. Research on Monitoring and Early Warning Technology of Rice Pests and Diseases Based on Machine Vision. Ph.D. Thesis, Chongqing Three Gorges College, Chongqing, China, 2023. [Google Scholar]
  3. Cai, N. Rice Disease and Efficacy Evaluation Based on UAV Image Research. Ph.D. Thesis, Anhui University, Hefei, China, 2022. [Google Scholar]
  4. Cui, X.D. Integrated control technology of rice pests and diseases in Northeast China. Spec. Econ. Flora Fauna 2022, 25, 125–127. (In Chinese) [Google Scholar]
  5. Li, J.; Zhang, T.; Peng, X.; Yan, G.; Chen, Y. Application of small UAV in farmland information monitoring. Agric. Mech. Res. 2010, 183–186. (In Chinese) [Google Scholar]
  6. Li, F.; Duan, Y.; Shi, Y.; Wu, W.; Huang, P. Using a single drone image to accurately identify fruit trees. Agric. Inf. China 2019, 4, 56–69. [Google Scholar] [CrossRef]
  7. Yao, Q.; Guan, Z.; Zhou, Y.; Tang, J.; Hu, Y.; Yang, B. Application of support vector machine for detecting rice diseases using shape and color texture features. In Proceedings of the 2009 International Conference on Engineering Computation, ICEC 2009, Hong Kong, China, 2–3 May 2009; pp. 79–83. [Google Scholar]
  8. Yang, T.-L.; Qian, Y.-S.; Wu, W.; Liu, T.; Sun, C. Intelligent acquisition of rice disease images based on Python crawler and feature matching. Henan Agric. Sci. 2020, 12, 45–57. [Google Scholar]
  9. Deng, R.; Pan, W.; Wang, Z. Research progress and prospect of crop phenotype technology and its intelligent equipment. Mod. Agric. Equip. 2021, 42, 2–9. (In Chinese) [Google Scholar]
  10. Mu, Y.; Li, A.; Wang, S. Research on rice growth image acquisition and growth monitoring. Priv. Sci. Technol. 2014, 3, 254–261. [Google Scholar]
  11. Ge, J.; Shao, L.; Ding, K.; Li, J.; Zhao, S. Detection of disease degree of maize small spot by image. Trans. Chin. Soc. Agric. Mach. 2008, 39, 1142117. [Google Scholar]
  12. Lang, L.-Y.; Tao, J.-J. Study on the infection degree of cotton red blight based on image processing. Agric. Mech. Res. 2012, 6, 126–131. [Google Scholar]
  13. Li, J. Study on Image Processing Technology and Physiological Index of Maize Small Spot Disease. Master’s Thesis, Anhui Agricultural University, Hefei, China, 2008. [Google Scholar]
  14. Palva, R.; Kaila, E.; Pascual, G.B.; Bloch, V. Assessment of the Performance of a Field Weeding Location-Based Robot Using YOLOv8. J. Agron. 2024, 14, 2215. [Google Scholar] [CrossRef]
  15. Liu, J.; Huang, X.; Guo, J. Lightweight cotton grade field detection based on YOLOv8. Comput. Eng. 2024, 1–13. [Google Scholar] [CrossRef]
  16. Zhao, C.; Tang, Q.; Xu, H.; Zhu, X.; Li, Y.; Tao, Y.; Zhang, X. Cdd-yolo: Based on Improved YOLOv8n UAV small target lightweight detection algorithm. J. Zhejiang Norm. Univ. (Nat. Sci. Ed.) 2024, 1–10. [Google Scholar] [CrossRef]
  17. Xiao, B.; Nguyen, M.; Yan, W.Q. Fruit ripeness identification using YOLOv8 model. Multimed. Tools Appl. 2023, 83, 28039–28056. [Google Scholar]
  18. Deng, L.; Zhou, J.; Liu, Q. Flames and smoke detection algorithm based on improved YOLOv8. J. Tsinghua Univ. (Nat. Sci. Ed.) 2024, 8, 1–9. [Google Scholar]
  19. Liu, G.; Di, X.; Yang, Y.; Wang, J. The rice pest detection algorithm based on YOLOv8. J. Yangtze River Inf. Commun. 2024, 9, 13–16. [Google Scholar]
  20. Zhang, S.; Chen, S.; Zhao, Z. Improve YOLOv8 crop leaf diseases and insect pests recognition algorithm. Chin. J. Agric. Mech. 2024, 45, 255–260. [Google Scholar]
  21. Li, Y.; Fan, Q.; Huang, Z.; Gu, Q. A Modified YOLOv8 Detection Network for UAV Aerial Image Recognition. Drones 2023, 7, 304. [Google Scholar] [CrossRef]
  22. Wang, Y. Research on SMFF-YOLO Algorithm Based on Multilevel Feature Fusion and Scale Adaptive in UAV Remote Sensing Images. Master’s Thesis, Wuhan Textile University, Wuhan, China, 2024. [Google Scholar]
  23. Guo, Z.; Li, Y. HCI-YOLO: Detection model of eggplant fruit pests and diseases based on improved YOLOv8. Softw. Eng. 2024, 27, 63–68. [Google Scholar]
  24. Cao, Y.; Chen, X.; Lin, Y.; Li, Y.; Guo, Z. Weeds semantic segmentation method based on multi-scale information fusion. J. Shenyang Agric. Univ. 2024, 6, 1–9. [Google Scholar]
  25. Song, W.; Zhai, W.; Gao, M.; Li, Q.; Chehri, A.; Jeon, G. Multiscale aggregation and illumination-aware attention network for infrared and visible image fusion. Concurr. Comput. Pract. Exp. 2023, 36, e7712. [Google Scholar] [CrossRef]
  26. Sun, M.; Sun, T.; Hao, F.; Mu, C.; Ma, D. Recognition of lightweight maize kernel varieties based on MobileNetV2 and convolutional attention mechanism. Shandong Agric. Sci. 2024, 12, 1–11. [Google Scholar]
  27. Li, Q.Q.; Chen, J.Q.; Hao, K.W.; Mu, C.; Ma, D. A lightweight blind spot detection based on YOLOv8 network. J. Mod. Electron. Technol. 2024, 47, 163–170. [Google Scholar]
  28. Li, Y.; Wu, Z.; Sun, S.; Lin, M.; Wu, Z.; Shen, H. YOLOv5 algorithm of apple leaf diseases more detection. J. Chin. Agric. Mech. 2024, 7, 230–237+353. [Google Scholar]
  29. Chen, J.; Hu, H.; Yang, J. Plant leaf disease recognition based on improved SinGAN and improved ResNet34. Front. Artif. Intell. 2024, 7, 1414274. [Google Scholar] [CrossRef]
  30. Chen, W.; Yuan, H. Improved Forest Pest Detection Method of YOLOv8n. J. Beijing For. Univ. 2025, 47, 119–131. [Google Scholar]
  31. Wen, X. Lightweight for Embedded Network Research. Ph.D. Thesis, Huazhong University of Science and Technology, Wuhan, China, 2020. [Google Scholar]
  32. Feng, C. For Image Retrieval in Depth and the Bottom of the Visual Convolution Feature Extraction Method Research. Ph.D. Thesis, Northeastern University, Shenyang, China, 2020. [Google Scholar]
  33. Li, B.; Song, T.; Gao, J.; Li, D.; Gao, P.; Li, R.; Zhao, D. Calcium strawberry leaf recognition method based on YOLO v5 model. J. Jiangsu Agric. Sci. 2024, 52, 74–82. [Google Scholar]
  34. Zhang, G.; Li, C.; Li, G.; Lu, W. Small target detection algorithm of UAV aerial image based on improved YOLOv7-tiny. Eng. Sci. Technol. 2024, 4, 1–14. [Google Scholar]
Figure 1. Flowchart of FPN.
Figure 2. Installation structure diagram of image information acquisition equipment.
Figure 3. Rice disease image data set.
Figure 4. Comparison of boxplot of detection value distribution under different treatment groups and degrees.
Figure 5. Label distribution.
Figure 6. YOLOv8-EDCA training results.
Figure 7. Confusion matrix of rice disease.
Figure 8. (a) Label picture. (b) Prediction picture.
Table 1. Test conditions.

| Test Serial Number | Time | Collection Period | Rice Variety | Growth Period | Height (cm) | Specification (p/m²) | Weather | Wind Speed (m/s) | Humidity (%) |
|---|---|---|---|---|---|---|---|---|---|
| Test 1 | 14 October 2022 | 06:00–08:00 | Huangguang Youzhan | Booting stage | 102–107 | 138 | Cloudy | 3.6 | 78 |
Table 2. Technical hardware parameters of the UAV-borne rice image information acquisition equipment.

| Number | Function | Parameter |
|---|---|---|
| 1 | Effective focal length | Autofocus |
| 2 | Continuous working time on a single charge | 60 min |
| 3 | Pixels | 16 MP |
| 4 | Weight | 50 g |
Table 3. Hyperparameter settings during model training.

| Index | Parameter |
|---|---|
| Image resolution | 640 × 640 |
| Batch size | 16 |
| Learning rate | 0.01 (dynamically adjusted by a cosine annealing strategy) |
| Optimizer | AdamW |
| Training epochs | 2000 |
| Loss function | Focal loss combined with IoU loss to improve positioning accuracy |
Table 4. Model performance comparison.

| Model | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Inference Time (ms/Image) | Model Size (MB) |
|---|---|---|---|---|
| YOLOv8 (original) | 91.3 | 72.4 | 7.2 | 17.9 |
| YOLOv8-EDCA | 94.7 | 76.8 | 6.5 | 13.7 |
| Faster R-CNN | 78.9 | 54.3 | 19.8 | 108.2 |
| SSD-VGG | 69.4 | 41.7 | 8.5 | 28.9 |
Table 5. Influence of each improved module on the performance of YOLOv8-EDCA model (ablation experiment results).
Table 5. Influence of each improved module on the performance of YOLOv8-EDCA model (ablation experiment results).
| Module Combination | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Inference Time (ms/image) |
|---|---|---|---|
| Original YOLOv8 | 91.3 | 72.4 | 7.2 |
| +Dual Channel Attention (EDCA) | 93.5 | 74.6 | 6.9 |
| +SiLU activation function | 92.8 | 73.8 | 7.0 |
| +Model architecture optimization | 94.7 | 76.8 | 6.5 |
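Table 5 does not spell out the internals of the dual-channel-attention (EDCA) block; the sketch below is only a generic squeeze-and-excitation-style channel attention using the SiLU activation from the table, not the authors' exact module:

```python
import numpy as np

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """SE-style channel attention over a (C, H, W) feature map.

    w1: (C//r, C) squeeze weights; w2: (C, C//r) excitation weights.
    Returns the input rescaled by per-channel gates in (0, 1).
    """
    pooled = x.mean(axis=(1, 2))                   # global average pool -> (C,)
    hidden = silu(w1 @ pooled)                     # bottleneck with SiLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate per channel
    return x * gate[:, None, None]

rng = np.random.default_rng(0)
C, r = 8, 2
x = rng.standard_normal((C, 16, 16))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
y = channel_attention(x, w1, w2)
```

Because every gate lies in (0, 1), the block can only attenuate channels, which is what lets attention suppress background responses without adding inference cost at the scale seen in the table.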
Table 6. Performance comparison of mainstream algorithms.
| Model Name | Precision (%) | Recall (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Model Size (MB) |
|---|---|---|---|---|---|
| YOLOv8-EDCA | 87.51 | 89.92 | 92.93 | 70.14 | 25.9 |
| YOLOv8 | 82.16 | 83.54 | 87.30 | 65.61 | 28.9 |
| SSD-VGG | 81.76 | 37.60 | 47.92 | 45.17 | 28.96 |
| Faster R-CNN | 65.17 | 42.20 | 39.91 | 36.80 | 108.2 |
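The mAP@0.5 columns in Tables 4 and 6 count a prediction as a true positive when its intersection-over-union (IoU) with a ground-truth box reaches 0.5; a minimal IoU check, assuming (x1, y1, x2, y2) corner format:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def is_true_positive(pred, gt, threshold=0.5):
    """mAP@0.5 treats a detection as correct when IoU >= 0.5."""
    return iou(pred, gt) >= threshold
```

The stricter mAP@0.5:0.95 column averages this check over IoU thresholds from 0.5 to 0.95 in steps of 0.05, which is why its values are consistently lower.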
Share and Cite

Guo, X.; Ou, Y.; Deng, K.; Fan, X.; Gao, R.; Zhou, Z. A Unmanned Aerial Vehicle-Based Image Information Acquisition Technique for the Middle and Lower Sections of Rice Plants and a Predictive Algorithm Model for Pest and Disease Detection. Agriculture 2025, 15, 790. https://doi.org/10.3390/agriculture15070790
