Article

HGV-YOLO: A Detection Method for Floating Seedlings and Missed Transplanting Based on the Morphological Characteristics of Rice Seedlings

1 College of Engineering, Heilongjiang Bayi Agricultural University, Daqing 163319, China
2 Zhejiang Mu Chan Li Ecological Technology Development Co., Ltd., Ningbo 315100, China
* Author to whom correspondence should be addressed.
Agronomy 2026, 16(7), 678; https://doi.org/10.3390/agronomy16070678
Submission received: 2 December 2025 / Revised: 11 February 2026 / Accepted: 19 February 2026 / Published: 24 March 2026
(This article belongs to the Section Precision and Digital Agriculture)

Abstract

Transplanting status is a significant indicator in rice cultivation and is essential for field management, food security, and agricultural production. However, traditional approaches cannot detect the transplanting status in a timely and effective manner, and manual seedling replanting is labor-intensive, costly, and inefficient. This study proposes a detection method for floating seedlings and missed transplanting based on a self-built improved YOLO model, named HGV-YOLO. A HorBlock module splits the morphological features of rice seedlings across different dimensions of the YOLOv8n backbone, further enhancing the network's ability to classify and recognize rice seedlings. Furthermore, Grouped Spatial Convolution (GSConv) replaces standard convolution and the VOV-GSCSP module replaces the C2f modules, reducing the number of parameters and improving the model's inference speed. To improve bounding box precision, the WIoU loss function is also incorporated. Finally, the least squares method is used to predict the center points of the rice seedlings. The experimental results indicate that HGV-YOLO achieves a precision of 93.7%, a recall of 83.1%, and an mAP@0.5 of 91.1%. Compared to YOLOv8n, HGV-YOLO reduces Params by 3.1% and GFLOPs by 1.2% while improving mAP@0.5 by 2.3%. Compared to YOLOv3-tiny, YOLOv5, and YOLOv6, HGV-YOLO achieves increases in mAP@0.5 of 4.6%, 3.1%, and 2.8%, respectively. In summary, the HGV-YOLO model exhibits strong performance and provides valuable insights for advancing the autonomous navigation of rice transplanting robotics.

1. Introduction

Rice is the second largest grain crop in China in terms of planting area and output, with an area of 28.991 million hectares. Rice cultivation is intricately linked to the advancement of the rice processing sector and plays a crucial role in boosting rice output [1]. Rice transplanting is an important procedure in rice production [2]: it promotes the development of the root system and is beneficial for increasing the number of effective tillers and enhancing the yield per unit area. In recent years, rice transplanters have shown the advantages of uniform seedling distribution, time and labor savings, reduced costs, and high operating efficiency [3]. As a result, mechanized rice transplanting has progressively gained widespread acceptance [4]. However, seedling floating and omission still occur, causing mechanical tears and wear of the transplanter and uneven seedling growth [5]. The current detection approach mostly involves the manual assessment of seedling floating and omission [6]. Therefore, fast and accurate recognition of seedling floating and omission is urgently needed, as it can provide a method for evaluating the operation quality of the rice transplanter.
Deep learning combined with Unmanned Aerial Vehicle (UAV) imagery has gradually become mainstream. To achieve efficient deployment on the UAV platform, Li proposed an improved model based on YOLOv12n, which improved detection precision for UAV images with dense, small-size targets [7]. Nie proposed UAVlite-YOLOv10, a UAV-deployable target detection algorithm based on an improved YOLOv10, with fewer computations and parameters than the base model [8]. Wu proposed OE-YOLO, an improved model based on YOLOv11n, which can detect rice panicles and aid real-time crop monitoring [9]. Chen combined UAVs with remote sensing to collect rice panicle images and used the improved YOLOv8 model REU-YOLO to detect rice panicles [10]. Deep learning has also been used in rice seedling detection [11]. Zhu used GoogLeNet to classify and identify rice seedlings, floating seedlings, and damaged seedlings in the greening stage, reaching an average recognition precision of 91.7% [12]. Liu proposed a lightweight cottonseed damage detection method based on YOLOv5s to reduce false and missed detections; the enhanced model achieved a precision of 92.4% in ablation experiments [13]. The detection of rice seedling rows plays an important role in locating seedling omission. Wang put forward a detection approach for rice seedling rows based on the Hough transform of the neighborhood of feature points: Faster R-CNN was employed to acquire the feature points of rice seedlings, and the Hough transform was utilized to identify the center line of the seedling rows, attaining a precision of 92% [14].
He proposed a rice seedling row detection method based on a Gaussian heat map, since machine-learning-based methods are easily affected by external factors; the experimental results show that the model's inference speed is 22 fps and that it can recognize seedling rows in different environments [15]. Identifying the positions missed by rice transplanters remains difficult, and detection precision is low. To identify these missing positions, Wang proposed a method for assessing the operational quality of unmanned rice transplanters: ASPP and CBAM modules were added to the base YOLOv5 model, and the least squares method was used to fit the rice seedling lines and calculate the missing positions, with an identification precision of 95.8% [16]. This review of peer studies reveals the following shortcomings in current research: the detection of rice seedling omission is difficult and its precision needs to be improved, and most studies focus on a single specific problem, such as seedling classification or seedling row detection, lacking generality and integration. Based on these issues, we improved the YOLOv8 model by integrating HorBlock, a VOV-GSCSP module, and a WIoU loss function, achieving high precision in identifying floating seedlings, sparse seedlings, and omission.
This study focuses on rice seedling omission and introduces a method for detecting rice seedlings and omissions with high precision and a fast detection speed. First, the YOLOv8 model is optimized, and the improved model is used to classify and detect rice seedlings. The center coordinate of each detection box is regarded as the center coordinate of the corresponding rice seedling. The least squares method (LSM) is then used to fit the rice seedling rows and determine the omitted seedlings. In the classification of rice seedlings, seedlings with sparse leaves have limited tillering ability and low yield, which affects overall field production; these are labeled as the "lack" category. The algorithm introduced in this study identifies seedlings of this category and determines their location, laying the foundation for subsequent replanting. The identification of floating seedlings and the detection of missed seedlings can provide technical support for the path planning of seedling replanting robots.

2. Materials and Methods

2.1. Data Acquisition

The data were collected at Beidahuang Group Qixing Farm in Heilongjiang Province, China. The geographic location of the image collection is shown in Figure 1. The image collection site is located at coordinates 132°31′26″~134°22′26″ E longitude. It falls within the cold temperate humid monsoon climate zone, with an annual average temperature ranging from 1 °C to 2 °C. The effective accumulated temperature is between 20 °C and 24 °C, with sunshine hours ranging from 2260 to 2449 h and average precipitation ranging between 550 and 600 mm. Paddy fields are the main arable land where rice is grown.
An unmanned aerial vehicle (UAV) (MAVIC 3T, DJI) was used to capture rice seedling images 5 to 10 days after transplanting. A total of 369 images of rice seedlings were captured for the dataset, at a resolution of 4000 × 3000 pixels. Table 1 lists the flying height and speed.

2.2. Dataset

To enhance the generalization and robustness of the models, the RGB images of rice seedlings acquired by the UAV were further expanded using data augmentation methods, including adjusting the image contrast, adding salt-and-pepper noise, and adjusting the image brightness, as illustrated in Figure 2. This yielded 2607 rice seedling images in total.
The expanded dataset of 2607 images was divided into training, validation, and test sets in an 8:1:1 ratio. The dataset was labeled using LabelImg and stored in VOC and txt formats. The seedlings were categorized into three groups: "qualified", "floating", and "lack". Figure 3a depicts qualified seedlings (labeled "qualified"), which are intact and neither floating nor broken in the insertion hole. Figure 3b shows floating seedlings (labeled "floating"), whose roots are not rooted in the soil but float on the water surface. Figure 3c depicts sparse-leaf seedlings (labeled "lack"), with a leaf age of three leaves or fewer.
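The 8:1:1 split described above can be sketched as follows. This is a minimal illustration only: the paper does not publish its splitting script, and the function name and fixed seed are our own assumptions.

```python
import random

def split_dataset(image_ids, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle image IDs and split them into train/val/test subsets (8:1:1)."""
    rng = random.Random(seed)       # fixed seed for a reproducible split
    ids = list(image_ids)
    rng.shuffle(ids)
    n = len(ids)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

# 2607 images as in Section 2.2
train, val, test = split_dataset(range(2607))
print(len(train), len(val), len(test))  # 2085 260 262
```

Rounding leaves the remainder in the test set; any leftover images could equally be assigned to training.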

2.3. Evaluation Indicators

Key indicators in target detection were selected to evaluate the enhanced model's performance in this experiment. The metrics were precision (P), recall (R), F1-score (F1), number of parameters (Params), floating-point operations (FLOPs), and mean average precision (mAP). TP is the count of correctly identified seedlings, FP the count of erroneously classified seedlings, and FN the count of undetected seedlings; mAP is the mean average precision across all categories in the dataset. mAP@0.5 is the mAP value at an IoU threshold of 50%. mAP@0.5:0.95 is a stricter metric, calculated as the average of mAP values over IoU thresholds from 50% to 95%, providing a more precise evaluation of the model within that range. The related formulas are:
$$P = \frac{TP}{TP + FP} \tag{1}$$
$$R = \frac{TP}{TP + FN} \tag{2}$$
$$AP = \int_0^1 P(R)\,\mathrm{d}R \tag{3}$$
$$mAP = \frac{1}{n} \sum_{i=1}^{n} AP_i \tag{4}$$
$$F1\text{-}score = \frac{2 \times precision \times recall}{precision + recall} \times 100\% \tag{5}$$
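Equations (1)-(5) can be computed directly from raw detection counts. The sketch below is illustrative (the function names are our own); AP is approximated by a trapezoidal sum over a sampled P-R curve rather than the exact integral.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from detection counts (Equations (1), (2), (5))."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    return p, r, f1

def average_precision(recalls, precisions):
    """Approximate AP = integral of P(R) dR with a trapezoidal sum (Equation (3))."""
    ap = 0.0
    for i in range(1, len(recalls)):
        ap += (recalls[i] - recalls[i - 1]) * (precisions[i] + precisions[i - 1]) / 2
    return ap

def mean_average_precision(ap_per_class):
    """mAP as the arithmetic mean of per-class AP values (Equation (4))."""
    return sum(ap_per_class) / len(ap_per_class)
```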
When detecting rice seedling omission, it is important to assess the quality of the linear fit using evaluation indexes. SSE, MSE, and R-squared are used to assess the goodness of fit of the linear equations. R-squared quantifies the relationship between the algorithmic measurements and the manual measurements; its formula is given in Equation (6), where n is the number of predicted rice seedlings, $y_i$ the manually measured value, $\hat{y}_i$ the value measured by the target detection algorithm, and $\bar{y}$ the average of the manual measurements. When $R^2 = 1$, every observation point lies on the line; the closer the value is to 1, the better the fit. SSE and MSE are also calculated to verify the effectiveness of the algorithm; their formulas are shown in Equations (7) and (8).
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{6}$$
$$SSE = \sum_{i=1}^{n} (\hat{y}_i - y_i)^2 \tag{7}$$
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2 \tag{8}$$
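Equations (6)-(8) can be sketched in a few lines; the helper name is our own, and a perfect fit should yield $SSE = MSE = 0$ and $R^2 = 1$, matching the text above.

```python
def fit_metrics(y_pred, y_true):
    """SSE, MSE and R-squared for a fitted seedling row line (Equations (6)-(8))."""
    n = len(y_true)
    sse = sum((yp - yt) ** 2 for yp, yt in zip(y_pred, y_true))
    mse = sse / n
    y_bar = sum(y_true) / n
    sst = sum((yt - y_bar) ** 2 for yt in y_true)  # total sum of squares
    r2 = 1 - sse / sst
    return sse, mse, r2
```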

2.4. Model Training

The test platform used for model training in this study is displayed in Table 2. The neural network model was trained on the dataset of UAV rice seedling images. The batch size was set to 32, the number of epochs to 150, the initial learning rate to 0.001, the initial momentum to 0.937, and the weight decay coefficient to 0.0005. Figure 4 depicts the experimental flow chart. The whole detection process is divided into four parts: (1) acquisition and preprocessing of rice data, including image acquisition, annotation, and dataset construction; (2) improvement and training of the model; (3) identification and detection of rice seedlings; (4) fitting of the rice seedling rows and detection of seedling omission.
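The hyperparameters above map onto a standard training configuration. The sketch below collects them in one place; the commented Ultralytics-style call is illustrative only, as the paper does not publish its training script, and the dataset file name is a placeholder.

```python
# Training hyperparameters from Section 2.4 (values taken from the paper).
hyperparams = {
    "batch": 32,            # batch size
    "epochs": 150,          # training rounds
    "lr0": 0.001,           # initial learning rate
    "momentum": 0.937,      # initial momentum
    "weight_decay": 0.0005, # weight decay coefficient
    "imgsz": 640,           # network input size (Section 3.1.1)
}

# Illustrative invocation (assumed, not the authors' exact script):
# from ultralytics import YOLO
# model = YOLO("yolov8n.yaml")
# model.train(data="rice_seedlings.yaml", **hyperparams)
```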

3. HGV-YOLO Model Construction

3.1. Detection of Rice Seedlings

3.1.1. Improved YOLOv8

The YOLO (You Only Look Once) model is a single-stage detection algorithm [17] that, through ongoing improvement and optimization, has achieved top detection precision and speed in target detection [18]. YOLOv8 enhances the YOLOv5 network by replacing the C3 module with the C2f module to allow a more plentiful gradient flow, and the number of channels is adjusted to accommodate models of varied scales. The YOLOv8 network consists of an input layer, a backbone layer, a neck layer, and a head layer [19]. The network input size is 640 × 640; the input layer preprocesses the input image, resizing and normalizing it to the network input size [20]. The backbone layer serves as the feature extraction layer; it utilizes a Darknet-53 framework, with the C2f module incorporated for residual learning, and mainly extracts information from images. Connecting the backbone and head layers, the neck layer performs feature fusion; its primary objective is to increase the robustness of the detection network by better exploiting the features extracted by the backbone. The head layer is responsible for classification and prediction. The main difference between the head layers of YOLOv8 and YOLOv5 is that YOLOv8 decouples the prediction of the bounding box from that of the category. The final prediction output is generated by detection heads on three feature maps of different sizes.
HGV-YOLO was constructed from the enhanced YOLOv8 model, and its network architecture is depicted in Figure 5. Firstly, the HorBlock module is used in the backbone layer to optimize the C2f module. Secondly, the VOV-GSCSP module based on GSConv is used to improve the C2f module in the neck layer, and the Conv in the backbone layer is improved to GSConv. Finally, the loss function (CIoU) is replaced with WIoU. This resulted in an enhanced YOLOv8 model to meet classification and prediction goals.

3.1.2. HorBlock Module Based on gnConv

In the complex field environment, the leaves of some rice seedlings shield one another, leading to missed detections, wrong detections, and low precision. During identification, floating seedlings are more difficult to recognize than qualified and sparse-leaf seedlings. The HorBlock module, built upon gnConv, was integrated into the YOLOv8n model to address incorrect and missed detections and to enhance recognition precision for floating seedlings [21]. By mitigating the impact of the surrounding environment, this module improves the precision of morphology recognition for rice seedlings. Figure 6 shows the structure of the HorBlock module based on gnConv.
The gnConv operation combines standard convolution, linear projection, and element-wise multiplication, incorporating higher-order spatial interactions via gated convolution and a recursive architecture [22]. For the implementation $y = \mathrm{gConv}(x)$, the input feature is $x \in \mathbb{R}^{H \times W \times C}$:
$$[p_0^{H \times W \times C},\ q_0^{H \times W \times C}] = \theta_{in}(x) \in \mathbb{R}^{H \times W \times 2C} \tag{9}$$
Here, H is the height of the feature, W its width, and C its number of channels; $\theta_{in}$ is the fully connected layer of the input and $\theta_{out}$ the fully connected layer of the output.
$$q_0' = f(q_0) \in \mathbb{R}^{H \times W \times C} \tag{10}$$
$$p_1 = q_0' \odot p_0 \in \mathbb{R}^{H \times W \times C} \tag{11}$$
$$y = \theta_{out}(p_1) \in \mathbb{R}^{H \times W \times C} \tag{12}$$
A depthwise-separable convolution $f$ is applied to $q_0$ to obtain $q_0'$. $p_0$ and $q_0'$ are multiplied element-wise to give $p_1$, i.e., $p_1(i, c) = \sum_{j \in \Omega_i} w_{ij}^c\, q_0'(j, c)\, p_0(j, c)$, where $\Omega_i$ is the local window centered at i and w denotes the convolutional weights. The final spatial interaction result is projected to the specified dimension via the fully connected layer.
The higher-order spatial interaction utilized by gnConv follows the same principle as the first-order interaction [23]. gnConv is a recursive gated convolution that increases model capacity by incorporating higher-order interactions. First, $\theta_{in}$ is used to generate a collection of projected features $p_0$ and $\{q_k\}_{k=0}^{n-1}$:
$$[p_0^{H \times W \times C_0},\ q_0^{H \times W \times C_0},\ \ldots,\ q_{n-1}^{H \times W \times C_{n-1}}] = \theta_{in}(x) \in \mathbb{R}^{H \times W \times (C_0 + \sum_{0 \le k \le n-1} C_k)} \tag{13}$$
Then, the gated convolutions are performed recursively:
$$q_k' = f_k(q_k) \in \mathbb{R}^{H \times W \times C_k},\quad k = 0, 1, \ldots, n-1 \tag{14}$$
Next, $q_k'$ and $p_k$ are multiplied element-wise, i.e., the following spatial interaction:
$$p_{k+1} = q_k' \odot g_k(p_k)/\alpha,\quad k = 0, 1, \ldots, n-1 \tag{15}$$
Here, $f_k$ is a set of depthwise convolutional layers, and $g_k$ is a fully connected layer that matches dimensions across the different orders:
$$g_k = \begin{cases} \mathrm{Linear}(C_0, C_0) & k = 0 \\ \mathrm{Linear}(C_{k-1}, C_k) & 1 \le k \le n-1 \end{cases} \tag{16}$$
The specific value of $C_k$ is calculated using the following formula:
$$C_k = \frac{C}{2^{n-1-k}} \tag{17}$$
so that the dimension $C_k$ grows from small to large with k. The results of the spatial interactions are mapped to the specified dimension through the fully connected layer:
$$y = \theta_{out}(p_n) \in \mathbb{R}^{H \times W \times C} \tag{18}$$
This design shows that gnConv performs interactions in a coarse-to-fine fashion, using fewer channels to compute the lower-order spatial interactions.
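The recursion in Equations (13)-(18) can be sketched in NumPy. This is a minimal illustration of the channel bookkeeping, not the authors' implementation: the depthwise convolution $f_k$ is replaced by a 3x3 mean filter, $g_0$ is simplified to the identity, the scale $\alpha$ is set to 1, and all projections use random weights.

```python
import numpy as np

def gn_conv(x, n=3, seed=0):
    """Sketch of n-order recursive gated convolution (gnConv) on an (H, W, C) map."""
    H, W, C = x.shape
    rng = np.random.default_rng(seed)
    dims = [C // 2 ** (n - 1 - k) for k in range(n)]  # C_k = C / 2^(n-1-k), Eq. (17)
    total = dims[0] + sum(dims)                       # C_0 plus q_0..q_{n-1}, Eq. (13)

    def linear(t, c_out):  # random 1x1 channel-mixing projection (stand-in)
        w = rng.standard_normal((t.shape[-1], c_out)) / np.sqrt(t.shape[-1])
        return t @ w

    def depthwise(t):  # stand-in for f_k: per-channel 3x3 mean filter
        pad = np.pad(t, ((1, 1), (1, 1), (0, 0)), mode="edge")
        return sum(pad[i:i + H, j:j + W] for i in range(3) for j in range(3)) / 9

    proj = linear(x, total)                           # theta_in
    p = proj[..., :dims[0]]
    qs, start = [], dims[0]
    for c in dims:                                    # split q_0..q_{n-1}
        qs.append(proj[..., start:start + c])
        start += c
    for k in range(n):
        gate = depthwise(qs[k])                       # q_k' = f_k(q_k), Eq. (14)
        # p_{k+1} = q_k' * g_k(p_k); g_0 simplified to identity here, Eq. (15)
        p = gate * (p if k == 0 else linear(p, dims[k]))
    return linear(p, C)                               # theta_out, Eq. (18)
```

The output keeps the input shape (H, W, C), while the intermediate orders work on progressively wider channel slices, mirroring the coarse-to-fine interaction described above.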

3.1.3. VOV-GSCSP Based on GSConv

For the seedling morphology classification task, the model's detection speed needed to be improved while maintaining detection precision. In this paper, VOV-GSCSP and GSConv are used to optimize the YOLOv8 network: in the neck layer, VOV-GSCSP replaces the C2f module, and the GSConv module replaces the Conv module [24]. Figure 7 depicts the structure of VOV-GSCSP. GSConv is a hybrid convolution combining standard convolution (SC), depthwise-separable convolution (DSC), and a shuffle operation. Compared to normal convolution, GSConv is computationally more efficient and has fewer parameters; it improves the detection speed and overall performance of the model while maintaining excellent detection precision. As shown in Figure 8, GSConv first applies a standard convolution to downsample the channels, then applies a depthwise-separable convolution, concatenates the two convolution results, and uses the shuffle operation to mix the channels of the two branches. Shuffle is a homogeneous mixing strategy: by infiltrating the information produced by the standard convolution into each component of the information produced by the depthwise-separable convolution, it allows local feature information to be exchanged uniformly across channels. Utilizing GSConv during feature map extraction decreases the computational workload and enhances the model's processing performance. Furthermore, GSConv has a regularization effect that enhances the model's generalization capacity to some degree. GSbottleneck introduces GSConv, and replacing the C2f module in the neck layer with the cross-stage partial network module VOV-GSCSP reduces the network's complexity.
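The channel logic of GSConv (standard conv to half the output channels, a depthwise branch, concatenation, then shuffle) can be sketched as follows. This is an illustration of the data flow only: the standard convolution is simplified to a 1x1 channel mix and the depthwise branch to a per-channel scaling, with random weights.

```python
import numpy as np

def gs_conv(x, c_out, seed=0):
    """Sketch of GSConv on an (H, W, C) map: SC branch, DSC branch, concat, shuffle."""
    rng = np.random.default_rng(seed)
    half = c_out // 2
    w_sc = rng.standard_normal((x.shape[-1], half)) / np.sqrt(x.shape[-1])
    sc = x @ w_sc                              # standard conv branch (1x1 mix)
    dsc = sc * rng.standard_normal(half)       # depthwise branch: per-channel op
    cat = np.concatenate([sc, dsc], axis=-1)   # (H, W, c_out)
    # shuffle: interleave the SC and DSC channels one by one
    H, W, C = cat.shape
    return cat.reshape(H, W, 2, C // 2).transpose(0, 1, 3, 2).reshape(H, W, C)

y = gs_conv(np.ones((4, 4, 8)), 16)
print(y.shape)  # (4, 4, 16)
```

The reshape/transpose pair implements the channel shuffle: after it, SC and DSC channels alternate, so the standard-convolution information permeates every part of the depthwise output, as described above.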

3.1.4. WIoU Loss Function

YOLOv8 uses the CIoU regression algorithm as the default bounding box loss. In target detection, the precision of the model's results is influenced by the loss function. In a practical field context, light, environmental factors, and mutual occlusion of rice seedlings lead to low-quality samples in the gathered rice seedling dataset. Assuming that high-quality training data is a prerequisite, CIoU concentrates on enhancing the fitting capability of the bounding box loss, and the model's detection performance improves accordingly. Nonetheless, if bounding box regression is reinforced indiscriminately in the presence of low-quality examples, their impact is magnified, the model's capacity for generalization diminishes, and detection performance suffers. Tong et al. proposed a dynamic non-monotonic focusing mechanism and designed Wise-IoU (WIoU) [25]. By substituting an "outlier" measure for IoU in the quality assessment of anchor boxes, the dynamic non-monotonic mechanism diminishes the competitiveness of high-quality anchor boxes and decreases the detrimental gradients caused by low-quality examples in the dataset [26]. Therefore, WIoU is utilized instead of CIoU in this paper, allowing the loss to concentrate on anchor boxes of average quality and thereby marginally enhancing the model's detection performance.
The aspect ratio and distance will enhance the penalty for low-quality examples, and the model’s generalization performance will also be affected. This is because the data collected in the complex field environment will invariably contain some low-quality instances [27]. WIoU, on the other hand, will reduce the penalty when the anchor box and the target coincide, so the model can obtain a better generalization ability.
$R_{WIoU} \in [1, e)$ significantly enlarges the $L_{IoU}$ of an ordinary-quality anchor box. $L_{IoU} \in [0, 1]$ decreases the $R_{WIoU}$ of a high-quality anchor box and prioritizes the distance between center points when the target box and anchor box overlap well.
$$L_{WIoUv1} = R_{WIoU} \cdot L_{IoU} \tag{19}$$
$$R_{WIoU} = \exp\!\left(\frac{(x - x_{gt})^2 + (y - y_{gt})^2}{(W_g^2 + H_g^2)^*}\right) \tag{20}$$
To prevent $R_{WIoU}$ from obstructing gradients, $W_g$ and $H_g$ are detached from the computation graph (denoted by the superscript *). The model's convergence speed and generalization capability are enhanced by eliminating factors that impede convergence, without introducing additional metrics such as the aspect ratio. Figure 9 depicts the WIoU structural diagram.
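A minimal numerical sketch of WIoU v1 for two axis-aligned boxes follows; here $W_g$ and $H_g$ are the width and height of the smallest enclosing box, and the gradient detachment marked by * is a no-op since this sketch has no autograd graph. The function name is our own.

```python
import math

def wiou_v1(box, gt):
    """WIoU v1 loss for (x1, y1, x2, y2) boxes, following Equations (19)-(20)."""
    # intersection and IoU
    ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
    ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    iou = inter / (area(box) + area(gt) - inter)
    # center distance over the enclosing-box diagonal (Equation (20))
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    gx, gy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    wg = max(box[2], gt[2]) - min(box[0], gt[0])
    hg = max(box[3], gt[3]) - min(box[1], gt[1])
    r_wiou = math.exp(((cx - gx) ** 2 + (cy - gy) ** 2) / (wg ** 2 + hg ** 2))
    return r_wiou * (1 - iou)   # L_WIoUv1 = R_WIoU * L_IoU, Equation (19)

print(wiou_v1((0, 0, 2, 2), (0, 0, 2, 2)))  # 0.0 for a perfect match
```

A perfectly matched pair gives $L_{IoU} = 0$ and hence zero loss, while offset boxes are penalized by both the IoU term and the center-distance factor.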

3.2. Detecting Rice Transplanter Omission Positions

The least squares method is a prevalent mathematical optimization strategy for curve fitting that minimizes the sum of squared errors between the predicted and true values of all samples [28]. Its general form is given in Equation (21):
$$f(x) = w_1\varphi_1(x) + w_2\varphi_2(x) + w_3\varphi_3(x) + \cdots + w_m\varphi_m(x) \tag{21}$$
where $w_k$ are undetermined coefficients and $\varphi_k(x)$ is a set of linearly independent functions selected in advance [29]. The least squares technique minimizes the sum of squared distances between $y_i$ (i = 1, 2, ..., n) and $f(x_i)$ [30]. The application of LSM to line fitting is shown in Figure 10.
LSM was utilized for linear fitting in this study: the enhanced YOLOv8 model detects rice seedlings and outputs bounding boxes, whose center coordinates approximate the rice seedling coordinates. The least squares method fits the horizontal and vertical lines of the seedlings, and the intersection of a fitted horizontal line and a fitted vertical line is recorded as the predicted center point of a rice seedling. If this point does not fall into any prediction box, it is judged to be a rice transplanter omission position. The positions of the missed seedlings are shown in Figure 11.
The detection process for rice transplanter omission positions is shown in Figure 12. Figure 12a depicts rice seedlings after pretreatment. Firstly, the seedling image was input into the improved model HGV-YOLO, and the central coordinate point of each seedling was obtained, as shown in Figure 12b. Secondly, after identifying rice seedlings with HGV-YOLO, text files containing the seedling information can be output; each file includes the seedling category, the center coordinates (x, y) of the seedling, and the width and height of the detection bounding box. The detected center coordinates are referenced to the upper left corner of the image, whereas a standard rectangular coordinate system uses the lower left corner as the origin; to facilitate the study, we therefore flipped the image vertically and used LSM to fit the rice seedling lines to determine the missing positions. The coordinates of the seedling center points (x, y) were first extracted, and a scatter map of the center points was created using MATLAB R2019a and overlaid on the rice seedling image; the center points are displayed in Figure 12c. The least squares method was then employed in MATLAB to fit the center points and derive the fitted lines for the rows and columns of rice seedlings, represented by Equations (22) and (23). If an intersection of the fitted seedling lines does not fall into any rice seedling detection box, it is considered the position of a missing seedling. The fitted seedling row lines are shown in Figure 12d. We use a and b as subscripts to denote rows and columns: a1 for the first-row equation, b1 for the first-column equation, and so on.
The red boxes in Figure 12e illustrate the detected rice transplanter omission positions.
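The fit-intersect-check pipeline described above can be sketched as follows. This substitutes NumPy's `polyfit` for the MATLAB fitting step; the function name and data layout are our own assumptions. Columns are fitted with the coordinates swapped (x as a function of y) so near-vertical lines do not break the fit.

```python
import numpy as np

def find_omissions(centers_rows, centers_cols, boxes):
    """Fit row/column lines through seedling centre points by least squares,
    intersect them, and flag intersections inside no detection box as missed.

    centers_rows / centers_cols: lists of (x, y) points, one list per row/column.
    boxes: detected seedling boxes as (x1, y1, x2, y2).
    """
    def fit(points):                     # least squares line y = a*x + b
        xs, ys = zip(*points)
        return np.polyfit(xs, ys, 1)

    rows = [fit(p) for p in centers_rows]                        # y = a*x + b
    cols = [fit([(y, x) for x, y in p]) for p in centers_cols]   # x = c*y + d

    missed = []
    for a, b in rows:
        for c, d in cols:
            # solve y = a*x + b and x = c*y + d (assumes the lines are not parallel)
            x = (c * b + d) / (1 - a * c)
            y = a * x + b
            inside = any(x1 <= x <= x2 and y1 <= y <= y2
                         for x1, y1, x2, y2 in boxes)
            if not inside:
                missed.append((x, y))
    return missed
```

With one horizontal row and one vertical column of points, the single intersection is returned as a missed position whenever no detection box covers it, mirroring the criterion stated above.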

4. Analysis of Model Structure and Parameters

4.1. Analysis of HorBlock Module Position

The HGV-YOLO model incorporates HorBlock, and two distinct integration strategies were used in this experiment to confirm the efficacy of the module. The first approach substitutes the HorBlock module for one C2f module at a time in the backbone layer of the YOLOv8 network (models designated Y8H1-Y8H4), so that HorBlock captures local aspects of the network structure. The second approach substitutes all C2f modules in the backbone layer with HorBlock modules (model designated Y8H5) to acquire the characteristics of the entire backbone. Figure 13 depicts the positions of the various HorBlock modules. With these two methods, we can analyze the impact of HorBlock replacing the C2f module at various points in the backbone layer.
As can be seen from the data in Table 3, the HorBlock module implements high-order spatial interactions through gnConv and its recursive design, and its integration position in the backbone layer makes a difference to the improved model. Compared to the original YOLOv8n, the precision of Y8H1 improved by 1.8%; however, the training time for 150 epochs is 0.698 h, longer than the base model's 0.473 h. When HorBlock is inserted at the Y8H2 position, the precision and recall rise by 2.4% and 0.9%, while mAP@0.5 and mAP@0.5:0.95 increase by 2.2% and 3.2%; the training time, however, is 0.774 h. This indicates that the spatial interaction ability of the HorBlock module can effectively classify and recognize rice seedlings, improving recognition precision at the cost of training time. With HorBlock at Y8H3, the experimental data show that all indicators are lower than the base model except the precision, which increases by 0.3%. At the Y8H4 position, apart from the recall increasing by 1.3%, the other indicators are not particularly strong. At the Y8H5 position, the precision and recall increase by 2.4% and 0.9%, but the training time increases by 0.263 h.
To graphically represent and evaluate the effectiveness of the basic model YOLOv8n in classifying and recognizing rice seedlings after integrating the HorBlock module at different positions, a P-R curve containing five different models was constructed, as shown in Figure 14. According to Figure 14, the equilibrium point of Y8H2 is significantly higher than that of the other four models, and the P-R curve of the Y8H2 model is superior to that of Y8H1, Y8H3, Y8H4, and Y8H5, indicating that the model is performing at its best.
This shows that the HorBlock module is integrated into the backbone layer of YOLOv8n. Due to its high-order spatial interaction ability, it can transmit the morphological characteristics of rice seedlings in the backbone layer, thus improving the recognition ability of rice seedlings under different categories. Among them, gnConv can split the features of rice seedlings into various forms in different dimensions, excavate richer clues between hierarchical features, and further improve the classification and recognition precision of rice seedlings in complex environments.
Figure 14 depicts the P-R curves of the HorBlock module at various places in HGV-YOLO. The results indicate that the Y8H2 model outperformed the other four models, confirming its superiority.

4.2. Ablation Experiments

An ablation test was designed to confirm the efficacy of adding the HorBlock module, the VOV-GSCSP module, and the WIoU bounding box loss function to the enhanced model HGV-YOLO; the findings are displayed in Table 4, with YOLOv8n as the base model. When the HorBlock module is introduced alone to optimize the backbone layer, the indicators show no significant change except for slight improvements in precision and mAP@0.5:0.95. When the VOV-GSCSP module is introduced alone into the neck layer, the model's precision increases by 1% and the number of parameters is reduced by 0.3%, while the recall and mAP@0.5 are slightly lower than the base model; VOV-GSCSP lightens the model and enhances its non-linear representation, reducing model size while maintaining a high precision. Introducing the WIoU bounding box loss function alone reduces precision by 0.4%, increases recall by 0.6%, and reduces the parameters by 0.17%, with only slight changes in the other metrics; this indicates that WIoU focuses on anchor boxes of normal quality, reducing the impact of low-quality anchor boxes and indirectly improving the model's performance. Integrating the HorBlock module with the WIoU loss function improves the precision and recall to a certain extent, optimizing the YOLOv8 network architecture, while integrating the VOV-GSCSP module with the WIoU loss function somewhat reduces the number of parameters and improves the recall and mAP@0.5:0.95. When the HorBlock and VOV-GSCSP modules are combined, the precision and recall increase by 0.5% and 0.3%, respectively.
After the introduction of these three modules, the improved model, compared with the base model YOLOv8n, increased precision by 2.4%, recall by 1.6%, mAP@0.5 by 2.3%, and mAP@0.5:0.95 by 2.4%, while the number of parameters was reduced by 3.14%. This accomplishment is primarily attributed to the following factors: the higher-order spatial interaction of the HorBlock module and its integration into the backbone layer improve the model's feature extraction capability; the integration of VOV-GSCSP in the neck layer improves the detection of small targets and of the different categories of rice seedlings; and the WIoU bounding box loss function lessens the impact of the low-quality samples that invariably occur in rice seedling images collected in complex environments.
The curves of the different evaluation indicators before and after the model improvement are shown in Figure 15. Figure 15a,b show the precision and recall curves of the improved and base models during training; the curves of the improved model are noticeably higher than those of YOLOv8n. As shown in Table 4, the precision of the improved model is 93.7%, 2.4% higher than YOLOv8n's 91.3%, and its recall is 83.1%, 1.6% higher than YOLOv8n's 81.5%. The precision and recall curves of the improved model also oscillate less and are smoother than those of the base model, because the improved YOLOv8 incorporates the HorBlock module, allowing it to capture complex visual layouts.
The mAP@0.5 and mAP@0.5:0.95 curves of the base model and the improved model during training are illustrated in Figure 15c,d. The figure illustrates that the enhanced model outperforms the base model at various thresholds. The amplitude of the curve oscillations at mAP@0.5 and mAP@0.5:0.95 in the improved model is substantially lower than that of the base model, and the curves are more stable. The enhanced model's benefits are evident in the recognition of rice seedling morphology.
Figure 16 depicts the P-R curve of HGV-YOLO on the rice seedling dataset. The scores for qualified seedlings, floating seedlings, and sparse (lack) seedlings are 98.5%, 81.5%, and 93.3%, respectively, and the mAP@0.5 of the model is 91.1%.
The enhanced model provides a lightweight network that can serve as a basis for the practical use of compact automatic replanting equipment incorporating a deep learning recognition model. The study's results confirm the feasibility of the enhanced model.

5. Results and Discussion

5.1. Row Fitting and Omission Detection

To compare the quality of the fitted rice rows, linear row equations were calculated from the center coordinates of the rice seedlings detected by the different models, and the R-squared, SSE and MSE of each fit were evaluated. The comparisons are shown in Table 5; one rice seedling row was selected for the comparison.
The six models, YOLOv3-tiny, YOLOv5, YOLOv5-P6, YOLOv6, YOLOv8n, and HGV-YOLO, produced the fitted linear equations y = 0.0479x + 0.3754, y = 0.0394x + 0.3626, y = 0.0492x + 0.3566, y = 0.0481x + 0.3563, y = 0.0398x + 0.3627, and y = 0.0476x + 0.3589, respectively, with SSEs of 2.8863 × 10⁻⁴, 1.2461 × 10⁻⁴, 3.3959 × 10⁻⁴, 3.3421 × 10⁻⁴, 3.1372 × 10⁻⁴, and 9.8063 × 10⁻⁵. The fitted rice row lines are shown in Figure 17.
The R-squared of YOLOv6 was 0.7904, significantly lower than that of the other five models, and its SSE and MSE were also higher, indicating that this model fitted the rice row line with the lowest degree of fit to the data. The R-squared of the improved HGV-YOLO was 0.9129, significantly higher than that of the other five models, and its SSE and MSE were the lowest, indicating that the rice row equation fitted by this model fit the data the closest. Because different models recognize rice seedlings with different accuracies, the fitted seedling rows differ, and seedling omissions can easily go undetected. Table 5 shows the precision of each model in identifying seedling omission, obtained through experimental analysis: YOLOv3-tiny+LSM has a precision of 88.4% and a detection time of 28 ms; YOLOv5+LSM, 91.2% and 22 ms; YOLOv5-P6+LSM, 86.6% and 33 ms; YOLOv6+LSM, 86.3% and 37 ms; YOLOv8n+LSM, 93.6% and 25 ms; and HGV-YOLO+LSM, 94.3% and 27 ms. Figure 18 shows the row line fitting results of the different models in different scenarios, and Figure 19 shows the results of the improved model for detecting omitted rice seedling locations.
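As a sketch of this evaluation, the row line can be fitted to the detected seedling centers by ordinary least squares and scored with SSE, MSE and R-squared. The coordinates below are illustrative, not the detected centers from the paper:

```python
import numpy as np

def fit_row_line(xs, ys):
    """Fit y = a*x + b to seedling center points by least squares
    and return the slope, intercept, SSE, MSE and R-squared."""
    a, b = np.polyfit(xs, ys, deg=1)                 # least-squares line
    ys = np.asarray(ys, dtype=float)
    resid = ys - (a * np.asarray(xs, dtype=float) + b)
    sse = float(np.sum(resid ** 2))                  # sum of squared errors
    mse = sse / len(ys)                              # mean squared error
    sst = float(np.sum((ys - ys.mean()) ** 2))
    r2 = 1.0 - sse / sst                             # coefficient of determination
    return a, b, sse, mse, r2

# Hypothetical normalized center coordinates of one seedling row
xs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
ys = [0.380, 0.385, 0.389, 0.395, 0.399, 0.404]
a, b, sse, mse, r2 = fit_row_line(xs, ys)            # a ≈ 0.048, b ≈ 0.3752
```

A near-one R-squared and a small SSE, as for HGV-YOLO in Table 5, mean the detected centers lie almost exactly on the fitted row line.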
The detection of seedling omission in this study is currently based on the unidirectional detection of images, and the search for omitted seedlings based on image recognition has been completed successfully. Subsequent research will improve the automatic replanting system based on GPS positioning and image coordinate conversion and guide the planting implement directly to the omitted seedlings.
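The omission search along a fitted row can be illustrated with a simple spacing rule: a gap between consecutive detected seedlings that is much larger than the nominal hill spacing is flagged as one or more missed hills. This is a hypothetical sketch of the idea, not the paper's exact procedure; the tolerance `tol` and the even placement of predicted centers are assumptions:

```python
def find_omissions(xs, spacing, tol=0.5):
    """Flag missed-transplant positions along one fitted row.
    Any gap between consecutive detected seedlings exceeding
    (1 + tol) times the nominal hill spacing is reported as one
    or more omitted hills, placed evenly inside the gap."""
    xs = sorted(xs)
    missing = []
    for left, right in zip(xs, xs[1:]):
        gap = right - left
        # whole hills that fit in the gap beyond the first
        n_missed = round(gap / spacing) - 1
        if gap > (1 + tol) * spacing and n_missed > 0:
            for k in range(1, n_missed + 1):
                missing.append(left + k * gap / (n_missed + 1))
    return missing

# One seedling missing between x = 0.2 and x = 0.4 at 0.1 spacing
gaps = find_omissions([0.1, 0.2, 0.4, 0.5], spacing=0.1)
```

The predicted omission coordinate would then be handed to the replanting mechanism after image-to-field coordinate conversion.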

5.2. Floating and Sparse Rice Seedling Detection

Five mainstream models, trained from scratch on the same dataset, were used to compare the performance of the proposed method (HGV-YOLO). The results are shown in Table 6. The improved model has a precision of 93.7%, a recall of 83.1%, and a recognition precision of 96.2% for qualified seedlings, 91.1% for floating seedlings, and 93.7% for sparse seedlings. The F1 score is 88.1%, the mAP@0.5 is 91.1%, the average prediction time per image is 15.7 ms, and the computational cost is 8.1 GFLOPs. Compared with the unimproved model, the enhanced model increased precision by 2.4%, recall by 1.6%, the F1 value by 2.0%, and mAP@0.5 by 2.3%.
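As a quick check of the reported numbers, the F1 score is the harmonic mean of precision and recall; with P = 93.7% and R = 83.1% it reproduces the 88.1% in Table 6:

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.937, 0.831)  # ≈ 0.881, matching the F1 reported in Table 6
```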
The F1 score of HGV-YOLO improved by 1.8%, 1.9%, and 1.4% compared to the YOLOv5, YOLOv3, and YOLOv6 models, and the mAP@0.5 increased by 2.7%, 4.0%, and 2.5%, respectively. Table 6 also presents the classification results for the three types of rice seedlings, indicating that the recognition performance for qualified, floating, and sparse seedlings in this study surpasses that of the five other models. The recognition precision for each type of rice seedling exceeds 90%, meeting the technical standards for real-world field operations.
Table 6 shows that the average prediction time per rice seedling image in this study is 15.7 ms, making it one of the fastest of the six selected models; its recognition speed surpasses that of the YOLOv5 and YOLOv8n models. This suggests that the enhancements in this study improve the image recognition of rice seedlings and the classification of rice seedlings in the field.
Figure 20 compares the recognition accuracies of the YOLOv8 model and the improved model for the several types of seedlings: qualified, floating, and sparse. The data in Figure 20 show that qualified seedlings had improved recognition results, with an average precision of 95%. For floating seedlings, the enhanced model shows significant improvements: recognition precision increased by 3.7%, recall by 5.8%, the F1 value by 5.1%, and the mAP@0.5 value by 6.0%. The precision for sparse seedlings (labeled "lack") also increased significantly, with the F1 score rising by 2.8%. Overall, HGV-YOLO demonstrates a significant enhancement in seedling recognition compared to the baseline YOLOv8, with an average precision improvement of over 2%.
The results of the model detection are shown in Figure 21. HGV-YOLO has the most accurate recognition effect and the highest detection precision on the test set. The detection and classification precision of the other five models are lower than HGV-YOLO. Therefore, the improved model performs the best, validating the effectiveness of the proposed model.
Table 7 shows the overall evaluation indexes of mAP@0.5, mAP@0.5:0.95, and parameter quantity. The experimental results show that the difference in mAP@0.5 between the models for the qualified seedling category is subtle, but the difference for the floating seedling category is very significant. HGV-YOLO's floating category scored 81.5% mAP@0.5, much higher than the other models. This is because HGV-YOLO, which incorporates the HorBlock module, achieves high-order interactions in the backbone network layer to effectively extract the morphological features of floating rice seedlings. The HorBlock module in the backbone layer splits the features of different categories of rice seedlings into different dimensions and then mines the connections between the different levels of features to improve the classification of seedling categories. Moreover, the WIoU loss function incorporated into HGV-YOLO reduces the impact of the low-quality samples present in the rice seedling dataset on model predictions.
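The effect of the WIoU loss can be sketched numerically. Following the WIoU v1 formulation of Tong et al. [25], the IoU loss is scaled by a distance-based attention term built from the smallest box enclosing the prediction and the ground truth; the paper does not state which WIoU version it adopts, so this minimal sketch should be read as illustrative (v3 adds a dynamic focusing coefficient on top of this):

```python
import math

def wiou_v1(box_a, box_b):
    """WIoU v1 loss for axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection and union
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou_loss = 1.0 - inter / union
    # squared distance between box centers
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # diagonal of the smallest enclosing box (detached from the
    # gradient, i.e. treated as a constant, in the original paper)
    wg = max(ax2, bx2) - min(ax1, bx1)
    hg = max(ay2, by2) - min(ay1, by1)
    return math.exp(d2 / (wg ** 2 + hg ** 2)) * iou_loss

perfect = wiou_v1((0, 0, 2, 2), (0, 0, 2, 2))   # 0.0 for a perfect match
shifted = wiou_v1((0, 0, 2, 2), (1, 0, 3, 2))   # IoU loss amplified by center offset
```

The exponential factor enlarges the loss of boxes whose centers drift from the ground truth, without the harsh penalties that degrade training on low-quality annotations.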
HGV-YOLO and the latest SOTA detection models were comprehensively compared on the rice seedling dataset using the same training parameters; the results are shown in Table 8. HGV-YOLO has a high recognition precision for the three kinds of rice seedlings and a relatively good recognition effect, especially in mAP@0.5. The recognition precision of the improved HGV-YOLO was 96.2% for qualified seedlings, 91.1% for floating seedlings, and 93.7% for sparse seedlings, improvements of 2.8%, 1.6%, and 1.4%, respectively, over YOLOv10n. In particular, the detection precision of HGV-YOLO for floating seedlings was 91.1%, which was 1.5%, 4.9%, and 1.6% higher than that of YOLOv8-P2, YOLOv9c, and YOLOv10n, and 3.1% and 1.1% higher than that of YOLOv11n and YOLOv12n, respectively.
In addition, the mAP@0.5 of HGV-YOLO was 91.1%, which was 2.3% and 3.0% higher than that of YOLOv9c and YOLOv10n, and 2.7% and 2.4% higher than that of YOLOv11n and YOLOv12n, respectively. The detection time of HGV-YOLO was 15.7 ms, faster than the 20.3 ms of YOLOv8-Ghost-P2. The results indicate that HGV-YOLO has a higher precision and faster inference speed than the other models. As a result, HGV-YOLO has the best comprehensive performance among the seven models.
The actual field environment after rice transplanting is cluttered, rice seedling forms vary, and seedlings are easily tilted by the wind. Figure 22 shows the test results of five models under three water-level environments: normal, high, and low. Comparing the five models, YOLOv9c, YOLOv10n, YOLOv11n, YOLOv12n, and HGV-YOLO, all of them show some false detections in the three environments; however, HGV-YOLO produces better classification results and fewer false detections. During the research, we found that the floating seedling category has the greatest impact on HGV-YOLO's classification precision. The classification criterion for a floating seedling is that its root is not planted in the soil and floats on the water; when these characteristics are obvious, the recognition effect is very good. As shown in Figure 22a, qualified seedlings are incorrectly classified as floating seedlings by YOLOv10n, YOLOv11n, and YOLOv9c. This is mainly because, in the actual field environment, floating seedlings have a larger inclination angle than qualified seedlings, so it is difficult to determine whether the roots are planted in the soil. In addition, the transplanter inevitably disturbs the field during transplanting, covering some rice leaves with soil and complicating the recognition background. As shown in Figure 22b, the missed detections of YOLOv10n and YOLOv9c are serious, and YOLOv11n and YOLOv12n mistakenly detected normal seedlings as other types. Moreover, the leaves of rice seedlings are inevitably disordered, so some models, such as YOLOv9c, YOLOv10n, and HGV-YOLO, produce overlapping detection boxes, as shown in Figure 22c.
Therefore, to better handle the above situation, we have decided to collect a large number of images of rice seedlings under different conditions for more accurate classification and detection.

5.3. Evaluation of Rice Transplanter Operation Quality

A field test was conducted to verify the reliability of the improved model. The field-test images were acquired at Qixing Farm in Heilongjiang Province, China, using the same UAV as before. The flight height of the UAV was 6 m, and each image covered 14 to 16 rows of rice seedlings. To evaluate the operation quality of the rice transplanter, a five-point sampling method was used to collect rice seedling images for each sampled field; each sampling plot was 20 m × 30 m.
According to the definitions of the qualified rate, floating rate, sparse rate and omission rate in “Rice transplanter test method”, the methods of calculation for the qualified seedling rate, floating seedling rate, sparse seedling rate, and omission seedling rate are as follows:
(1) Floating seedling rate: After transplanting, 100 points were selected consecutively in each region to be measured, and the number of floating seedlings was calculated and repeated three times. The calculation method was the number of floating seedlings divided by the total plants measured.
(2) Omission seedling rate: After transplanting, 100 points were selected consecutively in each region to be measured, and the number of omission seedlings was calculated and repeated three times. The calculation method was the number of omission seedlings divided by the total number of measured points.
(3) Qualified seedling rate: After transplanting, 100 points were selected consecutively in each region to be measured, and the number of qualified seedlings was calculated and repeated three times. The calculation method was the number of qualified seedlings divided by the total plants measured.
(4) Sparse seedling rate: After transplanting, 100 points were selected consecutively in each region to be measured, and the number of sparse seedlings was calculated and repeated three times. The calculation method was the number of sparse seedlings divided by the total plants measured.
$$R_P = \frac{Z_p}{Z} \times 100\% \tag{22}$$
$$R_L = \frac{X_L}{X} \times 100\% \tag{23}$$
$$R_Z = \frac{Z_Z}{Z} \times 100\% \tag{24}$$
$$R_S = \frac{Z_s}{Z} \times 100\% \tag{25}$$
In Equations (22)–(25), Z is the total number of rice seedlings measured, Zp is the total number of floating rice seedlings, and RP is the floating seedling rate. X is the total number of hills (points) measured, XL is the number of hills with omitted seedlings, and RL is the omission rate. RZ is the qualified seedling rate, and ZZ is the total number of qualified rice seedlings. RS is the sparse seedling rate, and Zs is the total number of sparse rice seedlings.
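Equations (22)–(25) are all simple class shares; a direct transcription, using illustrative counts rather than the field data, looks like this:

```python
def rate(count, total):
    """Share of one seedling class as a percentage, Eqs. (22)-(25)."""
    return 100.0 * count / total

# Illustrative counts for one sample plot (not the reported field data)
plants = 300            # Z: total seedlings measured
hills = 300             # X: total hills (points) measured
r_p = rate(3, plants)     # floating rate,  Z_p / Z
r_l = rate(4, hills)      # omission rate,  X_L / X
r_z = rate(288, plants)   # qualified rate, Z_Z / Z
r_s = rate(5, plants)     # sparse rate,    Z_s / Z
```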
Figure 23 shows the results of rice omission detection conducted by the model on large-scale and small-scale rice plots. The results of the rice seedling detection using the HGV-YOLO model in the field test are presented in Table 9, and the actual statistical outcomes from the field environment are summarized in Table 10. The improved model was used to classify and recognize the rice seedling images, and the predicted rates of qualified, floating, sparse, and omission seedlings were 95.99%, 0.86%, 1.06%, and 0.96%, respectively. Based on 50 sample rice field plots and 13,000 rice seedling test samples, the actual qualified rate was 96.23%, the floating rate was 1.06%, and the missed transplanting rate was 1.12%. The results show that the relative error between the predicted and actual values is within tolerance, indicating that the model can be used to evaluate the operation quality of the rice transplanter.

5.4. Discussion

This research proposes an HGV-YOLO model based on YOLOv8n which can accurately classify rice seedlings and locate omission positions. It aims to provide effective technical support for the evaluation of the transplanting quality of rice transplanters and research on the path planning of seedling mending machinery.
HGV-YOLO's detection of different forms of rice seedlings increases the complexity of the improvement and makes the task more challenging. Firstly, the backbone network was improved to meet the classification and recognition requirements of rice seedlings in a complex environment. The C2f structure was optimized using a HorBlock module, and the characteristics of the different forms of rice seedlings were split across different dimensions. This improvement provides richer clues between hierarchical features. Through the gating mechanism and recursive design of gnConv, high-order spatial interaction is realized, reducing the interference of similar targets or backgrounds and improving the feature extraction ability of the backbone network for rice seedlings. Compared with the rice seedling row detection method of Lin et al., which used Faster R-CNN for the identification and positioning of rice seedlings [31], the precision of the improved YOLOv8 model was 3.9% higher. Because the HorBlock module can capture complex visual layouts, the oscillation amplitude of the precision and recall curves is smaller and the curves are more stable, which means the model can identify and classify rice seedlings synchronously. In addition, HGV-YOLO maintains a lightweight and efficient design: its computational cost is 8.1 GFLOPs, which meets the dual requirements of lightweight deployment and rapid inference for rice transplanting machinery.
The ablation study further revealed the synergy mechanisms of the key modules. Introducing the HorBlock module into the backbone layer slightly improved the precision and mAP@0.5:0.95. Introducing a VOV-GSCSP module into the neck layer increased the precision P by 1% and reduced the parameter quantity by 0.3%, while the recall R and mAP@0.5 were slightly lower than those of the base model. Their synergy gives the model the advantages of a light weight and high precision. Moreover, the introduction of WIoU helps focus on common-quality anchor boxes, reducing the impact of environmental noise such as illumination and occlusion. When all three modules are applied together, the feature mapping weights are dynamically adjusted to effectively capture the feature information of the target rice seedlings at different scales. HGV-YOLO showed an excellent ability to classify and identify qualified seedlings, floating seedlings, and sparse seedlings and to detect missing positions. Wang et al. proposed a rice seedling row recognition model, CS-YOLOv5, based on bottom initial clustering and top outlier elimination for rice weeding; its mAP@0.5 for seedling recognition is 88.8%, whereas the mAP@0.5 of HGV-YOLO is 91.1%, an increase of 2.3% [32]. Wang et al. also proposed a rice seedling row recognition method using a CNN model that completes the recognition task through classification based on a row vector network; its recognition precision for rice seedlings is 89.5%, lower than that of HGV-YOLO, although its F1 value is slightly higher, by 1.1% [33]. This is because HGV-YOLO not only identifies individual seedlings but also completes their classification.
HGV-YOLO takes the seedling as the central area for feature extraction and achieves a higher precision, covers a wide range of key features, and effectively suppresses irrelevant interference.
The detection of missing transplanting positions for seedlings is an essential task for improving the model. Different models recognize seedlings with different effects, so the detected center-point coordinates differ, which leads to differences in the seedling row equations fitted using the least squares method and in turn affects the detection of missing transplanting positions. HGV-YOLO achieves a faster detection speed and higher precision. Compared with the other detection algorithms (YOLOv3-tiny, YOLOv5, YOLOv5-P6, YOLOv6, YOLOv8n), HGV-YOLO's R-squared is 0.9129, significantly higher than that of the other five models, and its SSE and MSE are the lowest. The detection time of YOLOv8n was 25 ms, and that of HGV-YOLO was 27 ms, 2 ms slower; however, the precision, R-squared, SSE, and MSE of HGV-YOLO were all better than those of YOLOv8n. Although the enhanced model still has some drawbacks, such as a longer training time, its overall detection precision is high. This model provides strong support for the classification and identification of rice seedlings and the accurate detection of missing transplanting positions.
In the actual field environment, the recognition precision of the improved HGV-YOLO model for qualified, floating, and sparse seedlings was 96.2%, 91.1%, and 93.7%, respectively, and the average image prediction time was 15.7 ms. This indicates that the improved model not only performs best on the dataset but also achieves a recognition precision of more than 90% in the field. A good balance has been achieved between precision and recall, highlighting the strong detection and generalization capabilities of the model.

6. Conclusions

To realize the detection of floating, sparse, and omitted rice seedlings in complex agricultural environments, this study improved the YOLOv8n target detection model. The GSConv module is used in the head layer to improve the detection speed, VOV-GSCSP is used in the neck layer, and the HorBlock module is added to the backbone layer. The enhanced model extracts the various characteristics of rice seedlings through high-order interactions in the backbone layer, yielding the optimized target detection model HGV-YOLO. The following conclusions are drawn:
(1) HGV-YOLO, a floating, sparse, and omission rice seedling detection model based on an improved YOLOv8n, was proposed. The precision of the improved model was 93.7%, the recall was 83.1%, the F1 value was 88.1%, the mAP@0.5 was 91.1%, and the mAP@0.5:0.95 was 72.5%. Compared with the base model YOLOv8n, precision improved by 2.4%, recall by 1.6%, mAP@0.5 by 2.3%, and mAP@0.5:0.95 by 2.4%, while the number of parameters decreased by 3.14%.
(2) Compared to YOLOv5, YOLOv5-P6, YOLOv3-tiny, YOLOv6, and YOLOv8n, HGV-YOLO increases the recognition precision for qualified seedlings by 2.4%, 1.5%, 1.1%, 1.3%, and 1.4%, respectively. For floating seedlings, the accuracy is improved by 2.3%, 4.7%, 2.7%, 0.9%, and 3.7%, respectively, and for sparse seedlings by 1.3%, 2.0%, 6.3%, 3.2%, and 1.9%, respectively.
(3) Compared to the SOTA model, HGV-YOLO achieves a recognition precision of 96.2%, 91.1% and 93.7%, respectively, for qualified seedlings, floating seedlings, and sparse seedlings. Compared to YOLOv8-P2, YOLOv9c, and YOLOv10n, HGV–YOLO achieves increases in the recognition precision of floating seedlings of 1.5%, 4.9% and 1.6%, respectively. Therefore, the HGV–YOLO model has a better detection ability, especially for floating seedlings.
(4) The HGV-YOLO model field test showed that the detection precision of the omission location was 85.7%. The positioning error of rice seedlings meets the requirements of agronomy. The improved model can meet the detection requirements of floating, sparse and missing rice seedlings.
After the field test, HGV-YOLO met the requirements of seedling classification. However, the rice field environment is complex, and many factors such as the weather, wind speed, and water level affect the morphological characteristics of rice seedlings and thus the recognition performance of the model. Future work will focus on enriching the rice seedling dataset, using temporal information to eliminate the morphological ambiguity of seedlings, and improving the robustness of the detection model. We will explore Knowledge Distillation and Quantization-Aware Training to optimize the model and significantly improve its efficiency and performance with little precision loss. Furthermore, we will consider deploying this model to edge devices to support decision-making and path planning in unstructured environments.

Author Contributions

Conceptualization, C.L. and Y.C.; methodology, C.L.; software, Y.C.; validation, C.L. and Y.C.; investigation, J.H.; resources, Z.Z.; data curation, Y.C.; writing—original draft preparation, C.L. and Y.C.; writing—review and editing, C.L.; visualization, Y.C.; supervision, J.H.; project administration, Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (NSFC) (32201655), the Heilongjiang Provincial Key R&D Program (2023ZX01A06), and the Heilongjiang Provincial "Double First-Class" Discipline Collaborative Innovation Achievement Project (LJGXCG2023-045).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to possible further research.

Conflicts of Interest

Author Yuheng Chen was employed by the company Zhejiang Mu Chan Li Ecological Technology Development Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Wang, Z.L.; Tan, X.M.; Ma, Y.M. Combining canopy spectral reflectance and RGB images to estimate leaf chlorophyll content and grain yield in rice. Comput. Electron. Agric. 2024, 221, 108975. [Google Scholar] [CrossRef]
  2. Zhang, L.; Zhang, F.; Zhang, K.P.; Liao, P.; Xu, Q. Effect of agricultural management practices on rice yield and greenhouse gas emissions in the rice–wheat rotation system in China. Comput. Electron. Agric. 2024, 916, 170307. [Google Scholar] [CrossRef] [PubMed]
  3. Li, J.Y.; Shang, Z.J.; Li, R.F. Adaptive Sliding Mode Path Tracking Control of Unmanned Rice Transplanter. Agriculture 2022, 12, 1225. [Google Scholar] [CrossRef]
  4. Amalia, A.F.; Rahayu, H.S.P.; Risna. Performance of Jarwo Rice Transplanter, Tegel Rice Transplanter, and Atabela Systems in Central Sulawesi. IOP Conf. Ser. Earth Environ. Sci. 2022, 977, 012074. [Google Scholar]
  5. Wang, X.C.; Li, Z.H.; Tan, S.Y.; Li, H.W.; Long, Q.; Wang, Y.W.; Chen, J.T. Research on density grading of hybrid rice machine-transplanted blanket-seedlings based on multi-source unmanned aerial vehicle data and mechanized transplanting tes. Comput. Electron. Agric. 2024, 222, 109070. [Google Scholar] [CrossRef]
  6. Daisuke, O.; Toshihiro, S.; Hiroshi, T. Surveillance of panicle positions by unmanned aerial vehicle to reveal morphological features of rice. PLoS ONE 2019, 14, e0224386. [Google Scholar]
  7. Li, H.; Ma, J.; Zhang, J.L. ELNet: An Efficient and Lightweight Network for Small Object Detection in UAV Imagery. Remote Sens. 2015, 17, 2096. [Google Scholar] [CrossRef]
  8. Nie, Y.; Na, X. UAV Lite-YOLOv10: A Lightweight Small Target Detection Algorithm for Unmanned Aerial Vehicles. In Proceedings of the 2025 IEEE 6th International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT), Shenzhen, China, 11–13 April 2025; pp. 1–5. [Google Scholar]
  9. Wu, H.Q.; Guan, M.X.; Chen, J.N. OE-YOLO: An Efficient Net-Based YOLO Network for Rice Panicle Detection. Plants 2025, 14, 1370. [Google Scholar] [CrossRef]
  10. Chen, D.Q.; Xu, K.; Sun, W.B. REU-YOLO: A Context-Aware UAV-Based Rice Ear Detection Model for Complex Field Scenes. Agronomy 2025, 15, 2225. [Google Scholar]
  11. Choi, H.K.; Han, K.S.; Han, H.S. Guidance Line Extraction for Autonomous Weeding robot based-on Rice Morphology Characteristic in Wet Paddy. J. Korea Robot. 2014, 9, 147–153. [Google Scholar] [CrossRef]
  12. Zhu, W.; Ma, L.; Zhang, P. Morphological recognition of rice seedlings based on GoogLeNet and UAV image. South China Agric. Univ. 2022, 43, 99–106. (In Chinese) [Google Scholar]
  13. Liu, Y.J.; Lv, Z.; Hu, Y.Y. Improved Cotton Seed Breakage Detection Based on YOLOv5s. Agriculture 2022, 12, 1630. [Google Scholar] [CrossRef]
  14. Wang, S.S.; Yu, S.S.; Zhang, W.Y.; Wang, X.S. Detection of Rice seedling Rows Hough Transform of feature Point Neighborhood. Trans. Chin. Soc. Agric. Mach. 2020, 51, 18–25. (In Chinese) [Google Scholar]
  15. Wang, Y.; Fu, Q.; Ma, Z.; Tian, X.; Ji, Z.; Yuan, W.S. YOLOv5-AC: A Method of Uncrewed Rice Transplanter Working Quality Detection. Agronomy 2023, 13, 2279. [Google Scholar] [CrossRef]
Figure 1. Dataset collection location.
Figure 2. Data augmentation.
Figure 3. Various varieties of rice seedlings.
Figure 4. Schematic flowchart.
Figure 5. Improved YOLOv8 network structure.
Figure 6. HorBlock module structure diagram based on gnConv.
Figure 7. Structure diagram of VOV-GSCSP.
Figure 8. Diagram of GSConv structure.
Figure 9. WIoU loss function.
Figure 10. Least squares method.
Figure 11. Schematic diagram of rice transplanter omission positions (red area indicates omission positions).
Figure 12. Identification flow chart of rice transplanter omission positions.
Figure 13. HorBlock module test location diagram.
Figure 14. P-R curves of the HorBlock module at different positions in HGV-YOLO.
Figure 15. Curves of different evaluation indicators before and after model improvement.
Figure 16. P-R curves of the improved model on the rice seedling dataset.
Figure 17. Rice seedling lines fitted by different models.
Figure 18. The line-fitting results of rice seedlings under different environments.
Figure 19. Precision detection of rice transplanter omission positions by improved model.
Figure 20. Comparison of details of three different categories after model improvement.
Figure 21. Comparison of different models.
Figure 22. Example of HGV-YOLO in the actual detection scenario.
Figure 23. Omission detection.
Table 1. UAV parameters.

| Parameter | Unit | Value |
|---|---|---|
| Flight height | m | 2 |
| Flight speed | m/s | 1 |
| Photo interval | s | 2 |
| Camera angle | FOV | −90 |
Table 2. Hardware for training dataset.

| Configuration | Parameter |
|---|---|
| CPU | 16 vCPU Intel(R) Xeon(R) Platinum 8352V @ 2.10 GHz |
| RAM | 120 GB |
| GPU | RTX 4090 (24 GB) |
| GPU computing platform | CUDA 11.3 |
| Operating system | Windows 10 (64-bit) |
| Deep learning framework | PyTorch 1.11.0 |
| Programming language | Python 3.8 (Ubuntu 20.04) |
Table 3. Performance comparison of different positions of HorBlock on the seedling dataset.

| Model | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | F1 (%) | Time (h) |
|---|---|---|---|---|---|---|
| YOLOv8n | 91.3 | 81.5 | 88.8 | 69.3 | 86.1 | 0.473 |
| Y8H1 | 93.1 | 81.0 | 89.2 | 70.1 | 86.6 | 0.698 |
| Y8H2 | 93.7 | 83.1 | 91.1 | 72.5 | 88.1 | 0.774 |
| Y8H3 | 91.6 | 81.2 | 88.2 | 69.1 | 86.0 | 0.409 |
| Y8H4 | 90.6 | 82.8 | 88.4 | 69.7 | 86.5 | 0.518 |
| Y8H5 | 93.7 | 82.4 | 87.8 | 69.4 | 87.2 | 0.736 |
Table 4. Ablation tests of the improved model.

| Model | HorBlock | VOV-GSCSP | WIoU | P (%) | R (%) | F1 (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Params |
|---|---|---|---|---|---|---|---|---|---|
| A | | | | 91.3 | 81.5 | 86.1 | 88.8 | 70.1 | 3,011,433 |
| B | | | | 91.6 | 81.1 | 86.0 | 88.6 | 70.3 | 3,014,521 |
| C | | | | 92.3 | 79.9 | 86.7 | 88.1 | 69.5 | 3,002,288 |
| D | | | | 90.9 | 82.1 | 86.3 | 88.8 | 70.2 | 3,006,233 |
| E | | | | 92.3 | 81.8 | 86.7 | 88.7 | 70.1 | 3,066,809 |
| F | | | | 90.6 | 82.8 | 86.5 | 88.8 | 70.4 | 2,850,745 |
| G | | | | 91.8 | 81.8 | 86.5 | 88.4 | 69.4 | 2,911,321 |
| H | ✓ | ✓ | ✓ | 93.7 | 83.1 | 88.1 | 91.1 | 72.5 | 2,916,841 |
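As a sanity check, the parameter reduction claimed in the abstract (3.1% fewer parameters than YOLOv8n) follows directly from the Params column of Table 4, where row A matches the YOLOv8n parameter count and row H the full HGV-YOLO. A minimal arithmetic sketch:

```python
# Parameter counts taken from Table 4 (A = YOLOv8n baseline, H = full HGV-YOLO).
baseline_params = 3_011_433
hgv_params = 2_916_841

# Relative reduction in percent.
reduction = (baseline_params - hgv_params) / baseline_params * 100
print(f"Params reduced by {reduction:.1f}%")  # prints "Params reduced by 3.1%"
```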
Table 5. Comparison of the results of different fitting methods in the same rice row.

| Method | Equation | R-squared | SSE | MSE | Precision (%) | Time (ms) |
|---|---|---|---|---|---|---|
| YOLOv3-tiny+LSM | y = 0.0479x + 0.3754 | 0.8125 | 2.8863 × 10⁻⁴ | 5.7726 × 10⁻⁵ | 88.4 | 28 |
| YOLOv5+LSM | y = 0.0394x + 0.3626 | 0.8716 | 1.2461 × 10⁻⁴ | 2.4922 × 10⁻⁵ | 91.2 | 22 |
| YOLOv5-P6+LSM | y = 0.0492x + 0.3566 | 0.7948 | 3.3959 × 10⁻⁴ | 6.7918 × 10⁻⁵ | 86.6 | 33 |
| YOLOv6+LSM | y = 0.0481x + 0.3563 | 0.7904 | 3.3421 × 10⁻⁴ | 6.6842 × 10⁻⁵ | 86.3 | 37 |
| YOLOv8n+LSM | y = 0.0398x + 0.3627 | 0.8328 | 3.1372 × 10⁻⁴ | 6.6344 × 10⁻⁵ | 93.6 | 25 |
| HGV-YOLO+LSM | y = 0.0476x + 0.3589 | 0.9129 | 9.8063 × 10⁻⁵ | 1.96126 × 10⁻⁵ | 94.3 | 27 |
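The "+LSM" entries in Table 5 fit a straight line through the detected seedling center points by ordinary least squares and report R-squared, SSE, and MSE on the residuals. A minimal sketch of that fitting step, using made-up center-point coordinates (the data and function names here are illustrative, not taken from the paper):

```python
def fit_line_lsm(xs, ys):
    """Closed-form ordinary least squares fit of y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def fit_metrics(xs, ys, a, b):
    """R-squared, SSE and MSE of the fitted line, as reported in Table 5."""
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    sse = sum(r * r for r in residuals)
    my = sum(ys) / len(ys)
    sst = sum((y - my) ** 2 for y in ys)
    return 1.0 - sse / sst, sse, sse / len(xs)

# Illustrative seedling center points lying near a straight row.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.360, 0.409, 0.455, 0.504, 0.550]
a, b = fit_line_lsm(xs, ys)
r2, sse, mse = fit_metrics(xs, ys, a, b)
print(f"y = {a:.4f}x + {b:.4f}, R² = {r2:.4f}")
```

Each fitted line's slope and intercept are then compared across detectors; lower SSE/MSE and higher R-squared indicate that the predicted center points lie closer to a single straight seedling row.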
Table 6. Comparison of the performance of different models on the dataset.

| Model | P (%) | R (%) | Precision Qualified (%) | Precision Floating (%) | Precision Lack (%) | F1-Score (%) | mAP@0.5 (%) | Detect Speed (ms) | GFLOPs |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv5 | 91.6 | 81.6 | 93.8 | 88.8 | 92.4 | 86.3 | 88.4 | 18.7 | 7.2 |
| YOLOv5-P6 | 91.0 | 82.2 | 94.7 | 86.4 | 91.7 | 86.4 | 88.4 | 16.1 | 8.0 |
| YOLOv3-tiny | 90.3 | 82.6 | 95.1 | 88.4 | 87.4 | 86.2 | 87.1 | 14.4 | 19.0 |
| YOLOv6 | 91.8 | 82.2 | 94.9 | 90.2 | 90.5 | 86.7 | 88.6 | 17.4 | 11.9 |
| YOLOv8n | 91.3 | 81.5 | 94.8 | 87.4 | 91.8 | 86.1 | 88.8 | 17.9 | 8.2 |
| HGV-YOLO | 93.7 | 83.1 | 96.2 | 91.1 | 93.7 | 88.1 | 91.1 | 15.7 | 8.1 |
Table 7. Evaluation indexes of different models in rice classification.

| Model | mAP@0.5 (%) | AP@0.5 Qualified (%) | AP@0.5 Floating (%) | AP@0.5 Lack (%) | mAP@0.5:0.95 (%) | Params |
|---|---|---|---|---|---|---|
| YOLOv5 | 88.4 | 98.5 | 74.6 | 92.2 | 69.2 | 2,503,529 |
| YOLOv5-P6 | 88.4 | 98.5 | 74.3 | 92.3 | 70.0 | 4,334,896 |
| YOLOv3-tiny | 87.1 | 98.4 | 72.0 | 87.1 | 67.5 | 12,133,670 |
| YOLOv6 | 88.6 | 98.4 | 76.7 | 90.7 | 70.2 | 4,238,441 |
| YOLOv8n | 88.8 | 98.6 | 75.5 | 92.3 | 70.1 | 3,011,433 |
| HGV-YOLO | 91.1 | 98.5 | 81.5 | 93.3 | 72.5 | 2,916,841 |
Table 8. Performance comparison with other SOTA detection models.

| Model | Category | P (%) | R (%) | AP50 (%) | mAP@0.5 (%) | Params | GFLOPs | Detect Speed (ms) |
|---|---|---|---|---|---|---|---|---|
| YOLOv8-P2 | qualified | 94.6 | 96.8 | 98.4 | 88.3 | 2,926,956 | 12.4 | 19.3 |
| | floating | 89.6 | 69.7 | 74.9 | | | | |
| | lack | 91.9 | 78.7 | 91.6 | | | | |
| YOLOv8-Ghost-P2 | qualified | 93.9 | 97.1 | 98.4 | 88.4 | 1,606,756 | 8.8 | 20.3 |
| | floating | 89.0 | 68.4 | 75.2 | | | | |
| | lack | 93.3 | 78.9 | 91.6 | | | | |
| YOLOv9c | qualified | 95.9 | 95.7 | 98.4 | 88.8 | 25,531,545 | 103.7 | 16.8 |
| | floating | 86.2 | 69.8 | 75.6 | | | | |
| | lack | 93.0 | 81.4 | 92.3 | | | | |
| YOLOv10n | qualified | 93.4 | 95.4 | 98.2 | 88.1 | 2,695,586 | 8.2 | 10.0 |
| | floating | 89.5 | 69.1 | 74.4 | | | | |
| | lack | 92.3 | 78.8 | 91.7 | | | | |
| YOLOv11n | qualified | 95.2 | 96.7 | 98.4 | 88.4 | 2,582,737 | 6.5 | 15.2 |
| | floating | 88.0 | 70.4 | 75.6 | | | | |
| | lack | 89.6 | 81.0 | 91.0 | | | | |
| YOLOv12n | qualified | 96.1 | 94.5 | 98.6 | 88.7 | 2,508,929 | 5.8 | 10.4 |
| | floating | 90.0 | 68.4 | 75.4 | | | | |
| | lack | 94.4 | 76.6 | 92.0 | | | | |
| HGV-YOLO | qualified | 96.2 | 95.3 | 98.5 | 91.1 | 2,916,841 | 8.1 | 15.7 |
| | floating | 91.1 | 74.2 | 81.5 | | | | |
| | lack | 93.7 | 79.7 | 93.3 | | | | |
Table 9. Detection results of HGV-YOLO in a field trial.

| Category | Qualified Seedling (%) | Floating Seedling (%) | Sparse Seedling (%) | Omission Seedling (%) |
|---|---|---|---|---|
| Precision | 99.7 | 81.1 | 85.4 | 85.7 |
Table 10. Statistical table of rice seedling identification test results.

| Category | Qualified Rate (%) | Floating Seedling Rate (%) | Sparse Seedling Rate (%) | Omission Seedling Rate (%) |
|---|---|---|---|---|
| Predicted value | 95.99 | 0.86 | 1.06 | 0.96 |
| Actual value | 96.23 | 1.06 | 1.24 | 1.12 |
| Relative error | 0.24 | 0.20 | 0.18 | 0.16 |
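The relative errors in Table 10 are simply the absolute differences between the predicted and actual rates; a quick check (values copied from the table):

```python
# Predicted vs. actual rates (%) from Table 10.
predicted = {"qualified": 95.99, "floating": 0.86, "sparse": 1.06, "omission": 0.96}
actual    = {"qualified": 96.23, "floating": 1.06, "sparse": 1.24, "omission": 1.12}

# Relative error row = |predicted - actual|, rounded to two decimals.
errors = {k: round(abs(predicted[k] - actual[k]), 2) for k in predicted}
print(errors)  # {'qualified': 0.24, 'floating': 0.2, 'sparse': 0.18, 'omission': 0.16}
```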

Share and Cite

MDPI and ACS Style

Liang, C.; Chen, Y.; Hu, J.; Zhou, Z. HGV-YOLO: A Detection Method for Floating Seedlings and Missed Transplanting Based on the Morphological Characteristics of Rice Seedlings. Agronomy 2026, 16, 678. https://doi.org/10.3390/agronomy16070678
