Article

Crop-Free-Ridge Navigation Line Recognition Based on the Lightweight Structure Improvement of YOLOv8

Jiangsu Provincial Key Laboratory of Hi-Tech Research for Intelligent Agricultural Equipment, Jiangsu University, Zhenjiang 212013, China
* Author to whom correspondence should be addressed.
Agriculture 2025, 15(9), 942; https://doi.org/10.3390/agriculture15090942
Submission received: 16 March 2025 / Revised: 19 April 2025 / Accepted: 24 April 2025 / Published: 26 April 2025
(This article belongs to the Section Digital Agriculture)

Abstract

This study addresses the shortage of agricultural labor and of cultivated land. To improve the intelligence and operational efficiency of agricultural machinery and to solve the problems of difficult navigation line recognition and insufficient real-time performance of transplanters in crop-free ridge environments, we propose a crop-free-ridge navigation line recognition method based on an improved YOLOv8 segmentation algorithm. First, the method reduces the parameters and computational complexity of the model by replacing the YOLOv8 backbone network with MobileNetV4 and the feature extraction module C2f with the Shuffleblock from ShuffleNetV2, thereby improving the real-time segmentation of crop-free ridges. Second, the least-squares method is used to fit the obtained point set and accurately extract navigation lines. Finally, the method is applied to field experimental ridges for testing and analysis. The results show that the improved neural network model achieves an average precision of 90.4%, with 1.8 M parameters (Params), 8.8 G floating-point operations (FLOPs), and a frame rate of 49.5 FPS, maintaining high accuracy while significantly outperforming Mask-RCNN, YOLACT++, YOLOv8, and YOLO11 in computational speed. The detection frame rate increased markedly, improving the real-time performance of detection. The least-squares method is used to fit the ridge contour feature points in the lower 55% of the image, and the fitted navigation line shows no large deviation from the ridge centerline in the image; the result is better than that of the RANSAC fitting method. These results indicate that the proposed method significantly reduces the model parameter size and improves recognition speed, providing a more efficient solution for the autonomous navigation of intelligent agricultural machinery.

1. Introduction

Transplanting technology has been widely used in agricultural production due to its advantages in improving work efficiency and promoting the yields of crops such as grain and vegetables. In recent years, various transplanting machines developed in China have been widely promoted and applied to solve the problem of excessive labor intensity in transplanting. However, under the dual pressure of structural shortages in the agricultural labor force caused by the aging population and increasingly limited agricultural land, creating intelligent, unmanned agricultural machinery and further improving agricultural productivity has become an inevitable direction for the development of agricultural machinery, including transplanting machines. Autonomous navigation is one of the key technologies needed to realize intelligent and unmanned agricultural machinery. For transplanting machinery, as Figure 1 shows, the operating scenes mainly comprise crop-free ridges; compared with recognizable crop lines, the similar colors and fuzzy boundaries of such areas make navigation lines difficult to recognize. Therefore, to enable the autonomous navigation of transplanters, it is particularly important to identify the navigation lines of crop-free ridges [1,2,3,4,5].
At present, many studies have been conducted on the recognition and extraction of navigation lines. In 2011, Wang et al. [6] used the I-component histogram of ridge roads in the HSI color space to perform binary segmentation of images in tomato greenhouse working environments; they then used the least-squares method to extract navigation lines. In 2016, Jiang et al. [7] used excess-green channels to separate crops from the background; they then scanned for feature points and combined the Hough transform with the K-means clustering algorithm to extract crop lines. In 2023, Zhou et al. [8] combined the Otsu method with morphology to obtain binary images of crop rows and used adaptive clustering methods to obtain navigation lines. In 2025, Li et al. [9] optimized the detection area through opening operations and the filtering of eight-connected components; finally, they fitted and extracted navigation lines by screening pixels with a clustering-based S-shaped threshold. However, these traditional machine vision methods mostly rely on color channel thresholds or shallow image information such as gray-level differences; they make limited use of the information contained in the image, have significant limitations, and are of limited effectiveness in complex agricultural environments. In 2019, Li et al. [10] studied field roads in hilly areas and extracted non-statistical features of road centroid points. They used epipolar constraints and homography matrices for precise matching and 3D reconstruction, obtaining autonomous navigation routes for field roads. In 2020, Kneip et al. [11] comprehensively scanned harvested wheat fields using stereo cameras. By identifying the different three-dimensional features of harvested and unharvested areas, they successfully determined the harvest boundary and obtained the direction of the driving route. In 2021, Liu et al. [12] used LiDAR to obtain point cloud data in a greenhouse orchard environment and processed it with least squares to obtain tree rows. However, 3D reconstruction combined with LiDAR requires a large amount of real-time computation and demanding hardware, so the real-time performance of such systems deployed on agricultural machinery still needs to be strengthened.
In recent years, with the rise of neural networks, deep learning algorithms have been increasingly applied in agricultural image recognition [13,14,15,16]. Compared with traditional machine vision algorithms and 3D reconstruction, deep learning can better mine the deep information of the image and reduces hardware requirements. In 2022, Zhou et al. [17] used YOLOv3 to identify trees in orchard scenes and combined it with least-squares fitting to obtain navigation lines between fruit trees, with a final extraction accuracy of 90.0%. Yang et al. [18] improved U-Net for potatoes at different growth stages, identified potato crop regions, and extracted navigation routes; the accuracy was improved by 3% compared to the original U-Net, and the average deviation of the fitted navigation angle was 2.16°. Li et al. [19] used Faster U-net to segment and train various types of front-view field ridge maps; by combining segmented regions with ROI segmentation, multiple feature points were extracted and connected to form ridge lines, with average angle differences of 0.624°, 0.556°, 0.526°, and 0.999° in the corn, tomato, cucumber, and wheat environments, respectively. In 2023, Ruan et al. [20] used the YOLO-R network to identify rice and used the DBSCAN clustering algorithm to autonomously classify feature points into rows and fit them, achieving an accuracy of 95.87% for 14-day-old crops. Yang et al. [21] used the YOLOv5 network to achieve fast ROI localization, simplifying image processing, and used the FAST algorithm to extract lane lines between rows; owing to the rapid determination of the ROI, the model's line-extraction time was kept within 30 ms. Chen et al. [22] improved the backbone network and attention mechanism by upgrading DeeplabV3+ to Deeplab-MV3, simplifying the network model while meeting real-time accuracy requirements, and achieved the recognition and extraction of navigation lines in complex field scenes. Yu et al. [23] studied the intelligent recognition of navigation lines for edge-deployed agricultural machinery and concluded that, on the Jetson Nano platform, ENet is the best of several compared network models, balancing speed and accuracy with a speed of 16.8 FPS and an accuracy of 93%. Li et al. [24] proposed E2CropDet to simplify the calculation process; this method directly uses crop rows as model training labels to achieve end-to-end navigation line recognition, significantly reducing the interference of environmental factors in image recognition. In 2024, Gong et al. [25] proposed improvements to YOLOX-Tiny, combining adaptive lighting with multi-scale datasets to enhance the detection of dense targets, improving detection accuracy and speed, and used the least-squares method to fit the navigation path. In 2025, Kong et al. [26] improved the ENet network model and optimized feature point extraction between rice rows; they used the LSM algorithm for navigation line fitting, achieving a mean pixel accuracy (mPA) of 96.3%. However, most of the above studies are based on crop environments, and the algorithms mainly recognize crops or boundary obstacles as feature objects; there is little research on non-crop reference environments such as transplanting.
This study proposes a method for identifying and extracting crop-free-ridge navigation routes based on deep learning neural networks to address the problem of precise navigation in crop-free environments for transplanting machines. To better suit the edge deployment of agricultural machinery and to reduce model parameters and computational complexity, this study uses a deep learning neural network to accurately identify and segment crop-free ridges in real-time images. A new algorithm based on YOLOv8 is proposed in which the backbone network and feature modules are replaced, reducing the number of model parameters and the computational complexity and improving recognition speed. Finally, the least-squares method is used to fit the ridge edges of the segmented blocks and extract the centerline of the field ridges. The ridge centerline extracted by this method can serve as a navigation line for ridge-walking machinery, providing technical support for its intelligent navigation.

2. Materials and Methods

The study was conducted on field land (119.308° E, 31.972° N) at the Jurong Agricultural Science and Technology Innovation Center in Zhenjiang. The main soil type was yellow-brown soil, and ridges raised on the experimental land with a hand-held ridging machine were used as the image source. Images were captured using an Intel RealSense D435 camera (Intel Corporation, Santa Clara, CA, USA) at a resolution of 640 × 480 pixels and saved in RGB space. Because the environments encountered by field equipment are often complex and varied, the training data for a deep learning neural network model must be rich enough to make the model adaptable to different environmental images. Therefore, this article focuses on light and ridge shape, the prominent factors affecting image acquisition in the experimental field environment; 486 ridge surface images were selected, as shown in Figure 2. After image acquisition, the labeling software Labelme (version 5.4.1) was used to select and label the main ridge regions, as shown in Figure 3. To further enhance the generality of the ridge recognition model, the original training dataset was expanded using data augmentation techniques, including adjustments to image angle, hue, saturation, and other properties, as shown in Figure 4, to simulate the states likely to be encountered during field image acquisition (a sketch of such a pipeline is given below). The dataset was thus expanded to 1200 images and divided into a training set and a validation set in a 7:3 ratio.
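To make the augmentation step more concrete, the following is a minimal Python sketch of the kind of pipeline described above (angle, hue, saturation, blur, and exposure perturbations). It uses OpenCV and NumPy; the parameter ranges are illustrative assumptions rather than the exact settings used in this study.

```python
import cv2
import numpy as np
import random

def augment(img):
    """Apply one random perturbation similar to those shown in Figure 4."""
    choice = random.choice(["rotate", "hsv", "blur", "exposure"])
    if choice == "rotate":
        # note: the same rotation matrix must also be applied to the Labelme polygons
        h, w = img.shape[:2]
        M = cv2.getRotationMatrix2D((w / 2, h / 2), random.uniform(-15, 15), 1.0)
        return cv2.warpAffine(img, M, (w, h), borderMode=cv2.BORDER_REFLECT)
    if choice == "hsv":
        # jitter hue and saturation in HSV space
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.int16)
        hsv[..., 0] = (hsv[..., 0] + random.randint(-10, 10)) % 180
        hsv[..., 1] = np.clip(hsv[..., 1] * random.uniform(0.7, 1.3), 0, 255)
        return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
    if choice == "blur":
        return cv2.GaussianBlur(img, (5, 5), 0)
    # exposure: scale overall brightness up or down
    return np.clip(img.astype(np.float32) * random.uniform(0.6, 1.5), 0, 255).astype(np.uint8)
```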

3. Improved Network Model and Navigation Line Extraction Method

3.1. Improved YOLOv8 Network Model

To achieve good image recognition results, this study uses YOLOv8 [27] as the basic network framework and improves upon it. The YOLOv8 network itself can realize the visual recognition functions required for navigation. However, the complex structure of the original neural network leads to a large number of model parameters, reducing computational efficiency and resulting in slower detection, so it is not well suited to lightweight deployment on transplanters. To accelerate the recognition of field ridges, improve the efficiency of navigation line extraction, and provide a more efficient computing model for edge-deployed devices, we used MobileNetV4 [28] and the Shuffleblock module extracted from ShuffleNetV2 [29] to improve the YOLOv8 network model, significantly reducing model parameters and computational complexity and increasing the detection frame rate to meet the speed requirements of on-board detection. The specific structure is shown in Figure 5. Based on the original YOLOv8 architecture, the model replaces the backbone with MobileNetV4, adjusts its output channels to 1024, and feeds the resulting feature map to the SPPF layer; at the neck, the four C2f modules originally located behind the Concat layers are replaced with Shuffleblocks, and a corresponding Conv layer is added between each Concat layer and Shuffleblock to keep the input channels consistent.

3.1.1. MobileNetV4 Backbone Network

MobileNetV4 [28] is a lightweight backbone family designed for mobile and edge devices. It is built mainly from depth-wise separable convolutions and introduces the Universal Inverted Bottleneck (UIB) block, which unifies several efficient block designs in a single structure; the module structure is shown in Figure 6. Compared with the original YOLOv8 backbone, whose C2f modules perform a large number of standard convolutions and cross-channel operations, MobileNetV4 extracts features with far fewer parameters and floating-point operations (see the parameter-count sketch below). In this study, MobileNetV4 replaces the original backbone, its output channel number is adjusted to 1024, and the resulting feature map is passed to the SPPF layer, reducing the model size while retaining sufficient feature extraction capability for ridge segmentation.
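Depth-wise separable convolution is the main source of the parameter savings in both MobileNetV4 and the Shuffleblock described below. The following short PyTorch sketch, purely illustrative and not taken from the authors' code, compares the parameter count of a standard 3 × 3 convolution with that of its depth-wise separable equivalent at a channel width of 256:

```python
import torch.nn as nn

c_in, c_out, k = 256, 256, 3
standard = nn.Conv2d(c_in, c_out, k, padding=1)          # full cross-channel 3x3 convolution
depthwise_separable = nn.Sequential(
    nn.Conv2d(c_in, c_in, k, padding=1, groups=c_in),     # per-channel spatial filtering
    nn.Conv2d(c_in, c_out, 1),                            # 1x1 point-wise channel mixing
)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard), count(depthwise_separable))        # ~590k vs ~68k parameters
```

The roughly order-of-magnitude reduction per layer explains why stacking such blocks shrinks Params and FLOPs so effectively.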

3.1.2. Shuffleblock

The module used to extract features in YOLOv8 is C2f, which uses two 1 × 1 convolutional layers to adjust channels and extracts image features through a Bottleneck loop. This structure gives good feature extraction accuracy, but the large number of convolutions and cross-channel operations in the Bottleneck stage significantly increases the parameters and computational complexity, reducing the computational speed and ultimately affecting the real-time requirements of the navigation method. This article therefore replaces the C2f modules in the neck of YOLOv8 with the Shuffleblock module. This module is taken from the ShuffleNetV2 lightweight network, and its structure is shown in Figure 7. ShuffleNetV2 adopts a grouped feature extraction strategy: the feature maps are split along the channel dimension and features are extracted only for some of the channels. In addition to using 1 × 1 convolutional layers to adjust the number of channels, it employs depth-wise separable convolutions, eliminating the need for cross-channel operations and significantly reducing computational complexity. Concatenation is used instead of addition, reducing element-wise operations and further lowering computation time, and the feature maps are finally shuffled across channels to ensure an even distribution of feature information. Adding Shuffleblock effectively reduces the number of parameters and the computation of the network, reduces the model volume, increases speed, and provides a more suitable algorithm for edge devices.
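As a reference for the structure just described, the following is a minimal PyTorch sketch of a stride-1 ShuffleNetV2 basic unit (channel split, a light branch of 1 × 1 and depth-wise 3 × 3 convolutions, concatenation, and channel shuffle). It is a generic sketch of the ShuffleNetV2 unit; the exact channel configuration of the Shuffleblock used in this study may differ.

```python
import torch
import torch.nn as nn

def channel_shuffle(x, groups=2):
    # evenly remix channels across groups so information flows between the two branches
    b, c, h, w = x.size()
    x = x.view(b, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.view(b, c, h, w)

class ShuffleBlock(nn.Module):
    """Stride-1 ShuffleNetV2 basic unit: split, light branch, concat, shuffle."""
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.branch = nn.Sequential(
            nn.Conv2d(half, half, 1, bias=False), nn.BatchNorm2d(half), nn.ReLU(inplace=True),
            # depth-wise 3x3: no cross-channel multiplications
            nn.Conv2d(half, half, 3, padding=1, groups=half, bias=False), nn.BatchNorm2d(half),
            nn.Conv2d(half, half, 1, bias=False), nn.BatchNorm2d(half), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)                         # channel split
        out = torch.cat((x1, self.branch(x2)), dim=1)      # concatenation instead of addition
        return channel_shuffle(out, 2)

# example: a 256-channel neck feature map passes through with its shape unchanged
y = ShuffleBlock(256)(torch.randn(1, 256, 40, 40))         # -> torch.Size([1, 256, 40, 40])
```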

3.2. Navigation Line Extraction Method

Because the actual ridge edges are irregular in detail and the ridge may bend due to deviations during ridging, the edge points of the ridge blocks extracted by the neural network model must be extracted and fitted to obtain navigation line features that are easy to calculate and process. In this study, the left and right edge feature points obtained from line-by-line scanning are averaged to obtain the midpoint set, and this point set is fitted using the least-squares method to obtain the desired navigation line. The specific steps are as follows.
To extract the predicted field ridge blocks and perform further operations, the image is first converted to grayscale and binarized, dividing it into recognized ridge blocks and background; the recognition block with the largest area in the image is then taken as the processing block. Since the image acquisition position is located at the center of the machine, the ridge currently being followed should occupy the largest area in the image, so taking the largest block filters out noise and incorrectly recognized regions. The processing block is then scanned line by line to obtain the sets of feature pixels on the left and right edge lines of the field ridge, and the average coordinates of the point pairs on the two sides are taken as the feature point set of the ridge centerline.
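A minimal sketch of these steps, assuming the mask is a single-channel uint8 image produced by the segmentation model (OpenCV and NumPy; the threshold value is illustrative):

```python
import cv2
import numpy as np

def ridge_midpoints(mask):
    """Extract centerline feature points from a predicted ridge mask (uint8, 0/255)."""
    _, binary = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
    # keep only the largest connected component (the ridge the machine is following)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    if n < 2:
        return np.empty((0, 2))
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
    block = (labels == largest)
    midpoints = []
    for y in range(block.shape[0]):            # scan line by line
        xs = np.flatnonzero(block[y])
        if xs.size:                            # left/right edge pixels of this row
            midpoints.append(((xs[0] + xs[-1]) / 2.0, y))
    return np.asarray(midpoints)
```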
In actual driving, only a short distance in front of the vehicle is relevant to navigation. To reduce the impact of distant path feature points on navigation line extraction, the obtained centerline feature points are limited to a certain range; here, the range of influence is taken as the field ridge up to three meters in front of the vehicle. In the test, the camera is mounted at a height of about 1.5 m, its optical axis is tilted 30° below the horizontal, and the vertical field of view of the RealSense D435 camera is 42.5°. The angle between the ray to the ground point three meters in front of the camera and the lowest ray of the camera's field of view is 24.6°, accounting for about 58% of the total vertical field of view. Allowing for practical deviations, this value is rounded down: roughly the lower 55% of the image is taken as the required field ridge range, and the centerline feature points within this portion of the image are selected for fitting. To better represent the trend of the ridge direction and avoid distortion of the fitted curve caused by excessive slopes at some points, only a linear equation is fitted when extracting the navigation line, directly yielding the desired navigation line; the required pixel offset and yaw angle are then extracted from this line (Figure 8).
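The following short Python sketch reproduces the geometry check behind the 55% figure using the numbers stated above (1.5 m camera height, 30° tilt, 42.5° vertical field of view, 3 m look-ahead) and shows a least-squares line fit restricted to the lower portion of the image. The angular-fraction approximation and the function name are illustrative assumptions rather than the authors' exact implementation.

```python
import math
import numpy as np

# geometry check for the 55% figure
h, pitch, vfov, d = 1.5, 30.0, 42.5, 3.0
bottom_ray = pitch + vfov / 2                    # lowest view ray: 51.25 deg below horizontal
target_ray = math.degrees(math.atan(h / d))      # ray to the 3 m point: ~26.6 deg below horizontal
fraction = (bottom_ray - target_ray) / vfov      # ~0.58 of the vertical FOV, rounded down to 55%
print(f"angular fraction below the 3 m point: {fraction:.2f}")

def fit_navigation_line(midpoints, img_height, keep_ratio=0.55):
    """Least-squares line x = a*y + b over the bottom keep_ratio of the image."""
    near = midpoints[midpoints[:, 1] >= (1 - keep_ratio) * img_height]
    a, b = np.polyfit(near[:, 1], near[:, 0], 1)   # fit column x as a function of row y
    return a, b   # slope gives the yaw cue; x at the bottom row gives the pixel offset
```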

4. Results and Analysis

4.1. Model Training Platform Environment

The model in this article was trained on the Windows 11 operating system, with an 11th Gen Intel(R) Core(TM) i7-11800H processor (Intel Corporation, Santa Clara, CA, USA) and an NVIDIA GeForce RTX 3060 GPU (NVIDIA Corporation, Santa Clara, CA, USA). The programming language is Python 3.8, and the deep learning framework is PyTorch 2.4.1. Before training, the images are uniformly resized to 640 × 640; the batch size is 16 and training runs for 300 epochs. The initial learning rate is 0.01, the final learning rate is 0.0001, and the optimizer is SGD.
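For reference, a training call with these settings might look as follows when using the Ultralytics API; the model and dataset configuration file names are hypothetical placeholders, not the authors' actual files.

```python
from ultralytics import YOLO

# hypothetical training call mirroring the settings above; "yolov8-ridge-seg.yaml"
# and "ridge.yaml" stand in for the authors' modified model and dataset configs
model = YOLO("yolov8-ridge-seg.yaml")
model.train(
    data="ridge.yaml",
    imgsz=640, batch=16, epochs=300,
    lr0=0.01, lrf=0.01,      # final learning rate = lr0 * lrf = 0.0001
    optimizer="SGD",
    device=0,
)
```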

4.2. Model Evaluation Indicators

The model is used to detect ridge blocks and generate masks. To evaluate its detection performance, the mean average precision (mAP) is used as the measurement metric. To quantify the degree of model lightweighting, the parameter count (Params) and the number of floating-point operations (FLOPs) are used as indicators. Finally, to evaluate practical detection performance, the frame rate in frames per second (FPS) is used as the measurement metric.
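FPS depends on the measurement protocol, which is not specified in detail here; the following is a rough sketch, under assumed conditions, of how an average frame rate can be timed on a GPU with the Ultralytics API.

```python
import time
import torch
from ultralytics import YOLO

def measure_fps(weights, images, warmup=10):
    """Average inference frame rate over a list of images; illustrative protocol only."""
    model = YOLO(weights)
    for img in images[:warmup]:          # warm-up runs are excluded from timing
        model.predict(img, verbose=False)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for img in images:
        model.predict(img, verbose=False)
    torch.cuda.synchronize()
    return len(images) / (time.perf_counter() - start)
```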

4.3. Backbone Comparison Experiment and Ablation Experiment

To further investigate the contributions of the MobileNetV4 backbone network and the Shuffleblock module, a backbone comparison experiment and an ablation experiment were conducted using the same training images and annotation data to validate the model. The results are shown in Table 1 and Table 2.
The results show that, after replacing the original YOLOv8 backbone network with MobileNetV4, the parameters and computational complexity are significantly reduced: Params and FLOPs decrease from 3.2 M and 12.1 G to 2.3 M and 9.7 G, respectively, while mAP decreases from 96.3% to 90.4%, still maintaining high accuracy. Comparing MobileNetV4 with the other backbone networks, ShuffleNetV2 has the lowest FLOPs, only 2.4 G, which is 19.8% of the value of the original backbone network; however, its mAP drops to 87.2%, 9.1 percentage points lower than the original model. EfficientNet has up to 7.5 M Params, 5.2 M more than MobileNetV4, so it is not suitable for deployment on edge devices. GhostNet has slightly fewer parameters and slightly lower computational complexity than MobileNetV4; however, considering that the subsequent addition of Shuffleblock effectively reduces the parameters and computation, MobileNetV4 better preserves model accuracy. In the ablation experiment, replacing only the feature extraction module with Shuffleblock slightly reduces the parameters and computational complexity with only a small change in accuracy, indicating that the Shuffleblock module contributes noticeably to parameter reduction and speed improvement. Moreover, using the Shuffleblock module only for neck feature extraction, rather than replacing the backbone network with ShuffleNetV2 or retaining the original C2f module, effectively reduces the parameters without markedly affecting accuracy.
The model presented in this article combines MobileNetV4 and Shuffleblock, reducing Params from 3.2 M to 1.8 M and FLOPs from 12.1 G to 8.8 G; both indicators decrease significantly further compared with using either module alone. The mAP remains stable at 90.4%, with no further decline compared with replacing the backbone network alone. In terms of detection speed, the network combining MobileNetV4 and Shuffleblock increases the average FPS from 46.7 for the baseline YOLOv8 to 49.5, a genuine improvement in detection speed. Although the average FPS reaches 73.5 when ShuffleNetV2 is used as the backbone network, the resulting drop in detection accuracy is too large. The network model proposed in this study therefore offers a more balanced and effective acceleration than the alternatives.

4.4. Comparison of Commonly Used Models

To evaluate the actual performance of the lightweight improved model, the improved model is compared with commonly used neural network models using the same training images and annotation data to show the effect in specific scenes. The two-stage model Mask-RCNN [30] and the single-stage models YOLACT++ [31], YOLOv8, and YOLO11 [32] were selected for comparison of parameters, computation, and accuracy. Based on the data in Table 3, compared with the single-stage models, the two-stage Mask-RCNN has lower accuracy and a much larger model size: its mAP is only 80.6%, and its detection frame rate is only 13.8 FPS, compared with 49.5 FPS for the model proposed in this study. Even compared with YOLACT++, its detection frame rate is insufficient, indicating that improving a single-stage model is the better choice under the premise of edge deployment. The proposed model also performs well against the other single-stage models. Compared with YOLACT++, the YOLO series models achieve higher mAP while greatly reducing the number of parameters and the computation, confirming that the YOLO series selected as the baseline in this study exhibits better performance and is more suitable for edge deployment. Comparing YOLOv8 with YOLO11, although YOLO11 has a lower parameter count, YOLOv8 achieves a higher actual detection frame rate; choosing YOLOv8 as the baseline model is therefore more advantageous for real-time detection. Compared with YOLOv8, the proposed model reduces mAP by only 5.9 percentage points, while Params decrease from 3.2 M to 1.8 M, FLOPs decrease from 12.1 G to 8.8 G, and FPS increases from 46.7 to 49.5, providing a higher detection rate for navigation work.
To intuitively show the detection effect, the normal ridge, strongly lit ridge, dimly lit ridge, damaged ridge, and curved ridge were selected as typical test scenes; the actual detection results are shown in Figure 9. The accuracy of Mask-RCNN is markedly insufficient, as indicated by a large number of misidentifications and repeated recognitions, and a large area is missed in the dim scene. Among the other network models, YOLACT++ shows edge defects when shadows are not obvious on cloudy days and tends to identify ridges and furrows together in the damaged-ridge scene. The YOLO series models perform well in most cases, but small areas outside the ridge surface are occasionally misidentified. YOLO11 has relatively few parameters and, compared with YOLOv8, misses edges more obviously in the darker and curved ridge environments. Finally, compared with the YOLO series, the proposed model often produces rougher edges because of the reduction in parameters and computational complexity, but it still maintains a good detection level.

4.5. Verification of the Navigation Line Prediction Effect

To test the fitting effect of the navigation lines, we compared the least-squares fitting algorithm with the RANSAC fitting algorithm [33]. Using the masks detected by the model in this article as samples, an edge point is taken on each side of the mask for every pixel row, and the average of the two is taken as the midpoint; this point set is used as the fitting material. After fitting navigation lines with the two methods, the initial deviation and the overall deviation between each navigation line and the fitted point set are measured to evaluate the impact of the fitting method on the lateral deviation and the reliability of the fitted line.
Figure 10 clearly shows that, in most cases, the navigation lines fitted by the algorithm proposed in this study (marked in blue) follow the point-set curve more closely than those fitted by the RANSAC algorithm. Overall, the least-squares fitting algorithm was applied to the predicted masks of the whole validation image dataset. For the five typical scenes described above, the average initial deviation is 3.60 pixels and the average overall deviation is 2.10 pixels, which are 2.47 pixels and 0.31 pixels lower than those of the RANSAC algorithm, respectively. In Scene 5, the maximum initial deviation is 8.34 pixels and the maximum overall deviation is 2.90 pixels, both significantly lower than those of the RANSAC algorithm. Scenes 1 and 5 contain significant distortion fluctuations in the lines; taking Scene 1, which has the larger fluctuations, as an example, the initial deviation of the least-squares algorithm is only 4.31 pixels, 7.6 pixels lower than that of RANSAC. Moreover, over the larger validation dataset, the ridge lines were fitted by both methods and the results were subjected to a t-test, as shown in Table 4. For the initial deviation, the p value is 0.01; it can therefore be concluded that the initial deviation of the least-squares fitting method in the crop-free ridge environment is significantly lower than that of RANSAC, indicating that the least-squares fitting method can provide a more accurate lateral deviation for the navigation vehicle.
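For illustration, the sketch below fits the midpoint set with ordinary least squares and with RANSAC (scikit-learn), computes the two deviation measures, and runs a paired t-test with SciPy. The precise definitions of "initial" and "overall" deviation used here (error at the nearest image row and mean absolute error over all points, respectively) are assumptions made for the sketch, since they are not formally defined above.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression, RANSACRegressor

def fit_and_deviation(midpoints, use_ransac=False):
    """Fit x = f(y) to the centerline points and return (initial, overall) deviations in pixels."""
    y = midpoints[:, 1].reshape(-1, 1)
    x = midpoints[:, 0]
    reg = RANSACRegressor(LinearRegression()) if use_ransac else LinearRegression()
    reg.fit(y, x)
    pred = reg.predict(y)
    nearest = np.argmax(y[:, 0])                      # bottom-most row = closest to the vehicle
    initial = abs(pred[nearest] - x[nearest])         # assumed "initial deviation"
    overall = np.mean(np.abs(pred - x))               # assumed "overall deviation"
    return initial, overall

def compare(dev_lsq, dev_ransac):
    """Paired t-test over per-image deviations from the two fitting methods."""
    t, p = stats.ttest_rel(dev_lsq, dev_ransac)
    return t, p
```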

5. Discussion

The development of neural network technology plays an important role in promoting agricultural development. Making targeted modifications to the structure of state-of-the-art neural network models according to the research object, and quantifying the applicability of the new model through indicators such as parameter count, computational complexity, and average precision, has become an important form of neural network application in agriculture [13,14,15,16]. This study achieved lightweight transformation and ridge recognition by modifying the backbone network of the YOLOv8 model and replacing the neck C2f modules; combined with the least-squares algorithm, a crop-free-ridge navigation line recognition method suited to the edge deployment of agricultural machinery was proposed. Regarding the choice of baseline model, this study selected YOLOv8. It is worth noting that, in the comparative experiment, YOLOv8 was more effective than YOLO11. This may be because, compared with YOLOv8, YOLO11 relies more on global attention and is prone to losing local details such as the edge of the ridge line; in addition, the end-to-end design of YOLO11 lacks NMS processing, which makes redundant detections more likely and leads to a decline in accuracy. This suggests that YOLOv8 may be more advantageous for detecting single targets in the field environment. The proposed method, combined with the Shuffleblock module, has a significant effect on parameter reduction and acceleration: the final model has 8.8 G FLOPs and 1.8 M parameters. Compared with the Faster U-net proposed by Li, with 2.68 M parameters [19], and the Deeplab-MV3 proposed by Chen, with 3.086 M parameters [22], the parameter size is reduced by 0.88 M and 1.29 M, respectively. Compared with Gong's research [25], the algorithm proposed in this study achieved higher detection frame rates on a lower-end computing platform. As shown in Figure 5, the main reason why the proposed model can significantly reduce the number of parameters may be that the originally complex C2f modules are replaced by the large number of depth-wise separable convolutions in the MobileNetV4 backbone network and Shuffleblock, trading a small loss of accuracy for a large reduction in parameters. However, the final detection frame rate shows that, although the improved model is somewhat faster than the baseline model, the improvement is not large. This may be because reducing parameters and computation at the neck of the model has a limited effect, and there may be hidden variables beyond parameters and computational complexity that affect image recognition speed. In addition, the segmentation accuracy of the proposed method is achieved under more demanding conditions: previous studies mainly addressed crop-row recognition, in which the field targets differ markedly from the background [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26], whereas the method proposed in this study addresses crop-free-ridge recognition, in which the differences between the target and the background are minimal.
Finally, as shown in Table 4, the navigation line extraction method proposed in this study, based on the least-squares method, also exhibits good extraction performance. Compared with Chen's research [22], the method proposed in this study reduces the lateral deviation of the navigation lines by 1.7 in practical applications. In 2024, Yu et al. [34] also studied crop-free-ridge recognition; compared with the navigation line extraction method presented in this study, their average pixel deviation was 1.4 pixels higher. This can largely be attributed to the better navigation line feature point extraction strategy and the highly robust fitting algorithm: fitting only the proximal ridge feature points effectively reduces the influence of useless information from the distal ridge surface, and the least-squares fit better reflects the changing trend of the feature point set when applied to these more concentrated feature points.
In general, the method proposed in this study ensures that the ridge shape segmentation effect and navigation line fitting effect meet the requirements of navigation recognition, realizes significant reductions in recognition model parameters and computational complexity, and improves the navigation line recognition frame rate to a certain extent, providing a more suitable algorithm model for edge device deployment. In the future, in-depth research will be conducted on the detection accuracy and detection frame rate to further improve the application effect of this research method in crop-free-ridge recognition navigation. For example, attention mechanisms can be added to the neural network model [35].

6. Conclusions

This article addresses the problem of recognition and navigation in crop-free ridge environments. A deep learning network model, YOLOv8, is improved and used for ridge segmentation and recognition in crop-free ridges, and the least-squares fitting algorithm is used to extract navigation lines from the segmented area. Based on our comparative experiments, the following conclusions can be drawn:
(1)
The improved YOLOv8 model demonstrated superior overall performance compared with models such as Mask-RCNN, YOLACT++, YOLOv8, and YOLO11. Specifically, the Params and FLOPs of the model were reduced to 1.8 M and 8.8 G, respectively. At the same time, the detection frame rate on an RTX 3060 GPU increased to 49.5 frames per second, 2.8 frames higher than the original model. While maintaining a high accuracy of 90.4%, the model reduces the number of parameters and the computational requirements and improves the detection frame rate, providing a method suitable for the edge deployment of agricultural machinery.
(2)
The least-squares fitting algorithm used to extract navigation lines from the detection mask exhibited good performance, with an average initial deviation of 3.60 pixels and an average overall deviation of 2.10 pixels. It also demonstrated good anti-interference performance in the presence of fluctuations in the fitted data points, meaning that it can better meet the accuracy requirements of real-time detection in complex scenes involving agricultural machinery.

7. Future Work

Despite these achievements, there are still many areas where the method needs further research and improvement. In the future, the method will be improved from the following perspectives:
(1)
We will continue to explore the factors affecting the recognition frame rate and further improve the recognition efficiency of the method;
(2)
We will further study the relationship between the deep network model structure and recognition accuracy in crop-free-ridge environments, reducing or even eliminating the loss of recognition accuracy while maintaining the current parameter count and computational complexity;
(3)
Based on specific usage scenarios, we will conduct research on the deployment platform to implement real-time ridge line recognition using onboard methods.

Author Contributions

Conceptualization, R.L., T.Z. and J.H.; methodology, R.L., T.Z. and J.H.; software, R.L., T.Z. and W.L.; validation, R.L. and T.Z.; formal analysis, R.L., X.C. and J.H.; investigation, R.L. and T.Z.; resources, J.H.; data curation, R.L.; writing—original draft preparation, R.L.; writing—review and editing, R.L., T.Z., X.C. and J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Jiangsu Modern Agricultural Machinery Equipment and Technology Demonstration Project (No. NJ2021-08); the Priority Academic Program Development of Jiangsu Higher Education Institutions (No. PAPD-2023-87); the Research and Application of Key Technology and Equipment for Efficient Automatic Transplanting of Open Field Cabbage (No. NJ2024-04); and the China Postdoctoral Science Foundation (2024M751185).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhao, S.; Liu, J.; Jin, Y.; Bai, Z.; Liu, J.; Zhou, X. Design and Testing of an Intelligent Multi-Functional Seedling Transplanting System. Agronomy 2022, 12, 2683. [Google Scholar] [CrossRef]
  2. Zhang, L.; Liu, G.; Qi, Y.; Yang, T.; Jin, C. Research Progress on Key Technologies of Agricultural Machinery Unmanned Driving System. J. Intell. Agric. Mech. 2022, 3, 27–36. [Google Scholar]
  3. Jin, Y.; Liu, J.; Xu, Z.; Yuan, S.; Li, P.; Wang, J. Development Status and Trend of Agricultural Robot Technology. Int. J. Agric. Biol. Eng. 2021, 14, 1–19. [Google Scholar] [CrossRef]
  4. Luo, X.; Liao, J.; Zang, Y.; Qu, Y.; Wang, P. The Development Direction of Agricultural Production in China: From Mechanization to Intelligence. China Eng. Sci. 2022, 24, 46–54. [Google Scholar] [CrossRef]
  5. Yin, J.; Wang, Z.; Zhou, M.; Wu, L.; Zhang, Y. Optimized Design and Experiment of the Three-Arm Transplanting Mechanism for Rice Potted Seedlings. Int. J. Agric. Biol. Eng. 2021, 14, 56–62. [Google Scholar] [CrossRef]
  6. Wang, X.; Han, X.; Mao, H.; Liu, F. Visual Navigation Path Detection of Greenhouse Tomato Ridges Based on Least Squares Method. J. Agric. Mach. 2012, 43, 161–166. [Google Scholar]
  7. Jiang, G.; Wang, X.; Wang, Z.; Liu, H. Wheat Rows Detection at the Early Growth Stage Based on Hough Transform and Vanishing Point. Comput. Electron. Agric. 2016, 123, 211–223. [Google Scholar] [CrossRef]
  8. Zhou, X.; Zhang, X.; Zhao, R.; Chen, Y.; Liu, X. Navigation Line Extraction Method for Broad-Leaved Plants in the Multi-Period Environments of the High-Ridge Cultivation Mode. Agriculture 2023, 13, 1496. [Google Scholar] [CrossRef]
  9. Li, H.; Lai, X.; Mo, Y.; He, D.; Wu, T. Pixel-Wise Navigation Line Extraction of Cross-Growth-Stage Seedlings in Complex Sugarcane Fields and Extension to Corn and Rice. Front. Plant Sci. 2025, 15, 1499896. [Google Scholar] [CrossRef]
  10. Li, Y.; Wang, X.; Liu, D. 3D Autonomous Navigation Line Extraction for Field Roads Based on Binocular Vision. J. Sens. 2019, 2019, 6832109. [Google Scholar] [CrossRef]
  11. Kneip, J.; Fleischmann, P.; Berns, K. Crop edge detection based on stereo vision. Robot. Auton. Syst. 2020, 123, 103323. [Google Scholar] [CrossRef]
  12. Liu, J.; He, M.; Xie, B.; Peng, Y.; Shan, H. Rapid Online Method and Experiment of Autonomous Navigation Robot for Trellis Orchard. J. Agric. Eng. 2021, 37, 12–21. [Google Scholar]
  13. Ji, W.; Gao, X.; Xu, B.; Pan, Y.; Zhang, Z.; Zhao, D. Apple Target Recognition Method in Complex Environment Based on Improved YOLOv4. J. Food Process Eng. 2021, 44, e13866. [Google Scholar] [CrossRef]
  14. Zhang, F.; Chen, Z.; Ali, S.; Yang, N.; Fu, S.; Zhang, Y. Multi-Class Detection of Cherry Tomatoes Using Improved Yolov4-Tiny Model. Int. J. Agric. Biol. Eng. 2023, 16, 225–231. [Google Scholar]
  15. Wang, J.; Gao, Z.; Zhang, Y.; Zhou, J.; Wu, J.; Li, P. Real-Time Detection and Location of Potted Flowers Based on a ZED Camera and a YOLO V4-Tiny Deep Learning Algorithm. Horticulturae 2021, 8, 21. [Google Scholar] [CrossRef]
  16. Zhao, S.; Peng, Y.; Liu, J.; Wu, S. Tomato Leaf Disease Diagnosis Based on Improved Convolution Neural Network by Attention Module. Agriculture 2021, 11, 651. [Google Scholar] [CrossRef]
  17. Zhou, J.; Geng, S.; Qiu, Q.; Shao, Y.; Zhang, M. A Deep-Learning Extraction Method for Orchard Visual Navigation Lines. Agriculture 2022, 12, 1650. [Google Scholar] [CrossRef]
  18. Yang, R.; Zhai, Y.; Zhang, J.; Zhang, H.; Tian, G.; Zhang, J.; Huang, P.; Li, L. Potato Visual Navigation Line Detection Based on Deep Learning and Feature Midpoint Adaptation. Agriculture 2022, 12, 1363. [Google Scholar] [CrossRef]
  19. Li, X.; Su, J.; Yue, Z.; Duan, F. Adaptive Multi-ROI Agricultural Robot Navigation Line Extraction Based on Image Semantic Segmentation. Sensors 2022, 22, 7707. [Google Scholar] [CrossRef]
  20. Ruan, Z.; Chang, P.; Cui, S.; Luo, J.; Gao, R.; Su, Z. A Precise Crop Row Detection Algorithm in Complex Farmland for Unmanned Agricultural Machines. Biosyst. Eng. 2023, 232, 1–12. [Google Scholar] [CrossRef]
  21. Yang, Y.; Zhou, Y.; Yue, X.; Zhang, G.; Wen, X.; Ma, B.; Chen, L. Real-time detection of crop rows in maize fields based on autonomous extraction of ROI. Expert Syst. Appl. 2023, 213, 118826. [Google Scholar] [CrossRef]
  22. Chen, H.; Zhang, Z.; Xie, W.; Wang, C.; Wang, F. Research on Navigation Line Extraction Algorithm for Sanqi Ridge Based on Deeplab-MV3. J. Kunming Univ. Sci. Technol. (Nat. Sci. Ed.) 2023, 48, 95–106. [Google Scholar]
  23. Yu, J.; Zhang, J.; Shu, A.; Chen, Y.; Chen, J.; Yang, Y.; Tang, W.; Zhang, Y. Study of convolutional neural network-based semantic segmentation methods on edge intelligence devices for field agricultural robot navigation line extraction. Comput. Electron. Agric. 2023, 209, 107811. [Google Scholar] [CrossRef]
  24. Li, D.; Li, B.; Kang, S.; Feng, H.; Long, S.; Wang, J. E2CropDet: An Efficient End-to-End Solution to Crop Row Detection. Expert Syst. Appl. 2023, 227, 120345. [Google Scholar] [CrossRef]
  25. Gong, H.; Zhuang, W.; Wang, X. Improving the Maize Crop Row Navigation Line Recognition Method of YOLOX. Front. Plant Sci. 2024, 15, 1338228. [Google Scholar] [CrossRef]
  26. Kong, X.; Guo, Y.; Liang, Z.; Zhang, R.; Hong, Z.; Xue, W. A Method for Recognizing Inter-Row Navigation Lines of Rice Heading Stage Based on Improved ENet Network. Measurement 2025, 241, 115677. [Google Scholar] [CrossRef]
  27. Jocher, G.; Chaurasia, A.; Qiu, J. Ultralytics YOLO (Version 8.0.0) [Computer Software]. 2023. Available online: https://github.com/ultralytics/ultralytics (accessed on 10 July 2024).
  28. Qin, D.; Leichner, C.; Delakis, M.; Fornoni, M.; Luo, S.; Yang, F.; Wang, W.; Banbury, C.; Ye, C.; Akin, B.; et al. MobileNetV4: Universal Models for the Mobile Ecosystem. In Proceedings of the European Conference on Computer Vision; Springer Nature: Cham, Switzerland, 2024; pp. 78–96. [Google Scholar]
  29. Ma, N.; Zhang, X.; Zheng, H.T.; Sun, J. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. In Proceedings of the European Conference on Computer Vision; Springer: Cham, Switzerland, 2018; pp. 116–131. [Google Scholar]
  30. Wang, S.; Sun, G.; Zheng, B.; Du, Y. A Crop Image Segmentation and Extraction Algorithm Based on Mask RCNN. Entropy 2021, 23, 1160. [Google Scholar] [CrossRef]
  31. Huang, M.; Xu, G.; Li, J.; Huang, J. A Method for Segmenting Disease Lesions of Maize Leaves in Real Time Using Attention YOLACT++. Agriculture 2021, 11, 1216. [Google Scholar] [CrossRef]
  32. Hidayatullah, P.; Syakrani, N.; Sholahuddin, M.R.; Gelar, T.; Tubagus, R. YOLOv8 to YOLO11: A Comprehensive Architecture In-Depth Comparative Review. J. Appl. Eng. Technol. Sci. 2025, submitted.
  33. Zhou, M.; Xia, J.; Yang, F.; Zheng, K.; Hu, M.; Li, D.; Zhang, S. Design and Experiment of Visual Navigated UGV for Orchard Based on Hough Matrix and RANSAC. Int. J. Agric. Biol. Eng. 2021, 14, 176–184. [Google Scholar] [CrossRef]
  34. Yu, G.; Wang, Y.; Gan, S.; Xu, H.; Chen, J.; Wang, L. Improved DeepLabV3+ Algorithm for Extracting Navigation Lines in Crop Free Fields. J. Agric. Eng. 2024, 40, 168–175. [Google Scholar]
  35. Zhang, Y.; Wang, H.; Liu, J.; Zhao, X.; Lu, Y.; Qu, T.; Tian, H.; Su, J.; Luo, D.; Yang, Y. A Lightweight Winter Wheat Planting Area Extraction Model Based on Improved DeepLabv3+ and CBAM. Remote Sens. 2023, 15, 4156. [Google Scholar] [CrossRef]
Figure 1. The crop-free ridge.
Figure 2. Multi-environment ridge training image: (a) straight ridge—normal lighting; (b) straight ridge—strong light; (c) straight ridge—weak light; (d) curved ridge—strong light.
Figure 3. Image annotation method: (a) straight ridge mark; (b) curved ridge mark.
Figure 4. Image enhancement method: (a) fuzzy; (b) rotation; (c) exposure.
Figure 5. Improved network model structure diagram.
Figure 6. Universal inverted bottleneck module structure diagram.
Figure 7. Shuffleblock structure diagram.
Figure 8. Navigation line extraction method: (a) identify the field ridge mask; (b) select the largest area block; (c) scan line by line to obtain a set of boundary points; (d) calculate the average to obtain a set of midline points; (e) select the set of points in the lower part of the image; (f) fit the navigation line. Note: The curve in the figure is the extracted contour line, the circles are the extracted feature points, and the straight line is the fitted navigation line.
Figure 9. Comparison of multi-model detection effects.
Figure 10. Multi-scene navigation line fitting effect. Note: The red line is the extracted ridge centerline, the blue line is the fitting result of the least-squares algorithm, and the green line is the fitting result of the RANSAC algorithm.
Table 1. Backbone comparison experiment.
| MobileNetV4 | ShuffleNetV2 | GhostNet | EfficientNet | Params/M | FLOPs/G | mAP/% | FPS |
|---|---|---|---|---|---|---|---|
| √ | × | × | × | 2.3 | 9.7 | 90.4 | 47.6 |
| × | √ | × | × | 2.2 | 2.4 | 87.2 | 73.5 |
| × | × | √ | × | 1.9 | 9.1 | 89.6 | 48.5 |
| × | × | × | √ | 7.5 | 9.5 | 96.2 | 29.5 |
Note: "√" indicates that the backbone was used; "×" indicates that it was not.
Table 2. Ablation experiment.
| MobileNetV4 | Shuffleblock | Params/M | FLOPs/G | mAP/% | FPS |
|---|---|---|---|---|---|
| × | × | 3.2 | 12.1 | 96.3 | 46.7 |
| √ | × | 2.3 | 9.7 | 90.4 | 47.6 |
| × | √ | 2.8 | 11.2 | 95.0 | 47.4 |
| √ | √ | 1.8 | 8.8 | 90.4 | 49.5 |
Note: "√" indicates that the module was used; "×" indicates that it was not.
Table 3. Model comparison experiment.
| Model | Params/M | FLOPs/G | mAP/% | FPS |
|---|---|---|---|---|
| Mask-RCNN | 43.9 | 134.07 | 80.6 | 13.8 |
| YOLACT++ | 49.6 | 167.09 | 91.7 | 16.2 |
| YOLOv8 | 3.2 | 12.1 | 96.3 | 46.7 |
| YOLO11 | 2.8 | 10.4 | 95.2 | 43.7 |
| This study 1 | 1.8 | 8.8 | 90.4 | 49.5 |
1 https://github.com/ssoulife/MSYOLOv8.git (accessed on 25 April 2025).
Table 4. Comparison of the effectiveness of the navigation line fitting methods.
| Deviation (pixels) | Scene 1 | Scene 2 | Scene 3 | Scene 4 | Scene 5 | Average | Total Average | Standard Deviation | t | p |
|---|---|---|---|---|---|---|---|---|---|---|
| Initial deviation of this article | 4.31 | 2.84 | 1.07 | 1.48 | 8.34 | 3.60 | 4.26 | 4.01 | 2.77 | 0.01 |
| Initial deviation of RANSAC | 11.91 | 2.84 | 1.07 | 2.37 | 12.15 | 6.07 | 6.66 | 8.22 | | |
| Overall deviation of this article | 1.72 | 1.50 | 1.88 | 2.50 | 2.90 | 2.10 | 2.68 | 1.79 | 1.33 | 0.19 |
| Overall deviation of RANSAC | 3.08 | 1.50 | 1.88 | 2.64 | 2.94 | 2.41 | 3.00 | 1.92 | | |