1. Introduction
Innovative Industry 4.0 technologies such as the Internet of Things (IoT), artificial intelligence (AI), big data, and robotics have brought Smart Agriculture 4.0 into the era of rural digital transformation [1]. In this context, AI plays a central role both in data management and big data analysis and in image processing (computer vision) for crop monitoring applications [2]. Image analysis with AI or machine learning methods is used in crop monitoring applications with autonomous ground or aerial vehicles and focuses on identifying crop risks such as diseases, nutrient deficiencies, and weed spread. For weeds in particular, AI is applied to site-specific weed management, whether mechanical or chemical [3].
Of all the weed control methods (chemical, mechanical, flame, or laser), mechanical control is the oldest and has been used for centuries. Conventional mechanical weed control has a weed control efficiency of 40–80% and typically shows a 10–30% efficacy gap compared with the very high efficiency of chemical weed control [4,5,6]. Yet mechanical weeding combined with sensor technologies has seen renewed interest in recent years [4,5,7]. European regulations restricting the use of chemical preparations [8] and the banning of certain active substances [9] because of their significant environmental effects are the main reasons for the continued relevance of mechanical weed control. The ever-increasing demand for organic products is also a major factor in the adoption of mechanical weeding by growers. Finally, the relatively rapid development of resistance in certain weed species to certain herbicides, which can occur within less than 5 years of applying a herbicide in a field [10], further promotes mechanical weeding.
The effectiveness of mechanical weeding depends on the tools used to perform it [3,7]. For years, weeds between crop rows were managed using cultivators, while more recently, tools designed for weeding directly on the crop row have become established. This type of weeding is carried out either with conventional, easy-to-use tools such as finger weeders or torsion weeders [5] or with newer, more innovative tools that do not apply a uniform action along the entire length of the row but act in a spatially individualized manner in the intra-row space, such as moving blades [11]. The latter are more complex systems that detect either the crops or the weeds along the row in order to act precisely where needed. Both types of intra-row weeding tools, however, require particularly high accuracy in their placement over the crop rows. For this reason, whether on autonomous robotic vehicles or on tractor-mounted implements, robotic automatic steering systems have emerged to ensure precise row alignment of the weeding tools.
The input for these robotic alignment mechanisms can be either differential GPS, provided that sowing or planting was also carried out using this technology, or visual sensors. The data from visual sensors can be processed using traditional image analysis algorithms based on pixel color [12,13], but this method is vulnerable when images contain a high density of weeds [14]. Neural networks, such as convolutional neural networks (CNNs), can handle such situations more effectively, though they require greater onboard computational power and increase the analysis time per frame. It is, therefore, essential to use efficient, lightweight, and fast neural network models [15].
The models used in automatic steering systems for real-time row detection typically recognize crop rows as lines (semantic segmentation or polyline/line object detection). The aim of this study is to introduce an alternative AI-based approach for crop row detection systems utilizing visual sensors. The proposed approach examines the indirect recognition of crop rows by detecting the plants that define them and then estimating the row using linear regression. Two CNNs were tested for crop detection, specifically YOLO models known for their strong detection performance [16]. The selected models were YOLOv8n and YOLO11n. Despite their relatively low number of parameters, these “nano” versions achieve satisfactory detection results with low inference times in similar applications [17]. This makes them ideal for the proposed indirect line detection system, which requires fast and accurate object detection. However, the challenge in this novel approach lies in grouping the detected plant points according to the row they belong to, particularly in images where multiple crop rows appear. For this reason, three different point-grouping approaches were tested, and the estimated line from each was compared with the ideal one using various line comparison metrics and estimation error measures.
2. Materials and Methods
2.1. Dataset Description
Approximately 1200 lettuces (Lactuca sativa L.) were planted in rows in an experimental field in Oropos, Attica, Greece (approximately 38.31641182, 23.75919572). The row spacing was 75 cm, and the intra-row plant spacing was 40 cm. The lettuces were planted on 20 December, and data collection was performed at 60, 90, and 120 days after planting. No weed control was performed in the field, so that the proposed row detection method could be tested under challenging conditions of high weed infestation and crop–crop or crop–weed overlap.
A total of 763 sample images were collected, containing 13,642 lettuce instances. The images were captured using a Motorola Moto G85 (Motorola Inc., Chicago, IL, USA) with a 50-megapixel RGB camera at noon, using the standard image-capturing application and default settings. The camera has an f/1.79 aperture, 0.8 µm pixel size, an electronic rolling shutter, optical image stabilization, and quad phase-detection autofocus. JPEG images were taken at a resolution of 4096 × 3072 pixels. A shutter speed of 1/152 s was used, while the ISO was adjusted automatically to ensure optimal image quality under the varying lighting conditions during measurements. The images were captured from a height of 1.1 m. The camera tilt ranged between 30 and 60 degrees, allowing 4 to 6 rows to be visible per image.
The visual sensor used for detecting crop rows can be positioned either between two rows or directly above a crop row. The camera tilt can be 90 degrees relative to the horizontal plane for vertical captures, or it can vary for oblique views. Consequently, the images used for row monitoring fall into four categories, as illustrated in Figure 1.
The proposed methodology utilizes images from category 4, which represents the most challenging case for indirect row detection via linear regression of the points defining the detected crops. The difficulty arises not only from the occlusion of plants by plants or weeds in front of them (overlap) but also from the fact that the exact planting point of each crop does not coincide with the center of its visible boundary in the image. Therefore, the proposed row detection method is based on the assumption that the deviation between the actual central point of a plant and its center as depicted in the 2D image is negligible for the purpose of indirect row detection.
2.2. Crop Detection Models
The dataset of 763 images was split into training, validation, and test sets using a 70:15:15 ratio, resulting in 535 training, 114 validation, and 114 test images. The lettuces were annotated with bounding boxes under the label “Crop”, while the crop rows were labeled as “Row” using 2-point polylines (straight lines). The row labels were not used in model training but served solely as ground truth for comparing the estimated row line produced by the developed algorithm.
Training of YOLOv8n and YOLO11n was conducted for three different landscape image resolutions (384 × 640, 768 × 1024, and 1536 × 2048) over 300 epochs with early stopping set at 50 epochs. For all six trained models, the evaluation metrics included precision, recall, F1 score, mAP50, and mAP50-95. The average model confidence was also calculated using a confidence threshold of 0.25. This threshold was also used for model testing and the assessment of the proposed crop row detection method. In this study’s context, the analysis time per frame from continuous image capture by a robotic implement dictates the robot’s or tractor’s movement speed and, therefore, the overall effectiveness of the weeding tools. As such, inference time (i.e., the time required for object localization and prediction) was also an essential comparison metric for the models. The YOLO models were validated and tested on an NVIDIA GeForce RTX 4080 GPU (Nvidia Corporation, Santa Clara, CA, USA), processing 1 image per batch.
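For illustration, a minimal training sketch is given below, assuming the Ultralytics Python API; the dataset YAML name, the use of pretrained nano weights, and the way the three landscape resolutions are passed (via the longer image side) are assumptions of the sketch rather than the exact configuration used in this study.

```python
from ultralytics import YOLO

# Hypothetical dataset config; replace with the actual annotation export.
DATA_YAML = "lettuce_rows.yaml"

# Longer image side for each of the three landscape resolutions tested.
IMAGE_SIZES = [640, 1024, 2048]

for base_weights in ("yolov8n.pt", "yolo11n.pt"):
    for imgsz in IMAGE_SIZES:
        model = YOLO(base_weights)        # start from pretrained nano weights
        model.train(
            data=DATA_YAML,
            epochs=300,                   # maximum number of epochs
            patience=50,                  # early stopping after 50 stagnant epochs
            imgsz=imgsz,                  # training image size
            name=f"{base_weights.split('.')[0]}_{imgsz}",
        )
        metrics = model.val(conf=0.25)    # precision, recall, mAP50, mAP50-95
        print(base_weights, imgsz, metrics.box.map50)
```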
Finally, the most efficient model was selected for the indirect detection of lettuce planting lines. The estimated row line was extracted via linear regression applied to the centers of the bounding boxes of the detected lettuces. For the selected model, the confidence threshold was also set at 0.25.
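The sketch below illustrates this step under the same assumptions (Ultralytics API, hypothetical file paths): the centers of the detected bounding boxes are collected, and a line y = ax + b is fitted by least squares to the points of a single row; the grouping of points into rows is described in Section 2.3.

```python
import numpy as np
from ultralytics import YOLO

model = YOLO("runs/detect/yolov8n_1024/weights/best.pt")   # hypothetical path to the selected model
result = model.predict("frame_0001.jpg", conf=0.25)[0]      # single-frame inference

# Centers (x, y in pixels) of the detected "Crop" bounding boxes.
centers = result.boxes.xywh.cpu().numpy()[:, :2]

def fit_row_line(points: np.ndarray):
    """Fit y = a*x + b by least squares to the points of one row."""
    a, b = np.polyfit(points[:, 0], points[:, 1], deg=1)
    return a, b
```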
2.3. Estimation of Crop Lines
Since each image contained multiple rows and points belonging to different planting lines, it was necessary to apply a method to classify the detected lettuces by row. The grouping method relied on comparing Euclidean distances between points to identify their nearest available neighbor and group them into the same row. Differentiation between points of different rows was implemented using three different approaches: (1) a fixed pixel-distance threshold between a point and its nearest neighbor; (2) a dynamic threshold based on the last inter-point distance; and (3) a threshold based on the angle of the line connecting a point with its nearest available neighbor.
Grouping of rows always began from the lowest available point in the image (southernmost), and given that the camera angle had a tilt, it was ensured that the distance between two neighboring points (plants) in a row closer to the camera appeared larger than the distance between two plants in the same row that were farther from the camera. Additionally, the given planting distances and the placement of the camera between two parallel planting rows ensured that the distance between neighboring points on the same row was always smaller than that between neighboring points on different rows. These two conditions guaranteed the success of the grouping methodology across all three approaches.
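A minimal sketch of this grouping logic is shown below for the fixed pixel-distance variant (Approach 1); the threshold value and helper names are illustrative, and Approaches 2 and 3 differ only in the condition that ends the current row.

```python
import numpy as np

def group_points_into_rows(centers, max_gap_px=500.0):
    """Greedy nearest-neighbour grouping of detected plant centres into rows.

    Each row starts from the lowest (southernmost) unassigned point and grows
    by repeatedly taking the nearest unassigned neighbour while the Euclidean
    distance stays below the empirical pixel threshold (Approach 1).
    Approaches 2 and 3 replace only the row-break condition.
    """
    remaining = [np.asarray(p, dtype=float) for p in centers]
    rows = []
    while remaining:
        # The lowest point in image coordinates (largest y) opens a new row.
        start_idx = int(np.argmax([p[1] for p in remaining]))
        current = remaining.pop(start_idx)
        row = [current]
        while remaining:
            dists = [np.linalg.norm(p - current) for p in remaining]
            nearest = int(np.argmin(dists))
            if dists[nearest] > max_gap_px:   # nearest neighbour too far away: end of this row
                break
            current = remaining.pop(nearest)
            row.append(current)
        rows.append(row)
    return rows
```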
The overall effectiveness of the methodology also heavily depends on the detection performance of the computer vision model. If the model fails to detect certain plants within a row, the algorithm may group a point with a neighbor from a different row. This is why a model comparison was initially conducted. Since the efficiency of all three grouping approaches depends on specific empirical parameters, it was essential not only to identify the most efficient approach but also to determine the optimal empirical parameter for each. Therefore, each of the three approaches was evaluated using different values for their respective empirical thresholds.
2.4. Evaluation of Estimated Lines
The accuracy of the estimated lines (ELs) was assessed by comparing them against the manually annotated ground truth planting lines (GTs). This comparison was conducted for each methodological approach and for all the empirical parameters used during the experiments. For every pair of EL and GT lines, the following evaluation metrics were computed to quantify the deviation between them:
- Euclidean Distance between the coefficients (a, b) of the EL and the GT, assuming lines are represented as y = ax + b, in pixels.
- Hausdorff Distance, calculated from a uniform sample of 100 points per line, in pixels.
- Chamfer Distance, also based on 100 sample points per line, in pixels.
- Angle Difference between the two lines, measured in degrees.
- Mean Orthogonal Distance from the detected points of the EL to the GT line, in pixels.
- Root Mean Square Error (RMSE) of the orthogonal (perpendicular) distances from the detected points to the GT line, in pixels.
- Mean Horizontal Distance from the detected points to the GT line, in pixels.
- RMSE of the Horizontal Projections (X-axis RMSE) of the detected points of the EL onto the GT line, in pixels.
The metrics were extracted on a line-by-line basis per image. The mean value of each metric per image was subsequently calculated, and finally, the average of each metric across all the images in the evaluation set (114 images) was obtained. Even though these metrics cannot capture the actual distance between lines in units of length (e.g., centimeters), they provide an adequate basis for comparing the different approaches and values of the empirical parameters.
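As an illustration, the sketch below shows how several of these metrics can be computed for a pair of lines y = ax + b; the sampling range and the SciPy-based Hausdorff computation are assumptions of the sketch, not necessarily the implementation used in this study.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def sample_line(a, b, x_min=0, x_max=4096, n=100):
    """Uniformly sample n (x, y) points from the line y = a*x + b."""
    x = np.linspace(x_min, x_max, n)
    return np.column_stack([x, a * x + b])

def hausdorff_distance(p, q):
    """Symmetric Hausdorff distance between two point sets, in pixels."""
    return max(directed_hausdorff(p, q)[0], directed_hausdorff(q, p)[0])

def chamfer_distance(p, q):
    """Average nearest-neighbour distance, computed in both directions."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

def angle_difference_deg(a_el, a_gt):
    """Absolute difference between the two line angles, in degrees."""
    return abs(np.degrees(np.arctan(a_el) - np.arctan(a_gt)))

def mean_orthogonal_distance(points, a, b):
    """Mean perpendicular distance of detected points from the GT line."""
    return np.mean(np.abs(a * points[:, 0] - points[:, 1] + b) / np.hypot(a, 1.0))
```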
3. Results
3.1. Detection Models Comparison Results
Comparison of the two YOLO models at three different image sizes resulted in a total of six models, each with varying performance in terms of accuracy in detecting lettuce plants in the field, as well as the time required for this detection. The detection results of the models are shown in Figure 2a and Figure 2b, corresponding to the YOLOv8n and YOLO11n versions, respectively.
As illustrated in Figure 2a,b, all six models achieve a precision of approximately 98%. The F1 score, mAP@50, and mAP@50-95 values do not show significant differences between the two YOLO versions. However, the YOLO11n models generally present a higher mean confidence across most image sizes. It is also worth noting that the lower-resolution images perform on par with the higher-resolution ones, maintaining equally high detection accuracy.
The comparison of the average inference time for the models, along with the corresponding standard deviation, is presented in Figure 3 below.
For both model versions, it is evident that processing high-resolution images (e.g., 1536 × 2048) requires significantly more time than processing smaller image sizes. Moreover, despite its higher mean confidence, YOLO11n consistently shows longer inference times than YOLOv8n, regardless of image resolution. For small- and medium-sized images, YOLOv8n’s inference time remains under 2 milliseconds, making it a reliable and fast option for real-time crop detection.
Although all the models demonstrate strong performance, the YOLOv8n model with 768 × 1024 image resolution was selected as the most suitable for validating the proposed methodology, as it offers the best balance of high precision and low inference time, which are the key factors for this application.
3.2. Approaches and Empirical Factors Comparison Results
The bar plots shown in Figure 4, Figure 5 and Figure 6 illustrate the average values, calculated over all the detected lines and images, for each of the eight selected metrics used to compare the EL with the GT line.
The first approach used for grouping the detected points into lines is based on an empirical distance threshold between crops. After testing various threshold values (300, 400, 500, 600, 700, 800, and 900 pixels), it was found that the optimal results were obtained using a threshold of 500 pixels, as shown in the diagrams in Figure 4. Specifically, with this threshold, the mean horizontal distance of the points forming the EL from the GT line was approximately 45 pixels, while the RMSE of the horizontal projections remained below 54 pixels.
In the second approach, the threshold that triggers a line change in the proposed point-grouping methodology is determined dynamically by multiplying the last measured distance between two crops by an empirical factor. To identify the optimal factor, several values were tested: 0.75, 1, 1.25, 1.5, 1.75, and 2. As shown in Figure 5, the best performance was achieved using a factor of 1.5, resulting in a mean horizontal distance of 58.8 pixels and an RMSE of the horizontal projections equal to 77.1 pixels.
The third approach groups points by analyzing the slope of the line formed between each point and its closest neighbor. A new line is initiated when the slope difference exceeds an empirical angle threshold. Among the tested values (0.5, 0.625, 0.75, 0.875, 1, 1.125, 1.25, and 1.5), the best results were obtained for an angle threshold of 0.75 (see Figure 6). This configuration led to a mean horizontal distance of 50.8 pixels and an RMSE of the horizontal projections of 61.9 pixels.
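For clarity, the sketch below restates the row-break conditions of Approaches 2 and 3 as replacements for the fixed-threshold check of Approach 1 in the grouping loop of Section 2.3; the angle-threshold units (radians) and the choice of the previous in-row segment as the reference direction are assumptions, since they are not specified above.

```python
import numpy as np

def breaks_row_dynamic(dist_to_nearest, last_intra_row_dist, factor=1.5):
    """Approach 2: start a new row when the nearest neighbour is farther than
    the previous intra-row distance multiplied by an empirical factor."""
    return dist_to_nearest > factor * last_intra_row_dist

def breaks_row_angle(current_pt, nearest_pt, prev_pt, angle_threshold=0.75):
    """Approach 3 (sketch, threshold assumed in radians): start a new row when
    the direction towards the nearest neighbour deviates from the direction of
    the previous in-row segment by more than the empirical angle threshold."""
    new_dir = np.arctan2(nearest_pt[1] - current_pt[1], nearest_pt[0] - current_pt[0])
    prev_dir = np.arctan2(current_pt[1] - prev_pt[1], current_pt[0] - prev_pt[0])
    return abs(new_dir - prev_dir) > angle_threshold
```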
4. Discussion
The comparison of the YOLO models revealed key findings regarding their detection performance, inference time, and optimal image resolution. Implementing a reliable crop detection model is essential for the effectiveness of the proposed crop line detection methodology. Particular attention must be given to the model’s precision and its confidence threshold, as misclassifying a non-crop object (e.g., a weed) as a crop can significantly distort the estimated line via linear regression. On the other hand, a low recall, which might result in some true crops not being detected, would have a less critical impact, as the remaining detected points would typically be sufficient for estimating the line accurately.
The presented diagrams comparing different empirical parameter values across the three approaches offer an initial insight into the behavior of each method. They indicate that the first approach, using a distance threshold of 500 pixels, yields the lowest error. It is worth noting, however, that although multiple metrics (orthogonal distance, RMSE of the orthogonal projections, horizontal distance, and RMSE of the horizontal projections) point to the first approach as having the lowest pixel-based error, the mean angular difference between the evaluated lines is minimized by the third approach, at just 2.3 degrees. This suggests that identifying a single optimal approach, empirical factor, or evaluation metric is not straightforward and largely depends on the specific application context. Therefore, a comparative evaluation of multiple empirical factors and grouping strategies is essential in each case, a process that is systematically presented and demonstrated in this work.
A potential direction for future research could be to explore a hybrid approach, for example, by combining the distance-threshold method (Approach 1) with the angle-based method (Approach 3), aiming to achieve an even better-performing solution. To fully evaluate the effectiveness of this novel crop line detection methodology, field experiments are necessary. These should involve the integration of the proposed machine vision methodology with a row alignment system and a weeding tool, and the assessment of the weed control efficiency of the final system. Comparisons should be made against both traditional row detection techniques and systems without any alignment mechanism. Additionally, evaluation could focus on the computational efficiency gained by using a single AI model, rather than two separate models for row detection and intra-row crop–weed discrimination, which is common in current mechanical weeding systems. Field testing will also help determine which evaluation metric best reflects real-world error when comparing the estimated lines with the ground truth.
5. Conclusions
This work introduces a novel method for crop line detection. This robust approach is based on the indirect detection of the crop rows by recognizing the individual plants that comprise them, and it remains effective with low error rates, even in fields with high weed density.
In this study, YOLOv8n and YOLO11n models were compared across three different image resolutions. Based on this evaluation, the optimal model–image size combination was selected and used for assessing the proposed method for indirect crop line detection. The method was then evaluated through three different grouping strategies, identifying the most effective one for clustering detected crop points into rows in an image. The scientific novelty of this work lies primarily in the comparative analysis of the three strategies and the determination of the most effective empirical factor for each.
Although field testing and a direct comparison with conventional row detection techniques are still required, the proposed method shows great potential. Specifically, it offers a pathway toward integrating two key subsystems of mechanical weeding (row detection and crop–weed classification) into a single AI-driven image analysis model, which could significantly reduce onboard processing requirements.
Author Contributions
Conceptualization, I.G. and G.G.P.; methodology, I.G. and G.G.P.; software, I.G. and G.G.P.; validation, I.G., G.G.P. and K.G.A.; formal analysis, I.G.; investigation, I.G. and G.G.P.; resources, G.G.P.; data curation, G.G.P. and K.G.A.; writing—original draft preparation, I.G.; writing—review and editing, I.G. and G.G.P.; supervision, K.G.A.; project administration, G.G.P.; funding acquisition, G.G.P. All authors have read and agreed to the published version of the manuscript.
Funding
This research was carried out within the framework of the National Recovery and Resilience Plan Greece 2.0, funded by the European Union—NextGenerationEU (Implementation body: HFRI, Project: “Precision Weed Management in Cotton”, # 15563).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Dataset available on request from the authors.
Acknowledgments
The authors would like to thank Aspasia Efthimiadou and Nikolaos Katsenios for their invaluable help in the setup and execution of this experiment. We would also like to thank Christophoros-Nikitas Kasimatis and Christos Kyriakou for their assistance with the fieldwork, and Konstantinos Konstantinidis for his help with image annotation.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Abbasi, R.; Martinez, P.; Ahmad, R. The Digitization of Agricultural Industry—A Systematic Literature Review on Agriculture 4.0. Smart Agric. Technol. 2022, 2, 100042. [Google Scholar] [CrossRef]
- Kamilaris, A.; Prenafeta-Boldú, F.X. Deep learning in agriculture: A survey. Comput. Electron. Agric. 2018, 147, 70–90. [Google Scholar] [CrossRef]
- Gerhards, R.; Sanchez, D.A.; Hamouz, P.; Peteinatos, G.G.; Christensen, S.; Fernandez-Quintanilla, C. Advances in Site-Specific Weed Management in Agriculture—A Review. Weed Res. 2022, 62, 123–133. [Google Scholar] [CrossRef]
- Andújar, D.; Ribeiro, A.; Fernández-Quintanilla, C.; Dorado, J. Herbicide Savings and Economic Benefits of Several Strategies to Control Sorghum Halepense in Maize Crops. Crop Prot. 2013, 50, 17–23. [Google Scholar] [CrossRef]
- Kunz, C.; Weber, J.F.; Gerhards, R. Benefits of Precision Farming Technologies for Mechanical Weed Control in Soybean and Sugar Beet—Comparison of Precision Hoeing with Conventional Mechanical Weed Control. Agronomy 2015, 5, 130–142. [Google Scholar] [CrossRef]
- Melander, B.; Rasmussen, I.A.; Bàrberi, P. Integrating Physical and Cultural Methods of Weed Control—Examples from European Research. Weed Sci. 2005, 53, 369–381. [Google Scholar] [CrossRef]
- Peruzzi, A.; Martelloni, L.; Frasconi, C.; Fontanelli, M.; Pirchio, M.; Raffaelli, M. Machines for Non-Chemical Intra-Row Weed Control in Narrow and Wide-Row Crops: A Review. J. Agric. Eng. 2017, 48, 57–70. [Google Scholar] [CrossRef]
- European Commission. A Farm to Fork Strategy for a Fair, Healthy and Environmentally-Friendly Food System. COM(2020) 381 Final. Brussels. 2020. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A52020DC0381 (accessed on 24 April 2025).
- Gensch, L.; Jantke, K.; Rasche, L.; Schneider, U.A. Pesticide Risk Assessment in European Agriculture: Distribution Patterns, Ban-Substitution Effects and Regulatory Implications. Environ. Pollut. 2024, 348, 123836. [Google Scholar] [CrossRef] [PubMed]
- Maxwell, B.D.; Roush, M.L.; Radosevich, S.R. Predicting the Evolution and Dynamics of Herbicide Resistance in Weed Populations. Weed Technol. 1990, 4, 2–13. [Google Scholar] [CrossRef]
- Junker, C.; Neuhoff, D.; Blum, H.; Heuberger, H.; Bernschein, M.; Pesch, M.; Döring, T.F. Mechanical Intra-Row Weed Control at Early Growth Stages in Medicinal and Aromatic Plants Using the Example of Parsley (Petroselinum crispum (Mill.) Fuss) and Lemon Balm (Melissa officinalis L.). J. Appl. Res. Med. Aromat. Plants 2025, 45, 100623. [Google Scholar] [CrossRef]
- Meyer, G.E.; Neto, J.C. Verification of Color Vegetation Indices for Automated Crop Imaging Applications. Comput. Electron. Agric. 2008, 63, 282–293. [Google Scholar] [CrossRef]
- Gerhards, R.; Benjamin, K.; Jannis, M.; Möller, K.; Butz, A.; Reiser, D.; Hans-Werner, G. Camera-guided weed hoeing in winter cereals with narrow row distance. Gesunde Pflanz. 2020, 72, 403–411. [Google Scholar] [CrossRef]
- Khan, M.N.; Rahi, A.; Rajendran, V.P.; Al Hasan, M.; Anwar, S. Real-Time Crop Row Detection Using Computer Vision- Application in Agricultural Robots. Front. Artif. Intell. 2024, 7. [Google Scholar] [CrossRef] [PubMed]
- Hossen, M.I.; Awrangjeb, M.; Pan, S.; Mamun, A.A. Transfer Learning in Agriculture: A Review. Artif. Intell. Rev. 2025, 58, 97. [Google Scholar] [CrossRef]
- Sharma, A.; Kumar, V.; Longchamps, L. Comparative Performance of YOLOv8, YOLOv9, YOLOv10, YOLOv11 and Faster R-CNN Models for Detection of Multiple Weed Species. Smart Agric. Technol. 2024, 9, 100648. [Google Scholar] [CrossRef]
- Allmendinger, A.; Saltık, A.O.; Peteinatos, G.G.; Stein, A.; Gerhards, R. Assessing the Capability of YOLO- and Transformer-Based Object Detectors for Real-Time Weed Detection. arXiv 2025, arXiv:2501.17387. [Google Scholar] [CrossRef]
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).