Review

Applications, Trends, and Challenges of Precision Weed Control Technologies Based on Deep Learning and Machine Vision

School of Agricultural Engineering, Jiangsu University, Zhenjiang 212013, China
* Author to whom correspondence should be addressed.
Agronomy 2025, 15(8), 1954; https://doi.org/10.3390/agronomy15081954
Submission received: 17 July 2025 / Revised: 9 August 2025 / Accepted: 11 August 2025 / Published: 13 August 2025
(This article belongs to the Special Issue Research Progress in Agricultural Robots in Arable Farming)

Abstract

Advanced computer vision (CV) and deep learning (DL) are essential for sustainable agriculture through automated vegetation management. This paper systematically reviews advances in these technologies for agricultural settings, analyzing their fundamental principles, architectures, system integration, and practical applications. Combining Transformer architectures with convolutional neural networks (CNNs) in models such as YOLO (You Only Look Once) and Mask R-CNN (Region-Based Convolutional Neural Network) markedly enhances target recognition and semantic segmentation, and the integration of LiDAR (Light Detection and Ranging) with multispectral imagery significantly improves recognition accuracy in complex scenes. Moreover, coupling deep learning models with control systems, including laser modules, robotic arms, and precision spray nozzles, enables intelligent robotic weeding systems that substantially reduce chemical herbicide consumption and improve operational efficiency relative to conventional approaches. Significant obstacles persist, including limited environmental adaptability, real-time processing constraints, and inadequate model generalization. Future directions include the integration of diverse data sources, the development of lightweight models, and the enhancement of intelligent decision-making systems, establishing a framework for the advancement of sustainable agricultural technology.

1. Introduction

1.1. The Urgent Need for Green and Sustainable Agricultural Development

Green and sustainable agriculture has become an increasingly important trend in modern farming, driven by the combined pressures of environmental conservation and global population growth. Feeding the projected 9 billion people in 2050 will require a 70–100% increase in production, and this food security goal is directly threatened by weeds, the primary biological stressor, which reduce global agricultural yields by an average of 10% annually (about 200 million tons of food) [1]. The spread of herbicide resistance, now documented in 267 weed species across 513 resistance cases worldwide, poses a serious threat to agricultural productivity; the resulting failure of chemical control has prompted a shift toward eco-friendly management practices such as biological weed control, precision agriculture, and integrated ecological strategies [2,3,4]. Moreover, traditional manual weeding has become less practical for large-scale farming operations due to rising labor costs. This is especially true in developing countries, where manual weeding accounts for 20–40% of total crop cultivation expenses, placing a significant financial burden on large farms [5].
In this context, one of the key innovations for promoting green and sustainable agricultural growth is intelligent weed treatment technology. By accurately identifying weed species and targeting specific areas for weed management, an intelligent weed control system based on deep learning and machine vision can significantly reduce herbicide use and improve efficiency. This approach can help achieve the dual goals of “Zero Hunger” and “Climate Action”, as outlined in the United Nations’ Sustainable Development Goals (SDGs). The adoption of intelligent weed management technology is expected to cut agricultural carbon emissions by 15–20% and pesticide use by over 30% worldwide [6,7]. To support the digital transformation of agriculture, it is essential to thoroughly analyze technological developments, current applications, and emerging trends in this field. This analysis holds both theoretical and practical importance. Crop yields and operational efficiency can be significantly enhanced by integrating mechanized equipment, drones, robotics, and deep learning algorithms to foster the intelligent evolution of agricultural technology. Although complex technology and high costs currently challenge the development of intelligent agricultural equipment, ongoing advances in robotics and AI algorithms indicate numerous potential future applications [8].

1.2. Development Status and Challenges of Weed Treatment Technology

Mechanical, chemical, and manual weeding are the three primary traditional weed control methods. Plowing and harrowing are examples of mechanical approaches that are eco-friendly but often ineffective because they can also harm crops. Chemical weed control is very effective but poses significant environmental risks. Manual weed removal is accurate but costly and complex to scale for large farms [5,9]. With advances in robotics and artificial intelligence, intelligent weed treatment technology has become a rapidly growing research area. Early studies primarily used traditional machine vision and machine learning techniques, such as support vector machines (SVMs) and random forests, to identify weeds based on color, texture, and other visual cues. However, accuracy in complex field conditions remains limited [10]. Recently, rapid progress in deep learning has created new opportunities for precise weed identification and the development of innovative weed removal systems. Still, three significant challenges remain:
(1) The visual similarity between weeds and crops in natural settings, combined with factors like light and shading interference, and the high economic and time costs for data collection and annotation;
(2) The need for real-time processing of large amounts of image data, along with model generalization and robustness;
(3) The necessity for precise operation and integrated control of innovative weeding systems, as well as issues related to cost management and scalability.

1.3. Deep Learning-Driven Intelligent Weed Control Technology Evolution

International research on intelligent weed control technologies began in the early 21st century, initially focusing on weed identification using traditional machine vision. The rise of deep learning technologies sparked a surge in related research after 2015. For example, using the YOLOv3 framework for farmland weed detection, improvements such as automatic focus layer replacement, image pyramid construction, adaptive spatial fusion strategies, and anchor frame optimization algorithms significantly enhance the model’s effectiveness in detecting crops and weeds. Experimental validation showed a detection accuracy of 80.01%, with a detection speed considerably higher than that of similar algorithms from the same period [6]. A research team from Yunnan Agricultural University used the U-Net model for weed segmentation in sugar beet fields, achieving an average intersection over union (mIoU) of 84.28% and an average pixel accuracy (mPA) of 88.59% on the specialized sugar beet dataset [11]. The lightweight model YOLO-WL, designed for precise weeding in cotton fields, modifies the backbone network using EfficientNet, incorporates a cross-attention mechanism (CA), and improves the feature fusion module (AFPN/EMA), reaching a mean average precision (mAP) of 92.30% on the CottonWeedDet12 dataset. The detection time for a single frame has been reduced to 1.9 ms, and the total number of model parameters has decreased by 30.3%. The integration of TensorRT optimization enables real-time video stream inference, providing an effective solution for weed detection in complex agricultural environments [12]. The Swin-Unet model, developed by Cao Hu’s team at Fudan University, creates a medical image segmentation framework based on a pure transformer architecture. It effectively learns local and global semantic features through image block tagging and uses a U-shaped encoder–decoder structure with skip connections [13].
The advancement of deep learning technology in image recognition and understanding offers strong technical support for accurately detecting weeds. A convolutional neural network (CNN) automatically extracts high-level semantic features from images through multi-layer feature extraction and adaptive learning, thereby improving weed recognition accuracy in complex environments [10,14,15]. The progression from early architectures like AlexNet and VGG to more advanced networks such as ResNet and DenseNet has significantly increased weed detection accuracy, rising from approximately 70–80% in the early stages to over 90% in current mainstream standards [14,16].
The evolution of weeding equipment has shifted from traditional tractor-drawn tools to intelligent mobile robot systems with autonomous decision-making capabilities, forming a closed-loop operation that combines visual perception, autonomous navigation, and precise execution within an ‘identification–positioning–processing’ pipeline. Among representative equipment developments, the Onyx intelligent weeder introduced by the German Kverneland Group couples a deep learning vision system with a mechanical weeding mechanism to remove inter-row weeds with over 85% efficiency at speeds of 5–8 km/h [17]. The Autonomous Weeder robot developed by Carbon Robotics merges mechanical automation, artificial intelligence, and laser technology; equipped with eight 150 W CO2 lasers and 12 high-resolution vision sensors, it navigates autonomously across large fields (single field area ≥ 66.67 hm2) and, by dynamically adjusting its working width, treats an average of 0.24 hm2 of farmland per hour, although its high production cost limits widespread adoption [18]. The weeding robot platform from Wageningen University features a four-wheel drive steering system, enabling high-precision path planning and positioning control through a global navigation satellite system (GNSS) [19]. The photovoltaic-powered weeding robot made by EcoRobotix (Switzerland) carries a highly efficient solar power system for all-weather autonomous operation; its front vision system detects weeds for precise positioning and removal by the Delta manipulator actuator [20]. A deep learning-based variable pesticide spraying weeder for strawberry fields (Figure 1a) acquires images with a camera, identifies weeds (spotted ragweed and caper) using a deep convolutional neural network (e.g., VGG-16), and controls a solenoid valve that drives the nozzle for variable spraying, achieving complete spraying rates of 93% and 86% in laboratory and field experiments, respectively [21]. A UAV carrying a spraying system senses the farmland environment in real time through onboard sensors (e.g., vision, LiDAR), avoids obstacles (e.g., utility poles, trees) by combining path planning and obstacle avoidance algorithms, and accurately sprays herbicides onto the target area (Figure 1b), realizing efficient and automated weed management [22]. Shenyu Zheng [23] developed an intelligent inter-plant weeding robotic system (Figure 1c) that drives the weeding knives through an electric pendulum opening-and-closing mechanism and uses a YOLOv5s + Transformer deep learning model to identify kale positions in real time (with 96.1% accuracy), achieving dynamic obstacle avoidance and weeding; at low speed (0.1 m/s), weeding accuracy reached 96.67% with a 0.83% seedling injury rate in the laboratory and 96.00% with 1.57% in the field, but at 0.5 m/s the accuracy dropped to 81.79% and the injury rate rose to 5.49%. Hui Liu [24] designed an orchard spraying robot with a weeding function (Figure 1d) that controls the herbicide spraying range by adjusting the side fan speed; combined with NPENet navigation path extraction, the robot travels autonomously along the center line of the orchard road to achieve precise application and weeding.
This paper aims to develop an innovative theoretical framework and technical paradigm for weed management in intelligent agricultural settings. It employs a cross-disciplinary approach that combines deep learning and machine vision, establishing a comprehensive research framework encompassing ‘theoretical modeling–algorithmic innovation–system integration–field validation’. This framework integrates advanced technologies, including multimodal data fusion, lightweight model design, and optimization of intelligent decision-making systems, based on a systematic review of progress in weed recognition algorithms such as convolutional neural networks and Transformer architectures, together with an examination of laser weeding modules, robotic arm actuators, and other intelligent devices. On this basis, we analyze current technological constraints from the viewpoints of principles, model architecture, and system integration, and provide a theoretical foundation for exploring new directions such as cross-modal feature learning and digital twin-driven decision-making that are both academically promising and practically applicable, thereby contributing to field-oriented weed identification systems and weed management technologies for environmentally sustainable agriculture.

1.4. Research Objectives and Content Architecture

This review seeks to establish a three-dimensional analytical framework for ‘theory-technology-application’, with the following specific objectives:
  • To delineate the technical progression of deep learning models (CNN, Transformer, and hybrid architectures) in weed detection, together with associated technical accomplishments, and to assess their advantages and limits;
  • To investigate the utilization of machine vision and modal sensor fusion in agricultural contexts, and categorize and summarize the methodologies and impacts of current technologies;
  • To carefully summarize the integration frameworks of intelligent weeding apparatus and evaluate the operational effectiveness of various intelligent actuators;
  • To identify the existing technology constraints and suggest potential research avenues for intelligent agricultural weeding machinery and environmentally sustainable agriculture.

2. Review Methodology

This section outlines the research methods used in this review, including literature search criteria, keyword strategies, and inclusion/exclusion parameters. The study examines the practical applications of deep learning and machine vision technologies in agricultural weed control, with a focus on classifications of intelligent weed control systems and actuator design specifications. A systematic literature review was carried out through thorough data extraction, database screening procedures, and targeted keyword searches to comprehensively identify research progress, technical challenges, and future development directions in this field. To assist the assessment process, the following research questions were formulated:
  • What methods can be employed to identify weed species in agricultural fields accurately?
  • What remote sensors and collection systems are appropriate for monitoring weeds in agricultural fields?
  • What are the conventional concepts of deep learning and machine learning in weed control technologies? What are the strategies for improvement and performance enhancement?
  • What are the fundamental mechanisms employed in weeding operations?
  • Do weeding machines incorporate deep learning or machine learning modules?
This work adhered rigorously to review procedures and performed a literature search across the Web of Science, Scopus, ScienceDirect, MDPI, PubMed, and Google Scholar academic databases to methodically gather pertinent technical papers for evaluation and analysis. The search was confined to technical publications from 2015 to 2025 and employed a specific keyword combination strategy around ‘weed identification’ and ‘weed control machinery’. Core subject phrases included (‘weed recognition’) and (‘weed control machinery’), paired with technological feature terms such as (‘intelligence’) and (‘machine learning’ or ‘deep learning’). By strategically employing Boolean operators (AND, OR), extraneous material was efficiently excluded while improving the coverage of pertinent studies. This review systematically assesses the literature from the preliminary search by analyzing the abstracts and essential findings to determine potential contributions, with the screening process illustrated in Figure 2. Following the final literature screening, the overarching concept and framework of this article were first established and then refined and optimized during the subsequent writing process by aligning the literature content with the review topic.
Based on the preset inclusion and exclusion criteria, the literature identified above was systematically screened. The inclusion criteria mainly comprised a significant correlation with the selected keywords and coverage of either the application of deep learning technology to field weed identification or innovative research on weed identification methods for agricultural operations. It was also verified whether a study made targeted improvements to existing deep learning models and whether such improvements effectively enhanced the recognition accuracy for target crops or weeds. Research that combined deep learning methods with weed management machinery was given priority. The exclusion criteria included literature published in languages other than English.
The screening process involved several key stages. First, the relevance of each publication to the research topic was preliminarily confirmed through an abstract review. After eliminating duplicates, the potential contribution of each publication to this research and the relevance of the technologies involved to the future development of innovative agricultural weed control machinery were further evaluated to ensure the high relevance and research value of the final included literature.
This study involved creating a research dataset by extracting key information from the literature, including titles, keywords, abstracts, authors, and references. After thorough screening, 192 publications from different countries were ultimately selected for inclusion. This paper discusses advances in deep learning and machine vision technologies for agricultural weed management by selecting 23 key publications from this corpus. These selections were based on multiple criteria, such as topic relevance, innovation in weed control methods, recognition system accuracy, and publication year, as summarized in Table 1.

3. Results and Discussion

3.1. Deep Learning Infrastructure

3.1.1. Convolutional Neural Networks (CNN)

The Dense Convolutional Network (DenseNet), introduced by Gao Huang [46], improves feature transmission and reuse through a fully connected inter-layer approach. By incorporating growth rate, bottleneck layer, and compression layer designs to enhance parameter efficiency, DenseNet achieves superior performance on standard datasets, such as Canadian Institute for Advanced Research (CIFAR), Street View House Numbers (SVHN), and ImageNet, while significantly reducing the parameter count and improving recognition accuracy compared to traditional network architectures. This establishes a new paradigm for designing deep learning models [15,16]. The progressive development of residual connections, dense connections, and other methods has greatly alleviated the challenges of training deep networks, while markedly improving feature representation in complex scenarios.
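To make the dense-connectivity idea concrete, the following minimal PyTorch sketch shows a dense block in which each layer receives the concatenated feature maps of all preceding layers. The growth rate, layer count, and input size are illustrative assumptions, not the configuration of the cited DenseNet work.

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """BN -> ReLU -> 3x3 conv producing `growth_rate` new feature maps."""
    def __init__(self, in_channels: int, growth_rate: int):
        super().__init__()
        self.norm = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate, kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(torch.relu(self.norm(x)))

class DenseBlock(nn.Module):
    """Each layer sees the concatenation of all previous feature maps."""
    def __init__(self, in_channels: int, growth_rate: int = 12, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(num_layers)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

# Example: a 3-channel 64x64 patch grows to 3 + 4*12 = 51 output channels.
block = DenseBlock(in_channels=3)
print(block(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 51, 64, 64])
```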
The automatic weed detection model, developed using the Inception V2 convolutional neural network and transfer learning, utilizes a pre-trained model to enhance feature extraction, reaching up to 98% accuracy in identifying four common weeds from the Indian Agricultural Research Institute (IARI) dataset. This species recognition supports the use of unmanned airborne vision systems for precise weed removal [47]. A deep learning-based variable spraying system, utilizing the Visual Geometry Group-16 Layers (VGG-16) architecture, achieves a 93% complete spraying rate for target weeds, including spotted groundsel and caper, in strawberry fields, thereby reducing pesticide use and environmental risks [21]. In an early weed classification task in corn fields, the Xception model attains 97.83% accuracy under natural light conditions [48]. In high-resolution imagery of rapeseed fields, comparing semantic segmentation architectures such as Segmentation Network (SegNet) and U-Shaped Network (U-Net) with encoder modules such as VGG-16 and Residual Network with 50 Layers (ResNet-50) shows that the SegNet model based on ResNet-50 performs best, with a mean intersection over union (mIoU) of 0.8288 and a frequency-weighted intersection over union (FWIoU) of 0.9869 [49]. Additionally, a weed recognition method that integrates multiple feature dimensions with a BP neural network optimizes computational efficiency through dynamic feature selection in asparagus fields, achieving an accuracy of 93.51% and providing technical support for precise weeding operations [50]. Table 2 displays a comparison of the performance of standard CNN models in the weed detection test.
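The transfer-learning strategy described above (pre-training on a large generic dataset, then adapting to a small weed dataset) can be sketched as follows. This is a minimal illustration using a torchvision VGG-16 backbone; the four-class head, learning rate, and frozen-feature choice are assumptions rather than the settings of the cited systems, and the weights enum requires a recent torchvision release.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained VGG-16 backbone and freeze its convolutional layers,
# then replace the classifier head for a hypothetical 4-class weed dataset.
NUM_WEED_CLASSES = 4  # illustrative; matches no specific dataset

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)  # torchvision >= 0.13
for param in model.features.parameters():
    param.requires_grad = False  # keep the pretrained feature extractor fixed

model.classifier[6] = nn.Linear(model.classifier[6].in_features, NUM_WEED_CLASSES)

optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One fine-tuning step on a batch of (N, 3, 224, 224) weed images."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Smoke test with random tensors standing in for a real field dataset.
print(train_step(torch.randn(2, 3, 224, 224), torch.tensor([0, 3])))
```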
In practical applications across various contexts, convolutional neural networks (CNNs) have achieved high recognition accuracy and operational precision in agricultural weed detection. This advancement provides significant technical support for precise weeding in farm fields, reduces pesticide use, and minimizes environmental hazards, clearly demonstrating the critical role of deep learning technology in advancing smart agriculture.

3.1.2. Target Detection Models

Target detection, as a key technology for weed localization, is mainly divided into single-stage (one-stage) and two-stage models. Shaoqing Ren [51] proposed the Faster Region-Based Convolutional Neural Network (Faster R-CNN) detection framework, which achieves near-zero-cost region proposal generation by introducing a Region Proposal Network (RPN) that shares convolutional features with the detection network; combined with anchor design and an alternating training strategy, it achieves efficient, high-precision detection on datasets such as Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes (PASCAL VOC) and Microsoft Common Objects in Context (MS COCO). Single-stage models such as the YOLO (You Only Look Once) family [34,52,53,54] transform the detection task into a regression problem, achieving a balance between speed and accuracy. By introducing multi-scale detection, an ECA attention mechanism, soft-NMS, and the Complete Intersection over Union (CIoU) loss function into YOLOv4-Tiny, one study achieved high-precision real-time detection of six weed species in a peanut field with 94.54% mAP on the test set [55]. An improved RetinaNet network achieved real-time performance of 24.3 fps and a mean average precision (mAP) of 94.1% in rice field weed detection, demonstrating the potential of detection networks for real-time weed control [56]. By optimizing the training parameters of the YOLOv7 model (batch size 2, 200 iterations) and combining MS COCO pre-trained weights via transfer learning, high-accuracy detection of multi-species tea buds (mAP 87.1%) was achieved [57]. An image dataset containing five weed species, such as celandine and dandelion, was constructed, and the weed detection performance of YOLOv8, YOLOv9, YOLOv11, and Faster R-CNN models was compared; YOLOv9 achieved the highest detection accuracy with an mAP@0.5 of 0.935, and YOLOv11 achieved the fastest response with an inference time of 13.5 ms, both significantly better than Faster R-CNN, providing an efficient deep learning solution for accurate weed management in agricultural fields [58]. Another study proposed a two-stage weed detection framework based on UAV images: YOLOv4-Tiny first masks corn rows to automatically reduce the annotation workload, and an improved YOLOv4 then achieves 86.89% mAP in weed identification, which is combined with spatial distribution analysis to guide variable spraying [59]. A real-time detection and localization method for potted flowers based on a ZED 2 stereo camera and the YOLOv4-Tiny deep learning algorithm, which combines a convolutional neural network with 3D point cloud information, achieves an average detection accuracy (mAP) of 89.72% and 80% recall, a real-time detection frame rate of 16 FPS on a Jetson TX2, and an average absolute error of 18.1 mm in positioning the flower center, providing a reference for the mechanization and automation of trellised potted-flower management [60]. A comparison of the performance of typical target detection models in the weed detection task is shown in Table 3.
Research has demonstrated that adopting the lightweight YOLOv4-Tiny model, combining data augmentation techniques (including mirroring, scaling, rotation, and affine transformations) with COCO pre-trained weights (yolov4-tiny.conv.29) for transfer learning, can significantly improve the model’s performance in detecting potted flowers, and various further upgrades to YOLO (Figure 3) provide additional gains [61]. A comprehensive analysis of YOLOv5, v8, v10, v11, and v12 models for rice field weed recognition, trained on 3000 rice weed images, concluded that YOLOv12 is the most effective, achieving 87.3% recognition accuracy. The recognition results are shown in Table 4.
In summary, two-stage models in object detection, such as Faster R-CNN, achieve high-precision detection through effective feature-sharing methods. Conversely, single-stage YOLO series models strike a balance between speed and accuracy with structural optimization. In environments such as peanut fields, rice paddies, and tea gardens, models like YOLOv4-Tiny, YOLOv7, and YOLOv9, which have undergone enhancements and transfer learning, have achieved higher mAP values in weed detection. Of these evaluated models, YOLOv9 had the highest accuracy, while YOLOv11 demonstrated the fastest response time, significantly outperforming traditional two-stage models. The detection framework combining data augmentation, lightweight models, and multi-modal information provides efficient solutions for accurate weeding and variable spraying in agriculture, effectively highlighting the practical utility of target detection models in agricultural intelligence.
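As a concrete illustration of the single-stage detection workflow summarized above, the sketch below runs inference with the ultralytics YOLOv8 package. The checkpoint name, image path, and confidence/IoU thresholds are placeholders; a deployed weed detector would first be fine-tuned on an annotated crop–weed dataset.

```python
from ultralytics import YOLO

# Load a small pretrained detector; in practice it would be fine-tuned
# on an annotated weed dataset (e.g., crop vs. weed bounding boxes).
model = YOLO("yolov8n.pt")  # illustrative checkpoint, not a weed-specific model

# Run inference on a field image; conf/iou thresholds are illustrative values.
results = model.predict(source="field_image.jpg", conf=0.25, iou=0.5)

for result in results:
    for box in result.boxes:
        cls_id = int(box.cls[0])                # predicted class index
        conf = float(box.conf[0])               # detection confidence
        x1, y1, x2, y2 = box.xyxy[0].tolist()   # pixel coordinates of the box
        print(f"class={result.names[cls_id]} conf={conf:.2f} "
              f"box=({x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f})")
```

In a spot-spraying or laser-weeding pipeline, these box coordinates would then be mapped into the actuator’s coordinate frame to trigger targeted treatment.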

3.1.3. Semantic Segmentation Models

Semantic segmentation, a key method for pixel-level separation of weeds and crops, is demonstrated by traditional designs like U-Net [62] and DeepLabv3+ [63]. U-Net effectively preserves the spatial details of the image due to its symmetric encoder–decoder structure and skip connections, achieving an intersection over union (IoU) of 88.59% in weed segmentation within sugar beet fields [11]. The network structure improves the accuracy of edge segmentation by combining multi-scale features.
The improved R-FCN (Region-based Fully Convolutional Network) reaches over 89% accuracy in segmenting sugar beet seedlings and weeds under low-light and soil background conditions by adding a cross-scale feature fusion module [64]. The Swin-DeepLab model replaces the conventional backbone with the Swin Transformer, combines multi-scale feature fusion with the CBAM attention mechanism, and achieves a mean intersection over union (mIoU) of 91.53% in weed detection within soybean fields [65]. The enhanced GT-DeepLabv3+ version of DeepLabv3+, which features a lightweight MobileNet v2 backbone and includes a global attention mechanism (GAM) along with optimized atrous spatial pyramid pooling (GS-ASPP), obtained an mIoU of 64.91% and a mean pixel accuracy (mPA) of 79.77% in rice fields [66]. Table 5 presents a comparison of the performance of these models.
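Since mIoU is the headline metric for all of these segmentation models, the following NumPy sketch shows how per-class IoU and its mean are computed from predicted and ground-truth label maps. The three-class scheme (background, crop, weed) is an illustrative assumption.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int = 3) -> float:
    """Compute mIoU between predicted and ground-truth label maps.

    Classes are assumed to be 0 = background, 1 = crop, 2 = weed (illustrative).
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        intersection = np.logical_and(pred_c, target_c).sum()
        union = np.logical_or(pred_c, target_c).sum()
        if union > 0:
            ious.append(intersection / union)
    return float(np.mean(ious))

# Toy example on 4x4 label maps.
pred = np.array([[0, 0, 1, 1], [0, 2, 1, 1], [2, 2, 0, 0], [2, 2, 0, 0]])
target = np.array([[0, 0, 1, 1], [0, 2, 2, 1], [2, 2, 0, 0], [2, 0, 0, 0]])
print(f"mIoU = {mean_iou(pred, target):.3f}")
```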
In the field of traditional machine vision, a sub-area strategy for weed recognition in cotton fields (inter-row recognition based on positional features, intra-row recognition based on leaf morphology) achieved recognition rates of 89.4% for inter-row weeds and 84.6% for intra-row weeds [67], and a staged approach based on color features and Otsu segmentation, exploiting the dark red stalk features (Stage II) and whole-plant color thresholds (B–R standard deviation < 5, Stage III), achieved a recognition rate of up to 92.9% for quackgrass [68].
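The color-feature-plus-Otsu pipeline mentioned above can be illustrated with a generic excess-green (ExG) index followed by Otsu thresholding, a classic baseline for separating vegetation from soil. The snippet is a sketch of that general approach, not the cited method’s exact features or thresholds.

```python
import cv2
import numpy as np

def vegetation_mask(bgr: np.ndarray) -> np.ndarray:
    """Segment green vegetation from soil using the excess-green index + Otsu.

    A generic baseline: ExG = 2g - r - b on chromatic coordinates, followed by
    a global Otsu threshold that separates vegetation pixels from background.
    """
    img = bgr.astype(np.float32)
    b, g, r = cv2.split(img)
    total = b + g + r + 1e-6                      # avoid division by zero
    exg = 2.0 * (g / total) - (r / total) - (b / total)
    exg_u8 = cv2.normalize(exg, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, mask = cv2.threshold(exg_u8, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask

# Example usage on a synthetic image (replace with a real field photo).
demo = np.zeros((100, 100, 3), dtype=np.uint8)
demo[:, 50:] = (40, 180, 60)   # "vegetation" half (BGR)
demo[:, :50] = (60, 90, 120)   # "soil" half
print(vegetation_mask(demo).mean())  # roughly half the pixels should be 255
```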
In summary, semantic segmentation achieves high mIoU and accuracy in contexts like sugar beet and soybean fields using traditional models such as U-Net and DeepLabv3+, as well as advanced models like R-FCN and Swin-DeepLab, which have been improved through cross-scale fusion and attention mechanisms. The sub-region strategy and phased color technique of traditional machine vision have reached high recognition rates in specific contexts. These technologies offer various options for accurate weed identification and improved agricultural management, highlighting the potential applications of different visual techniques.
This section explores the use of deep learning infrastructure for weed identification. In convolutional neural networks (CNNs), models like Inception V2 and Xception achieve accuracy over 93% across various crop fields, while a ResNet-50 backbone combined with SegNet reaches a semantic segmentation mean intersection over union (mIoU) of 0.8288. In object detection, the single-stage YOLO series shows exceptional performance; notably, YOLOv9 achieves a mean average precision (mAP@0.5) of 0.935, and YOLOv11 inference takes only 13.5 ms, greatly surpassing the two-stage Faster R-CNN approach. For semantic segmentation, Swin-DeepLab attains an mIoU of 91.53% in soybean fields. Simultaneously, traditional machine vision region-based techniques remain competitive in recognition rates. Overall, optimizing architectures and scene-specific adaptation are key to improving the accuracy and efficiency of weed recognition systems.

3.2. Machine Vision Sensing Technologies

3.2.1. Combination of RGB Imaging and Deep Learning

The RGB camera has become the primary sensing device in weed detection due to its cost-effectiveness and versatile deployment. RGB images of pepper fields captured by uncrewed aerial vehicles (UAVs) and combined with machine learning algorithms, such as random forest (RF) and support vector machine (SVM), enable weed detection with 96% accuracy using RF and 94% using SVM [69]. The YOLOv5 target detection algorithm, trained on the CottonWeedDet12 dataset generated from RGB cameras, achieved a mean average precision (mAP) of 0.82 in cotton fields, confirming that RGB images are effective for weed identification [70].
The sensitivity of RGB images to lighting conditions limits their reliability. Jin showed that the classification accuracy of the RGB model under natural lighting drops by 10–15% compared to controlled lighting environments [71]. To address this, a three-wheeled, asymmetrically designed Drop-on-Demand (DoD) weeding robot improves weed control in carrot rows by using combined RGB/HSV color space segmentation and the joint extraction of shape-texture features, along with precise droplet control. This technology dramatically reduces herbicide use by over 90% compared to traditional methods, but it struggles with small weed seedlings, and the system has high costs and maintenance needs [39].
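The RGB/HSV color-space segmentation used by such systems can be sketched as a simple hue–saturation–value threshold. The bounds below are illustrative starting points; in practice they must be tuned per camera and lighting condition, which is precisely the sensitivity discussed above.

```python
import cv2
import numpy as np

def hsv_green_mask(bgr: np.ndarray) -> np.ndarray:
    """Mask green vegetation by thresholding hue/saturation/value ranges.

    The bounds are assumed starting values, not those of any cited system.
    """
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    lower = np.array([35, 60, 40], dtype=np.uint8)    # assumed lower H, S, V bounds
    upper = np.array([85, 255, 255], dtype=np.uint8)  # assumed upper bounds
    mask = cv2.inRange(hsv, lower, upper)
    # Morphological opening removes small speckle before blob/shape analysis.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

demo = np.full((80, 80, 3), (50, 170, 60), dtype=np.uint8)  # greenish BGR patch
print(hsv_green_mask(demo).mean())  # most pixels should be 255
```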
A deep learning approach utilizing multimodal fusion of RGB-D images enhances weed detection in complex, occluded environments by converting depth images into PHA (phase–height–angle) format and employing a three-channel feature fusion network. It achieves 36.1% mAP for grassy weeds and 42.9% mAP for broadleaf weeds, with an overall IoG of 89.3% in a wheat field setting, significantly boosting weed detection accuracy under challenging occlusion [40,72]. Table 6 compares the performance of various solutions.
RGB cameras have become the dominant technology for weed detection because they are affordable and easy to use. When combined with algorithms, they can achieve higher detection accuracy in many environments; however, this accuracy can degrade depending on lighting conditions. Despite a significant reduction in chemical dosage through RGB/HSV fusion and feature extraction techniques, detecting tiny weeds remains difficult. RGB-D multimodal fusion greatly improves detection performance in complex occlusion scenarios by optimizing depth information, providing multiple approaches for weed detection.

3.2.2. Multi-Spectral Imaging Technology

Multi-spectral imaging enables weed detection by capturing spectral data across the visible to near-infrared spectrum (400–1000 nm), leveraging the differences in spectral reflectance among plant species. At the technical application level, a detection model built with an unsupervised classification algorithm achieves 89% accuracy in identifying weeds within a cornfield by combining spatial features (such as crop row structure) with four-band multispectral imagery and pixel-level classification using a support vector machine (SVM) [30,32,73], thus confirming the effectiveness of combining spectral and spatial data analysis.
Using multispectral temporal data expands the technical possibilities. The multispectral multi-temporal UAV dataset WeedsGalore utilizes DeepLabv3+ and MaskFormer models, combined with a probabilistic approach, to reach an 82.90% multispectral mean intersection over union (mIoU) in the weed segmentation challenge in cornfields [74]. A low-cost vegetation sensor, powered by blue LED-excited fluorescence and sinusoidal signal modulation to reduce ambient light interference, achieves 100% accuracy when distinguishing vegetation from non-vegetation in both indoor and outdoor tests. This provides a lightweight sensing option for precision spraying systems [75].
Fusing data from multiple sources significantly improves monitoring accuracy. A UAV captured both digital RGB and multispectral images, which were combined using the minimum redundancy–maximum correlation (mRMR) feature selection algorithm and Gaussian process regression (GPR) to estimate leaf nitrogen content (LNC) at different rice growth stages. The results showed that combining band reflectance, vegetation indices (VIs), and grey scale covariance matrices (GLCMs) improved estimation accuracy by 15–20% [76]. Table 7 offers a comparative analysis of typical multispectral technology systems.
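Much of this multispectral work builds on band-ratio vegetation indices; the sketch below computes the normalized difference vegetation index (NDVI) from co-registered near-infrared and red reflectance bands. The band values and the 0.4 vegetation threshold are illustrative.

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index from NIR and red reflectance bands."""
    nir = nir.astype(np.float32)
    red = red.astype(np.float32)
    return (nir - red) / (nir + red + 1e-6)

# Toy 2x2 reflectance maps: vegetation reflects strongly in NIR, soil does not.
nir_band = np.array([[0.60, 0.55], [0.20, 0.18]])
red_band = np.array([[0.08, 0.10], [0.15, 0.14]])
index = ndvi(nir_band, red_band)
vegetation = index > 0.4   # illustrative threshold separating plants from soil
print(np.round(index, 2))
print(vegetation)
```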
In summary, multispectral imaging enables weed detection by analyzing spectral reflection differences and combining spatial data with SVM, achieving an accuracy of 89% in corn fields. The use of time series data and models has increased the mIoU for corn field segmentation to 82.90%, while affordable sensors help accurately distinguish vegetation.

3.2.3. Hyperspectral Imaging Technology

Hyperspectral imaging, characterized by ultrahigh spectral resolution across hundreds of wavelength bands, can detect subtle spectral differences between plants, offering a unique technological approach for precise weed identification and monitoring of crop physiological parameters. In weed classification, 61-band hyperspectral images integrated with CNN models can significantly outperform RGB data, especially for species with notable spectral differences [77]. A comparative experiment classifying four weed species in ryegrass/clover pastures using hyperspectral imaging with PLS-DA, SVM, and MLP models showed that the MLP model, leveraging superpixel spectral information, achieved a classification accuracy of 89.1% [78]. The proposed hyperspectral image transform (HIT) method, combined with CNN transfer learning and using a pre-trained LCCNet alongside PROSAIL model simulation data, was fine-tuned with field data to estimate leaf chlorophyll content (LCC) in soybean leaves, reaching an R2 of 0.78, a 12% improvement over the traditional spectral index + PLSR method [79]. A high-dimensional spectral dimensionality reduction method using self-organizing maps (SOM) converts spectral data into a low-dimensional space and combines this with a radial basis function (RBF) classifier, achieving an average classification accuracy of 88.5% for crops and vegetation with a dimensionality reduction rate of 64%, outperforming principal component analysis (PCA) and wavelet decomposition techniques [80]. Hyperspectral technology also shows clear advantages in agricultural quality inspection: a deep learning model combining hyperspectral imaging (HSI) with a stacked weighted autoencoder (SWAE) extracted pixel-level spectral features of Fuji apples and, together with grey wolf-optimized support vector regression (GWO-SVR), predicted soluble solids content (SSC) with Rp2 = 0.9436 and RMSEP = 0.1328 °Brix, surpassing conventional SPA and CARS feature selection methods [81,82]. Five spectral principal components and six texture eigenvalues of nectarines were extracted by hyperspectral imaging in the 420–1000 nm band, and external defect detection was realized through PLS, LS-SVM, and ELM modeling, among which the LS-SVM model performed best, providing an effective method for nondestructive inspection of nectarine quality [83]. Additionally, surface-enhanced Raman spectroscopy (SERS) with gold nanorod substrates enabled label-free detection of trace zearalenone (ZEN) in maize oil using 1D CNN and 2D CNN models [84]. Table 8 presents performance comparisons of standard hyperspectral technology schemes.
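A common baseline for hyperspectral pixel classification, related to the dimensionality-reduction pipelines discussed above, is to compress the bands and then classify the reduced spectra. The sketch below uses PCA with an SVM in place of the cited SOM/RBF and deep learning methods, and runs on synthetic spectra standing in for calibrated reflectance data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for labeled hyperspectral pixels: 200 spectra x 61 bands,
# two classes (0 = crop, 1 = weed). Real work would use calibrated reflectance.
rng = np.random.default_rng(0)
crop = rng.normal(0.40, 0.05, size=(100, 61))
weed = rng.normal(0.48, 0.05, size=(100, 61))
X = np.vstack([crop, weed])
y = np.array([0] * 100 + [1] * 100)

# PCA compresses the correlated bands into a few components, mirroring the
# dimensionality-reduction step above; an RBF-kernel SVM then classifies pixels.
clf = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```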
Hyperspectral imaging detects subtle variations in plants with exceptional spectral resolution. When combined with models like CNN, it exceeds the accuracy of RGB in weed classification. Using dimensionality reduction and classifiers, it achieved an accuracy of 88.5% in crop vegetation classification. The evaluation of crop physiological parameters and agricultural product quality showed significantly higher accuracy compared to previous methods, highlighting its technical advantages.

3.2.4. Application of Depth and Stereo Vision

Depth and stereo vision technology gather three-dimensional scene data using LiDAR, TOF cameras, and other depth sensors, effectively overcoming the occlusion issues of weeds and crops that are present in two-dimensional vision. This provides crucial technical support for accurate identification in complex agricultural settings. The stereo vision system in the rice field calculates depth via binocular parallax, achieving a weed classification accuracy of 96.95% under controlled lighting, which exceeds that of monocular vision by over 10% [81]. A depth feature analysis method that combines local binary pattern (LBP) and SVM effectively addresses misclassification issues caused by shape similarities between oilseed rape and radish leaves by utilizing 3D spatial information; however, recognition remains limited in complex soil backgrounds [30,85]. The weed management system for a cornfield, developed using the IPSO (Improved Particle Swarm Optimization) image segmentation algorithm, reduces segmentation error to 0.1%, with a response time of 1.562 s, and efficiently adjusts spraying volume based on weed density [86]. In industrial applications, Festo’s 3D vision-guided weeding robotic arm in Germany uses the PointNet++ model to analyze laser point cloud data, attaining a weed location accuracy of 92.1% in tomato fields [87].
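The core geometric step behind such stereo systems is the conversion of disparity to metric depth via Z = fB/d. The sketch below implements that relation; the focal length and baseline are illustrative calibration values, not parameters of any cited platform.

```python
import numpy as np

def disparity_to_depth(disparity_px: np.ndarray,
                       focal_length_px: float,
                       baseline_m: float) -> np.ndarray:
    """Convert a disparity map (pixels) to metric depth using Z = f * B / d.

    focal_length_px and baseline_m come from stereo calibration; the values
    used below are illustrative only.
    """
    disparity = np.asarray(disparity_px, dtype=np.float32)
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Example: 700 px focal length and 0.12 m baseline; closer plants give larger disparity.
disp = np.array([[40.0, 35.0], [8.0, 0.0]])   # 0 = no stereo match
print(disparity_to_depth(disp, focal_length_px=700.0, baseline_m=0.12))
```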
Three-dimensional sensors overcome the occlusion limitations of two-dimensional vision: the paddy field stereo system achieves a classification accuracy more than 10% higher than a monocular system, the 3D-feature-based method improves classification of leaves with similar shapes, and the cornfield system enables precise spray adjustment, together supporting accurate identification in complex agricultural environments.
This section examines how machine vision sensing technologies are used for weed identification. RGB imaging has become the preferred choice because it is cost-effective. While combining random forest (RF), support vector machine (SVM), and YOLOv5 with RGB achieves accuracy over 94% in crop fields, its performance can be greatly affected by changing lighting conditions. RGB-Depth (RGB-D) fusion improves results in more complex situations. Multispectral imaging reaches an accuracy of 89% through spectral–spatial fusion, and affordable multispectral sensors allow nearly complete (100%) differentiation of vegetation. Data fusion methods can increase monitoring accuracy by up to 20%. Hyperspectral imaging detects subtle spectral details, achieving over 89% accuracy in weed classification and enabling precise measurement of plant physiological traits. Depth vision helps solve occlusion problems, while stereo vision attains 96.95% identification accuracy in rice fields. Additionally, 3D-guided robotic arm positioning shows 92.1% accuracy. Each technology has its own strengths, and combining different modalities strategically is key to improving overall system reliability.

3.3. Multi-Technology Convergence Framework

3.3.1. Sensor Fusion Strategy

Multi-sensor fusion is a key technique for improving adaptation in complex agricultural environments. It enables the combination of multi-source information through various fusion methods, including data-level, feature-level, and decision-level fusion. Data-level fusion involves analyzing raw sensor data directly, while feature-level fusion creates cross-modal feature representations. Decision-level fusion combines weighted results based on separate decisions.
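The difference between feature-level and decision-level fusion can be shown in a few lines: the former concatenates per-sample feature vectors from two modalities before a single classifier, while the latter combines the class probabilities of separately trained models. The feature dimensions and fusion weights below are assumptions.

```python
import numpy as np

def feature_level_fusion(cnn_features: np.ndarray, lbp_features: np.ndarray) -> np.ndarray:
    """Feature-level fusion: concatenate per-sample feature vectors from two modalities."""
    return np.concatenate([cnn_features, lbp_features], axis=1)

def decision_level_fusion(prob_rgb: np.ndarray, prob_depth: np.ndarray,
                          w_rgb: float = 0.6, w_depth: float = 0.4) -> np.ndarray:
    """Decision-level fusion: weighted average of per-class probabilities from two models."""
    fused = w_rgb * prob_rgb + w_depth * prob_depth
    return fused / fused.sum(axis=1, keepdims=True)

# Two samples: 128-D CNN features + 59-D LBP histograms -> 187-D fused vectors.
fused_features = feature_level_fusion(np.random.rand(2, 128), np.random.rand(2, 59))
print(fused_features.shape)  # (2, 187)

# Two classifiers' class probabilities (crop, weed) combined with illustrative weights.
print(decision_level_fusion(np.array([[0.7, 0.3]]), np.array([[0.4, 0.6]])))
```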
The hybrid U-Net model improves the segmentation accuracy of small weeds by integrating RGB images and depth information at the data level [88]. The FCNN-SPLBP framework enhances weed classification accuracy by combining high-level semantic features from a CNN with local texture features from superpixel LBP at the feature level [89]. Similarly, a feature-level fusion approach that combines rotation-invariant LBP features with a grey-level gradient co-occurrence matrix can accurately distinguish maize seedlings from weeds using an SVM classifier [90].
At the system integration level, the innovative weed control system for tomato fields achieves a crop–weed classification accuracy of 95.43% by combining color-marked sensors and camera data during decision-making. It uses a dual-precision spraying system to reach a weed treatment rate of 99.96% [43]. Cross-disciplinary research shows that a multi-sensor approach, which incorporates hyperspectral imaging spectral data, image features, and olfactory visualization of odor attributes, along with principal component analysis for dimensionality reduction and SVM classification, improves the precision of green tea quality evaluation from 75–78% to 92% with a single sensor [91]. Additionally, multi-sensor fusion methods, including electronic nose and electronic tongue, when combined with PCA-SVR chemometric modeling, have shown better performance than single-sensor techniques in food quality assessment [92].
Multi-sensor fusion combines information from various sources using data-level, feature-level, and decision-making level methods to improve adaptation in complex agricultural situations. The integration of RGB and depth data improves the segmentation of small weeds. The combination of semantic and texture features at the feature level increases classification accuracy. The decision-level fusion of tomato field sensors and image data achieved a classification accuracy of 95.43%. Along with double-precision spraying, it reached a processing rate of 99.96%. Additionally, interdisciplinary integration improved the evaluation accuracy to 92%, highlighting the technical benefits.

3.3.2. Cross-Modal Feature Learning

Cross-modal feature learning is crucial for overcoming the limitations of single-modal perception by simultaneously optimizing the feature space for different types of data, thereby leading to the complementary enhancement of heterogeneous information, including spectra, images, and point clouds. In transfer learning, using greenhouse environmental data for model pre-training and then fine-tuning it with a small amount of field data can significantly lower annotation costs [93].
The MCARN model skillfully addresses issues of modal heterogeneity and data imbalance in human activity recognition by using a novel cross-modal federated learning architecture [94]. Although this research mainly focuses on non-agricultural fields, its feature alignment approach provides a methodological framework for multimodal data fusion in agricultural settings. A multi-task YOLO model that integrates the C2f module with the Anchor-Free mechanism effectively performs simultaneous weed detection and navigation path extraction in pineapple fields, achieving a mean intersection over union (mIoU) of 77.8% and a positioning error of 5.472 cm [33].
The hyperspectral cross-modal fusion technique offers distinct advantages: spectral data are preprocessed using standard normal variate (SNV) and first-order derivative (FD) methods and combined with feature selection strategies such as uninformative variable elimination (UVE) and competitive adaptive reweighted sampling (CARS); two-band (VI2) and three-band (VI3) vegetation indices are then constructed, and intelligent algorithms, including dung beetle optimization (DBO) and subtraction-average-based optimization (SABO), are finally used to fine-tune the parameters of the Extreme Learning Machine (ELM) [95]. A hyperspectral deep learning framework employing the wavelet transform (WT) and the Stacked Convolutional Autoencoder (SCAE) predicts cadmium and lead concentrations in lettuce for heavy metal detection, achieving Rp2 values of 0.9319 and 0.9418 with a Root Mean Square Error of Prediction (RMSEP) of 0.04123 mg/kg [96], demonstrating the effectiveness of cross-modal fusion in high-precision detection.
Cross-modal feature learning improves heterogeneous information, such as spectra and images, by expanding the feature space across multiple data sources. Transfer learning reduces annotation costs, and the MCARN model architecture provides a methodological framework for multi-modal integration in agriculture. The multi-task YOLO model can detect weeds and extract paths. Hyperspectral cross-modal fusion, after preprocessing, feature selection, and intelligent algorithm optimization within a deep learning framework, achieves high-precision detection of heavy metals in lettuce, highlighting its importance for accurate perception in complex environments.
This section explores how multi-technology fusion frameworks are applied in agriculture. Sensor fusion improves information complementarity through data-level, feature-level, and decision-level strategies: Hybrid U-Net fusion combining RGB and depth data boosts small weed segmentation accuracy; feature-level fusion enhances classification performance; and decision-level fusion achieves a weed removal rate of 99.96% in tomato field trials. Cross-domain applications significantly raise accuracy from 75–78% to 92%. Cross-modal feature learning overcomes limitations of single-modal methods through joint optimization. Greenhouse pre-training lowers labeling costs, while a multi-task YOLO architecture detects weeds and guides navigation simultaneously. Hyperspectral fusion technology also shows high accuracy in heavy metal detection, confirming the vital role of multi-technology fusion in solving complex agricultural problems.

3.4. Model Architecture Innovation

3.4.1. Lightweight Model Design

To meet the real-time and low-power requirements of edge computing devices, the development of lightweight models has become a key focus in agricultural weed detection research. The customized 5-layer CNN model achieves a detection accuracy of 97.7% on a GPU by optimizing the convolutional kernel size and the number of feature channels. It maintains an accuracy of 95.12% after deployment to Raspberry Pi, with an inference latency of just 16.754 ms and a minimal memory footprint of 0.012 GB [97].
Model compression significantly improves deployment efficiency: the EM-YOLOv4-Tiny model’s file size is reduced by 40% through mixed-precision quantization while maintaining over 90% detection accuracy [55]. The MobileNetV3 architecture employs depthwise separable convolutions and an attention mechanism; a YOLOv8s model built on this backbone for weed detection in cotton fields retains only 38% of the original YOLOv8s parameter count, with a corresponding mAP decrease of just 2.1% [52,93,98].
In the field of specialized model development, YOLO-WDNet achieves a mean average precision (mAP) of 97.8% at a threshold of 0.5 on a custom-developed cotton weed dataset (CWDD) by improving the feature fusion pathway and loss function, marking a 9.1% improvement over the original YOLOv5s, with an average accuracy of 0.947 in field tests [99]. The YOLOv8n model, which incorporates the CBAM attention module and the C3Ghost lightweight component, achieves a mean average precision (mAP) of 97.6% at 0.5 on 12 cotton field weed detection tasks, and delivers real-time detection at 13.84 frames per second when executed on the Jetson Nano [35]. The YOLOv8-ECFS utilizes the EfficientNet backbone network, the Focal_SIoU loss function, and the coordinate attention (CA) module, achieving a 95.0% mAP in multi-weed detection within soybean fields while reducing computational load (GFLOPs) by 11.1G, thereby effectively balancing accuracy and lightweight requirements [100]. Table 9 provides a comparison of performance among standard lightweight models.
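The parameter savings behind MobileNet-style backbones come from depthwise separable convolution, sketched below in PyTorch with a direct parameter-count comparison against a standard convolution. The channel sizes are illustrative.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 conv per channel followed by a 1x1 pointwise conv."""
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   padding=1, groups=in_channels, bias=False)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

def param_count(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())

standard = nn.Conv2d(64, 128, kernel_size=3, padding=1, bias=False)
separable = DepthwiseSeparableConv(64, 128)
print("standard conv params: ", param_count(standard))   # 64*128*3*3 = 73,728
print("separable conv params:", param_count(separable))  # 64*3*3 + 64*128 = 8,768
print("output shape:", separable(torch.randn(1, 64, 32, 32)).shape)
```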
Lightweight versions have emerged as a focal point of study to satisfy the real-time, low-power consumption demands of edge devices. Tailored CNNs provide elevated accuracy and minimal latency post-deployment. Model compression employs quantization to decrease file size while preserving accuracy, and topologies such as MobileNetV3 minimize parameters. Following the optimization of the specialized model, a high mean average precision (mAP) was attained in cotton and soybean fields, effectively balancing accuracy and lightweight design.

3.4.2. Application of Transformer Architecture

The Transformer architecture has increasingly become a focus of weed recognition research due to its ability to represent long-range dependencies and capture global context. In the semantic segmentation task, the Swin Transformer replaces the traditional backbone network, and the detection model combining multi-scale feature fusion with the CBAM attention mechanism achieves a 91.53% mIoU in segmenting weeds in soybean fields, effectively tackling boundary blurring when crops and weeds are densely packed [65]. Its architectural advantage lies in capturing global semantic relationships through the shifted-window self-attention mechanism while integrating local feature interactions to ensure precise segmentation.
The Vision Transformer (ViT) demonstrates superior classification performance compared to traditional CNNs after extensive pre-training, achieved by dividing images into successive patches and applying the standard Transformer architecture. A study showed that ViT reduced the number of parameters by 30% in a weed image detection task while improving classification accuracy by 5.2% [101]. The RT-DETR-l model, based on the Transformer design, employs a hybrid encoder–decoder query approach and eliminates non-maximum suppression (NMS), resulting in a mean average precision (mAP) at 0.5 of 0.912 and an increase in inference speed to 42.3 FPS in cornfield weed detection [102]. A performance comparison of different models is provided in Table 10.
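A minimal ViT-style classifier clarifies the patch-embedding idea: the image is split into fixed-size patches by a strided convolution, position embeddings are added, and the token sequence passes through a standard Transformer encoder. All sizes below are illustrative, and mean pooling is used in place of a class token for brevity.

```python
import torch
import torch.nn as nn

class TinyViTClassifier(nn.Module):
    """Minimal ViT-style classifier: patch embedding + Transformer encoder + linear head."""
    def __init__(self, image_size=96, patch_size=16, embed_dim=64,
                 depth=2, num_heads=4, num_classes=2):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2
        # A strided conv is the standard trick for splitting and embedding patches.
        self.patch_embed = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads,
                                                   dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, D)
        tokens = self.encoder(tokens + self.pos_embed)
        return self.head(tokens.mean(dim=1))  # mean-pool tokens instead of a CLS token

# Two-class (crop vs. weed) toy forward pass on a 96x96 RGB patch.
model = TinyViTClassifier()
print(model(torch.randn(1, 3, 96, 96)).shape)  # torch.Size([1, 2])
```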
A sugarcane field spot spraying system that combines MobileNetV2 and a Transformer achieves a 35% reduction in herbicide use and 97% weed control efficiency through a chunk-based classification technique [40]. A hybrid design incorporating the Swin Transformer module within a ResNet18 backbone improves global feature characterization in a fruit grading task by 12.7% [103], confirming the effectiveness of the Transformer architecture in enhancing long-range feature association.
The Transformer emphasizes the acquisition of long-range dependencies and global contexts in weed recognition studies. The Swin Transformer has elevated the mean intersection over union (mIoU) for soybean field segmentation to 91.53%. ViT decreases parameters by 30% and enhances accuracy by 5.2%, whilst RT-DETR-l attains elevated mAP and speed metrics. The sugarcane field system integrated with MobileNetV2 decreases medicine use by 35%, while the hybrid design improves feature representation, underscoring its technological benefits.

3.4.3. Hybrid Model Architecture

The hybrid model introduces a new architectural approach that combines accuracy and generalization by merging the local feature extraction strengths of CNNs with the global context modeling benefits of Transformers. The CoAtNet model integrates convolutional neural networks and Transformer architectures, unifying depthwise convolution with self-attention through a relative attention mechanism. Using a multi-stage vertical configuration (e.g., a C-C-T-T structure) that pairs MBConv modules with Transformer blocks, it achieves 90.88% Top-1 accuracy on the ImageNet classification task [104], with the convolutional stages (C) providing local feature extraction and the Transformer stages (T) modeling long-range dependencies.
In crop–weed semantic segmentation, a hybrid CNN-Transformer network with an encoder–decoder structure, including a concatenation-extended downsampling block (CEDB), a parallel-input Transformer Semantic Enhancement Module (PITSEM), a global-local semantic fusion block (G-L SFB), and a fusion enhancement block (FEB), achieves over 86% recognition accuracy on embedded devices while reducing model parameters by 40% [105]. The model maintains spatial details with the CEDB and improves global semantic connections through the PITSEM.
A cross-domain application using a hybrid architecture of Long Short-Term Memory Network (LSTM) and convolutional neural network (CNN) achieved a detection limit of 0.012 mg/kg by extracting spectral features through CNN and capturing temporal dependencies with LSTM, improving detection sensitivity by 18% compared to a standalone CNN model in the Raman spectral detection of chlorpyrifos residues in corn oil [106].
The hybrid model amalgamates the benefits of CNN’s localized feature extraction with Transformer’s global modeling, resulting in elevated classification accuracy for CoAtNet. The hybrid network for crop and weed segmentation achieves an accuracy exceeding 86% and a 40% decrease in parameters on embedded devices. The cross-domain utilization of LSTM and CNN improves detection sensitivity and underscores the benefits of the integrated architecture.
This section explores how new model architectures improve agricultural weed detection. Lightweight models achieve a good balance for edge device deployment by using customized convolutional structures, mixed-precision quantization, and attention mechanism improvements. For example, YOLOv8n enhanced with the Convolutional Block Attention Module (CBAM) reaches a mean average precision (mAP) of 97.6% in cotton fields while providing real-time inference. The Transformer architecture exploits its natural strength in modeling long-range dependencies: the Swin Transformer increases the mean intersection over union (mIoU) for soybean field segmentation to 91.53%, while Vision Transformers (ViT) cut parameter counts by 30% and improve classification accuracy by 5.2%. Hybrid models combine the local feature extraction strengths of convolutional neural networks (CNNs) with the global modeling abilities of Transformers. CoAtNet achieves high classification accuracy, and crop–weed segmentation networks using this hybrid approach reach recognition rates over 86% on embedded devices while reducing parameters by 40%, highlighting the value of architectural innovation.

3.5. Model Optimization Techniques

3.5.1. Data Enhancement and Expansion

Data augmentation, a key approach for tackling sample scarcity in agricultural datasets, has evolved from traditional geometric modifications to advanced generative models. Traditional techniques increase data by applying geometric and photometric adjustments, such as rotation, scaling, and adding noise, to reduce model overfitting [107]. The Random Image Cropping and Patching (RICAP) method improves the mean intersection over union (mIoU) for semantic segmentation tasks by 7.3% through image block cropping and splicing techniques [108].
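As a concrete illustration of the conventional geometric and photometric augmentations mentioned above, the following torchvision pipeline sketch applies rotation, scaled cropping, flipping, color jitter, and additive noise; the specific ranges and noise level are placeholder values rather than settings from the cited studies.

```python
import torch
from torchvision import transforms

# Illustrative geometric and photometric augmentations for weed images;
# rotation range, crop size, jitter strength, and noise level are placeholders.
train_augment = transforms.Compose([
    transforms.RandomRotation(degrees=30),                    # geometric: rotation
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),      # geometric: scaling / cropping
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.3, contrast=0.3),     # photometric variation
    transforms.ToTensor(),
    # additive Gaussian noise as a simple photometric perturbation
    transforms.Lambda(lambda x: torch.clamp(x + 0.02 * torch.randn_like(x), 0.0, 1.0)),
])
```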
Generative AI methodologies provide innovative ways for data augmentation: the classifier-guided diffusion probabilistic model (ADM-G) creates synthetic weed images using a 2D U-Net architecture with bidirectional noise processing, improving weed recognition accuracy by 12.5% on a limited cotton-field sample [109]; in addition, the plant-alignment data augmentation technique (PAIAM) generates training data by segmenting and rearranging background, crop, and weed elements, and, combined with a U-Net + ResNet-50 model, enhances crop–weed segmentation accuracy by 4.23% in rice fields and 3.14% in sugar beet fields [110].
A laser weeding system utilizing DIN-LW-YOLO in strawberry fields achieved 88.5% accuracy in weed and crop detection and a 92.6% weeding rate by integrating multi-scale feature fusion, deformable convolution optimization, and a data augmentation strategy [42], highlighting the vital role of data augmentation in improving model generalization.
Data augmentation has progressed from conventional geometric and photometric modifications to generative models, with RICAP improving semantic segmentation mIoU by 7.3%. In generative AI, ADM-G raises the recognition accuracy of cotton fields by 12.5%, while PAIAM improves the segmentation accuracy of rice and sugar beet fields. The laser weeding technology, along with data augmentation, improves detection and weeding efficiency, underscoring its vital importance in enhancing model generalization.

3.5.2. Loss Function Optimization

To address issues such as category imbalance (the crop-to-weed sample ratio is often 1:10) and small-target identification (weed pixels account for less than 5% of the image at the seedling stage) in agricultural settings, optimizing the loss function is crucial for improving model robustness. The evolution of bounding-box regression losses is significant: GIoU (Generalized IoU) resolves the vanishing-gradient problem of non-overlapping bounding boxes by adding a convex-area penalty, DIoU (Distance IoU) adds a centroid-distance metric, and CIoU (Complete IoU) further includes an aspect-ratio constraint, boosting bounding-box localization accuracy by 15–20% compared to the original intersection over union (IoU).
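The GIoU–DIoU–CIoU progression can be summarized in a short sketch. The function below implements the standard CIoU loss for axis-aligned boxes in (x1, y1, x2, y2) format; it follows the published formulation rather than any particular detector's code, and the box format is an assumption made for illustration.

```python
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    """Complete IoU loss for boxes in (x1, y1, x2, y2) format: IoU plus a
    center-distance term (DIoU) plus an aspect-ratio consistency term (CIoU)."""
    # intersection
    x1 = torch.max(pred[..., 0], target[..., 0])
    y1 = torch.max(pred[..., 1], target[..., 1])
    x2 = torch.min(pred[..., 2], target[..., 2])
    y2 = torch.min(pred[..., 3], target[..., 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    # areas and IoU
    w_p, h_p = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    w_t, h_t = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]
    union = w_p * h_p + w_t * h_t - inter + eps
    iou = inter / union

    # squared diagonal of the smallest enclosing box
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # squared distance between box centers (the DIoU term)
    rho2 = ((pred[..., 0] + pred[..., 2] - target[..., 0] - target[..., 2]) ** 2 +
            (pred[..., 1] + pred[..., 3] - target[..., 1] - target[..., 3]) ** 2) / 4

    # aspect-ratio consistency term (the CIoU addition)
    v = (4 / math.pi ** 2) * (torch.atan(w_t / (h_t + eps)) - torch.atan(w_p / (h_p + eps))) ** 2
    with torch.no_grad():
        alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```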
Within this optimization framework, the HAD-YOLO model attains a significant performance enhancement through three advances: the lightweight backbone HGNetV2 to reduce computational load, SSFF (cross-scale feature fusion) and TFE (time-domain feature enhancement) modules to strengthen the representation of small targets, and a dynamic attention head (Dyhead) to adaptively allocate semantic weights. Combined with the CIoU loss function, it achieves a mean average precision at IoU 0.5 of 96.2% for weeds such as amaranth and oxalis in wheat and cabbage fields [111].
The cross-domain, innovative Mixed Gradient Loss Function (MixGE) combines L1 regularization and Sobel operator gradient error, with weights of λ = 0.5, resulting in a 32% reduction in 3D surface reconstruction error during structured light contouring measurements. It also shows greater robustness than L2 and SSIM in sharp edge scenarios, such as crop stalks [112].
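A minimal sketch of such a mixed gradient loss is shown below, assuming single-channel prediction and target maps of shape (N, 1, H, W); the function name and implementation details are ours and only approximate the cited MixGE formulation, with λ = 0.5 as reported.

```python
import torch
import torch.nn.functional as F

# Sobel kernels for horizontal and vertical image gradients.
_SOBEL_X = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
_SOBEL_Y = _SOBEL_X.transpose(2, 3)

def mixed_gradient_loss(pred, target, lam=0.5):
    """L1 value error plus Sobel gradient error, weighted by lam (cf. MixGE).
    Assumes single-channel maps of shape (N, 1, H, W)."""
    value_term = F.l1_loss(pred, target)
    gx_p = F.conv2d(pred, _SOBEL_X.to(pred), padding=1)
    gy_p = F.conv2d(pred, _SOBEL_Y.to(pred), padding=1)
    gx_t = F.conv2d(target, _SOBEL_X.to(target), padding=1)
    gy_t = F.conv2d(target, _SOBEL_Y.to(target), padding=1)
    gradient_term = F.l1_loss(gx_p, gx_t) + F.l1_loss(gy_p, gy_t)
    return value_term + lam * gradient_term
```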

3.5.3. Model Compression and Acceleration

Model compression and acceleration techniques reduce computational complexity while maintaining accuracy, utilizing methods such as pruning, quantization, and knowledge distillation, which are essential for real-time edge device operations. The three core methods form a synergistic and optimized technological pathway.
Advanced development of pruning technology: Layer-Adaptive Magnitude-based Pruning (LAMP) autonomously optimizes the sparsity of each layer by minimizing model-level L2 distortion, achieving a Top-1 accuracy that exceeds traditional pruning methods by 3.2% on the ImageNet classification task without requiring manual parameter adjustment [113]. By dynamically computing a weight-importance matrix for each layer, LAMP achieves intelligent sparsification characterized by "minimal pruning of significant layers and extensive pruning of less critical layers", resulting in a 40% increase in model inference speed and a 50% reduction in accuracy loss compared to conventional global pruning methods.
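The following sketch reflects our simplified reading of LAMP-style scoring for unstructured pruning: each weight is scored by its squared magnitude divided by the sum of squares of the not-smaller weights in the same layer, and a single global threshold on these scores yields layer-adaptive sparsity. It is illustrative only, not the reference implementation.

```python
import torch

def lamp_scores(weight):
    """LAMP-style score per weight: w^2 divided by the sum of squares of all weights
    in the same layer whose magnitude is not smaller (simplified reading)."""
    flat = weight.detach().flatten() ** 2
    sorted_sq, order = torch.sort(flat)                          # ascending by squared magnitude
    # suffix sums: for each sorted position, sum of itself and all larger squared weights
    suffix = torch.flip(torch.cumsum(torch.flip(sorted_sq, [0]), 0), [0])
    scores = torch.empty_like(flat)
    scores[order] = sorted_sq / suffix
    return scores.view_as(weight)

def lamp_prune(model, sparsity=0.5):
    """Globally zero out the `sparsity` fraction of weights with the lowest LAMP
    scores, which implicitly assigns each layer its own sparsity level."""
    layers = [m for m in model.modules() if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d))]
    all_scores = torch.cat([lamp_scores(m.weight).flatten() for m in layers])
    k = max(1, int(sparsity * all_scores.numel()))
    threshold = all_scores.kthvalue(k).values
    for m in layers:
        mask = (lamp_scores(m.weight) > threshold).float()
        m.weight.data.mul_(mask)                                 # zero out pruned weights
```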
The NVIDIA TensorRT quantization engine converts FP32 models into the INT8 format, providing a 2.3× increase in weed identification inference speed on the Jetson AGX Orin platform with only a 1.8% decrease in mAP@0.5 [114]. A study showed that the YOLOv4-Tiny model, using mixed-precision quantization (FP16 + INT8), achieves an inference latency of 73 ms on the Raspberry Pi 4B, a 2.1× improvement over the original FP32 model [55].
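TensorRT itself is driven through its builder and calibration APIs; as a stand-in illustration of the FP32-to-INT8 idea, the sketch below applies PyTorch's built-in post-training dynamic quantization to the Linear layers of a toy classifier head. The model and tensor shapes are placeholders, and the procedure is a simplification of the cited deployment pipeline rather than a reproduction of it.

```python
import torch
import torch.nn as nn

# A stand-in FP32 classifier head; dynamic quantization converts its Linear
# layers to INT8 weights with activations quantized on the fly.
fp32_model = nn.Sequential(nn.Linear(1280, 256), nn.ReLU(), nn.Linear(256, 8)).eval()
int8_model = torch.quantization.quantize_dynamic(fp32_model, {nn.Linear}, dtype=torch.qint8)

features = torch.randn(4, 1280)          # e.g. pooled backbone features
with torch.no_grad():
    logits = int8_model(features)        # INT8 matrix multiplications on the quantized layers
```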
The lightweight algorithm derived from YOLOv5 achieves improved performance through triple optimization: channel pruning removes 27% of unnecessary convolutional kernels, knowledge distillation allows the student model to absorb 92% of the semantic knowledge from the teacher model, and the innovative training strategy, TASA (Task Adaptive Sample Augmentation), enhances robustness for small objects. This method reduces the size of the weed identification model in cotton fields by 79.2% and boosts detection speed by 86.64%, while maintaining a mean average precision (mAP) at 0.5 of 91.3% [115]. The CIoU loss function accelerates model convergence by 30% by incorporating bounding box aspect ratio restrictions, resulting in a 2.5-day reduction in training time for rice field weed detection [56].
Consequently, model compression acceleration technology decreases computing complexity while preserving accuracy via the synergistic optimization of pruning, quantization, and knowledge distillation, establishing the groundwork for the real-time functionality of edge devices. LAMP pruning improves precision and efficiency, whereas quantization technology substantially accelerates reasoning with negligible accuracy degradation. The lightweight algorithm using triple optimization markedly diminishes the model size and enhances speed. The CIoU loss converges rapidly, effectively facilitating real-time detection needs.

3.5.4. Modelling Algorithm Improvements

The enhancement of deep learning model algorithms aims to improve accuracy, robustness, and adaptability to complex scenes, thereby advancing weed recognition technology through innovations in network architecture, multimodal fusion, the incorporation of attention mechanisms, and the development of optimization strategies.
Bah [116] introduced an improved ResNet18 model for the backbone network and transfer learning, performing unsupervised data annotation through crop-row detection (using the Hough transform and skeleton extraction) and superpixel segmentation (the SLIC algorithm). Incorporating ImageNet pre-training transfer learning, the model achieves unsupervised-annotation AUCs of 94.34% and 88.73% for spinach and bean fields, respectively, markedly reducing dependence on manual annotation and its associated costs. Khan [117] improved Faster R-CNN by replacing the VGG16 backbone with ResNet-101 and enhancing the anchor configuration (16 anchors to handle multi-scale targets), achieving an average accuracy of 95.3% for weed identification and a 2.1% increase in overall target detection accuracy.
Multimodal fusion and feature enhancement techniques have become essential for overcoming the limitations of a single modality. Xu [118] developed an RGB-D three-channel deep learning network that transforms the depth image into a three-channel PHA (phase–height–angle) representation; using a feature-level fusion strategy (multiscale convolutional fusion) and a decision-level fusion strategy (weighted integration with weights α = 0.4 and β = 0.3), they achieved a mean average precision (mAP) of 36.1% for gramineous weeds, 42.9% for broadleaf weeds, and an overall detection accuracy (IoG) of 89.3%. Ahmad [119] assessed the effectiveness of transfer learning with pre-trained models such as VGG16, ResNet50, and InceptionV3; VGG16 achieved the highest image classification accuracy of 98.90% under the PyTorch (version 1.2.0) framework and serves as a benchmark for developing lightweight models.
The optimization of the attention mechanism and small-target detection significantly improves the model's ability to adapt to complex environments. Urmashev [120] incorporated the ECA-Net attention module into YOLOv5 to strengthen inter-channel feature interactions, resulting in a 3.2% increase in small-target detection accuracy and an mAP@0.5 of 78.1%. Honghua Jiang [121] introduced the YOLOv8-ECFS model, which enhances feature extraction with an EfficientNet-B0 backbone and MBConv modules featuring the SENet attention mechanism, while refining bounding-box regression through the Focal_SIoU loss function (a combination of Focal Loss and SIoU); the improved mAP@0.5 reaches 95.0%, a 1.0% increase over the original YOLOv8s, with the mAP for difficult-to-identify weeds, such as bowl flower, increasing by more than 5%.
Khan [122] enhanced the YOLOv7 algorithm by incorporating spatial-channel information through lightweight convolutional layers, SE blocks, and batch normalization, along with an adaptive gradient optimizer and Lasso regularization. This led to a 3.2% increase in model precision, a 6.2% boost in recall, and a significant 7.1% improvement in mAP@0.5:0.95 metrics, confirming the effectiveness of the combined network and optimization enhancements. These improvements enhance recognition accuracy within a single scene (e.g., the supervised annotation AUC reaches 95.70% [116]) and strengthen the model’s robustness in cross-field and multi-crop environments through cross-modal fusion and unsupervised annotation, providing a solid foundation for weed recognition technology in engineering applications. Table 11 shows examples of these model improvements.
Deep learning model algorithms can improve accuracy, robustness, and adaptability to challenging scenarios through enhancements in network architecture, multimodal fusion, the incorporation of attention mechanisms, and the formulation of optimization strategies. Improvements to ResNet18, Faster R-CNN, and similar models raise annotation efficiency and detection accuracy; RGB-D fusion and pre-trained models overcome the constraints of unimodal approaches; and refined attention mechanisms and loss functions improve small-target detection and adaptation to complicated scenes, establishing a robust foundation for the engineering application of weed identification.
This section explores the use of model optimization techniques in weed detection systems. Data augmentation has advanced from simple geometric transformations to sophisticated generative models: the classifier-guided diffusion model (ADM-G) increases cotton-field recognition accuracy by 12.5%, while the plant-alignment augmentation method (PAIAM) boosts segmentation accuracy in rice and sugar beet fields. In loss function optimization, the GIoU, DIoU, and CIoU series collectively enhance bounding-box localization accuracy by 15–20%, and the HAD-YOLO model, which adopts CIoU, reaches a mean average precision (mAP) of 96.2%. Model compression techniques, including pruning, quantization, and knowledge distillation, support edge deployment: Layer-Adaptive Magnitude-based Pruning (LAMP) improves inference efficiency by 40%, while TensorRT quantization speeds up inference by 2.3 times. Algorithm improvements that integrate network optimization, multimodal fusion, and attention mechanisms lead to gains such as a 3.2% increase in small-object detection accuracy for YOLOv5. These optimization strategies work together to improve model accuracy, robustness, and deployment efficiency.

3.6. Scene Adaptation Optimization

3.6.1. Multi-Scale Feature Processing

The morphological scale of weeds in agricultural fields varies significantly, with the difference in pixel area between seedling and mature stages sometimes exceeding ten times. This makes it difficult to accurately detect weeds across different growth stages using traditional single-scale models. Multi-scale feature processing technology has become a key solution to this scale variability by implementing a cross-level feature fusion mechanism. Standard models, such as the Feature Pyramid Network (FPN) and Path Aggregation Network (PANet), facilitate the integration of semantic and spatial information across scales through bidirectional feature flows, encompassing both top-down and bottom-up directions.
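A minimal top-down FPN fusion step can be sketched as follows: lateral 1 × 1 convolutions align channel counts, coarser maps are upsampled and added to finer ones, and 3 × 3 convolutions smooth the fused maps. The channel widths and feature-map sizes here are illustrative assumptions, not values from the cited models.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFPN(nn.Module):
    """Minimal top-down feature pyramid: lateral 1x1 convs align channels, coarse
    maps are upsampled and added to finer ones, then 3x3 convs smooth the result."""
    def __init__(self, in_channels=(128, 256, 512), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])
        self.smooth = nn.ModuleList([nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                     for _ in in_channels])

    def forward(self, feats):                       # feats ordered fine -> coarse, e.g. [C3, C4, C5]
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 2, -1, -1):  # top-down pathway
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        return [s(p) for s, p in zip(self.smooth, laterals)]

# Example: three feature maps at strides 8/16/32 from a hypothetical backbone.
c3, c4, c5 = torch.randn(1, 128, 64, 64), torch.randn(1, 256, 32, 32), torch.randn(1, 512, 16, 16)
p3, p4, p5 = TopDownFPN()([c3, c4, c5])
```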
In target detection, the RicePest-YOLO model establishes a multi-scale feature enhancement system: deformable dynamic convolution (ODConv) adaptively adjusts the receptive field to match variations in weed size, a bidirectional feature pyramid network (BiFPN) enables efficient aggregation of multi-level features, and a Shape-IoU loss function is included to improve the accuracy of bounding-box geometry alignment. This method enhances the multi-scale weed detection mAP@0.5 in rice fields to 92.3%, while remaining lightweight (with a 40% reduction in parameters and a 35% decrease in computation) and achieving real-time inference at 28.7 FPS on Jetson Nano [123].
In the semantic segmentation task, the FPN-based multiscale prediction framework achieves a mean intersection over union (mIoU) of 77.9% while maintaining an inference speed of 62 FPS in complex field scenarios. This is accomplished through a dual-branch prediction module that handles high-resolution details and low-resolution semantics, a channel-attention mechanism that highlights scale-sensitive features, and a cross-layer feature-weighted fusion strategy [124]. The technical advantage lies in the hierarchical representation of the feature pyramid, which effectively captures the intricate texture of weed seedlings (pixel share < 5%) and the overall morphological characteristics of mature plants, thereby significantly improving segmentation consistency across various growth stages.
The advancement of multi-scale feature processing technology transcends the constraints of single-resolution feature representation, offering a systematic answer to the prevalent issue of scale diversity in agricultural landscapes. This is achieved by dynamic receptive field modification and cross-layer semantic integration techniques.

3.6.2. Anti-Environmental Disturbance Technique

In the agricultural context, fluctuations in light intensity (brightness can vary by a factor of up to 200 over the course of a day), the complexity of soil texture (irregular distribution of ridges, straw mulch, etc.), and the variability of the crop growth cycle (e.g., a 40% increase in leaf shading during the maize jointing stage) pose numerous challenges to the model's environmental robustness. Anti-environmental-interference technology offers a multi-dimensional solution through illumination normalization, cross-domain feature alignment, and noise-robust modeling.
Regarding cross-scene adaptation, Domain-Adaptive YOLO, developed by the University of California, Berkeley, USA, establishes a feature distribution alignment loss between the source domain (manually labeled fields) and the target domain (natural farmland) using an adversarial training mechanism, thereby reducing the model’s mean average precision (mAP) decline from 22.3% to 8.7% when applied across different scenes [125]. The method effectively mitigates feature bias caused by environmental variables, including soil color and light angle, through the adversarial interaction between the domain discriminator and the detection network. The CenterNet-based weed identification system, enhanced by genetic algorithm optimization, can indirectly identify weed locations by detecting the presence of vegetable plants. This approach circumvents the complexities associated with the morphological variability of over 20 weed species, achieving an overall identification accuracy of 95.3% in multi-species vegetable fields [31].
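The cited works do not publish their training code here; as an illustration of the adversarial alignment idea, the sketch below uses the common gradient-reversal formulation, in which a domain discriminator learns to separate source and target features while reversed gradients push the feature extractor toward domain-invariant representations. All names and dimensions are ours.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; multiplies gradients by -lambda on backward,
    so the upstream feature extractor is trained to confuse the domain discriminator."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DomainDiscriminator(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))            # source-vs-target logit

    def forward(self, feat, lam=1.0):
        return self.net(GradReverse.apply(feat, lam))

# Domain-alignment loss on pooled detector features from source (labeled) and target fields.
disc = DomainDiscriminator()
bce = nn.BCEWithLogitsLoss()
src_feat, tgt_feat = torch.randn(8, 256), torch.randn(8, 256)  # placeholders for real features
domain_loss = bce(disc(src_feat), torch.ones(8, 1)) + bce(disc(tgt_feat), torch.zeros(8, 1))
domain_loss.backward()
```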
The anti-jamming method for spectral data demonstrates clear benefits, utilizing the Savitzky–Golay (SG) algorithm for smoothing and reducing noise in hyperspectral data, and combining the wavelet transform (WT) with partial least squares regression (PLSR) to extract feature wavelengths. The moisture detection model effectively predicts lettuce leaf moisture content with RMSEP = 0.1688 and R2 = 0.8307 [126], while improving noise resistance by 17% compared to single PLSR. The competitive surface-enhanced Raman scattering (SERS) immunosensor was developed using nitrile (C≡N) Raman tags to position the distinct peaks within the Raman quiet zone of 1800–2800 cm−1, avoiding complex spectral interferences in the fingerprint region (500–1800 cm−1), and achieving a detection limit for pesticide residues as low as 0.012 ng/mL [127].
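As a small, self-contained example of the Savitzky–Golay smoothing step, the snippet below denoises a synthetic reflectance spectrum with SciPy; the wavelength range, window length, and polynomial order are placeholder choices rather than the settings used in the cited study.

```python
import numpy as np
from scipy.signal import savgol_filter

# Illustrative smoothing of a noisy reflectance spectrum before feature extraction;
# the band range, window length, and polynomial order are placeholder values.
wavelengths = np.linspace(400, 1000, 601)                         # nm, hypothetical band range
spectrum = np.exp(-((wavelengths - 680) / 40) ** 2) + 0.05 * np.random.randn(601)
smoothed = savgol_filter(spectrum, window_length=11, polyorder=3)
```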
These tactics markedly enhance the model’s resilience in intricate environments, including cloudy weather and monoculture agriculture, by isolating environmental variables, achieving cross-domain feature invariance, and facilitating noise-sensitive feature transfer.

3.6.3. Few-Shot and Unsupervised Learning

In agricultural contexts, the wide variety of weeds (over 3000 species worldwide) and the high cost of annotation (15–20 min for detailed annotation of a single image) have led to the adoption of few-shot learning and unsupervised learning as key technologies to reduce data dependency. Meta-learning (ML) enhances the ability to identify new weed species by developing an adaptive mechanism that ‘learns to learn’, enabling the model to be rapidly adjusted with only 5–20 new class samples.
Semi-supervised learning (SSL) effectively balances annotation costs and model performance. Kong [128] evaluated four semi-supervised methods (Mean Teacher, Pi-Model, Dash, and FixMatch) and found that FixMatch, by applying consistency regularization and pseudo-label filtering, achieves a weed detection accuracy of 94.8% in wheat fields with just 200 annotated images per class (12% of the fully supervised data), only 2.7% below fully supervised learning (FSL) while cutting annotation costs by nearly 85%. This approach effectively reduces the need for manual annotation by combining minimal labeled data with extensive unlabeled data augmentation.
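The consistency-plus-filtering step at the core of FixMatch can be sketched as follows: weakly augmented images provide pseudo-labels, low-confidence predictions are masked out, and the loss is computed on strongly augmented views of the same images. The 0.95 threshold follows the original FixMatch paper and is an assumption here, not necessarily the value used in the cited evaluation.

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, weak_batch, strong_batch, threshold=0.95):
    """Pseudo-label weakly augmented images, keep only confident predictions,
    and enforce consistency on the strongly augmented views of the same images."""
    with torch.no_grad():
        probs = F.softmax(model(weak_batch), dim=1)
        conf, pseudo_labels = probs.max(dim=1)
        mask = (conf >= threshold).float()            # pseudo-label filtering
    logits_strong = model(strong_batch)
    per_sample = F.cross_entropy(logits_strong, pseudo_labels, reduction="none")
    return (mask * per_sample).mean()                 # masked consistency term
```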
In unsupervised learning, self-supervised pre-training combined with contrastive learning has become an innovative approach for feature extraction. The DINO (self-distillation with no labels) framework adaptively updates feature templates using a momentum encoder and enhances scale invariance through a multi-crop training strategy, enabling a Vision Transformer (ViT) to reach 80.1% linear classification accuracy on ImageNet without labels; the extracted features can be used directly in a weed semantic segmentation task to locate crop–weed boundaries in unlabeled cornfield images [129]. This "unsupervised pre-training plus task fine-tuning" paradigm offers a practical solution to the shortage of labeled data in agriculture.
The emergence of few-shot and unsupervised learning techniques diminishes the model's dependence on extensive labeled datasets, transforming weed recognition technology from "data-intensive" to "knowledge-efficient" through rapid meta-learning adaptation and self-supervised generic feature extraction.
This section explores how scene adaptation optimization technologies are used in weed detection systems. Multi-scale feature processing combines information from different layers through architectures like Feature Pyramid Networks (FPN) and Path Aggregation Networks (PANet). Specifically, RicePest-YOLO uses dynamic convolution along with Bidirectional Feature Pyramid Network (BiFPN) to increase the multi-scale detection mean average precision (mAP) to 92.3% in rice fields. Techniques to reduce environmental interference tackle issues from lighting and soil changes using domain adaptation and spectral denoising methods. A Domain-Adaptive YOLO setup limits cross-scenario mAP loss to 8.7%, and spectral processing boosts noise resistance by 17%. To reduce reliance on labeled data, few-shot and unsupervised learning methods are employed. FixMatch achieves 94.8% accuracy with only 200 labeled images, and the self-distillation with no labels (DINO) framework enables effective unsupervised pre-training for feature extraction. These approaches together help overcome data shortages and environmental challenges in farming, improving how well models generalize.

3.7. Hardware System Architecture

3.7.1. Mobile Platform Design

The mobile platform of intelligent weeding machinery needs to balance terrain adaptability and operational efficiency. Wheeled platforms with a simple structure and high speed are suitable for flat farmland. Shenyu Zheng [23] designed and developed a multifunctional robotic mobile platform (Figure 4) for unstructured and complex field environments for weeding in cropland.
Yu, Xin et al. [130] designed an actively gimbaled, wheeled, omnidirectional agricultural mobile platform for unstructured terrain and verified its feasibility through kinematic simulation. It was shown in [131] that the operating efficiency of the wheeled platform was 15–20% higher than that of the tracked platform in farmland with terrain flatness ≤5°. Tracked platforms, in contrast, are better suited to complex terrain: Yang [132] developed a motorized tracked weeding robot for organic onion fields that achieves high-precision identification of crop rows through a YOLOv8-seg model combined with data augmentation, while its tracked chassis improves terrain adaptability, keeping the lateral deviation of automated operation within ±2.3 cm, significantly better than manual operation. In addition, the UAV platform shows excellent maneuverability; the DJI T60 agricultural drone, equipped with a spray-based weeding module, can precisely target-spray weeds at an operating height of 3 m [133].
The movable platform of intelligent weeding machinery must reconcile terrain flexibility with operational efficiency. The wheeled platform possesses a straightforward design and elevated velocity. In farms with a gradient of ≤5°, the efficiency of this type surpasses that of the tracked kind by 15–20%. The crawler platform enhances its adaptability to complex terrains through a crawler mechanism, ensuring lateral deviation for onion field activities remains within ±2.3 cm. The uncrewed aerial vehicle (UAV) platform has superior mobility and can accomplish precise targeted spraying. Each platform can be tailored to various agricultural contexts as required, enhancing operational efficiency.

3.7.2. Perception System Integration

The perception system acts as the 'eye' of intelligent weeding machines, and its design directly affects recognition accuracy. The concept of an innovative agricultural system utilizing digital technology involves integrating multi-source data (soil, climate, and crop) through the use of IoT, AI, blockchain, and other technologies, resulting in the creation of a data platform and decision-support system designed to enhance agricultural production and sustainability [134]. Traditional sensing systems include vision sensors, positioning modules, and inertial measurement units.
An offline, target-free, continuous-time optimized unified spatio-temporal calibration framework enables the high-precision and robust calibration of multiple sensors, including IMU, radar, LiDAR, and camera. The system improves calibration consistency through a dynamic initialization technique and a batch optimization algorithm, maintaining attitude error within 0.5° in dynamic operational situations [135]. In intelligent agricultural sensing, an IoT-based multi-source data fusion system achieves collaborative integration of light, temperature, humidity, and other sensor data, using a regression-tree algorithm to impute missing values. It verifies the accuracy of various sensor combinations for environmental monitoring and creates a cost-effective, collaborative, real-time monitoring platform that incorporates both hardware and software. The strengths of this system lie in its effective use of inexpensive sensors, improved data reliability through machine learning, and real-time analytical capabilities enabled by the IoT platform [136].
The EPMC monitoring framework for agricultural drought assessment utilizes multi-sensor collaboration with the cumulative drought index PADI, allowing for a quantitative analysis of drought progression by integrating precipitation data, soil moisture, vegetation indices, and crop growth cycle characteristics. The Pearson correlation coefficient between PADI and wheat yield loss ranges from ρ = 0.66 to 0.77, indicating a strong correlation [137].
The architecture of the perception system, serving as the “eyes” of intelligent weeding machinery, influences recognition accuracy. The spatio-temporal calibration framework improves sensor calibration accuracy, maintaining a dynamic scene attitude error of ≤0.5°. The multi-source data fusion system of the Internet of Things achieves the collaborative integration of environmental data, thereby cutting costs and enhancing reliability. The EPMC framework integrates several sensors and drought indexes to evaluate agricultural data precisely, hence offering robust assistance for agricultural output.

3.7.3. Weeding Mechanism Design

The choice of actuators directly affects weeding efficiency and environmental benefits, mainly including mechanical, chemical, laser, and thermal weeding methods. The technical specifications and usage contexts are as follows.
Mechanical weeding achieves weed removal through physical contact. Chang [138] proposed a precision weeding system based on the YOLOv3 deep learning model that uses a modular inverted-triangular weeding tool to cut off weed root systems, achieving a weeding efficiency of 84–92.3% and a crop damage rate of ≤11.1% at operating speeds ≤15 cm/s; its advantages include high detection accuracy, a modular and easily deployed mechanical structure, and suitability for low-density weed scenarios. In addition, the automated robotic arm carried by a robotic weeding platform removes inter-row weeds with rotating blades at a movement accuracy of 0.79 inches RMS, replacing manual operation and significantly reducing chemical use on organic farms [139]. An in-row intelligent weeding system based on electrically oscillating open–close knives (Figure 5) identifies the position of cabbage plants in real time through deep learning and controls the knife trajectory to dynamically avoid the crop while accurately removing inter-plant weeds [23]. Jia et al. [140] designed an intra-row obstacle-avoidance shovel weeder (Figure 6) that detects grapevines in real time through obstacle-avoidance rods, triggering a hydraulic system; a parallelogram mechanism retracts the weeding shovel to avoid the obstacle and automatically resets to continue intra-row weeding. After parameter optimization (hydraulic cylinder extension speed of 120 mm/s, implement forward speed of 0.6 m/s, angular threshold of 18°), the weeding coverage rate reaches 86.8%, effectively reducing missed areas and avoiding collisions. A coaxial counter-rotating weeding mechanism (Figure 7) pulls and cuts weeds and part of their roots between upper and lower weeding knives; the knives run counter to each other so that the opposing forces reduce winding and effectively mitigate the entanglement of weed stalks.
Chemical weed management applies herbicides to achieve effective weed control. Alicia Allmendinger [141] developed a CNN-based modular spot-spraying system that achieves standardized integration of sensors and sprayers via the ISOBUS protocol, reducing herbicide use by 40% to 95% in crops such as maize and sugar beet.
Laser weeding destroys weed cells with high-energy lasers. Yu [142] designed a fiber-laser-based, statically movable, lift-adjustable enclosed weed control device and tested it on four large weed species (quinoa, prickly amaranth, dogwood, and oxalis), finding that at the same irradiation distance the cutting energy increases with stem diameter and decreases with irradiation time. A prototype stationary weeding robot features a dual-head laser system and a fast path-planning algorithm, enhancing laser targeting with machine vision and achieving 97% accuracy at an operational speed of 30 mm/s in a laboratory setting [143]. Using YOLOX, a blue-laser weeding robot for maize fields employs a five-degree-of-freedom robotic arm for precise laser positioning; moving at 0.2 m/s, it detects maize seedlings and weeds at rates of 92.45% and 88.94%, respectively, with a weed control efficiency of 85% and a seedling damage rate of just 4.68% [144]. A combined mechanical–laser inter-row weed control system provides a comprehensive technological approach to intelligent weed management by algorithmically adjusting key parameters such as laser power [145].
Thermal weed control uses high-temperature steam or flames to eliminate weeds. A portable flame weed control device, operating at 1 km/h and 40 psi, achieves a weed control efficiency of 91.1%, which is 94.82% better than manual methods and 50.42% more cost-effective [146]. A tractor-mounted flame weed management system includes optimized gas pressure (1–2 bar), flame height (15–25 cm), and operational speed (0.6–1.5 km/h) to improve efficiency. The combination of two burners, moving at 0.6 km/h, with a pressure of 2 bar and a flame height of 15 cm, achieved optimal weed eradication, using 40 kg of gas per hectare. This setup was more effective at controlling narrow-leaf and broad-leaf weeds than perennial weeds [147].
Enhancing the drone’s operational settings (flight altitude of 1.5 m and velocity of 5 m/s) maximizes droplet deposition in the lowest layer of the rice canopy, ensuring uniform distribution (coefficient of variation of 23%). This method effectively controls rice whitefly populations, reducing them by 92–74% within 3–10 days, and exhibits a significantly prolonged residual effect compared to conventional hand sprayers. This provides the basis for the parameters of low-volume, high-concentration spray technology [148].
The choice of actuators influences weeding efficiency and environmental sustainability. Mechanical weeding employs many processes to accommodate varied circumstances, demonstrating significant efficiency and coverage. Chemical weeding decreases pesticide usage by 40% to 95% via an advanced spraying method. Laser weeding exhibits exceptional precision and minimal damage. Thermal weeding is superior in efficiency and cost-effectiveness compared to hand weeding. Following the optimization of uncrewed aerial vehicle (UAV) settings, the residual impact of pest management is enduring, and various technologies offer robust support for intelligent weeding.
Table 12 shows traditional weed management methods, while Table 1 displays intelligent weed control equipment.
This section examines the application of hardware system architecture within intelligent weed control solutions. Robotic platforms demonstrate terrain adaptability: wheel-based configurations achieve operational efficiency gains of 15–20% over tracked platforms on slopes ≤ 5°, while tracked platforms maintain lateral deviation within ±2.3 cm. Uncrewed Aerial Vehicles (UAVs) facilitate precision spraying operations. The perception system employs a spatio-temporal calibration framework to limit dynamic attitude errors to 0.5°, and Internet of Things (IoT)-enabled multi-source data fusion enhances overall system reliability. Weed control mechanisms exhibit diversity: mechanical methods achieve removal efficiencies of 84–92.3%, chemical spraying reduces pesticide usage by 40–95%, laser ablation attains 97% target precision with minimal crop damage, and thermal techniques offer superior cost-effectiveness relative to manual labor. Collectively, optimized hardware systems synergistically integrate terrain adaptation, precise environmental perception, and efficient weed elimination modalities to enhance overall performance.

3.8. Software System Architecture

3.8.1. Real-Time Operating System

The practical and dependable operation of intelligent weeding machinery relies on precise task scheduling managed by a real-time operating system (RTOS). Research shows that real-time operating systems (RTOS) can ensure the completion of critical tasks, such as weed detection, path planning, and actuator control, within strict time limits by using deterministic scheduling algorithms. This helps prevent multitasking conflicts and delays in response [150]. In high-speed operations, an RTOS can process sensor data streams—like visual recognition results and terrain sensing information—within milliseconds. It can also dynamically adjust the machine’s operating parameters, including travel speed and tool angle, in real time, keeping system latency under 50 milliseconds and maintaining weeding accuracy above 95%.
Typical applications include the remote-operated weeding robot Wee Ro, which utilizes an integrated real-time control system based on the Pixhawk flight control platform. It facilitates low-latency sensor-actuator data exchange through the PPM/PWM signal processing protocol. The system uses an event-driven task scheduling mechanism to ensure that the wireless transmission delay of remote commands stays within 80 milliseconds. Combined with a dynamic priority scheduling algorithm, this guarantees real-time operation in complex agricultural environments [151]. This RTOS-based control architecture delivers deterministic performance, enabling uninterrupted operation of intelligent weeding machines in unstructured settings.
The reliable operation of intelligent weeding machinery relies on the precise scheduling of RTOS. RTOS ensures that critical tasks are completed within a limited time through deterministic algorithms, processes sensor data in milliseconds, and dynamically adjusts parameters to keep latency below 50 milliseconds, with weeding accuracy exceeding 95%. The Wee Ro robot relies on its ability to achieve low-latency data exchange, with a remote command delay of no more than 80 milliseconds, providing deterministic support for continuous operations in complex agricultural environments.

3.8.2. Task Planning Algorithms

Task planning algorithms are central control modules of intelligent weeding machines, primarily designed to create optimal paths that balance efficiency and safety based on environment-aware data. These algorithms must comprehensively consider multiple factors, including weed density distribution, crop row spacing limitations, and mechanical kinematic constraints, among others [24]. Technical solutions can be categorized into three main groups.
The A* algorithm [152], a classical graph search method, expands nodes along the estimated best path by combining a heuristic function h(x) (such as the Manhattan or Euclidean distance) with the actual cost g(x) to form the evaluation function f(x) = g(x) + h(x). In agricultural settings with sparse weeds and gentle topography, the algorithm improves pathfinding efficiency by 30–50% through heuristic guidance, with the resulting path length close to the theoretical optimum. Typical use cases include inter-row weeding in conventional agriculture, where A* efficiently generates collision-free Z-shaped operational paths from crop row and ridge coordinates, as shown in the sketch below.
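A minimal A* sketch on a 4-connected occupancy grid, using the Manhattan distance as h(x) and unit step costs as g(x), is given below; the field map and coordinates are hypothetical.

```python
import heapq

def a_star(grid, start, goal):
    """A* on a 4-connected occupancy grid: f(x) = g(x) + h(x), with a
    Manhattan-distance heuristic; grid[r][c] == 1 marks an obstacle."""
    def h(p):                                   # admissible heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_set = [(h(start), 0, start, None)]     # (f, g, node, parent)
    came_from, g_best = {}, {start: 0}
    while open_set:
        f, g, node, parent = heapq.heappop(open_set)
        if node in came_from:                   # already expanded with a better cost
            continue
        came_from[node] = parent
        if node == goal:                        # reconstruct path back to the start
            path = [node]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1                      # unit step cost
                if ng < g_best.get((nr, nc), float("inf")):
                    g_best[(nr, nc)] = ng
                    heapq.heappush(open_set, (ng + h((nr, nc)), ng, (nr, nc), node))
    return None

# Example: plan between two row ends on a small field map with one obstacle block.
field = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
print(a_star(field, (0, 0), (2, 3)))
```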
The RRT* algorithm [153] employs probabilistically complete random sampling to construct a search tree and gradually refines paths through a "grow–rewire" process. Its main advantages are as follows: (1) no prior environment modeling is needed, enabling quick discovery of feasible paths in unfamiliar, complex terrain (such as ridge-and-furrow undulation or straw obstacles); (2) path curvature variation can be held within 0.1 rad/m by applying path-smoothing techniques (such as Bézier curve fitting), suiting the motion limits of mowing equipment. Experimental results show that in a simulated farming scenario with 100 random obstacles, the path planning time of RRT* is 40% shorter than that of traditional RRT, and the path length is improved by 25%.
In recent years, end-to-end planning algorithms using CNN-RNN architecture [154] have become an increasingly prominent area of research. These algorithms utilize environment-aware data, such as LiDAR point clouds and visual images, as inputs to produce control outputs (e.g., steering angle, travel speed) through end-to-end training, thus eliminating the need for explicit environmental modeling found in traditional algorithms. The advantages are evident: (1) strong generalization ability, maintaining a path success rate of 92% across various crop scenarios, including corn and wheat; (2) fast response speed, thanks to GPU-accelerated planning models, enabling real-time decision-making at 20 Hz. A field experiment shows that the deep learning planning algorithm achieves an 18% higher obstacle avoidance success rate on unstructured farmland compared to standard algorithms, with path efficiency similar to that of human experts.
The task planning algorithm serves as the primary control module of intelligent weeding machinery, producing efficient and safe trajectories from environmental data. The conventional graph search algorithm enhances pathfinding efficiency by 30% to 50% via heuristic guidance. The RRT* technique requires no prior modeling, efficiently identifies feasible paths in intricate terrain, and allows path curvature to be controlled. The CNN-RNN end-to-end algorithm exhibits robust generalization, achieving a path success rate of 92% across various crop scenarios and a high obstacle-avoidance success rate in unstructured farmland. Each algorithm is tailored to the demands of specific scenarios.

3.8.3. Human–Machine Interface

The design of the human–machine interface (HMI), as the fundamental component of intelligent weeding machinery, directly influences operational efficiency and the experience of human–machine collaboration. An optimal HMI must possess three essential attributes: intuitive operational logic, real-time status feedback, and facile parameter change, thereby minimizing cognitive load on the user and facilitating the effective transition of technology from laboratory settings to field operations.
The small digital instrumentation system with HMI technology shows significant technical flexibility in agricultural machinery intelligence. It allows real-time display, data collection, and local storage of 12 key parameters, including tractor drive wheel slip rate and travel speed, through a touch-screen interface. The system also features automatic sensor calibration and dynamic operational status monitoring. After static calibration and field testing, the rotational speed measurement accuracy reaches ±1.2%, with a slip rate response delay of less than 200 ms. This enables smooth integration into wheeled and tracked power platforms, providing high-precision data support for agricultural equipment research, development, and educational experiments [155].
To meet the lightweight needs of small and medium-sized farms, the Wee Ro HMI (PTC, Version 5.0) system utilizes a remote control setup with a handheld terminal and ground station software, allowing precise control of weeding machines within a 500 m range via 2.4 GHz wireless communication. The operator can access real-time, multidimensional data, including operational paths, weed-detection heat maps, and chemical tank levels, and can switch with one click between modes such as 'fine spraying within rows' and 'general spraying over the entire field'. The design also supports consistent spraying performance across machines. Tests show that this setup increases the area one person can cover by 30% per day, reduces manual work by 40%, and significantly lowers labor effort compared to traditional manual methods [151]. In the future, adding a semi-autonomous decision module will further improve human–machine cooperation.
The cotton harvester yield monitoring HMI software, developed on the Qt platform for precise agricultural data collection, utilizes a C++ event-driven approach to facilitate rapid communication with the embedded controller. The system includes a GPS positioning module (accuracy ±2.5 m) and a hydraulic pressure sensor (resolution 0.1 kg). It applies the Haversine formula for real-time calculation of the operational area. It uses a dynamic filtering algorithm to ensure accurate measurement of the weight of each cotton basket. Field testing shows that the yield monitoring error remains below 2.5%, meeting the requirements for onboard auto-calibration in multi-species cotton field trials [156].
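The cited yield-monitoring software is not open source; as an illustration of the Haversine step, the snippet below computes great-circle distances between consecutive GPS fixes and approximates the worked area as distance multiplied by an assumed working width. The coordinates and the 2.4 m width are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes in metres (Haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Operational area approximated as travelled distance times working width (hypothetical 2.4 m).
fixes = [(32.1201, 114.0502), (32.1203, 114.0502), (32.1205, 114.0503)]   # hypothetical GPS track
distance_m = sum(haversine_m(*p, *q) for p, q in zip(fixes, fixes[1:]))
area_m2 = distance_m * 2.4
```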
The interface visualization design, real-time data synchronization, and fault-tolerance mechanisms of various HMI technologies have established a closed-loop interaction system for the integration of ‘human–machine–environment’. This device enhances operating safety in intricate field conditions. It enhances human–machine collaboration by precisely regulating operational factors, including the optimization of herbicide dosage and the management of mechanical energy consumption, while also aiding in decision-making.
This section explores how software system architecture supports intelligent weed control systems. The real-time operating system (RTOS) ensures deterministic scheduling to complete critical tasks within set time limits, keeping system latency below 50 ms and achieving weed control accuracy over 95%. Remote command latency for the Wee Ro robot is limited to ≤80 ms. Task planning algorithms demonstrate flexibility across various scenarios. For instance, classical graph search methods enhance operational efficiency by 30–50%, while Rapidly-exploring Random Trees (RRT) accelerate path planning in complex terrain by 40%. Furthermore, an end-to-end CNN-RNN architecture achieves a 92% success rate in executing paths. The human–machine interface offers intuitive operation with real-time feedback, increasing operational efficiency by 30% and reducing manual intervention by 40%. Overall, this software architecture guarantees system reliability and efficiency through three main elements: real-time scheduling, intelligent planning algorithms, and seamless human–machine collaboration.

3.9. Typical Application Scenarios

3.9.1. Row Crop Scenario

Advanced weed control technology has made significant progress in row crops such as maize and cotton, whose uniform row and plant spacing provides a reliable basis for weed identification and precise management. Research shows that intelligent weeding systems using machine vision can exploit the directional features of cereal crop leaves (vertical within rows and horizontal between rows) to track wheat rows dynamically in real time. This is achieved through a combination of algorithms, including Sobel–Canny edge detection, Hough-transform line-segment extraction, Mean-Shift clustering, and Kalman-filter tracking, providing tracking accuracy within 50 mm in challenging field conditions [157].
In the context of cornfields, combining morphological character analysis of crops and weeds with deep learning algorithms enables high-precision weed identification, often achieving recognition accuracy above 90%, which significantly reduces accidental crop damage. The improved Faster R-CNN deep learning model performs well in weed detection during the cotton seedling stage, reaching a weed identification accuracy of 98.43% (mAP) and an effective spraying rate of 98.93% when used in an autonomous spraying robot system [158]. The intelligent spraying system, which uses machine vision and the YOLOv4 algorithm and is powered by the NVIDIA Jetson AGX Orin edge computing platform, achieves effective spraying rates of 93.33% in indoor tests and 90.6% in field tests, while significantly reducing herbicide use [159].
The weed management methods employed in crops like corn and cotton have yielded exceptional outcomes. Machine vision, integrated with various algorithms, facilitates dynamic tracking of wheat rows, achieving an accuracy of up to 50 mm in intricate situations. Corn fields have attained a recognition rate of over 90% through morphological analysis and deep learning techniques. Enhanced models, such as Faster R-CNN and YOLOv4, achieve an accuracy and effective spraying rate of over 90% in weed recognition and intelligent spraying in cotton fields, substantially decreasing dosage and limiting harm.

3.9.2. Vegetable and Orchard Scenarios

Vegetable and orchard ecosystems are more complex than row crop systems; many vegetable species have distinct growth cycles, and higher planting densities result in significant inter-plant shading. Additionally, the irregularly shaped crowns of orchard fruit trees, combined with dense weed populations beneath them and considerable shading interference, require advanced intelligent weed control technologies. Jia et al. [160] developed a sophisticated orchard row obstacle avoidance weeding machine that ensures precise obstacle evasion through the integration of non-contact sensors and a mechanical haptic framework (Figure 8). It uses a hydraulically driven shovel-type weeding mechanism, improves the chassis structure through finite element analysis (resulting in an 8% reduction in weight), and optimizes operational parameters using the response surface method. The effectiveness of the weeding coverage was confirmed to be 84.6% during field testing.
To accurately identify weeds in orchards, developing an environmental model using 3D vision technology is essential, considering the shading effects of fruit trees and variations in light and shade. A real-time lettuce-weed localization method with the lightweight YOLOv7 model (Multimodule-YOLOv7-L) achieves a detection accuracy of 97.5% and an inference speed of 37.3 FPS. Additionally, designing a weed severity classification algorithm for in-row detection achieves 100% accuracy across eight scenarios, providing decision support for the mechanical-laser collaborative weeding robot [161]. Weed detection using DJI Phantom 4 Pro Multispectral data combined with machine learning algorithms shows that the Maximum Likelihood Classification (MLC) algorithm performs best (overall accuracy 90.78%, Kappa coefficient 0.85), enabling weed identification and optimized water and fertilizer management [162].
Vegetable and orchard ecosystems exhibit greater complexity and necessitate sophisticated intelligent weeding technology. The orchard obstacle avoidance and weeding machine integrates sensors and mechanical contact for accurate obstacle evasion, achieving a coverage rate of 84.6%. Three-dimensional vision facilitates weed identification, achieving a detection accuracy of 97.5% with a lightweight model and a severity classification accuracy of 100%. The recognition accuracy of the multispectral combination with the MLC algorithm is 90.78%, facilitating precision weeding and the control of water and fertilizer.

3.9.3. Paddy Field Scenario

The unique features of the paddy field ecosystem pose many challenges for intelligent weeding. High humidity can fog sensor optics, reflections on the water surface can interfere with image capture, and inter-plant shading during the rice tillering stage makes weed identification more difficult. J. Yu et al. [163] developed a pneumatic inter-plant mechanical weeding device for rice that combines machine-vision localization with pneumatic propulsion; the device was optimized through kinematic simulation and field trials to balance weed control effectiveness against the seedling injury rate.
A rice seedling row identification method using the enhanced CS-YOLOv5 model and a hierarchical clustering technique (initial clustering at the base, followed by outlier removal at the top) is introduced to support the lateral seedling avoidance operation of the weeding machine in complex paddy field conditions. The system provides real-time recognition at 12 frames per second, with an average angle variation of less than 1° in challenging situations, including floating weed cover, water surface reflections, weed interference, and curved seedling rows, thereby significantly reducing seedling damage during mechanical weeding [164].
The adaptive cruise weeding robot (MW-YOLOv5s model, incorporating MobileViTv3 and WIoU_loss), developed from the enhanced YOLOv5 for rice fields, achieves a seedling recognition accuracy of 90.05% and a processing speed of 19.51 FPS through real-time seedling recognition, navigation path extraction, and autonomous operation control. The field test confirmed that the weed clearance rate reached 82.4%, while the seedling injury rate was 2.8%, thereby fully meeting the agronomic standards for rice [165].
The elevated humidity and reflective water surface of rice fields present difficulties for intelligent weeding. The pneumatic inter-plant weeding apparatus has been refined to balance weeding efficacy against seedling harm. The enhanced CS-YOLOv5 model, combined with hierarchical clustering, enables real-time seedling-row recognition and mitigates seedling damage during mechanical weeding. The MW-YOLOv5s robot achieves a seedling recognition accuracy of 90.05%, a weed clearance rate of 82.4%, and a seedling injury rate of 2.8%, conforming to agronomic standards for rice.
This section reviews applications of intelligent weed control technology across typical agricultural scenarios. In row crop systems, machine vision combined with multi-algorithm frameworks enables dynamic wheat plant tracking (positioning accuracy ≤ 50 mm). Modified Faster R-CNN and YOLOv4 architectures achieve over 90% accuracy in weed identification and targeted spraying in cotton fields, while also reducing pesticide use. In fruit and vegetable orchards, sensor-based obstacle avoidance systems reach 84.6% weed coverage, and lightweight models attain 97.5% detection accuracy. Multi-spectral imaging paired with Maximum Likelihood Classification (MLC) algorithms reaches 90.78% identification accuracy. In rice paddy settings, equipment is specially optimized for high-humidity and reflective conditions: CS-YOLOv5 supports mechanical obstacle avoidance, and MW-YOLOv5s achieves an 82.4% weed removal rate with only 2.8% seedling damage, both meeting operational agricultural standards.

3.10. Analysis of Environmental Protection and Economic Benefits

3.10.1. Environmental Benefits

The widespread adoption of intelligent weed management technology significantly advances ecological protection in agriculture. Traditional chemical weed control relies heavily on herbicides, which can cause environmental problems such as soil compaction, nutrient runoff into water bodies, and habitat destruction for species like bees and earthworms. Intelligent weed management reduces chemical use by accurately identifying and targeting weeds and crops. Research shows that AI-powered precision spraying systems (e.g., See and Spray, H-Sensor) can cut herbicide use by 40–60% compared to traditional methods through targeted application. Spot spraying technology can save between 5% and 90% of herbicides with its on-demand application, significantly lowering soil contamination and pesticide residues in soil and water environments [166].
Tirthankar Mohanty et al. [167] developed the YOLOv5 rice field intelligent weeding robot, which utilizes machine vision and a mechanical–chemical synergistic weeding approach to achieve a 95% weeding rate and a 90% weed burial rate. This also reduced herbicide use by 60% compared to traditional methods, balancing weed control and ecological safety. The composite intelligent inter-row weeding robot, based on YOLOv5, employs a dual mode of ‘mechanical damage + targeted application’, achieving weed clearance rates of 90.03% in corn fields and 94.45% in cabbage fields, while using only 15.28% of the herbicide required by conventional methods [168]. The UAV-based crop–weed classification system combined ExG/NDVI visual features with crop row line geometric features using random forest, achieving 86% accuracy in multi-category weed identification within a sugar beet field. Including near-infrared (NIR) data improved classification accuracy by another 3% [169], providing more reliable visual assistance for precise herbicide application.
Advanced weed management technology enhances agricultural ecological conservation. In contrast to the environmental problems of conventional chemical weeding, it decreases herbicide usage by 40% to 60% through precise identification and targeted application, and spot spraying saves a further 5% to 90% of herbicide. The robots described above use collaborative mechanical–chemical modes to achieve weed clearance rates above 90% while substantially decreasing pesticide residues. The UAV-based system improves recognition precision, further reducing water and soil contamination and chemical residues.

3.10.2. Economic Benefits

When analyzed from a long-term cost–benefit perspective, intelligent weed management systems show significant economic advantages. Despite high initial costs for equipment acquisition (such as drones and edge computing terminals), these costs can be recovered through savings in labor and chemicals in large-scale applications. Research shows that UAV-based advanced weed management systems can cut pesticide and labor costs by 50–80% in the short term using the ‘precise input–efficient output’ approach, and over the long term, increase crop yields by 10% [170].
Over the long term, improvements in crop yields (10–20%) and effective risk management can boost farmers’ net income by 20–35%.
In a small-to-medium-sized farm setting, the ‘AI recognition + precision application’ technological framework can reduce hardware costs to about $1500, cut pesticide use by 50–80%, and lower labor input by 80%, while offsetting the initial investment through increased yield [171]. The economic benefit is especially significant in labor-scarce areas, such as North American farms, where intelligent weeding can reduce operational expenses by 62% each season compared to traditional methods.
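To make the payback logic concrete, the following is a minimal back-of-the-envelope sketch in Python; only the $1500 hardware figure and the percentage savings quoted above come from the cited studies, while the farm size and per-hectare baseline costs are hypothetical assumptions chosen purely for illustration.

```python
# Illustrative payback estimate for an 'AI recognition + precision application' setup.
# Only the $1500 hardware figure and the 50-80% / 80% savings ranges come from the
# text above; farm size and per-hectare baseline costs are hypothetical assumptions.

hardware_cost = 1500.0          # USD, one-off investment
farm_area_ha = 20.0             # hypothetical small-to-medium farm
herbicide_cost_per_ha = 60.0    # hypothetical baseline, USD per season
labor_cost_per_ha = 80.0        # hypothetical baseline, USD per season

herbicide_saving = 0.65         # midpoint of the 50-80% reduction range
labor_saving = 0.80             # 80% reduction in labor input

seasonal_saving = farm_area_ha * (
    herbicide_cost_per_ha * herbicide_saving
    + labor_cost_per_ha * labor_saving
)
payback_seasons = hardware_cost / seasonal_saving
print(f"Seasonal saving: ${seasonal_saving:.0f}; payback in {payback_seasons:.1f} seasons")
```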
In large-scale planting operations, intelligent weeding systems can reduce pesticide costs by 30–50% and boost labor efficiency by four to six times through automation and data-driven decision-making [172]. Although the initial investment in edge computing equipment, including high-precision GPUs, is significant, the total lifecycle cost is over 40% lower than that of traditional chemical weed control, making it a ‘low-consumption and high-efficiency’ pillar of economic growth in the digital transformation of agriculture.
In summary, intelligent weed management demonstrates considerable long-term cost efficiency. Although the initial equipment investment is substantial, it can be recouped through savings in labor and chemicals in large-scale deployments: chemical and labor costs fall by 50–80% in the short term, while yields rise by 10–20% over the long term. The small- and medium-farm framework lowers costs and raises efficiency, seasonal operating expenses in labor-scarce regions fall by 62%, and in large-scale planting pesticide use drops by 30–50%, labor efficiency improves four- to six-fold, and total lifecycle cost falls by more than 40%, making the technology a cornerstone of agriculture's digital economic transformation.

3.11. Existing Technical Challenges

3.11.1. Insufficient Model Generalization Capability

Contemporary deep learning algorithms face significant limitations in their ability to generalize across different scenarios. Variations in soil types, changing light conditions, and shifts in weed community composition across multiple fields often result in a sharp decline in model performance. Research [173] shows that current models heavily depend on training with narrow datasets designed for specific conditions. The absence of large-scale, publicly available benchmark datasets makes these models poorly suited for new environments. Sudden changes in light levels, shading of crop leaves, phenotypic abnormalities caused by pests and diseases, and the diversity of weed species and their growth stages in natural ecosystems can significantly reduce the accuracy of model predictions. For example, the mean average precision (mAP) of Faster R-CNN is only 9.55% on the test set involving complex field scenarios. Meanwhile, even advanced models like the YOLO series often struggle to exceed a mAP of 80%.
The research by Shorewala, S. et al. [174] shows that the limited generalization ability of current models reveals three main issues: (1) Supervised learning requires large amounts of labeled data and needs re-labeling for cross-species transfer, leading to high costs. (2) Semi-supervised learning reduces the need for labeling but performs poorly in complex environments, such as sudden changes in illumination and overlapping plants. For example, the segmentation accuracy for the beet dataset decreases by 0.108 compared to that of the carrot, indicating the model’s lower adaptability to challenging conditions. (3) The model struggles to adapt to new species combinations and needs retraining even after fine-tuning with ResNet50. Additionally, class imbalance caused by the uneven field distribution of weeds persists even after SMOTE resampling, and complex environments significantly decrease the accuracy of weed density estimation and distribution prediction.
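Resampling methods such as SMOTE are one response to this imbalance; another common mitigation is to weight the loss function by inverse class frequency. The snippet below is a minimal PyTorch sketch under assumed (hypothetical) class counts, not the configuration used in [174].

```python
import torch
import torch.nn as nn

# Hypothetical per-class sample counts for a crop/weed classifier
# (crop, grass weed, broadleaf weed); the values are illustrative only.
class_counts = torch.tensor([5000.0, 800.0, 300.0])

# Inverse-frequency weights, normalised so the mean weight is 1.0,
# so that rare weed classes contribute more to the loss.
weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = nn.CrossEntropyLoss(weight=weights)

# Dummy batch: 4 samples, 3 classes.
logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 2])
print(criterion(logits, labels).item())
```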
Jia Chen Yang et al. [175] identified four drawbacks in the current intelligent agricultural vision system: (1) dependence on manually annotated datasets, lack of authentic data in natural settings, and limited robustness in complex field scenarios; (2) inconsistent image quality caused by varying illumination and weather conditions, along with preliminary processing steps that make application more difficult; (3) poor integration of theoretical research with the Internet of Things (IoT) and embedded technology, which limits adaptability to different agricultural environments; and (4) a narrow variety of weeds across different fields and a limited presence of weeds within the field. While small-sample learning reduces reliance on annotations, it shows limited ability to adapt to new species combinations. It also requires dynamic adjustment of technical parameters to match the changing agricultural environment, making generalization more complicated.
In summary, the generalization capability of current deep learning models remains inadequate: variations in soil, illumination, and weed populations cause a precipitous decline in performance. Models depend on limited datasets and lack large-scale benchmark datasets; the mAP of Faster R-CNN in complex scenarios is merely 9.55%, while the YOLO series struggles to surpass 80%. They also exhibit elevated annotation costs, suboptimal performance in complex habitats, difficulty adapting to new species combinations, and class imbalance, and intelligent agricultural vision systems face further problems of data availability and technological integration.

3.11.2. Contradiction Between Real-Time Performance and Accuracy

Achieving a balance between high-precision detection and real-time inference on edge devices remains a key challenge for deploying intelligent weed control technology. While premium edge computing systems, such as the NVIDIA Jetson AGX Orin, can support the YOLOv8L model at 30 FPS, the high hardware costs—running into thousands of dollars—limit their use in large-scale agricultural applications [114]. Although budget-friendly single-chip microcontrollers (MCUs) like Arduino are more affordable, they can only run lightweight models and often operate at a processing speed below 5 FPS, which is insufficient for the real-time needs of high-speed weed control. Multi-sensor data fusion (e.g., RGB, multispectral, and point cloud) requires increased computing power, heightening the tension between “high-precision identification” and “real-time processing”.
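Whether a given detector fits a real-time budget on a particular device is usually established by timing warmed-up inference. The sketch below uses a small placeholder network purely to show the measurement pattern; on an actual edge platform the exported detection model (e.g., a TensorRT or ONNX engine) would take its place.

```python
import time
import torch
import torch.nn as nn

# Placeholder network standing in for a detector; replace with the deployed model.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2),
).eval()

frame = torch.randn(1, 3, 640, 640)      # one 640x640 camera frame

with torch.no_grad():
    for _ in range(10):                   # warm-up iterations
        model(frame)
    start = time.perf_counter()
    runs = 100
    for _ in range(runs):
        model(frame)
    elapsed = time.perf_counter() - start

latency_ms = 1000 * elapsed / runs
print(f"Mean latency: {latency_ms:.1f} ms (~{1000 / latency_ms:.0f} FPS)")
```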
Zhang, J. et al. [176] demonstrate that the “precision-speed” trade-off in deep learning models reveals a triadic contradiction: (1) YOLO series models (e.g., YOLOv7-Fweed, mAP 96.6%) pose challenges for implementation on embedded hardware (e.g., Jetson Nano) due to their complex network architectures, resulting in subpar real-time performance (e.g., YOLOv4-Tiny at 14.7 FPS); (2) High-resolution sensors (e.g., hyperspectral cameras) improve identification precision (ResNet101_v segmentation mIoU reaches 93.9%), but they also increase data volume (processing 1181 images takes several hours), which conflicts with the real-time requirements of drones; (3) Lightweight models (e.g., MobileNetV2) reduce inference time to under 10 ms but lead to about an 8% drop in classification accuracy. Current research often employs model pruning (e.g., YOLOv5s reduces parameters by 40%) or hardware acceleration (e.g., FPGA implementation) to solve the issue. However, neither approach has yet achieved a satisfactory balance between accuracy and speed.
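Magnitude-based pruning of the kind mentioned above can be sketched with PyTorch's built-in utilities; the 40% amount echoes the YOLOv5s figure quoted in the text, but the layer below is a generic placeholder rather than part of any of the cited models.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Placeholder convolutional layer standing in for one backbone layer of a detector.
conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)

# L1 unstructured pruning: zero out the 40% of weights with the smallest magnitude.
prune.l1_unstructured(conv, name="weight", amount=0.4)

sparsity = float((conv.weight == 0).sum()) / conv.weight.numel()
print(f"Weight sparsity after pruning: {sparsity:.0%}")

# Make the pruning permanent by removing the reparameterisation mask.
prune.remove(conv, "weight")
```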
In summary, deploying intelligent weeding technology on edge devices requires reconciling high-precision detection with real-time inference. Premium edge systems sustain high frame rates but are expensive, whereas budget MCUs can only run lightweight models at insufficient speed, and multi-sensor fusion further raises the computational demand. Models therefore face a precision–speed trade-off that existing remedies such as pruning and hardware acceleration have not yet fully resolved.

3.11.3. Complexity of Multimodal Data Processing

Existing fusion algorithms struggle to effectively utilize inter-modal complementary information in farming settings due to the high dimensionality and substantial heterogeneity of multimodal data (e.g., RGB, multispectral, point cloud). This complexity manifests in several ways: modal heterogeneity makes cross-modal associations difficult to mine automatically; traditional fusion-and-classification schemes are outdated and each of the modern alternatives has its own limitations; and practical challenges such as missing data, data scarcity, the shortage of pre-trained models outside the vision–language domain, and poor interpretability further complicate fusion strategy, generalization capability, and practical adaptability [177].
Ke Xu et al. [178] showed that the complexity of multimodal data processing is evident in three challenges: (1) Data acquired from various sensors (RGB, depth, and spectral) is prone to resolution discrepancies, synchronization errors in time and space, and other alignment issues, while sudden changes in lighting and material reflections can easily cause data loss or noise interference. (2) Current fusion techniques mainly use simple methods like feature concatenation and lack adaptive weighting mechanisms for modal complementarity. (3) The challenge of aligning multimodal features is significantly increased in situations involving field occlusion and crop overlap, and the high computational demands of deep learning models on multi-source data make the trade-off between hardware costs and real-time performance even more difficult.
Trong, V. H. et al. [179] further highlighted that multimodal model fusion faces a dual dilemma: (1) The probabilistic outputs of diverse models, such as NASNet and ResNet, need to be combined, requiring the calculation of composite score vectors through Bayesian conditional probability or hierarchical weighting (at either the model or species level), while also dealing with dataset imbalance. (2) Fusion methods, including weighted linear combination and power multiplication, must be adapted to each model’s structural features, with processing time increasing linearly as more models are added, thus demanding a careful balance between accuracy and efficiency in real-world applications.
In summary, the high dimensionality and heterogeneity of agricultural multimodal data prevent current fusion methods from fully leveraging complementary modal information. The data suffer from alignment difficulties, including resolution discrepancies and spatio-temporal synchronization errors, and are easily corrupted by interference; existing fusion techniques remain rudimentary and lack adaptive weighting; occlusion and crop overlap intensify the challenges of feature alignment; and deep learning models place heavy computational demands on multi-source data. Multimodal model fusion also still faces the problems of integrating probabilistic outputs and adapting fusion methods to each model, necessitating a trade-off between accuracy and efficiency.

3.11.4. System Reliability to Be Improved

In real agricultural environments, sensors are often subjected to harsh conditions, including dust, moisture, and mechanical vibrations, which can lead to sensor failures or reduce model performance. Long-term exposure to dust and humidity can contaminate optical components, lowering image or data quality and negatively impacting the accuracy of weed identification. Chang, C. L. et al. [180] have shown that the main reliability challenge of intelligent weed management systems is their limited ability to adapt to environmental changes. The angle of afternoon sunlight and the short days during the season can cause fluctuations in categorization rates (e.g., weed classification accuracy ranges from 90.0% to 95.0%). Chang-Tao, Z. et al. [181] propose a robust improvement: combining the LettWd-YOLOv8l model with GAM and CA mechanisms to enhance feature extraction and localization, achieving a detection accuracy of 99%. The hardware uses an air-powered control system with an STM32 microprocessor to minimize crop damage. It increases data diversity through rotation and noise injection, boosting the model’s adaptability. It also improves the algorithm’s performance in low-light and high-density weed scenarios. The conveyor belt test achieved an 89.27% success rate for lettuce identification and an 83.73% success rate for weeding at a speed of 3.28 km/h, thereby increasing the system’s stability in complex field conditions.
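The rotation and noise-injection augmentation described in [181] can be illustrated with a torchvision-style pipeline; this is a minimal sketch of the general idea, not the authors' actual preprocessing code.

```python
import numpy as np
import torch
from PIL import Image
from torchvision import transforms

class AddGaussianNoise:
    """Inject zero-mean Gaussian noise into a tensor image (illustrative)."""
    def __init__(self, std=0.05):
        self.std = std
    def __call__(self, img):
        return torch.clamp(img + torch.randn_like(img) * self.std, 0.0, 1.0)

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),    # random rotation
    transforms.ColorJitter(brightness=0.3),   # simulate lighting variation
    transforms.ToTensor(),
    AddGaussianNoise(std=0.05),               # noise injection
])

# Dummy image standing in for a field photo; the pipeline returns an augmented tensor.
img = Image.fromarray((np.random.rand(256, 256, 3) * 255).astype("uint8"))
print(augment(img).shape)    # torch.Size([3, 256, 256])
```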
In summary, sensors in real agricultural environments are prone to failure or degraded performance from dust and humidity; contamination of optical components lowers weed-identification accuracy, and environmental changes cause classification rates to fluctuate. Combining the LettWd-YOLOv8l model with enhanced attention mechanisms, hardware optimization, and data augmentation raised detection accuracy to 99%, and the high lettuce-recognition and weeding success rates in the conveyor-belt test demonstrate improved system stability in complex field conditions.

3.12. Future Directions

3.12.1. Generic AI Model Development

The development of generic AI models with cross-task migration and cross-scene adaptation is crucial for overcoming the limitations of current model generalization. By building a universal knowledge representation, such models can capture more generalized feature patterns in agricultural environments and transfer rapidly across different crop scenarios and tasks. Meta-learning, self-supervised learning, and transfer learning paradigms allow models to adapt to new settings with fewer samples, significantly enhancing generalization performance. Studies show that a meta-learning-based generalized AI model can increase cross-field weed identification accuracy by 15–20% compared to traditional models, demonstrating strong adaptability to complex agricultural environments [182].
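One widely used realisation of such few-sample adaptation is transfer learning with a frozen pretrained backbone and a small task-specific head retrained on the new field; the sketch below uses a generic torchvision backbone and a hypothetical four-class setup, and is not the meta-learning model evaluated in [182].

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained backbone and freeze its feature extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False

# Replace the classification head for a new field with, say, four crop/weed classes;
# only this small head is trained on the few samples available from the new scene.
backbone.fc = nn.Linear(backbone.fc.in_features, 4)

trainable = sum(p.numel() for p in backbone.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable}")
```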

3.12.2. Edge Computing and Model Lightweighting

Advancing model lightweighting and edge computing hardware innovation to achieve ‘on-device intelligence’ is the key strategy for resolving the conflict between real-time processing and cost efficiency. At the model design stage, methods such as mixed-precision training and neural architecture search (NAS) can significantly reduce computational demands while maintaining detection accuracy; mixed-precision training alone can reduce model computation by nearly 50% with an accuracy loss of under 3% [183]. Developing high-performance, low-power edge computing chips based on the RISC-V architecture can dramatically improve the cost–performance ratio of edge devices, enabling a balance between high-precision detection and real-time inference while maintaining low hardware costs.
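The mixed-precision idea can be sketched with PyTorch's automatic mixed precision (AMP); the following is a minimal single training step on dummy data, not a benchmark of the figures quoted above.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(
    nn.Conv2d(3, 16, 3), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2),
).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

images = torch.randn(8, 3, 64, 64, device=device)
labels = torch.randint(0, 2, (8,), device=device)

optimizer.zero_grad()
with torch.cuda.amp.autocast(enabled=(device == "cuda")):
    loss = criterion(model(images), labels)   # forward pass runs in FP16 where safe
scaler.scale(loss).backward()                 # scaled backward pass avoids underflow
scaler.step(optimizer)
scaler.update()
print(loss.item())
```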

3.12.3. Multimodal Fusion and Active Sensing

The development of Transformer-based multimodal fusion architectures enables deep semantic alignment of various sensor data and fully harnesses the complementary potential of multimodal inputs. By combining multiple sensor sources, including RGB/Depth cameras, LiDAR, and ultrasonic sensors, along with deep learning techniques such as CNN and Transformer, cross-modal complementarity of spectral features, spatial information, and structural details can be achieved. This effectively addresses issues such as spectral similarity and occlusion between weeds and crops. The use of active light source control (e.g., near-infrared auxiliary lighting) and an adaptive sensing approach (dynamic adjustment of LiDAR scanning frequency and camera exposure settings), combined with real-time optimization of sensing parameters based on environmental feedback, can enhance weed localization accuracy in challenging lighting conditions (such as overcast days or bright sunlight) and unstructured environments (tall crops, soil background variations). The feature fusion method, utilizing the attention mechanism, enables spatio-temporal alignment and weighting of multimodal data, significantly enhancing the system’s ability to perceive weed distribution and make more robust decisions [184].
The active sensing technology enables the weeding machine to adjust its sensor gathering approach based on environmental data: when it detects a dense weed area, the ‘Spectral Sensing–Active Learning–Dynamic Modelling’ system is activated to optimize data collection frequency and feature extraction strategies in real time, thereby improving the accuracy of weed identification and localization [185]. This dynamic sensing mechanism marks a technological shift from ‘passive perception’ to ‘active cognition’ through a closed feedback loop, providing a perceptual solution for precise weed control in complex agricultural environments.
In summary, the Transformer-based multimodal fusion architecture achieves deep semantic alignment of multi-sensor data and, combined with CNNs, leverages cross-modal complementarity to address spectral similarity and occlusion between weeds and crop seedlings. Integrating active light-source control, adaptive sensing, and attention mechanisms enhances weed-localization accuracy under complex lighting and in unstructured environments, while the dynamic sensing mechanism’s closed feedback loop marks the transition from passive perception to active cognition, providing a perception solution for precise weeding in complex farmland.
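The attention-based weighting of modalities described above can be sketched with a single cross-attention layer; the shapes, and the assumption that RGB and auxiliary (depth or multispectral) features have already been projected to a common token dimension, are placeholders for illustration.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Illustrative cross-attention fusion: RGB tokens attend to depth/spectral tokens."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rgb_tokens, aux_tokens):
        # rgb_tokens: (B, N, dim) from an RGB backbone; aux_tokens: (B, M, dim) from a
        # depth or multispectral branch, both assumed pre-projected to the same dim.
        fused, _ = self.attn(query=rgb_tokens, key=aux_tokens, value=aux_tokens)
        return self.norm(rgb_tokens + fused)   # residual fusion of the two streams

rgb = torch.randn(2, 196, 128)   # e.g., 14x14 patch tokens from an RGB image
aux = torch.randn(2, 196, 128)   # matching tokens from a depth/NIR map
print(CrossModalFusion()(rgb, aux).shape)    # torch.Size([2, 196, 128])
```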

3.12.4. Digital Twin and Intelligent Decision Making

The creation of digital twins for farms has become a key technological method for precise planning and informed decision-making in weed control operations. By combining real-time IoT sensor data, crop growth models, and historical operational records, a digital twin can be developed to map the spatial and temporal dynamics of farmland, allowing for the simulation and quantitative assessment of various weeding strategies in a virtual setting. It provides a data-driven scientific foundation for predicting weed growth patterns and evaluating the effects of weed control, supporting better decision-making.
Using the crop–weed competitive growth model to predict weed succession trends, the integration of real-time data from field sensors actively determines the best timing and pathway for weeding, enabling precise and intelligent control of weed removal tasks [186]. Building a multi-source, data-driven virtual model of crop, soil, and environment allows the dynamic simulation of weed growth and the improvement of weed management strategies [187]. In the future, the intelligent weeding decision system will rely on 5IR technology as its foundation, creating an intelligent closed loop of ‘data perception-model reasoning-dynamic execution,’ thereby helping to shift weeding operations from experience-based to data-driven methods [188].
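A crop–weed competitive growth model of the kind referenced in [186] can be illustrated with a discrete two-species competition sketch; all coefficients below are hypothetical and serve only to show how a digital twin might forecast weed pressure over a season.

```python
# Illustrative discrete Lotka-Volterra-style competition between crop and weed biomass.
# All growth rates and competition coefficients are hypothetical.

def simulate(days=60, dt=1.0):
    crop, weed = 0.10, 0.05           # initial relative biomass
    r_crop, r_weed = 0.12, 0.18       # intrinsic growth rates
    k = 1.0                           # shared carrying capacity
    a_cw, a_wc = 0.8, 0.6             # competition coefficients (weed on crop, crop on weed)
    history = []
    for day in range(days):
        d_crop = r_crop * crop * (1 - (crop + a_cw * weed) / k)
        d_weed = r_weed * weed * (1 - (weed + a_wc * crop) / k)
        crop += dt * d_crop
        weed += dt * d_weed
        history.append((day, crop, weed))
    return history

for day, crop, weed in simulate()[::15]:
    print(f"day {day:2d}: crop biomass {crop:.2f}, weed biomass {weed:.2f}")
```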

3.12.5. Construction of a Sustainable Technology System

To promote sustainable agricultural development, integrating intelligent weed control technology with farming practices and establishing a comprehensive weed management system is crucial. Combining intelligent weed control with cover cropping, crop rotation, and fallow farming can create a synergistic effect in controlling weeds. Cover cropping utilizes straw or plastic film to block weed photosynthesis, while crop rotation relies on ecological niche competition among crops to reduce weed growth. The use of intelligent weed control technology can improve weed management efficiency by 30–50% in terms of chemical application and energy use, achieving the goals of “precision weeding” and “ecological protection” [189].
The technical architecture requires integrating multi-dimensional innovation and ecological practices: an integrated perception system is built using wireless sensor networks (WSNs) and uncrewed vehicles to collect data from soil, crops, and weeds; intelligent algorithms, including fuzzy logic and Transformer, are used to improve weed control strategies; and sustainable designs, like solar power and hydrogen hydroponics, are incorporated to reduce system energy use.
Blockchain technology allows the tracking of work data, forming a closed-loop management system characterized by “accurate perception–intelligent decision-making–sustainable implementation” [190,191,192]. This sustainable weed control framework, which integrates technology, agriculture, and ecology, meets modern agricultural needs for operational efficiency while supporting efforts to achieve the United Nations’ Sustainable Development Goals through initiatives like carbon footprint reduction and biodiversity protection.

4. Conclusions

This review systematically discusses the current status and prospects of the field. It carefully outlines how deep learning and machine vision are advancing in agricultural weed control equipment. The research shows that target detection and semantic segmentation models using CNN and Transformer architectures (like the YOLO series and U-Net) achieve over 90% accuracy in weed identification. Combining these models with multispectral imaging and LiDAR sensors allows for precise weed detection even in complex agricultural environments. Advanced weed control systems that utilize deep learning algorithms, combined with lasers and mechanical arms, can reduce herbicide use by 30% to 70% and increase weed control efficiency by more than 40%. This approach offers significant environmental and economic advantages. However, current technology still faces challenges such as limited model generalization, real-time operation issues, and complex multi-modal data processing. Future research should aim to improve AI models, enhance edge computing, develop multi-modal integration, and build sustainable systems to advance smart weeding toward “high-efficiency, precision, and environmentally friendly” solutions. These efforts will support the global agriculture sector’s move toward sustainability.

Author Contributions

Conception: X.G. and J.G.; Methodology: X.G.; Validation: J.G.; Formal Analysis: X.G.; Draft composition: X.G.; Manuscript review and editing: X.G., J.G. and W.A.Q.; Visualization: X.G.; Oversight: J.G.; Project management: J.G.; Funding procurement: J.G. All authors have read and agreed to the published version of the manuscript.

Funding

This project was funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (Grant No. PAPD-2018-87).

Conflicts of Interest

The authors declare that there are no financial or personal conflicts of interest that could affect the research or results of this paper.

References

  1. Chauhan, B.S. Grand Challenges in Weed Management. Front. Agron. 2020, 1, 3. [Google Scholar] [CrossRef]
  2. Ofosu, R.; Agyemang, E.D.; Márton, A.; Pásztor, G.; Taller, J.; Kazinczi, G. Herbicide Resistance: Managing Weeds in a Changing World. Agronomy 2023, 13, 1595. [Google Scholar] [CrossRef]
  3. Sapkota, R.; Stenger, J.; Ostlie, M.; Flores, P. Towards reducing chemical usage for weed control in agriculture using UAS imagery analysis and computer vision techniques. Sci. Rep. 2023, 13, 6548. [Google Scholar] [CrossRef]
  4. Huang, X.; Wang, W.; Li, Z.; Wang, Q.; Zhu, C.; Chen, L. Design method and experiment of machinery for the combined application of seed, fertiliser, and herbicide. Int. J. Agric. Biol. Eng. 2019, 12, 63–71. [Google Scholar] [CrossRef]
  5. Adhinata, F.D.; Wahyono; Sumiharto, R. A comprehensive survey on weed and crop classification using machine learning and deep learning. Artif. Intell. Agric. 2024, 13, 45–63. [Google Scholar] [CrossRef]
  6. Tang, M.; Xiang, S.; Wang, L. A YOLOv3-based framework for weed detection in agricultural fields. In Proceedings of the 2023 8th International Conference on Intelligent Computing and Signal Processing (ICSP), Xi’an, China, 21–23 April 2023; pp. 1980–1986. [Google Scholar]
  7. Li, H.; Travlos, I.; Qi, L.; Kanatas, P.; Wang, P. Optimization of Herbicide Use: Study on Spreading and Evaporation Characteristics of Glyphosate-Organic Silicone Mixture Droplets on Weed Leaves. Agronomy 2019, 9, 547. [Google Scholar] [CrossRef]
  8. Wu, P.; Lei, X.; Zeng, J.; Qi, Y.; Yuan, Q.; Huang, W.; Ma, Z.; Shen, Q.; Lyu, X. Research progress in mechanized and intelligentized pollination technologies for fruit and vegetable crops. Int. J. Agric. Biol. Eng. 2024, 17, 11–21. [Google Scholar] [CrossRef]
  9. Scavo, A.; Mauromicale, G. Integrated Weed Management in Herbaceous Field Crops. Agronomy 2020, 10, 466. [Google Scholar] [CrossRef]
  10. Shaikh, T.A.; Rasool, T.; Lone, F.R. Towards leveraging the role of machine learning and artificial intelligence in precision agriculture and smart farming. Comput. Electron. Agric. 2022, 198, 107119. [Google Scholar] [CrossRef]
  11. Li, Y.; Guo, R.; Li, R.; Ji, R.; Wu, M.; Chen, D.; Han, C.; Han, R.; Liu, Y.; Ruan, Y.; et al. An improved U-net and attention mechanism-based model for sugar beet and weed segmentation. Front. Plant Sci. 2025, 15, 1449514. [Google Scholar] [CrossRef]
  12. Zheng, L.; Long, L.; Zhu, C.; Jia, M.; Chen, P.; Tie, J. A Lightweight Cotton Field Weed Detection Model Enhanced with EfficientNet and Attention Mechanisms. Agronomy 2024, 14, 2649. [Google Scholar] [CrossRef]
  13. Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-unet: Unet-like pure transformer for medical image segmentation. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer Nature: Cham, Switzerland; pp. 205–218. [Google Scholar]
  14. Hasan, A.S.M.M.; Sohel, F.; Diepeveen, D.; Laga, H.; Jones, M.G. A survey of deep learning techniques for weed detection from images. Comput. Electron. Agric. 2021, 184, 106067. [Google Scholar] [CrossRef]
  15. Chen, C.; Zhu, W.; Steibel, J.; Siegford, J.; Han, J.; Norton, T. Classification of drinking and drinker-playing in pigs by a video-based deep learning method. Biosyst. Eng. 2020, 196, 1–14. [Google Scholar] [CrossRef]
  16. Melander, B.; McCollough, M.R.; Poulsen, F. Informing the Operation of Intelligent Automated Intra-Row Weeding Machines in Direct-Sown Sugar Beet (Beta vulgaris L.): Crop Effects of Hoeing and Flaming Across Early Growth Stages, Tool Working Distances, and Intensities. Crop Prot. 2024, 177, 106562. [Google Scholar]
  17. Kverneland. Onyx: Autonomous Weeding Robot; Kverneland Group: Klepp Stasjon, Norway, 2021. [Google Scholar]
  18. Wang, Y.P.; Wang, Y.K. Carbon robotics launches new laser weeding machine capable of autonomously eradicating weeds. Agric. Eng. Technol. 2022, 42, 117–118. [Google Scholar]
  19. Bakker, T.; van Asselt, K.; Bontsema, J.; Müller, J.; van Straten, G. Autonomous navigation using a robot platform in a sugar beet field. Biosyst. Eng. 2011, 109, 357–368. [Google Scholar] [CrossRef]
  20. Zhao, C.J.; Fan, B.B.; Li, J.; Feng, Q.C. Progress, Challenges, and Trends in Agricultural Robotics. Smart Agric. 2023, 5, 1–15. [Google Scholar] [CrossRef]
  21. Liu, J.; Abbas, I.; Noor, R.S. Development of Deep Learning-Based Variable Rate Agrochemical Spraying System for Targeted Weeds Control in Strawberry Crop. Agronomy 2021, 11, 1480. [Google Scholar] [CrossRef]
  22. Ahmed, S.; Qiu, B.; Ahmad, F.; Kong, C.-W.; Xin, H. A State-of-the-Art Analysis of Obstacle Avoidance Methods from the Perspective of an Agricultural Sprayer UAV’s Operation Scenario. Agronomy 2021, 11, 1069. [Google Scholar] [CrossRef]
  23. Zheng, S.; Zhao, X.; Fu, H.; Tan, H.; Zhai, C.; Chen, L. Design and Experimental Evaluation of a Smart Intra-Row Weed Control System for Open-Field Cabbage. Agronomy 2025, 15, 112. [Google Scholar] [CrossRef]
  24. Liu, H.; Zeng, X.; Shen, Y.; Xu, J.; Khan, Z. A Single-Stage Navigation Path Extraction Network for agricultural robots in orchards. Comput. Electron. Agric. 2025, 229, 109687. [Google Scholar] [CrossRef]
  25. Han, X.; Wang, H.; Yuan, T.; Zou, K.; Liao, Q.; Deng, K.; Zhang, Z.; Zhang, C.; Li, W. A rapid segmentation method for weed based on CDM and ExG index. Crop. Prot. 2023, 172, 106321. [Google Scholar] [CrossRef]
  26. Jiang, H.; Zhang, C.; Qiao, Y.; Zhang, Z.; Zhang, W.; Song, C. CNN feature based graph convolutional network for weed and crop recognition in smart farming. Comput. Electron. Agric. 2020, 174, 105450. [Google Scholar] [CrossRef]
  27. Arsa, D.M.S.; Ilyas, T.; Park, S.-H.; Won, O.; Kim, H. Eco-friendly weeding through precise detection of growing points via efficient multi-branch convolutional neural networks. Comput. Electron. Agric. 2023, 209, 107830. [Google Scholar] [CrossRef]
  28. Visentin, F.; Cremasco, S.; Sozzi, M.; Signorini, L.; Signorini, M.; Marinello, F.; Muradore, R. A mixed-autonomous robotic platform for intra-row and inter-row weed removal for precision agriculture. Comput. Electron. Agric. 2023, 214, 108270. [Google Scholar] [CrossRef]
  29. Arakeri, M.P.; Kumar, B.P.V.; Barsaiya, S.; Sairam, H.V. Computer vision based robotic weed control system for precision agriculture. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, 13–16 September 2017; pp. 1201–1205. [Google Scholar]
  30. Zhao, X.; Wang, X.; Li, C.; Fu, H.; Yang, S.; Zhai, C. Cabbage and Weed Identification Based on Machine Learning and Target Spraying System Design. Front. Plant Sci. 2022, 13, 924973. [Google Scholar] [CrossRef]
  31. Jin, X.; Che, J.; Chen, Y. Weed Identification Using Deep Learning and Image Processing in Vegetable Plantation. IEEE Access 2021, 9, 10940–10950. [Google Scholar] [CrossRef]
  32. Tufail, M.; Iqbal, J.; Tiwana, M.I.; Alam, M.S.; Khan, Z.A.; Khan, M.T. Identification of Tobacco Crop Based on Machine Learning for a Precision Agricultural Sprayer. IEEE Access 2021, 9, 23814–23825. [Google Scholar] [CrossRef]
  33. Lin, Y.; Xia, S.; Wang, L.; Qiao, B.; Han, H.; Wang, L.; He, X.; Liu, Y. Multi-task deep convolutional neural network for weed detection and navigation path extraction. Comput. Electron. Agric. 2025, 229, 109776. [Google Scholar] [CrossRef]
  34. Xu, Y.; Bai, Y.; Fu, D.; Cong, X.; Jing, H.; Liu, Z.; Zhou, Y. Multi-species weed detection and variable spraying system for farmland based on W-YOLOv5. Crop. Prot. 2024, 182, 106720. [Google Scholar] [CrossRef]
  35. Karim, J.; Nahiduzzaman; Ahsan, M.; Haider, J. Development of an early detection and automatic targeting system for cotton weeds using an improved lightweight YOLOv8 architecture on an edge device. Knowl. Based Syst. 2024, 300, 112204. [Google Scholar] [CrossRef]
  36. Herterich, N.; Liu, K.; Stein, A. Accelerating weed detection for smart agricultural sprayers using a Neural Processing Unit. Comput. Electron. Agric. 2025, 237, 110608. [Google Scholar] [CrossRef]
  37. Zhang, X.; Wang, Q.; Wang, C.; Wang, X.; Xu, Z.; Lu, C. Guidelines for mechanical weeding: Developing weed control lines through point extraction at maize root zones. Biosyst. Eng. 2024, 248, 321–336. [Google Scholar] [CrossRef]
  38. Xiang, M.; Gao, X.; Wang, G.; Qi, J.; Qu, M.; Ma, Z.; Chen, X.; Zhou, Z.; Song, K. An application oriented all-round intelligent weeding machine with enhanced YOLOv5. Biosyst. Eng. 2024, 248, 269–282. [Google Scholar] [CrossRef]
  39. Utstumo, T.; Urdal, F.; Brevik, A.; Dørum, J.; Netland, J.; Overskeid, Ø.; Berge, T.W.; Gravdahl, J.T. Robotic in-row weed control in vegetables. Comput. Electron. Agric. 2018, 154, 36–45. [Google Scholar] [CrossRef]
  40. Azghadi, M.R.; Olsen, A.; Wood, J.; Saleh, A.; Calvert, B.; Granshaw, T.; Fillols, E.; Philippa, B. Precision robotic spot-spraying: Reducing herbicide use and enhancing environmental outcomes in sugarcane. Comput. Electron. Agric. 2025, 235, 110365. [Google Scholar] [CrossRef]
  41. Sassu, A.; Motta, J.; Deidda, A.; Ghiani, L.; Carlevaro, A.; Garibotto, G.; Gambella, F. Artichoke deep learning detection network for site-specific agrochemicals UAS spraying. Comput. Electron. Agric. 2023, 213, 108185. [Google Scholar] [CrossRef]
  42. Zhao, P.; Chen, J.; Li, J.; Ning, J.; Chang, Y.; Yang, S. Design and Testing of an autonomous laser weeding robot for strawberry fields based on DIN-LW-YOLO. Comput. Electron. Agric. 2025, 229, 109808. [Google Scholar] [CrossRef]
  43. Li, J.L.; Su, W.H.; Hu, R.; Niu, L.T.; Wang, Q. Innovative weeding system with multi-sensor fusion for tomato plant detection and targeted micro-spraying of intra-row weeds. Comput. Electron. Agric. 2025, 237, 110598. [Google Scholar] [CrossRef]
  44. Jin, X.; McCullough, P.E.; Liu, T.; Yang, D.; Zhu, W.; Chen, Y.; Yu, J. An innovative sprayer for weed control in bermudagrass turf based on the herbicide weed control spectrum. Crop Prot. 2023, 170, 106270. [Google Scholar] [CrossRef]
  45. Quan, L.; Jiang, W.; Li, H.; Li, H.; Wang, Q.; Chen, L. Intelligent intra-row robotic weeding system combining deep learning technology with a targeted weeding mode. Biosyst. Eng. 2022, 216, 13–31. [Google Scholar] [CrossRef]
  46. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269. [Google Scholar] [CrossRef]
  47. Tiwari, O.; Goyal, V.; Kumar, P.; Vij, S. An experimental setup for utilizing a convolutional neural network in automated weed detection. In Proceedings of the 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU), Ghaziabad, India, 18–19 April 2019; pp. 1–6. [Google Scholar]
  48. Garibaldi-Márquez, F.; Flores, G.; Mercado-Ravell, D.A.; Ramírez-Pedraza, A.; Valentín-Coronado, L.M. Weed Classification from Natural Corn Field-Multi-Plant Images Based on Shallow and Deep Learning. Sensors 2022, 22, 3021. [Google Scholar] [CrossRef] [PubMed]
  49. Asad, M.H.; Bais, A. Weed detection in canola fields using maximum likelihood classification and deep convolutional neural network. Inf. Process. Agric. 2020, 7, 535–545. [Google Scholar] [CrossRef]
  50. Wang, Y.; Zhang, X.; Ma, G.; Du, X.; Shaheen, N.; Mao, H. Recognition of weeds at asparagus fields using multi-feature fusion and backpropagation neural network. Int. J. Agric. Biol. Eng. 2021, 14, 190–198. [Google Scholar] [CrossRef]
  51. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef]
  52. Redmon, J. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar] [CrossRef]
  53. Mwitta, C.; Rains, G.C.; Prostko, E.P. Autonomous diode laser weeding mobile robot in cotton field using deep learning, visual servoing and finite state machine. Front. Agron. 2024, 6, 1388452. [Google Scholar] [CrossRef]
  54. Abd Ghani, M.A.; Juraimi, A.S.; Su, A.S.M.; Ahmad-Hamdani, M.S.; Islam, A.M.; Motmainna, M. Chemical weed control in direct-seeded rice using drone and mist flow spray technology. Crop Prot. 2024, 184, 106853. [Google Scholar] [CrossRef]
  55. Zhang, H.; Wang, Z.; Guo, Y.; Ma, Y.; Cao, W.; Chen, D.; Yang, S.; Gao, R. Weed Detection in Peanut Fields Based on Machine Vision. Agriculture 2022, 12, 1541. [Google Scholar] [CrossRef]
  56. Peng, H.; Li, Z.; Zhou, Z.; Shao, Y. Weed detection in paddy field using an improved RetinaNet network. Comput. Electron. Agric. 2022, 199, 107179. [Google Scholar] [CrossRef]
  57. Zhang, Z.; Lu, Y.; Yang, M.; Wang, G.; Zhao, Y.; Hu, Y. Optimal training strategy for high-performance detection model of multi-cultivar tea shoots based on deep learning methods. Sci. Hortic. 2024, 328, 112949. [Google Scholar] [CrossRef]
  58. Sharma, A.; Kumar, V.; Longchamps, L. Comparative performance of YOLOv8, YOLOv9, YOLOv10, YOLOv11 and Faster R-CNN models for detection of multiple weed species. Smart Agric. Technol. 2024, 9, 100648. [Google Scholar] [CrossRef]
  59. Pei, H.; Sun, Y.; Huang, H.; Zhang, W.; Sheng, J.; Zhang, Z. Weed Detection in Maize Fields by UAV Images Based on Crop Row Preprocessing and Improved YOLOv4. Agriculture 2022, 12, 975. [Google Scholar] [CrossRef]
  60. Xu, M.; Sun, J.; Cheng, J.; Yao, K.; Wu, X.; Zhou, X. Non-destructive prediction of total soluble solids and titratable acidity in Kyoho grape using hyperspectral imaging and deep learning algorithm. Int. J. Food Sci. Technol. 2023, 58, 9–21. [Google Scholar] [CrossRef]
  61. Wang, J.; Gao, Z.; Zhang, Y.; Zhou, J.; Wu, J.; Li, P. Real-Time Detection and Location of Potted Flowers Based on a ZED Camera and a YOLO V4-Tiny Deep Learning Algorithm. Horticulturae 2022, 8, 21. [Google Scholar] [CrossRef]
  62. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Computer Science, Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015 (MICCAI 2015), Munich, Germany, 5–9 October 2015; Navab, N., Hornegger, J., Wells, W., Frangi, A., Eds.; Springer: Cham, Switzerland, 2015; Volume 9351. [Google Scholar]
  63. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848. [Google Scholar] [CrossRef]
  64. Sun, J.; Yang, K.; He, X.; Luo, Y.; Wu, X.; Shen, J. Beet seedling and weed recognition based on a convolutional neural network and multi-modality images. Multimedia Tools Appl. 2022, 81, 5239–5258. [Google Scholar] [CrossRef]
  65. Yu, H.; Che, M.; Yu, H.; Zhang, J. Development of Weed Detection Method in Soybean Fields Utilising Improved DeepLabv3+ Platform. Agronomy 2022, 12, 2889. [Google Scholar] [CrossRef]
  66. Chen, G.; Fu, L.; Sun, L.; Zhao, B.; He, X. GT-DeepLabv3+: A Deep Learning-Based Algorithm for Field Crop Segmentation. In Proceedings of the 2024 International Symposium on Electrical, Electronics and Information Engineering (ISEEIE), Leicester, UK, 28–30 August 2024; pp. 401–405. [Google Scholar]
  67. Memon, M.S.; Chen, S.; Shen, B.; Liang, R.; Tang, Z.; Wang, S.; Zhou, W.; Memon, N. Automatic visual recognition, detection and classification of weeds in cotton fields based on machine vision. Crop. Prot. 2024, 187, 106966. [Google Scholar] [CrossRef]
  68. Chen, S.; Memon, M.S.; Shen, B.; Guo, J.; Du, Z.; Tang, Z.; Guo, X.; Memon, H. Identification of weeds in cotton fields at various growth stages using color feature techniques. Ital. J. Agron. 2024, 19, 100021. [Google Scholar] [CrossRef]
  69. Islam, N.; Rashid, M.; Wibowo, S.; Xu, C.-Y.; Morshed, A.; Wasimi, S.A.; Moore, S.; Rahman, S.M. Early Weed Detection Using Image Processing and Machine Learning Techniques in an Australian Chilli Farm. Agriculture 2021, 11, 387. [Google Scholar] [CrossRef]
  70. Dang, F.; Chen, D.; Lu, Y.; Li, Z. YOLOWeeds: A novel benchmark of YOLO object detectors for multi-class weed detection in cotton production systems. Comput. Electron. Agric. 2023, 205, 107655. [Google Scholar] [CrossRef]
  71. Jin, X.; Bagavathiannan, M.; McCullough, P.E.; Chen, Y.; Yu, J. A deep learning-based method for classification, detection, and localisation of weeds in turfgrass. Pest Manag. Sci. 2022, 78, 4809–4821. [Google Scholar] [CrossRef]
  72. Marx, C.; Barcikowski, S.; Hustedt, M.; Haferkamp, H.; Rath, T. Design and application of a weed damage model for laser-based weed control. Biosyst. Eng. 2012, 113, 148–157. [Google Scholar] [CrossRef]
  73. Louargant, M.; Jones, G.; Faroux, R.; Paoli, J.-N.; Maillot, T.; Gée, C.; Villette, S. Unsupervised Classification Algorithm for Early Weed Detection in Row-Crops by Combining Spatial and Spectral Information. Remote Sens. 2018, 10, 761. [Google Scholar] [CrossRef]
  74. Celikkan, E.; Kunzmann, T.; Yeskaliyev, Y.; Itzerott, S.; Klein, N.; Herold, M. WeedsGalore: A Multispectral and Multitemporal UAV-Based Dataset for Crop and Weed Segmentation in Agricultural Maize Fields. In Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ, USA, 8 April 2025; pp. 4767–4777. [Google Scholar]
  75. Wang, A.; Li, W.; Men, X.; Gao, B.; Xu, Y.; Wei, X. Vegetation detection based on spectral information and development of a low-cost vegetation sensor for selective spraying. Pest Manag. Sci. 2022, 78, 2467–2476. [Google Scholar] [CrossRef]
  76. Xu, S.; Xu, X.; Zhu, Q.; Meng, Y.; Yang, G.; Feng, H.; Yang, M.; Zhu, Q.; Xue, H.; Wang, B. Monitoring leaf nitrogen content in rice based on information fusion of multi-sensor imagery from UAV. Precis. Agric. 2023, 24, 2327–2349. [Google Scholar] [CrossRef]
  77. Farooq, A.; Hu, J.; Jia, X. Analysis of Spectral Bands and Spatial Resolutions for Weed Classification Via Deep Convolutional Neural Network. IEEE Geosci. Remote. Sens. Lett. 2019, 16, 183–187. [Google Scholar] [CrossRef]
  78. Li, Y.; Al-Sarayreh, M.; Irie, K.; Hackell, D.; Bourdot, G.; Reis, M.M.; Ghamkhar, K. Identification of Weeds Based on Hyperspectral Imaging and Machine Learning. Front. Plant Sci. 2021, 11, 611622. [Google Scholar] [CrossRef]
  79. Yue, J.; Yang, H.; Feng, H.; Han, S.; Zhou, C.; Fu, Y.; Guo, W.; Ma, X.; Qiao, H.; Yang, G. Hyperspectral-to-image transform and CNN transfer learning enhancing soybean LCC estimation. Comput. Electron. Agric. 2023, 211, 108011. [Google Scholar] [CrossRef]
  80. Hidalgo, D.R.; Cortés, B.B.; Bravo, E.C. Dimensionality reduction of hyperspectral images of vegetation and crops based on self-organized maps. Inf. Process. Agric. 2021, 8, 310–327. [Google Scholar] [CrossRef]
  81. Dadashzadeh, M.; Abbaspour-Gilandeh, Y.; Mesri-Gundoshmian, T.; Sabzi, S.; Arribas, J.I. A stereoscopic video computer vision system for weed discrimination in rice field under both natural and controlled light conditions by machine learning. Measurement 2024, 237, 115072. [Google Scholar] [CrossRef]
  82. Tian, Y.; Sun, J.; Zhou, X.; Yao, K.; Tang, N. Detection of soluble solid content in apples based on hyperspectral technology combined with a deep learning algorithm. J. Food Process. Preserv. 2022, 46, e16414. [Google Scholar] [CrossRef]
  83. Huang, F.H.; Liu, Y.H.; Sun, X.; Yang, H. Quality inspection of nectarine based on hyperspectral imaging technology. Syst. Sci. Control. Eng. 2021, 9, 350–357. [Google Scholar] [CrossRef]
  84. Zhu, J.; Jiang, X.; Rong, Y.; Wei, W.; Wu, S.; Jiao, T.; Chen, Q. Label-free detection of trace level zearalenone in corn oil by surface-enhanced Raman spectroscopy (SERS) coupled with deep learning models. Food Chem. 2023, 414, 135705. [Google Scholar] [CrossRef]
  85. Le, V.N.T.; Apopei, B.; Alameh, K. Effective plant discrimination based on the combination of local binary pattern operators and multiclass support vector machine methods. Inf. Process. Agric. 2019, 6, 116–131. [Google Scholar] [CrossRef]
  86. Xu, Y.; Gao, Z.; Khot, L.; Meng, X.; Zhang, Q. A Real-Time Weed Mapping and Precision Herbicide Spraying System for Row Crops. Sensors 2018, 18, 4245. [Google Scholar] [CrossRef]
  87. Festo. 3D Vision-Guided Weeding Manipulator; Festo Didactic: Eatontown, NY, USA, 2022. [Google Scholar]
  88. Zou, K.; Chen, X.; Wang, Y.; Zhang, C.; Zhang, F. A modified U-Net with a specific data argumentation method for semantic segmentation of weed images in the field. Comput. Electron. Agric. 2021, 187, 106242. [Google Scholar] [CrossRef]
  89. Farooq, A.; Jia, X.; Hu, J.; Zhou, J. Multi-Resolution Weed Classification via Convolutional Neural Network and Superpixel Based Local Binary Pattern Using Remote Sensing Images. Remote. Sens. 2019, 11, 1692. [Google Scholar] [CrossRef]
  90. Chen, Y.; Wu, Z.; Zhao, B.; Fan, C.; Shi, S. Weed and Corn Seedling Detection in Field Based on Multi Feature Fusion and Support Vector Machine. Sensors 2020, 21, 212. [Google Scholar] [CrossRef] [PubMed]
  91. Li, L.; Xie, S.; Ning, J.; Chen, Q.; Zhang, Z. Evaluating green tea quality based on multisensor data fusion combining hyperspectral imaging and olfactory visualization systems. J. Sci. Food Agric. 2019, 99, 1787–1794. [Google Scholar] [CrossRef]
  92. Yu, S.; Huang, X.; Wang, L.; Chang, X.; Ren, Y.; Zhang, X.; Wang, Y. Qualitative and quantitative assessment of flavor quality of Chinese soybean paste using multiple sensor technologies combined with chemometrics and a data fusion strategy. Food Chem. 2023, 405, 134859. [Google Scholar] [CrossRef]
  93. Espejo-Garcia, B.; Mylonas, N.; Athanasakos, L.; Fountas, S.; Vasilakoglou, I. Towards weeds identification assistance through transfer learning. Comput. Electron. Agric. 2020, 171, 105306. [Google Scholar] [CrossRef]
  94. Yang, X.; Xiong, B.; Huang, Y.; Xu, C. Cross-modal federated human activity recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 5345–5361. [Google Scholar] [CrossRef] [PubMed]
  95. Liu, C.; Yu, H.; Liu, Y.; Zhang, L.; Li, D.; Zhang, J.; Li, X.; Sui, Y. Prediction of Anthocyanin Content in Purple-Leaf Lettuce Based on Spectral Features and Optimized Extreme Learning Machine Algorithm. Agronomy 2024, 14, 2915. [Google Scholar] [CrossRef]
  96. Zhou, X.; Sun, J.; Tian, Y.; Lu, B.; Hang, Y.; Chen, Q. Hyperspectral technique combined with deep learning algorithm for detection of compound heavy metals in lettuce. Food Chem. 2020, 321, 126503. [Google Scholar] [CrossRef]
  97. Razfar, N.; True, J.; Bassiouny, R.; Venkatesh, V.; Kashef, R. Weed detection in soybean crops using custom lightweight deep learning models. J. Agric. Food Res. 2022, 8, 100308. [Google Scholar] [CrossRef]
  98. Howard, A.; Sandler, M.; Chu, G.; Chen, L.C.; Chen, B.; Tan, M.; Adam, H. Searching for mobilenetv3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324. [Google Scholar]
  99. Fan, X.; Sun, T.; Chai, X.; Zhou, J. YOLO-WDNet: A lightweight and accurate model for weeds detection in cotton field. Comput. Electron. Agric. 2024, 225, 109317. [Google Scholar] [CrossRef]
  100. Kerpauskas, P.; Sirvydas, A.P.; Lazauskas, P.; Vasinauskiene, R.; Tamosiunas, A. Possibilities of weed control by water steam. Agron. Res. 2006, 4, 221–225. [Google Scholar]
  101. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Houlsby, N. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  102. García-Navarrete, O.L.; Camacho-Tamayo, J.H.; Bregon, A.B.; Martín-García, J.; Navas-Gracia, L.M. Performance Analysis of Real-Time Detection Transformer and You Only Look Once Models for Weed Detection in Maize Cultivation. Agronomy 2025, 15, 796. [Google Scholar] [CrossRef]
  103. Ji, W.; Wang, J.; Xu, B.; Zhang, T. Apple Grading Based on Multi-Dimensional View Processing and Deep Learning. Foods 2023, 12, 2117. [Google Scholar] [CrossRef]
  104. Dai, Z. CoAtNet: Marrying Convolution and Attention for All Data Sizes. arXiv 2022, arXiv:2106.04803. [Google Scholar]
  105. Wei, Y.; Feng, Y.; Zu, D.; Zhang, X. A hybrid CNN-transformer network: Accurate and efficient semantic segmentation of crops and weeds on resource-constrained embedded devices. Crop. Prot. 2025, 188, 107018. [Google Scholar] [CrossRef]
  106. Xue, Y.; Jiang, H. Monitoring of Chlorpyrifos Residues in Corn Oil Based on Raman Spectral Deep-Learning Model. Foods 2023, 12, 2402. [Google Scholar] [CrossRef]
  107. Maram, B.; Das, S.; Daniya, T.; Cristin, R. A Framework for Weed Detection in Agricultural Fields Using Image Processing and Machine Learning Algorithms. In Proceedings of the 2022 International Conference on Intelligent Controller and Computing for Smart Power (ICICCSP), Hyderabad, India, 21–23 July 2022; pp. 1–6. [Google Scholar]
  108. Su, D.; Kong, H.; Qiao, Y.; Sukkarieh, S. Data augmentation for deep learning based semantic segmentation and crop-weed classification in agricultural robotics. Comput. Electron. Agric. 2021, 190, 106418. [Google Scholar] [CrossRef]
  109. Chen, D.; Qi, X.; Zheng, Y.; Lu, Y.; Huang, Y.; Li, Z. Synthetic data augmentation by diffusion probabilistic models to enhance weed recognition. Comput. Electron. Agric. 2024, 216, 108517. [Google Scholar] [CrossRef]
  110. Zamani, S.A.; Baleghi, Y. A novel plant arrangement-based image augmentation method for crop, weed, and background segmentation in agricultural field images. Comput. Electron. Agric. 2025, 233, 110151. [Google Scholar] [CrossRef]
  111. Deng, L.; Miao, Z.; Zhao, X.; Yang, S.; Gao, Y.; Zhai, C.; Zhao, C. HAD-YOLO: An Accurate and Effective Weed Detection Model Based on Improved YOLOV5 Network. Agronomy 2025, 15, 57. [Google Scholar] [CrossRef]
  112. Van Der Jeught, S.; Muyshondt, P.G.; Lobato, I. Optimized loss function in deep learning profilometry for improved prediction performance. J. Phys. Photonics 2021, 3, 024014. [Google Scholar] [CrossRef]
  113. Lee, J.; Park, S.; Mo, S.; Ahn, S.; Shin, J. Layer-adaptive sparsity for the magnitude-based pruning. arXiv 2020, arXiv:2010.07611. [Google Scholar]
  114. NVIDIA. TensorRT Optimisation for Weed Detection Models. In NVIDIA Technical Report; NVIDIA: Santa Clara, CA, USA, 2022. [Google Scholar]
  115. Zhou, P.; Zhu, Y.; Jin, C.; Gu, Y.; Kong, Y.; Ou, Y.; Yin, X.; Hao, S. A new training strategy: Coordinating distillation techniques for training lightweight weed detection model. Crop. Prot. 2025, 190, 107124. [Google Scholar] [CrossRef]
  116. Bah, M.D.; Hafiane, A.; Canals, R. Deep learning with unsupervised data labelling for weed detection in line crops in UAV images. Remote Sens. 2018, 10, 1690. [Google Scholar] [CrossRef]
  117. Khan, S.; Tufail, M.; Khan, M.T.; Khan, Z.A.; Anwar, S. Deep learning-based identification system of weeds and crops in strawberry and pea fields for a precision agriculture sprayer. Precis. Agric. 2021, 22, 1711–1727. [Google Scholar] [CrossRef]
  118. Xu, K.; Zhu, Y.; Cao, W.; Jiang, X.; Jiang, Z.; Li, S.; Ni, J. Multi-Modal Deep Learning for Weeds Detection in Wheat Field Based on RGB-D Images. Front. Plant Sci. 2021, 12, 732968. [Google Scholar] [CrossRef]
  119. Ahmad, A.; Saraswat, D.; Aggarwal, V.; Etienne, A.; Hancock, B. Performance of deep learning models for classifying and detecting common weeds in corn and soybean production systems. Comput. Electron. Agric. 2021, 184, 106081. [Google Scholar] [CrossRef]
  120. Urmashev, B.; Buribayev, Z.; Amirgaliyeva, Z.; Ataniyazova, A.; Zhassuzak, M.; Turegali, A. Development of a weed detection system using machine learning and neural network algorithms. East.-Eur. J. Enterp. Technol. 2021, 6, 114. [Google Scholar] [CrossRef]
  121. Niu, W.; Lei, X.; Li, H.; Wu, H.; Hu, F.; Wen, X.; Song, H. YOLOv8-ECFS: A lightweight model for weed species detection in soybean fields. Crop Prot. 2024, 184, 106847. [Google Scholar] [CrossRef]
  122. Khan, Z.; Liu, H.; Shen, Y.; Yang, Z.; Zhang, L.; Yang, F. Optimising precision agriculture: A real-time detection approach for grape vineyard unhealthy leaves using deep learning improved YOLOv7 with feature extraction capabilities. Comput. Electron. Agric. 2025, 231, 109969. [Google Scholar] [CrossRef]
  123. Liu, G.; Di, J.; Wang, Q.; Zhao, Y.; Yang, Y. An Enhanced and Lightweight YOLOv8-Based Model for Accurate Rice Pest Detection. IEEE Access 2025, 13, 91046–91064. [Google Scholar] [CrossRef]
  124. Van Quyen, T.; Kim, M.Y. Feature pyramid network with multi-scale prediction fusion for real-time semantic segmentation. Neurocomputing 2023, 519, 104–113. [Google Scholar] [CrossRef]
  125. Hnewa, M.; Radha, H. Multiscale domain adaptive yolo for cross-domain object detection. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; pp. 3323–3327. [Google Scholar]
  126. Zhou, X.; Sun, J.; Mao, H.; Wu, X.; Zhang, X.; Yang, N. Visualization research of moisture content in leaf lettuce leaves based on WT-PLSR and hyperspectral imaging technology. J. Food Process. Eng. 2018, 41, e12647. [Google Scholar] [CrossRef]
  127. Sun, Y.; Zhang, N.; Han, C.; Chen, Z.; Zhai, X.; Li, Z.; Zheng, K.; Zhu, J.; Wang, X.; Zou, X.; et al. Competitive immunosensor for sensitive and optical anti-interference detection of imidacloprid by surface-enhanced Raman scattering. Food Chem. 2021, 358, 129898. [Google Scholar] [CrossRef] [PubMed]
  128. Kong, X.; Liu, T.; Chen, X.; Lian, P.; Zhai, D.; Li, A.; Yu, J. Exploring the semi-supervised learning for weed detection in wheat. Crop. Prot. 2024, 184, 106823. [Google Scholar] [CrossRef]
  129. Caron, M.; Touvron, H.; Misra, I.; Jégou, H.; Mairal, J.; Bojanowski, P.; Joulin, A. Emerging Properties in Self-Supervised Vision Transformers. arXiv 2021, arXiv:2104.14294. [Google Scholar] [CrossRef]
  130. Yu, X.; Sun, Y.J.; Zhao, W.C.; Gao, J.; Qi, T.; Jin, R.; Liu, W. Structural design of an agricultural omnidirectional mobile platform based on unstructured environments. Intern. Combust. Engine Parts 2017, 3, 1–3. [Google Scholar] [CrossRef]
  131. Zhang, N.F.; Yang, J.F.; Xue, Y.J.; Li, Z.; Huang, X.L. Agricultural Machinery Operation Posture Rapid Detection Method Based on Intelligent Sensor. Appl. Mech. Mater. 2013, 373–375, 936–939. [Google Scholar] [CrossRef]
  132. Yang, L.; Kamata, S.; Hoshino, Y.; Liu, Y.; Tomioka, C. Development of EV Crawler-Type Weeding Robot for Organic Onion. Agriculture 2025, 15, 2. [Google Scholar] [CrossRef]
  133. DJI. T60 Agricultural Drone Technical Specifications; DJI Innovations: Shenzhen, China, 2024. [Google Scholar]
  134. Gebresenbet, G.; Bosona, T.; Patterson, D.; Persson, H.; Fischer, B.; Mandaluniz, N.; Chirici, G.; Zacepins, A.; Komasilovs, V.; Pitulac, T.; et al. A concept for application of integrated digital technologies to enhance future smart agricultural systems. Smart Agric. Technol. 2023, 5, 100255. [Google Scholar] [CrossRef]
  135. Chen, S.; Li, X.; Li, S.; Zhou, Y.; Yang, X. iKalibr: Unified Targetless Spatiotemporal Calibration for Resilient Integrated Inertial Systems. IEEE Trans. Robot. 2025, 41, 1618–1638. [Google Scholar] [CrossRef]
  136. Aygün, S.; Güneş, E.O.; Subaşı, M.A.; Alkan, S. Sensor Fusion for IoT-Based Intelligent Agriculture Systems. In Proceedings of the 2019 the 8th International Conference on Agro-Geoinformatics (Agro-Geoinformatics), Istanbul, Turkey, 16–19 July 2019; pp. 1–5. [Google Scholar]
  137. Zhang, X.; Chen, N.; Li, J.; Chen, Z.; Niyogi, D. Multi-sensor integrated framework and index for agricultural drought monitoring. Remote. Sens. Environ. 2017, 188, 141–163. [Google Scholar] [CrossRef]
  138. Chang, C.-L.; Xie, B.-X.; Chung, S.-C. Mechanical Control with a Deep Learning Method for Precise Weeding on a Farm. Agriculture 2021, 11, 1049. [Google Scholar] [CrossRef]
  139. Parulski, M.L. Design of an Automated Arm for a Robotic Weeding Platform. Master’s Thesis, Case Western Reserve University, Cleveland, OH, USA, 2020. [Google Scholar]
  140. Jia, W.; Tai, K.; Wang, X.; Dong, X.; Ou, M. Design and Simulation of Intra-Row Obstacle Avoidance Shovel-Type Weeding Machine in Orchard. Agriculture 2024, 14, 1124. [Google Scholar] [CrossRef]
  141. Allmendinger, A.; Spaeth, M.; Saile, M.; Peteinatos, G.G.; Gerhards, R. Precision Chemical Weed Management Strategies: A Review and a Design of a New CNN-Based Modular Spot Sprayer. Agronomy 2022, 12, 1620. [Google Scholar] [CrossRef]
  142. Yu, Z.; He, X.; Qi, P.; Wang, Z.; Liu, L.; Han, L.; Huang, Z.; Wang, C. A Static Laser Weeding Device and System Based on Fiber Laser: Development, Experimentation, and Evaluation. Agronomy 2024, 14, 1426. [Google Scholar] [CrossRef]
  143. Xiong, Y.; Ge, Y.; Liang, Y.; Blackmore, S. Development of a prototype robot and fast path-planning algorithm for static laser weeding. Comput. Electron. Agric. 2017, 142, 494–503. [Google Scholar] [CrossRef]
  144. Zhu, H.; Zhang, Y.; Mu, D.; Bai, L.; Zhuang, H.; Li, H. YOLOX-based blue laser weeding robot in corn field. Front. Plant Sci. 2022, 13, 1017803. [Google Scholar] [CrossRef] [PubMed]
  145. Hu, R.; Niu, L.T.; Su, W.H. A novel mechanical-laser collaborative intra-row weeding prototype: Structural design and optimisation, weeding knife simulation, and laser weeding experiment. Front. Plant Sci. 2024, 15, 1469098. [Google Scholar] [CrossRef] [PubMed]
  146. Upadhyay, A.; Singh, K.P.; Jhala, K.; Kumar, M.; Salem, A. Non-chemical weed management: Harnessing flame weeding for effective weed control. Heliyon 2024, 10, e32776. [Google Scholar] [CrossRef]
  147. Dress, A.; Balah, M. Flame is used for weed control in certain crops. J. Soil Sci. Agric. Eng. 2016, 7, 751–756. [Google Scholar]
  148. Qin, W.-C.; Qiu, B.-J.; Xue, X.-Y.; Chen, C.; Xu, Z.-F.; Zhou, Q.-Q. Droplet deposition and control effect of insecticides sprayed with an unmanned aerial vehicle against plant hoppers. Crop. Prot. 2016, 85, 79–88. [Google Scholar] [CrossRef]
  149. Sebastian, S.; Kalita, K. Development and field performance assessment of roller rake weeder. Crop. Prot. 2025, 189, 107051. [Google Scholar] [CrossRef]
  150. Xiang, M.; Wei, S.; Zhang, M.; Li, M.Z. Real-time monitoring system for agricultural machinery operation information, utilizing ARM11 and GNSS technologies. IFAC-Pap. 2016, 49, 121–126. [Google Scholar]
  151. Khadatkar, A.; Sujit, P.; Agrawal, R.; Vishwanath, K.; Sawant, C.; Magar, A.; Chaudhary, V. WeeRo: Design, development and application of a remotely controlled robotic weeder for mechanical weeding in row crops for sustainable crop production. Results Eng. 2025, 26, 105202. [Google Scholar] [CrossRef]
  152. Rachmawati, D.; Gustin, L. Analysis of Dijkstra’s Algorithm and A* Algorithm in Shortest Path Problem. J. Phys. Conf. Ser. 2020, 1566, 012061. [Google Scholar] [CrossRef]
  153. Adiyatov, O.; Varol, H.A. A novel RRT*-based algorithm for motion planning in dynamic environments. In Proceedings of the 2017 IEEE International Conference on Mechatronics and Automation (ICMA), Takamatsu, Japan, 6–9 August 2017; pp. 1416–1421. [Google Scholar]
  154. Levine, S.; Finn, C.; Darrell, T.; Abbeel, P. End-to-end training of deep visuomotor policies. J. Mach. Learn. Res. 2016, 17, 1–40. [Google Scholar]
  155. Shafaei, S.; Loghavi, M.; Kamgar, S. Development and implementation of a human machine interface-assisted digital instrumentation system for high precision measurement of tractor performance parameters. Eng. Agric. Environ. Food 2019, 12, 11–23. [Google Scholar] [CrossRef]
  156. Khramov, V.V. Development of a Human-Machine Interface Based on Hybrid Intelligence. Int. Sci. J. Mod. Inf. Technol. IT Educ. 2020, 16, 893–900. [Google Scholar] [CrossRef]
  157. Li, X.; Lloyd, R.; Ward, S.; Cox, J.; Coutts, S.; Fox, C. Robotic crop row tracking around weeds using cereal-specific features. Comput. Electron. Agric. 2022, 197, 106941. [Google Scholar] [CrossRef]
  158. Fan, X.; Chai, X.; Zhou, J.; Sun, T. Deep learning based weed detection and target spraying robot system at seedling stage of cotton field. Comput. Electron. Agric. 2023, 214, 108317. [Google Scholar] [CrossRef]
  159. Upadhyay, A.; Sunil, G.C.; Zhang, Y.; Koparan, C.; Sun, X. Development and evaluation of a machine vision and deep learning-based smart sprayer system for site-specific weed management in row crops: An edge computing approach. J. Agric. Food Res. 2024, 18, 101331. [Google Scholar] [CrossRef]
  160. Jia, W.; Tai, K.; Dong, X.; Ou, M.; Wang, X. Design of and Experimentation on an Intelligent Intra-Row Obstacle Avoidance and Weeding Machine for Orchards. Agriculture 2025, 15, 947. [Google Scholar] [CrossRef]
  161. Hu, R.; Su, W.-H.; Li, J.-L.; Peng, Y. Real-time lettuce-weed localization and weed severity classification based on lightweight YOLO convolutional neural networks for intelligent intra-row weed control. Comput. Electron. Agric. 2024, 226, 109404. [Google Scholar] [CrossRef]
  162. El Hafyani, M.; Saddik, A.; Hssaisoune, M.; Labbaci, A.; Tairi, A.; Abdelfadel, F.; Bouchaou, L. Weeds detection in a Citrus orchard using multispectral UAV data and machine learning algorithms: A case study from Souss-Massa basin, Morocco. Remote. Sens. Appl. Soc. Environ. 2025, 38, 101553. [Google Scholar] [CrossRef]
  163. Yu, J.; Long, Q.; Hao, G.; Chuang, L.; Ming, T.; Xiaolu, H.; Qinling, C. Design and experiment of pneumatic paddy intra-row weeding device. J. South China Agric. Univ. 2020, 41, 37–49. [Google Scholar]
  164. Wang, S.; Yu, S.; Zhang, W.; Wang, X. The identification of straight-curved rice seedling rows for automatic row avoidance and weeding system. Biosyst. Eng. 2023, 233, 47–62. [Google Scholar] [CrossRef]
  165. Ju, J.; Chen, G.; Lv, Z.; Zhao, M.; Sun, L.; Wang, Z.; Wang, J. Design and experiment of an adaptive cruise weeding robot for paddy fields based on improved YOLOv5. Comput. Electron. Agric. 2024, 219, 108824. [Google Scholar] [CrossRef]
  166. Monteiro, A.; Santos, S. Sustainable Approach to Weed Management: The Role of Precision Weed Management. Agronomy 2022, 12, 118. [Google Scholar] [CrossRef]
  167. Mohanty, T.; Pattanaik, P.; Dash, S.; Tripathy, H.P.; Holderbaum, W. Innovative robotic system guided with YOLOv5-based machine learning framework for efficient herbicide usage in rice (Oryza sativa L.) under precision agriculture. Comput. Electron Agric. 2025, 231, 110032. [Google Scholar] [CrossRef]
  168. Jiang, W.; Quan, L.; Wei, G.; Chang, C.; Geng, T. A conceptual evaluation of a weed control method with post-damage application of herbicides: A composite intelligent intra-row weeding robot. Soil Tillage Res. 2023, 234, 105837. [Google Scholar] [CrossRef]
  169. Lottes, P.; Khanna, R.; Pfeifer, J.; Siegwart, R.; Stachniss, C. UAV-based crop and weed classification for smart farming. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 3024–3031. [Google Scholar]
  170. Sinha, J.P. Aerial robot for smart farming and enhancing farmers’ net benefit. Indian J. Agric. Sci. 2020, 90, 258–267. [Google Scholar] [CrossRef]
  171. Partel, V.; Kakarla, S.C.; Ampatzidis, Y. Development and evaluation of a low-cost and smart technology for precision weed management utilizing artificial intelligence. Comput. Electron. Agric. 2019, 157, 339–350. [Google Scholar] [CrossRef]
  172. Balafoutis, A.T.; Evert, F.K.V.; Fountas, S. Innovative farming technology trends: Economic and environmental effects, labour impact, and adoption readiness. Agronomy 2020, 10, 743. [Google Scholar] [CrossRef]
  173. Li, Y.; Guo, Z.; Shuang, F.; Zhang, M.; Li, X. Key technologies of machine vision for weeding robots: A review and benchmark. Comput. Electron. Agric. 2022, 196, 106880. [Google Scholar] [CrossRef]
  174. Shorewala, S.; Ashfaque, A.; Sidharth, R.; Verma, U. Weed Density and Distribution Estimation for Precision Agriculture Using Semi-Supervised Learning. IEEE Access 2021, 9, 27971–27986. [Google Scholar] [CrossRef]
  175. Yang, J.; Guo, X.; Li, Y.; Marinello, F.; Ercisli, S.; Zhang, Z. A survey of few-shot learning in smart agriculture: Developments, applications, and challenges. Plant Methods 2022, 18, 28. [Google Scholar] [CrossRef]
  176. Zhang, J.; Yu, F.; Zhang, Q.; Wang, M.; Yu, J.; Tan, Y. Advancements of UAV and Deep Learning Technologies for Weed Management in Farmland. Agronomy 2024, 14, 494. [Google Scholar] [CrossRef]
  177. Zhao, F.; Zhang, C.; Geng, B. Deep multimodal data fusion. ACM Comput. Surv. 2024, 56, 1–36. [Google Scholar] [CrossRef]
  178. Xu, K.; Shu, L.; Xie, Q.; Song, M.; Zhu, Y.; Cao, W.; Ni, J. Precision weed detection in wheat fields for agriculture 4.0: A survey of enabling technologies, methods, and research challenges. Comput. Electron. Agric. 2023, 212, 108106. [Google Scholar] [CrossRef]
  179. Trong, V.H.; Gwang-Hyun, Y.; Vu, D.T.; Jin-Young, K. Late fusion of multimodal deep neural networks for weeds classification. Comput. Electron. Agric. 2020, 175, 105506. [Google Scholar] [CrossRef]
  180. Chang, C.-L.; Lin, K.-M. Smart Agricultural Machine with a Computer Vision-Based Weeding and Variable-Rate Irrigation Scheme. Robotics 2018, 7, 38. [Google Scholar] [CrossRef]
  181. Zhao, C.-T.; Wang, R.-F.; Tu, Y.-H.; Pang, X.-X.; Su, W.-H. Automatic lettuce weed detection and classification based on optimised convolutional neural networks for robotic weed control. Agronomy 2024, 14, 2838. [Google Scholar]
  182. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  183. Micikevicius, P.; Narang, S.; Alben, J.; Diamos, G.; Elsen, E.; Garcia, D.; Wu, H. Mixed precision training. arXiv 2017, arXiv:1710.03740. [Google Scholar] [CrossRef]
  184. Xie, D.; Chen, L.; Liu, L.; Chen, L.; Wang, H. Actuators and sensors for application in agricultural robots: A review. Machines 2022, 10, 913. [Google Scholar] [CrossRef]
  185. Pantazi, X.-E.; Moshou, D.; Bravo, C. Active learning system for weed species recognition based on hyperspectral sensing. Biosyst. Eng. 2016, 146, 193–202. [Google Scholar] [CrossRef]
  186. Agrawal, A.; Fischer, M.; Singh, V. Digital Twin: From Concept to Practice. J. Manag. Eng. 2022, 38, 06022001. [Google Scholar] [CrossRef]
  187. Peladarinos, N.; Piromalis, D.; Cheimaras, V.; Tserepas, E.; Munteanu, R.A.; Papageorgas, P. Enhancing smart agriculture by implementing digital twins: A comprehensive review. Sensors 2023, 23, 7128. [Google Scholar] [CrossRef]
  188. San, C.T.; Kakani, V. Smart Precision Weeding in Agriculture Using 5IR Technologies. Electronics 2025, 14, 2517. [Google Scholar] [CrossRef]
  189. Schut, A.G.; Giller, K.E. Sustainable intensification of agriculture in Africa. Front. Agric. Sci. Eng. 2020, 7, 371–375. [Google Scholar] [CrossRef]
  190. Miranda, J.; Ponce, P.; Molina, A.; Wright, P. Sensing, smart and sustainable technologies for Agri-Food 4.0. Comput. Ind. 2019, 108, 21–36. [Google Scholar] [CrossRef]
  191. Karunathilake, E.M.B.M.; Le, A.T.; Heo, S.; Chung, Y.S.; Mansoor, S. The Path to Smart Farming: Innovations and Opportunities in Precision Agriculture. Agriculture 2023, 13, 1593. [Google Scholar] [CrossRef]
  192. Anastasiou, E.; Fountas, S.; Voulgaraki, M.; Psiroukis, V.; Koutsiaras, M.; Kriezi, O.; Lazarou, E.; Vatsanidou, A.; Fu, L.; Di Bartolo, F.; et al. Precision farming technologies for crop protection: A meta-analysis. Smart Agric. Technol. 2023, 5, 100323. [Google Scholar] [CrossRef]
Figure 1. Intelligent weeders: (a) Strawberry field variable pesticide spraying weeder, (b) agricultural spraying drone, (c) intelligent inter-plant weeding robotic system, (d) orchard spraying robot (1. RTK receiver 2. Lidar 3. RealSense camera 4. Nozzle 5. Embedded computer).
Figure 2. Flowchart of the literature synthesis process for this review.
Figure 3. Processes for potted flower detection and location based on YOLO V4-Tiny. (a) YOLO V4-Tiny model. (b) Flowchart of detection and location.
Figure 4. Working platform.
Figure 5. Overall structure of intra-row weeding device. Note: 1. Weeding servo motor. 2. Lifting slide table. 3. Lifting servo motor. 4. Reducer. 5. Traverse servo motor. 6. Step gear. 7. Rack. 8. Slider. 9. Slide rail. 10. Brake resistor. 11. Servo motor driver. 12. C37 controller. 13. Angle encoder. 14. Weeding knife. 15. Profiling wheel. 16. Profiling wheel angle encoder. 17. Driven gear. 18. Driven gear.
Figure 6. Obstacle avoidance workflow: (a) intra-row weeding stage; (b) touching fruit trees stage; (c) obstacle avoidance stage; (d) avoidance completion stage. (Fruit trees are indicated by black dots).
Figure 7. Coaxial reversing weeding mechanism.
Figure 8. Intelligent orchard row avoiding weeder.
Table 1. Utilization of deep learning and machine learning in agricultural weeding equipment.
| Author, Reference | Machine Learning or Deep Learning | Crop and Weed Species | Recognition Accuracy | Weed Control Methods | Weed Control Effectiveness |
|---|---|---|---|---|---|
| Sapkota, R. [3] | Self-developed Crop Row Identification (CRI) computer vision algorithm | Crop: maize; weeds: all green vegetation outside the maize rows | Maize row detection accuracy: 99.5%; weed mapping: 35% of grid cells weed-free | Grid-based weed identification and precise herbicide spraying | Application accuracy: 78.4% actuation accuracy in non-sprayed areas |
| Zheng, S. [23] | Improved YOLOv5s model | Crop: kale; weeds: plants other than the crop | Kale recognition accuracy under motion blur: 96.1% | Electric pendulum mechanical weeding | Field optimum: 96.00% weed control with 1.57% seedling injury |
| Han, X. [25] | Improved YOLO-V4 Tiny model | Crop: maize; weeds: six species, including oxalis and thistle | Average precision: 94.83% | Precision spraying system | Jetson NX embedded deployment: FP32 accuracy 94.8% at 73 ms; INT8 accuracy 84.2% at 13.6 ms |
| Jiang, H. [26] | CNN feature extraction + GCN-ResNet-101 model | Crops: corn, lettuce, radish; weeds: Cirsium, bluegrass, sedge | GCN-ResNet-101 accuracy: corn dataset 97.80%, lettuce dataset 99.37%, radish dataset 98.93% | Targeted herbicide spraying instead of uniform spraying | Spraying only where weeds occur reduces herbicide use |
| Arsa, D.M.S. [27] | Improved encoder–decoder CNN model | Crop: beans; weeds: detection focused on growth points | Weed detection rate: 0.8505; precision: 0.8641 | Laser weeding: spot irradiation of the growth point with a CO2 laser | Environmentally friendly (only the growing point is damaged, no injury to leaves or roots); positioning error ≤ 15 pixels |
| Visentin, F. [28] | Pre-trained ResNet18 model | Crop: Lactuca sativa; weeds: Satureja, Taraxacum, etc. | Recognition accuracy: crop 98%, weeds 97.8% | Robot pulls weeds and deposits them in a recycling bin | Weed control success rate: 92%; efficiency: 10 plants per minute |
| Arakeri, M.P. [29] | Weed identification using artificial neural networks (ANNs) | Crop: onion; weed: Asphodelus fistulosus | Accuracy: 98.64%; sensitivity: 96.83%; specificity: 99.57% | Automatic control of herbicide spraying by the sprayer | Sprayer applies herbicide automatically according to weed density |
| Zhao, X. [30] | Support vector machine (SVM); fusion of skeleton point-to-line ratio with maximum inner circle radius | Crop: cabbage; weeds: Portulaca oleracea, Galinsoga parviflora | Average field recognition accuracy: cabbage 95.0%, weeds 93.5% | Targeted spraying system based on an active light source | Effective spray rate: 92.9% |
| Jin, X. [31] | Weed identification based on CenterNet optimized with a genetic algorithm | Crop: cabbage; weeds: green objects outside the detection boxes | CenterNet detection accuracy: 95.6% | Herbicide sprayed according to the identified weed area | Reduces the sprayed area; weed segmentation accuracy of 92.7% under natural conditions |
| Tufail, M. [32] | (1) SVMs; (2) customized ResNet18 and a Single-Stage Detector (SSD) combined with MobileNet v2 | Crop: tobacco; weeds: common broadleaf and narrowleaf field weeds | SVM: 96%; ResNet18 + SVM: 100%; SSD-MobileNet v2: 81% | Solenoid-valve directional spraying switched by SVM classification results | Reduces pesticide use by 30–40% with an effective spraying rate of 92.9% |
| Lin, Y. [33] | Improved multitask YOLO algorithm | Crop: pineapple; weeds: all weed types in pineapple fields | Detection precision (P): 84.37%; segmentation mIoU: 77.80%; accuracy: 86.35% | Precision herbicide spraying | Indoor experiment: 98% of weeds correctly sprayed, 10.1% survived |
| Xu, Y. [34] | W-YOLOv5 | Crops: wheat, radish; weeds: multiple field weeds | Crop detection mAP@0.5: 87.6%; weed detection: 98% | Precise herbicide spraying based on identification results | Field test: 90.32% spraying accuracy at 4 km/h; flow-rate control error ≤ 2% |
| Karim, M.J. [35] | Improved YOLOv8n | Crop: cotton; weeds: waterhemp, beat bowl flower, etc. | mAP@0.5: 97.6%; precision: 94.5% | Weeds identified and located with lasers, then herbicide applied precisely | Laser positioning accuracy: 92.3% |
| Herterich, N. [36] | Lightweight weed detection model optimized for NPUs | Crops: cotton, sugar beet; weeds: field weeds | mAP@0.5: 97.6%; precision: 94.5% | Real-time NPU-based weed localization and precise herbicide spraying | Pesticide coverage efficiency on target leaves improved by 65.7%; pesticide waste reduced by 10–55% |
| Zhang, X. [37] | Improved YOLOv8-pose model | Crop: corn | Average precision (AP): 0.804; mAP@0.5: 0.957 | Finger-type weeder guided along the route by detection results | Effective weed control rate (EWR): 95.6% |
| Xiang, M. [38] | Enhanced YOLOv5 model | Crop: lettuce; weeds: amaranth, caper | Accuracy: 99.1% | S-shaped flexible weeding knife for intra- and inter-row weeding | Average weed control rate: 96.87% |
| Utstumo, T. [39] | Drop-on-Demand (DoD) droplet precision spraying system | Crop: carrots; weeds: quinoa, fescue, etc. | Vision system detects weeds and sprays precisely to achieve 100% weed control | Machine-vision-based precise spraying plus inter-row mechanical weeding | Field: 5.3 µg glyphosate per drop for 100% weed control; herbicide use reduced by more than 90% versus conventional methods |
| Azghadi, M.R. [40] | MobileNetV2-based image classification model | Crops: sugarcane, green beans; weeds: balsam, grass weeds | MobileNetV2 average weed recognition rate: 95% | Computer-vision-based precision spraying | Herbicide use reduced by 35%; weed control efficiency: 97% |
| Sassu, A.; Motta [41] | Single-stage detection model based on a Feature Pyramid Network (FPN) | Crop: Cynara cardunculus L. | FPN model accuracy: 93.2%; YOLOv5n accuracy: 98.7% | Precision spraying with unmanned aerial systems (UASs) | Pesticide use reduced by 35–65%; foliar coverage efficiency (SR) 91.5–95.7% |
| Zhao, P. [42] | DIN-LW-YOLO, improved from YOLOv8-pose | Crop: strawberries; weeds: field weeds | mAP: 88.5%; weed growth-point mAP: 85.0% | Weed coordinates determined from detections and removed with laser equipment | Field trial: 92.6% weed control, 1.2% seedling injury; 100% weed mortality 3 days after laser treatment |
| Li, J. [43] | Crop–weed classification algorithm based on multi-sensor fusion | Crop: tomatoes; weeds: snakeberry, salvia, etc. | Classification accuracy: 95.43%; spraying efficiency: 99.96% | Real-time herbicide spraying triggered by sensor signals | Weed control rate: 99.96%; average number of sprays: 5.81 |
| Jin, X. [44] | Grid classification using EfficientNet-v2, ResNet, and VGGNet | Crop: dogbane lawn; weeds: annual bluegrass, dandelion | Weed detection F1: EfficientNet-v2 99.6%, ResNet 99.6%, VGGNet 98.5% | Precise herbicide spraying driven by the HWCS neural network detections | Precise spraying is equivalent to blanket spraying |
| Quan, L. [45] | YOLOv3-based object detection model | Crop: maize; weeds: broadleaf and grass weeds | Maize detection accuracy: 98.5%; average weed detection accuracy: 90.9% | Vertical rotary weeding knife guided by detection results | Single-crop conditions: weed control effectiveness 85.91%, crop damage 1.17% |
Note: NVIDIA Jetson Xavier NX (Jetson NX); Single-Precision Floating Point (32-bit) (FP32); 8-bit Integer (INT8); You Only Look Once version 5 miniature (YOLOv5s); You Only Look Once Version 4 (YOLO-V4); Convolutional Neural Network Feature (CNN Feature); Graph Convolutional Network—Residual Network 101 (GCN-ResNet-101); Encoder–Decoder Convolutional Neural Network Model (Encoder–Decoder CNN); Residual Network 18-layer (ResNet18); Center-based Object Detection Network (CenterNet); Mobile Network Version 2 (MobileNet v2); You Only Look Once version 5 (YOLOv5); You Only Look Once version 8 nano (YOLOv8n); Neural Processing Unit (NPU); You Only Look Once version 8 (YOLOv8); Mobile Network Version 2 (MobileNetV2); You Only Look Once version 8—pose estimation (YOLOv8-pose); Dynamic Inference Network—LightWeight YOLO (DIN-LW-YOLO); Efficient Network Version 2 (EfficientNet-v2); Residual Network (ResNet).
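Several of the systems summarized in Table 1 (e.g., Sapkota, R. [3]) couple the detector with grid-based actuation: the camera's field of view is divided into cells or boom strips, every cell containing a confident weed detection is marked, and only the nozzles covering marked cells are opened. The sketch below illustrates that mapping logic only; the image size, the four-nozzle boom geometry, and the function name grid_cells_to_spray are illustrative assumptions, not details of the cited systems.

```python
# Minimal sketch: map weed detections (bounding boxes) to spray grid cells.
# Assumptions: a 640x480 px image and a 4-nozzle boom where each nozzle covers
# one vertical strip of the image. Names and geometry are illustrative only.

IMG_W, IMG_H = 640, 480
N_NOZZLES = 4
STRIP_W = IMG_W / N_NOZZLES

def grid_cells_to_spray(detections, conf_threshold=0.5):
    """detections: list of (x_min, y_min, x_max, y_max, confidence) in pixels.
    Returns the sorted nozzle indices (0..N_NOZZLES-1) that should open."""
    nozzles = set()
    for x_min, _y_min, x_max, _y_max, conf in detections:
        if conf < conf_threshold:
            continue  # ignore low-confidence detections
        # A weed box may span several strips; open every nozzle it touches.
        first = int(max(0, x_min) // STRIP_W)
        last = int(min(IMG_W - 1, x_max) // STRIP_W)
        nozzles.update(range(first, last + 1))
    return sorted(nozzles)

# Example: two weeds, one near the left edge and one spanning the middle strips.
boxes = [(20, 100, 90, 180, 0.83), (300, 200, 420, 310, 0.91)]
print(grid_cells_to_spray(boxes))  # -> [0, 1, 2]
```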
Table 2. Performance comparison of typical CNN models in a weed identification task.
| Model Architecture | Application Scenarios | Recognition Accuracy | Data Sources |
|---|---|---|---|
| VGG-16 | Strawberry field variable spraying | 93% | [21] |
| Inception V2 | Drone weed detection | 98% | [47] |
| Xception | Early classification of cornfields | 97.83% | [48] |
| ResNet-50 + SegNet | Semantic segmentation of rapeseed fields | mIoU 0.8288 | [49] |
Table 3. Performance comparison of typical target detection models in weed detection tasks.
| Model Architecture | Application Scenarios | mAP@0.5 | Inference Time | Data Sources |
|---|---|---|---|---|
| YOLOv4-Tiny | Weed detection in peanut fields | 94.54% | 73 ms (FP32) | [55] |
| RetinaNet | Weed detection in a rice field | 94.10% | 41.1 ms | [56] |
| Faster R-CNN | Weed detection in farmland | 78.20% | 218 ms | [58] |
| YOLOv8 | Weed detection in farmland | 85.60% | 22.3 ms | |
| YOLOv9 | Weed detection in farmland | 93.50% | 18.7 ms | |
| YOLOv11 | Weed detection in farmland | 89.10% | 13.5 ms | |
| Improved YOLOv4 | Weed detection in corn fields | 86.89% | | [59] |
| YOLOv7 | Multi-species tea bud detection | 87.10% | | [60] |
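The inference times in Table 3 depend as much on the measurement protocol and hardware as on the architecture, so like-for-like comparison requires a fixed procedure: warm-up passes first, then latency averaged over many runs on the same input size. A minimal, framework-agnostic timing sketch follows; fake_model is a stand-in for any detector callable, and the warm-up and run counts are arbitrary choices.

```python
# Minimal latency/FPS benchmarking sketch (hardware- and model-agnostic).
# `model` and `image` are placeholders for any detector and preprocessed input;
# the pattern (warm-up runs, then averaging) is what is being illustrated.
# For GPU models, also synchronize the device (e.g., torch.cuda.synchronize())
# before and after the timed loop so queued kernels are included in the timing.
import time

def benchmark(model, image, warmup=10, runs=100):
    for _ in range(warmup):          # warm-up: caches, JIT, GPU clock ramp-up
        model(image)
    start = time.perf_counter()
    for _ in range(runs):
        model(image)
    elapsed = time.perf_counter() - start
    latency_ms = 1000.0 * elapsed / runs
    return latency_ms, 1000.0 / latency_ms   # per-image latency and FPS

# Example with a dummy "model" that simply sleeps for ~20 ms per call:
fake_model = lambda img: time.sleep(0.02)
lat, fps = benchmark(fake_model, image=None, warmup=2, runs=10)
print(f"{lat:.1f} ms per image, {fps:.1f} FPS")
```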
Table 4. YOLOv5, v8, v10, v11, v12 model identification results.
| Model | Training Weights | P | mAP50 | mAP0.5–0.9 | F1 | Time (h) | Size (MB) |
|---|---|---|---|---|---|---|---|
| YOLOv5 | Pre-trained weights | 81% | 86% | 64% | 83% | 81% | 14.10 |
| YOLOv5 | Pre-trained weights after training | 86% | 90% | 61% | 83% | 86% | 13.70 |
| YOLOv8 | Pre-trained weights | 76% | 88% | 66% | 72% | 76% | 6.23 |
| YOLOv8 | Pre-trained weights after training | 78% | 95% | 76% | 76% | 78% | 5.98 |
| YOLOv10 | Pre-trained weights | 82% | 87% | 65% | 59% | 82% | 31.40 |
| YOLOv10 | Pre-trained weights after training | 87% | 92% | 73% | 75% | 87% | 15.80 |
| YOLOv11 | Pre-trained weights | 81% | 88% | 66% | 84% | 81% | 5.35 |
| YOLOv11 | Pre-trained weights after training | 82% | 88% | 66% | 84% | 82% | 5.22 |
| YOLOv12 | Pre-trained weights | 82% | 87% | 67% | 86% | 82% | 5.25 |
| YOLOv12 | Pre-trained weights after training | 87% | 88% | 67% | 84% | 87% | 5.23 |
Note: Time—training time; size—size of model after training.
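Table 4 contrasts each YOLO version initialized with generic pre-trained weights against the same model after further training on the weed dataset. A typical way to reproduce such a comparison, assuming the Ultralytics package is available, is sketched below; the dataset file weeds.yaml and the hyperparameters are placeholders rather than the settings used to generate Table 4.

```python
# Illustrative fine-tuning workflow with the Ultralytics YOLO API
# (assumed installed via `pip install ultralytics`). "weeds.yaml" is a
# hypothetical dataset description file; epochs/imgsz are placeholder values.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                             # COCO pre-trained weights
model.train(data="weeds.yaml", epochs=100, imgsz=640)  # fine-tune on weed images
metrics = model.val()                                  # mAP50, mAP50-95, P, R
model.export(format="onnx")                            # export for embedded use
```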
Table 5. Performance comparison of typical semantic segmentation models for weed detection in agricultural fields.
| Model Architecture | Application Scenarios | mIoU | Key Technologies | Data Sources |
|---|---|---|---|---|
| U-Net | Beet field | 88.59% | Skip-connection feature fusion | [11] |
| Improved R-FCN | Beet field | 89% | Cross-scale feature fusion | [64] |
| Swin-DeepLab | Soybean field | 91.53% | Swin Transformer + CBAM attention | [65] |
| GT-DeepLabv3+ | Rice paddy | 64.91% | MobileNet v2 + GS-ASPP | [66] |
| Sub-area machine vision | Cotton field | 89.4% | Positional features + morphological analysis | [67] |
| Color feature segmentation | Cotton field | 92.9% | B–R standard deviation threshold + Otsu segmentation | [68] |
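The mIoU values reported in Table 5 are the mean, over classes, of the intersection-over-union between predicted and ground-truth label maps. A minimal NumPy sketch of the metric follows; the three-class toy example (soil, crop, weed) is illustrative.

```python
# Minimal mIoU computation from predicted and ground-truth label maps (NumPy).
import numpy as np

def mean_iou(pred, target, num_classes):
    """pred, target: integer label arrays of identical shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy example with 3 classes (0 = soil, 1 = crop, 2 = weed):
gt = np.array([[0, 1, 1], [2, 2, 0]])
pr = np.array([[0, 1, 2], [2, 2, 0]])
print(round(mean_iou(pr, gt, 3), 3))       # -> 0.722
```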
Table 6. Comparison of the performance of RGB imaging and multimodal fusion techniques in weed identification.
| Technical Approach | Application Scenarios | Key Indicator | Data Sources |
|---|---|---|---|
| DoD robotics | Carrot field | 90% reduction in herbicide use | [39] |
| RF + RGB | Chili field | Detection accuracy 96% | [69] |
| SVM + RGB | Chili field | Detection accuracy 94% | [69] |
| YOLOv5 + RGB | Cotton field | mAP 0.82 | [70] |
| RGB-D + three-channel network | Wheat field | Gramineae mAP 36.1%; broadleaf weeds mAP 42.9% | [72] |
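The multimodal entries in Table 6, such as the RGB-D three-channel network [72], combine colour and depth (or other) channels either at the feature level or at the decision level (see also ref. [179]). The PyTorch sketch below shows the simplest feature-level variant: two small encoder branches whose outputs are concatenated before a shared classification head. The layer sizes and the two-class head are illustrative assumptions, not the architectures of the cited studies.

```python
# Minimal feature-level fusion sketch in PyTorch: separate RGB and depth
# branches, concatenated features, and a shared crop/weed classification head.
import torch
import torch.nn as nn

class TwoStreamFusionNet(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        def branch(in_ch):          # tiny CNN encoder per modality
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.rgb_branch = branch(3)     # RGB image (3 channels)
        self.depth_branch = branch(1)   # depth map (1 channel)
        self.head = nn.Linear(16 + 16, num_classes)

    def forward(self, rgb, depth):
        feats = torch.cat([self.rgb_branch(rgb), self.depth_branch(depth)], dim=1)
        return self.head(feats)

net = TwoStreamFusionNet()
logits = net(torch.randn(4, 3, 64, 64), torch.randn(4, 1, 64, 64))
print(logits.shape)   # torch.Size([4, 2])
```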
Table 7. Typical applications of multispectral imaging for weed detection in agricultural fields.
| Technical Approach | Application Scenarios | Key Indicator | Data Sources |
|---|---|---|---|
| Spectral–spatial fusion SVM classification | Corn field | Detection accuracy 89% | [73] |
| DeepLabv3 + probabilistic modelling | Corn field | Multispectral mIoU 82.90% | [74] |
| Blue LED fluorescence sensor | Generic scenarios | Vegetation detection accuracy 100% | [75] |
| RGB–multispectral fusion GPR monitoring | Rice paddy | 20% improvement in LNC estimation accuracy | [76] |
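Multispectral pipelines such as those in Table 7 commonly start from vegetation indices, most often the normalized difference vegetation index NDVI = (NIR - Red)/(NIR + Red), to separate vegetation from soil before crop/weed classification. A NumPy sketch follows; the reflectance values and the 0.3 threshold are illustrative, not taken from the cited studies.

```python
# NDVI computation from co-registered red and near-infrared reflectance bands.
# `red` and `nir` stand for arrays from a multispectral camera; values and the
# 0.3 vegetation threshold are illustrative only.
import numpy as np

def ndvi(nir, red, eps=1e-6):
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)   # NDVI in [-1, 1]

nir = np.array([[0.60, 0.10], [0.55, 0.08]])
red = np.array([[0.10, 0.09], [0.12, 0.07]])
index = ndvi(nir, red)
vegetation_mask = index > 0.3                # crude soil/vegetation split
print(np.round(index, 2))
print(vegetation_mask)
```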
Table 8. Typical applications of hyperspectral imaging in agricultural inspection.
| Technical Approach | Application Scenarios | Key Indicator | Data Sources |
|---|---|---|---|
| 61-band hyperspectral + CNN | Weed classification | Accuracy better than with RGB data | [77] |
| Ultra-pixel spectroscopy + MLP | Classification of weeds in rangelands | Accuracy 89.1% | [78] |
| HIT + CNN transfer learning | Soybean chlorophyll estimation | R² = 0.78 | [79] |
| SOM + RBF classifiers | Crop vegetation classification | Accuracy 88.5% | [80] |
| HSI + SWAE + GWO-SVR | Apple SSC prediction | Rp² = 0.9436, RMSEP = 0.1328 | [82] |
| Visible–near-infrared hyperspectral imaging | Detection of external defects in nectarines | Model accuracy: PLS 89.73%, LS-SVM 94.45%, ELM 88.62% | [83] |
| SERS + CNN | Corn oil toxin detection | Label-free detection of trace ZEN | [84] |
Table 9. Performance comparison of typical lightweight models for weed detection in agricultural fields.
| Model Architecture | Core Technical Features | Application Scenarios | mAP@0.5 | Number of Parameters | Inference Speed | Data Sources |
|---|---|---|---|---|---|---|
| YOLOv8n + CBAM | Attention module + lightweight components | Cotton field | 97.6% | | 13.84 FPS | [35] |
| YOLOv8s + MobileNetV3 | Depthwise separable convolution + attention mechanism | Cotton field | 82% | 38% of the original model | | [52] |
| EM-YOLOv4-Tiny | Mixed-precision quantization + multi-scale detection | Peanut field | 90% | 40% reduction | 73 ms (FP32) | [55] |
| 5-layer CNN | Customized convolutional architecture + model quantization | Generic scenarios | 95.12% | 0.012 GB | 16.754 ms | [97] |
| YOLO-WDNet | Feature fusion optimization + loss function design | Cotton field | 97.8% | | | [99] |
| YOLOv8-ECFS | EfficientNet + Focal_SIoU + CA | Soybean field | 95.0% | GFLOPs ↓ 11.1 G | | [100] |
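Most of the lightweight backbones in Table 9 cut parameters by replacing standard convolutions with depthwise separable convolutions, as in the MobileNetV3-based variant [52]: a per-channel 3 × 3 depthwise convolution followed by a 1 × 1 pointwise convolution. The PyTorch sketch below shows the block and compares its parameter count with a standard convolution; channel sizes are illustrative.

```python
# Depthwise separable convolution sketch (PyTorch): a depthwise 3x3 convolution
# followed by a pointwise 1x1 convolution, as used in MobileNet-style backbones.
import torch
import torch.nn as nn

def separable_conv(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),  # depthwise
        nn.Conv2d(in_ch, out_ch, 1),                          # pointwise
        nn.BatchNorm2d(out_ch), nn.ReLU())

def n_params(module):
    return sum(p.numel() for p in module.parameters())

standard = nn.Conv2d(64, 128, 3, padding=1)
separable = separable_conv(64, 128)
print(n_params(standard), n_params(separable))   # separable uses far fewer weights
x = torch.randn(1, 64, 32, 32)
print(separable(x).shape)                        # torch.Size([1, 128, 32, 32])
```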
Table 10. Performance comparison of weed detection models based on Transformer architecture.
| Model Architecture | Application Scenarios | Core Technology | mAP@0.5 | Inference Speed | Number of Parameters | Data Sources |
|---|---|---|---|---|---|---|
| MobileNetV2 + Transformer | Sugarcane field | Chunk sorting + hardware acceleration | | 18.6 FPS | 35% reduction | [40] |
| Swin Transformer + UNet | Soybean field | Sliding-window attention + CBAM | 91.53% | | 40% reduction | [65] |
| ViT-Base | Weed classification | Image patch sequence + self-attention | 92.1% | 28.7 FPS | 65 M | [101] |
| RT-DETR-l | Corn field | Hybrid query strategy + NMS-free detection | 91.2% | 42.3 FPS | 32.5 M | [102] |
Table 11. Deep learning model improvements.
| Authors, Reference | Model | Model Improvements | Precision |
|---|---|---|---|
| Bah, M.D. [116] | Improved ResNet18 convolutional neural network | (1) Improved ResNet18 CNN combined with transfer learning; (2) unsupervised data annotation: training data generated automatically by crop-row detection and superpixel segmentation | Spinach field: AUC 94.34% with unsupervised labelling, 95.70% with supervised labelling; bean field: AUC 88.73% unsupervised, 94.84% supervised |
| Khan, S. [117] | Improved Faster R-CNN | Replaced VGG16 with ResNet-101 and increased the number of anchors from 9 to 16 | Average weed identification accuracy: 95.3%; overall average accuracy: 94.73% |
| Xu, K. [118] | Three-channel deep learning network based on RGB-D images | (1) Recode single-channel depth images into three-channel PHA images; (2) integrate multimodal information through feature-level and decision-level fusion | Grass weeds mAP 36.1%; broadleaf weeds mAP 42.9%; overall detection accuracy 89.3% |
| Ahmad, A. [119] | VGG16, ResNet50, InceptionV3, and YOLOv3 models | (1) Image classification: transfer learning in the Keras and PyTorch frameworks with the output layer replaced by a 4-node softmax layer; (2) object detection: Darknet-53 feature extractor with images resized to 416 × 416 pixels for training | Image classification: VGG16 and ResNet50 97.80%, InceptionV3 96.70%; object detection mAP: 54.3% |
| Mashev, B. [120] | Improved YOLOv5 | ECA-Net attention module introduced into YOLOv5 to enhance inter-channel feature interaction and improve small-target detection | Per-class accuracy: 82–92%; mAP@0.5: 78.1% |
| Jiang, H. [121] | YOLOv8-ECFS | (1) Backbone replaced with EfficientNet-B0, introducing the MBConv module and SENet attention; (2) Focal_SIoU loss function; (3) coordinate attention (CA) module added after the C2f module in the neck | Overall accuracy: 92.2%; clover and alfalfa (CHW): mAP improved by 5.2%; soybean seedlings: mAP improved by 0.8% |
| Khan, Z. [122] | Improved YOLOv7 algorithm | (1) Lightweight convolutional layers integrated into the backbone to enhance feature extraction; (2) squeeze-and-excitation (SE) and batch-normalization blocks to integrate spatial and channel information; (3) adaptive gradient optimizer combined with Lasso regularization to improve generalization; (4) ELU and GELU activation functions to improve convergence and non-linear expression | Compared with the original YOLOv7: precision +3.2%, recall +6.2%, mAP@0.5 +1.6%, mAP@0.5:0.95 +7.1%, F1-score +5% |
Note: Residual Network 18-layer Convolutional Neural Network (ResNet18 CNN); Area Under the Curve (AUC); Faster Region-Based Convolutional Neural Network (Faster R-CNN); Visual Geometry Group 16-layer (VGG16); Residual Network 101-layer (ResNet-101); Red-Green-Blue Depth Images (RGB-D Images); Phase images (PHA images); Residual Network 50-layer (ResNet50); Inception version 3 (InceptionV3); You Only Look Once version 3 models (YOLOv3 models); You Only Look Once version 5 (YOLOv5); Efficient Channel Attention Network (ECA-Net); YOLOv8 with Efficient Channel and Feature Selection (YOLOv8-ECFS); EfficientNet version B0 (EfficientNet-B0); Mobile Inverted Bottleneck Convolution module (MBConv module); Squeeze-and-Excitation Network attention mechanism (SENet); Focal Scale-invariant intersection over union loss function (Focal_SIoU loss); Cross Stage Partial level with 2 convolutions and fusion module (C2f module); You Only Look Once version 7 algorithm (YOLOv7 algorithm); Exponential Linear Unit (ELU); Gaussian Error Linear Unit (GELU).
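Two of the improvements in Table 11, the SENet attention inside YOLOv8-ECFS [121] and the SE blocks added to the improved YOLOv7 [122], are instances of squeeze-and-excitation channel attention. The generic block is sketched below in PyTorch; the reduction ratio r = 16 is a common default and not necessarily the value used in the cited models.

```python
# Generic squeeze-and-excitation (SE) channel-attention block in PyTorch.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, r=16):
        super().__init__()
        hidden = max(channels // r, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden), nn.ReLU(),
            nn.Linear(hidden, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))            # squeeze: global average pooling
        w = self.fc(w).view(b, c, 1, 1)   # excitation: per-channel weights
        return x * w                      # recalibrate the feature maps

x = torch.randn(2, 64, 40, 40)
print(SEBlock(64)(x).shape)               # torch.Size([2, 64, 40, 40])
```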
Table 12. Traditional weed control methods.
| Authors, Reference | Weed Control Mechanism | Weeding Method | Crops and Weeds | Weed Control Effect |
|---|---|---|---|---|
| Melander, B. [16] | Mechanical weeding simulator: 3 cm side blades, pneumatically driven; flame weeding simulator: single nozzle at 7.5 cm height, 50° angle, propane flow 1.0 L/min | Mechanical weeding with side blades; flame weeding with a propane flame applicator | Crop: direct-seeded sugar beet; weeds: inter-row weeds | Mechanical hoe: weeds removed within 1 cm of the crop centre; flame weeding propane use: ≤0.74 kg/km at the two-leaf stage, ≤1.49 kg/km at the four-leaf stage, ≤5.95 kg/km at the six-leaf stage |
| Mwitta, C. [53] | Mobile platform: Ackermann-steered four-wheeled robot; robotic arm: 2D Cartesian arm; laser module: pan–tilt mechanism driven by servo motors | Visual servoing locates weed stems and PID control regulates robot position; the laser beam is jittered (10° swing) to enlarge the contact area, with 2 s per shot (10 J) | Crop: cotton; weed: Palmer amaranth (Amaranthus palmeri) | Without tracking: 47% weed kill rate at 9.5 s per plant; tilting the laser 10° downwards kills weeds effectively |
| Abd Ghani [54] | UAV: multi-rotor agricultural drone with GPS positioning; atomizing sprayer: backpack-mounted electric sprayer | Drone spraying at 18 km/h flight speed with a 2 m swath and mist nozzle; sprayer flow rate 0.05–2.64 L/min | Crop: direct-seeded rice; weeds: grasses (barnyardgrass, miller's weed) | Best weed control efficiency: 96.26% (Novlect treatment); rice yield 38% higher in treated areas than in untreated areas |
| Marx, C. [72] | Hardware: CO2 laser system with a coaxial HeNe positioning laser; maximum energy density 5.00 J/mm2 | Directed CO2 laser irradiation with dynamic adjustment of laser energy and spot diameter | Weeds: monocotyledonous (barnyard grass) and dicotyledonous (Amaranthus antiquus) | Lethal dose (95% success rate): mortality above 90% at laser energies ≥ 54 J |
| Kerpauskas, P. [100] | Experimental tractor unit: fourth-generation mobile water-vapour weeder | Water-vapour treatment: temperature-controlled steam contacts the weed in short bursts of 1–2 s | Crops: onions, barley, maize; weeds: other annual weeds | Weed shoot destruction up to 98%; weed dry weight reduced by 40–57% |
| Jia, W. [140] | Mechanical structure: parallelogram linkage with hydraulic cylinder drive; sensing system: obstacle-detection rod and displacement sensor | Shovel-type mechanical weeding (spade-type weeding shovel, 3 cm working depth) with hydraulic automatic obstacle avoidance | Crop: grapevines; weeds: weeds between vines | 86.8% reduction in weed cover after optimization |
| Sebastian, S. [149] | Main structure: handle, rollers with V-shaped spikes, fixed rake, and float; total mass 5.4 kg; operating speed 1.9–2.1 km/h | Hand-operated push–pull mechanical weeding with roller spikes and fixed rake working in unison | Crop: rice; weeds: row weeds | Weeding efficiency: 88–95%; effective field capacity: 0.038–0.04 ha/h; power requirement: 0.032–0.036 hp |
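For the laser-based entries in Table 12, effectiveness is governed by the energy delivered to the weed: Marx, C. [72] reports mortality above 90% at doses of at least 54 J, and the prototype of Mwitta, C. [53] delivers 10 J over a 2 s shot (an average optical power of about 5 W). The relation between dose, laser power, and dwell time per weed is simply t = E/P; the short sketch below tabulates it for a few assumed power levels, which are illustrative and not specifications of the cited systems.

```python
# Relating laser dose, optical power, and dwell time per weed: t = E / P.
# The 54 J dose comes from Table 12 (Marx, C. [72]); the power levels below are
# illustrative assumptions, not specifications of the cited systems.
def dwell_time_s(energy_j: float, power_w: float) -> float:
    return energy_j / power_w

LETHAL_DOSE_J = 54.0
for power_w in (5.0, 25.0, 100.0):   # hypothetical laser output powers
    print(f"{power_w:5.1f} W -> {dwell_time_s(LETHAL_DOSE_J, power_w):5.2f} s per weed")
```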
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
