Article

A Review of Key Technologies and Recent Advances in Intelligent Fruit-Picking Robots

1 School of Mechanical Engineering, Chengdu University, Chengdu 610106, China
2 Key Laboratory of Agricultural Equipment Technology for Hilly and Mountainous Areas, Ministry of Agriculture and Rural Affairs, Chengdu 610066, China
3 School of Electronic Information and Electrical Engineering, Chengdu University, Chengdu 610106, China
4 Sichuan Cuisine Industrialization Engineering Research Center of Sichuan Higher Education Institutions, Sichuan Tourism University, Chengdu 610100, China
5 China Electric Power Planning & Engineering Institute, Beijing 100120, China
* Author to whom correspondence should be addressed.
Horticulturae 2026, 12(2), 158; https://doi.org/10.3390/horticulturae12020158
Submission received: 14 December 2025 / Revised: 22 January 2026 / Accepted: 23 January 2026 / Published: 30 January 2026

Abstract

Intelligent fruit-picking robots have emerged as a promising solution to labor shortages and the increasing costs of manual harvesting. This review provides a systematic and critical overview of recent advances in three core domains: (i) vision-based fruit and peduncle detection, (ii) motion planning and obstacle-aware navigation, and (iii) robotic manipulation technologies for diverse fruit types. We summarize the evolution of deep learning-based perception models, highlighting improvements in occlusion robustness, 3D localization accuracy, and real-time performance. Various planning frameworks—from classical search algorithms to optimization-driven and swarm-intelligent methods—are compared in terms of efficiency and adaptability in unstructured orchard environments. Developments in multi-DOF manipulators, soft and adaptive grippers, and end-effector control strategies are also examined. Despite these advances, critical challenges remain, including heavy dependence on large annotated datasets; sensitivity to illumination and foliage occlusion; limited generalization across fruit varieties; and the difficulty of integrating perception, planning, and manipulation into reliable field-ready systems. Finally, this review outlines emerging research trends such as lightweight multimodal networks, deformable-object manipulation, embodied intelligence, and system-level optimization, offering a forward-looking perspective for autonomous harvesting technologies.

1. Introduction

The fruit industry is a key component of modern agriculture, providing substantial economic and nutritional value while serving as an important application domain for agricultural mechanization and intelligent technologies. However, fruit harvesting remains the most technically complex and least automated stage of fruit production. Unlike standardized field crop operations, fruit harvesting involves highly variable fruit morphology, asynchronous ripening, and random spatial distribution, together with frequent occlusion by branches and leaves. Complex orchard backgrounds and the delicate nature of fruits further lead to recognition difficulties, low localization accuracy, and unstable grasping performance [1]. In real orchard environments, illumination variation, wind-induced branch motion, and fruit overlap further reduce perception robustness, resulting in decreased picking success and increased fruit damage [2]. Current harvesting robots typically exhibit picking success rates of approximately 66% and fruit damage rates around 5%, which significantly limits their commercial applicability [2]. These constraints highlight the technical challenges that remain unresolved in practical orchard deployment.
China faces particularly severe challenges in fruit harvesting mechanization. In 2023, China’s fruit orchard area exceeded 1.29 × 10⁷ ha, with a total fruit output of 2.96 × 10⁸ t; however, the mechanization rate of fruit harvesting remained below 20%, and harvesting labor accounted for 40–60% of total production costs [3]. Although fruit production scale continues to expand, harvesting efficiency and labor dependency have not been fundamentally improved. Current research still faces several bottlenecks. First, diverse fruit varieties impose different requirements on recognition and grasping strategies, such as differences between apples and citrus versus grapes and kiwifruit [2,4]. Second, illumination variation and branch motion significantly degrade perception performance under real orchard conditions [5,6]. Third, orchard terrain complexity increases computational burden, while existing systems remain costly; internationally, advanced harvesting robots are typically priced between 250,000 and 750,000 USD, far exceeding the affordability of most Chinese growers [7]. Due to differences in economic foundations and cultivation patterns, the development of fruit-harvesting robots varies considerably across regions. Europe and Israel initiated early research, with representative systems such as SWEEPER employing three-dimensional structural perception to standardize harvesting operations and reduce real-time modeling complexity [8]. Japan has focused on multi-degree-of-freedom manipulators and trellis-grown crops [9]. New Zealand has explored monkey-hand-type grippers and multi-arm task allocation strategies [10], while the United States has made progress in vision-guided harvesting systems, although many remain limited to laboratory or pilot-scale testing [11].
China’s demand for fruit-harvesting robots continues to increase. The domestic fruit-picking robot market expanded from 1.24 billion CNY (Chinese Yuan) in 2015 to 8.21 billion CNY in 2023, and China’s fruit cultivation area is projected to reach 137.3 million ha by 2030 [12]. From 2004 to 2020, national agricultural machinery purchase subsidies totaled 239.2 billion CNY, providing policy support for mechanized equipment deployment [13,14]. Research institutions such as China Agricultural University, Jiangsu University, and Zhejiang University have developed prototype fruit-picking robots, some demonstrating promising experimental performance. Nevertheless, China entered this field approximately 20–30 years later than leading countries, and many systems remain at the experimental stage [15]. Low mechanization levels, high equipment cost, and limited adaptability to hilly or fragmented orchards continue to restrict large-scale adoption [16].
Although extensive studies have been conducted worldwide, diversity in fruit species and orchard environments continues to constrain practical deployment. Existing review studies often focus on isolated algorithmic improvements, while insufficient attention is paid to system architecture, deployment prerequisites, and engineering trade-offs. In practice, harvesting performance depends not only on perception accuracy, but also on the coupling between perception, motion planning, and manipulation.
Therefore, this review synthesizes recent advances in fruit and peduncle recognition, motion planning, and harvesting mechanism design, with particular emphasis on system-level integration and deployment-oriented challenges. The reviewed literature is organized along three complementary dimensions: orchard environment (controlled versus open-field), fruit morphology and harvesting mode, and system-level integration and optimization. By combining technical analysis with engineering perspectives, this review aims to identify key limitations and future research directions for intelligent fruit-picking robots. To ensure transparency and reproducibility, the literature collection and screening process follows the PRISMA statement, as described in Section Review Methodology and Paper Selection (PRISMA).

Review Methodology and Paper Selection (PRISMA)

This review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [17] to improve transparency and reproducibility. Literature was retrieved exclusively from the Web of Science Core Collection (WoSCC). Using the keywords “Robotic Harvesting”, “Agricultural Robot”, and “Fruit Picking Robot”, publications from 1 January 2000 to 1 October 2025 were searched.
After duplicate records were removed, a two-stage screening procedure was applied. First, titles and abstracts were screened to exclude studies unrelated to fruit-harvesting robotics. Second, full-text eligibility assessment was conducted. Studies were included if they addressed at least one of the three core functional modules of fruit-picking robots—perception (fruit or peduncle recognition and localization), planning (navigation or manipulator motion planning), or manipulation (end-effector and harvesting mechanism design)—and provided clear technical descriptions together with experimental, simulation-based, or system-level validation. Purely conceptual designs and promotional product reports without technical verification were excluded. Review articles were retained only when they contributed methodological insights or comparative perspectives relevant to system integration. Conference papers were included if they reported original technical contributions and provided quantitative or qualitative performance evaluation.
Performance comparisons and qualitative summaries presented in tables throughout this review are derived from metrics reported in the original studies, such as harvesting success rate, fruit damage rate, cycle time, and system complexity, rather than from unified benchmark experiments. Details of the screening results, bibliometric statistics, and keyword analyses are reported in Section 7. The PRISMA flow diagram of the literature search and study selection process is shown in Figure 1.
The remainder of this paper is organized as follows: Section 2 reviews fruit and peduncle recognition and localization technologies; Section 3 summarizes motion-planning and obstacle-avoidance approaches; Section 4 discusses harvesting-device mechanism design and optimization; Section 5 reviews system integration and control architectures; Section 6 analyzes representative harvesting robotic systems and field deployment characteristics; Section 7 presents bibliometric trends; and Section 8 concludes with future research directions.

2. Target Recognition and Localization Technology

Target recognition and localization technologies constitute the core components enabling autonomous operation in fruit-picking robots. Their accuracy, speed, and robustness directly determine the overall performance of the harvesting system. In recent years, these technologies have gradually shifted from traditional hand-crafted methods toward deep learning-based visual frameworks, greatly improving detection reliability [6]. However, current technologies still face challenges such as insufficient generalization in unstructured orchard environments, sensitivity to illumination variations, incomplete occlusion handling, and suboptimal structural coordination between perception and mechanical execution [18]. These issues highlight the need for more robust and adaptive integrated sensing–control systems to meet the demands of real-world orchard scenarios.

2.1. Traditional Recognition Methods

Traditional fruit-recognition methods are primarily based on handcrafted features such as color, texture, and geometric shape. These approaches are computationally efficient and easy to deploy, making them suitable for embedded platforms with limited computing resources, low-cost systems, and scenarios with scarce labeled data [19]. Although deep-learning-based methods have become dominant in recent years, traditional approaches still retain practical value in specific deployment niches rather than being entirely replaced.
To address the insufficient classification accuracy of traditional methods under small-sample conditions, Chen [19] enhanced feature discrimination and reduced labeling costs through semi-supervised learning. However, in complex orchard environments with variable illumination, leaf occlusion, and fruit overlap, these methods often struggle to maintain robustness and generalization. To improve noise resistance, Gu [20] optimized local mean and gray-level features to improve robustness under non-uniform illumination. Building on this, Qi [21] refined the Otsu thresholding method to further enhance robustness under uneven lighting conditions. These studies indicate that traditional methods can achieve acceptable recognition performance when operating conditions and environmental variability are relatively constrained.
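As a concrete point of reference, the following minimal sketch illustrates the kind of classical color-thresholding pipeline discussed above, assuming an OpenCV environment and a ripe fruit whose hue contrasts with the foliage; it is an illustrative baseline rather than the specific methods of [20,21], and the input file name is hypothetical.

```python
# Minimal sketch of a classical color-threshold pipeline for fruit segmentation.
# Illustrative baseline only, not the specific methods of [20,21].
import cv2
import numpy as np

def segment_fruit(bgr_image: np.ndarray) -> np.ndarray:
    """Return a binary mask of candidate fruit pixels."""
    # Work on the a* channel of CIELAB, where red/green contrast is strong.
    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB)
    a_channel = lab[:, :, 1]

    # Otsu's method picks a global threshold from the channel histogram,
    # which is the step that illumination-robust variants such as [21] refine.
    _, mask = cv2.threshold(a_channel, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Morphological opening removes small speckle caused by leaves and highlights.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

if __name__ == "__main__":
    image = cv2.imread("orchard_sample.jpg")  # hypothetical input image
    if image is not None:
        cv2.imwrite("fruit_mask.png", segment_fruit(image))
```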
In addition to purely handcrafted pipelines, hybrid recognition strategies that combine traditional image processing with lightweight deep-learning models have emerged as an intermediate solution. Ma [22] integrated traditional methods with lightweight neural networks to control model size and improve accuracy in complex, variable orchard scenes. Such hybrid pipelines exploit prior domain knowledge to simplify visual inputs, while neural networks compensate for the limited generalization ability of handcrafted features.
Despite these improvements, traditional and hybrid methods remain constrained by several inherent limitations. First, manually designed features are highly sensitive to illumination variation and threshold selection, resulting in unstable performance across different orchard environments. Second, occlusion and fruit overlap often cause feature ambiguity, leading to missed detections or false positives. Third, the scalability and transferability of handcrafted feature pipelines remain limited when applied to diverse fruit varieties and growth stages. Consequently, current research has gradually shifted toward deep-learning-based end-to-end recognition frameworks to address these challenges [23].

2.2. Deep Learning-Based Recognition Methods

The rapid development of deep learning has driven fruit-recognition technology from handcrafted feature engineering toward automatic feature learning. By learning discriminative representations from large datasets, deep-learning-based methods significantly improve robustness and generalization in complex orchard environments. Existing detection approaches can be broadly categorized into two-stage and one-stage frameworks, which differ in accuracy, computational cost, and deployment feasibility for harvesting robots.

2.2.1. Two-Stage Detection Algorithms

Two-stage detection algorithms, such as Faster R-CNN and Mask R-CNN, adopt a two-step pipeline of region proposal generation and feature classification, enabling relatively high detection accuracy. Parvathi [24] optimized the Faster R-CNN network with a ResNet-50 backbone for coconut maturity detection, where data augmentation yielded a significantly higher mAP than SSD and YOLOv3. Chen [25] applied a VGG16-based Faster R-CNN architecture for oil-tea fruit recognition, achieving an accuracy of 98.9%. Wang [26] used Mask R-CNN to detect Dangshan plum and achieved a recall rate of 92.8%. To address the limited availability of annotated datasets, Siricharoen [27] improved Mask R-CNN using small-sample training and realized pineapple maturity recognition.
However, the high computational cost of two-stage frameworks limits their suitability for real-time fruit-picking robots. Typical models exceed 100 MB in size and require more than 400 MB of memory, resulting in slower inference speed and increased latency. In harvesting systems, such perception latency directly increases manipulator waiting time and prolongs the picking cycle, thereby reducing overall system throughput. As a result, two-stage detectors are mainly applied in offline analysis or laboratory validation rather than in field-deployed harvesting robots.
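For orientation, the following sketch shows two-stage detection inference using torchvision's off-the-shelf, COCO-pretrained Faster R-CNN; it does not reproduce the fruit-specific networks of [24,25,26], and in practice the detection head would be fine-tuned on an annotated fruit dataset before deployment.

```python
# Minimal sketch of two-stage detection inference with torchvision's Faster R-CNN.
# The COCO-pretrained model stands in for the fruit-specific networks of [24,25,26].
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("orchard_sample.jpg").convert("RGB")  # hypothetical input image
with torch.no_grad():
    predictions = model([to_tensor(image)])[0]

# Keep confident boxes only; each box is (x1, y1, x2, y2) in pixels.
keep = predictions["scores"] > 0.7
boxes = predictions["boxes"][keep]
print(f"{len(boxes)} candidate fruits detected")
```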

2.2.2. One-Stage Detection Algorithms

One-stage detection algorithms perform end-to-end target recognition by directly regressing bounding boxes and class labels, offering superior real-time performance and deployment flexibility. By eliminating the region-proposal stage, one-stage detectors better satisfy the strict latency and cycle-time requirements of orchard harvesting operations. Consequently, one-stage frameworks have become widely adopted for fruit perception in harvesting robots.
(a) Lightweighting for embedded deployment. Model lightweighting is a core requirement for mobile and embedded harvesting platforms. YOLOv4 introduced architectural and training optimizations to improve the balance between detection accuracy and inference speed [28]. YOLOv5 further enhanced deployment efficiency through backbone redesign and training-strategy refinement [29]. These lightweight-oriented improvements provide practical advantages for on-board computing in field robots, especially when computing resources and power budgets are limited [30]. Ma [22] reported that the YOLOv8n model achieves the smallest size and fewest parameters among YOLO-series variants, with memory requirements at the megabyte level, making it particularly suitable for embedded and mobile devices. This lightweight characteristic supports real-time perception on harvesting platforms without relying on high-end GPUs, thereby improving deployment feasibility in practical orchards. Recent versions such as YOLOv10 and YOLOv11 further improve efficiency through hardware-aware optimization and architectural refinement, strengthening real-time inference under orchard conditions [31]. These advances also encourage broader field deployment where cost and energy constraints are critical, which aligns well with the practical needs of agricultural harvesting systems [32]. Additional improvements reported in newer YOLO variants indicate continued progress in deployment-friendly perception performance [33].
(b) Robustness under orchard uncertainty. Illumination variation, foliage occlusion, and fruit overlap represent dominant sources of perception uncertainty in real orchards. Kuznetsova [34] demonstrated that YOLOv3 maintains relatively stable detection performance under varying illumination and occlusion. Lawal [35] enhanced small-target recognition by integrating DenseNet, spatial pyramid pooling, and Mish activation functions. Such robustness-oriented enhancements are particularly important for sustaining stable detection results during dynamic harvesting operations, where branch motion and background clutter are unavoidable. To further improve robustness, attention mechanisms and transformer-based modules have been widely incorporated into YOLO-style architectures. Raj [36] integrated transformer structures into YOLOv5 to enhance fruit-cluster detection. Chen [37] showed that attention mechanisms can strengthen fine-grained feature extraction under dense canopy conditions. Gai [38] further validated the effectiveness of attention-based enhancement for improving recognition stability in complex orchard scenes.
(c) Localization quality and small or clustered fruit detection. Accurate localization is critical for harvesting tasks, as detection results directly determine grasping or cutting-point precision. Improvements in multi-scale feature fusion, loss-function design, and detection-head optimization have been widely adopted to enhance bounding-box regression quality. Gao [39] demonstrated accurate recognition of up to 18 fruit species using an improved YOLOv8-based framework. Fu [40] integrated EfficientNet to enhance feature representation for tomato detection. These advances provide more reliable spatial information for downstream manipulation and cutting operations, thereby supporting safer and more stable harvesting execution.
Overall, one-stage detectors offer strong support for real-time fruit recognition and field deployment. Their evolution toward lightweight design, enhanced robustness, and improved localization quality makes them well suited for unstructured orchard environments. In addition, the engineering relevance of these improvements should be interpreted in relation to harvesting system performance, such as picking cycle time and operational stability.
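To make the deployment workflow concrete, the following minimal sketch runs a lightweight YOLOv8n model through the ultralytics package, of the kind discussed in [22]; the weight file, image path, and confidence threshold are illustrative placeholders rather than values from any cited study, and a fruit-specific model would be trained beforehand.

```python
# Minimal sketch of one-stage detection with a lightweight YOLOv8n model via the
# ultralytics package; weights and thresholds are illustrative placeholders.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # nano variant: MB-scale weights, suited to embedded boards

# Run inference on a single orchard image (path is hypothetical).
results = model.predict("orchard_sample.jpg", conf=0.5, imgsz=640)

for result in results:
    for box in result.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()  # pixel coordinates of the bounding box
        print(f"class={int(box.cls)} conf={float(box.conf):.2f} "
              f"box=({x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f})")
```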
Table 1 summarizes representative fruit-recognition studies using two-stage and one-stage algorithms. It should be noted that the performance metrics reported in the table originate from different studies conducted under varying datasets, orchard environments, and evaluation protocols. Therefore, Table 1 is not intended to serve as a unified benchmark ranking across methods, but rather to present typical improvement strategies and the corresponding performance ranges reported in the literature, which helps identify dominant optimization trends for orchard deployment.
Based on these studies, it is evident that fruit-recognition algorithms continue to advance toward lightweight design, improved accuracy, and enhanced adaptability across diverse application scenarios. As one of the most widely used frameworks in current fruit-recognition tasks, YOLOv8 enables efficient detection through an optimized architecture, enhanced attention mechanisms, and bounding-box refinement modules. These improvements reduce model size while maintaining accuracy, offering substantial advantages for real-time deployment of commercial fruit-picking systems. With the emergence of YOLOv9, YOLOv10, and YOLOv11, fruit-recognition performance has continued to improve in terms of accuracy, inference speed, and robustness under varying environmental conditions, suggesting significant potential for future autonomous harvesting applications.

2.2.3. Pixel-Level Image Segmentation Techniques

Deep-learning-based image segmentation techniques enable pixel-level delineation of irregular fruit and stem shapes, serving as a crucial bridge between object detection and fine-grained localization. Compared with box-level detection, segmentation provides more explicit boundary information, which is beneficial for downstream grasping/cutting-point reasoning in harvesting tasks. Current studies primarily focus on model lightweighting and enhanced feature representation. For example, lightweight networks and multi-scale fusion architectures have been designed to achieve real-time segmentation on embedded devices [44]. Attention mechanisms and multi-level semantic fusion have been introduced to improve model adaptability under complex illumination and occlusion conditions [45]. Improvements in feature extraction and deep semantic modeling have further enhanced segmentation stability in cluttered orchard scenarios with mixed fruits and branches [46].
In addition, structural pruning and architectural reconfiguration have been applied on computation-constrained edge platforms to balance model compactness and segmentation accuracy [47]. Meanwhile, optimized feature-fusion strategies and loss functions further improve boundary precision for fruit contours and stem regions, which is particularly important when targets are small or heavily occluded [48].
Overall, recent advancements in segmentation technology—including structural lightweighting, improved feature representation, and enhanced network fusion—have significantly strengthened the recognition performance and real-time capability of fruit-picking robots in complex orchard environments. However, under natural illumination variations, fruit overlap, and severe stem occlusion, model generalization and stability still require further improvement. Future work may further emphasize boundary-aware learning and multi-task joint optimization (fruit–peduncle–picking-point) to reduce boundary ambiguity in dense canopy scenes, thereby improving operational reliability for autonomous harvesting.
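As an illustration of how pixel-level output can feed picking-point reasoning, the following sketch derives a single cutting point from a binary peduncle mask; the rule used here (centroid of the largest connected peduncle region) is a simplified placeholder rather than a method reported in [44,45,46,47,48], and the mask file name is hypothetical.

```python
# Minimal sketch of deriving a cutting point from a binary peduncle mask.
# The centroid rule is illustrative; real systems refine this with skeletonization
# or 3D pose information before commanding the cutting end-effector.
import cv2
import numpy as np

def cutting_point_from_mask(peduncle_mask):
    """Return an (x, y) pixel coordinate on the dominant peduncle region, or None."""
    # Connected-component analysis keeps only the largest peduncle blob.
    num, labels, stats, centroids = cv2.connectedComponentsWithStats(peduncle_mask)
    if num < 2:
        return None
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    cx, cy = centroids[largest]
    return int(round(cx)), int(round(cy))

if __name__ == "__main__":
    mask = cv2.imread("peduncle_mask.png", cv2.IMREAD_GRAYSCALE)  # hypothetical mask
    if mask is not None:
        _, mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
        print(cutting_point_from_mask(mask))
```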

2.3. Current Challenges in Recognition Technology

To keep this review technically focused while also highlighting under-studied yet practically difficult cases, this section uses loquat (Eriobotrya japonica) as a representative example of “minority fruit types” that exhibit small targets, dense clustering, and severe occlusion in natural orchards. This choice is intended to illustrate why fruit–peduncle perception remains a critical bottleneck when moving from fruit-level detection to cutting-point localization in real harvesting systems, rather than to provide a crop-specific review.
In recent years, fruit detection research has advanced considerably; however, the recognition of loquat remains insufficiently developed. Although loquat recognition is a key prerequisite for realizing autonomous harvesting, bibliographic screening in this review indicates that peduncle-/cutting-related studies are substantially fewer than fruit-only detection studies, suggesting that current research remains largely focused on fruit-level detection rather than stem-level identification, which is critical for robotic cutting and grasping tasks. Challenges persist in both fruit detection and peduncle segmentation. Traditional classification and hand-crafted feature extraction methods show limited performance when applied to loquat recognition, particularly under complex orchard conditions [49].
Li [50] proposed an enhanced DeepLabv3+ framework that incorporates MobileNetV2 and CBAM attention mechanisms to improve segmentation of loquat peduncle regions, achieving finer localization and higher segmentation precision. However, lightweight methods still struggle to meet real-time processing demands during field operation, particularly when occlusion and illumination variations occur. This reveals a fundamental trade-off between segmentation accuracy and deployment efficiency in practical harvesting scenarios.
Current research also attempts to integrate multi-sensor fusion with robotic manipulators to address the instability caused by fruit–branch adhesion and frequent occlusion in dense loquat canopies [51,52]. Moreover, cutting-point detection continues to be a major challenge. While some studies have achieved precise localization of cutting points in simulated environments [53], performance deteriorates significantly in real orchard scenes due to shape variability, occlusion, and fruit clustering, limiting practical applicability.
In addition, multi-target detection remains difficult, especially when integrating RGB-D vision, point-cloud data, and geometric constraints during robotic cutting and gripping. Highly variable environmental conditions and inconsistent fruit morphology significantly increase the computational burden of recognition models. At the same time, excessive reliance on depth data can introduce sensing noise and reduce detection reliability in outdoor environments [54]. Lightweight design of segmentation and detection networks has therefore become essential for real-time orchard deployment, although achieving both high accuracy and high efficiency remains a key technical bottleneck [55].
Furthermore, deep-learning-based recognition models still face challenges in terms of generalization across regional and seasonal variations. For rare or minority fruit types, insufficient training samples hinder the model’s ability to learn discriminative features, exacerbating the difficulty of accurate detection [56]. Addressing these limitations requires exploration of multi-modal fusion, three-dimensional spatial reasoning, and domain-adaptation techniques to improve robustness and adaptability across diverse orchard conditions.
Overall, recognition techniques for loquat remain in an early developmental stage compared with those developed for apples, citrus, and other well-studied fruit species. The unique characteristics of loquat—small size, dense clustering, irregular shape, and frequent occlusion—introduce substantial challenges for RGB-based recognition. Although three-dimensional sensing technologies have demonstrated potential advantages, their integration and large-scale application in orchard environments still require further research and system-level optimization [57].
Representative peduncle-related studies are summarized in Table 2.
Overall, fruit–peduncle recognition and picking-point localization techniques have transitioned from traditional geometric and handcrafted feature extraction toward deep-learning-based multimodal perception frameworks. Recent studies, including those summarized in Table 2, increasingly emphasize lightweight model design, attention mechanisms, and progressive refinement of peduncle features, enabling more accurate cutting-point detection and improved robotic manipulation stability.
Despite these advancements, performance remains hindered by limited annotated datasets, severe occlusion, inconsistent stem morphology, and strong illumination variations in natural orchards. These limitations are particularly pronounced for minority fruit types such as loquat, where dense clustering and small peduncle structures exacerbate perception uncertainty.
Future research is expected to incorporate three-dimensional spatial information, multi-view perception, and cross-season adaptation to further enhance the accuracy and robustness of peduncle localization in complex orchard environments. Such advances will provide more reliable technical support for improving both recognition precision and cutting-strategy planning in autonomous fruit-picking robots.

3. Motion Planning and Obstacle Avoidance Technology

To achieve precise fruit recognition and localization, fruit-picking robots must generate feasible and efficient motion trajectories that enable stable grasping and harvesting operations. In orchard environments, motion planning must simultaneously address global navigation of mobile platforms and fine-grained trajectory planning of manipulators, which together determine the overall harvesting efficiency and system reliability. Unlike industrial environments, orchard scenes are semi-structured and highly dynamic, characterized by irregular tree spacing, uneven terrain, flexible branches, and frequent human interference.

3.1. Orchard Path Navigation Technology

3.1.1. Path Navigation Using Traditional Algorithms

Traditional path-planning algorithms such as Dijkstra, A*, and RRT have been widely applied in orchard navigation due to their deterministic structure and mathematical simplicity. These algorithms typically rely on grid-based or graph-based environmental representations and are effective in relatively structured scenarios. However, in real orchard environments, the assumptions of static obstacles and uniform terrain often do not hold. Tree canopies, irregular row spacing, and slope-induced elevation changes introduce frequent local minima and narrow passages, which degrade planning efficiency and path smoothness. Wang [58] reduced redundant nodes during A* search in orchard paths by approximately 20%, shortening planning time by 30%. To address the difficulty of maintaining path smoothness in tree-dense environments, Ye [59] proposed the CBO-RRT algorithm, achieving a 21.7% improvement in path smoothness. Alshammeri [60] integrated dynamic feedback into Dijkstra-based planning to enhance trajectory continuity.
These improvements demonstrate that traditional algorithms remain valuable for structured or semi-structured orchards when combined with heuristic optimization and perception-driven map updating, benefiting from strong global-search capability and computational stability. However, their scalability and adaptability remain limited under highly dynamic conditions commonly found in real orchards, including uneven terrain, dense canopies, irregular tree arrangements, narrow passages, and moving obstacles (e.g., workers and equipment). Their performance is further constrained by the difficulty of accurately modeling rapidly changing outdoor agricultural environments, as well as the high computational cost associated with iterative map updates and frequent path re-evaluation, which can hinder real-time operation in large-scale orchards. Therefore, future research may emphasize more robust and adaptive navigation paradigms, such as improved sampling-based and heuristic-search strategies coupled with perception models, and online-learning-based approaches (e.g., probabilistic roadmaps integrated with adaptive environmental modeling) to achieve reliable navigation in complex orchard scenes.
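For reference, the following minimal sketch implements plain A* search on a small occupancy grid representing an orchard row; the cited improvements [58,59,60] add node pruning, smoothing, and dynamic feedback on top of this baseline, which is shown here only to make the search mechanics explicit.

```python
# Minimal A* sketch on a 2D occupancy grid of an orchard row (0 = free, 1 = obstacle).
import heapq

def astar(grid, start, goal):
    """Return a list of (row, col) cells from start to goal, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    h = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])  # Manhattan heuristic
    open_set = [(h(start, goal), start)]
    came_from = {start: None}
    g_cost = {start: 0}
    while open_set:
        _, node = heapq.heappop(open_set)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g_cost[node] + 1
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    came_from[(nr, nc)] = node
                    heapq.heappush(open_set, (ng + h((nr, nc), goal), (nr, nc)))
    return None

if __name__ == "__main__":
    orchard = [[0, 0, 0, 0],
               [1, 1, 0, 1],
               [0, 0, 0, 0],
               [0, 1, 1, 0]]
    print(astar(orchard, (0, 0), (3, 3)))
```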

3.1.2. Path Navigation Using Intelligent Algorithms

Intelligent algorithms, including ant colony optimization (ACO), particle swarm optimization (PSO), and hybrid heuristic–learning approaches, provide stronger adaptability to nonconvex constraints and dynamic uncertainties commonly encountered in orchards.
For environments characterized by complex constraints and multiple local optima, Zhang [61] proposed an optimized ant colony algorithm (OSACO), which significantly improved solution convergence and reduced the length of planned paths compared to the traditional ACO method. Tian [62] introduced a multi-swarm cooperative ACO (MSACO), which enhanced search efficiency and increased algorithm stability under orchard conditions featuring dense tree spacing. To further address issues of premature convergence and unstable path generation, Li [63] combined A* with PSO, achieving a reduction in planned path length from 393.4 m (traditional PSO) to 366.3 m. Chen [64] integrated TBRL with a PSO framework, further demonstrating the potential of intelligent algorithms for generating globally optimal routes in semi-structured orchard environments.
Compared with traditional graph-search methods, intelligent algorithms emphasize global optimization and multi-objective trade-offs, such as path length, smoothness, and energy efficiency, which are particularly relevant for large-scale orchard operations. Looking ahead, integrating intelligent optimization with visual perception, multimodal sensing, and 3D environmental understanding will be necessary to further improve global path-planning robustness and stability in orchard environments [65].
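To make the swarm-intelligence mechanics concrete, the following sketch applies a plain ant colony optimization loop to order a set of hypothetical picking waypoints so that the travelled distance is short; it is an illustrative baseline, not the OSACO or MSACO variants of [61,62], and all coordinates and parameters are invented for the example.

```python
# Minimal ant colony optimization sketch for ordering picking waypoints.
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(0, 50, size=(12, 2))          # hypothetical waypoints (m)
n = len(points)
dist = np.linalg.norm(points[:, None] - points[None, :], axis=2) + np.eye(n)
pheromone = np.ones((n, n))
alpha, beta, rho, n_ants, n_iters = 1.0, 3.0, 0.5, 20, 100

best_tour, best_len = None, np.inf
for _ in range(n_iters):
    tours = []
    for _ in range(n_ants):
        tour, unvisited = [0], set(range(1, n))
        while unvisited:
            i = tour[-1]
            cand = np.array(sorted(unvisited))
            # Transition weights combine pheromone and inverse distance.
            weights = pheromone[i, cand] ** alpha * (1.0 / dist[i, cand]) ** beta
            nxt = int(rng.choice(cand, p=weights / weights.sum()))
            tour.append(nxt)
            unvisited.remove(nxt)
        length = sum(dist[tour[k], tour[k + 1]] for k in range(n - 1))
        tours.append((length, tour))
        if length < best_len:
            best_len, best_tour = length, tour
    # Evaporate pheromone, then reinforce edges proportional to tour quality.
    pheromone *= (1 - rho)
    for length, tour in tours:
        for k in range(n - 1):
            pheromone[tour[k], tour[k + 1]] += 1.0 / length

print(f"best route length: {best_len:.1f} m, order: {best_tour}")
```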

3.1.3. Summary of Path-Planning Technologies

Table 3 summarizes representative orchard path-planning algorithms. It should be noted that the reported performance improvements originate from different experimental setups and simulation environments rather than unified benchmarks. Therefore, the table is intended to illustrate typical optimization strategies and relative performance trends, rather than to provide absolute ranking among algorithms.
Based on the literature summarized in Table 3, a qualitative multi-dimensional evaluation was further conducted, focusing on path optimality, computation cost, and adaptability to dynamic obstacles. The evaluation scores in Table 4 are derived from representative studies under comparable orchard-scale conditions and normalized to facilitate relative comparison. They reflect typical performance ranges reported in the literature rather than absolute field benchmarks.
Overall, the comparative analysis in Table 3 and Table 4 demonstrates that both traditional and intelligent algorithms possess distinct advantages in path optimality, energy efficiency, memory utilization, and dynamic obstacle-avoidance capability. Traditional algorithms generally exhibit higher computational stability and interpretability, whereas intelligent algorithms show stronger global optimization ability and adaptability to complex constraints.
However, most existing studies rely on grid-based simulations rather than real orchard environments, resulting in limited verification of practical feasibility and insufficient investigation into dynamic adaptability and energy-aware planning. Moreover, the degree of algorithmic lightweighting remains inadequate for deployment on embedded agricultural platforms, where real-time responsiveness and limited onboard computational resources are critical constraints.
Future research in orchard path-planning is expected to focus on adaptive regulation based on reinforcement learning, enabling robots to better respond to dynamic and unstructured conditions through online policy updating and environment-aware decision-making. Multi-agent cooperative navigation and the integration of visual–LiDAR fusion will also become important directions for constructing more robust and flexible path-planning frameworks capable of supporting large-scale collaborative operations in orchard environments, while balancing accuracy, computational cost, and system reliability under real field conditions.

3.2. Manipulator Harvesting Planning

Manipulator trajectory planning is the core component of robotic fruit harvesting. Its objective is to generate motion trajectories and obstacle-avoidance strategies that satisfy kinematic constraints while enabling precise and efficient picking operations. Compared with orchard-level navigation, manipulator planning operates in a high-dimensional continuous space and is therefore more sensitive to trajectory smoothness, execution stability, and real-time computational constraints.
Gao [66] introduced an improved PSO algorithm that reduced the average picking time per fruit from 20 s to 12 s, demonstrating the effectiveness of swarm-intelligence-based optimization in shortening harvesting cycles. Cao [67] proposed an enhanced multi-objective particle swarm optimization method (GMOPSO), which simultaneously optimized multiple conflicting objectives and achieved an average picking time of 25.5 s, highlighting the importance of multi-objective trade-offs in manipulator planning. Lin [68] applied reinforcement learning to improve obstacle-avoidance performance in manipulator operation, resulting in over 90% success rate in trajectory planning under complex orchard conditions. Sun [69] further modeled human picking behavior to optimize robot motion, reducing trajectory MSE by 15.2% and significantly improving trajectory smoothness.
From a methodological perspective, existing manipulator trajectory planning approaches can be broadly categorized into three groups: (i) optimization-based methods (e.g., PSO and its variants), (ii) learning-based methods (e.g., deep reinforcement learning), and (iii) bio-inspired or human-skill–imitation strategies. Each category exhibits distinct advantages in convergence speed, adaptability, and motion smoothness, but also faces different limitations in terms of robustness and computational cost.
Ju [70] emphasized that agricultural manipulators often face challenges such as multi-task coordination, real-time trajectory control, and operation under uncertain and dynamically changing environments. They noted that key bottlenecks include allocation of task priorities, synchronization between collaborative modules, and minimization of trajectory execution delays. Li [71] introduced a multi-agent learning-based collaborative system, shortening the picking cycle time to 5.8 s with a success rate of 80.4%, demonstrating the potential of coordinated planning strategies for improving harvesting efficiency.
Overall, manipulator trajectory planning techniques are gradually transitioning from traditional optimization methods toward hybrid intelligent frameworks that combine deep learning, PSO variants, and multi-objective optimization. These approaches have significantly improved trajectory smoothness and operational efficiency; however, their robustness remains constrained by orchard environmental uncertainty, sensor inaccuracies, and complex fruit–branch interactions. In addition, existing research often lacks large-scale real-orchard evaluations, leading to insufficient generalizability in practice. Future developments should therefore focus on reinforcement-learning-driven adaptive regulation, tighter coupling between vision and motion modules, and hierarchical task-planning mechanisms to achieve more stable and deployment-ready manipulator control in real orchard environments.
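As an illustration of the optimization-based category, the following sketch uses a plain particle swarm to place a single Cartesian via-point between the manipulator's start pose and a fruit so that the path clears a spherical "branch" obstacle while remaining short; it is a simplified stand-in for the improved and multi-objective PSO formulations of [66,67], with all geometry values invented for the example.

```python
# Minimal particle swarm sketch for placing one obstacle-clearing via-point.
import numpy as np

rng = np.random.default_rng(1)
start, fruit = np.array([0.0, 0.0, 0.4]), np.array([0.6, 0.2, 0.8])   # metres
branch_centre, branch_radius = np.array([0.3, 0.1, 0.6]), 0.12

def cost(via):
    """Path length plus a penalty if either segment intrudes into the obstacle."""
    length = np.linalg.norm(via - start) + np.linalg.norm(fruit - via)
    penalty = 0.0
    for a, b in ((start, via), (via, fruit)):
        pts = a + np.linspace(0, 1, 20)[:, None] * (b - a)       # sampled segment
        clearance = np.linalg.norm(pts - branch_centre, axis=1).min() - branch_radius
        penalty += max(0.0, -clearance) * 100.0
    return length + penalty

n_particles, n_iters, w, c1, c2 = 30, 80, 0.7, 1.5, 1.5
pos = rng.uniform(-0.2, 1.0, size=(n_particles, 3))
vel = np.zeros_like(pos)
pbest, pbest_cost = pos.copy(), np.array([cost(p) for p in pos])
gbest = pbest[np.argmin(pbest_cost)].copy()

for _ in range(n_iters):
    r1, r2 = rng.random((n_particles, 1)), rng.random((n_particles, 1))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos += vel
    costs = np.array([cost(p) for p in pos])
    better = costs < pbest_cost
    pbest[better], pbest_cost[better] = pos[better], costs[better]
    gbest = pbest[np.argmin(pbest_cost)].copy()

print(f"via-point: {np.round(gbest, 3)}, cost: {pbest_cost.min():.3f}")
```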

3.3. Current Challenges in Motion Planning Technology

Motion planning and obstacle avoidance are core components enabling stable and high-precision harvesting operations. Compared with industrial environments, orchard scenarios exhibit high levels of uncertainty due to dense canopies, irregular terrain, dynamic branches, and frequent fruit occlusions. Although current research has achieved notable progress in local planning and obstacle avoidance, significant challenges persist in terms of environmental uncertainty, real-time computational constraints, and multi-source perception integration, which directly affect field deployability. Among these challenges, the interaction between flexible fruit branches and mechanical manipulators introduces nonlinear dynamic responses, such as branch vibrations and unexpected deformation. These effects cause large deviations in obstacle boundaries and rapidly changing spatial constraints, rendering traditional rule-based or static planning methods inadequate for reliable orchard operation. Consequently, recent studies have focused on three main research directions: (i) 3D environment reconstruction, (ii) semantic segmentation of flexible obstacles, and (iii) predictive modeling of branch motion.
In the context of 3D environment reconstruction, research integrates improved sampling-based algorithms with intelligent perception frameworks to achieve dynamic characterization of spatial structures and continuous path optimization [72,73]. These approaches aim to convert unstructured orchard scenes into explicit geometric constraints; however, their performance strongly depends on perception accuracy and computational efficiency. In predictive modeling, several studies seek to reduce trajectory deviation by incorporating the temporal–spatial evolution of branch motion [74], enabling anticipatory planning under dynamic conditions. From a semantic segmentation perspective, efforts have been made to use lightweight neural networks to extract fine-grained features of branches and leaves, thereby supporting manipulator motion safety and path continuity [75,76].
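To make the predictive-modeling direction tangible, the following sketch applies a constant-velocity Kalman filter to a tracked branch keypoint and predicts its position one frame ahead; this is an illustrative building block in the spirit of [74], not the method of any cited study, and the noise levels and simulated motion are assumptions.

```python
# Minimal constant-velocity Kalman filter sketch for a swaying branch keypoint.
import numpy as np

dt = 0.1                                        # camera frame interval (s)
F = np.block([[np.eye(2), dt * np.eye(2)],      # state transition: position + velocity
              [np.zeros((2, 2)), np.eye(2)]])
H = np.hstack([np.eye(2), np.zeros((2, 2))])    # only position is observed
Q = 1e-3 * np.eye(4)                            # process noise (branch sway)
R = 5e-3 * np.eye(2)                            # measurement noise (keypoint jitter)

x, P = np.zeros(4), np.eye(4)                   # state [px, py, vx, vy] and covariance

def kf_step(x, P, z):
    """One predict-update cycle; returns state, covariance, one-step-ahead position."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new, (F @ x_new)[:2]

# Simulated noisy observations of a swaying branch keypoint.
rng = np.random.default_rng(2)
for t in range(50):
    true_pos = np.array([0.5 + 0.05 * np.sin(0.6 * t), 1.2])
    z = true_pos + rng.normal(0, 0.05, size=2)
    x, P, next_pos = kf_step(x, P, z)

print(f"predicted keypoint position at the next frame: {np.round(next_pos, 3)}")
```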
Table 5 summarizes representative studies on dynamic recognition of flexible obstacles. Rather than providing exhaustive coverage, the table highlights typical technical strategies and their contributions to dynamic perception, prediction, and planning robustness in orchard environments.
Overall, flexible obstacle recognition research is expanding from static geometric modeling toward dynamic perception supported by deep learning and motion-prediction frameworks. Reinforcement learning and 3D sensing technologies further enhance environmental adaptability. However, the high real-time computational load in outdoor orchards, together with domain inconsistencies between training data and real scenes, continues to hinder robust generalization and large-scale deployment. Future research should therefore integrate dynamic modeling, multi-modal perception, and cross-scale semantic reasoning to achieve more reliable branch motion prediction and ensure safer manipulator–environment interaction during autonomous harvesting.

4. Harvesting Device Mechanism and Optimization

The mechanical subsystem provides the physical basis for autonomous harvesting, determining reachable workspace, cycle time, payload capability, and field robustness. In this review, “design” refers to the structural configuration and parameterization of manipulators and end-effectors, whereas “optimization” denotes the systematic adjustment of degrees of freedom (DOF), geometric dimensions, actuation schemes, and motion strategies to balance harvesting performance, mechanical complexity, and environmental adaptability. This section focuses on (i) manipulator configuration/design optimization (DOF allocation and dual-arm cooperation) and (ii) end-effector mechanisms (clamping, suction, cutting, and hybrid). To avoid redundancy with Section 5 (system integration), this chapter emphasizes component-level structural principles, key design parameters, and performance trade-offs reported in recent studies, rather than overall system-level coordination.

4.1. Manipulator Design and Optimization

4.1.1. Optimization of Degree-of-Freedom Configuration

The optimization of manipulator degrees of freedom involves balancing flexibility, cost, and structural complexity with task requirements. Recent studies have demonstrated that the rational allocation of DOFs is crucial for improving the reliability and value of fruit-harvesting operations.
In terms of DOF optimization, Hu [77] proposed four simplified configurations for a fruit-harvesting manipulator. The optimal scheme achieved a 78% picking success rate, reduced average picking time by 20%, and lowered overall mechanism mass by 60%. A corresponding structural layout is shown in Figure 2a. In addition, Inas [78] developed an optimization method for temperature-sensitive harvesting tasks. Compared with traditional arc-planning strategies, the optimized trajectory-planning method reduced planning error by 27% and decreased trajectory deviation by 18%, as shown in Figure 2b,c.
Peng [79] designed a seven-DOF fruit-harvesting manipulator for densely planted, temperature-controlled greenhouse environments. Using particle swarm optimization to refine four key joint dimensions, the system achieved a fruit-picking success rate of 92.9%, and the single-fruit picking time was reduced to 8–10 s, enabling an expanded operational workspace. Fu [80] developed a six-DOF harvesting robot for outdoor orchard environments, achieving smooth motion within a 1.4 m × 1.8 m × 2.0 m workspace through optimized joint configuration and structural refinement.
Overall, the development trend of manipulator DOF optimization is shifting from purely task-driven design toward schemes that balance economy and versatility. Low-DOF manipulators are advantageous in cost-sensitive scenarios and reduce control complexity, but their operational flexibility may be insufficient in environments requiring intricate manipulation. High-DOF manipulators, on the other hand, benefit from greater redundancy and versatile movement capabilities, but they introduce challenges in motion planning and nonlinear dynamical behavior. Future DOF configuration research will likely focus on biomimetic and adaptive structures, enabling manipulators to better handle variability in fruit posture and branch interference. Through lightweight structural design and efficient kinematic optimization, manipulators will achieve improved stability and adaptability in complex orchard environments.
In practice, DOF selection should be evaluated through a multi-criteria perspective, including workspace coverage, dexterity (manipulability), payload-to-weight ratio, and control complexity. For field harvesting, additional constraints—such as collision-free reachability within dense canopies, tolerance to perception and localization errors, and robustness under branch-induced disturbances—must also be incorporated into DOF optimization decisions.
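As one concrete instance of the dexterity criterion, the following sketch computes the Yoshikawa manipulability index, w = sqrt(det(J J^T)), for a planar three-link arm; the link lengths and joint angles are illustrative, and DOF studies in practice evaluate such an index over the full workspace rather than at isolated poses.

```python
# Minimal sketch of the Yoshikawa manipulability index for a planar 3R arm.
import numpy as np

def planar_jacobian(q, lengths):
    """2x3 positional Jacobian of a planar 3R arm at joint angles q (rad)."""
    l1, l2, l3 = lengths
    s1, s12, s123 = np.sin(q[0]), np.sin(q[0] + q[1]), np.sin(q[0] + q[1] + q[2])
    c1, c12, c123 = np.cos(q[0]), np.cos(q[0] + q[1]), np.cos(q[0] + q[1] + q[2])
    return np.array([
        [-l1 * s1 - l2 * s12 - l3 * s123, -l2 * s12 - l3 * s123, -l3 * s123],
        [ l1 * c1 + l2 * c12 + l3 * c123,  l2 * c12 + l3 * c123,  l3 * c123],
    ])

def manipulability(q, lengths=(0.4, 0.35, 0.25)):
    J = planar_jacobian(q, lengths)
    return np.sqrt(np.linalg.det(J @ J.T))

# A nearly stretched-out pose is close to singular; a bent pose is more dexterous.
print(f"near-singular pose: {manipulability([0.0, 0.05, 0.05]):.4f}")
print(f"bent pose:          {manipulability([0.3, 0.9, -0.6]):.4f}")
```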

4.1.2. Dual-Arm Cooperative Control

Dual-arm cooperation allows task parallelization and functional division, significantly improving harvesting efficiency and thus becoming an important research direction. Key technical challenges include collision avoidance, task coordination, and motion synchronization.
Yoshida [9] proposed an asymmetric dual-arm cooperation framework, in which the upper arm controlled the fruit peduncle while the lower arm executed fruit grasping. Using a Transition-based RRT algorithm, the system achieved smooth dual-arm coordinated trajectory planning in cluttered environments, reducing dual-arm harvesting time to 10 s and improving trajectory stability by 45%. He [81] introduced an MTSP (Multi-Traveling Salesman Problem)-based regional planning method combined with particle swarm optimization, enabling intelligent task allocation between two arms. The system achieved a harvesting success rate of 86.7% and increased computational efficiency by 18-fold. Jiang [82] developed a novel “single-vision dual-arm” system that relies on a single RGB camera for bilateral arm control. In strawberry harvesting experiments, the system achieved an 83% success rate within 9 s per fruit, significantly reducing the complexity of sensing hardware while ensuring reliable operation. Ling [83] designed a fruit-harvesting dual-arm robot with three degrees of freedom in each arm, enabling specialized picking motions. Their dual-arm system achieved an 87.5% harvesting success rate and simplified control complexity.
Although dual-arm harvesting can improve throughput, current systems remain constrained by several structural and algorithmic factors: (i) coordination robustness under perception uncertainty, including illumination variations and partial occlusion; (ii) task-allocation sensitivity to fruit spatial distribution (clustered versus uniform); and (iii) limited fault tolerance when one arm fails or deviates. From a design and optimization perspective, these limitations indicate that dual-arm cooperation should not be treated solely as a kinematic extension, but rather as a coupled problem involving perception reliability, motion synchronization, and task-level decision-making. Future work should therefore emphasize synchronized perception–planning loops, redundancy-aware task reallocation, and lightweight coordination strategies suitable for deployment on embedded controllers.

4.2. End-Effector Design and Optimization

The end-effector serves as the component that directly interacts with fruit and connects with the manipulator. Its design directly determines whether the fruit-picking task can be successfully completed. Over years of development, end-effectors have evolved into multiple categories, including support-type, suction-type, cutting-type, and hybrid composite-type. This section discusses representative end-effector mechanisms to provide insight for future design and optimization. The categories are defined based on the dominant physical interaction mechanism between the end-effector and the fruit–peduncle system, including supporting/grasping, suction adhesion, cutting separation, or hybrid coupling. This classification emphasizes functional interaction rather than structural appearance, facilitating cross-comparison of performance and applicability across different harvesting scenarios.
Support-type end-effectors are the most common category. Xu [84] proposed a three-finger rotary adaptive end-effector (Figure 3a), which can adjust finger curvature based on fruit shape and improve grasping stability. Chen [85] developed a FinRay-structure end-effector (Figure 3b), which uses a deformable TPU material and a sliding detection system to reduce surface damage during harvesting. Navas [86] designed a pneumatically controlled end-effector (Figure 3c) that adapts to different fruit sizes and is particularly suitable for harvesting small fruits in dense clusters.
Suction-type end-effectors rely on negative vacuum pressure to attach to fruit surfaces and have advantages such as softness and operational flexibility. Hua [87] proposed an omnidirectionally compliant vacuum suction end-effector with three degrees of freedom (Figure 4), enabling stable grasping through a contraction-based suction mechanism. However, suction-type end-effectors still strongly depend on fruit surface roughness and light conditions, and vacuum pressure is insufficient when handling heavier fruits, limiting broader applicability.
Cutting-type end-effectors are used for fruits that must be separated from stems or peduncles to ensure postharvest quality. Luo [88] designed a V-shaped double-blade cutting device capable of performing complex trimming operations while preventing fruit damage (Figure 5). Fu [80] developed a multi-functional rotating cutting tool that can perform both clamping and stem-cutting tasks (Figure 6). During operation, a rotating blade enters the fruit–stem junction to achieve separation. Zhao [89] proposed a dual-arm cooperative cutting end-effector suitable for fruits of various sizes, designed to maintain high cutting accuracy and adaptability in dense orchard environments.
In recent years, hybrid end-effectors have attracted increasing research attention due to their ability to combine support, gripping, and cutting functions based on practical harvesting needs. These designs effectively reduce fruit-dropping risks and shorten harvesting cycle time. Park [90] proposed a combined gripping–suction–cutting end-effector, which integrates fruit transportation into a single mechanism and reduces the picking cycle to 15.5 s, as shown in Figure 7. To further reduce structural cost, Zhang [91] designed a support–cutting integrated end-effector in which three rotating modules achieve synchronized supporting and cutting motions. Its simple structure significantly reduces manufacturing costs, and the mechanism is illustrated in Figure 8. To enhance general applicability across different fruit types, Wang [92] developed a multi-mode end-effector featuring soft fingers and suction-cup coordination, suitable for harvesting various fruits; the structure is shown in Figure 9.
Overall, the advantages and limitations of different categories of end-effectors are qualitatively summarized in Table 6, highlighting representative structural features, typical performance indicators, and practical constraints reported in the literature.
To enable a structured comparison across different end-effector categories, a literature-based multi-criteria scoring approach is adopted. This evaluation does not represent direct experimental measurements, but rather a normalized comparison derived from reported performance ranges and qualitative descriptions in existing studies. The considered dimensions include fruit damage rate, harvesting success rate, system complexity, applicability across fruit types, and maintenance cost.
Each end-effector category is assigned a normalized score on a scale of 1–4 (4 = best, 1 = worst) for each dimension, based on representative values and trends reported in the literature. Specifically, lower damage rates and system complexity correspond to higher scores, whereas higher success rates and broader applicability also result in higher scores. Maintenance cost is evaluated qualitatively according to structural simplicity and reported operational requirements.
Considering that the relative importance of these dimensions may vary in practical applications, a weight coefficient is assigned to each dimension. The final composite score is obtained by weighted summation, expressed as:
$$S_k = \sum_{i=1}^{m} \omega_i S_{k,i}$$
where $S_k$ is the composite score of the k-th end-effector category, $m$ is the number of evaluation dimensions, $\omega_i$ is the weight coefficient of the i-th dimension, and $S_{k,i}$ is the normalized score of the k-th category in the i-th dimension.
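A minimal numerical sketch of this weighted aggregation is given below; the per-dimension scores and weights are illustrative placeholders rather than the values used in Table 7.

```python
# Minimal sketch of the weighted composite score S_k defined above.
# Scores and weights are illustrative placeholders, not Table 7 values.
import numpy as np

criteria = ["damage rate", "success rate", "complexity", "applicability", "maintenance"]
weights = np.array([0.25, 0.30, 0.15, 0.20, 0.10])        # weight coefficients, summing to 1

# Normalized 1-4 scores per end-effector category (rows) and criterion (columns).
scores = {
    "clamping": np.array([3, 4, 3, 3, 3]),
    "suction":  np.array([4, 3, 3, 2, 3]),
    "cutting":  np.array([3, 3, 2, 2, 2]),
    "hybrid":   np.array([3, 4, 1, 4, 2]),
}

for category, s in scores.items():
    composite = float(weights @ s)                         # S_k = sum_i w_i * S_{k,i}
    print(f"{category:8s} composite score: {composite:.2f}")
```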
A comparative evaluation of end-effector performance, as presented in Table 7, indicates that clamping-type end-effectors exhibit consistently superior performance in small-fruit harvesting tasks, owing to their high success rate and controllable interaction force. Suction-type designs demonstrate clear advantages when interacting with large fruits featuring smooth surface textures; however, their effectiveness remains sensitive to surface conditions and environmental disturbances. Cutting-type mechanisms are particularly effective for fruit species requiring peduncle removal, although their applicability is constrained by fruit morphology and the accuracy of perception and pose estimation.
From an optimization perspective, these results suggest that end-effector selection is inherently task-dependent and involves trade-offs among harvesting success, fruit damage, system complexity, and maintenance cost, rather than a single dominant structural solution. Therefore, avoiding “one-size-fits-all” conclusions and explicitly aligning end-effector mechanisms with fruit traits and orchard constraints is essential for improving real-world harvesting reliability.

5. System Integration and Optimization

In this review, the term “system” refers to an integrated fruit-picking robotic platform that couples perception (Section 2), planning and control (Section 3), and manipulation hardware (Section 4) into a closed-loop operational framework. This framework encompasses sensing, computation, communication, actuation, and safety mechanisms, enabling coordinated task execution in complex orchard environments.
Accordingly, “system integration and optimization” in this section emphasizes cross-module coordination and consistency—such as perception–planning–execution alignment, robustness under field uncertainty, and real-time responsiveness—rather than isolated improvements to individual sensing, planning, or manipulation algorithms. The focus is placed on system-level performance trade-offs, including accuracy, latency, energy consumption, cost, and operational robustness, which collectively determine deployability in real-world harvesting scenarios.

5.1. Multi-Sensor Fusion Technologies

Multi-sensor fusion technologies integrate complementary information from heterogeneous sensing modalities to overcome the perceptual limitations of individual sensors in complex orchard environments. In the context of system integration, multi-sensor fusion serves as a foundational enabler that links perception reliability with downstream planning, control, and manipulation modules, rather than functioning as an isolated sensing component.
Fusion of Vision and LiDAR
Vision–LiDAR fusion has emerged as the mainstream perception paradigm for modern orchard environments. By combining semantic information extracted from RGB imagery with high-precision depth and point-cloud measurements, this approach enables unified fruit detection and three-dimensional localization.
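Before turning to specific systems, the minimal sketch below illustrates the geometric core of this fusion step: LiDAR points are transformed into the camera frame, projected onto the image plane, and the points falling inside a 2D fruit detection box yield a robust 3D position estimate. The intrinsic matrix, extrinsic transform, and bounding box are illustrative placeholders rather than parameters of any cited system.
```python
import numpy as np

# Illustrative sketch of vision-LiDAR fusion: project LiDAR points into the image
# and estimate a fruit's 3D position from the points inside a 2D detection box.
# K, T_cam_lidar, and the detection box below are placeholder values.

K = np.array([[600.0, 0.0, 320.0],       # camera intrinsics (fx, fy, cx, cy)
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
T_cam_lidar = np.eye(4)                   # LiDAR-to-camera extrinsics (from calibration)

def fuse_detection(points_lidar, box):
    """Return the median 3D position (camera frame) of LiDAR points inside box.

    points_lidar: (N, 3) array of LiDAR points.
    box: (u_min, v_min, u_max, v_max) detection bounding box in pixels.
    """
    # Transform points into the camera frame.
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]                # keep points in front of the camera

    # Project to pixel coordinates.
    uvw = (K @ pts_cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]

    u_min, v_min, u_max, v_max = box
    inside = (uv[:, 0] >= u_min) & (uv[:, 0] <= u_max) & \
             (uv[:, 1] >= v_min) & (uv[:, 1] <= v_max)
    if not inside.any():
        return None                                       # no depth support for this detection
    return np.median(pts_cam[inside], axis=0)             # robust 3D fruit position estimate

# Example with synthetic points and a placeholder detection box.
pts = np.random.uniform([-0.5, -0.5, 0.8], [0.5, 0.5, 1.5], size=(500, 3))
print(fuse_detection(pts, box=(250, 180, 390, 300)))
```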
Kang [93] developed an innovative LiDAR–camera fusion system that employs an improved YOLACT two-stage network for fruit detection. At a sensing distance of 0.5 m, the system achieved a localization error of approximately 2.5 mm. However, the high computational load and latency of the method hinder its applicability to lightweight robotic platforms. To address the challenge of dynamic localization during robot motion, Liu [94] proposed an ORB–Livox (Figure 10) fusion system that leverages multi-sensor spatial calibration and joint optimization. Under dynamic operating conditions, the system achieved an average localization error of 21.1 cm, with positional error remaining below 30 cm in 90% of cases. Nonetheless, its performance degraded significantly under adverse weather conditions such as heavy rain, fog, and dust.
Regarding adaptation to strong-illumination environments, Gené-Mola [95] introduced an airflow-assisted enhancement technique that achieved an apple localization success rate of 87.5%, with an F1-score of 0.858. To address the depth-loss problem encountered by RGB-D cameras under strong outdoor light, Abeyrathna [96] integrated the RealSense D455f RGB-D sensor with a single-point laser ranging module. Under ambient illumination of up to 100,000 lux, the system maintained a localization accuracy of ±2 cm. However, its performance remains constrained by the limited coverage of the single-point laser sensor, making global consistency difficult to achieve in dense orchard canopies.
In the context of multimodal sensing fusion, Gan [97] combined visible RGB imagery (using Faster R-CNN) with thermal imagery (using Hough-based circle detection). Through the CTCP (Color–Thermal Combined Probability) fusion algorithm, the detection accuracy for green citrus improved to 88.6%, an 8.9% increase over traditional RGB-only methods. Nevertheless, the system remains sensitive to field layout variations and temperature inconsistencies in complex orchard environments. Liu [98] developed a tri-modal fusion system incorporating RGB imagery, near-infrared imaging, and tactile sensing; the network architecture is shown in Figure 11. The system achieved a 99.4% classification accuracy for tomato ripeness, outperforming RGB-only approaches by 5.2% and improving recall by 34.6%. However, the increased hardware cost—approximately five times that of a single-modality solution—limits its suitability for lightweight, low-cost harvesting applications.
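The sketch below illustrates the general principle of combining per-candidate confidences from two modalities through a simple weighted rule; it is a generic illustration under assumed weights and thresholds, not the exact CTCP formulation of Gan [97] or the tri-modal network of Liu [98].
```python
import numpy as np

# Illustrative weighted combination of detection confidences from two sensing
# modalities (e.g., color and thermal). Weights and threshold are assumptions
# for illustration only.

def fuse_confidences(p_color: np.ndarray, p_thermal: np.ndarray,
                     w_color: float = 0.6, w_thermal: float = 0.4,
                     threshold: float = 0.5) -> np.ndarray:
    """Fuse two per-candidate confidence arrays and return a boolean mask
    of candidates accepted as fruit."""
    p_fused = w_color * p_color + w_thermal * p_thermal
    return p_fused >= threshold

# A candidate that is ambiguous in one modality can still be recovered
# when the other modality is confident.
p_color = np.array([0.9, 0.4, 0.3])
p_thermal = np.array([0.8, 0.8, 0.2])
print(fuse_confidences(p_color, p_thermal))   # [ True  True False]
```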
To address occlusion challenges in high-density orchards, Kaukab [99] proposed the NBR-DF-YOLOv5 model, which achieved an average recognition accuracy of 96.4%, a 4.2% improvement over standard YOLOv5. However, the increased architectural complexity requires millisecond-level synchronization with infrared imagery, placing additional computational burdens on real-time operation, especially under embedded or resource-constrained deployment conditions. In litchi detection tasks, Chakraborty [100] employed UAV-mounted RGB and multispectral sensors and achieved a detection accuracy of 94.7% under dense foliage and complex fruit–branch configurations, demonstrating the method’s potential for deployment on fruit-picking robotic platforms.
From a system integration perspective, vision–LiDAR and multimodal fusion frameworks primarily improve global perception consistency and spatial reliability, which are critical for stable trajectory planning and collision-free manipulation. However, the increased sensing redundancy and computational load introduce trade-offs in latency, energy consumption, and hardware cost, particularly for lightweight or embedded harvesting platforms. Therefore, fusion strategy selection should be jointly considered with system-level constraints, rather than evaluated solely on perception accuracy.
Accurate multi-sensor calibration and temporal synchronization remain foundational for reliable perception in dense canopies, as calibration errors directly affect point-cloud accuracy, leading to misalignment and impaired 3D reconstruction quality. To mitigate accumulated positioning errors, Zhang [101] designed a low-cost planar calibration board suitable for eye-in-hand (Figure 12) configurations, allowing direct calibration using printed A4 paper. Experiments showed that the positioning error for real orchard peaches was reduced to 5 cm after calibration. Jiang [102] further employed hdl-graph-slam to enhance LiDAR–camera fusion accuracy, integrating the robustness of NDT alignment with the precision of ICP matching. Although the resulting 3D-to-2D point-cloud registration exhibited notable accuracy improvements, performance degradation persisted in high-noise or low-texture environments. The reliance on point-cloud quality limits applicability in complex orchard scenes characterized by strong illumination variability and branch occlusions.
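As an illustration of the eye-in-hand calibration step discussed above, the sketch below applies OpenCV's calibrateHandEye to synthetic but geometrically consistent poses. In a real setup, the gripper poses would come from robot forward kinematics and the board poses from solvePnP on the calibration pattern; all numeric values here are placeholders.
```python
import numpy as np
import cv2
from scipy.spatial.transform import Rotation as R

# Sketch of eye-in-hand (camera-on-gripper) calibration using cv2.calibrateHandEye.
# Synthetic, geometrically consistent poses stand in for measured robot/board poses.

def to_T(Rm, t):
    """Assemble a 4x4 homogeneous transform from a rotation matrix and translation."""
    T = np.eye(4)
    T[:3, :3] = Rm
    T[:3, 3] = t
    return T

rng = np.random.default_rng(0)
T_cam2gripper_true = to_T(R.from_euler("xyz", [5, -3, 10], degrees=True).as_matrix(),
                          [0.03, 0.00, 0.10])             # "unknown" camera mounting offset
T_target2base = to_T(R.from_euler("z", 20, degrees=True).as_matrix(), [0.6, 0.1, 0.2])

R_g2b, t_g2b, R_t2c, t_t2c = [], [], [], []
for _ in range(10):                                        # ten robot poses observing the board
    T_g2b = to_T(R.random().as_matrix(), rng.uniform(-0.3, 0.3, 3))
    # Consistency: T_target2cam = (T_gripper2base * T_cam2gripper)^-1 * T_target2base
    T_t2c = np.linalg.inv(T_g2b @ T_cam2gripper_true) @ T_target2base
    R_g2b.append(T_g2b[:3, :3]); t_g2b.append(T_g2b[:3, 3].reshape(3, 1))
    R_t2c.append(T_t2c[:3, :3]); t_t2c.append(T_t2c[:3, 3].reshape(3, 1))

R_est, t_est = cv2.calibrateHandEye(R_g2b, t_g2b, R_t2c, t_t2c,
                                    method=cv2.CALIB_HAND_EYE_TSAI)
print("estimated camera-to-gripper translation (m):", t_est.ravel())
```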

5.2. Hierarchical Control System Architecture

The control architecture of orchard-harvesting robots is shifting from centralized frameworks toward distributed models, and from local computation toward edge–cloud collaboration, enabling real-time and adaptive responses in dynamic environments [15]. Here, “architecture” is discussed at the system level, encompassing communication topology, computational distribution, coordination mechanisms, and fault tolerance strategies. The objective of such architectural design is to ensure that perception outputs (Section 2) can be reliably and consistently translated into planning decisions (Section 3) and manipulator or end-effector actions (Section 4) under field uncertainty, network variability, and real-time constraints.

5.2.1. Distributed Architecture

Mao [103] developed a master–slave distributed navigation system tailored for collaborative orchard operations. As illustrated in Figure 13, the system provides a reliable multi-node coordination mechanism; however, signal attenuation under dense canopies often leads to cooperation failure. Ma [104] proposed a distributed deep-reinforcement-learning-based control protocol, which reduces communication load by 30–40% compared with traditional methods. Shamshiri [105] designed a CAN-bus-based modular sensing architecture that achieved a 100% data-transmission success rate. Nevertheless, its highly customized interfaces severely limit cross-platform compatibility, making it difficult to integrate with heterogeneous manipulators and modular end-effectors. Moreover, distributed systems remain susceptible to network latency, node-failure recovery constraints, and synchronization jitter, any of which may destabilize control responses and consequently impair the continuity and safety of picking operations.
From a system perspective, distributed architectures enhance scalability and parallelism, but their deployment in orchard environments remains constrained by communication reliability, synchronization stability, and recovery mechanisms under node or link failures.

5.2.2. Edge–Cloud Collaborative Computing

Edge–cloud collaborative computing is a system-level strategy for balancing real-time responsiveness against onboard computational capacity, rather than a universal solution for all orchard harvesting scenarios. By offloading computation-intensive tasks and optimizing network communication, it eases the constraints that limited onboard resources place on perception and control. Zahidi [106] proposed the E5SH system, which integrates private 5G-SA networks with edge server clusters, increasing processing throughput from 0.46 FPS to 8.6 FPS; its network architecture is illustrated in Figure 14. However, the deployment cost is high, and WiFi-based configurations are susceptible to signal fluctuations in orchard environments, limiting their suitability for practical field applications.
Cruz [107] developed a four-layer edge–cloud fusion architecture that achieved a 92% detection accuracy for strawberry pests and diseases, demonstrating effective support for real-time fruit and defect identification during harvesting. Xie [108] introduced a dual-framework computing model composed of local and remote modules, providing flexible computational options for multi-robot collaborative operations across large orchard scenarios.
At the same time, the orchard monitoring data uploaded to the cloud in such systems often lack encryption and access control, posing potential risks of data exposure. In the event of network failures, the absence of a local fallback mechanism makes task interruptions likely. Furthermore, the limited computational capability of edge nodes may become a system-level bottleneck when multiple robotic units operate concurrently.
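The sketch below gives a minimal latency-aware offloading rule with a local fallback, illustrating the kind of per-request decision such architectures must make; all timing figures (inference times, bandwidth, round-trip latency, deadline) are hypothetical placeholders that a deployed system would measure online.
```python
# Illustrative latency-aware offloading rule for edge-cloud collaboration.
# All timing parameters are hypothetical placeholders, not measured values.

def choose_execution_site(image_bytes: int,
                          t_edge_infer: float = 0.45,     # s, on-board inference time
                          t_cloud_infer: float = 0.05,    # s, server-side inference time
                          bandwidth_bps: float = 20e6,    # uplink bandwidth
                          rtt: float = 0.06,              # network round-trip time
                          deadline: float = 0.30,         # control-loop deadline
                          link_up: bool = True) -> str:
    """Return 'edge' or 'cloud' for one perception request."""
    if not link_up:
        return "edge"                                     # local fallback when the link drops
    t_cloud = rtt + image_bytes * 8 / bandwidth_bps + t_cloud_infer
    # Prefer the cloud only if it both meets the deadline and beats local compute.
    if t_cloud <= deadline and t_cloud < t_edge_infer:
        return "cloud"
    return "edge"

print(choose_execution_site(image_bytes=400_000))                  # likely 'cloud'
print(choose_execution_site(image_bytes=400_000, link_up=False))   # 'edge' fallback
```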

5.3. System-Level Integration Challenges and Trade-Offs

Although advances in perception, control architecture, and computing infrastructure have significantly improved the capabilities of fruit-picking robotic systems, practical deployment in orchard environments remains constrained by system-level trade-offs. These trade-offs arise from the coupling between sensing accuracy, computational latency, communication reliability, energy consumption, and mechanical execution stability.
In field conditions characterized by dense canopies, illumination variability, and unstructured obstacles, improving one subsystem often introduces additional burdens on others. For example, increasing sensing redundancy enhances perception robustness but raises computational and communication costs, while complex control architectures improve coordination but reduce fault tolerance under network instability. Therefore, system integration and optimization should prioritize balanced co-design across perception, planning, control, and manipulation, rather than isolated performance maximization of individual modules.

6. Field Deployment Status and System-Level Application Analysis

6.1. Representative International Systems and Deployment Characteristics

In recent years, fruit-picking robots have entered the experimental verification stage, primarily in facility-based orchards, and countries have shown varying degrees of application progress depending on their crop structures and labor conditions. Overall, the development trend is characterized by improved recognition accuracy, accelerated path-planning algorithms, and enhanced coordination of picking mechanisms, all of which are key to commercial deployment. The following subsections introduce representative systems from several countries and analyze their practical application conditions and technical features.
The strawberry-picking robot developed by the UK company Dogtooth Technologies (Figure 15) has already been deployed in commercial farms such as those in Kent. Operating under a “paid-per-ton” model, the robot autonomously navigates strawberry beds and uses onboard visual systems to estimate fruit maturity and detect occlusions. Its end-effector employs an upper–lower dual-blade mechanism to cut the peduncle, achieving automated harvesting and simultaneously sorting strawberries into clamshell packs. The system has been used across multiple farms in the UK and Australia, and has reached commercial service capability [109,110].
According to publicly available technical disclosures, the Dogtooth system is reported to achieve an average harvesting throughput of approximately 200 kg per day, with a fruit extraction rate of about 80–90%, a machine-induced waste rate below 2.5%, and an operational endurance of roughly 8 h under continuous operation [111].
However, its reliance on standardized cultivation patterns remains strong; performance degrades when the robot encounters complex branch structures, large variations in environmental humidity, or inconsistent plant geometries, which limits the system’s adaptability across different orchard settings.
The Agrobot-E Series strawberry-picking robot (Figure 16), jointly developed by Spain and the United States, has also entered the pre-commercial stage. However, the robot still faces limitations in inter-arm coordination accuracy, and its compliant end-effectors are prone to wear during multi-batch operations, issues that require further improvement. According to company-released technical descriptions, the Agrobot-E Series adopts a multi-arm parallel harvesting configuration, with up to 24 independently actuated robotic arms operating simultaneously to increase harvesting throughput, emphasizing parallel manipulation rather than single-arm picking speed [112].
The “Eve” robot developed by the Australian company Ripe Robotics (Figure 17) has been deployed for continuous fruit harvesting in orchards across Victoria and Tasmania, targeting multiple fruit types including apples and pears. However, the vacuum-suction end-effector performs poorly when the fruit surface is wet, and accurate control of suction force is required to minimize fruit damage. According to publicly reported field trials and company disclosures, the Eve system is positioned as a prototype-level harvesting robot designed for trained orchard architectures, with its harvesting performance and operational reliability being highly dependent on orchard structure and canopy management rather than standardized quantitative benchmarks [113].
The FAR system developed by Tevel Aerobotics Technologies (Figure 18) adopts a swarm-UAV mode for apple harvesting. Each flying unit is equipped with a camera and a vacuum-suction end-effector. After attaching to the fruit, the mechanism rotates to detach the apple. However, its performance is highly dependent on wireless-link stability, and the positioning accuracy is susceptible to airflow disturbances. The endurance of the UAV platform is limited, making it difficult to maintain stable operation over long periods of continuous harvesting. In addition, under dense canopy occlusions, target recognition reliability remains insufficient. According to publicly available technical reports and patent disclosures, the FAR system is designed to handle a wide range of apple sizes, typically from approximately 50 g to 700 g, and employs multiple tethered or coordinated aerial units to enable continuous operation, although system robustness remains strongly influenced by environmental conditions and communication constraints [114].
Across these representative systems, field deployment performance is strongly conditioned by orchard structure, crop type, and operational assumptions embedded in system design. While task-specific optimization enables competitive performance under controlled conditions, limited adaptability, reliance on standardized canopy layouts, and sensitivity to environmental variability remain common barriers to scalable deployment. These observations highlight that system-level integration, rather than isolated algorithmic or mechanical improvements, is the key determinant of real-world applicability.

6.2. Representative China Systems and Application Constraints

Due to the relatively late development of domestic fruit-picking robotic technologies, most reported systems in China remain at an earlier stage of maturity compared with internationally deployed counterparts. The fruit-harvesting robot launched by EasySmart (Figure 19) is capable of continuous 24 h operation and achieves more than a threefold improvement in efficiency. According to product introductions and publicly available reports, the EasySmart system is primarily designed for facility-based strawberry production, where structured planting layouts and stable environmental conditions enable long-duration operation and efficiency gains compared with manual harvesting [115].
However, most domestic harvesting robots remain heavily dependent on controlled-environment greenhouses, showing weak robustness to natural variations in lighting and fruit density. In complex open-field orchards, the mobility of the robot chassis and the positioning accuracy of the system remain insufficient, greatly limiting its practical applicability. Moreover, under conditions of large variability in fruit morphology, the stability and compliance performance of the end-effector still require further improvement.

6.3. Cross-System Synthesis and Engineering Implications

To improve clarity and avoid repetitive descriptions across Section 6.1, Section 6.2 and Section 6.3, the representative harvesting robotic systems discussed above are further summarized in Table 8 from a system-deployment perspective. Specifically, the table consolidates key attributes (company, target crop, end-effector type, metric categories, commercialization status, and major limitations) to enable rapid cross-system comparison. Quantitative details are intentionally placed in the corresponding text descriptions rather than in the table, so that the table remains concise and readable while the narrative provides verifiable evidence and context for each system.
In summary, Table 8 indicates that current harvesting robots differ not only in perception algorithms but also—more critically—in their system architecture and deployment prerequisites. Many reported successful deployments are coupled with structured cultivation patterns (e.g., greenhouse tabletop systems or trained orchard architectures) and specific operational constraints (e.g., infrastructure dependence and platform endurance), which limits cross-platform migration to open-field orchards. In addition, multi-actuator complexity and frequent calibration/maintenance requirements remain practical barriers that increase downtime and operating costs. Therefore, future research should move beyond accuracy-oriented improvements and prioritize lightweight and robust perception–manipulation coupling, modular end-effector design for fruit variability, and field-oriented evaluation protocols, thereby accelerating the transition of harvesting robots from experimental validation to reliable commercial deployment under real orchard uncertainty.

7. Technological Development Trends and Analysis

This section presents a bibliometric analysis based on the WoSCC dataset described in the Review Methodology and Paper Selection (PRISMA) section, aiming to quantitatively characterize the research evolution and emerging trends in intelligent fruit-picking robots. Specifically, we examine: (i) annual publication trends to reveal the growth trajectory of the field, (ii) geographic distribution to identify major contributing countries and regions and their research activity, and (iii) keyword co-occurrence and evolution to map research hotspots and shifting thematic focuses. To support knowledge mapping and interpretation, the retrieved records were analyzed using VOSviewer (Version 1.6.20), which enables visualization of collaboration patterns and topic structures. Only peer-reviewed journal articles were included in this analysis to ensure data consistency and comparability. The results in this section provide a macro-level perspective that complements the technical review in Section 2, Section 3, Section 4, Section 5 and Section 6 and helps contextualize future research directions.

7.1. Publication Trend Analysis

Using the Web of Science Core Collection, peer-reviewed journal articles containing the keywords “Robotic Harvesting”, “Agricultural Robot”, and “Fruit Picking Robot” were retrieved for the period from 1 January 2000 to 1 October 2025. Only journal articles were included in this analysis, while reviews, conference proceedings, and other document types were excluded to ensure consistency and comparability. A total of 564 relevant publications were identified. The annual distribution of publications was statistically analyzed, and the year-by-year trend is presented in Figure 20.
As shown in Figure 20, the annual number of publications in this field has exhibited a clear upward trend since 2000. From 2000 to 2010, research output remained relatively low, with only a limited number of studies published each year, reflecting the early exploratory stage of orchard harvesting robotics. After 2010, driven by advances in agricultural automation and robotic technologies, academic interest in orchard harvesting robots gradually increased, and the annual publication volume began to rise steadily. After 2018, factors such as labor shortages in agriculture and rapid breakthroughs in artificial intelligence further accelerated research activity, leading to pronounced growth in annual publications.
It should be noted that the publication data for 2025 represent a partial year, as the literature retrieval was completed in October 2025. Overall, from 2000 to 2025, the number of studies related to orchard harvesting robots indexed in the Web of Science Core Collection shows a clear upward trajectory, particularly accelerating in the last five years. This trend is closely aligned with the rapid advancement of agricultural robotics technologies and the rising industrial demand.
On the one hand, the increasing volume of research worldwide has contributed to significant technological progress in this domain. On the other hand, the growing demand for automated harvesting in modern agricultural production continues to stimulate academic and industrial research output. With sustained technological innovation and application-driven development, contributions in the field of orchard harvesting robots are expected to maintain an upward momentum. Future studies are likely to place greater emphasis on machine vision, structural optimization, and multi-robot collaboration, with the aim of further improving harvesting efficiency and operational reliability.

7.2. Analysis of Publication Distribution by Region

Following the above analysis, the regional distribution of publications indexed in the Web of Science Core Collection was examined. Among all contributing regions, China produced the highest number of publications, with 294 articles accounting for more than half of the total output, followed by the United States with 75 publications. The national publication distribution is illustrated in Figure 21. Furthermore, the dataset was imported into VOSviewer for collaboration analysis, revealing the cooperative relationships among major contributing countries, as shown in Figure 22.
From the perspective of publication distribution, research on fruit-picking robots exhibits a clear regional concentration. China holds a leading position in this field, contributing more than 45% of the global publications, with research efforts primarily focused on fruit recognition, path planning, and compliant end-effector design. The United States, Japan, Republic of Korea, Italy, and the Netherlands also demonstrate strong research performance in algorithmic innovation and system integration. Specifically, the United States emphasizes commercialization and system-level engineering, while Japan and Republic of Korea focus on precise localization and coordinated control. European countries show notable strengths in orchard informatization and multi-robot collaboration.
At the level of regional collaboration, international cooperation continues to intensify, forming research networks centered on China–United States, Japan–Republic of Korea, and intra-European partnerships. These collaborations facilitate algorithm sharing, model standardization, and cross-regional knowledge transfer. Overall, the global research landscape reveals a trend characterized by Asia leading, Europe and North America synergizing, and multipolar expansion. With the rapid development of smart agriculture and the deepening of cross-disciplinary collaboration, future research is expected to prioritize multi-sensor fusion, low-cost system standardization, and the advancement of intelligent and large-scale deployment of fruit-picking robots.

7.3. Keyword Hotspot Analysis

In this section, bibliometric and visualization techniques are employed to analyze keyword co-occurrence and clustering patterns in journal articles published between 2000 and 2025. Based on this analysis, a keyword co-occurrence network is constructed to reveal the structural relationships and temporal evolution of research topics in orchard fruit-picking robotics, as illustrated in Figure 23.
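As a methodological illustration, the sketch below assembles a keyword co-occurrence matrix of the kind visualized by VOSviewer from bibliographic records; the example records are placeholders, not entries from the WoSCC dataset.
```python
from itertools import combinations
from collections import Counter

# Minimal sketch of building keyword co-occurrence counts from bibliographic
# records, analogous to the input used for co-occurrence network mapping.
# The records below are illustrative placeholders.

records = [
    {"deep learning", "yolo", "fruit detection"},
    {"fruit detection", "rgb-d", "localization"},
    {"motion planning", "reinforcement learning"},
    {"deep learning", "fruit detection", "soft gripper"},
]

cooccurrence = Counter()
for keywords in records:
    for a, b in combinations(sorted(keywords), 2):   # each unordered keyword pair per record
        cooccurrence[(a, b)] += 1

for (a, b), n in cooccurrence.most_common(5):
    print(f"{a} -- {b}: {n}")
```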
Based on the keyword analysis of relevant publications over the past two decades, research on orchard fruit-picking robots has primarily focused on target recognition, path planning, mechanical arm design, end-effector development, and multi-sensor fusion. In the early stage, research was dominated by keywords such as “vision recognition”, “fruit detection”, and “image segmentation”, reflecting an emphasis on improving fruit identification and localization accuracy. Since 2020, with the rapid advancement of deep learning and data-driven approaches, keywords such as “YOLO”, “deep learning”, “RGB-D”, and “LiDAR fusion” have appeared with increasing frequency. This trend indicates a clear shift from traditional vision-based methods toward intelligent perception frameworks and lightweight detection models.
In addition, the increasing prominence of keywords such as “motion planning”, “reinforcement learning”, and “soft gripper” reflects a broadening of research focus from perception-level understanding toward autonomous decision-making and compliant manipulation. The clustering results further reveal a transition of research hotspots from single-task modules to multi-module coordination and system-level integration, highlighting an emerging trend toward intelligent perception, collaborative planning, and lightweight execution. Overall, future research is expected to continue advancing lightweight algorithms, cross-modal perception, and intelligent control system optimization, thereby laying a solid foundation for robust deployment and large-scale application of fruit-picking robots in real orchard environments.

8. Conclusions and Outlook

In recent years, driven by rapid advances in artificial intelligence, computer vision, and intelligent control technologies, fruit-picking robots have entered a critical stage of intelligent evolution. This study systematically reviewed research progress from 2000 to 2025, covering key aspects such as target recognition and localization, path planning and obstacle avoidance, multi-modal perception, end-effector design, and system integration. Overall, the development trajectory of fruit-picking robots reveals a clear transition from scenario-driven and single-module solutions toward integrated, multi-module collaborative systems oriented to real-world deployment.
Despite substantial progress, several fundamental challenges continue to limit large-scale application in complex orchard environments. These challenges include robustness of perception under variable illumination and occlusion, stable manipulation in unstructured and deformable environments, real-time decision-making under limited onboard computational resources, and reliable system-level integration across perception, planning, and control. Moreover, gaps remain between laboratory prototypes and field-ready systems, highlighting the need for improved standardization, evaluation benchmarks, and deployment-oriented design.
Looking forward, future research is expected to emphasize intelligent perception combined with lightweight sensing, adaptive planning and compliant manipulation, and tighter coupling between perception, decision-making, and control modules. System-level optimization, edge–cloud collaboration, and multi-robot coordination are also anticipated to play increasingly important roles in enhancing scalability, efficiency, and reliability. With continued advances in agricultural robotics and cross-disciplinary collaboration, fruit-picking robots are expected to transition from experimental systems toward practical, large-scale deployment, providing critical technological support for the sustainable development of modern orchards.

Author Contributions

Conceptualization, T.L. and H.W.; methodology, T.L., H.W. and F.S.; software, T.L. and X.G.; validation, X.L., X.G. and H.L.; formal analysis, H.W. and J.Y.; investigation, F.S., J.Y. and H.W.; resources, H.W. and X.L.; data curation, T.L. and H.L.; writing—original draft preparation, T.L.; writing—review and editing, T.L., X.L. and H.W.; funding acquisition, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Open Project of the Key Laboratory of Agricultural Equipment Technology for Hilly and Mountainous Areas, Ministry of Agriculture and Rural Affairs (No. 2025QSNZ03), and by the Sichuan Cuisine Industrialization Engineering Research Center of Sichuan Higher Education Institutions, Sichuan Tourism University, Chengdu 610100, China (No. GCZX25-41).

Data Availability Statement

The dataset supporting this research is available within the article. Data access requests should be addressed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, J.; Kang, N.; Qu, Q.; Zhou, L.; Zhang, H. Automatic fruit picking technology: A comprehensive review of research advances. Artif. Intell. Rev. 2024, 57, 54. [Google Scholar] [CrossRef]
  2. Tang, Y.; Chen, M.; Wang, C.; Luo, L.; Li, J.; Lian, G.; Zou, X. Recognition and Localization Methods for Vision-Based Fruit Picking Robots: A Review. Front. Plant Sci. 2020, 11, 510. [Google Scholar] [CrossRef] [PubMed]
  3. Feng, X.; Ding, X.; Yuan, J.; Xu, W.; Liu, J. Optimization Study on the Freshwater Production Ratio from the Freezing and Thawing Process of Saline Water with Varied Qualities. Agronomy 2024, 15, 33. [Google Scholar] [CrossRef]
  4. Li, H.; Huang, K.; Sun, Y.; Lei, X.; Yuan, Q.; Zhang, J.; Lv, X. An autonomous navigation method for orchard mobile robots based on octree 3D point cloud optimization. Front. Plant Sci. 2024, 15, 1510683. [Google Scholar] [CrossRef]
  5. Min, W.; Wang, Z.; Yang, J.; Liu, C.; Jiang, S. Vision-based fruit recognition via multi-scale attention CNN. Comput. Electron. Agric. 2023, 210, 107911. [Google Scholar] [CrossRef]
  6. Tan, Y.; Liu, X.; Zhang, J.; Wang, Y.; Hu, Y. A Review of Research on Fruit and Vegetable Picking Robots Based on Deep Learning. Sensors 2025, 25, 3677. [Google Scholar] [CrossRef] [PubMed]
  7. Santo, B. Outstanding in the Field: Robots That Can Pick Fruit. Available online: https://control.com/industry-articles/outstanding-in-the-field-robots-that-can-pick-fruit/ (accessed on 22 January 2026).
  8. Kootstra, G.; Wang, X.; Blok, P.M.; Hemming, J.; van Henten, E. Selective Harvesting Robotics: Current Research, Trends, and Future Directions. Curr. Robot. Rep. 2021, 2, 95–104. [Google Scholar] [CrossRef]
  9. Yoshida, T.; Onishi, Y.; Kawahara, T.; Fukao, T. Automated harvesting by a dual-arm fruit harvesting robot. ROBOMECH J. 2022, 9, 19. [Google Scholar] [CrossRef]
  10. Barnett, J.; Duke, M.; Au, C.K.; Lim, S.H. Work distribution of multiple Cartesian robot arms for kiwifruit harvesting. Comput. Electron. Agric. 2020, 169, 105202. [Google Scholar] [CrossRef]
  11. Zhang, K.; Lammers, K.; Chu, P.; Li, Z.; Lu, R. System design and control of an apple harvesting robot. Mechatronics 2021, 79, 102644. [Google Scholar] [CrossRef]
  12. Next Move Strategy Consulting. China Agriculture Robots Market: By Type (Unmanned Aerial Vehicles, Milking Robots, Driverless Tractors, Automated Harvest Robots, Others), by Farming Type, by Application—Opportunity Analysis and Industry Forecast 2023–2030; AG630; Next Move Strategy Consulting: Pune, India, 2025; p. 107. [Google Scholar]
  13. Ministry of Industry and Information Technology of the People’s Republic of China; National Development and Reform Commission; Ministry of Science and Technology; Ministry of Public Security; Ministry of Civil Affairs; Ministry of Housing and Urban-Rural Development; Ministry of Agriculture and Rural Affairs; National Health Commission; Ministry of Emergency Management; People’s Bank of China; et al. The 14th Five-Year Plan for the Development of the Robotics Industry (Gongxinbu Lianguī [2021] No. 206); 2021. Available online: https://www.gov.cn/zhengce/zhengceku/2021-12/28/content_5664988.htm (accessed on 22 January 2026).
  14. National Development and Reform Commission of the People’s Republic of China. New Round of Agricultural Machinery Purchase Subsidy Policy Released; NDRC: Beijing, China, 2021. [Google Scholar]
  15. Zhou, H.; Wang, X.; Au, W.; Kang, H.; Chen, C. Intelligent robots for fruit harvesting: Recent developments and future challenges. Precis. Agric. 2022, 23, 1856–1907. [Google Scholar] [CrossRef]
  16. Kaleem, A.; Hussain, S.; Aqib, M.; Cheema, M.J.M.; Saleem, S.R.; Farooq, U. Development Challenges of Fruit-Harvesting Robotic Arms: A Critical Review. AgriEngineering 2023, 5, 2216–2237. [Google Scholar] [CrossRef]
  17. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Med. 2009, 6, e1000097. [Google Scholar] [CrossRef] [PubMed]
  18. Chen, Z.; Lei, X.; Yuan, Q.; Qi, Y.; Ma, Z.; Qian, S.; Lyu, X. Key Technologies for Autonomous Fruit- and Vegetable-Picking Robots: A Review. Agronomy 2024, 14, 2233. [Google Scholar] [CrossRef]
  19. Liu, J.; Zhou, G.; Chen, X. Advances in Semi-Supervised Learning Methods for Plant Image Processing. Comput. Eng. Appl. 2025, 61, 45–68. (In Chinese) [Google Scholar] [CrossRef]
  20. Gu, Y.; Shi, G.; Liu, X. Optimization of Rotten Apple Image Segmentation Using a Spatial Feature Clustering Algorithm. Trans. Chin. Soc. Agric. Eng. 2016, 32, 159–167. (In Chinese) [Google Scholar] [CrossRef]
  21. Qi, J.; Chen, M.; Yang, Z. Flower Image Segmentation Based on an Improved OTSU Algorithm Using the HSV Color Model. J. Chin. Agric. Mech. 2019, 40, 155–160. (In Chinese) [Google Scholar]
  22. Ma, B.; Hua, Z.; Wen, Y.; Deng, H.; Zhao, Y.; Pu, L.; Song, H. Using an improved lightweight YOLOv8 model for real-time detection of multi-stage apple fruit in complex orchard environments. Artif. Intell. Agric. 2024, 11, 70–82. [Google Scholar] [CrossRef]
  23. Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object Detection in 20 Years: A Survey. Proc. IEEE 2023, 111, 257–276. [Google Scholar] [CrossRef]
  24. Parvathi, S.; Tamil Selvi, S. Detection of maturity stages of coconuts in complex background using Faster R-CNN model. Biosyst. Eng. 2021, 202, 119–132. [Google Scholar] [CrossRef]
  25. Chen, B.; Rao, H.; Wang, Y. Detection of Camellia oleifera Fruits in Natural Scenes Based on Faster R-CNN. Acta Agric. Jiangxi 2021, 33, 67–70. (In Chinese) [Google Scholar]
  26. Wang, Y.; Cao, H. Detection of Laiyang Pear Targets Based on the Mask R-CNN Model. J. Qingdao Agric. Univ. (Nat. Sci.) 2020, 37, 301–305. (In Chinese) [Google Scholar]
  27. Siricharoen, P.; Yomsatieankul, W.; Bunsri, T. Fruit maturity grading framework for small dataset using single image multi-object sampling and Mask R-CNN. Smart Agric. Technol. 2023, 3, 100130. [Google Scholar] [CrossRef]
  28. Wu, L.; Ma, J.; Zhao, Y.; Liu, H. Apple Detection in Complex Scene Using the Improved YOLOv4 Model. Agronomy 2021, 11, 476. [Google Scholar] [CrossRef]
  29. Lu, S.; Chen, W.; Zhang, X.; Karkee, M. Canopy-attention-YOLOv4-based immature/mature apple fruit detection on dense-foliage tree architectures for early crop load estimation. Comput. Electron. Agric. 2022, 193, 106696. [Google Scholar] [CrossRef]
  30. Xu, L.; Wang, Y.; Shi, X.; Tang, Z.; Chen, X.; Wang, Y.; Zou, Z.; Huang, P.; Liu, B.; Yang, N.; et al. Real-time and accurate detection of citrus in complex scenes based on HPL-YOLOv4. Comput. Electron. Agric. 2023, 205, 107590. [Google Scholar] [CrossRef]
  31. Li, A.; Wang, C.; Ji, T.; Wang, Q.; Zhang, T. D3-YOLOv10: Improved YOLOv10-Based Lightweight Tomato Detection Algorithm Under Facility Scenario. Agriculture 2024, 14, 2268. [Google Scholar] [CrossRef]
  32. Ye, R.; Shao, G.; Gao, Q.; Zhang, H.; Li, T. CR-YOLOv9: Improved YOLOv9 Multi-Stage Strawberry Fruit Maturity Detection Application Integrated with CRNET. Foods 2024, 13, 2571. [Google Scholar] [CrossRef] [PubMed]
  33. Ma, N.; Sun, Y.; Li, C.; Liu, Z.; Song, H. AHG-YOLO: Multi-category detection for occluded pear fruits in complex orchard scenes. Front. Plant Sci. 2025, 16, 1580325. [Google Scholar] [CrossRef]
  34. Kuznetsova, A.; Maleva, T.; Soloviev, V. Using YOLOv3 Algorithm with Pre- and Post-Processing for Apple Detection in Fruit-Harvesting Robot. Agronomy 2020, 10, 1016. [Google Scholar] [CrossRef]
  35. Lawal, M.O. Tomato detection based on modified YOLOv3 framework. Sci. Rep. 2021, 11, 1447. [Google Scholar] [CrossRef]
  36. Tsai, F.T.; Nguyen, V.T.; Duong, T.P.; Phan, Q.H.; Lien, C.H. Tomato Fruit Detection Using Modified Yolov5m Model with Convolutional Neural Networks. Plants 2023, 12, 3067. [Google Scholar] [CrossRef]
  37. Chen, Y.; Xu, H.; Chang, P.; Huang, Y.; Zhong, F.; Jia, Q.; Chen, L.; Zhong, H.; Liu, S. CES-YOLOv8: Strawberry Maturity Detection Based on the Improved YOLOv8. Agronomy 2024, 14, 1353. [Google Scholar] [CrossRef]
  38. Gai, R.; Liu, Y.; Xu, G. TL-YOLOv8: A Blueberry Fruit Detection Algorithm Based on Improved YOLOv8 and Transfer Learning. IEEE Access 2024, 12, 86378–86390. [Google Scholar] [CrossRef]
  39. Gao, X.; Zhang, Y. Detection of Fruit using YOLOv8-based Single Stage Detectors. Int. J. Adv. Comput. Sci. Appl. IJACSA 2023, 14, 0141208. [Google Scholar]
  40. Fu, Y.; Li, W.; Li, G.; Dong, Y.; Wang, S.; Zhang, Q.; Li, Y.; Dai, Z. Multi-stage tomato fruit recognition method based on improved YOLOv8. Front. Plant Sci. 2024, 15, 1447263. [Google Scholar] [CrossRef] [PubMed]
  41. Wang, N.; Qian, T.; Yang, J.; Li, L.; Zhang, Y.; Zheng, X.; Xu, Y.; Zhao, H.; Zhao, J. An Enhanced YOLOv5 Model for Greenhouse Cucumber Fruit Recognition Based on Color Space Features. Agriculture 2022, 12, 1556. [Google Scholar] [CrossRef]
  42. Yu, Y.; Liu, Y.; Li, Y.; Xu, C.; Li, Y. Object Detection Algorithm for Citrus Fruits Based on Improved YOLOv5 Model. Agriculture 2024, 14, 1798. [Google Scholar] [CrossRef]
  43. Chen, G.; Hou, Y.; Cui, T.; Li, H.; Shangguan, F.; Cao, L. YOLOv8-CML: A lightweight target detection method for color-changing melon ripening in intelligent agriculture. Sci. Rep. 2024, 14, 14400. [Google Scholar] [CrossRef]
  44. Tao, X.; Feng, Q.; Han, L. Lightweight Apple Instance Segmentation Method SSWYOLOv11n for Complex Orchard Environments. Smart Agric. 2025, 7, 114–123. (In Chinese) [Google Scholar] [CrossRef]
  45. Maheswari, P.; Raja, P.; Karkee, M.; Raja, M.; Baig, R.U.; Trung, K.T.; Hoang, V.T. Performance analysis of modified DeepLabv3+ architecture for fruit detection and localization in apple orchards. Smart Agric. Technol. 2025, 10, 100729. [Google Scholar] [CrossRef]
  46. Ni, X.; Li, C.; Jiang, H.; Takeda, F. Deep learning image segmentation and extraction of blueberry fruit traits associated with harvestability and yield. Hortic. Res. 2020, 7, 110. [Google Scholar] [CrossRef]
  47. Yang, X.; Zhao, W.; Wang, Y.; Yan, W.Q.; Li, Y. Lightweight and efficient deep learning models for fruit detection in orchards. Sci. Rep. 2024, 14, 26086. [Google Scholar] [CrossRef]
  48. Jia, W.; Wei, J.; Zhang, Q.; Pan, N.; Niu, Y.; Yin, X.; Ding, Y.; Ge, X. Accurate segmentation of green fruit based on optimized mask RCNN application in complex orchard. Front. Plant Sci. 2022, 13, 955256. [Google Scholar] [CrossRef]
  49. López-Barrios, J.D.; Escobedo Cabello, J.A.; Gómez-Espinosa, A.; Montoya-Cavero, L.-E. Green Sweet Pepper Fruit and Peduncle Detection Using Mask R-CNN in Greenhouses. Appl. Sci. 2023, 13, 6296. [Google Scholar] [CrossRef]
  50. Chen, X.; Dong, G.; Fan, X.; Xu, Y.; Liu, T.; Zhou, J.; Jiang, H. Fruit Stalk Recognition and Picking Point Localization of New Plums Based on Improved DeepLabv3+. Agriculture 2024, 14, 2120. [Google Scholar] [CrossRef]
  51. Paudyal, S. Realizing the Potential of Eastern Uganda’s Smallholder Dairy Sector through Participatory Evaluation. Agriculture 2024, 14, 1173. [Google Scholar] [CrossRef]
  52. Qin, X.; Cao, J.; Zhang, Y.; Dong, T.; Cao, H. Development of an Optimized YOLO-PP-Based Cherry Tomato Detection System for Autonomous Precision Harvesting. Processes 2025, 13, 353. [Google Scholar] [CrossRef]
  53. Gutiérrez del Pozo, D.; Martín-Gómez, J.J.; Reyes Tomala, N.I.; Tocino, Á.; Cervantes, E. Seed Geometry in Species of the Nepetoideae (Lamiaceae). Horticulturae 2025, 11, 315. [Google Scholar] [CrossRef]
  54. Liang, Y.; Jiang, W.; Liu, Y.; Wu, Z.; Zheng, R. Picking-Point Localization Algorithm for Citrus Fruits Based on Improved YOLOv8 Model. Agriculture 2025, 15, 237. [Google Scholar] [CrossRef]
  55. Li, Y.; Feng, Q.; Zhang, Y.; Peng, C.; Ma, Y.; Liu, C.; Ru, M.; Sun, J.; Zhao, C. Peduncle collision-free grasping based on deep reinforcement learning for tomato harvesting robot. Comput. Electron. Agric. 2024, 216, 108488. [Google Scholar] [CrossRef]
  56. Li, P.; Wen, M.; Zeng, Z.; Tian, Y. Cherry Tomato Bunch and Picking Point Detection for Robotic Harvesting Using an RGB-D Sensor and a StarBL-YOLO Network. Horticulturae 2025, 11, 949. [Google Scholar] [CrossRef]
  57. Li, Y.; Zhang, Z.; Yu, D.; Li, Z. Advances in Deep-Learning-Based Spherical Fruit Harvesting and Recognition Algorithms. J. Fruit. Sci. 2025, 42, 412–426. (In Chinese) [Google Scholar] [CrossRef]
  58. Wang, Y.; Fu, C.; Huang, R.; Tong, K.; He, Y.; Xu, L. Path planning for mobile robots in greenhouse orchards based on improved A* and fuzzy DWA algorithms. Comput. Electron. Agric. 2024, 227, 109598. [Google Scholar] [CrossRef]
  59. Ye, L.; Li, J.; Li, P. Improving path planning for mobile robots in complex orchard environments: The continuous bidirectional Quick-RRT* algorithm. Front. Plant Sci. 2024, 15, 1337638. [Google Scholar] [CrossRef] [PubMed]
  60. Alshammrei, S.; Boubaker, S.; Kolsi, L. Improved Dijkstra Algorithm for Mobile Robot Path Planning and Obstacle Avoidance. Comput. Mater. Contin. 2022, 72, 5939–5954. [Google Scholar] [CrossRef]
  61. Zhang, C.; Wang, H.; Fu, L.H.; Pei, Y.H.; Lan, C.Y.; Hou, H.Y.; Song, H. Three-dimensional continuous picking path planning based on ant colony optimization algorithm. PLoS ONE 2023, 18, e0282334. [Google Scholar] [CrossRef]
  62. Tian, H.; Mo, Z.; Ma, C.; Xiao, J.; Jia, R.; Lan, Y.; Zhang, Y. Design and validation of a multi-objective waypoint planning algorithm for UAV spraying in orchards based on improved ant colony algorithm. Front. Plant Sci. 2023, 14, 1101828. [Google Scholar] [CrossRef]
  63. Li, C. Path Planning of Fruit and Vegetable Picking Robots Based on Improved a* Algorithm and Particle Swarm Optimization Algorithm. INMATEH Agric. Eng. 2023, 71, 470–482. [Google Scholar] [CrossRef]
  64. Chen, Q.; Wang, R.; Lyu, M.; Zhang, J. Transformer-Based Reinforcement Learning for Multi-Robot Autonomous Exploration. Sensors 2024, 24, 5083. [Google Scholar] [CrossRef]
  65. Dou, H.; Chen, Z.; Zhai, C.; Zou, W.; Song, J.; Feng, F.; Zhang, Y.; Wang, X. Advances in Autonomous Navigation Technology for Intelligent Orchard Operation Equipment. J. Agric. Mech. Res. 2024, 55, 1–22. (In Chinese) [Google Scholar]
  66. Gao, R.; Zhou, Q.; Cao, S.; Jiang, Q. Apple-Picking Robot Picking Path Planning Algorithm Based on Improved PSO. Electronics 2023, 12, 1832. [Google Scholar] [CrossRef]
  67. Cao, X.; Yan, H.; Huang, Z.; Ai, S.; Xu, Y.; Fu, R.; Zou, X. A Multi-Objective Particle Swarm Optimization for Trajectory Planning of Fruit Picking Manipulator. Agronomy 2021, 11, 2286. [Google Scholar] [CrossRef]
  68. Lin, G.; Zhu, L.; Li, J.; Zou, X.; Tang, Y. Collision-free path planning for a guava-harvesting robot based on recurrent deep reinforcement learning. Comput. Electron. Agric. 2021, 188, 106350. [Google Scholar] [CrossRef]
  69. Sun, J.; Feng, Q.; Zhang, Y.; Ru, M.; Li, Y.; Li, T.; Zhao, C. Fruit flexible collecting trajectory planning based on manual skill imitation for grape harvesting robot. Comput. Electron. Agric. 2024, 225, 109332. [Google Scholar] [CrossRef]
  70. Ju, C.; Kim, J.; Seol, J.; Son, H.I. A review on multirobot systems in agriculture. Comput. Electron. Agric. 2022, 202, 107336. [Google Scholar] [CrossRef]
  71. Li, T.; Xie, F.; Zhao, Z.; Zhao, H.; Guo, X.; Feng, Q. A multi-arm robot system for efficient apple harvesting: Perception, task plan and control. Comput. Electron. Agric. 2023, 211, 107979. [Google Scholar] [CrossRef]
  72. Chen, X.; Lu, C.; Guo, Z.; Yin, C.; Wu, X.; Lv, X.; Chen, Q. Research on 3D Obstacle Avoidance Path Planning for Apple Picking Robotic Arm. Agronomy 2025, 15, 1031. [Google Scholar] [CrossRef]
  73. Yan, B.; Quan, J.; Yan, W. Three-Dimensional Obstacle Avoidance Harvesting Path Planning Method for Apple-Harvesting Robot Based on Improved Ant Colony Algorithm. Agriculture 2024, 14, 1336. [Google Scholar] [CrossRef]
  74. Magistri, F.; Pan, Y.; Bartels, J.; Behley, J.; Stachniss, C.; Lehnert, C. Improving Robotic Fruit Harvesting Within Cluttered Environments Through 3D Shape Completion. IEEE Robot. Autom. Lett. 2024, 9, 7357–7364. [Google Scholar] [CrossRef]
  75. Lin, G.; Wang, C.; Xu, Y.; Wang, M.; Zhang, Z.; Zhu, L. Real-time guava tree-part segmentation using fully convolutional network with channel and spatial attention. Front. Plant Sci. 2022, 13, 991487. [Google Scholar] [CrossRef]
  76. Yao, J.; Yu, Q.; Deng, G.; Wu, T.; Zheng, D.; Lin, G.; Zhu, L.; Huang, P. A Fast and Accurate Obstacle Segmentation Network for Guava-Harvesting Robot via Exploiting Multi-Level Features. Sustainability 2022, 14, 12899. [Google Scholar] [CrossRef]
  77. Hu, G.; Chen, C.; Chen, J.; Sun, L.; Sugirbay, A.; Chen, Y.; Jin, H.; Zhang, S.; Bu, L. Simplified 4-DOF manipulator for rapid robotic apple harvesting. Comput. Electron. Agric. 2022, 199, 107177. [Google Scholar] [CrossRef]
  78. Saoud, I.; Jaafari, H.I.; Chahboun, A.; Raissouni, N.; Achhab, N.B.; Azyat, A. Design optimization and trajectory planning of a strawberry harvesting manipulator. Bull. Electr. Eng. Inform. 2024, 13, 3948–3959. [Google Scholar] [CrossRef]
  79. Peng, C.; Feng, Q.; Guo, Z.; Ma, Y.; Li, Y.; Zhang, Y.; Gao, L. Structural Parameter Optimization of a Tomato Robotic Harvesting Arm: Considering Collision-Free Operation Requirements. Plants 2024, 13, 3211. [Google Scholar] [CrossRef]
  80. Fu, M.; Guo, S.; Chen, A.; Cheng, R.; Cui, X. Design and experimentation of multi-fruit envelope-cutting kiwifruit picking robot. Front. Plant Sci. 2024, 15, 1338050. [Google Scholar] [CrossRef]
  81. He, Z.; Ma, L.; Wang, Y.; Wei, Y.; Ding, X.; Li, K.; Cui, Y. Double-Arm Cooperation and Implementing for Harvesting Kiwifruit. Agriculture 2022, 12, 1763. [Google Scholar] [CrossRef]
  82. Jiang, Y.; Liu, J.; Wang, J.; Li, W.; Peng, Y.; Shan, H. Development of a dual-arm rapid grape-harvesting robot for horizontal trellis cultivation. Front. Plant Sci. 2022, 13, 881904. [Google Scholar] [CrossRef]
  83. Ling, X.; Zhao, Y.; Gong, L.; Liu, C.; Wang, T. Dual-arm cooperation and implementing for robotic harvesting tomato using binocular vision. Robot. Auton. Syst. 2019, 114, 134–143. [Google Scholar] [CrossRef]
  84. Xu, Y.; Lv, M.; Xu, Q.; Xu, R. Design and Analysis of a Robotic Gripper Mechanism for Fruit Picking. Actuators 2024, 13, 338. [Google Scholar] [CrossRef]
  85. Chen, K.; Li, T.; Yan, T.; Xie, F.; Feng, Q.; Zhu, Q.; Zhao, C. A Soft Gripper Design for Apple Harvesting with Force Feedback and Fruit Slip Detection. Agriculture 2022, 12, 1802. [Google Scholar] [CrossRef]
  86. Navas, E.; Shamshiri, R.R.; Dworak, V.; Weltzien, C.; Fernandez, R. Soft gripper for small fruits harvesting and pick and place operations. Front. Robot. AI 2023, 10, 1330496. [Google Scholar] [CrossRef]
  87. Hua, W.; Zhang, W.; Zhang, Z.; Liu, X.; Huang, M.; Igathinathane, C.; Vougioukas, S.; Saha, C.K.; Mustafa, N.S.; Salama, D.S.; et al. Vacuum suction end-effector development for robotic harvesters of fresh market apples. Biosyst. Eng. 2025, 249, 28–40. [Google Scholar] [CrossRef]
  88. Li, Z.; Yuan, X.; Yang, Z. Design, simulation, and experiment for the end effector of a spherical fruit picking robot. Int. J. Adv. Robot. Syst. 2023, 20, 17298806231213442. [Google Scholar] [CrossRef]
  89. Zhao, Y.; Jin, Y.; Jian, Y.; Zhao, W.; Zhong, X. Kinematic design of new robot end-effectors for harvesting using deployable scissor mechanisms. Comput. Electron. Agric. 2024, 222, 109039. [Google Scholar] [CrossRef]
  90. Park, Y.; Seol, J.; Pak, J.; Jo, Y.; Jun, J.; Son, H.I. A novel end-effector for a fruit and vegetable harvesting robot: Mechanism and field experiment. Precis. Agric. 2022, 24, 948–970. [Google Scholar] [CrossRef]
  91. Zhang, T.; Huang, Z.; You, W.; Lin, J.; Tang, X.; Huang, H. An Autonomous Fruit and Vegetable Harvester with a Low-Cost Gripper Using a 3D Sensor. Sensors 2019, 20, 93. [Google Scholar] [CrossRef]
  92. Wang, X.; Kang, H.; Zhou, H.; Au, W.; Wang, M.Y.; Chen, C. Development and evaluation of a robust soft robotic gripper for apple harvesting. Comput. Electron. Agric. 2023, 204, 107552. [Google Scholar] [CrossRef]
  93. Kang, H.; Wang, X.; Chen, C. Accurate fruit localisation using high resolution LiDAR-camera fusion and instance segmentation. Comput. Electron. Agric. 2022, 203, 107450. [Google Scholar] [CrossRef]
  94. Liu, T.; Kang, H.; Chen, C. ORB-Livox: A real-time dynamic system for fruit detection and localization. Comput. Electron. Agric. 2023, 209, 107834. [Google Scholar] [CrossRef]
  95. Gené-Mola, J.; Gregorio, E.; Guevara, J.; Auat, F.; Sanz-Cortiella, R.; Escolà, A.; Llorens, J.; Morros, J.-R.; Ruiz-Hidalgo, J.; Vilaplana, V.; et al. Fruit detection in an apple orchard using a mobile terrestrial laser scanner. Biosyst. Eng. 2019, 187, 171–184. [Google Scholar] [CrossRef]
  96. Abeyrathna, R.; Nakaguchi, V.M.; Liu, Z.; Sampurno, R.M.; Ahamed, T. 3D Camera and Single-Point Laser Sensor Integration for Apple Localization in Spindle-Type Orchard Systems. Sensors 2024, 24, 3753. [Google Scholar] [CrossRef]
  97. Gan, H.; Lee, W.S.; Alchanatis, V.; Ehsani, R.; Schueller, J.K. Immature green citrus fruit detection using color and thermal images. Comput. Electron. Agric. 2018, 152, 117–125. [Google Scholar] [CrossRef]
  98. Liu, Y.; Wei, C.; Yoon, S.C.; Ni, X.; Wang, W.; Liu, Y.; Wang, D.; Wang, X.; Guo, X. Development of Multimodal Fusion Technology for Tomato Maturity Assessment. Sensors 2024, 24, 2467. [Google Scholar] [CrossRef]
  99. Kaukab, S.; Komal; Ghodki, B.M.; Ray, H.; Kalnar, Y.B.; Narsaiah, K.; Brar, J.S. Improving real-time apple fruit detection: Multi-modal data and depth fusion with non-targeted background removal. Ecol. Inform. 2024, 82, 102691. [Google Scholar] [CrossRef]
  100. Chakraborty, D.; Deka, B. Deep Learning-Based Selective Feature Fusion for Litchi Fruit Detection Using Multimodal UAV Sensor Measurements. IEEE Trans. Artif. Intell. 2025, 6, 1932–1942. [Google Scholar] [CrossRef]
  101. Zhang, X.; Yao, M.; Cheng, Q.; Liang, G.; Fan, F. A novel hand-eye calibration method of picking robot based on TOF camera. Front. Plant Sci. 2022, 13, 1099033. [Google Scholar] [CrossRef]
  102. Jiang, S.; Qi, P.; Han, L.; Liu, L.; Li, Y.; Huang, Z.; Liu, Y.; He, X. Navigation system for orchard spraying robot based on 3D LiDAR SLAM with NDT_ICP point cloud registration. Comput. Electron. Agric. 2024, 220, 108870. [Google Scholar] [CrossRef]
  103. Mao, W.; Liu, H.; Hao, W.; Yang, F.; Liu, Z. Development of a Combined Orchard Harvesting Robot Navigation System. Remote Sens. 2022, 14, 675. [Google Scholar] [CrossRef]
  104. Ma, F.; Yao, H.; Du, M.; Ji, P.; Si, X. Distributed Averaging Problems of Agriculture Picking Multi-Robot Systems via Sampled Control. Front. Plant Sci. 2022, 13, 898183. [Google Scholar] [CrossRef]
  105. Shamshiri, R.R.; Navas, E.; Dworak, V.; Auat Cheein, F.A.; Weltzien, C. A modular sensing system with CANBUS communication for assisted navigation of an agricultural mobile robot. Comput. Electron. Agric. 2024, 223, 109112. [Google Scholar] [CrossRef]
  106. Zahidi, U.A.; Khan, A.; Zhivkov, T.; Dichtl, J.; Li, D.; Parsa, S.; Hanheide, M.; Cielniak, G.; Sklar, E.I.; Pearson, S.; et al. Optimising robotic operation speed with edge computing over 5G networks: Insights from selective harvesting robots. J. Field Robot. 2024, 41, 2771–2789. [Google Scholar] [CrossRef]
  107. Cruz, M.; Mafra, S.; Teixeira, E.; Figueiredo, F. Smart Strawberry Farming Using Edge Computing and IoT. Sensors 2022, 22, 5866. [Google Scholar] [CrossRef] [PubMed]
  108. Xie, F.; Li, T.; Feng, Q.; Zhao, H.; Chen, L.; Zhao, C. Boosting Cost-Efficiency in Robotics: A Distributed Computing Approach for Harvesting Robots. J. Field Robot. 2024, 42, 1633–1648. [Google Scholar] [CrossRef]
  109. Smith, A. Kent’s growers given lesson in robotic fruit picking from Dogtooth Technologies near Cambridge. Kent Online, 2024. Available online: https://www.kentonline.co.uk/kent/news/the-robots-that-could-help-kent-s-fruit-picking-problems-312244 (accessed on 22 January 2026).
  110. Mooney, M. Robotic Picker Deployed at Kent Fruit Farm. Available online: https://www.roboticsandautomationmagazine.co.uk/news/agriculture/robotic-picker-deployed-at-kent-fruit-farm.html (accessed on 22 January 2026).
  111. Dogtooth Technologies. Robotic Strawberry Harvesting on Your Farm: Gen-5 Robot Specification Sheet (Version 0.1). Dogtooth Technologies, 2025. Available online: https://dogtooth.tech/wp-content/uploads/2025/04/Gen-5-Specification-Sheet-0.1.pdf (accessed on 22 January 2026).
  112. Agrobot. E-Series Robotic Harvesters for Strawberry Picking. Agrobot. Available online: https://www.agrobot.com/e-series (accessed on 22 January 2026).
  113. Ripe Robotics. Meet Eve: Our Commercial Prototype. Ripe Robotics. Available online: https://www.riperobotics.com/ (accessed on 22 January 2026).
  114. Tevel Aerobotics Technologies. Technology: Flying Autonomous Robots™ for Tree Fruit Harvesting. Tevel Aerobotics Technologies. Available online: https://www.tevel-tech.com/technology/ (accessed on 22 January 2026).
  115. EasySmart (ZWinSoft). Product Introduction of EasySmart Fruit-Picking Robot. EasySmart. Available online: https://ep.zwinsoft.com/%E6%99%BA%E6%98%93%E6%97%B6%E4%BB%A3%E6%B0%B4%E6%9E%9C%E9%87%87%E6%91%98%E6%9C%BA%E5%99%A8%E4%BA%BA/ (accessed on 22 January 2026).
Figure 1. PRISMA flow diagram of the literature search and study selection process.
Figure 2. (a) RPRR-configuration manipulator; (b) Four-degree-of-freedom manipulator. (c) Denavit–Hartenberg (D–H) coordinate frame assignment diagram. Adapted from Saoud et al. [78], licensed under CC BY-SA 4.0.
Figure 3. Representative clamping end-effectors: (a) three-finger rotary adaptive gripper; (b) FinRay-effect soft gripper with slip detection; (c) pneumatic clamping gripper.
Figure 4. Suction-Type End-Effector.
Figure 5. Rotational Cutting End-Effector: 1—Clamping mechanism; 2—Cutting mechanism; 3—Flipping mechanism.
Figure 6. Enclosing–cutting principle: (A) limit-distribution diagram for the side-by-side arrangement of fruits in the picking bin; (B) structural diagram of the curved guide rod; (C) schematic diagram of cutting off all fruit stalks; (D) schematic diagram of the curved guide rod delivering fruit 1; (E) schematic diagram of the fruit-picking state. Numbers 1–4 label the four individual kiwifruits in the picking bin. Adapted from Fu et al. [80], licensed under CC BY.
Figure 7. Cutting–Suction–Transport Integrated End-Effector. 1—Supporting mechanism; 2—Suction mechanism; 3—Transport mechanism.
Figure 8. Support–Cutting Integrated End-Effector. 1—Top plate; 2—Cutting blade; 3—Middle plate; 4—Bottom plate; 5—Axle.
Figure 9. Suction–Support Hybrid End-Effector. 1—Soft robotic fingers; 2—Suction cup; 3—Pneumatic cylinder.
Figure 10. ORB–Livox Fusion System Workflow. Reproduced from Liu et al. [94], licensed under CC BY-NC-ND 4.0.
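As a minimal, generic illustration of the ORB front end in the ORB–Livox fusion workflow of Figure 10, the sketch below detects and matches ORB features between two synthetic frames using standard OpenCV calls (cv2.ORB_create, cv2.BFMatcher). It is not the fusion pipeline of Liu et al. [94]; in the full system, such feature tracks would be combined with Livox LiDAR measurements.

```python
import cv2
import numpy as np

# Synthetic textured frame plus a translated copy standing in for the next
# frame of an orchard image sequence (purely illustrative input data).
rng = np.random.default_rng(0)
frame1 = rng.integers(0, 256, size=(240, 320), dtype=np.uint8)
cv2.circle(frame1, (80, 120), 30, 255, -1)
cv2.rectangle(frame1, (180, 60), (250, 140), 0, -1)
M = np.float32([[1, 0, 12], [0, 1, 7]])              # simulated camera motion
frame2 = cv2.warpAffine(frame1, M, (320, 240))

orb = cv2.ORB_create(nfeatures=500)
kp1, des1 = orb.detectAndCompute(frame1, None)
kp2, des2 = orb.detectAndCompute(frame2, None)

# Brute-force Hamming matching, the usual choice for binary ORB descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(kp1)}/{len(kp2)} keypoints, {len(matches)} cross-checked matches")
```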
Figure 11. Multimodal fusion fully connected network structure. Note: The three colored circles denote the three maturity classes (immature, semi-mature, and mature); colors are for visual distinction only. Reproduced from Liu et al. [98], licensed under CC BY 4.0.
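Figure 11 describes a fully connected network that fuses multimodal features to classify fruit into three maturity classes. The PyTorch sketch below shows one generic way such a fusion head can be wired (concatenating per-modality feature vectors before a small MLP classifier); the input and hidden dimensions are arbitrary assumptions, and this is not the exact architecture of Liu et al. [98].

```python
import torch
import torch.nn as nn

class MaturityFusionNet(nn.Module):
    """Toy fusion head: concatenate two modality features, then classify into
    three maturity classes (immature, semi-mature, mature)."""

    def __init__(self, dim_visual=128, dim_aux=16, hidden=64, n_classes=3):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim_visual + dim_aux, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, feat_visual, feat_aux):
        fused = torch.cat([feat_visual, feat_aux], dim=-1)   # simple early fusion
        return self.mlp(fused)                               # class logits

model = MaturityFusionNet()
logits = model(torch.randn(4, 128), torch.randn(4, 16))      # a batch of 4 fruits
print(logits.shape, logits.argmax(dim=-1))
```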
Figure 12. Hand-eye calibration setup and the designed calibration board. Adapted from Zhang et al. [101], licensed under CC BY 4.0.
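Hand-eye calibration of the kind illustrated in Figure 12 estimates the fixed camera-to-gripper transform from paired robot and calibration-board poses. The sketch below is a self-contained, noise-free illustration for an eye-in-hand setup: it generates synthetic poses that satisfy the calibration chain and recovers the transform with OpenCV's cv2.calibrateHandEye. It is not the board design or method of Zhang et al. [101], and every pose value is made up.

```python
import numpy as np
import cv2

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def make_T(R, t):
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

# Ground-truth camera-to-gripper transform (unknown in practice, recovered below).
X_true = make_T(rot_z(0.10) @ rot_x(0.05), [0.03, -0.02, 0.10])
# Calibration board fixed in the robot base frame.
T_board2base = make_T(rot_x(0.30), [0.50, 0.10, 0.20])

R_g2b, t_g2b, R_b2c, t_b2c = [], [], [], []
for i in range(6):
    # Synthetic gripper poses (in the base frame) visited during calibration.
    T_g2b = make_T(rot_z(0.4 * i) @ rot_x(0.25 * i), [0.30 + 0.05 * i, 0.10 * i, 0.40])
    # Board pose seen by the camera, consistent with the eye-in-hand chain
    # T_board2base = T_gripper2base @ X @ T_board2cam.
    T_b2c = np.linalg.inv(X_true) @ np.linalg.inv(T_g2b) @ T_board2base
    R_g2b.append(T_g2b[:3, :3]); t_g2b.append(T_g2b[:3, 3].reshape(3, 1))
    R_b2c.append(T_b2c[:3, :3]); t_b2c.append(T_b2c[:3, 3].reshape(3, 1))

R_est, t_est = cv2.calibrateHandEye(R_g2b, t_g2b, R_b2c, t_b2c)
print("Translation error (m):", np.linalg.norm(t_est.ravel() - X_true[:3, 3]))
```

In a real deployment, the gripper poses come from the robot controller and the board poses from detections of the calibration pattern (e.g., via cv2.solvePnP).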
Figure 13. Master–Slave Distributed Navigation System. Reproduced from Mao et al. [103], licensed under CC BY 4.0.
Figure 14. E5SH System Workflow. Reproduced from Zahidi et al. [106], licensed under CC BY 4.0.
Figure 15. Strawberry-picking robot developed by Dogtooth Technologies.
Figure 16. Agrobot E-Series robot.
Figure 17. The Eve robotic harvester.
Figure 18. The FAR (Flying Autonomous Robots) harvesting system.
Figure 19. EasySmart fruit-harvesting robot.
Figure 20. Annual Publication Trend.
Figure 21. Total Number of Publications by Country.
Figure 22. Major Contributing Countries in the Research Field.
Figure 23. Keyword co-occurrence network generated from publications between 2000 and 2025.
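A keyword co-occurrence network such as the one in Figure 23 is built by counting how often pairs of author keywords appear together in the same publication and drawing an edge whose weight equals that count. The snippet below sketches only the counting step on a few hypothetical keyword lists; the network in Figure 23 is derived from the full 2000–2025 corpus rather than this toy data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author-keyword lists for a handful of papers (illustration only).
papers = [
    ["fruit-picking robot", "deep learning", "YOLO"],
    ["fruit-picking robot", "path planning", "obstacle avoidance"],
    ["deep learning", "YOLO", "occlusion"],
    ["fruit-picking robot", "end-effector", "soft gripper"],
]

cooccurrence = Counter()
for keywords in papers:
    for a, b in combinations(sorted(set(keywords)), 2):   # unordered keyword pairs
        cooccurrence[(a, b)] += 1

for (a, b), weight in cooccurrence.most_common(5):
    print(f"{a} <-> {b}: {weight}")
```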
Table 1. Main Recognition Algorithms and Their Performance.

| Target Fruit | Baseline Algorithm | Improvement Points | Performance Improvement | Reference |
|---|---|---|---|---|
| Apple | YOLOv8n | ShuffleNetV2 + Ghost backbone + WIoU loss + SE module | Accuracy 94.1%, mAP 91.4%; model size only 2.6 MB | [22] |
| Apple | YOLOv3 | Preprocessing module; boundary equalization; activation enhancement | Recall 90.8%, detection speed 19 ms | [34] |
| Tomato | YOLOv3 | DenseNet feature fusion + SPP spatial pyramid + Mish activation | AP 95.0%, detection speed 52 ms | [35] |
| Cherry/Tomato | YOLOv5m | BoTNet backbone + Transformer-MHSA multi-head self-attention | mAP 94%, TPR 94–96% | [36] |
| Strawberries | YOLOv8 | ConvNeXt-V2 backbone + ECA attention + SIoU loss | Accuracy 82.4% (+8.4%), mAP 92.8% | [37] |
| Blueberries | YOLOv8 | MPCA multi-perspective attention + OREPA parameter optimization | Mitigated the small-fruit overlap problem | [38] |
| Multiple fruit types | YOLOv8l | Multi-fruit mixed dataset + CSP module + C2f structure | Recall 96% | [39] |
| Tomato | YOLOv8n | EfficientViT backbone + C2f-Faster + SIoU + auxiliary head | mAP 93.9%, accuracy 91.6% | [40] |
| Cucumber | YOLOv5s | Cr-color-channel training + ReliefF feature weighting | mAP 85.2% | [41] |
| Citrus | YOLOv5 | RFCF perceptual weighting + FLA hierarchical attention + K-means + feature enhancement | mAP +0.6%, detection speed 1.26 ms/frame | [42] |
| Cantaloupe | YOLOv8n | PCConv partial convolution + EMA multi-scale attention + IoU + IoU weighting | mAP +1.4%, FPS +42.9% | [43] |
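Most accuracy figures in Table 1 (AP, mAP) rest on the intersection over union (IoU) between predicted and ground-truth boxes, and several of the listed improvements (WIoU, SIoU) modify IoU-based losses. The snippet below computes plain IoU for axis-aligned boxes in (x1, y1, x2, y2) format; it is a generic illustration with made-up coordinates, not code from any of the cited studies.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A predicted apple box versus its ground-truth annotation (pixel coordinates).
pred, gt = (48, 60, 112, 130), (50, 55, 110, 125)
print(f"IoU = {iou(pred, gt):.3f}")   # a detection usually counts as correct if IoU >= 0.5
```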
Table 2. Fruit Peduncle Recognition Literature.

| Target | Research Task | Algorithm Model | Innovation | Reference |
|---|---|---|---|---|
| Pepper | Fruit–peduncle joint detection | Mask R-CNN | Achieves high-precision extraction of fruit–peduncle regions through multi-scale feature fusion, particularly under complex illumination | [49] |
| Pear | Fruit recognition and peduncle localization | DeepLabv3+ | Uses a MobileNet-based lightweight backbone and attention mechanisms to achieve accurate peduncle segmentation and recognition | [50] |
| Tomato | Cutting-point detection | YOLOv8n-DDA-SAM + DOPE | Combines 3D geometric constraints with image features to enable autonomous cutting-point detection | [51] |
| Loquat | Peduncle detection | YOLO-PP (lightweight variant) | Incorporates transformer modules to enhance fine-grained peduncle feature extraction, enabling accurate picking-point localization | [52] |
| Tomato | Ripeness + peduncle joint detection | YOLO-TMPPD | Integrates ripeness estimation with peduncle localization, improving multi-task perception capability | [53] |
| Citrus | Picking-point localization | Two-stage CPPL algorithm | Achieves accurate peduncle localization by decoupling the segmentation and regression tasks | [54] |
| Tomato | Fruit–peduncle joint detection + robustness optimization | Vision-based deep-learning framework | Enhances robustness of robotic picking in complex orchards through spatial structural constraints | [55] |
| Cherry tomato | Multi-view peduncle detection | StarBL-YOLO + RGB-D | Utilizes RGB-D fusion for improved peduncle detection accuracy and robust picking-point estimation | [56] |
Table 3. Comparison of Orchard Path-Planning Algorithms and Their Performance.

| Baseline Algorithm | Improvement Strategy | Performance Metrics | Reference |
|---|---|---|---|
| A* | Introduction of environmental constraints; removal of redundant nodes | Increased global planning efficiency | [58] |
| Dijkstra | Cluster fusion + adaptive optimization | Reduced path oscillation and improved smoothness | [60] |
| RRT | Double-tree expansion + heuristic optimization | Path length −8.5%; turning smoothness +21.7% | [59] |
| ACO | Enhanced pheromone updating; optimized search neighborhoods | Path length −6.2%; improved convergence stability | [61] |
| ACO | Multi-source pheromone optimization + angle-factor enhancement | Energy consumption −30%; flight time −46–59% | [62] |
| A*-PSO | Environmental constraint embedding + dual-strategy search | Path smoothness +6.9% | [63] |
Table 4. Comparative Evaluation of Path-Planning Algorithms.

| Evaluation Dimension | A* | Dijkstra | RRT | ACO | A*-PSO |
|---|---|---|---|---|---|
| Path optimality | 85% | 100% | 68% | 92% | 89% |
| Computation time (100 m path) | 0.3 s | 2.1 s | 0.8 s | 5.2 s | 1.7 s |
| Memory usage | Medium | High | Low | Medium | High |
| Dynamic obstacle-avoidance ability | Weak | Weak | Strong | Medium | Medium |
| Parameter sensitivity | Low | Low | High | High | Medium |
| Multi-objective optimization | Not supported | Not supported | Not supported | Supported | Supported |
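As a concrete reference point for the planners compared in Tables 3 and 4, the sketch below implements plain A* on a small 4-connected occupancy grid with a Manhattan-distance heuristic. It is the textbook baseline only; the cited works add orchard-specific constraints, redundant-node removal, and path smoothing on top of such a core.

```python
import heapq

def astar(grid, start, goal):
    """Plain A* on a 4-connected occupancy grid (1 = obstacle), Manhattan heuristic."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), 0, start, None)]
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, node, parent = heapq.heappop(open_set)
        if node in came_from:               # already expanded with a better cost
            continue
        came_from[node] = parent
        if node == goal:                    # walk back through parents
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    heapq.heappush(open_set, (ng + h((nr, nc)), ng, (nr, nc), node))
    return None

# Toy 5 x 6 map of an orchard block: 1 = obstacle cell, 0 = free.
grid = [[0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 0, 0, 0, 1, 0],
        [1, 1, 1, 0, 1, 0],
        [0, 0, 0, 0, 0, 0]]
print(astar(grid, (0, 0), (4, 5)))
```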
Table 5. Research on Dynamic Recognition of Flexible Obstacles.

| Target | Research Task | Algorithm Model | Innovation | Reference |
|---|---|---|---|---|
| Apple | Branch topology and 3D environment reconstruction | Improved Informed-RRT* + human posture estimation | Constructing branch topology; realizing dynamic 3D spatial constraint reconstruction | [72] |
| Apple | Dynamic obstacle detection and branch motion prediction | Improved swarm-intelligence algorithm + B-spline fitting | Fusion of posture features and complete branch geometry; enhanced prediction accuracy | [73] |
| Multi-fruit | Real-time 2D–3D flexible obstacle perception | Lightweight 3D neural networks | Dynamic segmentation of branches and leaves; improved environmental adaptability | [74] |
| Tomato | Branch vibration modeling and trajectory deviation recognition | Fully convolutional network (FCN) | Fine-grained extraction of branch–fruit topological relations | [75] |
| Tomato | Dry-branch occlusion prediction and collision-risk modeling | Multi-layer Informed-RRT* + OFN | Multi-scale feature extraction of leaf–branch structure; improved recognition of collision-risk regions | [76] |
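Table 5 lists B-spline fitting of branch geometry as part of dynamic obstacle prediction [73]. As a generic illustration (not the cited authors' implementation), the sketch below fits a smoothing cubic B-spline to noisy 3D points sampled along a hypothetical branch using SciPy's splprep/splev; the point set, noise level, and smoothing factor are arbitrary.

```python
import numpy as np
from scipy.interpolate import splprep, splev

# Noisy 3D points sampled along a hypothetical curved branch (metres).
t = np.linspace(0.0, 1.0, 25)
rng = np.random.default_rng(1)
x = 0.8 * t + rng.normal(scale=0.005, size=t.size)
y = 0.3 * np.sin(2.5 * t) + rng.normal(scale=0.005, size=t.size)
z = 1.2 + 0.4 * t + rng.normal(scale=0.005, size=t.size)

# Fit a parametric cubic B-spline with light smoothing, then resample densely
# to obtain a smooth centreline that a planner can treat as a flexible obstacle.
tck, u = splprep([x, y, z], s=5e-3, k=3)
x_s, y_s, z_s = splev(np.linspace(0.0, 1.0, 200), tck)
print("fitted start:", round(x_s[0], 3), round(y_s[0], 3), round(z_s[0], 3))
print("fitted end:  ", round(x_s[-1], 3), round(y_s[-1], 3), round(z_s[-1], 3))
```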
Table 6. Characteristics and Limitations of Different Types of End-Effectors.

| Type | Features | Representative Performance | Limitations | Reference |
|---|---|---|---|---|
| Clamping | Three-finger gear-synchronous rotation with adaptive force control | Success rate 93%, damage < 5% | Complex structure, high manufacturing cost | [84] |
| Clamping | TPU soft material with slip-detection feedback | Success rate 80%, zero damage | Material ages easily, difficult to clean | [85] |
| Clamping | Grasps fruits of different diameters; lightweight design | Success rate 87%, 5–10 s per fruit | Poor anti-interference ability; high air-pressure requirement | [86] |
| Suction | Vacuum suction cup; rotary retraction separation with intelligent position adjustment | Success rate 81.7%, energy consumption −15% | Stability depends on fruit-surface smoothness | [87] |
| Cutting | Clustered rotary cutting tools for efficient cluster harvesting | Success rate 88%, cycle time 12 s | Applicable fruit shapes are limited | [80] |
| Cutting | Micro-motor-driven multi-edge blade, suitable for spherical fruit | Success rate 90%, damage < 8% | High control-precision requirement | [88] |
| Cutting | Modular design supporting simultaneous multi-fruit cutting | Success rate 85%, cycle time 10 s | Strong control coupling; complex system | [89] |
| Hybrid | Single-system control with a short picking cycle | Picking cycle 15.5 s | Complex system commissioning | [90] |
| Hybrid | Three-plate co-driven, damage-prevention structure | Damage rate 3%, cycle time 8 s | Strong dependence on accurate pose recognition | [91] |
| Hybrid | Multi-modal perception for improved generality | Success rate 90%, damage < 6% | Complex control algorithm | [92] |
Table 7. Performance Scores of Different Types of End-Effectors.

| Type | Damage Rate (0.25) | Success Rate (0.3) | System Complexity (0.15) | Applicability (0.2) | Maintenance Cost (0.1) | Total Score |
|---|---|---|---|---|---|---|
| Hybrid | 3 | 4 | 1 | 4 | 2 | 3.10 |
| Clamping | 2 | 4 | 4 | 4 | 4 | 3.50 |
| Suction | 4 | 3 | 3 | 2 | 2 | 2.95 |
| Cutting | 3 | 2 | 2 | 2 | 3 | 2.35 |
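The total scores in Table 7 are weighted sums of the five criterion scores, with the weights given in parentheses in the column headers. The snippet below reproduces that arithmetic directly from the table, so readers can re-weight the criteria for their own crop or deployment scenario.

```python
# Criterion weights taken from the Table 7 column headers.
weights = [0.25, 0.30, 0.15, 0.20, 0.10]   # damage, success, complexity, applicability, maintenance

# Criterion scores per end-effector type, in the same order as the weights.
scores = {
    "Hybrid":   [3, 4, 1, 4, 2],
    "Clamping": [2, 4, 4, 4, 4],
    "Suction":  [4, 3, 3, 2, 2],
    "Cutting":  [3, 2, 2, 2, 3],
}

for eff_type, vals in scores.items():
    total = sum(w * v for w, v in zip(weights, vals))
    print(f"{eff_type:9s} total score = {total:.2f}")
# Hybrid 3.10, Clamping 3.50, Suction 2.95, Cutting 2.35, matching Table 7.
```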
Table 8. Representative fruit-harvesting robotic systems (condensed comparison).

| Company | Target Crop | End-Effector | Metrics (Type) | Commercialization Info | Limitations | References |
|---|---|---|---|---|---|---|
| Dogtooth Technologies | Strawberry (tabletop, greenhouse) | Integrated cutting, transfer, grading & packing | Throughput; extraction rate; waste rate; endurance | Gen-5 robots deployed on commercial farms | Structured cultivation & infrastructure dependent | [111] |
| Agrobot | Strawberry | Multi-arm stem grip-and-cut | Parallel harvesting capacity | Pre-commercial pilot systems | High mechanical & control complexity | [112] |
| Ripe Robotics | Apples, plums, peaches, nectarines | Vision-guided grasping/suction | Multi-fruit adaptability | Prototype systems tested in orchards | Strong dependence on orchard training systems | [113] |
| Tevel Aerobotics Technologies | Tree fruits (e.g., apples) | UAV-mounted suction picking | Fruit size range adaptability | Early-stage field demonstrations | Sensitive to wind & canopy occlusion | [114] |
| EasySmart | Strawberry (facility agriculture) | Bionic flexible finger gripper | Continuous operation capability | Product prototypes introduced | Metrics mainly promotional; limited field validation | [115] |