Robotic Fruit Harvesting Systems: Integration of Perception, Manipulation, and Detachment for Autonomous Harvesting

Ghonimy, Mohamed; Abdel-Baky, Nagdy F.

doi:10.3390/agronomy16121127

Open Access Review

by Mohamed Ghonimy ¹,* and Nagdy F. Abdel-Baky ²

¹ Department of Agricultural and Biosystems Engineering, College of Agriculture and Food, Qassim University, Buraydah 51452, Saudi Arabia

² Department of Plant Protection, College of Agriculture and Food, Qassim University, Buraydah 51452, Saudi Arabia

* Author to whom correspondence should be addressed.

Agronomy 2026, 16(12), 1127; https://doi.org/10.3390/agronomy16121127

Submission received: 20 April 2026 / Revised: 24 May 2026 / Accepted: 2 June 2026 / Published: 8 June 2026

(This article belongs to the Special Issue Robotics for Agricultural Production)

Abstract

This review provides a comprehensive synthesis of robotic fruit harvesting systems, with a particular focus on the system-level integration of perception, manipulation, and fruit detachment within autonomous harvesting environments. Recent advances in machine vision, deep learning, sensor fusion, robotic end-effectors, grasping strategies, and motion planning are critically analyzed alongside cutting, pulling, and vibration-based detachment mechanisms under unstructured orchard conditions. Beyond component-level analysis, this review emphasizes the critical role of perception–action coupling and highlights key system integration challenges, including localization errors, perception-to-action latency, and environmental variability, which continue to limit reliable field deployment. In addition, orchard and pre-harvest-related factors such as canopy structure, fruit distribution, and detachment force variability are examined in relation to their direct impact on system performance, robustness, and harvesting efficiency. Furthermore, the review extends toward system-level considerations by incorporating performance evaluation metrics, economic feasibility, and scalability constraints, which are essential for transitioning robotic harvesting systems from experimental prototypes to commercially viable solutions, including practical field deployment in distributed and multi-robot harvesting systems. Emerging technologies, including artificial intelligence, advanced sensing, digital agriculture, and energy-aware system design, are discussed as key enablers for achieving adaptive, data-driven, and scalable autonomous harvesting. The novelty of this work lies in proposing an integrated framework that explicitly links perception, manipulation, and detachment with orchard-level constraints and deployment requirements, thereby bridging the gap between algorithmic advancements and real-world implementation of autonomous fruit harvesting systems.

Keywords: robotics; autonomous harvesting; advanced perception; fruit detachment; artificial intelligence; precision agriculture

1. Introduction

The global fruit production sector is undergoing a profound transformation driven by increasing demand for high-quality produce, rising labor costs, workforce shortages, and the growing need for sustainable and resource-efficient agricultural practices. Among all stages of fruit production, harvesting remains one of the most labor-intensive, time-consuming, and economically critical operations, particularly for high-value crops that require selective picking based on ripeness, size, and quality. The inherent variability in fruit morphology, maturity levels, and spatial distribution within orchard environments further complicates the harvesting process, making it difficult to standardize and automate using conventional approaches [1,2]. Traditionally, fruit harvesting has relied heavily on manual labor due to the dexterity, adaptability, and decision-making capabilities of human workers. However, increasing labor scarcity, rising wages, and seasonal workforce instability have created substantial economic pressure on growers worldwide. These challenges have accelerated research efforts toward robotic and automated harvesting systems as viable alternatives to manual operations, positioning robotic harvesting as a key enabling technology within precision and digital agriculture [3,4].

Robotic harvesting systems aim to replicate—and potentially surpass—human capabilities through the integration of sensing, perception, planning, and actuation within autonomous platforms. A typical system comprises perception modules for fruit detection and localization, robotic manipulators for motion execution, and end-effectors for grasping and detachment [5,6]. Despite substantial technological progress, the deployment of such systems in real orchard environments remains limited, primarily due to the unstructured and dynamic nature of agricultural settings. Variations in illumination, occlusions caused by foliage, and irregular fruit arrangements continue to affect system reliability and operational consistency [7,8]. A fundamental challenge in robotic fruit harvesting lies in the effective integration of three core functional domains: perception, manipulation, and detachment. Perception systems, largely driven by advances in computer vision and deep learning, have demonstrated significant improvements in detecting and localizing fruits under complex field conditions, extracting critical information such as position, size, orientation, and ripeness level [2,6]. However, accurate perception alone does not guarantee successful harvesting outcomes. Robotic manipulation introduces additional complexity, requiring the system to navigate cluttered environments, avoid obstacles, and execute precise movements for stable and non-destructive grasping. The design of manipulators and end-effectors must accommodate fruit variability and fragility, balancing adaptability, control precision, and mechanical compliance [5,9]. In parallel, motion planning and control strategies must operate in real time to ensure efficient and collision-free interaction within dense canopy structures. 
Recent advances in automated control strategies for robotic manipulators in fruit harvesting systems have emphasized the use of closed-loop architectures that integrate visual feedback, force sensing, and adaptive control laws. These approaches enable continuous correction of manipulator motion based on real-time sensory measurements, improving handling accuracy and stability under dynamic orchard conditions. The detachment process represents a technically demanding yet comparatively under-integrated component of robotic harvesting systems. It involves applying mechanical forces—such as cutting, pulling, twisting, or dynamic excitation—to separate the fruit from the plant. The effectiveness of these methods depends strongly on the mechanical properties of the fruit–peduncle system, which vary across species, cultivars, and maturity stages. Inadequate detachment strategies may lead to incomplete harvesting, excessive force application, or damage to both fruit and plant, thereby reducing overall efficiency and product quality [3,10]. Recent research has reported substantial progress across these individual domains. Deep learning-based perception models have achieved high accuracy in fruit detection and segmentation under challenging conditions [4,11]. Advances in soft robotics and adaptive end-effectors have improved the handling of delicate fruits while minimizing damage [9,10]. Furthermore, integrated robotic platforms, including multi-arm and dual-arm systems, have demonstrated enhanced operational efficiency through coordinated task execution and parallel manipulation strategies [12,13]. Despite these advancements, most existing studies address perception, manipulation, and detachment as largely independent domains, often overlooking their strong interdependencies. Errors in perception can propagate into manipulation failures, while inappropriate detachment strategies may negate the benefits of accurate grasping. 
This fragmented approach limits system robustness, scalability, and real-world applicability. Consequently, the absence of a unified system-level framework that integrates perception, manipulation, and detachment represents a significant research gap. Existing review studies typically focus on isolated components, such as vision systems, mechanical design, or agronomic factors, without adequately examining the operational interactions among sensing, robotic decision-making, manipulation strategies, and fruit detachment under practical orchard conditions [14,15]. Previous reviews have generally treated perception, grasping, and detachment as independent technical modules rather than interconnected processes that collectively influence harvesting efficiency, operational reliability, and system scalability in autonomous field applications. In addition, limited attention has been given to perception–action coupling, real-time operational constraints, and the influence of orchard variability on harvesting performance. Therefore, this review aims to provide an integrative analysis of robotic fruit harvesting systems by linking perception technologies, robotic manipulation, and fruit detachment within a unified operational framework. The review further analyzes field deployment constraints, economic feasibility, and scalability factors that influence the practical implementation of autonomous harvesting systems.

2. Literature Review Approach

This review adopts a narrative approach to synthesize and critically analyze recent developments in robotic fruit harvesting systems, with a particular emphasis on the system-level integration of perception, manipulation, and detachment processes. Rather than following a rigid systematic protocol, the review is intentionally designed to provide a structured yet flexible examination of the literature. This approach is particularly suitable for interdisciplinary domains such as agricultural robotics, where technological advancements span multiple fields, including computer vision, robotics, and agronomy, and where strict systematic filtering may overlook important conceptual and integrative contributions. Relevant studies were identified through major scientific databases, including Scopus, Web of Science, IEEE Xplore, and ScienceDirect. The literature search focused on combinations of keywords such as “robotic fruit harvesting,” “agricultural robots,” “fruit detection,” “robotic manipulation,” “end-effectors,” and “fruit detachment mechanisms.” To improve coverage and reduce selection bias, backward and forward citation tracking was also employed, allowing the inclusion of influential and widely cited works that may not be captured through keyword-based searches alone. The selection process prioritizes recent publications and high-impact contributions that have advanced the state of the art in robotic harvesting, particularly those demonstrating experimental validation, real-world applicability, or novel methodological insights. At the same time, earlier foundational studies are incorporated where necessary to establish the theoretical and technological context, ensuring continuity between foundational principles and current innovations. 
To facilitate a coherent and system-oriented analysis, the reviewed literature is organized into four main thematic categories: (i) perception systems for fruit detection and localization, (ii) robotic manipulation and end-effector design, (iii) fruit detachment mechanisms, and (iv) integrated harvesting systems that combine multiple functionalities. This classification is not only thematic but also reflects the functional decomposition of robotic harvesting pipelines, thereby supporting a system-level understanding of how these components interact and influence overall performance. It is important to note that this review does not aim to provide an exhaustive or statistically complete survey of all published studies. Instead, it focuses on representative approaches, emerging trends, and critical challenges that define the current landscape of robotic fruit harvesting. By emphasizing interactions between subsystems rather than isolated components, the review aims to generate analytical insights into system integration challenges and identify directions for future research toward robust and scalable autonomous harvesting solutions.

3. System Architecture of Robotic Fruit Harvesting

The architecture of robotic fruit harvesting systems provides the structural and functional foundation that determines the efficiency, reliability, and scalability of autonomous operations in orchard environments (Figure 1). Modern systems integrate multiple interconnected modules, including perception, localization, planning, manipulation, detachment, and post-harvest handling. The overall design determines the flow of information and actuation while influencing system performance under complex and dynamic orchard conditions [5,12,14].

A well-structured architecture enables accurate processing of sensory information to guide robotic manipulators during fruit grasping and detachment. Integration between perception, control, and actuation modules is essential to minimize latency, reduce operational errors, and maintain stable performance in unstructured orchard environments. However, current robotic harvesting systems still face limitations related to partial module integration, delayed sensory processing, and reduced robustness in cluttered canopies [7,9].

Overall, understanding the system-level architecture is essential for analyzing the harvesting workflow and identifying the major operational constraints affecting current robotic systems. Accordingly, this section focuses on the harvesting pipeline, core system components, operational workflows in orchards, and the main architectural limitations influencing harvesting performance.

3.1. Overall Harvesting Pipeline

The overall harvesting pipeline in robotic fruit harvesting systems represents a structured sequence of interconnected stages designed to optimize harvesting efficiency, precision, and fruit quality in orchard environments. As illustrated in Figure 1, the pipeline consists of six principal stages: Perception → Localization → Planning → Manipulation → Detachment → Handling [6,12].

Perception represents the initial stage, where sensing technologies and machine learning algorithms detect fruits within dense canopies. RGB-D cameras, LiDAR, and hyperspectral sensors provide information related to fruit position, size, orientation, and ripeness [2]. Subsequently, Localization determines the three-dimensional coordinates of fruits relative to the manipulator using stereo vision, SLAM, and depth-sensing techniques to support safe access under occlusion conditions [13]. Planning then uses perception and localization outputs to generate optimized motion trajectories while considering environmental constraints, branch interference, and fruit clustering [9]. Manipulation involves physical interaction between the robotic end-effector and the fruit, requiring adaptive and precise motion control to ensure stable grasping and minimize fruit damage [10]. During the Detachment stage, fruits are separated from the plant using cutting, twisting, pulling, or vibration-based techniques depending on crop biomechanical characteristics [1]. Finally, Handling includes transferring harvested fruits into collection units while minimizing post-harvest losses and preserving quality [12]. As shown in Figure 1, effective communication among sensing, planning, and actuation modules is essential for maintaining real-time operation and reducing errors during the transition from perception to harvesting actions [16].
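The six-stage sequence can be read as a chain in which each stage consumes the output of the previous one and a failure at any stage stops everything downstream. The following Python sketch illustrates only that structure. All stage functions, state keys, and numeric values are hypothetical placeholders and are not taken from any system covered in this review.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]
Stage = Callable[[State], Optional[State]]

# Each stage is a hypothetical placeholder. It returns an updated state,
# or None when it cannot complete (for example, no pickable fruit).
def perceive(state: State) -> Optional[State]:
    # Stand-in for a detector: one ripe fruit at pixel (320, 240), 0.8 m away.
    return {**state, "pixel": (320, 240), "depth_m": 0.8, "ripe": True}

def localize(state: State) -> Optional[State]:
    if not state.get("ripe"):
        return None  # nothing worth picking, so downstream stages never run
    return {**state, "target_m": (0.0, 0.0, state["depth_m"])}

def plan(state: State) -> Optional[State]:
    # Straight-line approach with one intermediate waypoint (illustrative only).
    x, y, z = state["target_m"]
    return {**state, "trajectory": [(x, y, z / 2), (x, y, z)]}

def manipulate(state: State) -> Optional[State]:
    return {**state, "grasped": True}

def detach(state: State) -> Optional[State]:
    return {**state, "detached": True} if state.get("grasped") else None

def handle(state: State) -> Optional[State]:
    return {**state, "stored": True}

PIPELINE: List[Tuple[str, Stage]] = [
    ("perception", perceive),
    ("localization", localize),
    ("planning", plan),
    ("manipulation", manipulate),
    ("detachment", detach),
    ("handling", handle),
]

def run_pipeline(stages: List[Tuple[str, Stage]],
                 state: Optional[State] = None) -> Tuple[State, List[str]]:
    """Run stages in order and stop at the first one that returns None."""
    state = dict(state or {})
    completed: List[str] = []
    for name, stage in stages:
        result = stage(state)
        if result is None:
            break
        state = result
        completed.append(name)
    return state, completed

final_state, completed = run_pipeline(PIPELINE)
print(completed)
```

Swapping the first stage for one that reports an unripe fruit makes the run stop after perception, which mirrors how an upstream result determines whether later stages execute at all.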

3.2. System Components

Robotic fruit harvesting systems rely on several interconnected components that collectively enable autonomous operation in orchard environments. As illustrated in Figure 1, the main components include sensing systems, robotic manipulators, end-effectors, and control units [17,18]. Sensors provide environmental perception through RGB-D cameras, LiDAR, and multispectral imaging systems that support fruit detection, ripeness evaluation, and obstacle recognition. Sensor fusion approaches improve detection robustness under variable illumination and occlusion conditions [19,20].

Robotic arms provide the dexterity required to access fruits located within dense canopy structures. Multi-degree-of-freedom manipulators enable flexible motion and accurate positioning, while lightweight and compliant designs improve maneuverability and reduce mechanical stress on plants [17]. End-effectors are responsible for fruit grasping and detachment. Different designs, including pneumatic grippers, adaptive claws, and suction devices, are selected according to fruit characteristics and harvesting strategies. Recent advances in soft robotic grippers have reduced fruit damage and improved harvesting efficiency [18,21]. Control units coordinate sensing, planning, manipulation, and detachment operations through real-time processing and feedback control. Advanced strategies such as model predictive control, reinforcement learning, and adaptive control improve operational robustness under dynamic orchard conditions [19,20].

3.3. Operational Workflow in Orchards

The operational workflow translates the harvesting pipeline into practical orchard operations through coordinated interaction among sensing, perception, planning, manipulation, and handling modules, as illustrated in Figure 1. The workflow begins with environmental sensing using LiDAR, RGB-D cameras, and multispectral imaging to generate three-dimensional orchard representations for canopy analysis, obstacle detection, and fruit localization [22]. Perception algorithms then perform fruit detection, segmentation, and ripeness evaluation under varying lighting and occlusion conditions [23]. Based on localization outputs, the control system generates real-time motion trajectories while considering manipulator kinematics and environmental constraints to minimize collision risks [24]. Robotic manipulators equipped with adaptive or soft end-effectors subsequently grasp fruits according to their mechanical properties, while force-feedback mechanisms support dynamic grip adjustment to reduce fruit damage [25]. Fruits are detached using crop-specific techniques such as cutting, twisting, or suction and transferred into collection systems with minimal postharvest losses [26]. Continuous sensing and adaptive control enable real-time adjustment to environmental variability, thereby improving harvesting reliability and operational stability [27]. Thus, the operational workflow demonstrates how sensing, perception, planning, manipulation, and detachment modules interact within an integrated harvesting framework to support efficient orchard operation.
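The force-feedback grip adjustment mentioned in this workflow can be sketched as a simple incremental loop: grip force is raised while slip is sensed and is never allowed to exceed a damage limit. This is a minimal illustration under stated assumptions. The function names, the 0.2 N step, the 5 N limit, and the way slip is simulated are all hypothetical and do not describe a specific gripper from the cited studies.

```python
def adjust_grip(force_n: float, slip_detected: bool,
                step_n: float = 0.2, max_force_n: float = 5.0) -> float:
    """One control step: raise grip force while slip is sensed, capped at a damage limit."""
    if slip_detected:
        return min(force_n + step_n, max_force_n)
    return force_n

def simulate_grasp(required_force_n: float, start_force_n: float = 1.0,
                   max_steps: int = 50):
    """Iterate the control step until the force stops changing.

    Slip is simulated as 'current force below the force needed to hold the fruit';
    on a real robot this would come from a tactile or force sensor.
    Returns (final force in newtons, True if the grasp is stable).
    """
    force = start_force_n
    for _ in range(max_steps):
        slipping = force < required_force_n
        new_force = adjust_grip(force, slipping)
        if new_force == force:  # no change: either stable, or capped at the limit
            return force, not slipping
        force = new_force
    return force, force >= required_force_n

firm_force, firm_ok = simulate_grasp(required_force_n=2.5)
heavy_force, heavy_ok = simulate_grasp(required_force_n=9.0)
print(firm_ok, heavy_ok)
```

In the second call the required holding force exceeds the assumed damage limit, so the loop saturates at the cap and reports an unstable grasp rather than squeezing harder.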

3.4. Bottlenecks in Current Architectures

Despite substantial technological progress, current robotic fruit harvesting architectures still face major integration and operational bottlenecks that limit scalability and field reliability. As illustrated in Figure 1, these limitations primarily originate from insufficient coordination among perception, planning, manipulation, and detachment modules.

One major challenge is the fragmented interaction between system components, where errors generated during perception and localization propagate to manipulation and detachment stages, reducing harvesting reliability [7,17]. Real-time operation is also constrained by computational latency associated with large sensory data streams, deep learning inference, and complex trajectory planning algorithms [24,28]. Another critical bottleneck lies in the transition from perception to action. Converting detection and localization outputs into precise manipulator commands remains difficult due to calibration inaccuracies, coordinate misalignment, and limited feedback control, which may result in grasping failures or fruit damage even when perception accuracy is high [23,25]. System robustness further deteriorates under unstructured orchard conditions characterized by variable illumination, canopy occlusion, terrain irregularities, and dynamic environmental changes [22,26]. Addressing these challenges requires tighter integration between perception, planning, manipulation, and detachment modules, together with low-latency control architectures and adaptive intelligent control frameworks capable of maintaining stable performance under highly variable orchard conditions. Accordingly, Section 10 focuses specifically on the system integration bottlenecks and closed-loop optimization strategies required to improve coordination, reduce latency, and enhance adaptive control performance in real-world harvesting environments.
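The sensitivity of the perception-to-action transition to calibration error can be illustrated numerically. The sketch below back-projects a detected pixel with a standard pinhole camera model and then maps it into the manipulator base frame with a single rotation and a translation. The camera intrinsics, mounting angle, and offsets are hypothetical values chosen for illustration, and a real hand-eye calibration would use a full three-dimensional rotation.

```python
import math

def back_project(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole model: pixel (u, v) with known depth -> 3D point in the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

def rotate_about_y(point, angle_rad):
    """Rotate a 3D point about the y axis."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x + s * z, y, -s * x + c * z)

def camera_to_base(point, mount_angle_rad, translation_m):
    """Camera frame -> manipulator base frame (one rotation plus a translation)."""
    rx, ry, rz = rotate_about_y(point, mount_angle_rad)
    tx, ty, tz = translation_m
    return (rx + tx, ry + ty, rz + tz)

# Fruit detected at the image centre, 0.8 m from the camera (hypothetical values).
fruit_cam = back_project(320, 240, 0.8, fx=600.0, fy=600.0, cx=320.0, cy=240.0)

offset_m = (0.1, 0.0, 0.5)
true_target = camera_to_base(fruit_cam, math.radians(30.0), offset_m)
# Same detection, but the stored calibration is wrong by one degree.
used_target = camera_to_base(fruit_cam, math.radians(31.0), offset_m)

position_error_m = math.dist(true_target, used_target)
print(f"target offset caused by a 1 degree calibration error: {position_error_m * 1000:.1f} mm")
```

Under these assumptions, a one-degree angular error displaces the commanded target by about 14 mm at a range of 0.8 m even though the detection itself is exact, which is consistent with the observation that grasping can fail when perception accuracy is high.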

4. Perception Systems for Fruit Detection and Localization

Perception systems represent a fundamental pillar in robotic fruit harvesting, as they enable the identification, localization, and characterization of fruits within complex and unstructured orchard environments. Accurate perception is essential for guiding downstream processes such as motion planning, manipulation, and detachment, directly influencing the overall efficiency and success rate of harvesting operations. Recent advances in sensing technologies and artificial intelligence have significantly enhanced the capabilities of perception systems, particularly in terms of robustness and real-time performance [2,29]. Machine vision techniques, including RGB imaging, depth sensing, and stereo vision, have been widely adopted to capture visual and spatial information about fruits and their surroundings. These approaches allow the extraction of key features such as color, shape, and spatial position, forming the basis for fruit detection and localization tasks [17]. However, traditional vision-based methods often struggle under varying illumination conditions and occlusions, which are common in orchard environments.

To address these limitations, artificial intelligence and deep learning approaches have emerged as dominant solutions for perception tasks. Convolutional neural networks (CNNs) and advanced object detection frameworks, such as YOLO and Mask R-CNN, have demonstrated high accuracy in fruit detection and segmentation, even in challenging field conditions [30,31]. These methods enable more reliable extraction of fruit features and support real-time decision-making in autonomous harvesting systems. Furthermore, sensor fusion techniques have been increasingly explored to improve perception robustness by combining data from multiple sensing modalities, such as LiDAR and cameras. This integration enhances depth estimation, reduces uncertainty, and improves detection performance in cluttered environments [16]. Despite these advancements, perception systems still face significant challenges related to environmental variability, computational complexity, and generalization across different fruit types and orchard conditions.

4.1. Machine Vision Techniques

Machine vision techniques form the foundational layer of perception systems in robotic fruit harvesting, enabling the acquisition and preprocessing of visual and spatial data from orchard environments. As illustrated in Figure 2, this stage corresponds to the input sensing and preprocessing layers, where multiple sensing modalities—namely RGB imaging, depth sensing, and stereo vision—are employed to capture comprehensive information about fruits and their surroundings.

RGB imaging is the most widely used technique due to its ability to capture detailed color and texture information, which is essential for distinguishing fruits from foliage. Within the perception pipeline shown in Figure 2, RGB data serves as a primary input that supports feature extraction during subsequent detection stages. However, RGB-based approaches remain highly sensitive to complex outdoor orchard conditions, particularly variations in illumination, shadows, background clutter, and fruit occlusion, which can significantly reduce detection accuracy and system robustness [32,33]. To enhance spatial understanding, depth-sensing technologies, such as RGB-D cameras and time-of-flight sensors, provide explicit distance information between the sensor and the fruit. As depicted in Figure 2, depth data contributes to the preprocessing stage, where it is combined with RGB inputs to improve scene interpretation and facilitate accurate fruit localization. Depth information is particularly valuable for estimating three-dimensional positions and avoiding collisions during robotic manipulation [34,35]. Stereo vision systems further extend perception capabilities by reconstructing three-dimensional orchard scenes from multiple viewpoints. In the context of Figure 2, stereo vision enhances the sensing stage by generating depth information and point cloud data, enabling more accurate fruit localization even under occlusion, clustered fruit arrangements, and varying illumination conditions. This approach improves robustness in complex and dynamic canopy environments where single-view perception methods may exhibit limited performance [36,37]. The preprocessing stage, as illustrated in Figure 2, involves sensor calibration, noise reduction, image enhancement, and data alignment to improve data quality and ensure consistency across multiple sensing modalities before integration into higher-level AI models.
These operations play a critical role in reducing uncertainty and enhancing the accuracy of subsequent detection and segmentation processes. Overall, machine vision techniques constitute the foundational input layer of perception systems by transforming raw environmental data into structured representations suitable for advanced AI-based analysis. Their integration, as shown in Figure 2, provides a robust basis for reliable fruit detection and precise localization in robotic harvesting applications.
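How stereo vision yields depth can be made concrete with the standard rectified-stereo relation, in which depth equals focal length times baseline divided by disparity. The sketch below applies that relation and its first-order error estimate. The focal length, baseline, and disparity values are hypothetical and are not drawn from the cited systems.

```python
def stereo_depth_m(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

def depth_error_m(depth_m: float, focal_px: float, baseline_m: float,
                  disparity_error_px: float) -> float:
    """First-order depth uncertainty: dZ ~ Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px

FOCAL_PX = 700.0    # assumed focal length in pixels
BASELINE_M = 0.12   # assumed distance between the two cameras

depth_near = stereo_depth_m(84.0, FOCAL_PX, BASELINE_M)  # about 1 m
depth_far = stereo_depth_m(42.0, FOCAL_PX, BASELINE_M)   # about 2 m

# Effect of a half-pixel matching error at each range.
err_near = depth_error_m(depth_near, FOCAL_PX, BASELINE_M, 0.5)
err_far = depth_error_m(depth_far, FOCAL_PX, BASELINE_M, 0.5)
print(f"{depth_near:.2f} m +/- {err_near * 1000:.1f} mm")
print(f"{depth_far:.2f} m +/- {err_far * 1000:.1f} mm")
```

With these assumed values, a half-pixel disparity error corresponds to roughly 6 mm at 1 m and roughly 24 mm at 2 m. The quadratic growth with range is one reason calibration and noise reduction during preprocessing matter for localization accuracy.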

4.2. AI and Deep Learning Approaches

Artificial intelligence and deep learning approaches constitute the core processing layer of perception systems, enabling accurate fruit detection and segmentation in complex orchard environments. As illustrated in Figure 2, this stage transforms preprocessed multi-modal data into structured outputs for robotic decision-making. Convolutional neural network (CNN)-based detectors, including Faster R-CNN, YOLO, and SSD, are widely employed for fruit identification, generating bounding boxes within the detection module while maintaining high accuracy and real-time performance under variations in fruit size, color, and orientation [38,39,40]. Complementarily, segmentation models such as Mask R-CNN and U-Net provide pixel-level localization, enabling precise boundary extraction, improved handling of overlapping fruits, and enhanced geometric understanding for grasp planning [41,42]. In addition, feature extraction and localization processes derive essential attributes, including fruit size, shape, color, and three-dimensional position, converting detection and segmentation outputs into structured representations directly usable by motion planning and control modules. Recent advances in transformer-based and hybrid architectures further improve feature representation and robustness under occlusion, lighting variability, and complex backgrounds [43,44].
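Bounding-box detectors typically return several overlapping candidates for the same fruit, and a common post-processing step is non-maximum suppression based on intersection over union (IoU). The review does not describe this step, so the sketch below is offered only as a generic illustration. The box coordinates, scores, and the 0.5 threshold are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the highest-scoring box of each overlapping group.

    detections: list of (box, score) pairs. Returns the kept pairs, best first.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept

# Two near-duplicate boxes on one fruit and one box on a separate fruit.
candidates = [
    ((10, 10, 50, 50), 0.90),
    ((12, 12, 52, 52), 0.80),
    ((100, 100, 140, 140), 0.70),
]
kept = non_max_suppression(candidates)
print(len(kept), "fruits after suppression")
```

In clustered fruit the threshold is a trade-off: a low value can merge two genuinely adjacent fruits into one detection, which is part of why pixel-level segmentation is attractive for overlapping targets.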

More recently, foundation and vision-language models, such as CLIP, the Segment Anything Model (SAM), and Grounding DINO, have emerged as promising approaches for agricultural perception tasks. In contrast to traditional supervised learning frameworks, these models enable open-vocabulary recognition and prompt-based segmentation, allowing flexible adaptation to unseen fruit categories and complex orchard environments with reduced reliance on large annotated datasets. This paradigm shift improves generalization across crop types, illumination conditions, and occlusion scenarios commonly encountered in field applications. However, their deployment in robotic harvesting systems remains limited due to high computational requirements, domain adaptation challenges, and difficulties in achieving real-time inference on embedded robotic platforms. As shown in Figure 2, the outputs of this stage include structured perception data such as fruit location, maturity level, and geometric properties, forming critical inputs for downstream harvesting operations. Despite their effectiveness, conventional supervised deep learning approaches remain limited by high computational requirements, dependence on large annotated datasets, and constrained generalization across crops and environments, which motivates the integration of sensor fusion techniques in subsequent stages. These limitations arise primarily from the high parameter complexity of deep learning architectures, which requires large-scale data to learn robust and transferable feature representations, as well as the substantial cost and effort associated with manual annotation in agricultural environments. To address these challenges, recent research has focused on transfer learning, self-supervised learning, and data augmentation techniques, alongside lightweight model compression strategies such as pruning, quantization, and knowledge distillation, enabling more efficient deployment on embedded robotic platforms.
The outputs of these perception systems serve as essential inputs for downstream decision-making, manipulation, and detachment modules, forming a critical link between visual understanding and physical interaction in robotic harvesting pipelines.
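Of the compression strategies named above, quantization is the simplest to show in a few lines. The sketch below applies symmetric 8-bit quantization to a short list of weights and measures the reconstruction error. It is a toy illustration with hypothetical weights. Practical deployments rely on framework tooling and often quantize per channel or with calibration data.

```python
def quantize_int8(weights):
    """Symmetric 8-bit quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.01, 1.0]  # hypothetical layer weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, f"max reconstruction error = {max_error:.5f}")
```

Each weight is stored in one byte instead of four, at the cost of a rounding error bounded by half the scale step. Whether that error is acceptable for fruit detection has to be checked empirically on the target model.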

4.3. Sensor Fusion

Sensor fusion has emerged as a critical approach for enhancing perception performance in robotic fruit harvesting systems by integrating complementary information from multiple sensing modalities. In complex orchard environments, reliance on a single sensor often leads to reduced accuracy due to limitations such as lighting sensitivity, occlusions, and incomplete spatial information. As illustrated in Figure 2, sensor fusion operates across both the input sensing and processing stages, where data from different sensors are combined to generate more reliable and robust representations of fruit location and characteristics.

At the sensing level, fusion typically involves combining data from RGB cameras, depth sensors, and LiDAR systems. RGB sensors provide rich color and texture information, while depth sensors contribute spatial geometry, and LiDAR offers precise distance measurements and robust performance under varying illumination conditions. As depicted in Figure 2, these heterogeneous inputs can be integrated prior to processing, enabling a more comprehensive understanding of the orchard environment [45,46]. Sensor fusion strategies are generally categorized into early fusion, mid-level fusion, and late fusion. In early fusion, raw data from multiple sensors are combined before feature extraction, allowing the model to learn joint representations directly from multi-modal inputs. This approach is reflected in the input stage of Figure 2, where multiple sensing streams are aligned and preprocessed together. In contrast, mid-level fusion integrates features extracted independently from each sensor, while late fusion combines decision outputs from separate models, improving robustness and modularity [47].
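The three fusion levels differ mainly in what is combined: raw measurements, extracted features, or final decisions. The sketch below reduces each level to its simplest form for a single image location. The function names, feature values, and weighting are hypothetical. In practice the early and mid-level variants feed a learned model rather than returning the combined vector directly.

```python
def early_fusion(rgb_pixel, depth_m):
    """Early fusion: stack raw sensor values into one input vector (R, G, B, D)."""
    return (*rgb_pixel, depth_m)

def mid_level_fusion(features_rgb, features_depth):
    """Mid-level fusion: concatenate features extracted separately per sensor."""
    return list(features_rgb) + list(features_depth)

def late_fusion(score_rgb, score_depth, weight_rgb=0.5):
    """Late fusion: combine the confidence scores of two independent detectors."""
    return weight_rgb * score_rgb + (1.0 - weight_rgb) * score_depth

raw_input = early_fusion((200, 30, 40), 0.8)
joint_features = mid_level_fusion([0.7, 0.1], [0.3])
# In deep shadow the RGB detector is unsure, so it is given less weight here.
fused_score = late_fusion(score_rgb=0.4, score_depth=0.9, weight_rgb=0.3)
print(raw_input, joint_features, round(fused_score, 2))
```

Late fusion keeps each detector replaceable on its own, which is the modularity benefit noted above, while early fusion requires the streams to be aligned before any model sees them.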

Within the processing stage shown in Figure 2, deep learning models are increasingly designed to support multi-modal data fusion. Multi-stream convolutional networks and attention-based architectures enable effective integration of visual and spatial features, significantly improving fruit detection and localization accuracy. These models leverage complementary information to address challenges such as partial occlusion, overlapping fruits, and background clutter [48,49]. One of the primary advantages of sensor fusion is its ability to mitigate environmental variability, particularly under challenging field conditions. For example, while RGB images may be affected by shadows or varying illumination, LiDAR and depth sensors remain relatively stable, ensuring consistent perception performance. As a result, fused systems demonstrate improved robustness and reliability compared to single-sensor approaches [46]. However, sensor fusion also introduces several challenges, including increased computational complexity, synchronization requirements, and calibration difficulties between sensors. As reflected in Figure 2, effective fusion requires precise alignment of multi-modal data streams during preprocessing to ensure accurate downstream processing. Additionally, the integration of multiple sensors increases system cost and energy consumption, which may limit scalability in commercial applications [45,47].
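The synchronization requirement can be illustrated with nearest-timestamp pairing, one simple way to align two sensor streams that run at different rates. The sketch below is a minimal example under stated assumptions. The function name, timestamps, and the 20 ms tolerance are hypothetical, and hardware triggering or interpolation may be preferable on a real platform.

```python
from bisect import bisect_left

def pair_by_timestamp(camera_ts, lidar_ts, tolerance_s=0.02):
    """Pair each camera frame with the nearest LiDAR scan within a tolerance.

    Both lists hold timestamps in seconds, and lidar_ts must be sorted ascending.
    Camera frames with no scan inside the tolerance are dropped.
    """
    pairs = []
    for t in camera_ts:
        i = bisect_left(lidar_ts, t)
        # Only the scans just before and just after t can be the nearest.
        candidates = lidar_ts[max(0, i - 1): i + 1]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda s: abs(s - t))
        if abs(nearest - t) <= tolerance_s:
            pairs.append((t, nearest))
    return pairs

camera_ts = [0.000, 0.033, 0.066, 0.100]  # roughly 30 frames per second
lidar_ts = [0.005, 0.105]                 # roughly 10 scans per second
pairs = pair_by_timestamp(camera_ts, lidar_ts)
print(pairs)
```

Here two of the four camera frames have no scan close enough and are discarded, which shows how the slower sensor bounds the rate of fused observations.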

4.4. Challenges in Field Conditions

Perception systems for robotic fruit harvesting must operate under highly variable and unstructured orchard conditions, which introduce significant challenges that affect detection accuracy and system reliability. As illustrated in Figure 3, environmental factors such as illumination variability, occlusion, and fruit heterogeneity play a critical role in degrading perception performance.

One of the most prominent challenges is illumination variability, caused by changing sunlight intensity, shadows, and weather conditions. Outdoor environments introduce dynamic lighting that can significantly alter the appearance of fruits and foliage, leading to inconsistent feature extraction and reduced detection accuracy. Bright sunlight can cause overexposure, while shadows may obscure fruit visibility, making reliable perception difficult [50,51]. Another major challenge is occlusion, where fruits are partially or fully hidden by leaves, branches, or other fruits. As shown in Figure 3, occlusion disrupts the visibility of fruit boundaries, making it difficult for both traditional and deep learning-based methods to accurately detect and segment targets. This issue is particularly severe in dense canopy environments, where overlapping structures are common [30,52]. In addition, fruit variability introduces further complexity. Fruits of the same type may vary significantly in size, color, shape, and maturity level, while different fruit species exhibit even greater diversity. This variability challenges the generalization capability of perception models, especially when trained on limited datasets. As highlighted in Figure 3, such variations can lead to misclassification, missed detections, or inaccurate localization [53]. These environmental challenges often interact simultaneously, compounding their effects and further reducing system robustness. As a result, perception systems must incorporate adaptive algorithms and robust sensing strategies to maintain performance under real-world orchard conditions.

4.5. Comparative Analysis and Limitations

A comparative analysis of perception approaches highlights fundamental trade-offs between traditional machine vision methods and deep learning-based techniques in terms of accuracy, computational efficiency, robustness, and adaptability, as illustrated in Figure 3. Traditional machine vision methods, based on handcrafted features such as color thresholds, edge detection, and geometric analysis, offer low computational cost and fast implementation, making them suitable for resource-constrained systems. However, they are highly sensitive to environmental variations, particularly illumination changes and complex backgrounds, which limit robustness and generalization [54]. In contrast, deep learning approaches achieve higher accuracy and robustness by learning complex feature representations from large datasets. Models based on CNN architectures, including YOLO and Mask R-CNN, have demonstrated strong performance in fruit detection and segmentation tasks under agricultural conditions. However, these approaches often require large annotated datasets, high computational resources, and extended training procedures, which may limit inference speed and practical real-time deployment in smart farming applications [55]. As shown in Figure 3, a notable trade-off exists between accuracy and processing speed. While deep learning models provide superior detection performance, their inference is often slower than that of traditional methods, particularly when handling high-resolution or multi-modal data, which can impact real-time harvesting efficiency. Additionally, environmental factors such as lighting variability, occlusion, and fruit diversity continue to affect both approaches, with performance degradation observed when operating outside trained conditions [51,52]. From a systems perspective, these limitations are primarily attributed to the high model complexity of modern deep learning architectures and their dependence on large-scale annotated datasets.
In agricultural environments, data annotation is particularly challenging due to seasonal variability, occlusion, and the labor-intensive nature of labeling. These limitations further motivate the development of learning-based perception-to-action frameworks that can better handle uncertainty, variability, and partial observability in real-world orchards. In particular, recent trends are moving toward tighter integration between perception and decision-making, enabling more adaptive and intelligent robotic harvesting systems.

Overall, Figure 3 indicates that no single approach is sufficient on its own; achieving reliable perception performance requires combining efficient algorithms, adaptive learning strategies, and multi-sensor data fusion.

5. Robotic Manipulation and End-Effectors

Robotic manipulation is a critical stage in autonomous fruit harvesting systems, translating perception outputs into precise physical interaction with fruits. It directly affects harvesting success, efficiency, and fruit quality, particularly in unstructured orchard environments with high variability in fruit position and accessibility. As illustrated in Figure 4, this stage integrates grasp planning, motion control, and end-effector operation within a closed-loop interaction between sensing and actuation. Recent advances have focused on improving adaptability, dexterity, and compliance to handle diverse fruit types and complex canopy structures [10,18,21,22,56]. End-effectors serve as the physical interface between the robot and the fruit, with their design determining grasping effectiveness and detachment suitability. Rigid grippers provide stability and precision but may damage delicate fruits, whereas soft grippers offer greater compliance and safer interaction with irregular produce. Accordingly, end-effector selection depends on fruit properties such as size, texture, and attachment strength. Grasping strategies, including suction-based and multi-fingered mechanisms, further define interaction; suction systems enable rapid picking of smooth fruits, while fingered grippers provide higher control in cluttered environments. These approaches are closely integrated with motion planning algorithms to ensure efficient and collision-free access. Motion planning and control coordinate manipulator movements under dynamic conditions by incorporating obstacle avoidance, kinematic constraints, and real-time feedback. Feedback control systems enable continuous adjustment of grip force, position, and trajectory, improving robustness and reducing failure rates during grasping and detachment [22,56]. Minimizing fruit damage while maintaining efficiency remains a major challenge, as excessive force or improper contact can cause bruising or detachment failure. 
To address this, recent studies emphasize compliant control, soft robotic materials, and force-sensing technologies that support adaptive interaction and preserve fruit quality [21,57]. Despite these advancements, manipulation systems still face challenges related to fruit variability, occlusion, and coordination with perception outputs. As shown in Figure 4, achieving seamless integration between perception, grasp planning, and actuation remains an open issue, motivating further development of adaptive end-effectors, learning-based grasping, and real-time control frameworks for improved robustness and scalability.

5.1. Types of End-Effectors

End-effectors represent the critical interface between robotic manipulators and fruits, directly influencing grasping, detachment, and overall harvesting performance. As illustrated in Figure 4, they operate within the manipulation stage, where perception and motion planning outputs are converted into physical interaction with fruits. Their design must balance precision, adaptability, and compliance to ensure reliable performance under variable orchard conditions. End-effectors are commonly classified into rigid and soft grippers, each with distinct advantages. Rigid grippers, typically made of metallic or hard polymer materials, offer high structural stability and precise force control, making them suitable for fruits with firm surfaces and regular geometries. However, their limited compliance increases the risk of bruising or damage when handling delicate or irregular fruits [58,59]. In contrast, soft grippers, based on silicone, elastomers, or pneumatic structures, provide higher conformity to fruit shapes, enabling more uniform force distribution and reduced mechanical stress. As shown in Figure 4, this enhances adaptability in handling fruits with varying sizes and maturity levels, while significantly reducing damage rates, particularly for fragile crops such as strawberries and tomatoes [60,61,62]. Hybrid end-effectors combining rigid structures with soft contact elements have also been developed to balance precision and compliance. These systems aim to optimize the trade-off between grasping accuracy and fruit protection, particularly in complex environments with occlusion and limited accessibility [63,64]. Table 1 summarizes a comparative evaluation of rigid, soft, and hybrid end-effectors in terms of performance, limitations, and suitable applications.

Additionally, integrating sensing capabilities such as force, tactile, and proximity sensors enhances real-time feedback during grasping. Within the framework of Figure 4, this enables closed-loop control, improving grip adjustment, success rate, and damage reduction in uncertain orchard conditions [59,64]. Despite these developments, no single end-effector design is universally optimal, as performance depends on fruit properties, environmental conditions, and system integration, highlighting the need to consider end-effectors as part of a fully integrated manipulation system.
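The closed-loop grip adjustment described above can be sketched as a single proportional update step. This is an illustrative assumption rather than a controller reported in the cited studies; the function name, the gain, and the force limit are hypothetical and would need tuning per fruit type and gripper.

```python
def update_grip_command(command_n: float, measured_n: float, target_n: float,
                        gain: float = 0.5, max_n: float = 5.0) -> float:
    """Return the next commanded grip force in newtons.

    One proportional step moves the command toward the target contact force
    using the force-sensor reading, then clamps it to [0, max_n] so the
    gripper never exceeds a (hypothetical) bruising limit.
    """
    error = target_n - measured_n          # positive when the grip is too loose
    new_command = command_n + gain * error
    return min(max(new_command, 0.0), max_n)
```

Calling this function once per control cycle with a fresh sensor reading gives a basic force-regulation loop; the clamp is the part that encodes the damage constraint.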

5.2. Grasping Strategies

Grasping represents a critical stage in robotic fruit harvesting systems, as it directly affects harvesting success, fruit damage, and overall system efficiency. Unlike perception, which focuses on environmental understanding, grasping involves physical interaction with deformable and uncertain biological objects, requiring careful balance between stability, compliance, and force control. Traditional grasping strategies in agricultural robotics are primarily based on suction-based and finger-based mechanisms. Suction grippers utilize negative pressure to attach to fruit surfaces and are widely adopted due to their simplicity and effectiveness for smooth or spherical fruits. However, their performance is sensitive to surface texture, fruit orientation, and leakage under irregular or dusty conditions. Finger-based grippers, on the other hand, provide mechanical grasping through multi-finger contact, offering improved adaptability to different fruit shapes and sizes. Despite their versatility, they require precise force control to avoid bruising or excessive compression, particularly for soft fruits. To overcome limitations of rigid grasping mechanisms, soft and adaptive grippers have been increasingly explored. These systems employ compliant materials such as silicone or pneumatic structures to conform to fruit geometry, enabling safer interaction and reduced damage rates. Their inherent compliance improves robustness under positioning uncertainty; however, trade-offs remain in terms of actuation complexity, control accuracy, and grasping speed [13,65].

In parallel with hardware-based developments, vision-guided grasping strategies have emerged, integrating perception outputs with grasp planning algorithms. In these approaches, fruit position, pose, and geometric properties estimated from vision systems are directly used to guide grasp selection and end-effector positioning. Although effective in structured scenarios, these methods typically rely on modular pipelines and remain sensitive to perception errors and environmental uncertainty [2,66]. More recently, end-to-end grasping strategies based on deep reinforcement learning (DRL) have gained increasing attention. Unlike conventional pipeline-based methods, DRL frameworks learn grasping policies directly from interaction with simulated or real environments, mapping sensory inputs to control actions without explicit feature engineering. This enables autonomous adaptation to variations in fruit geometry, occlusion, and attachment conditions. DRL-based approaches can also be enhanced through imitation learning and simulation-to-real transfer, which accelerate training and improve real-world applicability in agricultural environments [52,61,67]. However, despite their potential, these methods remain limited by high training complexity, sample inefficiency, sim-to-real gaps, and computational requirements, which currently restrict large-scale deployment in field robotics.

Overall, grasping strategies in robotic harvesting systems exhibit a clear evolution from rigid mechanical designs toward adaptive, perception-driven, and learning-based frameworks. While traditional suction and finger-based systems remain widely used in practical applications, emerging DRL-based approaches represent a promising direction for achieving fully autonomous, robust, and generalized grasping in unstructured orchard environments. As highlighted in previous sections, the effectiveness of any grasping strategy is strongly dependent on perception accuracy and environmental variability, emphasizing the need for tightly integrated perception–action systems [66,67].

5.3. Motion Planning and Control

Motion planning and control form a fundamental component of robotic manipulation in fruit harvesting systems, enabling the conversion of grasping decisions into precise and executable movements. As illustrated in Figure 4, this stage connects grasp planning with physical manipulation, allowing robotic arms to reach target fruits efficiently while avoiding collisions and ensuring stable interaction. It plays a central role in improving harvesting success, reducing execution time, and maintaining safe operation in complex orchard environments.

Path planning focuses on generating optimal trajectories for the robotic manipulator from its initial position to the target fruit using perception and localization data. These trajectories must satisfy kinematic constraints and workspace limitations while passing through dense canopy structures. Classical methods such as Rapidly-exploring Random Trees (RRT) and Probabilistic Roadmaps (PRM) are widely used for high-dimensional spaces [17], but they may produce suboptimal or non-smooth paths requiring further refinement. Recent approaches based on trajectory optimization improve efficiency by incorporating cost functions related to time, energy, and collision risk [68], while deep learning-based planners enhance real-time motion prediction directly from perception data [48]. Obstacle avoidance is essential for safe operation in cluttered orchard environments, where manipulators must navigate around branches, leaves, trellis structures, and neighboring fruits. As shown in Figure 4, traditional geometric and map-based collision detection methods are effective in structured settings but less robust in dynamic and irregular environments [69]. To improve adaptability, reactive and sensor-based strategies such as potential field methods generate attractive and repulsive forces for continuous trajectory adjustment, although they may suffer from local minima issues [70]. More advanced techniques, including model predictive control and reinforcement learning, provide predictive and adaptive responses that enhance robustness in dynamic conditions [71,72]. Motion control ensures accurate execution of planned trajectories through regulation of joint motion, velocity, and interaction forces. Within the framework of Figure 4, it operates in a closed-loop system using real-time sensor feedback to adjust manipulator behavior.
Advanced strategies such as adaptive control, impedance control, and learning-based control improve stability and precision during interaction with fruits and surrounding structures [73].
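The potential field idea mentioned above can be sketched in a few lines. The version below uses the classical quadratic attractive potential and an inverse-distance repulsive potential in a 2D plane; the gains `k_att` and `k_rep`, the influence radius `d0`, and the planar point-robot simplification are assumptions for illustration, since a real manipulator would be planned in joint or task space with full collision geometry.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def potential_field_force(q: Point, goal: Point, obstacles: List[Point],
                          k_att: float = 1.0, k_rep: float = 1.0,
                          d0: float = 1.0) -> Tuple[float, float]:
    """Return the net artificial force acting on a point robot at q."""
    # Attractive term: pulls the robot straight toward the goal.
    fx = k_att * (goal[0] - q[0])
    fy = k_att * (goal[1] - q[1])
    for ox, oy in obstacles:
        dx, dy = q[0] - ox, q[1] - oy
        d = math.hypot(dx, dy)
        # Repulsive term: active only inside the influence radius d0.
        if 0.0 < d <= d0:
            magnitude = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += magnitude * dx / d
            fy += magnitude * dy / d
    return fx, fy
```

If an obstacle lies directly between the robot and the goal, the repulsive term can cancel or reverse the attractive term along that line, which is one way the local minima noted above arise.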

Overall, effective integration of path planning, obstacle avoidance, and motion control is essential, as delays or inaccuracies in any component may propagate through the system, leading to reduced efficiency or fruit damage, highlighting the importance of tight coupling with perception and manipulation modules.

5.4. Damage Minimization

Damage minimization is a critical objective in robotic fruit harvesting systems, since fruit quality directly affects market value and post-harvest shelf life. Mechanical damage such as bruising, puncturing, and surface abrasion may occur during grasping, manipulation, or detachment, particularly under inappropriate force application. As illustrated in Figure 4, this function is tightly coupled with the manipulation stage and continuously regulated through closed-loop feedback connecting perception, control, and performance evaluation. A major factor influencing damage is the magnitude and distribution of contact forces during grasping. Excessive or uneven force can cause internal tissue damage even without visible external signs. To mitigate this, modern systems integrate force sensing and tactile feedback within end-effectors, enabling real-time adjustment of grip force according to fruit properties [59,74]. As shown in Figure 4, this feedback mechanism supports dynamic adaptation of manipulation parameters to different firmness and maturity levels. The use of compliant and soft materials further contributes to damage reduction by distributing contact pressure over larger areas and reducing stress concentration on the fruit surface. Soft and hybrid grippers have shown lower damage rates compared to rigid designs, especially for delicate fruits such as strawberries and peaches [21,75]. Within the manipulation framework of Figure 4, these designs improve system adaptability under variable orchard conditions.

Motion control is also essential in reducing damage, as abrupt movements, high acceleration, or inaccurate trajectory execution can introduce unwanted dynamic forces. Smooth trajectory generation and precise velocity regulation are therefore required to ensure gentle interaction during grasping and detachment. Advanced methods such as impedance control and adaptive control enable the regulation of interaction forces and stable contact with fruits [73]. In addition, fruit-specific properties such as size, shape, surface characteristics, and attachment strength must be incorporated into manipulation strategies. Adaptive systems that adjust grasping force and configuration based on real-time perception data demonstrate improved performance across different fruit types [74]. However, consistent damage minimization remains challenging in unstructured orchard environments due to variability, occlusion, and uncertainty.
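As a rough illustration of impedance control, the commanded interaction force can be modeled as a virtual spring and damper acting on the position and velocity errors. The sketch below is one-dimensional and adds a saturation limit; the parameter names and the saturation step are assumptions for illustration and are not drawn from [73].

```python
def impedance_force(x_des: float, x: float, v_des: float, v: float,
                    stiffness: float, damping: float, f_max: float) -> float:
    """Return the commanded force F = K (x_des - x) + D (v_des - v).

    The result is clamped to [-f_max, f_max], where f_max stands in for a
    fruit-specific damage threshold (hypothetical value supplied by the user).
    """
    force = stiffness * (x_des - x) + damping * (v_des - v)
    return max(-f_max, min(force, f_max))
```

A lower stiffness makes the end-effector yield more when it meets the fruit, trading positioning accuracy for gentler contact.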

Finally, from Figure 4, damage minimization is not an isolated function but a system-level objective requiring coordinated integration of perception, grasp planning, motion control, and end-effector design. Future improvements should focus on more accurate modeling of fruit mechanical behavior, enhanced sensing capabilities, and learning-based control strategies to further reduce damage while maintaining high harvesting efficiency.

5.5. Comparative Evaluation of Manipulation Techniques

The performance of robotic manipulation techniques in fruit harvesting systems is typically evaluated based on key metrics such as grasping efficiency, success rate, and adaptability to different fruit types. As illustrated in Figure 4, these evaluation criteria are integrated within the closed-loop feedback system, where performance outcomes influence real-time adjustments in grasping and motion control strategies.

Grasping efficiency refers to the ability of the robotic system to securely and quickly acquire fruits with minimal repositioning or repeated attempts. Suction-based systems generally exhibit high efficiency under favorable conditions due to their simplicity and fast operation. However, their performance declines when surface conditions are unfavorable. In contrast, fingered and soft grippers provide more stable and adaptive grasping, albeit sometimes at the cost of increased execution time [26,67].

Success rate is a critical metric representing the percentage of successful harvesting attempts relative to total attempts. High success rates depend on accurate perception, effective grasping strategies, and reliable motion execution. Studies have shown that integrated systems combining adaptive grasping and real-time feedback control achieve significantly higher success rates compared to static or open-loop approaches [7,75]. As depicted in Figure 4, success rate is directly influenced by the coordination between perception, planning, and manipulation modules. Adaptability refers to the ability of a manipulation technique to handle fruits with different shapes, sizes, surface properties, and mechanical properties. Fingered and soft grippers generally offer higher adaptability due to their ability to conform to diverse geometries, whereas suction-based systems are more limited in this regard. Hybrid approaches, as indicated in Figure 4, provide a promising solution by combining multiple grasping mechanisms to handle a broader range of fruit types [57,63]. To further clarify these differences, Table 2 presents a comparative evaluation of major manipulation techniques based on key performance metrics relevant to robotic harvesting systems.

Overall, the comparative analysis highlights that no single manipulation technique is universally optimal. Instead, performance depends on the interaction between system components and environmental conditions. As emphasized in Figure 4, integrating multiple strategies within a unified control framework, supported by real-time feedback and adaptive decision-making, is essential for achieving robust and scalable robotic harvesting systems.

In addition to qualitative comparisons, quantitative performance indicators provide a clearer understanding of the practical capabilities and limitations of robotic harvesting systems. Metrics such as harvesting success rate, cycle time, fruit damage rate, and operational efficiency are commonly used to evaluate robotic performance under real orchard conditions. Therefore, a comparative summary of representative robotic harvesting studies is presented in Table 3 to provide a more evidence-based evaluation of existing systems.
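A minimal sketch of how such indicators could be computed from a log of harvesting attempts is given below. The record layout and field names are hypothetical, and the damage rate is taken here relative to successfully harvested fruits, which is an assumption because reporting conventions differ between studies.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Attempt:
    success: bool        # fruit was detached and collected
    damaged: bool        # fruit showed damage after harvesting
    cycle_time_s: float  # duration of the attempt in seconds


def summarize(attempts: List[Attempt]) -> Dict[str, float]:
    """Return success rate (%), damage rate (%), and mean cycle time (s)."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    harvested = [a for a in attempts if a.success]
    success_rate = 100.0 * len(harvested) / len(attempts)
    # Damage rate is computed over harvested fruits only (assumed convention).
    damage_rate = (100.0 * sum(a.damaged for a in harvested) / len(harvested)
                   if harvested else 0.0)
    mean_cycle = sum(a.cycle_time_s for a in attempts) / len(attempts)
    return {"success_rate_pct": success_rate,
            "damage_rate_pct": damage_rate,
            "mean_cycle_time_s": mean_cycle}
```

Stating the denominator of each metric explicitly, as done in the comments, makes results from different studies easier to compare.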

As shown in Table 3, harvesting success rates generally improve under structured orchard conditions and controlled illumination environments, while performance degradation is commonly observed in cluttered canopies and under severe occlusion. Similarly, cycle time and fruit damage remain strongly dependent on end-effector design, sensing strategy, and motion planning efficiency. These quantitative comparisons demonstrate that although substantial progress has been achieved, significant challenges remain regarding robustness, scalability, and real-time field deployment.

6. Fruit Detachment Mechanisms in Robotic Systems

Fruit detachment represents the most mechanically complex stage in robotic harvesting systems, as it directly affects harvesting success, fruit quality, and plant integrity. Unlike perception and manipulation, which primarily depend on sensing and control algorithms, detachment is governed by biomechanical interactions between the fruit, peduncle, and externally applied forces. As illustrated in Figure 5, this process involves tensile forces, torsional moments, or cutting actions at the fruit–peduncle interface until structural failure occurs, thereby defining the efficiency and selectivity of harvesting operations [84].

The peduncle exhibits viscoelastic and anisotropic mechanical properties that vary across fruit species, maturity stages, and environmental conditions. These properties determine the detachment force threshold required for separation without inducing damage. Exceeding this threshold may lead to stem tearing, fruit bruising, or damage to surrounding plant structures, making controlled detachment particularly challenging in unstructured orchard environments [53,85].

From a system perspective, as highlighted in Figure 5, detachment is tightly coupled with manipulation and control modules. Its performance depends on accurate perception, stable grasping, and precise motion execution, where errors in positioning, force estimation, or trajectory control can reduce efficiency and fruit quality. Recent studies emphasize integrating force sensing and adaptive control strategies to improve detachment reliability and reduce damage [74,86]. Detachment mechanisms are generally classified into cutting-based methods, pulling and twisting strategies, and dynamic excitation approaches. Despite differences in implementation, they share common mechanical principles related to stress, strain, and failure at the fruit–peduncle interface. Therefore, a quantitative understanding of detachment forces is essential before analyzing specific techniques, considering fruit properties, structural characteristics, and environmental variability.

6.1. Cutting-Based Detachment

Cutting-based detachment is a precise and controlled approach in robotic fruit harvesting systems, using blades, scissors, or motorized cutters to sever the peduncle at the fruit–stem junction. As illustrated in Figure 5, this method applies a localized cutting force at the peduncle, leading to separation through shear failure rather than tensile rupture, which reduces force transmission to the fruit and minimizes mechanical damage [11]. Mechanically, cutting-based detachment is governed by shear stress concentration at the cutting interface. Its performance depends on blade sharpness, cutting angle, cutting speed, and peduncle properties. Sharp blades reduce required force by concentrating stress, while misalignment increases resistance and may cause incomplete cutting or tissue damage [85,87]. A major advantage of this method is clean and repeatable separation, which is important for fruits where stem integrity affects post-harvest quality, such as apples, grapes, and peppers. Compared with pulling-based methods, cutting minimizes stress propagation into the fruit body, thereby reducing bruising and internal damage [26,74]. However, successful operation requires high precision in perception and manipulation, since accurate alignment between the cutting tool and peduncle is essential. As shown in Figure 5, small localization or motion planning errors may lead to missed cuts, higher energy consumption, or unintended damage to surrounding plant structures [86]. Operational challenges also include tool actuation complexity, blade wear, and increased system weight and energy demand. In dense canopy conditions, limited visibility further complicates the process. Therefore, integrating adaptive control and real-time feedback is necessary to ensure reliable field performance [85,87].

In general, cutting-based detachment provides a low-damage and highly controlled harvesting strategy, where shear-dominated separation and precise tool alignment determine performance. As shown in Figure 5, it serves as a fundamental benchmark for evaluating alternative detachment mechanisms.

6.2. Pulling and Twisting Mechanisms

Pulling and twisting mechanisms are widely used detachment strategies in robotic fruit harvesting due to their simplicity and ease of implementation. These methods rely on tensile force, torsional torque, or their combination to induce mechanical failure at the fruit–peduncle interface. As illustrated in Figure 5, pulling applies axial force along the peduncle, while twisting introduces shear stress through rotational motion, both promoting detachment via stress concentration [85]. Pulling-based detachment is effective for fruits with well-developed abscission zones, where the fruit–plant connection weakens during maturation. In such cases, relatively low tensile forces are sufficient for separation. However, excessive pulling may cause bruising, stem tearing, or damage to surrounding plant tissues, particularly in immature or strongly attached fruits [53]. Twisting mechanisms generate torsional stress that reduces the required detachment force by concentrating stress in weaker regions of the peduncle. As shown in Figure 5, torsion leads to shear deformation that facilitates failure at lower force levels compared to pure tension, making this approach suitable for fruits with strong attachment strength [26,88]. In practical systems, pulling and twisting are often combined to enhance detachment efficiency and robustness. This combined loading condition improves adaptability across different fruit types and maturity stages while reducing overall force requirements. However, performance depends on accurate control of force magnitude, direction, and duration, as well as precise alignment between the manipulator and the peduncle [86]. Despite their simplicity, these mechanisms are more sensitive to variability in fruit properties and environmental conditions compared to cutting-based methods. Therefore, integrating force sensing and adaptive control strategies is essential to maintain consistent performance and minimize damage in real orchard environments [88,89].
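As a first-order illustration of the two loading modes, the peduncle can be idealized as a homogeneous, linear-elastic rod with a solid circular cross-section of diameter d. This idealization is an assumption made only for illustration, since real peduncles are viscoelastic and anisotropic. Under an axial pulling force F and a twisting torque T (symbols introduced here), standard mechanics of materials gives the nominal tensile stress and the maximum torsional shear stress:

```latex
\sigma = \frac{F}{A} = \frac{4F}{\pi d^{2}},
\qquad
\tau_{\max} = \frac{16\,T}{\pi d^{3}}
```

Because the torsional shear stress scales with the inverse cube of the diameter, thin peduncles are particularly sensitive to twisting under this idealization.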

6.3. Shaking and Dynamic Excitation

Shaking and dynamic excitation mechanisms rely on the application of oscillatory or vibrational forces to induce fruit detachment by exploiting the dynamic response of the fruit–peduncle system. Unlike static methods such as cutting or pulling, this approach applies time-varying forces that generate cyclic stresses, leading to progressive weakening and eventual failure at the peduncle interface. As illustrated in Figure 5, these dynamic forces act on the same mechanical system but introduce additional inertial and resonance effects [90]. From a mechanical standpoint, dynamic excitation can reduce the average detachment force by inducing fatigue and resonance phenomena within the peduncle. When the excitation frequency approaches the natural frequency of the fruit–peduncle system, stress amplification occurs, facilitating detachment at lower force levels compared to purely static loading. To provide a more rigorous mechanical interpretation of this phenomenon, the fruit–peduncle system can be modeled using a single-degree-of-freedom (SDOF) representation, where the fruit is idealized as a lumped mass connected to the supporting branch through an equivalent spring–damper system. This formulation enables analytical investigation of the system’s response under harmonic excitation and provides insight into resonance conditions associated with optimal detachment efficiency [91]. However, the SDOF model neglects the dynamic interaction between the fruit and the supporting branch. To address this limitation, a two-degree-of-freedom (2DOF) model can be adopted, where both the fruit and a portion of the branch are represented as coupled masses. The governing dynamics of this system are described in Equations (1) and (2), capturing the coupled vibration behavior and energy transfer mechanisms within the tree structure. 
This extended model provides a more accurate representation of vibration propagation and force amplification effects under complex excitation conditions, as illustrated in Figure 6. The modeling framework is based on classical formulations of fruit–stem dynamic systems as reported in the literature [92].

m_{1} \ddot{x}_{1} + c_{1} (\dot{x}_{1} - \dot{x}_{2}) + k_{1} (x_{1} - x_{2}) = 0 \qquad (1)

m_{2} \ddot{x}_{2} + c_{1} (\dot{x}_{2} - \dot{x}_{1}) + k_{1} (x_{2} - x_{1}) + c_{2} \dot{x}_{2} + k_{2} x_{2} = F(t) \qquad (2)

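These equations describe the dynamic interaction between the fruit mass (m₁) and the branch mass (m₂), whose displacements are x₁ and x₂. The terms in k₁ and c₁ represent the coupling stiffness and damping between fruit and branch, the terms in k₂ and c₂ represent the stiffness and damping of the branch support, and F(t) is the excitation force applied to the branch.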
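Setting the damping terms and F(t) in Equations (1) and (2) to zero and assuming harmonic motion yields a quadratic in ω², whose roots give the two undamped natural frequencies of the coupled system. The sketch below solves that quadratic in closed form; the function name is hypothetical, and neglecting damping is a simplifying assumption that only approximates the resonance frequencies of a lightly damped system.

```python
import math
from typing import Tuple


def two_dof_natural_frequencies(m1: float, m2: float,
                                k1: float, k2: float) -> Tuple[float, float]:
    """Return the undamped natural frequencies (rad/s), lowest first.

    Characteristic equation, with lam = omega**2:
        m1*m2*lam**2 - (k1*m2 + (k1 + k2)*m1)*lam + k1*k2 = 0
    """
    if min(m1, m2, k1, k2) <= 0:
        raise ValueError("masses and stiffnesses must be positive")
    a = m1 * m2
    b = k1 * m2 + (k1 + k2) * m1
    c = k1 * k2
    # The discriminant is non-negative for all positive parameters.
    root = math.sqrt(b * b - 4.0 * a * c)
    lam_low = (b - root) / (2.0 * a)
    lam_high = (b + root) / (2.0 * a)
    return math.sqrt(lam_low), math.sqrt(lam_high)
```

Dividing each result by 2π converts it to hertz, which is the form in which a shaker excitation frequency would typically be specified.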

Figure 6. (a) Single-degree-of-freedom (SDOF) model representing the fruit–pedicel system as a mass–spring–damper system. (b) Two-degree-of-freedom (2DOF) model illustrating the coupled dynamic interaction between fruit and branch under vibrational excitation. Redrawn and modified from Mohsenin et al. [92].

This principle is widely utilized in large-scale harvesting systems, particularly for crops such as olives, almonds, and citrus [93]. One of the primary advantages of shaking-based methods is their ability to harvest multiple fruits simultaneously, significantly improving harvesting efficiency and throughput. As conceptually illustrated in Figure 5, the applied vibrations propagate through the branches, transmitting energy to multiple fruit–peduncle attachment points. However, this lack of selectivity may lead to the unintended detachment of unripe fruits, as well as increased fruit drop and impact-related damage [90]. In robotic applications, integrating dynamic excitation mechanisms presents several challenges. The transmission of vibrations can affect the stability of robotic manipulators and interfere with perception and control systems. Additionally, precise control of excitation amplitude and frequency is required to avoid excessive forces that may damage both the fruit and the plant structure [93].

Despite these limitations, dynamic excitation remains a promising approach for improving harvesting efficiency, particularly in scenarios where selectivity is less critical. As illustrated in Figure 5, this method complements static detachment strategies by introducing time-dependent loading conditions, expanding the range of mechanical interactions that can be exploited for fruit removal.

6.4. Hybrid and Adaptive Detachment Strategies

Hybrid and adaptive detachment strategies aim to combine multiple force application modes—such as tension, torsion, and cutting—to improve detachment efficiency, robustness, and adaptability across varying fruit types and orchard conditions. As illustrated in Figure 5, these strategies operate on the same biomechanical system but leverage combined loading conditions to reduce the required detachment force and enhance reliability [88]. In hybrid approaches, different mechanisms are applied sequentially or simultaneously. For example, a robotic system may initiate detachment using a controlled pulling or twisting motion to weaken the peduncle, followed by a cutting action to complete separation. This combination reduces peak force requirements and minimizes the risk of fruit damage compared to single-method approaches [94].

Adaptive detachment strategies extend this concept by incorporating real-time sensing and decision-making, allowing the system to dynamically select or adjust the detachment method based on fruit properties, maturity level, and environmental conditions. As reflected in Figure 5, this requires continuous monitoring of force feedback and system response to ensure that the applied loading remains within optimal limits [86]. The primary advantage of hybrid and adaptive strategies lies in their ability to handle variability in fruit geometry, peduncle strength, and orchard structure. By combining multiple mechanisms, these systems can achieve higher success rates and improved selectivity compared to conventional methods. However, this flexibility comes at the cost of increased system complexity, requiring advanced control architectures, sensor integration, and coordination between perception, manipulation, and detachment modules [88,94].

Overall, as illustrated in Figure 5, hybrid and adaptive detachment strategies represent a transition toward intelligent and context-aware harvesting systems, where multiple mechanical principles are integrated to optimize performance under real-world conditions.

6.5. Modeling of Detachment Forces

The modeling of fruit detachment forces provides a fundamental framework for understanding and optimizing robotic harvesting strategies. The detachment process can be described using principles of solid mechanics, where the peduncle is treated as a deformable structure subjected to external forces and moments. Failure occurs when the applied stress exceeds the material strength at the fruit–peduncle junction. As illustrated in Figure 5, different detachment mechanisms correspond to distinct loading conditions acting on the same biomechanical system [95].

In pulling-based detachment, the applied force acts along the longitudinal axis of the peduncle, generating tensile stress as expressed in Equation (3).

σ = \frac{F}{A}

(3)

Detachment occurs when the applied stress exceeds the ultimate tensile strength of the material, leading to the detachment force defined in Equation (4).

F_{det} = σ_{u} \cdot A

(4)

where σ is the tensile stress, F is the applied pulling force, A is the cross-sectional area of the peduncle, and σ_u is the ultimate tensile strength. This model indicates that peduncles with larger cross-sectional areas or higher material strength require greater detachment forces.
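Equations (3) and (4) can be illustrated with a short worked calculation. The sketch below assumes a peduncle with a solid circular cross-section; the diameter and ultimate tensile strength are hypothetical values used only to show the arithmetic, not measured data.

```python
import math


def peduncle_area_m2(diameter_m):
    """Cross-sectional area A of a peduncle idealised as a solid circle."""
    return math.pi * diameter_m ** 2 / 4.0


def tensile_stress_pa(pull_force_n, diameter_m):
    """Equation (3): sigma = F / A."""
    return pull_force_n / peduncle_area_m2(diameter_m)


def detachment_force_n(ultimate_strength_pa, diameter_m):
    """Equation (4): F_det = sigma_u * A."""
    return ultimate_strength_pa * peduncle_area_m2(diameter_m)


# Hypothetical values for illustration only: 3 mm peduncle, sigma_u = 2 MPa.
f_det = detachment_force_n(2.0e6, 3.0e-3)
print(f"estimated detachment force: {f_det:.1f} N")
```

For these assumed inputs the estimate is roughly 14 N. Because A scales with the square of the diameter, doubling the peduncle diameter quadruples the predicted detachment force.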

In twisting-based mechanisms, torque is applied to induce shear stress within the peduncle, as described in Equation (5).

τ = \frac{T r}{J}

(5)

where τ is the shear stress, T is the applied torque, r is the radial distance from the peduncle axis, and J is the polar second moment of area of the cross-section (often called the polar moment of inertia). This mechanism is often more efficient than pure tension due to stress concentration effects [15].

Under practical harvesting conditions, detachment typically occurs under combined loading. The bending stress component is expressed in Equation (6).

σ_{b} = \frac{M y}{I}

(6)

where M is the bending moment, y is the distance from the neutral axis, and I is the second moment of area of the cross-section. Because tensile, torsional, and bending stresses superpose, their interaction reduces the effective detachment force.
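The effect of combined loading can be sketched numerically by evaluating Equations (3), (5), and (6) at the outer fibre of a solid circular peduncle. The maximum principal stress is used here as the combining rule; this failure criterion is an assumption made for illustration and is not taken from the cited studies. All load values and the diameter are hypothetical.

```python
import math


def circular_section(diameter_m):
    """Area A, second moment of area I, and polar moment J of a solid circle."""
    r = diameter_m / 2.0
    area = math.pi * r ** 2
    second_moment = math.pi * r ** 4 / 4.0
    polar_moment = math.pi * r ** 4 / 2.0
    return area, second_moment, polar_moment


def max_principal_stress(pull_n, torque_nm, moment_nm, diameter_m):
    """Largest principal stress at the outer fibre under combined loading."""
    area, second_moment, polar_moment = circular_section(diameter_m)
    r = diameter_m / 2.0
    # Normal stress: Equation (3) plus Equation (6) evaluated at y = r.
    sigma = pull_n / area + moment_nm * r / second_moment
    # Shear stress: Equation (5) evaluated at the outer radius.
    tau = torque_nm * r / polar_moment
    return sigma / 2.0 + math.sqrt((sigma / 2.0) ** 2 + tau ** 2)


d = 3.0e-3  # hypothetical peduncle diameter [m]
pure_pull = max_principal_stress(10.0, 0.0, 0.0, d)
combined = max_principal_stress(10.0, 0.002, 0.002, d)
print(f"pure pull: {pure_pull / 1e6:.2f} MPa, combined: {combined / 1e6:.2f} MPa")
```

For the same 10 N pull, adding small twisting and bending moments raises the peak stress in the peduncle. Under this assumed criterion, the failure stress is therefore reached at a lower pulling force.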

Fruit maturity significantly influences detachment behavior, as biochemical changes in the abscission zone reduce bonding strength [96]. This relationship can be approximated as shown in Equation (7).

F_{det} \propto k \cdot E \cdot A

(7)

where E is the elastic modulus and k is a maturity-dependent coefficient. Variations among fruit types also lead to differences in detachment forces, with apples generally requiring higher forces than tomatoes, while citrus fruits respond more effectively to torsional loading [53]. These models provide a unified mechanical framework for analyzing detachment across different harvesting strategies. As illustrated in Figure 5, detachment depends on the interaction between applied forces and peduncle properties. Integrating these models into the control architecture shown in Figure 4 enables real-time adjustment of detachment strategies, improving efficiency and reducing damage. Future research should focus on data-driven and learning-based models for accurate force prediction under variable field conditions [53].

6.6. Comparative Analysis of Detachment Methods

A comparative analysis of fruit detachment methods reveals trade-offs between force requirements, detachment success, fruit damage, and compatibility with robotic systems. As illustrated in Figure 5, all strategies act on the same biomechanical structure but differ in force type and distribution, which directly influence their performance [1].

Cutting-based detachment provides the highest precision by applying localized shear forces for clean separation, resulting in low damage and high consistency. It is well suited for high-value crops but requires accurate positioning and introduces additional mechanical complexity related to tool actuation and maintenance [94].

Pulling and twisting mechanisms offer simpler and more flexible solutions based on tensile and torsional loading. Twisting reduces detachment force through stress concentration, but both methods remain sensitive to fruit maturity and peduncle strength. Inadequate force control may increase the risk of bruising or structural damage [53,89].

Shaking and dynamic excitation methods enable high-throughput harvesting by detaching multiple fruits simultaneously. Through cyclic loading, they exploit resonance and fatigue effects to reduce average force requirements. However, limited selectivity and higher impact-related damage restrict their use in precision harvesting applications [90].

Hybrid and adaptive strategies combine multiple detachment mechanisms to enhance robustness and adaptability. The use of combined loading conditions reduces peak force requirements and improves performance across varying fruit types and environmental conditions, although at the cost of increased system complexity and control requirements [88].

Overall, no single method is universally optimal. The selection of an appropriate strategy depends on fruit characteristics, orchard conditions, and system constraints. Integrating multiple force application modes with real-time adaptation represents a promising direction for achieving efficient, reliable, and low-damage robotic harvesting [97].

The detachment mechanisms discussed in Section 6 provide the mechanical foundation for fruit removal; however, their effectiveness in real-world applications depends on their integration within complete robotic harvesting platforms. In practice, detachment strategies do not operate in isolation but are tightly coupled with perception, grasping, and motion control systems, as previously illustrated in Figure 4 and Figure 5. Therefore, understanding how these mechanisms are implemented within operational robotic platforms is essential for evaluating their practical performance and limitations. The following section examines representative robotic harvesting systems, highlighting how different detachment approaches are integrated at the system level and how they influence overall harvesting efficiency, selectivity, and robustness.

7. Robotic Fruit Harvesting Platforms and Field Implementations

To bridge the gap between theoretical developments and real-world implementation, this section presents representative robotic fruit harvesting platforms that integrate perception, manipulation, and detachment functionalities within operational systems. While substantial progress has been achieved at the component level, translating these advances into fully integrated field-ready platforms remains a major challenge due to environmental variability, sensing uncertainty, and system coordination constraints [63,64]. Recent studies further emphasize that the performance of complete harvesting systems is still limited by the complexity of unstructured orchard environments and the need for robust integration across perception, planning, and actuation modules [98,99].

Existing robotic harvesting platforms exhibit significant diversity in design and operational strategies, ranging from single-arm mobile manipulators to multi-arm coordinated systems and semi-mechanized harvesting solutions. These platforms differ in crop specificity, sensing technologies, harvesting mechanisms, and overall system performance. For instance, apple harvesting robots commonly employ vision-guided manipulators combined with suction-based end-effectors to enable selective picking, whereas strawberry harvesting systems rely on soft or adaptive grippers to minimize fruit damage during handling [24,100]. In contrast, trunk shaker systems used for olives and citrus represent semi-mechanized approaches that prioritize harvesting efficiency and throughput over selectivity, reflecting a fundamentally different operational paradigm [101].

Recent advancements in robotic harvesting platforms have focused on improving system-level coordination and operational efficiency. For example, co-robotic orchard platforms that integrate human–robot collaboration with adaptive control strategies have demonstrated improvements in harvesting throughput by optimizing platform movement and interaction with the environment [94]. Similarly, modern intelligent harvesting systems increasingly incorporate deep learning-based perception and grasp prediction models, enabling more accurate fruit detection, localization, and manipulation under varying environmental conditions [16]. Despite these advancements, field performance remains constrained by challenges such as occlusion, illumination variability, and irregular fruit distribution, which continue to affect perception reliability and manipulation accuracy [4]. Table 4 summarizes key representative platforms, highlighting their crop type, robotic configuration, sensing methods, reported success rates, and primary limitations. This comparison provides insight into current system-level trade-offs and technological gaps in robotic harvesting.

The comparison in Table 4 highlights several important system-level trends in robotic harvesting platforms. First, a clear trade-off exists between harvesting selectivity and operational efficiency. Selective harvesting systems, such as apple and strawberry robots, rely heavily on advanced perception and manipulation modules, which often limit operational speed and robustness under field conditions [64]. In contrast, bulk harvesting systems, such as trunk shakers, achieve significantly higher throughput but lack precision and may lead to increased fruit damage [100].

Second, sensing limitations remain a dominant bottleneck across most platforms, particularly in dense canopy environments where occlusion and illumination variability reduce detection accuracy. Even advanced RGB-D and AI-based perception systems exhibit performance degradation under such conditions, highlighting the need for more robust multi-modal sensing and data fusion strategies [4,16]. Additionally, perception errors directly propagate to manipulation and detachment stages, significantly affecting overall harvesting success rates [98].

Third, manipulation and end-effector design play a critical role in determining system performance. Soft and adaptive grippers improve fruit safety and reduce damage rates but often introduce trade-offs in terms of harvesting speed and control complexity [103]. Recent research trends focus on improving dexterity, adaptability, and learning-based grasping strategies to enhance performance under variable orchard conditions [99].

These observations confirm that current robotic harvesting platforms are still constrained by incomplete integration between perception, manipulation, and detachment modules. Although individual components have achieved significant technological maturity, system-level optimization remains a major challenge. Future developments should focus on adaptive, learning-driven, and tightly integrated robotic systems capable of operating reliably under real orchard conditions while balancing efficiency, selectivity, and fruit quality [98,99].

8. Orchard and Pre-Harvest Factors Affecting Robotic Harvesting

Orchard conditions and pre-harvest factors fundamentally influence the feasibility and performance of robotic fruit harvesting systems. Unlike structured industrial environments, orchards are inherently variable and dynamic, where plant architecture, fruit distribution, and cultivation practices introduce significant uncertainty. As a result, many limitations in robotic harvesting arise not only from sensing or control algorithms, but also from the interaction between robotic systems and the biological complexity of the crop environment [15].

Variability in canopy structure and fruit spatial distribution directly affects perception by increasing occlusion and reducing visibility, thereby limiting localization accuracy and task planning. In addition, fruit mechanical properties and attachment strength—dependent on species, maturity, and agronomic conditions—govern detachment force requirements and impose constraints on manipulation and end-effector design, particularly in balancing grasp stability with damage minimization [97].

Orchard management practices play a critical role in mitigating these challenges by shaping the physical environment. Pruning and canopy training can improve accessibility and reduce occlusion, while harvest facilitation techniques, such as chemical treatments, can lower detachment forces and enable more efficient harvesting [89].

To capture these interdependencies, Figure 7 presents a conceptual framework linking pre-harvest factors to robotic system performance through intermediate physical and perceptual conditions. Plant characteristics, management practices, and facilitation strategies influence visibility, accessibility, and detachment requirements, which collectively determine the effectiveness of perception, manipulation, and detachment processes. This highlights that harvesting performance emerges from coupled system–environment interactions rather than isolated subsystem capabilities.

8.1. Plant Characteristics

Plant characteristics represent a major source of variability in robotic fruit harvesting, directly influencing the interaction between perception, manipulation, and detachment subsystems. Unlike engineered environments, tree crops exhibit irregular architectures, where variations in canopy geometry, branch density, and fruit spatial distribution impose significant constraints on visibility, accessibility, and collision-free manipulation [1].

Tree architecture strongly affects perception performance. Dense canopies and overlapping branches increase occlusion, reducing detection accuracy and depth estimation, particularly in RGB and RGB-D systems. Even advanced deep learning models remain sensitive to illumination changes, background complexity, and fruit overlap, all of which are governed by canopy structure [105]. Consequently, perception uncertainty propagates to localization and manipulation planning.

Fruit distribution further increases complexity, as fruits are often unevenly distributed, clustered, or located in occluded regions near branches. This spatial variability complicates grasp planning, requiring adaptation to different orientations and constrained spaces, while increasing the risk of unintended contact or damage. Such conditions necessitate more advanced planning strategies for dynamic and cluttered environments [97].

From a mechanical perspective, plant characteristics also affect detachment behavior. Fruit orientation, peduncle stiffness, and structural support from surrounding branches influence force transmission during pulling, twisting, or cutting. Variability in these factors leads to inconsistent detachment responses, requiring adaptive control to ensure effective and safe separation [106]. Within the framework illustrated in Figure 7, plant characteristics act as upstream factors shaping visibility, accessibility, and interaction mechanics. These conditions directly impact perception accuracy, grasp feasibility, and detachment reliability, highlighting the importance of structural understanding in improving system robustness and efficiency.

8.2. Orchard Management

Orchard management practices play a central role in shaping the operational environment for robotic harvesting, acting as a controllable layer that can mitigate challenges associated with plant variability. Practices such as pruning, canopy training, and spatial arrangement enable systematic restructuring of orchards to improve accessibility, perception reliability, and operational efficiency [15]. Pruning directly reduces canopy density and branch interference, improving visibility and enhancing fruit detection and localization, particularly in dense environments. It also facilitates smoother motion planning and reduces collision risks during manipulation [105]. Training systems further standardize canopy geometry by guiding tree growth into structured forms, such as planar architectures, which improve fruit accessibility and simplify grasp planning and trajectory generation [107]. Orchard layout parameters, including row spacing and tree height, influence robot navigation and workspace feasibility. Proper spacing enhances mobility and sensor coverage, while controlled tree height ensures reachability. Poorly designed layouts increase occlusion, limit accessibility, and reduce harvesting efficiency.

Within the framework illustrated in Figure 7, orchard management modulates key intermediate conditions, particularly visibility and accessibility, thereby improving perception, manipulation, and detachment performance. This emphasizes that effective robotic harvesting depends not only on system design but also on how well orchard environments are adapted for automation [106].

8.3. Harvest Facilitation Techniques

Harvest facilitation techniques modify the mechanical interaction between fruit and plant to improve detachment efficiency. Unlike structural interventions, these techniques directly influence the physical properties governing detachment, reducing force requirements and improving operational stability [1]. Chemical abscission is a widely studied approach that weakens the fruit–peduncle connection by inducing physiological changes in the abscission zone. This reduces detachment force, enabling simpler end-effectors, lowering damage risk, and reducing reliance on complex force control strategies [106]. Mechanical facilitation methods, such as localized vibration or excitation, alter the dynamic response of the fruit–branch system to promote detachment under controlled conditions. However, their effectiveness depends on fruit mass, attachment stiffness, and structural support, making consistent application challenging in heterogeneous environments [107]. A critical consideration is balancing force reduction with fruit integrity, as excessive weakening may cause premature drop or handling damage. Therefore, facilitation strategies must be calibrated according to fruit type and maturity.

As illustrated in Figure 7, these techniques primarily influence detachment force requirements and fruit stability, directly affecting detachment performance and indirectly shaping manipulation strategies. Their integration provides a pathway to improving efficiency and adaptability in robotic harvesting systems [105].

8.4. Impact on Robotic System Performance

The combined effects of plant characteristics, orchard management, and harvest facilitation determine the overall performance of robotic harvesting systems through their influence on visibility, accessibility, and detachment force requirements. These interdependent factors shape perception, manipulation, and detachment processes, reinforcing that system performance emerges from coupled system–environment interactions rather than isolated components [15]. Visibility conditions directly affect perception accuracy, where occlusion and visual complexity reduce detection reliability and localization precision. These limitations propagate through the system, leading to inefficiencies in grasping and motion planning [105]. Accessibility governs manipulation feasibility, as dense or irregular environments increase path complexity and limit effective end-effector positioning. Reduced accessibility can result in unstable grasps and increased risk of damage [107]. Detachment performance depends on force requirements and fruit stability, which vary with biological and environmental conditions. This variability necessitates adaptive control and flexible detachment strategies, while facilitation techniques can help reduce force thresholds and improve consistency [106]. Within the framework illustrated in Figure 7, these interactions form a cascade in which pre-harvest factors define operational conditions that directly influence subsystem performance. The strong interdependence between perception, manipulation, and detachment highlights the need for an integrated system design that accounts for environmental variability.

Thus, improving robotic harvesting requires alignment between robotic capabilities and orchard conditions, emphasizing a system-level approach to achieve reliable and scalable performance in real-world applications.

9. Post-Harvest Handling, Collection, and Transport Systems

The post-harvest stage represents a critical phase in robotic harvesting systems, where handling and recovery operations directly determine fruit quality, loss rates, and overall system efficiency. Unlike detachment, this stage introduces additional mechanical interactions that may generate impact, compression, vibration, and abrasion, all of which contribute significantly to post-harvest losses [108]. These effects can propagate from external surface damage to internal structural degradation, ultimately affecting shelf life and market value. Robotic handling systems must therefore be designed to accommodate variability in fruit size, shape, and fragility. This requires integration of compliant end-effectors, controlled transfer trajectories, and real-time monitoring to reduce mechanical stress during handling. Adaptive strategies incorporating vibration damping and force regulation have been shown to significantly reduce damage while maintaining operational efficiency [109]. In addition, fruit recovery mechanisms are essential to capture misplaced or displaced fruits during handling, ensuring minimal loss and maintaining throughput performance [110,111]. As illustrated in Figure 8, post-harvest processes form an interconnected system linking damage mechanisms, transport systems, and performance outcomes such as efficiency, loss, and throughput. This framework emphasizes that handling and recovery are not isolated functions but integrated components influencing overall system behavior.

9.1. Fruit Damage Mechanisms

Fruit damage during post-harvest handling remains one of the most significant challenges in robotic harvesting systems [112]. Unlike controlled detachment, handling introduces continuous mechanical loads such as impact, compression, vibration, and abrasion from the moment of release until storage. These stresses can cause both visible damage, such as bruising, and latent internal degradation affecting firmness and shelf life [112,113]. Impact events, such as drops or collisions, may exceed the elastic limit of fruit tissue, leading to cell rupture and structural failure. Compression in storage bins or during stacking induces deformation and internal bruising, while prolonged vibration during transport accelerates fatigue-related tissue breakdown [113]. The severity of these effects depends strongly on fruit mechanical properties such as firmness, elasticity, and viscoelastic behavior [112]. Soft fruits with high deformability are more susceptible to damage under identical loading conditions compared to firmer fruits, highlighting the need for fruit-specific handling strategies. These biomechanical differences must be considered in the design of end-effectors, transfer systems, and control policies to avoid exceeding damage thresholds [112]. Within Figure 8, damage mechanisms act as intermediate variables linking mechanical interaction during transfer to measurable outcomes such as damage rate and throughput.

9.2. Collection and Transport Systems

Collection and transport systems form the critical link between detachment and storage, where improper design can significantly degrade the gains achieved in earlier stages [108,114]. These systems typically include conveyors, soft bins, or adaptive trays designed to minimize impact and distribute load uniformly. Modern robotic systems increasingly employ adaptive collection mechanisms that adjust to fruit size and fragility, reducing localized stress during transfer [115]. Transport pathways must be carefully controlled to avoid sudden accelerations, drops, or collisions, as even minor disturbances can induce bruising or internal damage. Sensor integration plays a crucial role in improving transport safety. Embedded force sensors and accelerometers enable real-time detection of abnormal loads or misalignment, allowing dynamic adjustments in trajectory and speed to preserve fruit integrity [112]. As shown in Figure 8, collection and transport systems act as the central conduit between detachment and final handling stages, directly influencing damage propagation and throughput efficiency.
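The sensor-based adjustment described above can be sketched as a simple rule that scales conveyor speed when accelerometer or force readings exceed a limit. The class `TransportLimits`, the function `speed_scale`, and all threshold values are hypothetical and serve only to illustrate the idea; they do not correspond to a specific system in the cited literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportLimits:
    max_accel_g: float = 1.5       # hypothetical acceleration limit [g]
    max_force_n: float = 8.0       # hypothetical contact-force limit [N]
    min_speed_scale: float = 0.2   # never drop below 20 % of nominal speed


DEFAULT_LIMITS = TransportLimits()


def speed_scale(accel_g, contact_force_n, limits=DEFAULT_LIMITS):
    """Return a factor in [min_speed_scale, 1] applied to the nominal speed."""
    # Worst-case ratio of a measured load to its allowed limit.
    overload = max(abs(accel_g) / limits.max_accel_g,
                   contact_force_n / limits.max_force_n)
    if overload <= 1.0:
        return 1.0                 # loads within limits: keep nominal speed
    # Slow down in proportion to the overload, bounded from below.
    return max(limits.min_speed_scale, 1.0 / overload)


print(speed_scale(0.5, 2.0))   # loads within limits
print(speed_scale(3.0, 2.0))   # acceleration at twice the limit
```

A proportional slowdown is only one possible policy. A deployed system would likely filter the sensor signals and tune the limits per fruit type.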

9.3. Efficiency, Losses, and System Throughput

Post-harvest performance is ultimately evaluated through efficiency, losses, and throughput, which collectively define the economic and operational effectiveness of robotic harvesting systems [108,114]. Efficiency reflects the ability to preserve fruit quality while minimizing energy and time consumption, while losses quantify damaged or unusable produce. Throughput represents the rate of successful fruit progression from detachment to storage. Even minor inefficiencies in handling or transport can significantly amplify damage rates, particularly for delicate fruits such as peaches or berries [116]. Conversely, optimized transport design, adaptive handling, and real-time monitoring can substantially reduce losses and improve system stability. Importantly, post-harvest performance is strongly dependent on upstream system behavior. Errors in perception or manipulation can propagate into misplacement, collision, or inefficient transfer, increasing losses. This interdependence highlights that system-level performance is an emergent property of the entire robotic harvesting pipeline rather than isolated modules [112]. Within Figure 8, efficiency, loss, and throughput are represented as outcome metrics governed by upstream mechanical and control interactions.
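As a minimal sketch of how two of these outcome metrics might be computed from field counts, the function below derives a loss rate and an hourly throughput. The function name, its arguments, and the example counts are hypothetical. The definitions follow the wording above: losses are the share of detached fruit that does not reach storage undamaged, and throughput is the rate of fruit that does.

```python
def post_harvest_metrics(detached, stored_undamaged, elapsed_s):
    """Loss rate and throughput from simple counts over one work period."""
    if detached <= 0 or elapsed_s <= 0:
        raise ValueError("detached and elapsed_s must be positive")
    if not 0 <= stored_undamaged <= detached:
        raise ValueError("stored_undamaged must lie between 0 and detached")
    # Fraction of detached fruit that was damaged or lost before storage.
    loss_rate = (detached - stored_undamaged) / detached
    # Undamaged fruit reaching storage per hour.
    throughput_per_hour = stored_undamaged * 3600.0 / elapsed_s
    return {"loss_rate": loss_rate, "throughput_per_hour": throughput_per_hour}


# Hypothetical example: 200 fruits detached, 180 stored undamaged in 30 min.
metrics = post_harvest_metrics(200, 180, 1800.0)
print(metrics)
```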

10. System Integration and Autonomous Operation

The effectiveness of robotic fruit harvesting systems depends on the seamless integration of perception, decision-making, and actuation within a unified autonomous framework. Despite significant advances in individual components such as machine vision and robotic manipulators, the primary challenge lies in coordinating these subsystems to operate coherently under dynamic and uncertain field conditions. This requires efficient data flow, synchronization, and adaptive control across system layers [58,117]. A central aspect of this integration is the coupling between perception and action, where sensory data must be rapidly translated into precise physical operations. Field conditions such as occlusion, illumination variability, and irregular fruit distribution impose constraints on system responsiveness and accuracy. Consequently, modern harvesting systems rely on closed-loop control architectures that continuously refine motion and task execution based on real-time feedback [118]. In parallel, real-time decision-making enables robots to select targets, adapt grasping and detachment strategies, and respond to unexpected situations. This is closely linked with autonomous navigation, where robots must move efficiently in unstructured environments while maintaining task alignment. The integration challenge further extends to multi-robot coordination and robust control strategies capable of handling latency, synchronization issues, and system uncertainties [119]. The overall system integration and autonomous operation framework is illustrated in Figure 9, highlighting the interaction between perception, planning, control, and execution modules within a closed-loop architecture.

10.1. Perception–Action Coupling

Perception–action coupling extends beyond a simple data-to-command transformation and should be viewed as a tightly integrated mapping between perception, motion planning, and force-controlled interaction. As illustrated in Figure 9, perception outputs continuously parameterize both motion trajectories and interaction forces, enabling adaptive manipulation and controlled detachment.

Modern systems estimate fruit pose, geometry, and sometimes peduncle location, forming the basis for task-space planning. However, many approaches remain predominantly kinematic, focusing on collision-free trajectories while underrepresenting force requirements. This often leads to weak coupling between grasping and detachment stages [2,120]. Mechanically, successful detachment depends on aligning applied forces with the structural properties of the fruit–stem system. Therefore, perception must inform not only end-effector positioning but also force direction, magnitude, and application rate. Misalignment reduces efficiency and increases the likelihood of damage or incomplete detachment [13]. Localization and perception errors directly propagate to downstream manipulation and detachment stages, significantly affecting harvesting performance. Even small inaccuracies in fruit pose estimation may lead to end-effector misalignment, unstable grasping, excessive contact force, or incomplete peduncle separation. Previous studies have shown that localization errors can substantially reduce grasping success rates and increase fruit damage probability, particularly under occlusion and dense canopy conditions where localization uncertainty is high [57,120]. In addition, perception latency and inaccurate depth estimation may generate suboptimal trajectories, thereby increasing collision risk with branches and neighboring fruits. These error propagation effects highlight that harvesting performance should not be evaluated solely based on perception accuracy, but rather through integrated system-level assessment considering the interaction between sensing, planning, manipulation, and detachment modules [117,118]. Closed-loop strategies combining visual and force feedback enable continuous adjustment during execution. However, system performance remains sensitive to latency and synchronization between perception updates and control actions [57].
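The propagation of localization error into grasping can be illustrated with a small Monte Carlo sketch. It assumes isotropic Gaussian position error in three dimensions and a grasp that succeeds only if the total position error stays within a fixed gripper tolerance. Both the error model and the 15 mm tolerance are simplifying assumptions for illustration, not values reported in the cited studies.

```python
import numpy as np


def grasp_success_rate(sigma_mm, tolerance_mm=15.0, trials=20000, seed=0):
    """Fraction of simulated grasps whose 3D position error is within tolerance."""
    rng = np.random.default_rng(seed)
    # Independent Gaussian error on x, y, z with standard deviation sigma_mm.
    errors = rng.normal(0.0, sigma_mm, size=(trials, 3))
    distance = np.linalg.norm(errors, axis=1)
    return float(np.mean(distance <= tolerance_mm))


rates = {s: grasp_success_rate(s) for s in (2.0, 10.0, 20.0)}
for sigma, rate in rates.items():
    print(f"localization sigma = {sigma:4.1f} mm -> success rate {rate:.2f}")
```

Under these assumptions the success rate falls sharply once the localization standard deviation approaches the gripper tolerance. This is one reason to assess performance at the system level rather than by perception accuracy alone.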

A key limitation is the absence of unified models linking perception outputs directly to detachment mechanics. To address this limitation, a systematic perception-to-action pipeline can be defined, where perception outputs such as fruit location, pose, geometry, and attachment-related cues are first transformed into structured representations for downstream processing. These representations serve as inputs to motion planning modules that generate feasible trajectories while ensuring collision-free approach and task-space feasibility. The planned motions are then executed through control systems that translate trajectories into real-time actuation commands under feedback regulation [117,118]. At the interaction level, force control plays a critical role in bridging the gap between kinematic planning and physical fruit–stem interaction, where perception-derived estimates indirectly inform the regulation of contact force magnitude, direction, and timing during manipulation and detachment. This closed-loop formulation enables continuous refinement of actions under uncertainty and environmental variability. It further emphasizes that harvesting performance depends not only on perception accuracy, but also on the consistency and coordination of planning and control stages. Consequently, tightly coupled perception–planning–control architectures are essential to improve robustness and adaptability in unstructured orchard environments [57,120].
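The perception-to-action pipeline described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method taken from the cited studies: the `FruitObservation` fields, the line-of-sight approach rule, the 8 N nominal force, and the confidence-based force scaling are all hypothetical placeholders chosen to show how one perception output can parameterize both a motion target and a force set-point.

```python
import math
from dataclasses import dataclass


@dataclass
class FruitObservation:
    """Hypothetical structured perception output (robot base frame, metres)."""
    position: tuple      # estimated fruit centre (x, y, z)
    peduncle_dir: tuple  # vector from the fruit centre toward the stem
    confidence: float    # detector confidence in [0, 1]


def unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / n for c in v)


def plan_pregrasp(obs, standoff_m=0.10):
    """Waypoint short of the fruit along the line of sight from the base origin.

    Assumption: a straight-line approach; a real planner would also check collisions.
    """
    line_of_sight = unit(obs.position)
    return tuple(p - standoff_m * d for p, d in zip(obs.position, line_of_sight))


def force_setpoint(obs, nominal_force_n=8.0, min_scale=0.5):
    """Pull away from the stem; reduce the force when perception is uncertain.

    Assumption: nominal_force_n and min_scale are illustrative values only.
    """
    pull_dir = tuple(-c for c in unit(obs.peduncle_dir))
    conf = max(0.0, min(1.0, obs.confidence))
    scale = min_scale + (1.0 - min_scale) * conf
    return pull_dir, nominal_force_n * scale


def perception_to_action(obs):
    """Map one observation to a motion target and a force command."""
    pull_dir, pull_force = force_setpoint(obs)
    return {
        "pregrasp": plan_pregrasp(obs),
        "pull_direction": pull_dir,
        "pull_force_n": pull_force,
    }
```

In this sketch the same observation drives both the kinematic target and the interaction force, which is the coupling the text argues is often missing; in a closed-loop system the function would be re-evaluated whenever perception is updated.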

10.2. Real-Time Decision Making

Real-time decision-making is a constrained optimization process that transforms perception outputs into executable harvesting actions. As illustrated in Figure 9, it acts as a central layer connecting perception, planning, and manipulation. Because orchard conditions vary, decision-making must incorporate probabilistic reasoning and adaptive strategies rather than deterministic rules. Recent approaches formulate target selection as a multi-criteria optimization problem considering accessibility, ripeness, detachment difficulty, and energy cost [121,122]. A critical limitation in current systems is the weak integration between decision-making and mechanical feasibility: selected targets may require excessive force or complex repositioning, which reduces efficiency. More advanced formulations therefore incorporate detachment force estimation into the decision process. To satisfy real-time constraints, hierarchical architectures are often used, where lightweight models filter candidates and more complex models refine final selections [123]. However, perception uncertainty remains a challenge, necessitating uncertainty-aware evaluation strategies [117]. Additionally, multi-target prioritization introduces combinatorial complexity, requiring dynamic task allocation strategies that balance efficiency and feasibility.
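As an illustration of multi-criteria target selection with a hierarchical structure, the sketch below first applies a cheap feasibility filter and then ranks the surviving candidates with a weighted score. The `Candidate` fields, the weights, the 15 N force limit, and the confidence discount are assumptions introduced for this example; they are not values reported in the cited works.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-fruit attributes produced by perception."""
    fruit_id: str
    accessibility: float   # 0..1, higher means easier to reach
    ripeness: float        # 0..1
    detach_force_n: float  # estimated detachment force in newtons
    energy_cost: float     # 0..1, normalised motion energy
    confidence: float      # detection confidence in [0, 1]


# Assumed weights; in practice these would be tuned per crop and platform.
WEIGHTS = {"accessibility": 0.4, "ripeness": 0.3, "difficulty": 0.2, "energy": 0.1}
MAX_FORCE_N = 15.0  # assumed end-effector force limit


def coarse_filter(candidates, min_conf=0.5, max_force=MAX_FORCE_N):
    """Lightweight first stage: drop uncertain or mechanically infeasible targets."""
    return [c for c in candidates
            if c.confidence >= min_conf and c.detach_force_n <= max_force]


def score(c, weights=WEIGHTS, max_force=MAX_FORCE_N):
    """Second stage: weighted benefit minus cost, discounted by confidence."""
    difficulty = c.detach_force_n / max_force
    base = (weights["accessibility"] * c.accessibility
            + weights["ripeness"] * c.ripeness
            - weights["difficulty"] * difficulty
            - weights["energy"] * c.energy_cost)
    return c.confidence * base


def select_target(candidates):
    """Return the best feasible candidate, or None if nothing passes the filter."""
    feasible = coarse_filter(candidates)
    return max(feasible, key=score) if feasible else None
```

Including the estimated detachment force in both the filter and the score is one simple way to connect target selection with mechanical feasibility, as the text recommends.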

10.3. Autonomous Navigation in Orchards

Autonomous navigation is a perception-driven process that enables robots to position themselves effectively for harvesting tasks. As shown in Figure 9, navigation is tightly coupled with manipulation and directly influences harvesting efficiency. Orchard environments are inherently unstructured, characterized by irregular spacing, dense canopies, and uneven terrain. To address this, modern systems employ multi-modal sensing, including RGB-D cameras, LiDAR, and inertial sensors [124,125]. Unlike traditional robotics, navigation in harvesting systems is task-oriented, as positioning directly affects reachability and cycle time [126]. Learning-based approaches improve adaptability but often struggle with generalization across different orchard conditions [127].

Latency and synchronization between mapping, planning, and execution remain key challenges, particularly in dense environments. Energy-aware navigation is also essential to balance travel efficiency with harvesting productivity.

10.4. Multi-Robot Coordination

Multi-robot coordination enhances harvesting efficiency by enabling distributed, parallel operation across large orchard areas, which improves overall throughput. As illustrated in Figure 9, coordination involves shared perception, task allocation, and spatial conflict management. Task allocation under uncertainty is a major challenge due to variability in fruit distribution and accessibility and to heterogeneous robot capabilities. Dynamic strategies, including auction-based and learning-based methods, have shown improved performance over static approaches [127]. Communication plays a critical role but introduces latency, bandwidth, and synchronization constraints, which can degrade coordination quality in dynamic environments. Decentralized approaches mitigate this by allowing local autonomy with periodic synchronization. Spatial coordination requires predictive motion planning to avoid collisions, while workload balancing ensures efficient utilization of robots with different capabilities. However, scalability remains limited by increasing system complexity. Addressing these challenges requires decentralized coordination mechanisms, latency-aware communication protocols, and adaptive task allocation strategies that improve scalability and robustness.
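The auction-based allocation mentioned above can be illustrated with a greedy sequential auction. This is a simplified sketch, not the scheme used in the cited work: it assumes planar positions, straight-line travel, and a bid equal to each robot's accumulated travel plus the distance to the task, and it runs centrally rather than over a real communication channel.

```python
import math


def auction_allocate(robots, tasks):
    """Greedy sequential auction.

    robots: {robot_name: (x, y)} start positions (assumed planar, metres)
    tasks:  {task_id: (x, y)} harvesting zones or fruit clusters
    Returns {robot_name: [task_id, ...]} in visiting order.
    """
    position = dict(robots)
    load = {name: 0.0 for name in robots}        # accumulated travel per robot
    assignment = {name: [] for name in robots}
    remaining = dict(tasks)

    while remaining:
        # Every robot bids on every open task; the lowest bid overall wins.
        # Ties are broken by robot name, then task id (tuple ordering).
        bid, winner, task = min(
            (load[name] + math.dist(position[name], target), name, task_id)
            for name in position
            for task_id, target in remaining.items()
        )
        assignment[winner].append(task)
        load[winner] = bid                        # bid already includes prior load
        position[winner] = remaining.pop(task)    # robot moves to the task site

    return assignment
```

Because each bid includes the robot's accumulated travel, heavily loaded robots become less competitive, which gives a crude form of workload balancing. A decentralized variant would have each robot compute its own bids locally and exchange only the bid values.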

10.5. Control Strategies and System Robustness

Control strategies ensure system stability and reliability by integrating perception, decision-making, and execution within a unified framework. As illustrated in Figure 9, control forms the final layer that governs system response under dynamic conditions. Robotic harvesting involves nonlinear and uncertain dynamics due to environmental variability and sensor noise. Therefore, closed-loop control approaches are essential for maintaining stable performance [117,128]. The integration of visual servoing and force feedback enables precise and safe interaction during harvesting, particularly in detachment stages. However, system performance is affected by latency and model uncertainty. Robustness is enhanced through failure detection and recovery mechanisms, allowing systems to adjust actions dynamically [127]. Predictive control methods, such as Model Predictive Control, help mitigate delays but require accurate modeling. Learning-based control strategies improve adaptability but still face challenges in generalization and reliability under field conditions.

Thus, achieving robust autonomous harvesting requires tight integration between sensing, control, and mechanical interaction within a unified architecture.

11. System Performance, Economic Feasibility, and Scalability

The transition from manual to robotic fruit harvesting represents a major shift in autonomous agriculture, where adoption depends on system performance, economic feasibility, and scalability under real orchard conditions. As illustrated in Figure 10, these three dimensions form a hierarchical and interdependent framework in which system performance directly influences economic viability, which in turn determines large-scale deployment potential. This relationship is inherently bidirectional, as field-level constraints and operational conditions continuously shape system performance and design requirements. Performance evaluation in robotic harvesting is inherently multi-dimensional, integrating perception accuracy, mechanical interaction, computational efficiency, and environmental variability. Orchard environments introduce significant uncertainty due to illumination changes, occlusions, irregular canopy structures, and heterogeneous fruit distribution. Consequently, standardized metrics such as harvesting success rate, cycle time per fruit, fruit damage rate, and energy consumption are required. However, the success rate alone is insufficient and must be interpreted alongside fruit quality preservation and operational efficiency to reflect real-world feasibility [117,129]. From a system perspective, performance is strongly influenced by the coupling between perception and physical interaction. Errors in localization or segmentation propagate through manipulation and detachment, reducing success rates and increasing damage probability. Therefore, recent research emphasizes system-level metrics such as perception-to-action latency, end-effector efficiency, and failure recovery rate, rather than isolated subsystem indicators [123].

Figure 10. Integrated framework for performance evaluation, economic assessment, and scalability analysis of robotic fruit harvesting systems, illustrating the hierarchical relationship among system performance metrics, economic feasibility indicators, and deployment scalability factors, with bidirectional feedback between field constraints and overall system performance.

Economic feasibility remains a decisive factor for adoption. Robotic systems require significant capital investment in sensors, manipulators, and computational infrastructure, which must be balanced against long-term benefits such as reduced labor dependency and improved consistency. These systems become more competitive in labor-intensive crops with workforce shortages [58]. Key indicators such as return on investment (ROI), system throughput, and maintenance costs are directly influenced by technical performance and operational reliability, particularly in large-scale and structured orchards [130]. Scalability represents the final evaluation layer, where performance and economic factors converge into real-world applicability. Field deployment introduces additional challenges, including communication latency, environmental variability, and multi-robot coordination. Advances in modular architecture and cloud-based frameworks provide promising pathways for scalable deployment [117,119]. Overall, performance, economics, and scalability must be treated as an integrated framework, as illustrated in Figure 10, to enable the transition from experimental systems to commercial solutions.

11.1. Evaluation Metrics

Within the framework illustrated in Figure 10, evaluation metrics form the foundation for quantifying system performance and supporting both economic and scalability analyses. Due to the complexity of orchard environments, performance assessment requires a combination of quantitative and qualitative indicators that reflect efficiency, robustness, and fruit quality. The harvesting success rate remains a primary metric, measuring the proportion of fruits successfully detected, grasped, detached, and collected. However, it does not fully represent system capability unless combined with other indicators such as damage rate and cycle time. Overreliance on success rate may lead to overestimation of performance, particularly under occlusion and fruit clustering conditions [131,132]. Picking time per fruit is a key indicator of temporal efficiency, incorporating perception, planning, execution, and detachment stages. As shown in Figure 10, it directly influences system throughput and economic viability. While reducing cycle time is essential, it must be balanced against control precision to avoid increased damage or instability [130]. Fruit damage rate is a critical quality metric reflecting the percentage of mechanically affected fruits. It is closely associated with grasping force, detachment strategy, and collision avoidance, and is particularly important for high-value crops where minor damage significantly reduces market value [129]. Energy consumption is increasingly important in autonomous systems, encompassing perception, actuation, and mobility. As indicated in Figure 10, energy efficiency directly affects scalability, especially in battery-operated systems. Energy-aware strategies have demonstrated improved operational duration without compromising performance [124]. Advanced evaluation frameworks also incorporate system-level indicators such as perception-to-action latency, failure recovery rate, and robustness under environmental variability. 
These metrics capture subsystem interactions and highlight the importance of integrated evaluation, where errors propagate across stages and affect overall system performance [58].
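To make the relationship between these metrics concrete, the following sketch computes them from a log of harvesting attempts. The `Attempt` record and the exact metric definitions are assumptions made for illustration (for example, damage rate is taken here relative to harvested fruits); published studies do not all use the same denominators.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """Hypothetical log entry for one harvesting attempt."""
    detached: bool
    collected: bool
    damaged: bool
    cycle_time_s: float  # perception + planning + execution + detachment
    energy_wh: float     # energy used during this attempt


def summarize(attempts):
    """Aggregate field-trial metrics from a list of attempts."""
    n = len(attempts)
    if n == 0:
        raise ValueError("no attempts recorded")

    harvested = [a for a in attempts if a.detached and a.collected]
    damaged = [a for a in harvested if a.damaged]
    total_time = sum(a.cycle_time_s for a in attempts)
    total_energy = sum(a.energy_wh for a in attempts)
    h = len(harvested)

    return {
        "success_rate": h / n,
        # Assumed definition: damaged fruits as a share of harvested fruits.
        "damage_rate": len(damaged) / h if h else 0.0,
        # Success rate counting only undamaged fruit.
        "undamaged_success_rate": (h - len(damaged)) / n,
        "mean_cycle_time_s": total_time / n,
        # Failed attempts still consume time and energy, so they are included.
        "energy_per_harvested_fruit_wh": total_energy / h if h else float("inf"),
        "throughput_fruits_per_h": 3600.0 * h / total_time if total_time > 0 else 0.0,
    }
```

Reporting the undamaged success rate alongside the raw success rate reflects the point made above: a high success rate can overstate performance when a meaningful share of the harvested fruit is damaged.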

11.2. Economic Comparison

As illustrated in Figure 10, economic assessment translates technical performance into practical feasibility and adoption potential. Manual harvesting is characterized by low capital investment and high adaptability but is increasingly constrained by rising labor costs and workforce shortages [133,134]. Robotic harvesting systems, in contrast, involve high capital expenditure (CAPEX) for equipment and infrastructure, along with operational expenditure (OPEX) related to maintenance, energy, and system updates. Economic efficiency improves with increased utilization and scale, as fixed costs are distributed over larger production volumes. From a cost–benefit perspective, this high initial investment can be offset by long-term economic advantages: reduced labor requirements, improved harvesting efficiency, enhanced product quality, and minimized post-harvest losses. These benefits are particularly significant in large-scale orchards, where automation improves operational efficiency and mitigates labor shortages. A key advantage of robotic systems is continuous operation without fatigue, which can increase seasonal productivity. However, this benefit depends on performance metrics such as harvesting speed, success rate, and damage rate, which directly influence marketable yield [135,136]. The economic feasibility of robotic harvesting is strongly influenced by orchard structure, crop characteristics, and system operational efficiency. High-density and standardized orchards generally provide more favorable conditions for robotic deployment and process optimization. In addition to labor reduction, robotic and autonomous systems can improve resource-use efficiency, reduce operational waste, and enhance process precision, contributing to more sustainable agricultural production. 
Technological advancements continue to improve system feasibility, while hybrid human–robot solutions have emerged as practical transitional approaches that enhance operational efficiency and reduce adoption risks [137,138]. Overall, economic viability remains closely linked to performance and deployment scale, as reflected in Figure 10. The economic return on robotic harvesting systems is highly dependent on several interacting operational and environmental variables. Return on investment (ROI) is generally more favorable in high-value and labor-intensive crops, particularly in regions experiencing labor shortages or increasing labor costs. Orchard structure also plays a significant role, as standardized planting systems and high-density orchards improve robot accessibility, navigation efficiency, and harvesting throughput. In contrast, highly heterogeneous orchards with irregular canopy structures may reduce operational efficiency and increase cycle time. In addition, equipment utilization rate strongly influences economic feasibility, since low seasonal utilization may prolong the payback period of robotic systems. Maintenance requirements, energy consumption, operator training, and infrastructure adaptation further contribute to total operational cost. Therefore, the economic performance of robotic harvesting systems should be evaluated through integrated scenario-based analyses that consider crop characteristics, regional labor conditions, orchard design, and annual system utilization to better estimate long-term feasibility and commercial viability [130,137,138].
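The dependence of payback period on seasonal utilization can be shown with a simple, undiscounted calculation. The sketch below is a deliberately reduced model under stated assumptions: it values each robot-harvested fruit at the avoided manual picking cost, treats OPEX as a flat hourly rate, and ignores discounting, depreciation, and quality effects. All parameter names and any figures passed to it are hypothetical.

```python
import math


def annual_net_benefit(hours_per_season, fruits_per_hour,
                       manual_cost_per_fruit, opex_per_hour):
    """Avoided manual picking cost minus operating cost for one season."""
    harvested = hours_per_season * fruits_per_hour
    return harvested * manual_cost_per_fruit - hours_per_season * opex_per_hour


def payback_years(capex, net_benefit_per_year):
    """Simple (undiscounted) payback period; infinite if the system never pays back."""
    if net_benefit_per_year <= 0:
        return math.inf
    return capex / net_benefit_per_year


def simple_roi(capex, net_benefit_per_year, years):
    """Undiscounted return on investment over a fixed horizon."""
    return (net_benefit_per_year * years - capex) / capex


def utilization_scenarios(capex, fruits_per_hour, manual_cost_per_fruit,
                          opex_per_hour, hours_options):
    """Payback period for several assumed seasonal utilization levels."""
    return {
        hours: payback_years(
            capex,
            annual_net_benefit(hours, fruits_per_hour,
                               manual_cost_per_fruit, opex_per_hour),
        )
        for hours in hours_options
    }
```

In this model, doubling the annual operating hours halves the payback period, which is the utilization effect described in the text. A scenario-based analysis would vary crop value, labor cost, and throughput in the same way.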

11.3. Scalability and Practical Deployment

As shown in Figure 10, scalability and deployment represent the final stage in which system performance and economic feasibility are validated under real-world conditions. Scalability requires maintaining consistent performance across large and heterogeneous orchard environments, where variability in canopy structure, illumination, and terrain introduces significant challenges. System architecture plays a central role in scalability. While centralized systems are limited by sequential operation, distributed multi-robot systems enable parallel harvesting and improved efficiency. However, they introduce challenges related to coordination, communication latency, and task allocation [139]. Environmental robustness is critical for deployment, as weather conditions, dust, and lighting variations can degrade sensor performance and system reliability. To address this, modern systems increasingly rely on sensor fusion, adaptive perception, and robust control strategies [87]. Operational considerations such as energy management, maintenance scheduling, and data processing are also essential. Autonomous systems must operate for extended periods with minimal human intervention, requiring efficient power usage and modular system design. Integration with existing agricultural systems remains a major challenge, leading to the development of “robot-ready orchards” where tree architecture and planning strategies are optimized for automation. Despite significant technological progress in robotic harvesting systems, the transition from experimental prototypes to fully commercial solutions remains a major barrier. Most existing systems still operate at research or pilot scale, with limited deployment in real-world orchards. This gap is mainly attributed to high system cost, limited long-term reliability, maintenance complexity, and insufficient robustness under highly variable field conditions. 
In addition, challenges related to system calibration, operator training, and integration with existing agricultural workflows further restrict large-scale adoption. Therefore, future development should focus on improving system robustness, reducing operational cost, and enhancing scalability through modular design and field-validated architectures to enable practical commercialization. Social and cultural factors may influence the adoption of robotic harvesting systems, particularly in farming communities where traditional agricultural practices and long-standing field experience play a key role in technology acceptance and operational decision-making. Advances in cloud robotics, edge computing, and digital agriculture further support scalable deployment through distributed processing and real-time coordination. Thus, scalability represents the ultimate validation stage for robotic harvesting systems, requiring coordinated advancements across robotics, agriculture, and systems engineering, as illustrated in Figure 10.

12. Emerging Technologies in Robotic Harvesting

Recent advances in robotic fruit harvesting systems are increasingly driven by emerging technologies that reshape the interaction between perception, decision-making, and physical execution within integrated autonomous systems. Rather than focusing solely on incremental improvements in individual modules, current research emphasizes system-level intelligence, adaptability, and scalability, which are essential for deployment in complex orchard environments [119]. These technologies address critical limitations identified in earlier sections, including perception uncertainty, weak perception–action coupling, limited adaptability, and high operational latency. In this context, advancements in artificial intelligence, sensing technologies, and distributed computing enable a transition toward adaptive and context-aware robotic operation [55,140]. As illustrated in Figure 11, emerging technologies function as cross-cutting enablers that span multiple layers of the harvesting pipeline, integrating sensing, planning, and actuation through data-driven models and real-time feedback loops. Rather than being independent directions, these four technological pillars are strongly interdependent within a unified system architecture. Artificial intelligence acts as the computational core that enables perception interpretation, decision-making, and learning-based adaptation. Advanced perception provides the multi-modal sensory foundation that feeds AI models with structured environmental representations, thereby directly influencing detection accuracy and contextual understanding. Digital agriculture extends this perception–decision loop to a system-level intelligence layer by incorporating IoT data, cloud/edge computing, and predictive analytics, which enable optimization over time and space. 
Energy-efficient design and distributed computing, in turn, constrain and enable all other components by determining the feasibility of real-time processing, onboard deployment, and large-scale scalability. Consequently, these technologies form a mutually reinforcing ecosystem in which perception generates data, AI transforms it into decisions, digital agriculture expands its scope, and energy-aware design ensures sustainable and scalable execution in real-world orchard environments. This integrated perspective reduces error propagation across subsystems and enhances overall system robustness [141]. Artificial intelligence and deep learning techniques have significantly improved perception and decision-making under variable agricultural conditions, particularly in addressing occlusion, illumination variability, and fruit detection challenges through advanced CNN- and YOLO-based architectures [32]. Concurrently, advances in multi-modal sensing and sensor fusion enhance operational reliability under challenging field conditions [142]. At the system level, digital agriculture introduces data-driven capabilities through IoT, cloud computing, and large-scale analytics, enabling predictive and optimized harvesting strategies [143]. In parallel, energy-aware design and distributed computing frameworks, including edge and cloud robotics, support sustainable and scalable deployment [119]. Overall, emerging technologies collectively redefine the level of integration, autonomy, and scalability in robotic fruit harvesting systems, forming a technological backbone that supports the transition toward intelligent and field-ready solutions, as depicted in Figure 11.

12.1. Artificial Intelligence and Learning Systems

As depicted in Figure 11, artificial intelligence serves as a central layer linking perception, decision-making, and control, enabling tighter perception–action coupling in robotic harvesting systems. Unlike traditional rule-based approaches, AI-driven methods leverage data-driven models capable of handling variability in illumination, occlusion, and canopy structure. Deep learning techniques, particularly CNN-based architectures, have significantly improved image analysis, fruit detection, and segmentation performance under complex and unstructured agricultural environments [55]. Widely adopted architectures such as Faster R-CNN, YOLO, and Mask R-CNN provide a balance between accuracy and real-time performance, while transformer-based models further enhance contextual understanding in complex scenes [126,140]. Beyond perception, AI contributes to decision-making through reinforcement learning (RL) and learning-based control strategies, enabling adaptive optimization of grasping and detachment actions based on environmental feedback [144,145]. Imitation learning further allows robots to acquire harvesting strategies from human demonstrations, reducing training complexity and incorporating expert knowledge [146]. A major limitation of deep learning applications in agriculture is the limited availability of large annotated datasets. Transfer learning and advanced deep learning strategies have therefore been explored to improve model generalization, reduce training requirements, and enhance performance across different agricultural environments and conditions [55]. In addition, multimodal learning integrates RGB, depth, and spectral data to enhance perception robustness [142]. Edge AI enables real-time inference directly on robotic platforms, reducing latency and improving responsiveness, which is critical for efficient perception–action coupling [126]. 
However, challenges remain in handling extreme environmental variability and ensuring reliable integration between perception outputs and physical interaction systems. Overall, AI-driven methodologies are transforming robotic harvesting into adaptive and intelligent systems, as illustrated in Figure 11, playing a key role in advancing autonomy and scalability.

12.2. Advanced Perception Technologies

As illustrated in Figure 11, advanced perception technologies form a critical layer that directly influences manipulation accuracy and harvesting success. Conventional RGB-based systems, although cost-effective, are highly sensitive to illumination changes, occlusion, and low contrast, limiting their reliability in real orchard conditions [142].

Depth-sensing technologies, including RGB-D cameras and stereo vision, provide geometric information for accurate three-dimensional fruit localization, improving grasp planning and collision avoidance. However, their performance may degrade under strong outdoor lighting [147]. LiDAR offers high-resolution 3D mapping with strong robustness to lighting variations, enabling accurate reconstruction of orchard structures for navigation and localization. Despite these advantages, its cost and limited texture information remain challenges [147]. Hyperspectral and multispectral imaging extend perception beyond the visible spectrum, allowing discrimination between fruits and foliage based on material properties and enabling assessment of fruit maturity and health [148]. Thermal imaging provides additional detection capabilities under specific environmental conditions, though its reliability depends on temperature contrast [149]. To address individual sensor limitations, sensor fusion combines multiple modalities to enhance robustness and reduce uncertainty. For example, integrating RGB and LiDAR data improves both geometric and texture-based detection [140]. As emphasized in Figure 11, effective perception requires not only accurate sensing but also efficient data processing. High data volumes necessitate real-time processing through edge computing and hardware acceleration. Challenges such as sensor calibration, synchronization, and cost-performance trade-offs remain critical considerations. Overall, advanced perception technologies significantly improve robustness and adaptability, enabling reliable operation in complex environments and supporting scalable robotic harvesting systems.
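One elementary form of sensor fusion is inverse-variance weighting of independent range estimates. The sketch below is an illustration only, assuming each sensor reports a distance to the same fruit with a known error variance and that the errors are independent and roughly Gaussian; it is not the fusion scheme of any cited system, and real pipelines also need calibration and time synchronization.

```python
def fuse_estimates(estimates):
    """Inverse-variance weighted fusion.

    estimates: list of (value, variance) pairs from different sensors,
               all measuring the same quantity (e.g. fruit distance in metres).
    Returns (fused_value, fused_variance).
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    if any(variance <= 0 for _, variance in estimates):
        raise ValueError("variances must be positive")

    # More certain sensors (smaller variance) receive larger weights.
    weights = [1.0 / variance for _, variance in estimates]
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, estimates)) / total_weight

    # The fused variance is never larger than the smallest input variance.
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance
```

For example, a stereo estimate with a large variance and a LiDAR estimate with a small variance yield a fused distance close to the LiDAR reading, with lower uncertainty than either sensor alone. This is the sense in which fusion is said above to reduce uncertainty.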

12.3. Digital Agriculture and Smart Orchards

Within the framework illustrated in Figure 11, digital agriculture introduces a higher-level intelligence layer that transforms robotic harvesting from reactive operation into predictive and data-driven decision-making. Traditional systems rely on real-time perception, whereas digital agriculture incorporates pre-harvest knowledge and environmental data to optimize system performance [143,150]. IoT-based sensor networks enable continuous monitoring of environmental and crop conditions, providing data on temperature, humidity, soil status, and plant health. These data support estimation of fruit maturity, yield prediction, and spatial distribution analysis, reducing uncertainty during harvesting operations [151]. Pre-harvest mapping plays a key role by providing spatial information on fruit location and density, which can be integrated into robotic planning algorithms to optimize navigation and reduce redundant operations. This reduces reliance on exhaustive visual search and improves efficiency.

Digital twin technology further enhances system optimization by enabling virtual modeling of orchard environments. These models integrate real-time and historical data to simulate harvesting strategies, evaluate performance, and optimize operational parameters prior to deployment [152]. Cloud computing supports large-scale data processing, coordination among multiple robots, and centralized model updates, although it introduces challenges related to latency and connectivity [153]. Edge computing addresses these issues by enabling local processing and real-time decision-making, while edge–cloud integration provides a balanced architecture for both responsiveness and scalability.

Data-driven decision support systems further enhance operational efficiency by providing recommendations on harvesting schedules, yield prediction, and risk management [143]. Despite these advantages, challenges related to data standardization, interoperability, and infrastructure limitations remain. Nevertheless, digital agriculture significantly enhances the perception–action loop by providing contextual information, as illustrated in Figure 11, enabling more efficient and scalable harvesting systems.

12.4. Energy Efficiency and Sustainability

As shown in Figure 11, energy efficiency acts as a cross-cutting constraint that directly influences system performance, economic feasibility, and scalability. Field-deployed robotic systems typically operate under energy limitations, with consumption distributed across perception, actuation, and mobility subsystems [119]. Energy constraints introduce trade-offs between speed, accuracy, and operational duration. High-speed operation increases throughput but also power consumption, while energy-efficient strategies may reduce productivity. This necessitates energy-aware planning and control approaches, such as trajectory optimization and task scheduling, to minimize energy usage while maintaining performance [78]. In large-scale deployments, energy management becomes more complex, particularly in multi-robot systems where coordination and task allocation must be optimized to reduce redundant operations. As indicated in Figure 11, scalability is closely linked to efficient energy utilization. The interaction between system architecture and energy consumption is also significant. Edge computing reduces latency but increases onboard energy demand, whereas cloud-based processing reduces local computation but introduces communication overhead [153]. Balancing this trade-off is essential for achieving efficient operation. Sustainability considerations further drive the integration of renewable energy sources, such as solar-assisted charging, and the development of lightweight robotic designs and energy-efficient actuators. Despite these advances, optimizing energy efficiency remains challenging due to the coupling between perception, computation, and physical interaction. High-performance AI models, for example, improve accuracy but increase energy demand, highlighting the need for integrated system-level optimization. In summary, energy efficiency is a key enabler of sustainable and scalable robotic harvesting systems. 
As illustrated in Figure 11, it connects performance, economics, and deployment, requiring coordinated design across hardware, algorithms, and system architecture.
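The trade-off between onboard (edge) and offloaded (cloud) processing can be explored with a simple energy budget. The sketch below assumes constant average power draw per subsystem and a fixed usable battery fraction; the subsystem names and any wattages supplied to it are hypothetical, and the model ignores duty cycling and battery aging.

```python
def runtime_hours(battery_wh, loads_w, usable_fraction=0.8):
    """Operating time on one charge for a set of average subsystem loads.

    battery_wh:      nominal battery capacity in watt-hours
    loads_w:         {subsystem_name: average power in watts}
    usable_fraction: assumed share of capacity that can be used safely
    """
    total_power = sum(loads_w.values())
    if total_power <= 0:
        raise ValueError("total power draw must be positive")
    return battery_wh * usable_fraction / total_power


def fruits_per_charge(battery_wh, loads_w, cycle_time_s, usable_fraction=0.8):
    """Harvesting attempts possible on one charge at a given cycle time."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    hours = runtime_hours(battery_wh, loads_w, usable_fraction)
    return hours * 3600.0 / cycle_time_s


def compare_architectures(battery_wh, edge_loads_w, edge_cycle_s,
                          cloud_loads_w, cloud_cycle_s):
    """Runtime and output per charge for two assumed configurations."""
    return {
        "edge": (runtime_hours(battery_wh, edge_loads_w),
                 fruits_per_charge(battery_wh, edge_loads_w, edge_cycle_s)),
        "cloud": (runtime_hours(battery_wh, cloud_loads_w),
                  fruits_per_charge(battery_wh, cloud_loads_w, cloud_cycle_s)),
    }
```

With assumed numbers, a cloud configuration may run longer per charge because onboard compute power is lower, yet harvest fewer fruits if communication latency lengthens the cycle time. This is why the text treats energy, latency, and throughput as a joint optimization problem.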

13. Research Gaps and Future Directions

The rapid advancement of robotic fruit harvesting systems has led to substantial progress in perception, manipulation, and system integration. Nevertheless, several fundamental challenges continue to limit large-scale deployment and practical adoption in real agricultural environments. Addressing these limitations requires a shift from isolated component-level improvements toward holistic system-level innovations that enhance robustness, adaptability, and economic viability [119]. A primary research gap lies in the limited integration between perception, manipulation, and detachment processes. Current systems often treat these components as loosely connected modules, leading to error propagation and reduced operational efficiency. Future research should prioritize closed-loop perception–action frameworks, where continuous sensory feedback dynamically informs manipulation and detachment strategies. Such tightly coupled architectures are essential for improving system reliability under dynamic and uncertain orchard conditions [141,145]. Another important direction concerns the development of intelligent and adaptive end-effectors capable of handling variability in fruit size, shape, and attachment characteristics. Existing designs are often crop-specific and lack the flexibility required for diverse agricultural applications. Advances in soft robotics, tactile sensing, and adaptive control offer promising pathways for safer and more efficient interaction with delicate fruits of diverse shapes, sizes, and mechanical properties, thereby enhancing harvesting adaptability and reducing mechanical damage during handling. Embedding sensing capabilities within end-effectors can further enhance grasp stability and manipulation precision through real-time force feedback, contact sensing, and adaptive grip modulation [60,103]. 
Accurate modeling of fruit detachment mechanics remains an open challenge. The detachment process is influenced by multiple interacting factors, including peduncle strength, fruit maturity, and environmental conditions, which complicates the development of generalized strategies. Future efforts should focus on data-driven and physics-informed models capable of predicting detachment forces and optimizing harvesting actions. Such models are critical for improving harvesting success rates while minimizing fruit damage under variable field conditions [154].

Improving system performance under unstructured and highly variable field conditions is another major research priority. Agricultural environments present significant variability in illumination, occlusion, canopy structure, and terrain. A closely related challenge is the limited generalization capability of deep learning models across crops, seasons, and environmental conditions. Although these models perform well on controlled datasets, their performance often degrades in new orchard environments due to domain shift and limited annotated data. To address this, future research should focus on domain adaptation, transfer learning, and self-supervised learning approaches. In addition, synthetic data generation and digital twin environments can enhance robustness, while unified benchmark datasets are essential for improving cross-domain generalization. More broadly, future work should emphasize robust perception and adaptive control strategies, including multimodal sensing, that generalize across crops, seasons, and environmental scenarios [140].

Finally, bridging the gap between experimental prototypes and commercially viable systems remains a critical challenge. While many robotic harvesting solutions demonstrate promising results under controlled conditions, issues related to scalability, cost-effectiveness, and long-term reliability continue to hinder real-world adoption. Addressing these challenges requires interdisciplinary approaches that integrate engineering design, economic analysis, and system optimization. Standardization, extensive field validation, and user-centered design will be essential to support large-scale deployment in the agricultural sector [143].
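Returning to the data-driven and physics-informed detachment models called for earlier in this section, a deliberately simplified Python sketch may help make the idea concrete. It assumes, purely for illustration, that detachment occurs by tensile failure of a circular peduncle, so that the detachment force is F = σ·A with A = πd²/4, and it estimates the effective strength σ from measured force and diameter pairs by least squares. This single-parameter form and the function names are assumptions of the sketch; real detachment also depends on maturity, pulling angle, and environmental conditions, as noted above.

```python
import math


def cross_section_area(diameter_mm: float) -> float:
    """Area of a circular peduncle cross-section in mm^2 (assumed geometry)."""
    return math.pi * diameter_mm ** 2 / 4.0


def fit_strength(diameters_mm, forces_n):
    """Least-squares estimate of sigma in F = sigma * A (line through the origin).

    diameters_mm and forces_n are paired measurements; the result is in
    N/mm^2. The tensile-failure model itself is an illustrative assumption.
    """
    areas = [cross_section_area(d) for d in diameters_mm]
    numerator = sum(a * f for a, f in zip(areas, forces_n))
    denominator = sum(a * a for a in areas)
    return numerator / denominator


def predict_force(sigma: float, diameter_mm: float) -> float:
    """Predicted detachment force in newtons for a given peduncle diameter."""
    return sigma * cross_section_area(diameter_mm)
```

In use, `fit_strength` would be called once on field measurements for a cultivar and maturity stage, and `predict_force` would then supply a per-fruit force estimate to the detachment planner. Richer models would add terms for maturity or pulling angle, but the structure stays the same: a physical form fixes the shape of the relationship and the data fix its parameters.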

Overall, advancing robotic fruit harvesting systems requires a unified system-oriented perspective that integrates sensing, intelligence, and physical interaction within a cohesive operational framework. Progress along these directions will be essential to achieve robust, scalable, and economically sustainable autonomous harvesting solutions.

14. Conclusions

Robotic fruit harvesting systems have evolved from isolated functional modules into tightly coupled cyber–physical systems in which perception, decision-making, and actuation are dynamically interconnected. From a system theory perspective, overall performance emerges from the interactions among subsystems rather than the optimization of individual components, making system-level coupling a key determinant of efficiency, robustness, and reliability. This review highlights that emerging technologies such as artificial intelligence, advanced sensing, digital agriculture, and energy-aware computing act as enabling layers that enhance system observability, controllability, and scalability within complex orchard environments. Despite significant progress, fundamental limitations remain in the form of weak perception–action coupling, uncertainty propagation, and incomplete integration between mechanical interaction and high-level decision models. These challenges indicate that future advancements must move toward fully integrated system architectures characterized by closed-loop learning, distributed intelligence, and real-time adaptive control. Ultimately, robotic harvesting should be understood not as a set of independent algorithms, but as an emergent system whose performance is governed by the coherence of sensing, computation, and physical interaction under real-world constraints.

Author Contributions

Conceptualization, M.G.; methodology, M.G. and N.F.A.-B.; investigation, M.G.; data curation, M.G. and N.F.A.-B.; visualization, M.G. and N.F.A.-B.; writing—original draft preparation, M.G.; review and editing, M.G. and N.F.A.-B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to thank the Deanship of Graduate Studies and Scientific Research at Qassim University for financial support (QU-APC-2026). During the preparation of this manuscript, no artificial intelligence tools were used in the writing, analysis, interpretation, or preparation of the manuscript content. Any preliminary AI-generated graphical elements considered during the early design stage were removed during revision and are not included in the final manuscript. The authors take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Zhou, H.; Wang, X.; Au, W.; Kang, H.; Chen, C. Intelligent robots for fruit harvesting: Recent developments and future challenges. Precis. Agric. 2022, 23, 1856–1907. [Google Scholar] [CrossRef]
Liu, S.; Xue, J.; Zhang, T.; Lv, P.; Qin, H.; Zhao, T. Research progress and prospect of key technologies of fruit target recognition for robotic fruit picking. Front. Plant Sci. 2024, 15, 1423338. [Google Scholar] [CrossRef] [PubMed]
Hua, W.; Zhang, Z.; Zhang, W.; Liu, X.; Hu, C.; He, Y.; Mhamed, M.; Li, X.; Dong, H.; Abdelhamid, M.A.; et al. Key technologies in apple harvesting robot for standardized orchards: A comprehensive review of innovations, challenges, and future directions. Comput. Electron. Agric. 2025, 235, 110343. [Google Scholar] [CrossRef]
Yang, Y.; Han, Y.; Li, S.; Yang, Y.; Zhang, M.; Li, H. Vision based fruit recognition and positioning technology for harvesting robots. Comput. Electron. Agric. 2023, 213, 108258. [Google Scholar] [CrossRef]
Droukas, L.; Doulgeri, Z.; Tsakiridis, N.L.; Triantafyllou, D.; Kleitsiotis, I.; Mariolis, I.; Giakoumis, D.; Tzovaras, D.; Bochtis, D.; et al. A survey of robotic harvesting systems and enabling technologies. J. Intell. Robot. Syst. 2023, 107, 21. [Google Scholar] [CrossRef]
Huang, Y.; Xu, S.; Chen, H.; Li, G.; Dong, H.; Yu, J.; Zhang, X.; Chen, R. A review of visual perception technology for intelligent fruit harvesting robots. Front. Plant Sci. 2025, 16, 1646871. [Google Scholar] [CrossRef]
Alaaudeen, K.M.; Selvarajan, S.; Manoharan, H.; Jhaveri, R.H. Intelligent robotics harvesting system process for fruits grasping prediction. Sci. Rep. 2024, 14, 2820. [Google Scholar] [CrossRef]
Zeeshan, S.; Aized, T.; Riaz, F. In-depth evaluation of automated fruit harvesting in unstructured environment for improved robot design. Machines 2024, 12, 151. [Google Scholar] [CrossRef]
Velasquez, A.; Grimm, C.; Davidson, J.R. Compact robotic gripper with tandem actuation for selective fruit harvesting. arXiv 2024, arXiv:2408.06674. [Google Scholar] [CrossRef]
Elfferich, J.F.; Shahabi, E.; Della Santina, C.; Dodou, D. BerryTwist: A Twisting-Tube Soft Robotic Gripper for Blackberry Harvesting. IEEE Robot. Autom. Lett. 2024, 10, 429–435. [Google Scholar] [CrossRef]
Vrochidou, E.; Tsakalidou, V.N.; Kalathas, I.; Gkrimpizis, T.; Pachidis, T.; Kaburlasos, V.G. An overview of end effectors in agricultural robotic harvesting systems. Agriculture 2022, 12, 1240. [Google Scholar] [CrossRef]
Lammers, K.; Zhang, K.; Zhu, K.; Chu, P.; Li, Z.; Lu, R. Development and evaluation of a dual-arm robotic apple harvesting system. Comput. Electron. Agric. 2024, 227, 109586. [Google Scholar] [CrossRef]
Hou, G.; Chen, H.; Jiang, M.; Niu, R. An overview of the application of machine vision in recognition and localization of fruit and vegetable harvesting robots. Agriculture 2023, 13, 1814. [Google Scholar] [CrossRef]
Tang, Y.; Chen, M.; Wang, C.; Luo, L.; Li, J.; Lian, G.; Zou, X. Recognition and localization methods for vision-based fruit picking robots: A review. Front. Plant Sci. 2020, 11, 510. [Google Scholar] [CrossRef] [PubMed]
Cheng, C.; Fu, J.; Su, H.; Ren, L. Recent advancements in agriculture robots: Benefits and challenges. Machines 2023, 11, 48. [Google Scholar] [CrossRef]
Ozcelik, T.I.; Masi, E.; Kargar, S.M.; Scagliarini, C.; Fatima, A.; Vertechy, R.; Berselli, G. Recent Developments in Electroadhesion Grippers for Automated Fruit Grasping. Machines 2025, 13, 1128. [Google Scholar] [CrossRef]
Yoshida, T.; Onishi, Y.; Kawahara, T.; Fukao, T. Automated harvesting by a dual-arm fruit harvesting robot. ROBOMECH J. 2022, 9, 19. [Google Scholar] [CrossRef]
Chen, Z.; Lei, X.; Yuan, Q.; Qi, Y.; Ma, Z.; Qian, S.; Lyu, X. Key technologies for autonomous fruit- and vegetable-picking robots: A review. Agronomy 2024, 14, 2233. [Google Scholar] [CrossRef]
Li, Z.; Singh, B. Robust Occluded Object Detection in Multimodal Autonomous Driving: A Fusion-Aware Learning Framework. Electronics 2026, 15, 245. [Google Scholar] [CrossRef]
Yang, Y.; Li, Y.; Cao, K.; Chen, X.; Jia, W. LEAF-Net: A Multi-Scale Frequency-Aware Framework for Automated Apple Blossom Monitoring in Complex Orchard. Horticulturae 2025, 11, 1382. [Google Scholar] [CrossRef]
Bhat, V.S.; Wang, Y. Revisiting the Control Systems of Autonomous Vehicles in the Agricultural Sector: A Systematic Literature Review. IEEE Access 2025, 13, 54686–54721. [Google Scholar] [CrossRef]
Wang, Z.; Sun, W.; Simionescu, P.A. Design and optimization of an astragalus slicing machine using computer simulations and design of experiments. Ind. Crops Prod. 2025, 236, 122037. [Google Scholar] [CrossRef]
Kim, T.; Lee, D.H.; Kim, K.C.; Kim, Y.J. 2D pose estimation of multiple tomato fruit-bearing systems for robotic harvesting. Comput. Electron. Agric. 2023, 211, 108004. [Google Scholar] [CrossRef]
Debnath, B.; Mghames, S.; Mandil, W.; Parsa, S.; Parsons, S.; Ghalamzan-E, A. Towards Autonomous Selective Harvesting: A Review of Robot Perception, Robot Design, Motion Planning and Control. arXiv 2023, arXiv:2304.09617. [Google Scholar] [CrossRef]
Singh, S.; Sood, V.; Srivastav, A.L.; Ampatzidis, Y. Hyperautomation in Precision Agriculture: Advancements and Opportunities for Sustainable Farming; Elsevier: Amsterdam, The Netherlands, 2024. [Google Scholar]
Zhang, R.; Bian, Z.; Wu, P.; Liu, Y.; Li, B.; Xiong, J.; Zhang, Y.; Zhu, B. Nondestructive prediction of fruit detachment force for investigating postharvest grape abscission. Postharvest Biol. Technol. 2024, 209, 112691. [Google Scholar] [CrossRef]
Zhang, J.; Kang, N.; Qu, Q.; Zhou, L.; Zhang, H. Automatic fruit picking technology: A comprehensive review of research advances. Artif. Intell. Rev. 2024, 57, 54. [Google Scholar] [CrossRef]
Zhang, Q.; Karkee, M.; Tabb, A. The use of agricultural robots in orchard management. In Robotics and Automation for Improving Agriculture; Burleigh Dodds Science Publishing: Sawston, UK, 2019; pp. 187–214. [Google Scholar]
Shi, J.; Bai, Y.; Diao, Z.; Zhou, J.; Yao, X.; Zhang, B. Row detection based navigation and guidance for agricultural robots and autonomous vehicles in row-crop fields: Methods and applications. Agronomy 2023, 13, 1780. [Google Scholar] [CrossRef]
Rahat, S.S.S.; Al Pitom, M.H.; Mahzabun, M.; Shamsuzzaman, M. Lemon Fruit Detection and Instance Segmentation in an Orchard Environment Using Mask R-CNN and YOLOv5. In Computer Vision and Image Analysis for Industry 4.0; Chapman and Hall/CRC: London, UK, 2023; pp. 28–40. [Google Scholar]
Dong, X.; Dong, J.; Li, Y.; Xu, H.; Tang, X. Maintaining the predictive abilities of egg freshness models on new variety based on VIS-NIR spectroscopy technique. Comput. Electron. Agric. 2019, 156, 669–676. [Google Scholar] [CrossRef]
Espinoza, S.; Aguilera, C.; Rojas, L.; Campos, P.G. Analysis of fruit images with deep learning: A systematic literature review and future directions. IEEE Access 2023, 12, 3837–3859. [Google Scholar] [CrossRef]
Tang, Y.; Qiu, J.; Zhang, Y.; Wu, D.; Cao, Y.; Zhao, K.; Zhu, L. Optimization strategies of fruit detection to overcome the challenge of unstructured background in field orchard environment: A review. Precis. Agric. 2023, 24, 1183–1219. [Google Scholar] [CrossRef]
Stein, M.; Bargoti, S.; Underwood, J. Image based mango fruit detection, localisation and yield estimation using multiple view geometry. Sensors 2016, 16, 1915. [Google Scholar] [CrossRef] [PubMed]
Cong, Y.; Chen, R.; Ma, B.; Liu, H.; Hou, D.; Yang, C. A comprehensive study of 3-D vision-based robot manipulation. IEEE Trans. Cybern. 2021, 53, 1682–1698. [Google Scholar] [CrossRef] [PubMed]
Magalhães, S.A.C. Harvesting with Active Perception for Open-Field Agricultural Robotics. Ph.D. Thesis, Universidade do Porto, Porto, Portugal, 2024. [Google Scholar]
Azizi, A.; Zhang, Z.; Hua, W.; Li, M.; Igathinathane, C.; Yang, L.; Ampatzidis, Y.; Ghasemi-Varnamkhasti, M.; Zhang, M.; Li, H.; et al. Image processing and artificial intelligence for apple detection and localization: A comprehensive review. Comput. Sci. Rev. 2024, 54, 100690. [Google Scholar] [CrossRef]
Ali, M.L.; Zhang, Z. The YOLO framework: A comprehensive review of evolution, applications, and benchmarks in object detection. Computers 2024, 13, 336. [Google Scholar] [CrossRef]
Sapkota, R.; Flores-Calero, M.; Qureshi, R.; Badgujar, C.; Nepal, U.; Poulose, A.; Zeno, P.; Vaddevolu, U.B.P.; Khan, S.; Karkee, M.; et al. YOLO advances to its genesis: A decadal and comprehensive review of the You Only Look Once (YOLO) series. Artif. Intell. Rev. 2025, 58, 274. [Google Scholar] [CrossRef]
Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar] [CrossRef]
Yang, M.; Fan, X. YOLOv8-Lite: A lightweight object detection model for real-time autonomous driving systems. ICCK Trans. Emerg. Top. Artif. Intell. 2024, 1, 1–16. [Google Scholar] [CrossRef]
Sapkota, R.; Roumeliotis, K.I.; Karkee, M.; Tselikas, N.D. Generalization vs. Specialization: Evaluating Segment Anything Model (SAM3) Zero-Shot Segmentation Against Fine-Tuned YOLO Detectors. arXiv 2025, arXiv:2512.11884. [Google Scholar] [CrossRef]
Wang, Y.; Deng, Y.; Zheng, Y.; Chattopadhyay, P.; Wang, L. Vision transformers for image classification: A comparative survey. Technologies 2025, 13, 32. [Google Scholar] [CrossRef]
Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In European Conference on Computer Vision 2020 August; Springer International Publishing: Cham, Switzerland, 2020; pp. 213–229. [Google Scholar] [CrossRef]
Brenner, M.; Reyes, N.H.; Susnjak, T.; Barczak, A.L. RGB-D and thermal sensor fusion: A systematic literature review. IEEE Access 2023, 11, 82410–82442. [Google Scholar] [CrossRef]
Ayanlade, T.T.; Jones, S.E.; Laan, L.V.D.; Chattopadhyay, S.; Elango, D.; Raigne, J.; Saxena, A.; Singh, A.; Ganapathysubramanian, B.; Sarkar, S.; et al. Multi-modal AI for ultra-precision agriculture. In Harnessing Data Science for Sustainable Agriculture and Natural Resource Management 2024; Springer Nature: Singapore, 2024; pp. 299–334. [Google Scholar] [CrossRef]
Javaid, S.; Fahim, H.; Zeadally, S.; He, B. Self-powered sensors: Applications, challenges, and solutions. IEEE Sens. J. 2023, 23, 20483–20509. [Google Scholar] [CrossRef]
Yu, T.; Hu, C.; Xie, Y.; Liu, J.; Li, P. Mature pomegranate fruit detection and location combining improved F-PointNet with 3D point cloud clustering in orchard. Comput. Electron. Agric. 2022, 200, 107233. [Google Scholar] [CrossRef]
Wang, Q.; Wang, T. Machine learning-driven sensor fusion for precision agriculture: Current applications and future perspectives. Sens. Rev. 2026, 1–26. [Google Scholar] [CrossRef]
Rahnemoonfar, M.; Sheppard, C. Deep count: Fruit counting based on deep simulated learning. Sensors 2017, 17, 905. [Google Scholar] [CrossRef]
Wang, Y.; Liu, D.; Li, Y.; Zhao, H.; Yang, C.; Zhang, Y. Effects of maturity of citrus fruits on their stalks cutting force. Int. J. Agric. Biol. Eng. 2022, 15, 23–30. [Google Scholar] [CrossRef]
Gené-Mola, J.; Gregorio, E.; Guevara, J.; Auat, F.; Sanz-Cortiella, R.; Escolà, A.; Llorens, J.; Morros, J.-R.; Ruiz-Hidalgo, J.; Rosell-Polo, J.R.; et al. Fruit detection in an apple orchard using a mobile terrestrial laser scanner. Biosyst. Eng. 2019, 187, 171–184. [Google Scholar] [CrossRef]
Verdouw, C.; Tekinerdogan, B.; Beulens, A.; Wolfert, S. Digital twins in smart farming. Agric. Syst. 2021, 189, 103046. [Google Scholar] [CrossRef]
Zhao, Y.; Yin, X.; Li, P.; Ren, Z.; Gu, Z.; Zhang, Y.; Song, Y. Multifunctional perovskite photodetectors: From molecular-scale crystal structure design to micro/nano-scale morphology manipulation. Nano-Micro Lett. 2023, 15, 187. [Google Scholar] [CrossRef]
Attri, I.; Awasthi, L.K.; Sharma, T.P.; Rathee, P. A review of deep learning techniques used in agriculture. Ecol. Inform. 2023, 77, 102217. [Google Scholar] [CrossRef]
Wiese, T.R.; Haskell, M.; Jarvis, S.; Rey-Planellas, S.; Turnbull, J. Concerns and research priorities for Scottish farmed salmon welfare–an industry perspective. Aquaculture 2023, 566, 739235. [Google Scholar] [CrossRef]
Attaran, M.; Attaran, S.; Celik, B.G. The impact of digital twins on the evolution of intelligent manufacturing and Industry 4.0. Adv. Comput. Intell. 2023, 3, 11. [Google Scholar] [CrossRef]
Yang, Q.; Du, X.; Wang, Z.; Meng, Z.; Ma, Z.; Zhang, Q. A review of core agricultural robot technologies for crop productions. Comput. Electron. Agric. 2023, 206, 107701. [Google Scholar] [CrossRef]
Wang, Y.; Yang, Y.; Zhao, H.; Liu, B.; Ma, J.; He, Y.; Zhang, Y.; Xu, H. Effects of cutting parameters on cutting of citrus fruit stems. Biosyst. Eng. 2020, 193, 1–11. [Google Scholar] [CrossRef]
Hughes, J.; Culha, U.; Giardina, F.; Guenther, F.; Rosendo, A.; Iida, F. Soft manipulators and grippers: A review. Front. Robot. AI 2016, 3, 69. [Google Scholar] [CrossRef]
Kulkarni, M.; Edward, S.; Golecki, T.; Kaehr, B.; Golecki, H. Soft robots built for extreme environments. Soft Sci. 2025, 5, 12. [Google Scholar] [CrossRef]
Liu, Y.; Hou, J.; Li, C.; Wang, X. Intelligent soft robotic grippers for agricultural and food product handling: A brief review with a focus on design and control. Adv. Intell. Syst. 2023, 5, 2300233. [Google Scholar] [CrossRef]
Park, Y.; Seol, J.; Pak, J.; Jo, Y.; Jun, J.; Son, H.I. A novel end-effector for a fruit and vegetable harvesting robot: Mechanism and field experiment. Precis. Agric. 2023, 24, 948–970. [Google Scholar] [CrossRef]
Lehnert, C.; English, A.; McCool, C.; Tow, A.W.; Perez, T. Autonomous sweet pepper harvesting for protected cropping systems. IEEE Robot. Autom. Lett. 2017, 2, 872–879. [Google Scholar] [CrossRef]
Seo, D.; Oh, I.S. Gripping Success Metric for Robotic Fruit Harvesting. Sensors 2024, 25, 181. [Google Scholar] [CrossRef]
Zhou, H.; Wang, X.; Kang, H.; Chen, C. A tactile-enabled grasping method for robotic fruit harvesting. arXiv 2021, arXiv:2110.09051. [Google Scholar] [CrossRef]
Xu, Y.; Lv, M.; Xu, Q.; Xu, R. Design and analysis of a robotic gripper mechanism for fruit picking. Actuators 2024, 13, 338. [Google Scholar] [CrossRef]
Arikapudi, R.; Vougioukas, S.G. Robotic Tree-fruit harvesting with arrays of Cartesian Arms: A study of fruit pick cycle times. Comput. Electron. Agric. 2023, 211, 108023. [Google Scholar] [CrossRef]
Aizat, M.; Qistina, N.; Rahiman, W. A comprehensive review of recent advances in automated guided vehicle technologies: Dynamic obstacle avoidance in complex environment toward autonomous capability. IEEE Trans. Instrum. Meas. 2023, 73, 1–25. [Google Scholar] [CrossRef]
Chen, M.; Chen, Z.; Luo, L.; Tang, Y.; Cheng, J.; Wei, H.; Wang, J. Dynamic visual servo control methods for continuous operation of a fruit harvesting robot working throughout an orchard. Comput. Electron. Agric. 2024, 219, 108774. [Google Scholar] [CrossRef]
Hsu, K.C.; Hu, H.; Fisac, J.F. The safety filter: A unified view of safety-critical control in autonomous systems. Annu. Rev. Control. Robot. Auton. Syst. 2023, 7, 47–72. [Google Scholar] [CrossRef]
Tai, L.; Paolo, G.; Liu, M. Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: Piscataway, NJ, USA, 2017; pp. 31–36. [Google Scholar] [CrossRef]
Siciliano, B.; Khatib, O. Robotics and the handbook. In Springer Handbook of Robotics; Springer International Publishing: Cham, Switzerland, 2016; pp. 1–6. [Google Scholar] [CrossRef]
Megalingam, R.K.; Senthil, A.P.; Raghavan, D.; Manoharan, S.K. Modeling of novel circular gait motion through daisy sequence fitting algorithm in a vertical climbing snake robot. J. Field Robot. 2024, 41, 211–226. [Google Scholar] [CrossRef]
Navas, E.; Shamshiri, R.R.; Dworak, V.; Weltzien, C.; Fernández, R. Soft gripper for small fruits harvesting and pick and place operations. Front. Robot. AI 2024, 10, 1330496. [Google Scholar] [CrossRef]
Bac, C.W.; Van Henten, E.J.; Hemming, J.; Edan, Y. Harvesting robots for high-value crops: State-of-the-art review and challenges ahead. J. Field Robot. 2014, 31, 888–911. [Google Scholar] [CrossRef]
Davidson, J.R.; Silwal, A.; Hohimer, C.J.; Karkee, M.; Mo, C.; Zhang, Q. Proof-of-concept of a robotic apple harvester. In 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: Piscataway, NJ, USA, 2016; pp. 634–639. [Google Scholar] [CrossRef]
Silwal, A.; Davidson, J.R.; Karkee, M.; Mo, C.; Zhang, Q.; Lewis, K. Design, integration, and field evaluation of a robotic apple harvester. J. Field Robot. 2017, 34, 1140–1159. [Google Scholar] [CrossRef]
Zhao, Y.; Gong, L.; Huang, Y.; Liu, C. A review of key techniques of vision-based control for harvesting robot. Comput. Electron. Agric. 2016, 127, 311–323. [Google Scholar] [CrossRef]
Koirala, A.; Walsh, K.B.; Wang, Z.; McCarthy, C. Deep learning for real-time fruit detection and orchard fruit load estimation: Benchmarking of ‘MangoYOLO’. Precis. Agric. 2019, 20, 1107–1135. [Google Scholar] [CrossRef]
Kang, H.; Chen, C. Fast implementation of real-time fruit detection in apple orchards using deep learning. Comput. Electron. Agric. 2020, 168, 105108. [Google Scholar] [CrossRef]
Shamshiri, R.R.; Weltzien, C.; Hameed, I.A.; Yule, I.J.; Grift, T.E.; Balasundram, S.K.; Pitonakova, L.; Ahmad, D.; Chowdhary, G. Research and development in agricultural robotics: A perspective of digital farming. Int. J. Agric. Biol. Eng. 2018, 11, 1–14. [Google Scholar] [CrossRef]
Lin, T.; Sun, F.; Li, X.; Guo, X.; Ying, J.; Wu, H.; Li, H. A Review of Key Technologies and Recent Advances in Intelligent Fruit-Picking Robots. Horticulturae 2026, 12, 158. [Google Scholar] [CrossRef]
Fang, W.; Wu, Z.; Li, W.; Sun, X.; Mao, W.; Li, R.; Majeed, Y.; Fu, L. Fruit detachment force of multiple varieties kiwifruit with different fruit-stem angles for designing universal robotic picking end-effector. Comput. Electron. Agric. 2023, 213, 108225. [Google Scholar] [CrossRef]
Wang, F.; Urquizo, R.C.; Roberts, P.; Mohan, V.; Newenham, C.; Ivanov, A.; Dowling, R. Biologically inspired robotic perception-action for soft fruit harvesting in vertical growing environments. Precis. Agric. 2023, 24, 1072–1096. [Google Scholar] [CrossRef]
Xiong, Y.; Ge, Y.; Grimstad, L.; From, P.J. An autonomous strawberry-harvesting robot: Design, development, integration, and field evaluation. J. Field Robot. 2020, 37, 202–224. [Google Scholar] [CrossRef]
Selvam, A.P.; Al-Humairi, S.N.S. Environmental impact evaluation using smart real-time weather monitoring systems: A systematic review. Innov. Infrastruct. Solut. 2025, 10, 13. [Google Scholar] [CrossRef]
Rijal, M.; Shrestha, R.; Smith, T.; Gu, Y. Force Aware Branch Manipulation To Assist Agricultural Tasks. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: Piscataway, NJ, USA, 2025; pp. 1217–1222. [Google Scholar] [CrossRef]
Chen, J.; Ma, W.; Liao, H.; Lu, J.; Yang, Y.; Qian, J.; Xu, L. Balancing accuracy and efficiency: The status and challenges of agricultural multi-arm harvesting robot research. Agronomy 2024, 14, 2209. [Google Scholar] [CrossRef]
Castro-Garcia, S.; Aragon-Rodriguez, F.; Sola-Guirado, R.R.; Serrano, A.J.; Soria-Olivas, E.; Gil-Ribes, J.A. Vibration monitoring of the mechanical harvesting of citrus to improve fruit detachment efficiency. Sensors 2019, 19, 1760. [Google Scholar] [CrossRef]
Wei, H.; Zhang, Y.; Xiao, H.; Chen, W.; Chen, M.; Wang, J.; Lu, Q.; Luo, L. Contact force modeling and analysis of rheological deformation behaviors during clustered fruits harvesting. Comput. Electron. Agric. 2025, 237, 110772. [Google Scholar] [CrossRef]
Mohsenin, N.N. Physical Properties of Plant and Animal Materials: V. 1: Physical Characteristics and Mechanical Properties; Routledge: Oxfordshire, UK, 2020. [Google Scholar]
Ghonimy, M.; Ibrahim, M.M.; Helmy, H.S.; Alzoheiry, A. Power requirements for olive mechanical harvesting using trunk shaker. Sci. Rep. 2024, 14, 25585. [Google Scholar] [CrossRef] [PubMed]
Fei, Z.; Vougioukas, S.G. A robotic orchard platform increases harvest throughput by controlling worker vertical positioning and platform speed. Comput. Electron. Agric. 2024, 218, 108735. [Google Scholar] [CrossRef]
O’Brien, M.; Cargill, B.F.; Fridley, R.B. Principles and Practices for Harvesting & Handling Fruits and Nuts; AVI Publishing Company, Inc.: Westport, CT, USA, 1983. [Google Scholar] [CrossRef]
Alzoheiry, A.; Ghonimy, M.; Abd El Rahman, E.; Abdelwahab, O.; Hassan, A. Improving olive mechanical harvesting using appropriate natural frequency. J. Agric. Eng. 2020, 51, 148–154. [Google Scholar] [CrossRef]
Ghonimy, M.I.; Ibrahim, M.M.; Ghaly, A.; El Rahman, E.N.E.-D.A. Performance evaluation of hand-held olive harvesters. Agric. Eng. Int. CIGR J. 2021, 23, 127–137. [Google Scholar]
Mail, M.F.; Maja, J.M.; Marshall, M.; Cutulle, M.; Miller, G.; Barnes, E. Agricultural harvesting robot concept design and system components: A review. AgriEngineering 2023, 5, 777–800. [Google Scholar] [CrossRef]
Spagnuolo, M.; Todde, G.; Caria, M.; Furnitto, N.; Schillaci, G.; Failla, S. Agricultural robotics: A technical review addressing challenges in sustainable crop production. Robotics 2025, 14, 9. [Google Scholar] [CrossRef]
Skafi, Z.; Brown, T.M. Conventional Substrates for Flexible Devices. In The Handbook of Paper-Based Sensors and Devices: Volume 1: Materials and Technologies; Springer Nature: Cham, Switzerland, 2025; pp. 27–48. [Google Scholar] [CrossRef]
Afsah-Hejri, L.; Homayouni, T.; Toudeshki, A.; Ehsani, R.; Ferguson, L.; Castro-García, S. Mechanical harvesting of selected temperate and tropical fruit and nut trees. Hortic. Rev. 2022, 49, 171–242. [Google Scholar] [CrossRef]
Bogue, R. Fruit picking robots: Has their time come? Ind. Robot. Int. J. Robot. Res. Appl. 2020, 47, 141–145. [Google Scholar] [CrossRef]
Shintake, J.; Cacucciolo, V.; Floreano, D.; Shea, H. Soft robotic grippers. Adv. Mater. 2018, 30, 1707035. [Google Scholar] [CrossRef] [PubMed]
Paul, A. Current Research in Agricultural Sciences. Curr. Res. Agric. Sci. 2026, 13, 1–42. [Google Scholar]
Beldek, C.; Dunn, A.; Cunningham, J.; Sariyildiz, E.; Phung, S.L.; Alici, G. Multi-vision-based Picking Point Localisation of Target Fruit for Harvesting Robots. In 2025 IEEE International Conference on Mechatronics (ICM); IEEE: Piscataway, NJ, USA, 2025; pp. 1–6. [Google Scholar] [CrossRef]
Jin, T.; Han, X. Robotic arms in precision agriculture: A comprehensive review of the technologies, applications, challenges, and future prospects. Comput. Electron. Agric. 2024, 221, 108938. [Google Scholar] [CrossRef]
Castro-Garcia, S.; Sola-Guirado, R.R.; Gil-Ribes, J.A. Vibration analysis of the fruit detachment process in late-season ‘Valencia’ orange with canopy shaker technology. Biosyst. Eng. 2018, 170, 130–137. [Google Scholar] [CrossRef]
Chauhan, A.; Brouwer, B.; Westra, E. Robotics for a quality-driven post-harvest supply chain. Curr. Robot. Rep. 2022, 3, 39–48. [Google Scholar] [CrossRef]
Kumari, A.; Singh, S.; Aparnna, V.P.; Joshi, P.; Chauhan, A.K.; Singh, M.; Hemalatha, S. Mechanization in pre-harvest technology to improve quality and safety. In Engineering Aspects of Food Quality and Safety; Springer International Publishing: Cham, Switzerland, 2023; pp. 93–114. [Google Scholar]
Wu, X.; Jiang, Y. Post-Harvest Loss Reduction in Perishable Crops: Task-Technology Fit and Emotion-Driven Acceptance of On-Farm Transport Robots. Agronomy 2025, 15, 2169. [Google Scholar] [CrossRef]
Zhou, J.; Fu, X.; Zhou, S.; Zhou, J.; Ye, H.; Nguyen, H.T. Automated segmentation of soybean plants from 3D point cloud using machine learning. Comput. Electron. Agric. 2019, 162, 143–153. [Google Scholar] [CrossRef]
Ghonimy, M.; Alayouni, R.; Alshehry, G.; Barakat, H.; Ibrahim, M.M. Integrated Physical–Mechanical Characterization of Fruits for Enhancing Post-Harvest Quality and Handling Efficiency. Foods 2025, 14, 2521. [Google Scholar] [CrossRef]
Song, H.; Wang, K.; Wang, Y.; Zhang, X. Study on the abscission mechanical model of the peach fruit-branch system based on a mixed-mode cohesive zone model and finite element method. Comput. Electron. Agric. 2025, 231, 109965. [Google Scholar] [CrossRef]
Feng, Y.X.; Hwang, B.G. Charting the Unseen: A Systematic Review of Risk Perception in Emerging Technologies. IEEE Trans. Eng. Manag. 2025, 72, 3832–3848. [Google Scholar] [CrossRef]
Zhang, B.; Zhou, J.; Meng, Y.; Zhang, N.; Gu, B.; Yan, Z.; Idris, S.I. Comparative study of mechanical damage caused by a two-finger tomato gripper with different robotic grasping patterns for harvesting robots. Biosyst. Eng. 2018, 171, 245–257. [Google Scholar] [CrossRef]
Lu, G.; Wang, Z.; Dai, N.; Yuan, J.; Liu, X.; Papadakis, G. Dynamic impact mechanical damage analysis and tomato robotic post-picking crating optimization based on multiscale finite element model. Smart Agric. Technol. 2025, 10, 100742. [Google Scholar] [CrossRef]
Chen, C.W.; Lee, Y.H. Microsurgical robots—Engineering perspectives when scaling down for precision. In Robot Design: Application to Medical Robotics; Academic Press: Cambridge, MA, USA, 2025; p. 251. [Google Scholar]
Zhai, Z.; Martínez, J.F.; Beltran, V.; Martínez, N.L. Decision support systems for agriculture 4.0: Survey and challenges. Comput. Electron. Agric. 2020, 170, 105256. [Google Scholar] [CrossRef]
Bechar, A.; Vigneault, C. Agricultural robots for field operations: Concepts and components. Biosyst. Eng. 2016, 149, 94–111. [Google Scholar] [CrossRef]
Wang, C.; Pan, W.; Zou, T.; Li, C.; Han, Q.; Wang, H.; Yang, J.; Zou, X. A review of perception technologies for berry fruit-picking robots: Advantages, challenges, and prospects. Agriculture 2024, 14, 1346. [Google Scholar] [CrossRef]
Cha, P.Y. An Integrated Quantitative Framework Propose for Sustainability Assessment in Geographical Indication Production Systems. Ph.D. Thesis, Universidade de São Paulo, São Paulo, Brazil, 2020. [Google Scholar]
Niu, J.; Yu, Q.; Bi, M.; Zhao, J.; Zhang, T. Deep Reinforcement Learning-Based Cooperative Harvesting Strategy for Dual-Arm Robots in Apple Picking. Agronomy 2025, 15, 2565.
Li, K.; Wang, J.; Jalil, H.; Wang, H. A fast and lightweight detection algorithm for passion fruit pests based on improved YOLOv5. Comput. Electron. Agric. 2023, 204, 107534.
Singh, R.; Hasanain, M.; Babu, S.; Arif, M.; Ansari, M.A.; Kumar, A.; Nath, C.P.; Kumar, S. Energy Use Efficient Approaches in Farming: A Futuristic Strategy. In Modern Technology for Sustainable Agriculture; Springer Nature: Cham, Switzerland, 2025; pp. 205–229.
Liu, Y.; Li, Y.; Dong, Y.; Huang, M.; Zhang, T.; Cheng, J. Development of a variable-diameter threshing drum for rice combine harvester using MBD-DEM coupling simulation. Comput. Electron. Agric. 2022, 196, 106859.
Khan, M.N. Edge AI based on Real-time Robotic Decision-Making in Resource-constrained Environment. Int. J. Adv. Innov. Res. (IJAIR) 2025, 1, 19–26.
Vu, C.T.; Chen, H.C.; Liu, Y.C. Toward autonomous navigation for agriculture robots in orchard farming. In 2024 IEEE International Conference on Recent Advances in Systems Science and Engineering (RASSE); IEEE: Piscataway, NJ, USA, 2024; pp. 1–8.
Gilday, K.; Hughes, J.; Iida, F. Sensing, actuating, and interacting through passive body dynamics: A framework for soft robotic hand design. Soft Robot. 2023, 10, 159–173.
Chen, Y.; Xiong, Y.; Zhang, B.; Zhou, J.; Zhang, Q. 3D point cloud semantic segmentation toward large-scale unstructured agricultural scene classification. Comput. Electron. Agric. 2021, 190, 106445.
Javaid, M.; Haleem, A.; Singh, R.P. Advancement in Robotics for Agriculture: An Extensive Perspective on Present, Potential and Futuristic Aspects. J. Ind. Integr. Manag. 2026, 1–33.
Ghonimy, M.; Younis, S.; Ghoniem, E. Development of Olive Harvesting Machine by Shaking. Turk. J. Agric. Eng. Res. 2022, 3, 86–102.
Mhamed, M.; Zhang, Z.; Yu, J.; Li, Y.; Zhang, M. Advances in apple’s automated orchard equipment: A comprehensive research. Comput. Electron. Agric. 2024, 221, 108926.
Arad, B.; Balendonck, J.; Barth, R.; Ben-Shahar, O.; Edan, Y.; Hellström, T.; Hemming, J.; Kurtser, P.; Ringdahl, O.; van Tuijl, B.; et al. Development of a sweet pepper harvesting robot. J. Field Robot. 2020, 37, 1027–1039.
Martins, M.B.; Filho, A.C.M.; Drudi, F.S.; Bortolheiro, F.P.D.A.P.; Vendruscolo, E.P.; Esperancini, M.S.T. Economic efficiency of mechanized harvesting of sugarcane at different operating speeds. Sugar Tech 2021, 23, 428–432.
Kootstra, G.; Wang, X.; Blok, P.M.; Hemming, J.; Van Henten, E. Selective harvesting robotics: Current research, trends, and future directions. Curr. Robot. Rep. 2021, 2, 95–104.
Lowenberg-DeBoer, J.; Franklin, K.; Behrendt, K.; Godwin, R. Economics of autonomous equipment for arable farms. Precis. Agric. 2021, 22, 1992–2006.
Pearson, S.; Camacho-Villa, T.C.; Valluru, R.; Gaju, O.; Rai, M.C.; Gould, I.; Brewer, S.; Sklar, E. Robotics and autonomous systems for net zero agriculture. Curr. Robot. Rep. 2022, 3, 57–64.
Lowenberg-DeBoer, J.; Huang, I.Y.; Grigoriadis, V.; Blackmore, S. Economics of robots and automation in field crop production. Precis. Agric. 2020, 21, 278–299.
Nkwocha, C.L.; Adewumi, A.; Folorunsho, S.O.; Eze, C.; Jjagwe, P.; Kemeshi, J.; Wang, N. A comprehensive review of sensing, control, and networking in agricultural robots: From perception to coordination. Robotics 2025, 14, 159.
Zhang, X.; He, L.; Karkee, M.; Whiting, M.D.; Zhang, Q. Field evaluation of targeted shake-and-catch harvesting technologies for fresh market apple. Trans. ASABE 2020, 63, 1759–1771.
Teixeira, K.; Miguel, G.; Silva, H.S.; Madeiro, F. A survey on applications of unmanned aerial vehicles using machine learning. IEEE Access 2023, 11, 117582–117621.
Zhang, Z.; Igathinathane, C.; Li, J.; Cen, H.; Lu, Y.; Flores, P. Technology progress in mechanical harvest of fresh market apples. Comput. Electron. Agric. 2020, 175, 105606.
De Alwis, S.; Hou, Z.; Zhang, Y.; Na, M.H.; Ofoghi, B.; Sajjanhar, A. A survey on smart farming data, applications and techniques. Comput. Ind. 2022, 138, 103624.
Kober, J.; Bagnell, J.A.; Peters, J. Reinforcement learning in robotics: A survey. Int. J. Robot. Res. 2013, 32, 1238–1274.
Ebert, F.; Finn, C.; Dasari, S.; Xie, A.; Lee, A.; Levine, S. Visual foresight: Model-based deep reinforcement learning for vision-based robotic control. arXiv 2018, arXiv:1812.00568.
Argall, B.D.; Chernova, S.; Veloso, M.; Browning, B. A survey of robot learning from demonstration. Robot. Auton. Syst. 2009, 57, 469–483.
Underwood, J.P.; Hung, C.; Whelan, B.; Sukkarieh, S. Mapping almond orchard canopy volume, flowers, fruit and yield using lidar and vision sensors. Comput. Electron. Agric. 2016, 130, 83–96.
Joshi, N.; Pandey, R.; Kaparwal, S.; Yadav, V.; Gupta, A.K.; Jha, A.K.; Bhatt, S.; Kumar, V.; Naik, B.; Akhtar, S.; et al. State-of-the-art non-destructive approaches for maturity index determination in fruits and vegetables: Principles, applications, and future directions. Food Prod. Process. Nutr. 2024, 6, 56.
Zhang, Y.; Li, N.; Zhang, L.; Lin, J.; Gao, X.; Chen, G. A review on the recent developments in vision-based apple-harvesting robots for recognizing fruit and picking pose. Comput. Electron. Agric. 2025, 231, 109968.
Luo, X.; Xiong, S.; Jia, X.; Zeng, Y.; Chen, X. AIoT-enabled data management for smart agriculture: A Comprehensive Review on Emerging Technologies. IEEE Access 2025, 13, 102964–102993.
Wang, W.; Li, C.; Xi, Y.; Gu, J.; Zhang, X.; Zhou, M.; Peng, Y. Research progress and development trend of visual detection methods for selective fruit harvesting robots. Agronomy 2025, 15, 1926.
Purcell, W.; Neubauer, T.; Mallinger, K. Digital Twins in agriculture: Challenges and opportunities for environmental sustainability. Curr. Opin. Environ. Sustain. 2023, 61, 101252.
Wang, T.; Chen, B.; Zhang, Z.; Li, H.; Zhang, M. Applications of machine vision in agricultural robot navigation: A review. Comput. Electron. Agric. 2022, 198, 107085.
Tokekar, P.; Vander Hook, J.; Mulla, D.; Isler, V. Sensor planning for a symbiotic UAV and UGV system for precision agriculture. IEEE Trans. Robot. 2016, 32, 1498–1511.

Figure 1. Integrated architecture of a robotic fruit harvesting system illustrating the overall harvesting pipeline, system components, operational workflow in orchard environments, and key bottlenecks affecting system performance.

Figure 2. Perception pipeline for robotic fruit harvesting systems, illustrating multi-modal sensing using RGB, depth, and LiDAR inputs, followed by data processing, fruit detection and segmentation using deep learning techniques, and the generation of structured outputs including fruit position, maturity, and geometric features for downstream robotic operations.

Figure 3. Challenges and limitations in fruit perception systems for robotic harvesting, illustrating the impact of environmental conditions such as illumination variability, occlusion, and fruit variability, along with a comparative analysis of traditional machine vision and deep learning approaches, highlighting trade-offs between accuracy, computational cost, and real-time performance.

Figure 4. Conceptual framework of robotic manipulation in fruit harvesting systems, illustrating the interaction between perception outputs, grasp planning, motion control, end-effector design (rigid vs. soft grippers), and damage-aware handling during fruit picking operations.

Figure 5. Biomechanical model of fruit detachment illustrating tensile force (F), torsional moment (T), and cutting force (Fc) acting on the fruit–peduncle system, with stress concentration and failure occurring at the peduncle junction.

Figure 7. Integrated framework of orchard and pre-harvest factors affecting robotic harvesting performance.

Figure 8. Conceptual framework of post-harvest handling and fruit recovery, illustrating the flow from fruit damage through collection and transport systems to efficiency, loss, and overall system throughput.

Figure 9. Integrated framework of system integration and autonomous operation in robotic fruit harvesting systems, illustrating the interaction between perception, decision-making, navigation, control, and multi-robot coordination within a closed-loop architecture under dynamic orchard conditions.

Figure 11. Integrated framework of emerging technologies in robotic fruit harvesting systems, illustrating how artificial intelligence, advanced perception, digital agriculture, and energy-aware design enable tight coupling between perception, decision-making, and harvesting within autonomous operation.

Table 1. Comparative evaluation of end-effector types in robotic fruit harvesting systems.

End-Effector Type	Structural Characteristics	Grasping Efficiency	Damage Risk	Adaptability	Suitable Fruit Types	Key Limitations	References
Rigid Grippers	Hard materials, fixed geometry	High (for regular shapes)	High	Low	Apples, citrus	Poor compliance, risk of bruising	[58,59]
Soft Grippers	Compliant materials, flexible structure	Moderate to high	Low	High	Strawberries, tomatoes	Lower precision, slower response	[60,61,62]
Hybrid Grippers	Combination of rigid frame and soft contact	High	Low to moderate	High	Wide range of fruits	Increased complexity and cost	[11,64]

Table 2. Comparative evaluation of robotic manipulation techniques in fruit harvesting systems.

Technique	Grasping Efficiency	Success Rate	Adaptability	Damage Risk	Suitable Conditions	References
Suction Grasping	High	Moderate	Low	Low–Moderate	Smooth, regular fruits	[26,65]
Fingered Grasping	Moderate	High	High	Low	Complex, cluttered environments	[66,67]
Soft Grippers	Moderate	High	High	Very Low	Delicate fruits	[21,75]
Hybrid Systems	High	Very High	Very High	Low	Diverse fruit types	[57,63]

Table 3. Quantitative comparison of representative robotic fruit harvesting systems under different operational conditions.

Crop Type	Detection/Harvesting Method	HSR (%)	CT (s fruit⁻¹)	FDR (%)	Main Limitation	Reference
Sweet pepper	Stereo vision + robotic manipulator	62–68	24	<5	Occlusion and localization errors	[76]
Sweet pepper	RGB-D perception + autonomous harvesting robot	76.5	35–40	4–6	Complex canopy structure	[64]
Apple	Vision-guided harvesting robot	84	6–8	5	Fruit overlap and illumination variability	[77]
Apple	Vacuum-based robotic end-effector	84	5.9	<4	Branch interference	[78]
Tomato	Machine vision + robotic harvesting	87.5	8–10	3–5	Variable fruit orientation	[79]
Mango	Deep learning (Faster R-CNN)	90.7	—	—	Dense foliage and occlusion	[80]
Strawberry	Deep learning + soft gripper	86	7.5	<3	Delicate fruit handling	[81]
Multiple crops	Autonomous harvesting systems review	66–90	5–40	2–8	Environmental variability	[82]
Multiple fruits	Intelligent fruit-picking robots review	70–95	4–30	2–7	Scalability and real-time deployment	[83]

HSR = Harvest Success Rate; CT = Cycle Time; FDR = Fruit Damage Rate.
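
The three metrics in Table 3 can be computed from raw trial counts. The sketch below is a minimal illustration and does not reproduce the procedure of any cited study: the `TrialLog` fields, the function names, and the choice of denominators (attempted fruits for HSR, harvested fruits for CT and FDR) are assumptions, and individual papers may define these ratios differently. The sample numbers are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class TrialLog:
    """Counts from one field trial (all field names are hypothetical)."""
    attempted: int        # fruits the robot tried to pick
    harvested: int        # fruits successfully detached and collected
    damaged: int          # harvested fruits that show damage
    total_time_s: float   # total picking time, in seconds


def harvest_success_rate(log: TrialLog) -> float:
    """HSR (%): harvested fruits as a share of attempted fruits (assumed definition)."""
    return 100.0 * log.harvested / log.attempted


def cycle_time(log: TrialLog) -> float:
    """CT (s per fruit): total picking time divided by harvested fruits (assumed definition)."""
    return log.total_time_s / log.harvested


def fruit_damage_rate(log: TrialLog) -> float:
    """FDR (%): damaged fruits as a share of harvested fruits (assumed definition)."""
    return 100.0 * log.damaged / log.harvested


def fruits_per_hour(ct_s: float) -> float:
    """Nominal single-arm throughput implied by a cycle time, ignoring travel and downtime."""
    return 3600.0 / ct_s


if __name__ == "__main__":
    # Invented example trial, not data from the review.
    log = TrialLog(attempted=200, harvested=168, damaged=6, total_time_s=1008.0)
    print(f"HSR = {harvest_success_rate(log):.1f} %")
    print(f"CT  = {cycle_time(log):.1f} s per fruit")
    print(f"FDR = {fruit_damage_rate(log):.1f} %")
    print(f"Throughput = {fruits_per_hour(cycle_time(log)):.0f} fruits per hour")
```

Stating the denominators explicitly matters when comparing rows of Table 3, because a cycle time averaged over attempts rather than successful picks would look shorter for the same trial.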

Table 4. Representative robotic fruit harvesting platforms and their key characteristics.

Platform/System	Crop Type	Robot Type	Sensing Method	Success Rate	Key Limitation	Reference
Abundant Robotics (prototype)	Apple	Mobile single-arm vacuum-based harvester	RGB-D vision	70–85%	Sensitive to occlusion, limited selectivity	[63,64]
FFRobotics multi-arm harvester	Apple	Multi-arm harvesting platform	Camera-based AI detection	~80%	High system complexity, cost	[64]
Octinion “Rubion” robot	Strawberry	Mobile robot with soft gripper	3D vision + AI	70–80%	Limited speed, delicate handling constraints	[102]
Wageningen UR strawberry robot	Strawberry	Autonomous greenhouse robot	RGB-D + structured light	60–75%	Lighting sensitivity, occlusion issues	[11,24]
Soft gripper harvesting systems	Strawberry/Tomato	Soft robotic manipulators	Vision + tactile sensing	65–85%	Lower picking speed	[100,103]
Trunk shaker systems	Olive/Citrus	Semi-mechanized shaking systems	Minimal sensing	High (bulk harvesting)	Non-selective, fruit damage	[104]
