Review

A Review of Integrated Approaches in Robotic Raspberry Harvesting

Department of Electrical Engineering and Automation, Faculty of Engineering, Czech University of Life Sciences Prague, 165 00 Praha, Czech Republic
*
Author to whom correspondence should be addressed.
Agronomy 2025, 15(12), 2677; https://doi.org/10.3390/agronomy15122677
Submission received: 27 October 2025 / Revised: 17 November 2025 / Accepted: 20 November 2025 / Published: 21 November 2025
(This article belongs to the Section Precision and Digital Agriculture)

Abstract

Raspberry cultivation represents a high-value global industry; however, concerns regarding its sustainability have been raised due to the high costs and labour shortages associated with manual harvesting. These challenges are significant motivators for the development of robotic systems. This review article analyses contemporary robotic harvesting technologies, with a particular focus on integrated systems, machine vision and end-effectors. A review of the relevant literature was conducted to identify and compare the main development trends represented by academic and commercial prototypes. The analysis demonstrates that deep learning methodologies, most notably YOLO architectures, predominate within the domain of machine vision, ensuring effective identification and assessment of fruit ripeness. To ensure gentle fruit handling, soft robotic end-effectors equipped with sensors that minimise mechanical damage are recommended. Because the number of studies focusing directly on raspberries is limited, the present study also analyses transferable technologies from other types of soft fruit. Consequently, future research should concentrate on integrating machine vision models trained on raspberries and developing advanced soft end-effectors with integrated tactile sensors.

1. Introduction

Raspberries (Rubus idaeus L.) are a globally important, high-value crop whose economic impact is determined by increasing production and their wide application across various industries. According to 2022 data from FAOSTAT [1], the Russian Federation, Mexico, Serbia and Poland are among the world’s largest producers. Raspberries are valuable for direct consumption as fresh fruit and for their high content of bioactive compounds, including vitamins, minerals, and antioxidants, which make them an important raw material for the pharmaceutical and cosmetics industries. The fruits and shoots are used to produce phytotherapeutic and dermatological preparations, while the seeds are pressed to produce oil with a sun protection factor [2]. Raspberry leaves are also an important source of bioactive substances and have traditionally been used in medicine to treat common ailments such as colds and various inflammations. They contain high levels of polyphenols with antioxidant, anticancer and anti-inflammatory properties, including ellagic acid and salicylic acid, the latter of which is a key ingredient in the drug aspirin [2].
There are two main groups of cultivars: single-fruiting (summer) varieties, which bear fruit on two-year-old shoots; and ever-fruiting (remontant) varieties, which bear fruit on one-year-old shoots. Modern cultivation systems often use support, such as the V-shaped trellises, soil mulching with geotextiles and drip irrigation to optimise yield and fruit quality [3]. Harvesting whole, undamaged raspberries requires expensive manual picking. For example, in Hungary, manual labour accounted for 48.2% of the total cost of raspberry cultivation in 2014 [4]. In many regions, securing a sufficient number of seasonal workers is becoming increasingly difficult due to factors such as labour migration and limited access to finance for small growers [3,5]. These economic and labour pressures are the main drivers for the development and implementation of robotic harvesting systems. Additionally, the entire production cycle is increasingly being viewed in the context of a circular economy, where plant waste is composted and returned to the soil as organic fertiliser, thereby reducing dependence on external inputs and minimising environmental impact [3].
In response to these issues, mechanical harvesters have been developed since the 1960s [6]. These machines pass over rows of crops and shake the fruit using mechanisms such as vibrating fingers with a frequency typically between 8 and 9 Hz or pulsating air currents [6,7]. The efficiency of these systems ranges from 50 to 80% of ripe fruit harvested. Fruit separation is frequently attributable to short-term shocks and collisions with parts of the plant, as opposed to continuous vibration [8]. However, it should be noted that their operation is associated with several disadvantages.
A major drawback is the lack of selectivity, as the machines shake off the fruit regardless of its ripeness [6,8]. The harvest thus contains an undesirable proportion of unripe fruit, which reduces its overall quality. Another serious problem is mechanical damage to the fruit caused by shocks and impacts on the catch plates, which makes them unsuitable for the fresh fruit market and predestines them for industrial processing [6]. In addition, mechanical harvesters damage the plants themselves, especially annual shoots, which can lead to a reduction in yield of up to 40% in the following season due to increased susceptibility to disease [7].
Breeding programmes have focused on developing varieties better suited to mechanised harvesting. These have firm fruit structure, are easily separated from the receptacle when fully ripe and have suitable bush architecture with fruit concentrated on the outer parts for easier access. Historically, summer varieties such as ‘Malling Jewel’ and ‘Glen Clova’ have been utilised for this purpose [6]. Currently, remontant varieties such as ‘Polka’, ‘Polana’ or ‘Heritage’ appear to be particularly promising [3,7]. These varieties bear fruit on annual shoots, which are often more robust, more upright, and less dense. This can facilitate navigation and handling for automated systems. Furthermore, the fruits of these plants are frequently characterised by increased size and firmness [3]. However, the process of breeding entails a delicate balancing act between mechanical resistance and the sensory characteristics that are in demand by the market [6].
Robotic harvesting constitutes an alternative approach predicated upon the harvesting of individual fruits. Machine vision systems (e.g., 2D and 3D cameras) and image processing algorithms are utilised to identify ripe fruits, which are then separated using specialised end-effectors, often incorporating soft robotics elements [5].
The robotic harvesting method under discussion offers two main advantages. Firstly, it facilitates the production of fruit that is suitable for the fresh fruit market, which is the most economically viable. Secondly, even within the domain of the processing industry, robotic harvesting confers several advantages, including the selection of optimally ripe fruit and the elimination of damage to plants, which in turn contributes to enhanced raw material quality and long-term production sustainability [5,7].
Nevertheless, the development of robotic systems for raspberry harvesting is still in its infancy, and commercially available solutions are scarce. The primary technical challenges encompass navigation and manipulation within the complex, unstructured environment of orchards, in addition to the gentle handling the fruit requires. The short duration of the harvesting season further restricts experimental testing of such systems in real conditions [5]. This combination of characteristics renders raspberries a model problem and one of the greatest challenges for current harvesting robotics; successful solutions developed for raspberries thus have high potential for transferability to other soft fruit species. Notwithstanding these challenges, the field is undergoing active development, with two primary approaches: academic research at EPFL [5] and a commercial system from Fieldwork Robotics [9].
The aim of this study is to map and evaluate existing solutions for robotic raspberry harvesting. Manual harvesting of soft fruits such as raspberries remains a laborious and costly process, which motivates the development of automated harvesting systems. The study concentrates on identifying and analysing design strategies and technological principles that have proven effective for fruits with similar characteristics. The focus is placed on the technical aspects of robotic harvesting, particularly the design of end-effectors and the computer vision methods used for fruit detection, rather than on economic performance or yield analysis, which vary considerably by region and production system. Based on the synthesis of the findings, recommendations will be proposed to guide the future development of effective robotic harvesting systems for raspberries.
After the Introduction, the manuscript is structured into the following sections. Section 2 (Materials and methods) outlines the literature search strategy, inclusion criteria, and thematic organisation of studies on robotic raspberry harvesting and on soft, delicate fruits with physical properties similar to raspberries. Section 3 (Results) presents current integrated robotic harvesting systems, research on vision-based detection and ripeness assessment, deep learning methods for recognising small fragile fruits, and the design and evaluation of soft and hybrid end-effectors. Section 4 (Conclusions) summarises the key findings, identifies major technological gaps in perception and gripping, and provides recommendations for the future development of robotic harvesting systems for raspberries. Figure 1 provides a flowchart that visualises the overall structure of the manuscript.

2. Materials and Methods

A comprehensive literature review was conducted for the purposes of this study, with the objective of identifying, evaluating and synthesising relevant publications on robotic raspberry harvesting and transferable technologies from analogous applications. The focus was on publications issued after 2020, but to ensure comprehensiveness, earlier studies were also included if they represented the most up-to-date relevant knowledge in a specific area. The search was conducted in the electronic scientific databases Scopus, IEEE Xplore, Web of Science, and Google Scholar, and included publications in English available until June 2025. This keyword-based approach was supplemented by an analysis of citations in the key articles found in order to identify other relevant sources not captured by the primary search.
While the issue of autonomous navigation and platform movement across the field is crucial for the complete system, it represents a separate, extensive area. Consequently, studies focused exclusively on this issue were not included in the present work.
The search was divided into two main thematic areas. The first pertained to machine vision, using keywords such as: (“computer vision” OR “image processing” OR “deep learning” OR “YOLO”) AND (“fruit detection” OR “ripeness estimation”) AND “berry”. The second targeted end-effectors and used keywords such as: (“soft robotics” OR “gripper” OR “end-effector”) AND (“harvesting” OR “picking”) AND (“raspberry” OR “soft fruit” OR “berry”). Given that even with these specific keywords the initial search often yielded more than 2000 results, an initial screening was performed based on title relevance and, for selected articles, the recognition of authors active in the field.
The selection process of the study, as illustrated schematically in Figure 2, comprised multiple phases. The collection comprised peer-reviewed articles, conference proceedings, and relevant patent applications, with a focus on the development, testing, or analysis of integrated robotic systems and key technologies (end-effectors, machine vision systems) for harvesting raspberries or similar soft fruits, defined here as small, fragile berries with thin skin and easily deformable flesh (e.g., strawberries, blueberries, blackberries). In the synthesis of data from full texts, emphasis was placed on studies that described a unique technical approach with the potential for transferability to raspberry harvesting. In the course of the present study, a number of exclusions were made with regard to the literature on the subject. Firstly, studies focusing on fruits significantly different from raspberries and on large-scale mechanical harvesters operating on the principle of shaking were excluded. Secondly, review articles, duplicate records, and publications without available full text were also excluded.
The primary literature search yielded a total of 351 publications. Following the removal of duplicates, a total of 290 records were subjected to review. Following the application of the aforementioned criteria, a further 133 full texts were excluded from the study. The final synthesis incorporated 62 studies. Their distribution by year of publication is presented in Figure 3. The relevant data was extracted from the final set and subsequently synthesised into the following categories through thematic analysis:
(i) Machine vision methods for detection, localisation, and ripeness assessment;
(ii) Design principles of end-effectors.
To ensure conceptual consistency and avoid potential misinterpretation, this review deliberately excludes economic indicators such as raspberry yield, manual harvesting efficiency, labour inputs, and cost structures associated with hardware deployment or algorithm training. These parameters vary substantially across cultivars, production systems, training structures, and regional labour and market conditions. Moreover, cost-related information is not consistently reported or standardised in the available literature, preventing meaningful cross-study comparison. As the objective of this review is to synthesise technological and methodological advances in robotic raspberry harvesting, rather than to provide an agronomic or economic assessment of production systems, these factors fall outside the scope of the present work. A clarification reflecting this limitation has been included to prevent over-interpretation of technical comparisons.

3. Results

The following subsections present a synthesis of the current state of knowledge in robotic raspberry harvesting, as well as relevant applicable research from related fields. This synthesis is derived from the analysis of literature identified through the search strategy and selection criteria outlined previously.

3.1. Current State of Knowledge in Robotic Raspberry Harvesting

3.1.1. Integrated Robotic Systems Designed for Raspberry Harvesting

The current state of knowledge in robotic raspberry harvesting is defined by two main approaches. The first is the implementation of sophisticated, integrated harvesting systems. Prominent examples include the EPFL CREATE system [5], which employs an accelerated “Lab2Field” development methodology using physical twins (PTs), and the system from Fieldwork Robotics [9], which has been deployed in real field conditions.
The second research direction focuses on addressing the specific challenges necessary for the successful implementation of integrated systems. As will be discussed in the following sections, this involves the development of robust systems for fruit detection and localisation and the design of gentle end-effectors for gripping.
Lab2Field Approach to Robotic Raspberry Harvesting
As outlined in a study within the EPFL CREATE project by Junge, Pires, and Hughes [5], the “Lab2Field” methodology aims to circumvent the reliance on the short harvest season for testing and development. The methodology is based on a soft, sensor-equipped PT (see Figure 4a) that replicates the mechanical and haptic properties of a real raspberry and its connection to the plant. This enables experiments and the optimisation of control algorithms in a laboratory setting throughout the year, thereby shortening development cycles.
The physical twin has been designed with a focus on mechanical precision and the capacity to provide quantifiable feedback. The fruit is composed of silicone, the material properties of which have been verified through comparison with data measured on a sample of fourteen real raspberries. The integration of a fluid sensor, capable of measuring compressive force, is a notable feature. The bed and stem are manufactured using 3D printing technology from thermoplastic polyurethane (TPU) in order to replicate the plant’s flexibility. The connection between the fruit and the bed is achieved by an adjustable magnetic holder, which regulates the force required to separate the fruit, allowing different degrees of ripeness to be simulated. The objective of this approach is to minimise the discrepancy between the simulated and actual environments [5].
The development of the control algorithm is based on Learning from human Demonstration (LfD). Force profiles recorded during manual harvesting of a PT serve as quantitative reference data. The robot, equipped with a two-finger parallel end-effector with integrated pressure sensors, is programmed to mimic this technique. As part of an iterative process, the robot autonomously adjusts its control parameters using a gradient method to minimise deviation from the reference data. The harvesting algorithm is structured in three phases, as illustrated in the flowchart in Figure 5: (A) an initial compression force is applied during the approach phase; (B) a constant pulling speed of 10 mm·s⁻¹ is maintained while the fruit is being detached; (C) the compression force is reduced to a maintenance level once fruit separation is detected. This final step is analogous to the human technique and is designed to prevent damage to the fruit after detachment [5].
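The three-phase routine can be sketched as a small state machine; the force values, thresholds, and units below are illustrative placeholders, not parameters reported in [5]:

```python
from enum import Enum, auto

class Phase(Enum):
    APPROACH = auto()  # (A) close fingers until initial compression force is reached
    DETACH = auto()    # (B) pull at constant speed while gripping
    HOLD = auto()      # (C) relax to a maintenance force after separation

def harvest_step(phase, grip_force_n, detached,
                 f_initial=2.0, pull_speed_m_s=0.010):
    """One control tick: returns (next_phase, commanded_pull_speed).
    Force values (N) are hypothetical placeholders for illustration."""
    if phase is Phase.APPROACH:
        # advance to detachment once the initial compression force is applied
        if grip_force_n >= f_initial:
            return Phase.DETACH, 0.0
        return Phase.APPROACH, 0.0
    if phase is Phase.DETACH:
        # maintain a constant 10 mm/s pull until separation is sensed
        if detached:
            return Phase.HOLD, 0.0
        return Phase.DETACH, pull_speed_m_s
    # fruit is free: keep only a gentle maintenance grip, no pulling
    return Phase.HOLD, 0.0
```

In a real controller the `detached` flag would come from the force signature of fruit separation; here it is a plain boolean input.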
After training in the laboratory, the system, consisting of a UR5 robotic arm and the developed end-effector (see Figure 4b), was deployed in field conditions. As reported by Junge et al. [5], experimental verification on a sample of twenty fruits demonstrated that the system, without additional calibration, harvested 56% of raspberries classified as undamaged and 24% as minimally damaged, an overall rate of 80% classified as ‘minor-to-no-damage’. This result supports the validity of the ‘Lab2Field’ methodology. The main limitation identified was the system’s reduced ability to manipulate dense fruit clusters.
Fieldwork Approach to Robotic Raspberry Harvesting
Fieldwork Robotics has developed a commercial robotic system designed for harvesting soft fruits, with raspberries being the primary focus. Despite the absence of information regarding the project in peer-reviewed scientific publications, the creators have furnished details in media outlets such as The Guardian [10,11]. The technical analysis is, therefore, based on these sources and publicly available patent applications [12]. The system architecture has been designed to operate on a mobile platform capable of carrying and coordinating one or more independent robotic arms. This multi-arm configuration facilitates the execution of multiple harvesting operations in a concurrent manner. The end-effector functions on the principle of minimising direct contact with the surface of the fruit, thereby preventing mechanical damage during handling.
As articulated in the Sauerwald et al. patent [12], the fundamental principle of the end-effector (Figure 6) entails the utilisation of a pliable, inflatable membrane situated within the internal volume of the end-effector. This membrane, typically ring-shaped, is sealed at its upper and lower edges, thereby delineating the inflatable volume and the central opening. In the deflated state, the aperture is sufficiently spacious to permit the fruit to traverse into the internal volume. Increasing the volume of the membrane through the process of filling it with a fluid, such as air, results in narrowing of the central opening, thereby providing a gentle yet secure hold on the fruit. In order to ensure a controlled grip, the membrane may be formed around its circumference with areas of varying resistance to inflation. In comparison, areas of lesser resistance undergo a greater degree of deformation when subjected to inflation, resulting in the formation of inward-facing protrusions. These protrusions serve a dual function by gripping the fruit and contributing to its centring. The pressure within the membrane can be monitored by a pressure sensor, which provides feedback on the strength of the grip.
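The grip-pressure feedback implied by this design can be sketched as a simple proportional inflation loop that raises the commanded membrane pressure until the sensor reads a target grip pressure, capped at a safety limit. Units (kPa), gains, and limits are assumptions for illustration, not figures from the patent [12]:

```python
def inflate_step(p_measured, p_target, p_current_cmd, gain=0.3, p_max=25.0):
    """One control step of membrane inflation with pressure feedback.

    p_measured: pressure reported by the membrane's sensor (kPa)
    p_target:   desired grip pressure (kPa)
    p_current_cmd: current commanded inflation pressure (kPa)
    Returns the new commanded pressure, clamped to [0, p_max].
    All values are hypothetical placeholders, not patent parameters.
    """
    error = p_target - p_measured
    cmd = p_current_cmd + gain * error
    return max(0.0, min(cmd, p_max))
```

Run repeatedly, the commanded pressure converges on the target while the cap prevents over-inflation that could bruise the fruit.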
The patent further delineates numerous methodologies for the separation of the fruit from the plant. One such method involves two-stage inflation of a single membrane. In the initial stage, the upper part of the membrane is inflated to grasp the stem. In the subsequent stage, the lower part of the membrane is inflated to press down on the fruit and dislodge it. Alternatively, a second, separate inflatable membrane located beneath the first can be utilised to perform this pushing function. Another variant described combines an inflatable membrane with a mechanical gripping assembly. The assembly under consideration comprises an upper and lower part, with flexible elements, such as elastic strings, positioned between these two components. The rotation of these components relative to each other results in the strings intersecting and securing the stem of the fruit. The fruit is then detached by creating a relative linear movement between the assembly holding the stem and the membrane holding the fruit. This allows the fruit to be separated without the need to move the entire robotic arm and with less disturbance to the plant [12].
In order to facilitate functionality, the end-effector has been equipped with supplementary components. The outer surface of the fruit cluster is bevelled to facilitate the isolation of a single target fruit from the others during the approach. An integrated collection tray is positioned beneath the picking mechanism, enabling the collection of multiple fruits prior to delivery to the collection point. This approach serves to reduce the overall duration of the picking cycle. The apparatus is equipped with an opening door at the base, facilitating the extraction of its contents. The targeting and positioning of the end-effector is ensured by position sensors [12].
The robot’s perception system utilises 3D cameras and machine learning algorithms. The primary cameras, situated on a mobile platform, detect the presence of fruit and provide approximate navigation for the arms. In addition, supplementary camera pairs located at the extremities of each arm facilitate precise determination of the 3D position of the target fruit using triangulation. As articulated by David Fulton, the Chief Executive Officer of Fieldwork Robotics, in an interview with The Guardian, the integration of spectral analysis for the purpose of ripeness assessment is the pivotal inherent innovation in the Fieldworker 1 project. The system analyses the spectral frequency of light reflected from the surface of the raspberry, with different degrees of ripeness manifesting themselves in specific spectral signatures [11].
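The triangulation step can be illustrated with the standard rectified-stereo depth relation Z = f·B/d; the focal length and baseline below are assumed example values, since the system's actual calibration is not published:

```python
def triangulate_depth(disparity_px, focal_px, baseline_m):
    """Depth of a matched point from a rectified stereo pair.

    disparity_px: horizontal pixel offset of the fruit between the two views
    focal_px:     camera focal length in pixels
    baseline_m:   distance between the two camera centres in metres
    Returns depth Z in metres via Z = f * B / d.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

For example, with an assumed 600 px focal length and 5 cm baseline, a 50 px disparity places the fruit about 0.6 m from the cameras; smaller disparities mean farther fruit.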

3.1.2. Research Conducted for Visual Detection and Assessment of Raspberry Ripeness

In the field of raspberry detection and localisation, Junge, Pires and Hughes [5] presented a method of localising raspberries using a visual servoing controller. This allows a robotic system to accurately determine the fruit’s position and direct the gripper to grasp it correctly. The authors used a combination of classic computer vision methods and a multi-sensor camera system consisting of an Oak-D stereo camera (Luxonis, Denver, CO, USA), a Raspberry Pi camera positioned between the gripper jaws, and a time-of-flight (ToF) sensor to detect the fruit. The algorithm processes the image in HSV colour space, isolating the pink-red areas corresponding to raspberries using hue thresholding. After smoothing the image, a circular Hough transform (CHT) is applied to estimate the centre and radius of each fruit. Based on this information, the robot calculates the fruit’s exact position in space and plans an approach trajectory that ensures the centre of the camera overlaps with the centre of the raspberry. The detection system works in several steps. First, the robot identifies the fruit at a distance of approximately 200–400 mm. Then, it performs horizontal and vertical alignment so that the raspberry is in the centre of the field of view. Finally, it moves forward to a distance of approximately 100 mm from the fruit. The final correction is performed using a ToF sensor and a camera positioned between the gripper jaws. These two components work together to ensure fine tuning of the position before gripping. A flowchart of this detection and alignment procedure is provided in Figure 7. In laboratory experiments, the robot demonstrated high guidance accuracy. Following alignment, the centre of the gripping mechanism was, on average, 2.9 mm (X-axis) and 3.1 mm (Y-axis) from the centre of the fruit, with a maximum deviation of no more than ±10 mm. 
This level of accuracy was also confirmed in field conditions, with virtually identical results to those obtained in the laboratory.
Although Junge, Pires and Hughes [5] used classical methods, developing more robust systems based on deep learning requires large, annotated datasets. To create such a publicly available resource, Strautina et al. [13] introduced the RaspberrySet dataset. Strautina et al.’s dataset was created to support the development and training of algorithms designed to detect and locate raspberry fruits at various stages of fruit development in orchards, as shown in Figure 8. The work aimed to create a standardised, professionally annotated dataset to enable the application of deep learning methods to the automated identification of raspberry phenological stages and assessment of ripeness. The dataset contains 2039 colour images (RGB) with a resolution of 1773 × 1773 px, taken under real conditions at the Institute of Horticulture (LatHort) in Dobele, Latvia. Each image was manually annotated by experts using Label Studio software (version 1.7.1; HumanSignal, San Francisco, CA, USA), with a total of 46,659 objects annotated. Annotations are processed in the You Only Look Once (YOLO) format, which is ideal for single-phase object detection models, enabling the simultaneous localisation and classification of objects within an image. Figure 8 shows the detection results based on the created dataset. The dataset distinguishes between five phenological categories: buds; damaged buds; flowers; unripe berries; and ripe berries. Thus, the dataset enables the identification and classification of fruits according to their stage of development, providing an indirect assessment of ripeness. To ensure that the images are representative of real-world agricultural conditions, they were captured from different angles (45°, 90°, 120°) and under varying lighting conditions (sunny, cloudy, partly cloudy).
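Annotations in the YOLO format described above are plain-text files with one line per object: a class index followed by a bounding box normalised to [0, 1]. A minimal parser is shown below; the class-index ordering is an assumption based on the five categories listed and should be checked against the dataset's own metadata:

```python
# Assumed index order for RaspberrySet's five categories [13];
# verify against the dataset's class file before use.
RASPBERRYSET_CLASSES = ["bud", "damaged bud", "flower",
                        "unripe berry", "ripe berry"]

def parse_yolo_label_line(line, class_names):
    """Parse one YOLO annotation line:
    '<class_id> <x_center> <y_center> <width> <height>'
    with coordinates normalised to the image size."""
    parts = line.split()
    class_id = int(parts[0])
    x, y, w, h = map(float, parts[1:5])
    return {"class": class_names[class_id],
            "x_center": x, "y_center": y,
            "width": w, "height": h}
```

Because the boxes are normalised, the same label file serves the image at any resolution; multiplying by the pixel dimensions recovers absolute coordinates.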
Jafary et al. [14] presented the Raspberry PhenoSet dataset in their study. This dataset was created with the aim of enabling the automated localisation of raspberry fruits and the simultaneous determination of their phenological (developmental) stage, i.e., their ripeness. The authors concentrated on establishing a correlation between biologically relevant classification and object detection tasks, with a view to facilitating more accurate yield prediction and harvest planning in automated agriculture. The dataset was created at Toronto Metropolitan University’s vertical farm, where 1853 high-resolution images (5184 × 3456 px) were captured, encompassing a total of 6907 manually annotated instances of raspberries, flowers, and buds. Each object was labelled using mask annotation (polygon) and classified according to seven phenological stages (A–G), corresponding to the BBCH development scale: from bud (A), through open flower (B), beginning of fruit formation (C), green fruit (D), yellowing fruit (E), semi-ripe pinkish fruit (F) to fully ripe red fruit (G). Each class is associated with the average number of days remaining until harvest, thus enabling the harvest date and total yield to be estimated. In order to evaluate the dataset, the authors employed a range of contemporary detection and segmentation models, including YOLOv8, YOLOv10, RT-DETR, Faster R-CNN, and Mask R-CNN. These models were tested across a variety of architecture sizes, such as ResNet-50, ResNet-101, and others. The YOLOv8-x model achieved the best results, with Precision = 0.721, Recall = 0.668, mAP@0.5 = 0.717, mAP@[0.5:0.95] = 0.548, and F1 = 0.693. The most readily identifiable classes were the early and late stages (A—buds and G—ripe fruits), while the middle stages (C–E) exhibited a higher degree of confusion due to minor variations in colour and shape.
The confusion matrix demonstrated that the network primarily confuses adjacent phenological stages, which corresponds to the smooth biological transition between the individual stages of ripeness.
In their study, Ling et al. [15] developed the HSV Self-Adaption YOLOv5 (HSA-YOLOv5) method, which is a modified version of the well-known YOLOv5s detection model. The aim of the study was to automatically locate raspberries and determine their degree of ripeness (immature, nearly ripe, ripe) even under different lighting conditions. The primary issue with the original model pertained to the minimal colour difference between nearly ripe and ripe raspberries, which posed a challenge to the network’s learning process. In order to address this shortcoming, the authors converted the image from RGB (Red-Green-Blue) to HSV (Hue-Saturation-Value) colour space and performed an improved transformation. Utilising lookup tables (LUTs), the original values of the H, S, and V components were recalculated, resulting in new values that led to an enhancement in the contrast of analogous colours while preserving the image’s natural appearance. Simultaneously, the appearance of the fruits was uniform across diverse lighting conditions, thereby enabling the network to acquire consistent data for training. For the purpose of training the network for classification and ripeness assessment, 1563 samples were utilised. Of these samples, 593 were immature, 187 were near-ripe, and 783 were ripe berries. The findings of the experiments demonstrated that the proposed HSA-YOLOv5 model attained an average accuracy of mAP = 0.97, signifying an enhancement of 6.42% over the original YOLOv5 model. For unripe fruits, the value of AP is 0.95, for near-ripe fruits it is 0.97, and for ripe raspberries it is 0.99.
Luo, Ding and Wang [16] presented a methodology for raspberry detection and ripeness assessment based on an enhanced YOLOv11n model, optimised through three novel modules: HCSA (Hybrid Channel-Spatial Attention), DWR (Dilation-Wise Residual) and DySample (Dynamic Sampling). The objective was precise fruit localisation combined with ripeness classification (unripe/ripe) in real-world conditions, covering both greenhouse and open-field settings characterised by overlapping leaves, intricate backgrounds, and variable lighting. The architecture was derived from YOLOv11n, with the HCSA attention module integrated into the neck; it combines halo, channel, and spatial attention, enabling the network to differentiate areas of interest more effectively and suppress background influence, such as leaves of similar colour. The DWR module employs convolutions with varying dilation factors, enhancing the recognition of fruits of diverse sizes and partially obscured objects, while DySample dynamically adjusts the spatial resolution, facilitating more precise reconstruction of the fruit shape even under scale or lighting variations. The network was trained on a proprietary dataset of 3167 images containing manually annotated raspberries in two categories (ripe and unripe) and performs both fruit localisation and ripeness classification in a single step. On the test set, the model achieved mAP@0.5 = 0.934, mAP@0.5–0.95 = 0.798, and an F1 score of 0.890, a 2.9% improvement over the original YOLOv11n model; per-class recognition accuracy reached 0.925 for unripe and 0.943 for ripe fruit.
Zhang et al. [17] concentrated on the detection of raspberries and their stems with the objective of determining the optimal cutting point for robotic harvesting. They presented a refined visual system based on the YOLOv8n model, to which an additional detection layer for small objects was added, together with a Container (Context Aggregation Network) and a Context Anchor Attention (CAA) module. These modifications enhance feature extraction and detection accuracy under varying lighting conditions. The model was created in two versions: YOLOv8n-day for daytime images and YOLOv8n-night for nighttime images. Both versions utilise data from an Intel RealSense D405 RGB-D camera, which also provides depth information for calculating the 3D coordinates of the cutting point. The YOLOv8n-day model demonstrated a precision of 0.810, a recall of 0.807, and an mAP@0.5 of 83.6%, while the YOLOv8n-night model exhibited a precision of 0.884, a recall of 0.907, and an mAP@0.5 of 93.4%. The proposed solution offers real-time detection of fruits and stems with a high degree of accuracy and resilience to variable light conditions. The system is currently undergoing field testing in a raspberry plantation, providing a practical environment for evaluating its performance under real-world conditions. Its disadvantages are higher computational complexity in comparison to the basic YOLOv8n model and the necessity of separate training for day and night environments.

3.2. Navigation and Detection Analysis

Machine vision allows robotic systems to acquire visual information via sensors that convert images into digital signals based on pixel distribution and attributes such as brightness and colour. The imaging system then extracts features of the objects and, based on the recognition results, generates control signals for the robot’s actuators [18]. Fruit-harvesting robots detect and localise fruit using visual sensors such as binocular cameras [19], laser-based systems [20], Kinect-type sensors [21], multispectral cameras [22] and other imaging modalities.
In recent years, visual systems in harvesting robots have often relied on a combination of machine learning methods and traditional image processing techniques such as filtering, thresholding and morphological operations, which are widely used to detect various types of fruit [23].
In the early stages, classical image processing techniques and statistical approaches were employed for fruit ripeness assessment. These methods primarily relied on manually defined image features such as colour, shape, and texture. Among the most commonly applied techniques were Principal Component Analysis (PCA), cluster analysis, and discriminant analysis. For instance, Kienzle et al. [24] utilised a combination of PCA and cluster analysis for mango ripeness evaluation, while Mohammadi et al. [25] achieved an accuracy of 0.9024 in classifying persimmon ripeness using Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA). Although these approaches were computationally efficient, their performance was highly dependent on input data quality and the manual selection of features.
With the advancement of computational technologies, machine learning methods began to emerge, enabling models to learn from data and improve their decision-making capabilities without explicit rule-based programming. This category includes algorithms such as Support Vector Machines (SVMs) [26], Decision Trees [27], and Random Forests [27]. These algorithms demonstrated improved discrimination among ripeness stages but still exhibited limited generalisation ability and required manual feature extraction.
The next stage of development was represented by deep learning, a subset of machine learning. This group of methods employs deep neural networks capable of autonomously learning to recognise and extract complex features directly from raw image data, eliminating the need for manual feature design or predefined rules. As a result, the accuracy and stability of the obtained results increased significantly [28,29]. This approach enables what is known as end-to-end processing, which covers the entire workflow from the input image to the final classification or object detection output. Commonly used architectures include Convolutional Neural Networks (CNNs), YOLO models, Mask R-CNN, and more recent approaches based on Vision Transformers (ViTs) [30,31]. Owing to their ability to automatically extract features and achieve a high level of generalisation, these methods exhibit exceptional accuracy and robustness, making them the dominant approach in recent years for detecting small and delicate fruits under real-world conditions.
At present, fruit ripeness detection is primarily dominated by YOLO models, which combine high detection accuracy with real-time processing capability. This concept was first introduced by Redmon et al. [32] in 2016. YOLO models enable image analysis in real time, making them particularly suitable for implementation in harvesting robots. The most widely applied versions include YOLOv5, YOLOv8, YOLOv11, and their derivatives [30,33,34]. The YOLO architecture can be divided into three main components: the backbone, the neck, and the head [35], as illustrated in Figure 9. The backbone, typically built via transfer learning from models pre-trained on large datasets, relies on convolutional blocks (YOLOv1 to YOLOv8, for example Darknet, CSP, or C2f) or attention mechanisms (YOLOv9 to YOLOv12), enabling the extraction of key image features at multiple scales. The neck further refines these features by enhancing spatial and semantic representations, and finally, the head utilises the refined features to perform the actual object detection [35].
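To make the head’s role concrete, the sketch below shows the classic anchor-based box decoding used by early YOLO versions (v2/v3): per-cell offset predictions are turned into image-space boxes via sigmoid offsets and anchor scaling. Later versions alter the offset scaling or drop anchors entirely, so treat this as an illustrative scheme rather than the exact formula of any model discussed above:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_cell(tx, ty, tw, th, cx, cy, stride, anchor_w, anchor_h):
    """Decode one grid-cell prediction into an image-space (x, y, w, h) box.

    (tx, ty): raw centre offsets; (tw, th): raw size logits;
    (cx, cy): grid-cell indices; stride: pixels per cell;
    anchor_*: prior box size in pixels (anchor-based variants only).
    """
    x = (sigmoid(tx) + cx) * stride          # centre x in pixels
    y = (sigmoid(ty) + cy) * stride          # centre y in pixels
    w = anchor_w * math.exp(tw)              # width scaled from the anchor
    h = anchor_h * math.exp(th)              # height scaled from the anchor
    return x, y, w, h

# Zero logits place the box centre in the middle of its grid cell,
# at exactly the anchor's size.
box = decode_cell(0, 0, 0, 0, cx=3, cy=5, stride=16, anchor_w=32, anchor_h=32)
# → (56.0, 88.0, 32.0, 32.0)
```

The sigmoid constrains each predicted centre to its own cell, which is what makes the single-pass, whole-image design of YOLO stable to train.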

3.2.1. Evaluation Metrics

In the evaluation of machine vision model performance, various metrics are used to describe the reliability and capability of a model to correctly recognise or localise objects. The choice of specific metrics depends on the type of task, such as classification, detection, or segmentation. Although Accuracy provides an overall measure of correct predictions, it may be misleading in class-imbalanced datasets, where the majority class dominates the results. Therefore, additional metrics are used to provide a more balanced assessment. The fundamental metrics include Precision (Equation (1)) and Recall (Equation (2)). Precision represents the proportion of correctly identified positive cases among all cases predicted by the model as positive. Recall indicates the proportion of correctly recognised positive cases among all actual positive cases. Because these two measures capture different aspects of model performance, the F1 score (Equation (3)) is used to combine them into a single indicator. It represents the harmonic mean of Precision and Recall and provides a balanced evaluation, especially in cases of class imbalance where high Precision may coincide with low Recall or vice versa.
In object detection tasks, the Mean Average Precision (mAP) metric (Equation (4)) is frequently used. This metric represents the mean of the Average Precision (AP) values across all classes and is commonly applied for comparing multi-class models. The Average Precision (Equation (5)) is derived from the Precision–Recall curve for each class and corresponds to the area under this curve.
Precision = TP / (TP + FP)    (1)
where TP denotes true positives and FP denotes false positives.
Recall = TP / (TP + FN)    (2)
where FN represents false negatives.
F1 = 2 · Precision · Recall / (Precision + Recall)    (3)
mAP = (Σ_{i=1}^{k} AP_i) / k    (4)
where k is the number of object classes and AP_i is the Average Precision for the i-th class.
AP = Σ_{i=1}^{N-1} (r_{i+1} - r_i) · P_inter(r_{i+1})    (5)
where r_i represents the recall value and P_inter(r_{i+1}) the interpolated precision at the corresponding recall level.
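Equations (1)–(5) translate directly into code. The following minimal sketch computes each metric from raw counts; the TP/FP/FN counts and precision–recall points used at the end are made-up values purely for illustration:

```python
def precision(tp, fp):
    """Equation (1): correct positives among all predicted positives."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Equation (2): correct positives among all actual positives."""
    return tp / (tp + fn)

def f1(p, r):
    """Equation (3): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

def average_precision(pr_points):
    """Equation (5): sum of (r_{i+1} - r_i) * interpolated precision.

    pr_points: (recall, precision) pairs sorted by increasing recall.
    The interpolated precision at r is the maximum precision achieved
    at any recall >= r, which makes the PR curve monotone.
    """
    ap = 0.0
    for i in range(len(pr_points) - 1):
        r_i, _ = pr_points[i]
        r_next, _ = pr_points[i + 1]
        p_inter = max(p for r, p in pr_points if r >= r_next)
        ap += (r_next - r_i) * p_inter
    return ap

def mean_average_precision(aps):
    """Equation (4): mean of the per-class AP values."""
    return sum(aps) / len(aps)

# Illustrative counts: 90 true positives, 10 false positives, 30 misses.
p, r = precision(tp=90, fp=10), recall(tp=90, fn=30)   # 0.90, 0.75
score = round(f1(p, r), 3)                             # → 0.818
```

Note that high precision (0.90) and lower recall (0.75) combine into an F1 below both, which is exactly why the harmonic mean is preferred over a simple average for imbalanced results.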

3.2.2. The References Used

In the section focused on machine vision, a total of 31 publications published between 2020 and 2025 were analysed and compared. These studies primarily addressed the detection and localisation of raspberries but, due to the limited number of available works, also included other small and fragile fruits such as strawberries, blueberries, mulberries, and blackberries. The selection of these fruits was based on their morphological similarity and related physical characteristics. The results were organised into several topic-specific tables according to the type of analysed fruit. This structure enables a clearer comparison of individual methods and their performance in detecting and classifying fruits with similar characteristics. Each table summarises publications focused on a specific fruit type: Table 1 presents studies on raspberries, Table 2 summarises results from publications on blueberries, Table 3 includes studies concerning strawberries, and Table 4 lists works focused on blackberries and mulberries.
Each table contains the key parameters of the models used, particularly the type of architecture (for example, YOLO, Mask R-CNN, or other variants of convolutional neural networks), along with selected performance evaluation metrics. These include primarily mAP@0.5, Precision, Recall, F1-score, and Accuracy (classification accuracy), and in some cases also frames per second (fps), which expresses the real-time detection speed of the model.
The values of these metrics were taken directly from the respective publications and are presented in a unified format to enable comparison across studies. In cases where certain metrics were not reported, the corresponding table cells are left empty or marked with a dash. This approach reflects the considerable variability in evaluation methodologies among studies, which complicates direct performance comparison but also illustrates the broad range of current research approaches.
The tables provide a comprehensive overview of the current state of research in machine vision for the detection and classification of small fruits, allowing performance comparison of individual models both in terms of accuracy and processing speed.

3.2.3. Comparison of the Analysed Studies

Subsequently, the publications were analysed, and the approaches presented in these studies were compared according to the methodological categories applied for fruit detection and recognition. Within this framework, the currently applied approaches for raspberries and other small, delicate fruits are evaluated, along with their suitability for application to raspberry detection.
Deep Learning-Based Methods—CNN-Based Classification
Classification methods based on Convolutional Neural Networks (CNNs) identify visual features of objects and assign them to predefined categories [62]. In fruit evaluation, they are primarily applied to determine ripeness or quality based on colour and texture features, usually on pre-cropped or segmented images [63]. In the study by Olisah et al. [28], a multi-input convolutional neural model based on the VGG16 architecture was employed for classifying blackberry ripeness under natural conditions. The model processed bispectral images captured at wavelengths of 700 nm and 770 nm, which were input into two parallel branches of the neural network. The outputs of both branches were subsequently merged and further processed by fully connected layers. This method demonstrated high accuracy (Table 4) and robust classification performance. A major drawback of this approach lies in the absence of a detection component, the relatively small training dataset, and the increased demands on imaging hardware, which may consequently raise the overall system cost. Conceptually, this model would be suitable for assessing raspberry ripeness, as the bispectral approach combined with the VGG16 convolutional network can distinguish subtle spectral differences between ripeness stages. However, practical deployment would require retraining on raspberry data, recalibration of sensors, and integration of a detection component to enable operation in natural environments with overlapping fruits. A similar approach was presented by Miraei Ashtiani et al. [60], who developed a system for classifying mulberry ripeness stages using convolutional neural networks with transfer learning. The authors compared five architectures (AlexNet, ResNet-18, ResNet-50, Inception-v3, and DenseNet) for distinguishing four ripeness classes (unripe, semi-ripe, ripe, overripe) separately for white and black mulberries as well as in a combined dataset.
The images were acquired in a controlled lighting chamber, and the background was segmented using colour-based techniques (most effectively in the YCbCr colour space). The best performance was achieved by AlexNet for white mulberries (accuracy 98.32%) and ResNet-18 for black mulberries (accuracy 98.65%). A major advantage of this approach was its low hardware requirements, as it relied only on a standard RGB camera, which also reduced overall implementation costs; however, the experiments were conducted under controlled laboratory conditions.
Deep Learning-Based Methods—Image Segmentation
Segmentation methods based on deep learning divide an image into regions corresponding to individual objects or their components [62,64]. Through pixel-wise classification, these approaches enable precise fruit boundary delineation and provide more detailed spatial information than conventional classification methods [64]. Ilyas et al. [47] introduced a hierarchical adaptive segmentation method for strawberries (Straw-Net), based on a convolutional encoder–decoder network featuring a newly designed modular attention mechanism, the Dense Attention Module (DAM), and Parallel Dilated Convolution (PDC). The architecture employs an SE-ResNet backbone and integrates both spatial and channel attention to achieve more accurate feature fusion between the encoder and decoder. The Straw-Net model performs pixel-level image segmentation, where each fruit is precisely outlined by a segmentation mask, rather than detected by bounding boxes, as in YOLO or Faster R-CNN models. The main advantage of this approach lies in its high segmentation accuracy (Table 3) and its ability to recognise different ripeness stages under real field conditions. However, the use of multiple attention modules increases computational complexity, which may consequently raise the overall system cost. Cai et al. [52] proposed an improved segmentation model, DeepLabV3+, designed to assess strawberry ripeness based on colour analysis. Their approach incorporates a dual attention mechanism that combines ECA-SimAM (in the Xception backbone) and CBAM (in the ASPP module), enhancing the network’s ability to distinguish subtle visual differences between ripeness stages while suppressing background interference from leaves or shadows. Although this method achieved slightly lower segmentation accuracy compared with Ilyas et al. [47] (Table 3), it retained the ability to distinguish ripeness levels under realistic greenhouse conditions, but at the expense of higher computational complexity due to the use of dual attention mechanisms.
The approach by Ilyas et al. [47] could be suitable for raspberry detection due to its strong background suppression capability achieved through the DAM attention mechanism. Similarly, the solution proposed by Cai et al. [52], which applies deep learning-based segmentation with attention mechanisms including the Convolutional Block Attention Module (CBAM) and a combined Efficient Channel Attention and Simple Attention Module (ECA-SimAM), could help in differentiating raspberries from leaves or background under variable lighting conditions. However, both models would likely experience a drop in accuracy when applied to raspberries because of their distinct surface texture and frequent occlusions. Raspberries have a complex morphology composed of small drupelets forming an uneven, glossy surface and often appear in overlapping clusters. This could cause segmentation errors, as the model might classify entire clusters as a single object instead of separating individual fruits. Successful adaptation would therefore require retraining on a raspberry-specific dataset and the addition of modules capable of 3D depth perception or enhanced spatial attention to improve object separation accuracy.
Deep Learning-Based Methods—Image Detection and Segmentation
Methods that combine detection and segmentation integrate object localisation with precise boundary delineation within a unified framework [62]. Models such as Mask R-CNN allow simultaneous object recognition and mask generation, providing detailed information about the number, position, and shape of detected objects. Ni et al. [36] (Table 2) presented a method for the detection, segmentation, and evaluation of blueberry characteristics using deep learning based on the Mask R-CNN architecture with a ResNet-101 backbone and a Feature Pyramid Network (FPN) neck. The model processes RGB images of four blueberry varieties and simultaneously determines fruit count, ripeness, and cluster compactness. Ripeness classification is based on hue values, which enables an objective distinction between ripe (blue) and unripe (green or red) fruits. The correlation between the detected and actual number of fruits reached R2 = 0.886. Mask R-CNN combines detection using bounding boxes with pixel-level segmentation masks, allowing both localisation of fruits and accurate contour extraction. This represents a major advantage for raspberries, which frequently overlap. The network is capable of distinguishing individual fruits within a cluster even under partial occlusion or varying illumination. In addition, the use of an FPN allows recognition of objects of different sizes, making the model adaptable to smaller fruits such as raspberries.
Pérez-Borrero et al. [45] (Table 3) proposed an improved instance segmentation approach for strawberries, also based on Mask R-CNN, intended for use in automated harvesting systems. The model replaces the original Mask R-CNN components with a lightweight architecture featuring a newly defined backbone, region proposal network (RPN), and masking branch, which reduces computational complexity and thereby lowers overall computational and hardware costs while maintaining high accuracy. The classification branch and bounding box regressor were removed, resulting in a model that performs precise localisation of individual fruits without multiclass classification. Although this model could detect and segment raspberries, its performance would likely be insufficient without further adaptation. The system focuses exclusively on detection and segmentation and does not include classification by ripeness or other fruit parameters.
Tang et al. [53] introduced a method for fine-grained strawberry ripeness recognition under field conditions, combining deep learning and image processing. In the first stage, an enhanced Mask R-CNN with Self Calibrated Convolutions (SCNet50) is used for fruit detection, expanding the receptive field and improving segmentation accuracy under partial occlusion. In the second stage, each detected fruit is divided into four subregions, and colour features are extracted from five channels: B (blue) and G (green) of the RGB colour model, L (lightness) and a (the green–red axis) of the Lab colour model, and S (saturation) of the HSV colour model. These features are then used to classify ripeness into six levels (White, Breaking, Turning 1, Turning 2, Ripe, and Fully Ripe) using an SVM classifier. This approach could also be suitable for detecting raspberries in field environments, as the Mask R-CNN model with Self Calibrated Convolutions enhances segmentation accuracy under partial occlusion and variable lighting. The model’s robustness to complex backgrounds is particularly advantageous for raspberries growing in dense vegetation. Overall, this method demonstrated high accuracy (Table 3), although a moderate reduction in processing speed was observed, with a detection and classification rate of 18.2 fps.
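The colour-feature stage of such a two-stage pipeline can be sketched briefly. This simplified version computes only mean B, G, and HSV-saturation features per subregion (omitting the Lab channels used by Tang et al. [53], since Lab conversion is not in the Python standard library), and the pixel values are invented for illustration:

```python
import colorsys

def colour_features(pixels):
    """Mean-channel feature vector for one fruit subregion.

    pixels: iterable of (r, g, b) tuples in 0-255. A simplified stand-in
    for the five channels used in the two-stage pipeline: here only the
    mean B, mean G, and mean HSV saturation are computed.
    """
    n = 0
    b_sum = g_sum = s_sum = 0.0
    for r, g, b in pixels:
        _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        b_sum += b / 255
        g_sum += g / 255
        s_sum += s
        n += 1
    return (b_sum / n, g_sum / n, s_sum / n)

# Red (ripe-looking) vs near-white subregions yield clearly separated
# saturation features, which a downstream classifier (e.g., an SVM)
# can use to assign a ripeness level.
ripe = colour_features([(200, 30, 40), (190, 25, 35)])
white = colour_features([(230, 225, 220), (240, 235, 230)])
```

In practice these vectors would be computed for each of the four subregions of a detected fruit and concatenated before classification.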
Deep Learning-Based Methods—Object Detection and Classification Using YOLO Architecture
Models based on the YOLO architecture are designed for fast object detection in real time [35]. Unlike classification or segmentation methods, they process the entire image at once and simultaneously classify and localise each object. Due to their high processing speed and the continuous evolution of successive versions, these models achieve strong accuracy even when detecting small objects and have found wide application in agricultural systems for monitoring and automation [65].
  • YOLOv3-Based Methods
The first approaches identified in the analysed publications employed the YOLOv3 architecture, representing the initial phase of applying YOLO models for fruit detection and ripeness evaluation. Yu et al. [46] proposed an improved model, R-YOLO (Rotated YOLO), based on the lightweight MobileNet V1 backbone, designed for the localisation of strawberry fruits and precise identification of cutting points on stems during harvesting in ridge planting systems. The model achieved high accuracy (precision 94.43%, recall 93.46%, F1 score 93.94%) at a processing speed of 18 frames per second (Table 3), demonstrating the suitability of lightweight architectures for real-time applications. Hu et al. [31] further enhanced the YOLOv3 architecture by integrating it with Mask R-CNN and a ZED 3D stereo camera (StereoLabs, San Francisco, CA, USA), which enabled more accurate spatial localisation and segmentation of strawberries. Their method achieved an average accuracy of 93.9% (Table 3). These studies mark the beginning of a trend in which YOLOv3 serves as the detection component within multi-stage systems that combine detection, segmentation, and spatial localisation of objects.
  • YOLOv4-Based Methods
The next generation of YOLO models focused on improving computational efficiency and spatial detection accuracy. Haydar et al. [38] employed the YOLOv4-tiny model in combination with an OAK-D depth camera and DepthAI API/software (Luxonis, Denver, CO, USA) for blueberry detection and subsequent estimation of fruit height above the ground. The proposed approach achieved an accuracy of 86.5% (Table 2) with significantly lower computational requirements, confirming the suitability of simplified YOLOv4 variants for systems operating on limited hardware. He et al. [51] developed a two-stage system for the localisation and recognition of strawberries under real field conditions to support precise robotic harvesting. The design combined YOLOv4, which detects strawberries from RGB images and classifies them into five ripeness categories, with YOLOv4-tiny integrated with a ZED2 depth camera to calculate 3D coordinates. The model achieved 80.68% accuracy for detection and 86.45% for localisation (Table 3). Although this generation introduced improvements in handling depth data, its speed and generalisation capabilities remained limited. A notable drawback of this system is its higher computational complexity and the requirement for a depth camera to accurately determine the fruit position, which together contribute to increased hardware and implementation costs.
  • YOLOv5-Based Methods
The introduction of YOLOv5 marked a significant expansion in the application of detection models in agriculture. An et al. [48], Fan et al. [49], Lemsalu et al. [50], Lawal [56], and Xie et al. [58] all utilised YOLOv5 for strawberry detection. An et al. [48] proposed an enhanced detection model, Strawberry Detect Net (SDNet), based on the YOLOv5 architecture. The model integrates a C3HB module combining standard convolution with a Horblock structure to improve spatial interaction and feature extraction. The approach achieved a detection accuracy of 93.15% at a processing rate of 30.5 frames per second (Table 3). Lemsalu et al. [50] designed a system employing an RGB-D Intel RealSense D435 camera (Intel Corporation, Santa Clara, CA, USA) that captures images from different distances and angles, while the YOLOv5 algorithm detects ripe and unripe strawberries and their stems in real time. The model was trained on 654 images and optimised for embedded operation on an NVIDIA Jetson AGX Xavier (NVIDIA Corporation, Santa Clara, CA, USA) using the TensorRT library. Their approach achieved 89% accuracy at a rate of 45 frames per second (Table 3). Xie et al. [58] introduced an autonomous robot for harvesting strawberries cultivated on ridge beds, equipped with dual manipulators and a vision system based on YOLOv5s and YOLOv5s-seg models. YOLOv5s was used for ripe fruit detection, and for each detected fruit, an expanded region of interest was created, within which the YOLOv5s-seg model performed instance segmentation of the stem. Based on the shape of the segmented stem, the optimal cutting point was calculated. The detection model achieved 97.0% accuracy, while the segmentation model reached 92.5% (Table 3). Yang et al. [37], Xiao et al. [39], and Liu et al. [42] applied YOLOv5 for blueberry detection. Xiao et al. [39] integrated a lightweight ShuffleNet module and the CBAM attention mechanism, significantly improving performance (precision 96.3%, recall 92%, F1-score 94.12%) and achieving a processing speed of 67.1 frames per second (Table 2). Ling et al. [15] developed the HSA-YOLOv5 method for automatic raspberry fruit localisation and ripeness classification (immature, nearly ripe, ripe) under varying lighting conditions. According to Ling et al. [15], the model achieved a mean average precision mAP@0.5 of 97% (Table 1).
  • YOLOv7-Based Methods
Zhang et al. [59] introduced the YOLOv7-base model designed for detecting the ripeness of blackberry fruits. The model achieved high performance metrics, with precision of 91.4%, recall of 89%, F1-score of 90%, and overall accuracy of 86% (Table 4). The YOLOv7 architecture provides improved information exchange between different network layers and more efficient weight optimisation compared to earlier versions, contributing to greater training stability. The model successfully recognised fruits even under partial occlusion or varying illumination and reached a processing speed of 46 fps. However, the model showed slightly lower accuracy in detecting smaller fruits and required higher computational resources when using high-resolution input, which can consequently increase processing time and overall system costs.
  • YOLOv8-Based Methods
The more recent YOLOv8 architecture represents a further step towards the integration of attention mechanisms and adaptive segmentation. This model was applied in studies by He et al. [57], Visentin et al. [54], and Ma et al. [55] for strawberry detection and classification, by Li et al. [43] and Gai et al. [40] for blueberries, by Qiu et al. [61] for mulberries, and by Zhang et al. [66] for raspberries.
He et al. [57] proposed a two-stage system combining YOLOv8 and YOLOv5-cls for the detection and classification of strawberries. The first stage uses a modified YOLOv8n model incorporating C3x modules and an additional head network structure, specifically tailored for accurate strawberry detection. Although the model achieved high precision, its two-stage image processing results in slightly higher computational requirements, leading to increased hardware demands and overall implementation costs.
Gai et al. [40] employed a transfer learning approach (TL-YOLOv8) and achieved mAP of 94.1% (Table 2). Their architecture integrated three new modules: MPCA (Multiplexed Coordinated Attention) to improve feature extraction precision, OREPA (Online Convolutional Re-parameterization) to accelerate training and reduce computational demands, and Multi-SEAM (Multi-scale Separation and Occlusion-Aware Module) to enhance detection of fruits of various sizes and those partially covered by leaves. The model was first pre-trained on the Fruits-360 dataset and then fine-tuned on a custom blueberry dataset.
Other advanced modifications such as Improved YOLOv8n [17] (Table 1) and STRAW-YOLO [55] achieved mAP levels up to 96% (Table 3) and demonstrated high robustness under variable illumination and even at night. In the STRAW-YOLO model, the YOLOv8-Pose architecture was enhanced through the integration of the Efficient Multi-Scale Attention (EMA) mechanism to strengthen the focus on ripe fruit regions, the C2f-OREPA module to accelerate training and improve feature extraction, and the DCN-C2f structure employing deformable convolution v3 for handling irregular fruit shapes. The network also incorporated a key-point detection branch predicting three characteristic points on each strawberry (stalk, connection point, and fruit tip), enabling estimation of fruit orientation and accurate localisation of the optimal picking point.
  • YOLOv11-Based Methods
The most recent generation identified in the analysed publications focuses on adaptive image processing and contextual scene analysis. Luo et al. [16] utilised the YOLOv11n architecture enhanced with HCSA, DWR, and DySample modules, which improve feature extraction and adaptive resolution adjustment. The aim of their study was to enable precise localisation of raspberry fruits and classification of their ripeness (unripe/ripe) under real-world conditions, including both greenhouse and open-field environments with overlapping leaves, complex backgrounds, and uneven lighting. The authors achieved high accuracy ranging from 92.5% to 94.3% (Table 1). Zhang et al. [41] (Table 2) proposed an open-source system for high-throughput blueberry phenotyping, combining the YOLOv11m model for fruit detection with image analysis methods for yield and ripeness estimation in natural conditions. Zhang et al. [44] developed a YOLOv11-BSD variant designed for blueberry detection under nighttime conditions. In this model, the YOLOv11 architecture was extended with C3k2-BFAM and C2PSA-SE modules for improved extraction of fine textures and channel-sensitive feature enhancement, an upgraded PANet for multi-scale fusion, and DySample for more precise upsampling. This solution demonstrated high accuracy (91.8%) with an average processing speed of 66.5 fps (Table 2).
  • Suitability for Raspberries
The analysis of existing YOLO-based solutions indicates that the most suitable approaches for automatic raspberry detection and recognition in robotic harvesting systems are models from the YOLOv5 to YOLOv11 families. Among the models tested directly on raspberries, the best-performing solution was the HSA-YOLOv5 model [15]. Converting the image to the HSV colour space and applying adaptive contrast enhancement effectively reduced the influence of variable lighting, allowing reliable differentiation of three ripeness levels. The Improved YOLOv11n model [16] employs attention and residual modules (HCSA, DWR, and DySample) that enhance the detection of small objects and stabilise performance when fruits overlap. Another suitable solution for robotic harvesting applications is the Improved YOLOv8n model [17], which combines RGB-D input with Container and CAA modules, enabling simultaneous detection of fruits and stems during both day and night operation.
For models that have not yet been applied to raspberries, only certain architectures show potential suitability for transfer, depending on their structural design and adaptability. Among the most promising are TL-YOLOv8 [40], which employs transfer learning and attention mechanisms to improve detection accuracy of small, overlapping fruits, and STRAW-YOLO [55], which combines the Efficient Multi-Scale Attention mechanism with the C2f-OREPA module, achieving both high speed (62.6 fps) and high precision (mAP@0.5 = 96%) in object detection. With the addition of colour-based analysis, STRAW-YOLO could also be adapted for ripeness classification. Both of these solutions, however, would require retraining on raspberry datasets to account for differences in fruit texture, colour, and morphology compared to the original target species.

3.3. Analysis of Solutions for Gripping Fruits and Sensor Technology

Fruits such as raspberries have a fragile structure, a thin skin and soft flesh that can be deformed by very small forces, typically less than 0.3 N [67]. Traditional robotic end-effectors based on rigid jaws or vacuum suction cups cannot provide the necessary gentleness or adaptability of contact. Consequently, research has shifted in recent years towards soft robotic grippers, i.e., flexible end-effectors that adapt to the shape of the fruit, distribute pressure and control the grip in real time thanks to integrated sensors [68,69]. Soft robotics in agriculture draws on biomimetic principles: designs that mimic tentacles, muscles or fish fins to ensure smooth movement and even force distribution. These principles allow highly adaptive end-effectors to be created that can handle a wide variety of objects, including soft and deformable fruits [70,71].
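The sub-0.3 N constraint can be made concrete with a toy closed-loop sketch: the commanded grip force is ramped in small increments while a (simulated) tactile reading is checked, and the loop refuses to command forces beyond the damage limit. Apart from the 0.3 N figure from [67], all names, the compliant-contact model and the numeric values are hypothetical.

```python
DAMAGE_LIMIT_N = 0.3  # maximum safe contact force for raspberries, per [67]

def close_gripper(read_force, target_n=0.2, step_n=0.02, max_steps=50):
    """Ramp the commanded grip force in small steps; stop once the tactile
    reading reaches the target, and never command forces past the limit."""
    command = 0.0
    for _ in range(max_steps):
        if read_force(command) >= target_n:
            break                      # grip is firm enough
        if command + step_n > DAMAGE_LIMIT_N:
            break                      # refuse to exceed the damage limit
        command += step_n
    return command

# Hypothetical compliant contact: the sensed force lags the command by 10%.
final = close_gripper(lambda c: 0.9 * c)
```

The point of the sketch is the control structure, not the numbers: the tactile sensor closes the loop, so the gripper stops at "firm enough" rather than at a fixed aperture that would crush a soft berry.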

3.3.1. The References Used

This section covers research on soft and hybrid end-effectors, as detailed in 31 analysed publications from 2015 to 2025. The overview primarily focuses on technologies designed for harvesting raspberries, but it also evaluates grippers that have been tested on morphologically similar small fruits, such as strawberries, blackberries and apples. The following subsections provide a systematic classification and comparison of these end-effectors according to key technological categories, evaluating the principles of operation and design solutions (e.g., Fin-Ray and tendon-driven), types of actuation, materials used, and the level of sensor integration, including opto-tactile sensing, grip-force measurement and slip detection. A table at the end of the section summarises the results achieved and evaluates the suitability of individual approaches for raspberry harvesting.

3.3.2. Classification and Principles of Grasping

Modern literature reviews [18,71,72] distinguish four basic design categories of soft end-effectors:
  • Fin-ray: fingers with an internal ribbed structure that deform on contact and passively envelop the shape of the fruit.
  • Tendon-driven: a design inspired by the human hand, in which flexible cables or rods act as tendons, bending the fingers and ensuring sensitive, controlled force distribution.
  • Enveloping: mechanisms that completely surround the fruit with soft material and release it through rotation or contraction.
  • Hybrid end-effectors: devices that combine soft grip principles with additional functions such as integrated cutting devices, stiffness-changing mechanisms or modular, interchangeable end pieces.
Each of these principles can be combined with various actuators, such as pneumatic, hydraulic or electric drives. Pneumatics are often chosen for their simplicity and safety, while electric drives are used where more precise control is required. An alternative classification is based on finger design; however, it does not take the type of actuator into account. A useful summary can be found in the work of Wang et al. [18], although it does not explicitly assess the suitability of the gripper types it defines for raspberries or blackberries.
  • Fin-Ray end-effectors
The Fin-Ray end-effector is one of the most common designs for harvesting small fruit. A conceptual illustration is shown in Figure 10a. The principle is based on the Fin-Ray effect, whereby deformation of the fingers distributes the force across the entire contact surface. Hughes et al. [68] summarised general design recommendations for gripping mechanisms of this kind, which are among the candidate options for harvesting small fruits such as raspberries, blackberries and strawberries [73].
Lin et al. [78] presented a three-finger gripper that combines a Fin-Ray structure with visual-tactile sensors. This gripper could classify fruit ripeness and measure firmness without causing damage, with the pressure applied not exceeding values that would bruise the fruit. A similar concept was used by Chen et al. [74] for the quantitative measurement of fruit firmness, achieving a correlation between predicted and actual firmness of R² ≈ 0.8. However, these grippers have only been tested on larger fruits, such as apples and tomatoes. Research shows that damage is mainly caused by tensile forces when the fruit is pulled off. For raspberries, this would require more accurate measurements of the detachment force at different levels of fruit ripeness.
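The firmness-from-grasp idea used by Chen et al. [74] amounts to regressing a reference firmness measurement on a stiffness-like tactile feature and reporting the coefficient of determination. The sketch below is a self-contained least-squares fit with invented data; the real study's features, units and dataset are different.

```python
def linear_fit(xs, ys):
    """Ordinary least squares y = a*x + b, plus the coefficient of
    determination R^2, computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx                      # slope
    b = my - a * mx                    # intercept
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot

# Hypothetical data: tactile contact stiffness (N/mm) against a reference
# firmness measurement (N) for a handful of fruits.
stiffness = [0.5, 0.8, 1.1, 1.5, 1.9]
firmness = [1.0, 1.6, 2.1, 3.1, 3.7]
a, b, r2 = linear_fit(stiffness, firmness)
```

An R² near 0.8, as reported for apples and tomatoes, means the tactile feature explains about 80% of the variance in the reference firmness; whether the same holds for the much softer raspberry remains to be tested.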
The most common approach in the design of Fin-Ray fingers is the Finite Element Method (FEM). A designed and optimised FEM model makes it possible to accurately predict the behaviour and shape transformation of the fingers during specific grips. Varghese et al. [79] designed and implemented a finger-design methodology based on this approach, comparing the gripping properties of their design with those of other approaches.
Other approaches consider a simple soft gripper finger with softened padding, although this is merely a basic modification of a standard finger gripper. Ait Ameur et al. [80] proposed modifying the opto-tactile end-effector sensor to directly measure fruit deformation. Testing was performed on strawberries, raspberries and blueberries; however, the system achieved its lowest success rate with raspberries.
These studies confirm that Fin-Ray end-effectors can harvest fruit with high accuracy and minimal damage. The advantages of these end-effectors are their simple design and passive adaptation, which do not require complex mechanics to adjust the shape of the fingers to grip the fruit. A disadvantage is that the pressure can be insufficient or unevenly distributed, which can cause damage to the fruit.
  • Tendon-driven end-effectors
Tendon-driven systems are the second most widely used for testing small-fruit harvesting (Figure 10b). They offer precise control of movement and force. Gunderman et al. [75] presented a soft, cable-driven gripper for harvesting strawberries. Tactile sensors were used to provide grip control feedback. Mawah and Park [81] combined a pneumatic system with a tendon-driven mechanism to create a variable-stiffness end-effector. These grippers can control the grip using rods and can also encircle and anchor the fruit using a pneumatic system (Figure 10c).
Wang et al. [76] validated a similar principle for apples using a purely pneumatically controlled gripper. Although apples are harder fruits, their measurements of contact pressure and grip angle provide a methodological basis for smaller fruits such as raspberries. Navas et al. [82] also tested a purely pneumatic gripper on strawberries and grapes. Unlike tendon-driven designs, these purely pneumatic grippers contain no flexible rods or cables.
Tendon systems are suitable for applications requiring high precision and the ability to manipulate clusters. However, they have the disadvantage of requiring more complex control and implementation of flexible and rigid structures. Another disadvantage is the difficulty of indirectly determining the force acting on the gripped fruit; however, in combination with tactile sensors, they provide good results.
  • Enveloping end-effectors
Enveloping grippers surround the fruit with soft material and gently separate it from the plant (Figure 10d). Elfferich et al. [77] introduced the BerryTwist gripper, which uses a fabric tube that rotates around the fruit. This principle enabled blackberries to be detached with 82% success and released without pulp damage in 95% of cases. The approach is well suited to raspberries because, as with blackberries, it eliminates the need for precise stalk localisation.
  • Hybrid end-effectors
Hybrid designs combine soft gripping with mechanical cutting or other modular components. Navas et al. [83] described an iris-type end-effector that simultaneously grips the fruit and cuts the stem. De Preter et al. [84] combined pneumatic gripping with scissor cutting. Both approaches confirmed the possibility of precise fruit separation with a controlled cutting force of 0.9–1.2 N.
For harvesting raspberries, where the fruits are picked without the stem, a variant without a cutting element is more suitable. Furia et al. [85] developed the GraspBerry gripper with interchangeable tips, tested directly on raspberries. The system achieved stable gripping without visible damage and adapted to fruit diameters of 12–18 mm; in essence, it combines soft fingers with a pulling mechanism.
The end-effector of He et al. [73] can also be considered hybrid: the gripper was combined with a fan to increase the accuracy of gripping and damage-free detachment.

3.3.3. Gripper Materials, Construction, and Variable Stiffness

Most end-effectors use silicone elastomers (such as EcoFlex and PDMS) or thermoplastic elastomers (such as TPU and TPE), which offer high levels of flexibility and durability. Blanco et al. [86] demonstrated that internal gyroid fillings enhance force distribution.
Pneumatic systems [69,87] offer precise pressure control, and combinations with rigid inserts [88] enhance stability.
Variable stiffness is essential for handling fruit of different ripeness levels. Mawah and Park [81] demonstrated that dynamic changes in finger hardness can be achieved by combining pneumatic and cable control [3].

3.3.4. Sensors and Grip Control

Precise contact control requires sensory integration [89]. Xu et al. [90] and Li et al. [91] used tactile sensors to estimate fruit firmness. Ait Ameur et al. [80] combined an opto-tactile system with a camera to regulate pressure based on visually detected deformation. Another common approach draws on human grasping experience: the forces exerted during manual picking are measured and transferred to the gripper, after which the gripping force can be monitored or optimal finger shapes designed directly [75,82,92].
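The slip-detection capability mentioned earlier can be formalised, in its simplest form, as a Coulomb friction-cone check on the measured normal and tangential contact forces. The sketch below is a textbook simplification with an assumed friction coefficient and invented force values; it is not a reconstruction of any cited system, which typically use richer opto-tactile signals.

```python
FRICTION_COEFF = 0.6  # assumed fruit-silicone friction coefficient

def slipping(normal_n, tangential_n, mu=FRICTION_COEFF):
    """A grasp is on the verge of slipping when the tangential load exceeds
    the friction cone defined by mu times the normal force (Coulomb model)."""
    return tangential_n > mu * normal_n

def regrip_force(normal_n, tangential_n, mu=FRICTION_COEFF, margin=1.2):
    """Smallest normal force that holds the current tangential load with a
    20% safety margin; the grip is only ever tightened, never loosened."""
    required = margin * tangential_n / mu
    return max(normal_n, required)
```

In practice the tangential load grows as the robot pulls the berry off its receptacle, so a controller of this kind tightens the grip just enough to prevent slip while staying as far below the bruising threshold as the detachment force allows.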
In this context, the study by Junge et al. [5] on a digital twin of a raspberry is worth recalling: it demonstrated that the digital twin can predict deformation within an error tolerance of ±7% and reduce fruit damage by 18%.

3.3.5. Comparison of Solutions for Gripping Fruits and Sensor Technology

Table 5 summarises the results achieved and evaluates the suitability of individual approaches for raspberry harvesting.

4. Conclusions

An analysis of 31 publications focusing on machine vision revealed that research into raspberry detection remains in its infancy: only three studies focused directly on raspberries, compared to 15 on strawberries, nine on blueberries, two on blackberries, and two on mulberries. Regarding methodologies, approaches based on the YOLO architecture clearly dominate (24 of the 31 publications), while classical classification models (two publications), segmentation models (two publications) and combined detection-segmentation models (three publications) are used less frequently. Of the models tested directly on raspberries, the HSA-YOLOv5 model produced the best results, achieving an mAP of 97%.
Given the limited number of specific studies, models transferable to raspberry detection were also identified. The most promising of these include TL-YOLOv8 (precision = 84.6%, recall = 91.3%, mAP@0.5 = 94.1%) and STRAW-YOLO (precision = 91.6%, recall = 91.7%, F1-score = 91.7%, mAP@0.5 = 96% at 62.6 fps). While these models show potential for application to raspberries, their deployment requires retraining using specific data that takes into account the different textures, colours, and shapes of raspberries.
A total of 31 publications were reviewed for the analysis of end-effectors. The primary challenge in this area is the fragility of raspberries, which have a soft surface and are detached by pulling rather than cutting. The analysis confirmed that classic rigid jaw grippers are unsuitable for this purpose, because they cause deformation, damage, or incorrect gripping. The review also showed that alternative approaches, including tendon-driven and enveloping designs, have not yet been adequately verified on raspberries; to date, the enveloping end-effector has only been tested on blackberries, which are significantly firmer. Most of the research conducted under field conditions evaluated the success rate of fruit picking and the extent of fruit damage, focusing mainly on strawberries (three studies), with only one study each addressing blackberries and raspberries. The findings of the present study indicate that the most effective current approach is a soft silicone gripper combined with tactile sensors, as demonstrated by Junge et al. [5]: their system, which uses clamping-force control, attained a success rate of approximately 80%, defined as the proportion of undamaged harvested fruit in field tests.
Based on these findings, future research should concentrate on the development of advanced soft end-effectors (e.g., Fin-Ray, pneumatic, or tendon-driven) with variable stiffness and the capacity to conform to the shape of the fruit. To measure grip and detachment forces accurately, tactile sensors should be integrated directly into the fingers; this will enable the prediction of safe force limits and increase reliability while minimising damage. The target is verification in field conditions with a reliability of >90% detachment and <10% damage, a level not yet achieved by any raspberry-harvesting system.

Author Contributions

Conceptualization, A.S., J.K. and B.K.; methodology, A.S. and J.K.; formal analysis, A.S., J.K. and B.K.; resources, M.H.; data curation, J.K.; writing—original draft preparation, A.S., B.K. and J.K.; writing—review and editing, A.S., B.K. and J.K.; visualisation, A.S.; supervision, M.H.; project administration, M.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Internal Grant Agency of Faculty of Engineering of Czech University of Life Sciences Prague, grant number: IGA 2025: 31200/1312/3106.

Data Availability Statement

No new data were generated or analysed in this study. All data discussed in this review are available in the cited publications.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
2D: Two-dimensional
3D: Three-dimensional
ADown: Attention-based Downsampling
AP: Average Precision
ASPP: Atrous Spatial Pyramid Pooling
BSD: Berkeley Segmentation Dataset
C2f: Cross Concat Fusion
C2f-OREPA: Cross Concat Fusion with Online Re-parameterization
C2PSA-SE: Cross Concat Partial Spatial Attention with Squeeze-and-Excitation
C3k2-BFAM: Cross-Stage Partial module and Bidirectional Feature Alignment Module
C3x: Cross-Stage Partial module variant
CAA: Coordinate Attention Algorithm
CBAM: Convolutional Block Attention Module
CHT: Circular Hough Transform
CNN: Convolutional Neural Networks
CSP: Cross-Stage Partial
CSPPC: Cross-Stage Partial Spatial Pyramid Pooling
DAM: Dense Attention Module
DWR: Dilation-Wise Residual
DySample: Dynamic Sampling
ECA: Efficient Channel Attention
ECA-SimAM: Efficient Channel Attention and Simple Attention Module
EcoFlex: Silicone Elastomer Material
EIoU_Loss: Extended Intersection over Union Loss
EMA: Efficient Multi-Scale Attention
EPFL: École Polytechnique Fédérale de Lausanne
F1: F1 score
FAOSTAT: Food and Agriculture Organization Corporate Statistical Database
Fin-Ray: Fin-Ray effect gripper
FN: False Negative
FP: False Positive
FPN: Feature Pyramid Network
fps: Frames Per Second
HCSA: Hybrid Channel-Spatial Attention
HSA: Hybrid Soft Attention
HSV: Colour space (Hue, Saturation, Value)
k: number of object classes
LDA: Linear Discriminant Analysis
LfD: Learning from Demonstration
LUTs: Look-Up Tables
mAP: mean Average Precision
ML: Machine Learning
MobileNetv3: Mobile Neural Network Version 3
MSSENet: Multi-Scale Squeeze-and-Excitation Network
n: number of samples
OAK-D: OpenCV AI Kit with Depth camera
PCA: Principal Component Analysis
PDC: Parallel Dilated Convolution
PDMS: Polydimethylsiloxane
P-Head: Prediction Head
Pinter: interpolated precision
PT: Physical twin
QDA: Quadratic Discriminant Analysis
r: recall value
R-CNN: Region-based Convolutional Neural Network
RDR: Rate of Damage Ratio
Res-Net: Residual Network
RGB: Colour space (Red, Green, Blue)
RT-DETR: Real-Time DEtection TRansformer
SAM: Spatial Attention Module
SCNet50: Self-Calibrated Network (50 layers)
SVM: Support Vector Machines
ToF: Time of Flight
TP: True Positive
TPE: Thermoplastic Elastomer
TPU: Thermoplastic Polyurethane
VGG16: Visual Geometry Group 16-layer Network
ViTs: Vision Transformers
YCbCr: Luminance–Chrominance Colour Space
YOLO: You Only Look Once (real-time deep learning algorithm for object detection)
ZED2: StereoLabs ZED 2 Depth Camera

References

  1. FAO. Food and Agriculture Organization of the United Nations. Available online: https://www.fao.org/home/en (accessed on 11 October 2025).
  2. Ponder, A.; Hallmann, E. Phenolics and Carotenoid Contents in the Leaves of Different Organic and Conventional Raspberry (Rubus idaeus l.) Cultivars and Their in Vitro Activity. Antioxidants 2019, 8, 458. [Google Scholar] [CrossRef]
  3. Popa, R.G.; Șchiopu, E.C.; Pătrașcu, A.; Bălăcescu, A.; Toader, F.A. Raspberry Production Opportunity to Develop an Agricultural Business in the Context of the Circular Economy: Case Study in South-West Romania. Agriculture 2024, 14, 1822. [Google Scholar] [CrossRef]
  4. Apáti, F. Farm Economic Evaluation of Raspberry Production. Int. J. Hortic. Sci. 2014, 20, 53–56. [Google Scholar] [CrossRef]
  5. Junge, K.; Pires, C.; Hughes, J. Lab2Field Transfer of a Robotic Raspberry Harvester Enabled by a Soft Sensorized Physical Twin. Commun. Eng. 2023, 2, 40. [Google Scholar] [CrossRef]
  6. Ramsay, A.M. Mechanical Harvesting of Raspberries—A Review with Particular Reference to Engineering Development in Scotland. J. Agric. Eng. Res. 1983, 28, 183–206. [Google Scholar] [CrossRef]
  7. Rabcewicz, J.; Białkowski, P.; Konopacki, P. Evaluation of the Possibility of Shaking off Raspberry Fruits with a Pulsating Air Stream. J. Hortic. Res. 2017, 25, 61–66. [Google Scholar] [CrossRef][Green Version]
  8. Smith, E.A.; Ramsay, A.M. Forces during Fruit Removal by a Mechanical Raspberry Harvester. J. Agric. Eng. Res. 1983, 28, 21–32. [Google Scholar] [CrossRef]
  9. Fieldwork Robotics. Fieldwork Robotics—Soft, Selective & Autonomous Harvesting Robots. Available online: https://fieldworkrobotics.com/ (accessed on 12 October 2025).
  10. Kollewe, J. World’s First Raspberry Picking Robot Cracks the Toughest Nut: Soft Fruit. Available online: https://www.theguardian.com/business/2022/jun/01/uk-raspberry-picking-robot-soft-fruit (accessed on 12 October 2025).
  11. Kollewe, J. Improved Version of ‘Robocrop’ Only Picks Ripe Raspberries. Available online: https://www.theguardian.com/technology/article/2024/aug/26/improved-version-robocrop-only-picks-ripe-raspberries (accessed on 12 October 2025).
  12. Sauerwald, T.; Pulle, C.F.; Bodkin, T.; Whitear, D. An End-Effector. US20240284829A1, 29 August 2024. [Google Scholar]
  13. Strautiņa, S.; Kalniņa, I.; Kaufmane, E.; Sudars, K.; Namatēvs, I.; Nikulins, A.; Edelmers, E. RaspberrySet: Dataset of Annotated Raspberry Images for Object Detection. Data 2023, 8, 86. [Google Scholar] [CrossRef]
  14. Jafary, P.; Bazangeya, A.; Pham, M.; Campbell, L.G.; Saeedi, S.; Zareinia, K.; Bougherara, H. Raspberry PhenoSet: A Phenology-Based Dataset for Automated Growth Detection and Yield Estimation. arXiv 2024. [Google Scholar] [CrossRef]
  15. Ling, C.; Zhang, Q.; Zhang, M.; Gao, C. Research on Adaptive Object Detection via Improved HSA-YOLOv5 for Raspberry Maturity Detection. IET Image Process 2024, 18, 4898–4912. [Google Scholar] [CrossRef]
  16. Luo, R.; Ding, X.; Wang, J. Red Raspberry Maturity Detection Based on Multi-Module Optimized YOLOv11n and Its Application in Field and Greenhouse Environments. Agriculture 2025, 15, 881. [Google Scholar] [CrossRef]
  17. Zhang, X.; Zhang, N.; Xu, X.; Wang, H.; Cao, J. Optimal Cutting Point Determination for Robotic Raspberry Harvesting Based on Computer Vision Strategy. Multimed. Tools Appl. 2025, 84, 41257–41276. [Google Scholar] [CrossRef]
  18. Wang, C.; Pan, W.; Zou, T.; Li, C.; Han, Q.; Wang, H.; Yang, J.; Zou, X. A Review of Perception Technologies for Berry Fruit-Picking Robots: Advantages, Disadvantages, Challenges, and Prospects. Agriculture 2024, 14, 1346. [Google Scholar] [CrossRef]
  19. Li, L.; He, Z.; Li, K.; Ding, X.; Li, H.; Gong, W.; Cui, Y. Object Detection and Spatial Positioning of Kiwifruits in a Wide-Field Complex Environment. Comput. Electron. Agric. 2024, 223, 109102. [Google Scholar] [CrossRef]
  20. Gené-Mola, J.; Gregorio, E.; Guevara, J.; Auat, F.; Sanz-Cortiella, R.; Escolà, A.; Llorens, J.; Morros, J.R.; Ruiz-Hidalgo, J.; Vilaplana, V.; et al. Fruit Detection in an Apple Orchard Using a Mobile Terrestrial Laser Scanner. Biosyst. Eng. 2019, 187, 171–184. [Google Scholar] [CrossRef]
  21. Neupane, C.; Koirala, A.; Wang, Z.; Walsh, K.B. Evaluation of Depth Cameras for Use in Fruit Localization and Sizing: Finding a Successor to Kinect V2. Agronomy 2021, 11, 1780. [Google Scholar] [CrossRef]
  22. Gaikwad, S.; Tidke, S. Multi-Spectral Imaging for Fruits and Vegetables. Int. J. Adv. Comput. Sci. Appl. 2022, 13, 743–760. [Google Scholar] [CrossRef]
  23. Li, H.; Gu, Z.; He, D.; Wang, X.; Huang, J.; Mo, Y.; Li, P.; Huang, Z.; Wu, F. A Lightweight Improved YOLOv5s Model and Its Deployment for Detecting Pitaya Fruits in Daytime and Nighttime Light-Supplement Environments. Comput. Electron. Agric. 2024, 220, 108914. [Google Scholar] [CrossRef]
  24. Kienzle, S.; Sruamsiri, P.; Carle, R.; Sirisakulwat, S.; Spreer, W.; Neidhart, S. Harvest Maturity Detection for ‘Nam Dokmai #4’ Mango Fruit (Mangifera indica L.) in Consideration of Long Supply Chains. Postharvest Biol. Technol. 2012, 72, 64–75. [Google Scholar] [CrossRef]
  25. Mohammadi, V.; Kheiralipour, K.; Ghasemi-Varnamkhasti, M. Detecting Maturity of Persimmon Fruit Based on Image Processing Technique. Sci. Hortic. 2015, 184, 123–128. [Google Scholar] [CrossRef]
  26. Zhao, J.; Chen, J. Detecting Maturity in Fresh Lycium barbarum L. Fruit Using Color Information. Horticulturae 2021, 7, 108. [Google Scholar] [CrossRef]
  27. Talekar, B. A Detailed Review on Decision Tree and Random Forest. Biosci. Biotechnol. Res. Commun. 2020, 13, 245–248. [Google Scholar] [CrossRef]
  28. Olisah, C.C.; Trewhella, B.; Li, B.; Smith, M.L.; Winstone, B.; Whitfield, E.C.; Fernández, F.F.; Duncalfe, H. Convolutional Neural Network Ensemble Learning for Hyperspectral Imaging-Based Blackberry Fruit Ripeness Detection in Uncontrolled Farm Environment. Eng. Appl. Artif. Intell. 2024, 132, 107945. [Google Scholar] [CrossRef]
  29. Fu, L.; Feng, Y.; Majeed, Y.; Zhang, X.; Zhang, J.; Karkee, M.; Zhang, Q. Kiwifruit Detection in Field Images Using Faster R-CNN with ZFNet. IFAC-PapersOnLine 2018, 51, 45–50. [Google Scholar] [CrossRef]
  30. Xiao, B.; Nguyen, M.; Yan, W.Q. Fruit Ripeness Identification Using Transformers. Appl. Intell. 2023, 53, 22488–22499. [Google Scholar] [CrossRef]
  31. Hu, H.M.; Kaizu, Y.; Zhang, H.D.; Xu, Y.W.; Imou, K.; Li, M.; Huang, J.J.; Dai, S. Recognition and Localization of Strawberries from 3D Binocular Cameras for a Strawberry Picking Robot Using Coupled YOLO/Mask R-CNN. Int. J. Agric. Biol. Eng. 2022, 15, 175–179. [Google Scholar] [CrossRef]
  32. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  33. Xu, D.; Zhao, H.; Lawal, O.M.; Lu, X.; Ren, R.; Zhang, S. An Automatic Jujube Fruit Detection and Ripeness Inspection Method in the Natural Environment. Agronomy 2023, 13, 451. [Google Scholar] [CrossRef]
  34. Tituaña, L.; Gholami, A.; He, Z.; Xu, Y.; Karkee, M.; Ehsani, R. A Small Autonomous Field Robot for Strawberry Harvesting. Smart Agric. Technol. 2024, 8, 100454. [Google Scholar] [CrossRef]
  35. Terven, J.; Córdova-Esparza, D.M.; Romero-González, J.A. A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716. [Google Scholar] [CrossRef]
  36. Ni, X.; Li, C.; Jiang, H.; Takeda, F. Deep Learning Image Segmentation and Extraction of Blueberry Fruit Traits Associated with Harvestability and Yield. Hortic. Res. 2020, 7, 110. [Google Scholar] [CrossRef]
  37. Yang, W.; Ma, X.; Hu, W.; Tang, P. Lightweight Blueberry Fruit Recognition Based on Multi-Scale and Attention Fusion NCBAM. Agronomy 2022, 12, 2354. [Google Scholar] [CrossRef]
  38. Haydar, Z.; Esau, T.J.; Farooque, A.A.; Zaman, Q.U.; Hennessy, P.J.; Singh, K.; Abbas, F. Deep Learning Supported Machine Vision System to Precisely Automate the Wild Blueberry Harvester Header. Sci. Rep. 2023, 13, 10198. [Google Scholar] [CrossRef] [PubMed]
  39. Xiao, F.; Wang, H.; Xu, Y.; Shi, Z.; Kujawa, S.; Wojciechowski, T.; Piekutowska, M.; Xiao, F.; Wang, H.; Xu, Y.; et al. A Lightweight Detection Method for Blueberry Fruit Maturity Based on an Improved YOLOv5 Algorithm. Agriculture 2023, 14, 36. [Google Scholar] [CrossRef]
  40. Gai, R.; Liu, Y.; Xu, G. TL-YOLOv8: A Blueberry Fruit Detection Algorithm Based on Improved YOLOv8 and Transfer Learning. IEEE Access 2024, 12, 86378–86390. [Google Scholar] [CrossRef]
  41. Zhang, J.; Maleski, J.; Ashrafi, H.; Spencer, J.A.; Chu, Y. Open-Source High-Throughput Phenotyping for Blueberry Yield and Maturity Prediction Across Environments: Neural Network Model and Labeled Dataset for Breeders. Horticulturae 2024, 10, 1332. [Google Scholar] [CrossRef]
  42. Liu, Y.; Zheng, H.; Zhang, Y.; Zhang, Q.; Chen, H.; Xu, X.; Wang, G. “Is This Blueberry Ripe?”: A Blueberry Ripeness Detection Algorithm for Use on Picking Robots. Front. Plant Sci. 2023, 14, 1198650. [Google Scholar] [CrossRef] [PubMed]
  43. Li, Z.; Xu, R.; Li, C.; Munoz, P.; Takeda, F.; Leme, B. In-Field Blueberry Fruit Phenotyping with a MARS-PhenoBot and Customized BerryNet. Comput. Electron. Agric. 2025, 232, 110057. [Google Scholar] [CrossRef]
  44. Zhang, R.; Dong, W.; Hou, P.; Li, H.; Han, X.; Chen, Q.; Li, F.; Zhang, X. YOLOv11-BSD: Blueberry Maturity Detection under Simulated Nighttime Conditions Evaluated with Causal Analysis. Smart Agric. Technol. 2025, 12, 101314. [Google Scholar] [CrossRef]
  45. Pérez-Borrero, I.; Marín-Santos, D.; Gegúndez-Arias, M.E.; Cortés-Ancos, E. A Fast and Accurate Deep Learning Method for Strawberry Instance Segmentation. Comput. Electron. Agric. 2020, 178, 105736. [Google Scholar] [CrossRef]
  46. Yu, Y.; Zhang, K.; Liu, H.; Yang, L.; Zhang, D. Real-Time Visual Localization of the Picking Points for a Ridge-Planting Strawberry Harvesting Robot. IEEE Access 2020, 8, 116556–116568. [Google Scholar] [CrossRef]
  47. Ilyas, T.; Umraiz, M.; Khan, A.; Kim, H. DAM: Hierarchical Adaptive Feature Selection Using Convolution Encoder Decoder Network for Strawberry Segmentation. Front. Plant Sci. 2021, 12, 591333. [Google Scholar] [CrossRef]
  48. An, Q.; Wang, K.; Li, Z.; Song, C.; Tang, X.; Song, J. Real-Time Monitoring Method of Strawberry Fruit Growth State Based on YOLO Improved Model. IEEE Access 2022, 10, 124363–124372. [Google Scholar] [CrossRef]
  49. Fan, Y.; Zhang, S.; Feng, K.; Qian, K.; Wang, Y.; Qin, S. Strawberry Maturity Recognition Algorithm Combining Dark Channel Enhancement and YOLOv5. Sensors 2022, 22, 419. [Google Scholar] [CrossRef]
  50. Lemsalu, M.; Bloch, V.; Backman, J.; Pastell, M. Real-Time CNN-Based Computer Vision System for Open-Field Strawberry Harvesting Robot. IFAC-PapersOnLine 2022, 55, 24–29. [Google Scholar] [CrossRef]
  51. He, Z.; Karkee, M.; Zhang, Q. Detecting and Localizing Strawberry Centers for Robotic Harvesting in Field Environment. IFAC-PapersOnLine 2022, 55, 30–35. [Google Scholar] [CrossRef]
  52. Cai, C.; Tan, J.; Zhang, P.; Ye, Y.; Zhang, J. Determining Strawberries’ Varying Maturity Levels by Utilizing Image Segmentation Methods of Improved DeepLabV3+. Agronomy 2022, 12, 1875. [Google Scholar] [CrossRef]
  53. Tang, C.; Chen, D.; Wang, X.; Ni, X.; Liu, Y.; Liu, Y.; Mao, X.; Wang, S. A Fine Recognition Method of Strawberry Ripeness Combining Mask R-CNN and Region Segmentation. Front. Plant Sci. 2023, 14, 1211830. [Google Scholar] [CrossRef]
  54. Visentin, F.; Castellini, F.; Muradore, R. A Soft, Sensorized Gripper for Delicate Harvesting of Small Fruits. Comput. Electron. Agric. 2023, 213, 108202. [Google Scholar] [CrossRef]
  55. Ma, Z.; Dong, N.; Gu, J.; Cheng, H.; Meng, Z.; Du, X. STRAW-YOLO: A Detection Method for Strawberry Fruits Targets and Key Points. Comput. Electron. Agric. 2025, 230, 109853. [Google Scholar] [CrossRef]
  56. Lawal, O.M. Study on Strawberry Fruit Detection Using Lightweight Algorithm. Multimed. Tools Appl. 2024, 83, 8281–8293. [Google Scholar] [CrossRef]
  57. He, Z.; Karkee, M.; Zhang, Q. Enhanced Machine Vision System for Field-Based Detection of Pickable Strawberries: Integrating an Advanced Two-Step Deep Learning Model Merging Improved YOLOv8 and YOLOv5-Cls. Comput. Electron Agric 2025, 234, 110173. [Google Scholar] [CrossRef]
  58. Xie, H.; Zhang, D.; Yang, L.; Cui, T.; He, X.; Zhang, K.; Zhang, Z. Development, Integration, and Field Evaluation of a Dual-Arm Ridge Cultivation Strawberry Autonomous Harvesting Robot. J. Field Robot. 2025, 42, 1783–1798. [Google Scholar] [CrossRef]
  59. Zhang, X.; Thayananthan, T.; Usman, M.; Liu, W.; Chen, Y. Multi-Ripeness Level Blackberry Detection Using YOLOv7 for Soft Robotic Harvesting. In Autonomous Air and Ground Sensing Systems for Agricultural Optimization and Phenotyping VIII; SPIE: Bellingham, DC, USA, 2023; p. 15. [Google Scholar] [CrossRef]
  60. Miraei Ashtiani, S.H.; Javanmardi, S.; Jahanbanifard, M.; Martynenko, A.; Verbeek, F.J. Detection of Mulberry Ripeness Stages Using Deep Learning Models. IEEE Access 2021, 9, 100380–100394. [Google Scholar] [CrossRef]
  61. Qiu, H.; Zhang, Q.; Li, J.; Rong, J.; Yang, Z. Lightweight Mulberry Fruit Detection Method Based on Improved YOLOv8n for Automated Harvesting. Agronomy 2024, 14, 2861. [Google Scholar] [CrossRef]
  62. Naranjo-Torres, J.; Mora, M.; Hernández-García, R.; Barrientos, R.J.; Fredes, C.; Valenzuela, A. A Review of Convolutional Neural Network Applied to Fruit Image Processing. Appl. Sci. 2020, 10, 3443. [Google Scholar] [CrossRef]
  63. Falih, B.S.; Gierz, Ł.; Al-Zaidi, G.A. Detecting Clustered Fruits Using a Hybrid of Convolutional Neural Networks and Machine Learning Classifiers—Case Study. Adv. Sci. Technology. Res. J. 2025, 19, 1–9. [Google Scholar] [CrossRef]
  64. Peng, Y.; Wang, A.; Liu, J.; Faheem, M. A Comparative Study of Semantic Segmentation Models for Identification of Grape with Different Varieties. Agriculture 2021, 11, 997. [Google Scholar] [CrossRef]
  65. Shi, X.; Wang, S.; Zhang, B.; Zhang, Z.; Wang, S.; Ding, X.; Wang, S.; Qi, P.; Yang, H. Advances in Berry Harvesting Robots. Horticulturae 2025, 11, 1042. [Google Scholar] [CrossRef]
  66. Zhang, D.; Zhang, W.; Yang, H.; Yang, H. Application of Soft Grippers in the Field of Agricultural Harvesting: A Review. Machines 2025, 13, 55. [Google Scholar] [CrossRef]
  67. Chauhan, A.; Brouwer, B.; Luo, L.; Nederhoff, L.; El Harchoui, N.; Shoushtari, A.L. Measuring the Response of Soft Fruits to Robotic Handling. Smart Agric. Technol. 2025, 12, 101445. [Google Scholar] [CrossRef]
  68. Hughes, J.; Culha, U.; Giardina, F.; Guenther, F.; Rosendo, A.; Iida, F. Soft Manipulators and Grippers: A Review. Frontiers Robotics AI 2016, 3, 69. [Google Scholar] [CrossRef]
  69. Navas, E.; Fernández, R.; Armada, M.; Gonzalez-de-Santos, P. Diaphragm-Type Pneumatic-Driven Soft Grippers for Precision Harvesting. Agronomy 2021, 11, 1727. [Google Scholar] [CrossRef]
  70. Elfferich, J.F.; Dodou, D.; Della Santina, C. Soft Robotic Grippers for Crop Handling or Harvesting: A Review. IEEE Access 2022, 10, 75428–75443. [Google Scholar] [CrossRef]
  71. Zhang, Y.; Wang, Z. Review of Robotic Grippers for High-Speed Handling of Fragile Foods. Adv. Robot. 2025, 39, 1054–1070. [Google Scholar] [CrossRef]
  72. Navas, E.; Fernández, R.; Sepúlveda, D.; Armada, M.; Gonzalez-De-santos, P. Soft Grippers for Automatic Crop Harvesting: A Review. Sensors 2021, 21, 2689. [Google Scholar] [CrossRef]
  73. He, Z.; Liu, Z.; Zhou, Z.; Karkee, M.; Zhang, Q. Improving Picking Efficiency under Occlusion: Design, Development, and Field Evaluation of an Innovative Robotic Strawberry Harvester. Comput. Electron. Agric. 2025, 237, 110684. [Google Scholar] [CrossRef]
  74. Chen, K.; Li, T.; Yan, T.; Xie, F.; Feng, Q.; Zhu, Q.; Zhao, C. A Soft Gripper Design for Apple Harvesting with Force Feedback and Fruit Slip Detection. Agriculture 2022, 12, 1802. [Google Scholar] [CrossRef]
  75. Gunderman, A.L.; Collins, J.; Myer, A.; Threlfall, R.; Chen, Y. Tendon-Driven Soft Robotic Gripper for Berry Harvesting. arXiv 2021. [Google Scholar] [CrossRef]
  76. Wang, X.; Kang, H.; Zhou, H.; Au, W.; Wang, M.Y.; Chen, C. Development and Evaluation of a Robust Soft Robotic Gripper for Apple Harvesting. Comput. Electron. Agric. 2023, 204, 107552. [Google Scholar] [CrossRef]
  77. Elfferich, J.F.; Shahabi, E.; Santina, C.D.; Dodou, D. BerryTwist: A Twisting-Tube Soft Robotic Gripper for Blackberry Harvesting. IEEE Robot. Autom. Lett. 2025, 10, 429–435. [Google Scholar] [CrossRef]
  78. Lin, J.; Hu, Q.; Xia, J.; Zhao, L.; Du, X.; Li, S.; Chen, Y.; Wang, X. Non-Destructive Fruit Firmness Evaluation Using a Soft Gripper and Vision-Based Tactile Sensing. Comput. Electron. Agric. 2023, 214, 108256. [Google Scholar] [CrossRef]
  79. Varghese, F.; Auat Cheein, F.; Koskinopoulou, M. Finite Element Optimization of a Flexible Fin-Ray-Based Soft Robotic Gripper for Scalable Fruit Harvesting and Manipulation. Smart Agric. Technol. 2025, 11, 100899. [Google Scholar] [CrossRef]
  80. Ait Ameur, M.A.; El-Sayed, A.M.; Yan, X.T.; Mehnen, J.; Maier, A.M. A Novel Opto-Tactile Sensing Approach to Enhance the Handling of Soft Fruit. Comput. Electron. Agric. 2025, 235, 110397. [Google Scholar] [CrossRef]
  81. Mawah, S.C.; Park, Y.-J. Tendon-Driven Variable-Stiffness Pneumatic Soft Gripper Robot. Robotics 2023, 12, 128. [Google Scholar] [CrossRef]
  82. Navas, E.; Shamshiri, R.R.; Dworak, V.; Weltzien, C.; Fernández, R. Soft Gripper for Small Fruits Harvesting and Pick and Place Operations. Front. Robot. AI 2023, 10, 1330496. [Google Scholar] [CrossRef]
  83. Navas, E.; Blanco, K.; Rodríguez-Nieto, D.; Fernández, R. A Modular Soft Gripper with Embedded Force Sensing and an Iris-Type Cutting Mechanism for Harvesting Medium-Sized Crops. Actuators 2025, 14, 432. [Google Scholar] [CrossRef]
  84. De Preter, A.; Anthonis, J.; De Baerdemaeker, J. Development of a Robot for Harvesting Strawberries. IFAC-PapersOnLine 2018, 51, 14–19. [Google Scholar] [CrossRef]
  85. Furia, F.; Pagliarani, N.; Junge, K.; Roels, E.; Terryn, S.; Vanderborght, B.; Brancart, J.; Hughes, J.; Cianchetti, M. Soft Pneumatic Gripper with Interchangeable Fingertips by Using Reversible Polymers: The GraspBerry, a Raspberry Picking Case Study. IEEE Robot. Autom. Mag. 2025, 2–10. [Google Scholar] [CrossRef]
  86. Blanco, K.; Navas, E.; Rodríguez-Nieto, D.; Emmi, L.; Fernández, R. Design and Experimental Assessment of 3D-Printed Soft Grasping Interfaces for Robotic Harvesting. Agronomy 2025, 15, 804. [Google Scholar] [CrossRef]
  87. Cao, M.; Sun, Y.; Zhang, J.; Ying, Z. A Novel Pneumatic Gripper Driven by Combination of Soft Fingers and Bellows Actuator for Flexible Grasping. Sens. Actuators A Phys. 2023, 355, 114335. [Google Scholar] [CrossRef]
  88. Li, H.; Xie, D.; Xie, Y. A Soft Pneumatic Gripper with Endoskeletons Resisting Out-of-Plane Bending. Actuators 2022, 11, 246. [Google Scholar] [CrossRef]
  89. Zaidi, S.; Maselli, M.; Laschi, C.; Cianchetti, M. Actuation Technologies for Soft Robot Grippers and Manipulators: A Review. Curr. Robot. Rep. 2021, 2, 355–369. [Google Scholar] [CrossRef]
  90. Xu, J.; Xu, B.; Zhan, H.; Xie, Z.; Tian, Z.; Lu, Y.; Wang, Z.; Yue, H.; Yang, F. A Soft Robotic System Imitating the Multimodal Sensory Mechanism of Human Fingers for Intelligent Grasping and Recognition. Nano Energy 2024, 130, 110120. [Google Scholar] [CrossRef]
  91. Li, S.; Sun, W.; Liang, Q.K.; Liu, C.P.; Liu, J. Assessing Fruit Hardness in Robot Hands Using Electric Gripper Actuators with Tactile Sensors. Sens. Actuators A Phys. 2024, 365, 114843. [Google Scholar] [CrossRef]
  92. Dimeas, F.; Sako, D.V.; Moulianitis, V.C.; Aspragathos, N.A. Design and Fuzzy Control of a Robotic Gripper for Efficient Strawberry Harvesting. Robotica 2015, 33, 1085–1098. [Google Scholar]
  93. Xiong, Y.; From, P.J.; Isler, V. Design and Evaluation of a Novel Cable-Driven Gripper with Perception Capabilities for Strawberry Picking Robots. Proc. IEEE Int. Conf. Robot. Autom. 2018, 7384–7391. [Google Scholar] [CrossRef]
  94. Yu, Y.; Xie, H.; Zhang, K.; Wang, Y.; Li, Y.; Zhou, J.; Xu, L. Design, Development, Integration, and Field Evaluation of a Ridge-Planting Strawberry Harvesting Robot. Agriculture 2024, 14, 2126. [Google Scholar] [CrossRef]
  95. Sobol, Z.; Kurpaska, S.; Nawara, P.; Pedryc, N.; Basista, G.; Tabor, J.; Hebda, T.; Tomasik, M. Prototype of a New Head Grabber for Robotic Strawberry Harvesting with a Vision System. Sensors 2024, 24, 6628. [Google Scholar] [CrossRef]
Figure 1. Flowchart illustrating the structure of the manuscript.
Figure 2. Flow diagram of the scoping review.
Figure 3. Distribution of analysed publications by year of publication (2015–2025).
Figure 4. Training and Implementation of the End-Effector. (a) Lab-to-field artificial raspberry used for training [5]; (b) The end-effector [5].
Figure 5. Sequential phases of the raspberry detachment procedure in the Lab2Field harvesting system.
Figure 6. Robotic End-Effector for Fruit Picking. (a) 3D model of the designed end-effector [12]; (b) Close-up view of the end-effector interacting with a raspberry [9].
Figure 7. Flowchart of the Lab2Field visual detection and alignment algorithm for raspberry harvesting.
Figure 8. Identification of raspberry fruits based on the dataset by Strautina et al. [13].
Figure 9. Simplified illustration of the YOLO model architecture [35].
Figure 10. Conceptual illustrations of various soft robotic grippers grasping a raspberry. The designs shown are adapted from: (a) Chen et al. [74], fin-ray end-effector, (b) Gunderman et al. [75], tendon-driven gripper, (c) X. Wang et al. [76], pneumatic gripper, and (d) Elfferich et al. [77], enveloping gripper. The visualisations have been modified to standardise the depiction of the grasping task.
Table 1. Overview of studies focused on raspberry detection and classification using computer vision methods.
| Source | Model | mAP@0.5 [%] | Precision [%] | Recall [%] | F1-Score [%] | Accuracy [%] | fps |
|---|---|---|---|---|---|---|---|
| Ling et al. [15] | HSA-YOLOv5 (HSV Self-Adaption YOLOv5) | 97 | - | - | - | - | - |
| Luo et al. [16] | Improved YOLOv11n (HCSA + DWR + DySample) | 93.4 | 92.5–94.3 | - | 89 | - | - |
| Zhang et al. [17] | Improved YOLOv8n (Container + CAA, RGB-D) | 83.6 (day)/93.4 (night) | 81.0/88.4 | 80.7/90.7 | ≈85–89 | - | 153–167 |
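The precision, recall, and F1 values reported throughout Tables 1–4 all derive from matching predicted bounding boxes to ground truth at an IoU threshold of 0.5 (the criterion behind mAP@0.5). The following minimal sketch, using hypothetical box data rather than any dataset from the cited studies, illustrates how these metrics are computed from raw detections:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def detection_metrics(predictions, ground_truth, iou_thr=0.5):
    """Greedily match predictions (box, confidence) to ground-truth boxes,
    highest confidence first; a match requires IoU >= iou_thr.
    Returns (precision, recall, f1)."""
    matched = set()
    tp = 0
    for box, _conf in sorted(predictions, key=lambda p: -p[1]):
        best, best_iou = None, iou_thr
        for i, gt in enumerate(ground_truth):
            if i not in matched and iou(box, gt) >= best_iou:
                best, best_iou = i, iou(box, gt)
        if best is not None:
            matched.add(best)  # each ground-truth box matches at most once
            tp += 1
    fp = len(predictions) - tp   # unmatched predictions
    fn = len(ground_truth) - tp  # undetected fruits
    precision = tp / (tp + fp) if predictions else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

mAP@0.5 additionally averages precision over all recall levels (and over classes, e.g. ripeness stages), which is why it can diverge noticeably from the single-threshold F1 figures in the tables.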
Table 2. Overview of studies focused on blueberry detection and classification using computer vision methods.
| Source | Model | mAP@0.5 [%] | Precision [%] | Recall [%] | F1-Score [%] | Accuracy [%] | fps |
|---|---|---|---|---|---|---|---|
| Ni et al. [36] | Mask R-CNN (ResNet-101 + FPN) | 71.6 | - | - | - | 90.5 | - |
| Yang et al. [37] | YOLOv5 | 83.2 | 83.8 | 76.1 | 79.8 | - | - |
| Haydar et al. [38] | YOLOv4-tiny (DepthAI − OAK-D) | 86.5 | - | - | - | - | - |
| Xiao et al. [39] | Modified YOLOv5 (ShuffleNet + CBAM) | 91.5 | 96.3 | 92 | 94.12 | - | 67.1 |
| Gai et al. [40] | TL-YOLOv8 (Transfer Learning YOLOv8) | 94.1 | 84.6 | 91.3 | 87.8 | - | - |
| Zhang et al. [41] | YOLOv11m (High-Throughput Phenotyping Model) | - | 90.0 (mature)/81.0 (immature) | 91.0 (mature)/79.0 (immature) | 90.0/80.0 | - | - |
| Liu et al. [42] | BlueberryYOLO (YOLOv5x + MobileNetv3 + Little-CBAM + MSSENet + EIoU_Loss) | 78.3 | 79.3 | 75.9 | - | - | - |
| Li et al. [43] | BerryNet (YOLOv8 + SAM + CNN maturity classifier) | 78.7 (fruit segmentation)/52.8 (cluster detection) | 75.4 | 71.3 | 73.3 | - | - |
| Zhang et al. [44] | YOLOv11-BSD (nighttime detection, causal robustness) | 91.8 | 89 | 85.7 | 87.3 | - | 66.5 |
Table 3. Overview of studies focused on strawberry detection and classification using computer vision methods.
| Source | Model | mAP@0.5 [%] | Precision [%] | Recall [%] | F1-Score [%] | Accuracy [%] | fps |
|---|---|---|---|---|---|---|---|
| Pérez-Borrero et al. [45] | Improved Mask R-CNN | 43.85 | - | - | - | - | 10 |
| Yu et al. [46] | R-YOLO (MobileNet-V1) | 94.43 | 93.46 | 93.94 | - | - | 18 |
| Ilyas et al. [47] | Straw-Net (encoder–decoder with DAM and PDC) | 91.67 | 91.7 | 87.4 | 89.5 | 88.8 | 53 |
| An et al. [48] | Improved YOLOv5 | 94.26 | 93.15 | 90.76 | 91.91 | 93.15 | 30.5 |
| Fan et al. [49] | Lightweight YOLOv5 | >85 | >80 | >80 | 85–90 | 80–88 | - |
| Hu et al. [31] | YOLOv3 + Mask R-CNN (stereo 3D) | - | 93.9 | - | - | 93.4–94.5 | - |
| Lemsalu et al. [50] | YOLOv5 (TensorRT, edge device Jetson AGX Xavier) | 91.5 (ripe fruit)/43.6 (stalk) | 89 | 89.8 | 89.4 | - | 45 |
| He et al. [51] | YOLOv4 + YOLOv4-tiny (dual-stage 3D localisation) | 80.68 (detection)/86.45 (localisation) | - | - | 0.8 | - | 55.2/4.18 |
| Cai et al. [52] | Improved DeepLabV3+ (ECA-SimAM + CBAM) | 83.05 | - | - | - | 90.9 | 7.67 |
| Tang et al. [53] | Mask R-CNN + Self-Calibrated Convolutions (SCNet50) + SVM | 97.9 | 98 | 84 | 83–98 | 86.6 | 18.2 |
| Visentin et al. [54] | YOLOv8 (detection + ripeness + force ML) | - | 98.2 (plant detection) | 92.4 (fruit detection) | - | 82 (successful picking) | - |
| Ma et al. [55] | STRAW-YOLO (YOLOv8-Pose + EMA + C2f-OREPA + DCN-C2f + Keypoints) | 96 | 91.6 | 91.7 | 91.6 | - | 62.6 |
| Lawal [56] | YOLOStrawberry (YOLOv5-light, Shuffle_Block + ResNet + SE) | 89.7 | ≈86 | ≈83 | ≈84 | - | ≈137 |
| He et al. [57] | Two-step YOLOv8 + YOLOv5-cls | 83.2 (field dataset, YOLOv8) | - | - | - | 95.1 (YOLOv5-cls) | 119 |
| Xie et al. [58] | YOLOv5s (detection)/YOLOv5s-seg (segmentation) | 98.0/93.4 | 97.6/92.5 | 93.7/86.6 | 95.6/89.5 | - | - |
Table 4. Overview of studies focused on blackberry (Zhang et al. [59] and Olisah et al. [28]) and mulberry (Miraei Ashtiani et al. [60] and Qiu et al. [61]) detection and classification using computer vision methods.
| Source | Model | mAP@0.5 [%] | Precision [%] | Recall [%] | F1-Score [%] | Accuracy [%] | fps |
|---|---|---|---|---|---|---|---|
| Zhang et al. [59] | YOLOv7-base | 91.4 | 89 | 90 | 86 | - | 46 |
| Olisah et al. [28] | VGG16 | - | 95.4 | 95.1 | 94.8 | 95.1 | - |
| Miraei Ashtiani et al. [60] | ResNet-18/AlexNet (CNN classification) | - | - | - | - | 98.0–98.6 | - |
| Qiu et al. [61] | Improved YOLOv8n (CSPPC + ADown + P-Head + KD) | 86.8 | 88.9 | 78.1 | 83.2 | - | 19.8 (Jetson Nano) |
Table 5. Comparison of Studies on Robotic Grippers for Soft Fruit Harvesting.
| Source | Crops | Gripper Type/Picking Method | Picking Success Rate | Fruit Damage | Other Comparable Metrics |
|---|---|---|---|---|---|
| Design and fuzzy control of a robotic gripper for efficient strawberry harvesting [92] | Strawberries | Pressure sensor network + fuzzy force control; mechanical fruit removal (imitation of the human hand) | Not specified numerically; efficiency comparable to the human hand | Targeted minimisation through force control; no quantification | Maximum permissible clamping force and tearing force measured; design and verification of fuzzy control |
| Design and Evaluation of a Novel Cable-Driven Gripper with Perception Capabilities for Strawberry Picking Robots [93] | Strawberries | Cable-driven ‘iris’ gripper with internal rotating blade; no contact with fruit (cuts the stem) | 96.77% (isolated strawberries) | Minimal; fruit is not touched by fingers, only the stem is cut | Picking time: 7.49 s (operation), 10.62 s total; tray of 7–12 fruits |
| Design, Development, Integration, and Field Evaluation of a Ridge-Planting Strawberry Harvesting Robot [94] | Strawberries (ridge-planting) | Non-destructive head: gripping the stem + quick cut (laser scanning on fingers) | 49.30% after sorting; 30.23% without sorting | Declared as non-destructive; quantification not specified | Speed: 7 s/fruit (1 arm), 4 s/fruit (2 arms) |
| Prototype of a New Head Grabber for Robotic Strawberry Harvesting with a Vision System [95] | Strawberries (gutters) | Jaw gripping of the stem + clamping + cutting; no contact with the fruit | 90% (harvesting efficiency of the robotic arm) | No mechanical damage in the laboratory; average length of stem remnant 14 mm | 95% accuracy in detecting ripe fruit; target time < 4 s/fruit |
| BerryTwist: A Twisting-Tube Soft Robotic Gripper for Blackberry Harvesting [77] | Blackberries | Soft textile ‘twisting-tube’ cuff; gripping + twisting + pulling | 82% (tearing), 95% (release from gripper) | RDR indicator evaluated; damage quantification not in % | Materials tested: thick/thin gauze, spandex, combination; best results with thick gauze |
| Lab2Field transfer of a robotic raspberry harvester enabled by a soft sensorized PT [5] | Raspberries | Parallel jaws with silicone fingers; clamping force control | 4/7 attempts successful in full-pipeline test (field) | ≈80% harvest with little or no damage (during lab → field transfer) | Alignment errors Δx, Δy evaluated; measurement of tensile and compressive forces during harvesting |
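Several of the grippers compared in Table 5 regulate clamping force rather than jaw position, closing incrementally until a measured contact force reaches a safe target ([5,92]). The sketch below is a purely illustrative, minimal version of such threshold-based closure; the `read_force`/`step_command` callbacks and the 1.5 N target are assumptions for illustration, not values reported by the cited studies:

```python
def close_gripper(read_force, step_command, force_target=1.5, tol=0.2, max_steps=100):
    """Close the jaws in small increments until the measured normal force
    reaches force_target (N), then stop to avoid crushing the fruit.

    read_force: callable returning the current fingertip force in newtons.
    step_command: callable advancing the jaws by one small increment.
    Returns True if the target force was reached within max_steps.
    """
    for _ in range(max_steps):
        if read_force() >= force_target - tol:
            return True  # grasp firm enough; stop before exceeding the target
        step_command()
    return False  # no stable contact achieved (e.g. missed fruit)
```

Real implementations add slip detection and tuned force profiles per cultivar, but the core loop above is why these systems report low damage rates: closure terminates on sensed force, not on a fixed aperture.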
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Suchopár, A.; Kuře, J.; Kuřetová, B.; Hromasová, M. A Review of Integrated Approaches in Robotic Raspberry Harvesting. Agronomy 2025, 15, 2677. https://doi.org/10.3390/agronomy15122677
