A Multi-Fruit Recognition Method for a Fruit-Harvesting Robot Using MSA-Net and Hough Transform Elliptical Detection Compensation

Wang, Shengxue; Luo, Tianhong

doi:10.3390/horticulturae10101024

by Shengxue Wang and Tianhong Luo *

College of Information and Intelligent Manufacturing, Chongqing City Vocational College, Yongchuan, Chongqing 402160, China

* Author to whom correspondence should be addressed.

Horticulturae 2024, 10(10), 1024; https://doi.org/10.3390/horticulturae10101024

Submission received: 5 August 2024 / Revised: 27 August 2024 / Accepted: 3 September 2024 / Published: 26 September 2024

(This article belongs to the Special Issue Application of Smart Technology and Equipment in Horticulture—2nd Edition)

Abstract

In the context of agricultural modernization and intelligentization, automated fruit recognition is important for improving harvest efficiency and reducing labor costs. The variety of fruits commonly planted in orchards and the fluctuations in market prices require farmers to adjust the types of crops they plant flexibly. However, the differences in size, shape, and color among different types of fruits make fruit recognition quite challenging. If each type of fruit requires a separate visual model, training and deploying these models becomes time-consuming and labor-intensive, and system complexity and maintenance costs increase. Therefore, developing a general visual model capable of recognizing multiple types of fruits has great application potential. Existing multi-fruit recognition methods mainly include traditional image processing techniques and deep learning models. Traditional methods perform poorly in dealing with complex backgrounds and diverse fruit morphologies, while current deep learning models may struggle to effectively capture and recognize targets of different scales. To address these challenges, this paper proposes a general fruit recognition model based on the Multi-Scale Attention Network (MSA-Net) and a Hough Transform localization compensation mechanism. By generating multi-scale feature maps through a multi-scale attention mechanism, the model enhances feature learning for fruits of different sizes. In addition, the Hough Transform ellipse detection compensation mechanism uses the shape features of fruits and combines them with MSA-Net recognition results to correct the initial positioning of spherical fruits and improve positioning accuracy. Experimental results show that the MSA-Net model achieves a precision of 97.56, a recall of 92.21, an F1 score of 94.81, and an mAP@0.5 of 92.72 on a comprehensive dataset containing blueberries, lychees, strawberries, and tomatoes, demonstrating the ability to accurately recognize multiple types of fruits. Moreover, the introduction of the Hough Transform mechanism reduces the average localization error by 8.8 pixels for close-up fruit images and by 3.5 pixels for distant fruit images, effectively improving the accuracy of fruit localization.

Keywords:

fruit identification; deep learning; Hough Transform; fruit-harvesting robot

1. Introduction

Fruits constitute one of the largest segments in the global agricultural production market, contributing significantly to the development of the agricultural economy [1]. With a wide variety of types, they offer high nutritional value and excellent taste [2]. In recent years, with the increasing demand for fruits, more farmers have entered the fruit cultivation industry, and orchard coverage has gradually increased [3]. Due to the shortage of agricultural labor, the planting, maintenance, and harvesting of fruits are trending toward mechanization and automation [4]. Accurate and rapid fruit recognition is the crucial foundation for these operations.

Currently, research on visual models capable of detecting single types of fruits is mature. Wang et al. used various improved visual models to accurately recognize strawberries at different distances, effectively guiding orchard yield estimation and harvesting work [5,6]. However, orchards typically grow multiple types of fruits simultaneously. As fruit market prices fluctuate, farmers need to adjust the types of crops they plant to maximize economic benefits [7]. Replacing the visual model used in the detection production process will cause interruptions in the picking process and result in economic losses. Xiao et al. believe that using a visual framework that can detect multiple fruits is a potential cost-saving solution [8]. Therefore, developing a visual model that can simultaneously recognize multiple fruits has practical value.

Sensors are widely used in precision agriculture and can effectively obtain various data for use as inputs in deep learning models for tasks such as fruit variety and fruit disease classification [9,10]. Current methods for identifying multiple types of fruits mainly include deep learning models and image processing algorithms. Deep-learning-based methods extract features of the fruits to be recognized by performing multiple convolution operations on fruit images. Duong et al. efficiently classified various fruits based on fruit recognition using EfficientNet and MixNet [11]. The DCNN model, due to the depth of its convolutional processing, has been applied to fruit recognition, effectively distinguishing multiple fruits with similar characteristics [12]. Rachmawati used the PCoST model, which defines a semantic color for different fruits, effectively identifying various types of fruits [13]. Nguyen explored the advantages and disadvantages of deep learning algorithms, effectively eliminating the impact of external lighting variations to classify different types of fruits [14]. Given the numerous types of fruits to be detected, the similar shapes of different types of fruits, and the significant color differences at different maturity stages of the same fruit, it is currently challenging to detect multiple types of fruits with high precision using deep learning models.

Image processing methods can distinguish different fruits using their color, texture, and size, achieving precise fruit recognition. Wang accurately identified citrus fruits based on texture features and blueberry fruits based on color features using the improved LBP algorithm and the MSRCR algorithm, respectively [15,16]. Image processing methods can achieve high-precision detection of various target fruit types by adjusting algorithm steps. However, when fruit types change, these adjustments lack flexibility. He et al. used a gray-level co-occurrence matrix and a color histogram to extract the color and texture features of different fruits as the basis for identifying different fruits [17].

This paper aimed to develop a general fruit recognition model based on the Multi-Scale Attention Network (MSA-Net) and a Hough Transform localization compensation mechanism to address the limitations of existing methods. First, MSA-Net was applied, using a multi-scale attention mechanism to generate multi-scale feature maps for different types of targets and thereby learn the features of fruits of different sizes (including blueberries, strawberries, lychees, and tomatoes). Second, this paper introduced a Hough Transform elliptical detection compensation mechanism: starting from the model's initial recognition results, the initial localization of spherical fruits was corrected based on the position of the detected ellipse center. This study conducted a series of experiments and evaluated the detection performance of the MSA-Net model based on metrics such as mAP@0.5, precision, recall, and the F1 score. Additionally, it analyzed the performance of the Hough Transform compensation algorithm based on localization pixel errors.

2. Materials and Methods

2.1. Data Acquisition

Images for the dataset were collected in citrus, lychee, tomato, and blueberry orchards located in Zengcheng District, Guangzhou City, Guangdong Province, China (23°16′ N, 113°51′ E). A Canon 200D II DSLR camera (Canon, Tokyo, Japan) was used to capture images. A total of 3331 raw images were collected and saved in .jpeg format, with a resolution of 4032 × 3024 pixels. The collected data were preliminarily screened: the variance of each image after applying the Laplacian operator was calculated, and images with a variance below 100 were considered defocused or blurred and were removed. Overall, 20% of the images were then randomly selected from the initial dataset and augmented through random flipping, tilting, and other operations. The dataset was categorized into several classes, as shown in Figure 1, with the specific composition detailed in Table 1.
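The blur screening described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the sharpness score is the variance of a 4-neighbour Laplacian response on a grayscale image, and the function names and the tiny synthetic images are hypothetical. Only the threshold of 100 comes from the text.

```python
from statistics import pvariance

BLUR_THRESHOLD = 100  # threshold stated in the text


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `gray` is a list of rows of grayscale intensities (0-255).
    The 4-neighbour kernel is an assumption; the paper does not name one.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    return pvariance(responses)


def is_blurry(gray, threshold=BLUR_THRESHOLD):
    """Flag an image whose Laplacian variance falls below the threshold."""
    return laplacian_variance(gray) < threshold


# Hypothetical test images: a high-contrast checkerboard and a flat gray patch.
sharp = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
flat = [[128] * 8 for _ in range(8)]

print(is_blurry(sharp), is_blurry(flat))
```

A strongly textured image produces large positive and negative Laplacian responses and therefore a high variance, whereas a defocused image produces responses close to zero everywhere.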

2.2. MSA-Net

The backbone structure of MSA-Net is shown in Figure 2.

When multiple manually labeled fruit images are input into the backbone of MSA-Net, several subnetworks of the backbone layer, namely low-level, mid-level, and high-level subnetworks, capture and integrate scale-specific features of different fruits. The fruit features captured by different subnetworks are further complemented across scales to achieve network feature fusion. The features fused from the subnetworks are output after multi-scale aggregation, resulting in the extracted intermediate features.

The overall structure of MSA-Net is shown in Figure 3.

MSA-Net uses an asymmetric encoder–subnetwork–decoder architecture. The encoder extracts features at different scales through four residual blocks. In the subnetwork, AFeB and AMB modules are alternately stacked to integrate contextual information. In the decoder, the first three AFuB blocks adaptively sample and transfer fine-grained detail features to coarse-grained contextual features. The final AFuB block samples and transfers image details from noisy inputs, producing the final output of MSA-Net, which detects multiple fruits in different locations. Because the feature maps of different types of fruits have different sizes, MSA-Net uses multiple subnetworks to process feature maps of different sizes, enabling effective feature extraction for different fruits and reducing feature omissions.

2.3. Hough Transform Compensation Mechanism

To refine the recognition results of the MSA-Net model, this study proposed an ellipse compensation matrix based on Hough Transform, grounded in the shape characteristics of the fruits to be detected. The method’s workflow is illustrated in Figure 4.

The images detected by MSA-Net served as the input for the algorithm. In step 1, the Canny algorithm was applied to the input images to extract the edge lines of the fruits and branches. In step 2, three points were randomly selected and all points within neighborhoods of the same size centered on these three points were fitted into an ellipse using the least squares method. In step 3, a fourth point was randomly selected from the edge points, and it was determined whether this point lies on the fitted ellipse. If so, an ellipse was constructed using these four points. If the overlap area between the bounding box and the constructed ellipse exceeded 70%, a line was drawn between the center point of the bounding box and the center point of the ellipse, and the midpoint of this line was taken as the corrected fruit center coordinate.
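The final correction step can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the ellipse is taken as axis-aligned and already fitted (center `cx, cy`, semi-axes `a, b`), and the "overlap area" is interpreted as the fraction of ellipse pixels that fall inside the bounding box, which the text does not define precisely. All function names are hypothetical; only the 70% threshold and the midpoint rule come from the text.

```python
import math


def ellipse_box_overlap(box, ellipse):
    """Fraction of the ellipse's pixels that lie inside the bounding box.

    box = (x1, y1, x2, y2); ellipse = (cx, cy, a, b), axis-aligned (assumption).
    """
    x1, y1, x2, y2 = box
    cx, cy, a, b = ellipse
    in_ellipse = in_both = 0
    for y in range(math.floor(cy - b), math.ceil(cy + b) + 1):
        for x in range(math.floor(cx - a), math.ceil(cx + a) + 1):
            if ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0:
                in_ellipse += 1
                if x1 <= x <= x2 and y1 <= y <= y2:
                    in_both += 1
    return in_both / in_ellipse if in_ellipse else 0.0


def corrected_center(box, ellipse, min_overlap=0.70):
    """Midpoint of box center and ellipse center when the overlap exceeds 70%.

    Otherwise the original bounding-box center is kept unchanged.
    """
    x1, y1, x2, y2 = box
    box_center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    if ellipse_box_overlap(box, ellipse) <= min_overlap:
        return box_center
    cx, cy = ellipse[0], ellipse[1]
    return ((box_center[0] + cx) / 2.0, (box_center[1] + cy) / 2.0)


# Hypothetical detection: a 100 x 100 box and an ellipse fitted slightly off-center.
print(corrected_center((0, 0, 100, 100), (54, 48, 40, 40)))
```

Taking the midpoint rather than replacing the box center outright means that neither the network's estimate nor the ellipse fit is trusted completely.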

2.4. Experimental Design

To validate the effectiveness of the MSA-Net model and the Hough Transform algorithm in this study, three sets of experiments were conducted sequentially.

In the first set of experiments, the MSA-Net model proposed in this study and several common deep learning models (YOLOv4-tiny, YOLOv5s, YOLOv8s, and Faster R-CNN) were trained on the same dataset, and their performance was compared.

The second set of experiments was used to test the performance of the proposed MSA-Net model to identify fruits of different categories in natural environments.

In the third set of experiments, 100 images of each type of fruit detected by MSA-Net were selected. The Hough Transform compensation mechanism was used to correct the fruit localization in each image. The actual center positions of the detected targets in the images were precisely determined manually using Photoshop 2018 software. The errors between the detection center positions before and after algorithm correction and the manually marked positions were recorded and analyzed.
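The error measurement in the third experiment can be sketched as follows. The text reports localization errors in pixels but does not state the distance measure, so the Euclidean distance between a detected center and the manually marked center is an assumption here, and the function names and sample coordinates are hypothetical.

```python
import math


def localization_error(detected, reference):
    """Euclidean pixel distance between a detected and a reference center."""
    return math.hypot(detected[0] - reference[0], detected[1] - reference[1])


def mean_localization_error(detected_centers, reference_centers):
    """Average error over paired detected and manually marked centers."""
    errors = [localization_error(d, r)
              for d, r in zip(detected_centers, reference_centers)]
    return sum(errors) / len(errors)


# Hypothetical centers (in pixels) for two fruits.
detected = [(103, 204), (50, 50)]
marked = [(100, 200), (50, 50)]
print(mean_localization_error(detected, marked))
```

Computing this value once with the MSA-Net centers and once with the compensated centers gives the before and after errors for each image.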

The experimental hardware consisted of a computer with an Intel i5-14600KF processor, 32 GB RAM, and a GeForce RTX 4070 GPU (12 GB GDDR6X). The computer was configured with the CUDA 12.0 parallel computing architecture and used the NVIDIA cuDNN 8.9.3 GPU acceleration library. The software environment was built on the PyTorch deep learning framework (Python version 3.9). Data pre-processing was carried out in MATLAB R2022b. Anaconda was used to configure and manage the virtual environment, and programs were developed and run in PyCharm 2020. Model performance metrics mainly included P (precision), R (recall), the F1 score (the harmonic mean of precision and recall), the average precision (AP), and the mean average precision (mAP), as shown in Formulas (1)–(3).

\begin{cases} \mathrm{precision} = \dfrac{T_{p}}{T_{p} + F_{p}} \\ \mathrm{recall} = \dfrac{T_{p}}{T_{p} + F_{N}} \\ F_{1} = \dfrac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \end{cases}

(1)

AP = \frac{\sum \mathrm{precision}}{N}

(2)

mAP@0.5 = \frac{\sum_{i = 1}^{N_{C}} {AP}_{i}}{N_{C}}

(3)

where T_{p} represents the number of fruits correctly identified, F_{p} represents the number of fruits incorrectly identified, F_{N} represents the number of missed fruits, N represents the total number of images, and N_{C} represents the number of fruit categories. AP, the integral of precision over recall, is equal to the area under the P–R curve, and mAP@0.5 is the mean of the AP values of all categories, computed at an IoU threshold of 0.5.
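Formulas (1) and (3) can be checked with a few lines of code. The sketch below is illustrative only: the function names and the detection counts are hypothetical, while the precision and recall values passed to `f1_from` are the MSA-Net figures reported in Table 2.

```python
def precision_recall_f1(tp, fp, fn):
    """Formula (1): precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    return 2 * precision * recall / (precision + recall)


def mean_average_precision(ap_per_class):
    """Formula (3): mean of the per-category AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical counts: 90 correct detections, 10 false detections, 30 missed fruits.
print(precision_recall_f1(90, 10, 30))

# Precision and recall of MSA-Net from Table 2 (in percent).
print(round(f1_from(97.56, 92.21), 2))
```

The second call reproduces the F1 score listed for MSA-Net in Table 2 from its precision and recall.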

3. Results and Discussion

3.1. Detection Performance of the MSA-Net Model

In the first set of experiments, each model was trained for 300 epochs. The performance results of the different models are shown in Table 2.

The precision of the MSA-Net model was 97.56, which is 0.28 lower than that of YOLOv8s but higher than that of YOLOv5s, Faster R-CNN, and YOLOv4-tiny by 1.05, 4.37, and 6.7, respectively. MSA-Net achieved the highest recall value of 92.21, which is 8.64, 7.24, 1.73, and 4.47 higher than the recall values of YOLOv8s, YOLOv5s, Faster R-CNN, and YOLOv4-tiny, respectively. MSA-Net had the highest F1 score of 94.81, which is 4.67, 4.44, 2.99, and 5.54 higher than the F1 scores of YOLOv8s, YOLOv5s, Faster R-CNN, and YOLOv4-tiny, respectively. MSA-Net also achieved the highest mAP@0.5 and mAP@0.5:0.95 values, which were 92.72 and 61.85, respectively.

While the precision of MSA-Net is comparable to that of the other models (slightly below YOLOv8s and above the rest), its recall is clearly higher. This is due to the model's use of a multi-scale network for detection, in which multiple subnetworks process the feature maps of fruits of different sizes according to the type of fruit being detected. This comprehensive approach to recognizing various types of fruits reduces missed detections, resulting in a higher recall value. The detection performance of MSA-Net is similar to that of conventional models for detecting single fruits, and its accuracy is sufficient for practical detection tasks [18]. MSA-Net's highest mAP@0.5 and mAP@0.5:0.95 values indicate that the model has high recognition accuracy and can detect multiple types of fruits with high confidence.

The changes in mAP@0.5 during the training process for different models are shown in Figure 5.

From Figure 5, it can be seen that the mAP@0.5 curve of the MSA-Net model rose rapidly with the increase in epochs and gradually stabilized, ultimately converging to a value superior to that of other models. MSA-Net uses multiple subnetworks to extract features, enabling the model to quickly and effectively learn the characteristics of various fruits and achieve a high mAP value.

3.2. Analysis of MSA-Net Recognition Results

The detection results of MSA-Net for fruits under different environmental conditions are shown in Figure 6.

From the detection results of MSA-Net, it can be seen that for strawberries under various environmental conditions, such as occlusion and backlighting, the model performs comprehensive detection with a confidence level above 0.82. For tomatoes in close-up views, distant views, and dense occlusion, the model detects the fruits with a confidence level above 0.78. For lychees and blueberries under various conditions, such as distant views, close-ups, backlighting, and exposure, the model comprehensively detects the fruits in the images with a confidence level above 0.70. The robustness of MSA-Net is reflected in its recognition accuracy under various lighting and distance conditions. The multi-scale detection layers of MSA-Net effectively adapt to different fruits in various environments, achieving accurate detection.

3.3. Analysis of Hough Transform Compensation Results

The Hough Transform compensation mechanism was used to correct the detection results of MSA-Net. The errors before and after correction are shown in Table 3.

As shown in Table 3, the error reduction depended on the distance and the fruit type. For close-up fruit images, the average localization error across the different fruits decreased by 8.8 pixels. Among these, the localization error for tomatoes decreased the most, by 19 pixels. The localization error for blueberries decreased by an average of 11 pixels, for lychees by an average of 4 pixels, and for strawberries by only 1 pixel. For distant fruit images, the average localization error across the different fruits decreased by 3.5 pixels. The localization error for tomatoes again decreased the most, by 8 pixels, while the errors for blueberries, lychees, and strawberries decreased by 1, 2, and 3 pixels, respectively. The positioning error of fruits at close range was reduced by an average of 24.97%, while the positioning error of fruits at long distances was reduced by an average of 14.01%.
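The averages quoted here can be reproduced from the per-fruit values in Table 3. The short sketch below is illustrative; the variable and function names are hypothetical, and the numbers are the table's initial and compensated mean errors in pixels. The unrounded mean reductions come out as 8.75 and 3.5 pixels, which match the 8.8 and 3.5 pixels given in the Abstract and Conclusions.

```python
# (initial error, error after compensation) in pixels, taken from Table 3.
close_range = {"blueberry": (29, 18), "lychee": (34, 30),
               "tomato": (41, 22), "strawberry": (26, 25)}
distant_range = {"blueberry": (18, 17), "lychee": (25, 23),
                 "tomato": (30, 22), "strawberry": (19, 16)}


def reductions(table):
    """Per-fruit error reduction in pixels."""
    return {fruit: before - after for fruit, (before, after) in table.items()}


def mean_reduction(table):
    """Mean of the per-fruit reductions."""
    values = reductions(table).values()
    return sum(values) / len(values)


def improvement_percent(before, after):
    """Relative error reduction in percent."""
    return 100.0 * (before - after) / before


print(mean_reduction(close_range), mean_reduction(distant_range))
print(round(improvement_percent(*close_range["tomato"]), 2))
```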

From these results, it can be seen that the Hough Transform algorithm significantly reduces the localization error for tomatoes at different distances because the shape of tomatoes closely matches an ellipse, making the Hough Transform results more accurate and effectively correcting the localization error. In contrast, the shape of strawberries differs more from a circle, resulting in a smaller change in the localization error after the algorithm is applied. Lychees and blueberries, having shapes closer to a circle, also show good error correction results.

The correction effect of the Hough Transform compensation mechanism is shown in Figure 7.

From the aforementioned results, it can be seen that the proposed Hough Transform compensation mechanism can further improve the localization accuracy for different types of spherical fruits. More accurate coordinates can be used in subsequent operations, such as picking, thereby increasing the success rate of these tasks. However, the compensation mechanism is most effective for approximately circular fruits. For irregularly shaped fruits (such as strawberries, which are approximately diamond shaped), Hough Transform ellipse detection is less accurate, which weakens the compensation, although it still reduces the positioning error to some extent.

The method proposed in this article can be easily integrated into automation systems to reduce labor and improve efficiency [19]. For example, combining this technology with other innovations, such as drones, soil sensors, or weather models, can create comprehensive smart agriculture solutions that improve fruit recognition and overall farm management [20]. The proposed method demonstrates high accuracy in correcting the localization of spherical fruits, but it still has some limitations. The method uses Hough Transform for elliptical detection, which works well when dealing with regularly shaped or symmetrical fruits. However, for irregularly shaped or asymmetrical fruits, such as strawberries, the model’s performance is noticeably constrained. This drawback primarily arises from the limitations of elliptical detection in recognizing non-spherical objects [21]. The current model also lacks robustness when handling various environmental conditions, such as changes in lighting, occlusion, and fruit overlap, leading to blurred fruit boundaries and consequently affecting the detection accuracy of Hough Transform [22]. Future research should consider incorporating new morphological feature analysis methods and developing localization compensation algorithms adaptable to various fruit shapes to enhance the model’s applicability across different fruit morphologies and improve its robustness under complex environmental conditions [23].

4. Conclusions

This study aimed to develop a method capable of recognizing various fruits. First, an MSA-Net model was proposed for identifying multiple fruits. Based on the shape characteristics of different fruits, a Hough Transform compensation mechanism was introduced to further refine the model’s recognition results. The specific conclusions are as follows:

(1) MSA-Net can effectively learn the features of different types of fruits. For a comprehensive dataset including blueberries, lychees, strawberries, and tomatoes, the model achieved a precision of 97.56, a recall of 92.21, an F1 score of 94.81, and an mAP@0.5 of 92.72, accurately identifying various fruits in the environment.
(2) The introduction of the Hough Transform ellipse detection compensation mechanism further refines the initial localization of spherical fruits. For close-up fruit images, the average localization error for different fruits decreased by 8.8 pixels. For distant fruit images, the average localization error for different fruits decreased by 3.5 pixels, further improving the accuracy of fruit localization.

The proposed model for recognizing multiple fruits has a certain degree of scalability. When new fruit varieties need to be recognized, the existing model can be appropriately extended and fine-tuned, eliminating the need to develop a new model from scratch. This demonstrates considerable application value.

Author Contributions

Conceptualization, methodology, S.W. and T.L.; software, data curation, S.W.; writing—original draft, S.W.; writing—review and editing, T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Chongqing Natural Science Foundation Innovation and Development Joint Fund (grant no. CSTB2024NSCQ-LZX0128).

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors thank the editor and anonymous reviewers for providing helpful suggestions for improving the quality of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Osorio, L.; Flórez-López, E.; Grande-Tovar, C. The Potential of Selected Agri-Food Loss and Waste to Contribute to a Circular Economy: Applications in the Food, Cosmetic and Pharmaceutical Industries. Molecules 2021, 26, 515.
2. Río-Celestino, M.; Font, R. The Health Benefits of Fruits and Vegetables. Foods 2020, 9, 369.
3. Andaryani, S.; Sloan, S.; Nourani, V.; Keshtkar, H. The utility of a hybrid GEOMOD-Markov Chain model of land-use change in the context of highly water-demanding agriculture in a semi-arid region. Ecol. Inform. 2021, 64, 101332.
4. Zhang, Z.; Igathinathane, C.; Li, J.; Cen, H.; Lu, Y.; Flores, P. Technology progress in mechanical harvest of fresh market apples. Comput. Electron. Agric. 2020, 175, 105606.
5. Wang, C.; Han, Q.; Li, C.; Li, J.; Kong, D.; Wang, F.; Zou, X. Assisting the Planning of Harvesting Plans for Large Strawberry Fields through Image-Processing Method Based on Deep Learning. Agriculture 2024, 14, 560.
6. Wang, C.; Wang, H.; Han, Q.; Zhang, Z.; Kong, D.; Zou, X. Strawberry Detection and Ripeness Classification Using YOLOv8+ Model and Image Processing Method. Agriculture 2024, 14, 751.
7. Fu, H.; Fang, Y.; Qu, Y.; Pan, Y. A Sustainable Economic System to Face the Fluctuation of Fruit Prices: Based on a Small-Region DSGE Model. Discret. Dyn. Nat. Soc. 2021, 2021, 6693709.
8. Xiao, F.; Wang, H.; Xu, Y.; Zhang, R. Fruit Detection and Recognition Based on Deep Learning for Automatic Harvesting: An Overview and Review. Agronomy 2023, 13, 1625.
9. Soussi, A.; Zero, E.; Sacile, R.; Trinchero, D.; Fossa, M. Smart Sensors and Smart Data for Precision Agriculture: A Review. Sensors 2024, 24, 2647.
10. Nasir, I.M.; Bibi, A.; Shah, J.H.; Khan, M.A.; Sharif, M.; Iqbal, K.; Nam, Y.; Kadry, S. Deep learning-based classification of fruit diseases: An application for precision agriculture. Comput. Mater. Contin. 2021, 66, 1949–1962.
11. Duong, L.T.; Nguyen, P.T.; Di Sipio, C.; Di Ruscio, D. Automated fruit recognition using EfficientNet and MixNet. Comput. Electron. Agric. 2020, 171, 105326.
12. Hussain, D.; Hussain, I.; Ismail, M.; Alabrah, A.; Ullah, S.; Alaghbari, H. A Simple and Efficient Deep Learning-Based Framework for Automatic Fruit Recognition. Comput. Intell. Neurosci. 2022, 2022, 35237311.
13. Rachmawati, E.; Supriana, I.; Khodra, M.L.; Firdaus, F. Integrating semantic features in fruit recognition based on perceptual color and semantic template. Inf. Process. Agric. 2022, 9, 316–334.
14. Cuong, N.; Luong, A.; Trinh, T.; Ho, P.; Meesad, P.; Nguyen, T. Intelligent Fruit Recognition System Using Deep Learning. Lect. Notes Netw. Syst. 2021, 251, 13–22.
15. Wang, C.; Han, Q.; Li, C.; Zou, T.; Zou, X. Fusion of fruit image processing and deep learning: A study on identification of citrus ripeness based on R-LBP algorithm and YOLO-CIT model. Front. Plant Sci. 2024, 15, 1397816.
16. Wang, C.; Han, Q.; Li, J.; Li, C.; Zou, X. YOLO-BLBE: A Novel Model for Identifying Blueberry Fruits with Different Maturities Using the I-MSRCR Method. Agronomy 2024, 14, 658.
17. He, Q.; Xia, K.; Pan, H. Fruit Recognition Using Color Statistics. In Proceedings of the 2022 International Conference on Automation, Robotics and Computer Engineering, Wuhan, China, 16–17 December 2022.
18. Wang, L.; Qin, M.; Lei, J.; Wang, X.; Tan, K. Blueberry maturity recognition method based on improved YOLOv4-Tiny. Trans. Chin. Soc. Agric. Eng. 2021, 37, 170–178.
19. Zhang, L.; Gui, G.; Khattak, A.M.; Wang, M.; Gao, W.; Jia, J. Multi-Task Cascaded Convolutional Networks Based Intelligent Fruit Detection for Designing Automated Robot. IEEE Access 2019, 7, 56028–56038.
20. Elbasi, E.; Mostafa, N.; AlArnaout, Z.; Aymen, I.Z.; Cina, E.; Varghese, G.; Shdefat, A.; Topcu, A.E.; Abdelbaki, W.; Mathew, S.; et al. Artificial Intelligence Technology in the Agricultural Sector: A Systematic Literature Review. IEEE Access 2023, 11, 171–202.
21. Mao, S.; Li, Y.; Ma, Y.; Zhang, B.; Zhou, J.; Wang, K. Automatic cucumber recognition algorithm for harvesting robots in the natural environment using deep learning and multi-feature fusion. Comput. Electron. Agric. 2020, 170, 105254.
22. Kheiralipour, K.; Kazemi, A. A new method to determine morphological properties of fruits and vegetables by image processing technique and nonlinear multivariate modeling. Int. J. Food Prop. 2020, 23, 368–374.
23. Saedi, S.; Khosravi, H. A deep neural network approach towards real-time on-branch fruit recognition for precision horticulture. Expert Syst. Appl. 2020, 159, 113594.

Figure 1. Dataset examples: (a) distant tomatoes, (b) exposed tomatoes, (c) close-up of tomatoes, (d) backlit strawberries, (e) strawberries under natural light, (f) exposed strawberries, (g) close-up of lychees, (h) distant lychees, (i) exposed lychees, (j) exposed blueberries, (k) distant blueberries, and (l) backlit blueberries.

Figure 2. The backbone structure of MSA-Net.

Figure 3. The overall structure of MSA-Net.

Figure 4. The Hough Transform compensation mechanism.

Figure 5. The mAP@0.5 variation during the training process.

Figure 6. Identification results of different fruits: (a–d) identification results of strawberries, (e–h) tomato recognition results, (i–l) lychee recognition results, and (m–p) blueberry recognition results.

Figure 7. Hough Transform compensation results for different fruits: (a) blueberry; (b) lychee; (c) tomato; (d) strawberry.

Table 1. Dataset composition table.

Image Type	Number of Blueberry Images	Number of Strawberry Images	Number of Lychee Images	Number of Tomato Images	Total
Distant fruit image	158	214	174	131
Close-range-exposure fruit image	179	194	216	184
Close-range natural-light fruit image	314	337	402	367
Close-range backlit fruit image	144	95	138	84
Total	795	840	930	766	3331

Table 2. Performance of different models, including the MSA-Net model.

Model	Precision	Recall	F1	mAP@0.5	mAP@0.5:0.95
MSA-Net	97.56	92.21	94.81	92.72	61.85
YOLOv8s	97.84	83.57	90.14	86.77	56.14
YOLOv5s	96.51	84.97	90.37	88.69	58.79
Faster R-CNN	93.19	90.48	91.82	76.88	50.77
YOLOv4-tiny	90.86	87.74	89.27	86.31	55.98

Table 3. Hough Transform compensation results.

Fruit Type	Initial Positioning Average Error (Close Range)/Pixel	Average Error after Compensation (Close Range)/Pixel	Positioning Accuracy Improvement (Close Range)/Percentage	Initial Positioning Average Error (Distant Range)/Pixel	Average Error after Compensation (Distant Range)/Pixel	Positioning Accuracy Improvement (Distant Range)/Percentage
Blueberry	29	18	37.93%	18	17	5.55%
Lychee	34	30	11.76%	25	23	8.00%
Tomato	41	22	46.34%	30	22	26.67%
Strawberry	26	25	3.84%	19	16	15.79%
Average	33	24	24.97%	23	20	14.01%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
