A Method for Tomato Ripeness Recognition and Detection Based on an Improved YOLOv8 Model
Abstract
1. Introduction
2. Materials and Methods
2.1. Image Collection and Dataset Construction
2.1.1. Image Collection
2.1.2. Image Annotation and Dataset Creation
2.2. YOLOv8+ Model Building
2.2.1. RCA-CBAM Attention Mechanism Module
2.2.2. Loss Function Improvement
- 1. The overall architecture, shown in Figure 4, consists of three parts:
  - (1) Focal Loss part: handles the classification problem.
  - (2) CIoU Loss part: optimizes localization for target detection.
  - (3) Feature Fusion and Reconstruction part: enhances the model's feature representation.
- 2. Focal Loss part.
- 3. CIoU Loss part, computed from three terms:
  - (1) Distance loss: the Euclidean distance between the center points of the predicted tomato box and the annotated ground-truth box, normalized by the diagonal length of the smallest enclosing region.
  - (2) Aspect-ratio loss: the difference in aspect ratio between the predicted box and the annotated ground-truth box.
  - (3) IoU loss: the loss on the degree of overlap between the predicted tomato box and the annotated ground-truth box.
- 4. Feature Fusion part:
  - (1) Feature fusion: the features from the Focal Loss part and the CIoU Loss part are combined by weighted summation.
  - (2) Self-attention layer: a self-attention mechanism processes the fused features, letting the model automatically focus on fruit regions with salient ripeness features.
  - (3) Cross-attention layer: the model computes correlations across multiple feature spaces to establish links between different dimensions, such as color and shape.
  - (4) Feature reconstruction layer: the attention-processed features are remapped to the original feature space, enabling computation of the reconstruction loss in the next step.
  - (5) Reconstruction loss calculation.
  - (6) Inner-Focaleriou Loss: the sum of the Focal Loss, the CIoU Loss, and the reconstruction loss.
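The components enumerated above can be written out explicitly. The sketch below uses the standard Focal and CIoU formulations; the reconstruction term is shown as a generic squared error over the remapped features, since its exact form is defined by the paper's feature-reconstruction layer:

```latex
% Focal Loss (classification), with focusing parameter \gamma and balance weight \alpha_t
\mathrm{FL}(p_t) = -\alpha_t \, (1 - p_t)^{\gamma} \log(p_t)

% CIoU Loss (localization): IoU term + center-distance term + aspect-ratio term,
% where \rho is the Euclidean distance between box centers and c the diagonal
% of the smallest enclosing region
\mathcal{L}_{\mathrm{CIoU}} = 1 - \mathrm{IoU}
  + \frac{\rho^2\!\left(b, b^{gt}\right)}{c^2} + \alpha v, \qquad
v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{\!2},
\quad \alpha = \frac{v}{(1 - \mathrm{IoU}) + v}

% Reconstruction loss between original features F and remapped features F'
\mathcal{L}_{\mathrm{rec}} = \lVert F - F' \rVert_2^2

% Combined loss, as stated in item (6)
\mathcal{L} = \mathrm{FL} + \mathcal{L}_{\mathrm{CIoU}} + \mathcal{L}_{\mathrm{rec}}
```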
2.2.3. BiFPN Feature Fusion Network
- 1. Unify the number of channels.
- 2. Top-down feature fusion.
- 3. Bottom-up feature fusion.
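The top-down and bottom-up fusion steps in BiFPN typically use EfficientDet's fast normalized fusion, where each input feature map gets a learnable non-negative weight. A minimal sketch (the toy 4×4 "feature maps" and unit weights are illustrative, not the trained values):

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Weighted BiFPN fusion: O = sum(w_i * I_i) / (sum(w_j) + eps).

    Weights are clamped to be non-negative (ReLU), so the result is a
    convex-like combination of the input feature maps at one pyramid level.
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU keeps weights >= 0
    fused = sum(wi * f for wi, f in zip(w, features))
    return fused / (w.sum() + eps)

# Toy single-channel maps: a same-level input and an upsampled deeper level
p_td = fast_normalized_fusion(
    [np.ones((4, 4)), 2 * np.ones((4, 4))],
    weights=[1.0, 1.0],
)
```

With equal weights the fused map is (up to the small ε) the plain average of its inputs; during training the weights shift to favor the more informative scale.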
2.2.4. YOLOv8+ Overall Structure
- 1. Backbone: extracts the basic features of the image.
- 2. Neck: contains the BiFPN structure for multi-scale feature fusion, ensuring that targets of different sizes are effectively represented in the feature map.
- 3. Head: outputs the predictions, including target location (bounding box), category, and confidence score.
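The three stages compose sequentially. A purely structural sketch (function names, the P3–P5 pyramid levels, and the placeholder outputs are illustrative, not the authors' implementation):

```python
def backbone(image):
    # Extract multi-scale base features, e.g. three pyramid levels
    return {"P3": ("feat3", image), "P4": ("feat4", image), "P5": ("feat5", image)}

def neck(features):
    # BiFPN-style cross-scale fusion (placeholder: tag each level as fused)
    return {level: ("fused", feat) for level, feat in features.items()}

def head(features):
    # One prediction set per scale: bounding box, category, confidence
    return [{"level": level, "box": None, "cls": None, "conf": None}
            for level in features]

preds = head(neck(backbone("img")))
```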
3. Results
3.1. Experimental Environment
3.2. Indicators for Model Evaluation
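The P, R, and mAP@0.5 columns in the tables below follow the standard detection definitions: precision and recall from true/false positives and false negatives, and AP as the area under the precision-recall curve at an IoU threshold of 0.5. A sketch (the counts and P-R points are illustrative):

```python
def precision(tp, fp):
    # P = TP / (TP + FP): fraction of predicted tomatoes that are correct
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    # R = TP / (TP + FN): fraction of real tomatoes that were found
    return tp / (tp + fn) if (tp + fn) else 0.0

def average_precision(recalls, precisions):
    """AP as the area under the P-R curve, via the rectangle rule over
    monotonically increasing recall points. mAP@0.5 averages this AP
    over all ripeness classes at an IoU threshold of 0.5."""
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_r) * p
        prev_r = r
    return ap

p = precision(90, 10)   # 0.9
r = recall(90, 30)      # 0.75
ap = average_precision([0.5, 1.0], [1.0, 0.5])  # 0.5*1.0 + 0.5*0.5 = 0.75
```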
3.3. Ablation Experiment
3.3.1. Visualization of Results and Indicators
3.3.2. Detection Results
3.4. Comparison Experiment
3.4.1. Visualization of Results and Indicators
3.4.2. Test Results
4. Discussion
4.1. Discussion of Ablation Experiment
4.2. Comparison Experiment Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Food and Agriculture Organization (FAO). FAOSTAT Database; FAO: Rome, Italy, 2020.
- Tilesi, F.; Lombardi, A.; Mazzucato, A. Scientometric and Methodological Analysis of the Recent Literature on the Health-Related Effects of Tomato and Tomato Products. Foods 2021, 10, 1905.
- Cheng, B.; Li, H. Improving Water Saving Measures Is the Necessary Way to Protect the Ecological Base Flow of Rivers in Water Shortage Areas of Northwest China. Ecol. Indic. 2021, 123, 107347.
- Yang, G.; Wang, J.; Nie, Z.; Yang, H.; Yu, S. A Lightweight YOLOv8 Tomato Detection Algorithm Combining Feature Enhancement and Attention. Agronomy 2023, 13, 1824.
- Ali, H.; Lali, M.I.; Nawaz, M.Z.; Sharif, M.; Saleem, B.A. Symptom-Based Automated Detection of Citrus Diseases Using Color Histogram and Textural Descriptors. Comput. Electron. Agric. 2017, 138, 92–104.
- Abera, G.; Ibrahim, A.M.; Forsido, S.F.; Kuyu, C.G. Assessment on Post-Harvest Losses of Tomato (Lycopersicon esculentum Mill.) in Selected Districts of East Shewa Zone of Ethiopia Using a Commodity System Analysis Methodology. Heliyon 2020, 6, e03749.
- Kaur, K.; Gupta, O.P. A Machine Learning Approach to Determine Maturity Stages of Tomatoes. Orient. J. Comput. Sci. Technol. 2017, in press.
- Sugino, N.; Watanabe, T.; Kitazawa, H. Effect of Transportation Temperature on Tomato Fruit Quality: Chilling Injury and Relationship Between Mass Loss and a* Values. Food Meas. 2022, 16, 2884–2889.
- Su, F.; Zhao, Y.; Wang, G.; Liu, P.; Yan, Y.; Zu, L. Tomato Ripeness Classification Based on SE-YOLOv3-MobileNetV1 Network under Nature Greenhouse Environment. Agronomy 2022, 12, 1638.
- Wiesner-Hanks, T.; Wu, H.; Stewart, E.; DeChant, C.; Kaczmar, N.; Lipson, H.; Gore, M.A.; Nelson, R.J. Millimeter-Level Plant Disease Detection from Aerial Photographs via Deep Learning and Crowdsourced Data. Front. Plant Sci. 2019, 10, 1550.
- Behera, S.K.; Rath, A.K.; Sethy, P.K. Maturity Status Classification of Papaya Fruits Based on Machine Learning and Transfer Learning Approach. Inf. Process. Agric. 2021, 8, 244–250.
- Halstead, M.; McCool, C.; Denman, S.; Perez, T.; Fookes, C. Fruit Quantity and Ripeness Estimation Using a Robotic Vision System. IEEE Robot. Autom. Lett. 2018, 3, 2995–3002.
- Fu, L.; Gao, F.; Wu, J.; Li, R.; Karkee, M.; Zhang, Q. Application of Consumer RGB-D Cameras for Fruit Detection and Localization in Field: A Critical Review. Comput. Electron. Agric. 2020, 177, 105687.
- Lu, J.; Tan, L.; Jiang, H. Review on Convolutional Neural Network (CNN) Applied to Plant Leaf Disease Classification. Agriculture 2021, 11, 707.
- Saleem, M.H.; Potgieter, J.; Arif, K.M. Automation in Agriculture by Machine and Deep Learning Techniques: A Review of Recent Developments. Precis. Agric. 2021, 22, 2053–2091.
- Zhang, S.; Chi, C.; Yao, Y.; Lei, Z.; Li, S.Z. Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Event, 14–19 June 2020; pp. 9759–9768.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Newry, UK, 2017; pp. 5998–6008.
- Bharati, P.; Pramanik, A. Deep Learning Techniques—R-CNN to Mask R-CNN: A Survey. In Computational Intelligence in Pattern Recognition; Das, A., Nayak, J., Naik, B., Pati, S., Pelusi, D., Eds.; Advances in Intelligent Systems and Computing; Springer: Singapore, 2020; Volume 999, pp. 657–668.
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 13–16 December 2015; pp. 1440–1448.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Zhang, K.; Wu, Q.; Chen, Y. Detecting Soybean Leaf Disease from Synthetic Image Using Multi-Feature Fusion Faster R-CNN. Comput. Electron. Agric. 2021, 183, 106064.
- Xu, X.; Zhao, M.; Shi, P.; Ren, R.; He, X.; Wei, X.; Yang, H. Crack Detection and Comparison Study Based on Faster R-CNN and Mask R-CNN. Sensors 2022, 22, 1215.
- Terven, J.; Córdova-Esparza, D.-M.; Romero-González, J.-A. A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
- Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890.
- Liu, G.; Nouaze, J.C.; Touko Mbouembe, P.L.; Kim, J.H. YOLO-Tomato: A Robust Algorithm for Tomato Detection Based on YOLOv3. Sensors 2020, 20, 2145.
- Bai, Y.; Yu, J.; Yang, S.; Ning, J. An Improved YOLO Algorithm for Detecting Flowers and Fruits on Strawberry Seedlings. Biosyst. Eng. 2024, 237, 1–12.
- Wang, C.; Han, Q.; Li, J.; Li, C.; Zou, X. YOLO-BLBE: A Novel Model for Identifying Blueberry Fruits with Different Maturities Using the I-MSRCR Method. Agronomy 2024, 14, 658.
- Zhai, S.; Shang, D.; Wang, S.; Dong, S. DF-SSD: An Improved SSD Object Detection Algorithm Based on DenseNet and Feature Fusion. IEEE Access 2020, 8, 24344–24357.
- Li, Y.; Huang, H.; Xie, Q.; Yao, L.; Chen, Q. Research on a Surface Defect Detection Algorithm Based on MobileNet-SSD. Appl. Sci. 2018, 8, 1678.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and Efficient Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 10781–10790.
- Ding, Y.; Zhou, Y.; Zhu, Y.; Ye, Q.; Jiao, J. Selective Sparse Sampling for Fine-Grained Image Recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6598–6607.
- Chen, J.; Mai, H.; Luo, L.; Chen, X.; Wu, K. Effective Feature Fusion Network in BIFPN for Small Object Detection. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; pp. 699–703.
- Jiang, B.; Luo, R.; Mao, J.; Xiao, T.; Jiang, Y. Acquisition of Localization Confidence for Accurate Object Detection. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 784–799.
| Method | P | R | mAP@0.5 |
|---|---|---|---|
| YOLOv8n | 0.831 | 0.846 | 0.907 |
| YOLOv8n (+RCA-CBAM) | 0.855 | 0.844 | 0.909 |
| YOLOv8n (+RCA-CBAM+BiFPN) | 0.865 | 0.839 | 0.930 |
| YOLOv8+ | 0.917 | 0.873 | 0.958 |
| Method | P | R | mAP@0.5 |
|---|---|---|---|
| YOLOv5 | 0.863 | 0.830 | 0.915 |
| YOLOv7 | 0.872 | 0.854 | 0.920 |
| YOLOv8s | 0.848 | 0.869 | 0.918 |
| YOLOv8+ | 0.917 | 0.873 | 0.958 |
| Method | N | D | D% | tN (ms) | FPS |
|---|---|---|---|---|---|
| YOLOv5 | 116 | 979 | 843.97% | 111.4 | 9.0 |
| YOLOv7 | 116 | 4155 | 3581.90% | 113.0 | 8.9 |
| YOLOv8s | 116 | 116 | 100.00% | 16.2 | 61.7 |
| YOLOv8+ | 116 | 116 | 100.00% | 9.5 | 105.7 |
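The FPS column is the reciprocal of the per-image inference time tN in milliseconds. A quick sanity check against the YOLOv8s row (the helper name is illustrative):

```python
def fps_from_latency(latency_ms):
    # FPS = 1000 / tN, with tN given in milliseconds per image
    return 1000.0 / latency_ms

yolov8s_fps = fps_from_latency(16.2)  # YOLOv8s row: 16.2 ms per image
```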
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, Z.; Li, Y.; Han, Q.; Wang, H.; Li, C.; Wu, Z. A Method for Tomato Ripeness Recognition and Detection Based on an Improved YOLOv8 Model. Horticulturae 2025, 11, 15. https://doi.org/10.3390/horticulturae11010015