An Application of Deep Learning Models for the Detection of Cocoa Pods at Different Ripening Stages: An Approach with Faster R-CNN and Mask R-CNN
Abstract
1. Introduction
1.1. Context
1.2. Challenges in Ripeness Detection
2. Related Work
- Controlled environment datasets: These methods utilize digital image datasets obtained from controlled environments where the fruits have already been harvested [33,34,35,36,37,38,39]. In these cases, the background of the images is free from noise, making classification easier. Algorithms trained using these methods are typically employed in agro-industrial plants for tasks such as fruit classification or packaging.
2.1. Application of Faster R-CNN and Mask R-CNN in Fruit Ripeness Detection
2.1.1. R-CNN Models Advantages
2.1.2. R-CNN Models Limitations
2.2. Comparison of Two-Stage vs. One-Stage Detectors in Agriculture
3. Materials and Methods
3.1. Dataset
3.1.1. Data Augmentation and Class Balancing Strategy
- If an image contained class c2, 1 additional augmented image was generated.
- If it contained class c3, 1 additional augmented image was generated.
- If it contained class c4, 2 additional augmented images were generated.
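The class-conditional rule above can be sketched as follows. This is a simplified illustration: whether the extra-copy counts add up for images containing several minority classes is an assumption, and `augmented_count` is a helper written for this example, not code from the paper.

```python
# Sketch of the class-conditional augmentation rule described above.
# c1 is the majority class, so it receives no extra copies; minority
# classes trigger additional augmented images per the counts in the text.
EXTRA_COPIES = {"c2": 1, "c3": 1, "c4": 2}

def augmented_count(classes_in_image):
    """Return how many augmented copies to generate for one image.

    Assumes the per-class counts are additive when an image contains
    instances of several minority classes.
    """
    return sum(EXTRA_COPIES.get(c, 0) for c in set(classes_in_image))
```

Under this additive assumption, an image containing both a c3 and a c4 pod would yield 1 + 2 = 3 augmented copies.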
3.1.2. Dataset Partitioning After Data Augmentation
3.2. Convolutional Neural Networks Selected for This Research
3.2.1. Faster R-CNN
3.2.2. Mask R-CNN
3.3. Detectron2 Library
Nomenclature in Detectron2
- faster_rcnn: Base architecture (Faster R-CNN).
- BACKBONE: Feature extraction backbone (e.g., R_50, X_101_32x8d).
- NECK: Neck design for multi-scale feature handling (e.g., C4, FPN, DC5).
- TRAINING-SCHEDULE: Training duration/hyperparameters (e.g., 1x, 3x).
Backbones:
- R_50: ResNet-50 (50 layers).
- R_101: ResNet-101 (101 layers, deeper/more accurate).
- X_101_32x8d: ResNeXt-101 with 32 groups and a bottleneck width of 8 (higher capacity than ResNet).

Necks:
- C4: Uses conv4 features for the RPN and ROI heads. Pros: faster, but less precise for small objects.
- FPN: Feature Pyramid Network (multi-scale features). Pros: best for objects of varying sizes (e.g., fruits at different distances).
- DC5: Dilated convolutions in conv5 (higher spatial resolution). Pros: better for small objects, but computationally heavy.

Training schedules:
- 1x: Standard training (~90k iterations).
- 3x: Extended training (~270k iterations). Pros: higher accuracy, but roughly 3× slower to train.
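Putting the naming convention together, a model identifier such as `faster_rcnn_X_101_32x8d_FPN_3x` decomposes mechanically into architecture, backbone, neck, and schedule. The following sketch is a plain string parser written for this example (it is not part of the Detectron2 API):

```python
# Illustrative parser for the Detectron2-style model names used in this paper,
# e.g. "faster_rcnn_X_101_32x8d_FPN_3x".
def parse_model_name(name):
    parts = name.split("_")
    arch = "_".join(parts[:2])        # "faster_rcnn" or "mask_rcnn"
    schedule = parts[-1]              # "1x" or "3x"
    neck = parts[-2]                  # "C4", "FPN", or "DC5"
    backbone = "_".join(parts[2:-2])  # "R_50", "R_101", or "X_101_32x8d"
    return {"arch": arch, "backbone": backbone, "neck": neck, "schedule": schedule}

print(parse_model_name("faster_rcnn_X_101_32x8d_FPN_3x"))
# {'arch': 'faster_rcnn', 'backbone': 'X_101_32x8d', 'neck': 'FPN', 'schedule': '3x'}
```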
3.4. Metrics
3.4.1. Intersection over Union
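IoU is the ratio of the overlap area to the union area of a predicted box and a ground-truth box. A minimal sketch for axis-aligned boxes, assuming an `(x1, y1, x2, y2)` corner-coordinate convention:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # zero if boxes are disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 10x10 boxes offset by 5 in x: intersection 50, union 150, IoU = 1/3.
```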
3.4.2. Average Precision
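Average Precision summarizes the precision–recall curve at a given IoU threshold: AP50 uses IoU ≥ 0.5, AP75 uses IoU ≥ 0.75, and COCO-style mAP averages AP over thresholds from 0.5 to 0.95. A simplified sketch of the area-under-curve computation from confidence-ranked detections (without the precision-interpolation step used by the official COCO evaluator, which the results below rely on):

```python
def average_precision(matches, num_gt):
    """AP from a list of detections sorted by descending confidence.

    matches: booleans, True where the detection matched a ground-truth
             object at the chosen IoU threshold.
    num_gt:  total number of ground-truth objects.
    """
    tp = fp = 0
    points = []  # (recall, precision) swept down the ranked list
    for matched in matches:
        tp += matched
        fp += not matched
        points.append((tp / num_gt, tp / (tp + fp)))
    # Area under the precision-recall curve (step integration, no interpolation)
    ap, prev_recall = 0.0, 0.0
    for recall, precision in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```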
3.5. Model Configuration and Training Parameters for the Baseline
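In Detectron2, a baseline of this kind is configured by loading a model-zoo YAML config and overriding a few fields. The sketch below is illustrative only: the chosen model, the `cocoa_train` dataset name, and the hyperparameter values are placeholders, not the settings reported in this paper.

```python
# Minimal Detectron2 baseline configuration sketch (values are illustrative).
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"))
# Start from COCO-pretrained weights for transfer learning
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4    # c1-c4 ripening stages
cfg.DATASETS.TRAIN = ("cocoa_train",)  # hypothetical registered dataset name
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
```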
3.6. Hyperparameter Optimization
3.7. Confidence Heatmaps for Object Detection
- Identify missed detections—regions with visible objects but low or no heat response;
- Assess the spatial uncertainty of predictions;
- Evaluate potential confusion between classes in ambiguous contexts.
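One simple way to realize such heatmaps is to splat each detection's confidence onto a coarse grid over the image; grid regions that contain visible pods but accumulate little confidence then expose missed detections. The sketch below is a minimal illustration, and the grid resolution and additive accumulation rule are assumptions rather than the paper's exact method:

```python
def confidence_heatmap(detections, img_w, img_h, grid=8):
    """Accumulate detection confidences onto a grid x grid heatmap.

    detections: list of (x1, y1, x2, y2, score) in pixel coordinates.
    Every grid cell overlapped by a box receives that box's score (summed).
    """
    heat = [[0.0] * grid for _ in range(grid)]
    cw, ch = img_w / grid, img_h / grid  # cell size in pixels
    for x1, y1, x2, y2, score in detections:
        for gy in range(grid):
            for gx in range(grid):
                cx1, cy1 = gx * cw, gy * ch          # cell bounds
                cx2, cy2 = cx1 + cw, cy1 + ch
                # Add the score if the box overlaps this cell
                if x1 < cx2 and cx1 < x2 and y1 < cy2 and cy1 < y2:
                    heat[gy][gx] += score
    return heat
```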
4. Results
4.1. Baseline Results
4.1.1. Top Performer Selection: Comparative Analysis of the Four Models with the Highest mAP in Faster R-CNN and Mask R-CNN
4.1.2. Comprehensive Performance Evaluation: Comparative Metrics of mAP, AP50/75, and Specific Categories in Faster R-CNN and Mask R-CNN Models
4.2. Enhanced R-CNN Models with Balancing and Tuning
4.3. Comparative Performance Analysis of Faster R-CNN, Mask R-CNN, and YOLOv8
4.3.1. Training Configuration for Faster R-CNN and Mask R-CNN
4.3.2. Training Configuration for YOLOv8
4.3.3. Comparative Performance of R-CNN and YOLOv8 Models
4.4. Qualitative Analysis: Object Detections in Test Images
4.5. Qualitative Evaluation: Confidence Heatmaps and Detection Performance
- The model performs well when pods are fully visible and well-lit.
- Errors often occur when pods are partially occluded, have unusual colors, or are smaller in scale.
- Confidence heatmaps provide insights into the regions the model focuses on and those it overlooks.

Based on these observations, potential improvements include:
- Augmenting the training data with varied lighting conditions, occlusions, and less common pod appearances;
- Implementing multi-scale attention mechanisms or context-based detection heads to enhance detection performance in complex backgrounds.
5. Discussion
6. Conclusions
6.1. Findings
6.2. Limitations
6.3. Future Work
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Fountain, A.; Hütz-Adams, F. Cocoa Barometer 2022. 2022. Available online: https://www.cocoabarometer.org (accessed on 9 February 2025).
- Cubillos-Bojacá, A.F.; García-Muñoz, M.C.; Calvo-Salamanca, A.H.; Carvajal-Rojas, G.H.; Tarazona-Diaz, M.P. Study of the physical and chemical changes during the maduration of three cocoa clones EET8, CCN51, and ICS60. J. Sci. Food Agric. 2019, 99, 5910–5917. [Google Scholar] [CrossRef] [PubMed]
- Rizzo, M.; Marcuzzo, M.; Zangari, A.; Gasparetto, A.; Albarelli, A. Fruit ripeness classification: A survey. Artif. Intell. Agric. 2023, 7, 44–57. [Google Scholar] [CrossRef]
- Nevavuori, P.; Narra, N.; Lipping, T. Crop yield prediction with deep convolutional neural networks. Comput. Electron. Agric. 2019, 163, 104859. [Google Scholar] [CrossRef]
- Apolo-Apolo, O.E.; Martínez-Guanter, J.; Egea, G.; Raja, P.; Pérez-Ruiz, M. Deep learning techniques for estimation of the yield and size of citrus fruits using a UAV. Eur. J. Agron. 2020, 115, 126030. [Google Scholar] [CrossRef]
- Bargoti, S.; Underwood, J.P. Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards. J. Field Robot. 2017, 34, 1039–1060. [Google Scholar] [CrossRef]
- Häni, N.; Roy, P.; Isler, V. A comparative study of fruit detection and counting methods for yield mapping in apple orchards. J. Field Robot. 2020, 37, 263–282. [Google Scholar] [CrossRef]
- Dorj, U.O.; Lee, M.; Yun, S.S. An yield estimation in citrus orchards via fruit detection and counting using image processing. Comput. Electron. Agric. 2017, 140, 103–112. [Google Scholar] [CrossRef]
- He, L.; Fang, W.; Zhao, G.; Wu, Z.; Fu, L.; Li, R.; Majeed, Y.; Dhupia, J. Fruit yield prediction and estimation in orchards: A state-of-the-art comprehensive review for both direct and indirect methods. Comput. Electron. Agric. 2022, 195, 106812. [Google Scholar] [CrossRef]
- Shawon, S.M.; Ema, F.B.; Mahi, A.K.; Niha, F.L.; Zubair, H.T. Crop yield prediction using machine learning: An extensive and systematic literature review. Smart Agric. Technol. 2025, 10, 100718. Available online: https://linkinghub.elsevier.com/retrieve/pii/S2772375524003228 (accessed on 1 May 2025). [CrossRef]
- Stein, M.; Bargoti, S.; Underwood, J. Image based mango fruit detection, localisation and yield estimation using multiple view geometry. Sensors 2016, 16, 1915. [Google Scholar] [CrossRef] [PubMed]
- Galindo, J.A.M.; Rosal, J.E.C.; Villaverde, J.F. Ripeness Classification of Cacao Using Cepstral-Based Statistical Features and Support Vector Machine. In Proceedings of the 2022 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), Kota Kinabalu, Malaysia, 13–15 September 2022. [Google Scholar] [CrossRef]
- Bueno, G.E.; Valenzuela, K.A.; Arboleda, E.R. Maturity classification of cacao through spectrogram and convolutional neural network. J. Teknol. dan Sist. Komput. 2020, 8, 228–233. [Google Scholar] [CrossRef]
- Gallego, A.M.; Zambrano, R.A.; Zuluaga, M.; Rodríguez, A.V.C.; Cortés, M.S.C.; Vergel, A.P.R.; Valencia, J.W.A. Analysis of fruit ripening in Theobroma cacao pod husk based on untargeted metabolomics. Phytochemistry 2022, 203, 113412. [Google Scholar] [CrossRef]
- Lockman, N.A.; Hashim, N.; Onwude, D.I. Laser-Based imaging for Cocoa Pods Maturity Detection. Food Bioprocess Tech. 2019, 12, 1928–1937. [Google Scholar] [CrossRef]
- Liu, Y.; Zheng, H.; Zhang, Y.; Zhang, Q.; Chen, H.; Xu, X.; Wang, G. ‘Is this blueberry ripe?’: A blueberry ripeness detection algorithm for use on picking robots. Front. Plant Sci. 2023, 14, 1198650. [Google Scholar] [CrossRef]
- Azodanlou, R.; Darbellay, C.; Luisier, J.L.; Villettaz, J.C.; Amadò, R. Changes in flavour and texture during the ripening of strawberries. Eur. Food Res. Technol. 2004, 218, 167–172. [Google Scholar] [CrossRef]
- Hu, C.; Liu, X.; Pan, Z.; Li, P. Automatic detection of single ripe tomato on plant combining faster R-CNN and intuitionistic fuzzy set. IEEE Access 2019, 7, 154683–154696. [Google Scholar] [CrossRef]
- Vrochidou, E.; Bazinas, C.; Manios, M.; Papakostas, G.A.; Pachidis, T.P.; Kaburlasos, V.G. Machine vision for ripeness estimation in viticulture automation. Horticulturae 2021, 7, 282. [Google Scholar] [CrossRef]
- Yang, W.; Ma, X.; An, H. Blueberry Ripeness Detection Model Based on Enhanced Detail Feature and Content-Aware Reassembly. Agronomy 2023, 13, 1613. [Google Scholar] [CrossRef]
- Mubin, N.A.; Nadarajoo, E.; Shafri, H.Z.M.; Hamedianfar, A. Young and mature oil palm tree detection and counting using convolutional neural network deep learning method. Int. J. Remote Sens. 2019, 40, 7500–7515. [Google Scholar] [CrossRef]
- Harel, B.; Parmet, Y.; Edan, Y. Maturity classification of sweet peppers using image datasets acquired in different times. Comput. Ind. 2020, 121, 103274. [Google Scholar] [CrossRef]
- Caladcad, J.A.; Cabahug, S.; Catamco, M.R.; Villaceran, P.E.; Cosgafa, L.; Cabizares, K.N.; Hermosilla, M.; Piedad, E.J. Determining Philippine coconut maturity level using machine learning algorithms based on acoustic signal. Comput. Electron. Agric. 2020, 172, 105327. [Google Scholar] [CrossRef]
- Su, F.; Zhao, Y.; Wang, G.; Liu, P.; Yan, Y. Tomato Maturity Classification Based on SE-YOLOv3-MobileNetV1 Network under Nature Greenhouse Environment. Agronomy 2022, 12, 1638. [Google Scholar] [CrossRef]
- Behera, S.K.; Rath, A.K.; Sethy, P.K. Maturity status classification of papaya fruits based on machine learning and transfer learning approach. Inf. Process. Agric. 2021, 8, 244–250. [Google Scholar] [CrossRef]
- Septiarini, A.; Sunyoto, A.; Hamdani, H.; Kasim, A.A.; Utaminingrum, F.; Hatta, H.R. Machine vision for the maturity classification of oil palm fresh fruit bunches based on color and texture features. Sci. Hortic. 2021, 286, 110245. [Google Scholar] [CrossRef]
- Restrepo-Arias, J.F.; Salinas-Agudelo, M.I.; Hernandez-Pérez, M.I.; Marulanda-Tobón, A.; Giraldo-Carvajal, M.C. RipSetCocoaCNCH12: Labeled Dataset for Ripeness Stage Detection, Semantic and Instance Segmentation of Cocoa Pods. Data 2023, 8, 112. [Google Scholar] [CrossRef]
- Chen, S.; Xiong, J.; Jiao, J.; Xie, Z.; Huo, Z.; Hu, W. Citrus fruits maturity detection in natural environments based on convolutional neural networks and visual saliency map. Precis. Agric. 2022, 23, 1515–1531. [Google Scholar] [CrossRef]
- Bazame, H.C.; Molin, J.P.; Althoff, D.; Martello, M. Detection, classification, and mapping of coffee fruits during harvest with computer vision. Comput. Electron. Agric. 2021, 183, 106066. [Google Scholar] [CrossRef]
- Arendse, E.; Fawole, O.A.; Magwaza, L.S.; Opara, U.L. Non-destructive prediction of internal and external quality attributes of fruit with thick rind: A review. J. Food Eng. 2018, 217, 11–23. [Google Scholar] [CrossRef]
- Wang, H.; Zhang, G.; Cao, H.; Hu, K.; Wang, Q.; Deng, Y.; Gao, J.; Tang, Y. Geometry-Aware 3D Point Cloud Learning for Precise Cutting-Point Detection in Unstructured Field Environments. J. Field Robotics 2025. [Google Scholar] [CrossRef]
- Wu, F.; Zhu, R.; Meng, F.; Qiu, J.; Yang, X.; Li, J.; Zou, X. An Enhanced Cycle Generative Adversarial Network Approach for Nighttime Pineapple Detection of Automated Harvesting Robots. Agronomy 2024, 14, 3002. [Google Scholar] [CrossRef]
- Ali, M.M.; Hashim, N.; Shahamshah, M.I. Durian (Durio zibethinus) ripeness detection using thermal imaging with multivariate analysis. Postharvest Biol. Technol. 2021, 176, 111517. [Google Scholar] [CrossRef]
- Begum, N.; Hazarika, M.K. Maturity detection of tomatoes using transfer learning. Meas. Food 2022, 7, 100038. [Google Scholar] [CrossRef]
- Khojastehnazhand, M.; Mohammadi, V.; Minaei, S. Maturity detection and volume estimation of apricot using image processing technique. Sci. Hortic. 2019, 251, 247–251. [Google Scholar] [CrossRef]
- Sahu, D.; Potdar, R.M. Defect Identification and Maturity Detection of Mango Fruits Using Image Analysis. Artic. Int. J. Artif. Intell. 2017, 1, 5–14. Available online: http://www.sciencepublishinggroup.com/j/ajai (accessed on 2 May 2025). [CrossRef]
- Saranya, N.; Srinivasan, K.; Kumar, S.K.P. Banana ripeness stage identification: A deep learning approach. J. Ambient. Intell. Humaniz. Comput. 2022, 13, 4033–4039. [Google Scholar] [CrossRef]
- Wan, P.; Toudeshki, A.; Tan, H.; Ehsani, R. A methodology for fresh tomato maturity detection using computer vision. Comput. Electron. Agric. 2018, 146, 43–50. [Google Scholar] [CrossRef]
- Xiao, B.; Nguyen, M.; Yan, W.Q. Fruit ripeness identification using YOLOv8 model. Multimed. Tools Appl. 2024, 83, 28039–28056. [Google Scholar] [CrossRef]
- Gai, R.L.; Wei, K.; Wang, P.F. SSMDA: Self-Supervised Cherry Maturity Detection Algorithm Based on Multi-Feature Contrastive Learning. Agriculture 2023, 13, 939. [Google Scholar] [CrossRef]
- Han, W.; Hao, W.; Sun, J.; Xue, Y.; Li, W. Tomatoes Maturity Detection Approach Based on YOLOv5 and Attention Mechanisms. In Proceedings of the 2022 IEEE 4th International Conference on Civil Aviation Safety and Information Technology (ICCASIT), Dali, China, 12–14 October 2022; pp. 1363–1371. [Google Scholar] [CrossRef]
- Tao, Z.; Li, K.; Rao, Y.; Li, W.; Zhu, J. Strawberry Maturity Recognition Based on Improved YOLOv5. Agronomy 2024, 14, 460. [Google Scholar] [CrossRef]
- Jiang, S.; Liu, Z.; Hua, J.; Zhang, Z.; Zhao, S.; Xie, F.; Ao, J.; Wei, Y.; Lu, J.; Li, Z.; et al. A Real-Time Detection and Maturity Classification Method for Loofah. Agronomy 2023, 13, 2144. [Google Scholar] [CrossRef]
- Zhu, X.; Chen, F.; Zhang, X.; Zheng, Y.; Peng, X.; Chen, C. Detection the maturity of multi-cultivar olive fruit in orchard environments based on Olive-EfficientDet. Sci. Hortic. 2024, 324, 112607. [Google Scholar] [CrossRef]
- Wang, C.; Wang, C.; Wang, L.; Wang, J.; Liao, J.; Li, Y.; Lan, Y. A Lightweight Cherry Tomato Maturity Real-Time Detection Algorithm Based on Improved YOLOV5n. Agronomy 2023, 13, 2106. [Google Scholar] [CrossRef]
- Wang, Z.; Ling, Y.; Wang, X.; Meng, D.; Nie, L.; An, G.; Wang, X. An improved Faster R-CNN model for multi-object tomato maturity detection in complex scenarios. Ecol. Inform. 2022, 72, 101886. [Google Scholar] [CrossRef]
- Xu, D.; Zhao, H.; Lawal, O.M.; Lu, X.; Ren, R.; Zhang, S. An Automatic Jujube Fruit Detection and Ripeness Inspection Method in the Natural Environment. Agronomy 2023, 13, 451. [Google Scholar] [CrossRef]
- Zhang, M.; Shen, M.; Pu, Y.; Li, H.; Zhang, B.; Zhang, Z.; Ren, X.; Zhao, J. Rapid Identification of Apple Maturity Based on Multispectral Sensor Combined with Spectral Shape Features. Horticulturae 2022, 8, 361. [Google Scholar] [CrossRef]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. 2015. Available online: http://arxiv.org/abs/1506.02640 (accessed on 5 May 2025).
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. 2020. Available online: http://arxiv.org/abs/2004.10934 (accessed on 5 May 2025).
- Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. 2022. Available online: http://arxiv.org/abs/2207.02696 (accessed on 5 May 2025).
- Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. 2016. Available online: http://arxiv.org/abs/1612.08242 (accessed on 5 May 2025).
- Redmon, J. YOLOv3: An Incremental Improvement. 2018. Available online: http://arxiv.org/abs/1804.02767 (accessed on 5 May 2025).
- Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W.; et al. YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. 2022. Available online: http://arxiv.org/abs/2209.02976 (accessed on 5 May 2025).
- Tang, Y.; Zhou, H.; Wang, H.; Zhang, Y. Fruit detection and positioning technology for a Camellia oleifera C. Abel orchard based on improved YOLOv4-tiny model and binocular stereo vision. Expert Syst. Appl. 2023, 211, 118573. [Google Scholar] [CrossRef]
- Li, P.; Zheng, J.; Li, P.; Long, H.; Li, M.; Gao, L. Tomato Maturity Detection and Counting Model Based on MHSA-YOLOv8. Sensors 2023, 23, 6701. [Google Scholar] [CrossRef] [PubMed]
- Sapkota, R.; Ahmed, D.; Karkee, M. Comparing YOLOv8 and Mask R-CNN for instance segmentation in complex orchard environments. Artif. Intell. Agric. 2024, 13, 84–99. [Google Scholar] [CrossRef]
- Liu, T.H.; Nie, X.-N.; Wu, J.-M.; Zhang, D.; Liu, W.; Cheng, Y.-F.; Zheng, Y.; Qiu, J.; Qi, L. Pineapple (Ananas comosus) fruit detection and localization in natural environment based on binocular stereo vision and improved YOLOv3 model. Precis. Agric. 2023, 24, 139–160. [Google Scholar] [CrossRef]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. Available online: http://arxiv.org/abs/1506.01497 (accessed on 1 March 2025). [CrossRef] [PubMed]
- He, K.; Gkioxari, G.; Dollár, P. Mask R-CNN. arXiv 2018, arXiv:1703.06870. [Google Scholar]
- Huang, F.; Li, Y.; Liu, Z.; Gong, L.; Liu, C. A Method for Calculating the Leaf Area of Pak Choi Based on an Improved Mask R-CNN. Agriculture 2024, 14, 101. [Google Scholar] [CrossRef]
- Gong, X.; Zhang, S. A High-Precision Detection Method of Apple Leaf Diseases Using Improved Faster R-CNN. Agriculture 2023, 13, 240. [Google Scholar] [CrossRef]
- Li, Y.; Wang, Y.; Xu, D.; Zhang, J.; Wen, J. An Improved Mask RCNN Model for Segmentation of ‘Kyoho’ (Vitis labruscana) Grape Bunch and Detection of Its Maturity Level. Agriculture 2023, 13, 914. [Google Scholar] [CrossRef]
- Jia, W.; Wei, J.; Zhang, Q.; Pan, N.; Niu, Y.; Yin, X.; Ding, Y.; Ge, X. Accurate segmentation of green fruit based on optimized mask RCNN application in complex orchard. Front. Plant Sci. 2022, 13, 955256. [Google Scholar] [CrossRef] [PubMed]
- Huang, Y.P.; Wang, T.H.; Basanta, H. Using Fuzzy Mask R-CNN Model to Automatically Identify Tomato Ripeness. IEEE Access 2020, 8, 207672–207682. [Google Scholar] [CrossRef]
- Tu, S.; Xue, Y.; Zheng, C.; Qi, Y.; Wan, H.; Mao, L. Detection of passion fruits and maturity classification using Red-Green-Blue Depth images. Biosyst. Eng. 2018, 175, 156–167. [Google Scholar] [CrossRef]
- Tu, S.; Pang, J.; Liu, H.; Zhuang, N.; Chen, Y.; Zheng, C.; Wan, H.; Xue, Y. Passion fruit detection and counting based on multiple scale faster R-CNN using RGB-D images. Precis. Agric. 2020, 21, 1072–1091. [Google Scholar] [CrossRef]
- Zhao, Z.; Hicks, Y.; Sun, X.; Luo, C. Peach ripeness classification based on a new one-stage instance segmentation model. Comput. Electron. Agric. 2023, 214, 108369. [Google Scholar] [CrossRef]
- Li, J.; Zhu, Z.; Liu, H.; Su, Y.; Deng, L. Strawberry R-CNN: Recognition and counting model of strawberry based on improved faster R-CNN. Ecol. Inform. 2023, 77, 102210. [Google Scholar] [CrossRef]
- Deka, B.; Chakraborty, D. UAV Sensing-Based Litchi Segmentation Using Modified Mask-RCNN for Precision Agriculture. IEEE Trans. AgriFood Electron. 2024, 2, 509–517. [Google Scholar] [CrossRef]
- Guo, Z.; Shi, Y.; Ahmad, I. Design of smart citrus picking model based on Mask RCNN and adaptive threshold segmentation. PeerJ Comput. Sci. 2024, 10, e1865. [Google Scholar] [CrossRef] [PubMed]
- Siricharoen, P.; Yomsatieankul, W.; Bunsri, T. Fruit maturity grading framework for small dataset using single image multi-object sampling and Mask R-CNN. Smart Agric. Technol. 2023, 3, 100130. [Google Scholar] [CrossRef]
- Jiao, L.; Dong, S.; Zhang, S.; Xie, C.; Wang, H. AF-RCNN: An anchor-free convolutional neural network for multi-categories agricultural pest detection. Comput. Electron. Agric. 2020, 174, 105522. [Google Scholar] [CrossRef]
- Sharma, A.; Kumar, V.; Longchamps, L. Comparative performance of YOLOv8, YOLOv9, YOLOv10, YOLOv11 and Faster R-CNN models for detection of multiple weed species. Smart Agric. Technol. 2024, 9, 100648. [Google Scholar] [CrossRef]
- Maru, R.; Kamat, A.; Shah, M.; Kute, P.; Shrawne, S.C.; Sambhe, V. Improved Faster RCNN for Ripeness and Size Estimation of Mangoes with multi-label output. In Proceedings of the 2024 International Conference on Computational Intelligence and Network Systems (CINS), Dubai, United Arab Emirates, 28–29 November 2024. [Google Scholar] [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. 2014. Available online: http://arxiv.org/abs/1311.2524 (accessed on 5 March 2025).
- Ayikpa, K.J.; Mamadou, D.; Gouton, P.; Adou, K.J. Classification of Cocoa Pod Maturity Using Similarity Tools on an Image Database: Comparison of Feature Extractors and Color Spaces. Data 2023, 8, 99. [Google Scholar] [CrossRef]
- Veites-Campos, S.A.; Ramírez-Betancour, R.; González-Pérez, M. Identification of Cocoa Pods with Image Processing and Artificial Neural Networks. Int. J. Adv. Eng. Manag. Sci. 2018, 4, 510–518. [Google Scholar] [CrossRef]
- Heredia-Gómez, J.F.; Rueda-Gómez, J.P.; Talero-Sarmiento, L.H.; Ramírez-Acuña, J.S.; Coronado-Silva, R.A. Cocoa pods ripeness estimation, using convolutional neural networks in an embedded system. Rev. Colomb. de Comput. 2020, 21, 42–55. [Google Scholar] [CrossRef]
- Baculio, N.G.; Barbosa, J.B. An Objective Classification Approach of Cacao Pods using Local Binary Pattern Features and Artificial Neural Network Architecture (ANN). Indian J. Sci. Technol. 2022, 15, 495–504. [Google Scholar] [CrossRef]
- Kim, E.C.; Hong, S.-J.; Kim, S.-Y.; Lee, C.-H.; Kim, S.; Kim, H.-J.; Kim, G. CNN-based object detection and growth estimation of plum fruit (Prunus mume) using RGB and depth imaging techniques. Sci. Rep. 2022, 12, 20796. [Google Scholar] [CrossRef]
- Wu, T.; Miao, Z.; Huang, W.; Han, W.; Guo, Z.; Li, T. SGW-YOLOv8n: An Improved YOLOv8n-Based Model for Apple Detection and Segmentation in Complex Orchard Environments. Agriculture 2024, 14, 1958. [Google Scholar] [CrossRef]
- Xiao, F.; Wang, H.; Xu, Y.; Zhang, R. Fruit Detection and Recognition Based on Deep Learning for Automatic Harvesting: An Overview and Review. Agronomy 2023, 13, 1625. [Google Scholar] [CrossRef]
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar] [CrossRef]
- Zhao, X.; Li, W.; Zhang, Y.; Gulliver, T.A.; Chang, S.; Feng, Z. A Faster RCNN-based Pedestrian Detection System. In Proceedings of the 2016 IEEE 84th Vehicular Technology Conference (VTC-Fall), Montreal, QC, Canada, 18–21 September 2016. [Google Scholar]
- Cheng, B.; Wei, Y.; Shi, H.; Feris, R.; Xiong, J.; Huang, T. Revisiting RCNN: On Awakening the Classification Power of Faster RCNN. In Proceedings of the Computer Vision—ECCV 2018, Munich, Germany, 8–14 September 2018. [Google Scholar]
- Roh, M.-C.; Lee, J. Refining Faster-RCNN for Accurate Object Detection. In Proceedings of the 2017 Fifteenth IAPR International Conference on Machine Vision Applications (MVA), Nagoya, Japan, 8–12 May 2017. [Google Scholar]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397. [Google Scholar] [CrossRef] [PubMed]
- Ahmed, B.; Gulliver, T.A.; Alzahir, S. Image splicing detection using mask-RCNN. Signal Image Video Process 2020, 14, 1035–1042. [Google Scholar] [CrossRef]
- Nowozin, S. Optimal decisions from probabilistic models: The intersection-over-union case. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 24–27 June 2014; pp. 548–555. [Google Scholar] [CrossRef]
- Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014. [Google Scholar] [CrossRef]
- Wu, J.; Chen, X.Y.; Zhang, H.; Xiong, L.D.; Lei, H.; Deng, S.H. Hyperparameter optimization for machine learning models based on Bayesian optimization. J. Electron. Sci. Technol. 2019, 17, 26–40. [Google Scholar] [CrossRef]
- Alibrahim, H.; Ludwig, S.A. Hyperparameter Optimization: Comparing Genetic Algorithm against Grid Search and Bayesian Optimization. In Proceedings of the 2021 IEEE Congress on Evolutionary Computation (CEC), Kraków, Poland, 28 June–1 July 2021; pp. 1551–1559. [Google Scholar] [CrossRef]
- Zhou, B.; Khosla, A.; Lapedriza, A.; Oliva, A.; Torralba, A. Learning Deep Features for Discriminative Localization. 2016. Available online: http://cnnlocalization.csail.mit.edu (accessed on 10 June 2025).
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. 2017. Available online: http://gradcam.cloudcv.org (accessed on 10 June 2025).
Class | Class Name | Instances |
---|---|---|
c1 | Stage 1 (0–2 months) | 3427 |
c2 | Stage 2 (2–4 months) | 1679 |
c3 | Stage 3 (4–6 months) | 1545 |
c4 | For harvest (>6 months) | 1106 |
Total | | 7757 |
Model | mAP | AP 50 | AP 75 | c1 | c2 | c3 | c4 |
---|---|---|---|---|---|---|---|
faster_rcnn_X_101_32x8d_FPN_3x | 64.145 | 72.284 | 71.362 | 69.005 | 67.080 | 74.878 | 92.417 |
mask_rcnn_X_101_32x8d_FPN_3x | 60.812 | 65.863 | 65.230 | 72.169 | 57.696 | 72.983 | 87.743 |
faster_rcnn_R_50_C4_3x | 60.360 | 70.196 | 69.006 | 63.957 | 56.812 | 74.440 | 88.846 |
mask_rcnn_R_50_DC5_3x | 59.875 | 66.451 | 65.486 | 63.335 | 64.040 | 71.388 | 89.419 |
Average Precision (AP) % (Test Subset)

Model | mAP | AP 50 | AP 75 | c1 | c2 | c3 | c4 |
---|---|---|---|---|---|---|---|
faster_rcnn_X_101_32x8d_FPN_3x | 64.145 | 72.284 | 71.362 | 69.005 | 67.080 | 74.878 | 92.417 |
faster_rcnn_R_50_C4_3x | 60.360 | 70.196 | 69.006 | 63.957 | 56.812 | 74.440 | 88.846 |
faster_rcnn_R_50_C4_1x | 59.820 | 70.232 | 68.882 | 65.383 | 60.495 | 70.280 | 88.894 |
faster_rcnn_R_50_FPN_1x | 59.518 | 69.799 | 68.832 | 65.459 | 57.188 | 76.177 | 83.568 |
faster_rcnn_R_101_C4_3x | 58.991 | 67.047 | 65.274 | 61.463 | 57.751 | 77.022 | 92.590 |
faster_rcnn_R_50_DC5_3x | 57.761 | 66.670 | 65.344 | 59.756 | 60.859 | 71.373 | 85.898 |
faster_rcnn_R_50_FPN_3x | 55.913 | 67.250 | 65.595 | 57.732 | 57.693 | 68.173 | 83.910 |
faster_rcnn_R_50_DC5_1x | 55.173 | 64.371 | 62.453 | 57.398 | 45.782 | 75.916 | 88.193 |
faster_rcnn_R_101_DC5_3x | 54.154 | 63.555 | 61.775 | 64.457 | 48.593 | 60.569 | 84.686 |
faster_rcnn_R_101_FPN_3x | 54.123 | 62.987 | 63.721 | 64.543 | 58.753 | 72.094 | 83.642 |
Average Precision (AP) % (Test Subset)

Model | mAP | AP 50 | AP 75 | c1 | c2 | c3 | c4 |
---|---|---|---|---|---|---|---|
mask_rcnn_X_101_32x8d_FPN_3x | 60.812 | 65.863 | 65.230 | 72.169 | 57.696 | 72.983 | 87.743 |
mask_rcnn_R_50_DC5_3x | 59.875 | 66.451 | 65.486 | 63.335 | 64.040 | 71.388 | 89.419 |
mask_rcnn_R_101_FPN_3x | 59.371 | 65.090 | 63.914 | 68.499 | 60.116 | 76.764 | 85.199 |
mask_rcnn_R_50_C4_1x | 59.293 | 65.218 | 62.480 | 54.949 | 58.127 | 69.118 | 85.415 |
mask_rcnn_R_50_C4_3x | 59.063 | 64.446 | 63.189 | 68.729 | 59.160 | 75.081 | 89.502 |
mask_rcnn_R_50_DC5_1x | 58.988 | 64.564 | 63.266 | 62.038 | 64.364 | 75.639 | 88.231 |
mask_rcnn_R_50_FPN_3x | 57.785 | 62.873 | 62.033 | 63.072 | 63.109 | 72.924 | 87.946 |
mask_rcnn_R_50_FPN_1x | 57.583 | 62.595 | 61.853 | 64.979 | 62.024 | 72.031 | 88.038 |
Average Precision (AP) % (Test Subset)

Model | mAP | AP 50 | AP 75 | c1 | c2 | c3 | c4 |
---|---|---|---|---|---|---|---|
YOLO V8x (bbox) | 86.360 | 91.800 | 90.676 | 69.601 | 86.400 | 91.520 | 94.900 |
YOLO V8l seg | 83.855 | 88.220 | 87.898 | 55.300 | 87.000 | 89.000 | 95.600 |
mask_rcnn_X_101_32x8d_FPN_3x | 73.205 | 80.347 | 76.895 | 47.310 | 70.551 | 79.068 | 92.634 |
faster_rcnn_X_101_32x8d_FPN_3x | 67.747 | 82.227 | 75.973 | 45.656 | 68.446 | 71.440 | 84.192 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Restrepo-Arias, J.F.; Montoya-Castaño, M.J.; Moreno-De La Espriella, M.F.; Branch-Bedoya, J.W. An Application of Deep Learning Models for the Detection of Cocoa Pods at Different Ripening Stages: An Approach with Faster R-CNN and Mask R-CNN. Computation 2025, 13, 159. https://doi.org/10.3390/computation13070159