Enhancing Bee Mite Detection with YOLO: The Role of Data Augmentation and Stratified Sampling
Abstract
1. Introduction
2. Materials and Methods
2.1. Building Datasets for Bee Mite and Beekeeping Objects
2.1.1. Measurement of Beecomb Images with Beekeeping Objects
2.1.2. Extracting Region of Interest Images for Building Datasets
2.1.3. Bee Mite and Beekeeping Object Annotation Rules and Distribution Methods
2.2. Development of Bee Mite and Beekeeping Object Detection Models
2.2.1. YOLO Architecture and Hyperparameters for Bee Mite and Beekeeping Object Detection
2.2.2. Performance Evaluation Methods for Bee Mite and Beekeeping Object Detection Models
3. Results and Discussion
3.1. Results of Building Dataset for Bee Mites and Beekeeping Object Detection
3.2. Evaluation of Model Performance
3.2.1. Performance of Bee Mite and Beekeeping Object Detection Models
3.2.2. Performance Comparison of Bee Mite and Beekeeping Object Detection Models Based on Original and Image-Processed Data
3.2.3. Performance Comparison of Bee Mite and Beekeeping Object Detection Models Based on Random and Stratified Sampling Methods
3.2.4. Determining the Best Models for Bee Mite Detection
3.3. Comparative Analysis with Previous Research
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
1. Abu, E.S. The use of smart apiculture management system. Asian J. Adv. Res. 2020, 5, 6–16.
2. Sammataro, D.; Gerson, U.; Needham, G. Parasitic mites of honey bees: Life history, implications, and impact. Annu. Rev. Entomol. 2000, 45, 519–548.
3. Wilfert, L.; Long, G.; Leggett, H.; Schmid-Hempel, P.; Butlin, R.; Martin, S.; Boots, M. Deformed wing virus is a recent global epidemic in honeybees driven by Varroa mites. Science 2016, 351, 594–597.
4. Chuleui, J. Simulation Study of Varroa Population under the Future Climate Conditions. J. Apic. 2015, 30, 349–358.
5. Boecking, O.; Genersch, E. Varroosis—The ongoing crisis in bee keeping. J. Für Verbraucherschutz Lebensmittelsicherheit 2008, 3, 221–228.
6. Hristov, P.; Shumkova, R.; Palova, N.; Neov, B. Factors associated with honey bee colony losses: A mini-review. Vet. Sci. 2020, 7, 166.
7. Kane, T.R.; Faux, C.M. Honey Bee Medicine for the Veterinary Practitioner; John Wiley & Sons: Hoboken, NJ, USA, 2021; pp. 229–234.
8. Braga, A.R.; Gomes, D.G.; Rogers, R.; Hassler, E.E.; Freitas, B.M.; Cazier, J.A. A method for mining combined data from in-hive sensors, weather and apiary inspections to forecast the health status of honey bee colonies. Comput. Electron. Agric. 2020, 169, 105161.
9. Gregorc, A.; Sampson, B. Diagnosis of Varroa Mite (Varroa destructor) and sustainable control in honey bee (Apis mellifera) colonies—A review. Diversity 2019, 11, 243.
10. Delaplane, K.S.; Berry, J.A.; Skinner, J.A.; Parkman, J.P.; Hood, W.M. Integrated pest management against Varroa destructor reduces colony mite levels and delays treatment threshold. J. Apic. Res. 2005, 44, 157–162.
11. Jack, C.J.; Ellis, J.D. Integrated pest management control of Varroa destructor (Acari: Varroidae), the most damaging pest of (Apis mellifera L. (Hymenoptera: Apidae)) colonies. J. Insect Sci. 2021, 21, 6.
12. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. Available online: https://openaccess.thecvf.com/content_iccv_2015/html/Girshick_Fast_R-CNN_ICCV_2015_paper.html (accessed on 1 March 2023).
13. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788. Available online: https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Redmon_You_Only_Look_CVPR_2016_paper.html (accessed on 1 April 2023).
14. Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7475.
15. Badgujar, C.M.; Poulose, A.; Gan, H. Agricultural object detection with You Only Look Once (YOLO) Algorithm: A bibliometric and systematic literature review. Comput. Electron. Agric. 2024, 223, 109090.
16. Suto, J. Codling moth monitoring with camera-equipped automated traps: A review. Agriculture 2022, 12, 1721.
17. Thakker, M.; Anand, S.; Purandare, S. Assessment of Honey Bee Colony Health Using Computer Vision and Machine Learning; Indraprastha Institute of Information Technology New Delhi: New Delhi, India, 2019; p. 2016159. Available online: http://repository.iiitd.edu.in/xmlui/handle/123456789/901 (accessed on 20 October 2022).
18. Bilik, S.; Kratochvila, L.; Ligocki, A.; Bostik, O.; Zemcik, T.; Hybl, M.; Horak, K.; Zalud, L. Visual diagnosis of the Varroa destructor parasitic mite in honeybees using object detector techniques. Sensors 2021, 21, 2764.
19. Voudiotis, G.; Moraiti, A.; Kontogiannis, S. Deep Learning Beehive Monitoring System for Early Detection of the Varroa Mite. Signals 2022, 3, 506–523.
20. Jiao, L.; Xie, C.; Chen, P.; Du, J.; Li, R.; Zhang, J. Adaptive feature fusion pyramid network for multi-classes agricultural pest detection. Comput. Electron. Agric. 2022, 195, 106827.
21. Kaur, P.; Khehra, B.S.; Mavi, E.B.S. Data augmentation for object detection: A review. In Proceedings of the 2021 IEEE International Midwest Symposium on Circuits and Systems (MWSCAS), Lansing, MI, USA, 9–11 August 2021; pp. 537–543.
22. Rendon, E.; Alejo, R.; Castorena, C.; Isidro-Ortega, F.J.; Granda-Gutierrez, E.E. Data sampling methods to deal with the big data multi-class imbalance problem. Appl. Sci. 2020, 10, 1276.
23. Mahmud, M.S.; Huang, J.Z.; Salloum, S.; Emara, T.Z.; Sadatdiynov, K. A survey of data partitioning and sampling methods to support big data analysis. Big Data Min. Anal. 2020, 3, 85–101.
24. Al-Kateb, M.; Lee, B.S. Stratified reservoir sampling over heterogeneous data streams. In International Conference on Scientific and Statistical Database Management; Springer: Berlin/Heidelberg, Germany, 2010; pp. 621–639.
25. Dong, S.; Wang, R.; Liu, K.; Jiao, L.; Li, R.; Du, J.; Teng, Y.; Wang, F. CRA-Net: A channel recalibration feature pyramid network for detecting small pests. Comput. Electron. Agric. 2021, 191, 106518.
26. Wang, R.; Jiao, L.; Xie, C.; Chen, P.; Du, J.; Li, R. S-RPN: Sampling-balanced region proposal network for small crop pest detection. Comput. Electron. Agric. 2021, 187, 106290.
27. Wang, Z.; Zhang, S.; Chen, L.; Wu, W.; Wang, H.; Liu, X.; Fan, Z.; Wang, B. Microscopic Insect Pest Detection in Tea Plantations: Improved YOLOv8 Model Based on Deep Learning. Agriculture 2024, 14, 1739.
28. Ye, R.; Gao, Q.; Qian, Y.; Sun, J.; Li, T. Improved YOLOv8 and SAHI model for the collaborative detection of small targets at the micro scale: A case study of pest detection in tea. Agronomy 2024, 14, 1034.
29. Kamilaris, A.; Prenafeta-Boldú, F.X. Deep learning in agriculture: A survey. Comput. Electron. Agric. 2018, 147, 70–90.
30. Rebuffi, S.-A.; Gowal, S.; Calian, D.A.; Stimberg, F.; Wiles, O.; Mann, T.A. Data augmentation can improve robustness. Adv. Neural Inf. Process. Syst. 2021, 34, 29935–29948.
31. Bosquet, B.; Cores, D.; Seidenari, L.; Brea, V.M.; Mucientes, M.; Del Bimbo, A. A full data augmentation pipeline for small object detection based on generative adversarial networks. Pattern Recognit. 2023, 133, 108998.
32. Kisantal, M.; Wojna, Z.; Murawski, J.; Naruniec, J.; Cho, K. Augmentation for small object detection. arXiv 2019, arXiv:1902.07296.
33. Mahmoud, H.; Kurniawan, I.F.; Aneiba, A.; Asyhari, A.T. Enhancing detection of remotely-sensed floating objects via Data Augmentation for Maritime SAR. J. Indian Soc. Remote Sens. 2024, 52, 1285–1295.
34. Kumar, T.; Brennan, R.; Mileo, A.; Bendechache, M. Image data augmentation approaches: A comprehensive survey and future directions. IEEE Access 2024, 12, 187536–187571.
35. Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 60.
36. Lee, H.G.; Kim, M.-J.; Kim, S.-B.; Lee, S.; Lee, H.; Sin, J.Y.; Mo, C. Identifying an image-processing method for detection of bee mite in honey bee based on keypoint analysis. Agriculture 2023, 13, 1511.
37. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
38. Chaudhuri, S.; Das, G.; Narasayya, V. Optimized stratified sampling for approximate query processing. ACM Trans. Database Syst. (TODS) 2007, 32, 9.
39. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
40. Liu, M.; Cui, M.; Xu, B.; Liu, Z.; Li, Z.; Chu, Z.; Zhang, X.; Liu, G.; Xu, X.; Yan, Y. Detection of Varroa destructor Infestation of Honeybees Based on Segmentation and Object Detection Convolutional Neural Networks. AgriEngineering 2023, 5, 1644–1662.
Hyperparameter / Component | Value |
---|---|
Image resize | 640 × 640 (same as cropped data) |
Batch size | 34 |
Epochs | 1200 |
Cache | True |
Device | 0, 1 |
Config | YOLOv7 (num_classes = 7) |
CPU | AMD Ryzen Threadripper 3960X 24 Core 3.80 GHz |
GPU | NVIDIA GeForce RTX 3090 Ti GDDR6X 24 GB × 2 |
RAM | 256 GB |
Dataset | Data Configuration | Split Method | Number of Images |
---|---|---|---|
Dataset A | Original data | Random | 1463 |
Dataset B | Image-processed data | Random | 1463 |
Dataset C | Original + Image-processed data | Random | 2926 |
Dataset D | Original + Image-processed data | Stratified | 2926 |
Dataset | Split | Bee | Deformed Wing Bee | Infested Bee | Mite | Larvae | Abnormal Larvae | Cell |
---|---|---|---|---|---|---|---|---|
Dataset A | Train (%) | 72.36 | 75.42 | 69.84 | 69.90 | 67.20 | 71.39 | 69.58 |
Dataset A | Test (%) | 18.16 | 17.32 | 20.47 | 20.18 | 24.44 | 20.36 | 19.78 |
Dataset A | Validation (%) | 9.48 | 7.26 | 9.69 | 9.92 | 8.36 | 8.25 | 10.64 |
Dataset B | Train (%) | 68.92 | 69.83 | 68.81 | 68.89 | 74.28 | 75.52 | 68.47 |
Dataset B | Test (%) | 21.13 | 19.55 | 20.96 | 21.08 | 21.54 | 15.46 | 21.26 |
Dataset B | Validation (%) | 9.95 | 10.61 | 10.24 | 10.03 | 4.18 | 9.02 | 10.27 |
Dataset C | Train (%) | 68.68 | 69.27 | 69.93 | 70.15 | 79.10 | 70.75 | 68.71 |
Dataset C | Test (%) | 21.00 | 19.83 | 19.90 | 19.82 | 15.92 | 19.46 | 20.24 |
Dataset C | Validation (%) | 10.31 | 10.89 | 10.18 | 10.03 | 4.98 | 9.79 | 11.05 |
Dataset D | Train (%) | 70.52 | 70.39 | 70.14 | 70.04 | 72.19 | 74.87 | 71.70 |
Dataset D | Test (%) | 19.90 | 17.60 | 19.90 | 19.96 | 19.61 | 16.37 | 18.61 |
Dataset D | Validation (%) | 9.58 | 12.01 | 9.96 | 10.01 | 8.20 | 8.76 | 9.68 |
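The class distributions above sit near a 70/20/10 train/test/validation split, and Dataset D is described as stratified. The following is a minimal sketch of how such a split could be produced, not the authors' procedure: it assumes each image can be assigned a single stratum label, and the function name `stratified_split`, the toy file names, and the label assignment are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(items, key, ratios=(0.7, 0.2, 0.1), seed=0):
    """Split items into train/test/validation, applying the ratios per stratum.

    `key` maps an item to its stratum label. Assigning one stratum per image
    is an assumption made for this sketch; images that contain several
    object classes would need a rule for choosing that label.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)

    train, test, val = [], [], []
    for label in sorted(strata):
        members = strata[label]
        rng.shuffle(members)
        n_train = round(len(members) * ratios[0])
        n_test = round(len(members) * ratios[1])
        train += members[:n_train]
        test += members[n_train:n_train + n_test]
        val += members[n_train + n_test:]  # remainder goes to validation
    return train, test, val


# Hypothetical toy data: (file name, stratum label) pairs, 200 "mite" and 800 "bee".
images = [(f"img_{i:04d}.jpg", "mite" if i % 5 == 0 else "bee") for i in range(1000)]
train, test, val = stratified_split(images, key=lambda item: item[1])
print(len(train), len(test), len(val))
```

Because the ratios are applied inside each stratum rather than to the pooled images, a rare class cannot end up concentrated in one subset by chance, which is the property a random split does not guarantee.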
Class | Dataset | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) | mAP [0.5] (%) | mAP [0.5:0.95] (%) | Best Epochs |
---|---|---|---|---|---|---|---|---|
All | DA | 92.3 | 88.8 | 92.0 | 90.4 | 91.9 | 65.8 | 474 (DA) |
All | DB | 87.0 | 87.5 | 85.8 | 86.6 | 89.6 | 63.5 | 616 (DB) |
Bees | DA | 95.0 | 92.8 | 93.9 | 93.3 | 96.6 | 77.3 | |
Bees | DB | 95.2 | 93.5 | 93.9 | 93.7 | 96.5 | 76.2 | |
Deformed bees | DA | 92.3 | 76.3 | 92.3 | 83.5 | 86.4 | 76.5 | |
Deformed bees | DB | 94.7 | 98.8 | 94.7 | 96.7 | 94.5 | 77.9 | |
Infested bees | DA | 96.8 | 93.6 | 93.1 | 93.3 | 94.3 | 80.3 | |
Infested bees | DB | 98.7 | 96.3 | 91.1 | 93.6 | 94.7 | 79.4 | |
Mites | DA | 93.2 | 96.4 | 91.7 | 94.0 | 94.1 | 52.7 | |
Mites | DB | 91.1 | 97.5 | 87.8 | 92.4 | 91.1 | 50.2 | |
Larvae | DA | 88.5 | 82.5 | 90.8 | 86.5 | 93.1 | 73.2 | |
Larvae | DB | 63.2 | 55.5 | 84.6 | 67.0 | 80.8 | 63.4 | |
Abnormal larvae | DA | 87.5 | 93.5 | 90.6 | 92.0 | 85.5 | 45.0 | |
Abnormal larvae | DB | 78.1 | 81.5 | 62.9 | 71.0 | 78.6 | 42.0 | |
Cells | DA | 92.7 | 86.5 | 91.8 | 89.1 | 93.3 | 55.7 | |
Cells | DB | 88.3 | 89.8 | 85.8 | 87.8 | 90.8 | 55.0 | |
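The F1 scores in the table above can be reproduced from the precision and recall columns, assuming the usual definition of F1 as the harmonic mean of the two. The sketch below checks three rows; the helper name `f1_score` is illustrative, and the input values are taken from the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) pairs copied from the table above.
rows = {
    "All / DA": (88.8, 92.0),
    "All / DB": (87.5, 85.8),
    "Mites / DA": (96.4, 91.7),
}

for name, (p, r) in rows.items():
    print(f"{name}: F1 = {f1_score(p, r):.1f}")
# All / DA: F1 = 90.4
# All / DB: F1 = 86.6
# Mites / DA: F1 = 94.0
```

The computed values agree with the reported F1 Score column to one decimal place.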
Class | Dataset | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) | mAP [0.5] (%) | mAP [0.5:0.95] (%) | Best Epochs |
---|---|---|---|---|---|---|---|---|
All | DC | 95.2 | 95.1 | 92.4 | 93.7 | 96.4 | 72.0 | 947 (DC) |
All | DD | 95.2 | 93.4 | 92.4 | 92.9 | 95.0 | 70.0 | 580 (DD) |
Bees | DC | 98.6 | 97.4 | 97.5 | 97.4 | 99.1 | 83.2 | |
Bees | DD | 97.9 | 96.2 | 98.2 | 97.2 | 98.7 | 82.0 | |
Deformed bees | DC | 100 | 100 | 95.4 | 97.6 | 97.5 | 83.8 | |
Deformed bees | DD | 97.4 | 99.4 | 90.7 | 94.9 | 94.1 | 79.0 | |
Infested bees | DC | 97.9 | 98.0 | 97.6 | 97.8 | 98.0 | 85.1 | |
Infested bees | DD | 98.2 | 97.0 | 98.2 | 97.6 | 98.1 | 83.1 | |
Mites | DC | 97.2 | 99.4 | 94.3 | 96.8 | 97.3 | 60.2 | |
Mites | DD | 98.0 | 98.2 | 96.6 | 97.4 | 97.4 | 61.7 | |
Larvae | DC | 87.1 | 80.5 | 80.6 | 80.5 | 90.4 | 63.1 | |
Larvae | DD | 86.5 | 81.0 | 83.5 | 82.2 | 89.5 | 65.4 | |
Abnormal larvae | DC | 88.5 | 94.3 | 86.9 | 90.4 | 94.3 | 60.7 | |
Abnormal larvae | DD | 92.5 | 86.6 | 85.2 | 85.9 | 89.7 | 52.4 | |
Cells | DC | 97.2 | 96.0 | 94.1 | 95.0 | 98.3 | 68.0 | |
Cells | DD | 96.1 | 95.7 | 94.0 | 94.8 | 97.3 | 66.7 | |
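The two mAP columns differ in how strictly a predicted box must overlap the ground truth. The sketch below assumes the common COCO-style convention, in which mAP [0.5] uses a single intersection-over-union (IoU) threshold of 0.5 and mAP [0.5:0.95] averages over thresholds from 0.50 to 0.95 in steps of 0.05. The box coordinates and helper names are hypothetical and only illustrate the thresholding step, not a full mAP computation.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Ten IoU thresholds: 0.50, 0.55, ..., 0.95 (assumed COCO-style range).
THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def matched_thresholds(pred, truth):
    """Return the thresholds at which the prediction counts as a match."""
    value = iou(pred, truth)
    return [t for t in THRESHOLDS if value >= t]


# Hypothetical boxes: a 10 x 10 ground truth and a prediction shifted by 2 px.
truth = (0, 0, 10, 10)
pred = (2, 0, 12, 10)
print(round(iou(pred, truth), 3))       # 0.667
print(matched_thresholds(pred, truth))  # [0.5, 0.55, 0.6, 0.65]
```

A prediction like this one counts as correct at the 0.5 threshold but fails the six stricter thresholds, so it contributes fully to mAP [0.5] and only partly to mAP [0.5:0.95].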
Class | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) | mAP [0.5] (%) | mAP [0.5:0.95] (%) |
---|---|---|---|---|---|---|
All | 95.2 (DC, DD) | 95.1 (DC) | 92.4 (DC, DD) | 93.7 (DC) | 96.4 (DC) | 72.0 (DC) |
Bees | 98.6 (DC) | 97.4 (DC) | 98.2 (DD) | 97.4 (DC) | 99.1 (DC) | 83.2 (DC) |
Deformed bees | 100 (DC) | 100 (DC) | 95.4 (DC) | 97.6 (DC) | 97.5 (DC) | 83.8 (DC) |
Infested bees | 98.2 (DD) | 98.0 (DC) | 98.2 (DD) | 97.8 (DC) | 98.1 (DD) | 85.1 (DC) |
Mites | 98.0 (DD) | 99.4 (DC) | 96.6 (DD) | 97.4 (DD) | 97.4 (DD) | 61.7 (DD) |
Larvae | 88.5 (DA) | 82.5 (DA) | 90.8 (DA) | 86.5 (DA) | 93.1 (DA) | 73.2 (DA) |
Abnormal larvae | 92.5 (DD) | 94.3 (DC) | 90.6 (DA) | 92.0 (DA) | 94.3 (DC) | 60.7 (DC) |
Cells | 97.2 (DC) | 96.0 (DC) | 94.1 (DC) | 95.0 (DC) | 98.3 (DC) | 68.0 (DC) |
Model | Object Class | F1 Score | mAP [0.5] | mAP [0.5:0.95] |
---|---|---|---|---|
YOLO-DD (This study) | Bees | 0.972 | 0.987 | 0.820 |
YOLO-DD (This study) | Deformed bees | 0.949 | 0.941 | 0.790 |
YOLO-DD (This study) | Infested bees | 0.976 | 0.981 | 0.831 |
YOLO-DD (This study) | Mites | 0.974 | 0.974 | 0.617 |
YOLO-DD (This study) | Larvae | 0.822 | 0.895 | 0.654 |
YOLO-DD (This study) | Abnormal larvae | 0.859 | 0.897 | 0.524 |
YOLO-DD (This study) | Cells | 0.948 | 0.973 | 0.667 |
Liu’s Model [40] | Bees | 0.944 | 0.956 (Average) | - |
Liu’s Model [40] | Mites | 0.970 | 0.956 (Average) | - |
Bilik’s Model 1 [18] | Bees | 0.556 | 0.547 | 0.281 |
Bilik’s Model 1 [18] | Mites | 0.681 | 0.529 | 0.252 |
Bilik’s Model 2 [18] | Mites | 0.714 | 0.519 | 0.239 |
Voudiotis’s Model [19] | Mites | - | 0.481 | - |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, H.-G.; Shin, J.-Y.; Kim, S.-B.; Kim, M.-J.; Kim, M.S.; Lee, H.; Mo, C. Enhancing Bee Mite Detection with YOLO: The Role of Data Augmentation and Stratified Sampling. Agriculture 2025, 15, 1221. https://doi.org/10.3390/agriculture15111221