Improved Chinese Giant Salamander Parental Care Behavior Detection Based on YOLOv8
Simple Summary
Abstract
1. Introduction
- For the first time, a deep learning model was used to automatically identify the parental care behavior of A. davidianus. This optimizes how A. davidianus behavior is observed, promotes the application of information technology in amphibian behavioral ecology research, and provides a reference for studies of other amphibians and aquatic animals;
- We constructed the first dataset dedicated to amphibian behavior: the A. davidianus parental care behavior dataset. It covers six fundamental behaviors: tail fanning, agitating, shaking, egg eating, entering caves, and exiting caves;
- Inspired by the concepts of Res2Net [31], this study proposes a multi-scale feature fusion convolution (MSConv), which is integrated with the C2f module to form C2f-MSConv. Experimental results demonstrate that this module significantly enhances the model’s feature extraction capability and reduces computational costs;
- The integration of the large separable kernel attention (LSKA) [32] mechanism in the SPPF layer minimizes background interference in A. davidianus’ parental care behavior detection. Additionally, the WIoU [33] loss function addresses issues of error and missed detections associated with low-quality samples.
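To make the WIoU idea above concrete, the following pure-Python sketch computes a plain IoU loss scaled by the WIoU v1 distance-based attention term. It is a simplified illustration of the published formulation, not the authors' training code (which operates on batched tensors inside YOLOv8); the full WIoU v3 additionally applies a dynamic non-monotonic focusing coefficient based on each sample's outlier degree, which is what suppresses low-quality samples and is omitted here for brevity.

```python
import math

def iou(box_a, box_b):
    # Boxes as (x1, y1, x2, y2); returns intersection-over-union.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def wiou_v1_loss(pred, target):
    # WIoU v1: scale the IoU loss by exp(d^2 / c^2), where d is the
    # center distance and c the diagonal of the smallest enclosing box.
    # In the paper's tensor form, c^2 is detached from the gradient.
    px, py = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tx, ty = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    d2 = (px - tx) ** 2 + (py - ty) ** 2
    cw = max(pred[2], target[2]) - min(pred[0], target[0])   # enclosing width
    ch = max(pred[3], target[3]) - min(pred[1], target[1])   # enclosing height
    c2 = cw ** 2 + ch ** 2
    r_wiou = math.exp(d2 / c2) if c2 > 0 else 1.0
    return r_wiou * (1.0 - iou(pred, target))
```

A perfectly aligned prediction yields zero loss, while a displaced box is penalized more than by plain IoU loss alone, since the exponential factor exceeds 1 whenever the centers differ.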
2. Materials and Methods
2.1. Materials
2.1.1. Data Collection
2.1.2. Dataset Creation
2.2. Standard YOLOv8 Model
2.3. Improved YOLOv8 Model
2.3.1. Multi-Scale Convolution C2f-MSConv Module
2.3.2. Optimization of Feature Fusion Networks
2.3.3. Improved Regression Loss Function
3. Results
3.1. Experimental Environment and Parameter Adjustment
3.2. Assessment of Indicators
3.3. Comparison of Ablation Experiments
- Through the comparative analysis between the first and second sets of experiments, we found that the proposed MSConv module demonstrated significant advantages on this dataset. It not only effectively reduced the model’s parameter count and computational load (measured in GFLOPs) but also successfully improved the model’s mAP;
- Further, in the comparison between the second and third sets of experiments, we introduced the LSKA mechanism into the SPPF layer. Although this slightly increased the model’s parameter count and computational load, it significantly enhanced the model’s ability to extract behavioral features against complex backgrounds, raising mAP by 0.5%;
- Lastly, by comparing the third and fourth sets of experiments, we replaced the original model’s CIoU loss function with the WIoU loss function. The dynamic gradient allocation strategy of WIoU suppresses the learning of low-quality samples and improves mAP by 0.3%.
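The per-module gains described above compose additively into the final score; a quick tally (values copied from the ablation table) confirms the arithmetic:

```python
# mAP@50-95 gains (percentage points) attributed to each module in the ablation
baseline = 83.6                                    # YOLOv8s without modifications
gains = {"C2f-MSConv": 1.3, "SPPF-LSKA": 0.5, "WIoU": 0.3}
final_map = baseline + sum(gains.values())         # 85.7, matching the full model
```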
3.4. Comparison Experiments
4. Discussion
4.1. Research Value
4.2. Limitation and Outlook
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Component | Description |
---|---|
Conv | Convolutional layer that performs feature extraction using filters. |
Conv2d | Two-dimensional convolutional layer crucial for image processing. |
BatchNorm2d | Batch normalization for 2D layers, aiding in training stability and acceleration. |
Bottleneck | Design pattern that temporarily reduces feature map dimensions to lower computational costs before restoring them. |
C2f | Cross-stage partial bottleneck module with two convolutions that splits, transforms, and fuses feature maps. |
Concat | Concatenation layer that merges feature maps to integrate multi-scale information. |
SiLU | Sigmoid linear unit, an activation function that enhances deeper network training with self-gating. |
Detect | The detection layer that outputs object localization and classification. |
Upsample | Upsampling layer that increases feature map resolution for precise object localization. |
SPPF | Module that captures multi-scale context through an efficient pyramid pooling technique. |
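To make the “self-gating” description of SiLU in the table above concrete, here is a minimal, dependency-free sketch of the activation’s definition (x multiplied by its own sigmoid); it mirrors the mathematical form only, not Ultralytics’ tensor implementation:

```python
import math

def silu(x: float) -> float:
    """SiLU (swish): x * sigmoid(x). The input gates itself through the
    sigmoid, which is why the table describes it as self-gating."""
    return x * (1.0 / (1.0 + math.exp(-x)))
```

For large positive x, silu(x) approaches x; for large negative x it decays smoothly to 0, giving a smooth, non-monotonic alternative to ReLU.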
References
- He, D.; Zhu, W.; Zeng, W.; Lin, J.; Ji, Y.; Wang, Y.; Zhang, C.; Lu, Y.; Zhao, D.; Su, N.; et al. Nutritional and medicinal characteristics of Chinese giant salamander (Andrias davidianus) for applications in healthcare industry by artificial cultivation: A review. Food Sci. Hum. Well. 2018, 7, 1–10. [Google Scholar] [CrossRef]
- National Forestry and Grassland Administration of China. Official Release of the Updated List of Wild Animals under Special State Protection in China. Available online: http://www.forestry.gov.cn/main/586/20210208/095403793167571.html (accessed on 8 February 2021).
- Wang, M.; Luo, Q.; Wang, H.; Wang, C.; Chen, G.; Xian, J. Analysis of Nutrients Components in the Muscle of Zhangjiajie Giant Salamander. Acta Nutr. Sin. 2015, 37, 411–413. [Google Scholar]
- Liu, J.; Zha, X.; Luo, C.; Chen, P.; Li, W.; Tong, C. Advance of Structure-Activity Relationship of Active Substances in Andrias davidianus. Farm. Prod. Process. 2023, 19, 73–77. [Google Scholar]
- Yang, A.S.; Liu, G.J. Preliminary Study on Artificial Reproduction of Chinese Giant Salamander. Freshw. Fish. 1979, 2, 1–5. [Google Scholar]
- Luo, Q.; Tong, F.; Song, Y.; Wang, H.; Du, M.; Ji, H. Observation of the breeding behavior of the Chinese giant salamander (Andrias davidianus) using a digital monitoring system. Animals 2018, 8, 161. [Google Scholar] [CrossRef]
- Luo, S.; Wang, P.; Zhang, Y.; Wang, Z.; Tian, H.; Luo, Q. Ethogram of the Chinese Giant Salamander during the Breeding Period Based on the PAE Coding System. Animals 2023, 13, 3632. [Google Scholar] [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Proceedings of the Annual Conference on Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; Volume 28. [Google Scholar]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
- Jocher, G.; Stoken, A.; Borovec, J.; NanoCode012, C.; Changyu, L.; Laughing, H. Ultralytics/yolov5: v3.0. 2020. Available online: https://github.com/ultralytics/yolov5 (accessed on 20 December 2020).
- Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar]
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv 2022, arXiv:2207.02696. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision—ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Volume 9905, pp. 21–37. [Google Scholar]
- Wang, G. Machine learning for inferring animal behavior from location and movement data. Ecol. Inform. 2019, 49, 69–76. [Google Scholar] [CrossRef]
- Ditria, E.M.; Lopez-Marcano, S.; Sievers, M.; Jinks, E.L.; Brown, C.J.; Connolly, R.M. Automating the Analysis of Fish Abundance Using Object Detection: Optimizing Animal Ecology with Deep Learning. Front. Mar. Sci. 2020, 7, 429. [Google Scholar] [CrossRef]
- Hou, J.; He, Y.; Yang, H.; Connor, T.; Gao, J.; Wang, Y.; Zeng, Y.; Zhang, J.; Huang, J.; Zheng, B. Identification of animal individuals using deep learning: A case study of giant panda. Biol. Conserv. 2020, 242, 108414. [Google Scholar] [CrossRef]
- Xu, W.; Zhu, Z.; Ge, F.; Han, Z.; Li, J. Analysis of behavior trajectory based on deep learning in ammonia environment for fish. Sensors 2020, 20, 4425. [Google Scholar] [CrossRef]
- Xue, Y.; Zhu, X.; Zheng, C.; Mao, L.; Yang, A.; Tu, S.; Huang, N.; Yang, X.; Chen, P.; Zhang, N. Lactating sow postures recognition from depth image of videos based on improved Faster R-CNN. Trans. CSAE 2018, 34, 189–196. [Google Scholar]
- Li, X.; Wang, W.; Hu, X.; Yang, J. Selective kernel networks. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 510–519. [Google Scholar]
- Wen, L.; Sun, M.; Wu, M. Ocean target recognition model based on attention mechanism and Fast R-CNN deep learning. J. Dalian Ocean. Univ. 2021, 36, 859–865. [Google Scholar]
- Kang, J.; Tian, Y.; Yang, G. Research on Crowd Abnormal Behavior Detection Based on Improved SSD. Infrared Technol. 2022, 44, 1316. [Google Scholar]
- Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
- Hu, J.; Zhao, D.; Zhang, Y.; Zhou, C.; Chen, W. Real-time nondestructive fish behavior detecting in mixed polyculture system using deep-learning and low-cost devices. Expert Syst. Appl. 2021, 178, 115051. [Google Scholar] [CrossRef]
- Wang, H.; Zhang, S.; Zhao, S.; Wang, Q.; Li, D.; Zhao, R. Real-time detection and tracking of fish abnormal behavior based on improved YOLOV5 and SiamRPN++. Comput. Electron. Agric. 2022, 192, 106512. [Google Scholar] [CrossRef]
- Tu, W.; Yu, H.; Zhang, P.; Wei, S.; Zhang, X.; Yang, Z.; Wu, J.; Lin, Y.; Hu, Z. Farmed fish detection by improved YOLOv8 based on channel non-degradation with spatially coordinated attention. J. Dalian Ocean. Univ. 2023, 38, 717. [Google Scholar]
- Gao, S.H.; Cheng, M.M.; Zhao, K.; Zhang, X.Y.; Yang, M.H.; Torr, P. Res2net: A new multi-scale backbone architecture. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 652–662. [Google Scholar] [CrossRef]
- Lau, K.W.; Po, L.M.; Rehman, Y.A.U. Large separable kernel attention: Rethinking the large kernel attention design in CNN. Expert Syst. Appl. 2024, 236, 121352. [Google Scholar] [CrossRef]
- Tong, Z.; Chen, Y.; Xu, Z.; Yu, R. Wise-IoU: Bounding box regression loss with dynamic focusing mechanism. arXiv 2023, arXiv:2301.10051. [Google Scholar]
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path Aggregation Network for Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768. [Google Scholar]
- Guo, M.H.; Lu, C.Z.; Liu, Z.N.; Cheng, M.M.; Hu, S.M. Visual attention network. Comput. Vis. Media 2023, 9, 733–752. [Google Scholar] [CrossRef]
- Schütz, A.; Schöler, V.; Krause, E.; Fischer, M.; Müller, T.; Freuling, C.; Conraths, F.; Stanke, M.; Homeier-Bachmann, T.; Lentz, H. Application of YOLOv4 for Detection and Motion Monitoring of Red Foxes. Animals 2021, 11, 1723. [Google Scholar] [CrossRef]
- Guo, Q.; Sun, Y.; Orsini, C.; Bolhuis, J.E.; Vlieg, J.d.; Bijma, P.; With, P.H.N.d. Enhanced camera-based individual pig detection and tracking for smart pig farms. Comput. Electron. Agric. 2023, 211, 14. [Google Scholar] [CrossRef]
- Zhu, L.; Weng, W. Catadioptric stereo-vision system for the real-time monitoring of 3D behavior in aquatic animals. Physiol. Behav. 2007, 91, 106–119. [Google Scholar] [CrossRef]
- Gosztolai, A.; Günel, S.; Lobato-Ríos, V.; Pietro Abrate, M.; Morales, D.; Rhodin, H.; Fua, P.; Ramdya, P. LiftPose3D, a deep learning-based approach for transforming two-dimensional to three-dimensional poses in laboratory animals. Nat. Methods 2021, 18, 975–981. [Google Scholar] [CrossRef]
- Wang, Y.; Li, R.; Wang, Z.; Hua, Z.; Jiao, Y.; Duan, Y.; Song, H. E3D: An efficient 3D CNN for the recognition of dairy cow’s basic motion behavior. Comput. Electron. Agric. 2023, 205, 107607. [Google Scholar] [CrossRef]
Behavior Types | Judging Standard | Label | Sample Size |
---|---|---|---|
Tail fanning | The tail of A. davidianus swings from side to side beside or in the egg pile. | shanwei | 600 |
Agitating | The head of A. davidianus drills into the egg pile, or its body passes through the egg pile. | jiaodong | 700 |
Shaking | The head or body of A. davidianus straddles above or near the egg pile and swings from side to side or up and down. | zhendong | 500 |
Egg eating | A. davidianus holds an egg in its open mouth, often accompanied by head shaking. | chsihi | 700 |
Entering caves | Only the head of A. davidianus appears near the cave mouth. | jindong | 250 |
Exiting caves | Only the tail of A. davidianus appears near the cave mouth. | chudong | 250 |
Category | Configuration |
---|---|
CPU | 16 vCPU Intel(R) Xeon(R) Gold 6430 |
GPU | NVIDIA RTX A5000 (24 GB) |
System environment | Ubuntu 20.04 |
Framework | PyTorch 1.11.0 |
Programming language | Python 3.8 |
Baseline | C2f-MSConv | SPPF-LSKA | WIoU | mAP@50-95 | GFLOPs | Parameters/10⁶ | FPS |
---|---|---|---|---|---|---|---|
YOLOv8s |  |  |  | 83.6% | 28.8 | 11.16 | 130.0 |
YOLOv8s | √ |  |  | 84.9% | 27.8 | 10.61 | 107.3 |
YOLOv8s | √ | √ |  | 85.4% | 28.7 | 11.68 | 106.4 |
YOLOv8s | √ | √ | √ | 85.7% | 28.7 | 11.68 | 106.4 |
Model | mAP@50-95 | GFLOPs | Parameters/10⁶ | FPS |
---|---|---|---|---|
Faster R-CNN | 73.5% | 251.4 | 41.37 | 57.4 |
SSD | 62.2% | 62.7 | 24.26 | 48.0 |
YOLOv5x | 84.6% | 204.7 | 86.25 | 48.6 |
YOLOX | 72.5% | 26.8 | 9.00 | 83.3 |
YOLOv7 | 81.3% | 103.5 | 37.62 | 56.4 |
YOLOv8s | 83.6% | 28.8 | 11.16 | 130.0 |
Ours | 85.7% | 28.7 | 11.68 | 106.4 |
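One way to read the comparison table is accuracy per unit of compute. The short sketch below (values copied from the table, model names illustrative) ranks the models by mAP@50-95 per GFLOP; on these numbers, the improved model comes out ahead of both lighter and heavier baselines:

```python
# (mAP@50-95 in percent, GFLOPs) per model, taken from the comparison table
models = {
    "Faster R-CNN": (73.5, 251.4),
    "SSD": (62.2, 62.7),
    "YOLOv5x": (84.6, 204.7),
    "YOLOX": (72.5, 26.8),
    "YOLOv7": (81.3, 103.5),
    "YOLOv8s": (83.6, 28.8),
    "Ours": (85.7, 28.7),
}

# Rank by accuracy per GFLOP: a rough efficiency proxy, not a benchmark metric.
best = max(models, key=lambda m: models[m][0] / models[m][1])
```

Note that FPS depends on hardware and batch size, so the GFLOPs-based ratio is only a device-independent approximation of efficiency.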
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, Z.; Luo, S.; Xiang, J.; Chen, Y.; Luo, Q. Improved Chinese Giant Salamander Parental Care Behavior Detection Based on YOLOv8. Animals 2024, 14, 2089. https://doi.org/10.3390/ani14142089