Towards Context-Rich Automated Biodiversity Assessments: Deriving AI-Powered Insights from Camera Trap Data
Abstract
1. Introduction
2. Methodology
2.1. System Overview
2.2. Data Collection
2.3. Object Detection Model
2.4. Model Training
- Image size: 640 pixels, chosen to balance detection accuracy with computational efficiency while aligning with the dataset’s mean resolution;
- Batch size: 256, to enable stable weight updates without exceeding GPU memory capacity;
- Epochs: 50, providing sufficient time for convergence while minimising overfitting risks;
- Learning rate: 0.01, enabling balanced gradient updates for steady training progress;
- Momentum: 0.937, enhancing gradient stability and directional convergence during training.
- Hue adjustment (hsv_h = 0.015): randomly modified by up to 1.5%, introducing subtle colour shifts;
- Saturation adjustment (hsv_s = 0.7): altered by up to 70%, diversifying the colour intensity;
- Brightness adjustment (hsv_v = 0.4): adjusted by up to 40%, simulating various lighting conditions;
- Horizontal flip (fliplr = 0.5): applied with a 50% probability, increasing the invariance to directionality;
- Translation (translate = 0.1): randomly shifted up to 10%, enhancing the robustness to positional variations;
- Scaling (scale = 0.5): objects were resized by up to 50%, improving detection across size variations;
- Random erasing (erasing = 0.4): applied to 40% of images, simulating occlusions by randomly removing portions of the image.
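To make these settings concrete, the snippet below maps the listed hyperparameters and augmentation probabilities onto a single training call. It is a minimal sketch assuming the Ultralytics training interface for YOLOv10-X; the dataset definition file and checkpoint name are illustrative placeholders, and argument names may differ between library releases.

```python
# Illustrative YOLOv10-X training call using the hyperparameters and
# augmentation settings listed above (sketch only; paths are placeholders).
from ultralytics import YOLO

model = YOLO("yolov10x.pt")  # pretrained YOLOv10-X checkpoint (assumed name)

model.train(
    data="sub_saharan.yaml",  # hypothetical dataset definition file
    imgsz=640,       # image size
    batch=256,       # batch size
    epochs=50,       # training epochs
    lr0=0.01,        # initial learning rate
    momentum=0.937,  # momentum
    hsv_h=0.015,     # hue adjustment
    hsv_s=0.7,       # saturation adjustment
    hsv_v=0.4,       # brightness adjustment
    fliplr=0.5,      # horizontal flip probability
    translate=0.1,   # translation
    scale=0.5,       # scaling
    erasing=0.4,     # random erasing
)
```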
2.5. Vision Language Model
- “How many elephants were observed in January 2024, and how does this compare to January 2023?”
- “How many zebras were observed in January 2023 compared with 2024, and were they more commonly observed during the day or night?”
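Such questions are ultimately answered by combining per-image outputs of the vision–language model with the structured detection metadata. As a minimal sketch of how an individual camera trap image is interrogated, the following assumes the Hugging Face Transformers interface for microsoft/Phi-3.5-vision-instruct; the image path and question text are illustrative rather than taken from the study pipeline.

```python
# Sketch of a single-image query to Phi-3.5-vision-instruct via Transformers.
# The image path and question are illustrative placeholders.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="cuda"
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("camera_trap_frame.jpg")  # hypothetical annotated frame
question = (
    "Read the label on the bounding box to identify the animal. "
    "How many individuals are visible, and is it day or night?"
)

messages = [{"role": "user", "content": f"<|image_1|>\n{question}"}]
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(prompt, [image], return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=256)
answer = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```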
2.6. Retrieval-Augmented Generation (RAG)
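A hedged sketch of the retrieval step follows: knowledge-base passages are embedded, indexed with FAISS, and the passages most relevant to a question are retrieved to ground the language model’s answer. The embedding model, passages, and index type shown are illustrative assumptions rather than the exact configuration used in this work.

```python
# Sketch of the retrieval step in a RAG pipeline: embed passages, index them
# with FAISS, and fetch the top-k passages for a question. The embedding model
# and passages are illustrative assumptions.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

passages = [
    "The rhinoceros is listed as near threatened on the IUCN Red List.",
    "The blue wildebeest weighs around 250 kg and is highly migratory.",
    "The spotted hyena's only true predators are lions and humans.",
]

embeddings = embedder.encode(passages, convert_to_numpy=True).astype(np.float32)
index = faiss.IndexFlatL2(embeddings.shape[1])  # exact L2 nearest-neighbour search
index.add(embeddings)

query = "What is the IUCN conservation status of the rhinoceros?"
query_vec = embedder.encode([query], convert_to_numpy=True).astype(np.float32)
_, nearest = index.search(query_vec, 2)  # retrieve the two closest passages

context = "\n".join(passages[i] for i in nearest[0])
# `context` is then prepended to the question in the prompt passed to the LLM.
print(context)
```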
2.7. Visual Question Answering
2.8. Automatic Reporting
- “How has the population of giraffes fluctuated between 2023 and 2024?”
- “What species were observed at night in the dry season versus the wet season?”
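Report questions of this kind are answered by aggregating the structured detection records (species, timestamp, diel period) generated upstream. The pandas sketch below illustrates such an aggregation; the column names and example records are hypothetical placeholders for the pipeline’s metadata.

```python
# Sketch of aggregating structured detection records to answer report
# questions such as those above. Column names and records are hypothetical.
import pandas as pd

detections = pd.DataFrame(
    [
        {"species": "Giraffa camelopardalis", "timestamp": "2023-06-14 07:32", "period": "day"},
        {"species": "Giraffa camelopardalis", "timestamp": "2024-06-02 19:58", "period": "night"},
        {"species": "Equus quagga", "timestamp": "2024-01-11 12:05", "period": "day"},
    ]
)
detections["timestamp"] = pd.to_datetime(detections["timestamp"])
detections["year"] = detections["timestamp"].dt.year

# Giraffe observations per year (first report question).
giraffes_per_year = (
    detections[detections["species"] == "Giraffa camelopardalis"]
    .groupby("year")
    .size()
)

# Species observed at night, by year; a season column would allow the
# dry-versus-wet-season split in the second report question.
night_species = (
    detections[detections["period"] == "night"]
    .groupby("year")["species"]
    .unique()
)

print(giraffes_per_year)
print(night_species)
```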
2.9. Evaluation Metrics
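For reference, the results that follow use the standard definitions of accuracy, precision, recall, and F1 score, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives:

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```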
3. Results
3.1. Training Results for the Sub-Saharan Model
3.2. Results for Vision–Language Model Without YOLOv10-X Object Detection Support
3.3. Results for Vision–Language Model with OD Support
3.4. Results for Retrieval-Augmented Generation
3.5. Automated Reporting
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Class | Common Name | Acc (Without OD) | Pre (Without OD) | Rec (Without OD) | F1 (Without OD) | Acc (With OD) | Pre (With OD) | Rec (With OD) | F1 (With OD)
---|---|---|---|---|---|---|---|---|---
Canis mesomelas | Black-backed jackal | 0.98 | 1.00 | 0.20 | 0.33 | 0.99 | 0.70 | 1.00 | 0.82 |
Syncerus caffer | African buffalo | 0.98 | 1.00 | 0.63 | 0.77 | 0.99 | 1.00 | 0.84 | 0.91 |
Aepyceros melampus | Impala | 0.96 | 1.00 | 0.18 | 0.30 | 0.99 | 1.00 | 0.63 | 0.77 |
Hippotragus equinus | Roan antelope | 0.95 | 1.00 | 0.08 | 0.14 | 0.99 | 1.00 | 0.54 | 0.70 |
Bunolagus monticularis | Bushman rabbit | 0.95 | 1.00 | 0.08 | 0.15 | 0.99 | 1.00 | 0.91 | 0.95 |
Phacochoerus africanus | Common warthog | 0.94 | 1.00 | 0.16 | 0.28 | 0.99 | 1.00 | 0.72 | 0.83 |
Pan troglodytes | Chimpanzee | 0.96 | 1.00 | 0.27 | 0.42 | 0.99 | 0.91 | 1.00 | 0.95 |
Smutsia gigantea | Giant ground pangolin | 0.88 | 1.00 | 0.08 | 0.16 | 0.98 | 1.00 | 0.68 | 0.80 |
Orycteropus afer | Aardvark | 0.95 | 1.00 | 0.14 | 0.25 | 0.98 | 1.00 | 0.61 | 0.75 |
Hippopotamus amphibius | Common hippopotamus | 0.95 | 1.00 | 0.83 | 0.15 | 1.00 | 1.00 | 1.00 | 1.00
Oryx gazella | South African oryx | 0.95 | 1.00 | 0.20 | 0.33 | 1.00 | 1.00 | 1.00 | 1.00 |
Struthio camelus | Common ostrich | 0.98 | 1.00 | 0.60 | 0.75 | 0.99 | 1.00 | 0.80 | 0.88 |
Alcelaphus buselaphus | Hartebeest | 0.97 | 1.00 | 0.16 | 0.28 | 1.00 | 1.00 | 1.00 | 1.00 |
Kobus ellipsiprymnus | Waterbuck | 0.98 | 1.00 | 0.20 | 0.33 | 0.99 | 1.00 | 0.40 | 0.57 |
Gorilla sp. | Gorilla | 0.97 | 1.00 | 0.68 | 0.81 | 1.00 | 1.00 | 1.00 | 1.00 |
Tragelaphus eurycerus | Bongo | 0.92 | 1.00 | 0.09 | 0.16 | 0.96 | 1.00 | 0.09 | 0.16 |
Kobus kob | African antelope | 0.97 | 1.00 | 0.28 | 0.44 | 1.00 | 1.00 | 1.00 | 1.00 |
Numida meleagris | Helmeted guineafowl | 0.98 | 1.00 | 0.25 | 0.40 | 1.00 | 1.00 | 1.00 | 1.00 |
Hystrix cristata | Crested porcupine | 0.95 | 1.00 | 0.09 | 0.16 | 0.99 | 1.00 | 0.72 | 0.84 |
Crocuta crocuta | Spotted hyena | 0.98 | 1.00 | 0.55 | 0.71 | 0.99 | 0.76 | 1.00 | 0.86 |
Loxodonta africana | African elephant | 0.97 | 1.00 | 0.86 | 0.92 | 0.98 | 0.84 | 0.97 | 0.90 |
Acinonyx jubatus | Cheetah | 0.97 | 1.00 | 0.62 | 0.76 | 0.99 | 1.00 | 0.78 | 0.88 |
Papio sp. | Baboon | 0.90 | 1.00 | 0.07 | 0.13 | 0.97 | 1.00 | 0.44 | 0.61 |
Rhinocerotidae | Rhinoceros | 0.37 | 0.08 | 0.66 | 0.15 | 0.88 | 0.46 | 0.98 | 0.62 |
Connochaetes taurinus | Blue wildebeest | 0.93 | 1.00 | 0.30 | 0.46 | 0.99 | 0.90 | 0.86 | 0.88 |
Tragelaphus oryx | Common eland | 0.93 | 0.83 | 0.23 | 0.37 | 0.99 | 1.00 | 0.80 | 0.89 |
Giraffa camelopardalis | Giraffe | 0.95 | 1.00 | 0.71 | 0.83 | 0.99 | 0.97 | 0.97 | 0.97
Panthera leo | African lion | 0.79 | 1.00 | 0.31 | 0.47 | 0.98 | 0.93 | 0.96 | 0.94 |
Equus quagga | Plains zebra | 0.92 | 1.00 | 0.52 | 0.68 | 0.99 | 0.97 | 1.00 | 0.98 |
Question | Answer | Ground Truth | Pre | Rec | F1 |
---|---|---|---|---|---
Q1. Read the label on the bounding box to identify the animal. What is the species identified in the image, and what is its IUCN conservation status? | A1. The species identified in the image is a rhino. According to the provided passages, the IUCN conservation status of the rhinoceros is “near threatened”. | GT1. The species in the image is a rhinoceros, and its IUCN status is near threatened. | 0.95 | 0.91 | 0.93 |
Q2. Read the label on the bounding box to identify the animal. What is the average weight of the species identified, and does this species have any notable characteristics or behaviours? | A2. The animal identified in the image is a Connochaetes taurinus, also known as the blue wildebeest. The average weight of the blue wildebeest is around 170 to 410 kg (370 to 900 lb). This species is known for its distinctive blue-grey skin, which is unique among wildebeest species. They are also known for their migratory behaviour, forming large herds that travel in search of food and water. | GT2. There are two Connochaetes taurinus in the image. The average weight is 250 kg (550 lb). It is highly gregarious and superbly adapted for a migratory existence. | 0.87 | 0.85 | 0.86
Q3. Was the image taken during the day or night, and what environmental factors can be observed (e.g., forest, bush, water sources)? | A3. The image was taken during the day. The environmental factors that can be observed include a grassy hill with some bushes and trees scattered around. There are no visible water sources in the image. | GT3. The image was taken during the day. The environment is a grassland with several trees typical of an African plain. | 0.94 | 0.91 | 0.91 |
Q4. Read the label on the bounding box to identify the animal. How does the species identified in the image compare to other species in the same habitat in terms of size, behaviour, and diet? | A4. The species identified in the image is the Giraffa camelopardalis, which is a large herbivorous mammal native to Africa. Giraffes are known for their long necks and legs, and they are the tallest mammals on Earth. They are typically found in savannas, grasslands, and open woodlands in Africa. Giraffes are also known for their unique behaviour of browsing on leaves, flowers, and fruits from tall trees, which sets them apart from other antelope species that primarily graze on grasses. In terms of diet, giraffes are herbivores and primarily feed on leaves, flowers, and fruits from trees, particularly from the Acacia tree. | GT4. The animal is a Giraffa camelopardalis. Giraffa camelopardalis are the tallest mammal with a long neck that makes it ideal for eating the leaves and twigs of acacia, mimosa, and wild apricot trees. Giraffa camelopardalis are herbivores. | 0.87 | 0.86 | 0.86 |
Q5. Read the label on the bounding box to identify animals. Can you identify other animals or objects in the image, such as nearby trees, water bodies, or structures? | A5. In the image, there is a zebra standing in the centre of the frame. There are no other animals visible in the image. The background consists of a dirt road and some vegetation, but no water bodies or structures are visible. | GT5. The animal in the image is an Equus quagga. It is standing on a dirt track, which is surrounded by trees. There are no other animals in the image. | 0.89 | 0.89 | 0.89
Q6. Read the labels on the bounding boxes to identify animals. What animals are in the image, and how many are there of each animal species identified? | A6. The image contains three rhinoceroses. | GT6. The animals in the image are rhinoceroses. There are three of them. | 0.87 | 0.87 | 0.87
Q7. Based on the species and its habits, what predictions can be made about its activity at the time the camera trap image was taken (e.g., hunting, foraging, resting)? | A7. Based on the species and its habits, it can be predicted that the lion is likely resting or inactive at the time the camera trap image was taken. Lions spend much of their time resting, with only intermittent bursts of activity throughout the day. The image shows a lion lying down on the grass, which supports this prediction. | GT7. The picture contains several lions—only one is fully visible. All the lions are lying down resting. | 0.89 | 0.85 | 0.87 |
Q8. Read the label on the bounding box around the animal to determine the species. What potential threats, either natural or human-induced, are most relevant to the species in the image, given its current IUCN status and environment? | A8. The species in the image is a pangolin, specifically the giant pangolin (Manis gigantea). Given its current IUCN status as endangered, the potential threats most relevant to the species in the image include illegal, international trade in pangolin skin, scales, and meat, as well as habitat loss and fragmentation. These threats are primarily human-induced and can lead to a decline in pangolin populations. | GT8. The image contains a pangolin, which is the most hunted mammal with an IUCN status of endangered. Its main threat is humans who poach them for the meat and scales. | 0.90 | 0.87 | 0.88 |
Q9. Read the label on the bounding box around the animal to determine the species. What is the species’ role in the ecosystem, and how does its presence affect other species or the environment in the area where the image was captured? | A9. The species in the image is the African bush elephant (Loxodonta africana). As a keystone species, the presence of African bush elephants in the ecosystem has a significant impact on other species and the environment. They play a crucial role in shaping the landscape by uprooting trees and undergrowth, which can transform savannah into grasslands. This creates more accessible habitat for smaller herbivores to access trees that have been mowed down by elephants. Additionally, their digging for water during dry seasons can create water holes that benefit other species in the area. | GT9. The image contains an African elephant. They maintain habitats and promote the growth of larger trees. They trample forests and dense grasslands, making room for smaller species to co-exist. | 0.87 | 0.83 | 0.85
Q10. Read the label on the bounding box around the animal to determine the species. What are the known predators or threats to the species in the image, and are there any visible indicators in the environment that suggest the presence of these threats? | A10. The species in the image is a spotted hyena (Crocuta crocuta). Known predators or threats to the spotted hyena include lions (Panthera leo), cheetahs (Acinonyx jubatus), leopards (Panthera pardus), brown hyenas (Parahyaena brunnea), and African wild dogs (Lycaon). There are no visible indicators in the environment that suggest the presence of these threats. | GT10. The animal in the image is Crocuta crocuta. Its only true predators are lions and humans. There are no visible threats in the image. | 0.90 | 0.81 | 0.85