Eyebirds: Enabling the Public to Recognize Water Birds at Hand
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Construction of the Water Bird Image Dataset
2.2. The Eyebirds System
2.2.1. Architecture
2.2.2. Attention Mechanism-Based Deep Convolutional Neural Network
2.3. Implementation of the Eyebirds System
2.4. Experimental Protocol
3. Results and Discussion
3.1. Effect of the Parameter on AM-CNN Performance
3.2. Performance on the North American Bird Dataset
3.3. Performance on Our Water Bird Dataset
3.4. Advantages and Disadvantages of the AM-CNN Model
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
AM-CNN | attention mechanism-based deep convolutional neural network |
PCL | primary convolutional layer |
SFEL | shallow network feature extraction layer |
SPP | spatial pyramid pooling |
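Since SPP appears in the abbreviation list and in the AM-CNN architecture, a minimal PyTorch sketch may help readers unfamiliar with the idea: pooling the same feature map at several grid resolutions and concatenating the flattened results yields a fixed-length vector for any input size (He et al., cited in the references below). The class name, the pyramid levels (1, 2, 4), and the tensor shapes here are illustrative assumptions, not the exact Eyebirds configuration.

```python
import torch
import torch.nn.functional as F
from torch import nn

class SpatialPyramidPooling(nn.Module):
    """Pool a feature map at several pyramid levels and concatenate
    the flattened results into one fixed-length vector."""

    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels

    def forward(self, x):
        # x: (batch, channels, height, width)
        features = []
        for level in self.levels:
            # Adaptive max pooling yields a level x level grid
            # regardless of the input's spatial size.
            pooled = F.adaptive_max_pool2d(x, output_size=level)
            features.append(pooled.flatten(start_dim=1))
        # Output length: channels * sum(level**2 for level in levels)
        return torch.cat(features, dim=1)

# Feature maps of different spatial sizes map to the same output length.
spp = SpatialPyramidPooling(levels=(1, 2, 4))
for size in (13, 24):
    out = spp(torch.randn(2, 256, size, size))
    print(out.shape)  # torch.Size([2, 5376]) since 256 * (1 + 4 + 16) = 5376
```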
References
- Amano, T.; Székely, T.; Sandel, B.; Nagy, S.; Mundkur, T.; Langendoen, T.; Blanco, D.; Soykan, C.; Sutherland, W. Successful conservation of global waterbird populations depends on effective governance. Nature 2018, 553, 199–202.
- Zhu, F.; Zou, Y.; Zhang, S.Q.; Chen, X.S.; Li, F.; Deng, Z.M.; Zhu, X.Y.; Xie, Y.H.; Zou, D.S. Dyke demolition led to a sharp decline in waterbird diversity due to habitat quality reduction: A case study of Dongting Lake, China. Ecol. Evol. 2022, 12, e8782.
- Wang, C.; Zhou, Y.; Zhang, H.B.; Li, Y.F.; Liu, H.Y.; Dong, B. Study on the rare waterbird habitat networks of a new UNESCO World Natural Heritage site based on scenario simulation. Sci. Total Environ. 2022, 843, 157058.
- Marini, A.; Facon, J.; Koerich, A.L. Bird species classification based on color features. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, Manchester, UK, 13–16 October 2013; pp. 4336–4341.
- Kasten, E.P.; McKinley, P.K.; Gage, S.H. Ensemble extraction for classification and detection of bird species. Ecol. Inform. 2010, 5, 153–166.
- Acevedo, M.A.; Corrada-Bravo, C.J.; Corrada-Bravo, H.; Villanueva-Rivera, L.J.; Aide, T.M. Automated classification of bird and amphibian calls using machine learning: A comparison of methods. Ecol. Inform. 2009, 4, 206–214.
- Bardeli, R.; Wolff, D.; Kurth, F.; Koch, M.; Tauchert, K.; Frommolt, K. Detecting bird songs in a complex acoustic environment and application to bioacoustic monitoring. Pattern Recogn. Lett. 2010, 31, 1524–1534.
- Kahl, S.; Wood, C.M.; Eibl, M.; Klinck, H. BirdNET: A deep learning solution for avian diversity monitoring. Ecol. Inform. 2021, 61, 101236.
- Akçay, H.G.; Kabasakal, B.; Aksu, D.; Demir, N.; Öz, M.; Erdoğan, A. Automated Bird Counting with Deep Learning for Regional Bird Distribution Mapping. Animals 2020, 10, 1207.
- Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828.
- Lee, M.H.; Park, I.K. Performance evaluation of local descriptors for maximally stable extremal regions. J. Vis. Commun. Image Represent. 2017, 47, 62–72.
- Deng, L.; Yu, D. Deep learning: Methods and applications. Found. Trends Signal Process. 2014, 7, 197–387.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Miller, J.; Nair, U.; Ramachandran, R.; Maskey, M. Detection of transverse cirrus bands in satellite imagery using deep learning. Comput. Geosci. 2018, 118, 79–85.
- Wei, X.S.; Xie, C.W.; Wu, J.; Shen, C. Mask-CNN: Localizing parts and selecting descriptors for fine-grained bird species categorization. Pattern Recogn. 2018, 76, 704–714.
- Zhao, B.; Feng, J.; Wu, X.; Yan, S.C. A survey on deep learning-based fine-grained object classification and semantic segmentation. Int. J. Autom. Comput. 2017, 14, 119–135.
- Zhang, N.; Donahue, J.; Girshick, R.B.; Darrell, T. Part-Based R-CNNs for Fine-Grained Category Detection. In Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2014; pp. 834–849.
- Krause, J.; Jin, H.; Yang, J.; Li, F.F. Fine-grained recognition without part annotations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 5546–5555.
- Wang, D.; Shen, Z.; Shao, J.; Zhang, W.; Xue, X.; Zhang, Z. Multiple granularity descriptors for fine-grained categorization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 2399–2406.
- Zhang, X.; Xiong, H.; Zhou, W.; Lin, W.; Tian, Q. Picking deep filter responses for fine-grained image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1134–1142.
- Liu, Z.D.; Lu, F.X.; Wang, P.; Miao, H.; Zhang, L.G.; Zhou, B. 3D Part Guided Image Editing for Fine-Grained Object Understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11336–11345.
- Wah, C.; Branson, S.; Welinder, P.; Perona, P.; Belongie, S. The Caltech-UCSD Birds-200-2011 Dataset; Computation & Neural Systems Technical Report CNS-TR-2011-001; California Institute of Technology: Pasadena, CA, USA, 2011; Available online: https://resolver.caltech.edu/CaltechAUTHORS:20111026-120541847 (accessed on 2 February 2021).
- Li, Y.; Zeng, J.B.; Shan, S.G.; Chen, X.L. Occlusion Aware Facial Expression Recognition Using CNN With Attention Mechanism. IEEE Trans. Image Process. 2019, 28, 2439–2450.
- Rodriguez, P.; Velazquez, D.; Cucurull, G.; Gonfaus, J.M.; Roca, F.X.; Gonzalez, J. Pay Attention to the Activations: A Modular Attention Mechanism for Fine-Grained Image Recognition. IEEE Trans. Multimed. 2020, 22, 502–514.
- Cai, W.; Wei, Z. Remote Sensing Image Classification Based on a Cross-Attention Mechanism and Graph Convolution. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
- Niu, Z.Y.; Zhong, G.Q.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62.
- Ghaffarian, S.; Valente, J.; van der Voort, M.; Tekinerdogan, B. Effect of Attention Mechanism in Deep Learning-Based Remote Sensing Image Processing: A Systematic Literature Review. Remote Sens. 2021, 13, 2965.
- Zhang, Z.; Liang, X.; Dong, X.; Xie, Y.; Gao, G. A Sparse-View CT Reconstruction Method Based on Combination of DenseNet and Deconvolution. IEEE Trans. Med. Imaging 2018, 37, 1407–1417.
- Zhou, F.Q.; Li, X.J.; Li, Z.X. High-frequency details enhancing DenseNet for super-resolution. Neurocomputing 2018, 290, 34–42.
- Cui, B.; Chen, X.; Lu, Y. Semantic Segmentation of Remote Sensing Images Using Transfer Learning and Deep Convolutional Neural Network with Dense Connection. IEEE Access 2020, 8, 116744–116755.
- Zhang, J.M.; Lu, C.Q.; Li, X.D.; Kim, H.; Wang, J. A full convolutional network based on DenseNet for remote sensing scene classification. Math. Biosci. Eng. 2019, 16, 3345–3367.
- Li, Y.; Xie, X.; Shen, L.; Liu, S.X. Reverse active learning based atrous DenseNet for pathological image classification. BMC Bioinform. 2019, 20, 445.
- He, K.; Zhang, X.; Ren, S.; Sun, J.S. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Chen, K.; Wang, J.Q.; Pang, J.M. MMDetection: Open MMLab Detection Toolbox and Benchmark. arXiv 2019, arXiv:1906.07155.
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollar, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2014; Volume 8693, pp. 740–755.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. In Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2014; Volume 8691, pp. 346–361.
- Dewi, C.; Chen, R.C.; Yu, H. Robust detection method for improving small traffic sign recognition based on spatial pyramid pooling. J. Ambient. Intell. Humaniz. Comput. 2021.
- Tan, Y.S.; Lim, K.M.; Tee, C.; Low, C.Y. Convolutional neural network with spatial pyramid pooling for hand gesture recognition. Neural Comput. Appl. 2021, 33, 5339–5351.
- Lian, X.H.; Pang, Y.W.; Han, J.G.; Pan, J. Cascaded hierarchical atrous spatial pyramid pooling module for semantic segmentation. Pattern Recogn. 2021, 110, 107622.
- Zhang, H.; Xu, T.; Elhoseiny, M.; Huang, X.; Zhang, S.; Elgammal, A.; Metaxas, D. SPDA-CNN: Unifying semantic part detection and abstraction for fine-grained recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1143–1152.
- Lin, T.Y.; RoyChowdhury, A.; Maji, S. Bilinear CNN models for fine-grained visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 1309–1322.
- Xiao, T.; Xu, Y.; Yang, K.; Zhang, J.; Peng, Y.; Zhang, Z. The application of two-level attention models in deep convolutional neural network for fine-grained image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 842–850.
- Zhao, B.; Wu, X.; Feng, J.; Peng, Q.; Yan, S. Diversified visual attention networks for fine-grained object classification. IEEE Trans. Multimed. 2017, 19, 1245–1256.
No | Item | Specification |
---|---|---|
1 | CPU | Dual Intel(R) Xeon Silver processors, 2.2 GHz, 40 cores |
2 | Hard drive and memory | SSD: 2 TB; DDR RAM: 128 GB |
3 | GPU | Dual NVIDIA GeForce RTX 2080 |
4 | Operating system | CentOS 7.3 (Linux) |
5 | HTTP dynamic/static page service | Flask, Nginx + Gunicorn |
6 | Deep learning framework | PyTorch 1.0 |
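To make the serving stack in the table above concrete (Flask for the HTTP service, Nginx + Gunicorn in front of it, PyTorch for inference), here is a minimal sketch of what such a prediction endpoint could look like. The model file am_cnn.pt, the /predict route, the preprocessing constants, and the top-3 response format are hypothetical placeholders, not the published Eyebirds implementation.

```python
import io

import torch
import torchvision.transforms as T
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)

# Load the trained classifier once at startup (a TorchScript archive is
# assumed here; the file name am_cnn.pt is hypothetical).
model = torch.jit.load("am_cnn.pt").eval()

# Standard ImageNet-style preprocessing; the exact values are assumptions.
preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@app.route("/predict", methods=["POST"])
def predict():
    # The client uploads an image file under the "image" form field.
    image = Image.open(io.BytesIO(request.files["image"].read())).convert("RGB")
    batch = preprocess(image).unsqueeze(0)          # (1, 3, 224, 224)
    with torch.no_grad():
        probs = torch.softmax(model(batch), dim=1)  # class probabilities
    top = torch.topk(probs, k=3)
    return jsonify({
        "classes": top.indices[0].tolist(),  # map to species names in practice
        "scores": top.values[0].tolist(),
    })

# In production this would sit behind Nginx with a WSGI server, e.g.:
#   gunicorn -w 4 -b 127.0.0.1:8000 app:app
```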
Values | Accuracy (%) |
---|---|
1 | 82.77 |
2 | 82.43 |
3 | 82.82 |
4 | 83.60 |
5 | 84.56 |
6 | 83.43 |
7 | 83.30 |
8 | 83.42 |
9 | 83.04 |
10 | 82.36 |
11 | 82.60 |
12 | 82.43 |
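The sweep above reports one accuracy per parameter setting. For readers who want to run a similar protocol, the following self-contained PyTorch sketch shows the pattern: build a model per setting, measure top-1 accuracy on a held-out loader, and keep the best. The synthetic data, the toy model, and the meaning of the swept value are placeholders standing in for the AM-CNN and the water bird test split.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader):
    """Return top-1 accuracy (%) of `model` over `loader`."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return 100.0 * correct / total

# Synthetic stand-ins: random "images" and labels over 48 dummy classes.
data = TensorDataset(torch.randn(64, 3 * 32 * 32), torch.randint(0, 48, (64,)))
loader = DataLoader(data, batch_size=16)

def build_model(hidden):
    # Toy classifier; `hidden` plays the role of the swept parameter.
    return nn.Sequential(nn.Linear(3 * 32 * 32, hidden), nn.ReLU(),
                         nn.Linear(hidden, 48))

results = {v: evaluate(build_model(hidden=16 * v), loader) for v in range(1, 13)}
best = max(results, key=results.get)
print(f"best setting: {best} ({results[best]:.2f}% accuracy)")
```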
No | Methods | Additional Annotation | Accuracy (%) |
---|---|---|---|
1 | Part-RCNN | Yes | 81.6 |
2 | DeepLAC | Yes | 80.3 |
3 | MG-RCNN | Yes | 83.0 |
4 | PA-CNN | Yes | 82.6 |
5 | B-CNN | Yes | 85.1 |
6 | SPDA-CNN | Yes | 85.1 |
7 | PDFR | No | 82.6 |
8 | DVAN | No | 79.0 |
9 | TLAN | No | 77.9 |
10 | AM-CNN | No | 85.0 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).