Blind Quality Assessment of Images Containing Objects of Interest
Abstract
1. Introduction
2. Materials and Methods
2.1. Images Containing Objects of Interest
2.2. DETR-IQA
2.3. Evaluation Metrics
2.4. Implementation Details
3. Results
3.1. Images Containing Objects of Interest
3.2. Performance Evaluation
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Class | No. of Images |
---|---|
Person | 2069 |
Bird | 543 |
Dog | 456 |
Horse | 105 |
Other animal | 541 |

Training Set | Test Set |
---|---|
2866 | 716 |

Model Name |  |  |  |  |  |  |  |  |  |
---|---|---|---|---|---|---|---|---|---|
DETR | 0.74 | 0.663 | 0.112 | 0.269 | 0.678 |  |  |  |  |
0.99 | 0.728 | 0.646 | 0.118 | 0.21 | 0.657 | 0.746 | 0.707 | 0.282 | 0.299 |
0.90 | 0.726 | 0.647 | 0.224 | 0.201 | 0.665 | 0.732 | 0.686 | 0.549 | 0.514 |
0.80 | 0.741 | 0.647 | 0.114 | 0.267 | 0.668 | 0.785 | 0.727 | 0.659 | 0.609 |
0.70 | 0.73 | 0.642 | 0.083 | 0.256 | 0.663 | 0.763 | 0.721 | 0.643 | 0.591 |
0.60 | 0.672 | 0.589 | 0.107 | 0.299 | 0.61 | 0.735 | 0.709 | 0.645 | 0.565 |
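
Section 2.3 names the evaluation criteria; blind-IQA studies of this kind conventionally report correlation metrics between the predicted quality scores and subjective scores (MOS). As a minimal sketch of how such criteria are computed, assuming NumPy/SciPy and the standard PLCC/SROCC/KROCC/RMSE set (an assumption based on common IQA practice, not a statement of this table's exact columns; `predicted` and `mos` are illustrative arrays):

```python
# Sketch: correlation/error metrics commonly used to evaluate blind IQA
# models. The metric set (PLCC, SROCC, KROCC, RMSE) is an assumption based
# on standard IQA practice, not taken from the paper's table.
import numpy as np
from scipy import stats

def iqa_metrics(predicted: np.ndarray, mos: np.ndarray) -> dict:
    """Agreement between predicted quality scores and subjective MOS."""
    plcc, _ = stats.pearsonr(predicted, mos)     # linear correlation
    srocc, _ = stats.spearmanr(predicted, mos)   # rank-order correlation
    krocc, _ = stats.kendalltau(predicted, mos)  # Kendall rank correlation
    rmse = float(np.sqrt(np.mean((predicted - mos) ** 2)))
    return {"PLCC": plcc, "SROCC": srocc, "KROCC": krocc, "RMSE": rmse}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mos = rng.uniform(1, 5, size=100)               # hypothetical MOS values
    predicted = mos + rng.normal(0, 0.5, size=100)  # hypothetical model outputs
    print(iqa_metrics(predicted, mos))
```

In practice, a nonlinear logistic mapping is often fitted between predictions and MOS before computing PLCC and RMSE, following the standard VQEG evaluation procedure.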

Model Name | FLOPs | Params | Inference FPS |
---|---|---|---|
DETR-IQA | 70.09 G | 41.41 M | 20 |
DETR | 70.01 G | 41.27 M | 20 |
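
As context for how complexity figures like those above are typically obtained: parameter counts can be summed directly from a PyTorch model, and inference FPS measured by timing repeated forward passes; FLOPs are usually taken from a profiler such as fvcore's `FlopCountAnalysis`. A minimal sketch, assuming PyTorch; the tiny stand-in network and the 800×800 input are hypothetical and not the DETR-IQA architecture:

```python
# Sketch: measuring parameter count and inference FPS for a PyTorch model.
# The small CNN below is a hypothetical placeholder, NOT DETR-IQA.
import time
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 1),  # single quality-score output
).eval()

# Parameter count, reported in millions (the "Params" column).
n_params = sum(p.numel() for p in model.parameters())
print(f"Params: {n_params / 1e6:.2f} M")

# Inference FPS: average throughput over repeated forward passes.
x = torch.randn(1, 3, 800, 800)  # hypothetical input resolution
with torch.no_grad():
    for _ in range(5):           # warm-up iterations
        model(x)
    start = time.perf_counter()
    n_runs = 50
    for _ in range(n_runs):
        model(x)
    fps = n_runs / (time.perf_counter() - start)
print(f"Inference FPS: {fps:.1f}")
```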
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
He, W.; Luo, Z. Blind Quality Assessment of Images Containing Objects of Interest. Sensors 2023, 23, 8205. https://doi.org/10.3390/s23198205