Patch-Based Auxiliary Node Classification for Domain Adaptive Object Detection
Abstract
1. Introduction
- We propose a patch-based method that fuses information from the local regions of nodes. It addresses inaccurate node classification caused by missing contextual information and reduces the risk of node category confusion after domain fusion.
- We design a progressive strategy to integrate the inherent features of nodes and the learned local region features, enabling the network to stably utilize local region features for accurate node classification.
- We develop a model that combines the above method and strategy with a DAOD model. We evaluate the proposed model in different domain adaptation scenarios, including adaptation to adverse weather conditions and adaptation from synthesized data to real data, and achieve SOTA results.
2. Related Work
2.1. Object Detection
2.2. Unsupervised Domain Adaptation
2.3. Domain Adaptive Object Detection
3. Method
3.1. Overview
3.2. Patch-Based Auxiliary Node Classification
Algorithm 1. Patch-based Auxiliary Node Classification (PANC)
Input: domain fusion nodes
Output: node classification loss
1. for each domain fusion node do
2. Find the k neighbor nodes closest to the current node
3. Merge these k neighbor nodes with the current node to construct a patch representation using Equation (2)
4. Employ multi-layer convolutional neural networks to perceive the local region information of the current node and learn its local region feature representation
5. Calculate the classification cross-entropy loss of the current node based on its local region feature representation using Equation (4)
6. end for
7. Calculate the node classification loss using Equation (3)
8. return the node classification loss
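The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: Equations (2)-(4) are not reproduced in this text, so the patch layout (the node stacked on top of its k nearest neighbors), the Euclidean distance, the toy 1-D convolution that stands in for the multi-layer CNN, the linear classifier, and the mean over per-node losses are all assumed. Every function name below is hypothetical.

```python
import math


def knn_indices(nodes, i, k):
    """Indices of the k nodes closest to nodes[i] (Euclidean), excluding i."""
    dists = [(math.dist(nodes[i], other), j)
             for j, other in enumerate(nodes) if j != i]
    dists.sort()
    return [j for _, j in dists[:k]]


def build_patch(nodes, i, k):
    """Assumed patch layout: node i followed by its k neighbors, (k+1) x d."""
    return [nodes[i]] + [nodes[j] for j in knn_indices(nodes, i, k)]


def conv1d_relu(patch, kernel):
    """Valid 1-D convolution along the node axis (kernel shared by all
    channels), followed by ReLU. A stand-in for one CNN layer."""
    width, dim = len(kernel), len(patch[0])
    out = []
    for start in range(len(patch) - width + 1):
        out.append([
            max(0.0, sum(kernel[t] * patch[start + t][c] for t in range(width)))
            for c in range(dim)
        ])
    return out


def local_feature(patch, kernels):
    """Apply the stacked convolutions, then average-pool over the node axis."""
    h = patch
    for kernel in kernels:
        h = conv1d_relu(h, kernel)
    return [sum(row[c] for row in h) / len(h) for c in range(len(h[0]))]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def panc_loss(nodes, labels, k, kernels, class_weights):
    """Loop of Algorithm 1; the final mean over nodes is an assumption."""
    losses = []
    for i in range(len(nodes)):
        patch = build_patch(nodes, i, k)              # steps 2-3
        feat = local_feature(patch, kernels)          # step 4
        logits = [sum(w * f for w, f in zip(row, feat))
                  for row in class_weights]           # assumed linear classifier
        losses.append(cross_entropy(logits, labels[i]))  # step 5
    return sum(losses) / len(losses)                  # step 7


# Toy usage: 8 nodes with 4-dimensional features, 3 classes, k = 5.
nodes = [[math.sin(i * 1.3 + c) for c in range(4)] for i in range(8)]
labels = [i % 3 for i in range(8)]
kernels = [[0.5, 0.3, 0.2], [0.6, 0.2, 0.2]]          # two layers of width 3
class_weights = [[0.1 * (r + c) for c in range(4)] for r in range(3)]
print(panc_loss(nodes, labels, k=5, kernels=kernels, class_weights=class_weights))
```

With k = 5 each patch has six rows, so two width-3 convolutions leave two rows to pool. A real implementation would more likely batch the neighbor search and the convolutions on the GPU.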
3.3. Progressive Node Feature Fusion Strategy
3.4. Loss Function and Model Optimization
4. Experiments
4.1. Datasets
4.2. Implementation Details
4.3. Comparison with SOTA Methods
4.4. Parameter Analysis
4.5. Visualization Analysis
4.5.1. Result Visualization Analysis
4.5.2. Feature Visualization Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Stage | Network’s Characteristic | Reliable | Effective | Weight of Local Features
---|---|---|---|---|
Early | Weak foundation | No | No | Increase slowly |
Middle | Solid foundation and active thinking | Yes | Yes | Increase rapidly |
Later | Rigid thinking | Yes | No | Keep unchanged |
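The three stages in the table above (slow increase, rapid increase, then a constant weight) can be illustrated with a simple schedule. The sketch below is only one possible shape, assuming a smoothstep ramp and an additive fusion rule. The schedule formula, the fusion rule, and the names `progressive_weight`, `fuse`, `ramp_steps` and `max_weight` are hypothetical and not taken from the paper.

```python
def progressive_weight(step, ramp_steps, max_weight):
    """Smoothstep ramp from 0 to max_weight over ramp_steps, then constant.

    The curve is flat near step 0 (early stage), steepest around the middle
    of the ramp (middle stage), and stays at max_weight afterwards (later stage).
    """
    if ramp_steps <= 0:
        raise ValueError("ramp_steps must be positive")
    p = min(max(step / ramp_steps, 0.0), 1.0)   # training progress clamped to [0, 1]
    return max_weight * (3.0 * p * p - 2.0 * p ** 3)


def fuse(inherent, local, weight):
    """Assumed fusion rule: inherent node feature plus weighted local feature."""
    return [f + weight * g for f, g in zip(inherent, local)]


# Illustrative values only: ramp over 100 steps towards a weight of 0.2.
for step in (0, 10, 50, 90, 100, 200):
    print(step, progressive_weight(step, ramp_steps=100, max_weight=0.2))
```

Keeping the weight small while the local-feature branch is still unreliable limits how much an untrained branch can disturb node classification, which is the motivation the table gives for the early stage.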
Method | Person | Rider | Car | Truck | Bus | Train | Motor | Bike | mAP |
---|---|---|---|---|---|---|---|---|---|
EPM ECCV’20 | 43.9 | 41.2 | 60.1 | 22.5 | 49.8 | 35.3 | 23.7 | 36.4 | 39.1 |
CTRP ECCV’20 | 32.7 | 44.4 | 50.1 | 21.7 | 45.6 | 25.4 | 30.1 | 36.8 | 35.9 |
GPA(Res-50) CVPR’20 | 32.9 | 46.7 | 54.1 | 24.7 | 45.7 | 41.1 | 32.4 | 38.7 | 39.5 |
DBGL ICCV’21 | 33.5 | 46.4 | 49.7 | 28.2 | 45.9 | 39.7 | 34.8 | 38.3 | 39.6 |
RPA CVPR’21 | 33.6 | 43.8 | 49.6 | 32.9 | 45.5 | 46.0 | 35.7 | 36.8 | 40.5 |
FGRR TPAMI’22 | 33.5 | 46.4 | 49.7 | 28.2 | 45.9 | 39.7 | 34.8 | 38.3 | 39.6 |
SIGMA CVPR’22 | 44.4 | 43.3 | 60.4 | 25.5 | 43.9 | 45.4 | 31.9 | 36.7 | 41.4 |
Ours | 44.8 | 42.6 | 60.8 | 30.9 | 48.2 | 43.8 | 28.3 | 37.7 | 42.1 |
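As a quick consistency check, the mAP column can be recomputed from the per-class columns, assuming mAP here is the unweighted mean of the eight per-class AP values. The helper name `mean_ap` is hypothetical, and the numbers are copied from the table above.

```python
CLASSES = ["Person", "Rider", "Car", "Truck", "Bus", "Train", "Motor", "Bike"]


def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)


# Per-class AP rows copied from the comparison table.
ours = [44.8, 42.6, 60.8, 30.9, 48.2, 43.8, 28.3, 37.7]
sigma = [44.4, 43.3, 60.4, 25.5, 43.9, 45.4, 31.9, 36.7]

print(round(mean_ap(ours), 1))   # 42.1, matches the reported mAP for "Ours"
print(round(mean_ap(sigma), 1))  # 41.4, matches the reported mAP for SIGMA
```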
Method | AP on Car |
---|---|
EPM ECCV’20 | 52.3 |
KTNet ICCV’21 | 50.7 |
MeGA CVPR’21 | 44.8 |
RPA CVPR’21 | 45.7 |
MGA CVPR’22 | 49.8 |
FGRR TPAMI’22 | 44.5 |
SIGMA CVPR’22 | 51.9 |
Ours | 53.0 |
Use of PS | | k | AP on Car
---|---|---|---|
- | - | 5 | 50.1 |
√ | 0.1 | 5 | 49.1 |
√ | 0.2 | 5 | 53.0 |
√ | 0.3 | 5 | 51.2 |
√ | 0.2 | 7 | 51.6 |
√ | 0.2 | 9 | 51.6 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Qiu, Y.; Xu, Z.; Zhang, J. Patch-Based Auxiliary Node Classification for Domain Adaptive Object Detection. Electronics 2024, 13, 1239. https://doi.org/10.3390/electronics13071239