CPD-UAV: A Benchmark Dataset for Detecting Personnel Visually Blended with the Environment Under UAV Perspective
Highlights
- CPD-UAV is a new benchmark dataset of 1061 high-resolution images with pixel-level mask annotations, built to address the detection of visually blended individuals under complex UAV perspectives.
- The Residual Gated Alignment Module (RGAM) is a lightweight, plug-and-play architectural component that improves the structural integrity and boundary localization of very small targets.
- CPD-UAV provides a data platform that bridges the domain gap between conventional camouflaged object detection research and real-world field search-and-rescue applications.
- RGAM offers a cost-effective algorithmic solution to the “vanishing boundary” and extreme scale variation challenges, which suits the resource constraints of intelligent aerial monitoring systems.
Abstract
1. Introduction
2. Related Works
2.1. Camouflaged Object Detection
2.2. Object Detection Under UAV Perspectives
2.3. Camouflaged Object Detection Datasets
3. The Proposed Dataset
3.1. Data Collection and Annotation
3.2. Dataset Properties and Statistics
3.3. Benchmark Evaluation on CPD-UAV
4. Proposed Method
4.1. Overall Framework Formulation
4.2. Residual Gated Alignment Module
4.3. Effectiveness and Generalization of RGAM
4.4. Cross-Dataset Generalization Analysis
5. Conclusions
5.1. Contributions and Impact
5.2. Limitations and Future Work
Author Contributions
Funding
Data Availability Statement
DURC Statement
Acknowledgments
Conflicts of Interest

| Dataset | Year | Scale (Images) | Target Type | Perspective | Annotation | Key Characteristics |
|---|---|---|---|---|---|---|
| CHAMELEON | 2018 | 76 | Natural Animals | Horizontal | Pixel-level Mask | Early small-scale validation |
| CAMO | 2019 | 1250 | Animals & Artificial Objects | Horizontal | Pixel-level Mask | Natural and artificial hybrid |
| COD10K | 2021 | 10,000 | Various Natural Objects | Horizontal/Close-up | Pixel & Matting-level | Largest scale, multi-category |
| NC4K | 2021 | 4121 | Various Natural Objects | Diverse Horizontal | Pixel-level Mask | Largest specialized test set |
| ACD1K | 2024 | 1078 | Artificial Camouflage | Diverse Horizontal | Pixel-level Mask & BBox | Advanced artificial camouflage |
| CPD-UAV (Ours) | 2026 | 1061 | Visually Blended Individuals | UAV (Aerial) | Pixel-level Mask | UAV perspective, extreme scale variation |

| Model | Year | Sm ↑ | Em ↑ | Fm ↑ | MAE ↓ | GFLOPs ↓ | Params (M) ↓ | FPS ↑ |
|---|---|---|---|---|---|---|---|---|
| SINet | 2020 | 0.827 | 0.817 | 0.674 | 0.020 | 27.31 | 48.95 | 31.56 |
| C2FNet | 2021 | 0.937 | 0.973 | 0.874 | 0.002 | 18.19 | 25.21 | 25.52 |
| ZoomNet | 2022 | 0.919 | 0.945 | 0.856 | 0.003 | 101.80 | 32.38 | 20.51 |
| BGNet | 2022 | 0.739 | 0.922 | 0.795 | 0.030 | 58.50 | 77.80 | 41.04 |
| PRNet | 2024 | 0.689 | 0.891 | 0.745 | 0.044 | 21.02 | 14.12 | 26.00 |
| SDRNet | 2024 | 0.932 | 0.972 | 0.865 | 0.001 | 106.26 | 126.04 | 14.17 |
| SAM2-UNet | 2024 | 0.934 | 0.980 | 0.867 | 0.002 | 159.52 | 216.40 | 22.27 |

| Model | Sm ↑ | Em ↑ | Fm ↑ | MAE ↓ | GFLOPs ↓ | Params (M) ↓ | FPS ↑ |
|---|---|---|---|---|---|---|---|
| SINet | 0.827 | 0.817 | 0.674 | 0.020 | 27.31 | 48.95 | 31.56 |
| SINet-Light | 0.804 | 0.869 | 0.640 | 0.008 | 33.93 | 46.36 | 46.67 |
| SINet-RGAM | 0.828 | 0.880 | 0.671 | 0.004 | 22.45 | 46.43 | 72.55 |
| BGNet | 0.739 | 0.922 | 0.795 | 0.030 | 58.50 | 77.80 | 41.04 |
| BGNet-Light | 0.909 | 0.934 | 0.827 | 0.002 | 48.07 | 76.18 | 36.41 |
| BGNet-RGAM | 0.930 | 0.959 | 0.870 | 0.001 | 48.07 | 76.26 | 48.09 |
| PRNet | 0.689 | 0.891 | 0.745 | 0.044 | 21.02 | 14.12 | 26.00 |
| PRNet-Light | 0.715 | 0.894 | 0.732 | 0.040 | 15.87 | 12.05 | 16.50 |
| PRNet-RGAM | 0.750 | 0.912 | 0.768 | 0.036 | 16.33 | 12.24 | 26.20 |

| Model | Sm ↑ | Em ↑ | Fm ↑ | MAE ↓ |
|---|---|---|---|---|
| SINet | 0.607 | 0.594 | 0.358 | 0.095 |
| SINet+RGAM | 0.628 | 0.643 | 0.388 | 0.121 |
| BGNet | 0.781 | 0.819 | 0.663 | 0.043 |
| BGNet+RGAM | 0.783 | 0.839 | 0.672 | 0.042 |
| PRNet | 0.868 | 0.931 | 0.813 | 0.024 |
| PRNet+RGAM | 0.870 | 0.934 | 0.811 | 0.023 |
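
The tables above report MAE and Fm without showing how such scores are obtained from a prediction map. The following is a minimal sketch of the two metrics as they are commonly defined in camouflaged object detection, not the authors' evaluation code. The function names, the fixed binarization threshold of 0.5, and the weighting beta² = 0.3 are illustrative assumptions; the paper may use a different F-measure variant (for example mean, maximum, or weighted).

```python
from typing import Sequence

# A map is a 2D grid of values: predictions in [0, 1], ground truth in {0, 1}.
Map = Sequence[Sequence[float]]


def mean_absolute_error(pred: Map, gt: Map) -> float:
    """Average per-pixel absolute difference between prediction and mask."""
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


def f_measure(pred: Map, gt: Map, threshold: float = 0.5,
              beta_sq: float = 0.3) -> float:
    """F-beta score after binarizing the prediction at a fixed threshold.

    threshold and beta_sq are assumed values for illustration only.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            predicted = p >= threshold
            actual = g >= 0.5
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    if tp == 0:
        return 0.0  # no correctly detected foreground pixel
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy 2x3 example: a small target occupying two pixels.
gt_mask = [[0, 1, 1],
           [0, 0, 0]]
pred_map = [[0.1, 0.9, 0.4],
            [0.0, 0.6, 0.0]]

mae = mean_absolute_error(pred_map, gt_mask)
fm = f_measure(pred_map, gt_mask)
```

In the toy example, one target pixel is found, one is missed, and one background pixel is a false alarm, so precision and recall are both 0.5. Because small UAV targets cover few pixels, MAE is dominated by the background and can look very low even when the target is partly missed, which is one reason to read it alongside Fm.
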
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Zhang, X.; Kang, W.; Peng, Y.; Tang, W.; Li, Q.; Hao, H.; Hou, L.; Ying, X. CPD-UAV: A Benchmark Dataset for Detecting Personnel Visually Blended with the Environment Under UAV Perspective. Drones 2026, 10, 447. https://doi.org/10.3390/drones10060447

