YOLOv7-tiny-CR: A Causal Intervention Framework for Infrared Small Target Detection with Feature Debiasing
Abstract
1. Introduction
- (1) We construct a structural causal model specifically designed for infrared scenarios, formally distinguishing causal features from non-causal ones. This provides the theoretical foundation for analyzing the root sources of feature deviation.
- (2) We develop YOLOv7-tiny-CR, an enhanced architecture that incorporates a causal attention mechanism (CAM) module into the backbone to emphasize causally relevant target features and a causal intervention (CI) module into the neck to inhibit the propagation of contextual bias, thereby achieving debiasing at the model level.
- (3) Extensive experiments on the public FLIR_ADASv2 dataset demonstrate that the proposed framework significantly improves feature discriminability. It achieves notable gains over the baseline in both mAP@50 and mAP@50:95, validating its effectiveness in mitigating feature bias and enhancing generalization.
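The module placement described in contribution (2) can be sketched in miniature. The functions below are illustrative stand-ins, not the paper's actual implementation: the CAM is approximated as a channel gate that re-weights feature responses, and the CI as a backdoor-style mixture over a small dictionary of contextual "confounder" prototypes with prior P(c). All names, shapes, and internals are assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def causal_attention(feat, w):
    """Illustrative CAM: a channel-wise sigmoid gate computed from
    globally pooled responses, emphasizing (assumed) causal channels."""
    pooled = feat.mean(axis=(1, 2))                # (C,) per-channel pool
    gate = 1.0 / (1.0 + np.exp(-(w * pooled)))     # (C,) sigmoid gate
    return feat * gate[:, None, None]

def causal_intervention(feat, confounders, prior):
    """Illustrative CI: approximate backdoor adjustment by stratifying
    the feature on each confounder prototype c and averaging over P(c)."""
    adjusted = np.zeros_like(feat)
    for c_vec, p_c in zip(confounders, prior):
        adjusted += p_c * (feat + c_vec[:, None, None])
    return adjusted

# Toy forward pass: backbone feature -> CAM -> (neck) -> CI.
feat = rng.standard_normal((16, 8, 8))             # C=16 channels, 8x8 map
w = rng.standard_normal(16)
feat = causal_attention(feat, w)                   # CAM in the backbone
confounders = rng.standard_normal((4, 16))         # 4 context prototypes
prior = np.full(4, 0.25)                           # uniform P(c)
out = causal_intervention(feat, confounders, prior)  # CI in the neck
print(out.shape)  # (16, 8, 8)
```

With a uniform prior, the CI step here reduces to adding the prior-weighted mean prototype to every spatial location; the point of the sketch is only the order of operations (attend, then intervene), not the specific arithmetic.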
2. Methods
2.1. YOLOv7-tiny-CR Model
2.2. Structural Causal Model
2.3. Causal Attention Mechanism Module
2.4. Causal Intervention Module
3. Experiments
3.1. Dataset
3.2. Evaluation Metric
3.3. Experimental Environment and Parameter Setting
3.4. Confusion Matrix Plot Results Analysis
3.5. F1-Confidence Plot Results Analysis
3.6. P, R and PR Curve Results Analysis
- (1) Precision–Confidence Curve: For most categories (e.g., “car”), precision increases monotonically with confidence, consistent with expectations for a well-performing model. However, fluctuations observed in categories such as “bus” indicate inconsistencies in prediction reliability across different classes.
- (2) Recall–Confidence Curve: Recall generally decreases as the confidence threshold increases, in line with theoretical expectations. Notably, the recall for the “person” category declines sharply once the confidence threshold exceeds 0.6, highlighting a significant risk of missed detections for this category under high-confidence settings.
- (3) Precision–Recall Curve: This curve illustrates the trade-off between precise identification and comprehensive coverage. The larger area under the curve for the “car” category indicates a better balance between precision and recall. In contrast, the steep decline of the curve for the “bus” category underscores the greater difficulty the model encounters in reliably detecting this class.
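All three curves above are traced from the same confidence-ranked detection list. A minimal sketch of the computation (assuming detections have already been matched to ground truth as true/false positives, as standard mAP@50 evaluation does):

```python
import numpy as np

def pr_curve(confidences, is_tp, num_gt):
    """Trace precision and recall as the confidence threshold sweeps
    from high to low over a ranked detection list."""
    order = np.argsort(-np.asarray(confidences))   # sort by confidence desc
    tp = np.asarray(is_tp, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    precision = cum_tp / (cum_tp + cum_fp)
    recall = cum_tp / num_gt
    return precision, recall

def average_precision(precision, recall):
    """Area under the PR curve (all-points interpolation): make the
    precision envelope monotonically non-increasing, then integrate."""
    p = np.concatenate(([0.0], precision, [0.0]))
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.maximum.accumulate(p[::-1])[::-1]       # precision envelope
    return float(np.sum(np.diff(r) * p[1:]))

# Toy example: 4 detections for one class, 3 ground-truth boxes.
conf = [0.9, 0.8, 0.7, 0.6]
tp = [1, 1, 0, 1]
prec, rec = pr_curve(conf, tp, num_gt=3)
ap = average_precision(prec, rec)
print(prec, rec, ap)
```

The curve shapes discussed above follow directly from this construction: a false positive early in the ranking (the “bus” fluctuations) dents precision at high confidence, while unmatched ground truth (the “person” misses above threshold 0.6) caps recall.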
3.7. Loss Function and Evaluation Metric Curve Plot Results Analysis
3.8. Comparative Experimental Results Analysis
3.9. Ablation Experimental Results Analysis
- (1) Baseline (YOLOv7-tiny): Serves as the reference for all comparative metrics.
- (2) CAM Module (YOLOv7-tiny+CAM): Incorporating the CAM module results in a noticeable increase in both parameter count and computational cost, while yielding only marginal improvements in the two mAP metrics. This outcome is expected, as the primary purpose of the CAM module is not to independently enhance performance, but to perform essential feature preprocessing that supports subsequent causal intervention.
- (3) CI Module (YOLOv7-tiny+CI): With a minimal parameter increase of only 0.4 M, this configuration achieves substantial performance gains: mAP@50 and mAP@50:95 each improve by 2.5 percentage points. This strongly validates that the CI module effectively mitigates bias introduced by contextual confounders through backdoor adjustment, representing the most significant contributor to detection improvement.
- (4) Full Model (YOLOv7-tiny-CR): The integration of both CAM and CI modules achieves the highest scores for mAP@50 and mAP@50:95, demonstrating a clear synergistic advantage. This indicates that the feature decoupling enabled by the CAM module, combined with the causal intervention performed by the CI module, forms a complete debiasing pipeline, wherein their combined action results in an optimal overall bias-removal effect.
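The backdoor adjustment invoked in item (3) is Pearl's standard formula [1]: rather than conditioning on the observed features (which lets the contextual confounder bias the estimate), the model stratifies over the confounder and averages with its prior. Writing $X$ for the target features, $Y$ for the detection outcome, and $C$ for the contextual confounder (the identification of $C$ with image context follows the paper's description; the formula itself is standard):

```latex
P(Y \mid do(X)) \;=\; \sum_{c} P(Y \mid X, C = c)\, P(C = c)
```

Because the sum weights each stratum by the marginal $P(C=c)$ rather than the context-dependent $P(C=c \mid X)$, the spurious path from context to prediction is cut, which is the bias-removal effect the ablation attributes to the CI module.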
3.10. Visualization Results Analysis
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Pearl, J. Causality: Models, Reasoning, and Inference; Cambridge University Press: Cambridge, UK, 2000; ISBN 978-1-139-64936-0. [Google Scholar]
- Gao, C.; Zheng, Y.; Wang, W.; Feng, F.; He, X.; Li, Y. Causal Inference in Recommender Systems: A Survey and Future Directions. ACM Trans. Inf. Syst. 2024, 42, 1–32. [Google Scholar] [CrossRef]
- Rebane, G.; Pearl, J. The Recovery of Causal Poly-Trees from Statistical Data. In Proceedings of the 3rd Conference on Uncertainty in Artificial Intelligence, Seattle, WA, USA, 10–12 July 1987; AUAI Press: Arlington, VA, USA, 1987; pp. 222–228. [Google Scholar]
- Castro, D.C.; Walker, I.; Glocker, B. Causality Matters in Medical Imaging. Nat. Commun. 2020, 11, 3673. [Google Scholar] [CrossRef] [PubMed]
- Huang, W.; Jiang, M.; Li, M.; Meng, B.; Ren, J.; Zhao, S.; Bai, R.; Yang, Y. Causal Intervention for Object Detection. In Proceedings of the 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence, Washington, DC, USA, 1–3 November 2021; pp. 770–774. [Google Scholar]
- Guo, Y.; Chen, D.; Bao, C.; Luo, Y. Causal Attention-Based Lightweight and Efficient Cervical Cancer Cell Detection Model. In Proceedings of the 2023 IEEE International Conference on Bioinformatics and Biomedicine, Istanbul, Turkiye, 5–8 December 2023; pp. 1104–1111. [Google Scholar]
- Zhang, Y.; Li, R.; Du, Z.; Ye, Q. A Ship Detection Method in Infrared Remote Sensing Images Based on Image Generation and Causal Inference. Electronics 2024, 13, 1293. [Google Scholar] [CrossRef]
- Kim, T.; Shin, S.; Yu, Y.; Kim, H.G.; Ro, Y.M. Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 26774–26783. [Google Scholar]
- Shao, F.; Luo, Y.; Zhang, L.; Ye, L.; Tang, S.; Yang, Y.; Xiao, J. Improving Weakly Supervised Object Localization via Causal Intervention. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual Event, 20–24 October 2021; ACM: New York, NY, USA, 2021; pp. 3321–3329. [Google Scholar]
- Ding, X.; Guo, Y.; Ding, G.; Han, J. ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1911–1920. [Google Scholar]
- Liu, Y.; Zhou, S.; Liu, X.; Hao, C.; Fan, B.; Tian, J. Unbiased Faster R-CNN for Single-Source Domain Generalized Object Detection. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 28838–28847. [Google Scholar]
- Liu, Y.; Shao, Z.; Teng, Y.; Hoffmann, N. NAM: Normalization-Based Attention Module. In Proceedings of the 35th Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; OpenReview: New York, NY, USA, 2023. [Google Scholar]
- Gkioxari, G.; Johnson, J.; Malik, J. Mesh R-CNN. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9784–9794. [Google Scholar]
- Lopez-Paz, D.; Nishihara, R.; Chintala, S.; Scholkopf, B.; Bottou, L. Discovering Causal Signals in Images. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 58–66. [Google Scholar]
- Wang, T.; Huang, J.; Zhang, H.; Sun, Q. Visual Commonsense R-CNN. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10757–10767. [Google Scholar]
- Xu, M.; Qin, L.; Chen, W.; Pu, S.; Zhang, L. Multi-View Adversarial Discriminator: Mine the Non-Causal Factors for Object Detection in Unseen Domains. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 8103–8112. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar] [CrossRef]
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar] [CrossRef]
- Li, C.; Zhang, B.; Li, L.; Li, L.; Geng, Y.; Cheng, M.; Xu, X.; Chu, X.; Wei, X. YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. In Proceedings of the 12th International Conference on Learning Representations, Vienna, Austria, 7–11 May 2024; OpenReview: New York, NY, USA, 2024. [Google Scholar]
- Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475. [Google Scholar]
- Wang, C.-Y.; Yeh, I.-H.; Mark Liao, H.-Y. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. In Proceedings of the 18th European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 1–21. [Google Scholar]
- Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J.; Ding, G. YOLOv10: Real-Time End-to-End Object Detection. In Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 9–15 December 2024; Curran Associates: Red Hook, NY, USA, 2024; pp. 107984–108011. [Google Scholar]
- Wan, D.; Lu, R.; Hu, B.; Yin, J.; Shen, S.; Xu, T.; Lang, X. YOLO-MIF: Improved YOLOv8 with Multi-Information Fusion for Object Detection in Gray-Scale Images. Adv. Eng. Inform. 2024, 62, 102709. [Google Scholar] [CrossRef]
- Tian, Y.; Ye, Q.; Doermann, D. YOLOv12: Attention-Centric Real-Time Object Detectors. arXiv 2025, arXiv:2502.12524. [Google Scholar] [CrossRef]
- Lei, M.; Li, S.; Wu, Y.; Hu, H.; Zhou, Y.; Zheng, X.; Ding, G.; Du, S.; Wu, Z.; Gao, Y. YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception. arXiv 2025, arXiv:2506.17733. [Google Scholar] [CrossRef]
| Method | Parameters (M) | FLOPs (G) | mAP@50 (%) | mAP@50:95 (%) | Inference-Time (ms) |
|---|---|---|---|---|---|
| YOLOv3-tiny [17] | 12.1 | 18.9 | 30.3 | 17.1 | 0.8 |
| YOLOv4-tiny [18] | 7.7 | 14.5 | 33.2 | 18.2 | 0.9 |
| YOLOv5n | 2.5 | 7.1 | 40.4 | 23.8 | 0.8 |
| YOLOv6n [19] | 4.2 | 11.8 | 37.0 | 21.3 | 1.0 |
| YOLOv7-tiny [20] | 8.4 | 21.9 | 44.1 | 26.4 | 1.2 |
| YOLOv8n | 3.0 | 8.1 | 42.3 | 25.1 | 0.8 |
| YOLOv9-tiny [21] | 2.0 | 7.6 | 43.4 | 25.4 | 1.1 |
| YOLOv10n [22] | 2.7 | 8.3 | 41.2 | 24.9 | 1.1 |
| YOLO-MIF-n [23] | 4.6 | 9.5 | 42.1 | 25.2 | 0.8 |
| YOLOv11n | 2.6 | 6.3 | 39.9 | 23.6 | 0.9 |
| YOLOv12n [24] | 2.5 | 6.3 | 40.6 | 24.2 | 1.1 |
| YOLOv13n [25] | 2.5 | 6.2 | 40.2 | 23.3 | 1.2 |
| YOLOv7-tiny-CR | 13.8 | 31.2 | 47.0 | 29.1 | 1.5 |

| Method | Parameters (M) | FLOPs (G) | mAP@50 (%) | mAP@50:95 (%) | Inference-Time (ms) |
|---|---|---|---|---|---|
| YOLOv7-tiny | 8.4 | 21.9 | 44.1 | 26.4 | 1.2 |
| YOLOv7-tiny+CAM | 13.4 | 25.9 | 44.5 | 26.5 | 1.4 |
| YOLOv7-tiny+CI | 8.8 | 27.1 | 46.6 | 28.9 | 1.4 |
| YOLOv7-tiny+CAM+CI | 13.8 | 31.2 | 47.0 | 29.1 | 1.5 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, H.; Sun, L. YOLOv7-tiny-CR: A Causal Intervention Framework for Infrared Small Target Detection with Feature Debiasing. Appl. Sci. 2025, 15, 13008. https://doi.org/10.3390/app152413008
