SF6 Leak Detection in Infrared Video via Multichannel Fusion and Spatiotemporal Features
Abstract
1. Introduction
- (1) Spatiotemporal feature joint modeling: Utilizing the P3D-CE backbone network to jointly extract temporal and spatial features from infrared videos, thereby improving the perception of dynamic and small-scale leakage targets (a minimal sketch of this factorized spatiotemporal convolution follows this list).
- (2) Multi-scale semantic fusion: Employing a Feature Pyramid Network (FPN) to fuse semantic information across multiple feature scales, thereby improving small-target detection performance in complex backgrounds.
- (3) Dynamic variation perception enhancement: Incorporating a temporal Transformer module to strengthen the model's capability to capture dynamic changes and moving targets associated with leakage events.
- (4) Real-time and high-precision detection: Achieving significant improvements over existing methods across diverse complex scenarios while maintaining computational efficiency, with advantages in both detection accuracy and false-negative suppression.
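The paper's full architecture is not reproduced here, but the core idea behind a P3D-style backbone, factorizing a 3D convolution into a spatial 2D convolution followed by a temporal 1D convolution, can be illustrated with a minimal PyTorch sketch. The channel widths, kernel sizes, and serial arrangement below are illustrative assumptions, not the exact configuration of P3D-CE.

```python
import torch
import torch.nn as nn

class PseudoConv3D(nn.Module):
    """Factorized spatiotemporal convolution (P3D-style sketch):
    a 1x3x3 spatial convolution followed by a 3x1x1 temporal convolution.
    Layer widths and kernel sizes here are illustrative only."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Spatial branch: convolves each frame independently (kernel 1x3x3).
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, 3, 3),
                                 padding=(0, 1, 1), bias=False)
        # Temporal branch: mixes information across frames (kernel 3x1x1).
        self.temporal = nn.Conv3d(out_ch, out_ch, kernel_size=(3, 1, 1),
                                  padding=(1, 0, 0), bias=False)
        self.bn = nn.BatchNorm3d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, height, width)
        x = self.spatial(x)
        x = self.temporal(x)
        return self.act(self.bn(x))

if __name__ == "__main__":
    # A clip of 8 infrared frames with 2 input channels
    # (e.g., raw frame plus foreground prior; channel count is an assumption).
    clip = torch.randn(1, 2, 8, 224, 224)
    block = PseudoConv3D(in_ch=2, out_ch=32)
    print(block(clip).shape)  # torch.Size([1, 32, 8, 224, 224])
```

The factorization keeps the temporal receptive field of a full 3D kernel while using far fewer parameters, which is why it suits real-time video detection.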
2. Related Works
2.1. Infrared Video Preprocessing
2.2. Object Detection Methods
2.3. Attention Mechanisms in Object Detection
2.4. Temporal Modeling in Detection Frameworks
3. Research Methodology
3.1. Image Preprocessing
3.1.1. Image Enhancement
- (1) Divide the input image into equally sized subregions.
- (2) Compute the histogram for each subregion. Assume that the pixel intensity values in a local region fall within the range $[0, G-1]$, where $G$ denotes the number of gray-scale levels. The local histogram $H(i)$ represents the frequency of intensity value $i$ and is calculated as $H(i) = \sum_{k=1}^{N} \delta(I_k, i)$, $i = 0, 1, \ldots, G-1$, where $N$ is the number of pixels in the region, $I_k$ is the gray value of the $k$-th pixel in the region, and $\delta(\cdot,\cdot)$ is the Kronecker delta function, which indicates whether the pixel value equals $i$. Next, histogram equalization is performed based on the computed histogram. First, the cumulative distribution function (CDF) is calculated as the cumulative sum of the histogram, $C(i) = \sum_{j=0}^{i} H(j)$.
- (3) CLAHE introduces a contrast-limiting factor, which limits the maximum height of the histogram in each block to avoid excessive contrast enhancement due to the high frequency of certain gray values. When the frequency of a gray value exceeds the set threshold $T$, $H(i)$ is directly truncated to $T$: $H_{\mathrm{clip}}(i) = \min\bigl(H(i), T\bigr)$.
- (4) During histogram equalization, CLAHE uses the modified (clipped) cumulative distribution function $C_{\mathrm{clip}}(i) = \sum_{j=0}^{i} H_{\mathrm{clip}}(j)$ to redistribute pixel values. For each pixel with intensity value $i$, the new equalized pixel value $i'$ is computed as $i' = \operatorname{round}\bigl(\tfrac{C_{\mathrm{clip}}(i)}{N}(G-1)\bigr)$.
- (5) Dividing the image into multiple blocks for local equalization may lead to unnatural transitions at block boundaries. To mitigate this issue, CLAHE applies bilinear interpolation to smooth the edges between adjacent blocks, thereby enhancing local contrast continuity and overall image naturalness (an illustrative code sketch of steps (2)–(5) follows this list).
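As a concrete illustration of steps (2)–(5), the sketch below first performs clipped histogram equalization on a single tile, following the equations above, and then applies OpenCV's built-in CLAHE, which also handles the bilinear blending between tiles. The tile size, clip limit, and uniform redistribution of clipped counts are illustrative choices, not the values or exact procedure used in the paper.

```python
import cv2
import numpy as np

def clipped_tile_equalization(tile: np.ndarray, clip_limit: float, G: int = 256) -> np.ndarray:
    """Clipped histogram equalization for one tile (steps (2)-(4) above).
    Counts above the clip limit are truncated and the excess is redistributed uniformly."""
    hist = np.bincount(tile.ravel(), minlength=G).astype(np.float64)  # H(i)
    excess = np.maximum(hist - clip_limit, 0.0).sum()
    hist = np.minimum(hist, clip_limit)        # truncate H(i) at the threshold T
    hist += excess / G                         # redistribute the clipped mass
    cdf = hist.cumsum()                        # clipped CDF, C_clip(i)
    lut = np.round(cdf / tile.size * (G - 1)).clip(0, G - 1).astype(np.uint8)
    return lut[tile]                           # i' = round(C_clip(i)/N * (G-1))

# Per-tile equalization on a synthetic low-contrast tile.
tile = np.random.randint(100, 140, (64, 64), dtype=np.uint8)
eq_tile = clipped_tile_equalization(tile, clip_limit=40)

# Full CLAHE, including the bilinear blending between tiles (step (5)), via OpenCV;
# the clip limit and tile grid here are illustrative parameters.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(tile)  # apply() expects an 8-bit single-channel image
```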
3.1.2. Motion Foreground Extraction
3.2. P3D-CE
3.2.1. Feature Fusion Module
3.2.2. Backbone
3.2.3. Neck
3.2.4. Head
4. Experiments
4.1. Dataset Introduction
4.2. Implementation Details
4.3. Evaluation Metrics
4.4. Algorithm Comparison
4.5. Ablation Experiments
4.5.1. Analysis of Model Components
4.5.2. Attention Mechanism Analysis
4.5.3. Analysis of Input Channels
4.6. Cross-Video Validation for Generalization
5. Conclusions
- (1) To address the insufficient detection accuracy of SF6 gas leakage in infrared video, this paper proposes VGEC-Net, a real-time detection model that integrates multi-channel inputs and temporal modeling. Built upon the P3D backbone, the model incorporates the CE-Net attention mechanism, an FPN structure, and a temporal Transformer module to efficiently model dynamic and subtle leakage patterns, enhancing the spatiotemporal representation capability in complex scenarios (a minimal sketch of assembling such a multi-channel clip input follows this list).
- (2) Experimental results on the self-constructed SF6 leakage detection dataset show that VGEC-Net yields higher mAP and mAP50 than existing methods such as YOLOv8s and 3DVSD. It also provides better performance in small-object detection (mAPs) and model compactness, achieving a good trade-off between detection accuracy and computational efficiency.
- (3) Further evaluations show that VGEC-Net maintains strong robustness and generalization ability when applied to infrared videos with blurred backgrounds and complex environmental interference. This approach offers promising insights and technical support for intelligent detection tasks involving other industrial gases or infrared targets.
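To make the "multi-channel input" in item (1) concrete, the sketch below stacks each raw infrared frame (INF) with a motion-foreground mask (FPI) into a two-channel clip tensor suitable for a 3D backbone. The clip length is an illustrative assumption, and OpenCV's MOG2 background subtractor is only a stand-in for the paper's own motion foreground extraction step (Section 3.1.2).

```python
import cv2
import numpy as np

def build_two_channel_clip(frames: list, clip_len: int = 8) -> np.ndarray:
    """Stack raw infrared frames (INF) with motion-foreground masks (FPI) into a
    clip tensor of shape (2, clip_len, H, W) with values in [0, 1].
    MOG2 is only a stand-in for the paper's foreground extraction method."""
    subtractor = cv2.createBackgroundSubtractorMOG2(history=50, detectShadows=False)
    masks = [subtractor.apply(f) for f in frames]                      # foreground prior per frame
    inf = np.stack([f.astype(np.float32) / 255.0 for f in frames[-clip_len:]])
    fpi = np.stack([m.astype(np.float32) / 255.0 for m in masks[-clip_len:]])
    return np.stack([inf, fpi])                                        # (2, T, H, W)

# Synthetic example; a real pipeline would feed consecutive grayscale IR video frames.
dummy = [np.random.randint(0, 256, (224, 224), dtype=np.uint8) for _ in range(16)]
clip = build_two_channel_clip(dummy)
print(clip.shape)  # (2, 8, 224, 224)
```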
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
- Xu, C.; Zhou, T.; Chen, X.; Li, X.; Kang, C. Estimating of sulfur hexafluoride gas emission from electric equipments. In Proceedings of the 2011 1st International Conference on Electric Power Equipment-Switching Technology, Xi'an, China, 23–27 October 2011; pp. 299–303.
- Yang, L.; Wang, S.; Chen, C.; Zhang, Q.; Sultana, R.; Han, Y. Monitoring and Leak Diagnostics of Sulfur Hexafluoride and Decomposition Gases from Power Equipment for the Reliability and Safety of Power Grid Operation. Appl. Sci. 2024, 14, 3844.
- Masson-Delmotte, V.; Zhai, P.; Pirani, A.; Connors, S.L.; Péan, C.; Berger, S.; Caud, N.; Chen, Y.; Goldfarb, L.; Zhou, B.; et al. Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK, 2021; Volume 2, 2391p.
- GOV.UK. How to Operate or Service Electrical Switchgear Containing SF6 [Guidance]. 2014. Available online: https://www.gov.uk/guidance/how-to-operate-or-service-high-voltage-switchgear-containing-sf6 (accessed on 15 May 2025).
- Lu, Q.; Li, Q.; Hu, L.; Huang, L. An effective low-contrast SF6 gas leakage detection method for infrared imaging. IEEE Trans. Instrum. Meas. 2021, 70, 5009009.
- Wang, Y.; Yao, Y.; Zhao, R.; Zhang, Z.; Jing, R. SF6 Research on the Key Technology of the Gas Integrated Online Monitoring System in the Fault Early Warning and Diagnosis of GIS Equipment. In Proceedings of the 2024 Boao New Power System International Forum-Power System and New Energy Technology Innovation Forum (NPSIF), Qionghai, China, 8–10 December 2024; pp. 163–169.
- Zheng, K.; Luo, W.; Duan, L.; Zhao, S.; Jiang, S.; Bao, H.; Ho, H.L.; Zheng, C.; Zhang, Y.; Ye, W.; et al. High sensitivity and stability cavity-enhanced photoacoustic spectroscopy with dual-locking scheme. Sens. Actuators B Chem. 2024, 415, 135984.
- Yun, Y.X.; Chen, W.G.; Sun, C.X.; Pang, C. Photoacoustic spectroscopy detection method for methane gas in transformer oil. Proc. Chin. Soc. Electr. Eng. 2008, 28, 40–46.
- Yang, Z.H.; Zhang, Y.K.; Chen, Y.; Li, X.F.; Jiang, Y.; Feng, Z.Z.; Deng, B.; Chen, C.-L.; Zhou, D.F. Simultaneous detection of multiple gaseous pollutants using multi-wavelength differential absorption LIDAR. Opt. Commun. 2022, 518, 128359.
- Shen, Y.; Shao, K.M.; Wu, J.; Huang, F.; Guo, Y. Research progress on gas optical detection technology and its application. Opto-Electron. Eng. 2020, 47, 3–18.
- Wu, H.; Chen, Y.; Lin, W.; Wang, F. Novel signal denoising approach for acoustic leak detection. J. Pipeline Syst. Eng. Pract. 2018, 9, 04018016.
- Wu, S.Q.; Shen, B.; Xiong, G.; Xu, W. Detection and analysis of photoacoustic signals radiated by gas plasma. Laser Infrared 2017, 47, 428–431.
- Li, Y.; Zhang, Y.; Geng, A.; Cao, L.; Chen, J. Infrared image enhancement based on atmospheric scattering model and histogram equalization. Opt. Laser Technol. 2016, 83, 99–107.
- Zhang, F.; Hu, H.; Wang, Y. Infrared image enhancement based on adaptive non-local filter and local contrast. Optik 2023, 292, 171407.
- Zhang, C.J.; Fu, M.Y.; Jin, M. Resistible noise approach of infrared image contrast enhancement. Infrared Laser Eng. 2004, 33, 50–54.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Liu, Y.; Zhou, T.; Xu, J.; Hong, Y.; Pu, Q.; Wen, X. Rotating target detection method of concrete bridge crack based on YOLO v5. Appl. Sci. 2023, 13, 11118.
- Xu, S.; Wang, X.; Sun, Q.; Dong, K. MWIRGas-YOLO: Gas leakage detection based on mid-wave infrared imaging. Sensors 2024, 24, 4345.
- Xu, C.J.; Wang, X.F.; Yang, Y.D. Attention-YOLO: YOLO detection algorithm with attention mechanism. Comput. Eng. Appl. 2019, 55, 13–23.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
- Yang, S.; Chen, Z.; Ma, X.; Zong, X.; Feng, Z. Real-time high-precision pedestrian tracking: A detection–tracking–correction strategy based on improved SSD and Cascade R-CNN. J. Real-Time Image Process. 2022, 19, 287–302.
- Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
- Zhang, R.M.; Jia, Z.N.; Li, J.X.; Wu, L.; Xu, X.; Yuan, B. Improved EfficientDet remote sensing object detection algorithm based on multi-receptive field feature enhancement. Electron. Opt. Control 2024, 31, 53–60+96.
- Liu, B.; Zhao, W.; Sun, Q. Study of object detection based on Faster R-CNN. In Proceedings of the 2017 Chinese Automation Congress (CAC), Jinan, China, 20–22 October 2017; pp. 6233–6236.
- Ren, Z.J.; Lin, S.Z.; Li, D.W.; Wang, L.; Zuo, J. Mask R-CNN object detection method based on improved feature pyramid. Laser Optoelectron. Prog. 2019, 56, 174–179.
- Sahin, M.E.; Ulutas, H.; Yuce, E.; Erkoc, M.F. Detection and classification of COVID-19 by using faster R-CNN and mask R-CNN on CT images. Neural Comput. Appl. 2023, 35, 13597–13611.
- Huo, Y.; Zhang, Q.; Zhang, Y.; Zhu, J.; Wang, J. 3DVSD: An end-to-end 3D convolutional object detection network for video smoke detection. Fire Saf. J. 2022, 134, 103690.
- Chen, Y.; Cao, Y.; Hu, H.; Wang, L. Memory enhanced global-local aggregation for video object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10337–10346.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Li, X.; Wang, W.; Hu, X.; Yang, J. Selective kernel networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 510–519.
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542.
- Cao, Y.; Xu, J.; Lin, S.; Wei, F.; Hu, H. Global context networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 45, 6881–6895.
- Hu, J.; Wang, H.; Wang, J.; Wang, Y.; He, F.; Zhang, J. SA-Net: A scale-attention network for medical image segmentation. PLoS ONE 2021, 16, e0247388.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Fu, G.D.; Huang, J.; Yang, T.; Zheng, S. Lightweight attention model with improved CBAM. Comput. Eng. Appl. 2021, 57, 150–156.
- Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual attention network for scene segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3146–3154.
- Park, S.; Kim, J.; Kim, J.; Wang, S. Fault Diagnosis of Air Handling Units in an Auditorium Using Real Operational Labeled Data across Different Operation Modes. J. Comput. Civ. Eng. 2025, 39, 04025065.
- Wang, S. Automated non-PPE detection on construction sites using YOLOv10 and transformer architectures for surveillance and body worn cameras with benchmark datasets. Sci. Rep. 2025, 15, 27043.
| Split | INF | FPI |
|---|---|---|
| Training Set | 4992 | 4992 |
| Validation Set | 704 | 704 |
| Test Set | 704 | 704 |
| Method | Detection | Input Size | mAP (%) | mAP50 (%) | mAPs (%) | FAR (%) | MAR (%) | Params (M) | FPS |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv5s | Online | 224 × 224 | 37.1 ± 0.2 | 65.3 ± 0.2 | 15.7 ± 0.3 | 8.7 ± 0.2 | 28.1 ± 0.2 | 8.7 | 147.8 |
| YOLOv5n | Online | 224 × 224 | 40.3 ± 0.4 | 60.8 ± 0.2 | 11.9 ± 0.3 | 9.4 ± 0.2 | 9.4 ± 0.3 | 2.9 | 137.5 |
| YOLOv8n | Online | 224 × 224 | 39.8 ± 0.3 | 67.4 ± 0.2 | 9.7 ± 0.2 | 8.4 ± 0.1 | 8.4 ± 0.2 | 4.2 | 90.6 |
| YOLOv8s | Online | 224 × 224 | 43.9 ± 0.2 | 69.9 ± 0.3 | 7.3 ± 0.3 | 7.8 ± 0.1 | 26.7 ± 0.2 | 14.3 | 79.4 |
| 3DVSD | Online | 224 × 224 | 50.7 ± 0.2 | 81.6 ± 0.1 | 25.9 ± 0.3 | 5.8 ± 0.1 | 22.4 ± 0.1 | 55.9 | 59.7 |
| Faster-RCNN | Online | 224 × 224 | 37.4 ± 0.5 | 70.9 ± 0.5 | 19.4 ± 0.4 | 7.5 ± 0.3 | 26.7 ± 0.2 | 45.6 | 18.2 |
| MEGA | Offline | 224 × 224 | 47.4 ± 0.3 | 78.4 ± 0.2 | 28.8 ± 0.3 | 6.3 ± 0.1 | 23.6 ± 0.2 | 59.8 | 13.4 |
| VGEC-Net | Online | 224 × 224 | 59.8 ± 0.1 | 89.7 ± 0.3 | 39.7 ± 0.3 | 4.1 ± 0.1 | 18.2 ± 0.2 | 27.5 | 77.4 |
| Method | mAP (%) | mAP50 (%) | FAR | MAR | FPS |
|---|---|---|---|---|---|
| YOLOv5s | 34.7 | 63.8 | 0.031 | 0.689 | 153.7 |
| YOLOv5n | 38.6 | 56.4 | 0.083 | 0.391 | 137.9 |
| YOLOv8n | 46.7 | 67.9 | 0.095 | 0.310 | 90.2 |
| YOLOv8s | 47.1 | 68.4 | 0.058 | 0.388 | 87.4 |
| 3DVSD | 53.7 | 80.7 | 0.051 | 0.235 | 57.9 |
| Faster-RCNN | 30.9 | 65.5 | 0.313 | 0.303 | 20.8 |
| MEGA | 46.3 | 78.6 | 0.041 | 0.211 | 12.8 |
| VGEC-Net | 61.7 | 87.3 | 0.035 | 0.158 | 78.2 |
| Model | mAP (%) | mAP50 (%) | mAPs (%) | Params (M) |
|---|---|---|---|---|
| P3D | 22.6 | 54.7 | 13.4 | 18.8 |
| P3D + FPN | 43.7 | 65.7 | 32.1 | 20.1 |
| P3D + FPN + Transformer | 45.3 | 67.3 | 32.3 | 20.8 |
| Attention Mechanism | mAP (%) | mAP50 (%) | mAPs (%) | Params (K) |
|---|---|---|---|---|
| ECA-Net | 54.4 | 79.3 | 35.8 | 3.4 |
| CBAM | 58.7 | 85.9 | 37.9 | 8.0 |
| CE-Net | 59.8 | 89.7 | 39.7 | 9.6 |
| INF | FPI | mAP (%) | mAP50 (%) | mAPs (%) |
|---|---|---|---|---|
| √ |  | 13.5 | 40.9 | 9.3 |
|  | √ | 46.3 | 67.3 | 33.3 |
| √ | √ | 47.6 | 69.3 | 34.1 |
| Fold (Held-Out Videos) | mAP (%) | mAP50 (%) | mAPs (%) | FAR (%) | MAR (%) |
|---|---|---|---|---|---|
| F1 (V1–V4) | 65.4 | 94.1 | 43.7 | 2.7 | 6.2 |
| F2 (V5–V8) | 61.5 | 89.4 | 40.3 | 3.6 | 9.1 |
| F3 (V9–V13) | 54.0 | 81.4 | 36.4 | 4.8 | 13.5 |
| Mean ± std | 60.3 ± 4.7 | 88.3 ± 5.2 | 40.1 ± 3.0 | 3.7 ± 0.9 | 9.6 ± 3.0 |
