Wildfire Smoke Detection Enhanced by Image Augmentation with StyleGAN2-ADA for YOLOv8 and RT-DETR Models
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Source
- The Fire Ignition Images Library (FIgLib): Provided by the High-Performance Wireless Research and Education Network (HPWREN), FIgLib contains wildfire imagery collected from 2014 to 2023 by fixed cameras across Southern California, a region known for its fire risk [43]. To prepare these data for model training and validation, we manually annotated the FIgLib images with bounding boxes using LabelMe [44], an open-source annotation tool (see the conversion sketch after this list).
- The Nevada Smoke Detection Benchmark (Nemo) dataset [38]: This dataset includes 2859 images from nearly 900 cameras across eight western U.S. states (Nevada, California, Oregon, Washington, Utah, Idaho, Colorado, and Montana). Available on GitHub [45], Nemo contains bounding box labels from AlertWildfire imagery. Although the Nemo dataset was pre-labeled, we comprehensively re-evaluated the annotations through visual inspection.
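To make the annotation step concrete, the sketch below converts LabelMe rectangle annotations into YOLO-format label files. This is a minimal illustration rather than the authors' actual pipeline; the folder names (figlib_json/, labels/) and the single smoke class ID are assumptions.

```python
import json
from pathlib import Path

def labelme_to_yolo(json_path: Path, class_id: int = 0) -> list[str]:
    """Convert one LabelMe JSON file (rectangle shapes) to YOLO label lines."""
    data = json.loads(json_path.read_text())
    w, h = data["imageWidth"], data["imageHeight"]
    lines = []
    for shape in data["shapes"]:
        if shape["shape_type"] != "rectangle":
            continue  # only bounding boxes are relevant here
        (x1, y1), (x2, y2) = shape["points"]
        # YOLO format: class x_center y_center width height (all normalized)
        xc, yc = (x1 + x2) / 2 / w, (y1 + y2) / 2 / h
        bw, bh = abs(x2 - x1) / w, abs(y2 - y1) / h
        lines.append(f"{class_id} {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}")
    return lines

# Hypothetical folder layout: one label file per annotated image.
Path("labels").mkdir(exist_ok=True)
for jp in Path("figlib_json").glob("*.json"):
    Path("labels", jp.stem + ".txt").write_text("\n".join(labelme_to_yolo(jp)))
```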
2.2. Data Augmentation
2.2.1. Basic Transforms Augmentation
2.2.2. Augmentation with StyleGAN2-ADA
2.2.3. Overall Dataset
2.3. Wildfire Smoke Detection Models
2.3.1. YOLOv8
2.3.2. RT-DETR
2.4. Model Training
2.5. Model Evaluation
3. Results
3.1. Evaluation Results with OD
3.2. Model Comparison Based on Data Augmentation Methods
- BT: Unexpectedly, the OD + BT dataset yielded lower performance metrics than the OD alone for YOLOv8X. This result can be attributed to the nature of basic transformations, which only modify existing images: although intended to increase data diversity, they essentially duplicate existing wildfire smoke instances with variations, which can overrepresent certain smoke patterns or scenes. Such overrepresentation can promote overfitting, where the model becomes too specialized in these specific augmented patterns at the expense of generalization to novel, unseen data. The performance decrease suggests that, for some models, particularly complex ones such as YOLOv8X, the additional augmented data may skew learning toward features that do not transfer to the validation set. This highlights the importance of carefully balancing data augmentation techniques so that they enhance robustness without compromising generalization to new, unseen wildfire smoke scenarios.
- SG: The OD + SG dataset configuration demonstrated significant improvements, particularly in mAP@0.5:0.95 and detection rate. mAP@0.5 was the highest and identical for the OD and OD + SG datasets (0.962). This suggests that the synthetic images generated by StyleGAN2-ADA effectively expanded the dataset’s diversity, helping the model generalize better to various smoke appearances and backgrounds.
- Combined Augmentation (BT + SG): The OD + BT + SG configuration showed mixed results. While it achieved the best mAP@0.75 (0.900) and detection rate (0.991), it unexpectedly showed slight decreases in mAP@0.5 (0.956 vs. 0.962) and mAP@0.5:0.95 (0.579 vs. 0.580) compared with the OD. These contrasting results across metrics suggest a complex interaction between data augmentation techniques and model performance: combining BT and SG improves certain aspects of detection (mAP@0.75 and detection rate) while slightly compromising others (mAP@0.5 and mAP@0.5:0.95).
- BT: The OD + BT dataset showed nearly identical mAP@0.5 and detection rate compared with the OD, but significantly increased mAP@0.75 and mAP@0.5:0.95. This suggests that the RT-DETR-X architecture is more robust to the types of variations introduced by basic transformations.
- SG: The OD + SG configuration improved all metrics except mAP@0.75 relative to the OD + BT. While mAP@0.75 in the OD + SG dataset was marginally lower (by 0.001) than in the OD + BT, mAP@0.5 increased by 0.018 to 0.924. This underscores the effectiveness of GAN-generated images in enhancing the model’s overall performance, particularly in detecting small or ambiguous smoke instances.
- Combined Augmentation (BT + SG): The OD + BT + SG dataset achieved the highest accuracy across all metrics, with notable improvements over the OD: mAP@0.5 improved by 0.026, mAP@0.75 by 0.030, mAP@0.5:0.95 by 0.076, and the detection rate by 0.012, corresponding to 16 additional detected scenes. This synergistic effect suggests that combining the augmentation techniques provides a rich and diverse dataset that aligns well with RT-DETR-X’s architectural strengths.
4. Discussion
4.1. Comparative Analysis of YOLOv8 and RT-DETR Performance
4.2. Early Detection Capability
4.3. Evaluation Results Based on the Object Size
4.4. Model Ensemble Based on Object Size
4.5. Detection Performance on Challenging Conditions
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
No | Video Name | FRCNN [30] | DETR-sc [38] | YOLOv8X (Ours) | RT-DETR-X (Ours) |
---|---|---|---|---|---|
1 | 20190813_FIRE_69bravo-e-mobo-c | 5 | 1 | 1 | 2 |
2 | 20190813_Topanga_69bravo-n-mobo | 3 | 0 | 1 | 2 |
3 | 20190610_Pauma_bh-w-mobo-c | 9 | 4 | 1 | 1 |
4 | 20191001_FIRE_bh-w-mobo-c | No Detection | 5 | 3 | 3 |
5 | 20190829_FIRE_bl-n-mobo-c | 4 | 3 | 0 | 0 |
6 | 20190716_FIRE_bl-s-mobo-c | 8 | 3 | 2 | 3 |
7 | 20190924_FIRE_bl-s-mobo-c | 7 | 5 | 0 | 2 |
8 | 20191005_FIRE_bm-e-mobo-c | No Detection | 1 | 0 | 1 |
9 | 20190629_FIRE_hp-n-mobo-c | No Detection | 2 | 0 | 1 |
10 | 20190716_Meadowfire_hp-n-mobo-c | 12 | 6 | 4 | 5 |
11 | 20190924_FIRE_hp-s-mobo-c | No Detection | 10 | 1 | 3 |
12 | 20191005_FIRE_hp-s-mobo-c | No Detection | 4 | 12 | 0 |
13 | 20191006_FIRE_lo-s-mobo-c | 13 | 3 | 1 | 0 |
14 | 20190924_FIRE_lo-w-mobo-c | 9 | 3 | 1 | 0 |
15 | 20191006_FIRE_lo-w-mobo-c | 2 | 1 | 0 | 0 |
16 | 20191006_FIRE_lp-e-mobo-c | No Detection | 1 | 0 | 0 |
17 | 20191006_FIRE_lp-e-mobo-c | No Detection | 29 | 1 | 0 |
18 | 20190717_FIRE_lp-n-mobo-c | 2 | 1 | 1 | 13 |
19 | 20190728_Dehesa_lp-n-mobo | No Detection | 6 | 6 | 1 |
20 | 20190924_FIRE_lp-n-mobo-c | 4 | 1 | 0 | 8 |
21 | 20191006_FIRE_lp-n-mobo-c | 4 | 1 | 2 | 0 |
22 | 20190814_Border_lp-s-mobo | 7 | 5 | 1 | 4 |
23 | 20191001_FIRE_lp-s-mobo-c | 6 | 2 | 0 | 4 |
24 | 20191006_FIRE_lp-s-mobo-c | 4 | 1 | 0 | 1 |
25 | 20191007_FIRE_lp-s-mobo-c | No Detection | 3 | 2 | 0 |
26 | 20190716_FIRE_mg-n-mobo-c | 3 | 2 | 0 | 2 |
27 | 20190922_FIRE_ml-w-mobo-c | 3 | 1 | 4 | 0 |
28 | 20190924_FIRE_ml-w-mobo-c | 4 | 3 | 1 | 9 |
29 | 20191006_FIRE_ml-w-mobo-c | 7 | 4 | 1 | 3 |
30 | 20190712_FIRE_om-e-mobo-c | 2 | 5 | No Detection | 2 |
31 | 20190814_FIRE_om-e-mobo-c | 30 | 5 | 3 | 9 |
32 | 20191001_FIRE_om-e-mobo-c | 5 | 1 | 4 | 5 |
33 | 20190728_FIRE_om-n-mobo-c | No Detection | 3 | 0 | 9 |
34 | 20191006_FIRE_om-n-mobo-c | 13 | 2 | 0 | 2 |
35 | 20190930_FIRE_om-s-mobo-c | 3 | 1 | 1 | 0 |
36 | 20191003_FIRE_om-s-mobo-c | No Detection | 2 | 5 | 2 |
37 | 20191007_FIRE_om-s-mobo-c | No Detection | 2 | 1 | 5 |
38 | 20190801_Caliente_om-w-mobo | No Detection | No Detection | 2 | 3 |
39 | 20190829_FIRE_pi-e-mobo-c | No Detection | No Detection | 3 | 3 |
40 | 20190814_FIRE-pi-s-mobo-c | 5 | 3 | 0 | 1 |
41 | 20190826_FIRE_pi-s-mobo-c | 31 | 1 | 0 | 3 |
42 | 20191006_FIRE_pi-s-mobo-c | 4 | 1 | 0 | 2 |
43 | 20190717_FIRE_pi-w-mobo-c | No Detection | 7 | 3 | 0 |
44 | 20190924_FIRE_pi-w-mobo-c | 6 | 2 | 1 | 1 |
45 | 20190620_FIRE_rm-w-mobo-c | 3 | 7 | 1 | 2 |
46 | 20190826_FIRE_rm-w-mobo-c | No Detection | 3 | 2 | 1 |
47 | 20190829_FIRE_rm-w-mobo-c | No Detection | 8 | 6 | 9 |
48 | 20191001_FIRE_rm-w-mobo-c | No Detection | 2 | 1 | 3 |
49 | 20191003_FIRE_rm-w-mobo-c | 2 | 1 | 0 | 0 |
50 | 20190924_FIRE_sm-n-mobo-c | 8 | 1 | 0 | 0 |
51 | 20191007_FIRE_sm-s-mobo-c | 7 | 1 | 2 | 1 |
52 | 20190825_FIRE_sm-w-mobo-c | 7 | 2 | 2 | 2 |
53 | 20190825_FIRE-smer-tcs8-mobo-c | No Detection | 17 | 0 | 0 |
54 | 20190829_FIRE_smer-tcs8-mobo-c | 3 | 2 | 1 | 1 |
55 | 20190620_FIRE_smer-tcs9-mobo-c | 18 | 3 | 0 | 0 |
56 | 20191001_FIRE_smer-tcs9-mobo-c | No Detection | 4 | 0 | 9 |
57 | 20191003_FIRE_smer-tcs9-mobo-c | No Detection | 2 | 0 | 0 |
58 | 20190716_FIRE_so-w-mobo-c | 0 | 0 | 0 | 0 |
59 | 20190827_FIRE_so-w-mobo-c | No Detection | 1 | 0 | 0 |
60 | 20190805_FIRE_sp-e-mobo-c | No Detection | 6 | 1 | 0 |
61 | 20190728_FIRE_sp-n-mobo-c | No Detection | 8 | 0 | 0 |
62 | 20191005_FIRE_vo-n-mobo-c | 1 | 1 | 1 | 0 |
63 | 20190924_FIRE_wc-e-mobo-c | 12 | 2 | 1 | 4 |
64 | 20190925_FIRE_wc-e-mobo-c | 7 | 5 | 4 | 4 |
65 | 20191005_FIRE_wc-e-mobo-c | No Detection | 1 | 1 | 0 |
66 | 20191005_FIRE_wc-n-mobo-c | No Detection | 13 | 2 | 1 |
67 | 20190924_FIRE_wc-s-mobo-c | 10 | 8 | 2 | 7 |
68 | 20190925_FIRE_wc-s-mobo-c | No Detection | 7 | 4 | 4 |
 | Mean ± standard deviation | 7.15 ± 6.55 | 3.80 ± 4.41 | 1.49 ± 2.00 | 2.44 ± 2.91
 | Median | 5 | 2.5 | 1 | 2
References
1. Heilman, W.E.; Liu, Y.; Urbanski, S.; Kovalev, V.; Mickler, R. Wildland fire emissions, carbon, and climate: Plume rise, atmospheric transport, and chemistry processes. For. Ecol. Manag. 2014, 317, 70–79.
2. Higuera, P.E.; Abatzoglou, J.T. Record-setting climate enabled the extraordinary 2020 fire season in the western United States. Glob. Change Biol. 2021, 27, 1.
3. Goss, M.; Swain, D.L.; Abatzoglou, J.T.; Sarhadi, A.; Kolden, C.A.; Williams, A.P.; Diffenbaugh, N.S. Climate change is increasing the likelihood of extreme autumn wildfire conditions across California. Environ. Res. Lett. 2020, 15, 094016.
4. Xu, R.; Yu, P.; Abramson, M.J.; Johnston, F.H.; Samet, J.M.; Bell, M.L.; Haines, A.; Ebi, K.L.; Li, S.; Guo, Y. Wildfires, global climate change, and human health. N. Engl. J. Med. 2020, 383, 2173–2181.
5. Bowman, D.M.; Kolden, C.A.; Abatzoglou, J.T.; Johnston, F.H.; van der Werf, G.R.; Flannigan, M. Vegetation fires in the Anthropocene. Nat. Rev. Earth Environ. 2020, 1, 500–515.
6. Chakrabarty, R.K.; Shetty, N.J.; Thind, A.S.; Beeler, P.; Sumlin, B.J.; Zhang, C.C.; Liu, P.; Idrobo, J.C.; Adachi, K.; Wagner, N.L.; et al. Shortwave absorption by wildfire smoke dominated by dark brown carbon. Nat. Geosci. 2023, 16, 683–688.
7. Szpakowski, D.M.; Jensen, J.L. A review of the applications of remote sensing in fire ecology. Remote Sens. 2019, 11, 2638.
8. Jain, P.; Coogan, S.C.; Subramanian, S.G.; Crowley, M.; Taylor, S.; Flannigan, M.D. A review of machine learning applications in wildfire science and management. Environ. Rev. 2020, 28, 478–505.
9. Lee, D.-H.; Yoo, J.-W.; Lee, K.-H.; Kim, Y. A Real Time Flame and Smoke Detection Algorithm Based on Conditional Test in YCbCr Color Model and Adaptive Differential Image. J. Korea Soc. Comput. Inf. 2010, 15, 57–65.
10. Yan, F.; Xu, X.; Han, N. Identification method of forest fire based on color space. In Proceedings of the 2nd International Conference on Industrial Mechatronics and Automation, Wuhan, China, 30–31 May 2010; pp. 448–451.
11. Chunyu, Y.; Jun, F.; Jinjun, W.; Yongming, Z. Video fire smoke detection using motion and color features. Fire Technol. 2010, 46, 651–663.
12. Chmelar, P.; Benkrid, A. Efficiency of HSV over RGB Gaussian Mixture Model for fire detection. In Proceedings of the 2014 24th International Conference Radioelektronika, Bratislava, Slovakia, 15–16 April 2014; pp. 1–4.
13. Chen, X.J.; Dong, F. Recognition and segmentation for fire smoke based on HSV. In Proceedings of the 2015 Second International Conference on Computer, Intelligent and Education Technology (CICET 2015), Guilin, China, 11–12 April 2015; p. 349.
14. Toreyin, B.U.; Dedeoglu, Y.; Cetin, A.E. Contour based smoke detection in video using wavelets. In Proceedings of the 2006 14th European Signal Processing Conference, Florence, Italy, 4–8 September 2006; pp. 1–5.
15. Poobalan, K.; Liew, S.-C. Fire detection algorithm using image processing techniques. In Proceedings of the 3rd International Conference on Artificial Intelligence and Computer Science (AICS2015), Penang, Malaysia, 12–13 October 2015; pp. 160–168.
16. Li, X.; Wang, J.; Song, W.; Ma, J.; Telesca, L.; Zhang, Y. Automatic smoke detection in MODIS satellite data based on k-means clustering and Fisher linear discrimination. Photogramm. Eng. Remote Sens. 2014, 80, 971–982.
17. He, H.; Peng, L.; Yang, D.; Chen, X. Smoke detection based on a semi-supervised clustering model. In Proceedings of the MultiMedia Modeling: 20th Anniversary International Conference (MMM 2014), Part II, Dublin, Ireland, 6–10 January 2014; pp. 291–298.
18. Khatami, A.; Mirghasemi, S.; Khosravi, A.; Nahavandi, S. A new color space based on k-medoids clustering for fire detection. In Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics, Hong Kong, China, 9–12 October 2015; pp. 2755–2760.
19. Ajith, M.; Martínez-Ramón, M. Unsupervised segmentation of fire and smoke from infra-red videos. IEEE Access 2019, 7, 182381–182394.
20. Wu, X.; Lu, X.; Leung, H. A video based fire smoke detection using robust AdaBoost. Sensors 2018, 18, 3780.
21. Ko, B.; Kwak, J.-Y.; Nam, J.-Y. Wildfire smoke detection using temporospatial features and random forest classifiers. Opt. Eng. 2012, 51, 017208.
22. Xiong, D.; Yan, L. Early smoke detection of forest fires based on SVM image segmentation. J. For. Sci. 2019, 65, 150–159.
23. Zhao, J.; Zhang, Z.; Han, S.; Qu, C.; Yuan, Z.; Zhang, D. SVM based forest fire detection using static and dynamic features. Comput. Sci. Inf. Syst. 2011, 8, 821–841.
24. Jeong, Y.; Kim, S.; Kim, S.-Y.; Yu, J.-A.; Lee, D.-W.; Lee, Y. Detection of Wildfire Smoke Plumes Using GEMS Images and Machine Learning. Korean J. Remote Sens. 2022, 38, 967–977.
25. Qiao, Y.M.; Jiang, W.Y.; Wang, F.; Su, G.F.; Li, X.; Jiang, J.C. FireFormer: An efficient Transformer to identify forest fire from surveillance cameras. Int. J. Wildland Fire 2023, 32, 1364–1380.
26. Xu, G.; Zhang, Y.; Zhang, Q.; Lin, G.; Wang, J. Deep domain adaptation based video smoke detection using synthetic smoke images. Fire Saf. J. 2017, 93, 53–59.
27. Zhang, Q.-X.; Lin, G.-H.; Zhang, Y.-M.; Xu, G.; Wang, J.-J. Wildland forest fire smoke detection based on faster R-CNN using synthetic smoke images. Procedia Eng. 2018, 211, 441–446.
28. Barmpoutis, P.; Dimitropoulos, K.; Kaza, K.; Grammalidis, N. Fire detection from images using faster R-CNN and multidimensional texture analysis. In Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 8301–8305.
29. Chaoxia, C.; Shang, W.; Zhang, F. Information-guided flame detection based on faster R-CNN. IEEE Access 2020, 8, 58923–58932.
30. Guede-Fernández, F.; Martins, L.; de Almeida, R.V.; Gamboa, H.; Vieira, P. A deep learning based object identification system for forest fire detection. Fire 2021, 4, 75.
31. Pan, J.; Ou, X.; Xu, L. A collaborative region detection and grading framework for forest fire smoke using weakly supervised fine segmentation and lightweight faster-RCNN. Forests 2021, 12, 768.
32. Mukhiddinov, M.; Abdusalomov, A.B.; Cho, J. Automatic fire detection and notification system based on improved YOLOv4 for the blind and visually impaired. Sensors 2022, 22, 3307.
33. Wu, Z.; Xue, R.; Li, H. Real-time video fire detection via modified YOLOv5 network model. Fire Technol. 2022, 58, 2377–2403.
34. Huo, Y.; Zhang, Q.; Jia, Y.; Liu, D.; Guan, J.; Lin, G.; Zhang, Y. A deep separable convolutional neural network for multiscale image-based smoke detection. Fire Technol. 2022, 58, 1445–1468.
35. Liu, H.; Hu, H.; Zhou, F.; Yuan, H. Forest flame detection in unmanned aerial vehicle imagery based on YOLOv5. Fire 2023, 6, 279.
36. Chen, X.; Xue, Y.; Hou, Q.; Fu, Y.; Zhu, Y. RepVGG-YOLOv7: A modified YOLOv7 for fire smoke detection. Fire 2023, 6, 383.
37. Kristiani, E.; Chen, Y.-C.; Yang, C.-T.; Li, C.-H. Flame and smoke recognition on smart edge using deep learning. J. Supercomput. 2023, 79, 5552–5575.
38. Yazdi, A.; Qin, H.Y.; Jordan, C.B.; Yang, L.; Yan, F. Nemo: An Open-Source Transformer-Supercharged Benchmark for Fine-Grained Wildfire Smoke Detection. Remote Sens. 2022, 14, 3979.
39. Li, Y.M.; Zhang, W.; Liu, Y.Y.; Jing, R.D.; Liu, C.S. An efficient fire and smoke detection algorithm based on an end-to-end structured network. Eng. Appl. Artif. Intell. 2022, 116, 105492.
40. Huang, J.W.; Zhou, J.S.; Yang, H.Z.; Liu, Y.F.; Liu, H. A Small-Target Forest Fire Smoke Detection Model Based on Deformable Transformer for End-to-End Object Detection. Forests 2023, 14, 162.
41. Wang, X.Z.; Li, M.Y.; Gao, M.K.; Liu, Q.Y.; Li, Z.N.; Kou, L.Y. Early smoke and flame detection based on transformer. J. Saf. Sci. Resil. 2023, 4, 294–304.
42. Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. DETRs beat YOLOs on real-time object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–24 June 2024; pp. 16965–16974.
43. The HPWREN Fire Ignition Images Library for Neural Network Training. Available online: https://hpwren.ucsd.edu/FIgLib (accessed on 10 February 2023).
44. Labelme: Image Polygonal Annotation with Python. Available online: https://github.com/wkentaro/labelme (accessed on 20 October 2022).
45. Nemo. Available online: https://github.com/SayBender/Nemo (accessed on 13 March 2023).
46. Kang, L.-W.; Wang, I.-S.; Chou, K.-L.; Chen, S.-Y.; Chang, C.-Y. Image-based real-time fire detection using deep learning with data augmentation for vision-based surveillance applications. In Proceedings of the 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Taipei, Taiwan, 18–21 September 2019; pp. 1–4.
47. Shi, X.; Lu, N.; Cui, Z. Smoke detection based on dark channel and convolutional neural networks. In Proceedings of the 2019 5th International Conference on Big Data and Information Analytics (BigDIA), Kunming, China, 8–10 July 2019; pp. 23–28.
48. Zheng, X.; Chen, F.; Lou, L.; Cheng, P.; Huang, Y. Real-time detection of full-scale forest fire smoke based on deep convolution neural network. Remote Sens. 2022, 14, 536.
49. Zhang, J.; Ke, S. Improved YOLOX fire scenario detection method. Wirel. Commun. Mob. Comput. 2022, 2022, 9666265.
50. Buslaev, A.; Iglovikov, V.I.; Khvedchenya, E.; Parinov, A.; Druzhinin, M.; Kalinin, A.A. Albumentations: Fast and flexible image augmentations. Information 2020, 11, 125.
51. Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8110–8119.
52. Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–19 June 2019; pp. 4401–4410.
53. Karras, T.; Aittala, M.; Hellsten, J.; Laine, S.; Lehtinen, J.; Aila, T. Training generative adversarial networks with limited data. Adv. Neural Inf. Process. Syst. 2020, 33, 12104–12114.
54. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
55. Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. arXiv 2017, arXiv:1706.08500.
56. Kim, D.; Lai, C.-H.; Liao, W.-H.; Murata, N.; Takida, Y.; Uesaka, T.; He, Y.; Mitsufuji, Y.; Ermon, S. Consistency trajectory models: Learning probability flow ODE trajectory of diffusion. arXiv 2023, arXiv:2310.02279.
57. Sadat, S.; Buhmann, J.; Bradley, D.; Hilliges, O.; Weber, R.M. CADS: Unleashing the diversity of diffusion models through condition-annealed sampling. arXiv 2023, arXiv:2310.17347.
58. Lee, D.-H. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In Proceedings of the Workshop on Challenges in Representation Learning, ICML, Atlanta, GA, USA, 16–21 June 2013; p. 896.
59. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
60. YOLO by Ultralytics. Available online: https://github.com/ultralytics/ultralytics (accessed on 2 May 2024).
61. Ultralytics YOLOv8 Docs. Available online: https://docs.ultralytics.com (accessed on 2 May 2024).
62. Loshchilov, I.; Hutter, F. SGDR: Stochastic gradient descent with warm restarts. arXiv 2016, arXiv:1608.03983.
63. He, T.; Zhang, Z.; Zhang, H.; Zhang, Z.; Xie, J.; Li, M. Bag of tricks for image classification with convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–19 June 2019; pp. 558–567.
64. Xu, R.J.; Lin, H.F.; Lu, K.J.; Cao, L.; Liu, Y.F. A Forest Fire Detection System Based on Ensemble Learning. Forests 2021, 12, 217.
65. Yildiran, N. Real-time verification of solar-powered forest fire detection system using ensemble learning. Expert Syst. Appl. 2024, 255, 124791.
Condition | Train | Validation |
---|---|---|
Hazy | 470 | 52 |
Lens flare | 110 | 9 |
Noise on lens | 70 | 4 |
Raindrops on lens | 19 | 1 |
Night | 10 | 1 |
Total | 679 (10.97%) | 67 (10.17%) |
Size | Metric (Square Pixels) | Number of Objects (Train) | Number of Objects (Validation)
---|---|---|---
Small | Area ≤ 32² | 504 | 56
Medium | 32² < Area < 96² | 1972 | 223
Large | 96² ≤ Area | 3581 | 390
Total | | 6057 | 669
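These thresholds follow the COCO small/medium/large convention (32² = 1024 and 96² = 9216 square pixels). A minimal helper for binning ground-truth boxes this way, assuming box dimensions are given in pixels:

```python
def size_bin(box_w_px: float, box_h_px: float) -> str:
    """COCO-style size bin from a box's pixel dimensions (see table above)."""
    area = box_w_px * box_h_px
    if area <= 32 ** 2:      # Area <= 1024 px^2
        return "small"
    if area < 96 ** 2:       # 1024 < Area < 9216 px^2
        return "medium"
    return "large"           # Area >= 9216 px^2

assert size_bin(30, 30) == "small"
assert size_bin(50, 50) == "medium"
assert size_bin(100, 100) == "large"
```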
Methods | Limits | Probability |
---|---|---|
Horizontal Flip | - | 0.3 |
Random Brightness Contrast | −0.07, 0.17 | 0.5 |
Blur | 20 | 0.1 |
Median Blur | 21 | 0.1 |
To Gray | - | 0.03 |
CLAHE 1 | - | 0.1
RGB Shift | R = 5, G = 5, B = 5 | 0.5
1 CLAHE: Contrast Limited Adaptive Histogram Equalization.
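The table corresponds directly to transforms available in Albumentations [50]. Below is a possible pipeline under that assumption; the table does not state how the Random Brightness Contrast limits (−0.07, 0.17) split between brightness and contrast, so applying them to both is a guess, and the bounding-box bookkeeping is illustrative.

```python
import albumentations as A

# A sketch of the basic-transforms pipeline from the table above.
basic_transforms = A.Compose(
    [
        A.HorizontalFlip(p=0.3),
        A.RandomBrightnessContrast(
            brightness_limit=(-0.07, 0.17), contrast_limit=(-0.07, 0.17), p=0.5
        ),
        A.Blur(blur_limit=20, p=0.1),
        A.MedianBlur(blur_limit=21, p=0.1),  # kernel size must be odd
        A.ToGray(p=0.03),
        A.CLAHE(p=0.1),
        A.RGBShift(r_shift_limit=5, g_shift_limit=5, b_shift_limit=5, p=0.5),
    ],
    # keep YOLO-format bounding boxes aligned with the transformed image
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

# Usage: augmented = basic_transforms(image=img, bboxes=boxes, class_labels=labels)
```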
Methods | Explanation |
---|---|
Xflip | Horizontal flipping of the image |
Rotate90 | 90-degree rotation of the image |
Xint | Shifting the position by whole pixel values |
Scale | Resizing or scaling the image |
Rotate | Rotation of the image to a specific angle |
ANISO 1 | Resizing the image unequally in the X and Y dimensions
Xfrac | Shifting the position by fractional pixel values
Brightness | Adjustment of the image’s brightness
Contrast | Adjustment of the image’s contrast
Lumaflip | Flipping of the image’s luma (brightness) channel
Hue | Adjustment of the image’s hue (color tone)
Saturation | Adjustment of the image’s saturation (color intensity)
1 ANISO: anisotropic scaling.
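Note that these operations form the adaptive discriminator augmentation (ADA) pipeline applied during GAN training [53]; they are not applied to the generated images themselves. Once a generator has been trained on smoke imagery, synthetic images can be sampled from the pickled network following the pattern used in the official stylegan2-ada-pytorch repository. The checkpoint name below is hypothetical, and a CUDA device is assumed.

```python
import pickle

import numpy as np
import torch
from PIL import Image

# Load a trained StyleGAN2-ADA generator (hypothetical checkpoint name).
with open("smoke-stylegan2-ada.pkl", "rb") as f:
    G = pickle.load(f)["G_ema"].cuda()  # torch.nn.Module

for seed in range(8):  # sample a few synthetic smoke images
    z = torch.from_numpy(np.random.RandomState(seed).randn(1, G.z_dim)).cuda()
    img = G(z, None)  # NCHW float32 in [-1, 1]; no class label (unconditional)
    img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
    Image.fromarray(img[0].cpu().numpy()).save(f"synthetic_smoke_{seed:04d}.png")
```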
Dataset | Train | Validation |
---|---|---|
OD 1 | 6190 | 659
OD + BT 2 | 12,121 | 659
OD + SG 3 | 12,073 | 659
OD + BT + SG | 18,004 | 659
1 OD: original dataset; 2 BT: basic transforms augmentation; 3 SG: StyleGAN2-ADA augmentation.
Hyperparameter | YOLOv8 | RT-DETR |
---|---|---|
Input size | 1280 | 1280 |
Epochs | 300 | 300 |
Batch size | 4 | 2 |
Optimizer | SGD | AdamW |
Initial learning rate | 0.01 | 0.0001 |
Final learning rate | 0.01 | 0.01 |
Momentum | 0.937 | 0.937
Weight decay | 0.0005 | 0.0001 |
Warmup epoch | 3 | 3 |
Warmup momentum | 0.8 | 0.8 |
Warmup bias learning rate | 0.1 | 0.1 |
Probability of mosaic | 1.0 | 1.0 |
Close mosaic | 10 | - |
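A minimal sketch of how these settings map onto the Ultralytics training API [60,61]. The dataset YAML name is hypothetical, and "Final learning rate" is interpreted as the Ultralytics lrf fraction; treat this as an illustration of the table, not the authors' exact script.

```python
from ultralytics import YOLO, RTDETR

DATA = "wildfire_smoke.yaml"  # hypothetical dataset config

# YOLOv8X with the settings from the table above.
YOLO("yolov8x.pt").train(
    data=DATA, imgsz=1280, epochs=300, batch=4,
    optimizer="SGD", lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005,
    warmup_epochs=3, warmup_momentum=0.8, warmup_bias_lr=0.1,
    mosaic=1.0, close_mosaic=10,
)

# RT-DETR-X with its column of the table.
RTDETR("rtdetr-x.pt").train(
    data=DATA, imgsz=1280, epochs=300, batch=2,
    optimizer="AdamW", lr0=0.0001, lrf=0.01, momentum=0.937, weight_decay=0.0001,
    warmup_epochs=3, warmup_momentum=0.8, warmup_bias_lr=0.1,
    mosaic=1.0,
)
```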
Model | mAP@0.5 | mAP@0.75 | mAP@0.5:0.95 | Detection Rate | FPS
---|---|---|---|---|---
YOLOv8S | 0.942 | 0.893 | 0.570 | 0.961 | 143 |
YOLOv8M | 0.951 | 0.896 | 0.577 | 0.974 | 59 |
YOLOv8L | 0.951 | 0.903 | 0.578 | 0.959 | 43 |
YOLOv8X | 0.962 | 0.891 | 0.580 | 0.967 | 25 |
RT-DETR-L | 0.931 | 0.848 | 0.511 | 0.973 | 63 |
RT-DETR-X | 0.908 | 0.821 | 0.452 | 0.983 | 46 |
Model | Dataset | mAP@0.5 | mAP@0.75 | mAP@0.5:0.95 | Detection Rate
---|---|---|---|---|---
YOLOv8X | OD 1 | 0.962 | 0.891 | 0.580 | 0.967
 | OD + BT 2 | 0.926 | 0.859 | 0.545 | 0.954
 | OD + SG 3 | 0.962 | 0.892 | 0.581 | 0.988
 | OD + BT + SG | 0.956 | 0.900 | 0.579 | 0.991
RT-DETR-X | OD | 0.908 | 0.821 | 0.452 | 0.983
 | OD + BT | 0.906 | 0.848 | 0.519 | 0.982
 | OD + SG | 0.924 | 0.847 | 0.522 | 0.986
 | OD + BT + SG | 0.934 | 0.851 | 0.528 | 0.995
1 OD: original dataset; 2 BT: basic transforms; 3 SG: StyleGAN2-ADA.
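For reference, the mAP columns above correspond to standard fields returned by an Ultralytics validation run; the sketch below shows how they can be read back (weights path and data YAML are hypothetical). The detection rate is the authors' own scene-level metric and is computed separately from these fields.

```python
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")  # hypothetical weights path
metrics = model.val(data="wildfire_smoke.yaml", imgsz=1280)
print(f"mAP@0.5      = {metrics.box.map50:.3f}")
print(f"mAP@0.75     = {metrics.box.map75:.3f}")
print(f"mAP@0.5:0.95 = {metrics.box.map:.3f}")
```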
Time elapsed (min) from ignition until first detection:

No | Video Name | Ignition Time | FRCNN [30] | DETR-dg [38] | DETR-sc [38] | YOLOv8X (Ours) | RT-DETR-X (Ours)
---|---|---|---|---|---|---|---
1 | 20160722-FIRE-mw-e-mobo-c | 14:32 | 5 | 10 | 11 | 2 | 0 |
2 | 20170520-FIRE-lp-s-iqeye | 11:19 | 2 | 0 | 0 | 0 | 0 |
3 | 20170625-BBM-bm-n-mobo | 11:46 | 21 | 9 | 8 | 7 | 7 |
4 | 20170708-Whittier-syp-n-mobo-c | 13:37 | 5 | 3 | 3 | 3 | 3 |
5 | 20170722-FIRE-so-s-mobo-c | 15:07 | 13 | 2 | 2 | 0 | 2 |
6 | 20180504-FIRE-smer-tcs8-mobo-c | 14:33 | 9 | 8 | 8 | 5 | 8 |
7 | 20180504-FIRE-smer-tcs10-mobo-c | 15:10 | 3 | 4 | 1 | 0 | 0 |
8 | 20180809-FIRE-mg-w-mobo-c | 13:10 | 2 | 0 | 0 | 0 | 0 |
9 | 20190529-94Fire-lp-s-mobo-c | 15:03 | 3 | 1 | 1 | 1 | 2 |
10 | 20190610-FIRE-bh-w-mobo-c | 13:22 | 5 | 3 | 4 | 2 | 2 |
11 | 20190716-FIRE-bl-s-mobo-c | 12:41 | 18 | 4 | 3 | 0 | 3 |
12 | 20190924-FIRE-sm-n-mobo-c | 14:57 | 7 | 3 | 1 | 0 | 0 |
13 | 20200611_skyline_lp-n-mobo-c | 11:36 | 4 | 3 | 3 | 3 | 5 |
14 | 20200806_SpringsFire_lp-w-mobo-c | 18:33 | 1 | 1 | 1 | 1 | 1 |
15 | 20200822_BrattonFire_lp-e-mobo-c | 12:56 | 5 | 2 | 2 | 0 | 1 |
16 | 20200905_ValleyFire_lp-n-mobo-c | 14:28 | 3 | 3 | 2 | 2 | 2 |
 | + 68 Scenes [Table A1] | | | | | |
 | Mean ± std (1–16) | | 6.63 ± 5.85 | 3.5 ± 3.01 | 3.13 ± 3.18 | 1.63 ± 2.06 | 2.25 ± 2.49
 | Median (1–16) | | 5 | 3 | 2 | 1 | 2
 | Mean ± std (all videos) | | 7.00 ± 6.31 | - | 3.67 ± 4.19 | 1.52 ± 2.00 | 2.40 ± 2.82
 | Median (all videos) | | 5 | - | 2 | 1 | 2
Number of detected objects and object detection rate by object size:

Model | Detected: Small | Detected: Medium | Detected: Large | Rate: Small | Rate: Medium | Rate: Large | Rate: Average (Micro)
---|---|---|---|---|---|---|---
YOLOv8S | 42 | 214 | 370 | 0.750 | 0.960 | 0.949 | 0.894 |
YOLOv8M | 44 | 218 | 373 | 0.786 | 0.978 | 0.956 | 0.897 |
YOLOv8L | 44 | 210 | 369 | 0.786 | 0.942 | 0.946 | 0.892 |
YOLOv8X | 43 | 214 | 374 | 0.768 | 0.960 | 0.959 | 0.906 |
RT-DETR-L | 49 | 212 | 368 | 0.875 | 0.951 | 0.944 | 0.903 |
RT-DETR-X | 50 | 216 | 373 | 0.893 | 0.969 | 0.956 | 0.901 |
Model | Dataset | False Positive | False Negative
---|---|---|---
YOLOv8X | OD 1 | 1 | 4
 | OD + BT 2 | 1 | 4
 | OD + SG 3 | 1 | 2
 | OD + BT + SG | 1 | 2
RT-DETR-X | OD | 3 | 4
 | OD + BT | 5 | 2
 | OD + SG | 4 | 2
 | OD + BT + SG | 3 | 1
1 OD: original dataset; 2 BT: basic transforms; 3 SG: StyleGAN2-ADA.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).