Search Results (321)

Search Parameters:
Keywords = fire image dataset

18 pages, 11243 KB  
Article
TCSN-YOLO: A Small-Target Object Detection Method for Fire Smoke
by Cao Yang, Zhou Jun, Wen Hongyuan and Wang Gang
Fire 2025, 8(12), 466; https://doi.org/10.3390/fire8120466 - 29 Nov 2025
Viewed by 502
Abstract
Forest fires continue to pose a significant threat to public and personal safety. Detecting smoke in its early stages or when it is distant from the camera is challenging because it appears in only a small region of the captured images. This paper proposes a small-scale smoke detection algorithm called TCSN-YOLO to address these challenges. First, it introduces a novel feature fusion module called trident fusion (TF), which is innovatively designed and incorporated into the neck of the model. TF significantly enhances small-target smoke recognition. Additionally, to obtain global contextual information with high computational efficiency, we propose a Cross Attention Mechanism (CAM). CAM captures diverse smoke features by assigning attention weights in both horizontal and vertical directions. Furthermore, we suggest using SoftPool to preserve more detailed information in the feature map. The Normalized Wasserstein Distance (NWD) metric is embedded into the loss function of our detector to distinguish positive and negative samples under the same threshold. Finally, we evaluate the proposed model on the AI For Humankind and FlgLib datasets. The experimental results demonstrate that our method achieves 37.1% APs, 90.3% AP50, 40.4% AP50:95, 45.34 M Params and 170.5 G FLOPs. Full article
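The NWD metric mentioned in this abstract has a standard closed form when each axis-aligned box (cx, cy, w, h) is modeled as a 2D Gaussian. The sketch below assumes that common formulation and an illustrative normalizing constant C; how the paper actually folds NWD into its detector loss may differ.

```python
import numpy as np

def normalized_wasserstein_distance(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    Each box is modeled as a 2D Gaussian N([cx, cy], diag(w^2/4, h^2/4)),
    following the common NWD formulation for tiny-object detection.
    `c` is a dataset-dependent normalizing constant (assumed value here).
    """
    cx_a, cy_a, w_a, h_a = box_a
    cx_b, cy_b, w_b, h_b = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_squared = (
        (cx_a - cx_b) ** 2
        + (cy_a - cy_b) ** 2
        + ((w_a - w_b) / 2.0) ** 2
        + ((h_a - h_b) / 2.0) ** 2
    )
    return float(np.exp(-np.sqrt(w2_squared) / c))

# Two small, nearly overlapping smoke boxes score close to 1.0.
print(normalized_wasserstein_distance((50, 50, 8, 8), (52, 51, 9, 8)))
```

Unlike IoU, this score stays informative even when tiny predicted and ground-truth boxes barely overlap, which is why it helps label assignment for small smoke targets.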

22 pages, 1171 KB  
Article
Feature Extraction and Comparative Analysis of Firing Pin, Breech Face, and Annulus Impressions from Ballistic Cartridge Images
by Sangita Baruah, R. Suresh, Rajesh Babu Govindarajulu, Chandan Jyoti Kumar, Bibhakar Chanda, Lakshya Dugar and Manob Jyoti Saikia
Forensic Sci. 2025, 5(4), 62; https://doi.org/10.3390/forensicsci5040062 - 12 Nov 2025
Viewed by 683
Abstract
Background/Objectives: Toolmark analysis on cartridge cases offers critical insights in forensic ballistics, as the impressions left on cartridge cases by firearm components—such as the firing pin, breech face, and annulus—carry distinctive patterns and act as unique identifiers that can be used for firearm linkage. This study aims to develop a systematic and interpretable feature extraction pipeline for these regions to support future automation and comparison studies in forensic cartridge case analysis. Methods: A dataset of 20 high-resolution cartridge case images was prepared, and each region of interest (firing pin impression, breech face, and annulus) was manually annotated using the LabelMe tool. ImageJ and Python-based scripts were employed for feature extraction, capturing geometric descriptors (area, perimeter, circularity, and eccentricity) and texture-based features (Local Binary Patterns and Haralick statistics). In total, 61 quantitative features were derived from the annotated regions. Similarity between cartridge cases was evaluated using Euclidean distance metrics after normalization. Results: The extracted and calibrated region-wise geometric and texture features demonstrated distinct variation patterns across firing pin, breech face, and annulus regions. Pairwise similarity analysis revealed measurable intra-class differences, indicating the discriminative potential of the extracted features even within cartridges likely fired from the same firearm. Conclusions: This study provides a foundational, region-wise quantitative framework for analysing cartridge case impressions. The extracted dataset and similarity outcomes establish a baseline for subsequent research on firearm identification and model-based classification in forensic ballistics. Full article
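As an illustration of the region-wise pipeline this abstract describes, the sketch below computes a few geometric descriptors and a uniform LBP histogram for one masked region, then compares cartridge cases with Euclidean distances after normalization. It is only a minimal stand-in: the study's full 61-feature set, calibration, and tooling (LabelMe, ImageJ, Haralick statistics) are not reproduced.

```python
import numpy as np
from scipy.spatial.distance import cdist
from skimage.feature import local_binary_pattern
from skimage.measure import label, regionprops

def region_features(gray, mask):
    """Geometric + LBP texture features for one annotated region (e.g. firing pin)."""
    props = regionprops(label(mask.astype(np.uint8)))[0]  # assumes one component
    area, perimeter = props.area, props.perimeter
    circularity = 4.0 * np.pi * area / (perimeter ** 2 + 1e-9)
    # Uniform LBP histogram restricted to the masked pixels (10 bins for P=8).
    lbp = local_binary_pattern(gray, P=8, R=1.0, method="uniform")
    hist, _ = np.histogram(lbp[mask > 0], bins=np.arange(0, 11), density=True)
    return np.concatenate([[area, perimeter, circularity, props.eccentricity], hist])

def pairwise_distances(feature_matrix):
    """Min-max normalize each feature over all cases, then Euclidean distances."""
    fmin, fmax = feature_matrix.min(axis=0), feature_matrix.max(axis=0)
    normed = (feature_matrix - fmin) / (fmax - fmin + 1e-9)
    return cdist(normed, normed, metric="euclidean")
```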

15 pages, 2139 KB  
Proceeding Paper
Fire Detection Using CCTV Images with 1-Dimensional Convolutional Neural Network Based on GUI
by Muhammad A. P. Putra, Neny Rosmawarni, Muhammad Adrezo, Nunik D. Arianti and Mustika Sari
Eng. Proc. 2025, 107(1), 134; https://doi.org/10.3390/engproc2025107134 - 10 Nov 2025
Viewed by 237
Abstract
Fire is a phenomenon that causes physical and material losses to humans. Fires are difficult to predict in terms of cause and location, so early detection is necessary to reduce their impact. Given these issues, this research aims to detect fires from CCTV images. So far, there has been no research on fire detection from CCTV images using a 1D CNN. Fire detection is carried out by building a 1D convolutional neural network model on a pre-processed dataset of CCTV fire images. An interface for inputting data is created with the Tkinter library, providing a graphical user interface (GUI). The resulting 1D convolutional neural network model reaches 88.43% in terms of accuracy, precision, and recall. However, the model's understanding of the actual input data is still limited for detecting fires from CCTV images, and further processing of the CCTV image data is required. Full article
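The abstract does not specify the network layout, so the following is only a hypothetical sketch of applying a 1D CNN to flattened, pre-processed CCTV frames for binary fire / non-fire classification; the input length, layer sizes, and the reported 88.43% figure are not taken from the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical setup: each pre-processed CCTV frame is flattened to a
# fixed-length 1D signal with one channel before entering the network.
INPUT_LENGTH = 64 * 64

model = models.Sequential([
    layers.Input(shape=(INPUT_LENGTH, 1)),
    layers.Conv1D(16, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=4),
    layers.Conv1D(32, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=4),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # fire vs. non-fire
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy", tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
)
model.summary()
```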

5297 KB  
Proceeding Paper
Forest Fire Monitoring from Unmanned Aerial Vehicles Using Deep Learning
by Christophe Graveline and Pierre Payeur
Eng. Proc. 2025, 118(1), 66; https://doi.org/10.3390/ECSA-12-26597 - 7 Nov 2025
Abstract
Forest fires pose a serious threat to the environment with the potential of causing ecological harm, financial losses, and human casualties. While research suggests that climate change will increase the frequency and severity of these fires, recent developments in deep learning and convolutional neural networks (CNN) have greatly enhanced fire detection techniques and capability. These models can be leveraged by unmanned aerial vehicles (UAVs) to automatically monitor burning areas. However, drones can carry only limited computational and power resources; therefore, on-board computing capabilities are constrained by hardware limitations. This work focuses on the design of segmentation models to identify and localize active burning areas from aerial RGB images processed on limited computing resources. To achieve this goal, the research compares the performance of different variants of the DeepLabv3 neural network model for fire segmentation when trained and tested with the FLAME dataset using a k-fold cross validation approach. Experimental results are compared with U-Net, a benchmark model used with the FLAME dataset, by implementing this model in the same codebase as the DeepLabv3 model. This work demonstrates that a refined version of DeepLabv3, with a MobileNetv2 backbone using pretrained layers and a simplified atrous spatial pyramid pooling (ASPP) module, yields a similar performance to U-Net, with a precision of 87.8% and a recall of 83.2%, while only requiring 20% of the number of parameters involved with the U-Net topology. This significantly reduces memory and power consumption, enabling longer UAV flight duration and reducing the processing overhead associated with sensor input, making it more suitable for deployment on unmanned aerial vehicles. The model’s compact architecture, implemented using TensorFlow and Keras for model design and training, along with OpenCV for image preprocessing, makes it portable and easy to integrate with edge devices such as NVIDIA Jetson boards. Full article
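Torchvision does not ship a DeepLabv3 head on a MobileNetV2 backbone, so the sketch below uses the closest off-the-shelf variant (MobileNetV3-Large) purely as a stand-in for the paper's refined MobileNetV2 model with a simplified ASPP module, configured here for binary fire / background segmentation.

```python
import torch
from torchvision.models.segmentation import deeplabv3_mobilenet_v3_large

# Stand-in model: 2 output classes (background, fire); the paper's actual
# network uses a MobileNetV2 backbone with a simplified ASPP module.
model = deeplabv3_mobilenet_v3_large(weights=None, num_classes=2).eval()

with torch.no_grad():
    frame = torch.rand(1, 3, 512, 512)        # one aerial RGB frame
    logits = model(frame)["out"]              # (1, 2, 512, 512)
    fire_mask = logits.argmax(dim=1)          # per-pixel class prediction
print(fire_mask.shape)
```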

27 pages, 4587 KB  
Article
Detecting Burned Vegetation Areas by Merging Spectral and Texture Features in a ResNet Deep Learning Architecture
by Jiahui Fan, Yunjun Yao, Yajie Li, Xueyi Zhang, Jiquan Chen, Joshua B. Fisher, Xiaotong Zhang, Bo Jiang, Lu Liu, Zijing Xie, Luna Zhang and Fei Qiu
Remote Sens. 2025, 17(22), 3665; https://doi.org/10.3390/rs17223665 - 7 Nov 2025
Viewed by 593
Abstract
Timely and accurate detection of burned areas is crucial for assessing fire damage and contributing to ecosystem recovery efforts. In this study, we propose a framework for detecting fire-affected vegetation anomalies on the basis of a ResNet deep learning (DL) algorithm by merging spectral and textural features (ResNet-IST) and the vegetation abnormal spectral texture index (VASTI). To train the ResNet-IST, a vegetation anomaly dataset was constructed from high-resolution 30 m fire-affected remote sensing images selected from the Global Fire Atlas (GFA) to extract the spectral and textural features. We tested the model to detect fire-affected vegetation in ten study areas across four continents. The experimental results demonstrated that the ResNet-IST outperformed the VASTI by approximately 3% in anomaly detection accuracy and achieved a 5–15% improvement over detection based on the normalized burn ratio (NBR). Furthermore, the accuracy of the VASTI was significantly greater than that of NBR for burn detection, indicating that the merging of spectral and textural features provides complementary advantages, leading to stronger classification performance than the use of spectral features alone. Our results suggest that deep learning outperforms traditional mathematical models in burned vegetation anomaly detection tasks. Nevertheless, the scope and applicability of this study are somewhat limited, which also provides directions for future research. Full article
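For reference, the normalized burn ratio (NBR) that the deep model is compared against is the standard spectral index computed from near-infrared and shortwave-infrared reflectance; a minimal sketch with hypothetical band arrays is shown below, including the commonly used pre/post-fire difference (dNBR), which the abstract does not itself mention.

```python
import numpy as np

def normalized_burn_ratio(nir, swir):
    """Standard NBR = (NIR - SWIR) / (NIR + SWIR), computed per pixel."""
    nir = nir.astype(np.float64)
    swir = swir.astype(np.float64)
    return (nir - swir) / (nir + swir + 1e-9)

def delta_nbr(nir_pre, swir_pre, nir_post, swir_post):
    """dNBR: pre-fire NBR minus post-fire NBR; larger values indicate burn severity."""
    return normalized_burn_ratio(nir_pre, swir_pre) - normalized_burn_ratio(nir_post, swir_post)
```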

22 pages, 6682 KB  
Article
Multimodal Fire Salient Object Detection for Unregistered Data in Real-World Scenarios
by Ning Sun, Jianmeng Zhou, Kai Hu, Chen Wei, Zihao Wang and Lipeng Song
Fire 2025, 8(11), 415; https://doi.org/10.3390/fire8110415 - 26 Oct 2025
Viewed by 1193
Abstract
In real-world fire scenarios, complex lighting conditions and smoke interference significantly challenge the accuracy and robustness of traditional fire detection systems. Fusion of complementary modalities, such as visible light (RGB) and infrared (IR), is essential to enhance detection robustness. However, spatial shifts and geometric distortions occur in multi-modal image pairs collected by multi-source sensors due to installation deviations and inconsistent intrinsic parameters. Existing multi-modal fire detection frameworks typically depend on pre-registered data, an approach that struggles to handle modal misalignment in practical deployment. To overcome this limitation, we propose an end-to-end multi-modal Fire Salient Object Detection framework capable of dynamically fusing cross-modal features without pre-registration. Specifically, the Channel Cross-enhancement Module (CCM) facilitates semantic interaction across modalities in salient regions, suppressing noise from spatial misalignment. The Deformable Alignment Module (DAM) achieves adaptive correction of geometric deviations through cascaded deformation compensation and dynamic offset learning. For validation, we constructed an unregistered indoor fire dataset (Indoor-Fire) covering common fire scenarios. Generalizability was further evaluated on an outdoor dataset (RGB-T Wildfire). To fully validate the effectiveness of the method in complex building fire scenarios, we also conducted experiments using the Fire in historic buildings dataset. Experimental results demonstrate that the F1-score reaches 83% on both datasets, with the IoU maintained above 70%. Notably, while maintaining high accuracy, the number of parameters (91.91 M) is only 28.1% of the second-best SACNet (327 M). This method provides a robust solution for unaligned or weakly aligned modal fusion caused by sensor differences and is highly suitable for deployment in intelligent firefighting systems. Full article

16 pages, 6847 KB  
Article
Edge-Based Autonomous Fire and Smoke Detection Using MobileNetV2
by Dilshod Sharobiddinov, Hafeez Ur Rehman Siddiqui, Adil Ali Saleem, Gerardo Mendez Mezquita, Debora Libertad Ramírez Vargas and Isabel de la Torre Díez
Sensors 2025, 25(20), 6419; https://doi.org/10.3390/s25206419 - 17 Oct 2025
Cited by 1 | Viewed by 945
Abstract
Forest fires pose significant threats to ecosystems, human life, and the global climate, necessitating rapid and reliable detection systems. Traditional fire detection approaches, including sensor networks, satellite monitoring, and centralized image analysis, often suffer from delayed response, high false positives, and limited deployment in remote areas. Recent deep learning-based methods offer high classification accuracy but are typically computationally intensive and unsuitable for low-power, real-time edge devices. This study presents an autonomous, edge-based forest fire and smoke detection system using a lightweight MobileNetV2 convolutional neural network. The model is trained on a balanced dataset of fire, smoke, and non-fire images and optimized for deployment on resource-constrained edge devices. The system performs near real-time inference, achieving a test accuracy of 97.98% with an average end-to-end prediction latency of 0.77 s per frame (approximately 1.3 FPS) on the Raspberry Pi 5 edge device. Predictions include the class label, confidence score, and timestamp, all generated locally without reliance on cloud connectivity, thereby enhancing security and robustness against potential cyber threats. Experimental results demonstrate that the proposed solution maintains high predictive performance comparable to state-of-the-art methods while providing efficient, offline operation suitable for real-world environmental monitoring and early wildfire mitigation. This approach enables cost-effective, scalable deployment in remote forest regions, combining accuracy, speed, and autonomous edge processing for timely fire and smoke detection. Full article
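A minimal sketch of the kind of lightweight classifier described (fire / smoke / non-fire, running locally on an edge device) is given below, using torchvision's MobileNetV2 with its classification head swapped for three classes; the paper's actual training framework, input pipeline, and Raspberry Pi deployment details are not reproduced, and the label set is an assumption.

```python
import time
import torch
from torch import nn
from torchvision import models, transforms

CLASSES = ["fire", "smoke", "non_fire"]          # assumed label set

model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
model.classifier[1] = nn.Linear(model.last_channel, len(CLASSES))  # new 3-class head
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def predict(pil_image):
    """Local, offline inference on one frame with a simple latency measurement."""
    x = preprocess(pil_image).unsqueeze(0)
    start = time.time()
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)[0]
    idx = int(probs.argmax())
    return {"label": CLASSES[idx], "confidence": float(probs[idx]),
            "timestamp": time.time(), "latency_s": time.time() - start}
```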

20 pages, 1991 KB  
Article
EcoWild: Reinforcement Learning for Energy-Aware Wildfire Detection in Remote Environments
by Nuriye Yildirim, Mingcong Cao, Minwoo Yun, Jaehyun Park and Umit Y. Ogras
Sensors 2025, 25(19), 6011; https://doi.org/10.3390/s25196011 - 30 Sep 2025
Viewed by 754
Abstract
Early wildfire detection in remote areas remains a critical challenge due to limited connectivity, intermittent solar energy, and the need for autonomous, long-term operation. Existing systems often rely on fixed sensing schedules or cloud connectivity, making them impractical for energy-constrained deployments. We introduce EcoWild, a reinforcement learning-driven cyber-physical system for energy-adaptive wildfire detection on solar-powered edge devices. EcoWild combines a decision tree-based fire risk estimator, lightweight on-device smoke detection, and a reinforcement learning agent that dynamically adjusts sensing and communication strategies based on battery levels, solar input, and estimated fire risk. The system models realistic solar harvesting, battery dynamics, and communication costs to ensure sustainable operation on embedded platforms. We evaluate EcoWild using real-world solar, weather, and fire image datasets in a high-fidelity simulation environment. Results show that EcoWild consistently maintains responsiveness while avoiding battery depletion under diverse conditions. Compared to static baselines, it achieves 2.4× to 7.7× faster detection, maintains moderate energy consumption, and avoids system failure due to battery depletion across 125 deployment scenarios. Full article
(This article belongs to the Section Intelligent Sensors)
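EcoWild's actual agent, reward, and simulator are not described in detail here, so the sketch below is only an illustrative tabular Q-learning loop over a toy state (discretized battery level × estimated fire risk) choosing between sleeping, sensing, and sensing-plus-transmitting; it conveys the general energy-aware duty-cycling idea rather than the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
N_BATTERY, N_RISK = 5, 3                      # discretized battery and fire-risk levels
ACTIONS = ["sleep", "sense", "sense_and_transmit"]
q_table = np.zeros((N_BATTERY, N_RISK, len(ACTIONS)))
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

def step(battery, risk, action):
    """Toy environment: energy cost per action, reward for sensing when risk is high."""
    cost = [0, 1, 2][action]
    reward = risk * [0.0, 1.0, 1.5][action] - 0.5 * cost
    if battery - cost < 0:                    # hard penalty for depleting the battery
        reward -= 10.0
    new_battery = int(np.clip(battery - cost + rng.integers(0, 2), 0, N_BATTERY - 1))
    new_risk = int(rng.integers(0, N_RISK))   # exogenous weather-driven risk
    return new_battery, new_risk, reward

battery, risk = N_BATTERY - 1, 0
for _ in range(20000):
    if rng.random() < EPSILON:
        action = int(rng.integers(len(ACTIONS)))
    else:
        action = int(q_table[battery, risk].argmax())
    nb, nr, reward = step(battery, risk, action)
    # Standard Q-learning update.
    q_table[battery, risk, action] += ALPHA * (
        reward + GAMMA * q_table[nb, nr].max() - q_table[battery, risk, action]
    )
    battery, risk = nb, nr

print(q_table.argmax(axis=2))                 # learned action per (battery, risk) state
```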

27 pages, 51271 KB  
Article
Surface Damage Detection and Analysis for Reduction-Fired Cyan Square Bricks in Jiangnan Gardens via YOLOv12
by Lina Yan, Yile Chen, Xingkang Jia and Liang Zheng
Coatings 2025, 15(9), 1066; https://doi.org/10.3390/coatings15091066 - 11 Sep 2025
Viewed by 925
Abstract
As an outstanding UNESCO World Heritage Site, the Jiangnan gardens feature both exquisite and fragile components. Reduction-fired cyan square bricks, serving as crucial paving materials, are long-term exposed to natural and anthropogenic factors, making them prone to various types of surface damage and urgently requiring efficient, non-destructive detection methods to support scientific conservation. Traditional manual inspection methods suffer from low efficiency, strong subjectivity, and potential disturbance to the fragile heritage structures. This study focuses on developing an intelligent detection method based on advanced computer vision, employing the YOLOv12 object detection model to achieve non-contact, automated identification of typical tile surface damage types in the Jiangnan gardens (such as cracking, stains, water stains, and wear). A total of 691 images of reduction-fired cyan square bricks collected on-site were used as training samples. The main conclusions of this study are as follows: (1) By constructing a dataset containing multiple samples and multiple scenes of reduction-fired cyan square brick images in Jiangnan gardens, the YOLOv12 model was trained and optimized, enabling it to accurately identify subtle damage features under complex texture backgrounds. (2) Overall indicators: Through the comparison of the confusion matrices of the four key training nodes, model C (the 159th epoch, highest mAP50–95) has the most balanced overall performance in multiple categories, with an accuracy of 0.73 for cracking, 0.77 for wear, 0.60 for water stain, and 0.65 for stains, which can meet basic detection requirements. (3) Difficulty of discrimination: Compared with stains and water stains, cracking and wear are easier to distinguish. Experimental results indicate that the detection method is feasible and effective in identifying the surface damage types of reduction-fired cyan square bricks in Jiangnan gardens. This research provides a practical and efficient “surface technology” solution for the preventive protection of cultural heritage, contributing to the sustainable preservation and management of world heritage. Full article
(This article belongs to the Special Issue Solid Surfaces, Defects and Detection, 2nd Edition)
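If reproduced with the Ultralytics toolkit (which the abstract does not explicitly name), training and running a YOLOv12 detector on an annotated brick-damage dataset would look roughly like the sketch below; the weights name assumes a recent Ultralytics release that ships YOLO12, and the dataset YAML, model scale, image path, and hyperparameters are placeholders, not the study's configuration.

```python
from ultralytics import YOLO

# Hypothetical reproduction sketch; paths and hyperparameters are placeholders.
model = YOLO("yolo12n.pt")                    # small YOLOv12 variant
model.train(
    data="brick_damage.yaml",                 # classes: cracking, stains, water_stain, wear
    epochs=200,
    imgsz=640,
)
results = model("courtyard_tile_photo.jpg")   # inference on a new paving image
results[0].show()
```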

20 pages, 3510 KB  
Article
FM-Net: A New Method for Detecting Smoke and Flames
by Jingwu Wang, Yuan Yao, Yinuo Huo and Jinfu Guan
Sensors 2025, 25(17), 5597; https://doi.org/10.3390/s25175597 - 8 Sep 2025
Cited by 1 | Viewed by 1013
Abstract
To address the core problems of high false- and missed-alarm rates and insufficient interference resistance in existing smoke and fire detection algorithms for complex scenes, this paper proposes a target detection network based on an improved feature pyramid structure. A Context Guided Convolutional Block replaces the traditional convolution operation, fusing the detected target with surrounding environment information through secondary feature fusion while reconfiguring the feature dimensions, which effectively mitigates edge feature loss during down-sampling. A Poly Kernel Inception Block with a multi-branch parallel network structure performs multi-scale feature extraction of the detected target, enabling collaborative characterization of flame profiles and smoke diffusion patterns. To further enhance the spatial localization ability for the target, a Manhattan Attention Mechanism Unit is introduced to accurately capture the spatial and temporal correlation characteristics of flame and smoke by establishing a pixel-level long-range dependency model. Experimental tests conducted on a self-constructed high-quality smoke and fire image dataset show that, compared with existing typical lightweight smoke and fire detection models, the proposed algorithm has a significant advantage in detection accuracy and can meet real-time detection requirements. Full article
(This article belongs to the Section Sensor Networks)

15 pages, 1786 KB  
Article
Application of Gaussian SVM Flame Detection Model Based on Color and Gradient Features in Engine Test Plume Images
by Song Yan, Yushan Gao, Zhiwei Zhang and Yi Li
Sensors 2025, 25(17), 5592; https://doi.org/10.3390/s25175592 - 8 Sep 2025
Viewed by 1080
Abstract
This study presents a flame detection model that is based on real experimental data that were collected during turbopump hot-fire tests of a liquid rocket engine. In these tests, a MEMRECAM ACS-1 M40 high-speed camera—serving as an optical sensor within the test instrumentation system—captured plume images for analysis. To detect abnormal flame phenomena in the plume, a Gaussian support vector machine (SVM) model was developed using image features that were derived from both color and gradient information. Six representative frames containing visible flames were selected from a single test failure video. These images were segmented in the YCbCr color space using the k-means clustering algorithm to distinguish flame and non-flame pixels. A 10-dimensional feature vector was constructed for each pixel and then reduced to five dimensions using the Maximum Relevance Minimum Redundancy (mRMR) method. The reduced vectors were used to train the Gaussian SVM model. The model achieved a 97.6% detection accuracy despite being trained on a limited dataset. It has been successfully applied in multiple subsequent engine tests, and it has proven effective in detecting ablation-related anomalies. By combining real-world sensor data acquisition with intelligent image-based analysis, this work enhances the monitoring capabilities in rocket engine development. Full article
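A condensed sketch of the pipeline this abstract describes is shown below: per-pixel segmentation in YCbCr with k-means, and an RBF-kernel (Gaussian) SVM trained on pixel feature vectors. A simple mutual-information ranking is used here as a stand-in for the paper's mRMR reduction to five dimensions, and the 10-D feature construction itself is not reproduced.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_selection import mutual_info_classif
from sklearn.svm import SVC

def label_pixels_by_kmeans(bgr_frame):
    """Split pixels into two clusters in YCbCr; take the brighter cluster as flame."""
    ycrcb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2YCrCb).reshape(-1, 3).astype(np.float32)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(ycrcb)
    flame_cluster = int(ycrcb[labels == 1, 0].mean() > ycrcb[labels == 0, 0].mean())
    return (labels == flame_cluster).astype(int)

def train_flame_svm(pixel_features, pixel_labels, keep=5):
    """Rank per-pixel features by mutual information (stand-in for mRMR),
    keep the top `keep`, and fit an RBF-kernel SVM."""
    scores = mutual_info_classif(pixel_features, pixel_labels, random_state=0)
    selected = np.argsort(scores)[::-1][:keep]
    svm = SVC(kernel="rbf").fit(pixel_features[:, selected], pixel_labels)
    return svm, selected
```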

22 pages, 1663 KB  
Review
Large-Space Fire Detection Technology: A Review of Conventional Detector Limitations and Image-Based Target Detection Techniques
by Li Deng, Siqi Wu, Shuang Zou and Quanyi Liu
Fire 2025, 8(9), 358; https://doi.org/10.3390/fire8090358 - 7 Sep 2025
Viewed by 2417
Abstract
With the rapid development of large-space buildings, their fire risk has become increasingly prominent. Conventional fire detection technologies are often limited by spatial height and environmental interference, leading to false alarms, missed detections, and delayed responses. This paper reviews 83 publications to analyze the limitations of conventional methods in large spaces and highlights the advantages of and current developments in image-based fire detection technology. It outlines key aspects such as equipment selection, dataset construction, and target recognition algorithm optimization, along with improvement directions including scenario-adaptive datasets, model enhancement, and adaptability refinement. Research demonstrates that image-based technology offers broad coverage, rapid response, and strong anti-interference capability, effectively compensating for the shortcomings of conventional methods and providing a new solution for early fire warning in large spaces. Finally, future prospects are discussed, focusing on environmental adaptability, algorithm efficiency and reliability, and system integration, offering valuable references for related research and applications. Full article
(This article belongs to the Special Issue Building Fire Dynamics and Fire Evacuation, 2nd Edition)

32 pages, 6058 KB  
Article
An Enhanced YOLOv8n-Based Method for Fire Detection in Complex Scenarios
by Xuanyi Zhao, Minrui Yu, Jiaxing Xu, Peng Wu and Haotian Yuan
Sensors 2025, 25(17), 5528; https://doi.org/10.3390/s25175528 - 5 Sep 2025
Viewed by 1381
Abstract
With the escalating frequency of urban and forest fires driven by climate change, the development of intelligent and robust fire detection systems has become imperative for ensuring public safety and ecological protection. This paper presents a comprehensive multi-module fire detection framework based on visual computing, encompassing image enhancement and lightweight object detection. To address data scarcity and to enhance generalization, a projected generative adversarial network (Projected GAN) is employed to synthesize diverse and realistic fire scenarios under varying environmental conditions. For the detection module, an improved YOLOv8n architecture is proposed by integrating BiFormer Attention, Agent Attention, and CCC (Compact Channel Compression) modules, which collectively enhance detection accuracy and robustness under low visibility and dynamic disturbance conditions. Extensive experiments on both synthetic and real-world fire datasets demonstrated notable improvements in image restoration quality (achieving a PSNR up to 34.67 dB and an SSIM up to 0.968) and detection performance (mAP reaching 0.858), significantly outperforming the baseline. The proposed system offers a reliable and deployable solution for real-time fire monitoring and early warning in complex visual environments. Full article
(This article belongs to the Section Sensing and Imaging)

26 pages, 30652 KB  
Article
Hybrid ViT-RetinaNet with Explainable Ensemble Learning for Fine-Grained Vehicle Damage Classification
by Ananya Saha, Mahir Afser Pavel, Md Fahim Shahoriar Titu, Afifa Zain Apurba and Riasat Khan
Vehicles 2025, 7(3), 89; https://doi.org/10.3390/vehicles7030089 - 25 Aug 2025
Viewed by 1351
Abstract
Efficient and explainable vehicle damage inspection is essential due to the increasing complexity and volume of vehicular incidents. Traditional manual inspection approaches are not time-effective, prone to human error, and lead to inefficiencies in insurance claims and repair workflows. Existing deep learning methods, such as CNNs, often struggle with generalization, require large annotated datasets, and lack interpretability. This study presents a robust and interpretable deep learning framework for vehicle damage classification, integrating Vision Transformers (ViTs) and ensemble detection strategies. The proposed architecture employs a RetinaNet backbone with a ViT-enhanced detection head, implemented in PyTorch using the Detectron2 object detection technique. It is pretrained on COCO weights and fine-tuned through focal loss and aggressive augmentation techniques to improve generalization under real-world damage variability. The proposed system applies the Weighted Box Fusion (WBF) ensemble strategy to refine detection outputs from multiple models, offering improved spatial precision. To ensure interpretability and transparency, we adopt numerous explainability techniques—Grad-CAM, Grad-CAM++, and SHAP—offering semantic and visual insights into model decisions. A custom vehicle damage dataset with 4500 images has been built, consisting of approximately 60% curated images collected through targeted web scraping and crawling covering various damage types (such as bumper dents, panel scratches, and frontal impacts), along with 40% COCO dataset images to support model generalization. Comparative evaluations show that Hybrid ViT-RetinaNet achieves superior performance with an F1-score of 84.6%, mAP of 87.2%, and 22 FPS inference speed. In an ablation analysis, WBF, augmentation, transfer learning, and focal loss significantly improve performance, with focal loss increasing F1 by 6.3% for underrepresented classes and COCO pretraining boosting mAP by 8.7%. Additional architectural comparisons demonstrate that our full hybrid configuration not only maintains competitive accuracy but also achieves up to 150 FPS, making it well suited for real-time use cases. Robustness tests under challenging conditions, including real-world visual disturbances (smoke, fire, motion blur, varying lighting, and occlusions) and artificial noise (Gaussian; salt-and-pepper), confirm the model’s generalization ability. This work contributes a scalable, explainable, and high-performance solution for real-world vehicle damage diagnostics. Full article
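The Weighted Box Fusion step mentioned above is available as a standalone routine in the open-source ensemble-boxes package; a minimal sketch merging the outputs of two detectors is shown below (coordinates must be normalized to [0, 1], and the weights, thresholds, and class labels here are illustrative, not the paper's settings).

```python
from ensemble_boxes import weighted_boxes_fusion

# Normalized [x1, y1, x2, y2] boxes from two detectors over the same image.
boxes_list = [
    [[0.10, 0.20, 0.45, 0.60], [0.55, 0.30, 0.90, 0.70]],   # model A
    [[0.12, 0.22, 0.46, 0.58]],                              # model B
]
scores_list = [[0.92, 0.71], [0.85]]
labels_list = [[0, 1], [0]]            # e.g. 0 = bumper dent, 1 = panel scratch

boxes, scores, labels = weighted_boxes_fusion(
    boxes_list, scores_list, labels_list,
    weights=[2, 1],                    # trust model A slightly more
    iou_thr=0.55, skip_box_thr=0.01,
)
print(boxes, scores, labels)
```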

21 pages, 3049 KB  
Article
SRoFF-Yolover: A Small-Target Detection Model for Suspicious Regions of Forest Fire
by Lairong Chen, Ling Li, Pengle Cheng and Ying Huang
Forests 2025, 16(8), 1335; https://doi.org/10.3390/f16081335 - 16 Aug 2025
Viewed by 759
Abstract
The rapid detection and confirmation of Suspicious Regions of Forest Fire (SRoFF) are critical for timely alerts and firefighting operations. In the early stages of forest fires, small flames and heavy occlusion lead to low accuracy, false detections, omissions, and slow inference in existing target-detection algorithms. We constructed the Suspicious Regions of Forest Fire Dataset (SRFFD), comprising publicly available datasets, relevant images collected from online searches, and images generated through various image enhancement techniques. The SRFFD contains a total of 64,584 images. In terms of effectiveness, the individual augmentation techniques rank as follows (in descending order): HSV (Hue Saturation and Value) random enhancement, copy-paste augmentation, and affine transformation. A detection model named SRoFF-Yolover is proposed for identifying suspicious regions of forest fire, based on the YOLOv8. An embedding layer that effectively integrates seasonal and temporal information into the image enhances the prediction accuracy of the SRoFF-Yolover. The SRoFF-Yolover enhances YOLOv8 by (1) adopting dilated convolutions in the Backbone to enlarge feature map receptive fields; (2) incorporating the Convolutional Block Attention Module (CBAM) prior to the Neck’s C2fLayer for small-target attention; and (3) reconfiguring the Backbone-Neck linkage via P2, P4, and SPPF. Compared with the baseline model (YOLOv8s), the SRoFF-Yolover achieves an 18.1% improvement in mAP@0.5, a 4.6% increase in Frames Per Second (FPS), a 2.6% reduction in Giga Floating-Point Operations (GFLOPs), and a 3.2% decrease in the total number of model parameters (#Params). The SRoFF-Yolover can effectively detect suspicious regions of forest fire, particularly during winter nights. Experiments demonstrated that the detection accuracy of the SRoFF-Yolover for suspicious regions of forest fire is higher at night than during daytime in the same season. Full article
(This article belongs to the Section Natural Hazards and Risk Management)
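The Convolutional Block Attention Module (CBAM) inserted before the Neck's C2f layer is a standard published block (channel attention followed by spatial attention); a compact PyTorch version is sketched below, with the reduction ratio and kernel size set to commonly used defaults rather than the paper's settings.

```python
import torch
from torch import nn

class CBAM(nn.Module):
    """Channel attention (shared MLP over avg- and max-pooled descriptors)
    followed by spatial attention (7x7 conv over pooled channel maps)."""

    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial_conv = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention: weight each channel by pooled global descriptors.
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # Spatial attention: weight each location by pooled channel statistics.
        pooled = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(pooled))

feat = torch.rand(1, 64, 40, 40)      # a neck feature map
print(CBAM(64)(feat).shape)
```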
