Comprehensive Building Fire Risk Prediction Using Machine Learning and Stacking Ensemble Methods
Abstract
1. Introduction
1.1. Background and Motivation
1.2. Machine Learning in Predictive Modeling
1.3. Limitations of Traditional Fire Risk Models
1.4. Objectives of This Study
- To develop and compare fire risk prediction models using 16 machine learning algorithms, evaluating their individual strengths and weaknesses in predicting fire incidents;
- To enhance the prediction performance of these models by incorporating a comprehensive range of building, environmental, and demographic variables;
- To validate the model’s predictive performance and reliability using real-world fire incident data, ensuring its applicability in practical settings;
- To create a stacking ensemble model capable of classifying fire risk into five distinct grades, providing a more reliable and accurate risk-classification system.
1.5. Structure of the Paper
- Section 2: Describes the data collection and preprocessing methods used in this study, including a detailed explanation of the independent variables and machine learning algorithms employed;
- Section 3: Presents the results of the model development and evaluation process, including performance metrics such as precision, recall, and F1-score for each algorithm;
- Section 4: Provides an in-depth discussion of the findings, highlights the strengths and limitations of the models, and suggests potential areas for future research;
- Section 5: Concludes the paper by discussing the practical implications of the model for fire prevention, insurance risk assessments, and fire safety management, along with recommendations for further refinement and application.
2. Materials and Methods
2.1. Data Collection
2.2. Data Preprocessing
2.2.1. One-Hot Encoding
2.2.2. Standardization
2.2.3. Handling Class Imbalance
2.2.4. Splitting Train and Test Data
2.3. Modeling
2.4. Model Evaluation
- True Positive (TP): Correctly classified fire instances;
- True Negative (TN): Correctly classified non-fire instances;
- False Positive (FP): Non-fire instances incorrectly classified as fires;
- False Negative (FN): Fire instances incorrectly classified as non-fires.
- Accuracy: Measures the proportion of correct predictions among all instances.
- Precision: Indicates the proportion of true positive predictions among all positive predictions.
- Recall (Sensitivity): Reflects the proportion of actual positives correctly identified.
- F1-score: The harmonic mean of precision and recall.
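In terms of the four confusion-matrix counts defined above, these metrics follow the standard definitions, written out here for reference:

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
\text{F1}        &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
                  = \frac{2\,TP}{2\,TP + FP + FN}
\end{align}
```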
3. Results
3.1. Model Training Results
- Accuracy measures the proportion of correct predictions (both fire and non-fire incidents) among all predictions. The Random Forest (RF), Bagging Classifier (BC), and XGBoost (XGB) models achieved the highest accuracy (0.8470, 0.8406, and 0.8033, respectively). However, on imbalanced datasets, accuracy alone can be misleading, because a model can score well simply by predicting the majority class (non-fire incidents) most of the time. The BC model illustrates this: its accuracy is 0.8406, but its recall is only 0.1810.
- Precision indicates the proportion of true positive predictions among all positive predictions made by the model. The RF, XGB, and CatBoost (CB) models showed the highest precision (0.3676, 0.3075, and 0.2790, respectively), indicating a lower rate of false positives. High precision is crucial when the cost of false positives is significant.
- Recall (also known as sensitivity) reflects the model’s ability to identify actual fire incidents among all actual positive cases. The Long Short-Term Memory (LSTM), Artificial Neural Network (ANN), and Decision Tree (DT) models exhibited the highest recall (0.9029, 0.8080, and 0.7294, respectively), indicating that they detect most actual fires, at the cost of more false positives.
- The F1-score is the harmonic mean of precision and recall, providing a balance between the two metrics. The Gradient Boosting (GB), LightGBM (LGB), and XGB models achieved the highest F1-scores (0.3866, 0.3829, and 0.3755, respectively), suggesting that they maintain a good balance between detecting fires and minimizing false positives.
- High Recall, Low Precision: Models like LSTM and ANNs have high recall but low precision, meaning they are good at detecting actual fires but also produce many false positives;
- High Precision, Lower Recall: Models like RF and XGB have higher precision but lower recall, indicating they make fewer false positive predictions but may miss some actual fire incidents;
- Balanced Performance: The GB, LGB, and XGB models achieve a better balance between precision and recall, as reflected in their higher F1-scores.
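The trade-off described above can be checked directly from the reported figures. The following is a minimal sketch that recomputes the F1-score from the precision and recall values reported for three of the models (GB, RF, and LSTM); the function and variable names are illustrative and not taken from the study's code. Because the published precision and recall are rounded to four decimals, the recomputed values are expected to agree with the reported F1-scores only to within rounding.

```python
# Precision, recall, and F1 as reported in the model comparison table.
reported = {
    "GB":   {"precision": 0.2708, "recall": 0.6754, "f1": 0.3866},
    "RF":   {"precision": 0.3676, "recall": 0.3430, "f1": 0.3549},
    "LSTM": {"precision": 0.1437, "recall": 0.9029, "f1": 0.2479},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for each model from its reported precision and recall.
recomputed = {
    name: f1_score(m["precision"], m["recall"]) for name, m in reported.items()
}

for name, value in recomputed.items():
    print(f"{name}: reported F1 = {reported[name]['f1']:.4f}, recomputed = {value:.4f}")
```

The LSTM row shows why the harmonic mean is used: a recall of 0.9029 cannot offset a precision of 0.1437, so its F1-score stays well below that of the more balanced GB model.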
3.2. Analysis of Individual Models
3.2.1. Random Forest (RF)
3.2.2. Decision Tree (DT)
3.2.3. CatBoost (CB)
3.2.4. XGBoost (XGB)
3.2.5. K-Nearest Neighbors (KNN)
3.2.6. Naive Bayes (NB)
3.2.7. Artificial Neural Networks (ANN)
3.2.8. Deep Neural Networks (DNN)
3.2.9. Long Short-Term Memory (LSTM)
3.2.10. Support Vector Machines with Polynomial Kernel (SVMPs)
3.2.11. Support Vector Machines with Radial Basis Function (SVMRs)
3.2.12. Support Vector Machines with Sigmoid Kernel (SVMSs)
3.2.13. AdaBoost (AB)
3.2.14. Light Gradient Boosting Machine (LGB)
3.2.15. Bagging Classifier (BC)
3.2.16. Gradient Boosting (GB)
3.3. Rationale for Choosing the Ensemble Method
- Improved Accuracy: Combining predictions reduces the likelihood of errors being made by the individual models;
- Enhanced Robustness: Diversity among models ensures that the ensemble is less sensitive to anomalies and variations in the data;
- Balanced Precision and Recall: Models with a high recall compensate for those with high precision, achieving a better overall balance that is essential for fire risk prediction.
3.4. Ensemble Voting Methodology
- Prediction Aggregation: For each building, collect the predictions from all models;
- Vote Counting: Tally the number of models predicting a fire for a given building;
- Risk Level Assignment: Classify buildings into risk levels based on the number of votes they receive.
- Grade 1 (Extremely Low Risk, Blue color): 0–1 models predicting a fire;
- Grade 2 (Low Risk, Green color): 2 models predicting a fire;
- Grade 3 (Medium Risk, Yellow color): 3–4 models predicting a fire;
- Grade 4 (High Risk, Orange color): 5–10 models predicting a fire;
- Grade 5 (Extremely High Risk, Red color): 11–16 models predicting a fire.
- Grade 1 (Lowest Risk): Only about 3.5% of the buildings experienced actual fires, representing 6.9% of all fire occurrences, indicating this as the safest category;
- Grade 5 (Highest Risk): While comprising only 21.5% of the predicted buildings, it accounted for 54.5% of actual fire occurrences, marking it as the highest risk category.
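The vote-counting and grade-assignment steps above can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, and it assumes that an "O" in the per-model prediction table marks a predicted fire and an "X" marks no predicted fire. The three example rows are copied from that table (buildings B_1, B_3, and B_8).

```python
MODELS = ["RF", "DT", "CB", "XGB", "KNN", "NB", "ANN", "DNN",
          "LSTM", "SVMP", "SVMR", "SVMS", "AB", "LGB", "BC", "GB"]

# (inclusive upper bound on votes, grade), following the five grade ranges above.
GRADE_UPPER_BOUNDS = [(1, 1), (2, 2), (4, 3), (10, 4), (16, 5)]


def count_votes(marks: str) -> int:
    """Count the models predicting a fire in a row of O/X marks (assumed: O = fire)."""
    cells = marks.split()
    if len(cells) != len(MODELS):
        raise ValueError(f"expected {len(MODELS)} marks, got {len(cells)}")
    return cells.count("O")


def assign_grade(votes: int) -> int:
    """Map a vote count (0 to 16) to a risk grade (1 to 5)."""
    if not 0 <= votes <= len(MODELS):
        raise ValueError(f"vote count out of range: {votes}")
    for upper, grade in GRADE_UPPER_BOUNDS:
        if votes <= upper:
            return grade
    raise AssertionError("unreachable: bounds cover 0..16")


rows = {
    "B_1": "X X X X X X X X X X X X X X X X",
    "B_3": "X O X X O O O X O X X O O X X X",
    "B_8": "O O O X X O O O O X X O O O O O",
}

grades = {building: assign_grade(count_votes(marks)) for building, marks in rows.items()}
print(grades)
```

Under these assumptions, B_1 receives no votes (Grade 1), B_3 receives 7 votes (Grade 4), and B_8 receives 12 votes (Grade 5).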
4. Discussion
4.1. Model Evaluation and Performance
- High Recall Models: The LSTM, ANN, and DT models exhibited high recall, identifying most actual fire incidents. However, their low precision means they also produced many false positives, which can lead to unnecessary alarms and strain on resources;
- High Precision Models: The RF, XGB, and CB models demonstrated higher precision, indicating a lower rate of false positives. Nevertheless, their recall was lower, meaning that they miss more actual fire incidents, which is undesirable in safety-critical applications;
- Balanced Models: The GB, LGB, and XGB models achieved higher F1-scores, reflecting a better balance between precision and recall. These models are particularly valuable because they maintain a reasonable detection rate of actual fires while minimizing false alarms.
4.2. Effectiveness of the Ensemble Method
- Improved Balance: The ensemble method effectively balances precision and recall by leveraging the strengths of different models. Models with a high recall compensate for those with high precision, resulting in a more balanced overall prediction;
- Enhanced Robustness: Aggregating multiple models reduces the impact of any single model’s biases or errors, leading to more reliable and consistent predictions.
- Grade 1 (Lowest Risk): Comprised 24% of the predicted buildings but only accounted for 7% of the actual fire occurrences, indicating a low risk and validating the model’s ability to identify safer buildings;
- Grade 5 (Highest Risk): Represented 22% of the predicted buildings but accounted for 54% of the actual fire occurrences, demonstrating the model’s proficiency in accurately pinpointing high-risk buildings.
4.3. Practical Implications for Fire Risk Management
- Policy Makers: Regulatory authorities can prioritize inspections and enforce stricter safety regulations for buildings identified as high-risk. This targeted approach can enhance the effectiveness of fire safety policies and resource allocation;
- Insurance Companies: Risk-based premium adjustments can be made, promoting fair and accurate insurance policies that reflect the actual risk levels of buildings. This can incentivize property owners to invest in fire safety measures;
- Building Managers: Resources can be allocated more efficiently by focusing on maintenance and safety measures in higher-risk buildings. Proactive measures can be implemented to mitigate identified risks;
- Emergency Services: Fire departments can optimize response planning by identifying areas with higher predicted fire risks. This allows for the strategic placement of resources and quicker response times to high-risk areas.
4.4. Limitations
- Data Limitations: Reliance on publicly available data may limit the model’s accuracy because of issues such as missing values, inconsistent data quality, and the inability to capture all relevant factors. The overestimation of fire occurrences due to address-level data aggregation is a specific concern;
- False Positives: The ensemble model, while improving the overall performance, still produces a considerable number of false positives, which could lead to unnecessary resource allocation and potential desensitization to fire alarms;
- Model Complexity: The stacking ensemble approach increases the computational complexity, potentially hindering real-time implementation and scalability, especially in resource-constrained environments;
- Generalizability: The model was developed based on data from the Republic of Korea’s special buildings and may require adaptation for use in other regions or building categories due to differences in building codes, environmental factors, and demographic characteristics.
4.5. Future Research Directions
- Data Enrichment: Incorporating additional data sources, such as real-time sensor data (e.g., smoke detectors, temperature sensors) or maintenance records, to improve the model’s predictive accuracy and capture dynamic risk factors;
- Model Optimization: Simplifying the model to reduce the computational demands without sacrificing performance. Techniques like feature selection, dimensionality reduction, or using more efficient algorithms can be explored;
- Cross-Regional Validation: Testing the model in different geographical contexts and building types to assess its generalizability and adaptability to other regions;
- Stakeholder Collaboration: Working with industry experts, policymakers, and fire safety professionals to refine the fire risk grading system, ensuring its practical relevance and facilitating its adoption in policy and practice.
5. Conclusions
5.1. Key Contributions
- Comprehensive Data Integration: Incorporating diverse variables captures the complex factors contributing to fire risk, improving the model’s robustness and applicability in various contexts. This holistic approach enhances our understanding of fire risk factors;
- Practical Risk Stratification: Classifying the fire risk into five distinct grades provides a usable framework for stakeholders to implement targeted fire-prevention measures. The grading system simplifies complex predictive outputs into actionable insights, facilitating decision-making.
5.2. Implications for Stakeholders
- Policy Implementation: Authorities can utilize the model to prioritize safety inspections, allocate resources effectively, and update regulatory standards based on the identified risk levels. This can lead to more efficient use of public resources and improved safety outcomes;
- Insurance Assessment: Insurance companies can adjust their premiums and coverage options based on the risk grades, leading to fairer and more accurate policies that reflect the actual risk. This approach can promote risk-mitigation efforts by building owners;
- Resource Allocation: Emergency services and building managers can allocate resources more efficiently by focusing on higher-risk buildings, enhancing the overall fire safety and response effectiveness. This targeted approach can improve emergency preparedness and reduce response times.
5.3. Limitations and Recommendations
- Data Quality and Availability: Future work should aim to incorporate more detailed and high-quality data to improve the model’s accuracy. Collaborations with governmental agencies and private organizations could facilitate access to better data and enable the inclusion of additional relevant variables;
- False-Positive Reduction: Further research is needed to refine the model so that it reduces the false-positive rate without compromising recall. Techniques such as cost-sensitive learning, adjusting classification thresholds, or employing more sophisticated ensemble methods could be explored;
- Model Simplification: Simplifying the ensemble model can make it more accessible for real-time implementation. Investigating methods to reduce the model’s computational complexity, such as pruning less-impactful models from the ensemble or using more efficient algorithms, is recommended.
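One of the options named above, adjusting the classification threshold, can be illustrated on the ensemble's vote counts. The sketch below is not the authors' method: it treats a building as flagged when at least k of the 16 models predict a fire, and computes precision and recall for each k from the per-vote counts reported in the vote table. It assumes that each recorded fire occurrence corresponds to one positive building, which the address-level aggregation noted in the limitations may not strictly satisfy; the function name is hypothetical.

```python
# (buildings, actual fire occurrences) for vote counts 0..16, from the per-vote table.
VOTE_TABLE = [
    (14410, 526), (24791, 843), (27496, 1455), (19821, 1111), (11345, 856),
    (6534, 808), (5219, 570), (4474, 584), (3864, 587), (3965, 677),
    (5176, 1026), (6847, 1382), (8051, 1879), (8066, 2537), (6888, 2884),
    (3769, 1665), (1201, 473),
]


def metrics_at_threshold(min_votes: int) -> tuple[float, float]:
    """Precision and recall when buildings with >= min_votes votes are flagged."""
    flagged = sum(buildings for buildings, _ in VOTE_TABLE[min_votes:])
    caught = sum(fires for _, fires in VOTE_TABLE[min_votes:])
    total_fires = sum(fires for _, fires in VOTE_TABLE)
    precision = caught / flagged if flagged else 0.0
    recall = caught / total_fires
    return precision, recall


for k in (3, 5, 11):
    precision, recall = metrics_at_threshold(k)
    print(f"votes >= {k:2d}: precision = {precision:.3f}, recall = {recall:.3f}")
```

Raising k trades recall for precision, so the threshold can in principle be chosen to match the relative cost of a missed fire versus an unnecessary inspection.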
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Year | Number of All Buildings (A) | Number of Actual Fire Occurrences in All Buildings (B) | Fire Occurrence Rate (B/A) | Number of Special Buildings, SBs (C) | Number of Actual Fire Occurrences in SBs (D) | Fire Occurrence Rate (D/C) |
---|---|---|---|---|---|---|
2018 | 8,730,004 | 84,768 | 0.97% | 141,454 | 18,903 | 13.36% |
2019 | 8,688,370 | 78,734 | 0.91% | 145,491 | 18,323 | 12.59% |
2020 | 8,628,998 | 76,390 | 0.89% | 149,107 | 19,200 | 12.88% |
2021 | 8,569,610 | 71,746 | 0.84% | 155,583 | 18,588 | 11.95% |
2022 | 8,508,355 | 74,165 | 0.87% | 157,824 | 19,192 | 12.16% |
Types | Abbreviation | Independent Variable | Units | Ranges |
---|---|---|---|---|
Building Characteristics | B_usg | Type of building usage | Categorical | 40 classes |
Building Characteristics | B_str | Type of building structure | Categorical | 22 classes |
Building Characteristics | B_ta | Total area of the building | m² | 0~15,392.02 |
Building Characteristics | B_fa | Total floor area | m² | 0~36,910.96 |
Building Characteristics | B_bcr | Building coverage ratio on the plot | % | 0~74.75 |
Building Characteristics | B_dist_fs | Distance to the nearest fire station | meters | 0~84,175.94 |
Building Characteristics | B_ht | Height of the building | meters | 0~76.3 |
Building Characteristics | B_age | Age of the building | years | 0~103 |
Building Characteristics | B_far | Floor area to plot ratio | % | 0~754.25 |
Building Characteristics | B_econ | Annual electricity consumption | kWh | 0~490,988,077 |
Building Characteristics | B_emax | Monthly maximum electricity usage | kWh | 0~43,416,800 |
Building Characteristics | B_emin | Monthly minimum electricity usage | kWh | 0~35,485,800 |
Building Characteristics | B_park | Total parking area | m² | 0~367,651 |
Building Characteristics | B_ps | Number of parking spaces | count | 0~20,684.62 |
Building Characteristics | B_plot | Area of the plot | m² | 0~104,784.7 |
Building Characteristics | B_tf | Total number of floors | count | 0~80 |
Building Characteristics | B_fg | Floors above ground | count | 0~80 |
Building Characteristics | B_fb | Floors below ground | count | 0~20 |
Building Characteristics | B_dist_b | Minimum distance to nearest building | meters | 0~631.75 |
Building Characteristics | B_nb | Number of buildings within a 20-m radius | count | 0~104 |
Land Characteristics | L_price | Official price of the land | KRW/m² | 365~191,000,000 |
Land Characteristics | L_type | Categorical land type | Categorical | 27 classes |
Land Characteristics | L_area | Total land area | m² | 3.0~3,564,218.9 |
Land Characteristics | L_zon1 | Primary land zoning category | Categorical | 24 classes |
Land Characteristics | L_zon2 | Secondary land zoning category | Categorical | 25 classes |
Land Characteristics | L_cond | Current land use condition | Categorical | 53 classes |
Land Characteristics | L_elev | Elevation of the land | Categorical | 7 classes |
Land Characteristics | L_shape | Shape of the topography | Categorical | 10 classes |
Land Characteristics | L_road | Type of adjacent road | Categorical | 14 classes |
Demographic and Administrative | D_pop | Population of the district | persons | 0~258,867 |
Demographic and Administrative | D_bld | Number of buildings in the district | count | 0~22,652 |
Demographic and Administrative | D_fire | Number of fires reported in the district | count/year | 0.0~175.67 |
Demographic and Administrative | D_fc | Fire incidents per capita | incidents per person | 0.0~4.33 |
Demographic and Administrative | D_fb | Fire incidents per building | incidents per building | 0.0~0.33 |
Building ID | B_usg_Temporary | B_usg_PublicOffice | … | B_str_SteelPipe | B_str_LightSteel | … | L_type_Park | L_type_Factory | … |
---|---|---|---|---|---|---|---|---|---|
B_1 | 1 | 0 | … | 0 | 0 | … | 0 | 1 | … |
B_2 | 0 | 1 | … | 0 | 1 | … | 1 | 0 | … |
B_3 | 0 | 0 | … | 1 | 0 | … | 0 | 1 | … |
B_4 | 0 | 0 | … | 0 | 0 | … | 0 | 0 | … |
B_5 | 0 | 1 | … | 0 | 0 | … | 1 | 0 | … |
B_6 | 0 | 0 | … | 1 | 0 | … | 0 | 0 | … |
B_7 | 0 | 0 | … | 0 | 0 | … | 0 | 1 | … |
B_8 | 1 | 0 | … | 0 | 0 | … | 0 | 0 | … |
B_9 | 0 | 0 | … | 0 | 1 | … | 1 | 0 | … |
B_10 | 0 | 1 | … | 0 | 0 | … | 0 | 1 | … |
· · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · |
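The encoded columns in the table above follow the usual one-hot scheme: each categorical variable is replaced by one 0/1 column per class, named with the variable's prefix. The following is a minimal pure-Python sketch of that transformation. The helper name and the sample values are hypothetical; only the class names `Temporary` and `PublicOffice` and the `B_usg` prefix come from the table, and `Other` stands in for the remaining usage classes, which the table elides.

```python
def one_hot(values: list[str], prefix: str) -> tuple[list[str], list[list[int]]]:
    """Return (column names, 0/1 rows) for a single categorical variable."""
    categories = sorted(set(values))
    columns = [f"{prefix}_{category}" for category in categories]
    rows = [[1 if value == category else 0 for category in categories] for value in values]
    return columns, rows


# Hypothetical building-usage values for four buildings.
usage = ["Temporary", "PublicOffice", "Other", "PublicOffice"]
columns, rows = one_hot(usage, prefix="B_usg")

print(columns)
for row in rows:
    print(row)
```

Each row contains exactly one 1 across the columns of a given variable, so a building with 0 in both displayed `B_usg` columns simply belongs to a usage class that is not shown.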
Building ID | B_ta | B_fa | B_plot | … | L_area | … | D_bld | D_pop | … |
---|---|---|---|---|---|---|---|---|---|
B_1 | 0.0090 | 0.2812 | −0.0179 | … | −0.3749 | … | −1.1212 | −0.8186 | … |
B_2 | −0.0189 | −0.0197 | −0.0689 | … | −0.3318 | … | −1.1162 | −0.8211 | … |
B_3 | −0.0189 | −0.0197 | −0.0689 | … | −0.3732 | … | −1.1087 | −0.8190 | … |
B_4 | 0.0065 | 0.2086 | −0.0318 | … | −0.3681 | … | −1.1212 | −0.8186 | … |
B_5 | −0.0032 | 0.1568 | −0.0333 | … | −0.3692 | … | −1.1212 | −0.8186 | … |
B_6 | −0.0042 | 0.0978 | −0.0402 | … | −0.3746 | … | −1.1278 | −0.8210 | … |
B_7 | −0.0189 | −0.0183 | −0.0667 | … | −0.3796 | … | −1.0638 | −0.8179 | … |
B_8 | 0.0138 | 0.1785 | −0.0187 | … | −0.3581 | … | −1.1254 | −0.7451 | … |
B_9 | −0.0002 | 0.0952 | −0.0240 | … | −0.3643 | … | −1.1286 | −0.8210 | … |
B_10 | 0.0072 | 0.0758 | −0.0450 | … | −0.3827 | … | −1.1278 | −0.8210 | … |
· · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · |
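The values in the table above are consistent with z-score standardization, in which each continuous variable is shifted to zero mean and scaled to unit standard deviation. The sketch below assumes that form (the exact scaler used in the study is not stated here) and uses the population standard deviation; the function name and the sample floor areas are hypothetical.

```python
from statistics import fmean, pstdev


def standardize(values: list[float]) -> list[float]:
    """Z-score standardization: (x - mean) / population standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(value - mean) / std for value in values]


# Hypothetical total floor areas in square meters.
floor_area = [120.0, 450.5, 980.0, 15000.0, 310.2]
scaled = standardize(floor_area)

print([round(value, 4) for value in scaled])
```

With heavily right-skewed variables such as floor area, most buildings end up with small negative or near-zero scores and a few very large buildings with large positive scores, which matches the pattern of values clustered near zero in the table.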
Techniques | Abbreviation |
---|---|
Random Forest | RF |
Decision Trees | DTs |
CatBoost | CB |
XGBoost | XGB |
K-Nearest Neighbors | KNN |
Naive Bayes | NB |
Artificial Neural Networks | ANNs |
Deep Neural Networks | DNNs |
Long Short-Term Memory | LSTM |
Support Vector Machines with Polynomial Kernel | SVMPs |
Support Vector Machines with Radial Basis Function Kernel | SVMRs |
Support Vector Machines with Sigmoid Kernel | SVMSs |
AdaBoost | AB |
Light Gradient Boosting Machine | LGB |
Bagging Classifier | BC |
Gradient Boosting | GB |
Model | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
GB | 0.7371 | 0.2708 | 0.6754 | 0.3866 |
LGB | 0.7562 | 0.2777 | 0.6166 | 0.3829 |
XGB | 0.8033 | 0.3075 | 0.4822 | 0.3755 |
SVMR | 0.7115 | 0.2552 | 0.7047 | 0.3747 |
SVMP | 0.7229 | 0.2555 | 0.6577 | 0.3680 |
RF | 0.8470 | 0.3676 | 0.3430 | 0.3549 |
DT | 0.6690 | 0.2310 | 0.7294 | 0.3509 |
CB | 0.7856 | 0.2790 | 0.4720 | 0.3507 |
NB | 0.6899 | 0.2342 | 0.6731 | 0.3475 |
SVMS | 0.6816 | 0.2188 | 0.6208 | 0.3236 |
KNN | 0.6903 | 0.2172 | 0.5852 | 0.3168 |
DNN | 0.6261 | 0.1965 | 0.6632 | 0.3032 |
ANN | 0.5029 | 0.1731 | 0.8080 | 0.2851 |
LSTM | 0.3279 | 0.1437 | 0.9029 | 0.2479 |
BC | 0.8406 | 0.2737 | 0.1810 | 0.2179 |
AB | 0.7195 | 0.1542 | 0.2869 | 0.2006 |
Building ID | RF | DT | CB | XGB | KNN | NB | ANN | DNN | LSTM | SVMP | SVMR | SVMS | AB | LGB | BC | GB |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
B_1 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X |
B_2 | X | O | X | X | X | O | O | X | X | X | X | O | O | X | X | X |
B_3 | X | O | X | X | O | O | O | X | O | X | X | O | O | X | X | X |
B_4 | X | X | X | X | X | O | X | X | X | X | X | O | O | X | X | X |
B_5 | X | X | X | X | X | O | X | X | X | X | X | O | X | X | X | X |
B_6 | X | X | X | X | X | X | X | X | X | X | X | X | O | X | X | X |
B_7 | X | X | X | X | X | X | X | X | X | X | X | O | O | X | X | X |
B_8 | O | O | O | X | X | O | O | O | O | X | X | O | O | O | O | O |
B_9 | X | X | X | X | X | O | X | X | X | X | X | O | X | X | X | X |
B_10 | X | O | X | X | X | O | X | X | X | X | X | O | X | X | X | X |
· · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · | · · · |
Number of Models Predicting Fire | Number of Predicted Buildings | Number of Actual Fire Occurrences | Fire Occurrence Rate | Share of Predicted Buildings | Share of Actual Fire Occurrences |
---|---|---|---|---|---|
0 | 14,410 | 526 | 4% | 9% | 3% |
1 | 24,791 | 843 | 3% | 15% | 4% |
2 | 27,496 | 1455 | 5% | 17% | 7% |
3 | 19,821 | 1111 | 6% | 12% | 6% |
4 | 11,345 | 856 | 8% | 7% | 4% |
5 | 6534 | 808 | 12% | 4% | 4% |
6 | 5219 | 570 | 11% | 3% | 3% |
7 | 4474 | 584 | 13% | 3% | 3% |
8 | 3864 | 587 | 15% | 2% | 3% |
9 | 3965 | 677 | 17% | 2% | 3% |
10 | 5176 | 1026 | 20% | 3% | 5% |
11 | 6847 | 1382 | 20% | 4% | 7% |
12 | 8051 | 1879 | 23% | 5% | 9% |
13 | 8066 | 2537 | 31% | 5% | 13% |
14 | 6888 | 2884 | 42% | 4% | 15% |
15 | 3769 | 1665 | 44% | 2% | 8% |
16 | 1201 | 473 | 39% | 1% | 2% |
Grade | Number of Predicted Buildings | Number of Actual Fire Occurrences | Fire Occurrence Rate (%) | Share of Predicted Buildings (%) | Share of Actual Fire Occurrences (%) |
---|---|---|---|---|---|
1 | 39,201 | 1369 | 3.5% | 24.2% | 6.9% |
2 | 27,496 | 1455 | 5.3% | 17.0% | 7.3% |
3 | 31,166 | 1967 | 6.3% | 19.2% | 9.9% |
4 | 29,232 | 4252 | 14.5% | 18.1% | 21.4% |
5 | 34,822 | 10,820 | 31.1% | 21.5% | 54.5% |
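The grade-level rows above can be reproduced by summing the per-vote rows over each grade's vote range. The following sketch does that aggregation and recomputes the three percentage columns; the counts are copied from the per-vote table, while the function and variable names are illustrative.

```python
# (buildings, actual fire occurrences) for vote counts 0..16, from the per-vote table.
VOTE_TABLE = [
    (14410, 526), (24791, 843), (27496, 1455), (19821, 1111), (11345, 856),
    (6534, 808), (5219, 570), (4474, 584), (3864, 587), (3965, 677),
    (5176, 1026), (6847, 1382), (8051, 1879), (8066, 2537), (6888, 2884),
    (3769, 1665), (1201, 473),
]

# Vote ranges for Grades 1 to 5: 0-1, 2, 3-4, 5-10, and 11-16 votes.
GRADE_RANGES = {1: range(0, 2), 2: range(2, 3), 3: range(3, 5),
                4: range(5, 11), 5: range(11, 17)}


def summarize_by_grade(vote_table):
    """Aggregate per-vote counts into per-grade counts, rates, and shares (in %)."""
    total_buildings = sum(buildings for buildings, _ in vote_table)
    total_fires = sum(fires for _, fires in vote_table)
    summary = {}
    for grade, votes in GRADE_RANGES.items():
        buildings = sum(vote_table[v][0] for v in votes)
        fires = sum(vote_table[v][1] for v in votes)
        summary[grade] = {
            "buildings": buildings,
            "fires": fires,
            "fire_rate_pct": round(100 * fires / buildings, 1),
            "building_share_pct": round(100 * buildings / total_buildings, 1),
            "fire_share_pct": round(100 * fires / total_fires, 1),
        }
    return summary


summary = summarize_by_grade(VOTE_TABLE)
for grade, row in summary.items():
    print(grade, row)
```

The totals implied by the table are 161,917 buildings and 19,863 fire occurrences, so the fire occurrence rate rises from 3.5% in Grade 1 to 31.1% in Grade 5, roughly a ninefold difference.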
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ahn, S.; Won, J.; Lee, J.; Choi, C. Comprehensive Building Fire Risk Prediction Using Machine Learning and Stacking Ensemble Methods. Fire 2024, 7, 336. https://doi.org/10.3390/fire7100336