A Novel Framework for Roof Accident Causation Analysis Based on Causation Matrix and Bayesian Network Modeling Methods
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper deals with the analysis of the causes of accidents caused by the collapse of roof structures in coal mines, which represent one of the most frequent and dangerous types of mining accidents. The authors propose a new analytical framework that combines an accident cause matrix and a Bayesian network. The model was verified through comparison with random forest and logistic regression, as well as the analysis of real cases, where it shows greater accuracy and ability to diagnose key risk factors. The paper is nicely written; however, I suggest a minor revision before publication.
The Introduction chapter is extensive, it is necessary to divide it into several sub-headings, in order to make it more readable and transparent. It is also necessary to enrich the chapter with at least ten more recent works, in order to improve the list of literature. I suggest you touch on the aspects shown in the paper where all use Bayesian network https://doi.org/10.3390/en18195089. This is the easiest way to add more recent references and expand the introduction image.
Although the model is explained in detail, additional emphasis is recommended so that mining companies can concretely apply the results, e.g. through a software tool, guidelines or procedures, so that the work has a greater practical application. In short.
An overview of existing methods is useful, but it would be worth adding recent work from the last three years on combining Bayesian networks with machine learning in the mining and industrial security domain. This would better position the work in the current research context.
Describe in more detail the selection criteria of 100 accident reports and the method of data processing. These instructions would facilitate the comparison of results in future studies.
The authors mentioned certain limitations of the work, but it would be useful to expand the part about potential biases when choosing factors and about the possibilities of collecting better quality data in the future, such as sensors in real time.
As a direction for future research, it is worth briefly demonstrating how the same framework could be adapted for other frequent mining incidents, such as gas explosions, floods, which would emphasize the breadth of the model.
There are technical errors in formatting, while the bibliography does not comply with the journal's requirements.
Author Response
Comments 1: [The Introduction chapter is extensive, it is necessary to divide it into several sub-headings, in order to make it more readable and transparent. It is also necessary to enrich the chapter with at least ten more recent works, in order to improve the list of literature. I suggest you touch on the aspects shown in the paper where all use Bayesian network https://doi.org/10.3390/en18195089. This is the easiest way to add more recent references and expand the introduction image.]
Response 1: [Thank you for pointing this out. We agree with this comment. Therefore, we have added the application of Bayesian network models in the coal mining field to the related works. This change can be found at lines 141-209 in Section 2.2 of the revised manuscript.]
Comments 2: [Although the model is explained in detail, additional emphasis is recommended so that mining companies can concretely apply the results, e.g. through a software tool, guidelines or procedures, so that the work has a greater practical application. In short.]
Response 2: [Thank you for your valuable suggestions. The findings of this study have significant application scenarios in current artificial intelligence mining applications, as detailed in the fifth section of the paper. This change can be found at lines 562-566.]
Comments 3: [An overview of existing methods is useful, but it would be worth adding recent work from the last three years on combining Bayesian networks with machine learning in the mining and industrial security domain. This would better position the work in the current research context.]
Response 3: [Thank you for your valuable feedback! Regarding the above issues, we have addressed the first issue by supplementing the literature with relevant research findings from the past three years. Refer to lines 141-209 in Section 2.2.]
Comments 4: [Describe in more detail the selection criteria of 100 accident reports and the method of data processing. These instructions would facilitate the comparison of results in future studies.]
Response 4: [Thank you for your valuable suggestions. We have provided detailed supplementary information in Appendix B of the article regarding the selection criteria and data processing methods for 100 coal mine roof accident reports. The selection criteria specify that only roof accident reports are included to avoid interference, covering major, relatively serious, and general accidents (accounting for 30% and 70% respectively). The data is sourced from official platforms, academic literature, and major accident news, with priority given to cases from the past 5 years and supplemented with cases from 2018 to 2022. The data processing is based on the 2-4 model to extract 25 core elements across 5 dimensions, constructing a structured matrix of 100 rows × 25 columns. The matrix is then split into a training set (for EM algorithm parameter learning) and a test set (for leave-one-out cross-validation) in an 8:2 ratio, providing a standardized reference for future research and facilitating result comparison and method reuse.]
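The 8:2 split of the 100 × 25 structured matrix described in this response can be sketched as follows. This is a minimal illustration only, not the authors' procedure: the matrix entries are random placeholders rather than real accident data, and the helper name `split_matrix`, the binary encoding, and the fixed seeds are assumptions introduced for the example.

```python
import random


def split_matrix(rows, train_ratio=0.8, seed=0):
    """Shuffle the rows and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = rows[:]          # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Placeholder 100 x 25 matrix: one row per accident report, one column per
# causal element. Values are random 0/1 flags, NOT real accident data.
_rng = random.Random(42)
matrix = [[_rng.randint(0, 1) for _ in range(25)] for _ in range(100)]

train, test = split_matrix(matrix, train_ratio=0.8)
print(len(train), len(test))
```

Fixing the seed keeps the partition reproducible, which is what makes a split of this kind usable as a reference for later comparison studies.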
Comments 5: [The authors mentioned certain limitations of the work, but it would be useful to expand the part about potential biases when choosing factors and about the possibilities of collecting better quality data in the future, such as sensors in real time. As a direction for future research, it is worth briefly demonstrating how the same framework could be adapted for other frequent mining incidents, such as gas explosions, floods, which would emphasize the breadth of the model.]
Response 5: [Thank you for your valuable suggestions and recommendations! Regarding the above issues, we will provide the following explanations. This study is based on 100 coal mine roof accident reports, which cover most of the causal dimensions. However, there are limitations such as the difficulty in fully adapting the samples to diverse coal mine scenarios, a bias towards severe accidents, and the potential for subjective bias in extracting implicit factors. Nevertheless, the framework constructed based on the 2-4 model, which involves "decomposition of five-dimensional factors → construction of a structured matrix → quantitative analysis using Bayesian networks," possesses general applicability and can be adapted to other coal mine accidents such as gas explosions and water disasters. By specifically adjusting the specific factors under the dimensions of "people, objects, individual abilities, management system, and safety culture," an exclusive causal matrix can be constructed, becoming a universal tool for causal analysis of multiple types of coal mine accidents. In the future, by obtaining high-quality data such as real-time sensors, we will address the deficiencies in sample and factor extraction, further enhancing the adaptability and accuracy of the framework across multiple accident types, and providing support for coal mine safety prevention and control.]
Comments 6: [There are technical errors in formatting, while the bibliography does not comply with the journal's requirements.]
Response 6: [Thank you for pointing this out. We agree with this comment. Therefore, we have modified the format of the references. This change can be found at lines 749-830 of the revised manuscript.]
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This article aims to define a scientific framework to evaluate the key factors of the roof accident and their dynamic changes and internal correlations. This framework acts as a decision support for roof accident prevention.
Topic is very interesting and less presented in existing literature. The research method is innovative, complex and is well described. It has high theoretical and practical implications.
Questions:
The authors mention a sample of 100 accident reports used for collecting data. Could they provide more information on this sample? (What reports? What accidents? Where from, when? Where the accidents took place etc …)
Line 359 -> inconsistent with line 465 (80 accident reports vs 100 accident reports)
- What is the main question addressed by the research?
This article aims to define a scientific framework to evaluate the key factors of the roof accident and their dynamic changes and internal correlations. This framework acts as a decision support for roof accident prevention.
- Do you consider the topic original or relevant to the field? Does it address a specific gap in the field? Please also explain why this is/ is not the case.
The topic is very interesting and less represented in the existing literature, with many implications for the practice of occupational safety in the coal mining field.
- What does it add to the subject area compared with other published material?
The research method is complex and innovative, as the authors are mentioning in lines 424-425: “This study proposes a novel framework for analyzing the causes of roof accidents by integrating an accident causation matrix with a Bayesian network model”. This is “overcoming the limitations of traditional static analysis methods” taking into account the interactions among the multidimensional factors.
- What specific improvements should the authors consider regarding the methodology?
The research method combines causation matrix construct and the Bayesian network model. The two methods-model comparative analysis and real case verification prove the effectiveness of this study.
- Are the conclusions consistent with the evidence and arguments presented and do they address the main question posed? Please also explain why this is/is not the case.
The results are in detail described. At the same time, the authors highlight the theoretical and empirical implications. The limitations of the study are mentioned.
- Are the references appropriate?
The authors are using relevant, updated literature references, pertaining to the investigated topic.
Author Response
Comments 1: [The authors mention a sample of 100 accident reports used for collecting data. Could they provide more information on this sample? (What reports? What accidents? Where from, when? Where the accidents took place etc …) . Line 359 -> inconsistent with line 465 (80 accident reports vs 100 accident reports)]
Response 1: [Thank you for your valuable feedback! In response to the above issues, we provide the following explanations. We have provided detailed selection criteria and data processing methods for the 100 coal mine roof accident reports in Appendix B of the article. The samples are all reports of coal mine roof accidents (excluding gas explosions, water disasters, and other types of accidents), covering major, relatively serious, and general accidents that occurred in coal mines of different regions (such as Guizhou, Shaanxi, Hebei), geological conditions, and mining scales (accounting for 30% and 70% respectively). The sources include reports released by official platforms such as the National Mining Safety Supervision Bureau and its local branches, the Coal Mine Safety Network, accident data from published academic dissertations and journal articles, and news reports on serious accidents. Priority is given to data from the past 5 years, supplemented with typical cases from 2018 to 2022. Regarding the number of accidents, we have clearly stated the relevant content in the "Model Training" section of Appendix B, which specifies a total sample size of 100, divided into 80 training samples and 20 test samples in an 8:2 ratio.]
Comments 2: [What is the main question addressed by the research? This article aims to define a scientific framework to evaluate the key factors of the roof accident and their dynamic changes and internal correlations. This framework acts as a decision support for roof accident prevention. Do you consider the topic original or relevant to the field? Does it address a specific gap in the field? Please also explain why this is/ is not the case.]
Response 2: [Thank you for your valuable suggestions! Regarding the above issues, we will provide the following explanations. Firstly, this study primarily explores a new analytical framework to address the deficiencies of existing methods in the multi-dimensional causal analysis of "human-object-management-individual ability-safety culture" in coal mine safety accident analysis. It forms an integrated analytical framework from qualitative feature description to quantitative probabilistic reasoning, enabling accident probability prediction and dynamic diagnosis of key risk factors, providing scientific decision support for roof accident prevention. Secondly, this study is original and relevant to the field. Specifically, it creatively proposes an accident causation matrix and integrates it with a Bayesian network model to form a new analytical framework. The proposal of the accident causation matrix transforms unstructured accident information into a comprehensive and distinctly characterized matrix form, providing precise discretized data for the Bayesian network model and realizing the connection between qualitative feature description and quantitative probability calculation of accident information.]
Comments 3: [Topic is very interesting and less presented in existing literature, with a lot of implication in practice of occupational safety in coal mining field. What does it add to the subject area compared with other published material?]
Response 3: [Thank you for your valuable question! Regarding the above issue, we will provide the following explanation. Compared with other published materials, firstly, the accident causation matrix proposed in this study adds a technical path for standardizing unstructured data to the discipline. Existing research often relies on a large amount of unstructured text information or structured information with less information content, resulting in a certain gap between the model and reality. This study, however, utilizes association rule algorithms to mine high-frequency associated factors in accident reports and maps them to five dimensions of the matrix in conjunction with the "Coal Mine Safety Regulations". Through discretization, it standardizes unstructured information, improves the utilization rate of real accident data, and provides a text data quantification and reuse scheme for subsequent similar research. Secondly, by constructing an accident causation matrix and a Bayesian network model, an analytical framework that precisely quantifies qualitative factors into quantitative dynamic reasoning is formed, providing the discipline with a complete analytical paradigm from causation factor representation to risk decision-making.]
Comments 4: [The research method is complex and innovative, as the authors are mentioning in lines 424-425: “This study proposes a novel framework for analyzing the causes of roof accidents by integrating an accident causation matrix with a Bayesian network model”. This is “overcoming the limitations of traditional static analysis methods” taking into account the interactions among the multidimensional factors. What specific improvements should the authors consider regarding the methodology? ]
Response 4: [Thank you for your valuable suggestions. Regarding the interactions among multiple factors, we have conducted a causal analysis, utilizing association rule algorithms and fault tree analysis methods, to establish a feature element library and an associated event library, in order to identify their correlations. Based on this, we have proceeded to model the accident causation matrix.]
Comments 5: [The authors mention a sample of 100 accident reports used for collecting data. Could they provide more information on this sample? (What reports? What accidents? Where from, when? Where the accidents took place etc …) . Line 359 -> inconsistent with line 465 (80 accident reports vs 100 accident reports) . Are the conclusions consistent with the evidence and arguments presented and do they address the main question posed? Please also explain why this is/is not the case. The results are in detail described. At the same time, the authors highlight the theoretical and empirical implications. The limitations of the study are mentioned. Are the references appropriate? The authors are using relevant, updated literature references, pertaining to the investigated topic.]
Response 5: [Thank you for your valuable suggestions and recommendations! In response to the above issues, we provide the following explanations. In the future, we will address the shortcomings of the current 100 coal mine roof accident reports, such as difficulty in adapting to diverse coal mine scenarios, sources biased towards major accidents, and subjective bias in extracting hidden factors. We will integrate real-time underground sensors, geological exploration data, and all types of accident reports to improve sample representativeness and objectivity in factor extraction. At the same time, the "five-dimensional decomposition → matrix construction → Bayesian quantification" framework based on the 2-4 model has general value: it can be adapted to other coal mine accidents such as gas explosions and water disasters. It can also be explored and extended to non-coal mines, and integrated with intelligent safety control platforms to form a "monitoring → prediction → prevention and control" loop, ultimately becoming a universal technical carrier for mine accident analysis and safety prevention and control, providing support for safe production.]
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The paper deals with the roof accidents' problem, which poses a significant threat to the safety of miners (especially in China). More in detail, this study creatively constructs an accident causation matrix and provides a new analysis framework, integrating the accident causation matrix and the Bayesian network model. The paper aims at giving a scientific basis for coal mining enterprises to formulate preventive measures.
The article is quite interesting, but there is the lack of a consistent international literature framework. Hence, I suggest revising the paper, according to the following points.
1) Section 1. Please rename it "Introduction and literature review" and try to enlarge it, by including new references on:
- the problems related to the coal mine industries (for example, you could cite some reports published by World Health Organization, International Labour Organization etc.);
- European strategy on the coal phase out;
- some statistics on different countries. Lines 36-45 are too focused on the Chinese situation;
- different accident analysis models. For instance, several scientific works have provided new evidence with ESAW data and established the use of clustering techniques for risk management in workplaces. You should cite some of them to establish a complete literature framework (after line 46 you could create some sub-sections to distinguish the different approaches to accident analysis).
2) Section 2. It is not clear the entire research approach. I suggest creating a flowchart to clarify such an aspect.
3) Section 4. The discussion should include some comparisons with existing and similar research works. Please address this point.
4) In the conclusions, a reference to Agenda 2030 SDGs related to this study could be really appreciated.
Good luck!
Author Response
Comments 1: [Section 1. Please rename it "Introduction and literature review" and try to enlarge it, by including new references on: the problems related to the coal mine industries (for example, you could cite some reports published by World Health Organization, International Labour Organization etc.); European strategy on the coal phase out; some statistics on different countries. Lines 36-45 are too much related on Chinese situation; different accident analysis models. For instance, several scientific works have provided new evidence with ESAW data and established the use of clustering techniques for risk management in workplaces. You should cite some of them to establish a complete literature framework (after line 46 you could create some sub-sections to distinguish the different approaches to accident analysis).]
Response 1: [Thank you for your valuable suggestions. Due to the lengthy nature of the literature review section, we have divided the "Introduction and Literature Review" into two chapters. The first chapter serves as the introduction, while the second chapter is dedicated to the literature review. Within this second chapter, we have further divided it into two sections based on different accident causation methodologies. Following your suggestion, we have incorporated relevant content into the introduction. Detailed modifications can be found in Section 1. Introduction and Section 2. Related works of the original manuscript. ]
Comments 2: [ Section 2. It is not clear the entire research approach. I suggest creating a flowchart to clarify such an aspect.]
Response 2: [Thank you for your valuable suggestions. To make the research methodology clearer, we have included a research roadmap for the novel framework of this study in the revised manuscript. Please refer to lines 71-73 for details. ]
Comments 3: [Section 4. The discussion should include some comparisons with existing and similar research works. Please address this point.]
Response 3: [Thank you for your valuable suggestions. The novel analytical framework of this study has been thoroughly discussed in terms of its advantages and disadvantages. The analysis of similar research work conducted in the review section highlights the strengths of this new framework, which addresses the deficiencies found in other studies. I hope this response can gain your support. For instance, the new analytical framework addresses the shortcomings of existing methods in coal mine safety accident analysis regarding multi-dimensional causation analysis, encompassing "human-object-management-individual ability-safety culture". It forms an integrated analytical framework that moves from qualitative feature description to quantitative probabilistic reasoning, enabling accident probability prediction and dynamic diagnosis of key risk factors. This provides scientific decision support for roof accident prevention, which is precisely the issue that similar methods have failed to address. ]
Comments 4: [In the conclusions, a reference to Agenda 2030 SDGs related to this study could be really appreciated.]
Response 4: [Thank you for your valuable suggestions. They have been added to the original text. The specific content is as follows: Through technological innovation and management optimization, this study provides a theoretical basis for Goal 9 of the 2030 Agenda for Sustainable Development [47]—"Build resilient infrastructure, promote inclusive and sustainable industrialization, and foster innovation." The reference has been provided.]
Author Response File:
Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
Dear Authors!
The manuscript under review presents a well-structured and methodologically mature study dedicated to the analysis of roof accident causation in coal mining through the integration of an accident causation matrix and Bayesian network modeling. The topic is highly relevant and timely, given the growing demand for advanced probabilistic approaches in industrial safety management. The authors have chosen an important and complex research problem, one that combines traditional accident causation theory with contemporary methods of probabilistic reasoning and data analytics.
The paper demonstrates a clear understanding of the limitations of conventional techniques such as fault-tree analysis and attempts to overcome them by integrating the descriptive power of the causation matrix with the inferential capacity of Bayesian networks. This synthesis is both conceptually and technically significant. The proposed framework enables a transition from static, qualitative analysis to dynamic, quantitative reasoning, thereby contributing to a more precise and proactive approach to accident prevention. The study also reflects a solid theoretical foundation built upon the 2-4 Model of accident causation and the methodological use of association-rule mining for the identification of key causal factors.
From a methodological standpoint, the paper is comprehensive and logically coherent. The authors employ a stepwise procedure: constructing the causation matrix based on multi-dimensional accident factors, transforming the fault tree into a Bayesian network, and conducting parameter learning using the Expectation-Maximization algorithm. The approach is appropriate, and the application of the GeNIe tool for model training is well justified. The comparative evaluation against Random Forest and Logistic Regression models adds credibility to the findings, confirming that the proposed model achieves higher predictive accuracy and interpretability in complex, multi-factor accident scenarios.
Nevertheless, certain aspects of the methodology would benefit from additional clarification. The rationale for selecting a minimum support threshold of 0.2 in the association-rule algorithm should be explicitly explained, as this parameter directly influences the completeness and robustness of the causation factors extracted from historical data. The description of variable discretization, especially for continuous indicators such as age, work experience, and material quality, remains somewhat general and could be presented with more precision. Furthermore, while the authors correctly emphasize the model’s prediction accuracy, they could strengthen the empirical section by providing additional validation metrics—such as precision, recall, F1-score, or ROC-AUC—to quantify its performance more comprehensively.
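For reference, the additional validation metrics named above (precision, recall, F1-score) can be computed from predicted and actual labels as in the sketch below. The label vectors are invented toy values, not results from the manuscript, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels for illustration only (1 = accident occurred, 0 = no accident).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

precision, recall, f1 = binary_metrics(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Reporting these alongside accuracy matters on small samples such as a 20-report test set, where a single misclassified case shifts each metric noticeably.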
The dataset consisting of 100 accident reports provides an informative empirical basis but also introduces a limitation regarding the model’s generalizability. The authors should acknowledge this constraint more explicitly and indicate how future research could address it, for instance by expanding the dataset or incorporating real-time sensor and monitoring data. The discussion of sensitivity analysis is insightful, and the identification of key influencing factors such as poor quality of support materials, short work experience, and insufficient support structures is convincing. Yet, it would further strengthen the argument to relate these findings to broader industrial safety contexts—such as applications in transportation, construction, or chemical process management—to demonstrate the potential transferability of the proposed framework.
In terms of structure and presentation, the manuscript is clearly written, with a logical flow between sections. The introduction provides a solid literature background, the methodology is well organized, and the results are detailed and informative. However, Sections 2.1 and 2.2 are somewhat overloaded with technical descriptions that could be summarized or partially replaced by schematic diagrams to improve readability. The figures are informative but would benefit from improved resolution and more descriptive captions. The English language is overall clear and professional, although minor stylistic and grammatical polishing is advisable before final publication.
Overall, this manuscript represents a substantial and well-executed contribution to the field of safety science and probabilistic risk analysis. It successfully bridges qualitative causation theory with quantitative modeling techniques and provides a replicable methodological framework that could be extended to other domains of industrial safety. The study combines academic rigor with practical applicability, demonstrating both theoretical novelty and operational value.
In conclusion, I find the paper to be scientifically sound, methodologically well founded, and practically meaningful. The identified issues are minor and concern mainly the presentation of methodological details and the completeness of statistical validation.
Comments on the Quality of English Language
The English language is overall clear and professional, although minor stylistic and grammatical polishing is advisable before final publication.
Author Response
Comments 1: [Nevertheless, certain aspects of the methodology would benefit from additional clarification. The rationale for selecting a minimum support threshold of 0.2 in the association-rule algorithm should be explicitly explained, as this parameter directly influences the completeness and robustness of the causation factors extracted from historical data. The description of variable discretization, especially for continuous indicators such as age, work experience, and material quality, remains somewhat general and could be presented with more precision. Furthermore, while the authors correctly emphasize the model’s prediction accuracy, they could strengthen the empirical section by providing additional validation metrics—such as precision, recall, F1-score, or ROC-AUC—to quantify its performance more comprehensively.]
Response 1: [Thank you for your valuable suggestions. The reason for choosing 0.2 as the minimum support threshold in the association rule algorithm in this study is that when the minimum support is greater than 0.2, some valuable information will be lost. However, when the minimum support is less than 0.2, it will be impossible to truly identify key causal factors with high frequency of occurrence. Therefore, by setting the minimum support to 0.2, this study balances the ubiquity and rarity of itemsets in the data. In subsequent experiments, we have added content on precision, recall, and F1 score. Please refer to the experimental section in the revised version. Thank you very much again. ]
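The effect of a minimum support threshold of 0.2 can be illustrated with a small sketch. It is not the authors' association rule implementation: it uses a brute-force enumeration of itemsets rather than a specific mining algorithm, the ten toy "reports" are invented, the function name `frequent_itemsets` is hypothetical, and the factor labels are shorthand for factors mentioned in this review record (plus one hypothetical factor, `fatigue`).

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support=0.2, max_size=2):
    """Return {itemset: support} for itemsets whose support >= min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            # Support = share of reports that contain every item in the itemset.
            count = sum(1 for t in transactions if set(combo) <= t)
            support = count / n
            if support >= min_support:
                result[combo] = support
    return result


# Ten invented reports; each set lists the causal factors flagged in one report.
reports = [
    {"poor_support_material", "short_experience"},
    {"poor_support_material"},
    {"short_experience", "insufficient_support"},
    {"poor_support_material", "short_experience"},
    {"insufficient_support"},
    {"poor_support_material", "insufficient_support"},
    {"short_experience"},
    {"poor_support_material", "short_experience", "fatigue"},
    {"insufficient_support"},
    {"poor_support_material"},
]

frequent = frequent_itemsets(reports, min_support=0.2)
for itemset, support in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(itemset, round(support, 2))
```

In this toy data, `fatigue` appears in only one of ten reports and falls below the threshold, while the pair of poor support material and short experience is retained. Rerunning with a higher or lower `min_support` shows the trade-off the response describes.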
Comments 2: [The dataset consisting of 100 accident reports provides an informative empirical basis but also introduces a limitation regarding the model’s generalizability. The authors should acknowledge this constraint more explicitly and indicate how future research could address it, for instance by expanding the dataset or incorporating real-time sensor and monitoring data. The discussion of sensitivity analysis is insightful, and the identification of key influencing factors such as poor quality of support materials, short work experience, and insufficient support structures is convincing. Yet, it would further strengthen the argument to relate these findings to broader industrial safety contexts—such as applications in transportation, construction, or chemical process management—to demonstrate the potential transferability of the proposed framework.]
Response 2: [Thank you very much for your suggestions, and we highly appreciate your input. The construction principle of our new model is to support good scalability, and we will spare no effort to apply it to production practice as soon as possible. On the other hand, in terms of data dimensions, we will gradually improve and optimize the model during production events, providing support for the actual mining production and even other industries in the field of safety. ]
Comments 3: [In terms of structure and presentation, the manuscript is clearly written, with a logical flow between sections. The introduction provides a solid literature background, the methodology is well organized, and the results are detailed and informative. However, Sections 2.1 and 2.2 are somewhat overloaded with technical descriptions that could be summarized or partially replaced by schematic diagrams to improve readability. The figures are informative but would benefit from improved resolution and more descriptive captions. The English language is overall clear and professional, although minor stylistic and grammatical polishing is advisable before final publication.]
Response 3: [We highly appreciate your comments. We have added a research roadmap for the new framework to the manuscript, which makes the methodological path clearer. Regarding the overly lengthy technical descriptions, we have worked to refine the wording and enrich the charts. Overall, we have polished both the style and the charts. We are very grateful for the expert's valuable suggestions.]
Author Response File:
Author Response.pdf
Reviewer 5 Report
Comments and Suggestions for Authors
1. In this paper, causal analysis of the roof accident occurrence process is used to creatively construct an accident causality matrix to obtain the characteristic description of the causes of accidents.
2. This paper establishes a new analysis framework that integrates the accident causality matrix and the Bayesian network model.
3. The accident analysis process is based on the theory of the 2-4 causality model and the association rule algorithm. Key accident factors and their internal correlations are identified, and the accident causality matrix is constructed.
4. The fault tree is transformed into a Bayesian network model; however, the goodness of fit and predictive capacity of the model need to be discussed.
5. The accident causality matrix is used for learning and parameter optimization; however, it is useful to clarify and discuss the statistical criteria that guarantee optimal parameter values.
6. The references are appropriate to the context of the phenomenon studied.
7. It is necessary to mention the scenarios in which the accident causality matrix can effectively characterize accident causal factors.
Author Response
Comments 1: [The fault tree is transformed into a Bayesian network model; however, the goodness of fit and predictive capacity of the model need to be discussed.]
Response 1: [Thank you very much for your valuable suggestions. Regarding the goodness of fit and predictive ability of the model in this study, we provide the following explanations in terms of methodology and experimentation.
(1) Goodness-of-fit evaluation: During the Bayesian network parameter learning stage, the Expectation-Maximization (EM) algorithm iteratively optimizes the model parameters, and the GeNIe tool outputs the goodness-of-fit index log(p). This index directly reflects how well the model fits the training data: the closer log(p) is to 0, the better the fit. In this study, using the causation matrix data constructed from 100 roof accident investigation reports, the model's log(p) value after EM convergence falls within a reasonable range, which indicates that the Bayesian network structure transformed from the fault tree matches the inherent patterns of the accident causation data well, without obvious overfitting or underfitting.
(2) Predictive ability verification: To objectively evaluate the predictive performance of the model, we adopted the leave-one-out cross-validation method and compared our Bayesian network model with two commonly used machine learning models: random forest and logistic regression. The experimental results showed that the average predictive accuracy of our model for all nodes reached 0.792, which was superior to the comparative models (random forest: 0.706, logistic regression: 0.717). This result indicates that the Bayesian network transformed from the fault tree can more accurately capture the complex causal relationships among the causal factors of roof accidents, and its predictive ability is superior to traditional machine learning models, especially when dealing with accident systems involving dynamic interactions of multiple factors.]
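The leave-one-out procedure mentioned in this response can be sketched as follows. This is only an illustration of the validation loop: the lookup-table classifier, the function names, and the eight toy samples are assumptions standing in for the Bayesian network, random forest, and logistic regression models, none of which are reproduced here.

```python
from collections import Counter

def leave_one_out_accuracy(rows, labels, fit, predict):
    """Hold out each sample once, train on the rest, score the held-out one."""
    hits = 0
    for i in range(len(rows)):
        train_x = rows[:i] + rows[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        hits += predict(model, rows[i]) == labels[i]
    return hits / len(rows)

# Stand-in model (an assumption, not the paper's Bayesian network):
# predict the most common label seen in training for the same feature row.
def fit_lookup(xs, ys):
    table = {}
    for x, y in zip(xs, ys):
        table.setdefault(x, Counter())[y] += 1
    default = Counter(ys).most_common(1)[0][0]
    return table, default

def predict_lookup(model, x):
    table, default = model
    return table[x].most_common(1)[0][0] if x in table else default

# Hypothetical toy data: one binary feature, one binary outcome.
rows = [(1,), (1,), (1,), (1,), (0,), (0,), (0,), (0,)]
labels = [1, 1, 1, 0, 0, 0, 0, 1]

accuracy = leave_one_out_accuracy(rows, labels, fit_lookup, predict_lookup)
print(f"leave-one-out accuracy: {accuracy:.3f}")
```

Because `fit` and `predict` are passed in as callables, the same loop could score several models on identical held-out samples, which is what makes a comparison of accuracies meaningful.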
Comments 2: [The accident causality matrix is used for learning and parameter optimization; however, it is useful to clarify and discuss the statistical criteria that guarantee optimal parameter values.]
Response 2: [Thank you very much for your valuable suggestions. Regarding the accident causation matrix and the optimization of parameter values, we respond as follows.
On the one hand, when constructing the accident causation matrix, key causal factors are first filtered with the association rule algorithm, using a minimum support threshold of 0.2. Factors whose support exceeds 0.2 (such as the quality of support materials, years of service, and safety awareness) are included in the matrix. This keeps the matrix data effective and concise: an excessively high threshold would lose key lower-frequency factors such as "untimely support", while an excessively low threshold would introduce meaningless redundant items. On the other hand, this study uses the Expectation-Maximization (EM) algorithm for parameter learning, which maximizes the log-likelihood of the observed data. EM is a widely recognized standard method in statistics for estimating parameters in models with latent variables or incomplete data. As the convergence criterion, the algorithm alternates the E-step (expectation) and M-step (maximization) until the model's log-likelihood converges, that is, until the change in log-likelihood between two consecutive iterations is less than a preset threshold, which ensures that the obtained parameters are a locally optimal solution. The parameter learning process is based on the accident causation matrix constructed from 100 real accident reports, which provides sufficient data support.]
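The EM stopping rule described in this response (stop once the log-likelihood changes by less than a preset threshold) can be illustrated on a deliberately small model. The sketch below is not the GeNIe implementation and not the paper's Bayesian network: it assumes a toy mixture of two biased coins with a latent coin choice per trial, and the function name, starting values, tolerance, and data are all illustrative assumptions.

```python
import math

def em_two_coins(heads, flips, theta=(0.6, 0.5), tol=1e-6, max_iter=500):
    """EM for an equal-weight mixture of two coins; the coin per trial is latent.

    Stops when the log-likelihood gain between consecutive iterations
    falls below tol. Binomial coefficients are omitted (constant in theta).
    """
    a, b = theta
    prev_ll = -math.inf
    history = []
    for _ in range(max_iter):
        # E-step: responsibility of coin A for each trial, and the log-likelihood.
        ll = 0.0
        resp = []
        for h in heads:
            la = a ** h * (1 - a) ** (flips - h)
            lb = b ** h * (1 - b) ** (flips - h)
            resp.append(la / (la + lb))
            ll += math.log(0.5 * la + 0.5 * lb)
        history.append(ll)
        if ll - prev_ll < tol:  # convergence criterion
            break
        prev_ll = ll
        # M-step: responsibility-weighted head frequencies.
        wa = sum(resp)
        wb = len(heads) - wa
        a = sum(r * h for r, h in zip(resp, heads)) / (flips * wa)
        b = sum((1 - r) * h for r, h in zip(resp, heads)) / (flips * wb)
    return (a, b), history

# Hypothetical data: heads observed in five trials of ten flips each.
(theta_a, theta_b), history = em_two_coins([5, 9, 8, 4, 7], flips=10)
print(f"iterations: {len(history)}, final log-likelihood: {history[-1]:.4f}")
```

The recorded `history` never decreases from one iteration to the next, which is the property that makes a log-likelihood difference a sensible stopping test; as the response notes, the result is a local optimum and may depend on the starting values.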
Comments 3: [It is necessary to mention the scenarios in which the accident causality matrix can effectively characterize accident causal factors.]
Response 3: [Thank you very much for your suggestion. The accident causation matrix is particularly suitable for analyzing complex system accidents with multidimensional and hierarchical causation characteristics. For instance, when causal factors must be systematically extracted, summarized, and standardized from a large number of accident reports (such as the 100 roof accident reports used in this study), the matrix provides a structured feature extraction framework that transforms scattered text information into structured data usable in quantitative models. Its most advantageous scenarios include:
Scenario 1: Moving from "qualitative attribution" to the construction of quantitative models.
Scenario 2: Systematic integration and visualization of multi-dimensional and cross-level causal factors.
Scenario 3: Dynamic Diagnosis and Mining of Key Causal Pathways.]
Author Response File:
Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
Thank you for your efforts.
No other suggestions.

