Intelligent Risk Identification in Construction Projects: A Case Study of an AI-Based Framework
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Before publication, the authors need to address the following comments carefully:
- Please state the novelty more explicitly in the Introduction (2–3 sentences). Clearly explain what is new compared with prior “AI in risk management” reviews—for example, the phase-based prompting aligned with PM², the cross-model comparison, and the proposed four-layer validation framework.
- Please improve the consistency of the risk categorization. Some category labels overlap or mix concepts (e.g., “Financial/Political Risk” vs. “Political/Regulatory Risk”). Define a fixed taxonomy and map all identified risks to that taxonomy in a consistent manner.
- Please avoid overly strong claims such as “no bias” or “reliable AI.” Add a short subsection discussing common risks of LLM-based analysis (e.g., hallucination, omission, and framing bias) and explain how your framework mitigates them (e.g., grounding outputs in document evidence and applying human expert review).
- Please expand the Future Work section with concrete next steps, such as moving from risk identification to risk prioritization (likelihood × impact), linking outputs to PM² Risk Log fields, and testing the framework on additional projects (different types and countries) to demonstrate generalizability.
- The Introduction currently focuses mainly on LLM-based risk identification and overlooks other AI techniques that have been widely applied in construction projects (e.g., computer vision and classification-based analytics). Consider adding a brief statement (e.g., after line 86) noting that construction applications span multiple AI paradigms across project phases. Please support these prior studies with citations and short examples:
1) Computer vision for construction monitoring (rebar counting): “Effectiveness of traditional augmentation methods for rebar counting using UAV imagery with Faster R-CNN and YOLOv10-based transformer architectures”.
2) Classification-based AFDD for building operation/management: “Real operational labeled data of air handling units from office, auditorium, and hospital buildings”.
Author Response
Comments 1: Please state the novelty more explicitly in the Introduction (2–3 sentences). Clearly explain what is new compared with prior “AI in risk management” reviews—for example, the phase-based prompting aligned with PM², the cross-model comparison, and the proposed four-layer validation framework.
Response 1: We have revised the Introduction to more clearly and explicitly state the novelty of the study. The added sentences (starting with line 133) highlight how the proposed framework differs from prior AI-in-construction risk research by emphasizing its design, lifecycle alignment, cross-model evaluation, and structured validation approach. This ensures that the original contribution of the work is immediately evident to the reader.
Comments 2: Please improve the consistency of the risk categorization. Some category labels overlap or mix concepts (e.g., “Financial/Political Risk” vs. “Political/Regulatory Risk”). Define a fixed taxonomy and map all identified risks to that taxonomy in a consistent manner.
Response 2: Thank you for highlighting this issue. We have revised the manuscript to ensure that risk category labels are applied consistently throughout. Mixed labels have been harmonized by adopting a fixed taxonomy, and all identified risks have been mapped to this taxonomy to improve clarity and comparability.
Comments 3: Please avoid overly strong claims such as “no bias” or “reliable AI.” Add a short subsection discussing common risks of LLM-based analysis (e.g., hallucination, omission, and framing bias) and explain how your framework mitigates them (e.g., grounding outputs in document evidence and applying human expert review).
Response 3: We thank the reviewer for this important observation. While the manuscript already discussed several limitations of AI-assisted risk identification and did not make many overly strong claims, we agree that this could be stated more clearly. Therefore, the last paragraph of the Discussion section has been updated to more explicitly acknowledge these considerations.
Comments 4: Please expand the Future Work section with concrete next steps, such as moving from risk identification to risk prioritization (likelihood × impact), linking outputs to PM² Risk Log fields, and testing the framework on additional projects (different types and countries) to demonstrate generalizability.
Response 4: The Future Work section already mentioned the majority of the suggested next steps. To provide additional clarification, one sentence was added to explicitly address the linkage of AI-generated outputs with standard PM² Risk Log fields.
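The prioritization step named in Comments 4 (likelihood × impact) can be illustrated with a minimal sketch. The 1 to 5 rating scales, the function names, and the tuple layout are assumptions made for illustration only; neither the review nor the response specifies a scoring scheme.

```python
def risk_score(likelihood: int, impact: int) -> int:
    """Return likelihood x impact, assuming both are rated on a 1-5 scale."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("likelihood and impact must be integers from 1 to 5")
    return likelihood * impact


def prioritize(risks):
    """Sort (code, likelihood, impact) tuples from highest to lowest score."""
    return sorted(risks, key=lambda r: risk_score(r[1], r[2]), reverse=True)
```

Under these assumptions, each identified risk would first need expert-assigned likelihood and impact ratings before the ranking could be computed.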
Comments 5: The Introduction currently focuses mainly on LLM-based risk identification and overlooks other AI techniques that have been widely applied in construction projects (e.g., computer vision and classification-based analytics). Consider adding a brief statement (e.g., after line 86) noting that construction applications span multiple AI paradigms across project phases. Please support these prior studies with citations and short examples:
1) Computer vision for construction monitoring (rebar counting): “Effectiveness of traditional augmentation methods for rebar counting using UAV imagery with Faster R-CNN and YOLOv10-based transformer architectures”.
2) Classification-based AFDD for building operation/management: “Real operational labeled data of air handling units from office, auditorium, and hospital buildings”.
Response 5: We were not entirely certain how this comment aligns with the primary focus of the manuscript. Nevertheless, to acknowledge the broader landscape of AI applications in construction and to strengthen the contextual framing, we have added a brief extension grounded in the existing review reference [10] (see line 86).
Reviewer 2 Report
Comments and Suggestions for Authors
A few comments to address:
1- A deeper analytical comparison of the outputs from the different AI systems, covering their differences, strengths, and limitations, is missing.
2- The paper needs a clear explanation of how AI outputs were validated, filtered, or assessed against expert judgment.
3- The appendices are somewhat detailed and repetitive across AI models; consolidation and synthesis may be needed to improve readability and academic focus.
Author Response
Comments 1: A deeper analytical comparison of the outputs from the different AI systems, covering their differences, strengths, and limitations, is missing.
Response 1: We acknowledge the value of deeper analytical benchmarking between individual AI systems; however, the primary objective of this study is to examine the feasibility of AI-assisted risk identification rather than to perform a comparative performance evaluation of specific models. We therefore focused the analysis on the most relevant distinguishing characteristics, which are summarized in Table 6 and are consistent with key concepts reported in prior literature and reflected in expert feedback. We believe that introducing a more detailed model-by-model benchmarking analysis would detract from the central research aim and risk shifting the manuscript toward a model comparison study, which is beyond the intended scope of this work.
Comments 2: The paper needs a clear explanation of how AI outputs were validated, filtered, or assessed against expert judgment.
Response 2: Thank you for this suggestion. Section 2.4 (Validation) has been expanded to make the methodological process more transparent with respect to how AI outputs were validated and assessed against expert judgment (see line 280). The assessment is further displayed in the Discussion section. In addition, the filtration logic is described in the original text in Section 2.3 (Research Design, starting at line 233), which we believe adequately explains how and why the filtration of AI outputs was handled.
Comments 3: The appendices are a bit detailed and repetitive across AI models; consolidation and synthesis may be needed to improve readability and academic focus.
Response 3: We agree that the appendices are detailed and contain some repetition across AI models. However, this was a deliberate methodological choice. In practice, users cannot fully control how different general AI systems generate outputs, and meaningful comparison therefore requires preserving model-specific structures and content. To balance readability and transparency, the manuscript is organized in a layered manner: the main body (Results section) presents only the most distinctive synthesized characteristics, the appendices contain more detailed AI-generated explanatory outputs, and the Supplementary Materials provide the least readable yet most comprehensive raw risk identification tables.
Reviewer 3 Report
Comments and Suggestions for Authors
Reviewer Report:
The artificial intelligence-based risk identification approach presented in the manuscript is timely and valuable in terms of both topic selection and application domain; however, providing certain sections in greater detail and with clearer explanations would strengthen the scientific rigor and reproducibility of the study.
First, the operational conditions under which the artificial intelligence models were executed should be clarified. The specific model versions used, the dates of use, and the technical parameters applied should be explicitly reported. In addition, it should be explained whether the same prompt was executed multiple times and how the outputs were selected or aggregated.
Relatedly, a more detailed presentation of the prompt structure used for risk generation is recommended. Clarifying how the concept of “risk” was elicited from the model, which components were expected to be included in a risk description, and the intended output format would help distinguish whether the generated statements constitute formal risk definitions or rather general problem descriptions or recommendation lists.
The scope and characteristics of the project documentation used in the study also require further elaboration. Specifying which types of documents were analysed, their total volume, temporal coverage, and content types would make the methodological boundaries of the approach more transparent. Moreover, it should be clearly stated whether any preprocessing steps were applied before submitting the documents to the AI models, including language normalization, confidentiality handling, or content filtering.
Further clarification of the operational workflow of the proposed AI-based framework, which is presented as the main contribution of the paper, would enhance its overall impact. The manuscript should clearly explain how the framework operates step by step, how inputs are transformed into outputs, and at which stages human intervention is required.
The manner in which the generated risks are intended to be used in project management practice should also be addressed in greater detail. Beyond listing risks, the manuscript should explain how these outputs contribute to decision-making processes, how risks are prioritized, who is responsible for monitoring them, and at which stages they are updated.
Finally, providing additional information on the scalability, cost implications, and practical feasibility of the proposed method would be beneficial. The manuscript should discuss how the approach performs as the volume of project documentation increases, what types of time and cost requirements are involved, and how such a system could be sustainably implemented in a real construction site environment. In this context, it is also recommended that the measures taken to ensure the confidentiality of project documents and the ethical use of AI tools be clearly stated.
Author Response
Comments 1: First, the operational conditions under which the artificial intelligence models were executed should be clarified. The specific model versions used, the dates of use, and the technical parameters applied should be explicitly reported. In addition, it should be explained whether the same prompt was executed multiple times and how the outputs were selected or aggregated.
Response 1: The manuscript reports the most granular information available to end users when accessing these systems through standard web interfaces, including the service provider and the aliased model names. Specific version identifiers, backend snapshots, and technical parameters such as temperature, top-p, or random seeds are not disclosed to general users by providers such as OpenAI, Anthropic, or Google and are only accessible via developer-tier API environments. These parameters are therefore outside the control and visibility of the researcher when using consumer-facing interfaces. In addition, the paper is intentionally aimed at practitioners without a software background who rely on commercially available, off-the-shelf models through standard interfaces. Finally, as noted starting at line 203, no domain-specific training or fine-tuning was applied to any model; this approach was selected to reflect a realistic application scenario in which industry professionals use general-purpose AI tools without specialized machine learning expertise or access to construction-specific datasets.
With respect to prompt execution, the same prompt structure was applied across all AI models to ensure comparability. We believe that this is clearly presented in the Results section. Section 2.3 (Research Design, starting at line 219) provides further detail on this prompting process.
Comments 2: Relatedly, a more detailed presentation of the prompt structure used for risk generation is recommended. Clarifying how the concept of “risk” was elicited from the model, which components were expected to be included in a risk description, and the intended output format would help distinguish whether the generated statements constitute formal risk definitions or rather general problem descriptions or recommendation lists.
Response 2: As partly explained in the previous response, we believe that the explanation of the prompt structure is already sufficient. The concept of “risk” was explicitly grounded in, and defined according to, the selected project management methodology (PM²), and the entire prompting strategy was designed to align with this framework. The components expected to be included in each risk entry were clearly specified within the prompts through the definition of the risk identification table, whose columns correspond to PM² Risk Log prescriptions.
In other words, the prompts contain this: “The table should have 7 columns: Code, Risk Source, Cause/Driver, Risk (delivery, event, occurrence), Affected Area, Risk Category, and Risk Bearer.”
By grounding the prompts in an established methodology and enforcing a structured cause–risk–effect format (see line 226), the framework ensures that the generated statements constitute formal risk definitions rather than general problem descriptions or recommendation lists.
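The seven-column table quoted from the prompts can be sketched as a simple record type with a structural check. This is only an illustration: the column labels are taken from the quoted prompt, while the class name, field names, and the `parse_row` helper are hypothetical and are not part of the manuscript or of PM².

```python
from dataclasses import dataclass, fields

# Column labels exactly as quoted from the prompt.
PROMPT_COLUMNS = [
    "Code",
    "Risk Source",
    "Cause/Driver",
    "Risk (delivery, event, occurrence)",
    "Affected Area",
    "Risk Category",
    "Risk Bearer",
]


@dataclass
class RiskEntry:
    """One row of the risk identification table (hypothetical field names)."""
    code: str
    risk_source: str
    cause_driver: str
    risk: str
    affected_area: str
    risk_category: str
    risk_bearer: str


def parse_row(cells):
    """Build a RiskEntry from one table row; reject incomplete rows."""
    expected = len(fields(RiskEntry))
    if len(cells) != expected:
        raise ValueError(f"expected {expected} cells, got {len(cells)}")
    cleaned = [cell.strip() for cell in cells]
    if any(not cell for cell in cleaned):
        raise ValueError("every column must be filled in")
    return RiskEntry(*cleaned)
```

A check of this kind only confirms that an AI-generated row has the requested shape; whether its content is a well-formed risk statement would still depend on expert review.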
Comments 3: The scope and characteristics of the project documentation used in the study also require further elaboration. Specifying which types of documents were analysed, their total volume, temporal coverage, and content types would make the methodological boundaries of the approach more transparent. Moreover, it should be clearly stated whether any preprocessing steps were applied before submitting the documents to the AI models, including language normalization, confidentiality handling, or content filtering.
Response 3: We agree that this aspect requires further clarification. Accordingly, Section 2.1 (Data Collection and Project Selection) has been extended to provide more detailed information on the type, scope, and content of the official technical specification document used in the analysis (see line 158). In addition, Section 2.3 (Research Design, see line 211) was expanded to clearly state that no preprocessing steps were applied before submitting the documents to the AI models.
Comments 4: Further clarification of the operational workflow of the proposed AI-based framework, which is presented as the main contribution of the paper, would enhance its overall impact. The manuscript should clearly explain how the framework operates step by step, how inputs are transformed into outputs, and at which stages human intervention is required.
Response 4: We appreciate this comment and, accordingly, have added a new summarizing paragraph at the end of Section 2.3 (Research Design) that explicitly describes the step-by-step operation of the proposed AI-based framework.
Comments 5: The manner in which the generated risks are intended to be used in project management practice should also be addressed in greater detail. Beyond listing risks, the manuscript should explain how these outputs contribute to decision-making processes, how risks are prioritized, who is responsible for monitoring them, and at which stages they are updated.
Response 5: We acknowledge that, while the practical use of AI-generated risks was discussed throughout the Discussion section, these aspects were not previously stated explicitly and may therefore not have been clear to readers. To address this, we have expanded the Discussion section (see line 664) to clearly explain how the generated risks are intended to support established project management practices.
Comments 6: Finally, providing additional information on the scalability, cost implications, and practical feasibility of the proposed method would be beneficial. The manuscript should discuss how the approach performs as the volume of project documentation increases, what types of time and cost requirements are involved, and how such a system could be sustainably implemented in a real construction site environment. In this context, it is also recommended that the measures taken to ensure the confidentiality of project documents and the ethical use of AI tools be clearly stated.
Response 6: Aspects such as scalability, cost implications, and sustainable implementation can only be discussed at an indicative level within the scope of this study, which represents initial findings on a single case project. These topics are therefore framed as areas for further investigation rather than empirically validated outcomes of the present work. Accordingly, we have expanded the Conclusions section to outline the suggested practical implications, limitations, and future research directions related to these aspects.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
No comment.
Reviewer 3 Report
Comments and Suggestions for Authors
Reviewer Report:
The revised manuscript submitted by the authors has been carefully reviewed. It is observed that the authors have diligently addressed the comments and suggestions provided during the previous review process and have made the necessary revisions.

