Article
Peer-Review Record

A Comprehensive Analysis of Safety Failures in Autonomous Driving Using Hybrid Swiss Cheese and SHELL Approach

Future Transp. 2026, 6(1), 21; https://doi.org/10.3390/futuretransp6010021
by Benedictus Rahardjo 1,*, Samuel Trinata Winnyarto 2, Firda Nur Rizkiani 2 and Taufiq Maulana Firdaus 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 2 December 2025 / Revised: 13 January 2026 / Accepted: 13 January 2026 / Published: 15 January 2026

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper aims to systematically identify potential failures in AVs by integrating the Swiss Cheese and SHELL models. The proposed hybrid model appears novel and offers a valuable, interdisciplinary framework for AD safety analysis.

 

1. Fig. 3 presents the proposed hybrid Swiss Cheese and SHELL approach, but there is no detailed description of how the two models are integrated, such as how many layers were selected for the combined cross-layer analysis. I suggest the authors describe the hybrid model in more detail.

2. Fig. 3 and Fig. 6 have the same caption.

3. This paper seems to be a survey on AV safety failures, so the references should be updated. Most of the references were published before 2023, and too few are from 2023-2025. This limits the paper’s relevance to emerging challenges, such as AI-related methods.

4. In addition, most of the references are cited from Trans. Part A, B, and C; other journals should also be considered, such as IEEE TITS, IEEE TVT, and IEEE TIV.

Author Response

Dear Reviewer:

We would like to thank you for your helpful comments and encouragement, which improved the manuscript. In this revised manuscript, we have carefully considered your comments and suggestions. The review comments were immensely beneficial and helped us improve the quality of the paper and its presentation. The revisions corresponding to the comments and suggestions are highlighted in color in the revised manuscript. We also present our response to each point below.

This paper aims to systematically identify potential failures in AVs by integrating the Swiss Cheese and SHELL models. The proposed hybrid model appears novel and offers a valuable, interdisciplinary framework for AD safety analysis.

 

Response: Thank you for your careful reading of the manuscript, and we sincerely appreciate your very positive feedback. We are happy to know that you and the other reviewer have found the paper interesting. We are grateful for your constructive suggestions, which immensely helped us in improving the paper. We are happy to inform you that we have considered all the comments and suggestions. Please see below for the detailed response. We have duly incorporated the changes in the manuscript.

 

 

Point 1: Fig. 3 presents the proposed hybrid Swiss Cheese and SHELL approach, but there is no detailed description of how the two models are integrated, such as how many layers were selected for the combined cross-layer analysis. I suggest the authors describe the hybrid model in more detail.

 

Response 1: Thank you for your comment; we agree with your suggestion and have revised the manuscript as follows: “To strengthen the explanation of how these models are combined, this study explicitly integrates five Swiss Cheese layers (governance and policy, perception, planning and decision-making, control and actuation, and human-related operational roles) with four SHELL interfaces: Liveware-Software, Liveware-Hardware, Liveware-Environment, and Liveware-Liveware. A cross-layer mapping approach was applied, in which failures observed in real-world AV incident data were categorized by both system layer and human-system interaction interface. For instance, perception failures were linked with L-H issues such as sensor degradation or miscalibration, while communication breakdowns between driver and system were associated with L-S challenges such as low-salience alerts or mode confusion. This mapping reveals how concurrent weaknesses across systemic layers and human-related interactions can align and result in safety incidents. By combining these models with Endsley’s situational awareness framework, the analysis further incorporates cognitive dimensions such as the operator’s ability to perceive, comprehend, and project risk in real time.”
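The cross-layer mapping described in this response can be pictured as a small coding scheme in which each observed failure is tagged with one Swiss Cheese layer and one SHELL interface, and the tags are then tallied per (layer, interface) cell. The Python sketch below is only an illustration under stated assumptions: the class and function names (Layer, Interface, CodedFailure, cross_layer_matrix) are hypothetical, the four example records are taken from the examples named in the response rather than from the NHTSA data, and assigning the L-S examples to the human-related operational layer is an assumption, since the response does not name that layer.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    """The five Swiss Cheese layers listed in the response."""
    GOVERNANCE_POLICY = "governance and policy"
    PERCEPTION = "perception"
    PLANNING_DECISION = "planning and decision-making"
    CONTROL_ACTUATION = "control and actuation"
    HUMAN_OPERATIONAL = "human-related operational roles"


class Interface(Enum):
    """The four SHELL interfaces listed in the response."""
    L_S = "Liveware-Software"
    L_H = "Liveware-Hardware"
    L_E = "Liveware-Environment"
    L_L = "Liveware-Liveware"


@dataclass(frozen=True)
class CodedFailure:
    """One failure coded by both system layer and interaction interface."""
    description: str
    layer: Layer
    interface: Interface


def cross_layer_matrix(failures):
    """Count coded failures per (layer, interface) cell."""
    return Counter((f.layer, f.interface) for f in failures)


# Illustrative records only; not real incident data.
coded = [
    CodedFailure("sensor degradation", Layer.PERCEPTION, Interface.L_H),
    CodedFailure("sensor miscalibration", Layer.PERCEPTION, Interface.L_H),
    # Layer assignment below is an assumption (not stated in the response).
    CodedFailure("low-salience alert", Layer.HUMAN_OPERATIONAL, Interface.L_S),
    CodedFailure("mode confusion", Layer.HUMAN_OPERATIONAL, Interface.L_S),
]
matrix = cross_layer_matrix(coded)
```

Cells with high counts in such a tally would point to layer and interface combinations where weaknesses tend to coincide; empty cells simply return zero, because Counter treats missing keys as a count of zero.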

 

 

Point 2: Fig. 3 and Fig. 6 have the same caption.

 

Response 2: Thank you for your valuable comment. We agree, and we have revised the caption of Figure 3 to “Conceptual Model of the Hybrid Swiss Cheese-SHELL Approach.”

 

 

Point 3: This paper seems to be a survey on AV safety failures, so the references should be updated. Most of the references were published before 2023, and too few are from 2023-2025. This limits the paper’s relevance to emerging challenges, such as AI-related methods.

 

Response 3: Thank you for your suggestions. We have updated the references by removing outdated publications and adding recent work on AI-related methods.

 

 

Point 4: In addition, most of the references are cited from Trans. Part A, B, and C; other journals should also be considered, such as IEEE TITS, IEEE TVT, and IEEE TIV.

 

Response 4: Thank you for your suggestions. We have added related references from IEEE TIV and IEEE TITS.

Reviewer 2 Report

Comments and Suggestions for Authors

The article presents a comprehensive, qualitative analysis of safety failures in autonomous driving systems. Its primary objective is to systematically identify potential failure modes and analyze the complex interactions between system components. However, I will comment on some aspects to improve the article.

- The paper claims a "qualitative case-based research design" but fails to delineate its execution. The case selection criteria, the number and nature of cases analyzed, and the systematic process for applying the hybrid model to these cases are not described. The analysis reads more like a generalized, deductive application of the models' concepts to AVs in general rather than an inductive, evidence-driven case study. A proper case-based methodology requires detailing how cases were sourced, how data were extracted, and how the models were used to code and analyze the data.
- The "Discussion" section does not explicitly analyze or refer to this or any other specific dataset. The study appears to rely entirely on secondary literature rather than primary case data, contradicting the stated "case-based" approach and undermining claims of empirical analysis.
- While the SCM and SHELL models are well-described individually, their integration remains schematic and underdeveloped. Figure 3 and the related text promise a "hybrid integration" but merely place the models side-by-side. The analysis in Section 4.1 applies SCM to list AV layers and then separately lists SHELL interfaces with generic examples. The crucial, novel insight—how the "holes" from SCM layers are specifically created or exacerbated by the dysfunctional interactions detailed in the SHELL tables—is asserted but not rigorously demonstrated. The promised "combined cross-layer analysis" is not delivered in depth.
- Endsley’s model is mentioned but is poorly integrated into the core analytical framework. It appears as an add-on in Figure 6 rather than as a central lens for explaining the cognitive mechanisms underlying Liveware-related failures in the SHELL model. This represents a missed opportunity to deepen the human factors analysis.
- Section 3 is a significant disruption. Its content is partly administrative (describing NHTSA orders) and partly confusing (with future-dated charts). This section should either be moved to a methodology subsection, thoroughly revised to represent the actual data used for analysis, or removed. The figures as presented are invalid.
- Sections of the paper, particularly the early parts of the Introduction and Related Works, recite very general benefits and challenges of AVs that are well-established in literature, delaying the presentation of the paper's unique contribution.
- The text sometimes uses "autonomous driving" and "automated vehicles" interchangeably. Given the focus on SAE levels, greater precision is needed.
- The conclusion states the hybrid approach "successfully identified critical alignment pathways." However, the analysis does not present a systematic identification or prioritization of such pathways. The recommendations are generic and not uniquely derived from the hybrid model analysis's specific findings.
- The claim that the study offers a "structured diagnostic method" is premature. The framework is proposed, but its validity, reliability, and utility as a diagnostic tool are not tested or demonstrated against a known set of incidents.
- The abstract and introduction contain noticeable grammatical errors and awkward phrasing. 
- Some references need to be reviewed.
- It is necessary for the authors to review and improve the conclusions.


Author Response

Dear Reviewer:

We would like to thank you for your helpful comments and encouragement, which improved the manuscript. In this revised manuscript, we have carefully considered your comments and suggestions. The review comments were immensely beneficial and helped us improve the quality of the paper and its presentation. The revisions corresponding to the comments and suggestions are highlighted in color in the revised manuscript. We also present our response to each point below.

 

The article presents a comprehensive, qualitative analysis of safety failures in autonomous driving systems. Its primary objective is to systematically identify potential failure modes and analyze the complex interactions between system components. However, I will comment on some aspects to improve the article.

 

Response: Thank you for your careful reading of the manuscript, and we sincerely appreciate your very positive feedback. We are happy to know that you and the other reviewer have found the paper interesting. We are grateful for your constructive suggestions, which immensely helped us in improving the paper. We are happy to inform you that we have considered all the comments and suggestions. Please see below for the detailed response. We have duly incorporated the changes in the manuscript.

 

Point 1: The paper claims a "qualitative case-based research design" but fails to delineate its execution. The case selection criteria, the number and nature of cases analyzed, and the systematic process for applying the hybrid model to these cases are not described. The analysis reads more like a generalized, deductive application of the models' concepts to AVs in general rather than an inductive, evidence-driven case study. A proper case-based methodology requires detailing how cases were sourced, how data were extracted, and how the models were used to code and analyze the data.

 

Response 1: We thank the reviewer for this insightful comment. We acknowledge that the original manuscript did not sufficiently explain how the qualitative case-based research design was carried out, which may have created the impression of a generalized, deductive application of the models. To address this concern, we have substantially revised the manuscript and clearly articulated the empirical and analytical procedures underpinning the case-based analysis.

 

 

Point 2: The "Discussion" section does not explicitly analyze or refer to this or any other specific dataset. The study appears to rely entirely on secondary literature rather than primary case data, contradicting the stated "case-based" approach and undermining claims of empirical analysis.

 

Response 2: Thank you for your comments. Our analysis is indeed based on real-world crash data collected and publicly reported by the NHTSA through its Standing General Order (SGO) reporting system. As explained in Section 3 of the manuscript: “As part of its Standing General Order, the NHTSA mandates that manufacturers and operators of vehicles equipped with autonomous driving systems or SAE level 2 advanced driver assistance systems report crashes to the agency. Through the General Order, the NHTSA gets timely and transparent notifications from manufacturers and operators of real-world accidents involving Autonomous Driving Systems (ADS) and Advanced Driver Assistance Systems (ADAS) vehicles at level 2. This data will enable NHTSA to conduct further investigations and enforce vehicle safety laws in the event of crashes raising safety concerns about ADS or level 2 ADAS technologies.”

These data serve as the primary empirical foundation of our study and are used to illustrate how safety failures manifest across automation layers. In Section 4, the application of the hybrid Swiss Cheese–SHELL framework was directly informed by these real-world incidents. For example, as stated in the manuscript: “The L-L holes often serve as latent or active contributors in multi-layer failure alignment in the Swiss Cheese model. Some pathways are automation layers (software or perception) may detect hazards, but if remote operator or safety driver is not aware, i.e., handed over poorly, then that layer fails due to human intervention being late or incorrect.” This reflects our inductive case-based approach, where observed patterns from crash reports were systematically mapped into the hybrid framework to reveal how technical, organizational, and human factors align to produce safety failures.

 

Point 3: While the SCM and SHELL models are well-described individually, their integration remains schematic and underdeveloped. Figure 3 and the related text promise a "hybrid integration" but merely place the models side-by-side. The analysis in Section 4.1 applies SCM to list AV layers and then separately lists SHELL interfaces with generic examples. The crucial, novel insight—how the "holes" from SCM layers are specifically created or exacerbated by the dysfunctional interactions detailed in the SHELL tables—is asserted but not rigorously demonstrated. The promised "combined cross-layer analysis" is not delivered in depth.

 

Response 3: We sincerely thank the reviewer for this insightful and constructive comment. We have revised the Figure 3 caption and enriched the explanation of hybrid integration. Additionally, we have substantially revised Subsection 4.1 by strengthening the hybrid integration logic as stated at the end of this subsection. This has been done by explicitly mapping SHELL interface failures to corresponding Swiss Cheese model layers as explained in Table 6.

 

 

Point 4: Endsley’s model is mentioned but is poorly integrated into the core analytical framework. It appears as an add-on in Figure 6 rather than as a central lens for explaining the cognitive mechanisms underlying Liveware-related failures in the SHELL model. This represents a missed opportunity to deepen the human factors analysis.

 

Response 4: Thank you for your comments. We have updated Figure 6 with consideration of integrating Endsley’s situational awareness model into the core analytical logic. Additionally, we have added Table 6 along with its explanation to address the missed opportunity to deepen human factors analysis.

 

 

Point 5: Section 3 is a significant disruption. Its content is partly administrative (describing NHTSA orders) and partly confusing (with future-dated charts). This section should either be moved to a methodology subsection, thoroughly revised to represent the actual data used for analysis, or removed. The figures as presented are invalid.

 

Response 5: Thank you for your comments. The title of Section 3 has been changed to “Case-based Analysis” to reflect the paper’s qualitative case-based research design. The data in Figures 4 and 5 were collected from the official NHTSA website.

 

Point 6: Sections of the paper, particularly the early parts of the Introduction and Related Works, recite very general benefits and challenges of AVs that are well-established in literature, delaying the presentation of the paper's unique contribution.

 

Response 6: Thank you for your suggestions. We have rewritten both sections, with a particularly thorough revision of the Introduction.

 

 

Point 7: The text sometimes uses "autonomous driving" and "automated vehicles" interchangeably. Given the focus on SAE levels, greater precision is needed.

 

Response 7: Thank you for your insightful comment. We agree that the terms “autonomous driving” and “automated vehicles” should not be used interchangeably, particularly when the analysis is grounded in SAE levels of automation. In the revised manuscript, we have clarified the terminology and ensured consistent usage throughout the text. Specifically, “automated vehicles” is now used only when referring to lower levels of automation (SAE Levels 1-2), where the driver remains actively involved in vehicle operation. In contrast, “autonomous driving systems” is used to describe higher levels of automation.
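The terminology rule stated in this response can be written down as a simple lookup, which makes it easy to check usage consistently. The Python sketch below is an illustration only: the function name preferred_term is hypothetical, the rule is taken from the response (Levels 1-2 versus higher levels), and the handling of Level 0 is an assumption, because the response does not say which term applies when there is no driving automation.

```python
def preferred_term(sae_level: int) -> str:
    """Return the term the revised manuscript uses for a given SAE level.

    Rule from the response: "automated vehicles" for SAE Levels 1-2,
    "autonomous driving systems" for higher levels (3-5).
    """
    if sae_level in (1, 2):
        # Driver remains actively involved in vehicle operation.
        return "automated vehicles"
    if sae_level in (3, 4, 5):
        return "autonomous driving systems"
    # Assumption: Level 0 and out-of-range values are not covered by the rule.
    raise ValueError(f"no term defined for SAE level {sae_level}")
```

For example, preferred_term(2) returns "automated vehicles" and preferred_term(4) returns "autonomous driving systems".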

 

Point 8: The conclusion states the hybrid approach "successfully identified critical alignment pathways." However, the analysis does not present a systematic identification or prioritization of such pathways. The recommendations are generic and not uniquely derived from the hybrid model analysis's specific findings.

 

Response 8: Thank you for your comments. We have rewritten the Conclusion section so that its content is specific to the findings.

 

 

Point 9: The claim that the study offers a "structured diagnostic method" is premature. The framework is proposed, but its validity, reliability, and utility as a diagnostic tool are not tested or demonstrated against a known set of incidents.

 

Response 9: Thank you for your comment. As stated in our manuscript, the proposed hybrid framework was developed from real-world crash data reported by the NHTSA through its standing general order system, which mandates reporting of crashes involving ADS and Level 2 ADAS. As stated in Section 3: “As part of its Standing General Order, the NHTSA mandates that manufacturers and operators of vehicles equipped with autonomous driving systems or SAE level 2 advanced driver assistance systems report crashes to the agency. Through the General Order, the NHTSA gets timely and transparent notifications from manufacturers and operators of real-world accidents involving Autonomous Driving Systems (ADS) and Advanced Driver Assistance Systems (ADAS) vehicles at level 2. This data will enable NHTSA to conduct further investigations and enforce vehicle safety laws in the event of crashes raising safety concerns about ADS or level 2 ADAS technologies.”

Furthermore, Section 4 applies this data in the analysis, as stated: “The L-L holes often serve as latent or active contributors in multi-layer failure alignment in the Swiss Cheese model. Some pathways are automation layers (software or perception) may detect hazards, but if remote operator or safety driver is not aware, i.e., handed over poorly, then that layer fails due to human intervention being late or incorrect.”

 

Point 10: The abstract and introduction contain noticeable grammatical errors and awkward phrasing.

 

Response 10: Thank you for your comment. We have carefully checked the abstract and introduction and revised them.

 

Point 11: Some references need to be reviewed.

 

Response 11: Thank you for your suggestion. We have updated the references by removing the outdated ones and enriching them with current publications.

 

 

Point 12: It is necessary for the authors to review and improve the conclusions.

 

Response 12: Thank you for your thoughtful comment. We have carefully reviewed the Conclusion section, and we find that the current version contains several core elements expected in a comprehensive closing section, as follows:

  1. Clear summary of key findings: The conclusion begins by restating the study’s main findings, particularly how the hybrid Swiss Cheese–SHELL approach successfully identified critical alignment pathways across AV system layers, including governance, perception, planning, control, and human-machine interface, and how these layered failures contribute to real-world AV incidents.
  2. Explanation of novelty and contribution: The section highlights the novelty of the proposed hybrid framework by contrasting it with conventional AV safety analyses, which often focus solely on either technical malfunctions or isolated human errors. Our study uniquely integrates systemic and human-interaction factors, offering a multidimensional perspective on AV safety. This serves both as a theoretical contribution by linking system safety with human factors and as a practical tool for engineers, regulators, and researchers.
  3. Implication and application: The conclusion further elaborates how the proposed model can be applied in practice, including use cases in system design, regulatory frameworks, and user training. It also illustrates how the findings can inform efforts to improve safety in future AV development.
  4. Future research: The final part of the conclusion includes a forward-looking perspective, proposing directions for future studies such as simulation-based validation, analysis using real-world datasets, and expanding the model’s application to other autonomous domains such as aerial drones and maritime automation.

We fully agree with the reviewer’s suggestion to further emphasize the forward-looking significance of the work. Therefore, we have added the following sentence at the end of the conclusion: “Overall, this study lays the groundwork for a human-centered, system-aware methodology to support safer autonomous technologies.”

Reviewer 3 Report

Comments and Suggestions for Authors

The research topic is about a comprehensive analysis of safety failures in autonomous driving. Particularly, it focuses on the hybrid Swiss Cheese and SHELL method. It can be seen from the presented content that the authors have devoted efforts to the topic; however, many parts are unclear and insufficiently discussed. Please consider revising the article based on the comments below.
Comment 1. This paper has 97 references, which are far more than a regular research article. Excessive background information was provided. Please ensure that the background information facilitates the readers' understanding of the research content. Otherwise, remove the content.
Comment 2. Justify “hybrid Swiss Cheese and SHELL approach” because it was never proposed in the literature.
Comment 3. Abstract:
(a) Strengthen the importance of the research topic.
(b) Provide a concise discussion of the research contributions.
(c) Share the key results.
Comment 4. More terms should be added to the “Keywords” to better reflect the scope of the paper. The journal’s template allows more terms.
Comment 5. Section 1 Introduction:
(a) For an introduction, 31 references were used without sufficient elaboration. More importantly, the content is expected to focus on the importance of the research topic and the rationale of the proposed research directions.
(b) Add a paragraph to discuss the research contributions in detail, preferably in a point-form style.
Comment 6. Section 2 Related Works and Methods:
(a) Add an introductory paragraph before Subsection 2.1.
(b) Table 1: Is this 6-level scheme widely adopted worldwide?
(c) How does “Risk management” in Subsection 2.2 relate to “Safety Failures” as defined in the paper title?
(d) The heading of Subsection 2.3 is not precise.
(e) Which versions of the Swiss Cheese model and SHELL were used? Why?
(f) Justify the architecture of the hybrid Swiss Cheese model and SHELL approach.
Comment 7. Section 3 Data Collection:
(a) Are there benchmark datasets in the research topic?
(b) Why was only a short period of time (from Oct. 2024 to Sep. 2025) considered?
Comment 8. Section 4 Discussion:
(a) Add an introductory paragraph before Subsection 4.1.
(b) The captions of Figure 3 and Figure 6 are exactly the same.
Comment 9. Novelties of methods are weak.
Comment 10. Formal experiment analysis is missing.

Author Response

Dear Reviewer:

We would like to thank you for your helpful comments and encouragement, which improved the manuscript. In this revised manuscript, we have carefully considered your comments and suggestions. The review comments were immensely beneficial and helped us improve the quality of the paper and its presentation. The revisions corresponding to the comments and suggestions are highlighted in color in the revised manuscript. We also present our response to each point below.

 

The research topic is about a comprehensive analysis of safety failures in autonomous driving. Particularly, it focuses on the hybrid Swiss Cheese and SHELL method. It can be seen from the presented content that the authors have devoted efforts to the topic; however, many parts are unclear and insufficiently discussed. Please consider revising the article based on the comments below.

 

Response: Thank you for your careful reading of the manuscript, and we sincerely appreciate your very positive feedback. We are happy to know that you and the other reviewer have found the paper interesting. We are grateful for your constructive suggestions, which immensely helped us in improving the paper. We are happy to inform you that we have considered all the comments and suggestions. Please see below for the detailed response. We have duly incorporated the changes in the manuscript.

 

Point 1: This paper has 97 references, which are far more than a regular research article. Excessive background information was provided. Please ensure that the background information facilitates the readers' understanding of the research content. Otherwise, remove the content.

 

Response 1: Thank you for your suggestions. We have carefully reviewed the references. Some outdated references have been removed from the list, and we have added recent related publications.

 

 

Point 2: Justify “hybrid Swiss Cheese and SHELL approach” because it was never proposed in the literature.

 

Response 2: Thank you for your valuable comment. To our knowledge, combining the Swiss Cheese and SHELL models to analyze AV safety incidents is a novel approach, which is why it does not appear in the existing literature. Specifically, as stated: “Following the safety failures analysis performed by the Swiss Cheese model, we use the SHELL model to systematically identify, code, classify, and prioritize human-system interface failures that contribute to safety incidents in autonomous driving.” In addition, the manuscript discusses how certain interface-level failures (such as liveware-liveware communication lapses) contribute to multi-layer failure alignment as modeled in Swiss Cheese, as stated: “The L-L holes often serve as latent or active contributors in multi-layer failure alignment in the Swiss Cheese model. Some pathways are automation layers (software or perception) may detect hazards, but if remote operator or safety driver is not aware, i.e., handed over poorly, then that layer fails due to human intervention being late or incorrect. Hardware degradation or environmental condition degrade system performance. If operators or maintenance teams do not communicate the issue to safety drivers, then the hole persists. Regulatory or culture lapses at the organizational or governance layer may lead to vague roles, leaving no one in charge in incidents. Informal communication, such as gestures and expectations among road users, interacts with environment or perception. If non-human road users misinterpret AV behavior, or human drivers do not understand AV behavior, leading to risky situations.”

To explicitly justify the novelty and clarify the integration of the two models, we added a new paragraph: “While the Swiss Cheese model and the SHELL model have been widely applied independently in safety-critical domains, such as aviation and healthcare, their combined application has not been formally proposed in the context of autonomous driving systems. This study intentionally integrates the two models to address the complex socio-technical nature of AV safety failures. The Swiss Cheese model provides a system-level perspective by illustrating how latent failures align across multiple defensive layers, including governance, perception, planning, control, and human–machine interface. However, it does not explicitly capture the detailed mechanisms of human–system interaction failures. Conversely, the SHELL model focuses on interface-level mismatches between humans and software, hardware, environment, and other humans, but lacks a structured representation of how these failures propagate across system defenses. By integrating the two models, this hybrid approach enables a unified analysis that links interface-level human factors to system-level failure pathways, offering a more comprehensive and empirically grounded understanding of autonomous driving safety incidents. This integration emerged directly from the analysis of real-world AV incident data, where failures were rarely isolated and instead resulted from the alignment of technical, organizational, and human factors.”

 

Point 3: Abstract:

(a) Strengthen the importance of the research topic.

(b) Provide a concise discussion of the research contributions.

(c) Share the key results.

 

Response 3: Thank you for your valuable comments. We have improved the abstract as stated: “The advancement of automated driving technologies offers potential safety and efficiency gains, yet safety remains the primary barrier to higher-level deployment. Failures in automated driving systems rarely result from a single technical malfunction; instead, they emerge from coupled organizational, technical, human, and environmental factors, particularly in partial and conditional automation where human supervision and intervention are still required. This study aims to systematically identify safety failures in automated driving systems and analyze how they propagate across system layers and human–machine interactions. A qualitative, case-based analytical approach is adopted by integrating the Swiss Cheese model and the SHELL model. The Swiss Cheese model represents multilayer defensive structures, including governance and policy, perception, planning and decision-making, control and actuation, and human–machine interfaces. The SHELL model structures interaction failures between liveware and software, hardware, environment, and other liveware, enabling cross-layer analysis of how interaction breakdowns create or amplify vulnerabilities within system defenses. The integrated framework provides a structured means to trace recurrent failure pathways and identify intervention points for improving automated driving safety. It supports safety analysis and informs design and regulatory considerations for automated driving systems.”

 

Point 4: More terms should be added to the “Keywords” to better reflect the scope of the paper. The journal’s template allows more terms.

 

Response 4: Thank you for the valuable suggestion. We have revised the “Keywords” as stated: “Keywords: Autonomous driving; Autonomous vehicle; Safety failures; Hybrid Swiss Cheese and SHELL approach; Risk mitigation; System safety”

 

Point 5: Section 1 Introduction:

(a) For an introduction, 31 references were used without sufficient elaboration. More importantly, the content is expected to focus on the importance of the research topic and the rationale of the proposed research directions.

(b) Add a paragraph to discuss the research contributions in detail, preferably in a point-form style.

 

Response 5: Thank you for your comments. Our answer to each comment is given below:

(a) We have revised the Introduction section along with its cited references.

(b) In the revised manuscript, we have added a research contributions section that presents the details in point form. The research contributions are as follows:

“AVs represent a transformative technology with the potential to reshape mobility, but their safe deployment remains a critical concern. Understanding how and why safety failures occur in AVs, especially from both technical and human–organizational standpoints, is essential for building trust, improving regulations, and reducing real-world risks. Based on the detailed analysis presented above, this study makes the following key contributions:

  • A structured hybrid safety analysis framework

This study develops and applies a hybrid Swiss Cheese–SHELL framework that systematically links system-level defense layers with human–system interaction failures, enabling a more comprehensive understanding of autonomous driving safety incidents.

  • Explicit mapping of failure propagation across AV defense layers

By decomposing autonomous driving into governance, perception, planning and decision, control and actuation, and human–machine interface layers, this work demonstrates how failures propagate and align across layers, rather than occurring as isolated faults.

  • Extension of the SHELL model to autonomous driving contexts

The study adapts the classical SHELL model to autonomous driving by detailing how Liveware–Software, Liveware–Hardware, Liveware–Environment, and Liveware–Liveware interactions contribute to safety failures, supported by concrete failure modes observed in real-world incidents.

  • Identification of human involvement beyond direct driving tasks

The analysis highlights that human contribution to AV safety failures extends beyond on-board driving, encompassing remote operators, maintenance teams, organizational communication, and interactions with other road users, emphasizing the socio-technical nature of AV systems.

  • Integration of technical, organizational, and human perspectives for mitigation

The findings provide a foundation for targeted mitigation strategies by showing how technical degradation, environmental complexity, interface design, and communication breakdowns jointly influence risk, offering actionable insights for system designers, operators, and regulators.”

These contributions are derived from in-depth analysis of real-world AV incident data sourced from NHTSA. The study reveals that safety failures often stem from the alignment of latent failures across governance, perception, planning, control, and HMI layers. Moreover, human factors, such as unclear alerting, sensor misalignment, or remote operation breakdowns, play a major role in these incidents, demonstrating that AV safety is not merely a technical problem but a socio-technical one.

 

Point 6: Section 2 Related Works and Methods:

(a) Add an introductory paragraph before Subsection 2.1.

(b) Table 1: Is this 6-level scheme widely adopted worldwide?

(c) How does “Risk management” in Subsection 2.2 relate to “Safety Failures” as defined in the paper title?

(d) The heading of Subsection 2.3 is not precise.

(e) Which versions of the Swiss Cheese model and SHELL were used? Why?

(f) Justify the architecture of the hybrid Swiss Cheese model and SHELL approach.

 

Response 6:

Thank you for your invaluable comments and suggestions. Our answer to each point is given below:

(a) We have added an introductory paragraph at the beginning of Section 2, as stated: “This section outlines the key concepts and prior works relevant to this study. It begins with an overview of autonomous vehicle technologies and automation levels, followed by risk management approaches, and concludes with the development of a hybrid analytical framework combining the Swiss Cheese and SHELL models. The purpose of this section is to establish the theoretical and methodological foundation used to analyze safety failures in autonomous vehicle systems.”

(b) Table 1 presents the six levels of driving automation as defined by SAE International, ranging from Level 0 (no automation) to Level 5 (full automation). This scheme has been widely adopted as a global standard by regulators, automotive manufacturers, and researchers to classify and assess vehicle autonomy. For example, in developing countries such as Indonesia, most commercially available vehicles are equipped with Level 1 or Level 2 features, such as adaptive cruise control and lane keeping assist. As another example, in developed countries such as the United States, some manufacturers, including Tesla, offer advanced driver-assistance systems branded as “Full Self Driving.” However, under SAE definitions, these systems are still considered Level 2 because they require constant human supervision. As of now, no production vehicle has reached Level 5 autonomy in real-world deployment.

(c) In this study, risk management is not treated as a separate topic but rather as a practical foundation for understanding safety failures in autonomous vehicles. We use its core principles—such as identifying potential hazards, evaluating their likelihood and impact, and considering ways to reduce risk—to guide how we analyze failure cases. This approach helps structure the discussion in a way that’s both systematic and meaningful. It also sets the stage for integrating the Swiss Cheese and SHELL models, which together allow us to explore how these failures unfold across technical and human-related layers within AV systems.

(d) We have revised the heading of Subsection 2.3 to “Hybrid Analytical Framework using Swiss Cheese and SHELL Models”

(e) In this study, we apply the Swiss Cheese model as originally introduced by Reason (1990), adapting it to fit the layered architecture of AV systems, namely perception, planning, control, and human supervision. This structure allows us to analyze how failures propagate when multiple system defenses are breached. For the SHELL model, we adopt the classical framework by Hawkins (1993), focusing on human interaction with software, hardware, environment, and other humans. These models were chosen for their ability to capture both system-level and human-factor-related failures in a complementary manner.

(f) The hybrid architecture combines the Swiss Cheese and SHELL models to capture both systemic and human-interaction failures in autonomous vehicle systems. The Swiss Cheese model is used to map failure propagation across four key defense layers in AVs: perception, planning, control, and human supervision. This helps identify how technical failures can align to cause accidents. However, many AV failures also involve poor interactions between humans and the system. Therefore, the SHELL model is used to analyze the quality of interaction between the human (driver or supervisor) and elements such as software (e.g., autopilot system), hardware (e.g., steering wheel), environment (e.g., weather), and other humans (e.g., occupants). The combination of both models provides a more complete framework for analyzing real-world accident cases, especially in identifying not only what failed, but also how and why failures occur across system layers and human interfaces.
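
As an illustration only, the cross-layer coding described in (f) can be sketched as a small data structure in which each finding from an incident is tagged with one Swiss Cheese layer and one SHELL interface. This is a minimal sketch, not the authors' coding scheme: the names `Layer`, `Interface`, `Finding`, `cross_tabulate`, and `breached_layers`, as well as the example records, are hypothetical and are not taken from the manuscript or the NHTSA data.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    """Swiss Cheese defense layers as listed in response (f)."""
    PERCEPTION = "perception"
    PLANNING = "planning"
    CONTROL = "control"
    HUMAN_SUPERVISION = "human supervision"


class Interface(Enum):
    """SHELL interfaces centred on the human (liveware)."""
    L_S = "liveware-software"
    L_H = "liveware-hardware"
    L_E = "liveware-environment"
    L_L = "liveware-liveware"


@dataclass(frozen=True)
class Finding:
    """One coded failure: which layer had a hole, and at which interface."""
    incident_id: str
    layer: Layer
    interface: Interface


def cross_tabulate(findings):
    """Count findings per (layer, interface) cell of the hybrid grid."""
    return Counter((f.layer, f.interface) for f in findings)


def breached_layers(findings, incident_id):
    """Return the set of layers with at least one hole in a given incident."""
    return {f.layer for f in findings if f.incident_id == incident_id}


# Hypothetical records, invented purely to exercise the functions.
EXAMPLE_FINDINGS = [
    Finding("case-A", Layer.PERCEPTION, Interface.L_E),
    Finding("case-A", Layer.HUMAN_SUPERVISION, Interface.L_S),
    Finding("case-B", Layer.HUMAN_SUPERVISION, Interface.L_S),
]
```

Calling `cross_tabulate(EXAMPLE_FINDINGS)` gives the number of findings in each layer-interface cell, and `breached_layers(EXAMPLE_FINDINGS, "case-A")` lists the layers whose holes aligned in one incident, which is the kind of question the hybrid framework is meant to answer.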

 

Point 7: Section 3 Data Collection:

(a) Are there benchmark datasets in the research topic?

(b) Why was only a short period of time (from Oct. 2024 to Sep. 2025) considered?

 

Response 7: Thank you for your comments. The answer for each comment is as follows:

(a) In this study we focus on real-world safety incident reports sourced from the NHTSA database. These datasets reflect actual failures, disengagements, and crashes involving Level 2 and Level 3 autonomous driving systems reported by manufacturers during public road testing. We believe this real-world data provides more practical and direct insight into safety failures as they occur in operational environments, capturing both technical and human factors that are often not represented in benchmark datasets.

(b) The selected one-year period reflects the most recent and complete reporting window from the NHTSA database at the time of our study. Given the rapid pace of advancement in autonomous vehicle technologies, especially in areas such as AI-based perception, decision-making, and human-machine interface design, safety data can evolve significantly within short periods. Therefore, we prioritized the most up-to-date data to ensure our analysis captures current system capabilities and failure patterns. This makes the findings more relevant to today’s AV deployments and offers a realistic snapshot of challenges in the current technological landscape.

 

Point 8: Section 4 Discussion:

(a) Add an introductory paragraph before Subsection 4.1.

(b) The captions of Figure 3 and Figure 6 are exactly the same.

 

Response 8: Thank you for your suggestions. The answer for each suggestion is described below:

(a) We have added an introductory paragraph at the beginning of Section 4, as stated: “This section discusses the key patterns of safety failures identified through the hybrid Swiss Cheese–SHELL analysis. By examining real-world AV incidents, we highlight how technical weaknesses and human-system interactions contribute to these failures, and how the findings can inform practical safety improvements.”

(b) We agree with your comment. Therefore, we have revised the caption of Figure 3 to “Conceptual model of the Hybrid Swiss Cheese-SHELL approach”.

 

Point 9: Novelties of methods are weak.

 

Response 9: Thank you for your invaluable comment. To our knowledge, the hybrid use of the Swiss Cheese and SHELL models to analyze AV safety incidents has not appeared in the literature, so we consider it a novel approach. The novelty of this hybrid method in the AV context is a key strength of our paper.

 

 

Point 10: Formal experiment analysis is missing.

 

Response 10: Thank you for your comment. We have revised Section 3 into a case-based analysis, which is the focus of this study. This section presents a formal experimental analysis.

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Thanks to the authors for performing the changes suggested by the reviewers. A brief review of the English is necessary before publication.

Comments on the Quality of English Language

Thanks to the authors for performing the changes suggested by the reviewers. A brief review of the English is necessary before publication.

Author Response

We would like to thank the reviewer for your helpful comments and encouragement that improved the manuscript. In this revised manuscript, we have carefully considered your comments and suggestions. We find that the review comments are immensely beneficial, and those have helped us improve the quality of the paper and its presentation. Therefore, the revisions corresponding to various comments and suggestions are colored in the revised manuscript. We also present our response to each point below.

 

Thanks to the authors for performing the changes suggested by the reviewers. A brief review of the English is necessary before publication.

 

Response: Thank you for your careful reading of the manuscript, and we sincerely appreciate your very positive feedback. We are happy to know that you and the other reviewer have found the paper interesting. We are grateful for your constructive suggestions, which immensely helped us in improving the paper. We are happy to inform you that we have considered all the comments and suggestions. We have duly improved the English to more clearly express the research.

Reviewer 3 Report

Comments and Suggestions for Authors

The authors have enhanced the quality of the paper. I have some follow-up comments on the revised paper.
Follow-up comment 1. In the abstract, only two sentences cover the research results and implications, which is insufficient.
Follow-up comment 2. The stated three research contributions should be justified with research results (including numeric values), where applicable.
Follow-up comment 3. Section 2 Related Works and Methods:
(a) The heading includes “Methods”, which is supposed to be the methodology of the authors’ work and must be separated from the “Related Works”.
(b) Various subsections belong to the background information (e.g., autonomous vehicles and risk management) rather than the related works; please move them to separate sections.
Follow-up comment 4. Figures 4 and 5: How to obtain the trendline?
Follow-up comment 5. Enhance the discussions on the benefits of the research work.

Author Response

We would like to thank the reviewer for your helpful comments and encouragement that improved the manuscript. In this revised manuscript, we have carefully considered your comments and suggestions. We find that the review comments are immensely beneficial, and those have helped us improve the quality of the paper and its presentation. Therefore, the revisions corresponding to various comments and suggestions are colored in the revised manuscript. We also present our response to each point below.

 

The authors have enhanced the quality of the paper. I have some follow-up comments on the revised paper.

 

Response: Thank you for your careful reading of the manuscript, and we sincerely appreciate your very positive feedback. We are happy to know that you and the other reviewer have found the paper interesting. We are grateful for your constructive suggestions, which immensely helped us in improving the paper. We are happy to inform you that we have considered all the comments and suggestions. Please see below for the detailed response. We have duly incorporated the changes in the manuscript.

 

Point 1: In the abstract, only two sentences cover the research results and implications, which is insufficient.

 

Response 1: Thank you for your suggestions. We have added more sentences to the abstract to cover the research results and implications.

 

 

Point 2: The stated three research contributions should be justified with research results (including numeric values), where applicable.

 

Response 2: Thank you for your valuable comment. We have added some numerical values to justify the research results.

 

 

Point 3: Section 2 Related Works and Methods:
(a) The heading includes “Methods”, which is supposed to be the methodology of the authors’ work and must be separated from the “Related Works”.
(b) Various subsections belong to the background information (e.g., autonomous vehicles and risk management) rather than the related works; please move them to separate sections.

 

Response 3: Thank you for your valuable comments. We have improved Section 2. Additionally, the Methods section has been moved to a new section.

 

 

Point 4: Figures 4 and 5: How to obtain the trendline?

 

Response 4: Thank you for the valuable suggestion. We have added explanations to describe how the trendline is obtained.

 

 

Point 5: Enhance the discussions on the benefits of the research work.

 

Response 5: Thank you for your comment. We have enhanced the discussions of this research work, especially in Subsection 5.2, i.e., research contributions.

 
