A Unified, Threat-Validated Taxonomy for Hardware Security Assurance
Round 1
Reviewer 1 Report
The paper introduces a unified SOD taxonomy that consolidates 10 standards into 22 domains, eliminates redundancy, and unifies terminology. It further establishes a threat-to-domain mapping by linking 167 real threats to the SODs, providing a quantitative basis for risk prioritization. It also integrates the boundary-driven SoI model that defines the scope of assurance and supports requirement mapping and integration across heterogeneous standards.
1. The extension of the original SoI model into three layers lacks a comparative analysis with existing frameworks (e.g., NIST SP 800-193, TCG models) and does not provide sufficient justification or references for the independent inclusion of the System Process layer, making the rationale for the three-layer division insufficiently rigorous.
2. The paper could include practical case studies or industrial application examples to demonstrate the applicability and effectiveness of the taxonomy in hardware security assurance scenarios.
Author Response
Comment 1: The extension of the SoI model into three layers lacks comparative analysis with existing frameworks (e.g., NIST SP 800-193, TCG models) and does not justify the independent inclusion of the System Process layer.
Response: We are grateful for this insightful observation. In the revised manuscript, Section 2.3 has been expanded to include a comparative discussion of NIST SP 800-193, TCG device trust models, and PSA Certified. We explain that while these frameworks provide strong coverage at the component and environment levels, they do not explicitly address lifecycle processes such as provisioning, secure updates, and patching. The System Process layer was therefore introduced to capture these critical operational workflows, which have historically been exploited as attack vectors. By adding these clarifications and references, we believe the rationale for the three-layer model is now more explicit and rigorously supported.
Comment 2: The paper should include practical case studies or industrial application examples.
Response: We agree that demonstrating practical applicability is important. While a full-scale industrial case study is beyond the scope of this paper, we have added an illustrative example in Section 5.1. This example considers an IoT healthcare gateway, showing how mapping its implemented controls against the taxonomy highlights strong areas (Boot, Cryptography, Communication) while revealing gaps in Traceability and Audit. This demonstrates how practitioners can use the taxonomy to identify assurance weaknesses and prioritize remediation, thereby bridging the gap between theory and practice.
Author Response File: Author Response.pdf
Reviewer 2 Report
This paper offers a unified taxonomy of 22 Security Objective Domains for hardware security assurance. The taxonomy is derived by consolidating requirements from ten international standards and mapping them to a Boundary-Driven System of Interest model. Each domain is validated against documented hardware-related threats from sources like the CWE and CVE databases.
The paper is relevant, but there is room for improvement. The critical points for improvement are:
- The justification for the selection of the specific ten standards used to build the corpus requires a more explicit and detailed rationale.
- A sensitivity analysis should be provided to show how slight variations in the cosine similarity threshold during clustering would impact the final number and definitions of the SODs.
- The use of raw threat count as a prioritization metric should be replaced with a more nuanced measure that incorporates factors like exploitability or impact, for instance by integrating CVSS scores.
- The core output, the list of 22 SODs, should be presented in a single, consolidated table or graphic early in the paper to improve clarity and readability.
- A practical case study or applied example should be included to demonstrate how the taxonomy would be used for control mapping and gap analysis in a real-world scenario.
1. The paper applies clear inclusion criteria but does not justify why the final set of ten standards represents a complete or optimal foundation for the taxonomy. The selection appears arbitrary without evidence that these standards collectively exhaust the universe of relevant hardware security requirements. A brief rationale for each standard's inclusion, perhaps noting its domain authority or the unique perspective it adds, is necessary to validate the corpus as a representative sample.
2. The dependence on a fixed cosine similarity threshold of 0.80 is a methodological vulnerability. While the silhouette score analysis supports this choice, the paper does not demonstrate that the resulting taxonomy structure is robust to minor variations in this parameter. A sensitivity analysis showing that the number and thematic coherence of the candidate SODs remain stable within a small range around the chosen threshold (e.g., 0.78–0.82) is required to prove the model's stability and reduce arbitrariness.
3. The use of a raw count of mapped threats per SOD is a weak proxy for risk prioritization. It assigns equal weight to all threats, regardless of their real-world exploitability, impact, or prevalence. This metric can mislead practitioners by over-emphasizing domains with many low-severity issues and under-emphasizing those with few critical ones. Incorporating a weighted score based on CVSS severity or historical incidence rates would transform this metric from a simple count into a true risk-based priority indicator.
4. The core contribution, the unified taxonomy of 22 SODs, is fragmented across three tables in the results section. This structure forces the reader to reconstruct the model mentally and obscures the holistic view of the assurance framework. Presenting the complete taxonomy in a single, consolidated table or, preferably, a clear conceptual diagram early in the paper would immediately orient the reader and provide a necessary reference point for understanding the subsequent analysis.
5. The paper thoroughly describes the taxonomy's construction but provides only a theoretical discussion of its application. Without a concrete example of its use, the practical utility for the intended audience (assurance practitioners) remains abstract and hypothetical. A brief applied case study, such as mapping the controls of a hypothetical device against the SODs to perform a gap analysis, is essential to bridge the gap between academic contribution and practical tool.
Author Response
Comment 1: The selection of ten standards appears arbitrary; provide rationale.
Response: Thank you for raising this concern. Section 3.1 now provides a clear explanation of why the ten standards were chosen. Each standard contributes a unique assurance perspective: for example, FIPS 140-3 focuses on cryptographic modules, ISO/IEC 20243 on supply chain authenticity, IEC 62443 on industrial embedded devices, and UL 2900 on healthcare devices. Together, these standards capture broad and cross-sectoral requirements. We also clarify that sector-specific frameworks (e.g., ISO/SAE 21434 for automotive, DO-326A for aerospace) were deliberately excluded in order to preserve generality. These are identified as future extensions.
Comment 2: The dependence on a fixed cosine similarity threshold (0.80) should be tested for robustness.
Response: We appreciate this suggestion. Section 3.3 now reports a sensitivity analysis where thresholds between 0.78 and 0.82 were tested. Across this range, over 92% of cluster assignments remained stable, and no new domain-level groupings emerged. This confirms that the taxonomy’s structure is robust to minor parameter variation and not an artifact of threshold choice.
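For readers who wish to reproduce this kind of robustness check, the sketch below shows one way such a threshold sweep could be implemented, assuming agglomerative clustering over sentence embeddings with a cosine-distance cutoff; the placeholder embeddings, the 0.78–0.82 grid, and the adjusted Rand index as the stability measure are illustrative assumptions rather than the study's exact pipeline.

```python
# Sensitivity sweep over the cosine-similarity threshold (illustrative sketch,
# not the authors' exact pipeline). `embeddings` stands in for the real
# requirement embeddings. Requires scikit-learn >= 1.2 (uses `metric`).
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

def cluster_at(embeddings, sim_threshold):
    """Agglomerative clustering with a cosine-distance cutoff of 1 - similarity."""
    model = AgglomerativeClustering(
        n_clusters=None,
        metric="cosine",
        linkage="average",
        distance_threshold=1.0 - sim_threshold,
    )
    return model.fit_predict(embeddings)

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 768))  # placeholder for real requirement embeddings

baseline = cluster_at(embeddings, 0.80)
for t in (0.78, 0.79, 0.81, 0.82):
    labels = cluster_at(embeddings, t)
    ari = adjusted_rand_score(baseline, labels)  # 1.0 = identical partition
    print(f"threshold={t:.2f}  clusters={labels.max() + 1}  ARI vs 0.80={ari:.3f}")
```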
Comment 3: Provide justification for exactly 22 domains.
Response: We have clarified this in Section 3.3 Phase 2. The initial AI-assisted clustering produced 28 candidate groups. Through expert review, closely related groups were merged (for example, multiple storage-related protections were unified under a single Storage domain). This consolidation resulted in 22 domains, balancing comprehensiveness with non-redundancy.
Comment 4: Present the taxonomy in a single, consolidated view.
Response: We agree that a consolidated overview improves readability. A new Table 5 (Section 4.1) now presents all 22 domains grouped by assurance scope, giving readers a clear single reference point. The more detailed Tables 6–8 remain to provide illustrative threats and definitions.
Comment 5: Clarify prioritization metrics.
Response: We agree that raw threat counts provide only a baseline. To address this, Section 5.1 acknowledges the limitation and now includes an illustrative case study of an IoT healthcare gateway. In this example, explicit coverage scoring is reported (13/22 domains covered), and a three-step remediation sequence is proposed (Traceability → Audit → Provisioning). This demonstrates how prioritization can be applied in practice. In addition, Section 5.4 commits to integrating weighted prioritization metrics such as CVSS severity, exploitability, and historical prevalence. This will allow future applications of the taxonomy to balance both coverage and severity when guiding assurance priorities.
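To make the planned weighted prioritization concrete, the following sketch blends per-domain threat count with mean CVSS severity; the domain names and scores are hypothetical and do not come from the paper's dataset.

```python
# Severity-weighted SOD prioritization (illustrative sketch). The mapping of
# SODs to CVSS base scores below is hypothetical, not the paper's data.
from statistics import mean

threats_by_sod = {
    "Cryptography": [9.8, 7.5, 5.3, 7.4],
    "Boot":         [8.2, 6.8, 7.1],
    "Traceability": [9.1],
    "Audit":        [4.3, 3.7],
}

def priority(scores, max_cvss=10.0):
    """Blend coverage (number of mapped threats) with severity (mean CVSS)."""
    return len(scores) * mean(scores) / max_cvss

ranked = sorted(threats_by_sod.items(), key=lambda kv: priority(kv[1]), reverse=True)
for sod, scores in ranked:
    print(f"{sod:14s} threats={len(scores)}  mean CVSS={mean(scores):.1f}  "
          f"priority={priority(scores):.2f}")
```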
Author Response File: Author Response.pdf
Reviewer 3 Report
This paper focuses on one of the critical security weak points in hardware security: existing hardware security standards remain fragmented, inconsistent in scope, and difficult to integrate, creating gaps in protection and inefficiencies in assurance planning.
To address this, the study proposes a unified, standards-aligned, and threat-validated taxonomy of Security Objective Domains (SODs) for hardware security assurance.
Comments:
- Point 1: In the Introduction section, the authors should discuss the critical security issues and security weaknesses of hardware devices.
- Point 2: The abstract is missing some important parts, such as the research methodology used and the main future directions for future researchers. My suggestion is to rewrite the abstract.
- Point 3: The study mentions "risk management" at the end of the abstract. Add a new section in the background to clarify the importance of risk assessment and management in hardware security. Use this study: Model-Based Systems Engineering Cybersecurity Risk Assessment for Industrial Control Systems Leveraging NIST Risk Management Framework Methodology. Journal of Cyber Security and Risk Auditing.
- Point 4: Clarify what is the problem statement of the research.
- Point 5: Add future works.
- Point 6: Rewrite the conclusion.
- Point 7: The methodology is clear.
- Point 8: The tables of classifications and mapping are very interesting.
Author Response
Comment 1: Discuss critical hardware weaknesses in the Introduction.
Response: We thank the reviewer for this helpful suggestion. The Introduction now explicitly highlights weaknesses unique to hardware, including side-channel leakage, counterfeit components, immutable flaws that cannot be patched, and exploitable debug/test interfaces. This strengthens the problem motivation and sets a clearer context for the taxonomy.
Comment 2: Add risk management background.
Response: We agree that risk assessment is central to assurance. Section 2.1 now discusses the importance of risk assessment and management, drawing on frameworks such as the NIST Risk Management Framework and MBSE-based cybersecurity risk analysis. This addition emphasizes how risk-informed decision making complements the taxonomy.
Comment 3: Abstract should include methodology and future directions.
Response: The revised Abstract now explicitly states that the taxonomy was inductively derived using AI-assisted clustering and expert validation, and that it was validated against 167 documented threats. It also points to future directions, including sector-specific extensions and integration of severity metrics such as CVSS.
Comment 4: Add future work and strengthen conclusion.
Response: Section 5.4 now outlines extensions such as sector-specific profiles, larger threat datasets, integration with CVSS-based prioritization, tool support, and empirical case studies. Section 6 Conclusion has been rewritten to more clearly summarize contributions and to highlight how the taxonomy addresses redundancy, fragmentation, and assurance gaps, while also supporting future adaptability.
Author Response File: Author Response.pdf
Reviewer 4 Report
The article addresses the important and relevant topic of unifying approaches to hardware security assurance. The proposed taxonomy of Security Objective Domains (SODs) has the potential to make a significant contribution to solving the problem of fragmentation among existing standards. The process of merging requirements from 10 standards into 22 domains is well organized. Linking each domain to real-world threats (167 CWE/CVE entries) gives the taxonomy practical significance.
The research is of a high standard, but there are a number of comments and questions about the work:
- The Cohen's kappa value of κ=0.82 indicates good inter-rater agreement but does not reveal the nature of the disagreements. Which domains caused the most disagreement among the experts? How did these disagreements influence the final structure of the taxonomy? It might be useful to add an analysis of these cases, for example, in an appendix.
- Why exactly 22 domains? Could you provide a more rigorous justification for this number? It would be helpful to know how many AI-suggested clusters there were initially and to see at least one example where experts decided to merge two clusters into one. Was any statistical method used to determine the optimal number of clusters?
- A significant imbalance is observed in the number of threats per domain (from 3 to 14). Could this lead to domains with fewer threats being undervalued in practice? How do you propose to address this issue?
- It is unclear how the set of 167 threats was formed. Was it a random sample or a purposive selection? What criteria were used for the inclusion/exclusion of threats?
- How exactly was it determined that a specific threat maps to a particular domain? Was this process formalized? Could you propose a methodology that other researchers could use to map new threats to your taxonomy?
Author Response
Comment 1: Cohen’s κ = 0.82 indicates good agreement, but what were the disagreements?
Response: We appreciate this question. Section 3.3 has been revised to clarify that disagreements mainly occurred in overlapping areas, specifically between Boot and Cryptography, and between Identity and Provisioning. These disagreements were resolved through consensus discussions, resulting in refined definitions that ensured clear boundaries between domains. By documenting these contested areas, we demonstrate that the taxonomy was validated not only statistically but also through expert deliberation.
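As an illustration of how agreement and contested boundaries can be quantified, the sketch below computes Cohen's kappa and tallies the domain pairs on which two coders diverge; the coder labels are hypothetical and are not the study's actual annotations.

```python
# Inter-rater agreement and disagreement inspection (illustrative sketch; the
# coder labels here are hypothetical, not the study's actual annotations).
from collections import Counter
from sklearn.metrics import cohen_kappa_score

coder_a = ["Boot", "Cryptography", "Identity", "Boot", "Provisioning", "Audit"]
coder_b = ["Boot", "Boot",         "Identity", "Boot", "Identity",     "Audit"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa = {kappa:.2f}")

# Count which domain pairs the two coders disagreed on, to surface contested
# boundaries such as Boot vs. Cryptography or Identity vs. Provisioning.
disagreements = Counter(
    tuple(sorted(pair)) for pair in zip(coder_a, coder_b) if pair[0] != pair[1]
)
for pair, n in disagreements.most_common():
    print(f"{pair[0]} vs. {pair[1]}: {n}")
```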
Comment 2: Why exactly 22 domains?
Response: Thank you for highlighting this. Section 3.3 (Phase 2) now explains that the initial AI-assisted clustering produced 28 candidate groups. During expert review, closely related groups were consolidated (for example, multiple storage protections were merged into a single Storage domain). This iterative consolidation process produced 22 domains, which we believe represents the optimal balance between comprehensive coverage and avoidance of redundancy. This explanation has been added to clarify the reasoning behind the final domain count.
Comment 3: Address imbalance in number of threats per domain.
Response: We appreciate this observation. Sections 4.2 and 4.3 explicitly note that certain domains (e.g., Cryptography, Boot) attracted more threats, while others (e.g., Environment) had relatively few. We explain that this imbalance reflects the distribution of publicly documented attack vectors. Importantly, we emphasize that domains with fewer threats remain critical, since even a single vulnerability in areas like Handling or Audit could create systemic assurance risks. To reinforce this point, we also added a clarifying note under Table 9, warning readers not to interpret low frequency as low importance.
Comment 4: Clarify threat selection.
Response: Section 3.4 now describes that the 167 threats were purposively selected from CWE, CVE, and regulatory advisories to ensure representative coverage of each domain. Selection criteria required explicit relevance to hardware and alignment with the System of Interest boundaries (System Component, Environment, or Process). Each mapping was independently coded by two experts, with disagreements resolved through consensus. This ensures both transparency and reproducibility.
Comment 5: The methodology for mapping threats to domains should be formalized to allow replication.
Response: We agree fully with this recommendation. Section 3.4 now introduces a formalized, stepwise Mapping Protocol outlining how threats were assigned to SODs. The protocol includes six steps: (1) threat extraction, (2) scope identification, (3) intent alignment, (4) primary assignment, (5) dual review, and (6) consensus resolution. To illustrate the process, contested cases are documented in Appendix A, showing how overlaps (e.g., Boot vs. Cryptography, Identity vs. Provisioning) were resolved. This addition provides transparency and establishes a methodological template that other researchers can follow when applying the taxonomy to new or emerging threats.
Author Response File: Author Response.pdf
Reviewer 5 Report
The article is in the field of hardware security assurance, with a focus on unifying requirements from international standards and validating them against the threat landscape (CWE/CVE) to provide a coherent taxonomy of “Security Objective Domains” (SOD).
The authors propose a unified, standards-aligned, threat-validated taxonomy with 22 SOD domains, inductively derived from 1,287 requirements extracted from 10 standards and mapped to hardware threats, integrated into the Boundary-Driven System of Interest (SoI) model.
Beyond this, to better highlight how the research was conducted, to improve understanding of how such models can be implemented or adapted to particular requirements, and to increase the scientific value of the paper, it would be useful if the following were taken into account:
1. In the section describing similar solutions, it would be useful to highlight, beyond their limitations, what problems the proposed solution solves and what improvements it brings.
2. Clarify the metrics in Table 8 and Table 9. Unify the definitions for “Average coverage per domain” vs “per SOD”.
3. Replace “BERT-base” with “MPNet (all-mpnet-base-v2)” in section 3.3 or justify why it is generically called “BERT-base”.
4. Include a short case study showing how the taxonomy is applied to a specific device (e.g., IoT gateway with root-of-trust), with coverage scoring and remediation prioritization.
5. Re-read the article to align the standards used and variations in cited sources, inconsistent metrics, etc.
Author Response
Comment 1: In the section describing similar solutions, it would be useful to highlight, beyond their limitations, what problems the proposed solution solves and what improvements it brings.
Response: We thank the reviewer for this valuable suggestion. Section 5.2 has been revised to make the problem-solving contribution of the taxonomy more explicit. In particular, we now emphasize that the taxonomy addresses three persistent challenges: (1) redundancy, by consolidating overlapping requirements across standards; (2) fragmentation, by harmonizing heterogeneous terminology and scope; and (3) coverage gaps, by explicitly integrating assurance objectives often absent in individual frameworks (e.g., Authenticity, Traceability). By presenting these improvements clearly, we highlight both the scientific contribution and the practical value of the taxonomy.
Comment 2: Clarify the metrics in Table 8 and Table 9. Unify the definitions for “Average coverage per domain” vs. “per SOD.”
Response: We appreciate this observation and agree that clearer definitions were needed. Section 4.2 and the notes for Table 9 now explicitly explain the distinction: “average coverage per domain” refers to the mean number of threats at the assurance scope level (System Component, System Environment, System Process), while “average coverage per SOD” refers to the mean number of threats mapped to each individual Security Objective Domain. This clarification ensures that the metrics are unambiguous and consistently interpreted.
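A small worked example of the two metrics, using invented threat counts rather than the paper's figures, may help readers see the distinction:

```python
# "Average coverage per domain" (assurance-scope level) vs. "average coverage
# per SOD" (individual domain level) — the counts below are illustrative only.
scopes = {
    "System Component":   {"Boot": 12, "Cryptography": 14, "Storage": 8},
    "System Environment": {"Physical": 6, "Environment": 3},
    "System Process":     {"Provisioning": 5, "Traceability": 3, "Audit": 4},
}

per_scope_totals = {scope: sum(sods.values()) for scope, sods in scopes.items()}
avg_per_scope = sum(per_scope_totals.values()) / len(per_scope_totals)

all_sod_counts = [count for sods in scopes.values() for count in sods.values()]
avg_per_sod = sum(all_sod_counts) / len(all_sod_counts)

print(f"average coverage per domain (assurance scope level) = {avg_per_scope:.1f}")
print(f"average coverage per SOD = {avg_per_sod:.1f}")
```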
Comment 3: Replace “BERT-base” with “MPNet (all-mpnet-base-v2)” in Section 3.3 or justify why it is generically called “BERT-base.”
Response: Thank you for pointing this out. We have corrected the terminology in Section 3.3 to specify that the embedding model used was MPNet (all-mpnet-base-v2) from the SentenceTransformers library. We also note that MPNet belongs to the BERT family of transformer architectures, which explains why earlier drafts used the shorthand “BERT-base.” The revised text now provides the precise model designation to avoid ambiguity.
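For reference, a minimal sketch of how requirement statements can be embedded with all-mpnet-base-v2 via the SentenceTransformers library and compared by cosine similarity is shown below; the example requirement texts are invented for illustration.

```python
# Embedding requirement statements with MPNet (all-mpnet-base-v2) and computing
# pairwise cosine similarity — a minimal sketch; the requirements are invented.
from sentence_transformers import SentenceTransformer, util

requirements = [
    "The module shall verify the integrity of firmware before execution.",
    "Boot code shall be authenticated using a digital signature prior to launch.",
    "Audit records shall be protected from unauthorized modification.",
]

model = SentenceTransformer("all-mpnet-base-v2")
embeddings = model.encode(requirements, normalize_embeddings=True)

similarity = util.cos_sim(embeddings, embeddings)  # pairwise cosine similarities
print(similarity)  # the two boot-related requirements should score higher with
                   # each other than either does with the audit requirement
```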
Comment 4: Include a short case study showing how the taxonomy is applied to a specific device (e.g., IoT gateway).
Response: We fully agree. Section 5.1 (Practical Implications) now includes an illustrative case study of an IoT healthcare gateway. This case demonstrates how the taxonomy can be applied to identify assurance gaps and prioritize remediation. In the example, explicit coverage scoring is reported: the gateway satisfied 13 of the 22 domains. Uncovered or high-risk areas included Traceability (0/3 mapped threats) and Audit (1/4). To illustrate practical prioritization, a remediation sequence is proposed: (1) implement provenance tracking for component lots (Traceability), (2) deploy signed event logging with secure export (Audit), and (3) extend key-provisioning checks into manufacturing (Provisioning). This example shows concretely how the taxonomy guides both coverage assessment and remediation prioritization, directly addressing the reviewer’s request.
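The coverage-and-gap logic of this example can be sketched mechanically as below; the abbreviated domain list and per-domain figures are hypothetical stand-ins (only the Traceability 0/3 and Audit 1/4 values echo the case study), not the paper's full dataset.

```python
# Coverage scoring and gap ranking for a device profile (illustrative sketch).
# SOD: (threats addressed by implemented controls, threats mapped to the SOD).
profile = {
    "Boot":          (12, 12),
    "Cryptography":  (14, 14),
    "Communication": (9, 9),
    "Traceability":  (0, 3),
    "Audit":         (1, 4),
    "Provisioning":  (3, 5),
}

covered = [sod for sod, (hit, total) in profile.items() if hit == total]
gaps = sorted(
    ((sod, hit, total) for sod, (hit, total) in profile.items() if hit < total),
    key=lambda item: item[1] / item[2],  # lowest coverage ratio remediated first
)

print(f"Coverage: {len(covered)}/{len(profile)} domains fully addressed")
for sod, hit, total in gaps:
    print(f"gap: {sod} ({hit}/{total} mapped threats addressed)")
```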
Comment 5: Re-read the article to align the standards used and variations in cited sources, inconsistent metrics, etc.
Response: We thank the reviewer for this reminder. The manuscript has been carefully reviewed for consistency in terminology, standard references, and metrics. Standards are now cited uniformly, references have been aligned to the updated author–year style, and terminology across tables and figures has been harmonized. We believe these revisions improve both clarity and professionalism of the presentation.
Author Response File: Author Response.pdf
Round 2
Reviewer 3 Report
Minor comment:
Regarding Comment 2 (Add risk management background): the paragraph you added, quoted below, is good but needs a reference to support it. Use this study: Analyzing cybersecurity risks and threats in IT infrastructure based on NIST framework. J. Cyber Secur. Risk Audit. "In addition to addressing technical weaknesses, assurance also requires systematic risk assessment and management. Risk-based methods ensure that assurance objectives are prioritized not only according to compliance requirements but also according to the likelihood and impact of exploitation. Frameworks such as the NIST Risk Management Framework (RMF) and subsequent adaptations for industrial control systems highlight how risk analysis can guide resource allocation and inform continuous monitoring. Recent research, for example, has applied model-based systems engineering to cybersecurity risk assessment in critical sectors, showing that structured risk models can bridge technical requirements with organizational decision-making [22]. Integrating such perspectives into hardware security assurance strengthens its ability to support evidence-driven prioritization and lifecycle resilience."
Author Response
Comment 1: Regarding the added paragraph on risk assessment and management, the reviewer requests that a supporting reference be included, specifically citing: Analyzing cybersecurity risks and threats in IT infrastructure based on NIST framework. J. Cyber Secur. Risk Audit.
Response: We thank the reviewer for this suggestion. In the revised manuscript (Section 2.1), we have retained the added paragraph on risk-based methods for assurance and integrated the recommended reference to strengthen the argument. Specifically, we now cite:
Aljumaiah, O., Jiang, W., Addula, S. R., & Almaiah, M. A. (2025). Analyzing Cybersecurity Risks and Threats in IT Infrastructure based on NIST Framework. Journal of Cyber Security and Risk Auditing, 2025(2), 12–26. https://doi.org/10.63180/jcsra.thestap.2025.2.2.
This supports our discussion of how structured risk analysis methods, guided by NIST RMF, can inform prioritization and continuous monitoring, and aligns with our emphasis on integrating risk perspectives into hardware security assurance.
Reviewer 4 Report
I am satisfied with the latest version of the manuscript. I recommend publication.
Author Response
We sincerely thank the reviewer for the positive evaluation and for recommending the manuscript for publication. We greatly appreciate the constructive feedback provided throughout the review process, which has helped us improve the clarity and rigor of this work.
Reviewer 5 Report
The comments made were treated impartially by the authors; consequently, the paper was restructured, modified, and expanded to increase clarity and to highlight the contribution more coherently.
Author Response
We sincerely thank the reviewer for the positive evaluation and for recommending the manuscript for publication. We greatly appreciate the constructive feedback provided throughout the review process, which has helped us improve the clarity and rigor of this work.