Figure 1.
Cybectr Sentinel versus manual analyst workflows and commercial tooling across five capability dimensions: Unknown Asset Inference, AI-Guided Penetration Testing, MITRE D3FEND integration, encrypted RBAC-gated reporting, and CVE signature coverage. Each dimension is normalized to a 0–1 scale by min–max scaling against the maximum capability score observed for that dimension in the comparison set, so a value of 1.0 indicates the strongest performer in that dimension, and a value of 0.0 indicates the absence of the capability. Color encoding uses ColorBrewer-derived qualitative palette categories selected for accessibility under colorblind viewing (deuteranopia and protanopia tested); each comparison subject (Sentinel, manual analyst, Nessus, OpenVAS) is assigned a distinct hue with no red–green pairing.
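One reading of the normalization described in the Figure 1 caption is division by the per-dimension maximum, which coincides with min–max scaling whenever the lowest raw score in a dimension is 0 (capability absent). The sketch below illustrates that reading only; the function name `scale_to_max` and the raw scores are hypothetical and are not the data behind the figure.

```python
def scale_to_max(scores):
    """Scale each dimension by its maximum across comparison subjects.

    scores: {dimension: {subject: raw_score}} with non-negative raw scores.
    Returns the same structure with values in [0, 1].
    """
    normalized = {}
    for dimension, by_subject in scores.items():
        top = max(by_subject.values())
        normalized[dimension] = {
            # If no subject has the capability, every value stays at 0.0.
            subject: (raw / top if top > 0 else 0.0)
            for subject, raw in by_subject.items()
        }
    return normalized


# Hypothetical raw scores, for illustration only (not the paper's data).
raw = {
    "CVE signature coverage": {
        "Sentinel": 3.0, "Manual analyst": 1.0, "Nessus": 4.0, "OpenVAS": 4.0,
    },
    "Unknown Asset Inference": {
        "Sentinel": 5.0, "Manual analyst": 0.0, "Nessus": 0.0, "OpenVAS": 0.0,
    },
}
scaled = scale_to_max(raw)
```

Under this reading, the strongest subject in each dimension maps to 1.0 and a subject with a raw score of 0 stays at 0.0, matching the interpretation given in the caption.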
Figure 2.
Fourteen-capability support matrix across six secure development frameworks.
Figure 3.
Mean Time to Remediate (MTTR) comparison for four vulnerability categories under conventional DevSecOps versus AZTRM-D Stage 3 (left), and implementation cost by framework showing initial setup versus ongoing sprint hours (right). The asterisk on the right panel indicates that the AZTRM-D setup cost is front-loaded and non-recurring.
Figure 4.
AZTRM-D runtime performance on the NVIDIA Jetson Orin Nano (Stage 3): device CPU overhead versus comparable approaches (left panel), ZT policy enforcement latency across endpoint counts (center panel), and key measured metrics (right panel). Green metric values in the right panel indicate measurements within their target operating envelope; the red value flags the false positive rate, which is the only metric whose triage cost an operator must monitor over time. The figure spans the full text width to provide adequate label legibility across all three panels; readers viewing at reduced page magnification are referred to Tables 16, 33, and 34, where the same figures appear in tabular form at full body-text size.
Figure 5.
Penetration test findings across three hardening stages on the NVIDIA Jetson Orin Nano: total findings and tester agreement (left), and attack vector success rate by stage (right).
Figure 6.
Security posture progression across the three AZTRM-D hardening stages for four key risk metrics (0 = fully exposed, 100 = fully mitigated).
Figure 7.
Vulnerability detection rate breakdown across the five CI/CD scanning modalities from Stage 3 pipeline results. Solid bars in the left panel show total seeded vulnerabilities per modality; the lighter overlaid bars show organically discovered findings during validation. In the right panel, solid bars show unique detections (caught only by that modality) and the hatched portion shows shared cross-modality detections (caught by two or more scanners). Per-modality color is used for visual differentiation only and does not encode a separate variable.
Figure 8.
Security posture across five key metrics: Stage 1 factory-default baseline versus Stage 3 full AZTRM-D hardening on the NVIDIA Jetson Orin Nano.
Table 1.
Cybectr Sentinel end-to-end workflow with AI components, algorithmic actions, and framework mappings. Source: Cybectr Sentinel architecture (this work) [1].
| Stage | Description | AI/Algorithmic Action | Frameworks |
|---|---|---|---|
| 1. Deploy | Via embedded system, local install, or USB; encrypted cloud channel established | No AI at deployment; channel to trained model initialized | N/A |
| 2. Scan | Enumerate hardware (direct + wireless signal), software (OS, apps, cloud), network (configs, policies) | Feature extraction and asset fingerprinting; asset graph constructed | NIST NVD, MITRE ATT&CK |
| 3. Aggregate | Correlate asset data against NIST NVD, MITRE ATT&CK, custom intelligence feeds | AI correlation engine maps assets to CVE/TTP database; confidence scoring applied | NIST NVD, ATT&CK |
| 4a. Known Asset | Search database for known vulnerabilities | XGBoost classifies exploitability; SHapley Additive exPlanations (SHAP) values explain each prediction [7,8] | NIST NVD |
| 4b. Unknown Asset | Trigger AI similarity analysis | Sentence Transformer cosine similarity infers vulnerability profile from nearest known asset [5] | Custom index |
| 5. Pen Test Gate | Request user approval; deploy isolated miniature test environment | PPO RL agent selects attack sequences; Metasploit executes in sandbox [6] | MITRE ATT&CK TTPs |
| 6a. Validated | Confirm and classify vulnerability; zero-day pipeline if novel | Confidence threshold check; novel findings escalate to zero-day classification | MITRE ATT&CK |
| 6b. Not Exploitable | Log as false positive; return to monitoring | Isolation Forest model re-weighted for asset class [9] | N/A |
| 7. Mitigation | Generate patches, config fixes, hardening strategies | RAG pipeline queries D3FEND; large language model (LLM) synthesizes specific remediation steps [10] | MITRE D3FEND |
| 8. Reporting | Compile encrypted findings; enforce RBAC access controls | AES-256 encryption; RBAC policy engine per NIST SP 800-53 [11] | NIST SP 800-53 [11] |
| 9. Active Defense | Inform defensive planning via MITRE ENGAGE | ENGAGE strategies selected from confirmed TTPs; deception assets deployed if warranted [1] | MITRE ENGAGE |
Table 2.
Cybectr Sentinel AI subsystem specifications with algorithms, functions, and key parameters. Source: author’s design and implementation (this work).
| AI Component | Algorithm | Function in Sentinel | Key Parameters/Formula |
|---|---|---|---|
| Behavioral Anomaly Detection | Isolation Forest [9] | Real-time detection of unusual device or user behavioral patterns in telemetry stream | ; ; trees, sub-sample; threshold: alert, auto-containment |
| Vulnerability Triage | XGBoost + SHAP [7,8] | Classifies and prioritizes vulnerabilities by exploitability; SHAP explains each decision | ; SHAP: |
| Unknown Asset Inference | Sentence Transformer cosine similarity [5] | Infers vulnerability profile for novel assets by matching against known-asset embedding library | ; match threshold |
| AI-Guided Pen Testing | PPO [6] | RL agent selects optimal attack sequences in isolated sandbox; reward tied to exploit success, novelty, and stealth | ; , |
| Mitigation Generation | RAG + LLM [10] | Queries MITRE D3FEND knowledge base; generates specific patches, config fixes, hardening steps per finding | Top-k D3FEND document retrieval; cosine similarity retrieval; LLM synthesizes output |
| Adversarial Robustness Testing | DiCE counterfactuals [12] | Generates minimum-perturbation adversarial examples to test classifier evasion; findings feed model retraining | Counterfactual diversity constraint; proximity loss minimized; integrated into Sentinel retraining pipeline |
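
The Unknown Asset Inference row describes matching a novel asset against a library of known-asset embeddings by cosine similarity. A minimal sketch of that lookup follows, using only the standard library. The three-dimensional vectors, the asset names, and the threshold value of 0.8 are hypothetical placeholders; the actual embedding model output and match threshold are not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # A zero vector has no direction; treat as no match.
    return dot / (norm_a * norm_b)


def nearest_known_asset(query, library, threshold):
    """Return (name, similarity) of the best match, or (None, similarity)
    when the best similarity falls below the threshold."""
    best_name, best_sim = None, -1.0
    for name, embedding in library.items():
        sim = cosine_similarity(query, embedding)
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim < threshold:
        return None, best_sim
    return best_name, best_sim


# Hypothetical embeddings and threshold, for illustration only.
known_assets = {
    "ip-camera-fw-2.1": [1.0, 0.0, 0.0],
    "router-os-7": [0.0, 1.0, 0.0],
}
match, similarity = nearest_known_asset([0.9, 0.1, 0.0], known_assets, threshold=0.8)
no_match, low_similarity = nearest_known_asset([0.0, 0.0, 1.0], known_assets, threshold=0.8)
```

In this toy example the first query resolves to the camera profile, while the second is orthogonal to every known asset and returns no match, which is the case where a real system would fall back to manual review.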
Table 3.
XAI implementation stack in Cybectr Sentinel: algorithm, role, output format, and AZTRM-D enforcement function. Source: Cybectr Sentinel architecture (this work).
| XAI Component | Algorithm | Output Format | AZTRM-D Enforcement Function |
|---|---|---|---|
| Vulnerability Triage Explanation | SHAP (TreeExplainer) [7] | Per finding SHAP waterfall plot + feature attribution table | Human gate reviewers see which CVE features drove the priority score; supports defensible authorization decisions under NIST RMF assess phase |
| Natural-Language Finding Summary | Claude API (Anthropic) | Plain-language analyst briefing per finding | Translates SHAP attribution scores into clear developer/administrator/CISO-level summaries; role-appropriate detail level enforced by RBAC |
| Anomaly Explanation | Isolation Forest path length decomposition [9] | Short-path feature trace per flagged instance | Shows which behavioral telemetry features caused an anomalous classification; enables analyst to distinguish genuine insider threat behavior from monitoring noise |
| Adversarial Robustness Report | DiCE counterfactuals [12] | Minimum-perturbation example set per classifier | Documents the nearest decision-boundary crossing for each AI component; feeds Sentinel model retraining pipeline |
| Pen Test Action Trace | PPO episode log + MITRE ATT&CK TTP mapping [6] | Per episode action sequence with ATT&CK technique labels | Translates RL agent actions into human-readable attack narrative; maps each step to ATT&CK technique IDs for ENGAGE active defense planning |
| RAG Retrieval Provenance | D3FEND document retrieval log (top-k cosine similarity) [10] | Source document list with similarity scores per mitigation recommendation | Each AI-generated mitigation step is traceable to the specific D3FEND document that sourced it; supports audit and compliance review |
Table 4.
Cybectr Sentinel measured performance metrics. All figures are self-measured by the authors on the Cybectr Sentinel platform. Source: Stage 3 validation on NVIDIA Orin (this work).
| Metric | Value | Notes |
|---|---|---|
| Time-to-Initial-Detection (TTID) | 4.2 min average | Full pipeline: deploy → scan → correlate → first anomaly or vulnerability flag |
| False Positive Rate (FPR) | 3.1% | Across both behavioral anomaly and vulnerability detection pipelines combined |
| Vulnerability Classification Precision | 94.1% | XGBoost on held-out validation set |
| Vulnerability Classification Recall | 91.8% | XGBoost on held-out validation set; tuned above precision intentionally |
| SAST-Augmented TPR (GNN-assisted) | 81.4% on BigVul dataset | GNN-augmented static analysis; tested against BigVul benchmark [16] |
| Adversarial Detection Rate | 93.7% | AI components tested against DiCE counterfactual adversarial inputs [12] |
| Mitigation Report Latency | <90 s per finding | RAG + LLM pipeline including D3FEND retrieval and report compilation |
| RBAC Enforcement Latency | <40 ms per access check | Consistent with AZTRM-D ZT PEP latency target |
| AI Model Training (Insider Threat Model) | 14 h initial | 3 months of log data; standard cloud GPU; one-time cost before deployment |
Table 5.
Stage 3 false positive rate decomposition by Sentinel AI subsystem. The aggregate 3.1% FPR combines the false positives of all three subsystems over their respective monitoring event totals. Source: Stage 3 validation telemetry (this work); per-subsystem figures derived from the internal alert log review at submission time.
| Subsystem | FPR | Trigger Volume | Contribution to Aggregate | Notes on Per Class Detail |
|---|---|---|---|---|
| Isolation Forest (behavioral) | 2.4% | Continuous device telemetry; high-volume monitoring stream | Largest contributor | Per-class breakdown (configuration vs. behavioral vs. access-pattern false positives) was not preserved in Stage 3 instrumentation at the granularity needed for Wilson CI computation; addressed in Stage 4 |
| XGBoost (vulnerability triage) | 0.7% | Per commit and per scan-cycle event stream; lower volume than behavioral pipeline | Smaller contributor | SHAP explanations available per false positive; provides operational triage support even without per class FPR decomposition |
| GNN-augmented SAST | Not separately preserved | SAST gate fires per commit; volume bounded by commit frequency | Included in aggregate but not separately attributable | Operational FPR characterization at per commit granularity identified as a Stage 4 telemetry requirement |
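
The notes column of Table 5 refers to Wilson confidence intervals, which need raw counts (false positives and total events) rather than a rate alone. The sketch below shows the standard Wilson score interval for a binomial proportion. The counts in the example (24 false positives in 1000 events, chosen to give 2.4%) are hypothetical; the Stage 3 event totals are not stated in the table.

```python
import math


def wilson_interval(false_positives, total_events, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    Returns (lower, upper) bounds on the underlying rate.
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= false_positives <= total_events:
        raise ValueError("false_positives must lie in [0, total_events]")
    n = total_events
    p = false_positives / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical counts, for illustration only (not Stage 3 telemetry).
lower, upper = wilson_interval(false_positives=24, total_events=1000)
```

The interval narrows as the event count grows, which is why per-class counts at sufficient granularity are a prerequisite for reporting per-class intervals.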
Table 6.
Cybectr Sentinel capability comparison against manual processes and commercial alternatives. Source: author’s comparative assessment based on cited tool documentation and deployment experience (this work).
| Capability | Manual/Commercial Tools Alone | Cybectr Sentinel (AZTRM-D Layer) |
|---|---|---|
| Unknown Asset Coverage | None; scanners skip assets not in signature databases [23] | Cosine similarity inference covers novel and unregistered devices [5] |
| AI-Guided Pen Testing | Requires dedicated red team; periodic only [24] | PPO agent runs on demand in isolated sandbox [6] |
| Vulnerability Explanation | Black-box scanner output; no rationale provided [23] | SHAP values explain every classification; fully auditable [7] |
| Active Defense Integration | Not addressed by commercial scanners [23] | MITRE ENGAGE for adversary engagement and deception planning [1] |
| Remediation Generation | Manual lookup against vendor advisories [23] | RAG + LLM generates specific patches and hardening steps per finding [10] |
| Report Security | Typically plaintext or unencrypted exports [23] | AES-256 encrypted; RBAC-gated per NIST SP 800-53 [11] |
| Zero-Day Classification | Requires human analyst [24] | Automated novel vulnerability classification with confidence scoring [1] |
| AZTRM-D Alignment | No alignment; bolt-on tools require custom integration | Designed to enforce AZTRM-D controls end-to-end [1] |
Table 7.
Coverage division between Sentinel and conventional tools in the AZTRM-D stack. Source: author’s deployment assessment (this work).
| Tool | What It Covers | Why It Cannot Be Replaced by Sentinel |
|---|---|---|
| Nmap | Live host discovery; exact service and version enumeration on all 65,535 ports | Sentinel’s asset scanner does not perform full-range TCP port scanning with version probing; Nmap’s -sV banner matching against its NSE script library is purpose-built and irreplaceable for service fingerprinting |
| Nessus Professional | Credentialed CVE scanning with authenticated access to OS, drivers, and kernel [23] | Sentinel’s XGBoost classifier scores known CVEs from the NVD; it does not perform credentialed OS-level probing. JetPack driver CVEs that are invisible from the network require authenticated Nessus scans to surface [23] |
| OpenVAS | Cross-validation against Nessus findings; open-source baseline with independent CVE signatures | Provides an independent second opinion with no commercial license dependency; scanner disagreement flags ambiguous findings for manual review |
| LinPEAS | Post-access privilege escalation enumeration: SUID binaries, cron jobs, sudo policy, kernel exploits | Requires a live shell on the target; Sentinel’s PPO sandbox agent cannot enumerate the real device’s privilege escalation surface from outside |
| Hydra | Credential policy validation: confirms account lockout, brute-force rate limiting, and default credential policies are actually enforced | Validates enforcement rather than detecting it; only a live brute-force attempt confirms that lockout fires at the configured threshold |
| Cybectr Sentinel | Unknown Asset Inference; on-demand AI pen testing; D3FEND mitigation mapping; ENGAGE active defense; behavioral anomaly detection; encrypted RBAC reporting [1] | The four capability gaps addressed by Sentinel are absent from all other tools in the stack |
Table 8.
Cybectr Sentinel strengths, limitations, and the tools that address each limitation. Source: author’s operational assessment from Stage 3 deployment.
| Dimension | Assessment | Addressed By |
|---|---|---|
| CVE Signature Coverage | XGBoost classifier trained on NVD dataset; newly published CVEs not in training data are not caught until model retrain or RAG index update | Nessus Professional and OpenVAS with daily signature updates |
| Raw Network Probing | Sentinel does not perform packet-level network probing; its network scan relies on imported Nmap and Nessus results | Nmap -sV -p- and Nessus for raw port and service discovery |
| RF Analysis | Sentinel has no SDR interface; it cannot directly observe wireless traffic | RTL-SDR with GNU Radio and Wireshark |
| Physical Attack Surface | Sentinel cannot detect or prevent physical hardware manipulation; full-disk encryption and physical controls are outside its scope | Linux Unified Key Setup (LUKS2) full-disk encryption, GPIO hardening, UART gating |
| Unknown Asset Inference (Strength) | Sentence Transformer cosine similarity covers assets not in any signature database; identifies nearest known asset with similarity | Sentinel-native capability; no equivalent in commercial tools |
| On-Demand AI Pen Testing (Strength) | PPO agent runs validated attack sequences in isolated sandbox; no red team required for continuous validation | Sentinel-native capability |
| XAI and Auditability (Strength) | SHAP explanations for every classification; Claude-generated natural-language summaries; RAG provenance tracing | Sentinel-native capability |
| Model Training Data Dependency | Isolation Forest requires 3 months of operational log data before behavioral thresholds are trustworthy | Human analyst review coverage during bootstrapping period |
| White-Box Adversarial Evasion | 93.7% adversarial detection rate reflects DiCE-generated evasion, not white-box adversarial attacks against a fully informed attacker | DiCE adversarial retraining pipeline; SHAP monitoring; human analyst review for suspicious findings |
| Scale at Enterprise | 3.1% false positive rate on the Stage 3 single-device corpus; projected to 31 alerts per cycle at 1000 hypothetical endpoints and 310 at 10,000 (planning estimates, not measured fleet-scale figures) | SHAP-prioritized alert queue; tiered alert handling; RBAC-based routing |
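
The planning estimates in the Scale at Enterprise row follow from multiplying the 3.1% false positive rate by the endpoint count. The sketch below reproduces that arithmetic under one explicit assumption inferred from the numbers rather than stated in the table: each endpoint contributes one alert-eligible event per cycle. The function name and the `events_per_endpoint` parameter are hypothetical.

```python
def projected_false_alerts(fpr, endpoints, events_per_endpoint=1):
    """Expected false alerts per cycle for a fleet of a given size.

    Assumes a uniform false positive rate and the same number of
    alert-eligible events on every endpoint (a planning simplification).
    """
    if not 0.0 <= fpr <= 1.0:
        raise ValueError("fpr must be a fraction between 0 and 1")
    return fpr * endpoints * events_per_endpoint


STAGE3_FPR = 0.031  # 3.1% aggregate FPR from the Stage 3 single-device corpus

alerts_1k = projected_false_alerts(STAGE3_FPR, 1_000)
alerts_10k = projected_false_alerts(STAGE3_FPR, 10_000)
print(f"1,000 endpoints: about {alerts_1k:.0f} false alerts per cycle")
print(f"10,000 endpoints: about {alerts_10k:.0f} false alerts per cycle")
```

Because the estimate scales linearly, any change in per-endpoint event volume or in the fleet-wide false positive rate shifts the projection proportionally, which is why the table labels these values as planning estimates rather than measured fleet-scale figures.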
Table 9.
AZTRM-D versus five conventional SDLC methodologies across six security-relevant dimensions. Each cell carries its own source citation.
| Dimension | Waterfall [25] | Agile/Scrum [26] | DevOps [27] | Spiral [28] | RAD [29] | AZTRM-D |
|---|---|---|---|---|---|---|
| Security Integration | Late-stage gate [25] | Ad hoc per sprint [26] | Partial [27] | Per cycle risk [28] | Minimal [29] | Continuous, automated, every phase [1] |
| Zero Trust Architecture | None [25] | None [30] | None [27] | None [28] | None [29] | Core design principle [3,31] |
| AI-Driven Automation | None [32] | None [32] | Limited [24] | None [28] | None [29] | AI orchestration across full lifecycle [1] |
| Risk Management | Informal [32] | Backlog-based [33] | Monitoring only [27] | Formal per cycle [28] | Minimal [29] | NIST RMF integrated from planning [2] |
| Regulatory Compliance | Manual audit [32] | Manual audit [33] | Partial automation [24] | Formal docs [28] | None [29] | Automated compliance mapping [1] |
| IoT/Edge Security | Not addressed [25] | Not addressed [30] | Limited [27] | Limited [28] | Not addressed [29] | First-class design target [1] |
Table 10.
Secure SDLC framework comparison across seven capability dimensions. Each cell carries its own source citation.
| Capability | MS SDL [34] | OWASP SAMM [35] | BSIMM [36] | NIST SSDF [37] | DO-178C [38] | AZTRM-D |
|---|---|---|---|---|---|---|
| Threat Modeling | Manual (STRIDE) [34] | Process-defined [35] | Measured maturity [36] | Recommended [37] | Hazard analysis [38] | AI-automated, dynamic [1] |
| Security Testing | SAST + pen test [34] | SAST + DAST [35] | Measured practice [36] | SAST + SCA [37] | V&V [38] | SAST + DAST + SCA + AI pen test [1] |
| Zero Trust | None [39] | None [35] | None [36] | None [37] | None [38] | Full ZT enforcement [3] |
| AI Integration | None [39] | None [35] | None [36] | None [37] | None [38] | AI orchestration throughout [1] |
| IoT/Edge | None [39] | None [35] | None [36] | Limited [37] | Yes (avionics) [38] | First-class target [1] |
| NIST RMF | None [39] | None [35] | None [36] | Partial [37] | None [38] | Fully integrated [2] |
| AI-Driven Monitoring | None [39] | None [35] | None [36] | None [37] | None [38] | Continuous behavioral analysis [1] |
Table 11.
Fourteen-capability comparative matrix across five secure SDLC frameworks and AZTRM-D. Assessments based on published framework documentation cited in Section 3.2.3.
| Capability | MS SDL [34] | OWASP SAMM [35] | BSIMM [36] | NIST SSDF [37] | DO-178C [38] | AZTRM-D |
|---|---|---|---|---|---|---|
| Automated Threat Modeling | No | Partial | Partial | Partial | No | Yes |
| Shift-Left Security (Dev Phase) | Yes | Yes | Yes | Yes | No | Yes |
| AI-Assisted Code Analysis (GNN/SAST) | No | No | No | No | No | Yes |
| Zero Trust Network Architecture | No | No | No | No | No | Yes |
| Identity Verification (Continuous) | No | Partial | No | Partial | No | Yes |
| AI-Guided Penetration Testing | No | No | No | No | No | Yes |
| Unknown Asset Similarity Analysis | No | No | No | No | No | Yes |
| MITRE ATT&CK Integration | Partial | Partial | Yes | Yes | No | Yes |
| MITRE D3FEND Mitigation Mapping | No | No | No | No | No | Yes |
| MITRE ENGAGE Active Defense | No | No | No | No | No | Yes |
| Post-Quantum Readiness Planning | No | No | No | No | No | Yes |
| Full-Disk Encryption (IoT/Edge) | No | No | No | Partial | No | Yes |
| Immutable Audit Logging | Partial | Partial | Partial | Yes | No | Yes |
| Automated SBOM + Artifact Signing | Partial | Partial | Partial | Yes | No | Yes |
Table 12.
Master SDLC and security framework capability comparison. Column assessments: Waterfall [25,32]; Agile [26,42]; DevOps [24,27]; Spiral [28]; MS SDL [34,40]; OWASP SAMM [35]; BSIMM [36]; NIST SSDF [37]; DO-178C [38]; and AZTRM-D [1].
| Capability | Waterfall [25] | Agile [26] | DevOps [27] | Spiral [28] | MS SDL [34] | OWASP SAMM [35] | BSIMM [36] | NIST SSDF [37] | DO-178C [38] | AZTRM-D |
|---|---|---|---|---|---|---|---|---|---|---|
| Security-first design | × | ∼ | ∼ | ✓ | ✓ | ✓ | ∼ | ✓ | ✓ | ✓ |
| Automated CI/CD sec gates | × | ∼ | ✓ | × | ∼ | ∼ | ∼ | ∼ | × | ✓ |
| Zero Trust architecture | × | × | × | × | × | × | × | × | × | ✓ |
| AI threat detection | × | × | × | × | × | × | × | × | × | ✓ |
| AI-guided pen testing | × | × | × | × | × | × | × | × | × | ✓ |
| SBOM/supply chain mgmt | × | × | ∼ | × | ∼ | ∼ | ∼ | ✓ | ∼ | ✓ |
| Formal RMF integration | × | × | × | × | × | × | × | ✓ | ✓ | ✓ |
| Unknown Asset Inference | × | × | × | × | × | × | × | × | × | ✓ |
| D3FEND countermeasure map | × | × | × | × | × | × | × | × | × | ✓ |
| ENGAGE active defense | × | × | × | × | × | × | × | × | × | ✓ |
| Post-quantum readiness | × | × | × | × | × | × | × | × | × | ✓ |
| IoT/edge-specific guidance | × | × | × | × | × | ∼ | × | ∼ | ✓ | ✓ |
| Insider threat detection | × | × | × | × | ∼ | ∼ | ✓ | ∼ | × | ✓ |
| Capabilities Present | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 4 | 3 | 13 |
Table 13.
Implementation cost and effort comparison. AZTRM-D figures from Stage 3 measured data (this work); comparison methodology figures from cited sources; see table notes.
| Dimension | Agile/Scrum | MS SDL | AZTRM-D | Source |
|---|---|---|---|---|
| Initial setup (person-hours) | 40–80 † | 200–400 | 320–480 | Agile range: author estimate based on process complexity in [24,26] †; SDL from [34,40]; AZTRM-D from Stage 3 measured timeline (Table 14) |
| Security tool licensing (annual) | $0–5k † | $15k–40k | $12k–35k | Publicly listed vendor pricing as of the 2024 deployment period: Nessus Professional [23], GitLab Ultimate, Semgrep Pro, OWASP ZAP (open source); Agile range reflects minimal tooling †; AZTRM-D figure validated against actual Stage 3 deployment spend |
| Ongoing overhead per sprint | 2–5 h | 20–40 h | 4–8 h | AZTRM-D Stage 3 measurement; SDL overhead from [34]; Agile baseline reflects typical security review effort for teams achieving high deployment frequency [24] |
| Commit-stage defect fix | 1× | 1× | 1× | Baseline |
| Pre-release defect fix | 10–15× | 10–15× | 10–15× | [46,47] |
| Post-release defect fix | 30–100× | 30–100× | 30–100× | [45,47,48] |
Table 14.
AZTRM-D implementation timeline and effort breakdown from actual deployment on NVIDIA Orin devices. Source: Stage 3 deployment log (this work).
| Week | Phase | Key Activities | Primary Challenges |
|---|---|---|---|
| 1–2 | Infrastructure Setup | GitLab self-hosted instance; SSH key provisioning; RBAC role definitions; Multi-Factor Authentication (MFA) enrollment | SSH key rotation for three credential tiers required careful ordering to avoid access lockouts |
| 2–3 | CI/CD Pipeline Construction | SAST integration (Semgrep for C/C++ and Python 3.14); SCA via GitLab dependency scanning; IaC scanning via Checkov; Secrets Scan via Gitleaks; SBOM generation via Syft | Tuning SAST false positive rates below 5% without suppressing real findings required two iteration cycles |
| 3–4 | ZT Policy Deployment | SPIFFE/SPIRE SVID provisioning; sudo policy hardening (/etc/sudoers.d/); account lockout configuration; immutable log enforcement via auditd + remote syslog | SPIRE SVID rotation on resource-constrained Orin hardware introduced CPU spikes; resolved by extending interval from 1 to 4 h with compensating session validation |
| 4–5 | Full-Disk Encryption | LUKS2 setup on SD cards; AES-256-XTS key provisioning; boot-time unlock; offline attack resistance validation | Bootloader configuration for LUKS2 on the Jetson Orin Nano required custom initramfs hooks; standard Ubuntu LUKS setup does not cover SD card boot path |
| 5–6 | Wireless Hardening | WPA3 SAE enforcement; wireless microsegmentation; Bluetooth disablement; RF emission baseline capture | WPA3 SAE required driver update on the Jetson Orin Nano WiFi module; older driver revisions fell back to WPA2 silently |
| 6–8 | Sentinel Deployment and AI Bootstrapping | Sentinel deployment in embedded mode; Isolation Forest training on 3 months of operational logs; XGBoost classifier training on NIST NVD CVE dataset; PPO agent training in isolated sandbox; SHAP pipeline validation | 14 h initial Isolation Forest training on cloud GPU; 3.1% false positive rate required threshold tuning before automated containment was enabled |
| 8–10 | DAST and Supply Chain Gate Integration | DAST via OWASP ZAP; CSPM via Prowler against cloud configuration baseline; cryptographic artifact signing via Cosign; SBOM attestation pipeline | DAST scan timing required a dedicated staging environment to avoid false positives from incomplete deployment states |
Table 15.
Estimated implementation effort and security debt accumulation across AZTRM-D, conventional SDLC, and secure SDLC frameworks. AZTRM-D figures from Stage 3 deployment data (this work); comparison figures from cited framework documentation; see table notes.
| Methodology | Initial Setup | Tooling Cost | Security Overhead per Sprint | Time to First Hardened Deploy | Security Debt Accumulation |
|---|---|---|---|---|---|
| Waterfall [25,32] | Low (1–2 weeks) | Minimal | Near zero during development; concentrated at final review | 6–18 months (post-delivery) | High; findings arrive too late for economical rework |
| Agile/Scrum [26,30] | Low (1–2 weeks) | Minimal without dedicated security tooling | Low; security tickets compete with feature backlog | Variable; security rarely blocks release | Medium; sprint-by-sprint accumulation without structural gate |
| DevOps [24,27] | Medium (2–4 weeks) | Moderate; CI/CD pipeline tooling | 5–10% engineering time for pipeline maintenance † | 4–8 weeks after CI/CD is operational | Medium to low; depends on whether security was added to the pipeline |
| Microsoft SDL [34,40] | Medium (4–8 weeks) | Low; primarily process overhead | 10–15%; manual threat modeling and review cycles † | 8–16 weeks | Low for known-class threats; high for novel attack surfaces |
| NIST SSDF [37] | Medium (4–8 weeks) | Low; guidance-based, tools chosen separately | 10–15%; documentation and artifact requirements † | 8–16 weeks | Low for compliance-driven deployments; ZT and AI gaps remain |
| OWASP SAMM [35] | Low (1–3 weeks to assess; months to mature) | Low; maturity measurement only | Varies with maturity; typically 5–20% † | Not prescriptive; maturity timeline is 6–24 months | Decreases as maturity increases; no prescribed ZT or AI path |
| AZTRM-D [1] | High (6–10 weeks for full pipeline and ZT deployment) | Moderate to high; GitLab CI/CD, Nessus Professional, Sentinel, LUKS2, WPA3, SPIFFE/SPIRE | 12–18% CPU overhead on device; 15–20% engineering time ongoing for gate maintenance | 6–10 weeks | Low from day one; all five scanning modalities active from first commit |
Table 16.
Performance overhead comparison: AZTRM-D vs. comparable secure development approaches on IoT/edge hardware. AZTRM-D figures from Stage 3 measured data (this work); comparison figures from cited framework documentation.
| Approach | Device CPU Overhead | Pipeline Latency Added | On-Device Monitoring | Source |
|---|---|---|---|---|
| AZTRM-D (full) | 12–18% (active scan)/3–5% (idle) | Included in CI/CD gates | Yes (AI behavioral monitoring) | Stage 3 measured [4] |
| Standard DevSecOps | 0–2% | 5–15 min (typical SAST/DAST) | No | No on-device agent by design [44] |
| Microsoft SDL | 0–2% | 10–30 min (manual gates) | No | No on-device agent by design [40] |
| ZT enforcement only (no AI) | 3–8% (representative range from surveyed ZT implementations) [49] | Minimal | Partial (access control only) | [49,50] |
Table 17.
Anomaly detection algorithm comparison on NVIDIA Orin. Isolation Forest values from Stage 3 measured data (this work); alternative algorithm values are author estimates (see table notes).
| Algorithm | Inference Latency | Memory (MB) | Explainability | CPU Overhead |
|---|---|---|---|---|
| Isolation Forest [9] | <1 ms | <50 | Feature path | 3–5% |
| One-Class SVM [51] † | >10 ms | 200–400 | None | 12–18% |
| Autoencoder (LSTM) [51] † | >5 ms | 150–300 | Post hoc only | 15–22% |
Table 18.
Vulnerability triage algorithm comparison. All figures are from the author’s preliminary evaluation on the CVE triage dataset (this work); see table notes.
| Algorithm | Precision | Recall | SHAP Compatible | Overfitting Risk |
|---|---|---|---|---|
| XGBoost [8] (final, AZTRM-D) | 94.1% | 91.8% | Yes (exact) | Low (regularized) |
| Random Forest [53] (preliminary) | 91.3% | 88.5% | Approximate only | Moderate |
| Neural Network (MLP) [54] | 89.7% | 85.2% | Post hoc approx. | High (CVE dataset) |
Table 19.
XGBoost input features for CVE vulnerability triage, sourced from NIST NVD API v2.0 and CVSS v3.1 scoring.
| Feature | Source | Type | Encoding/Notes |
|---|---|---|---|
| CVSS v3 Base Score | NVD/CVSS v3.1 [56] | Continuous [0.0–10.0] | Raw numeric score |
| CVSS v3 Exploitability Subscore | NVD/CVSS v3.1 [56] | Continuous [0.0–3.9] | Derived from AV, AC, PR, UI |
| CVSS v3 Impact Subscore | NVD/CVSS v3.1 [56] | Continuous [0.0–6.0] | Derived from C, I, A |
| Attack Vector (AV) | NVD/CVSS v3.1 [56] | Categorical | One-hot: network, adjacent, local, physical |
| Attack Complexity (AC) | NVD/CVSS v3.1 [56] | Binary | 0 = high, 1 = low |
| Privileges Required (PR) | NVD/CVSS v3.1 [56] | Ordinal | 0 = high, 1 = low, 2 = none |
| User Interaction (UI) | NVD/CVSS v3.1 [56] | Binary | 0 = required, 1 = none |
| Scope (S) | NVD/CVSS v3.1 [56] | Binary | 0 = unchanged, 1 = changed |
| Confidentiality Impact (C) | NVD/CVSS v3.1 [56] | Ordinal | 0 = none, 1 = low, 2 = high |
| Integrity Impact (I) | NVD/CVSS v3.1 [56] | Ordinal | 0 = none, 1 = low, 2 = high |
| Availability Impact (A) | NVD/CVSS v3.1 [56] | Ordinal | 0 = none, 1 = low, 2 = high |
| CWE Category | NVD [13] | Categorical | Top-25 CWE IDs one-hot encoded; remainder bucketed as “Other” |
| CVE Age (days) | NVD Published Date [13] | Continuous | Days since publication at training time |
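
The categorical, binary, and ordinal encodings in Table 19 can be illustrated with a minimal sketch that maps a CVSS v3.x vector string to model features. The function name `encode_cvss_vector` and the output key names are hypothetical and are not taken from the AZTRM-D implementation; only the vector-derived rows are covered (scores, CWE category, and CVE age are omitted).

```python
# Illustrative sketch only: names and dict layout are assumptions, not the
# authors' code. Encodings follow the "Encoding/Notes" column of Table 19.
AV_LEVELS = {"N": "network", "A": "adjacent", "L": "local", "P": "physical"}

ENCODINGS = {
    "AC": {"H": 0, "L": 1},           # binary: 0 = high, 1 = low
    "PR": {"H": 0, "L": 1, "N": 2},   # ordinal: 0 = high, 1 = low, 2 = none
    "UI": {"R": 0, "N": 1},           # binary: 0 = required, 1 = none
    "S": {"U": 0, "C": 1},            # binary: 0 = unchanged, 1 = changed
    "C": {"N": 0, "L": 1, "H": 2},    # ordinal: 0 = none, 1 = low, 2 = high
    "I": {"N": 0, "L": 1, "H": 2},
    "A": {"N": 0, "L": 1, "H": 2},
}


def encode_cvss_vector(vector: str) -> dict:
    """Turn e.g. 'CVSS:3.1/AV:N/AC:L/...' into a flat feature dict."""
    parts = vector.split("/")
    if not parts[0].startswith("CVSS:3"):
        raise ValueError("expected a CVSS v3.x vector string")
    metrics = dict(part.split(":", 1) for part in parts[1:])

    # Attack Vector is one-hot encoded across its four levels.
    features = {
        f"AV_{name}": int(metrics["AV"] == code)
        for code, name in AV_LEVELS.items()
    }
    # Remaining base metrics use the binary/ordinal codes above.
    for metric, mapping in ENCODINGS.items():
        features[metric] = mapping[metrics[metric]]
    return features


if __name__ == "__main__":
    print(encode_cvss_vector("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))
```

The ordinal codes preserve the severity ordering within each metric, which lets a tree-based model split on a single threshold instead of several one-hot columns.
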
Table 20.
Reinforcement learning algorithm comparison for pen testing. Convergence and stability figures are drawn from published RL-for-security benchmarks; PPO sandbox metric is from internal Stage 3 measurement. See table notes for source details.
| Algorithm | Action Space | Convergence Episodes (to 80% Optimal Policy) | Sample Efficiency (Updates/Episode) | Sparse-Reward Stability (Variance) |
|---|---|---|---|---|
| PPO (AZTRM-D) [6] | Discrete | 800–1500 † | 4 (multi-epoch) | Low variance; clipped objective prevents destabilization |
| DQN [57,58] | Discrete | 2500–5000 ‡ | 1 (single-step) | High variance under sparse rewards; documented catastrophic forgetting in security tasks |
| DDPG [59] | Continuous (mismatch for discrete pen testing actions) | N/A (architectural mismatch) | 1 (single-step) | Moderate variance; designed for continuous control |
Table 21.
Cosine similarity threshold calibration for Unknown Asset Inference (IoT asset profiles; 80/20 held-out split from the known-asset library). The F1-score peaked at 0.86 at a threshold of 0.82, which was selected as the operational threshold; the peak F1-score is shown in bold. Source: author’s grid search evaluation (this work).
| Threshold | Precision | Recall | F1-Score |
|---|---|---|---|
| 0.70 | 0.61 | 0.95 | 0.74 |
| 0.75 | 0.74 | 0.91 | 0.82 |
| 0.80 | 0.83 | 0.87 | 0.85 |
| 0.82 | 0.87 | 0.85 | **0.86** |
| 0.85 | 0.91 | 0.78 | 0.84 |
| 0.90 | 0.95 | 0.61 | 0.74 |
| 0.95 | 0.98 | 0.39 | 0.56 |
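
The F1 column of Table 21 follows from the precision and recall columns as their harmonic mean, F1 = 2PR/(P + R). The sketch below recomputes it and picks the F1-maximizing threshold; the function names are illustrative and the rows are copied from the table, not regenerated from the underlying asset library.

```python
# (threshold, precision, recall) rows copied from Table 21.
ROWS = [
    (0.70, 0.61, 0.95),
    (0.75, 0.74, 0.91),
    (0.80, 0.83, 0.87),
    (0.82, 0.87, 0.85),
    (0.85, 0.91, 0.78),
    (0.90, 0.95, 0.61),
    (0.95, 0.98, 0.39),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def best_threshold(rows) -> float:
    """Return the threshold whose row has the highest F1-score."""
    return max(rows, key=lambda row: f1_score(row[1], row[2]))[0]


if __name__ == "__main__":
    for threshold, precision, recall in ROWS:
        print(f"{threshold:.2f}  F1 = {f1_score(precision, recall):.2f}")
    print("selected threshold:", best_threshold(ROWS))
```

Raising the threshold trades recall for precision, so the F1 curve rises and then falls; the grid point with the highest value is the one reported as the operating threshold.
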
Table 22.
Comparative performance of Sentinel AI subsystems against published external baselines. Per-row dataset and protocol differences are documented in the notes column. Source: published baselines per row citation; Sentinel figures from this work.
| Task | Sentinel (This Work) | External Baseline | Baseline Source | Comparison Notes |
|---|---|---|---|---|
| Vulnerability classification (CVE) | XGBoost: 94.1% precision, 91.8% recall on NVD CVSS-v3 corpus | Random Forest: 91.3% precision, 88.5% recall on same corpus | [53] (algorithm), this work (figures) | Same evaluation corpus and split; preliminary algorithm comparison run during selection |
| Vulnerability classification (neural baseline) | XGBoost (above) | MLP feedforward: 89.7% precision, 85.2% recall on same corpus | [54] (algorithm), this work (figures) | Same evaluation corpus and split; selected against on explainability and accuracy |
| Source code vulnerability detection | GNN-augmented Semgrep: 81.4% TPR at 6.2% FPR on BigVul held-out | Pattern-matching SAST baseline: ∼50% TPR at 5–10% FPR on novel patterns | [16] | BigVul corpus published partition; pattern-matching baseline figures from BigVul authors |
| Behavioral anomaly detection (edge IoT) | Isolation Forest: 3.1% FPR aggregate; sub-40 ms PEP latency | OCSVM: estimated >200 ms inference latency on Cortex-A class hardware | [51] | Sentinel figures measured on NVIDIA Orin Stage 3; OCSVM figure is computational-complexity estimate, not measured (see Table 17) |
| Behavioral anomaly detection (deep architecture) | Isolation Forest (above) | Autoencoder: requires GPU inference; exceeds 20% CPU budget on Orin | [51] | Selected against on edge hardware compute budget; autoencoders perform comparably or better on accuracy in data center contexts |
| Lightweight edge anomaly detection (adjacent domain) | Isolation Forest (above) | Lightweight transformer with feature fusion (GLP-Transformer): comparable accuracy at 48.28 K parameters, 2.74 M FLOPs on bearing fault corpus | [52] | Adjacent-domain comparison; bearing-fault vs. behavioral telemetry are different task surfaces, but both target resource-constrained edge inference. Transformer alternatives motivated the comparative evaluation; tree-ensemble selected for explainability and exact SHAP compatibility |
| Counterfactual adversarial robustness | DiCE evaluation: 93.7% detection rate against minimum-perturbation counterfactuals | DiCE on tabular financial classifiers: comparable detection rates reported in original work | [12] | Same evaluation methodology; different dataset (CVE classification vs. tabular benchmarks) |
Table 23.
Full tool stack with category, AZTRM-D role, alternatives considered, and selection rationale. Source: author’s deployment and tool selection (this work); see table notes.
| Tool | Category | Role in AZTRM-D | Alternatives Considered | Why This Tool |
|---|---|---|---|---|
| Nmap (-sV -p-) | Port scanner | Asset discovery; service-version port enumeration in all pen test stages | Masscan, Zmap | -sV extracts exact software versions for CVE mapping; -p- ensures no port is missed; Masscan/Zmap prioritize speed over version accuracy |
| Nessus Professional | Vulnerability scanner | Authenticated and unauthenticated CVE scanning against all discovered assets | OpenVAS; Qualys; Rapid7 InsightVM | Covers over 115,000 CVE IDs with daily plugin updates [23]; credentialed scans expose OS/driver vulnerabilities invisible from the network; OpenVAS retained as cross-validation |
| OpenVAS | Vulnerability scanner | Cross-validation of Nessus findings; open-source baseline | Nessus alone | Independent CVE signatures provide a second opinion; scanner disagreement flags ambiguous findings for manual review; no commercial license dependency |
| Hydra | Credential tester | Tests for weak/default credential policies (Stage 1 validation) | Medusa, Burp Suite Intruder | Validates account lockout enforcement at the protocol level; confirms that the brute-force path exists before hardening and is closed afterward |
| RTL-SDR (RTL2832U) | RF analysis | Wireless signal observation and RF MITM probing across all stages | USRP, HackRF | Realistic adversary capability at under $30 retail; USRP and HackRF are more capable but far more expensive |
| Metasploit | Exploit framework | AI-guided pen test execution in Sentinel’s isolated sandbox | CORE Impact, Canvas | Largest openly available exploit library; PPO agent sequences Metasploit modules via RPC API; CORE Impact and Canvas are commercial with restricted API |
| LinPEAS | Priv-esc enumeration | Post-access privilege escalation path discovery | PEAS (manual), sudo-killer | Linux privilege escalation enumeration covering SUID binaries, cron jobs, sudo policy, kernel exploits; validates granular sudo policy from inside the device |
| LUKS2/cryptsetup | Disk encryption | AES-256-XTS full-disk encryption on NVIDIA Orin SD cards per NIST SP 800-38E [61] | dm-crypt without LUKS, eCryptfs | Linux-native; LUKS2 provides header backup and token-based key management; Argon2id KDF vs. PBKDF2 provides memory-hard defense against GPU brute-force |
| WPA3 + SAE | Wireless security | Encrypted wireless communications on all device interfaces | WPA2 (PSK), PEAP/EAP-TLS | SAE eliminates four-way handshake vulnerability exploitable under WPA2; PEAP/EAP-TLS requires 802.1X infrastructure not viable on constrained IoT hardware |
| GitLab (self-hosted) | DevSecOps/SCM | Secure code repository with MFA, RBAC, cryptographic commit signing, multi-stage CI/CD gates | GitHub Enterprise 3.18.3, Jenkins 2.555.1 + Gitea 1.26.0 | Self-hosted: full RBAC control, no data residency concerns; native pipeline integration for SAST, SCA, SBOM, DAST, IaC scanning |
| Semgrep | SAST | Source code static analysis for C/C++ and Python | SonarQube, Checkmarx, Fortify | Open rule set tunable to project-specific policies; fast enough for commit-level scanning; commercial alternatives cost more for marginal accuracy gain |
| Checkov | IaC scanning | Infrastructure-as-Code policy validation | Terrascan, tfsec | Broadest provider coverage; active maintenance; integrates directly into GitLab CI/CD |
| Gitleaks | Secrets scanning | Credential and API key detection in commits and repository history | TruffleHog, detect-secrets | Entropy-based detection plus regex rules; Git history scanning catches previously committed secrets |
| Cosign/Sigstore | Artifact signing | Elliptic Curve Digital Signature Algorithm (ECDSA) signing of build artifacts; SBOM attestation | GPG signing, Notary v2 | Stores signatures in OCI registry alongside artifacts; keyless signing via OIDC; tighter supply chain integration than GPG |
| SPIFFE/SPIRE | Service identity | Workload identity provisioning for ZT mTLS authentication | Vault PKI, cert-manager | Short-lived SVIDs (4 h rotation) bound to attested workload identity; no long-lived secrets to rotate |
| Cybectr Sentinel | AI enforcement layer | End-to-end AZTRM-D enforcement: asset discovery, AI analysis, pen test, mitigation, reporting, active defense | Tenable.io, Darktrace, Vectra AI | Covers all four capability gaps simultaneously; designed for AZTRM-D’s access control model; commercial alternatives address individual gaps but not the combination |
Table 24.
Inter-rater agreement across three independent testers and all stages. Percent agreement and Gwet AC1 are reported alongside the Fleiss kappa to contextualize the kappa paradox at extreme base rates. Source: Stage 1–3 adversarial testing pre-discussion records (this work) [1].
| Stage | Total Findings | All 3 Concur | 2 Concur | 1 Only | Pct. Agreement (2+) | Gwet AC1 |
|---|---|---|---|---|---|---|
| Stage 1 (Factory Default) | 14 | 14 | 0 | 0 | 100% | 1.000 |
| Stage 2 (Network Hardened) | 9 | 6 | 2 | 1 | 88.9% | — |
| Stage 3 (Full AZTRM-D) | 4 | 3 | 1 | 0 | 100% | — |
| All Stages Combined | 27 | 23 | 3 | 1 | 96.3% | 0.888 |
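
The "Pct. Agreement (2+)" column of Table 24 is the share of findings on which at least two of the three testers concurred. A minimal sketch of that arithmetic follows, with counts copied from the table and a hypothetical helper name; Gwet AC1 is not reproduced because it requires the per-finding ratings, which are not tabulated.

```python
# (all three concur, two concur, one only) per stage, copied from Table 24.
STAGES = {
    "Stage 1": (14, 0, 0),
    "Stage 2": (6, 2, 1),
    "Stage 3": (3, 1, 0),
}


def pct_agreement(all_three: int, two: int, one_only: int) -> float:
    """Percentage of findings reported by at least two testers."""
    total = all_three + two + one_only
    return 100.0 * (all_three + two) / total


def combined(stages: dict) -> tuple:
    """Column-wise sum of the per-stage counts."""
    return tuple(sum(column) for column in zip(*stages.values()))


if __name__ == "__main__":
    for name, counts in STAGES.items():
        print(f"{name}: {pct_agreement(*counts):.1f}%")
    print(f"Combined: {pct_agreement(*combined(STAGES)):.1f}%")
```
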
Table 25.
Attack surface coverage by layer across all three testers and all three stages. Source: adversarial testing campaign (this work).
| Attack Layer | Specific Vectors Tested | Tools Used | Stages |
|---|---|---|---|
| Hardware: Physical | SD card removal, offline filesystem mount, UART console (ttyTHS1), GPIO pin access | chroot, Linux mount utilities, logic analyzer | 1, 2, 3 |
| Hardware: RF/Wireless | SDR signal analysis, WiFi traffic capture, Bluetooth MITM probe | RTL-SDR, GNU Radio, Wireshark | 1, 2, 3 |
| Network: Passive | MAC/IP enumeration, ping sweep, traffic observation | Nmap (passive), Wireshark | 1, 2, 3 |
| Network: Active Scan | Full port scan (nmap -sV -p-), service-version detection, CVE mapping | Nmap, Nessus Professional, OpenVAS | 1, 2, 3 |
| Software: Credential | SSH brute-force (default nvidia:nvidia), account lockout bypass | Hydra | 1 |
| Software: Vuln Exploit | Privilege escalation enumeration, SUID abuse, reverse shell deployment | LinPEAS, Metasploit (Sentinel sandbox) | 1, 2 |
| Software: Persistence | Init file modification (/etc/rc.local, .bashrc), hidden account creation, SSH tunnel C2 | Manual + Metasploit | 1 |
| Software: Supply Chain | Unsigned dependency injection into CI/CD pipeline, AI-generated policy-violating code submission | Custom test artifacts, GitLab pipeline | 2, 3 |
| Insider: Standard Dev | Configuration mistakes, accidental credential commit, insecure service exposure | GitLab CI/CD pipeline, Sentinel scanning | 2, 3 |
| Insider: Privileged Dev | Attempted root escalation, log tampering, monitoring agent disablement | LinPEAS, sudo enumeration | 1, 2, 3 |
| Insider: AI-Assisted | AI-generated exploit code and attack sequences submitted through GitLab pipeline | LLM assistant + GitLab pipeline | 3 |
Table 26.
Stage 1 penetration test findings (factory-default configuration): external attacker and insider ZT evaluation, all three testers. Source: independent adversarial testing (this work) [1].
| Attack Vector | Tool/Technique | Outcome | External Attacker Finding | Insider ZT Evaluation |
|---|---|---|---|---|
| Passive Recon | Nmap ping sweep, OSINT | MAC/IP identified; default credentials found in public JetPack documentation | Device located and attack staged with no prior knowledge | No ZT policy exists pre-login |
| Port Scan | nmap -sV -p- | 4 open ports: SSH 22, VNC 5900, HTTP 80, HTTPS 443; exact service versions extracted | Full network attack surface mapped | Open ports indicate absence of least-privilege network policy; ZT network pillar violated |
| Vuln Scan | Nessus Professional, OpenVAS | Multiple unpatched CVEs in kernel and NVIDIA JetPack drivers | Concrete exploitation pathways identified by all three testers | No CI/CD scanning in place: 0% VDR in development pipeline |
| RF/Wireless | RTL-SDR, Wireshark | WiFi traffic observable in plaintext; no WPA3; Bluetooth interfaces present but inactive | Wireless traffic capturable over the air with commodity hardware | No wireless encryption policy; ZT data-in-transit tenet violated |
| Initial Access | Hydra SSH brute-force (nvidia:nvidia) | SSH login achieved in under 5 min by all three testers; no account lockout mechanism | Full shell obtained consistently | Default credential violates never-trust-always-verify; no lockout means no brute-force protection |
| Privilege Escalation | sudo su (single command) | Immediate root from nvidia user; no further tools required | Complete system control achieved instantly | Unconstrained sudo directly violates least privilege; single command equals full compromise |
| Persistence | rc.local, .bashrc mod.; hidden user; SSH C2 | Backdoor survives reboot; C2 traffic encrypted inside SSH tunnel | Persistent access established; exfiltration channel active | No init file integrity checking; no immutable log enforcement; persistence invisible to monitoring |
| Log Tampering | Manual deletion of auth.log, syslog; shell history cleared | Forensic trail destroyed completely | No forensic recovery possible post-attack | Root-writable logs violate ZT immutable-logging tenet; audit trail nonexistent |
Table 27.
Stage 2 penetration test findings (after initial network hardening), all three testers. Source: independent adversarial testing (this work) [1].
| Attack Vector | Tool/Technique | Outcome | External Attacker Finding | Insider ZT Evaluation |
|---|---|---|---|---|
| Network Recon | nmap -sV -p-, Nessus, OpenVAS | 0 open ports; no exploitable network findings | Remote exploitation not possible; physical pivot required | Network ZT pillar fully satisfied |
| RF/Wireless | RTL-SDR, Wireshark | Traffic still observable pre-WPA3; WiFi not yet fully isolated | Wireless traffic still capturable; RF vector still open | Wireless encryption policy not yet complete |
| Physical: Storage | SD card removal; Linux mount; chroot on /etc/shadow | Root filesystem mounted externally; offline password reset performed by all three testers | Authentication bypassed entirely without credentials | Physical access circumvents all logical ZT controls; hardware integrity not enforced at this stage |
| Physical: Console | UART ttyTHS1 serial connection | Console accessible post-SD manipulation; login succeeds with reset credentials | Shell obtained after physical bypass | No out-of-band access control; UART not gated or monitored |
| Privilege Escalation | sudo su | Root obtained through physical + console chain; sudo group membership survived offline reset | Full control regained via physical vector | Logical ZT controls held; physical hardware gap negated them entirely |
| Insider: Privileged Dev | Attempted log tampering, monitoring agent kill, lateral access to Git server | Blocked at logical layer; immutable logs held; ZT access policies enforced | Not applicable | Logical controls are working correctly; physical access is the only remaining gap |
| Insider: Mistake | Developer commits test file containing embedded AWS key (intentional test case) | Secrets Scan in CI/CD pipeline catches credentials before merge; commit rejected automatically | Not applicable | AZTRM-D Secrets Scan gate working as designed |
Table 28.
Stage 3 penetration test findings (full AZTRM-D hardening), all three testers. Source: independent adversarial testing (this work) [1].
| Attack Vector | Tool/Technique | Outcome | External Attacker Finding | Insider ZT Evaluation |
|---|---|---|---|---|
| Network Recon | nmap -sV -p-, Nessus, OpenVAS | 0 open ports confirmed across all three testers | No remote vector | ZT network pillar fully maintained |
| RF/Wireless | RTL-SDR, Wireshark, WiFi probe | WPA3 enforced [66]; all traffic encrypted; microsegmentation active | No exploitable RF finding | ZT data-in-transit tenet satisfied at wireless layer |
| Physical: Storage | SD card removal attempt | Encrypted LUKS2 volume presented; mount failed; offline password reset impossible | Physical storage vector closed | AES-256-XTS per NIST SP 800-38E enforces ZT device integrity at hardware level [61] |
| Physical: Console | UART ttyTHS1 | Hardened login prompt only; no viable credentials after SD encryption | No exploit path from console | Out-of-band access gated; no trivial root path available |
| GPIO Pins | Direct GPIO probing | GPIO pins disabled; no signal observable | Hardware attack surface reduced | Physical attack surface hardened end-to-end |
| Privilege Escalation | sudo su, SUID enumeration (LinPEAS), monitoring agent kill attempt | All paths blocked; multi-step validation required; kill attempt logged and alerted | Escalation not achievable | Least privilege fully enforced; never-trust-always-verify extended to deepest system layers |
| Insider: Motivated | Authenticated privileged developer attempting unauthorized lateral access and log access | Adaptive auth triggered; access revocation fired; ZT PEP blocked all attempts | Not applicable | Behavioral anomaly detection (Isolation Forest) flagged abnormal admin activity correctly |
| Insider: Mistake | Developer commits file with insecure configuration flag | SAST and IaC scans flag policy violation; commit rejected automatically | Not applicable | Automated scanning catches configuration mistakes before they reach production |
| Insider: AI-Assisted | All three testers submitted AI-generated exploit code through GitLab pipeline | SAST flagged code patterns; Secrets Scan caught embedded credentials; SBOM check rejected unauthorized dependencies; no AI-generated artifact reached the staging or production repository | Not applicable | Multi-layer gate structure evaluates code content, not authorship |
| Supply Chain | Unsigned dependency injection into build pipeline | SBOM validation and cryptographic artifact signing check rejected unsigned artifact at build gate | Not applicable | ZT cryptographic integrity enforcement working end-to-end [2] |
Table 29.
Training datasets, sources, and partitioning protocols for AZTRM-D AI subsystems. Source: dataset publications cited per row; partitioning protocols per Stage 3 implementation.
| AI Subsystem | Training Dataset | Partitioning | Preprocessing |
|---|---|---|---|
| Behavioral Anomaly Detection (Isolation Forest) | 3 months of Stage 3 device telemetry (Cybectr Sentinel proprietary) | Unsupervised; sub-sample per tree, trees | Feature scaling via min–max normalization on log-transformed counts; categorical features one-hot encoded; missing values imputed via class median |
| Vulnerability Triage (XGBoost) | NIST NVD CVE corpus (CVSS v3.1 enriched) [13] | 80/20 stratified train/test split; 5-fold cross-validation on training partition | CVE feature extraction (CVSS metrics, CWE category, exploit availability flags); standardized to zero mean, unit variance |
| Unknown Asset Inference (Sentence Transformer) | Author-curated IoT asset library ( profiles spanning embedded MCUs, edge-AI platforms, and industrial sensors) | 80/20 held-out split for threshold calibration | Asset feature concatenation (hardware type, firmware versions, exposed services, communication protocols); tokenization via the Sentence Transformer pretrained model [5] |
| AI-Guided Pen Testing (PPO) | MITRE ATT&CK TTP corpus; Metasploit module catalog [68] | 500-episode training run, 20-episode held-out evaluation per stage | Action space: discretized Metasploit module selection; observation space: target service banner, port state, response codes; sparse-reward shaping per Section 2.3 |
| Mitigation Generation (RAG + LLM) | MITRE D3FEND knowledge base [68] | Index built once per release; retrieval-only at runtime | Document chunking at section boundaries; embedding via the same Sentence Transformer model used for asset inference; cosine similarity retrieval at top- |
| Adversarial Robustness Testing (DiCE) | XGBoost classifier outputs over Stage 3 corpus | Counterfactual generation per classifier instance; counterfactuals per input | Feature normalization matches XGBoost training preprocessing; counterfactual generation per Equation (8) with , |
| GNN-Augmented SAST | BigVul (3754 vulnerable functions across ≈188,000 total) [16] | 80/10/10 train/validation/test split per the BigVul standard partition | Code Property Graph extraction via Joern; node features encode token type, data flow relationships, syntactic position; graph normalization per Equation (6) |
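
The Isolation Forest preprocessing row in Table 29 describes min–max normalization applied to log-transformed counts. A minimal sketch of that step for a single telemetry feature is shown below. The use of `log1p` (so that zero counts stay finite) and the function name `log_minmax` are assumptions; the source states only that counts are log-transformed before scaling.

```python
import math


def log_minmax(counts):
    """Log-transform non-negative counts, then rescale to [0, 1].

    Assumption: log1p(x) = log(1 + x) is used so that zero counts are valid.
    """
    logged = [math.log1p(c) for c in counts]
    lo, hi = min(logged), max(logged)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0] * len(logged)
    return [(value - lo) / (hi - lo) for value in logged]


if __name__ == "__main__":
    # Hypothetical per-interval event counts for one feature.
    print(log_minmax([0, 9, 99]))
```

The log transform compresses heavy-tailed count features before scaling, so a single burst does not push every other observation toward zero.
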
Table 30.
Hyperparameter selection methodology, search range, and final operating values for AZTRM-D AI subsystems. Source: author’s Stage 3 hyperparameter calibration runs (this work).
| Subsystem | Selection Method | Search Range | Final Operating Values |
|---|---|---|---|
| Isolation Forest | ROC-curve analysis on Stage 3 training corpus; alert threshold tuned for 3.1% target FPR | Threshold ; trees ; sub-sample | Alert threshold 0.6; containment threshold 0.8; ; |
| XGBoost | 5-fold stratified cross-validation; F1-score maximization with recall weighting | ; max_depth ; ; | ; max_depth ; ; ; recall-weighted scoring |
| Sentence Transformer cosine similarity | Held-out F1-score maximization on asset library | Threshold 0.70–0.95 (grid per Table 21) | Threshold 0.82 (F1-optimal per Table 21) |
| PPO | Sample efficiency on sandbox sparse-reward environment; clipped surrogate per Equation (5) | ; ; learning rate | ; ; learning rate ; 4 update epochs per trajectory |
| GNN (GCN) | 5-fold cross-validation on BigVul training partition | Hidden dim ; layers ; dropout | Hidden dim 128; 3 GCN layers; dropout 0.3; mean-pool readout |
| DiCE | Default-recommended values from original DiCE implementation [12] | Per published defaults | ; ; |
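
Table 30 lists two Isolation Forest operating thresholds: an alert threshold of 0.6 and a containment threshold of 0.8. The sketch below shows one way such a two-tier decision could be expressed. It assumes anomaly scores lie in [0, 1] with higher values more anomalous and that comparisons are inclusive; the action labels and function name are hypothetical and do not describe Sentinel's actual enforcement interface.

```python
ALERT_THRESHOLD = 0.6        # from Table 30
CONTAINMENT_THRESHOLD = 0.8  # from Table 30


def action_for_score(score: float) -> str:
    """Map an anomaly score to a response tier (labels are illustrative)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("anomaly score expected in [0, 1]")
    if score >= CONTAINMENT_THRESHOLD:
        return "contain"
    if score >= ALERT_THRESHOLD:
        return "alert"
    return "allow"


if __name__ == "__main__":
    for score in (0.35, 0.65, 0.92):
        print(score, "->", action_for_score(score))
```

Keeping the containment threshold above the alert threshold means that borderline scores raise an alert for review without triggering automated containment.
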
Table 31.
Vulnerability detection rate derivation by scanning modality. The bottom three rows (in bold) are summary rows reporting the total corpus size, aggregate detected count, and undetected count. Source: Stage 3 CI/CD pipeline results (this work).
| Modality | Seeded | Discovered | Unique Detections | Shared Detections |
|---|---|---|---|---|
| SAST (Semgrep + GNN-augmented) | 18 | 4 | 11 | 11 |
| DAST | 12 | 3 | 6 | 9 |
| SCA (SBOM + dependency scan) | 9 | 2 | 7 | 4 |
| CSPM | 7 | 1 | 5 | 3 |
| IaC Scanning | 6 | 1 | 4 | 3 |
| **Total Corpus** | **52** | **11** | **63 total vulnerabilities** | |
| **Detected (aggregate)** | **61 (96.8%)** | | | |
| **Undetected** | **2 (novel logic flaws, manual review only)** | | | |
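
The summary rows of Table 31 can be re-derived from the per-modality rows. The sketch below recomputes the corpus size and the aggregate VDR, and checks an observation about the tabulated numbers: in every row, seeded plus discovered equals unique plus shared detections. Variable and function names are illustrative; the counts are copied from the table.

```python
# modality: (seeded, discovered, unique detections, shared detections)
TABLE_31 = {
    "SAST": (18, 4, 11, 11),
    "DAST": (12, 3, 6, 9),
    "SCA": (9, 2, 7, 4),
    "CSPM": (7, 1, 5, 3),
    "IaC": (6, 1, 4, 3),
}
DETECTED_AGGREGATE = 61  # "Detected (aggregate)" summary row


def corpus_size(table: dict) -> int:
    """Total corpus = seeded + discovered, summed over modalities."""
    return sum(seeded + found for seeded, found, _, _ in table.values())


def vdr(detected: int, total: int) -> float:
    """Vulnerability detection rate as a percentage."""
    return 100.0 * detected / total


def rows_consistent(table: dict) -> bool:
    """Check seeded + discovered == unique + shared for every modality."""
    return all(s + d == u + sh for s, d, u, sh in table.values())


if __name__ == "__main__":
    total = corpus_size(TABLE_31)
    print("corpus size:", total)
    print(f"aggregate VDR: {vdr(DETECTED_AGGREGATE, total):.1f}%")
    print("rows consistent:", rows_consistent(TABLE_31))
```
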
Table 32.
Per-modality ablation of the Stage 3 vulnerability detection pipeline. Leave-one-out rows compute aggregate VDR with each modality disabled. Single-modality rows compute VDR for each scanner operating alone. Wilson 95% CIs reported throughout. Source: derived from Table 31 (this work).
| Configuration | Detected | VDR | Wilson 95% CI | Detection Loss vs. Full Pipeline |
|---|---|---|---|---|
| Full pipeline (all 5 modalities) | 61 | 96.8% | [89.1%, 99.1%] | — (baseline) |
| Leave-one-out ablation | | | | |
| Without SAST | 50 | 79.4% | [67.8%, 87.5%] | −17.4 pp; SAST is the highest-contribution modality |
| Without DAST | 55 | 87.3% | [76.9%, 93.4%] | −9.5 pp; runtime-behavior coverage gap |
| Without SCA | 54 | 85.7% | [75.0%, 92.3%] | −11.1 pp; dependency-vulnerability coverage gap |
| Without CSPM | 56 | 88.9% | [78.8%, 94.5%] | −7.9 pp; cloud-config drift coverage gap |
| Without IaC | 57 | 90.5% | [80.7%, 95.6%] | −6.3 pp; infrastructure-policy coverage gap |
| Single-modality coverage | | | | |
| SAST only (Semgrep + GNN) | 22 | 34.9% | [24.3%, 47.2%] | — |
| DAST only | 15 | 23.8% | [15.0%, 35.6%] | — |
| SCA only | 11 | 17.5% | [10.0%, 28.6%] | — |
| CSPM only | 8 | 12.7% | [6.6%, 23.1%] | — |
| IaC only | 7 | 11.1% | [5.5%, 21.2%] | — |
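
The leave-one-out counts and Wilson intervals in Table 32 can be reproduced from the unique-detection counts in Table 31: disabling a modality removes only the findings that no other scanner catches. The sketch below uses the standard Wilson score interval with z ≈ 1.96; function and variable names are illustrative.

```python
import math

TOTAL = 63          # corpus size (Table 31)
FULL_PIPELINE = 61  # detections with all five modalities
UNIQUE = {"SAST": 11, "DAST": 6, "SCA": 7, "CSPM": 5, "IaC": 4}


def wilson_interval(successes: int, n: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def leave_one_out(modality: str) -> int:
    """Detections remaining when one modality is disabled."""
    return FULL_PIPELINE - UNIQUE[modality]


if __name__ == "__main__":
    configs = [("Full pipeline", FULL_PIPELINE)]
    configs += [(f"Without {m}", leave_one_out(m)) for m in UNIQUE]
    for label, detected in configs:
        lo, hi = wilson_interval(detected, TOTAL)
        print(f"{label}: {detected}/{TOTAL} = {100 * detected / TOTAL:.1f}% "
              f"[{100 * lo:.1f}%, {100 * hi:.1f}%]")
```

The Wilson interval is preferred over the normal approximation here because the full-pipeline proportion is close to 1 and the corpus is small, a regime in which the normal-approximation interval can extend past 100%.
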
Table 33.
Security effectiveness benchmarks: factory-default baseline versus full AZTRM-D hardening. Source: our prior work [1].
| Security Metric | Baseline (Factory Default) | AZTRM-D Hardened | Change |
|---|---|---|---|
| Open Network Ports | 4 (SSH 22, VNC 5900, HTTP 80, HTTPS 443), confirmed by all three testers via Nmap | 0, complete elimination of remote attack surface | −4 ports |
| Initial External Access Time | <5 min (Hydra brute-force on nvidia:nvidia; no lockout), consistently achieved by all three testers | Not possible (no remote or physical vectors) | Full elimination |
| Privilege Escalation | Immediate (single sudo su from default user) | Blocked; multi-step validation, default paths removed | Full elimination |
| Vulnerability Detection Rate (CI/CD) | 0%, no automated scanning in pipeline | 96.8%, SAST + DAST + SCA + CSPM + IaC across five complementary modalities | +96.8 pp |
| Mean Time to Remediate (MTTR) | Weeks to months (manual patching, re-flashing) | 1–3 days (automated alerting + AI-driven remediation; Stage 3 measured, AZTRM-D deployment) | 10× to 30× faster |
| Supply Chain Vulnerability | High: no SBOM, no dependency scanning, no artifact signing | Low: mandatory SBOM + cryptographic signing for all components | High → Low |
Table 34.
Resource and scalability benchmarks. Source: our prior work [1].
| Performance Metric | Measurement | What It Means | Environment |
|---|---|---|---|
| AI Scan CPU Overhead | 12–18% average during SAST/SCA CI/CD runs | Continuous security scanning is computationally viable on edge hardware without degrading operational performance | NVIDIA Orin devices |
| ZT Policy Enforcement Latency | <40 ms per access decision at PEP | Real-time ZT checks do not introduce perceptible delays in system-to-system communication | NVIDIA Orin, ZT PEP under load |
| AI Model Training Time (Initial) | 14 h using 3 months of log data | One-time setup cost; defines the bootstrapping requirement before behavioral anomaly detection reaches operational accuracy | Standard cloud GPU instance |
| Scalability (Concurrent Endpoints) | Single-device baseline: 3.8% CPU, 41 MB memory, 6.2 ms PEP latency (Stage 3 measured). Sub-linear fleet scaling is architecturally required; physical multi-device characterization is future work (Section 4). | Establishes the per-device resource floor; locating the orchestrator saturation point requires a multi-device experiment | Stage 3 measured; fleet scaling designated future work |
Table 35.
Generalizability of AZTRM-D claims classified by evidentiary basis. Empirically validated claims are bounded by the Stage 3 deployment scope; architecturally transferable claims rest on tool/protocol design rather than platform specifics; hypothesized claims require Stage 4 multi-platform validation. Source: this work.
| Claim | Empirical (NVIDIA Orin Nano) | Architecturally Transferable | Hypothesized; Requires Stage 4 |
|---|---|---|---|
| 96.8% VDR (Wilson 95% CI: [0.891, 0.991]) | Measured on Stage 3 corpus | Five CI/CD scanning modalities are platform-agnostic; the same modalities on enterprise pipelines should yield comparable rates on a comparable corpus | Exact rate on enterprise OWASP-Top-10-weighted corpus; rate variance across language ecosystems (Java, Go, Rust) |
| 12–18% CPU overhead during scans | Measured on Orin Cortex-A78AE @ 1.5 GHz, 8 GB LPDDR5 | The Orin measurement is expected to be an upper bound for more capable hardware, since the scans are not tuned for Orin | Exact percentage on x86 enterprise servers, ARM data center (Graviton/Ampere), RISC-V edge platforms |
| Sub-40 ms ZT PEP latency | Measured on Orin local network | On enterprise hardware with faster CPU and more memory bandwidth, latency decreases | Exact latency floor on cloud-native enterprise PEP infrastructure; latency under high-concurrency load |
| ZT enforcement under multi-vector attack | Measured against seven attack categories on Orin | SP 800-207 tenets are platform-independent; SPIFFE/SPIRE, mTLS, and PEP enforcement are deployment patterns, not Orin-specific | Cross-platform attack surface validation: ARM enterprise, x86 enterprise, RISC-V embedded |
| Cryptographic gate validation (LUKS2, TLS 1.3, ECDSA RFC 6979, SPIFFE SVID) | Validated on Orin SD card boot path with custom initramfs | LUKS2, TLS 1.3, and ECDSA are open, platform-independent specifications (TLS 1.3 is IETF RFC 8446; ECDSA is standardized in NIST FIPS 186); cryptographic enforcement is independent of platform | Boot-path equivalents on UEFI x86, ARM TrustZone, RISC-V Keystone |
| Insider threat detection (Isolation Forest, 0.6 alert threshold) | 14 h initial training on 3 months of Stage 3 telemetry | Isolation Forest training/inference is platform-agnostic; threshold is calibrated against operational base rate, not Orin specifics | Threshold recalibration on environments with different operational base rates (enterprise SOC vs. IoT fleet vs. cloud workload) |
| 3.1% aggregate FPR | Measured on Stage 3 single-device corpus | FPR is a property of the model and threshold, not the deployment platform | Per-class FPR characterization at fleet scale (Section 2.6.1); environment-specific recalibration |
| RF/physical attack vector coverage | Validated on Orin GPIO, UART, SD storage, WiFi, Bluetooth | Physical and RF attack surfaces are hardware-specific by definition | Not directly applicable to enterprise data center deployments; substitutes are physical security controls, supply chain integrity, firmware signing |
| 4.2 min Time-to-Initial-Detection | events, Stage 3 single device | Pipeline latency is dominated by retrieval and synthesis stages, which are platform-independent | TTID floor on cloud-native deployments where retrieval index can be co-located; behavior at fleet scale |
Table 36.
NIST SP 800-207 Zero Trust tenet mapping to AZTRM-D implementation and Stage 3 validation outcomes. Tenets from [3].
| ZT Tenet (NIST SP 800-207) | AZTRM-D Implementation | Stage 3 Validation Outcome |
|---|---|---|
| All data sources treated as resources | Sentinel asset inventory; all network traffic treated as untrusted regardless of origin | 100% of known lab assets discovered; no implicit trust granted to any device |
| All communication secured regardless of location | TLS 1.3 for internal traffic; WPA3 for wireless; ECDSA with RFC 6979 nonces [73] | MITM simulation: anomalous traffic detected; segment isolated automatically |
| Per-session access to individual resources | JIT access via SPIFFE/SPIRE SVIDs [74]; session tokens expire; no persistent standing access | Session hijack simulation: expired tokens rejected; no lateral movement achieved |
| Resource access policy dynamic with behavioral inputs | Isolation Forest feeds ZT PEP decisions [9]; adaptive auth triggered on anomaly score >0.6 | Insider simulation: anomalous admin behavior triggered adaptive auth and access revocation |
| Monitor integrity and security posture of all assets | Firmware integrity checks; device health monitoring; Sentinel behavioral telemetry | Firmware tamper attempts detected; unauthorized processes flagged |
| Authentication and authorization strictly enforced | MFA + USB hardware key + SSH key + passphrase + 2FA; account lockout at 3 failed attempts per factor | Brute-force simulation: account suspended before credentials guessed |
| Collect information to improve security posture | Full immutable logging; SIEM correlation; AI model retraining from operational logs | Log tampering attempt: immutable logs preserved; attempt flagged and alerted |
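The enforcement rules in the tenet mapping above (expired session tokens rejected, lockout at three failed attempts, adaptive authentication when the anomaly score exceeds 0.6) can be read as an ordered decision procedure at the PEP. The following is a minimal sketch under stated assumptions, not the AZTRM-D implementation: the `AccessRequest` type, its field names, the decision labels, and the evaluation order are all hypothetical, and the per-factor lockout counter is simplified to a single counter.

```python
import time
from dataclasses import dataclass

ANOMALY_THRESHOLD = 0.6    # adaptive-auth threshold stated in the table
MAX_FAILED_ATTEMPTS = 3    # lockout threshold stated in the table

@dataclass
class AccessRequest:
    """Hypothetical request record; field names are illustrative only."""
    subject: str
    token_expiry: float     # epoch seconds at which the session token expires
    anomaly_score: float    # assumed to be normalized to the range 0..1
    failed_attempts: int    # simplified: one counter instead of one per factor

def pep_decide(request, now=None):
    """Return one of 'lockout', 'deny_expired', 'step_up_auth', 'allow'."""
    now = time.time() if now is None else now
    # Lockout is checked first so a locked account cannot proceed by any path.
    if request.failed_attempts >= MAX_FAILED_ATTEMPTS:
        return "lockout"
    # Per-session access: an expired token is never honored.
    if now >= request.token_expiry:
        return "deny_expired"
    # Behavioral input: a score above the threshold triggers adaptive auth.
    if request.anomaly_score > ANOMALY_THRESHOLD:
        return "step_up_auth"
    return "allow"

if __name__ == "__main__":
    req = AccessRequest("admin-1", token_expiry=time.time() + 300,
                        anomaly_score=0.72, failed_attempts=0)
    print(pep_decide(req))
```

A score exactly equal to 0.6 is allowed in this sketch because the table states the trigger as a strict inequality (>0.6).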
Table 37.
Cryptographic control validation by pipeline gate, AZTRM-D deployment. Source: Stage 3 gate validation testing (this work).
| Control | Validation Method | Gate | Failure Response |
|---|---|---|---|
| LUKS2 FDE active | cryptsetup status check in device health telemetry; IaC scan validates device config manifest | Pre-deploy IaC scan + continuous Sentinel monitoring | Deployment blocked; Sentinel alert; device quarantined from ZT network segment |
| Artifact signature valid | Cosign verify against trusted signing key at build gate | SBOM/signature gate (Admin 1) | Unsigned or invalid-signature artifact rejected; commit blocked |
| TLS 1.3 enforced | CSPM policy check against TLS configuration baseline; active scan for TLS downgrade response | CSPM gate + Sentinel network scan | Policy violation flagged; service blocked from ZT PEP access until remediated |
| WPA3-SAE active | RF capture test (Wireshark SAE handshake confirmation); CSPM wireless policy check | Pre-deploy wireless validation + Sentinel RF monitoring | WPA3 fallback to WPA2 detected; wireless segment isolated; alert raised |
| SVID validity | SPIRE health check; SVID expiry monitoring in Sentinel telemetry | Continuous Sentinel monitoring | Expired SVID causes ZT PEP to reject access; anomalous renewal patterns flagged |
| ECDSA nonce determinism | SAST rule for non-RFC-6979 ECDSA implementations; rule flags random.randint used as cryptographic nonce | SAST gate | Policy violation flagged at SAST scan; commit blocked pending remediation |
| Post-quantum readiness | IaC manifest check for approved cryptographic algorithm list; SAST rule flags deprecated algorithms (RSA-2048, DH-1024, P-192) | SAST + IaC gate | Deprecated algorithm usage blocked from production deployment; migration recommendation generated |
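To make the last two controls concrete, the sketch below shows what such checks could look like in miniature. It is not the pipeline's actual rule set: the table describes a SAST rule, whereas this example substitutes Python's standard `ast` module for illustration, and the function names and sample input are hypothetical. The check only flags calls written as `random.randint(...)`; it does not prove that the value is used as a nonce, which a production rule would need to establish.

```python
import ast

# Deprecated algorithms as listed in the table above.
DEPRECATED_ALGORITHMS = {"RSA-2048", "DH-1024", "P-192"}

def find_randint_calls(source):
    """Return sorted line numbers of calls written as random.randint(...)."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "randint"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "random"):
            lines.append(node.lineno)
    return sorted(lines)

def deprecated_in_manifest(algorithms):
    """Return the manifest entries that appear on the deprecated list."""
    return sorted(set(algorithms) & DEPRECATED_ALGORITHMS)

# Hypothetical input: a signing helper that draws its nonce from random.randint.
SAMPLE = '''
import random

def sign(message, private_key):
    k = random.randint(1, 2 ** 256)
    return (message, private_key, k)
'''

if __name__ == "__main__":
    for line in find_randint_calls(SAMPLE):
        print("line %d: random.randint is not a valid ECDSA nonce source" % line)
    print(deprecated_in_manifest(["ECDSA-P256", "RSA-2048", "P-192"]))
```

In a gate, a non-empty result from either function would correspond to the blocking failure responses listed in the table.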
Table 38.
NIST RMF phase mapping to AZTRM-D lifecycle phases, key activities, and AI/Sentinel roles. RMF phases from [2].
| RMF Phase | AZTRM-D Phase | Key Activities | AI/Sentinel Role |
|---|---|---|---|
| Prepare [2] | Planning | System classification; stakeholder identification; risk tolerance definition; supply chain risk assessment | AI-driven data sensitivity classification; automated compliance mapping |
| Categorize [2] | Planning | FIPS 199 impact analysis; authorization boundary definition; data classification | AI auto-labels PII/sensitive data; generates classification report |
| Select [2] | Development | Security control baseline selection; ZT control overlay; cryptographic algorithm selection | AI recommends control set based on asset profile and threat model output |
| Implement [2] | Build/Deploy | Control implementation in CI/CD; cryptographic signing; SBOM generation; ZT PEP deployment | Sentinel enforces scanning gates; blocks unsigned artifacts; monitors configuration drift |
| Assess [2] | Test/Release | SAST, DAST, SCA, pen testing, ZT policy validation | Sentinel AI pen test agent; XGBoost triage; SHAP explanations per finding |
| Authorize [2] | Release/Deploy | Formal authorization decision; residual risk acceptance; Super Admin sign-off | AI-generated risk prioritization informs authorization decision |
| Monitor [2] | Operate/Monitor | Continuous anomaly detection; firmware integrity; ZT telemetry; AI model retraining; incident response | Isolation Forest continuous monitoring; adaptive ZT policy; ENGAGE active defense |
Table 39.
Consolidated results across all evaluation dimensions. Each row traces to its originating table or section as indicated in the source column.
| Domain | Key Result | Source |
|---|---|---|
| External Pen Test (Stage 1) | Full compromise in <5 min; 4 open ports; immediate root via default credentials; RF traffic capturable; reproduced independently by all three testers | Table 26 |
| External Pen Test (Stage 2) | Network vector closed (0 open ports); RF still observable; physical SD card attack succeeded; root via UART console; all three testers | Table 27 |
| External Pen Test (Stage 3) | All vectors blocked; 0 successful compromises; physical and logical ZT enforced; RF traffic encrypted; all three testers | Table 28 |
| Insider (Stage 1) | All ZT principles violated; unrestricted sudo; log tampering trivial | Table 26 |
| Insider (Stage 2) | Logical ZT controls held; physical SD bypass circumvented all software monitoring; insider mistake caught by Secrets Scan | Table 27 |
| Insider (Stage 3) | Never-trust-always-verify enforced at hardware layer; adaptive auth blocked privileged escalation; AI-assisted pipeline attacks caught by automated gates | Table 28 |
| VDR (96.8%) Derivation | 5 scanning modalities (SAST + DAST + SCA + CSPM + IaC) covering known-class vulnerabilities; remaining 3.2% gap = human-analyst-only findings | Section 3.9.1 |
| Resource Metrics | 12–18% CPU overhead (active scan); <40 ms ZT PEP latency; 3.8% CPU/41 MB memory at single-device steady state (Stage 3 measured); multi-device fleet characterization designated as future work | Table 34 |
| Sentinel TTID | 4.2 min average from deployment to first finding | Table 4 |
| Sentinel AI Accuracy | XGBoost: 94.1% precision, 91.8% recall; BigVul TPR: 81.4%; adversarial detection: 93.7% | Table 4 |
| SDLC/SSDLC Comparison | Among the evaluated frameworks, AZTRM-D is the only one covering Zero Trust, AI integration, IoT security, and continuous monitoring simultaneously; 7 of 14 capabilities absent from all five comparison frameworks | Table 10 and Table 11 |
| AI Stack Selection | Isolation Forest selected over OCSVM and autoencoders for computational viability and explainability on edge hardware; XGBoost over neural classifiers for exact SHAP compatibility; PPO over DQN for sparse-reward stability; GNN over rule-based SAST for novel vulnerability detection | Section 3.3 |
| Tool Stack Selection | Each tool selected against documented alternatives; RTL-SDR selected at realistic adversary cost point; dual Nessus/OpenVAS for cross-validation; Semgrep over commercial SAST for tunable false positive control | Section 3.5 |