Article

Reliability Analysis of the LEON3 Memory Subsystem Under Single-Event Upsets: Cache, AHB Interface, and Memory Controller Vulnerability

1 GEII Department, IUT Bordeaux, 15 Street of Naudet, 33170 Gradignan, France
2 Sciences Faculty of Tunis, Tunis El Manar University, 20 Street of Tolède, Tunis 2092, Tunisia
3 RF2S Spectrum Solutions, 18 Street of the Faïencerie, 33300 Bordeaux, France
4 Electronics and Micro-Electronic Laboratory (LEµE), Bd de L’environnement, Monastir 5000, Tunisia
5 Higher Institute of Applied Sciences and Technology of Sousse, University of Sousse, Street Taher Ben Achour, Sousse 4003, Tunisia
* Author to whom correspondence should be addressed.
Information 2026, 17(3), 249; https://doi.org/10.3390/info17030249
Submission received: 31 December 2025 / Revised: 8 February 2026 / Accepted: 28 February 2026 / Published: 3 March 2026

Abstract

This paper presents a register-transfer-level (RTL) fault injection study of the LEON3 processor’s internal memory subsystem under single-event upsets (SEUs). The analysis targets four key components: the instruction cache (I-cache), data cache (D-cache), AHB bus control interface, and memory controller (MCTRL), all of which are unprotected in the standard LEON3 configuration. Using the NETFI+ fault injection framework, multi-cycle SEUs are injected into sequential elements across these blocks while executing a memory-intensive benchmark. The results show that the AHB interface is extremely fragile, with every fault causing execution failure. The memory controller, though architecturally invisible, frequently induces precise SPARC V8 traps such as window overflow and illegal instruction through indirect data-path corruption. The data cache is identified as the primary source of silent data corruption (SDC), while the instruction cache exhibits partial natural masking but remains susceptible to control-flow errors. These findings highlight the disproportionate impact of unprotected protocol and controller logic on system reliability and inform targeted hardening strategies for LEON3-based embedded systems in radiation-prone environments.

1. Introduction

The increasing reliance on commercial off-the-shelf (COTS) processor cores in aerospace, automotive, and other safety-critical embedded systems has intensified concerns regarding their reliability under radiation-induced soft errors. Among these effects, single-event upsets (SEUs), caused by energetic particles flipping the state of sequential elements, pose a major threat to system correctness and availability [1,2]. Even transient bit flips in unprotected logic can propagate through the microarchitecture, leading to silent data corruption (SDC), unexpected exceptions, or complete execution failure [3,4]. As technology scales and operating voltages decrease, the susceptibility of deeply integrated processor subsystems to SEUs continues to grow, making comprehensive reliability assessment a fundamental requirement for dependable system design.
The LEON3 processor, a SPARC V8-compatible soft-core widely adopted in space and safety-critical applications, is frequently deployed in its standard, non-fault-tolerant configuration to satisfy stringent constraints on area, power, and execution determinism [5,6]. While numerous studies have examined SEU effects in architecturally visible components, such as the pipeline, register file, and arithmetic units, comparatively little attention has been devoted to the internal memory transaction path [7,8,9,10,11,12,13,14,15]. This path comprises the instruction and data caches, the AMBA AHB control interface, and the memory controller (MCTRL), which collectively mediate all instruction fetches and data accesses. Although largely invisible to software, these blocks play a decisive role in preserving control-flow integrity and data correctness; their corruption can lead to disproportionate system-level failures.
Existing fault injection campaigns typically focus on cache storage arrays or pipeline registers, often neglecting protocol-critical logic and configuration-driven controllers. In particular, the AHB control interface and MCTRL are rarely analyzed in isolation, despite serving as single points of failure for all external memory transactions. Moreover, the mechanisms by which faults in these subsystems manifest as precise SPARC V8 exceptions, such as illegal instruction, privileged instruction, or window overflow, remain insufficiently characterized. This gap limits the development of targeted hardening strategies that address the most vulnerable microarchitectural structures without incurring unnecessary overhead.
To address this, the present work delivers a detailed register-transfer-level (RTL) fault injection analysis of the LEON3 internal memory subsystem under SEU stress. Using the NETlist Fault Injection Plus (NETFI+) framework, multi-cycle SEUs are injected into sequential elements across four key components: the instruction cache, data cache, AHB control interface, and MCTRL. A memory-intensive benchmark is used to activate memory paths comprehensively, and fault outcomes are systematically classified into masked faults, silent data corruption, and execution halts. Beyond quantitative profiles, the study traces how microarchitectural corruption propagates into architecturally visible traps, revealing indirect failure pathways that originate in protocol and configuration logic, not in the core architectural state.
The remainder of this paper is organized as follows. Section 2 reviews related work on fault injection and reliability analysis in LEON3-based systems, highlighting existing gaps. Section 3 describes the microarchitectural organization of the cache hierarchy and memory controller. Section 4 details the NETFI+ fault injection methodology and experimental setup. Section 5 and Section 6 present and comparatively analyze the results of SEU injection campaigns across the four target subsystems. Section 7 discusses implications for targeted hardening, and Section 8 concludes the paper.

2. Related Work

Fault injection in the LEON3 SPARC V8 soft-core processor has become a cornerstone methodology for evaluating soft-error resilience in safety-critical, aerospace, and secure embedded systems. The literature spans diverse fault models, injection techniques, architectural targets, and workload conditions, yet a systematic gap persists in the analysis of protocol-critical, architecturally invisible structures, specifically the AHB control interface and the memory controller.
Early studies established baseline SEU sensitivity in internal state elements. Abbasitabar et al. [16] injected ~11,200 faults (SEU, MBU, SET, MET) into flip-flops, registers, and caches, reporting high overwrite rates and identifying the integer and multiplier units as most vulnerable, while Benoit et al. [7] used NETFI-2 for exhaustive RTL injection into control logic, highlighting multi-node upset risks in advanced CMOS processes. Complementing this, our prior work [8] revealed pipeline-stage-dependent program counter vulnerability through over four million NETFI+ injections. To address simulation cost, researchers developed accelerated methodologies: Ebrahimi et al. [17,18] accelerated campaigns via statistical or lifetime-aware methods to reduce the runtime by up to two orders of magnitude, Tuzov et al. [19] proposed adaptive statistical sampling to dynamically terminate campaigns, and Khanov et al. [20] integrated on-chip fault injectors as autonomous IP cores for low-latency testing. Concurrently, FPGA-based platforms such as RapidSmith [21,22] and XRTC-V5FI [23,24] enabled configuration memory fault injection, revealing LEON3’s significantly higher sensitivity compared to other soft cores, while Da Silva et al. [25] extended TLM2.0 with SystemC instrumentation for transaction-level fault modeling.
The majority of these efforts concentrate on architecturally visible components. Travessini et al. [9] demonstrated that partial TMR on the most sensitive CPU registers achieves high fault tolerance with minimal area overhead, while Mansour et al. [10,26] compared hardware/software fault injection in the register file against the CEU (code emulating upsets) method using matrix multiplication and a self-converging algorithm, showing that algorithmic redundancy can mask register-level faults. Chekmarev and Khanov [27] employed an on-chip debug-based IP-core to inject SEUs autonomously into the LEON3 register file and cache under RTEMS, quantifying EDAC coverage and demonstrating that fault detectability depends critically on software memory activity and injection timing. Theodorou et al. [28,29] developed software-based self-test programs for L1 cache arrays using debug instructions. Hybrid techniques, such as debug-interface monitoring [30,31], achieved over 95% control-flow error coverage without hardware modification. Kempf et al. [11] proposed a run-time adaptive cache that switches between an unprotected performance mode and a checkpointing-based reliable mode for mixed-criticality workloads on LEON3, demonstrating low execution overhead and successful fault recovery. Hardening strategies range from full redundancy, such as TMR combined with CRAM/BRAM scrubbing [32], to selective protection like SEC/DED with control-flow monitoring [33]. However, these approaches consistently overlook the MCTRL and AHB interface, despite their role as gatekeepers of all external memory transactions. Meanwhile, workload-aware studies have shown that fault manifestation depends critically on application semantics: our work on MBUs and MCUs in SDRAM [12,13] demonstrated that cryptographic workloads like AES exhibit high intrinsic masking, whereas numerical benchmarks like MulMatrix activate nearly all faults, a finding corroborated by Houssany et al. [34,35] and Kooli et al. [14,36,37]. Guzman-Miranda et al. [15] measured the post-fault recovery time in real-time systems. Yet these investigations focus on external memory or algorithmic effects, not the internal protocol layer that mediates access to them.
As summarized in Table 1, no prior study provides a unified vulnerability assessment of LEON3’s internal memory transaction path. This work fills that gap by delivering the first comprehensive SEU analysis of the I-cache, D-cache, AHB control interface, and MCTRL, revealing that protocol-critical blocks, though small and architecturally invisible, dominate system-level failure, and that their corruption propagates into precise, high-level exceptions through well-defined microarchitectural pathways.

3. Architectural Overview of Cache and MCTRL in the Standard LEON3 Processor

The LEON3 processor features a configurable, seven-stage in-order pipeline optimized for deterministic timing and area efficiency, making it well-suited for hard real-time embedded systems. In its standard (non-fault-tolerant) configuration, used throughout this study, the design excludes hardware-level resilience mechanisms such as TMR, parity protection, or error-correcting codes (ECCs) [38,39]. Consequently, all sequential elements, including flip-flops in the cache hierarchy and the MCTRL, remain fully exposed to radiation-induced SEUs.
This unprotected baseline provides an ideal platform for fine-grained vulnerability analysis. As illustrated in Figure 1, this section details the microarchitectural organization of the two subsystems under investigation: (i) the cache subsystem (comprising the instruction cache, data cache, and AHB bus control interface), and (ii) the MCTRL, which governs physical memory timing, bus-width adaptation, wait-state insertion, and the I/O configuration.

3.1. Cache Subsystem

LEON3 implements a Harvard-style memory hierarchy with physically separate instruction (I-cache) and data caches (D-cache), both integrated within the memory management unit (MMU) [40]. The configuration used in this study employs 4 kB capacity per cache, two-way set associativity, and a 16-byte line size. As shown in Figure 1, each cache comprises three structural components at the RTL level:
  • Tag arrays: store physical address tags for hit/miss determination;
  • Data arrays: hold fetched instructions (I-cache) or data words (D-cache);
  • Metadata and control logic: including valid bits (both caches), dirty bits (D-cache only), replacement state, and finite-state machines (FSMs) that orchestrate operations such as line fill, flush, write-back, and hit handling.
Critically, the cache subsystem includes the AHB bus control interface, which bridges the cache to the AMBA AHB bus. This block manages address forwarding, handshake signaling (hready, hresp), burst control, and byte-lane selection to ensure protocol-compliant transactions between the core and external memory.
For the fault injection campaign, the entire cache subsystem was instrumented at the bit level, resulting in the following injection vector allocation:
  • I-cache: 93 bits targeting tag storage, valid bits, and control FSMs (e.g., istate, flush logic, hit path);
  • D-cache: 293 bits covering data/tag arrays, dirty bits, write-back buffer registers, and D-cache FSMs (e.g., dstate, mexc, TLB interaction logic);
  • AHB control interface: 16 bits focused on critical AHB protocol signals, including byte-order (bo), write indication (hwrite), transfer type (htrans), burst mode, and lock status.
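The allocation above can be tallied for a quick consistency check; a minimal sketch (the dictionary keys are shorthand for the components listed above):

```python
# Injection-vector allocation for the cache subsystem (flip-flop counts
# per instrumented component, as enumerated above).
cache_injection_bits = {
    "I-cache": 93,                # tags, valid bits, control FSMs
    "D-cache": 293,               # data/tag arrays, dirty bits, FSMs
    "AHB control interface": 16,  # bo, hwrite, htrans, burst, lock
}

total_bits = sum(cache_injection_bits.values())
print(total_bits)  # 402 instrumented flip-flops in total
```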
The absence of parity, ECC, or hardened storage elements in this baseline renders all sequential logic in the cache subsystem fully vulnerable to SEUs, thereby enabling precise quantification of architectural susceptibility.

3.2. Memory Controller: MCTRL

In this study, the memory controller refers specifically to the MCTRL block, a microarchitectural unit in LEON3 responsible for managing the physical interface between the processor core and external memory peripherals. Unlike the cache/MMU, which operates at the virtual memory layer, MCTRL functions beneath the cache hierarchy, directly controlling the timing, protocol, and data-path behavior on the AMBA AHB bus for SRAM, ROM, and memory-mapped I/O devices.
Implemented as part of the GRLIB IP library, MCTRL is distinct from both the SPARC register file and the MMU. It comprises configuration registers and FSMs that govern the following:
  • Bus-width adaptation (ramwidth, romwidth, iowidth) for 8-, 16-, and 32-bit devices;
  • Programmable wait-state insertion (ws, iows, romrws, ramrws) to accommodate slow peripherals;
  • Memory region control (ioen, brdyen, bexcen) for I/O enablement, burst termination, and bus-error signaling;
  • AHB protocol generation, including hready, hresp, hwrite, and byte-enable strobes (mben).
These functions rely on combinational logic and sequential elements (flip-flops) that store the configuration state and track the transaction progress. Notably, MCTRL does not implement virtual memory, caching, or TLB logic. Those are exclusively handled by the MMU. Instead, MCTRL acts as a physical-layer memory adapter, ensuring the correct timing and protocol compliance for every external memory transaction. In the fault injection campaign, the MCTRL instance (mctrl_work_leon3mp_rtl_0layer0) was instrumented with 207 dedicated injection bits, distributed across four functional categories: 49 bits targeting configuration registers, 34 bits covering protocol state machines, 84 bits allocated to address and data output registers, and 40 bits assigned to control signal generators.
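The 207-bit MCTRL injection vector breaks down into four functional categories; a small sketch computing each category's share (figures taken from the paragraph above):

```python
# MCTRL injection-bit breakdown for mctrl_work_leon3mp_rtl_0layer0.
mctrl_injection_bits = {
    "configuration registers": 49,
    "protocol state machines": 34,
    "address/data output registers": 84,
    "control signal generators": 40,
}

total = sum(mctrl_injection_bits.values())  # 207 bits in total
for category, bits in mctrl_injection_bits.items():
    share = 100.0 * bits / total
    print(f"{category}: {bits} bits ({share:.1f}%)")
```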
Because MCTRL directly determines the correctness and timing integrity of all external memory accesses, SEUs in its logic can trigger severe system-level failures, including bus hangs, unacknowledged writes, silent data corruption, or spurious I/O activation, even when the pipeline and caches remain error-free. Critically, the standard LEON3 configuration provides no error detection or correction for MCTRL, leaving all its sequential elements fully susceptible to SEUs.
This vulnerability makes MCTRL a high-priority target for reliability assessment in radiation-exposed environments such as space systems. By isolating and characterizing SEU effects in this block, this work presents the first detailed analysis of MCTRL susceptibility in LEON3. This contribution complements existing studies focused on caches and pipeline logic, and it informs selective, low-overhead hardening strategies for non-fault-tolerant implementations.

4. Methodology of Fault Injection: NETFI+

4.1. Target Architecture and NETFI+ Approach

RTL-level fault injection offers an optimal trade-off between modeling fidelity and simulation efficiency. Unlike instruction-set-architecture (ISA)-level [41,42] or software-emulated approaches [43,44], it captures microarchitectural phenomena, such as pipeline stalls, cache state corruption, and AMBA AHB protocol violations, which are essential for evaluating the resilience of unprotected hardware structures like cache metadata arrays and memory controller logic.
The target architecture is the standard LEON3 core (GRLIB v1.0.30), synthesized and mapped on a Xilinx Virtex-6 FPGA using Synplify® Pro ME V-2023.09M-5. This configuration provides a baseline free of built-in hardening, ensuring all sequential elements remain fully susceptible to SEUs and enabling the assessment of intrinsic architectural vulnerability.
The fault injection framework employs NETFI+, an updated version of the original NETFI tool [45] and its successor NETFI-2 [46]. NETFI+ builds upon this foundation with enhanced RTL-level instrumentation, support for multi-bit upsets (MBUs) and single-event transients (SETs), automation for large-scale fault injection campaigns, and compatibility with both event-driven simulation and FPGA-based emulation. Critically, it permits bit-accurate injection into any flip-flop at any clock cycle or multi-cycle window, a capability validated in prior LEON3 studies of control and arithmetic units [16,47].
Two independent fault injection campaigns were executed using the randomized injection mode of NETFI+ to ensure unbiased and statistically representative coverage of the target subsystems:
  • Cache subsystem campaign: 482,400 multi-cycle SEUs injected across 402 instrumented flip-flops spanning the I-cache, D-cache, and AHB control interface.
  • MCTRL campaign: 248,400 multi-cycle SEUs injected across 207 instrumented flip-flops in the MCTRL block.
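Both campaigns apply a uniform per-flip-flop injection budget; a small sketch (the 1200-injections-per-flip-flop figure is derived from the totals above rather than stated explicitly):

```python
# Campaign sizes: (total multi-cycle SEU injections, instrumented flip-flops).
campaigns = {
    "cache subsystem": (482_400, 402),
    "MCTRL": (248_400, 207),
}

for name, (injections, flops) in campaigns.items():
    assert injections % flops == 0  # the budget divides evenly
    print(f"{name}: {injections // flops} injections per flip-flop")
```

Both campaigns work out to 1200 injections per instrumented flip-flop, which keeps the spatial sampling uniform across subsystems of very different sizes.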

4.2. Experimental Setup

Figure 2 illustrates the end-to-end RTL fault injection workflow enabled by NETFI+. Starting from the LEON3 HDL source code, the design undergoes synthesis via Synplify®, yielding a structural netlist. MODNET then instruments the netlist by inserting ‘INJ’ signals into all targeted flip-flops. Simulation is executed in ModelSim® Version 20.1.0.711 under controlled injection conditions, driven by TCL scripts that generate test benches and orchestrate fault scenarios. Post-simulation, results are classified using automated scripts and compiled into spreadsheets for statistical analysis.
Fault injection setup is performed using the MODify NETlist (MODNET) tool [45], which inserts an auxiliary enable signal ‘INJ’ into each targeted sequential element. The injection vector spans 609 flip-flops across four subsystems:
  • 93 in the I-cache;
  • 293 in the data cache (D-cache);
  • 16 in the AHB control interface;
  • 207 in the memory controller (MCTRL).
This setup enables clock-cycle-accurate SEU activation during ModelSim® simulation, with three degrees of freedom: (i) arbitrary flip-flop selection, (ii) random injection timing within the benchmark execution window, and (iii) configurable fault duration, single-cycle or multi-cycle. Multi-cycle injection is used throughout this study to model realistic charge-collection effects, wherein an upset persists across multiple pipeline stages. This is particularly important for cache controllers and MCTRL state machines, where transient errors may propagate through sequential logic over several cycles before manifesting as observable failures.
This methodology ensures comprehensive sampling across three orthogonal dimensions: spatial (flip-flop location), temporal (injection cycle), and architectural (subsystem under test). By avoiding fixed or workload-biased injection points, the campaign mitigates coverage skew toward frequently accessed cache lines, dominant control states, or specific memory regions. Following each injection, the processor continued execution until either successful completion or the onset of a failure condition (e.g., unhandled exception or simulation timeout), enabling robust post-simulation classification into masked, silent data corruption (SDC), or halt/timeout categories.
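The three-way outcome classification described above can be sketched as a simple decision rule; a minimal illustration, where `completed`, `output`, and `golden` are hypothetical stand-ins for the observed run status, its result, and the fault-free reference run:

```python
def classify_outcome(completed: bool, output, golden):
    """Classify one fault-injection run into the three categories used
    in this study: masked, silent data corruption (SDC), or halt/timeout."""
    if not completed:
        # Unhandled exception or simulation timeout before benchmark end.
        return "halt/timeout"
    if output != golden:
        # Run finished but produced a wrong result, with no trap raised.
        return "SDC"
    # Fault had no observable effect on the final output.
    return "masked"

print(classify_outcome(False, None, [1, 2]))   # halt/timeout
print(classify_outcome(True, [1, 3], [1, 2]))  # SDC
print(classify_outcome(True, [1, 2], [1, 2]))  # masked
```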

4.3. Benchmark and Simulation Environment

All experiments employ the MulMatrix benchmark, a 30 × 30 signed integer matrix multiplication workload, executed in a bare-metal environment on the standard LEON3 core (GRLIB v1.0.30) with no operating system or runtime abstraction. The benchmark was compiled using the SPARC GCC toolchain with standard optimization flags and linked to a minimal runtime harness for bare-metal execution.
MulMatrix was selected for its strong suitability in soft-error vulnerability analysis. It exhibits deterministic control flow, high memory intensity, and a balanced mix of arithmetic, load/store, and branching instructions, properties that together maximize the observability of transient faults. Moreover, its compact code and data footprint ensure manageable simulation runtimes while fully activating the cache and memory controller logic throughout execution.
Although MulMatrix has previously been used to analyze program counter sensitivity under SEU injection [8,10,12,13,47], its memory-bound, deeply nested loop structure renders it equally effective, and highly appropriate, for exposing SEU-induced anomalies in cache controllers and MCTRL logic, which constitute the primary focus of this study.
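For reference, the benchmark's functional core is a dense 30 × 30 signed integer matrix product. A minimal Python model of that computation follows (the actual benchmark is compiled C running bare-metal on SPARC, so the input pattern here is illustrative):

```python
N = 30  # matrix dimension used by the MulMatrix benchmark

def mulmatrix(a, b):
    """Dense N x N signed integer matrix multiplication: C = A * B.
    Deterministic, memory-intensive, and deeply nested, which is what
    activates the cache and memory-controller paths under test."""
    c = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            s = 0
            for k in range(N):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c

# Deterministic signed inputs in a fixed pattern (illustrative only).
A = [[i - j for j in range(N)] for i in range(N)]
B = [[(i + j) % 7 - 3 for j in range(N)] for i in range(N)]
C = mulmatrix(A, B)
```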

5. Fault Injection Campaign: Results and Analysis

5.1. Cache Subsystem Reliability Analysis

The cache subsystem of the standard LEON3 processor exhibits significant vulnerability to SEUs due to the absence of built-in error detection or correction mechanisms. A total of 482,400 multi-cycle SEU injections were performed across 402 flip-flops spanning the I-cache, D-cache, and AHB control interface. The results reveal distinct reliability profiles for each component, reflecting their architectural roles and exposure to corrupted state.

5.1.1. Instruction Cache Controller Error Rate Classification

A total of 111,600 multi-cycle SEUs were injected into the 93 flip-flops governing the I-cache, including tag arrays, valid bits, and fetch-state logic. The resulting error classification, presented in Figure 3, reveals a tripartite vulnerability profile: nearly half of all injected faults (47.08%) were masked, a non-negligible 7% resulted in SDC, and the remaining 45.92% led to execution halts or timeouts.
The high masking rate reflects the strong spatial and temporal locality inherent in instruction access patterns during the MulMatrix benchmark; many corrupted cache lines are either never fetched, overwritten before use, or evicted during normal replacement activity, thereby preventing error propagation. Nevertheless, the SDC rate of 7% demonstrates that a measurable fraction of faults subtly alters instruction semantics without violating syntactic validity, for instance, by flipping bits in branch displacement fields or arithmetic operands, resulting in undetected output deviations. Most critically, the halt/timeout rate of 45.92% underscores the I-cache’s pivotal role in pipeline continuity: persistent corruption in tag storage, valid-bit metadata, or fetch control logic frequently disrupts instruction delivery, triggering unhandled exceptions (such as illegal instruction traps) or causing pipeline stalls that prevent benchmark completion.
These results indicate that the I-cache exhibits partial natural resilience due to workload characteristics, yet remains a significant source of both silent and catastrophic failures. This dual behavior reflects the tension between architectural locality, which suppresses many errors, and microarchitectural exposure, where even infrequent faults can propagate into critical control-path disruptions. The high incidence of halt/timeout outcomes further confirms that the I-cache’s integrity is fundamental to sustained execution, making it a key locus for vulnerability characterization in non-fault-tolerant LEON3 deployments.

5.1.2. Data Cache Controller

A total of 351,600 multi-cycle SEUs were injected into the 293 flip-flops governing the D-cache, including its data arrays, dirty/valid metadata, write-back buffers, and address translation logic. The resulting error classification, illustrated in Figure 4, reveals a markedly severe vulnerability profile: only 5.4% of faults were masked, 31.59% resulted in SDC, and 63.01% led to execution halts or timeouts.
The extremely low masking rate reflects the comprehensive activation of data paths by the MulMatrix benchmark: nearly every corrupted word is loaded, stored, or used in computation, leaving minimal opportunity for natural suppression through locality or replacement. This confirms that the D-cache operates under near-maximal exposure during typical workloads, rendering it exceptionally sensitive to soft errors.
The dominant failure mode, silent data corruption at 31.59%, underscores the D-cache as the principal source of undetected output deviations in the system. In the absence of integrity checks such as parity or ECC on data arrays, corrupted values are silently consumed by arithmetic units or propagated back to memory, producing incorrect results without triggering any architectural exception. This represents a critical reliability risk, as the system continues operation unaware of its compromised state.
The high halt/timeout rate (63.01%) arises from corruptions that destabilize the memory subsystem’s operational semantics: invalid store operations, misaligned accesses, or protocol violations induced by corrupted metadata or address fields frequently lead to unhandled exceptions or pipeline stalls. These failures reflect the tight coupling between the D-cache state and the broader memory access path, where even minor corruption can cascade into systemic disruption.
These findings establish the D-cache as the most vulnerable component in terms of both output correctness (via SDC) and execution continuity (via halts/timeouts). Its behavior highlights the fundamental trade-off between performance optimization and resilience: while the cache accelerates data access, its unprotected structure amplifies the impact of transient faults, making it a central focus for vulnerability assessment in non-fault-tolerant LEON3 deployments.

5.1.3. AHB Control Interface

A total of 19,200 multi-cycle SEUs were injected into the 16 flip-flops governing the AHB control interface, which manages handshake signals (hready, hresp), burst control, byte-lane selection, and address forwarding between the LEON3 cache hierarchy and the external AMBA AHB bus. The resulting error classification is unequivocal: 100% of all injected faults resulted in execution halts or timeouts.
This extreme vulnerability arises from the AHB interface’s role as a single point of protocol failure: even a single-bit upset in address multiplexing, control signaling, or handshake logic corrupts the transaction sequence, leading to malformed instruction fetches or memory accesses that cannot be recovered by the core. Because the interface directly mediates every external memory transaction, any disruption propagates immediately into the instruction stream, destabilizing pipeline continuity before corrupted data can reach architectural state.
The complete absence of masked faults or silent data corruption reflects the criticality of protocol integrity in this block: unlike caches, which may tolerate transient errors through locality or replacement, the AHB interface has no mechanism to suppress or recover from corrupted transactions. Every fault manifests as an immediate system-level failure, typically via illegal instruction traps triggered by malformed opcodes received during fetch. The absence of masked faults is attributable to the continuous utilization of the AHB interface under MulMatrix, which leaves no inactive cycles wherein a corrupted control bit could remain unobserved.
These results establish the AHB control interface as the most fragile component in the entire memory path, not due to complexity, but due to its irreplaceable function as the sole conduit for instruction and data delivery. Its behavior underscores the disproportionate impact of small, unprotected sequential structures on overall system reliability, making it a high-priority target for vulnerability characterization in non-fault-tolerant LEON3 deployments.

5.2. Cache Subsystem Trap Analysis

Although the cache subsystem comprises non-architectural structures, specifically tag and data arrays, metadata registers, and AHB protocol logic, SEUs within these components frequently manifest as architecturally visible SPARC V8 exceptions. This phenomenon arises from the tight coupling between the cache hierarchy and the Fetch (FE), Decode (DE), and Register Access (RA) stages of the LEON3 pipeline: any corruption in cache output or transaction integrity destabilizes instruction delivery, address generation, or memory semantics, which the processor’s exception logic interprets as violations of architectural invariants.
Consequently, a significant fraction of the halt/timeout outcomes reported in Section 5.1 are accompanied by well-defined traps, rather than silent hangs or undefined behavior. The following subsections elucidate the microarchitectural pathways through which SEUs in each component induce precise exceptions, not via direct corruption of SPARC control registers (e.g., %psr, %wim, %tbr), but through indirect side effects that breach protocol or semantic assumptions embedded in the processor’s execution model.

5.2.1. Instruction Cache Controller Trap Analysis

The I-cache supplies the FE stage with instruction words. Thus, any corruption in its tag arrays, valid bits, or data RAM directly upsets the instruction stream before Decode. As shown in Figure 5, this corruption manifests exclusively as architecturally visible SPARC V8 traps, leading to three dominant trap categories.
  • Illegal instruction (53.56% of trap outcomes; 24.59% of all injections), the most frequent outcome, arises when bit-flips produce non-decodable opcodes, causing the decode unit to raise an illegal instruction exception upon encountering syntactically invalid bit patterns, an immediate symptom of instruction-stream corruption.
  • Privileged instruction (21.46% of trap outcomes; 9.85% of all injections) occurs when an SEU transforms a user-mode instruction into a syntactically valid supervisor opcode, triggering a privileged instruction trap through a privilege violation in user mode; the fault lies in the corrupted instruction content, not in the PSR.
  • Window overflow (24.98% of trap outcomes; 11.47% of all injections), an indirect consequence of control-flow corruption, results when SEUs alter branch targets or mutate arithmetic instructions into save opcodes, causing unintended register-window spills that exhaust the architectural window depth and invoke a window overflow trap, despite no corruption of the WIM register.
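The paired percentages in the list above are mutually consistent: a trap's share of all injections equals its share of trap outcomes multiplied by the overall I-cache halt/timeout rate (45.92%). A small check using the reported figures:

```python
halt_rate = 0.4592  # I-cache halt/timeout fraction from Section 5.1.1

traps = {
    # trap: (share of trap outcomes, share of all injections)
    "illegal instruction": (0.5356, 0.2459),
    "privileged instruction": (0.2146, 0.0985),
    "window overflow": (0.2498, 0.1147),
}

for name, (of_traps, of_total) in traps.items():
    # of_total ~= of_traps * halt_rate, up to rounding in reported figures
    assert abs(of_traps * halt_rate - of_total) < 0.0005, name
```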
These results confirm that the I-cache is a primary vector for control-path corruption, with nearly all non-masked faults materializing as architecturally visible exceptions.

5.2.2. Data Cache Controller Trap Analysis

Unlike the I-cache, the D-cache influences the data path and address generation. Its faults rarely corrupt instructions directly but instead poison the program state, which later propagates into control-path failures. As shown in Figure 6, this leads to four dominant trap categories.
  • Illegal instruction (48.17%, 30.35% of total FI), an indirect cascade where corrupted load values (e.g., an invalid return address fetched from memory) are used as branch targets. If the target points to non-instruction data or an unmapped region, the fetched word becomes nonsensical, triggering an illegal instruction trap, thus destabilizing control flow via data-dependent addressing.
  • Privileged instruction (44.68%, 28.15% of total FI), similarly indirect, arises when a corrupted load places a supervisor opcode into a general-purpose register later executed via a function pointer, or when a corrupted stack value causes a return to supervisor code while in the user mode; in both cases, the trap stems from data-driven control-flow hijacking, not PSR corruption.
  • Window overflow (2.81%, 1.77% of total FI) results from corrupted loop counters or function arguments that alter save/restore behavior; for example, a flipped bit in a recursion counter may trigger excessive save operations that exhaust the register windows, a data-induced control anomaly.
  • Memory address not aligned (4.34%, 2.73% of total FI) is a direct structural failure caused by SEUs in byte-enable logic or address LSBs that force misaligned accesses (e.g., a 32-bit load from an odd address). Since SPARC mandates alignment, the RA stage raises a misaligned memory address trap, a direct consequence of address-path corruption.
This profile underscores a key insight: D-cache faults are stealthier but equally dangerous. They often bypass immediate detection, only revealing themselves later as severe control-flow violations, highlighting the critical role of data integrity in preserving control semantics.

5.2.3. AHB Control Interface Trap Analysis

The AHB control interface, the protocol bridge between the LEON3 cache hierarchy and the AMBA AHB bus, emerges as the most fragile point of failure in the cache subsystem. It coordinates address forwarding, burst control, handshake signaling (hready, hresp), and transaction sequencing. Because instruction fetch depends entirely on the integrity of these transactions, any SEU-induced disruption immediately compromises control-flow continuity.
As shown in Figure 7, this extreme vulnerability is quantified: 100% of the 19,200 SEUs injected into its 16 flip-flops resulted in illegal instruction traps, with zero masked or silent data corruption outcomes. This uniformity arises because the AHB interface implements no architectural logic; it cannot generate privileged instructions, window overflow, or memory address not aligned exceptions. Its sole function is to deliver correct instruction words from memory to the fetch stage.
An SEU may corrupt any of the following:
  • Address muxing logic (haddr, bo);
  • Handshake control signals (hready, hresp);
  • Arbitration state (hgrant, bg);
  • Transfer control (htrans, hburst).
The resulting AMBA transaction either returns invalid data or stalls indefinitely. The I-cache forwards this corrupted word to the Decode stage, which, faced with a syntactically invalid opcode, raises an illegal instruction trap. The trap is thus a downstream symptom of upstream communication failure, not a direct consequence of architectural state corruption.
Although SEUs may also induce bus deadlocks (e.g., via permanently deasserted hready), in our finite-length MulMatrix simulation, the first corrupted fetch is always detected before timeout classification, resulting in trap-based halts rather than stalls.

5.3. Memory Controller (MCTRL) Reliability Analysis

The MCTRL, instantiated as mctrl_work_leon3mp_rtl_0layer0, operates as the physical-layer interface between the LEON3 core and external memory peripherals (SRAM, ROM, I/O devices). It is architecturally invisible: no software-accessible registers expose its state, and it functions entirely beneath the cache hierarchy. Nevertheless, the MCTRL governs the timing, protocol compliance, and electrical integrity of every external memory transaction by configuring wait states (ws, iows, romrws, ramrws), bus widths (ramwidth, romwidth, iowidth), region enables (ioen, brdyen, bexcen), and AHB protocol signals (hready, hresp, hwrite, mben).
In the standard (non-fault-tolerant) LEON3 configuration, the MCTRL employs no parity, ECC, or redundancy, leaving all 207 sequential elements fully exposed to SEUs. As shown in Figure 8, a total of 248,400 multi-cycle SEU injections yielded a failure distribution characterized by 21.41% masked faults, 3.35% SDC, and 75.24% halt/timeout outcomes.
Masked faults (21.41%) occur only under limited conditions: when a corrupted configuration field is inactive during benchmark execution (e.g., I/O timing parameters during pure RAM access) or when redundancy in protocol signaling absorbs the error. However, because MulMatrix exercises both instruction fetch (ROM path) and data access (SRAM path) continuously, most MCTRL parameters, including ramwidth, bstate, and srhsel, are activated on nearly every bus cycle, leaving minimal opportunity for natural masking.
Silent data corruption (3.35%) represents the most insidious threat. These failures arise when parameter corruption preserves AHB protocol validity but distorts data semantics or timing. Examples include the following:
  • A flipped bit in iows[2] that reduces I/O wait states, causing the core to sample unstable data from a peripheral;
  • Corruption of ramwidth, forcing a 32-bit memory device into the 16-bit mode, splitting a single word into two half-word transfers;
  • Alteration of byte-enable strobes (mben), leading to partial writes that overwrite adjacent fields.
Because the AHB transaction completes without error (hresp = OK), no trap is raised. The corrupted values propagate silently into the program state, producing incorrect output, a critical reliability risk in safety-critical applications.
Halt/timeout failures (75.24%) dominate due to protocol-level violations or bus deadlocks. SEUs in critical sequential logic, such as the bus state machine (bstate[7:0]), the hready generation logic, or the slave selector (srhsel), can do the following:
  • Permanently deassert hready, stalling the entire pipeline (classified as timeout);
  • Generate a spurious hresp = error, triggering a memory_access_exception;
  • Misroute transactions to unmapped regions, corrupting the fetched instruction word, which later manifests as an illegal_instruction trap in the Decode stage.
This downstream trap behavior, though originating in MCTRL, is indistinguishable from cache-induced faults, demonstrating how physical-layer errors can masquerade as control-path failures.
These results confirm that the MCTRL is a high-leverage reliability bottleneck: despite its architectural invisibility, it acts as the gatekeeper of all external memory interactions. The combination of low masking, non-zero SDC, and extreme halt/timeout rates underscores the necessity of lightweight, targeted hardening to prevent both silent and catastrophic failures in radiation-prone environments.

5.4. MCTRL-Induced Trap Manifestations

Although the MCTRL operates beneath the SPARC V8 architectural layer and contains no architectural registers (e.g., %psr, %tbr, %wim), 75.24% of the 248,400 injected SEUs resulted in Halt/Timeout outcomes accompanied by architecturally visible traps. These exceptions do not arise from direct corruption of the processor state but from indirect cascading effects that compromise instruction fetch integrity, data correctness, or control-flow semantics. The trap distribution is presented in Figure 9 and analyzed below.
  • Illegal instruction (32.00%, 24.07% of total FI), the most direct manifestation, arises when SEUs in MCTRL configuration fields, such as ramwidth, romrws, or ioen, corrupt instruction fetch transactions by misaligning word boundaries, truncating data, or sampling unstable memory outputs. The resulting garbage opcode is detected by the DE stage as syntactically invalid, triggering an illegal instruction trap, reflecting physical-layer corruption of the instruction stream, not core logic faulting.
  • Window overflow (60.14%, 45.25% of total FI), the dominant indirect effect, is not caused by corruption of the %wim register (which resides in the CPU core), but by control-flow disruption stemming from a corrupted program state. In the MulMatrix benchmark, which features deep, regular loop nests, a single corrupted load can overwrite a return address with a value that induces unintended save instructions, or alter loop counters or function arguments, forcing excessive register-window spills. When the architectural window depth is exceeded, a window overflow trap is raised, and thus, MCTRL faults indirectly destabilize SPARC’s register-window mechanism through data-path poisoning, not control-register corruption.
  • Privileged instruction (5.29%, 3.98% of total FI) occurs when corrupted data redirects execution into supervisor code while in the user mode. Examples include a corrupted function pointer (due to faulty I/O timing or bus-width misconfiguration), causing a jump to a privileged opcode, or a corrupted stack frame leading to premature return into a trap handler. Though infrequent, these cases demonstrate that MCTRL-induced errors can violate privilege boundaries via data-driven control hijacking.
  • Memory address not aligned (2.57%, 1.94% of total FI) results from MCTRL faults in bus-width (busw) or transfer-size (hsize) logic that force incorrect byte-lane selection. Although the SPARC architecture enforces alignment in the Load/Store Unit, the root cause is physical-layer misconfiguration (e.g., treating a 32-bit device as 16-bit), leading to effective addresses that violate alignment constraints and trigger a memory address not aligned trap.
These findings underscore a critical principle: architectural visibility is not required for architectural impact. Even though the MCTRL is invisible to software, its corruption propagates through the memory hierarchy to induce precise, high-level exceptions, blurring the boundary between microarchitectural and architectural fault effects.

6. Comparative Vulnerability Analysis

6.1. Empirical Vulnerability Characterization

Figure 10 presents a comparative visualization of SEU resilience across four critical LEON3 subsystems: I-cache, D-cache, AHB control interface, and MCTRL. A comparative analysis of these results reveals that vulnerability is not uniformly distributed but is instead shaped by architectural function and position within the memory access hierarchy. Cache components exhibit partial natural resilience due to workload locality, whereas protocol-critical blocks, despite their compact size, emerge as high-leverage failure points whose integrity is essential to all external memory transactions. The distinct failure signatures observed across subsystems reflect fundamental differences in how microarchitectural corruption propagates into architectural outcomes, from silent data corruption to catastrophic execution collapse.
The comparative analysis begins with an examination of failure-mode distribution across subsystems, revealing systematic differences rooted in architectural design and operational semantics. The masking rate varies significantly: the I-cache exhibits the highest value at 47.06%, attributable to instruction locality; many corrupted lines are evicted, overwritten, or never referenced during benchmark execution.
In contrast, the D-cache shows minimal masking at 5.4% due to comprehensive data-path activation under MulMatrix, while the MCTRL masks 21.41% of faults only when configuration fields remain inactive (e.g., I/O timing parameters during pure RAM access). The AHB interface, however, exhibits zero masking, a direct consequence of its role as the sole conduit for instruction and data fetches, where even a single-bit upset disrupts transaction fidelity and prevents error suppression.
Silent data corruption emerges as the most insidious threat, dominated entirely by the D-cache at 31.59%. This stems from the absence of integrity checks on data arrays: corrupted values loaded from memory are consumed by arithmetic or control-flow logic without architectural visibility. All other subsystems exhibit negligible SDC: I-cache at 7.02% reflects instruction-stream observability, the MCTRL at 3.35% arises from subtle parameter corruption that preserves protocol validity but distorts timing or semantics, and the AHB interface is at 0% because any error immediately halts execution before silent propagation can occur.
Halt/timeout failures dominate in the AHB interface and MCTRL, with rates of 100% and 75.24% respectively, underscoring their role as gatekeepers of memory transaction integrity. Any corruption in address, control, or handshake logic collapses the instruction stream or induces bus deadlocks. The I-cache and D-cache also show high halt/timeout rates at 45.91% and 63.00%, but these stem from internal state corruption rather than protocol collapse. This distinction highlights a critical insight: physical-layer interfaces are more fragile than storage structures, not because they are more complex, but because their failure modes are systemic rather than localized.
Trap profiles further differentiate the subsystems. The I-cache manifests primarily through direct corruption: illegal instruction (24.59%), privileged instruction (9.85%), and window overflow (11.47%). The D-cache, in contrast, induces traps indirectly via data-path poisoning: illegal instruction (30.35%) and privileged instruction (28.15%) arise from corrupted branch targets or return addresses, while memory address not aligned (2.73%) results from misconfigured byte-lane selection. The AHB interface produces exclusively illegal instruction traps (100%), reflecting its role in delivering corrupted instruction words. The MCTRL, though architecturally invisible, generates a diverse set of traps, including illegal instruction (24.07%), window overflow (45.25%), and privileged instruction (3.98%), through indirect cascades that destabilize control flow via a corrupted program state.
These findings collectively establish that vulnerability is not uniformly distributed but is instead shaped by the architectural role: masking is a luxury of locality, SDC is a risk of unprotected data paths, and catastrophic failure is the hallmark of protocol-critical blocks. This understanding provides the foundation for targeted resilience engineering, where protection mechanisms can be allocated according to both failure severity and detection feasibility.

6.2. Comparative Context with Prior LEON3 Reliability Studies

This work fundamentally extends the landscape of LEON3 fault resilience research by shifting the focus from architecturally visible storage elements, such as register files [9,16], pipelines [8], or external SDRAM [12,13], to the architecturally invisible yet functionally critical memory transaction path encompassing the instruction/data caches, AHB control interface, and MCTRL. Prior studies largely targeted components where faults manifest as data corruption or control-flow deviation, but none isolated the AHB or MCTRL as independent failure surfaces. For instance, ref. [16] reported high overwrite rates in caches due to locality, a behavior our I-cache also exhibits (47.06% masking), but did not analyze protocol logic or causal trap semantics. Similarly, ref. [8] mapped PC corruption to Window Overflow traps, yet our results demonstrate that the MCTRL alone, despite containing no architectural registers, can induce the same trap signature (45.25%) through data-path corruption, proving that such exceptions are not exclusive to PC faults. While [12,13] revealed extreme workload-dependent error observability in SDRAM (e.g., AES masks > 88% of faults), their focus remained external to the internal protocol logic that orchestrates all memory activity. Crucially, no prior work—not even those targeting control logic, like [7], or employing symptom monitoring, like [30,31,33]—analyzed the AHB interface. Our finding that AHB faults yield a 100% halt with 0% masking exposes a systemic fragility absent from all the existing literature, where protocol integrity was either assumed (e.g., ref. [27] with EDAC) or overlooked in favor of storage structures. 
This causal, microarchitectural root-cause analysis, linking specific flip-flop corruption in AHB/MCTRL to precise SPARC V8 exceptions, reveals a vulnerability hierarchy orthogonal to traditional error counts, and it establishes that the most severe failure modes originate not in large memory arrays, but in compact, non-storage protocol logic.

6.3. Limitations and Workload Dependency

The vulnerability metrics reported in this study, particularly masking and SDC rates, are contingent upon the execution characteristics of the MulMatrix benchmark. MulMatrix was deliberately selected for its dense, linear memory access pattern, which maximizes fault activation and minimizes idle or conditional execution paths. Consequently, the observed D-cache SDC rate of 31.59% and low masking (5.4%) represent a worst-case scenario for data-path vulnerability.
In contrast, applications featuring sparse memory access, frequent branching, or significant idle cycles (e.g., network packet processors, search algorithms, or control-dominated embedded tasks) would likely exhibit substantially higher masking rates and lower SDC, as many corrupted cache lines may never be referenced or consumed. Therefore, the quantitative results presented herein should be interpreted as upper bounds on vulnerability under memory-intensive workloads, rather than universal values. This limitation is inherent to any single-benchmark fault injection campaign but is justified by our goal of stress-testing the memory transaction path to expose its most severe failure modes.

7. Implications for Targeted Hardening

The distinct vulnerability profiles observed across subsystems directly inform lightweight, targeted hardening strategies that align with the constraints of non-fault-tolerant LEON3 deployments. As summarized in Table 2, these strategies address the dominant failure modes of each component with minimal architectural intrusion. For the instruction cache controller, parity protection on instruction words and tag arrays, combined with automatic line invalidation on parity error, provides high-efficiency detection and containment of instruction-stream corruption. In the data cache, the dominant source of silent data corruption, SEC-DED ECC on data arrays, parity on metadata (tag, valid, dirty bits), and error-triggered line invalidation collectively suppress undetected errors and prevent misfetches. The AHB control interface, which exhibits zero tolerance for faults, benefits from parity on command signals (e.g., haddr, htrans, hwrite), duplication-with-comparison (DWC) on handshake signals (hready/hresp), and a transaction watchdog timer to ensure protocol fidelity and enable recovery from deadlocks. Finally, for the MCTRL, parity on configuration registers (e.g., ramwidth, iows, romrws), safe state encoding (e.g., one-hot or Hamming distance) for the bstate FSM, and runtime monitoring of handshake signals (hready, hresp) mitigate both silent semantic corruption and catastrophic transaction failures.
The proposed hardening techniques incur minimal overhead: parity on configuration registers adds <1% area; DWC on handshake signals introduces negligible latency (<1 cycle); and a transaction watchdog timer operates off the critical path, adding no timing penalty but ~200 LUTs in Artix-7 FPGAs [11,33]. These estimates align with lightweight hardening strategies suitable for non-fault-tolerant LEON3 deployments.
Collectively, these measures demonstrate that significant resilience can be achieved through selective, subsystem-aware protection, without compromising determinism, area efficiency, or real-time predictability in radiation-prone embedded systems.

8. Conclusions

This work has presented a comprehensive vulnerability analysis of the internal memory transaction path in the standard LEON3 processor under single-event upsets. By targeting the instruction cache, data cache, AHB control interface, and memory controller (MCTRL), components that operate largely beneath the architectural surface, this study reveals that unprotected protocol-critical logic can dominate system-level reliability, despite its small footprint and architectural invisibility.
Key findings demonstrate that the AHB interface acts as a single point of catastrophic failure, with any corruption invariably collapsing the instruction stream. The MCTRL, though containing no architectural registers, frequently induces high-level SPARC V8 exceptions, including window overflow and privileged instruction, through indirect data-path corruption, not control-register faults. Meanwhile, the data cache emerges as the primary source of silent data corruption, posing a critical threat to output correctness in safety-critical applications.
The findings indicate that vulnerabilities are not distributed uniformly but rather are shaped by functional roles within the memory access hierarchy. While natural masking is present in the instruction cache, it is not enough to ensure resilience in protocol and controller logic. The findings provide a mechanistic foundation for selective hardening: rather than applying uniform protection, designers can prioritize high-leverage blocks such as the AHB handshake signals, MCTRL configuration registers, and D-cache data arrays.
Future work will extend this analysis to multicore LEON3 configurations under shared-memory contention, and will explore hybrid mitigation strategies that combine lightweight hardware checks with algorithm-based fault tolerance. Additionally, the methodology can be adapted to other open-source soft cores (e.g., RISC-V, ARM v7-M, Intel) to develop cross-architecture reliability guidelines for radiation-prone embedded systems.

Author Contributions

Conceptualization, A.K. and S.S.; methodology, A.K. and S.S.; software, A.K.; validation, A.K., S.S. and H.G.; formal analysis, S.S.; investigation, A.K., S.S. and H.G.; resources, A.K. and S.S.; data curation, A.K. and S.S.; writing—original draft preparation, A.K. and S.S.; writing—review and editing, A.K., S.S. and H.G.; visualization, A.K., S.S. and H.G.; supervision, S.S. and H.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Sehmi Saad was employed by the company RF2S Spectrum Solutions. The remaining authors declare that this research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
µP: Microprocessor
AES: Advanced Encryption Standard
AHB: Advanced High-performance Bus
ALU: Arithmetic Logic Unit
AMBA: Advanced Microcontroller Bus Architecture
APB: Advanced Peripheral Bus
ARM: Advanced RISC Machine
ASR: Ancillary State Register
BRAM: Block Random-Access Memory
BSORT: Bubble Sort
CEU: Code Emulated Upset
CFC: Control-Flow Checking
CFE: Control-Flow Error
CMOS: Complementary Metal-Oxide Semiconductor
COTS: Commercial Off-The-Shelf
CPU: Central Processing Unit
CRAM: Configuration Random-Access Memory
CRC32: Cyclic Redundancy Check 32
DBI: Dynamic Binary Instrumentation
DCA: Direct Cache Access
DED: Double-Error Detection
DIV: Division
DSU: Debug Support Unit
DWC: Duplication-With-Comparison
ECC: Error-Correcting Code
EDAC: Error Detection And Correction
FI: Fault Injection
FF: Flip-Flop
FFT: Fast Fourier Transform
FPGA: Field-Programmable Gate Array
FPU: Floating-Point Unit
FSM: Finite-State Machine
FT: Fault-Tolerant
GCC: GNU Compiler Collection
HDL: Hardware Description Language
HW: Hardware
I/D: Instruction/Data
I/O: Input/Output
IEEE: Institute of Electrical and Electronics Engineers
IP: Intellectual Property
IR: Instruction Register
IRQ: Interrupt Request
ISA: Instruction Set Architecture
ISS: Instruction Set Simulator
IU: Integer Unit
IU3: LEON3 Integer Unit
JTAG: Joint Test Action Group
KB: Kilobyte
L1: Level 1
LD/ST: Load/Store
LRU: Least Recently Used
LUT: Look-Up Table
MBU: Multiple-Bit Upset
MCTRL: Memory Controller
MET: Multiple-Event Transient
MF: Maximum Flow
MiBench: Embedded Benchmark Suite
MMU: Memory Management Unit
MMULT: Matrix Multiplication (MulMatrix)
MODNET: MODify NETlist
MUL: Multiplication
NETFI: NETlist Fault Injection
OCD: On-Chip Debugging
OK: Okay
PC: Program Counter
PID: Proportional-Integral-Derivative
PROM: Programmable Read-Only Memory
PSR: Processor State Register
QSort: Quick Sort
RAM: Random-Access Memory
RISC: Reduced Instruction Set Computing
RLE: Run-Length Encoding
ROM: Read-Only Memory
RTEMS: Real-Time Executive for Multiprocessor Systems
RTL: Register-Transfer Level
SBST: Software-Based Self-Test
SBU: Single-Bit Upset
SDC: Silent Data Corruption
SDRAM: Synchronous Dynamic Random-Access Memory
SEC: Single-Error Correction
SER: Soft Error Rate
SET: Single-Event Transient
SEU: Single-Event Upset
SHA: Secure Hash Algorithm
SoC: System-on-Chip
SPARC: Scalable Processor Architecture
SRAM: Static Random-Access Memory
SW: Software
TCL: Tool Command Language
TLB: Translation Lookaside Buffer
TLM: Transaction-Level Modeling
TMR: Triple Modular Redundancy
UART: Universal Asynchronous Receiver-Transmitter
VHDL: VHSIC Hardware Description Language
VHSIC: Very-High-Speed Integrated Circuit
WIM: Window Invalid Mask
XRTC: Xilinx Radiation Test Consortium
XML: eXtensible Markup Language

References

  1. Wang, J.; Zhang, H.; Zhu, X.; Shen, G.; Chang, Z.; Xu, X.; Yu, T.; Zhu, X.; Zhang, L.; Ma, Y. Analyzing measured evidence for inducing factors of SEU from in-flight data of NSSC-SPRECMI on OPUS CZ-4C. IEEE Trans. Nucl. Sci. 2024, 72, 101–109. [Google Scholar] [CrossRef]
  2. Rodbell, K.P. Low-Energy Protons—Where and Why “Rare Events” Matter. IEEE Trans. Nucl. Sci. 2024, 67, 1204–1215. [Google Scholar] [CrossRef]
  3. Sangchoolie, B.; Pattabiraman, K.; Karlsson, J. One bit is (not) enough: An empirical study of the impact of single and multiple bit-flip errors. In Proceedings of the 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Denver, CO, USA, 26–29 June 2017; pp. 97–108. [Google Scholar]
  4. Joshi, K.; Singh, R.; Bassetto, T.; Adve, S.; Marinov, D.; Misailovic, S. FastFlip: Compositional SDC Resiliency Analysis. In Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization, Las Vegas, NV, USA, 1–5 March 2025; pp. 362–376. [Google Scholar]
  5. Bonet, M.S.; Kosmidis, L. SPARROW: A low-cost hardware/software co-designed SIMD microarchitecture for AI operations in space processors. In Proceedings of the IEEE Design, Automation & Test in Europe Conference & Exhibition (DATE), Antwerp, Belgium, 14–23 March 2022; pp. 1139–1142. [Google Scholar]
  6. Bekker, D.L.; Tran, M.Q.P. Performance analysis of standalone and in-fpga LEON3 processors for use in deep space missions. In Proceedings of the 2019 IEEE Aerospace Conference, Big Sky, MT, USA, 2–9 March 2019; pp. 1–17. [Google Scholar]
  7. Bonnoit, T.; Coelho, A.; Zergainoh, N.E.; Velazco, R. SEU impact in processor’s control-unit: Preliminary results obtained for LEON3 soft-core. In Proceedings of the 18th IEEE Latin American Test Symposium (LATS), Bogota, Colombia, 13–15 March 2017; pp. 1–4. [Google Scholar]
  8. Kchaou, A.; Saad, S.; Garrab, H.; Machhout, M. Reliability of LEON3 Processor’s Program Counter Against SEU, MBU, and SET Fault Injection. Cryptography 2025, 9, 54. [Google Scholar] [CrossRef]
  9. Travessini, R.; Villa, P.R.; Vargas, F.L.; Bezerra, E.A. Processor core profiling for SEU effect analysis. In Proceedings of the IEEE 19th Latin-American Test Symposium (LATS), Sao Paulo, Brazil, 12–14 March 2018; pp. 1–6. [Google Scholar]
  10. Mansour, W.; Velazco, R. SEU fault-injection in VHDL-based processors: A case study. J. Electron. Test. 2013, 29, 87–94. [Google Scholar] [CrossRef]
  11. Kempf, F.; Hoefer, J.; Kreß, F.; Hotfilter, T.; Harbaum, T.; Becker, J. Runtime adaptive cache checkpointing for risc multi-core processors. In Proceedings of the IEEE 35th International System-on-Chip Conference (SOCC), Belfast, UK, 5–8 September 2022; pp. 1–6. [Google Scholar]
  12. Kchaou, A.; Saad, S.; Garrab, H. Workload-Dependent Vulnerability of SDRAM Multi-Bit Upsets in a LEON3 Soft-Core Processor. Electronics 2025, 14, 4852. [Google Scholar] [CrossRef]
  13. Kchaou, A.; Saad, S.; Garrab, H. Emulation-Based Analysis of Multiple Cell Upsets in LEON3 SDRAM: A Workload-Dependent Vulnerability Study. Electronics 2025, 14, 4582. [Google Scholar] [CrossRef]
  14. Kooli, M.; Kaddachi, F.; Di Natale, G.; Bosio, A. Cache-and register-aware system reliability evaluation based on data lifetime analysis. In Proceedings of the IEEE 34th VLSI Test Symposium (VTS), Las Vegas, NV, USA, 25–27 April 2016; pp. 1–6. [Google Scholar]
  15. Guzman-Miranda, H.; Aguirre, M.A.; Tombs, J. Noninvasive fault classification, robustness and recovery time measurement in microprocessor-type architectures subjected to radiation-induced errors. IEEE Trans. Instrum. Meas. 2009, 58, 1514–1524. [Google Scholar] [CrossRef]
  16. Abbasitabar, H.; Zarandi, H.R.; Salamat, R. Susceptibility analysis of LEON3 embedded processor against multiple event transients and upsets. In Proceedings of the 2012 IEEE 15th International Conference on Computational Science and Engineering, Paphos, Cyprus, 5–7 December 2012; pp. 548–553. [Google Scholar]
  17. Ebrahimi, M.; Sayed, N.; Rashvand, M.; Tahoori, M.B. Fault injection acceleration by architectural importance sampling. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES + ISSS), Amsterdam, The Netherlands, 4–9 October 2015; pp. 212–219. [Google Scholar]
  18. Ebrahimi, M.; Moshrefpour, M.H.; Golanbari, M.S.; Tahoori, M.B. Fault injection acceleration by simultaneous injection of non-interacting faults. In Proceedings of the 53rd Annual Design Automation Conference, Austin, TX, USA, 5–9 June 2016; pp. 1–6. [Google Scholar]
  19. Tuzov, I.; de Andrés, D.; Ruiz, J.C. Accurate robustness assessment of hdl models through iterative statistical fault injection. In Proceedings of the 14th European Dependable Computing Conference (EDCC), Iasi, Romania, 10–14 September 2018; pp. 1–8. [Google Scholar]
  20. Khanov, V.K.; Chekmarev, S.A. Fast SEU fault injection in the SoC-memory. In Proceedings of the 2016 13th International Scientific-Technical Conference on Actual Problems of Electronics Instrument Engineering (APEIE), Novosibirsk, Russia, 3–6 October 2016; pp. 447–450. [Google Scholar]
  21. Sari, A.; Psarakis, M. A fault injection platform for the analysis of soft error effects in FPGA soft processors. In Proceedings of the IEEE 19th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), Kosice, Slovakia, 20–22 April 2016; pp. 1–6. [Google Scholar]
  22. Sari, A.; Psarakis, M. A flexible fault injection platform for the analysis of the symptoms of soft errors in FPGA soft processors. J. Circuits Syst. Comput. 2017, 26, 1740009. [Google Scholar] [CrossRef]
  23. Harward, N.A.; Gardiner, M.R.; Hsiao, L.W.; Wirthlin, M.J. Estimating soft processor soft error sensitivity through fault injection. In Proceedings of the IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines, Vancouver, BC, Canada, 2–6 May 2015; pp. 143–150. [Google Scholar]
  24. Harward, N.A.; Gardiner, M.R.; Hsiao, L.W.; Wirthlin, M.J. A fault injection system for measuring soft processor design sensitivity on Virtex-5 FPGAs. In FPGAs and Parallel Architectures for Aerospace Applications: Soft Errors and Fault-Tolerant Design; Springer International Publishing: Cham, Switzerland, 2016; pp. 61–74. [Google Scholar]
  25. Da Silva, A.; Sanchez, S. LEON3 ViP: A virtual platform with fault injection capabilities. In Proceedings of the 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools, Lille, France, 1–3 September 2010; pp. 813–816. [Google Scholar]
  26. Mansour, W.; Velazco, R. SEU fault-injection in VHDL-based processors: A case study. In Proceedings of the 13th Latin American Test Workshop (LATW), Quito, Ecuador, 10–13 April 2012; pp. 1–5. [Google Scholar]
  27. Chekmarev, S.A.; Khanov, V.K. Fault injection via on-chip debugging in the internal memory of systems-on-chip processor. In Proceedings of the IOP Conference Series: Materials Science and Engineering, Krasnoyarsk, Russia, 6–10 April 2015; IOP Publishing: Philadelphia, PA, USA, 2015; p. 012020. [Google Scholar]
  28. Theodorou, G.; Kranitis, N.; Paschalis, A.; Gizopoulos, D. A software-based self-test methodology for on-line testing of processor caches. In Proceedings of the IEEE International Test Conference, Anaheim, CA, USA, 20–22 September 2011; pp. 1–10. [Google Scholar]
29. Theodorou, G.; Kranitis, N.; Paschalis, A.; Gizopoulos, D. Software-based self-test methodology for on-line testing of L1 caches in multithreaded multicore architectures. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2012, 21, 786–790. [Google Scholar] [CrossRef]
  30. Du, B.; Reorda, M.S.; Sterpone, L.; Parra, L.; Portela-García, M.; Lindoso, A.; Entrena, L. Online test of control flow errors: A new debug interface-based approach. IEEE Trans. Comput. 2015, 65, 1846–1855. [Google Scholar] [CrossRef]
  31. Parra, L.; Lindoso, A.; Portela-Garcia, M.; Entrena, L.; Du, B.; Reorda, M.S.; Sterpone, L. A new hybrid nonintrusive error-detection technique using dual control-flow monitoring. IEEE Trans. Nucl. Sci. 2014, 61, 3236–3243. [Google Scholar] [CrossRef]
32. Keller, A.M.; Wirthlin, M.J. Benefits of complementary SEU mitigation for the LEON3 soft processor on SRAM-based FPGAs. IEEE Trans. Nucl. Sci. 2016, 64, 519–528. [Google Scholar] [CrossRef]
  33. Lindoso, A.; Entrena, L.; García-Valderas, M.; Parra, L. A hybrid fault-tolerant LEON3 soft core processor implemented in low-end SRAM FPGA. IEEE Trans. Nucl. Sci. 2016, 64, 374–381. [Google Scholar] [CrossRef]
  34. Houssany, S.; Guibbaud, N.; Bougerol, A.; Leveugle, R.; Miller, F.; Buard, N. Microprocessor soft error rate prediction based on cache memory analysis. IEEE Trans. Nucl. Sci. 2012, 59, 980–987. [Google Scholar] [CrossRef]
  35. Houssany, S.; Guibbaud, N.; Bougerol, A.; Leveugle, R.; Santini, T.; Miller, F. Experimental assessment of cache memory soft error rate prediction technique. IEEE Trans. Nucl. Sci. 2013, 60, 2734–2741. [Google Scholar] [CrossRef]
  36. Kooli, M.; Di Natale, G.; Bosio, A. Cache-aware reliability evaluation through LLVM-based analysis and fault injection. In Proceedings of the IEEE 22nd International Symposium on on-Line Testing and Robust System Design (IOLTS), Sant Feliu de Guixols, Spain, 4–6 July 2016; pp. 19–22. [Google Scholar]
  37. Kaddachi, F.; Kooli, M.; Di Natale, G.; Bosio, A.; Ebrahimi, M.; Tahoori, M. System-level reliability evaluation through cache-aware software-based fault injection. In Proceedings of the IEEE 19th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), Kosice, Slovakia, 20–22 April 2016; pp. 1–6. [Google Scholar]
  38. Cannon, M.J.; Keller, A.M.; Rowberry, H.C.; Thurlow, C.A.; Pérez-Celis, A.; Wirthlin, M.J. Strategies for removing common mode failures from TMR designs deployed on SRAM FPGAs. IEEE Trans. Nucl. Sci. 2018, 66, 207–215. [Google Scholar] [CrossRef]
  39. Kasap, S.; Wächter, E.W.; Zhai, X.; Ehsan, S.; Mcdonald-Maier, K. Survey of soft error mitigation techniques applied to LEON3 soft processors on SRAM-based FPGAs. IEEE Access 2020, 8, 28646–28658. [Google Scholar] [CrossRef]
  40. Tong, J.G.; Anderson, I.D.; Khalid, M.A. Soft-core processors for embedded systems. In Proceedings of the International Conference on Microelectronics, Dhahran, Saudi Arabia, 16–19 September 2006; pp. 170–173. [Google Scholar]
41. Trouchkine, T.; Bouffard, G.; Clédière, J. Fault injection characterization on modern CPUs: From the ISA to the micro-architecture. In Proceedings of the IFIP International Conference on Information Security Theory and Practice, Paris, France, 2019; Springer International Publishing: Cham, Switzerland; pp. 123–138.
  42. Alshaer, I.; Colombier, B.; Deleuze, C.; Beroulle, V.; Maistri, P. Variable-length instruction set: Feature or bug? In Proceedings of the 25th Euromicro Conference on Digital System Design (DSD), Maspalomas, Spain, 31 August–2 September 2022; pp. 464–471. [Google Scholar]
43. Almeida, R.; Silva, V.; Cabral, J. Virtualized fault injection framework for ISO 26262-compliant digital component hardware faults. Electronics 2024, 13, 2787. [Google Scholar] [CrossRef]
  44. Peña-Fernández, M.; Serrano-Cases, A.; Lindoso, A.; Cuenca-Asensi, S.; Entrena, L.; Morilla, Y.; Martín-Holgado, P.; Martínez-Álvarez, A. Hybrid lockstep technique for soft error mitigation. IEEE Trans. Nucl. Sci. 2022, 69, 1574–1581. [Google Scholar] [CrossRef]
  45. Mansour, W.; Velazco, R. An automated SEU fault-injection method and tool for HDL-based designs. IEEE Trans. Nucl. Sci. 2013, 60, 2728–2733. [Google Scholar] [CrossRef]
  46. Solinas, M.; Coelho, A.; Fraire, J.A.; Zergainoh, N.E.; Ferreyra, P.A.; Velazco, R. Preliminary results of NETFI-2: An automatic method for fault injection on HDL-based designs. In Proceedings of the 18th IEEE Latin American Test Symposium (LATS), Bogota, Colombia, 13–15 March 2017; pp. 1–4. [Google Scholar]
  47. Timmers, N.; Spruyt, A.; Witteman, M. Controlling PC on ARM using fault injection. In Proceedings of the IEEE Workshop on Fault Diagnosis and Tolerance in Cryptography (FDTC), Santa Barbara, CA, USA, 16 August 2016; pp. 25–35. [Google Scholar]
Figure 1. Microarchitectural organization of cache subsystem and memory controller in the standard LEON3 processor.
Figure 2. RTL fault injection strategy using the NETFI+ framework.
Figure 3. Error classification for I-cache under SEU injection (total fault injections = 111,600).
Figure 4. Error classification for D-cache under SEU injection (total fault injections = 351,600).
Figure 5. Trap classification for I-cache under SEU injection.
Figure 6. Trap classification for D-cache under SEU injection.
Figure 7. Trap classification for AHB control interface under SEU injection.
Figure 8. Error classification for MCTRL under SEU injection (total fault injections = 248,400).
Figure 9. Trap classification for MCTRL under SEU injection.
Figure 10. Cross-subsystem comparison of SEU outcomes.
Table 1. Summary of LEON3 fault injection and reliability studies in the literature.

| Ref./Year | Objective | Methodology | Target Subsystems | Workload(s) | Key Strength |
|---|---|---|---|---|---|
| [7]/2017 | Assess SEU/MBU sensitivity in control logic | RTL fault injection (NETFI-2, FPGA) | IU3 control registers (instruction, flags, ALU config) | AES | Identifies DE-stage instruction registers as most critical; quantifies MBU impact in non-SBU-sensitive bits |
| [8]/2025 | Analyze PC vulnerability to SEU/MBU/SET | NETFI+ RTL injection (4M+ faults) | PC across FE, DE, RA, EX, ME, XC pipeline stages | MulMatrix | First pipeline-stage trap causality: window overflow from RA/EX PC corruption |
| [9]/2018 | Profile CPU register sensitivity to SBU | Simulation-based FI (ModelSim + TCL) | PROC3 registers (IU3, caches) | PID, BSORT, Hamming | Ranks 362 registers by harm; shows PC causes 100% failure; validates 99.25% SBU tolerance with partial TMR |
| [10,26]/2013 | Compare DFI vs. CEU fault coverage | RTL-modified DFI (single-cycle injection) | Register file (windowed) | MulMatrix, self-convergent algorithm | DFI accesses all RF bits (including pipeline-internal); reveals CEU overestimates error rates |
| [11]/2022 | Enable runtime-adaptive cache reliability for mixed-criticality workloads | Hardware-based cache checkpointing with rollback | L1 cache (instruction/data arrays and controller) | FFT, Matrix Multiplication, Mergesort, Quicksort, Black-Scholes | Low-overhead switching between performance and reliable modes; successful fault handling demonstrated on LEON3 |
| [12]/2025 | Assess workload-dependent MBU effects in SDRAM | FPGA-based runtime MBU FI | External SDRAM | MulMatrix, FFT, AES | Demonstrates extreme workload dependence: AES masks 92.4% of MBUs; MulMatrix activates >99.99% |
| [13]/2025 | Assess workload-dependent MCU effects in SDRAM | FPGA-based runtime MCU FI | External SDRAM | MulMatrix, FFT, AES | Reveals stark workload dependence: AES shows high non-propagation, while MulMatrix and FFT exhibit >97% observable errors; rare instruction traps linked to address/control corruption |
| [14,36,37]/2016 | Accelerate fault injection via data lifetime analysis | Software-based FI targeting only live registers and cache lines | Register file, cache (data arrays) | MiBench (QSort, Dijkstra, FFT) | Reduces injection space by >90% by focusing on active data; shows inactive injections are mostly masked |
| [15]/2009 | Measure post-fault recovery time and robustness | Dual-instance HW emulation (FT-UNSHADES-μP) | Cache, register file, SDRAM | Signal processing task | First to quantify recovery time and classify faults as transient/permanent; shows hardened software improves robustness |
| [16]/2012 | Analyze multi-fault effects (SEU, MBU, SET, MET) | RTL simulation (ModelSim, VHDL) | FFs, registers, register file, I/D cache (tag/data) | QSort, AES, CRC32 | Comparative analysis of four fault models across MiBench workloads; quantifies overwrite/latent/failure rates |
| [17,18]/2016 | Accelerate FI via batched non-interacting faults | Emulation + k-ary tree fault batching | I/D cache (tag/data), register file, FFs | 8 MiBench: CRC32, SHA, QSort, FFT, … | 85× average speedup via simultaneous injection of non-interacting faults; validated on FPGA |
| [19]/2018 | Optimize statistical fault injection sample size | Iterative sampling + RTL simulation | Integer unit, multiply/divide unit FFs | Matrix math | Adaptive campaign sizing reduces fault injections by up to 6× while maintaining 0.1% error margin |
| [20]/2016 | Enable fast SEU injection in SoC memory | On-chip FI IP-core via OCD/MCTRL | Internal memory (register file, cache), external SDRAM | No workload specified | Injection latency as low as 54 ± 7 cycles; supports MBU emulation |
| [21,22]/2017 | Analyze SEU symptoms via performance counters | On-chip FI (RapidSmith + MicroBlaze) | Full LEON3 (IU, MMU, MUL/DIV) | FFT, SHA, Basic math | Uses exception signals and performance-counter deviations (e.g., cache misses, AHB utilization) for 96% error coverage |
| [23,24]/2016 | Compare CRAM sensitivity across soft processors | Bitstream-level FI (XRTC-V5FI) | Full core (CRAM only) | Towers of Hanoi, Dhrystone, Whetstone, CoreMark, Dijkstra | Shows 54% software-dependent sensitivity; LEON3 81.3% more sensitive than MicroBlaze |
| [25]/2010 | Enable early TLM-level fault injection | SystemC TLM2.0 + DBI + XML | ISS registers, TLM transactions | SHA-1 | First TLM2.0 fault framework using transport_dbg and runtime binary wrappers |
| [27]/2015 | Enable autonomous SEU injection via on-chip debugger | On-chip FI IP-core using DSU/OCD | Register file, cache (EDAC-protected) | Register loop, RTEMS tasks | Injection latency ≤54 cycles; <2% FPGA overhead; host-free campaigns |
| [28,29]/2012 | Enable on-line cache testing via SBST | SW-based March test using DCA | L1 cache (tag/data arrays) | No workload specified | Uses debug instructions for 83% smaller code, 72% faster vs. prior SBST; validates 100% March fault coverage |
| [30,31]/2015 | Detect control-flow errors via debug interface | External CFC module monitoring PC/IR via DSU trace port | Control flow (PC/instruction stream) | Bubble, Matrix, Dijkstra, RLE, MF | >95% CFE detection with no hardware/software modification; works with caches enabled |
| [32]/2017 | Evaluate combined SEU mitigation: TMR + scrubbing | FI + neutron radiation testing | Full LEON3 core (TMR-protected) | Dhrystone | Demonstrates ~50× reliability improvement with TMR + CRAM/BRAM scrubbing |
| [33]/2017 | Implement hybrid fault tolerance in low-end FPGA | Combines SEC-DED, HW CFC monitor, SW hardening, CRAM scrubbing | Cache, register file, CRAM | Quicksort, MMULT, AES | Achieves 94–96% error detection with low FPGA overhead (<2% LUTs); validated with neutron testing |
| [34,35]/2013 | Predict cache soft error rate (SER) under radiation | Analytical SER model + neutron radiation testing | I/D cache (architectural level) | No workload specified | First cache SER prediction model validated with neutron tests (10% accuracy) for LEON3 on Virtex-5 |
Table 2. Recommended hardening strategies per subsystem.

| Subsystem | Recommended Hardening Strategies | Primary Failure Mode Addressed |
|---|---|---|
| I-cache | Parity on instruction words and tag arrays; automatic line invalidation on parity error | Instruction-stream corruption (illegal instruction, SDC) |
| D-cache | SEC-DED ECC on data arrays; parity on metadata (tag, valid, dirty); error-triggered line invalidation | SDC (31.59%), misfetches |
| AHB control interface | Parity on command signals (haddr, htrans, hwrite); DWC on handshake signals (hready/hresp); transaction watchdog timer | Protocol collapse (100% halt/timeout) |
| MCTRL | Parity on configuration registers (ramwidth, iows, romrws); safe FSM encoding for bstate; runtime monitoring of hready/hresp | Parameter corruption, bus deadlocks, SDC (3.35%) |
Share and Cite

Kchaou, A.; Saad, S.; Garrab, H. Reliability Analysis of the LEON3 Memory Subsystem Under Single-Event Upsets: Cache, AHB Interface, and Memory Controller Vulnerability. Information 2026, 17, 249. https://doi.org/10.3390/info17030249