1. Introduction
Code theft, direct non-volatile readout (a.k.a. ‘ROM scraping’), side-channel attacks and communication eavesdropping pose a serious problem for commercial products based on embedded-system integrated circuits [1,2,3,4]. These threats jeopardise embedded-code security and intellectual property. ROM scraping refers to the theft of code extracted directly from a ROM through physical access. Although it is feasible on ordinary integrated circuits, it is considerably more difficult on FPGAs, especially for MicroBlaze designs within an SoC architecture.
The default obfuscated structure of the MicroBlaze IP and the bitstream-encryption option in Xilinx tools make code extraction from ROM almost impossible [
5,
6,
7,
8]. MicroBlaze, as a soft-core processor embedded in the Programmable Logic (PL) section of the FPGA, stores its code in volatile configuration memory. By nature this memory is non-linear and obfuscated. The C code compiled for MicroBlaze is converted into this hardware configuration, becoming an integral part of the FPGA fabric; therefore, extracting it through physical techniques such as ROM scraping is extremely difficult. This structure fulfils the reverse-engineering-prevention goal that obfuscated algorithms are meant to serve against the problems defined in the threat model [
1,
2]. In the Zynq-7020 build of this study, security-critical C-code fragments are placed in AXI-mapped BRAMs inside reconfigurable partitions and are periodically relocated and scrubbed via DFX (PCAP), so the physical locus of both secrets and selected code pages also moves rather than relying solely on MicroBlaze/configuration-memory obfuscation (
Figure 1).
Side-channel attacks aim to obtain sensitive data by analysing power consumption or electromagnetic emissions. Bitstream encryption provides resistance to such attacks. Communication eavesdropping, in turn, tries to understand code functionality by analysing the data flow across pins, which can enable indirect code inference, especially through man-in-the-middle methods. The threat model assumes that an attacker may capture the commercial, intellectual-property-class software inside the integrated circuit by scraping the ROM, obtaining keys via side-channel analysis, or analysing communication patterns to infer code behaviour. To counter such threats on resource-constrained systems, lightweight AEAD primitives (e.g., the ASCON-128 family, which NIST has selected for standardisation) and a per-message keying policy are adopted to prioritise low energy and bounded latency in embedded deployments [9,10,11].
Comprehensive, advanced studies on embedded-system security in the literature ordinarily focus solely on FPGAs, yet specific solutions that directly address—and effectively prevent—the disclosure of code functionality through communication-line eavesdropping are limited [
12]. For example, the relevant work can be grouped under five main headings. Bitstream Protection: Guin [
5], tackled counterfeit integrated-circuit threats and proposed bitstream encryption; Obfuscation: Engels [
13], assessed the security of logic locking but did not emphasise communication security; IoT Security: Silva [
12], investigated lightweight encryption algorithms yet did not address code-functionality protection; Algorithm Hopping: Soliman [
14], showed that algorithm hopping together with Dynamic Partial Reconfiguration (DPR, now replaced by DFX) improves security; and Physical-Layer Key Generation: recent systems (e.g., MobileKey) derive symmetric keys from wireless-channel reciprocity on mobile devices; these works focus on bootstrapping keys between peers rather than preventing on-chip code theft or communication-semantics leakage targeted here [
15,
16,
17]. Specific countermeasures that prevent the disclosure of code functionality via communication eavesdropping therefore remain scarce. This gap is precisely what the proposed method seeks to fill. This study proposes a low-energy method that (i) derives per-message AEAD keys via an Ascon-XOF KDF from a small encrypted pool and a pre-shared root secret bound to the device ID and a timestamp/nonce, and (ii) relocates the encrypted pool and selected code pages in BRAM via DFX as a moving target [
18,
19,
20]. Evaluation is performed on Zynq-7020 hardware (and an Arduino-class solver) using cycles/byte, end-to-end latency, and 12 V/9.6 V energy measurements. Attack simulations are also carried out to assess the resistance of the method relative to current well-known AEAD standards.
Consequently, it has been observed that most existing solutions concentrate on bitstream encryption or on optimising the energy and execution time of general cryptographic algorithms, yet the literature lacks sufficient mechanisms that prevent the inference of code behavior from data captured through communication eavesdropping. Furthermore, academic FPGA-security proposals have been documented as largely theoretical and as failing to meet commercial requirements [
8,
21]. This academic–commercial mismatch has generated the expectation that the method proposed in this study, by addressing IP theft in addition to commercial requirements such as energy efficiency and practical deployability, will fill this gap. No specific threat model or countermeasure directly addressing the extraction of code functionality by communication eavesdropping or ROM scraping has been identified in the literature; current studies typically prioritize energy and time optimisation and, therefore, overlook this particular threat. Accordingly, practicality, security and energy efficiency are evaluated against representative lightweight AEAD baselines widely used on constrained devices (Ascon-128a, ACORN, TinyJAMBU, JAMBU) under identical measurement conditions [
10,
22,
23,
24,
25].
Because no directly comparable study targets the combined threat of code-functionality inference via communication eavesdropping together with ROM scraping, the proposed hybrid security framework is evaluated against representative lightweight AEAD baselines widely used on constrained devices (e.g., Ascon-128a, ACORN, TinyJAMBU). The comparison focuses on cycles per byte, end-to-end latency, and energy at 12 V and 9.6 V under identical measurement conditions, while the HSF layer contributes dynamic per-message keying and DFX-based memory relocation orthogonal to the chosen AEAD. Historical PSK-based stream ciphers, such as A5/1, are referenced only as a contrast case to motivate authenticated encryption and to highlight why static-key LFSR designs are unsuitable for this threat model [
26].
The hybrid security framework proposed in this study combines bitstream encryption, dynamic key generation, and memory obfuscation through Dynamic Function eXchange (DFX) to counter the threats described above [
6,
14,
21,
27]. Bitstream encryption, available as an integrated option in Xilinx Vivado, blocks side-channel attacks on the ROM, whereas the ROM of the MicroBlaze soft processor located inside the SoC Block Design module is already obfuscated and is therefore naturally resistant to physical scraping attacks. In the Zynq-7020 implementation of this study, reliance on any fixed ROM location is also avoided: the security-critical code segments and tables that would otherwise reside in ROM are placed into two DFX-relocatable BRAM regions in the PL and are periodically swapped and scrubbed via PCAP, so even their physical locus keeps moving. Against the communication-eavesdropping threat, the study proposes deriving dynamic per-message keys via an Ascon-XOF KDF from an encrypted pool and a pre-shared root secret (a PSK), combined with the device ID and an optional timestamp/nonce. Although this method is formally similar to schemes that rely on pre-shared keys, such as A5/1 (which, despite all reported vulnerabilities, is still in use in GSM networks), it provides greater security [
26].
An A5/1-like structure (stream cipher) was not selected here because, due to its known cryptanalytic weaknesses, 64-bit key length, and non-obfuscated structure, the single root key can be obtained directly from RAM with greater likelihood [
3,
26]. In contrast, the proposed method increases the entropy and removes reliance on a single root key by employing per-message 128-bit AEAD keys derived from (pool + root + device ID + [opt. timestamp/nonce]). In this context, the method introduces an original threat model and a dedicated solution that fill a gap in the literature.
Because the contribution is a hardware–software hybrid that hides communication semantics at run time and relocates both secrets and selected code segments in PL BRAM via DFX, the appropriate baselines are lightweight AEAD ciphers commonly used on embedded targets—not asymmetric key-exchange protocols. Accordingly, Hybrid Security Framework (HSF) is evaluated as a per-message keying wrapper around standard AEADs (e.g., Ascon-128a by default, with ACORN, JAMBU and TinyJAMBU as alternatives) and time/energy are reported on the same setup. Legacy PSK stream ciphers (e.g., A5/1) are cited only to motivate the choice of dynamic, per-message keys—versus single-root-key schemes that are exposed to known attacks. This framing isolates the novelty (dynamic keying + DFX-relocated BRAM for secrets and code) and aligns the evaluation with the stated threat model. A5/1 employs a 64-bit key and generates session sub-keys from a root key by means of a Linear Feedback Shift Register (LFSR) but is regarded as weak by modern standards [
26]. The proposed method uses per-message 128-bit AEAD keys (Ascon family) and relocates BRAM contents via DFX. DFX does not increase cryptographic key entropy; it expands the attacker’s physical search space.
One of the important sub-components in the proposed hybrid method is the use of DFX, which conceals the physical location of the secret pool and selected code pages by continually relocating RAM blocks and addresses. In this process, RAM addresses are reconfigured randomly and meaningless data are inserted into empty areas, making it difficult for an attacker to separate genuine data from decoys. Thus, the proposed method not only increases entropy but also impedes correlating communication patterns with code behaviour, thereby preventing code theft [
8]. In practice, relocating BRAM-resident code pages together with secret material further breaks stable leakage templates and frustrates physical scraping or probing.
In the design of the proposed method, the energy-efficiency advantages of pre-shared keys (PSKs) were taken into account. This consideration is also critical for resource-constrained embedded systems and similar IoT devices. Security protocols such as the asymmetric Diffie–Hellman, whose protection generally relies on key-exchange procedures, are unsuitable for energy-critical embedded systems owing to their high computational cost [
12,
28].
The method offers an alternative solution to the problem defined in the aforementioned threat model by generating dynamic keys without requiring real-time key exchange, while also keeping energy consumption and latency low. A session-specific key is derived as follows: a random element is drawn from the encrypted pool and combined with the root key and the device ID; an Ascon-XOF KDF then derives the per-message AEAD key from these inputs.
Threat model. An adversary capable of probing off-chip buses (DDR/QSPI/AXI), attempting physical readout of on-chip arrays, and passively observing I/O to infer program behaviour is assumed. In response, only encrypted secret-pools transit AXI and BRAM instances holding secrets are periodically destroyed and recreated via DFX to desynchronise physical locality from logical content. Relocating BRAM-resident code pages together with the secret pool further desynchronises physical locality from logical content and frustrates scraping/probing templates.
The proposed method was intentionally made complex so that, in addition to being low-cost in terms of energy and time, it would resist the hardware attacks described in the threat model. Consequently, its complexity was increased with the aim of contributing to hardware obfuscation in that context [
4,
6,
7]. Here, the role of obfuscation is to enhance security by increasing system complexity, thereby enriching Kerckhoffs’s principle—that security should rely on the secrecy of the key rather than the secrecy of the algorithm—in a modern context. Responding to the unique security needs faced by FPGA-based systems, this innovative method supplements traditional approaches with implementation-level obfuscation. It also takes into account Shannon’s [
29] dictum that any system can be broken given sufficient time and significantly hinders an attacker’s efforts by making reverse engineering of the obfuscated structure more difficult. For this reason, the method additionally aims to extend the attacker’s expenditure of time and resources. Within this context, DFX provides protection against statistical analysis by continually changing RAM addresses and inserting meaningless data [
8]. Therefore, although lightweight, this purpose-built hybrid structure—evaluated from multiple perspectives and without compromising security—provides sufficient complexity to prevent even indirect inference of code behaviour, because even if a key is broken only a single message can be captured.
An optional provisioning phase can pre-establish a root secret and parameters under bitstream encryption, enabling both endpoints to deterministically derive the same encrypted pool from a TRNG-seeded seed/salt via Ascon-XOF—without transferring static keys [
30,
31,
32]. A timestamp-trial variant may be used, but this option is orthogonal to the evaluated threat model and is deferred to future work. The provisioning option preserves low energy and latency by avoiding online key exchange on constrained devices. Importantly, DFX enlarges the physical search space against scraping/probing, whereas the cryptographic key entropy remains 128 bits.
Although this issue lies outside the threat model, the study also proposes an optional mechanism. This mechanism aims to provide an infrastructure for future work that tackles weaknesses arising from the “key-embedding” approach and from other encryption gaps noted in the literature. Accordingly, the provisioning phase is discussed but kept outside the formal scope of the solution proposed for the main threat model. Nevertheless, the study flags this separate issue for future research by offering a preliminary solution path. For this reason, theoretical or experimental comparison topics have not been addressed for this phase.
As a result, this study fills an important gap in the current literature by presenting an energy-aware and practical solution against code theft in FPGA-based systems [
8,
12,
21]. The main contributions are therefore:
(i) the definition of a specific threat model for extracting code functionality by eavesdropping on communication lines, and the presentation of a hybrid security solution against it;
(ii) the justification, using examples such as A5/1, of the suitability of pre-shared keys for energy efficiency and practical applicability in IoT environments [26];
(iii) the detailed analysis, in the light of Kerckhoffs’s principle, of the contribution of DFX-based obfuscation techniques to real-world security by increasing attack complexity and time even if the algorithm is known [6,8,21]; and
(iv) the demonstration that the additional energy and latency costs incurred by the proposed method, when compared with representative lightweight AEAD baselines on the same hardware, are at an acceptable level for a practical solution [33].
Note. The open-source comparison of the proposed KDF method against known AEAD baselines is publicly available to facilitate reproducibility and future research: https://github.com/tkopter/Proposd_HSF (accessed on 15 September 2025).
3. Materials and Methods
This section describes the methods used to implement a hybrid security framework that couples per-message authenticated encryption with run-time relocation of secret storage and selected code pages in AXI-mapped BRAM via Dynamic Function eXchange (DFX). The description focuses on the encryption algorithm and its hardware design; measurement and benchmarking details are reported elsewhere.
3.1. Hybrid Security Framework (HSF): Design and AEAD Interface
A 256-bit root secret (ROOT), a device identifier (dev_id, 32-bit, big-endian on the wire), and a pool of 128-bit slices (POOL) reside in AXI-mapped BRAM. Pool entries are 128-bit windows over the 256-bit root at non-sequential offsets; the slice mapping is kept in BRAM and, together with selected code pages, is periodically relocated by Dynamic Function eXchange (DFX), so the physical locus of both secrets and code becomes a moving target. Authenticated encryption uses ASCON-128a (v1.2); policy-controlled key derivation is performed via ASCON-XOF128. Nonces and AAD are public but integrity-bound; acceptance is tag-only.
Keying and Domain Separation (KDF Policies). ASCON-XOF128 is modeled as a PRF/RO keyed by the 256-bit ROOT over public context. Two deployment policies are supported; both apply explicit domain separation via a protocol string dom [18,19,20].
Policy A (per-slice key). The key is derived from ROOT, dev_id and POOL[idx] under the protocol string dom. Keys are per-device/per-slice; nonce uniqueness per key remains mandatory. A compromise of a derived key affects packets using the same idx until rekey or slice rotation.
Policy B (per-message key, optional). The per-packet public fields additionally enter the KDF. Keys are per-packet; a compromise of one key does not help on other packets, even with full knowledge of the public context.
In both policies, ROOT never leaves the device, and the protocol string dom provides cross-protocol domain separation. Under Policy B, the per-packet public fields additionally define a per-packet domain [10,18,19]; therefore exposure of one per-packet key does not help derive any other key, even with full knowledge of the public context.
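The two keying policies can be illustrated with a minimal sketch. It is not the authors' implementation: Python's standard library has no ASCON-XOF128, so `hashlib.shake_128` stands in as the XOF, and the protocol string `DOM`, the input order, and the length-prefix framing are all assumptions made for illustration.

```python
import hashlib

DOM = b"HSF-KDF-v1"  # hypothetical protocol string used for domain separation


def _frame(part: bytes) -> bytes:
    """Length-prefix a field so concatenated inputs cannot be confused."""
    return len(part).to_bytes(2, "big") + part


def derive_key(root: bytes, dev_id: int, pool_slice: bytes,
               per_packet: bytes = b"") -> bytes:
    """Derive a 128-bit key; SHAKE128 is a stand-in for ASCON-XOF128.

    per_packet == b""  -> Policy A (per-slice key)
    per_packet != b""  -> Policy B (per-message key)
    """
    if len(root) != 32 or len(pool_slice) != 16:
        raise ValueError("expected a 256-bit root and a 128-bit slice")
    xof = hashlib.shake_128()
    for part in (DOM, root, dev_id.to_bytes(4, "big"), pool_slice, per_packet):
        xof.update(_frame(part))
    return xof.digest(16)  # truncate the XOF output to 128 bits
```

Under Policy A two packets that share a slice share a key, so nonce uniqueness carries the whole burden; passing distinct per-packet bytes (Policy B) yields unrelated keys for each packet.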
Nonce and AAD Construction. The public nonce is 16 bytes and packs dev|epoch|msg as big-endian fields:
$\mathrm{nonce} = \mathrm{dev} \,\|\, \mathrm{epoch} \,\|\, \mathrm{msg}$,
where reuse is prevented by a monotone policy on (epoch, msg) [10,62].
Operational rules are as follows: (i) increment epoch on reboot; (ii) keep msg monotone; (iii) rekey (new ROOT or a new derivative) well before counter wrap. Uniqueness per key is mandatory for AEAD security.
Rationale. The dev field binds the endpoints and enables rejection of packets from the wrong device; epoch prevents counter collisions after reboot; msg enforces monotonicity that provides per-epoch freshness and aids transport synchronisation. These fields are not secret; carrying them in the clear does not introduce a vulnerability, because acceptance depends solely on verification of the authentication tag under the correct key. Associated data (AAD) is integrity-protected by the AEAD tag and uses either a 15-byte layout ver|dev|idx|msg|feat or a 23-byte extension that appends an optional ts field. Here, ver encodes protocol-version compatibility; dev enforces endpoint matching; idx indicates which slice is used; msg supports synchronisation with the transport layer and controls replay; and feat carries application-specific policy bits. When present, the optional ts field enables timestamp validation. The AAD is not encrypted; however, it is cryptographically bound to the AEAD tag, so any bit flip causes verification to fail (INT-CTXT) [9]. When Policy B is selected, the per-packet public fields also enter the KDF for per-packet domain separation; under Policy A they do not.
Slice Pool. The slices in POOL are 128-bit sliding windows over the 256-bit ROOT. Their starting offsets are drawn from a fixed set of permitted offsets, and the selection order is non-monotonic (for example, the fifth key need not use the fifth window), so that an observer cannot map public indices to stable root sub-ranges. The idx field in the AAD is merely a label indicating which window is used; because the window’s contents remain secret, exposing idx does not create a vulnerability. Even if the same idx is reused, AEAD security is preserved by nonce uniqueness; the residual risk is a possible concentration of key leakage on a single idx, which is mitigated by random idx selection and DFX.
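A small sketch may make the sliding-window pool concrete. It assumes byte-aligned offsets (0 to 16 for a 32-byte root) and a shuffled selection order; the source does not state the offset granularity or how the order is chosen, so both are illustrative choices.

```python
import secrets

ROOT_LEN = 32    # 256-bit root
SLICE_LEN = 16   # 128-bit window


def build_pool(root: bytes, offsets: list[int]) -> list[bytes]:
    """Return 128-bit windows of the root at the given byte offsets."""
    if len(root) != ROOT_LEN:
        raise ValueError("root must be 256 bits")
    pool = []
    for off in offsets:
        if not 0 <= off <= ROOT_LEN - SLICE_LEN:
            raise ValueError("window would run past the end of the root")
        pool.append(root[off:off + SLICE_LEN])
    return pool


def shuffled_offsets(count: int) -> list[int]:
    """Pick `count` distinct offsets in a non-monotonic (shuffled) order."""
    candidates = list(range(ROOT_LEN - SLICE_LEN + 1))
    secrets.SystemRandom().shuffle(candidates)
    return candidates[:count]
```

With a shuffled order the public idx label says nothing about which sub-range of the root a slice covers, which is the property the paragraph relies on.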
Initialization. Initialization loads ROOT, dev_id, and POOL into BRAM; resets the message counter; and enables DFX relocation/scrubbing for the BRAM regions that host the secret pool and selected code pages. On reboot, epoch is incremented to prevent nonce reuse. At startup, the DFX controller allocates two BRAM regions for secrets and selected code and enables periodic relocation.
Steady-state. For each message, build the nonce from (dev, epoch, msg), derive the key K according to Policy A or Policy B using ROOT, dev_id and POOL[idx], compute the ciphertext and tag with ASCON-128a over (K, nonce, AAD, M), and transmit the record. The receiver enforces the checks in Section 3.1.1 and accepts only on successful tag verification. Reusing the same idx in consecutive messages does not degrade security as long as nonces are unique; under Policy A a compromised key affects packets with the same idx until rekey, whereas under Policy B compromise is confined to a single packet.
3.1.1. Receiver Side Acceptance Checks
A packet is accepted if all conditions L1–L9 hold; otherwise it is rejected. The items below mirror the concise checklist in Table 2.
- L1. Lengths valid: len(nonce) = 16, len(AAD) ∈ {15, 23}, len(tag) = 16 bytes. (format sanity)
- L2. Index & device consistency: idx is a valid pool index; dev_N = dev_AAD. (single source of truth)
- L3. Message consistency: msg_N = msg_AAD. (single source of truth)
- L4. Version/feature policy: policy bits ver/feat, AAD layout valid. (compatibility gating)
- L5. Anti-replay: msg is fresh under the policy (monotone or sliding window). (replay resistance)
- L6. Optional timestamp policy: if ts is present, enforce monotonicity or a window bound. (audit/window when enabled)
- L7. KDF derive (base): K is derived via ASCON-XOF128 from ROOT, dom, dev_id and POOL[idx].
- L8. KDF (optional separation): append the per-packet public fields if policy requires per-message domain separation.
- L9. AEAD verify: ASCON-128a tag verification under (K, nonce, AAD) succeeds. (INT-CTXT)
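Table 2.
Receiver checks (concise).
| Step | Condition (Reject on Failure) |
|---|---|
| L1 | len(nonce) = 16, len(AAD) ∈ {15, 23}, len(tag) = 16 bytes |
| L2 | idx is a valid pool index; dev_N = dev_AAD |
| L3 | msg_N = msg_AAD |
| L4 | Policy bits: ver/feat and AAD layout valid |
| L5 | Anti-replay: msg fresh (monotone or window) |
| L6 | Optional timestamp: if ts present, enforce monotonicity or window bound |
| L7 | K derived via ASCON-XOF128 from ROOT, dom, dev_id and POOL[idx] |
| L8 | (Optional) Append the per-packet public fields to KDF input for per-message domain separation (Policy B) |
| L9 | ASCON-128a tag verification under (K, nonce, AAD) succeeds |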
See Table 3 for full field semantics. Security note. dev/idx/msg/epoch are public by design; secrecy is provided by the derived key and integrity by the tag [30,31,32]. Leaking idx or dev does not weaken confidentiality because ROOT and POOL[idx] remain unknown, while under Policy B the per-packet public fields ensure per-packet key separation. Under Policy A (per-slice), compromising a derived key impacts packets using the same idx until rekey or slice rotation. Under Policy B (per-message), including the per-packet public fields in the KDF confines a compromise to that packet. In both policies, nonce uniqueness per key remains mandatory.
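The non-cryptographic receiver checks can be sketched as a single pre-filter that runs before key derivation. This is an illustration under stated assumptions: the field widths (4-byte dev and epoch, 8-byte msg, 1-byte ver/idx/feat), the accepted version value 1, and a purely monotone replay policy within one epoch are choices made for this example, and `precheck` and `tags_equal` are hypothetical helper names.

```python
import hmac
import struct


def precheck(nonce: bytes, aad: bytes, tag: bytes,
             last_msg: int, pool_size: int):
    """Return parsed public fields if the format and replay checks pass, else None."""
    # Length sanity: 16-byte nonce, 15- or 23-byte AAD, 16-byte tag.
    if len(nonce) != 16 or len(aad) not in (15, 23) or len(tag) != 16:
        return None
    dev_n, epoch, msg_n = struct.unpack(">IIQ", nonce)
    ver, dev_a, idx, msg_a, feat = struct.unpack(">BIBQB", aad[:15])
    # Index in range and device fields agree between nonce and AAD.
    if idx >= pool_size or dev_n != dev_a:
        return None
    # Message counters agree between nonce and AAD.
    if msg_n != msg_a:
        return None
    # Version gating (assumed: only version 1 is accepted).
    if ver != 1:
        return None
    # Monotone anti-replay: the counter must strictly increase.
    if msg_n <= last_msg:
        return None
    return {"dev": dev_n, "epoch": epoch, "msg": msg_n, "idx": idx, "feat": feat}


def tags_equal(expected: bytes, received: bytes) -> bool:
    """Constant-time tag comparison to avoid a timing side channel."""
    return hmac.compare_digest(expected, received)
```

Only packets that survive this filter proceed to key derivation and AEAD verification, and the replay state should be advanced only after the tag verifies.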
3.1.2. Operational Procedures
Ciphertexts and tags are produced by ASCON-128a. Acceptance is exclusively by tag verification; no plaintext markers are used. The probability of tag forgery is approximately $2^{-128}$ per attempt for the 128-bit tag; to mitigate application-level timing side channels, tag comparison is implemented in constant time [10]. Damage containment follows Policy A/B as defined in the KDF Policies subsection.
Figure 2 summarises the KDF inputs, Nonce/AAD construction, receiver-side checks, and the DFX-assisted BRAM relocation of both secret pools and selected code pages.
DFX does not increase cryptographic key entropy, which remains defined by the 128-bit key length; rather, DFX relocation and BRAM scrubbing enlarge the attacker’s physical search space against scraping/probing attacks and destabilize leakage templates. This effect is orthogonal to the cryptographic keyspace. Pool size and index selection are parameters: a larger pool reduces the probability of accidentally reusing the same slice. Only the idx label and the AAD/nonce headers are visible to an eavesdropper; the idx → slice mapping is hidden within BRAM and is periodically relocated via DFX, so observing idx on the clear channel does not permit inference of the slice contents or of ROOT.
Parameter roles. The tag authenticates the ciphertext and verifies its integrity; the AAD is transmitted in the clear but, being cryptographically bound to the tag, cannot be modified without detection; and the KDF (ASCON-XOF) derives a 128-bit session key from ROOT, dev_id, and the secret POOL[idx]. Thus confidentiality (IND-CPA) and integrity (INT-CTXT) are ensured. For each packet, the session key K is derived via ASCON-XOF128 according to Policy A or Policy B [9].
Encryption. Given plaintext M and header fields (dev, idx, epoch, msg): (i) read ROOT and POOL[idx] from BRAM; (ii) derive K by ASCON-XOF128 as above; (iii) build nonce and AAD; (iv) compute the ciphertext C and tag T with ASCON-128a over (K, nonce, AAD, M); (v) emit a single UART line of the form PKT;ver=1;nonce_hex=…;aad_hex=…;ct_hex=…;tag_hex=…
Decryption. Upon receipt, the receiver: (i) checks lengths and layout of nonce/AAD/tag; (ii) verifies that idx is a valid pool index and that the dev field in AAD equals the one in the nonce; (iii) enforces anti-replay on msg; (iv) derives K with ASCON-XOF128 using ROOT, dev_id and POOL[idx]; (v) accepts only if ASCON-128a verification returns success, then delivers M and updates the replay window. Failed verifications are logged and do not advance the window; only accepted packets advance it.
Field semantics and security rationale. The packet exposes a fixed set of public fields (Nonce, AAD) and hides secret material (key, ROOT, slice). Table 3 summarises each symbol used in Figure 2, its scope/size, provenance, and the exact security/verification role; _N denotes “from Nonce”, and _AAD denotes “from AAD”.
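| Algorithm 1 HSF–ASCON Encrypt/Decrypt (Pseudocode) |
|---|
| function HSF_Encrypt(M, dev, idx, epoch, msg) |
| K ← first 128 bits of ASCON-XOF128(ROOT, dom, dev_id, POOL[idx]) |
| % Optional (Policy B): append the per-packet public fields before truncation for per-message separation |
| nonce ← dev ‖ epoch ‖ msg |
| AAD ← ver ‖ dev ‖ idx ‖ msg ‖ feat [‖ ts] |
| (C, T) ← ASCON-128a.Enc(K, nonce, AAD, M) |
| return PKT;ver=1;nonce_hex=…;aad_hex=…;ct_hex=…;tag_hex=… |
| end function |
| function HSF_Decrypt(PKT) |
| parse PKT; require len(nonce) = 16, len(AAD) ∈ {15, 23}, len(tag) = 16 |
| require dev_N = dev_AAD and msg_N = msg_AAD; anti-replay check on msg |
| K ← first 128 bits of ASCON-XOF128(ROOT, dom, dev_id, POOL[idx]) |
| (M, ok) ← ASCON-128a.Dec(K, nonce, AAD, C, T) |
| return M if ok else Reject |
| end function |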
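The record returned by the encrypt routine is a semicolon-separated line with hex-encoded fields. The following parser is a minimal sketch of how a receiver or log analyser might decode it; the function name `parse_pkt` is hypothetical, and it assumes every record carries exactly the keys ver, nonce_hex, aad_hex, ct_hex and tag_hex.

```python
def parse_pkt(line: str) -> dict:
    """Decode a 'PKT;ver=..;nonce_hex=..;aad_hex=..;ct_hex=..;tag_hex=..' record."""
    parts = line.strip().split(";")
    if parts[0] != "PKT":
        raise ValueError("not a PKT record")
    fields = {}
    for item in parts[1:]:
        if not item:          # tolerate a trailing ';'
            continue
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"malformed field: {item!r}")
        fields[key] = value
    record = {"ver": int(fields["ver"])}
    for name in ("nonce", "aad", "ct", "tag"):
        # bytes.fromhex raises ValueError on invalid hex, which rejects the line
        record[name] = bytes.fromhex(fields[name + "_hex"])
    return record
```

A missing key raises `KeyError` and bad hex raises `ValueError`, so a malformed line is rejected before any length check or cryptographic step.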
Optional provisioning. An optional provisioning phase may establish a seed/salt under bitstream protection so that both endpoints deterministically derive the same encrypted pool without transferring static keys on the wire. When provisioning messages are sent, ASCON-128a is used for confidentiality and integrity; details of the mechanism are outside the evaluated threat model and are summarised here to support reproducibility [
30,
31,
32].
3.2. Experimental Setup
All measurements were performed on hardware; the primary platform was a Xilinx Zynq-7020 (Snickerdoodle Black) configured via Vivado/Vitis 2020.2. The evaluated ciphers comprised ASCON-128a and the lightweight AEAD set (ACORN-128, TinyJAMBU-128, JAMBU–PRESENT-128) [
10,
22,
23,
24,
25]; the Hybrid Security Framework (HSF) was exercised as a per-message keying wrapper around ASCON AEAD, while ACORN-128, TinyJAMBU-128, and JAMBU–PRESENT-128 were evaluated in their reference configurations. Secrets and selected code pages resided in two AXI-mapped BRAM regions inside partially reconfigurable partitions; Dynamic Function eXchange (PCAP) periodically swapped and scrubbed these regions during tests [
63,
64]. Execution time was measured on the Zynq Processing System using the ARM Global Timer and PMU cycle counters [
65]. Each packet emitted a single UART line, stored as a log, containing three groups: [time] us_total|us_kdf|us_aead|us_glue; [pmu] cyc_total|cyc_kdf|cyc_aead; and the PKT;ver=1;… record.
Here, us_glue covers non-cryptographic overhead (packet framing/parsing, UART I/O, buffer copies) outside KDF/AEAD; PMU counters (cyc_∗) correspond accordingly. Instantaneous current on the input rail was measured through a precision shunt and a Digilent Analog Discovery 3 in oscilloscope mode; energy per packet was obtained by integrating the instantaneous power, $E = \int V(t)\,I(t)\,dt$, over the software window aligned to the AEAD call. Where available, readings were cross-checked with a PicoTest M3511A 6½-digit multimeter by measuring the current on the main power line under steady-state load to validate the average current; per-packet energy was derived from the oscilloscope integration.
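The per-packet energy integration can be sketched numerically with the trapezoidal rule over sampled voltage and current. This is a generic illustration, not the authors' measurement script; the sample arrays and the function name `energy_joules` are hypothetical.

```python
def energy_joules(t: list[float], v: list[float], i: list[float]) -> float:
    """Integrate instantaneous power v*i over time with the trapezoidal rule.

    t is in seconds, v in volts, i in amperes; the result is in joules.
    """
    if not (len(t) == len(v) == len(i)) or len(t) < 2:
        raise ValueError("need at least two aligned samples")
    energy = 0.0
    for k in range(1, len(t)):
        p_prev = v[k - 1] * i[k - 1]
        p_curr = v[k] * i[k]
        energy += 0.5 * (p_prev + p_curr) * (t[k] - t[k - 1])
    return energy


# Example: a constant 12 V rail drawing 100 mA for 1 ms.
samples_t = [0.0, 0.0005, 0.001]
print(energy_joules(samples_t, [12.0] * 3, [0.1] * 3))
```

Restricting the sample range to the window aligned with the AEAD call gives the per-packet figure; a steady-state multimeter reading can then serve as a sanity check on the average current.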
An Arduino Due acted as a receiver/attack harness: it parsed Nonce and AAD fields (dev, idx, epoch, msg), derived the key K, invoked ASCON-128a decryption, and compared tags in constant time. Negative tests flipped tag bits and replayed stale tuples to confirm tag rejection and anti-replay.
Correctness. Implementations were cross-checked against official test vectors for ASCON, ACORN, TinyJAMBU, and JAMBU; encryption and verification outputs matched bit-for-bit.
Two BRAM regions were used in the reported setup; using four or eight regions is straightforward and further enlarges the physical search space for scraping/probing, adding placement randomness against possible attacks without changing cryptographic entropy.
Additionally, the Arduino Due was used for the analysis of decryption and attacks. Energy consumption analyses were conducted in the hardware environment; measurements on the FPGA and the Arduino Due evaluated the algorithm’s energy efficiency in real-world scenarios. All components were configured to assess the algorithm’s effectiveness and to measure its performance differences relative to traditional methods. Each tool focused on a specific performance metric or attack type, enabling a comprehensive analysis. Thus, the algorithm’s performance and security were examined in detail in both simulation and hardware environments. The experimental setup also considers the practical implementation challenges in commercial embedded systems, addressing the academic–commercial misalignment highlighted in prior studies [
8,
21].
3.3. Hardware and Software Design
3.3.1. Platform & Top-Level Design
The security objective is two-fold: (i) to keep code and long-lived secrets out of easily scrapable, externally observable memories; and (ii) to make the physical locus of sensitive data a moving target through Dynamic Function eXchange (DFX). The prototype runs on a Xilinx Zynq-7020, implemented with Vivado/Vitis 2020.2, and performs reconfiguration via PCAP. The current design does not use MicroBlaze; all software executes on the Zynq Processing System (dual-core Cortex-A9). The method also ports to MicroBlaze, because the soft processor’s instruction memory is kept in PL BRAMs that can likewise be moved via DFX. Using the hard-core ARM keeps control trusted while two reconfigurable BRAMs in the PL store encrypted state, and selected BRAM-resident code pages are likewise relocated via DFX (see Figure 3). The static design comprises the Zynq PS and an AXI SmartConnect; two partially reconfigurable regions hosting the reconfigurable modules (RMs) are defined in the programmable logic (PL) by floorplanning with Pblock constraints. Each RM contains an AXI BRAM Controller and a Block Memory Generator instance.
3.3.2. Memory & Configuration Architecture
Each RM can be independently reconfigured at runtime using Dynamic Function eXchange (DFX). During normal operation, one BRAM holds the encrypted secret POOL (slices) and selected code pages, while the second BRAM is prepared off-line. PCAP then loads a partial bitstream and swaps the two BRAM roles. Constant BRAM relocation hides the key location and hinders side-channel/probing attacks.
Figure 4 depicts the internal structure of each RM that contains an AXI Block RAM (BRAM) Controller connected to a
Block Memory Generator instance. The first BRAM occupies addresses
0x4000_0000 – 0x4000_7FFF and the second resides at
0x4200_0000 – 0x4200_7FFF, providing two 32 KiB on-chip memories that is alternately activated, destroyed and recreated via DFX for secure key and word-list storage. The SmartConnect interconnect and addressing are conventional AXI as the novelty lies in continuously relocating the
physical storage cells rather than only updating contents (
Figure 3 and
Figure 4). On Zynq-7000 devices the PS programs the PL through
PCAP, exposed to software by the
DEVCFG (device configuration) peripheral; PCAP handles both full and partial bitstreams, including encrypted ones [
63,
66]. An
ICAP primitive also exists in PL for high–bandwidth in-fabric reconfiguration; using it from Zynq requires handing ownership from PCAP to ICAP and driving ICAP via custom logic or a DFX controller [
66]. PCAP is therefore used for all DFX actions to keep the trusted control plane within the ARM PS.
Table 4 summarises the main differences among these interfaces. In this section, the available memory types on a Zynq-7020 and their suitability for cryptographic key storage are compared. The design deliberately uses BRAM as it is on-chip, AXI-visible through both PS/PL, and DFX-relocatable. Non-volatile eFUSE or battery-backed BBRAM are used by Xilinx tools to store the 256-bit AES key for bitstream encryption, but their one-time-programmable nature and limited capacity make them unsuitable for dynamic key lists. Off-chip DDR and QSPI memories offer high capacity but can be probed through bus analysis and therefore are reserved for non-secret code storage. On-chip OCM (on-chip memory) is accessible only to the PS and cannot be relocated via DFX. Using four or eight reconfigurable BRAM regions is straightforward; increasing the number of regions enlarges the physical search space for scraping/probing in order to add more randomness without changing cryptographic entropy. For
M BRAMs with
K addresses each, single-cell location uncertainty is $\log_2(M \cdot K)$ bits
[
63,
64].
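The location-uncertainty argument can be made concrete with a short calculation. The sketch below assumes a single secret cell placed uniformly at random among all cells, and it assumes that each 32 KiB BRAM window is addressed as 32-bit words; the word width and the function name are illustrative assumptions, not details taken from the design files.

```python
import math

def location_uncertainty_bits(num_brams: int, addresses_per_bram: int) -> float:
    """Bits of uncertainty for one secret cell placed uniformly among all cells."""
    return math.log2(num_brams * addresses_per_bram)

# Address window of the first BRAM as stated in the text (32 KiB).
WINDOW_BYTES = 0x4000_7FFF - 0x4000_0000 + 1

# Assumption: the window is accessed as 32-bit (4-byte) words.
WORDS_PER_BRAM = WINDOW_BYTES // 4

if __name__ == "__main__":
    for regions in (2, 4, 8):
        bits = location_uncertainty_bits(regions, WORDS_PER_BRAM)
        print(f"{regions} BRAM regions -> {bits:.1f} bits of location uncertainty")
```

Under these assumptions, each doubling of the number of regions adds exactly one bit of physical location uncertainty, which is independent of the 128-bit key entropy.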
In Xilinx devices three configuration paths exist. PCAP links the PS to the configuration engine and is the sole full-config interface on Zynq-7000. PCAP accepts encrypted and authenticated bitstreams, supports both full and partial reconfiguration and is controlled by the
devcfg peripheral; however, its throughput is limited by the AXI slave port and it ties up the PS during transfers. The
Internal Configuration Access Port (ICAP) resides in the programmable logic and can be driven by custom hardware for high-bandwidth partial reconfiguration. ICAP cannot decrypt bitstreams—encrypted frames must be passed through PCAP first—and switching from PCAP to ICAP requires quiescing the configuration engine and clearing control bits in the
DEVCFG register, as described in
UG909 Vivado Partial Reconfiguration User Guide [
66]. Finally, UltraScale and Versal devices introduce a
Management Configuration Access Port (MCAP) that exposes the configuration bus through PCIe; MCAP is absent in Zynq-7000 and therefore irrelevant to the current implementation. PCAP is retained in this design because it integrates naturally with the ARM processing system and Vivado’s Dynamic Function eXchange flow [
63,
64].
For completeness,
Table 5 extends the earlier memory comparison by enumerating all principal RAM and ROM resources in the Zynq-7020. The processing system contains separate 32 KiB instruction and data caches per Cortex-A9 core and a shared 512 KiB L2 cache; these caches are transparent to software and unsuitable for storing cryptographic keys. A 256 KiB on-chip memory (OCM) serves as tightly coupled RAM for fast code and data. Dual-channel DDR3 memories (PS DDR) provide several megabytes of volatile storage, whereas off-chip Quad SPI (QSPI) flash holds the non-volatile primary bitstream and user data. Within the programmable logic, distributed RAM exploits LUTs for small FIFOs and registers, BRAM offers 18 Kb/36 Kb blocks, and UltraRAM would provide larger 288 Kb blocks on devices that support it (not available on XC7Z020). The device also includes a 256-bit eFUSE array and a battery-backed RAM (BBRAM) for storing the AES decryption key used by the built-in bitstream decryption engine. Each of these memories has distinct access rights and volatility characteristics; only BRAM satisfies the combined requirements of on-chip storage, relocatability via DFX and sufficient capacity for the secret pool.
On-chip BRAM is relocatable for security/performance, and PCAP lets the PS drive encrypted DFX on Zynq-7000 [
63,
66]. MCAP does not exist on this device class; ICAP would add PL logic and control complexity without clear security benefit in the present threat model. Hence, using PCAP instead of ICAP simplifies the software by allowing the PS to control reconfiguration directly; the
devcfg driver abstracts the details of the configuration bitstream and ensures synchronisation with the running system. Finally, eFUSE/BBRAM are reserved and used for the device-unique AES key that protects the PL bitstream itself and are not appropriate stores for frequently changing secret-pools [
67]. Per-rotation re-keying gives each relocated list fresh ciphertext, removing the repeated patterns that side-channel and scraping attacks could otherwise exploit. Thus, the chosen configuration maximises security without introducing external interfaces that could be probed (see
Table 5 and
Table 6).
3.3.3. Security, Run-Time & Implementation Details
The attacker model includes (i) delayering/probing of the package or board (bus snooping on DDR, QSPI, AXI), and (ii) invasive readout of on-chip arrays (
Table 6). The design rules are therefore to (a) keep long-lived keys out of off-chip media; (b) place
only encrypted secret-pools and auxiliary state in PL BRAM; (c) ensure that any cleartext material lives only in PS core registers for a few cycles; and (d) continuously
destroy and recreate the physical BRAM instance via DFX (
Figure 4). This moving-target, encrypted-transport strategy raises the bar for both physical scraping and correlation/power analysis.
Reproducibility/Future work. To harden deployments against bus probing and power/EM analysis, an optional per-relocation re-encryption layer
can be added. Before writing to the freshly loaded RM, a device-bound transport key is derived via ASCON-XOF128 from a long-term secret and a monotone rotation counter; POOL entries are then sealed under ASCON-128a (AEAD). Thus, even if two relocations carry the same logical contents, their ciphertexts on the AXI path and in BRAM become statistically independent across rotations. This mechanism was
not enabled in the evaluated build; it is included here to facilitate reproduction and future comparative studies.
Rationale. Binding to the device identity ties the transport key to a single device, while the monotone rotation counter ensures uniqueness across relocations; the domain-separation tag "TR" prevents cross-use with other XOF invocations. Adding extra fields (e.g., epoch, RM identifiers, content digests) increases input size without material security gain under a correct policy and nonce-unique AEAD use, hence the compact form is preferred for clarity and portability.
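The structure of this derivation can be sketched in a few lines. Python's standard library has no ASCON-XOF128, so the sketch below swaps in SHAKE128 purely as a stand-in XOF; the field order, the length prefixes, the 8-byte counter encoding, and the function name are all assumptions made for illustration and are not the paper's exact input layout.

```python
import hashlib

def derive_transport_key(long_term_secret: bytes, device_id: bytes,
                         rotation_counter: int) -> bytes:
    """Derive a 128-bit transport key bound to one device and one rotation.

    Stand-in XOF: SHAKE128 is used here only because it ships with Python.
    It is NOT ASCON-XOF128 and will not produce interoperable keys.
    """
    xof = hashlib.shake_128()
    xof.update(b"TR")  # domain-separation tag named in the text
    for field in (long_term_secret, device_id):
        # Hypothetical length prefix so that field boundaries are unambiguous.
        xof.update(len(field).to_bytes(2, "big"))
        xof.update(field)
    # Monotone rotation counter, encoded as a fixed-width big-endian integer.
    xof.update(rotation_counter.to_bytes(8, "big"))
    return xof.digest(16)  # 16 bytes = 128-bit key

if __name__ == "__main__":
    secret = bytes(16)  # placeholder secret for demonstration only
    for counter in (1, 2):
        print(counter, derive_transport_key(secret, b"dev-A", counter).hex())
```

Because the counter never repeats, two relocations of identical logical contents are sealed under different keys, which is the property the paragraph above relies on.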
On Zynq-7000, partial bitstreams are streamed through PCAP by the
XDevCfg driver (header
xdevcfg.h), whose typical call path includes
XDcfg_CfgInitialize (binds the
DEVCFG instance),
XDcfg_Transfer (DMA push to PCAP), and status/ISR helpers (e.g.,
XDcfg_IsDmaBusy,
XDcfg_IntrGetStatus) to synchronise reconfiguration. In contrast, the
XilFPGA Vitis service (
XFpga_* API, e.g.,
XFpga_PL_BitStream_Load) targets UltraScale(+)/ZynqMP/Versal platforms and is not the canonical path on Zynq-7000; the BSP uses
XDevCfg with PCAP as recommended for 7-series based SoCs [
63,
66]. At boot, a secure configuration sequence loads the static PL through PCAP; decryption of the PL bitstream uses the AES key stored in eFUSE or BBRAM (device-bound) [
67].
After the AXI interconnect is initialised, the per-message keys are derived via ASCON-XOF128 from the context tuple (idx, epoch, msg, dev), and ASCON-128a is used for authenticated encryption. Only ciphertext and tags, together with the public Nonce/AAD, transit AXI into the active BRAM; on each rotation this covers the encrypted payload and the freshly re-encrypted secret-pool entries. Following each message, the PS triggers a DFX event: a partial bitstream loads the alternate RM over PCAP (via XDcfg_Transfer), the newly loaded BRAM becomes active, and the old region is scrubbed. On the receiver (e.g., Arduino Due), the same (idx, epoch, msg, dev) regenerates the per-message key via ASCON-XOF128, and ASCON-128a verification/decryption completes the receive path, demonstrating cross-platform feasibility.
All cryptographic and key-management tasks run in the PS as Vitis C code. Upon system start, a secure boot sequence configures the static PL design via PCAP. The AES decryption key for the bitstream resides in eFUSE or BBRAM, but this key is unrelated to the dynamic keys used by the proposed method. The processor initialises the SmartConnect and AXI BRAM Controllers and loads an initial partial bitstream for RM1 or RM2 through PCAP. Dynamic keys are generated by concatenating S_ROOT, dev_id, and POOL[idx] and expanding the result with ASCON-XOF128 to produce a 128-bit key. Key material is stored in the active BRAM region only as encrypted POOL entries and auxiliary state. A message is encrypted using ASCON-128a (AEAD) and transmitted. After each encryption, the PS triggers a DFX event via XDevCfg. Before the swap, it derives a fresh transport key with ASCON-XOF128, re-encrypts the active secrets in PS registers, and writes the new ciphertext set to the target RM's BRAM. Only then does it issue XDcfg_Transfer() to load the partial bitstream and flip roles; the previously active region is scrubbed and the registers are zeroized afterwards. This constant relocation of BRAM hinders physical probing and correlation attacks. Decryption on the receiving side (e.g., Arduino Due) reverses the process by regenerating the same per-message key and completing verification/decryption with ASCON-128a.
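The ordering of the rotation steps described above (re-encrypt, write to the target region, swap roles, scrub) can be modelled in software. The following is a behavioural sketch only: the class names are hypothetical, a Python dictionary stands in for each BRAM, and the role swap stands in for the partial-bitstream load that the real system performs through PCAP.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class ReconfigurableModule:
    """Toy model of one RM: a name, an AXI base address, and BRAM contents."""
    name: str
    base_addr: int
    bram: Dict[int, bytes] = field(default_factory=dict)

    def scrub(self) -> None:
        # Models destroying the old BRAM instance after the swap.
        self.bram.clear()

class PingPongStore:
    """Two RMs that alternate between the active and standby roles."""

    def __init__(self) -> None:
        self.active = ReconfigurableModule("RM1", 0x4000_0000)
        self.standby = ReconfigurableModule("RM2", 0x4200_0000)

    def rotate(self, reencrypt: Callable[[bytes], bytes]) -> None:
        # 1) Re-encrypt every entry and write it to the target region.
        self.standby.bram = {off: reencrypt(ct)
                             for off, ct in self.active.bram.items()}
        # 2) Flip roles (placeholder for the PCAP partial-bitstream load).
        self.active, self.standby = self.standby, self.active
        # 3) Scrub the region that was active before the swap.
        self.standby.scrub()

if __name__ == "__main__":
    store = PingPongStore()
    store.active.bram[0] = b"\x01\x02"
    # Byte-wise XOR is a placeholder for AEAD re-encryption, not a cipher.
    store.rotate(lambda ct: bytes(b ^ 0xFF for b in ct))
    print(store.active.name, store.active.bram, store.standby.bram)
```

The model makes one design point explicit: at no step do both regions hold the same ciphertext, so an observer of either BRAM sees contents that change with every rotation.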
MicroBlaze was mentioned earlier as a PL-resident option whose naturally obfuscated instruction memory can be relocated via DFX using ICAP. In the present design, trusted control and DFX orchestration are kept in the Zynq PS to (i) leverage PCAP and its driver stack, (ii) minimize PL resource overhead, and (iii) isolate long-lived secrets from off-chip memories. A MicroBlaze variant remains straightforward: the soft core would issue reconfiguration commands through an ICAP controller in PL, keys would reside in BRAM or distributed RAM, and its instruction/data BRAMs could be made reconfigurable and shuffled with the same DFX flow. The Zynq-based build is therefore a didactic vehicle; the security argument (moving-target BRAM + encrypted transport + register-only cleartext lifetime) and the dynamic key-generation protocol carry over unchanged. MicroBlaze benefits from an obfuscated configuration memory that complicates ROM scraping, but it incurs significant resource overhead and restricts clock frequency. The Zynq-7020's hard ARM Cortex-A9 processing system provides higher performance, integrated peripherals, and a simpler software stack for dynamic partial reconfiguration via PCAP, while resistance to side-channel and scraping attacks is retained through the obfuscated structure of the proposed algorithm. The method is platform-agnostic, and the choice of Zynq here merely illustrates the concept on an energy-aware SoC. Presenting the design on Zynq shows that DFX-driven memory relocation and dynamic key generation can protect the communication and data an attacker would need to infer the algorithm, even when the processor core and its caches are fully transparent.
MicroBlaze is suggested for extra resistance to ROM scraping, but the obfuscated structure of the proposed HSF encryption algorithm already resists scraping and side-channel attacks; this answers the potential concern about why a soft core was not employed, since the Zynq demonstration establishes the method's applicability.
On Zynq-7000/Vitis 2020.2 the BSP exposes
XDevCfg: initialize with
XDcfg_CfgInitialize(), set up the PCAP for DMA, push the partial bitstream buffer with
XDcfg_Transfer(), and poll/ISR-clear with
XDcfg_IsDmaBusy() and
XDcfg_IntrGetStatus(). On UltraScale(+)/ZynqMP platforms the analogous operation would use
XilFPGA (
XFpga_PL_BitStream_Load), but this service is not the canonical path on Zynq-7000 [
66].
Software handles key generation and ASCON-128a authenticated encryption/decryption. Example C code in Vitis demonstrates this flow, and the results illustrate dynamic key derivation and message encryption. In the demonstrated Zynq implementation, all software runs on the hard ARM Cortex-A9 processing system (PS) of the Xilinx Zynq-7000 SoC. Instead of relying on a MicroBlaze soft core and obfuscated instruction memory in the PL, the current design derives security from the algorithm's obfuscated structure, keeps long-lived secrets out of off-chip memories, and stores only the encrypted secret pool and ephemeral session material in two AXI-mapped BRAM regions in the programmable logic (PL). After each transaction, Dynamic Function eXchange (DFX) via the Processor Configuration Access Port (PCAP) swaps the active BRAM region with an alternate one and scrubs the old instance, turning the physical locus of sensitive data into a moving target. Cleartext keys exist only transiently in PS core registers, which are practically unreachable by non-destructive probing. This DFX-driven moving-target memory and encrypted transport provide resistance against ROM scraping, bus probing, and reverse engineering, meeting the threat model without the resource overhead of a soft processor (PCAP transfers are issued via the XDevCfg driver on Zynq-7000). A MicroBlaze-based realization is a viable alternative when energy and latency constraints are relaxed, leveraging ICAP-driven DFX and PL-resident instruction BRAMs to add further obfuscation if desired. Otherwise, the Zynq PS solution remains robust against the mentioned attacks owing to the obfuscated algorithm structure of the proposed HSF.
This example highlights per-message keying, which limits the impact of any compromised key to a single message. An Arduino Due acted as a cross-platform receiver: it parses the received fields, regenerates the per-message key via ASCON-XOF128 from the same (dev, idx, epoch, msg), obtains the Nonce and AAD, and verifies/decrypts with ASCON-128a. The code examples confirm that the algorithm operates in both FPGA and microcontroller environments.
3.3.4. Manuscript Preparation and AI Tool Usage
During the preparation of this manuscript, the authors utilized generative AI tools (OpenAI's ChatGPT and Google's Gemini) for assistance with secretarial tasks. Their use was limited to improving readability, correcting grammar and syntax, and formatting references. No AI tool was used as a material or a method of the study. The authors assume full responsibility for all content, including the final verification of any AI-assisted outputs.
4. Results
This section reports hardware measurements on a Xilinx Zynq-7020 for encryption (producer side) and an Arduino Due for decryption/attack harness (solver side). The evaluated set comprises ASCON-128a, ACORN-128, TinyJAMBU-128, and JAMBU–PRESENT-128 in their reference forms, and the Hybrid Security Framework (HSF) applied as a per-message keying wrapper around ASCON-128a. Latency and cycle counts were captured on Zynq via the ARM Global Timer and PMU; current/voltage were measured on the supply rail as detailed below. The sub-stage timings and cycle breakdown that appear in
Table 7 and
Table 8 are taken directly from the PMU logs and consolidated tables
(Zynq sub-stage cycles and μs: HSF–ASCON total with KDF and AEAD; ASCON-128a total with AEAD; ACORN-128 ; TinyJAMBU-128 ; JAMBU–PRESENT-128 ).
4.1. Measurement Setup and Discipline
A precision series shunt of
was inserted on the
input to the Zynq board. The Digilent Analog Discovery 3 sampled the shunt drop
. Instantaneous current and load voltage follow
Crypto windows operate in an observed span; window energies therefore use per-window , not a fixed value. A PicoTest M3511A Multimeter provided an independent average-current cross-check () used only for traceability in tables; per-window energies are reported from the shunt method. On the Arduino platform, power was treated as constant at , so energy scales linearly with duration using
A representative shunt capture illustrates three plateaus: (i) a low-current UART transmit window, (ii) an idle+crypto window enclosing KDF+AEAD execution, and (iii) a short over-current DFX window due to partial reconfiguration. “Delta” rows in the window table (
Table 9) report signed differences relative to the stated baseline (
) (
Figure 5).
The “AEAD delta (vs UART)” quantity in the subsequent window table isolates the incremental energy of a short encryption over the lower-current UART plateau, separating cryptographic work from logging overhead. The DFX window corresponds to the relocation step; its 4-RM single-shot duration/energy are measured directly, and the 2-RM single-shot values are obtained by linear scaling.
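The shunt-based energy accounting can be illustrated with a minimal sketch. It assumes the usual series-shunt relations (current equals shunt drop divided by shunt resistance, load voltage equals input voltage minus shunt drop) and rectangle-rule integration over uniformly spaced samples; the resistance, input voltage, and sample values in the usage example are hypothetical and are not the measured figures from the tables.

```python
from typing import Iterable

def window_energy_joules(v_in: float, r_shunt: float,
                         v_shunt_samples: Iterable[float], dt: float) -> float:
    """Integrate load energy over one window from sampled shunt drops.

    v_in    : supply voltage before the shunt, in volts
    r_shunt : shunt resistance, in ohms
    dt      : sampling interval, in seconds
    """
    energy = 0.0
    for v_sh in v_shunt_samples:
        current = v_sh / r_shunt   # instantaneous current through the shunt
        v_load = v_in - v_sh       # voltage actually seen by the load
        energy += v_load * current * dt
    return energy

def joules_to_microwatt_hours(joules: float) -> float:
    """Convert joules to microwatt-hours (1 Wh = 3600 J)."""
    return joules / 3600.0 * 1e6

if __name__ == "__main__":
    # Hypothetical capture: 1000 samples at 1 us, 50 mV drop on a 0.1 ohm shunt.
    samples = [0.05] * 1000
    e = window_energy_joules(v_in=12.0, r_shunt=0.1, v_shunt_samples=samples, dt=1e-6)
    print(f"{e:.6f} J = {joules_to_microwatt_hours(e):.4f} uWh")
```

A delta row then follows by subtracting the baseline window's energy, scaled to the same duration, from the window of interest.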
4.2. Producer-Side Encryption on Zynq-7020
Table 7 consolidates producer-side encryption results on the Zynq-7020, reporting PMU/GT–derived cycle counts and durations for the HSF–ASCON wrapper and the reference AEADs together with their sub-stages (KDF, init, aad, msg, fin). Power and energy are obtained at the crypto plateau using the shunt-derived load operating point and, for traceability, complementary 12 V PicoTest measurement aggregates; energies follow
. The table thus quantifies the per-message computation cost of the producer under a uniform measurement discipline, with host-side glue excluded.
The producer table reports the cost of one protected message under HSF (KDF+AEAD) at the crypto operating point; the 12 V column shows PicoTest measurement aggregates for traceability. The stage and sub-stage durations/cycles originate from the PMU logs and consolidated tables.
4.3. Solver-Side Decryption on Arduino Due
Table 8 isolates solver-side decryption on the Arduino Due under the same per-message framing. Operation at a constant
and
(21.6 mW) yields energies that scale linearly with duration; the HSF–ASCON total and its parse/KDF/AEAD breakdown are listed explicitly and are consistent with the consolidated figures. These values capture the microcontroller cost to parse inputs, derive per-message keys, and verify/decrypt.
The HSF–ASCON total, and its parse/KDF/AEAD breakdown, are computed at 21.6 mW and match the consolidated energy table (e.g., 0.00577 µWh/op for HSF–ASCON on Due).
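Because the Due is modelled at a constant 21.6 mW, energy and duration are interchangeable through a single multiplication. The sketch below shows the conversion in both directions; the function names are illustrative, and the duration it derives from the quoted 0.00577 µWh/op figure is a back-calculation under the constant-power assumption, not a value read from the tables.

```python
POWER_W = 21.6e-3  # constant operating point quoted in the text (21.6 mW)

def energy_uwh(duration_s: float, power_w: float = POWER_W) -> float:
    """Energy in microwatt-hours for an operation of the given duration."""
    return power_w * duration_s / 3600.0 * 1e6

def duration_s_from_uwh(energy: float, power_w: float = POWER_W) -> float:
    """Duration in seconds implied by an energy in microwatt-hours."""
    return energy * 1e-6 * 3600.0 / power_w

if __name__ == "__main__":
    implied = duration_s_from_uwh(0.00577)
    print(f"0.00577 uWh at 21.6 mW implies about {implied * 1e6:.0f} us per operation")
```

Taken at face value, the quoted energy corresponds to a total of a little under one millisecond per HSF–ASCON operation on the Due.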
Security note. ASCON/ACORN/TinyJAMBU use 128-bit tags (AEAD); JAMBU–PRESENT uses a 64-bit tag; PRESENT-ECB is a non-AEAD baseline.
4.4. Energy Attribution and DFX Scaling
This subsection attributes energy to transport, cryptography, and relocation by combining the plateau segmentation in
Table 9 with the per-algorithm summary in
Table 10. The “AEAD delta (vs. UART)” row isolates the incremental energy of a short encryption within the idle+crypto window over the lower-current UART plateau, separating cryptographic work from logging overhead, while the DFX rows correspond to single-shot partial reconfiguration. The measured 4-RM window provides the per-swap reference; a 2-RM single-shot scales approximately by one half in both duration and energy under identical operating conditions, and the summary table reports both variants explicitly. The DFX window exhibits a short over-current associated with partial reconfiguration; the table lists both total and delta-versus-idle contributions.
Per-algorithm summary (crypto-only vs. relocation). Cipher-to-cipher comparisons adopt the idle+crypto operating point for the producer: from the crypto plateau; per-cipher energy equals using the measured latency. HSF–ASCON is reported twice in the comparison table: (i) crypto-only (KDF+AEAD, no relocation) and (ii) crypto+DFX, where the relocation cost is made explicit.
Relocation policy scaling (4-RM measurement → 2-RM deployment). Stress experiments exercised
four BRAM-backed reconfigurable modules (RMs) to probe relocation, whereas the target deployment uses
two RMs. The DFX cost in
Table 9 represents a
per-swap window (one partial bitstream load). For a policy that performs
swaps per message with
M regions,
Using the measured per-swap values from
Table 9 for the
4-RM bitstream (
; delta energy
; total window energy
), a
2-RM policy with
(single-shot PR of a half-size bitstream) yields
and
(delta), consistent with linear scaling of the measured per-swap window under identical operating conditions.
The results consolidate per-message computation and energy costs across both platforms under a common framing. On the producer side (Zynq-7020), cycle counts and timings come from the ARM PMU/Global Timer, while energies are obtained by plateau-based series-shunt integration over the observed
load range; shunt captures were acquired with a Digilent Analog Discovery 3, and 12 V average-current aggregates measured with a PicoTest M3511A Multimeter are used solely as a cross-check (
Table 7). On the solver side (Arduino Due), operation at constant
makes energy proportional to duration, and the parse/KDF/AEAD breakdown is listed explicitly (
Table 8). Window segmentation separates UART, idle+crypto, and DFX contributions and exposes the “AEAD delta (vs UART)” as the incremental encryption cost (
Table 9). The comparison distinguishes crypto-only HSF–ASCON from HSF–ASCON+DFX and makes relocation overhead explicit; under single-shot partial reconfiguration, DFX scales linearly so a 2-RM policy incurs approximately half the duration and energy of the measured 4-RM window (
Table 10).
DFX frequency policy. Relocation can be decoupled from every-packet operation: a policy of one single-shot 2-RM swap every N messages adds ms and Wh per swap (measured), leaving the crypto-only path unchanged.
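The scaling and scheduling arguments reduce to two small formulas, sketched below. The linear-scaling helper encodes the assumption stated in the text that a 2-RM single-shot costs about half of the measured 4-RM window; the numeric inputs in the usage example are placeholders, not the measured per-swap values.

```python
def scale_swap_cost(measured_cost: float, target_regions: int,
                    measured_regions: int = 4) -> float:
    """Linearly scale a measured per-swap cost to another region count.

    Assumes cost is proportional to partial-bitstream size, which in turn is
    assumed proportional to the number of regions it covers.
    """
    return measured_cost * target_regions / measured_regions

def amortized_cost_per_message(per_swap_cost: float, messages_per_swap: int) -> float:
    """Relocation cost attributed to each message when swapping every N messages."""
    if messages_per_swap < 1:
        raise ValueError("messages_per_swap must be at least 1")
    return per_swap_cost / messages_per_swap

if __name__ == "__main__":
    measured_4rm_ms = 10.0  # hypothetical per-swap duration, not a measured value
    two_rm_ms = scale_swap_cost(measured_4rm_ms, target_regions=2)
    for n in (1, 10, 100):
        print(n, amortized_cost_per_message(two_rm_ms, n))
```

The same two functions apply unchanged to energy, since both duration and energy are treated as scaling linearly per swap.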
4.5. Security Analysis and Attack Experiments
This section evaluates HSF–ASCON against brute-force, forgery/replay, side-channel, timing, hardware-tampering, dynamic key-recovery, and dictionary attacks. Session keys are derived via ASCON-XOF128 from the device-internal , , and ; Policy B optionally appends to achieve per-packet separation. Nonces are deterministic with uniqueness per key; acceptance is tag-only under ASCON-128a (128-bit tag). Under Policy A (per-slice), compromise of a derived key impacts packets sharing until rekey; under Policy B (per-message), compromise is confined to the affected packet. DFX-based relocation increases the physical search space for probing/template attacks but does not alter cryptographic key entropy.
The method directly addresses threats such as ROM scraping and communication interception in embedded deployments. Bitstream encryption and memory obfuscation impede code extraction via physical or indirect means, while per-message keying localizes any exposure to a single ciphertext.
Side-channel considerations are twofold. First, software-based authenticated decryption (ASCON-128a) executes in a compact, fixed-control flow that limits timing variability; constant-frequency operation further reduces timing side channels. Second, dynamic partial reconfiguration (DFX) eliminates stable physical layouts by relocating BRAM-backed modules between messages, thereby degrading power/EM templates. Under a placement model with R relocation slots and M reconfigurable modules, the per-message configuration space is , yielding . For a deployment with slots and modules, bits per single-shot placement; a 4-RM stress configuration provides bits. These bits complement, but do not replace, cryptographic key entropy; independence assumptions should be treated conservatively in security claims.
Correctness Note. DFX does not add to key entropy; it only enlarges the physical search space for scraping/probing. For the single “secret cell” model with M BRAMs and K addresses each, the location uncertainty is $\log_2(M \cdot K)$ bits. If the real data are dispersed across k cells out of $N = M \cdot K$ cells and the remainder are filled with decoys, the combinatorial uncertainty is $\log_2 \binom{N}{k}$ bits. The model that applies to a given deployment should be specified according to its requirements.
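The dispersed-cell model can be evaluated directly with a binomial coefficient. The sketch below assumes that the k real cells are chosen uniformly among all N cells and that decoys are indistinguishable from real data; the cell counts in the usage example are illustrative (two BRAMs of 8192 words each is an assumption about word width).

```python
import math

def dispersed_uncertainty_bits(total_cells: int, secret_cells: int) -> float:
    """log2 of the number of ways to choose which cells hold real data."""
    return math.log2(math.comb(total_cells, secret_cells))

if __name__ == "__main__":
    total = 2 * 8192  # assumed: two BRAMs, 8192 addressable words each
    for k in (1, 4, 16):
        print(k, round(dispersed_uncertainty_bits(total, k), 1))
```

With k = 1 the result collapses to the single-cell case, so the two models in the note are consistent with each other.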
Hardware manipulation attacks on FPGA platforms (e.g., clock tampering or hardware Trojans) are countered using device-locked bitstreams and silicon identity. Vivado’s Device DNA and bitstream encryption confine bitstreams to the intended device, impeding unauthorized modification and reuse [
8,
37]. Timing attacks are further mitigated by a fixed clock source and constant-time AEAD verification paths [
8,
37]. Dynamic key-resolution attacks are addressed by per-message key derivation, which prevents key prediction across messages. HSF does not use low-entropy “random word” or timestamp inputs in its KDF.
Assumptions on PR-channel integrity. Partial bitstreams are loaded over PCAP with device-bound encryption
and integrity checks; toolchain versions and patch levels mitigating known issues (e.g., Starbleed on 7-series) are documented [
40]. Threats requiring a compromised PR channel are out of scope.
Terminology (security accounting). q: total number of adversarial attempts/queries (verification or forgery tries). D: number of keys (e.g., devices/sessions); M: messages per key; (fleet-wide trials). RO/PRF: modeling the XOF/KDF as a Random Oracle or a PRF keyed by over public context.
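Quantitative AEAD bounds. Forgery probability for a t-bit tag under Q total attempts is bounded by $Q/2^{t}$. In multi-user settings with D devices and M messages per device, the fleet-wide bound becomes $D \cdot M / 2^{t}$. Counter-based nonces avoid birthday collisions; if random L-bit nonces were used, the collision probability after q draws would be approximately $q^{2}/2^{L+1}$.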
Multi-user. , .
Per-device. , .
Random-nonce reference (not used). , draws,
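These bounds are easy to evaluate exactly with rational arithmetic. The sketch below uses the standard generic bounds (attempts divided by the tag space, and the birthday approximation for random nonces); the fleet sizes in the usage example are hypothetical and are not the figures from the paper's tables.

```python
from fractions import Fraction

def forgery_bound(attempts: int, tag_bits: int) -> Fraction:
    """Generic upper bound on forgery success after the given number of tries."""
    return Fraction(attempts, 2 ** tag_bits)

def fleet_forgery_bound(devices: int, messages_per_device: int,
                        tag_bits: int) -> Fraction:
    """Same bound with one attempt counted per message across the whole fleet."""
    return forgery_bound(devices * messages_per_device, tag_bits)

def random_nonce_collision(draws: int, nonce_bits: int) -> Fraction:
    """Birthday approximation q^2 / 2^(L+1) for random L-bit nonces."""
    return Fraction(draws * draws, 2 ** (nonce_bits + 1))

if __name__ == "__main__":
    # Hypothetical fleet: 2^20 devices, 2^20 messages each.
    for t in (64, 128):
        print(t, float(fleet_forgery_bound(2 ** 20, 2 ** 20, t)))
```

The gap between the 64-bit and 128-bit rows shows why the tag-size difference between JAMBU–PRESENT and the other ciphers is flagged alongside the performance numbers.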
Data limit for 64-bit blocks. For PRESENT-based AEAD (JAMBU–PRESENT-128), the total number of processed 64-bit blocks under a single key should be kept
due to the birthday bound; in practice, rotate keys around
blocks as a safety margin. For a rotation point of
total 64-bit blocks per key, the maximum number of messages before rekey is approximately the following (
Table 11):
Cipher suite and fairness. JAMBU–PRESENT-128 emits 64-bit tags (
), whereas ASCON/ACORN/TinyJAMBU use 128-bit tags (
). Reported performance tables annotate this security-level difference. PRESENT-ECB is used solely as a core-speed baseline; practical deployments require an AEAD mode (e.g., JAMBU) (
Table 12).
Side-channel methodology and PR-channel assumptions. Implementations follow constant-time coding with fixed control flow; table-based S-boxes are avoided where applicable (bit-slicing is also recommended for PRESENT), and tag comparison is constant-time. Dynamic Function eXchange relocates BRAM-resident pools and selected code pages to destabilize persistent templates; relocation is orthogonal to cryptographic entropy but increases the attacker's physical search space. PR-channel integrity is assumed: device-bound, encrypted bitstreams with verified tooling mitigate unauthorized reuse or tampering. Future work includes DPA/CPA measurements (fixed frequency), leakage coefficients, and template robustness.
Attack experiments (Arduino Due, 84 MHz). Practical brute-force throughput and leakage isolation were evaluated on an Arduino Due (SAM3X8E, 84 MHz). ASCON-128a verification cost from logs is
(
cycles consistent with
, cycles ≈
); the brute-force speed trial measures tag-verification throughput (integrity check per second) rather than cryptanalytic success (
Table 13). Throughput trials enumerate random 128-bit keys and count verification attempts/s; leakage trials contrast (i)
Plain ASCON with a single session key
K assumed leaked versus (ii)
HSF–ASCON with a per-message key
assumed leaked (
Table 14). Measurements were sanity-checked against the power model reported in the Results section (constant
on Due); no energy anomalies were observed.
Per-attack notes.
(Table 15) Brute force: Per-message 128-bit key and 128-bit tag; at the measured
/s on Due, the search space remains unreachable (ETA
years).
Side-channel: Fixed frequency, fixed control flow, and DFX-induced relocation destabilise power/EM templates; persistent profiling is impeded.
Timing: The verification path is constant-time with a fixed clock; data-dependent branching/min-time leakage is reduced.
Hardware manipulation (HTH): Device-locked, encrypted bitstreams (Device DNA binding) hinder unauthorised reuse; sustained placement manipulation requires a compromised PR channel.
Dynamic key resolution: ; leakage of a single
affects only its packet; cross-message prediction is ineffective.
Dictionary: No human-word or low-entropy dependency; offline guessing reduces to brute force over the 128-bit AEAD key space.
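The brute-force estimate in the notes above follows from dividing the expected number of trials by the measured verification rate. The sketch below assumes a uniformly random key, so that on average half the key space is searched; the rate used in the usage example is a placeholder and not the throughput measured on the Due.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def expected_search_years(key_bits: int, tries_per_second: float) -> float:
    """Expected exhaustive-search time, assuming half the key space is tried."""
    expected_trials = 2 ** (key_bits - 1)
    return expected_trials / tries_per_second / SECONDS_PER_YEAR

if __name__ == "__main__":
    rate = 1.0e6  # hypothetical verifications per second
    print(f"{expected_search_years(128, rate):.3e} years at {rate:.0e} tries/s")
```

Even rates many orders of magnitude above what a microcontroller can sustain leave the estimate far beyond any practical horizon, which is why the per-message 128-bit key is treated as out of reach for exhaustive search.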
Relocation randomness. Under slot-level relocation with R available slots and M reconfigurable modules per message, the number of single-shot placements is , giving bits of placement uncertainty. With , this yields bit for , bits for (), and bits for (). These bits are orthogonal to the 128-bit keyspace and should not be summed; they raise the work factor for scraping/probing and power/EM template reuse.
One-line summary. On Arduino Due, ASCON-128a verification is
per packet; scanning
packets takes
. With a leaked session key
K, the plain design yields
OK, whereas with a leaked per-message key
under HSF the success is
; brute-force throughput is
the 128-bit space is practically unreachable (ETA
years). This empirical picture is consistent with formal indistinguishability and per-message keying assumptions validated in CryptoVerif [
68].
4.6. Summary of Results
Measurements quantify the producer (Zynq-7020) and solver (Arduino Due) costs under a common per-message framing. On the producer side, PMU/Global-Timer timings and plateau-based shunt integration at the observed
load range yield per-stage energies and durations (
Table 7); HSF–ASCON completes the
crypto-only path in
(KDF + AEAD), while ASCON-128a alone completes in
. The comparison in
Table 10 shows that, at a 1×16 B payload, energy ordering follows latency: ACORN-128 ≫ JAMBU–PRESENT-128 ≳ TinyJAMBU-128 ≫ ASCON-128a, with HSF–ASCON incurring a small KDF overhead yet remaining in the microsecond/
µWh regime. Dynamic partial reconfiguration appears as a separate, single-shot relocation window: the measured 4-RM swap costs
and
(absolute
), while a 2-RM single-shot scales to
and
under identical operating conditions (
Table 9). On the solver side, operation at constant
gives HSF–ASCON verification/decryption of
, with sub-stage energies proportional to duration (
Table 8). Security evaluations indicate that per-message 128-bit keys render exhaustive search impractical [
69]; the brute-force throughput measured on Arduino Due is
(
Table 13), implying
years for a uniform 128-bit space. Leakage experiments demonstrate damage isolation: a single leaked session key
K opens all packets for a plain design, whereas a single leaked per-message key
under HSF opens only its packet (
Table 14). Side-channel resistance benefits from constant-frequency operation and compact, fixed-control AEAD verification, together with DFX-based relocation that removes stable physical layouts; placement uncertainty grows with the number of relocation slots
R (e.g.,
bits for
at
), which raises the work factor for scraping/probing without altering key entropy. Bitstream encryption bound to silicon identity (Device DNA) counters unauthorized bitstream reuse and supports robustness against hardware manipulation [
8,
37]. The empirical picture aligns with per-message keying and indistinguishability assumptions established in formal analyses [
68]. The attack matrix in
Table 15 aligns with the measured brute-force throughput and leakage-isolation trials, indicating high resilience under the stated threat model.
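The placement-uncertainty figure can be illustrated with a short calculation. The sketch below assumes that the active slot is chosen uniformly at random among R relocation slots, in which case an observer faces log2(R) bits of positional uncertainty per snapshot. The function name and the uniformity assumption are illustrative and are not taken from the measured design.

```python
import math


def placement_uncertainty_bits(slots: int) -> float:
    """Positional uncertainty in bits for one snapshot of the layout.

    ASSUMPTION (not from the measurements): the active slot is drawn
    uniformly from `slots` candidates, so the entropy is log2(slots).
    """
    if slots < 1:
        raise ValueError("at least one relocation slot is required")
    return math.log2(slots)


# Illustrative slot counts only; they are not the evaluated configurations.
for r in (2, 4, 8):
    print(f"R={r}: {placement_uncertainty_bits(r):.1f} bits")
```

This quantity describes where a secret resides, not the secret itself, so it raises the probing work factor while leaving the 128-bit key entropy unchanged.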
Security contributions of the method. (1) Multi-dimensional domain separation. Keys are separated across algorithm, device, slice, and (optionally) message via and . (2) Per-message keying (optional). Including in the KDF yields a distinct key per packet, localizing exposure and simplifying misuse resistance. (3) Operational limits quantified. Fleet-wide bounds, nonce policy, and 64-bit block data limits are stated alongside performance results, enabling risk-aware deployment. (4) Honest reporting across schemes. Tag-size differences (64 versus 128 bits) are flagged in tables to align performance with security level.
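Contributions (1) and (2) can be sketched in a few lines. The example is an illustration only: SHAKE-128 from the Python standard library stands in for ASCON-XOF, and the field names, field widths and length-prefixed encoding are assumptions rather than the format used in the measured implementation.

```python
import hashlib
from typing import Optional


def derive_key(root: bytes, algorithm: str, device_id: bytes,
               slice_id: int, msg_index: Optional[int] = None) -> bytes:
    """Derive a 128-bit key from a secret root plus public context.

    ASSUMPTIONS: SHAKE-128 replaces ASCON-XOF; all field names and
    widths are hypothetical.
    """
    fields = [root, algorithm.encode(), device_id, slice_id.to_bytes(4, "big")]
    if msg_index is not None:
        # Optional per-message separation: one distinct key per packet.
        fields.append(msg_index.to_bytes(8, "big"))
    xof = hashlib.shake_128()
    for field in fields:
        # A length prefix keeps the concatenation of fields unambiguous.
        xof.update(len(field).to_bytes(2, "big") + field)
    return xof.digest(16)


root = bytes(16)  # placeholder root secret, for illustration only
k0 = derive_key(root, "ASCON-128a", b"dev-01", slice_id=0, msg_index=0)
k1 = derive_key(root, "ASCON-128a", b"dev-01", slice_id=0, msg_index=1)
print(k0 != k1)  # keys for consecutive packets differ
```

Because each packet key depends on the message index, disclosure of one derived key does not expose the keys of neighbouring packets, provided the root stays secret and the XOF behaves as a pseudorandom function.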
5. Discussion
The evaluation indicates that a per-message AEAD with lightweight key derivation (HSF–ASCON) achieves microsecond-scale latency and micro-watt-hour energy while confining any disclosure to a single ciphertext. On the producer (Zynq-7020), crypto-only HSF–ASCON completes in
s with
Wh for a
B payload; plain ASCON-128a completes in
s with
Wh (
Table 7). On the solver (Arduino Due), verification/decryption costs scale linearly with time at a constant
mW (
Table 8). Dynamic Function eXchange (DFX) appears as a separate relocation window: the measured 4-RM single-shot costs
ms with
Wh; a 2-RM single-shot scales to
ms and
Wh under identical operating conditions (
Table 9). These windows can be scheduled sparsely relative to traffic, allowing relocation frequency to be tuned to the threat model. A concise head-to-head view is given in
Table 16, contrasting baseline ASCON-128a with HSF–ASCON (crypto-only) and HSF–ASCON with a single-shot 2-RM DFX window.
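The linear relation between duration and energy at constant power can be made explicit. The helper below is a unit-conversion sketch; its name and the numeric inputs are placeholders and do not reproduce the values in Table 8.

```python
def energy_uwh(power_mw: float, duration_us: float) -> float:
    """Energy in micro-watt-hours for a constant power draw.

    E [J] = P [W] * t [s]; 1 Wh = 3600 J; 1 Wh = 1e6 uWh.
    """
    joules = (power_mw * 1e-3) * (duration_us * 1e-6)
    return joules / 3600.0 * 1e6


# Placeholder inputs: 100 mW held for 20 us and for 40 us.
short = energy_uwh(100.0, 20.0)
long_ = energy_uwh(100.0, 40.0)
print(f"{short:.3e} uWh, {long_:.3e} uWh")
```

Doubling the duration at fixed power doubles the energy, which is why sub-stage energies on the solver track sub-stage durations.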
Security posture. Per-message 128-bit keys keep exhaustive search impractical [
69]; the measured brute-force throughput on Due (
/s) implies
years for a uniform 128-bit space (
Table 13). Leakage experiments validate damage isolation: a leaked session key in a plain design compromises all packets in that session, whereas a leaked per-message key
under HSF affects only its packet (
Table 14). Device-locked, encrypted bitstreams and silicon identity (Device DNA) restrict unauthorized reuse and help counter hardware manipulation [
8,
37]. Constant-frequency operation and compact, fixed-control AEAD verification reduce timing variance; DFX relocates BRAM-resident secrets and selected code pages, destabilizing persistent power/EM templates without claiming extra cryptographic entropy. This empirical picture aligns with indistinguishability/per-message-keying assumptions supported by formal analyses [
68] and complements prior DFX-based defences such as algorithm hopping and energy-adaptive HSMs [
14,
43].
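The order of magnitude behind the brute-force estimate follows from dividing the key space by the trial rate. The sketch below uses a hypothetical rate of 10^6 trials per second purely as a placeholder; it is not the throughput reported in Table 13.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(key_bits: int, trials_per_second: float,
                            average_case: bool = False) -> float:
    """Years needed to search a uniform key space at a fixed trial rate."""
    if trials_per_second <= 0:
        raise ValueError("trial rate must be positive")
    keyspace = 2 ** key_bits
    if average_case:
        # On average the key is found after half the space is searched.
        keyspace //= 2
    return keyspace / trials_per_second / SECONDS_PER_YEAR


# Hypothetical rate, NOT the measured Arduino Due throughput.
years = exhaustive_search_years(128, 1e6)
print(f"{years:.2e} years")
```

Even if the assumed rate were several orders of magnitude too low, the result would remain far beyond any practical attack horizon, which is the sense in which exhaustive search is called impractical here.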
Trade-offs and deployability. HSF–ASCON adds a small XOF-KDF cost relative to plain ASCON but remains in the same latency/energy regime for short packets; relocation overhead is external to the crypto path and can be amortized or scheduled sparsely. The approach retains compatibility with other lightweight AEADs (e.g., ACORN, TinyJAMBU, JAMBU–PRESENT) while providing per-message keying and relocation as orthogonal hardening layers. Compared with static-layout or static-key baselines, the method narrows the compromise scope to a single message at modest computational overhead and a tunable relocation cost, addressing the commercial–academic gap noted in prior work [
8,
21].
Comparison against AEAD baselines.
Table 17 summarises crypto-only latency/energy on Zynq-7020 together with the effective compromise scope under the evaluated configurations; relocation overhead is reported separately in
Table 9.
Synthesis with state of the art. Table 18 contrasts representative approaches. The HSF row has been updated with the measured crypto-only figures and the single-shot DFX cost under the 2-RM policy (half of the 4-RM window in
Table 9).
Related literature note. Prior work emphasises the tension between energy budgets and key management in embedded/IoT settings. Lightweight primitives and PRESENT-class designs target low area/energy but offer limited key agility in their basic forms [
70,
71]. Practical constant-time coding and leakage-reduction techniques are recommended to limit timing/power side channels on general-purpose cores [
72,
73]. One-time-pad/XOR-style schemes avoid computation but are impractical due to key distribution and reuse hazards [
74,
75]. These observations motivate the use of a small-cost XOF-based KDF to derive per-message AEAD keys under tight energy constraints.
Limitations and outlook. Relocation frequency presents a tunable cost–benefit trade-off: a higher frequency increases template instability at the expense of additional DFX windows, whereas a lower frequency amortizes the cost. The current harness evaluates short messages; larger payloads favor AEADs with higher per-byte throughput, while the HSF cost remains dominated by a fixed KDF+AEAD setup. Strengthening the reconfiguration channel (e.g., authenticated PR control) remains essential [
8,
37]. Within these bounds, per-message AEAD with relocatable BRAM provides a practical, energy-aware defence layer for embedded deployments while preserving compatibility with established lightweight primitives.
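The cost side of this trade-off can be expressed as a simple amortization. The following sketch assumes one single-shot DFX window every N messages; the function names and the numeric values are placeholders, not measurements from Table 9.

```python
import math


def amortized_energy(crypto_energy: float, dfx_energy: float,
                     messages_per_relocation: int) -> float:
    """Per-message energy when one DFX window is shared by N messages."""
    if messages_per_relocation < 1:
        raise ValueError("at least one message per relocation is required")
    return crypto_energy + dfx_energy / messages_per_relocation


def min_messages_per_relocation(crypto_energy: float, dfx_energy: float,
                                max_overhead: float) -> int:
    """Smallest N for which relocation adds at most `max_overhead`
    (a fraction of the crypto energy) to each message."""
    if crypto_energy <= 0 or max_overhead <= 0:
        raise ValueError("crypto energy and overhead budget must be positive")
    return math.ceil(dfx_energy / (max_overhead * crypto_energy))


# Placeholder energies in arbitrary but consistent units.
n = min_messages_per_relocation(crypto_energy=1.0, dfx_energy=100.0,
                                max_overhead=0.5)
print(n, amortized_energy(1.0, 100.0, n))
```

A deployment could pick the relocation interval from an energy budget in this way and then shorten it wherever the threat model calls for faster layout churn.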
6. Conclusions
A hybrid protection layer that couples per-message authenticated encryption with lightweight key derivation (HSF–ASCON) and Dynamic Function eXchange (DFX)–assisted relocation of BRAM-resident secrets and selected code pages has been demonstrated for embedded/IoT targets. The cryptographic core adheres to the AEAD interface (
key,
nonce,
AAD →
ciphertext,
tag) [
9,
76,
77], and derives a fresh per-message key via a sponge-based XOF with domain separation [
18,
19,
20]. ASCON-128a and ASCON-XOF are selected in line with the NIST Lightweight Cryptography process and the algorithm’s published analysis [
10,
11]. The XOF-KDF construction follows standard practice that permits public context (e.g., device identifier, index) alongside a secret root; this is consistent with HKDF and NIST KDF recommendations on
salt/
info inputs [
30,
31,
32]. Nonce uniqueness is enforced by a 128-bit counter layout, reflecting widely adopted guidance for AEAD modes [
62]. These choices align with Kerckhoffs’s principle: security is anchored in key secrecy rather than algorithm secrecy [
33].
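One way to realize a 128-bit counter-based nonce is sketched below. The split into a 64-bit device prefix and a 64-bit message counter is an assumption made for illustration; the layout used in the measured implementation may differ.

```python
class NonceCounter:
    """128-bit nonce = 64-bit device prefix || 64-bit message counter.

    ASSUMPTION: this field split is hypothetical.
    """

    MAX_COUNTER = 2 ** 64

    def __init__(self, device_prefix: int, start: int = 0) -> None:
        if not 0 <= device_prefix < 2 ** 64:
            raise ValueError("device prefix must fit in 64 bits")
        if not 0 <= start < self.MAX_COUNTER:
            raise ValueError("start value must fit in 64 bits")
        self._prefix = device_prefix.to_bytes(8, "big")
        self._counter = start

    def next_nonce(self) -> bytes:
        """Return a fresh 16-byte nonce; refuse to wrap around."""
        if self._counter >= self.MAX_COUNTER:
            raise OverflowError("counter exhausted; re-key before reuse")
        nonce = self._prefix + self._counter.to_bytes(8, "big")
        self._counter += 1
        return nonce


counter = NonceCounter(device_prefix=0x01)
first, second = counter.next_nonce(), counter.next_nonce()
print(first.hex(), second.hex())
```

Refusing to wrap the counter turns a potential nonce reuse into an explicit error, which is the failure mode that counter-based nonce guidance is meant to rule out.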
Measurements indicate that the method attains microsecond-scale latency and micro-watt-hour energy while confining any disclosure to a single ciphertext. On the producer (Zynq-7020), crypto-only HSF–ASCON completes in
s with
Wh for a
B payload, whereas plain ASCON-128a completes in
s with
Wh (
Table 7 and
Table 10); timings are obtained via the ARM Global Timer/PMU on the Cortex-A9 as per vendor documentation [
63,
65]. On the solver (Arduino Due), verification/decryption costs scale linearly at a constant
mW and measure
s (
Table 8). DFX appears as a separate relocation window: the measured single-shot costs are
ms with
Wh (4-RM) and
ms with
Wh (2-RM) under identical operating conditions (
Table 9). A head-to-head summary contrasting baseline ASCON-128a, HSF–ASCON (crypto-only), and HSF–ASCON including a 2-RM single-shot DFX window is provided in
Table 16. Additional crypto-only comparisons with ACORN-128, TinyJAMBU-128, and JAMBU–PRESENT-128 are given in
Table 17 (specifications in [
22,
23,
24,
25]).
Security evaluations support the intended threat model. Per-message 128-bit keys render exhaustive search impractical [
69]; the measured verification throughput on Due (
/s) implies an infeasible
years for a uniform 128-bit space (
Table 13). Leakage experiments confirm damage isolation: a leaked long-lived session key compromises all packets in a plain baseline, whereas a leaked per-message key affects only its packet under HSF (
Table 14). Constant-time verification and fixed-frequency operation reduce timing leakage; DFX relocates BRAM-resident secrets and selected code pages to destabilize power/EM templates without claiming extra cryptographic entropy. Device-locked, encrypted bitstreams and silicon identity constrain unauthorized reuse and emphasize the need to secure the partial-reconfiguration channel against hardware manipulation [
8,
37]. These observations are consistent with accepted AEAD security notions and engineering practice [
9,
62,
76,
77].
The approach integrates with established lightweight AEADs while adding orthogonal hardening. On the evaluated setup, HSF–ASCON remains in the same latency/energy regime as ASCON-128a for short payloads and narrows the compromise scope to message granularity; baseline figures for ACORN-128, TinyJAMBU-128, and JAMBU–PRESENT-128 are included for context (
Table 10 and
Table 17) [
10,
11,
22,
23,
24,
25]. DFX scheduling can be tuned sparsely relative to traffic, making the relocation cost adjustable to the threat model. In line with prior DFX-based defences (algorithm hopping, energy-adaptive modules, dynamic obfuscation) [
7,
8,
14,
43,
45], the present design emphasizes authenticated encryption and per-message keys as the primary cryptographic control.
Several caveats guide deployment and future work. The DFX window is an explicit cost and assumes an authenticated, rate-limited reconfiguration path [
37,
64]. For larger payloads, per-byte throughput of the chosen AEAD dominates whereas the HSF overhead remains largely fixed; parameterization of pool size and index policy therefore merits workload-aware tuning. Side-channel hardening beyond constant-time code (e.g., power balancing, masking) can further reduce leakage [
72,
73]. If a PUF is used to derive the long-term root secret, careful, non-interactive provisioning is advised given modelling attacks on certain strong-PUF families [
38,
78,
79]. Finally, per-relocation re-encryption of BRAM contents (outlined in Equation (
1)) offers a practical avenue to decorrelate ciphertexts across swaps without altering the online protocol.
In summary, per-message AEAD with XOF-based key derivation, combined with DFX-assisted relocation of BRAM-resident secrets and code pages, delivers a practical and energy-aware defence layer for embedded deployments [
10,
11]. The method aligns with standardized AEAD/KDF practice [
9,
20,
30,
31,
32] and with the NIST-endorsed lightweight cryptography portfolio [
10,
11], while providing tunable physical-layout churn and message-granular damage containment under the measured latency and energy budgets.