On the SCA Resistance of TMR-Protected Cryptographic Designs

Kabin, Ievgen; Langendoerfer, Peter; Dyka, Zoya

doi:10.3390/electronics14163318

by Ievgen Kabin ¹,*, Peter Langendoerfer ¹,² and Zoya Dyka ¹,²

¹ IHP – Leibniz-Institut für Innovative Mikroelektronik, 15236 Frankfurt (Oder), Germany
² Chair of Wireless Systems, Institute of Computer Science, Faculty 1: Mathematics, Computer Science, Physics, Electrical Engineering and Information Technology, Brandenburg University of Technology Cottbus-Senftenberg, 03046 Cottbus, Germany
* Author to whom correspondence should be addressed.

Electronics 2025, 14(16), 3318; https://doi.org/10.3390/electronics14163318

Submission received: 1 July 2025 / Revised: 6 August 2025 / Accepted: 14 August 2025 / Published: 20 August 2025

(This article belongs to the Special Issue Advances in Hardware Security Research)

Abstract

The influence of redundant implementations on the success of physical attacks against cryptographic devices is currently under-researched. This is a particular concern in application fields such as wearable health devices and industrial control systems, in which devices are accessible to potential attackers. This paper presents the results of an investigation into how the application of triple modular redundancy (TMR) affects the vulnerability of FPGA-based asymmetric cryptographic accelerators to side-channel analysis (SCA) attacks. We implemented our cryptographic cores using full- and partial-TMR approaches and experimentally evaluated their side-channel resistance. Our results reveal that TMR can significantly impact side-channel leakage, either increasing resistance by introducing noise or amplifying leakage, depending on the part of the design to which redundancy is applied.

Keywords:

triple modular redundancy; TMR; side-channel analysis attacks; SCA; cryptographic hardware; FPGA security; fault tolerance; hardware redundancy; physical attacks; secure design; elliptic curve cryptosystems

1. Introduction

With the advent of the Internet of Things, electronic devices are becoming ubiquitous in almost all fields of application, ranging from wearable health care via industrial control systems and critical infrastructures to space missions. What these applications have in common is that the devices are, at least to some extent, physically accessible, i.e., they can be physically manipulated and/or analysed by attackers. In addition, these applications require extreme reliability, and the majority of the data are sensitive, meaning their integrity and confidentiality are of utmost importance. The latter two features are ensured using cryptographic approaches; reliability can be achieved by redundancy, e.g., triple modular redundancy.

The influence of redundant implementations on physical attacks against cryptographic devices is currently under-researched and largely unknown. On the one hand, redundancy is a suitable countermeasure against fault injection attacks. On the other hand, it might increase the leakage exploited in side-channel attacks. In this paper, we investigate exactly this relation. Our primary goal is to explore the impact of triple modular redundancy on the success of attacks against an implementation of an asymmetric cryptographic approach. We decided to use our own design because it was extensively investigated in our previous work. The design is a hardware accelerator for elliptic curve (EC) point multiplication, which is the main operation in all EC-based cryptographic protocols. The design is vulnerable to address-bit attacks, and the leakage source is a key-dependent addressing of registers.

We extended our own design by applying the TMR strategy to different design blocks, ported it to an FPGA and captured electromagnetic radiation traces, which we used to run side-channel attacks to extract the used key.

Our main contributions are summarized as follows.

We present the first detailed experimental results on the impact of triple modular redundancy (TMR) on the side-channel resistance of FPGA-based asymmetric cryptography designs.
We demonstrate that TMR can both reduce and amplify SCA leakage, depending on design factors such as the selective application of redundancy.
We provide guidelines for hardware designers aiming to balance fault tolerance and side-channel resistance.

The rest of this paper is organized as follows: Section 2 reviews the current state of research on side-channel attacks targeting TMR-protected designs. Section 3 introduces the tools and methodologies used for automated implementation of TMR. Section 4 describes our cryptographic designs selected for evaluation and how TMR was applied. Section 5 presents the experimental setup, conducted attack as well as a discussion of the attack results. Finally, Section 6 concludes the paper and presents our plans for future work.

2. SCA Attacks Against TMR: State of the Art

Traditional countermeasures against fault injection attacks are based on techniques that allow detection and coping with injected faults, preventing them from affecting the outputs or leaking sensitive data. These techniques are based on redundancy application, e.g., dual-, triple-, or n-modular redundancy, error detection and correction codes. Countermeasures against fault injection attacks are under-investigated in the context of side-channel analysis attacks. There is a limited number of papers that shed light on how a countermeasure designed for one type of attack might influence the effectiveness of the other. Some of them are briefly discussed below.

In 2007/2008, the authors of [1,2] assumed that a countermeasure against fault injection attacks may have an impact on the resistance of a cryptographic implementation to power analysis. Using VHDL, they implemented S-boxes for the AES and Kasumi ciphers and modified them by adding parity check circuits. To verify their assumption, the authors conducted a correlation power analysis attack against traces simulated for the UMC [3] 180 nm technology. For their implementation with parity check circuits, the number of key bits found increased by up to 38% for the Kasumi algorithm and by up to 45% for AES compared to the implementations without fault detection circuits. The authors also demonstrated in [2] that the application of error detection circuits can be beneficial for attackers in the case of noisy measurements.

In [4] correlation power analysis was conducted against power traces measured on a SASEBO-GII (equipped with a Virtex-5 FPGA) board for an AES protected by different error detection schemes. These included cyclic redundancy check (CRC) codes, as well as duplicated critical components. The results demonstrated that all implemented protection circuits made the design more vulnerable to correlation power analysis (CPA) attacks. The performed CPA attack against the AES implementation with protection using duplication of critical components had a similar success rate as the unprotected original version. This was explained by the authors as a consequence of the signal-to-noise ratio (SNR) being almost unaffected when both leakage and noise are doubled.
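The SNR argument reported for [4] can be made explicit with a short worked equation. It is only a sketch under simplifying assumptions that are not taken from [4]: SNR is defined as the ratio of signal variance to noise variance, and duplication scales the leakage L and the noise N by the same factor c (c = 2 for duplication). The symbols L, N and c are introduced here purely for illustration.

```latex
\mathrm{SNR} = \frac{\operatorname{Var}(L)}{\operatorname{Var}(N)}, \qquad
\mathrm{SNR}' = \frac{\operatorname{Var}(cL)}{\operatorname{Var}(cN)}
              = \frac{c^{2}\operatorname{Var}(L)}{c^{2}\operatorname{Var}(N)}
              = \mathrm{SNR}
```

Under these assumptions the common factor cancels, which is consistent with the similar CPA success rates reported for the duplicated and the unprotected design.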

In [5,6] authors investigated how fault detection methods affect the resistance of an AES implementation against correlation power analysis. As target platform, authors used a Sakura-G board [7] with a Xilinx Spartan-6 FPGA. Power traces of the AES implementations with double modular redundancy (DMR), inverse function and parity check code were captured using ChipWhisperer for identical keys and plaintexts. The authors demonstrated that fault detection techniques applied for different design blocks have different impact on the design’s resistance. For instance, parity applied to the S-Box increased the design’s resistance to CPA, while DMR and inverse function reduced the resistance, making the key recovery faster. The authors also discovered that applying more fault detection schemes to other design modules does not help to protect the implementation from the CPA attack and can actually help attackers. The impact of fault detection schemes on CPA resistance varies and some may unintentionally weaken security.

The authors of [8] wanted to investigate how the application of modular redundancy affects the resistance of an AES 128 implementation. The implemented design was based on the application of DMR, TMR as well as NMR, i.e., they used multiple copies of the same AES module. Measurements were conducted by using a PicoScope 6404D portable oscilloscope on the Evariste III platform equipped with an Altera Cyclone III FPGA. Unfortunately, the paper provides no results of the attack.

In [9], the same authors made another attempt to evaluate the resistance to differential power analysis (DPA) of AES implementations protected by fault-tolerant techniques based on parity check, space and time redundancy. They implemented five designs with fault-tolerant techniques and compared them to the standard design. The designs were synthesized for 1 MHz clock frequency targeting the Evariste III platform [10,11]. Power traces were captured using an Agilent DSO 7104A oscilloscope for each of the implementations. Authors demonstrated that in most of the cases, application of fault-tolerant techniques had minimal impact on the design, while slightly increasing its resistance against DPA.

The authors of [8] continued their work on the topic and published a paper [12] in which they proposed a masked duplex architecture for the Present cipher [13], compared it with a TMR implementation and evaluated the SCA leakage based on Welch’s test [14] applied to 1 million traces. The authors observed no significant difference in leakage between the single and the TMR implementation of the design. In the case of the proposed masked duplex architecture, there was significant leakage. However, the authors managed to eliminate it using alternative comparison logic.

In [15], the effectiveness of side-channel analysis attacks on AES implementations with triple modular redundancy is investigated. The authors implemented three instances of the same design applying TMR: an implementation with three identical AES blocks without optimization, its optimized version and, finally, an instance where TMR was achieved by using three structurally different but functionally equivalent AES implementations. Design instances were synthesized for a 500 MHz clock frequency and a 65 nm technology library. For each of the investigated designs, the authors simulated power traces for 1000 random plaintexts with a simulation step of 1 ns, resulting in 2 simulated values per clock cycle. The conducted CPA attack was based on Pearson’s correlation coefficient. The authors pointed out that the application of TMR brought no benefits for their first two instances of TMR designs compared to the single AES design. However, an attack against the third instance, where physically and structurally different AES designs were used for the TMR implementation, required increased computational effort to reveal the secret key. This last result is to be expected, as any two of the three AES blocks in the TMR implementation act as a noise source from the perspective of the third block.

While different studies have analyzed the effect of TMR on cryptographic circuits (mostly AES implementations), their findings vary significantly. Some of them report enhanced resistance to side-channel attacks, while others indicate increased vulnerability. As a result, there is no clear, unambiguous conclusion regarding whether redundancy schemes like TMR strengthen or weaken the resistance of cryptographic implementations against SCA attacks.

3. Tools for Automated Triple Modular Redundancy Implementation

As circuits become more complex, manual implementation of logic redundancy in large-scale hardware designs becomes time-consuming and error-prone. Therefore, the use of automated tools for TMR/NMR implementation becomes essential in modern HDL-based designs. The number of such tools is quite limited. Most of the industrial-grade TMR tools, such as Synopsys Synplify Premier [16] or Siemens Precision Hi-Rel [17], are proprietary and require commercial licenses, and may at the same time be vendor-specific or suitable for a dedicated FPGA family only, e.g., the Xilinx TMRTool [18]. However, there are still a few free, open-source alternatives available, such as BL-TMR [19], developed at Brigham Young University (BYU) with the support of the Los Alamos National Laboratory, and SpyDrNet/SpyDrNet TMR, a format-independent tool for netlist transformation developed at the BYU Configurable Computing Lab [20,21].

We chose SpyDrNet TMR (version 1.14) because it is a free, open-source, Python-based framework with an intuitive API for accessing, analyzing and modifying generic netlist data structures. The documentation provides all the installation instructions and tutorials required for a quick start, as well as examples for Digilent boards.

In this work, we investigated several possibilities for TMR implementation, i.e., implementation of the entire design using triplicated logic, as well as partial TMR application for implementation of critical blocks selected to be protected. We followed the design flow illustrated in Figure 1.

The starting point of the flow is an original HDL-based implementation of a design.

The module to be implemented using TMR logic is synthesized independently as a new project to generate a netlist. After this step, the synthesized netlist of the selected block can be exported in EDIF (used in this work) or Verilog netlist format, both of which are compatible with SpyDrNet. The next step is a netlist modification using a SpyDrNet Python script that replicates the logic three times and inserts majority voters at appropriate output nodes. Then, the resulting TMR netlist is encapsulated into an IP core to be instantiated within the original design, maintaining the design hierarchy. The original design is updated to include the pre-synthesized TMR IP core. The remaining steps follow the regular Vivado [22] design flow, i.e., the complete design is synthesized and passed through the implementation stages, including placement, routing, and timing analysis. Finally, a bitstream is generated.

4. Investigated Designs

4.1. Original Design

The starting point of the investigations conducted in this work is a hardware accelerator for Elliptic Curve point multiplication (kP operation) for the standard NIST elliptic curve B-233 [23] with the irreducible polynomial f(t) = t²³³ + t⁷⁴ + 1. The kP operation in B-233 was selected as an example for the investigations because the resistance of the accelerator to horizontal SCA attacks was well investigated in the past [24,25]. The implementation is described in VHDL and is based on a modification of the Montgomery kP algorithm [26] with Lopez–Dahab coordinates. The implemented algorithm is given in Appendix A (Algorithm A1). For more implementation details, see the FPGA1 design in Section 7.2 of [27]. Here, we avoid a detailed explanation of the algorithm in order to focus on how the triplication of different design blocks influences its resistance to SCA.
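To make the field arithmetic behind the kP operation concrete, the following minimal Python sketch models multiplication in GF(2^233) modulo f(t) = t²³³ + t⁷⁴ + 1. It is a software illustration only and not the VHDL implementation of the accelerator; the function names gf_reduce and gf_mul and the integer encoding of polynomials (bit i holds the coefficient of t^i) are assumptions made for this example.

```python
# Illustrative software model of GF(2^233) arithmetic (not the hardware design).
# A polynomial is stored as a Python int: bit i is the coefficient of t^i.
M = 233
F = (1 << 233) | (1 << 74) | 1  # f(t) = t^233 + t^74 + 1


def gf_reduce(a: int) -> int:
    """Reduce a polynomial modulo f(t) by cancelling its leading term repeatedly."""
    while a.bit_length() > M:
        shift = a.bit_length() - 1 - M  # degree(a) - 233
        a ^= F << shift
    return a


def gf_mul(a: int, b: int) -> int:
    """Carry-less (shift-and-XOR) product of a and b, reduced modulo f(t)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return gf_reduce(product)


# t^232 * t = t^233, which reduces to t^74 + 1 modulo f(t).
example = gf_mul(1 << 232, 1 << 1)
```

Addition in this field is a bitwise XOR of the two integers, which is why no separate helper is shown for it.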

The selected target platform is a Digilent Arty Z7-20 development board [28] with a Xilinx Zynq 7020 FPGA. The original design, denoted as Design_0 in the rest of the paper, was synthesized and implemented using Vivado v2022.1 (64-bit, SW Build: 3526262) with “Vivado Synthesis Defaults” and “Vivado Implementation Defaults” strategies for a clock frequency of 10 MHz. The block diagram of the implemented design is shown in Figure 2.

The kP accelerator requires an up to 233-bit long binary scalar k as well as affine coordinates x and y of a point P as inputs. The output is represented by two coordinates—the result of a scalar multiplication of the value k with a point P. The communication with the design is performed using a UART interface represented by RX/TX ports in the diagram. The hierarchy of the ecc_uart_0 block is shown in Figure 3. The main block implementing the kP operation is called i_ecc.

The part of the design responsible for the kP operation consists of the following main components:

controller (i_cntr)—manages the sequence of the field operations and data flow between the rest of the components;
arithmetic logic unit (i_alu)—performs squaring or addition of operands;
multiplier (i_multiply)—calculates the field product of 233-bit-long operands using 9 partial products according to a fixed calculation plan [26], implemented according to the iterative 4-segment Karatsuba multiplication method;
partial multiplier (u1)—calculates partial products for the multiplier, implemented using only the classical multiplication formula [29];
233-bit-long registers: for the implementation of the main loop of the algorithm (i_x1, i_x2, i_x3, i_x4, i_z1, i_z2), input/output registers (i_x, i_y), and the registers that hold the value of scalar k and the parameter b of the elliptic curve equation (i_ext_reg, i_b);
muxer (i_sys_mux)—ensures data exchange between design components.
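The figure of 9 partial products for the multiplier in the list above can be reproduced with a small software sketch. It assumes that the 4-segment Karatsuba method corresponds to two nested levels of the 2-segment Karatsuba formula over GF(2); the recursive formulation, the function names and the segment widths below are illustrative assumptions and do not reproduce the fixed, iterative calculation plan of the hardware multiplier.

```python
def clmul(a: int, b: int) -> int:
    """Classical shift-and-XOR polynomial multiplication over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def karatsuba(a: int, b: int, width: int, depth: int, counter: list) -> int:
    """Karatsuba over GF(2); counter[0] counts the classical partial products."""
    if depth == 0:
        counter[0] += 1
        return clmul(a, b)
    half = (width + 1) // 2
    mask = (1 << half) - 1
    a0, a1 = a & mask, a >> half
    b0, b1 = b & mask, b >> half
    low = karatsuba(a0, b0, half, depth - 1, counter)
    high = karatsuba(a1, b1, half, depth - 1, counter)
    mid = karatsuba(a0 ^ a1, b0 ^ b1, half, depth - 1, counter)
    # Over GF(2): a0*b1 + a1*b0 = mid + low + high (addition is XOR).
    return low ^ ((mid ^ low ^ high) << half) ^ (high << (2 * half))


# Two splitting levels turn each 233-bit operand into 4 segments of at most 59 bits.
a = (1 << 232) | 0x123456789ABCDEF
b = (1 << 200) | (1 << 117) | 0xFEDCBA987654321
counter = [0]
product = karatsuba(a, b, 233, 2, counter)
```

With two levels, each level replaces one product by three smaller ones, so 3 × 3 = 9 classical partial products are computed instead of the 16 that a schoolbook 4-segment multiplication would need.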

4.2. TMR Instances of the Original Design

Following the design flow represented in Figure 1, we created several instances of the original design with the goal of investigating the impact of triple modular redundancy on the design’s vulnerability to side-channel attacks:

Design_1: contains a multiplier (i.e., the i_multiply component) implemented using TMR;
Design_2: triple modular redundancy was applied to six registers used in the main loop of the algorithm, i.e., registers i_x1, i_x2, i_x3, i_x4, i_z1, i_z2;
Design_3: combines the components implemented using TMR logic in Design_1 and Design_2 (i.e., the multiplier and six registers);
Design_4: the i_ecc part of the ecc_uart_0 block (see Figure 3), responsible for the calculation of the kP operation was implemented using TMR logic.

The result of triplication is shown schematically on the example of the i_z1 register in Figure 4.

As can be seen, each flip-flop from the original design was triplicated, with a shared voter inserted, for Design_2 and Design_3. The implementation of the i_z1 register in Design_4 follows the same scheme but employs TMR with distributed voters (i.e., three voters for three flip-flops), since the register was triplicated as part of the triplication of its parent module i_ecc.
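The behaviour of a triplicated register with a majority voter can be illustrated by a short behavioural sketch. It is a software model written for this explanation, not the netlist produced by SpyDrNet TMR; the names majority and read_tmr_register are hypothetical, and the register content is modelled as a Python integer whose bits stand for the individual flip-flops.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority vote over three register copies."""
    return (a & b) | (a & c) | (b & c)


def read_tmr_register(copies: tuple) -> int:
    """Return the voted value of a triplicated register (three copies)."""
    copy_a, copy_b, copy_c = copies
    return majority(copy_a, copy_b, copy_c)


# A 233-bit register value stored three times; one copy suffers a single bit flip.
stored = (1 << 232) | 0x5A5A5A
faulty = stored ^ (1 << 100)
voted = read_tmr_register((stored, faulty, stored))
```

The voter masks a fault in any single copy. From a side-channel perspective, however, all three copies switch together on every register update, which is why triplication of a leaking register can be expected to scale its contribution to the measured trace.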

The FPGA resource utilization for investigated designs is given in Table 1. A more detailed table providing a hierarchical comparison of resource usage is given in the Appendix (Table A1).

5. SCA Attack and Results Discussion

5.1. Measurements

The board was placed in the Langer ICS 105 IC scanner [30] equipped with a Langer MFA-R 0.2-75 near-field probe [31] that allows precise measurements of high-frequency near fields close to the board components. According to the board schematic [32], the measurement point should be selected close to one of the power decoupling capacitors C125–C134 for the internal core supply voltage VCCINT. However, the board lacks reference designators, so only a small number of components can be matched to the schematic. For this reason, we experimentally selected the capacitor with the best signal-to-noise ratio for the measurements. The exact placement of the probe is illustrated in Figure 5.

Five electromagnetic traces, one for each of the attacked designs, were captured during the execution of kP operations using a WavePro 254HD (Teledyne LeCroy GmbH, Heidelberg, Germany) oscilloscope [33], operating at a sampling rate of 10 GS/s. The placement of the EM probe as well as the inputs for the kP implementation were kept constant for all investigated cases. The design’s functionality and the correctness of the kP calculation were verified by validating the output result. A screenshot of the captured EM trace for the original design is given in Figure 6.

The captured EM traces differ in the signal amplitude depending on the design’s instance. For example, the trace for the Design_4 with the full TMR has about three times higher signal amplitude compared to the original Design_0 without TMR. The difference in signal amplitude is demonstrated on parts of the synchronized traces depicted in Figure 7.

It can be seen in Figure 7 that the application of the TMR influences not only the amplitude of the signal but also the signal propagation delays. As TMR introduces additional logic paths and voters, it alters the timing of signal transitions. These changes in propagation delays affect the alignment and duration of switching activity, thus influencing the shape of captured EM traces and detectability of SCA leakage.

5.2. Performed Attack

The conducted attack is described in detail in [27,34]. It is suitable for implementations in which computations are performed in a sequence of steps corresponding to individual bits of a secret scalar, e.g., algorithms with bitwise processing of a scalar in the elliptic curve scalar multiplication. Each of these steps is referred to as a “slot”, representing a fixed-length period of time during which one bit of the scalar is processed. The attacker’s goal is to reveal the value of each bit processed in these slots by analyzing the corresponding power consumption or electromagnetic emanation trace. In other words, an attacker exploits statistical differences in power consumption between parts that process ‘0’ key bits and those that process ‘1’ key bits, under the assumption that these two sets have distinguishable mean shapes in the corresponding power or electromagnetic traces.

Although the attacker does not know the processed scalar (i.e., has no knowledge of which slot corresponds to which bit value) or the exact operation performed in each slot, they can calculate a mean slot over all slots. This mean slot is then used as a reference for comparison. By comparing the shape of each individual slot to the shape of the mean slot sample-wise, the attacker attempts to classify the bit value of that individual slot. For example, if a sample from a slot is lower than the corresponding sample of the mean slot, the bit is classified as ‘0’, based on the assumption that less power is consumed when a ‘0’ is processed; under the opposite assumption, it is classified as ‘1’. Two hypotheses are thus considered: one where ‘0’-slots consume less power on average than ‘1’-slots, and the other where the opposite is true.

Applying this classification to all slots yields one key candidate per sample position of the slot. Because the two hypotheses are inverses of one another, only one needs to be applied to extract key candidates; the inverted key candidates can then be used to assess the attack success under the other hypothesis.
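The comparison to the mean slot can be sketched in a few lines of Python. This is a simplified illustration of the idea described above and not the attack code of [27,34]; the function name key_candidates, the list-of-lists data layout and the synthetic values are assumptions. The sketch applies the hypothesis that ‘1’-slots lie above the mean, so the other hypothesis corresponds to the bitwise inverse of each candidate.

```python
def key_candidates(slots: list) -> list:
    """Derive one key candidate per sample position by comparison to the mean slot.

    slots[i][j] is sample j of the slot that processes key bit i.
    Hypothesis used here: a value above the mean indicates a '1' bit.
    """
    n_slots = len(slots)
    n_samples = len(slots[0])
    mean_slot = [sum(slot[j] for slot in slots) / n_slots for j in range(n_samples)]
    candidates = []
    for j in range(n_samples):
        candidates.append([1 if slot[j] > mean_slot[j] else 0 for slot in slots])
    return candidates


# Synthetic example: sample position 0 depends on the key bit, position 1 does not.
secret_bits = [1, 0, 1, 1, 0, 0]
synthetic_slots = [[2.0 if bit else 1.0, 5.0] for bit in secret_bits]
found = key_candidates(synthetic_slots)
```

In this synthetic case the candidate extracted at the leaking position equals the secret bit sequence, while the non-leaking position yields a constant, uninformative candidate.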

5.3. Attack Results and Discussion

As our designs operate at 10 MHz and the sampling rate during the measurements was set to 10 GS/s, a single clock cycle is represented by 1000 captured samples. We compressed the trace by applying the sum of absolute values to represent each clock cycle using a single value.
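The compression step can be sketched as follows. The snippet assumes that the trace is available as a flat sequence of samples aligned to clock-cycle boundaries and that an incomplete trailing cycle is simply dropped; the function name compress_trace and these alignment assumptions are ours and are not taken from the measurement setup.

```python
SAMPLING_RATE_HZ = 10_000_000_000  # 10 GS/s
CLOCK_HZ = 10_000_000              # 10 MHz
SAMPLES_PER_CYCLE = SAMPLING_RATE_HZ // CLOCK_HZ


def compress_trace(trace: list, samples_per_cycle: int = SAMPLES_PER_CYCLE) -> list:
    """Represent each clock cycle by the sum of the absolute values of its samples."""
    n_cycles = len(trace) // samples_per_cycle  # incomplete last cycle is dropped
    return [
        sum(abs(v) for v in trace[c * samples_per_cycle:(c + 1) * samples_per_cycle])
        for c in range(n_cycles)
    ]


# Toy trace with 2 samples per cycle and one leftover sample.
compressed = compress_trace([1, -1, 2, -2, 3, -3, 9], samples_per_cycle=2)
```

After compression, a slot of 54 clock cycles is described by 54 values instead of 54,000 raw samples.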

Since the processing of a single key bit in our implementation requires only 54 clock cycles, we obtained 54 key candidates. All key candidates were compared to the real key, i.e., the processed scalar k, by calculating the Hamming distance HD between each key candidate and the real key as well as the relative correctness δ using the following equation:

\delta = \left(1 - \frac{HD}{\mathrm{key\_length}}\right) \cdot 100\% \qquad (1)
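Equation (1) translates directly into code. The sketch below assumes that the key candidate and the real key are given as equally long sequences of bits; the function name relative_correctness and the 4-bit toy values are illustrative only.

```python
def relative_correctness(candidate: list, key: list) -> float:
    """Relative correctness in percent according to Equation (1)."""
    if len(candidate) != len(key):
        raise ValueError("candidate and key must have the same length")
    hamming_distance = sum(c != k for c, k in zip(candidate, key))
    return (1 - hamming_distance / len(key)) * 100


# Toy example: one of four bits is wrong.
delta = relative_correctness([1, 0, 1, 1], [1, 0, 0, 1])
# The inverted candidate corresponds to the opposite hypothesis.
delta_inverted = relative_correctness([0, 1, 0, 0], [1, 0, 0, 1])
```

A candidate and its inverse always add up to 100%, so a correctness near 0% is as informative to an attacker as one near 100%, whereas values around 50% indicate that the candidate carries little information about the key.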

The results of the attack against the original design as well as its four instances employing triple modular redundancy are presented graphically in Figure 8.

As can be seen, for Design_4 (see dashed black line), in which TMR was applied to the whole design, most of the key candidates have a correctness similar to that of the original implementation without redundancy (see blue line for Design_0). The best key candidate (key candidate 41) has a correctness of 99.5% for Design_4, while a maximum correctness of 99.1% is achieved for the candidate with the same index in the original design, i.e., Design_0. These results were expected, as the power consumption increases proportionally for all design blocks when TMR is applied to the whole design. However, some of the key candidates have a significantly increased (e.g., key candidates 21, 40, 41, 45, 48) or reduced (key candidate 44) correctness compared to the original design. This may be caused by differences in signal propagation delays that originate either from the TMR application or from the placement and routing steps of the design implementation.

The usage of registers in the design is key-dependent, see steps 11, 12 and 14, 15 in Algorithm A1. Therefore, the power consumed by accessing these registers contributes to the side-channel leakage that reveals the key. By triplicating only the registers in Design_2, we expected an increase in this leakage. The attack results confirm this assumption (see red line in Figure 8): most of the key candidates have an increased correctness compared not only to the original design but also to the full TMR design. Design_2 is the only investigated instance that achieved the highest possible correctness of 100% (key candidate 41), meaning that the key candidate is equal to the original key.

The field multiplier is the biggest and most energy-consuming unit in our design, which is active in all clock cycles during main loop iterations. This means that the rest of the operations, including those that are key-dependent, are performed in parallel to multiplications. The multiplier block is resistant to the performed horizontal attack [24]. Thus, the multiplier acts as an inherent noise source in the design, which can hide the key-dependent activity of other blocks, thereby increasing the design’s resistance to SCA attacks [27]. The implementation of the multiplier block using TMR in Design_1 significantly increases the level of noise produced by the design, resulting in considerably reduced correctness of the keys revealed (see green solid line in Figure 8). Most of the key candidates have a correctness below 75% while the best key candidate has a correctness of 79% (key candidate 39).

Design_3 (see green dashed line), which includes triplicated registers and field multiplier blocks, shows slightly increased vulnerability compared to Design_1 (see green solid line). Again, the correctness is increased due to the increased power consumed by the registers, which contributes to the SCA leakage.

Our results demonstrate that a sophisticated TMR approach is crucial for enhancing the side-channel resistance of cryptographic designs. Designers have to understand how much each part of their implementation leaks information as well as what is the source of the noise and apply TMR to the noise source blocks such as field multipliers rather than simply triplicating everything without a clear strategy.

6. Conclusions and Future Work

Our experiments demonstrate that TMR affects side-channel resistance of cryptographic implementations. Selective redundancy added to functional blocks of the design can make SCA attacks either more successful and easier to perform or reduce the design’s vulnerability, introducing a sufficient amount of noise to hide leakage sources. Thus, although TMR application may provide protection under specific conditions, it should not be considered a primary countermeasure against SCA attacks. Furthermore, application of TMR can unintentionally increase SCA leakage, potentially weakening the security of otherwise resilient designs.

In our future work we plan to investigate the impact of TMR on the success of attacks by exploiting static currents, including those conducted under laser illumination or operating parameters variation.

Author Contributions

Conceptualization, I.K. and Z.D.; funding acquisition, P.L.; investigation, I.K.; methodology, I.K. and Z.D.; software, I.K.; validation, I.K.; writing—original draft, I.K.; writing—review and editing, I.K., P.L. and Z.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AES	Advanced encryption standard
CPA	Correlation power analysis
CRC	Cyclic redundancy check
DPA	Differential power analysis
DMR	Double modular redundancy
SNR	Signal-to-noise ratio
TMR	Triple modular redundancy

Appendix A

Algorithm A1: Modified Montgomery algorithm for the kP operation corresponding to [27]

Input: k = (k_l−₁ … k₁ k₀)₂ with k_l−₁ = 1, P = (x,y) is a point of EC over GF(2^l)
Output: kP = (x₁, y₁)

1: X₁ ← x, X₂ ← x⁴ + b, Z₂ ← x² //initialization
2: if k_l−₂ = 1 then //processing second most significant bit
3: T ← Z₂, Z₁ ← (X₁Z₂ + X₂)², X₁ ← X₁Z₂X₂ + xZ₁,
4: T ← X₂, U ← b Z₂⁴, X₂ ← X₂⁴+ U,
U ← TZ₂, Z₂ ← U².
5: else
6: T ← Z₂, Z₂ ← (X₁Z₂ + X₂)², X₂ ← X₁X₂T + xZ₂,
7: T ← X₁, U ← bX₂⁴, X₁ ← X₁⁴ + b,
U ← TX₂, Z₁ ← T².
8: end if
9: for i from l − 3 downto 0 do //start of the main loop
10: if k_i = 1 then
11: T ← Z₁, Z₁ ← (X₁Z₂ + X₂Z₁)², X₁ ← xZ₁ + X₁X₂TZ₂,
12: T ← X₂, X₂ ← X₂⁴ + bZ₂⁴, Z₂ ← T² Z₂².
13: else
14: T ← Z₂, Z₂ ← (X₂Z₁ + X₁Z₂)², X₂ ← xZ₂ + X₁X₂TZ₁,
15: T ← X₁, X₁ ← X₁⁴ + b Z₁⁴, Z₁ ← T² Z₁².
16: end if
17: end for //end of the main loop
//calculating affine coordinates of the kP result
18: x₁ ← 1/(xZ₁Z₂)
19: y₁ ← y + (x + x₁)[(X₁ + xZ₁)(X₂ + xZ₂) + (x² + y)(Z₁Z₂)] ∙ x₁
20: x₁ ← X₁x₁xZ₂ // i.e., x₁ = X₁/Z₁
21: return (x₁, y₁)

Table A1. FPGA Resources Utilization by Design Hierarchy.

	Design_0		Design_1		Design_2		Design_3		Design_4
	Logic LUTs	FFs	Logic LUTs	FFs	Logic LUTs	FFs	Logic LUTs	FFs	Logic LUTs	FFs
bd_ecc_i	5466	3706	12,285	5366	6790	6502	13,687	8162	22,705	10,702
clk_wiz	0	0	0	0	0	0	0	0	0	0
util_vector_logic_0	1	0	1	0	1	0	1	0	1	0
ecc_uart_0	5465	3706	12,284	5366	6789	6502	13,686	8162	22,704	10,702
(U0	2	60	2	60	2	60	2	60	10	60
❖ i_ecc	4278	3498	12,020	5158	6526	6294	13,422	7954	22,465	10,494
(i_ecc)	-	-	1	0	1	0	1	0	1	0
• i_alu	0	233	0	233	0	233	0	233	466	699
• i_b	0	233	0	233	233	233	233	233	478	699
• i_cntr	661	104	896	104	895	104	895	104	3241	312
• i_ext_mux	*	*	161	0	*	*	161	0	*	*
• i_ext_reg	0	233	0	233	0	233	0	233	478	699
• i_multiply	2417	830	9258	2490	2368	830	9262	2490	8810	2490
◦ u1	*	*	*	*	1546	0	*	*	4762	0
• i_sys_mux	932	0	932	0	932	0	932	0	2796	0
• i_testbit	64	1	64	1	64	1	65	1	194	3
• i_x	64	233	0	233	297	233	233	233	542	699
• i_x1	18	233	699	233	233	699	233	699	2563	699
• i_x2	0	233	0	233	233	699	233	699	466	699
• i_x3	0	233	0	233	233	699	233	699	466	699
• i_x4	0	233	0	233	233	699	233	699	466	699
• i_y	122	233	9	233	338	233	242	233	575	699
• i_z1	0	233	0	233	233	699	233	699	466	699
• i_z2	0	233	0	233	233	699	233	699	466	699
❖ i_uart	1187	148	265	148	264	148	265	148	231	148
• up	1103	84	186	84	185	84	186	84	143	84
• ut	84	64	80	64	80	64	80	64	88	64
◦ bg	21	17	20	17	20	17	20	17	21	17
◦ ur	45	28	44	28	44	28	44	28	49	28
◦ ut	18	19	16	19	16	19	16	19	18	19

* was flattened during synthesis.

The clk_wiz IP block does not consume any logic LUTs or flip-flops due to the fact that it uses dedicated clocking resources such as one Mixed-Mode Clock Manager (MMCME2_ADV) to divide the input clock frequency as well as produce the output clock and two Global Clock buffers (BUFGCTRL) to distribute it.

References

Regazzoni, F.; Eisenbarth, T.; Grobschadl, J.; Breveglieri, L.; Ienne, P.; Koren, I.; Paar, C. Power Attacks Resistance of Cryptographic S-boxes with added Error Detection Circuits. In Proceedings of the 22nd IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT 2007), Rome, Italy, 26–28 September 2007; pp. 508–516.
Regazzoni, F.; Eisenbarth, T.; Breveglieri, L.; Ienne, P.; Koren, I. Can Knowledge Regarding the Presence of Countermeasures Against Fault Attacks Simplify Power Attacks on Cryptographic Devices? In Proceedings of the 2008 IEEE International Symposium on Defect and Fault Tolerance of VLSI Systems, Boston, MA, USA, 1–3 October 2008; pp. 202–210.
UMC. Available online: https://www.umc.com/en/Home/Index (accessed on 19 June 2025).
Luo, P.; Fei, Y.; Zhang, L.; Ding, A.A. Side-channel power analysis of different protection schemes against fault attacks on AES. In Proceedings of the 2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14), Cancun, Mexico, 8–10 December 2014; pp. 1–6.
Pahlevanzadeh, H.; Dofe, J.; Yu, Q. Assessing CPA resistance of AES with different fault tolerance mechanisms. In Proceedings of the 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), Macao, China, 25–28 January 2016; pp. 661–666.
Dofe, J.; Pahlevanzadeh, H.; Yu, Q. A Comprehensive FPGA-Based Assessment on Fault-Resistant AES against Correlation Power Analysis Attack. J. Electron. Test. 2016, 32, 611–624.
SAKURA-G. Available online: http://www.meytang.com/h-pd-18.html (accessed on 19 June 2025).
Miškovský, V.; Kubátová, H.; Novotný, M. Influence of fault-tolerant design methods on differential power analysis resistance of AES cipher: Methodics and challenges. In Proceedings of the 2016 5th Mediterranean Conference on Embedded Computing (MECO), Bar, Montenegro, 12–16 June 2016; pp. 14–17.
Říha, J.; Miškovský, V.; Kubátová, H.; Novotný, M. Influence of Fault-Tolerance Techniques on Power-Analysis Resistance of Cryptographic Design. In Proceedings of the 2017 Euromicro Conference on Digital System Design (DSD), Vienna, Austria, 30 August–1 September 2017; pp. 260–267.
Bochard, N.; Marchand, C.; Petura, O.; Bossuet, L.; Fischer, V. Evariste III: A new multi-FPGA system for fair benchmarking of hardware dependent cryptographic primitives. In Proceedings of the International Conference on Cryptographic Hardware and Embedded Systems—CHES 2015, Saint Malo, France, 13–16 September 2015.
Wiki-Evariste. Available online: https://labh-curien.univ-st-etienne.fr/wiki-evariste/index.php/Main_Page (accessed on 19 June 2025).
Miškovský, V.; Kubátová, H.; Novotný, M. Secure and dependable: Area-efficient masked and fault-tolerant architectures. In Proceedings of the 2021 24th Euromicro Conference on Digital System Design (DSD), Palermo, Italy, 1–3 September 2021; pp. 333–338.
Bogdanov, A.; Knudsen, L.R.; Leander, G.; Paar, C.; Poschmann, A.; Robshaw, M.J.; Seurin, Y.; Vikkelsoe, C. PRESENT: An Ultra-Lightweight Block Cipher. In Proceedings of the Cryptographic Hardware and Embedded Systems—CHES 2007, Vienna, Austria, 10–13 September 2007; Paillier, P., Verbauwhede, I., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 450–466.
Welch, B.L. The Generalization of ‘Student’s’ Problem when Several Different Population Variances are Involved. Biometrika 1947, 34, 28–35.
Almeida, F.; Aksoy, L.; Raik, J.; Pagliarini, S. Side-Channel Attacks on Triple Modular Redundancy Schemes. In Proceedings of the 2021 IEEE 30th Asian Test Symposium (ATS), Matsuyama, Japan, 22–25 November 2021; pp. 79–84.
Synopsys. FPGA Design Solution for High-Reliability Applications. Available online: https://www.synopsys.com/content/dam/synopsys/implementation&signoff/datasheets/fpga-design-solution-for-high-reliability-applications-brochure.pdf (accessed on 19 June 2025).
Precision Hi-Rel, Siemens Digital Industries Software. Available online: https://eda.sw.siemens.com/en-US/ic/precision/hi-rel/ (accessed on 19 June 2025).
Xilinx TMRTool: Industry’s First Triple Modular Redundancy Development Tool for Reconfigurable FPGAs. Available online: https://www.xilinx.com/publications/prod_mktg/CS11XX_TRMTool_Product_Brief_FINAL0806.pdf (accessed on 13 August 2025).
SourceForge. BYU EDIF Tools. Available online: https://sourceforge.net/projects/byuediftools/ (accessed on 19 June 2025).
SpyDrNet|Home. Available online: https://byuccl.github.io/spydrnet-tmr/ (accessed on 19 June 2025).
Python. byuccl/spydrnet BYU Configurable Computing Lab (29 May 2025). Available online: https://github.com/byuccl/spydrnet (accessed on 19 June 2025).
AMD. AMD Vivado™ Design Suite. Available online: https://www.amd.com/en/products/software/adaptive-socs-and-fpgas/vivado.html (accessed on 25 June 2025).
NIST SP 800-186; Recommendations for Discrete Logarithm-Based Cryptography: Elliptic Curve Domain Parameters. National Institute of Standards and Technology: Gaithersburg, MD, USA, 2023.
Kabin, I.; Dyka, Z.; Klann, D.; Langendoerfer, P. Horizontal DPA Attacks against ECC: Impact of Implemented Field Multiplication Formula. In Proceedings of the 2019 14th International Conference on Design & Technology of Integrated Systems in Nanoscale Era (DTIS), Mykonos, Greece, 16–18 April 2019; pp. 1–6.
Kabin, I.; Dyka, Z.; Klann, D.; Langendoerfer, P. Methods increasing inherent resistance of ECC designs against horizontal attacks. Integration 2020, 73, 50–67.
Kabin, I.; Dyka, Z.; Langendoerfer, P. Atomicity and Regularity Principles Do Not Ensure Full Resistance of ECC Designs against Single-Trace Attacks. Sensors 2022, 22, 3083.
Kabin, I. Horizontal Address-Bit SCA Attacks Against ECC and Appropriate Countermeasures. Ph.D. Thesis, BTU Cottbus-Senftenberg, Senftenberg, Germany, 2023.
Arty Z7—Digilent Reference. Available online: https://digilent.com/reference/programmable-logic/arty-z7/start (accessed on 19 June 2025).
Hankerson, D.; Menezes, A.J.; Vanstone, S. Guide to Elliptic Curve Cryptography; Springer: Berlin/Heidelberg, Germany, 2003.
Langer EMV—ICS 105 set, IC Scanner 4-Axis Positioning System. Available online: https://www.langer-emv.de/en/product/langer-scanner/41/ics-105-set-ic-scanner-4-axis-positioning-system/144 (accessed on 25 June 2025).
Langer EMV—MFA-R 0.2-75, Near-Field Micro Probe 1 MHz up to 1 GHz. Available online: https://www.langer-emv.de/en/product/mfa-active-1mhz-up-to-6-ghz/32/mfa-r-0-2-75-near-field-micro-probe-1-mhz-up-to-1-ghz/854 (accessed on 25 June 2025).
Arty Z7—Schematic. Available online: https://files.digilent.com/resources/programmable-logic/arty-z7/arty-z7-d0-sch.PDF (accessed on 19 June 2025).
Teledyne LeCroy—WavePro 254HD. Available online: https://www.teledynelecroy.com/oscilloscope/wavepro-hd-oscilloscope/wavepro-254hd (accessed on 25 June 2025).
Kabin, I.; Dyka, Z.; Klann, D.; Mentens, N.; Batina, L.; Langendoerfer, P. Breaking a fully Balanced ASIC Coprocessor Implementing Complete Addition Formulas on Weierstrass Elliptic Curves. In Proceedings of the 2020 23rd Euromicro Conference on Digital System Design (DSD), Kranj, Slovenia, 26–28 August 2020; pp. 270–276.

Figure 1. Design flow applied in this work, illustrating the transformation of the original implementation into a TMR-protected version.

Figure 2. Vivado block diagram of the implemented design: the ecc_uart_0 block represents the kP accelerator with the UART interface, clk_wiz is a Clocking Wizard IP core configured to produce an output clock of 10 MHz.

Figure 3. Design hierarchy of the kP accelerator, i.e., the ecc_uart_0 block in the block diagram in Figure 2.

Figure 4. Two instances of the same register i_z1: (a) original implementation in Design_0, and (b) Triple Modular Redundancy (TMR) version in Design_2 and Design_3.

Figure 5. Measurement position of the EM probe close to one of the power decoupling capacitors.

Figure 6. A screenshot of the captured electromagnetic trace (red) for Design_0 and a trigger_out signal (green), indicating an ongoing kP operation. The duration of the kP operation is about 1.3 ms for the 10 MHz clock frequency.

Figure 7. Representation of synchronized parts of the traces for all five investigated designs, consisting of 2000 samples, which corresponds to the duration of two clock cycles.

Figure 8. Results of the conducted attack against the original design as well as its four instances employing triple modular redundancy.
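The timing figures quoted in the captions of Figures 2, 6 and 7 can be related to each other with a short calculation. The sketch below uses only the 10 MHz clock, the 2000-sample window covering two clock cycles, and the approximate kP duration of 1.3 ms; the resulting sampling rate and cycle count are derived here for illustration and are not values stated in the captions.

```python
# Values taken from the figure captions above.
CLOCK_HZ = 10e6            # Figure 2: clk_wiz output clock of 10 MHz
SAMPLES_IN_WINDOW = 2000   # Figure 7: synchronized window length in samples
CYCLES_IN_WINDOW = 2       # Figure 7: the window spans two clock cycles
KP_DURATION_S = 1.3e-3     # Figure 6: kP operation lasts about 1.3 ms

# Derived quantities (inferred, not quoted from the captions).
samples_per_cycle = SAMPLES_IN_WINDOW // CYCLES_IN_WINDOW
implied_sample_rate_hz = samples_per_cycle * CLOCK_HZ
kp_clock_cycles = round(KP_DURATION_S * CLOCK_HZ)

if __name__ == "__main__":
    print(f"samples per clock cycle: {samples_per_cycle}")
    print(f"implied sampling rate:   {implied_sample_rate_hz / 1e9:.0f} GS/s")
    print(f"clock cycles per kP:     about {kp_clock_cycles}")
```

Since the kP duration is given only approximately, the cycle count should be read as an order-of-magnitude estimate.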

Table 1. FPGA resources utilization.

FPGA Resources	Available		Used (Utilization)
Design		Design_0	Design_1	Design_2	Design_3	Design_4
Slice	13,300	1775 (13.35%)	3721 (27.98%)	2302 (17.31%)	4304 (32.36%)	7542 (56.71%)
SLICEL		1308	2664	1720	3071	5094
SLICEM		467	1057	582	1233	2448
LUT as Logic	53,200	5466 (10.27%)	12,285 (23.09%)	6790 (12.76%)	13,687 (25.73%)	22,705 (42.68%)
Using O6 output only		4631	9723	6259	11,129	18,090
Using O5 and O6		835	2562	531	2558	4615
Slice Registers	106,400	3706 (3.48%)	5366 (5.04%)	6502 (6.11%)	8162 (7.67%)	10,702 (10.06%)
Register driven from within the Slice		1891	2727	2014	2804	5274
Register driven from outside the Slice		1815	2639	4488	5358	5428
LUT in front of the register is unused		1169	1045	1394	1339	1775
LUT in front of the register is used		646	1594	3094	4019	3653
Unique Control Sets	13,300	69 (0.52%)	71 (0.53%)	69 (0.52%)	71 (0.53%)	163 (1.23%)
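The utilization percentages in Table 1 follow directly from the used and available counts, and the same numbers give the area overhead of each TMR variant relative to Design_0. The following is a minimal sketch of that arithmetic; the function names `utilization_percent` and `overhead_factor` are hypothetical, and the overhead factors are derived here rather than quoted from the table.

```python
# Resource counts copied from Table 1 (Design_0 .. Design_4).
AVAILABLE = {
    "Slice": 13_300,
    "LUT as Logic": 53_200,
    "Slice Registers": 106_400,
}
USED = {
    "Slice": [1775, 3721, 2302, 4304, 7542],
    "LUT as Logic": [5466, 12285, 6790, 13687, 22705],
    "Slice Registers": [3706, 5366, 6502, 8162, 10702],
}


def utilization_percent(used, available):
    """Share of the available resource that is used, in percent (2 decimals)."""
    return round(100 * used / available, 2)


def overhead_factor(used, baseline):
    """Resource usage relative to the Design_0 baseline (2 decimals)."""
    return round(used / baseline, 2)


if __name__ == "__main__":
    for resource, counts in USED.items():
        baseline = counts[0]
        for design, used in enumerate(counts):
            pct = utilization_percent(used, AVAILABLE[resource])
            factor = overhead_factor(used, baseline)
            print(f"{resource:<16} Design_{design}: {pct:6.2f}%  x{factor}")
```

For the fully triplicated Design_4, this gives a register count just under three times that of Design_0, while LUT and slice usage grow by more than a factor of four.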

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
