Article

A Secure Architecture for Modular Division over a Prime Field against Fault Injection Attacks

1 School of Software Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
2 School of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221116, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2020, 10(5), 1700; https://doi.org/10.3390/app10051700
Submission received: 26 January 2020 / Revised: 22 February 2020 / Accepted: 26 February 2020 / Published: 2 March 2020

Abstract

Fault injection attacks pose a serious threat to many cryptographic devices. The security of most cryptographic devices hinges on a key block, modular division (MD) over a prime field. Although much research has addressed the efficient hardware implementation of MD over a prime field, very few studies consider architectures that are secure against fault injection attacks, and those few can only detect faults, not locate them. This paper therefore designs a novel secure architecture for MD over a prime field that can both detect faults and locate the erroneous processing element. To seek the best performance, four word-oriented systolic structures of the main function module (MFM) were designed, and three error detection schemes were developed based on different linear arithmetic codes (LACs). The MFM structures were combined flexibly with the error detection schemes. The time and area overheads of our architecture were analyzed through an application-specific integrated circuit (ASIC) implementation, while its error detection and location capabilities were demonstrated by C++ simulation, in comparison with two existing methods. The results show that our architecture detects single-bit errors (SBEs) with 100% accuracy and locates the erroneous processing element (PE), and that it correctly identifies most single-PE errors and almost all multi-PE errors (when three or more PEs are erroneous). The only weakness of our architecture is its relatively high time and area overhead ratios.

1. Introduction

Integrated circuit (IC) devices are now used everywhere, from critical computing equipment to security-sensitive devices. Many of these IC devices face the risk of fault injection attacks, whose consequences include sudden failure of the IC and leakage of secret key information [1,2,3,4]. Over the years, many fault injection methods have emerged, such as heavy ion radiation, electromagnetic interference, and laser exposure, posing an increasingly serious threat to IC devices.
In the field of IC security, widespread attention has been paid to the protection against fault injection attacks [5,6,7,8,9,10], resulting in multiple measures to prevent the ICs from being attacked by fault injection. The typical measures include physical protection [11], the hardware/time redundancy (module duplication/re-computation) method [12,13,14,15], and the error detection codes (EDC)-based technique [10,16,17,18,19,20,21,22,23]. Among them, the EDC-based technique achieves the best tradeoff between fault coverage and hardware/time overheads [24]. Recently, Mustafa et al. [25] presented a novel differential fault attack (DFA)-aware floor-planning technique, which mitigates the threat from different fault attacks. The authors confirmed that this technique, involving no algorithm, circuit nature, or protocol, falls into the category of physical protection. Despite being an attractive approach to design secure ICs, the DFA-aware technique is not directly linked to our research, which focuses on the EDC-based technique for fault detection.
Finite field arithmetic has important applications in cryptography and coding theory. One of the most essential and complex operations over a finite field is modular division/inversion (MD/MI) over a prime field. The security of related cryptographic devices rests on the secure implementation of this operation.
Because of the importance of modular division in cryptography, researchers have paid much attention to it. Various algorithms have been presented to compute modular division. They fall into three categories: repeated exponentiation, the extended Euclidean algorithm, and the extended binary GCD algorithm. The extended binary GCD algorithm is the most suitable for hardware implementation, and many implementation architectures of this algorithm have been presented [26,27,28,29,30,31,32,33]. These works contributed greatly to the hardware performance of modular division; however, they did not guarantee that the architectures remain reliable when faults are injected into the chip.
Considering the reliability of finite field arithmetic, researchers have also developed many fault detection schemes [6,10,34,35,36,37,38,39,40]. Some scholars [34,35,36] explored concurrent error detection schemes for division/inversion over the Galois field $GF(2^m)$ in depth. Based on time redundancy, Bayat Sarmadi and Hasan [34] put forward a security scheme for MD over polynomial basis and normal basis finite fields but did not implement the proposed scheme. Mozaffari Kermani and Reyhani Masoleh [35] proposed a fault detection scheme for multiplicative division on $GF(2^m)$ based on parity prediction. Mozaffari Kermani et al. [36] applied a similar technique to ensure the structural reliability of an extended Euclidean algorithm over $GF(2^m)$. The parity prediction scheme is efficient for MD over $GF(2^m)$, but it is not the best choice for arithmetic operations such as MD over a prime field, because parity prediction for arithmetic operations is usually more complicated than for logic operations, and this complexity leads to large area and time overheads. Compared with parity codes, arithmetic codes are efficient in protecting arithmetic operations [37]. Some fault-tolerant schemes have been investigated based on different arithmetic codes [6,10,38,39], yet they do not address MD over a prime field.
Focusing on modular division over a prime field, Hu et al. [40] proposed a concurrent error detection scheme based on linear arithmetic code (LAC) and applied it to a systolic implementation of MD. As shown in Figure 1, the scheme consists of a main function module (MFM) and an error detection module (EDM). The former is responsible for the MD computation, and the latter for error detection of the MFM using the LAC. The EDM involves three sub-modules: the actual check part generator (ACPG), the check part predictor (CPP), and the comparator (CMP). The ACPG accepts the result of each iteration from the MFM and generates its actual check part; the CPP predicts the check part of the iterative result; the CMP compares the actual check part with the predicted one. If the two are consistent in each iteration, no fault has been injected into the MFM and the system continues to run; otherwise, the architecture issues an alarm about a probable fault attack on the MFM. The MD output is regarded as valid if and only if the actual and predicted check parts are equal in every iteration. On the upside, the scheme in [40] detects errors accurately with limited area and time overheads. On the downside, it cannot report detected errors in real time: because it takes the entire n-bit iterative result as the detection cell, the system cannot output the detection result of an iteration until the last word of that iteration's result is valid. Nor can this scheme locate the detected errors.
Inspired by the concurrent error detection scheme for the MD in [40], this paper proposes a novel MD architecture that can effectively detect and locate erroneous processing elements and that is applicable to related cryptographic implementations, laying a basis for protection against natural faults and fault attacks. The contributions of this paper are as follows:
  • This paper extends our work in [40] to present a new secure MD architecture that can not only detect, but also locate, the error.
  • Twelve combinations of four word-oriented systolic implementations of the MFM and three error-detecting schemes with different LAC values were explored to seek the best tradeoff between area overhead, time overhead, and error detection capability. These combinations were modeled in Verilog and synthesized using Synopsys Design Compiler with the TSMC (Taiwan Semiconductor Manufacturing Company) 90 nm CMOS (complementary metal-oxide-semiconductor) standard cell library. Their functions were also verified using ModelSim.
  • Random fault injections were simulated in a C++ program, and the simulation results show that the proposed architecture can detect single-bit errors (SBEs) with 100% accuracy and locate the erroneous processing element (PE). The detection capability for single-PE errors (a multiple-bit error injected into one PE) and for multi-PE errors varies with the value of the LAC; it reaches at least 99.898% when three or more processing elements are erroneous.
  • In addition, the proposed architecture greatly shortens the delay in error reporting.
The remainder of this paper is organized as follows: Section 2 briefly reviews the MD algorithm and the LAC; Section 3 sets up our architecture and describes its algorithm; Section 4 analyzes the error detection and location capabilities of our architecture; Section 5 presents the application-specific integrated circuit (ASIC) implementation results of our architecture, and compares them with those of existing schemes; Section 6 puts forward the conclusions of this research.

2. Preliminaries

2.1. The MD Algorithm

Let X, Y, and M be three integers with greatest common divisor GCD(Y, M) = 1. The MD problem is to find an integer R satisfying RY ≡ X (mod M). Chen and Qin [31] optimized the extended binary GCD algorithm for MD over a prime field; their version requires fewer iterations than the other existing algorithms. Its equivalent description is shown as Algorithm 1.
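Algorithm 1. Equivalent description of the modular division algorithm over a prime field in [31]
Input: $X, Y, M$ ($M$ is a prime), $0 < Y < M$, $0 \le X < M$
Output: $R$ ($RY \equiv X \bmod M$)
1: Initialization
2: $i = 0,\ \phi^{0} = 1,\ \delta^{0} = 1$
3: $\{A^{0}, B^{0}, R^{0}, S^{0}\} \leftarrow \{Y, M, X, 0\}$
4: // Modular Division Computation
5: while $A^{i} \ne \pm 1$ do
6:   $i = i + 1$
7:   // Control Element
8:   $\alpha^{i} = \begin{cases} 0 & \text{if } A^{i-1} \bmod 2 = 0 \\ 1 & \text{else if } (A^{i-1} + B^{i-1}) \bmod 4 = 0 \\ -1 & \text{else if } (A^{i-1} - B^{i-1}) \bmod 4 = 0 \end{cases}$
9:   $\beta^{i} = \begin{cases} 0 & \text{if } (R^{i-1} + \alpha^{i} S^{i-1}) \bmod 2 = 0 \\ \phi^{i-1} & \text{otherwise} \end{cases}$
10:  $\phi^{i} = \begin{cases} \phi^{i-1} & \text{if } (R^{i-1} + \alpha^{i} S^{i-1}) \bmod 2 = 0 \\ -\phi^{i-1} & \text{otherwise} \end{cases}$
11:  $\lambda^{i} = \begin{cases} 0 & \text{if } (\alpha^{i} = 0) \text{ or } (\delta^{i-1} \le 0) \\ 1 & \text{otherwise} \end{cases}$
12:  $\delta^{i} = \begin{cases} 1 + \delta^{i-1} & \text{if } \lambda^{i} = 0 \\ 1 - \delta^{i-1} & \text{if } \lambda^{i} = 1 \end{cases}$
13:  // Compute Iteration Output
14:  $A^{i} = (A^{i-1} + \alpha^{i} B^{i-1})/2$
     $B^{i} = \begin{cases} B^{i-1} & \text{if } \lambda^{i} = 0 \\ A^{i-1} & \text{if } \lambda^{i} = 1 \end{cases}$
     $R^{i} = (R^{i-1} + \alpha^{i} S^{i-1} + \beta^{i} M)/2$
     $S^{i} = \begin{cases} S^{i-1} & \text{if } \lambda^{i} = 0 \\ R^{i-1} & \text{if } \lambda^{i} = 1 \end{cases}$
15: end while
16: if $A^{i} = -1$ then $R^{i} = -R^{i}$
17: $R = R^{i}$
18: return $R$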
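The control flow of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative reading of the pseudocode rather than the authors' implementation. The function name `modular_division`, the iteration cap `max_iter`, and the final reduction of R into [0, M) are assumptions added for the example. The sketch also assumes that δ starts at 1 and that a swap (λ = 1) happens only when A is odd and the previous δ is positive.

```python
def modular_division(x: int, y: int, m: int, max_iter: int = 10_000) -> int:
    """Return R with R*y ≡ x (mod m), following the structure of Algorithm 1.

    Assumes m is an odd prime and 0 < y < m.
    """
    a, b, r, s = y, m, x, 0
    phi, delta = 1, 1
    steps = 0
    while a not in (1, -1):
        steps += 1
        if steps > max_iter:  # safety cap, not part of the algorithm
            raise RuntimeError("iteration cap reached")
        # alpha makes (a + alpha*b) even; for odd a it is divisible by 4.
        if a % 2 == 0:
            alpha = 0
        elif (a + b) % 4 == 0:
            alpha = 1
        else:
            alpha = -1
        # beta adds or subtracts m so that the R update stays an integer;
        # phi alternates the sign each time a correction is used.
        if (r + alpha * s) % 2 == 0:
            beta = 0
        else:
            beta, phi = phi, -phi
        # lam = 1 moves the old (A, R) into (B, S).
        lam = 0 if (alpha == 0 or delta <= 0) else 1
        delta = 1 + delta if lam == 0 else 1 - delta
        new_a = (a + alpha * b) // 2
        new_r = (r + alpha * s + beta * m) // 2
        if lam == 1:
            b, s = a, r
        a, r = new_a, new_r
    if a == -1:
        r = -r
    return r % m  # reduction into [0, m) is an added convenience


print(modular_division(1, 5, 23))  # an inverse of 5 modulo 23
```

The loop maintains the invariants R·Y ≡ A·X and S·Y ≡ B·X (mod M), so once A reaches ±1 the value ±R solves the MD problem whatever swap rule is used; the swap rule only influences how quickly A shrinks.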

2.2. The Linear Arithmetic Code (LAC)

Definition 1 (LAC).
Let A be an integer, and p be a small odd number (p << A). Then, (A, A mod p) is defined as a LAC word of A, where A mod p, denoted as $|A|_p$, is the check part of the LAC word, and p is the check factor.
LAC words can be added and multiplied like normal integers:
$|A + B|_p = \bigl|\,|A|_p + |B|_p\,\bigr|_p$ and $|A \times B|_p = \bigl|\,|A|_p \times |B|_p\,\bigr|_p$
In theory, p can be any relatively small prime number. In practice, however, the value of p greatly affects the performance of different applications. This paper adjusts the p value for each specific MFM structure, provided that it satisfies $p = 2^{l} - 1$, where l = 2, 3, 5 (namely, p = 3, 7, 31).
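The two LAC identities are easy to check numerically. The sketch below is illustrative only: the helper names `check_part` and `fold_mod` are hypothetical. The folding routine reflects a general property of moduli of the form 2^l − 1 (since 2^l ≡ 1 mod p, the residue can be obtained by summing l-bit chunks); it is added here as background and is not a description of the authors' circuit.

```python
def check_part(a: int, p: int) -> int:
    """Check part |a|_p of the LAC word (a, a mod p)."""
    return a % p


def fold_mod(a: int, l: int) -> int:
    """Compute a mod (2**l - 1) for a >= 0 by repeatedly summing l-bit chunks."""
    p = (1 << l) - 1
    while a > p:
        total = 0
        while a:
            total += a & p  # low l bits
            a >>= l
        a = total
    return 0 if a == p else a


A, B = 1234, 5678  # arbitrary sample operands
for l in (2, 3, 5):
    p = (1 << l) - 1
    add_ok = check_part(A + B, p) == (check_part(A, p) + check_part(B, p)) % p
    mul_ok = check_part(A * B, p) == (check_part(A, p) * check_part(B, p)) % p
    print(p, add_ok, mul_ok, fold_mod(A, l) == A % p)
```

Because the check part of a sum or product can be predicted from the check parts of the operands alone, a predictor can run alongside the main datapath and be compared with the check part of the actual result.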

3. Proposed Secure Architecture and Its Algorithm Description

This section first proposes a secure architecture for the MD, and then gives its algorithm description. The relevant parameters are defined in Table 1.
As shown in Figure 2, the proposed secure architecture contains two modules, namely, the MFM (marked by the green line) and the EDM (marked by the red line). The former computes the MD and the latter detects errors in the MFM. The computing process of the secure architecture is summarized as Algorithm 2. In this algorithm, Lines 4 and 18–20 describe the error detection function, which is mapped onto the EDM of Figure 2, and the remaining lines describe the word-oriented computing process of MD, which is mapped onto the MFM.
The MFM architecture in Figure 2 is similar to that in [40]. In this architecture, the n-bit operands A, B, R, and S are split into e w-bit words, each of which is processed by a PE. A total of e PEs are combined into a systolic array to process them word by word, where e = n/w. Specifically, the control element (CE) executes the operations in Lines 9–15, producing the control variables (α, β, ϕ, λ, δ), the initial carry-in signals (1-bit $CA_{-1}$ and 2-bit $CR_{-1}$), and the f signal that controls the initialization in Lines 2–4; $PE_k$, the physical mapping of the k-th step of the for-loop in Line 17, targets the k-th word of A, B, R, and S. From right to left, a total of e PEs are connected to complete the computation of one iteration (e = 4 in Figure 2). In addition, the testing element (TE) in the MFM executes the condition judgment in Line 6. Finally, the output control element (OCE) executes the operations in Lines 23–24, ensuring that the MD result is output in parallel. This paper implements the secure architecture for four types of MFM structures, namely, Type-8, Type-16, Type-32, and Type-64. The number in the name of each type equals w, the word size of the PE. Note that this paper does not show the implementation of the MFM; its implementation details can be found in [40].
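The word decomposition used by the systolic array can be illustrated with a short sketch. The helper names `split_words` and `join_words` are hypothetical, and the example values (n = 32, w = 8, so e = 4) are chosen only for readability; word 0 is taken to be the least significant word, matching the weights 2^{kw} in the input of Algorithm 2.

```python
from typing import List


def split_words(value: int, w: int, e: int) -> List[int]:
    """Split a non-negative integer into e words of w bits, least significant first."""
    mask = (1 << w) - 1
    return [(value >> (k * w)) & mask for k in range(e)]


def join_words(words: List[int], w: int) -> int:
    """Inverse of split_words: sum of word_k * 2**(k*w)."""
    return sum(word << (k * w) for k, word in enumerate(words))


n, w = 32, 8
e = n // w
words = split_words(0x12345678, w, e)
print([hex(v) for v in words])          # ['0x78', '0x56', '0x34', '0x12']
print(hex(join_words(words, w)))        # 0x12345678
```

In this picture, the k-th entry of the list is the word handled by the k-th PE in each iteration.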
Algorithm 2. Proposed word-oriented MD algorithm with concurrent error detection
Input: $X = \sum_{k=0}^{e-1} X_k 2^{kw}$, $Y = \sum_{k=0}^{e-1} Y_k 2^{kw}$, $M = \sum_{k=0}^{e-1} M_k 2^{kw}$
Output: $R$ ($RY \equiv X \bmod M$)
1: Initialization
2: $i = 0,\ \phi^{0} = 1,\ \delta^{0} = 1$
3: $\{A^{0}, B^{0}, R^{0}, S^{0}\} \leftarrow \{Y, M, X, 0\}$
4: for $k = 0$ to $k < e$ do
     $\{\bar{A}_k^{0}, \bar{B}_k^{0}, \bar{R}_k^{0}, \bar{S}_k^{0}\} = \{|Y_k|_p, |M_k|_p, |X_k|_p, 0\}$
     $\{\bar{\bar{A}}_k^{0}, \bar{\bar{B}}_k^{0}, \bar{\bar{R}}_k^{0}, \bar{\bar{S}}_k^{0}\} = \{|Y_k|_p, |M_k|_p, |X_k|_p, 0\}$
     if $\{\bar{A}_k^{0}, \bar{B}_k^{0}, \bar{R}_k^{0}, \bar{S}_k^{0}\} \ne \{\bar{\bar{A}}_k^{0}, \bar{\bar{B}}_k^{0}, \bar{\bar{R}}_k^{0}, \bar{\bar{S}}_k^{0}\}$ then exit
   end for
5: // Modular division computation
6: while $A^{i} \ne \pm 1$ do
7:   $i = i + 1$
8:   // Control Element
9:   $\alpha^{i} = \begin{cases} 0 & \text{if } L(A_0^{i-1}) = 0 \\ 1 & \text{else if } (L^{*}(A_0^{i-1}) + L^{*}(B_0^{i-1})) \bmod 4 = 0 \\ -1 & \text{else if } (L^{*}(A_0^{i-1}) - L^{*}(B_0^{i-1})) \bmod 4 = 0 \end{cases}$
10:  $\beta^{i} = \begin{cases} 0 & \text{if } (R_0^{i-1} + \alpha^{i} S_0^{i-1}) \bmod 2 = 0 \\ \phi^{i-1} & \text{otherwise} \end{cases}$
11:  $\phi^{i} = \begin{cases} \phi^{i-1} & \text{if } (R_0^{i-1} + \alpha^{i} S_0^{i-1}) \bmod 2 = 0 \\ -\phi^{i-1} & \text{otherwise} \end{cases}$
12:  $\lambda^{i} = \begin{cases} 0 & \text{if } (\alpha^{i} = 0) \text{ or } (\delta^{i-1} \le 0) \\ 1 & \text{otherwise} \end{cases}$
13:  $\delta^{i} = \begin{cases} 1 + \delta^{i-1} & \text{if } \lambda^{i} = 0 \\ 1 - \delta^{i-1} & \text{if } \lambda^{i} = 1 \end{cases}$
14:  $CA_{-1}^{i} = (L(A_0^{i-1}) + \alpha^{i} L(B_0^{i-1}))/2$
15:  $CR_{-1}^{i} = (L(R_0^{i-1}) + \alpha^{i} L(S_0^{i-1}) + \beta^{i} L(M_0))/2$
16:  for $k = 0$ to $k < e$ do
17:    // $PE_k$ module
       $(CA_k^{i} \Vert A_k^{i}) = (L(A_{k+1}^{i-1}) \Vert H(A_k^{i-1})) + \alpha^{i}\,(L(B_{k+1}^{i-1}) \Vert H(B_k^{i-1})) + CA_{k-1}^{i}$
       $B_k^{i} = \begin{cases} B_k^{i-1} & \text{if } \lambda^{i} = 0 \\ A_k^{i-1} & \text{if } \lambda^{i} = 1 \end{cases}$
       $(CR_k^{i} \Vert R_k^{i}) = (L(R_{k+1}^{i-1}) \Vert H(R_k^{i-1})) + \alpha^{i}\,(L(S_{k+1}^{i-1}) \Vert H(S_k^{i-1})) + \beta^{i}\,(L(M_{k+1}) \Vert H(M_k)) + CR_{k-1}^{i}$
       $S_k^{i} = \begin{cases} S_k^{i-1} & \text{if } \lambda^{i} = 0 \\ R_k^{i-1} & \text{if } \lambda^{i} = 1 \end{cases}$
18:    // Actual check part generation ($ACPG_k$) module
       $\bar{A}_k^{i} = |(CA_k^{i} \Vert A_k^{i})|_p$, $\bar{B}_k^{i} = |B_k^{i}|_p$, $\bar{R}_k^{i} = |(CR_k^{i} \Vert R_k^{i})|_p$, $\bar{S}_k^{i} = |S_k^{i}|_p$
19:    // Check part predictor ($CPP_k$) module
       $\bar{\bar{A}}_k^{i} = \bigl|\,|L(A_{k+1}^{i-1}) \Vert H(A_k^{i-1})|_p + \alpha^{i}\,|L(B_{k+1}^{i-1}) \Vert H(B_k^{i-1})|_p + CA_{k-1}^{i}\,\bigr|_p$
       $\bar{\bar{B}}_k^{i} = \begin{cases} |B_k^{i-1}|_p & \text{if } \lambda^{i} = 0 \\ |A_k^{i-1}|_p & \text{if } \lambda^{i} = 1 \end{cases}$
       $\bar{\bar{R}}_k^{i} = \bigl|\,|L(R_{k+1}^{i-1}) \Vert H(R_k^{i-1})|_p + \alpha^{i}\,|L(S_{k+1}^{i-1}) \Vert H(S_k^{i-1})|_p + \beta^{i}\,|L(M_{k+1}) \Vert H(M_k)|_p + CR_{k-1}^{i}\,\bigr|_p$
       $\bar{\bar{S}}_k^{i} = \begin{cases} |S_k^{i-1}|_p & \text{if } \lambda^{i} = 0 \\ |R_k^{i-1}|_p & \text{if } \lambda^{i} = 1 \end{cases}$
20:    if $\{\bar{\dot{A}}_k^{i}, \bar{\dot{B}}_k^{i}, \bar{\dot{R}}_k^{i}, \bar{\dot{S}}_k^{i}\} \ne \{\bar{\bar{\dot{A}}}_k^{i}, \bar{\bar{\dot{B}}}_k^{i}, \bar{\bar{\dot{R}}}_k^{i}, \bar{\bar{\dot{S}}}_k^{i}\}$ then exit
21:  end for
22: end while
23: if $A^{i} = -1$ then $R^{i} = -R^{i}$
24: $R = R^{i}$
25: return $R$
The EDM encompasses e sub-ED elements ($ED_{e-1}$, $ED_{e-2}$, ..., $ED_1$, $ED_0$), each of which detects the errors in a specific PE. Once an error is detected in one of the PEs, the EDM issues an alarm immediately and terminates the MD process. The structure of $ED_k$, shown in Figure 3, contains three components: $ACPG_k$, $CPP_k$, and $CMP_k$. $ACPG_k$ executes the operation of Line 18 to generate the actual check part of $\{\dot{A}_k, \dot{B}_k, \dot{R}_k, \dot{S}_k\}$. $CPP_k$ executes the operation of Line 19 to predict the check part of $\{\dot{A}_k, \dot{B}_k, \dot{R}_k, \dot{S}_k\}$. $CMP_k$ compares the actual and predicted check parts. If the two parts are unequal, $CMP_k$ issues an alarm $Ef_k$ about the high possibility of a fault injection attack on $PE_k$, and the system terminates the MD process; otherwise, the MD process continues. The logic diagrams of $ACPG_k$ and $CPP_k$ are shown in Figure 4 and Figure 5, respectively. Note that our architecture provides an ED for each PE. Hence, the assertion of any Ef leads to an error alarm and the termination of the MD. This overcomes the defect of the EDM in [40], which takes the entire n-bit iterative result as the detection cell, so that the system cannot output the detection result of an iteration until the last word of that iteration's result is valid.
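The per-PE checking idea can be sketched on a much simpler datapath than the one in Algorithm 2. The example below is an assumption-laden illustration, not the authors' PE: it performs a plain word-serial addition of two operands, and for each word it compares the actual check part of the (carry, sum) output with a check part predicted from the operand words and the carry-in. The function name `locate_faulty_pe` and the `fault` argument (a PE index and an XOR mask applied to that PE's output) are hypothetical.

```python
from typing import List, Optional, Tuple


def locate_faulty_pe(a_words: List[int], b_words: List[int], w: int, p: int,
                     fault: Optional[Tuple[int, int]] = None) -> Optional[int]:
    """Word-serial addition with one LAC check per PE.

    fault = (k, mask) XORs `mask` into the (w+1)-bit output of PE k.
    Returns the index of the first PE whose check fails, or None.
    """
    carry = 0
    for k, (a_k, b_k) in enumerate(zip(a_words, b_words)):
        out = a_k + b_k + carry                        # carry-out and sum word as one value
        if fault is not None and fault[0] == k:
            out ^= fault[1]                            # inject the fault into this PE's output
        predicted = (a_k % p + b_k % p + carry) % p    # role of CPP_k
        actual = out % p                               # role of ACPG_k
        if actual != predicted:                        # role of CMP_k
            return k
        carry = out >> w
    return None


a, b = [200, 17], [100, 5]                             # two 8-bit words each, low word first
print(locate_faulty_pe(a, b, 8, 3))                    # None: no fault injected
print(locate_faulty_pe(a, b, 8, 3, fault=(1, 1 << 3))) # 1: single-bit flip in PE 1
```

A single flipped bit changes the output by ±2^j, which is never a multiple of an odd p greater than 1, so the check in the struck PE always fails; the returned index is the location information that a single whole-operand check cannot provide.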

4. Testing and Comparison of Error Detection Capability

4.1. Attacker Model

Since the emergence of fault injection, various attack methods have been created to inject faults into semiconductors [41], which makes it hard to define the fault model. Referring to the relevant literature, this paper adopts the following common hypotheses on the capability of attackers:
(1) Attackers are incapable of directly invading, modifying, or rebuilding the circuit structure using powerful fault injection techniques such as a focused ion beam [42]. This is the strongest assumption about the attacker's capability: if attackers could rebuild the circuit, our secure architecture would not be able to resist the attack. However, such an attack is costly and hard to carry out in practice, as it needs very expensive consumables and a strong technical background [24].
(2) Attackers are incapable of tampering with the clock signal [10]. Although tampering with the clock signal is a viable option for an attacker, this is a common assumption for designers of secure architectures, who usually focus on protecting the data path of the chip rather than the clock.
(3) Attackers are capable of injecting a fault into either the MFM or the EDM, but not both at the same time. In practice, the probability that faults injected into the MFM and the EDM at the same time escape the comparison is very small.
(4) Attackers are incapable of controlling the error pattern in the MFM output. This is also a strong assumption: if an attacker could inject a fault into the MFM that causes an error pattern our architecture cannot detect, our secure scheme would not work.

4.2. Five Types of Error Models

In this paper, five types of error models were adopted to verify the error detection capability of our secure architecture.
(1) Single-bit error (SBE): The injected fault causes a one-bit error in the PE output.
(2) Single-PE single-cycle error (SPSCE): The injected fault causes an error in the output of one PE, and the error lasts only one cycle.
(3) Single-PE multi-cycle error (SPMCE): The injected fault causes an error in the output of one PE, and the error lasts multiple cycles in a row.
(4) Multi-PE single-cycle error (MPSCE): The injected fault causes an error in the output of multiple PEs, and the errors last only one cycle.
(5) Same-iteration multi-PE error (SIMPE): The injected fault causes an error in the output of multiple consecutive PEs in the same iteration.
The five error models are visualized in Figure 6, where each black rectangle represents a PE stricken by the injected fault, and each black dot represents an erroneous bit in PE output.

4.3. Simulation and Comparison of Error Detection Capabilities

To analyze the error detection capability of the proposed systolic MD architecture, we first verified the proposed secure architecture in a C++ program, then simulated fault injection and measured the error detection capability of the architecture over 100,000 test cases. The results are compared with those of the error detection scheme in [40] in Table 2. In addition, we applied Mozaffari Kermani’s multi-column parity prediction scheme to our MFM and investigated its error detection capability in the same way. One point needs clarifying: this paper borrows Mozaffari Kermani’s multi-column parity prediction idea and applies it to MD over a prime field, whereas in [36] it was applied to MD over $GF(2^m)$. The MD operations over a prime field and over $GF(2^m)$ are different, so the cost of error detection differs as well.
For brevity, Parity, Style-I, and Style-II denote the error detection schemes in [36], in [40], and in this paper, respectively.
Table 2 shows the simulation results of all three schemes. For the SBE model, all three methods detected 100% of the errors. For the other error models, Style-II with p = 3 exhibited a poorer error detection capability than Style-I and Parity, due to its extremely small p value. However, the Style-II architectures with p = 7 and p = 31 were similar to Style-I and Parity in error detection performance. When there were three or more erroneous PEs, Style-II could detect 99.898% or more of the errors, slightly behind Style-I and Parity. Despite this lag in detection ability, Style-II has an advantage over the two reference methods: once an error is detected in a PE, the erroneous PE is always located.
Table 2 also shows that the error detection capability of Style-I varied with the MFM structure: the larger the word size of the PE in the MFM, the better the error detection capability. By contrast, Style-II’s error detection capability changed only slightly with the MFM structure. For Style-II, the error detection capability increases with the check factor p under the same MFM structure.
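The trend with p can be made plausible with a simple counting argument. The sketch below does not reproduce the C++ fault simulation or the figures in Table 2; it assumes a simplified model in which a fault adds an arbitrary nonzero value d to one w-bit PE output, all such values being equally likely, and in which a mod-p check misses the error exactly when d is a multiple of p. The function names are hypothetical.

```python
def undetected_fraction(w: int, p: int) -> float:
    """Fraction of nonzero additive errors on a w-bit word that a mod-p check misses."""
    errors = [d for d in range(-(2**w - 1), 2**w) if d != 0]
    missed = sum(1 for d in errors if d % p == 0)
    return missed / len(errors)


def single_bit_always_detected(w: int, p: int) -> bool:
    """A single flipped bit changes the word by +/- 2**j, never a multiple of an odd p > 1."""
    return all((1 << j) % p != 0 for j in range(w))


for p in (3, 7, 31):
    print(p, round(undetected_fraction(8, p), 4), single_bit_always_detected(8, p))
```

Under this model, roughly one error in p escapes a single check (about 0.33, 0.14, and 0.03 for p = 3, 7, and 31 with w = 8), which is in line with the weaker results for p = 3, while single-bit errors are always caught. When several PEs are struck, the fault escapes only if every struck PE's check is fooled, which fits the observation that multi-PE errors are detected almost always.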

5. Analysis on Time and Area Overheads

This section presents the time and area overheads of our architecture (Style-II) with three different check factors (p = 3, 7, and 31) under each MFM structure. To obtain the time and area overheads, we first modeled the proposed architecture in Verilog, then verified it with ModelSim, and finally synthesized the circuit with Synopsys Design Vision and the TSMC 90 nm CMOS standard cell library. For comparison, the Style-I and Parity methods were also modeled and synthesized under the same conditions. The synthesized results are given in Table 3, where the time (area) overhead ratio is the quotient of the extra time (area) overhead and the MFM time (area).
Firstly, as shown in Table 3, the time and area overhead ratios of Style-II always surpassed those of Style-I, regardless of the check factor, but they were lower than those of Parity at p = 3 or 7. For example, when the MFM structure belonged to Type-8, the mean time and area overhead ratios of Style-I were 0.51% and 41.31%, respectively; those of Style-II with p = 3 were 11.86% and 45.67%, respectively; those of Style-II with p = 7 were 20.70% and 77.84%, respectively; those of Parity were 30.07% and 72.23%, respectively; those of Style-II with p = 31 were 49.64% and 123.22%, respectively.
Secondly, we noticed that, when the p value was fixed, the time overhead ratio increased with the operand size n, the area overhead ratio decreased with the growth in n, and the product of time and area overhead ratios decreased with n. For example, for Style-II with p = 7, the mean time and area overhead ratios were 18.50% and 104.80%, respectively, when n = 128, and 23.64% and 46.80%, respectively, when n = 1024. Judging by only time and area overheads, Style-II’s performance is negatively correlated with p-value. However, considering overall performance including time overheads, area overheads, and error detection capability, Style-II with p = 7 should be a better choice compared with Style-II with p = 3 or 31.
Thirdly, under the fixed p-value, Style-II’s performance varied with the MFM structures. The larger the word size of the PE in the MFM, the greater the time and area overhead ratios. For example, when Style-II with p = 3 was applied to Type-8 MFM, the mean time and area overhead ratios were 11.86% and 45.67%, respectively; when it was applied to Type-16 MFM, the two ratios were 33.98% and 51.02%, respectively. Hence, Style-II works better in error detection of short-word systolic implementation of the MFM. On the contrary, Style-I is more suitable for error detection of long-word systolic implementation of the MFM. In both methods, the time and area overheads are negatively correlated with the operand size n. In other words, the two methods are more efficient for large integer MD problems.
Finally, Style-II showed a much shorter delay in error reporting than Style-I and Parity. The delay of Style-II increased with w and p, but remained basically constant when the MFM structure was stable and the operand size varied. Overall, Style-II with p = 7 and Type-8 MFM strikes a good balance between time overheads, area overheads, and error detection capability.

6. Conclusions

This paper extends the work in [40] to put forward a new LAC-based secure architecture for the MD over a prime field against fault injection attacks. Instead of taking the long n-bit iteration result as the detection cell, this paper takes the short w-bit word as the detection cell, which makes it possible to locate the erroneous processing element. Four word-based MFM systolic structures with different word sizes and three error detection schemes with different values of the linear arithmetic code were explored to seek an optimal tradeoff between different performance indexes. These combined architectures were modeled in Verilog and synthesized with Synopsys Design Vision and the TSMC 90 nm CMOS standard cell library to obtain the time and area overheads. The error detection and location capabilities of the proposed architectures were also investigated by C++ simulation. The same methods were used to test the performance of the architectures based on the Style-I scheme and the Parity scheme. The simulation results show that the proposed architecture with p = 7 and Type-8 MFM strikes a good balance between time overheads, area overheads, and error detection capability. Despite having greater area overheads than Style-I, the architecture enjoys unique advantages in locating the erroneous processing element and in timely error reporting. The research results help to find and locate fault attacks quickly. However, the large time and area overheads of this architecture may limit its application; its implementation needs to be optimized in future research to expand its application range.

Author Contributions

Conceptualization, X.H. and Z.Q.; software, X.H.; validation, X.H. and Z.Q.; formal analysis, X.H. and Z.Q.; investigation, X.H.; writing—original draft preparation, X.H. and Z.Q.; writing—review and editing, X.H. and Z.Q.; visualization, X.H. and Z.Q.; supervision, Z.Q.; project administration, X.H. and Z.Q.; funding acquisition, Z.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under Grant No. 61702237 and the Natural Science Foundation of Jiangsu Province, China under Grant No. BK20150241.

Acknowledgments

The authors would like to thank the reviewers for their constructive comments. The authors thank Changxuan Liu for her help in writing. We also would like to thank Mustafa Khairallah for his comment about DFA-aware technology.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Voyiatzis, A.G.; Serpanos, D.N. A fault-injection attack on Fiat-Shamir cryptosystems. In Proceedings of the 24th International Conference on Distributed Computing Systems Workshops, Tokyo, Japan, 23–24 March 2004; Volume 5, pp. 618–621.
  2. Schmidt, J.M.; Hutter, M.; Plos, T. Optical fault attacks on AES: A threat in violet. In Proceedings of the Sixth International Workshop on Fault Diagnosis and Tolerance in Cryptography, FDTC 2009, Lausanne, Switzerland, 6 September 2009; pp. 13–22.
  3. Trichina, E.; Korkikyan, R. Multi fault laser attacks on protected CRT-RSA. In Proceedings of the 2010 Workshop on Fault Diagnosis and Tolerance in Cryptography, Santa Barbara, CA, USA, 21 August 2010; pp. 75–86.
  4. Karaklajić, D.; Schmidt, J.M.; Verbauwhede, I. Hardware designer’s guide to fault attacks. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2013, 21, 2295–2306.
  5. Karpovsky, M.; Kulikowski, K.J.; Taubin, A. Robust protection against fault-injection attacks on smart cards implementing the advanced encryption standard. In Proceedings of the 2004 International Conference on Dependable Systems and Networks (DSN 2004), Florence, Italy, 28 June–1 July 2004; pp. 93–101.
  6. Gaubatz, G.; Sunar, B. Robust finite field arithmetic for fault-tolerant public-key cryptography. In Fault Diagnosis and Tolerance in Cryptography, Proceedings of the International Workshop on Fault Diagnosis and Tolerance in Cryptography, Yokohama, Japan, 10 October 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 196–210.
  7. Eddine Cherif, B.D.; Bendiabdellah, A.; Tabbakh, M. Diagnosis of an inverter IGBT open-circuit fault by hilbert-huang transform application. Traitement du Signal 2019, 36, 127–132.
  8. Berzati, A.; Canovas, C.; Goubin, L. In (security) against fault injection attacks for CRT-RSA implementations. In Proceedings of the 2008 5th Workshop on Fault Diagnosis and Tolerance in Cryptography, Washington, DC, USA, 10 August 2008; pp. 101–107.
  9. Dominguez-Oviedo, A.; Hasan, M.A. Error detection and fault tolerance in ECSM using input randomization. IEEE Trans. Dependable Secur. Comput. 2008, 6, 175–187.
  10. Wang, Z.; Karpovsky, M.; Joshi, A. Secure multipliers resilient to strong fault-injection attacks using multilinear arithmetic codes. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2011, 20, 1036–1048.
  11. IBM 4764 PCI-X Cryptographic Coprocessor. 2011. Available online: http://www.ibm.com/support/knowledgecenter/9119-FHB/p7hcd/fc474.htm (accessed on 28 February 2020).
  12. Patel, J.H.; Fung, L.Y. Concurrent error detection in ALU’s by recomputing with shifted operands. IEEE Trans. Comput. 1982, 31, 589–595.
  13. Johnson, B.W.; Aylor, J.H.; Hana, H.H. Efficient use of time and hardware redundancy for concurrent error detection in a 32-bit VLSI adder. IEEE J. Solid-State Circuits 1988, 23, 208–215.
  14. Li, J.; Swartzlander, E.E. Concurrent error detection in ALUs by recomputing with rotated operands. In Proceedings of the 1992 IEEE International Workshop on Defect and Fault Tolerance in VLSI Systems, Dallas, TX, USA, 4–6 November 1992; pp. 109–116.
  15. Chiou, C.W.; Lee, C.Y.; Lin, J.M.; Hou, T.W.; Chang, C.C. Concurrent error detection and correction in dual basis multiplier over GF (2m). IET Circuits Devices Syst. 2009, 3, 22–40.
  16. Lo, J.C.; Thanawastien, S.; Rao, T.R.N. Concurrent error detection in arithmetic and logical operations using Berger codes. In Proceedings of the 9th Symposium on Computer Arithmetic, Santa Monica, CA, USA, 6–8 September 1989; pp. 233–240.
  17. Nicolaidis, M.; Duarte, R.O.; Manich, S.; Figueras, J. Fault-secure parity prediction arithmetic operators. IEEE Des. Test Comput. 1997, 14, 60–71.
  18. Reyhani-Masoleh, A.; Hasan, M.A. Fault detection architectures for field multiplication using polynomial bases. IEEE Trans. Comput. 2006, 55, 1089–1103.
  19. Gaubatz, G.; Sunar, B.; Karpovsky, M.G. Non-linear residue codes for robust public-key arithmetic. In Fault Diagnosis and Tolerance in Cryptography, Proceedings of the International Workshop on Fault Diagnosis and Tolerance in Cryptography, Yokohama, Japan, 10 October 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 173–184.
  20. Hariri, A.; Reyhani-Masoleh, A. Fault detection structures for the Montgomery multiplication over binary extension fields. In Proceedings of the Workshop on Fault Diagnosis and Tolerance in Cryptography (FDTC 2007), Vienna, Austria, 10 September 2007; pp. 37–46.
  21. Bayat-Sarmadi, S.; Hasan, M.A. On concurrent detection of errors in polynomial basis multiplication. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2007, 15, 413–426. [Google Scholar] [CrossRef] [Green Version]
  22. Hariri, A.; Reyhani-Masoleh, A. Concurrent error detection in montgomery multiplication over binary extension fields. IEEE Trans. Comput. 2011, 60, 1341–1353. [Google Scholar] [CrossRef]
  23. Wang, Z.; Karpovsky, M.; Sunar, B. Multilinear codes for robust error detection. In Proceedings of the 2009 15th IEEE International On-Line Testing Symposium, Lisbon, Portugal, 24–26 June 2009; pp. 164–169. [Google Scholar]
  24. Barenghi, A.; Breveglieri, L.; Koren, I.; Naccache, D. Fault injection attacks on cryptographic devices: Theory, practice, and countermeasures. Proc. IEEE 2012, 100, 3056–3076. [Google Scholar] [CrossRef] [Green Version]
  25. Khairallah, M.; Sadhukhan, R.; Samanta, R.; Breier, J.; Bhasin, S.; Chakraborty, R.S.; Mukhopadhyay, D. DFARPA: Differential fault attack resistant physical design automation. In Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, Germany, 19–23 March 2018; pp. 1171–1174. [Google Scholar]
  26. Stein, J. Computational problems associated with Racah algebra. J. Comput. Phys. 1967, 1, 397–405. [Google Scholar] [CrossRef]
  27. Brent, R.P.; Kung, H.T. Systolic VLSI arrays for linear time GCD computation. In VLSI ‘83:VLSI Design of Digital System; Elsevier Science Pub. Co.: Amsterdam, The Netherlands, 1983; pp. 145–154. [Google Scholar]
  28. Takagi, N. A VLSI algorithm for modular division based on the binary GCD algorithm. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 1998, 81, 724–728. [Google Scholar]
  29. Kaihara, M.; Takagi, N. A Hardware Algorithm for Modular Multiplication/Division. IEEE Trans. Comput. 2005, 54, 12–21. [Google Scholar] [CrossRef]
  30. Chen, G.; Bai, G.; Chen, H. A New Systolic Architecture for Modular Division. IEEE Trans. Comput. 2007, 56, 282–286. [Google Scholar] [CrossRef]
  31. Chen, C.; Qin, Z. Efficient algorithm and systolic architecture for modular division. Int. J. Electron. 2011, 98, 813–823. [Google Scholar] [CrossRef]
  32. Choi, P.; Lee, M.; Kong, J.; Kim, D.K. Efficient Design and Performance Analysis of a Hardware Right-shift Binary Modular Division Algorithm in GF(p). J. Semicond. Technol. Sci. 2017, 17, 425–437. [Google Scholar]
  33. Chervyakov, N.; Lyakhov, P.; Babenko, M.; Nazarov, A.; Deryabin, M.; Lavrinenko, I.; Chervyakov, N. A High-Speed Division Algorithm for Modular Numbers Based on the Chinese Remainder Theorem with Fractions and Its Hardware Implementation. Electronics 2019, 8, 261. [Google Scholar] [CrossRef] [Green Version]
  34. Bayat-Sarmadi, S.; Hasan, M. Concurrent Error Detection in Finite-Field Arithmetic Operations Using Pipelined and Systolic Architectures. IEEE Trans. Comput. 2009, 58, 1553–1567. [Google Scholar] [CrossRef] [Green Version]
  35. Kermani, M.M.; Reyhani-Masoleh, A. Concurrent Structure-Independent Fault Detection Schemes for the Advanced Encryption Standard. IEEE Trans. Comput. 2010, 59, 608–622. [Google Scholar] [CrossRef]
  36. Mozaffari-Kermani, M.; Azarderakhsh, R.; Lee, C.Y.; Bayat-Sarmadi, S. Reliable Concurrent Error Detection Architectures for Extended Euclidean-Based Division Over GF(2m). IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2013, 22, 995–1003. [Google Scholar] [CrossRef]
  37. Nicolaidis, M.; Duarte, R. Fault-secure parity prediction Booth multipliers. IEEE Des. Test Comput. 1999, 16, 90–101. [Google Scholar] [CrossRef]
  38. Yumbul, K.; Erdem, S.S.; Savas, E. On Selection of Modulus of Quadratic Codes for the Protection of Cryptographic Operations against Fault Attacks. IEEE Trans. Comput. 2012, 63, 1182–1196. [Google Scholar]
  39. Yang, Q.; Hu, X.; Qin, Z. Secure Systolic Montgomery Modular Multiplier Over Prime Fields Resilient to Fault-Injection Attacks. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2014, 23, 1889–1902. [Google Scholar] [CrossRef]
  40. Hu, X.; Qin, Z.; Yang, Q. A Secure Modular Division Algorithm Embedding with Error Detection and Low-Area ASIC Implementation. J. Signal Process. Syst. 2019, 1–13. Available online: https://link.springer.com/article/10.1007/s11265-019-01481-6 (accessed on 20 February 2020).
  41. Benso, A.; Prinetto, P. (Eds.) Fault Injection Techniques and Tools for Embedded Systems Reliability Evaluation; Springer Science & Business Media: Berlin, Germany, 2003; Volume 23, pp. 7–39. [Google Scholar]
  42. Torrance, R.; James, D. The state-of-the-art in IC reverse engineering. In International Workshop on Cryptographic Hardware and Embedded Systems; Springer: Berlin/Heidelberg, Germany, 2009; pp. 363–381. [Google Scholar]
Figure 1. The concurrent error detection scheme for the modular division (MD) in [40].
Figure 2. The proposed secure systolic architecture for the MD.
Figure 3. The logic diagram of ED_k.
Figure 4. The logic diagram of ACPG_k.
Figure 5. The logic diagram of CPP_k.
Figure 6. Five types of error models.
Table 1. Parameter definitions.

| Symbol | Description |
|---|---|
| p | The check factor of the LAC |
| n | The number of bits of the modulus M |
| w | The word size of a PE (w ∈ {8, 16, 32, 64}) |
| e | The number of words in an operand, e = n/w |
| X^i | The value of X at the i-th iteration of Algorithm 2 |
| X_k | The k-th word of X (X can be A, B, R, or S) |
| L(X_k) | The least significant bit (LSB) of X_k |
| L*(X_0) | Bits 1 to 0 of X_0 |
| H(X_k) | Bits w − 1 to 1 of X_k |
| Z | The set {A, B, R, S} |
| Z_k | The set {A_k, B_k, R_k, S_k} |
| L(Z_k) | The set {L(A_0), L(B_0), L(R_0), L(S_0)} |
| L*(Z_0) | The set {L*(A_0), L*(B_0), L*(R_0), L*(S_0)} |
| H(Z_k) | The set {H(A_k), H(B_k), H(R_k), H(S_k)} |
| X \|\| Y | The concatenation of X and Y |
| C_{A_k} | The 1-bit carry-out signal from the computation of the k-th word of A |
| C_{R_k} | The 2-bit carry-out signal from the computation of the k-th word of R |
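The word-level notation of Table 1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and the word order (X_0 as the least significant word) is assumed here rather than stated in the table.

```python
def split_words(x: int, n: int, w: int) -> list[int]:
    """Split an n-bit operand into e = n // w words X_0 .. X_{e-1}.

    Assumption: X_0 is the least significant word.
    """
    if n % w != 0:
        raise ValueError("n must be a multiple of the word size w")
    e = n // w
    mask = (1 << w) - 1
    return [(x >> (k * w)) & mask for k in range(e)]


def lsb(word: int) -> int:
    """L(X_k): the least significant bit of a word."""
    return word & 1


def low_two_bits(word0: int) -> int:
    """L*(X_0): bits 1 to 0 of the lowest word."""
    return word0 & 0b11


def high_bits(word: int) -> int:
    """H(X_k): bits w - 1 to 1 of a word (the LSB is dropped)."""
    return word >> 1


def concat(x: int, y: int, y_bits: int) -> int:
    """X || Y: concatenation, with Y occupying the low y_bits bits."""
    return (x << y_bits) | y


words = split_words(0x1234, n=16, w=8)  # two 8-bit words
# A word is recovered from its two parts: H(X_k) || L(X_k) == X_k
rebuilt = [concat(high_bits(v), lsb(v), 1) for v in words]
```

With n = 16 and w = 8 the operand has e = 2 words, and concatenating H(X_k) with L(X_k) gives back each word unchanged.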
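Table 2. The comparison of error detection capability.

| MFM Arch. | Scheme | p | Prob. | SBE: One-bit Error | SPSCE: One-PE Error | SPMCE: Two-PE Error | SPMCE: Three-PE Error | MPSCE: Two-PE Error | MPSCE: Three-PE Error | SIMPE: Two-PE Error | SIMPE: Three-PE Error | Error PE Location |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Type-8 | Parity [36] | 8 bit | P(%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | × |
| | Style-I [40] | 2^8 − 1 | P(%) | 100 | 99.895 | 100 | 100 | 100 | 100 | 100 | 100 | × |
| | Style-II (Ours) | 3 | P(%) | 100 | 85.432 | 97.878 | 99.691 | 97.878 | 99.691 | 97.878 | 99.691 | √ |
| | | 7 | P(%) | 100 | 95.227 | 99.772 | 99.898 | 99.772 | 99.898 | 99.772 | 99.898 | |
| | | 31 | P(%) | 100 | 99.012 | 99.990 | 100 | 99.990 | 100 | 99.990 | 100 | |
| Type-16 | Parity [36] | 8 bit | P(%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | × |
| | Style-I [40] | 2^16 − 1 | P(%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | × |
| | Style-II (Ours) | 3 | P(%) | 100 | 85.595 | 97.925 | 99.701 | 97.925 | 99.701 | 97.925 | 99.701 | √ |
| | | 7 | P(%) | 100 | 95.293 | 99.778 | 99.990 | 99.778 | 99.990 | 99.778 | 99.990 | |
| | | 31 | P(%) | 100 | 99.097 | 99.992 | 100 | 99.992 | 100 | 99.992 | 100 | |
| Type-32 | Parity [36] | 8 bit | P(%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | × |
| | Style-I [40] | 2^32 − 1 | P(%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | × |
| | Style-II (Ours) | 3 | P(%) | 100 | 85.597 | 97.925 | 99.701 | 97.925 | 99.701 | 97.925 | 99.701 | √ |
| | | 7 | P(%) | 100 | 95.294 | 99.778 | 99.990 | 99.778 | 99.990 | 99.778 | 99.990 | |
| | | 31 | P(%) | 100 | 99.097 | 99.992 | 100 | 99.992 | 100 | 99.992 | 100 | |
| Type-64 | Parity [36] | 8 bit | P(%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | × |
| | Style-I [40] | 2^64 − 1 | P(%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | × |
| | Style-II (Ours) | 3 | P(%) | 100 | 85.598 | 97.925 | 99.701 | 97.925 | 99.701 | 97.925 | 99.701 | √ |
| | | 7 | P(%) | 100 | 95.295 | 99.779 | 99.990 | 99.779 | 99.990 | 99.779 | 99.990 | |
| | | 31 | P(%) | 100 | 99.098 | 99.992 | 100 | 99.992 | 100 | 99.992 | 100 | |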
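One way to read the 100% SBE column is through a generic residue check. The Python sketch below is not the paper's detection circuit: it assumes a plain mod-p check on an integer value and an additive error model, and the function names are hypothetical. Under those assumptions an error e escapes only when p divides e, and a single-bit error of value ±2^i is never divisible by an odd p > 1.

```python
def residue_check_detects(error: int, p: int) -> bool:
    """A mod-p residue check flags an additive error unless p divides it.

    Assumption: the fault changes the checked value by `error`.
    """
    return error % p != 0


def single_bit_errors(width: int):
    """Yield every additive single-bit error +2^i and -2^i for i < width."""
    for i in range(width):
        yield 1 << i
        yield -(1 << i)


# Check factors used for Style-II in Table 2.
all_detected = {
    p: all(residue_check_detects(e, p) for e in single_bit_errors(64))
    for p in (3, 7, 31)
}
```

Multi-bit errors can be multiples of p (for example, an error of 3 when p = 3), which is why a residue check alone does not give 100% coverage for the other error models.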
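Table 3. Comparison of time and area overheads.

| MFM Type | Scheme | p | Operand Size (bit) | f (MHz) | Total Area (K gates) | Extra Cycles | Extra Area (K gates) | Time Overhead Ratio (%) | Area Overhead Ratio (%) | Error-Reporting Delay (ns) |
|---|---|---|---|---|---|---|---|---|---|---|
| Type-8 | Parity [36] | 8 bit | 128 | 909 | 61.06 | 3 | 29.96 | 30.82 | 96.33 | 11.00 |
| | | | 256 | | 131.16 | | 59.68 | 30.11 | 83.50 | 19.80 |
| | | | 512 | | 308.49 | | 119.12 | 29.76 | 62.90 | 37.40 |
| | | | 1024 | | 799.28 | | 237.99 | 29.59 | 42.40 | 72.60 |
| | Style-I [40] | 2^8 − 1 | 128 | 1220 | 48.84 | 2 | 17.74 | 1.09 | 57.06 | 8.50 |
| | | | 256 | | 105.70 | | 24.22 | 0.54 | 47.88 | 15.30 |
| | | | 512 | | 257.57 | | 68.20 | 0.27 | 36.01 | 28.90 |
| | | | 1024 | | 697.53 | | 136.25 | 0.13 | 24.27 | 56.10 |
| | Style-II (Ours) | 3 | 128 | 1111 | 50.89 | 2 | 19.79 | 6.65 | 63.64 | 1.80 |
| | | | 256 | 1052 | 108.84 | | 37.33 | 12.17 | 52.23 | 1.90 |
| | | | 512 | 1041 | 264.33 | | 74.96 | 13.14 | 39.59 | 1.92 |
| | | | 1024 | 1020 | 705.71 | | 150.94 | 15.40 | 27.21 | 1.96 |
| | | 7 | 128 | 1136 | 63.69 | 2 | 32.59 | 18.50 | 104.80 | 2.00 |
| | | | 256 | 980 | 136.67 | | 65.19 | 20.43 | 91.20 | 2.04 |
| | | | 512 | 980 | 319.20 | | 129.83 | 20.22 | 68.56 | 2.04 |
| | | | 1024 | 952 | 814.37 | | 258.60 | 23.64 | 46.80 | 2.10 |
| | | 31 | 128 | 800 | 83.73 | 2 | 252.63 | 248.13 | 169.24 | 2.50 |
| | | | 256 | 800 | 175.26 | | 104.78 | 47.59 | 148.68 | 2.50 |
| | | | 512 | 769 | 396.64 | | 205.27 | 53.22 | 108.40 | 2.60 |
| | | | 1024 | 769 | 970.66 | | 415.89 | 53.08 | 74.97 | 2.60 |
| Type-16 | Parity [36] | 16 bit | 128 | 769 | 59.35 | 3 | 32.44 | 49.39 | 120.53 | 7.80 |
| | | | 256 | | 122.37 | | 64.78 | 58.55 | 112.48 | 13.00 |
| | | | 512 | | 266.26 | | 129.48 | 48.14 | 94.73 | 23.40 |
| | | | 1024 | | 619.91 | | 258.88 | 47.93 | 71.71 | 44.20 |
| | Style-I [40] | 2^16 − 1 | 128 | 1136 | 45.06 | 2 | 18.15 | 0.73 | 67.45 | 5.28 |
| | | | 256 | | 97.73 | | 36.14 | 0.37 | 62.76 | 8.80 |
| | | | 512 | | 207.40 | | 70.72 | 0.18 | 51.75 | 15.84 |
| | | | 1024 | | 502.41 | | 141.38 | 0.09 | 39.16 | 29.92 |
| | Style-II (Ours) | 3 | 128 | 909 | 42.22 | 2 | 16.33 | 25.94 | 63.09 | 2.20 |
| | | | 256 | 833 | 89.89 | | 32.69 | 36.87 | 57.15 | 2.40 |
| | | | 512 | 833 | 202.20 | | 65.38 | 36.62 | 47.85 | 2.40 |
| | | | 1024 | 833 | 493.96 | | 130.77 | 36.49 | 36.01 | 2.40 |
| | | 7 | 128 | 833 | 54.89 | 2 | 29.06 | 37.39 | 112.49 | 2.40 |
| | | | 256 | 800 | 115.27 | | 58.12 | 42.57 | 101.72 | 2.50 |
| | | | 512 | 800 | 252.85 | | 116.25 | 42.31 | 85.11 | 2.50 |
| | | | 1024 | 800 | 593.55 | | 279.32 | 47.86 | 64.41 | 2.60 |
| | | 31 | 128 | 714 | 60.86 | 2 | 35.02 | 60.28 | 135.59 | 2.80 |
| | | | 256 | 714 | 127.32 | | 70.18 | 59.68 | 122.81 | 2.80 |
| | | | 512 | 689 | 276.33 | | 129.73 | 65.08 | 102.30 | 2.90 |
| | | | 1024 | 689 | 640.35 | | 279.32 | 64.92 | 77.37 | 2.90 |
| Type-32 | Parity [36] | 32 bit | 128 | 690 | 59.79 | 3 | 35.13 | 46.65 | 142.44 | 5.80 |
| | | | 256 | | 114.76 | | 64.27 | 45.82 | 122.90 | 8.70 |
| | | | 512 | | 237.73 | | 123.88 | 45.41 | 108.81 | 14.50 |
| | | | 1024 | | 517.68 | | 246.09 | 45.20 | 89.91 | 26.10 |
| | Style-I [40] | 2^32 − 1 | 128 | 1000 | 34.11 | 2 | 9.45 | 0.76 | 38.31 | 4.00 |
| | | | 256 | | 71.15 | | 19.67 | 0.38 | 38.21 | 6.00 |
| | | | 512 | | 151.91 | | 38.05 | 0.19 | 33.43 | 10.00 |
| | | | 1024 | | 346.06 | | 74.47 | 0.09 | 26.96 | 18.00 |
| | Style-II (Ours) | 3 | 128 | 800 | 41.95 | 2 | 17.51 | 25.95 | 71.62 | 2.50 |
| | | | 256 | 740 | 82.24 | | 30.82 | 35.51 | 59.94 | 2.70 |
| | | | 512 | 714 | 175.16 | | 61.30 | 40.26 | 53.85 | 2.80 |
| | | | 1024 | 714 | 395.10 | | 122.51 | 40.13 | 44.94 | 2.80 |
| | | 7 | 128 | 667 | 45.40 | 2 | 21.01 | 51.14 | 86.13 | 3.00 |
| | | | 256 | 667 | 93.43 | | 42.02 | 50.57 | 81.75 | 3.00 |
| | | | 512 | 645 | 197.89 | | 83.05 | 55.29 | 72.94 | 3.10 |
| | | | 1024 | 645 | 440.67 | | 167.95 | 55.14 | 61.59 | 3.10 |
| | | 31 | 128 | 625 | 51.40 | 2 | 27.01 | 61.22 | 110.71 | 3.20 |
| | | | 256 | 625 | 100.38 | | 48.97 | 60.60 | 95.28 | 3.20 |
| | | | 512 | 606 | 211.34 | | 97.50 | 65.31 | 85.64 | 3.30 |
| | | | 1024 | 606 | 467.46 | | 194.89 | 65.15 | 71.50 | 3.30 |
| Type-64 | Parity [36] | 64 bit | 128 | 625 | 55.64 | 3 | 31.81 | 47.13 | 133.46 | 3.20 |
| | | | 256 | | 111.73 | | 63.07 | 46.28 | 129.62 | 6.40 |
| | | | 512 | | 228.41 | | 125.61 | 45.87 | 122.18 | 9.60 |
| | | | 1024 | | 478.77 | | 250.68 | 45.66 | 109.90 | 16.00 |
| | Style-I [40] | 2^64 − 1 | 128 | 909 | 34.03 | 2 | 10.19 | 0.77 | 42.78 | 2.20 |
| | | | 256 | | 61.25 | | 12.59 | 0.38 | 25.89 | 4.40 |
| | | | 512 | | 120.81 | | 18.01 | 0.19 | 17.52 | 6.60 |
| | | | 1024 | | 256.96 | | 28.87 | 0.09 | 12.66 | 11.00 |
| | Style-II (Ours) | 3 | 128 | 800 | 39.48 | 2 | 15.64 | 14.51 | 65.66 | 2.50 |
| | | | 256 | 800 | 79.65 | | 30.99 | 14.09 | 63.70 | 2.50 |
| | | | 512 | 740 | 163.94 | | 61.14 | 22.96 | 59.47 | 2.70 |
| | | | 1024 | 740 | 360.40 | | 122.31 | 22.84 | 53.63 | 2.70 |
| | | 7 | 128 | 645 | 43.36 | 2 | 19.51 | 41.99 | 81.79 | 3.10 |
| | | | 256 | 645 | 87.30 | | 39.01 | 41.44 | 80.73 | 3.10 |
| | | | 512 | 645 | 181.10 | | 75.99 | 45.73 | 73.70 | 3.20 |
| | | | 1024 | 645 | 385.02 | | 153.23 | 45.59 | 66.90 | 3.20 |
| | | 31 | 128 | 555 | 45.90 | 2 | 18.51 | 64.89 | 92.45 | 3.60 |
| | | | 256 | 555 | 93.68 | | 45.37 | 64.26 | 93.92 | 3.60 |
| | | | 512 | 540 | 191.73 | | 88.62 | 68.50 | 85.95 | 3.70 |
| | | | 1024 | 540 | 402.41 | | 173.36 | 68.34 | 75.69 | 3.70 |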
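The derived columns of Table 3 can be related to the raw ones with a small Python sketch. The two formulas are inferred from the tabulated numbers, not quoted from the paper, and the function names are hypothetical: the area overhead ratio appears to be the extra area divided by the unprotected area (total minus extra), and for most Style-II rows the error-reporting delay appears to be the extra cycles multiplied by the clock period.

```python
def area_overhead_ratio(total_area: float, extra_area: float) -> float:
    """Extra area as a percentage of the unprotected area (assumed formula)."""
    base_area = total_area - extra_area  # area without the detection logic
    return 100.0 * extra_area / base_area


def reporting_delay_ns(extra_cycles: int, f_mhz: float) -> float:
    """Delay of the extra cycles in nanoseconds at clock frequency f_mhz."""
    return extra_cycles * 1000.0 / f_mhz


# Type-8, Parity [36], 128-bit operand: total 61.06, extra 29.96 K gates.
parity_ratio = round(area_overhead_ratio(61.06, 29.96), 2)

# Type-8, Style-II, p = 3, 128-bit operand: 2 extra cycles at 1111 MHz.
style2_delay = round(reporting_delay_ns(2, 1111), 2)
```

These two rows reproduce the tabulated 96.33% and 1.80 ns. Other rows can differ in the last digit because the table entries are rounded, and a few rows do not fit these relations exactly.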

Share and Cite

MDPI and ACS Style

Hu, X.; Qin, Z. A Secure Architecture for Modular Division over a Prime Field against Fault Injection Attacks. Appl. Sci. 2020, 10, 1700. https://doi.org/10.3390/app10051700

AMA Style

Hu X, Qin Z. A Secure Architecture for Modular Division over a Prime Field against Fault Injection Attacks. Applied Sciences. 2020; 10(5):1700. https://doi.org/10.3390/app10051700

Chicago/Turabian Style

Hu, Xiaoting, and Zhongping Qin. 2020. "A Secure Architecture for Modular Division over a Prime Field against Fault Injection Attacks" Applied Sciences 10, no. 5: 1700. https://doi.org/10.3390/app10051700
