Algebraic Persistent Fault Analysis of SKINNY_64 Based on S_Box Decomposition

Algebraic persistent fault analysis (APFA), which combines algebraic analysis with persistent fault attacks, poses new challenges to the security of lightweight block ciphers and has received widespread attention since its introduction. Threshold Implementation (TI) is one of the most widely used countermeasures against side channel attacks. Inspired by this method, in this paper we apply S_box decomposition to the SKINNY block cipher to reduce the number of variables in the set of algebraic equations and the number of Conjunctive Normal Form (CNF) equations, thus speeding up the algebraic persistent fault analysis and reducing the number of fault ciphertexts required. We first establish algebraic equations for full-round faulty encryption, and then analyze the relationship between the number of fault ciphertexts required and the solving time in two scenarios (decomposed S_boxes and original S_box). A comparison of the two sets of experimental results shows that S_box decomposition greatly improves the success rate and the efficiency of the attack. We can recover the master key within 2000 s in the best case using 11 pairs of plaintext and fault ciphertext, whereas key recovery cannot be completed in effective time using the original S_box equations. We also apply S_box decomposition to another kind of algebraic persistent fault analysis, and the experimental results show that S_box decomposition effectively reduces the solving time and improves the solving success rate under the same conditions.


Introduction
With the development of the Internet of Things (IoT) and chip technology, information security has become a key concern in practical production and daily life, and cryptography is being deployed ever more widely. For resource-constrained environments, lightweight block ciphers [1] have emerged to protect information. They have the advantages of simple structure, high efficiency, and easy implementation. Common lightweight block ciphers include PRESENT, GIFT, SKINNY [2], LED, etc. These ciphers all use a substitution-permutation network (SPN) structure, in which the S_box substitution, as the nonlinear operation, plays a crucial role in the security of the cipher. Therefore, it is important to conduct research on lightweight block ciphers and their S_boxes.
Fault attack, as a common side channel analysis method, has been receiving attention and research from scholars and experts since its inception. The main ways of fault attack [3] are voltage fault attack, electromagnetic fault attack, laser fault attack [4,5], temperature fault attack, etc. The purpose of these approaches is to alter the surroundings of the working cryptographic chip and change the encryption result by injecting an abnormal state. This concept was put forward by Boneh et al. on the RSA-CRT (Rivest-Shamir-Adleman China

Related Works
The common analysis methods for persistent faults [12,13] are mainly classical persistent fault analysis (PFA) [14,15], enhanced persistent fault analysis (EPFA) [16], and persistent fault-based collision analysis (PFCA) [17]. For the SKINNY block cipher, classical persistent fault analysis cannot recover the master key. EPFA requires 1500-1600 fault ciphertexts to recover the master key. The PFCA method needs chosen plaintexts that depend on the algorithm structure as well as a large number of fault ciphertexts, so it requires a more complex application scenario than the other methods. In order to drastically reduce the number of fault ciphertexts required by the attack, we introduce algebraic analysis methods into persistent fault analysis.
In our previous research, we performed an algebraic fault analysis [18] of the SKINNY block cipher based on S_box decomposition. By decomposing the S_box, we can represent its algebraic characteristics using a smaller number of CNF equations and variables. The experimental results show that the solving efficiency is substantially improved after S_box decomposition. Therefore, we introduce this method into algebraic persistent fault analysis to improve the speed and success rate of solving.

Our Contributions
In this paper, we propose algebraic persistent fault analysis methods for the SKINNY block cipher based on S_box decomposition. This is the first work that combines the S_box decomposition methodology with algebraic analysis to recover the master key of the SKINNY cipher under the condition of a persistent fault in the S_box. Our main contributions are as follows.
• The S_box of the SKINNY block cipher is of the four-in, four-out type, and each of the four output bits can be represented by an algebraic equation in the four input bits. When there is a fault in the S_box lookup table, the original algebraic equations no longer represent the output, and a modified set of algebraic equations is required. We give the distribution of the number of variables in the algebraic equations of the S_box by traversing all possible single faults in the S_box;
• We propose an algebraic persistent fault analysis method based on known plaintexts (KP-APFA) covering all encryption rounds. The attack is first attempted using the algebraic expression of the original faulty S_box. The experimental results show that the attack cannot complete key-recovery within the specified time;
• To achieve key-recovery of the SKINNY cipher, we introduce the S_box decomposition method and combine it with the KP-APFA method, which can solve the key within 2000 s in the best case with as few as 11 pairs of plaintext and faulty ciphertext. This reduces the number of fault samples by more than 100 times compared to the EPFA method;
• A constraint-based algebraic persistent fault analysis method was proposed by Zhang Fan et al. In this paper, S_box decomposition is combined with this method (referred to as SD-APFA), and the experimental results show that both the solving speed and the success rate within the specified time are improved; in the best case, the solving speed improves by more than 10 times.
In addition, the relationship between key residual entropy, fault depth, and number of faults is further investigated in this paper.
The remainder of this article is organized as follows. We describe the algorithmic structure of SKINNY in Section 2. Afterward, the persistent fault attack on the S_box and the known-plaintext APFA method are presented in Section 3. We introduce the S_box decomposition methodology of SKINNY and simulation experiments of KP-APFA based on S_box decomposition in Section 4. In Section 5, we apply the methodology of S_box decomposition to APFA (SD-APFA) and provide the attack results of different methods in several scenarios. We also further investigate the relationship between key residual entropy, fault depth, and the number of faulty ciphertexts. We give the experimental setup and results in Section 6, followed by the conclusions in Section 7.

Algorithmic Description of SKINNY
The SKINNY block cipher is a lightweight AES-like tweakable block cipher with a novel SPN structure, proposed by Beierle et al. at CRYPTO 2016. SKINNY is a family of tweakable block ciphers built on the tweakey framework, which is divided into six versions according to the tweakey size and block length. In this paper we choose the most common version, SKINNY_64_64, as the research object.
Each encryption round of the SKINNY block cipher consists of the operations SubCells, AddConstants, AddRoundTweakey, ShiftRows, and MixColumns. The single-round encryption process is shown in Figure 1. The number r of rounds to be performed during encryption depends on the block and tweakey sizes. For the SKINNY_64_64 version of the block cipher, the number of encryption rounds is 32.
• SubCells
SubCells is the only non-linear operation in the entire encryption process. The hexadecimal notation of the S_box is given in Table 1. The S_box is of the four-in, four-out type, and each of the four output bits is a function of the four input bits. Let the input of the S_box be $x_3 x_2 x_1 x_0$ and the output be $y_3 y_2 y_1 y_0$; the algebraic relationship between them is expressed by Equation (1). Observation of Equation (1) reveals that the original S_box algebraic equations use a total of eight quadratic and quadratic+ variables.
• AddConstants
The constants of the SKINNY block cipher are generated through a 6-bit affine LFSR (Linear Feedback Shift Register), whose state is updated as follows:
$(rc_5, rc_4, rc_3, rc_2, rc_1, rc_0) \leftarrow (rc_4, rc_3, rc_2, rc_1, rc_0, rc_5 \oplus rc_4 \oplus 1)$
The six bits are initialized to 0 and are updated before use in a given round. The bits from the LFSR are arranged into a 4 × 4 array (only the first column of the state is affected by the LFSR bits), with $c_2 = \mathtt{0x2}$ and $(c_0, c_1) = (rc_3 rc_2 rc_1 rc_0,\ 0\,0\,rc_5 rc_4)$.

• AddRoundTweakey
The first and second rows of all tweakey arrays are extracted and bitwise exclusive-ORed into the cipher internal state, respecting the array positioning. The specific subkey generation method can be found in Ref. [2].
• ShiftRows
This operation can be represented as a permutation. A permutation P is applied to the cell positions of the cipher internal state array: for all 0 ≤ i ≤ 15, the cell at position i is replaced by the cell at position P(i), where P = [0, 1, 2, 3, 7, 4, 5, 6, 10, 11, 8, 9, 13, 14, 15, 12].
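As a minimal sketch of the two linear pieces described above, the following Python code steps the 6-bit round-constant LFSR and applies the ShiftRows permutation. The function names are illustrative and not taken from the paper; the state is assumed to be a flat list of 16 four-bit cells in row-major order.

```python
# ShiftRows cell permutation quoted in the text: new_state[i] = state[P[i]].
P = [0, 1, 2, 3, 7, 4, 5, 6, 10, 11, 8, 9, 13, 14, 15, 12]


def next_round_constant(rc):
    """Advance the 6-bit affine LFSR by one step."""
    feedback = ((rc >> 5) ^ (rc >> 4) ^ 1) & 1  # rc5 XOR rc4 XOR 1
    return ((rc << 1) & 0x3F) | feedback


def round_constants(rounds):
    """Return the triples (c0, c1, c2) used in each round.

    The LFSR starts at 0 and is updated before it is used in a round.
    """
    rc, constants = 0, []
    for _ in range(rounds):
        rc = next_round_constant(rc)
        constants.append((rc & 0xF, rc >> 4, 0x2))
    return constants


def shift_rows(state):
    """Apply the cell permutation P to a list of 16 cells."""
    return [state[P[i]] for i in range(16)]
```

With this indexing, row 1 of the state is rotated right by one cell, row 2 by two, and row 3 by three, while row 0 is unchanged.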

Persistent Fault Injection in S_Box
The cryptographic community has continued to analyze the SKINNY block cipher since its inception. SKINNY has good security properties and can protect information under resource-constrained conditions. Since Zhang Fan et al.
Algorithm 1 is used to iterate through all single persistent faults of the S_box, generate the system of algebraic fault equations for each faulty S_box, and calculate the number of higher-order variables in that system of equations. It is worth stating that the statistics below count distinct variables only, without duplicates. A statistical table of the number of quadratic and quadratic+ variables in the original S_box for different persistent fault scenarios is given in Table 2.
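The traversal performed by Algorithm 1 can be sketched as follows. This is an illustrative reconstruction rather than the authors' code: it assumes the standard 4-bit SKINNY S_box table, computes the algebraic normal form (ANF) of each output bit with a Möbius transform, and counts the distinct monomials of degree two or higher over all four output bits. All function names are hypothetical.

```python
# 4-bit SKINNY S_box (assumed here to match Table 1 of the paper).
SBOX = [0xC, 0x6, 0x9, 0x0, 0x1, 0xA, 0x2, 0xB,
        0x3, 0x8, 0x5, 0xD, 0x4, 0xE, 0x7, 0xF]


def anf(truth_table):
    """Moebius transform: truth table (16 bits) -> ANF coefficients.

    Index m of the result is a bit mask of the variables in the monomial,
    e.g. m = 0b0110 stands for x2*x1.
    """
    coeffs = list(truth_table)
    for i in range(4):
        step = 1 << i
        for x in range(16):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
    return coeffs


def high_order_monomials(sbox):
    """Distinct monomials of degree >= 2 over all four output bits."""
    monomials = set()
    for bit in range(4):
        coeffs = anf([(sbox[x] >> bit) & 1 for x in range(16)])
        monomials |= {m for m in range(16)
                      if coeffs[m] and bin(m).count("1") >= 2}
    return monomials


def single_fault_counts(sbox):
    """Count high-order monomials for every single-entry fault."""
    counts = {}
    for index in range(16):
        for value in range(16):
            if value != sbox[index]:
                faulty = list(sbox)
                faulty[index] = value
                counts[(index, value)] = len(high_order_monomials(faulty))
    return counts
```

Under these assumptions the fault-free S_box yields eight such monomials, and the fault that changes entry 2 from 9 to A yields seven, in agreement with the figures quoted in the text.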
Algorithm 1: Pseudocode for calculating the number of higher-order variables in a system of equations for an S_box
Generate the equations of S*;
According to previous research experience, the more quadratic and higher-degree variables the S_box equations contain, the slower the solving speed. Therefore, when injecting a fault into the look-up table, the number of intermediate variables in the modified equation set should be as small as possible in order to improve the solving efficiency. When the persistent fault injected into the S_box lookup table changes F[2] = 9 into F[2] = A, the number of quadratic and quadratic+ variables is 7, and the corresponding system of equations changes accordingly.
Algorithm 2 is used to generate the faulty ciphertext. P and f are the inputs to Algorithm 2, representing the plaintext and the fault, respectively. The output of Algorithm 2 is the fault ciphertext C*. Algorithm 2 is equivalent to simulating a persistent fault injection experiment.
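A software simulation in the spirit of Algorithm 2 might look like the sketch below. It is not the authors' implementation: it assumes the public SKINNY-64-64 specification (single tweakey word TK1, 32 rounds), models the persistent fault as one overwritten S_box table entry, and uses hypothetical function names throughout.

```python
SBOX = [0xC, 0x6, 0x9, 0x0, 0x1, 0xA, 0x2, 0xB,
        0x3, 0x8, 0x5, 0xD, 0x4, 0xE, 0x7, 0xF]
P_SR = [0, 1, 2, 3, 7, 4, 5, 6, 10, 11, 8, 9, 13, 14, 15, 12]  # ShiftRows
P_TK = [9, 15, 8, 13, 10, 14, 12, 11, 0, 1, 2, 3, 4, 5, 6, 7]  # tweakey


def inject_fault(sbox, index, value):
    """Return a copy of the table with one persistently faulted entry."""
    faulty = list(sbox)
    faulty[index] = value
    return faulty


def encrypt(plaintext, key, sbox=SBOX, rounds=32):
    """SKINNY-64-64 encryption on lists of 16 nibbles (row-major)."""
    state, tk, rc = list(plaintext), list(key), 0
    for _ in range(rounds):
        # SubCells: the only step affected by the persistent fault.
        state = [sbox[cell] for cell in state]
        # AddConstants: update the LFSR, then XOR into the first column.
        rc = ((rc << 1) & 0x3F) | (((rc >> 5) ^ (rc >> 4) ^ 1) & 1)
        state[0] ^= rc & 0xF
        state[4] ^= rc >> 4
        state[8] ^= 0x2
        # AddRoundTweakey: XOR the first two tweakey rows, then permute.
        for i in range(8):
            state[i] ^= tk[i]
        tk = [tk[P_TK[i]] for i in range(16)]
        # ShiftRows.
        state = [state[P_SR[i]] for i in range(16)]
        # MixColumns with the binary SKINNY matrix.
        mixed = [0] * 16
        for c in range(4):
            a0, a1, a2, a3 = state[c], state[4 + c], state[8 + c], state[12 + c]
            mixed[c] = a0 ^ a2 ^ a3
            mixed[4 + c] = a0
            mixed[8 + c] = a1 ^ a2
            mixed[12 + c] = a0 ^ a2
        state = mixed
    return state


def from_hex(text):
    return [int(ch, 16) for ch in text]


def to_hex(cells):
    return "".join(format(cell, "x") for cell in cells)
```

A faulty ciphertext C* for the fault discussed above would then be obtained as `encrypt(P, K, inject_fault(SBOX, 2, 0xA))`; the fault-free path can be checked against the published SKINNY-64-64 test vector before the sketch is relied upon.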

Algorithm 2: The fault ciphertext generation of SKINNY_64
Input: P, f
Output: C*
In Algorithm 3, we give the pseudocode for algebraic persistent fault analysis based on known plaintexts, referred to as KP-APFA. The input N of the algorithm represents the number of faults, i.e., the number of plaintext and fault ciphertext pairs. The output of the algorithm is the solving time. For the SKINNY block cipher, there is a fixed algebraic relationship between the subkey of each round and the master key. We represent the faulty encryption process in the form of algebraic equations and convert all the useful information into a system of CNF equations. Then, all the CNF equations are combined to solve for the key. In the algorithm, T_sol = RunAPFA() denotes running CryptoMiniSAT and recording the key-recovery time.
We use CryptoMiniSAT for key-recovery, which requires converting the algebraic equations into CNF equations. For correct encryption, the S_box layer of each round can be represented with 192 variables and 480 CNF equations. The round constant is generated by a 6-bit affine LFSR and can be represented with 6 variables and 6 CNF equations. The operations AddConstants, subkey generation, AddRoundTweakey, ShiftRows, and MixColumns of each round can together be represented with 320 variables and 320 CNF equations.
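To illustrate why the number of quadratic and higher-degree monomials drives the size of the CNF system, the sketch below shows one common way to encode a single monomial z = x_1 x_2 ... x_k as clauses (a Tseitin-style AND encoding). This is an assumption about the general technique, not the exact encoding used in the paper, and the helper names are hypothetical. Clauses are lists of signed integers in the DIMACS convention.

```python
def monomial_to_cnf(z, inputs):
    """Clauses forcing the new variable z to equal AND(inputs).

    A degree-k monomial costs one extra variable and k + 1 clauses.
    """
    clauses = [[-z, v] for v in inputs]          # z implies every factor
    clauses.append([z] + [-v for v in inputs])   # all factors imply z
    return clauses


def satisfied(clauses, assignment):
    """Check a CNF under an assignment mapping variable index -> bool."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
```

For example, `monomial_to_cnf(4, [1, 2, 3])` produces four clauses that hold exactly when variable 4 equals the product of variables 1, 2, and 3. For scale, the 192 variables and 480 CNF equations quoted above correspond to 12 variables and 30 clauses per S_box if they are spread evenly over the 16 cells of a round.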
We use the above method to build the set of equations for persistent algebraic fault analysis. Experiments were set up using 16, 18, 20, and 30 random plaintexts, respectively, and, to make the results representative, 50 samples were randomly generated for each set of experiments. The maximum solving time is set to 1 h, and a run is judged to have failed if it exceeds 1 h. The results show that none of the samples completes key-recovery within the specified time; therefore, key-recovery cannot be completed by using the original S_box directly. We need to improve the expression of the S_box algebraic equations.

S_Box Decomposition of SKINNY
This section gives the general flow of algebraic persistent fault analysis based on S_box decomposition. First, the S_box in the target algorithm is decomposed and a suitable decomposition scheme is selected. Then, all possible single persistent faults of the S_box are traversed and a suitable fault injection scheme is selected. Finally, fault injection is simulated to generate the faulty ciphertexts, the algebraic system of equations of the cipher and other useful information are built, and key-recovery is carried out using CryptoMiniSAT. The specific flowchart is given in Figure 2.
In our previous research on algebraic fault analysis of SKINNY_64, we found that S_box decomposition can greatly improve the speed of key-recovery. We can decompose the original cubic S_box into two quadratic S_boxes to reduce the number of CNF clauses in the set of algebraic equations and the number of quadratic and quadratic+ intermediate variables introduced by the nonlinear S_box operation. A schematic diagram of the decomposition is given in Figure 3.
In 2011, Poschmann et al. proposed a technique to decompose a cubic S_box function into two quadratic functions. This relation can be expressed as $S(X) = H(F(X))$, where $S, F, H : GF(2)^4 \to GF(2)^4$. Consider the input and output of $F(X)$ as 4-bit vectors $X = (x, y, z, w)$ and $F(X) = (f_0(X), f_1(X), f_2(X), f_3(X))$. Each $f_i$, as a quadratic Boolean function, can be represented in ANF as the following equation, where $a_i$, $a_{ij}$ are the binary coefficients of the Boolean function:
$f_i(x, y, z, w) = a_0 + a_1 x + a_2 y + a_3 z + a_4 w + a_{12} xy + a_{13} xz + a_{14} xw + a_{23} yz + a_{24} yw + a_{34} zw$
As discussed in the literature [10], in order to reduce the overall search space for the two decomposed functions F and H, the following two facts are used in this paper:
• Rewriting $S(X) = H(F(X))$ as $S(F^{-1}(X)) = H(X)$, one needs to search only over all possible quadratic functions $F(X)$. Each candidate is then used to compute the other quadratic function $H(X)$ as $S(F^{-1}(X))$.
• Rewriting $S(X) = H(F(X))$ as $S(X) = H'(F'(X))$, where $F'(X) = F(X) + F(0)$ and $H'(X) = H(X + F(0))$, we can assume that $F(0) = 0$ and obtain the other decompositions directly by substituting the 15 nonzero values for $F(0)$. Therefore, we only need to vary the 10 nonconstant coefficients in the ANF, and the search space is reduced to $(2^{10})^4 = 2^{40}$.
With the above refinements, the actual quadratic Boolean equation used for the search is shown below.
$f_i(x, y, z, w) = a_1 x + a_2 y + a_3 z + a_4 w + a_{12} xy + a_{13} xz + a_{14} xw + a_{23} yz + a_{24} yw + a_{34} zw$ (7)
A sequence of steps was implemented to compute the desired optimized quadratic Boolean functions for F and H. We followed the method provided in the literature [16] for screening the candidates and finally obtained the optimal decomposition scheme. The selected F(X) and H(X), satisfying all three TI requirements, namely Correctness, Non-Completeness, and Uniformity, are shown in Table 3. The ANFs of the first S_box can be represented by Equation (8).
The ANFs of the second S_box can be represented by Equation (9).
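The relation S(X) = H(F(X)) and the rule H(X) = S(F^{-1}(X)) can be checked on a concrete candidate. The sketch below does not reproduce the F and H of Table 3; as an assumption for illustration, it takes F to be the first two of the four iterations that define the SKINNY 4-bit S_box in its public specification, derives H from S and F^{-1}, and verifies that both halves are quadratic. It does not normalize F(0) to 0, and all names are hypothetical.

```python
SBOX = [0xC, 0x6, 0x9, 0x0, 0x1, 0xA, 0x2, 0xB,
        0x3, 0x8, 0x5, 0xD, 0x4, 0xE, 0x7, 0xF]


def two_iterations(x):
    """Candidate F: two rounds of 'x0 ^= NOR(x3, x2)' then rotate left."""
    for _ in range(2):
        x ^= (((x >> 3) | (x >> 2)) & 1) ^ 1
        x = ((x << 1) & 0xF) | (x >> 3)
    return x


def anf(truth_table):
    """Moebius transform of a 16-entry truth table."""
    coeffs = list(truth_table)
    for i in range(4):
        for x in range(16):
            if x & (1 << i):
                coeffs[x] ^= coeffs[x ^ (1 << i)]
    return coeffs


def high_order_monomials(table):
    """Distinct monomials of degree >= 2 over the four output bits."""
    found = set()
    for bit in range(4):
        coeffs = anf([(table[x] >> bit) & 1 for x in range(16)])
        found |= {m for m in range(16)
                  if coeffs[m] and bin(m).count("1") >= 2}
    return found


def degree(table):
    """Algebraic degree of a 4-bit vectorial Boolean function."""
    return max((bin(m).count("1") for m in high_order_monomials(table)),
               default=1)


F = [two_iterations(x) for x in range(16)]
F_INV = [F.index(u) for u in range(16)]          # F is a bijection
H = [SBOX[F_INV[u]] for u in range(16)]          # H(X) = S(F^{-1}(X))
```

For this particular candidate, each of F and H contains two distinct quadratic monomials, which is consistent with the reduction from 8 to 2 + 2 reported for the decomposition, although the functions selected in the paper may differ.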
By looking at Equations (8) and (9), we find that by decomposing the S_box, the number of quadratic and quadratic+ intermediate variables introduced changes from the original 8 to 2 + 2. Next, we perform the persistent fault injection on the first S_box. The number of quadratic and quadratic+ intermediate variables is calculated by traversing all fault injection cases. The specific experimental results are shown in Table 4.
[Table 4: number of quadratic and quadratic+ intermediate variables for single faults in the first decomposed S_box; surviving row: 3 3 3 3 3 3 3 3 3 - 3 3 3 3 3]
According to the traversal results given in Table 4, a fault is selected for the first S_box; Table 5 is the decomposed fault S_box1. The corresponding algebraic equation for the fault S_box is given in Equation (10). For the original S_box, this persistent fault is equivalent to the fault shown in Table 6, which is the faulty original S_box. The corresponding algebraic equation for the fault of the original S_box is shown in Equation (11).
We use the algebraic equations of the decomposed S_boxes and of the original S_box, respectively, for persistent algebraic fault analysis experiments. First, we set the number of plaintext and faulty ciphertext pairs to 30, 20, 18, and 16, respectively. Fifty samples are randomly generated in each scenario, and the two methods are compared under the same fault conditions. The maximum solving time of the solver is set to 1 h, and the attack is judged to have failed after 1 h. Table 7 gives the average solving time and success rate of the two methods for the different scenarios. The experimental results show that the key can be solved in effective time with S_box decomposition, while no experiment completes key-recovery in effective time with the original S_box expression.
Figure 5 gives a histogram of the distribution of the solving time when using S_box decomposition in the different scenarios, where the horizontal axis represents the solving time and the vertical axis represents the frequency. When the number of faulty ciphertexts is 30, the average solving time is 784.9 s. When the number of faulty ciphertexts is reduced to 20 and 18, the average solving time decreases. This may be because a larger number of faulty ciphertexts produces more algebraic equations and hence redundant information, so that solving takes longer than with 20 or 18 faulty ciphertexts. As the number of faulty ciphertexts decreases further, the average solving time increases again. When the number of faulty ciphertexts is 16, the average solving time reaches 450.6 s. From the histogram, we find that 48% of the samples can be solved for the key within 400 s.
Table 8 shows that when the time limit is 2 h and the number of ciphertexts is 12, the success rate is 54%. When the maximum solving time is relaxed to 10 h, the success rate for the same experimental samples reaches 92%, with a shortest solving time of 459.5 s. In order to explore the minimum number of ciphertexts with which key-recovery can be achieved, the number of ciphertexts is further reduced to 11, 10 samples are randomly selected, the maximum solving time is set to 10 h, and the attack is judged to have failed after 10 h. The experimental results show that 50% of the samples can be solved within the specified time, and the shortest solving time is 1930.9 s. Figure 6 gives the histogram of the solving time distribution for 14, 13, and 12 faulty ciphertexts. From Figure 6, we find that as the number of faults decreases from 14 to 12, the proportion of samples that complete key-recovery within 450 s decreases from 64% to 10%, while the proportion of samples that do not complete key-recovery within 7200 s increases from 6% to 46%.

Applying Another Persistent Algebraic Analysis of SKINNY
A method for persistent algebraic fault analysis [19] was proposed by Fan Zhang et al. in 2022 and was applied there to the SKINNY block cipher. Inspired by that work, this paper analyzes the relationship between the number of faulty ciphertexts, the fault depth, and the key residual entropy on the basis of the original research. We replace the original S_box with the decomposed S_boxes and analyze the differences between the two methods in the same scenario. In the literature [19], Zhang Fan et al. proposed a constraint-based method to solve the problem of unsolvability due to information redundancy in full-round encryption. The method is briefly described in the following.
Let V denote the original value of the faulty entry of the S_box S. In the r-th round function (1 ≤ r ≤ 32), $X^r$ is the input of the round and $Y^r = MC(SR(K^r \oplus AC(SC'(X^r))))$. $Y^r$ contains the information of both $SC'(X^r)$ and $K^r$, which means that we can use $Y^r$ to add new constraints on $K^r$, as shown in Equation (12).
In Equation (12), $X^r_i$ and $\tilde{Y}^r_i$ are the 4 bits of $X^r$ and $MC^{-1}(SR^{-1}(Y^r))$, respectively. $\tilde{K}^r_i$ are the 4 bits of $AC^{-1}(K^r)$, and V is a constant. Based on this inequality relation, we establish constraints so as to reduce the key search space. The reader can find the specific steps in the literature [19].
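The effect of such an inequality constraint can be illustrated on a single 4-bit cell. Equation (12) is not reproduced here, so the sketch assumes the usual persistent-fault form: since the value V no longer appears at the output of the faulty S_box, every observed cell $\tilde{Y}$ rules out the key candidate $\tilde{Y} \oplus V$. The helper names and the toy setting (one cell, one key nibble) are illustrative only.

```python
SBOX = [0xC, 0x6, 0x9, 0x0, 0x1, 0xA, 0x2, 0xB,
        0x3, 0x8, 0x5, 0xD, 0x4, 0xE, 0x7, 0xF]


def inject_fault(sbox, index, value):
    """Return a copy of the table with one persistently faulted entry."""
    faulty = list(sbox)
    faulty[index] = value
    return faulty


def surviving_key_nibbles(observations, v):
    """Keep each candidate k for which no observed y satisfies y ^ k == v."""
    return {k for k in range(16)
            if all((y ^ k) != v for y in observations)}


# Toy experiment: the entry with original value V = 9 is overwritten by A.
V = SBOX[2]
FAULTY = inject_fault(SBOX, 2, 0xA)
SECRET = 0x7                                   # hypothetical key nibble
observed = [FAULTY[x] ^ SECRET for x in range(16)]
remaining = surviving_key_nibbles(observed, V)
```

When the S_box inputs cover all 16 values, only the correct key nibble survives; with fewer observations the surviving set is larger but always contains the correct value, which is why the constraints shrink the key search space without ever excluding the right key.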
Algorithm 4 gives the pseudo-code of algebraic persistent fault analysis of the SKINNY block cipher based on S_box decomposition. The inputs of the algorithm are N, the number of faulty ciphertexts, and FD, the fault depth; the output T_sol is the solving time of the solver. First, the algorithm randomly generates N plaintexts and uses Algorithm 2 to obtain the set of faulty ciphertexts, and all faulty ciphertexts are transformed into CNF equations. Next, we express the relationship between the subkey of each round and the master key as algebraic equations and transform them into CNF equations. Based on the input fault depth, we build an algebraic representation of FD rounds of the encryption process. We use the two decomposed S_boxes instead of the original S_box and transform the algebraic equations of each of them. Similarly, we translate AddConstants, AddRoundTweakey, ShiftRows, MixColumns, and the constraints into algebraic equations. Finally, we combine all the sets of CNF equations and use T_sol = RunSD-APFA() for key-recovery. The solver is allowed to run for 1 h, and the attack is considered to have failed if this limit is exceeded.
Fifty samples are randomly generated for each set of experiments. Table 9 gives the average solving time and the success rate within the specified time for the two methods in different scenarios, where N represents the number of faulty ciphertexts and FD represents the fault depth. When N = 30 and FD = 4, solving with the new S_box algebraic representation is about 18 times faster than with the original method. When the number of faulty ciphertexts is 20, 18, 17, 16, and 14, both methods are able to recover the key within the specified time, and the solving speed with the new S_box algebraic equations is higher than with the original method. When the number of faulty ciphertexts is 13 and 12, both the solving success rate and the average solving time of the new method are better than those of the original method. These experimental results show that the solving speed and solving success rate can be effectively improved by using the new S_box algebraic expression.
the object of study and analyze the relationship between the solved key and the correct key. If they are the same, the key is determined to be unique, and if they are different, the key residual entropy is determined to be greater than 0. In other words, the value of the key is not unique.
We analyze and discuss the relationship between the number of faulty ciphertexts, the fault depth, the key residual entropy, and the success rate of solving in a specified time under the new method. From Table 10, we find that when the number of faulty ciphertexts is 30 and the fault depth is 6, the first solution obtained for all 50 samples is equal to the correct key, and the key residual entropy is determined to be 0 in this experiment. When the fault depth is reduced to 4, only 84% of the samples obtain a first solution equal to the correct key, while the remaining 16% obtain a first solution that differs from the correct key. As the number of faulty ciphertexts decreases, the following relationship between the fault depth and the key residual entropy emerges: when the fault depth is shallow, the key residual entropy cannot reach 0, and the fault depth must be increased for the key residual entropy to be 0.
Figure 7 gives the relationship between the number of faulty ciphertexts and the minimum fault depth at which the key residual entropy is 0 in the SKINNY_64 block cipher. The minimum fault depth decreases gradually as the number of fault ciphertexts increases; in other words, there is an inverse relationship between the minimum fault depth and the number of fault ciphertexts. When the number of faulty ciphertexts is small and the fault depth is shallow, there are fewer constraints on the key information, so more keys satisfy the algebraic equations and the first solution given by the solver may differ from the correct key. Deepening the fault depth adds more constraints on the key to the generated set of algebraic equations, which increases the likelihood that the correct key is the unique solution.

Experimental Setup and Results
In this section, we will introduce our setup of the experiments and the comparison results with a variety of existing methods.

Experimental Setup
In our experiment, we simulate the fault injection via software and use CryptoMiniSAT v5.8.0 as the solver for the algebraic equations, running under Ubuntu 18.04.5 on Windows. We implement the experiments on a PC that has 16 GB of memory and an Intel(R) Core(TM) i5-9500 CPU at 3 GHz. The operating system is 64-bit Windows 10.
Table 11 compares a variety of existing methods for persistent fault analysis of SKINNY_64. The table gives the relationship between precondition, fault depth, minimum number of faults, and key residual entropy for the different methods. From Table 11 we can see that with the PFA method, the key cannot be fully recovered because the fault depth is 1; the residual entropy of the key is 32. The EPFA method uses multiple rounds of faulty information and can recover the entire key, but a minimum of 1500-1600 faulty ciphertexts is required to complete key-recovery. The KP-APFA method based on S_box decomposition proposed in this paper can complete key-recovery using a minimum of 11 faults; unlike the other methods, it requires the plaintexts to be provided. The SD-APFA method proposed in this paper, which is based on APFA, also requires a minimum of 10 fault ciphertexts, but it improves the solving speed and solving success rate in the same attack scenarios, as shown in Figure 8 and Table 9.
Figure 8. Comparison of the solving time with the two methods in different scenarios.

Conclusions
In this paper, we combine S_box decomposition methods commonly found in the field of threshold implementation with algebraic analysis. A more suitable persistent fault injection scheme is found from the perspective of algebraic analysis. Under the condition of known plaintexts, we have conducted several sets of experiments on the KP-APFA method based on S_box decomposition, and the average solving time and solving success rate are given for different numbers of faults. Key-recovery can be completed within 2000 s in the best case using as few as 11 faults. This paper also combines S_box decomposition with the APFA method, and the resulting SD-APFA method is significantly improved in both solving speed and success rate. In addition, we discuss the relationship between key residual entropy, number of faults, and fault depth for the different methods. In summary, simplifying the system of equations that describes the S_box substitution by means of the S_box decomposition technique improves the solving speed of CryptoMiniSAT and thus the efficiency of the attacker in persistent fault analysis. In future work, we will apply and generalize this approach to other lightweight block ciphers.