Chosen Plaintext Combined Attack against SM4 Algorithm

Abstract: The SM4 algorithm is widely used to ensure the security of data transmission. Traditional chosen plaintext power attacks against SM4 usually need to analyze the power traces of four rounds in turn to recover the secret key. In this paper, we propose a new combined chosen plaintext power analysis, which combines the chosen plaintext power attack with the differential characteristics of the substitution box (S-box) in SM4. In our attack, only the S-box outputs of the second and fourth rounds of SM4 are used as attack points, and some sensitive fixed intermediate values are obtained by power analysis when specific plaintexts are input. Differential analysis of these sensitive intermediate values is then carried out to calculate the input and output differences of the S-box, and the key is recovered from the differential characteristics of the S-box. Compared with the traditional chosen plaintext power analysis, which requires four rounds of analysis, our analysis reduces the number of attack rounds to two and adopts the nonlinear S-box output, which has obvious leakage, as the attack intermediate value, which effectively improves the feasibility of the attack. Finally, a practical attack experiment is carried out on a Field Programmable Gate Array (FPGA) based implementation of SM4, and the results show that our method is feasible and effective in real experiments.


Introduction
Since Kocher et al. proposed differential power analysis (DPA) at CRYPTO '99 [1], power analysis has rapidly become a research hotspot in the implementation security of cryptographic algorithms. The basic principle of power analysis is to collect side-channel leakage such as timing, power consumption and electromagnetic radiation while a cryptographic device performs sensitive operations (such as encryption, decryption and key transmission), and to build a Hamming weight or Hamming distance leakage model of the key or other sensitive information. The relationship between the model and the leakage is then evaluated by statistical methods to extract the key or sensitive information. Power analysis methods mainly include DPA, Correlation Power Analysis (CPA) [2][3][4], Template Attack (TA) [5][6][7][8], and Mutual Information Analysis (MIA) [9].
The SM4 cryptographic algorithm is a commercial block cipher algorithm published in China in 2006 [10]. It officially became an ISO/IEC international standard in 2021 and is widely used in government departments, power, finance and other network information systems to ensure the security of data transmission. Therefore, it is very important to analyze its implementation security.

Contributions
In this paper, we propose a new round-reduced chosen plaintext power analysis against SM4 which combines chosen plaintext attack and differential analysis. After two rounds of analysis, the initial 128-bit key of SM4 can be completely recovered. Compared to the traditional chosen plaintext attacks [13][14][15][16], our attack has the following advantages:
(1) Our attack can recover two round keys simultaneously in one round of analysis. In the previous chosen plaintext power analysis, only one round key can be recovered in one round of analysis, so rounds 1-4 must all be analyzed. In our attack, only the S-box outputs of round 2 (or 4) are selected as the attack intermediate values, and the chosen plaintext attack is carried out by inputting special plaintexts. This determines some fixed values related to the first and second round keys (or the third and fourth round keys). Then, by employing the differential characteristics of the S-box, we can further determine $2^4$ candidates for the two round keys with near 100% probability in one round of analysis.
(2) Our attack is more feasible and simpler in experiments. As mentioned above, our attack reduces the number of rounds to be analyzed. Correspondingly, we only need to collect power traces twice, while the traditional attacks need to collect them four times. Furthermore, if we improve the method (see Section 3.3), i.e., guess all $2^4$ candidates of the round keys derived by differential analysis and recalculate the correlation coefficients to distinguish the correct ones, the required number of traces decreases by one third and the key search space complexity is reduced. This makes the attack experiments more feasible.
(3) The target selected in our attack has stronger power leakage. All of the previous attacks targeted linear operations, such as the XOR operation that produces the round output, as the leakage points, whereas our attack targets the nonlinear operation, i.e., the output of the S-box. For the same unprotected implementation, the leakage of the S-box is obviously greater than that of the linear operations. This means our attack experiments can succeed more easily due to the stronger power leakage.

Preliminaries
This section mainly introduces the SM4 algorithm, the current chosen plaintext power analysis and differential analysis methods for the SM4 algorithm.

SM4 Algorithm
As shown in Figure 1, the encryption operation is carried out in units of 32-bit words; one iteration is called a round, and there are 32 rounds in total. Assume that the input is $(X_0, X_1, X_2, X_3) \in (Z_2^{32})^4$ and the round key is $rk_i \in Z_2^{32}$, $i = 0, 1, \ldots, 31$. The round function F can be expressed as follows.
$$X_{i+4} = F(X_i, X_{i+1}, X_{i+2}, X_{i+3}, rk_i) = X_i \oplus T(X_{i+1} \oplus X_{i+2} \oplus X_{i+3} \oplus rk_i)$$
Round transformation T: $Z_2^{32} \rightarrow Z_2^{32}$ is an invertible transformation, which is composed of the nonlinear transformation τ and the linear transformation L, and can be expressed as $T(\cdot) = L(\tau(\cdot))$.
Nonlinear transformation τ: τ is composed of 4 parallel S-boxes; each S-box is a permutation with 8-bit input and 8-bit output, denoted as Sbox(.). Assume the input is $A = (a_0, a_1, a_2, a_3) \in (Z_2^8)^4$ and the output is $B = (b_0, b_1, b_2, b_3) \in (Z_2^8)^4$. Then B can be expressed as follows.
$$B = (b_0, b_1, b_2, b_3) = \tau(A) = (Sbox(a_0), Sbox(a_1), Sbox(a_2), Sbox(a_3))$$
Linear transformation L: The output of the nonlinear transformation τ is the input of the linear transformation L. Let the input be $B \in Z_2^{32}$ and the output be $C \in Z_2^{32}$; then C can be expressed as follows, where $\lll n$ denotes a left rotation of a 32-bit word by n bits.
$$C = L(B) = B \oplus (B \lll 2) \oplus (B \lll 10) \oplus (B \lll 18) \oplus (B \lll 24)$$
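The round structure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names are ours, the S-box is passed in as a table, and the identity table used at the end is a placeholder rather than the SM4 S-box. The byte $a_0$ is taken as the most significant byte of the 32-bit word.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def L(b):
    """Linear transformation L: XOR of B with four rotated copies of B."""
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24)

def tau(a, sbox):
    """Nonlinear transformation: apply the S-box table to each of the 4 bytes."""
    return int.from_bytes(bytes(sbox[x] for x in a.to_bytes(4, "big")), "big")

def T(a, sbox):
    """Round transformation T(.) = L(tau(.))."""
    return L(tau(a, sbox))

def round_f(x0, x1, x2, x3, rk, sbox):
    """One round: X_{i+4} = X_i xor T(X_{i+1} xor X_{i+2} xor X_{i+3} xor rk_i)."""
    return x0 ^ T(x1 ^ x2 ^ x3 ^ rk, sbox)

# Placeholder only (assumption): the identity permutation stands in for the
# real S-box table, which must be taken from the SM4 specification.
IDENTITY_SBOX = list(range(256))
```

Because L is an XOR of rotations, it is linear over XOR, which is the property the differential analysis in Section 3.2 relies on.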

Chosen Plaintext Power Analysis for SM4
Reference [14] describes the chosen plaintext attack on SM4 as follows. First, special plaintexts satisfying certain constraints are selected, so that the output res of the L transformation is fixed. Then, the round output $X_{i+4}$ is selected as the attack target ($X_{i+4} = X_i \oplus res$, where $X_i$ is a known random value and res is a fixed unknown value); the fixed value res is obtained through power analysis, and the round key is then deduced. The key of SM4 can be recovered by executing the chosen plaintext attack on the first four rounds successively. The attack on the first round is taken as an example:
1. Select plaintexts satisfying the constraints, so that res is fixed;
2. Choose the output $X_4$ ($X_4 = X_0 \oplus res$) of the first round as the attack point, perform CPA and recover res;
3. Derive the round key $rk_0$.
The attack on rounds 2-4 is similar to that on the first round: one round key is recovered each time, and the initial key is finally recovered through the key expansion.
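The last step of the first-round attack can be written out explicitly. The derivation below is a sketch under the assumption that the plaintext constraint is that $M = X_1 \oplus X_2 \oplus X_3$ is fixed and known (the symbol M is introduced here for illustration); it uses only the invertibility of τ and L.

```latex
\begin{align*}
res &= T(M \oplus rk_0) = L\bigl(\tau(M \oplus rk_0)\bigr), \qquad M = X_1 \oplus X_2 \oplus X_3,\\
rk_0 &= \tau^{-1}\bigl(L^{-1}(res)\bigr) \oplus M .
\end{align*}
```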

Differential Characteristics of SM4 Algorithm S-box
In reference [11], a differential fault attack based on random bytes was proposed for SM4 by using the differential characteristics of its S-box, which are described as follows. For the SM4 algorithm, let $x_{j,i}$ denote the input byte of the j-th S-box in round i, where j indexes the four S-boxes and $i \in \{0, \ldots, 31\}$. Let $\Delta A_i = (\Delta a_{0,i}, \Delta a_{1,i}, \Delta a_{2,i}, \Delta a_{3,i})$ be the input difference of the S-boxes in round i. (Note: Different from the difference definition in reference [11], the difference in this paper is defined as the XOR of the S-box inputs in round i when two different plaintexts are encrypted.) Similarly, let $\Delta B_i = (\Delta b_{0,i}, \Delta b_{1,i}, \Delta b_{2,i}, \Delta b_{3,i})$ be the output difference of the S-boxes in round i, let $\Delta C_i$ be the corresponding output difference of the L transformation, and let
$$\Phi(\Delta a_{j,i}, \Delta b_{j,i}) = \{x \in Z_2^8 : Sbox(x) \oplus Sbox(x \oplus \Delta a_{j,i}) = \Delta b_{j,i}\},$$
that is, the set of S-box input values for which the input and output differences of the S-box in round i are $\Delta a_{j,i}$ and $\Delta b_{j,i}$, respectively. In addition, let the inverse transform of L be $L^{-1}$. If $\Delta A_i$ and $\Delta C_i$ are known, $\Delta B_i$ can be derived through $\Delta B_i = L^{-1}(\Delta C_i)$, and $\Phi(\Delta a_{j,i}, \Delta b_{j,i})$ can also be constructed. As mentioned in reference [11], if $\Phi(\Delta a_{j,i}, \Delta b_{j,i})$ is a non-empty set and the attacker knows $\Delta a_{j,i}$ and $\Delta b_{j,i}$, then $x_{j,i}$ has at most 4 candidate values. Moreover, there are 2 candidate values with probability 99.2% and 4 candidate values with probability 0.8%. Based on these differential characteristics, two different pairs of input and output differences must be known for each S-box in order to recover the round key of SM4 uniquely.
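The candidate sets Φ and the 2-or-4 behaviour can be illustrated with a short Python sketch. Since the SM4 S-box table is not reproduced in this paper, the sketch uses a stand-in S-box built from field inversion in GF(2^8) with the AES reduction polynomial; this is an assumption made for illustration, chosen because inversion-based S-boxes show the same 2-or-4 structure. The function names are ours.

```python
from collections import Counter

def gf_mul(a, b, poly=0x11B):
    """Multiply two elements of GF(2^8) modulo the given reduction polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_inv(x):
    """Field inverse via x^254, with 0 mapped to 0."""
    if x == 0:
        return 0
    y = 1
    for _ in range(254):
        y = gf_mul(y, x)
    return y

# Stand-in S-box (assumption): field inversion, NOT the SM4 table.
SBOX = [gf_inv(x) for x in range(256)]

def phi(da, db, sbox=SBOX):
    """All inputs x with sbox[x] ^ sbox[x ^ da] == db."""
    return [x for x in range(256) if sbox[x] ^ sbox[x ^ da] == db]

def size_histogram(sbox=SBOX):
    """Histogram of |Phi(da, db)| over all non-empty sets with da != 0."""
    hist = Counter()
    for da in range(1, 256):
        per_db = Counter(sbox[x] ^ sbox[x ^ da] for x in range(256))
        hist.update(per_db.values())
    return hist

if __name__ == "__main__":
    hist = size_histogram()
    total = sum(hist.values())
    for size in sorted(hist):
        print(size, f"{100 * hist[size] / total:.1f}%")
```

For this stand-in, every non-empty set has 2 or 4 elements, and sets of size 4 make up 1/127 (about 0.8%) of the non-empty sets, which matches the probabilities quoted above.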

Methodologies
It can be seen from Section 2.2 that chosen plaintext power analysis can only obtain one round key in each round of analysis. To recover the initial key of SM4, four rounds of analysis are needed to obtain four round keys. In order to improve the chosen plaintext power analysis, we utilize the differential characteristics of SM4 (see Section 2.3 for details). Thereby, it is only necessary to analyze the 2nd and 4th round of SM4 encryption to recover the whole initial key. The whole combined attack can be divided into two parts. Firstly, the intermediate value of the second and fourth round is obtained by round reduction chosen plaintext power analysis. Then the initial key is determined by differential analysis using S-box differential characteristics. The following two parts of the combined attack are introduced in turn.

Round Reduction-Based Chosen Plaintext on SM4
Step 1: Chosen plaintext. For N encryption operations, the plaintext input of each encryption must satisfy the requirement that $X_0$ is a random value and $X_1 \oplus X_2 \oplus X_3 = M_0$ ($M_0$ is a fixed value). That is, the first four bytes of the plaintext block are random in each encryption, and the last 12 bytes form three 32-bit words whose XOR is fixed.
According to the round operation of SM4, the S-box input $A_1 = (a_{0,1}, a_{1,1}, a_{2,1}, a_{3,1})$ and the round output of the first round satisfy the following conditions:
$$A_1 = X_1 \oplus X_2 \oplus X_3 \oplus rk_0 = M_0 \oplus rk_0, \qquad X_4 = X_0 \oplus T(A_1)$$
For the second round, the S-box input $A_2 = (a_{0,2}, a_{1,2}, a_{2,2}, a_{3,2})$ and the S-box output $B_2$ satisfy the following conditions:
$$A_2 = X_2 \oplus X_3 \oplus X_4 \oplus rk_1 = (X_0 \oplus X_2 \oplus X_3) \oplus T(A_1) \oplus rk_1, \qquad B_2 = \tau(A_2)$$
Step 2: Power analysis. Let $V_1 = T(A_1) \oplus rk_1$; then $V_1$ is a fixed value and $A_2 = (X_0 \oplus X_2 \oplus X_3) \oplus V_1$, where $X_0 \oplus X_2 \oplus X_3$ is a known random value. The S-box output $B_2$ of the second round over the N encryption operations is selected as the attack target (intermediate value) for CPA (where N, the number of encryption operations, should be large enough for the CPA to succeed). The value of $V_1$ can thus be obtained, and the equation for $rk_0$ and $rk_1$ is expressed as follows.
$$V_1 = T(M_0 \oplus rk_0) \oplus rk_1$$
Since this single equation contains two unknowns, $rk_0$ and $rk_1$, the round key bytes cannot be determined yet.
1. Repeat for the first time.
Reselect N groups of plaintexts for encryption such that $X'_0$ (the first 4 bytes of the plaintext) is a random value, $M'_0 = X'_1 \oplus X'_2 \oplus X'_3$ is still fixed, and $M'_0 \neq M_0$. Let the S-box inputs of round 1 and round 2 be $A'_1 = (a'_{0,1}, a'_{1,1}, a'_{2,1}, a'_{3,1})$ and $A'_2$, let $X'_4$ be the round output of round 1, and let $B'_2$ be the S-box output of round 2; then
$$A'_1 = M'_0 \oplus rk_0, \qquad X'_4 = X'_0 \oplus T(A'_1), \qquad A'_2 = (X'_0 \oplus X'_2 \oplus X'_3) \oplus T(A'_1) \oplus rk_1, \qquad B'_2 = \tau(A'_2)$$
Let $V_2 = T(A'_1) \oplus rk_1$ (a fixed value); $X'_2 \oplus X'_3 \oplus X'_0$ is a known random value. Select the S-box output as the attack target and conduct CPA on the above N groups of data to recover the value of $V_2$.

2. Repeat for the second time.
Similarly, by choosing a third set of plaintexts, the following formula can be obtained:
$$V_3 = T(M''_0 \oplus rk_0) \oplus rk_1$$
where $M''_0 = X''_1 \oplus X''_2 \oplus X''_3$ is a fixed value different from $M_0$ and $M'_0$.
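The CPA step used to recover each byte of the fixed values $V_1$, $V_2$ and $V_3$ can be sketched as follows. This is a simulation under stated assumptions, not the experimental setup of this paper: leakage is modelled as the Hamming weight of the S-box output plus Gaussian noise, the S-box is a stand-in built from GF(2^8) inversion (not the SM4 table), and all function names are ours. The known byte u plays the role of one byte of $X_0 \oplus X_2 \oplus X_3$, and v is the corresponding unknown byte of $V_1$.

```python
import random

def gf_mul(a, b, poly=0x11B):
    """Multiply two elements of GF(2^8) modulo the given reduction polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_inv(x):
    """Field inverse via x^254, with 0 mapped to 0."""
    y = 1
    for _ in range(254):
        y = gf_mul(y, x)
    return y if x else 0

# Stand-in S-box (assumption): field inversion, NOT the SM4 table.
SBOX = [gf_inv(x) for x in range(256)]

def hw(x):
    """Hamming weight of a byte."""
    return bin(x).count("1")

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def simulate_traces(v, n, noise=1.0, seed=0):
    """Simulated leakage HW(Sbox(u ^ v)) + noise for n random known bytes u."""
    rng = random.Random(seed)
    known = [rng.randrange(256) for _ in range(n)]
    traces = [hw(SBOX[u ^ v]) + rng.gauss(0.0, noise) for u in known]
    return known, traces

def cpa_recover_byte(known, traces):
    """Return the guess of v whose predicted leakage correlates best."""
    return max(range(256),
               key=lambda g: pearson([hw(SBOX[u ^ g]) for u in known], traces))

if __name__ == "__main__":
    known, traces = simulate_traces(v=0x5A, n=400, seed=1)
    print(hex(cpa_recover_byte(known, traces)))
```

In a real attack the single simulated sample would be replaced by the measured sample points around the second-round S-box operation, and the correlation would be computed per sample point.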

Differential Analysis
Differential analysis is carried out on the chosen plaintext data above, which comprise three different plaintext inputs. The first-round S-box input difference $\Delta A_1$ (the XOR of the S-box inputs under the first and the second plaintext inputs) satisfies the following formula:
$$\Delta A_1 = A_1 \oplus A'_1 = M_0 \oplus M'_0$$
Similarly, the first-round S-box input difference $\Delta A'_1$ obtained from the first and the third plaintext inputs satisfies
$$\Delta A'_1 = A_1 \oplus A''_1 = M_0 \oplus M''_0$$
Accordingly, Equation (17) shows that the differences $\Delta C_1$ and $\Delta C'_1$ at the output of the first-round L transformation satisfy the following formulas, respectively.
$$\Delta C_1 = T(A_1) \oplus T(A'_1) = V_1 \oplus V_2, \qquad \Delta C'_1 = T(A_1) \oplus T(A''_1) = V_1 \oplus V_3$$
The S-box output differences $\Delta B_1$ and $\Delta B'_1$ of the first round can be obtained by applying the inverse operation $L^{-1}$ to $\Delta C_1$ and $\Delta C'_1$:
$$\Delta B_1 = L^{-1}(\Delta C_1), \qquad \Delta B'_1 = L^{-1}(\Delta C'_1)$$
According to the differential definition of the S-box in Section 2.3, given the input and output differences $(\Delta A_1, \Delta B_1)$ and $(\Delta A'_1, \Delta B'_1)$ of the four S-boxes in the first round, and the S-box inputs $(M_0 \oplus rk_0)$ and $(M'_0 \oplus rk_0)$, where $M_0$ and $M'_0$ are known values, the number of candidate values of each byte $rk_{j,0}$ (j = 0, 1, 2, 3) is two or four. The probability is 99.2% for two values and 0.8% for four values. In the following analysis, we suppose that the round key byte has two candidate values. If there are four candidate values (which has very low probability), the attack is considered to have failed and is carried out again with different inputs. The correct key byte lies in the intersection of the results of the two differential analyses: each analysis yields a set of two candidate values (with 99.2% probability), and the value common to the two sets is the correct round key byte $rk_{j,0}$. The four bytes of $rk_0$ are recovered in turn in this way, and $rk_1$ is then recovered from $V_1$. Finally, the correctness of $rk_0$ and $rk_1$ can be further verified by substituting them into the equation set (17).
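The recovery of $rk_0$ and $rk_1$ from the three fixed values can be sketched end to end in Python. The sketch makes several assumptions that are not part of this paper: the S-box is a stand-in built from GF(2^8) inversion (not the SM4 table), the function names are ours, and the values of $V_1$, $V_2$, $V_3$ are computed directly instead of being measured by CPA. The inverse $L^{-1}$ is written as an XOR of rotations, which can be checked against L by direct composition.

```python
import random

def gf_mul(a, b, poly=0x11B):
    """Multiply two elements of GF(2^8) modulo the given reduction polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_inv(x):
    """Field inverse via x^254, with 0 mapped to 0."""
    y = 1
    for _ in range(254):
        y = gf_mul(y, x)
    return y if x else 0

# Stand-in S-box (assumption): field inversion, NOT the SM4 table.
SBOX = [gf_inv(x) for x in range(256)]

def rotl32(x, n):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def L(b):
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24)

def L_inv(c):
    """Inverse of L, again an XOR of rotated copies of the input."""
    out = 0
    for n in (0, 2, 4, 8, 12, 14, 16, 18, 22, 24, 30):
        out ^= rotl32(c, n)
    return out

def T(a):
    s = int.from_bytes(bytes(SBOX[x] for x in a.to_bytes(4, "big")), "big")
    return L(s)

def fixed_value(m, rk0, rk1):
    """V = T(M xor rk0) xor rk1, the value that CPA would recover."""
    return T(m ^ rk0) ^ rk1

def phi(da, db):
    return {x for x in range(256) if SBOX[x] ^ SBOX[x ^ da] == db}

def recover_round_keys(m, v):
    """m = (M0, M0', M0''), v = (V1, V2, V3); returns (rk0, rk1) or None."""
    da1, da2 = m[0] ^ m[1], m[0] ^ m[2]
    db1, db2 = L_inv(v[0] ^ v[1]), L_inv(v[0] ^ v[2])
    rk0 = 0
    for shift in (24, 16, 8, 0):
        cands = (phi((da1 >> shift) & 0xFF, (db1 >> shift) & 0xFF)
                 & phi((da2 >> shift) & 0xFF, (db2 >> shift) & 0xFF))
        if len(cands) != 1:      # ambiguous byte: treat the attack as failed
            return None
        x = cands.pop()          # S-box input byte of A_1 = M0 xor rk0
        rk0 |= (x ^ ((m[0] >> shift) & 0xFF)) << shift
    return rk0, v[0] ^ T(m[0] ^ rk0)
```

The three words in m must differ from each other in every byte, and the two byte-wise differences must also differ, otherwise the intersection of the two candidate sets does not shrink to a single value.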
After $rk_0$ and $rk_1$ are recovered, the same method as above is used to select plaintext inputs such that the XOR of the last 12 bytes (3 words) of the third-round input is fixed, that is, $M = X_3 \oplus X_4 \oplus X_5$ is a fixed value, while the first 4 bytes $X_2$ are random, where $X_4$ and $X_5$ satisfy the following formulas.
$$X_4 = X_0 \oplus T(X_1 \oplus X_2 \oplus X_3 \oplus rk_0), \qquad X_5 = X_1 \oplus T(X_2 \oplus X_3 \oplus X_4 \oplus rk_1)$$
The chosen plaintext input can be determined through derivation. For example, if $X_2$ is selected as random, $X_0 = X_1 = X_2$, and $X_3$ is a fixed value, then the input of the third round is guaranteed to meet the condition. Similarly, combining the chosen plaintext power analysis of Section 3.1 with the differential analysis of Section 3.2 recovers the values of $rk_2$ and $rk_3$ in turn.
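The example plaintext choice above can be checked numerically. The following Python sketch is illustrative only: the S-box is a simple stand-in permutation (not the SM4 table), the key and plaintext constants in the usage lines are arbitrary, and the function names are ours. The check works for any round transformation T, because with $X_0 = X_1 = X_2$ the random word cancels inside both T evaluations.

```python
import random

# Stand-in 8-bit permutation (assumption): NOT the SM4 S-box. The property
# checked below does not depend on the particular S-box.
SBOX = [(7 * x + 3) & 0xFF for x in range(256)]

def rotl32(x, n):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def T(a):
    s = int.from_bytes(bytes(SBOX[x] for x in a.to_bytes(4, "big")), "big")
    return s ^ rotl32(s, 2) ^ rotl32(s, 10) ^ rotl32(s, 18) ^ rotl32(s, 24)

def third_round_input(r, c, rk0, rk1):
    """Third-round input (X2, X3, X4, X5) for plaintext X0 = X1 = X2 = r, X3 = c."""
    x0 = x1 = x2 = r
    x3 = c
    x4 = x0 ^ T(x1 ^ x2 ^ x3 ^ rk0)
    x5 = x1 ^ T(x2 ^ x3 ^ x4 ^ rk1)
    return x2, x3, x4, x5

def observed_xors(c, rk0, rk1, trials=100, seed=0):
    """Set of values X3 xor X4 xor X5 seen over random choices of r."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(trials):
        _, x3, x4, x5 = third_round_input(rng.getrandbits(32), c, rk0, rk1)
        seen.add(x3 ^ x4 ^ x5)
    return seen

if __name__ == "__main__":
    # Arbitrary illustrative constants; a single value is expected in the set.
    print(observed_xors(0x01234567, 0xA3B1BAC6, 0x56AA3350))
```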
In summary, by selecting different plaintext inputs, taking the S-box output of the second and fourth rounds as the attack objects, and combining with differential analysis, the key of the first four rounds of SM4 encryption algorithm can be obtained. Finally, the initial key of SM4 can be recovered by the key expansion algorithm.

Complexity Analysis and Further Improvement
As mentioned above, our attack consists of two steps. In the first step, i.e., the chosen plaintext attack, three CPAs are carried out in each round of analysis in order to construct two differential relations between the input and output of the S-box and thus determine the round keys. There are $2^8$ candidate values for each byte of the sensitive intermediate value, and 3 groups of traces need to be analyzed in each round of analysis. Hence, the total key search space complexity for recovering the four round keys is $6 \times 4 \times 2^8$. Meanwhile, the number of traces needed for our attack is $6 \times N$, where N is the number of traces needed for each CPA.
To make our attack more feasible in experiments, we make the following improvement. We first discuss the analysis of the 2nd round. Instead of three CPAs, we carry out only two, which gives one differential relation. Since CPA is carried out byte by byte, each byte of $rk_0$ then has two candidate values with near 100% probability. Consequently, there are $2^4$ candidate values for $rk_0$. Moreover, $rk_1$ also has $2^4$ candidate values, in one-to-one correspondence with those of $rk_0$, since $rk_1$ is determined by $rk_0$. Unlike the analysis of Section 3.1, we continue to analyze the 2nd round and guess the $2^4$ candidate values of $rk_0$ and $rk_1$. Based on the guessed round keys, we recalculate the correlation coefficients between the S-box output and the traces. The round key candidate with the maximum coefficient is the correct one. Then, we carry out another two CPAs in the analysis of the 4th round with known $rk_0$ and $rk_1$. Likewise, a similar differential analysis is carried out in the 4th round, and $2^4$ candidate values of $rk_2$ and $rk_3$ are obtained. The correct values of $rk_2$ and $rk_3$ are again those with the maximum coefficient when the candidate values are guessed and the correlation coefficients are recalculated.
With this improvement, only 4 main CPAs are carried out and the number of traces for analysis is reduced to $4 \times N$. Moreover, the key search space complexity decreases to $4 \times 2^8 + 2^4 \times 2$. To sum up, our attack has obvious advantages in both the number of traces needed and the time complexity. This makes our attack more practical and feasible in experiments.
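For concreteness, the figures quoted above evaluate as follows; this is plain arithmetic on the expressions given in this section, with N left symbolic.

```latex
\begin{align*}
\text{original search space:}\quad & 6 \times 4 \times 2^{8} = 6144,\\
\text{improved search space:}\quad & 4 \times 2^{8} + 2^{4} \times 2 = 1024 + 32 = 1056,\\
\text{reduction in traces:}\quad & \frac{6N - 4N}{6N} = \frac{1}{3}.
\end{align*}
```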

Limitations
As mentioned in Section 3.3, our attack incorporates a differential technique and is more feasible in experiments; however, some limitations remain.
Firstly, as introduced in Section 3.2, we suppose that each round key byte has only two candidate values. In fact, a round key byte can have four candidates, with a probability of 0.8%. When four candidate values exist, the attack is considered to have failed and is carried out again with different inputs. If we instead guess all four candidates in the analysis, the complexity increases. This is what we will study and verify in the future. Secondly, when the implementation of SM4 has masking countermeasures and the S-box is masked with random numbers (a very common case), our attack fails. For masked implementations, we will further consider combining template attacks and collision attacks.

Experiments
For the combined attack above, we carried out experimental verification on an SM4 implementation on an FPGA chip, mainly to verify the feasibility and effectiveness of the attack.

Experimental Environment
The SM4 algorithm is implemented on a SAKURA-G FPGA test board, and the Riscure power analysis suite is used for our attack, including the Inspector software for analysis and a hardware oscilloscope for acquisition. The whole analysis process is shown in Figure 2 and includes the following three steps.
(1) The PC delivers the plaintext to the SAKURA-G FPGA test board; the test board performs the SM4 encryption operation and generates a trigger signal at the same time.
(2) The PC sends control instructions to the oscilloscope to collect the power traces leaked by the SM4 encryption operation; the oscilloscope sends the collected data to the PC for saving.

Attack Instances
In the experiment, the second round (the input of the first round needs to be controlled, so that $rk_0$ and $rk_1$ are recovered) is selected as the analysis target for the attack examples. The analysis of the fourth round (the input of the third round needs to be controlled) is similar to that of the second round.
Based on the above experimental environment, three groups of power traces A, B and C (1000 traces per group) are collected, and the plaintext inputs of the traces need to satisfy the following requirements:
Group A: $M_0 = X_1 \oplus X_2 \oplus X_3$ is a fixed value, $X_0$ is a random value;
Group B: $M'_0 = X'_1 \oplus X'_2 \oplus X'_3$ is a fixed value, $X'_0$ is a random value;
Group C: $M''_0 = X''_1 \oplus X''_2 \oplus X''_3$ is a fixed value, $X''_0$ is a random value.
As shown in Figure 3, a power trace collected for group A includes the plaintext input, 32 obvious peaks, and the ciphertext output; each peak represents one round operation of SM4. The second peak (whose intermediate value is the S-box output of the second round) is selected for the attack. With 1000 power traces, the correlation coefficient results of the attack show four obvious peaks, which respectively represent the correlation between the correct guesses of the four bytes of $V_1$ and the sample points of the power traces. Therefore, the correct $V_1$ can be determined. Similarly, the power traces of groups B and C are analyzed successively to recover $V_2$ and $V_3$. Using $V_1$, $V_2$, $V_3$ and the chosen plaintext values, the input and output differences of the S-box are calculated, the round key $rk_0$ of the first round is recovered, and then $rk_1$ is deduced. Alternatively, following the improvement of Section 3.3, we can use two of the three values $V_1$, $V_2$, $V_3$ together with the chosen plaintext values to calculate 16 candidate values for $rk_0$, and recalculate the correlation coefficients between the S-box output and the traces; the candidate with the maximum coefficient is the correct $rk_0$, and then $rk_1$ is deduced.
Based on the round keys $rk_0$ and $rk_1$ of the first and second rounds recovered above, the plaintext input can be chosen so that the input of the third round meets the attack conditions. The three groups of traces are collected again, the S-box output of the fourth round is selected as the attack target, and the round keys $rk_2$ and $rk_3$ are obtained. Finally, the 128-bit initial key is completely recovered through the SM4 key expansion algorithm.

Comparison with other Attack Methods
Compared with the previous chosen plaintext attacks, the combined round-reduced attack in this paper has obvious advantages in the number of rounds that must be attacked, the selection of attack points and the number of trace collections. The attack on SM4 encryption is used as an example for comparison.
As shown in Table 1, our combined attack halves the number of attack rounds and only needs to collect traces twice, which is significantly fewer than the number of plaintext selections in previous attacks, thus improving the efficiency of the attack. In addition, whereas previous attacks target the linear XOR, the L transformation or the round output, our attack chooses the output of the S-box as the attack point, which effectively improves the SNR and the success rate of the attack. Furthermore, the total number of traces required for recovering the round keys of the first four rounds in our attack, i.e., $4 \times N$ (N is the number of traces for a single successful attack), is obviously smaller than that ($16 \times N$) of the previous chosen plaintext attacks [13,15,16]. Although the total number of traces in Reference [14] is also $4 \times N$, it needs to collect traces four times, while our combined attack only needs to attack twice and collect traces twice, reducing collection time and attack time. Finally, the key search space complexity is smaller than that of the previous chosen plaintext attacks [13][14][15][16].

Conclusions
In this paper, we proposed a method that uses chosen plaintext power analysis of SM4 to improve the efficiency of existing power analysis of SM4. The method reduces the number of attack rounds, the number of plaintext selections, and the key search space, and it selects the nonlinear S-box output as the attack point. This method applies not only to the first four rounds of SM4 encryption, but also to the first four rounds of SM4 decryption. Moreover, it can be directly applied to attacks on other block ciphers whose S-boxes have similar differential features, such as AES. Another possibility for future work is to combine other cryptanalysis methods with side channel attacks, such as combining power analysis and algebraic analysis.