In this section, we apply our STA to two NTRU implementations. For each case, we first describe the implementation, then present our STA, and finally report the experimental results of the attack. Because the purpose of the attack is to recover the private key, only the decryption routine of each implementation is described in this paper.
3.1. NTRU Open Source
The two essential parts of an NTRU implementation are the way polynomials are stored and the polynomial multiplication.
Representing Polynomials
To store a polynomial f of the private key, NTRU Open Source stores only the degrees of the indeterminate x whose coefficients are $\mathtt{1}$ or $-\mathtt{1}$. Because the additions are driven only by the degrees with coefficient $\mathtt{1}$ or $-\mathtt{1}$, the degrees with coefficient $\mathtt{0}$ need not be stored. Thus, the private key array first stores all degrees whose coefficient is $\mathtt{1}$ and then all degrees whose coefficient is $-\mathtt{1}$. For example, if $f={x}^{3}-x+1$, then the array of f is $\{\mathtt{0},\mathtt{3},\mathtt{1}\}$. A general polynomial is stored so that the coefficient of the ith degree is the ith element of an array. For example, the polynomial $e=\mathtt{3}{x}^{4}-{x}^{2}+\mathtt{9}x-\mathtt{5}$ is represented as $\{-\mathtt{5},\mathtt{9},-\mathtt{1},\mathtt{0},\mathtt{3}\}$.
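The two storage forms can be illustrated with a short Python sketch. The helper names `to_index_array` and `from_index_array`, and the explicit count `n_plus` of degrees with coefficient 1, are assumptions made for this illustration and are not taken from NTRU Open Source.

```python
def to_index_array(coeffs):
    """Pack a trinary coefficient list (list index = degree) into the sparse
    form: degrees with coefficient +1 first, then degrees with coefficient -1.
    Degrees with coefficient 0 are not stored."""
    plus = [d for d, c in enumerate(coeffs) if c == 1]
    minus = [d for d, c in enumerate(coeffs) if c == -1]
    return plus + minus, len(plus)


def from_index_array(indices, n_plus, n):
    """Rebuild the dense coefficient list of length n from the sparse form."""
    coeffs = [0] * n
    for d in indices[:n_plus]:
        coeffs[d] += 1
    for d in indices[n_plus:]:
        coeffs[d] -= 1
    return coeffs


f = [1, -1, 0, 1]              # f = x^3 - x + 1, coefficient of x^i at index i
b, n_plus = to_index_array(f)
print(b)                       # [0, 3, 1]
```

The round trip `from_index_array(b, n_plus, 4)` returns the dense list again, which shows that no information is lost by omitting the zero coefficients.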
Polynomial Multiplication
For efficiency, the private key is set as $f=pF+1$, and F is split into three trinary polynomials, $F={F}_{1}\cdot {F}_{2}+{F}_{3}$ with ${F}_{1},{F}_{2},{F}_{3}\in {\mathcal{L}}_{F}$. The advantage of splitting F is that it lowers the Hamming weight of the polynomials, so that the multiplication can be sped up [13,30]. Consequently, taking the order of multiplication into account, the decryption of NTRU Open Source is performed as in Equation (7). The computation of Equation (7) is given in Algorithm 1, and the polynomial multiplication it uses is given in Algorithm 2.
Algorithm 1 Decryption in NTRU Open Source
Require: The trinary polynomials ${F}_{1},{F}_{2},{F}_{3}$ and $e\in R$ with degree N ▷ ${F}_{1},{F}_{2},{F}_{3}$ are the private key polynomials satisfying $f=1+p({F}_{1}\cdot {F}_{2}+{F}_{3})$
Ensure: message $m=f\cdot e\ (\mathrm{mod}\ q)=(1+p({F}_{1}\cdot {F}_{2}+{F}_{3}))\cdot e\ (\mathrm{mod}\ q)$
1: $t\leftarrow \mathrm{Algorithm}\ 2({F}_{1},e)$ ▷ Algorithm 2 is polynomial multiplication
2: $t\leftarrow \mathrm{Algorithm}\ 2({F}_{2},t)$
3: $u\leftarrow \mathrm{Algorithm}\ 2({F}_{3},e)$
4: for $0\le i<N$ do
5:     ${v}_{i}\leftarrow ({t}_{i}+{u}_{i})\ (\mathrm{mod}\ q)$ ▷ add t and u
6: end for
7: for $0\le i<N$ do
8:     ${m}_{i}\leftarrow ({e}_{i}+p\ast {v}_{i})\ (\mathrm{mod}\ q)$ ▷ ∗ is a word multiplication
9: end for
10: return m
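The order of operations in Algorithm 1 can be checked numerically with a minimal Python sketch. It uses a plain dense cyclic multiplication (`cyclic_mul`, a stand-in chosen for this illustration, not the sparse routine of the implementation) and compares the product-form computation with a direct multiplication by $f=1+p({F}_{1}\cdot {F}_{2}+{F}_{3})$. All function names and the toy parameters are assumptions.

```python
def cyclic_mul(a, b, q):
    """Dense multiplication in Z_q[x]/(x^N - 1); list index = degree."""
    N = len(a)
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % N] = (out[(i + j) % N] + ai * bj) % q
    return out


def decrypt_product_form(F1, F2, F3, e, p, q):
    """Follow steps 1 to 10 of Algorithm 1."""
    t = cyclic_mul(F1, e, q)                      # step 1: t = F1 * e
    t = cyclic_mul(F2, t, q)                      # step 2: t = F2 * (F1 * e)
    u = cyclic_mul(F3, e, q)                      # step 3: u = F3 * e
    v = [(ti + ui) % q for ti, ui in zip(t, u)]   # steps 4-6
    return [(ei + p * vi) % q for ei, vi in zip(e, v)]  # steps 7-9


def decrypt_direct(F1, F2, F3, e, p, q):
    """Build f = 1 + p(F1*F2 + F3) explicitly, then multiply by e."""
    F = [(x + y) % q for x, y in zip(cyclic_mul(F1, F2, q), F3)]
    f = [(p * c) % q for c in F]
    f[0] = (f[0] + 1) % q
    return cyclic_mul(f, e, q)


# Toy parameters (hypothetical, far smaller than any real parameter set).
F1, F2, F3 = [1, 0, -1, 0, 0], [0, 1, 0, 0, -1], [1, -1, 0, 0, 0]
e, p, q = [7, 1, 0, 3, 2], 3, 2048
m = decrypt_product_form(F1, F2, F3, e, p, q)
```

Both routines return the same list, because multiplication in the ring is associative and distributive; the product form only changes how many additions are needed.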

The input b of Algorithm 2 is arranged so that the degrees with coefficient $\mathtt{1}$ are stored first, in ascending order, followed by the degrees with coefficient $-\mathtt{1}$. The polynomial multiplication starts with the first degree whose coefficient equals $-\mathtt{1}$ and adds the ciphertext to the zero-initialized array. Since the result must be reduced modulo $({x}^{N}-1)$, the implementation performs the additions from that position up to index $(N-1)$ and then continues from the $\mathtt{0}$th element of the array once the index reaches N. After all degrees with coefficient $-\mathtt{1}$ have been processed, the sign of the accumulated array is reversed, and the same steps are repeated for the degrees with coefficient $\mathtt{1}$. Lastly, the (mod $q)$ operation is performed by a bitwise AND (∧) with $(q-1)$, since q is a power of $\mathtt{2}$.
Algorithm 2 Polynomial Multiplication during NTRU Open Source Decryption
Require: Polynomial $e\in R$ with degree N and private key array b ▷ b holds the information of private key F
Ensure: $H=F\cdot e\ (\mathrm{mod}\ q)$
1: for $i=0;\ i<N;\ i$++ do
2:     ${t}_{i}\leftarrow 0$
3: end for
4: for $j={d}_{F}+1;\ j<2{d}_{F}+1;\ j$++ do ▷ private key has ${d}_{F}$ coefficients equal to −1
5:     $k\leftarrow {b}_{j}$
6:     for $i=0;\ k<N;\ i$++, $k$++ do
7:         ${t}_{k}\leftarrow {t}_{k}+{e}_{i}$
8:     end for
9:     for $k=0;\ i<N;\ i$++, $k$++ do
10:         ${t}_{k}\leftarrow {t}_{k}+{e}_{i}$
11:     end for
12: end for
13: for $i=0;\ i<N;\ i$++ do ▷ negate, because the above process is for −1
14:     ${t}_{i}\leftarrow -{t}_{i}$
15: end for
16: for $j=0;\ j<{d}_{F}+1;\ j$++ do ▷ private key has ${d}_{F}+1$ coefficients equal to 1
17:     $k\leftarrow {b}_{j}$
18:     for $i=0;\ k<N;\ i$++, $k$++ do
19:         ${t}_{k}\leftarrow {t}_{k}+{e}_{i}$
20:     end for
21:     for $k=0;\ i<N;\ i$++, $k$++ do
22:         ${t}_{k}\leftarrow {t}_{k}+{e}_{i}$
23:     end for
24: end for
25: for $i=0;\ i<N;\ i$++ do
26:     ${H}_{i}\leftarrow {t}_{i}\ (\mathrm{mod}\ q)$ ▷ when q is a power of 2, $\wedge (q-1)$ works for mod q
27: end for
28: return H
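Algorithm 2 can be transcribed into Python as follows. This is a sketch for illustration, not the NTRU Open Source code: the parameter `n_plus` (the number of leading entries of b with coefficient 1) replaces the fixed bounds ${d}_{F}+1$ and $2{d}_{F}+1$, and `reference_multiply` is a hypothetical dense routine used only to check the result.

```python
def add_shifted(t, e, k):
    """Add x^k * e into t modulo x^N - 1 with the two inner loops."""
    N = len(e)
    i = 0
    while k < N:            # steps 6-8 and 18-20: indices k .. N-1
        t[k] += e[i]
        i += 1
        k += 1
    k = 0
    while i < N:            # steps 9-11 and 21-23: wrap around to index 0
        t[k] += e[i]
        i += 1
        k += 1


def sparse_multiply(b, n_plus, e, q):
    """H = F * e mod (x^N - 1, q), with F given by the index array b."""
    t = [0] * len(e)                    # steps 1-3
    for k in b[n_plus:]:                # steps 4-12: degrees with coefficient -1
        add_shifted(t, e, k)
    t = [-x for x in t]                 # steps 13-15: reverse the sign
    for k in b[:n_plus]:                # steps 16-24: degrees with coefficient +1
        add_shifted(t, e, k)
    return [x & (q - 1) for x in t]     # steps 25-27: mod q for q a power of 2


def reference_multiply(F, e, q):
    """Dense cyclic convolution, used only as a cross-check."""
    N = len(e)
    out = [0] * N
    for j, c in enumerate(F):
        for i, ei in enumerate(e):
            out[(i + j) % N] += c * ei
    return [x % q for x in out]


e = list(range(1, 12))                      # toy ciphertext, N = 11
b = [0, 3, 5]                               # F = 1 + x^3 - x^5
F = [1, 0, 0, 1, 0, -1, 0, 0, 0, 0, 0]
H = sparse_multiply(b, 2, e, 2048)
```

The number of iterations of each inner loop in `add_shifted` depends only on the secret index k, which is the property the attack below exploits.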

3.1.1. Proposed Method
The idea behind the attack is that the correlation between power consumption traces of the same operation is higher than the correlation between traces of different operations. Let the reference trace R be the power trace obtained during a single addition operation, and let O denote the subtraces of the power consumption trace of Algorithm 2. Computing the correlation coefficient between R and each O over the execution of Algorithm 2 and plotting the resulting values gives a graph like Figure 1. The peaks, called high peaks herein, mark the positions where O closely matches R. We then recover the private key polynomial by calculating the distances between the high peaks.
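The correlation step can be sketched in Python on a synthetic trace. The Pearson coefficient is assumed here as the correlation measure, and the sample values, the threshold of 0.99, and the function names are all hypothetical choices for the illustration; real traces are noisy and need a lower, empirically chosen threshold.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def sliding_correlation(trace, reference):
    """Correlate the reference R with every subtrace O of the same length."""
    length = len(reference)
    return [pearson(trace[s:s + length], reference)
            for s in range(len(trace) - length + 1)]


def find_high_peaks(corr, threshold=0.99):
    """Offsets where the subtrace matches the reference almost exactly."""
    return [s for s, c in enumerate(corr) if c >= threshold]


# Synthetic example: the pattern of one "addition" occurs at offsets 0, 5, 13.
R = [0.0, 3.0, 1.0, 4.0, 1.0]
trace = R + R + [2.0, 2.0, 2.0] + R
peaks = find_high_peaks(sliding_correlation(trace, R))
distances = [b - a for a, b in zip(peaks, peaks[1:])]
print(peaks, distances)     # [0, 5, 13] [5, 8]
```

The second distance is larger because three extra samples separate the second and third occurrences, which mimics the irregular interval the attack looks for.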
As shown in Algorithm 2, the additions in steps 4 to 12 and steps 16 to 24 depend on the private value. For example, suppose $N=\mathtt{11}$ and let $\mathtt{5}$ be the first degree whose coefficient equals $-\mathtt{1}$. Then steps 6 to 8 are repeated 6 times and steps 9 to 11 are repeated 5 times. At the moment when execution passes from the first inner loop to the second, the distance between consecutive high peaks differs from the usual distance. Thus, if the stored value is x, the interval between the $(N-x)$th and $(N-x+1)$th high peaks differs from the others. We can therefore recover every stored value by applying the same procedure to the coefficients $-\mathtt{1}$ and $\mathtt{1}$.
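The recovery rule can be illustrated with a toy timing model in Python. The model is an assumption made for this sketch: every addition is taken to produce one high peak at a fixed spacing `step`, and the switch between the two inner loops is taken to add a fixed `overhead`. Neither value comes from a measurement.

```python
from collections import Counter


def simulate_peaks(N, x, step=10, overhead=4):
    """Peak positions for one stored index x under the toy timing model."""
    position, peaks = 0, []
    for n in range(N):
        if n == N - x:              # first inner loop has run N - x times
            position += overhead    # extra time spent switching loops
        peaks.append(position)
        position += step
    return peaks


def recover_index(peaks, N):
    """Return x from the single interval that differs from the usual one."""
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    usual = Counter(gaps).most_common(1)[0][0]
    odd = [i for i, g in enumerate(gaps) if g != usual]
    if len(odd) != 1:
        return None                 # no unique irregular interval found
    # gaps[i] lies between the (i+1)th and (i+2)th peak (1-indexed),
    # and the irregular gap follows the (N - x)th peak.
    return N - (odd[0] + 1)


x = recover_index(simulate_peaks(11, 5), 11)
print(x)    # 5
```

With $N=11$ and stored value 5, the irregular interval follows the 6th peak, which matches the 6 and 5 loop iterations of the example above.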
3.1.2. Experiment
Figure 2 is a full trace of NTRU Open Source ported to a KLASCARF AVR, captured with a Lecroy HDO6104A oscilloscope at a 250 M sampling rate [12,31]. The parameters for the experiment are $N=\mathtt{50}$, ${d}_{{F}_{1}}=\mathtt{8}$, ${d}_{{F}_{2}}=\mathtt{8}$, ${d}_{{F}_{3}}=\mathtt{6}$, and the private key is as follows.
$$\begin{array}{rl} b= & \{\mathtt{0x03},\mathtt{0x01},\mathtt{0x1e},\mathtt{0x11},\mathtt{0x05},\mathtt{0x06},\mathtt{0x1a},\mathtt{0x0e},\mathtt{0x13},\mathtt{0x01},\mathtt{0x28},\mathtt{0x23},\mathtt{0x10},\mathtt{0x29},\mathtt{0x22},\mathtt{0x0c},\\ & \mathtt{0x07},\mathtt{0x08},\mathtt{0x0b},\mathtt{0x15},\mathtt{0x1b},\mathtt{0x25},\mathtt{0x2e},\mathtt{0x2c},\mathtt{0x18},\mathtt{0x21},\mathtt{0x17},\mathtt{0x2f},\mathtt{0x19},\mathtt{0x04},\mathtt{0x30},\mathtt{0x00},\\ & \mathtt{0x02},\mathtt{0x0f},\mathtt{0x27},\mathtt{0x2d},\mathtt{0x12},\mathtt{0x2a},\mathtt{0x2b},\mathtt{0x14},\mathtt{0x1c},\mathtt{0x1f},\mathtt{0x26},\mathtt{0x20}\}\end{array}$$
We chose these values because we considered them suitable for the experimental environment. The first 16 entries of b represent ${F}_{1}$, the next 16 entries represent ${F}_{2}$, and the remaining 12 entries represent ${F}_{3}$.
The first step of the analysis is to find a reference trace R by SPA (Figure 3). The length of R is calculated by dividing the full trace length by the total number of operations. After that, the correlation coefficient can be calculated along the trace using the reference. Figure 1 shows part of the result, containing the high peaks and the intervals that follow them. Each peak is tagged with two indices: one is the order of the high peak and the other is its distance from the previous high peak. The $\mathtt{31}$st peak has a different distance from the others, so the first degree whose coefficient is $-\mathtt{1}$ is $\mathtt{50}-\mathtt{31}=\mathtt{19}=\mathtt{0x13}$. Repeating this process, we can recover ${F}_{1},{F}_{2},{F}_{3}$, and hence the private key.
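The layout of the experimental key and the recovered value can be cross-checked with a few lines of Python. The sketch assumes that, within each block of b, the first half holds the degrees with coefficient 1 and the second half the degrees with coefficient −1, following the storage order described for NTRU Open Source; the helper name `split_key` is hypothetical.

```python
N = 50
d = {"F1": 8, "F2": 8, "F3": 6}
b = [0x03, 0x01, 0x1e, 0x11, 0x05, 0x06, 0x1a, 0x0e,
     0x13, 0x01, 0x28, 0x23, 0x10, 0x29, 0x22, 0x0c,
     0x07, 0x08, 0x0b, 0x15, 0x1b, 0x25, 0x2e, 0x2c,
     0x18, 0x21, 0x17, 0x2f, 0x19, 0x04, 0x30, 0x00,
     0x02, 0x0f, 0x27, 0x2d, 0x12, 0x2a, 0x2b, 0x14,
     0x1c, 0x1f, 0x26, 0x20]


def split_key(b, d):
    """Split b into (plus degrees, minus degrees) for each polynomial.
    Assumes d_F entries of each sign per polynomial, plus entries first."""
    parts, pos = {}, 0
    for name, df in d.items():
        parts[name] = (b[pos:pos + df], b[pos + df:pos + 2 * df])
        pos += 2 * df
    return parts


parts = split_key(b, d)
first_minus_F1 = parts["F1"][1][0]
print(hex(first_minus_F1), N - 31)   # 0x13 19
```

Under this assumption the first degree of ${F}_{1}$ with coefficient −1 is 0x13, which agrees with the value $50-31=19$ read from the irregular interval.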
3.2. NTRUEncrypt
Representing Polynomials
In NTRUEncrypt, a polynomial is represented by its coefficients in order of degree. For example, $F\left(x\right)={x}^{3}+x-1$ is stored as F = $\{-\mathtt{1},\mathtt{1},\mathtt{0},\mathtt{1}\}$. Before the polynomial multiplication of the ciphertext and the private key, there are steps that compute $f=pF+1$.
Polynomial Multiplication
Equation (3) is computed using grade-school multiplication. Unlike in NTRU Open Source, the polynomial multiplication is performed separately. These steps are described in Algorithm 3.
Algorithm 3 Decryption in NTRUEncrypt
Require: Trinary polynomial $F\in {\mathcal{L}}_{f}$, ciphertext $e\in \mathcal{R}$
Ensure: message $m=f\cdot e\ (\mathrm{mod}\ q)$
1: for $0\le i<N$ do
2:     ${f}_{i}\leftarrow {F}_{i}\times p$
3: end for
4: ${f}_{0}\leftarrow {f}_{0}+1$
5: for $0\le j<N$ do
6:     ${t}_{j}\leftarrow {e}_{0}\times {f}_{j}$
7: end for
8: for $1\le i<N$ do
9:     ${t}_{i+N-1}\leftarrow 0$
10:     for $0\le j<N$ do
11:         ${t}_{i+j}\leftarrow {t}_{i+j}+{e}_{i}\times {f}_{j}$
12:     end for
13: end for
14: ${t}_{2N-1}\leftarrow 0$
15: for $0\le i<N$ do
16:     ${m}_{i}\leftarrow ({t}_{i}+{t}_{i+N})\ (\mathrm{mod}\ q)$
17: end for
18: return m
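A direct Python transcription of Algorithm 3 is given below as a sketch. The function names and the toy inputs are assumptions for this illustration, and `cyclic_reference` is a hypothetical helper used only to confirm that folding ${t}_{i+N}$ onto ${t}_{i}$ gives the reduction modulo ${x}^{N}-1$.

```python
def decrypt_schoolbook(F, e, p, q):
    """Follow Algorithm 3 step by step; list index = degree."""
    N = len(F)
    f = [c * p for c in F]                # steps 1-3: f_i = F_i * p
    f[0] += 1                             # step 4:    f = pF + 1
    t = [0] * (2 * N)
    for j in range(N):                    # steps 5-7: row for e_0
        t[j] = e[0] * f[j]
    for i in range(1, N):                 # steps 8-13: remaining rows
        t[i + N - 1] = 0
        for j in range(N):
            t[i + j] += e[i] * f[j]
    t[2 * N - 1] = 0                      # step 14
    return [(t[i] + t[i + N]) % q for i in range(N)]   # steps 15-17


def cyclic_reference(f, e, q):
    """Cyclic convolution modulo x^N - 1 and q, used as a cross-check."""
    N = len(f)
    out = [0] * N
    for i, ei in enumerate(e):
        for j, fj in enumerate(f):
            out[(i + j) % N] += ei * fj
    return [x % q for x in out]


F = [-1, 1, 0, 1]          # F(x) = x^3 + x - 1
e = [5, 9, 1, 0]           # toy ciphertext
p, q = 3, 2048
m = decrypt_schoolbook(F, e, p, q)
```

Every multiplication ${e}_{i}\times {f}_{j}$ in steps 5 to 13 is executed regardless of the value of ${f}_{j}$, so the operand 0 occurs at the same position j in each of the N rows.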

3.2.1. Proposed Method
The proposed method exploits the power consumption of steps 1 to 3 and steps 5 to 13 in Algorithm 3 to recover the trinary polynomial F. Once F is recovered, the private key polynomial f is computed as $f=pF+1$, where p is a public value. The relative positions of the coefficients $-\mathtt{1}$ are found by analyzing steps 1 to 3. Because F is a trinary polynomial, the constant p is multiplied by one of three values, $-\mathtt{1},\mathtt{0},\mathrm{and}\ \mathtt{1}$. Since most processors use two's complement to represent negative values, the Hamming weight for $-\mathtt{1}$ is larger than for the others. We can therefore observe high peaks in the power consumption trace whenever a $-\mathtt{1}$ is processed. Note that the proposed analysis depends on how the processor operates. If the processor uses another method to represent negative values, the analysis must be adapted accordingly.
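The difference in Hamming weight can be made concrete with a small Python calculation. The 16-bit word width is an assumption chosen for this illustration; the source does not state how wide the stored coefficients are.

```python
def hamming_weight(value, width=16):
    """Number of set bits of value in two's complement with the given width."""
    return bin(value & ((1 << width) - 1)).count("1")


p = 3
weights = {c: hamming_weight(p * c) for c in (-1, 0, 1)}
print(weights)      # {-1: 15, 0: 0, 1: 2}
```

With $p=3$, the product for the coefficient −1 is 0xFFFD, which has 15 set bits, against 0 and 2 set bits for the coefficients 0 and 1. This gap is what makes the −1 positions stand out.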
Next, the relative positions of the coefficients $\mathtt{0}$ are obtained from steps 5 to 13, which perform the polynomial multiplication of the ciphertext e and the private key f. The power consumption when a ciphertext coefficient is multiplied by 0 is lower than in the other multiplications. A portion of the trace where the power consumption is low is referred to as a low peak. Therefore, after locating the coefficients $-\mathtt{1}$ relative to the coefficients $\mathtt{0}$ and $\mathtt{1}$, and combining this result with the positions of the $\mathtt{0}$s, F is completely recovered. Finally, we obtain f by computing $pF+1$.
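Combining the two observations can be sketched in Python as follows. The function names and the way the peak positions are passed in (as sets of coefficient indices) are assumptions made for this illustration.

```python
def recover_F(n, minus_positions, zero_positions):
    """Rebuild F from the indices seen as high peaks (coefficient -1) and
    low peaks (coefficient 0); every remaining coefficient must be 1."""
    if set(minus_positions) & set(zero_positions):
        raise ValueError("an index cannot be both -1 and 0")
    F = [1] * n
    for i in zero_positions:
        F[i] = 0
    for i in minus_positions:
        F[i] = -1
    return F


def private_key(F, p):
    """f = pF + 1 with the public value p."""
    f = [p * c for c in F]
    f[0] += 1
    return f


F = recover_F(4, minus_positions={0}, zero_positions={2})
f = private_key(F, 3)
print(F, f)     # [-1, 1, 0, 1] [-2, 3, 0, 3]
```

The example reproduces $F\left(x\right)={x}^{3}+x-1$ from one −1 position and one 0 position, and then derives f with $p=3$.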
3.2.2. Experiment
Figure 4a is a full trace of NTRUEncrypt ported to the KLASCARF AVR, captured with a Lecroy HDO6104A oscilloscope at a 250 M sampling rate [12,31]. The parameters for the experiment are $N=\mathtt{49}$, $p=\mathtt{3}$, $q=\mathtt{2048}$, and the private key is as follows.
$$\begin{array}{cc}\hfill f=& \{\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{0},\mathtt{1},\mathtt{0},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{0},\mathtt{0},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{0},\mathtt{0},\mathtt{0},\mathtt{1},\mathtt{0},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{0},\mathtt{0},\mathtt{0},\mathtt{1},\mathtt{0},\mathtt{0},\mathtt{1},\hfill \\ & \mathtt{0},\mathtt{1},\mathtt{0},\mathtt{1},\mathtt{1},\mathtt{0},\mathtt{0},\mathtt{1},\mathtt{1},\mathtt{1},\mathtt{1}\}\hfill \end{array}$$
The values of p and q follow the proposed parameters, but N is smaller than the standard value because of the experimental environment.
Figure 4c depicts the power consumption of steps 1 to 3 in Algorithm 3. As mentioned above, the high peaks mark the moments when p is multiplied by $-\mathtt{1}$. Figure 4c also shows low peaks, which correspond to the coefficients $\mathtt{0}$ and $\mathtt{1}$. Thus, the positions of $-\mathtt{1}$ relative to the other coefficients can be recovered by analyzing Figure 4c.
The next step is to recover the coefficients $\mathtt{0}$. For each coefficient of the ciphertext, there are N multiplications with the private key. Within each group of N operations, the multiplications by a private key coefficient $\mathtt{0}$ occur at the same positions, so the low peaks appear regularly over the whole power trace (Figure 4a). To recover the degrees, we must first identify the set of multiplications within the trace by SPA. The multiplication between the ciphertext and the private key occurs after computing $pF$, and the total number of multiplications is ${N}^{2}$. To reduce the noise, one can average multiple power consumption traces; Figure 5 illustrates the average of 10 traces. Figure 4b is an enlarged plot of four low peaks, showing that the peaks can be identified. Lastly, with the three kinds of coefficients recovered from the analysis, the private key f is obtained.
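The averaging step and the search for low peaks can be illustrated with a minimal Python sketch. The traces, the noise pattern, and the threshold are synthetic values invented for this example, and the traces are assumed to be aligned sample by sample.

```python
def average_traces(traces):
    """Point-wise mean of equally long, aligned traces."""
    n = len(traces)
    return [sum(samples) / n for samples in zip(*traces)]


def find_low_peaks(trace, threshold):
    """Indices whose averaged power falls below the threshold."""
    return [i for i, v in enumerate(trace) if v < threshold]


# Synthetic data: one sample per multiplication, low power at indices 2 and 5.
signal = [5.0, 5.0, 1.0, 5.0, 5.0, 1.0, 5.0]
noise = [3.0, -3.0, 3.0, -3.0, 3.0, -3.0, 3.0]
trace_a = [s + n for s, n in zip(signal, noise)]
trace_b = [s - n for s, n in zip(signal, noise)]

averaged = average_traces([trace_a, trace_b])
low_peaks = find_low_peaks(averaged, threshold=3.0)
print(low_peaks)    # [2, 5]
```

In this example a single trace gives the wrong answer (`trace_a` falls below the threshold at index 1), while the average recovers exactly the two low positions.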