Montgomery Reduction for Gaussian Integers †

: Modular arithmetic over integers is required for many cryptography systems. Montgomery reduction is an efﬁcient algorithm for the modulo reduction after a multiplication. Typically, Mont-gomery reduction is used for rings of ordinary integers. In contrast, we investigate the modular reduction over rings of Gaussian integers. Gaussian integers are complex numbers where the real and imaginary parts are integers. Rings over Gaussian integers are isomorphic to ordinary integer rings. In this work, we show that Montgomery reduction can be applied to Gaussian integer rings. Two algorithms for the precision reduction are presented. We demonstrate that the proposed Montgomery reduction enables an efﬁcient Gaussian integer arithmetic that is suitable for elliptic curve cryptography. In particular, we consider the elliptic curve point multiplication according to the randomized initial point method which is protected against side-channel attacks. The implementation of this protected point multiplication is signiﬁcantly faster than comparable algorithms over ordinary prime ﬁelds.

In this work, we consider the modulo reduction for Gaussian integers. Gaussian integers are complex numbers such that the real and imaginary parts are integers. Arithmetic over Gaussian integers for the RSA system was proposed in [15][16][17][18]. In [17,18], it was shown for the RSA system that a significant complexity reduction can be achieved with Gaussian integers. Due to the isomorphism between Gaussian integer rings and ordinary integer rings, this arithmetic is suitable for many cryptography systems. The Rabin cryptography system and elliptic curve cryptography over Gaussian integers were considered in [19][20][21][22]. Moreover, coding applications over Gaussian integers use Gaussian integer arithmetic [23][24][25][26][27][28].
To reduce the complexity of the modular reduction, we investigate Montgomery reduction algorithms for finite Gaussian integer rings. However, it is not trivial to generalize Montgomery reduction to Gaussian integers. The final step of the reduction algorithm is based on the total order of integers which does not exist for complex numbers. In [29], a Montgomery reduction algorithm was proposed that utilizes the absolute value to measure the size of a Gaussian integer. This study is an extension of our paper [29]. In this work, we present a new approach that aims on reducing the complexity of the reduction presented in [29]. In this algorithm, the absolute value is replaced by the Manhattan weight. The calculation of the Manhattan weight requires only a single addition, whereas the calculation of the absolute value requires addition and two squaring operations. On the other hand, the reduction using the Manhattan weight may not always obtain a unique solution. There are at most two possible solutions that are congruent. For many applications uniqueness is not an issue for the intermediate results in calculations, which require many subsequent reduction steps. For the final result, the correct representative can be calculated using absolute values.
The proposed concept of Montgomery reduction was applied in [22] for the efficient calculation of the elliptic curve point multiplication. In particular, an area-efficient coprocessor was designed in [22]. This processor uses the proposed Montgomery modular arithmetic over Gaussian integers. It is shown in [21,22] that a key representation with a Gaussian integer expansion is beneficial for the calculation of the point multiplication. This key representation reduces the computational complexity and the memory requirements of a secure hardware implementation, which is protected against side-channel attacks [30][31][32][33][34][35][36][37]. In this work, we provide the theoretical justification for the reduction algorithms applied in [22]. Furthermore, we present new results for protected implementations of the point multiplication. In particular, we consider the randomized initial point method proposed in [38]. This method is protected against several side-channel attacks, i.e., differential power analysis (DPA), zero-value point attacks (ZPA), and refined power analysis (RPA) [32,38]. We present synthesis results for a field-programmable gate array (FPGA) based on the architecture proposed in [22]. The protected implementation of the point multiplication is significantly faster with Gaussian integers. Moreover, it requires less memory and fewer flip-flops than the PM algorithms over ordinary prime fields reported in [39,40].
This publication is organized as follows. We review the Montgomery multiplication of ordinary integers and discuss some properties of Gaussian integers in Section 2. In Section 3, Montgomery reduction based on the absolute value is investigated. We present a lowcomplexity reduction algorithm based on the Manhattan weight in Section 4. To demonstrate that the arithmetic over Gaussian integers is useful, we consider the complexity for the elliptic curve point multiplication according to the randomized initial point method in Section 5. In Section 6, we discuss the results and conclude our work.

Preliminaries and Problem Statement
In this section, we briefly discuss Montgomery reduction of ordinary integers. Moreover, we review some properties of Gaussian integer rings.

Montgomery Reduction of Ordinary Integers
Montgomery reduction is considered in several ECC and RSA cryptography systems to decrease the complexity of the arithmetic in prime fields [9,11,14] as well as in rings [2,3]. The Montgomery multiplication uses the arithmetic over a ring R n that is isomorphic to the integer ring Z n = {x mod n : x = 0, . . . , n − 1, x ∈ Z} [1]. Montgomery reduction algorithm reduces the complexity of the modulo reduction. The expensive calculation mod n is replaced by mod R, where R > n is a power of two. Hence, the modulo reduction in the ring R n is a simple bitwise AND operation with R − 1.
For the Montgomery arithmetic, an element x ∈ Z n is mapped to the Montgomery domain as X = xR mod n.
Addition in the Montgomery domain is isomorphic to ordinary integer modular arithmetic, because (X + Y) mod n = (xR + yR) mod n = (x + y)R mod n.
The multiplication in the Montgomery domain needs Montgomery reduction function µ(Z) = ZR −1 mod n described in Algorithm 1. This function defines the inverse mapping, since µ(X) = xRR −1 mod n = x mod n. Using the reduction function, we obtain the product µ(XY) = XYR −1 mod n = xyR mod n in the Montgomery domain.
input: Z, with 0 ≤ Z < nR, n = −n −1 mod R, and R = 2 l ≥ n output: Typically, all variables are mapped at the beginning of the calculation into the Montgomery domain using this reduction function as since only one precomputed value R 2 mod n is required. We illustrate the Montgomery reduction of ordinary integers with an example. The binary representation of positive integers is denoted by brackets (·) 2 with the subscript 2. Example 1. We consider the product of two numbers x = 14 and y = 7 from Z 29 . The two integers x and y are mapped to the integers X = xR mod n = 13 and Y = yR mod n = 21 with R = 32 = 2 5 > n = 29 in the Montgomery domain. Hence, all elements of the Montgomery domain can be represented with five binary digits. The product xy mod n = 98 mod 29 = 11 corresponds to xyR mod n = 4 in the Montgomery domain.
With Algorithm 1, we can reduce the product Z = XY = 273 as follows. With n = 11, we first calculate t = Zn mod R = 3003 mod 32 = 27. Note that the modulo reduction of the binary representation is a simple bitwise AND operation with R − 1 = 31 or equivalently the truncation of all bits except the five least significant bits. The integer 3003 has the binary representation (1011 1011 1011) 2 and the five least significant bits (1 1011) 2 represent the value t = 27.
Next, we calculate the quotient q = (Z + tn) div R = 1056 div 32 = 33. For R = 32, the division by R without remainder in the binary representation is a simple right-shift by five bits or equivalently the truncation of the five least significant bits. The integer 1056 has the binary representation (100 0010 0000) 2 . The division without remainder results in (10 0001) 2 which is the binary representation of 33. The final result in the Montgomery domain is M = µ(Z) = q − 29 = 4 because q = 33 > n = 29. To obtain the result in Z 29 , we apply Montgomery reduction for the inverse mapping M = µ(M) = µ(4) = 11.

Rings of Gaussian Integers
The modulo arithmetic over Gaussian integers is based the modulo function where x and π are Gaussian integers. π * is the conjugate of the Gaussian integer π. The brackets [·] denote rounding to the closest Gaussian integer [23], i.e., for a complex number is a finite ring. For primes p of the form p ≡ 1 mod 4, G p is a finite field which is isomorphic to the prime field GF(p) over ordinary integers [23]. Such fields are suitable for elliptic curve cryptography [21,22]. A prime of the form p ≡ 1 mod 4 is the sum of two perfect squares, i.e., p = ππ * = |a| 2 + |b| 2 with the Gaussian integer π = a + bi. Moreover, for integers n = cd such that c and d are primes of the form c ≡ d ≡ 1 mod 4 and c = d, the set G n is a ring that is isomorphic to the ring Z n over ordinary integers [26]. Such Gaussian integers are suitable for the RSA public-key system [17,18].
Consider Montgomery reduction in Algorithm 1. Lines 3 to 7 of this algorithm uniquely determine the smallest integer that is congruent to ZR −1 mod n. However, complex numbers cannot be totally ordered [24]. Hence, it is not possible to directly apply this reduction algorithm to Gaussian integers. This reduction problem is not regarded in [17,18].
In this work, we consider two reduction strategies using different norms. One algorithm uses the absolute value |x| = |c| 2 + |d| 2 of the Gaussian integer x = c + di. The second approach is based on the Manhattan weight x = |c| + |d|. Calculating the Manhattan weight is less complex than calculating the absolute value, because only addition is required, whereas squaring is necessary to determine |x|. Note that |x| ≤ x holds for any x which follows from squaring both sides of the inequality. An algorithm for primes of the form p ≡ 1 mod 4 to find a, b such that p = a 2 + b 2 is proposed in [23]. This algorithm consists of two steps.
• Find x such that x 2 ≡ −1 mod p, which can be done using the Tonelli-Shanks algorithm [41]. • Use the Euclidean algorithm to determine the greatest common divisor of p and x. The first two remainders less than √ p are a and b.
On the other hand, there exist many primes of the form p ≡ 1 mod 4 such that p = a 2 + (a − 1) 2 or p = a 2 + 1. Exploiting this observation we can search for a such that the sum p = a 2 + 1 or p = 2a 2 − 2a + 1 is prime. This can significantly reduce the search complexity. Table 1 illustrates some examples of such primes with sufficient bit lengths for ECC applications.

Montgomery Reduction for Gaussian Integers
In this section, we consider some important properties of Gaussian integers and derive Montgomery reduction for Gaussian integers using the absolute value. First, we show that the set G n is symmetric and that the absolute value of any element of this ring is bounded.

Lemma 1.
For any x ∈ G n we have the following symmetry The absolute value |x| is bounded by Moreover, for any x / ∈ G n with x = x mod π we have Proof. Consider the quotient x π = xπ * n = c + di, where c and d are the real part and the imaginary part of x π . For x ∈ G n we have x = x mod π. Hence, the rounded quotient satisfies xπ * ππ * = 0.
Finally, consider a Gaussian integer x / ∈ G n which is congruent to x, i.e. x = x mod π. We use the notation Hence, we have (10).
The next example demonstrates that |x| < |π| √ 2 is not a sufficient condition for x ∈ G n .
Next, we consider now Montgomery reduction. The reduction for Gaussian integers can be performed according to Algorithm 2, where Steps 1 and 2 in Algorithm 2 are applied separately for the real-and imaginary part. We demonstrate that Algorithm 2 always obtains a precision reduction resulting in the correct representative.  Figure 1. Elements of the Gaussian integer field G 29 with π = 5 + 2i.

Algorithm 2 Montgomery reduction for Gaussian integers
) then 6: determine α according to (11) 7: M = q − α π 8: else 9: determine α according to (12) 10: Proof. We consider the sum Z + tπ in step 2 of Algorithm 2, where This shows that q is congruent to ZR −1 mod π or ZR −1 = q − απ. The absolute value of ZR −1 mod π is bounded, i.e., As shown in Example 2, the quotient q may not be the correct representative ZR −1 mod π even for |q| < |π| √ 2 . The congruent values q − απ with α ∈ {±1, ±i} are also possible candidates. However, for any x ∈ G n and α ∈ {±1, ±i}, the lower bound follows from the triangle inequality and Lemma 1. Consequently, the quotient q is the unique solution for |q| < √ 2−1 √ 2 |π|. Furthermore, for any x ∈ G n the condition |α| > Hence, only the candidates according to (11) are possible for . Using Lemma 1, it follows that the correct candidate can be found by minimizing the absolute value among all candidates according to (11).
, the result M is calculated as M = q − α π with α according to (12).
To demonstrate that (12) is correct, we derive the bound |α| ≤ √ 2. Consider again the quotient q = (Z + tπ) div R. We aim to compensate the offset tπR −1 by απ, where the absolute values are bounded by This follows from |t| ≤ √ R 2 + R 2 = √ 2R due to the reduction mod R and implies the bound |α| ≤ √ 2.
Example 3. As an example for Montgomery reduction, we consider the product of two numbers x = 6 and y = 7 from Z 29 or the corresponding field G 29 of Gaussian integers with π = 5 + 2i.
The two integers x and y are mapped to the Gaussian integers X = xR mod π = 2 − i and Y = yR mod π = −2 with R = 8 in the Montgomery domain. The product xy mod p = 42 mod 29 = 13 corresponds to xyR mod π = −i in the Montgomery domain.
Next, we calculate the quotient q = (Z + tπ) div R = −8i div 8 = −i. For R = 8 = 2 3 , the division by R without remainder is a simple right-shift by three bits. The quotient M = µ(Z) = q = −i is the final result without further reduction steps because |q| = 1 < √ 2−1 √ 2 |π| = 1.5773. The product xy mod p = 13 corresponds to xy mod π = 1 + i in the field G 29 . To obtain this result, we apply Montgomery reduction for the inverse mapping M = µ(M) = MR −1 . With M = −i, we have t = Zπ mod R = 2 + i mod 8 = 2 + i and the result M = q = (Z + tπ) div R = 8 + 8i div 8 = 1 + i. Again, no reduction is required because the condition |q| = We illustrate the final reduction procedure of Algorithm 2 in the following example. There are up to eight possible values α according to (11) and (12). However, not all potential values have to be considered for the reduction. The number of valid candidates α can be reduced based on the signs of the real and imaginary part of q.
Nonetheless, the reduction according to Algorithm 2 can be rather complex due to the squaring to determine the absolute values. This complexity may not be required in all applications of the Montgomery multiplication. For applications in cryptography, it is adequate to find the unique representative ZR −1 mod π only in the last calculation step, whereas for intermediates results a reduction that achieves M ≡ ZR −1 mod π is sufficient. This motivates a low complexity Montgomery reduction based on the Manhattan weight, which is considered in the next section.

Precision Reduction for Gaussian Integers Using the Manhattan Weight
In this section, we present a low complexity precision reduction for Gaussian integers based on the Manhattan weight, i.e., the absolute value in the reduction algorithm is replaced by the Manhattan weight. The calculation of the absolute value requires squaring, whereas the Manhattan weight requires only addition. On the other hand, this algorithm may not obtain a unique solution. However, there are at most two possible solutions that are congruent. In ECC or RSA systems that require many reduction steps, this algorithm can be used for the intermediate results.
Without loss of generality we consider π = a + bi with a > b ≥ 1. The precision reduction is described in Algorithm 3, wherê and N = a − 1. We demonstrated that this algorithm always obtains M ≡ ZR −1 mod π, where M ≤ N. determineα according to (14) 7:

Algorithm 3 Precision reduction for Gaussian integers
First, we note that using the symmetry according to Lemma 1, we can restrict the following derivations to the elements of the first quadrant. Next, we consider some important properties of the Manhattan weight.

Lemma 2. For x ∈ G p the Manhattan weight is upper bounded by
Proof. Let c and d be the real part and the imaginary part of x π . For x ∈ G p we have x = x mod π and consequently xπ * ππ * = 0.
This implies [c] = [d] = 0 and |c| < 1/2, |d| < 1/2. Hence, we consider (1/2 + 1/2i)π to upper bound the real and imaginary parts of x as Note that either a is odd and b is even or vice versa because p is an odd prime. Furthermore, the real-and imaginary parts of x are integers. Consequently, we have We can bound the Manhattan weight of x as Hence (15) holds.
Next, we consider bounds on the Manhattan weight for the sum and product of two elements x, y ∈ G p . Note that we consider arithmetic without modulo reduction. Lemma 3. For x, y ∈ G p we have the upper bounds for arithmetic without modulo reduction.
Proof. The bound on the sum follows from the triangle inequality and (15), i.e., Without loss of generality we consider two elements x = c + di and y = e + f i from the first quadrant for the product in (19). This implies Similarly, we have f ≤ N − e. For the product xy we have xy = (ce − d f ) + (ed + c f )i. (21) First, consider the absolute value of the imaginary part Im{xy} To determine the maximum value, we consider the bivariate function g(e, c) = eN + cN − 2ce and its partial derivatives ∂g(e, c) ∂c This results in a maximum for c = e = N/2 and the bound |Im{xy}| = ed + c f ≤ N 2 /2. Due to symmetry this bound also holds for the absolute value of the real part. Hence, we have The following proposition demonstrates that Algorithm 3 results in the desired reduction.

Proposition 2. For M and Z according to Algorithm 3, we have
Proof. The first two steps are identical in Algorithm 3 and Algorithm 2. Hence, the first statement (25) follows from the same arguments as in the proof of Proposition 1.
For q ≤ N, we immediately have (26). For q > N, we consider again the corresponding quotient q = (Z + tπ) div R. The Manhattan weight of ZR −1 mod π is bounded by where we have used (24) and the assumption R ≥ N. Similar to the proof of Proposition 1, we aim to compensate tπR −1 with απ. From Proposition 1 follows that α ≤ √ 2. Hence, the minimization in (14) over α ∈ {±1, ±i, ±1 ± i} is sufficient. From (27) follows that there is at least one solution with M = q −απ ≤ N. Consequently, the minimization in (14) will find a solution satisfying (26).
Finally, we can now describe the final reduction step of Algorithm 3. Note that there are eight possible values for α according to (14), but not all potential values have to be considered. The reduction procedure is demonstrated in the following example.
Example 5. Again, we consider the finite field G 29 of Gaussian integers with π = 5 + 2i, where all possible values of q after the first two steps of Algorithm 3 are depicted in Figure 2.
For instance, consider the product of x = 19 and y = 7 from Z 29 . The two integers x and y are mapped to the Gaussian integers X = xR mod π = 2 − 2i and Y = yR mod π = −2 with R = 8 in the Montgomery domain. The product xy mod p = 133 mod 29 = 17 corresponds to xyR mod π = 3 − i in the Montgomery domain.
With Algorithm 2, we can reduce the product Z = XY = −4 + 4i as follows. We calculate t = Zπ mod R = 4 + 4i and the quotient q = (Z + tπ) div R = 1 + 4i. A valid solution of the reduction is a Gaussian integer M with M ≤ N. We have q = 5 > N = 4. Hence, further reduction is required. As q is in the first quadrant, we have three possible values for α, i.e., α ∈ {1, i, 1 + i}. The final reduction step results in a Gaussian integer x with x ≤ N. Hence, Algorithm 3 achieves the desired precision reduction using the Manhattan weight. However, the result may not always be an element of the ring. For instance, for some π the value x = a − 1 is a ring element. Alternatively, the point x = x − π = −1 − bi can be a ring element. Both values are congruent and can satisfy x = a − 1 ≤ N, x = b + 1 ≤ N depending on π. The ring element is the value with the smallest absolute value. There exist at most two congruent values with Manhattan weight less or equal N. Consequently, the correct representation can be selected by minimizing the absolute value of these two congruent values. Depending on the application this step might be required once to obtain the final result x = q mod π.

Elliptic Curve Point Multiplication
In this section, we consider the elliptic curve point multiplication to demonstrate that the arithmetic over Gaussian integers enables an efficient calculation. Calculations of the elliptic curve point multiplication are prone to side-channel attacks. For example, an attacker can estimate the secret key based on the required power consumption of the different arithmetic operations over time. There are many known attacks on the point multiplication like timing attacks, simple power analysis, differential power analysis, refined power analysis, and zero-value point attacks [32,38].
The concept for the point multiplication over Gaussian integers was introduced in [22], where an area-efficient coprocessor design was proposed with an arithmetic unit that enables Montgomery modular arithmetic over Gaussian integers. It is shown in [22] that a key representation with a Gaussian integer expansion is beneficial to reduce the computational complexity and the memory requirements of a secure hardware implementation considering timing attacks and simple power analysis. This architecture supports different point multiplication algorithms. In contrast to [22], we consider the randomized initial point method which is additionally protected against differential power analysis, refined power analysis, and zero-value point attacks [32,38]. We refer to [22] for the architecture and the implementation details.
In the following, we consider the elliptic curve which is recommended for prime fields GF(p) [32,42]. The parameters α and β are constant coefficients. The pair x and y defines the coordinates of a point P(x, y) on the curve. The one-way function of elliptic-curve cryptography is the point multiplication, i.e., kP, where P is a point on the elliptic curve and k is an integer. The point multiplication is typically implemented based on the binary expansion k = ∑ r−1 i=0 k i 2 i of the integer k with binary digits k i ∈ {0, 1}, where r is the length in bits and k r−1 , k r−2 , . . . , k 0 is the binary representation of the key (the integer k). This method requires two different point operations, the point addition (ADD) and the point doubling operation (DBL) [32]. Let P(x 1 , y 1 ) and Q(x 2 , y 2 ) be two distinct points on an elliptic curve (28), then the sum S(x 3 , y 3 ) = P(x 1 , y 1 ) + Q(x 2 , y 2 ) is calculated as Similarly, the point doubling operation S(x 3 , y 3 ) = 2P(x 1 , y 1 ) is defined as The calculations in (29) and (30) are typically performed using arithmetic over prime fields. However, we consider arithmetic over Gaussian integer fields, where x i , y i , α, β ∈ G p .
As an alternative to the binary expansion key representation, τ-adic expansions of the integer k with a non-binary basis τ were proposed to speed up the point multiplication (PM) and to improve the resistance against attacks [38,[43][44][45][46][47]. The τ-adic expansion results in the point multiplication where we consider digits κ i from a Gaussian integer field.
To demonstrate that a τ-adic expansion with arithmetic over Gaussian integers can reduce the computational complexity, we consider an example based on the curve (28) with β = 0. For the products κ i P(x, y) we use the endomorphism iP(x, y) = P(−x, iy) for prime fields [32,48]. Note that P(−x, iy) is a point on the curve (28) if P(x, y) is a valid point. The negation of any point −P(x, y) is P(x, −y) according to [32]. Hence, for the products κ i P(x, y) with κ i ∈ {0, ±1, ±i} we have −iP = P(−x, −iy).
Using this endomorphism, we can calculate other products τP, e.g., τP = (2 + i)P = 2P + iP requires a DBL, the mapping iP, and an ADD operation. Several other endomorphisms for different curves over prime fields can be fund in [32,48]. For the point multiplication, we consider the randomized initial point method according to Algorithm 4. This method introduces a random point R into the calculation of the point multiplication. Consequently, all intermediate results in the calculation of the point multiplication become randomized. The algorithm starts with a precomputation phase, where the points S l−1 , . . . , S 1 , S 0 are computed in steps 1 to 4 and stored in a memory. The point addition in step 5 results in the point Q = S l−1 + R. This point is multiplied with τ and then added to the corresponding precomputed point S j in the loop in steps 6 to 9 of this algorithm. Due to the second product of the precomputations −(τ − 1) · R, we obtain the point Q + R after each iteration of this loop. Correspondingly, the final result is the point Q + R = k · P + R. Hence, we obtain the correct result of the point multiplication k · P by subtracting R from Q in step 10. The point addition in step 8 is computed in each iteration since the τ-adic expansion of the key according to [22] excludes all zero elements. This prevents SPA and timing attacks. Furthermore, all stored points S l−1 , . . . , S 1 , S 0 are randomized due to the subtracting the random point −(τ − 1) · R. Thus, the resistance against DPA, RPA, and ZPA is increased.
We can use the reduction based on the Manhattan weight in Algorithm 3 for all interim results in the point multiplication. The aim of the reduction of interim results is finding a Gaussian integerx that has a Manhattan weight satisfying x ≤ N and is congruent to the actual representative x. The reduction algorithm can be stopped once a valuex = q − απ with x ≤ N is found. Hence, not all offsets in (14) have to be considered in every reduction. Table 2 presents numerical results on the number of required reduction steps for different field sizes. The four columns on the right in Table 2 provide the percentage for the number of required offset reduction steps after an arithmetic operation, where no reduction is required if the result q already satisfies q ≤ N. These results illustrate that sequential processing of the offset reduction is suitable for the point multiplication because 91% of all operations require no reduction since q ≤ N is satisfied. About 4 − 5% of all operations require a single reduction step. Two or three calculation steps were required in approximately 4% of all cases.

Algorithm 4
Point multiplication according to the randomized initial point method from [38]. input: P, k = (κ l−1 , . . . , κ 1 , κ 0 ) τ output: k · P 1: R = random point // randomized initial point 2: for (j = l − 1 down to 0) do 3: S j = κ j · P − (τ − 1) · R // store all precomputed points in memory 4: end for Q = Q + S j // ADD Q and κ j · P − (τ − 1) · R that is read from the memory 9: end for 10: return Q − R On the other hand, the different execution times of the reduction steps may cause concerns about side-channel vulnerabilities. However, the probability for additional reduction steps is relativity low. Moreover, the randomized initial point method also randomizes the occurrence of these additional steps.
Next, we demonstrate that determining the point multiplication using Gaussian integers reduces the computational complexity. The number of required point operations per iteration of the point multiplication is similar to the results in [22]. However, some additional computations are required to subtract the random point. As in [22], we estimate the computational complexity for the point multiplication in terms of multiplication equivalent operations M. Table 3 illustrates two examples for the complexity with different bases τ. The parameter r denotes the maximum key length in bits and the value l is the number of iterations per point multiplication. A larger value of |τ| 2 reduces the number of iterations and speeds up the computation compared with a conventional binary PM, but requires more storage for the precomputed points. The results are compared with the point multiplication according to randomized initial point but using ordinary integer expansions as proposed in [38] for comparable base values w. For instance, we can compare the complex base τ = 4 + i with |τ| 2 = 17 digits and the base w = 4 with 2 w = 16 digits from [38]. For these values, the number M of multiplications is reduced by 32.5%. This reduction results mainly from the simpler calculation of τ · Q for Gaussian integers due to the used endomorphism. In [21], a similar PM with τ-adic expansions over Gaussian integers was proposed. This algorithm additionally reduces the number of precomputed and stored points. However, this method considers only timing and simple power analysis attacks. It is more vulnerable to statistical attacks than the randomized initial point method.
To illustrate the efficiency of Gaussian integer arithmetic, we consider latencies for the point multiplication according to Algorithm 4 using both norms in Table 4. The table provides results for a Xilinx Virtex-7 field-programmable gate array (FPGA) based on the architecture from [22]. The hardware requirements are represented by the number of look-up tables (LUT), flip-flops (FF), slices, and digital signal processor (DSP) units, as well as by the RAM size. The maximum clock frequency is denoted by f clk . These point multiplications are significantly faster than the results from [39,40]. For instance, the design in [40] considers key lengths up to r = 256 and the unprotected PM. This design is optimized for the use of DSP units which reduces the number of LUT. The number 1990 of LUT is similar to the proposed design with key length r = 253 using DSP units. It employs much more flip-flops (1786) and memory (234 kbit). An unprotected PM of length r = 256 requires 23.5ms whereas the proposed protected PM requires only 6.87ms for r = 253.
Note that the latency values for the reduction algorithms depend on the hardware architecture. The calculation of the Manhattan weight requires one addition which is performed in a single clock cycle with the architecture from [22]. For the absolute value, two additional multiplications are required which need four clock cycles each. Hence, the latency for calculating the Manhattan weight is only 11% of the time for the absolute value. The results in Table 4 illustrate the impact of this complexity reduction on the total latency of a point multiplication. Applying the Manhattan weight reduces the latency of a PM by 13% compared with the reduction algorithm from [29].
The generalization of Montgomery reduction to Gaussian integers is not trivial, because complex numbers cannot be totally ordered. In this work, we have presented two algorithms for Montgomery reduction for Gaussian integers which use different norms. The first algorithm utilizes the absolute value to measure the size of a Gaussian integer. This algorithm can uniquely determine the correct representative. The second algorithm is based on the Manhattan weight of Gaussian integers, which reduces the computational complexity for the modulo reduction. However, two congruent solutions may occur. It is not possible to determine the correct candidate based on the Manhattan weight. The correct representative is the candidate with the smaller absolute value. Hence, the first algorithm is required to determine the final result, e.g., for the inverse mapping from the Montgomery domain to the original field representation.
It is shown in [21,22] that a key representation with a Gaussian integer expansion is beneficial to reduce the computational complexity and the memory requirements of elliptic curve point multiplication algorithms that are protected against side-channel attacks. However, only timing and simple power analysis attacks were considered. This PM is vulnerable to statistical attacks [38]. As an alternative, we have shown that Gaussian integer expansions can be used with the randomized initial point method. This method is protected against statistical attacks like DPA, RPA, and ZPA. The FPGA implementation of this protected PM with Gaussian integer expansions is significantly faster than the PM algorithms over ordinary prime fields reported in [39,40].
However, Gaussian integer fields can only be constructed for primes of the form p mod 4 = 1, hence a generalization of this work to Eisenstein integers could be beneficial. Eisenstein integers are complex numbers of the form x = a + bω, where ω = 1 2 · 1 + √ −3 , and Eisenstein integer fields can be constructed for primes of the form p mod 6 = 1. An elliptic curve point multiplication using Eisenstein integers was considered in [49] showing similar properties as the point multiplication over Gaussian integers. Up to now no efficient modulo operation for Eisenstein integers is known. We believe that a generalization of the Montgomery modular multiplication to Eisenstein integers would be a promising direction for further research.