Optimized CSIDH Implementation Using a 2-Torsion Point

Implementations of isogeny-based cryptography mainly use Montgomery curves, as they offer fast elliptic curve arithmetic and isogeny computation. However, although Montgomery curves have efficient 3- and 4-isogeny formulas, recovering the coefficient of the image curve becomes inefficient for large-degree isogenies. Because Commutative Supersingular Isogeny Diffie-Hellman (CSIDH) requires odd-degree isogenies up to at least 587, this inefficiency is the main bottleneck of using a Montgomery curve for CSIDH. In this paper, we present a new optimization method for faster CSIDH protocols entirely on Montgomery curves. To this end, we present a new parameter for CSIDH for which all three two-torsion points are rational. With the proposed parameters, CSIDH moves around the surface, and the coefficient of the image curve can be recovered from a two-torsion point. We also prove that CSIDH with the proposed parameter guarantees a free and transitive group action. Additionally, we present implementation results for our method and demonstrate that it is 6.4% faster than the original CSIDH. Our work shows that noticeably higher CSIDH performance can be achieved while using only Montgomery curves.


Introduction
With the evolution of quantum computing, currently used public-key cryptosystems based on the factorization and discrete logarithm problems, such as RSA and ECC, will not be able to guarantee their security in the near future. This has led to the need for post-quantum cryptography (PQC) that is secure even in quantum computing environments. The National Institute of Standards and Technology (NIST) opened the PQC standardization project, which is now in Round 2. Among the PQC categories, isogeny-based cryptography interests many researchers, as it offers smaller key sizes than the other PQC candidates. Isogeny-based cryptography is based on the difficulty of finding a specific isogeny between two elliptic curves defined over the same finite field. Despite having a fairly small key size, isogeny-based cryptography has the disadvantage of being considerably slower than most of the PQC candidates.
Isogeny-based cryptography was first proposed by Couveignes in 2006 [1]. His scheme is a non-interactive key exchange protocol that uses the set of $\mathbb{F}_q$-isomorphism classes of ordinary elliptic curves defined over $\mathbb{F}_q$. The endomorphism ring of these curves is given by an order $\mathcal{O}$ in an imaginary quadratic field, and the ideal class group $cl(\mathcal{O})$ acts freely and transitively on this set.
In this paper, we apply to CSIDH an optimization technique proposed by Costello and Hisil for obtaining image curve coefficients during isogeny computations [12]. The following are the main contributions of this work.
• We present a new initial curve and a new prime of the form $8k + 7$, enabling the use of the two-torsion method by Costello and Hisil [12]. With the parameter of the original CSIDH, no $\mathbb{F}_p$-rational two-torsion point exists other than $(0, 0)$, so this method cannot be used for recovering the coefficient of the image curve in CSIDH. Compared to Meyer's method [8], computing the coefficient of the image curve is the main bottleneck for implementing faster CSIDH entirely on Montgomery curves. With our prime, $\mathbb{F}_p$-rational two-torsion points exist, so the coefficient can be computed efficiently.
• We also prove that our algorithm assures a one-to-one correspondence between image curves and elliptic curve isomorphism classes. Given a Montgomery curve $M_A : y^2 = x^3 + Ax^2 + x$ on the surface with curve coefficient $A$ and base field prime $p$, we prove that the ideal-class group $cl(\mathcal{O})$ acts freely and transitively on the set $\mathcal{S}^{+}_{p,\mathbb{Z}[(1+\sqrt{-p})/2],i}$ in [13]. The details of our proof are given in Section 4.
• We present the implementation results of our proposed method. The group action of our implementation is about 7.1% faster than the original CSIDH. The entire key exchange is about 6.4% faster than the original CSIDH.
Although the proposed CSIDH implementation is slower than [8], we stress that we provide the fastest performance using only Montgomery curves. Section 5 gives the details of our implementation and results.
This paper is organized as follows. In Section 2, we review the background on elliptic curves and the CSIDH key exchange. In Section 3, we introduce the various ways of computing odd-degree isogenies. In Section 4, we present a new parameter that enables the use of the two-torsion point, together with our optimization methods. Section 5 describes the specific implementation process and compares the costs and speed. We draw our conclusions and discuss future work in Section 6.

Preliminary
In this section, we describe the background knowledge needed to develop this paper. First, we review some properties of elliptic curves. Subsequently, we introduce the CSIDH protocol.

Montgomery Curves
Let $K$ be a field with characteristic not equal to 2 or 3. The Montgomery elliptic curves over $K$ are expressed by the following equation:
$$M_{a,b} : by^2 = x^3 + ax^2 + x,$$
where $b(a^2 - 4) \neq 0$. We shall write $M_a$ when $b = 1$ throughout the paper. For efficient implementation of isogeny operations, we use projective coordinates and a projective curve coefficient to avoid inversions. Because Montgomery curve arithmetic can be constructed with only the x-coordinate, the XZ-coordinate system is mainly used for implementing isogeny-based cryptography. We write a point $P = (x, y)$ on $M_{a,b}$ and the coefficient $a$ as $P = (X : Z)$ and $a = (A : C)$, respectively, where $x = X/Z$ and $a = A/C$.

Isogeny
Let $O_E$ be the identity of the group of points of an elliptic curve $E$. Given two elliptic curves $E$ and $E'$, we define an isogeny $\varphi$ between $E$ and $E'$ as a rational map $\varphi : E \to E'$ satisfying $\varphi(O_E) = O_{E'}$. Because $\varphi$ is a group homomorphism between $E$ and $E'$, $\ker(\varphi)$ is a subgroup of $E$. Conversely, given any finite subgroup $K$ of $E$, we can use Vélu's formula to compute an isogeny $\varphi : E \to E'$ satisfying $\ker(\varphi) = K$, and we denote $\deg(\varphi) = |K|$.

Supersingularity
Given a prime $p > 3$, let $E$ be an elliptic curve defined over $\mathbb{F}_p$. Then $E$ is a supersingular curve if and only if $\#E(\mathbb{F}_p) = p + 1$; otherwise, $E$ is an ordinary curve. Let $\mathrm{End}(E)$ be the full endomorphism ring of $E$ and $\mathrm{End}_{\mathbb{F}_p}(E)$ be the ring of endomorphisms defined over $\mathbb{F}_p$. The full endomorphism ring of an ordinary curve is isomorphic to an order in an imaginary quadratic field. On the other hand, the full endomorphism ring $\mathrm{End}(E)$ of a supersingular curve $E$ is isomorphic to an order in a quaternion algebra, while the $\mathbb{F}_p$-rational endomorphism ring $\mathrm{End}_{\mathbb{F}_p}(E)$ of a supersingular curve $E$ is isomorphic to an order in the imaginary quadratic field $\mathbb{Q}(\sqrt{-p})$.
Now, denote by $\mathcal{O}$ an order for $\mathrm{End}_{\mathbb{F}_p}(E)$. Let $\pi \in \mathcal{O}$ be the $\mathbb{F}_p$-Frobenius endomorphism of $E$ and $\mathcal{E}\ell\ell_p(\mathcal{O}, \pi)$ be the set of elliptic curves $E$ defined over $\mathbb{F}_p$ satisfying $\mathcal{O} = \mathrm{End}_{\mathbb{F}_p}(E)$. Then the ideal-class group $cl(\mathcal{O})$ acts freely and transitively on $\mathcal{E}\ell\ell_p(\mathcal{O}, \pi)$ by $([\mathfrak{a}], E) \mapsto [\mathfrak{a}]E$.
Since $E$ is a supersingular curve with $\#E(\mathbb{F}_p) = p + 1 = 4 \cdot \ell_1 \cdots \ell_n$ for small odd primes $\ell_i$, there is an $\mathbb{F}_p$-rational subgroup of order $\ell_i$ for each $i$. Additionally, let $\pi = \sqrt{-p}$ be the $\mathbb{F}_p$-Frobenius endomorphism of $E$. Since $p \equiv -1 \pmod{\ell_i}$ for each prime $\ell_i$, it is well known that $\ell_i\mathcal{O}$ splits into two prime ideals $\mathfrak{l}_i = (\ell_i, \pi - 1)$ and $\mathfrak{l}_i^{-1} = (\ell_i, \pi + 1)$.

CSIDH Group Action
An element of the ideal-class group $cl(\mathcal{O})$ is of the form $\prod_{i=1}^{n} \mathfrak{l}_i^{e_i}$ (with $\mathfrak{l}_i = (\ell_i, \pi - 1)$) for small $e_i \in [-m, m]$. Accordingly, in the CSIDH protocol, Alice and Bob each randomly select a vector $(e_1, e_2, \cdots, e_n) \in \mathbb{Z}^n$ and use it as a secret key. A group action $[\mathfrak{a}]E$ can then be computed by applying the $\ell_i$-isogeny operation $|e_i|$ times for $\mathfrak{a} = \prod_{i=1}^{n} \mathfrak{l}_i^{e_i} \in cl(\mathcal{O})$. If $e_i > 0$, the $\ell_i$-isogeny is applied with the kernel generated by a point in $E(\mathbb{F}_p)$ of order $\ell_i$. If $e_i < 0$, the $\ell_i$-isogeny is applied with the kernel generated by a point in $E(\mathbb{F}_{p^2} \setminus \mathbb{F}_p)$ of order $\ell_i$. As the $\ell_i$ are all primes, an efficient odd-degree isogeny formula for degrees up to at least 587 is required to implement CSIDH-512. For Montgomery curves, Costello and Hisil proposed an efficient method for computing odd-degree isogenies [12]. For twisted Edwards curves, Moody and Shumow proposed a generalized odd-degree isogeny formula [14]. In [15], Kim et al. optimized the Moody and Shumow formula by using the w-coordinate on Edwards curves.

Odd-Degree Isogenies
Generally, an isogeny operation is divided into two parts: the evaluation of the isogeny at a point and the computation of the coefficients of the image curve. In this section, we briefly introduce the formula in [12] for point evaluations. For coefficient computations, we introduce various methods that can be used to implement CSIDH. From this section on, M, S, and a refer to a field multiplication, squaring, and addition, respectively.

Point Evaluation
In [12], Costello and Hisil proposed a simple formula for computing arbitrary degree isogenies on Montgomery curves. Their formula can be summarized as follows.
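Theorem 1. For a field $K$ whose characteristic is not 2, let $P$ be a point of order $\ell = 2d + 1$ on the Montgomery curve $M_{a,b}/K : by^2 = x^3 + ax^2 + x$, and write $\sigma = \sum_{i=1}^{d} x_{[i]P}$, $\tilde{\sigma} = \sum_{i=1}^{d} 1/x_{[i]P}$, and $\pi = \prod_{i=1}^{d} x_{[i]P}$, where $x_{[i]P}$ denotes the x-coordinate of $[i]P$. The Montgomery curve $M_{a',b'}/K : b'y^2 = x^3 + a'x^2 + x$ with
$$a' = (6\tilde{\sigma} - 6\sigma + a) \cdot \pi^2, \qquad b' = b \cdot \pi^2 \qquad (1)$$
is the codomain of the $\ell$-isogeny $\varphi : M_{a,b} \to M_{a',b'}$ with $\ker \varphi = \langle P \rangle$, which is defined by
$$\varphi : (x, y) \mapsto (f(x), \; y f'(x)), \qquad (2)$$
where $f(x) = x \cdot \prod_{i=1}^{d} \left( \frac{x \cdot x_{[i]P} - 1}{x - x_{[i]P}} \right)^2$ and $f'(x)$ is its derivative.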
As mentioned earlier, because the Montgomery curve arithmetic can only be constructed with the x-coordinate, the function f (x) is of our main interest in (2).
Let $P$ be a point of order $\ell = 2d + 1$ on a Montgomery curve, and write its multiples in projective XZ-coordinates as $[i]P = (X_i : Z_i)$ for $1 \le i \le d$. Let $\varphi$ be the $\ell$-isogeny with $\ker \varphi = \langle P \rangle$. From the formula proposed in [12], the image $(X' : Z')$ of a point $(X : Z)$ outside the kernel is computed as
$$X' = X \cdot \left( \prod_{i=1}^{d} \left[ (X - Z)(X_i + Z_i) + (X + Z)(X_i - Z_i) \right] \right)^2 \qquad (3)$$
$$Z' = Z \cdot \left( \prod_{i=1}^{d} \left[ (X - Z)(X_i + Z_i) - (X + Z)(X_i - Z_i) \right] \right)^2 \qquad (4)$$
For an $\ell$-isogeny evaluation, the computational cost is $(4d)$M + 2S + $(4d + 2)$a.
As shown in (1), computing the image curve using Theorem 1 in [12] is somewhat complicated. Therefore, alternative ways to recover the coefficient of the image curve are presented in [12]. The first is to use a two-torsion point of a Montgomery curve, and the other is to use two points of a Montgomery curve and their difference. We shall call the former the two-torsion method and the latter the differential method. As the two-torsion method is of primary interest in this paper, we describe only its details. Additionally, in the following subsections we describe two other ways to compute the coefficient of the image curve, presented in [7,8].
Remark 1. Using the differential method, we can alternatively compute the image curve coefficient at a cost of 8M + 5S + 11a [12]. However, unlike SIDH, CSIDH does not otherwise require such three points, so additional point evaluations are needed when this method is used. Thus, with the differential method, CSIDH would be slower and would have a larger key size than the original method. Therefore, we exclude the differential method in this paper.
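As a small illustration of the point evaluation, the following Python sketch evaluates the projective product form of [12], X' = X (prod[(X - Z)(X_i + Z_i) + (X + Z)(X_i - Z_i)])^2 and Z' = Z (prod[(X - Z)(X_i + Z_i) - (X + Z)(X_i - Z_i)])^2, and compares it with the affine map f(x) of Theorem 1. The prime `P_MOD`, the function names, and the sample kernel values are assumptions made only for this example; the check is a purely algebraic identity, so the kernel values need not come from an actual torsion point.

```python
P_MOD = 1031  # small illustrative prime (assumption), not a CSIDH prime

def eval_isogeny_xz(X, Z, kernel_xz, p=P_MOD):
    """Image (X' : Z') of (X : Z); kernel_xz lists (X_i, Z_i) = x([i]P) for i = 1..d."""
    s, t = (X + Z) % p, (X - Z) % p
    num, den = 1, 1
    for Xi, Zi in kernel_xz:
        u = t * (Xi + Zi) % p      # (X - Z)(X_i + Z_i)
        v = s * (Xi - Zi) % p      # (X + Z)(X_i - Z_i)
        num = num * (u + v) % p    # accumulates the product for X'
        den = den * (u - v) % p    # accumulates the product for Z'
    return X * num * num % p, Z * den * den % p

def eval_isogeny_affine(x, kernel_x, p=P_MOD):
    """Affine f(x) = x * prod(((x*x_i - 1) / (x - x_i))^2)."""
    out = x % p
    for xi in kernel_x:
        ratio = (x * xi - 1) * pow((x - xi) % p, p - 2, p) % p
        out = out * ratio * ratio % p
    return out
```

For example, with hypothetical kernel x-coordinates 5 and 77 written projectively as (15 : 3) and (770 : 10), the ratio X'/Z' returned by `eval_isogeny_xz` agrees with `eval_isogeny_affine` at the same input point.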

The 2-Torsion Method
In [12], the main idea is to use two-torsion points for coefficient computation, as pushing a two-torsion point through an odd-degree isogeny preserves its order on the image curve.
For a Montgomery curve defined over $K$, it is well known that the two-torsion points have the form $(0, 0)$, $(\alpha, 0)$, $(\alpha^{-1}, 0)$ for $\alpha \in \bar{K}$. If we know $\alpha$ for a two-torsion point on a Montgomery curve, then we can recover the coefficient of the curve. For a given elliptic curve $M_a$, since $\alpha^2 + a\alpha + 1 = 0$, we have
$$a = -\frac{\alpha^2 + 1}{\alpha}. \qquad (5)$$
Let $\varphi : M_a \to M_{a'}$ be an isogeny of odd degree $\ell = 2d + 1$, and let $P = (\alpha, 0)$ be a two-torsion point on $M_a$. Then $\varphi(P)$ is a two-torsion point on $M_{a'}$. Using this fact, we can recover the coefficient of the image curve by first evaluating $\varphi(P)$ and then obtaining the coefficient from (5). More precisely, assuming that $\varphi(P) = (\alpha', 0)$, we obtain $a' = -((\alpha')^2 + 1)/\alpha'$. In projective coordinates, let $\varphi(P) = (X_{\alpha'} : Z_{\alpha'})$, where $\alpha' = X_{\alpha'}/Z_{\alpha'}$. The projective curve coefficient of the image curve is then given by
$$(A' : C') = (X_{\alpha'}^2 + Z_{\alpha'}^2 : -X_{\alpha'} Z_{\alpha'}).$$
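The two-torsion method can be sanity-checked on a toy 3-isogeny. The Python sketch below is an illustration under stated assumptions: it uses a small prime `P_MOD`, picks the curve coefficient so that a chosen value `x3` is a root of the 3-division polynomial 3x^4 + 4a x^3 + 6x^2 - 1, and then compares the coefficient a' = (6/x3 - 6 x3 + a) x3^2 given by Theorem 1 of [12] for d = 1 with the coefficient obtained by pushing the two-torsion point (alpha, 0) through the isogeny and applying a' = -((alpha')^2 + 1)/alpha'. All function names are hypothetical.

```python
P_MOD = 1031  # small illustrative prime with P_MOD % 8 == 7 (assumption, not a CSIDH prime)

def inv(v, p=P_MOD):
    """Modular inverse via Fermat's little theorem."""
    return pow(v % p, p - 2, p)

def coeff_from_two_torsion(X, Z, p=P_MOD):
    """Projective coefficient (A : C) = (X^2 + Z^2 : -X*Z) from a two-torsion point (X : Z)."""
    return (X * X + Z * Z) % p, (-X * Z) % p

def three_isogeny_check(x3, p=P_MOD):
    """Return (a, a' from Theorem 1, a' from the two-torsion point), or None if unusable."""
    # Choose a so that x3 is the x-coordinate of a point of order 3 on M_a.
    a = (1 - 6 * x3**2 - 3 * x3**4) * inv(4 * x3**3, p) % p
    disc = (a * a - 4) % p
    if disc == 0 or pow(disc, (p - 1) // 2, p) != 1:
        return None  # singular curve, or alpha does not lie in F_p
    alpha = (-a + pow(disc, (p + 1) // 4, p)) * inv(2, p) % p
    a_theorem = (6 * inv(x3, p) - 6 * x3 + a) * x3 * x3 % p   # Theorem 1 with d = 1
    ratio = (alpha * x3 - 1) * inv(alpha - x3, p) % p
    alpha_img = alpha * ratio * ratio % p                     # f(alpha)
    A, C = coeff_from_two_torsion(alpha_img, 1, p)
    return a, a_theorem, A * inv(C, p) % p

def first_valid_demo(p=P_MOD):
    """Search for the first x3 that yields a usable toy curve."""
    for x3 in range(2, p):
        result = three_isogeny_check(x3, p)
        if result is not None:
            return result
    return None
```

Calling `first_valid_demo()` returns a triple whose last two entries should coincide, which is the statement that one extra point evaluation plus the relation above recovers the image coefficient.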

Remark 2.
Recently, in [13], Castryck and Decru proposed the CSURF algorithm using the tweaked Montgomery curve $M^t_a : y^2 = x^3 + ax^2 - x$, which is about 5.68% faster than the original CSIDH. CSURF can also use the two-torsion method because all three two-torsion points are in $M^t_a(\mathbb{F}_p)$. If $(\alpha, 0)$ is a two-torsion point on a tweaked Montgomery curve $M^t_a$ with $\alpha \neq 0$, then, since $\alpha^2 + a\alpha - 1 = 0$, we can reconstruct the tweaked Montgomery coefficient as $a = (A : C) = (Z_\alpha^2 - X_\alpha^2 : X_\alpha Z_\alpha)$, where $\alpha = X_\alpha/Z_\alpha$. So, we can compute an image curve coefficient with one additional point evaluation and 2M + 2a. Using the two-torsion method, CSURF might be more efficient in the odd-degree isogeny computations.

Optimization by Castryck et al.
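Castryck et al. [7] optimize (1) to compute the coefficient of the image curve, as an $\mathbb{F}_p$-rational two-torsion point other than $(0, 0)$ does not exist for the original parameters of CSIDH.
For a point $P$ of order $\ell$ on $E$ and $k \in \{1, \cdots, \ell - 1\}$, let $(X_k : Z_k)$ be the projective x-coordinate of $[k]P$. Define $c_i \in \mathbb{F}_p$ such that
$$\prod_{k=1}^{\ell-1} (Z_k w + X_k) = \sum_{i=0}^{\ell-1} c_i w^i$$
as polynomials in $w$, and define $\tau$, $\sigma$ by
$$\tau = \prod_{k=1}^{\ell-1} \frac{X_k}{Z_k} = \frac{c_0}{c_{\ell-1}}, \qquad \sigma = \sum_{k=1}^{\ell-1} \left( \frac{X_k}{Z_k} - \frac{Z_k}{X_k} \right) = \frac{c_{\ell-2}}{c_{\ell-1}} - \frac{c_1}{c_0}.$$
The coefficient $(a' : 1)$ of the image curve of the $\ell$-isogeny with kernel $\langle P \rangle$ is then computed by
$$(a' : 1) = (\tau(a - 3\sigma) : 1) = (a c_0 c_{\ell-1} - 3(c_0 c_{\ell-2} - c_1 c_{\ell-1}) : c_{\ell-1}^2). \qquad (6)$$
Using this method, the cost of calculating the curve coefficient is $(6d - 2)$M + 3S + 4a in implementation.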
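The polynomial formulation can be illustrated with a short Python sketch. It assumes the relation tau = c_0 / c_{l-1}, sigma = c_{l-2}/c_{l-1} - c_1/c_0 and the projective coefficient (a c_0 c_{l-1} - 3(c_0 c_{l-2} - c_1 c_{l-1}) : c_{l-1}^2), where the c_i are the coefficients of prod(Z_k w + X_k). The prime `P_MOD`, the helper names, and the naive quadratic polynomial multiplication are choices made for this example only and do not reflect the operation count quoted above.

```python
P_MOD = 1031  # small illustrative prime (assumption)

def poly_mul(f, g, p=P_MOD):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] = (out[i + j] + fi * gj) % p
    return out

def image_coeff_from_kernel(a, kernel_xz, p=P_MOD):
    """kernel_xz: (X_k, Z_k) for k = 1..l-1. Returns the projective image coefficient (A' : C')."""
    c = [1]
    for Xk, Zk in kernel_xz:
        c = poly_mul(c, [Xk % p, Zk % p], p)   # factor X_k + Z_k * w
    c0, c1, c_sub, c_top = c[0], c[1], c[-2], c[-1]
    num = (a * c0 * c_top - 3 * (c0 * c_sub - c1 * c_top)) % p
    return num, c_top * c_top % p

def image_coeff_direct(a, kernel_xz, p=P_MOD):
    """Affine tau * (a - 3*sigma) computed directly from x_k = X_k / Z_k."""
    tau, sigma = 1, 0
    for Xk, Zk in kernel_xz:
        x = Xk * pow(Zk, p - 2, p) % p
        tau = tau * x % p
        sigma = (sigma + x - pow(x, p - 2, p)) % p
    return tau * (a - 3 * sigma) % p
```

Both routes give the same value for any nonzero inputs, since the identity between them is purely algebraic; only the projective version avoids inversions.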

Exploiting Twisted Edwards Curves
In [8], Meyer and Reith proposed a Montgomery-Edwards hybrid method for implementing CSIDH. They exploited the fact that recovering the coefficient of the image curve is more efficient on twisted Edwards curves than Montgomery curves. By using the efficiency of the birational map between Montgomery curves and twisted Edwards curves, they used Montgomery curves for scalar multiplication and isogeny evaluation and used twisted Edwards curves for recovering the coefficient of the image curve.
The process can be outlined as follows. Let $\psi$ denote an isogeny on a twisted Edwards curve, $\iota$ the conversion from Montgomery to twisted Edwards curves, and $\iota^{-1}$ the conversion from twisted Edwards to Montgomery curves.
By composing the functions as $\varphi = \iota^{-1} \circ \psi \circ \iota$, one can obtain the coefficient of a Montgomery curve. Using this method, the computational cost of recovering the curve coefficient is $(2d)$M + 6S + 6a + $2c(\ell)$, where $c(\ell)$ is the cost of computing $r^\ell$ for a constant $r \in \mathbb{F}_p$. Details of this method can be found in [8].

Proposed Method
In this section, we present the optimized algorithms for the CSIDH group action. First, we briefly state our motivation. The idea is to use the two-torsion method to recover the coefficient of the image curve. To use the two-torsion method in [12], we adjust the prime so that rational two-torsion points other than $(0, 0)$ exist over $\mathbb{F}_p$. CSIDH with the proposed parameter is performed on the surface. We provide two versions of our modified CSIDH: one exchanges the two-torsion points, and the other calculates the two-torsion point for a given elliptic curve.

Motivation
Although there are efficient ways of computing 3- and 4-isogenies on Montgomery curves, the original formula presented in [12] for computing the coefficient of the image curve is inefficient for large odd-degree isogenies. Therefore, Costello and Hisil proposed alternative methods for computing the coefficient of the image curve. However, these methods do not fit the CSIDH protocol as originally specified, as there is no rational two-torsion point other than $(0, 0)$, and the protocol does not use the difference of two points as SIDH does. Hence, Castryck et al. compute the coefficient of the image curve by using (6).
On the other hand, Meyer et al. exploit twisted Edwards curves for computing the coefficients of the image curve, as there is a simple formula for recovering the coefficient, proposed by Moody and Shumow in [14]. By combining Montgomery and twisted Edwards curves, Meyer's method sped up the CSIDH protocol. In [15], using the Edwards w-coordinate, Kim et al. proposed an optimized isogeny formula on Edwards curves, which can be used to implement CSIDH fully on Edwards curves.
To summarize, unlike in SIDH, using only Montgomery curves might be an inefficient choice for implementing the CSIDH protocol. However, as summarized in Table 1, if the two-torsion method can be applied, then we can implement CSIDH efficiently and entirely on Montgomery curves. Therefore, we provide a way to use the two-torsion method for computing the coefficients in CSIDH by tailoring the prime used for the base field. The proposed parameter executes CSIDH on the surface. We prove that our method also provides a free and transitive group action.

Proposed Method
We define a new prime and a new base curve so that a rational two-torsion point other than $(0, 0)$ exists, which enables the two-torsion method. By doing so, we can construct a more efficient Montgomery-only CSIDH.

New Parameters
Let $M_a$ be a Montgomery curve defined over a finite field $\mathbb{F}_p$, where $p \equiv 3 \bmod 4$. If $M_a$ has a two-torsion point over $\mathbb{F}_p$ other than $(0, 0)$, then the two-torsion subgroup $M_a(\mathbb{F}_p)[2]$ satisfies $|M_a(\mathbb{F}_p)[2]| = 4$. In this situation, the supersingular elliptic curve $M_a/\mathbb{F}_p$ is on the surface, satisfying $\mathrm{End}_{\mathbb{F}_p}(M_a) \cong \mathbb{Z}[(1 + \sqrt{-p})/2]$ [13]. Note that the original CSIDH uses $p \equiv 3 \bmod 8$, so that the supersingular curve $M_a/\mathbb{F}_p$ is on the floor, satisfying $\mathrm{End}_{\mathbb{F}_p}(M_a) \cong \mathbb{Z}[\sqrt{-p}]$. Thus, in order to have two-torsion points over $\mathbb{F}_p$, we must use a prime of the form $p \equiv 7 \bmod 8$.
Following the notation presented in [13], we define the set $S^{+}_{p} = \{a \in \mathbb{F}_p \mid y^2 = x^3 + ax^2 + x \text{ is supersingular}\}$ and the set of elliptic curves satisfying This set splits into two partitions, as follows. In order to have a transitive group action, we refer to the following lemma.
Lemma 1. Let $p \equiv 7 \bmod 8$ and let the supersingular Montgomery curve $M_a : y^2 = x^3 + ax^2 + x$ be on the surface. Then there exists $P = (x, y) \in M_a(\mathbb{F}_p)$ such that $[2]P = (0, 0)$ if and only if $a \pm 2$ are both squares in $\mathbb{F}_p$.
Using this lemma, we can prove the following theorem.
Similarly, $a' \pm 2 = (X' \mp Z')^2/(-X'Z')$. Hence, the squareness of $a \pm 2$ (resp. $a' \pm 2$) and of $-XZ$ (resp. $-X'Z'$) is the same. Additionally, by applying (3) and (4), we see that the squareness of $-XZ$ and $-X'Z'$ is the same. Following the proof of Lemma 1, $a \pm 2$ and $a' \pm 2$ are either all squares in $\mathbb{F}_p$ or all non-squares in $\mathbb{F}_p$. Therefore, Theorem 2 holds by Lemma 1.
By Theorem 2, we consider the free and transitive group action given in (7).
A two-torsion point $P$ on a Montgomery curve other than $(0, 0)$ is always of the form $(\alpha, 0)$. Since $\alpha^2 + a\alpha + 1 = 0$, either $\alpha \in \mathbb{F}_p$ or $\alpha \in \mathbb{F}_{p^2}$. The initial curve of the original CSIDH is $y^2 = x^3 + x$, for which the x-coordinate of this two-torsion point lies in $\mathbb{F}_{p^2}$, the extension field of $\mathbb{F}_p$. Accordingly, we need new parameters that offer two-torsion points in $M_a(\mathbb{F}_p)$ other than $(0, 0)$. The following are those parameters.
$$p = 2^4 \cdot 3^3 \cdot 5 \cdot 7 \cdot 11^2 \cdot 13 \cdots 373 - 1 \approx 2^{510.1} \qquad (8)$$
We use the prime $p \equiv 7 \bmod 8$ and a Montgomery curve $M_a$ satisfying $|M_a(\mathbb{F}_p)[2]| = 4$. Accordingly, we can apply the free and transitive group action presented in (7). Note that, because it uses only the 73 consecutive odd primes starting at 3, this parameter provides a lower security level than the parameters of CSIDH-512. The parameter proposed in this paper is just an example parameter for applying the two-torsion method to CSIDH.
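The shape of the proposed prime can be checked with a few lines of Python. This sketch assumes the reading p = 2^4 * 3^3 * 5 * 7 * 11^2 * 13 * ... * 373 - 1, that is, the product of all odd primes up to 373 multiplied by the extra factor 2^4 * 3^2 * 11; the function names are hypothetical, and primality of p itself is not tested here.

```python
def odd_primes_up_to(n):
    """Odd primes <= n via a simple sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [q for q in range(3, n + 1) if sieve[q]]

def proposed_prime():
    """Return (p, ells) for the assumed factorization of p + 1."""
    ells = odd_primes_up_to(373)
    product = 1
    for q in ells:
        product *= q
    extra = 2**4 * 3**2 * 11   # extra factors beyond one copy of each odd prime
    return extra * product - 1, ells

p, ells = proposed_prime()
print(len(ells), p % 8, p.bit_length())
```

The congruence p = 7 mod 8 follows directly from the factor 2^4 in p + 1, and every odd prime up to 373 divides p + 1, which is what the group action needs in order to find points of each order.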

First Method-Exchanging the Two-Torsion
The first method is to exchange two-torsion points when exchanging a curve. Alice and Bob calculate curve coefficients of image curves using a two-torsion point when computing the group action and pass it along with the image curve to each other.
Alice computes her secret isogeny $\varphi_A : E \to E_A$ with her secret key $[\mathfrak{a}]$ and computes the coefficient of $E_A$ through $\varphi_A(T)$, where $T$ is a two-torsion point on $E$. Upon receiving Bob's public key $E_B$, Alice also receives $\varphi_B(T)$ in order to compute the next phase. Likewise, Bob must also receive Alice's public key $E_A$ and $\varphi_A(T)$. As they need to send the image of the two-torsion point in projective coordinates as well as the curve, the key size will be $3b_p$ bits, where $b_p$ is the number of bits of $p$.

Second Method-Computing the 2-Torsion
Note that when using the first method, the key size is tripled to 3b p bits, where b p bits is the key size of the original CSIDH protocol. This is a huge loss as compared to a little increase in speed.
Because a two-torsion point on a Montgomery curve is of the form $(\alpha, 0)$, we can calculate $\alpha$ by solving a quadratic equation modulo $p$. Also, as $\alpha = (-a \pm \sqrt{a^2 - 4})/2$, Alice and Bob can directly calculate the two-torsion point upon receipt of the image curve computed through each other's secret isogeny.
For $p \equiv 3 \bmod 4$, if $a$ is a quadratic residue modulo $p$, then a square root of $a$ modulo $p$ is computed by $x = a^{(p+1)/4} \bmod p$. Using this equation, we can find a two-torsion point for a given elliptic curve $E$. Also, by precomputing $2^{-1} \bmod p$, we can obtain a two-torsion point with less computation. Note that the cost of computing $a^{(p+1)/4} \bmod p$ for $a \in \mathbb{F}_p$ is very small compared to the total CSIDH algorithm. Additionally, the square root is computed only twice throughout the whole protocol.
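The computation of the two-torsion point from a received coefficient can be sketched in Python as follows. The example assumes a small prime `P_MOD` with `P_MOD % 4 == 3` in place of the actual CSIDH prime, and the function names are hypothetical; it solves alpha^2 + a*alpha + 1 = 0 with one exponentiation by (p + 1)/4 and a precomputed inverse of 2.

```python
P_MOD = 1031  # small illustrative prime with P_MOD % 4 == 3 (assumption)

def sqrt_mod(v, p=P_MOD):
    """Square root modulo p for p = 3 mod 4; returns None if v is a non-residue."""
    r = pow(v % p, (p + 1) // 4, p)
    return r if r * r % p == v % p else None

def two_torsion_from_coeff(a, p=P_MOD):
    """Return alpha in F_p with alpha^2 + a*alpha + 1 = 0, or None if no such alpha exists."""
    root = sqrt_mod(a * a - 4, p)
    if root is None:
        return None
    inv2 = (p + 1) // 2          # 2^{-1} mod p for odd p, can be precomputed
    return (-a + root) * inv2 % p
```

Which of the two roots (alpha or 1/alpha) is returned depends on the sign of the square root; either one is a valid two-torsion x-coordinate for the coefficient recovery.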
When the second method is used, the key size decreases to b p bits again, so we can preserve the key size and improve speed. Summing up the whole process, a class group action by computing the two-torsion point is presented in Algorithm 1. The public key validation can also be performed as in [7] for both methods.
Alternatively, one can also exchange the image of the two-torsion point in affine coordinate, instead of the coefficient of the image curve. In this case, the coefficient of the image curve can be easily recovered from the received 2-torsion point, and the key size will be decreased to b p again. However, this requires two F p -inversions-one for recovering the affine public key and another for computing the affine two-torsion point. Thus, there is no difference between the cost of exchanging the affine image two-torsion point and our second method, and we do not explicitly consider the case.

Algorithm 1 Evaluating the class group action using the second method-Computing the two-torsion
Require: $a \in \mathbb{F}_p$ such that $M_a : y^2 = x^3 + ax^2 + x$ is a supersingular curve over $\mathbb{F}_p$, and an integer vector $(e_1, e_2, \cdots, e_n)$ with $e_i \in [-m, m]$
Ensure: $a'$ such that $M_{a'} : y^2 = x^3 + a'x^2 + x$, where $M_{a'} = [\mathfrak{l}_1^{e_1} \mathfrak{l}_2^{e_2} \cdots \mathfrak{l}_n^{e_n}]M_a$
1: Compute a two-torsion point $T$ in $M_a(\mathbb{F}_p)$ // This step is omitted in the initial group action
2: while some $e_i \neq 0$ do
...
14: if $R \neq \infty$ then
15: Compute an isogeny $\varphi : M_a \to M_{a'}$ with $\ker \varphi = \langle R \rangle$
16: ...

Implementation
In this section, we provide the implementation results and analysis. For clear expression, we shall denote the first method as Ours_Exchange and the second method as Ours_Compute.

Parameter Setting
For implementation, we used the finite field F p , where p is the prime presented in (8), and we used the Montgomery coefficient of the initial curve presented in (9) for both CSIDH and our methods. To make an exact comparison, we use the field operations that were implemented in [7] for both CSIDH and our methods.
For a more accurate comparison, we first measured the field operations over $\mathbb{F}_p$ to examine the ratio between each operation. To this end, each field operation was repeated $10^9$ times for $\mathbb{F}_p$. Table 2 summarizes the average cycle counts of the $\mathbb{F}_p$-operations and of the $\frac{p+1}{4}$-th power of field elements.

Further Modification
Let $M_a$ be a Montgomery curve. In [12], the coefficient of the Montgomery curve is represented as $(\hat{A} : \hat{C}) = (a + 2 : 4)$ instead of $(A : C) = (a : 1)$ in order to accelerate the doubling (DBL) and the combined doubling and differential addition (DBL&ADD) computations. With the transformed coefficient, the cost of DBL&ADD decreases from 8M + 4S + 11a to 8M + 4S + 8a, and the cost of DBL decreases from 4M + 2S + 7a to 4M + 2S + 4a. Additionally, the cost of recovering the coefficient from a two-torsion point decreases from 2S + 5a to 2S + 3a.
The original CSIDH implementation in [7] does not use this transformed coefficient. Although there is an additional cost for converting the form of the coefficients, we can save cost in the scalar multiplication of every $\ell_i$-isogeny operation. As this optimization also holds in our proposed method, we applied this technique to both CSIDH and our method. The transformations $(A : C) \leftrightarrow (\hat{A} : \hat{C})$ occur before and after the group action, where the elliptic curve arithmetic is used.
Additionally, we noticed that the optimized point evaluation presented in (3) and (4) is not used in the implementation of the original CSIDH. For a fair comparison, we apply (3) and (4) to the original CSIDH. To summarize, by using the transformed curve coefficient and the additional optimization of the point evaluation in CSIDH, the difference in performance lies purely in the computation that recovers the curve coefficient.
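One arrangement that is consistent with the stated 2S + 3a count can be sketched in Python. The source does not spell out the operation sequence, so the following is an assumption: since a + 2 = (X - Z)^2 / (-XZ) for a two-torsion point (X : Z), the transformed coefficient (a + 2 : 4) can be written as ((X - Z)^2 : (X - Z)^2 - (X + Z)^2). The prime `P_MOD` and the function name are hypothetical.

```python
P_MOD = 1031  # small illustrative prime (assumption)

def transformed_coeff_from_two_torsion(X, Z, p=P_MOD):
    """Return (A_hat : C_hat) = (a + 2 : 4) up to scaling, using 2 squarings and 3 additions."""
    s = (X + Z) % p              # addition 1
    t = (X - Z) % p              # addition 2
    s2 = s * s % p               # squaring 1: (X + Z)^2
    t2 = t * t % p               # squaring 2: (X - Z)^2
    return t2, (t2 - s2) % p     # addition 3: (X - Z)^2 - (X + Z)^2 = -4XZ
```

Recovering the untransformed pair (A : C) = (2((X + Z)^2 + (X - Z)^2) : (X - Z)^2 - (X + Z)^2) from the same two squarings would need two more additions, which matches the 2S + 5a figure quoted above under the same assumption.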

Implementation Result
The algorithms are implemented in C to evaluate the performance of each algorithm. All cycle counts were obtained on one core of an Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz, running Ubuntu 18.04.1 LTS. For compilation, we used GNU GCC version 7.5.0 with the compile option -O3, using the benchmark provided by [7]. The running time and clock cycles of the group action and the entire key exchange of the original CSIDH, Ours_Exchange, Ours_Compute, and Meyer's hybrid method are given in Table 3. Because the implementations are not constant-time, we report the average of one million runs. As shown in Table 3, the group action using Ours_Compute is about 7.1% faster than the original algorithm, and the entire key exchange is about 6.4% faster than the original CSIDH. The main operation for recovering a two-torsion point is computing the $\frac{p+1}{4}$-th power. The cost of recovering the two-torsion point is small compared to the cost of the entire group action, as shown in Table 2. Thus, the difference between Ours_Exchange and Ours_Compute is negligible.
Meanwhile, optimized CSIDH using twisted Edwards curves is proposed in [8,15], and using the Edwards curve is more efficient than using the two-torsion method for computing the coefficient of the image curve for higher odd-degree isogenies. However, by using the two-torsion method, we can simplify the implementation, as transformations between Montgomery curves and Edwards curves are not required. Moreover, our method provides the fastest performance among CSIDH implementations that use only Montgomery curves.

Remark 4.
Recently, in [16], Bernstein et al. proposed a new odd-degree isogeny evaluation algorithm, called the VeluSqrt algorithm, using only $\tilde{O}(\sqrt{\ell})$ $\mathbb{F}_p$-operations, where the $\tilde{O}$ is uniform in $p$. Because this algorithm affects only the evaluation of isogenies, it can be applied to all methods in Table 3.

Conclusions
In this paper, we proposed an optimized method for improving the performance of CSIDH and provided a new parameter that enables our method. We set the parameters so that the three two-torsion points on a Montgomery curve are all in $E(\mathbb{F}_p)$. By using a two-torsion point, we reduced the cost of computing the coefficient of the image curve of the odd-degree isogenies required in the group action. When our algorithm is used, the group action is about 7.1% faster than the original CSIDH, and the entire key exchange is about 6.4% faster than the original CSIDH.
As mentioned before, the method proposed in this paper is still slower than the Montgomery-Edwards hybrid method presented in [8]. However, recent studies such as [16] suggest that Montgomery-only implementations remain competitive.
To apply our method, the prime of the base field and the initial elliptic curve must be selected carefully for a target security level. Once a parameter that enables the two-torsion method is chosen, CSIDH could be optimized further by studying the application of two-isogenies, as in [13].