Abstract
Reducing the computation time of scalar multiplication for elliptic curve cryptography is a significant challenge. This study proposes an efficient scalar multiplication method for elliptic curves over finite fields $GF(2^m)$. The proposed method first converts the scalar into a binary number. Then, using Horner’s rule, the binary number is divided into fixed-length bit-words. Each bit-word undergoes repeating point doubling, which can be precomputed. However, repeating point doubling typically involves numerous inverse operations. To address this, we develop formulas that minimize the number of inverse operations. With the proposed formulas, only a single inverse operation is required, regardless of how many times the doubling is repeated. Over $GF(2^m)$, the proposed method for scalar multiplication outperforms the sliding window method, which is currently regarded as the fastest available. However, the introduced formulas require more multiplications, squares, and additions. To reduce the cost of these operations, we further optimize the square operations; this, in turn, introduces a trade-off between computation time and memory size. These challenges are key areas for future improvement.
MSC:
68P25
1. Introduction
Elliptic curve cryptography (abbreviated as ECC) was introduced by Miller [1] in 1986 and Koblitz [2] in 1987. ECC is typically defined over prime finite fields $GF(p)$ or binary finite fields $GF(2^m)$. Public key cryptographic primitives can be implemented using abelian groups generated by elliptic curves over $GF(p)$ or $GF(2^m)$. ECC provides the same level of security as traditional public key cryptography, but with smaller key sizes. In practical applications, ECC over $GF(p)$ and ECC over $GF(2^m)$ each possess distinct advantages, and the choice between them depends on the specific requirements of the application. For example, $GF(p)$ is often preferred in scenarios demanding high security and versatility, such as financial transactions, digital signatures, and SSL/TLS protocols. ECC over $GF(p)$ generally provides stronger security guarantees and is well supported in both hardware and software implementations. On the other hand, $GF(2^m)$ is particularly suitable for resource-constrained environments, such as embedded systems and Internet of Things (IoT) devices, due to its computational efficiency. Operations over $GF(2^m)$ can be significantly accelerated through hardware optimization, making them more advantageous in scenarios where high computational efficiency is critical.
ECC designs over prime fields generally offer stronger resistance to side-channel attacks, while designs over binary fields benefit from carry-free arithmetic, which makes their operations better suited to hardware implementation. The security of ECC is based on the discrete logarithm problem, which is defined as follows: given an elliptic curve E over a finite field and two points P and Q on E, find the value of k such that $Q = kP$. However, scalar multiplications and point inversions are both computationally intensive and represent key challenges. Regarding ECC defined over finite fields, numerous methods have been developed to optimize scalar multiplication and point inversion, including algebraic theorem-based designs [3], bit-slicing techniques [4], lookup tables [5], non-adjacent forms (NAFs) [6], and so on. For instance, the method in [7] minimizes the number of non-zero bits using the direct recoding method [8] to enhance scalar multiplication. Implementing ECC arithmetic operations on various coordinates can lead to faster computations. In [9], Jacobian coordinates are used to achieve high-efficiency point addition and doubling without requiring point inversions. In [10], the authors derive formulas for in -projective coordinates and for in both affine and -projective coordinates, marking the first study in -projective coordinates.
The methods presented in [11,12,13] transform scalar multiplication processes from affine to projective coordinate systems, with implementations verified on FPGA boards. For a more in-depth analysis on using various coordinates, we refer the reader to [14]. In terms of hardware implementations, the lookup table approach in [15] optimizes double point-doubling operations, while the triple-based chain method [16] reduces time consumption in elliptic curve cryptosystems. A low-latency window algorithm [17] enhances security, as does an enhanced comb method for point addition and doubling. In [18], a configurable ECC crypto-processor defined with the Weierstrass equation over prime fields was implemented and verified on a Xilinx FPGA board. Modular multipliers over are discussed in [19], and algorithmic improvements for computational complexity in low-power devices are presented in [20].
Reducing the number of inverse operations in scalar multiplication is crucial, as inversion over finite fields is the most time-consuming of all basic operations. In this work, a modified Horner’s rule based on binary scalar representation and a grouping technique are employed to accelerate scalar multiplication. Using the grouping technique, the scalar is partitioned into bit-words, where each represents a sum of repeating point doublings that can be precomputed and stored. Instead of traditional point doubling, this study derives formulas for performing repeating point doubling. These formulas require only one inversion operation regardless of the number of repetitions. Unlike projective coordinate systems, the derived formulas are based on the affine coordinate system. To the best of our knowledge, these formulas are the first to compute scalar multiplication in this manner. Additionally, the proposed method is suitable for both software and hardware implementations, as the arithmetic operations are simple and consistent in execution. From a software perspective, the proposed method achieves faster scalar multiplication computation compared with the sliding window algorithm [21]. While the sliding window method [21] is a highly efficient general-purpose technique and is widely regarded as the fastest available, it may not be optimal for all scenarios. Integrating the proposed method with the sliding window algorithm can further enhance its performance.
The contributions of this study are as follows:
- We propose an efficient repeating point-doubling algorithm that requires only a single inversion operation, regardless of the number of doublings.
- A generalizable accelerated squaring method is introduced, which can be applied to inverse element computation.
- The proposed repeating point-doubling algorithm can enhance the performance of the sliding window method or any other technique requiring repeating point-doubling operations.
- The calculation of repeated point doubling is a critical component in algorithms for computing scalar multiplication. By replacing these operations with our proposed method, we can achieve further improvements in efficiency. For instance, our approach demonstrates significant performance gains when applied to techniques such as the sliding window algorithm, as evidenced by the experimental data presented in Section 4.
The rest of this paper is organized as follows: Section 2 introduces finite field arithmetic on elliptic curves. In Section 3, formulas for repeating point doubling are derived, which significantly reduce the computation time compared with traditional point doubling. Additionally, a modified square operation is introduced to further improve efficiency. Section 4 presents the results of the simulation implemented in Python 3.9 and executed on an Intel Core i9-14900K processor, showcasing the performance of the proposed methods. Finally, conclusions are provided in Section 5.
2. Preliminaries
2.1. Basic Operations on $GF(2^m)$
In the following, the binary operator “+” will denote an addition operation, which may vary depending on the context, such as addition of real numbers, bits, polynomials, or points on an elliptic curve. The exact meaning of “+” will be clear from the context in which it is used. When the operands are bits (elements of $GF(2)$), the operation refers to addition modulo 2 (i.e., binary addition).
Let be an element in , where for . Then, A can be represented as a polynomial . For simplicity, is referred to as being defined over . Let be defined over . Then, the addition of and , denoted as , is defined by . For example, suppose that and are elements in . We can express these binary elements as polynomials and Now, performing the addition , which is equivalent to bitwise addition modulo 2, we obtain the following:
or equivalently, in binary form: In programming, is the XOR operation of and . The multiplication of and is the remainder of the product of and by dividing an irreducible polynomial of degree m defined over . Symbolically, the multiplication of and in is denoted as .
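The two operations just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a polynomial over $GF(2)$ is stored as an integer whose bit i is the coefficient of $x^i$, the helper names `gf_add` and `gf_mul` are hypothetical, and the toy field $GF(2^4)$ with the irreducible polynomial $x^4 + x + 1$ is chosen for readability rather than taken from the experiments in this paper.

```python
# Illustrative helpers (names and parameters are assumptions, not from the paper).
# A polynomial over GF(2) is stored as an int: bit i holds the coefficient of x^i.

def gf_add(a, b):
    """Addition in GF(2^m): coefficient-wise addition modulo 2, i.e., XOR."""
    return a ^ b


def gf_mul(a, b, f, m):
    """Multiplication in GF(2^m): the product a*b reduced modulo f.

    f is an irreducible polynomial of degree m; a and b must be below 2^m.
    """
    result = 0
    while b:
        if b & 1:            # current coefficient of b is 1: add the shifted a
            result ^= a
        b >>= 1
        a <<= 1              # multiply a by x
        if (a >> m) & 1:     # degree reached m: subtract (XOR) f once
            a ^= f
    return result


# Toy field: GF(2^4) with f(x) = x^4 + x + 1.
M, F = 4, 0b10011
total = gf_add(0b1011, 0b0110)          # (x^3 + x + 1) + (x^2 + x)
product = gf_mul(0b0010, 0b1000, F, M)  # x * x^3 = x^4, reduced modulo f
```

Here `total` is the bit pattern of $x^3 + x^2 + 1$, and `product` is the bit pattern of $x + 1$, because $x^4 \equiv x + 1$ modulo $x^4 + x + 1$.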
Over , the Extended Euclidean Algorithm is employed to compute the remainder of the product of and by dividing . However, there are many time-consuming divisions in the algorithm. In order to avoid the divisions, Fermat’s Little Theorem is usually employed to compute the remainder or inverse. A polynomial is said to be the inverse of a polynomial if
We denote the inverse of $A$ as $A^{-1}$. By Fermat’s Little Theorem, if $A$ is a nonzero element of $GF(2^m)$, then the inverse of $A$ is equal to $A^{2^m-2}$; moreover, $A^{2^m-1} = 1$.
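As a hedged illustration of inversion by Fermat’s Little Theorem, the sketch below raises a nonzero element to the power $2^m - 2$ with square-and-multiply. The function names, the integer encoding of polynomials (bit i is the coefficient of $x^i$), and the toy field $GF(2^5)$ with $x^5 + x^2 + 1$ are assumptions made for this example only.

```python
# Sketch: inversion in GF(2^m) via Fermat's Little Theorem, A^(-1) = A^(2^m - 2).
# Names and the toy field are illustrative assumptions.

def gf_mul(a, b, f, m):
    """Product of a and b in GF(2^m), reduced modulo the degree-m polynomial f."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= f
    return result


def gf_inv(a, f, m):
    """Inverse of a nonzero element, computed as a^(2^m - 2) by square-and-multiply."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^m)")
    exponent = (1 << m) - 2
    result = 1
    while exponent:
        if exponent & 1:
            result = gf_mul(result, a, f, m)
        a = gf_mul(a, a, f, m)   # repeated squaring
        exponent >>= 1
    return result


# Toy field: GF(2^5) with f(x) = x^5 + x^2 + 1.
M, F = 5, 0b100101
inverse_of_x = gf_inv(0b00010, F, M)
```

No polynomial division is needed: the whole computation consists of field multiplications and squarings, which is why the cost of squaring matters for inversion as well.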
2.2. Point Addition and Point Doubling on Elliptic Curve
The elliptic curve defined in is given by , where A and B are elements in . Let and be two points in . The point addition of P and Q, denoted by , is the point in obtained as follows:
- If , is defined by the negative of the point that is the intersection of and the line passing through P and Q. Let denote the slope of the line. Then,
- If , is the negative of the point that is the intersection of and the tangent line passing through the point P. is written as and is called the point doubling of P, and we have
An illustrative example is presented based on the definitions of point addition and point doubling as follows. Over , let be points on , where , and an irreducible polynomial . Let . Then,
Let . Then,
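For readers who want to experiment with point addition and point doubling, the following sketch uses the standard affine formulas for a binary curve of the form $y^2 + xy = x^3 + Ax^2 + B$. The curve form, the toy parameters (`M`, `F`, `A`, `B`), and all function names are assumptions of this example and are not taken from the worked example above; `None` stands for the point at infinity.

```python
# Sketch under stated assumptions: curve y^2 + xy = x^3 + A*x^2 + B over GF(2^m),
# affine coordinates, None = point at infinity. All parameters are toy values.
M, F = 5, 0b100101        # GF(2^5) with f(x) = x^5 + x^2 + 1
A, B = 1, 1


def mul(a, b):
    """Field multiplication modulo F."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= F
    return r


def inv(a):
    """Field inverse a^(2^M - 2) (Fermat's Little Theorem); a must be nonzero."""
    r, e = 1, (1 << M) - 2
    while e:
        if e & 1:
            r = mul(r, a)
        a = mul(a, a)
        e >>= 1
    return r


def on_curve(P):
    if P is None:
        return True
    x, y = P
    return mul(y, y) ^ mul(x, y) == mul(mul(x, x), x) ^ mul(A, mul(x, x)) ^ B


def negate(P):
    """-(x, y) = (x, x + y) on this curve form."""
    return None if P is None else (P[0], P[0] ^ P[1])


def point_double(P):
    if P is None or P[0] == 0:          # x = 0 means P = -P, so 2P is infinity
        return None
    x, y = P
    lam = x ^ mul(y, inv(x))            # slope of the tangent line
    x3 = mul(lam, lam) ^ lam ^ A
    y3 = mul(x, x) ^ mul(lam ^ 1, x3)
    return (x3, y3)


def point_add(P, Q):
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:                        # either Q = P or Q = -P
        return point_double(P) if y1 == y2 else None
    lam = mul(y1 ^ y2, inv(x1 ^ x2))    # slope of the line through P and Q
    x3 = mul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
    y3 = mul(lam, x1 ^ x3) ^ x3 ^ y1
    return (x3, y3)


# All affine points of the toy curve, found by exhaustive search.
POINTS = [(x, y) for x in range(1 << M) for y in range(1 << M) if on_curve((x, y))]
```

Each call to `point_add` or `point_double` performs exactly one field inversion, which is the cost that the repeated-doubling formulas of Section 3 aim to reduce.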
3. Optimizing Scalar Multiplications
3.1. Scalar Multiplication
Let $k$ be an integer and P be a point in . The scalar multiplication of P is defined by $kP = P + P + \cdots + P$ ($k$ terms). The computation of $kP$ is lengthy. To reduce the duration, first, k is converted to a binary representation, as follows:
where for . Let be non-negative integers such that with . Then, using Horner’s rule, Equation (3) can be represented as
For example, suppose that and . Then,
As the idea behind the proposed method comes from the sliding window [21], let us briefly introduce the basic concept of the sliding window by the following example. For k in Equation (4) with window size , k can be written as
In accordance with the sliding window method, the precomputations are point additions; a point doubling number of 9; and , , , and . The number of point doubling is the number of times a window with length w is successively shifted one place from left to right, skipping the zeros if they are not in the window. More details on the sliding window method can be found in [21]. With the proposed method, is written as
In Equation (6), for , each
is referred to as a w-bit word, denoted as . For the last r terms,
in Equation (6) is also represented as a w-bit word with for .
For Equations (7) and (8), it is evident that any scalar multiplication operation can be equivalently expressed as the computation of , for each i. For a small value of w, the points can be precomputed and stored in advance, as illustrated in Table 1. In this table, given the point P, the scalar k, and the word length w, the result of Equation (7) or (8) can be directly retrieved from the entry , provided that for and .
We will use the following example to demonstrate how to look up values in Table 1. Suppose that ; we precompute all eight possible combinations, as shown in the following table. For the value of k according to Equation (5), for the combination 110, we obtain the result from the table entry , which is .
| 0 | |
| P | |
Therefore, given point P, scalar k, and word length w, can be computed with the following Algorithm 1, ScalarMUL.
| Algorithm 1 ScalarMUL | |
| 1. | Set |
| 2. | Using P to create table L, as shown in Table 1 |
| 3. | Set and |
| 4. | For downto 0 |
| 5. | do |
| 6. | |
| 7. | |
| 8. | Enddo |
| 9. | |
| 10. | |
| 11. | return Q |
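The idea behind a fixed-word lookup-table scalar multiplication can be sketched generically as follows. This is a minimal illustration in the spirit of ScalarMUL, not a transcription of Algorithm 1: the function name `scalar_mul_fixed_word` and the callables `add` and `double` are hypothetical, and the scalar is split into w-bit words starting from the least significant bit, so a short word (if any) appears at the top rather than at the end.

```python
# Sketch: fixed-length w-bit words plus a lookup table of small multiples of P.
# The group is passed in as callables so the logic can be checked on any group.

def scalar_mul_fixed_word(k, P, w, add, double, zero):
    """Return kP for k >= 0 using w-bit words and a table with 2^w entries."""
    # Table L[j] = jP for every w-bit word j (computed once per point P).
    table = [zero]
    for _ in range(1, 1 << w):
        table.append(add(table[-1], P))

    # Split k into w-bit words, least significant word first.
    mask = (1 << w) - 1
    words = []
    while k:
        words.append(k & mask)
        k >>= w

    # Horner's rule over the words, most significant word first:
    # Q <- 2^w * Q + L[word].
    Q = zero
    for word in reversed(words):
        for _ in range(w):          # w repeated point doublings
            Q = double(Q)
        Q = add(Q, table[word])
    return Q


# Usage with the integers under addition as a stand-in group (kP is then k * P).
result = scalar_mul_fixed_word(0b1101_0110_1011, 7, 4,
                               add=lambda a, b: a + b,
                               double=lambda a: 2 * a,
                               zero=0)
```

The inner loop of w consecutive doublings is the step that a repeated point-doubling routine would replace when the group is an elliptic curve.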
3.2. Reducing Inverse in the Repeating Point Doubling
The sliding window method [21] shifts a window of length $w$ and skips over runs of zeros between windows, disregarding fixed digit boundaries. In the ScalarMUL algorithm, by contrast, the binary representation of k is partitioned into fixed-length bit-words of size w, and the words are processed sequentially. This approach can also be extended to the sliding window method, as will be demonstrated in Section 4 with the experimental results. Within the ScalarMUL algorithm, it is necessary to compute $2^wQ$ in step 7 and $2^rQ$ in step 9.
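For comparison, a textbook left-to-right sliding window multiplication can be sketched as below. This is a generic formulation and may differ in detail from the variant in [21]; the function name and the callables `add` and `double` are assumptions of this example. Only odd multiples of P are precomputed, and runs of zero bits cost one doubling each.

```python
# Sketch: textbook left-to-right sliding window scalar multiplication.
# The group is passed in as callables so the logic can be checked on any group.

def sliding_window_mul(k, P, w, add, double, zero):
    """Return kP for k >= 0 with window size w >= 1."""
    # Precompute the odd multiples: table[j] = (2j + 1)P for j < 2^(w-1).
    two_p = double(P)
    table = [P]
    for _ in range(1, 1 << (w - 1)):
        table.append(add(table[-1], two_p))

    Q = zero
    i = k.bit_length() - 1
    while i >= 0:
        if not (k >> i) & 1:
            Q = double(Q)               # zero bit outside a window: skip over it
            i -= 1
        else:
            # Longest window k_i ... k_s of length <= w that ends in a 1 bit.
            s = max(i - w + 1, 0)
            while not (k >> s) & 1:
                s += 1
            for _ in range(i - s + 1):  # repeated point doubling
                Q = double(Q)
            u = (k >> s) & ((1 << (i - s + 1)) - 1)   # odd window value
            Q = add(Q, table[u >> 1])
            i = s - 1
    return Q


# Usage with the integers under addition as a stand-in group.
result = sliding_window_mul(0b1011, 3, 3,
                            add=lambda a, b: a + b,
                            double=lambda a: 2 * a,
                            zero=0)
```

In both this method and the fixed-word approach, the dominant cost is the run of consecutive doublings, which is why a faster repeated-doubling routine benefits either one.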
According to the definition of scalar multiplication of Q and the associative property of point addition on the elliptic curve , for any positive integer n, $2^nQ$ can be expressed as the point doubling of $2^{n-1}Q$. Specifically, $2^nQ = 2(2^{n-1}Q)$. Traditionally, as described in Equation (2), $2^nQ$ can be computed using the following Algorithm 2, referred to as Tradition. In the Tradition algorithm, line 4 employs Equation (2) to compute $2Q$. Each iteration performs a point-doubling operation on Q requiring five XORs (additions), two multiplications, and one inverse operation. The addition, multiplication, and square operations mentioned here are all operations defined within .
| Algorithm 2 Tradition | |
| 1. | Set |
| 2. | For downto 0 |
| 3. | do |
| 4. | |
| 5. | Enddo |
| 6. | return Q |
To obtain , we have to compute . Therefore, in the computation, there are n inverse operations, XORs, multiplications, and squares. Since the inverse operation is computationally expensive, we have developed optimized formulas to replace the point-doubling computation in the Tradition algorithm. The derived formulas are designed to ensure that only a single inverse operation is required when computing of a given point Q, significantly improving computational efficiency. Let be a point in . For , let be the point doubling of . Then, is the scalar multiplication of . Let be the slope of the tangent line passing through the point . Then, to derive formulas for obtained from via the iteration of point doubling, first, consider . We have
where and .
In what follows, the formula for will be omitted until and are obtained.
For ,
where and .
For ,
where and .
For ,
where
and
The formulas for and can be extended iteratively for arbitrarily large values of n, allowing us to compute for any desired n. However, the derivation process becomes increasingly laborious and cumbersome as n grows larger, making it impractical for manual computation. Before establishing that there is only one inverse operation involved in the computation of scalar multiplication, it will be helpful to introduce the following recurrence relations. By following Equations (9)–(12), let , and . Then,
For , the following relationships can be easily derived:
Table 2 is an illustration of Equations (9)–(12) to compute and . In the example, the curve is defined over . For , the computations of and are shown in Appendix A.
Table 2.
Computations of and for four times point doubling of for .
Lemma 1.
For , and .
Proof of Lemma 1.
We will proceed with induction on n. Equations (9)–(12) show the basis step for and . For the inductive step,
This lemma holds. □
Corollary 1.
For , .
Proof of Corollary 1.
According to Lemma 1, ,
□
Given a point and a positive integer n, the n-times point doubling of Q can be efficiently computed using the following Algorithm 3, referred to as PDNTimes.
| Algorithm 3 PDNTimes | |
| 1. | Set |
| 2. | If , then |
| 3. | ; |
| 4. | ; |
| 5. | |
| 6. | |
| 7. | For upto n |
| 8. | do |
| 9. | |
| 10. | |
| 11. | |
| 12. | Enddo |
| 13. | |
| 14. | |
| 15. | |
| 16. | |
| 17. | |
| 18. | |
| 19. | EndIf |
| 20. | return |
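To make the "one inversion for n doublings" idea concrete, the sketch below defers all divisions by carrying the point as a triple (X, Y, Z) with x = X/Z and y = Y/Z^2 and inverting Z once at the end. This is a swapped-in fraction-based technique used purely for illustration; it is not the recurrence of PDNTimes, and its operation counts differ from those reported for Algorithm 3. The curve form $y^2 + xy = x^3 + Ax^2 + B$, the toy field, and all names are assumptions.

```python
# Sketch: n point doublings with a single field inversion (deferred division).
# Not the paper's recurrence; an illustrative stand-in under stated assumptions.
M, F = 5, 0b100101          # toy field GF(2^5), f(x) = x^5 + x^2 + 1
A = 1                       # curve coefficient of x^2; B does not appear below
COUNT = {"inv": 0}          # counts field inversions for comparison


def mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= F
    return r


def inv(a):
    COUNT["inv"] += 1
    r, e = 1, (1 << M) - 2  # a^(2^M - 2), Fermat's Little Theorem
    while e:
        if e & 1:
            r = mul(r, a)
        a = mul(a, a)
        e >>= 1
    return r


def double_affine(P):
    """One affine doubling: one inversion per call."""
    if P is None or P[0] == 0:
        return None
    x, y = P
    lam = x ^ mul(y, inv(x))
    x3 = mul(lam, lam) ^ lam ^ A
    return (x3, mul(x, x) ^ mul(lam ^ 1, x3))


def repeated_double_traditional(P, n):
    for _ in range(n):
        P = double_affine(P)
    return P


def repeated_double_one_inversion(P, n):
    """Compute 2^n P, keeping x = X/Z and y = Y/Z^2 until the very end."""
    if P is None:
        return None
    X, Y, Z = P[0], P[1], 1
    for _ in range(n):
        U = mul(X, Z)                   # slope is lam = T / U
        X2 = mul(X, X)
        T = X2 ^ Y
        Z3 = mul(U, U)
        TU = mul(T, U)
        X3 = mul(T, T) ^ TU ^ mul(A, Z3)
        Y3 = mul(mul(X2, X2), Z3) ^ mul(TU ^ Z3, X3)
        X, Y, Z = X3, Y3, Z3
    if Z == 0:                          # a point with x = 0 was doubled
        return None
    z_inv = inv(Z)                      # the only inversion
    return (mul(X, z_inv), mul(Y, mul(z_inv, z_inv)))
```

Both routines return the same point, but `repeated_double_traditional(P, n)` performs up to n inversions while `repeated_double_one_inversion(P, n)` performs at most one, at the price of additional multiplications and squarings in each iteration.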
In the PDNTimes algorithm, the computational complexity can be broken down as follows:
- Lines 5 and 6: These lines involve 3 XOR operations, 2 multiplications, and 2 square operations.
- Lines 9–11: Each iteration of the loop in these lines requires 5 XOR operations, 6 multiplications, and 6 square operations.
- Lines 13–17: These lines consist of 2 XOR operations, 4 multiplications, 3 square operations, and 1 inverse operation.
Therefore, a total of XORs, multiplications, and squares are required. However, in the case of hardware devices, the time complexity of adding any two n-bit numbers is currently , while the time complexity of their multiplication is .
Lemma 2.
Over , let be an integer and Q be a point in . The computation of n times point doubling of Q requires multiplications, squares, and one inverse operation.
For the repeating point doubling on , Table 3 demonstrates the execution times of the Tradition algorithm and the PDNTimes algorithm involved in the ScalarMUL algorithm. In other words, in line 7 of the ScalarMUL algorithm, the computation of is compared using PDNTimes and Tradition. Let and denote the execution time of the previous method and the proposed method, respectively. Then, in the table, the decreasing ratio is given by
Table 3.
The execution times ( s) for Tradition and PDNTimes over and the decreasing ratio, where .
When comparing Tradition with PDNTimes for different values of m, we observe that although reducing the number of inverse operations decreases the computation time, the additional multiplication and square operations in the formulas cause the gain to level off as n approaches 8, as illustrated in Figure 1. This trend is attributed to the increase in word length, which leads to longer table construction times and a corresponding rise in memory consumption. Furthermore, as the figure shows, the behavior is consistent across different values of m, indicating that the trade-off between fewer inversions and more multiplication and square operations persists regardless of the specific parameters.
Figure 1.
The decreasing behavior of the data shown in Table 3.
3.3. Reducing Square Operation Time
In the PDNTimes algorithm, there are many square operations in , and . To further reduce the computation time for scalar multiplication or repeating point doubling, precomputations for square operations are employed again. The method we propose below will enable the square operation to utilize three main operations: XOR, bit shifting, and table lookup. Recall that is a polynomial defined over . Then, given an integer , let d and r be integers such that and . Using Horner’s rule again (note that the m we are considering is odd),
In Equation (16), the computation of involves sequentially evaluating the expression
for increasing values of i. Similar to Equation (7) (respectively, Equation (8)), the expression (respectively, ) represents a w-bit word, denoted as (respectively, ). The result of computing , for , and can be found in the entry in Table 4 provided that for . In the subsequent discussion, the notation “” will be used to denote shifting • to the left by n positions, with all the least significant bits set to zero, where n is a positive integer.
Table 4.
The precomputations for .
In Equation (17), since the maximum degree before applying the modulo operation with respect to is less than m, the remainder obtained through traditional long division depends on the polynomial . The result of this modulo operation, denoted as , is provided in Table 5, which represents the remainder of (17). Table 5 comprehensively lists all possible outcomes for .
Table 5.
The precomputations for (17). .
Therefore, the square operation can be computed with the following Algorithm 4, SquareMod.
| Algorithm 4 SquareMod | |
| 1. | Set |
| 2. | Make table and such as Table 4 and Table 5, respectively |
| 3. | For to |
| 4. | do |
| 5. | |
| 6. | |
| 7. | Enddo |
| 8. | |
| 9. | |
| 10. | |
| 11. | return C |
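The table-driven squaring idea can be illustrated with the following sketch. Squaring in $GF(2^m)$ only spreads the coefficients apart (the coefficient of $x^i$ moves to $x^{2i}$), so the square of each w-bit word can be read from a precomputed table and combined with shifts and XORs. This is a simplified variant: the final reduction is done bit by bit rather than with a second lookup table as in SquareMod, and the function names, the word length, and the choice of the degree-163 polynomial $x^{163} + x^7 + x^6 + x^3 + 1$ are assumptions of this example.

```python
# Sketch: squaring in GF(2^m) with a per-word lookup table, shifts, and XORs.
# Simplified relative to SquareMod: the reduction here is bitwise, not table-driven.
M = 163
F = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1   # assumed reduction polynomial


def build_square_table(w):
    """table[word] = word with a zero inserted between bits (bit i -> bit 2i)."""
    table = []
    for word in range(1 << w):
        spread = 0
        for i in range(w):
            if (word >> i) & 1:
                spread |= 1 << (2 * i)
        table.append(spread)
    return table


def reduce_poly(c, f, m):
    """Remainder of the polynomial c modulo f (deg f = m), by bitwise long division."""
    for i in range(c.bit_length() - 1, m - 1, -1):
        if (c >> i) & 1:
            c ^= f << (i - m)
    return c


def gf_square(a, table, w, f, m):
    """Square of a: look up each w-bit word, shift it into place, XOR, then reduce."""
    mask = (1 << w) - 1
    c, offset = 0, 0
    while a:
        c ^= table[a & mask] << (2 * offset)   # word at bit offset lands at 2*offset
        a >>= w
        offset += w
    return reduce_poly(c, f, m)


def gf_mul(a, b, f, m):
    """Reference shift-and-add multiplication, used only to cross-check squaring."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= f
    return r


TABLE8 = build_square_table(8)       # 256 entries for 8-bit words
sample = gf_square(0b1011, TABLE8, 8, F, M)
```

A larger word length w means fewer loop iterations per squaring but a table with $2^w$ entries, which is the time and memory trade-off examined in the remainder of this section.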
In the SquareMod algorithm, for each iteration i, the result of the equation of Equation (17) is represented as . In practical implementation, the term in Equation (17) implies that each in C is shifted to the left by positions, with all lower-order bits set to zero, where . Let denote the maximum degree of the polynomial in Equation (17) before applying the modulo operation with , and let . As is stored in an m-bit array in the code, there is a constraint on the shifting of C. Specifically, must be greater than the sum of and the maximum degree of . This ensures that the shifting operation does not exceed the bounds of the array and that the modulo operation can be correctly applied.
In the SquareMod algorithm, the computational time can be broken down as follows:
- Lines 5 and 6: Each iteration of the loop in these lines requires 2 XOR operations, 1 shift, and 2 table lookups.
- Lines 8–10: These lines consist of 3 XOR operations, 1 multiplication, and 3 table lookups.
Therefore, a total of $2d+3$ XORs, $d$ shifts, and $2d+3$ table lookups are required. From the perspective of time complexity, this time is negligible compared with the time required for multiplication.
In the ScalarMUL and SquareMod algorithms, scalar multiplication corresponds to retrieving precomputed values stored in Table 1, Table 4, and Table 5. As a result, this approach significantly enhances computational efficiency by reducing the need for repeated calculations.
Lemma 3.
Given an integer w, the scalar multiplication of a point on over can be computed in iterations in the algorithms ScalarMUL and SquareMod.
In Lemma 3, the iterations imply that a scalar multiplication of the form of a given point Q is performed on a given point Q. To evaluate the execution time of the SquareMod algorithm, a test code was implemented to execute the algorithm 100,000 times for each word length w with . Additionally, the memory size required for the lookup table in SquareMod was measured for each word length. For instance, in the case of , Table 6 summarizes the execution time and the corresponding memory size needed for the lookup table in SquareMod. Figure 2 provides a graphical representation of the data presented in Table 6. As evident from the table or figure, there is a trade-off between execution time and memory usage. While increasing the word length w can enhance computational efficiency, it also results in a significant increase in the memory size required and construction times for the lookup table. This highlights the need to carefully balance performance optimization with memory constraints when implementing the SquareMod algorithm. Finding the optimal word length will also determine the performance of scalar multiplication, meaning the efficiency of scalar multiplication is adjustable. Taking as an example, in our program execution environment, the memory size required for each word length w is shown in Table 7. The execution time can be optimized by selecting an appropriate value of w based on the hardware and software specifications of the specific execution environment.
Table 6.
The execution time (seconds) and memory size ( bits/8 bits) of the implementation of the SquareMod algorithm for in computing over .
Figure 2.
The execution time and memory size used for the algorithm SquareMod over .
Table 7.
The execution time and memory size ( bits/8 bits) of the ScalarMUL algorithm for over , where . Note that the memory size does not include the and values listed in Table 6.
4. Inverse Algorithm Use ScalarMUL
ECC parameters over used in the ScalarMUL algorithm and the sliding window method [21] are provided in Table A5 of Appendix A. The execution times for each word length, both with and without the formulas utilized in the ScalarMUL algorithm, as well as for each window size in the sliding window method [21], are presented in Table 8. Note that the scalar k used in the ScalarMUL algorithm and the sliding window method is the extension degree m of . Additionally, Table 9 reports the decreasing ratio, which compares the execution time of the proposed method with that of the sliding window method [21], highlighting the efficiency improvements achieved by the proposed approach. The decreasing trend in execution time is illustrated in Figure 3. The proposed formulas are specifically tailored for scenarios that involve repeating point-doubling operations, enabling a significant reduction in the number of inverse operations required. Applying our method to the sliding window technique simply requires replacing the repeating point-doubling step of Algorithm 2 in [21] with our proposed formulas. In this way, the formulas can be integrated into the sliding window method to further improve its computational efficiency, as demonstrated by the results presented in Table 9. From Table 9, we observe that the sliding window method with formulas exhibits better efficiency. This is because the sliding window method utilizes a window based on the positions of the 1 bits in the binary representation of k for repeated point doubling. In contrast, our method uses a fixed word length, which requires more precomputation. However, this also demonstrates the value of our derived formulas. This integration highlights the versatility and effectiveness of the proposed approach in optimizing elliptic curve operations.
Table 8.
The execution times (seconds) for the ScalarMUL algorithm (both the PDNTimes and SquareMod algorithms are utilized) and sliding window method [21] over .
Table 9.
For , on over , the decreasing ratio for the ScalarMUL algorithm with formulas for the sliding window method [21] and the sliding window method with formulas to the sliding window method.
Figure 3.
The decreasing behavior based on the data shown in Table 9.
Over , given a point Q, word length w, and setting , Figure 4 illustrates the advantages of the PDNTimes algorithm in reducing the number of inverse operations required. In line 7 of the ScalarMUL algorithm, the operation requires computing , which involves performing w consecutive point doublings on the point Q. We compare the performances based on the number of multiplication operations required. In finite fields, the performance is largely determined via the inverse operations, as they require multiple multiplication operations to compute. The exact number of multiplications depends on the algorithm used. For example, if the Extended Euclidean Algorithm is used, an inverse operation generally takes about to multiplications, depending on the implementation’s optimization. The exact number of multiplications required for an inverse operation using Fermat’s Little Theorem is . In line 13 of the PDNTimes algorithm, we utilize Fermat’s Little Theorem to compute the inverse . For the PDNTimes algorithm, multiplication operations are required. If we replace the computation of in line 7 of PDNTimes with Equation (2) (in line 4 of Tradition) to compute , we will require multiplication operations.
Figure 4.
Comparison of the number of multiplications required for n iterations in Tradition and PDNTimes.
In the affine coordinate system, both point addition and point doubling require one inverse to compute the slope . Additionally, each operation involves five multiplications, as follows:
- Two multiplications for calculating ;
- Two multiplications for determining the new x-coordinate;
- One multiplication for determining the new y-coordinate.
Although our algorithm demonstrates reduced time complexity compared with the sliding window method, as shown in Table 10, its practical execution requires the construction of a larger lookup table. As a result, while our approach still outperforms the sliding window method in terms of efficiency, the performance gap is not as significant as indicated in Table 10.
Table 10.
Summary of the number of operations required for scalar multiplication over in the affine coordinate system.
5. Conclusions
In this work, we focused on significantly reducing the computation time of scalar multiplication, which can be easily implemented in software, by further expanding the application of Horner’s rule and optimizing the square operations, specifically, through the introduction of several formulas for the inverse operations involved in repeating point doubling.
In elliptic curve cryptography and other cryptographic protocols, scalar multiplication is a critical operation that can be computationally expensive, primarily because of the repeated inversions incurred by point doubling; reducing these inversions is key to improving efficiency.
The introduced formulas can help to minimize the number of inverse operations needed, thereby streamlining the computational process. Computation using the introduced formulas for and requires more multiplication, square, and addition operations. We also developed the SquareMod algorithm to reduce the computation time for square operations.
Figure 1 demonstrates that if the ScalarMUL algorithm does not optimize for square operations, the overall reduction in computation time begins to plateau when the word length w reaches 8. This highlights the importance of optimizing square operations to achieve consistent performance improvements. On the other hand, Figure 2 illustrates that while the SquareMod algorithm optimizes square operations, a trade-off must be made between execution time and the required memory size. These two phenomena represent key challenges that we aim to overcome and improve upon in future work.
From a theoretical perspective, analyzing the trade-off between execution time and memory usage is an intriguing research topic and a promising direction for future exploration. Understanding this balance could lead to more efficient algorithms that are both fast and resource efficient, making them suitable for a wider range of applications, including resource-constrained environments such as embedded systems and IoT devices.
On the other hand, an important consideration lies in the potential trade-offs between security and implementation complexity. While the primary focus of our work was to reduce the number of inverse operations in scalar multiplication—a critical bottleneck in ECC—any optimization technique must be carefully assessed for its impact on both security and practical implementation.
- Security Considerations: Our proposed method is based on well-established mathematical principles and does not introduce new assumptions or structures that could weaken the cryptographic security of the system. The repeating point-doubling formulas and grouping technique are derived directly from the affine coordinate system, ensuring that the underlying security properties of the elliptic curve are preserved. However, we recognize that side-channel attacks (e.g., timing or power analysis) could still pose a risk, as with any cryptographic implementation. While our current work does not explicitly address side-channel resistance, we plan to investigate this aspect in future research, potentially integrating countermeasures such as constant-time execution or masking techniques.
- Implementation Complexity: The proposed method is designed to be simple and consistent in execution, making it suitable for both software and hardware implementations. The grouping technique and modified Horner’s rule introduce minimal overhead in terms of precomputation and memory usage, as the bit-words and repeated point-doubling results can be efficiently stored and reused. Our approach achieves faster scalar multiplication with a comparable level of implementation complexity in comparison with traditional methods like the sliding window algorithm. That said, we acknowledge that further evaluation is needed to assess its performance in highly resource-constrained environments, such as IoT devices or embedded systems.
- Future Work: While our initial results demonstrate significant improvements in computational efficiency, we agree that a more comprehensive evaluation of security and implementation complexity is essential. Future work will involve the following:
- A thorough security analysis, including resistance to side-channel attacks;
- Evaluation of the performance of the proposed method across a wider range of hardware and software environments, particularly in resource-constrained settings;
- Comparison of the proposed method with other state-of-the-art techniques to identify potential trade-offs and optimize its practical applicability.
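The grouping idea mentioned under Implementation Complexity can be illustrated with a minimal sketch. It assumes a generic Horner-style evaluation over fixed-length bit-words and is not the authors' exact algorithm: the function names `split_words` and `scalar_mul_words`, the word length `w`, and the use of integers modulo `N` as a stand-in for elliptic curve points are all illustrative assumptions.

```python
def split_words(k, w):
    """Split a non-negative scalar k into w-bit words, least significant first."""
    words = []
    mask = (1 << w) - 1
    while k:
        words.append(k & mask)
        k >>= w
    return words


def scalar_mul_words(k, P, w, add, double, zero):
    """Compute k*P by Horner's rule over w-bit words.

    add, double, and zero describe an abstract additive group; in an ECC
    setting they would be point addition, point doubling, and the point
    at infinity.
    """
    # Small table of multiples j*P for 0 <= j < 2^w (precomputed once).
    table = [zero]
    for _ in range((1 << w) - 1):
        table.append(add(table[-1], P))

    Q = zero
    for word in reversed(split_words(k, w)):
        # w repeated doublings: the step a repeated-doubling formula targets.
        for _ in range(w):
            Q = double(Q)
        Q = add(Q, table[word])
    return Q


# Stand-in group for demonstration only (an assumption, not a curve):
# integers modulo N under addition, so k*P is ordinary modular multiplication.
N = 1009
add_mod = lambda a, b: (a + b) % N
double_mod = lambda a: (2 * a) % N

result = scalar_mul_words(0b101101101101, 5, 4, add_mod, double_mod, 0)
```

The inner loop of `w` doublings is where a repeated point-doubling formula with a single inversion would replace `w` separate affine doublings; the table of word multiples is the source of the memory overhead discussed above.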
Finally, the formulas we derived are completely independent of B in the elliptic curve equation $y^2 + xy = x^3 + Ax^2 + B$. This independence simplifies the application of our formulas across different elliptic curves. An open question is whether formulas can also be derived that are independent of the parameter A in the equation. Exploring this possibility could lead to more general results and open new avenues for optimization in elliptic curve cryptography, further enhancing the efficiency and applicability of cryptographic protocols in real-world scenarios.
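The independence from B can be seen already in a single affine doubling. The following minimal sketch assumes the standard binary-field curve form $y^2 + xy = x^3 + Ax^2 + B$ and the textbook affine doubling formulas; the toy field $\mathrm{GF}(2^4)$ with reduction polynomial $x^4 + x + 1$ and all function names are illustrative assumptions, chosen only to keep the example small.

```python
M = 4
POLY = 0b10011  # x^4 + x + 1, irreducible over GF(2) (toy field, assumption)


def gf_mul(a, b):
    """Multiply two elements of GF(2^M): shift-and-add with reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:  # degree reached M, reduce modulo POLY
            a ^= POLY
    return r


def gf_inv(a):
    """Inverse of a nonzero element, computed as a^(2^M - 2)."""
    r = 1
    for _ in range(2**M - 2):
        r = gf_mul(r, a)
    return r


def on_curve(x, y, A, B):
    """Check y^2 + xy = x^3 + A*x^2 + B over GF(2^M)."""
    x2 = gf_mul(x, x)
    lhs = gf_mul(y, y) ^ gf_mul(x, y)
    rhs = gf_mul(x2, x) ^ gf_mul(A, x2) ^ B
    return lhs == rhs


def double_point(x, y, A):
    """Affine doubling of (x, y) with x != 0; B is not an input."""
    lam = x ^ gf_mul(y, gf_inv(x))           # lambda = x + y/x
    x3 = gf_mul(lam, lam) ^ lam ^ A          # x3 = lambda^2 + lambda + A
    y3 = gf_mul(x, x) ^ gf_mul(lam ^ 1, x3)  # y3 = x^2 + (lambda + 1)*x3
    return x3, y3
```

Because `double_point` takes only A, the same routine serves every curve that shares A, whatever its B; the result still lies on the original curve, which `on_curve` can confirm for any affine point with a nonzero x-coordinate.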
Author Contributions
Conceptualization, F.-J.K., Y.-H.C. and J.-J.W.; methodology, J.-J.W.; software, F.-J.K. and Y.-H.C.; validation, C.-D.L. and J.-J.W.; formal analysis, Y.-H.C.; investigation, C.-D.L.; resources, C.-D.L.; data curation, F.-J.K. and Y.-H.C.; writing—original draft preparation, J.-J.W.; writing—review and editing, J.-J.W.; visualization, F.-J.K. and Y.-H.C.; supervision, J.-J.W.; project administration, F.-J.K., Y.-H.C., C.-D.L. and J.-J.W.; funding acquisition, F.-J.K. and Y.-H.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by NSTC grant numbers 113-2221-E-214-021 and 113-2221-E-214-017, which may include administrative and technical support.
Data Availability Statement
Data are contained within the article.
Acknowledgments
We sincerely thank the National Science and Technology Council (NSTC) projects 113-2221-E-214-021 and 113-2221-E-214-017 for their funding and support. Heartfelt thanks to the reviewers for their suggestions and comments.
Conflicts of Interest
The authors declare that this study and its results were conducted independently, without any influence from financial, academic, or personal interests that could affect the outcomes. There are no direct or indirect conflicts of interest to disclose.
Appendix A
Table A1.
Computations of and for four times point doubling of for .
| x1, primitive polynomial , | |
| 0x017232BA853A7E731AF129F22FF4149563A419C26BF50A4C9D6EEFAD6126 | |
| 0x01DB537DECE819B7F70F555A67C427A8CD9BF18AEB9B56E0C11056FAE6A3 | |
| 0x17232BA853A7E731AF129F22FF4149563A419C26BF50A4C9D6EEFAD6126 | |
| 0xC8EFD200D0B85E058BEB9366C7B9D7C0D0323D4A7084B2ABAE8EBB7B92 | |
| 0x46B65EFCDD714E1FB3D046F17BFA928F300C397396A6D72A7BCEE4623E | |
| 0x1282331550168DE9B9630E825A5E58AB9A1D2D81F63051833AD8662D99C | |
| 0x277012D40FD385CB18ABFB705129D6A9709385C81D184AF636C800F25D | |
| 0x3E6B5DE2E8141083DAC00140C5936CD62A17ACE5620EEF8BD6763661FF | |
| 0xE7DC44EEB5FB14897829274A375A200B9D227AB7277745638B12045E3C | |
| 0x1D42DB6C4FC88A78613881210C5DCD474641567C546AFD60F1F3C70A52 | |
| 0xC90CB3BA4AFEF4FC089394671D12533FA38AC99B369E16AE91D06E541C | |
| 0xD5E4BAC01DB2D0DDF4B5818595D13B649FE1C22CAC6AE6DFF91A267AB6 | |
| 0x1F255D6098337C7B333913B56B4208769192550D64956C18F2DC35ABDF4 | |
| 0x1DFAC578A9A7E1742AA21F99C4BD2233A785F011584EF8BD203D7899E0 | |
| 0x1FFF77FF7E8D784B371FB9F83CFD834653F1BA3F507898E3536808A7651 | |
| 0x66B6351AF207F92AA3F52AE7BDF78E1E6F8CDF51E918A8CB63AD38741A | |
| 0x30BD692E27C7A151D6CC09E18FEA36E6EB710B197A6A96D0840183BBAA | |
Table A2.
Computations of and for four times point doubling of for .
| x1, primitive polynomial , | |
| 0x00FAC9DFCBAC8313BB2139F1BB755FEF65BC391F8B36F8F8EB7371FD558B | |
| 0x01006A08A41903350678E58528BEBF8A0BEFF867A7CA36716F7E01F8105 | |
| 0xFAC9DFCBAC8313BB2139F1BB755FEF65BC391F8B36F8F8EB7371FD558B | |
| 0x54515111155544512B1CF113BF10D7F74CDDDFA7FE4BAFCED7D841A5294DD45E322F068 | |
| 0xCE238EFFA4284AF0160D29B4F683E93BAC0F38BDC8B297AE78B59107EA9443D30936D9 | |
| 0x1CB0EDF1BB4103B60965BF4190FB920B757DDACEF61CC7603C0E001ECE6278B3C48085E | |
| 0x17C10F70CF0C20F2ABECDA0639B1E878BD05271DF7A8FB00A423673F8426B106AF66A61 | |
| 0x2A2FD15E5F8B390EAE362C83D0337A73D290A9FEBC241E5244D8B6F32BDE4828821B7F9 | |
| 0x1CF2B34F4C12286EB5EE15DDD1A49369CF15CDF44DD8B69C421DCC278F977F68CF7B750 | |
| 0x6BF096B1D4E3B3CCA95469A1794EF15B1AD97DADA461C6F350EAB6C32507AE181D9339C | |
| 0x1F1DA1BD7AAFBB0B823DBA1653B4A5D866F57845BE0099617B7D2EDD3405B74A9F0B23B | |
| 0x245783E6BD266915C279EEAB5E9E657FFA5749E367E8E99665572D234B01C95C5BC9E89 | |
| 0x26FA03DF5515517EF2501202F38AF8C04B7F1F8773941DCCFD2C38C54F35E0CA419A372 | |
| 0x2195B07257881D6D374F60B424FC79F4229C30C8207EC6B07EDFE3C86232B41CED9F992 | |
| 0x1C161B5BF6C81EF2CC6E80280A98CC5CFD395F0EA246525B10A930DCCC2734A208D49D9 | |
| 0x6553144C06D11F234DE640CF8CF399B2B85634FEFDE4089350BC151CBD12EC5306113EE | |
| 0x3A55D77017BA6EC0D6AF87B67F8C33B3F7661FD7D3FF5033DBB9523F5625EBB78BF623F | |
Table A3.
Computations of and for four times point doubling of for .
| x1, primitive polynomial , | |
| 0x0060F05F658F49C1AD3AB1890F7184210EFD0987E307C84C27ACCFB8F9F67CC2C460189EB5AAAA62EE222EB1B35540CFE9023746 | |
| 0x01E369050B7C4E42ACBA1DACBF04299C3460782F918EA427E6325165E9EA10E3DA5F6C42E9C55215AA9CA27A5863EC48D8E0286B | |
| 0x60F05F658F49C1AD3AB1890F7184210EFD0987E307C84C27ACCFB8F9F67CC2C460189EB5AAAA62EE222EB1B35540CFE9023746 | |
| 0x1AC786F8F21F08DA00A37308FA9787E4DA69A59142AD5B7C8EC95C4E0BD5522A561845C2DC240CBFFD1E788D8C28EEEC557F1AF | |
| 0x1026BFE44829CBA15B8C26B8E906F2241E47775A7C5D996AAA9AD2888EC57CEDB82F6BB23EFD18F5F269C4D34984B7BA0B1F3CB | |
| 0xFC0EF81DA9679D4FC66DB292971CABBB552D78D6A48C67650940273F4CB7957F8FAAE7F94D5DA38A03A74C6CA3111BE362DEAF | |
| 0x3B899DCBB8BE70261408872B757C476E5F89DE93E596B68B36ECD6EB75649E82B6723804E3B2959FE7CE14F0E2DF1F8E9242AB | |
| 0x117EDD4AF7D8EF95B46D0DE4548C89B872F3D1A00198675B0490AFBEABE3413E19237E92A1FC940F2289E9E3F2AE2BC69502DDB | |
| 0x7588FD6A097CF42FA6B4A8F8C315A33012989C406217A7FC23E034632B43C0C8C8797EB2A2FD0643611B1E499858B2C9F8E9D8 | |
| 0xA150FFB3B4919CD47A10A2AABA0486114E2BDB2C63FEDC14CD2B71695BF91868E9533CBC63E811EC16BEEB8DF8A3941D2C551F | |
| 0x93682A817E4F27E418633D540CAC43A6952FABC521CBD6DC88A6EA6B1B4B2CEC6C603276E6E3B267468E4A034134FBDF25EB4A | |
| 0xC0B0B103E854D8505C71151018952F6E4955B646F001C28ECA78B73E053E6E2B7BC849150054F0040D0C9F3204869ED4106EE5 | |
| 0x18F980F54D4A5327DAB97A7DB060A75D44BBA63B6AE60E1E1DF3B8495BD0D06304CA90EB77B145E5885ED59EFDD49BB25426A1E | |
| 0x1AF76EEA319ED639010848C7FC6F027FD701D8F2063348E0920BAF9ACEEBCC07033951B6FBF140957DB90DA12292F80B0819528 | |
| 0x1DF450CBAC4D70BBF94C8E5219AB0C775EE4F37CF033275682BEA78F7C25E2DB292A52B95F2B92FB1588AD285A7570570175A69 | |
| 0x1E8F34968A9C9C65B1D056D71ABCF13D93C2211550AB0F59FBE901756646108E70960C750069112300120BA1A1DE6A31D5FADBA | |
| 0x528673FF64BD082F3A60914056944B3BA99AC518D0D93F5F1CB3FB3DA0B6F4579BC9C1125345DAE9BFCE973BC477747BA4CAF5 | |
Table A4.
Computations of and for four times point doubling of for .
| x1, primitive polynomial , | |
| 0x026EB7A859923FBC82189631F8103FE4AC9CA2970012D5D46024804801841CA44370958493B205E647DA304DB4CEB08CBBD1BA39494776FB988B47174DCA88C7E2945283A01C8972 | |
| 0x0349DC807F4FBF374F4AEADE3BCA95314DD58CEC9F307A54FFC61EFC006D8A2C9D4979C0AC44AEA74FBEBBB9F772AEDCB620B01A7BA7AF1B320430C8591984F601CD4C143EF1C7A3 | |
| 0x26EB7A859923FBC82189631F8103FE4AC9CA2970012D5D46024804801841CA44370958493B205E647DA304DB4CEB08CBBD1BA39494776FB988B47174DCA88C7E2945283A01C8972 | |
| 0x2BF4AB0A0654BCC72510BA7C97DE64A1AE0751E2026B571B207ED40BA71667E4E8D88ED0A7687C20E786092A0294F91246B0B76338CD70EC3803B75A92F06BBD9314CE03131BCA0 | |
| 0x2BCCA12217DE9277B0B2011E225EBA18027DDE7E54A78221DF1150743866EE6BD3A301D14243961C0694AF2A124E2DF0889112E9D9809D9BAE9B7B41AFA4C39C7E33100BC1E6A6E | |
| 0x16E7B7EF519ADF86BF01ED25CCFC6CABD4933D1BFEF9B6ADE7818AFB872580F2C0A2D07A5533568596888DCAFCA4C627C14697BCC4BD599F40F62C1952916F4B20C9943FE59ECA7 | |
| 0x13BEE3A1BF46B5DEBD2D827F158FB4205CDBBD0B37670FD4D249CC9776C6E7475D4C58ECB7003E1464AA655B176564DF251B223642D965E546EA2028A35700AC5A1CE1C25833E20 | |
| 0xE19CBC6A4D57A6E1465C2A9E87F34207EC3C4FE70B69D1B1A83CD55E6A02D8978215F4AD2BBFB14BD9F444A2FB169502D8114D47D9FE4582FA470F1EA7CF73700D7D66EC5FACE6 | |
| 0x705810F304A19E653B1DB8A1451F3F6296CB174243A86AFDDA06C1E7462EF1D3FE6AE540FA775BF61E2B5D4B3CC5C7E77818B24A1E88BF3CA43C793F358BFFF6DD70292113EBA0D | |
| 0x5BCF313A3AFC4D794C9D0366461F019BC343BC25AF970EBC81E3CDB42B4E221C771B70C4B76D89DE5472FBB67973B22EA76112AD3F63A8FD0DB845970466D1401CE97EFAED1906C | |
| 0x3768E085616E1041183FC92AB605B4D66A5906561BB66AD2283DC6B2BC8026699AF02C9B9996ED727B1B5E2DBBE62D6C5923A33205D23A011693DE482988480ECC227E76710AB9 | |
| 0x12775DC1600EBA8175A61AC35380F29868603C6803BD2F25FB5ABBEF3C34E67EA50E983E1A265C3FBF30BCD1817B98A9F24AA1B18E04423B5018A73710941EDEE3494B316CDBCD | |
| 0x16E89D4D47B7DDAAFC8F25CDD200F0FF3DAC8D687E17325C2594566BAD586676E6E138D5A352DDB278D9D86BF1BDBA1A8E72D18C9F5E022610340AA8055B9CD03CF94312FFC215C | |
| 0xED089CC8CF79B26383A0082FE34EA885C6FF7EC123FAE8D8C3178AF2792318011E71377D481BB784EE048DE9C0309AB1936ADA2A60C19DA67C6663F3DEF1D61740F0D5E1F76883 | |
| 0x253E10151FF41A3EA108024F484D4C65AB81A3E49901BD2DC858F63C87C865A28737A9BE47407ABD3166C39915E445AAB5B902B1009DB20E370A47F02EF03D29E5C071C8089D50F | |
| 0x500477AFFF704DE6EF4846F7F4CAA9E48DB443466E6F8C2B85F1A752A31110DBC30E2491C17F308B248A57CC5E31794BBD7F2915B243053C65045830F12D50581BA869AF7F09D24 | |
| 0x2F04E2F7C2D35C1D42E68075890653DC3B65B112780C70521590A79E43288E7ACB0F03B5189825F11A64729F492668EBB67A7129A61DCD33E47A4E36B8F51769439D8E82C4E77C8 | |
Table A5.
Recommended elliptic curve domain parameters over .
| 0x1, 0xc9 | |
| 0x02FE13C0537BBC11ACAA07D793DE4E6D5E5C94EEE8 | |
| 0x0289070FB05D38FF58321F2E800536D538CCDAA3D9 | |
| 0x1, 0x4000000000000000001 | |
| 0x017232BA853A7E731AF129F22FF4149563A419C26BF50A4C9D6EEFAD6126 | |
| 0x01DB537DECE819B7F70F555A67C427A8CD9BF18AEB9B56E0C11056FAE6A3 | |
| 0x1, 0x10a1 | |
| 0x00FAC9DFCBAC8313BB2139F1BB755FEF65BC391F8B36F8F8EB7371FD558B | |
| 0x01006A08A41903350678E58528BEBF8A0BEFF867A7CA36716F7E01F8105 | |
| 0x1, 0x8000000000000000000001 | |
| 0x0060F05F658F49C1AD3AB1890F7184210EFD0987E307C84C27ACCFB8F9F67CC2C460189EB5AAAA62EE222EB1B35540CFE9023746 | |
| 0x01E369050B7C4E42ACBA1DACBF04299C3460782F918EA427E6325165E9EA10E3DA5F6C42E9C55215AA9CA27A5863EC48D8E0286B | |
| 0x1, 0x425 | |
| 0x026EB7A859923FBC82189631F8103FE4AC9CA2970012D5D46024804801841CA44370958493B205E647DA304DB4CEB08CBBD1BA39494776FB988B47174DCA88C7E2945283A01C8972 | |
| 0x0349DC807F4FBF374F4AEADE3BCA95314DD58CEC9F307A54FFC61EFC006D8A2C9D4979C0AC44AEA74FBEBBB9F772AEDCB620B01A7BA7AF1B320430C8591984F601CD4C143EF1C7A3 | |
References
- Miller, V. Uses of elliptic curves in cryptography. In Advances in Cryptology: Proceedings of Crypto’85; Springer: Berlin/Heidelberg, Germany, 1986.
- Koblitz, N. Elliptic curve cryptosystems. Math. Comput. 1987, 48, 203–209.
- Wang, C.C.; Truong, T.K.; Shao, H.M.; Deutsch, L.J.; Omura, J.K.; Reed, I.S. VLSI architectures for computing multiplications and inverses in GF(2^m). IEEE Trans. Comput. 1985, C-34, 709–717.
- Bernstein, D.J. Batch Binary Edwards. In Advances in Cryptology—CRYPTO 2009; Halevi, S., Ed.; Springer: Berlin/Heidelberg, Germany, 2009; Volume 5677, pp. 317–333.
- Chen, Y.H.; Huang, C.H. Efficient operations in large finite fields for elliptic curve cryptographic. Int. J. Eng. Technol. Manag. Res. 2020, 7, 141–151.
- Blake, F.; Murty, V.K.; Xu, G. A note on window τ-NAF algorithm. Inf. Process. Lett. 2005, 95, 496–502.
- Al Saffar, N.F.H.; Said, M.R.M. High performance methods of elliptic curve scalar multiplication. Int. J. Comput. Appl. 2014, 108, 39–45.
- Pathak, H.K.; Sanghi, M. Speeding up computation of scalar multiplication in elliptic curve cryptosystem. Int. J. Comput. Sci. Eng. 2010, 2, 1024–1028.
- Eid, W.; Turki, F.A.; Marius, C.S. Efficient elliptic curve operators for Jacobian coordinates. Electronics 2022, 11, 3123.
- Al Musa, S.; Xu, G. Fast scalar multiplication for elliptic curves over binary fields by efficiently computable formulas. In Progress in Cryptology—INDOCRYPT 2017; Springer: Cham, Switzerland, 2017.
- Li, J.; Zhong, S.; Li, Z.; Cao, S.; Zhang, J.; Wang, W. Speed-oriented architecture for binary field point multiplication on elliptic curves. IEEE Access 2019, 7, 32048–32060.
- Li, J.; Wang, W.; Zhang, J.; Luo, Y.; Ren, S. Innovative dual-binary-field architecture for point multiplication of elliptic curve cryptography. IEEE Access 2021, 9, 12405–12419.
- Oudjida, A.K.; Liacha, A. Radix-2^w arithmetic for scalar multiplication in elliptic curve cryptography. IEEE Trans. Circuits Syst. I Reg. Pap. 2021, 68, 1979–1989.
- Bernstein, D.J.; Lange, T. Analysis and optimization of elliptic-curve single-scalar multiplication. In Proceedings of the Eighth International Conference on Finite Fields and Applications, Melbourne, Australia, 9–13 July 2007; pp. 1–20.
- Ning, Y.D.; Chen, Y.H.; Shih, C.S.; Chu, S.I. Lookup table-based design of scalar multiplication for elliptic curve cryptography. Cryptography 2024, 8, 11.
- Cho, S.M.; Gwak, S.G.; Kim, C.H.; Hong, S. Faster elliptic curve arithmetic for triple-base chain by reordering sequences of field operations. Multimed. Tools Appl. 2016, 75, 14819–14831.
- Zhang, J.; Chen, Z.; Ma, M.; Jiang, R.; Li, H.; Wang, W. High-performance ECC scalar multiplication architecture based on comb method and low-latency window recoding algorithm. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2024, 32, 382–395.
- Matteo, S.D.; Baldanzi, L.; Crocetti, L.; Nannipieri, P.; Fanucci, L.; Saponara, S. Secure elliptic curve crypto-processor for real-time IoT applications. Energies 2021, 14, 4676.
- Pillutla, S.R.; Boppana, L. A high-throughput fully digit-serial polynomial basis finite field GF(2^m) multiplier for IoT applications. In Proceedings of the IEEE Region 10 International Conference (TENCON2019), Kochi, India, 17–20 October 2019; pp. 920–924.
- Sabbry, N.H.; Levina, A.B. An optimized point multiplication strategy in elliptic curve cryptography for resource-constrained devices. Mathematics 2024, 12, 881.
- Shah, P.G.; Huang, X.; Sharma, D. Sliding window method with flexible window size for scalar multiplication on wireless sensor network nodes. In Proceedings of the International Conference on Wireless Communication and Sensor Computing (ICWCSC), Chennai, India, 2–4 January 2010; pp. 1–6.
- Hankerson, D.; Menezes, A.; Vanstone, S. Guide to Elliptic Curve Cryptography; Springer: New York, NY, USA, 2004.
- Montgomery, P.L. Speeding the Pollard and elliptic curve methods of factorization. Math. Comput. 1987, 48, 243–264.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).