Abstract
A basic but expensive operation in the implementations of several famous public-key cryptosystems is the computation of the multi-scalar multiplication in a certain finite additive group defined by an elliptic curve. We propose an adaptive window method for the multi-scalar multiplication, which aims to balance the computation cost and the memory cost under register-constrained environments. That is, our method maximizes the computation efficiency of multi-scalar multiplication for any small, fixed number of registers provided by electronic devices. Unlike the previous window methods, which store all possible values in the window, our method stores only those with comparatively high probabilities to reduce the number of required registers. We study the method in detail for the case of five registers, combined with the non-adjacent form (NAF) representation and the joint sparse form (JSF) representation. One efficiency result is that our method with the proposed improved NAF representation requires on average 209n/432 point additions for n-bit scalars. To the best of our knowledge, this is the best efficiency result among similar methods using five registers.
1. Introduction
The notations in Table 1 are used throughout this paper, often without further definition. Others are defined where they are first used and in Appendix A.
Table 1.
Description of notations.
A basic but expensive operation in the implementations of several famous public-key schemes, for instance, the Digital Signature Algorithm (DSA) [1], the Elliptic Curve Digital Signature Algorithm (ECDSA) [2], and the Schnorr signature scheme [3], is the computation of the multi-scalar multiplication in a certain finite additive group defined by an elliptic curve or the multi-exponentiation in a certain finite multiplicative group. Moreover, many public-key protocols, such as [4,5,6,7], also require one or more multi-scalar multiplication/multi-exponentiation operations.
For clarity, we adopt the notation of the multi-scalar multiplication throughout. Without loss of generality, all techniques discussed in this paper can also be directly applied to the computation of the multi-exponentiation. The multi-scalar multiplication can be stated as follows: given two integers and two points A and B, compute the sum of the two corresponding scalar multiples of A and B. Due to the large operands, the computation of the multi-scalar multiplication requires a large number of processing steps and is thus time consuming. Since cryptographic implementations are often desired on embedded devices with little computation power and memory, a challenging problem is how to reduce the cost of computing the multi-scalar multiplication.
1.1. Previous Work
Obviously, we can compute the two scalar multiples separately and then add them together. Gordon [8] surveyed the key techniques for the computation of the scalar multiplication. However, since the public-key cryptosystems do not require the two intermediate scalar multiples, Shamir [9] suggested a simple but efficient trick for speeding up the multi-scalar multiplication by doing the two scalar multiplications simultaneously. Figure 1 describes Shamir's trick. The accumulator starts from the point at infinity O in Step 2. In Step 3, the bit pairs of the two integers are scanned simultaneously from left to right. Step 3.1 always doubles the accumulator, which means that this step requires n point doublings for n-bit integers. Then, if the current bit pair is (1, 0), (0, 1), or (1, 1), Step 3.2 adds A, B, or A + B to the accumulator accordingly. For example, writing the two integers in binary and applying these rules produces a sequence of accumulator values that starts at O and ends at the desired multi-scalar multiplication result.
Figure 1.
Algorithm Shamir’s trick.
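The steps of Shamir's trick can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: elliptic-curve points are modeled by plain integers (so point addition is integer addition and O is 0), and the function name `shamir_trick` and the operation counters are ours, not part of Figure 1.

```python
def shamir_trick(k, l, A, B):
    """Compute k*A + l*B by scanning both scalars simultaneously.

    Points are modeled as integers (an assumption for illustration only).
    Returns (result, number_of_doublings, number_of_additions).
    """
    # Step 1: precompute A + B (this one-off addition is not counted below).
    table = {(1, 0): A, (0, 1): B, (1, 1): A + B}
    n = max(k.bit_length(), l.bit_length())
    R = 0  # Step 2: start from the identity element O
    doublings = additions = 0
    for i in reversed(range(n)):      # Step 3: scan from left to right
        R = R + R                     # Step 3.1: always double
        doublings += 1
        pair = ((k >> i) & 1, (l >> i) & 1)
        if pair != (0, 0):            # Step 3.2: add A, B, or A + B
            R = R + table[pair]
            additions += 1
    return R, doublings, additions


print(shamir_trick(13, 42, 5, 11))
```

With the scalars 13 and 42, five of the six bit pairs are nonzero, so the sketch performs six doublings and five additions.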
It should be pointed out that more sophisticated techniques can potentially improve Step 3.2 to reduce the number of point additions. This number, called the performance factor, is equal to the number of bit pairs in which at least one bit is nonzero. For uniformly random n-bit integers, a bit pair is (0, 0) with probability 1/4, so the performance factor is 3n/4 on average.
Based on the framework of Shamir's trick, the improved multi-scalar multiplication algorithms are divided into two categories. The first category recodes the two integers so that the number of zero bit pairs, i.e., (0, 0), increases. Whereas the binary representation of an integer is unique, the signed binary representation with digits 0, 1, and −1 is not. Since the cost of computing the inverse of a point is negligible compared to the point addition over the elliptic curve group, the improved multi-scalar multiplication algorithms, detailed in [10,11,12,13,14,15,16,17], require only one extra register to store the value A − B in Step 1 of Shamir's trick in Figure 1. The NAF representation [18,19] is optimal for one integer. The performance factor improves to 5n/9 on average when the algorithm uses the NAF representation instead. The JSF [11] is the optimal signed binary representation for two integers. The performance factor further improves to n/2 on average when the algorithm uses the JSF representation instead. Because the JSF is already optimal, a disadvantage of the coding approach is that its performance factor cannot be better than the value wJSF−4(n).
The second category, in contrast, scans and processes a window of several bit pairs at a time in Step 3.2 of Shamir's trick in Figure 1. To reduce the number of point additions, all possible values for the pairs in a window should be pre-computed and stored in Step 1 of Shamir's trick in Figure 1. In practice, the wide scanning approach [20,21,22,23,24,25], including the fixed-window method and the sliding window method, is always combined with the coding approach, such as the NAF representation and the JSF representation. However, this approach requires a large number of extra registers to store all possible values for the pairs in a window, even for a moderate window size.
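The coding approach described above can be sketched as follows. This is a hedged illustration, not the algorithms of [10,11,12,13,14,15,16,17]: points are again modeled by integers, `naf` is the textbook NAF recoding, and `shamir_trick_signed` is a hypothetical helper name. Only A, B, A + B, and A − B are stored; the remaining signed pairs are handled by subtracting a stored value, which reflects the cheap point inversion mentioned in the text.

```python
def naf(k):
    """Return the NAF digits of k >= 0, least significant digit first."""
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)  # +1 if k = 1 (mod 4), -1 if k = 3 (mod 4)
            k -= d
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits


def shamir_trick_signed(k, l, A, B):
    """Compute k*A + l*B from the NAFs of k and l; returns (result, additions)."""
    dk, dl = naf(k), naf(l)
    n = max(len(dk), len(dl))
    dk += [0] * (n - len(dk))
    dl += [0] * (n - len(dl))
    # Four stored values; every other nonzero pair is the negation of one of them.
    table = {(1, 0): A, (0, 1): B, (1, 1): A + B, (1, -1): A - B}
    R, additions = 0, 0
    for i in reversed(range(n)):
        R = R + R
        pair = (dk[i], dl[i])
        if pair == (0, 0):
            continue
        if pair in table:
            R = R + table[pair]
        else:
            R = R - table[(-pair[0], -pair[1])]
        additions += 1
    return R, additions


print(naf(7))                            # 7 = 8 - 1
print(shamir_trick_signed(7, 7, 5, 11))
```

For the scalars 7 and 7 the NAF has only two nonzero columns, so two additions suffice, whereas the plain binary scan needs three.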
Finally, some works [26,27] are dedicated to presenting the parallel algorithms for the multi-scalar multiplication, because the chip manufacturers are increasing the number of cores inside the processors. Other works [28,29] focus on the algorithms to speed up a group of the multi-scalar multiplications under cryptosystem configurations.
1.2. Motivation and Contribution
As low-cost computing devices, such as smartcards and RFID tags, become ever more pervasive, new security threats are growing quickly. However, these devices cannot always provide enough computation, memory, and electric power resources to implement the standard public-key schemes. We give two examples of potential crypto-oriented devices under register-constrained environments. The first is the ATmega128, which is part of the megaAVR family from Atmel [30] and has been widely used in embedded systems, automotive environments, and sensor-node applications. The ATmega128 features 128 KB of flash memory and 4 KB of internal SRAM. It has only 32 8-bit general-purpose registers (R0 to R31), and a 16-bit multiplication result is stored in the registers R0 (low byte) and R1 (high byte). The second is the ARM7TDMI (ARM7 Thumb Debug Multiplier ICE) [31], which was introduced by ARM in 1994 and has been used in a wide range of applications, e.g., mobile devices produced by Nokia and Motorola, Apple's iPod, video game consoles from companies such as SEGA and Sony, routers, and automobile systems. In the standard ARM operating mode, 16 general-purpose registers (R0 to R15) are available to users. In the Thumb mode, only eight registers are available, i.e., R0 to R7, which in general limits the applicability of many cryptographic algorithms. Moreover, even if these devices become more powerful as a result of Moore's Law, manufacturers may still prefer those that are less powerful but more cost competitive. As a result, cryptographic engineers often face a situation where the number of available registers is not sufficient for the ideal cryptographic implementation of multi-scalar multiplication.
Therefore, under register-constrained environments, this paper focuses on the design and analysis of multi-scalar multiplication algorithms that can flexibly improve the computation efficiency based on the available registers. We present an adaptive window method, which codes the two integers in forms such as the NAF. Our adaptive window method can practically improve the computation efficiency of multi-scalar multiplication according to the small, fixed number of registers provided by the register-constrained computing devices. To illustrate this, we give an example with five registers. More precisely, we consider the 5-register adaptive window method using the NAF and the JSF, respectively. The computational complexity is analyzed by modeling the scan process as a Markov chain. Furthermore, the performance factor for the adaptive window method using our improved NAF representation is about 209n/432 on average, which is slightly smaller than that obtained using the JSF representation. To the best of our knowledge, when only five registers are allowed, our method with the improved NAF representation is the most efficient one for the computation of the multi-scalar multiplication.
2. Adaptive Window Method
Assume that the register-constrained computing devices can provide a small, fixed number of registers for the computation of multi-scalar multiplication. Figure 2 describes the adaptive window method. The two integers are coded by a certain signed binary representation, i.e., each integer is written as a sequence of digits taken from 0, 1, and −1. In Step 3 of the adaptive window method in Figure 2, the two signed binary sequences are scanned from left to right for the largest pair of bit strings within the window such that the pair has a value already pre-computed in Step 1 and its first 1-bit sub-pair is nonzero.
Figure 2.
Algorithm Adaptive window method.
To compute the multi-scalar multiplication, previous window methods need to store all possible values for the pairs in a window. In contrast, when the available registers are not enough, our adaptive window method pre-computes and stores only part of these values (see Step 1 of the adaptive window method in Figure 2). Thus, our method may spend more than one point addition on the pairs in a window using Steps 3.1 and 3.2 of the adaptive window method in Figure 2. To reduce the number of point additions as much as possible, Step 1 should select the pairs with comparatively high probabilities in the signed sequence pair and store their corresponding values. As a result, the computation efficiency of the multi-scalar multiplication can be flexibly improved according to the registers provided by the register-constrained computing devices, whereas previous window methods require a fixed number of registers determined by the window size.
Compared with Shamir's trick using the NAF representation or the JSF representation, the adaptive window method in Figure 2 requires at least five registers to improve the computation efficiency of the multi-scalar multiplication. Therefore, in the next section, we provide an example of the adaptive window method: a detailed design and analysis for five registers, with the NAF representation and the JSF representation, respectively.
3. A Case Study: Adaptive Window Method for Five Registers
3.1. Using NAF Representation
The 5-register algorithm using the non-adjacent form (NAF) representation in Figure 3 illustrates the basic version of the adaptive window method combined with the NAF representation when five registers are available. In this case, we find that five particular pairs have comparatively high probabilities in the NAF sequence pair on average. Hence, Step 1 of the algorithm in Figure 3 pre-computes and stores the five corresponding values. Step 3 then scans for and collects these five pairs in the NAF sequence pair and computes their corresponding values. For example, after writing the two integers in NAF and applying the rules in Steps 1, 3.1, 3.2, 3.3, 3.4, 3.5, and 4, the accumulator starts at O and passes through intermediate values such as 13A + 42B before reaching the final result.
Figure 3.
Algorithm The 5-register algorithm using the non-adjacent form (NAF) representation.
Because it follows the framework of Shamir's trick, the above 5-register algorithm still requires n point doublings. Thus, we only need to consider its performance factor, which directly determines the number of point additions. We have the following result.
Theorem 1.
The performance factor of the 5-register algorithm using the non-adjacent form (NAF) representation in Figure 3 is n/2 on average, when n → ∞.
The proof of Theorem 1 appears in Appendix B and Appendix C.
According to Theorem 1, the basic version of the above 5-register algorithm has the same performance factor as that of Shamir's trick coupled with the JSF representation, which needs only four registers. However, the basic version can be further improved to reduce its performance factor. We propose the recoding rules for the input NAF sequence pair as follows:
After Step 1 of the 5-register algorithm in Figure 3, the improved 5-register algorithm recodes the input NAF sequence pair by applying the above recoding rules from left to right. If a replacement is due to Rule A1, A2, A3, or A4, then discard the left two columns that have been replaced and consider the next three or four columns for future replacement. If a replacement is due to Rule A5, A6, A7, or A8, then discard all columns that have been replaced and consider the next three or four columns for future replacement. If no replacement is possible, then discard one column and consider the next three or four columns for future replacement. The improved version of the 5-register algorithm is identical to its basic version except for this replacement operation. For example, the two integers are first written in NAF, Rule A3 is applied where it matches, and Steps 1, 3.1, 3.2, 3.3, 3.4, 3.5, and 4 of the algorithm in Figure 3 are then executed on the recoded sequence pair. We further have the following result.
Theorem 2.
The performance factor of the improved 5-register algorithm using the non-adjacent form (NAF) representation in Figure 3 is about 209n/432 on average, when n → ∞.
The proof of Theorem 2 appears in Appendix D.
3.2. Using JSF Representation
Assume that the JSF representation [11] is used for the two integers as the inputs of the adaptive window method in Figure 2, with a fixed window size. According to the properties of the JSF representation, only a restricted set of pairs can occur within the window in the JSF sequence pair. Therefore, the values corresponding to these pairs can be selectively pre-computed and stored during Step 1 of the adaptive window method in Figure 2. Now, we can consider designing the 5-register algorithm using the JSF representation. In fact, if the NAF is replaced with the JSF, then the algorithm in Figure 3 becomes a 5-register algorithm using the JSF representation. We can obtain the following result.
Theorem 3.
The performance factor of the 5-register algorithm using the JSF representation is
on average, when n → ∞.
The proof of Theorem 3 appears in Appendix E and Appendix F.
Unlike for the NAF, no recoding rule has been found for the JSF, and thus no further improvement can be provided.
4. Experiments and Comparison
For performance evaluation, we have simulated the adaptive window method and other similar methods on the Visual C++ platform. These methods include Shamir's trick using, respectively, the NAF representation and the JSF representation, and the interleaving method using the width-w NAF (w-NAF) [21,32]. Here, we only consider the multi-scalar multiplication algorithms that require at most five registers in the pre-computation process. For the multi-scalar multiplication, we assume that the two integers have the same bit length n in any representation. To compare the number of point additions per bit, a performance factor constant is defined as the ratio of the performance factor to the bit length n. During the experiments, we randomly generate 1,000,000 pairs of integers of the chosen bit length and calculate the performance factor constant for each method. The results are summarized in Table 2. In the 5-register case, experiments on the adaptive window method using Ruan-Katti's representation [14] are also conducted for comparison. However, the improvement from Ruan-Katti's representation is smaller than that from the NAF and JSF representations.
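The measurement of a performance factor constant can be sketched in a few lines of Python. This is not the authors' Visual C++ simulation: it covers only Shamir's trick with the binary and NAF representations, and the bit length of 256, the sample count, the seed, and all function names are assumptions chosen for illustration.

```python
import random
from itertools import zip_longest


def binary_digits(k):
    """Binary digits of k, least significant first."""
    return [(k >> i) & 1 for i in range(k.bit_length())]


def naf(k):
    """NAF digits of k >= 0, least significant first."""
    digits = []
    while k > 0:
        d = (2 - (k % 4)) if (k & 1) else 0
        k -= d
        digits.append(d)
        k >>= 1
    return digits


def nonzero_columns(x, y):
    """Number of digit pairs that are not (0, 0), i.e., point additions."""
    return sum(1 for a, b in zip_longest(x, y, fillvalue=0) if a or b)


def performance_constant(recode, n=256, samples=2000, seed=1):
    """Average number of point additions per bit for random n-bit scalars."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        k, l = rng.getrandbits(n), rng.getrandbits(n)
        total += nonzero_columns(recode(k), recode(l))
    return total / (samples * n)


print(performance_constant(binary_digits))  # close to 3/4
print(performance_constant(naf))            # close to 5/9
```

The estimates should approach the averages 3n/4 and 5n/9 quoted for Shamir's trick with the binary and NAF representations, up to sampling noise and a small boundary effect from the leading digits.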
Table 2.
Comparison with the related methods.
The asynchronous method [23] mandatorily requires six or eight registers. When six registers are available, the asynchronous method is actually the same as the adaptive window method using the NAF representation. Additionally, the asynchronous method using eight registers is the corresponding sliding window method [21] with the window size . Hence, the asynchronous method can be treated as a special case of the adaptive window method. The interleaving method using the -NAF is not directly suitable for optimizing computation efficiency based on the available registers. If all five available registers are required to be used, Shamir’s trick with the -NAF and -NAF interleaving is the only choice. However, it makes no sense, since its performance factor constant is , even larger than that of Shamir’s trick with the -NAF interleaving (See Table 2). When five registers are available, the adaptive window method using our improved NAF representation requires the least number of point additions compared to the known methods.
We also verify the efficiency results on a real mobile phone. We use Eclipse to edit the Java code and the C code of the five algorithms in Table 2. JNI (Java Native Interface) is employed to realize the interaction between the C code and the Java code. The interaction process of the five algorithms' code is shown in Figure 4. Here, the C code is responsible for operating the CPU registers for the five algorithms. We then use Android Studio to deploy them on the Android system. For these five algorithms, the elliptic curve group is based on one of NIST's P-curves [1]. For each algorithm, 1,000,000 multi-scalar multiplications are computed, and the average running time is taken as the final result (see Figure 5). The efficiency results achieved on the physical device are broadly consistent with our theoretical expectation.
Figure 4.
Implementation frame of the five algorithms by using Java Native Interface (JNI).
Figure 5.
Average running time for each algorithm.
5. Future Work
Three possible directions for future improvement are as follows.
(1) The optimal signed binary representation for the adaptive window method. In practice, the improved NAF representation in Section 3.1 achieves the smallest performance factor among all well-known representations. However, we still do not know how to find the one with the best performance factor among all signed binary representations. Hence, it remains an open problem to find the optimal one for the adaptive window method with a fixed number of registers.
(2) The on-line strategy for the adaptive window method. To compute the multi-scalar multiplication, the fixed-window method and the sliding window method need to pre-compute and store all possible values for the pairs in a window. However, the adaptive window method only computes and stores part of them based on the number of available registers. Thus, it would be useful to check each pair in the window against the actual input integers on-line, and then determine the high-frequency values among all possible values in real time. If those high-frequency values were pre-computed and stored, the adaptive window method could be further improved in practical implementations. It might be interesting to investigate this on-line strategy further.
(3) The register-constrained implementation of the adaptive window method. We use Java code linked with C code to implement several multi-scalar multiplication algorithms on a mobile phone and obtain their corresponding efficiency results. However, neither the device nor the development tool is ideal for studying the register-constrained environment. Embedded hardware microprocessors, such as the ATmega and ARM families, programmed in assembly code, are more suitable for running our proposed multi-scalar multiplication algorithms and verifying their performance results. Additionally, novel implementation optimizations for our proposed algorithms may be designed for a particular embedded hardware microprocessor. Hence, it would be valuable to further implement the adaptive window method on embedded hardware microprocessors.
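The on-line strategy in direction (2) can be pictured with a small Python sketch. It is a deliberately simplified assumption of ours, not a proposal from the paper: the window is a single column, the integers are coded in NAF, and a column and its negation are counted together because negating a point is cheap. The names `naf` and `rank_columns` are hypothetical.

```python
from collections import Counter


def naf(k):
    """NAF digits of k >= 0, least significant first."""
    digits = []
    while k > 0:
        d = (2 - (k % 4)) if (k & 1) else 0
        k -= d
        digits.append(d)
        k >>= 1
    return digits


def rank_columns(k, l, m):
    """Return the m most frequent nonzero columns of the joint NAF of (k, l)."""
    dk, dl = naf(k), naf(l)
    n = max(len(dk), len(dl))
    dk += [0] * (n - len(dk))
    dl += [0] * (n - len(dl))
    counts = Counter()
    for a, b in zip(dk, dl):
        if (a, b) == (0, 0):
            continue
        # Normalize the sign so that a column and its negation share one entry.
        if a < 0 or (a == 0 and b < 0):
            a, b = -a, -b
        counts[(a, b)] += 1
    return counts.most_common(m)


print(rank_columns(7, 7, 1))
```

A real on-line strategy would rank multi-column windows and weigh the cost of the extra pre-computation, which this sketch ignores.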
6. Conclusions
We have studied the cryptographic implementation of multi-scalar multiplication under register-constrained environments. To make the best of the available registers, our idea is not to store all possible values in the window, but only to store those with comparatively high probabilities. The computational complexity analysis and the experimental results show that the proposed adaptive window method achieves a notable gain in computation efficiency when one more register is provided. For embedded cryptographic applications, our method makes it especially convenient to balance the performance and the costs according to the computation and memory abilities of the embedded devices. We also expect that our research will inspire further work on algorithms for multi-scalar multiplication under resource-constrained environments.
Author Contributions
Conceptualization, D.-Z.S.; methodology, D.-Z.S.; validation, D.-Z.S. and J.-D.Z.; formal analysis, D.-Z.S.; investigation, H.-D.Z.; writing—original draft preparation, D.-Z.S. and H.-D.Z.; writing—review and editing, D.-Z.S. and X.-Y.G.; supervision, D.-Z.S. and J.-D.Z.; funding acquisition, D.-Z.S. All authors have read and agreed to the published version of the manuscript.
Funding
The work of Da-Zhi Sun was supported in part by the National Natural Science Foundation of China under Grant No. 61872264. The APC was funded by the National Natural Science Foundation of China under Grant No. 61872264.
Acknowledgments
The authors would like to thank the editor and the reviewers for their valuable suggestions and comments.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Some Notations Used in the Appendices
Let and be the signed binary representations. Supposing , we write
(1) denotes the probability of any ;
(2) denotes the conditional probability of any , given that ;
(3) denotes the probability of any and ;
(4) denotes the probability of any and , given that and .
For example, , , , and , respectively, mean , , , and , when .
Appendix B. Some Facts of NAF Representation
To analyze the computational complexity of our proposed algorithms, we first need to review two well-known properties of the NAF representation [18,19] as follows.
Lemma A1.
The NAF representation of an integer is unique. No two nonzero digits are adjacent in the representation.
Lemma A2.
For any bit of the NAF representation, the probabilities of 0, 1, and −1 are, respectively, 2/3, 1/6, and 1/6.
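The well-known digit probabilities of the NAF (2/3 for 0 and 1/6 each for 1 and −1) can be checked numerically. The sketch below is our own illustration and relies on one assumption worth stating: the NAF digit at position i depends only on the low i + 2 bits of the integer, so enumerating all integers below 2^(i+2) gives the exact distribution of that digit for uniformly random large integers. The name `naf_digit_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction


def naf(k):
    """NAF digits of k >= 0, least significant first."""
    digits = []
    while k > 0:
        d = (2 - (k % 4)) if (k & 1) else 0
        k -= d
        digits.append(d)
        k >>= 1
    return digits


def naf_digit_distribution(i):
    """Exact distribution of the NAF digit at position i (position 0 is lowest)."""
    total = 1 << (i + 2)
    counts = Counter()
    for k in range(total):
        digits = naf(k)
        counts[digits[i] if i < len(digits) else 0] += 1
    return {d: Fraction(counts[d], total) for d in (0, 1, -1)}


dist = naf_digit_distribution(11)
print({d: float(p) for d, p in dist.items()})
```

The lowest positions deviate from the limit (the digit at position 0 is zero with probability 1/2), and the distribution converges quickly to 2/3, 1/6, and 1/6 as the position grows.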
We can further give three useful properties of the NAF representation. The following three properties can be easily derived from the NAF coding algorithm.
Lemma A3.
For any two consecutive bits of the NAF representation,
and
Lemma A4.
For any three consecutive bits of the NAF representation,
and
Lemma A5.
For any four consecutive bits of the NAF representation,
and
Appendix C. Proof of Theorem 1
Proof.
Consider that Figure 3 scans the input NAF sequence pair . Let , , , , , , , and , respectively, be the corresponding scanning states , , , , , , , and . It means that one of Steps 3.1, 3.2, 3.3, 3.4, 3.5, and 4 is executed in Figure 3. Note that the NAF representations of the integers and are independent from each other. Based on the properties of the NAF representation, we can compute all one-step transition probabilities for the scanning states . For example, we can compute
by Lemmas A2 and A3;
by Lemma A1;
by Lemmas A3 and A5;
by Lemmas A2 and A4.
Thus, the one-step transition probability matrix of the states is
where denotes the conditional probability of the next state given the current state .
Since the matrix has all positive elements, this Markov chain is a regular chain. Let be the probability of the state as . According to the theorems of the regular Markov chain, we have
In Figure 3, we can see that the state needs no point addition, and each state needs one point addition. Thus, on average, the asymptotic performance factor is
□
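The step from a transition matrix to limiting state probabilities can be illustrated with a toy example. The sketch below does not use the transition matrix of the proof; as a stand-in we assume a two-state chain for a single NAF sequence (state 0: current digit is zero, state 1: current digit is nonzero), where a nonzero digit is always followed by a zero and a zero is followed by a nonzero digit with probability 1/2. The function name `stationary` and the iteration count are our own choices.

```python
def stationary(P, steps=200):
    """Approximate the limiting distribution of a regular chain by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(steps):
        # One step of the chain: pi <- pi * P
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Toy chain (an assumption for illustration, not the matrix of Theorem 1).
P = [
    [0.5, 0.5],  # from "zero digit": next digit is zero or nonzero equally often
    [1.0, 0.0],  # from "nonzero digit": next digit must be zero (non-adjacency)
]
pi = stationary(P)
print(pi)  # approximately [2/3, 1/3]
```

Weighting each state by the point additions it costs, as done in the proof, then gives the average cost per scanned position; for this toy chain that cost is the probability 1/3 of a nonzero digit.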
Appendix D. Proof of Theorem 2
Proof.
Consider the pairs in the recoding rules. Based on the scanning process of the input NAF sequence pair in Figure 3, each pair using Rule A1, A2, A3, or A4 requires two point additions before the recoding process but only one point addition after it. For example, before recoding, the first two columns of such a pair pass consecutively through Steps 3.3 and 3.4 of Figure 3, which require two point additions in total. After recoding, the first two columns instead execute Steps 3.1 and 3.5 of Figure 3, which require a single point addition. Similarly, each pair using Rule A5, A6, A7, or A8 requires three point additions before the recoding process but only two point additions after it.
Next, according to Rules A1, A2, A3, A4, A5, A6, A7, and A8, we calculate the probabilities of those pairs appearing in the NAF representation. We can obtain
and
by Lemmas A4 and A5.
Consequently, it follows from Theorem 1 that the performance factor of the improved Figure 3 can be estimated as
where denotes the number of saving point additions due to Rules A1, A2, A3, A4, A5, A6, A7, and A8.
□
Appendix E. Some Facts of JSF Representation
To analyze the -register algorithm using the JSF representation, we need the following important fact of the JSF representation.
Lemma A6.
Assume that the scan process of the sliding window method [22] is used for the JSF sequence pair with a fixed window size, and consider any pair of that size appearing in the JSF sequence pair. On average, the probabilities of all possible pairs are
and
when n → ∞.
Proof.
We assume that the reader is already acquainted with the results in Solinas’ technical report [11], from which we recall a few important facts. For the JSF coding algorithm, the JSF coding output is a function of the internal current state , where and . We can further extend the detailed relations between the states and the corresponding outputs as follows:
(1) maps to the current output ;
(2) maps to the case where the current output is and the next output will be ;
(3) maps to the case where the current output is and the next output will be ;
(4) maps to the case where the current output is and the next output will be ;
(5) maps to the case where the current output is and the next output will be ;
(6) maps to the case where the current output is and the next output will be ;
(7) maps to the case where the current output is and the next output will be ;
(8) maps to the current output .
According to Solinas’ result, the one-step transition probability matrix of the states is
Since all the elements in the matrix are positive, this Markov chain is a regular chain. Let be the probability of the state as . According to the theorems of the regular Markov chain, we have
Because of previous relations between the states and the corresponding outputs, we know
Furthermore, by the JSF coding algorithm, i.e., Algorithm Shamir’s trick in Figure 1 in [11], all possible -bit pairs , , , , , , , , , , , , , , , and should have the same probability. It means that
□
Appendix F. Proof of Theorem 3
Proof.
Our -register algorithm using the JSF representation is almost the same as Figure 3 but with the input and represented in the JSF. By Lemma A6, our algorithm is optimal. Because the value is stored in Step 1 of Figure 3, the -bit pairs and in the JSF sequence pair only require one point addition. Therefore, the corresponding performance factor is
□
References
1. National Institute of Standards and Technology. Federal Information Processing Standards Publication 186-3: Digital Signature Standard (DSS). 2009. Available online: https://csrc.nist.gov/csrc/media/publications/fips/186/3/archive/2009-06-25/documents/fips_186-3.pdf (accessed on 5 November 2020).
2. American National Standards Institute. ANSI X9.62: Public Key Cryptography for the Financial Services Industry: The Elliptic Curve Digital Signature Algorithm (ECDSA); American National Standards Institute: New York, NY, USA, 2005.
3. Schnorr, C.P. Efficient signature generation by smart cards. J. Cryptol. 1991, 4, 161–174.
4. Fuchsbauer, G.; Orrù, M.; Seurin, Y. Aggregate Cash Systems: A Cryptographic Investigation of Mimblewimble. In Proceedings of the 38th Annual International Conference on the Theory and Applications of Cryptographic Techniques Selected Areas in Cryptography (EUROCRYPT 2019), Part I, Darmstadt, Germany, 19–23 May 2019; Ishai, Y., Rijmen, V., Eds.; Lecture Notes in Computer Science. Springer: Cham, Switzerland, 2019; Volume 11476, pp. 657–689.
5. Sun, D.Z.; Sun, L.; Yang, Y. On secure simple pairing in Bluetooth standard v5.0-part II: Privacy analysis and enhancement for low energy. Sensors 2019, 19, 3259.
6. Zhang, Y.D.; He, D.B.; Zhang, M.W.; Choo, K.K.R. A provable-secure and practical two-party distributed signing protocol for SM2 signature algorithm. Front. Comput. Sci. China 2020, 14, 143803.
7. Chen, E.; Zhu, Y.; Lin, C.L.; Lv, K.W. Zero-pole cancellation for identity-based aggregators: A constant-size designated verifier-set signature. Front. Comput. Sci. China 2020, 14, 144806.
8. Gordon, D.M. A survey of fast exponentiation methods. J. Algorithms 1998, 27, 129–146.
9. ElGamal, T. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theory 1985, 31, 469–472.
10. Dimitrov, V.S.; Jullien, G.A.; Miller, W.C. Complexity and fast algorithms for multiexponentiations. IEEE Trans. Comput. 2000, 49, 141–147.
11. Solinas, J.A. Low-Weight Binary Representations for Pairs of Integers; Combinatorics and Optimization Research Report CORR 2001-41, Centre for Applied Cryptographic Research, University of Waterloo. 2001. Available online: http://www.cacr.math.uwaterloo.ca/techreports/2001/corr2001-41.ps (accessed on 5 November 2020).
12. Grabner, P.J.; Heuberger, C.; Prodinger, H. Distribution results for low-weight binary representations for pairs of integers. Theor. Comput. Sci. 2004, 319, 307–331.
13. Yang, W.C.; Guan, D.J.; Laih, C.S. Algorithm of asynchronous binary signed-digit recoding on fast multiexponentiation. Appl. Math. Comput. 2005, 167, 108–117.
14. Ruan, X.Y.; Katti, R.S. Left-to-right optimal signed-binary representation of a pair of integers. IEEE Trans. Comput. 2005, 54, 124–131.
15. Sun, D.Z.; Huai, J.P.; Sun, J.Z.; Zhang, J.W. Computational efficiency analysis of Wu et al.'s fast modular multi-exponentiation algorithm. Appl. Math. Comput. 2007, 190, 1848–1854.
16. Sun, D.Z.; Huai, J.P.; Sun, J.Z.; Li, J.X. Analysis of multi-exponentiation algorithm using binary signed-digit representations. Int. J. Comput. Methods 2009, 6, 307–315.
17. Yang, W.C.; Hung, C.P. Analysis of the Dimitrov-Jullien-Miller recoding algorithm. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 2016, E99A, 139–144.
18. Arno, S.; Wheeler, F.S. Signed digit representations of minimal hamming weight. IEEE Trans. Comput. 1993, 42, 1007–1010.
19. Menezes, A.; van Oorschot, P.; Vanstone, S. Handbook of Applied Cryptography; CRC Press: Boca Raton, FL, USA, 1997; pp. 627–628.
20. Yen, S.M.; Laih, C.S.; Lenstra, A.K. Multi-exponentiation. IEE Proc. Comp. Digit. Tech. 1994, 141, 325–326.
21. Möller, B. Algorithms for Multi-exponentiation. In Proceedings of the Selected Areas in Cryptography (SAC 2001), Toronto, ON, Canada, 16–17 August 2001; Vaudenay, S., Youssef, A.M., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2001; Volume 2259, pp. 165–180.
22. Avanzi, R.M. The complexity of certain multi-exponentiation techniques in cryptography. J. Cryptol. 2005, 18, 357–373.
23. Yang, W.C.; Guan, D.J.; Laih, C.S. Fast multicomputation with asynchronous strategy. IEEE Trans. Comput. 2007, 56, 234–242.
24. Sun, D.Z.; Huai, J.P.; Li, J.X. A note on asynchronous multi-exponentiation algorithm using binary representation. Inf. Process. Lett. 2012, 112, 876–879.
25. Chevalier, C.; Laguillaumie, F.; Vergnaud, D. Privately outsourcing exponentiation to a single server: Cryptanalysis and optimal constructions. Algorithmica 2020, 83, 72–115.
26. Borges, F.; Lara, P.; Portugal, R. Parallel algorithms for modular multi-exponentiation. Appl. Math. Comput. 2017, 292, 406–416.
27. Topcuoglu, C.; Kaya, K.; Savas, E. A generic private information retrieval scheme with parallel multi-exponentiations on multicore processors. Concurr. Comput. Pract. Exp. 2018, 30, e4685.
28. Tao, R.; Liu, J.; Su, H.; Sun, Y.; Liu, X. Combination in Advance Batch Multi-exponentiation on Elliptic Curve. In Proceedings of the 2015 IEEE 2nd International Conference on Cyber Security and Cloud Computing (CSCloud 2015), New York, NY, USA, 3–5 November 2015; Qiu, M.K., Zhang, T., Das, S., Eds.; IEEE Computer Society: Washington, DC, USA, 2015; pp. 411–416.
29. Wu, Q.H.; Sun, Y.; Qin, B.; Hu, J.K.; Liu, W.R.; Liu, J.W.; Ding, Y. Batch public key cryptosystem with batch multi-exponentiation. Futur. Gener. Comp. Syst. 2016, 62, 196–204.
30. Atmel Corporation. 8-Bit AVR Microcontroller with 128K Bytes In-System Programmable Flash. 2007. Available online: http://ww1.microchip.com/downloads/en/DeviceDoc/doc0945.pdf (accessed on 5 February 2021).
31. ARM. ARM7TDMI Technical Reference Manual (Rev 3). 2017. Available online: http://ww1.microchip.com/downloads/en/DeviceDoc/DDI0029G_7TDMI_R3_trm.pdf (accessed on 5 February 2021).
32. Hankerson, D.; Menezes, A.; Vanstone, S. Guide to Elliptic Curve Cryptography; Springer: New York, NY, USA, 2004; pp. 109–113.
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).