On the Performance and Security of Multiplication in GF(2^N)
Abstract
1. Introduction
2. The Field in Cryptography: Arithmetic and Suitability
2.1. Application to Block Ciphers
2.2. Application to Classical Public-Key Cryptography
2.3. Application to Post-Quantum Public-Key Cryptography
2.4. Arithmetic in Extensions of
2.5. Tower Fields Representation
- is an irreducible polynomial over of degree ℓ,
- is an irreducible polynomial over of degree m,
- is an irreducible polynomial over of degree N,
2.6. Composite Fields and Fields Mapping
- is known for some integer r and;
- .
- (1)
- ,
- (2)
- if β is a primitive element, then γ is primitive in .
- where , hence
- hence,
3. Results and Discussion
3.1. Multiplication in
Algorithm 1: Initialization of the antilog table
Require: The finite field and its generator polynomial P.
Ensure: The antilog table.
Algorithm 2: Initialization of the log table
Require: The antilog table.
Ensure: The log table.
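Algorithms 1 and 2 can be sketched in C as follows. This is a minimal illustration, assuming the field GF(2^6) with generator polynomial X^6 + X + 1 that Appendix A uses; the names `gf64_init_tables`, `gf64_mul_table`, `gf64_mul_ref` and `gf64_table_mismatches` are hypothetical and do not come from the DAGS code base. Both tables are filled in a single pass, and the table-based product is cross-checked against a plain shift-and-add multiplication.

```c
#include <stdint.h>

#define GF_BITS 6
#define GF_CARD (1u << GF_BITS)   /* 64 field elements */
#define GF_ORD  (GF_CARD - 1u)    /* 63 nonzero elements */
#define GF_POLY 0x43u             /* X^6 + X + 1 (assumed generator polynomial) */

static uint8_t antilog_tab[GF_CARD]; /* antilog_tab[i] = alpha^i, alpha = X */
static uint8_t log_tab[GF_CARD];     /* log_tab[alpha^i] = i */

/* Fill both tables by repeatedly multiplying by alpha = X. */
void gf64_init_tables(void)
{
    unsigned v = 1;
    for (unsigned i = 0; i < GF_ORD; ++i) {
        antilog_tab[i] = (uint8_t)v;
        log_tab[v] = (uint8_t)i;
        v <<= 1;
        if (v & GF_CARD)          /* degree reached 6: reduce */
            v ^= GF_POLY;
    }
    antilog_tab[GF_ORD] = 1;      /* alpha^63 = 1 */
    log_tab[0] = 0;               /* log(0) is undefined; never read below */
}

/* Table-based product; the zero test makes it non-constant-time. */
uint8_t gf64_mul_table(uint8_t x, uint8_t y)
{
    if (x == 0 || y == 0)
        return 0;
    return antilog_tab[(log_tab[x] + log_tab[y]) % GF_ORD];
}

/* Reference product: shift-and-add with reduction after each shift. */
uint8_t gf64_mul_ref(uint8_t x, uint8_t y)
{
    unsigned a = x, r = 0;
    for (int i = 0; i < GF_BITS; ++i) {
        if ((y >> i) & 1u)
            r ^= a;
        a <<= 1;
        if (a & GF_CARD)
            a ^= GF_POLY;
    }
    return (uint8_t)r;
}

/* Number of operand pairs on which the two products disagree. */
int gf64_table_mismatches(void)
{
    int bad = 0;
    gf64_init_tables();
    for (unsigned x = 0; x < GF_CARD; ++x)
        for (unsigned y = 0; y < GF_CARD; ++y)
            bad += gf64_mul_table((uint8_t)x, (uint8_t)y)
                   != gf64_mul_ref((uint8_t)x, (uint8_t)y);
    return bad;
}
```

Calling `gf64_table_mismatches()` once initializes the tables and compares all 64 × 64 operand pairs. Static arrays are used here instead of heap allocation purely to keep the sketch short.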
Algorithm 3: Multiplication in the tower field in
Require: Two polynomials , and an extension polynomial p of order 2.
Ensure: Polynomial
Algorithm 4: Iterative multiplication with conditional reduction
Require: Two polynomials , of orders at most n and a reduction polynomial P of order n.
Ensure: Polynomial of order n
3.2. Secure Computation in
Algorithm 5: Iterative multiplication with unconditional reduction
Require: Two polynomials , of orders at most n and a reduction polynomial P of order n.
Ensure: Polynomial of order n
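The difference between conditional and unconditional reduction can be sketched as below. This is an illustration only, assuming GF(2^6) with reduction polynomial X^6 + X + 1 as in Appendix A; the function names are hypothetical. The first variant branches on operand bits, while the second replaces each branch by an all-ones or all-zeros mask, so the same instruction sequence runs for every input at the C level. Whether the compiled code preserves that property depends on the compiler and its flags, which this sketch does not verify.

```c
#include <stdint.h>

#define GF_BITS 6
#define GF_POLY 0x43u   /* X^6 + X + 1 (assumed reduction polynomial) */

/* Conditional form: both branches depend on secret operand bits. */
uint16_t gf64_mul_branchy(uint16_t x, uint16_t y)
{
    uint16_t res = 0;
    for (int i = 0; i < GF_BITS; ++i) {
        if (y & 1u)
            res ^= x;             /* add x when the current bit of y is 1 */
        y >>= 1;
        x = (uint16_t)(x << 1);
        if (x & 0x40u)
            x ^= GF_POLY;         /* reduce when degree reaches 6 */
    }
    return res;
}

/* Unconditional form: masks replace the branches. */
uint16_t gf64_mul_masked(uint16_t x, uint16_t y)
{
    uint16_t res = 0;
    for (int i = 0; i < GF_BITS; ++i) {
        uint16_t m = (uint16_t)(0u - (y & 1u));   /* 0xFFFF or 0x0000 */
        res ^= (uint16_t)(x & m);
        y >>= 1;
        x = (uint16_t)(x << 1);
        m = (uint16_t)(0u - ((x >> 6) & 1u));     /* 0xFFFF iff bit 6 is set */
        x ^= (uint16_t)(m & GF_POLY);
    }
    return res;
}

/* Returns 1 if both forms agree on every pair of field elements. */
int gf64_forms_agree(void)
{
    for (unsigned x = 0; x < 64; ++x)
        for (unsigned y = 0; y < 64; ++y)
            if (gf64_mul_branchy((uint16_t)x, (uint16_t)y)
                != gf64_mul_masked((uint16_t)x, (uint16_t)y))
                return 0;
    return 1;
}
```

The exhaustive comparison in `gf64_forms_agree` only shows functional equivalence; timing behaviour has to be checked on the generated assembly.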
Algorithm 6: Bitsliced multiplication
Require: n-bit words and where
Ensure: 64 n-bit words
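A loop-based sketch of bitsliced multiplication is given below. It assumes n = 6 and the reduction polynomial X^6 + X + 1 from Appendix A; the names `bs_pack`, `bs_unpack`, `bs_mul` and `bs_mul_mismatches` are hypothetical. Each 64-bit word holds one coefficient of 64 independent field elements, so every AND and XOR acts on 64 products at once and no instruction depends on the data.

```c
#include <stdint.h>

#define GF_BITS 6
#define LANES   64

/* Transpose 64 field elements into 6 bit-planes (plane b holds bit b of each). */
void bs_pack(const uint8_t in[LANES], uint64_t out[GF_BITS])
{
    for (int b = 0; b < GF_BITS; ++b)
        out[b] = 0;
    for (int i = 0; i < LANES; ++i)
        for (int b = 0; b < GF_BITS; ++b)
            out[b] |= (uint64_t)((in[i] >> b) & 1u) << i;
}

/* Inverse transposition: rebuild the 64 elements from the 6 bit-planes. */
void bs_unpack(const uint64_t in[GF_BITS], uint8_t out[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        uint8_t v = 0;
        for (int b = 0; b < GF_BITS; ++b)
            v |= (uint8_t)(((in[b] >> i) & 1u) << b);
        out[i] = v;
    }
}

/* r = x * y in 64 lanes at once; r must not alias x or y. */
void bs_mul(const uint64_t x[GF_BITS], const uint64_t y[GF_BITS],
            uint64_t r[GF_BITS])
{
    uint64_t a[GF_BITS];
    for (int b = 0; b < GF_BITS; ++b) {
        a[b] = x[b];
        r[b] = 0;
    }
    for (int j = 0; j < GF_BITS; ++j) {
        for (int b = 0; b < GF_BITS; ++b)
            r[b] ^= a[b] & y[j];          /* add a * (bit j of y) */
        /* a <- a * X mod X^6 + X + 1: X^6 folds back into X^1 and X^0 */
        uint64_t top = a[GF_BITS - 1];
        for (int b = GF_BITS - 1; b > 0; --b)
            a[b] = a[b - 1];
        a[0] = top;
        a[1] ^= top;
    }
}

/* Scalar reference used only for checking. */
static uint8_t mul_ref(uint8_t x, uint8_t y)
{
    unsigned a = x, r = 0;
    for (int i = 0; i < GF_BITS; ++i) {
        if ((y >> i) & 1u)
            r ^= a;
        a <<= 1;
        if (a & 0x40u)
            a ^= 0x43u;
    }
    return (uint8_t)r;
}

/* Number of lanes where the bitsliced and scalar products differ. */
int bs_mul_mismatches(void)
{
    uint8_t xs[LANES], ys[LANES], zs[LANES];
    uint64_t xb[GF_BITS], yb[GF_BITS], zb[GF_BITS];
    int bad = 0;
    for (int i = 0; i < LANES; ++i) {
        xs[i] = (uint8_t)i;
        ys[i] = (uint8_t)((i * 37 + 11) & 63);
    }
    bs_pack(xs, xb);
    bs_pack(ys, yb);
    bs_mul(xb, yb, zb);
    bs_unpack(zb, zs);
    for (int i = 0; i < LANES; ++i)
        bad += zs[i] != mul_ref(xs[i], ys[i]);
    return bad;
}
```

The loop form trades the fully unrolled expressions of a hand-written version for brevity; a compiler is free to unroll it, but that is not guaranteed.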
4. Case Study: Optimization of DAGS
4.1. Initial Choice of Parameters
4.2. Improved Field Selection
- Key Generation,
- Encapsulation,
- Decapsulation.
- Tabulated log/antilog (Algorithms A1–A3),
- Iterative, conditional reduction (Algorithm A5),
- Iterative, ASM with PCLMUL, conditional reduction (Algorithm A5),
- Iterative, unconditional reduction (Algorithm A6),
- Iterative, ASM with PCLMUL, unconditional reduction (Algorithm A6),
- Iterative, unconditional reduction, 1-bit-sliced, 64 computations in parallel (Algorithm A7),
- Iterative, ASM with PCLMUL, unconditional reduction, bit-sliced, 2 computations in parallel (Algorithm A8).
4.3. Implementation Performances
- (*): Conversion from to using T in Example (1) takes 112 cycles; using the POPCNT ASM instruction, it takes 38 cycles (Algorithm A9).
- (**): Time to initialize the tables (Algorithms A1 and A2):
  - 2360 cycles on ,
  - 267,086 cycles on , and
  - 7884 cycles on
  (the tables can be precomputed, hence cycles = 0).
- (***): Transposition (Algorithm A7.1) time is:
  - 780 cycles on , and
  - 1613 cycles on .
- (****): Transposition is 2 cycles on .
- The tabulated log-antilog version is the fastest amongst non-parallel algorithms.
- It is faster to implement tower field computation directly in an isomorphic field of characteristic two.
- Modular multiplication using the dedicated carry-less multiplication (PCLMUL) assembly (ASM) instruction does not improve speed, because the function-call overhead dominates the computation for such small values of N. However, when only a single serial operation is needed, PCLMUL should be used because it has the lowest latency.
- Constant-time, cache-secure implementations are slower than unprotected ones. Moreover, we noticed that “conditional reduction” in C code is actually constant-time once compiled to assembly code (when the optimization flag is set), because the compiler uses the CMOV (conditional move) assembly instruction, which executes in a single clock cycle.
- The bitsliced single multiplication takes only cycle over and cycles for and is invulnerable to cache-timing attacks. Thus, it is our champion implementation to be chosen for fast and secure arithmetic over .
- For the second version of bitsliced implementation, we pack two words as . Then, the products and can be computed in one go by noticing that PCLMULQDQ() = PCLMULQDQ(, ) + (PCLMULQDQ(X, )⊕ PCLMULQDQ(, Y)) + PCLMULQDQ(X, Y); hence, the results are obtained at bit indices and .
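The packing trick in the last item can be sketched without inline assembly. The code below is an illustration under stated assumptions: `clmul64` is a portable stand-in for the PCLMULQDQ instruction, the 12-bit packing offset and the result positions (bit indices 0 and 24) are taken from Algorithm A8 in Appendix A, and all function names are hypothetical. Because carry-less multiplication is linear over GF(2), the two wanted products land in disjoint bit ranges and the cross terms fall in between, where they are simply ignored.

```c
#include <stdint.h>

/* Portable carry-less multiplication, standing in for PCLMULQDQ.
   Only valid when the product fits in 64 bits (true for 18-bit operands). */
uint64_t clmul64(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; ++i) {
        uint64_t m = (uint64_t)0 - ((b >> i) & 1u);  /* all ones iff bit i of b */
        r ^= (a << i) & m;
    }
    return r;
}

/* Scalar reference: shift-and-add multiplication modulo X^6 + X + 1. */
static uint8_t mul_ref(uint8_t x, uint8_t y)
{
    unsigned a = x, r = 0;
    for (int i = 0; i < 6; ++i) {
        if ((y >> i) & 1u)
            r ^= a;
        a <<= 1;
        if (a & 0x40u)
            a ^= 0x43u;
    }
    return (uint8_t)r;
}

/* Two GF(2^6) products with one carry-less multiplication. */
void gf64_mul2_packed(const uint8_t x[2], const uint8_t y[2], uint8_t out[2])
{
    uint64_t xp = ((uint64_t)x[1] << 12) | x[0];
    uint64_t yp = ((uint64_t)y[1] << 12) | y[0];
    /* bits 0..10: x0*y0, bits 12..22: cross terms, bits 24..34: x1*y1 */
    uint64_t a = clmul64(xp, yp);

    /* Branch-free reduction of both products, from degree 10 down to 6. */
    for (int d = 10; d >= 6; --d) {
        uint64_t s  = (uint64_t)0x43u << (d - 6);
        uint64_t m0 = (uint64_t)0 - ((a >> d) & 1u);
        uint64_t m1 = (uint64_t)0 - ((a >> (d + 24)) & 1u);
        a ^= (s & m0) | ((s << 24) & m1);
    }
    out[0] = (uint8_t)(a & 0x3Fu);
    out[1] = (uint8_t)((a >> 24) & 0x3Fu);
}

/* Number of disagreements with the scalar reference over a sample of inputs. */
int gf64_mul2_mismatches(void)
{
    int bad = 0;
    for (unsigned x0 = 0; x0 < 64; ++x0)
        for (unsigned y0 = 0; y0 < 64; ++y0) {
            uint8_t x[2] = { (uint8_t)x0, (uint8_t)(63u - x0) };
            uint8_t y[2] = { (uint8_t)y0, (uint8_t)((y0 * 5u + 3u) & 63u) };
            uint8_t z[2];
            gf64_mul2_packed(x, y, z);
            bad += z[0] != mul_ref(x[0], y[0]);
            bad += z[1] != mul_ref(x[1], y[1]);
        }
    return bad;
}
```

The software `clmul64` is far slower than the hardware instruction; it is used here only so that the identity can be checked on any platform.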
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Appendix A. C Code for Various Algorithms
- #include <stdlib.h>
- #include <stdint.h>
- typedef uint16_t gf_t; /* Galois field elements */
- #define gf_extd 6
- #define gf_card (1 << gf_extd)
- #define gf_ord ((gf_card)-1)
- #define poly_primitive_subfield 67 // 0x43 (0b01000011): the bits encode
- // the polynomial X^6 + X + 1
- /* Algorithm A1: Precomputation of the antilog table for F(2)[x]/x^6+x+1 */
- static gf_t *gf_antilog;
- void gf_init_antilog()
- {
- int i = 1;
- int temp = 1 << (gf_extd - 1);
- gf_antilog = (gf_t *)malloc((gf_card * sizeof(gf_t)));
- gf_antilog[0] = 1; // alpha^0 = 1: seed of the recurrence below
- for (i = 1; i < gf_ord; ++i)
- {
- gf_antilog[i] = gf_antilog[i - 1] << 1;
- if ((gf_antilog[i - 1]) & temp)
- {
- // XOR with 67: X^6 + x + 1
- gf_antilog[i] ^= poly_primitive_subfield;
- }
- }
- gf_antilog[gf_ord] = 1;
- }
- /* Algorithm A2: Precomputation of the log table for F(2)[x]/x^6+x+1 */
- static gf_t* gf_log;
- void gf_init_log()
- {
- int i = 1;
- gf_log = (gf_t *)malloc((gf_card * sizeof(gf_t)));
- gf_log[0] = -1; // Dummy value (not used)
- gf_log[1] = 0;
- for (i = 1; i < gf_ord; ++i)
- {
- gf_log[gf_antilog[i]] = i;
- }
- }
- /* Algorithm A3: Tabulated multiplication over GF(2^6) */
- /* Uses precomputed tables to accelerate the multiplication: it relies on */
- /* Algorithms A1 and A2, which are run just once at DAGS initialization. */
- /* This algorithm is not constant-time, so it is not protected. */
- /* Both operands are tested: log(0) is undefined, so a zero operand yields 0. */
- #define gf_mult_tabulated(x,y) (((x) && (y)) ? gf_antilog[(gf_log[x] + gf_log[y]) % gf_ord] : 0)
- //not constant-time
- /* Algorithm A4: Tabulated multiplication over GF((2^6)^2) */
- /* Uses Algorithm A3, so it is not protected */
- gf_t gf_mult_extension_tabulated(gf_t x, gf_t y)
- {
- gf_t a1, b1, a2, b2, a3, b3;
- a1 = x >> 6;
- b1 = x & 63;
- a2 = y >> 6;
- b2 = y & 63;
- // not constant-time
- a3 = gf_mult_tabulated(gf_mult_tabulated(a1, a2), 36) ^
- gf_mult_tabulated(a1, b2)^ gf_mult_tabulated(b1, a2);
- //36 is p_1 in the extension polynomial
- b3 = gf_mult_tabulated(gf_mult_tabulated(a1, a2), 2)
- ^ gf_mult_tabulated(b1, b2); //2 is p_0 in the extension polynomial
- return (a3 << 6) ^ b3;
- }
- /* Algorithm A5: Iterative multiplication over GF(2^6) with conditional reduction */
- /* The multiplication does not use the precomputed tables; the */
- /* ASM PCLMUL instruction can be used. It is not constant-time. */
- gf_t gf_mult_iterative_conditional(gf_t x, gf_t y)
- {
- #ifndef PCLMUL
- gf_t res;
- res = 0; // this variable will contain the result
- for(int i=0;i<6;++i) // For each coefficient of the polynomial
- {
- if((y&1)==1) // Check the coefficient, it is not constant-time
- {
- res = res ^ x; // addition
- }
- y=y>>1;
- //this shift permits to have the next coefficient b_i for the next iteration
- x = x << 1;
- if((x & 64) != 0)
- // x must be reduced modulo X^6+X+1, 64 for 0x40, 0b01000000
- {
- x ^= 67; // 0x43: X^6 + x + 1
- }
- }
- return res;
- #else
- //using ASM PCLMUL instruction
- uint32_t a, m;
- // Multiplication
- asm volatile ("movdqa %1, %%xmm0;\n\t"
- "movdqa %2, %%xmm1;\n\t"
- "pclmulqdq $0x00, %%xmm0, %%xmm1;\n\t"
- "movdqa %%xmm1, %0;\n\t"
- : "=x"(a)
- : "x"((uint32_t)y), "x"((uint32_t)x)
- : "%xmm0","%xmm1"
- );
- // reduction polynomial (conditional reduction)
- for (int k=0; k<6; k++) { // For each coefficient of the polynomial
- if (a >> (11-k)) //not constant-time
- {
- a ^= (67 << (5-k)); // 0x43: X^6 + x + 1
- }
- }
- return a&0xFFFF;
- #endif
- }
- /* Algorithm A6: Iterative multiplication over GF(2^6) with unconditional reduction */
- /* The multiplication does not use the precomputed tables; the */
- /* ASM PCLMUL instruction can be used. It is constant-time. */
- gf_t gf_mult_iterative_unconditional(gf_t x, gf_t y)
- {
- #ifndef PCLMUL
- gf_t res,m;
- res = 0;
- for(int i=0;i<6;++i)
- // For each coefficient of the polynomial, constant-time
- {
- m = -(y&1); //m is either 0xffff or 0x0000
- res = res ^ (x&m); // addition
- y=y>>1;
- x = x << 1;
- // x must be reduced modulo X^6+X+1
- m=-(((x)>>6)&1);
- x ^= m & 67; // 0x43 : X^6 + x + 1
- }
- return res;
- #else
- //using ASM PCLMUL instruction
- uint32_t a, m;
- // multiplication
- asm volatile ("movdqa %1, %%xmm0;\n\t"
- "movdqa %2, %%xmm1;\n\t"
- "pclmulqdq $0x00, %%xmm0, %%xmm1;\n\t"
- "movdqa %%xmm1, %0;\n\t"
- : "=x"(a)
- : "x"((uint32_t)y), "x"((uint32_t)x)
- :"%xmm0","%xmm1"
- );
- // reduction polynomial
- for (int k=0; k<6; k++) {
- m = -((a >> (11-k))&1);
- a ^= ((67 << (5-k))&m); // 0x43: X^6 + x + 1
- }
- return a&0xFFFF;
- #endif
- }
- /* Algorithm A7: 1-bitsliced multiplication
- over GF(2^6) (64 computations in parallel) */
- /* A7.1: Transpositions */
- void to_bitslice(gf_t *x, uint64_t *res) {
- int i = 0;
- for (i = 0; i<64; i++) {
- res[0] |= (((((uint64_t)x[i])) & 1) << i);
- res[1] |= (((((uint64_t)x[i]) >> 1) & 1) << i);
- res[2] |= (((((uint64_t)x[i]) >> 2) & 1) << i);
- res[3] |= (((((uint64_t)x[i]) >> 3) & 1) << i);
- res[4] |= (((((uint64_t)x[i]) >> 4) & 1) << i);
- res[5] |= (((((uint64_t)x[i]) >> 5) & 1) << i);
- }
- }
- void from_bitslice(uint64_t *res, gf_t *x) {
- int i = 0;
- for (i = 0;i<64; i++) {
- x[i] = (((res[0] >> i) & 1)); // plain assignment: x need not be zeroed by the caller
- x[i] |= (((res[1] >> i) & 1) << 1);
- x[i] |= (((res[2] >> i) & 1) << 2);
- x[i] |= (((res[3] >> i) & 1) << 3);
- x[i] |= (((res[4] >> i) & 1) << 4);
- x[i] |= (((res[5] >> i) & 1) << 5);
- }
- }
- /* A7.2: 1-bit-sliced multiplication (SIMD code) */
- void gf_multsubTab(gf_t *x, gf_t *y, gf_t *z)
- {
- uint64_t xbin[6];
- uint64_t ybin[6];
- uint64_t res[6];
- xbin[0]=xbin[1]=xbin[2]=xbin[3]=xbin[4]=xbin[5] = 0;
- ybin[0]=ybin[1]=ybin[2]=ybin[3]=ybin[4]=ybin[5] = 0;
- // Transpose x and y
- to_bitslice(x, xbin);
- to_bitslice(y, ybin);
- // Multiplication and reduction polynomial
- //with 64 computations in parallel for each coefficient of the polynomial
- // constant-time
- uint64_t const xbin05 = xbin[0] ^ xbin[5];
- uint64_t const xbin54 = xbin[5] ^ xbin[4];
- uint64_t const xbin43 = xbin[4] ^ xbin[3];
- uint64_t const xbin32 = xbin[3] ^ xbin[2];
- uint64_t const xbin21 = xbin[2] ^ xbin[1];
- res[0] = (xbin[0] & ybin[0]);
- res[1] = (xbin[1] & ybin[0]);
- res[2] = (xbin[2] & ybin[0]);
- res[3] = (xbin[3] & ybin[0]);
- res[4] = (xbin[4] & ybin[0]);
- res[5] = (xbin[5] & ybin[0]);
- res[0] ^= (xbin[5] & ybin[1]);
- res[1] ^= (xbin05 & ybin[1]);
- res[2] ^= (xbin[1] & ybin[1]);
- res[3] ^= (xbin[2] & ybin[1]);
- res[4] ^= (xbin[3] & ybin[1]);
- res[5] ^= (xbin[4] & ybin[1]);
- res[0] ^= (xbin[4] & ybin[2]);
- res[1] ^= (xbin54 & ybin[2]);
- res[2] ^= (xbin05 & ybin[2]);
- res[3] ^= (xbin[1] & ybin[2]);
- res[4] ^= (xbin[2] & ybin[2]);
- res[5] ^= (xbin[3] & ybin[2]);
- res[0] ^= (xbin[3] & ybin[3]);
- res[1] ^= (xbin43 & ybin[3]);
- res[2] ^= (xbin54 & ybin[3]);
- res[3] ^= (xbin05 & ybin[3]);
- res[4] ^= (xbin[1] & ybin[3]);
- res[5] ^= (xbin[2] & ybin[3]);
- res[0] ^= (xbin[2] & ybin[4]);
- res[1] ^= (xbin32 & ybin[4]);
- res[2] ^= (xbin43 & ybin[4]);
- res[3] ^= (xbin54 & ybin[4]);
- res[4] ^= (xbin05 & ybin[4]);
- res[5] ^= (xbin[1] & ybin[4]);
- res[0] ^= (xbin[1] & ybin[5]);
- res[1] ^= (xbin21 & ybin[5]);
- res[2] ^= (xbin32 & ybin[5]);
- res[3] ^= (xbin43 & ybin[5]);
- res[4] ^= (xbin54 & ybin[5]);
- res[5] ^= (xbin05 & ybin[5]);
- // Transpose
- from_bitslice(res, z);
- }
- /* Algorithm A8: Iterative, ASM with PCLMUL, unconditional reduction, bit-sliced
- (2 computations in parallel) */
- /* With PCLMUL, at most 2 computations are possible. It is constant-time. */
- void gf_mult_bitslice_2computations(gf_t *x, gf_t *y, gf_t *tab) {
- // Transposition for computation in parallel
- uint64_t x2 = ((uint64_t)x[1] << 12) | x[0], y2 = ((uint64_t)y[1] << 12) | y[0];
- uint64_t a, m, m1, s, m0;
- // Multiplication
- // As the output is on 64 bits max
- asm volatile ("movdqa %1, %%xmm0;\n\t"
- "movdqa %2, %%xmm1;\n\t"
- "pclmulqdq $0x00, %%xmm0, %%xmm1;\n\t"
- "movq %%xmm1, %0;\n\t"
- : "=x"(a)
- : "x"(y2), "x"(x2)
- :"%xmm0","%xmm1"
- );
- // Polynomial reduction
- for (int k=0; k<6; k++) {
- m0 = a >> (11-k);
- m = -(m0&1);
- m1 = -((m0>>24)&1);
- s = (67 << (5-k)); // 0x43: X^6 + x + 1
- a ^= (( s & m ) | ((s << 24) & m1 ));
- }
- // Transposition
- tab[0] = a&0x3F;
- tab[1] = (a>>24)&0x3F;
- }
- /* Algorithm A9: Mapping between GF((2^6)^2) and GF(2^12) */
- //Conversion Matrix from GF((2^6)^2) to GF(2^12)
- static const gf_t T[12] = {3857, 1140, 3330, 132, 286,
- 1954, 1938, 1208, 314, 3754, 2750, 188};
- //Conversion Matrix from GF(2^12) to GF((2^6)^2)
- static const gf_t Ti[12] = {3321, 3388, 4080, 2152,
- 3712, 3808, 2274, 4088, 1076, 3904, 1904, 3708};
- //Hamming weight computation
- static inline gf_t hamming_weight(gf_t n) {
- #ifndef ASM_POPCNT
- n = ((n & 0x0AAA) >> 1) + (n & 0x0555);
- n = ((n & 0x0CCC) >> 2) + (n & 0x0333);
- n = ((n & 0x00F0) >> 4) + (n & 0x0F0F);
- n = ((n & 0x0F00) >> 8) + (n & 0x00FF);
- #else
- //using ASM
- asm (
- "POPCNT %1, %0 \n" // Count of Number of Bits Set to 1
- : "=r" (n)
- : "mr" (n)
- : "cc"
- );
- #endif
- return n;
- }
- /* A9.1: Conversion from GF(2^12) to GF((2^6)^2) */
- /* with the conversion matrix Ti from GF(2^12) to GF((2^6)^2) */
- gf_t iconv_bit(gf_t x)
- {
- gf_t res = 0;
- for (int i=0; i<12; i++) {
- res |= (hamming_weight(x & Ti[i])&1) << i; // Ti defined in (3.5)
- }
- return res;
- }
- /* A9.2: Conversion from GF((2^6)^2) to GF(2^12) */
- /* with the conversion matrix T from GF((2^6)^2) to GF(2^12) */
- gf_t conv_bit(gf_t x)
- {
- gf_t res = 0;
- for (int i=0; i<12; i++) {
- res |= (hamming_weight(x & T[i])&1) << i; // T defined in (3.5)
- }
- return res;
- }
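The conversions in Algorithm A9 apply a binary matrix to a bit vector: output bit i is the parity of the AND between the input and row i. The sketch below shows that mechanism on a small, hypothetical 4 × 4 matrix `M` and its inverse `Minv` over GF(2); these are illustrative values chosen here, not the matrices T and Ti of the paper, and the function names are assumptions. Parity is computed by folding rather than with POPCNT so that the code stays portable.

```c
#include <stdint.h>

/* Parity (XOR of all bits) of a 16-bit word, computed by folding. */
static unsigned parity16(uint16_t v)
{
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Hypothetical example matrices, one row per output bit (bit j = column j).
   M:    y0 = x0^x1, y1 = x1^x2, y2 = x2^x3, y3 = x3
   Minv: x0 = y0^y1^y2^y3, x1 = y1^y2^y3, x2 = y2^y3, x3 = y3 */
static const uint16_t M[4]    = { 0x3, 0x6, 0xC, 0x8 };
static const uint16_t Minv[4] = { 0xF, 0xE, 0xC, 0x8 };

/* Matrix-vector product over GF(2): bit i of the result is <rows[i], x>. */
uint16_t gf2_matvec(const uint16_t *rows, int n, uint16_t x)
{
    uint16_t res = 0;
    for (int i = 0; i < n; ++i)
        res |= (uint16_t)(parity16((uint16_t)(x & rows[i])) << i);
    return res;
}

/* Apply the example matrix and its inverse. */
uint16_t example_forward(uint16_t x)  { return gf2_matvec(M, 4, x); }
uint16_t example_backward(uint16_t y) { return gf2_matvec(Minv, 4, y); }

/* Returns 1 if Minv undoes M on all 16 possible inputs. */
int example_roundtrip_ok(void)
{
    for (uint16_t x = 0; x < 16; ++x)
        if (example_backward(example_forward(x)) != x)
            return 0;
    return 1;
}
```

With 12 rows instead of 4, the same `gf2_matvec` loop has the shape of the `conv_bit` and `iconv_bit` routines above; whether a hardware population count is faster than folding depends on the target processor.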
Submission | Type | Finite Field | Tower Fields Used |
---|---|---|---|
BIG QUAKE | Code-based | | No |
DAGS | Code-based | | Yes |
EdonK | Code-based | | No |
Ramstake | Code-based | | No |
RLCE | Code-based | | No |
LAC | Lattice-based | | No |
DME | Multivariate | | No |
HIMQ-3 | Multivariate | | No |
LUOV | Multivariate | | Yes |
Name | Security Level | q | m | n | k | s | t | Public Key Size |
---|---|---|---|---|---|---|---|---|
DAGS_1 | 128 | | 2 | 832 | 416 | | 13 | 6760 |
DAGS_3 | 256 | | 2 | 1216 | 512 | | 11 | 8448 |
DAGS_5 | 512 | | 2 | 2112 | 704 | | 11 | 11,616 |
Multiplication Algorithm | Algorithm | (*) | Constant-Time | ||
---|---|---|---|---|---|
Tabulated log/antilog (**) | 3, 4 | 8 | 11 | 20 | No |
Iterative, conditional reduction | 5 | 27 | 51 | 133 | No |
Iterative, ASM with PCLMUL, conditional reduction | 5 | 29 | 41 | 146 | No |
Iterative, unconditional reduction | 6 | 30 | 58 | 155 | Yes |
Iterative, ASM with PCLMUL, unconditional reduction | 6 | 35 | 65 | 225 | Yes |
Iterative, unconditional reduction, 1-bit-sliced (***), 64 computations in parallel | 7 | - | Yes | ||
Iterative, ASM with PCLMUL, unconditional reduction, bit-sliced (****), 2 computations in parallel | 8 | - | Yes | ||
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Danger, J.-L.; El Housni, Y.; Facon, A.; Gueye, C.T.; Guilley, S.; Herbel, S.; Ndiaye, O.; Persichetti, E.; Schaub, A. On the Performance and Security of Multiplication in GF(2^N). Cryptography 2018, 2, 25. https://doi.org/10.3390/cryptography2030025