A Finite Regime Analysis of Information Set Decoding Algorithms

Abstract: Decoding of random linear block codes has long been exploited as a computationally hard problem on which it is possible to build secure asymmetric cryptosystems. In particular, both correcting an error-affected codeword and deriving the error vector corresponding to a given syndrome were proven to be equally difficult tasks. Since the pioneering work of Eugene Prange in the early 1960s, a significant research effort has been put into finding more efficient methods to solve the random code decoding problem through a family of algorithms known as information set decoding. The obtained improvements effectively reduce the overall complexity, which was shown to decrease asymptotically at each optimization, while remaining substantially exponential in the number of errors to be either found or corrected. In this work, we provide a comprehensive survey of the information set decoding techniques, providing finite regime temporal and spatial complexities for them. We exploit these formulas to assess the effectiveness of the asymptotic speedups obtained by the improved information set decoding techniques when working with code parameters relevant for cryptographic purposes. We also delineate computational complexities taking into account the achievable speedup via quantum computers and similarly assess such speedups in the finite regime. To provide practical grounding to the choice of cryptographically relevant parameters, we employ as our validation suite the ones chosen by cryptosystems admitted to the second round of the ongoing standardization initiative promoted by the US National Institute of Standards and Technology.


Introduction
Asymmetric cryptosystems are traditionally built on a mathematical function which is hard to compute unless the knowledge of a special parameter is available. Typically, such a function is known as a mathematical trapdoor, and the parameter acts as the private key of the asymmetric cryptosystem. Decoding a random linear block code was first proven to be equivalent to solving an instance of the three-dimensional matching problem by Elwyn Berlekamp et al. in 1978 [1]. By contrast, efficient decoding algorithms for well-structured codes have a long history of being available. Therefore, McEliece proposed to disguise an efficiently decodable code as a random code and employ the knowledge of the efficiently decodable representation as the private key of an asymmetric cryptosystem. In this way, a legitimate user of the cryptosystem would be able to employ an efficient decoder for the chosen hidden code, while an attacker would be forced to resort to decoding techniques for a generic linear code.
Since the original proposal, a significant number of variants of the McEliece cryptosystem have been proposed, swapping the original decodable code choice (Goppa codes [2]) with other efficiently decodable codes, with the intent of enhancing computational performance or reducing the key size. The first attempt in this direction was the Niederreiter cryptosystem [3], using generalized Reed-Solomon (GRS) codes. While the original proposal by Niederreiter was broken by Sidel'nikov and Shestakov in [4], replacing the hidden code with a Goppa code in Niederreiter's proposal yields a cryptosystem which is currently unbroken. More recently, other families of structured codes have been considered in this framework, such as Quasi Cyclic (QC) codes [5], Low Density Parity Check (LDPC) codes [6], Quasi Dyadic (QD) codes [7], Quasi Cyclic Low Density Parity Check (QC-LDPC) codes [8] and Quasi Cyclic Moderate Density Parity Check (QC-MDPC) codes [9].
The significant push in the development of code-based cryptosystems was also accompanied by a comparably sized research effort in their cryptanalysis. In particular, the best attack technique that does not rely on the underlying hidden code structure, and thus is applicable to all the variants, is known as Information Set Decoding (ISD). In a nutshell, ISD attempts to find enough error-free locations in a codeword to be able to decode it regardless of the errors which affect the codeword itself. Such a technique was first proposed by Prange [10] as a more efficient alternative for decoding a general linear block code, with respect to a straightforward guess of the error-affected locations. Since then, a significant number of improvements to Prange's original technique have been proposed [11][12][13][14][15][16], effectively providing significant polynomial speedups on the exponential-time decoding task. In addition to the former works, where the focus is to propose an operational description of a general decoding technique based on information set decoding, the works by the authors of [17][18][19] provide a more general view on general decoding techniques, including split syndrome decoding and supercode decoding, and report proven bounds on the complexities of the said approaches. Finally, we report the work of Bassalygo et al. [20] as the first to formally tackle the complexity of decoding linear codes. For a more comprehensive survey of hard problems in coding theory, we refer the interested reader to [21][22][23].
The common praxis in the literature concerning ISD improvements is to evaluate the code parameters for the worst-case scenario of the ISD, effectively binding together the code rate and the number of corrected errors to the code length. Subsequently, the works analyze the asymptotic speedup as a function of the code length alone. While this approach is effective in showing an improvement in the running time of the ISD in principle, the practical relevance of the improvement when considering useful parameter sizes in cryptography may be less significant.
We note that, in addition to being the most efficient strategy to perform general random linear code decoding, ISD techniques can also be employed to recover the structure of the efficiently decodable code from its obfuscated version for the LDPC, QC-LDPC and QC-MDPC code families.
Recently, the National Institute of Standards and Technology (NIST) has started a selection process to standardize asymmetric cryptosystems resistant to attacks with quantum computers. Since decoding a random code is widely believed to require an exponential amount of time in the number of errors, even in the presence of quantum computers, code-based cryptosystems are prominent candidates in the NIST selection process [24]. Hence, having accurate and shared expressions in the finite length regime, both in the classic and in the quantum computing setting, for the work factor of attacks targeting such schemes is important to define a common basis for their security assessment. A work sharing our intent is [25], where a non-asymptotic analysis of some ISD techniques is performed. However, a comprehensive source of this type is not available in the literature, to the best of our knowledge.

Contributions
In this work, we provide a survey of the existing ISD algorithms, with explicit finite regime expressions for their spatial and temporal complexities.We also detail which free parameters have to be optimized for each of the ISD algorithms, and provide a software tool implementing the said optimization procedure on a given set of code parameters in [26].
with a polynomial number of calls to an oracle for the corresponding decision problem. This, in turn, states that the difficulty of the Syndrome Decoding Problem (SDP) and the Codeword Finding Problem (CFP) is the same as that of solving their decisional variants, the Decisional Syndrome Decoding Problem (DSDP) and the Decisional Codeword Finding Problem (DCFP), i.e., they are as hard as an NP-Complete problem. We note that, in the light of such a reduction, and despite being an abuse of notation, it is commonplace to state that the SDP and the CFP are NP-Complete, although only decisional problems belong to the NP-Complete class.

Applications to Cryptography
The class of NP-Complete problems is of particular interest for designing cryptosystems, as it is widely believed that problems contained in such a class cannot be solved in polynomial time by a quantum computer. Indeed, the best known approaches to solve both the CFP and the SDP have a computational complexity which is exponential in the weight of the codeword or error vector to be found. A notable example of a code-based cryptosystem relying on the hardness of the SDP is the one proposed by Niederreiter in [3].
The Niederreiter cryptosystem generates a public-private key pair selecting as the private key an instance of a code from a family for which efficient decoding algorithms are available. The code is then represented by its parity-check matrix H_priv, which is multiplied by a rank-r random square binary matrix S, obtaining the public key of the cryptosystem H_pub = S H_priv. The assumption made by Niederreiter is that the multiplication by the random, full-rank matrix S makes H_pub essentially indistinguishable from a random parity-check matrix. While the original choice to employ a Reed-Solomon code as the private code was found to falsify this assumption and led to a practical attack, other code families have proven to be good candidates (e.g., Goppa codes, Low/Medium Density Parity Check codes [28][29][30][31]). A message is encrypted in the Niederreiter cryptosystem by encoding it as a fixed-weight error vector e and computing its syndrome through H_pub, s = H_pub e^T, which acts as the ciphertext. The owner of the private key (S, H_priv) is able to decipher the ciphertext by first obtaining s' = S^{-1} s and subsequently performing the syndrome decoding of s' employing H_priv.
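As an illustration, the key generation, encryption, and first decryption step described above can be sketched with toy-sized random binary matrices; all sizes, the bitmask representation, and the random H_priv are illustrative assumptions of ours (a real instance would use a structured, efficiently decodable private code with cryptographically sized parameters):

```python
import random

def gf2_inv(rows, r):
    """Invert an r x r binary matrix (rows as r-bit ints) over GF(2) by
    Gauss-Jordan elimination on [A | I]; raises ValueError if singular."""
    M = [rows[i] | (1 << (r + i)) for i in range(r)]   # augment with I_r
    for c in range(r):
        piv = next((q for q in range(c, r) if (M[q] >> c) & 1), None)
        if piv is None:
            raise ValueError("singular")
        M[c], M[piv] = M[piv], M[c]
        for q in range(r):
            if q != c and (M[q] >> c) & 1:
                M[q] ^= M[c]
    return [M[i] >> r for i in range(r)]               # right block = A^-1

def mat_mul(S_rows, H_rows):
    """Rows of S*H over GF(2): row i is the XOR of the H rows selected by
    the bits of S's row i."""
    return [
        __import__("functools").reduce(
            lambda a, jh: a ^ jh[1] if (srow >> jh[0]) & 1 else a,
            enumerate(H_rows), 0)
        for srow in S_rows
    ]

def mat_vec(rows, v_bits):
    """y = A v over GF(2), with v and y packed as int bitmasks."""
    y = 0
    for i, row in enumerate(rows):
        y |= (bin(row & v_bits).count('1') & 1) << i
    return y

rng = random.Random(5)
r, n = 6, 12                            # toy sizes (illustrative only)
H_priv = [rng.getrandbits(n) for _ in range(r)]
while True:                             # draw a random invertible scrambler S
    S = [rng.getrandbits(r) for _ in range(r)]
    try:
        S_inv = gf2_inv(S, r)
        break
    except ValueError:
        continue
H_pub = mat_mul(S, H_priv)              # public key H_pub = S * H_priv

e = (1 << 2) | (1 << 9)                 # weight-2 plaintext encoding
s = mat_vec(H_pub, e)                   # ciphertext: syndrome through H_pub
s_prime = mat_vec(S_inv, s)             # decryption step 1: s' = S^-1 s
# step 2 would run the efficient decoder for H_priv on s'; here we only
# check that s' is indeed the private-code syndrome of e, as stated above.
assert s_prime == mat_vec(H_priv, e)
```

The assertion verifies the algebra that makes decryption work: S^{-1} H_pub = H_priv, so the legitimate party recovers the syndrome of e through the efficiently decodable private code.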
It is easy to note that, under the assumption that H_pub is indistinguishable from a random parity-check matrix, an attacker willing to perform a Message Recovery Attack (MRA) must solve an instance of the SDP. We note that, as proven by Niederreiter [3], the SDP is computationally equivalent to the problem of correcting a bounded amount of errors affecting a codeword, when given a random generator matrix of the code, G. Such a problem goes by the name of Decoding Problem, and is the mainstay of the original cryptosystem proposal by McEliece [32]. In such a scheme, the ciphertext thus corresponds to the sum between a codeword of the public code, obtained as mG, with m being a length-k vector, and a vector e of weight t. The message can either be encoded into e or into m; in the latter case, an MRA is performed by searching for the error vector e and, subsequently, by adding it to the intercepted ciphertext. We point out that this search can be automatically turned into the formulation of the SDP, by first computing a valid H_pub from G and then trying to solve the SDP on the syndrome of the intercepted ciphertext through H_pub.
One of the most prominent cases where the CFP appears in code-based cryptosystems is represented by a Key Recovery Attack (KRA) against Niederreiter cryptosystems where the private parity-check matrix H_priv contains rows with a known low weight w. Indeed, in such a case, considering H_pub as the generator matrix of the dual code, solving the CFP for such a code reveals the low-weight rows of H_priv. We note that such a KRA is in the same computational complexity class as the SDP, assuming that the obfuscation of H_pub makes it indistinguishable from a random matrix.
Two notable cases where solving the CFP is currently the best known method to perform a KRA are the LEDAcrypt [33] and BIKE [34] proposals to the mentioned NIST standardization effort for post-quantum cryptosystems. Since such a CFP can also be seen as the problem of finding a binary vector c with weight w such that H_pub c^T = 0, the problem is also known as the Homogeneous SDP, as it implies the solution of a simultaneous set of linear equations similar to the SDP, save for the syndrome being set to zero.

Strategies to Perform MRA
As described in the previous section, the security of code-based cryptosystems relies on the hardness of solving SDP or CFP instances. In this section, we analyze the case of the SDP and show that the optimal strategy to perform an MRA depends on the code parameters. The optimal strategy for solving the SDP depends on the relation between the actual parameters of the instance under analysis. In particular, in the cases where t is above the Gilbert-Varshamov (GV) distance [35], the Generalized Birthday Algorithm (GBA) is the best currently known algorithm for solving the SDP [36,37]. However, for the cases we consider in this paper, practical values of t are significantly smaller than the GV distance; in such cases, the best known methods to solve the SDP go by the name of Information Set Decoding (ISD) algorithms. Such algorithms are aimed at lessening the computational effort required in the guesswork of an exhaustive search for the unknown error vector e of weight t, given a syndrome and a parity-check matrix. We point out that it is also possible to adapt all ISD algorithms, save for the first one proposed by Prange [10], to solve the CFP, as a consequence of the structural similarity of the two problems.
All ISD algorithms share a common structure where an attempt at retrieving the error vector corresponding to a given syndrome is repeated, for a number of times whose average value depends on the success probability of the single attempt itself. The complexity of all ISD variants can be expressed as the product between the complexity of each attempt, which we denote as C_iter, and the average number of required attempts. In particular, such a value can be obtained as the reciprocal of the success probability of each attempt, which we denote as Pr_succ; thus, when considering a code with length n, redundancy r = n − k and Hamming weight of the sought error bounded to t, we generically denote the time complexity of obtaining one solution of the SDP employing the ISD variant at hand as

C_ISD(n, r, t) = (1 / Pr_succ) · C_iter.

As we show in the following, the work factor of an MRA performed through ISD may actually depend on the system parameters; to this end, we first exploit the following well-known result. Let C(n, k, d) be a linear binary code with length n, dimension k and minimum distance d, and let H be a parity-check matrix for C(n, k, d). Let s be a length-r binary vector and t be an integer ≤ n; then, if t < d/2, there is at most one vector e of weight t such that s = He^T.
Thus, when H_pub is the parity-check matrix of a code with minimum distance d > 2t, solving the SDP guarantees that the found error vector corresponds to the one that was used to encrypt the message. In this case, the attack work factor corresponds to C_ISD(n, r, t).
However, when d ≤ 2t, the time complexity of an MRA needs to be derived through a different approach. Indeed, in such a case, the adversary has no guarantee that the output of the ISD corresponds to the error vector that was actually used in the encryption phase. Thus, the work factor of an MRA cannot simply be taken as the time complexity of the chosen ISD algorithm.
Let s be the syndrome corresponding to the intercepted ciphertext and e be the sought error vector, i.e., s = H_pub e^T. We define

N(e) = |{e' ∈ F_2^n s.t. wt(e') = t and H_pub e'^T = s}|.

Clearly, N(e) corresponds to the number of valid outputs that the ISD can produce, when applied on the syndrome s corresponding to e. In such a case, the probability that an ISD iteration will not find any valid error vector can be estimated as (1 − Pr_succ)^{N(e)}. Thus, one attempt of the ISD will succeed with probability Pr_succ(e) = 1 − (1 − Pr_succ)^{N(e)}. In particular, the algorithm will randomly return a vector among the set of the N(e) admissible ones: thus, the probability that the obtained vector corresponds to e is 1/N(e).
To obtain a closed-form expression for the attack work factor, we can consider the average value of N(e), which we obtain by averaging over all the possible vectors e of weight t and length n, and denote it with N. Then, the attack work factor can be computed as

C_MRA(n, r, t) = α · C_ISD(n, r, t), with α = (N · Pr_succ) / (1 − (1 − Pr_succ)^N).

In particular, it can be shown that N · Pr_succ ≥ 1 − (1 − Pr_succ)^N, so that α ≥ 1. We point out that, for the cases we analyze in this paper, we have N · Pr_succ ≪ 1, so that α ≈ 1. Thus, from now on, we assume α = 1, i.e., that the time complexity of performing an MRA is equal to that of running an ISD algorithm in the case in which a unique solution exists.
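The behaviour of α can be checked numerically; the sketch below (with arbitrary illustrative magnitudes for N and Pr_succ, not values drawn from any concrete cryptosystem) confirms that α ≥ 1 and that α collapses to 1 as soon as N · Pr_succ ≪ 1:

```python
def alpha(N, P):
    """alpha = N * Pr_succ / (1 - (1 - Pr_succ)^N): the ratio between the
    MRA work factor and the plain ISD time complexity."""
    return N * P / (1.0 - (1.0 - P) ** N)

# alpha is always >= 1, and is ~1 whenever N * Pr_succ << 1
print(alpha(3, 1e-8))   # N * Pr_succ = 3e-8 << 1: alpha indistinguishable from 1
print(alpha(10, 0.3))   # N * Pr_succ = 3: alpha noticeably larger than 1
```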

A Finite Regime Analysis of Information Set Decoding Techniques
In the following, we report an analysis of the best known variants of ISD and their execution on a classic computer, namely the ones proposed by Prange [10], Lee and Brickell [11], Leon [12], Stern [13], Finiasz and Sendrier [14], May, Meurer and Thomae [15], and Becker, Joux, May and Meurer [16]. For the sake of clarity, we describe the ISD variants in their syndrome decoding formulation, highlighting for the first variant amenable to dual use, i.e., Lee and Brickell's, how to adapt the technique to the CFP. For all these algorithms, we provide finite-regime time and space complexities, with the aim of analyzing the actual computational effort and memory resources needed to solve both the SDP and the CFP on instances with cryptographically sized parameters. We also report lower bounds on the complexities of the execution of Prange's, Lee and Brickell's and Stern's variants of the ISD on a quantum computer, allowing an evaluation of the corresponding computational efforts.
We provide the exact formulas for the time complexity of ISD variants as a function of the code length n, the code dimension k and the number of errors t. We note that the ISD algorithms having the best asymptotic time complexity are also characterized by an exponential space complexity, which may significantly hinder their efficiency or make their implementation impractical. In particular, we also analyze the computational cost of such algorithms with a logarithmic memory access cost criterion. Indeed, the logarithmic access cost criterion is the one which better fits scenarios where the spatial complexity of an algorithm is more than polynomial in its input size, therefore resulting in a non-negligible cost for the memory accesses.
In the reported formulas, we employ the O-notation simply to remove the need to specify the computing architecture- or implementation-dependent constants.

Prange
Prange's algorithm [10] is the first known variant of the ISD, based on the idea of guessing a set I of k error-free positions in the error vector e to be found in the SDP. For this purpose, the columns of H are permuted so that those indexed by I are packed to the left. This operation is equivalent to the multiplication of H by an appropriately sized permutation matrix P. The column-reordered matrix Ĥ = HP is hence obtained, which can be put in Reduced Row Echelon Form (RREF), with the identity matrix I_r placed to the right, i.e., [V I_r] = U Ĥ. If turning Ĥ into RREF is not possible, as the r × r rightmost submatrix is not full rank, a different permutation is picked. The same transformation U required to bring Ĥ into RREF is then applied to the rows of the column syndrome vector s, obtaining s̄ = Us. If the weight of the permuted error vector ē obtained as ē = eP = [0_{1×k} s̄^T], where 0_{1×k} is the all-zero vector of length k, matches the expected error weight t, then the algorithm succeeds and the non-permuted error vector ēP^T is returned. A pseudo-code description of Prange's ISD algorithm is provided in Algorithm 1.

Proposition 1 (Computational complexity of Algorithm 1). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, the complexity of finding the row error vector e with length n and weight t such that s = He^T with Algorithm 1 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1-9 and the computational requirements of executing the loop body c_iter. In particular, the time complexity is

C_ISD(n, r, t) = (1 / Pr_succ) · c_iter = (\binom{n}{t} / \binom{r}{t}) (C_IS(n, r) + O(n)).

The spatial complexity is S_ISD(n, r, t) = O(rn).
Proof. The loop body of Algorithm 1 is dominated by the cost of finding an information set and validating it through checking that the matrix W is indeed an identity, i.e., that the corresponding submatrix of Ĥ indeed has full rank. Note that, in an r × r binary matrix, the first row has a probability of 1/2^r of being linearly dependent on itself (i.e., zero); the second row has a probability of 2/2^r of being linearly dependent (i.e., zero or equal to the first). With an inductive argument, we obtain that the r-th row has a probability of 2^{r−1}/2^r of being linearly dependent on the previous ones. We thus have that the probability of having all the rows independent from one another is ∏_{i=1}^{r} (1 − 2^{i−1}/2^r). We thus have that the column permutation (Line 4), with computational complexity rn (which can be lowered to n keeping only the permuted column positions), and the RREF computation are repeated, on average, (∏_{i=1}^{r} (1 − 2^{i−1}/2^r))^{−1} times, yielding the first addend of the computational cost

C_IS(n, r) = (∏_{i=1}^{r} (1 − 2^{i−1}/2^r))^{−1} (n + C_RREF(n, r)) + r².

The cost C_RREF(n, r) is derived considering the RREF as an iterative algorithm performing as many iterations as the rank of the identity matrix in the result (i.e., r in this case). Each iteration 0 ≤ i ≤ r − 2 proceeds to find a pivot, taking O(r − i), swaps it with the (r − i)-th row in O(n) and proceeds to add the pivot to all the remaining r − i − 1 rows which have a one in the (n − i)-th column. The total cost is

C_RREF(n, r) = ∑_{i=0}^{r−2} ((r − i) + n + (r − i − 1) n).

The second addend of the cost C_IS(n, r) is constituted by the computational complexity of computing s̄ = Us, which is r². The total cost of computing an iteration c_iter is the sum of C_IS(n, r) and the cost of building ē, i.e., O(n).
Pr_succ is obtained as the number of permuted error vectors with the error-affected positions fitting the hypotheses made by the algorithm, divided by the number of all the possible error vectors. This fact holds for all ISD algorithms. In the case of Prange's ISD, the permuted error vectors admissible by the hypotheses are \binom{r}{t}, as all the error-affected positions should be within the last r bits of the permuted error vector, while the number of error vectors is \binom{n}{t}; hence Pr_succ = \binom{r}{t} / \binom{n}{t}.
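The probability derived in the proof converges quickly to a constant: a uniformly random r × r binary matrix is invertible over GF(2) with probability ∏_{i=1}^{r} (1 − 2^{i−1}/2^r) ≈ 0.289 already for moderate r, so on average roughly 3.5 permutations are tried before an information set is validated. A minimal numeric check (sketch):

```python
def prob_full_rank(r):
    """Probability that a uniformly random r x r binary matrix is invertible
    over GF(2): prod_{i=1}^{r} (1 - 2^(i-1)/2^r)."""
    p = 1.0
    for i in range(1, r + 1):
        p *= 1.0 - 2.0 ** (i - 1 - r)
    return p

print(prob_full_rank(2))    # 0.375: 6 invertible matrices out of 16
print(prob_full_rank(64))   # already at the limit value ~0.2888
```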
For the sake of clarity, from now on, we denote as ISEXTRACT(H, s) the procedure computing P, [V I_r], s̄, performed on Lines 2-7 of Algorithm 1, with computational time complexity C_IS(n, r) and space complexity O(rn).
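Algorithm 1 can be condensed into a small Python sketch on toy parameters (a didactic brute-force version of ours, not an optimized implementation: rows are stored as integer bitmasks, and an extra augmented column carries the syndrome through the RREF step, so that s̄ = Us is obtained for free):

```python
import random

def prange_isd(H, s, n, r, t, rng=None):
    """Toy Prange ISD. H: list of r row bitmasks over n columns; s: list of
    r syndrome bits. Returns e (list of n bits) with He^T = s, wt(e) = t."""
    rng = rng or random.Random(42)
    k = n - r
    while True:
        perm = list(range(n))
        rng.shuffle(perm)                      # guess an information set
        M = []
        for i in range(r):                     # permuted rows + syndrome bit at n
            row = 0
            for j in range(n):
                row |= ((H[i] >> perm[j]) & 1) << j
            M.append(row | (s[i] << n))
        singular = False                       # RREF: force I_r on columns k..n-1
        for i in range(r):
            col = k + i
            piv = next((q for q in range(i, r) if (M[q] >> col) & 1), None)
            if piv is None:
                singular = True
                break                          # right block not full rank: retry
            M[i], M[piv] = M[piv], M[i]
            for q in range(r):
                if q != i and (M[q] >> col) & 1:
                    M[q] ^= M[i]
        if singular:
            continue
        s_bar = [(M[i] >> n) & 1 for i in range(r)]   # s_bar = U s
        if sum(s_bar) == t:                    # all t errors fell in the last r slots
            e = [0] * n
            for i in range(r):
                if s_bar[i]:
                    e[perm[k + i]] = 1         # undo the column permutation
            return e

# toy instance: random H, planted weight-t error, recovered by the loop
rng = random.Random(1)
r, n, t = 10, 20, 2
H = [rng.getrandbits(n) for _ in range(r)]
e_mask = (1 << 3) | (1 << 11)                  # planted error positions 3 and 11
s = [bin(H[i] & e_mask).count('1') & 1 for i in range(r)]
e_found = prange_isd(H, s, n, r, t)
found_mask = sum(b << j for j, b in enumerate(e_found))
assert sum(e_found) == t
assert all(bin(H[i] & found_mask).count('1') & 1 == s[i] for i in range(r))
```

Note that the final assertions check weight and syndrome rather than equality with the planted error: when d ≤ 2t, the returned vector may be any of the admissible solutions, as discussed in the previous section.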

Lee-Brickell
The ISD algorithm introduced by Lee and Brickell in [11] starts with the same initial operations as Prange's, i.e., the computation of the RREF of Ĥ and the derivation of the corresponding syndrome s̄. However, Lee and Brickell improved Prange's original idea by allowing p of the k selected positions in the error vector to be error-affected. These p remaining error positions are guessed. To verify the guess, Lee and Brickell exploit the identity [V I_r] ē^T = s̄, where ē is split in two parts, ē = [ē_1 ē_2], with ē_1 being k bits long and with weight p, and ē_2 being r bits long and with weight t − p. The identity is rewritten as V ē_1^T + I_r ē_2^T = V ē_1^T + ē_2^T = s̄, from which it follows that V ē_1^T + s̄ = ē_2^T must have weight t − p. Indeed, this condition is employed by the algorithm to check if the guess of the p positions is correct. The procedure is summarized in Algorithm 2.
Algorithm 2: Syndrome decoding formulation of Lee and Brickell's ISD.
Input: s: an r-bit long syndrome (column vector); H: an r × n binary parity-check matrix; t: the weight of the error vector to be recovered.
Output: e: an n-bit binary row error vector s.t. He^T = s, with WEIGHT(e) = t.
Data: P: an n × n permutation matrix; ē = eP: the error vector permuted by P; p: the weight of the first k bits of ē, 0 ≤ p ≤ t (p = 2 proven optimal); s̄: an r-bit long binary column vector, equal to the syndrome of ē through [V I_r].

Proposition 2 (Computational complexity of Algorithm 2). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that s = He^T with Algorithm 2 requires an additional parameter 0 ≤ p ≤ t. The time complexity of Algorithm 2 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1-10 and the computational requirements of executing the loop body c_iter. In particular, the time complexity is

C_ISD(n, r, t) = (\binom{n}{t} / (\binom{k}{p} \binom{r}{t−p})) (C_IS(n, r) + \binom{k}{p} (C_IntToComb(k, p) + pr)),

where C_IntToComb(k, p) is the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the \binom{k}{p} ones. The spatial complexity is S_ISD(r, n) = O(rn).
Proof. The probability of success of Lee and Brickell's ISD is obtained following the same line of reasoning employed for Prange's, thus dividing the number of admissible permuted error vectors, \binom{k}{p} \binom{r}{t−p}, by the number of the possible error vectors, \binom{n}{t}. The cost of an iteration of Lee and Brickell's algorithm can be obtained as the cost of adding together p + 1 bit vectors of length r, i.e., pr (Line 6), multiplied by the number of such additions, i.e., \binom{k}{p}, as they constitute the body of the loop at Lines 4-9. Note that, in a practical implementation where the value of p is fixed, it is possible to avoid C_IntToComb altogether, specializing the algorithm with a p-deep loop nest to enumerate all the weight-p, length-k binary vectors.
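The body of one Lee and Brickell iteration, i.e., the check that V ē_1^T + s̄ has weight t − p, can be sketched as follows (a toy sketch of ours; V is assumed to be already produced by ISEXTRACT and is stored as k column bitmasks of r bits each):

```python
import itertools
import random

def lee_brickell_iteration(V_cols, s_bar, t, p):
    """One Lee-Brickell iteration on the RREF output: for every weight-p
    guess e1 on the k left positions, check whether e2 = V e1^T + s_bar
    has weight t - p; if so, [e1 | e2] solves [V I_r] e^T = s_bar."""
    k = len(V_cols)
    for idx in itertools.combinations(range(k), p):
        e2 = s_bar
        for j in idx:                    # add p columns of V: ~ p*r bit ops
            e2 ^= V_cols[j]
        if bin(e2).count('1') == t - p:
            return set(idx), e2          # left error positions, right part e2
    return None                          # iteration failed: re-permute, retry

# toy check: plant a solution with p errors on the left, t - p on the right
rng = random.Random(9)
r, k, t, p = 8, 8, 3, 2
V_cols = [rng.getrandbits(r) for _ in range(k)]
planted_e2 = 1 << 5                              # weight t - p = 1
s_bar = V_cols[0] ^ V_cols[4] ^ planted_e2       # e1 hits positions 0 and 4
found = lee_brickell_iteration(V_cols, s_bar, t, p)
assert found is not None
idx, e2 = found
check = e2
for j in idx:
    check ^= V_cols[j]                  # V e1^T + e2 must reproduce s_bar
assert check == s_bar and bin(e2).count('1') == t - p
```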

Adapting Lee and Brickell to Solve CFP
The structure of Lee and Brickell's ISD allows employing substantially the same algorithm to solve the CFP, given a parity-check matrix H as the representation of the code where a weight-w codeword c should be found. The line of reasoning in employing Lee and Brickell's ISD to solve the CFP is to note that, by definition, for any codeword c of the code represented by H we have that Hc^T = 0_{r×1}, i.e., a codeword multiplied by the parity-check matrix yields a null syndrome. As a consequence, we have that 0_{r×1} = Hc^T = [V I_r] ĉ^T = V ĉ_1^T + ĉ_2^T, where ĉ = cP = [ĉ_1 ĉ_2]. This implies that V ĉ_1^T = ĉ_2^T, which can be exploited as an alternative stopping condition to the one of Algorithm 2, yielding in turn Algorithm 3. The only remaining difference between the SDP-solving Lee-Brickell ISD and the CFP-solving one is represented by the ISEXTRACT primitive, which no longer needs to compute a transformed syndrome s̄ = Us, as it is null. We thus have a small reduction in C_IS(n, r), which loses an additive r² term. We note that such a reduction is expected to have little impact in practice, as the dominant portion of the ISEXTRACT function is represented by the RREF computation. This in turn implies that solving the SDP on a code C has practically the same complexity as finding a codeword with weight w = t in the same code. Therefore, finding low-weight codewords in the code defined by a Niederreiter cryptosystem public key H_pub requires an effort comparable to the one of performing syndrome decoding assuming an error with the same weight as the codeword to be found. Two families of codes which may be vulnerable to such an attack, unless the parameters are designed taking into account a CFP-solving ISD, are the Low Density Parity Check (LDPC) and Moderate Density Parity Check (MDPC) codes. Indeed, such code families can be
represented with a parity-check matrix with low-weight rows, and such a low-weight representation can be relied upon to perform efficient decoding, leading to an effective cryptosystem break. Indeed, we now show that, if a code C(n, k, t) can be represented by a low-weight parity-check matrix H_priv, the code will contain low-weight codewords. Without loss of generality, consider k > r. Moreover, consider H_priv as split in three portions [A_l A_r B] of size r × (k − r), r × r and r × r, respectively, with B non-singular. We derive the corresponding generator matrix as

G = [I_k (B^{−1} [A_l A_r])^T],

and note that, when the blocks of H_priv commute (as happens for the circulant blocks of the quasi-cyclic code families at hand), all the rows of the product

[0_{r×(k−r)} B^T] G = [0_{r×(k−r)} B^T A_r^T]

are valid codewords, as they are the result of a linear combination of rows of the generator matrix G. Moreover, given that the private parity-check matrix H_priv has low row and column weight by construction, we have that the aforementioned codewords, i.e., the rows of [0_{r×(k−r)} B^T A_r^T], also have a low weight. This fact may thus allow an attacker to perform a KRA retrieving the low-weight codewords and rebuilding H_priv.
A different attack strategy for the same code families is to try to find codewords in the dual code of the one represented by the parity-check matrix H_priv. Such a code, by definition, admits H_priv as a valid generator matrix, and thus makes it possible to directly reconstruct H_priv by solving r instances of the CFP to obtain the r rows of H_priv. Solving the CFP on the dual code implies that Algorithm 3 is called considering the aforementioned G matrix as a parity-check matrix. Thus, if we denote with C_ISD(n, r, w) the complexity of solving the CFP on the code described by H_priv, solving the CFP on the dual code will have a complexity of C_ISD(n, k, w'), where w' is the weight of the codewords of the dual code. Whether this strategy or the one of solving the CFP on the primal code is more advantageous depends on the code rate and the values of w and w'.

Leon
The algorithm proposed by Leon in [12], reported in Algorithm 4, improves Lee and Brickell's ISD by assuming that the contribution to the value of the first ℓ bits of the syndrome s̄, s̄_up, comes only from columns in V, i.e., that there is a run of zeroes of length ℓ leading the final r bits of the permuted error vector ē, i.e., ē = [ē_1 0_{1×ℓ} ē_2], where ē_1 is k bits long and ē_2 is r − ℓ bits long. We thus have that the expected situation after the permutation and RREF computation is

[s̄_up; s̄_down] = [V I_r] ē^T = V ē_1^T + [0_{1×ℓ} ē_2]^T,

where ē_down = [0_{1×ℓ} ē_2] is assumed to have a run of zeroes in its first ℓ bits. Such an assumption will clearly reduce the success rate of an iteration, as not all the randomly chosen permutations will select columns having this property. However, making such an assumption allows performing a preliminary check of the value of the sum of the topmost ℓ bits only of each column of V. Indeed, such a sum should match the value of the corresponding topmost ℓ bits of s̄, s̄_up, because the leading null bits in ē_down in turn nullify the contribution of the columns of the topmost ℓ rows, I_up, of the identity matrix. Such a check (Line 5 in Algorithm 4) allows discarding a selection of p columns of V early, saving addition instructions with respect to a full column check. The length ℓ of the run of zeroes should be picked so that the reduction in success probability is compensated by the gain in the speed of a single iteration.
Data: s̄: an r-bit long binary column vector, equal to the syndrome of ē through [V I_r].
Proposition 3 (Computational complexity of Algorithm 4). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that s = He^T with Algorithm 4 requires two additional parameters 0 ≤ p ≤ t, 0 ≤ ℓ ≤ r. The time complexity of Algorithm 4 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1-11
and the computational requirements of executing the loop body c iter .
In particular, the time complexity is

C_ISD(n, r, t, p, ℓ) = (1 / Pr_succ) · c_iter = [ binom(n, t) / (binom(k, p) · binom(r − ℓ, t − p)) ] · [ C_IS(n, r) + binom(k, p) · (C_IntToComb + pℓ + p(r − ℓ)/2^ℓ) ],

where C_IS(n, r) is as in Equation (6) and C_IntToComb = O((2k − p)(log binom(k, p))^2) is the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the binom(k, p) ones. Note that, if the value of p is fixed, it is possible to avoid C_IntToComb, specializing the algorithm with a p-deep loop nest to generate the combinations. The spatial complexity is S_ISD(r, n) = O(rn).
Proof. The success probability of an iteration of Leon's algorithm follows the same line of reasoning as Prange's and Lee and Brickell's, dividing the number of admissible permuted error vectors, binom(k, p) · binom(r − ℓ, t − p), by the total one, binom(n, t). The complexity of a single iteration is obtained considering that the loop at Lines 4-10 performs binom(k, p) iterations, where p + 1 vectors of length ℓ are added together (complexity pℓ) and, if the result is zero, a further addition of p + 1 bit vectors, each one of length r − ℓ, has to be performed (complexity p(r − ℓ)). This further addition takes place with probability 2^{−ℓ}, as there are 2^ℓ possible values for s̄^up and only binom(k, p) attempts at hitting the correct one are made, thus yielding the stated complexity, under the assumption that the sums of bit vectors are independent and uniformly distributed over all the ℓ-bit strings.
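As a worked sketch of the counting argument above, the per-iteration success probability and the resulting expected number of iterations can be computed directly from the binomials; the function names and the toy parameters in the test are ours, not the paper's:

```python
from math import comb

def leon_success_probability(n, k, r, t, p, ell):
    """Pr_succ for one iteration of Leon's ISD: admissible permuted error
    vectors (p errors in the first k positions, a run of ell zeroes, and
    t - p errors in the final r - ell positions) over all weight-t vectors."""
    return comb(k, p) * comb(r - ell, t - p) / comb(n, t)

def leon_expected_iterations(n, k, r, t, p, ell):
    """Expected number of iterations is the reciprocal of Pr_succ."""
    return 1 / leon_success_probability(n, k, r, t, p, ell)
```

Increasing ℓ strictly lowers the per-iteration success probability, which is exactly the trade-off against the cheaper iteration discussed above.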

Stern
Stern's algorithm, introduced in [13], improves Leon's (Algorithm 4) by employing a meet-in-the-middle strategy to find which set of p columns of V has ℓ-bit portions adding up to the first ℓ bits of the transformed syndrome s̄. For the sake of clarity, consider V split row-wise into V^up and V^down, where the v^up_i are ℓ-bit column vectors and the v^down_i are (r − ℓ)-bit column vectors, and the transformed syndrome split analogously as s̄ = [s̄^up ; s̄^down].
Stern's strategy splits the p-sized set I of indexes of the columns of V^up which should add up to s̄^up into two (p/2)-sized ones, I′ and J′ (I = I′ ∪ J′). It mandates that all columns indexed by I′ be within the leftmost k/2 ones of V, while the ones indexed by J′ be within the rightmost k/2 ones. It then precomputes the value of s̄^up + Σ_{i∈I′} v^up_i for all possible binom(k/2, p/2) choices of I′, and stores them in a lookup table L, together with the corresponding choice of I′. The algorithm then enumerates all possible (p/2)-sized sets of indexes J′, computing for each one Σ_{j∈J′} v^up_j and checking if the result is present in L. If this is the case, the algorithm has found a candidate pair (I′, J′) for which Σ_{i∈I′∪J′} v^up_i = s̄^up holds, and thus proceeds to check whether Σ_{i∈I′∪J′} v^down_i = s̄^down. This strategy reduces the cost of computing an iteration quadratically, at the price of increasing the number of iterations with respect to Lee and Brickell's approach, and of taking a significant amount of space to store the lookup table L, which contains binom(k/2, p/2) elements. We note that Stern's variant of the ISD is the first one to exhibit non-polynomial memory requirements, due to the size of the table, which must be stored and looked up. Stern's algorithm is summarized in Algorithm 5.
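The collision search at the core of the strategy can be sketched as follows; this is a minimal illustration (not the paper's implementation) in which the ℓ-bit column chunks are modelled as Python integers and XOR plays the role of the GF(2) sum:

```python
from functools import reduce
from itertools import combinations

def stern_collision_search(v_up, s_up, p):
    """Meet-in-the-middle search: find index sets I' (left half of the
    columns) and J' (right half), each of size p//2, whose selected
    l-bit column chunks XOR to s_up. Chunks are modelled as integers."""
    k = len(v_up)
    half = k // 2
    xor = lambda idxs: reduce(lambda acc, i: acc ^ v_up[i], idxs, 0)
    # Table L: map (s_up + sum of the chosen left-half chunks) -> choices of I'.
    table = {}
    for I in combinations(range(half), p // 2):
        table.setdefault(s_up ^ xor(I), []).append(I)
    # Enumerate right-half choices J' and look them up in L.
    matches = []
    for J in combinations(range(half, k), p // 2):
        for I in table.get(xor(J), []):
            matches.append((I, J))
    return matches
```

Every returned pair satisfies the ℓ-bit condition by construction; the surviving candidates would then be checked against the remaining r − ℓ bits of the syndrome.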
Proposition 4 (Computational complexity of Algorithm 5). As for Algorithm 4, given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that s = He^T with Algorithm 5 requires two additional parameters 0 ≤ p ≤ t, 0 ≤ ℓ ≤ r − (t − p). The time complexity of Algorithm 5 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1-17 and the computational requirements of executing the loop body c_iter. In particular, the time complexity is C_ISD(n, r, t, p, ℓ) = (1 / Pr_succ) · c_iter, where C_IntToComb = O((2k − p)(log binom(k/2, p/2))^2) is the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the binom(k/2, p/2) ones. Note that, if the value of p is fixed, it is possible to avoid C_IntToComb, specializing the algorithm with a p-deep loop nest to generate the combinations. The spatial complexity is S_ISD(n, r, t, p, ℓ) = O(rn + binom(k/2, p/2)((p/2) log2(k/2) + ℓ)).

Proof. The success probability of an iteration is obtained dividing the number of admissible permuted error vectors, binom(k/2, p/2)^2 · binom(r − ℓ, t − p), by the total one, binom(n, t). The complexity of a single iteration is obtained considering that the loop at Lines 5-7 computes binom(k/2, p/2) additions of p/2 + 1 vectors, each one ℓ bits in length (complexity binom(k/2, p/2) · (p/2)ℓ). The loop at Lines 8-16 performs binom(k/2, p/2) iterations, where p/2 + 1 vectors of length ℓ are added together (complexity (p/2)ℓ), and the result is looked up in table L. If the result is found, a further addition of p + 1 bit vectors, each one r − ℓ bits long, is performed (complexity p(r − ℓ)). This further addition takes place with probability binom(k/2, p/2)/2^ℓ, as there are 2^ℓ possible values for the computed ℓ-bit sum, and only binom(k/2, p/2) are present in L.
The spatial complexity of Stern's algorithm is the result of adding together the space required for the operations on the H matrix (i.e., rn) and the amount of space required by the list L, which is binom(k/2, p/2) elements long. Each element of the list takes (p/2) log2(k/2) bits to store the set of indexes, and ℓ bits to store the partial sum, yielding a total spatial cost for the list of binom(k/2, p/2)((p/2) log2(k/2) + ℓ) bits. The aforementioned temporal complexity is obtained assuming a constant memory access cost which, given the exponential amount of memory required, is likely to ignore a non-negligible amount of time spent performing memory accesses. It is possible to take such time into account by employing a logarithmic memory access cost model. Recalling that the address decoding logic for an n-element digital memory of any kind cannot have a circuit depth smaller than log2(n), we consider that the operations involved in the computation of an iteration will require such an access, in turn obtaining a cost per iteration equal to c_iter−logcost = c_iter · log2(S_ISD(n, r, t, p, ℓ)).
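The logarithmic-cost correction above can be sketched numerically; the function below (our illustration, not the paper's code) multiplies a constant-cost iteration estimate by the binary logarithm of the total storage:

```python
from math import comb, log2

def stern_logcost_iteration(c_iter, n, r, k, p, ell):
    """Per-iteration cost under the logarithmic memory-access model:
    the constant-cost estimate c_iter is multiplied by log2 of the
    total storage, i.e. the H matrix (r*n bits) plus the list L."""
    list_bits = comb(k // 2, p // 2) * ((p // 2) * log2(k // 2) + ell)
    return c_iter * log2(r * n + list_bits)
```

For moderately sized parameters the multiplier is a few tens, so it shifts the security estimate by a handful of bits rather than changing its exponential character.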

Finiasz-Sendrier
Finiasz and Sendrier in [14] proposed two improvements on Stern's Information Set Decoding (ISD) algorithm, obtaining Algorithm 6. The first improvement consists in removing the requirement for the presence of a run of zeroes in the permuted error vector ē, allowing the p error bits to be guessed to be present also in that region of ē. Such an approach raises the success probability of an iteration. Since the p positions which should be guessed are picked among the first k + ℓ ones of the error vector, Finiasz and Sendrier compute only a partial Reduced Row Echelon Form (RREF) transformation, obtaining a smaller, (r − ℓ) × (r − ℓ), identity matrix in the lower rightmost portion of UH = [[V^up, 0_{ℓ×(r−ℓ)}], [V^down, I_{r−ℓ}]], and leaving a zero submatrix on top of the identity. As a consequence, the cost of computing such a partial RREF is reduced, since the invertibility condition is required only for an (r − ℓ) × (r − ℓ) submatrix; moreover, the r^2 term accounting for the computation of the transformed syndrome is present when the ISD is used to solve the Syndrome Decoding Problem (SDP), while it is absent when the method is employed to solve the Codeword Finding Problem (CFP).
Input: s: an r-bit long syndrome (column vector); H: an r × n binary parity-check matrix; t: the weight of the error vector to be recovered.
Output: e: an n-bit binary row error vector s.t. He^T = s, with weight(e) = t.
Data: P: an n × n permutation matrix; ē = eP: the error vector permuted by P; p: the weight of the first k + ℓ bits of ē, 0 ≤ p ≤ t; ℓ: a free parameter, 0 ≤ ℓ ≤ r − (t − p); s̄: an r-bit long binary column vector, equal to the syndrome of ē through [[V^up, 0_{ℓ×(r−ℓ)}], [V^down, I_{r−ℓ}]]; L: list of pairs (I, a_I), with I a set of integer indexes between 0 and (k + ℓ)/2 − 1, and a_I an ℓ-bit binary column vector.

Proposition 5 (Computational complexity of Algorithm 6). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that s = He^T with Algorithm 6 also requires two additional parameters 0 ≤ p ≤ t, 0 ≤ ℓ ≤ (k − p).
The time complexity of Algorithm 6 can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1-17 and the computational requirements of executing the loop body c_iter. In particular, the time complexity is C_ISD(n, r, t, p, ℓ) = (1 / Pr_succ) · c_iter, where the success probability is obtained dividing the number of admissible permuted error vectors, binom((k + ℓ)/2, p/2)^2 · binom(r − ℓ, t − p), by the total one, binom(n, t), and C_IntToComb = O((2k − p)(log binom((k + ℓ)/2, p/2))^2) is the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the binom((k + ℓ)/2, p/2) ones. Note that, if the value of p is fixed, it is possible to avoid C_IntToComb, specializing the algorithm with a p-deep loop nest to generate the combinations. The spatial complexity S_ISD(n, r, t, p, ℓ) is analogous to that of Stern's algorithm, with k + ℓ taking the place of k. With a line of reasoning analogous to Stern's ISD, we consider the complexity of Finiasz and Sendrier's ISD with a logarithmic memory access cost, multiplying the cost of the iteration by the binary logarithm of the size of the required memory.
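The gain of the first improvement can be made concrete by comparing the per-iteration success probabilities of the two variants; this is our illustrative sketch, with the formulas taken from the propositions above:

```python
from math import comb

def stern_pr_succ(n, k, r, t, p, ell):
    """Per-iteration success probability of Stern's ISD."""
    return comb(k // 2, p // 2) ** 2 * comb(r - ell, t - p) / comb(n, t)

def fs_pr_succ(n, k, r, t, p, ell):
    """Finiasz-Sendrier: the p guessed error bits may fall anywhere in
    the first k + ell positions, raising the per-iteration success rate."""
    return comb((k + ell) // 2, p // 2) ** 2 * comb(r - ell, t - p) / comb(n, t)
```

For any ℓ > 0 the Finiasz-Sendrier probability dominates Stern's, and the two coincide at ℓ = 0.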

May-Meurer-Thomae
The variant of ISD proposed by May, Meurer and Thomae in [15] improves Finiasz and Sendrier's variant by introducing two tweaks, resulting in Algorithm 7. The first tweak changes the way in which the p error positions in the permuted error vector ē are chosen. Instead of splitting them equally, with p/2 in the leftmost (k + ℓ)/2 columns and p/2 in the subsequent (k + ℓ)/2 ones, the selection is made by picking two disjoint sets of indexes I, J ⊂ {0, . . ., k + ℓ − 1}. Such an approach increases the number of possible permutations which respect this constraint.
The second tweak considers V^up as logically split row-wise into two submatrices, the first one with ℓ1 rows and the second one with ℓ2 = ℓ − ℓ1 rows. After performing the same partial RREF as in Finiasz and Sendrier's ISD, such a subdivision is employed to further enhance the efficiency of the checks on the columns of V with respect to the idea, introduced by Stern, of matching bit strings against a precomputed set. To this end, the sets I and J are in turn obtained as the disjoint union of a pair of subsets with cardinality p/4: let I be I1 ∪ I2 and J be J1 ∪ J2. For the sake of simplicity, the disjoint union is realized picking the elements of I1, J1 in {0, . . ., (k + ℓ)/2 − 1} and the ones of I2, J2 in {(k + ℓ)/2, . . ., k + ℓ − 1}. The May-Meurer-Thomae (MMT) algorithm thus proceeds to exploit a time-memory tradeoff derived from the same equation employed by Stern, applying the precomputation strategy twice. Starting from the equality test on the syndrome, May, Meurer and Thomae exploited the strategy described by Stern to derive candidate values for the elements of I and J, rewriting the corresponding equalities so that, for the elements of the first list, 0_{ℓ2×1} + Σ_{i∈I1} v^{up,ℓ2}_i = Σ_{i∈I2} v^{up,ℓ2}_i holds, and for the elements of the second list, s̄^{up,ℓ2} + Σ_{j∈J1} v^{up,ℓ2}_j = Σ_{j∈J2} v^{up,ℓ2}_j holds, where v^{up,ℓ2}_i denotes the ℓ2-bit portion of the i-th column of V^up and s̄^{up,ℓ2} the corresponding portion of s̄^up. These forms are exploited to build two lists of candidate values for I and J, Ī and J̄. We note that, through a straightforward implementation optimization matching the one employed in Stern's algorithm, only the first list needs to be materialized (appearing as L in Algorithm 7).
The second observation made in the MMT algorithm relates to the possibility of reducing the size of the lists involved in Stern's meet-in-the-middle strategy for computing the sought error vector. The observation relies on the fact that it is possible to obtain the first k + ℓ bits of the permuted error vector ē as the sum of two (k + ℓ)-bit vectors, e1 and e2, with weight p/2 each. Stern's algorithm limits the positions of the ones in e1, e2 to the first half of the bits for e1 and to the second half for e2, yielding a single valid pair (e1, e2) for a given ē. By contrast, May-Meurer-Thomae does not constrain the two sets of p/2 positions to be picked from different halves of the (k + ℓ)-bit region of the error vector, but only constrains the choice to non-overlapping positions. In such a fashion, considering the correct guess of p positions, they can be split into the two (p/2)-sized sets in binom(p, p/2) possible valid ways, in turn increasing the likelihood of a correct guess. Given this choice, it is possible to reduce the size of the lists employed in the meet-in-the-middle approach by a factor of binom(p, p/2), while retaining (on average) at least a solution. To this end, the authors suggested picking the value of ℓ2 as ≈ log2 binom(p, p/2), in a way to reduce the size of the lists by the proper factor.
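The binom(p, p/2) representation count can be verified by brute force on a toy instance; the function below is our illustration of the counting argument, not part of the paper:

```python
from itertools import combinations
from math import comb

def representation_count(n, e_positions):
    """Brute-force count of pairs (e1, e2), each of weight p/2 with
    disjoint supports, such that e1 + e2 = e over GF(2), for a given
    weight-p error pattern e on n bits."""
    p = len(e_positions)
    e = set(e_positions)
    # e1 is any p/2-subset of the support of e; e2 = e \ e1 is forced.
    return sum(1 for s1 in combinations(range(n), p // 2) if set(s1) <= e)
```

For p = 4 this gives binom(4, 2) = 6 representations per error pattern, suggesting ℓ2 ≈ log2(6) ≈ 2.6 as the list-reduction parameter.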
Proposition 6 (Computational complexity of Algorithm 7). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that s = He^T with Algorithm 7 requires three additional parameters: 0 ≤ p ≤ t, and ℓ1, ℓ2 such that ℓ1 + ℓ2 = ℓ, with 0 ≤ ℓ ≤ r − (t − p).

Proof. The computational complexity is derived considering the number of iterations of the loops in the algorithm, taking into account the probability of the checks being taken (each full check on the remaining r − ℓ bits of a surviving candidate costing p(r − ℓ) bit additions). The spatial complexity is obtained as the sum of the size of the matrix H, the size requirements of L1 (as L2 has the same size and can reuse its space) and the expected size of L, considering how many pairs may survive the check performed when building it.
Input: s: an r-bit long syndrome (column vector); H: an r × n binary parity-check matrix; t: the weight of the error vector to be recovered.
Output: e: an n-bit binary row error vector s.t. He^T = s, with weight(e) = t.
Data: P: an n × n permutation matrix; ē = eP: the error vector permuted by P; p: the weight of the first k + ℓ bits of ē, 0 ≤ p ≤ t; ℓ1, ℓ2: two parameters with ℓ1 + ℓ2 = ℓ, 0 ≤ ℓ ≤ r − (t − p); s̄: an r-bit long binary column vector, equal to the syndrome of ē through the partially row-reduced parity-check matrix; L: list of pairs (I, a_I), with I a set of integer indexes between 0 and k + ℓ − 1, and a_I an ℓ-bit binary column vector; the length of L is kept at most binom(k + ℓ, p/2)/binom(p, p/2); L1, L2: lists of pairs (I1, a_{I1}) and (J1, a_{J1}), with I1, J1 sets of integer indexes between 0 and (k + ℓ)/2 − 1, and a_{I1}, a_{J1} ℓ1- and ℓ2-bit binary column vectors.

Becker-Joux-May-Meurer
The Becker-Joux-May-Meurer (BJMM) algorithm introduced in [16] improves the MMT algorithm in two ways: the former is a recursive application of the list building strategy, and the latter is a change to the list element generation employed. We discuss the latter first, deferring its recursive application for the sake of clarity. We then describe the adaptations needed to adopt the recursive splitting strategy without issues.
The BJMM algorithm considers that it is possible to represent a vector e of weight p and length k + ℓ as the sum of two vectors e1, e2 of weight p/2 + ε and the same length, under the assumption that the ε extra ones cancel out during the addition. We recall that the MMT approach demands that both e1 and e2 have weight strictly equal to p/2. The BJMM algorithm thus raises the number of valid pairs (e1, e2) employed to represent e by a factor equal to binom(k + ℓ − p, ε). Such an improvement is employed to further reduce the size of the lists of values which need to be computed to find e with a meet-in-the-middle approach checking the condition V^up e1 + V^up e2 = s̄^up. Indeed, since R = binom(p, p/2) · binom(k + ℓ − p, ε) pairs represent the same e, enumerating only a 1/R fraction of the pairs suffices. Willing to do useful computation instead of a simple 1/R sub-sampling of the space of the pairs, the BJMM algorithm opts for performing a partial check of the V^up e1 + V^up e2 = s̄^up equation, on a smaller number of bits than ℓ, discarding the pairs which do not pass the check.
Let us denote with V^up[ρ] the first ρ rows of V^up and with s̄^up[ρ] the first ρ bits of the syndrome s̄^up. The BJMM algorithm thus employs the test V^up[ρ] e1 + V^up[ρ] e2 = s̄^up[ρ] to obtain a twofold objective: discard a fraction of the (e1, e2) pairs, and keep only those pairs for which at least the first ρ bits of the sum of the columns of V^up[ρ] match the value of the corresponding syndrome bits. Such an approach has the advantage over a random sampling that the selected pairs have already been checked for compliance on a part of the V^up e1 + V^up e2 = s̄^up equation, i.e., it performs a random subsampling while doing useful computation. The BJMM paper suggests that the value of ρ should be ρ ≈ log2(R): under the assumption that the ρ-bit sums being performed are made of random values, and that the sum should match a given ρ-bit value, only a fraction equal to 1/2^ρ, i.e., ≈ 1/R, survives. Note that, regardless of the choice of ρ, a selection of the correct positions will always survive the preliminary check on ρ bits, while wrong solutions are filtered out on the basis that they will not match on the first ρ bits. In the complete algorithm, the aforementioned line of reasoning is applied twice, as the list-building strategy is recursively applied, leading to two different reduction factors depending on the recursion depth itself.
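The ρ-bit filter can be sketched as follows; this is our toy illustration (columns encoded as integers, the ρ low-order bits standing in for the first ρ rows), not the paper's implementation:

```python
def rho_filter(pairs, v_up, s_up, rho):
    """Keep only the (e1, e2) pairs whose column sums already match the
    syndrome on rho bits (here: the rho low-order bits of the integer
    encoding of each column). Correct pairs always survive; random
    wrong pairs survive with probability about 1/2**rho."""
    mask = (1 << rho) - 1

    def partial(idxs):
        acc = 0
        for i in idxs:
            acc ^= v_up[i]
        return acc & mask

    return [(e1, e2) for (e1, e2) in pairs
            if (partial(e1) ^ partial(e2)) == s_up & mask]
```

The surviving pairs have already passed a useful part of the full check, which is exactly the advantage over blind sub-sampling discussed above.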
We now come to the second improvement of the BJMM algorithm: the recursive application of the meet-in-the-middle strategy employed by all the ISDs since Stern's. Stern's meet-in-the-middle approach starts from rewriting the test V^up[ℓ] e = s̄^up as V^up[ℓ] e1 = s̄^up + V^up[ℓ] e2. The original BJMM proceeds to build two lists of pairs. The first list contains, for all possible e1, the pairs (V^up[ℓ] e1, e1). The second list contains, for all possible e2, the pairs (V^up[ℓ] e2 + s̄^up, e2). The BJMM algorithm sorts the two lists lexicographically on the first elements of the pairs and then checks for the matching pairs in essentially linear time in the length of the lists.
We note that a more efficient (memory saving) way of performing the same computation involves inserting each e1 into an (open hashing) hashmap, employing V^up[ℓ] e1 as the key. Subsequently, computing V^up[ℓ] e2 + s̄^up on the fly and looking it up in the hashmap yields all the matching pairs for e2. Let N be the number of possible pairs and M the number of matching pairs: the original strategy requires O(2√N log(√N) + M) vector-sized operations due to the sorting step, while the modified one requires O(√N + √N + M). The BJMM algorithm employs the meet-in-the-middle strategy to generate the values for the candidate vectors e more than once. In particular, a candidate for e, e^(0), of weight p and length k + ℓ, is obtained as the sum of two vectors of weight p/2 + ε1, each of which is in turn obtained as the sum of two vectors of weight p/4 + ε1/2 + ε2; the number of representations grows correspondingly at each split. The cost of an iteration of the loop at Lines 1-27 of the BJMM is obtained as the sum of four contributions: the cost of the loop at Lines 4-16, the cost of the loop at Lines 17-23, the cost of the loop at Lines 24-27 save for the portion related to the branch at Line 25 being taken, and the cost of computing the body of the taken branch, multiplied by the probability of such a branch being taken.
The BJMM variant of the ISD shares with the Stern, Finiasz and Sendrier, and May-Meurer-Thomae ISDs the fact that the exponential memory requirements should be taken into account by a logarithmic access cost, instead of a constant one.We do so following the same method employed for the aforementioned variants, i.e., augmenting the cost of the iteration accordingly.

Speedups in ISD Algorithms Due to Quasi-Cyclic Codes
A common choice to reduce the size of the keys in a McEliece or Niederreiter cryptosystem is to employ a so-called quasi-cyclic code. Such a code is characterized by a parity-check matrix composed of circulant block matrices, i.e., matrices where all the rows are obtained as cyclic shifts of the first one.
It is possible to exploit such a structure to obtain a polynomial speedup factor in the solution of both the Codeword Finding Problem (CFP) and the Syndrome Decoding Problem (SDP). The speedup in the solution of the CFP can be derived in a straightforward fashion observing that, for the CFP against either the primal or the dual code, for each codeword to be found in a quasi-cyclic code with p-sized circulant blocks, p − 1 more codewords can be derived simply as block-wise circulant shifts of the first one. As a consequence, for a given codeword with weight w sought by the algorithm, it is guaranteed that at least p of them are present in the code. Thus, in this case, the success probability of each iteration can be obtained as 1 − (1 − Pr_succ)^p; when p · Pr_succ ≪ 1, this in turn implies that the probability of success grows approximately by a factor p, in turn speeding up any ISD by the same factor.
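The 1 − (1 − Pr_succ)^p amplification and its linear approximation can be checked numerically; this one-liner is our illustration of the formula above:

```python
def qc_success_probability(pr_succ, p):
    """Per-iteration success probability when the p block-wise cyclic
    shifts of a sought codeword are all valid solutions: at least one
    of p independent chances must hit."""
    return 1 - (1 - pr_succ) ** p
```

For cryptographically small Pr_succ the value is within a fraction of a percent of p · Pr_succ, justifying the factor-p speedup claim.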
An analogous line of reasoning leads to exploiting the Decoding One Out of Many (DOOM) algorithm proposed by Sendrier in [38] to speed up the solution of the SDP. DOOM relies on the fact that a set of syndromes S, computed through the same parity-check matrix, is provided to the attacker, who attempts to decode at least one of them. In the case of a quasi-cyclic code, cyclically shifting the syndrome yields a different, valid syndrome, and a predictable cyclic shift of the corresponding (unknown) error vector. It is therefore possible for an attacker to derive p different syndromes starting from one; if any of them is successfully decoded, no matter which one, the attacker is able to reconstruct the sought error vector. Essentially, DOOM performs multiple ISD instances, taking care of duplicating only the checks which involve the syndrome, thus pooling the considerable amount of effort required in the rest of the iteration. The overall speedup achieved by DOOM for a quasi-cyclic code with block size p is √p.
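Generating the p derived syndromes is straightforward; the sketch below (our simplification, handling a single circulant block of the syndrome) shows the shift family an attacker would feed to DOOM:

```python
def doom_syndromes(s_bits, p):
    """All p cyclic shifts of a (single circulant block) syndrome of a
    quasi-cyclic code; decoding any one of them reveals the sought
    error vector up to a known cyclic shift."""
    return [s_bits[i:] + s_bits[:i] for i in range(p)]
```

Each shifted syndrome corresponds to a predictably shifted error vector, so a success on any element of the family can be mapped back to the original instance.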

Speedups from Quantum Computing
While there is no known polynomial time algorithm running on a quantum computer able to solve either the Syndrome Decoding Problem (SDP) or the Codeword Finding Problem (CFP), it is still possible to achieve a significant speedup in the attacks exploiting Grover's zero-finding algorithm. Grover's algorithm [39] finds a zero of an n-input Boolean function with a computational cost of √(2^n) function computations, instead of the 2^n required with a classical computer. The first proposed exploitation of Grover's algorithm to speed up ISDs was made by Bernstein in [40], observing that one iteration of Prange's algorithm can be rewritten as a Boolean function having a zero iff the iteration is successful in finding a valid error vector. The essence of the observation is that the Reduced Row Echelon Form (RREF) computation and the weight check on the resulting syndrome can be expressed as Boolean functions, and it is straightforward to extend them so that a single-bit output indicating the success of the iteration is added. Such an approach allows reducing the number of iterations to be performed to the square root of the one for the classical algorithm, since each iteration of Prange's algorithm is essentially trying (exhaustively) to find a zero of the aforementioned Boolean function. We therefore rephrase the computational complexity of Prange's algorithm on a quantum computer. For the sake of simplicity in the analysis, we forgo the overhead of implementing the Boolean function as a reversible circuit, obtaining a conservative estimate of the actual complexity.
Proposition 8 (Quantum computational complexity of Algorithm 1). Given H, an r × n binary parity-check matrix, and s, an r-bit long syndrome (column vector) obtained through H, the complexity of finding the row error vector e with length n and weight t such that s = He^T with Algorithm 1 running on a quantum computer can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1-7 and the computational requirements of executing the loop body c_iter. In particular, the time complexity is C_ISD(n, r, t) = √(1/Pr_succ) · c_iter, with Pr_succ = binom(r, t)/binom(n, t). The spatial complexity is S_ISD(n, r, t) = O(rn).
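The classical and Grover-accelerated iteration counts can be compared directly; the function below is our sketch of the estimate in the proposition (toy parameters in the test):

```python
from math import comb, sqrt

def prange_iterations(n, r, t):
    """Expected iteration counts of Prange's ISD, classically and under
    Grover's algorithm (square-root speedup), ignoring the overhead of
    the reversible-circuit implementation, as in the conservative
    estimate above."""
    pr_succ = comb(r, t) / comb(n, t)
    classical = 1 / pr_succ
    return classical, sqrt(classical)
```

The square-root relation halves the security exponent contributed by the iteration count, which is why quantum-resistant parameter sets roughly double the classical work factor target.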
Following an argument similar to the one for Prange's, we note that there is essentially no difficulty in interpreting Lee and Brickell's variant of the ISD as computing a slightly more complex Boolean function at each iteration, which allows us to reformulate its complexity (Proposition 2) for the quantum case as follows.
Proposition 9 (Quantum computational complexity of Algorithm 2).Given H, an r × n binary parity-check matrix and s, an r-bit long syndrome (column vector) obtained through H, finding the row error vector e with length n and weight t such that s = He T with Algorithm 2 requires an additional parameter 0 ≤ p ≤ t.
The time complexity of Algorithm 2 running on a quantum computer can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1-14 and the computational requirements of executing the loop body c_iter. In particular, the time complexity is C_ISD(n, r, t, p) = √(1/Pr_succ) · c_iter, with Pr_succ = binom(k, p) · binom(r, t − p)/binom(n, t), where C_IntToComb = O((2k − p)(log binom(k, p))^2) is the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the binom(k, p) ones. The spatial complexity is S_ISD(r, n) = O(rn).
Similarly, we also reformulate Leon's Algorithm 4 as follows.
Proposition 10 (Quantum computational complexity of Algorithm 4). Given H, an r × n binary parity-check matrix, s, an r-bit long syndrome (column vector) obtained through H, and the two parameters 0 ≤ p ≤ t, 0 ≤ ℓ ≤ (k − p), the complexity of finding the row error vector e with length n and weight t such that s = He^T with Algorithm 4 running on a quantum computer can be computed starting from the probability of success Pr_succ of a single iteration of the loop at Lines 1-10 and the computational requirements of executing the loop body c_iter. In particular, the time complexity is C_ISD(n, r, t, p, ℓ) = √(1/Pr_succ) · c_iter, with Pr_succ = binom(k, p) · binom(r − ℓ, t − p)/binom(n, t), where C_IntToComb = O((2k − p)(log binom(k, p))^2) is the cost of decoding an integer into its combinadics representation, i.e., finding the corresponding combination among all the binom(k, p) ones. Note that, if the value of p is fixed, it is possible to avoid C_IntToComb, specializing the algorithm with a p-deep loop nest to generate the combinations. The spatial complexity is S_ISD(r, n) = O(rn).
Finally, we tackle the reformulation of Stern's ISD for the quantum case.

Quantitative Assessment of ISD Complexities
In this section, we analyze the computational complexities to solve the Syndrome Decoding Problem (SDP) and the Codeword Finding Problem (CFP) for sets of code parameters relevant to post quantum cryptosystems.To this end, we select the proposals which were admitted to the second round of the NIST post quantum cryptography standardization effort [24] relying on a Niederreiter cryptosystem, or its McEliece variant.
In particular, we consider Classic McEliece [43] and NTS-KEM [44], which employ Goppa codes, and BIKE [34] and LEDAcrypt [31], which employ quasi-cyclic codes, to assess our finite regime approach on both quasi-cyclic and non-quasi-cyclic codes. The parameters of the reported cryptosystems are designed to match the computational effort required to break them to the one required to break AES-128 (Category 1), AES-192 (Category 3), and AES-256 (Category 5). We report the code length n, code dimension k, number of errors to be corrected t, and size of the circulant block p for all the aforementioned candidates in Table 2. Furthermore, for the cryptosystems relying on low- or moderate-density parity-check codes, we also report the weight w of the codeword to be found in the case of a Codeword Finding Problem (CFP).
We implemented all the complexity formulas from Section 3 employing Victor Shoup's NTL library, representing the intermediate values either as arbitrary precision integers or as NTL::RR selectable precision floating point numbers. We chose to employ a 128-bit mantissa and the default 64-bit exponent for the selectable precision.
To minimize the error in computing a large amount of binomial coefficients, while retaining acceptable performance, we precompute the exact values of the binomials binom(n, k) for all pairs n, k up to binom(3000, 200). Furthermore, to minimize the error of Stirling's approximation whenever n ≫ k, we also precompute all the exact values for the binomials up to binom(10000, 10), and compute the exact value whenever k < 10. To provide a fast approximated computation for all the binomials which do not fall in the aforementioned intervals, we compute them employing the logarithm of Stirling's series approximated to the fourth term. We explored the parameter space of each algorithm considering the fact that the different parameters drive different tradeoff points in each algorithm. To this end, we explored an appropriately sized region of the parameter space, which we report in Table 3. To determine the explored region, we started from a reasonable choice and enlarged the region until the value of the parameters minimizing the attack complexity was no longer on the region edge for any of the involved parameters. For the choice of the ℓ2 parameter in the MMT ISD variant and of the ε1 and ε2 parameters in the BJMM variant, we employed the choices advised in the respective works.
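A lightweight stand-in for the approximated binomials outside the exactly precomputed range can be built on the log-Gamma function; this is our sketch, not the NTL-based implementation, and it swaps the truncated Stirling series for `lgamma`, which serves the same purpose:

```python
from math import comb, lgamma, log, log2

def log2_binom(n, k):
    """Approximate log2(binom(n, k)) via the log-Gamma function, a
    stand-in for the truncated-Stirling-series approximation used for
    binomials outside the exactly precomputed range."""
    if k < 0 or k > n:
        return float("-inf")
    # log2 C(n, k) = [ln Gamma(n+1) - ln Gamma(k+1) - ln Gamma(n-k+1)] / ln 2
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)) / log(2)
```

Working in log2 directly keeps all intermediate values small, which is the same motivation behind the logarithmic Stirling-series evaluation described above.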

ISD Variant Parameter Range
Prange [10]: none
Lee-Brickell [11]: 1 ≤ p ≤ 12 (p = 2 is asymptotically optimal)
Leon [12]: 1 ≤ p ≤ 12, 0 ≤ ℓ ≤ min(100, r − (t − p))
Stern [13]: 2 ≤ p ≤ 18, 0 ≤ ℓ ≤ min(100, r − (t − p))
Finiasz-Sendrier [14]: 2 ≤ p ≤ 18, 0 ≤ ℓ ≤ min(100, r − (t − p))
MMT [15]: 4 ≤ p ≤ 34, 110 ≤ ℓ ≤ min(350, r − (t − p))

We took into account the advantage provided by a quasi-cyclic code in the solution complexity of both the Syndrome Decoding Problem (SDP) and the Codeword Finding Problem (CFP), reducing it by a factor equal to √p, the square root of the cyclic block size, for the SDP, and by p for the CFP, in accordance with the point raised in Section 3.9. Table 4 reports the computational costs of solving both SDP and CFP by means of the described variants of the Information Set Decoding (ISD). In addition to the computational complexities obtained, the value of a simple asymptotic cost criterion for ISDs, described in [34], is reported. Such a criterion states that the asymptotic complexity of an ISD is 2^{−log2(1 − k/n) · t} in the case of its use for solving an SDP. A noteworthy point to observe is that, considering the finite regime value of the complexities, the May-Meurer-Thomae algorithm attains a lower time complexity than the Becker-Joux-May-Meurer algorithm in most cases. Indeed, while the Becker-Joux-May-Meurer ISD variant has a lower asymptotic cost, considering a worst-case scenario for the solution of the SDP, i.e., a code rate close to 0.5 and a large enough value for n, a finite regime estimate reports that employing the May-Meurer-Thomae approach should result in a faster computation. Concerning the space complexities of the approaches with exponential (in the code length n) space requirements, we report the obtained values in Table 5. We note that the ISD variants proposed by Stern and by Finiasz and Sendrier have an overall lower memory consumption than their more advanced counterparts. In particular, the space complexities of the aforementioned variants start as low as 16 GiB for Category 1 parameters, and are thus amenable to an implementation which keeps the entire lists in main memory on a modern desktop. By contrast, the May-Meurer-Thomae and Becker-Joux-May-Meurer ISD variants require a significantly higher amount of memory, with the latter being less demanding than the former in all cases but that of LEDAcrypt in its n0 = 2 parameterization for extremely long term keys (LEDAcrypt-XT). In all cases, the space complexities of May-Meurer-Thomae and Becker-Joux-May-Meurer exceed 2^50, pointing strongly to the need for a significant amount of mass storage to implement a practical attack. Such a requirement is even more stringent for the higher security levels, where the memory requirements exceed 2^100 for most parameter choices.

Algorithm 1: Syndrome decoding formulation of Prange's ISD.
Input: s: an r-bit long syndrome (column vector)
       H: an r × n binary parity-check matrix
       t: the weight of the error vector to be recovered
Output: e: an n-bit binary row error vector s.t. He^T = s, with WEIGHT(e) = t
Data: P: an n × n permutation matrix
      ŝ: an r-bit long binary column vector
      V: an r × k binary matrix, V = [v_0, ..., v_{k−1}]

 1 repeat
 2     repeat
 3         P ← random n × n permutation matrix
 4         Ĥ ← HP                           // the corresponding error vector is ê = eP
 5         [V|W] ← REDROWECHELONFORM(Ĥ)     // UĤ = [V|W_{r×r}]
 6     until W = I_r
 7     ŝ ← Us
 8     ê ← [0_{1×k} ŝ^T]
 9 until WEIGHT(ê) = t
10 return êP^T
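The loop structure above can be rendered in executable form. The following is a minimal toy-scale Python sketch over GF(2) with dense bit-list matrices; the function name prange_isd and the data representation are our own illustrative choices, not taken from the paper:

```python
import random

def prange_isd(H, s, t, max_iters=20000):
    """Find an n-bit error e with H e^T = s and WEIGHT(e) = t (Prange's ISD).

    H is a list of r rows (each a list of n bits in {0, 1}); s is a list of
    r bits. Returns e as a bit list, or None if no solution is found.
    """
    r, n = len(H), len(H[0])
    k = n - r
    for _ in range(max_iters):
        # Pick a random column permutation P: column j of HP is H's column perm[j].
        perm = list(range(n))
        random.shuffle(perm)
        Hp = [[row[j] for j in perm] for row in H]
        sp = list(s)
        # Reduce the last r columns of [Hp | sp] to the identity (W = I_r);
        # restart with a fresh permutation if W turns out to be singular.
        singular = False
        for i in range(r):
            col = k + i
            piv = next((x for x in range(i, r) if Hp[x][col] == 1), None)
            if piv is None:
                singular = True
                break
            Hp[i], Hp[piv] = Hp[piv], Hp[i]
            sp[i], sp[piv] = sp[piv], sp[i]
            for x in range(r):
                if x != i and Hp[x][col] == 1:
                    Hp[x] = [a ^ b for a, b in zip(Hp[x], Hp[i])]
                    sp[x] ^= sp[i]
        if singular:
            continue
        # Candidate permuted error: ê = [0_{1×k} | ŝ^T]; accept iff WEIGHT(ê) = t,
        # i.e. iff the reduced syndrome itself has weight t.
        if sum(sp) == t:
            e = [0] * n
            for i in range(r):
                e[perm[k + i]] = sp[i]  # undo the permutation: e = êP^T
            return e
    return None
```

The sketch succeeds only on those permutations that move every error position into the redundancy part, which is exactly the event whose probability governs Prange's iteration count.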

 9         break
10 until WEIGHT(ê) = t
11 return êP^T

Algorithm 3: Codeword finding formulation of Lee and Brickell's ISD.
Input: H: an r × n binary parity-check matrix
       w: the weight of the codeword to be found
Output: c: an n-bit codeword with WEIGHT(c) = w
Data: P: an n × n permutation matrix
      ĉ = cP: the codeword permuted by P
      p: the weight of the first k bits of ĉ, 0 ≤ p ≤ w

 9         break
10 until WEIGHT(ĉ) = w
11 return ĉP^T
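The benefit of allowing p set positions inside the information set can be quantified: one iteration of Lee and Brickell's ISD succeeds with probability C(k, p)·C(r, t − p)/C(n, t) in the syndrome decoding setting. A small sketch, assuming the Classic McEliece Category 1 parameters purely for illustration:

```python
from math import comb, log2

# One Lee-Brickell iteration succeeds when exactly p of the t error positions
# fall in the information set: probability C(k, p) * C(r, t - p) / C(n, t).
# Parameters: Classic McEliece Category 1 code (illustrative assumption).
n, r, t = 3488, 768, 64
k = n - r

def expected_iterations(p):
    p_succ = comb(k, p) * comb(r, t - p) / comb(n, t)
    return 1 / p_succ

for p in (0, 1, 2):  # p = 0 coincides with Prange's algorithm
    print(f"p = {p}: ~2^{log2(expected_iterations(p)):.1f} iterations expected")
```

Small values of p already cut the expected iteration count substantially, at the price of testing C(k, p) candidate error patterns within each iteration.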

Algorithm 4: Syndrome decoding formulation of Leon's ISD.
Input: s: an r-bit long syndrome (column vector)
       H: an r × n binary parity-check matrix
       t: the weight of the error vector to be recovered
Output: e: an n-bit binary row error vector s.t. He^T = s, with WEIGHT(e) = t
Data: P: an n × n permutation matrix
      ê = eP: the error vector permuted by P
      p: the weight of the first k bits of ê, 0 ≤ p ≤ t
      ℓ: the length of the run of zeroes in ê = [ê_{1×k} 0_{1×ℓ} ē_{1×(r−ℓ)}]
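The extra requirement that ê contains a run of ℓ zeroes right after the information set lowers the per-iteration success probability by a computable factor. A small sketch, with parameter values of our own illustrative choosing:

```python
from math import comb

# Leon's variant additionally requires the l-bit window right after the
# information set in ê to be error-free. With t - p errors among the last r
# positions, that happens with probability C(r - l, t - p) / C(r, t - p).
# All parameter values here are illustrative assumptions.
r, t, p, l = 768, 64, 2, 20

p_window = comb(r - l, t - p) / comb(r, t - p)
print(f"P[l-bit window clear] = {p_window:.3f}")
```

This factor trades a lower success probability per iteration for a much cheaper early-abort test on only ℓ bits.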

Algorithm 5: Syndrome decoding formulation of Stern's ISD.
Input: s: an r-bit long syndrome (column vector)
       H: an r × n binary parity-check matrix
       t: the weight of the error vector to be recovered
Output: e: an n-bit binary row error vector s.t. He^T = s, with WEIGHT(e) = t
Data: P: an n × n permutation matrix
      ê = eP: the error vector permuted by P
      p: the weight of the first k bits of ê, 0 ≤ p ≤ t
      ℓ: the length of the run of zeroes in ê = [ê_{1×k} 0_{1×ℓ} ē_{1×(r−ℓ)}]
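Stern's collision step builds two lists, one per half of the information set, and matches them on ℓ bits of the syndrome. A rough estimate of the list sizes and of the expected number of matches, under parameter values of our own illustrative choosing:

```python
from math import comb, log2

# Stern's collision search: each of the two lists enumerates p/2 columns
# taken from one half of the information set, and the lists are matched on
# l bits of the syndrome. All parameter values here are illustrative.
n, r = 3488, 768
k, p, l = n - r, 4, 30

list_size = comb(k // 2, p // 2)      # candidates per half
collisions = list_size ** 2 / 2 ** l  # expected matches on l bits
print(f"list size ~2^{log2(list_size):.1f}, "
      f"expected collisions per iteration ~2^{log2(collisions):.1f}")
```

The memory footprint of one iteration is driven by list_size, which is why Stern-like variants remain far less memory-hungry than their MMT and BJMM successors.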

Since R = C(p, p/2) · C(k + ℓ − p, ε) pairs (e_1, e_2) which respect ê = e_1 + e_2 exist, exhaustively searching a 1/R fraction of the C(k + ℓ, p/2 + ε) candidate vectors will yield (on average) a solution, assuming that the solution pairs are uniformly distributed over all the candidates.
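The representation count R can be evaluated directly; setting ε = 0 recovers the May-Meurer-Thomae setting, while ε > 0 corresponds to the Becker-Joux-May-Meurer-style extension. The parameter values below are illustrative assumptions only:

```python
from math import comb, log2

# Number of representations of a weight-p vector e over k + l positions as
# e = e1 + e2 with WEIGHT(e1) = WEIGHT(e2) = p/2 + eps:
#   R = C(p, p/2) * C(k + l - p, eps).
# Only a 1/R fraction of the C(k + l, p/2 + eps) candidates must be searched
# on average. All parameter values here are illustrative assumptions.
k, l, p, eps = 2720, 60, 8, 0  # eps = 0: MMT setting; eps > 0: BJMM-style

R = comb(p, p // 2) * comb(k + l - p, eps)
candidates = comb(k + l, p // 2 + eps)
print(f"R = {R}, searching a 2^-{log2(R):.1f} fraction "
      f"of ~2^{log2(candidates):.1f} candidates suffices on average")
```

The 1/R saving is exactly what the tree-based list constructions of MMT and BJMM exploit, at the cost of the large intermediate lists quantified in Table 5.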

Table 2 .
Summary of the code parameters for the second round submissions: code length n, code redundancy r = n − k, and number of errors t expected to be corrected.

Table 3 .
Explored parameter range for the different ISD variants.