Analysis and Correction of the Attack against the LPN-Problem Based Authentication Protocols

This paper reconsiders a powerful man-in-the-middle attack against the Random-HB# and HB# authentication protocols, two prominent representatives of the HB family of protocols, which are built on the Learning Parity with Noise (LPN) problem. A recent empirical report pointed out that the attack does not meet the claimed precision and complexity. Performing a thorough theoretical and numerical re-evaluation of the attack, in this paper we identify the root cause of the detected problem: reasoning based on approximate probability distributions of the central attack events, which cannot provide the required precision due to inherent limitations of the Central Limit Theorem in this particular application. We rectify the attack by employing adequate Bayesian reasoning, after establishing the exact distributions of these events, and overcome the mentioned limitations. We further experimentally confirm the correctness of the rectified attack and show that it achieves the targeted accuracy and efficiency, unlike the original attack.


Introduction
The construction of lightweight and secure authentication protocols for RFID (Radio Frequency IDentification) devices is an important task of contemporary cryptography. These devices are employed in supply-chain management, payment and transportation systems, the tracking of goods and other applications, and are rapidly becoming one of the most pervasive technologies. An RFID system usually consists of two entities: a resource-constrained Tag attached to a physical object and a more computationally powerful Reader, which communicate using an authentication protocol so that the Reader can validate the Tag. Reaching high security requirements for such validation while minimizing its resource cost is a very active research area [1][2][3]. One of the important families of authentication protocols for RFID systems is the HB family.
The HB family originates from a lightweight protocol called HB that was proposed by Hopper and Blum [4] and is built on the hardness of the Learning Parity with Noise (LPN) problem. Informally, the LPN problem can be considered a problem of solving an overdefined system of consistent linear equations over GF(2), the field with two elements, where certain equations are available only in a corrupted form. While the HB protocol resists passive (eavesdropping) attacks, it was shown to be vulnerable against an active adversary who can impersonate a reader and interact with legitimate tags. A modified protocol named HB+ [5,6] was proposed with the aim of addressing this weakness. Soon after, it was shown that the HB+ protocol is defenseless against a stronger adversary who can modify the messages sent by the reader [7]. This attack is known as the GRS man-in-the-middle (MIM) attack. In order to avoid the GRS-MIM attack, different protocol variants were proposed (see, for example, HB++ [8] and HB-MP [9]). However, they were shown to be vulnerable [10], until the HB# and Random-HB# protocols were introduced in [11] and proven to be secure against GRS-MIM. Shortly thereafter, Ouafi, Overbeck and Vaudenay proposed a more general MIM attack (OOV, by the authors' initials) [12] against HB# and Random-HB#. The attack assumes an adversary that can modify the messages exchanged in both directions between the tag and the reader. Moreover, OOV can be regarded as a generic attack against the HB family. The OOV attack remains one of the keystones in the analysis of HB-like authentication schemes, and it is recognized as essential in the security evaluation of any novel HB-like protocol [1].
Motivation for the work. Recent results presented in [25] showed that the OOV attack is significantly less successful than claimed in [12] and pointed out malfunctioning in the core component of the attack. The estimated complexity of the attack is 18% higher for HB# and 55% higher for Random-HB# than claimed, in the case of the standard parameter set II. This is a significant increase, having in mind the overall complexity and time consumption of the attack, which is claimed to be 2^29.4 for Random-HB# and 2^21 for HB#. In this paper, we continue this line of investigation and revise the theoretical and numerical analysis behind the attack provided in [12], in order to determine the cause of the mentioned problem and, if possible, to solve it.
Summary of the results. This paper revises the cryptanalysis from [12], providing proof and explaining why the approximations of the probability distributions employed in the core component of the attack are inappropriate in the considered context, which results in lower precision and higher complexity of the OOV attack [12]. Further, this paper derives the correct probability distributions of the number of successful authentications, which leaks secret information that can be used to recover the secret keys. Finally, a correction of the OOV attack is proposed, which uses the derived, correct probability distributions and meets the targeted precision and complexity.
Organization of the work. Section 2 provides background on the HB# and Random-HB# protocols and the OOV attack. Section 3 presents a thorough revision of the theoretical analysis behind the OOV attack and points to the critical omissions in it. Section 4 introduces the corrected attack and analyzes its performance. Section 5 provides the results of the experimental analysis. In Section 6, the findings and results presented in the paper are briefly summarized.

Preliminaries
A list of the notation used throughout the remainder of the paper is given below.
• X_n ⇀ X: the sequence of random variables X_1, X_2, . . . , X_n converges weakly (in distribution) to X as n → ∞;
• ||x||: the Hamming weight of a binary vector x;
• P(w̄): the probability of acceptance during the OOV attack when the Adversary adds a noise vector ē, ||ē|| = w̄, to a regular noise vector e in a protocol session, that is, P(w̄) = Pr[||e ⊕ ē|| ≤ thr];
• P_OOV(w̄): the approximation of P(w̄) used in the OOV attack [12].
The HB family of authentication protocols has attracted a lot of attention because of its simple implementations and provable security based on the well-known hard problem, Learning Parity with Noise (LPN). Random-HB# and HB# are prominent representatives of this family. Their authentication procedure consists of the following steps [11]: first, the Tag sends a random blinding vector b to the Reader to initiate the authentication, and the Reader responds with a random challenge vector a. The Tag then sends z = aX ⊕ bY ⊕ e to the Reader, where e is a noise vector whose bits independently follow the Bernoulli distribution with parameter τ, and X ∈ Z_2^{k_X × m}, Y ∈ Z_2^{k_Y × m} are their shared secret keys (random matrices for Random-HB# and so-called Toeplitz matrices for HB#). The Reader validates the Tag, that is, accepts its response, if and only if the Hamming weight ||aX ⊕ bY ⊕ z|| falls under a certain threshold value thr (see Figure 1). Standard parameter values for these protocols are given in Table 1 [11]. The number l of secret bits is (k_X + k_Y)m for Random-HB#, while it is k_X + k_Y + 2m − 2 for HB#. The mechanism of the OOV attack proposed in [12] is shown in Figure 2. The adversary:
1. Collects a triplet (ā, b̄, z̄ = āX ⊕ b̄Y ⊕ ē) of messages exchanged between the Tag and the Reader by eavesdropping one of their communication sessions.
2. Replaces each triplet (a, b, z) of messages between the Tag and the Reader during the n following communication sessions with the triplet (a ⊕ ā, b ⊕ b̄, z ⊕ z̄).
3. Counts the number c of "ACCEPT" decisions of the Reader at the end of those n sessions.

The acceptance rate c/n, as it turns out, leaks the critical information which reveals the secret values. More precisely, the theoretical analysis from [12] shows that c/n ≈ P_OOV(w̄) := Φ((thr − (m − w̄)τ − w̄(1 − τ))/√(mτ(1 − τ))), where Φ is the standard normal cumulative distribution function. This formula allows the adversary to estimate the Hamming weight ||ē|| using solely the empirical value c/n, for n large enough (Algorithm 1 from [12]).
After the adversary discovers the Hamming weight of the noise vector ē, he can reconstruct the vector itself by flipping its bits one by one (more precisely, he flips bits of z̄ = āX ⊕ b̄Y ⊕ ē, which secretly contains ē) and measuring the weight of ē after each flip. If the weight has increased, the flipped bit was 0; otherwise, it was 1. This way, he reconstructs the noise vector ē and obtains the linear combination āX ⊕ b̄Y, since āX ⊕ b̄Y = z̄ ⊕ ē (Algorithm 2 from [12]). The whole procedure is then repeated for other modification triplets (ā_i, b̄_i, z̄_i = ā_iX ⊕ b̄_iY ⊕ ē_i) obtained by eavesdropping, until the adversary collects enough linear combinations ā_iX ⊕ b̄_iY = z̄_i ⊕ ē_i to form a full system of linear equations. The secret keys X and Y are then recovered as the solution to this system.
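The bit-flipping recovery step above can be sketched as follows. This is a minimal illustration in which the hypothetical `weight_oracle` callback stands in for the adversary's weight-estimation procedure and is taken to be exact, which is an idealization:

```python
import random

# Sketch of the noise-vector recovery step (in the spirit of Algorithm 2
# from [12]): flipping bit i of z-bar flips bit i of e-bar, so the change in
# the measured weight reveals the bit. `weight_oracle` is a hypothetical
# stand-in for the adversary's weight-estimation procedure; here it is exact.

def recover_noise_vector(e_bar, weight_oracle):
    w = weight_oracle(e_bar)             # initial weight estimate
    recovered = []
    for i in range(len(e_bar)):
        flipped = list(e_bar)
        flipped[i] ^= 1
        if weight_oracle(flipped) > w:   # weight increased -> bit was 0
            recovered.append(0)
        else:                            # weight decreased -> bit was 1
            recovered.append(1)
    return recovered

random.seed(1)
e_bar = [1 if random.random() < 0.125 else 0 for _ in range(64)]
assert recover_noise_vector(e_bar, sum) == e_bar  # exact oracle recovers e_bar
```

With a noisy oracle, as in the real attack, each comparison succeeds only with some probability, which is exactly what the precision analysis later in the paper quantifies.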
As illustrated above, in each corrupted communication session the Reader receives (â, b̂, ẑ) = (a ⊕ ā, b ⊕ b̄, z ⊕ z̄) and computes âX ⊕ b̂Y ⊕ ẑ = (a ⊕ ā)X ⊕ (b ⊕ b̄)Y ⊕ z ⊕ z̄ = e ⊕ ē, so the Tag successfully authenticates iff ||e ⊕ ē|| ≤ thr, whereas in a regular session the Tag successfully authenticates iff ||e|| ≤ thr. This way, by creating the cumulative noise e ⊕ ē, the adversary manipulates the verification criterion of the Reader and changes its theoretical acceptance rate from Pr[||e|| ≤ thr] to P(w̄) = Pr[||e ⊕ ē|| ≤ thr].
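A Monte-Carlo sketch of this manipulated acceptance rate; the parameter values below (m = 441, τ = 0.125, thr = 113) are used for illustration only, and the simulation is not part of the original attack:

```python
import random

# Monte-Carlo sketch of the manipulated acceptance rate P(w): in each session
# a fresh e ~ Ber(tau)^m is drawn and the Reader accepts iff
# ||e xor e_bar|| <= thr. Parameter values are illustrative only.

def acceptance_rate(w_bar, m, tau, thr, n, rng):
    e_bar = [1] * w_bar + [0] * (m - w_bar)  # only the weight of e_bar matters
    c = 0
    for _ in range(n):
        weight = sum(b ^ (rng.random() < tau) for b in e_bar)
        c += weight <= thr
    return c / n

rng = random.Random(0)
lo = acceptance_rate(90, 441, 0.125, 113, 2000, rng)  # heavier e_bar
hi = acceptance_rate(60, 441, 0.125, 113, 2000, rng)  # lighter e_bar
assert hi > lo   # acceptance rate decreases with the weight of e_bar
```

The observed frequency c/n thus carries information about ||ē||, which is exactly what the attack exploits.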
Let us provide a simple and useful characterization of the output of OOV attack Algorithm 1 [12] by introducing the notion of "decision zones."

Definition 1 ("OOV decision zones"). The OOV w̄-decision zone is an interval I_OOV(w̄) such that OOV Algorithm 1 estimates ||ē|| as w̄ iff c/n ∈ I_OOV(w̄).
After eavesdropping a triplet, the adversary considers all weights of the noise vector ē possible. He decides that ||ē|| is w̄ ⟺ w̄ − 1/2 ≤ P_OOV^(−1)(c/n) < w̄ + 1/2 ⟺ c/n ∈ I_OOV(w̄) = (P_OOV(w̄ + 1/2), P_OOV(w̄ − 1/2)], since P_OOV is a monotone decreasing function (see Figure 3a). After flipping a bit in a noise vector whose weight was previously estimated as w̄, the adversary considers only two weights possible, w̄ − 1 and w̄ + 1, so there are two decision zones: I_OOV(w̄ − 1) = (P_OOV(w̄), ∞) and I_OOV(w̄ + 1) = (−∞, P_OOV(w̄)] (see Figure 3b). In [12], the complexity of the OOV attack is estimated as the overall number of modified authentication sessions, which is minimized when the expected noise vector weight w_exp coincides with the so-called optimal weight, in which case it is polynomial (for weights not near enough to the optimal one, it becomes exponential). To achieve this, different strategies are introduced in [12], such as flipping an adequate number of last bits before weight measurement when w_exp ≤ w_opt, or removing the already recovered 1-bits when w_exp ≥ w_opt. Also, ref. [12] provides an optimized version of the attack which flips block-by-block in the noise vector recovery process instead of bit-by-bit. However, it was observed empirically in [25] that the actual benefit of this optimized version is somewhat overestimated. The explanation offered is that it uses insufficient sample sizes for decision making when measuring weights further away from the optimal one.

Revision of the OOV Attack
The previous work [25] has shown that the OOV attack predominantly estimates the weights of noise vectors incorrectly. The probability of key recovery, that is, the efficiency of the attack, is shown to be significantly lower than the values claimed in [12]. For the standard parameter set II, the probability of correct key recovery is shown to be 0.158 in the case of HB# and 10^−7 in the case of Random-HB#. The analysis presented in [25] reveals that, in order to achieve the precision of key recovery claimed in [12], it is necessary to increase the number of intercepted authentications by 18% in the case of HB# and by 55% in the case of Random-HB#. Since the number of intercepted authentication sessions is the unit of the attack complexity, the complexity increases accordingly. Furthermore, the analysis from [25] shows that the weight estimation error cannot be corrected by taking a larger "sample" n, i.e., a larger number of intercepted authentication sessions. On the contrary, increasing the sample worsens the quality of the weight estimate. For example, experimental evaluation on the standard parameter set II shows that the percentage of correctly estimated weights is only 5%, even when a very large number of modifications is used (for more details, see Section 4.4 in [25]).
This has led us to conduct a thorough revision of the cryptanalysis from [12], which we provide in this section. We shall prove that the attack's erroneous output is caused by inadequate, non-Bayesian inference over improper, approximate probability distributions of acceptance rates, which cannot be improved due to the limitations of applying the Central Limit Theorem to these protocols. We identify the exact distributions and the exact error of the approximations from [12]. Then we employ Bayesian reasoning over the exact distributions to construct proper decision zones, and show how the OOV weight decision making proposed in [12] deviates significantly from the proper, Bayesian one.

Revision of the Theoretical Analysis behind the OOV Weight Estimate
Here, we revise the derivation of approximations of acceptance rates used in the OOV weight estimate process and report their significant imprecision. Specifically, this derivation is given in the "Correctness" paragraph, Section 2.1 in [12].
3.1.1. Incorrect Claim that the Cumulative Noise Vector e ⊕ ē Follows a Binomial Distribution

The mentioned paragraph begins with the calculation of the probability that the i-th bit of the cumulative noise vector e ⊕ ē is 1: Pr[(e ⊕ ē)_i = 1] = τ if ē_i = 0, and 1 − τ if ē_i = 1. Then it says (exact quotation): "Hence, m − w̄ bits of e ⊕ ē follow a Bernoulli distribution of parameter τ and the other w̄ bits follow a Bernoulli distribution of parameter 1 − τ, thus e ⊕ ē follows a binomial distribution." [12].
However, that is not correct: ||e ⊕ ē|| does not follow a binomial distribution, because the binomial distribution is, by definition, the distribution of a sum of independent and identically distributed Bernoulli trials, i.e., trials with the same parameter (probability of success). Here, the Hamming weight of the cumulative noise, ||e ⊕ ē|| = Σ_{i=1}^m (e ⊕ ē)_i, is a sum of Bernoulli trials with mixed parameter values τ and 1 − τ, and actually follows the more general, so-called Poisson-Binomial distribution. We elaborate more on this distribution in the upcoming Section 3.2.
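The exact distribution of ||e ⊕ ē|| can be tabulated by convolving the individual Bernoulli trials, which also makes the difference from the plain binomial case visible; the small parameter values below are illustrative:

```python
# The weight ||e xor e_bar|| is a sum of m independent Bernoulli trials,
# m - w of parameter tau and w of parameter 1 - tau, i.e. Poisson-Binomial.
# Its exact pmf follows by convolving the trials one at a time.

def poisson_binomial_pmf(probs):
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            nxt[k] += q * (1 - p)     # this trial fails
            nxt[k + 1] += q * p       # this trial succeeds
        pmf = nxt
    return pmf

m, w, tau = 20, 5, 0.125              # small illustrative values
mixed = poisson_binomial_pmf([tau] * (m - w) + [1 - tau] * w)
plain = poisson_binomial_pmf([tau] * m)      # the claimed Binomial(m, tau)
assert abs(sum(mixed) - 1.0) < 1e-9
assert mixed != plain                 # the two distributions genuinely differ
```

Already for these toy parameters the two pmfs have different means (w shifts probability mass upward), so treating the weight as binomial misplaces the whole distribution.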

Approximation of Acceptance Rates P(w̄) ≈ P_OOV(w̄) without Error Estimation
The "Correctness" paragraph [12] continues with the calculation of the expected weight of the vector e ⊕ ē, µ = E(||e ⊕ ē||) = w̄(1 − τ) + (m − w̄)τ, and its variance, σ² = Var(||e ⊕ ē||) = mτ(1 − τ), which are correct, and derives the approximation of the acceptance rate during the attack P(w̄) ≈ P_OOV(w̄) := Φ((thr − µ)/σ), where Φ is the standard normal cumulative distribution function, by referring to the Central Limit Theorem (CLT) (Formula (1) in [12]). Here, ref. [12] applies the CLT to the sum ||e ⊕ ē|| without discussing the magnitude of the error of this approximation; the theorem itself only guarantees convergence as m → ∞.
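For concreteness, the CLT approximation P_OOV(w̄) = Φ((thr − µ)/σ) can be evaluated as follows; Φ is expressed through the error function, and the parameter values are illustrative:

```python
import math

# Evaluating the CLT approximation P_OOV(w) = Phi((thr - mu)/sigma), with
# mu = (m - w)tau + w(1 - tau) and sigma^2 = m*tau*(1 - tau). Phi is written
# via math.erf; the parameter values are illustrative.

def phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_oov(w, m, tau, thr):
    mu = (m - w) * tau + w * (1 - tau)
    sigma = math.sqrt(m * tau * (1 - tau))
    return phi((thr - mu) / sigma)

m, tau, thr = 441, 0.125, 113
rates = [p_oov(w, m, tau, thr) for w in range(60, 100)]
assert all(a > b for a, b in zip(rates, rates[1:]))  # monotone decreasing in w
```

Monotonicity in w̄ is what makes the inversion P_OOV^(−1)(c/n) used by the OOV adversary well defined; the question addressed next is how far these approximate values lie from the exact ones.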

Unknown Error Bound of the Weight Estimate Process
In the rest of the "Correctness" paragraph [12], the authors merge the previous approximation (1) with a second one, c/n ≈ P(w̄), which is a consequence of the Law of Large Numbers, to conclude that c/n ≈ P_OOV(w̄) (3). The idea behind the merging of the two approximations can be explained in the following way: c/n converges to P(w̄) when n → ∞ (by the Law of Large Numbers), while |P_OOV(w̄) − P(w̄)| converges to 0 when m → ∞ (by the Central Limit Theorem). Thus, c/n gets arbitrarily close to P_OOV(w̄) if both n and m are large enough. Unlike for P(w̄) ≈ P_OOV(w̄), ref. [12] does derive the error of the approximation c/n ≈ P(w̄), and how large n should be in order to make that error negligible (Formula (2) in [12]), where the error term erfc(θ) gets exponentially small as θ (i.e., n) increases. Therefore, c/n is used to estimate P(w̄) for n large enough. However, since the final approximation (3) contains the estimate P(w̄) ≈ P_OOV(w̄), whose error was not assessed in [12], its error is also unknown. A bound on the error of (3) is essential, because if the approximate values P_OOV(w̄) deviate too much from the actual P(w̄) values, this can lead to a wrong decision on w̄. Let us remember that in the OOV attack the weight of the noise vector ē is decided as w̄ if P_OOV^(−1)(c/n) is closest to w̄, over all possible values of w̄.

Main Conclusions
We summarize the mistakes in the theoretical analysis behind the OOV attack from [12], found in the analysis given above, which will turn out to be crucial for the high error rate of the OOV weight estimate:
• the distribution of the Hamming weight of the cumulative noise vector is wrongly assessed as binomial,
• the approximation P(w̄) ≈ P_OOV(w̄) lacks error estimation,
• the error of the weight estimate procedure is unknown.
Since the error bound of P(w̄) ≈ P_OOV(w̄) is unknown, the same consequently holds for the final approximation c/n ≈ P_OOV(w̄), which produces the output of the weight estimate procedure.
In the following sections, we introduce our research process to overcome the listed omissions.

Error Estimation of the Acceptance Rates Approximation P(w̄) ≈ P_OOV(w̄)
First, we infer the standard upper error bound of P(w̄) ≈ P_OOV(w̄) by applying the Berry-Esseen inequality for CLT approximations. The obtained result indicates that the distance between P(w̄) and P_OOV(w̄) could be too large and thus prevent correct weight estimation. Then, we proceed to infer the exact distribution of the acceptance rates and the exact error of this approximation.

Standard Upper Error Bound for CLT Approximations
The approximation P(w̄) ≈ P_OOV(w̄) was derived in [12] using the CLT, which only implies its convergence when m → ∞. The Berry-Esseen inequality refines this result by providing a bound on the maximal error. Here, we show that the weight ||e ⊕ ē|| follows the Poisson-Binomial distribution, not the plain binomial distribution as claimed in [12]. Then we apply a general CLT for non-identically distributed random variables to this distribution in order to obtain P(w̄) ≈ P_OOV(w̄), and we estimate its precision using the Berry-Esseen inequality.

Definition 2.
The Poisson-Binomial distribution is the probability distribution of a sum Σ_{i=1}^n X_i of independent Bernoulli random variables X_1, . . . , X_n with possibly different probabilities of success p_1, . . . , p_n; we denote it by PB(p_1, . . . , p_n). The binomial distribution is the special case of the Poisson-Binomial distribution in which X_1, . . . , X_n share the same probability of success.

Lemma 1. The Hamming weight of the cumulative noise vector follows the Poisson-Binomial distribution, ||e ⊕ ē|| ∼ PB(p_1, . . . , p_m), where m − w̄ of the parameters p_k equal τ and the remaining w̄ equal 1 − τ.

Proof. Since a new noise vector e ← Ber_τ^m is generated in each modification session, while ē remains fixed during all modifications, notice that ||e ⊕ ē|| = Σ_{k: ē_k=0} s_k + Σ_{k: ē_k=1} s̄_k, where s_k and s̄_k are independent Bernoulli random variables such that Pr[s_k = 1] = τ and Pr[s̄_k = 1] = 1 − τ (see Figure 4).

Theorem 1 (General CLT, Lyapunov condition [26]). Let X_1, X_2, . . . be a sequence of independent (and not necessarily identically distributed) random variables with E(X_i) = µ_i and Var(X_i) = σ_i² < ∞, and let s_n² = Σ_{i=1}^n σ_i². If the Lyapunov condition lim_{n→∞} (1/s_n^{2+δ}) Σ_{i=1}^n E|X_i − µ_i|^{2+δ} = 0 holds for some δ > 0, then (1/s_n) Σ_{i=1}^n (X_i − µ_i) converges in distribution to the standard normal distribution as n → ∞.

Theorem 2. The general CLT holds for the Poisson-Binomial distribution; that is, if X ∼ PB(p_1, . . . , p_n), then (X − Σ_{i=1}^n p_i)/s_n ⇀ N(0, 1), where s_n² = Σ_{i=1}^n p_i(1 − p_i).

Proof of Theorem 2. Let µ_i = E(X_i) = p_i. We prove that the Lyapunov condition is satisfied for δ = 1.
Hence, as a direct consequence of Theorem 2 and Lemma 1, we have that (||e ⊕ ē|| − µ)/σ ⇀ N(0, 1), where µ = (m − w̄)τ + w̄(1 − τ) and σ² = mτ(1 − τ). For the cumulative noise vector e ⊕ ē it therefore holds that Pr[||e ⊕ ē|| ≤ thr] ≈ Φ((thr − µ)/σ). In order to estimate the precision of this approximation, we proceed to use the standard error measure for the general CLT:

Theorem 3 (Berry-Esseen inequality for non-identically distributed random variables [27]). Let X_1, . . . , X_n be independent random variables such that E(X_i) = 0, E(X_i²) = σ_i² > 0 and E|X_i|³ = ρ_i < ∞. Then for every n there is an absolute constant C such that sup_x |F_n(x) − Φ(x)| ≤ C · (Σ_{i=1}^n ρ_i)/(Σ_{i=1}^n σ_i²)^{3/2}, where F_n is the cumulative distribution function of (Σ_{i=1}^n X_i)/(Σ_{i=1}^n σ_i²)^{1/2}. It was proven that C₀ ≈ 0.4097 ≤ C ≤ C₁ ([28]), where C₀ is the biggest known lower bound and C₁ the smallest known upper bound for C in the literature, to the best of our knowledge.
Theorem 4 (Berry-Esseen inequality for the Poisson-Binomial distribution). If a random variable X follows the Poisson-Binomial distribution, that is, X ∼ PB(p_1, . . . , p_n), then for every n there is a constant C ∈ [C₀, C₁] such that sup_x |Pr[(X − µ)/σ ≤ x] − Φ(x)| ≤ C · (Σ_{i=1}^n p_i(1 − p_i)(p_i² + (1 − p_i)²))/(Σ_{i=1}^n p_i(1 − p_i))^{3/2}, where µ = Σ_{i=1}^n p_i and σ² = Σ_{i=1}^n p_i(1 − p_i).

Proof. Consider the centered variables Y_i = X_i − p_i, for which E(Y_i) = 0, E(Y_i²) = p_i(1 − p_i) and E|Y_i|³ = p_i(1 − p_i)(p_i² + (1 − p_i)²). The claim follows directly by applying the Berry-Esseen inequality (Theorem 3) to the random variables Y_1, . . . , Y_n.

Lemma 3.
For the cumulative noise vector e ⊕ ē it holds that sup_x |Pr[(||e ⊕ ē|| − µ)/σ ≤ x] − Φ(x)| ≤ C(τ² + (1 − τ)²)/√(mτ(1 − τ)).

Proof. This is a direct consequence of Theorem 4, taking into consideration that every parameter p_k ∈ {τ, 1 − τ} contributes the same third moment p_k(1 − p_k)(p_k² + (1 − p_k)²) = τ(1 − τ)(τ² + (1 − τ)²), while Σ_{k=1}^m p_k(1 − p_k) = mτ(1 − τ).

As a consequence of this Lemma, by taking x = (thr − µ)/σ, with µ = (m − w̄)τ + w̄(1 − τ) and σ² = mτ(1 − τ), the standard Berry-Esseen upper bound for the error of the approximation P(w̄) ≈ P_OOV(w̄) is |P(w̄) − P_OOV(w̄)| ≤ C̄ := C(τ² + (1 − τ)²)/√(mτ(1 − τ)). (5)

By Formula (5), the exact P(w̄) lies somewhere in the interval [P_OOV(w̄) − C̄, P_OOV(w̄) + C̄]. However, this interval is wide: it covers the interval in which the adversary has to decide between the adjacent weights w̄ and w̄ + 1 (see Figure 5). The adversary may therefore be incapable of determining accurately whether c/n is closest to P(w̄) or to P(w̄ + 1), which directly jeopardizes his decision making. For example, if c/n is in the position marked in Figure 6, the adversary will decide that the weight is w̄, because P_OOV(w̄) is closest to it; but since c/n is in a possible location of P(w̄ + 1), it could in fact be closest to P(w̄ + 1), and the actual weight could be w̄ + 1. In order to investigate the possibility of such erroneous weight conclusions due to the high approximation error, in the next section we determine precisely where the P(w̄) values lie.
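A sketch of evaluating this Berry-Esseen bound numerically; the constant C = 0.5600 used below is a commonly cited upper bound on the Berry-Esseen constant for the non-identically distributed case and should be treated as an assumption, as are the parameter values:

```python
import math

# Evaluating the Berry-Esseen bound on |P(w) - P_OOV(w)|: every trial of the
# Poisson-Binomial sum contributes the same third absolute moment
# p(1-p)(p^2 + (1-p)^2), which is symmetric under p <-> 1-p. The constant
# C = 0.5600 is a commonly cited upper bound for the non-iid case (an
# assumption here, standing in for C_1 in the text).

def berry_esseen_bound(m, tau, C=0.5600):
    var = m * tau * (1 - tau)                              # sigma^2
    rho = m * tau * (1 - tau) * (tau**2 + (1 - tau)**2)    # sum of 3rd moments
    return C * rho / var**1.5   # equals C*(tau^2 + (1-tau)^2)/sqrt(var)

b = berry_esseen_bound(441, 0.125)
assert 0.0 < b < 1.0
assert berry_esseen_bound(4 * 441, 0.125) < b   # shrinks only as m grows
```

The bound decays only as 1/√m, and m is fixed by the protocol, which is exactly why the interval around P_OOV(w̄) cannot be shrunk by the adversary.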

The Exact Distribution of the Acceptance Rates
Here, we calculate the exact acceptance rate of HB# and Random-HB# protocols while under the OOV attack, by using Lemma 1 from Section 3.2.1.

Theorem 5.
Let P(w̄) = Pr[||e ⊕ ē|| ≤ thr] denote the probability of successful authentication after a MIM modification using a triplet (ā, b̄, z̄ = āX ⊕ b̄Y ⊕ ē) of exchanged messages caught in a Random-HB# or HB# protocol session, where w̄ = ||ē||. Then: P(w̄) = Σ_{j=0}^{min(w̄, thr)} C(w̄, j)(1 − τ)^j τ^{w̄−j} · Σ_{i=0}^{min(thr−j, m−w̄)} C(m − w̄, i) τ^i (1 − τ)^{m−w̄−i}. In addition, if c is the number of successful authentications after n MIM modifications, then the acceptance rate satisfies c/n ∼ Bin(n, P(w̄))/n.

Proof.
Since ||e ⊕ ē|| = Σ_{k: ē_k=0} s_k + Σ_{k: ē_k=1} s̄_k, where s_k ← Ber_τ and s̄_k ← Ber_{1−τ} (see the proof of Lemma 1), we have that P(w̄) = Pr[Σ_{k: ē_k=0} s_k + Σ_{k: ē_k=1} s̄_k ≤ thr], which expands into the stated double sum. (The number of successes in w̄ Bernoulli experiments cannot exceed w̄, and similarly for m − w̄, which explains the upper summation limits.)
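The exact acceptance rate of Theorem 5 can be computed directly from the double sum, splitting ||e ⊕ ē|| into its two independent binomial parts; the parameter values below are illustrative:

```python
import math

# Exact acceptance rate P(w) from Theorem 5: the weight splits as A + B with
# A ~ Bin(m - w, tau) (positions where e_bar is 0) and B ~ Bin(w, 1 - tau)
# (positions where e_bar is 1), so P(w) = sum_j Pr[B = j] * Pr[A <= thr - j].
# Parameter values are illustrative.

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def p_exact(w, m, tau, thr):
    total = 0.0
    for j in range(min(w, thr) + 1):              # successes among the w bits
        tail = sum(binom_pmf(i, m - w, tau)
                   for i in range(min(thr - j, m - w) + 1))
        total += binom_pmf(j, w, 1 - tau) * tail
    return total

m, tau, thr = 441, 0.125, 113
assert 0.0 < p_exact(77, m, tau, thr) < 1.0
assert p_exact(60, m, tau, thr) > p_exact(90, m, tau, thr)  # decreasing in w
```

Since the double sum involves only binomial coefficients, all the values P(0), . . . , P(m) can be tabulated once per parameter set.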

Exact Error of the Approximation P(w) ≈ P OOV (w)
Finally, we are able to derive the exact error of the P_OOV approximation as err(w̄) = |P(w̄) − P_OOV(w̄)| = |P(w̄) − Φ((thr − µ)/σ)|, where µ = (m − w̄)τ + w̄(1 − τ), σ² = mτ(1 − τ), and P(w̄) is the exact acceptance rate from Theorem 5. Although, in theory, this error diminishes for m large enough (see Figure 7), in the OOV attack m is the dimension of the secret matrices. Thus, this error is a constant intrinsic to the protocol, and the adversary is unable to manipulate it. The exact error values for the standard protocol parameters are shown in Figure 8. Note that the error grows as w̄ approaches the claimed optimal weight w_opt, where it reaches its maximum. This weight is 228 for the standard parameter set I, while it is 77 for set II.
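Combining the exact rate with the CLT approximation gives err(w̄) numerically; the sketch below reuses both computations with illustrative parameters:

```python
import math

# Numerical evaluation of the exact error err(w) = |P(w) - P_OOV(w)|,
# combining the exact Poisson-Binomial tail with the CLT approximation.
# Parameter values are illustrative.

def phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def p_exact(w, m, tau, thr):
    return sum(binom_pmf(j, w, 1 - tau) *
               sum(binom_pmf(i, m - w, tau)
                   for i in range(min(thr - j, m - w) + 1))
               for j in range(min(w, thr) + 1))

def p_oov(w, m, tau, thr):
    mu = (m - w) * tau + w * (1 - tau)
    return phi((thr - mu) / math.sqrt(m * tau * (1 - tau)))

m, tau, thr = 441, 0.125, 113
errors = {w: abs(p_exact(w, m, tau, thr) - p_oov(w, m, tau, thr))
          for w in range(60, 96, 5)}
assert max(errors.values()) > 1e-3   # a fixed, non-negligible error remains
```

The error is a constant of the parameter set: no amount of extra sampling by the adversary changes it, in line with the discussion above.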

Proper Decision Zones
The OOV decision zones, which are based on the values of the inverse function P_OOV^(−1)(c/n), have the following potential drawbacks in the general case:
• the inverse function might not preserve the ratios of distances, so, for example, it is possible that P_OOV^(−1)(c/n) is closer to w̄ than to w̄ + 1, while c/n is actually closer to P(w̄ + 1) than to P(w̄),
• P_OOV is used as an approximation of the exact acceptance rates P with unknown precision,
• w̄ should be determined by considering which of the possible distributions c/n is most likely sampled from, i.e., by probabilistic reasoning, instead of by simply applying the inverse function to the value c/n.
We employ Bayesian reasoning over the exact distributions of the acceptance rates to construct proper decision zones. The noise vector weight ||ē|| is estimated as w̄ if the observed empirical acceptance frequency c/n most likely follows the exact distribution Bin(n, P(w̄))/n, w̄ ∈ W, where W is the set of all weights the adversary considers possible. The general weight decision rule is w̄ = argmax_{w∈W} Pr[||ē|| = w | observed acceptance rate c/n] = argmax_{w∈W} {P(w)^{c/n}(1 − P(w))^{1−c/n} · P_occur(w)^{1/n}}, where P_occur(w) = Pr[||ē|| = w] is the probability of occurrence of a noise vector whose weight is w. By a logarithmic transformation, we obtain that the adversary decides the noise vector weight is w̄ iff w̄ = argmax_{w∈W} {c_0(w) + (c/n) · c_1(w)}, where c_0(w) = log(1 − P(w)) + log(P_occur(w))/n and c_1(w) = log(P(w)/(1 − P(w))). After mere eavesdropping, P_occur(w) = C(m, w)τ^w(1 − τ)^{m−w}. If the eavesdropped vector was flipped in f positions to reach the optimal weight, P_occur(w) = PB(f, m, τ, w) − PB(f, m, τ, w − 1). When recovering bits, P_occur(w̄ − 1) = τ and P_occur(w̄ + 1) = 1 − τ. The values P(w) and P_occur(w) may be calculated in advance, so the decision making is highly efficient.
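The Bayesian decision rule above can be sketched as follows, with the eavesdropping prior C(m, w)τ^w(1 − τ)^{m−w}; the acceptance-rate table and parameter values are illustrative:

```python
import math

# Sketch of the Bayesian weight decision: maximize c0(w) + (c/n)*c1(w) over
# candidate weights, where P(w) is the exact acceptance rate of Theorem 5 and
# the prior is the eavesdropping prior. Parameter values are illustrative.

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def p_exact(w, m, tau, thr):
    return sum(binom_pmf(j, w, 1 - tau) *
               sum(binom_pmf(i, m - w, tau)
                   for i in range(min(thr - j, m - w) + 1))
               for j in range(min(w, thr) + 1))

def bayes_weight(c, n, p_table, m, tau):
    f = c / n
    def score(w):
        p = p_table[w]
        prior = binom_pmf(w, m, tau)      # P_occur(w) after eavesdropping
        # equals c0(w) + (c/n)*c1(w) up to the same logarithmic transform
        return f * math.log(p) + (1 - f) * math.log(1 - p) \
               + math.log(prior) / n
    return max(p_table, key=score)

m, tau, thr = 441, 0.125, 113
table = {w: p_exact(w, m, tau, thr) for w in range(70, 86)}  # precomputed
n = 1000
c = round(n * table[77])                  # frequency as if ||e_bar|| = 77
assert bayes_weight(c, n, table, m, tau) == 77
```

Note how the prior enters divided by n: for the large samples used in the attack its influence fades, which motivates the simplification discussed next.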
However, when considering weights near w_opt and the standard parameter sets, for n large enough, the decision making can be further simplified. Namely, after comparing the variances Var(w) = P(w)(1 − P(w))/n of the exact distributions for the consecutive weights w − 1, w and w + 1 in such a case, we have found their differences insufficient to impact the Bayesian decision. Also, the probabilities of occurrence of these weights produce negligible priors (observe the division by n in c_0(w)). The Bayesian rule then reduces to choosing the weight whose exact acceptance rate is nearest to c/n, that is, to the decision zones I_PB(w̄) = ((P(w̄) + P(w̄ + 1))/2, (P(w̄ − 1) + P(w̄))/2] when estimating the weight after eavesdropping, and I_PB(w̄ − 1) = (P(w̄), ∞), I_PB(w̄ + 1) = (−∞, P(w̄)] when deciding between w̄ + 1 and w̄ − 1 after flipping a bit. We shall also call them "PB-decision zones", since they use the exact values P(w̄) = PB(w̄). Figure 9 provides a graphical illustration of the PB-decision zones used in the processes of weight estimation after eavesdropping and bit recovery. Expressed more formally, the probability that c/n is sampled from N(P(w̄), Var(w̄)) is greater than the probability that it is sampled from N(P(w̄ + i), Var(w̄ + i)) iff the observed frequency c/n is closer to P(w̄) than to P(w̄ + i) (P is a monotone decreasing function).
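Under this simplification, the PB-decision zones amount to nearest-P(w̄) classification; a toy sketch with made-up acceptance-rate values:

```python
# Simplified PB-decision zones: the nearest exact acceptance rate wins, so
# zone boundaries sit at midpoints (P(w)+P(w+1))/2; after a bit flip the
# single boundary is P(w) itself. The table values below are made up for
# illustration and are not computed from a standard parameter set.

def pb_decide(freq, p):
    # full decision: candidate whose exact rate is nearest to c/n
    return min(p, key=lambda w: abs(p[w] - freq))

def pb_decide_after_flip(freq, p, w):
    # only w - 1 and w + 1 are possible; the boundary is P(w)
    return w - 1 if freq > p[w] else w + 1

p = {75: 0.62, 76: 0.58, 77: 0.54, 78: 0.49}   # illustrative P(w) values
assert pb_decide(0.55, p) == 77
assert pb_decide_after_flip(0.58, p, 77) == 76  # higher rate: weight dropped
assert pb_decide_after_flip(0.50, p, 77) == 78
```

Because P is monotone decreasing, the nearest-rate rule and the midpoint intervals describe the same classifier.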

The Exact and the Approximate Probability Distribution Relation
We now show that the decisions the adversary makes about noise vector weights can differ depending on whether he uses the OOV approximation or the exact distribution.
We have noticed that the OOV w-decision zones are substantially shifted to the left with respect to the PB w-decision zones, and that they often largely overlap with the correct PB (w + 1)-decision zone (see Figure 10). As a consequence, there is a high chance that the OOV adversary decides the weight is w while the actual weight is w + 1. This adverse phenomenon is especially pronounced in the expected case, when w̄ is near the optimal weight w_opt, since the distance between P(w̄) and P_OOV(w̄) is largest there, degrading significantly the precision of the weight estimate.
Furthermore, the shift of the OOV decision zones cannot be repaired by employing a larger "sample size" n, that is, a larger number of intercepted authentications, because the approximation P_OOV(w̄) ≈ P(w̄) has a high fixed error in this scenario (as shown in the previous section). The convergence c/n → P_OOV(w̄) occurs only when both m → ∞ and n → ∞. However, in the context of the OOV attack, m is a constant protocol parameter, and thus lim_{n→∞} c/n = P(w̄) ≠ P_OOV(w̄). This explains the experimental observation from [25] that the weight estimate does not improve with increasing sample size.

Correction of the OOV Attack
In this section, we give a correction of the OOV-MIM attack and show that it meets the targeted precision, unlike the original attack.

Correction of the OOV Attack Algorithm
In order to solve the problem of the high error of the approximation P_OOV(w̄) ≈ P(w̄), we eliminate this approximation altogether, since we have shown that it cannot be improved. Instead, we employ the acceptance rates obtained from the exact distribution. That is, instead of P_OOV(w̄) = Φ((thr − µ)/σ), we use the Poisson-Binomial cumulative distribution function P(w̄) = Pr[||e ⊕ ē|| ≤ thr] given by Theorem 5. Then, we incorporate it into the proper, Bayesian decision zones described in Section 3.3, with their corresponding optimal weights and modification samples. The Hamming weight of the noise vector ē is estimated as w̄ if and only if c/n is nearest to P(w̄), over all weights w̄ ∈ {0, . . . , m} considered possible.
Hence, the pseudocode of the proposed correction of the weight estimate procedure is given in Algorithm 1.

Algorithm 1 PB-OOV weight estimate alg. Approximating w̄ = ||ē||
1: Input: ā, b̄, z̄ = āX ⊕ b̄Y ⊕ ē, n
2: Output: estimate of the noise vector weight w̄ = ||āX ⊕ b̄Y ⊕ z̄||
3: Processing:
4: c = 0
5: for i = 1 . . . n do
6:   During the i-th session, the adversary modifies and replaces messages:
7:   a with â = a ⊕ ā, b with b̂ = b ⊕ b̄, z with ẑ = z ⊕ z̄
8:   if the Verifier accepts the modified response then c = c + 1
9: end for
10: return the w̄ for which c/n ∈ I_PB(w̄)

Since the PB decision zone for w̄ contains the interval I = [P(w̄) − r̄, P(w̄) + r̄], where r̄ = (1/2) min{P(w̄) − P(w̄ + 1), P(w̄ − 1) − P(w̄)}, the PB-OOV adversary, after the eavesdropping, chooses a sample of size n_PB = 4θ²R_PB(w̄), with R_PB(w̄) = 2P(w̄)(1 − P(w̄))/r̄², to achieve the required precision 1 − erfc(θ); this sample size is based on the exact values PB(w̄) instead of the approximate ones P_OOV(w̄) used in Formula (2) from [12]. Accordingly, he uses the optimal weight w_opt^PB which minimizes this sample size across all weights; its value is 229 for parameter set I and 78 for parameter set II. After the flipping, he uses samples of size θ²R_PB(w̄) to recover bits.
It should be noted that the values P(w), w = 0, . . . , m can be calculated in advance, as a part of the preprocessing step, and stored in a table to be later used during the attack.
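Putting the pieces together, a compact end-to-end sketch of the PB-OOV weight estimate: precompute the exact table P(w̄), run n modified sessions against a simulated Reader, and classify by the nearest table entry. The parameter values, candidate range and session count are illustrative, and the Reader is simulated rather than a real protocol party:

```python
import math, random

# End-to-end sketch of the PB-OOV weight estimate (Algorithm 1): precompute
# the exact table P(w), run n modified sessions against a simulated Reader,
# then classify the observed acceptance frequency by the nearest table entry.
# Parameters, candidate range and session count are illustrative.

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def p_exact(w, m, tau, thr):
    return sum(binom_pmf(j, w, 1 - tau) *
               sum(binom_pmf(i, m - w, tau)
                   for i in range(min(thr - j, m - w) + 1))
               for j in range(min(w, thr) + 1))

def estimate_weight(e_bar, m, tau, thr, n, table, rng):
    c = 0
    for _ in range(n):
        e = [rng.random() < tau for _ in range(m)]        # fresh session noise
        c += sum(a ^ b for a, b in zip(e, e_bar)) <= thr  # Reader's check
    freq = c / n
    return min(table, key=lambda w: abs(table[w] - freq))

m, tau, thr = 441, 0.125, 113
table = {w: p_exact(w, m, tau, thr) for w in range(70, 86)}  # preprocessing
rng = random.Random(7)
w_true = 78
e_bar = [1] * w_true + [0] * (m - w_true)
est = estimate_weight(e_bar, m, tau, thr, 4000, table, rng)
assert abs(est - w_true) <= 2
```

The table build is the one-off preprocessing step mentioned above; during the attack each decision is a single nearest-entry lookup.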

Comparison of the OOV and PB-OOV Attack Success
In this section, we analyze the probability of success of the OOV and PB-OOV attacks. Namely, we derive the probability that the OOV adversary correctly reconstructs a noise vector (and consequently recovers the key) and show that, as a consequence of the approximation employed, the OOV attack is significantly less efficient than claimed in [12]. In contrast, the PB-OOV attack proposed in Section 4.1 achieves the desired precision and efficiency.

Noise Vector Hamming Weight Estimate
OOV adversary. First, let us observe the distribution of the acceptance rate c/n during the attack when ||ē|| = w̄: by the normal approximation of Bin(n, P(w̄))/n, c/n ∼ N(P(w̄), σ²), where σ² = σ²(w̄, n) := P(w̄)(1 − P(w̄))/n. The probability that the OOV adversary estimates that the noise vector ē has weight w_est, when its weight is w̄ (which may or may not be equal to w_est), using n modifications of authentication sessions, is: p_0(w_est, w̄, n) = Pr[c/n ∈ I_OOV(w_est)] = Φ((P_OOV(w_est − 1/2) − P(w̄))/σ(w̄, n)) − Φ((P_OOV(w_est + 1/2) − P(w̄))/σ(w̄, n)). (8) Therefore, the adversary makes the correct decision when ||ē|| = w̄, using n modifications, with probability p_0(w̄, w̄, n).
After evaluating Formula (8), we have found that the weight will either be estimated as one lower (when the adversary is wrong, which happens the majority of the time for weights near the expected ones) or be guessed correctly; all other cases appear with negligible probability (see Table 2). This supports the experimental findings from [25]. Table 2 shows the comparison between the claimed and real precision p_0(w̄, w̄, n) of the OOV weight estimate (see Appendix A, Table A1, for details on the parameter values). It can be noticed that, for parameter set II, in the case of Random-HB#, the real precision is two orders of magnitude smaller than the claimed one. In all other cases, the discrepancy is somewhat smaller, but the real precision is still an order of magnitude below the claimed one.

PB-OOV adversary. Since I_PB(w) = ((P(w) + P(w + 1))/2, (P(w − 1) + P(w))/2], w = 1, . . . , m, is the PB w-decision zone after the eavesdropping, and the PB-OOV adversary uses the exact values P(w) instead of the approximate ones P_OOV(w), by an analysis analogous to the one above we obtain that he estimates the weight as w_est, when its actual value is w̄, with probability: p_0(w_est, w̄, n) = Φ(((P(w_est − 1) + P(w_est))/2 − P(w̄))/σ(w̄, n)) − Φ(((P(w_est) + P(w_est + 1))/2 − P(w̄))/σ(w̄, n)). Unlike the OOV weight estimate, whose precision is shown to be remarkably lower than claimed, the precision of the PB-OOV weight estimate is within the given boundaries (see Table 3). This is also confirmed by the experimental results presented in Section 5.3.

Noise Vector Bits Recovery
Here, we compare the success rate of the OOV adversary and PB-OOV adversary when it comes to the reconstruction of noise vectors, that is, bit recovery.
OOV adversary. After the adversary has estimated the weight of the observed vector ē as w_est after eavesdropping, he tries to recover its bits by flipping each bit ē_i one by one and estimating the new weight as w_est − 1 or w_est + 1. If the weight has decreased, he concludes that the flipped bit is 1; otherwise, that the bit is 0. Therefore, he recovers a bit correctly, depending on its value, with probabilities: p_0^i(w_est, w̄, n) = Pr[c/n ∈ OOV decision zone for w_est + 1] = Φ((P_OOV(w_est) − P(w̄ + 1))/σ(w̄ + 1, n)), if ē_i = 0, and p_1^i(w_est, w̄, n) = Pr[c/n ∈ OOV decision zone for w_est − 1] = 1 − Φ((P_OOV(w_est) − P(w̄ − 1))/σ(w̄ − 1, n)), if ē_i = 1. (9) The results of the evaluation of Formula (9) are shown in Table 4. First, it should be noted that the probability of bit recovery is very asymmetrical, that is, the precision for 0-bit recovery is very different from the precision for 1-bit recovery, while the claimed precision is uniform for both bit values. Secondly, when the weight is correctly estimated, the precision for a 0-bit is much lower than claimed, and it would make reconstruction of the noise vector (and further the key recovery itself) practically impossible. This is in accordance with the experimental results from [25]. On the other hand, the OOV adversary has more success in bit recovery when the initial weight estimate is incorrect, since the relative change remains intact if the measured weights are both one lower than the actual ones. The two errors made in the weight estimation processes can thus neutralize each other; however, even with this mutual cancellation of errors, the claimed precision is not achieved. Namely, the precision for 1-bit recovery is lower than the targeted 1 − (1/2)erfc(θ), and that lowers the probability of the attack's success.
Let us further consider the probability that the OOV adversary successfully recovers a complete noise vector. We observe the expected case |ē| = w_exp (= w_opt for parameter set II). As we have already noted: (a) w_est is either |ē| or |ē| − 1, and (b) the noise vector is practically impossible to recover when w_est = |ē|, due to the too high error for 0-bits. Thus, for parameter set II, the probability that the OOV adversary successfully recovers a complete m-bit noise vector of weight |ē| = w_opt is given by Formula (10), where p_0 = p_0(w_opt − 1, w_opt, 4θ²R(w_opt)) and p_ik = p_ik(w_opt − 1, w_opt, θ²R(w_opt − 1)), k = 0, 1. Similarly, for parameter set I, the adversary needs to recover and remove ∆ = w_est − w_opt errors in a noise vector in order to achieve the optimal weight, which is expected to happen after recovering ∆/τ bits; this yields Formula (11). Using Formulas (10) and (11), we can evaluate the probability that the OOV adversary correctly recovers a complete noise vector in the expected case, and compare the obtained probability with the claimed one, which is calculated from the claimed probabilities of a correct weight estimate and a correct bit guess as (1 − erfc(θ))(1 − (1/2)erfc(θ))^m. The results of the comparison are given in Table 5. Although the difference between the claimed and real precision at the noise vector level does not seem remarkable for Random-HB#, it does make a significant impact on the key recovery probability, having in mind the number of noise vectors that have to be reconstructed, which is 592. More details are provided in Section 4.2.3.
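The structure of these probabilities can be made concrete with a small sketch (the p-values passed in are placeholders for the exact terms defined above): the full-vector success probability is a product of one weight-estimation term and one bit-recovery term per bit, while the claimed precision from [12] has the analogous closed form (1 − erfc(θ))(1 − (1/2)erfc(θ))^m.

```python
from math import erfc

def vector_recovery_prob(p0, p_bit0, p_bit1, m, w):
    """Product form behind Formula (10): the weight estimate must fall in
    the right decision zone (p0), then each of the m - w zero bits and the
    w one bits must be recovered correctly (p_bit0, p_bit1)."""
    return p0 * p_bit0 ** (m - w) * p_bit1 ** w

def claimed_vector_success(theta, m):
    """Claimed probability from [12]: correct weight estimate times a
    uniform per-bit success of 1 - erfc(theta)/2 over all m bits."""
    return (1 - erfc(theta)) * (1 - 0.5 * erfc(theta)) ** m

# Illustrative (hypothetical) parameter values only:
print(claimed_vector_success(2.26, 441))
```

Because the per-bit term is raised to the power m, even a small per-bit shortfall relative to 1 − (1/2)erfc(θ) compounds sharply at the vector level, which is why the bit-level asymmetry matters so much.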

PB-OOV adversary.
For the PB-OOV adversary, by replacing P_OOV with P, and P_OOV(w_est) with (P(w_est − 1) + P(w_est + 1))/2 (i.e., by using the proper PB-w decision zones) in the derivation above, the probabilities of successful bit recovery, depending on the bit value, are given by Formula (12). Table 6 shows the precision the PB-OOV adversary achieves in the bit recovery process when the standard parameter sets are employed. The results obtained by evaluating Formula (12) prove that the PB-OOV attack at the bit level does achieve the targeted precision using the OOV sample size (i.e., the number of modifications). This is also confirmed by the experimental results presented in Section 5.3. Further, we analyze the probability that the PB-OOV adversary successfully recovers a complete noise vector. We observe the expected case |ē| = w_exp (= w_opt for parameter set II). For parameter set II, the probability is given by Formula (13), where p_0 = p_0(w_opt, w_opt, θ²R(w_opt)) and p_ik = p_ik(w_opt, w_opt, θ²R(w_opt)), k = 0, 1. Similarly, for parameter set I, the probability is given by Formula (14), where p_0 = p_0(w_exp, w_exp, 4θ²R(w_exp)), p*_ik = p_ik(w_exp, w_exp, θ²R(w_exp)), and p_ik = p_ik(w_opt, w_opt, θ²R(w_opt)), k = 0, 1. Using Formulas (13) and (14), we can evaluate the probability that the PB-OOV adversary correctly recovers a complete noise vector in the expected case. Table 7 shows the results of this evaluation, which confirm that the PB-OOV attack does meet the targeted precision.

Evaluation of the Acceptance Rates
We have conducted a set of experiments to confirm the convergence of the experimentally obtained acceptance rates to the corresponding PB values. There were 4 rounds of tests, for n = 2500, n = 5000, n = 10,000 and n = 15,000. For each n, we generated 500 noise vectors and flipped the appropriate number of their last bits, so that the expected weight of the noise vectors is optimal, that is, 78. For each test vector e_i, we measured the acceptance rate and analyzed how it relates to P_OOV(|e_i|) and PB(|e_i|). In general, it can be noted that the experimental acceptance rates lie above the corresponding OOV points, while compared to the corresponding PB points they are evenly distributed above and below (see Figure 11). It can also be noticed that, as n increases, the experimental points concentrate around the PB points, as expected. This further explains and confirms that the OOV algorithm, relying on the OOV approximation, has a high error rate when it comes to weight estimation, while the corrected PB-OOV algorithm gives much better results.
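The convergence behaviour described above is easy to reproduce in miniature (the acceptance probability 0.47 below is an arbitrary stand-in for the true PB value of a test vector): simulating n accept/reject sessions and measuring c/n shows the empirical rate concentrating around the true value as n grows.

```python
import random

def empirical_acceptance_rate(p_true, n, rng):
    """Simulate n modified authentication sessions, each accepted
    independently with probability p_true, and return the observed
    acceptance rate c/n."""
    c = sum(rng.random() < p_true for _ in range(n))
    return c / n

rng = random.Random(1)
for n in (2500, 5000, 10000, 15000):
    print(n, empirical_acceptance_rate(0.47, n, rng))
```

The standard deviation of c/n scales as 1/sqrt(n), so the experimental points cluster ever more tightly around the true (PB) value, while any systematic offset of the OOV approximation from that value remains visible no matter how large n becomes.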
Furthermore, we compare the experimentally obtained acceptance rates c(e_i)/n with the OOV and PB reference points, i.e., P_OOV(|e_i|) and PB(|e_i|), using a standard error measure, the mean absolute error (MAE), and show how it relates to the correctness of the weight estimates. That is, for a set {e_i}, i = 1, . . . , N, of test noise vectors, we observe the MAE between the acceptance rates c(e_i)/n, where n is the number of intercepted authentication sessions (i.e., modifications), and the reference points P_OOV(|e_i|) and PB(|e_i|). Consequently, as n → ∞, the expected MAE value for the OOV points, across all possible weights |e_i| after flipping the f last bits in e_i, converges to a nonzero constant (the average distance between the PB and OOV points), while for the PB points it converges to E(Avg_dist_PB_∞) = 0. This is in accordance with the experimental results shown in Figure 12, for different numbers of modifications n. Furthermore, Figures 12 and 13 show that there is an inverse correlation between the distance (between the experimental and OOV points, i.e., PB points, respectively) and the accuracy of the weight estimation.
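The error measure used here is the standard MAE; a minimal sketch over a batch of observed rates and their reference points:

```python
def mae(observed_rates, reference_points):
    """Mean absolute error between the observed acceptance rates c(e_i)/n
    and the corresponding reference points (P_OOV or PB)."""
    assert len(observed_rates) == len(reference_points)
    return (sum(abs(r - q) for r, q in zip(observed_rates, reference_points))
            / len(observed_rates))

# Toy illustration with made-up rates around a reference point of 0.47:
print(mae([0.471, 0.469, 0.474], [0.47, 0.47, 0.47]))
```

Computing this once against the P_OOV points and once against the PB points, for the same batch of experimental rates, yields the two curves compared in Figure 12.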

Precision Comparison of the OOV and PB-OOV Weight Estimate: Experimental
Here, the differences in weight estimation quality between the original OOV Algorithm 1 and the PB-OOV Algorithm 1 proposed in Section 4.1 are experimentally demonstrated. We have analyzed and compared the effectiveness of the two algorithms for different Hamming weights. For the standard parameter set I, 99% of all noise vectors have a weight between 250 and 330. The comparison of the algorithms is based on a sample of 5000 noise vectors whose Hamming weight is from that interval. The number of modifications employed for weight estimation corresponds to the HB# scenario. The success rate of the OOV algorithm is 20%, while for the PB-OOV algorithm it is 98%. Detailed results are given in Figure 14a. For the standard parameter set II, 99% of all noise vectors have a weight between 60 and 95 (after flipping (w_opt − mτ)/(1 − 2τ) bits to obtain a vector of the optimal weight from a vector of the expected weight), and the comparison of the two algorithms is based on a sample of 5000 noise vectors with Hamming weight in this interval. The number of modifications employed for weight estimation again corresponds to the HB# scenario. The experimental results again show that the success rate of the OOV algorithm is much worse than that of PB-OOV (11% in contrast to 99%). Details are given in Figure 14b.

Evaluation of the PB-OOV Attack Precision
In Section 2.1 of [12], the authors derive the error formula and calculate the number of modifications n that should provide the aimed accuracy of the OOV attack, that is, of the weight estimate and bit recovery. However, the analysis given in Section 4.2 shows that the precision deviates significantly from the claimed one, and provides the theoretical proof that supports the experimental findings presented in [25]. On the other hand, the analysis of the proposed PB-OOV algorithm, also given in Section 4.2, shows that this algorithm does achieve the desired precision and efficiency. We have conducted a series of experiments in order to verify the correctness of the PB-OOV attack experimentally. The experimental results presented in this section support the conclusions of the theoretical analysis.
The tests were conducted for both the HB# and Random-HB# protocols and parameter set II. The number of modifications used (the "sample size") is the one from [12]. For the HB# protocol, we tested the weight estimation and bit recovery precision for 2000 randomly generated noise vectors of the optimal weight. The weights of two noise vectors were incorrectly estimated as 79, since the obtained acceptance rates were 0.473227 and 0.475217. This gives a success rate of 0.999 in the weight estimation step. When the weight of a vector is incorrectly guessed, it further causes a high error rate in the bit recovery process, since the algorithm relies on the initial weight estimate w_est and chooses between w_est − 1 and w_est + 1 after flipping the observed bit. However, when the weight estimate is correct, the targeted bit precision is 1 − (1/2)erfc(θ), and our tests verify that the PB-OOV attack complies with this. Namely, over the set of noise vectors whose weight is correctly estimated, the average bit guessing success rate in our test is 0.999342, compared to the targeted 0.999320. For Random-HB#, we randomly generated 25,000 noise vectors of the optimal weight. The PB-OOV attack correctly estimated all weights, while the achieved average bit guessing success rate was 0.999996, which is in line with the targeted precision. An interesting finding regarding the OOV attack is that the bit guessing precision may differ significantly for 0-bits and 1-bits; for example, in the case of HB# and parameter set II, the precision for a 0-bit is 0.764623, while for a 1-bit it is remarkably higher and equal to 1 − 5.5 × 10^−9 (see Table 4). The proposed PB-OOV algorithm, on the other hand, does not exhibit this strong and distinct bias. Table 8 summarizes the results of the tests.

Conclusions
This paper provides a detailed examination of the OOV attack reported in [12] against the LPN-based authentication protocols known as HB# and Random-HB#. We have found that the discrepancy between the theoretically estimated performance and complexity in [12] and the experimentally evaluated ones in [25] arises from non-Bayesian reasoning with inadequate approximations of the probability distributions of the acceptance rates during the attack, which cannot be improved due to the limitations of the Central Limit Theorem in the attack context. We correct the attack by employing proper Bayesian inference, after establishing the exact underlying probability distributions, and prove that the new version of the attack, unlike the original one, achieves the targeted precision and complexity.
Since the OOV attack is recognized as one of the cornerstones in the analysis of any HB-like authentication protocol, our correction of the OOV attack is significant not only against Random-HB# and HB#, but also for the practical security analysis of all new members of the HB family. An interesting future direction could be the design of improved MIM attacks against HB-like protocols, based on the corrected version of the OOV attack proposed in this paper.

Conflicts of Interest:
The authors declare no conflict of interest.