A New LSB Attack on Special-Structured RSA Primes

Abstract: Asymmetric key cryptosystems are a vital element in securing our communication in cyberspace. They encrypt transmitted data and authenticate its origin and integrity. The Rivest–Shamir–Adleman (RSA) cryptosystem is regarded as one of the most widely deployed public-key cryptosystems today. Previous attacks on the cryptosystem focus on weakening the hardness of the integer factorization problem embedded in the RSA modulus, $N = pq$. The adversary uses several assumptions to enable the attacks. For example, if $p$ and $q$ satisfy Pollard's weak prime structures, or if the least significant bits (LSBs) of $p$ and $q$ are partially known, then $N$ can be factored in polynomial time, thus breaking the security of RSA. In this paper, we utilize both assumptions. First, we assume that $p$ and $q$ satisfy specific structures where $p = a^m + r_p$ and $q = b^m + r_q$ for positive integers $a$, $b$ and a positive even number $m$. Second, we assume that the bits of $r_p$ and $r_q$ are the known LSBs of $p$ and $q$ respectively. In our analysis, we factor $N$ in polynomial time using both assumptions. We also count the number of primes that are affected by our attack. Based on the result, the attack may pose a great danger to the users of RSA if no countermeasure is developed to resist it.


Introduction
One of the earliest asymmetric key cryptosystems is the Rivest-Shamir-Adleman (RSA) cryptosystem, introduced by Rivest, Shamir and Adleman in 1978 [1]. Its simple and easy-to-understand mathematical design made it compelling to use in the early days of digital technology. Since then, it has been considered the most widely known asymmetric key cryptosystem. In its key generation algorithm, an RSA modulus $N = pq$ is computed, where $p$ and $q$, called RSA primes, are two distinct primes such that $p < q < 2p$. From the values of $p$ and $q$, another parameter called the RSA public exponent, $e$, is obtained, which satisfies $e < \phi(N)$ and $\gcd(e, \phi(N)) = 1$ where $\phi(N) = (p-1)(q-1)$. An RSA private exponent $d$ that satisfies $ed \equiv 1 \pmod{\phi(N)}$ is then computed. The security of RSA rests in part on the integer factorization problem, which is embedded in the RSA modulus since $p$ and $q$ are very large $n$-bit primes (typically, $n = 1024$). The problem is deemed infeasible to solve with current computing machines, and the best known algorithm for it, the general number field sieve (GNFS) [2], still runs in sub-exponential time.
The factoring method introduced by Pollard in 1974 [3] shows that primes with particular structures are vulnerable to being factored in polynomial time, which is easily achieved by any modern computer.
In his attack, Pollard showed that if $p - 1$ or $q - 1$ is composed of small primes, then there is an algorithm that factors $N = pq$ in polynomial time. Another method of attacking RSA assumes that several bits of $p$ and $q$ are known to the adversary, which weakens the hardness of factoring $N$. In particular, ref. [4] showed that 1/2 of the least significant bits (LSBs) of the RSA primes are sufficient to factor $N$ in polynomial time. The random reconstruction algorithm by Heninger and Shacham can efficiently recover all of the RSA keys given a 0.57 fraction of the random bits of each of $p$ and $q$ [5]. Later, Maitra et al. [6] provided a combinatorial model of Heninger's work and were able to reconstruct the LSBs of RSA primes using a modified brute-force search with a shortened total search space.
The LSBs discussed in the prior attacks on RSA are commonly gathered by side-channel attacks. A side-channel attack is one of the prominent methods to collect the physical outputs or side-effects of cryptographic devices during their computations [7]. The outputs or side-effects include, but are not limited to, the computational time and power of decryption [8,9], the heat emission and electromagnetic radiation of the devices [10], cache behavior [11] and the sound of the processor during computations [12].

About This Paper
The results in this paper extend our papers [13] and [14]. In this paper, we assume that certain LSBs of the RSA primes are known. We show that only a small number of LSBs are required in our attack to factor $N$ in polynomial time, given that the RSA primes satisfy the specified structures. We also show that primes satisfying the structures are abundant and that no standard RSA library implements a proper checking mechanism to prevent the use of such primes. This shows that the existing methods to generate RSA keys carry an inherent risk of producing an RSA modulus that falls under our attack.

Preliminaries
In this section, we provide some helpful lemmas whose results are applied to make our attack successful.
Lemma 1. Let $a, r \in \mathbb{Z}^+$ and let $m \geq 2$ be an even number. If $\sqrt{a^m + r} = a^{m/2} + \epsilon$, then $\epsilon < \frac{r}{2a^{m/2}}$.

Proof.
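Let $a^m + r$ be an integer where $a \in \mathbb{Z}^+$. Squaring both sides of $\sqrt{a^m + r} = a^{m/2} + \epsilon$ gives $a^m + r = a^m + 2a^{m/2}\epsilon + \epsilon^2$, so $r = 2a^{m/2}\epsilon + \epsilon^2 > 2a^{m/2}\epsilon$. Hence $\epsilon < \frac{r}{2a^{m/2}}$. This terminates the proof.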
Suppose $N = pq$ is a valid RSA modulus where $p = a^m + r_p$ and $q = b^m + r_q$ with $a, b \in \mathbb{Z}^+$. The product $ab$ is unknown if $p$ and $q$ are secret values. Using the result from Lemma 1, we find the lower and upper bounds of $N^{1/2} - (ab)^{m/2}$ in the following lemma.
Lemma 2. Let $a, b \in \mathbb{Z}^+$ and let $m \geq 2$ be an even number such that $a < b < (2a^m + 1)^{1/m}$. Suppose $N = (a^m + r_p)(b^m + r_q)$ where $r_p \leq r_q < N^{\gamma}$. If $r_p < 2a^{m/2}$ and $r_q < 2b^{m/2}$, then $(r_p r_q)^{1/2} < N^{1/2} - (ab)^{m/2} < \frac{r_q}{2} + 2^{m/2-1} r_p + 1$.
Proof. To prove the lower bound, first we need to show that $a^m r_q + b^m r_p > 2(ab)^{m/2}(r_p r_q)^{1/2}$. Observe that $\left(a^{m/2} r_q^{1/2} - b^{m/2} r_p^{1/2}\right)^2 = a^m r_q + b^m r_p - 2(ab)^{m/2}(r_p r_q)^{1/2}$.
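The lower bound on $N^{1/2} - (ab)^{m/2}$ can be traced step by step as follows. This is a sketch that uses only the expansion of $N = (a^m + r_p)(b^m + r_q)$ and the non-negativity of a square; it assumes $a^m r_q \neq b^m r_p$ so that the inequality is strict, and it does not cover the upper bound.

```latex
\begin{align*}
0 &< \left(a^{m/2} r_q^{1/2} - b^{m/2} r_p^{1/2}\right)^2
   = a^m r_q + b^m r_p - 2(ab)^{m/2}(r_p r_q)^{1/2}, \\
N &= (ab)^m + a^m r_q + b^m r_p + r_p r_q \\
  &> (ab)^m + 2(ab)^{m/2}(r_p r_q)^{1/2} + r_p r_q
   = \left((ab)^{m/2} + (r_p r_q)^{1/2}\right)^2, \\
N^{1/2} - (ab)^{m/2} &> (r_p r_q)^{1/2}.
\end{align*}
```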
By obtaining the lower and upper bounds of $N^{1/2} - (ab)^{m/2}$ in Lemma 2, we have gathered a result that is useful in our attack later. Throughout this paper, we focus on RSA primes of the forms $p = a^m + r_p$ and $q = b^m + r_q$. Therefore, we define LSBs in the next definition based on these forms.
Definition 1 (Least Significant Bits (LSBs) of Primes). Let $l_1, l_2, m \in \mathbb{Z}^+$. Suppose $p = a^m + r_p$ and $q = b^m + r_q$ are primes. Suppose there exist unknown $a_0$ and $b_0$ for which Equations (3) and (4) hold. Then we define $r_p$ and $r_q$ to be the $k$-many LSBs of $p$ and $q$ respectively, where $k \leq l_1 m, l_2 m$, if they satisfy $r_p \equiv p \pmod{2^{l_1 m}}$ (5) and $r_q \equiv q \pmod{2^{l_2 m}}$ (6).
To identify primes that satisfy Equations (3) and (4), we observe the binary representations of $a^m$ and $b^m$. Their LSBs must have $k$ consecutive 0's to satisfy $p = a^m + r_p$ and $q = b^m + r_q$. Particularly, let $r_{p_i}$ be the bits of the binary representation of $a^m$ and $r_{q_i}$ be the bits of the binary representation of $b^m$, where $i = 1, 2, \ldots, n$. Observe that
$a^m = \underbrace{r_{p_1} r_{p_2} \cdots r_{p_{(n-k)}}}_{n-k \text{ bits of 1's and 0's}} \; \underbrace{0\,0 \cdots 0}_{k \text{ bits of 0's}}$ (7)
$b^m = \underbrace{r_{q_1} r_{q_2} \cdots r_{q_{(n-k)}}}_{n-k \text{ bits of 1's and 0's}} \; \underbrace{0\,0 \cdots 0}_{k \text{ bits of 0's}}$ (8)
The random reconstruction algorithm [5], which was improved by [6], is one of the efficient algorithms used to find the LSBs of RSA primes. Thus, it can be utilized to find the values of $r_p$ and $r_q$ that satisfy Equations (5) and (6).
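To make the structure in Definition 1 concrete, the following minimal sketch tests whether a given prime splits as $p = a^m + r_p$ with $r_p$ equal to its $k$ least significant bits. The function names `integer_root` and `special_structure` and the toy primes are illustrative assumptions, not taken from the paper; the check $r_p < 2a^{m/2}$ is the condition used later in Lemma 2.

```python
def integer_root(n, m):
    """Largest integer x with x**m <= n, found by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // m + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** m <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def special_structure(p, m, k):
    """Return (a, r_p) if p = a**m + r_p, where r_p is the k LSBs of p
    and r_p < 2 * a**(m/2); otherwise return None. Hypothetical helper."""
    r = p % (1 << k)          # the k least significant bits of p
    power = p - r             # candidate a**m, ending in k zero bits
    a = integer_root(power, m)
    if a > 0 and a ** m == power and r < 2 * a ** (m // 2):
        return a, r
    return None


# Toy values chosen for illustration only.
print(special_structure(263, 2, 4))  # 263 = 16**2 + 7
print(special_structure(101, 2, 4))  # 96 is not a perfect square
```

For real key sizes the same check applies unchanged, since Python integers are arbitrary precision.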

Our Attack
Before we proceed to show how N can be factored in polynomial time using previous results, we define the term 'sufficiently small' that is used to justify our attack.

Definition 2.
We define a sufficiently small value in this paper to be a value smaller than the largest value that current computing machines can feasibly brute-force at the lowest security level.
Remark 1. The latest recommendation for key management by NIST [15] states that the lowest security level is 112 bits. This implies that the largest value at this security level that current computing machines can feasibly brute-force is $2^{112}$. Based on Definition 2, a value lower than $2^{112}$ is considered sufficiently small. This value may change in the future, depending on advancements in computing technology. Now we are ready to show how an RSA modulus can be factored in polynomial time in the next theorem.
Theorem 1. Let $a, b \in \mathbb{Z}^+$ and let $m \geq 2$ be an even number such that $a < b < (2a^m + 1)^{1/m}$. Suppose $N = pq = (a^m + r_p)(b^m + r_q)$ is a valid RSA modulus. Let $r_p \equiv p \pmod{2^m}$ and $r_q \equiv q \pmod{2^m}$ where $r_p < 2a^{m/2}$ and $r_q < 2b^{m/2}$ such that $\max\{r_p, r_q\} < 2^k$. If $2^{k-1} 2^{m/2} + 1$ is a sufficiently small value as defined in Definition 2 and $k$ many LSBs of $p$ and $q$ are known, then $N$ can be factored in polynomial time.
Proof. From Lemma 2 we can see that
$(r_p r_q)^{1/2} < N^{1/2} - (ab)^{m/2} < \frac{r_q}{2} + 2^{m/2-1} r_p + 1$. (9)
Suppose $r_p$ and $r_q$ are the known LSBs of $p$ and $q$ respectively. The LSB values may be obtained from the side-channel attacks described in Section 1. The difference between the upper and lower bounds of Equation (9) is $\frac{r_q}{2} + 2^{m/2-1} r_p + 1 - (r_p r_q)^{1/2}$, which, since $\max\{r_p, r_q\} < 2^k$, bounds the size of the set of integers in which to search for $(ab)^{m/2}$. If $2^{k-1} 2^{m/2} + 1$ is sufficiently small as defined in Definition 2, then we can find $(ab)^{m/2}$ in polynomial time. By computing $\left((ab)^{m/2}\right)^2$, we find $(ab)^m$. Then, since $N = (ab)^m + a^m r_q + b^m r_p + r_p r_q$, we have $N - r_p r_q \equiv a^m r_q + b^m r_p \pmod{(ab)^m}$. Observe that from $r_p < 2a^{m/2}$ and $r_q < 2b^{m/2}$, we have $a^m r_q + b^m r_p < (ab)^m$. Thus, we obtain the full integer $a^m r_q + b^m r_p$ without modular reduction. Since the values of $r_p$, $r_q$, $(ab)^m$ and $a^m r_q + b^m r_p$ are known, we can find the roots of the quadratic equation $x^2 - (a^m r_q + b^m r_p)x + (ab)^m r_p r_q = 0$. We find that $x_1 = a^m r_q$ and $x_2 = b^m r_p$. Since $r_p$ and $r_q$ are known, we obtain $a^m = x_1 / r_q$ and $b^m = x_2 / r_p$. Thus we can factor $N$ by calculating $\frac{N}{b^m + r_q} = a^m + r_p$.
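The search described in the proof can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function name `factor_with_known_lsbs`, the search width derived from the upper bound of Lemma 2, and the toy modulus $N = 263 \cdot 409$ (with $a = 16$, $b = 20$, $m = 2$, $r_p = 7$, $r_q = 9$) are all chosen here for illustration, and $r_p \leq r_q$ are assumed positive.

```python
from math import isqrt


def factor_with_known_lsbs(N, r_p, r_q, m):
    """Try every candidate c for (ab)^(m/2) just below sqrt(N).

    Returns the sorted pair of factors, or None if no candidate works.
    Assumes r_p <= r_q are positive and m is even (hypothetical helper).
    """
    root = isqrt(N)
    # Width of the search window, from the upper bound of Lemma 2.
    width = r_q // 2 + 2 ** (m // 2 - 1) * r_p + 2
    for c in range(root, max(root - width, 0), -1):
        sigma = c * c                  # candidate for (ab)^m
        z = N - sigma - r_p * r_q      # candidate for a^m r_q + b^m r_p
        disc = z * z - 4 * sigma * r_p * r_q
        if z <= 0 or disc < 0:
            continue
        s = isqrt(disc)
        if s * s != disc:
            continue                   # roots would not be integers
        # Roots of x^2 - z x + sigma r_p r_q = 0.
        for x in ((z + s) // 2, (z - s) // 2):
            # x may be a^m r_q (giving p) or b^m r_p (giving q).
            for r_div, r_add in ((r_q, r_p), (r_p, r_q)):
                if x % r_div == 0:
                    f = x // r_div + r_add
                    if 1 < f < N and N % f == 0:
                        return tuple(sorted((f, N // f)))
    return None


# Toy modulus for illustration only: 263 = 16^2 + 7, 409 = 20^2 + 9.
print(factor_with_known_lsbs(263 * 409, 7, 9, 2))
```

Every returned pair is verified by trial division of $N$, so a wrong candidate for $(ab)^{m/2}$ can only cause the loop to continue, never a false factorization.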
The next remark justifies our selection criteria on parameter m.

Remark 2.
Let $A$ be the set of possible values of $(ab)^{m/2}$. From Equation (9), we know that $A$ consists of the numbers between $N^{1/2} - \left(\frac{r_q}{2} + 2^{m/2-1} r_p + 1\right)$ and $N^{1/2} - (r_p r_q)^{1/2}$. If $m \geq 2$ is an even integer, then $(ab)^{m/2}$ is an integer, which makes $A$ a finite set. However, if $m$ is a positive odd integer, then $(ab)^{m/2}$ can be a non-integer real value, which makes $A$ an infinite set. In that case our method is infeasible, since there are infinitely many possible values of $(ab)^{m/2}$ to try. Therefore, $m$ must be an even integer greater than or equal to 2.
The following example illustrates the result from Theorem 1.
Example 1. We first calculate the quantities in Equations (13) and (14) and solve the resulting equation. We find that neither $x_1 / r_q + r_p$ nor $x_2 / r_p + r_q$ is an integer. This means $x_1$ and $x_2$ are not our final solutions. It also means $\sigma \neq (ab)^m$ at this point. To find the correct $\sigma$, we have to iterate the computation of Equations (13) and (14). Using the resulting values of $\sigma$ and $z$, we solve Equation (15). From the solutions of Equation (15), we recover $p$ and $q$. Hence, $N$ has been successfully factored in polynomial time.

Remark 3.
From Example 1, we see that as few as 12 bits of LSBs are required to successfully execute our attack. This gives our method an advantage, since it does not necessarily depend on a side-channel attack [7] to gather the LSBs. Instead, an adversary can use a brute-force approach to find the correct LSBs, since the number of required LSBs can be very small.

Numbers of Primes with Vulnerable Specialized Structures Against Random Reconstruction Algorithm
From Equations (7) and (8) we can see that $r_{p_1}$ until $r_{p_{(n-k)}}$ must be the binary representation of another squared number. The same applies to $r_{q_1}$ until $r_{q_{(n-k)}}$. In the next theorem, we count the number of squared numbers with $n-k$ bits.
Proof. Let $X = \{x_i^2\}$ for $i = 1, 2, 3, \ldots$ be the set of all squared numbers between $2^{n-k-1}$ and $2^{n-k} - 1$. Particularly, $2^{n-k-1} < x_i^2 < 2^{n-k} - 1$. (16)
To find the least number of $i$, that is, the amount of squared numbers between $2^{n-k-1}$ and $2^{n-k} - 1$, we compute the difference between the upper bound and the lower bound of Equation (16) in integer form. If $n$ is any large positive integer and $k$ is a small positive integer, then this amount is on the order of $2^{(n-k)/2}$. This terminates the proof.
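The count in Equation (16) can be checked numerically for small bit lengths. The sketch below is an illustration under assumptions: the names `count_squares_with_bits` and `leading_term` are hypothetical, `t` stands for $n - k$, and the leading term $(1 - 2^{-1/2})\,2^{t/2}$ is obtained here by taking square roots of the two bounds, not quoted from the paper.

```python
from math import isqrt


def count_squares_with_bits(t):
    """Count integers x with 2**(t-1) < x*x < 2**t - 1, for t >= 2."""
    upper = isqrt((1 << t) - 2)   # largest x with x*x <= 2**t - 2
    lower = isqrt(1 << (t - 1))   # largest x with x*x <= 2**(t-1)
    return upper - lower


def leading_term(t):
    """Approximation sqrt(2**t) - sqrt(2**(t-1)) of the same count."""
    return (1 - 2 ** -0.5) * 2 ** (t / 2)


# For t = 10 the squares are 23**2, ..., 31**2.
print(count_squares_with_bits(10), leading_term(10))
```

The exact count uses integer arithmetic only, so it remains valid for RSA-sized values of `t`; the floating-point approximation is meant for small `t`.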
Theorem 3. Let $a, b \in \mathbb{Z}^+$ and let $m \geq 2$ be an even number such that $a < b < (2a^m + 1)^{1/m}$. Suppose $N = pq = (a^m + r_p)(b^m + r_q)$ is a valid RSA modulus. Let $r_p \equiv p \pmod{2^m}$ and $r_q \equiv q \pmod{2^m}$ where $r_p < 2a^{m/2}$ and $r_q < 2b^{m/2}$ such that $\max\{r_p, r_q\} < 2^k$. Let $x > 0$ be an integer where $x^2$ is the smallest squared number with $n$-bit size. If $2^{k-1} 2^{m/2} + 1$ is a sufficiently small value as defined in Definition 2 and $k$ many LSBs of $p$ and $q$ are known, then the number of candidates of $p$ and $q$ of $n$-bit size such that $p = a^m + r_p$ and $q = b^m + r_q$ satisfy Theorem 1 is bounded above.
Proof. Let $x > 0$ be an integer where $x^2$ is the smallest squared number with $n-k$ bits. Let $f(x)$ be the prime-counting function between $x^2$ and $x^2 + \max\{r_p, r_q\}$. From Theorem 2, we know the approximate number of squared numbers with $(n-k)$-bit size, where $n-k$ is a large integer suitable for use in RSA. The values of $\pi^*_1(x)$ for the consecutive squared numbers are then summed as in Equation (17). This summation can be represented by the arithmetic progression formula, in which the number of terms is multiplied by the sum of the first and last terms of the progression and divided by 2. This terminates the proof.
The result from Theorem 3 shows that a significant number of primes satisfy Theorem 1.

Comparative Analysis
Here we compare our results with the existing attacks that use known bits of primes. The authors of [16] introduced partial key exposure attacks with the assumption that certain bits of the primes can be known by the adversary. They showed that 2/3 of the bits of $p$ or $q$ are sufficient to factor $N$ using an integer programming technique. Later, ref. [17] reduced this value to 1/2 using the LLL algorithm. The attack from Herrmann and May later required the known bits to be arranged in random blocks [18].
Heninger and Shacham's attack is motivated by the so-called cold boot attack, which targets the memory in electronic chips to reconstruct the bits of the private keys given that the bits are from random positions [5]. They successfully conducted the attack when a 0.57 fraction of the random bits of the primes is known. It should be noted that their fraction is much lower if they also consider the random bits of the RSA private exponent $d$ ($d_p$ and $d_q$ in the case of CRT-RSA). Using a similar method, ref. [6] proved that if the total number of known LSBs from both $p$ and $q$ is at least 50% of the total length of $N$, then $N$ can be factored using a lattice-based method. Our method, unlike existing methods, utilizes $k$-many LSBs of the primes where $k$ is less than the value of $2^{k-1} 2^{m/2} + 1$, which is sufficiently small as defined in Definition 2, as shown in Theorem 1.
The summaries of all the attacks are compiled in Table 1.
From Table 1, we can see that our method requires fewer LSBs for the attack to be successful when compared to [5,6]. That is, the attack requires less computational time and space to execute. It is easy to see that if $N \approx 2^{2048}$ and $k = 80$, then $r_p, r_q < 2^{80} \approx N^{0.039}$. This is a substantial improvement over previous works.
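The exponent quoted above follows from rewriting $2^k$ as a power of $N$. The short derivation below is a worked check under the stated assumption $\log_2 N = 2048$ and $k = 80$.

```latex
\begin{align*}
r_p, r_q < 2^{k} = \left(2^{\log_2 N}\right)^{k / \log_2 N} = N^{k / \log_2 N},
\qquad
\frac{k}{\log_2 N} = \frac{80}{2048} = 0.0390625 \approx 0.039 .
\end{align*}
```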
We would like to point out the trade-off of our attack, namely the characteristics required by Theorem 1. Nevertheless, our analysis shows that if $r_p$ and $r_q$ are bounded by $2^k$ where $k$ is as stated in Definition 2, the side-channel attack can be conducted in reasonable time in order to identify whether the primes in physical devices fall under the category mentioned. This makes our research important for real-world implementations of the RSA cryptosystem. Moreover, we have shown in Section 4 that the number of primes satisfying our conditions is exponentially large. This shows the importance of our attack.
Table 1. Comparison of our method against existing attacks with known bits of primes.

Comments/Remarks; Advantages/Disadvantages

Rivest and Shamir (1985); LSBs or MSBs; of the bits of p or q; of the bits of p or q; Using RRA together with lattice-based method. Advantages: Fast speed. Disadvantages: Requires a lot of known bits.

Our method (Theorem 1); LSBs; $r_p, r_q < 2^k$ where $2^k$ is sufficiently small as in Definition 2, that is, $r_p, r_q < N^{k / \log_2 N}$; Side-channel attack of complexity $O(2^k)$ where $2^k$ is sufficiently small as in Definition 2. Advantages: Fast speed, requires fewer known bits. Disadvantages: Requires specific hardware to conduct side-channel attack.

Countermeasure of the Attack
Although the attack seems to target a niche set of primes, there is no immediate noticeable detection that can be implemented to overcome the attack. This means the prevention from utilizing the weak primes must be applied in the RSA key generator with the full knowledge of the secret parameters, p and q. The countermeasure is depicted in Figure 1.
Given $N$, $p$ and $q$, if $\lfloor N^{1/2} \rfloor - \lfloor p^{1/2} \rfloor \cdot \lfloor q^{1/2} \rfloor$ is a sufficiently small integer as defined in Definition 2, then the RSA key generator must find a new $p$ or $q$. Since the computation is minimal, the prevention of the attack can be applied in real-world RSA implementations.
Example 2. For a toy example of this countermeasure, we revisit the values in Example 1. Given $N$, $p$, $q$ from Example 1, we compute $\lfloor N^{1/2} \rfloor - \lfloor p^{1/2} \rfloor \cdot \lfloor q^{1/2} \rfloor = 2811$.
Since 2811 is sufficiently small based on Definition 2, an RSA key generator must find new $p$ and $q$. For the newly generated primes, the same quantity is larger than $2^{112}$. Hence $N$ is safe from our attack.
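A key generator could apply the check in Example 2 along the following lines. This is a minimal sketch under assumptions, not the procedure of Figure 1: the names `lemma2_gap` and `needs_new_primes` are hypothetical, the square roots are truncated to integers, and the check as written only captures the case $m = 2$, where the integer square root of $p$ equals $a$.

```python
from math import isqrt


def lemma2_gap(p, q):
    """Integer estimate of N^(1/2) - (ab)^(m/2) for m = 2 (assumption)."""
    return isqrt(p * q) - isqrt(p) * isqrt(q)


def needs_new_primes(p, q, security_bits=112):
    """True if the gap is 'sufficiently small' in the sense of Definition 2."""
    return lemma2_gap(p, q) < (1 << security_bits)


# Toy primes for illustration only: 263 = 16^2 + 7, 409 = 20^2 + 9.
print(lemma2_gap(263, 409))
print(needs_new_primes(263, 409, security_bits=4))
```

With toy-sized primes every pair falls below the default threshold of $2^{112}$, so the example lowers `security_bits` to show both outcomes; for primes of cryptographic size the default threshold is the relevant one.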

Conclusions
We have shown an attack on the RSA modulus $N = pq$ where $p = a^m + r_p$ and $q = b^m + r_q$, with $r_p$ and $r_q$ being the $k$ LSBs of $p$ and $q$ respectively. Our attack can be mounted successfully in polynomial time if the LSBs of the primes are known and satisfy the conditions. We also show that a significant number of primes, relative to their sizes, are vulnerable to our attack. This poses a great threat to RSA users, who might not realize that their RSA primes may be among these vulnerable primes. However, our suggestion on how to detect the vulnerable primes during the key generation process may help to overcome this problem so that the RSA cryptosystem can still be applied.

Conflicts of Interest:
The authors declare no conflict of interest.