A Modification of the Fast Inverse Square Root Algorithm

We present a new algorithm for the approximate evaluation of the inverse square root of single-precision floating-point numbers. It is a modification of the famous fast inverse square root code: we use the same "magic constant" to compute the seed solution, but then apply Newton–Raphson corrections with modified coefficients. Compared with the original fast inverse square root code, the new algorithm is twice as accurate in the case of one Newton–Raphson correction and almost seven times more accurate in the case of two corrections. We discuss relative errors within our analytical approach and perform numerical tests of our algorithm for all numbers of the type float.

In this paper, we consider an algorithm for computing the inverse square root using the so-called magic constant instead of an LUT [37–40]. The zeroth approximation (initial seed) for the inverse square root of a given floating-point number is obtained by a logical right shift by one bit and subtraction of the result from a specially chosen integer (the "magic constant"). Both operations are performed on the bits of the floating-point number interpreted as an integer. Then, a more accurate value is produced by a certain number (usually one or two) of standard Newton–Raphson iterations. The following code realizes the fast inverse square root algorithm in the case of single-precision IEEE 754 floating-point numbers (type float).
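For reference, the well-known listing (with the original constant R = 0x5F3759DF) can be sketched as follows; the variant below uses memcpy for the bit reinterpretation, whereas the paper's listing reinterprets pointers directly:

```c
#include <stdint.h>
#include <string.h>

/* Classic fast inverse square root (Algorithm 1 in the text), shown
   with the original magic constant R = 0x5F3759DF. */
float InvSqrt(float x) {
    float halfx = 0.5f * x;
    uint32_t i;
    memcpy(&i, &x, sizeof i);        /* line 3: read the bits as an integer */
    i = 0x5F3759DF - (i >> 1);       /* line 4: magic-constant seed         */
    float y;
    memcpy(&y, &i, sizeof y);        /* line 5: bits back to float          */
    y = y * (1.5f - halfx * y * y);  /* line 6: first Newton-Raphson step   */
    y = y * (1.5f - halfx * y * y);  /* line 7: second Newton-Raphson step  */
    return y;
}
```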
The code InvSqrt (see Algorithm 1) consists of two main parts. Lines 4 and 5 produce, in a very inexpensive way, a quite good zeroth approximation of the inverse square root of a given positive floating-point number x. Lines 6 and 7 apply the Newton–Raphson correction twice (a version with just one iteration is often used as well). Originally, R was proposed as 0x5F3759DF; see [37,38]. More details, together with a derivation of a better magic constant, are given in Section 2. InvSqrt is characterized by high speed, more than three times higher than computing the inverse square root using library functions. This property was discussed in detail in [41]. The errors of the fast inverse square root algorithm depend on the choice of the magic constant R. In several theoretical papers [38,41–44] (see also Eberly's monograph [19]), attempts were made to determine analytically the optimal value of the magic constant (i.e., to minimize the errors). In general, this optimal value can depend on the number of iterations, which is a general phenomenon [45]. The derivation and comprehensive mathematical description of all the steps of the fast inverse square root algorithm were given in our recent paper [46], where we found the optimal value of the magic constant by minimizing the final maximum relative error.
In the present paper, we develop our analytical approach to construct an improved algorithm (InvSqrt1) for the fast computation of the inverse square root; see Algorithm 2 in Section 4. The proposed modification does not increase the speed of data processing, but it significantly increases the accuracy of the output. In both codes, InvSqrt and InvSqrt1, magic constants serve as a low-cost way of generating a reasonably accurate first approximation of the inverse square root. These magic constants turn out to be the same. The main novelty of the new algorithm is in the second part of the code, which is changed significantly. In fact, we propose a modification of the Newton–Raphson formulas, which has a similar computational cost but improves the accuracy severalfold.

Analytical Approach to the Algorithm InvSqrt
In this paper, we confine ourselves to positive single-precision floating-point numbers (type float). Normal floating-point numbers can be represented as:

x = (1 + m_x) · 2^(e_x),    (1)

where m_x ∈ [0, 1) and e_x is an integer (note that this formula does not hold for subnormal numbers).
In the case of the IEEE 754 standard, a floating-point number is encoded by 32 bits. The first bit corresponds to the sign (in our case, this bit is simply equal to zero); the next eight bits correspond to the exponent e_x; and the last 23 bits encode the mantissa m_x. The integer encoded by these 32 bits, denoted by I_x, is given by:

I_x = N_m (B + e_x + m_x),    (2)

where N_m = 2^23 and B = 127 (thus B + e_x = 1, 2, ..., 254). Lines 3 and 5 of the InvSqrt code interpret a number as an integer (2) or float (1), respectively. Lines 4, 6, and 7 of the code can be written as:

I_y0 = R − ⌊I_x / 2⌋,
y_1 = (1/2) y_0 (3 − x y_0^2),    (3)
y_2 = (1/2) y_1 (3 − x y_1^2).

The first equation produces, in a surprisingly simple way, a good zeroth approximation y_0 of the inverse square root y = 1/√x. Of course, this requires a very special form of R; in particular, in the single-precision case, we have e_R = 63; see [46]. The next equations are easily recognized as Newton–Raphson corrections. We point out that the code InvSqrt is invariant with respect to the scaling x → 4x, y → y/2, like the equality y = 1/√x itself. Therefore, without loss of generality, we can confine our analysis to the interval:

Ã := [1, 4).    (4)

The tilde will denote quantities defined on this interval. In [46], we showed that the function ỹ0 defined by the first equation of (3) can be approximated with very good accuracy by the piecewise linear function ỹ00 given by (6), where the parameter t is determined by m_R, the mantissa of the floating-point number corresponding to R. Note that the parameter t, defined by (7), is uniquely determined by R.
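As a quick check of the encoding (2), the bit pattern of a float can be read directly (a minimal sketch; float_bits is our helper name, not from the paper):

```c
#include <stdint.h>
#include <string.h>

/* Read the IEEE 754 bit pattern of a float as an unsigned integer,
   illustrating I_x = N_m * (B + e_x + m_x) with N_m = 2^23, B = 127. */
uint32_t float_bits(float x) {
    uint32_t i;
    memcpy(&i, &x, sizeof i);
    return i;
}
```

For x = 1 (e_x = 0, m_x = 0), this yields I_x = 127 · 2^23 = 0x3F800000; for x = 2 (e_x = 1), it yields 128 · 2^23 = 0x40000000.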
The only difference between the y_0 produced by the code InvSqrt and the y_00 given by (6) is the definition of t, because the t related to the code depends (although in a negligible way) on x. Taking into account the invariance (4), we obtain estimates that do not depend on t (in other words, they do not depend on R). The relative error of the zeroth approximation (6) is given by (10). This is a continuous function with local maxima at x̃_0^I, x̃_0^II, and x̃_0^III. In order to study the global extrema of δ̃_0(x̃, t), we also need the boundary values, which are, in fact, local minima. Because δ̃_0(x̃_0^III, t) < 0 for t ∈ (2, 4), the global maximum is one of the remaining local maxima, δ̃_0(x̃_0^I, t) or δ̃_0(x̃_0^II, t). In order to minimize the maximal error with respect to t, i.e., to find t_0^r minimizing the maximum of |δ̃_0|, we observe that |δ̃_0(t, t)| is a decreasing function of t, while both maxima (δ̃_0(x̃_0^I, t) and δ̃_0(x̃_0^II, t)) are increasing functions. Therefore, it is sufficient to find t = t_0^I and t = t_0^II at which |δ̃_0(t, t)| equals the respective maximum and to choose the greater of these two values. In [46], we showed that t_0^r = t_0^II; the resulting numerical values are given in [46]. The Newton–Raphson corrections of the zeroth approximation ỹ00 will be denoted by ỹ0k (k = 1, 2, ...).
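In this notation, the relative error of the zeroth approximation — Formula (10) in the text — can be reconstructed from the surrounding definitions as:

```latex
\tilde\delta_0(\tilde x, t)
\;=\; \frac{\tilde y_{00}(\tilde x, t) - 1/\sqrt{\tilde x}}{1/\sqrt{\tilde x}}
\;=\; \sqrt{\tilde x}\,\tilde y_{00}(\tilde x, t) - 1 .
```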
In particular, the corresponding relative error functions will be denoted by δ̃_k(x̃, t), where we also include the case k = 0; see (10). The obtained approximations of the inverse square root depend on the parameter t, directly related to the magic constant R. The value of this parameter can be estimated by analyzing the relative error of ỹ0k(x̃, t) with respect to 1/√x̃. As the best estimate, we take the value t = t_k^r minimizing the maximal relative error of δ̃_k(x̃, t). We point out that, in general, the optimal value of the magic constant can depend on the number of Newton–Raphson corrections. Calculations carried out in [46] gave the results quoted in (26). We omit the details of the computations except for one important point. Using (24) to express ỹ0k by δ̃_k and √x̃, we can rewrite (23) as:

δ̃_k = −(1/2) δ̃_{k−1}^2 (3 + δ̃_{k−1}).    (27)

The quadratic dependence on δ̃_{k−1} means that every Newton–Raphson correction improves the accuracy by several orders of magnitude (until machine precision is reached); compare (26).
Formula (27) suggests another way of improving the accuracy, because the functions δ̃_k are non-positive for any k ≥ 1. Roughly speaking, we are going to shift the graph of δ̃_k upwards by an appropriate modification of the Newton–Raphson formula. In the next section, we describe the general idea of this modification.

Modified Newton–Raphson Formulas
Formula (27) shows that the errors introduced by Newton–Raphson corrections are non-positive, i.e., they take values in intervals [−δ̃_k max, 0], where k = 1, 2, .... Therefore, it is natural to introduce a correction term into the Newton–Raphson formulas (23). We expect that the corrections will be roughly half of the maximal relative error. Instead of the maximal error, we introduce two parameters, d_1 and d_2. Thus, we get the modified Newton–Raphson formulas (28), where the zeroth approximation is assumed in the form (6). In the following section, the term 1/√x̃ will be replaced by some approximation of ỹ, transforming (28) into a computer code. In order to estimate the possible gain in accuracy, in this section we temporarily assume that ỹ is the exact value of the inverse square root. The corresponding error functions δ̄_k (where ỹ10(x̃, t) := ỹ00(x̃, t)) satisfy (30), with δ̄_0(x̃, t) = δ̃_0(x̃, t). In order to simplify the notation, we will usually suppress the explicit dependence on d_j; we will write, for instance, δ̄_2(x̃, t) instead of δ̄_2(x̃, t, d_1, d_2). The corrections of the form (28) will decrease the relative errors in comparison with the results of earlier papers [38,46]. We have three free parameters (d_1, d_2, and t) to be determined by minimizing the maximal error (in principle, the new parameterization can give a new estimate of the parameter t). By analogy to (25), we are going to find t = t^(0) minimizing the error of the first correction, where, as usual, Ã = [1, 4). The first of Equations (30) implies that, for any t, the maximal value of δ̄_1(x̃, t) equals (1/2) d_1 and is attained at the zeros of δ̄_0(x̃, t). Using the results of Section 2, including (15), (16), (20), and (21), we conclude that the minimum value of δ̄_1(x̃, t) is attained either for x̃ = t or for x̃ = x̃_0^II (where the second maximum of δ̄_0(x̃, t) is located). Minimization of |δ̄_1(x̃, t)| can be done with respect to t and with respect to d_1 (these operations obviously commute),
which corresponds to (34). Taking into account (35), we get from (34) the conditions (36), where δ̃1 max := min over t ∈ (2, 4) of the maximal relative error, and the numerical value of δ̃1 max is given by (26). These conditions are satisfied for the values given in (38). In order to minimize the relative error of the second correction, we use an equation analogous to (34), where from (30) we have (40). Expressing the result in terms of the formerly computed δ̃1 max and δ̃2 max, we obtain:

δ̄2 max ≈ 5.75164 × 10^−7 ≈ δ̃2 max / 7.99.    (42)

Therefore, the above modification of the Newton–Raphson formulas decreases the relative error two times after one iteration and almost eight times after two iterations as compared with the standard InvSqrt algorithm.
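Concretely, the modified step (28) and the resulting error recursion (30) can be reconstructed from the text as (writing δ̄_k for the relative errors of the modified corrections):

```latex
\tilde y_{1k} \;=\; \tfrac{1}{2}\,\tilde y_{1,k-1}\bigl(3 - \tilde x\,\tilde y_{1,k-1}^2\bigr)
\;+\; \frac{d_k}{2}\,\tilde y,
\qquad
\bar\delta_k \;=\; \frac{d_k}{2}
\;-\; \frac{1}{2}\,\bar\delta_{k-1}^2\bigl(3 + \bar\delta_{k-1}\bigr),
```

so that choosing d_k comparable to the maximal magnitude of the unshifted error recenters the error band around zero, halving its width.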
In order to implement this idea in the form of a computer code, we have to replace the unknown 1/√x̃ (i.e., ỹ) on the right-hand sides of (28) by some numerical approximations.

New Algorithm with Higher Accuracy
Approximating 1/√x̃ in Formulas (28) by the values on the left-hand sides, we transform (28) into (43), where ỹ2k (k = 1, 2, ...) depend on x̃, t, and d_j (for 1 ≤ j ≤ k). We assume ỹ20 ≡ ỹ00, i.e., the zeroth approximation is still given by (6). We can see that ỹ21 and ỹ22 can be explicitly expressed by ỹ20 and ỹ21, respectively. The parameters d_1 and d_2 have to be determined by minimization of the maximal error. We define the error functions ∆_k in the usual way (44). Substituting (44) into (43), we get (45). Equation (45) expresses ∆_1(x̃, t, d_1) as a linear function of the non-positive function δ̃_1(x̃, t) with coefficients depending on the parameter d_1. The optimal parameters t and d_1 will be estimated by the procedure described in Section 3. First, we minimize the amplitude of the relative error function, i.e., we find t^(1) such that the amplitude for t = t^(1) does not exceed the amplitude for any other t. Second, we determine d_1 minimizing the maximal error for t = t^(1). ∆_1^(1)(x̃, t) is an increasing function of δ̃_1(x̃, t); hence, the minimization conditions are satisfied for the value of d_1 given in (51). Thus, we can find the maximum error of the first correction ∆_1^(1) (presented in Figure 1), which assumes its minimum value for t = t^(1). This result practically coincides with δ̄1 max given by (36). Analogously, we can determine the value of d_2 (assuming that t = t^(1) is fixed). Now, the deepest minimum comes from the global maximum. Therefore, the maximum error of the second correction, given by (57), is very close to the value of δ̄2 max given by (42).
Thus, we have obtained the algorithm InvSqrt1 (see Algorithm 2), which looks like InvSqrt with modified values of the numerical coefficients. Comparing InvSqrt1 with InvSqrt, we easily see that the number of algebraic operations in InvSqrt1 is greater by one (an additional multiplication in Line 7, corresponding to the second iteration of the modified Newton–Raphson procedure). We point out that the magic constants for InvSqrt and InvSqrt1 are the same.
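Structurally, folding d_k into the iteration (see (43)) turns each step into the plain Newton–Raphson step divided by (1 − d_k/2), so the code keeps the shape of InvSqrt with slightly changed decimal coefficients. A sketch under stated assumptions (the seed constant 0x5F3759DF and the placeholder values D1 and D2 below are illustrative only; the paper's optimized coefficients come from the minimization in Sections 3 and 4):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical shift parameters: rough orders of magnitude suggested by
   the error values quoted in the text, NOT the paper's constants. */
static const float D1 = 1.75e-3f;
static const float D2 = 4.60e-6f;

float InvSqrt1_sketch(float x) {
    uint32_t i;
    memcpy(&i, &x, sizeof i);
    i = 0x5F3759DF - (i >> 1);       /* magic-constant seed               */
    float y;
    memcpy(&y, &i, sizeof y);
    /* Modified Newton-Raphson: y <- y * (3 - x*y*y) / (2 - d_k),
       i.e., the plain step divided by (1 - d_k/2). */
    y = y * (3.0f - x * y * y) / (2.0f - D1);
    y = y * (3.0f - x * y * y) / (2.0f - D2);
    return y;
}
```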

Numerical Experiments
The new algorithm InvSqrt1 was tested on an Intel Core i5-3470 processor using the compiler TDM-GCC 4.9.2 (32-bit). Using the same hardware for testing the code InvSqrt, we obtained practically the same error values as those obtained by Lomont [38]. The same results were also obtained on an Intel Core i7-5700. In this section, we analyze the rounding errors for the code InvSqrt1.
Applying the algorithm InvSqrt1, we obtain relative errors characterized by "oscillations" with a center slightly shifted with respect to the analytical approximate solution ỹ22(x̃, t^(1)); see Figure 2. The observed blur can be expressed by the relative deviation ε^(1)(x̃, t) of the numerical result from ỹ22(x̃). The values of this error are distributed symmetrically around the mean value; see Figure 3. The blur parameters of the function ε^(1)(x̃, t) show that the main source of the difference between the analytical and numerical results is the use of float precision and, in particular, the rounding of the constant parameters of the function InvSqrt1. We point out that, in this case, the amplitude of the error oscillations is about 40% greater than the amplitude of the oscillations of (ỹ00 − ỹ0)/ỹ0 (i.e., in the case of InvSqrt); see the right part of Figure 2 in [46]. It is worth noting that for the first Newton–Raphson correction, the blur is of the same order, but due to the much higher error value in this case, its effect is negligible (i.e., Figure 1 would be practically the same with or without the blur). The maximum numerical errors practically coincide with the analytical result (53), e.g., ∆_1,N^min ≈ −8.76 × 10^−4. In the case of the second Newton–Raphson correction, we compared the results produced by InvSqrt1 with exact values of the inverse square root for all numbers x of the type float such that e_x ∈ [−126, 128). The range of errors was the same for all these intervals, except e_x = −126, for which the interval of errors was slightly wider: (−6.72 × 10^−7, 6.49 × 10^−7). This can be explained by the fact that the analysis presented in this paper is not applicable to subnormal numbers; see (1). Therefore, our tests showed that the relative errors for all numbers of the type float belong to the interval (∆_2,N^min, ∆_2,N^max), where:

∆_2,N^min ≈ −6.72 × 10^−7,  ∆_2,N^max ≈ 6.49 × 10^−7.    (63)

These values are significantly higher than the analytical result (5.76 × 10^−7; see (57)), but are still much lower than the analogous error for the algorithm InvSqrt (4.60 × 10^−6; see [46]).
Figure 3. Relative error ε^(1) arising during the float approximation of the correction ỹ22(x̃, t). Dots represent errors determined for 2000 random values x̃ ∈ [1, 4). Solid lines represent maximum (max_i) and minimum (min_i) values of the relative errors (the intervals [1, 2) and [2, 4) were divided into 64 equal subintervals, and the extremal values were determined in each of them).

Conclusions
In this paper, we presented a modification of the famous code InvSqrt for the fast computation of the inverse square root of single-precision floating-point numbers. The new code has the same magic constant, but the second part (consisting of Newton–Raphson iterations) is modified. In the case of one Newton–Raphson iteration, the new code InvSqrt1 has the same computational cost as InvSqrt and is twice as accurate. In the case of two iterations, the computational cost of the new code is only slightly higher, but its accuracy is higher by almost seven times.
The main idea of our work consists in modifying the coefficients in the Newton–Raphson method and demanding that the maximal error be as small as possible. Such modifications can be constructed whenever the distribution of errors of the Newton–Raphson corrections is not symmetric (as in the case of the inverse square root, where they are non-positive functions).

Figure 1. Graph of the function ∆_1^(1).