A Survey on Old and New Approximations to the Function φ ( x ) Involved in LDPC Codes Density Evolution Analysis Using a Gaussian Approximation

: Low Density Parity Check (LDPC) codes are currently being deeply analyzed through algorithms that require the capability of addressing their iterative decoding convergence performance. Since it has been observed that the probability distribution function of the decoder’s log-likelihood ratio messages is roughly Gaussian, a multiplicity of moderate entanglement strategies to this analysis has been suggested. The ﬁrst of them was proposed in Chung et al.’s 2001 paper, where the recurrent sequence, characterizing the passage of messages between variable and check nodes, concerns the function φ ( x ) , therein speciﬁed, and its inverse. In this paper, we review this old approximation to the function φ ( x ) , one variant on it obtained in the same period (proposed in Ha et al.’s 2004 paper), and some new ones, recently published in two 2019 papers by Vatta et al. The objective of this review is to analyze the differences among them and their characteristics in terms of accuracy and computational complexity. In particular, the explicitly invertible , not piecewise deﬁned approximation of the function φ ( x ) , published in the second of the two abovementioned 2019 papers, is shown to have less relative error in any x than most of the other approximations. Moreover, its use conducts to an important complexity reduction, and allows better Gaussian approximated thresholds to be obtained.


Introduction
The investigation of Low Density Parity Check (LDPC) codes decoding, carried out by means of the belief propagation algorithm, usually takes place by analyzing the evolution of the probability densities of the messages exchanged between check and variable nodes of the bipartite (Tanner) graph describing a code belonging to this family [1]. However, this is a rather complicated job due to the continuous nature of the messages themselves, which makes them difficult to analyze. To avoid the difficulty of this task, Chung et al. presented, in [2], a Gaussian approximation (GA) (Despite its limits (see, e.g., [3,4]), the use of the GA is very convenient when a powerful and effortless approach is required. See its recent adoption, e.g., to obtain "bounds on LDPC decoding thresholds [5], rate-compatible puncturing patterns for LDPC codes [6,7], analytical bit-error rate (BER) expressions [8], and unequal error protection (UEP) LDPC codes [9].") for the message probability distribution, simplifying the evolution of the infinite dimensional density space to the evolution of a single parameter. They mathematically described the passage of messages between variable and check nodes of the Tanner graph by means of a recurring succession of the mean values of the check nodes' output messages, having shown that regular LDPC codes present message densities that are roughly Gaussian, whereas irregular ones present message densities that are Gaussian combinations [10]. In this way, instead of describing the variable and check node output message as a function of the incoming messages from their respective neighbors, a recursive sequence is obtained by expressing a check node output message mean value at the l-th iteration as a function of the same mean value at the (l − 1)-th iteration [4]. The belief-propagation decoding thresholds, also known as density evolution thresholds-addressing LDPC codes iterative decoding convergence performance and defined as the maximum channel noise level such that the probability of error of the belief-propagation decoding algorithm tends to zero as the number of its iterations tends to infinity [2]-may be approximated by evaluating the last value of the channel noise variance σ 2 n such that this recurrent sequence converges, or, equivalently, as the first σ 2 n value such that the sequence diverges, obtaining the so-called GA thresholds. The recurrent sequence in [2] is expressed through the function φ(x), defined in the aforementioned paper. Its study has been performed in [3], to obtain tighter upper and lower bounds on it for x > 0. Chung et al.'s lower and upper bounds (1 − 3/x)ψ(x) and , holding for the function φ(x) involved in the GA, have been both analytically proved in [3] to be improved by replacing 3 with π 2 /4 and 1/7x with 0, respectively. It was shown in [3] that the tighter upper bound therein obtained is explicitly invertible through the Lambert W-function (also called omega function or product logarithm), whereas the one obtained in [2] is not explicitly invertible. Moreover, on the basis of the two tighter bounds on φ(x) found, a piecewise defined approximation of φ(x) has been obtained in [3]. This is analogous but improved with respect to the one in [2], in the sense that it allows to obtain GA thresholds closer to the ones obtained applying density evolution than the GA thresholds given in [2]. In [10], a new approximation of the function φ(x) was given, which, unlike the other approximations found in the literature (see [2,3,6]),

1.
is not piecewise defined in the definition interval [0, +∞[ i.e., it is defined through a unique mathematical expression, 2.
remains between the two above said tighter bounds of [3] in the domain of interest, 4. and has less relative error for any value of x than most of the other approximations, when the error is evaluated by computing φ(x) numerically through a Mathematica ® statement published in [11].
As noted in [10], the fact that this new approximation has always an univocal analytical expression (i.e., it is not piecewise defined) implies its easy mathematical handling, in contrast to formerly defined piecewise functions, defined by distinct analytical expressions being dependent on the range of values taken by the argument x. Moreover, its explicit invertibility implies that no interpolated functions approximating it by points (x k , φ(x k )) and its inverse by points (φ(x k ), x k ) are required. These properties conduct to an important complexity reduction of the algebraic handling and utilization of the approximation itself. Moreover, its use allows better GA thresholds to be obtained, in the sense that they approximate the thresholds obtained with density evolution better than the GA thresholds obtained in [2].
In this paper, a review of the abovementioned old approximations (published in [2,6]) and new approximations (published in [3,10]) to the function φ(x) is performed, so that to analyze the differences among them and their characteristics in terms of accuracy and computational complexity. The paper is organized as follows. In the next section, we review the GA for irregular LDPC code density evolution. Section 3 is dedicated to the valuable merits an approximation should have: among them, there is the explicit invertibility, addressed in Section 6. Section 4 recalls the definitions of absolute and relative errors as means to evaluate the accuracy of an approximation. Section 5 recalls the function φ(x) specified in [2] and the literature where its principal approximations were presented. Section 6 addresses the approximation of the inverse φ −1 (y). Section 7 compares the numerical results-in terms of calculated GA thresholds-obtained applying the approximations to φ(x) reviewed in the paper, and, finally, Section 8 recapitulates the conclusions.

Gaussian Approximation for Irregular LDPC Codes
We treat irregular LDPC codes as it has been shown that they behave better than regular ones [12]. Furthermore, it was shown in [2] and recalled in [10] that message densities are roughly Gaussians for regular LDPC codes or Gaussian mixtures for irregular ones, and hence regular LDPC codes can be considered as a special case of irregular ones. Irregular LDPC codes [12] are defined by means of the distribution of the node degrees in their Tanner graphs [1], specified through the polynomials λ(x) and ρ(x) [13]: The (d l −1)-tuple {λ i } and (d r −1)-tuple {ρ j } both add up to 1 [14], and λ i (respectively, ρ j ) specifies the fraction of total edges in the Tanner graph insisting on a degree-i variable node (respectively, on a degree-j check node).
Express with v the output message of a variable node and with u the output message of a check node (In soft-decision belief-propagation decoding, the messages are the loglikelihood ratios (LLRs) of received bits). The first is expressed as where x is the node bit value, and y is all the information at this node's disposal up to the current iteration. The edge bringing the information related to v is not considered. Likewise, u is expressed as where x is the variable node bit value, and y is the information at this check node's disposal up to the current iteration, ignoring the edge bearing u [7]. Following the analysis in [2], where a one-dimensional quantity, i.e., the mean of Gaussian densities, was demonstrated to perform as a reliable replacement for the message densities themselves, we hypothesize, as in [10], that we can approximate as Gaussian the LDPC message distributions for additive white Gaussian noise (AWGN) channels and denote with m v the means of u and v at the l-th iteration, respectively. Moreover, we assume the log-likelihood ratio (LLR) message u 0 from the channel to be Gaussian with mean m u 0 = 2/σ 2 n and variance 4/σ 2 n , being σ 2 n the variance of the channel noise. For a degree-i variable node at the lth iteration, the mean of the output yields is the mean of u at the (l − 1)-th iteration. Assuming that the codeword length tends to infinity, and consequently adducing a cycle-free Tanner graph argument, for a check node at the lth iteration, the mean of the output yields being the fundamental function φ(x) of [2] defined as: Assuming s := m u 0 and t l := m (l) u , and defining, as in [5], where the sequence (3) may be rewritten as Using the algorithmic method presented in [15], instead of looking for, as done in the original paper [2], the minimum value of the parameter s guaranteeing the convergence of the sequence (7), as in [10] we make use of a standard software to solve a problem of quadratic degeneracy [16]. Namely, when the second derivative g s (t) = 0, the problem of quadratic degeneracy is the solution of the system of equations To explain the mathematical meaning of (8), a graphical representation of it has been shown in Figure 1 of [15]. The solution of (8), (t * , s * ), gives an approximation σ * := 2 s * of the belief-propagation decoding threshold obtained applying density evolution, i.e., the so-called GA threshold.

Valuable Merits of an Approximation
It is our opinion that the valuable merits of an approximation are (see also [10,17], the latter concerning the approximation for the Gaussian probability integral Q(x)): • to be defined by a single expression, i.e., not piecewise defined; • its simplicity; • to be expressed in closed form, without integrals and series and continuous fractions (and limits) by means of: elementary functions with standard names used in mathematics; -the abovementioned elementary functions and the Lambert W-function (Notice that the principal branch of the real Lambert W-function (which is the inverse of x e x for x > −1 [10]), at least for x ≥ 0, is very classic, with a standard name used in mathematics, has well known and published series expansions, is explicitly invertible by means of elementary functions with standard names used in mathematics, but is not an elementary function;) • to be appreciable on a wide domain: the better would be R, subordinately to present a low absolute error (on some domain); • to present a low relative error in absolute value (on some domain); • to be explicitly invertible (see Section 6).
Moreover, as far as the approximation a(x) of the function φ(x)-reported in (4)-is specifically concerned, the search for it is based on these two fundamental points [10]: 1.
To avoid an infinite relative error, a(x) has to present a finite value in x = 0, otherwise it diverges in 0.

2.
To avoid an infinite absolute relative error ε r (1) (defined in (9)), the inverse a −1 (y) of the approximation a(x) has to be such that

Notes on Absolute and Relative Errors
Let's denote, for a function f (·) and an approximation a(·) of the same argument, by the absolute error, intended as a function with the same domain of f , and by a) , or even ε the maximum absolute error, intended as a number, summarizing the distance of the approximation from the function. Analogously, for relative errors, denote with the absolute relative error, intended as a function with the same domain of f if = 0, and by the maximum absolute relative error, intended as a number, summarizing the distance of the approximation from the function, normalized with respect to the function f itself, if = 0. Generally, it may be observed that, when classifying two approximations of a function by means of their closeness to that function, the use of the relative error as a means of measuring this closeness is more suitable if that function has a zero limit (In fact, for example, it is not very important to notice that an approximated value of about, let us say, 10 −5 , has an absolute error less than, let us say, 10 −4 ). Thus, for the fundamental function φ(x) defined in (4), with a zero limit for x → ∞ (see Figure 3 in [11], showing its graph), the use of relative errors is more significant and widely used in the literature: see [2,3,6,10], concerning the approximations for the fundamental function φ(x), but also works concerning other functions having a zero limit, such as the already cited Gaussian probability integral Q(x) (see [18][19][20][21][22] where the approximations for Q(x) were all found, minimizing the absolute relative error). Here, the relative error has been evaluated by applying (9) and (10), and computing φ(x) numerically through the following Mathematica ® statement [11]: The first line simply rewrites the first line of (4) to perform a numerical integration with the instruction NIntegrate. The second line specifies the integration interval and, through the instruction WorkingPrecision, how many digits of precision should be maintained in the computation. For the computations of this paper, the second line has been refined to {u,-320,320}, WorkingPrecision ->40]

The Function φ(x): Review of Its Principal Approximations
For the function φ(x) defined in (4), these upper and lower bounds have been proved in [2]: and these others, tighter, have been proved recently in [3]: Moreover, the following three approximations, all piecewise defined, have been published in [2,3,6], respectively (The approximations, throughout the paper, have been named with a subscript reporting the name of the first author of the publication): As remarked in [10], (8) can be solved making use of an explicitly invertible approximation of φ(x), since g s (t) is defined through φ and φ −1 .
The piecewise defined approximation found in [2], and recalled in (13), is made of two parts: the first, holding for 0 ≤ x ≤ 10, is the ad hoc approximation by elementary functions (13a), explicitly invertible, the second, holding for x > 10, is the approximation (13b), instead not explicitly invertible, and therefore, not usable directly in (8). Thus, to solve (8), in [15] Babich et al. have used the approximation (13a) only, which behaves very badly, as shown in Figure 1 of [10], for x > 10. In this case, the more the argument of φ(x) gets greater than 10 in the sequence defined by (7), the more uncertain are the GA thresholds evaluated solving (8), with respect to the exact ones σ exact evaluated through density evolution, e.g., in [12].
The piecewise defined approximation found in [3], and recalled in (14), also contains, for 0 ≤ x ≤ 10, an explicitly invertible ad hoc approximation (14a), obtained through elementary functions [10]. The approximation (14b) of [3] for x > 10, is instead not explicitly invertible again, like the one in [2], and thus again not usable directly in (8) to find its solution. Thus, to find the numerical results published in [2,3], their authors had to resort (even not explicitly stated in [2,3]) to an interpolated function approximating (13b) and (14b) by points to determine its inverse, needed to solve (8) (since it is defined through (5) and (6)).
To avoid the abovementioned problems, a new, explicitly invertible, approximation of φ(x) was given in [10]. This was not defined by multiple functions, but by a unique expression (the subscript "np" stands for "not piecewise" (defined)): being where W(x) is the Lambert W-function, and a Vatta−np (x) is the new approximation of φ(x), defined by a single expression for any x ≥ 0. To build an approximation a(x) of φ(x), having the two fundamental characteristics mentioned at the end of Section 3, the function was taken as a fundamental function, since, as remarked in the Appendix of [5], it is a good explicitly invertible asymptotical approximation of φ(x), but-as observed in [10]-it diverges in x = 0, giving an infinite maximum absolute relative error ε r (see Table 1 in [10]), and its inverse presents an infinite maximum absolute relative error ε r , too, since ψ −1 (1) ≈ 1.5 (i.e., = 0). In [10], some suitable functions ν(y), µ(x), and parameters a, b, c, and d have been explored to detect the most appropriate approximation. The fitting parameter a was selected so that a Vatta−np (0) = 1 precisely and, therefore, a −1 Vatta−np (1) = 0 (see the function ψ −1 (y) derived in the Appendix of [5]), thus fulfilling the two fundamental points mentioned at the end of Section 3. The functions ν(y), µ(x), and the parameters b, c, and d have been searched with the goal of minimizing the absolute relative error (9), while maintaining the explicit invertibility.
In Figure 2 of [10], it was shown that the relative error of a Vatta−np (x) is <0.14% in [0, 80], which is the interval of main interest, where the approximation (16) remains between the two tighter bounds (12) defined in [3] as shown in Figure 1. Furthermore, as shown in Table 1, it has a lower maximum absolute relative error ε r (10)-reported in bold-than any other approximation, except (15a) (In [6], the authors had to add the ad hoc explicitly invertible approximation (15a), since they had to compute φ(x) around x = 0, where the approximation (13a) is not good enough (see Figure 1 of [10])).

Approximating the Function φ −1 (y) through an Explicitly Invertible Approximation
The explicit invertibility of an approximation a(x) avoids the necessity of making use of an interpolated function to approximate it by points (x k , a(x k )) and its inverse by points (a(x k ), x k ) [10]. In other words, we assume that a function is explicitly invertible if we are able to find an inverse whose expression involves elementary functions and the Lambert W-function. Here, this property will be explained through an example, showing how to find an inverse with these characteristics.

Example of Explicit Invertibility
Given the function , the procedure to obtain the explicit inverse goes through the following steps:

7.
Finally, recalling that W(xe x ) ≡ x for x > 0 (see, e.g., [5]) invert the last obtaining this function (of the variable y) 8. and then make the aforementioned substitutions: yielding 10. Thus, the explicit inverse of the starting function y(x) can be expressed as

Explicit Inverse of the Approximation a Vatta−np (x)
Since the approximation a Vatta−np (x) (16) is made of the three explicitly invertible functions ψ, ν and µ, it is explicitly invertible by the Lambert W-function [10]: The inverse a −1 Vatta−np (y) of the approximation a Vatta−np (x), considered as an approximation of the inverse φ −1 (y) of φ(x), has a maximum absolute relative error ε r (10) of 1.1% in its domain [ψ(80), 1], which is the interval of main interest, being ψ(80) ≈ 0, as shown in Figure 2. This is less than the maximum absolute relative error of any other approximation inverse, when it exists, except (13b) (In [2], for x > 10 the authors suggest to use the approximation (13b), average of the upper and lower bound (11) which, like (14b), the average of the upper and lower bound (12), is not explicitly invertible), as shown in Table 1 (in bold). As mentioned at the end of Section 3, the approximation of the inverse in y = 1 must match φ −1 (1) exactly, otherwise it gives rise to an infinite maximum absolute relative error for the inverse (last column of Table 1) in 1, since φ −1 (1) = 0. This happens to the approximation (13a), for which a −1 Chung (1) ≈ ( 0.0218 0.4527 ) 1/0.86 = 0 (see [15], where the inverse of (13a) has been derived in (6)), and to the approximation (14a), whose inverse may be expressed as follows: with α = −0.4527, β = 0.0218, and γ = 0.864. In this case, a −1 Vatta (1) ≈ ( 0.0218 0.4527 ) 1/0.864 = 0 yields an infinite maximum absolute relative error for the inverse.

Numeric Results Concerning the Computation of the GA Thresholds and Discussion
Generalizing Luby et al.'s results reported in [23], Richardson and Urbanke have proved a concentration principle in [24] . In particular, they have shown that the average behaviour of the individual instances of a code tends to concentrate around its expected behaviour, obtained when its length is supposed to grow. In other words, they showed that this average behaviour converges to the behaviour of the cycle-free case, which can be derived from the corresponding cycle-free Tanner graph. They defined the threshold, as explained in the Introduction, for a random ensemble of irregular LDPC codes, specified by their degree distributions (1) and (2). Moreover, they presented an algorithm, denoted as the density evolution algorithm, for the determination of the thresholds through iteratively calculated message densities.
In this section, we present a comparison (see Table 2) between the GA thresholds, calculated by applying the procedure described throughout this paper, and the thresholds calculated through the density evolution algorithm. These thresholds have been shown in [24] to be approached more closely as the codeword lengths increase (see Figures 5 and 6 in [24], where the bit error rate performance curves are shown to move closer to the density evolution thresholds, denoted by σ * , as the block lengths increase from 1000 to 100,000). This corresponds to making a comparison between the GA thresholds and those obtained by applying a message passing algorithm to the iterative decoding of a long deterministic LDPC code.  [2] x > 0 σ GAChung = 0.9473 σ GA0.05 = 0.9519 (14) Vatta (2019) [3] x > 0 σ GA0.1 = 0.9538 σ GA0.5 = 0.9533 (16) Vatta (2019) [10] x ≥ 0 σ GAVatta−np = 0.9526 For the irregular rate-1/2 LDPC code given in [2], with d l = 20 and d r = 9, with degree distributions λ(x) = 0.23403x + 0.21242x 2 + 0.14690x 5 + 0.10284x 6 +0.30381x 19 ρ(x) = 0.71875x 7 + 0.28125x 8 (18) in [3], applying the software of [15] and the piecewise defined approximation (14), a GA threshold was calculated σ GAVatta = 0.9538, which was closer to the density evolution one σ exact = 0.9669 (reported in [2]) than the GA threshold σ GAChung = 0.9473 computed in [2] using the piecewise defined approximation (13). As said in Section 5, to find the numerical results published in [2,3], their authors had to resort (even not explicitly stated) to an interpolated function approximating (13b) and (14b) by points (x k , a(x k )) to determine its inverse by points (a(x k ), x k ), needed to solve (8). As far as the function interpolation (14b) is concerned, we found that the GA threshold computed through it depends on the interpolation step ∆x considered. Taking, e.g., an interpolation step ∆x = 0.5, we computed a GA threshold σ GA0.5 = 0.9533, with ∆x = 0.1 a GA threshold σ GA0.1 = 0.9538 (the one reported in [3]), and with ∆x = 0.05, a GA threshold σ GA0.05 = 0.9519. By instead applying the new, not piecewise defined, approximation (16), we obtain σ GAVatta−np = 0.9526, which is closer to the density evolution threshold σ exact = 0.9669 than σ GA0.05 = 0.9519 and σ GAChung = 0.9473, but is slightly looser than σ GA0.5 = 0.9533 and σ GA0.1 = 0.9538. However, using (16), the solution of (8) is greatly simplified since, as noticed in [10], this approximation is given in [0, +∞[ by a single expression. This implies that, in any argument x, it presents a unique analytical expression permitting its straightforward algebraic manipulation. On the other hand, the so-called piecewise defined functions present different analytical expressions determined by the value taken by the argument x. Moreover, by using it, the solution of (8) does not depend on any discretization step.

Conclusions
In this paper, a review of old and new approximations to the function φ(x), characterizing the passage of messages between variable and check nodes of the bipartite graph describing an LDPC code [1], has been performed. In particular, two old (published in 2001 and 2004, respectively) and two new approximations, recently published, to the function φ(x), were reviewed to analyze the differences among them and their characteristics in terms of accuracy and computational complexity. The second of the two new approximations, unlike the other three (two old and one new) is defined by a single expression (i.e., it is not piecewise defined), it is explicitly invertible by means of the Lambert W-function, it remains between the two tighter bounds (12) recently defined in [3], and it has a lower relative error in any x than most of the other approximations (With the only exception of the explicitly invertible approximation (15a), valid only for 0 ≤ x < 0.867861) when the error is evaluated using a numerical computation of φ(x). As remarked in [10], the fact that it is defined by a single expression and it is explicitly invertible yield an important complexity reduction of the algebraic handling of the approximation itself. Moreover, its use allows better GA thresholds to be obtained, in the sense that they approximate the thresholds obtained with density evolution better than the GA thresholds obtained in [2].