The truncated Lindley distribution with applications in astrophysics

This paper reviews the Lindley distribution and then introduces the scale and the double truncation. The unknown parameters of the truncated Lindley distribution are evaluated with the maximum likelihood estimators. An application of the Lindley distribution with scale is done to the initial mass function for stars. The magnitude version of the Lindley distribution with scale is applied to the luminosity function for the Sloan Digital Sky Survey (SDSS) galaxies and to the photometric maximum of the 2MASS Redshift Survey (2MRS) galaxies. The truncated Lindley luminosity function allows us to model the Malquist bias of the 2MRS galaxies.


Introduction
The Lindley distribution is defined by one parameter and was introduced to study the difference between fiducial distribution and posterior distribution, see [1,2]. Its detailed properties such as moments, cumulants, characteristic function, failure rate function and . . . can be found in [3]. We now briefly outline some new trends, among others, for this distribution. A three parameter generalization of the Lindley distribution has been analyzed by [4], the truncated versions of the Lindley distribution has been studied by [5], and the estimation of the parameters of the generalized Lindley distribution has been done by [6] and a three-parameter Lindley distribution has been introduced by [7]. A careful analysis of these applications in the various fields of the natural sciences has revealed that the Lindley distribution has not yet been applied to astrophysics. Usually the mathematicians introduce many parameters, which characterize statistical distributions. In contrast, applications in the real world require fewer parameters, such as mean value and variance. The rapid development of computers has allowed to simulate the statistical distributions through the generation of random numbers, but this requires the evaluation of the inverse of the distribution function. A first example of an astrophysical application for a probability density function (PDF) is represented by the initial mass function for the stars (IMF). The distribution in mass of the stars has been fitted with a power law. This started with [8], who suggested that ξ(m) ∝ m −α where ξ(m) represents the probability of having a mass between m and m + dm; He found α = 2.35 in the range 10M > M ≥ 1M . Subsequent research has started to analyze the initial mass function (IMF) with three power laws, see [9,10,11], and four power laws, see [12]. The approach to the IMF using a continuous distribution has been modeled by the lognormal distribution in order to fit both the range of the stars and the brown dwarfs (BDs) regime, see [13], by the beta distribution, see [14], by the truncated gamma distribution, see [15] and by the truncated lognormal distribution, see [16]. The previous analysis raises the following questions: -Is it possible to find the constant of normalization for a left and right truncated Lindley PDF? -Is it possible to derive an analytical expression for the mean of a left and right truncated Lindley PDF -Is a left and right truncated Lindley PDF a model for the IMF and for a sample of masses?
A second example of an astrophysical application for a PDF is given by the luminosity function (LF) for galaxies. The Schechter function was the first LF for galaxies to be introduced, see [17]. Over the years, other LFs for galaxies have been suggested, such as a two-component Schechter-like LF, see [18], the hybrid Schechter+power-law LF to fit the faint end of the K-band, see [19], and the double Schechter LF, see [20]. To improve the flexibility at the bright end, a new parameter η was introduced in the Schechter LF, see [21]. A third astrophysical application is in the photometric maximum visible in the number of cluster of galaxies as function of the redshift; for example, see Figure 7 in [22] where the number of galaxies as function of the redshift is plotted and Figure 2 in [23] where the number of clusters for three catalogs are reported as function of the redshift. The theoretical explanation of this effect is the joint distribution in redshift and and flux for galaxies; see formula (5.133) in [24] or formula (1.104) in [25] or formula (1.117) in [26]. Despite this theoretical background, the photometric maximum has been poorly analyzed. A fourth astrophysical application is in the range in absolute magnitude of galaxies versus the redshift visible in the various catalogs; for example, see Figure 9 in [22]. The mass of the stars in the IMF, the luminosity of galaxy in the LF and the absolute magnitude of galaxy in a given range of redshift vary between a minimum and a maximum value. This discussion suggests the introduction of finite boundaries for the Lindley IMF and LF rather than the usual zero and infinity following a pattern similar to the introduction of a left truncated beta LF; see [27], and for a left and right truncated Schechter LF luminosity function, see [28].
This paper reviews the original Lindley distribution in Section 2.1. It introduces the scaling in Section 2.2 and the double truncation in Section 2.3. The applications to the astrophysics are developed for the IMF, see Section 3, and for the luminosity function (LF) for galaxies, see Section 4.

The Lindley family
We present a family of distributions of gradually increasing complexity.

Lindley distribution
Let X be a random variable defined in [0, ∞]; the Lindley probability density function (PDF), f (x), is the distribution function (DF), F (x), is where c > 0. At x = 0 f (0) = c 2 1+c and not zero. The average value or mean, µ, is the variance, σ 2 , is The rth moment about the origin and an approximation of the median are reported in Appendix A. The random generation of the Lindley variate X:c is given by where W is the Lambert W function, after [29], and R the unit rectangular variate R. The Lambert W function according to [30] is defined as W e W = x .
The principal branch, W p(x), and the other branch, W m(x), of the Lambert W-function can evaluated with the Halley method see [31,32,33]. The two branches of the Lambert W-function are reported in Figure 1.
A typical simulation of the Lindley PDF is reported in Figure 2.
The experimental sample consists of the data x i with i varying between 1 and n; the sample mean, the unbiased sample variance, s 2 , is and the sample rth moment about the origin,x r , is The parameter c can be obtained by the following match and therefore c = −µ 1 + 1 + µ 1 2 + 6 µ 1 + 1 2 µ 1 .

The Lindley distribution with scale
We now introduce the scale b in the Lindley distribution and the PDF, f s (x; b, c), is The mean, µ s (c, b), is and the variance, At The rth moment about the origin is reported in Appendix B. The parameters b and c can be obtained by the following match which means The inequality s 2 >x 2 /2 makes both b and c negatives and, therefore, the sample is not suitable for a fit with Lindley distribution with scale. [5,34], is

The truncated Lindley distribution with scale
and the DF, The inequality which fixes the range of existence is The first moment about the origin, µ 1,t (b, c, x l , x u ), is where and the second moment about the origin, The variance, σ 2 t (b, c, x l , x u ), is evaluated as The parameters b and c can be evaluated with the maximum likelihood estimators (MLE), see Section C.

The IMF for stars
The IMF for stars is actually fitted with three and four power laws, see [35,36]. The piece-wise broken inverse power law IMF is each zone being characterized by a different exponent α i and two boundaries m i and m i+1 . To have a PDF normalized to unity, one must have The number of parameters to be found from the considered sample for the n-piece-wise IMF is 2n − 1 when m 1 and m n+1 are the minimum and maximum of the masses of the sample. In the case of n = 4, which fits also the region of brown dwarfs (BD), see [14], the number of parameters is seven. In the field of statistical distributions, the PDF is usually defined by two parameters. Examples of two-parameter PDFs are: the beta, gamma, normal, and lognormal distributions, see [37]. The lognormal distribution is widely used to model the IMF for the stars, see [38,39,40,13]. The lognormal distribution is defined in the range of M ∈ (0, ∞) where M is the mass of the star. Nevertheless, the stars have minimum and maximum values. In an example from the MAIN SEQUENCE, an M8 star has M = 0.06M and an O3 star has M = 120M , see [41]. The presence of boundaries for the stars makes the analysis of the truncated lognormal, see [16], and of the truncated Lindley PDF attractive. In the case of the truncated Lindley PDF, the analysis of the samples representative of the IMF for stars is limited to those that produce both parameters b and c positive, and are therefore suitable for a fit with the truncated Lindley distribution. The statistical parameters are the same of [16] and are the merit function χ 2 , the reduced merit function χ 2 red , the Akaike information criterion (AIC), the number of degrees of freedom N F = n − k where n is the number of bins and k is the number of parameters, the goodness of the fit expressed by the probability Q, the maximum distance, D, between the theoretical and the astronomical DF and the significance level, P KS , for the Kolmogorov-Smirnov test (K-S).
To give an example, Figure 3 reports the truncated Lindley DF for NGC 6611 with statistical parameters as in Table 1.
A careful analysis of Table 1 allows to conclude that in the case of NGC 6611 the truncated Lindley PDF produces a better fit in respect to the lognormal and truncated lognormal PDFs.  Table 1, MLE method. The horizontal axis has a logarithmic scale. The lifetime of a star belonging to the MAIN V, t M S , is where t is the lifetime of the sun, 10 10 yr, M is the mass of MAIN V star and M the solar mass, see http://astronomy.swin.edu.au/cosmos/ for more details. Figure 4 reports the modifications of the Lindley PDF with an increasing upper boundary. Meanwhile, Table 2 reports the correspondence between the selected mass and the connected lifetime.  For example, in the case of the cluster NGC 6611, the upper limit in mass will decrease from 1.4 M to 1 M in 9.9 10 9 yr and after that time the total number of stars will be the 92.25% of the original number of stars. The above model allows to see how the time modifies the Hertzsprung-Russell (H-R) diagram, i.e. the M V against (B − V ), in the young clusters of stars.

The luminosity function for galaxies
In this section, we review the standard luminosity function (LF) for galaxies, we introduce a Lindley LF and a truncated Lindley LF, we then outline the formulae of the photometric maximum and we parametrize the averaged absolute magnitude as function of the redshift.

The Schechter function
The Schechter function, introduced by [17], provides a useful fit for the LF of galaxies here α sets the slope for low values of L, L * is the characteristic luminosity and Φ * is the normalization. The equivalent distribution in absolute magnitude is where M * is the characteristic magnitude as derived from the data. We now introduce the parameter h which is The numerical exploration of a new LF for galaxies requires that the χ 2 red is smaller or approximately equal to that of the Schechter LF. As an example, the LF given by the generalized gamma distribution with four parameters gives χ 2 red smaller than that of the Schechter LF in the five bands of the Sloan Digital Sky Survey (SDSS) galaxies, see equation (21) an Table II in [42]

The Lindley LF
We start with the Lindley PDF with scaling as given by equation (13), where L is the luminosity, L * is the characteristic luminosity and Ψ * is the normalization. The magnitude version is where M is the absolute magnitude, M * the characteristic magnitude and Ψ * is the normalization. A test is performed on the u * band of SDSS as in [43] with data available at https://cosmo.nyu.edu/ blanton/lf.html. The Schechter function, the new Lindley LF represented by formula (33) and the data are reported in Figure 5, parameters as in Table 3.

The truncated Lindley LF
We start with the truncated Lindley PDF with scaling as given by equation (20) Ψ (L; c, L * , Ψ * , L l , L u )dL = Ψ * e − cL L * (L + L * ) c 2 DT ,  with where L is the luminosity, L * is the characteristic luminosity, L l is the lower boundary in luminosity, L u is the upper boundary in luminosity, and Ψ * is the normalization. The magnitude version is At the moment of writing, the analytical solution does not exists and the integration should be done numerically. Table 3 reports the parameters of the truncated Lindley LF from which is possible to conclude that the effect of truncation in the Lindley LF produces a minimum decrease in the χ 2 red : Lindley LF with truncation χ 2 red = 6.6739 and Lindley LF χ 2 red = 6.6741.

The photometric maximum
In the pseudo-Euclidean universe, the correlation between expansion velocity and distance is where H 0 is the Hubble constant, after [44], H 0 = 100h km s −1 Mpc −1 , with h = 1 when h is not specified, D is the distance in M pc, c l is the light velocity and z is the redshift. In the pseudo-Euclidean universe the flux of radiation, f , expressed in L M pc 2 units, where L represents the luminosity of the sun, is where D represents the distance of the galaxy expressed in M pc, and The joint distribution in z and f for the Schechter LF, see formula (1.104) in [25] or formula (1.117) in [26], is where dΩ, dz and df represent the differential of the solid angle, the redshift and the flux respectively and Φ is the Schechter LF. The critical value of z, z crit , is where L * has been defined in Section 4.1. The number of galaxies in z and f for the Schechter LF as given by formula (43) has a maximum at z = z pos−max , where which can be re-expressed as where M is the reference magnitude of the sun at the considered bandpass. The position of the maximum in redshift for the Schechter LF depends from the flux of the selected astronomical band, f , and from the two parameter which characterizes the Schechter LF: α and M * . More details can be found in [45]. The joint distribution in z and f for galaxies for the Lindley LF, see equation (34), is The maximum in the number of galaxies for the Lindley LF as function of z crit is at or as function of the flux f The position of the maximum in redshift for the Lindley LF depends from the flux of the selected astronomical band, f , or the selected apparent magnitude, m, and from the two parameter which characterizes the Lindley LF: c and M * . The mean redshift for galaxies z can be defined as The mean redshift for the Lindley LF as function of z crit is or as a function of the flux .
(54) Figure 6 reports the number of observed galaxies of the 2MASS Redshift Survey (2MRS) catalog for a given apparent magnitude and the two theoretical curves are analyzed with same parameters as in Table  4. These parameters are derived in such a way that the χ 2 is minimum. Therefore, this is a new method to derive the parameters which characterize the two LFs here adopted without using the samples for the LF such as the five bands of SDSS galaxies. L M pc 2 are organized in frequencies versus heliocentric redshift, (empty circles); the error bar is given by the square root of the frequency. The maximum frequency of observed galaxies is at z = 0.018. The full line is the theoretical curve generated by dN dΩdzdf (z) as given by the application of the Schechter LF which is equation (43) and the dashed line represents the Lindley LF which is equation (47). The parameters are the same of Table 4, χ 2 = 198 for the Schechter LF and χ 2 = 191 for the Lindley LF.

Averaged absolute magnitude
We now introduce the concept of limiting apparent magnitude. The observable absolute magnitude as a function of the limiting apparent magnitude, m L , is Figure 7 presents such a curve and the galaxies of the 2MRS. We now compare the theoretical averaged  absolute magnitude of the truncated Lindley LF, see equation (39), with the observed averaged absolute magnitude of 2MRS as function of the redshift. To fit the data we assumed the following empirical dependence with redshift for the characteristic magnitude of the truncated Lindley LF This relationship models the decrease of the characteristic absolute magnitude as function of the redshift and allows us to match observational and theoretical data. The lower bound in absolute magnitude is given by the minimum magnitude of the selected bin, the upper bound is given by equation (55), the characteristic magnitude varies according to equation (56) and Figure 8 reports the comparison between theoretical and observed absolute magnitude for 2MRS.

Conclusions
Statistical Distributions We introduced the Lindley distribution with scale and the truncated Lindley distribution. The parameters of the Lindley distribution with scale can be found with the method of the matching moments. In the the case of the truncated Lindley distribution the MLE is used to estimate the unknown parameters.
Application to the stars To fit the IMF for stars with the truncated Lindley PDF, the parameters b and c, which is deduced from the astronomical sample, should be positive. This is the case of NGC 6611 (207 stars + BDs), for which the reduced merit function is smaller for the truncated Lindley distribution in respect to the lognormal and truncated lognormal distribution, see Table 1.
Application to the galaxies The Lindley LF for galaxies is characterized by a higher reduced merit function in respect to the Schechter LF for the case of SDSS Galaxies in the u * band, see Table 3. Conversely the Lindley LF for galaxies produces a lower value of the merit function when the photometric maximum of 2MRS is modeled in respect to the Schechter model for the maximum, see Figure 6. The truncated Lindley LF produces an acceptable model for the averaged absolute magnitude of the galaxies belonging to the 2MRS, see Figure 8.

A Other parameters of the Lindley distribution
The rth moment about the origin for the Lindley distribution is, µ r , is is the gamma function, see [30]. The central moments, µ r , are being µ 2 = σ 2 . Is impossible to evaluate the median in a closed form and therefore we introduce an approximated distribution function, F 2,2 in terms of the Padé rational polynomial approximation, after [46], of degree 2 in the numerator and degree 2 in the denominator about the point x = 0 The approximated median, m 2,2 turns out to be The percent error, δ, in the evaluation of the approximated median is δ = 1.179 % at c=0.5 and δ = 0.077 % at c=2.

B Moments for the Lindley distribution with scale
The rth moment about the origin for the Lindley distribution with scale, µ r,s , is The central moments, µ r,s , are

C The parameters of the truncated Lindley distribution
The parameters of the truncated Lindley distribution can be obtained from empirical data by the maximum likelihood estimators (MLE) and by the evaluation of the minimum and maximum elements of the sample. Consider a sample X = x 1 , x 2 , . . . , x n and let x (1) ≥ x (2) ≥ · · · ≥ x (n) denote their order statistics, so that x (1) = max(x 1 , x 2 , . . . , x n ), x (n) = min(x 1 , x 2 , . . . , x n ). The first two parameters x l and x u are The MLE is obtained by maximizing Λ = n i ln(f t (x i ; b, c, x l , x u )). (C.9) The two derivatives ∂Λ ∂b = 0 and ∂Λ ∂c = 0 generate two non-linear equations in b and c which can be solved numerically, we used FORTRAN subroutine SNSQE in [47],  x i bc 2 x i bc + 2 e − cx l b b 2 n . (C.13)