Corrected Maximum Likelihood Estimation of the Lognormal Distribution Parameters

Because of the asymmetry found in practical problems, the Lognormal distribution is often more suitable than the normal distribution for modeling data in biological and economic fields. However, the biases of its maximum likelihood estimators are typically of order $O(n^{-1})$, which matters especially in small samples. It is therefore necessary to derive analytical expressions for the first-order biases and to construct nearly unbiased estimators by bias-correction techniques. Two methods are adopted in this article: the analytical Cox-Snell method and the resampling method known as the parametric Bootstrap. Both improve the performance of the maximum likelihood estimators and correct the biases of the Lognormal distribution parameters. Through Monte Carlo simulations, we obtain the average bias and the root mean squared error, two important indexes for comparing the methods. The numerical results reveal that for small and medium-sized samples, the analytical bias-corrected estimation is superior to both the bootstrap estimation and the classical maximum likelihood estimation. Finally, an example based on real data is given.


Introduction
Because of its flexibility and universality, the Lognormal distribution is a commonly used lifetime distribution in reliability; as discussed in [1], it can describe the fatigue life and wear resistance of products. Ref. [2] introduces modern computational statistical methods for test planning and reliability data analysis of industrial products. The Lognormal distribution is asymmetric and positively skewed. Its hazard rate function always starts at zero, rises to a maximum, and then drops slowly back toward zero. Hence, when empirical studies suggest that the hazard rate of the underlying distribution is nonmonotone and single-peaked, such data can be analyzed with the Lognormal distribution. Ref. [3] presents, in a table associated with normal (Gaussian) random variables, approximate values of the hazard rate that can be used to calculate parameter values outside the usual range. Ref. [4] was the first to put forward the Lognormal distribution theory in detail. In the late 1920s, the Lognormal distribution developed along an independent path in small-particle statistics. Ref. [5] pointed out the applicability of truncated or censored lognormal distributions, and [6] applied the theory to biological data appearing as discrete counts. Ref. [7] further studied the Lognormal distribution.
Most existing statistical analysis methods are based on the hypothesis that the data are normally distributed. Nevertheless, for data that contain no negative values or are otherwise skewed, the assumption of normality does not hold; this is common in disciplines as diverse as biology, politics, philology, economics, physics, finance and sociology. In such cases, the Lognormal distribution can fit the data better.
When an event is affected by multiple factors, the distributions of these factors are unknown. According to the central limit theorem, the suitably standardized sum of these factors is approximately normally distributed; this theorem underlies most theoretical arguments involving the normal distribution. It states that, as the number of variables increases, the distribution of the standardized sum of independent random variables approaches the standard normal distribution. Nevertheless, because the normal distribution is symmetric, it cannot accurately model many actual problems: many natural growth processes are not driven by independent additive effects but by the accumulation of many small percentage changes, which combine multiplicatively. Such effects are additive on the logarithmic scale, and when transformed back to the original scale the resulting distribution is approximately Lognormal. Refs. [8,9] pointed out that an important difference between the normal and Lognormal distributions is that the former arises from additive effects and the latter from multiplicative ones; taking logarithms converts multiplication into addition.
If log(X) follows a normal distribution, the random variable X is said to follow the Lognormal distribution; X takes only positive values. The mean and the median can then differ, and the arithmetic mean is seriously affected when the data contain a few large values; in such circumstances, the Lognormal distribution fits the data better. The geometric mean estimates the median, while the arithmetic mean exceeds the median, producing a right-skewed distribution. The density curve of the Lognormal distribution is therefore right-skewed, with a long tail on the right-hand side and a narrow concentration on the left. The Lognormal distribution resembles the Weibull distribution for some shape parameters, and some data suitable for the Weibull distribution are also appropriate for the Lognormal distribution.
In this paper, we discuss methods that correct the maximum likelihood estimators of the Lognormal distribution parameters and deduce explicit bias formulae for finite samples. The probability density function of the Lognormal distribution is

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\exp\left(-\frac{(\ln x-\mu)^2}{2\sigma^2}\right), \qquad (1)$$

where $\mu \in \mathbb{R}$, $\sigma > 0$ and $x > 0$. The mathematical expectation of the Lognormal distribution is $E(X) = e^{\mu+\sigma^2/2}$. For a given parameter $\mu$, as $\sigma$ tends to zero, the expectation tends to $e^{\mu}$.
Then, the failure rate function is

$$h(x) = \frac{\phi(y)}{\sigma x\,[1-\Phi(y)]}, \qquad y = \frac{\ln x-\mu}{\sigma},$$

where $\phi(y)$ is the probability density function of the standard normal distribution and $\Phi(y)$ is its distribution function.
The failure rate function is applicable to non-monotonic data with an inverted-bathtub shape. When the parameter $\sigma$ is small, the model simplifies to an increasing failure rate; when $\sigma$ is large, it simplifies to a decreasing failure rate. The Lognormal distribution is applied to the reliability analysis of semiconductor devices and to the fatigue life of some kinds of mechanical parts; in maintainability analysis, its main use is the accurate analysis of repair-time data.
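The inverted-bathtub shape of the hazard can be checked numerically. The sketch below (plain standard-library Python; the helper name `lognormal_hazard` and the grid are our own choices) evaluates $h(x) = \phi(y)/(\sigma x\,[1-\Phi(y)])$ and confirms that it rises to a single interior maximum and then decays:

```python
import math

def lognormal_hazard(x, mu, sigma):
    # h(x) = phi(y) / (sigma * x * (1 - Phi(y))), y = (ln x - mu) / sigma
    y = (math.log(x) - mu) / sigma
    phi = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    return phi / (sigma * x * (1.0 - Phi))

xs = [0.05 * k for k in range(1, 400)]           # grid on (0, 20)
hs = [lognormal_hazard(x, 0.0, 1.0) for x in xs]
peak = hs.index(max(hs))
# the hazard has a single interior maximum: inverted-bathtub shape
print(0 < peak < len(hs) - 1)  # True
```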
The Lognormal distribution has many important applications in financial asset pricing, such as the Black-Scholes equations, and in reliability engineering [10]. It is also widely applied in health care. For example, Ref. [11] pointed out that the transcription levels of different genes follow the Lognormal distribution and provided a more exact method to estimate the gene activity of a typical cell, and Ref. [12] showed that the age distribution of Perthes disease is log-normal.
For curves with a horizontal asymptote, the standard assumption of additive, homogeneous errors with a symmetric distribution almost inevitably fails. When estimating the parameters of any probability distribution, the choice of estimation method is very important. For both real and simulated data, the most commonly used of the classical estimation techniques is maximum likelihood. Under the usual regularity conditions, it is invariant, asymptotically consistent and asymptotically normal. We therefore focus on the maximum likelihood estimators of the Lognormal distribution parameters. However, these excellent properties are only guaranteed for large samples; in finite samples, especially small ones, the maximum likelihood estimators (MLEs) are often biased. Since the likelihood equations of many distributions are highly non-linear and lack closed-form solutions, determining this bias can be complex. In practical applications, to increase estimation accuracy, it is essential to obtain a closed-form expression for the first-order bias of the estimator. Many researchers have corrected the bias for different distributions; readers can refer to [13-21].
We focus on two approaches that reduce the bias of the MLEs from first order to second order, and illustrate their effects. Ref. [22] recommended a method for obtaining analytical bias expressions in closed form; we use these expressions to correct the MLEs and obtain estimators consistent to second order. As an alternative to the analytically bias-adjusted MLEs, Ref. [23] introduced the parametric Bootstrap resampling method. The two methods are compared with the classical MLE in terms of bias reduction and root mean squared error. On the basis of the results, we conclude that the analytic method is well suited to the Lognormal distribution.
The remainder of the article is arranged as follows. Section 2 briefly describes maximum likelihood point estimation of the Lognormal distribution parameters. Section 3 describes the Cox-Snell bias-correction method and the parametric bootstrap resampling method, and applies them to the Lognormal distribution. Section 4 contains Monte Carlo simulations used to compare the performance of the Cox-Snell bias-corrected estimation, the bootstrap bias-corrected estimation and the MLE. Section 5 illustrates the point with an application to real data. Section 6 summarizes the article.

Maximum Likelihood Estimator
The maximum likelihood estimators of the Lognormal distribution are discussed in this part. Let $x = (x_1, \cdots, x_n)$ be a random sample of independent observations from the Lognormal distribution. From (1), the likelihood function is

$$L(\mu,\sigma) = \prod_{i=1}^{n} \frac{1}{x_i\sigma\sqrt{2\pi}}\exp\left(-\frac{(\ln x_i-\mu)^2}{2\sigma^2}\right),$$

where $\mu \in \mathbb{R}$ and $\sigma > 0$ are the unknown parameters. The log-likelihood function of $\mu$ and $\sigma$ is

$$\ell(\mu,\sigma) = -n\ln\sigma - \frac{n}{2}\ln(2\pi) - \sum_{i=1}^{n}\ln x_i - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(\ln x_i-\mu)^2.$$

The MLEs $\hat{\mu}$ and $\hat{\sigma}$ solve $\partial\ell/\partial\mu = 0$ and $\partial\ell/\partial\sigma = 0$, giving

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n}\ln x_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(\ln x_i-\hat{\mu})^2.$$

The expected information matrix is

$$K = \begin{pmatrix} n/\sigma^2 & 0 \\ 0 & 2n/\sigma^2 \end{pmatrix},$$

and its inverse is

$$K^{-1} = \begin{pmatrix} \sigma^2/n & 0 \\ 0 & \sigma^2/(2n) \end{pmatrix}.$$
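Since the log-likelihood separates in $\ln x_i$, the MLEs above have closed forms and need no numerical optimization. A minimal standard-library Python sketch (the helper name `lognormal_mle` and the true values $\mu = 1.0$, $\sigma = 0.5$ are our own illustrative choices):

```python
import math
import random

def lognormal_mle(xs):
    """Closed-form MLEs: mu_hat = mean(ln x), sigma_hat^2 = mean((ln x - mu_hat)^2)."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu_hat = sum(logs) / n
    sigma_hat = math.sqrt(sum((y - mu_hat) ** 2 for y in logs) / n)
    return mu_hat, sigma_hat

random.seed(1)
# lognormal data generated by exponentiating normal draws (assumed true mu=1.0, sigma=0.5)
sample = [math.exp(random.gauss(1.0, 0.5)) for _ in range(10000)]
mu_hat, sigma_hat = lognormal_mle(sample)
print(round(mu_hat, 1), round(sigma_hat, 1))  # 1.0 0.5
```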

Bias-Corrected MLEs
We adopt the analytical method proposed by [22] and the parametric Bootstrap resampling method (Efron, 1982). Both methods modify the maximum likelihood estimators by reducing the order of their bias.

Cox-Snell Methodology
For an observed sample of size $n$ and an $r$-dimensional parameter vector $\theta$, we can derive the log-likelihood $\ell(\theta)$, which is assumed regular with respect to all derivatives up to and including the third order. The joint cumulants of the derivatives of $\ell(\theta)$ are

$$k_{ij} = E\!\left[\frac{\partial^2\ell}{\partial\theta_i\partial\theta_j}\right], \quad k_{ijl} = E\!\left[\frac{\partial^3\ell}{\partial\theta_i\partial\theta_j\partial\theta_l}\right], \quad k_{ij,l} = E\!\left[\frac{\partial^2\ell}{\partial\theta_i\partial\theta_j}\cdot\frac{\partial\ell}{\partial\theta_l}\right],$$

for $i, j, l = 1, \cdots, r$. Moreover, the derivatives of $k_{ij}$ are written as $k_{ij}^{(l)} = \partial k_{ij}/\partial\theta_l$. All joint cumulants of the derivatives are presumed to be $O(n)$, and the sample observations are independent but not necessarily identically distributed. The bias of the $s$-th element of $\hat{\theta}$ can be written as

$$\operatorname{Bias}(\hat{\theta}_s) = \sum_{i}\sum_{j}\sum_{l} k^{si}k^{jl}\left[\tfrac{1}{2}k_{ijl} + k_{ij,l}\right] + O(n^{-2}),$$

where $s = 1, \cdots, r$ and $k^{ij}$ is the $(i,j)$-th element of the inverse of the expected information matrix $K = -[k_{ij}]$. The bias equation also applies in the non-independent case, provided each term of $K$ is $O(n)$. It can be rewritten as

$$\operatorname{Bias}(\hat{\theta}_s) = \sum_{i}\sum_{j}\sum_{l} k^{si}k^{jl}\left[k_{ij}^{(l)} - \tfrac{1}{2}k_{ijl}\right] + O(n^{-2}).$$

This expression is usually preferred for computation because it does not involve the terms $k_{ij,l}$. Defining $a_{ij}^{(l)} = k_{ij}^{(l)} - \tfrac{1}{2}k_{ijl}$ for $i, j, l = 1, \cdots, r$, with matrices $A^{(l)} = [a_{ij}^{(l)}]$ and $A = [A^{(1)}\,|\,A^{(2)}\,|\,\cdots\,|\,A^{(r)}]$, the bias of $\hat{\theta}$ can be re-expressed in the concise form

$$\operatorname{Bias}(\hat{\theta}) = K^{-1}A\,\operatorname{vec}(K^{-1}) + O(n^{-2}).$$

The bias-corrected estimator of $\theta$ is then

$$\hat{\theta}_{BCE} = \hat{\theta} - \hat{K}^{-1}\hat{A}\,\operatorname{vec}(\hat{K}^{-1}),$$

where $\hat{K}$ and $\hat{A}$ denote $K$ and $A$ evaluated at $\hat{\theta}$. It is noteworthy that the expected values of the joint cumulants of the derivatives of $\ell(\theta)$ can be calculated, so this methodology is suitable for the Lognormal distribution; if an analytical solution cannot be obtained, a numerical solution can be used instead.
Next, we compute the cumulants of the derivatives of the log-likelihood of the Lognormal distribution up to third order. Differentiating and taking expectations gives

$$k_{11} = -\frac{n}{\sigma^2}, \quad k_{12} = k_{21} = 0, \quad k_{22} = -\frac{2n}{\sigma^2},$$

$$k_{111} = 0, \quad k_{112} = \frac{2n}{\sigma^3}, \quad k_{122} = 0, \quad k_{222} = \frac{10n}{\sigma^3},$$

where the index 1 refers to $\mu$ and 2 to $\sigma$. The required derivatives of the second-order cumulants are $k_{11}^{(2)} = 2n/\sigma^3$ and $k_{22}^{(2)} = 4n/\sigma^3$; all the others are zero.
The information matrix is $K = \operatorname{diag}(n/\sigma^2,\, 2n/\sigma^2)$. Thus, the $O(n^{-1})$ biases of the MLEs of $\mu$ and $\sigma$ are, respectively,

$$\operatorname{Bias}(\hat{\mu}) = 0, \qquad \operatorname{Bias}(\hat{\sigma}) = -\frac{3\sigma}{4n}.$$

The bias-adjusted estimators are then

$$\hat{\mu}_{BCE} = \hat{\mu}, \qquad \hat{\sigma}_{BCE} = \hat{\sigma}\left(1 + \frac{3}{4n}\right).$$

We observe that the bias-corrected estimation (BCE) leaves $\hat{\mu}$ unchanged, since its first-order bias is zero; given the behavior of the Lognormal distribution, we consider this related to $\hat{\mu}$ being a linear function of the log-observations. It is to be noted that the bias of $\hat{\sigma}_{BCE}$ is of order $O(n^{-2})$.
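The correction $\hat{\sigma}_{BCE} = \hat{\sigma}(1 + 3/(4n))$ is easy to verify by simulation. The sketch below (standard-library Python; the sample size $n = 10$ and $M = 20{,}000$ replications are our own choices, not those of the paper's tables) compares the average bias of $\hat{\sigma}_{MLE}$ and $\hat{\sigma}_{BCE}$:

```python
import math
import random

random.seed(2)
true_mu, true_sigma, n, M = 0.5, 1.0, 10, 20000
bias_mle = 0.0
bias_bce = 0.0
for _ in range(M):
    logs = [random.gauss(true_mu, true_sigma) for _ in range(n)]  # log-data are normal
    m = sum(logs) / n
    s = math.sqrt(sum((y - m) ** 2 for y in logs) / n)            # sigma_hat (MLE)
    s_bce = s * (1.0 + 3.0 / (4.0 * n))                           # Cox-Snell corrected
    bias_mle += s - true_sigma
    bias_bce += s_bce - true_sigma
# the analytic correction removes almost all of the O(1/n) downward bias
print(abs(bias_bce / M) < abs(bias_mle / M))  # True
```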

Parametric Bootstrap
The Bootstrap method can be divided into parametric and nonparametric variants. When the distributional family of the original data is known, the parametric method is generally considered more efficient than the nonparametric one. Given the setting of this paper, we adopt the parametric bootstrap method.
Bootstrap theory, carried out here in a parametric framework, is applied to assess the bias of the maximum likelihood estimator. Efron introduced the parametric Bootstrap resampling estimation (PBE) in 1982. Suppose the random variable $X$ has distribution function $F$, from which we draw a random sample $x = (x_1, \cdots, x_n)$, and let $\theta = t(F)$ be a parameter of $F$ with estimator $\hat{\theta} = s(x)$. To resample, we generate a large number of pseudo-samples $x^* = (x^*_1, \cdots, x^*_n)$ and compute $\hat{\theta}^* = s(x^*)$ for each. In the parametric scheme, each pseudo-sample $x^*$ is drawn from the fitted distribution in which the MLE $\hat{\theta}$ plays the role of the true parameter, and $x^*$ is the data for the second-stage MLE. The empirical distribution of $\hat{\theta}^*$ is then used to approximate the distribution of $\hat{\theta}$. If $\{F_\theta\}$ is a finite-dimensional parametric family containing $F$, the plug-in estimator $F_{\hat{\theta}}$ consistently estimates $F$. The bias of $\hat{\theta}$ is

$$\operatorname{Bias}(\hat{\theta}) = E_F[s(x)] - t(F),$$

where $E_F$ denotes expectation under $F$. Substituting $F_{\hat{\theta}}$ for $F$ yields the parametric bootstrap estimate of the bias.
Based on the original sample, we generate $B$ independent bootstrap samples and compute the bootstrap estimators $\hat{\theta}^{*(1)}, \cdots, \hat{\theta}^{*(B)}$, where $\hat{\theta}^{*(j)}$ is obtained from the $j$-th Bootstrap sample. When $B$ is large enough, the expected value $E_{F_{\hat{\theta}}}[\hat{\theta}^*]$ is approximately equal to $\frac{1}{B}\sum_{j=1}^{B}\hat{\theta}^{*(j)}$, with the maximum likelihood estimator regarded as the true value. The bias-corrected estimator from PBE is therefore

$$\hat{\theta}_{PBE} = 2\hat{\theta} - \frac{1}{B}\sum_{j=1}^{B}\hat{\theta}^{*(j)}.$$

Ref. [24] pointed out that, since the estimated bias of $\hat{\theta}$ is approximated by a constant, $\hat{\theta}_{PBE}$ should be called the constant-bias-correcting MLE. Compared with BCE, the form of PBE is more convenient and does not involve the joint cumulants of the derivatives, while still achieving an estimator consistent to second order. However, there is a certain randomness in the resampling process, which may make the correction unstable.
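A minimal parametric-bootstrap sketch for $\sigma$ (standard-library Python; the helper names and $B = 1000$ are our own choices, whereas the paper uses 5000 re-samples):

```python
import math
import random

def mle(xs):
    """Closed-form MLEs of (mu, sigma) for lognormal data."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((y - mu) ** 2 for y in logs) / n)
    return mu, sigma

def bootstrap_corrected_sigma(xs, B=1000):
    """Parametric bootstrap: sigma_PBE = 2 * sigma_hat - mean(sigma_star)."""
    mu_hat, sigma_hat = mle(xs)
    n = len(xs)
    boot = []
    for _ in range(B):
        # resample from the fitted model F_{theta_hat} (parametric bootstrap)
        xs_star = [math.exp(random.gauss(mu_hat, sigma_hat)) for _ in range(n)]
        boot.append(mle(xs_star)[1])
    return 2.0 * sigma_hat - sum(boot) / B

random.seed(3)
xs = [math.exp(random.gauss(0.0, 1.0)) for _ in range(20)]
corrected = bootstrap_corrected_sigma(xs)
# sigma_hat underestimates sigma, so the bootstrap correction pushes the estimate upward
print(corrected > mle(xs)[1])  # True
```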

Simulation Results
The effects of the maximum likelihood estimation, the analytic correction method and the parametric Bootstrap resampling method are compared in a Monte Carlo experiment over different sample sizes and true parameter values. The assessment criteria are the root mean squared errors (RMSEs) and the average biases. During this research, we noticed that the Inverse Gaussian distribution shares characteristics with the Lognormal distribution; after calculation, we found that the bias of the maximum likelihood estimator of $\mu$ from the Inverse Gaussian distribution is also zero. Therefore, we also evaluate $\hat{\lambda}$ under the three approaches for the Inverse Gaussian distribution.
Guided by the density curves shown in Figures 1 and 2, different parameter values for the Lognormal and Inverse Gaussian distributions can be selected. We considered sample sizes of 10, 20, 30, 40, 50, with $\mu$ = 0.5, 1, 1.5, 2 and $\sigma, \lambda$ = 0.5, 1, 1.5, 2, 3, 5. Pseudo-random samples are generated by the inverse transformation method. However, the cumulative distribution function (c.d.f.) of the Lognormal distribution cannot be inverted in closed form, so we generate normal random numbers and then exponentiate them. We use the Monte Carlo method with 10,000 repetitions, and 5000 re-samples to construct the bootstrap bias correction. The first two tables are from the Lognormal distribution, and the data in the last two tables are simulated from the Inverse Gaussian distribution.
To discuss the results of the above methods, we consider the theoretical bias of $\hat{\theta}$, defined as

$$\operatorname{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta,$$

into which specific values are substituted during the Monte Carlo simulation. The average bias and RMSE of $\hat{\theta}_{MLE}$, $\hat{\theta}_{BCE}$ and $\hat{\theta}_{PBE}$ are defined as

$$\operatorname{Bias}(\hat{\theta}) = \frac{1}{M}\sum_{i=1}^{M}(\hat{\theta}_i - \theta), \qquad \operatorname{RMSE}(\hat{\theta}) = \sqrt{\frac{1}{M}\sum_{i=1}^{M}(\hat{\theta}_i - \theta)^2},$$

where $M$ is the number of simulations; in this paper, $M$ = 10,000. Readers who need the relevant code can send an email to 17271075@bjtu.edu.cn.

From Tables 1 and 2, we can observe that the bias of $\hat{\mu}_{MLE}$ is much smaller than that of $\hat{\sigma}_{MLE}$, so it is reasonable that the Cox-Snell methodology reduces only the bias of the estimator of $\sigma$; we therefore mainly consider the bias correction of $\sigma$. The bias of $\hat{\sigma}_{BCE}$ and $\hat{\sigma}_{PBE}$ is always smaller than that of $\hat{\sigma}_{MLE}$, and the bias of $\hat{\sigma}_{BCE}$ is usually smaller than that of $\hat{\sigma}_{PBE}$. In general, as the sample size $n$ increases, the biases of all estimators of $\sigma$ approach zero.

Tables 3 and 4 show the corresponding results for the Inverse Gaussian distribution. These results are similar to those for the Lognormal distribution, except that the performance of the analytic bias correction is even better. Although the bias of $\hat{\lambda}_{MLE}$ for the Inverse Gaussian distribution is larger than the corresponding bias for the Lognormal distribution, the bias of $\hat{\lambda}_{BCE}$ is smaller. We also show, using boxplots, the parameter estimates for the Lognormal and Inverse Gaussian distributions when the sample size is 50; boxplots can convey more information than tables. Figures 11 and 12 show that the bias-corrected estimators are closer to the true values, and that the Cox-Snell method also reduces the spread of the parameter estimates to a certain extent.

We changed the parameter values and repeated the experiments. The conclusions do not change with the parameter setting, which shows that the Cox-Snell method is robust.
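The average bias and RMSE above can be computed as in the following sketch (standard-library Python with our own parameter choices; $M = 5000$ is smaller than the paper's 10,000 for brevity, and the logarithms of lognormal data are generated directly as normal draws):

```python
import math
import random

def avg_bias_and_rmse(estimates, true_value):
    """Average bias = (1/M) sum(theta_i - theta); RMSE = sqrt((1/M) sum((theta_i - theta)^2))."""
    M = len(estimates)
    bias = sum(e - true_value for e in estimates) / M
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / M)
    return bias, rmse

random.seed(4)
true_mu, true_sigma, n, M = 1.0, 0.5, 20, 5000
sigma_mles = []
for _ in range(M):
    logs = [random.gauss(true_mu, true_sigma) for _ in range(n)]
    m = sum(logs) / n
    sigma_mles.append(math.sqrt(sum((y - m) ** 2 for y in logs) / n))
bias, rmse = avg_bias_and_rmse(sigma_mles, true_sigma)
print(bias < 0 < rmse)  # True: the MLE of sigma is biased downward
```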
In the bias-correction process, the main computational task is to find the Fisher information matrix, which is easy to calculate for most distribution functions. As for existing software, we mainly use the optim function in the R language, which is efficient, concise and suitable for practical calculation.
On this basis, in almost all cases considered, the analytical Cox-Snell method is superior to the bootstrap method and to the classical maximum likelihood algorithm. BCE and PBE also reduce the RMSE for the Inverse Gaussian distribution. Therefore, under consistent criteria for measuring the correction effect, the improved estimator is closer to the true parameter value than the uncorrected one. The maximum likelihood estimators of $\mu$ in the Lognormal and Inverse Gaussian distributions are unbiased and efficient. In the analysis, we find that this behavior of $\hat{\mu}$ is typical whenever $k_{12}$ and $k_{21}$ equal zero in the expected information matrix: in that case, the Cox and Snell method yields a maximum likelihood estimator with zero first-order bias.

Example Illustration
In this part, we use real data to verify whether estimators with smaller biases can be obtained by adopting the Cox-Snell approach and the parametric bootstrap resampling method. The experimental data are the daily trading volumes of the Shanghai stock market in 2002, sourced from the China Securities Regulatory Commission. We first fit the data by maximum likelihood estimation and then apply the analytic method. Since the estimates of $\mu$ differ little across methods, we mainly focus on the correction of the estimate of $\sigma$. Figure 13 depicts the fitted Lognormal probability densities for the different estimates of $\mu$ and $\sigma$. Table 5 shows that, for $\sigma$, the lowest RMSE is given by the biased MLE whereas PBE gives the highest RMSE.
We also observe that the BCE of $\sigma$ is the largest and the MLE of $\sigma$ is the smallest, which is in accordance with the simulation results. Therefore, we have additional evidence that the bias-corrected estimation behaves better. It is worth noting that there are obvious differences among the estimates of $\sigma$, which illustrates that for small sample sizes bias correction is still necessary because of the effective information contained in the correction method. To evaluate how well the data fit the Lognormal distribution, we use the Kolmogorov-Smirnov (KS) test to calculate D-values and p-values at the different parameter estimates. The analytic bias-corrected method yields higher KS p-values than the MLE and the parametric Bootstrap bias correction, and lower D-values. So we have every reason to believe that analytical bias correction gives very useful results.

Conclusions
The Lognormal and Inverse Gaussian distributions are applied in a broad variety of fields, and in practice maximum likelihood is the method most used for them. In this article, we employ the bias-adjustment method recommended by Cox and Snell to increase the accuracy of the MLE. We obtain concrete analytical expressions for the $O(n^{-1})$ biases and use them to construct corrected estimators whose biases are of order $O(n^{-2})$. Additionally, we compare a different bias-corrected approach based on Bootstrap resampling. According to the simulation results, the estimator $\hat{\mu}$ of both distributions is close to the true value already under the classical MLE. For the other parameter, the Bootstrap resampling bias correction is not as efficient as the Cox-Snell modification in terms of both average bias and root mean squared error, especially for the Inverse Gaussian distribution; in particular, the correction effect improves as the sample size increases and the parameter value decreases. The bias correction proposed by Cox and Snell is heartily recommended for estimating the parameters of the Lognormal and Inverse Gaussian distributions, which are often met in the context of reliability analysis.
In future work, the wild bootstrap is worth considering. In this method, random weights generated independently of the model are multiplied by the residuals to simulate the actual error distribution; it is suitable for regression models with heteroskedasticity and, in practical analysis, it is characterized by weighted residuals rather than data pairs. In the bivariate setting of unknown $(\mu, \sigma)$, it is not a priori clear how to combine coordinate-wise accuracy measures for the two components; depending on this choice, different bias weightings result and will lead to different rankings of procedures. Moreover, the normal distribution is closely related to the lognormal distribution, so the bias-corrected method is also suitable for the normal distribution. As we know, the MLE has well-studied asymptotic properties, but there is no specific research proving corresponding asymptotic properties of the Cox-Snell method; we think this is worth investigating in future work. Finally, the Phase-Type distributions are widely used and include many common distributions, such as the Erlang, hypoexponential and Coxian distributions; readers can refer to [25]. They can correct some physical inaccuracies of the Weibull distribution. The Phase-Type distributions are non-negative distributions with unknown parameters, which are worth studying in future work.