Identifying the Probability Distribution of Fatigue Life Using the Maximum Entropy Principle

Hongshuang Li 1,*, Debing Wen 1, Zizi Lu 2, Yu Wang 1 and Feng Deng 1
1 Key Laboratory of Fundamental Science for National Defense-Advanced Design Technology of Flight Vehicles, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China; debing54w@sina.cn (D.W.); wangyu22@nuaa.edu.cn (Y.W.); fdeng@nuaa.edu.cn (F.D.)
2 Beijing Aeronautical Science and Technology Research Institute, Beijing 102211, China; luzizi@comac.cc
* Correspondence: hongshuangli@nuaa.edu.cn; Tel.: +86-25-84890119


Introduction
In engineering structural design, it is well known that experimental data from fatigue tests of materials and structures subjected to cyclic loads display large variations, even under the same loading conditions. This is caused by uncertainties in material properties, loading conditions, boundary conditions, etc. [1,2]. Therefore, probabilistic methods are very popular for fatigue life prediction and anti-fatigue design because they can deal with uncertainties in a rational way. As part of the development of anti-fatigue design, many probabilistic methods have been proposed for assessing the uncertainty in fatigue lives, and their applications have been reported for both crack initiation life and crack propagation life [2,3]. Many experimental data sets indicate that crack initiation life and crack propagation life have a certain amount of scatter. The most famous experimental data sets were produced by Schijve [4] for three types of specimens and by Virkler et al. [5] for fatigue crack propagation. Other frequently mentioned data sets were generated by Ghonem and Dore [6], Yang et al. [7], Itagaki et al. [8] and Wu and Ni [9]. From the point of view of engineering design, the probability distribution of fatigue life at a given stress level or crack length is very important, because it is the basis for constructing the P-S-N curves or P-a-N curves used in engineering fatigue design. In most research studies, it is generally assumed that the fatigue lives of metal materials follow a lognormal distribution or a Weibull distribution [1,2,4,10]. However, these assumptions are based on intuition, practical experience and sometimes the wish to simplify the mathematical operations [10]. The accuracy and credibility of a fatigue life prediction depend on whether the assumed distribution type really reflects the uncertainties in the fatigue lives of metal materials. The most reliable data source for studying the fatigue phenomena of materials and structures is fatigue testing. Extracting as much information as possible about the variation of fatigue life from experimental data, and reducing the subjective uncertainty introduced by an assumed distribution type, are therefore critical for identifying a proper distribution type for fatigue lives.
Entropic concepts have been developed both for assessing degradation in fatigue phenomena and for uncertainty quantification in fatigue life analysis and prediction. The former topic is based on the concept of thermodynamic entropy generation, which leads to deterministic fatigue models [11-18], while the latter is based on information entropy, which is used to quantify the uncertainties in fatigue life [19]. To address the abovementioned issue, Gong and Norton [19] suggested using the maximum entropy (MaxEnt) principle with the first two statistical moments to determine the distribution type of fatigue life for materials. Since only the first two statistical moments are involved in their study, an analytical distribution type for fatigue life can be derived, i.e., a normal or truncated normal distribution. However, this method is not widely accepted. The main reason is that only an analytical normal distribution or truncated normal distribution (the truncated form accounts for the non-negativity of fatigue life) can be obtained, since the approach uses only the first two statistical moments. Therefore, their method did not take full advantage of the flexibility and optimal unbiased estimation of the MaxEnt principle when employing it to identify the probability distribution type of fatigue life.
In this paper, a new method to identify the probability distribution of fatigue life based on the MaxEnt principle with the first four statistical moments is proposed. The first four statistical moments of fatigue life are obtained directly from experimental data sets. These statistical moments are then formulated as constraints in the MaxEnt principle. The probability distribution of fatigue life is the solution to the optimization problem formed by applying the MaxEnt principle. This new method makes good use of the flexibility and optimal unbiased estimation of the MaxEnt principle. Two groups of fatigue data sets are used to demonstrate the performance of the proposed method. The MaxEnt distribution, the lognormal distribution and the three-parameter Weibull distribution are fitted to the fatigue data in both groups of test results and compared. Attention is paid to the goodness-of-fit as measured by two fit indexes.

Probability Distributions for Fatigue Life
One of the most useful probability distributions for modeling failure times of fatiguing materials and structures is the log-normal distribution. It is characterized by a probability density function (PDF) of the form:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] \tag{1}$$
where the parameters $\mu$ and $\sigma$ are the mean and standard deviation of the distribution, respectively. In general, the statistical variable $x$ in Equation (1) is the base-10 logarithm of the fatigue life $N$, i.e.:
$$x = \log N \tag{2}$$
From the fatigue physics point of view, the fatigue life $N$ does not take on negative values. Thus, the transformation of the fatigue life $N$ in Equation (2) is consistent with the requirement of the fatigue physics. It should be pointed out that if the fatigue life $N$ has a log-normal distribution, the logarithm of $N$, i.e., $x$, has a normal distribution. This relationship is very useful, as one can use the theoretical results and formulas of the normal distribution when dealing with the log-normal distribution. The cumulative distribution function (CDF) of the log-normal distribution is given by:
$$F(x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(t-\mu)^2}{2\sigma^2}\right] \mathrm{d}t \tag{3}$$
where $t$ is the integration variable. There is no closed-form solution to this integral, and it needs to be evaluated by a numerical algorithm. However, many good tools are available for the evaluation of Equation (3) in today's software and computer languages, such as Matlab, Scipy, Microsoft Excel, etc.
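As a minimal sketch of how Equation (3) can be evaluated in practice, the normal CDF can be written in terms of the error function, which is available in the Python standard library. The function name `lognormal_cdf` and the parameter values below are hypothetical illustrations and are not taken from any data set in this paper.

```python
import math

def lognormal_cdf(life_cycles, mu, sigma):
    """Probability that the fatigue life is at most `life_cycles` cycles.

    mu and sigma are the mean and standard deviation of x = log10(N).
    """
    x = math.log10(life_cycles)
    z = (x - mu) / (sigma * math.sqrt(2.0))
    # Normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z))

# Hypothetical parameters chosen only for illustration.
mu, sigma = 5.0, 0.1
p_median = lognormal_cdf(10**5, mu, sigma)    # life equal to the median
p_early = lognormal_cdf(10**4.8, mu, sigma)   # life two standard deviations below
```

With these illustrative values, `p_median` is one half and `p_early` is the lower two-sigma tail probability of the normal distribution.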
One task in fatigue reliability analysis is to estimate the two parameters of the log-normal distribution. Suppose that a fatigue data set is obtained from lab testing or in-service measurement with $n$ samples $x_i = \log N_i$ $(i = 1, \ldots, n)$. Based on the maximum likelihood principle, $\mu$ and $\sigma$ are estimated as:
$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{4}$$
$$\hat{\sigma} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(x_i - \hat{\mu}\right)^2} \tag{5}$$
The Weibull distribution is another very popular model which has been successfully used for fatigue phenomena. The PDF of the three-parameter Weibull distribution is:
$$f(x) = \frac{\beta}{\theta}\left(\frac{x-x_0}{\theta}\right)^{\beta-1} \exp\left[-\left(\frac{x-x_0}{\theta}\right)^{\beta}\right], \quad x \ge x_0 \tag{6}$$
where $\beta$ is the shape parameter that influences the shape of the PDF curve. For $\beta < 1$, the PDF has a shape which is somewhat similar to that of the exponential distribution. For $1 < \beta < 3$, the PDF is positively skewed. For $\beta > 3$, the PDF is roughly symmetrical. $\theta$ is the scale parameter that influences both the mean and the standard deviation, or dispersion, of the distribution. The third parameter $x_0$, i.e., the location parameter, corresponds to a lower fatigue life limit $N_0$, i.e., $x_0 = \log N_0$. No fatigue failure will take place at a life shorter than $N_0$. This seems very reasonable based on the common understanding of fatigue physics. We will examine this parameter when discussing its estimation. If the value of $x_0$ reduces to zero, the three-parameter Weibull distribution degenerates into the two-parameter Weibull distribution. In general, the fitting capacity of the three-parameter Weibull distribution is superior to that of the two-parameter Weibull distribution, according to comparisons and observations available in the literature. Therefore, only the three-parameter Weibull distribution is compared with the presented MaxEnt distribution in this study.
For the Weibull distribution, there is an explicit expression for its CDF:
$$F(x) = 1 - \exp\left[-\left(\frac{x-x_0}{\theta}\right)^{\beta}\right] \tag{7}$$
The primary method to estimate these three parameters is to fit a linear regression line of the form $y = a + bx$ to a set of transformed data using the least squares method. From the CDF of the Weibull distribution, one can obtain:
$$\ln\ln\frac{1}{1-F(x)} = \beta\ln(x-x_0) - \beta\ln\theta \tag{8}$$
by taking double logarithms.
Therefore, a least-squares fit is achieved with:
$$y_i = \ln\ln\frac{1}{1-F(x_i)}, \quad a = -\beta\ln\theta, \quad b = \beta \tag{9}$$
where $\ln(x_i - x_0)$ plays the role of the abscissa. If the fatigue data come from a Weibull distribution, a proper fit appears as an approximately straight line in a Weibull plot. The shape parameter $\beta$ can be easily estimated from the slope of the plotted line, and the scale parameter $\theta$ can be estimated from the point on the plotted line which corresponds to 63.2% of failures, i.e., $F(x_i) = 0.632$. For the location parameter $x_0$ there is no closed-form solution, so an iterative procedure should be adopted. The estimator that minimizes the sum of squared deviations is used. In other words, the "best" estimator for the location parameter is defined to be a minimum-bias estimator, and it is searched for by a numerical algorithm [1,4]. Recall that a lower limit of fatigue life is consistent with the fatigue physics. However, an estimator obtained from a numerical algorithm cannot be interpreted as having a realistic physical meaning. It is even possible for the estimator of $N_0$ to be negative if a least-squares fit is performed using the fatigue data directly, i.e., $N_i$ $(i = 1, \ldots, n)$.
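The regression-plus-search procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact algorithm of [1,4]: the plotting position `median_rank` (Benard's approximation), the grid of candidate location parameters, and the choice to measure the squared deviations on the Weibull plot are all assumptions, and the function names are hypothetical.

```python
import math

def median_rank(i, n):
    # Assumed plotting position; replace it with the paper's empirical
    # failure probability formula if that differs.
    return (i - 0.3) / (n + 0.4)

def fit_line(u, y):
    """Ordinary least squares for y = a + b*u; returns (a, b, sse)."""
    n = len(u)
    u_bar, y_bar = sum(u) / n, sum(y) / n
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_uy = sum((ui - u_bar) * (yi - y_bar) for ui, yi in zip(u, y))
    b = s_uy / s_uu
    a = y_bar - b * u_bar
    sse = sum((yi - a - b * ui) ** 2 for ui, yi in zip(u, y))
    return a, b, sse

def fit_weibull3(x, x0_candidates):
    """Return (x0, theta, beta) with the smallest squared deviation on the Weibull plot."""
    xs = sorted(x)
    n = len(xs)
    y = [math.log(-math.log(1.0 - median_rank(i, n))) for i in range(1, n + 1)]
    best = None
    for x0 in x0_candidates:
        if x0 >= xs[0]:
            continue  # the location parameter must lie below the smallest sample
        u = [math.log(xi - x0) for xi in xs]
        a, b, sse = fit_line(u, y)
        if best is None or sse < best[0]:
            # slope b = beta, intercept a = -beta * ln(theta)
            best = (sse, x0, math.exp(-a / b), b)
    if best is None:
        raise ValueError("no candidate lies below the smallest sample")
    return best[1], best[2], best[3]

# Synthetic log-lives placed exactly on a Weibull line (hypothetical parameters).
n = 20
x = [4.0 + 1.0 * (-math.log(1.0 - median_rank(i, n))) ** (1.0 / 2.5)
     for i in range(1, n + 1)]
x0, theta, beta = fit_weibull3(x, [3.5, 3.8, 4.0, 4.2])
```

Because the synthetic samples lie exactly on a Weibull line, the search recovers the generating parameters. With real data the candidate grid would need to be refined, and, as noted above, the resulting location parameter need not have a physical meaning.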
Schijve [1] also discussed another three-parameter distribution with a lower limit of fatigue life. The statistical variable $x$ is written as:
$$x = \log(N - N_0) \tag{10}$$
The statistical data $x_i$ are then also fitted by a normal distribution. It is clear that the PDF and CDF are still the same as in Equations (1) and (3), and $\mu$ and $\sigma$ are the mean and standard deviation of $x$, respectively. As in the three-parameter Weibull distribution, the introduction of $N_0$ assumes that no fatigue failures will occur prior to $N_0$. To estimate $N_0$, an iterative least squares method is again required, and the solution it finds is optimal in the mathematical sense rather than in the expected physical sense. Comparisons between the three-parameter Weibull distribution and the three-parameter log-normal distribution have been carried out by Schijve [1]. It turns out that the three-parameter log-normal distribution produced fits similar to those of the three-parameter Weibull distribution for the two fatigue data sets investigated by Schijve. This statement is based on the small difference between the root mean square (r.m.s.) of the deviations of the two distributions when dealing with the same fatigue data set. However, Schijve also reported very large differences between the estimators of the minimum fatigue life determined from these two distributions for the investigated fatigue data sets. This indicates that the location parameter may not have the realistic physical meaning expected of it, since its value is determined on purely mathematical grounds.
It is well known that the first two statistical moments, i.e., mean value and standard deviation, are enough to uniquely determine a normal distribution. Therefore, it is not surprising that the distribution identified using the MaxEnt principle with only these two moments is a normal distribution or a truncated normal distribution [19]. The Weibull distribution also only utilizes the first two statistical moments with a logarithm transformation [20]. In contrast, the presented MaxEnt principle can extract statistical information from the first four statistical moments and select a proper distribution from a family of distributions [20].

Fit Index
To identify an appropriate probability distribution of fatigue life from limited fatigue data, several fit indexes have been proposed in the literature. Zhao et al. suggested that proper fit indexes should consider the following aspects [21]: (a) the total fit effects for goodness-of-fit, (b) the tail fit effects for goodness-of-fit and (c) the consistency of the identified distribution with fatigue physics. Since Zhao et al.'s work concentrated on the selection of an appropriate distribution from a group of standard distributions, the linear correlation coefficient $r$ and its critical value $r_c$ for the hypothesis test were adopted for checking the total fit effects [21]. As an alternative, Schijve suggested the r.m.s. of the deviations as a fit index which reflects the total fit effects [1,4]. The MaxEnt distribution does not belong to any standard distribution, so the concept of a linear correlation coefficient is not defined for it. In this study, the r.m.s. of the deviations is used to check the total fit effects, and comparisons are carried out among the presented MaxEnt distribution, the three-parameter Weibull distribution and the log-normal distribution. Here, we select the three-parameter Weibull distribution as an example to derive the calculation of the r.m.s. of the deviations. For the $i$th sample $x_i = \log N_i$ in the ordered data set, the empirical failure probability $p_i$ associated with this sample is approximated by Equation (11). An alternative empirical expression is given by Equation (12). In order to make a comparison with Schijve's computational results [4], we utilize Equation (11) to approximate the empirical failure probability. From the CDF expression of the three-parameter Weibull distribution, the estimated $\hat{x}_i$ is given by:
$$\hat{x}_i = x_0 + \theta\left[-\ln(1-p_i)\right]^{1/\beta} \tag{13}$$
Then the deviation between $x_i$ and $\hat{x}_i$ is:
$$\Delta_i = x_i - \hat{x}_i \tag{14}$$
The r.m.s. of the deviations is expressed as:
$$\mathrm{r.m.s.} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\Delta_i^2} \tag{15}$$
Attention is paid to the left tail region of a fatigue life distribution because this region has very low failure probabilities and is important in fatigue reliability analysis. Therefore, the differences between the empirical failure probability and the predicted failure probability for the first two samples in an ordered data set are suggested as measures of the goodness-of-fit in the left tail region [21]. The two parameters $d_1$ and $d_2$ are defined accordingly in Equations (16) and (17), where $n$ is the total number of samples, $F(\cdot)$ is the CDF of the identified distribution, and $x_1$ and $x_2$ are the first two samples of the ordered data set, i.e., $x_1 < x_2 < \cdots < x_n$. In fact, these two parameters indicate the tendency of the predicted values, which are expected to be conservative for engineering design. Small values of $|d_i|$ are preferred because they indicate a better fit. If $d_1 < d_2$, the prediction of failure probabilities lower than $p_1$ is conservative. On the contrary, if $d_1 > d_2$, the prediction is non-conservative. The value of $d_1$ may also be useful to check whether a prediction is conservative or not. If $d_1 < 0$, the predictions of failure probabilities less than $p_1$ may tend to be conservative. Otherwise, they may tend to be non-conservative. From the fatigue physics of weakest-link failure and irreversible cumulative damage failure, Zhao et al. proposed the following two statements [21]:
(a) The failure rate curve increases with fatigue cycling.
(b) The PDF of the identified distribution is positively skewed.
The method presented in this paper is a data-driven method, which does not capture the differences between failure modes and mechanisms. This may result in an identified distribution that is inconsistent with the fatigue physics. For example, it is well known that the third statistical moment of a data set determines the skewness of the underlying probability distribution. If the estimator of the skewness is less than zero, the distribution identified by numerical methods often tends to be negatively skewed. Obviously, this is not consistent with the fatigue physics, but it at least reflects the statistical information hidden in the fatigue data set. Here, we do not force a probability distribution to fully satisfy the fatigue physics.
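The total fit index discussed in this section can be sketched in a few lines. The snippet assumes that the r.m.s. is the square root of the mean squared deviation between each ordered sample and the value predicted by the fitted distribution at the corresponding empirical failure probability. The plotting position `median_rank` and all parameter values are assumptions made for illustration, and the helper names are hypothetical.

```python
import math

def median_rank(i, n):
    # Assumed plotting position; use the paper's empirical formula if it differs.
    return (i - 0.3) / (n + 0.4)

def weibull_quantile(x0, theta, beta):
    """Inverse CDF of the three-parameter Weibull distribution."""
    return lambda p: x0 + theta * (-math.log(1.0 - p)) ** (1.0 / beta)

def rms_deviation(x, quantile, plotting_position):
    """r.m.s. of x_i - x_hat_i, where x_hat_i = quantile(p_i)."""
    xs = sorted(x)
    n = len(xs)
    squared = 0.0
    for i, xi in enumerate(xs, start=1):
        p_i = plotting_position(i, n)
        squared += (xi - quantile(p_i)) ** 2
    return math.sqrt(squared / n)

# Hypothetical fitted distribution and two synthetic data sets of log-lives.
q = weibull_quantile(4.0, 1.0, 2.5)
exact = [q(median_rank(i, 10)) for i in range(1, 11)]   # lies on the fitted curve
shifted = [v + 0.1 for v in exact]                      # uniformly offset by 0.1
rms_exact = rms_deviation(exact, q, median_rank)
rms_shifted = rms_deviation(shifted, q, median_rank)
```

Passing the quantile function as an argument means the same index can be computed for any candidate distribution, which is what allows a non-standard distribution to be compared with the Weibull and log-normal fits.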

Computation of Statistical Moments of Fatigue Life
For fatigue reliability analysis, the number of load cycles at a given stress level or crack length can be obtained under lab conditions or from in-service measurement. The samples of the number of load cycles contain the distribution information of fatigue life, but they are not directly suitable for statistical inference since the valuable information is scattered across the individual samples. A more appropriate approach is to employ sample statistics, e.g., mean value and standard deviation, to condense the scatter information contained in the samples. The statistical inference is then performed based on these sample statistics.
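The proposed method requires the statistical moments of fatigue life to be obtained from the experimental data. Suppose that $x_1, x_2, \cdots, x_n$ are the logarithms of the fatigue lives $N_i$ from lab testing or service measurement. Then, the first four statistical moments of fatigue life can be estimated as:
$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{18}$$
$$\hat{s}^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\hat{\mu}\right)^2 \tag{19}$$
$$ske = \frac{1}{n\,\hat{s}^3}\sum_{i=1}^{n}\left(x_i-\hat{\mu}\right)^3 \tag{20}$$
$$kur = \frac{1}{n\,\hat{s}^4}\sum_{i=1}^{n}\left(x_i-\hat{\mu}\right)^4 \tag{21}$$
where $\hat{\mu}$, $\hat{s}$, $ske$ and $kur$ are the mean value, standard deviation (SD), skewness and kurtosis of fatigue life, respectively.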
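A short sketch of the moment computation on the base-10 logarithms of the fatigue lives follows. The normalisation constants (n - 1 for the variance, n for the standardised third and fourth moments) are assumptions of this sketch, since several conventions are in common use. The function name and the sample lives are hypothetical.

```python
import math

def sample_moments(lives):
    """Mean, SD, skewness and kurtosis of x = log10(N) for a list of lives N."""
    x = [math.log10(n_i) for n_i in lives]
    n = len(x)
    mean = sum(x) / n
    # Assumed convention: unbiased variance (n - 1 in the denominator).
    sd = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))
    # Assumed convention: third and fourth central moments averaged over n.
    ske = sum((xi - mean) ** 3 for xi in x) / (n * sd ** 3)
    kur = sum((xi - mean) ** 4 for xi in x) / (n * sd ** 4)
    return mean, sd, ske, kur

# Hypothetical lives, evenly spaced on the logarithmic scale.
mean, sd, ske, kur = sample_moments([10, 100, 1000, 10000, 100000])
```

For small samples the choice of normalisation noticeably changes the skewness and kurtosis estimates, so the same convention should be used for every data set that is compared.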

Maximum Entropy Principle
In probability theory, the information entropy is a measure of the uncertainty represented by a probability distribution. For a continuous random variable with a PDF $f(x)$, its information entropy is defined as:
$$H_X = -\int f(x)\ln f(x)\,\mathrm{d}x \tag{22}$$
The maximum entropy (MaxEnt) principle states that the most unbiased estimate of the PDF $f(x)$ is the one that maximizes Equation (22) subject to the given information, such as statistical moments. The MaxEnt principle possesses four axioms: uniqueness, invariance, system independence and subset independence. For a general problem with given constraints in the form of statistical moments, the MaxEnt principle determines exactly one distribution satisfying these statistical moments, and this distribution is also the best one. Following the derivation in Jaynes's studies [22,23], suppose the available knowledge consists of the first $m$ statistical moments of a random variable, i.e.:
$$\mu_i = \int x^i f(x)\,\mathrm{d}x, \quad i = 1, \cdots, m \tag{23}$$
In addition, the usual PDF normalization constraint is given by:
$$\int f(x)\,\mathrm{d}x = 1 \tag{24}$$
Therefore, the problem of identifying a probability distribution by the MaxEnt principle is transformed into maximizing Equation (22) subject to the constraints of Equations (23) and (24), i.e.:
$$\max\; H_X \quad \text{s.t.} \quad \int x^i f(x)\,\mathrm{d}x = \mu_i \;\; (i = 1, \cdots, m), \quad \int f(x)\,\mathrm{d}x = 1 \tag{25}$$
The Lagrange multiplier method is usually employed to solve the optimization problem in Equation (25). It is easy to obtain the following well-known form of the MaxEnt distribution:
$$f(x) = \exp\left(-\lambda_0 - \sum_{i=1}^{m}\lambda_i x^i\right) \tag{26}$$
where $\lambda_i$ $(i = 1, \cdots, m)$ are the Lagrange multipliers. If we substitute Equation (26) into Equation (24), the normalizing factor $\lambda_0$ takes the following form:
$$\lambda_0 = \ln\int \exp\left(-\sum_{i=1}^{m}\lambda_i x^i\right)\mathrm{d}x \tag{27}$$
It should be pointed out that the MaxEnt principle is not limited to statistical moment information; it can also be generalized to the expected value of an arbitrary function that is integrable with respect to $f(x)$.
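As a worked check of the exponential-polynomial form of the MaxEnt distribution, consider the case of two moment constraints on the whole real line. The notation below (mean $\mu$, variance $\sigma^2$, multipliers $\lambda_0, \lambda_1, \lambda_2$) follows this section, and the derivation is a standard illustration rather than a result specific to this paper. Matching the exponent of $f(x) = \exp(-\lambda_0 - \lambda_1 x - \lambda_2 x^2)$ with that of a normal PDF gives:

```latex
\begin{aligned}
-\frac{(x-\mu)^2}{2\sigma^2} - \ln\!\left(\sigma\sqrt{2\pi}\right)
  &= -\frac{x^2}{2\sigma^2} + \frac{\mu}{\sigma^2}\,x
     - \frac{\mu^2}{2\sigma^2} - \ln\!\left(\sigma\sqrt{2\pi}\right), \\
\lambda_2 &= \frac{1}{2\sigma^2}, \qquad
\lambda_1 = -\frac{\mu}{\sigma^2}, \qquad
\lambda_0 = \frac{\mu^2}{2\sigma^2} + \ln\!\left(\sigma\sqrt{2\pi}\right), \\
\ln\int_{-\infty}^{\infty} e^{-\lambda_1 x - \lambda_2 x^2}\,\mathrm{d}x
  &= \ln\!\left[\sqrt{\frac{\pi}{\lambda_2}}\,
     \exp\!\left(\frac{\lambda_1^2}{4\lambda_2}\right)\right]
   = \ln\!\left(\sigma\sqrt{2\pi}\right) + \frac{\mu^2}{2\sigma^2} = \lambda_0 .
\end{aligned}
```

The last line confirms that the normalizing factor obtained from the integral agrees with the value read off from the normal PDF, so the two-moment MaxEnt distribution is the normal distribution.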

Computation of Lagrange Multipliers
The MaxEnt distribution in Equation (26) belongs to the Pearson family of distributions. For $m \le 2$, analytical solutions for $\lambda_i$ $(i = 1, \cdots, m)$ in terms of the available moments can be obtained. For example, the well-known normal distribution is obtained when $m = 2$. Park and Bera [20] provided a table of examples of MaxEnt distributions, which includes 16 standard distribution types. However, no analytical solutions are available for a MaxEnt distribution with $m \ge 3$. This also means that the MaxEnt distribution is not a standard distribution type, e.g., a normal distribution, when $m \ge 3$. In principle, the Lagrange multipliers can be obtained by solving $m$ nonlinear equations with the standard Newton's method [24]. This formulation can be easily derived by substituting Equations (26) and (27) into Equation (23). However, the solution procedure may encounter numerical stability issues. In this paper, the unconstrained minimization of a potential function is adopted to address this issue, instead of solving $m$ nonlinear equations [25]. This approach is much faster and more numerically stable than the traditional Newton's method. The derivation of this method is briefly described here.
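Substituting Equation (26) into Equation (24):
$$\int \exp\left(-\lambda_0 - \sum_{i=1}^{m}\lambda_i x^i\right)\mathrm{d}x = 1 \tag{28}$$
For convenience of calculation, define a function of $\lambda_i$ $(i = 1, \cdots, m)$:
$$\Gamma(\lambda_1, \cdots, \lambda_m) = \lambda_0 + \sum_{i=1}^{m}\lambda_i\mu_i \tag{29}$$
Then, the PDF in Equation (26) is rewritten as:
$$f(x) = \exp\left[-\Gamma - \sum_{i=1}^{m}\lambda_i\left(x^i - \mu_i\right)\right] \tag{30}$$
Combining Equations (28) and (29) and arranging terms produces:
$$\exp(\Gamma) = \int \exp\left[-\sum_{i=1}^{m}\lambda_i\left(x^i - \mu_i\right)\right]\mathrm{d}x \tag{31}$$
The integral term on the right-hand side of Equation (31) is defined as a potential function:
$$Q(\lambda_1, \cdots, \lambda_m) = \int \exp\left[-\sum_{i=1}^{m}\lambda_i\left(x^i - \mu_i\right)\right]\mathrm{d}x \tag{32}$$
Then, Equations (29) and (30) are rewritten as:
$$\Gamma = \ln Q(\lambda_1, \cdots, \lambda_m) \tag{33}$$
and:
$$f(x) = \frac{1}{Q(\lambda_1, \cdots, \lambda_m)}\exp\left[-\sum_{i=1}^{m}\lambda_i\left(x^i - \mu_i\right)\right] \tag{34}$$
From the definition of $Q(\lambda_1, \cdots, \lambda_m)$, calculating the partial derivative of $Q$ with respect to $\lambda_i$ $(i = 1, \cdots, m)$ yields:
$$\frac{\partial Q}{\partial\lambda_i} = -\int\left(x^i - \mu_i\right)\exp\left[-\sum_{j=1}^{m}\lambda_j\left(x^j - \mu_j\right)\right]\mathrm{d}x = -Q\int\left(x^i - \mu_i\right)f(x)\,\mathrm{d}x \tag{35}$$
It can be readily obtained from Equation (23):
$$\int\left(x^i - \mu_i\right)f(x)\,\mathrm{d}x = 0 \tag{36}$$
Therefore, the partial derivatives $\partial Q/\partial\lambda_i$ $(i = 1, \cdots, m)$ are zero. This means the point $(\lambda_1, \cdots, \lambda_m)$ is at least a stationary point of the potential function $Q(\lambda_1, \cdots, \lambda_m)$. Calculating the elements of the corresponding Hessian matrix:
$$\frac{\partial^2 Q}{\partial\lambda_i\,\partial\lambda_j} = Q\int\left(x^i - \mu_i\right)\left(x^j - \mu_j\right)f(x)\,\mathrm{d}x \tag{37}$$
Equation (37) shows that the Hessian matrix of $Q(\lambda_1, \cdots, \lambda_m)$ is a variance-covariance-like matrix, because the functions $x^i$ $(i = 1, \cdots, m)$ are linearly independent. As long as $f(x)$ is a PDF, this variance-covariance-like Hessian matrix is positive definite and of full rank. Agmon et al. [25] proved that the problem of searching for a minimum of the potential function $Q(\lambda_1, \cdots, \lambda_m)$ is equivalent to solving the $m$ nonlinear equations. In other words, the global minimum of $Q(\lambda_1, \cdots, \lambda_m)$ is also the solution set of the MaxEnt distribution that maximizes the information entropy in Equation (22) while satisfying the statistical moment constraints. Since there is no constraint on the potential function $Q(\lambda_1, \cdots, \lambda_m)$, the solution procedure is easier and can be performed by the well-known Nelder-Mead algorithm.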
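The unconstrained minimisation of the potential function can be sketched numerically as follows. Note the substitution: to keep the example short and deterministic, this sketch uses a damped Newton iteration with the analytical gradient and Hessian of Q instead of the Nelder-Mead search mentioned above. The integration interval, the trapezoidal grid, the starting point, the tolerances and the target moments are all assumptions, and `fit_maxent` is a hypothetical name. The variable is assumed to be standardised, so the target raw moments are 0, 1, skewness and kurtosis.

```python
import numpy as np

def fit_maxent(moments, half_width=8.0, n_grid=4001, tol=1e-10, max_iter=200):
    """Minimise Q(lambda) for raw-moment targets mu_1..mu_m (m >= 2).

    Returns (lam, x, pdf) with the PDF evaluated on the integration grid x.
    """
    mu = np.asarray(moments, dtype=float)
    m = mu.size
    x = np.linspace(-half_width, half_width, n_grid)
    w = np.full(n_grid, x[1] - x[0])
    w[[0, -1]] *= 0.5                                  # trapezoidal weights
    centred = np.vstack([x ** i for i in range(1, m + 1)]) - mu[:, None]

    def weighted_kernel(lam):
        # Quadrature weights times exp(-sum_i lam_i (x^i - mu_i)).
        with np.errstate(over="ignore"):
            return w * np.exp(-lam @ centred)

    lam = np.zeros(m)
    lam[1] = 0.5                                       # start from the standard normal
    for _ in range(max_iter):
        k = weighted_kernel(lam)
        q = k.sum()                                    # potential function Q
        grad = -(centred @ k)                          # first derivatives of Q
        if np.linalg.norm(grad) < tol:
            break
        hess = (centred * k) @ centred.T               # covariance-like Hessian of Q
        step = np.linalg.solve(hess, grad)
        t = 1.0
        while t > 1e-10:                               # backtrack until Q does not increase
            q_trial = weighted_kernel(lam - t * step).sum()
            if np.isfinite(q_trial) and q_trial <= q + 1e-12:
                break
            t *= 0.5
        lam = lam - t * step
    k = weighted_kernel(lam)
    pdf = np.exp(-lam @ centred) / k.sum()
    return lam, x, pdf

# Hypothetical standardised targets: mean 0, E[x^2] = 1, skewness 0.3, kurtosis 2.8.
target = [0.0, 1.0, 0.3, 2.8]
lam, x, pdf = fit_maxent(target)
w = np.full(x.size, x[1] - x[0])
w[[0, -1]] *= 0.5
fitted = [float(np.sum(w * x ** i * pdf)) for i in range(1, 5)]
```

The list `fitted` holds the first four raw moments of the identified PDF and can be compared with `target` to confirm that the constraints are met. Because the integration interval is finite, Q stays finite even when the multiplier of the highest power is negative, which is one reason a bounded domain is assumed here.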

Identifying the Distribution Type of Fatigue Life
In order to determine the MaxEnt distribution of fatigue life, the first four statistical moments are required and are calculated from experimental data. It should be pointed out that working with standardized statistical moments makes the algorithm more convenient to implement [20]. It should also be noted that the type of entropy used in this paper is information entropy.
Then, the first two statistical moment constraints are rewritten as:
$$\int x_b f(x_b)\,\mathrm{d}x_b = 0 \tag{38}$$
$$\int x_b^2 f(x_b)\,\mathrm{d}x_b = 1 \tag{39}$$
where $x_b = (x - \hat{\mu})/\hat{s}$ is the standardized random variable which is used to describe the randomness in fatigue life. The skewness and kurtosis remain unchanged because the transformation is linear.
The probability distribution of the fatigue life of a metal material can be determined from the first four statistical moments based on the MaxEnt principle, i.e., by solving the optimization problem in Section 4.2 using the numerical algorithm in Section 4.3. In probability theory, a finite set of statistical moments is not sufficient to uniquely determine the underlying probability distribution. Following the principle of maximum entropy, however, one can select a single distribution from among all possible distributions. This distribution is consistent with the available partial information and contains the minimum of arbitrary assumptions, i.e., it minimizes the introduction of subjective uncertainty. Furthermore, it is the only unbiased distribution [22,23].
Once the expression of the fatigue life distribution is identified, it can be readily employed to calculate the cumulative distribution function (CDF) and quantiles of fatigue life numerically. The CDF and quantiles are valuable for fatigue reliability prediction, analysis and design.
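A minimal sketch of the CDF and quantile computation for an identified MaxEnt PDF is given below. It assumes the standardised form with given Lagrange multipliers, a finite integration interval and trapezoidal integration. The multipliers used in the example are those of the standard normal distribution, and the mean and standard deviation used for the back-transformation are hypothetical values, as are the function names.

```python
import math

def maxent_kernel(xb, lam):
    """Unnormalised MaxEnt density exp(-sum_i lam_i * xb^i) with i starting at 1."""
    return math.exp(-sum(l * xb ** (i + 1) for i, l in enumerate(lam)))

def build_cdf(lam, lower=-8.0, upper=8.0, n_steps=4000):
    """Tabulate the CDF of the standardised variable with the trapezoidal rule."""
    h = (upper - lower) / n_steps
    grid = [lower + k * h for k in range(n_steps + 1)]
    dens = [maxent_kernel(g, lam) for g in grid]
    cum = [0.0]
    for k in range(1, n_steps + 1):
        cum.append(cum[-1] + 0.5 * h * (dens[k - 1] + dens[k]))
    total = cum[-1]                      # normalising constant of the density
    return grid, [c / total for c in cum]

def quantile(p, grid, cdf):
    """Invert the tabulated CDF by bisection followed by linear interpolation."""
    lo, hi = 0, len(grid) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if cdf[mid] < p:
            lo = mid
        else:
            hi = mid
    frac = (p - cdf[lo]) / (cdf[hi] - cdf[lo])
    return grid[lo] + frac * (grid[hi] - grid[lo])

# Illustration with the multipliers of the standard normal distribution.
lam = [0.0, 0.5, 0.0, 0.0]
grid, cdf = build_cdf(lam)
xb_05 = quantile(0.05, grid, cdf)              # 5% quantile of the standardised variable
mean_hat, sd_hat = 5.0, 0.1                    # hypothetical moments of log10(N)
life_05 = 10 ** (mean_hat + sd_hat * xb_05)    # back-transformed fatigue life in cycles
```

The quantile of the standardised variable is mapped back to a fatigue life by undoing the standardisation and the logarithm. Very small or very large probabilities lie in the extrapolated tails, where the tabulated CDF is least reliable.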
Here, we should also mention another active research branch using entropic concepts, namely entropy-based damage and fatigue characterization, which has been attracting considerable research interest. That approach works within an irreversible thermodynamic framework, and the entropy production in a solid system is used as a measure of damage. The discussion of this method is beyond the scope of this paper, and readers may refer to references [11-18] for more details.

Test Examples
Two groups of fatigue data sets are used to demonstrate the performance of the presented MaxEnt distribution. The first group is taken from Schijve's study [4], and it includes six fatigue data sets for 2024-T3 Alclad material. The second group stems from Wu and Ni's study [9], which focused on the fatigue life of crack growth. The total fit effects and the tail effects are compared with those of the log-normal distribution and the three-parameter Weibull distribution. To guarantee that the predicted fatigue life takes positive values, all experimental data are processed by taking the base-10 logarithm of the fatigue lives $N_i$.

Six Fatigue Data Sets From Schijve's Study
The presented MaxEnt distribution, the log-normal distribution and the three-parameter Weibull distribution are first applied to six fatigue data sets of 18 to 30 similar fatigue tests from Schijve's work [4]. The experimental information is summarized in Table 1. Three kinds of specimens were involved in this fatigue testing, i.e., unnotched, edge notched and riveted lap joint. In addition, two stress levels were considered. In order to apply the presented MaxEnt distribution, we first calculated the first four statistical moments of the fatigue lives, as shown in Table 2. For the original test results, please refer to Schijve's work [4]. The estimators of all parameters in the three distributions are listed in Table 3. Unfortunately, no Weibull distribution could be fitted to data set 3 through the least squares method. The computational results of the fit indexes are listed in Table 4. In addition, Figures 1 and 2 show the PDF curves and probability plots of these six fatigue data sets. From Table 4, it can be seen that the r.m.s. values of the Weibull distribution are similar to or smaller than those of the log-normal distribution. This implies that the Weibull distribution has a better total fit effect than the log-normal distribution for the investigated data sets. Comparing the r.m.s. values of the presented MaxEnt distribution and the log-normal distribution, a similar conclusion can be drawn from Table 4, i.e., the MaxEnt distribution gives a better total fit effect than the log-normal distribution on average. For data set 2, the r.m.s. value is 0.0335 for the log-normal distribution and 0.0337 for the MaxEnt distribution. These two values are too small and too close to be distinguished. Then, the r.m.s. values of the Weibull distribution and the MaxEnt distribution are compared. In 4 of 6 data sets, the MaxEnt distribution has a smaller r.m.s. value than the Weibull distribution. It should be pointed out that the MaxEnt distribution is suited to fatigue data set 3, with a good fit (the r.m.s. is 0.0122), while the Weibull distribution cannot provide a fit. For data set 4, the MaxEnt distribution and the Weibull distribution can also hardly be distinguished, since both have small values for the r.m.s. of the deviations. The above observations indicate that, on average, the MaxEnt distribution is better than the Weibull distribution for the investigated fatigue data sets.
Next, we compare the tail fit effects. For the log-normal distribution, data sets 1, 2 and 4 have the same trend, i.e., $d_1 < d_2$, and data sets 3, 5 and 6 have the same trend, i.e., $d_1 > d_2$. The trend $d_1 < d_2$ is preferred in engineering design because the predicted fatigue life may be conservative for the investigated fatigue data sets. Two of the six data sets, i.e., sets 2 and 4, show $d_1 < 0$, while the remaining data sets have $d_1 \ge 0$. For the Weibull distribution, data sets 1, 2 and 4 also show a trend of $d_1 < d_2$, and data sets 5 and 6 have a trend of $d_1 > d_2$. Based on this observation, it seems that the three-parameter Weibull distribution and the log-normal distribution have the same predicted trend. For the MaxEnt distribution, data sets 2 and 4 have a trend of $d_1 < d_2$, and the remaining data sets have a trend of $d_1 > d_2$. However, the value of $d_1$ is very close to that of $d_2$ for data set 1, for which the other two distributions show the preferred trend $d_1 < d_2$. Generally speaking, the three distributions have similar predicted trends for the investigated fatigue data sets. In the next step, the number of data sets with negative $d_1$ is checked for each probability distribution. Both the log-normal distribution and the MaxEnt distribution have two cases of $d_1 < 0$, i.e., data sets 2 and 4, while the Weibull distribution has only one case of $d_1 < 0$, i.e., data set 2. Considering these two parameters, all three probability distributions may give a non-conservative prediction in the tail region for the fatigue data investigated here.
The PDF curves of the three distributions are shown in Figure 1. For data sets 1, 2, 5 and 6, the PDF shapes obtained by the three distributions are very similar to each other. This further confirms the observation from the comparison of the r.m.s. values in Table 4. The data set with the smallest differences among the r.m.s. values in Table 4 also shows the smallest differences among the PDF curves of the three distributions, as can be seen in Figure 1e. As mentioned before, the three-parameter Weibull distribution is not suitable for data set 3. Both the log-normal distribution and the MaxEnt distribution can be used to fit data set 3. However, a bi-modal PDF is observed for the MaxEnt distribution. According to the r.m.s. values in Table 4, the MaxEnt distribution is better than the log-normal distribution for data set 3. For data set 4, the MaxEnt distribution is similar to the three-parameter Weibull distribution, but different from the log-normal distribution (Figure 1d).
The probability plots of the three distributions are given in Figure 2. For comparison purposes, the experimental data are also plotted in Figure 2. These probability plots are drawn on a normal probability graph. This means the fitted normal distribution is a straight line, while the other two distributions are nonlinear curves. For data sets 1, 2, 5 and 6, the CDF curves obtained by the MaxEnt distribution and the Weibull distribution are in good agreement with each other, especially in the region where experimental samples are available, and are better than those of the lognormal distribution. In contrast, the differences among the CDF curves become larger in the small probability region (left tail region on a PDF curve) and the high reliability region (right tail region on a PDF curve). This phenomenon is not surprising, since both the small probability region and the high reliability region are extrapolated from the experimental data. From Figure 2c, it can be seen that data set 3 has an irregular trend. Although the lognormal distribution may be used to fit this data set, it is clear that the lognormal distribution disagrees with the trend of the data. Considering the information in Figure 2c, the MaxEnt distribution is the only proper distribution to fit data set 3. For data set 4, the experimental data show a trend similar to that of data set 3, because the specimens are the same and only the stress levels differ. All three investigated distributions may be used to fit it. However, Figure 2d further confirms that both the MaxEnt distribution and the Weibull distribution are better than the lognormal distribution.

A Fatigue Data Set of Fatigue Crack Propagation
The experimental data of fatigue crack propagation reported in [9] were selected to further demonstrate the performance of the presented MaxEnt distribution. The experimental system comprises an MTS dynamic testing machine, a machine controller, a LabVIEW signal generating/data acquisition system and a zoom microscope for the measurement of crack length. The material is 2024-T351 aluminum alloy, which is widely used in the aeronautical industry. The specimens were 50.0 mm wide and 12.0 mm thick. The nominal yield strength of the material is 320 MPa, the ultimate strength is 462 MPa, and the elongation is 15.4%.
The crack was started at a length of 15.0 mm and extended to 18.0 mm by pre-cracking. During both the pre-cracking and the fatigue crack growth tests, sinusoidal loads with a maximum of 4.5 kN, a minimum of 0.9 kN and a frequency of 15 Hz were applied. Figure 3 shows the crack propagation curves for all 30 specimens. The statistical moments of the fatigue lives at different crack lengths, computed according to Equations (18)-(21), are summarized in Table 5.
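As an illustration of how the first four statistical moments of a fatigue life sample can be computed, a minimal sketch is given below. Equations (18)-(21) are not reproduced in this section, so the estimator conventions used here (sample standard deviation with n - 1, skewness and kurtosis from central moments divided by n) and the function name `first_four_moments` are assumptions and may differ from the definitions in the paper.

```python
import math


def first_four_moments(samples):
    """Return mean, standard deviation, skewness and kurtosis of a sample."""
    n = len(samples)
    mean = sum(samples) / n
    dev = [x - mean for x in samples]
    # Central moments with divisor n (assumed convention).
    m2 = sum(d ** 2 for d in dev) / n
    m3 = sum(d ** 3 for d in dev) / n
    m4 = sum(d ** 4 for d in dev) / n
    std = math.sqrt(m2 * n / (n - 1))  # sample standard deviation, divisor n - 1
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    return mean, std, skewness, kurtosis
```

Applied to the fatigue lives of the 30 specimens at a fixed crack length, such a function would yield one row of moments of the kind listed in Table 5.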

Based on the statistical moments in Table 5, the MaxEnt principle was applied to identify the probability distributions at different crack lengths, i.e., a = 22, 24, 26, 28 and 30 mm. Table 6 gives the estimators of the parameters of the three distributions at different crack lengths. Table 7 summarizes the fit indexes of the three distributions. For any one of the three distributions, the PDF expressions for the fatigue lives at different crack lengths are very similar to each other. Comparing the r.m.s. of deviations (Table 7), the presented MaxEnt distribution produces the smallest values among the three distributions for all crack lengths, and the three-parameter Weibull distribution yields smaller values than the lognormal distribution for all crack lengths. In other words, in terms of overall fit, the MaxEnt distribution is the best choice for this fatigue data set. For all three distributions, five out of six crack lengths show a trend of $d_1 < d_2$, except at a = 22 mm, which indicates conservative evaluations in the left tails. In addition, most of the $d_1$ values are negative, except for a = 22 mm from the MaxEnt distribution.
The PDF curves and probability plots of the three distributions are plotted in Figures 4 and 5, respectively. It is obvious that the distributions identified by the MaxEnt principle are not of a commonly used standard type. A close look at the identified PDF curves shows that the MaxEnt principle yields bimodal PDFs in this case study (Figure 4). They are very different from standard probability distributions, e.g., the lognormal distribution and the Weibull distribution in this study. From Figure 5, it can be seen that the fatigue lives at all crack lengths have an irregular trend, as in data set 3 of test example 1. The MaxEnt distribution agrees better with the experimental data than the lognormal distribution and the Weibull distribution do. For the same experimental data, both the lognormal and the Weibull distribution can pass hypothesis testing. However, the CDFs of fatigue lives identified by the lognormal and Weibull distributions cannot match the experimental data well. This indicates that the commonly used standard probability distributions are not suitable for describing the scatter in fatigue lives in this case study.
For comparison purposes, the computational results reported by Wu and Ni [9] are also considered in this study. It should be pointed out that Wu and Ni adopted a stochastic process method [7,26], which rests on a more advanced mathematical theory; however, its computational results are almost the same as those of the lognormal distribution. Compared with the experimental data, the MaxEnt distribution is better than the stochastic process method. Furthermore, the proposed method is much easier to implement than the stochastic process method.

Conclusions
A new computational method is proposed in this paper for identifying the probability distribution of the fatigue life of materials and structures. It is based on the first four statistical moments of fatigue life and the maximum entropy principle. The first four statistical moments are calculated from samples of fatigue life obtained in fatigue testing or measurement. They then serve as constraints in the maximum entropy principle. The corresponding optimization problem is solved by transforming it into an unconstrained optimization problem using a potential function. This strategy avoids the possible numerical singularity when solving N nonlinear equations. The following conclusions have been reached:
(1) The MaxEnt distribution is the most natural distribution given the fatigue life samples. It reduces the subjective uncertainty that is introduced when a standard distribution is chosen to model fatigue life. Additionally, the presented method has several significant advantages, such as a simple concept, high efficiency, easy programming and wide applicability.
(2) A careful comparison between the MaxEnt distribution, the lognormal distribution and the three-parameter Weibull distribution has been carried out in this paper. The three-parameter Weibull distribution has a minimum fatigue life (N 0) as its third parameter. This seems to be consistent with fatigue physics, but the physical significance of this minimum fatigue life remains unclear because its estimated value is obtained by an iterative algorithm without any physical basis. Because of this third parameter, the three-parameter Weibull distribution cannot be applied in extreme cases, e.g., data set 3 in test example 1.
(3) The fatigue life distributions of two groups of fatigue data sets were identified by the lognormal distribution, the Weibull distribution and the MaxEnt distribution. The computational results show that both the Weibull distribution and the MaxEnt distribution agree better with the experimental data than the lognormal distribution. The most attractive feature is that the MaxEnt distribution is better than the Weibull distribution on average, especially for cases where the fatigue life CDF has an irregular trend.
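The moment-constrained MaxEnt identification summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the density is taken as p(x) proportional to exp(-(λ1 x + λ2 x^2 + λ3 x^3 + λ4 x^4)), the potential function is taken in the commonly used dual form Γ(λ) = ln ∫ exp(-Σ λ_i x^i) dx + Σ λ_i μ_i (the paper's own equations are not reproduced in this section and may differ), the variable is assumed to be standardized (zero mean, unit variance), the integrals are evaluated on a fixed grid over [-8, 8], and a damped Newton iteration is used as the unconstrained optimizer. The names `raw_moments`, `solve_linear` and `fit_maxent` are hypothetical.

```python
import math


def raw_moments(lam, xs, h):
    """Return ln Z and the raw moments E[x^0..x^8] of p(x) ~ exp(-sum_i lam_i x^i)."""
    expo = [-sum(l * x ** (i + 1) for i, l in enumerate(lam)) for x in xs]
    top = max(expo)  # shift the exponents to avoid overflow in exp
    w = [math.exp(e - top) for e in expo]
    total = sum(w)
    moments = [sum(wk * x ** k for wk, x in zip(w, xs)) / total for k in range(9)]
    return top + math.log(total * h), moments


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n + 1):
                a[r][k] -= f * a[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))) / a[r][r]
    return x


def fit_maxent(mu, half_width=8.0, n=1601, tol=1e-8, max_iter=100):
    """Minimize the potential function for target raw moments mu = [E x, .., E x^4]."""
    h = 2.0 * half_width / (n - 1)
    xs = [-half_width + k * h for k in range(n)]
    lam = [0.0, 0.5, 0.0, 0.0]  # start from the standard normal density (assumed)
    for _ in range(max_iter):
        log_z, m = raw_moments(lam, xs, h)
        grad = [mu[i] - m[i + 1] for i in range(4)]  # gradient of the potential
        if max(abs(g) for g in grad) < tol:
            return lam, m[1:5]
        # Hessian of the potential: covariance matrix of (x, x^2, x^3, x^4).
        hess = [[m[i + j + 2] - m[i + 1] * m[j + 1] for j in range(4)] for i in range(4)]
        step = solve_linear(hess, [-g for g in grad])
        gamma = log_z + sum(l * u for l, u in zip(lam, mu))
        t = 1.0
        while True:  # backtracking: halve the step until the potential does not grow
            trial = [l + t * s for l, s in zip(lam, step)]
            trial_gamma = raw_moments(trial, xs, h)[0] + sum(l * u for l, u in zip(trial, mu))
            if trial_gamma <= gamma + 1e-12 or t < 1e-8:
                break
            t *= 0.5
        lam = trial
    raise RuntimeError("MaxEnt fit did not converge")
```

For a standardized variable, the targets are mu = [0, 1, skewness, kurtosis]. Because the potential function is convex in λ, its minimizer reproduces the prescribed moments, so no system of nonlinear moment equations has to be solved directly.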

Table 1. Experimental information of six fatigue data sets.

Table 2. Statistical moments of six fatigue data sets.

Table 3. Estimators of parameters in three probability distributions.

Table 4. Fit indexes of three investigated probability distributions.

Table 5. The statistical moments of fatigue lives at different crack lengths.

Table 6. Estimators of parameters in three probability distributions at different crack lengths.

Table 7. Fit indexes of three probability distributions at different crack lengths.