The Sine Modified Lindley Distribution

Abstract: The paper contributes to the development of a flexible trigonometric extension of the well-known modified Lindley distribution. More precisely, we use features from the sine generated family of distributions to create an original one-parameter survival distribution, called the sine modified Lindley distribution. As the main motivation, it provides an attractive alternative to the Lindley and modified Lindley distributions; it may be better able to model lifetime phenomena presenting data of a leptokurtic nature. In the first part of the paper, we introduce it conceptually and discuss its key characteristics through functional, reliability, and moment analyses. Then, an applied study is conducted. The usefulness, applicability, and agility of the sine modified Lindley distribution are illustrated through a detailed simulation study. Two real data sets from the engineering and climate sectors are analyzed. As a result, the sine modified Lindley model is shown to provide a better fit than important models, such as the Lindley, modified Lindley, sine exponential, and sine Lindley models, based on standard goodness-of-fit criteria.


Introduction
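The last few years in applied sciences have been marked by a growing need to analyze large volumes of data. To meet this need, new models have been proposed, and their improvement is a hot topic. These require, among other things, the underlying development of new (statistical or probabilistic) distributions. In this regard, one idea is to modify existing distributions in order to make the corresponding models more flexible and adaptable to several kinds of data. Hence, several modifications based on mathematical techniques have been proposed, generating distributions classified under "families of distributions". Readers are referred to [1] for an overview. In recent times, the families described by "trigonometric transformations" have gained a lot of interest because of their applicability and working capability in a variety of situations. Related to this topic, Refs. [2-4] were among the first to study the sinusoidal transformation that leads to the sine generated (S-G) family. For this family, the cumulative distribution function (cdf) and probability density function (pdf) are given by
$$F_S(x;\eta) = \sin\left[\frac{\pi}{2} G(x;\eta)\right], \quad x \in \mathbb{R}, \tag{1}$$
and
$$f_S(x;\eta) = \frac{\pi}{2}\, g(x;\eta) \cos\left[\frac{\pi}{2} G(x;\eta)\right], \quad x \in \mathbb{R}, \tag{2}$$
respectively, where $G(x;\eta)$ and $g(x;\eta)$ represent the cdf and pdf of a certain continuous distribution with a parameter vector denoted by $\eta$. Thus, the functions $F_S(x;\eta)$ and $f_S(x;\eta)$ are linked to a baseline or parent distribution determined beforehand. Since $\sin(\pi y/2) \ge y$ for all $y \in [0,1]$, the S-G family satisfies the following first-order stochastic ordering (FOSO) property:
$$G(x;\eta) \le F_S(x;\eta), \quad x \in \mathbb{R}. \tag{3}$$
In this paper, the baseline is the modified Lindley (ML) distribution with parameter $\theta > 0$ (see [11]), whose cdf and pdf are given by
$$G_{ML}(x;\theta) = 1 - \left[1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right] e^{-\theta x}, \quad x > 0, \tag{4}$$
and
$$g_{ML}(x;\theta) = \frac{\theta}{1+\theta}\, e^{-2\theta x}\left[(1+\theta)e^{\theta x} + 2\theta x - 1\right], \quad x > 0, \tag{5}$$
respectively, with $G_{ML}(x;\theta) = g_{ML}(x;\theta) = 0$ for $x \le 0$. Basically, the ML distribution satisfies the following FOSO property: for all $x \in \mathbb{R}$,
$$G_L(x;\theta) \le G_{ML}(x;\theta) \le G_E(x;\theta), \tag{6}$$
where $G_L(x;\theta)$ and $G_E(x;\theta)$ represent the cdfs of the Lindley and exponential distributions, respectively. In this sense, the ML distribution constitutes a real alternative to these two classical distributions.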
The ML distribution is also identified as a linear combination of the exponential distribution with parameter θ and the gamma distribution with parameters (2, 2θ), and it has an "increasing-reverse bathtub-constant" hazard rate function (hrf). Its practical benefit is noteworthy; the ML model outperforms the Lindley and exponential models on the three data sets considered in [11]. Several distributions extending or generalizing the ML distribution have since been proposed. These include the Poisson ML distribution by [12], the wrapped ML distribution by [13], and the discrete ML distribution by [14]. The immediate aim of the sine modified Lindley (S-ML) distribution is to use the S-G technique to enhance the effectiveness of the ML distribution on diverse data sets. In particular, thanks to the FOSO properties in Equations (3) and (6), it is a real and attractive alternative to the Lindley and ML distributions. Deeper motivations are revealed in the rest of the study. To summarise, the utility and adaptability of the S-ML model make it particularly appealing for fitting data from various fields. Remarkably, the corresponding pdf shows a variety of curve shapes, including unimodal, decreasing, and right-skewed shapes. In comparison to the pdf of the ML distribution, when it is unimodal, the pdf of the S-ML distribution has a more rounded peak, meaning that it is better adapted to fit a data histogram presenting a high kurtosis level. Furthermore, the S-ML distribution exhibits a non-monotonic hrf which is "increasing-reverse bathtub-constant" shaped. The hrf of the ML distribution has this feature as well. As with other competitive models, the S-ML model delivers consistently accurate fits owing to these characteristics. This claim is supported by examining two published real-world data sets, from engineering and climate science, against twelve competing models.
The rest of the paper is organized as follows. The concept and key properties of the S-ML distribution are covered in Section 2. A moment analysis is conducted in Section 3. The maximum likelihood estimation of the parameter θ is explained in Section 4. A simulation study is presented in Section 5. Section 6 assesses the applicability of the proposed model to real-world data. Finally, the conclusions are provided in Section 7.

The S-ML Distribution
The mathematical foundation for the S-ML distribution is first presented.

Functional Analysis
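To begin, we perform a functional analysis of the S-ML distribution. By substituting Equations (4) and (5) in Equations (1) and (2), respectively, and using $\sin[\pi(1-y)/2] = \cos(\pi y/2)$, we derive the major functions of the S-ML distribution; the cdf and pdf are given as follows:
$$F_{S-ML}(x;\theta) = \cos\left[\frac{\pi}{2}\left(1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right) e^{-\theta x}\right], \quad x > 0,$$
and
$$f_{S-ML}(x;\theta) = \frac{\pi}{2}\,\frac{\theta}{1+\theta}\, e^{-2\theta x}\left[(1+\theta)e^{\theta x} + 2\theta x - 1\right] \sin\left[\frac{\pi}{2}\left(1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right) e^{-\theta x}\right], \quad x > 0,$$
with $\theta > 0$, and $F_{S-ML}(x;\theta) = f_{S-ML}(x;\theta) = 0$ for $x \le 0$. As a primary result mentioned in the introduction, the following FOSO property holds: $G_{ML}(x;\theta) \le F_{S-ML}(x;\theta)$ for any $x \in \mathbb{R}$, which marks an immediate difference between ML and S-ML modeling from the cdf viewpoint. Differences can also be observed in the respective pdfs, as discussed below. Naturally, various forms of $f_{S-ML}(x;\theta)$ can be obtained by changing the value of θ. Because this function is relatively complex analytically, we rely on a graphical study for the shape analysis. The most representative shapes of this pdf are shown in Figure 1.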
We can observe from Figure 1 that, for smaller values of θ, the plot of $f_{S-ML}(x;\theta)$ is unimodal, and for larger values of θ, it is decreasing. As a result, the S-ML distribution is suitable for modeling a wide range of lifetime phenomena. Compared to the parent ML distribution, the following observation is made: when it is unimodal, the pdf of the S-ML distribution has a more rounded peak, meaning that it is better adapted to fit a data histogram presenting a high kurtosis level. In other words, the S-ML model is better able to analyze data of a leptokurtic nature.
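The construction of the S-ML cdf and pdf from the ML baseline can be sketched numerically as follows. This is a minimal illustration, not the authors' code: it assumes the ML baseline cdf $G_{ML}(x;\theta) = 1 - [1 + \theta x e^{-\theta x}/(1+\theta)]e^{-\theta x}$ for $x > 0$, and the function names (`ml_cdf`, `ml_pdf`, `sml_cdf`, `sml_pdf`) are hypothetical. It also checks the FOSO property on a small grid.

```python
import math

def ml_cdf(x, theta):
    """cdf of the modified Lindley (ML) baseline (assumed form), zero for x <= 0."""
    if x <= 0:
        return 0.0
    e = math.exp(-theta * x)
    return 1.0 - (1.0 + theta * x / (1.0 + theta) * e) * e

def ml_pdf(x, theta):
    """pdf of the ML baseline, written with decaying exponentials to avoid overflow."""
    if x <= 0:
        return 0.0
    e = math.exp(-theta * x)
    return theta / (1.0 + theta) * ((1.0 + theta) * e + (2.0 * theta * x - 1.0) * e * e)

def sml_cdf(x, theta):
    """S-ML cdf: sine transform of the ML cdf."""
    return math.sin(0.5 * math.pi * ml_cdf(x, theta))

def sml_pdf(x, theta):
    """S-ML pdf: derivative of sml_cdf with respect to x."""
    g = ml_cdf(x, theta)
    return 0.5 * math.pi * ml_pdf(x, theta) * math.cos(0.5 * math.pi * g)

# FOSO check on a small grid: G_ML(x) <= F_S-ML(x) for every grid point
grid = [0.1 * i for i in range(1, 101)]
foso_ok = all(ml_cdf(x, th) <= sml_cdf(x, th) + 1e-15
              for th in (0.5, 1.5, 2.5) for x in grid)
print("FOSO holds on the grid:", foso_ok)

# Tabulate the pdf for a small and a larger theta to inspect its shape
for th in (0.5, 2.5):
    print(th, [round(sml_pdf(x, th), 4) for x in (0.1, 0.5, 1.0, 2.0, 4.0)])
```

Evaluating `sml_pdf` on a finer grid and plotting it for several values of θ is one way to inspect the shapes discussed around Figure 1.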

Reliability Analysis
We complete the previous functional analysis by studying the main reliability functions: the survival function (sf), hazard rate function (hrf), reversed hrf (rhrf), second rate of failure (srf), and cumulative hrf (chrf) of the S-ML distribution. The sf measures the probability that an item survives beyond a specified time. Mathematically, the sf of the S-ML distribution is given by
$$S_{S-ML}(x;\theta) = 1 - F_{S-ML}(x;\theta) = 1 - \cos\left[\frac{\pi}{2}\left(1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right) e^{-\theta x}\right], \quad x > 0.$$
The hrf measures the instantaneous likelihood that an item fails at a given time, given that it has survived up to that time. As a direct consequence, it is critical in the classification of survival distributions.
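The hrf of the S-ML distribution is the ratio of the pdf to the sf; it is specified by
$$h_{S-ML}(x;\theta) = \frac{\pi\theta\, e^{-2\theta x}\left[(1+\theta)e^{\theta x} + 2\theta x - 1\right] \sin\left[\frac{\pi}{2}\left(1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right) e^{-\theta x}\right]}{2(1+\theta)\left\{1 - \cos\left[\frac{\pi}{2}\left(1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right) e^{-\theta x}\right]\right\}}, \quad x > 0.$$
Further, Figure 2 displays the shapes of this hrf for various values of θ.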
Math. Comput. Appl. 2021, 1, 0
Figure 2 emphasizes that the hrf of the S-ML distribution has an "increasing-reverse bathtub-constant" shape, which is also possessed by the hrf of the ML distribution. This marks a clear difference from the Lindley and exponential distributions. It is also a desirable property for modelling purposes.
The rhrf is the ratio of the pdf to the cdf, and it plays a role in analyzing censored data. Analytically, it corresponds to
$$r_{S-ML}(x;\theta) = \frac{f_{S-ML}(x;\theta)}{F_{S-ML}(x;\theta)} = \frac{\pi}{2}\,\frac{\theta}{1+\theta}\, e^{-2\theta x}\left[(1+\theta)e^{\theta x} + 2\theta x - 1\right] \tan\left[\frac{\pi}{2}\left(1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right) e^{-\theta x}\right], \quad x > 0.$$
The srf is the logarithmic ratio of the sf at times x and x + 1, and it is given by
$$srf_{S-ML}(x;\theta) = \ln\left[\frac{S_{S-ML}(x;\theta)}{S_{S-ML}(x+1;\theta)}\right], \quad x > 0,$$
where $S_{S-ML}(x;\theta) = 1 - F_{S-ML}(x;\theta)$ denotes the sf of the S-ML distribution.
The chrf is the negative logarithm of the sf and is given by
$$H_{S-ML}(x;\theta) = -\ln\left[S_{S-ML}(x;\theta)\right] = -\ln\left\{1 - \cos\left[\frac{\pi}{2}\left(1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right) e^{-\theta x}\right]\right\}, \quad x > 0.$$
With these functions, we conclude the reliability analysis of the S-ML distribution.
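The reliability functions described above can be evaluated with a few lines of code. The sketch below is illustrative only: it assumes the S-ML cdf in the form $\cos[(\pi/2)u(x)]$ with $u(x) = [1 + \theta x e^{-\theta x}/(1+\theta)]e^{-\theta x}$ (the survival function of the ML baseline), and all function names are hypothetical.

```python
import math

def _u(x, theta):
    """u(x) = 1 - G_ML(x; theta), the survival function of the ML baseline (assumed form)."""
    e = math.exp(-theta * x)
    return (1.0 + theta * x / (1.0 + theta) * e) * e

def _ml_pdf(x, theta):
    """pdf of the ML baseline, written with decaying exponentials."""
    e = math.exp(-theta * x)
    return theta / (1.0 + theta) * ((1.0 + theta) * e + (2.0 * theta * x - 1.0) * e * e)

def cdf(x, theta):
    return math.cos(0.5 * math.pi * _u(x, theta))

def pdf(x, theta):
    return 0.5 * math.pi * _ml_pdf(x, theta) * math.sin(0.5 * math.pi * _u(x, theta))

def sf(x, theta):
    # 1 - cos(a) = 2 sin^2(a / 2), which is numerically safer when a is small
    return 2.0 * math.sin(0.25 * math.pi * _u(x, theta)) ** 2

def hrf(x, theta):
    """Hazard rate: pdf divided by sf."""
    return pdf(x, theta) / sf(x, theta)

def rhrf(x, theta):
    """Reversed hazard rate: pdf divided by cdf (x > 0)."""
    return pdf(x, theta) / cdf(x, theta)

def srf(x, theta):
    """Second rate of failure: log-ratio of the sf at x and x + 1."""
    return math.log(sf(x, theta) / sf(x + 1.0, theta))

def chrf(x, theta):
    """Cumulative hazard rate: negative log of the sf."""
    return -math.log(sf(x, theta))

# Tabulate the hrf for one illustrative value of theta
theta = 1.5
for x in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(x, round(hrf(x, theta), 4))
```

Writing the sf as $2\sin^2[(\pi/4)u(x)]$ is a design choice made here for numerical stability; it is algebraically identical to $1 - \cos[(\pi/2)u(x)]$.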

Moment Analysis
For any lifetime distribution, a moment analysis is necessary to assess its modeling capacities numerically, by identifying the behavior of various central and dispersion parameters, as well as the moment skewness and kurtosis coefficients.
First, for any positive integer r and a random variable X following the S-ML distribution, the r-th moment of X exists. It can be expressed as
$$mom(r) = E(X^r) = \int_0^{+\infty} x^r f_{S-ML}(x;\theta)\, dx. \tag{8}$$
Classical integration techniques are of limited use for this integral. Computer software, on the other hand, can be used to evaluate it numerically for a given θ.
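We propose a series expansion of mom(r) in the next result, which can be used for computational purposes in a less opaque way than a "ready to use but black box" computer program.
Proposition 1. The r-th moment of X can be expanded as
$$mom(r) = \frac{r}{\theta^r} \sum_{k=1}^{+\infty} \sum_{\ell=0}^{2k} \binom{2k}{\ell} \frac{(-1)^{k+1}}{(2k)!} \left(\frac{\pi}{2}\right)^{2k} \frac{\Gamma(r+\ell)}{(1+\theta)^{\ell} (\ell+2k)^{r+\ell}}.$$
Proof. For the proof, we do not directly use the integral expression of mom(r) as described in (8). An integration by parts gives
$$mom(r) = r \int_0^{+\infty} x^{r-1} \left\{1 - \cos\left[\frac{\pi}{2}\left(1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right) e^{-\theta x}\right]\right\} dx.$$
Now, by utilizing the series expansion of the cosine function and the classical binomial formula, we obtain
$$1 - \cos\left[\frac{\pi}{2}\left(1 + \frac{\theta x}{1+\theta} e^{-\theta x}\right) e^{-\theta x}\right] = \sum_{k=1}^{+\infty} \sum_{\ell=0}^{2k} \binom{2k}{\ell} \frac{(-1)^{k+1}}{(2k)!} \left(\frac{\pi}{2}\right)^{2k} \left(\frac{\theta}{1+\theta}\right)^{\ell} x^{\ell} e^{-(\ell+2k)\theta x}. \tag{9}$$
Hence, after some developments including the change of variable $y = (\ell+2k)\theta x$ (so that $dx = [1/((\ell+2k)\theta)]\,dy$) and the calculation of a gamma-type integral, we get
$$mom(r) = r \sum_{k=1}^{+\infty} \sum_{\ell=0}^{2k} \binom{2k}{\ell} \frac{(-1)^{k+1}}{(2k)!} \left(\frac{\pi}{2}\right)^{2k} \left(\frac{\theta}{1+\theta}\right)^{\ell} \frac{1}{[(\ell+2k)\theta]^{r+\ell}} \int_0^{+\infty} y^{r+\ell-1} e^{-y}\, dy = \frac{r}{\theta^r} \sum_{k=1}^{+\infty} \sum_{\ell=0}^{2k} \binom{2k}{\ell} \frac{(-1)^{k+1}}{(2k)!} \left(\frac{\pi}{2}\right)^{2k} \frac{\Gamma(r+\ell)}{(1+\theta)^{\ell} (\ell+2k)^{r+\ell}}.$$
Proposition 1 is proved.
Then, based on Proposition 1, the following finite sum approximation remains acceptable:
$$mom(r) \approx \frac{r}{\theta^r} \sum_{k=1}^{U} \sum_{\ell=0}^{2k} \binom{2k}{\ell} \frac{(-1)^{k+1}}{(2k)!} \left(\frac{\pi}{2}\right)^{2k} \frac{\Gamma(r+\ell)}{(1+\theta)^{\ell} (\ell+2k)^{r+\ell}},$$
where U represents any large integer. From the above moment formulas, we can easily derive the mean, variance, moment skewness coefficient and moment kurtosis coefficient; the mean is given by $mom(1)$, the variance is obtained as $V(X) = E[(X - mom(1))^2]$, the moment skewness coefficient can be derived as $MS = E[(X - mom(1))^3]/V(X)^{3/2}$ and the moment kurtosis coefficient can be derived as $MK = E[(X - mom(1))^4]/V(X)^2$. Table 1 gives a glimpse of these values for different values of θ. From Table 1, we can observe that, as the value of the parameter θ of the S-ML distribution increases, all the considered measures increase. Furthermore, since MS > 0, it is clear that the S-ML distribution is mainly right-skewed, and since MK > 3, it is mainly leptokurtic.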
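As a rough numerical check of the finite sum approximation, the sketch below compares a truncated double series with a direct numerical integration of $r\int_0^{+\infty} x^{r-1}[1 - F_{S-ML}(x;\theta)]\,dx$. It is an illustration under stated assumptions: the series is written as $(r/\theta^r)\sum_{k=1}^{U}\sum_{\ell=0}^{2k}\binom{2k}{\ell}\frac{(-1)^{k+1}}{(2k)!}(\pi/2)^{2k}\,\Gamma(r+\ell)/[(1+\theta)^{\ell}(\ell+2k)^{r+\ell}]$, the truncation level `U = 30` and the integration range are arbitrary choices, and the function names are hypothetical.

```python
import math

def sml_sf(x, theta):
    """S-ML survival function, 1 - cos(pi/2 * u), written as 2 sin^2(pi/4 * u)."""
    e = math.exp(-theta * x)
    u = (1.0 + theta * x / (1.0 + theta) * e) * e
    return 2.0 * math.sin(0.25 * math.pi * u) ** 2

def moment_series(r, theta, U=30):
    """Truncated double-series approximation of the r-th moment."""
    total = 0.0
    for k in range(1, U + 1):
        coef = (-1.0) ** (k + 1) * (0.5 * math.pi) ** (2 * k) / math.factorial(2 * k)
        inner = 0.0
        for l in range(2 * k + 1):
            inner += (math.comb(2 * k, l) * math.gamma(r + l)
                      / ((1.0 + theta) ** l * float(l + 2 * k) ** (r + l)))
        total += coef * inner
    return r * total / theta ** r

def moment_numeric(r, theta, n=20000):
    """Composite Simpson rule for r * integral of x^(r-1) * sf(x) on [0, 40/theta]."""
    upper = 40.0 / theta
    h = upper / n
    g = lambda x: r * x ** (r - 1) * sml_sf(x, theta)
    s = g(0.0) + g(upper)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * g(i * h)
    return s * h / 3.0

theta = 1.5
m1, m2, m3, m4 = (moment_series(r, theta) for r in (1, 2, 3, 4))
var = m2 - m1 ** 2
ms = (m3 - 3 * m1 * m2 + 2 * m1 ** 3) / var ** 1.5
mk = (m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4) / var ** 2
print("mean, variance, MS, MK:", m1, var, ms, mk)
print("series vs numeric mean:", m1, moment_numeric(1, theta))
```

The central moments are obtained here from the raw moments by the usual binomial identities; this is a convenience of the sketch and not necessarily how Table 1 was produced.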
We can complete the previous moment results by investigating the incomplete moments. To begin, let r ≥ 1 be an integer, t ≥ 0, and X be a random variable with the S-ML distribution. Based on this variable, we define its incomplete version by Y(t) = X if X ≤ t and Y(t) = 0 if X > t. Then, the r-th incomplete moment of X given at t exists, and it is defined by
$$mom(r,t) = E[Y(t)^r] = \int_0^{t} x^r f_{S-ML}(x;\theta)\, dx.$$
It is involved in developments of important probabilistic objects, such as mean deviations, income curves, etc. More basically, it can be viewed as a truncated version of the standard r-th moment. We may refer to [15] in this regard.
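In the next result, we present a series expansion of mom(r, t), which can be used for approximation purposes.
Proposition 2. The r-th incomplete moment of X given at t exists and can be expanded as
$$mom(r,t) = -t^r \left\{1 - \cos\left[\frac{\pi}{2}\left(1 + \frac{\theta t}{1+\theta} e^{-\theta t}\right) e^{-\theta t}\right]\right\} + \frac{r}{\theta^r} \sum_{k=1}^{+\infty} \sum_{\ell=0}^{2k} \binom{2k}{\ell} \frac{(-1)^{k+1}}{(2k)!} \left(\frac{\pi}{2}\right)^{2k} \frac{\gamma(r+\ell, (\ell+2k)\theta t)}{(1+\theta)^{\ell} (\ell+2k)^{r+\ell}},$$
where $\gamma(a,t)$ denotes the incomplete gamma function defined by $\gamma(a,t) = \int_0^t x^{a-1} e^{-x}\, dx$, where $a > 0$ and $t \ge 0$.
Proof. The proof follows the lines of that of Proposition 1. An integration by parts gives
$$mom(r,t) = -t^r \left[1 - F_{S-ML}(t;\theta)\right] + r \int_0^{t} x^{r-1} \left[1 - F_{S-ML}(x;\theta)\right] dx.$$
It follows from the series expansion in Equation (9) and the change of variable $y = (\ell+2k)\theta x$ that
$$\int_0^{t} x^{r-1} \left[1 - F_{S-ML}(x;\theta)\right] dx = \frac{1}{\theta^r} \sum_{k=1}^{+\infty} \sum_{\ell=0}^{2k} \binom{2k}{\ell} \frac{(-1)^{k+1}}{(2k)!} \left(\frac{\pi}{2}\right)^{2k} \frac{\gamma(r+\ell, (\ell+2k)\theta t)}{(1+\theta)^{\ell} (\ell+2k)^{r+\ell}}.$$
This concludes the proof of Proposition 2.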
The rest of the study is devoted to the applicability of the S-ML model, illustrated with concrete examples of data analysis.

Inferential Analysis
The inference of the S-ML model is covered in this section. The parameter θ is supposed to be unknown. In order to estimate it, the maximum likelihood estimation method is employed. We follow the general methodology described in [16].
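We now give a mathematical representation of this methodology in the setting of the S-ML distribution. First, let n be a positive integer and $x_1, x_2, \ldots, x_n$ be observations drawn from a random variable X following the S-ML distribution. Then, the corresponding likelihood function and log-likelihood function are as follows:
$$L = \prod_{i=1}^{n} f_{S-ML}(x_i;\theta) = \left(\frac{\pi}{2}\right)^{n} \left(\frac{\theta}{1+\theta}\right)^{n} e^{-2\theta \sum_{i=1}^{n} x_i} \prod_{i=1}^{n} \left[(1+\theta)e^{\theta x_i} + 2\theta x_i - 1\right] \prod_{i=1}^{n} \sin\left[\frac{\pi}{2}\left(1 + \frac{\theta x_i}{1+\theta} e^{-\theta x_i}\right) e^{-\theta x_i}\right]$$
and
$$\ln L = n \ln \pi - n \ln 2 + n \ln \theta - n \ln(1+\theta) - 2\theta \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \ln\left[(1+\theta)e^{\theta x_i} + 2\theta x_i - 1\right] + \sum_{i=1}^{n} \ln \sin\left[\frac{\pi}{2}\left(1 + \frac{\theta x_i}{1+\theta} e^{-\theta x_i}\right) e^{-\theta x_i}\right],$$
respectively. The maximum likelihood estimate (MLE) of θ can be defined via the following argmax definition:
$$\hat{\theta} = \underset{\theta > 0}{\operatorname{argmax}}\, \ln L. \tag{10}$$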
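This estimate is obtained as the solution of the non-linear equation $d \ln L/d\theta = 0$, where
$$\frac{d \ln L}{d\theta} = \frac{n}{\theta(1+\theta)} - 2\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \frac{[1 + (1+\theta)x_i]e^{\theta x_i} + 2x_i}{(1+\theta)e^{\theta x_i} + 2\theta x_i - 1} + \frac{\pi}{2} \sum_{i=1}^{n} u_i'(\theta) \cot\left[\frac{\pi}{2} u_i(\theta)\right],$$
with $u_i(\theta) = \left[1 + \frac{\theta x_i}{1+\theta} e^{-\theta x_i}\right] e^{-\theta x_i}$ and $u_i'(\theta) = -x_i e^{-\theta x_i} + \frac{x_i[1 - 2\theta(1+\theta)x_i]}{(1+\theta)^2} e^{-2\theta x_i}$. There is no analytical solution for this equation, but $\hat{\theta}$ can be determined numerically with any statistical software such as the R software (see [17]). Based on $\hat{\theta}$, the estimated pdf (epdf) of the S-ML model is given by $f_{S-ML}(x;\hat{\theta})$ and the estimated cdf (ecdf) of the S-ML model is given by $F_{S-ML}(x;\hat{\theta})$. Let $I(\theta) = -E\{d^2 \ln[f_{S-ML}(X;\theta)]/d\theta^2\}$ be the expected Fisher information matrix. Then, the estimated standard error (SE) of $\hat{\theta}$ is obtained as the square root of the diagonal component of $I(\hat{\theta})^{-1}$.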
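The numerical maximization can be sketched without any statistical package. The example below is only an illustration and makes several assumptions that are not in the text: it uses a golden-section search on the bracket [0.01, 10] rather than the R routines mentioned above, it replaces the expected Fisher information with the observed information (a finite-difference second derivative of the log-likelihood), and all function names are hypothetical. The data are the precipitation values analyzed later in the paper.

```python
import math

# March precipitation in Minneapolis (inches), as listed in the applications section
DATA = [0.77, 1.74, 0.81, 1.2, 1.95, 1.2, 0.47, 1.43, 3.37, 2.2,
        3, 3.09, 1.51, 2.1, 0.52, 1.62, 1.31, 0.32, 0.59, 0.81,
        2.81, 1.87, 1.18, 1.35, 4.75, 2.48, 0.96, 1.89, 0.9, 2.05]

def log_pdf(x, theta):
    """Log of the S-ML pdf at x > 0 (assumed closed form)."""
    e = math.exp(-theta * x)
    u = (1.0 + theta * x / (1.0 + theta) * e) * e
    base = (1.0 + theta) * e + (2.0 * theta * x - 1.0) * e * e
    return (math.log(0.5 * math.pi) + math.log(theta) - math.log(1.0 + theta)
            + math.log(base) + math.log(math.sin(0.5 * math.pi * u)))

def log_lik(theta, data):
    return sum(log_pdf(x, theta) for x in data)

def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search for a maximum of f on [lo, hi] (f assumed unimodal)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:                      # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                            # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def se_observed(theta_hat, data, h=1e-4):
    """SE from the observed information (substitute for the expected information)."""
    d2 = (log_lik(theta_hat + h, data) - 2.0 * log_lik(theta_hat, data)
          + log_lik(theta_hat - h, data)) / h ** 2
    return 1.0 / math.sqrt(-d2)

theta_hat = golden_max(lambda t: log_lik(t, DATA), 0.01, 10.0)
print("theta_hat:", theta_hat, "SE (observed information):", se_observed(theta_hat, DATA))
```

A derivative-free search is used here because θ is a single parameter; solving the score equation with a root finder would be an equally reasonable choice.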

Simulation Study
In the framework of the S-ML model, a simulation study is carried out to assess the performance of $\hat{\theta}$ given in Equation (10) in terms of its bias and mean squared error (MSE). The simulation procedure can be described as follows: we generate samples of sizes n = 20, 50, 100, 200, 500, 1000 from the S-ML distribution with θ = 1.25, 1.50, 2.00, 2.50. For each sample, the MLE $\hat{\theta}$ is calculated. Here, 1000 such repetitions are made to calculate the mean MLE (MMLE), bias and MSE of these estimates using the formulas
$$MMLE = \frac{1}{1000}\sum_{i=1}^{1000} \hat{\theta}_i, \quad Bias = \frac{1}{1000}\sum_{i=1}^{1000} (\hat{\theta}_i - \theta), \quad MSE = \frac{1}{1000}\sum_{i=1}^{1000} (\hat{\theta}_i - \theta)^2,$$
respectively, where $\hat{\theta}_i$ is the estimate of θ at the i-th iteration of the simulation study, i = 1, ..., 1000. The results of the study are reported in Table 2. From Table 2, it is observed that as sample size n increases,
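A scaled-down version of this simulation procedure can be sketched as follows. Everything specific in it is an assumption made for illustration: samples are drawn by numerically inverting the cdf with bisection (the text does not say how samples were generated), the MLE is found by golden-section search on [0.01, 20], and only 200 repetitions with n = 50 are run to keep the example fast. Function names and the seed are hypothetical.

```python
import math
import random

def sml_cdf(x, theta):
    e = math.exp(-theta * x)
    return math.cos(0.5 * math.pi * (1.0 + theta * x / (1.0 + theta) * e) * e)

def sml_log_pdf(x, theta):
    e = math.exp(-theta * x)
    u = (1.0 + theta * x / (1.0 + theta) * e) * e
    base = (1.0 + theta) * e + (2.0 * theta * x - 1.0) * e * e
    return (math.log(0.5 * math.pi * theta / (1.0 + theta))
            + math.log(base) + math.log(math.sin(0.5 * math.pi * u)))

def sml_sample(theta, rng):
    """Inverse-transform sampling: solve F(x) = p by bisection."""
    p = rng.random()
    lo, hi = 0.0, 1.0
    while sml_cdf(hi, theta) < p:        # expand the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sml_cdf(mid, theta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for a maximum of f on [lo, hi] (f assumed unimodal)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def simulate(theta, n, reps, seed=1):
    """Return (MMLE, bias, MSE) of the MLE over `reps` simulated samples of size n."""
    rng = random.Random(seed)
    est = []
    for _ in range(reps):
        xs = [sml_sample(theta, rng) for _ in range(n)]
        est.append(golden_max(lambda t: sum(sml_log_pdf(x, t) for x in xs), 0.01, 20.0))
    mmle = sum(est) / reps
    bias = mmle - theta                  # same as the mean of (estimate - theta)
    mse = sum((t - theta) ** 2 for t in est) / reps
    return mmle, bias, mse

mmle, bias, mse = simulate(theta=1.5, n=50, reps=200)
print("MMLE:", mmle, "bias:", bias, "MSE:", mse)
```

Calling `simulate` over the grid of sample sizes and θ values stated above, with 1000 repetitions, would mirror the design of the study, at a correspondingly higher running time.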

Applications of the S-ML Model
We fit the S-ML model to two data sets using the maximum likelihood method introduced previously. The data differ in size, traits, and background, but both are of current interest in their areas.

Method
We proceed as follows for each data set:
1. The data are presented briefly, along with their reference;
2. A table that summarizes the basic statistical measures of the data is provided;
3. The goodness-of-fit measures of the models under consideration are evaluated and arranged in order of model performance in a table;
4. The MLE(s) of the model parameters is(are) shown, as well as the relevant SEs, as supplementary work;
5. The analysis concludes with a visual check: the histogram of the data is plotted together with the epdf, and the empirical cdf is plotted together with the ecdf, for the S-ML model exclusively, in a separate graph.
The adequacy measures that are used for model fitting are provided here. Suppose $x_1, x_2, \ldots, x_n$ represent the data and $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ are their ordered values. As an initial step, we consider the Cramér-von Mises, Anderson-Darling, and Kolmogorov-Smirnov statistics defined by
$$W^* = \frac{1}{12n} + \sum_{i=1}^{n} \left[F_{S-ML}(x_{(i)};\hat{\theta}) - \frac{2i-1}{2n}\right]^2,$$
$$A^* = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1)\left\{\ln F_{S-ML}(x_{(i)};\hat{\theta}) + \ln\left[1 - F_{S-ML}(x_{(n+1-i)};\hat{\theta})\right]\right\},$$
and
$$D_n = \max_{i=1,\ldots,n} \left[F_{S-ML}(x_{(i)};\hat{\theta}) - \frac{i-1}{n},\ \frac{i}{n} - F_{S-ML}(x_{(i)};\hat{\theta})\right],$$
respectively. The p-value of the Kolmogorov-Smirnov test linked to $D_n$ is also examined. Of course, the above definitions can be adapted to any model other than the S-ML model. These measures of adequacy are extensively employed to determine which model best fits the data set under study. The model having the smallest value of $W^*$ or $A^*$, and the highest p-value, is considered to give the best fit to the data. Furthermore, we consider the following goodness-of-fit measures: the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), given as follows:
$$AIC = -2LL + 2k, \quad BIC = -2LL + k \ln(n),$$
respectively, where LL is the value of the log-likelihood function taken at $\hat{\theta}$ and k is the number of parameters of the model; here, k = 1 for the S-ML model. As is widely understood, the model with the lowest value of AIC or BIC is selected as the best of the compared models for the data set. For more information on the usage and the underlying meaning of the measures $W^*$, $A^*$, $D_n$, AIC and BIC, we refer to [18].
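The adequacy measures can be computed directly from the ordered data and the fitted cdf. The sketch below assumes the standard textbook forms of the Cramér-von Mises, Anderson-Darling, and Kolmogorov-Smirnov statistics, which may differ from the exact variants used to build the tables, and it omits the Kolmogorov-Smirnov p-value. The MLE is obtained by a golden-section search, which is an illustrative choice; all function names are hypothetical, and the data are the precipitation values from the applications section.

```python
import math

DATA = [0.77, 1.74, 0.81, 1.2, 1.95, 1.2, 0.47, 1.43, 3.37, 2.2,
        3, 3.09, 1.51, 2.1, 0.52, 1.62, 1.31, 0.32, 0.59, 0.81,
        2.81, 1.87, 1.18, 1.35, 4.75, 2.48, 0.96, 1.89, 0.9, 2.05]

def _u(x, theta):
    e = math.exp(-theta * x)
    return (1.0 + theta * x / (1.0 + theta) * e) * e

def sml_cdf(x, theta):
    return math.cos(0.5 * math.pi * _u(x, theta))

def sml_sf(x, theta):
    return 2.0 * math.sin(0.25 * math.pi * _u(x, theta)) ** 2

def sml_log_pdf(x, theta):
    e = math.exp(-theta * x)
    base = (1.0 + theta) * e + (2.0 * theta * x - 1.0) * e * e
    return (math.log(0.5 * math.pi * theta / (1.0 + theta))
            + math.log(base) + math.log(math.sin(0.5 * math.pi * _u(x, theta))))

def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search for a maximum of f on [lo, hi] (f assumed unimodal)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def gof(data, theta, k=1):
    """Return a dict with W*, A*, D_n, AIC and BIC for the S-ML model at theta."""
    xs = sorted(data)
    n = len(xs)
    F = [sml_cdf(x, theta) for x in xs]
    S = [sml_sf(x, theta) for x in xs]
    # 0-based index i corresponds to the (i+1)-th order statistic
    w = 1.0 / (12 * n) + sum((F[i] - (2 * i + 1) / (2.0 * n)) ** 2 for i in range(n))
    a = -n - sum((2 * i + 1) * (math.log(F[i]) + math.log(S[n - 1 - i]))
                 for i in range(n)) / n
    d = max(max((i + 1) / n - F[i], F[i] - i / n) for i in range(n))
    ll = sum(sml_log_pdf(x, theta) for x in xs)
    return {"W": w, "A": a, "D": d, "AIC": -2 * ll + 2 * k,
            "BIC": -2 * ll + k * math.log(n)}

theta_hat = golden_max(lambda t: sum(sml_log_pdf(x, t) for x in DATA), 0.01, 10.0)
stats = gof(DATA, theta_hat)
print(theta_hat, stats)
```

The same `gof` routine can be reused for a competing model by swapping in that model's cdf, sf, and log-pdf and setting `k` to its number of parameters.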
In order to assess the fit of the S-ML model, we compare it with some useful and competitive models, which include the ML, Lindley, sine exponential and sine Lindley models listed in Table 3. It is worth noting that models with three parameters are also considered. The aim is to show that our model can be efficient enough to outperform more complex models in the literature.
Table 3. Competing models compared with the S-ML model.

Precipitation Data Set
The data set consists of thirty consecutive values of March precipitation (in inches) in Minneapolis, as provided by [29] and recently used by [30]. The data are: (0.77, 1.74, 0.81, 1.2, 1.95, 1.2, 0.47, 1.43, 3.37, 2.2, 3, 3.09, 1.51, 2.1, 0.52, 1.62, 1.31, 0.32, 0.59, 0.81, 2.81, 1.87, 1.18, 1.35, 4.75, 2.48, 0.96, 1.89, 0.9, 2.05). The descriptive statistical measures of these data are presented in Table 4. Based on the information in Table 4, the data are right-skewed and leptokurtic. The MLE, SE, and goodness-of-fit measures of the S-ML model and those of the other models for the precipitation data set are given in Tables 5 and 6.
Table 5. MLEs, SEs, and goodness-of-fit measures for the precipitation data set with one-parameter models.
Table 6. MLEs, SEs, and goodness-of-fit measures for the precipitation data set with models having more than one parameter.
We can observe from Table 5 that the S-ML model has the lowest statistics with the highest p-value, implying that it delivers a better fit than the other one-parameter models studied. Comparing the models in Table 6, we can see that the lognormal model gives a better fit, while the S-ML model takes second place, but with less modeling complexity in terms of the number of parameters. Figure 3 depicts the epdf and ecdf plots of the S-ML model for the precipitation data set.

Figure 3 shows that the epdf of the S-ML model captures the overall pattern of the histogram, and it compares the ecdf of the S-ML model with the empirical cdf. The suitable behaviour of the S-ML model is further confirmed by these graphs. Apart from the lognormal model, the S-ML model clearly fits better than the Lindley, ML, S-Expo, S-Lindley, and other models.

Time between Failure Data Set
This data set refers to the time between failures for repairable items. It was obtained from [31].

From Table 7, we can observe that the failure time data set is right-skewed and leptokurtic.
The MLE, SE, and goodness-of-fit measures of the S-ML model and those of the other models for the failure time data set are given in Tables 8 and 9.
Table 9. MLEs, SEs, and goodness-of-fit measures for the failure time data set with models having more than one parameter.
Tables 8 and 9 show that, for the failure time data set, the S-ML model has the lowest statistics and the highest p-value, meaning that it provides a better fit than the other models investigated. Figure 4 depicts the epdf and ecdf plots of the S-ML model for the failure time data set. Figure 4 shows that the epdf of the S-ML model captures the overall pattern of the histogram, and it compares the ecdf of the S-ML model with the empirical cdf. The suitable behaviour of the S-ML model is further confirmed by these graphs.

Conclusions
The major contribution of this article is a flexible trigonometric extension of the well-known modified Lindley model, offering a novel and efficient statistical modelling technique. To this end, we employ the features of the sine generated family of distributions and develop the sine modified Lindley distribution. We have presented some of its most noteworthy properties, with a focus on the shapes of the corresponding probability density and hazard rate functions, as well as its moments. Simulation studies and applications demonstrate the utility of the model. In particular, we compared it to the main existing models derived from the Lindley, exponential and other distributions with one or more parameters, using two real-world data sets. The findings are satisfactory, indicating that the new distribution has a wide range of applications that could be the subject of additional research in a variety of scientific fields.
Funding: This research received no external funding.