Arctan-Based Family of Distributions: Properties, Survival Regression, Bayesian Analysis and Applications

Abstract: In this paper, a new class of continuous distributions is established by compounding the arctangent function with a generalized log-logistic class of distributions. Some structural properties of the suggested model, such as the distribution function, hazard function, quantile function, asymptotics and a useful expansion for the new class, are given in a general setting. Two special cases of this new class are considered by employing the Weibull and normal distributions as the parent distribution. Further, we derive a survival regression model based on the sub-model with Weibull parent distribution and then estimate the parameters of the proposed regression model using Bayesian and frequentist approaches. For the Bayesian discussion, we consider seven loss functions, namely the squared error, modified squared error, weighted squared error, K-loss, linear exponential, general entropy, and precautionary loss functions. Bayesian numerical results, including the Bayes estimators, associated posterior risks, and credible and highest posterior density intervals, are provided. In order to explore the consistency property of the maximum likelihood estimators, a simulation study is presented via a Monte Carlo procedure. The parameters of the two sub-models are estimated by maximum likelihood, and the usefulness of these sub-models and the proposed survival regression model is examined by means of three real datasets.


Introduction
Distribution theory provides useful tools for describing and modelling observed events and for predicting future ones. Recently, several generators of probability distributions have been introduced in the statistical literature. Some well-known generators are the Marshall-Olkin generated (MO-G) by [1], beta-G by [2], Kumaraswamy-G (Kw-G) by [3], Weibull-G by [4], exponentiated half-logistic-G by [5], Lomax-G by [6], and the polar-generalized normal distribution by [7], among others.
A popular technique for extending statistical distributions is the method of [8], who introduced the generalized log-logistic (GLL-G) class of distributions. The cumulative distribution function (cdf) of this class, based on an underlying cdf G, is given by
Π(x; β) = G(x)^β / [G(x)^β + Ḡ(x)^β], (1)
where β > 0 and Ḡ(x) = 1 − G(x) denotes the survival function. This class is also known as the odd log-logistic (OLL-G) class, and several extensions of it have been introduced, including the Kumaraswamy odd log-logistic due to [9], beta odd log-logistic due to [10], odd Burr generalized class due to [11], Topp-Leone odd log-logistic due to [12], generalized odd log-logistic due to [13], new odd log-logistic due to [14] and odd log-logistic logarithmic due to [15].
The purposes of this work are twofold. We first introduce a general and versatile class of distributions by compounding the arctan function and the cdf defined in (1). This model is referred to as the arctan odd log-logistic-G (ATOLL-G) distribution. The second purpose of this work lies in the study of two sub-models of the general ATOLL-G model via classical and Bayesian approaches. Further, we study the corresponding regression model derived from the sub-model based on the Weibull distribution. First, certain statistical and reliability properties of the ATOLL-G distribution are derived in a general setting. Then, we establish two special cases of ATOLL-G by using the Weibull and normal distributions as the parent distribution G. These models are called the ATOLL-W and ATOLL-N distributions, respectively. We also provide a discussion of the ATOLL-W regression model via the log-transformation of the ATOLL-W (LATOLL-W) distribution. Furthermore, we obtain Bayesian and maximum likelihood estimates of the parameters of the proposed models via real examples.
For Bayesian inference, we consider several asymmetric and symmetric loss functions, namely the squared error, modified squared error, precautionary, weighted squared error, linear exponential, general entropy, and K-loss functions, to estimate the parameters of the LATOLL-W regression model. Further, making use of independent prior distributions, Bayesian 95% credible and highest posterior density (HPD) intervals (see [16]) are provided for each parameter of the proposed model. In addition, a simulation study is performed to investigate the consistency of the maximum likelihood estimators (MLEs).
The rest of the manuscript is organized as follows. In Section 2, we introduce a new class of distributions called the arctan odd log-logistic-G (ATOLL-G) distribution. Some structural properties of the ATOLL-G distribution, such as the hazard function, quantiles, asymptotics and some useful expansions of the proposed model, are given in a general setting in Section 3. In Section 4, two special cases of this class are considered by employing the Weibull and normal distributions as the parent distribution. The ATOLLW regression model and its Bayesian inference are presented by considering seven well-known loss functions in Section 5. In Section 6, we study the performance of the maximum likelihood estimates of the parameters of the ATOLLW distribution via Monte Carlo simulation to investigate the mean square error and bias of the maximum likelihood estimators. In Section 7, the superiority of the ATOLLN and ATOLLW models over some competing models is demonstrated via several model selection criteria by analyzing two real datasets (Data 1 and Data 2), respectively. Further, we fit the LATOLLW regression model to the heart transplant dataset and compare its efficiency with some competitor models. We also provide the numerical results of Bayesian inference and related plots of posterior samples for the heart transplant data in this section. Finally, the paper is concluded in Section 8.

Model Genesis
In this section, we first introduce a unit-interval distribution based on the arctan function. Then we propose the arctan odd log-logistic G class of distributions.

A New Extension of Uniform Distribution in Terms of Arctan Function
We create a new unit-interval distribution, based on the definition of the arctan function, with closed-form cdf given by: The related probability density function (pdf) is obtained by: To study the effect of α on the pdf in (3), we plot this pdf for some selected values of the parameter α in Figure 1. It is worthwhile to note that when α → 0+, the pdf in (3) reduces to the standard uniform distribution.
The first four moments of the pdf in (3) are given by:
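The unit-interval model can be explored numerically. The following Python sketch assumes that the cdf in (2) has the form F(x; α) = arctan(αx)/arctan(α) on (0, 1). This form is consistent with the uniform limit as α → 0+ stated above, but it is an assumption, since the display equations are not reproduced here; all function names are illustrative.

```python
import math

def arctan_unit_cdf(x, alpha):
    """Assumed cdf on (0, 1): arctan(alpha * x) / arctan(alpha)."""
    return math.atan(alpha * x) / math.atan(alpha)

def arctan_unit_pdf(x, alpha):
    """Derivative of the assumed cdf with respect to x."""
    return alpha / (math.atan(alpha) * (1.0 + (alpha * x) ** 2))

def arctan_unit_quantile(u, alpha):
    """Inverse of the assumed cdf, for u in [0, 1]."""
    return math.tan(u * math.atan(alpha)) / alpha

def arctan_unit_moment(k, alpha, n_grid=20000):
    """k-th raw moment by the midpoint rule on (0, 1)."""
    h = 1.0 / n_grid
    total = 0.0
    for i in range(n_grid):
        x = (i + 0.5) * h
        total += x ** k * arctan_unit_pdf(x, alpha)
    return total * h

if __name__ == "__main__":
    # First four raw moments for one illustrative value of alpha.
    for k in range(1, 5):
        print(k, arctan_unit_moment(k, alpha=2.0))
```

Under the assumed form, the first moment also has the closed form ln(1 + α²)/(2α arctan(α)), which can serve as a check on the numerical integration.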

Arctan Odd Log Logistic G Family of Distributions
Here, we propose a general class of distributions by compounding the arctan function and the cdf in (1). The cdf of the arctan odd log-logistic G class of continuous distributions is given as: where Π(x; β) is defined as in (1). This model is called the arctan odd log-logistic G (ATOLL-G) distribution. The corresponding pdf is given by: where π(x) is defined by Note that when α → 0+, the pdf in (5) reduces to the OLL-G family. Further, when α → 0+ and β = 1, it reduces to the baseline distribution G. We can readily obtain the associated hazard rate function of (4) as:
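To make the construction concrete, the sketch below evaluates the cdf, pdf and hazard rate of an ATOLL-G model for an arbitrary baseline. It assumes that the cdf in (4) is F(x) = arctan(α Π(x; β))/arctan(α), with Π(x; β) = G(x)^β/[G(x)^β + Ḡ(x)^β]; this agrees with the limiting cases stated above, but it is an assumption rather than a transcription of (4) and (5). The exponential baseline and all function names are purely illustrative.

```python
import math

def oll(G, beta):
    """Odd log-logistic transform: G^beta / (G^beta + (1 - G)^beta)."""
    a = G ** beta
    b = (1.0 - G) ** beta
    return a / (a + b)

def atoll_cdf(x, alpha, beta, base_cdf):
    """Assumed ATOLL-G cdf: arctan(alpha * Pi) / arctan(alpha)."""
    return math.atan(alpha * oll(base_cdf(x), beta)) / math.atan(alpha)

def atoll_pdf(x, alpha, beta, base_cdf, base_pdf):
    """Derivative of the assumed cdf (valid where 0 < G(x) < 1)."""
    G = base_cdf(x)
    g = base_pdf(x)
    denom = G ** beta + (1.0 - G) ** beta
    pi_val = G ** beta / denom
    # d Pi / dx = beta * g * [G (1 - G)]^(beta - 1) / denom^2
    d_pi = beta * g * (G * (1.0 - G)) ** (beta - 1.0) / denom ** 2
    return alpha * d_pi / (math.atan(alpha) * (1.0 + (alpha * pi_val) ** 2))

def atoll_hazard(x, alpha, beta, base_cdf, base_pdf):
    """Hazard rate: pdf / survival function."""
    sf = 1.0 - atoll_cdf(x, alpha, beta, base_cdf)
    return atoll_pdf(x, alpha, beta, base_cdf, base_pdf) / sf

# Illustrative baseline (not from the paper): unit exponential.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_pdf(x):
    return math.exp(-x)

if __name__ == "__main__":
    for x in (0.25, 0.5, 1.0, 2.0):
        print(x, atoll_cdf(x, 2.0, 1.5, exp_cdf),
              atoll_hazard(x, 2.0, 1.5, exp_cdf, exp_pdf))
```

With a very small α and β = 1, the assumed cdf is numerically indistinguishable from the baseline cdf G, in line with the limiting behaviour described above.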

Properties
In this section we study some basic properties of the ATOLL-G family.

Series Expansions of the Probability Density and Cumulative Distribution Functions
For a given cdf G(x), a variable Z has the exp-G distribution with power parameter η > 0, say Z ∼ exp-G(η), if the related pdf and cdf are given by: respectively. For pertinent details, one can see [17-19].
We can obtain some mathematical properties of the ATOLL-G family based on exp-G densities; for example, we can obtain moments, incomplete moments, the moment generating function and linear combinations for order statistics.
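For reference, the exponentiated-G (exp-G) functions are commonly written as below; the symbols h_η and H_η are notation assumed here. If the ATOLL-G density admits a linear representation in exp-G densities with some weights w_k (hypothetical symbols, since the expansion coefficients are not reproduced in this text), moments follow term by term, as in the last line.

```latex
\begin{align*}
  h_{\eta}(x) &= \eta\, g(x)\, G(x)^{\eta-1}, &
  H_{\eta}(x) &= G(x)^{\eta}, \qquad \eta > 0, \\
  f(x) &= \sum_{k} w_k\, h_{\eta_k}(x)
  \;\Longrightarrow\;
  \mathbb{E}\!\left[X^{r}\right] = \sum_{k} w_k\, \mathbb{E}\!\left[Z_k^{r}\right], &
  Z_k &\sim \text{exp-G}(\eta_k).
\end{align*}
```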

Two Sub-Models
In this section, we propose two special cases of the ATOLL-G distribution, which are used in the sequel.

Arctan Odd Log-Logistic Weibull Distribution
Suppose that the parent distribution G is the Weibull distribution with cdf G(x) = 1 − exp{−(λx)^γ}; then, from (5), the pdf of the arctan odd log-logistic Weibull (ATOLLW) distribution is defined by: From (4), the corresponding cdf is given by: The density of the ATOLLW distribution for some selected values of the associated parameters is plotted in Figure 2.
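Because the construction is a chain of invertible maps, ATOLLW variates can be generated by inversion. The sketch below assumes the ATOLL-G cdf has the form arctan(α Π(x; β))/arctan(α) with the odd log-logistic transform Π and the Weibull baseline given above; the function names and parameter values are illustrative and not taken from the paper.

```python
import math
import random

def atollw_cdf(x, alpha, beta, lam, gamma):
    """Assumed ATOLLW cdf with Weibull baseline G(x) = 1 - exp(-(lam*x)^gamma)."""
    G = -math.expm1(-((lam * x) ** gamma))
    pi_val = G ** beta / (G ** beta + (1.0 - G) ** beta)
    return math.atan(alpha * pi_val) / math.atan(alpha)

def atollw_quantile(u, alpha, beta, lam, gamma):
    """Invert the assumed cdf step by step, for 0 <= u < 1."""
    p = math.tan(u * math.atan(alpha)) / alpha      # value of Pi(x; beta)
    t = (p / (1.0 - p)) ** (1.0 / beta)             # odds G / (1 - G)
    # Weibull inverse: -log(1 - G) = log(1 + t)
    return math.log1p(t) ** (1.0 / gamma) / lam

def atollw_sample(n, alpha, beta, lam, gamma, seed=None):
    """Draw n variates by the inversion method."""
    rng = random.Random(seed)
    return [atollw_quantile(rng.random(), alpha, beta, lam, gamma)
            for _ in range(n)]

if __name__ == "__main__":
    draws = atollw_sample(5, alpha=3.0, beta=0.5, lam=1.0, gamma=2.0, seed=1)
    print(draws)
```

The same quantile function can be used to obtain medians or other percentiles of a fitted model.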

Arctan Odd Log-Logistic Normal Distribution
Let the parent distribution G be the normal distribution with cdf Φ(x; µ, σ²). From (5), the pdf (14) of the arctan odd log-logistic normal (ATOLLN) distribution is defined by: where φ(x; µ, σ²) is the pdf of a normal distribution with mean µ and variance σ². From (4), the cdf corresponding to (14) is given by: We plot the density of the ATOLLN distribution for some selected values of the associated parameters in Figure 3.

The ATOLLW Regression Model
The survival regression model is one of the best-known models in survival analysis. When analyzing a lifetime variable, there is sometimes auxiliary information (in the form of independent variables) that helps us to explore the lifetime variable more precisely. More recently, by considering the class of location statistical distributions, different regression models have been introduced in the applied statistical literature (for example, see [13,21]). The log-odd log-logistic Weibull regression model for censored data was introduced by [22] in terms of the odd log-logistic Weibull distribution. Further, Cordeiro et al. [23] introduced a general regression model based on the Burr XII system of densities, and the log-odd power Cauchy-Weibull regression was proposed by [24].
Let X be a random variable with the ATOLL-W pdf defined in (12). Making use of the log transformation Y = ln(X), the pdf of the transformed variable Y is given by: where σ > 0 is a scale parameter, β > 0 is a shape parameter and µ ∈ R is a location parameter. The model in (16) is referred to as the log-ATOLL-W (LATOLL-W) distribution, and it is denoted by Y ∼ LATOLL-W(β, σ, µ). The survival function of Y is: Let Z = (Y − µ)/σ be the standardized random variable having pdf, The ATOLL-W regression is defined by: where τ = (τ_1, …, τ_p) is the parameter vector of the regression model, v_i is the covariate vector and z_i is the error term of the regression model with density h(z; α, σ). Further, we assume y_i = min{log(c_i), log(x_i)}, where log(c_i) denotes the log-censoring time and log(x_i), which follows (16), represents the log-lifetime. Let r be the number of uncensored observations; then the log-likelihood function for ψ = (β, σ, τ) in terms of the sets F (individuals with observed log-lifetime) and C (individuals with log-censoring) is given by: where For example, we can use the optim function of the R software to obtain the MLE of ψ by maximizing (19).
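The structure of the censored log-likelihood, a sum of log-densities over F plus a sum of log-survival terms over C, can be sketched in code. The version below rests on several assumptions that are not confirmed by the text: the ATOLL-G cdf is taken as arctan(α Π)/arctan(α), the Weibull baseline is reparameterized as γ = 1/σ and λ = exp(−µ) so that G becomes 1 − exp(−exp((y − µ)/σ)) on the log scale, and α is carried as an explicit parameter. Function names and the parameter ordering are illustrative.

```python
import math

def _pi_and_derivative(y, mu, sigma, beta):
    """Odd log-logistic transform of the log-Weibull baseline and its y-derivative."""
    z = (y - mu) / sigma
    ez = math.exp(z)
    G = -math.expm1(-ez)                 # 1 - exp(-e^z)
    Gbar = math.exp(-ez)
    g = math.exp(z - ez) / sigma         # baseline density of Y
    denom = G ** beta + Gbar ** beta
    pi_val = G ** beta / denom
    d_pi = beta * g * (G * Gbar) ** (beta - 1.0) / denom ** 2
    return pi_val, d_pi

def latollw_pdf(y, mu, sigma, alpha, beta):
    """Assumed LATOLL-W density."""
    pi_val, d_pi = _pi_and_derivative(y, mu, sigma, beta)
    return alpha * d_pi / (math.atan(alpha) * (1.0 + (alpha * pi_val) ** 2))

def latollw_sf(y, mu, sigma, alpha, beta):
    """Assumed LATOLL-W survival function."""
    pi_val, _ = _pi_and_derivative(y, mu, sigma, beta)
    return 1.0 - math.atan(alpha * pi_val) / math.atan(alpha)

def censored_loglik(params, y, delta, V):
    """Log-likelihood; params = (alpha, beta, sigma, tau_0, ..., tau_p).

    delta[i] = 1 for an observed log-lifetime (set F), 0 for log-censoring (set C);
    V[i] is the covariate row, including a leading 1 if an intercept is wanted.
    """
    alpha, beta, sigma = params[0], params[1], params[2]
    tau = params[3:]
    total = 0.0
    for yi, di, vi in zip(y, delta, V):
        mu_i = sum(t * v for t, v in zip(tau, vi))
        if di:
            total += math.log(latollw_pdf(yi, mu_i, sigma, alpha, beta))
        else:
            total += math.log(latollw_sf(yi, mu_i, sigma, alpha, beta))
    return total
```

In practice the negative of this function would be passed to a general-purpose optimizer, in the same way that optim is used in R, with α, β and σ constrained to be positive (for instance by optimizing over their logarithms).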

Residual
The martingale and modified deviance residuals (mdr) for the LATOLL-W regression are given, respectively, by: When the regression model is well-fitted to a given dataset, the mdr are normally distributed with zero mean and unit variance.
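As a generic illustration, the sketch below computes the two residuals from fitted survival probabilities using the forms commonly found in the survival literature: the martingale residual δ_i + log Ŝ(y_i) and the deviance residual sign(r_M) {−2[r_M + δ_i log(δ_i − r_M)]}^(1/2). Whether these coincide exactly with the expressions intended above is an assumption; the function names are illustrative.

```python
import math

def martingale_residual(surv, delta):
    """r_M = delta + log S(y); delta = 1 if uncensored, 0 if censored."""
    return delta + math.log(surv)

def modified_deviance_residual(surv, delta):
    """Deviance-type transformation of the martingale residual.

    Requires 0 < surv <= 1, and surv < 1 when delta = 1.
    """
    r_m = martingale_residual(surv, delta)
    log_term = math.log(delta - r_m) if delta else 0.0
    inner = -2.0 * (r_m + delta * log_term)
    # Guard against tiny negative values caused by rounding.
    return math.copysign(math.sqrt(max(inner, 0.0)), r_m)

if __name__ == "__main__":
    # Fitted survival probabilities and censoring indicators (made-up values).
    for s, d in [(0.9, 1), (0.5, 1), (0.5, 0), (0.2, 0)]:
        print(s, d, martingale_residual(s, d), modified_deviance_residual(s, d))
```

The fitted survival probabilities would come from the estimated regression model evaluated at each observed y_i.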

Bayesian Inference of Regression Model
In this section, we consider Bayesian inference for the parameters of the survival regression model discussed in Section 5. Let the parameters α, β and σ of the LATOLLW distribution have independent prior distributions as: where a, b, c, d, e and f are positive and τ_i ∈ R, i = 0, 1, 2, 3. Under these assumptions, the joint prior density function is formulated as follows: where τ = (τ_0, τ_1, τ_2, τ_3).
Here, we consider several asymmetric and symmetric loss functions, including the squared error loss function (SELF), modified squared error loss function (MSELF), weighted squared error loss function (WSELF), K-loss function (KLF), linear exponential loss function (LINEXLF), precautionary loss function (PLF) and general entropy loss function (GELF). For more details, see [25] and the references therein. In Table 1, we provide a summary of these loss functions and the associated Bayesian estimators and posterior risks.
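Given posterior draws of a positive parameter, the Bayes estimators under these losses reduce to simple posterior expectations. The sketch below uses forms of the loss functions that are common in the literature (for example, (θ − d)²/θ for WSELF, (1 − d/θ)² for MSELF, (θ − d)²/d for PLF); since Table 1 is not reproduced in this text, the agreement of these forms with the table is an assumption. The shape constant c of the LINEX and general entropy losses and the function names are illustrative.

```python
import math

def _mean(values):
    values = list(values)
    return sum(values) / len(values)

def bayes_estimates(draws, c=1.0):
    """Monte Carlo Bayes estimators of a positive parameter from posterior draws."""
    draws = list(draws)
    if any(t <= 0.0 for t in draws):
        raise ValueError("these estimators assume a strictly positive parameter")
    m1 = _mean(draws)                          # E[theta]
    m2 = _mean(t * t for t in draws)           # E[theta^2]
    im1 = _mean(1.0 / t for t in draws)        # E[theta^-1]
    im2 = _mean(1.0 / (t * t) for t in draws)  # E[theta^-2]
    return {
        "SELF": m1,
        "WSELF": 1.0 / im1,
        "MSELF": im1 / im2,
        "PLF": math.sqrt(m2),
        "KLF": math.sqrt(m1 / im1),
        "LINEXLF": -math.log(_mean(math.exp(-c * t) for t in draws)) / c,
        "GELF": _mean(t ** (-c) for t in draws) ** (-1.0 / c),
    }

if __name__ == "__main__":
    print(bayes_estimates([1.0, 2.0, 4.0], c=1.0))
```

For real-valued regression coefficients only the estimators that do not require positivity (such as SELF and LINEXLF) are directly meaningful.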

For more details, see [26]. Let ϕ be a function defined as: Since the joint posterior distribution π(α, β, σ, τ) is formulated as: Therefore, the joint posterior density is given by: where V = (v_1, …, v_n) is a known matrix that contains the auxiliary variables, µ_i = v_i τ and K is given as: where the outer integration is over the parameter vector τ.

Simulation
Here, we examine the performance of the maximum likelihood estimates associated with the ATOLLN(α, β, µ, σ) distribution in (14) with respect to the sample size n. The simulation study is performed via the Monte Carlo procedure as follows: 1.
We repeated these steps for the sample sizes n = 100, 110, 120, …, 500 and one set of selected values of the parameter vector, (α, β, µ, σ) = (3, 0.5, 0, 1). Figures 4-7 show how the average bias (AB), mean squared error (MSE), coverage probability (CP) and average width (AW) vary with respect to n. These results show that the average biases, mean squared errors and average widths for each parameter decrease to zero as n → ∞. The CP also varies with n: for the two parameters β and σ, it is close to the nominal coverage probability of 0.95, while for the two parameters α and µ, it increases towards 0.95 as n increases.
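The four summary measures can be computed with a short Monte Carlo loop. Since the individual simulation steps are not reproduced in this text and the ATOLLN likelihood has no closed-form maximizer, the sketch below uses a deliberately simple stand-in model, an exponential distribution with rate θ, whose MLE and Wald interval are available in closed form. Only the bookkeeping for AB, MSE, CP and AW is meant to carry over; the function name is illustrative.

```python
import math
import random

def simulate_mle_metrics(theta, n, reps=500, seed=0):
    """AB, MSE, CP and AW of the rate MLE in an exponential stand-in model."""
    rng = random.Random(seed)
    z = 1.959963984540054            # 97.5% standard normal quantile
    bias = mse = cover = width = 0.0
    for _ in range(reps):
        sample = [rng.expovariate(theta) for _ in range(n)]
        est = n / sum(sample)        # MLE of the rate
        se = est / math.sqrt(n)      # plug-in standard error from Fisher information
        lower, upper = est - z * se, est + z * se
        bias += est - theta
        mse += (est - theta) ** 2
        cover += 1.0 if lower <= theta <= upper else 0.0
        width += upper - lower
    return {"AB": bias / reps, "MSE": mse / reps,
            "CP": cover / reps, "AW": width / reps}

if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, simulate_mle_metrics(theta=2.0, n=n))
```

For the models in this paper, the closed-form estimator would be replaced by a numerical maximization of the log-likelihood and the standard errors by those obtained from the observed information matrix.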

Applications
In this part, we present three applications to investigate the efficiency and flexibility of the sub-models defined in Sections 4 and 5. In the first two applications, we present some numerical and graphical results for fitting the special sub-models defined in Section 4. The third application is associated with a survival regression analysis based on the ATOLL-W regression model presented in Section 5.
For the first two applications, goodness-of-fit statistics including the Cramér-von Mises (W*) and Anderson-Darling (A*) test statistics are adopted to compare the fitted models (see [27-29] for more details). Smaller values of A* and W* indicate a better fit to the data. For the sake of comparison, we also consider the Kolmogorov-Smirnov (K-S) statistic with its corresponding p-value and the minus log-likelihood function (−ℓ(ψ)) [28,29]. For the third application (censored data with covariates), we adopt the AIC and BIC statistics to compare the fitted models, since the A* and W* statistics are not suitable for censored data.
For the first application, we take the ATOLLN distribution and, for comparison purposes, we fit the following models to the above datasets:
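Some of these criteria are straightforward to compute once a model has been fitted. The sketch below implements the usual definitions AIC = 2k − 2ℓ and BIC = k ln(n) − 2ℓ, together with the K-S distance between the empirical cdf and a fitted cdf. The function names are illustrative, and the corrected statistics W* and A* are not covered.

```python
import math

def aic(loglik, k):
    """Akaike information criterion for maximized log-likelihood and k parameters."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion for sample size n."""
    return k * math.log(n) - 2.0 * loglik

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        u = cdf(x)
        d = max(d, i / n - u, u - (i - 1) / n)
    return d

if __name__ == "__main__":
    # Toy check against the standard uniform cdf.
    print(ks_statistic([0.7, 0.1, 0.4], lambda x: x))
    print(aic(-100.0, 4), bic(-100.0, 4, 84))
```

Lower values of each criterion point to the preferred model among those fitted to the same data.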

Failure Times Data
Data 1: First, we analyze the 84 failure times of a particular windshield device.These data were also studied by [32,33].
The MLEs of the parameters, their standard errors (SE, in parentheses) and the goodness-of-fit statistics for the failure times data are reported in Table 2. One can see that the ATOLLN model outperforms all the fitted competing models under these statistics. The fitted densities and the histogram of the data are displayed in Figure 8. For the failure times, we note that the fitted ATOLLN distribution best captures the empirical histogram.
Here, we fit the ATOLLW distribution, together with some of its sub-models and other competing models, including the odd log-logistic Weibull (OLL-W), beta Weibull (BW), Kumaraswamy-Weibull (KwW), gamma Weibull (GaW) and exponentiated Weibull (EW) distributions, to the windshield device data. Numerical results analogous to those for the failure times data are provided in Table 3 for the windshield device data. It is immediately seen that the ATOLLW model outperforms all the fitted competing models under the model selection criteria used in the first application. The fitted densities and the histogram of the windshield device data are displayed in Figure 9. This figure shows that the fitted ATOLLW distribution best captures the empirical histogram among the considered competitor models. We note that the ATOLLN and ATOLLW models outperform all the fitted competing models under the selected criteria for the failure times and windshield device datasets, respectively.

Third Application: Regression Analysis
Survival regression analysis has been developed in several forms. One of them is the non-parametric approach, in which Kaplan-Meier estimation [34] is prominent. The Kaplan-Meier estimate is a common way of obtaining the survival curve using the probabilities of an event's occurrence at each time. In this section, we provide a parametric approach as a counterpart, where we fit the LATOLLW regression to the heart transplant data. These data are available in the survival package of the R software. The considered survival regression model based on the response variable y_i and the covariates (v_i1, v_i2, v_i3) is formulated as: where y_i follows the LATOLLW distribution and the covariates are described as:

Parameter Estimation
A summary of the maximum likelihood fits for the heart transplant data is provided in Table 4. We fit the LATOLLW regression model to this dataset and compare the results with the LBXII-W, LOLLW and log-Weibull distributions. For more details about these competitor models, see [23]. We also consider other alternative models, such as the log-log mean Weibull (LLMW) regression proposed by [35] and the log exponential-Pareto (LEP) regression model proposed by [36]. The estimated parameters, standard errors (given in parentheses), corresponding p-values (in square brackets) and the AIC and BIC measures are reported in Table 4. We conclude that the estimated regression parameters are statistically significant at the 5% level. The suitability of the fitted LATOLLW regression model is evaluated by residual analysis. The plot of the modified deviance residuals is displayed in Figure 10, which reveals that the fitted LATOLLW regression provides a good fit to the current data.

Bayesian Regression Analysis: Heart Transplant Data
From (24), we can see that there is no explicit form for the Bayesian estimators under the loss functions considered in Table 1, so we use the Gibbs sampling technique and an MCMC procedure based on 10,000 replicates to obtain Bayesian estimators for the heart transplant data. A summary of the Bayesian analyses (point and interval estimates with the related posterior risks) is reported in Tables 5 and 6. Table 6 provides 95% credible and HPD intervals for each parameter of the LATOLLW distribution. Moreover, we provide the posterior summary plots in Figures 11-13. These plots confirm that the sampling process has converged.
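Both kinds of interval can be read off directly from the MCMC output. The sketch below computes an equal-tailed credible interval from empirical quantiles and an HPD interval as the shortest window containing the required share of sorted draws, a common sample-based approximation for unimodal posteriors. It is not claimed to be the exact procedure used for Table 6, and the function names are illustrative.

```python
import math
import random

def equal_tailed_interval(draws, level=0.95):
    """Credible interval from the empirical (1-level)/2 and (1+level)/2 quantiles."""
    s = sorted(draws)
    n = len(s)
    lo_idx = int(math.floor((1.0 - level) / 2.0 * (n - 1)))
    hi_idx = int(math.ceil((1.0 + level) / 2.0 * (n - 1)))
    return s[lo_idx], s[hi_idx]

def hpd_interval(draws, level=0.95):
    """Shortest interval containing ceil(level * n) of the sorted draws."""
    s = sorted(draws)
    n = len(s)
    m = max(1, int(math.ceil(level * n)))          # number of draws inside the window
    start = min(range(n - m + 1), key=lambda i: s[i + m - 1] - s[i])
    return s[start], s[start + m - 1]

if __name__ == "__main__":
    # Skewed toy "posterior" sample: the HPD interval shifts towards the mode.
    rng = random.Random(1)
    draws = [rng.expovariate(1.0) for _ in range(5000)]
    print(equal_tailed_interval(draws))
    print(hpd_interval(draws))
```

For a symmetric posterior the two intervals nearly coincide, whereas for a skewed posterior the HPD interval is shorter.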

Conclusions
A new class of lifetime distributions was introduced by compounding the odd log-logistic distribution and the arctan function. Two special sub-models of this class were proposed by taking the Weibull and normal distributions as the baseline distribution.
We have also provided a survival regression model based on the Weibull distribution, together with a comprehensive discussion of Bayesian inference for the parameters of this survival regression model under various loss functions. Numerical analyses of fitting two univariate real datasets were provided via a maximum likelihood approach, and the corresponding plots were drawn to evaluate these results visually. The data analysis showed empirically that the proposed distributions provide a better fit than their competing distributions. Finally, the performance of the survival regression sub-model was examined in terms of maximum likelihood and Bayesian procedures for a real dataset with covariates.

Figure 4. AB of the MLE Θ̂ of the parameter vector Θ.

Figure 5. RMSE of the MLE Θ̂ of the parameter vector Θ.

Figure 6. CP of the 95% confidence intervals of the MLE Θ̂ of the parameter vector Θ.

Figure 7. AW of the 95% confidence intervals of the parameter vector Θ.

Figure 8. Histogram and density plots for the failure times data. Plots for (a) sub-models and (b) other models.

Figure 9. Histogram and density plots for the windshield device data. Plots for (a) sub-models and (b) other models.

Figure 10. Index plot of the quantile residuals.

Figure 11. Trace plots of the Bayesian analysis and performance of Gibbs sampling for each parameter of the LATOLLW distribution based on the heart transplant data.

Figure 13. Histogram plots of the Bayesian analysis and performance of Gibbs sampling for each parameter of the LATOLLW distribution based on the heart transplant data.

Table 1. Seven loss functions with Bayes estimator and related posterior risk.

Table 2. A summary of model fitting to the failure times data.

Table 3. A summary of model fitting to the windshield device data.

Table 4. A summary of fitted regression models to the heart transplant data.

Table 5. A summary of the Bayesian analysis of the LATOLLW regression based on heart transplant data.