A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess
Abstract
1. Introduction
2. Doubly-Censored SN Distribution
3. Doubly-Censored Log-SN and Centered SN Distributions
4. The Bernoulli/Doubly-Censored SN Mixture Distribution
5. Monte Carlo Simulation Study
Algorithm 1 Generation of random numbers from the BDCSN distribution. 
1: Fix values for ${\delta}_{0}$, ${\delta}_{1}$, $\xi $, $\eta $, and $\alpha $. 
2: Generate values for u from $U\sim \mathrm{Uniform}({\delta}_{0},1-{\delta}_{1})$. 
3: Compute values for x from
$$x={\Phi}_{\mathrm{SN}}^{-1}\left({q}_{0}+\frac{u-{\delta}_{0}}{1-{\delta}_{0}-{\delta}_{1}}({q}_{1}-{q}_{0}),\xi ,\eta ,\alpha \right).$$

4: Repeat steps 2–3 until the required number of observations (n) is obtained. 
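The steps of Algorithm 1 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation used in the paper: `scipy.stats.skewnorm` (shape `a` = $\alpha$, `loc` = $\xi$, `scale` = $\eta$) stands in for ${\Phi}_{\mathrm{SN}}$, and, because ${q}_{0}$ and ${q}_{1}$ are not defined in the algorithm itself, they are assumed here to be the SN distribution function evaluated at 0 and 1. The function name `sample_bdcsn_continuous` is hypothetical.

```python
import numpy as np
from scipy.stats import skewnorm


def sample_bdcsn_continuous(n, delta0, delta1, xi, eta, alpha, seed=None):
    """Sketch of Algorithm 1 (steps 1-4), vectorized over n draws.

    Assumption: q0 and q1 are the SN CDF evaluated at 0 and 1.
    """
    rng = np.random.default_rng(seed)
    # Assumed definitions of q0 and q1 (not given in the algorithm text).
    q0 = skewnorm.cdf(0.0, alpha, loc=xi, scale=eta)
    q1 = skewnorm.cdf(1.0, alpha, loc=xi, scale=eta)
    # Step 2: u ~ Uniform(delta0, 1 - delta1).
    u = rng.uniform(delta0, 1.0 - delta1, size=n)
    # Step 3: rescale u to a probability in [q0, q1], then invert the SN CDF.
    p = q0 + (u - delta0) / (1.0 - delta0 - delta1) * (q1 - q0)
    return skewnorm.ppf(p, alpha, loc=xi, scale=eta)


# Usage with the parameter values of the first simulation scenario.
x = sample_bdcsn_continuous(500, delta0=0.1, delta1=0.1,
                            xi=0.2, eta=0.2, alpha=0.3, seed=1)
```

Under these assumptions the rescaled probability runs from ${q}_{0}$ (at $u={\delta}_{0}$) to ${q}_{1}$ (at $u=1-{\delta}_{1}$), so the generated values fall in [0, 1]. Drawing all n values at once replaces the repetition in step 4.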
6. Real Data Application 1
7. Real Data Application 2
8. Conclusions and Future Research
 (i)
 Using skew-normal distributions, we proposed a new family of distributions that serves as an alternative to the beta distribution when an excess of zeros and/or ones is present.
 (ii)
 The parameters of the distributions were estimated by the maximum likelihood method.
 (iii)
 The expected and observed Fisher information matrices associated with the new family of distributions were derived, and a parameterization was proposed to circumvent a singularity problem in these matrices, which is inherited from the classical skew-normal distribution.
 (iv)
 The Fisher information matrix of the new mixture distribution obtained in this study turned out to be block orthogonal, which facilitates estimation because the parameters of the discrete and continuous parts of the mixture can be estimated separately.
 (v)
 An algorithm to generate random numbers from the new family of distributions derived in this study was proposed and implemented.
 (vi)
 Monte Carlo simulations based on the new family of distributions proposed in this research were carried out to assess the performance of the maximum likelihood estimators of their parameters.
 (vii)
 Two real data sets were analyzed to illustrate potential applications of the new family of distributions based on the skew-normal distribution proposed in the paper. In addition, we compared the new distributions with their natural competitors, the beta and normal distributions, showing the suitability of the new distributions.
 (i)
 Parameter estimates from censored distributions are more efficient than those obtained when censoring is ignored. Indeed, if censored cases are present and a non-censored distribution is used, the variance of the censored part cannot be estimated. If a censored distribution is used instead, this variance can be estimated from the data; for more details, see page 199 in [47]. Accordingly, the study of asymptotic efficiency bounds for the new family of distributions proposed in the present investigation is of interest; see details in [48]. In addition, the asymptotic behavior and performance of maximum likelihood estimators in more complex statistical models are studied in [49,50].
 (ii)
 The use of covariates when modeling a doubly-censored response with support in [0, 1] that follows the new family of distributions is of interest. In this case, Tobit-type models can be considered as a benchmark for comparing the new regression models. Specifically, when studying a doubly-censored response in [0, 1] through a linear predictor that includes covariates, the number of observations below ${c}_{0}$ and/or above ${c}_{2}$ can be modeled by a Bernoulli distribution with a logit link function and polychotomous response. Given the possible orthogonality in the information matrix, the parameters of this two-part model can be estimated separately. Refs. [30,36] discussed estimation methods for the regression parameters in a similar context under a mixture structure.
 (iii)
 (iv)
 (v)
 (vi)
 Robust estimation methods can be used when outliers are present in the data set [63].
 (vii)
 Applications of the new methodology derived here can be of interest in diverse areas [64].
Appendix A
Appendix A.1. Doubly-Censored SN Distribution
Appendix A.2. The Bernoulli/Doubly-Censored SN Mixture Distribution
References
$\delta_{0}$  $\delta_{1}$  True Value  Mean ($n=30$)  Variance ($n=30$)  Bias ($n=30$)  RMSE ($n=30$)  Mean ($n=50$)  Variance ($n=50$)  Bias ($n=50$)  RMSE ($n=50$)  Mean ($n=100$)  Variance ($n=100$)  Bias ($n=100$)  RMSE ($n=100$)  Mean ($n=500$)  Variance ($n=500$)  Bias ($n=500$)  RMSE ($n=500$) 
0.1  0.1  $\xi $ = 0.2  0.21148  0.00110  0.01148  0.03513  0.20632  0.00137  0.00632  0.03755  0.20642  0.00124  0.00642  0.03578  0.20976  0.00035  0.00976  0.02109 
0.1  0.1  $\eta =0.2$  0.18965  0.00057  0.01034  0.02597  0.19227  0.00054  0.00773  0.02459  0.19269  0.00053  0.00730  0.02419  0.19235  0.00018  0.00765  0.01554 
0.1  0.1  $\alpha $ = 0.3  0.31774  0.01593  0.01774  0.12746  0.35869  0.06063  0.05869  0.25314  0.35355  0.05694  0.05355  0.24457  0.31103  0.00205  0.01103  0.04661 
0.1  0.2  $\xi $ = 0.2  0.21323  0.00123  0.01323  0.03752  0.20943  0.00126  0.00943  0.03669  0.20649  0.00142  0.00649  0.03830  0.21034  0.00044  0.01033  0.02345 
0.1  0.2  $\eta =0.2$  0.18861  0.00065  0.01138  0.02796  0.19103  0.00052  0.00896  0.02458  0.19247  0.00056  0.00752  0.02489  0.19218  0.00021  0.00781  0.01654 
0.1  0.2  $\alpha $ = 0.3  0.30536  0.01117  0.00536  0.10581  0.33359  0.04294  0.03359  0.20994  0.36002  0.06839  0.06002  0.26833  0.30800  0.00443  0.00800  0.06709 
0.2  0.1  $\xi $ = 0.2  0.21216  0.00120  0.01216  0.03678  0.20810  0.00126  0.00810  0.03650  0.20556  0.00137  0.00556  0.03754  0.21035  0.00044  0.01035  0.02345 
0.2  0.1  $\eta =0.2$  0.18884  0.00066  0.01115  0.02812  0.19139  0.00053  0.00860  0.02460  0.19262  0.00055  0.00737  0.02474  0.19217  0.00021  0.00782  0.01653 
0.2  0.1  $\alpha $ = 0.3  0.31271  0.01290  0.01271  0.11430  0.34589  0.05113  0.04589  0.23075  0.36530  0.06428  0.06530  0.26181  0.30776  0.00416  0.00776  0.06501 
0.2  0.2  $\xi $ = 0.2  0.21338  0.00143  0.01338  0.04013  0.21185  0.00118  0.01185  0.03633  0.20870  0.00132  0.00870  0.03749  0.21051  0.00049  0.01051  0.02453 
0.2  0.2  $\eta =0.2$  0.18768  0.00076  0.01231  0.03026  0.18964  0.00056  0.01035  0.02601  0.19139  0.00052  0.00860  0.02448  0.19200  0.00024  0.00799  0.01732 
0.2  0.2  $\alpha $ = 0.3  0.30392  0.00752  0.00392  0.08681  0.31181  0.01922  0.01181  0.13914  0.34616  0.06196  0.04616  0.25316  0.30677  0.00369  0.00677  0.06119 
0.3  0.1  $\xi $ = 0.2  0.21337  0.00143  0.01337  0.04018  0.21184  0.00116  0.01184  0.03608  0.20867  0.00134  0.00867  0.03767  0.21049  0.00049  0.01049  0.02467 
0.3  0.1  $\eta =0.2$  0.18769  0.00076  0.01230  0.03026  0.18963  0.00056  0.01036  0.02602  0.19141  0.00052  0.00859  0.02447  0.19201  0.00024  0.00798  0.01734 
0.3  0.1  $\alpha $ = 0.3  0.30449  0.00951  0.00449  0.09766  0.31186  0.01993  0.01186  0.14167  0.34602  0.05773  0.04602  0.24464  0.30712  0.00439  0.00712  0.06670 
0.3  0.2  $\xi $ = 0.2  0.21357  0.00172  0.01357  0.04367  0.21245  0.00135  0.01245  0.03886  0.21158  0.00115  0.01158  0.03577  0.21117  0.00057  0.01117  0.02648 
0.3  0.2  $\eta =0.2$  0.18669  0.00090  0.01330  0.03296  0.18889  0.00069  0.01110  0.02851  0.18981  0.00055  0.01019  0.02555  0.19163  0.00029  0.00836  0.01902 
0.3  0.2  $\alpha $ = 0.3  0.30823  0.00996  0.00824  0.10013  0.30489  0.00961  0.00489  0.09818  0.31686  0.02287  0.01687  0.15217  0.30325  0.00537  0.00325  0.07338 
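The Variance, Bias, and RMSE columns of the simulation table above appear to be linked by the usual decomposition $\mathrm{RMSE}=\sqrt{\mathrm{Variance}+{\mathrm{Bias}}^{2}}$. The short check below, a sketch with a hypothetical helper name, recomputes the RMSE for two entries of the first scenario (${\delta}_{0}={\delta}_{1}=0.1$, $n=30$) from the tabulated variance and bias; small differences are expected because the tabulated inputs are rounded.

```python
import math


def rmse_from_variance_and_bias(variance, bias):
    """RMSE implied by the mean-squared-error decomposition."""
    return math.sqrt(variance + bias ** 2)


# Rows for delta0 = delta1 = 0.1, n = 30: (variance, bias, reported RMSE).
rows = {
    "xi": (0.00110, 0.01148, 0.03513),
    "alpha": (0.01593, 0.01774, 0.12746),
}
for name, (variance, bias, reported) in rows.items():
    recomputed = rmse_from_variance_and_bias(variance, bias)
    print(f"{name}: recomputed {recomputed:.5f}, reported {reported:.5f}")
```

The recomputed values agree with the reported ones to about four decimal places, which is consistent with this reading of the columns.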
Minimum  Median  Mean  SD  CV  CS  CK  Maximum 

0.003  0.250  0.292  0.216  73.892  0.811  −0.206  0.941 
Estimate  ZOIB  CDCSN  BDCSN 

$\widehat{\mu}$  0.2974 (0.0043)  −0.0691 (0.0217)  0.1965 (0.0039) 
$\widehat{\sigma}$  0.4562 (0.0050)  0.4448 (0.0248)  0.1818 (0.0040) 
${\widehat{\gamma}}_{1}$  –  0.6374 (0.1900)  0.9905 (0.0023) 
${\widehat{\delta}}_{0}$  0.6055 (0.0066)  –  0.6055 (0.0066) 
${\widehat{\delta}}_{1}$  0.0313 (0.0023)  –  0.0313 (0.0023) 
AIC  7464.2  7615.6  7455.7 
BIC  7498.7  7635.5  7488.8 
CAIC  7466.2  7617.6  7457.7 
$\ell \left(\widehat{\mathit{\theta}}\right)$  −3728.1  −3804.8  −3722.9 
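The AIC row above can be reproduced from the reported log-likelihoods with the standard definition $\mathrm{AIC}=2k-2\ell (\widehat{\mathit{\theta}})$. In the sketch below, the parameter counts $k$ are an assumption inferred from the number of estimates listed in each column (4 for ZOIB, 3 for CDCSN, 5 for BDCSN), and the helper name `aic` is hypothetical.

```python
def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * loglik


# (log-likelihood, assumed number of parameters, reported AIC)
models = {
    "ZOIB": (-3728.1, 4, 7464.2),
    "CDCSN": (-3804.8, 3, 7615.6),
    "BDCSN": (-3722.9, 5, 7455.7),
}
for name, (loglik, k, reported) in models.items():
    print(f"{name}: computed {aic(loglik, k):.1f}, reported {reported:.1f}")
```

With these counts the computed values match the reported AIC up to rounding of the log-likelihood, and the BDCSN model attains the smallest AIC among the three.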
Minimum  Median  Mean  SD  CV  CS  CK  Maximum 

0.007  0.269  0.305  0.18  59.034  0.918  0.969  0.927 
Estimate  ZOIB  Normal  CSN  BDCSN 

$\widehat{\mu}$  0.3080 (0.0140)  0.2069 (0.0153)  0.2110 (0.0153)  0.2952 (0.0156) 
$\widehat{\sigma}$  5.3239 (0.0004)  0.2487 (0.0125)  0.2455 (0.0129)  0.1873 (0.0121) 
$\widehat{\varsigma}$  –  –  0.4046 (0.1937)  0.5750 (0.1644) 
${\widehat{\delta}}_{0}$  0.2198 (0.0006)  –  –  0.2198 (0.0006) 
AIC  144.748  142.258  150.278  135.197 
BIC  155.674  149.542  161.205  149.765 
CAIC  146.893  144.344  152.423  137.415 
$\ell \left(\widehat{\mathit{\theta}}\right)$  −69.374  −69.127  −72.139  −63.599 
