# A Flexible Semi-Poisson Distribution with Applications to Insurance Claims and Biological Data

## Abstract

## 1. Introduction

#### 1.1. Characterizations

#### 1.2. The Cumulative Distribution Function

## 2. Properties

#### 2.1. Natural Exponential Family

#### 2.2. The Generating Functions

#### 2.3. Moments

#### Incomplete Moments

#### 2.4. Reliability Properties

**Proposition 1.**

**Proof.**

**Corollary 1.**

- (i) Using the relationships between log-concavity and unimodality presented in [16], the $SP$ distribution is unimodal.
- (ii) $SP$ has an increasing failure rate (IFR).
- (iii) The convolution of $SP$ with any other log-concave discrete distribution is also log-concave.
- (iv) $SP$ has at most an exponential tail, that is, $\lim_{x\to \infty} e^{bx}\,\mathbb{P}(Y=x)=0$, which implies $\mathbb{P}(Y=x)=o\left(e^{-bx}\right)$ for some $b>0$ as $x\to \infty$.
- (v) $SP$ has a monotonically decreasing mean residual life (MRL) function.
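
The log-concavity behind these statements can be checked numerically. The sketch below is illustrative only: it assumes the probability mass function $\mathbb{P}(Y=y)=c(\lambda)\,\frac{y+1}{y+2}\,\frac{\lambda^{y}}{y!}$ with $c(\lambda)=\lambda^{2}/\left(e^{\lambda}(\lambda^{2}-\lambda+1)-1\right)$, a form read off the likelihood coded in Appendix A, and `dsp` is a hypothetical helper name.

```r
# Sketch only: pmf form assumed from the likelihood coded in Appendix A.
dsp <- function(y, lam) {
  const <- lam^2 / (exp(lam) * (lam^2 - lam + 1) - 1)  # normalising constant
  const * (y + 1) / (y + 2) * lam^y / factorial(y)
}

lam <- 2
y <- 0:60
p <- dsp(y, lam)

# Total mass over the truncated support (should be very close to 1)
total_mass <- sum(p)

# Log-concavity: p(y)^2 >= p(y - 1) * p(y + 1) for every interior y
i <- 2:(length(p) - 1)
is_log_concave <- all(p[i]^2 >= p[i - 1] * p[i + 1])

# Failure rate h(y) = P(Y = y) / P(Y >= y); IFR means h is non-decreasing
surv <- rev(cumsum(rev(p)))
h <- (p / surv)[1:20]
is_ifr <- all(diff(h) >= -1e-12)
```

Both `is_log_concave` and `is_ifr` should evaluate to `TRUE` under the assumed form; other values of `lam` can be tried in the same way.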

#### 2.5. Entropy

#### 2.6. Infinite Divisibility and Further Properties

## 3. Estimation

**Proposition 2.**

**Proof.**

## 4. Simulations

1. To generate a random sample from the $SP(\lambda)$ distribution, we use the characterization based on the weighted Poisson distribution given in Equation (3). We generate 10,000 samples of size $n$ from the $SP(\lambda)$ distribution for each of two values of $\lambda$, namely $0.5$ and $2$.
2. The MLE of $\lambda$ is computed from each of these 10,000 samples. Denote these estimates by $\widehat{\lambda}_{i}$, $i=1,2,\dots ,10{,}000$.
3. Using the MLEs from step (2), we compute the bias and mean squared error (MSE) of the estimator as
   $$\mathrm{Bias}(\widehat{\lambda},n)=\frac{1}{10000}\sum_{i=1}^{10000}(\widehat{\lambda}_{i}-\lambda ),$$
   $$\mathrm{MSE}(\widehat{\lambda},n)=\frac{1}{10000}\sum_{i=1}^{10000}(\widehat{\lambda}_{i}-\lambda )^{2}.$$
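
The three steps above can be sketched in R as follows. This is a minimal sketch under stated assumptions, not the authors' code: it assumes the Poisson weight $w(y)=(y+1)/(y+2)$ implied by the likelihood in Appendix A, draws from the weighted Poisson form by accept-reject, and uses far fewer than 10,000 replications to keep the run short. The names `rsp`, `sp_nll` and `sp_mle` are hypothetical.

```r
set.seed(1)

# Accept-reject draw from the weighted Poisson form: propose X ~ Poisson(lam)
# and keep it with probability w(X) = (X + 1) / (X + 2) <= 1 (assumed weight).
rsp <- function(n, lam) {
  out <- integer(0)
  while (length(out) < n) {
    x <- rpois(2 * n, lam)
    out <- c(out, x[runif(length(x)) < (x + 1) / (x + 2)])
  }
  out[seq_len(n)]
}

# Negative log-likelihood, dropping terms that do not depend on lam
sp_nll <- function(lam, z) {
  const <- lam^2 / (exp(lam) * (lam^2 - lam + 1) - 1)
  -length(z) * log(const) - sum(z) * log(lam)
}

# One-dimensional minimisation over a bounded interval
sp_mle <- function(z) {
  optimize(sp_nll, interval = c(1e-3, 20), z = z)$minimum
}

lambda <- 2; n <- 100; reps <- 500      # reps reduced from 10,000 for speed
est <- replicate(reps, sp_mle(rsp(n, lambda)))

bias <- mean(est - lambda)
mse  <- mean((est - lambda)^2)
```

Repeating the last four lines over a grid of `n` values gives the bias and MSE curves as functions of the sample size.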

## 5. Applications to Real Data

#### 5.1. Dataset 1

- ZIW (zero-inflated Waring distribution), given by [25], $$P(x;\alpha ,\beta ,\pi )=\begin{cases}\pi +(1-\pi )\dfrac{\alpha}{\alpha +\beta}, & \text{if } x=0,\\[2ex] (1-\pi )\dfrac{\alpha (\alpha +\beta -1)!\,(x+\beta -1)!}{(\beta -1)!\,(x+\alpha +\beta )!}, & \text{if } x>0,\end{cases}$$
- NLD (new logarithmic distribution), given by [26], $$p_{x}=\frac{\log (1-\alpha \theta^{x})-\log (1-\alpha \theta^{x+1})}{\log (1-\alpha )},\quad x=0,1,2,\dots ,$$
- NGDP (new geometric discrete Pareto distribution), as proposed by [23], $$p_{x}=\frac{q^{x}}{(x+1)^{\alpha}}-\frac{q^{x+1}}{(x+2)^{\alpha}},\quad x=0,1,2,\dots ,$$
- DGP (discrete generalized Pareto distribution), provided by [27], $$p_{x}=\frac{1}{(1+\lambda x)^{\alpha}}-\frac{1}{(1+\lambda (x+1))^{\alpha}},\quad x=0,1,2,\dots ,$$
- PIG (Poisson inverse Gaussian distribution), outlined by [28], $$p_{x}=\frac{1}{x!}\sqrt{\frac{2\varphi}{\pi}}\,e^{\varphi /\mu}\,\varphi^{-\frac{1}{4}+\frac{x}{2}}\left(2+\frac{\varphi}{\mu^{2}}\right)^{\frac{1-2x}{4}}K_{x-\frac{1}{2}}\left(\sqrt{2\varphi +\frac{\varphi^{2}}{\mu^{2}}}\right),\quad x=0,1,2,\dots ,$$
- NB (negative binomial distribution), $$p_{x}=\frac{(x+r-1)!}{x!\,(r-1)!}\,p^{r}(1-p)^{x},\quad x=0,1,2,\dots ,$$
- PLD (Poisson Lindley distribution), provided in [29], $$p_{x}=\frac{\theta^{2}(x+\theta +2)}{(\theta +1)^{x+3}},\quad x=0,1,2,\dots .$$
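
Several of these probability mass functions are simple enough to code directly for use in a likelihood. The sketch below transcribes the PLD, DGP and NGDP forms as written above and checks that each sums to one over a long truncated support. The function names are hypothetical, and the parameter values are the estimates reported in the first fitted-parameter table, used here purely for illustration.

```r
# Sketch only: pmfs transcribed from the formulas above; names are hypothetical.
dpld <- function(x, theta) {
  theta^2 * (x + theta + 2) / (theta + 1)^(x + 3)
}
ddgp <- function(x, lambda, alpha) {
  (1 + lambda * x)^(-alpha) - (1 + lambda * (x + 1))^(-alpha)
}
dngdp <- function(x, q, alpha) {
  q^x / (x + 1)^alpha - q^(x + 1) / (x + 2)^alpha
}

x <- 0:1000  # truncated support; the neglected tail mass is negligible here

mass <- c(
  PLD  = sum(dpld(x, theta = 1.92)),
  DGP  = sum(ddgp(x, lambda = 0.71, alpha = 5.38)),
  NGDP = sum(dngdp(x, q = 0.73, alpha = 5.92))
)

# Negative log-likelihood for grouped counts (values with their frequencies)
nll_grouped <- function(pmf, values, freq, ...) {
  -sum(freq * log(pmf(values, ...)))
}
```

A call such as `nll_grouped(dpld, values, freq, theta = 1.92)` can then be passed to a numerical optimiser to fit a competing model to a frequency table.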

#### 5.2. Dataset 2

## 6. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A. R Codes

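    library(ggplot2)

    # Accept-reject draw from SP(lam): propose Poisson(lam), keep x with
    # probability (x+1)/(x+2). Assumed sampler; the original listing called
    # rpois() directly inside the likelihood.
    rsp <- function(n, lam) {
      z <- integer(0)
      while (length(z) < n) {
        x <- rpois(2 * n, lam)
        z <- c(z, x[runif(2 * n) < (x + 1) / (x + 2)])
      }
      z[1:n]
    }

    l <- 0.5
    n_grid <- seq(from = 10, to = 200, by = 10)
    b_l <- c()
    mse_l <- c()
    for (n in n_grid) {
      x <- c()  # estimates for this sample size only
      for (i in 1:1000) {
        z <- rsp(n, l)  # one sample per replication
        nll <- function(lam) {
          const <- lam^2 / (exp(lam) * (lam^2 - lam + 1) - 1)
          # const already absorbs exp(-lam), so no separate n * lam term
          -n * log(const) - sum(log((z + 1) / (z + 2) / factorial(z))) -
            sum(z) * log(lam)
        }
        fit <- optim(2, nll, method = "Brent", lower = 1e-3, upper = 20)
        x <- c(x, fit$par[1])
      }
      b_l <- c(b_l, mean(x) - l)
      mse_l <- c(mse_l, mean((x - l)^2))
    }
    df <- data.frame(n = n_grid, b_l, mse_l)
    ggplot(data = df, aes(x = n, y = b_l)) +
      geom_line() + geom_point() +
      xlab("n") + ylab(expression(Bias(hat(lambda)))) +
      theme_bw()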

## References

1. Lambert, D. Zero-Inflated Poisson regression, with an application to defects in manufacturing. Technometrics **1992**, 34, 1–14.
2. Böhning, D.; Dietz, E.; Schlattmann, P.; Mendonca, L.; Kirchner, U. The zero-inflated Poisson model and the decayed, missing and filled teeth index in dental epidemiology. J. R. Stat. Soc. Ser. A **1999**, 162, 195–209.
3. Atkins, D.C.; Gallop, R.C. Rethinking how family researchers model infrequent outcomes: A tutorial on count regression and zero-inflated models. J. Fam. Psychol. **2007**, 21, 726–735.
4. Fagundes, R.A.A.; Souza, R.M.C.R.; Cysneiros, F.J.A. Zero-inflated prediction model in software-fault data. IET Softw. **2016**, 10, 1–9.
5. Greene, W.H. Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. NYU Working Paper No. EC-94-10. 1994. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1293115 (accessed on 17 January 2023).
6. Mullahy, J. Specification and testing of some modified count data models. J. Econom. **1986**, 33, 341–365.
7. Bandyopadhyay, D.; DeSantis, S.M.; Korte, J.E.; Brady, K.T. Some considerations for excess zeroes in substance abuse research. Am. J. Drug Alcohol Abus. **2011**, 37, 376–382.
8. Coxe, S.; West, S.G.; Aiken, L.S. The analysis of count data: A gentle introduction to Poisson regression and its alternatives. J. Personal. Assess. **2009**, 91, 121–136.
9. Hu, M.; Pavlicova, M.; Nunes, E.V. Zero-inflated and hurdle models of count data with extra zeros: Examples from an HIV-risk reduction intervention trial. Am. J. Drug Alcohol Abus. **2011**, 37, 367–375.
10. Usman, M.; Oyejola, B.A. Models for count data in the presence of outliers and/or excess zero. Math. Theory Model. **2013**, 3, 94–103.
11. Tüzen, M.F.; Erbaş, S. A comparison of count data models with an application to daily cigarette consumption of young persons. Commun. Stat.-Theory Methods **2018**, 47, 5825–5844.
12. Louzayadio, C.G.; Malouata, R.O.; Koukoutikissa, M.D. A weighted Poisson distribution for underdispersed count data. Int. J. Stat. Probab. **2021**, 10, 157.
13. Rattihalli, R.N.; Bhati, D. Generation of new families of discrete distributions. Calcutta Stat. Assoc. Bull. **2017**, 68, 135–146.
14. Tripathi, R.C.; Gupta, P.L.; Gupta, R.C. Incomplete moments of modified power series distributions with applications. Commun. Stat.-Theory Methods **1986**, 15, 999–1015.
15. Gupta, P.L.; Gupta, R.C.; Tripathi, R.C. On the monotonic properties of discrete failure rates. J. Stat. Plan. Inference **1997**, 65, 255–268.
16. Grandell, J. Mixed Poisson Processes; Chapman & Hall: London, UK, 1997.
17. Evans, R.J. Ramanujan’s Second Notebook: Asymptotic expansions for hypergeometric series and related functions. In Ramanujan Revisited; Academic Press: New York, NY, USA, 1988.
18. Nekoukhou, V.; Alamatsaz, M.H.; Bidram, H. A discrete analog of the generalized exponential distribution. Commun. Stat.-Theory Methods **2012**, 41, 2000–2013.
19. Steutel, F.W.; van Harn, K. Infinite Divisibility of Probability Distributions on the Real Line, 1st ed.; CRC Press: Boca Raton, FL, USA, 1979.
20. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2022; Available online: https://www.R-project.org/ (accessed on 17 January 2023).
21. Rohatgi, V.K.; Saleh, A.K. An Introduction to Probability and Statistics, 2nd ed.; John Wiley and Sons: Hoboken, NJ, USA, 2000.
22. Zeileis, A.; Kleiber, C. Countreg: Count Data Regression. R Package Version 0.2-0. 2018. Available online: http://R-Forge.Rproject.org/projects/countreg/ (accessed on 17 January 2023).
23. Bhati, D.; Bakouch, H.S. A new infinitely divisible discrete distribution with applications to count data modeling. Commun. Stat.-Theory Methods **2019**, 48, 1401–1416.
24. Shaul, K.B.; Ridder, A. Exponential dispersion models for over-dispersed zero-inflated count data. Commun. Stat.-Simul. Comput. **2021**.
25. Rivas, L.; Campos, F. Zero inflated Waring distribution. Commun. Stat.-Simul. Comput. **2021**.
26. Gómez-Déniz, E.; Sarabia, J.M.; Calderín-Ojeda, E. A new discrete distribution with actuarial applications. Insur. Math. Econ. **2011**, 48, 406–412.
27. Prieto, F.; Gómez-Déniz, E.; Sarabia, J.M. Modelling road accident blackspots data with the discrete generalized Pareto distribution. Accid. Anal. Prev. **2014**, 71, 38–49.
28. Willmot, G. The Poisson-inverse Gaussian as an alternative to the negative binomial. Scand. Actuar. J. **1987**, 3, 113–127.
29. Sankaran, M. The Discrete Poisson-Lindley Distribution. Biometrics **1970**, 26, 145–149.
30. Rose, C.E.; Martin, S.W.; Wannemuehler, K.A.; Plikaytis, B.D. On the use of zero inflated and hurdle models for modeling vaccine adverse events count data. J. Biopharm. Stat. **2006**, 16, 463–481.

Value | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Frequency | 654 | 38 | 33 | 21 | 17 | 12 | 11 | 10 | 5 | 4 | 8 |

Model | Fitted Parameters | Log-L | AIC | BIC | K-S | p-Value |
---|---|---|---|---|---|---|
SP $(\lambda )$ | $\widehat{\lambda}=0.7220$ | −778.18 | 1558.36 | 1563.06 | 0.04 | 0.71 |
Poisson $(\lambda )$ | $\widehat{\lambda}=0.7130$ | −1332.85 | 2667.71 | 2672.41 | 0.24 | 0.04 |
ZIW $(\alpha ,\beta ,\pi )$ | $\widehat{\alpha}=1.04,\widehat{\beta}=0.26,\widehat{\pi}=0.49$ | −781.56 | 1569.12 | 1583.22 | 0.05 | 0.64 |
NLD $(\alpha ,\theta )$ | $\widehat{\alpha}=0.55,\widehat{\theta}=0.17$ | −824.19 | 1652.38 | 1661.78 | 0.18 | 0.13 |
NGDP $(q,\alpha )$ | $\widehat{q}=0.73,\widehat{\alpha}=5.92$ | −800.91 | 1605.68 | 1615.08 | 0.11 | 0.36 |
DGP $(\lambda ,\alpha )$ | $\widehat{\lambda}=0.71,\widehat{\alpha}=5.38$ | −811.12 | 1626.24 | 1655.64 | 0.15 | 0.26 |
PIG $(\varphi ,\mu )$ | $\widehat{\varphi}=0.052,\widehat{\mu}=0.108$ | −791.54 | 1587.82 | 1596.48 | 0.21 | 0.42 |
NB $(r,p)$ | $\widehat{r}=0.58,\widehat{p}=0.32$ | −785.41 | 1574.82 | 1584.22 | 0.08 | 0.51 |
PLD $(\theta )$ | $\widehat{\theta}=1.92$ | −974.37 | 1950.74 | 1955.45 | 0.06 | 0.06 |
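
The AIC and BIC columns can be recomputed from the log-likelihood, the number of fitted parameters $k$ and the sample size $n$. The sketch below does this for the SP row, assuming the usual definitions $\mathrm{AIC}=2k-2\log L$ and $\mathrm{BIC}=k\log n-2\log L$, and taking $n=813$ as the sum of the frequencies in the first frequency table.

```r
freq <- c(654, 38, 33, 21, 17, 12, 11, 10, 5, 4, 8)  # first frequency table
n <- sum(freq)                                       # sample size, 813

loglik <- -778.18  # SP row of the table above
k <- 1             # number of fitted parameters

aic <- 2 * k - 2 * loglik
bic <- k * log(n) - 2 * loglik

round(c(AIC = aic, BIC = bic), 2)
```

Under these definitions the SP row's reported values, 1558.36 and 1563.06, are recovered.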

Value | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Frequency | 1437 | 1010 | 660 | 428 | 236 | 122 | 62 | 34 | 14 | 8 | 4 | 4 | 1 |

Model | Fitted Parameters | Log-L | AIC | BIC | K-S | p-Value |
---|---|---|---|---|---|---|
SP $(\lambda )$ | $\widehat{\lambda}=1.50$ | −3753.627 | 7149.29 | 7155.54 | 0.04 | 0.58 |
Poisson $(\lambda )$ | $\widehat{\lambda}=1.51$ | −7231.13 | 14,464.27 | 14,470.57 | 0.29 | 0.05 |
ZIW $(\alpha ,\beta ,\pi )$ | $\widehat{\alpha}=48.92,\widehat{\beta}=67.02,\widehat{\pi}=0.45$ | −6797.25 | 13,588.51 | 13,619.41 | 0.18 | 0.24 |
ZIPD $(\mu ,\sigma )$ | $\widehat{\mu}=2.04,\widehat{\sigma}=0.26$ | −6869.55 | 13,741.5 | 13,754.1 | 0.24 | 0.22 |
NGDP $(q,\alpha )$ | $\widehat{q}=0.51,\widehat{\alpha}=0.33$ | −6741.78 | 13,487.57 | 13,500.16 | 0.10 | 0.33 |
NB $(r,p)$ | $\widehat{r}=1.52,\widehat{p}=0.50$ | −6741.6 | 13,485.2 | 13,486.33 | 0.06 | 0.44 |
PLD $(\theta )$ | $\widehat{\theta}=0.99$ | −6745.99 | 13,493.99 | 13,500.29 | 0.14 | 0.29 |

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Almuhayfith, F.E.; Bapat, S.R.; Bakouch, H.S.; Alnaghmosh, A.M.
A Flexible Semi-Poisson Distribution with Applications to Insurance Claims and Biological Data. *Mathematics* **2023**, *11*, 1122.
https://doi.org/10.3390/math11051122