# A Statistical Model for Count Data Analysis and Population Size Estimation: Introducing a Mixed Poisson–Lindley Distribution and Its Zero Truncation

## Abstract

## 1. Introduction

## 2. Poisson-Improved Second-Degree Lindley Distribution

#### 2.1. Probability Mass Function of the PISDL Distribution

#### 2.2. Some Statistical Properties of the PISDL Distribution

#### 2.3. Parameter Estimation of the PISDL Distribution

#### 2.3.1. Method of Moments Estimator

#### 2.3.2. Maximum Likelihood Estimator

#### 2.4. Simulation Study

- Step 1: Generate $N=1000,2000,\dots,10{,}000$ random observations from the PISDL distribution with $\lambda=0.5$.
- Step 2: Estimate $\lambda$ using both the MLE and the moment estimator.
- Step 3: Repeat Steps 1–2 for a total of 2000 iterations and collect the estimates.
- Step 4: Calculate the mean absolute deviation (MAD) and the mean squared error (MSE), given, respectively, as $\mathrm{MAD}=\sum_{i=1}^{2000}\left|\check{\lambda}_{i}-\lambda\right|/2000$ and $\mathrm{MSE}=\sum_{i=1}^{2000}\left(\check{\lambda}_{i}-\lambda\right)^{2}/2000$, where $\check{\lambda}_{i}$ is the MLE or the moment estimate of $\lambda$ from the $i$-th iteration.
- Step 5: Repeat Steps 1–4 for $\lambda=2.0$ and $\lambda=5.0$.
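The loop in Steps 1–4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PISDL probability mass function is not reproduced in this section, so an ordinary Poisson sampler stands in for the PISDL sampler, and the sample mean (which is both the MLE and the moment estimator for the Poisson) stands in for the PISDL estimators. The function names `mad`, `mse`, `poisson_draw`, and `simulate` are hypothetical, and the iteration count is reduced from 2000 to keep the run short.

```python
import math
import random


def mad(estimates, true_value):
    """Mean absolute deviation of the estimates from the true parameter (Step 4)."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)


def mse(estimates, true_value):
    """Mean squared error of the estimates around the true parameter (Step 4)."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def poisson_draw(rng, lam):
    """Knuth's multiplication method. ASSUMPTION: stand-in for a PISDL sampler."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate(lam, n, iterations, seed=0):
    """Steps 1-4 for one (lam, n) pair; returns (MAD, MSE)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(iterations):
        sample = [poisson_draw(rng, lam) for _ in range(n)]  # Step 1
        estimates.append(sum(sample) / n)  # Step 2: Poisson MLE = sample mean
    return mad(estimates, lam), mse(estimates, lam)


# Reduced settings (200 iterations instead of 2000) to keep the sketch fast.
for n in (1000, 2000):
    mad_value, mse_value = simulate(lam=0.5, n=n, iterations=200)
    print(f"N={n}: MAD={mad_value:.5f}, MSE={mse_value:.7f}")
```

Replacing `poisson_draw` and the sample-mean line with a PISDL sampler and the estimators of Section 2.3 would turn the sketch into the study described above; `mad` and `mse` would stay unchanged.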

## 3. Zero-Truncated Poisson-Improved Second-Degree Lindley Distribution

#### 3.1. Probability Mass Function of the ZTPISDL Distribution

#### 3.2. Some Statistical Properties of the ZTPISDL Distribution

#### 3.3. Parameter Estimation of the ZTPISDL Distribution

#### 3.4. Simulation Study

## 4. Population Size Estimation

#### 4.1. Horvitz–Thompson Estimator under ZTPISDL Distribution (HT-ZTPISDL)

#### 4.2. Variance and Confidence Interval for HT-ZTPISDL

#### 4.3. Simulation Study

- Step 1: Generate $N=1000,2000,\dots,10{,}000$ random observations from the ZTPISDL distribution with $\lambda=0.5$.
- Step 2: Obtain $\widehat{\lambda}$ using the MLE and use $\widehat{\lambda}$ to obtain $\widehat{N}$.
- Step 3: Repeat Steps 1–2 for a total of 2000 iterations and collect the estimates.
- Step 4: Calculate the relative absolute error (RAB) and the relative standard deviation (RSd), given, respectively, as $\mathrm{RAB}=\left|\overline{\widehat{N}}-N\right|/N$ and $\mathrm{RSd}=\frac{1}{N}\sqrt{\sum_{i=1}^{2000}\left(\widehat{N}_{i}-\overline{\widehat{N}}\right)^{2}/2000}$, where $\widehat{N}_{i}$ is the estimate from the $i$-th iteration and $\overline{\widehat{N}}=\sum_{i=1}^{2000}\widehat{N}_{i}/2000$.
- Step 5: Repeat Steps 1–4 for $\lambda=2.0$ and $\lambda=5.0$.
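The following Python sketch walks through the same steps and computes RAB and RSd as defined in Step 4. It rests on several assumptions that are not taken from the paper: the zero-truncated Poisson replaces the ZTPISDL distribution (whose mass function is not reproduced here), the truncated sample is produced by drawing $N$ untruncated counts and discarding the zeros, and the Horvitz–Thompson estimate is taken in its usual form $\widehat{N}=n/(1-p_{0}(\widehat{\lambda}))$ with the Poisson value $p_{0}(\lambda)=e^{-\lambda}$. All function names are hypothetical, and only 200 iterations are run.

```python
import math
import random


def poisson_draw(rng, lam):
    """Knuth's multiplication method. ASSUMPTION: stand-in for the PISDL sampler."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def ztp_mle(mean_positive):
    """Zero-truncated Poisson MLE: solve lam / (1 - exp(-lam)) = mean by bisection."""
    lo, hi = 1e-9, mean_positive  # the root always lies below the truncated mean
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid / (-math.expm1(-mid)) < mean_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ht_estimate(observed):
    """Horvitz-Thompson estimate n / (1 - p0(lam_hat)), with Poisson p0 = exp(-lam)."""
    n = len(observed)
    lam_hat = ztp_mle(sum(observed) / n)
    return n / (-math.expm1(-lam_hat))


def rab_rsd(lam, population, iterations, seed=0):
    """Steps 1-4 for one (lam, N) pair; returns (RAB, RSd)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(iterations):
        counts = (poisson_draw(rng, lam) for _ in range(population))
        observed = [c for c in counts if c > 0]  # units with a zero count are unseen
        estimates.append(ht_estimate(observed))
    mean_est = sum(estimates) / iterations
    rab = abs(mean_est - population) / population
    spread = sum((e - mean_est) ** 2 for e in estimates) / iterations
    return rab, math.sqrt(spread) / population


rab, rsd = rab_rsd(lam=0.5, population=1000, iterations=200)
print(f"RAB = {rab:.4f}, RSd = {rsd:.4f}")
```

The sketch does not guard against degenerate samples (no positive counts, or all positive counts equal to 1), which are vanishingly rare at these settings but would need handling in a full study.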

## 5. Medical Data Applications

#### 5.1. Model Fittings Using the PISDL Distribution

#### 5.2. Model Fittings Using ZTPISDL Distribution

#### 5.3. Estimating Population Size

## 6. Conclusions, Limitations, and Future Research

**Figure 5.** The MAD and the MSE of the MLE and moment estimator for the PISDL distribution when $\lambda=0.5,2.0,5.0$ and $N=1000,2000,\dots,10{,}000$.

**Figure 7.** The MAD and the MSE of the MLE and moment estimator for the ZTPISDL distribution when $\lambda=0.5,1.0,2.0$ and $N=1000,2000,\dots,10{,}000$.

**Figure 9.** Plots of the empirical (vertical black lines) and fitted (blue line) values for (**i**) the number of dicentric chromosomes after being exposed to a 0.405 radiation dose and (**ii**) the number of dicentric chromosomes after being exposed to a 0.600 radiation dose.

**Figure 10.** A plot of the empirical (vertical black lines) and fitted (blue line) values for the number of positive samples of Salmonella.

**Table 1.** Model fittings of the number of dicentric chromosomes after being exposed to a 0.405 radiation dose using Poisson, PL, and PISDL distributions.

$x$ | $n_{x}$ | Poisson | PL | PISDL (MLE) | PISDL (Moment) |
---|---|---|---|---|---|
0 | 437 | 426.55 | 433.87 | 433.73 | 433.64 |
1 | 66 | 84.50 | 72.04 | 72.30 | 72.36 |
2 | 15 | 8.37 | 11.81 | 11.75 | 11.77 |
3 | 1 | 0.55 | 1.91 | 1.87 | 1.88 |
4 | 1 | 0.03 | 0.37 | 0.35 | 0.37 |
Total | 520 | 520.00 | 520.00 | 520.00 | 520.00 |
Parameter | | | | | |
$\widehat{\lambda}$ | | 0.1981 | - | 6.5464 | 6.5392 |
$\widehat{\theta}$ | | - | 5.7953 | - | - |
Max log-likelihood | | −285.14 | −279.40 | −279.45 | |
AIC | | 572.27 | 560.80 | 560.90 | - |
BIC | | 576.52 | 565.05 | 565.15 | - |
$\chi^{2}$ | | 11.55 | 1.13 | 1.23 | 1.22 |
df | | 1 | 1 | 1 | 1 |
p-value | | 0.0007 | 0.2878 | 0.2674 | 0.2694 |

**Table 2.** Model fittings of the number of dicentric chromosomes after being exposed to a 0.600 radiation dose using Poisson, PL, and PISDL distributions.

$x$ | $n_{x}$ | Poisson | PL | PISDL (MLE) | PISDL (Moment) |
---|---|---|---|---|---|
0 | 473 | 456.69 | 475.79 | 474.91 | 475.07 |
1 | 119 | 147.65 | 117.81 | 118.96 | 118.88 |
2 | 34 | 23.87 | 28.53 | 28.55 | 28.50 |
3 | 3 | 2.57 | 6.79 | 6.64 | 6.62 |
4 | 2 | 0.22 | 2.08 | 1.94 | 1.93 |
Total | 631 | 631.00 | 631.00 | 631.00 | 631.00 |
Parameter | | | | | |
$\widehat{\lambda}$ | | 0.3233 | - | 4.4001 | 4.4046 |
$\widehat{\theta}$ | | - | 3.7420 | - | - |
Max log-likelihood | | −469.65 | −464.09 | −463.96 | |
AIC | | 941.30 | 930.17 | 929.91 | - |
BIC | | 945.75 | 934.61 | 934.36 | - |
$\chi^{2}$ | | 11.85 | 2.77 | 2.54 | 2.54 |
df | | 1 | 2 | 2 | 2 |
p-value | | 0.0006 | 0.2503 | 0.2808 | 0.2808 |
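The information criteria and p-values reported in Table 2 can be cross-checked from the tabulated log-likelihoods and $\chi^{2}$ statistics. The sketch below assumes the standard definitions $\mathrm{AIC}=2k-2\ell$ and $\mathrm{BIC}=k\ln n-2\ell$ with $k=1$ parameter per model and $n=631$ observations, and it uses the closed-form upper tail of the chi-square distribution for 1 and 2 degrees of freedom. The function names are hypothetical; differences of about 0.01 to 0.02 from the table are expected because the log-likelihoods are rounded to two decimals.

```python
import math


def aic(loglik, k):
    """Akaike information criterion, 2k - 2*loglik."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion, k*ln(n) - 2*loglik."""
    return k * math.log(n) - 2 * loglik


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")


# (max log-likelihood, chi-square statistic, df) copied from Table 2.
table2 = {
    "Poisson": (-469.65, 11.85, 1),
    "PL": (-464.09, 2.77, 2),
    "PISDL (MLE)": (-463.96, 2.54, 2),
}

for name, (loglik, chi2, df) in table2.items():
    print(
        f"{name}: AIC={aic(loglik, 1):.2f}, "
        f"BIC={bic(loglik, 1, 631):.2f}, p={chi2_sf(chi2, df):.4f}"
    )
```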

**Table 3.** Model fittings of the number of positive samples of Salmonella using the ZTP, ZTPL, and ZTPISDL distributions.

$y$ | $n_{y}$ | ZTP | ZTPL | ZTPISDL (MLE) | ZTPISDL (Moment) |
---|---|---|---|---|---|
1 | 17 | 7.88 | 15.06 | 14.01 | 14.00 |
2 | 9 | 12.12 | 11.55 | 11.62 | 11.62 |
3 | 5 | 12.44 | 8.45 | 8.82 | 8.82 |
4 | 6 | 9.57 | 5.99 | 6.32 | 6.32 |
5 | 5 | 5.89 | 4.15 | 4.34 | 4.34 |
6 | 5 | 3.02 | 2.83 | 2.89 | 2.89 |
7 | 6 | 2.08 | 4.97 | 5.00 | 5.01 |
Total | 53 | 53.00 | 53.00 | 53.00 | 53.00 |
Parameter | | | | | |
$\widehat{\lambda}$ | | 3.0778 | - | 0.8932 | 0.8928 |
$\widehat{\theta}$ | | - | 0.6660 | - | - |
Max log-likelihood | | −110.64 | −105.28 | −105.10 | |
AIC | | 223.27 | 212.55 | 212.19 | - |
BIC | | 225.24 | 214.53 | 214.16 | - |
$\chi^{2}$ | | 24.10 | 3.61 | 4.16 | 4.16 |
df | | 4 | 3 | 4 | 4 |
p-value | | <0.0001 | 0.3068 | 0.3848 | 0.3848 |

**Table 4.** The estimated population size, the standard deviation, and the lower and upper limits for the 95% confidence interval of the population size estimator based on the ZTP, ZTPL, and ZTPISDL distributions.

Distribution | Estimated $\widehat{N}$ | SD | 95% Lower Limit | 95% Upper Limit |
---|---|---|---|---|
ZTP | 55.56 | 1.761 | 52.11 | 59.01 |
ZTPL | 71.21 | 4.045 | 63.28 | 79.14 |
ZTPISDL (MLE) | 66.64 | 4.896 | 57.04 | 76.24 |
ZTPISDL (Moment) | 63.10 | 4.961 | 53.38 | 72.82 |
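The limits in Table 4 are numerically consistent with a normal-approximation interval $\widehat{N}\pm 1.96\,\mathrm{SD}$. The short sketch below recomputes them from the tabulated estimates and standard deviations. The interval form is inferred from the numbers rather than quoted from Section 4.2, and the name `normal_interval` is hypothetical.

```python
Z_975 = 1.96  # 97.5% standard normal quantile, rounded to two decimals


def normal_interval(estimate, sd, z=Z_975):
    """Two-sided normal-approximation interval: estimate +/- z * sd."""
    return estimate - z * sd, estimate + z * sd


# (estimated N, SD) copied from Table 4.
table4 = {
    "ZTP": (55.56, 1.761),
    "ZTPL": (71.21, 4.045),
    "ZTPISDL (MLE)": (66.64, 4.896),
    "ZTPISDL (Moment)": (63.10, 4.961),
}

for name, (n_hat, sd) in table4.items():
    lower, upper = normal_interval(n_hat, sd)
    print(f"{name}: [{lower:.2f}, {upper:.2f}]")
```

All four intervals lie above the 53 observed units in Table 3, as any estimate of the total population size should.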

