Estimating the Ratio of Means in a Zero-Inflated Poisson Mixture Model

by Michael Pearce 1,* and Michael D. Perlman 2

1 Department of Mathematics and Statistics, Reed College, 3203 SE Woodstock Blvd, Portland, OR 97202, USA
2 Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195, USA
* Author to whom correspondence should be addressed.

Stats 2025, 8(3), 55; https://doi.org/10.3390/stats8030055
Submission received: 28 May 2025 / Revised: 2 July 2025 / Accepted: 3 July 2025 / Published: 5 July 2025

Abstract

The problem of estimating the ratio of the means of a two-component Poisson mixture model is considered, when each component is subject to zero-inflation, i.e., excess zero counts. The resulting zero-inflated Poisson mixture (ZIPM) model can be viewed as a three-component Poisson mixture model with one degenerate component. The EM algorithm is applied to obtain frequentist estimators and their standard errors, the latter determined via an explicit expression for the observed information matrix. As an intermediate step, we derive an explicit expression for standard errors in the two-component Poisson mixture model (without zero-inflation), a new result. The ZIPM model is applied to simulated data and real ecological count data of frigatebirds on the Coral Sea Islands off the coast of Northeast Australia.

1. Introduction

Baker and Holdsworth (2013) [1] present data relevant to the determination of the relative abundances of two subspecies of frigatebirds (FB), least (LFB) and greater (GFB), in the Coral Sea Islands off the coast of Northeast Australia. The available data is indirect, consisting only of counts of nests in several standardized sites over several time points, rather than direct observations of individuals. Furthermore, the nests of LFB and GFB usually are indistinguishable, differing (possibly) only in their relative numbers per site. Thus, to infer nest type, one may use techniques from model-based clustering, such as a finite mixture model [2]. For count data (such as ours), a finite mixture of Poisson distributions is a common choice [3,4]. Previous authors have studied identifiability and estimation of this and related models (e.g., [2]). Given inferred nest type, ecologists may be interested in estimating other qualities of the ecosystem: If the expected numbers of nests per site for LFB and GFB are denoted by $\mu$ and $\nu$ respectively, in this paper we study estimation of their ratio, $\theta \equiv \mu/\nu$, where $0 < \theta < \infty$.
Several complications arise. Because no further constraint can be imposed on θ a priori, the problem is unidentifiable as stated, i.e., ( μ , ν ) is indistinguishable from ( ν , μ ) . However, LFB are less prevalent than GFB (based on available labeled data, see [1]), which will render the model identifiable; see the second paragraph below. Furthermore, it is typical of such field studies that zero counts are recorded for reasons other than true absence, such as short study periods, secretive or small species, or other uncontrollable factors. In such cases, it is necessary to model the excessive zero-counts directly to not contaminate other inferences [5]. As is commonly done, we shall adopt the zero-inflated Poisson (ZIP) distribution to represent this feature [6,7,8].
We now state a probability model based on a finite mixture model of zero-inflated Poisson distributions to represent the aforementioned scenario: Let $Y_j$ indicate whether LFB ($Y_j = 1$) or GFB ($Y_j = 0$) are observed in time point $j$. Let $M_{ij}$ be the actual number of FB nests at site $i$ in time point $j$ (which may not be directly observed due to zero-inflation). Let $Z_{ij}$ indicate whether nest counts are directly observed ($Z_{ij} = 1$) or subject to zero-inflation and hence lost ($Z_{ij} = 0$). Finally, let $N_{ij}$ denote the number of FB nests observed at site $i$ at time point $j$. Let $\mathcal I \equiv \{1,\dots,I\}$ and $\mathcal J \equiv \{1,\dots,J\}$ be the corresponding index sets, and set $\mathcal K = \mathcal I \times \mathcal J$, $K = |\mathcal K| = IJ$. For $(i,j) \in \mathcal K$, consider random variables (rvs),
$$Y_j \sim \mathrm{Bernoulli}(\pi), \tag{1}$$
$$M_{ij} \mid Y_j \sim \mathrm{Poisson}\big(t_i\big[(1 - 0^{Y_j})\mu + 0^{Y_j}\nu\big]\big), \tag{2}$$
$$Z_{ij} \sim \mathrm{Bernoulli}(\epsilon), \tag{3}$$
$$N_{ij} = Z_{ij} M_{ij}; \tag{4}$$
where $0^0 = 1$, $\{Y_j\}$ and $\{Z_{ij}\}$ are mutually independent, and $\{M_{ij}\}$ and $\{Z_{ij}\}$ are conditionally mutually independent given $\{Y_j\}$. Thus $M_{ij}$ is a $\pi$-mixture of $\mathrm{Poisson}(t_i\mu)$ and $\mathrm{Poisson}(t_i\nu)$ rvs, where each $t_i > 0$ is known, potentially reflecting a feature of each site common to all time points, and $\mu, \nu \in (0,\infty)$ are unknown. Further, $N_{ij}$ is a zero-inflated Poisson mixture (ZIPM) rv with zero-inflation parameter $1-\epsilon \in (0,1)$. (One may ask whether the conditional ZIPM model is equivalent to the well-studied zero-truncated Poisson model; we demonstrate that this is not the case in Appendix A.)
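To make the data-generating process (1)–(4) concrete, the following sketch simulates one observed array N with NumPy. The function name `simulate_zipm` and all parameter values are our own illustrative choices, not part of the paper's specification.

```python
import numpy as np

def simulate_zipm(t, J, pi, eps, mu, nu, rng):
    """Simulate an I x J observed array N from the model (1)-(4)."""
    I = len(t)
    Y = rng.binomial(1, pi, size=J)            # Y_j: component label per time point
    rate = np.where(Y == 1, mu, nu)            # mu if LFB column, nu if GFB
    M = rng.poisson(np.outer(t, rate))         # M_ij ~ Poisson(t_i * rate_j)
    Z = rng.binomial(1, eps, size=(I, J))      # Z_ij = 0: count lost to zero-inflation
    return Z * M                               # N_ij = Z_ij * M_ij

rng = np.random.default_rng(0)
t = np.ones(11)                                # t_i = 1, as in the frigatebird analysis
N = simulate_zipm(t, J=4, pi=0.25, eps=0.8, mu=10.0, nu=5.0, rng=rng)
```

Here setting `eps=0.8` corresponds to a zero-inflation rate of 1 − ε = 0.2.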
The main objective of this paper is the problem of estimating the ratio $\theta \equiv \mu/\nu$ based solely on the observed data $\{N_{ij}\}$, with $\{Y_j\}$, $\{M_{ij}\}$, and $\{Z_{ij}\}$ unobserved. As noted above, for identifiability of $(\mu,\nu)$, and therefore of $\theta = \mu/\nu$, a restriction must be imposed: we assume that $0 < \pi \leq 1/2$, corresponding to the knowledge that LFB occur no more frequently than GFB. We propose frequentist estimation of $\theta$ via the EM algorithm and approximate standard errors via explicit calculation of the observed information matrix. (A Bayesian analysis of this same problem can be found in a preprint version of this paper: M. D. Perlman (2022). Estimating the ratio of means in a zero-inflated Poisson mixture model. arXiv:2203.13994.)
The rest of this paper is organized as follows: In Section 2, we briefly provide notation. In Section 3, we present a preliminary problem of estimating θ in a standard, two-component Poisson mixture model (i.e., { Y j } are unobserved and { M i j } are observed, without zero-inflation) to serve as a guidepost for the main problem. Therein, we estimate θ in a frequentist context via EM and approximate standard errors via explicit calculation of the observed information matrix, which to our knowledge is a new result (Approximate methods are usually used, such as the SEM algorithm [9] or the bootstrap [4]). In Section 4 we address the main problem of estimating the ratio of Poisson means in a zero-inflated Poisson mixture (ZIPM) model, as described in the previous paragraph. In Section 5 the ZIPM model is applied both to simulated data and real data on frigatebirds in the Coral Sea Islands. The results of this study are summarized in Section 6.

2. Notation

Column vectors and arrays denoted by Roman letters appear in bold type, their components in plain type; caps denote rvs:
$$\mathbf t \equiv (t_1,\dots,t_I) \in \mathbb R^I, \qquad \mathbf y \equiv (y_1,\dots,y_J) \in \{0,1\}^J, \qquad \mathbf Y \equiv (Y_1,\dots,Y_J) \in \{0,1\}^J,$$
$$\mathbf z \equiv (z_{ij}) \in \{0,1\}^{\mathcal K}, \qquad \mathbf Z \equiv (Z_{ij}) \in \{0,1\}^{\mathcal K}, \qquad \mathbf m = (m_{ij}) \in \mathbb Z_+^{\mathcal K}, \qquad \mathbf M = (M_{ij}) \in \mathbb Z_+^{\mathcal K},$$
$$\mathbf n = (n_{ij}) \in \mathbb Z_+^{\mathcal K}, \qquad \mathbf N = (N_{ij}) \in \mathbb Z_+^{\mathcal K},$$
where $\mathbb R$ is the set of real numbers and $\mathbb Z_+$ is the set of nonnegative integers. Note that $Y_j, Z_{ij} \in \{0,1\}$, as each is an indicator variable, and $M_{ij}, N_{ij} \in \mathbb Z_+$, as each represents count data. Sums and products will range over the index sets $\mathcal I$ and $\mathcal J$ unless otherwise specified, e.g.,
$$\sum_i = \sum_{i=1}^I, \qquad \sum_j = \sum_{j=1}^J, \qquad \sum_{i,j} = \sum_{i=1}^I\sum_{j=1}^J,$$
etc. Summation over one or both of the indices i , j involving m i j , n i j , z i j , or their random (capitalized) versions will be indicated by simply dropping the indices that are summed over, e.g.,
$$m_i = \sum_j m_{ij}, \qquad n_j = \sum_i n_{ij}, \qquad m = \sum_{i,j} m_{ij}, \qquad n = \sum_{i,j} n_{ij}.$$
We set $\mathbf m! = \prod_{i,j} m_{ij}!$ and $\mathbf n! = \prod_{i,j} n_{ij}!$. All conditioning events $\mathbf Y = \mathbf y$, $\mathbf N = \mathbf n$, etc., will be abbreviated as $\mathbf y$, $\mathbf n$, etc. Lastly, for $i = 1,\dots,I$ and $j = 1,\dots,J$, we define
$$\mathbf M_j = (M_{ij} \mid i=1,\dots,I), \qquad \mathbf m_j = (m_{ij} \mid i=1,\dots,I),$$
$$\mathbf N_j = (N_{ij} \mid i=1,\dots,I), \qquad \mathbf n_j = (n_{ij} \mid i=1,\dots,I),$$
$$1_{ij} = 1_{ij}(n_{ij}) = 1 - 0^{n_{ij}}, \qquad 1_j = 1_j(\mathbf n_j) = \sum_i 1_{ij},$$
$$1 = 1(\mathbf n) = \sum_j 1_j, \qquad 1^= = 1^=(\mathbf n) = \sum_j 1_j^=,$$
$$1_{ij}^= = 1_{ij}^=(n_{ij}) = 0^{n_{ij}}, \qquad 1_j^= = 1_j^=(\mathbf n_j) = \sum_i 1_{ij}^=,$$
$$t_j = t_j(\mathbf n_j) = \sum_i t_i 1_{ij}, \qquad t_j^= = t_j^=(\mathbf n_j) = \sum_i t_i 1_{ij}^=.$$
Here $1_{ij}$ (resp. $1_{ij}^=$) is the indicator function of the event $\{n_{ij} \neq 0\}$ (resp. $\{n_{ij} = 0\}$), so $1_j$ (resp. $1_j^=$) is the number of nonzero (resp. zero) $n_{ij}$ with $j$ fixed, etc.
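The index-set quantities above reduce to one-line array operations; a small NumPy illustration (the array values are arbitrary, chosen only to exercise the definitions):

```python
import numpy as np

# Illustrative count array n (I = 3 sites, J = 2 time points) and exposures t.
n = np.array([[2, 0],
              [0, 0],
              [5, 1]])
t = np.array([1.0, 2.0, 0.5])

ind = (n != 0).astype(int)                     # 1_ij : indicator of n_ij != 0
ones_j = ind.sum(axis=0)                       # 1_j  : number of nonzero cells in column j
zeros_j = (1 - ind).sum(axis=0)                # 1_j^=: number of zero cells in column j
t_j = (t[:, None] * ind).sum(axis=0)           # t_j  : exposure summed over nonzero cells
t_j_eq = (t[:, None] * (1 - ind)).sum(axis=0)  # t_j^=: exposure summed over zero cells
n_j = n.sum(axis=0)                            # n_j  : column sums
```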

3. A Preliminary Problem

In this section, we address estimation of $\theta \equiv \mu/\nu$ in a standard, two-component Poisson mixture model. Specifically, we derive an EM algorithm for maximum likelihood estimation of the unknown model parameters (Section 3.1) and subsequently provide an explicit formula for standard errors of the maximum likelihood estimators (Section 3.2). We begin with a few preliminaries.
Here, $M_{ij}$ is a $\pi$-mixture of $\mathrm{Poisson}(t_i\mu)$ and $\mathrm{Poisson}(t_i\nu)$ rvs, where $\pi$ is the unknown mixing probability, cf. (2). Thus the probability mass function (pmf) of the observed data array $\mathbf M \equiv (M_{ij})$ is
$$f_{\pi,\mu,\nu}(\mathbf m) = \prod_{i,j}\big[\pi e^{-t_i\mu}(t_i\mu)^{m_{ij}} + (1-\pi)e^{-t_i\nu}(t_i\nu)^{m_{ij}}\big]/m_{ij}! = \prod_{i,j}\big[\pi e^{-t_i\mu}\mu^{m_{ij}} + (1-\pi)e^{-t_i\nu}\nu^{m_{ij}}\big]\cdot\Xi_{\mathbf t}(\mathbf m), \tag{5}$$
where $\Xi_{\mathbf t}(\mathbf m) = \prod_i t_i^{m_i}/\mathbf m!$. The joint pmf of the complete (unobserved and observed) data $(\mathbf Y,\mathbf M)$ is
$$\begin{aligned} f_{\pi,\mu,\nu}(\mathbf y,\mathbf m) &= f_\pi(\mathbf y)\,f_{\mu,\nu}(\mathbf m\mid\mathbf y) \\ &= \prod_j \pi^{y_j}(1-\pi)^{1-y_j}\prod_{i,j}\big(e^{-t_i\mu}\mu^{m_{ij}}\big)^{y_j}\big(e^{-t_i\nu}\nu^{m_{ij}}\big)^{1-y_j}\,\Xi_{\mathbf t}(\mathbf m) \\ &= \big[\pi^{\bar y}(1-\pi)^{1-\bar y}\big]^J\prod_j\big(e^{-\bar t I\mu}\mu^{m_j}\big)^{y_j}\big(e^{-\bar t I\nu}\nu^{m_j}\big)^{1-y_j}\,\Xi_{\mathbf t}(\mathbf m) \\ &= \big[\pi^{\bar y}(1-\pi)^{1-\bar y}\big]^J\big[e^{-\bar t\,\bar y\,\mu}\,\mu^{\overline{my}}\,e^{-\bar t(1-\bar y)\nu}\,\nu^{\overline{m(1-y)}}\big]^K\,\Xi_{\mathbf t}(\mathbf m), \end{aligned} \tag{6}$$
where $\bar t = \frac1I\sum_i t_i$ and
$$\bar y = \frac1J\sum_j y_j, \qquad \bar m = \frac1K\sum_{i,j}m_{ij} = \frac mK, \qquad \overline{my} = \frac1K\sum_j m_j y_j, \qquad \overline{m(1-y)} = \frac1K\sum_j m_j(1-y_j).$$
Thus, $f_{\pi,\mu,\nu}(\mathbf y,\mathbf m)$ determines an exponential family with sufficient statistic $(\bar Y,\, \overline{MY},\, \overline{M(1-Y)})$.

3.1. Estimation via the EM Algorithm

To obtain the MLEs π ^ , μ ^ , ν ^ and thus θ ^ = μ ^ / ν ^ , it is straightforward to apply the EM algorithm [10,11] as follows:
  • E-Step: Because (6) is an exponential family, Bayes formula shows that for l = 0 , 1 , , the ( l + 1 ) -st E-step simply imputes y j to be
$$(\hat y_j)_{l+1} = E_{\hat\pi_l,\hat\mu_l,\hat\nu_l}[Y_j\mid\mathbf m] = P_{\hat\pi_l,\hat\mu_l,\hat\nu_l}[Y_j=1\mid\mathbf m_j] = \frac{\hat\pi_l\prod_i e^{-t_i\hat\mu_l}(t_i\hat\mu_l)^{m_{ij}}}{\hat\pi_l\prod_i e^{-t_i\hat\mu_l}(t_i\hat\mu_l)^{m_{ij}} + (1-\hat\pi_l)\prod_i e^{-t_i\hat\nu_l}(t_i\hat\nu_l)^{m_{ij}}} = \frac{\hat\pi_l}{\hat\pi_l + (1-\hat\pi_l)\,e^{-I\bar t(\hat\nu_l-\hat\mu_l)}\,(\hat\nu_l/\hat\mu_l)^{m_j}}. \tag{7}$$
Observe that in (7), the numerator (equivalently, the first term in the denominator) is the unnormalized probability that $Y_j = 1$ given $\mathbf m_j$, and the second term in the denominator is the unnormalized probability that $Y_j = 0$ given $\mathbf m_j$.
  • M-Step: From (6), the complete-data MLEs are found to be,
$$\tilde\pi = \bar y, \qquad \tilde\mu = \frac{\overline{my}}{\bar t\,\bar y}, \qquad \tilde\nu = \frac{\overline{m(1-y)}}{\bar t\,(1-\bar y)}.$$
Thus, in the $(l+1)$-st iteration, the M-step updates the estimates of the unknown parameters via
$$\hat\pi_{l+1} = \frac1J\sum_j(\hat y_j)_{l+1}, \qquad \hat\mu_{l+1} = \frac{\overline{m(\hat y)_{l+1}}}{\bar t\,\overline{(\hat y)_{l+1}}}, \qquad \hat\nu_{l+1} = \frac{\overline{m(1-\hat y)_{l+1}}}{\bar t\,\big(1-\overline{(\hat y)_{l+1}}\big)},$$
where $(\hat{\mathbf y})_{l+1} = ((\hat y_1)_{l+1},\dots,(\hat y_J)_{l+1})$. Note that the identifiability constraint $\pi \leq \frac12$ is briefly ignored: Aitkin and Rubin [10] (1985, p. 69) note that, assuming convergence of $(\hat\pi_l,\hat\mu_l,\hat\nu_l)$ to an MLE $(\hat\pi,\hat\mu,\hat\nu)$, the same maximum value of the likelihood occurs at $(1-\hat\pi,\hat\nu,\hat\mu)$. Thus, we simply take the MLE to be the one whose first component is $\leq\frac12$ (say $(\hat\pi,\hat\mu,\hat\nu)$, for the sake of specificity). This concludes the EM algorithm.
Using estimates ( π ^ , μ ^ , ν ^ ) from the EM algorithm, we obtain the following estimator of θ :
$$\hat\theta_{l+1} = \frac{\hat\mu_{l+1}}{\hat\nu_{l+1}} = \frac{\overline{m(\hat y)_{l+1}}\,\big(1-\overline{(\hat y)_{l+1}}\big)}{\overline{(\hat y)_{l+1}}\,\big(\bar m - \overline{m(\hat y)_{l+1}}\big)} = \frac{1/\overline{(\hat y)_{l+1}} - 1}{\bar m/\overline{m(\hat y)_{l+1}} - 1}. \tag{8}$$
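The E- and M-steps above can be sketched compactly in NumPy. This is an illustrative single-start implementation with a fixed iteration count (the study in Section 5 uses multiple random starts and the full ZIPM model); `em_poisson_mixture` and the initial values are our own choices.

```python
import numpy as np

def em_poisson_mixture(m, t, n_iter=200):
    """EM for the two-component Poisson mixture of Section 3.1.

    m : I x J array of counts; t : length-I exposure vector.
    """
    pi, mu, nu = 0.3, m.mean() * 1.5 + 1e-9, m.mean() * 0.5 + 1e-9
    for _ in range(n_iter):
        # E-step, cf. (7): posterior probability that Y_j = 1 given column j
        ll1 = np.log(pi) + (-t[:, None] * mu + m * np.log(mu)).sum(axis=0)
        ll0 = np.log(1 - pi) + (-t[:, None] * nu + m * np.log(nu)).sum(axis=0)
        y = 1.0 / (1.0 + np.exp(np.clip(ll0 - ll1, -700, 700)))
        # M-step: weighted Poisson MLEs
        pi = y.mean()
        mu = (m.sum(axis=0) * y).sum() / (t.sum() * y.sum())
        nu = (m.sum(axis=0) * (1 - y)).sum() / (t.sum() * (1 - y).sum())
    if pi > 0.5:                 # enforce the identifiability constraint pi <= 1/2
        pi, mu, nu = 1 - pi, nu, mu
    return pi, mu, nu, mu / nu
```

The final label swap implements the convention discussed above: the component with mixing weight at most 1/2 is reported as the LFB component.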

3.2. Standard Error for the MLE θ ^

We now provide an explicit formula for approximating standard errors of the unknown parameters, $(\pi,\mu,\nu)$, and thereby, of $\theta \equiv \mu/\nu$. For simplicity of notation set $\omega = (\pi,\mu,\nu)$, and assume that the EM iterates $\hat\omega_l$ converge to $\hat\omega \equiv (\hat\pi,\hat\mu,\hat\nu)$, the actual MLE based on the observed data $\mathbf M$.
One method for approximating the standard error of ω ^ uses the total expected information matrix
$$\mathcal I_{\mathbf M}(\omega) \equiv E_\omega\big[-\partial_\omega^2\log f_\omega(\mathbf M)\big]$$
for the observed data $\mathbf M$: If $K \equiv IJ$ is large, it follows from Theorem 2 of [12] that
$$\sqrt K(\hat\omega - \omega) \xrightarrow{d} N_3\big[\mathbf 0,\; K\,\mathcal I_{\mathbf M}^{-1}(\omega)\big].$$
Alternatively, refs. [13,14] note that the observed information matrix $\mathcal I_{\mathbf m}(\omega) \equiv -\partial_\omega^2\log f_\omega(\mathbf m)$ usually yields a better normal approximation and often is more readily computed than expected information.
Theorem 1.
Assume the two-component Poisson mixture model presented in (1) and (2), where $\mathbf M = \mathbf m$ is observed and $\mathbf Y$ is unobserved. If $K \equiv IJ$ is large, then
$$\sqrt K(\hat\omega - \omega) \xrightarrow{d} N_3\big[\mathbf 0,\; K\,\mathcal I_{\mathbf m}^{-1}(\hat\omega)\big].$$
The observed information matrix is given explicitly as follows:
$$\mathcal I_{\mathbf m}(\omega) = D(\omega;\bar t;\mathbf m) - e^{-\bar t I(\mu+\nu)}\,\Delta(\omega;\bar t;\mathbf m)\,\Delta(\omega;\bar t;\mathbf m)^\top;$$
$$D(\omega;\bar t;\mathbf m) = \begin{pmatrix} \dfrac{J[(1-2\pi)\bar p + \pi^2]}{\pi^2(1-\pi)^2} & 0 & 0 \\ 0 & \dfrac{K\,\overline{mp}}{\mu^2} & 0 \\ 0 & 0 & \dfrac{K\,\overline{m(1-p)}}{\nu^2} \end{pmatrix},$$
$$\Delta(\omega;\bar t;\mathbf m) = \left(\frac{(\mu\nu)^{I\bar m_1/2}}{\gamma_1}\,\delta_1,\;\dots,\;\frac{(\mu\nu)^{I\bar m_J/2}}{\gamma_J}\,\delta_J\right),$$
$$\delta_j = \left(\frac{1}{\sqrt{\pi(1-\pi)}},\; I\sqrt{\pi(1-\pi)}\Big(\frac{\bar m_j}{\mu}-\bar t\Big),\; -I\sqrt{\pi(1-\pi)}\Big(\frac{\bar m_j}{\nu}-\bar t\Big)\right)^\top,$$
$$\gamma_j = \pi\big(e^{-\bar t\mu}\mu^{\bar m_j}\big)^I + (1-\pi)\big(e^{-\bar t\nu}\nu^{\bar m_j}\big)^I,$$
where $\bar m_j = m_j/I$, $p_j = P_\omega[Y_j = 1\mid\mathbf m_j]$ as in (A5), $\bar p = \frac1J\sum_j p_j$, $\overline{mp} = \frac1K\sum_j m_j p_j$, and $\overline{m(1-p)} = \bar m - \overline{mp}$.
Elements on the main diagonal of the covariance matrix $K\,\mathcal I_{\mathbf m}^{-1}(\hat\omega)$ are the estimated variances of $\hat\pi,\hat\mu,\hat\nu$ (their square roots are the standard errors), and the off-diagonal elements are their respective covariances. The proof of this theorem appears in Appendix B.
Theorem 1 provides an approximate confidence interval for the parameter of interest, θ = μ / ν :
Proposition 1.
Under the conditions of Theorem 1, an approximate ( 1 α ) confidence interval for θ is given by,
$$\hat\theta \pm \frac{\hat\sigma}{\sqrt K}\,z_{\alpha/2},$$
where $z_{\alpha/2}$ is the $(1-\frac\alpha2)$-quantile of the standard normal distribution,
$$\hat\sigma^2 = K\left(\frac1{\hat\nu},\,-\frac{\hat\mu}{\hat\nu^2}\right)\big(\mathcal I_{22} - \mathcal I_{21}\mathcal I_{11}^{-1}\mathcal I_{12}\big)^{-1}\left(\frac1{\hat\nu},\,-\frac{\hat\mu}{\hat\nu^2}\right)^\top,$$
and I m ( ω ^ ) is partitioned as
$$\mathcal I_{\mathbf m}(\hat\omega) = \begin{pmatrix} \mathcal I_{11} & \mathcal I_{12} \\ \mathcal I_{21} & \mathcal I_{22} \end{pmatrix}$$
with $\mathcal I_{11}: 1\times1$, $\mathcal I_{22}: 2\times2$, $\mathcal I_{12}: 1\times2$, and $\mathcal I_{21} = \mathcal I_{12}^\top$.
Proof. 
An approximate confidence interval for $\theta \equiv \mu/\nu \equiv g(\omega)$ is obtained by propagation of error (the delta method):
$$\begin{aligned} \sqrt K(\hat\theta-\theta) &\xrightarrow{d} N\big[0,\;K\,(\partial_\omega g(\omega)\vert_{\hat\omega})^\top\,\mathcal I_{\mathbf m}^{-1}(\hat\omega)\,\partial_\omega g(\omega)\vert_{\hat\omega}\big] \\ &= N\Big[0,\;K\Big(\frac{\partial g}{\partial\pi},\frac{\partial g}{\partial\mu},\frac{\partial g}{\partial\nu}\Big)\Big\vert_{\hat\omega}\,\mathcal I_{\mathbf m}^{-1}(\hat\omega)\,\Big(\frac{\partial g}{\partial\pi},\frac{\partial g}{\partial\mu},\frac{\partial g}{\partial\nu}\Big)^\top\Big\vert_{\hat\omega}\Big] \\ &= N\Big[0,\;K\Big(0,\frac1{\hat\nu},-\frac{\hat\mu}{\hat\nu^2}\Big)\,\mathcal I_{\mathbf m}^{-1}(\hat\omega)\,\Big(0,\frac1{\hat\nu},-\frac{\hat\mu}{\hat\nu^2}\Big)^\top\Big] \\ &= N\Big[0,\;K\Big(\frac1{\hat\nu},-\frac{\hat\mu}{\hat\nu^2}\Big)\big(\mathcal I_{22}-\mathcal I_{21}\mathcal I_{11}^{-1}\mathcal I_{12}\big)^{-1}\Big(\frac1{\hat\nu},-\frac{\hat\mu}{\hat\nu^2}\Big)^\top\Big] \equiv N(0,\hat\sigma^2), \end{aligned}$$
where the last equality holds because the lower-right $2\times2$ block of $\mathcal I_{\mathbf m}^{-1}(\hat\omega)$ is $(\mathcal I_{22}-\mathcal I_{21}\mathcal I_{11}^{-1}\mathcal I_{12})^{-1}$ and the gradient has zero first coordinate.
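Numerically, the Schur-complement form of the variance in Proposition 1 agrees with computing the full quadratic form directly, since the zero first coordinate of the gradient selects the lower-right block of the inverse information matrix. A short check with hypothetical numbers (the matrix entries below are illustrative, not from any dataset):

```python
import numpy as np

# Hypothetical observed information matrix for omega = (pi, mu, nu).
I_m = np.array([[80.0,  5.0, -3.0],
                [ 5.0, 40.0,  2.0],
                [-3.0,  2.0, 60.0]])
K = 200
mu_hat, nu_hat = 10.0, 5.0
theta_hat = mu_hat / nu_hat

grad = np.array([0.0, 1.0 / nu_hat, -mu_hat / nu_hat**2])  # gradient of g = mu/nu

# Full-inverse form of the asymptotic variance of sqrt(K)(theta_hat - theta)...
var_full = K * grad @ np.linalg.inv(I_m) @ grad

# ...and the Schur-complement form appearing in Proposition 1
I11, I12 = I_m[:1, :1], I_m[:1, 1:]
I21, I22 = I_m[1:, :1], I_m[1:, 1:]
S = I22 - I21 @ np.linalg.inv(I11) @ I12
var_schur = K * grad[1:] @ np.linalg.inv(S) @ grad[1:]

se = np.sqrt(var_schur / K)                    # standard error of theta_hat
ci = (theta_hat - 1.96 * se, theta_hat + 1.96 * se)
```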

4. The Main Problem

We now turn to our main problem of estimating the ratio of Poisson means $\theta \equiv \mu/\nu$ in a zero-inflated Poisson mixture (ZIPM) model, cf. (1)–(4). Again, we derive an EM algorithm for maximum likelihood estimation of the unknown model parameters (Section 4.1) and subsequently provide an explicit formula for standard errors of the maximum likelihood estimators (Section 4.2). We begin with a few preliminaries.
Note that $N_{ij}$ is an $\epsilon$-mixture of $M_{ij}$ and $O_{ij}$, where $O_{ij}$ is degenerate at 0, i.e., $O_{ij} \sim \mathrm{Poisson}(\lambda = 0)$, while $M_{ij}$ is a $\pi$-mixture of $\mathrm{Poisson}(t_i\mu)$ and $\mathrm{Poisson}(t_i\nu)$ rvs. Thus this problem can be viewed as a three-component Poisson mixture model with one degenerate component and non-i.i.d. observations. The three weights are $\pi\epsilon$, $(1-\pi)\epsilon$, and $1-\epsilon$, with the identifiability constraint $0 < \pi \leq 1/2$.
For notational simplicity, set $\omega = (\pi,\epsilon,\mu,\nu)$. Under this three-component mixture model, the unconditional pmf of the observed data $\mathbf N \equiv (N_{ij})$ is
$$f_\omega(\mathbf n) = \prod_{i,j}\big[\pi\epsilon\,e^{-t_i\mu}(t_i\mu)^{n_{ij}} + (1-\pi)\epsilon\,e^{-t_i\nu}(t_i\nu)^{n_{ij}} + (1-\epsilon)\,0^{n_{ij}}\big]\big/\mathbf n!,$$
where 0 0 = 1 . The joint pmf of the unobserved and observed data ( Y , Z , N ) is given by
$$\begin{aligned} f_\omega(\mathbf y,\mathbf z,\mathbf n) &= f_\pi(\mathbf y)\,f_\epsilon(\mathbf z)\,f_{\mu,\nu}(\mathbf n\mid\mathbf y,\mathbf z) \\ &= \prod_j\pi^{y_j}(1-\pi)^{1-y_j}\prod_{i,j}\epsilon^{z_{ij}}(1-\epsilon)^{1-z_{ij}}\cdot\prod_{i,j}\big[e^{-t_i\mu}(t_i\mu)^{n_{ij}}\big]^{y_jz_{ij}}\big[e^{-t_i\nu}(t_i\nu)^{n_{ij}}\big]^{(1-y_j)z_{ij}}\,0^{n_{ij}(1-z_{ij})}/\mathbf n! \\ &= \big[\pi^{\bar y}(1-\pi)^{1-\bar y}\big]^J\,\big[\epsilon^{\bar z}(1-\epsilon)^{1-\bar z}\big]^K\cdot\big[e^{-\overline{tyz}\,\mu}\,\mu^{\overline{ny}}\,e^{-\overline{t(1-y)z}\,\nu}\,\nu^{\overline{n(1-y)}}\big]^K\,\Xi_{\mathbf t}(\mathbf z,\mathbf n), \end{aligned} \tag{9}$$
where y = { y j } , z = { z i j } , n = { n i j } ,
$$\bar z = \frac1K\sum_{i,j}z_{ij}, \qquad \overline{tz} = \frac1K\sum_{i,j}t_iz_{ij}, \qquad \overline{tyz} = \frac1K\sum_{i,j}t_iy_jz_{ij}, \qquad \Xi_{\mathbf t}(\mathbf z,\mathbf n) = \prod_{i,j}t_i^{n_{ij}z_{ij}}\,0^{n_{ij}(1-z_{ij})}/\mathbf n!,$$
and similarly with y replaced by 1 y . To obtain (9) we have used,
$$\overline{nyz} = \frac1K\sum_{i,j}n_{ij}y_jz_{ij} = \frac1K\sum_{i,j}n_{ij}y_j = \frac1K\sum_j n_jy_j = \overline{ny},$$
$$\Xi_{\mathbf t}(\mathbf z,\mathbf n) = \prod_{i,j}t_i^{n_{ij}}\,0^{n_{ij}(1-z_{ij})}/\mathbf n! = \prod_it_i^{n_i}\cdot\prod_{i,j}0^{n_{ij}(1-z_{ij})}/\mathbf n!,$$
which hold on the support of the model because $n_{ij} \neq 0$ implies $z_{ij} = 1$,
and similarly with $\mathbf y$ replaced by $1-\mathbf y$. Thus, $f_\omega(\mathbf y,\mathbf z,\mathbf n)$ determines an exponential family with sufficient statistic $(\bar Y,\,\bar Z,\,\overline{tYZ},\,\overline{t(1-Y)Z},\,\overline{nY},\,\overline{n(1-Y)})$.

4.1. Estimation via the EM Algorithm

To obtain the MLEs ϵ ^ , π ^ , μ ^ , ν ^ and then θ ^ = μ ^ / ν ^ , it is again straightforward (albeit, notationally challenging) to apply the EM algorithm, as follows:
  • E-Step: Since (9) is an exponential family, Bayes formula shows that for l = 0 , 1 , , the ( l + 1 ) -st E-step imputes y j , y j z i j , ( 1 y j ) z i j , and z i j , respectively, as,
$$\begin{aligned} (\hat y_j)_{l+1} &= E_{\hat\omega_l}[Y_j\mid\mathbf n] = P_{\hat\omega_l}[Y_j=1]\,P_{\hat\omega_l}[\mathbf N_j=\mathbf n_j\mid Y_j=1]\,/\,P_{\hat\omega_l}[\mathbf N_j=\mathbf n_j] \\ &= \frac{\hat\pi_l\prod_i\big[\hat\epsilon_l e^{-t_i\hat\mu_l}(t_i\hat\mu_l)^{n_{ij}} + (1-\hat\epsilon_l)0^{n_{ij}}\big]}{\hat\pi_l\prod_i\big[\hat\epsilon_l e^{-t_i\hat\mu_l}(t_i\hat\mu_l)^{n_{ij}} + (1-\hat\epsilon_l)0^{n_{ij}}\big] + (1-\hat\pi_l)\prod_i\big[\hat\epsilon_l e^{-t_i\hat\nu_l}(t_i\hat\nu_l)^{n_{ij}} + (1-\hat\epsilon_l)0^{n_{ij}}\big]} \\ &= \frac{\hat\pi_l}{\hat\pi_l + (1-\hat\pi_l)\,e^{-t_j(\hat\nu_l-\hat\mu_l)}\,(\hat\nu_l/\hat\mu_l)^{n_j}\prod_i\Big[\dfrac{\hat\epsilon_l e^{-t_i\hat\nu_l}+(1-\hat\epsilon_l)}{\hat\epsilon_l e^{-t_i\hat\mu_l}+(1-\hat\epsilon_l)}\Big]^{1_{ij}^=}}, \end{aligned}$$
$$(\widehat{y_jz_{ij}})_{l+1} = E_{\hat\omega_l}[Y_jZ_{ij}\mid\mathbf n] = (\hat y_j)_{l+1}\cdot\frac{\hat\epsilon_l e^{-t_i\hat\mu_l}(t_i\hat\mu_l)^{n_{ij}}}{\hat\epsilon_l e^{-t_i\hat\mu_l}(t_i\hat\mu_l)^{n_{ij}} + (1-\hat\epsilon_l)0^{n_{ij}}} = (\hat y_j)_{l+1}\cdot\Big[1 + \frac{1-\hat\epsilon_l}{\hat\epsilon_l}\,e^{t_i\hat\mu_l}\Big]^{-1_{ij}^=},$$
$$(\widehat{(1-y_j)z_{ij}})_{l+1} = \big[1-(\hat y_j)_{l+1}\big]\cdot\frac{\hat\epsilon_l e^{-t_i\hat\nu_l}(t_i\hat\nu_l)^{n_{ij}}}{\hat\epsilon_l e^{-t_i\hat\nu_l}(t_i\hat\nu_l)^{n_{ij}} + (1-\hat\epsilon_l)0^{n_{ij}}} = \big[1-(\hat y_j)_{l+1}\big]\cdot\Big[1 + \frac{1-\hat\epsilon_l}{\hat\epsilon_l}\,e^{t_i\hat\nu_l}\Big]^{-1_{ij}^=};$$
$$(\hat z_{ij})_{l+1} = (\widehat{y_jz_{ij}})_{l+1} + (\widehat{(1-y_j)z_{ij}})_{l+1}.$$
Note that $(\widehat{y_jz_{ij}})_{l+1} \neq (\hat y_j)_{l+1}\cdot(\hat z_{ij})_{l+1}$ in general.
  • M-Step: From (9), the complete-data MLEs are found to be,
$$\tilde\pi = \bar y, \qquad \tilde\epsilon = \bar z, \qquad \tilde\mu = \frac{\overline{ny}}{\overline{tyz}}, \qquad \tilde\nu = \frac{\overline{n(1-y)}}{\overline{t(1-y)z}}.$$
Thus the ( l + 1 ) -st M-step yields the updated estimates
$$\hat\pi_{l+1} = \frac1J\sum_j(\hat y_j)_{l+1}, \qquad \hat\epsilon_{l+1} = \frac1K\sum_{i,j}(\hat z_{ij})_{l+1},$$
$$\hat\mu_{l+1} = \frac{\overline{n(\hat y)_{l+1}}}{\overline{t(\widehat{yz})_{l+1}}} = \frac{\sum_j n_j(\hat y_j)_{l+1}}{\sum_{i,j}t_i(\widehat{y_jz_{ij}})_{l+1}}, \qquad \hat\nu_{l+1} = \frac{\overline{n(1-\hat y)_{l+1}}}{\overline{t(\widehat{(1-y)z})_{l+1}}} = \frac{\sum_j n_j\big[1-(\hat y_j)_{l+1}\big]}{\sum_{i,j}t_i(\widehat{(1-y_j)z_{ij}})_{l+1}}.$$
Again, as in Aitkin and Rubin [10] (1985, p. 69), the constraint $\pi \leq \frac12$ is ignored and, assuming convergence to an MLE $(\hat\pi,\hat\epsilon,\hat\mu,\hat\nu)$, the same maximum value occurs at $(1-\hat\pi,\hat\epsilon,\hat\nu,\hat\mu)$. Thus, the MLE is taken to be the one whose first component is $\leq\frac12$, say $(\hat\pi,\hat\epsilon,\hat\mu,\hat\nu)$. This concludes the EM algorithm.
Using estimates $(\hat\pi,\hat\epsilon,\hat\mu,\hat\nu)$ from the EM algorithm, we obtain the updated estimator $\hat\theta_{l+1} \equiv \hat\mu_{l+1}/\hat\nu_{l+1}$. Note that unlike in (8), $\hat\theta_{l+1}$ depends on $\{t_i\}$.
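The E-step imputations and M-step updates above can be sketched as follows. This is again an illustrative single-start NumPy implementation with a fixed iteration count and no convergence check; `em_zipm` and the initializations are our own choices.

```python
import numpy as np

def em_zipm(n, t, n_iter=300):
    """EM for the ZIPM model of Section 4.1.

    n : I x J array of observed counts; t : length-I exposure vector.
    """
    pi, eps = 0.3, 0.7
    mu = n[n > 0].mean() * 1.5
    nu = n[n > 0].mean() * 0.5
    pos = n > 0
    for _ in range(n_iter):
        # E-step: column log-likelihoods under each component
        def col_loglik(lam):
            rate = t[:, None] * lam
            cell = np.where(pos, np.log(eps) - rate + n * np.log(rate),
                            np.log(eps * np.exp(-rate) + 1.0 - eps))
            return cell.sum(axis=0)
        ll1 = np.log(pi) + col_loglik(mu)        # Y_j = 1 (mu component)
        ll0 = np.log(1.0 - pi) + col_loglik(nu)  # Y_j = 0 (nu component)
        y = 1.0 / (1.0 + np.exp(np.clip(ll0 - ll1, -700, 700)))  # E[Y_j | n]
        # E[Z_ij | Y_j, n]: 1 on nonzero cells, else a zero-cell posterior
        z1 = np.where(pos, 1.0, 1.0 / (1.0 + (1 - eps) / eps * np.exp(t[:, None] * mu)))
        z0 = np.where(pos, 1.0, 1.0 / (1.0 + (1 - eps) / eps * np.exp(t[:, None] * nu)))
        yz = y[None, :] * z1                     # E[Y_j Z_ij | n]
        wz = (1.0 - y)[None, :] * z0             # E[(1 - Y_j) Z_ij | n]
        # M-step
        pi = y.mean()
        eps = (yz + wz).mean()
        mu = (n.sum(axis=0) * y).sum() / (t[:, None] * yz).sum()
        nu = (n.sum(axis=0) * (1.0 - y)).sum() / (t[:, None] * wz).sum()
    if pi > 0.5:                                 # identifiability: pi <= 1/2
        pi, mu, nu = 1.0 - pi, nu, mu
    return pi, eps, mu, nu, mu / nu
```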

4.2. Standard Error for the MLE θ ^

We provide an explicit formula for approximating standard errors of the unknown parameters $(\pi,\epsilon,\mu,\nu)$, and thereby, of $\theta \equiv \mu/\nu$.
Theorem 2.
Assume the zero-inflated Poisson mixture (ZIPM) model presented in (1)–(4), where $\mathbf N = \mathbf n$ is observed and $\mathbf Y$, $\mathbf Z$, and $\mathbf M$ are unobserved. If $K \equiv IJ$ is large, then
$$\sqrt K(\hat\omega - \omega) \xrightarrow{d} N_4\big[\mathbf 0,\; K\,\mathcal I_{\mathbf n}^{-1}(\hat\omega)\big].$$
The observed information matrix $\mathcal I_{\mathbf n}(\omega) \equiv -\partial_\omega^2\log f_\omega(\mathbf n)$ is given explicitly as follows:
$$\mathcal I_{\mathbf n}(\omega) = T_1 + T_2,$$
$$T_1 = \mathrm{diag}\left(\frac{J[(1-2\pi)\bar q + \pi^2]}{\pi^2(1-\pi)^2},\;\frac{K[(1-2\epsilon)\rho + \epsilon^2]}{\epsilon^2(1-\epsilon)^2},\;\frac{K\,\overline{nq}}{\mu^2},\;\frac{K\,\overline{n(1-q)}}{\nu^2}\right),$$
$$\begin{aligned} T_2 = &-\pi(1-\pi)\sum_j\frac{e^{-t_j(\mu+\nu)}(\mu\nu)^{n_j}\prod_i\big\{[\epsilon(e^{-t_i\mu}-1)+1][\epsilon(e^{-t_i\nu}-1)+1]\big\}^{1_{ij}^=}}{\psi_j^2}\,\phi_j\phi_j^\top \\ &-\epsilon(1-\epsilon)\sum_{i,j}1_{ij}^=\left\{\frac{e^{-t_i\mu}}{[\epsilon(e^{-t_i\mu}-1)+1]^2}\,\chi_i^{(1)}\chi_i^{(1)\top}\,q_j + \frac{e^{-t_i\nu}}{[\epsilon(e^{-t_i\nu}-1)+1]^2}\,\chi_i^{(0)}\chi_i^{(0)\top}\,(1-q_j)\right\}, \end{aligned}$$
$$\rho = \frac{1(\mathbf n)}{K} + \frac{\epsilon}{K}\sum_{i,j}1_{ij}^=\left[\frac{q_j}{\epsilon + (1-\epsilon)e^{t_i\mu}} + \frac{1-q_j}{\epsilon + (1-\epsilon)e^{t_i\nu}}\right],$$
$$q_j = \frac{\pi e^{-t_j\mu}\mu^{n_j}\prod_i\big[\epsilon e^{-t_i\mu} + (1-\epsilon)\big]^{1_{ij}^=}}{\pi e^{-t_j\mu}\mu^{n_j}\prod_i\big[\epsilon e^{-t_i\mu} + (1-\epsilon)\big]^{1_{ij}^=} + (1-\pi)e^{-t_j\nu}\nu^{n_j}\prod_i\big[\epsilon e^{-t_i\nu} + (1-\epsilon)\big]^{1_{ij}^=}},$$
$$\psi_j = \pi e^{-t_j\mu}\mu^{n_j}\prod_i\big[\epsilon(e^{-t_i\mu}-1)+1\big]^{1_{ij}^=} + (1-\pi)e^{-t_j\nu}\nu^{n_j}\prod_i\big[\epsilon(e^{-t_i\nu}-1)+1\big]^{1_{ij}^=},$$
$$\phi_j = \begin{pmatrix} \dfrac{1}{\pi(1-\pi)} \\[6pt] \displaystyle\sum_i 1_{ij}^=\,\frac{e^{-t_i\mu}-e^{-t_i\nu}}{[\epsilon(e^{-t_i\mu}-1)+1][\epsilon(e^{-t_i\nu}-1)+1]} \\[6pt] \dfrac{n_j}{\mu} - t_j - \epsilon\displaystyle\sum_i 1_{ij}^=\,\frac{t_ie^{-t_i\mu}}{\epsilon(e^{-t_i\mu}-1)+1} \\[6pt] -\Big[\dfrac{n_j}{\nu} - t_j - \epsilon\displaystyle\sum_i 1_{ij}^=\,\frac{t_ie^{-t_i\nu}}{\epsilon(e^{-t_i\nu}-1)+1}\Big] \end{pmatrix}, \qquad \chi_i^{(1)} = \begin{pmatrix} 0 \\ \dfrac{1}{\epsilon(1-\epsilon)} \\ -t_i \\ 0 \end{pmatrix}, \qquad \chi_i^{(0)} = \begin{pmatrix} 0 \\ \dfrac{1}{\epsilon(1-\epsilon)} \\ 0 \\ -t_i \end{pmatrix},$$
where $q_j = P_\omega[Y_j = 1\mid\mathbf n_j]$, $\bar q = \frac1J\sum_j q_j$, $\overline{nq} = \frac1K\sum_j n_jq_j$, and $\overline{n(1-q)} = \bar n - \overline{nq}$.
The proof of this theorem also appears in Appendix B.
Again, Theorem 2 provides an approximate confidence interval for the parameter of interest, θ = μ / ν :
Proposition 2.
Under the conditions of Theorem 2, an approximate ( 1 α ) confidence interval for θ is given by,
$$\hat\theta \pm \frac{\hat\tau}{\sqrt K}\,z_{\alpha/2},$$
where $z_{\alpha/2}$ is the $(1-\frac\alpha2)$-quantile of the standard normal distribution,
$$\hat\tau^2 = K\left(\frac1{\hat\nu},\,-\frac{\hat\mu}{\hat\nu^2}\right)\big(\mathcal I_{22} - \mathcal I_{21}\mathcal I_{11}^{-1}\mathcal I_{12}\big)^{-1}\left(\frac1{\hat\nu},\,-\frac{\hat\mu}{\hat\nu^2}\right)^\top,$$
and I n ( ω ^ ) is partitioned as
$$\mathcal I_{\mathbf n}(\hat\omega) = \begin{pmatrix} \mathcal I_{11} & \mathcal I_{12} \\ \mathcal I_{21} & \mathcal I_{22} \end{pmatrix},$$
with $\mathcal I_{11}: 2\times2$, $\mathcal I_{22}: 2\times2$, $\mathcal I_{12}: 2\times2$, and $\mathcal I_{21} = \mathcal I_{12}^\top$.
Proof. 
Again, an approximate confidence interval for $\theta \equiv \mu/\nu \equiv g(\omega)$ is obtained by propagation of error. For $\hat\theta = \hat\mu/\hat\nu$,
$$\begin{aligned} \sqrt K(\hat\theta-\theta) &\xrightarrow{d} N\big[0,\;K\,(\partial_\omega g(\omega)\vert_{\hat\omega})^\top\,\mathcal I_{\mathbf n}^{-1}(\hat\omega)\,\partial_\omega g(\omega)\vert_{\hat\omega}\big] \\ &= N\Big[0,\;K\Big(0,\,0,\,\frac1{\hat\nu},\,-\frac{\hat\mu}{\hat\nu^2}\Big)\,\mathcal I_{\mathbf n}^{-1}(\hat\omega)\,\Big(0,\,0,\,\frac1{\hat\nu},\,-\frac{\hat\mu}{\hat\nu^2}\Big)^\top\Big] \\ &= N\Big[0,\;K\Big(\frac1{\hat\nu},\,-\frac{\hat\mu}{\hat\nu^2}\Big)\big(\mathcal I_{22}-\mathcal I_{21}\mathcal I_{11}^{-1}\mathcal I_{12}\big)^{-1}\Big(\frac1{\hat\nu},\,-\frac{\hat\mu}{\hat\nu^2}\Big)^\top\Big] \equiv N(0,\hat\tau^2). \end{aligned}$$

5. Simulation and Data Analysis

The frequentist estimation procedure for ZIPM models in Section 4 is now applied to simulated and real data.

5.1. Simulation Study

Simulated data is used to assess the estimation error and confidence interval coverage in various regimes. Across all simulations, we set $\mu = 10$ and $\nu = 5$ so that $\theta = 2$. Across simulations we vary I and J to assess accuracy as the overall amount of available data changes, vary $\pi$ to assess accuracy as the relative prevalence between the more or less prevalent groups becomes more severe, and vary $\epsilon$ to assess accuracy as zero-inflation becomes more severe. Specifically, for each combination of $I \in \{5, 10, 20, 40, 80\}$, $J \in \{5, 10, 20, 40, 80\}$, $\pi \in \{0.1, 0.25, 0.4\}$, and $\epsilon \in \{0.6, 0.7, 0.8\}$, we generate 200 independent datasets and estimate $\theta$ using the EM algorithm from Section 4.1 (with 20 random starts), as well as a 95% confidence interval using Proposition 2. Estimation error and nominal coverage of confidence intervals are shown in Figure 1 and Figure 2, respectively. (The information shown in these figures is presented in tabular form in Appendix C.)
We observe that the methods derived in Section 4 yield accurate estimates of θ and well-calibrated confidence intervals. Figure 1 shows that mean absolute error decreases in I, J, π , and ϵ , and is generally small. The decrease in error as π and ϵ increase may be attributed to the corresponding increase in sample size for the less prevalent component and the decreasing amount of zero-inflation, respectively.
Figure 2 shows that coverage hovers close to 95% for most combinations of I, J, $\pi$, and $\epsilon$. We observe that error is somewhat larger and coverage is somewhat inaccurate when $I, J \in \{5, 10\}$. However, these inaccuracies are modest given the highly limited data availability in these simulation scenarios.

5.2. Analysis of Frigatebird Nest Counts

We study ecological count data on frigatebirds in the Coral Sea Islands off the coast of Northeast Australia, as described in [1]. (The specific data studied herein was provided via email by author G. Barry Baker.) They obtained counts of frigatebird nests over 11 standardized sites across 4 separate time points.
This data is relevant to our study for two reasons: First, the ecological count data is zero-inflated. Of the 44 unique combinations of sites and time points in which data was collected, 7 had 0 nests (about 15.9%). Second, the frigatebird species has two subspecies, least (LFB) and greater (GFB). In the observed data, some nests were specified as LFB nests or GFB nests, but the majority were unidentified (Table 1). We analyze counts of only unidentified frigatebird nests, N i j , by site i and time point j (Table 2).
We applied our work from Section 4 to the unidentified nest counts in order to estimate the ratio $\theta \equiv \mu/\nu$, where $\mu$ and $\nu$ denote the expected numbers of nests per site for the less prevalent LFB and more prevalent GFB, respectively (based on the numbers of nests that could be identified; see Table 1). We set $t_i = 1$ for each site i in the absence of additional information on each site. (To assess the sensitivity of our results to this assumption, we ran an additional analysis in which $t_i$ is iteratively updated during the EM algorithm. We find estimates of $\theta$ are nearly unchanged; see Appendix C for details.) During estimation, the EM algorithm was run with 1000 random initializers.
Results appear in Table 3. We estimate that 25% of nests belong to LFB ($\hat\pi = 0.25$), 75% belong to GFB, and that 16% of observed nest counts are zero-inflated ($1-\hat\epsilon = 0.16$). The EM algorithm yields the MLE $\hat\theta = 3.65$ for the ratio $\theta$; the 95% confidence interval for $\theta$ is (3.23, 4.08).

6. Conclusions

In this paper, we studied the zero-inflated Poisson mixture (ZIPM) model in the frequentist setting. In addition to deriving an EM algorithm for point-estimation of model parameters, we stated an explicit formula for estimating standard errors of the MLEs. As a preliminary, we derived analogous results for the commonly-used, two-component Poisson mixture model. Although somewhat complex notationally, our formulae are straightforward to apply.
Our results were applied to real data on frigatebirds in the Coral Sea Islands off the coast of Northeast Australia, where the ratio between two subspecies is of interest to ecologists. In this setting, knowledge of which subspecies is more prevalent renders the model identifiable. We then used only unlabeled, zero-inflated nest count data to estimate (i) the relative abundance of each subspecies, (ii) the rate of zero-inflation, (iii) the mean number of nests per site for each subspecies, and (iv) the ratio of mean nests per site between the subspecies. We expect the ZIPM model to be useful in other ecological count data settings. Hence, our work provides straightforward ways for practitioners to estimate key parameters of interest.

Author Contributions

Conceptualization, M.P. and M.D.P.; methodology, M.P. and M.D.P.; software, M.P.; validation, M.P. and M.D.P.; writing—original draft preparation, M.P. and M.D.P.; writing—review and editing, M.P. and M.D.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code and data required to reproduce our analyses can be found at https://github.com/pearce790/ZIPM, accessed on 2 July 2025.

Acknowledgments

We thank the three anonymous referees for their helpful feedback during the peer review process. Furthermore, we are grateful to Barry Baker for providing the frigatebird data used in Section 5.2, and to Jon Wellner for his generous and always-insightful comments.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Conditional ZIPM = ZTP?

Consider the following two subsets of the index set K and two subarrays of the data array N ( N i j ) :
$$\Omega_{\mathbf Z} = \{(i,j)\mid Z_{ij}=1\}, \qquad \Omega_{\mathbf N} = \{(i,j)\mid N_{ij}\neq0\},$$
$$\mathbf N_{\mathbf Z} = (N_{ij}\mid Z_{ij}=1) = (M_{ij}\mid Z_{ij}=1), \qquad \mathbf N_{\neq} = (N_{ij}\mid N_{ij}\neq0) = (M_{ij}\mid M_{ij}\neq0).$$
Both $\Omega_{\mathbf Z}$ and $\Omega_{\mathbf N}$ are random subsets; $\Omega_{\mathbf Z}$ is unobserved, $\Omega_{\mathbf N}$ is observed, and $\Omega_{\mathbf N} \subseteq \Omega_{\mathbf Z}$, so $\mathbf N_{\neq} \subseteq \mathbf N_{\mathbf Z}$. Because $\mathbf M$ is independent of $\mathbf Z$, $\mathbf N_{\mathbf Z}$ is a random subarray of the i.n.i.d. array $(M_{ij})$, where membership in $\mathbf N_{\mathbf Z}$ depends only on $\mathbf Z$. Thus $\mathbf N_{\neq}$ is also a (smaller) random subarray of the i.n.i.d. array $(M_{ij})$, where membership in $\mathbf N_{\neq}$ depends on both $\mathbf Z$ and the events $\{M_{ij} \neq 0\}$.
The latter fact suggests a question: Is the conditional distribution of the two-component ZIPM rv $N_{ij}$ given $N_{ij} \neq 0$ the same as the distribution of the mixture of the conditional distributions of the two Poisson components given that each is non-zero? The latter conditional distribution is the well-known zero-truncated Poisson (ZTP) distribution, also called positive Poisson, which has been thoroughly studied [15]. The ZTP distribution model also is an exponential family, with pmf given by
$$g_\lambda(x) = \frac{\lambda^x}{(e^\lambda - 1)\,x!}, \qquad x = 1, 2, \dots.$$
If the answer to the above question is yes, then estimation of π , μ , ν and thus θ could be based on only the set of non-zero N i j . That is, discard all 0’s and view the remaining N i j as π -mixtures of two ZTP components with parameters t i μ and t i ν . Because this involves only two mixture components rather than three as above, both being exponential families, and neither is degenerate, estimation methods such as the EM algorithm would be easier to carry out.
Unfortunately the answer to the question is no. If we abbreviate N i j by N, M i j by M, and Z i j by Z, then the question can be expressed as follows:
Is
$$P[N = x \mid N \neq 0] = \frac{\pi\,\mu^x}{(e^\mu - 1)\,x!} + \frac{(1-\pi)\,\nu^x}{(e^\nu - 1)\,x!}, \qquad x = 1, 2, \dots?$$
However, for x 1 ,
$$P[N = x \mid N \neq 0] = \frac{P[ZM = x,\, ZM \neq 0]}{P[ZM \neq 0]} = \frac{P[M = x,\, Z = 1,\, M \neq 0]}{P[Z = 1,\, M \neq 0]} = \frac{P[M = x,\, M \neq 0]}{P[M \neq 0]} = \frac{\dfrac{\pi\mu^xe^{-\mu}}{x!} + \dfrac{(1-\pi)\nu^xe^{-\nu}}{x!}}{1 - \pi e^{-\mu} - (1-\pi)e^{-\nu}},$$
since M and Z are independent, so the question becomes:
Is
$$\frac{\dfrac{\pi\mu^xe^{-\mu}}{x!} + \dfrac{(1-\pi)\nu^xe^{-\nu}}{x!}}{1 - \pi e^{-\mu} - (1-\pi)e^{-\nu}} = \frac{\pi\mu^x}{(e^\mu - 1)\,x!} + \frac{(1-\pi)\nu^x}{(e^\nu - 1)\,x!}, \qquad x = 1, 2, \dots?$$
After some algebra, this equation simplifies to
$$\left(\frac{\mu}{\nu}\right)^x = \frac{e^\mu - 1}{e^\nu - 1},$$
which cannot hold for all $x \geq 1$ unless $\mu = \nu$.
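The two pmfs in the displayed question can also be compared numerically. A quick sketch (the parameter values are illustrative) confirms that both are proper pmfs yet differ by a non-negligible amount:

```python
import math

def pois(x, lam):
    """Poisson pmf."""
    return lam**x * math.exp(-lam) / math.factorial(x)

def zipm_cond(x, pi, mu, nu):
    """P[N = x | N != 0] for a single ZIPM cell (t_i = 1); note that the
    zero-inflation parameter epsilon cancels, as in the derivation above."""
    num = pi * pois(x, mu) + (1 - pi) * pois(x, nu)
    den = 1 - pi * math.exp(-mu) - (1 - pi) * math.exp(-nu)
    return num / den

def ztp_mix(x, pi, mu, nu):
    """pi-mixture of two zero-truncated Poisson pmfs g_mu and g_nu."""
    g = lambda x, lam: lam**x / ((math.exp(lam) - 1) * math.factorial(x))
    return pi * g(x, mu) + (1 - pi) * g(x, nu)

pi, mu, nu = 0.25, 2.0, 0.5
diff = max(abs(zipm_cond(x, pi, mu, nu) - ztp_mix(x, pi, mu, nu))
           for x in range(1, 40))
```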

Appendix B. Proofs of Theorems 1 and 2

Proof of Theorem 1.
As previously noted, it follows from [12,13,14] that for large K,
$$\sqrt K(\hat\omega - \omega) \xrightarrow{d} N_3\big[\mathbf 0,\; K\,\mathcal I_{\mathbf m}^{-1}(\omega)\big], \tag{A1}$$
$$\mathcal I_{\mathbf m}(\omega) \equiv -\partial_\omega^2\log f_\omega(\mathbf m) = E_\omega\big[-\partial_\omega^2\log f_\omega(\mathbf m)\mid\mathbf m\big] = E_\omega\big[-\partial_\omega^2\log f_\omega(\mathbf Y,\mathbf m)\mid\mathbf m\big] + E_\omega\big[\partial_\omega^2\log f_\omega(\mathbf Y\mid\mathbf m)\mid\mathbf m\big]. \tag{A2}$$
From (6),
$$\begin{aligned} \log f_\omega(\mathbf Y,\mathbf m) &= J\big[\bar Y\log\pi + (1-\bar Y)\log(1-\pi)\big] + K\big[\overline{mY}\log\mu - \bar t\,\bar Y\mu + \overline{m(1-Y)}\log\nu - \bar t(1-\bar Y)\nu\big] + h_{\mathbf t}(\mathbf m); \\ \partial_\omega\log f_\omega(\mathbf Y,\mathbf m) &= \left(\frac{J(\bar Y-\pi)}{\pi(1-\pi)},\; K\Big(\frac{\overline{mY}}{\mu} - \bar t\,\bar Y\Big),\; K\Big(\frac{\overline{m(1-Y)}}{\nu} - \bar t(1-\bar Y)\Big)\right)^\top; \\ -\partial_\omega^2\log f_\omega(\mathbf Y,\mathbf m) &= \begin{pmatrix} \dfrac{J[(1-2\pi)\bar Y + \pi^2]}{\pi^2(1-\pi)^2} & 0 & 0 \\ 0 & \dfrac{K\,\overline{mY}}{\mu^2} & 0 \\ 0 & 0 & \dfrac{K\,\overline{m(1-Y)}}{\nu^2} \end{pmatrix}; \end{aligned} \tag{A3}$$
where m Y ¯ and m ( 1 Y ) ¯ are defined similarly to m y ¯ and h t ( m ) does not depend on ω . Furthermore by (6), for fixed m ,
$$f_\omega(\mathbf y\mid\mathbf m) = f_\omega(\mathbf y,\mathbf m)/f_\omega(\mathbf m) \propto \prod_j\big[\pi e^{-\bar tI\mu}\mu^{m_j}\big]^{y_j}\big[(1-\pi)e^{-\bar tI\nu}\nu^{m_j}\big]^{1-y_j}, \tag{A4}$$
hence Y 1 , , Y J are conditionally independent given m with
$$[Y_j\mid\mathbf m] \sim \mathrm{Bernoulli}(p_j), \qquad p_j \equiv p_j(\omega;\bar t;\bar m_j) = \frac{\pi e^{-\bar tI\mu}\mu^{m_j}}{\pi e^{-\bar tI\mu}\mu^{m_j} + (1-\pi)e^{-\bar tI\nu}\nu^{m_j}} \tag{A5}$$
$$= \frac{\pi\big(e^{-\bar t\mu}\mu^{\bar m_j}\big)^I}{\pi\big(e^{-\bar t\mu}\mu^{\bar m_j}\big)^I + (1-\pi)\big(e^{-\bar t\nu}\nu^{\bar m_j}\big)^I}, \tag{A6}$$
where $\bar m_j = m_j/I$. From (A5),
$$E_\omega[\bar Y \mid \boldsymbol m] = \bar p; \qquad E_\omega[\overline{mY} \mid \boldsymbol m] = \overline{mp}; \qquad E_\omega[\overline{m(1-Y)} \mid \boldsymbol m] = \overline{m(1-p)};$$
where $\overline{mp}$ and $\overline{m(1-p)}$ are defined similarly to $\overline{my}$ and $\overline{m(1-y)}$, and
$$\bar p = \frac{1}{J}\sum_j p_j, \qquad \overline{p(1-p)} = \frac{1}{J}\sum_j p_j(1-p_j).$$
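In practice the posterior probabilities $p_j$ of (A6) are best evaluated on the log scale to avoid underflow when $I$ is large. A minimal sketch, with all parameter values assumed purely for illustration:

```python
import math

# Hypothetical values of pi, mu, nu, t-bar, and I; mbar_j is the mean count
# at site j. Working on the log scale keeps the ratio in (A6) stable.
pi, mu, nu, tbar, I = 0.3, 2.0, 6.0, 1.0, 10

def p_j(mbar_j):
    """Posterior probability that site j belongs to the mu component."""
    la = math.log(pi) + I * (mbar_j * math.log(mu) - tbar * mu)
    lb = math.log(1 - pi) + I * (mbar_j * math.log(nu) - tbar * nu)
    hi = max(la, lb)                      # log-sum-exp trick
    return math.exp(la - hi) / (math.exp(la - hi) + math.exp(lb - hi))

print(p_j(2.0), p_j(6.0))   # mean counts near mu favor the first component
```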
Furthermore,
$$E_\omega\left[(1-2\pi)\bar Y + \pi^2 \mid \boldsymbol m\right] = (1-2\pi)\bar p + \pi^2.$$
Thus from (A3), the first term in (A2) is given by
$$E_\omega\left[-\partial_\omega^2 \log f_\omega(\boldsymbol Y, \boldsymbol m) \mid \boldsymbol m\right] = \begin{pmatrix} \dfrac{J\left[(1-2\pi)\bar p + \pi^2\right]}{\pi^2(1-\pi)^2} & 0 & 0 \\ 0 & \dfrac{K\,\overline{mp}}{\mu^2} & 0 \\ 0 & 0 & \dfrac{K\,\overline{m(1-p)}}{\nu^2} \end{pmatrix} =: D(\omega; \bar t; \boldsymbol m). \quad \text{(A7)}$$
The second term in (A2) is obtained as follows: From (A5),
$$\begin{aligned}
f_\omega(\boldsymbol Y \mid \boldsymbol m) &= \prod_j p_j^{Y_j} (1-p_j)^{1-Y_j}; \\
\log f_\omega(\boldsymbol Y \mid \boldsymbol m) &= \sum_j \left[ Y_j \log p_j + (1-Y_j)\log(1-p_j) \right]; \\
\partial_\omega \log f_\omega(\boldsymbol Y \mid \boldsymbol m) &= \sum_j \frac{(Y_j - p_j)}{p_j(1-p_j)}\, \partial_\omega p_j; \\
\partial_\omega^2 \log f_\omega(\boldsymbol Y \mid \boldsymbol m) &= \sum_j \left\{ \frac{(Y_j - p_j)\,\partial_\omega^2 p_j - (\partial_\omega p_j)(\partial_\omega p_j)^\top}{p_j(1-p_j)} - \frac{(Y_j - p_j)\,(\partial_\omega p_j)\left[\partial_\omega \left(p_j(1-p_j)\right)\right]^\top}{p_j^2(1-p_j)^2} \right\}; \\
E_\omega\left[\partial_\omega^2 \log f_\omega(\boldsymbol Y \mid \boldsymbol m) \mid \boldsymbol m\right] &= -\sum_j \frac{(\partial_\omega p_j)(\partial_\omega p_j)^\top}{p_j(1-p_j)} = \sum_j (\partial_\omega \log p_j)\left[\partial_\omega \log(1-p_j)\right]^\top.
\end{aligned}$$
From (A6),
$$\log p_j = \log \pi - I\bar t\,\mu + I \bar m_j \log \mu - \log \gamma_j, \qquad \gamma_j \equiv \gamma_j(\omega;\bar t;\bar m_j) := \pi\left(e^{-\bar t\mu}\mu^{\bar m_j}\right)^I + (1-\pi)\left(e^{-\bar t\nu}\nu^{\bar m_j}\right)^I,$$
from which it can be shown that
$$\begin{aligned}
\partial_\omega \log p_j &= \frac{\left(e^{-\bar t\nu}\nu^{\bar m_j}\right)^I}{\gamma_j}\left( \frac{1}{\pi},\;\; (1-\pi)\left(\frac{\bar m_j}{\mu} - \bar t\right) I,\;\; -(1-\pi)\left(\frac{\bar m_j}{\nu} - \bar t\right) I \right)^{\!\top}, \\
\partial_\omega \log(1-p_j) &= -\frac{p_j}{1-p_j}\,\partial_\omega \log p_j = -\frac{\pi}{1-\pi}\left(e^{\bar t(\nu-\mu)}\left(\frac{\mu}{\nu}\right)^{\bar m_j}\right)^{\!I} \partial_\omega \log p_j \\
&= -\frac{\left(e^{-\bar t\mu}\mu^{\bar m_j}\right)^I}{\gamma_j}\left( \frac{1}{1-\pi},\;\; \pi\left(\frac{\bar m_j}{\mu} - \bar t\right) I,\;\; -\pi\left(\frac{\bar m_j}{\nu} - \bar t\right) I \right)^{\!\top}, \\
(\partial_\omega \log p_j)\left[\partial_\omega \log(1-p_j)\right]^\top &= -\frac{\left[e^{-\bar t(\mu+\nu)}(\mu\nu)^{\bar m_j}\right]^I}{\gamma_j^2}\, \delta_j \delta_j^\top, \\
\delta_j \equiv \delta_j(\omega;\bar t;\bar m_j) &:= \left( \frac{1}{\sqrt{\pi(1-\pi)}},\;\; I\sqrt{\pi(1-\pi)}\left(\frac{\bar m_j}{\mu} - \bar t\right),\;\; -I\sqrt{\pi(1-\pi)}\left(\frac{\bar m_j}{\nu} - \bar t\right) \right)^{\!\top}.
\end{aligned}$$
Therefore
$$E_\omega\left[\partial_\omega^2 \log f_\omega(\boldsymbol Y \mid \boldsymbol m) \mid \boldsymbol m\right] = -\sum_j \frac{\left[e^{-\bar t(\mu+\nu)}(\mu\nu)^{\bar m_j}\right]^I}{\gamma_j^2}\,\delta_j\delta_j^\top. \quad \text{(A8)}$$
Thus by (A2), (A7), and (A8), the observed information matrix is
$$I_m(\omega) = D(\omega;\bar t;\boldsymbol m) - e^{-\bar t I(\mu+\nu)}\,\Delta(\omega;\bar t;\boldsymbol m)\,\Delta(\omega;\bar t;\boldsymbol m)^\top; \qquad \Delta(\omega;\bar t;\boldsymbol m) := \left( \frac{(\mu\nu)^{I\bar m_1/2}}{\gamma_1}\,\delta_1, \;\dots,\; \frac{(\mu\nu)^{I\bar m_J/2}}{\gamma_J}\,\delta_J \right).$$
Finally, we may now estimate $I_m(\omega)$ in the normal approximation
$$\sqrt{K}\,(\hat\omega - \omega) \;\approx\; N_3\!\left[0,\; K\, I_m^{-1}(\omega)\right]$$
by replacing $\omega$ in $I_m(\omega)$ by its MLE $\hat\omega \equiv (\hat\pi, \hat\mu, \hat\nu)$ to obtain
$$\sqrt{K}\,(\hat\omega - \omega) \;\approx\; N_3\!\left[0,\; K\, I_m^{-1}(\hat\omega)\right].$$
This requires replacing $\pi, \mu, \nu$ by $\hat\pi, \hat\mu, \hat\nu$ wherever the former three appear in the entries of $I_m(\omega)$, including in $p_j$, $\delta_j$, and $\gamma_j$. For large $K$ the $3\times3$ matrix $I_m(\hat\omega)$ is positive definite, hence invertible. □
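The closed-form observed information of Theorem 1 can be sanity-checked numerically. The sketch below (Python; simulated data with $t_i = 1$, illustrative sizes, and the true parameters standing in for the MLE; the helper names `loglik` and `observed_info` are ours) builds the observed information by finite differences of the observed-data log-likelihood of the two-component Poisson mixture and converts it to standard errors, mirroring the normal approximation above:

```python
import math
import numpy as np

rng = np.random.default_rng(1)
I, J = 20, 40                              # sites x time points (illustrative)
pi0, mu0, nu0 = 0.3, 2.0, 6.0              # assumed parameter values
y = rng.random(J) < pi0                    # latent component labels
m = rng.poisson(np.where(y, mu0, nu0), size=(I, J))
lgf = np.vectorize(math.lgamma)(m + 1.0)   # log(m_ij!) precomputed once

def loglik(w):
    """Observed-data log-likelihood; each column j is mixed over components."""
    pi, mu, nu = w
    a = math.log(pi) + np.sum(m * math.log(mu) - mu - lgf, axis=0)
    b = math.log(1 - pi) + np.sum(m * math.log(nu) - nu - lgf, axis=0)
    hi = np.maximum(a, b)                  # log-sum-exp per column
    return float(np.sum(hi + np.log(np.exp(a - hi) + np.exp(b - hi))))

def observed_info(f, w, h=1e-4):
    """Negative Hessian of f at w via central finite differences."""
    H = np.empty((3, 3))
    for r in range(3):
        for c in range(3):
            dr, dc = h * np.eye(3)[r], h * np.eye(3)[c]
            H[r, c] = (f(w + dr + dc) - f(w + dr - dc)
                       - f(w - dr + dc) + f(w - dr - dc)) / (4 * h * h)
    return -H

w_hat = np.array([pi0, mu0, nu0])          # stand-in for the MLE omega-hat
info = observed_info(loglik, w_hat)
se = np.sqrt(np.diag(np.linalg.inv(info))) # standard errors of (pi, mu, nu)
print(se)
```

With an analytic implementation of $D(\omega;\bar t;\boldsymbol m)$ and $\Delta(\omega;\bar t;\boldsymbol m)$ in hand, agreement with this finite-difference matrix provides a useful unit test.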
Proof of Theorem 2.
It follows from [12,13,14] that for large $K$,
$$\sqrt{K}\,(\hat\omega - \omega) \;\approx\; N_4\!\left[0,\; K\, I_n^{-1}(\omega)\right],$$
where $I_n(\omega)$ is the $4\times4$ observed information matrix. Then,
$$I_n(\omega) \equiv -\partial_\omega^2 \log f_\omega(\boldsymbol n) = E_\omega\!\left[-\partial_\omega^2 \log f_\omega(\boldsymbol n) \mid \boldsymbol n\right] = E_\omega\!\left[-\partial_\omega^2 \log f_\omega(\boldsymbol Y, \boldsymbol Z, \boldsymbol n) \mid \boldsymbol n\right] + E_\omega\!\left[\partial_\omega^2 \log f_\omega(\boldsymbol Y, \boldsymbol Z \mid \boldsymbol n) \mid \boldsymbol n\right] \equiv T_1 + T_2. \quad \text{(A9)}$$
By (9),
$$\begin{aligned}
\log f_\omega(\boldsymbol Y, \boldsymbol Z, \boldsymbol n) &= J\left[\bar Y\log\pi + (1-\bar Y)\log(1-\pi)\right] + K\left[\bar Z\log\epsilon + (1-\bar Z)\log(1-\epsilon)\right] \\
&\quad + K\left[\overline{nY}\log\mu - \overline{tYZ}\,\mu + \overline{n(1-Y)}\log\nu - \overline{t(1-Y)Z}\,\nu\right] + \log\Xi_t(\boldsymbol n, \boldsymbol z); \\
\partial_\omega \log f_\omega(\boldsymbol Y, \boldsymbol Z, \boldsymbol n) &= \left( \frac{J(\bar Y - \pi)}{\pi(1-\pi)},\;\; \frac{K(\bar Z - \epsilon)}{\epsilon(1-\epsilon)},\;\; \frac{K\,\overline{nY}}{\mu} - K\,\overline{tYZ},\;\; \frac{K\,\overline{n(1-Y)}}{\nu} - K\,\overline{t(1-Y)Z} \right)^{\!\top}; \\
-\partial_\omega^2 \log f_\omega(\boldsymbol Y, \boldsymbol Z, \boldsymbol n) &= \mathrm{diag}\!\left( \frac{J\left[(1-2\pi)\bar Y + \pi^2\right]}{\pi^2(1-\pi)^2},\;\; \frac{K\left[(1-2\epsilon)\bar Z + \epsilon^2\right]}{\epsilon^2(1-\epsilon)^2},\;\; \frac{K\,\overline{nY}}{\mu^2},\;\; \frac{K\,\overline{n(1-Y)}}{\nu^2} \right). \quad \text{(A10)}
\end{aligned}$$
Furthermore by (9), with $\boldsymbol n$ fixed,
$$f_\omega(\boldsymbol y, \boldsymbol z \mid \boldsymbol n) = f_\omega(\boldsymbol y, \boldsymbol z, \boldsymbol n)/f_\omega(\boldsymbol n) \;\propto\; \prod_j \pi^{y_j}(1-\pi)^{1-y_j} \prod_{i,j}\left\{\epsilon\left[e^{-t_i\mu}(t_i\mu)^{n_{ij}}\right]^{y_j}\left[e^{-t_i\nu}(t_i\nu)^{n_{ij}}\right]^{1-y_j}\right\}^{z_{ij}}\left[(1-\epsilon)\,0^{n_{ij}}\right]^{1-z_{ij}}.$$
From this, $\{Z_{ij}\}$ are conditionally independent given $\boldsymbol Y$ and $\boldsymbol N$, with
$$[Z_{ij} \mid \boldsymbol y, \boldsymbol n] \sim \mathrm{Bernoulli}(r_{ij}), \qquad r_{ij} \equiv r(t_i; y_j, n_{ij}) := \frac{\epsilon\left(e^{-t_i\mu}\mu^{n_{ij}}\right)^{y_j}\left(e^{-t_i\nu}\nu^{n_{ij}}\right)^{1-y_j} t_i^{n_{ij}}}{\epsilon\left(e^{-t_i\mu}\mu^{n_{ij}}\right)^{y_j}\left(e^{-t_i\nu}\nu^{n_{ij}}\right)^{1-y_j} t_i^{n_{ij}} + (1-\epsilon)\,0^{n_{ij}}} \quad \text{(A11)}$$
$$= 1 - 0^{n_{ij}} + 0^{n_{ij}}\,\frac{\epsilon\left(e^{-t_i\mu}\mu^{n_{ij}}\right)^{y_j}\left(e^{-t_i\nu}\nu^{n_{ij}}\right)^{1-y_j} t_i^{n_{ij}}}{\epsilon\left(e^{-t_i\mu}\mu^{n_{ij}}\right)^{y_j}\left(e^{-t_i\nu}\nu^{n_{ij}}\right)^{1-y_j} t_i^{n_{ij}} + (1-\epsilon)},$$
and
$$\begin{aligned}
f_\omega(\boldsymbol y \mid \boldsymbol n) &\propto \sum_{\boldsymbol z} f_\omega(\boldsymbol y, \boldsymbol z \mid \boldsymbol n) \propto \prod_j \pi^{y_j}(1-\pi)^{1-y_j}\cdot\prod_{i,j}\Big\{\epsilon\big(e^{-t_i\mu}\mu^{n_{ij}}\big)^{y_j}\big(e^{-t_i\nu}\nu^{n_{ij}}\big)^{1-y_j}t_i^{n_{ij}} + (1-\epsilon)\,0^{n_{ij}}\Big\} \\
&= \prod_j \pi^{y_j}(1-\pi)^{1-y_j}\cdot\prod_{\{i,j \mid n_{ij}\neq 0\}}\Big\{\epsilon\big(e^{-t_i\mu}\mu^{n_{ij}}\big)^{y_j}\big(e^{-t_i\nu}\nu^{n_{ij}}\big)^{1-y_j}t_i^{n_{ij}}\Big\}\cdot\prod_{\{i,j \mid n_{ij}=0\}}\Big[\epsilon\, e^{-t_i\mu y_j}e^{-t_i\nu(1-y_j)} + (1-\epsilon)\Big] \\
&\propto \prod_j\big[\pi\,e^{-t_j\mu}\mu^{n_j}\big]^{y_j}\big[(1-\pi)\,e^{-t_j\nu}\nu^{n_j}\big]^{1-y_j}\cdot\prod_j\prod_i\big[\epsilon\,e^{-t_i\mu y_j}e^{-t_i\nu(1-y_j)}+(1-\epsilon)\big]^{1^{=}_{ij}} \\
&= \prod_j q_j^{y_j}(1-q_j)^{1-y_j},
\end{aligned}$$
where $1^{=}_{ij} := 1\{n_{ij}=0\}$, $t_j := \sum_{\{i \mid n_{ij}\neq 0\}} t_i$, $n_j := \sum_i n_{ij}$, and
$$q_j \equiv q_j(\boldsymbol n_j) \equiv q_j(\pi,\epsilon,\mu,\nu;\boldsymbol t;\boldsymbol n_j) := \frac{\pi\,e^{-t_j\mu}\mu^{n_j}\prod_i\big[\epsilon\,e^{-t_i\mu}+(1-\epsilon)\big]^{1^{=}_{ij}}}{\pi\,e^{-t_j\mu}\mu^{n_j}\prod_i\big[\epsilon\,e^{-t_i\mu}+(1-\epsilon)\big]^{1^{=}_{ij}} + (1-\pi)\,e^{-t_j\nu}\nu^{n_j}\prod_i\big[\epsilon\,e^{-t_i\nu}+(1-\epsilon)\big]^{1^{=}_{ij}}}. \quad \text{(A12)}$$
Thus $\{Y_j\}$ are conditionally independent given $\boldsymbol N$, with
$$[Y_j \mid \boldsymbol n] \sim \mathrm{Bernoulli}(q_j).$$
Therefore $E_\omega[\bar Y \mid \boldsymbol n] = \bar q \equiv \bar q(\boldsymbol n)$, while
$$E_\omega[\overline{nY} \mid \boldsymbol n] = \frac{1}{K}\sum_{i,j} n_{ij}\,E_\omega[Y_j \mid \boldsymbol n] = \frac{1}{K}\sum_{i,j} n_{ij}\,q_j = \overline{nq}; \qquad E_\omega[\overline{n(1-Y)} \mid \boldsymbol n] = \frac{1}{K}\sum_{i,j} n_{ij}(1-q_j) = \overline{n(1-q)}.$$
Furthermore,
$$E_\omega\left[(1-2\pi)\bar Y + \pi^2 \mid \boldsymbol n\right] = (1-2\pi)\bar q + \pi^2.$$
Next,
$$E_\omega[\bar Z \mid \boldsymbol n] = E_\omega\left\{E_\omega[\bar Z \mid \boldsymbol y, \boldsymbol n] \mid \boldsymbol n\right\} = \frac{1}{K}\sum_{i,j}E_\omega\left\{r(t_i; Y_j, n_{ij}) \mid \boldsymbol n\right\} = \frac{1}{K}\sum_{i,j}\left\{q_j\,r(t_i;1,n_{ij}) + (1-q_j)\,r(t_i;0,n_{ij})\right\} \equiv \overline{q\,r(1)+(1-q)\,r(0)}. \quad \text{(A13)}$$
From (A11), note that
$$r(t_i;1,n_{ij}) = 1-0^{n_{ij}} + 0^{n_{ij}}\,\frac{\epsilon\,e^{-t_i\mu}\mu^{n_{ij}}t_i^{n_{ij}}}{\epsilon\,e^{-t_i\mu}\mu^{n_{ij}}t_i^{n_{ij}}+(1-\epsilon)}, \qquad r(t_i;0,n_{ij}) = 1-0^{n_{ij}} + 0^{n_{ij}}\,\frac{\epsilon\,e^{-t_i\nu}\nu^{n_{ij}}t_i^{n_{ij}}}{\epsilon\,e^{-t_i\nu}\nu^{n_{ij}}t_i^{n_{ij}}+(1-\epsilon)},$$
and decompose $\sum_{i,j}$ as $\sum_{\{i,j \mid n_{ij}\neq 0\}} + \sum_{\{i,j \mid n_{ij}=0\}}$, so (A13) becomes
$$\begin{aligned}
E_\omega[\bar Z \mid \boldsymbol n] &= \frac{1}{K}\sum_{i,j}\left[1-0^{n_{ij}} + q_j\,0^{n_{ij}}\frac{\epsilon\,e^{-t_i\mu}\mu^{n_{ij}}t_i^{n_{ij}}}{\epsilon\,e^{-t_i\mu}\mu^{n_{ij}}t_i^{n_{ij}}+(1-\epsilon)} + (1-q_j)\,0^{n_{ij}}\frac{\epsilon\,e^{-t_i\nu}\nu^{n_{ij}}t_i^{n_{ij}}}{\epsilon\,e^{-t_i\nu}\nu^{n_{ij}}t_i^{n_{ij}}+(1-\epsilon)}\right] \\
&= \frac{K-K_0}{K} + \frac{\epsilon}{K}\sum_{\{i,j \mid n_{ij}=0\}}\left[\frac{q_j\,e^{-t_i\mu}}{\epsilon\,e^{-t_i\mu}+(1-\epsilon)} + \frac{(1-q_j)\,e^{-t_i\nu}}{\epsilon\,e^{-t_i\nu}+(1-\epsilon)}\right] \\
&= \frac{K-K_0}{K} + \frac{\epsilon}{K}\sum_{i,j}1^{=}_{ij}\left[\frac{q_j}{\epsilon+(1-\epsilon)e^{t_i\mu}} + \frac{1-q_j}{\epsilon+(1-\epsilon)e^{t_i\nu}}\right] =: \rho(\epsilon,\mu,\nu;\boldsymbol t;\boldsymbol n) \equiv \rho,
\end{aligned}$$
where $K_0 := \#\{(i,j) \mid n_{ij}=0\}$.
Thus,
$$E_\omega\left[(1-2\epsilon)\bar Z + \epsilon^2 \mid \boldsymbol n\right] = (1-2\epsilon)\rho + \epsilon^2.$$
Therefore the first term in (A9) is evaluated explicitly as follows:
$$T_1 = E_\omega\left[-\partial_\omega^2 \log f_\omega(\boldsymbol Y, \boldsymbol Z, \boldsymbol n) \mid \boldsymbol n\right] = \mathrm{diag}\!\left( \frac{J\left[(1-2\pi)\bar q + \pi^2\right]}{\pi^2(1-\pi)^2},\;\; \frac{K\left[(1-2\epsilon)\rho + \epsilon^2\right]}{\epsilon^2(1-\epsilon)^2},\;\; \frac{K\,\overline{nq}}{\mu^2},\;\; \frac{K\,\overline{n(1-q)}}{\nu^2} \right). \quad \text{(A14)}$$
For the second term in (A9), it follows from (A10) and (A12) that
$$f_\omega(\boldsymbol Y, \boldsymbol Z \mid \boldsymbol n) = f_\omega(\boldsymbol Y \mid \boldsymbol n)\,f_\omega(\boldsymbol Z \mid \boldsymbol Y, \boldsymbol n) = \prod_j q_j^{Y_j}(1-q_j)^{1-Y_j}\prod_{i,j} r_{ij}^{Z_{ij}}(1-r_{ij})^{1-Z_{ij}} = \prod_j q_j^{Y_j}(1-q_j)^{1-Y_j}\prod_{\{i,j \mid n_{ij}=0\}} r_{ij}^{Z_{ij}}(1-r_{ij})^{1-Z_{ij}},$$
since $n_{ij}\neq 0 \Rightarrow Z_{ij}=1$ and $r_{ij}=1$. Thus,
$$\begin{aligned}
\log f_\omega(\boldsymbol Y, \boldsymbol Z \mid \boldsymbol n) &= \sum_j\left[Y_j\log q_j + (1-Y_j)\log(1-q_j)\right] + \sum_{\{i,j \mid n_{ij}=0\}}\left[Z_{ij}\log r_{ij} + (1-Z_{ij})\log(1-r_{ij})\right]; \\
\partial_\omega \log f_\omega(\boldsymbol Y, \boldsymbol Z \mid \boldsymbol n) &= \sum_j\frac{(Y_j-q_j)}{q_j(1-q_j)}\,\partial_\omega q_j + \sum_{\{i,j \mid n_{ij}=0\}}\frac{(Z_{ij}-r_{ij})}{r_{ij}(1-r_{ij})}\,\partial_\omega r_{ij}; \\
\partial_\omega^2 \log f_\omega(\boldsymbol Y, \boldsymbol Z \mid \boldsymbol n) &= \sum_j\left\{\frac{(Y_j-q_j)\,\partial_\omega^2 q_j - (\partial_\omega q_j)(\partial_\omega q_j)^\top}{q_j(1-q_j)} - \frac{(Y_j-q_j)\,(\partial_\omega q_j)\left[\partial_\omega\left(q_j(1-q_j)\right)\right]^\top}{q_j^2(1-q_j)^2}\right\} \\
&\quad + \sum_{\{i,j \mid n_{ij}=0\}}\left\{\frac{(Z_{ij}-r_{ij})\,\partial_\omega^2 r_{ij} - (\partial_\omega r_{ij})(\partial_\omega r_{ij})^\top}{r_{ij}(1-r_{ij})} - \frac{(Z_{ij}-r_{ij})\,(\partial_\omega r_{ij})\left[\partial_\omega\left(r_{ij}(1-r_{ij})\right)\right]^\top}{r_{ij}^2(1-r_{ij})^2}\right\},
\end{aligned}$$
where $r_{ij} \equiv r(t_i; Y_j, n_{ij})$. Therefore, a preliminary expression for the second term in (A9) is given by
$$T_2 = E_\omega\left[\partial_\omega^2 \log f_\omega(\boldsymbol Y, \boldsymbol Z \mid \boldsymbol n) \mid \boldsymbol n\right] = -\sum_j\frac{(\partial_\omega q_j)(\partial_\omega q_j)^\top}{q_j(1-q_j)} - \sum_{\{i,j \mid n_{ij}=0\}}E_\omega\!\left[\frac{(\partial_\omega r_{ij})(\partial_\omega r_{ij})^\top}{r_{ij}(1-r_{ij})}\,\Big|\,\boldsymbol n\right]$$
$$= \sum_j(\partial_\omega\log q_j)\left[\partial_\omega\log(1-q_j)\right]^\top + \sum_{\{i,j \mid n_{ij}=0\}}E_\omega\!\left[\left(\partial_\omega\log r(t_i;Y_j,n_{ij})\right)\left[\partial_\omega\log\left(1-r(t_i;Y_j,n_{ij})\right)\right]^\top\Big|\,\boldsymbol n\right],$$
where we used the facts that for any functions $h(\boldsymbol n)$ and $h(\boldsymbol y, \boldsymbol n)$,
$$E_\omega\left[(Y_j-q_j)\,h(\boldsymbol n) \mid \boldsymbol n\right] = h(\boldsymbol n)\,E_\omega\left[(Y_j-q_j) \mid \boldsymbol n\right] = 0, \qquad E_\omega\left[(Z_{ij}-r_{ij})\,h(\boldsymbol y, \boldsymbol n) \mid \boldsymbol n\right] = E_\omega\left\{h(\boldsymbol y, \boldsymbol n)\,E_\omega\left[Z_{ij}-r_{ij} \mid \boldsymbol y, \boldsymbol n\right] \mid \boldsymbol n\right\} = 0.$$
Now note that
$$\begin{aligned}
\log q_j &= \log\pi - t_j\mu + n_j\log\mu + \sum_i 1^{=}_{ij}\log\!\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right] - \log\psi_j; \\
\psi_j &:= \pi\,e^{-t_j\mu}\mu^{n_j}\prod_i\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]^{1^{=}_{ij}} + (1-\pi)\,e^{-t_j\nu}\nu^{n_j}\prod_i\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^{1^{=}_{ij}}; \\
\frac{\partial\psi_j}{\partial\pi} &= e^{-t_j\mu}\mu^{n_j}\prod_i\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]^{1^{=}_{ij}} - e^{-t_j\nu}\nu^{n_j}\prod_i\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^{1^{=}_{ij}}, \\
\frac{\partial\psi_j}{\partial\epsilon} &= \pi\,e^{-t_j\mu}\mu^{n_j}\left[\sum_i 1^{=}_{ij}\frac{e^{-t_i\mu}-1}{\epsilon\left(e^{-t_i\mu}-1\right)+1}\right]\prod_i\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]^{1^{=}_{ij}} + (1-\pi)\,e^{-t_j\nu}\nu^{n_j}\left[\sum_i 1^{=}_{ij}\frac{e^{-t_i\nu}-1}{\epsilon\left(e^{-t_i\nu}-1\right)+1}\right]\prod_i\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^{1^{=}_{ij}}, \\
\frac{\partial\psi_j}{\partial\mu} &= \pi\,e^{-t_j\mu}\mu^{n_j}\prod_i\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]^{1^{=}_{ij}}\left[\frac{n_j}{\mu} - t_j - \epsilon\sum_i 1^{=}_{ij}\frac{t_i\,e^{-t_i\mu}}{\epsilon\left(e^{-t_i\mu}-1\right)+1}\right], \\
\frac{\partial\psi_j}{\partial\nu} &= (1-\pi)\,e^{-t_j\nu}\nu^{n_j}\prod_i\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^{1^{=}_{ij}}\left[\frac{n_j}{\nu} - t_j - \epsilon\sum_i 1^{=}_{ij}\frac{t_i\,e^{-t_i\nu}}{\epsilon\left(e^{-t_i\nu}-1\right)+1}\right];
\end{aligned}$$
from which it can be shown that
$$\begin{aligned}
\frac{\partial\log q_j}{\partial\pi} &= \frac{e^{-t_j\nu}\nu^{n_j}}{\pi\,\psi_j}\prod_i\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^{1^{=}_{ij}}, \\
\frac{\partial\log q_j}{\partial\epsilon} &= \frac{(1-\pi)\,e^{-t_j\nu}\nu^{n_j}}{\psi_j}\prod_i\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^{1^{=}_{ij}}\;\sum_i 1^{=}_{ij}\frac{e^{-t_i\mu}-e^{-t_i\nu}}{\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]}, \\
\frac{\partial\log q_j}{\partial\mu} &= \frac{(1-\pi)\,e^{-t_j\nu}\nu^{n_j}}{\psi_j}\prod_i\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^{1^{=}_{ij}}\left[\frac{n_j}{\mu} - t_j - \epsilon\sum_i 1^{=}_{ij}\frac{t_i\,e^{-t_i\mu}}{\epsilon\left(e^{-t_i\mu}-1\right)+1}\right], \\
\frac{\partial\log q_j}{\partial\nu} &= -\frac{(1-\pi)\,e^{-t_j\nu}\nu^{n_j}}{\psi_j}\prod_i\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^{1^{=}_{ij}}\left[\frac{n_j}{\nu} - t_j - \epsilon\sum_i 1^{=}_{ij}\frac{t_i\,e^{-t_i\nu}}{\epsilon\left(e^{-t_i\nu}-1\right)+1}\right].
\end{aligned}$$
These four partial derivatives determine the $4\times1$ column vector $\partial_\omega\log q_j$. Furthermore,
$$\partial_\omega\log(1-q_j) = -\frac{q_j}{1-q_j}\,\partial_\omega\log q_j = -\frac{\pi}{1-\pi}\cdot\frac{e^{-t_j\mu}\mu^{n_j}\prod_i\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]^{1^{=}_{ij}}}{e^{-t_j\nu}\nu^{n_j}\prod_i\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^{1^{=}_{ij}}}\,\partial_\omega\log q_j,$$
hence
$$(\partial_\omega\log q_j)\left[\partial_\omega\log(1-q_j)\right]^\top = -\pi(1-\pi)\,\frac{e^{-t_j(\mu+\nu)}(\mu\nu)^{n_j}\prod_i\left\{\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]\right\}^{1^{=}_{ij}}}{\psi_j^2}\,\phi_j\phi_j^\top, \quad \text{(A15)}$$
where
$$\phi_j \equiv \phi_j(\omega;\boldsymbol t;\boldsymbol n_j) := \left( \frac{1}{\pi(1-\pi)},\;\; \sum_i 1^{=}_{ij}\frac{e^{-t_i\mu}-e^{-t_i\nu}}{\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]},\;\; \frac{n_j}{\mu}-t_j-\epsilon\sum_i 1^{=}_{ij}\frac{t_i\,e^{-t_i\mu}}{\epsilon\left(e^{-t_i\mu}-1\right)+1},\;\; -\left[\frac{n_j}{\nu}-t_j-\epsilon\sum_i 1^{=}_{ij}\frac{t_i\,e^{-t_i\nu}}{\epsilon\left(e^{-t_i\nu}-1\right)+1}\right] \right)^{\!\top}.$$
Next, for $n_{ij}=0$,
$$\begin{aligned}
r(t_i;1,0) &= \frac{\epsilon\,e^{-t_i\mu}}{\epsilon\left(e^{-t_i\mu}-1\right)+1}, \qquad r(t_i;0,0) = \frac{\epsilon\,e^{-t_i\nu}}{\epsilon\left(e^{-t_i\nu}-1\right)+1}; \\
\log r(t_i;1,0) &= \log\epsilon - t_i\mu - \log\!\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right], \qquad \log\left(1-r(t_i;1,0)\right) = \log(1-\epsilon) - \log\!\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]; \\
\log r(t_i;0,0) &= \log\epsilon - t_i\nu - \log\!\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right], \qquad \log\left(1-r(t_i;0,0)\right) = \log(1-\epsilon) - \log\!\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right];
\end{aligned}$$
so with $\omega = (\pi, \epsilon, \mu, \nu)$, we find that
$$\begin{aligned}
\partial_\omega\log r(t_i;1,0) &= \left(0,\;\; \frac{1}{\epsilon\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]},\;\; -\frac{(1-\epsilon)\,t_i}{\epsilon\left(e^{-t_i\mu}-1\right)+1},\;\; 0\right)^{\!\top}, \\
\partial_\omega\log\left(1-r(t_i;1,0)\right) &= \left(0,\;\; -\frac{e^{-t_i\mu}}{(1-\epsilon)\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]},\;\; \frac{\epsilon\,t_i\,e^{-t_i\mu}}{\epsilon\left(e^{-t_i\mu}-1\right)+1},\;\; 0\right)^{\!\top}, \\
\partial_\omega\log r(t_i;0,0) &= \left(0,\;\; \frac{1}{\epsilon\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]},\;\; 0,\;\; -\frac{(1-\epsilon)\,t_i}{\epsilon\left(e^{-t_i\nu}-1\right)+1}\right)^{\!\top}, \\
\partial_\omega\log\left(1-r(t_i;0,0)\right) &= \left(0,\;\; -\frac{e^{-t_i\nu}}{(1-\epsilon)\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]},\;\; 0,\;\; \frac{\epsilon\,t_i\,e^{-t_i\nu}}{\epsilon\left(e^{-t_i\nu}-1\right)+1}\right)^{\!\top}.
\end{aligned}$$
Thus
$$\begin{aligned}
E_\omega&\left[\left(\partial_\omega\log r(t_i;Y_j,0)\right)\left[\partial_\omega\log\left(1-r(t_i;Y_j,0)\right)\right]^\top\Big|\,\boldsymbol n\right] \\
&= \left(\partial_\omega\log r(t_i;1,0)\right)\left[\partial_\omega\log\left(1-r(t_i;1,0)\right)\right]^\top q_j + \left(\partial_\omega\log r(t_i;0,0)\right)\left[\partial_\omega\log\left(1-r(t_i;0,0)\right)\right]^\top(1-q_j) \\
&= -\epsilon(1-\epsilon)\,\frac{e^{-t_i\mu}}{\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]^2}\,\chi_i(1)\chi_i(1)^\top\,q_j \;-\; \epsilon(1-\epsilon)\,\frac{e^{-t_i\nu}}{\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^2}\,\chi_i(0)\chi_i(0)^\top\,(1-q_j),
\end{aligned}$$
where
$$\chi_i(1) \equiv \chi(\epsilon;t_i;1) := \left(0,\;\; \frac{1}{\epsilon(1-\epsilon)},\;\; -t_i,\;\; 0\right)^{\!\top}, \qquad \chi_i(0) \equiv \chi(\epsilon;t_i;0) := \left(0,\;\; \frac{1}{\epsilon(1-\epsilon)},\;\; 0,\;\; -t_i\right)^{\!\top}.$$
Therefore, using (A15), the second term in (A9) is given by
$$\begin{aligned}
T_2 = E_\omega\left[\partial_\omega^2 \log f_\omega(\boldsymbol Y, \boldsymbol Z \mid \boldsymbol n) \mid \boldsymbol n\right] &= -\pi(1-\pi)\sum_j\frac{e^{-t_j(\mu+\nu)}(\mu\nu)^{n_j}\prod_i\left\{\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]\right\}^{1^{=}_{ij}}}{\psi_j^2}\,\phi_j\phi_j^\top \\
&\quad - \epsilon(1-\epsilon)\sum_{i,j}1^{=}_{ij}\left\{\frac{e^{-t_i\mu}}{\left[\epsilon\left(e^{-t_i\mu}-1\right)+1\right]^2}\,\chi_i(1)\chi_i(1)^\top\,q_j + \frac{e^{-t_i\nu}}{\left[\epsilon\left(e^{-t_i\nu}-1\right)+1\right]^2}\,\chi_i(0)\chi_i(0)^\top\,(1-q_j)\right\}. \quad \text{(A16)}
\end{aligned}$$
Thus, (A14) and (A16) explicitly determine the observed information matrix $I_n(\omega)$ in (A9).
Finally, we may estimate $I_n(\omega)$ in the normal approximation
$$\sqrt{K}\,(\hat\omega - \omega) \;\approx\; N_4\!\left[0,\; K\, I_n^{-1}(\omega)\right]$$
by replacing $\omega$ in $I_n(\omega)$ by its MLE, $\hat\omega \equiv (\hat\pi, \hat\epsilon, \hat\mu, \hat\nu)$, thereby obtaining
$$\sqrt{K}\,(\hat\omega - \omega) \;\approx\; N_4\!\left[0,\; K\, I_n^{-1}(\hat\omega)\right].$$
For large $K$ the $4\times4$ matrix $I_n(\hat\omega)$ is positive definite, hence invertible. □
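Given $\hat\omega$ and $I_n^{-1}(\hat\omega)$, a standard error and confidence interval for the quantity of interest $\theta = \mu/\nu$ follow by the delta method. A minimal sketch: the point estimates below echo Table 3, but the covariance entries are made-up placeholders for illustration, not values computed from $I_n(\hat\omega)$.

```python
import math

mu_hat, nu_hat = 66.60, 18.22               # MLEs (cf. Table 3)
cov = [[4.0, 0.5],                          # assumed Var(mu_hat), Cov(mu_hat, nu_hat)
       [0.5, 1.0]]                          # assumed Cov(.),      Var(nu_hat)
theta_hat = mu_hat / nu_hat
g = [1.0 / nu_hat, -mu_hat / nu_hat ** 2]   # gradient of mu/nu at the MLE
var_theta = sum(g[r] * cov[r][c] * g[c] for r in range(2) for c in range(2))
half = 1.96 * math.sqrt(var_theta)          # 95% normal-approximation half-width
print(f"theta = {theta_hat:.2f}, 95% CI = ({theta_hat - half:.2f}, {theta_hat + half:.2f})")
```

In an actual analysis the $2\times2$ block of $K\,I_n^{-1}(\hat\omega)/K$ corresponding to $(\hat\mu, \hat\nu)$ would replace the placeholder covariance.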

Appendix C. Additional Results from Section 5

Appendix C.1. Additional Results from Section 5.1

Table A1 and Table A2 display the results of Figure 1 and Figure 2, respectively, in tabular form.
Table A1. Mean absolute error in estimation of $\theta$ across varying values of $I$, $J$, $\epsilon$, and $\pi$.

I    ϵ     J=5 (π = 0.1, 0.25, 0.4) | J=10 | J=20 | J=40 | J=80
5    0.6   0.82 0.70 0.55 | 0.94 0.39 0.30 | 0.75 0.29 0.19 | 0.39 0.14 0.12 | 0.44 0.09 0.08
5    0.7   0.57 0.43 0.37 | 0.61 0.32 0.28 | 0.41 0.22 0.18 | 0.23 0.12 0.11 | 0.14 0.08 0.07
5    0.8   0.54 0.52 0.42 | 0.47 0.27 0.23 | 0.31 0.21 0.15 | 0.18 0.11 0.10 | 0.15 0.08 0.07
10   0.6   0.55 0.34 0.33 | 0.38 0.24 0.18 | 0.27 0.13 0.12 | 0.15 0.08 0.08 | 0.10 0.06 0.06
10   0.7   0.48 0.32 0.27 | 0.34 0.19 0.17 | 0.23 0.13 0.12 | 0.13 0.08 0.07 | 0.09 0.06 0.05
10   0.8   0.43 0.27 0.26 | 0.32 0.22 0.15 | 0.24 0.11 0.10 | 0.12 0.08 0.07 | 0.07 0.05 0.05
20   0.6   0.49 0.28 0.21 | 0.33 0.15 0.12 | 0.20 0.09 0.08 | 0.10 0.06 0.06 | 0.06 0.05 0.04
20   0.7   0.44 0.26 0.19 | 0.35 0.16 0.13 | 0.18 0.09 0.08 | 0.08 0.05 0.05 | 0.05 0.04 0.04
20   0.8   0.45 0.28 0.18 | 0.30 0.13 0.11 | 0.16 0.08 0.07 | 0.07 0.05 0.05 | 0.05 0.04 0.04
40   0.6   0.47 0.24 0.15 | 0.26 0.12 0.09 | 0.14 0.06 0.06 | 0.07 0.04 0.04 | 0.05 0.03 0.03
40   0.7   0.43 0.24 0.17 | 0.29 0.10 0.09 | 0.15 0.06 0.05 | 0.07 0.04 0.04 | 0.05 0.03 0.03
40   0.8   0.48 0.22 0.14 | 0.23 0.09 0.08 | 0.09 0.05 0.04 | 0.06 0.04 0.04 | 0.04 0.03 0.02
80   0.6   0.45 0.20 0.11 | 0.17 0.11 0.06 | 0.08 0.05 0.04 | 0.04 0.03 0.03 | 0.03 0.02 0.02
80   0.7   0.41 0.19 0.12 | 0.23 0.08 0.05 | 0.11 0.04 0.04 | 0.04 0.03 0.03 | 0.03 0.02 0.02
80   0.8   0.38 0.18 0.10 | 0.23 0.09 0.05 | 0.11 0.04 0.03 | 0.04 0.03 0.03 | 0.03 0.02 0.02
Table A2. Empirical coverage of nominal 95% confidence intervals for $\theta$ across varying values of $I$, $J$, $\epsilon$, and $\pi$.

I    ϵ     J=5 (π = 0.1, 0.25, 0.4) | J=10 | J=20 | J=40 | J=80
5    0.6   0.65 0.81 0.92 | 0.82 0.93 0.97 | 0.90 0.98 0.97 | 0.95 0.98 0.99 | 0.94 0.98 1.00
5    0.7   0.75 0.94 0.93 | 0.92 0.96 0.97 | 0.93 0.97 0.97 | 0.95 0.98 0.99 | 0.98 0.99 0.98
5    0.8   0.79 0.90 0.96 | 0.93 0.96 0.97 | 0.94 0.97 0.96 | 0.97 1.00 0.96 | 0.96 0.98 0.98
10   0.6   0.79 0.91 0.93 | 0.89 0.94 0.96 | 0.91 0.96 0.98 | 0.95 0.98 0.97 | 0.98 0.95 0.98
10   0.7   0.83 0.92 0.95 | 0.90 0.93 0.96 | 0.95 0.96 0.94 | 0.96 0.98 0.98 | 0.95 0.96 0.96
10   0.8   0.81 0.94 0.94 | 0.89 0.94 0.96 | 0.89 0.96 0.95 | 0.96 0.94 0.96 | 0.98 0.96 0.96
20   0.6   0.80 0.92 0.95 | 0.88 0.97 0.98 | 0.92 0.95 0.96 | 0.96 0.94 0.92 | 0.96 0.94 0.94
20   0.7   0.78 0.92 0.91 | 0.80 0.91 0.89 | 0.91 0.94 0.96 | 0.94 0.95 0.95 | 0.96 0.95 0.96
20   0.8   0.74 0.91 0.90 | 0.86 0.93 0.95 | 0.90 0.96 0.93 | 0.97 0.96 0.94 | 0.94 0.94 0.96
40   0.6   0.77 0.90 0.93 | 0.83 0.95 0.96 | 0.91 0.95 0.93 | 0.91 0.95 0.94 | 0.94 0.94 0.98
40   0.7   0.79 0.91 0.95 | 0.84 0.96 0.95 | 0.93 0.96 0.97 | 0.96 0.93 0.95 | 0.95 0.94 0.98
40   0.8   0.73 0.89 0.94 | 0.83 0.96 0.93 | 0.95 0.98 0.96 | 0.95 0.94 0.92 | 0.92 0.94 0.95
80   0.6   0.72 0.90 0.94 | 0.90 0.93 0.95 | 0.97 0.95 0.95 | 0.95 0.97 0.92 | 0.96 0.93 0.94
80   0.7   0.70 0.91 0.94 | 0.84 0.95 0.96 | 0.92 0.94 0.96 | 0.94 0.94 0.94 | 0.93 0.94 0.96
80   0.8   0.75 0.87 0.94 | 0.86 0.96 0.97 | 0.93 0.95 0.96 | 0.95 0.94 0.93 | 0.94 0.94 0.94

Appendix C.2. Additional Results from Section 5.2

To assess the sensitivity of our results to the assumption that $t_i = 1$ for each site $i = 1, \dots, 11$, we ran a sensitivity analysis in which $t_i$ was iteratively updated during the M-step of the proposed EM algorithm. Specifically, we updated $t_i$ by maximizing (9) conditional on the current estimates of $\hat y$, $\hat z$, $\hat\mu$, and $\hat\nu$ at any given step of the algorithm. After estimation, the estimates of $t_i$ were treated as fixed and known.
Results under this sensitivity analysis appear in Table A3. Estimates of $\theta$, $\pi$, and $\epsilon$ are remarkably similar to those in Table 3, while the estimates of $\mu$ and $\nu$ each change by the same scale factor. Thus our results for the parameter of interest, $\theta$, are not sensitive to the choice of $t_i$ in this case.
Table A3. Sensitivity analysis results when estimating $t_i$ in the frigatebird analysis.

Parameter estimates: $\pi$ = 0.25; $\epsilon$ = 0.88; $\mu$ = 50.55; $\nu$ = 13.84; $\theta$ = 3.65 (95% CI: 3.22, 4.08).

Site (i):   1      2      3      4      5      6      7      8      9      10     11
t̂_i:       0.104  0.166  0.065  0.261  0.293  0.366  3.413  2.574  1.434  4.485  0.116
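The scale behavior noted above can be checked directly from the reported estimates: rescaling every $t_i$ by a constant $c$ while dividing $\mu$ and $\nu$ by $c$ leaves each Poisson mean $t_i\mu$ (or $t_i\nu$) unchanged, so $\theta = \mu/\nu$ is unaffected. A quick check against the values in Tables 3 and A3:

```python
mu_a, nu_a = 66.60, 18.22   # Table 3: t_i = 1 held fixed
mu_b, nu_b = 50.55, 13.84   # Table A3: t_i estimated

print(mu_a / nu_a, mu_b / nu_b)   # both ratios are approximately 3.65
print(mu_a / mu_b, nu_a / nu_b)   # a common scale factor of about 1.32
```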

References

1. Baker, G.B.; Holdsworth, M. Seabird Monitoring Study at Coringa Herald National Nature Reserve 2012; Report prepared for the Department of Sustainability, Environment, Water, Population and Communities; Latitude 42 Environmental Consultants Pty Ltd.: Kettering, TAS, Australia, 2013.
2. Bouveyron, C.; Celeux, G.; Murphy, T.B.; Raftery, A.E. Model-Based Clustering and Classification for Data Science: With Applications in R; Cambridge University Press: Cambridge, UK, 2019.
3. Lindsay, B.G. Mixture Models: Theory, Geometry, and Applications; Institute of Mathematical Statistics: Hayward, CA, USA, 1995.
4. McLachlan, G.J.; Peel, D. Finite Mixture Models; Wiley: New York, NY, USA, 2000.
5. Martin, T.G.; Wintle, B.A.; Rhodes, J.R.; Kuhnert, P.M.; Field, S.A.; Low-Choy, S.J.; Tyre, A.J.; Possingham, H.P. Zero tolerance ecology: Improving ecological inference by modelling the source of zero observations. Ecol. Lett. 2005, 8, 1235–1246.
6. Lambert, D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 1992, 34, 1–14.
7. Lim, H.K.; Li, W.K.; Yu, P.L.H. Zero-inflated Poisson regression mixture model. Comput. Stat. Data Anal. 2014, 71, 151–158.
8. Long, D.L.; Preisser, J.S.; Herring, A.H.; Golin, C.E. A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat. Med. 2014, 33, 5151–5165.
9. Jamshidian, M.; Jennrich, R.I. Standard errors for EM estimation. J. R. Stat. Soc. Ser. B 2000, 62, 257–270.
10. Aitkin, M.; Rubin, D.B. Estimation and hypothesis testing in finite mixture models. J. R. Stat. Soc. Ser. B 1985, 47, 67–75.
11. McLachlan, G.J.; Krishnan, T. The EM Algorithm and Its Extensions, 2nd ed.; Wiley: New York, NY, USA, 2008.
12. Hoadley, B. Asymptotic properties of maximum likelihood estimators for the independent not identically distributed case. Ann. Math. Stat. 1971, 42, 1977–1991.
13. Efron, B.; Hinkley, D.V. Assessing the accuracy of the maximum likelihood estimator: Observed versus expected Fisher information. Biometrika 1978, 65, 457–482.
14. Louis, T.A. Finding the observed information matrix when using the EM algorithm. J. R. Stat. Soc. Ser. B 1982, 44, 226–233.
15. Johnson, N.L.; Kemp, A.W.; Kotz, S. Univariate Discrete Distributions, 3rd ed.; Wiley-Interscience: Hoboken, NJ, USA, 2005.
Figure 1. Estimation error for θ in simulated ZIPM data.
Figure 2. Empirical coverage of nominal 95% confidence intervals for $\theta$ in simulated ZIPM data. Red dashed lines represent the target coverage level of 0.95.
Table 1. Counts of frigatebird nests by subspecies.

Frigatebird Subspecies    Nest Counts    Relative Proportion
Lesser                    46             0.036
Greater                   81             0.063
Unidentified              1158           0.901
Table 2. Total counts of 1158 unidentified frigatebird nests by site and time point.
SiteAugust 2007September 2008October 2009August 2012
10103
22084
31122
444142
512186
600186
75402094
853541273
92438025
101376219618
114240
Table 3. ZIPM model MLEs to study unidentified frigatebird nests.

Parameter estimates: $\pi$ = 0.25; $\epsilon$ = 0.84; $\mu$ = 66.60; $\nu$ = 18.22; $\theta$ = 3.65 (95% CI: 3.23, 4.08).
