Zero-and-One Integer-Valued AR(1) Time Series with Power Series Innovations and Probability Generating Function Estimation Approach

Vladica S. Stojanović; Hassan S. Bakouch; Eugen Ljajko; Najla Qarmalah

doi:10.3390/math11081772

¹ Department of Informatics & Computer Sciences, University of Criminal Investigation and Police Studies, 11060 Belgrade, Serbia

² Department of Mathematics, College of Science, Qassim University, Buraydah 51452, Saudi Arabia

³ Department of Mathematics, Faculty of Science, Tanta University, Tanta 31111, Egypt

⁴ Department of Mathematics, Faculty of Sciences & Mathematics, University of Kosovska Mitrovica, 38220 Kosovska Mitrovica, Serbia

Mathematics 2023, 11(8), 1772; https://doi.org/10.3390/math11081772

Abstract

Zero-and-one inflated count time series have only recently become the subject of more extensive interest and research. One possible approach is given by first-order, non-negative, integer-valued autoregressive processes with zero-and-one inflated innovations, abbreviated as ZOINAR(1) processes, which have been introduced in the literature since around 2020. This manuscript presents a generalization of ZOINAR processes, obtained by introducing the zero-and-one inflated power series (ZOIPS) distributions. The resulting process, named the ZOIPS-INAR(1) process, is investigated in terms of its basic stochastic properties (e.g., moments, correlation structure and distributional properties). To estimate the parameters of the ZOIPS-INAR(1) model, in addition to the conditional least-squares (CLS) method, a recent estimation technique based on probability-generating functions (PGFs) is discussed. The asymptotic properties of the obtained estimators are also examined, and their performance is assessed in a Monte Carlo simulation study. Finally, as an application of the ZOIPS-INAR(1) model, a dynamic analysis of the number of deaths from COVID-19 in Serbia is considered.

Keywords:

time series; zero-and-one inflation; probability generating functions; parameter estimation; simulation; COVID-19; application

MSC:

62M10; 60G10; 60G25

1. Introduction

Integer-valued time series modeling is a very popular research topic, with various applications (cf. [1,2,3,4,5]). One of the most popular approaches to modeling the dynamics of count data is provided by non-negative integer-valued autoregressive (INAR) time series models. This approach originated in the well-known work of Al-Osh and Alzaid [6], which first introduced the so-called INAR(1) process, and since then many results related to these models have been obtained (cf. [7,8,9,10,11,12,13,14,15,16,17]). A problem that has recently become frequent in count data modeling is the presence of inflated zero-and-one values in the data, which can appear in various areas of human activity (e.g., the number of requests for issuing policies, breakdowns in the production process, injuries in traffic accidents, etc.). To investigate this and similar problems, Saito et al. [18] and Zhang et al. [19] considered a modification (and generalization) of the traditional Poisson distribution, i.e., the so-called zero-and-one inflated Poisson (ZOIP) distribution. As an example of the application of the ZOIP distribution, the frequency of visits to the dentist in Swedish cities was considered in both of these manuscripts. Subsequently, Zhang et al. [20] introduced the multivariate ZOIP distribution, with applications to healthcare demand data in Australia and car portfolio data in France.

Using the ZOIP distributed innovations, Qi et al. [21] introduced the first zero-and-one inflated INAR-based model, named the first-order zero-and-one INAR (ZOINAR(1)) process. Another class of ZOINAR time series, named the ZOIPLINAR process, has recently been introduced by Mohammadi et al. [22], where a Poisson–Lindley distribution of innovations inflated by zero and one is considered. Our main motivation is to introduce a more general form of the ZOINAR process, in which the innovations follow a power series (PS) distribution with zero-and-one inflation. It should be noted that PS distributions represent a wide family of probability distributions, from which many known integer-valued distributions can be obtained. In this way, the first-order zero-and-one inflated power series INAR process (abbr. ZOIPS-INAR(1) process) is proposed here, and it can be seen as a generalization of the previous ZOINAR models. The definition and the basic stochastic characteristics of this process are described in Section 2 and Section 3.

In order to estimate the parameters of INAR-based processes, various techniques have been developed. Conditional least-squares (CLS) estimation, proposed in [23], is a commonly used method, as is the method of conditional maximum likelihood (CML) [24]. In addition, moment-based estimation procedures, i.e., Yule–Walker (YW) equations, are discussed in [25,26], among other methods. However, to apply any of the aforementioned methods, it is usually assumed that the estimating functions are given in a closed form and are also bounded on some parameter space. As will be seen below, such conditions are not fully met in the case of our ZOIPS-INAR(1) process. Therefore, an alternative and more contemporary approach, named the probability generating function (PGF) method, is proposed here. PGF estimation was theoretically described by Esquivel [27], first practically applied in Stojanović et al. [28], and recently examined in its general form by Stojanović et al. [29]. In order to apply the PGF method in the parameter estimation of the ZOIPS-INAR(1) process, some basic facts about this estimation method are given in Section 4. In addition, the asymptotic properties and efficiency of the PGF estimators of the ZOIPS-INAR(1) process, under some regularity conditions, are considered here.

In Section 5, the PGF estimators for some specific ZOIPS innovations are analyzed. As typical members of the PS distributions family, but also for some practical reasons, the Poisson and geometric zero-and-one inflated distributions are considered here. For both of them, Monte Carlo simulations of PGF estimates were calculated and compared with the corresponding CLS estimates, which were taken as initial values for the PGF procedure. The asymptotic properties of both types of estimators are also examined here. The application of the ZOIPS-INAR(1) process in modeling the distribution of the number of deaths from the disease COVID-19 in the Republic of Serbia is presented in Section 6. In addition, by comparing the ZOIPS-INAR(1) model with the standard INAR(1) model, it is shown that the observed actual series has pronounced zero-and-one inflation, and that the proposed model has better efficiency and predictive accuracy. Finally, some concluding remarks are given in Section 7.

2. Structure of the ZOIPS-INAR(1) Process

In this section, similarly to [28,29,30], we first introduce the independent and identically distributed (IID) time series with the so-called power series (PS) distribution.

Definition 1.

The IID time series

(ε_{t})

,

t \in Z

is PS-distributed if its probability mass function (PMF) is as follows:

p_{ε} (x; θ) : = P {ε_{t} = x} = \frac{a (x) θ^{x}}{f (θ)}, x \in S .

(1)

Here,

S \subseteq Z^{+} = \{0, 1, 2, \dots\}

is the discrete set of values that the series

(ε_{t})

can take, and:

(i): $a (x) \geq 0$ is a function defined on the set $S$ ;
(ii): $θ > 0$ is the (unknown) one-dimensional parameter;
(iii): $f (θ) : = \sum_{x \in S} a (x) θ^{x}$ is a function that depends (only) on θ and satisfies $0 < f (θ) < + \infty$ when $0 < θ < R$ .

Equation (1) can, for particular choices of

a (x)

,

f (θ)

and

θ

, give some of the best-known types of discrete distributions (see Table 1 below). In what follows, we assume that the condition

0, 1 \in S

is fulfilled, as is usual in zero-and-one inflated distributions. Additionally, note that according to

(i i i)

, the power series

f (θ)

converges in fact on

(- R, R)

. Nevertheless, the assumption

θ > 0

is common for the PMF of PS-distributed series

(ε_{t})

, and we observe the convergence of

f (θ)

only on the positive interval

(0, R)

. Moreover, in this interval, the function

f (θ)

has positive, increasing values, as well as positive derivatives, as follows:

f^{(n)} (θ) : = \frac{d^{n}}{d θ^{n}} (\sum_{x = 0}^{\infty} a (x) θ^{x}) = \sum_{x = n}^{\infty} (\prod_{k = 0}^{n - 1} (x - k)) a (x) θ^{x - n}, n \in N .

(2)

Table 1. Some specific PS distributions, along with their over-dispersion indices and PGFs.

Equality (2) can be useful for determining the moments

μ_{n}^{(ε)} : = E (ε_{t}^{n})

of series

(ε_{t})

. For this purpose, we apply a procedure similar to that of Stojanović et al. [28], based on the calculation of the moment-generating function (MGF):

M_{ε} (u) = E [e^{u ε_{t}}] = \frac{1}{f (θ)} \sum_{x \in S} a (x) {(θ e^{u})}^{x} = \frac{f (θ e^{u})}{f (θ)} .

Using (2) and the properties of the MGFs, after some calculations, one obtains:

\begin{matrix} μ_{n}^{(ε)} & = E [ε_{t}^{n}] = \frac{d^{n} M_{ε} (u)}{d u^{n}} |_{u = 0} = \frac{1}{f (θ)} \sum_{k = 1}^{n} b_{n; k} {(θ e^{u})}^{k} f^{(k)} (θ e^{u}) |_{u = 0} \\ & = \frac{1}{f (θ)} \sum_{k = 1}^{n} b_{n; k} θ^{k} f^{(k)} (θ) . \end{matrix}

(3)

In doing so, the coefficients

b_{n; k}

, for each

n \in N

, are calculated recursively:

b_{n; 1} = 1, b_{n; k + 1} = \frac{{(k + 1)}^{n - 1}}{k!} - \sum_{j = 1}^{k} \frac{b_{n; j}}{(k - j + 1)!}, k = 1, 2, \dots, n - 1 .
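
As a quick check of this recursion, the following minimal Python sketch computes the coefficients b_{n;k} with exact rational arithmetic and evaluates Equation (3) in the Poisson case f(θ) = e^θ, where f^{(k)}(θ)/f(θ) = 1 for every k. The function names are illustrative and do not come from the paper. For n = 4 the recursion returns 1, 7, 6, 1, which are the Stirling numbers of the second kind.

```python
from fractions import Fraction
from math import factorial

def b_coeffs(n):
    """Return [b_{n;1}, ..., b_{n;n}] from the recursion in the text."""
    b = [Fraction(1)]  # b_{n;1} = 1
    for k in range(1, n):
        # b_{n;k+1} = (k+1)^(n-1)/k! - sum_{j=1}^{k} b_{n;j}/(k-j+1)!
        tail = sum(b[j - 1] / factorial(k - j + 1) for j in range(1, k + 1))
        b.append(Fraction((k + 1) ** (n - 1), factorial(k)) - tail)
    return b

def poisson_moment(n, theta):
    """E[eps^n] from Equation (3) when f(theta) = exp(theta), so f^(k)/f = 1."""
    return sum(float(bk) * theta ** k for k, bk in enumerate(b_coeffs(n), start=1))

print(b_coeffs(4))             # coefficients needed for the fourth moment
print(poisson_moment(2, 5.0))  # second moment theta + theta^2 of a Poisson(5) variable
```

For other PS families only the ratio f^{(k)}(θ)/f(θ) changes, so the same coefficients can be reused.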

Using the first two moments, obtained by Equation (3), the mathematical expectation and the variance of the random variables (RVs)

(ε_{t})

are obtained as follows:

\begin{matrix} μ_{ε} & : = E [ε_{t}] = μ_{1}^{(ε)} = θ g^{'} (θ) \\ σ_{ε}^{2} & : = V a r [ε_{t}] = μ_{2}^{(ε)} - {(μ_{1}^{(ε)})}^{2} = μ_{ε} + θ^{2} g^{″} (θ), \end{matrix}

(4)

where

g (θ) = log (f (θ))

. If

D_{ε} (θ) : = σ_{ε}^{2} - μ_{ε} = θ^{2} g^{″} (θ)

is the so-called over-dispersion index, then the series

(ε_{t})

is over-dispersed; that is,

D_{ε} (θ) > 0

, if and only if the inequality

g^{″} (θ) > 0

holds, for any

θ \in (0, R)

. Hence, the convexity of

g (θ)

indicates an overdispersion of the series

(ε_{t})

, as can be seen in Table 1. Moreover, for an arbitrary

u \in [- 1, 1]

, we can introduce the first-order PGF of RVs

(ε_{t})

in the following way:

Ψ_{ε} (u; θ) : = E [u^{ε_{t}}] = \frac{1}{f (θ)} \sum_{x \in S} a (x) {(θ u)}^{x} = \frac{f (θ u)}{f (θ)} .

(5)

The sum obtained above obviously converges on the interval

u \in (0, R / θ)

. In addition, the expression in (5) gives the possibility of the simple calculation of first-order PGFs for PS distributions, which are also given in Table 1.
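
To make Equations (1) and (5) concrete, the sketch below encodes two PS families by their pairs (a(x), f(θ)) and compares the closed-form PGF f(θu)/f(θ) with a truncated series. The Poisson and geometric pairs used here are the standard power-series forms and are an assumption of this example, not a transcription of Table 1; all function names are hypothetical.

```python
from math import exp, factorial

# (a(x), f(theta)) in the notation of Definition 1 (standard forms, assumed here)
PS_FAMILIES = {
    "poisson": (lambda x: 1.0 / factorial(x), lambda th: exp(th)),   # R = +inf
    "geometric": (lambda x: 1.0, lambda th: 1.0 / (1.0 - th)),       # R = 1
}

def ps_pmf(x, theta, family):
    """PMF (1): a(x) * theta^x / f(theta)."""
    a, f = PS_FAMILIES[family]
    return a(x) * theta ** x / f(theta)

def ps_pgf(u, theta, family):
    """Closed-form PGF (5): f(theta * u) / f(theta)."""
    _, f = PS_FAMILIES[family]
    return f(theta * u) / f(theta)

def ps_pgf_series(u, theta, family, terms=100):
    """Truncated series sum_x u^x p(x; theta), used only as a numerical check."""
    return sum(u ** x * ps_pmf(x, theta, family) for x in range(terms))

for fam, theta in (("poisson", 5.0), ("geometric", 0.4)):
    print(fam, ps_pgf(0.7, theta, fam), ps_pgf_series(0.7, theta, fam))
```

The truncation at 100 terms is a convenience for these parameter values; it would need to grow for θ close to R.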

In the following, we define a zero-and-one inflated distribution for an arbitrary PS-distributed time series

(ε_{t})

.

Definition 2.

Let

(ε_{t})

,

t \in Z

be the IID time series with the PS distribution, given by Equation (1). The series

(η_{t})

,

t \in Z

has a zero-and-one inflated power series (ZOIPS) distribution if for some

ϕ_{0}, ϕ_{1} > 0

, such that

0 < ϕ_{0} + ϕ_{1} \leq 1

, its PMF is given as follows:

p_{η} (x; λ_{η}) : = P \{η_{t} = x\} = \begin{cases} ϕ_{0} + (1 - ϕ_{0} - ϕ_{1}) p_{ε} (x; θ), & x = 0, \\ ϕ_{1} + (1 - ϕ_{0} - ϕ_{1}) p_{ε} (x; θ), & x = 1, \\ (1 - ϕ_{0} - ϕ_{1}) p_{ε} (x; θ), & x \geq 2, \end{cases}

(6)

where

λ_{η} = (θ, ϕ_{0}, ϕ_{1})

is the vector of (unknown) parameters.

Note that the ZOIPS distribution is a mixture of three distributions:

I_{0} (x)

concentrated in zero,

I_{1} (x)

concentrated in one, and the PS distribution of the series

(ε_{t})

. Thus, the PMF of RVs

(η_{t})

can be written as

p_{η} (x; λ_{η}) = ϕ_{0} I_{0} (x) + ϕ_{1} I_{1} (x) + ϕ_{2} p_{ε} (x; θ),

(7)

where we set

ϕ_{2} = 1 - ϕ_{0} - ϕ_{1} \in [0, 1),

or equivalently,

ϕ_{0} + ϕ_{1} + ϕ_{2} = 1

. It is obvious that when

ϕ_{0} = ϕ_{1} = 0

, the ZOIPS distribution is reduced to the previous PS distribution. For these reasons, we assume that

ϕ_{0} > 0

and

ϕ_{1} > 0

, so that these coefficients represent, respectively, the additional proportions of zeros and ones compared to those allowed by the PS distribution of the series

(ε_{t})

. Using the previous facts, and similar to Qi et al. [21], for the n-th moments

μ_{n}^{(η)} : = E [η_{t}^{n}]

of series

(η_{t})

one obtains:

μ_{n}^{(η)} = E [η_{t}^{n}] = ϕ_{1} + ϕ_{2} μ_{n}^{(ε)} .

(8)

According to (4) and (8), the mean and variance of the series

(η_{t})

are, respectively,

\begin{matrix} μ_{η} & : = E [η_{t}] = ϕ_{1} + ϕ_{2} μ_{ε} = ϕ_{1} + ϕ_{2} θ g^{'} (θ) \\ σ_{η}^{2} & : = V a r [η_{t}] = μ_{2}^{(η)} - μ_{η}^{2} = ϕ_{1} + ϕ_{2} μ_{2}^{(ε)} - μ_{η}^{2} = ϕ_{1} + ϕ_{2} (μ_{ε}^{2} + σ_{ε}^{2}) - μ_{η}^{2} \\ = μ_{η} - μ_{η}^{2} + ϕ_{2} (μ_{ε}^{2} + θ^{2} g^{″} (θ)) = μ_{η} - μ_{η}^{2} + ϕ_{2} θ^{2} ({(g^{'} (θ))}^{2} + g^{″} (θ)) . \end{matrix}

(9)

Using Equations (4) and (9), similarly as with the PS series

(ε_{t})

, one can obtain the necessary and sufficient conditions for the over-dispersion of the ZOIPS series

(η_{t})

. According to

D_{η} (λ_{η}) : = σ_{η}^{2} - μ_{η} = ϕ_{2} (μ_{ε}^{2} + σ_{ε}^{2} - μ_{ε}) - {(ϕ_{1} + ϕ_{2} μ_{ε})}^{2},

the series

(η_{t})

will be over-dispersed if and only if

ϕ_{2} (μ_{ε}^{2} + D_{ε} (θ)) > {(ϕ_{1} + ϕ_{2} μ_{ε})}^{2} .

This condition is more flexible than the “ordinary” over-dispersion of the series

(ε_{t})

. For instance, if

(ε_{t})

is equidispersed, i.e.,

D_{ε} (θ) = 0

holds, the inequality

D_{η} (λ_{η}) > 0

is fulfilled when

μ_{ε} > ϕ_{1} / (\sqrt{ϕ_{2}} - ϕ_{2}) .

Then, the ZOIPS series

(η_{t})

will be overdispersed, which is the same result as for the Poisson ZOINAR process introduced in Qi et al. [21].

Finally, using Equations (1), (5) and (6), the first-order PGF of the RVs

(η_{t})

can be easily obtained. After some simple computations, for an arbitrary

u \in [- 1, 1]

, it follows that:

\begin{matrix} Ψ_{η} (u; λ_{η}) & : = E [u^{η_{t}}] = \sum_{k = 0}^{\infty} u^{k} p_{η} (k; λ_{η}) = ϕ_{0} + ϕ_{1} u + ϕ_{2} \sum_{k = 0}^{\infty} u^{k} p_{ε} (k; θ) \\ = ϕ_{0} + ϕ_{1} u + ϕ_{2} \frac{f (θ u)}{f (θ)} . \end{matrix}

(10)
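
Equations (6), (9) and (10) can be cross-checked against each other. The sketch below does this for a Poisson PS component, for which g(θ) = θ, so that g′(θ) = 1 and g″(θ) = 0; this choice, the truncation of the support at 100 and all function names are assumptions made only for the illustration.

```python
from math import exp, factorial

def poisson_pmf(x, theta):
    return theta ** x * exp(-theta) / factorial(x)

def zoips_pmf(x, theta, phi0, phi1):
    """PMF (6): PS part scaled by phi2, plus extra mass at 0 and at 1."""
    phi2 = 1 - phi0 - phi1
    extra = phi0 if x == 0 else phi1 if x == 1 else 0.0
    return extra + phi2 * poisson_pmf(x, theta)

def zoips_pgf(u, theta, phi0, phi1):
    """PGF (10) with f(theta) = exp(theta)."""
    return phi0 + phi1 * u + (1 - phi0 - phi1) * exp(theta * (u - 1))

def zoips_mean_var(theta, phi0, phi1):
    """Mean and variance (9) in the Poisson case (g' = 1, g'' = 0)."""
    phi2 = 1 - phi0 - phi1
    mean = phi1 + phi2 * theta
    return mean, mean - mean ** 2 + phi2 * theta ** 2

theta, phi0, phi1 = 5.0, 0.45, 0.45      # values used in Figure 1
support = range(100)                      # truncation of the infinite support
pmf = [zoips_pmf(x, theta, phi0, phi1) for x in support]
mean = sum(x * p for x, p in zip(support, pmf))
var = sum(x * x * p for x, p in zip(support, pmf)) - mean ** 2
print(sum(pmf), mean, var, zoips_mean_var(theta, phi0, phi1))
```

The moments computed directly from the PMF agree with the closed-form expressions up to rounding error.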

Assuming that the aforementioned notations are valid, we now introduce the INAR-based time series with the ZOIPS-distributed innovations

(η_{t})

.

Definition 3.

The time series

(X_{t})

,

t \in Z

, represents an INAR(1) process with ZOIPS innovations or, simply, a ZOIPS-INAR(1) process, if it fulfills the recurrence relation:

X_{t} = α \circ X_{t - 1} + η_{t} .

(11)

Here,

(η_{t})

is the ZOIPS-distributed time series with the PMF given by (6),

α \in (0, 1)

is an unknown parameter, and

α \circ X : = \sum_{j = 1}^{X} B_{j} (α)

(12)

is the binomial thinning operator. More precisely, for an arbitrary non-negative integer-valued RV X, the RVs

B_{j} = B_{j} (α)

have the Bernoulli distribution

P {B_{j} = 1} = 1 - P {B_{j} = 0} = α

. In addition, RVs

B_{j}

are mutually independent (and also independent of X).
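
A realization of the process in Definition 3 can be generated directly from the recurrence (11). The following is a minimal simulation sketch using only the Python standard library; it assumes a Poisson PS component, a simple product-of-uniforms Poisson sampler, a start at zero and a burn-in of 200 steps, none of which is prescribed by the paper.

```python
import random
from math import exp

def sample_poisson(theta, rng):
    """Knuth's product-of-uniforms sampler; adequate for moderate theta."""
    limit, k, prod = exp(-theta), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_zoips(theta, phi0, phi1, rng):
    """Draw eta_t from the mixture (7) with a Poisson(theta) component."""
    u = rng.random()
    if u < phi0:
        return 0
    if u < phi0 + phi1:
        return 1
    return sample_poisson(theta, rng)

def thin(alpha, x, rng):
    """Binomial thinning (12): sum of x independent Bernoulli(alpha) variables."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def simulate_zoips_inar1(T, alpha, theta, phi0, phi1, seed=0, burn_in=200):
    """Simulate X_t = alpha o X_{t-1} + eta_t, discarding a burn-in segment."""
    rng = random.Random(seed)
    x, path = 0, []
    for t in range(T + burn_in):
        x = thin(alpha, x, rng) + sample_zoips(theta, phi0, phi1, rng)
        if t >= burn_in:
            path.append(x)
    return path

path = simulate_zoips_inar1(20000, alpha=0.5, theta=5.0, phi0=0.45, phi1=0.45)
print(sum(path) / len(path))   # compare with mu_eta / (1 - alpha) = 0.95 / 0.5
```

The parameter values are those of Figure 1; the sample mean is only a rough sanity check and depends on the seed.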

As an illustration, Figure 1 shows the realizations of the ZOIPS series and ZOIPS-INAR(1) process, where a Poisson distribution with the parameter

θ = 5

is taken as the PS distribution. As can be easily seen, although the value of the parameter

θ

is significantly greater than zero (and one), both time series exhibit pronounced zero-and-one inflation.

Figure 1. (a) Dynamics of the ZOIPS series and ZOIPS-INAR(1) process. (b) Empirical frequency distributions of both time series. (Parameter values:

α = 0.5, θ = 5, ϕ_{0} = ϕ_{1} = 0.45

).

3. Stochastic Properties of the ZOIPS-INAR Process

Based on the mentioned properties of the ZOIPS series

(η_{t})

, some special properties of the ZOIPS-INAR(1) process can be shown. First, we compute the k-step conditional measures of

X_{t + k}

on

X_{t}

. Using some well-known properties of binomial thinning (cf. [31,32]) and Equation (11), for the first-step conditional mean one obtains:

E [X_{t + 1} | X_{t}] = α X_{t} + μ_{η} = α X_{t} + ϕ_{1} + ϕ_{2} θ g^{'} (θ),

and for the conditional variance:

\begin{matrix} V a r [X_{t + 1} | X_{t}] & = α (1 - α) X_{t} + σ_{η}^{2} \\ = α (1 - α) X_{t} + μ_{η} - μ_{η}^{2} + ϕ_{2} θ^{2} ({(g^{'} (θ))}^{2} + g^{″} (θ)) . \end{matrix}

In the general case, by induction and after some computation, the k-step conditional measures can be computed for each

k \in N

. Hence, it follows that:

Theorem 1.

Let

(X_{t})

be a ZOIPS-INAR(1) process defined by Equation (11). Then, for each

k = 1, 2, \dots

the k-step conditional mean and variance for the series

(X_{t})

are, respectively,

\begin{matrix} E [X_{t + k} | X_{t}] & = α^{k} X_{t} + \frac{1 - α^{k}}{1 - α} μ_{η}, \\ Var [X_{t + k} | X_{t}] & = α^{k} (1 - α^{k}) X_{t} + \frac{(1 - α^{k}) (α - α^{k})}{1 - α^{2}} μ_{η} + \frac{1 - α^{2 k}}{1 - α^{2}} σ_{η}^{2}, \end{matrix}

(13)

and the autocorrelation function (ACF) at lag k is

ρ (k) = α^{k}

.

According to Equalities (13), when

k \to \infty

, the unconditional mean and the variance of RVs

(X_{t})

can be obtained as follows:

\begin{matrix} μ_{X} & : = E [X_{t}] = lim_{k \to \infty} E [X_{t + k} | X_{t}] = \frac{μ_{η}}{1 - α} = \frac{ϕ_{1} + ϕ_{2} θ g^{'} (θ)}{1 - α}, \\ σ_{X}^{2} & : = Var [X_{t}] = lim_{k \to \infty} Var [X_{t + k} | X_{t}] = \frac{α μ_{η} + σ_{η}^{2}}{1 - α^{2}} \\ & = μ_{X} + \frac{ϕ_{2} (μ_{ε}^{2} + σ_{ε}^{2} - μ_{ε}) - μ_{η}^{2}}{1 - α^{2}} . \end{matrix}
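
The closed forms in (13) and their limits can be verified numerically. The sketch below, with illustrative function names, compares Equation (13) with the result of iterating the one-step conditional mean and variance k times, and then lets k grow to recover the unconditional moments. The numerical values of μ_η and σ_η² correspond to the Poisson case with θ = 5 and ϕ_0 = ϕ_1 = 0.45.

```python
def k_step_moments(x_t, k, alpha, mu_eta, var_eta):
    """Closed-form k-step conditional mean and variance from Equation (13)."""
    ak = alpha ** k
    mean = ak * x_t + (1 - ak) / (1 - alpha) * mu_eta
    var = (ak * (1 - ak) * x_t
           + (1 - ak) * (alpha - ak) / (1 - alpha ** 2) * mu_eta
           + (1 - ak ** 2) / (1 - alpha ** 2) * var_eta)
    return mean, var

def k_step_moments_iterated(x_t, k, alpha, mu_eta, var_eta):
    """Same quantities obtained by iterating the one-step formulas k times."""
    mean, var = float(x_t), 0.0
    for _ in range(k):
        # law of total variance applied to X_{t+j+1} given X_{t+j}
        var = alpha ** 2 * var + alpha * (1 - alpha) * mean + var_eta
        mean = alpha * mean + mu_eta
    return mean, var

alpha, mu_eta, var_eta = 0.5, 0.95, 2.5475   # Poisson case, theta = 5, phi0 = phi1 = 0.45
print(k_step_moments(3, 2, alpha, mu_eta, var_eta))
print(k_step_moments_iterated(3, 2, alpha, mu_eta, var_eta))
print(k_step_moments(3, 200, alpha, mu_eta, var_eta))  # approaches (mu_X, sigma_X^2)
```

For large k the dependence on the starting value X_t vanishes, which is the content of the limits above.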

Remark 1.

It can easily be seen that differences

D_{X} (θ) : = σ_{X}^{2} - μ_{X}

and

D_{η} (θ) = σ_{η}^{2} - μ_{η}

satisfy the equality

D_{X} (θ) = D_{η} (θ) / (1 - α^{2}) .

Thus, similarly to other INAR processes, the series

(X_{t})

and

(η_{t})

share the same dispersion properties, i.e., they are simultaneously over-, equi- or under-dispersed.

In the following, we examine some characteristics of the distribution of the ZOIPS-INAR(1) process. First of all, we derive an integer-valued moving average (INMA) representation of infinite order for the series

(X_{t})

.

Theorem 2.

Let us assume that PS series

(ε_{t})

, defined by Equation (1), has finite moments up to the order two, uniformly bounded for any

θ \in (0, R)

. Then, for any

α \in (0, 1)

, ZOIPS-INAR(1) series

(X_{t})

, defined by Equation (11), has an INMA

(\infty)

representation:

X_{t} \overset{d}{=} \sum_{k = 0}^{\infty} α^{k} \circ η_{t - k} .

(14)

In addition, the sum in (14) converges in the mean-square sense and almost surely.

Proof.

Using the assumptions of the theorem, one can find a constant

C > 0

such that

0 < μ_{ε} (θ) \leq C < + \infty, \forall θ \in (0, R) .

According to this, it follows that:

\sum_{k = 1}^{\infty} P {ε_{t} \geq k} = \sum_{k = 1}^{\infty} k P \{ε_{t} = k\} = μ_{ε} (θ) \leq C < + \infty,

and using the definition of the ZOIPS series

(η_{t})

, given by Equation (6), one obtains:

\sum_{k = 1}^{\infty} P {η_{t} \geq k} = ϕ_{1} + ϕ_{2} \sum_{k = 1}^{\infty} P {ε_{t} \geq k} = ϕ_{1} + ϕ_{2} μ_{ε} (θ) \leq ϕ_{1} + ϕ_{2} C < + \infty .

Hence, the above sum converges uniformly on

θ \in (0, R)

. Further, the sequence

1 / k

,

k = 1, 2, \dots

is monotone and bounded, so Abel’s convergence criterion for infinite sums gives

\sum_{k = 1}^{\infty} \frac{1}{k} P {η_{t} \geq k} = ϕ_{1} + ϕ_{2} \sum_{k = 1}^{\infty} \frac{1}{k} P {ε_{t} \geq k} < + \infty,

(15)

where the convergence above is uniform on

θ \in (0, R)

. According to Theorem 2.1 in Alzaid and Al-Osh [33], condition (15) is sufficient for the equality

Ψ_{X} (u; λ) = \prod_{k = 0}^{\infty} Ψ_{η} (1 + α^{k} (u - 1); λ_{η}) .

(16)

Here,

Ψ_{η} (u; λ_{η}) : = E [u^{η_{t}}]

and

Ψ_{X} (u; λ) : = E [u^{X_{t}}]

are, respectively, the PGFs of ZOIPS series

(η_{t})

and ZOIPS-INAR(1) process

(X_{t})

, and

λ : = {(θ, α, ϕ_{0}, ϕ_{1})}^{'}

is the vector of (unknown) parameters. Moreover, at least on

u \in [- 1, 1]

, the product above converges absolutely. Hence, according to the one-to-one correspondence between the PMFs and PGFs of discrete RVs, it follows that the INMA

(\infty)

representation in (14) is equivalent to (16).

To prove the second part of the theorem, note that, using Equation (11), for any

n \in N

it holds that:

X_{t} \overset{d}{=} α^{n} \circ X_{t - n} + \sum_{j = 0}^{n - 1} α^{j} \circ η_{t - j} .

(17)

According to this, as well as the well-known properties of the binomial thinning operator [31,32], it follows that:

E {[X_{t} - \sum_{j = 0}^{n - 1} α^{j} \circ η_{t - j}]}^{2} = E {[α^{n} \circ X_{t - n}]}^{2} = α^{2 n} E [X_{t - n}^{2}] + α^{n} (1 - α^{n}) E [X_{t - n}] ⟶ 0, n \to \infty .

Therefore, the mean-square convergence of the sum in (14) is valid.

To prove the almost sure convergence in (14), we define the event

A : = \{lim_{n \to \infty} \sum_{j = 0}^{n - 1} α^{j} \circ η_{t - j} = X_{t}\} .

According to (17) and the definition of the limit value of real functions, we can write the event A as

A : = \{lim_{n \to \infty} α^{n} \circ X_{t - n} = 0\} = ⋂_{δ > 0} ⋃_{k = 1}^{\infty} ⋂_{n = k}^{\infty} \{0 \leq α^{n} \circ X_{t - n} < δ\} = ⋃_{k = 1}^{\infty} A_{k},

where

A_{k} : = ⋂_{n = k}^{\infty} \{α^{n} \circ X_{t - n} = 0\} .

Again using (17), for each (fixed)

k = 1, 2, \dots

, and

m = 1, 2, \dots

, one obtains:

α^{k} \circ X_{t - k} \overset{d}{=} α^{m + k} \circ X_{t - m - k} + \sum_{n = k}^{m + k - 1} α^{n} \circ η_{t - n},

where the expression on the right is a sum of independent RVs. Thus, applying the continuity of probability and the definition of the thinning operator (12), for events

A_{k}

we have:

\begin{matrix} P (A_{k}) & = lim_{m \to \infty} (P \{α^{m + k} \circ X_{t - m - k} = 0\} \times \prod_{n = k}^{m + k - 1} P \{α^{n} \circ η_{t - n} = 0\}) \\ = lim_{m \to \infty} (\sum_{j = 0}^{\infty} {(1 - α^{m + k})}^{j} P \{X_{t - m - k} = j\}) \times lim_{m \to \infty} \prod_{n = k}^{m + k - 1} (\sum_{j = 0}^{\infty} {(1 - α^{n})}^{j} P \{η_{t - n} = j\}) \\ = lim_{m \to \infty} Ψ_{X} (1 - α^{m + k}; λ) \times lim_{m \to \infty} \prod_{n = k}^{m + k - 1} Ψ_{η} (1 - α^{n}; λ_{η}) \\ = Ψ_{X} (1; λ) \times \prod_{n = k}^{\infty} Ψ_{η} (1 - α^{n}; λ_{η}) \\ = \prod_{n = k}^{\infty} Ψ_{η} (1 - α^{n}; λ_{η}) . \end{matrix}

By re-applying the continuity property of the probability, as well as the convergence of the product in (16), it follows that

P (A) = lim_{k \to \infty} P (A_{k}) = 1,

i.e., the sum in (14) converges almost surely. □

Remark 2.

Using a similar procedure as in the previous theorem and some general results about the PGFs of non-negative discrete-valued stationary time series (cf. Stojanović et al. [29]), an explicit expression for PGFs of the ZOIPS-INAR(1) process can be obtained. According to Equations (10) and (16), for arbitrary

α \in (0, 1)

, the series

(X_{t})

has a PGF of the first order:

Ψ_{X} (u; λ) = \prod_{k = 0}^{\infty} (ϕ_{0} + ϕ_{1} (1 + α^{k} (u - 1)) + ϕ_{2} \frac{f ((1 + α^{k} (u - 1)) θ)}{f (θ)}) .

(18)

Furthermore, suppose that

u = {(u_{1}, \dots, u_{r})}^{'} \in R^{r}

,

r \geq 2

and

X_{t}^{(r)} : = {(X_{t}, \dots, X_{t + r - 1})}^{'}

,

t \in Z

are the so-called overlapping blocks of series

(X_{t})

. Putting

k = 1, \dots, r - 1

into Equation (17), after some calculations, an explicit expression for the r-dimensional PGF of the random vector

X_{t}^{(r)}

can be obtained:

\begin{matrix} Ψ_{X}^{(r)} (u; λ) & : = E [u_{1}^{X_{t}} \dots u_{r}^{X_{t + r - 1}}] \\ = Ψ_{X} (\prod_{k = 0}^{r - 1} (1 + α^{k} (u_{k + 1} - 1)); λ) \prod_{ℓ = 2}^{r} Ψ_{η} (\prod_{k = 0}^{r - ℓ} (1 + α^{k} (u_{k + ℓ} - 1)); λ_{η}) . \end{matrix}

(19)

It can be noted that the PGF

Ψ_{X}^{(r)} (u; λ)

of the order

r = 2

will be used in the estimation of parameters of the ZOIPS-INAR(1) process (see Section 4 below).
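
As a numerical illustration of Remark 2, the sketch below evaluates the first-order PGF (18) by truncating the infinite product, and writes out Equation (19) for r = 2, where it reduces to Ψ_X(u_1 (1 + α (u_2 − 1)); λ) Ψ_η(u_2; λ_η). A Poisson PS component, the truncation after 200 factors and the function names are assumptions of this example.

```python
from math import exp

def pgf_eta(u, theta, phi0, phi1):
    """PGF (10) of the innovations with a Poisson(theta) component."""
    return phi0 + phi1 * u + (1 - phi0 - phi1) * exp(theta * (u - 1))

def pgf_x(u, alpha, theta, phi0, phi1, terms=200):
    """First-order PGF (18), with the infinite product truncated after `terms` factors."""
    prod = 1.0
    for k in range(terms):
        prod *= pgf_eta(1 + alpha ** k * (u - 1), theta, phi0, phi1)
    return prod

def pgf_x2(u1, u2, alpha, theta, phi0, phi1):
    """Two-dimensional PGF: Equation (19) written out for r = 2."""
    inner = u1 * (1 + alpha * (u2 - 1))
    return pgf_x(inner, alpha, theta, phi0, phi1) * pgf_eta(u2, theta, phi0, phi1)

lam = dict(alpha=0.5, theta=5.0, phi0=0.45, phi1=0.45)
h = 1e-5
slope = (pgf_x(1 + h, **lam) - pgf_x(1 - h, **lam)) / (2 * h)
print(pgf_x(1.0, **lam), slope)   # slope at u = 1 approximates the mean of X_t
print(pgf_x2(0.3, 1.0, **lam), pgf_x2(1.0, 0.3, **lam), pgf_x(0.3, **lam))
```

Setting either argument of the two-dimensional PGF to one recovers the first-order PGF, as expected for a stationary series. For α close to one, more factors would be needed in the truncated product.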

Let us now consider the Markov properties and marginal distribution of our model.

Theorem 3.

Let

(η_{t})

be the ZOIPS series, with the PMF given by (6). Then, the process

(X_{t})

, defined by (11), is a homogeneous Markovian process with the first-step transition probabilities:

\begin{matrix} p_{i j} & : = P \{X_{t} = j | X_{t - 1} = i\} = ϕ_{0} (\binom{i}{j}) α^{j} {(1 - α)}^{i - j} I (j \leq i) \\ + ϕ_{1} (\binom{i}{j - 1}) α^{j - 1} {(1 - α)}^{i - j + 1} I (1 \leq j \leq i + 1) + ϕ_{2} \sum_{x = 0}^{m} (\binom{i}{x}) α^{x} {(1 - α)}^{i - x} p_{ε} (j - x; θ), \end{matrix}

(20)

where

i, j \in S

and

m = min {i, j}

.

Proof.

According to Equation (11) and the definition of binomial thinning, the conditional distribution of

X_{t}

at a given

X_{t - 1}

can be expressed as follows:

\begin{matrix} p_{i j} & : = P \{α \circ X_{t - 1} + η_{t} = j | X_{t - 1} = i\} = \sum_{x = 0}^{m} P \{α \circ X_{t - 1} = x | X_{t - 1} = i\} P {η_{t} = j - x} \\ = \sum_{x = 0}^{m} (\binom{i}{x}) α^{x} {(1 - α)}^{i - x} p_{η} (j - x; λ_{η}) = \sum_{x = 0}^{m} p_{i} (x; α) p_{η} (j - x; λ_{η}), \end{matrix}

where

p_{i} (x; α) : = (\binom{i}{x}) α^{x} {(1 - α)}^{i - x}, x = 0, 1, \dots, i

is the PMF of the binomial

B (i; α)

distribution. Using the definition of the ZOIPS distribution, that is, the PMF of RVs

(η_{t})

given by Equation (7), one obtains:

p_{i j} = \sum_{x = 0}^{m} p_{i} (x; α) [ϕ_{0} p_{0} (j - x) + ϕ_{1} p_{1} (j - x) + ϕ_{2} p_{ε} (j - x; θ)],

where

p_{0} (k)

and

p_{1} (k)

are, respectively, the PMFs of

I_{0} \overset{a s}{=} 0

and

I_{1} \overset{a s}{=} 1

. Based on these, we obtain:

\begin{matrix} p_{i j} & = ϕ_{0} \sum_{x = 0}^{m} p_{i} (x; α) I (j = x) + ϕ_{1} \sum_{x = 0}^{m} p_{i} (x; α) I (j = x + 1) + ϕ_{2} \sum_{x = 0}^{m} p_{i} (x; α) p_{ε} (j - x; θ) \\ & = ϕ_{0} p_{i} (j; α) I (j \leq m) + ϕ_{1} p_{i} (j - 1; α) I (1 \leq j \leq m + 1) + ϕ_{2} \sum_{x = 0}^{m} p_{i} (x; α) p_{ε} (j - x; θ) . \end{matrix}

It is easy to see that the obtained equality is equivalent to (20), which proves the theorem. □
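
The transition probabilities (20) are straightforward to evaluate. The sketch below does so for a Poisson PS component (an assumption of this example, as are the function names) and checks that a row of the transition matrix sums to one after truncating the state space.

```python
from math import comb, exp, factorial

def poisson_pmf(x, theta):
    return theta ** x * exp(-theta) / factorial(x) if x >= 0 else 0.0

def transition_prob(i, j, alpha, theta, phi0, phi1):
    """One-step transition probability p_ij from Equation (20), Poisson component."""
    phi2 = 1 - phi0 - phi1

    def binom_pmf(x):
        # PMF of B(i; alpha); zero outside 0..i, which encodes the indicators in (20)
        return comb(i, x) * alpha ** x * (1 - alpha) ** (i - x) if 0 <= x <= i else 0.0

    conv = sum(binom_pmf(x) * poisson_pmf(j - x, theta) for x in range(min(i, j) + 1))
    return phi0 * binom_pmf(j) + phi1 * binom_pmf(j - 1) + phi2 * conv

lam = dict(alpha=0.5, theta=5.0, phi0=0.45, phi1=0.45)
row = [transition_prob(3, j, **lam) for j in range(80)]
print(sum(row))   # each row of the transition matrix should sum to (approximately) one
```

The same function gives the ingredients of a conditional likelihood, since the likelihood of an observed path given its first value is the product of the corresponding p_ij.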

Remark 3.

Recall that, for zero-and-one inflated distributions, the condition

0, 1 \in S

is usually taken. It follows that the first singular part in Equation (20) exists if and only if

(X_{t})

passes from the state

X_{t - 1} = i

to the non-increasing state

X_{t} = j \leq i

. Similarly, the second singular part exists if and only if

1 \leq X_{t} = j \leq i + 1

. Finally, the transition probabilities (20), as well as the use of the conditional probability, give the marginal PMF of the ZOIPS-INAR(1) process:

\begin{matrix} p_{X} (x; λ) & : = P \{X_{t} = x\} = \sum_{k \in S} P \{X_{t} = x | X_{t - 1} = k\} P \{X_{t - 1} = k\} \\ = ϕ_{0} {(\frac{α}{1 - α})}^{x} \sum_{k = 0}^{\infty} (\binom{k}{x}) {(1 - α)}^{k} I (x \leq k) p_{X} (k; λ) \\ + ϕ_{1} {(\frac{α}{1 - α})}^{x - 1} \sum_{k = 0}^{\infty} (\binom{k}{x - 1}) {(1 - α)}^{k} I (1 \leq x \leq k + 1) p_{X} (k; λ) \\ + ϕ_{2} \sum_{k = 0}^{\infty} p_{X} (k; λ) \sum_{j = 0}^{m} p_{k} (j; α) p_{ε} (x - j; θ) . \end{matrix}

Thus, the ZOIPS-INAR(1) process

(X_{t})

is a strictly stationary and ergodic time series.

At the end of this section, similar to Qi et al. [21] and Mohammadi et al. [22], we examine the distribution of the lengths of zero and one runs in the ZOIPS-INAR(1) process. Starting with the basic results for zero-inflated INAR processes (cf. Jazi et al. [34], Wang et al. [35]), we observe the distribution of the “runs” of zeros (resp. ones) in the ZOIPS-INAR(1) process. Here, a “run” is defined as the number of consecutive zeros (resp. ones) between two non-zero (resp. non-one) values. In the following statements, we give the expected lengths and proportions of zeros and ones in our model.

Theorem 4.

The expected lengths of the runs of zeros (resp. ones) for a ZOIPS-INAR(1) process are, respectively, given by:

\begin{matrix} L_{0} & = \frac{ϕ_{0} f (θ) + ϕ_{2} a (0)}{(1 - ϕ_{0}) f (θ) - ϕ_{2} a (0)}, \\ L_{1} & = \frac{α [ϕ_{0} f (θ) + ϕ_{2} a (0)] + (1 - α) [ϕ_{1} f (θ) + ϕ_{2} a (1) θ]}{α [(1 - ϕ_{0}) f (θ) - ϕ_{2} a (0)] + (1 - α) [(1 - ϕ_{1}) f (θ) - ϕ_{2} a (1) θ]}, \end{matrix}

(21)

where

a (x)

and

f (θ)

are the functions introduced in Definition 1 for the PS distributions.

Proof.

According to Equation (20), the transition probabilities from zero to zero, and from zero to non-zero values are, respectively,

\begin{matrix} p_{0}^{*} & : = P \{X_{t} = 0 | X_{t - 1} = 0\} = ϕ_{0} + ϕ_{2} p_{ε} (0; θ), \\ 1 - p_{0}^{*} & = 1 - ϕ_{0} - ϕ_{2} p_{ε} (0; θ) = ϕ_{1} + ϕ_{2} [1 - p_{ε} (0; θ)] . \end{matrix}

Since the zero run length is defined as the number of zeros between two non-zero values, it can easily be seen to follow a geometric distribution with parameter

1 - p_{0}^{*}

. Therefore, the expected length of a zero run is

L_{0} = \frac{p_{0}^{*}}{1 - p_{0}^{*}} = \frac{ϕ_{0} + ϕ_{2} p_{ε} (0; θ)}{ϕ_{1} + ϕ_{2} [1 - p_{ε} (0; θ)]},

where

p_{ε} (0; θ) = a (0) / f (θ)

, and the first equality in (21) immediately follows. Similarly, the transition probabilities from one to one and from one to non-one values are, respectively,

\begin{matrix} p_{1}^{*} & : = P \{X_{t} = 1 | X_{t - 1} = 1\} = α [ϕ_{0} + ϕ_{2} p_{ε} (0; θ)] + (1 - α) [ϕ_{1} + ϕ_{2} p_{ε} (1; θ)], \\ 1 - p_{1}^{*} & = α [1 - ϕ_{0} - ϕ_{2} p_{ε} (0; θ)] + (1 - α) [1 - ϕ_{1} - ϕ_{2} p_{ε} (1; θ)] . \end{matrix}

Applying the same procedure as before, the second equality in (21) is easily obtained. □

Let us point out that, in the same way as in similar INAR-based models, the expected length of the zero runs in the ZOIPS-INAR(1) process is independent of the parameter

α

.
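
The formulas in (21) can be evaluated as follows. This sketch assumes a Poisson PS component, so that f(θ) = e^θ and a(0) = a(1) = 1; the function name is illustrative. Printing the result for several values of α shows that only L_1 changes.

```python
from math import exp

def expected_run_lengths(alpha, theta, phi0, phi1):
    """Expected zero- and one-run lengths (21) for a Poisson PS component."""
    phi2 = 1 - phi0 - phi1
    f, a0, a1 = exp(theta), 1.0, 1.0           # f(theta) = e^theta, a(0) = a(1) = 1
    zero_mass = phi0 * f + phi2 * a0            # f(theta) * P{eta_t = 0}
    one_mass = phi1 * f + phi2 * a1 * theta     # f(theta) * P{eta_t = 1}
    l0 = zero_mass / (f - zero_mass)
    l1 = (alpha * zero_mass + (1 - alpha) * one_mass) / (
        alpha * (f - zero_mass) + (1 - alpha) * (f - one_mass))
    return l0, l1

for alpha in (0.2, 0.5, 0.8):
    print(alpha, expected_run_lengths(alpha, theta=5.0, phi0=0.45, phi1=0.45))
```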

Theorem 5.

The proportions of zeros and ones in the ZOIPS-INAR(1) process are, respectively,

\begin{matrix} P \{X_{t} = 0\} & = \prod_{k = 0}^{\infty} (ϕ_{0} + ϕ_{1} (1 - α^{k}) + ϕ_{2} \frac{f ((1 - α^{k}) θ)}{f (θ)}), \\ P \{X_{t} = 1\} & = \sum_{j = 0}^{\infty} [α^{j} (ϕ_{1} + ϕ_{2} \frac{θ f^{'} ((1 - α^{j}) θ)}{f (θ)}) \prod_{k \neq j} (ϕ_{0} + ϕ_{1} (1 - α^{k}) + ϕ_{2} \frac{f ((1 - α^{k}) θ)}{f (θ)})] . \end{matrix}

Proof.

Using the well-known properties of the PGFs of discrete-valued RVs

(X_{t})

, for an arbitrary

k = 0, 1, \dots

the PMF of

(X_{t})

can be expressed as follows:

P \{X_{t} = k\} = \frac{1}{k!} \frac{\partial^{k} Ψ_{X} (u; λ)}{\partial u^{k}} |_{u = 0} .

(22)

From here, using expression (18) for the first-order PGF of the ZOIPS-INAR process, and replacing

k = 0, 1

in Equation (22), the statement of the theorem immediately follows. □
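
A short numerical check of Theorem 5 is sketched below for a Poisson PS component, for which f((1 − α^k) θ)/f(θ) = exp(−α^k θ) and θ f′((1 − α^j) θ)/f(θ) = θ exp(−α^j θ). The infinite product and sum are truncated after 200 terms, and P{X_t = 1} is compared with a central finite difference of the PGF (18) at u = 0. The truncation level, step size and function names are assumptions of this example.

```python
from math import exp

def pgf_x(u, alpha, theta, phi0, phi1, terms=200):
    """Truncated first-order PGF (18) with a Poisson(theta) component."""
    phi2, prod = 1 - phi0 - phi1, 1.0
    for k in range(terms):
        v = 1 + alpha ** k * (u - 1)
        prod *= phi0 + phi1 * v + phi2 * exp(theta * (v - 1))
    return prod

def zero_one_proportions(alpha, theta, phi0, phi1, terms=200):
    """P{X_t = 0} and P{X_t = 1} from Theorem 5 for a Poisson PS component."""
    phi2 = 1 - phi0 - phi1
    # k-th factor of the product and its derivative in u, both evaluated at u = 0
    factors = [phi0 + phi1 * (1 - alpha ** k) + phi2 * exp(-alpha ** k * theta)
               for k in range(terms)]
    derivs = [alpha ** k * (phi1 + phi2 * theta * exp(-alpha ** k * theta))
              for k in range(terms)]
    p0 = 1.0
    for c in factors:
        p0 *= c
    p1 = sum(d * p0 / c for c, d in zip(factors, derivs))
    return p0, p1

lam = dict(alpha=0.5, theta=5.0, phi0=0.45, phi1=0.45)
p0, p1 = zero_one_proportions(**lam)
h = 1e-5
p1_fd = (pgf_x(h, **lam) - pgf_x(-h, **lam)) / (2 * h)
print(p0, p1, p1_fd)
```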

4. Parameter Estimation Procedure

Due to the specific structure, the parameter estimation procedure of the ZOIPS-INAR(1) model is more complex than for the ordinary INAR model. The main reason is that the basic INAR models with ‘ordinary’ PS innovations

(ε_{t})

have (only) two unknown parameters

θ, α

. However, in the non-trivial case, the ZOIPS-INAR(1) process has two additional parameters

ϕ_{0}, ϕ_{1} \in (0, 1]

. As a consequence of this structure, even some simpler types of estimators, such as YW estimators, cannot be obtained by simple calculation. In previous works on ZOINAR processes, Qi et al. [21] proposed the CML estimation method, while Mohammadi et al. [22] additionally described the CLS estimates. In the following, CLS estimators will also be used as initial values for the PGF estimation procedure, which is now considered in more detail.

We emphasize once again that the general aspect of the PGF method was recently described by Stojanović et al. [29]. Accordingly, a specific PGF estimation procedure is examined here, in the case of the ZOIPS-INAR(1) process. The basic idea of the PGF method is close to the empirical characteristic function (ECF) estimation method introduced in [36,37]. It is based on minimizing the ‘distance’ between the theoretical PGF of order

r \in N

, defined by Equations (18) and (19), and the corresponding empirical PGF:

{\tilde{Ψ}}_{T}^{(r)} (u) : = \frac{1}{T - r + 1} \sum_{t = 1}^{T - r + 1} u_{1}^{X_{t}} \dots u_{r}^{X_{t + r - 1}},

where

{X_{1}, X_{2}, \dots, X_{T}}

is some finite realization of the ZOIPS-INAR series

(X_{t})

. Since the ZOIPS-INAR(1) series

(X_{t})

is stationary and ergodic, it follows that:

E [{\tilde{Ψ}}_{T}^{(r)} (u)] = Ψ_{X}^{(r)} (u; λ_{0}),

(23)

where

λ_{0} \in R^{4}

is the (unknown) true value of parameter

λ = {(θ, α, ϕ_{0}, ϕ_{1})}^{'}

. Thus,

{\tilde{Ψ}}_{T}^{(r)} (u)

is an unbiased estimator of

Ψ_{X}^{(r)} (u; λ_{0})

. Further, as the PGF

Ψ_{X}^{(r)} (u; λ)

is well-defined at least for all

u \in {[- 1, 1]}^{r}

, the objective (minimization) function can be defined as follows:

S_{T}^{(r)} (λ) : = \int_{- 1}^{1} \dots \int_{- 1}^{1} ω (u) {(Ψ_{X}^{(r)} (u; λ) - {\tilde{Ψ}}_{T}^{(r)} (u))}^{2} d u,

(24)

where

d u : = d u_{1} \dots d u_{r}

and

ω : R^{r} \to R^{+}

is a weight function, integrable on

{[- 1, 1]}^{r}

.
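The empirical PGF defined above is a plain sample average and can be sketched in a few lines of Python. The function name and argument layout below are illustrative assumptions; `x` is the observed realization and `u` is the evaluation point, whose length sets the order r.

```python
def empirical_pgf(x, u):
    """Empirical PGF of order r = len(u) for a sample x of non-negative integers."""
    r = len(u)
    n = len(x) - r + 1  # number of overlapping r-tuples (X_t, ..., X_{t+r-1})
    total = 0.0
    for t in range(n):
        term = 1.0
        for i in range(r):
            term *= u[i] ** x[t + i]  # note: Python evaluates 0 ** 0 as 1
        total += term
    return total / n
```

For r = 1 and u = 0 the value equals the sample proportion of zeros, which is the quantity used later in the zero-and-one inflation test.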

The PGF estimators are then obtained by minimizing the objective function (24) with respect to

λ

. In other words, they represent the solutions of the following minimization equation:

{\hat{λ}}_{T} = \operatorname{arg\,min}_{λ \in Λ} S_{T}^{(r)} (λ),

(25)

where

Λ = (0, R) \times {(0, 1)}^{3} \subset R^{4}

is a regular parameter space of the ZOIPS-INAR(1) process. To solve Equation (25), numerical integration procedures have been used, which are discussed in Section 5. In the following, under some regularity conditions, we examine the consistency and asymptotic normality (AN) of the PGF estimators.

Theorem 6.

Let

λ_{0}

be the exact value of the parameter λ, and

{\hat{λ}}_{T}

, for arbitrary

T = 1, 2, \dots

, the solutions of Equation (25). In addition, assume that the following regularity conditions are fulfilled:

$(A_{1})$: $λ_{0} \in Λ$ and ${\hat{λ}}_{T} \in Λ$ for large enough T.
$(A_{2})$: At the point $λ = λ_{0}$ function

$S_{0}^{(r)} (λ) : = \int_{- 1}^{1} \dots \int_{- 1}^{1} ω (u) {(Ψ_{X}^{(r)} (u; λ) - Ψ_{X}^{(r)} (u; λ_{0}))}^{2} d u$

has a unique minimum $S_{0}^{(r)} (λ_{0}) = 0$ .
$(A_{3})$: $\frac{\partial^{2} S_{T}^{(r)} (λ_{0})}{\partial λ \partial λ^{'}}$ is a regular matrix.
$(A_{4})$: $\frac{\partial Ψ_{X}^{(r)} (u; λ_{0})}{\partial λ} \frac{\partial Ψ_{X}^{(r)} (u; λ_{0})}{\partial λ^{'}}$ is a non-zero matrix uniformly bounded by a positive ω-integrable function $W : R^{r} \to R^{+}$ .

Then, the estimator

{\hat{λ}}_{T}^{(r)}

is strongly consistent and AN for the parameter λ.

Proof.

To prove the consistency of the estimator

{\hat{λ}}_{T}

, we firstly check the sufficient conditions for the consistency of the extremum estimators (cf. Newey and McFadden [38]). As it was previously shown that the ZOIPS-INAR(1) series

(X_{t})

is ergodic, Equation (23) and the strong law of large numbers (SLLN) yield:

{\tilde{Ψ}}_{T}^{(r)} (u) \overset{as}{⟶} Ψ_{X}^{(r)} (u; λ_{0}), T \to \infty,

(26)

where “as” denotes the almost sure convergence. Further, under assumption

(A_{1})

, the closed set

\bar{Λ} = [0, R] \times {[0, 1]}^{3}

is compact, and

λ_{0}

belongs to its interior. Hence, the functions

Ψ_{X}^{(r)} (u; λ)

and

{\tilde{Ψ}}_{T}^{(r)} (u)

are continuous on the compacts

{[- 1, 1]}^{r} \times \bar{Λ}

and

{[- 1, 1]}^{r}

, respectively, and therefore, there exist constants

C_{1}, C_{2} > 0

such that

max_{(u; λ) \in {[- 1, 1]}^{r} \times \bar{Λ}} | Ψ_{X}^{(r)} (u; λ) | \leq C_{1} < + \infty, max_{u \in {[- 1, 1]}^{r}} | {\tilde{Ψ}}_{T}^{(r)} (u) | \leq C_{2} < + \infty .

According to these, similarly as in Stojanović et al. [29], one obtains

\begin{matrix} | S_{T}^{(r)} (λ) - S_{0}^{(r)} (λ) | & \leq (3 C_{1} + C_{2}) \int_{- 1}^{1} \dots \int_{- 1}^{1} ω (u) | {\tilde{Ψ}}_{T}^{(r)} (u) - Ψ_{X}^{(r)} (u; λ_{0}) | d u, \end{matrix}

so the last inequality and Equation (26) imply

sup_{λ \in \bar{Λ}} | S_{T}^{(r)} (λ) - S_{0}^{(r)} (λ) | \overset{as}{⟶} 0, T \to + \infty .

Thus, the function

S_{T}^{(r)} (λ)

converges uniformly and almost surely to

S_{0}^{(r)} (λ)

. According to these, as well as assumption

(A_{2})

and Theorem 2.1 in Newey and McFadden [38], it follows that:

{\hat{λ}}_{T}^{(r)} - λ_{0} \overset{as}{⟶} 0, T \to + \infty,

that is, the estimator

{\hat{λ}}_{T}

is strongly consistent for

λ

.

To prove the AN property of the estimator

{\hat{λ}}_{T}

, note that the first two orders of partial derivatives of the function

S_{T}^{(r)} (λ)

are continuous functions. Therefore, using the Taylor expansion for

\partial S_{T}^{(r)} (λ) / \partial λ

at the point

λ = λ_{0}

, one obtains:

\frac{\partial S_{T}^{(r)} (λ)}{\partial λ} = \frac{\partial S_{T}^{(r)} (λ_{0})}{\partial λ} + \frac{\partial^{2} S_{T}^{(r)} (λ_{0})}{\partial λ \partial λ^{'}} (λ - λ_{0}) + o (λ - λ_{0}) .

By replacing

λ

with

{\hat{λ}}_{T}

, for sufficiently large T, under assumption

(A_{3})

and the fact that

\partial S_{T}^{(r)} ({\hat{λ}}_{T}) / \partial λ = 0

, we have:

{\hat{λ}}_{T}^{(r)} - λ_{0} = - {[\frac{\partial^{2} S_{T}^{(r)} (λ_{0})}{\partial λ \partial λ^{'}}]}^{- 1} \frac{\partial S_{T}^{(r)} (λ_{0})}{\partial λ} + o ({\hat{λ}}_{T} - λ_{0}) .

(27)

Furthermore, according to the properties mentioned above, the function

S_{T}^{(r)} (λ)

can be differentiated under the integral sign, which gives:

\frac{\partial S_{T}^{(r)} (λ)}{\partial λ} = 2 \int_{- 1}^{1} \dots \int_{- 1}^{1} ω (u) [Ψ_{X}^{(r)} (u; λ) - {\tilde{Ψ}}_{T}^{(r)} (u)] \frac{\partial Ψ_{X}^{(r)} (u; λ)}{\partial λ} d u,

(28)

\frac{\partial^{2} S_{T}^{(r)} (λ)}{\partial λ \partial λ^{'}} = 2 \int_{- 1}^{1} \dots \int_{- 1}^{1} ω (u) \{\frac{\partial Ψ_{X}^{(r)} (u; λ)}{\partial λ} \frac{\partial Ψ_{X}^{(r)} (u; λ)}{\partial λ^{'}} + [Ψ_{X}^{(r)} (u; λ) - {\tilde{Ψ}}_{T}^{(r)} (u)] \frac{\partial^{2} Ψ_{X}^{(r)} (u; λ)}{\partial λ \partial λ^{'}}\} d u .

(29)

By taking the mathematical expectations in Equations (28) and (29), one obtains:

E [\frac{\partial S_{T}^{(r)} (λ_{0})}{\partial λ}] = 0_{r \times 1}, E [\frac{\partial^{2} S_{T}^{(r)} (λ_{0})}{\partial λ \partial λ^{'}}] = 2 V,

(30)

where, according to Equation (23), it follows that:

V = \int_{- 1}^{1} \dots \int_{- 1}^{1} ω (u) \frac{\partial Ψ_{X}^{(r)} (u; λ_{0})}{\partial λ} \frac{\partial Ψ_{X}^{(r)} (u; λ_{0})}{\partial λ^{'}} d u .

According to assumption

(A_{4})

, there exists an

ω

-integrable function

W : R^{r} \to R^{+}

such that

0 < ∥\frac{\partial Ψ_{X}^{(r)} (u; λ_{0})}{\partial λ} \frac{\partial Ψ_{X}^{(r)} (u; λ_{0})}{\partial λ^{'}}∥ \leq W (u), u \in {[- 1, 1]}^{r},

where

∥\cdot∥

is the matrix norm on

R^{r} \times R^{r}

consistent with the Euclidean vector norm. Hence, inequalities

0 < ∥V∥ \leq \int_{- 1}^{1} \dots \int_{- 1}^{1} ω (u) W (u) d u < + \infty

hold, so Equation (30) and SLLN give

(\frac{\partial S_{T}^{(r)} (λ_{0})}{\partial λ}, \frac{\partial^{2} S_{T}^{(r)} (λ_{0})}{\partial λ \partial λ^{'}}) \overset{a s}{⟶} (0, 2 V), T \to + \infty .

(31)

Further, note that Equation (28) for the gradient of function

S_{T}^{(r)} (λ)

can be written as

\frac{\partial S_{T}^{(r)} (λ)}{\partial λ} = \frac{2}{T - r + 1} \sum_{t = 1}^{T - r + 1} K_{t} (λ),

(32)

where

K_{t} (λ) : = \int_{- 1}^{1} \dots \int_{- 1}^{1} ω (u) [Ψ_{X}^{(r)} (u; λ) - u_{1}^{X_{t}} \dots u_{r}^{X_{t + r - 1}}] \frac{\partial Ψ_{X}^{(r)} (u; λ)}{\partial λ} d u .

(33)

Thereby, according to Equation (23), the equality

E [K_{t} (λ_{0})] = 0

holds. It can also be shown (cf. Stojanović et al. [39,40]) that

\begin{matrix} W^{2} & : = & lim_{T \to \infty} Var [\frac{1}{\sqrt{T - r + 1}} \sum_{t = 1}^{T - r + 1} K_{t} (λ_{0})] = lim_{T \to \infty} \frac{1}{T - r + 1} E {[\sum_{t = 1}^{T - r + 1} K_{t} (λ_{0})]}^{2} \\ & = & lim_{T \to \infty} \frac{1}{T - r + 1} \sum_{t = 1}^{T - r + 1} \sum_{s = 1}^{T - r + 1} Cov [K_{t} (λ_{0}), K_{s} (λ_{0})] \end{matrix}

is a finite non-zero limit if the covariance function

γ (k) : = Cov (X_{t}, X_{t + k})

, when

k = 0, \pm 1, \pm 2, \dots

, is absolutely summable. In the case of the ZOIPS-INAR

(1)

process, using Theorem 1 we obtain:

C : = \sum_{k = - \infty}^{+ \infty} γ (k) = γ (0) \sum_{k = - \infty}^{+ \infty} ρ (k) = σ_{X}^{2} (2 \sum_{k = 0}^{+ \infty} α^{k} - 1) = σ_{X}^{2} \frac{1 + α}{1 - α} .

Thus,

0 < C < + \infty

holds for any

λ \in Λ

. By applying the central limit theorem for stationary processes (cf. Billingsley [41]), together with the convergence established above and Equations (31)–(33), one obtains:

\sqrt{T - r + 1} \frac{\partial S_{T}^{(r)} (λ_{0})}{\partial λ} \overset{d}{⟶} N (0_{r \times 1}, 4 W^{2}), T \to + \infty,

(34)

where “d” denotes the convergence in the distribution. Finally, according to Equations (27), (31) and (34) it follows that:

\sqrt{T - r + 1} ({\hat{λ}}_{T}^{(r)} - λ_{0}) \overset{d}{⟶} N (0_{r \times 1}, V^{- 1} W^{2} V^{- 1}), T \to + \infty,

which completes the proof of the theorem. □

Remark 4.

Using similar considerations as in ECF estimates (cf. Knight & Yu [36]), the PGF procedure for estimating the true values of the parameter

λ = λ_{0}

is based on the realization of the two-dimensional random vector

X_{t}^{(2)} : = {(X_{t}, X_{t + 1})}^{'}

. Then, the objective function

S_{T}^{(2)}

represents a double integral with respect to the weight function

ω : R^{2} \to R^{+}

, and can be numerically approximated by some cubature formulas. For that purpose, it is necessary to determine the two-dimensional PGF (as well as EPGF) of the ZOIPS-INAR(1) series

(X_{t})

. By replacing

r = 2

in Equation (19), and using Equation (18), the two-dimensional theoretical PGF can be obtained as follows:

\begin{matrix} Ψ_{X}^{(2)} (u_{1}, u_{2}; λ) & = Ψ_{X} (u; λ) Ψ_{η} (u_{2}; λ_{η}) \\ & = \prod_{k = 0}^{\infty} (ϕ_{0} + ϕ_{1} (1 + α^{k} (u - 1)) + ϕ_{2} \frac{f ((1 + α^{k} (u - 1)) θ)}{f (θ)}) \\ & \times (ϕ_{0} + ϕ_{1} u_{2} + ϕ_{2} \frac{f (u_{2} θ)}{f (θ)}), \end{matrix}

(35)

where

u = u_{1} (1 + α (u_{2} - 1)) .

As an illustration, Figure 2 presents the theoretical and empirical PGF of the ZOIPS-INAR(1) process with geometric PS innovations, which were obtained using the ‘R’ function “persp()”.

Figure 2. 3D plots of the two-dimensional PGF (a) and the corresponding EPGF (b) of the ZOIPS-INAR(1) process whose innovations are the ZOIPS series with a geometric PS distribution. (Parameter values are

θ = α = 0.5

,

ϕ_{0} = ϕ_{1} = 0.4

).
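A minimal Python sketch of how the two-dimensional PGF in Equation (35) might be evaluated numerically is given below. The infinite product is truncated after `n_terms` factors, and the series function f of the PS distribution is passed in as an argument; all function names and the truncation length are illustrative assumptions.

```python
import math

def pgf_zoips(u, theta, phi0, phi1, f):
    """PGF of a zero-and-one inflated PS innovation with series function f."""
    phi2 = 1.0 - phi0 - phi1
    return phi0 + phi1 * u + phi2 * f(u * theta) / f(theta)

def pgf_x(u, theta, alpha, phi0, phi1, f, n_terms=200):
    """First-order PGF of X_t, approximated by a truncated product."""
    value = 1.0
    for k in range(n_terms):
        value *= pgf_zoips(1.0 + alpha ** k * (u - 1.0), theta, phi0, phi1, f)
    return value

def pgf_x2(u1, u2, theta, alpha, phi0, phi1, f, n_terms=200):
    """Two-dimensional PGF of (X_t, X_{t+1}), following Equation (35)."""
    u = u1 * (1.0 + alpha * (u2 - 1.0))
    return pgf_x(u, theta, alpha, phi0, phi1, f, n_terms) * pgf_zoips(u2, theta, phi0, phi1, f)

def f_poisson(z):
    return math.exp(z)          # Poisson case: f(z) = exp(z)

def f_geometric(z):
    return 1.0 / (1.0 - z)      # geometric case: f(z) = 1 / (1 - z), for z < 1
```

Evaluating `pgf_x2` on a grid over [-1, 1]² would give a surface comparable to panel (a) of Figure 2, under the stated truncation.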

5. Numerical Simulations

In this section, numerical simulations of the proposed PGF procedure for estimating the unknown parameters

λ = {(θ, α, ϕ_{0}, ϕ_{1})}^{'}

of the ZOIPS-INAR(1) process are performed. For this purpose, as previously noted, different PS-distributed series

(ε_{t})

can be observed. As an illustration, but also for practical reasons that are stated in the next section, two different distributions of the series

(ε_{t})

are observed. First it is assumed that RVs

(ε_{t})

have a Poisson distribution, and then they are assumed to have a geometric distribution. For both of these distributions, we generated samples of length

T = 1000

, whose size is close to the length of the COVID-19 count series data that will be analyzed in Section 6. These samples were generated through 500 independent Monte Carlo simulations of the PS series

(ε_{t})

as well as the ZOIPS series

(η_{t})

. After that, according to Equations (11) and (12), the corresponding realizations

{X_{1}, \dots, X_{T}}

of the ZOIPS-INAR

(X_{t})

series were obtained.
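The simulation step can be sketched as follows. Equations (11) and (12) are not reproduced in this section, so the sketch assumes the standard binomial-thinning recursion X_t = α∘X_{t-1} + η_t with zero-and-one inflated Poisson innovations; the function names, the burn-in length and the use of Knuth's multiplication method for Poisson sampling are illustrative choices rather than the authors' implementation.

```python
import math
import random

def sample_poisson(theta, rng):
    """Knuth's multiplication method for a Poisson(theta) variate."""
    limit = math.exp(-theta)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def sample_zoips(theta, phi0, phi1, rng):
    """Zero-and-one inflated Poisson innovation: 0 w.p. phi0, 1 w.p. phi1, else Poisson."""
    v = rng.random()
    if v < phi0:
        return 0
    if v < phi0 + phi1:
        return 1
    return sample_poisson(theta, rng)

def simulate_zoips_inar1(T, theta, alpha, phi0, phi1, seed=0, burn_in=200):
    """Simulate T observations after discarding a burn-in period (assumed recursion)."""
    rng = random.Random(seed)
    x, path = 0, []
    for t in range(T + burn_in):
        # binomial thinning: each of the x units survives with probability alpha
        survivors = sum(1 for _ in range(x) if rng.random() < alpha)
        x = survivors + sample_zoips(theta, phi0, phi1, rng)
        if t >= burn_in:
            path.append(x)
    return path
```

With θ = α = 0.5 and ϕ_0 = ϕ_1 = 0.35, the innovation mean is ϕ_1 + ϕ_2 θ = 0.5, so the sample mean of a long simulated path should be close to 0.5/(1 - α) = 1.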

Using a similar procedure as in Mohammadi et al. [22], we firstly computed the CLS estimates by minimizing the objective function:

\begin{matrix} Q_{T} (λ) & : = \sum_{t = 2}^{T} {(X_{t} - E [X_{t} | X_{t - 1}])}^{2} = \sum_{t = 2}^{T} {(X_{t} - α X_{t - 1} - μ_{η})}^{2} \\ = \sum_{t = 2}^{T} {(X_{t} - α X_{t - 1} - ϕ_{1} - ϕ_{2} θ g^{'} (θ))}^{2} . \end{matrix}

By applying the usual procedure, that is, by solving coupled equations

\frac{\partial Q_{T} (λ)}{\partial α} = \frac{\partial Q_{T} (λ)}{\partial μ_{η}} = 0,

parameter estimators

α

and

μ_{η}

can be easily obtained as follows:

\begin{matrix} \tilde{α} & : = {\hat{ρ}}_{X} (1) = \frac{(T - 1) \sum_{t = 2}^{T} X_{t} X_{t - 1} - \sum_{t = 2}^{T} X_{t} \sum_{t = 2}^{T} X_{t - 1}}{(T - 1) \sum_{t = 2}^{T} X_{t - 1}^{2} - {(\sum_{t = 2}^{T} X_{t - 1})}^{2}}, \\ {\tilde{μ}}_{η} & = \frac{\sum_{t = 2}^{T} X_{t} - \tilde{α} \sum_{t = 2}^{T} X_{t - 1}}{T - 1} . \end{matrix}

(36)
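The first CLS step amounts to an ordinary least-squares regression of X_t on X_{t-1}. The sketch below solves the two normal equations directly; the function name is an illustrative assumption, and the sums run over the T - 1 available pairs (X_{t-1}, X_t).

```python
def cls_alpha_mu(x):
    """Solve the CLS normal equations for alpha and mu_eta from a sample x."""
    cur, prev = x[1:], x[:-1]      # pairs (X_t, X_{t-1}), t = 2, ..., T
    n = len(cur)                   # n = T - 1
    s_cur, s_prev = sum(cur), sum(prev)
    s_cross = sum(a * b for a, b in zip(cur, prev))
    s_prev2 = sum(b * b for b in prev)
    alpha = (n * s_cross - s_cur * s_prev) / (n * s_prev2 - s_prev ** 2)
    mu_eta = (s_cur - alpha * s_prev) / n
    return alpha, mu_eta
```

As a quick check, the sequence 10, 6, 4, 3 satisfies X_t = 0.5 X_{t-1} + 1 exactly, so the sketch returns α = 0.5 and μ_η = 1 for it.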

In the next step, estimates of the parameters

θ, ϕ_{0}, ϕ_{1}

can be obtained by minimizing the objective function:

\begin{matrix} Q_{T}^{*} (λ) & : = \sum_{t = 2}^{T} {[{(X_{t} - E [X_{t} | X_{t - 1}])}^{2} - Var [X_{t} | X_{t - 1}]]}^{2} \\ & = \sum_{t = 2}^{T} {[{(X_{t} - α X_{t - 1} - μ_{η})}^{2} - α (1 - α) X_{t - 1} - μ_{η} (1 - μ_{η}) - ϕ_{2} θ^{2} ({(g^{'} (θ))}^{2} + g^{″} (θ))]}^{2}, \end{matrix}

where

α

and

μ_{η}

were replaced by their CLS estimators

\tilde{α}

and

{\tilde{μ}}_{η}

, respectively. Minimization of the function

Q_{T}^{*} (λ)

was conducted using a numerical procedure based on the R-function "nlminb", where the initial values of the parameters were taken randomly from the uniform distribution

U (0, 1)

. The asymptotic properties of the obtained CLS estimates can be proven by applying some basic results of the CLS theory [42,43], in the same way as in Mohammadi et al. [22]. The results of these simulations are given in the left part of Table 2, where the minimums (Min.), mean values (Mean), maximums (Max.) and mean squared estimation errors (MSEE) of the CLS estimates are shown.

Table 2. Estimated values of parameters of the ZOIPS-INAR

(1)

process. (True parameters are

θ = α = 0.5

,

ϕ_{0} = ϕ_{1} = 0.35

).

Next, the PGF method was applied, with initial values obtained from the previous CLS procedure. The estimates of the parameter vector

λ = {(θ, α, ϕ_{0}, ϕ_{1})}^{'}

were calculated based on the minimization of the double integral

S_{T}^{(2)} (λ) = \underset{{[- 1, 1]}^{2}}{\int \int} ω (u_{1}, u_{2}) {(Ψ_{X}^{(2)} (u_{1}, u_{2}; λ) - {\tilde{Ψ}}_{T} (u_{1}, u_{2}))}^{2} d u_{1} d u_{2},

(37)

where

ω : {[- 1, 1]}^{2} \to R^{+}

is the weight function, and

Ψ_{X}^{(2)} (u_{1}, u_{2}; λ)

is the two-dimensional PGF of the ZOIPS-INAR(1) process

(X_{t})

. Using Equation (35), in the case of the series

(X_{t})

with ZOIPS for the Poisson distribution case, this PGF can be obtained as follows:

\begin{matrix} Ψ_{X}^{(2)} (u_{1}, u_{2}; λ) & = \prod_{k = 0}^{\infty} (ϕ_{0} + ϕ_{1} (1 + α^{k} (u - 1)) + ϕ_{2} exp (θ α^{k} (u - 1))) \\ \times (ϕ_{0} + ϕ_{1} u_{2} + ϕ_{2} exp (θ (u_{2} - 1))) . \end{matrix}

Similarly, for the appropriate PGF of the series

(X_{t})

with ZOIPS for the geometric innovations, one obtains:

\begin{matrix} Ψ_{X}^{(2)} (u_{1}, u_{2}; λ) & = \prod_{k = 0}^{\infty} (ϕ_{0} + ϕ_{1} (1 + α^{k} (u - 1)) + ϕ_{2} \frac{1 - θ}{1 - (1 + α^{k} (u - 1)) θ}) \\ \times (ϕ_{0} + ϕ_{1} u_{2} + ϕ_{2} \frac{1 - θ}{1 - θ u_{2}}) . \end{matrix}

It can be observed that these PGFs are not in closed form, but they can be approximated by finite k-term products with an arbitrary precision.

Thereafter, the integral in (37) can be numerically approximated using some of the N-point cubature formulas of the form:

I (φ; ω) : = \underset{{[- 1, 1]}^{2}}{\int \int} ω (u_{1}, u_{2}) φ (u_{1}, u_{2}) d u_{1} d u_{2} \approx \sum_{j = 1}^{N} ω_{j} φ (u_{1 j}, u_{2 j}) .

Here,

(u_{1 j}, u_{2 j})

are the cubature nodes, and

ω_{j}

denotes the appropriate weight coefficients. In this simulation study, we used cubature formulas with

N = 36

nodes, based on Gauss–Legendre orthogonal polynomials and weight function

ω (u_{1}, u_{2}) \equiv 1

. The numerical construction of these formulas was carried out using the package “Orthogonal polynomials” within the software Mathematica, authored by Cvetković and Milovanović [44]. Next, the objective function (24) was minimized using the “R” procedure for linearly constrained minimization, based on the Nelder–Mead optimization method [45]. Summary statistics of the PGF estimates, which were obtained using the aforementioned estimation procedure and with additional values of the objective function

S_{T}^{(2)}

, are shown in the right part of Table 2.
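The cubature step can be illustrated with a tensor-product Gauss–Legendre rule. The original study built its formulas with a Mathematica package; the Python sketch below instead computes the one-dimensional nodes and weights by Newton iteration on the Legendre polynomial, which is a swapped-in technique, and forms the 6 × 6 = 36 node rule for the unit weight function. All function names are illustrative assumptions.

```python
import math

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))   # initial guess for the i-th root
        for _ in range(100):
            p_prev, p = 1.0, x                            # P_0 and P_1
            for k in range(2, n + 1):                     # three-term recurrence up to P_n
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)     # derivative of P_n at x
            dx = p / dp
            x -= dx                                       # Newton step
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def cubature_2d(phi, n=6):
    """Tensor-product rule with n * n nodes for the integral of phi over [-1, 1]^2."""
    nodes, weights = gauss_legendre(n)
    return sum(wi * wj * phi(ui, uj)
               for ui, wi in zip(nodes, weights)
               for uj, wj in zip(nodes, weights))

def pgf_objective(theoretical_pgf, empirical_pgf, n=6):
    """Approximate S_T^(2) for the unit weight function and the two given PGFs."""
    return cubature_2d(lambda u1, u2: (theoretical_pgf(u1, u2) - empirical_pgf(u1, u2)) ** 2, n)
```

Here `theoretical_pgf` and `empirical_pgf` are assumed to be callables of (u_1, u_2) for a fixed parameter value and a fixed sample; the resulting scalar is what a derivative-free optimizer such as Nelder–Mead would minimize over the parameters.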

Comparison of the CLS and PGF estimated values indicates that the mean values of CLS estimates are somewhat closer to the true parameter values (only) for the parameter

α

. This is expected, because

α

represents the first-order autocorrelation of the ZOIPS-INAR(1) series

(X_{t})

. Since the CLS-estimate

\tilde{α}

, given by the first expression in Equation (36), is a sample autocorrelation, it is the most efficient estimate of

α

. However, the other parameter estimates are more efficient when the PGF estimation procedure is applied, and they also have smaller mean squared estimation errors (MSEEs).

In addition, the AN test results are shown in Table 2, where the Anderson–Darling normality test was conducted. The test statistic, denoted AD, along with the corresponding p-values, were calculated using the procedure from the R-package “nortest”, authored by Gross [46]. According to the obtained values, it can be seen that the AN property is confirmed for most of the PGF estimates of parameters

λ = {(θ, α, ϕ_{0}, ϕ_{1})}^{'}

. However, the CLS estimates of the parameter

θ

do not have the AN property, at the significance level of

p = 0.05

. Therefore, this would be another advantage of the PGF estimates. Certain confirmation of these facts can be observed visually in Figure 3 and Figure 4.

Figure 3. Histograms of empirical distributions of the PGF estimated parameters of the ZOIPS-INAR(1) process with zero-and-one inflated Poisson PS innovations. (Parameters are the same as in Table 2).

Figure 4. Histograms of empirical distributions of the estimated parameters of the ZOIPS-INAR(1) process with zero-and-one inflated geometric PS innovations. (Parameters are the same as in Table 2).

6. Application of the Model

Here we point out some possibilities of practical application of the ZOIPS-INAR process in real-world data modeling. In this regard, it is worth noting that the COVID-19 pandemic has received a great deal of attention since its appearance. From a mathematical point of view, various theoretical models have been proposed to investigate this still current problem (for more recent examples, see [47,48,49,50,51,52,53]). To that end, here we explore some additional possibilities in modeling the dynamics of COVID data.

More precisely, we observed an actual count data set, which represents the dynamics of the number of deaths due to COVID-19 in the Republic of Serbia, based on the data of the World Health Organization (WHO) [54], over a period from 1 January 2020 to 31 December 2022. In this way, a count time series of length

T = 1094

was observed, the dynamics of which are shown in the upper diagram of Figure 5. The autocorrelation function (ACF) and partial autocorrelation function (PACF) of this time series are shown in the lower diagrams of the same figure. Depending on the lag

k = 0, 1, 2, \dots

, the autocorrelation decreases exponentially, while the partial autocorrelation indicates that an INAR process of order one (or two) is suitable for modeling. Therefore, it could be expected that INAR-based processes can be adequate stochastic models to describe these dynamics.

Figure 5. Graph above: dynamics of the number of deaths caused by the disease COVID-19. Graphs below: autocorrelation (a) and partial autocorrelation (b) of the observed time series.

Furthermore, based on the summary statistics of the data of this time series, shown in Table 3, it can be observed that COVID-19 data have a significant number of zero and one values. Therefore, it could be assumed that the ZOIPS-INAR(1) process can be taken as a suitable stochastic model. In doing so, as members of the PS family, we consider innovations with the Poisson distribution (and the ‘small’ parameter

θ

, and with the geometric distribution. In the following, we first assume Poisson PS-innovations, since a zero-one inflation testing procedure can be applied. In this regard, we point out that the testing procedure used here differs from the one used in Qi et al. [21]. We form a null hypothesis that a certain INAR time series is not of the ZOI-type; that is,

H_{0} : ϕ_{0} = ϕ_{1} = 0

. In that case, according to Theorem 5, that is, Equation (22), the proportions of the occurrences of the zero and one values can be, respectively, expressed as follows:

\begin{matrix} p_{0} : = P \{X_{t} = 0\} & = Ψ_{X} (0; λ) = \prod_{k = 0}^{\infty} \frac{exp [θ (1 + α^{k} (u - 1))]}{exp (θ)} |_{u = 0} \\ = exp (\frac{θ (u - 1)}{1 - α}) |_{u = 0} = exp (- \frac{θ}{1 - α}), \end{matrix}

(38)

\begin{matrix} p_{1} : = P \{X_{t} = 1\} & = \frac{\partial Ψ_{X} (0; λ)}{\partial u} = \frac{θ}{1 - α} exp (\frac{θ (u - 1)}{1 - α}) |_{u = 0} \\ = \frac{θ}{1 - α} exp (- \frac{θ}{1 - α}) . \end{matrix}

(39)

Table 3. Summary statistics of the number of deaths per day in the Republic of Serbia, with zero-and-one inflation testing.

Therefore, in the INAR(1) process with ‘ordinary’ Poisson innovations, the zero-and-one proportions are exponentially related to the mean

μ_{X} = θ / (1 - α)

. If we define the so-called sample zero-and-one proportions:

{\hat{p}}_{0} = \frac{1}{T} \sum_{t = 1}^{T} I \{X_{t} = 0\}, {\hat{p}}_{1} = \frac{1}{T} \sum_{t = 1}^{T} I \{X_{t} = 1\},

then, according to Equations (38) and (39), the so-called zero-and-one test statistics can be taken as follows:

{\hat{I}}_{0} = \frac{{\hat{p}}_{0} - e^{- {\bar{X}}_{T}}}{{\hat{σ}}_{0}}, {\hat{I}}_{1} = \frac{{\hat{p}}_{1} - {\bar{X}}_{T} e^{- {\bar{X}}_{T}}}{{\hat{σ}}_{1}} .

Here,

{\bar{X}}_{T} = T^{- 1} \sum_{t = 1}^{T} X_{t}

is the sample mean, and

{\hat{σ}}_{0}

,

{\hat{σ}}_{1}

are sample deviations of the statistics

{\hat{p}}_{0} - e^{- {\bar{X}}_{T}}

,

{\hat{p}}_{1} - {\bar{X}}_{T} e^{- {\bar{X}}_{T}}

, respectively. By applying some general asymptotic results related to Poisson INAR processes (cf. Weiß et al. [55]), it can be shown that the central limit theorem (CLT) holds for these statistics; that is,

{\hat{I}}_{0}, {\hat{I}}_{1} \overset{d}{⟶} N (0, 1), T \to + \infty .

Thus, by applying a simple testing procedure based on the standard Gaussian distribution, hypothesis

H_{0}

can be tested. The test results are shown in the lower part of Table 3, from which it is evident that in both cases hypothesis

H_{0}

was rejected. Thus, the observed count data series can be modeled using the ZOIPS-INAR(1) process.
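The numerators of the two test statistics are simple functions of the sample and can be sketched in Python as below. The standardizing deviations are not specified in closed form in this section, so the sketch deliberately stops at the unstandardized differences; the function name is an illustrative assumption.

```python
import math

def zero_one_excess(x):
    """Differences between sample zero/one proportions and their Poisson INAR(1) values."""
    n = len(x)
    mean = sum(x) / n
    p0_hat = sum(1 for v in x if v == 0) / n   # sample proportion of zeros
    p1_hat = sum(1 for v in x if v == 1) / n   # sample proportion of ones
    # under H0 the proportions are exp(-mean) and mean * exp(-mean)
    return p0_hat - math.exp(-mean), p1_hat - mean * math.exp(-mean)
```

Dividing each difference by an estimate of its standard deviation would give the statistics that are compared with the standard Gaussian quantiles.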

In a similar way, for the appropriate INAR(1) process with geometric PS-innovations, one obtains:

\begin{matrix} p_{0} : = Ψ_{X} (0; λ) & = \prod_{k = 0}^{\infty} \frac{1 - θ}{1 - θ (1 + α^{k} (u - 1))} |_{u = 0} = exp [- \sum_{k = 0}^{\infty} ln (1 + \frac{θ}{1 - θ} α^{k})] \\ & = exp [\sum_{j = 1}^{\infty} \frac{{(- 1)}^{j}}{j (1 - α^{j})} {(\frac{θ}{1 - θ})}^{j}], \\ p_{1} : = P \{X_{t} = 1\} & = \frac{\partial Ψ_{X} (0; λ)}{\partial u} = (\sum_{j = 1}^{\infty} \frac{{(- 1)}^{j - 1}}{1 - α^{j}} {(\frac{θ}{1 - θ})}^{j}) exp [\sum_{j = 1}^{\infty} \frac{{(- 1)}^{j}}{j (1 - α^{j})} {(\frac{θ}{1 - θ})}^{j}] . \end{matrix}

These expressions are not in closed form either. The series expansions in powers of θ/(1 - θ) converge for θ < 1/2, while the infinite product can be truncated to approximate the probabilities with an arbitrary precision.
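For the geometric case, a hedged Python sketch of the truncated-product evaluation is shown below. It works with the product form directly, writing each factor as 1/(1 + x α^k) with x = θ/(1 - θ), and obtains the one-proportion from the logarithmic derivative of the product at u = 0; the function name and truncation length are illustrative assumptions.

```python
def geometric_inar_p0_p1(theta, alpha, n_terms=200):
    """Zero and one proportions of an INAR(1) process with geometric innovations."""
    x = theta / (1.0 - theta)
    p0 = 1.0
    log_slope = 0.0
    for k in range(n_terms):
        t = x * alpha ** k
        p0 /= (1.0 + t)               # k-th factor (1 - theta) / (1 - theta + theta * alpha^k)
        log_slope += t / (1.0 + t)    # derivative of -ln(1 - t * (u - 1)) at u = 0
    return p0, p0 * log_slope
```

In the degenerate case α = 0 the process coincides with its innovations, and the sketch returns 1 - θ and (1 - θ)θ, the first two geometric probabilities.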

The estimated values of parameters for both of those PS innovations (Poisson and geometric) are shown in the upper parts of Table 4 and Table 5. In order to compare the ZOIPS-INAR(1) model with the ordinary INAR(1) model, the same estimation procedures were used to fit the observed data, assuming

ϕ_{0} = ϕ_{1} = 0

. Note that the INAR(1) model has only two parameters

(θ, α)

whose estimated values are also shown in Table 4 and Table 5. Thereby, the estimated values of the parameter

α

obtained using the CLS method, as the first sample correlation, are the same for both models. As in the previous simulation study, the PGF estimates were calculated using the previously described two-step estimation procedure. It is worth noting that the estimated parameter values obtained by applying both estimation methods are quite close. At the same time, the relatively small estimated values of the parameter

θ

are a consequence of the low frequencies of values other than zero and one in the observed time series (which can also be seen in Figure 6). In contrast, the estimated values of the parameter

α

are close to 1. This can also be explained by the previously described properties of the observed series, i.e., its high first-order autocorrelation. Finally, note that by using Equation (7), and similarly as in Mohammadi et al. [22], innovations of the ZOIPS-INAR(1) process can be represented as follows:

η_{t} \overset{d}{=} Y_{t} Z_{t} + (1 - Y_{t}) ε_{t} .

Here,

Y_{t} : B (α)

,

Z_{t} : B (β)

are the IID Bernoulli time series, also mutually independent of the RVs

(ε_{t})

, and

ϕ_{0} = (1 - α) (1 - β)

,

ϕ_{1} = (1 - α) β

. Thus, in the observed series, the estimates of

ϕ_{0}

are related to the zero-proportion

p_{0}

, and the estimates of

ϕ_{1}

to the one-proportion

p_{1}

.

Table 4. Estimated fitting parameters of the considered data with the INAR(1) and ZOIPS-INAR(1) models, along with their goodness-of-fit and predictive statistics. (Poisson distribution is assumed for PS-distributed innovations).

Table 5. Estimated fitting parameters of the considered data with the INAR(1) and ZOIPS-INAR(1) models, along with their goodness-of-fit and predictive statistics. (Geometric distribution is assumed for PS-distributed innovations).

Figure 6. Frequency distributions of the observed time series and data fitted by INAR(1) and ZOIPS-INAR(1) models: (a) Poisson-distributed innovations; (b) geometric-distributed innovations.

In addition, we analyzed the quality of the fit for both INAR-based models and both estimation procedures. To this end, using the estimated parameters, we generated 500 independent simulations of the INAR(1) and ZOIPS-INAR(1) time series. To check the effectiveness of the fit to real-life data, two typical goodness-of-fit statistics were calculated: the mean squared error of estimation (MSEE) and the Akaike information criterion (AIC). Average values of these statistics, obtained on the basis of the previously described simulations, are shown in the middle parts of Table 4 and Table 5. In both cases, the MSEE and AIC statistics have relatively close and small values, i.e., the CLS and PGF estimates apparently have similar efficiency. Nevertheless, the PGF estimates have slightly smaller fitting errors. The goodness-of-fit statistics are also significantly lower when the ZOIPS-INAR(1) model is applied, which means that it is a more suitable stochastic model for fitting the observed real-life time series. Finally, in the case of Poisson PS-innovations, the error statistics have slightly smaller values, so these innovations seem more suitable for fitting the data with the ZOIPS-INAR(1) model. Some of the mentioned facts can also be seen in Figure 6, where the empirical and fitted frequencies of both INAR-based models are shown, using the parameter estimates obtained by the PGF method.

Finally, for both INAR-based models and both estimation procedures, the forecast accuracy of the obtained models was assessed. To that end, the time interval from 1 January 2023 to 28 February 2023 was taken as the forecast horizon of length

h = 59

. The testing procedure was conducted using the one-sided Diebold–Mariano prediction accuracy test [56]. The null hypothesis was that the time series fitted with the INAR and ZOIPS-INAR models had the same forecast accuracy, while the alternative was that the ZOIPS-INAR model had better accuracy. The test statistic, denoted as DM, along with the appropriate p-values, was computed using the R-package “forecast”, authored by Hyndman [57]. Both are shown in the lower parts of Table 4 and Table 5, and indicate that the ZOIPS-INAR models have better forecast accuracy; that is, the null hypothesis is rejected in favor of the alternative. This is particularly evident in the case of the Poisson PS-innovations, where for both CLS and PGF estimation procedures the obtained p-values are less than the significance level of

p = 0.01

.
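To make the idea behind the forecast comparison concrete, the following Python sketch computes a basic one-sided Diebold–Mariano statistic for one-step-ahead forecast errors under squared-error loss, with a normal approximation for the p-value. It is a simplified illustration only: the routine in the R-package “forecast” may apply small-sample corrections that are not reproduced here, and the function name and argument order are assumptions.

```python
import math

def diebold_mariano(errors_a, errors_b):
    """One-sided DM test; the alternative is that forecast B is more accurate than A."""
    # loss differential under squared-error loss
    d = [ea * ea - eb * eb for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = sum(d) / n
    gamma0 = sum((v - d_bar) ** 2 for v in d) / n    # lag-0 autocovariance of d
    dm = d_bar / math.sqrt(gamma0 / n)
    p_value = 0.5 * math.erfc(dm / math.sqrt(2.0))   # 1 - Phi(dm), normal approximation
    return dm, p_value
```

A large positive statistic (small p-value) indicates that the second set of forecast errors is systematically smaller than the first.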

7. Conclusions

In this paper, a generalized zero-and-one inflated ZOIPS-INAR(1) process, based on power series (PS) distributed innovations, was presented. As already noted, zero-and-one inflated data are of particular interest in contemporary real-world data research. Let us emphasize once again that there are only two contributions in this direction: the ZOINAR model with Poisson innovations [21] and another model with Poisson–Lindley innovations [22]. The ZOIPS-INAR(1) process proposed here, according to the general form of its PS distributions, can be viewed as a generalization of the previous models. Thus, for instance, the Poisson innovations represent only a special case of the PS-distributed innovation series. In a similar way, the ZOIPS-INAR(1) process with any other PS innovations (such as, for example, geometrically distributed innovations) may be suitable for estimating and fitting different kinds of time series. However, it should be noted that there are discrete distributions that do not belong to the PS family, which can be a certain limitation of this model.

The stochastic properties of the ZOIPS-INAR(1) process were considered in detail, and two parameter estimation methods were proposed. As a more contemporary method, we once again emphasize the estimation procedure based on probability generating functions (PGF method). It was shown here that the PGF estimators have slightly better asymptotic properties as well as efficiency compared to the widely used conditional least-squares (CLS) estimators. Additionally, the application of the ZOIPS-INAR(1) model to fitting real-life data on the number of deaths from the COVID-19 pandemic in the Republic of Serbia was presented. In order to determine the effectiveness of the proposed ZOIPS-INAR(1) model, it was compared with the standard INAR(1) model. In doing so, it was shown that the ZOIPS-INAR(1) model provides a better fit to the considered data; that is, it has greater efficiency and predictive accuracy compared to the standard INAR(1) model. Finally, we emphasize that two particular cases of the ZOIPS-INAR(1) model were considered, and for both of them the fitting procedures were examined from different aspects, along with the predictive accuracy. The results obtained in this way indicate the appropriateness of the proposed model in fitting zero-and-one inflated processes, both from a theoretical and a practical point of view. This can also be a motivation for some further research.

Author Contributions

Conceptualization, V.S.S. and H.S.B.; methodology, V.S.S. and H.S.B.; software, V.S.S. and H.S.B.; validation, E.L. and N.Q.; formal analysis, V.S.S. and H.S.B.; data curation, V.S.S. and H.S.B.; writing—original draft preparation, V.S.S., H.S.B. and E.L.; writing—review and editing, E.L. and N.Q.; visualization, V.S.S. and H.S.B.; supervision, V.S.S. and E.L.; project administration, E.L. and N.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R376), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Data Availability Statement

Official data on deaths from the disease COVID-19 can be found on the website of the World Health Organization (https://covid19.who.int/data, accessed on 19 March 2023).

Acknowledgments

The authors gratefully acknowledge Princess Nourah bint Abdulrahman University Researchers’ Supporting Project number (PNURSP2023R376), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia, for the financial support for this project.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. (a) Dynamics of the ZOIPS series and ZOIPS-INAR(1) process. (b) Empirical frequency distributions of both time series. (Parameter values are $α = 0.5$, $θ = 5$, $ϕ_{0} = ϕ_{1} = 0.45$.)
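
The kind of dynamics shown in Figure 1 can be imitated with a short simulation. The sketch below is not the authors' code. It assumes binomial thinning, Poisson PS innovations (a guess based on $θ = 5$ lying outside $(0, 1)$), and a ZOIPS innovation defined as a three-part mixture: 0 with probability $ϕ_{0}$, 1 with probability $ϕ_{1}$, and a PS draw otherwise. All function names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's product method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def zoips_innovation(rng, theta, phi0, phi1):
    """Assumed mixture: 0 w.p. phi0, 1 w.p. phi1, Poisson(theta) otherwise."""
    u = rng.random()
    if u < phi0:
        return 0
    if u < phi0 + phi1:
        return 1
    return poisson(rng, theta)


def binomial_thinning(rng, alpha, x):
    """alpha o x: each of the x units survives independently with probability alpha."""
    return sum(rng.random() < alpha for _ in range(x))


def simulate_zoips_inar1(n, alpha, theta, phi0, phi1, seed=0):
    """Generate X_t = alpha o X_{t-1} + eps_t for t = 1, ..., n."""
    rng = random.Random(seed)
    x = zoips_innovation(rng, theta, phi0, phi1)  # arbitrary starting value
    path = []
    for _ in range(n):
        x = binomial_thinning(rng, alpha, x) + zoips_innovation(rng, theta, phi0, phi1)
        path.append(x)
    return path


# Parameter values taken from the Figure 1 caption.
series = simulate_zoips_inar1(n=5000, alpha=0.5, theta=5.0, phi0=0.45, phi1=0.45)
sample_mean = sum(series) / len(series)

# Under the assumed mixture, E[eps] = phi1 + (1 - phi0 - phi1) * theta,
# and the stationary mean of an INAR(1) process is E[eps] / (1 - alpha).
mean_eps = 0.45 + (1 - 0.45 - 0.45) * 5.0
theoretical_mean = mean_eps / (1 - 0.5)
print(f"sample mean = {sample_mean:.3f}, stationary mean = {theoretical_mean:.3f}")
```

With these parameter values most innovations are 0 or 1, so the simulated path stays low apart from occasional jumps produced by the Poisson component.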

Figure 2. 3D plots of the two-dimensional PGF (a) and the corresponding EPGF (b) when the innovations are the ZOIPS series with a geometric PS distribution. (Parameter values are $θ = α = 0.5$, $ϕ_{0} = ϕ_{1} = 0.4$.)
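
The comparison in Figure 2 can be reproduced numerically at a single point. The sketch below is an illustration under stated assumptions, not the authors' implementation: the ZOIPS innovation is taken to be a mixture (0 with probability $ϕ_{0}$, 1 with probability $ϕ_{1}$, geometric otherwise), thinning is binomial, and the two-dimensional PGF of $(X_t, X_{t+1})$ is computed from the standard INAR(1) identities $Ψ_X(u) = \prod_{k \ge 0} Ψ_ε(1 - α^k + α^k u)$ and $Ψ(u_1, u_2) = Ψ_X(u_1 (1 - α + α u_2)) \, Ψ_ε(u_2)$, with the product truncated. Function names are hypothetical.

```python
import math
import random

ALPHA, THETA, PHI0, PHI1 = 0.5, 0.5, 0.4, 0.4  # values from the Figure 2 caption


def geometric(rng, theta):
    """Inverse-transform draw with P(X = x) = (1 - theta) * theta**x, x = 0, 1, ..."""
    return int(math.log(1.0 - rng.random()) / math.log(theta))


def innovation(rng, theta, phi0, phi1):
    """Assumed ZOIPS mixture: 0 w.p. phi0, 1 w.p. phi1, geometric otherwise."""
    u = rng.random()
    if u < phi0:
        return 0
    if u < phi0 + phi1:
        return 1
    return geometric(rng, theta)


def simulate(n, alpha, theta, phi0, phi1, seed=1, burn_in=200):
    """Simulate X_t = alpha o X_{t-1} + eps_t, discarding a burn-in period."""
    rng = random.Random(seed)
    x, path = 0, []
    for t in range(n + burn_in):
        survivors = sum(rng.random() < alpha for _ in range(x))  # binomial thinning
        x = survivors + innovation(rng, theta, phi0, phi1)
        if t >= burn_in:
            path.append(x)
    return path


def pgf_innovation(u, theta, phi0, phi1):
    """PGF of the assumed mixture with the geometric PGF (1 - theta) / (1 - theta * u)."""
    return phi0 + phi1 * u + (1 - phi0 - phi1) * (1 - theta) / (1 - theta * u)


def pgf_marginal(u, alpha, theta, phi0, phi1, terms=60):
    """Stationary PGF of X_t as a truncated infinite product."""
    prod = 1.0
    for k in range(terms):
        a = alpha**k
        prod *= pgf_innovation(1 - a + a * u, theta, phi0, phi1)
    return prod


def pgf_2d(u1, u2, alpha, theta, phi0, phi1):
    """Joint PGF E[u1**X_t * u2**X_{t+1}] of two consecutive values."""
    inner = u1 * (1 - alpha + alpha * u2)
    return pgf_marginal(inner, alpha, theta, phi0, phi1) * pgf_innovation(u2, theta, phi0, phi1)


def epgf_2d(path, u1, u2):
    """Empirical two-dimensional PGF: average of u1**X_t * u2**X_{t+1}."""
    pairs = zip(path[:-1], path[1:])
    return sum(u1**a * u2**b for a, b in pairs) / (len(path) - 1)


path = simulate(20000, ALPHA, THETA, PHI0, PHI1)
theoretical = pgf_2d(0.5, 0.5, ALPHA, THETA, PHI0, PHI1)
empirical = epgf_2d(path, 0.5, 0.5)
print(f"PGF(0.5, 0.5) = {theoretical:.4f}, EPGF(0.5, 0.5) = {empirical:.4f}")
```

Evaluating both functions on a grid of $(u_1, u_2)$ values instead of a single point would give the two surfaces that the figure compares.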

Figure 3. Histograms of empirical distributions of the PGF estimated parameters of the ZOIPS-INAR(1) process with zero-and-one inflated Poisson PS innovations. (Parameters are the same as in Table 2).

Figure 4. Histograms of empirical distributions of the estimated parameters of the ZOIPS-INAR(1) process with zero-and-one inflated geometric PS innovations. (Parameters are the same as in Table 2).

Figure 5. Graph above: dynamics of the number of deaths caused by the disease COVID-19. Graphs below: autocorrelation (a) and partial autocorrelation (b) of the observed time series.

Figure 6. Frequency distributions of the observed time series and data fitted by INAR(1) and ZOIPS-INAR(1) models: (a) Poisson-distributed innovations; (b) geometric-distributed innovations.

Table 1. Some specific PS distributions, along with their over-dispersion indices and PGFs.

Distributions	$S$	$a (x)$	$(0, R)$	$f (θ)$	$μ_{ε}$	$D_{ε} (θ)$	$Ψ_{ε} (u; θ)$	$R / θ$
1. Bernoulli	$\{0, 1\}$	1	$(0, \infty)$	$1 + θ$	$\frac{θ}{1 + θ}$	$- \frac{θ^{2}}{{(1 + θ)}^{2}}$	$\frac{1 + θ u}{1 + θ}$	∞
2. Binomial	$\{0, \dots, n\}$	$\binom{n}{x}$	$(0, \infty)$	${(1 + θ)}^{n}$	$\frac{n θ}{1 + θ}$	$- \frac{n θ^{2}}{{(1 + θ)}^{2}}$	${(\frac{1 + θ u}{1 + θ})}^{n}$	∞
3. Poisson	$\{0, \dots, \infty\}$	$\frac{1}{x!}$	$(0, \infty)$	$\exp (θ)$	$θ$	0	$\exp (θ (u - 1))$	∞
4. Geometric	$\{0, \dots, \infty\}$	1	$(0, 1)$	$\frac{1}{1 - θ}$	$\frac{θ}{1 - θ}$	$\frac{θ^{2}}{{(1 - θ)}^{2}}$	$\frac{1 - θ}{1 - θ u}$	$1 / θ$
5. Negative binomial	$\{0, \dots, \infty\}$	$\frac{Γ (x + n)}{x! Γ (n)}$	$(0, 1)$	$\frac{1}{{(1 - θ)}^{n}}$	$\frac{n θ}{1 - θ}$	$\frac{n θ^{2}}{{(1 - θ)}^{2}}$	${(\frac{1 - θ}{1 - θ u})}^{n}$	$1 / θ$
6. Pascal	$\{n, \dots, \infty\}$	$\binom{x - 1}{n - 1}$	$(0, 1)$	${(\frac{θ}{1 - θ})}^{n}$	$\frac{n}{1 - θ}$	$\frac{n (2 θ - 1)}{{(1 - θ)}^{2}}$	${(\frac{(1 - θ) u}{1 - θ u})}^{n}$	$1 / θ$
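
The closed forms in Table 1 can be checked numerically from the power series form $p(x) = a(x) θ^{x} / f(θ)$. The sketch below is a sanity check rather than part of the original analysis. It covers only the binomial, Poisson and geometric rows, truncates the infinite supports, and assumes that the tabulated $D_{ε}(θ)$ is the variance minus the mean, which is what those rows are consistent with. The helper name `ps_summary` and the chosen values of $θ$, $n$ and $u$ are illustrative.

```python
import math


def ps_summary(a, f, theta, support, u):
    """Mean, variance minus mean, and PGF at u for p(x) = a(x) * theta**x / f(theta)."""
    norm = f(theta)
    pmf = [(x, a(x) * theta**x / norm) for x in support]
    mean = sum(x * p for x, p in pmf)
    variance = sum((x - mean) ** 2 * p for x, p in pmf)
    pgf = sum(u**x * p for x, p in pmf)
    return mean, variance - mean, pgf


N = 4    # number of trials for the binomial row (illustrative)
U = 0.7  # point at which the PGF is evaluated (illustrative)

# Each case lists a(x), f(theta), a test value of theta, a (truncated) support,
# and the closed forms (mean, D, PGF) copied from the corresponding table row.
cases = {
    "binomial": dict(
        a=lambda x: math.comb(N, x), f=lambda t: (1 + t) ** N,
        theta=0.8, support=range(N + 1),
        closed=lambda t: (N * t / (1 + t), -N * t**2 / (1 + t) ** 2,
                          ((1 + t * U) / (1 + t)) ** N)),
    "poisson": dict(
        a=lambda x: 1 / math.factorial(x), f=math.exp,
        theta=2.0, support=range(100),
        closed=lambda t: (t, 0.0, math.exp(t * (U - 1)))),
    "geometric": dict(
        a=lambda x: 1, f=lambda t: 1 / (1 - t),
        theta=0.5, support=range(200),
        closed=lambda t: (t / (1 - t), t**2 / (1 - t) ** 2,
                          (1 - t) / (1 - t * U))),
}

max_error = 0.0
for name, c in cases.items():
    numeric = ps_summary(c["a"], c["f"], c["theta"], c["support"], U)
    exact = c["closed"](c["theta"])
    err = max(abs(n - e) for n, e in zip(numeric, exact))
    max_error = max(max_error, err)
    print(f"{name:10s} max abs difference = {err:.2e}")
```

The same helper can be pointed at any other row by supplying its $a(x)$, $f(θ)$ and support.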

Table 2. Estimated values of parameters of the ZOIPS-INAR(1) process. (True parameters are $θ = α = 0.5$, $ϕ_{0} = ϕ_{1} = 0.35$.)

	Sample	CLS Estimates				PGF Estimates				$S_{T}^{(2)}$
	Sample	$\tilde{θ}$	$\tilde{α}$	${\tilde{ϕ}}_{0}$	${\tilde{ϕ}}_{1}$	$\hat{θ}$	$\hat{α}$	${\hat{ϕ}}_{0}$	${\hat{ϕ}}_{1}$	$S_{T}^{(2)}$
Poisson	Min.	0.2137	0.4116	0.1209	0.1515	0.2926	0.3303	0.0975	0.1394	3.45 $\times 10^{- 6}$
	Mean	0.4846	0.4980	0.3380	0.3412	0.5054	0.5040	0.3591	0.3574	1.18 $\times 10^{- 4}$
	Max.	0.7624	0.5841	0.7286	0.7171	0.7155	0.6352	0.7515	0.6084	1.84 $\times 10^{- 3}$
	MSEE	0.0091	0.0012	0.0082	0.0080	0.0057	0.0049	0.0060	0.0022	–
	$AD$	2.0393 **	0.3435	0.5178	0.9376 *	0.5725	0.3594	0.7822 *	0.4727	–
	(p-value)	(3.34 $\times 10^{- 5}$ )	(0.2670)	(0.1870)	(0.0173)	(0.1361)	(0.4474)	(0.0418)	(0.2414)	–
Geometric	Min.	0.1522	0.3955	0.1028	0.1271	0.3043	0.2973	0.1515	0.1797	2.52 $\times 10^{- 7}$
	Mean	0.5131	0.4966	0.3581	0.3387	0.4952	0.4963	0.3546	0.3483	4.04 $\times 10^{- 6}$
	Max.	0.6487	0.5764	0.6938	0.6702	0.6973	0.6768	0.5958	0.5294	1.28 $\times 10^{- 5}$
	MSEE	0.0137	0.0011	0.0175	0.0174	0.0074	0.0053	0.0086	0.0064	–
	$AD$	1.0235 *	0.2859	0.7060	0.5108	0.6628	0.1867	0.7069	0.4753	–
	(p-value)	(0.0106)	(0.6227)	(0.0646)	(0.1945)	(0.0826)	(0.9038)	(0.0643)	(0.2379)	–

* p < 0.05; ** p < 0.01.

Table 3. Summary statistics of the number of deaths per day in the Republic of Serbia, with zero-and-one inflation testing.

Statistics	Estimated Values
Sample size	1094
Minimum	0
Mode	0
Median	9
Mean	16.01
Maximum	79
St. deviation	18.21
Variance	331.57
Skewness	1.269
Kurtosis	3.4367
Zero-count	104
Zero-proportion	0.0951
Zero-statistics ( ${\hat{I}}_{0}$ )	5.601 **
(p-value)	(2.10 $\times 10^{- 6}$ )
Ones-count	102
Ones-proportion	0.0932
Ones-statistics ( ${\hat{I}}_{1}$ )	5.492 **
(p-value)	(3.54 $\times 10^{- 6}$ )

** p < 0.01.

Table 4. Estimated fitting parameters of the considered data with the INAR(1) and ZOIPS-INAR(1) models, along with their goodness-of-fit and predictive statistics. (Poisson distribution is assumed for PS-distributed innovations).

Parameters/Statistics	CLS Estimates		PGF Estimates
Parameters/Statistics	INAR	ZOIPS-INAR	INAR	ZOIPS-INAR
$θ$	0.1731	0.1420	0.1768	0.1592
$α$	0.9890	0.9890	0.9431	0.9902
$ϕ_{0}$	–	0.0709	–	0.0775
$ϕ_{1}$	–	0.1492	–	0.1416
$S_{T}^{2}$	3.92 $\times 10^{- 2}$	3.09 $\times 10^{- 2}$	1.17 $\times 10^{- 2}$	6.96 $\times 10^{- 4}$
MSEE	8.61 $\times 10^{- 3}$	5.98 $\times 10^{- 3}$	6.79 $\times 10^{- 3}$	5.96 $\times 10^{- 3}$
AIC	4.8393	4.2495	4.8265	4.2489
$DM$	2.4269 **		2.3312 **
(p-value)	(7.69 $\times 10^{- 3}$ )		(9.96 $\times 10^{- 3}$ )

** p < 0.01.

Table 5. Estimated fitting parameters of the considered data with the INAR(1) and ZOIPS-INAR(1) models, along with their goodness-of-fit and predictive statistics. (Geometric distribution is assumed for PS-distributed innovations).

Parameters/Statistics	CLS Estimates		PGF Estimates
Parameters/Statistics	INAR	ZOIPS-INAR	INAR	ZOIPS-INAR
$θ$	0.1461	0.0919	0.2236	0.1075
$α$	0.9890	0.9890	0.9099	0.9908
$ϕ_{0}$	–	0.1068	–	0.1169
$ϕ_{1}$	–	0.1231	–	0.0859
$S_{T}^{2}$	2.90 $\times 10^{- 2}$	2.78 $\times 10^{- 2}$	8.99 $\times 10^{- 3}$	9.91 $\times 10^{- 5}$
MSEE	1.11 $\times 10^{- 2}$	6.81 $\times 10^{- 3}$	6.84 $\times 10^{- 3}$	6.78 $\times 10^{- 3}$
AIC	4.8686	4.8301	4.8355	4.8291
$DM$	1.7961 *		2.3446 **
(p-value)	(3.64 $\times 10^{- 2}$ )		(9.61 $\times 10^{- 3}$ )

* p < 0.05; ** p < 0.01.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
