Bivariate Random Coefficient Integer-Valued Autoregressive Model Based on a ρ-Thinning Operator

Chang Liu; Dehui Wang

doi:10.3390/axioms13060367

Abstract

While overdispersion is a common phenomenon in univariate count time series data, its exploration within bivariate contexts remains limited. To fill this gap, we propose a bivariate integer-valued autoregressive model. The model leverages a modified binomial thinning operator with a dispersion parameter ρ and integrates random coefficients. This approach combines characteristics from both binomial and negative binomial thinning operators, thereby offering a flexible framework capable of generating counting series exhibiting equidispersion, overdispersion, or underdispersion. Notably, our model includes two distinct classes of first-order bivariate geometric integer-valued autoregressive models: one class employs binomial thinning (BVGINAR(1)), and the other adopts negative binomial thinning (BVNGINAR(1)). We establish the stationarity and ergodicity of the model and estimate its parameters using a combination of the Yule–Walker (YW) and conditional maximum likelihood (CML) methods. Furthermore, Monte Carlo simulation experiments are conducted to evaluate the finite sample performances of the proposed estimators across various parameter configurations, and the Anderson-Darling (AD) test is employed to assess the asymptotic normality of the estimators under large sample sizes. Ultimately, we highlight the practical applicability of the examined model by analyzing two real-world datasets on crime counts in New South Wales (NSW) and comparing its performance with other popular overdispersed BINAR(1) models.

Keywords:

BINAR(1) model; random coefficient; ρ-thinning operator; overdispersion

MSC:

62M10

1. Introduction

Bivariate count data are prevalent across various scenarios and frequently represent the occurrences of two distinct events, objects, or individuals over a specific time frame. Bivariate integer-valued time series models, in particular, excel at preserving the paired relationship between two correlated count variables observed over certain time intervals. Such data arise in many fields, including guest nights in hotels and cottages [1], tick-by-tick data of two highly traded stocks [2], traffic accidents happening both in daylight and nighttime [3], the occurrence of different offenses in specific areas [4], and the counts of asymptomatic and symptomatic COVID-19 cases within considered regions.

Substantial research interest has been directed toward the analysis of bivariate integer-valued time series, particularly regarding their cross-correlated nature. One direction is to describe the correlation between series by employing different bivariate innovation distributions. For instance, [3] proposed a bivariate diagonal INAR(1) model with bivariate Poisson and bivariate negative binomial innovations. Similarly, [5] found that BINAR(1) models with Poisson–Lindley (PL) innovations outperform other competing INAR(1) models, whether based on diagonal or full coefficient matrices. Furthermore, [6] presented a bivariate full INAR(1) model, assuming a time-dependent innovation vector, where the mean of the innovation vector linearly increases with the previous population size. However, these models are all based on constant coefficients.

To introduce more flexibility into BINAR models, [7] considered inference for a bivariate random coefficient INAR(1) model with different geometric marginals. Thereafter, [8] proposed a more general bivariate diagonal random coefficient INAR(1) process (BRCINAR(1)) with dependent innovations. Moreover, [9] compared the performances of random coefficient BINAR(1) models based on the bivariate negative binomial distributions constructed in different ways with explanatory variables.

While these studies have made significant strides in enhancing BINAR(1) model flexibility through various marginal and innovation distributions, they have primarily focused on binomial thinning operators. The binomial thinning operator, initially proposed by [10], remains the most widely utilized approach in modeling integer-valued time series due to its capability of producing integer-valued results and offering strong interpretability. Denoted as “∘”, this operator is defined as follows:

α \circ X = \sum_{i = 1}^{X} B_{i},

where

α \in (0, 1)

, X represents a non-negative integer-valued random variable, and

{B_{i}}

denotes a sequence of independent and identically distributed Bernoulli random variables. Each

B_{i}

has a probability

P (B_{i} = 1) = α

and

P (B_{i} = 0) = 1 - α

, independent of X. Essentially, the binomial thinning operator assigns a value of either 0 or 1 to each counting random variable

B_{i}

, making it suitable for modeling scenarios where random events either survive or vanish after a period of observation.

However, in situations where the observed unit has the potential to generate multiple countable elements or trigger further stochastic occurrences beyond mere survival or disappearance, the Bernoulli random variable may not be the most suitable choice for constructing the counting series. For example, in the context of infectious diseases, an infected individual may not only survive or die but may also contribute to the generation of new cases. To address this limitation, [11] introduced the negative binomial thinning operator, defined as:

α * X = \sum_{i = 1}^{X} G_{i},

where

α \in (0, 1)

, and

{G_{i}}

represents a sequence of independent and identically distributed geometric random variables with a mean of

α

, also independent of X.
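The two operators above can be contrasted with a short simulation. The following is a minimal sketch using only the Python standard library; the function names, the seed, and the values α = 0.4 and X = 10 are illustrative assumptions, not part of the original text.

```python
import random

def binomial_thinning(alpha, x, rng):
    """alpha ∘ x: sum of x i.i.d. Bernoulli(alpha) variables."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def geometric0(mean, rng):
    """Geometric variable on {0, 1, 2, ...} with the given mean."""
    q = mean / (1.0 + mean)  # probability of adding one more count
    g = 0
    while rng.random() < q:
        g += 1
    return g

def negative_binomial_thinning(alpha, x, rng):
    """alpha * x: sum of x i.i.d. geometric variables with mean alpha."""
    return sum(geometric0(alpha, rng) for _ in range(x))

def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(1)  # illustrative seed
draws_b = [binomial_thinning(0.4, 10, rng) for _ in range(20000)]
draws_nb = [negative_binomial_thinning(0.4, 10, rng) for _ in range(20000)]
mean_b = sum(draws_b) / len(draws_b)
mean_nb = sum(draws_nb) / len(draws_nb)
var_b = sample_variance(draws_b)
var_nb = sample_variance(draws_nb)
```

Both operators have the same conditional mean αX, but binomial thinning can never exceed X, whereas negative binomial thinning can, and its conditional variance is larger.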

In fact, while there are many other thinning operators proven useful in univariate integer-valued time series [12,13], the literature on different operators applied to bivariate data is limited. As far as we know, a study by [14] constructed a BINAR(1) model based on the signed thinning operator, which is capable of accommodating data with negative observations. In addition, another study by [15] proposed a new BINAR(1) model that extended the negative binomial thinning operator.

The aim of this paper is to introduce the more flexible ρ-binomial thinning operator [12] to bivariate integer-valued time series and to explore statistical inference for the proposed model. The motivation lies in the versatility of the counting series generated by this thinning operator: it can describe equidispersion, overdispersion, or underdispersion in both counting series. Consequently, the ρ-thinning operator extends the capabilities of the binomial thinning [10] and negative binomial thinning operators [16], offering greater flexibility in fitting paired count data.

The outline of the paper is structured as follows. In Section 2, we introduce a definition of the ρ-BVGINAR(1) model and discuss the basic properties of the thinning operator for bivariate vectors. The properties of the model are further examined in Section 3. Section 4 estimates the model parameters by integrating the Yule–Walker (YW) and conditional maximum likelihood (CML) methods, followed by simulation studies to explore the asymptotic properties of the estimators under various parameter combinations. Moreover, the Anderson-Darling (AD) test is also performed to assess the asymptotic normality of the estimators under large sample sizes. Section 5 illustrates the application of the proposed models to two real-world datasets of crime counts in New South Wales. We examine datasets with varying levels of overdispersion indices and compare the performance of our models with other bivariate integer-valued models. Finally, Section 6 provides concluding remarks and an outlook for future research. Some proofs and figures are provided in the Appendix for reference.

2. Construction of the Model

Ref. [17] introduced a novel variant of the Bernoulli distribution, termed an inflated-parameter Bernoulli (IBe) distribution, designed to model univariate count data exhibiting overdispersion. This distribution incorporates an additional parameter,

ρ

, allowing for more flexible dispersion indices. The probability mass function (pmf) for the IBe distribution is defined as follows:

Pr (W = w) = \{\begin{matrix} 1 - α, & w = 0, \\ α {(\frac{ρ}{1 + ρ})}^{w - 1} (\frac{1}{1 + ρ}), & w \in {1, 2, \dots}, \end{matrix}

(1)

where

α \in [0, 1]

and

ρ \in [0, 1)

. The distribution can be denoted as

W \sim IBe (α, ρ)

. The mean, variance, and probability generating function (pgf) of the IBe distribution are detailed below:

\begin{matrix} E (W) = & α (1 + ρ), \\ Var (W) = & α (1 + ρ) [(1 + 2 ρ) - α (1 + ρ)], \\ Φ_{W} (s) = & \frac{1 - (1 - s) [α (1 + ρ) - ρ]}{1 + ρ (1 - s)}, | s | < 1 . \end{matrix}

Moreover, the dispersion index is expressed as:

I_{W} : = \frac{Var (W)}{E (W)} = ρ + (1 + ρ) (1 - α) .

(2)

Interestingly, the IBe distribution presents three dispersion scenarios depending on the values of the parameters:

Overdispersion is observed when $\frac{α}{2 - α} < ρ < 1$ .
Underdispersion occurs if $0 \leq ρ < \frac{α}{2 - α}$ .
Equidispersion is achieved at $ρ = \frac{α}{2 - α}$ .

The distribution can also degenerate into two important distributions:

The standard Bernoulli distribution, when $ρ = 0$ and $w \in {0, 1}$ , with mean $α$ ;
The geometric distribution, when $α = \frac{ρ}{1 + ρ}$ , with mean $ρ$ .
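The moment formulas and the three dispersion regimes can be checked numerically. The sketch below implements the pmf in Equation (1) and the closed-form mean, variance, and dispersion index; the function names and the tolerance `tol` are illustrative assumptions.

```python
def ibe_pmf(w, alpha, rho):
    """pmf of W ~ IBe(alpha, rho), Equation (1)."""
    if w == 0:
        return 1.0 - alpha
    return alpha * (rho / (1.0 + rho)) ** (w - 1) / (1.0 + rho)

def ibe_mean(alpha, rho):
    return alpha * (1.0 + rho)

def ibe_var(alpha, rho):
    return alpha * (1.0 + rho) * ((1.0 + 2.0 * rho) - alpha * (1.0 + rho))

def ibe_dispersion(alpha, rho):
    """Dispersion index Var(W) / E(W), Equation (2)."""
    return rho + (1.0 + rho) * (1.0 - alpha)

def classify(alpha, rho, tol=1e-9):
    """Name the dispersion regime by comparing the index with 1."""
    d = ibe_dispersion(alpha, rho)
    if abs(d - 1.0) <= tol:
        return "equidispersion"
    return "overdispersion" if d > 1.0 else "underdispersion"

# Numerical mean from the pmf, truncated far into the tail.
numeric_mean = sum(w * ibe_pmf(w, 0.3, 0.2) for w in range(200))
```

Comparing the index with 1 is equivalent to comparing ρ with α/(2 − α), the threshold used in the list above.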

Inspired by these properties, [12] formulated a

ρ

-binomial thinning operator, defined as:

α_{ρ} \circ X : = \sum_{i = 1}^{X} W_{i},

(3)

where X is a non-negative integer-valued random variable, and

{W_{i}}_{i = 1}^{X}

is a sequence of i.i.d. inflated-parameter Bernoulli random variables with the pmf given by Equation (1), independent of X. This operator has proven its practical utility in modeling univariate integer-valued time series, known as the

ρ

-GINAR process.
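A draw from the ρ-binomial thinning operator in Equation (3) can be sketched as follows. Each IBe variable is generated as a Bernoulli(α) indicator which, when it equals one, is extended by a geometric number of extra counts; the function names, seed, and parameter values are illustrative assumptions.

```python
import random

def sample_ibe(alpha, rho, rng):
    """One draw of W ~ IBe(alpha, rho) following the pmf in Equation (1)."""
    if rng.random() >= alpha:
        return 0  # happens with probability 1 - alpha
    w = 1
    # Each extra count is added with probability rho / (1 + rho).
    while rng.random() < rho / (1.0 + rho):
        w += 1
    return w

def rho_thinning(alpha, rho, x, rng):
    """alpha_rho ∘ x: sum of x i.i.d. IBe(alpha, rho) variables."""
    return sum(sample_ibe(alpha, rho, rng) for _ in range(x))

rng = random.Random(2)  # illustrative seed
draws = [rho_thinning(0.3, 0.2, 8, rng) for _ in range(20000)]
empirical_mean = sum(draws) / len(draws)
theoretical_mean = 8 * 0.3 * (1.0 + 0.2)  # X * alpha * (1 + rho)
```

With ρ = 0 the loop never adds extra counts, so the draw reduces to ordinary binomial thinning.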

However, it is crucial to acknowledge that cross-correlations are prevalent in most paired count time series. Hence, this paper aims to enhance the

ρ

-GINAR process by extending it into a bivariate domain, thereby more effectively capturing the inherent correlations within the data.

We propose the

ρ

-BVGINAR(1) model, which is a novel bivariate random coefficient INAR(1) process characterized by the following recursive equation:

X_{t} = {A_{t}}_{ρ} \circ X_{t - 1} + Z_{t} = [\begin{matrix} U_{1, t} & U_{2, t} \\ V_{1, t} & V_{2, t} \end{matrix}]_{ρ} \circ [\begin{matrix} X_{1, t - 1} \\ X_{2, t - 1} \end{matrix}] + [\begin{matrix} Z_{1, t} \\ Z_{2, t} \end{matrix}] .

(4)

Here,

(i): $A_{t}$ represents a random coefficient matrix comprising two mutually independent bivariate random vectors, $(U_{1, t}, U_{2, t})$ and $(V_{1, t}, V_{2, t})$ , each with independent and identically distributed (i.i.d.) components and with pmf values as:

$P (U_{1, t} = α_{1}, U_{2, t} = 0) = 1 - P (U_{1, t} = 0, U_{2, t} = α_{1}) = p_{1},$

$P (V_{1, t} = α_{2}, V_{2, t} = 0) = 1 - P (V_{1, t} = 0, V_{2, t} = α_{2}) = p_{2},$

where $α_{1}, α_{2} \in (0, 1)$ and $p_{1}, p_{2} \in [0, 1]$ . The matrix operation ${A_{t}}_{ρ} \circ$ replicates matrix multiplication while preserving the properties of random coefficient thinning.
(ii): The innovation ${Z_{t}}_{t \in Z}$ is a sequence of i.i.d. bivariate non-negative integer-valued random vectors with mutually independent elements ${Z_{1, t}}$ and ${Z_{2, t}}$ and independent of $X_{s}$ for $s < t$ .

Remark 1.

The proposed bivariate INAR(1) process based on the ρ-binomial thinning operator has two sub-models:

When $ρ = 0$ , it corresponds to the bivariate INAR(1) with geometric marginals introduced by [4].
When $α_{i} = \frac{ρ}{1 + ρ}$ , for both $i = 1$ and $i = 2$ , it aligns with the BVNGINAR(1) proposed by [18].

Next, we further explore the properties of the

ρ

-thinning operator for vectors.

Lemma 1.

Consider

{A_{t}}_{ρ} \circ X_{t - 1}

as defined in Equation (4). Then:

(i): $E ({A_{t}}_{ρ} \circ X_{t - 1}) = \tilde{A} E (X_{t - 1})$ , where

$\tilde{A} = (1 + ρ) A = (1 + ρ) E (A_{t}) = (1 + ρ) [\begin{matrix} α_{1} p_{1} & α_{1} (1 - p_{1}) \\ α_{2} p_{2} & α_{2} (1 - p_{2}) \end{matrix}] .$
(ii): $E (({A_{t}}_{ρ} \circ X_{t - 1}) B^{⊤}) = \tilde{A} E (X_{t - 1} B^{⊤})$ for a random vector $B$ independent of $A_{t}$ .
(iii): $E (B {({A_{t}}_{ρ} \circ X_{t - 1})}^{⊤}) = E (B X_{t - 1}^{⊤}) {\tilde{A}}^{⊤}$ for a random vector $B$ independent of $A_{t}$ .
(iv): $E (({A_{t}}_{ρ} \circ X_{t - 1}) {({A_{t}}_{ρ} \circ X_{t - 1})}^{⊤}) = \tilde{A} E (X_{t - 1} X_{t - 1}^{⊤}) {\tilde{A}}^{⊤} + C$ , where $C$ has elements

$c_{12} = 0, c_{21} = 0,$

$c_{11} = {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E {(X_{1, t - 1} - X_{2, t - 1})}^{2} + α_{1} (1 + ρ) (1 + 2 ρ - α_{1} (1 + ρ)) (p_{1} E (X_{1, t - 1}) + (1 - p_{1}) E (X_{2, t - 1})),$

$c_{22} = {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) E {(X_{1, t - 1} - X_{2, t - 1})}^{2} + α_{2} (1 + ρ) (1 + 2 ρ - α_{2} (1 + ρ)) (p_{2} E (X_{1, t - 1}) + (1 - p_{2}) E (X_{2, t - 1})) .$

Lemma 2.

If we assume

α_{1} (1 + ρ), α_{2} (1 + ρ) \in (0, 1)

, then all eigenvalues of matrix

\tilde{A}

lie within the unit circle.

Proof.

Similar to [19], we outline the key steps. Let

λ_{1}

and

λ_{2}

denote the eigenvalues, with

λ_{2} > λ_{1}

assumed without loss of generality.

(i): We have $λ_{1} + λ_{2} = (1 + ρ) α_{1} p_{1} + (1 + ρ) α_{2} (1 - p_{2}) > 0$ , indicating $λ_{2} > 0$ .
(ii): Furthermore, we calculate $λ_{1} λ_{2} = {(1 + ρ)}^{2} α_{1} p_{1} α_{2} (1 - p_{2}) - {(1 + ρ)}^{2} α_{1} (1 - p_{1}) α_{2} p_{2}$ ; then, we have $(1 - λ_{1}) (1 - λ_{2}) = 1 - (λ_{1} + λ_{2}) + λ_{1} λ_{2} = (1 - (1 + ρ) α_{1} p_{1}) (1 - (1 + ρ) α_{2} (1 - p_{2})) - {(1 + ρ)}^{2} α_{1} (1 - p_{1}) α_{2} p_{2} \geq (1 - p_{1}) (1 - (1 - p_{2})) - (1 - p_{1}) p_{2} = 0 .$ Hence, either both $λ_{1}, λ_{2} < 1$ or both $λ_{1}, λ_{2} > 1$ . Since $λ_{1} + λ_{2} = (1 + ρ) α_{1} p_{1} + (1 + ρ) α_{2} (1 - p_{2}) < 2$ , we deduce both $λ_{1}, λ_{2} < 1$ .
(iii): Similarly, we have $(1 + λ_{1}) (1 + λ_{2}) = 1 + (λ_{1} + λ_{2}) + λ_{1} λ_{2} = (1 + (1 + ρ) α_{1} p_{1}) (1 + (1 + ρ) α_{2} (1 - p_{2})) - {(1 + ρ)}^{2} α_{1} (1 - p_{1}) α_{2} p_{2} > 0$ . As $λ_{2} > 0$ , it follows that $λ_{1} > - 1$ .

Consequently, we conclude

- 1 < λ_{1} < 1

and

0 < λ_{2} < 1

under the conditions

α_{1} (1 + ρ), α_{2} (1 + ρ) \in (0, 1)

. Moreover, this condition ensures the stationarity of both the first- and second-order moments of the process. □
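Lemma 2 can be illustrated numerically for a 2 × 2 matrix by using the trace and determinant. The sketch below builds the matrix from Lemma 1 and computes both eigenvalues in closed form; the function names are illustrative assumptions, and the parameter sets are those listed as Models (A)–(E) in Section 4.

```python
import math

def mean_matrix(alpha1, alpha2, p1, p2, rho):
    """The matrix (1 + rho) * E(A_t) from Lemma 1."""
    s = 1.0 + rho
    return [[s * alpha1 * p1, s * alpha1 * (1.0 - p1)],
            [s * alpha2 * p2, s * alpha2 * (1.0 - p2)]]

def eigenvalues_2x2(m):
    """Eigenvalues (smaller, larger) from the trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    # The discriminant equals (m00 - m11)^2 + 4 * m01 * m10, which is
    # non-negative here because all entries are non-negative, so both
    # eigenvalues are real.
    root = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return ((tr - root) / 2.0, (tr + root) / 2.0)

# (alpha1, alpha2, p1, p2, rho) for Models (A)-(E)
models = {
    "A": (0.3, 0.25, 0.2, 0.15, 0.1),
    "B": (0.3, 0.25, 0.2, 0.15, 0.3),
    "C": (0.4, 0.4, 0.2, 0.15, 0.25),
    "D": (0.6, 0.4, 0.3, 0.7, 0.3),
    "E": (0.6, 0.4, 0.7, 0.3, 0.3),
}
spectral_radius = {
    name: max(abs(lam) for lam in eigenvalues_2x2(mean_matrix(*pars)))
    for name, pars in models.items()
}
```

Each row of the matrix sums to α_i (1 + ρ), so the condition of the lemma bounds the largest row sum, and hence the spectral radius, below one.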

Proposition 1.

If

α_{1} (1 + ρ), α_{2} (1 + ρ) \in (0, 1)

and

p_{1}, p_{2} \in [0, 1]

, a strictly stationary bivariate integer-valued time series

{X_{t}}_{t \in Z}

satisfying Equation (4) exists uniquely. Moreover, the process is ergodic.

The proof of Proposition 1 is provided in Appendix B. Now let us derive the moments and conditional moments of the

ρ

-BVGINAR(1) process.

Proposition 2.

Suppose the bivariate time series

{X_{t}}_{{t \in Z}}

is a stationary process defined by Equation (4); then, for

t \geq 1, i = 1, 2

, we have

(i): $E (X_{t}) = {(I - \tilde{A})}^{- 1} E (Z_{t})$ .
(ii): $Σ_{X_{t + 1}} = \tilde{A} Σ_{X_{t}} {\tilde{A}}^{⊤} + C + Σ_{Z}$ , where $C$ has elements

$c_{12} = 0, c_{21} = 0,$

$c_{11} = {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E {(X_{1, t} - X_{2, t})}^{2} + α_{1} (1 + ρ) (1 + 2 ρ - α_{1} (1 + ρ)) (p_{1} E (X_{1, t}) + (1 - p_{1}) E (X_{2, t})),$

$c_{22} = {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) E {(X_{1, t} - X_{2, t})}^{2} + α_{2} (1 + ρ) (1 + 2 ρ - α_{2} (1 + ρ)) (p_{2} E (X_{1, t}) + (1 - p_{2}) E (X_{2, t})) .$
(iii): $E (X_{i, t + 1} | X_{1, t}, X_{2, t}) = (1 + ρ) α_{i} [p_{i} X_{1, t} + (1 - p_{i}) X_{2, t}] + μ_{Z_{i, t}}$ .
(iv): $Var (X_{i, t + 1} | X_{1, t}, X_{2, t}) = {(1 + ρ)}^{2} α_{i}^{2} p_{i} (1 - p_{i}) (X_{1, t}^{2} + X_{2, t}^{2}) + (1 + ρ) [1 + 2 ρ - (1 + ρ) α_{i}] α_{i} [p_{i} X_{1, t} + (1 - p_{i}) X_{2, t}] - 2 {(1 + ρ)}^{2} α_{i}^{2} p_{i} (1 - p_{i}) X_{1, t} X_{2, t} + σ_{Z_{i, t}}^{2}$ ,

where

μ_{Z_{i, t}}

and

σ_{Z_{i, t}}^{2}

are the mean and variance, respectively, of

{Z_{i, t}}

, for

i = 1, 2

.

We define a stationary bivariate time series

{(X_{1, t}, X_{2, t})}_{t \in Z}

according to Equation (4). By specifying appropriate marginal distributions for

X_{1, t}

and

X_{2, t}

, we can deduce the respective marginal distributions of the innovation

Z_{1, t}

and

Z_{2, t}

. This process is clarified by the theorem below.

Theorem 1.

Let

α_{1} (1 + ρ), α_{2} (1 + ρ) \in (0, 1)

,

p_{1}, p_{2} \in [0, 1]

. If

{X_{1, t}} \overset{d}{=} {X_{2, t}} \overset{d}{=} G e o m (\frac{μ}{1 + μ})

, and

μ > 0

; then the distributions of the innovation processes

{Z_{1, t}}

and

{Z_{2, t}}

are as follows:

Z_{1, t} \overset{d}{=} \{\begin{matrix} G e o m (\frac{ρ}{1 + ρ}), w . p . \frac{α_{1} μ (1 + ρ)}{μ - ρ}, \\ G e o m (\frac{μ}{1 + μ}), w . p . 1 - \frac{α_{1} μ (1 + ρ)}{μ - ρ}, \end{matrix}

(5)

Z_{2, t} \overset{d}{=} \{\begin{matrix} G e o m (\frac{ρ}{1 + ρ}), w . p . \frac{α_{2} μ (1 + ρ)}{μ - ρ}, \\ G e o m (\frac{μ}{1 + μ}), w . p . 1 - \frac{α_{2} μ (1 + ρ)}{μ - ρ} . \end{matrix}

(6)

Proof.

For intuitive purposes, the bivariate time series model

{(X_{1, t}, X_{2, t})}_{t \in Z}

can be represented as

X_{1, t} = \{\begin{matrix} {α_{1}}_{ρ} \circ X_{1, t - 1} + Z_{1, t}, w . p . p_{1}, \\ {α_{1}}_{ρ} \circ X_{2, t - 1} + Z_{1, t}, w . p . 1 - p_{1}, \end{matrix}

X_{2, t} = \{\begin{matrix} {α_{2}}_{ρ} \circ X_{1, t - 1} + Z_{2, t}, w . p . p_{2}, \\ {α_{2}}_{ρ} \circ X_{2, t - 1} + Z_{2, t}, w . p . 1 - p_{2} . \end{matrix}

Given

{X_{1, t}} \overset{d}{=} {X_{2, t}} \overset{d}{=} G e o m (\frac{μ}{1 + μ})

and leveraging the properties of the thinning operator and the stationarity of the process, we have

\begin{matrix} Φ_{{α_{1}}_{ρ} \circ X_{1, t}} (s) = & E (s^{{α_{1}}_{ρ} \circ X_{1, t}}) \\ = & E [E (s^{\sum_{j = 1}^{X_{1, t}} W_{j} (α_{1}, ρ)} | X_{1, t})] \\ = & E [E {(s^{W_{j} (α_{1}, ρ)} | X_{1, t})}^{X_{1, t}}] \\ = & E [{(Φ_{W} (s))}^{X_{1, t}}] \\ = & Φ_{X_{1, t}} (Φ_{W} (s)) \\ = & Φ_{X_{1, t}} (\frac{1 - (1 - s) [α_{1} (1 + ρ) - ρ]}{1 + ρ (1 - s)}) . \end{matrix}

Then the pgf of

Z_{1, t}

can be obtained by

\begin{matrix} Φ_{Z_{1, t}} (s) = & \frac{Φ_{X_{1, t}} (s)}{p_{1} Φ_{X_{1, t}} (Φ_{W} (s)) + (1 - p_{1}) Φ_{X_{2, t}} (Φ_{W} (s))} \\ = & \frac{\frac{1}{1 + μ - μ s}}{p_{1} \cdot Φ_{X_{1, t}} (\frac{1 - (1 - s) [α_{1} (1 + ρ) - ρ]}{1 + ρ (1 - s)}) + (1 - p_{1}) \cdot Φ_{X_{2, t}} (\frac{1 - (1 - s) [α_{1} (1 + ρ) - ρ]}{1 + ρ (1 - s)})} \\ = & \frac{1 + ρ (1 - s) + α_{1} μ (1 + ρ) (1 - s)}{[1 + μ (1 - s)] [1 + ρ (1 - s)]} \\ = & \frac{[1 + ρ (1 - s) + α_{1} μ (1 + ρ) (1 - s)] (μ - ρ)}{[1 + μ (1 - s)] [1 + ρ (1 - s)] (μ - ρ)} \\ = & \frac{(μ - ρ) - α_{1} μ (1 + ρ)}{(μ - ρ) [1 + μ (1 - s)]} + \frac{α_{1} μ (1 + ρ)}{(μ - ρ) [1 + ρ (1 - s)]} \\ = & (1 - \frac{α_{1} μ (1 + ρ)}{μ - ρ}) \frac{1}{1 + μ (1 - s)} + \frac{α_{1} μ (1 + ρ)}{μ - ρ} \frac{1}{1 + ρ (1 - s)} . \end{matrix}

The innovation

Z_{1, t}

is therefore a mixture of two geometric random variables. A similar derivation holds for

Z_{2, t}

. Thus, the distributions of the innovation processes are given by Equations (5) and (6). Note that the condition

0 < ρ < min (\frac{μ (1 - α_{1})}{1 + α_{1} μ}, \frac{μ (1 - α_{2})}{1 + α_{2} μ})

is necessary to ensure the non-negativity of all probabilities for

Z_{1, t}

and

Z_{2, t}

. □
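The representation used in the proof suggests a direct way to simulate the process. The following is a minimal sketch under stated assumptions: the function names, the burn-in length, the seed, and the choice to start from two independent geometric draws are all illustrative, and the thinning counts in the two components are generated independently given the past.

```python
import random

def geom0(mean, rng):
    """Geometric variable on {0, 1, 2, ...} with the given mean."""
    q = mean / (1.0 + mean)
    g = 0
    while rng.random() < q:
        g += 1
    return g

def rho_thin(alpha, rho, x, rng):
    """alpha_rho ∘ x: each of the x units contributes 0 or 1 + Geom(mean rho)."""
    total = 0
    for _ in range(x):
        if rng.random() < alpha:
            total += 1 + geom0(rho, rng)
    return total

def innovation(alpha, rho, mu, rng):
    """Mixture of two geometric laws, as in Equations (5) and (6)."""
    w = alpha * mu * (1.0 + rho) / (mu - rho)
    return geom0(rho, rng) if rng.random() < w else geom0(mu, rng)

def simulate(n, a1, a2, p1, p2, rho, mu, seed=0, burn_in=200):
    """Simulate n pairs (X_{1,t}, X_{2,t}) after discarding a burn-in."""
    bound = min(mu * (1 - a1) / (1 + a1 * mu), mu * (1 - a2) / (1 + a2 * mu))
    if not 0.0 < rho < bound:
        raise ValueError("rho violates the mixing-weight condition")
    rng = random.Random(seed)
    x1, x2 = geom0(mu, rng), geom0(mu, rng)  # assumed starting values
    path = []
    for t in range(n + burn_in):
        src1 = x1 if rng.random() < p1 else x2  # row (U_{1,t}, U_{2,t})
        src2 = x1 if rng.random() < p2 else x2  # row (V_{1,t}, V_{2,t})
        x1, x2 = (rho_thin(a1, rho, src1, rng) + innovation(a1, rho, mu, rng),
                  rho_thin(a2, rho, src2, rng) + innovation(a2, rho, mu, rng))
        if t >= burn_in:
            path.append((x1, x2))
    return path

# Parameter values of Model (A) from Section 4.
path = simulate(5000, 0.3, 0.25, 0.2, 0.15, 0.1, 5.0, seed=42)
mean1 = sum(a for a, _ in path) / len(path)
mean2 = sum(b for _, b in path) / len(path)
```

Both sample means should lie close to μ, in line with the common geometric marginal assumed in Theorem 1.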

3. Properties

Lemma 3.

Let

0 < ρ < min (\frac{μ (1 - α_{1})}{1 + α_{1} μ}, \frac{μ (1 - α_{2})}{1 + α_{2} μ})

, and

p_{1}, p_{2} \in [0, 1]

. The correlation coefficient between

{X_{1, t}}

and

{X_{2, t}}

lies in

[0, 1)

and is expressed as:

γ = \frac{{(1 + ρ)}^{2} α_{1} α_{2} (p_{1} p_{2} + (1 - p_{1}) (1 - p_{2}))}{1 - {(1 + ρ)}^{2} α_{1} α_{2} (p_{1} (1 - p_{2}) + (1 - p_{1}) p_{2})} .

(7)

It is evident that the correlation coefficient γ lies in the interval

[0, 1)

.
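Equation (7) is straightforward to evaluate. The short sketch below computes γ for a given parameter set; the function name is an illustrative assumption, and the example values are those of Model (A) in Section 4.

```python
def cross_correlation(alpha1, alpha2, p1, p2, rho):
    """Correlation coefficient between X_{1,t} and X_{2,t}, Equation (7)."""
    k = (1.0 + rho) ** 2 * alpha1 * alpha2
    same = p1 * p2 + (1.0 - p1) * (1.0 - p2)   # both rows pick the same component
    cross = p1 * (1.0 - p2) + (1.0 - p1) * p2  # the rows pick different components
    return k * same / (1.0 - k * cross)

gamma_a = cross_correlation(0.3, 0.25, 0.2, 0.15, 0.1)
```

Since `same + cross = 1` and `k < 1` under the stationarity condition, the numerator is smaller than the denominator, which is why γ stays below one.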

From Lemma 1, we find that the covariance matrix between the random vectors

X_{t}

and

X_{0}

is given by

Cov (X_{t}, X_{0}) = \tilde{A} Cov (X_{t - 1}, X_{0}) = \dots = {\tilde{A}}^{t} Var (X_{0}) \to 0, t \to \infty .

The one-step-ahead conditional expectation can be derived as follows:

\begin{matrix} E (X_{t + 1} | X_{t}) = \tilde{A} X_{t} + E (Z_{t + 1}) . \end{matrix}

More generally, the k-step-ahead conditional expectation of

X_{t}

is

\begin{matrix} E (X_{t} | X_{t - k}) = & E ({A_{t}}_{ρ} \circ X_{t - 1} + Z_{t} | X_{t - k}) \\ = & E ({A_{t}^{k}}_{ρ} \circ X_{t - k} + \sum_{j = 0}^{k - 1} {A_{t}^{j}}_{ρ} \circ Z_{t - j} | X_{t - k}) \\ = & {\tilde{A}}^{k} X_{t - k} + \sum_{j = 0}^{k - 1} {\tilde{A}}^{j} E (Z_{t}) \\ = & {\tilde{A}}^{k} X_{t - k} + [I - {\tilde{A}}^{k}] {[I - \tilde{A}]}^{- 1} E (Z_{t}) . \end{matrix}

Meanwhile, we observe that

E (Z_{t}) = [I - \tilde{A}] E (X_{t})

, thereby implying that

E (X_{t} | X_{t - k}) = {\tilde{A}}^{k} X_{t - k} + [I - {\tilde{A}}^{k}] E (X_{t}) \to E (X_{t}), k \to \infty .

This finding validates the characteristic of autoregressive processes, whereby the conditional expectation converges to the unconditional expectation as the number of steps approaches infinity.
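The convergence of the forecast can be seen by iterating the one-step conditional expectation. The sketch below is an illustration under the geometric marginals of Theorem 1, for which E(Z_{i,t}) = μ (1 − (1 + ρ) α_i); the function names and the starting observation (12, 0) are illustrative assumptions.

```python
def one_step(x, a_tilde, ez):
    """E(X_{t+1} | X_t = x) = A~ x + E(Z)."""
    return [a_tilde[0][0] * x[0] + a_tilde[0][1] * x[1] + ez[0],
            a_tilde[1][0] * x[0] + a_tilde[1][1] * x[1] + ez[1]]

def k_step_forecast(x, k, alpha1, alpha2, p1, p2, rho, mu):
    """k-step-ahead conditional expectation, built by iterating one_step."""
    s = 1.0 + rho
    a_tilde = [[s * alpha1 * p1, s * alpha1 * (1.0 - p1)],
               [s * alpha2 * p2, s * alpha2 * (1.0 - p2)]]
    # Innovation means implied by E(Z) = (I - A~) E(X) with E(X) = (mu, mu).
    ez = [mu * (1.0 - s * alpha1), mu * (1.0 - s * alpha2)]
    f = [float(x[0]), float(x[1])]
    for _ in range(k):
        f = one_step(f, a_tilde, ez)
    return f

# Parameter values of Model (A) from Section 4.
pars = (0.3, 0.25, 0.2, 0.15, 0.1, 5.0)
forecasts = {k: k_step_forecast((12, 0), k, *pars) for k in (0, 1, 5, 100)}
```

Iterating the recursion k times is algebraically the same as the closed form above, and for large k both components approach μ.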

Due to the conditional independence of the random variables

X_{1, t}

and

X_{2, t}

conditioned on

X_{1, t - 1}

and

X_{2, t - 1}

, the joint conditional probability function factorizes into the product of the individual conditional probabilities. Therefore, we can derive the conditional probability function of the random vector

{(X_{1, t}, X_{2, t})}^{⊤}

as follows:

\begin{matrix} P (X_{1, t} = x, X_{2, t} = y | X_{1, t - 1} = u, X_{2, t - 1} = v) = & P (X_{1, t} = x | X_{1, t - 1} = u, X_{2, t - 1} = v) \\ & \cdot P (X_{2, t} = y | X_{1, t - 1} = u, X_{2, t - 1} = v), \end{matrix}

(8)

where

\begin{matrix} P (X_{1, t} = & x | X_{1, t - 1} = u, X_{2, t - 1} = v) \\ = & p_{1} (P ({α_{1}}_{ρ} \circ X_{1, t - 1} = 0 | X_{1, t - 1} = u) P (Z_{1, t} = x) \\ + \sum_{k = 1}^{x} P ({α_{1}}_{ρ} \circ X_{1, t - 1} = k | X_{1, t - 1} = u) P (Z_{1, t} = x - k)) \\ + (1 - p_{1}) (P ({α_{1}}_{ρ} \circ X_{2, t - 1} = 0 | X_{2, t - 1} = v) P (Z_{1, t} = x) \\ + \sum_{k = 1}^{x} P ({α_{1}}_{ρ} \circ X_{2, t - 1} = k | X_{2, t - 1} = v) P (Z_{1, t} = x - k)), \end{matrix}

(9)

and

\begin{matrix} P (X_{2, t} = & y | X_{1, t - 1} = u, X_{2, t - 1} = v) \\ = & p_{2} (P ({α_{2}}_{ρ} \circ X_{1, t - 1} = 0 | X_{1, t - 1} = u) P (Z_{2, t} = y) \\ + \sum_{k = 1}^{y} P ({α_{2}}_{ρ} \circ X_{1, t - 1} = k | X_{1, t - 1} = u) P (Z_{2, t} = y - k)) \\ + (1 - p_{2}) (P ({α_{2}}_{ρ} \circ X_{2, t - 1} = 0 | X_{2, t - 1} = v) P (Z_{2, t} = y) \\ + \sum_{k = 1}^{y} P ({α_{2}}_{ρ} \circ X_{2, t - 1} = k | X_{2, t - 1} = v) P (Z_{2, t} = y - k)) . \end{matrix}

(10)

Here, the distributions of the random variables

Z_{1, t}

and

Z_{2, t}

are defined by Theorem 1, so their probability mass functions are given by

P (Z_{1, t} = z_{1}) = \frac{μ^{z_{1}}}{{(1 + μ)}^{z_{1} + 1}} (1 - \frac{α_{1} μ (1 + ρ)}{μ - ρ}) + \frac{ρ^{z_{1}}}{{(1 + ρ)}^{z_{1} + 1}} (\frac{α_{1} μ (1 + ρ)}{μ - ρ})

(11)

and

P (Z_{2, t} = z_{2}) = \frac{μ^{z_{2}}}{{(1 + μ)}^{z_{2} + 1}} (1 - \frac{α_{2} μ (1 + ρ)}{μ - ρ}) + \frac{ρ^{z_{2}}}{{(1 + ρ)}^{z_{2} + 1}} (\frac{α_{2} μ (1 + ρ)}{μ - ρ}) .

(12)
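The transition probabilities in Equations (8)–(12) can be evaluated directly. The sketch below is one possible implementation; the function names are illustrative assumptions. The thinning pmf is written as a binomial number of non-zero IBe terms combined with a negative binomial count of extra units, which is algebraically the same sum that appears in the likelihood but avoids dividing by ρ.

```python
from math import comb

def thin_pmf(k, u, alpha, rho):
    """P(alpha_rho ∘ X = k | X = u)."""
    if k == 0:
        return (1.0 - alpha) ** u
    q = rho / (1.0 + rho)
    return sum(
        comb(u, i) * alpha ** i * (1.0 - alpha) ** (u - i)      # i non-zero terms
        * comb(k - 1, i - 1) * (1.0 - q) ** i * q ** (k - i)    # they sum to k
        for i in range(1, min(k, u) + 1)
    )

def innov_pmf(z, alpha, rho, mu):
    """pmf of the innovation, Equations (11) and (12)."""
    w = alpha * mu * (1.0 + rho) / (mu - rho)
    return ((1.0 - w) * mu ** z / (1.0 + mu) ** (z + 1)
            + w * rho ** z / (1.0 + rho) ** (z + 1))

def cond_pmf(x, u, v, alpha, p, rho, mu):
    """P(X_{i,t} = x | X_{1,t-1} = u, X_{2,t-1} = v), Equations (9) and (10)."""
    return sum(
        (p * thin_pmf(k, u, alpha, rho) + (1.0 - p) * thin_pmf(k, v, alpha, rho))
        * innov_pmf(x - k, alpha, rho, mu)
        for k in range(x + 1)
    )

def joint_cond_pmf(x, y, u, v, a1, a2, p1, p2, rho, mu):
    """Equation (8): product of the two marginal transition probabilities."""
    return cond_pmf(x, u, v, a1, p1, rho, mu) * cond_pmf(y, u, v, a2, p2, rho, mu)

# Sanity checks with Model (A) values and an assumed previous state (u, v) = (3, 5).
probs = [cond_pmf(x, 3, 5, 0.3, 0.2, 0.1, 5.0) for x in range(300)]
total_mass = sum(probs)
cond_mean = sum(x * pr for x, pr in enumerate(probs))
```

The conditional mean recovered from the pmf can be compared with the closed form in Proposition 2 (iii) as a consistency check.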

4. Estimation Procedure

In this section, we consider

X_{t}

as a strictly stationary and ergodic solution of the

ρ

-BVGINAR(1) process, with

{X_{t}}_{t \in Z}

representing a series of observations generated from this process. We discuss the estimation of the six model parameters: one dispersion parameter of the thinning operator (

ρ

), two for the autocorrelation coefficients (

α_{1}, α_{2}

), two for specifying the dependence between processes

X_{1, t}

and

X_{2, t}

(

p_{1}, p_{2}

), and one for the marginal distributions (

μ

). Considering the unique characteristics of these parameters, we integrate two estimation approaches: the Yule–Walker (YW) and the conditional maximum likelihood (CML) methods.

The sample mean is commonly employed for estimating model parameters in time series analysis. Since the model assumption is that the marginal distribution

{X_{i, t}} \overset{d}{=} G e o m (\frac{μ}{1 + μ})

, then

μ = E (X_{1, t}) = E (X_{2, t})

. Thus, a natural estimator is:

{\hat{μ}}^{Y W} = \frac{1}{2 n} \sum_{j = 1}^{n} (X_{1, j} + X_{2, j}) .

Theorem 2.

The estimator

{\hat{μ}}^{Y W}

is strongly consistent.

Proof.

Proposition 1 proved that process

{X_{t}}

is stationary and ergodic. Then, processes

{X_{1, t}}

and

{X_{2, t}}

are jointly stationary, which implies that

\frac{1}{2} (X_{1, t} + X_{2, t})

is also stationary and ergodic. By the ergodic theorem, we have

\frac{1}{2 n} \sum_{j = 1}^{n} (X_{1, j} + X_{2, j}) \overset{a . s .}{\to} \frac{1}{2} E (X_{1, t} + X_{2, t}) .

Therefore,

{\hat{μ}}^{Y W} \overset{a . s .}{\to} μ .

The proof of Theorem 2 is complete. □

Theorem 3.

The estimator

{\hat{μ}}^{Y W}

is asymptotically normally distributed with parameters

(μ, n^{- 1} v_{s})

, where

v_{s} = \sum_{k = - \infty}^{\infty} C o v (S_{t}, S_{t - k})

and

S_{t} = \frac{1}{2} (X_{1, t} + X_{2, t})

.

In addition, the conditional maximum likelihood (CML) stands out as one of the most commonly employed techniques for parameter estimation. The CML estimator of parameter vector

θ = {(α_{1}, α_{2}, p_{1}, p_{2}, ρ)}^{⊤}

is the value

\hat{θ} = {({\hat{α}}_{1}, {\hat{α}}_{2}, {\hat{p}}_{1}, {\hat{p}}_{2}, \hat{ρ})}^{⊤}

that maximizes the conditional log-likelihood function

L (θ)

. Suppose that

X_{1}

is fixed. The conditional log-likelihood function is given by

L (θ) = \sum_{t = 2}^{n} log {P (X_{1, t} = x, X_{2, t} = y | X_{1, t - 1} = u, X_{2, t - 1} = v, θ)} .

(13)

Under the given conditions, the conditional probability mass functions of processes

X_{1, t}

and

X_{2, t}

can be represented as the products of their respective conditional probabilities. These probabilities result from the convolution of the

ρ

-binomial distribution and the probability mass function of the corresponding innovation processes. Specifically,

\begin{matrix} L (θ) = & \sum_{t = 2}^{n} log [P (X_{1, t} = x | X_{1, t - 1} = u, X_{2, t - 1} = v)] + \sum_{t = 2}^{n} log [P (X_{2, t} = y | X_{1, t - 1} = u, X_{2, t - 1} = v)] \\ = & \sum_{t = 2}^{n} log [(p_{1} {(1 - α_{1})}^{u} + (1 - p_{1}) {(1 - α_{1})}^{v}) (\frac{μ^{x}}{{(1 + μ)}^{x + 1}} (1 - \frac{α_{1} μ (1 + ρ)}{μ - ρ}) + \frac{ρ^{x}}{{(1 + ρ)}^{x + 1}} \frac{α_{1} μ (1 + ρ)}{μ - ρ}) \\ & + \sum_{k = 1}^{x} (p_{1} {(1 - α_{1})}^{u} {(\frac{ρ}{1 + ρ})}^{k} \sum_{i = 1}^{min (k, u)} \binom{u}{i} \binom{k - 1}{i - 1} {(\frac{α_{1}}{ρ (1 - α_{1})})}^{i} + (1 - p_{1}) {(1 - α_{1})}^{v} {(\frac{ρ}{1 + ρ})}^{k} \sum_{i = 1}^{min (k, v)} \binom{v}{i} \binom{k - 1}{i - 1} {(\frac{α_{1}}{ρ (1 - α_{1})})}^{i}) \\ & \times (\frac{μ^{x - k}}{{(1 + μ)}^{x + 1 - k}} (1 - \frac{α_{1} μ (1 + ρ)}{μ - ρ}) + \frac{ρ^{x - k}}{{(1 + ρ)}^{x + 1 - k}} \frac{α_{1} μ (1 + ρ)}{μ - ρ})] \\ & + \sum_{t = 2}^{n} log [(p_{2} {(1 - α_{2})}^{u} + (1 - p_{2}) {(1 - α_{2})}^{v}) (\frac{μ^{y}}{{(1 + μ)}^{y + 1}} (1 - \frac{α_{2} μ (1 + ρ)}{μ - ρ}) + \frac{ρ^{y}}{{(1 + ρ)}^{y + 1}} \frac{α_{2} μ (1 + ρ)}{μ - ρ}) \\ & + \sum_{k = 1}^{y} (p_{2} {(1 - α_{2})}^{u} {(\frac{ρ}{1 + ρ})}^{k} \sum_{i = 1}^{min (k, u)} \binom{u}{i} \binom{k - 1}{i - 1} {(\frac{α_{2}}{ρ (1 - α_{2})})}^{i} + (1 - p_{2}) {(1 - α_{2})}^{v} {(\frac{ρ}{1 + ρ})}^{k} \sum_{i = 1}^{min (k, v)} \binom{v}{i} \binom{k - 1}{i - 1} {(\frac{α_{2}}{ρ (1 - α_{2})})}^{i}) \\ & \times (\frac{μ^{y - k}}{{(1 + μ)}^{y + 1 - k}} (1 - \frac{α_{2} μ (1 + ρ)}{μ - ρ}) + \frac{ρ^{y - k}}{{(1 + ρ)}^{y + 1 - k}} \frac{α_{2} μ (1 + ρ)}{μ - ρ})] . \end{matrix}

The likelihood function reveals that the parameters

α_{1}, α_{2}, ρ

, and

μ

often interact multiplicatively, posing significant challenges for the optimization process. To address potential issues with parameter identifiability and leverage the specific characteristics of these parameters, we have implemented a stepwise optimization strategy. Initially, we estimate

μ

using the Yule–Walker method. This estimate

{\hat{μ}}^{Y W}

is then incorporated back into the likelihood function to facilitate the optimization of the remaining parameters. This tailored approach helps reduce the risk of converging to local optima, a prevalent concern with non-convex objective functions.
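The stepwise strategy can be sketched as an objective function: the mean μ is first fixed at its Yule–Walker estimate, and the negative conditional log-likelihood is then a function of the remaining five parameters only. This is a minimal illustration, not the authors' implementation; the function names and the toy data are hypothetical, and the resulting function would still have to be passed to a numerical minimizer.

```python
from math import comb, inf, log

def thin_pmf(k, u, alpha, rho):
    """P(alpha_rho ∘ X = k | X = u)."""
    if k == 0:
        return (1.0 - alpha) ** u
    q = rho / (1.0 + rho)
    return sum(comb(u, i) * alpha ** i * (1.0 - alpha) ** (u - i)
               * comb(k - 1, i - 1) * (1.0 - q) ** i * q ** (k - i)
               for i in range(1, min(k, u) + 1))

def innov_pmf(z, alpha, rho, mu):
    """Mixture-of-geometrics pmf of the innovation."""
    w = alpha * mu * (1.0 + rho) / (mu - rho)
    return ((1.0 - w) * mu ** z / (1.0 + mu) ** (z + 1)
            + w * rho ** z / (1.0 + rho) ** (z + 1))

def cond_pmf(x, u, v, alpha, p, rho, mu):
    """One-component transition probability given the previous pair (u, v)."""
    return sum((p * thin_pmf(k, u, alpha, rho) + (1.0 - p) * thin_pmf(k, v, alpha, rho))
               * innov_pmf(x - k, alpha, rho, mu) for k in range(x + 1))

def mu_yw(data):
    """Step 1: Yule-Walker (sample mean) estimate of mu."""
    return sum(x1 + x2 for x1, x2 in data) / (2.0 * len(data))

def neg_log_lik(theta, data, mu):
    """Step 2: negative conditional log-likelihood with mu held fixed."""
    a1, a2, p1, p2, rho = theta
    bound = min(mu * (1 - a1) / (1 + a1 * mu), mu * (1 - a2) / (1 + a2 * mu)) \
        if 0 < a1 < 1 and 0 < a2 < 1 else 0.0
    if not (0 <= p1 <= 1 and 0 <= p2 <= 1 and 0 < rho < bound):
        return inf  # outside the admissible parameter space
    total = 0.0
    for (u, v), (x, y) in zip(data, data[1:]):
        total += log(cond_pmf(x, u, v, a1, p1, rho, mu))
        total += log(cond_pmf(y, u, v, a2, p2, rho, mu))
    return -total

toy_data = [(3, 4), (5, 2), (1, 0), (6, 7), (4, 4), (2, 5)]  # hypothetical counts
mu_hat = mu_yw(toy_data)
value = neg_log_lik((0.3, 0.25, 0.2, 0.15, 0.1), toy_data, mu_hat)
```

Returning infinity outside the admissible region is one simple way to keep an unconstrained optimizer inside the parameter space; a reparameterization would be an alternative.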

For numerical maximization, we employ the “nlm” function in R. All computational experiments were performed using R version 4.0.3 on a system equipped with an Intel Xeon Gold 6154 processor (Intel Corporation, Santa Clara, CA, USA) and 256 GB of RAM.

Next, we present the detailed simulation study design and results. We generated

ρ

-BVGINAR(1) samples with various model parameterizations and sample sizes

n = 150, 300, 600, 1200, 3000

, where

n = 300

is close to the length of the crime counts that will be analyzed in Section 5. We considered the following parameter configurations:

Model (A): ( $α_{1}, α_{2}, p_{1}, p_{2}, ρ, μ$ ) = (0.3, 0.25, 0.2, 0.15, 0.1, 5);
Model (B): ( $α_{1}, α_{2}, p_{1}, p_{2}, ρ, μ$ ) = (0.3, 0.25, 0.2, 0.15, 0.3, 5);
Model (C): ( $α_{1}, α_{2}, p_{1}, p_{2}, ρ, μ$ ) = (0.4, 0.4, 0.2, 0.15, 0.25, 5);
Model (D): ( $α_{1}, α_{2}, p_{1}, p_{2}, ρ, μ$ ) = (0.6, 0.4, 0.3, 0.7, 0.3, 3);
Model (E): ( $α_{1}, α_{2}, p_{1}, p_{2}, ρ, μ$ ) = (0.6, 0.4, 0.7, 0.3, 0.3, 3).

Recalling the properties of IBe random variables discussed in Section 2, we explored diverse parameter combinations of

α_{1}

,

α_{2}

, and

ρ

in our simulation study. Models (A), (B), and (C) represent scenarios of underdispersion (

ρ < \frac{α_{1}}{2 - α_{1}}, \frac{α_{2}}{2 - α_{2}}

), overdispersion (

ρ > \frac{α_{1}}{2 - α_{1}}, \frac{α_{2}}{2 - α_{2}}

), and equidispersion (

ρ = \frac{α_{1}}{2 - α_{1}}, \frac{α_{2}}{2 - α_{2}}

) of the IBe random variables

W_{i} (α_{1}, ρ)

and

W_{i} (α_{2}, ρ)

under

ρ

-thinning. Conversely, Models (D) and (E) are characterized by distinct dispersion patterns of

W_{i} (α_{1}, ρ)

and

W_{i} (α_{2}, ρ)

. Specifically, Model (D) sets underdispersion of

W_{i} (α_{1}, ρ)

and overdispersion of

W_{i} (α_{2}, ρ)

(

ρ < \frac{α_{1}}{2 - α_{1}}

,

ρ > \frac{α_{2}}{2 - α_{2}}

), while Model (E) shows overdispersion of

W_{i} (α_{1}, ρ)

and underdispersion of

W_{i} (α_{2}, ρ)

(

ρ > \frac{α_{1}}{2 - α_{1}}

,

ρ < \frac{α_{2}}{2 - α_{2}}

).

To assess model performance, we employed two widely recognized criteria: mean absolute error (MAE) and root mean squared error (RMSE), based on

M = 500

replications for each model parametrization. MAE is less sensitive to occasional large errors, while RMSE weights larger deviations more heavily. Reporting both gives a fuller picture of estimation accuracy. They are defined as follows:

MAE = \frac{1}{M} \sum_{m = 1}^{M} | {\hat{θ}}_{m} - θ |, RMSE = \sqrt{\frac{1}{M} \sum_{m = 1}^{M} {({\hat{θ}}_{m} - θ)}^{2}},

where

{\hat{θ}}_{m}

is the estimate of

θ

at the m-th replication.
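Both criteria are simple to compute from the replicated estimates. The sketch below takes MAE as the plain average of absolute errors and RMSE as the square root of the average squared error; the function names and the four example estimates are hypothetical.

```python
import math

def mae(estimates, true_value):
    """Mean absolute error over the replications."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)

def rmse(estimates, true_value):
    """Root mean squared error over the replications."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))

# Hypothetical estimates of a parameter whose true value is 0.3.
estimates = [0.28, 0.31, 0.33, 0.30]
mae_value = mae(estimates, 0.3)
rmse_value = rmse(estimates, 0.3)
```

RMSE is never smaller than MAE, and the gap between them widens when a few replications are far from the true value.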

In addition, to examine the asymptotic normality of the estimators, we conducted a goodness-of-fit test for normality. The Anderson-Darling (AD) test, proposed by [20], assesses whether data come from a specified distribution, typically a normal distribution. Compared with many other normality tests, the AD test gives more weight to the tails of the distribution, which makes it more sensitive to departures from normality in the tails. Therefore, we selected the AD test to examine the asymptotic normality of the estimators, using the procedure from the R package “nortest” authored by [21].
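To make the test statistic concrete, the following sketch computes the Anderson-Darling statistic for normality with the mean and standard deviation estimated from the sample. It is an illustration only and is not the “nortest” implementation: it returns the raw statistic without any small-sample adjustment or p-value, and the function name, the seed, and the two synthetic samples are assumptions.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(sample):
    """Raw A^2 statistic against a normal law with estimated mean and sd."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    z = sorted((x - m) / s for x in sample)
    std = NormalDist()

    def cdf(value):
        # Clamp away from 0 and 1 so that the logarithms stay finite.
        return min(max(std.cdf(value), 1e-300), 1.0 - 1e-15)

    total = 0.0
    for i in range(1, n + 1):
        # z[i - 1] is the i-th order statistic, z[n - i] the (n + 1 - i)-th.
        total += (2 * i - 1) * (math.log(cdf(z[i - 1])) + math.log(1.0 - cdf(z[n - i])))
    return -n - total / n

rng = random.Random(7)  # illustrative seed
normal_sample = [rng.gauss(0.3, 0.05) for _ in range(500)]
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]
a2_normal = anderson_darling_normal(normal_sample)
a2_skewed = anderson_darling_normal(skewed_sample)
```

A small statistic indicates close agreement with the normal distribution, while a clearly non-normal sample such as the exponential one produces a much larger value.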

Table 1, Table 2 and Table 3 report the estimates, biases, MAEs, and RMSEs for Models (A)–(E) across various sample sizes, as well as the AD test statistics (denoted AD in the tables) and corresponding p-values for

n = 3000

. From Table 1, we observe that the biases, MAEs, and RMSEs of the estimates for Models (A) and (B) decrease as the sample size n increases, as expected. Figure 1 and Figure 2 also illustrate the notable downward trends in MAEs and RMSEs of the estimates, implying the consistency of the proposed estimators with increasing values of n. A similar conclusion can be drawn from Table 2 and Table 3, along with the corresponding visual curves in Figure 3, Figure 4 and Figure 5. Based on the above discussions, we conclude that the estimation method can produce reliable parameter estimators.

Table 1. Simulation results for Model (A) and (B) under different sample sizes.

Table 2. Simulation results of Model (C) under different sample sizes.

Table 3. Simulation results of Model (D) and (E) under different sample sizes.

Figure 1. Variation of MAE and RMSE for Model (A) estimates across various sample sizes.

Figure 2. Variation of MAE and RMSE for Model (B) estimates across various sample sizes.

Figure 3. Variation of MAE and RMSE for Model (C) estimates across various sample sizes.

Figure 4. Variation of MAE and RMSE for Model (D) estimates across various sample sizes.

Figure 5. Variation of MAE and RMSE for Model (E) estimates across various sample sizes.

Figure 6, Figure 7, Figure 8, Figure 9, Figure 10, Figure 11, Figure 12, Figure 13, Figure 14 and Figure 15 display the Gaussian QQ plots of the proposed estimators for Models (A)–(E) across various sample sizes. For small sample sizes, particularly when

n = 150

, the data points are concentrated along a line that deviates from the 45-degree diagonal. However, as n increases, more data points align closely with the 45-degree line, indicating a good match between the distribution of the estimator and the normal distribution. Furthermore, all p-values of the AD normality test for the estimates when

n = 3000

for Models (A)–(E) are greater than the significance level of 0.05, as shown in Table 1, Table 2 and Table 3, further confirming the asymptotic normality of the estimators for large sample sizes. Based on the above facts, we conclude that the proposed estimation method is trustworthy for the models under consideration and can yield estimators with asymptotic normality.

Figure 6. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (A) across various sample sizes.

Figure 7. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (A) across various sample sizes.

Figure 8. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (B) across various sample sizes.

Figure 9. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (B) across various sample sizes.

Figure 10. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (C) across various sample sizes.

Figure 11. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (C) across various sample sizes.

Figure 12. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (D) across various sample sizes.

Figure 13. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (D) across various sample sizes.

Figure 14. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (E) across various sample sizes.

Figure 15. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (E) across various sample sizes.

5. Real Data Examples

In this section, we present two applications to demonstrate the effectiveness of our proposed

ρ

-BVGINAR(1) model in capturing overdispersion and zero inflation phenomena. We compare our model with existing bivariate INAR(1) models utilizing standard binomial and negative binomial thinning operators. Specifically, we consider the following models:

BVNGINAR(1) model ([18]);
BVPOINAR(1) model ([4]);
BVMIXINAR(1) model ([22]).

Because both the binomial and negative binomial thinning operators are special cases of the

ρ

-binomial thinning operator, the three models above are natural competitors for comparison. For fairness, we estimate the mean parameter

μ

using the Yule–Walker method across all models, while employing maximum likelihood estimation for other parameters. Model performance was evaluated using the Akaike information criterion (AIC) and Bayesian information criterion (BIC).

The dataset used in this analysis was obtained from the NSW Bureau of Crime Statistics and Research (BOCSAR). It is organized by local government area (LGA), offense category (including subcategories), and month. Covering the period from January 1995 to December 2023, the dataset includes 348 monthly crime counts in New South Wales (NSW), Australia. The data can be downloaded from the following website: https://www.bocsar.nsw.gov.au/Pages/bocsar_datasets/Offence.aspx, accessed on 13 May 2024.

To illustrate the characteristics of the observations, we present descriptive statistics for each series. These statistics include the minimum (Min), maximum (Max), median, mean, variance (Var), dispersion index (

I_{d}

), and zero inflation index (

z_{0}

). The

I_{d}

is defined in Equation (2). The

z_{0}

index, introduced by [23], is employed to assess the excess occurrence of zeros in count data and is formulated as:

z_{0} = 1 + \frac{ln (p_{0})}{λ},

where

p_{0}

denotes the proportion of zeros, and

λ

represents the mean. A

z_{0}

value greater than 0 indicates zero inflation, while a value less than 0 suggests zero deflation.
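A minimal sketch of how the two indices can be computed for a count series is given below. It assumes that the dispersion index of Equation (2) is the variance-to-mean ratio and uses the sample variance; the function names and the example counts are hypothetical.

```python
import math
from statistics import mean, variance


def dispersion_index(counts):
    """Variance-to-mean ratio (assumed form of I_d); values above 1 suggest overdispersion."""
    return variance(counts) / mean(counts)


def zero_inflation_index(counts):
    """z_0 = 1 + ln(p_0) / mean, where p_0 is the observed proportion of zeros."""
    lam = mean(counts)
    p0 = sum(1 for c in counts if c == 0) / len(counts)
    if lam <= 0 or p0 == 0:
        raise ValueError("z_0 requires a positive mean and at least one zero count")
    return 1.0 + math.log(p0) / lam


# Hypothetical monthly counts, for illustration only
counts = [0, 0, 0, 1, 0, 2, 0, 0, 3, 0]
print(dispersion_index(counts), zero_inflation_index(counts))
```

For a Poisson distribution the probability of zero is exp(−λ), so ln(p_0) = −λ and z_0 = 0, which is why positive values are read as an excess of zeros relative to the Poisson benchmark.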

5.1. Crime Data: Disorderly Conduct Counts in Carrathool

In the first application, we analyze disorderly conduct counts in Carrathool, including three subcategories: “offensive conduct” (OCND), “offensive language” (OLNG), and “criminal intent”. Evidently, OCND and OLNG often co-occur due to similar contexts, likely indicating a significant degree of mutual association between their counts. Therefore, we applied the

ρ

-BVGINAR(1) model to fit the counts of OCND and OLNG.

The time series, autocorrelation function (ACF), and cross-correlation function (CCF) plots for the OCND and OLNG series are depicted in Figure 16 and Figure 17. The ACF plots show autocorrelation in both series, while the values in the CCF plot surpass the confidence interval, suggesting non-independence between the two series. Table 4 presents descriptive statistics for both series. The empirical mean values for OCND and OLNG are relatively close, at 0.3448 and 0.2960, respectively. Both sequences exhibit dispersion indices

I_{d}

of

1.6098

and

1.3487

, slightly exceeding 1, indicating marginal overdispersion. Moreover, the

z_{0}

values for both series exceed 0, indicating zero inflation characteristics in the data. Further insight is provided by their histograms in Figure 18, which highlight a notable proportion of zeros in each series.

Figure 16. Sample paths of OCND and OLNG series.

Figure 17. The autocorrelation function (ACF) and cross-correlation (CCF) plots of OCND and OLNG series.

Table 4. Descriptive statistics of OCND and OLNG series.

Figure 18. Histograms of OCND and OLNG counts.

The fitted results of the proposed

ρ

-BVGINAR(1), BVNGINAR(1), BVPOINAR(1), and BVMIXINAR(1) models are summarized in Table 5. Despite incorporating a mixture of binomial and negative binomial thinning, the BVMIXINAR(1) model exhibits the poorest performance, evidenced by its minimum log-likelihood value (

- 526.14

), maximum AIC (1062.29) and BIC (1081.55) values. The BVNGINAR(1) and BVPOINAR(1) models perform comparably in terms of their log-likelihood values and are second only to the

ρ

-BVGINAR(1) model. Overall, the

ρ

-BVGINAR(1) model achieves the lowest AIC (991.27) and BIC (1008.53) values, indicating superior data fitting.
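For reference, the two information criteria can be computed from a maximized log-likelihood as in the sketch below. The log-likelihood value −526.14 and the series length 348 are taken from the text; the number of free parameters k = 5 is an assumption made for illustration, since the parameter count entering the criteria is not stated in this section.

```python
import math


def aic(loglik: float, k: int) -> float:
    """Akaike information criterion: -2 log L + 2 k."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion: -2 log L + k log n."""
    return -2.0 * loglik + k * math.log(n)


loglik = -526.14   # reported log-likelihood of the BVMIXINAR(1) fit
n_obs = 348        # number of monthly observations
k_assumed = 5      # assumed number of free parameters (not stated in the text)

print(round(aic(loglik, k_assumed), 2), round(bic(loglik, k_assumed, n_obs), 2))
```

Under this assumed k, both values agree with the reported AIC (1062.29) and BIC (1081.55) up to rounding of the log-likelihood. Lower values of either criterion indicate a better trade-off between fit and model complexity.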

Table 5. Fitting results of the monthly OCND and OLNG counts across different models.

5.2. Crime Data: Theft Counts in Narrandera

As suggested by a referee, to demonstrate the flexibility and applicability of our model, we selected a real-world example with higher levels of overdispersion for analysis. We focused on the theft counts in Narrandera, encompassing five subcategories: “break and enter dwelling”, “break and enter non-dwelling”, “receiving or handling stolen goods”, “motor vehicle theft”, and “steal from motor vehicle”. Notably, “break and enter dwelling” and “break and enter non-dwelling” exhibit a correlation due to their similar modus operandi and motivations, likely originating from the actions of the same group of offenders. Therefore, we chose to examine the counts of “break and enter thefts into dwelling” (BETD) and “break and enter theft into non-dwelling” (BETND) for further investigation.

Figure 19 and Figure 20 display the time series, ACF, and CCF plots, revealing significant autocorrelation within each series and cross-correlation between the BETD and BETND series. Table 6 reports dispersion index

I_{d}

values of

4.6408

and

3.9099

, markedly exceeding 1, indicating a higher degree of overdispersion compared to the

I_{d}

values of

1.6098

and

1.3487

observed in the OCND and OLNG counts. Moreover, the

z_{0}

values for both series indicate notable zero inflation, with values of

0.7587

and

0.7681

, respectively. Their histograms are depicted in Figure 21.

Figure 19. Sample paths of BETD and BETND series.

Figure 20. The autocorrelation function (ACF) and cross-correlation (CCF) plots of BETD and BETND series.

Table 6. Descriptive statistics of BETD and BETND series.

Figure 21. Histograms of BETD and BETND counts.

Table 7 presents the fitted results. We observe that the BVMIXINAR(1) model exhibits the highest AIC and BIC values, indicating that the model fails to capture the overdispersion and zero inflation characteristics of the dataset. The BIC value of the fitted BVPOINAR(1) model is also large, indicating that this model is unsuitable for this dataset. While the BVNGINAR(1) model yields satisfactory results, the

ρ

-BVGINAR(1) model outperforms it in terms of AIC and BIC values, indicating its greater suitability for fitting the BETD and BETND counts.

Table 7. Fitting results of the monthly BETD and BETND counts across different models.

In conclusion, the

ρ

-BVGINAR(1) model outperforms the other three models and more effectively captures the overdispersion and zero inflation features in count time series data. It demonstrates superiority in model fitting by striking a balance between flexibility and complexity.

6. Conclusions

This paper introduces a more flexible

ρ

-BVGINAR(1) model tailored to analyzing bivariate integer-valued time series data with overdispersion. It extends the

ρ

-GINAR(1) model [12] to the two-dimensional case. It also generalizes the BVGINAR(1) model and the BVNGINAR(1) model [18], offering enhanced capability for handling excess zeros and overdispersed data. Furthermore, the paper derives the innovation structure of the proposed model, discusses its essential properties, and describes the methodologies for YW and CML estimation. A comprehensive simulation study is conducted to evaluate the finite sample performances of the estimators and their asymptotic properties under various parameter combinations. Two real applications showcase the effectiveness of the proposed model relative to existing ones, demonstrating its utility in practical settings.

Moving forward, there are several promising avenues for future research in the field of bivariate INAR-type models. One promising direction involves exploring the application of the zero-modified geometric distribution as the marginal distribution. This distribution offers the capability to effectively model features such as zero inflation, zero deflation, overdispersion, and underdispersion, as discussed in detail in [24]. In addition, there is potential for further investigation into the modification of various marginal parameters and thinning parameters. Previous works, such as those by [7,19], have demonstrated the effectiveness of such modifications for analyzing bivariate time series data. These approaches hold promise for enhancing the flexibility and applicability of bivariate models and warrant thorough exploration in future research projects.

Author Contributions

Conceptualization and methodology, C.L.; validation and review, D.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Social Science Planning Foundation of Liaoning Province (No. L22ZD065) and the National Natural Science Foundation of China (Nos. 12271231, 12001229, 11901053).

Data Availability Statement

Publicly available data sets were analyzed in this study. These data can be found here: https://www.bocsar.nsw.gov.au/Pages/bocsar_datasets/Offence.aspx (accessed on 13 May 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Proof in Lemma 1

(i): $E ({A_{t}}_{ρ} \circ X_{t - 1}) = \tilde{A} E (X_{t - 1})$ , where $\tilde{A} = (1 + ρ) A = (1 + ρ) E (A_{t}) = (1 + ρ)$ $[\begin{matrix} α_{1} p_{1} & α_{1} (1 - p_{1}) \\ α_{2} p_{2} & α_{2} (1 - p_{2}) \end{matrix}] .$

Proof.

According to the definition of the model, we have

\begin{matrix} E ({A_{t}}_{ρ} \circ X_{t - 1}) = E [\begin{matrix} {U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1} \\ {V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1} \end{matrix}] . \end{matrix}

We decompose the computation of each element of the matrix.

\begin{matrix} E ({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1}) \\ = & E \{E [({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1}) | (X_{1, t - 1}, X_{2, t - 1})]\} \\ = & E \{E [\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) + \sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) | (X_{1, t - 1}, X_{2, t - 1})]\} \\ = & E \{E [\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) | X_{1, t - 1}] + E [\sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) | X_{2, t - 1}]\} \\ = & E \{X_{1, t - 1} \cdot E [W_{i} (U_{1, t}, ρ)] + X_{2, t - 1} \cdot E [W_{j} (U_{2, t}, ρ)]\} \\ = & E [W_{i} (U_{1, t}, ρ)] \cdot E (X_{1, t - 1}) + E [W_{j} (U_{2, t}, ρ)] \cdot E (X_{2, t - 1}) \\ = & (1 + ρ) \cdot E (U_{1, t}) \cdot E (X_{1, t - 1}) + (1 + ρ) \cdot E (U_{2, t}) \cdot E (X_{2, t - 1}) \\ = & (1 + ρ) [E (U_{1, t}) \cdot E (X_{1, t - 1}) + E (U_{2, t}) \cdot E (X_{2, t - 1})], \end{matrix}

where

\begin{matrix} E [W_{i} (U_{1, t}, ρ)] = & E \{E [W_{i} (U_{1, t}, ρ) | U_{1, t}]\} = E [U_{1, t} (1 + ρ)] = (1 + ρ) E (U_{1, t}) = (1 + ρ) α_{1} p_{1} , \\ E [W_{j} (U_{2, t}, ρ)] = & E \{E [W_{j} (U_{2, t}, ρ) | U_{2, t}]\} = E [U_{2, t} (1 + ρ)] = (1 + ρ) E (U_{2, t}) = (1 + ρ) α_{1} (1 - p_{1}) . \end{matrix}

By the same token,

\begin{matrix} E ({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1}) = (1 + ρ) [E (V_{1, t}) \cdot E (X_{1, t - 1}) + E (V_{2, t}) \cdot E (X_{2, t - 1})] . \end{matrix}

Therefore,

\begin{matrix} E ({A_{t}}_{ρ} \circ X_{t - 1}) = & [\begin{matrix} (1 + ρ) [E (U_{1, t}) \cdot E (X_{1, t - 1}) + E (U_{2, t}) \cdot E (X_{2, t - 1})] \\ (1 + ρ) [E (V_{1, t}) \cdot E (X_{1, t - 1}) + E (V_{2, t}) \cdot E (X_{2, t - 1})] \end{matrix}] \\ = & (1 + ρ) \cdot E [\begin{matrix} U_{1, t} & U_{2, t} \\ V_{1, t} & V_{2, t} \end{matrix}] \cdot E [\begin{matrix} X_{1, t - 1} \\ X_{2, t - 1} \end{matrix}] \\ = & (1 + ρ) \cdot E (A_{t}) \cdot E (X_{t - 1}) \\ = & \tilde{A} \cdot E (X_{t - 1}) . \end{matrix}

□
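Result (i) can be illustrated by a small Monte Carlo experiment with a fixed value of X_{t−1}. The sketch below rests on two assumptions that are labeled in the code: the counting variable W(α, ρ) is drawn as a zero-modified geometric variable, which reproduces the mean α(1 + ρ) used in the proof, and each row of A_t equals (α_k, 0) with probability p_k and (0, α_k) otherwise, which reproduces E(A_t). All function names and parameter values are hypothetical.

```python
import math
import random


def draw_w(alpha, rho, rng):
    """Draw one counting variable W(alpha, rho).

    Assumed form: W = 0 with probability 1 - alpha, otherwise W = 1 + G with
    G geometric on {0, 1, ...} with mean rho, so that E(W) = alpha * (1 + rho).
    """
    if rng.random() >= alpha:
        return 0
    if rho == 0.0:
        return 1
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return 1 + int(math.log(u) / math.log(rho / (1.0 + rho)))


def thin(alpha, rho, x, rng):
    """rho-thinning of the count x: a sum of x independent counting variables."""
    return sum(draw_w(alpha, rho, rng) for _ in range(x))


def simulate_mean(x, alphas, ps, rho, n_rep=20000, seed=1):
    """Monte Carlo estimate of E(A_t o X) for a fixed vector x (assumed structure of A_t)."""
    rng = random.Random(seed)
    totals = [0.0, 0.0]
    for _ in range(n_rep):
        for k in range(2):
            # Row k of A_t is (alpha_k, 0) with probability p_k and (0, alpha_k) otherwise
            source = x[0] if rng.random() < ps[k] else x[1]
            totals[k] += thin(alphas[k], rho, source, rng)
    return [t / n_rep for t in totals]


def lemma_mean(x, alphas, ps, rho):
    """Right-hand side of (i): (1 + rho) * E(A_t) * x."""
    return [(1 + rho) * a * (p * x[0] + (1 - p) * x[1]) for a, p in zip(alphas, ps)]


x, alphas, ps, rho = (3, 2), (0.4, 0.3), (0.7, 0.2), 0.5
print(simulate_mean(x, alphas, ps, rho), lemma_mean(x, alphas, ps, rho))
```

With these hypothetical values the simulated means should lie close to the theoretical vector, up to Monte Carlo error.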

(ii): $E [({A_{t}}_{ρ} \circ X_{t - 1}) B^{⊤}] = \tilde{A} E (X_{t - 1} B^{⊤})$ for a random vector $B$ independent of $A_{t}$ .

Proof.

From the definition of the model, it follows that

\begin{matrix} E [({A_{t}}_{ρ} \circ X_{t - 1}) B^{⊤}] \\ = & E [(\begin{matrix} {U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1} \\ {V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1} \end{matrix}) \cdot (B_{1}, B_{2})] \\ = & E [\begin{matrix} ({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1}) \cdot B_{1} & ({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1}) \cdot B_{2} \\ ({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1}) \cdot B_{1} & ({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1}) \cdot B_{2} \end{matrix}] . \end{matrix}

Decomposing the computation of each element of the matrix, we have

\begin{matrix} E [({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1}) \cdot B_{1}] \\ = & E \{E [({U_{1, t}}_{ρ} \circ X_{1, t - 1}) \cdot B_{1} + ({U_{2, t}}_{ρ} \circ X_{2, t - 1}) \cdot B_{1} | (X_{1, t - 1}, X_{2, t - 1})]\} \\ = & E \{E [\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot B_{1} | X_{1, t - 1}] + E [\sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) \cdot B_{1} | X_{2, t - 1}]\} \\ = & E \{X_{1, t - 1} \cdot E [W_{i} (U_{1, t}, ρ) \cdot B_{1} | X_{1, t - 1}] + X_{2, t - 1} \cdot E [W_{j} (U_{2, t}, ρ) \cdot B_{1} | X_{2, t - 1}]\} \\ = & E \{X_{1, t - 1} \cdot E [W_{i} (U_{1, t}, ρ)] \cdot E [B_{1} | X_{1, t - 1}] + X_{2, t - 1} \cdot E [W_{j} (U_{2, t}, ρ)] \cdot E [B_{1} | X_{2, t - 1}]\} \\ = & E \{E [W_{i} (U_{1, t}, ρ)] \cdot X_{1, t - 1} \cdot E [B_{1} | X_{1, t - 1}] + E [W_{j} (U_{2, t}, ρ)] \cdot X_{2, t - 1} \cdot E [B_{1} | X_{2, t - 1}]\} \\ = & (1 + ρ) \cdot E (U_{1, t}) \cdot E \{E [X_{1, t - 1} \cdot B_{1} | X_{1, t - 1}]\} + (1 + ρ) \cdot E (U_{2, t}) \cdot E \{E [X_{2, t - 1} \cdot B_{1} | X_{2, t - 1}]\} \\ = & (1 + ρ) [E (U_{1, t}) E (X_{1, t - 1} B_{1}) + E (U_{2, t}) E (X_{2, t - 1} B_{1})] . \end{matrix}

Similarly,

\begin{matrix} E [({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1}) \cdot B_{2}] = (1 + ρ) [E (U_{1, t}) E (X_{1, t - 1} B_{2}) + E (U_{2, t}) E (X_{2, t - 1} B_{2})] . \\ E [({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1}) \cdot B_{1}] = (1 + ρ) [E (V_{1, t}) E (X_{1, t - 1} B_{1}) + E (V_{2, t}) E (X_{2, t - 1} B_{1})] . \\ E [({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1}) \cdot B_{2}] = (1 + ρ) [E (V_{1, t}) E (X_{1, t - 1} B_{2}) + E (V_{2, t}) E (X_{2, t - 1} B_{2})] . \end{matrix}

Therefore,

\begin{matrix} E [({A_{t}}_{ρ} \circ X_{t - 1}) B^{⊤}] \\ = & (1 + ρ) [\begin{matrix} E (U_{1, t}) & E (U_{2, t}) \\ E (V_{1, t}) & E (V_{2, t}) \end{matrix}] \cdot [\begin{matrix} E (X_{1, t - 1} B_{1}) & E (X_{1, t - 1} B_{2}) \\ E (X_{2, t - 1} B_{1}) & E (X_{2, t - 1} B_{2}) \end{matrix}] = \tilde{A} E (X_{t - 1} B^{⊤}) . \end{matrix}

□

(iii): $E [B {({A_{t}}_{ρ} \circ X_{t - 1})}^{⊤}] = E (B X_{t - 1}^{⊤}) {\tilde{A}}^{⊤}$ for a random vector $B$ independent of $A_{t}$ .

Proof.

The proof is similar to that of (ii), so we give only the key expressions.

\begin{matrix} E [B_{1} \cdot ({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1})] \\ = & E {E [B_{1} \cdot ({U_{1, t}}_{ρ} \circ X_{1, t - 1}) | X_{1, t - 1}] + E [B_{1} \cdot ({U_{2, t}}_{ρ} \circ X_{2, t - 1}) | X_{2, t - 1}]} \\ = & E \{E [B_{1} \cdot \sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) | X_{1, t - 1}] + E [B_{1} \cdot \sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) | X_{2, t - 1}]\} \\ = & E {X_{1, t - 1} \cdot E [B_{1} \cdot W_{i} (U_{1, t}, ρ) | X_{1, t - 1}] + X_{2, t - 1} \cdot E [B_{1} \cdot W_{j} (U_{2, t}, ρ) | X_{2, t - 1}]} \\ = & E {X_{1, t - 1} \cdot E [W_{i} (U_{1, t}, ρ)] \cdot E [B_{1} | X_{1, t - 1}] + X_{2, t - 1} \cdot E [W_{j} (U_{2, t}, ρ)] \cdot E [B_{1} | X_{2, t - 1}]} \\ = & E [W_{i} (U_{1, t}, ρ)] \cdot E {X_{1, t - 1} \cdot E [B_{1} | X_{1, t - 1}]} + E [W_{j} (U_{2, t}, ρ)] \cdot E {X_{2, t - 1} \cdot E [B_{1} | X_{2, t - 1}]} \\ = & E [(1 + ρ) \cdot U_{1, t}] \cdot E {E [X_{1, t - 1} \cdot B_{1} | X_{1, t - 1}]} + E [(1 + ρ) \cdot U_{2, t}] \cdot E {E [X_{2, t - 1} \cdot B_{1} | X_{2, t - 1}]} \\ = & (1 + ρ) \cdot E (U_{1, t}) \cdot E (X_{1, t - 1} \cdot B_{1}) + (1 + ρ) \cdot E (U_{2, t}) \cdot E (X_{2, t - 1} \cdot B_{1}) \\ = & (1 + ρ) [E (U_{1, t}) \cdot E (X_{1, t - 1} \cdot B_{1}) + E (U_{2, t}) \cdot E (X_{2, t - 1} \cdot B_{1})], \end{matrix}

and

\begin{matrix} E [B_{1} \cdot ({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1})] = (1 + ρ) [E (V_{1, t}) E (X_{1, t - 1} B_{1}) + E (V_{2, t}) E (X_{2, t - 1} B_{1})] . \\ E [B_{2} \cdot ({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1})] = (1 + ρ) [E (U_{1, t}) E (X_{1, t - 1} B_{2}) + E (U_{2, t}) E (X_{2, t - 1} B_{2})] . \\ E [B_{2} \cdot ({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1})] = (1 + ρ) [E (V_{1, t}) E (X_{1, t - 1} B_{2}) + E (V_{2, t}) E (X_{2, t - 1} B_{2})] . \end{matrix}

Therefore,

\begin{matrix} E [B {({A_{t}}_{ρ} \circ X_{t - 1})}^{⊤}] \\ = & E [(\begin{matrix} B_{1} \\ B_{2} \end{matrix}) \cdot (X_{1, t - 1}, X_{2, t - 1})] \cdot E [\begin{matrix} (1 + ρ) U_{1, t} & (1 + ρ) V_{1, t} \\ (1 + ρ) U_{2, t} & (1 + ρ) V_{2, t} \end{matrix}] = E (B X_{t - 1}^{⊤}) {\tilde{A}}^{⊤} . \end{matrix}

□

(iv): $E [({A_{t}}_{ρ} \circ X_{t - 1}) {({A_{t}}_{ρ} \circ X_{t - 1})}^{⊤}] = \tilde{A} E (X_{t - 1} X_{t - 1}^{⊤}) {\tilde{A}}^{⊤} + C,$ where $c_{12} = 0, c_{21} = 0$ ,

$\begin{matrix} c_{11} = & {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E [{(X_{1, t - 1} - X_{2, t - 1})}^{2}] + α_{1} (1 + ρ) (1 + 2 ρ - (1 + ρ) α_{1}) [p_{1} E (X_{1, t - 1}) \\ + (1 - p_{1}) E (X_{2, t - 1})], \\ c_{22} = & {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) E [{(X_{1, t - 1} - X_{2, t - 1})}^{2}] + α_{2} (1 + ρ) (1 + 2 ρ - (1 + ρ) α_{2}) [p_{2} E (X_{1, t - 1}) \\ + (1 - p_{2}) E (X_{2, t - 1})] . \end{matrix}$

Proof.

Based on the model definition, we expand the expectations.

\begin{matrix} E [({A_{t}}_{ρ} \circ X_{t - 1}) {({A_{t}}_{ρ} \circ X_{t - 1})}^{⊤}] \\ = & E [(\begin{matrix} {U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1} \\ {V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1} \end{matrix}) \cdot {(\begin{matrix} {U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1} \\ {V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1} \end{matrix})}^{⊤}] \equiv & (\begin{matrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{matrix}) . \end{matrix}

For representational convenience, we denote the corresponding elements of the aforementioned matrix as

e_{i, j}, i, j = 1, 2

. We then proceed to compute these elements individually.

\begin{matrix} e_{11} = & E [({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1}) ({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1})] \\ = & E [{({U_{1, t}}_{ρ} \circ X_{1, t - 1})}^{2} + 2 ({U_{1, t}}_{ρ} \circ X_{1, t - 1}) ({U_{2, t}}_{ρ} \circ X_{2, t - 1}) + {({U_{2, t}}_{ρ} \circ X_{2, t - 1})}^{2}] \\ = & E {{[\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ)]}^{2} + 2 [\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ)] [\sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ)] \\ + {[\sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ)]}^{2}} \\ = & E \{\sum_{i = 1}^{X_{1, t - 1}} W_{i}^{2} (U_{1, t}, ρ) + \sum_{i \neq m}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot W_{m} (U_{1, t}, ρ)\} + E {2 \sum_{i = 1}^{X_{1, t - 1}} \sum_{j = 1}^{X_{2, t - 1}} W_{i} (U_{1, t}, ρ) \\ \times W_{j} (U_{2, t}, ρ)} + E \{\sum_{j = 1}^{X_{2, t - 1}} W_{j}^{2} (U_{2, t}, ρ) + \sum_{j \neq n}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) \cdot W_{n} (U_{2, t}, ρ)\} \\ \equiv & s_{1} + s_{2} + s_{3} . \end{matrix}

\begin{matrix} s_{1} = & E \{\sum_{i = 1}^{X_{1, t - 1}} W_{i}^{2} (U_{1, t}, ρ) + \sum_{i \neq m}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot W_{m} (U_{1, t}, ρ)\} \\ = & E \{E [\sum_{i = 1}^{X_{1, t - 1}} W_{i}^{2} (U_{1, t}, ρ) + \sum_{i \neq m}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot W_{m} (U_{1, t}, ρ) | X_{1, t - 1}]\} \\ = & E \{X_{1, t - 1} E [W_{i}^{2} (U_{1, t}, ρ) | X_{1, t - 1}] + X_{1, t - 1} (X_{1, t - 1} - 1) E [W_{i} (U_{1, t}, ρ) \cdot W_{m} (U_{1, t}, ρ) | X_{1, t - 1}]\} \\ = & E \{X_{1, t - 1} \cdot E [W_{i}^{2} (U_{1, t}, ρ)] + X_{1, t - 1} (X_{1, t - 1} - 1) \cdot E [W_{i} (U_{1, t}, ρ) \cdot W_{m} (U_{1, t}, ρ)]\} \\ = & E \{X_{1, t - 1} \cdot (1 + ρ) (1 + 2 ρ) E (U_{1, t}) + X_{1, t - 1} (X_{1, t - 1} - 1) \cdot {(1 + ρ)}^{2} E (U_{1, t}^{2})\} \\ = & E (X_{1, t - 1}) \cdot (1 + ρ) (1 + 2 ρ) E (U_{1, t}) + E (X_{1, t - 1}^{2} - X_{1, t - 1}) \cdot {(1 + ρ)}^{2} E (U_{1, t}^{2}) \\ = & (1 + ρ) (1 + 2 ρ) α_{1} p_{1} \cdot E (X_{1, t - 1}) + {(1 + ρ)}^{2} α_{1}^{2} p_{1} \cdot E (X_{1, t - 1}^{2}) - {(1 + ρ)}^{2} α_{1}^{2} p_{1} \cdot E (X_{1, t - 1}) \\ = & ((1 + ρ) (1 + 2 ρ) α_{1} p_{1} - {(1 + ρ)}^{2} α_{1}^{2} p_{1}) E (X_{1, t - 1}) + {(1 + ρ)}^{2} α_{1}^{2} p_{1} \cdot E (X_{1, t - 1}^{2}), \end{matrix}

where

\begin{matrix} E [W_{i}^{2} (U_{1, t}, ρ)] = & E [E [W_{i}^{2} (U_{1, t}, ρ) | U_{1, t}]] \\ = & E [V a r (W_{i} (U_{1, t}, ρ) | U_{1, t}) + {(E (W_{i} (U_{1, t}, ρ) | U_{1, t}))}^{2}] \\ = & E [U_{1, t} (1 + ρ) [ρ + (1 + ρ) (1 - U_{1, t})] + {(U_{1, t} (1 + ρ))}^{2}] \\ = & E [U_{1, t} (1 + ρ) (1 + 2 ρ)] \\ = & (1 + ρ) (1 + 2 ρ) E (U_{1, t}) \\ = & (1 + ρ) (1 + 2 ρ) α_{1} p_{1} . \end{matrix}

\begin{matrix} E [W_{i} (U_{1, t}, ρ) W_{j} (U_{1, t}, ρ)] = & E [E [W_{i} (U_{1, t}, ρ) W_{j} (U_{1, t}, ρ) | U_{1, t}]] \\ = & E [E [W_{i} (U_{1, t}, ρ) | U_{1, t}] E [W_{j} (U_{1, t}, ρ) | U_{1, t}]] \\ = & E [U_{1, t} (1 + ρ) \cdot U_{1, t} (1 + ρ)] \\ = & {(1 + ρ)}^{2} E (U_{1, t}^{2}) = {(1 + ρ)}^{2} α_{1}^{2} p_{1} . \end{matrix}

\begin{matrix} E [W_{i} (U_{1, t}, ρ) W_{j} (U_{2, t}, ρ)] = & E [E [W_{i} (U_{1, t}, ρ) W_{j} (U_{2, t}, ρ) | (U_{1, t}, U_{2, t})]] \\ = & E [E [W_{i} (U_{1, t}, ρ) | U_{1, t}] E [W_{j} (U_{2, t}, ρ) | U_{2, t}]] \\ = & E [U_{1, t} (1 + ρ) \cdot U_{2, t} (1 + ρ)] \\ = & {(1 + ρ)}^{2} E (U_{1, t} U_{2, t}) = 0 . \end{matrix}

\begin{matrix} E [W_{i} (U_{1, t}, ρ) W_{n} (V_{1, t}, ρ)] = & E [E [W_{i} (U_{1, t}, ρ) W_{n} (V_{1, t}, ρ) | (U_{1, t}, V_{1, t})]] \\ = & E [E [W_{i} (U_{1, t}, ρ) | U_{1, t}] E [W_{n} (V_{1, t}, ρ) | V_{1, t}]] \\ = & E [U_{1, t} (1 + ρ) \cdot V_{1, t} (1 + ρ)] \\ = & {(1 + ρ)}^{2} E (U_{1, t}) E (V_{1, t}) = {(1 + ρ)}^{2} α_{1} p_{1} α_{2} p_{2} . \end{matrix}

\begin{matrix} s_{2} = & E \{2 \sum_{i = 1}^{X_{1, t - 1}} \sum_{j = 1}^{X_{2, t - 1}} W_{i} (U_{1, t}, ρ) W_{j} (U_{2, t}, ρ)\} \\ = & E \{E [2 \sum_{i = 1}^{X_{1, t - 1}} \sum_{j = 1}^{X_{2, t - 1}} W_{i} (U_{1, t}, ρ) W_{j} (U_{2, t}, ρ) | (X_{1, t - 1}, X_{2, t - 1})]\} \\ = & E {2 X_{1, t - 1} X_{2, t - 1} \cdot E [W_{i} (U_{1, t}, ρ) W_{j} (U_{2, t}, ρ) | (X_{1, t - 1}, X_{2, t - 1})]} \\ = & 2 E (X_{1, t - 1} X_{2, t - 1}) \cdot E [W_{i} (U_{1, t}, ρ) W_{j} (U_{2, t}, ρ)] \\ = & 2 E (X_{1, t - 1} X_{2, t - 1}) \cdot {(1 + ρ)}^{2} E (U_{1, t} U_{2, t}) = 0 . \\ s_{3} = & E \{\sum_{j = 1}^{X_{2, t - 1}} W_{j}^{2} (U_{2, t}, ρ) + \sum_{j \neq n}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) \cdot W_{n} (U_{2, t}, ρ)\} \\ = & E {X_{2, t - 1} \cdot (1 + ρ) (1 + 2 ρ) E (U_{2, t}) + X_{2, t - 1} (X_{2, t - 1} - 1) \cdot {(1 + ρ)}^{2} E (U_{2, t}^{2})} \\ = & ((1 + ρ) (1 + 2 ρ) α_{1} (1 - p_{1}) - {(1 + ρ)}^{2} α_{1}^{2} (1 - p_{1})) E (X_{2, t - 1}) + {(1 + ρ)}^{2} α_{1}^{2} (1 - p_{1}) E (X_{2, t - 1}^{2}) . \end{matrix}

\begin{matrix} e_{11} = & s_{1} + s_{2} + s_{3} \\ = & [(1 + ρ) (1 + 2 ρ) α_{1} p_{1} - {(1 + ρ)}^{2} α_{1}^{2} p_{1}] E (X_{1, t - 1}) + {(1 + ρ)}^{2} α_{1}^{2} p_{1} E (X_{1, t - 1}^{2}) \\ + [(1 + ρ) (1 + 2 ρ) α_{1} (1 - p_{1}) - {(1 + ρ)}^{2} α_{1}^{2} (1 - p_{1})] E (X_{2, t - 1}) + {(1 + ρ)}^{2} α_{1}^{2} (1 - p_{1}) E (X_{2, t - 1}^{2}) \\ = & (1 + ρ) α_{1} [(1 + 2 ρ) - (1 + ρ) α_{1}] [p_{1} E (X_{1, t - 1}) + (1 - p_{1}) E (X_{2, t - 1})] \\ + {(1 + ρ)}^{2} α_{1}^{2} [p_{1} E (X_{1, t - 1}^{2}) + (1 - p_{1}) E (X_{2, t - 1}^{2})] \\ = & (1 + ρ) α_{1} [(1 + 2 ρ) - (1 + ρ) α_{1}] [p_{1} E (X_{1, t - 1}) + (1 - p_{1}) E (X_{2, t - 1})] \\ + {(1 + ρ)}^{2} α_{1}^{2} \{p_{1} [p_{1} E (X_{1, t - 1}^{2}) + (1 - p_{1}) E (X_{2, t - 1}^{2})] + (1 - p_{1}) [p_{1} E (X_{1, t - 1}^{2}) + (1 - p_{1}) E (X_{2, t - 1}^{2})]\} \\ = & (1 + ρ) α_{1} [(1 + 2 ρ) - (1 + ρ) α_{1}] [p_{1} E (X_{1, t - 1}) + (1 - p_{1}) E (X_{2, t - 1})] \\ + {(1 + ρ)}^{2} α_{1}^{2} [p_{1}^{2} E (X_{1, t - 1}^{2}) + {(1 - p_{1})}^{2} E (X_{2, t - 1}^{2})] + {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) [E (X_{1, t - 1}^{2}) + E (X_{2, t - 1}^{2})] \\ = & (1 + ρ) α_{1} [(1 + 2 ρ) - (1 + ρ) α_{1}] [p_{1} E (X_{1, t - 1}) + (1 - p_{1}) E (X_{2, t - 1})] \\ + {(1 + ρ)}^{2} α_{1}^{2} [p_{1}^{2} E (X_{1, t - 1}^{2}) + {(1 - p_{1})}^{2} E (X_{2, t - 1}^{2})] + {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) \{E [{(X_{1, t - 1} - X_{2, t - 1})}^{2}] + 2 E (X_{1, t - 1} X_{2, t - 1})\} \\ = & {(1 + ρ)}^{2} [α_{1}^{2} p_{1}^{2} E (X_{1, t - 1}^{2}) + α_{1}^{2} {(1 - p_{1})}^{2} E (X_{2, t - 1}^{2})] + 2 {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E (X_{1, t - 1} X_{2, t - 1}) \\ + (1 + ρ) α_{1} [(1 + 2 ρ) - (1 + ρ) α_{1}] [p_{1} E (X_{1, t - 1}) + (1 - p_{1}) E (X_{2, t - 1})] \\ + {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E [{(X_{1, t - 1} - X_{2, t - 1})}^{2}] . \end{matrix}

\begin{matrix} e_{12} = & E [({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1}) ({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1})] \\ = & E [\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot \sum_{m = 1}^{X_{1, t - 1}} W_{m} (V_{1, t}, ρ) + \sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot \sum_{n = 1}^{X_{2, t - 1}} W_{n} (V_{2, t}, ρ) \\ + \sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) \cdot \sum_{m = 1}^{X_{1, t - 1}} W_{m} (V_{1, t}, ρ) + \sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) \cdot \sum_{n = 1}^{X_{2, t - 1}} W_{n} (V_{2, t}, ρ)] \\ \equiv & {\tilde{s}}_{1} + {\tilde{s}}_{2} + {\tilde{s}}_{3} + {\tilde{s}}_{4} . \end{matrix}

\begin{matrix} {\tilde{s}}_{1} = & E [\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot \sum_{m = 1}^{X_{1, t - 1}} W_{m} (V_{1, t}, ρ)] \\ = & E \{E [\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot \sum_{m = 1}^{X_{1, t - 1}} W_{m} (V_{1, t}, ρ)] | (X_{1, t - 1}, X_{2, t - 1})\} \\ = & E {X_{1, t - 1}^{2} E [W_{i} (U_{1, t}, ρ) \cdot W_{m} (V_{1, t}, ρ)] | (X_{1, t - 1}, X_{2, t - 1})} \\ = & E {X_{1, t - 1}^{2} E [W_{i} (U_{1, t}, ρ) \cdot W_{m} (V_{1, t}, ρ)]} \\ = & E (X_{1, t - 1}^{2}) \cdot {(1 + ρ)}^{2} E (U_{1, t}) E (V_{1, t}) = {(1 + ρ)}^{2} α_{1} p_{1} α_{2} p_{2} E (X_{1, t - 1}^{2}) . \end{matrix}

\begin{matrix} {\tilde{s}}_{2} = & E [\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot \sum_{n = 1}^{X_{2, t - 1}} W_{n} (V_{2, t}, ρ)] \\ = & {(1 + ρ)}^{2} \cdot E (U_{1, t}) E (V_{2, t}) \cdot E (X_{1, t - 1} X_{2, t - 1}) = {(1 + ρ)}^{2} α_{1} p_{1} α_{2} (1 - p_{2}) E (X_{1, t - 1} X_{2, t - 1}) . \end{matrix}

\begin{matrix} {\tilde{s}}_{3} = & E [\sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) \cdot \sum_{m = 1}^{X_{1, t - 1}} W_{m} (V_{1, t}, ρ)] \\ = & {(1 + ρ)}^{2} \cdot E (U_{2, t}) E (V_{1, t}) \cdot E (X_{1, t - 1} X_{2, t - 1}) = {(1 + ρ)}^{2} α_{1} (1 - p_{1}) α_{2} p_{2} E (X_{1, t - 1} X_{2, t - 1}) . \end{matrix}

\begin{matrix} {\tilde{s}}_{4} = & E [\sum_{j = 1}^{X_{2, t - 1}} W_{j} (U_{2, t}, ρ) \cdot \sum_{n = 1}^{X_{2, t - 1}} W_{n} (V_{2, t}, ρ)] \\ = & {(1 + ρ)}^{2} \cdot E (U_{2, t}) E (V_{2, t}) \cdot E (X_{2, t - 1}^{2}) = {(1 + ρ)}^{2} α_{1} (1 - p_{1}) α_{2} (1 - p_{2}) E (X_{2, t - 1}^{2}) . \end{matrix}

\begin{matrix} e_{12} = & {\tilde{s}}_{1} + {\tilde{s}}_{2} + {\tilde{s}}_{3} + {\tilde{s}}_{4} \\ = & {(1 + ρ)}^{2} \cdot α_{1} p_{1} α_{2} p_{2} \cdot E (X_{1, t - 1}^{2}) + {(1 + ρ)}^{2} \cdot α_{1} p_{1} α_{2} (1 - p_{2}) \cdot E (X_{1, t - 1} X_{2, t - 1}) \\ + {(1 + ρ)}^{2} α_{1} (1 - p_{1}) α_{2} p_{2} E (X_{1, t - 1} X_{2, t - 1}) + {(1 + ρ)}^{2} α_{1} (1 - p_{1}) α_{2} (1 - p_{2}) E (X_{2, t - 1}^{2}) . \\ e_{21} = & E [({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1}) ({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1})] = e_{12} . \end{matrix}

Similarly,

\begin{matrix} e_{22} = & {(1 + ρ)}^{2} [α_{2}^{2} p_{2}^{2} E (X_{1, t - 1}^{2}) + α_{2}^{2} {(1 - p_{2})}^{2} E (X_{2, t - 1}^{2})] + 2 {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) E (X_{1, t - 1} X_{2, t - 1}) \\ + (1 + ρ) α_{2} [(1 + 2 ρ) - (1 + ρ) α_{2}] [p_{2} E (X_{1, t - 1}) + (1 - p_{2}) E (X_{2, t - 1})] + {(1 + ρ)}^{2} α_{2}^{2} p_{2} \\ \times (1 - p_{2}) E {(X_{1, t - 1} - X_{2, t - 1})}^{2} . \end{matrix}

Therefore,

\begin{matrix} E & [({A_{t}}_{ρ} \circ X_{t - 1}) {({A_{t}}_{ρ} \circ X_{t - 1})}^{⊤}] = (\begin{matrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{matrix}) \\ = & {(1 + ρ)}^{2} [\begin{matrix} α_{1} p_{1} & α_{1} (1 - p_{1}) \\ α_{2} p_{2} & α_{2} (1 - p_{2}) \end{matrix}] \cdot E [\begin{matrix} X_{1, t - 1}^{2} & X_{1, t - 1} X_{2, t - 1} \\ X_{1, t - 1} X_{2, t - 1} & X_{2, t - 1}^{2} \end{matrix}] \cdot [\begin{matrix} α_{1} p_{1} & α_{2} p_{2} \\ α_{1} (1 - p_{1}) & α_{2} (1 - p_{2}) \end{matrix}] \\ + [\begin{matrix} c_{11} & 0 \\ 0 & c_{22} \end{matrix}] \end{matrix}

where

c_{12} = 0, c_{21} = 0

,

\begin{matrix} c_{11} = & {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E {(X_{1, t - 1} - X_{2, t - 1})}^{2} + α_{1} (1 + ρ) (1 + 2 ρ - (1 + ρ) α_{1}) [p_{1} E (X_{1, t - 1}) \\ + (1 - p_{1}) E (X_{2, t - 1})] \\ c_{22} = & {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) E {(X_{1, t - 1} - X_{2, t - 1})}^{2} + α_{2} (1 + ρ) (1 + 2 ρ - (1 + ρ) α_{2}) [p_{2} E (X_{1, t - 1}) \\ + (1 - p_{2}) E (X_{2, t - 1})] . \end{matrix}

□

Appendix B. Proof of Proposition 1

Proof.

We begin by introducing a sequence of bivariate random series

{X_{t}^{(m)} = {(X_{1, t}^{(m)}, X_{2, t}^{(m)})}^{⊤}}_{m \in Z}

, defined as follows:

\begin{matrix} X_{t}^{(m)} = \{\begin{matrix} 0, & m < 0, \\ Z_{t}, & m = 0, \\ {A_{t}}_{ρ} \circ X_{t - 1}^{(m - 1)} + Z_{t}, & m > 0, \end{matrix} \end{matrix}

(A1)

where, for any m,

Z_{t}

is independent of

{A_{t}}_{ρ} \circ X_{t - 1}^{(m - 1)}

and

X_{s}^{(m)}, s < t

.

For any given m, we define a random vector

X = {(X_{1}, X_{2})}^{⊤}

and the space

L^{2} (Ω, F, P)

as the set of all random vectors

X

satisfying

E (X X^{⊤}) < \infty

, where the measure between two random vectors

X

and

Y

is denoted as

d (X, Y) = E (X Y^{⊤})

. It is easy to verify that

L^{2} (Ω, F, P)

is a Hilbert space. Our goal is to establish the convergence of

X_{t}^{(m)} \to X_{t}

as

m \to \infty .

To achieve this, we aim to demonstrate that

{X_{t}^{(m)}}

forms a Cauchy sequence in the Hilbert space

L^{2} (Ω, F, P)

. Now, let us proceed to the proof.

Existence.

We will prove the existence of the model through three key points.

1.: $X_{t}^{(m)}$ is non-decreasing in m for all t.
To substantiate this claim, we must demonstrate that for all $t \in Z$ and $m \geq 1$ , $X_{t}^{(m)} \geq X_{t}^{(m - 1)}$ . For $m = 1$ , we have that

$X_{t}^{(1)} = {A_{t}}_{ρ} \circ X_{t - 1}^{(0)} + Z_{t} = {A_{t}}_{ρ} \circ Z_{t - 1} + Z_{t} \geq Z_{t} = X_{t}^{(0)} .$

Now suppose that $X_{t}^{(k)} \geq X_{t}^{(k - 1)}$ for $k \leq m$ and all $t \in Z$ ; we will demonstrate that $X_{t}^{(m + 1)} - X_{t}^{(m)} \geq 0$ . We consider the components $X_{1, t}^{(m + 1)} - X_{1, t}^{(m)}$ and $X_{2, t}^{(m + 1)} - X_{2, t}^{(m)}$ :

$\begin{matrix} X_{1, t}^{(m + 1)} - X_{1, t}^{(m)} = & {U_{1, t}}_{ρ} \circ X_{1, t - 1}^{(m)} - {U_{1, t}}_{ρ} \circ X_{1, t - 1}^{(m - 1)} + {U_{2, t}}_{ρ} \circ X_{2, t - 1}^{(m)} - {U_{2, t}}_{ρ} \circ X_{2, t - 1}^{(m - 1)} \\ = & \sum_{i = 1}^{X_{1, t - 1}^{(m)} - X_{1, t - 1}^{(m - 1)}} W_{i} (U_{1, t}, ρ) + \sum_{j = 1}^{X_{2, t - 1}^{(m)} - X_{2, t - 1}^{(m - 1)}} W_{j} (U_{2, t}, ρ) \\ = & {U_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m)} - X_{1, t - 1}^{(m - 1)}) + {U_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m)} - X_{2, t - 1}^{(m - 1)}) \geq 0 . \end{matrix}$

Similarly,

$\begin{matrix} X_{2, t}^{(m + 1)} - X_{2, t}^{(m)} = & {V_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m)} - X_{1, t - 1}^{(m - 1)}) + {V_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m)} - X_{2, t - 1}^{(m - 1)}) \geq 0 . \end{matrix}$

Therefore, by mathematical induction, we establish that ${X_{t}^{(m)}}_{m \in Z}$ is non-decreasing for all t.
2.: $X_{t}^{(m)} \in L^{2} (Ω, F, P)$ for $m > 0$ .
To demonstrate this, let us denote $μ_{Z} = E (Z_{t})$ . From Lemma 1, it follows that:

$\begin{matrix} E (X_{t}^{(m)}) = & \tilde{A} E (X_{t - 1}^{(m - 1)}) + μ_{Z} \\ = & \tilde{A} [\tilde{A} E (X_{t - 2}^{(m - 2)}) + μ_{Z}] + μ_{Z} \\ = & {\tilde{A}}^{2} E (X_{t - 2}^{(m - 2)}) + \tilde{A} μ_{Z} + μ_{Z} \\ = & {\tilde{A}}^{m} E (X_{t - m}^{(0)}) + {\tilde{A}}^{m - 1} μ_{Z} + \dots + \tilde{A} μ_{Z} + μ_{Z} \\ = & ({\tilde{A}}^{m} + {\tilde{A}}^{m - 1} + \dots + {\tilde{A}}^{0}) \cdot μ_{Z} \\ = & (\sum_{j = 0}^{m} {\tilde{A}}^{j}) \cdot μ_{Z} . \end{matrix}$

Assuming $0 < (1 + ρ) α_{1}, (1 + ρ) α_{2} < 1$ , we find:

$\begin{matrix} \det (I - \tilde{A}) = & 1 - (1 + ρ) α_{1} p_{1} [1 - (1 + ρ) α_{2}] - (1 + ρ) α_{2} [1 - (1 - (1 + ρ) α_{1}) p_{2}] \\ > 1 - [1 - (1 + ρ) α_{2}] - (1 + ρ) α_{2} = 0, \end{matrix}$

where $I$ represents the $2 \times 2$ identity matrix. Hence, the matrix $I - \tilde{A}$ is invertible. Moreover, the entries of $\tilde{A}$ are non-negative and its i-th row sums to $(1 + ρ) α_{i} < 1$ , so the spectral radius of $\tilde{A}$ is below 1 and $lim_{m \to \infty} \sum_{j = 0}^{m} {\tilde{A}}^{j} = {(I - \tilde{A})}^{- 1}$ . Consequently, $E (X_{t}^{(m)})$ is finite, does not depend on t, and tends to ${(I - \tilde{A})}^{- 1} μ_{Z}$ as $m \to \infty$ .
Let us consider $E [X_{t}^{(m)} {(X_{t}^{(m)})}^{⊤}]$ . Leveraging the findings from Lemma 1, we derive:

$\begin{matrix} E [X_{t}^{(m)} {(X_{t}^{(m)})}^{⊤}] = & E [({A_{t}}_{ρ} \circ X_{t - 1}^{(m - 1)} + Z_{t}) {({A_{t}}_{ρ} \circ X_{t - 1}^{(m - 1)} + Z_{t})}^{⊤}] \\ = & E [({A_{t}}_{ρ} \circ X_{t - 1}^{(m - 1)}) {({A_{t}}_{ρ} \circ X_{t - 1}^{(m - 1)})}^{⊤}] + E [({A_{t}}_{ρ} \circ X_{t - 1}^{(m - 1)}) Z_{t}^{⊤}] \\ + E [Z_{t} {({A_{t}}_{ρ} \circ X_{t - 1}^{(m - 1)})}^{⊤}] + E [Z_{t} Z_{t}^{⊤}] \\ = & \tilde{A} E (X_{t - 1}^{(m - 1)} {(X_{t - 1}^{(m - 1)})}^{⊤}) {\tilde{A}}^{⊤} + C + \tilde{A} E (X_{t - 1}^{(m - 1)}) μ_{Z}^{⊤} \\ + μ_{Z} E {(X_{t - 1}^{(m - 1)})}^{⊤} {\tilde{A}}^{⊤} + E (Z_{t} Z_{t}^{⊤}), \end{matrix}$

where $C = diag (c_{11}, c_{22})$ , with

$\begin{matrix} [\begin{matrix} c_{11} \\ c_{22} \end{matrix}] = & [\begin{matrix} {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E (X_{1, t - 1}^{2} + X_{2, t - 1}^{2} - 2 X_{1, t - 1} X_{2, t - 1}) \\ {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) E (X_{1, t - 1}^{2} + X_{2, t - 1}^{2} - 2 X_{1, t - 1} X_{2, t - 1}) \end{matrix}] \\ + [\begin{matrix} (1 + ρ) (2 ρ + 1 - (1 + ρ) α_{1}) α_{1} [p_{1} E (X_{1, t - 1}) + (1 - p_{1}) E (X_{2, t - 1})] \\ (1 + ρ) (2 ρ + 1 - (1 + ρ) α_{2}) α_{2} [p_{2} E (X_{1, t - 1}) + (1 - p_{2}) E (X_{2, t - 1})] \end{matrix}] \\ = & [\begin{matrix} {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) & {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) \\ {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) & {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) \end{matrix}] \cdot E (\binom{X_{1, t - 1}^{2}}{X_{2, t - 1}^{2}}) \\ + [\begin{matrix} (1 + ρ) (2 ρ + 1 - (1 + ρ) α_{1}) α_{1} p_{1} & (1 + ρ) (2 ρ + 1 - (1 + ρ) α_{1}) α_{1} (1 - p_{1}) \\ (1 + ρ) (2 ρ + 1 - (1 + ρ) α_{2}) α_{2} p_{2} & (1 + ρ) (2 ρ + 1 - (1 + ρ) α_{2}) α_{2} (1 - p_{2}) \end{matrix}] \\ \cdot E (\binom{X_{1, t - 1}}{X_{2, t - 1}}) - [\begin{matrix} 2 {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) \\ 2 {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) \end{matrix}] \cdot E (\binom{X_{1, t - 1} X_{2, t - 1}}{X_{1, t - 1} X_{2, t - 1}}) \\ \leq & [\begin{matrix} (1 + ρ) α_{1} p_{1} \cdot 1 & (1 + ρ) α_{1} (1 - p_{1}) \cdot 1 \\ (1 + ρ) α_{2} p_{2} \cdot 1 & (1 + ρ) α_{2} (1 - p_{2}) \cdot 1 \end{matrix}] \cdot E [{(X_{t - 1}^{(m)})}^{2}] \\ + [\begin{matrix} (1 + ρ) (2 ρ + 1) α_{1} p_{1} & (1 + ρ) (2 ρ + 1) α_{1} (1 - p_{1}) \\ (1 + ρ) (2 ρ + 1) α_{2} p_{2} & (1 + ρ) (2 ρ + 1) α_{2} (1 - p_{2}) \end{matrix}] \cdot E [X_{t - 1}^{(m)}] \\ = & \tilde{A} \cdot E [{(X_{t - 1}^{(m)})}^{2}] + (2 ρ + 1) \tilde{A} \cdot E [X_{t - 1}^{(m)}] . \end{matrix}$

If $0 < (1 + ρ) α_{1}, (1 + ρ) α_{2} < 1$ , all entries of the matrix $\tilde{A}$ fall within the interval $(0, 1)$ . Through iterative recursion repeated m times, we establish its independence from t. Consequently, $E [X_{t}^{(m)} {(X_{t}^{(m)})}^{⊤}] < \infty$ . Hence, $X_{t}^{(m)} \in L^{2} (Ω, F, P)$ for $m > 0$ .
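The determinant bound and the convergence of the geometric matrix series used in point 2 can be checked numerically. The following is a minimal sketch in plain Python; the parameter values are illustrative assumptions rather than values from the paper, and the helper names are hypothetical.

```python
# Illustrative parameter values (assumptions, not taken from the paper);
# they satisfy 0 < (1 + rho) * alpha_i < 1.
RHO, ALPHA1, ALPHA2, P1, P2 = 0.2, 0.5, 0.6, 0.3, 0.7


def mean_matrix(rho, a1, a2, p1, p2):
    """Return A~ = (1+rho) * [[a1*p1, a1*(1-p1)], [a2*p2, a2*(1-p2)]]."""
    s = 1.0 + rho
    return [[s * a1 * p1, s * a1 * (1 - p1)],
            [s * a2 * p2, s * a2 * (1 - p2)]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det_i_minus(a):
    """det(I - a) for a 2x2 matrix a."""
    return (1 - a[0][0]) * (1 - a[1][1]) - a[0][1] * a[1][0]


def det_closed_form(rho, a1, a2, p1, p2):
    """Closed form of det(I - A~) as written in the proof."""
    b1, b2 = (1 + rho) * a1, (1 + rho) * a2
    return 1 - b1 * p1 * (1 - b2) - b2 * (1 - (1 - b1) * p2)


def inverse_i_minus(a):
    """(I - a)^{-1} for a 2x2 matrix a, via the adjugate formula."""
    d = det_i_minus(a)
    return [[(1 - a[1][1]) / d, a[0][1] / d],
            [a[1][0] / d, (1 - a[0][0]) / d]]


def neumann_sum(a, m):
    """Return sum_{j=0}^{m} a^j."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(m):
        power = matmul(power, a)
        total = [[total[i][j] + power[i][j] for j in range(2)]
                 for i in range(2)]
    return total


A = mean_matrix(RHO, ALPHA1, ALPHA2, P1, P2)
partial = neumann_sum(A, 200)
limit = inverse_i_minus(A)
max_gap = max(abs(partial[i][j] - limit[i][j])
              for i in range(2) for j in range(2))
print(det_i_minus(A), det_closed_form(RHO, ALPHA1, ALPHA2, P1, P2), max_gap)
```

The two determinant values should agree and be positive, and the partial sum should be numerically indistinguishable from the inverse for these parameter values.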
3.: $X_{t}^{(m)}$ is a Cauchy sequence.
Let $D_{m, t, k} = X_{t}^{(m)} - X_{t}^{(m - k)}$ for all $t \in Z$ , $k = 1, 2, \dots, m$ . From Equation (A1), it is straightforward to get

$X_{t}^{(m)} - X_{t}^{(m - k)} \overset{d}{=} {A_{t}}_{ρ} \circ (X_{t - 1}^{(m - 1)} - X_{t - 1}^{(m - 1 - k)}) = {A_{t}}_{ρ} \circ D_{m - 1, t - 1, k} .$

Then, we have

$\begin{matrix} E (D_{m, t, k}) = & \tilde{A} E (D_{m - 1, t - 1, k}) = {\tilde{A}}^{m} E (Z_{t}) \to 0, m \to \infty . \end{matrix}$

Next, we aim to prove that

$E (D_{m, t, k} \cdot D_{m, t, k}^{⊤}) \to 0, m \to \infty .$

Proof.

Let

H_{m, t, k} \equiv {[{(X_{1, t}^{(m)} - X_{1, t}^{(m - k)})}^{2}, {(X_{2, t}^{(m)} - X_{2, t}^{(m - k)})}^{2}]}^{⊤}

,

D_{m, t, k} = {[(X_{1, t}^{(m)} - X_{1, t}^{(m - k)}), (X_{2, t}^{(m)} - X_{2, t}^{(m - k)})]}^{⊤}

. According to Lemma 1, it follows that

\begin{matrix} E (D_{m, t, k} \cdot D_{m, t, k}^{⊤}) = & E [(X_{t}^{(m)} - X_{t}^{(m - k)}) \cdot {(X_{t}^{(m)} - X_{t}^{(m - k)})}^{⊤}] \\ = & E [{A_{t}}_{ρ} \circ (X_{t - 1}^{(m - 1)} - X_{t - 1}^{(m - k - 1)}) \cdot {({A_{t}}_{ρ} \circ (X_{t - 1}^{(m - 1)} - X_{t - 1}^{(m - k - 1)}))}^{⊤}] \\ = & \tilde{A} E [(X_{t - 1}^{(m - 1)} - X_{t - 1}^{(m - k - 1)}) \cdot {(X_{t - 1}^{(m - 1)} - X_{t - 1}^{(m - k - 1)})}^{⊤}] {\tilde{A}}^{⊤} + M, \end{matrix}

where

M = diag (m_{1}, m_{2})

and

\begin{matrix} [\begin{matrix} m_{1} \\ m_{2} \end{matrix}] = & {(1 + ρ)}^{2} [\begin{matrix} α_{1}^{2} p_{1} (1 - p_{1}) & α_{1}^{2} p_{1} (1 - p_{1}) \\ α_{2}^{2} p_{2} (1 - p_{2}) & α_{2}^{2} p_{2} (1 - p_{2}) \end{matrix}] E (H_{m - 1, t - 1, k}) \\ + (1 + ρ) [\begin{matrix} α_{1} p_{1} ((1 + 2 ρ) - α_{1} (1 + ρ)) & α_{1} (1 - p_{1}) ((1 + 2 ρ) - α_{1} (1 + ρ)) \\ α_{2} p_{2} ((1 + 2 ρ) - α_{2} (1 + ρ)) & α_{2} (1 - p_{2}) ((1 + 2 ρ) - α_{2} (1 + ρ)) \end{matrix}] E (D_{m - 1, t - 1, k}) \\ - 2 {(1 + ρ)}^{2} [\begin{matrix} α_{1}^{2} p_{1} (1 - p_{1}) \cdot E [(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})] \\ α_{2}^{2} p_{2} (1 - p_{2}) \cdot E [(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})] \end{matrix}] . \end{matrix}

Let us start by analyzing

M

. On the one hand, considering the non-negativity of the random variables

X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}

and

X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)}

, we find that

E ((X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})) \geq 0

. On the other hand, we proceed to derive

E (H_{m, t, k})

as follows:

\begin{matrix} E & [\begin{matrix} {(X_{1, t}^{(m)} - X_{1, t}^{(m - k)})}^{2} \\ {(X_{2, t}^{(m)} - X_{2, t}^{(m - k)})}^{2} \end{matrix}] \\ = & [\begin{matrix} E {[{U_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) + {U_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})]}^{2} \\ E {[{V_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) + {V_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})]}^{2} \end{matrix}] \\ = & [\begin{matrix} E {[{U_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)})]}^{2} \\ E {[{V_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)})]}^{2} \end{matrix}] + [\begin{matrix} E {[{U_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})]}^{2} \\ E {[{V_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})]}^{2} \end{matrix}] \\ + [\begin{matrix} 2 E [{U_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \cdot {U_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})] \\ 2 E [{V_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \cdot {V_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})] \end{matrix}] \\ \equiv & [\begin{matrix} s_{11} \\ s_{21} \end{matrix}] + [\begin{matrix} s_{12} \\ s_{22} \end{matrix}] + [\begin{matrix} s_{13} \\ s_{23} \end{matrix}] . \end{matrix}

\begin{matrix} s_{11} = & E {[{U_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)})]}^{2} \\ = & E {[\sum_{i = 1}^{X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}} W_{i} (U_{1, t}, ρ)]}^{2} \\ = & E [\sum_{i = 1}^{X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}} W_{i}^{2} (U_{1, t}, ρ) + \sum_{i \neq n}^{X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}} W_{i} (U_{1, t}, ρ) \cdot W_{n} (U_{1, t}, ρ)] \\ = & E \{E [\sum_{i = 1}^{X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}} W_{i}^{2} (U_{1, t}, ρ) | X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}]\} \\ + E \{E [\sum_{i \neq n}^{X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}} W_{i} (U_{1, t}, ρ) \cdot W_{n} (U_{1, t}, ρ) | X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}]\} \\ = & E \{(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \cdot E [W_{i}^{2} (U_{1, t}, ρ)]\} + E \{(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \\ \cdot (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)} - 1) \cdot E [W_{i} (U_{1, t}, ρ) \cdot W_{n} (U_{1, t}, ρ)]\} \\ = & E (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \cdot E [U_{1, t} (1 + ρ) (1 + 2 ρ)] + E [(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \\ \cdot (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)} - 1)] \cdot E [U_{1, t}^{2} {(1 + ρ)}^{2}] \\ = & E (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \cdot (1 + ρ) (1 + 2 ρ) \cdot α_{1} p_{1} + E [(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \\ \cdot (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)} - 1)] \cdot {(1 + ρ)}^{2} \cdot α_{1}^{2} p_{1} \\ = & α_{1} p_{1} (1 + ρ) [(1 + 2 ρ) - α_{1} (1 + ρ)] \cdot E (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) + α_{1}^{2} p_{1} {(1 + ρ)}^{2} \\ \cdot E {(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)})}^{2} . \end{matrix}
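The second-moment identity behind this derivation can be sanity-checked by simulation for a fixed count n in place of the random upper summation limit. The sketch below rests on labeled assumptions: the random coefficient equals α with probability p and 0 otherwise (consistent with the moments E(U) = αp and E(U²) = α²p used above), and the counting variable is built as a Bernoulli indicator times one plus a geometric variable with mean ρ. That construction is a hypothetical choice that reproduces the conditional moments (1 + ρ)U and (1 + ρ)(1 + 2ρ)U; it is not claimed to be the paper's definition of the ρ-thinning operator.

```python
import random


def sample_counting_variable(u, rho, rng):
    """Draw W with E[W | u] = (1+rho)*u and E[W^2 | u] = (1+rho)*(1+2*rho)*u.

    Assumed construction (hypothetical): W = B * (1 + G), where
    B ~ Bernoulli(u) and G is geometric on {0, 1, ...} with mean rho.
    """
    if rng.random() >= u:
        return 0
    g = 0
    success = 1.0 / (1.0 + rho)  # success probability giving E[G] = rho
    while rng.random() >= success:
        g += 1
    return 1 + g


def second_moment_formula(n, rho, alpha, p):
    """n(1+rho)(1+2rho)*alpha*p + n(n-1)(1+rho)^2*alpha^2*p."""
    return (n * (1 + rho) * (1 + 2 * rho) * alpha * p
            + n * (n - 1) * (1 + rho) ** 2 * alpha ** 2 * p)


def second_moment_monte_carlo(n, rho, alpha, p, reps, seed=0):
    """Monte Carlo estimate of E[(sum_{i=1}^n W_i)^2] with a shared coefficient."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # One coefficient per replicate, shared by all n counting variables.
        u = alpha if rng.random() < p else 0.0
        y = sum(sample_counting_variable(u, rho, rng) for _ in range(n))
        total += y * y
    return total / reps


exact = second_moment_formula(5, 0.2, 0.5, 0.3)
estimate = second_moment_monte_carlo(5, 0.2, 0.5, 0.3, reps=200_000)
print(exact, estimate)
```

Because the coefficient is drawn once per replicate, the cross terms carry E(U²) rather than [E(U)]², which is the point of conditioning on the random coefficient in the derivation.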

Similarly,

\begin{matrix} s_{12} = & E {[{U_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})]}^{2} \\ = & α_{1} (1 - p_{1}) (1 + ρ) [(1 + 2 ρ) - α_{1} (1 + ρ)] \cdot E (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)}) \\ + α_{1}^{2} (1 - p_{1}) {(1 + ρ)}^{2} \cdot E {(X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})}^{2} . \end{matrix}

\begin{matrix} s_{13} = & 2 E [{U_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \cdot {U_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})] \\ = & 2 E {E [{U_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \cdot {U_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)}) | (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}, \\ X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})]} \\ = & 2 E {E [\sum_{i = 1}^{X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}} W_{i} (U_{1, t}, ρ) \cdot \sum_{j = 1}^{X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)}} W_{j} (U_{2, t}, ρ) | (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}, \\ X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})]} \\ = & 2 E {(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \cdot (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)}) E [W_{i} (U_{1, t}, ρ) \cdot W_{j} (U_{2, t}, ρ)]} \\ = & 2 E {(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})} \cdot E {E [W_{i} (U_{1, t}, ρ) \cdot W_{j} (U_{2, t}, ρ)]} \\ = & 2 E [(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})] \cdot E [U_{1, t} (1 + ρ) \cdot U_{2, t} (1 + ρ)] \\ = & 2 {(1 + ρ)}^{2} E (U_{1, t} U_{2, t}) \cdot E [(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})] \\ = & 0 \\ = & s_{23} . \end{matrix}

\begin{matrix} s_{21} = & E {[{V_{1, t}}_{ρ} \circ (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)})]}^{2} \\ = & α_{2} p_{2} (1 + ρ) [(1 + 2 ρ) - α_{2} (1 + ρ)] \cdot E (X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)}) \\ + α_{2}^{2} p_{2} {(1 + ρ)}^{2} \cdot E {(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)})}^{2} . \end{matrix}

\begin{matrix} s_{22} = & E {[{V_{2, t}}_{ρ} \circ (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})]}^{2} \\ = & α_{2} (1 - p_{2}) (1 + ρ) [(1 + 2 ρ) - α_{2} (1 + ρ)] \cdot E (X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)}) \\ + α_{2}^{2} (1 - p_{2}) {(1 + ρ)}^{2} \cdot E {(X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})}^{2} . \end{matrix}

Then,

\begin{matrix} E (H_{m, t, k}) = & {(1 + ρ)}^{2} [\begin{matrix} α_{1}^{2} p_{1} & α_{1}^{2} (1 - p_{1}) \\ α_{2}^{2} p_{2} & α_{2}^{2} (1 - p_{2}) \end{matrix}] \cdot E [\begin{matrix} {(X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)})}^{2} \\ {(X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)})}^{2} \end{matrix}] \\ + (1 + ρ) [\begin{matrix} α_{1} p_{1} ((1 + 2 ρ) - α_{1} (1 + ρ)) & α_{1} (1 - p_{1}) ((1 + 2 ρ) - α_{1} (1 + ρ)) \\ α_{2} p_{2} ((1 + 2 ρ) - α_{2} (1 + ρ)) & α_{2} (1 - p_{2}) ((1 + 2 ρ) - α_{2} (1 + ρ)) \end{matrix}] \\ \cdot E [\begin{matrix} X_{1, t - 1}^{(m - 1)} - X_{1, t - 1}^{(m - k - 1)} \\ X_{2, t - 1}^{(m - 1)} - X_{2, t - 1}^{(m - k - 1)} \end{matrix}] \\ \leq & (1 + ρ) [\begin{matrix} α_{1} p_{1} \cdot 1 & α_{1} (1 - p_{1}) \cdot 1 \\ α_{2} p_{2} \cdot 1 & α_{2} (1 - p_{2}) \cdot 1 \end{matrix}] \cdot E (H_{m - 1, t - 1, k}) \\ + (1 + ρ) [\begin{matrix} α_{1} p_{1} (1 + 2 ρ) & α_{1} (1 - p_{1}) (1 + 2 ρ) \\ α_{2} p_{2} (1 + 2 ρ) & α_{2} (1 - p_{2}) (1 + 2 ρ) \end{matrix}] E (D_{m - 1, t - 1, k}) \\ = & \tilde{A} \cdot E (H_{m - 1, t - 1, k}) + (1 + 2 ρ) \tilde{A} \cdot E (D_{m - 1, t - 1, k}) . \end{matrix}

Therefore,

\begin{matrix} [\begin{matrix} m_{1} \\ m_{2} \end{matrix}] \leq & {(1 + ρ)}^{2} [\begin{matrix} α_{1}^{2} p_{1} (1 - p_{1}) & α_{1}^{2} p_{1} (1 - p_{1}) \\ α_{2}^{2} p_{2} (1 - p_{2}) & α_{2}^{2} p_{2} (1 - p_{2}) \end{matrix}] \cdot E (H_{m - 1, t - 1, k}) \\ + (1 + ρ) [\begin{matrix} α_{1} p_{1} (1 + 2 ρ - α_{1} (1 + ρ)) & α_{1} (1 - p_{1}) (1 + 2 ρ - α_{1} (1 + ρ)) \\ α_{2} p_{2} (1 + 2 ρ - α_{2} (1 + ρ)) & α_{2} (1 - p_{2}) (1 + 2 ρ - α_{2} (1 + ρ)) \end{matrix}] \cdot E (D_{m - 1, t - 1, k}) \\ < & (1 + ρ) [\begin{matrix} α_{1} p_{1} \cdot 1 & α_{1} (1 - p_{1}) \cdot 1 \\ α_{2} p_{2} \cdot 1 & α_{2} (1 - p_{2}) \cdot 1 \end{matrix}] \cdot E (H_{m - 1, t - 1, k}) \\ + (1 + ρ) [\begin{matrix} α_{1} p_{1} \cdot (1 + 2 ρ) & α_{1} (1 - p_{1}) \cdot (1 + 2 ρ) \\ α_{2} p_{2} \cdot (1 + 2 ρ) & α_{2} (1 - p_{2}) \cdot (1 + 2 ρ) \end{matrix}] \cdot E (D_{m - 1, t - 1, k}) \\ = & \tilde{A} \cdot E (H_{m - 1, t - 1, k}) + (1 + 2 ρ) \tilde{A} \cdot E (D_{m - 1, t - 1, k}) \\ \leq & \tilde{A} \cdot [\tilde{A} \cdot E (H_{m - 2, t - 2, k}) + (1 + 2 ρ) \tilde{A} \cdot E (D_{m - 2, t - 2, k})] + (1 + 2 ρ) \tilde{A} \cdot E (D_{m - 1, t - 1, k}) \\ \leq & {\tilde{A}}^{m} \cdot E (Z_{t}^{2}) + (1 + 2 ρ) \sum_{j = 1}^{m} {\tilde{A}}^{j} E (D_{m - j, t - j, k}) \\ \leq & {\tilde{A}}^{m} \cdot E (Z_{t}^{2}) + (1 + 2 ρ) m {\tilde{A}}^{m} E (Z_{t}) . \end{matrix}

(A2)

If

0 < (1 + ρ) α_{1}, (1 + ρ) α_{2} < 1

, it is straightforward to demonstrate that the eigenvalues (denoted as

λ_{i}

, for

i = 1, 2

) of the matrix

\tilde{A}

lie within the unit circle. Consequently, we have

m λ_{i}^{m} \to 0

,

i = 1, 2

as

m \to \infty

, which implies that Equation (A2) converges to 0 as

m \to \infty

. Furthermore,

M \to 0

as

m \to \infty

.

Next, we examine

\begin{matrix} E (D_{m, t, k} \cdot D_{m, t, k}^{⊤}) = & \tilde{A} E (D_{m - 1, t - 1, k} \cdot D_{m - 1, t - 1, k}^{⊤}) {\tilde{A}}^{⊤} + M \\ \leq & {\tilde{A}}^{m} E (Z_{t} Z_{t}^{⊤}) {\tilde{A}}^{m ⊤} + \sum_{i = 0}^{m - 1} {\tilde{A}}^{i} M {\tilde{A}}^{i ⊤} . \end{matrix}

(A3)

As

m \to \infty

, we have:

\begin{matrix} vec [{\tilde{A}}^{m} E (Z_{t} Z_{t}^{⊤}) {\tilde{A}}^{m ⊤}] = ({\tilde{A}}^{m} \otimes {\tilde{A}}^{m}) vec [E (Z_{t} Z_{t}^{⊤})] = {(\tilde{A} \otimes \tilde{A})}^{m} vec [E (Z_{t} Z_{t}^{⊤})] \to 0, \end{matrix}

(A4)

and

\begin{matrix} \sum_{i = 0}^{m - 1} vec ({\tilde{A}}^{i} M {\tilde{A}}^{i ⊤}) = & \sum_{i = 0}^{m - 1} {(\tilde{A} \otimes \tilde{A})}^{i} vec (M) \\ = & [I - {(\tilde{A} \otimes \tilde{A})}^{m}] {(I - \tilde{A} \otimes \tilde{A})}^{- 1} vec (M) \to {(I - \tilde{A} \otimes \tilde{A})}^{- 1} vec (M) . \end{matrix}

(A5)

The validity of Equations (A4) and (A5) relies on the fact that the eigenvalues of the matrix

\tilde{A} \otimes \tilde{A}

lie within the unit circle. Hence,

E (D_{m, t, k} \cdot D_{m, t, k}^{⊤}) \to 0

when

m \to \infty

. □
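The vectorization identity behind Equations (A4) and (A5), vec(A M Aᵀ) = (A ⊗ A) vec(M) with column-stacking vec, can be confirmed numerically. The sketch below uses plain Python lists; the matrix entries are illustrative assumptions and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def kron(a, b):
    """Kronecker product of two square matrices of equal size."""
    n = len(b)
    size = len(a) * n
    return [[a[i // n][j // n] * b[i % n][j % n] for j in range(size)]
            for i in range(size)]


def vec(m):
    """Stack the columns of m into one list (column-major order)."""
    return [m[r][c] for c in range(len(m[0])) for r in range(len(m))]


def matvec(k, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in k]


# Illustrative matrices (assumptions): A plays the role of A~, M is diagonal.
A = [[0.18, 0.42], [0.504, 0.216]]
M = [[2.0, 0.0], [0.0, 3.0]]

K = kron(A, A)
lhs = vec(matmul(matmul(A, M), transpose(A)))
rhs = matvec(K, vec(M))
identity_gap = max(abs(x - y) for x, y in zip(lhs, rhs))

# Repeated application of A (x) A drives any fixed vector to zero when the
# eigenvalues of A lie inside the unit circle.
v = vec(M)
for _ in range(200):
    v = matvec(K, v)
decay = max(abs(x) for x in v)
print(identity_gap, decay)
```

The eigenvalues of A ⊗ A are the pairwise products of the eigenvalues of A, which is why the same unit-circle condition controls both limits.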

We have proved

E [(X_{t}^{(m)} - X_{t}^{(m - k)}) \cdot {(X_{t}^{(m)} - X_{t}^{(m - k)})}^{⊤}] \to 0

. This implies that the sequence

{X_{t}^{(m)}}_{m \in Z}

is a Cauchy sequence, and thus,

lim_{m \to \infty} X_{t}^{(m)} = X_{t} \in L^{2} (Ω, F, P)

. Finally, by taking limits on both sides of Equation (A1) and letting

m \to \infty

, we obtain

X_{t} = {A_{t}}_{ρ} \circ X_{t - 1} + Z_{t}

, where

Z_{t}

is independent of

{A_{t}}_{ρ} \circ X_{t - 1}

and

X_{s}

for

s < t

.

Uniqueness.

Let us delve into uniqueness. Assume there exists another series

{Y_{t}}_{t \in Z}

satisfying Equation (4). Then, we can express the difference between

X_{t}

and

Y_{t}

as

X_{t} - Y_{t} = {A_{t}}_{ρ} \circ X_{t - 1} - {A_{t}}_{ρ} \circ Y_{t - 1} .

Define

\begin{matrix} B_{i, m} = {ω : |X_{i, t}^{(m)} (ω) - Y_{i, t} (ω)| > 0}, i = 1, 2, m \geq 1, \\ B_{\infty} = ⋃_{i = 1}^{2} {ω : |X_{i, t} (ω) - Y_{i, t} (ω)| > 0} = ⋃_{i = 1}^{2} B_{i, \infty} = ⋃_{i = 1}^{2} ⋂_{n = 1}^{\infty} ⋃_{m = n}^{\infty} B_{i, m} . \end{matrix}

We then establish:

\begin{matrix} P (B_{i, m}) = & P ({ω : |X_{i, t}^{(m)} (ω) - Y_{i, t} (ω)| > 0}) \\ \leq & \sum_{k = 1}^{\infty} P ({ω : |X_{i, t}^{(m)} (ω) - Y_{i, t} (ω)| = k}) \\ \leq & \sum_{k = 1}^{\infty} k P ({ω : |X_{i, t}^{(m)} (ω) - Y_{i, t} (ω)| = k}) \\ = & E (|X_{i, t}^{(m)} - Y_{i, t}|) . \end{matrix}

We introduce new notations:

\begin{matrix} L_{t}^{(0)} = & Z_{t}, \\ L_{t}^{(m)} = & {(L_{1, t}^{(m)}, L_{2, t}^{(m)})}^{⊤} = {(| X_{1, t}^{(m)} - Y_{1, t} |, | X_{2, t}^{(m)} - Y_{2, t} |)}^{⊤}, for m \geq 1 . \end{matrix}

According to Lemma 1, we derive:

E [L_{t}^{(m)}] \leq \tilde{A} E [L_{t - 1}^{(m - 1)}] \leq \dots \leq {\tilde{A}}^{m} E [L_{t - m}^{(0)}] = {\tilde{A}}^{m} μ_{Z_{t}} .

Consequently,

\sum_{m = 1}^{\infty} \Pr \{B_{i, m}\} \leq \sum_{m = 1}^{\infty} {\tilde{A}}^{m} μ_{Z_{t}} \leq {(I - \tilde{A})}^{- 1} μ_{Z_{t}} < \infty .

By the Borel–Cantelli lemma, we conclude that

\Pr (B_{i, \infty}) = 0

. Thus,

\Pr (B_{\infty}) = 0

, i.e.,

X_{t} = Y_{t}

almost surely.

Strict stationarity.

We will employ mathematical induction to demonstrate that for all h and k,

{(X_{h + 1}^{(m)}, \dots, X_{h + k}^{(m)})}^{⊤}

and

{(X_{1}^{(m)}, \dots, X_{k}^{(m)})}^{⊤}

are identically distributed. Firstly, when

m = 0

, we have

\begin{matrix} (\begin{matrix} X_{h + 1}^{(0)} \\ ⋮ \\ X_{h + k}^{(0)} \end{matrix}) = (\begin{matrix} Z_{h + 1} \\ ⋮ \\ Z_{h + k} \end{matrix}) \end{matrix}

and

\begin{matrix} (\begin{matrix} X_{1}^{(0)} \\ ⋮ \\ X_{k}^{(0)} \end{matrix}) = (\begin{matrix} Z_{1} \\ ⋮ \\ Z_{k} \end{matrix}) . \end{matrix}

Since

{(Z_{h + 1}, \dots, Z_{h + k})}^{⊤} \overset{d}{=} {(Z_{1}, \dots, Z_{k})}^{⊤}

(here,

\overset{d}{=}

stands for having the same distribution), then

{(X_{h + 1}^{(0)}, \dots, X_{h + k}^{(0)})}^{⊤}

and

{(X_{1}^{(0)}, \dots, X_{k}^{(0)})}^{⊤}

have identical distributions. Consequently,

{X_{t}^{(0)}}_{t \in Z}

is strictly stationary.

Next, suppose

{X_{t}^{(m - 1)}}_{t \in Z}

is strictly stationary; then we have

\begin{matrix} (\begin{matrix} X_{1}^{(m)} \\ ⋮ \\ X_{k}^{(m)} \end{matrix}) = (\begin{matrix} Z_{1} \\ ⋮ \\ Z_{k} \end{matrix}) + (\begin{matrix} A_{1} & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & A_{k} \end{matrix})_{ρ} \circ (\begin{matrix} X_{1}^{(m - 1)} \\ ⋮ \\ X_{k}^{(m - 1)} \end{matrix}), \end{matrix}

\begin{matrix} (\begin{matrix} X_{h + 1}^{(m)} \\ ⋮ \\ X_{h + k}^{(m)} \end{matrix}) = (\begin{matrix} Z_{h + 1} \\ ⋮ \\ Z_{h + k} \end{matrix}) + (\begin{matrix} A_{h + 1} & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & A_{h + k} \end{matrix})_{ρ} \circ (\begin{matrix} X_{h + 1}^{(m - 1)} \\ ⋮ \\ X_{h + k}^{(m - 1)} \end{matrix}) . \end{matrix}

Likewise, since

{X_{t}^{(m - 1)}}_{t \in Z}

is strictly stationary, we have

{(X_{1}^{(m - 1)}, \dots, X_{k}^{(m - 1)})}^{⊤} \overset{d}{=} {(X_{h + 1}^{(m - 1)}, \dots, X_{h + k}^{(m - 1)})}^{⊤} .

Then

{(X_{1}^{(m)}, \dots, X_{k}^{(m)})}^{⊤} \overset{d}{=} {(X_{h + 1}^{(m)}, \dots, X_{h + k}^{(m)})}^{⊤}

. Thus,

{X_{t}^{(m)}}_{t \in Z}

forms a strictly stationary process. Furthermore, since

lim_{m \to \infty} X_{t}^{(m)} = X_{t}

, i.e.,

X_{t}^{(m)} \overset{L^{2}}{\to} X_{t}

, then

X_{t}^{(m)} \overset{P}{\to} X_{t}

. Therefore,

{X_{t}}_{t \in Z}

is also a strictly stationary process.

Ergodicity.

At time t, the random matrix operation

{A_{t}}_{ρ} \circ

involves two random coefficient-thinning operations, i.e., “

{U_{1, t}}_{ρ} \circ

” or “

{U_{2, t}}_{ρ} \circ

” and “

{V_{1, t}}_{ρ} \circ

” or “

{V_{2, t}}_{ρ} \circ

”. Let

W (t)

denote all counting series involved in the matrix operation. Obviously,

W (t)

is a 2-dimensional series. Let

σ (X)

represent the

σ

-algebra rendering the vector

X

measurable. According to Equation (4), for any t, we have

σ (X_{t}, X_{t + 1}, \dots) \subset σ (Z_{t}, A_{t}, W (t), Z_{t + 1}, A_{t + 1}, W (t + 1), \dots),

and consequently,

⋂_{t = 1}^{\infty} σ (X_{t}, X_{t + 1}, \dots) \subset D = ⋂_{t = 1}^{\infty} σ (Z_{t}, A_{t}, W (t), Z_{t + 1}, A_{t + 1}, W (t + 1), \dots) .

Given that the sequence

{Z_{t}, A_{t}, W (t)}

is an i.i.d. sequence of random vectors, then

{Z_{t}, A_{t}, W (t)}

is ergodic. According to Kolmogorov’s

0 - 1

law, for any event

B

within

D

, the probability

P (B)

is either 0 or 1. This means that the tail of the

σ

field of

{X_{t}}

contains only sets of probability 0 or 1. By an argument similar to that in [8], the sequence

{X_{t}}

is ergodic. □

Appendix C. Proof of Proposition 2

(i): $E (X_{t}) = {(I - \tilde{A})}^{- 1} E (Z)$ .

Proof.

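E (X_{t}) = E ({A_{t}}_{ρ} \circ X_{t - 1} + Z_{t}) = \tilde{A} E (X_{t - 1}) + E (Z_{t})

. By stationarity,

E (X_{t}) = E (X_{t - 1})

, and since

I - \tilde{A}

is invertible, solving for

E (X_{t})

gives the stated result. □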
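As a rough illustration of result (i), the stationary mean can be compared with a long simulated path. Everything distributional in this sketch is an assumption made for illustration: each row of the coefficient matrix acts on the first component with probability p_i and on the second otherwise (consistent with the moments used in Appendix A), the counting variable is a Bernoulli indicator times one plus a geometric variable with mean ρ (a hypothetical construction with conditional mean (1 + ρ)U), and the innovations are uniform on {0, 1, 2}. Only the first-moment recursion is being checked.

```python
import random


def sample_counting_variable(u, rho, rng):
    """Assumed counting variable with E[W | u] = (1 + rho) * u."""
    if rng.random() >= u:
        return 0
    g = 0
    success = 1.0 / (1.0 + rho)  # geometric part has mean rho
    while rng.random() >= success:
        g += 1
    return 1 + g


def thin(u, n, rho, rng):
    """Sum of n counting variables sharing the coefficient u."""
    return sum(sample_counting_variable(u, rho, rng) for _ in range(n))


def simulate_mean(rho, a1, a2, p1, p2, steps, seed=0):
    """Sample means of (X1, X2) along one simulated path started at zero."""
    rng = random.Random(seed)
    x1 = x2 = 0
    s1 = s2 = 0
    for _ in range(steps):
        # Row i thins X1 with probability p_i and X2 otherwise (assumption).
        new1 = thin(a1, x1 if rng.random() < p1 else x2, rho, rng)
        new2 = thin(a2, x1 if rng.random() < p2 else x2, rho, rng)
        # Hypothetical innovations: uniform on {0, 1, 2}, mean 1.
        x1, x2 = new1 + rng.randint(0, 2), new2 + rng.randint(0, 2)
        s1 += x1
        s2 += x2
    return s1 / steps, s2 / steps


def stationary_mean(rho, a1, a2, p1, p2, mu_z):
    """Solve (I - A~) m = mu_z for the 2x2 mean matrix A~."""
    s = 1.0 + rho
    a = [[s * a1 * p1, s * a1 * (1 - p1)], [s * a2 * p2, s * a2 * (1 - p2)]]
    d = (1 - a[0][0]) * (1 - a[1][1]) - a[0][1] * a[1][0]
    m1 = ((1 - a[1][1]) * mu_z[0] + a[0][1] * mu_z[1]) / d
    m2 = (a[1][0] * mu_z[0] + (1 - a[0][0]) * mu_z[1]) / d
    return m1, m2


theory = stationary_mean(0.2, 0.5, 0.6, 0.3, 0.7, (1.0, 1.0))
sample = simulate_mean(0.2, 0.5, 0.6, 0.3, 0.7, steps=100_000)
print(theory, sample)
```

Agreement of the two pairs up to Monte Carlo error is what the formula predicts; the check says nothing about higher moments, which depend on the exact counting distribution.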

(ii): $Σ_{X_{t}} = \tilde{A} Σ_{X_{t - 1}} {\tilde{A}}^{⊤} + C + Σ_{Z_{t}},$ where $c_{12} = 0, c_{21} = 0,$

$\begin{matrix} c_{11} = & {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E {(X_{1, t - 1} - X_{2, t - 1})}^{2} + α_{1} (1 + ρ) (1 + 2 ρ - (1 + ρ) α_{1}) [p_{1} E (X_{1, t - 1}) \\ + (1 - p_{1}) E (X_{2, t - 1})], \\ c_{22} = & {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) E {(X_{1, t - 1} - X_{2, t - 1})}^{2} + α_{2} (1 + ρ) (1 + 2 ρ - (1 + ρ) α_{2}) [p_{2} E (X_{1, t - 1}) \\ + (1 - p_{2}) E (X_{2, t - 1})] . \end{matrix}$

Proof.

\begin{matrix} Var (X_{1, t}) = & Var ({U_{1, t}}_{ρ} \circ X_{1, t - 1} + {U_{2, t}}_{ρ} \circ X_{2, t - 1} + Z_{1, t}) \\ = & Var ({U_{1, t}}_{ρ} \circ X_{1, t - 1}) + Var ({U_{2, t}}_{ρ} \circ X_{2, t - 1}) + 2 Cov ({U_{1, t}}_{ρ} \circ X_{1, t - 1}, {U_{2, t}}_{ρ} \circ X_{2, t - 1}) \\ + σ_{Z_{1}}^{2} \\ = & {(1 + ρ)}^{2} α_{1}^{2} [p_{1}^{2} Var (X_{1, t - 1}) + {(1 - p_{1})}^{2} Var (X_{2, t - 1})] + {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) [E (X_{1, t - 1}^{2}) \\ + E (X_{2, t - 1}^{2}) - 2 E (X_{1, t - 1}) E (X_{2, t - 1})] + [(1 + ρ) (1 + 2 ρ) α_{1} - {(1 + ρ)}^{2} α_{1}^{2}] [p_{1} \\ \times E (X_{1, t - 1}) + (1 - p_{1}) E (X_{2, t - 1})] + σ_{Z_{1}}^{2}, \end{matrix}

where

\begin{matrix} Var ({U_{1, t}}_{ρ} \circ X_{1, t - 1}) = & E [{({U_{1, t}}_{ρ} \circ X_{1, t - 1})}^{2}] - {[E ({U_{1, t}}_{ρ} \circ X_{1, t - 1})]}^{2} \\ = & E {[\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ)]}^{2} - {[E (\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ))]}^{2} \\ = & E \{E [\sum_{i = 1}^{X_{1, t - 1}} W_{i}^{2} (U_{1, t}, ρ) + \sum_{i \neq m}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) \cdot W_{m} (U_{1, t}, ρ) | X_{1, t - 1}]\} \\ - {\{E [E (\sum_{i = 1}^{X_{1, t - 1}} W_{i} (U_{1, t}, ρ) | X_{1, t - 1})]\}}^{2} \\ = & E [X_{1, t - 1} (1 + ρ) (1 + 2 ρ) E (U_{1, t}) + X_{1, t - 1} (X_{1, t - 1} - 1) {(1 + ρ)}^{2} E (U_{1, t}^{2})] \\ - {\{E [X_{1, t - 1} (1 + ρ) E (U_{1, t})]\}}^{2} \\ = & {(1 + ρ)}^{2} α_{1}^{2} p_{1}^{2} Var (X_{1, t - 1}) + {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E (X_{1, t - 1}^{2}) - {(1 + ρ)}^{2} α_{1}^{2} p_{1} \\ \times E (X_{1, t - 1}) + (1 + ρ) (1 + 2 ρ) α_{1} p_{1} E (X_{1, t - 1}) . \end{matrix}

\begin{matrix} Var ({U_{2, t}}_{ρ} \circ X_{2, t - 1}) = & {(1 + ρ)}^{2} α_{1}^{2} {(1 - p_{1})}^{2} Var (X_{2, t - 1}) + {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E (X_{2, t - 1}^{2}) \\ - {(1 + ρ)}^{2} α_{1}^{2} (1 - p_{1}) E (X_{2, t - 1}) + (1 + ρ) (1 + 2 ρ) α_{1} (1 - p_{1}) E (X_{2, t - 1}) . \end{matrix}

\begin{matrix} Var (X_{2, t}) = & Var ({V_{1, t}}_{ρ} \circ X_{1, t - 1} + {V_{2, t}}_{ρ} \circ X_{2, t - 1} + Z_{2, t}) \\ = & Var ({V_{1, t}}_{ρ} \circ X_{1, t - 1}) + Var ({V_{2, t}}_{ρ} \circ X_{2, t - 1}) + 2 Cov ({V_{1, t}}_{ρ} \circ X_{1, t - 1}, {V_{2, t}}_{ρ} \circ X_{2, t - 1}) \\ + σ_{Z_{2, t}}^{2} \\ = & {(1 + ρ)}^{2} α_{2}^{2} [p_{2}^{2} Var (X_{1, t - 1}) + {(1 - p_{2})}^{2} Var (X_{2, t - 1})] + {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) \\ \times [E (X_{1, t - 1}^{2}) + E (X_{2, t - 1}^{2}) - 2 E (X_{1, t - 1}) E (X_{2, t - 1})] + [(1 + ρ) (1 + 2 ρ) α_{2} \\ - {(1 + ρ)}^{2} α_{2}^{2}] [p_{2} E (X_{1, t - 1}) + (1 - p_{2}) E (X_{2, t - 1})] + σ_{Z_{2, t}}^{2} . \end{matrix}

\begin{matrix} Cov (X_{1, t}, X_{2, t}) = & {(1 + ρ)}^{2} α_{1} p_{1} α_{2} p_{2} Var (X_{1, t - 1}) + {(1 + ρ)}^{2} α_{1} (1 - p_{1}) α_{2} (1 - p_{2}) Var (X_{2, t - 1}) \\ + {(1 + ρ)}^{2} α_{1} p_{1} α_{2} (1 - p_{2}) Cov (X_{1, t - 1}, X_{2, t - 1}) + {(1 + ρ)}^{2} α_{1} (1 - p_{1}) α_{2} p_{2} \\ \times Cov (X_{1, t - 1}, X_{2, t - 1}) . \end{matrix}

\begin{matrix} 2 Cov ({U_{1, t}}_{ρ} \circ X_{1, t - 1}, {U_{2, t}}_{ρ} \circ X_{2, t - 1}) = - 2 {(1 + ρ)}^{2} α_{1} p_{1} α_{1} (1 - p_{1}) E (X_{1, t - 1}) E (X_{2, t - 1}) . \end{matrix}

Therefore,

\begin{matrix} Σ_{X_{t}} = & [\begin{matrix} Var (X_{1, t}) & Cov (X_{1, t}, X_{2, t}) \\ Cov (X_{2, t}, X_{1, t}) & Var (X_{2, t}) \end{matrix}] \\ = & {(1 + ρ)}^{2} [\begin{matrix} α_{1} p_{1} & α_{1} (1 - p_{1}) \\ α_{2} p_{2} & α_{2} (1 - p_{2}) \end{matrix}] [\begin{matrix} Var (X_{1, t - 1}) & Cov (X_{1, t - 1}, X_{2, t - 1}) \\ Cov (X_{2, t - 1}, X_{1, t - 1}) & Var (X_{2, t - 1}) \end{matrix}] \\ \cdot [\begin{matrix} α_{1} p_{1} & α_{2} p_{2} \\ α_{1} (1 - p_{1}) & α_{2} (1 - p_{2}) \end{matrix}] + C + Σ_{Z_{t}} \\ = & \tilde{A} Σ_{X_{t - 1}} {\tilde{A}}^{⊤} + diag (c_{11}, c_{22}) + Σ_{Z_{t}}, \end{matrix}

where

\begin{matrix} c_{11} = & {(1 + ρ)}^{2} α_{1}^{2} p_{1} (1 - p_{1}) E {(X_{1, t - 1} - X_{2, t - 1})}^{2} + α_{1} (1 + ρ) (1 + 2 ρ - (1 + ρ) α_{1}) [p_{1} E (X_{1, t - 1}) \\ + (1 - p_{1}) E (X_{2, t - 1})], \\ c_{22} = & {(1 + ρ)}^{2} α_{2}^{2} p_{2} (1 - p_{2}) E {(X_{1, t - 1} - X_{2, t - 1})}^{2} + α_{2} (1 + ρ) (1 + 2 ρ - (1 + ρ) α_{2}) [p_{2} E (X_{1, t - 1}) \\ + (1 - p_{2}) E (X_{2, t - 1})] . \end{matrix}

□

Appendix D. Proof of Lemma 3

Proof.

Since

\begin{matrix} Cov (X_{1, t}, X_{2, t}) = & {(1 + ρ)}^{2} α_{1} α_{2} [p_{1} p_{2} Var (X_{1, t - 1}) + (1 - p_{1}) (1 - p_{2}) Var (X_{2, t - 1})] + {(1 + ρ)}^{2} α_{1} α_{2} \\ \times [p_{1} (1 - p_{2}) + (1 - p_{1}) p_{2}] Cov (X_{1, t - 1}, X_{2, t - 1}), \end{matrix}

under the conditions of the lemma, we have that

{(1 + ρ)}^{2} α_{1} α_{2} [p_{1} (1 - p_{2}) + (1 - p_{1}) p_{2}] < {(1 + ρ)}^{2} α_{1} α_{2} [p_{1} + (1 - p_{1})] = {(1 + ρ)}^{2} α_{1} α_{2} < 1 .

Building on this inequality and the fact that the time series model

{X_{1, t}, X_{2, t}}

is stationary, we obtain that the covariance between the random variables

X_{1, t}

and

X_{2, t}

is given by

\begin{matrix} Cov (X_{1, t}, X_{2, t}) = \frac{{(1 + ρ)}^{2} α_{1} α_{2} [p_{1} p_{2} Var (X_{1, t - 1}) + (1 - p_{1}) (1 - p_{2}) Var (X_{2, t - 1})]}{1 - {(1 + ρ)}^{2} α_{1} α_{2} [p_{1} (1 - p_{2}) + (1 - p_{1}) p_{2}]} . \end{matrix}

Since

{X_{1, t}} \overset{d}{=} {X_{2, t}} \overset{d}{=} Geom (\frac{μ}{1 + μ})

, the two marginal variances coincide and cancel, so the correlation coefficient between

X_{1, t}

and

X_{2, t}

is

\begin{matrix} γ = \frac{Cov (X_{1, t}, X_{2, t})}{\sqrt{Var (X_{1, t})} \cdot \sqrt{Var (X_{2, t})}} = & \frac{{(1 + ρ)}^{2} α_{1} α_{2} [p_{1} p_{2} + (1 - p_{1}) (1 - p_{2})]}{1 - {(1 + ρ)}^{2} α_{1} α_{2} [p_{1} (1 - p_{2}) + (1 - p_{1}) p_{2}]} \\ = & \frac{{(1 + ρ)}^{2} α_{1} α_{2} - {(1 + ρ)}^{2} α_{1} α_{2} (p_{1} + p_{2} - 2 p_{1} p_{2})}{1 - {(1 + ρ)}^{2} α_{1} α_{2} (p_{1} + p_{2} - 2 p_{1} p_{2})} . \end{matrix}

□
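
The closed form for γ can be evaluated directly. The sketch below implements both lines of the last display and checks numerically that they agree; the function names are hypothetical, and the parameter values are the true values of Model (A) from Table 1, used only as an example.

```python
def correlation_gamma(alpha1, alpha2, p1, p2, rho):
    """Correlation of X_{1,t} and X_{2,t} from the first line of the display."""
    scale = (1.0 + rho) ** 2 * alpha1 * alpha2
    same = p1 * p2 + (1.0 - p1) * (1.0 - p2)     # weight in the numerator
    cross = p1 * (1.0 - p2) + (1.0 - p1) * p2    # weight in the denominator
    return scale * same / (1.0 - scale * cross)

def correlation_gamma_alt(alpha1, alpha2, p1, p2, rho):
    """Same quantity written with p1 + p2 - 2 p1 p2 (second line)."""
    scale = (1.0 + rho) ** 2 * alpha1 * alpha2
    s = p1 + p2 - 2.0 * p1 * p2
    return (scale - scale * s) / (1.0 - scale * s)

# True parameter values of Model (A): alpha1, alpha2, p1, p2, rho.
gamma_a = correlation_gamma(0.30, 0.25, 0.20, 0.15, 0.10)
gamma_a_alt = correlation_gamma_alt(0.30, 0.25, 0.20, 0.15, 0.10)
```

The two functions differ only in how the bracketed weights are grouped, since p_1 p_2 + (1 - p_1)(1 - p_2) = 1 - (p_1 + p_2 - 2 p_1 p_2).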

Appendix E. Proof of Theorem 3

Proof.

From the proof of Theorem 2,

S_{t} = \frac{1}{2} (X_{1, t} + X_{2, t})

is also stationary and ergodic. Then, the process

{S_{t} - μ}

is a zero-mean, stationary, ergodic process. According to the Wold decomposition theorem (see [25], Section 2.6), this process can be represented as

S_{t} - μ = \sum_{k = 0}^{\infty} d_{k} ξ_{t - k},

where

d_{0} = 1

,

\sum_{k = 0}^{\infty} {(d_{k})}^{2} < \infty

, and

{ξ_{t}}

is white noise with parameters

(0, σ^{2}) .

Then, the processes

X_{i, t}, i = 1, 2

can be decomposed as

X_{1, t} = a + \sum_{k = - \infty}^{\infty} d_{k} ξ_{t - k},

(A6)

where

d_{k} = 0

for

k < 0

, and

\sum_{k = - \infty}^{\infty} | d_{k} | < \infty .

Also,

Cov (S_{t}, S_{t - k}) = \frac{1}{4} (Cov (X_{1, t}, X_{1, t - k}) + Cov (X_{1, t}, X_{2, t - k}) + Cov (X_{1, t - k}, X_{2, t}) + Cov (X_{2, t}, X_{2, t - k})) .

These terms represent the components of the matrix

Cov (X_{t}, X_{t - k}) = {\tilde{A}}^{k} Var (X_{t})

. According to the correlation structure of the model and the properties of the matrix

\tilde{A}

, all terms in the equation are nonnegative, with the first and last terms being strictly positive. Equation (A6) indicates that

\sum_{k = - \infty}^{\infty} Cov (S_{t}, S_{t - k}) = σ^{2} {(\sum_{k = - \infty}^{\infty} d_{k})}^{2}

, implying

\sum_{k = - \infty}^{\infty} d_{k} \neq 0 .

We can now apply the theorem presented in [26], thereby completing the proof. □
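
The nonnegativity and decay of the factor Ã^k in Cov(X_t, X_{t-k}) = Ã^k Var(X_t) can be inspected numerically. The sketch below assumes Ã = (1 + ρ) [α_i p_i, α_i (1 - p_i)], as in Appendix C, and uses the true parameter values of Model (D) from Table 3 purely as an example; the helper name `a_tilde` is hypothetical. It checks only the matrix powers, not Var(X_t) itself.

```python
import numpy as np

def a_tilde(alpha1, alpha2, p1, p2, rho):
    """Matrix A-tilde as it appears in the variance recursion (assumed form)."""
    return (1.0 + rho) * np.array([[alpha1 * p1, alpha1 * (1.0 - p1)],
                                   [alpha2 * p2, alpha2 * (1.0 - p2)]])

# Model (D) true values: alpha1, alpha2, p1, p2, rho.
A = a_tilde(0.60, 0.40, 0.30, 0.70, 0.30)

# The row sums of A are (1 + rho) * alpha_i, which bound the spectral radius.
spectral_radius = max(abs(np.linalg.eigvals(A)))

# Powers A^0, ..., A^5; each entry is a sum of products of nonnegative terms.
powers = [np.linalg.matrix_power(A, k) for k in range(6)]
all_nonnegative = all((P >= 0).all() for P in powers)
```

A spectral radius below one makes the entries of Ã^k decay geometrically in k, which is what keeps the sum of the autocovariances of S_t finite.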

References

1. Brannas, K.; Nordstrom, J. A Bivariate Integer Valued Allocation Model for Guest Nights in Hotels and Cottages. Umea Economic Studies Working Paper No. 547. 2001. Available online: https://ssrn.com/abstract=255292 (accessed on 20 May 2024).
2. Quoreshi, A.S. Bivariate time series modeling of financial count data. Commun. Stat. Theory Methods 2006, 35, 1343–1358.
3. Pedeli, X.; Karlis, D. A bivariate INAR(1) process with application. Stat. Model. 2011, 11, 325–349.
4. Nastić, A.S.; Ristić, M.M.; Popović, P.M. Estimation in a bivariate integer-valued autoregressive process. Commun. Stat. Theory Methods 2016, 45, 5660–5678.
5. Khan, N.M.; Oncel Cekim, H.; Ozel, G. The family of the bivariate integer-valued autoregressive process (BINAR(1)) with Poisson–Lindley (PL) innovations. J. Stat. Comput. Simul. 2020, 90, 624–637.
6. Chen, H.; Zhu, F.; Liu, X. A new bivariate INAR(1) model with time-dependent innovation vectors. Stats 2022, 5, 819–840.
7. Popović, P.M.; Ristić, M.M.; Nastić, A.S. A geometric bivariate time series with different marginal parameters. Stat. Pap. 2016, 57, 731–753.
8. Yu, M.; Wang, D.; Yang, K.; Liu, Y. Bivariate first-order random coefficient integer-valued autoregressive processes. J. Stat. Plan. Inference 2020, 204, 153–176.
9. Su, B.; Zhu, F. Comparison of BINAR(1) models with bivariate negative binomial innovations and explanatory variables. J. Stat. Comput. Simul. 2021, 91, 1616–1634.
10. Steutel, F.W.; van Harn, K. Discrete analogues of self-decomposability and stability. Ann. Probab. 1979, 7, 893–899.
11. Al-Osh, M.A.; Aly, E.-E.A. First order autoregressive time series with negative binomial and geometric marginals. Commun. Stat. Theory Methods 1992, 21, 2483–2492.
12. Borges, P.; Molinares, F.F.; Bourguignon, M. A geometric time series model with inflated-parameter Bernoulli counting series. Stat. Probab. Lett. 2016, 119, 264–272.
13. Kachour, M.; Truquet, L. A p-order signed integer-valued autoregressive (SINAR(p)) model. J. Time Ser. Anal. 2011, 32, 223–236.
14. Bulla, J.; Chesneau, C.; Kachour, M. A bivariate first-order signed integer-valued autoregressive process. Commun. Stat. Theory Methods 2017, 46, 6590–6604.
15. Zhang, Q.; Wang, D.; Fan, X. A negative binomial thinning-based bivariate INAR(1) process. Stat. Neerl. 2020, 74, 517–537.
16. Ristić, M.M.; Bakouch, H.S.; Nastić, A.S. A new geometric first-order integer-valued autoregressive (NGINAR(1)) process. J. Stat. Plan. Inference 2009, 139, 2218–2226.
17. Kolev, N.; Minkova, L.; Neytchev, P. Inflated-parameter family of generalized power series distributions and their application in analysis of overdispersed insurance data. ARCH Res. Clear. House 2000, 2, 295–320.
18. Ristić, M.M.; Nastić, A.S.; Jayakumar, K.; Bakouch, H.S. A bivariate INAR(1) time series model with geometric marginals. Appl. Math. Lett. 2012, 25, 481–485.
19. Popović, P.M. A bivariate INAR(1) model with different thinning parameters. Stat. Pap. 2016, 57, 517–538.
20. Anderson, T.W.; Darling, D.A. A test of goodness of fit. J. Am. Stat. Assoc. 1954, 49, 765–769.
21. Gross, L. Tests for Normality, R Package Version 1.0-2. 2013. Available online: http://CRAN.R-project.org/package=nortest (accessed on 20 May 2024).
22. Popović, P.M.; Nastić, A.S.; Ristić, M.M. Residual analysis with bivariate INAR(1) models. REVSTAT-Stat. J. 2018, 16, 349–363.
23. Weiss, C.H.; Homburg, A.; Puig, P. Testing for zero inflation and overdispersion in INAR(1) models. Stat. Pap. 2019, 60, 823–848.
24. Kang, Y.; Zhu, F.; Wang, D.; Wang, S. A zero-modified geometric INAR(1) model for analyzing count time series with multiple features. Can. J. Stat. 2023.
25. Brockwell, P.J.; Davis, R.A. Introduction to Time Series and Forecasting; Springer: Berlin/Heidelberg, Germany, 2002.
26. Brockwell, P.J.; Davis, R.A. Time Series: Theory and Methods; Springer Science & Business Media: Berlin, Germany, 1991.

Figure 1. Variation of MAE and RMSE for Model (A) estimates across various sample sizes.

Figure 2. Variation of MAE and RMSE for Model (B) estimates across various sample sizes.

Figure 3. Variation of MAE and RMSE for Model (C) estimates across various sample sizes.

Figure 4. Variation of MAE and RMSE for Model (D) estimates across various sample sizes.

Figure 5. Variation of MAE and RMSE for Model (E) estimates across various sample sizes.

Figure 6. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (A) across various sample sizes.

Figure 7. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (A) across various sample sizes.

Figure 8. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (B) across various sample sizes.

Figure 9. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (B) across various sample sizes.

Figure 10. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (C) across various sample sizes.

Figure 11. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (C) across various sample sizes.

Figure 12. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (D) across various sample sizes.

Figure 13. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (D) across various sample sizes.

Figure 14. Gaussian QQ plots of the estimates of

α_{1}, α_{2}

, and

p_{1}

for Model (E) across various sample sizes.

Figure 15. Gaussian QQ plots of the estimates of

p_{2}, ρ

, and

μ

for Model (E) across various sample sizes.

Figure 16. Sample paths of OCND and OLNG series.

Figure 17. Autocorrelation function (ACF) and cross-correlation function (CCF) plots of the OCND and OLNG series.

Figure 18. Histograms of OCND and OLNG counts.

Figure 19. Sample paths of BETD and BETND series.

Figure 20. Autocorrelation function (ACF) and cross-correlation function (CCF) plots of the BETD and BETND series.

Figure 21. Histograms of BETD and BETND counts.

Table 1. Simulation results for Models (A) and (B) under different sample sizes.

		Model (A)						Model (B)
		${\hat{α}}_{1}$	${\hat{α}}_{2}$	${\hat{p}}_{1}$	${\hat{p}}_{2}$	$\hat{ρ}$	$\hat{μ}$	${\hat{α}}_{1}$	${\hat{α}}_{2}$	${\hat{p}}_{1}$	${\hat{p}}_{2}$	$\hat{ρ}$	$\hat{μ}$
Size	Metrics	0.30	0.25	0.20	0.15	0.10	5.00	0.30	0.25	0.20	0.15	0.30	5.00
150	Est.	0.2913	0.2410	0.1665	0.1028	0.1388	4.9929	0.2966	0.2413	0.1782	0.0823	0.3104	4.9649
	Bias	−0.0087	−0.0090	−0.0335	−0.0472	0.0388	−0.0071	−0.0034	−0.0087	−0.0218	−0.0677	0.0104	−0.0351
	MAE	0.0500	0.0465	0.1036	0.1341	0.1284	0.3299	0.0449	0.0465	0.1047	0.1621	0.1776	0.3656
	RMSE	0.0555	0.0527	0.1559	0.2008	0.3483	0.4113	0.0547	0.0582	0.1476	0.4420	0.2344	0.4522
300	Est.	0.2933	0.2431	0.1887	0.1351	0.1071	5.0062	0.3045	0.2521	0.1983	0.1440	0.2921	5.0100
	Bias	−0.0067	−0.0069	−0.0113	−0.0149	0.0071	0.0062	0.0045	0.0021	−0.0017	−0.0060	−0.0079	0.0100
	MAE	0.0327	0.0326	0.0638	0.0802	0.0798	0.2506	0.0298	0.0312	0.0632	0.0746	0.1149	0.2724
	RMSE	0.0344	0.0354	0.0822	0.1059	0.1004	0.3080	0.0373	0.0387	0.0821	0.0980	0.1447	0.3337
600	Est.	0.2964	0.2457	0.1985	0.1398	0.1012	4.9958	0.3009	0.2497	0.2025	0.1456	0.2975	5.0169
	Bias	−0.0036	−0.0043	−0.0015	−0.0102	0.0012	−0.0042	0.0009	−0.0003	0.0025	−0.0044	−0.0025	0.0169
	MAE	0.0234	0.0224	0.0463	0.0535	0.0579	0.1670	0.0209	0.0191	0.0420	0.0494	0.0789	0.1719
	RMSE	0.0259	0.0252	0.0568	0.0669	0.0706	0.2125	0.0259	0.0239	0.0540	0.0641	0.0990	0.2194
1200	Est.	0.2989	0.2492	0.1992	0.1449	0.1033	5.0021	0.3011	0.2495	0.1980	0.1475	0.2991	5.0012
	Bias	−0.0011	−0.0008	−0.0008	−0.0051	0.0033	0.0021	0.0011	−0.0005	−0.0020	−0.0025	−0.0009	0.0012
	MAE	0.0146	0.0143	0.0303	0.0380	0.0423	0.1252	0.0152	0.0143	0.0309	0.0353	0.0571	0.1424
	RMSE	0.0173	0.0173	0.0377	0.0478	0.0520	0.1563	0.0192	0.0178	0.0391	0.0445	0.0724	0.1784
3000	Est.	0.2998	0.2505	0.1990	0.1492	0.1001	5.0058	0.2999	0.2497	0.1992	0.1485	0.2997	4.9988
	Bias	−0.0002	0.0005	−0.0010	−0.0008	0.0001	0.0058	−0.0001	−0.0003	−0.0008	−0.0015	−0.0003	−0.0012
	MAE	0.0118	0.0111	0.0242	0.0286	0.0335	0.0964	0.0097	0.0097	0.0182	0.0220	0.0343	0.0893
	RMSE	0.0095	0.0087	0.0193	0.0228	0.0266	0.0774	0.0122	0.0120	0.0229	0.0277	0.0421	0.1130
	AD	0.4014	0.4446	0.6456	0.4647	0.2335	0.1660	0.2979	0.4877	0.3091	0.4579	0.7324	0.4977
	p-value	0.3585	0.2832	0.0916	0.2534	0.7957	0.9395	0.5872	0.2227	0.5569	0.2633	0.0559	0.2106
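
For readers reproducing the summary rows, the sketch below computes Est., Bias, MAE, and RMSE from a vector of replicated estimates, assuming the usual definitions (mean estimate, mean error, mean absolute error, and root mean squared error); the definitions are not restated in this section, so this is an assumption. The function name `summarize_estimates` is hypothetical, and the draws are synthetic placeholders, not the replications behind Table 1.

```python
import numpy as np

def summarize_estimates(estimates, true_value):
    """Summary metrics over Monte Carlo replications (assumed definitions)."""
    est = np.asarray(estimates, dtype=float)
    err = est - true_value
    return {
        "Est.": est.mean(),                  # average estimate
        "Bias": err.mean(),                  # average error
        "MAE": np.abs(err).mean(),           # mean absolute error
        "RMSE": np.sqrt((err ** 2).mean()),  # root mean squared error
    }

# Synthetic placeholder draws around a true value of 0.30 (illustration only).
rng = np.random.default_rng(0)
placeholder_draws = 0.30 + 0.02 * rng.standard_normal(1000)
summary = summarize_estimates(placeholder_draws, 0.30)
```

Under these definitions, Bias equals Est. minus the true value, and RMSE is never smaller than MAE.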

Table 2. Simulation results for Model (C) under different sample sizes.

		Model (C)
		${\hat{α}}_{1}$	${\hat{α}}_{2}$	${\hat{p}}_{1}$	${\hat{p}}_{2}$	$\hat{ρ}$	$\hat{μ}$
Size	Metrics	0.40	0.40	0.20	0.15	0.25	5.00
150	Est.	0.3990	0.3962	0.1860	0.1320	0.2546	5.0048
	Bias	−0.0010	−0.0038	−0.0140	−0.0180	0.0046	0.0048
	MAE	0.0383	0.0382	0.0767	0.0833	0.0985	0.4698
	RMSE	0.0469	0.0476	0.1018	0.1178	0.1234	0.5860
300	Est.	0.3997	0.4005	0.1957	0.1492	0.2485	5.0276
	Bias	−0.0003	0.0005	−0.0043	−0.0008	−0.0015	0.0276
	MAE	0.0262	0.0270	0.0528	0.0497	0.0651	0.3519
	RMSE	0.0333	0.0339	0.0675	0.0642	0.0811	0.4382
600	Est.	0.4002	0.4007	0.1956	0.1492	0.2523	5.0087
	Bias	0.0002	0.0007	−0.0044	−0.0008	0.0023	0.0087
	MAE	0.0188	0.0190	0.0346	0.0351	0.0480	0.2383
	RMSE	0.0234	0.0239	0.0437	0.0433	0.0620	0.3015
1200	Est.	0.4004	0.4001	0.1978	0.1505	0.2479	4.9909
	Bias	0.0004	0.0001	−0.0022	0.0005	−0.0021	−0.0091
	MAE	0.0132	0.0120	0.0244	0.0251	0.0333	0.1745
	RMSE	0.0166	0.0151	0.0309	0.0313	0.0415	0.2153
3000	Est.	0.4005	0.4004	0.2008	0.1493	0.2490	5.0044
	Bias	0.0005	0.0004	0.0008	−0.0007	−0.0010	0.0044
	MAE	0.0079	0.0079	0.0155	0.0150	0.0208	0.1091
	RMSE	0.0101	0.0098	0.0195	0.0186	0.0258	0.1347
	AD	0.2718	0.2061	0.4386	0.2595	0.4890	0.5042
	p-value	0.6703	0.8696	0.2928	0.7118	0.2211	0.2029

Table 3. Simulation results for Models (D) and (E) under different sample sizes.

		Model (D)						Model (E)
		${\hat{α}}_{1}$	${\hat{α}}_{2}$	${\hat{p}}_{1}$	${\hat{p}}_{2}$	$\hat{ρ}$	$\hat{μ}$	${\hat{α}}_{1}$	${\hat{α}}_{2}$	${\hat{p}}_{1}$	${\hat{p}}_{2}$	$\hat{ρ}$	$\hat{μ}$
Size	Metrics	0.60	0.40	0.30	0.70	0.30	3.00	0.60	0.40	0.70	0.30	0.30	3.00
150	Est.	0.5989	0.3943	0.3025	0.6947	0.3106	2.9614	0.5961	0.3931	0.6987	0.2861	0.3111	2.9720
	Bias	−0.0011	−0.0057	0.0025	−0.0053	0.0106	−0.0386	−0.0039	−0.0069	−0.0013	−0.0139	0.0111	−0.0280
	MAE	0.0397	0.0441	0.0581	0.0856	0.0815	0.3666	0.0379	0.0458	0.0588	0.0856	0.0807	0.3892
	RMSE	0.0488	0.0547	0.0746	0.1103	0.1119	0.4651	0.0479	0.0583	0.0744	0.1111	0.1179	0.4905
300	Est.	0.6013	0.3991	0.2950	0.7052	0.2949	2.9846	0.5989	0.3986	0.7014	0.2971	0.2999	2.9953
	Bias	0.0013	−0.0009	−0.0050	0.0052	−0.0051	−0.0154	−0.0011	−0.0014	0.0014	−0.0029	−0.0001	−0.0047
	MAE	0.0255	0.0302	0.0417	0.0554	0.0456	0.2589	0.0256	0.0277	0.0395	0.0582	0.0495	0.2759
	RMSE	0.0322	0.0378	0.0514	0.0713	0.0575	0.3311	0.0322	0.0357	0.0497	0.0725	0.0647	0.3487
600	Est.	0.5994	0.3975	0.2995	0.7007	0.3012	2.9896	0.5989	0.4007	0.7000	0.3007	0.2992	2.9992
	Bias	−0.0006	−0.0025	−0.0005	0.0007	0.0012	−0.0104	−0.0011	0.0007	0.0000	0.0007	−0.0008	−0.0008
	MAE	0.0173	0.0211	0.0285	0.0433	0.0344	0.1738	0.0176	0.0205	0.0289	0.0393	0.0350	0.1881
	RMSE	0.0219	0.0269	0.0355	0.0535	0.0430	0.2170	0.0219	0.0256	0.0363	0.0499	0.0439	0.2390
1200	Est.	0.6004	0.4013	0.2999	0.6981	0.2984	3.0039	0.5999	0.3979	0.6994	0.3001	0.2999	3.0026
	Bias	0.0004	0.0013	−0.0001	−0.0019	−0.0016	0.0039	−0.0001	−0.0021	−0.0006	0.0001	−0.0001	0.0026
	MAE	0.0122	0.0144	0.0208	0.0273	0.0237	0.1270	0.0126	0.0156	0.0204	0.0290	0.0243	0.1447
	RMSE	0.0156	0.0179	0.0262	0.0354	0.0296	0.1602	0.0160	0.0197	0.0257	0.0365	0.0301	0.1791
3000	Est.	0.6000	0.4005	0.2998	0.6982	0.2998	3.0030	0.6007	0.3997	0.7000	0.3002	0.2982	3.0000
	Bias	0.0000	0.0005	−0.0002	−0.0018	−0.0002	0.0030	0.0007	−0.0003	0.0000	0.0002	−0.0018	0.0000
	MAE	0.0081	0.0093	0.0126	0.0183	0.0152	0.0836	0.0079	0.0093	0.0122	0.0181	0.0145	0.0856
	RMSE	0.0102	0.0118	0.0155	0.0232	0.0192	0.1054	0.0098	0.0117	0.0153	0.0226	0.0180	0.1074
	AD	0.1186	0.3962	0.2291	0.1378	0.5815	0.7015	0.7048	0.4253	0.2718	0.2414	0.2766	0.5041
	p-value	0.9898	0.3688	0.8088	0.9763	0.1297	0.0667	0.0654	0.3150	0.6705	0.7713	0.6542	0.2031

Table 4. Descriptive statistics of OCND and OLNG series.

Crime	Min	Max	Median	Mean	Var	$I_{d}$	$z_{0}$
Offensive Conduct (OCND)	0	5	0	0.3448	0.5551	1.6098	0.6616
Offensive Language (OLNG)	0	4	0	0.2960	0.3992	1.3487	0.6438
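
The I_d column is consistent with the index of dispersion, that is, the ratio of the sample variance to the sample mean. This reading is an assumption, since the definition is not restated in this section; the short sketch below recomputes it from the Mean and Var columns of Table 4. The function name is hypothetical.

```python
def dispersion_index(mean, variance):
    """Variance-to-mean ratio; values above 1 indicate overdispersion."""
    return variance / mean

# Mean and Var taken from Table 4.
id_ocnd = dispersion_index(0.3448, 0.5551)
id_olng = dispersion_index(0.2960, 0.3992)
```

Both ratios agree with the tabulated I_d values up to rounding of the inputs, and both exceed one, which matches the overdispersion discussed for these series.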

Table 5. Fitting results of the monthly OCND and OLNG counts across different models.

Estimate	$ρ$ -BVGINAR(1)	BVNGINAR(1)	BVPOINAR(1)	BVMIXINAR(1)
${\hat{α}}_{1}$	0.1278	0.2364	0.0769	0.3306
${\hat{α}}_{2}$	0.3435	0.1695	0.5340	0.2339
${\hat{p}}_{1}$	0.5724	0.2289	0.6902	0.3071
${\hat{p}}_{2}$	0.6263	0.2572	0.2078	0.0961
$\hat{ρ}$	0.0777	–	–	–
$\hat{μ}$	0.3448	0.3448	0.3448	0.3448
LogLik	−489.64	−492.15	−492.25	−526.14
AIC	991.27	994.31	994.51	1062.29
BIC	1008.53	1013.57	1013.77	1081.55
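
The AIC row can be recovered from the LogLik row with AIC = 2k - 2 LogLik, where k is the number of estimated parameters. In the sketch below, k = 6 for the ρ-BVGINAR(1) model and k = 5 for the other three models; these counts are an assumption read off from the non-empty estimate rows of Table 5, and the function name is hypothetical.

```python
def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2.0 * n_params - 2.0 * loglik

# (LogLik from Table 5, assumed number of parameters k)
fits = {
    "rho-BVGINAR(1)": (-489.64, 6),
    "BVNGINAR(1)": (-492.15, 5),
    "BVPOINAR(1)": (-492.25, 5),
    "BVMIXINAR(1)": (-526.14, 5),
}
aic_values = {name: aic(ll, k) for name, (ll, k) in fits.items()}

# The preferred model is the one with the smallest AIC.
best_model = min(aic_values, key=aic_values.get)
```

The recomputed values match the tabulated AIC to within 0.01, the difference being attributable to rounding of the reported log-likelihoods.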

Table 6. Descriptive statistics of BETD and BETND series.

Crime	Min	Max	Median	Mean	Var	$I_{d}$	$z_{0}$
Break and Enter Thefts into Dwellings (BETD)	0	40	4	5.5431	25.7244	4.6408	0.7587
Break and Enter Thefts into Non-Dwellings (BETND)	0	22	3	4.0517	15.8417	3.9099	0.7681

Table 7. Fitting results of the monthly BETD and BETND counts across different models.

Estimate	$ρ$ -BVGINAR(1)	BVNGINAR(1)	BVPOINAR(1)	BVMIXINAR(1)
${\hat{α}}_{1}$	0.2434	0.7028	0.3483	0.6987
${\hat{α}}_{2}$	0.2686	0.6855	0.5614	0.7911
${\hat{p}}_{1}$	0.7451	0.3911	0.9741	0.4548
${\hat{p}}_{2}$	0.2428	0.2431	0.3223	0.4693
$\hat{ρ}$	1.5649	–	–	–
$\hat{μ}$	5.5431	5.5431	5.5431	5.5431
LogLik	−1742.96	−1761.20	−1987.37	−2216.58
AIC	3497.91	3532.40	3984.75	4443.16
BIC	3515.17	3551.66	4004.01	4462.42

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
