Abstract
While overdispersion is a common phenomenon in univariate count time series data, its exploration within bivariate contexts remains limited. To fill this gap, we propose a bivariate integer-valued autoregressive model. The model leverages a modified binomial thinning operator with a dispersion parameter and integrates random coefficients. This approach combines characteristics from both binomial and negative binomial thinning operators, thereby offering a flexible framework capable of generating counting series exhibiting equidispersion, overdispersion, or underdispersion. Notably, our model includes two distinct classes of first-order bivariate geometric integer-valued autoregressive models: one class employs binomial thinning (BVGINAR(1)), and the other adopts negative binomial thinning (BVNGINAR(1)). We establish the stationarity and ergodicity of the model and estimate its parameters using a combination of the Yule–Walker (YW) and conditional maximum likelihood (CML) methods. Furthermore, Monte Carlo simulation experiments are conducted to evaluate the finite sample performances of the proposed estimators across various parameter configurations, and the Anderson-Darling (AD) test is employed to assess the asymptotic normality of the estimators under large sample sizes. Ultimately, we highlight the practical applicability of the examined model by analyzing two real-world datasets on crime counts in New South Wales (NSW) and comparing its performance with other popular overdispersed BINAR(1) models.
MSC:
62M10
1. Introduction
Bivariate count data are prevalent across various scenarios and frequently represent the occurrences of two distinct events, objects, or individuals over a specific time frame. Bivariate integer-valued time series models, in particular, excel at preserving the paired relationship between two correlated count variables observed over certain time intervals. Such data arise in many fields, including guest nights in hotels and cottages [1], tick-by-tick data of two highly traded stocks [2], traffic accidents happening both in daylight and nighttime [3], the occurrence of different offenses in specific areas [4], and the counts of asymptomatic and symptomatic COVID-19 cases within considered regions.
Substantial research interest has been directed toward the analysis of bivariate integer-valued time series, particularly regarding their cross-correlated nature. One direction is to describe the correlation between series by employing different bivariate innovation distributions. For instance, [3] proposed a bivariate diagonal INAR(1) model with bivariate Poisson and bivariate negative binomial innovations. Similarly, [5] found that BINAR(1) models with Poisson–Lindley (PL) innovations outperform other competing INAR(1) models, whether based on diagonal or full coefficient matrices. Furthermore, [6] presented a bivariate full INAR(1) model, assuming a time-dependent innovation vector, where the mean of the innovation vector linearly increases with the previous population size. However, these models are all based on constant coefficients.
To introduce more flexibility into BINAR models, [7] considered inference for a bivariate random coefficient INAR(1) model with different geometric marginals. Thereafter, [8] proposed a more general bivariate diagonal random coefficient INAR(1) process (BRCINAR(1)) with dependent innovations. Moreover, [9] compared the performances of random coefficient BINAR(1) models based on the bivariate negative binomial distributions constructed in different ways with explanatory variables.
While these studies have made significant strides in enhancing BINAR(1) model flexibility through various marginal and innovation distributions, they have primarily focused on binomial thinning operators. The binomial thinning operator, initially proposed by [10], remains the most widely utilized approach in modeling integer-valued time series due to its capability of producing integer-valued results and offering strong interpretability. Denoted as “∘”, this operator is defined as follows:
$$\alpha \circ X = \sum_{i=1}^{X} B_i,$$
where $\alpha \in [0,1]$, X represents a non-negative integer-valued random variable, and $\{B_i\}$ denotes a sequence of independent and identically distributed Bernoulli random variables. Each $B_i$ has a probability $P(B_i = 1) = \alpha$ and $P(B_i = 0) = 1 - \alpha$, independent of X. Essentially, the binomial thinning operator assigns a value of either 0 or 1 to each counting random variable $B_i$, making it suitable for modeling scenarios where random events either survive or vanish after a period of observation.
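To make the survive-or-vanish interpretation concrete, the following is a minimal simulation sketch of binomial thinning. Python is used purely for illustration, and the function name `binomial_thin` and its arguments are assumptions introduced here, not notation from the paper.

```python
import random

def binomial_thin(alpha, x, rng=random):
    """Return one draw of alpha ∘ x: the number of survivors among x units,
    each retained independently with probability alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if x < 0:
        raise ValueError("x must be a non-negative integer")
    # Sum of x i.i.d. Bernoulli(alpha) indicators.
    return sum(1 for _ in range(x) if rng.random() < alpha)

# Hypothetical usage: thin a count of 10 with survival probability 0.3.
rng = random.Random(2024)
sample = [binomial_thin(0.3, 10, rng) for _ in range(5)]
```

Because each unit contributes at most one survivor, a draw can never exceed the original count x.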
However, in situations where the observed unit has the potential to generate multiple countable elements or trigger further stochastic occurrences beyond mere survival or disappearance, the Bernoulli random variable may not be the most suitable choice for constructing the counting series. For example, in the context of infectious diseases, an infected individual may not only survive or die but may also contribute to the generation of new cases. To address this limitation, [11] introduced the negative binomial thinning operator, defined as:
$$\alpha * X = \sum_{i=1}^{X} W_i,$$
where $\alpha \in [0,1)$, and $\{W_i\}$ represents a sequence of independent and identically distributed geometric random variables with a mean of $\alpha$, also independent of X.
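As a rough illustration of how negative binomial thinning differs from binomial thinning, the sketch below draws the counting variables from a geometric law on {0, 1, 2, ...}. It assumes the parameterization in which each counting variable has mean alpha, that is, P(W = k) = alpha^k / (1 + alpha)^(k + 1). The function names are hypothetical.

```python
import math
import random

def geometric_with_mean(alpha, rng=random):
    """Draw W on {0, 1, 2, ...} with P(W >= k) = q**k, q = alpha / (1 + alpha),
    so that E[W] = alpha (assumed parameterization)."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if alpha == 0:
        return 0
    q = alpha / (1.0 + alpha)
    u = 1.0 - rng.random()            # uniform on (0, 1], avoids log(0)
    # Inversion: floor(log(u) / log(q)) has the required geometric law.
    return int(math.log(u) / math.log(q))

def negative_binomial_thin(alpha, x, rng=random):
    """Return one draw of alpha * x: a sum of x i.i.d. geometric variables."""
    if x < 0:
        raise ValueError("x must be a non-negative integer")
    return sum(geometric_with_mean(alpha, rng) for _ in range(x))
```

Unlike binomial thinning, a single unit can contribute more than one offspring here, so a draw may exceed the original count x.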
In fact, while there are many other thinning operators proven useful in univariate integer-valued time series [12,13], the literature on different operators applied to bivariate data is limited. As far as we know, a study by [14] constructed a BINAR(1) model based on the signed thinning operator, which is capable of accommodating data with negative observations. In addition, another study by [15] proposed a new BINAR(1) model that extended the negative binomial thinning operator.
The aim of this paper is to introduce the more flexible ρ-binomial thinning operator [12] to bivariate integer-valued time series and to explore statistical inference for the proposed model. The motivation lies in the versatility of the counting series derived from this thinning operator. Specifically, this thinning operator enables us to describe equidispersion, overdispersion, or underdispersion characteristics concurrently within both counting processes. Consequently, the ρ-thinning operator extends the capabilities of the binomial thinning [10] and negative binomial thinning operators [16], offering greater flexibility in fitting paired count data.
The remainder of the paper is organized as follows. In Section 2, we introduce a definition of the ρ-BVGINAR(1) model and discuss the basic properties of the thinning operator for bivariate vectors. The properties of the model are further examined in Section 3. Section 4 estimates the model parameters by integrating the Yule–Walker (YW) and conditional maximum likelihood (CML) methods, followed by simulation studies to explore the finite-sample performance of the estimators under various parameter combinations. Moreover, the Anderson-Darling (AD) test is performed to assess the asymptotic normality of the estimators under large sample sizes. Section 5 illustrates the application of the proposed models to two real-world datasets of crime counts in New South Wales. We examine datasets with varying levels of overdispersion indices and compare the performance of our models with other bivariate integer-valued models. Finally, Section 6 provides concluding remarks and an outlook for future research. Some proofs and figures are provided in the Appendix for reference.
2. Construction of the Model
Ref. [17] introduced a novel variant of the Bernoulli distribution, termed an inflated-parameter Bernoulli (IBe) distribution, designed to model univariate count data exhibiting overdispersion. This distribution incorporates an additional parameter, ρ, allowing for more flexible dispersion indices. The probability mass function (pmf) for the IBe distribution is defined as follows:
where and . The distribution can be denoted as . The mean, variance, and probability generating function (pgf) of the IBe distribution are detailed below:
Moreover, the dispersion index is expressed as:
Interestingly, the IBe distribution presents three dispersion scenarios depending on the values of the parameters:
- Overdispersion is observed when .
- Underdispersion occurs if .
- Equidispersion is achieved at .
The distribution can also degenerate into two important distributions:
- The standard Bernoulli distribution, when and , with mean ;
- The geometric distribution, when , with mean .
Inspired by these properties, [12] formulated a ρ-binomial thinning operator, defined as:
where X is a non-negative integer-valued random variable, and is a sequence of inflated-parameter Bernoulli random variables with the pmf given by Equation (1), independent of X. This operator has proven its practical utility in modeling univariate integer-valued time series, known as the ρ-GINAR process.
However, it is crucial to acknowledge that cross-correlations are prevalent in most paired count time series. Hence, this paper aims to enhance the ρ-GINAR process by extending it into a bivariate domain, thereby more effectively capturing the inherent correlations within the data.
We propose the ρ-BVGINAR(1) model, which is a novel bivariate random coefficient INAR(1) process characterized by the following recursive equation:
Here,
- (i)
- represents a random coefficient matrix comprising two mutually independent bivariate random vectors, and , each with independent and identically distributed (i.i.d.) components and with pmf values as: where and . The matrix operation replicates matrix multiplication while preserving the properties of random coefficient thinning.
- (ii)
- The innovation is a sequence of i.i.d. bivariate non-negative integer-valued random vectors with mutually independent elements and and independent of for .
Remark 1.
The proposed bivariate INAR(1) process based on the ρ-binomial thinning operator has two sub-models:
- When , it corresponds to the bivariate INAR(1) with geometric marginals introduced by [4].
- When , for both and , it aligns with the BVNGINAR(1) proposed by [18].
Next, we further explore the properties of the ρ-thinning operator for vectors.
Lemma 1.
Consider as defined in Equation (4). Then:
- (i)
- , where
- (ii)
- for a random vector independent of .
- (iii)
- for a random vector independent of .
- (iv)
- , where has elements
Lemma 2.
If we assume , then all eigenvalues of matrix lie within the unit circle.
Proof.
Similar to [19], we outline the key steps. Let and denote the eigenvalues, with assumed without loss of generality.
- (i)
- We have , indicating .
- (ii)
- Furthermore, we calculate ; then, we have Hence, it necessitates that either both or both . Since , we deduce both .
- (iii)
- Similarly, evaluating . As , then .
Consequently, we conclude and under the conditions . Moreover, this condition ensures the stationarity of both the first- and second-order moments of the process. □
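The eigenvalue condition in Lemma 2 can be checked numerically for any candidate 2×2 matrix. The following is a minimal sketch using the characteristic polynomial. The matrix entries in the usage line are hypothetical and are not taken from the paper.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial
    lambda^2 - (a + d) * lambda + (a * d - b * c) = 0."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

def inside_unit_circle(a, b, c, d):
    """True when both eigenvalues have modulus strictly below one."""
    return all(abs(lam) < 1.0 for lam in eigenvalues_2x2(a, b, c, d))

# Hypothetical non-negative matrix with small entries.
stable = inside_unit_circle(0.30, 0.10, 0.20, 0.25)
```

Using `cmath.sqrt` keeps the check valid even when the discriminant is negative and the eigenvalues are complex.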
Proposition 1.
If and , a strictly stationary bivariate integer-valued time series satisfying Equation (4) exists uniquely. Moreover, the process is ergodic.
The proof of Proposition 1 is provided in Appendix B. Now let us derive the moments and conditional moments of the ρ-BVGINAR(1) process.
Proposition 2.
Suppose the bivariate time series is a stationary process defined by Equation (4); then, for , we have
- (i)
- .
- (ii)
- , where has elements
- (iii)
- .
- (iv)
where and are the mean and variance, respectively, of , for .
We define a stationary bivariate time series according to Equation (4). By specifying appropriate marginal distributions for and , we can deduce the respective marginal distributions of the innovation and . This process is clarified by the theorem below.
Theorem 1.
Let , if , and ; then the distributions of the innovation processes and are as follows:
Proof.
For intuitive purposes, the bivariate time series model can be represented as
Given and leveraging the properties of the thinning operator and the stationarity of the process, we have
Then the pgf of can be obtained by
The innovation clearly consists of a combination of two geometrically distributed random variables. A similar derivation holds for . Thus, the distributions of the innovation processes can be expressed as Equations (5) and (6). Notably, is necessary to ensure the non-negativity of all probabilities for and . □
3. Properties
Lemma 3.
Let , and . The correlation coefficient between and lies in and is expressed as:
It is evident that the correlation coefficient γ lies in the interval .
From Lemma 1, we find that the covariance matrix between the random vectors and is given by
The one-step-ahead conditional expectation can be derived as follows:
More generally, the k-step-ahead conditional expectation of is
Meanwhile, we observe that , thereby implying that
This finding validates the characteristic of autoregressive processes, whereby the conditional expectation converges to the unconditional expectation as the number of steps approaches infinity.
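The convergence of the k-step-ahead conditional mean can be illustrated numerically. The sketch below assumes only that the one-step conditional mean is linear in the previous observation, E[X_{t+1} | X_t] = A X_t + b, for a 2×2 mean matrix A and an innovation mean vector b. Both A and b are hypothetical placeholders rather than the paper's expressions.

```python
def k_step_mean(A, b, x, k):
    """Iterate m <- A m + b exactly k times, starting from m = x."""
    m = list(x)
    for _ in range(k):
        m = [A[0][0] * m[0] + A[0][1] * m[1] + b[0],
             A[1][0] * m[0] + A[1][1] * m[1] + b[1]]
    return m

def stationary_mean(A, b):
    """Solve (I - A) m = b for a 2x2 matrix A by Cramer's rule."""
    p, q = 1.0 - A[0][0], -A[0][1]
    r, s = -A[1][0], 1.0 - A[1][1]
    det = p * s - q * r
    return [(s * b[0] - q * b[1]) / det, (p * b[1] - r * b[0]) / det]

# Hypothetical values: the forecast started far from the mean drifts to it.
A = [[0.30, 0.10], [0.20, 0.25]]
b = [2.0, 3.0]
limit = stationary_mean(A, b)
forecast = k_step_mean(A, b, [10.0, 0.0], 50)
```

When the eigenvalues of A lie inside the unit circle, the influence of the starting value decays geometrically, so `forecast` and `limit` agree to numerical precision for moderately large k.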
Due to the conditional independence of the random variables and conditioned on and , respectively, the conditional probability function can be represented as the product of individual conditional probabilities. Therefore, we can derive the conditional probability function of the random vector as follows:
where
and
Here, the distributions of the random variables and are defined by Theorem 1, so their probability mass functions are given by
and
4. Estimation Procedure
In this section, we consider as a strictly stationary and ergodic solution of the ρ-BVGINAR(1) process, with representing a series of observations generated from this process. We discuss the estimation of the six model parameters: one for the thinning distribution (), two for the autocorrelation coefficients (), two for specifying the dependence between processes and (), and one for the marginal distributions (). Considering the unique characteristics of these parameters, we integrate two estimation approaches: the Yule–Walker (YW) and the conditional maximum likelihood (CML) methods.
The sample mean is commonly employed for estimating model parameters in time series analysis. Since the model assumption is that the marginal distribution , then . Thus, the reasonable estimate would be:
Theorem 2.
The estimator is strongly consistent.
Proof.
Proposition 1 proved that process is stationary and ergodic. Then, processes and are jointly stationary, which implies that is also stationary and ergodic. We have
Therefore, the estimator is strongly consistent, and the proof of Theorem 2 is complete. □
Theorem 3.
The estimator is asymptotically normally distributed with parameters , where and .
In addition, the conditional maximum likelihood (CML) stands out as one of the most commonly employed techniques for parameter estimation. The CML estimator of parameter vector is the value that maximizes the conditional log-likelihood function . Suppose that is fixed. The conditional log-likelihood function is given by
Under the given conditions, the conditional probability mass functions of processes and can be represented as the products of their respective conditional probabilities. These probabilities result from the convolution of the ρ-binomial distribution and the probability mass function of the corresponding innovation processes. Specifically,
The likelihood function reveals that the parameters , and often interact multiplicatively, posing significant challenges for the optimization process. To address potential issues with parameter identifiability and leverage the specific characteristics of these parameters, we implement a stepwise optimization strategy. First, we estimate using the Yule–Walker method. This estimate is then substituted into the likelihood function, which is maximized over the remaining parameters. Reducing the dimension of the search in this way helps mitigate the risk of converging to local optima, a prevalent concern with non-convex objective functions.
For numerical maximization, we employ the “nlm” function in R (since “nlm” is a minimizer, the objective passed to it is the negative conditional log-likelihood). All computational experiments were performed using R version 4.0.3 on a system equipped with an Intel Xeon Gold 6154 processor (Intel Corporation, Santa Clara, CA, USA) and 256 GB of RAM.
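The idea of writing each transition probability as a convolution of a thinning pmf with an innovation pmf can be sketched on a deliberately simplified stand-in: a univariate INAR(1) process with ordinary binomial thinning and Poisson innovations. This is not the ρ-BVGINAR(1) likelihood. The model choice, function names, and parameter values below are assumptions made only to show the structure of the computation.

```python
import math

def binomial_pmf(k, n, p):
    """P(Bin(n, p) = k), zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(Poisson(lam) = k), zero for negative k."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam**k / math.factorial(k)

def transition_prob(x, y, alpha, lam):
    """P(X_t = x | X_{t-1} = y): convolution of the thinned part
    (k survivors out of y) with the innovation (x - k new arrivals)."""
    return sum(binomial_pmf(k, y, alpha) * poisson_pmf(x - k, lam)
               for k in range(min(x, y) + 1))

def conditional_loglik(series, alpha, lam):
    """Conditional log-likelihood given the first observation."""
    return sum(math.log(transition_prob(x, y, alpha, lam))
               for y, x in zip(series[:-1], series[1:]))

def negative_loglik(series, alpha, lam):
    """Objective in the form expected by a minimizer."""
    return -conditional_loglik(series, alpha, lam)
```

In a stepwise scheme of the kind described above, one parameter would be fixed at a moment-based estimate before `negative_loglik` is minimized over the remaining ones.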
Next, we present the detailed simulation study design and results. We generated ρ-BVGINAR(1) samples with various model parameterizations and sample sizes , where is close to the length of the crime counts that will be analyzed in Section 5. We considered the following parameter configurations:
- Model (A): () = (0.3, 0.25, 0.2, 0.15, 0.1, 5);
- Model (B): () = (0.3, 0.25, 0.2, 0.15, 0.3, 5);
- Model (C): () = (0.4, 0.4, 0.2, 0.15, 0.25, 5);
- Model (D): () = (0.6, 0.4, 0.3, 0.7, 0.3, 3);
- Model (E): () = (0.6, 0.4, 0.7, 0.3, 0.3, 3).
Recalling the properties of IBe random variables discussed in Section 2, we explored diverse parameter combinations of , , and in our simulation study. Models (A), (B), and (C) represent scenarios of underdispersion (), overdispersion (), and equidispersion (), respectively, of the IBe random variables and under ρ-thinning. Conversely, Models (D) and (E) are characterized by distinct dispersion patterns of and . Specifically, Model (D) sets underdispersion of and overdispersion of (, ), while Model (E) shows overdispersion of and underdispersion of (, ).
To assess estimation performance, we employed two widely used criteria: mean absolute error (MAE) and root mean squared error (RMSE), based on replications for each model parametrization. MAE is less sensitive to occasional extreme estimates, whereas RMSE squares the errors and therefore penalizes larger deviations more heavily. Reporting both gives a fuller picture of estimation accuracy. They are defined as follows:
where is the estimate of at the m-th replication.
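For concreteness, the two criteria can be computed as in the following sketch, which assumes the usual definitions: the average absolute deviation and the square root of the average squared deviation of the replicated estimates from the true parameter value. The function names and the numbers in the usage lines are hypothetical.

```python
import math

def mae(estimates, true_value):
    """Mean absolute error of replicated estimates around the true value."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)

def rmse(estimates, true_value):
    """Root mean squared error of replicated estimates around the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates)
                     / len(estimates))

# Hypothetical estimates from four replications of a parameter equal to 0.3.
replicated = [0.28, 0.31, 0.35, 0.26]
mae_value = mae(replicated, 0.3)
rmse_value = rmse(replicated, 0.3)
```

RMSE is never smaller than MAE for the same set of errors, and the gap widens when a few replications are far from the true value.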
In addition, to examine the asymptotic normality of the estimators, we conducted a goodness-of-fit test for normality. The Anderson-Darling (AD) test, proposed by [20], assesses whether data come from a specified distribution, typically the normal distribution. Compared with tests such as the Kolmogorov–Smirnov test, the AD statistic gives more weight to the tails of the distribution, which makes it more sensitive to departures from normality in the tails. Therefore, we selected the AD test and applied it using the procedure from the R package “nortest” authored by [21].
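The statistic behind the AD test can be sketched in a few lines. The code below computes only the basic A² statistic for normality with the mean and standard deviation estimated from the sample. It does not reproduce the small-sample adjustment or the p-value calculation that a package such as “nortest” reports, and the function name is an assumption.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(sample):
    """Basic A^2 statistic for normality with estimated mean and sd."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mu, sd = mean(sample), stdev(sample)
    if sd == 0:
        raise ValueError("sample has zero variance")
    std_normal = NormalDist()
    z = sorted((v - mu) / sd for v in sample)
    # Clamp the fitted cdf away from 0 and 1 so the logarithms stay finite.
    cdf = [min(max(std_normal.cdf(v), 1e-12), 1.0 - 1e-12) for v in z]
    s = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
            for i in range(n))
    return -n - s / n

# Deterministic illustration: normal quantiles versus exponential quantiles.
n = 200
probs = [(i + 0.5) / n for i in range(n)]
normal_like = [NormalDist().inv_cdf(p) for p in probs]
skewed = [-math.log(1.0 - p) for p in probs]
a2_normal = anderson_darling_normal(normal_like)
a2_skewed = anderson_darling_normal(skewed)
```

The logarithmic terms grow quickly when extreme order statistics fall where the fitted normal cdf is close to 0 or 1, which is the source of the statistic's sensitivity in the tails.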
Table 1, Table 2 and Table 3 report the estimates, biases, MAEs, and RMSEs for Models (A)–(E) across various sample sizes, as well as the AD test statistics (denoted AD in the tables) and corresponding p-values for . From Table 1, we observe that the biases, MAEs, and RMSEs of the estimates for Models (A) and (B) decrease as the sample size n increases, as expected. Figure 1 and Figure 2 also illustrate the notable downward trends in MAEs and RMSEs of the estimates, implying the consistency of the proposed estimators with increasing values of n. A similar conclusion can be drawn from Table 2 and Table 3, along with the corresponding visual curves in Figure 3, Figure 4 and Figure 5. Based on the above discussions, we conclude that the estimation method can produce reliable parameter estimators.
Table 1.
Simulation results for Models (A) and (B) under different sample sizes.
Table 2.
Simulation results of Model (C) under different sample sizes.
Table 3.
Simulation results of Models (D) and (E) under different sample sizes.
Figure 1.
Variation of MAE and RMSE for Model (A) estimates across various sample sizes.
Figure 2.
Variation of MAE and RMSE for Model (B) estimates across various sample sizes.
Figure 3.
Variation of MAE and RMSE for Model (C) estimates across various sample sizes.
Figure 4.
Variation of MAE and RMSE for Model (D) estimates across various sample sizes.
Figure 5.
Variation of MAE and RMSE for Model (E) estimates across various sample sizes.
Figure 6, Figure 7, Figure 8, Figure 9, Figure 10, Figure 11, Figure 12, Figure 13, Figure 14 and Figure 15 display the Gaussian QQ plots of the proposed estimators for Models (A)–(E) across various sample sizes. For small sample sizes, particularly when , the points deviate noticeably from the 45-degree reference line. However, as n increases, the points align more closely with the 45-degree line, indicating good agreement between the distribution of the estimators and the normal distribution. Furthermore, all p-values of the AD normality test for the estimates when for Models (A)–(E) are greater than the significance level of 0.05, as shown in Table 1, Table 2 and Table 3, so normality is not rejected, which is consistent with the asymptotic normality of the estimators for large sample sizes. Based on these results, we conclude that the proposed estimation method is reliable for the models under consideration and yields estimators that are approximately normal in large samples.
Figure 6.
Gaussian QQ plots of the estimates of , and for Model (A) across various sample sizes.
Figure 7.
Gaussian QQ plots of the estimates of , and for Model (A) across various sample sizes.
Figure 8.
Gaussian QQ plots of the estimates of , and for Model (B) across various sample sizes.
Figure 9.
Gaussian QQ plots of the estimates of , and for Model (B) across various sample sizes.
Figure 10.
Gaussian QQ plots of the estimates of , and for Model (C) across various sample sizes.
Figure 11.
Gaussian QQ plots of the estimates of , and for Model (C) across various sample sizes.
Figure 12.
Gaussian QQ plots of the estimates of , and for Model (D) across various sample sizes.
Figure 13.
Gaussian QQ plots of the estimates of , and for Model (D) across various sample sizes.
Figure 14.
Gaussian QQ plots of the estimates of , and for Model (E) across various sample sizes.
Figure 15.
Gaussian QQ plots of the estimates of , and for Model (E) across various sample sizes.
5. Real Data Examples
In this section, we present two applications to demonstrate the effectiveness of our proposed ρ-BVGINAR(1) model in capturing overdispersion and zero inflation phenomena. We compare our model with existing bivariate INAR(1) models utilizing standard binomial and negative binomial thinning operators. Specifically, we consider the following models:
- BVNGINAR(1) model ([18]);
- BVPOINAR(1) model ([4]);
- BVMIXINAR(1) model ([22]).
Note that both the binomial and negative binomial thinning operators are special cases of the ρ-binomial thinning operator, which is why the three models above are chosen as competitors. For fairness, we estimate the mean parameter using the Yule–Walker method across all models and use maximum likelihood estimation for the other parameters. Model performance is evaluated using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC).
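For reference, the two information criteria can be computed from a maximized log-likelihood as in the sketch below, which uses the standard definitions AIC = 2k − 2ℓ and BIC = k ln(n) − 2ℓ. The log-likelihood, parameter count, and sample size in the usage lines are hypothetical and are not results from the paper.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * loglik."""
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * loglik."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# Hypothetical fit: log-likelihood -500 with 5 free parameters and 348 points.
aic_value = aic(-500.0, 5)
bic_value = bic(-500.0, 5, 348)
```

Lower values indicate a better trade-off between fit and complexity. For sample sizes above about 7 (n > e²), BIC penalizes each extra parameter more heavily than AIC.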
The dataset used in this analysis was obtained from the NSW Bureau of Crime Statistics and Research (BOCSAR). It is organized by local government area (LGA), offense category (including subcategories), and month. Covering the period from January 1995 to December 2023, the dataset includes 348 monthly crime counts in New South Wales (NSW), Australia. The data can be downloaded from the following website: https://www.bocsar.nsw.gov.au/Pages/bocsar_datasets/Offence.aspx, accessed on 13 May 2024.
To illustrate the characteristics of the observations, we present descriptive statistics for each series. These statistics include the minimum (Min), maximum (Max), median, mean, variance (Var), dispersion index (), and zero inflation index (). The is defined in Equation (2). The index, introduced by [23], is employed to assess the excess occurrence of zeros in count data and is formulated as:
where denotes the proportion of zeros, and represents the mean. A value greater than 0 indicates zero inflation, while a value less than 0 suggests zero deflation.
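Both descriptive indices are easy to compute from a count series. The sketch below assumes the dispersion index is the variance-to-mean ratio (with the sample variance) and that the zero inflation index takes the form 1 + ln(p0)/mean, which equals 0 for a Poisson distribution. The function names and the example counts are hypothetical.

```python
import math
from statistics import mean, variance

def dispersion_index(counts):
    """Variance-to-mean ratio; values above 1 suggest overdispersion."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; index undefined")
    return variance(counts) / m

def zero_inflation_index(counts):
    """1 + ln(p0) / mean, where p0 is the observed proportion of zeros."""
    m = mean(counts)
    p0 = sum(1 for c in counts if c == 0) / len(counts)
    if m == 0 or p0 == 0:
        raise ValueError("index undefined without zeros or with zero mean")
    return 1.0 + math.log(p0) / m

# Hypothetical monthly counts with many zeros.
example = [0, 0, 0, 1, 0, 2, 0, 0, 3, 0]
di = dispersion_index(example)
zi = zero_inflation_index(example)
```

For this toy series both indices exceed their reference values (1 and 0), the same qualitative pattern the descriptive tables are used to detect.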
5.1. Crime Data: Disorderly Conduct Counts in Carrathool
In the first application, we analyze disorderly conduct counts in Carrathool, which include three subcategories: “offensive conduct” (OCND), “offensive language” (OLNG), and “criminal intent”. Because OCND and OLNG offenses tend to arise in similar contexts, their counts are likely to be mutually associated. Therefore, we applied the ρ-BVGINAR(1) model to fit the counts of OCND and OLNG.
The time series, autocorrelation function (ACF), and cross-correlation function (CCF) plots for the OCND and OLNG series are depicted in Figure 16 and Figure 17. The ACF plots show autocorrelation in both series, while values in the CCF plot exceed the confidence bounds, suggesting dependence between the two series. Table 4 presents descriptive statistics for both series. The empirical mean values for OCND and OLNG are relatively close, at 0.3448 and 0.2960, respectively. The two series have dispersion indices of and , both slightly exceeding 1, indicating mild overdispersion. Moreover, the values for both series exceed 0, indicating zero inflation in the data. Further insight is provided by their histograms in Figure 18, which highlight a notable proportion of zeros in each series.
Figure 16.
Sample paths of OCND and OLNG series.
Figure 17.
The autocorrelation function (ACF) and cross-correlation (CCF) plots of OCND and OLNG series.
Table 4.
Descriptive statistics of OCND and OLNG series.
Figure 18.
Histograms of OCND and OLNG counts.
The fitted results of the proposed ρ-BVGINAR(1), BVNGINAR(1), BVPOINAR(1), and BVMIXINAR(1) models are summarized in Table 5. Despite incorporating a mixture of binomial and negative binomial thinning, the BVMIXINAR(1) model exhibits the poorest performance, as evidenced by its lowest log-likelihood value () and highest AIC (1062.29) and BIC (1081.55) values. The BVNGINAR(1) and BVPOINAR(1) models perform comparably based on their log-likelihood values, ranking behind only the ρ-BVGINAR(1) model. The ρ-BVGINAR(1) model achieves the lowest AIC (991.27) and BIC (1008.53) values, indicating the best fit among the candidate models.
Table 5.
Fitting results of the monthly OCND and OLNG counts across different models.
5.2. Crime Data: Theft Counts in Narrandera
As suggested by a referee, to demonstrate the flexibility and applicability of our model, we selected a real-world example with higher levels of overdispersion for analysis. We focused on the theft counts in Narrandera, encompassing five subcategories: “break and enter dwelling”, “break and enter non-dwelling”, “receiving or handling stolen goods”, “motor vehicle theft”, and “steal from motor vehicle”. Notably, “break and enter dwelling” and “break and enter non-dwelling” exhibit a correlation due to their similar modus operandi and motivations, likely originating from the actions of the same group of offenders. Therefore, we chose to examine the counts of “break and enter thefts into dwelling” (BETD) and “break and enter theft into non-dwelling” (BETND) for further investigation.
Figure 19 and Figure 20 display the time series, ACF, and CCF plots, revealing significant autocorrelation within each series and cross-correlation between the BETD and BETND series. Table 6 reports dispersion index values of and , both markedly exceeding 1, indicating a higher degree of overdispersion compared to the values of and observed in the OCND and OLNG counts. Moreover, the values for both series indicate notable zero inflation, with values of and , respectively. Their histograms are depicted in Figure 21.
Figure 19.
Sample paths of BETD and BETND series.
Figure 20.
The autocorrelation function (ACF) and cross-correlation (CCF) plots of BETD and BETND series.
Table 6.
Descriptive statistics of BETD and BETND series.
Figure 21.
Histograms of BETD and BETND counts.
Table 7 presents the fitted results. We observe that the BVMIXINAR(1) model exhibits the highest AIC and BIC values, indicating that the model fails to capture the overdispersion and zero inflation characteristics of the dataset. The BIC value of the fitted BVPOINAR(1) model is also large, indicating that this model is unsuitable for this dataset. While the BVNGINAR(1) model yields satisfactory results, the ρ-BVGINAR(1) model outperforms it in terms of AIC and BIC values, indicating its greater suitability for fitting the BETD and BETND counts.
Table 7.
Fitting results of the monthly BETD and BETND counts across different models.
In conclusion, the ρ-BVGINAR(1) model outperforms the other three models and more effectively captures the overdispersion and zero inflation features in count time series data. It demonstrates superiority in model fitting by striking a balance between flexibility and complexity.
6. Conclusions
This paper introduces a flexible ρ-BVGINAR(1) model tailored to analyzing bivariate integer-valued time series data with overdispersion. It extends the ρ-GINAR(1) model [12] to the two-dimensional case. It is also a generalization of the BVGINAR(1) model and the BVNGINAR(1) model [18], offering enhanced capability for handling excess zeros and overdispersed data. Furthermore, the paper derives the innovation structure of the proposed model, discusses its essential properties, and describes the methodologies for YW and CML estimation. A comprehensive simulation study is conducted to evaluate the finite sample performances of the estimators and their asymptotic properties under various parameter combinations. Two real applications showcase the effectiveness of the proposed model relative to existing ones, demonstrating its utility in practical settings.
Several avenues remain for future research on bivariate INAR-type models. One direction is to use the zero-modified geometric distribution as the marginal distribution. This distribution can model features such as zero inflation, zero deflation, overdispersion, and underdispersion, as discussed in detail in [24]. In addition, there is potential for further investigation into the modification of various marginal parameters and thinning parameters. Previous works, such as those by [7,19], have demonstrated the effectiveness of such modifications for analyzing bivariate time series data. These approaches may enhance the flexibility and applicability of bivariate models and warrant thorough exploration in future research.
Author Contributions
Conceptualization and methodology, C.L.; validation and review, D.W. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Social Science Planning Foundation of Liaoning Province (No. L22ZD065) and the National Natural Science Foundation of China (Nos. 12271231, 12001229, 11901053).
Data Availability Statement
Publicly available data sets were analyzed in this study. These data can be found here: https://www.bocsar.nsw.gov.au/Pages/bocsar_datasets/Offence.aspx (accessed on 13 May 2024).
Conflicts of Interest
The authors declare no conflicts of interest.
Appendix A. Proof of Lemma 1
- (i)
- , where
Proof.
According to the definition of the model, we have
We decompose the computation of each element of the matrix.
where
By the same token,
Therefore,
□
- (ii)
- for a random vector independent of .
Proof.
From the definition of the model, it follows that
Decomposing the computation of each element of the matrix, we have
Similarly,
Therefore,
□
- (iii)
- for a random vector independent of .
Proof.
The proof process for this is similar to (ii), and we only provide key expressions.
and
Therefore,
□
- (iv)
- where ,
Proof.
Based on the model definition, we expand the expectations.
For representational convenience, we denote the corresponding elements of the aforementioned matrix as . We then proceed to compute these elements individually.
where
Similarly,
Therefore,
where ,
□
Appendix B. Proof of Proposition 1
Proof.
We begin by introducing a sequence of bivariate random series , defined as follows:
where, for any m, is independent of and .
For any given m, we define a random vector and the space as the set of all random vectors satisfying , where the measure between two random vectors and is denoted as . It is easy to verify that is a Hilbert space. Our goal is to establish the convergence of as m → ∞. To achieve this, we aim to demonstrate that forms a Cauchy sequence in the Hilbert space . Now, let us proceed to the proof.
Existence.
We will prove the existence of the model through three key points.
- 1.
- is non-decreasing for all t. To substantiate this claim, we must demonstrate that for all and , . For , we have that Now suppose that for and all ; we will demonstrate that . We consider the components and : Similarly, Therefore, by mathematical induction, we establish that is non-decreasing for all t.
- 2.
- for . To demonstrate this, let us denote . From Lemma 1, it follows that: Assuming , we find: where represents the identity matrix. Hence, the matrix is invertible, and . Consequently, becomes independent of t as and tends to as . Let us consider . Leveraging the findings from Lemma 1, we derive: where , with If , all entries of the matrix fall within the interval . Through iterative recursion repeated m times, we establish its independence from t. Consequently, . Hence, for .
- 3.
- is a Cauchy sequence. Let for all , . From Equation (A1), it is straightforward to get Then, we have Next, we aim to prove that
Proof.
Let , . According to Lemma 1, it follows that
where and
Let us start by analyzing . On the one hand, considering the non-negativity of the random variables and , we find that . On the other hand, we proceed to derive as follows:
Similarly,
Then,
Therefore,
If , it is straightforward to demonstrate that the eigenvalues (denoted as , for ) of the matrix lie within the unit circle. Consequently, we have , as , which implies that Equation (A2) converges to 0 as . Furthermore, as .
Next, we examine
We have proved . This implies that the sequence is a Cauchy sequence, and thus, . Finally, by taking limits on both sides of Equation (A1) and letting , we obtain , where is independent of and for .
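The role of the eigenvalue condition in the existence argument can be checked numerically. The following is a minimal sketch, not the paper's computation: it assumes a generic first-moment recursion of the form μ_m = A μ_{m−1} + μ_ε, and the 2×2 matrix `A` and the vector `mu_eps` are hypothetical values chosen only so that all eigenvalues of `A` lie inside the unit circle. Under that assumption the powers of `A` vanish and the recursion settles at (I − A)^{-1} μ_ε.

```python
import cmath


def spectral_radius_2x2(a):
    """Largest eigenvalue modulus of a 2x2 matrix, from its characteristic polynomial."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matpow(a, m):
    """m-th power of a 2x2 matrix by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(m):
        result = matmul(result, a)
    return result


def mean_limit(a, mu):
    """Solve (I - A) x = mu for a 2x2 matrix A using Cramer's rule."""
    b = [[1 - a[0][0], -a[0][1]], [-a[1][0], 1 - a[1][1]]]
    det = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    return [(b[1][1] * mu[0] - b[0][1] * mu[1]) / det,
            (b[0][0] * mu[1] - b[1][0] * mu[0]) / det]


# Hypothetical values, not taken from the paper.
A = [[0.3, 0.2], [0.1, 0.4]]
mu_eps = [1.0, 2.0]

rho = spectral_radius_2x2(A)   # spectral radius of A
A50 = matpow(A, 50)            # entries shrink geometrically when rho < 1

# Iterate the assumed mean recursion starting from zero.
x = [0.0, 0.0]
for _ in range(200):
    x = [A[0][0] * x[0] + A[0][1] * x[1] + mu_eps[0],
         A[1][0] * x[0] + A[1][1] * x[1] + mu_eps[1]]

limit = mean_limit(A, mu_eps)  # closed-form fixed point (I - A)^{-1} mu_eps
print(rho, max(abs(v) for row in A50 for v in row), x, limit)
```

With a spectral radius below one, the iterated mean and the closed-form fixed point agree to numerical precision, which is the behaviour the Cauchy-sequence argument relies on.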
Uniqueness.
Let us delve into uniqueness. Assume there exists another series satisfying Equation (4). Then, we can express the difference between and as
Define
We then establish:
We introduce new notations:
According to Lemma 1, we derive:
Consequently,
By the Borel–Cantelli lemma, we conclude that . Thus, , i.e., almost surely.
Strict stationarity.
We will employ mathematical induction to demonstrate that for all h and k, and are identically distributed. Firstly, when , we have
and
Since (here, stands for having the same distribution), then and have identical distributions. Consequently, is strictly stationary.
Next, suppose is strictly stationary; then we have
Likewise, since is strictly stationary, we have
Then . Thus, forms a strictly stationary process. Furthermore, since , i.e., , then . Therefore, is also a strictly stationary process.
Ergodicity.
At time t, the random matrix operation involves two random coefficient-thinning operations, i.e., “” or “” and “” or “”. Let denote all counting series involved in the matrix operation. Obviously, is a 2-dimensional series. Let represent the σ-algebra rendering the vector measurable. According to Equation (4), for any t, we have
and consequently,
Given that the sequence is an i.i.d. sequence of random vectors, is ergodic. According to Kolmogorov’s zero-one law, for any event within , the probability is either 0 or 1. This means that the tail σ-field of contains only sets of probability 0 or 1. By an argument similar to that in [8], the sequence is ergodic. □
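A practical consequence of stationarity and ergodicity is that time averages along a single long path approach the stationary mean. The sketch below illustrates this with a deliberately simplified stand-in: a bivariate INAR(1) recursion with ordinary binomial thinning, fixed coefficients, and Poisson innovations. It is not the modified thinning operator, the random-coefficient structure, or the geometric marginals of the proposed model, and the coefficient matrix `A`, the innovation means `lam`, and the function names are all hypothetical.

```python
import math
import random


def thin(rng, alpha, n):
    """Binomial thinning: a sum of n independent Bernoulli(alpha) variables."""
    return sum(1 for _ in range(n) if rng.random() < alpha)


def poisson(rng, lam):
    """Poisson draw via Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_time_average(A, lam, n_steps, burn_in, seed):
    """Simulate X_t = A o X_{t-1} + eps_t and return the time average of each component."""
    rng = random.Random(seed)
    x = [0, 0]
    totals = [0, 0]
    for t in range(burn_in + n_steps):
        # Both components are updated from the previous state x.
        x = [thin(rng, A[i][0], x[0]) + thin(rng, A[i][1], x[1]) + poisson(rng, lam[i])
             for i in range(2)]
        if t >= burn_in:
            totals[0] += x[0]
            totals[1] += x[1]
    return [s / n_steps for s in totals]


# Hypothetical parameters; for this stand-in the stationary mean is
# (I - A)^{-1} lam = (2.5, 3.75).
A = [[0.3, 0.2], [0.1, 0.4]]
lam = [1.0, 2.0]
avg = simulate_time_average(A, lam, n_steps=20000, burn_in=500, seed=0)
print(avg)
```

The printed averages should lie close to (2.5, 3.75); lengthening the path tightens the agreement, in line with what ergodicity predicts for this simplified process.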
Appendix C. Proof of Proposition 2
- (i)
- .
Proof.
. □
- (ii)
- where
Proof.
where
Therefore,
where
□
Appendix D. Proof of Lemma 3
Proof.
Since
under the conditions of the lemma, we have that
Building on the equation and the fact that the time series model is stationary, we obtain that the covariance between the random variables and is given by
Since , then we obtain that the correlation coefficient between and is
□
Appendix E. Proof of Theorem 3
Proof.
From the proof of Theorem 2, is also stationary and ergodic. Then, the process is a zero-mean, stationary, ergodic process. According to the Wold decomposition theorem (see [25] Section 2.6), this process can be represented as
where , , and is white noise with parameters Then, the processes can be decomposed as
where for , and Also,
These terms represent the components of the matrix . According to the correlation structure of the model and the properties of the matrix , all terms in the equation are nonnegative, with the first and last terms being strictly positive. Equation (A6) indicates that , implying We can now apply the theorem presented in [26], thereby completing the proof. □
References
- Brannas, K.; Nordstrom, J. A Bivariate Integer Valued Allocation Model for Guest Nights in Hotels and Cottages. Umea Economic Studies Working Paper No. 547. 2001. Available online: https://ssrn.com/abstract=255292 (accessed on 20 May 2024).
- Quoreshi, A.S. Bivariate time series modeling of financial count data. Commun. Stat. Theory Methods 2006, 35, 1343–1358.
- Pedeli, X.; Karlis, D. A bivariate INAR(1) process with application. Stat. Model. 2011, 11, 325–349.
- Nastić, A.S.; Ristić, M.M.; Popović, P.M. Estimation in a bivariate integer-valued autoregressive process. Commun. Stat. Theory Methods 2016, 45, 5660–5678.
- Khan, N.M.; Oncel Cekim, H.; Ozel, G. The family of the bivariate integer-valued autoregressive process (BINAR(1)) with Poisson–Lindley (PL) innovations. J. Stat. Comput. Simul. 2020, 90, 624–637.
- Chen, H.; Zhu, F.; Liu, X. A new bivariate INAR(1) model with time-dependent innovation vectors. Stats 2022, 5, 819–840.
- Popović, P.M.; Ristić, M.M.; Nastić, A.S. A geometric bivariate time series with different marginal parameters. Stat. Pap. 2016, 57, 731–753.
- Yu, M.; Wang, D.; Yang, K.; Liu, Y. Bivariate first-order random coefficient integer-valued autoregressive processes. J. Stat. Plan. Inference 2020, 204, 153–176.
- Su, B.; Zhu, F. Comparison of BINAR(1) models with bivariate negative binomial innovations and explanatory variables. J. Stat. Comput. Simul. 2021, 91, 1616–1634.
- Steutel, F.W.; van Harn, K. Discrete analogues of self-decomposability and stability. Ann. Probab. 1979, 7, 893–899.
- Al-Osh, M.A.; Aly, E.-E.A. First order autoregressive time series with negative binomial and geometric marginals. Commun. Stat. Theory Methods 1992, 21, 2483–2492.
- Borges, P.; Molinares, F.F.; Bourguignon, M. A geometric time series model with inflated-parameter Bernoulli counting series. Stat. Probab. Lett. 2016, 119, 264–272.
- Kachour, M.; Truquet, L. A p-order signed integer-valued autoregressive (SINAR(p)) model. J. Time Ser. Anal. 2011, 32, 223–236.
- Bulla, J.; Chesneau, C.; Kachour, M. A bivariate first-order signed integer-valued autoregressive process. Commun. Stat. Theory Methods 2017, 46, 6590–6604.
- Zhang, Q.; Wang, D.; Fan, X. A negative binomial thinning-based bivariate INAR(1) process. Stat. Neerl. 2020, 74, 517–537.
- Ristić, M.M.; Bakouch, H.S.; Nastić, A.S. A new geometric first-order integer-valued autoregressive (NGINAR(1)) process. J. Stat. Plan. Inference 2009, 139, 2218–2226.
- Kolev, N.; Minkova, L.; Neytchev, P. Inflated-parameter family of generalized power series distributions and their application in analysis of overdispersed insurance data. ARCH Res. Clear. House 2000, 2, 295–320.
- Ristić, M.M.; Nastić, A.S.; Jayakumar, K.; Bakouch, H.S. A bivariate INAR(1) time series model with geometric marginals. Appl. Math. Lett. 2012, 25, 481–485.
- Popović, P.M. A bivariate INAR(1) model with different thinning parameters. Stat. Pap. 2016, 57, 517–538.
- Anderson, T.W.; Darling, D.A. A test of goodness of fit. J. Am. Stat. Assoc. 1954, 49, 765–769.
- Gross, L. Tests for Normality, R Package Version 1.0-2. 2013. Available online: http://CRAN.R-project.org/package=nortest (accessed on 20 May 2024).
- Popović, P.M.; Nastić, A.S.; Ristić, M.M. Residual analysis with bivariate INAR(1) models. REVSTAT-Stat. J. 2018, 16, 349–363.
- Weiss, C.H.; Homburg, A.; Puig, P. Testing for zero inflation and overdispersion in INAR(1) models. Stat. Pap. 2019, 60, 823–848.
- Kang, Y.; Zhu, F.; Wang, D.; Wang, S. A zero-modified geometric INAR(1) model for analyzing count time series with multiple features. Can. J. Stat. 2023.
- Brockwell, P.J.; Davis, R.A. Introduction to Time Series and Forecasting; Springer: Berlin/Heidelberg, Germany, 2002.
- Brockwell, P.J.; Davis, R.A. Time Series: Theory and Methods; Springer Science & Business Media: Berlin, Germany, 1991.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).