Robust Estimation for Bivariate Poisson INGARCH Models

In the integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) models, parameter estimation is conventionally based on the conditional maximum likelihood estimator (CMLE). However, because the CMLE is sensitive to outliers, we consider a robust estimation method for bivariate Poisson INGARCH models while using the minimum density power divergence estimator. We demonstrate the proposed estimator is consistent and asymptotically normal under certain regularity conditions. Monte Carlo simulations are conducted to evaluate the performance of the estimator in the presence of outliers. Finally, a real data analysis using monthly count series of crimes in New South Wales and an artificial data example are provided as an illustration.


Introduction
Integer-valued time series models have received widespread attention from researchers and practitioners, due to their versatile applications in many scientific areas, including finance, insurance, marketing, and quality control. Numerous studies focus on integervalued autoregressive (INAR) models to analyze the time series of counts, see Weiß [1] and Scotto et al. [2] for general reviews. Taking a different approach, Ferland et al. [3] proposed using Poisson integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) models and Fokianos et al. [4] developed Poisson AR models to generalize the linear assumption on INGARCH models. The Poisson assumption on INGARCH models has been extended to negative binomial INGARCH models (Davis and Wu [5] and Christou and Fokianos [6]), zero-inflated generalized Poisson INGARCH models (Zhu [7,8] and Lee et al. [9]), and one-parameter exponential family AR models (Davis and Liu [10]). We refer to the review papers by Fokianos [11,12] and Tjøstheim [13,14] for more details.
Researchers invested considerable efforts to extend the univariate integer-valued time series models to bivariate (multivariate) models. For INAR type models, Quoreshi [15] proposed bivariate integer-valued moving average models and Pedeli and Karlis [16] introduced bivariate INAR models with Poisson and negative binomial innovations. Liu [17] proposed bivariate Poisson INGARCH models with a bivariate Poisson distribution that was constructed via the trivariate reduction method and established the stationarity and ergodicity of the model. Andreassen [18] later verified the consistency of the conditional maximum likelihood estimator (CMLE) and Lee et al. [19] studied the asymptotic normality of the CMLE and developed the CMLE-and residual-based change point tests. However, this model has the drawback that it can only accommodate positive correlation between two time series of counts. To cope with this issue, Cui and Zhu [20] recently introduced a new bivariate Poisson INGARCH model based on Lakshminarayana et al.'s [21] bivariate Poisson distribution. Their model can deal with positive or negative correlation, depending on the multiplicative factor parameter. They employed the CMLE for parameter estimation. However, because the CMLE is unduly influenced by outliers, the robust estimation in bivariate Poisson INGARCH models is crucial and deserves thorough investigation.

Simulation
In this section, we report the simulation results to evaluate the performance of the MDPDE. The simulation settings are described, as follows. Using the inverse transformation sampling method (cf. Section 2.3 of Verges [37]), we generate Y 1 , . . . , Y n from (2) with the initial value λ 1 = (0, 0) T . For the estimation,λ 1 is set to be the sample mean of the data. We first consider θ = (ω 1 , a 1 , b 11 , b 12 , ω 2 , a 2 , b 21 , b 22 , δ) T = (1, 0.2, 0.1, 0.2, 0.5, 0.3, 0.4, 0.2, 0.5) T , which satisfies (A3) with p = 1. In this simulation, we compare the performance of the MDPDE with α > 0 with that of the CMLE (α = 0). We examine the sample mean, variance, and mean squared error (MSE) of the estimators. The sample size under consideration is n = 1000 and the number of repetitions for each simulation is 1000. In Tables 1-16, the symbol * represents the minimal MSEs for each parameter. Table 1 indicates that, when the data are not contaminated by outliers, the CMLE exhibits minimal MSEs for all parameters, and the MSEs of the MDPDE with small α are close to those of the CMLE. The MSE of the MDPDE shows an increasing tendency as α increases. Hence, we can conclude that the CMLE outperforms the MDPDE in the absence of outliers. Now, we consider the situation that the data are contaminated by outliers. To this end, we generate contaminated data Y c,t = (Y c,t,1 , Y c,t,2 ) T when considering where Y t,i are generated from (2), P t,i are i.i.d. Bernoulli random variables with success probability p, and Y o,t,i are i.i.d. Poisson random variables with mean γ. We consider three cases: (p, γ) = (0.03, 5), (0.03, 10), and (0.05, 10). Tables 2-4 report the results. In the tables, the MDPDE appears to have smaller MSEs than the CMLE for all cases, except for the case of α = 1 when (p, γ) = (0.03, 5). As p or γ increases, the MSEs of the CMLE increase faster than those of the MDPDE, which indicates that the MDPDE outperforms the CMLE, as the data are more contaminated by outliers. Moreover, as p or γ increases, the symbol * tends to move downward. This indicates that, when the data are severely contaminated by outliers, the MDPDE with large α performs better.   We also consider smaller sample size n = 200. The results are presented in Tables 5-8  and they show results similar to those in Tables 1-4. The variances and MSEs of both the CMLE and MDPDE are larger than those in Tables 1-4.    In order to evaluate the performance of the MDPDE for negatively cross-correlated data, with the same p and γ, as above. The results are reported in Tables 9-16 for n = 1000 and 200, respectively. These tables exhibit results that are similar to those in Tables 1-8. Overall, our findings strongly support the assertion that the MDPDE is a functional tool for yielding a robust estimator for bivariate Poisson INGARCH models in the presence of outliers.

Illustrative Examples
First, we illustrate the proposed method by examining the monthly count series of crimes provided by the New South Wales Police Force in Australia. The data set is classified by local government area and offence type. This data set has been studied in many literatures, including Lee et al. [9], Chen and Lee [38,39], Kim and Lee [24], and Lee et al. [40]. To investigate the behavior of the MDPDE in the presence of outliers, we consider the data series of liquor offences (LO) and transport regulatory offences (TRO) in Botany Bay from January 1995 to December 2012, which has 216 observations in each series. Figure 1 plots the monthly count series of LO and TRO and it shows the presence of some deviant observations in each series. The sample mean and variance are 1.912 and 13.14 for LO, and 2.463 and 20.41 for TRO. A large value of the variance of each series is expected to be influenced by outliers. The autocorrelation function (ACF) and partial autocorrelation function (PACF) of LO and TRO, as well as cross-correlation function (CCF) between two series, are given in Figure 2, indicating that the data are both serially and crossly correlated. The cross-correlation coefficient between two series is 0.141.  We fit the model (2) to the data using both the CMLE and the MDPDE.λ 1 is set to be the sample mean of the data. Table 17 reports the estimated parameters with various α. The standard errors are given in parentheses and the symbol • represents the minimal AMSE provided in Remark 2. In the table, we can observe that the MDPDE has smaller AMSE than the CMLE and the optimal α is chosen to be 0.1. The MDPDE with optimal α is quite different from the CMLE, in particular,δ is about half of the CMLE. This result indicates that outliers can seriously affect the parameter estimation and, thus, the robust estimation method is required when the data are contaminated by outliers. We clean the data by using the approach that was introduced by Fokianos and Fried [41] and apply the CMLE and the MDPDE to this data in order to illustrate the behavior of the estimators in the absence of outliers. Table 18 reports the results. The standard errors and AMSE tend to decrease compared to Table 17. The CMLE has minimal AMSE and the MDPDE with small α appears to be similar to the CMLE. Now, we consider an artificial example that has negative cross-correlation coefficient. Following Cui and Zhu [20], we generate 1000 samples from univariate Poisson INGARCH model, i.e., where P(λ t ) denotes the Poisson distribution with mean λ t . Further, we observe the contaminated data X c,t as follows where P t are i.i.d. Bernoulli random variables with a success probability of 0.03 and X o,t are i.i.d. Poisson random variables with mean 5. Let Y t = (Y t,1 , Y t,2 ) T , where Y t,1 = X c,t and Y t,2 = X c,t+500 for t = 1, . . . , 500. The sample mean and variance are 5.196 and 7.380 for Y t,1 , and 4.538 and 8.129 for Y t,2 . The cross-correlation coefficient between Y t,1 and Y t,2 is −0.161. We fit the model (2) to Y t and the results are presented in Table 19. Similar to Table 17, the MDPDE has smaller AMSE than the CMLE. The optimal α is chosen to be 0.3 and the correspondingδ is −0.329, whereas the CMLE is 0.772.

Concluding Remarks
In this study, we developed the robust estimator for bivariate Poisson INGARCH models based on the MDPDE. In practice, this is important, because outliers can strongly affect the CMLE, which is commonly employed for parameter estimation in INGARCH models. We proved that the MDPDE is consistent and asymptotically normal under regularity conditions. Our simulation study compared the performances of the MDPDE and the CMLE, and confirmed the superiority of the proposed estimator in the presence of outliers. The real data analysis also confirmed the validity of our method as a robust estimator in practice. Although we focused on Cui and Zhu's [20] bivariate Poisson INGARCH models here, the MDPDE method can be extended to other bivariate INGARCH models. For example, one can consider the copula-based bivariate INGARCH models that were studied by Heinen and Rengifo [42], Andreassen [18], and Fokianos et al. [43]. We leave this issue of extension as our future research.

Data Availability Statement:
Publicly available datasets were analyzed in this study. This data can be found here: data.gov.au (accessed on 19 March 2021).

Conflicts of Interest:
The authors declare no conflict of interest.
the first part of the lemma is validated. Note that where the equality holds if and only if δ = δ 0 and λ t = λ 0 t a.s. Therefore, the second part of the lemma is established by (ii) of Lemma A1.