Monitoring Parameter Change for Time Series Models of Counts Based on Minimum Density Power Divergence Estimator

In this study, we consider an online monitoring procedure to detect a parameter change in integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) models whose conditional density of present observations given past information follows a one-parameter exponential family distribution. For this purpose, we use the cumulative sum (CUSUM) of score functions deduced from the objective functions constructed for the minimum density power divergence estimator (MDPDE), which includes the maximum likelihood estimator (MLE) as a special case, to diminish the influence of outliers. It is well known that, compared to the MLE, the MDPDE is robust against outliers with little loss of efficiency. This robustness property is inherited by the proposed monitoring procedure. A simulation study and real data analysis are conducted to affirm the validity of our method.


Introduction
In this paper, we consider the cumulative sum (CUSUM) monitoring procedure for detecting a parameter change in integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) models. The analysis of integer-valued time series is a core area of time series analysis, with applications in diverse disciplines in the social, physical, engineering, and medical sciences. Both integer-valued autoregressive (INAR) models and INGARCH models have been widely studied in the literature and applied to various practical problems. Refer to McKenzie [1], Al-Osh and Alzaid [2], Ferland, Latour and Oraichi [3], Fokianos, Rahbek and Tjøstheim [4], and Weiß [5] for a general review. Poisson, negative binomial (NB), and one-parameter exponential family distributions have been widely used as underlying distributions, as seen in Davis and Wu [6], Zhu [7], Zhu [8], Jazi, Jones and Lai [9], Christou and Fokianos [10], Davis and Liu [11], Lee, Lee and Chen [12], and Chen, Khamthong and Lee [13].
Since Page [14], the CUSUM test has been a conventional tool for detecting a structural change in underlying models. For a history and background, we refer to Csörgő and Horváth [15], Chen and Gupta [16], Lee, Ha, Na and Na [17], and the papers cited therein. Several authors have studied the change point test for INGARCH models, including Fokianos and Fried [18], Fokianos and Fried [19], Franke, Kirch and Kamgaing [20], Fokianos, Gombay and Hussein [21], Hudecová [22], Hudecová, Hušková and Meintanis [23], Kang and Lee [24], Lee, Lee and Chen [12], Lee, Lee and Tjøstheim [25], and Lee and Lee [26]. The CUSUM scheme has been applied not only to retrospective change point tests but also to on-line monitoring and statistical process control (SPC) problems, which are designed to monitor abnormal phenomena in manufacturing processes and health care surveillance. The CUSUM control chart has been popular because of its ability to detect anomalies early. Refer to Weiß [27], Rakitzis, Maravelakis and Castagliola [28], Kim and Lee [29], and the papers cited therein. Meanwhile, Gombay and Serban [30] used the CUSUM approach based on the score vectors for independent observations, and later extended it to autoregressive processes, wherein the Type I error probability, instead of the conventional average run length (ARL), is used to obtain control limits. Their CUSUM monitoring process is based on the asymptotic property of the partial sum process generated from score vectors. Later, Huh, Kim and Lee [31] adopted their method for analyzing Poisson INGARCH models, and compared its performance with the likelihood ratio (LR)-based control chart originally considered by Weiss and Testik [32].
In this work, taking the approach of Gombay and Serban [30] and Huh, Kim and Lee [31], we design a robust monitoring process based on the minimum density power divergence estimator (MDPDE) proposed by Basu, Harris, Hjort and Jones [33]. We do this because the MDPDE is well known to be suitable for robust inference in various models: its trade-off between efficiency and robustness is controlled through a tuning parameter, with little loss in asymptotic efficiency relative to the maximum likelihood estimator (MLE) (Riani, Atkinson, Corbellini and Perrotta [34]). The MDPDE method has been successfully applied to various time series models, and in particular to INGARCH models (Kim and Lee [35], Kim and Lee [36]). Recently, Lee and Lee [26] and Kim and Lee [37] considered the CUSUM tests based on score vectors for the MLE and MDPDE in exponential family distribution INGARCH models. See also Kang and Song [38]. Using their results within the framework of Gombay and Serban [30] and Huh, Kim and Lee [31], we design an MDPDE-based monitoring process to detect a model parameter change in INGARCH models. Monte Carlo simulations are conducted to assess the performance of the proposed monitoring procedure. The focus is on comparing the MDPDE-based CUSUM test with the MLE-based CUSUM test for Poisson INGARCH models, to demonstrate the superiority of the former over the latter in the presence of outliers. A real data analysis of the return times of extreme events of Goldman Sachs Group (GS) stock prices is also provided to illustrate the validity of the proposed test.
The rest of the paper is organized as follows. Section 2 reviews the MDPDE for INGARCH models and Section 3 constructs the monitoring procedure for these models and investigates its asymptotic properties. Section 4 presents a simulation study and Section 5 provides a real data analysis. Section 6 concludes the paper. The proof of the main theorem is provided in Appendix A.

MDPDE for INGARCH Model: An Overview
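In this section, we briefly review the MDPDE for INGARCH models in [36]. Let $Y_1, Y_2, \ldots$ be the observations generated from integer-valued time series models with the conditional distribution of the one-parameter exponential family:

$$Y_t \mid \mathcal{F}_{t-1} \sim p(y \mid \eta_t), \qquad X_t := E(Y_t \mid \mathcal{F}_{t-1}) = f_\theta(X_{t-1}, Y_{t-1}), \tag{1}$$

where $\mathcal{F}_t$ is the $\sigma$-field generated by $Y_t, Y_{t-1}, \ldots$, and $f_\theta(x, y)$ is a non-negative bivariate function, depending on the parameter $\theta \in \Theta \subset \mathbb{R}^d$, that satisfies $\inf_{\theta \in \Theta} f_\theta(x, y) \ge c^*$ for some $c^* > 0$ and all $x, y$, and $p(\cdot \mid \cdot)$ is a probability mass function given by

$$p(y \mid \eta) = \exp\{\eta y - A(\eta)\}\, h(y), \qquad y \in \mathbb{N}_0,$$

where $\eta$ is the natural parameter, $A(\eta)$ and $h(y)$ are known functions, and both $A$ and $B = A'$ are strictly increasing. In particular, $B(\eta_t) = X_t$ and $B'(\eta_t)$ is the conditional variance of $Y_t$. In what follows, the symbols $X_t(\theta)$ and $\eta_t(\theta) = B^{-1}(X_t(\theta))$ are also used to stand for $X_t$ and $\eta_t$, respectively.

Davis and Liu [11] demonstrated that $\{X_t\}$ is strictly stationary and ergodic, and that $X_t(\theta) = f_\infty^\theta(Y_{t-1}, Y_{t-2}, \ldots)$ for some nonnegative measurable function $f_\infty^\theta$ defined on $\mathbb{N}_0^\infty$, under the contraction condition: for all $x, x' \ge 0$ and $y, y' \in \mathbb{N}_0$,

$$\sup_{\theta \in \Theta} |f_\theta(x, y) - f_\theta(x', y')| \le \omega_1 |x - x'| + \omega_2 |y - y'|, \tag{2}$$

where $\omega_1, \omega_2 \ge 0$ satisfy $\omega_1 + \omega_2 < 1$.

Meanwhile, Basu, Harris, Hjort and Jones [33] considered the minimum density power divergence estimator (MDPDE) for model parameters using the density power divergence $d_\alpha$ between two density functions $g$ and $h$, defined by:

$$d_\alpha(g, h) = \begin{cases} \displaystyle\int \Big\{ g^{1+\alpha}(y) - \Big(1 + \frac{1}{\alpha}\Big) h(y)\, g^{\alpha}(y) + \frac{1}{\alpha}\, h^{1+\alpha}(y) \Big\}\, dy, & \alpha > 0, \\[2mm] \displaystyle\int h(y) \{ \log h(y) - \log g(y) \}\, dy, & \alpha = 0. \end{cases}$$

Kim and Lee [36] studied the MDPDE for one-parameter exponential family distribution INGARCH models. Given $Y_1, \ldots, Y_n$ generated from (1), the MDPDE is defined by

$$\hat\theta_{\alpha,n} = \operatorname*{argmin}_{\theta \in \Theta} \frac{1}{n} \sum_{t=1}^{n} \tilde l_{\alpha,t}(\theta), \tag{3}$$

where

$$\tilde l_{\alpha,t}(\theta) = \begin{cases} \displaystyle\sum_{y=0}^{\infty} p^{1+\alpha}(y \mid \tilde\eta_t(\theta)) - \Big(1 + \frac{1}{\alpha}\Big) p^{\alpha}(Y_t \mid \tilde\eta_t(\theta)), & \alpha > 0, \\[2mm] -\log p(Y_t \mid \tilde\eta_t(\theta)), & \alpha = 0, \end{cases}$$

with $\tilde\eta_t(\theta) = B^{-1}(\tilde X_t(\theta))$, and $\tilde X_t(\theta)$ is defined recursively by $\tilde X_t(\theta) = f_\theta(\tilde X_{t-1}(\theta), Y_{t-1})$ for $t \ge 2$, starting from an arbitrarily chosen initial value $\tilde X_1$.

Below, $\theta_0$ denotes the true value of $\theta$ and is assumed to be an interior point of the compact parameter space $\Theta$. Under additional regularity conditions, Kim and Lee [36] verified that the MDPDE is strongly consistent.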
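As a minimal sketch of the MDPDE objective reviewed above, the following Python code evaluates the per-observation density power divergence loss and its sample average for a Poisson conditional distribution. It assumes the standard form of the loss from Basu, Harris, Hjort and Jones [33], namely $\sum_y p^{1+\alpha}(y \mid \eta) - (1 + 1/\alpha)\, p^{\alpha}(Y_t \mid \eta)$ for $\alpha > 0$ and the negative log-likelihood for $\alpha = 0$, together with a linear recursion $X_t = \omega + a X_{t-1} + b Y_{t-1}$. The function names, the truncation point of the infinite sum, and the initial value `x1` are illustrative choices and are not taken from the original work.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability mass function, computed on the log scale for stability."""
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))


def dpd_loss(y_obs, lam, alpha, y_max=None):
    """Density power divergence loss of one observation under Poisson(lam)."""
    if alpha == 0:
        # alpha = 0 reduces to the negative log-likelihood (the MLE case).
        return lam - y_obs * math.log(lam) + math.lgamma(y_obs + 1)
    if y_max is None:
        # Assumed truncation of the infinite sum; the neglected tail is negligible.
        y_max = int(lam + 10.0 * math.sqrt(lam) + 30)
    power_sum = sum(poisson_pmf(y, lam) ** (1.0 + alpha) for y in range(y_max + 1))
    return power_sum - (1.0 + 1.0 / alpha) * poisson_pmf(y_obs, lam) ** alpha


def mdpde_objective(theta, ys, alpha, x1=1.0):
    """Average DPD loss over a series for a linear Poisson INGARCH(1,1) recursion."""
    omega, a, b = theta
    x = x1  # arbitrarily chosen initial conditional mean (assumption)
    total = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            # Assumed recursion: X_t = omega + a * X_{t-1} + b * Y_{t-1}
            x = omega + a * x + b * ys[t - 1]
        total += dpd_loss(y, x, alpha)
    return total / len(ys)
```

Minimizing `mdpde_objective` over `theta` with a numerical optimizer of one's choice would give an estimate of this type; setting `alpha=0` corresponds to the MLE.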
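Additionally, they showed that, under further regularity conditions stated in terms of a generic integrable random variable $V$ and a constant $\rho \in (0, 1)$, where the symbol $\|\cdot\|$ denotes the $L^2$-norm for matrices and vectors and the expectation $E(\cdot)$ is taken under $\theta_0$, the MDPDE is asymptotically normal:

$$\sqrt{n}\,(\hat\theta_{\alpha,n} - \theta_0) \xrightarrow{d} N\big(0,\; J_\alpha^{-1} K_\alpha J_\alpha^{-1}\big), \tag{4}$$

where $K_\alpha = E\big[ \tfrac{\partial l_{\alpha,t}(\theta_0)}{\partial\theta} \tfrac{\partial l_{\alpha,t}(\theta_0)}{\partial\theta^T} \big]$, $J_\alpha$ is the corresponding expected second-derivative matrix of $l_{\alpha,t}$ at $\theta_0$, and $l_{\alpha,t}(\theta)$ is the same as $\tilde l_{\alpha,t}(\theta)$ with $\tilde\eta_t(\theta)$ in (3) replaced by $\eta_t(\theta)$.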
Moreover, under an additional regularity condition, Kim and Lee [37] showed that the limiting null distribution of the CUSUM test statistics designed for detecting a change in $\theta$ is that of the supremum of a Brownian bridge. In practice, $\alpha \in (0, 1]$ is commonly used, and an optimal $\alpha$ can be selected through the method of Warwick [39] and Warwick and Jones [40]; see Remark 1 of Kim and Lee [36].
In the literature, the following linear INGARCH model has been frequently used:

$$Y_t \mid \mathcal{F}_{t-1} \sim p(y \mid \eta_t), \qquad X_t = \omega + a X_{t-1} + b Y_{t-1},$$

where $X_t = B(\eta_t) = E(Y_t \mid \mathcal{F}_{t-1})$ and $\theta = (\omega, a, b)^T$ satisfies $\omega > 0$ and $a + b < 1$. Here, we assume that $\theta_0$ is an interior point of a compact neighborhood in the parameter space. Linear INGARCH models with, for example, an NB$(r, p)$ conditional distribution, where NB$(r, p)$ denotes a negative binomial (NB) distribution with parameters $r \in \mathbb{N}$ and $p \in (0, 1)$, satisfy the aforementioned regularity conditions. These conditions should be checked analytically whenever one aims to use a specific distribution as the conditional distribution of the INGARCH model. In this case, a goodness-of-fit test could be conducted to check the adequacy of the assumed underlying distribution (Fokianos and Neumann [41]).
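To make the linear INGARCH recursion concrete, the sketch below simulates a Poisson INGARCH(1,1) series in plain Python. It assumes the recursion $X_t = \omega + a X_{t-1} + b Y_{t-1}$ with a Poisson conditional distribution; the sampler, the burn-in length, the seed, and all function names are illustrative choices rather than anything prescribed by the paper.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) by Knuth's multiplication method (fine for moderate lam)."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_poisson_ingarch(n, omega, a, b, seed=0, burn_in=200):
    """Simulate n observations, discarding an initial burn-in stretch."""
    rng = random.Random(seed)
    x = omega / (1.0 - a - b)  # start the conditional mean at the stationary mean
    y = poisson_draw(x, rng)
    ys = []
    for t in range(n + burn_in):
        # Assumed recursion: X_t = omega + a * X_{t-1} + b * Y_{t-1}
        x = omega + a * x + b * y
        y = poisson_draw(x, rng)
        if t >= burn_in:
            ys.append(y)
    return ys
```

With $a + b < 1$, the sample mean of a long simulated series should be close to the stationary mean $\omega / (1 - a - b)$, which gives a quick sanity check on the simulator.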

MDPDE-Based Monitoring Process
In this section, we consider a monitoring process for detecting a significant change in the underlying model based on sequentially observed time series $Y_1, \ldots, Y_n$ following Model (1), given a training sample of size $m$ from Model (1), where $m = m(n)$ is a sequence of positive integers that diverges to $\infty$ as $n$ tends to $\infty$. For this task, we test the null hypothesis that the parameter remains equal to $\theta_0$ over $Y_1, \ldots, Y_n$ against the alternative that it changes at some point.

We first consider the case in which $\theta_0$ is known a priori from past experience. In this case, the monitoring statistics $\hat T^{\min}_{n,0}(k)$, $\hat T^{\max}_{n,0}(k)$, and $\hat T^{\mathrm{cusum}}_{n,0}(k)$ are built from the process $\hat W_k$, $k = 1, \ldots, n$, in (5), which is constructed from the score vectors $\partial \tilde l_{\alpha,t}/\partial\theta$, as in (3), based on $Y_1, \ldots, Y_n$, together with the estimate of $K_\alpha$ in (6), which is computed from the score vectors based on the training sample. Here, for $z_i = (z_{i,1}, \ldots, z_{i,d})^T \in \mathbb{R}^d$, the notation $\max_{1 \le i \le k} z_i$ denotes the vector whose $j$th entry equals $\max_{1 \le i \le k} z_{i,j}$ for $j = 1, \ldots, d$, and $\|z\|_{\max} = \max_{1 \le j \le d} |z_j|$ for $z = (z_1, \ldots, z_d)^T \in \mathbb{R}^d$. Similar versions of $\hat T^{\max}_{n,0}$ and $\hat T^{\mathrm{cusum}}_{n,0}$ based on the MLE have been considered by Gombay and Serban [30] and Huh, Kim and Lee [31] for the AR and Poisson INGARCH models, while $\hat T^{\min}_{n,0}$ is newly considered here. An anomaly is signaled at $k$ when $\hat T^{\min}_{n,0}(k)$, $\hat T^{\max}_{n,0}(k)$, or $\hat T^{\mathrm{cusum}}_{n,0}(k)$ exceeds a control limit for some $k = 1, \ldots, n$, and the control limit can be determined using the convergence result in Theorem 1 below.
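The component-wise maximum and the max-norm used in this notation can be illustrated with a short Python sketch. The function names and the example vectors are hypothetical and serve only to show the two operations.

```python
def componentwise_max(vectors):
    """Vector whose j-th entry is the maximum of the j-th entries of the inputs."""
    return [max(column) for column in zip(*vectors)]


def max_norm(z):
    """Largest absolute entry of the vector z."""
    return max(abs(value) for value in z)


# Two vectors z_1, z_2 in R^3 (made-up values)
zs = [[1.0, -4.0, 2.0], [3.0, 0.0, -1.0]]
running_max = componentwise_max(zs)  # [3.0, 0.0, 2.0]
largest_entry = max_norm(zs[0])      # 4.0
```

Applying `componentwise_max` to the first $k$ vectors of a sequence gives the running maximum vector that appears in the definition of the notation.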
Next, we consider the situation in which $\theta_0$ is unknown and must be estimated in the construction of the monitoring process in (5). We employ a monitoring process constructed from $\hat W_k$ with $\theta_0$ replaced by $\hat\theta_{\alpha,m}$, where $\hat\theta_{\alpha,m}$ is the MDPDE of $\theta_0$ obtained from the training sample, and with the matrix obtained by substituting $\hat\theta_{\alpha,m}$ for $\theta_0$ in $K_\alpha$ in (6). An anomaly is detected at $k$ when $\hat T^{\min}_n(k)$, $\hat T^{\max}_n(k)$, or $\hat T^{\mathrm{cusum}}_n(k)$ exceeds the control limit for some $k = 1, \ldots, n$. The control limit can be determined theoretically using the asymptotic result in Theorem 1 below. For this task, we investigate the asymptotic behavior of the monitoring processes $\hat T^{\min}_n$, $\hat T^{\max}_n$, and $\hat T^{\mathrm{cusum}}_n$ defined below.
Here, the score vectors $\partial l_{\alpha,t}/\partial\theta$ are the ones in (4). Using Donsker's invariance principle for martingale differences (Billingsley [42]) and the fact that $\sup_{0 \le s \le t} B(s) - B(t) = |B(t)|$ in distribution for any standard Brownian motion $B$, we obtain a weak convergence result expressed in terms of a $d$-dimensional standard Brownian motion $B_d$, which also shows that $T^{\min}_n$ behaves asymptotically similarly to $T^{\max}_n$. Meanwhile, the limit of $T^{\mathrm{cusum}}_n$ can be expressed in terms of a $d$-dimensional Brownian bridge $B^{\circ}_d$. Using the above facts, we obtain Theorem 1, whose proof is provided in Appendix A. The result in Theorem 1 can be used to determine a control limit for the monitoring process: given a significance level $0 < \alpha < 1$, the control limits $c$ and $c'$ are taken as the corresponding upper quantiles of the limiting distributions.

The performance of the proposed CUSUM monitoring methods is evaluated in our simulation study, focusing on $\hat T^{\mathrm{cusum}}_n$, $\hat T^{\min}_{n,0}$, and $\hat T^{\min}_n$. (We do not report the results for $\hat T^{\max}_{n,0}$ and $\hat T^{\max}_n$, as these do not perform well compared to the others in most cases.) Therein, a parametric bootstrap is adopted to obtain control limits in order to reduce the effect of parameter estimation, which can be more problematic when $m$ is not large compared to $n$, and the MDPDE from the training sample is used to generate the bootstrap samples.
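As an illustration of how a control limit can be read off a limiting distribution of this kind, the sketch below computes an upper quantile of $\sup_{0 \le s \le 1} \|B_d(s)\|_{\max}$ for a $d$-dimensional standard Brownian motion with independent components. It relies on the classical series $P(\sup_{0 \le s \le 1} |B(s)| \le c) = \frac{4}{\pi} \sum_{k \ge 0} \frac{(-1)^k}{2k+1} \exp\{-(2k+1)^2 \pi^2 / (8c^2)\}$ for a one-dimensional Brownian motion and raises it to the power $d$. The function names, the number of series terms, and the bisection bracket are assumptions made for this example only.

```python
import math


def cdf_sup_abs_bm(c, terms=200):
    """P(sup_{0<=s<=1} |B(s)| <= c) for a one-dimensional standard Brownian motion."""
    if c <= 0:
        return 0.0
    total = 0.0
    for k in range(terms):
        m = 2 * k + 1
        total += (-1) ** k / m * math.exp(-((m * math.pi) ** 2) / (8.0 * c * c))
    return 4.0 / math.pi * total


def control_limit(level, d, lo=0.1, hi=10.0, tol=1e-10):
    """Solve P(sup ||B_d(s)||_max <= c) = 1 - level for c by bisection."""
    # Independent components: the joint probability is the d-th power of the 1-d one.
    target = (1.0 - level) ** (1.0 / d)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf_sup_abs_bm(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


limit_d3 = control_limit(0.05, 3)  # approximately 2.63
```

For $d = 3$ and level 0.05, this gives a value of about 2.63. Limits involving a Brownian bridge would need a different distribution function and are not covered by this sketch.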

Simulation Results
In this section, we compare the performance of the CUSUM monitoring processes $\hat T^{\mathrm{cusum}}_n$, $\hat T^{\min}_{n,0}$, and $\hat T^{\min}_n$ in three different experimental environments (Parts 1 to 3 below) for the Poisson INGARCH(1,1) model:

$$Y_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(X_t), \qquad X_t = \omega + a X_{t-1} + b Y_{t-1}.$$

For the comparison, we compute the empirical sizes and powers at the nominal level of 0.05 for $m = n = 500, 1000$ with 1000 replications. For the critical value of $\hat T^{\min}_{n,0}$, we use 2.633, which is the 0.95 quantile of $\sup_{0 \le s \le 1} \|B_3(s)\|_{\max}$. However, for $\hat T^{\mathrm{cusum}}_n$ and $\hat T^{\min}_n$, we use critical values obtained from a parametric bootstrap method, as the MDPDE $\hat\theta_{\alpha,m}$ might cause some size distortions. In the implementation, the warp-speed bootstrap method is used to save computing time (Giacomini, Politis, and White [43]).
- Part 1. We compare the performance of the MLE- and MDPDE-based monitoring processes ($\alpha = 0, 0.1, 0.2, 0.3$) by calculating the size and power for four different cases in which the parameter changes from $(\omega_0, a_0, b_0)$ to $(\omega_1, a_1, b_1)$, where the parameter change is assumed to occur at $[n/2]$.
- Part 2. We examine the size and power for the same settings as in Part 1 when the change occurs at $[n/4]$.
- Part 3. We compare the performance of the MLE- and MDPDE-based monitoring processes ($\alpha = 0, 0.1, 0.2, 0.3$) for the same settings as in Part 1 when outliers exist in the time series, wherein the parameter change is assumed to occur at $[n/2]$. In this case, the time series samples are generated with outlier contamination.

... in Part 2 appears to increase up to that of $\hat T^{\min}_{n,0}$. In both Part 1 and Part 2, the choice of $\alpha$ does not affect the size much, but a larger $\alpha$ tends to diminish the power. This agrees with our intuition, as the MLE is more efficient in the absence of outliers. Meanwhile, Tables 9-12 show that the outliers undermine the performance of the MLE-based monitoring processes in terms of both size and power; namely, size distortions are notable and the power decreases to a certain extent. This result indicates, in particular, that $\hat T^{\mathrm{cusum}}_n$ is improved when the MDPDE with $\alpha > 0$ is used, which demonstrates the efficacy of the MDPDE in the monitoring process. By contrast, the size of $\hat T^{\min}_n$ increases significantly when $\alpha > 0$, indicating that $\hat T^{\min}_n$ is unstable; see Figure 2. Although not reported here, we also examined the performance of the same monitoring processes for NB INGARCH(1,1) models. The results showed a pattern similar to that of the Poisson INGARCH(1,1) case. All our findings strongly affirm that $\hat T^{\mathrm{cusum}}_n$ is the most favorable among the monitoring methods considered in this study.

Real Data Analysis
In this section, we apply $\hat T^{\mathrm{cusum}}_n$ to a real dataset, using the extreme events of the daily log-returns of GS stock from 2 July 2007 to 28 February 2020. Davis and Liu [11] and Kim and Lee [37] used the GS stock datasets with different periods, but their works focused on parameter estimation and the retrospective change point test. For the task of online monitoring, we first calculated the hitting times, $\tau_1, \tau_2, \ldots$, at which the log-returns of GS stock fall outside the 0.05 and 0.95 quantiles of the data, and generated the time series of counts $Y_t = \tau_t - \tau_{t-1} \ge 0$, $t = 1, \ldots, 319$. Figure 3 plots $Y_t$ and exhibits the presence of a number of outliers. Fitting the Poisson INGARCH(1,1) model to the entire series, we obtain the MLE $(\hat\omega, \hat a, \hat b) = (1.969, 0.152, 0.664)$ and the MDPDE $(\hat\omega, \hat a, \hat b) = (1.213, 0.144, 0.472)$ when $\alpha = 0.1$ is used. The considerable difference between the two estimates is seemingly due to the presence of outliers. Using $Y_t$, $t = 1, \ldots, 150$, as a training sample and viewing $Y_t$, $t \ge 151$, as sequentially observed testing data, we implement the monitoring process $\hat T^{\mathrm{cusum}}_n$ with $\alpha = 0, 0.1$ to detect a parameter change. An anomaly is detected at $t = 180$ for $\alpha = 0$ (blue vertical line) and at $t = 197$ for $\alpha = 0.1$ (red vertical line), which indicates that the monitoring process based on the MLE is more sensitive to the relatively smaller outliers lying around $t = 180$, while that based on the MDPDE is more robust to those outliers and detects a more significant change around $t = 197$, ignoring the smaller ones. Figure 3 shows that $Y_t$ exhibits more fluctuations after $t = 180$. Our finding affirms the adequacy of the MDPDE-based monitoring process in the presence of outliers.
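The construction of the count series from hitting times can be sketched as follows. This is only an illustration of the idea: the quantile rule of Python's `statistics.quantiles`, the 1-based time index, the use of gaps between successive hitting times only (the first hitting time produces no count), and the function names are all assumptions of this example, not details taken from the analysis above.

```python
import statistics


def return_times(returns, lower, upper):
    """Gaps between successive times at which a return falls outside [lower, upper]."""
    hits = [t for t, r in enumerate(returns, start=1) if r < lower or r > upper]
    return [later - earlier for earlier, later in zip(hits, hits[1:])]


def extreme_event_counts(returns):
    """Return times of events outside the empirical 0.05 and 0.95 quantiles."""
    cuts = statistics.quantiles(returns, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    return return_times(returns, cuts[0], cuts[-1])
```

For example, `return_times([0.0, 0.5, -0.6, 0.1, 0.0, 0.7], -0.5, 0.4)` has hitting times 2, 3, and 6 and therefore yields the counts `[1, 3]`.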

Concluding Remarks
In this work, we studied a robust on-line monitoring process based on the MDPDE for detecting a parameter change in INGARCH models. For this task, we adopted the CUSUM process based on the score functions, which were originally constructed for obtaining the MDPDE. Our simulation study and real data analysis confirmed the validity of the proposed method. Here, we focused on the monitoring process within the framework of Gombay and Serban [30] and Huh, Kim and Lee [31]. However, one can also consider a different monitoring scheme, for example as in Na, Lee and Lee [44], and conduct a comparison study, which is left for future work.

Appendix A
... Then, the arguments used for (A3) and (A4) imply $\hat T^{\mathrm{cusum}}_n - T^{\mathrm{cusum}}_n = o_P(1)$, so that $\hat T^{\mathrm{cusum}}_n \xrightarrow{d} T$ holds owing to (9). This completes the proof.