Detection of Parameter Change in Random Coefficient Integer-Valued Autoregressive Models

This paper considers the problem of testing for parameter change in random coefficient integer-valued autoregressive models. To overcome the size distortions of the existing estimate-based cumulative sum (CUSUM) test, we propose an estimating function-based (EF-based) test and a residual-based CUSUM test. More specifically, we employ the estimating function of the conditional least squares estimator (CLSE). Under regularity conditions and the null hypothesis, we derive the limiting distributions of both test statistics. Simulation results demonstrate the validity of the proposed tests. A real data analysis is performed on the polio incidence data.


Introduction
In recent years, time series of counts have been widely observed in real-world applications, for instance, the monthly number of people with a certain disease, the number of transactions per minute of some stock, the number of accidents per day, and so on. Among the existing models for analyzing such data sets, autoregressive moving average (ARMA)-type models based on a thinning operator, referred to as integer-valued ARMA models, remain popular because they provide a convenient way to transfer the classical ARMA recursion to discrete-valued time series (cf. Fokianos [1]). Reviews of these models are given by McKenzie [2], Weiß [3], Scotto et al. [4] and the references cited therein.
As addressed in Kang and Lee [5], integer-valued time series, particularly in epidemiology, often undergo a significant change as a result of changes in the quality of health care and the state of patients' health. It is well known that such a change can adversely affect statistical inference and that ignoring a parameter change can lead to a false conclusion. Thus, change point detection has attracted a lot of attention. In the field of integer-valued time series, Fokianos and Fried [6,7] investigated a testing procedure for the detection of intervention effects in linear and log-linear Poisson autoregressive (AR) models. Szabó [8] proposed a test for a change in several crucial parameters of integer-valued autoregressive (INAR(p)) models. Kang and Lee [5,9] constructed the estimate-based cumulative sum (CUSUM) tests for parameter change in random coefficient integer-valued autoregressive (RCINAR) models and Poisson AR models, respectively. More recently, Pap and Szabó [10] developed change detection methods for INAR(p) processes in general and provided results under the alternative hypothesis. Doukhan and Kengne [11] proposed two tests based on the likelihood of the observations in a general class of Poisson AR models. Hudecová et al. [12–15] studied methods for detecting structural changes in INAR and Poisson AR models that incorporate the empirical probability generating function, and Kang and Song [16] constructed a score test in Poisson AR models.
This study is concerned with the change point problem in RCINAR models. The random coefficient setting reflects that the autoregressive coefficient may vary randomly over time due to environmental factors (cf. Zheng et al. [17], Leonenko et al. [18] and Gomes and Canto e Castro [19]).
First, the thinning operator is defined as follows. Let $X$ be a non-negative integer-valued random variable and $\phi \in [0, 1]$; then the thinning operator "$\bullet$" takes the form
\[ \phi \bullet X = \sum_{i=1}^{X} B_i, \]
where $\{B_i\}$ is an i.i.d. Bernoulli random sequence with mean $\phi$ that is independent of $X$ (cf. Steutel and Van Harn [26]). With this operator, the RCINAR model is defined by
\[ X_t = \phi_t \bullet X_{t-1} + Z_t, \quad t \ge 1, \tag{1} \]
where $\{\phi_t\}$ is an i.i.d. sequence of random variables taking values in $[0, 1)$, $\{Z_t\}$ is an i.i.d. sequence of non-negative integer-valued random variables that is independent of $\{\phi_t\}$, and the counting sequences $\{B_i\}$ involved in $\phi_t \bullet X_{t-1}$ for $t \ge 1$ are mutually independent and independent of $\{Z_t\}$. Note that, conditioned on $X_{t-1}$ and $\phi_t$, $\phi_t \bullet X_{t-1}$ follows a binomial distribution with parameters $X_{t-1}$ and $\phi_t$. Assume that $E(\phi_t^2) < \infty$ and $E(Z_t^4) < \infty$. According to Proposition 2.2 of Zheng et al. [17], under these assumptions, the Markov chain $\{X_t\}$ has a unique stationary distribution. From now on, we suppose that the distribution of the initial value $X_0$ coincides with this stationary distribution, so that the sequence $\{X_t\}$ is strictly stationary.
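To make the recursion concrete, the following minimal sketch simulates a path of the RCINAR model using the conditional binomial representation of the thinning operator noted above. It assumes Beta-distributed coefficients and Poisson innovations (the setting used later in the simulation study); the function name `simulate_rcinar` and its arguments are illustrative and not part of the paper.

```python
import numpy as np

def simulate_rcinar(n, a, b, lam, burn_in=1000, seed=None):
    """Simulate n observations of X_t = phi_t . X_{t-1} + Z_t.

    Assumptions (illustrative): phi_t ~ Beta(a, b), Z_t ~ Poisson(lam),
    and a burn-in period is discarded to approximate stationarity.
    """
    rng = np.random.default_rng(seed)
    total = burn_in + n
    phi = rng.beta(a, b, size=total)      # random coefficients phi_t
    z = rng.poisson(lam, size=total)      # innovations Z_t
    x = np.empty(total, dtype=np.int64)
    prev = 0
    for t in range(total):
        # Given X_{t-1} and phi_t, the thinned value is Binomial(X_{t-1}, phi_t).
        prev = rng.binomial(prev, phi[t]) + z[t]
        x[t] = prev
    return x[burn_in:]

if __name__ == "__main__":
    path = simulate_rcinar(300, a=4, b=8, lam=1.0, seed=0)
    print(path[:10], path.mean())
```

With a = 4, b = 8 and λ = 1, the stationary mean is λ/(1 − φ) = 1.5, since φ = a/(a + b) = 1/3, so the sample mean of a long path should lie near that value.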
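Let $\theta = (\phi, \lambda)^T = (E(\phi_t), E(Z_t))^T$, and denote the true value of $\theta$ by $\theta_0 = (\phi_0, \lambda_0)^T$. To estimate the unknown parameters, we consider the CLSE. Suppose that $X_0, X_1, \ldots, X_n$ from the model (1) are observed. Then, the CLSE $\hat\theta_n = (\hat\phi_n, \hat\lambda_n)^T$ is obtained by minimizing the conditional sum of squares
\[ S_n(\theta) = \sum_{t=1}^{n} \bigl(X_t - E(X_t \mid X_{t-1})\bigr)^2 = \sum_{t=1}^{n} (X_t - \phi X_{t-1} - \lambda)^2 \]
over $\mathbb{R}^2$, and is given by
\[ \hat\phi_n = \frac{n \sum_{t=1}^{n} X_t X_{t-1} - \sum_{t=1}^{n} X_t \sum_{t=1}^{n} X_{t-1}}{n \sum_{t=1}^{n} X_{t-1}^2 - \bigl(\sum_{t=1}^{n} X_{t-1}\bigr)^2}, \qquad \hat\lambda_n = \frac{1}{n}\Bigl(\sum_{t=1}^{n} X_t - \hat\phi_n \sum_{t=1}^{n} X_{t-1}\Bigr). \]
Throughout the paper, we use $\partial_\theta$ and $\partial^2_\theta$ to denote $\partial/\partial\theta$ and $\partial^2/\partial\theta\,\partial\theta^T$, respectively. The symbol $\|\cdot\|$ denotes the $l_2$ norm for matrices and vectors, and $E(\cdot)$ is taken under $\theta_0$. The symbols $\xrightarrow{d}$ and $\xrightarrow{p}$ denote convergence in distribution and convergence in probability, respectively. Almost sure convergence is written as "a.s.".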
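Because the conditional mean is linear in θ, the CLSE is an ordinary least squares fit of X_t on X_{t−1} with an intercept. A minimal sketch of the closed-form computation follows; the function name `clse` is illustrative.

```python
import numpy as np

def clse(x):
    """Conditional least squares estimates (phi_hat, lam_hat).

    x holds X_0, X_1, ..., X_n; the fit regresses X_t on X_{t-1}, t = 1..n.
    """
    x = np.asarray(x, dtype=float)
    y, lag = x[1:], x[:-1]
    n = len(y)
    denom = n * np.sum(lag ** 2) - np.sum(lag) ** 2
    phi_hat = (n * np.sum(y * lag) - np.sum(y) * np.sum(lag)) / denom
    lam_hat = (np.sum(y) - phi_hat * np.sum(lag)) / n
    return phi_hat, lam_hat

if __name__ == "__main__":
    # Toy sequence satisfying X_t = 0.5 * X_{t-1} + 1 exactly.
    print(clse([10, 6, 4, 3, 2.5]))
```

On the toy sequence the fit is exact, so the estimates are φ = 0.5 and λ = 1; on count data the estimates are of course subject to sampling error.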
We define the function $g(\cdot, \cdot)$ by $g(\theta, x) = \phi x + \lambda$; then $S_n(\theta)$ can be written in the form $\sum_{t=1}^{n} (X_t - g(\theta, X_{t-1}))^2$. The following result can then be established by checking the regularity conditions in Klimko and Nelson [27].

Theorem 1.
We have that $\hat\theta_n$ converges to $\theta_0$ almost surely and where $V$ and $W$ are positive definite matrices defined by
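For orientation, the classical conditional least squares result of Klimko and Nelson [27] has the following general shape. Identifying the paper's $V$ and $W$ with these expressions is an assumption made here for illustration; the paper's own normalization may differ by constant factors.

```latex
% Sketch under the stated assumption; not the paper's own display.
\[
\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N\bigl(0,\; V^{-1} W V^{-1}\bigr),
\]
\[
V = E\bigl[\partial_\theta g(\theta_0, X_0)\,\partial_\theta g(\theta_0, X_0)^T\bigr],
\qquad
W = E\bigl[(X_1 - g(\theta_0, X_0))^2\,\partial_\theta g(\theta_0, X_0)\,\partial_\theta g(\theta_0, X_0)^T\bigr],
\]
\[
\partial_\theta g(\theta, x) = (x, 1)^T .
\]
```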

Parameter Change Test for RCINAR Models
In this section, we consider the problem of testing the following hypotheses: $H_0$: $\theta$ does not change over $X_1, \ldots, X_n$ vs. $H_1$: not $H_0$.
To this end, we employ an EF-based test and a residual-based CUSUM test.

EF-Based Test
First, we consider the EF-based test using the partial sum process of the following estimating function:
\[ \partial_\theta S_n(\theta) = -2 \sum_{t=1}^{n} \bigl(X_t - g(\theta, X_{t-1})\bigr)\, \partial_\theta g(\theta, X_{t-1}). \]
Just as the estimate-based test in Kang and Lee [9] is constructed from the differences $\hat\theta_k - \hat\theta_n$, we construct a test statistic using the differences $\partial_\theta S_k(\hat\theta_n) - \partial_\theta S_n(\hat\theta_n)$. Since $\partial_\theta S_n(\hat\theta_n) = 0$, these differences reduce to $\partial_\theta S_k(\hat\theta_n)$. The test statistic is then proposed as the maximum value of a function of $\partial_\theta S_k(\hat\theta_n)$. To derive its limiting distribution, we need the limiting distribution of $\partial_\theta S_{[ns]}(\hat\theta_n)$ for each $s \in [0, 1]$.
By Taylor's theorem, we have that for each $s \in [0, 1]$, It follows from (2) and $\partial_\theta S_n(\hat\theta_n) = 0$ that for $s = 1$, a.s. by the ergodicity of $\{X_t\}$. The above equation can be rewritten as Consequently, from (2) and (3), we can write that In fact, it can be verified that Here, $D$ denotes the function space equipped with the Skorohod topology and the symbol $\xrightarrow{w}$ denotes weak convergence in the function space. Moreover, $W^{-1/2}$ denotes the inverse of the unique positive definite square root of the positive definite matrix $W$. Furthermore, $I_n$ and $II_n$ are asymptotically negligible (see Lemmas A2 and A3 in the Appendix, respectively). Hence, combining the above arguments, we obtain our first main result.
where $\hat{W}_n$ is a consistent estimator of $W$. We reject $H_0$ if $T_n^{EF}$ is large.
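The argument leading to the EF-based statistic can be sketched as follows. This is a reconstruction from the definitions of $S_n$ and $g$ rather than the paper's own displays: the normalizing constant $1/(4n)$, the definitions of $V$ and $W$ used here, and the form of the limit (with $B_2^\circ$ a two-dimensional standard Brownian bridge) are assumptions based on the usual CUSUM construction.

```latex
% Sketch only; constants and notation are assumptions.
\begin{align*}
\partial_\theta S_k(\theta)
  &= -2 \sum_{t=1}^{k} \bigl(X_t - g(\theta, X_{t-1})\bigr)\,\partial_\theta g(\theta, X_{t-1}),
  \qquad \partial_\theta g(\theta, x) = (x, 1)^T, \\
\partial_\theta S_{[ns]}(\hat\theta_n)
  &= \partial_\theta S_{[ns]}(\theta_0) + \partial^2_\theta S_{[ns]}\,(\hat\theta_n - \theta_0)
  \quad \text{(exact, since $g$ is linear in $\theta$)}, \\
\frac{1}{n}\,\partial^2_\theta S_{[ns]}
  &= \frac{2}{n} \sum_{t=1}^{[ns]} \partial_\theta g(\theta_0, X_{t-1})\,\partial_\theta g(\theta_0, X_{t-1})^T
  \xrightarrow{a.s.} 2 s V,
  \qquad V = E\bigl[\partial_\theta g(\theta_0, X_0)\,\partial_\theta g(\theta_0, X_0)^T\bigr], \\
\sqrt{n}\,(\hat\theta_n - \theta_0)
  &= -\Bigl(\tfrac{1}{n}\,\partial^2_\theta S_n\Bigr)^{-1} \tfrac{1}{\sqrt{n}}\,\partial_\theta S_n(\theta_0)
  \quad \text{(from $s = 1$ and $\partial_\theta S_n(\hat\theta_n) = 0$)}, \\
-\frac{1}{2\sqrt{n}}\,\partial_\theta S_{[ns]}(\hat\theta_n)
  &= -\frac{1}{2\sqrt{n}} \bigl\{\partial_\theta S_{[ns]}(\theta_0) - s\,\partial_\theta S_n(\theta_0)\bigr\} + o_p(1)
  \xrightarrow{w} W^{1/2} B_2^\circ(s), \\
T_n^{EF}
  &= \max_{1 \le k \le n} \frac{1}{4n}\,\partial_\theta S_k(\hat\theta_n)^T\,\hat{W}_n^{-1}\,\partial_\theta S_k(\hat\theta_n)
  \xrightarrow{d} \sup_{0 \le s \le 1} \bigl\| B_2^\circ(s) \bigr\|^2 .
\end{align*}
```

Here $W = E[(X_1 - g(\theta_0, X_0))^2\,\partial_\theta g(\theta_0, X_0)\,\partial_\theta g(\theta_0, X_0)^T]$ is the long-run covariance of the martingale difference summands, which is why only a consistent estimator of $W$ (and not of $V$) enters the statistic.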

Remark 1.
As a consistent estimator of $W$, one can consider using
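A minimal numerical sketch of an EF-based statistic of this type is given below. It assumes the statistic is the maximum of a quadratic form in the partial sums of the estimating function, scaled by 1/n, and it assumes the plug-in estimator $\hat{W}_n = n^{-1} \sum_t \hat\epsilon_t^2\, \partial_\theta g\, \partial_\theta g^T$ with $\hat\epsilon_t = X_t - \hat\phi_n X_{t-1} - \hat\lambda_n$. Both choices, and the name `ef_test_statistic`, are illustrative rather than taken from the paper.

```python
import numpy as np

def ef_test_statistic(x):
    """Return (statistic, argmax index k) for an EF-type CUSUM test.

    Assumed form: max_k (1/n) * H_k^T W_hat^{-1} H_k, where H_k is the
    partial sum of resid_t * (X_{t-1}, 1)^T, i.e. -1/2 times the gradient
    of S_k evaluated at the CLSE.
    """
    x = np.asarray(x, dtype=float)
    y, lag = x[1:], x[:-1]
    n = len(y)
    design = np.column_stack([lag, np.ones(n)])       # rows are (X_{t-1}, 1)
    theta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)  # CLSE via OLS
    resid = y - design @ theta_hat
    h = design * resid[:, None]                       # estimating-function terms
    w_hat = h.T @ h / n                               # assumed plug-in estimator of W
    partial = np.cumsum(h, axis=0)                    # H_k for k = 1..n
    quad = np.einsum("ki,ij,kj->k", partial, np.linalg.inv(w_hat), partial) / n
    return float(quad.max()), int(np.argmax(quad)) + 1

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    series = np.concatenate([rng.poisson(1.0, 150), rng.poisson(6.0, 150)])
    print(ef_test_statistic(series))
```

The last partial sum is zero up to rounding because the CLSE solves the estimating equation, which mirrors the identity $\partial_\theta S_n(\hat\theta_n) = 0$ used in the text.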

Residual-Based CUSUM Test
Instead of the EF-based test, we consider the test statistic based on the residuals, which may be defined as the difference between $X_t$ and its conditional expectation (cf. Freeland and McCabe [28]). For RCINAR models, the residuals are obtained as $\epsilon_t(\theta_0) = X_t - \phi_0 X_{t-1} - \lambda_0$. Let $\mathcal{F}_t$ be the σ-field generated by $\{X_s;\ s \le t\}$. Since $\{\epsilon_t(\theta_0), \mathcal{F}_t, 1 \le t \le n\}$ forms a sequence of martingale differences, the invariance principle shows that
\[ \frac{1}{\sqrt{n}\,\tau} \sum_{t=1}^{[ns]} \epsilon_t(\theta_0) \xrightarrow{w} B(s), \tag{4} \]
where $B$ is a standard Brownian motion and $\tau^2 = \mathrm{Var}(\epsilon_1(\theta_0))$. This allows us to construct the residual-based CUSUM test. Here, we replace the residuals with $\epsilon_t(\hat\theta_n) = X_t - \hat\phi_n X_{t-1} - \hat\lambda_n$, where $\hat\phi_n$ and $\hat\lambda_n$ are the CLSE of $\phi_0$ and $\lambda_0$, respectively. Using the fact that $\sum_{t=1}^{n} \epsilon_t(\hat\theta_n) = 0$, we propose the test statistic as follows: From Lemmas A4 and A5, we can see that respectively. Owing to these and (4), we have the second main result.

Theorem 3.
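Under $H_0$, we have
\[ T_n^{R} \xrightarrow{d} \sup_{0 \le s \le 1} \bigl| B^\circ(s) \bigr|, \]
where $B^\circ$ denotes a standard Brownian bridge. We reject $H_0$ if $T_n^{R}$ is large.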
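The residual-based statistic can be sketched numerically as follows. The code assumes the usual CUSUM form $\max_k |\sum_{t \le k} \hat\epsilon_t| / (\sqrt{n}\,\hat\tau_n)$ with $\hat\tau_n^2 = n^{-1} \sum_t \hat\epsilon_t^2$; this variance estimator and the name `residual_cusum_statistic` are assumptions made for illustration.

```python
import numpy as np

def residual_cusum_statistic(x):
    """Return (statistic, argmax index k) for a residual-based CUSUM test.

    Assumed form: max_k |sum_{t<=k} resid_t| / (sqrt(n) * tau_hat),
    with tau_hat^2 the mean squared CLSE residual.
    """
    x = np.asarray(x, dtype=float)
    y, lag = x[1:], x[:-1]
    n = len(y)
    design = np.column_stack([lag, np.ones(n)])
    theta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)  # CLSE via OLS
    resid = y - design @ theta_hat        # residuals sum to zero by construction
    tau_hat = np.sqrt(np.mean(resid ** 2))
    path = np.abs(np.cumsum(resid)) / (np.sqrt(n) * tau_hat)
    return float(path.max()), int(np.argmax(path)) + 1

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    series = np.concatenate([rng.poisson(1.0, 150), rng.poisson(6.0, 150)])
    print(residual_cusum_statistic(series))
```

The index returned alongside the statistic is the natural estimate of the change location, in the same way that the maximizer of $T^{R}_{n,k}$ is used in the real data analysis.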

Simulation Results
In this section, we evaluate the performance of our tests $T_n^{EF}$ and $T_n^{R}$. For comparison, we additionally perform the estimate-based CUSUM test, $T_n^{CLS}$, of Kang and Lee [9], who derived its limiting distribution under $H_0$. We consider the RCINAR model
\[ X_t = \phi_t \bullet X_{t-1} + Z_t, \tag{5} \]
where $\{\phi_t\}$ is an i.i.d. sequence of Beta random variables with parameters $(a, b)$ and $\{Z_t\}$ is an i.i.d. Poisson sequence with mean $\lambda$. Here, we evaluate $T_n^{EF}$, $T_n^{R}$ and $T_n^{CLS}$ with sample sizes $n = 300, 500, 1000$ at the nominal level 0.05; the associated critical values, obtained through Monte Carlo simulations, are 2.408, 1.353 and 2.408, respectively. For each simulation, the first 1000 observations are discarded to avoid initialization effects. The empirical sizes and powers are calculated as the proportion of rejections of the null hypothesis over 1000 repetitions.
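Critical values of this kind can be approximated by simulating discretized Brownian bridges. The sketch below assumes the limiting variables are $\sup_s \|B_2^\circ(s)\|^2$ for the two-dimensional tests and $\sup_s |B^\circ(s)|$ for the residual-based test; the function name, grid size and number of replications are illustrative choices, so the output only approximates the values quoted above.

```python
import numpy as np

def bridge_sup_quantile(dim, level, squared=True, n_steps=500, n_rep=2000, seed=0):
    """Monte Carlo (1 - level) quantile of sup_s ||B_dim^o(s)|| (or its square).

    Brownian bridges are approximated on a grid of n_steps points.
    """
    rng = np.random.default_rng(seed)
    incr = rng.standard_normal((n_rep, n_steps, dim)) / np.sqrt(n_steps)
    w = np.cumsum(incr, axis=1)                       # Brownian motion paths
    s = np.arange(1, n_steps + 1)[None, :, None] / n_steps
    bridge = w - s * w[:, -1:, :]                     # B(s) - s * B(1)
    sup_norm2 = np.sum(bridge ** 2, axis=2).max(axis=1)
    sup = sup_norm2 if squared else np.sqrt(sup_norm2)
    return float(np.quantile(sup, 1 - level))

if __name__ == "__main__":
    print(bridge_sup_quantile(2, 0.05))                  # compare with 2.408
    print(bridge_sup_quantile(1, 0.05, squared=False))   # compare with 1.353
```

A finer grid and more replications reduce both the discretization bias (the discrete maximum slightly underestimates the supremum) and the Monte Carlo error.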
In order to calculate empirical sizes, observations are generated from the model (5) with $(b, \lambda) = (1, 1), (2, 1), (4, 1), (8, 1), (16, 1)$ for fixed $a = 4$. Since $\phi = E(\phi_t) = a/(a + b)$ varies with $a$ and $b$, we consider various combinations of $(b, \lambda)$ to detect changes in the parameters $\phi$ and $\lambda$. Note that $\phi$ moves toward 1 as $b$ decreases to 1. The empirical sizes are plotted in Figure 1, where the horizontal dashed lines represent the nominal level 0.05. The empirical sizes are considered adequate if they lie near the horizontal dashed lines. From the figure, it can be seen that neither $T_n^{EF}$ nor $T_n^{R}$ exhibits severe size distortions, even when $b$ is close to 1. In contrast, as seen in Kang and Lee [9], $T_n^{CLS}$ shows severe size distortions when $b$ is close to 1. Although not reported here, the results for other values of $\lambda$ are similar to the case of $\lambda = 1$. Hence, our tests remedy this defect of the existing test $T_n^{CLS}$.
In order to examine the empirical powers, we consider alternatives under which the parameter changes from $\theta_0$ to $\theta_1$. In particular, we consider the following two cases:
(i) $\lambda_0 = 1$ changes to $\lambda_1 = 1.2, 1.4, 1.6, 1.8, 2.0$ and $b = 4$ does not change.
(ii) $b_0 = 8$ changes to $b_1 = 7, 6$ and $\lambda$ changes in the same way as in (i).
Figures 2-4 show that all the tests produce reasonably good powers and that the power increases as either the distance between $\theta_0$ and $\theta_1$ or $n$ increases. Overall, our simulation results demonstrate the validity of $T_n^{EF}$ and $T_n^{R}$.
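The size and power experiments can be mimicked on a small scale with the sketch below, which estimates the rejection rate of a residual-based CUSUM test for Beta/Poisson RCINAR data. Several choices are assumptions made for illustration: the change is placed at the midpoint of the sample, the statistic uses the mean squared residual as variance estimator, and the defaults use fewer repetitions and a shorter burn-in than the 1000 used in the paper to keep the run time low.

```python
import numpy as np

def rejection_rate(n, a, b0, lam0, b1=None, lam1=None,
                   crit=1.353, n_rep=200, burn_in=300, seed=0):
    """Empirical rejection rate of a residual-based CUSUM test.

    Parameters switch from (b0, lam0) to (b1, lam1) after t = n // 2
    (an assumed change location); leaving b1 and lam1 as None gives the size.
    """
    rng = np.random.default_rng(seed)
    b1 = b0 if b1 is None else b1
    lam1 = lam0 if lam1 is None else lam1
    x = np.zeros(n_rep, dtype=np.int64)        # all replications advance together
    paths = np.empty((n_rep, n + 1))
    for t in range(-burn_in, n + 1):
        b, lam = (b1, lam1) if t > n // 2 else (b0, lam0)
        phi = rng.beta(a, b, size=n_rep)
        x = rng.binomial(x, phi) + rng.poisson(lam, size=n_rep)
        if t >= 0:
            paths[:, t] = x
    y, lag = paths[:, 1:], paths[:, :-1]
    yc = y - y.mean(axis=1, keepdims=True)
    lc = lag - lag.mean(axis=1, keepdims=True)
    phi_hat = (lc * yc).sum(axis=1) / (lc ** 2).sum(axis=1)   # CLSE per replication
    lam_hat = y.mean(axis=1) - phi_hat * lag.mean(axis=1)
    resid = y - phi_hat[:, None] * lag - lam_hat[:, None]
    tau_hat = np.sqrt((resid ** 2).mean(axis=1))
    stat = np.abs(np.cumsum(resid, axis=1)).max(axis=1) / (np.sqrt(n) * tau_hat)
    return float(np.mean(stat > crit))

if __name__ == "__main__":
    print("size :", rejection_rate(300, a=4, b0=8, lam0=1.0))
    print("power:", rejection_rate(300, a=4, b0=8, lam0=1.0, lam1=2.0))
```

Running all replications in parallel as array columns avoids a Python-level loop over repetitions, which is what makes larger experiments of this kind practical.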

Real Data Analysis
In this section, we apply the tests proposed in Section 3 to the monthly counts of poliomyelitis cases in the US from January 1970 through December 1983, as reported by the Centers for Disease Control and Prevention. The polio incidence data set is one of the best-known data sets in the context of time series of counts. It has been studied previously by many researchers, such as Zeger [29], Davis et al. [30], Jung and Tremayne [31] and Kang and Lee [5,9]. The data are plotted in Figure 5 and consist of 168 observations. Based on the sample ACF and the observed spikes, we fit the RCINAR model to the polio incidence data and examine whether a parameter change exists. In order to test for a change in $(\phi, \lambda)$, $T_n^{EF}$ and $T_n^{R}$ are performed at the nominal level 0.1; the corresponding critical values are 2.054 and 1.212, respectively (cf. the horizontal lines in Figure 6). As a result, we obtain $T_{168}^{EF} = 2.166$ and $T_{168}^{R} = 1.29$, indicating rejection of the null hypothesis. Since both $T_{168,k}^{EF}$ and $T_{168,k}^{R}$ attain their maximum at $k = 35$ (cf. Figure 6), the location of the change can be estimated as November 1972. This agrees with the results of Kang and Lee [5,9].
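The mapping from the maximizing index to a calendar month is simple arithmetic; a small sketch, assuming that observation 1 corresponds to January 1970 (the helper name `index_to_month` is illustrative):

```python
from datetime import date

def index_to_month(k, start_year=1970, start_month=1):
    """Return the first day of the month of the k-th monthly observation (1-based)."""
    offset = (start_month - 1) + (k - 1)
    return date(start_year + offset // 12, offset % 12 + 1, 1)

if __name__ == "__main__":
    print(index_to_month(35))    # month of the estimated change point
    print(index_to_month(168))   # month of the last observation
```

For k = 35 this gives November 1972, and for k = 168 it gives December 1983, matching the span of the data.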
As already shown in Kang and Lee [9], the data in the first period, from January 1970 through October 1972, follow an RCINAR model with The figures within the parentheses denote the corresponding standard errors. It can be seen that the estimated parameters in the first period differ from those in the second period. This indicates that ignoring a parameter change can lead to a false result. Furthermore, Figure 7 displays the polio series with horizontal lines indicating the sample means of the first and second periods, which are 2.95 and 1.15, respectively. It is quite evident that the series before and after November 1972 have different levels. Overall, the existence of a change is supported by these data.

Conclusions
In this study, we constructed an estimating function-based test and a residual-based CUSUM test to detect a parameter change in RCINAR models and derived their limiting null distributions. According to the simulation results, the proposed tests produce stable sizes, even when the true parameter lies near the boundary of the parameter space, and reasonably good powers. Additionally, through a real data analysis, we demonstrated that there exists a parameter change in the polio incidence data, which is consistent with previous research. Therefore, our tests can be useful tools for detecting parameter changes.
We anticipate that our tests can extend to other types of integer-valued models. Although we only derived asymptotic null distributions of the proposed tests, the behavior under the alternative, i.e., the consistency of the tests, is also of great interest. Indeed, there are several studies such as Pap and Szabó [10], Hudecová et al. [15] and Doukhan and Kengne [11] dealing with the consistency of each test in time series of counts. As with their studies, we presume that our tests also have the consistency property based on our simulation results (not reported). We leave these issues as a task for our future study.