Detecting Structural Change Point in ARMA Models via Neural Network Regression and LSCUSUM Methods

This study considers the change point testing problem in autoregressive moving average (ARMA)(p, q) models through the location- and scale-based cumulative sum (LSCUSUM) method combined with neural network regression (NNR). We estimated the model parameters via the NNR method based on a training sample, where a long AR model was first fitted to obtain the residuals. We then selected the optimal orders p and q of the ARMA model using the Akaike information criterion based on a validation set. Finally, we used the forecasting errors obtained from the selected model to construct the LSCUSUM test. Extensive simulations and applications to three real datasets show that the proposed NNR-based LSCUSUM test performs well.


Introduction
The change point problem has been extensively studied in time series analysis because time series often suffer structural changes due to events such as shifts in government policy, health care quality, and critical social events. The problem is also crucial in engineering, medicine, and economics, and numerous related articles have been published. We refer the reader to Csörgő and Horváth [1], Hušková et al. [2], Franke et al. [3], Kang and Lee [4], Lee and Chen [5], and Huh et al. [6] for a general review.
The cumulative sum (CUSUM) test proposed by Page [7] is one of the most popular methods for change point testing owing to its convenience of use. Lee and Na [8] proposed the conventional estimate-based CUSUM test, which generally performs well but suffers from severe size distortions and produces low power on some occasions. Na et al. [9] and Lee [10] suggested the residual-based CUSUM test for time series models as a remedy. This test is more stable, but it suffers serious power loss when location parameters change because it can only detect changes in scale parameters. To improve upon this shortcoming, Oh [11,12] proposed a score vector-based CUSUM test and a modified residual-based CUSUM method to enhance the power performance. Lee [13] introduced a location- and scale-based CUSUM (LSCUSUM) test, and simulations confirmed its improved stability and power over the estimate-, residual-, and score vector-based CUSUM tests developed thus far.
A key step in constructing the LSCUSUM test is to accurately estimate the residuals; that is, fitting a correct model is important if we want to test for change points in time series. The most popular time series fitting approach is the classical ARMA method, which uses quasi-maximum likelihood estimation (QMLE) or least squares estimation (LSE) to fit an ARMA model. Classical linear ARMA models produce accurate predictions when a time series truly follows them, but the predictions deteriorate if the time series has obvious nonlinear characteristics. Because many time series carry nonlinear characteristics, new analytical tools such as neural network regression (NNR), support vector regression (SVR), and the least absolute shrinkage and selection operator (LASSO) have been applied to fit them and have been demonstrated to outperform classical time series models, especially when a time series has nonlinear and nonstationary characteristics. For surveys, we refer the reader to Hwarng and Ang [14], Hossain et al. [15], Zafar et al. [16], and Giacomazzo [17]. Harchaoui and Lévy-Leduc [18] treated change point estimation as a variable selection problem and used the LASSO method to estimate multiple change points in one-dimensional piecewise constant signals. Chan et al. [19] applied a group LASSO method to estimate multiple change points in a time series. Jin et al. [20] proposed a fast algorithm to estimate multiple change points in nonstationary time series models. Qian and Su [21] studied multiple change point estimation for panel data models via an adaptive group fused LASSO. For more studies on change point estimation via penalized model selection methods, please refer to Lee et al. [22], Jin et al. [23], and Ciuperca and Maciak [24], among many others.
Lee et al. [25,26] proposed an SVR with a hybridization of the CUSUM method to detect change points in ARMA and generalized autoregressive conditional heteroscedastic (GARCH) models. As is well known, the NNR can approximate the nonlinearity of a time series without knowing the underlying dynamic structure by minimizing the empirical risk. Motivated by this, we adopted NNR to obtain ARMA residuals, and then constructed the LSCUSUM statistic to test change points. Our simulations show that the proposed method outperforms methods based on SVR, the adaptive least absolute shrinkage and selection operator (ALASSO), and classical ARMA.
The rest of this paper is organized as follows. Section 2 presents the principle of the LSCUSUM test for ARMA models. Section 3 introduces the NNR method in a general framework. Section 4 proposes a forecasting method based on fitting an ARMA model via NNR, hereafter called the NNR-ARMA model, and then explains how to apply the NNR-ARMA model to the construction of the LSCUSUM test. Section 5 performs Monte Carlo simulations to evaluate the validity of the proposed method under various linear and nonlinear ARMA models. Section 6 applies the method to three real datasets for illustration. Finally, Section 7 provides concluding remarks.

LSCUSUM Test for ARMA Models
The model discussed in this paper is

y_t = f(Θ; y_{t−1}, . . . , y_{t−p}, ε_{t−1}, . . . , ε_{t−q}) + ε_t, (1)

where f is an unknown function to be estimated, Θ = (φ_1, . . . , φ_p, θ_1, . . . , θ_q, σ²) is an unknown parameter vector, p and q are nonnegative integer orders, and the ε_t are IID random variables with mean zero, variance σ² > 0, and E|ε_t|⁴ < ∞. For given observations y_1, . . . , y_n, we are interested in the null and alternative hypotheses

H_0: the parameter is Θ for all t = 1, . . . , n, versus H_1: the parameter is Θ for t ≤ k and Θ + ∆ for t > k,

where Θ is the unknown parameter vector, the nonzero constant vector ∆ represents the parameter jump, and k = [nτ] is the location of the change point, with τ ∈ (0, 1) and [·] denoting the integer part. Model (1) includes the AR, ARMA, threshold ARMA (TARMA), and time-varying AR (TVAR) models as special cases. To test the above null and alternative hypotheses, we applied the following two LSCUSUM tests proposed by Lee [13]:

T̂_n^LS = max_{1≤k≤n} (1/n) [ (Σ_{t=1}^k ε̂_t − (k/n) Σ_{t=1}^n ε̂_t)² / γ̂_1 + (Σ_{t=1}^k ε̂_t² − (k/n) Σ_{t=1}^n ε̂_t²)² / γ̂_2 ],

T̂_n^max = max{ max_{1≤k≤n} |Σ_{t=1}^k ε̂_t − (k/n) Σ_{t=1}^n ε̂_t| / (n γ̂_1)^{1/2}, max_{1≤k≤n} |Σ_{t=1}^k ε̂_t² − (k/n) Σ_{t=1}^n ε̂_t²| / (n γ̂_2)^{1/2} },

where ε̂_t = y_t − f(Θ̂; y_{t−1}, . . . , y_{t−p}, ε̂_{t−1}, . . . , ε̂_{t−q}) denotes the residuals, Θ̂ is the estimator of Θ, and γ̂_1 and γ̂_2 are consistent estimators of the variances of ε_t and ε_t², respectively. We rejected H_0 if T̂_n^LS > 2.4503 or T̂_n^max > 1.4596 at the nominal 0.05 level. The critical values of these two tests were obtained through Monte Carlo simulations using a two-dimensional standard Brownian motion.
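To make the statistics concrete, the following is a minimal stdlib-Python sketch of how the two LSCUSUM statistics can be computed from a residual sequence. It is illustrative rather than the authors' implementation: for simplicity it replaces the variance estimators γ̂_1 and γ̂_2 with plain sample variances of ε̂_t and ε̂_t².

```python
import math
import random

def lscusum(resid):
    """Compute (T_LS, T_max) from a residual sequence; illustrative sketch."""
    n = len(resid)
    e = resid
    e2 = [x * x for x in e]

    def var(v):
        m = sum(v) / len(v)
        return sum((x - m) ** 2 for x in v) / len(v)

    g1 = var(e)   # variance proxy for the location component
    g2 = var(e2)  # variance proxy for the scale component

    tot1, tot2 = sum(e), sum(e2)
    s1 = s2 = 0.0
    t_ls = t_max = 0.0
    for k in range(1, n + 1):
        s1 += e[k - 1]
        s2 += e2[k - 1]
        c1 = s1 - k / n * tot1   # centered CUSUM of residuals
        c2 = s2 - k / n * tot2   # centered CUSUM of squared residuals
        t_ls = max(t_ls, (c1 * c1 / g1 + c2 * c2 / g2) / n)
        t_max = max(t_max, abs(c1) / math.sqrt(n * g1),
                    abs(c2) / math.sqrt(n * g2))
    return t_ls, t_max

# synthetic residuals with a mean shift halfway through: a change should be flagged
random.seed(1)
resid = [random.gauss(0, 1) for _ in range(250)] + [random.gauss(2, 1) for _ in range(250)]
t_ls, t_max = lscusum(resid)
```

With this mean shift, both statistics exceed the critical values 2.4503 and 1.4596, so the sketch rejects H_0.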
Next, we introduce the NNR method in a general framework, propose a fitting method based on NNR to the ARMA model, and explain how to apply the NNR-ARMA method when constructing the LSCUSUM test.

Neural Network Regression
A neural network (NN) is a simplified model obtained by abstracting and modeling the human brain, which is a parallel connection of a set of nodes called neurons. It can effectively solve complex regression and classification problems with a large number of correlated variables and make accurate predictions for time series.
In this study, we define an NN with an input layer, a hidden layer, and an output layer. We consider a feedforward net with n input nodes, one layer of H hidden nodes, and one output node. As shown in Figure 1, the n nodes representing the independent variables on the left form the input layer, the H nodes in the middle form the hidden layer, and the rightmost node, representing the dependent variable, forms the output layer. These nodes are connected according to the arrows, and the number next to an arrow marks its weight. For an input vector x = (x_1, . . . , x_n) ∈ ℝ^n, the output z_h(x; ω) of the h-th hidden node is

z_h(x; ω) = ψ(ω_{0h} + Σ_{i=1}^n ω_{ih} x_i),

and the output of the net is

f(x; θ) = ψ*(υ_0 + Σ_{h=1}^H υ_h z_h(x; ω)),

where ω_{ih} (i = 0, . . . , n; h = 1, . . . , H) is the weight connecting the i-th input node and the h-th hidden node; υ_h (h = 0, . . . , H) is the weight from the h-th hidden node to the output node; and θ stands for all parameters υ_0, . . . , υ_H and ω_{ih} (i = 0, . . . , n; h = 1, . . . , H) of the network. We also write υ = (υ_0, . . . , υ_H)^T and ω = (ω_{ih}, i = 0, . . . , n; h = 1, . . . , H). Here, ψ(·) and ψ*(·) are activation functions of the NN, usually taken to be S-shaped logistic functions

ψ(u) = 1 / (1 + e^{−au}),

where a is a parameter that controls the inclination of the function. Other function types can be selected as required, such as the threshold, piecewise linear, Gaussian, and hyperbolic tangent functions. The NN connection weights are adjusted through training as follows:

1. Assume that the input vectors and the associated outputs are x_1, . . . , x_m ∈ ℝ^n and y_1, . . . , y_m ∈ ℝ, respectively.

2. Initialize the weights in the network.

3. Use forward propagation to obtain the predicted values and calculate the loss function

S(y, x; θ) = (1/2) Σ_{j=1}^m (y_j − f(x_j; θ))².

4. Update the connection weights. For individual weights, the (k+1)-th iteration weights are

ω_{ih}^{(k+1)} = ω_{ih}^{(k)} − η ∂S(y, x; θ^{(k)})/∂ω_{ih}, υ_h^{(k+1)} = υ_h^{(k)} − η ∂S(y, x; θ^{(k)})/∂υ_h,

where η (0 < η < 1) is the learning rate; k is the number of updates or iterations; and ∂S/∂ω_{ih} and ∂S/∂υ_h are the respective error terms of the hidden and output units, i.e., the gradients of the loss function with respect to them, computed via the chain rule (backpropagation).

Steps 3 and 4 are repeated until the loss function falls below a preset threshold or the number of iterations is exhausted; the parameters at termination are the current best parameters.
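The training loop above can be sketched from scratch. The following stdlib-Python example implements a single-hidden-layer net with logistic hidden activation (inclination a = 1), identity output activation, and full-batch gradient descent via backpropagation. The learning rate, iteration count, and the x² target are illustrative choices, not values from the paper.

```python
import math
import random

def train_nn(xs, ys, H=5, eta=0.5, iters=8000, seed=0):
    """Train a 1-hidden-layer feedforward net by full-batch gradient descent."""
    rng = random.Random(seed)
    n = len(xs[0])
    # step 2: initialize weights with small random values; index 0 is the bias
    w = [[rng.uniform(-0.5, 0.5) for _ in range(H)] for _ in range(n + 1)]
    v = [rng.uniform(-0.5, 0.5) for _ in range(H + 1)]

    def psi(u):  # logistic activation; clamp guards against exp overflow
        u = max(-60.0, min(60.0, u))
        return 1.0 / (1.0 + math.exp(-u))

    def forward(x):
        z = [psi(w[0][h] + sum(w[i + 1][h] * x[i] for i in range(n)))
             for h in range(H)]
        out = v[0] + sum(v[h + 1] * z[h] for h in range(H))  # identity output unit
        return z, out

    m = len(xs)
    for _ in range(iters):
        # steps 3-4: accumulate full-batch gradients of the squared-error loss
        gw = [[0.0] * H for _ in range(n + 1)]
        gv = [0.0] * (H + 1)
        for x, y in zip(xs, ys):
            z, out = forward(x)
            d = out - y                                # dS/d(out)
            gv[0] += d
            for h in range(H):
                gv[h + 1] += d * z[h]
                dz = d * v[h + 1] * z[h] * (1 - z[h])  # chain rule through psi
                gw[0][h] += dz
                for i in range(n):
                    gw[i + 1][h] += dz * x[i]
        for h in range(H):
            for i in range(n + 1):
                w[i][h] -= eta * gw[i][h] / m
        for h in range(H + 1):
            v[h] -= eta * gv[h] / m
    return lambda x: forward(x)[1]

# fit the simple nonlinear target y = x^2 on [-1, 1]
xs = [[-1 + 0.1 * i] for i in range(21)]
ys = [x[0] ** 2 for x in xs]
f = train_nn(xs, ys)
mse = sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

After training, the in-sample mean squared error is small, illustrating the NN's ability to approximate a nonlinear regression function without specifying its form.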
An NN can have multiple hidden layers, but one is generally enough. The number of nodes in the hidden layer can be large or small: too many nodes may lead to overfitting, and too few to underfitting. The number of nodes H in the hidden layer can be determined by a rule of thumb, as in Looney [27]. Alternatively, one can use the Bayesian information criterion (BIC), as proposed by Swanson [28], to determine H sequentially. In this paper, the optimal number of nodes H is selected by minimizing the root mean square error (RMSE) through cross-validation.

Prediction Based on NNR-ARMA Model
Throughout this section, we assume that {y_t : t = 1, . . . , n, n + 1, . . . , n + l} is generated according to the ARMA model (1). If the training sample is known to follow an NNR-ARMA(p, q) model with specific orders p and q, we can use this model directly. Otherwise, we determine the orders from the training sample as follows:

Step 1. Let y = (y_m, . . . , y_n), and take as the NN input the design matrix X whose row for time t consists of (y_{t−1}, . . . , y_{t−p}, ε_{t−1}, . . . , ε_{t−q}), t = m, . . . , n. The design matrix X involves the unobservable innovation terms ε_t (t = m − q, . . . , n − 1). Thus, a long AR(p′) model is first fitted to y_1, . . . , y_n to obtain the residuals ε̂_t, which are then used to approximate the unobserved ε_t; see Hannan and Rissanen [29] for more details. The design matrix with the ε_t replaced by ε̂_t is denoted X̃, where m = p′ + max(p, q) + 1. When fitting the long AR(p′) model, the order p′ can be determined using the Akaike information criterion (AIC), with the maximum order set to 10 log_10(n).
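As an illustration of this first stage, the sketch below fits a long AR model through the Yule-Walker equations via the Durbin-Levinson recursion and extracts its residuals as innovation proxies. It is stdlib Python intended only to show the mechanics; the paper's implementation and the AIC-based choice of p′ are not reproduced here.

```python
import random

def acovf(y, lag):
    """Sample autocovariance of y at the given lag."""
    n = len(y)
    m = sum(y) / n
    return sum((y[t] - m) * (y[t - lag] - m) for t in range(lag, n)) / n

def ar_fit_yw(y, p):
    """Yule-Walker AR(p) fit via the Durbin-Levinson recursion."""
    g = [acovf(y, k) for k in range(p + 1)]
    phi, v = [], g[0]
    for k in range(1, p + 1):
        a = (g[k] - sum(phi[j] * g[k - 1 - j] for j in range(k - 1))) / v
        phi = [phi[j] - a * phi[k - 2 - j] for j in range(k - 1)] + [a]
        v *= (1 - a * a)  # update the innovation variance
    return phi

def ar_residuals(y, phi):
    """One-step-ahead residuals of the fitted AR model (mean-corrected)."""
    p = len(phi)
    m = sum(y) / len(y)
    yc = [x - m for x in y]
    return [yc[t] - sum(phi[j] * yc[t - 1 - j] for j in range(p))
            for t in range(p, len(y))]

# demo: fit AR(1) to a simulated AR(1) series with phi = 0.5
random.seed(42)
e = [random.gauss(0, 1) for _ in range(2000)]
y = [e[0]]
for t in range(1, 2000):
    y.append(0.5 * y[-1] + e[t])
phi_hat = ar_fit_yw(y, 1)
resid = ar_residuals(y, phi_hat)
```

The recovered coefficient is close to the true value 0.5, and the residuals then stand in for the unobserved innovations when building the design matrix.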
Step 2. Fix upper bounds p* > 0 and q* > 0 for the true ARMA orders, so that p ≤ p* and q ≤ q*. For each pair (p, q), we assume that y_1, . . . , y_n approximately satisfy model (1) and fit the NNR-ARMA(p, q) model with y = (y_m, . . . , y_n) and X̃, as in Section 3. Then, applying the obtained NNR-ARMA(p, q) model to y_{n+1}, . . . , y_{n+l}, we calculate the prediction errors ε̃_t (t = 1, . . . , l) and the corresponding AIC, namely AIC = 2(p + q) + l ln(RSS/l), where l denotes the number of effective observations and RSS denotes the sum of squared prediction errors. We then select the optimal order (p_0, q_0) by minimizing the AIC over all candidate pairs.
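The validation-set selection in Step 2 can be sketched as a grid search: for each candidate (p, q) up to the known upper bounds, a model is fitted and scored by AIC = 2(p + q) + l ln(RSS/l). The `fit_and_rss` callback below is a hypothetical stand-in for fitting the NNR-ARMA(p, q) model and returning its validation RSS; the toy RSS surface is invented for illustration.

```python
import math

def select_order(fit_and_rss, l, p_max, q_max):
    """Return the (p, q) pair minimizing AIC over a grid of candidate orders."""
    best, best_aic = None, float("inf")
    for p in range(1, p_max + 1):
        for q in range(0, q_max + 1):
            rss = fit_and_rss(p, q)  # validation-set sum of squared errors
            aic = 2 * (p + q) + l * math.log(rss / l)
            if aic < best_aic:
                best, best_aic = (p, q), aic
    return best

# toy stand-in: the validation RSS bottoms out at (p, q) = (2, 1)
demo_rss = {(p, q): 10.0 + (p - 2) ** 2 + (q - 1) ** 2
            for p in range(1, 4) for q in range(0, 3)}
order = select_order(lambda p, q: demo_rss[(p, q)], l=100, p_max=3, q_max=2)
```

The penalty term 2(p + q) breaks ties toward smaller models when two candidate orders fit the validation set about equally well.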
Step 3. Train the NNR-ARMA(p_0, q_0) model on the whole training sample y_1, . . . , y_{n+l} and predict the testing sample with the estimated NNR-ARMA(p_0, q_0) model. The obtained prediction errors are then used in the construction of the NNR-based LSCUSUM tests T̂_n^LS and T̂_n^max.

Monte Carlo Simulation
We conducted Monte Carlo simulations to evaluate the finite-sample performance of the NNR-based LSCUSUM test for linear and nonlinear ARMA models. For this, we used the AR, ARMA, TARMA, and TVAR models to generate data and focused on comparing the NNR-, ALASSO-, SVR-, and classical ARMA-based LSCUSUM tests. For each simulation, the empirical sizes and powers were calculated as the proportion of rejections of the null hypothesis of no change over 1000 repetitions at the 5% significance level. The sample sizes under consideration were n = 300, 500, and 1000. Under the alternatives, the change point was assumed to occur at [n/2]. The simulations were conducted with R version 4.1.3, and we used the R packages nnet and caret for NNR, glmnet for ALASSO, and e1071 for SVR.
(1) Linear ARMA model. We generated time series from the ARMA(1,1) model

y_t = φ y_{t−1} + θ ε_{t−1} + ε_t,

where |φ| < 1 and the ε_t are IID normal random variables with mean zero and variance σ². We considered AR(1) and ARMA(1,1) setups for the null hypothesis (models 1 and 2). Under the alternative, we assumed that a single parameter changed while the others remained constant. Tables 1 and 2 report the empirical sizes and powers for the AR(1) and ARMA(1,1) models, respectively, from which it can be seen that there was no size distortion and that all four methods produced reasonably good power. In particular, the NNR-based LSCUSUM test outperformed the ALASSO-, SVR-, and classical ARMA-based LSCUSUM tests, and of the two LSCUSUM statistics, T̂_n^max was preferable to T̂_n^LS in terms of stability and power. In addition, we conducted simulations with a sample size of 500 and different change point locations; see Table 3. The empirical power is strongly affected by the position of the change point: the NNR-based LSCUSUM test has the highest power when the change occurs in the middle, and the power decreases as the change point moves toward the beginning or end of the time series, indicating that a change point is more easily detected in the middle of the sequence. We also found that our method is clearly superior to the classical method when the change point occurs near the beginning of the sequence. Although not reported here, we carried out simulations for various settings of model parameters and sample sizes, and a similar pattern was observed.

(2) Threshold ARMA model. We generated time series from the TARMA(1,1) model, in which the AR and MA coefficients switch between (φ_1, θ_1) and (φ_2, θ_2) according to the regime of y_{t−1}, where the ε_t are IID normal random variables with mean zero and variance σ². We considered the following setups for the null hypothesis:

Model 3: TARMA(1,1) with φ_1 = 0.1, φ_2 = −0.5, θ_1 = θ_2 = 0.5, and σ² = 1.
Model 4: TARMA(1,1) with φ_1 = 0.3, φ_2 = −0.5, θ_1 = θ_2 = 0.5, and σ² = 1.
Under the alternative, we assumed that a single parameter changed while the others remained constant. Tables 4 and 5 list the empirical sizes and powers for models 3 and 4, respectively. The results show that the classical ARMA- and ALASSO-based LSCUSUM tests had severe size distortion, which became more severe as the sample size increased. The NNR- and SVR-based LSCUSUM tests did not have such a problem, and the NNR-based LSCUSUM test produced better power than the SVR-based LSCUSUM test. As anticipated, the power increased as the sample size increased or as the deviation between the parameter sets before and after the change point became larger. The simulation results for different change point positions were the same as before, i.e., the LSCUSUM test had the highest power when the change point occurred in the middle of the time series (Table 6).
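For concreteness, the two data-generating mechanisms can be sketched as follows in stdlib Python. The TARMA regime switch on the sign of y_{t−1} (a zero threshold) is an assumption made for illustration, and the parameter values follow model 3 above.

```python
import random

def sim_arma11(n, phi, theta, sigma, rng):
    """Simulate an ARMA(1,1) series y_t = phi*y_{t-1} + theta*e_{t-1} + e_t."""
    y, y_prev, e_prev = [], 0.0, 0.0
    for _ in range(n):
        e = rng.gauss(0, sigma)
        y_t = phi * y_prev + theta * e_prev + e
        y.append(y_t)
        y_prev, e_prev = y_t, e
    return y

def sim_tarma11(n, phi1, phi2, theta1, theta2, sigma, rng):
    """Simulate a TARMA(1,1) series; zero threshold on y_{t-1} is assumed."""
    y, y_prev, e_prev = [], 0.0, 0.0
    for _ in range(n):
        e = rng.gauss(0, sigma)
        if y_prev <= 0:  # regime 1 (assumed threshold at zero)
            y_t = phi1 * y_prev + theta1 * e_prev + e
        else:            # regime 2
            y_t = phi2 * y_prev + theta2 * e_prev + e
        y.append(y_t)
        y_prev, e_prev = y_t, e
    return y

rng = random.Random(0)
y1 = sim_arma11(500, phi=0.3, theta=0.5, sigma=1.0, rng=rng)
y2 = sim_tarma11(500, phi1=0.1, phi2=-0.5, theta1=0.5, theta2=0.5,
                 sigma=1.0, rng=rng)
```

A change point scenario is obtained by switching one parameter (e.g., phi) for the second half of the sample while leaving the rest fixed.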

Empirical Applications
We illustrate the proposed NNR-based LSCUSUM test scheme by analyzing three sets of real data: the annual volume of discharge from the Nile River at Aswan (in 10⁸ m³) from 1871 to 1970, the weekly average oil prices of the USA from 1 September 2014 to 27 August 2018, and the weekly Hang Seng Index (HSI) of Hong Kong from 4 September 2011 to 26 June 2022. These time series were obtained from the R package TSA and the websites www.macrotrends.net (accessed on 27 June 2022) and investing.com, respectively. We used 70% of each dataset for training and the remaining 30% for testing. Prior to fitting the NNR-ARMA model, we inspected the autocorrelation function (ACF) and partial autocorrelation function (PACF) of each time series, looking for irregular patterns of autocorrelation, as seen in Figure 2. Plot (a) shows that the ACFs and PACFs for the Nile River dataset largely support stationarity, indicating that estimation via the NNR-ARMA model would not undermine the outcomes for this series. In contrast, plots (b) and (c) show that the oil prices and the HSI are nonstationary time series, so we use the log-returns of the oil prices and the HSI.
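The log-return transformation applied to the oil price and HSI series is the standard one, r_t = ln(P_t / P_{t−1}); as a small snippet with made-up prices:

```python
import math

prices = [100.0, 101.5, 99.8, 102.3]  # hypothetical weekly closing prices
# log-returns: r_t = ln(P_t / P_{t-1}), one fewer observation than prices
logret = [math.log(prices[t] / prices[t - 1]) for t in range(1, len(prices))]
```

Differencing in log scale stabilizes the level of the series, which is why the nonstationary oil price and HSI series are analyzed through their log-returns.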
We applied the NNR-based LSCUSUM test to the Nile River data and obtained T̂_n^LS = 9.589 and T̂_n^max = 2.631, which are larger than the respective critical values 2.4503 and 1.4596 at the 0.05 level. Thus, we rejected the null hypothesis of no changes. Both statistics located the change point at 1899, which agrees with the conclusions of Cobb [30], MacNeill et al. [31], and Wu and Zhao [32]. They believed that there was a change (decrease) in 1899, possibly due to the construction of a new dam at Aswan. See Figure 3, where the red vertical line indicates the change point. For comparison, we also applied the classical ARMA-based LSCUSUM test to the Nile River data. Here, we expect this method to misbehave owing to the nonlinearity of the dataset (Keenan's linearity test [33] strongly rejects the linearity assumption). We obtained T̂_n^LS = 2.271 < 2.4503, which does not reject the null hypothesis, whereas T̂_n^max = 1.468 > 1.4596 rejects it, locating the change point in 1917. The blue vertical line in Figure 3 shows that this change point is quite far from the previously obtained one. This result demonstrates that the classical ARMA-based LSCUSUM test can be severely damaged when a time series has a strong nonlinear feature.

Concluding Remarks
We proposed a prediction method based on the NNR-ARMA model and described how to determine an optimal model. This model was used to obtain residuals for a training time series sample, which was divided into two subseries. A long AR model was fitted to the first subseries to obtain residuals, which were used as the error terms in the NNR-ARMA(p, q) model; for each estimated NNR-ARMA(p, q) model, we calculated the AIC based on the second subseries and selected the optimal ARMA order with the smallest AIC. Prediction errors (residuals) were then obtained from the selected NNR-ARMA(p, q) model and used to construct the LSCUSUM test. Monte Carlo simulations were conducted using linear and nonlinear models with various parameter settings, including ARMA, TARMA, and TVAR models. The results show that the NNR-based LSCUSUM test outperformed the classical ARMA- and ALASSO-based LSCUSUM tests, especially when the underlying model was nonlinear. Finally, our method was applied to three real datasets, namely the Nile River, oil price, and HSI data, and detected one change point in each case. Our findings support the validity of our method and its practicality in various real-world circumstances.