Article

Combination Test for Mean Shift and Variance Change

1 School of Big Data and Statistics, Anhui University, Hefei 230601, China
2 Irving K. Barber Faculty of Science, University of British Columbia, Kelowna, BC V1V 1V7, Canada
* Author to whom correspondence should be addressed.
Symmetry 2023, 15(11), 1975; https://doi.org/10.3390/sym15111975
Submission received: 24 September 2023 / Revised: 15 October 2023 / Accepted: 18 October 2023 / Published: 25 October 2023
(This article belongs to the Special Issue Applications Based on Symmetry/Asymmetry in Functional Data Analysis)

Abstract

This paper considers a new mean-variance model with strong mixing errors and describes a combination test for the mean shift and variance change. Under some stationarity and symmetry conditions, the limiting distribution of the combination test statistic is obtained, from which the limiting distributions of the mean change test and the variance change test follow. As an application, a three-step algorithm to detect the change-points is given: the first step tests whether there is at least one change-point; the second and third steps detect the mean change-point and the variance change-point, respectively. To illustrate our results, some simulations and real-world data analyses are discussed. The analysis shows that our tests not only have high power, but can also classify a detected change-point as a mean change-point or a variance change-point. Compared with the existing methods cpt.meanvar and mosum from R packages, the new method has advantages in recognition capability and accuracy.

1. Introduction

Statistical methods of control for manufacturing processes were first proposed in [1]. The issue of change-points initially appeared in this quality-control context, where staff typically observe the output of a production line, aiming to detect signals deviating from acceptable levels. Since the seminal paper [2], the cumulative sum (CUSUM) test has been one of the most popular methods for detecting a parameter change in statistical models. CUSUM tests of the mean change-point and the variance change-point have played a central role in detecting abnormal signals in quality control, changes in financial time series, and other fields. For example, the authors of [3,4] considered CUSUM tests for the mean change-point model with independent errors and linear processes, respectively; the papers [5,6] investigated CUSUM tests for the variance change-point model with independent normal errors and independent errors, respectively. For further studies, reference can be made to [7,8,9,10] and the sources detailed therein. Combining the change-point of the mean with the change-point of the variance, this paper considers a change-point model with mean shift and variance change. For $T \geq 2$, we consider a time series $\{X_t\}$ following the mean-variance model

$$X_t = \mu_t + \sigma_t e_t, \quad 1 \le t \le T, \tag{1}$$

where $-\infty < \mu_t < \infty$ and $\sigma_t > 0$ are the mean and variance parameters, respectively. Since the condition of strong mixing ($\alpha$-mixing) is quite general in time series [11], we take the error sequence $\{e_t\}$ to be an $\alpha$-mixing sequence with mean zero and variance one. Let us recall the definition of $\alpha$-mixing. Let $\mathbb{N} = \{1, 2, \ldots\}$ and denote by $\mathcal{F}_k^T = \sigma(e_t, k \le t \le T, t \in \mathbb{N})$ the $\sigma$-field generated by the random variables $e_k, e_{k+1}, \ldots, e_T$, $1 \le k \le T$. For $t \ge 1$, we define

$$\alpha(t) = \sup_{m \in \mathbb{N}} \sup_{A \in \mathcal{F}_1^m,\, B \in \mathcal{F}_{m+t}^\infty} |P(AB) - P(A)P(B)|.$$
Definition 1.
If $\alpha(t) \to 0$ as $t \to \infty$, then $\{e_t, t \ge 1\}$ is called a strong mixing or $\alpha$-mixing sequence.
In the mean-variance model (1), we monitor the mean change-point with the CUSUM statistic
$$U_{k,1} = \frac{k(T-k)}{T}\left[\frac{1}{k}\sum_{t=1}^{k} X_t - \frac{1}{T-k}\sum_{t=k+1}^{T} X_t\right], \quad 1 \le k \le T-1, \tag{2}$$
and monitor the variance change-point with the CUSUM statistic
$$U_{k,2} = \frac{k(T-k)}{T}\left[\frac{1}{k}\sum_{t=1}^{k} (X_t-\bar X)^2 - \frac{1}{T-k}\sum_{t=k+1}^{T} (X_t-\bar X)^2\right], \quad 1 \le k \le T-1, \tag{3}$$

where $\bar X = \frac{1}{T}\sum_{t=1}^T X_t$.
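For concreteness, both CUSUM statistics can be computed for all $k$ at once from cumulative sums. The following R sketch is ours rather than the authors' published code, and the helper name cusum_stats is hypothetical:

```r
# Sketch: CUSUM statistics (2)-(3) for a numeric vector x of length T >= 2.
cusum_stats <- function(x) {
  T <- length(x)
  k <- 1:(T - 1)
  w <- k * (T - k) / T                 # weight k(T-k)/T in (2)-(3)
  s1 <- cumsum(x)                      # partial sums of X_t
  U1 <- w * (s1[k] / k - (s1[T] - s1[k]) / (T - k))
  s2 <- cumsum((x - mean(x))^2)        # partial sums of (X_t - Xbar)^2
  U2 <- w * (s2[k] / k - (s2[T] - s2[k]) / (T - k))
  list(U1 = U1, U2 = U2)               # each of length T - 1
}
```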
To improve upon the separate statistics for the mean change-point and the variance change-point, we use the combination statistic

$$g(T,k) := (g_1(T,k), g_2(T,k))^\top := \left(\frac{1}{\sqrt T}U_{k,1},\ \frac{1}{\sqrt T}U_{k,2}\right)^\top, \tag{4}$$

where $U_{k,1}$ and $U_{k,2}$ are defined by (2) and (3), respectively, and ⊤ denotes the transpose of a vector. Under the assumption of no change in the mean or variance, for all $T \ge 2$, the model (1) can be summarized in the null hypothesis $H_0$ as

$$\mu_1 = \mu_2 = \cdots = \mu_T = \mu_0 \quad \text{and} \quad \sigma_1 = \sigma_2 = \cdots = \sigma_T = \sigma_0, \tag{5}$$

where $-\infty < \mu_0 < \infty$ and $0 < \sigma_0 < \infty$. The change-point alternative hypothesis $H_A$ is that there is an integer $k_1$ such that $\mu_1 = \cdots = \mu_{k_1} \neq \mu_{k_1+1} = \cdots = \mu_T$, or there is an integer $k_2$ such that $\sigma_1 = \cdots = \sigma_{k_2} \neq \sigma_{k_2+1} = \cdots = \sigma_T$.
In this paper, we consider the mean-variance model (1) with $\alpha$-mixing errors and investigate the limiting distributions of the statistics related to $g(T,k)$ under the null hypothesis $H_0$ in (5). For example, if $\max_{1\le k\le T-1} g^\top(T,k)\,g(T,k)$ is smaller than a critical value (see details in Remark 2), then there is no evidence of a mean change or variance change in the mean-variance model (1). Otherwise, if $\max_{1\le k\le T-1}|g_1(T,k)|$ is larger than another critical value (see details in Remark 2), the mean change-point location is suggested by

$$\hat k_{T,1} = \operatorname*{argmax}_{1\le k\le T-1}|g_1(T,k)| = \operatorname*{argmax}_{1\le k\le T-1}|U_{k,1}|. \tag{6}$$

If $\max_{1\le k\le T-1}|g_2(T,k)|$ is larger than this critical value, the variance change-point location is suggested by

$$\hat k_{T,2} = \operatorname*{argmax}_{1\le k\le T-1}|g_2(T,k)| = \operatorname*{argmax}_{1\le k\le T-1}|U_{k,2}|. \tag{7}$$
Compared with existing methods for determining change-points, such as cpt.meanvar from the R package changepoint [12] and mosum from the R package mosum [13], we will show that our tests not only have high power, but can also classify the detected change-points as mean change-points or variance change-points. Further details are provided in Section 3, Section 4 and Section 5.
In addition to the change-point studies referred to above, many scholars have sought to extend the change-point analysis of the mean and variance to both independent and dependent data. For the mean change-point, CUSUM estimators with dependent errors were investigated in [14,15,16]; in [17], a weighted CUSUM estimator was studied for an infinite variance AR(p) process; in [18,19], a self-normalization method was used to test the mean change-point in a time series; in [20,21], data-driven methods were used to investigate the mean shift and variance change; in [22,23,24], CUSUM estimators of the mean change-point were discussed for panel data. For the variance change-point, the Schwarz information criterion (SIC) estimator of a variance change was studied with independent normal errors in [25]; ref. [26] extended the CUSUM estimator of [5] from normal data to infinite moving average processes; in [27], a weighted variance test was considered based on independent errors; a covariance structure change was studied for linear processes in [28]. In addition, the authors of [29] considered a CUSUM test of parameter changes in a time series model; refs. [30,31] reported changes in a variance inflation factor (VIF) regression model and a linear regression model, respectively; the authors of ref. [32] considered changes in parameters using the Shiryaev–Roberts statistics; in ref. [33], the change of covariance structure in multivariate time series was considered; refs. [34,35,36] considered multiple change-points; in ref. [37], the authors investigated a Bayesian method for the change-point; the authors of ref. [38] investigated a new class of weighted CUSUM estimators of the mean change-point; in ref. [39], the least sum of squared errors (LSSE) and maximum log-likelihood (MLL) methods for the estimation of the change-point were examined; in ref. [40], multivariate change-points in a mean vector and/or covariance structure were considered; and ref. [41] discussed a CUSUM estimator in an ARMA–GARCH model. Furthermore, a test for the detection of outliers in continuous distribution data was investigated in [42]; refs. [43,44] investigated change-point problems for a nonstationary time series and for the volatility of conditional heteroscedasticity, respectively.
It is pointed out that the $\alpha$-mixing condition is very general in time series. For example, consider an infinite order moving average (MA($\infty$)) process $X_t = \sum_{i=0}^\infty a_i e_{t-i}$, $t \ge 1$, where $a_i \to 0$ exponentially fast and $\{e_t\}$ is an i.i.d. sequence. If the probability density function of $e_t$ exists (as for the normal, Cauchy, exponential, and uniform distributions), then $\{X_t\}$ is $\alpha$-mixing with exponentially decaying coefficients. Strictly stationary time series, including autoregressive moving average (ARMA) processes and geometrically ergodic Markov chains, are $\alpha$-mixing processes. For further studies of $\alpha$-mixing, reference can be made to [45,46] for limit theorems, refs. [47,48] for the central limit theorem, refs. [49,50,51,52] for regression models, etc.
The rest of this paper is organized as follows. Some assumptions are provided in Section 2. Under some stationarity and symmetry conditions, the limiting distribution of the combination statistic $g(T,k)$ is established under the null hypothesis (5) in Section 2, from which the limiting distributions of the CUSUM statistics $g_1(T,k)$ for the mean change and $g_2(T,k)$ for the variance change are derived. As an application, we give a three-step algorithm to detect the change-points in Section 3: in the first step, we apply the combination test to check whether there is at least one change-point; in the second step, we apply the mean test to detect the mean change-point; in the third step, we apply the variance test to detect the variance change-point. The simulations in Section 4 show that our method performs better than the methods cpt.meanvar [12] and mosum [13]. We also use three examples of real-world data to detect the mean change-point and variance change-point in Section 5. In addition, some conclusions and future work are discussed in Section 6. Lastly, the proofs of the main results are presented in Section 7.
Throughout the paper, as $T\to\infty$, let $\xrightarrow{P}$ and $\xrightarrow{d}$ denote convergence in probability and convergence in distribution, respectively. Let $C, C_1, C_2, C_3, \ldots$ denote positive constants not depending on $T$, which may differ in various places. If $X$ and $Y$ have the same distribution, we write $X \stackrel{d}{=} Y$. In addition, second-order stationarity means that $(e_1, e_{1+k}) \stackrel{d}{=} (e_t, e_{t+k})$ for all $t\ge1$ and $k\ge1$.

2. Main Results

First, we list some assumptions as follows:
Assumption 1.
Consider the model (1), where $\{e_t, t\ge1\}$ is a stationary sequence of $\alpha$-mixing random variables with $E e_t = 0$, $E e_t^2 = 1$ for all $t \ge 1$. In addition, for some $\delta > 0$, let $\sup_{t\ge1} E|e_t|^{4+2\delta} < \infty$ and $\sum_{t=1}^\infty \alpha^{\delta/(2+\delta)}(t) < \infty$.
Assumption 2.
Let $\mu_0$ be defined in (5), and assume

$$\lim_{T\to\infty}\frac1T\mathrm{Var}\left(\sum_{t=1}^T X_t\right) = s_1^2 > 0, \quad \lim_{T\to\infty}\frac1T\mathrm{Var}\left(\sum_{t=1}^T(X_t-\mu_0)^2\right) = s_2^2 > 0, \tag{8}$$

$$\lim_{T\to\infty}\frac1T\sum_{i=1}^T\sum_{j=1}^T\mathrm{Cov}\left[(X_i-\mu_0),\ (X_j-\mu_0)^2\right] = 0. \tag{9}$$
Assumption 3.
Let $\{e_t, t\ge1\}$ be a second-order stationary sequence of $\alpha$-mixing random variables with $E e_1 = 0$ and $E e_1^2 = 1$. For some $\delta>0$, assume that $E|e_1|^{8+4\delta} < \infty$ and $\sum_{t=1}^\infty \alpha^{\delta/(2+\delta)}(t) < \infty$. In addition, let $E e_1 e_{1+j}^2 = 0$ for all $j \ge 0$.
Assumption 4.
Let

$$s_1^2 := \gamma_1(0) + 2\sum_{h=1}^\infty\gamma_1(h) > 0, \quad s_2^2 := \gamma_2(0) + 2\sum_{h=1}^\infty\gamma_2(h) > 0, \tag{10}$$

where $\gamma_1(h) = \mathrm{Cov}(X_1, X_{1+h}) = \mathrm{Cov}((X_1-\mu_0), (X_{1+h}-\mu_0))$ and $\gamma_2(h) = \mathrm{Cov}((X_1-\mu_0)^2, (X_{1+h}-\mu_0)^2)$ for $h = 0, 1, 2, \ldots$.
Assumption 5.
Let $\{h_T, T\ge1\}$ be a sequence of positive integers satisfying

$$h_T \to \infty \ \text{as}\ T\to\infty \quad \text{and} \quad h_T = O(T^\beta) \ \text{for some}\ \beta\in(0, 1/2). \tag{11}$$
Remark 1.
The moment conditions and mixing coefficients of the $\alpha$-mixing sequence $\{e_t\}$ in Assumption 1 are used by many researchers; see [46,51], etc. The conditions (8) in Assumption 2 give the limiting variances of the partial sums of $\{(X_t-\mu_0)/\sqrt{T}\}$ and $\{(X_t-\mu_0)^2/\sqrt{T}\}$. The condition (9) in Assumption 2 is a symmetry condition, which requires the limit of the partial sums of covariances between $\{(X_t-\mu_0)/\sqrt{T}\}$ and $\{(X_t-\mu_0)^2/\sqrt{T}\}$ to be zero. For example, let $f(x,y;k)$ denote the joint probability density function of the random variables $X_t-\mu_0$ and $X_{t+k}-\mu_0$ for all $t\ge1$ and $k\ge1$, and let $f(x,y;k)$ be symmetric, i.e., $f(x,y;k)=f(-x,-y;k)$ for all $x,y \in \mathbb{R}$. It is easy to check that $E[(X_t-\mu_0)(X_{t+k}-\mu_0)^2] = -E[(X_t-\mu_0)(X_{t+k}-\mu_0)^2]$, which implies $E[(X_t-\mu_0)(X_{t+k}-\mu_0)^2]=0$ for all $t\ge1$ and $k\ge1$. In addition, $E(X_t-\mu_0)=0$ and $E(X_t-\mu_0)^3=0$ for all $t\ge1$. Obviously, the bivariate normal distribution satisfies the conditions of this example; thus, the condition (9) is satisfied. The second-order stationarity condition in Assumption 3 is used to obtain $s_1^2$ and $s_2^2$ in Assumption 2; they are the long-run variances $s_1^2$ and $s_2^2$ in (10). To estimate these long-run variances, we use Assumption 4 and the sample autocovariance functions to give the estimators $\hat s_{T,1}^2$ and $\hat s_{T,2}^2$ in (16). A condition similar to (11) can be seen in [26].
Second, we study the limiting distribution of the combination statistic $g(T,k)$ in (4) under the null hypothesis $H_0$ in (5). We denote by $\lfloor x\rfloor$ the greatest integer not exceeding $x$. Throughout the paper, let $\{W_1^0(x), x\in[0,1]\}$ and $\{W_2^0(x), x\in[0,1]\}$ be two independent standard Brownian bridges, and let ⇒ denote convergence in distribution in the Skorokhod space $D[0,1]$.
Theorem 1.
In model (1), let Assumptions 1 and 2 be satisfied. Then, under the null hypothesis $H_0$ defined by (5), for $0 < k = \lfloor xT\rfloor < T$ and $x\in[0,1]$, we have

$$g^\top(T,k)\,g(T,k) \Rightarrow \sum_{i=1}^2 s_i^2\left(W_i^0(x)\right)^2, \quad \text{as } T\to\infty, \tag{12}$$

where $g(T,k)$, $s_1^2$ and $s_2^2$ are defined by (4) and (8), respectively. Combining this with the continuous mapping theorem, we obtain

$$\max_{1\le k\le T-1} g^\top(T,k)\,g(T,k) \xrightarrow{d} \sup_{0\le x\le1}\sum_{i=1}^2 s_i^2\left(W_i^0(x)\right)^2, \quad \text{as } T\to\infty. \tag{13}$$
Usually, $s_1^2$ and $s_2^2$ in (8) are unknown and must be estimated. By the second-order stationarity in Assumption 3, it is easy to obtain the long-run variances $s_1^2$ and $s_2^2$ defined by (10). Next, we discuss the estimators of $s_1^2$ and $s_2^2$. Let $Z_t = X_t - \bar X$, $1\le t\le T$. Then, $\gamma_1(h)$ and $\gamma_2(h)$ defined in (10) can be estimated by $\hat\gamma_1(h)$ and $\hat\gamma_2(h)$, respectively, as
$$\hat\gamma_1(h) = \frac1T\sum_{t=1}^{T-h}Z_tZ_{t+h}, \quad 0\le h < T, \tag{14}$$

and

$$\hat\gamma_2(h) = \frac1T\sum_{t=1}^{T-h}\left(Z_t^2 - \overline{Z^2}\right)\left(Z_{t+h}^2 - \overline{Z^2}\right), \quad 0\le h < T, \tag{15}$$

where $\overline{Z^2} = \frac1T\sum_{t=1}^T Z_t^2$ and $\bar X = \frac1T\sum_{t=1}^T X_t$.

Thus, the estimators of $s_1^2$ and $s_2^2$ are, respectively, given by

$$\hat s_{T,1}^2 := \hat\gamma_1(0) + 2\sum_{h=1}^{h_T}\hat\gamma_1(h), \quad \hat s_{T,2}^2 := \hat\gamma_2(0) + 2\sum_{h=1}^{h_T}\hat\gamma_2(h), \tag{16}$$

where $\hat\gamma_1(h)$ and $\hat\gamma_2(h)$ are defined by (14) and (15).
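In R, the estimators (14)–(16) amount to truncated sums of sample autocovariances. A minimal sketch (the helper name longrun_vars is ours; $h_T = \lfloor T^{1/5}\rfloor$ is the truncation lag used in Sections 4 and 5):

```r
# Sketch: long-run variance estimators (16) built from (14)-(15).
longrun_vars <- function(x, hT = floor(length(x)^(1/5))) {
  T <- length(x)
  Z <- x - mean(x)                                        # Z_t = X_t - Xbar
  V <- Z^2 - mean(Z^2)                                    # Z_t^2 - mean of Z^2
  g1 <- function(h) sum(Z[1:(T - h)] * Z[(1 + h):T]) / T  # (14)
  g2 <- function(h) sum(V[1:(T - h)] * V[(1 + h):T]) / T  # (15)
  s1sq <- g1(0) + 2 * sum(sapply(1:hT, g1))               # (16)
  s2sq <- g2(0) + 2 * sum(sapply(1:hT, g2))
  c(s1sq = s1sq, s2sq = s2sq)
}
```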
Lemma 1.
In model (1), let Assumptions 3–5 be satisfied. Under the null hypothesis $H_0$ defined by (5), we obtain

$$\hat s_{T,1}^2 \xrightarrow{P} s_1^2, \quad \hat s_{T,2}^2 \xrightarrow{P} s_2^2, \tag{17}$$

where $s_1^2$, $s_2^2$, $\hat s_{T,1}^2$ and $\hat s_{T,2}^2$ are defined by (10) and (16), respectively.
For $1\le k\le T-1$, denote the combination statistic

$$f(T,k) := \left(\frac{g_1(T,k)}{\sqrt{\hat s_{T,1}^2}},\ \frac{g_2(T,k)}{\sqrt{\hat s_{T,2}^2}}\right)^\top = \left(\frac{U_{k,1}}{\sqrt{T\hat s_{T,1}^2}},\ \frac{U_{k,2}}{\sqrt{T\hat s_{T,2}^2}}\right)^\top, \tag{18}$$

where $U_{k,1}$, $U_{k,2}$, $g_1(T,k)$, $g_2(T,k)$, $\hat s_{T,1}^2$ and $\hat s_{T,2}^2$ are defined by (2)–(4) and (16), respectively.
Combining Theorem 1 with Lemma 1, we obtain two corollaries as follows:
Corollary 1.
In model (1), let Assumptions 3–5 be fulfilled. Under the null hypothesis $H_0$ defined by (5), for $0 < k = \lfloor xT\rfloor < T$ and $x\in[0,1]$, we have

$$f^\top(T,k)\,f(T,k) \Rightarrow \sum_{i=1}^2\left(W_i^0(x)\right)^2, \quad \text{as } T\to\infty, \tag{19}$$

where $W_1^0(x)$ and $W_2^0(x)$ are defined in (12). Thus,

$$\max_{1\le k\le T-1} f^\top(T,k)\,f(T,k) \xrightarrow{d} \sup_{0\le x\le1}\sum_{i=1}^2\left(W_i^0(x)\right)^2, \quad \text{as } T\to\infty. \tag{20}$$
Corollary 2.
In model (1), let Assumptions 3–5 be fulfilled. Under the null hypothesis $H_0$ defined by (5), we have

$$\max_{1\le k\le T-1}\left|\frac{g_1(T,k)}{\sqrt{\hat s_{T,1}^2}}\right| \xrightarrow{d} \sup_{0\le x\le1}|W_1^0(x)|, \quad \max_{1\le k\le T-1}\left|\frac{g_2(T,k)}{\sqrt{\hat s_{T,2}^2}}\right| \xrightarrow{d} \sup_{0\le x\le1}|W_2^0(x)|, \tag{21}$$

where $g_1(T,k)/\sqrt{\hat s_{T,1}^2}$ and $g_2(T,k)/\sqrt{\hat s_{T,2}^2}$ are defined by (18), and $W_1^0(x)$ and $W_2^0(x)$ are defined in (12).
Remark 2.
For $i = 1, 2$, by (11.38) in [53], it holds that

$$P\left(\sup_{x\in[0,1]}|W_i^0(x)| \le y\right) = 1 + 2\sum_{k=1}^\infty(-1)^k\exp(-2k^2y^2), \quad y > 0, \tag{22}$$

where $W_1^0(x)$ and $W_2^0(x)$ are defined in (12). Let $\alpha$ ($0 < \alpha < 1$) be the level of significance. For $l\ge1$, let $W_1^0(x), \ldots, W_l^0(x)$ be independent standard Brownian bridges for $x\in[0,1]$. Then, the distribution of

$$\sup_{0\le x\le1}\sum_{i=1}^l\left(W_i^0(x)\right)^2 \tag{23}$$

was derived by Kiefer [54] and has a Fourier–Bessel series expansion. It is not easy to calculate the critical values of this distribution. Lee et al. [29] considered the problem of testing for parameter changes in time series models based on CUSUM statistics and obtained the limiting distribution (23); they used the Monte Carlo method to obtain the critical values $c_\alpha$ for different $\alpha$ and $l$. For example, when $l = 2$, the critical values are $c_{0.05} = 2.408$ and $c_{0.1} = 2.054$ (see [29]).
If $\max_{1\le k\le T-1} f^\top(T,k)\,f(T,k) \le c_\alpha$, there is no evidence of a mean change or variance change. Otherwise, we conclude that there is at least one mean change-point or variance change-point.

Similar to multiple testing problems, by (22), we take the critical value $d_{\alpha/2}$ of the distribution of $\sup_{0\le x\le1}|W_1^0(x)|$ to carry out the tests of the mean change-point and the variance change-point, in order to control the type I error. For example, $d_{0.025} = 1.48$ and $d_{0.05} = 1.358$. If $\max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}| \le d_{\alpha/2}$, there is no evidence of a mean change. Otherwise, we conclude that there is a mean change-point, and its location $\hat k_{T,1}$ is defined in (6).

Meanwhile, by (22), the p-value of $\max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}|$ can be defined by $pv_1$ as

$$pv_1 = P\left(\sup_{x\in[0,1]}|W_1^0(x)| > y_0\right) = 2\sum_{k=1}^\infty(-1)^{k+1}\exp(-2k^2y_0^2), \tag{24}$$

where $y_0 = \max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}|$. Similarly, let $d_{\alpha/2}$ be the critical value of the distribution of $\sup_{0\le x\le1}|W_2^0(x)|$. If $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}| \le d_{\alpha/2}$, there is no evidence of a variance change. Otherwise, we conclude that there is a variance change-point, and its location $\hat k_{T,2}$ is suggested in (7).

In addition, the p-value $pv_2$ of $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}|$ can be defined by (24), where $|W_1^0(x)|$ is replaced by $|W_2^0(x)|$ and $y_0$ is $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}|$.
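The alternating series in (24) converges very fast, so the p-values $pv_1$ and $pv_2$ can be computed by simple truncation. A minimal R sketch (the function name pv_sup_bridge is ours):

```r
# Sketch: p-value (24), truncating the alternating series at a large K.
pv_sup_bridge <- function(y0, K = 100) {
  k <- 1:K
  2 * sum((-1)^(k + 1) * exp(-2 * k^2 * y0^2))
}
pv_sup_bridge(1.358)   # approximately 0.05, consistent with d_{0.05} = 1.358
```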

3. The Three-Step Algorithm

Based on the combination statistic $f(T,k)$ in Corollary 1 and the mean change statistic $g_1(T,k)$ and variance change statistic $g_2(T,k)$ in Corollary 2, we give a three-step algorithm to test for changes in the mean and variance in Algorithm 1, i.e., the combination test, the mean change test, and the variance change test. In the following algorithm, we assume that there is at most one mean change-point and one variance change-point in the time series. If there are more change-points, we discuss how to detect them in Remark 3.
Remark 3.
It can be seen that the CUSUM statistic $U_{k,2}$ for the variance change in (3) contains the sample mean, while the CUSUM statistic $U_{k,1}$ for the mean change in (2) does not contain the sample variance. One could use

$$U_{k,2}^* = \frac{k(T-k)}{T}\left[\frac{1}{k}\sum_{t=1}^{k}(X_t - \bar X_k)^2 - \frac{1}{T-k}\sum_{t=k+1}^{T}(X_t - \bar X_{T-k})^2\right], \quad 1\le k\le T-1,$$

to replace $U_{k,2}$, where $\bar X_k = \frac1k\sum_{t=1}^k X_t$ and $\bar X_{T-k} = \frac{1}{T-k}\sum_{t=k+1}^T X_t$. However, the proofs of the limiting distribution and of the consistency of the estimator based on $U_{k,2}^*$ would be complicated; thus, we use $U_{k,2}$ to construct the variance change-point estimator. That is also why we perform the mean change test in Step 2 before the variance change test. To reduce the impact of a mean change on the variance change test, we construct the modified data $X_1', X_2', \ldots, X_T'$ whenever a mean change-point is found, and then proceed to Step 3, the variance change test. Since it is assumed that there is at most one mean change-point and one variance change-point in the time series, Algorithm 1 terminates after Step 3. If there are more change-points, we can further modify the process $\{X_t\}$ by the variance change. For example, based on the data $\{X_1, X_2, \ldots, X_T\}$ (or the modified data $\{X_1', X_2', \ldots, X_T'\}$), define the modified process $X_t''$ as

$$X_t'' = \begin{cases} X_t, & t \le \hat k_{T,2}, \\ \lambda_T^{-1/2} X_t, & t > \hat k_{T,2}, \end{cases}$$

where $\lambda_T = \hat\sigma_{T,2}^2/\hat\sigma_{T,1}^2$, $\hat\sigma_{T,1}^2 = \frac{1}{\hat k_{T,2}}\sum_{t=1}^{\hat k_{T,2}}(X_t - \bar X)^2$, $\hat\sigma_{T,2}^2 = \frac{1}{T - \hat k_{T,2}}\sum_{t=1+\hat k_{T,2}}^{T}(X_t - \bar X)^2$ and $\bar X = \frac1T\sum_{t=1}^T X_t$. Then, based on the modified data $\{X_1'', \ldots, X_T''\}$, we can combine the three-step algorithm with iterative methods to detect more change-points. For further details, one can refer to [20,21] and the sources detailed therein.
For further studies of multiple change-point detection, reference can be made to [35] and the sources detailed therein. Next, we discuss the measure of accuracy for multiple change-point detection. Based on a time series observation $\{X_1, X_2, \ldots, X_T\}$, assume that there are $L_0$ true change-points, denoted by $k_1, k_2, \ldots, k_{L_0}$, and that a detection method reports $\hat L_T$ change-points, denoted by $\hat k_{T,1}, \hat k_{T,2}, \ldots, \hat k_{T,\hat L_T}$. Following [55,56], the set of correctly detected change-points, the True Positives (TP), is defined as

$$TP = \left\{k_i : \exists\, \hat k_{T,j} \text{ such that } |\hat k_{T,j} - k_i| \le m,\ 1\le j\le \hat L_T\right\}, \quad i = 1, 2, \ldots, L_0, \tag{25}$$

where $m > 0$ is a margin size. Then, the Precision, Recall, and F1-score are defined as follows

$$\text{Precision} = \frac{|TP|}{\hat L_T}, \quad \text{Recall} = \frac{|TP|}{L_0}, \quad \text{F1-score} = \frac{2\,\text{Precision}\times\text{Recall}}{\text{Precision}+\text{Recall}}, \tag{26}$$

where $|TP|$ denotes the cardinality of the set $TP$.
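A small R sketch of the scores (25)–(26), under our naming assumptions (truth and detected are integer vectors of true and detected locations, m is the margin; cp_scores is a hypothetical helper):

```r
# Sketch: Precision, Recall and F1-score from (25)-(26).
cp_scores <- function(truth, detected, m) {
  TP <- sum(sapply(truth, function(k) any(abs(detected - k) <= m)))
  prec <- TP / length(detected)   # fraction of detections that are correct
  rec  <- TP / length(truth)      # fraction of true change-points recovered
  c(Precision = prec, Recall = rec, F1 = 2 * prec * rec / (prec + rec))
}
```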
Algorithm 1 Three-step algorithm
  • Input: data $\{X_1, X_2, \ldots, X_T\}$; the level of significance $\alpha$; the critical values $c_\alpha$ and $d_{\alpha/2}$ defined in Remark 2. Denote by $K$ the estimator of the change-point locations.
  • Initialize: $K \leftarrow \emptyset$
  • /* Step 1: the combination test */
  • if $\max_{1\le k\le T-1} f^\top(T,k)\,f(T,k)$ in (20) is less than $c_\alpha$ then
  •   There is no evidence of a mean change or variance change, and the algorithm terminates.
  • else
  •   There is at least one mean change-point or variance change-point; start Step 2.
  •   /* Step 2: the mean change test */
  •   if $\max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}|$ in (21) is less than $d_{\alpha/2}$ then
  •     There is no evidence of a mean change; start Step 3.
  •   else
  •     There is a mean change-point suggested by $\hat k_{T,1}$ in (6). Denote it by $(\hat k_{T,1}, 1)$, where 1 stands for a change in mean. Do $K \leftarrow K \cup \{(\hat k_{T,1}, 1)\}$. In addition, modify the process $X_t$ by the mean change as
$$X_t' = \begin{cases} X_t, & t \le \hat k_{T,1}, \\ X_t - (\bar\theta_{T,2} - \bar\theta_{T,1}), & t > \hat k_{T,1}, \end{cases}$$
      where $\bar\theta_{T,1} = \frac{1}{\hat k_{T,1}}\sum_{t=1}^{\hat k_{T,1}} X_t$ and $\bar\theta_{T,2} = \frac{1}{T - \hat k_{T,1}}\sum_{t=\hat k_{T,1}+1}^{T} X_t$. Update $\{X_1, X_2, \ldots, X_T\} \leftarrow \{X_1', X_2', \ldots, X_T'\}$ and start Step 3.
  •   end if
  •   /* Step 3: the variance change test */
  •   if $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}|$ in (21) is less than $d_{\alpha/2}$ then
  •     There is no evidence of a variance change, and the algorithm terminates.
  •   else
  •     There is a variance change-point suggested by $\hat k_{T,2}$ in (7). Denote it by $(\hat k_{T,2}, 2)$, where 2 stands for a change in variance. Do $K \leftarrow K \cup \{(\hat k_{T,2}, 2)\}$.
  •   end if
  • end if
  • Output: $K$
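Assembling the pieces, a compact R sketch of Algorithm 1 reads as follows. It relies on the hypothetical helpers cusum_stats and longrun_vars sketched in Sections 1 and 2 and on the critical values $c_{0.1} = 2.054$ and $d_{0.05} = 1.358$ from Remark 2 (i.e., $\alpha = 0.1$); it is an illustration rather than the authors' implementation:

```r
# Sketch of Algorithm 1: combination test, mean test, then variance test.
three_step <- function(x, c_alpha = 2.054, d = 1.358) {
  T <- length(x)
  U <- cusum_stats(x); s <- longrun_vars(x)
  f2 <- U$U1^2 / (T * s["s1sq"]) + U$U2^2 / (T * s["s2sq"])  # f'(T,k) f(T,k)
  K <- list()
  if (max(f2) < c_alpha) return(K)            # Step 1: no change-point at all
  A1 <- abs(U$U1) / sqrt(T * s["s1sq"])
  if (max(A1) >= d) {                         # Step 2: mean change test
    k1 <- which.max(A1)
    K$mean <- k1
    shift <- mean(x[(k1 + 1):T]) - mean(x[1:k1])
    x[(k1 + 1):T] <- x[(k1 + 1):T] - shift    # modified data X_t'
    U <- cusum_stats(x); s <- longrun_vars(x) # recompute on modified data
  }
  A2 <- abs(U$U2) / sqrt(T * s["s2sq"])
  if (max(A2) >= d) K$var <- which.max(A2)    # Step 3: variance change test
  K
}
```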

4. Simulations

In this section, some simulations illustrate the empirical detection probabilities of the change-point statistics $\max_{1\le k\le T-1} f^\top(T,k)\,f(T,k)$, $\max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}|$ and $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}|$ defined by (18). For $T\ge2$, we consider the mean-variance model

$$X_t = \mu_1 I(1\le t\le k_1) + \mu_2 I(k_1+1\le t\le T) + \sigma_1 e_t I(1\le t\le k_2) + \sigma_2 e_t I(k_2+1\le t\le T), \quad 1\le t\le T, \tag{27}$$

where $\mu_1$ and $\mu_2$ are the mean parameters, $\sigma_1 > 0$ and $\sigma_2 > 0$ are the variance parameters, and $k_1$ and $k_2$ are the mean change-point location and the variance change-point location, respectively. Let $e = (e_1, e_2, \ldots, e_T)^\top$ be a random vector with $Ee = 0$ and $\mathrm{Cov}(e) = \Sigma_T$ satisfying $\Sigma_T = (\xi^{|i-j|})_{1\le i,j\le T}$ for some $|\xi| < 1$. It is easy to see that $e_1, e_2, e_3, \ldots$ are $\alpha$-mixing random variables with mixing coefficient $\alpha(t) = O(|\xi|^t)$.
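One sample from model (27) with this covariance structure can be generated by a stationary AR(1) recursion, since an AR(1) process with coefficient $\xi$ and unit marginal variance has exactly $\mathrm{Cov}(e_i, e_j) = \xi^{|i-j|}$. A minimal R sketch (sim_model is a hypothetical name; sig1 and sig2 denote the standard deviations $\sigma_1$ and $\sigma_2$):

```r
# Sketch: one sample X_1,...,X_T from model (27) with Cov(e) = (xi^|i-j|).
sim_model <- function(T, mu1, mu2, sig1, sig2, k1, k2, xi) {
  e <- numeric(T)
  e[1] <- rnorm(1)
  for (t in 2:T) e[t] <- xi * e[t - 1] + sqrt(1 - xi^2) * rnorm(1)
  mu  <- c(rep(mu1, k1), rep(mu2, T - k1))    # mean shift at k1
  sig <- c(rep(sig1, k2), rep(sig2, T - k2))  # variance change at k2
  mu + sig * e
}
```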
Consider the null hypothesis $H_0$: $\mu_1 = \mu_2 = \mu_0$ and $\sigma_1^2 = \sigma_2^2 = \sigma_0^2$, and the alternative hypothesis $H_A$: $\mu_1 \neq \mu_2$ or $\sigma_1^2 \neq \sigma_2^2$. For simplicity, we consider 4 different cases as follows:

Case 1: $\mu_1 = \mu_2 = 1$ and $\sigma_1^2 = \sigma_2^2 = 1$; Case 2: $\mu_1 = 1$, $\mu_2 = 1.5$ and $\sigma_1^2 = \sigma_2^2 = 1$;

Case 3: $\mu_1 = \mu_2 = 1$ and $\sigma_1^2 = 1$, $\sigma_2^2 = 2$; Case 4: $\mu_1 = 1$, $\mu_2 = 1.5$ and $\sigma_1^2 = 1$, $\sigma_2^2 = 2$.

The mean change-point location $k_1$ and the variance change-point location $k_2$ will be given later. Denote

$$A_{T,0} = \max_{1\le k\le T-1} f^\top(T,k)\,f(T,k), \quad A_{T,1} = \max_{1\le k\le T-1}\left|\frac{g_1(T,k)}{\sqrt{\hat s_{T,1}^2}}\right|, \quad A_{T,2} = \max_{1\le k\le T-1}\left|\frac{g_2(T,k)}{\sqrt{\hat s_{T,2}^2}}\right|.$$
The details of our algorithm to detect change-points in the mean-variance model can be found in Section 3.
First, we consider Cases 1–4 for the mean-variance model (27) based on the multivariate normal distribution. Let $e \sim N_T(0, \Sigma_T)$ with dependence parameter $\xi$ in $\Sigma_T$. The level of significance is taken as $\alpha = 0.05$. With $\xi \in \{-0.3, 0, 0.3\}$, $h_T = \lfloor T^{1/5}\rfloor$, $k_1 = \lfloor T/4\rfloor$, $k_2 = \lfloor T/2\rfloor$ and $T \in \{300, 600, 900\}$, we obtain the empirical sizes and powers of the statistics $A_{T,0}$, $A_{T,1}$ and $A_{T,2}$, denoted by $p_0$, $p_1$ and $p_2$, respectively. The results for $p_0$, $p_1$ and $p_2$ are reported in Table 1, based on 1000 replications.
By Table 1, we give some comments here:

For Case 1: the mean and variance are unchanged. The empirical sizes $p_0$ of $A_{T,0}$ are around the level of significance $\alpha = 0.05$, while the empirical sizes $p_1$ and $p_2$ of $A_{T,1}$ and $A_{T,2}$ are smaller than $\alpha/2 = 0.025$.

For Case 2: the mean changes, while the variance does not. The powers $p_0$ of $A_{T,0}$ and $p_1$ of $A_{T,1}$ go to 1 as the sample size $T$ increases, while the powers $p_2$ of $A_{T,2}$ remain smaller than 0.025.

For Case 3: the variance changes, while the mean does not. The powers $p_0$ of $A_{T,0}$ and $p_2$ of $A_{T,2}$ increase to 1 as the sample size $T$ increases, while the powers $p_1$ of $A_{T,1}$ remain around 0.025.

For Case 4: the mean and variance both change. The powers $p_0$ of $A_{T,0}$, $p_1$ of $A_{T,1}$ and $p_2$ of $A_{T,2}$ all go to 1 as the sample size $T$ increases.
Second, we consider the multivariate t distribution. Let $X_1 \sim N_T(0, \Sigma_T)$ and $X_2 \sim \chi^2(n)$, with $X_1$ and $X_2$ independent. Then $t = X_1/\sqrt{X_2/n}$ has a multivariate t distribution, denoted by $t_T(0, \Sigma_T, n)$. Similar to Table 1, we replace $e \sim N_T(0, \Sigma_T)$ by $e \sim t_T(0, \Sigma_T, 5)$ and obtain the results for $p_0$, $p_1$ and $p_2$ in Table 2.
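These heavier-tailed errors can be drawn by scaling the same AR(1) normal vector with an independent chi-square variable, following the construction above. A sketch (rmvt_ar1 is a hypothetical name; note that the resulting $e_t$ has variance $n/(n-2)$ rather than 1, which the tests absorb through the long-run variance estimators):

```r
# Sketch: e ~ t_T(0, Sigma_T, n): correlated normals over sqrt(chi^2_n / n).
rmvt_ar1 <- function(T, xi, n = 5) {
  z <- numeric(T)
  z[1] <- rnorm(1)
  for (t in 2:T) z[t] <- xi * z[t - 1] + sqrt(1 - xi^2) * rnorm(1)
  z / sqrt(rchisq(1, df = n) / n)   # one chi-square draw for the whole vector
}
```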
Compared to Table 1, we find that the powers $p_1$ for Cases 2 and 3 in Table 2 are not identical to those in Table 1, but the sizes and powers $p_0$ and $p_2$ for Cases 1–4 in Table 2 are as good as those in Table 1. This may be because the multivariate t distribution with 5 degrees of freedom has heavier tails, which affects the mean change test.
Thirdly, we discuss the accuracy in terms of Precision, Recall, and F1-score, defined by (26), for the change-point Cases 2–4. Killick and Eckley [12] studied methods of change-point detection and provided 'cpt.mean', 'cpt.meanvar' and 'cpt.var' in the R package changepoint for the mean change, mean-variance change and variance change, respectively. Recently, Meier et al. [13] provided the R package mosum to detect change-points using moving sum statistics. We refer to the corresponding procedures as the 'cpt.meanvar' algorithm and the 'mosum' algorithm, respectively, and compare these two methods with our method presented in Section 3. Under the same settings as in Table 1 and Table 2, we take $m = 0.1T$ in (25). When the sample size $T$ is 300, 600 and 900, the bandwidth $G$ in the mosum method is taken as 100, 120 and 150, respectively; here, $G$ should be less than one half of the sample size (see the R package mosum). We then obtain the results for Precision, Recall, and F1-score in Table 3 and Table 4 under the multivariate normal distribution and the multivariate t distribution, respectively.
Since the mosum method in [13] is mainly designed to detect mean change-points, the Precision, Recall, and F1-score of the mosum algorithm in Table 3 and Table 4 are worse than those of the cpt.meanvar algorithm and our algorithm under Cases 3 and 4. By Table 3, under the multivariate normal case, the results of our algorithm for Cases 2 and 3 are as good as those of the cpt.meanvar algorithm, while the results of our algorithm for Case 4 are better than those of the cpt.meanvar algorithm. Furthermore, by Table 4, the results of our algorithm are better than those of the cpt.meanvar algorithm under the multivariate t distribution.

5. The Real Data Analysis

In this section, we give three examples with real-world data to illustrate our three-step test for the change-point detection of mean and variance. The statistics $\max_{1\le k\le T-1} f^\top(T,k)\,f(T,k)$, $\max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}|$ and $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}|$ can be found in Section 3.
Example 1.
The dataset is the annual flow of the river Nile at Aswan from 1871 to 1970 (see [57]), and it contains 100 observations, denoted by $x_t$, $1\le t\le 100$ (see Figure 1). It measures the annual discharge at Aswan in $10^8$ m$^3$ and is depicted in Figure 1. The sample autocorrelation function (ACF) is also presented in Figure 1. On the right side of Figure 1, the autocorrelation coefficient is relatively large when the lag is small, but it approaches zero as the lag increases; therefore, the data are consistent with the $\alpha$-mixing property. From Figure 1, it seems that there is a mean change-point in the time series of the annual flow of the river Nile.

To judge the existence of change-points, we set the null hypothesis that the annual flow of the river Nile has no change in mean or variance, and use the three-step algorithm given in Section 3. Based on $x_1, x_2, \ldots, x_{100}$, in Step 1 we take $\alpha = 0.1$, $T = 100$, $h_T = \lfloor T^{1/5}\rfloor$ and obtain $\max_{1\le k\le T-1} f^\top(T,k)\,f(T,k) = 4.6048 > 2.054$. Therefore, we reject the null hypothesis and conclude that there is at least one change-point of mean or variance. In Step 2, $\max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}| = 1.7838 > 1.358$ with p-value $pv_1 = 0.0034 < 0.05$, which means that there is a mean change-point located at $\hat k_{T,1} = \operatorname{argmax}_{1\le k\le T-1}|U_{k,1}| = 28$. Meanwhile, Step 3 with the data modified for the mean change gives $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}| = 1.0415 < 1.358$ with p-value $pv_2 = 0.2282 > 0.05$, so there is no evidence of a variance change-point. Consequently, we conclude that there is only one mean change-point in the time series of the annual flow of the river Nile. It is pointed out that change-point 28 is the year 1898, when the Aswan dam was built; the dam significantly changed the mean annual flow of the river Nile. In addition, Zeileis et al. [58] used the F test to detect the same change-point 28. We also used the cpt.meanvar method (see [12]) and the mosum method with bandwidth $G = 20$ (see [13]) in R and obtained the same change-point 28.
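For reproducibility, the Nile series ships with base R (datasets::Nile), so the sketch from Section 3 can be applied directly; the output below assumes the hypothetical three_step helper sketched there:

```r
# Sketch: applying the three-step sketch to the Nile annual flow data.
x <- as.numeric(Nile)   # 100 annual observations, 1871-1970
three_step(x)           # a mean change-point near k = 28 (year 1898) is expected
```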
Example 2.
The dataset consists of the prices of AMD stock downloaded with Python; it contains 212 observations from 3 March 2008 to 31 December 2008. Let $P_t$ be the closing price of AMD stock; the return is defined as $r_t = \log P_t - \log P_{t-1}$ with $P_0 = 1$, for $1\le t\le 212$. Figure 2 shows the time series of returns of AMD stock and its sample ACF.

From the right side of Figure 2, the time series of returns is consistent with the $\alpha$-mixing property. In addition, from the left side of Figure 2, the returns are around zero, but the variance of the returns seems to change. Therefore, we use the three-step algorithm to find the change-points, with the null hypothesis that the return of AMD stock has no change in mean or variance. Based on the sample $r_1, r_2, \ldots, r_{212}$, in Step 1 we take $\alpha = 0.1$, $T = 212$, $h_T = \lfloor T^{1/5}\rfloor$ and obtain $\max_{1\le k\le T-1} f^\top(T,k)\,f(T,k) = 4.4340 > 2.054$. In Step 2, $\max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}| = 0.8974 < 1.358$ with $pv_1 = 0.3963 > 0.05$, which means that there is no evidence of a mean change-point. In Step 3, $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}| = 1.9398 > 1.358$ with $pv_2 = 0.0011 < 0.05$. Therefore, we detect a variance change-point located at $\hat k_{T,2} = \operatorname{argmax}_{1\le k\le T-1}|U_{k,2}| = 136$ (12 September 2008). On the other hand, for independent normal random variables, the change-point detection of variance was investigated in [5,25]; using the methods in [5,25], we detect the variance change-points 136 and 137, respectively. We also used the cpt.meanvar method in R and obtained the change-point 136, whereas the mosum method did not detect any change-point. Obviously, the difference between 136 and 137 is only one observation. Furthermore, it is known that Lehman Brothers declared bankruptcy on 15 September 2008, i.e., point 137, which added financial risk to the stock market. Consequently, the variance of the returns of AMD stock began to increase around the time of the bankruptcy of Lehman Brothers.
Example 3.
The dataset is the quarterly US ex-post real interest rate from 1961:Q1 to 1986:Q3 provided by the Citibase data bank (see [59]). The data are also available from the R package strucchange (see [60]) and are denoted by $x_t$, $1\le t\le 103$ (see Figure 3). The sample ACF of the quarterly US ex-post real interest rate is also shown in Figure 3.

Similarly, from the right side of Figure 3, the time series of the US ex-post real interest rate is consistent with the $\alpha$-mixing property, and from the left side of Figure 3, it seems that there are some change-points of mean or variance. We again use the three-step algorithm, with the null hypothesis that the quarterly US ex-post real interest rate has no change in mean or variance. Based on $x_1, x_2, \ldots, x_{103}$, with $\alpha = 0.1$, $T = 103$, $h_T = \lfloor T^{1/5}\rfloor$, Step 1 gives $\max_{1\le k\le T-1} f^\top(T,k)\,f(T,k) = 3.9193 > 2.054$, which means that there exist some change-points in the time series $\{x_1, x_2, \ldots, x_{103}\}$. In Step 2, $\max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}| = 1.5973 > 1.358$ with $pv_1 = 0.0122 < 0.05$, so there is a mean change-point at $\hat k_{T,1} = \operatorname{argmax}_{1\le k\le T-1}|U_{k,1}| = 76$ (1979:Q4). Then, Step 3 with the data modified for the mean change gives $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}| = 1.3779 > 1.358$ with $pv_2 = 0.0449 < 0.05$; in other words, there is a variance change-point located at $\hat k_{T,2} = \operatorname{argmax}_{1\le k\le T-1}|U_{k,2}| = 51$ (1973:Q3). Consequently, we detect a mean change-point at 76 and a variance change-point at 51. In addition, the mosum method in R with bandwidth $G = 25$ detects two change-points, 47 and 76, and cpt.meanvar finds two change-points, 47 and 79. The differences between the change-point sets $\{51, 76\}$, $\{47, 76\}$ and $\{47, 79\}$ are small. However, we identify point 76 as a mean change-point and point 51 as a variance change-point, while the mosum and cpt.meanvar methods do not specify the types of these change-points; thus, our method has an advantage over theirs. Furthermore, it is pointed out that the sudden jump in oil prices in 1973 added to the volatility of the US ex-post real interest rate, and that the Federal Reserve's change in operating procedures in October 1979 increased the mean of the US ex-post real interest rate (see [59]).

6. Conclusions

Many researchers have studied mean change-point models and variance change-point models and obtained the limiting distributions of the CUSUM statistics for the mean change-point and the variance change-point (see [3,6]). As far as we know, few papers study a change-point model with both a mean change and a variance change. In this paper, we consider the mean-variance change-point model (1) with $\alpha$-mixing errors. Based on the CUSUM statistics of the mean and variance, we give the combination statistic $g(T,k)$ in (4). To determine whether there is a change-point in the mean or variance, the limiting distributions of $g^\top(T,k)\,g(T,k)$ and $\max_{1\le k\le T-1} g^\top(T,k)\,g(T,k)$ are obtained under the null hypothesis that there is no change in mean or variance. Consistent estimators $\hat s_{T,1}^2$ and $\hat s_{T,2}^2$ of the long-run variances $s_1^2$ and $s_2^2$ are presented in (17) of Lemma 1. Then, we obtain the limiting distributions of the combination statistic $\max_{1\le k\le T-1} f^\top(T,k)\,f(T,k)$, the mean CUSUM statistic $\max_{1\le k\le T-1}|g_1(T,k)/\sqrt{\hat s_{T,1}^2}|$ and the variance CUSUM statistic $\max_{1\le k\le T-1}|g_2(T,k)/\sqrt{\hat s_{T,2}^2}|$ in Corollaries 1 and 2. As an application, we give a three-step algorithm for change-point detection: the first step tests whether there is at least one change-point; the second and third steps detect the mean change-point and the variance change-point, respectively. To illustrate our three-step test, some simulations and three real data examples are presented in Section 4 and Section 5, respectively. Our algorithm has an advantage over the existing methods cpt.meanvar [12] and mosum [13]: it not only has high power but can also classify a change-point as a mean change-point or a variance change-point. On the other hand, multiple change-point problems for the mean, variance, mean vector, and covariance matrix have gained much attention. In this article, we consider the limiting distribution under the null hypothesis of no change-point; it would be important to investigate the limiting distribution under the alternative hypotheses. It would also be interesting to study these problems for dependent panel data, high-dimensional data, and other dependent data in future work.

7. Proofs of Main Results

Lemma 2
(Lemma 1 in [49]). Let $Y_t = g_t(e_t, \ldots, e_{t-\tau})$, where $g_t$ is a measurable function onto $\mathbb{R}^\upsilon$, and $\tau$ and $\upsilon$ are finite positive integers. If $\{e_t, t\ge1\}$ is $\alpha$-mixing with $\alpha(t) = O(t^{-\lambda})$ for some $\lambda > 0$, then $\{Y_t, t\ge1\}$ is also $\alpha$-mixing with $\alpha(t) = O(t^{-\lambda})$.
Lemma 3
(Proposition 2.5 in [51]). Let $X \in \mathcal{F}_1^k$ and $Y \in \mathcal{F}_{k+t}^\infty$. If $E|X|^p < \infty$ and $E|Y|^q < \infty$ for some $p, q \ge 1$ and $1/p + 1/q < 1$, then

$$|\mathrm{Cov}(X, Y)| \le 8\left(E|X|^p\right)^{1/p}\left(E|Y|^q\right)^{1/q}\alpha^{1 - 1/p - 1/q}(t).$$
Lemma 4
(Lemma 1.4 in [52]). For some $\delta > 0$, let $\{e_t, t\ge1\}$ be a mean zero $\alpha$-mixing sequence with $E|e_t|^{2+\delta} < \infty$ for all $t\ge1$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$. Then

$$E\left(\sum_{t=1}^T e_t\right)^2 \le \left[1 + 16\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t)\right]\sum_{t=1}^T\left(E|e_t|^{2+\delta}\right)^{2/(2+\delta)}, \quad T\ge1.$$
Lemma 5
(Corollary 1 in [47] and Theorem 0 in [48]). For some $\delta > 0$, let $\{e_t, t\ge1\}$ be an $\alpha$-mixing sequence with $Ee_t = 0$ for all $t\ge1$, $\sup_{t\ge1}E|e_t|^{2+\delta} < \infty$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$. For $T\ge1$, denote $S_T = \sum_{t=1}^T e_t$ and suppose

$$E S_T^2/T \to \sigma^2 > 0, \quad \text{as } T\to\infty. \tag{28}$$

Then

$$\frac{S_T}{\sqrt{T\sigma^2}} \xrightarrow{d} N(0,1), \quad \text{as } T\to\infty. \tag{29}$$

Furthermore,

$$W_T(x) \Rightarrow W(x), \quad \text{as } T\to\infty, \tag{30}$$

where $W_T(x) = S_{\lfloor Tx\rfloor}/\sqrt{T\sigma^2}$ for $x\in[0,1]$, and $\{W(x), x\in[0,1]\}$ is a Wiener process (standard Brownian motion). Then, for $x\in[0,1]$,

$$W_T(x) - xW_T(1) \Rightarrow W(x) - xW(1) \stackrel{d}{=} W^0(x), \quad \text{as } T\to\infty,$$

where $\{W^0(x); x\in[0,1]\}$ is a standard Brownian bridge.
Lemma 6.
Let Assumptions 1 and 2 be satisfied. Denote

$$\xi_T = \frac{\sigma_0}{\sqrt{Ts_1^2}}\sum_{t=1}^T e_t \quad \text{and} \quad \eta_T = \frac{\sigma_0^2}{\sqrt{Ts_2^2}}\sum_{t=1}^T\left(e_t^2 - Ee_t^2\right),$$

where $\sigma_0$ is defined by (5), and $s_1^2$ and $s_2^2$ are defined by (10), respectively. Under the null hypothesis $H_0$ defined by (5), we have

$$(\xi_T, \eta_T) \xrightarrow{d} (\xi, \eta), \quad \text{as } T\to\infty, \tag{31}$$

where $\xi$ and $\eta$ are two independent $N(0,1)$ random variables. For $x\in[0,1]$, let $k = \lfloor xT\rfloor$. Then

$$\left(\frac{\sigma_0}{\sqrt{Ts_1^2}}\sum_{t=1}^{k} e_t,\ \frac{\sigma_0^2}{\sqrt{Ts_2^2}}\sum_{t=1}^{k}\left(e_t^2 - Ee_t^2\right)\right) \Rightarrow (W_1(x), W_2(x)), \tag{32}$$

and

$$\left(\frac{\sigma_0}{\sqrt{Ts_1^2}}\left(\sum_{t=1}^{k} e_t - \frac kT\sum_{t=1}^{T} e_t\right),\ \frac{\sigma_0^2}{\sqrt{Ts_2^2}}\left(\sum_{t=1}^{k} e_t^2 - \frac kT\sum_{t=1}^{T} e_t^2\right)\right) \Rightarrow (W_1^0(x), W_2^0(x)), \tag{33}$$

where $W_1(x)$ and $W_2(x)$ are independent Wiener processes, and $W_1^0(x)$ and $W_2^0(x)$ are independent standard Brownian bridges.
Proof of Theorem 1.
By (1), (2), (4), (5) and $Ee_t = 0$ for all $t\ge1$, it is easy to check that

$$\frac{g_1(T,k)}{\sqrt{s_1^2}} = \frac{U_{k,1}}{\sqrt{Ts_1^2}} = \frac{1}{\sqrt{Ts_1^2}}\cdot\frac{k(T-k)}{T}\left(\frac1k\sum_{t=1}^k X_t - \frac{1}{T-k}\sum_{t=k+1}^T X_t\right) = \frac{1}{\sqrt{Ts_1^2}}\left(\sum_{t=1}^k X_t - \frac kT\sum_{t=1}^T X_t\right) = \frac{\sigma_0}{\sqrt{Ts_1^2}}\left(\sum_{t=1}^k e_t - \frac kT\sum_{t=1}^T e_t\right). \tag{34}$$

Meanwhile, we have

$$\begin{aligned} \frac1k\sum_{t=1}^k(X_t-\bar X)^2 - \frac{1}{T-k}\sum_{t=k+1}^T(X_t-\bar X)^2 &= \left(\frac1k\sum_{t=1}^k X_t^2 - \frac{1}{T-k}\sum_{t=k+1}^T X_t^2\right) - 2\bar X\left(\frac1k\sum_{t=1}^k X_t - \frac{1}{T-k}\sum_{t=k+1}^T X_t\right) \\ &= \left(\frac1k\sum_{t=1}^k(\mu_0+\sigma_0e_t)^2 - \frac{1}{T-k}\sum_{t=k+1}^T(\mu_0+\sigma_0e_t)^2\right) - 2\bar X\left(\frac1k\sum_{t=1}^k(\mu_0+\sigma_0e_t) - \frac{1}{T-k}\sum_{t=k+1}^T(\mu_0+\sigma_0e_t)\right) \\ &= 2\sigma_0(\mu_0-\bar X)\left(\frac1k\sum_{t=1}^k e_t - \frac{1}{T-k}\sum_{t=k+1}^T e_t\right) + \sigma_0^2\left(\frac1k\sum_{t=1}^k e_t^2 - \frac{1}{T-k}\sum_{t=k+1}^T e_t^2\right), \end{aligned}$$

where $\bar X = \frac1T\sum_{t=1}^T X_t$. Combining this with (4), we have

$$\begin{aligned} \frac{g_2(T,k)}{\sqrt{s_2^2}} = \frac{U_{k,2}}{\sqrt{Ts_2^2}} &= \frac{1}{\sqrt{Ts_2^2}}\cdot\frac{k(T-k)}{T}\left[2\sigma_0(\mu_0-\bar X)\left(\frac1k\sum_{t=1}^k e_t - \frac{1}{T-k}\sum_{t=k+1}^T e_t\right) + \sigma_0^2\left(\frac1k\sum_{t=1}^k e_t^2 - \frac{1}{T-k}\sum_{t=k+1}^T e_t^2\right)\right] \\ &= \frac{2\sigma_0(\mu_0-\bar X)}{\sqrt{Ts_2^2}}\left(\sum_{t=1}^k e_t - \frac kT\sum_{t=1}^T e_t\right) + \frac{\sigma_0^2}{\sqrt{Ts_2^2}}\left(\sum_{t=1}^k e_t^2 - \frac kT\sum_{t=1}^T e_t^2\right) := D_{T1} + D_{T2}. \end{aligned} \tag{35}$$

Since $\delta > 0$ and $\sup_{t\ge1}E|e_t|^{4+2\delta} < \infty$, we have $\sup_{t\ge1}E|e_t|^{2+\delta} < \infty$. Then, by Lemma 4 with $\sup_{t\ge1}E|e_t|^{2+\delta} < \infty$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$, we have

$$\mathrm{Var}(\bar X) = E(\bar X - \mu_0)^2 = \frac{\sigma_0^2}{T^2}E\left(\sum_{t=1}^T e_t\right)^2 \le \frac{C_1}{T^2}\left(1 + 16\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t)\right)\sum_{t=1}^T\left(E|e_t|^{2+\delta}\right)^{2/(2+\delta)} = O\left(\frac1T\right), \tag{36}$$

which implies

$$\bar X - \mu_0 = O_P(T^{-1/2}). \tag{37}$$

In addition, we apply (33) in Lemma 6 and obtain

$$\frac{2\sigma_0}{\sqrt{Ts_2^2}}\left(\sum_{t=1}^k e_t - \frac kT\sum_{t=1}^T e_t\right) = O_P(1).$$

Thus, $D_{T1} = O_P(T^{-1/2}) = o_P(1)$, which implies that $D_{T2}$ in (35) is the main term. Lastly, by (34), (35) and $D_{T1} = o_P(1)$, we apply Lemma 6 and obtain (12), i.e.,

$$\left(\frac{g_1(T,k)}{\sqrt{s_1^2}},\ \frac{g_2(T,k)}{\sqrt{s_2^2}}\right) \Rightarrow (W_1^0(x), W_2^0(x)),$$

where $W_1^0(x)$ and $W_2^0(x)$ are independent standard Brownian bridges. In addition, by (12) and the continuous mapping theorem, (13) is also proved. □
Proof of Lemma 1.
First, we prove that $s_1^2$ in (10) converges absolutely. Obviously, $E|e_1|^{8+4\delta} < \infty$ implies $E|e_1|^{2+\delta} < \infty$ for some $\delta > 0$. Then, by the second-order stationarity of the $\alpha$-mixing sequence $\{e_t, t\ge1\}$, we apply Lemma 3 with $Ee_1 = 0$, $Ee_1^2 = 1$, $E|e_1|^{2+\delta} < \infty$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$, and obtain

$$0 < s_1^2 = \gamma_1(0) + 2\sum_{h=1}^\infty\gamma_1(h) \le \mathrm{Var}(X_1) + 2\sum_{h=1}^\infty|\mathrm{Cov}(X_1, X_{1+h})| = \sigma_0^2\left[\mathrm{Var}(e_1) + 2\sum_{h=1}^\infty|\mathrm{Cov}(e_1, e_{1+h})|\right] < \infty. \tag{38}$$

By (10) and (16), we use the decomposition

$$|\hat s_{T,1}^2 - s_1^2| \le |\hat\gamma_1(0) - \gamma_1(0)| + 2\sum_{h=1}^{h_T}|\hat\gamma_1(h) - \gamma_1(h)| + 2\sum_{h=h_T+1}^\infty|\gamma_1(h)| := \sum_{i=1}^3 K_{T,i}, \tag{39}$$

where $h_T\to\infty$ as $T\to\infty$.

Obviously, by (38), (39) and $h_T\to\infty$ as $T\to\infty$, it can be checked that

$$K_{T,3} \to 0, \quad \text{as } T\to\infty. \tag{40}$$

Now, we consider the term $K_{T,1}$ in (39). By the second-order stationarity of $\{e_t\}$, (14), $Ee_1 = 0$ and $\gamma_1(0) = \sigma_0^2 Ee_1^2$, we obtain

$$\hat\gamma_1(0) - \gamma_1(0) = \frac{\sigma_0^2}{T}\sum_{t=1}^T(e_t^2 - Ee_t^2) - \sigma_0^2(\bar e)^2, \tag{41}$$

where $\bar e = \frac1T\sum_{t=1}^T e_t = (\bar X - \mu_0)/\sigma_0$. Combining this with (37), we have

$$\bar e = O_P(T^{-1/2}). \tag{42}$$

Since $E|e_1|^{8+4\delta} < \infty$, we have $E|e_1|^{2(2+\delta)} < \infty$ for some $\delta > 0$. In addition, by the second-order stationarity of the $\alpha$-mixing sequence $\{(e_t^2 - Ee_t^2), t\ge1\}$ with $E(e_1^2 - Ee_1^2) = 0$, $E|e_1^2 - Ee_1^2|^{2+\delta} < \infty$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$, we apply Lemma 4 and obtain

$$E\left|\sum_{t=1}^T(e_t^2 - Ee_t^2)\right|^2 \le \left(1 + 16\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t)\right)\sum_{t=1}^T\left(E|e_t^2 - Ee_t^2|^{2+\delta}\right)^{2/(2+\delta)} = O(T), \tag{43}$$

which implies

$$\left|\sum_{t=1}^T(e_t^2 - Ee_t^2)\right| = O_P(T^{1/2}). \tag{44}$$

Consequently, it follows from (41), (42), (44) and $0 < \sigma_0^2 < \infty$ that

$$|\hat\gamma_1(0) - \gamma_1(0)| \le \sigma_0^2\left|\frac1T\sum_{t=1}^T(e_t^2 - Ee_t^2)\right| + \sigma_0^2(\bar e)^2 = O_P(T^{-1/2}) + O_P(T^{-1}) = O_P(T^{-1/2}). \tag{45}$$

Thus,

$$K_{T,1} = o_P(1). \tag{46}$$
Next, we consider $K_{T,2}$. By (14), $Ee_1 = 0$, $\gamma_1(h) = \sigma_0^2 Ee_1e_{1+h}$ and

$$\hat\gamma_1(h) = \sigma_0^2\left[\frac1T\sum_{t=1}^{T-h}e_te_{t+h} - 2(\bar e)^2 + \frac{T-h}{T}(\bar e)^2 + \frac{\bar e}{T}\sum_{t=T-h+1}^T e_t + \frac{\bar e}{T}\sum_{t=1}^h e_t\right],$$

it can be seen that

$$\hat\gamma_1(h) - \gamma_1(h) = \sigma_0^2\left[\frac1T\sum_{t=1}^{T-h}(e_te_{t+h} - Ee_te_{t+h}) - \frac1T\sum_{t=T-h+1}^T Ee_te_{t+h} - \left(\frac hT + 1\right)(\bar e)^2 + \frac{\bar e}{T}\sum_{t=T-h+1}^T e_t + \frac{\bar e}{T}\sum_{t=1}^h e_t\right] := \sigma_0^2\sum_{i=1}^5 N_{h,i}. \tag{47}$$

By Lemma 2, $\{e_te_{t+h}, 1\le t\le T-h\}$ are $\alpha$-mixing random variables with mixing coefficients of the same order. Thus, by Lemma 4 with $E|e_1|^{2(2+\delta)} < \infty$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$, we establish that

$$E\left(\sum_{t=1}^{T-h}[e_te_{t+h} - E(e_te_{t+h})]\right)^2 \le C_1\sum_{t=1}^{T-h}\left(E|e_te_{t+h} - E(e_te_{t+h})|^{2+\delta}\right)^{2/(2+\delta)} = O(T-h),$$

which implies

$$\left|\sum_{t=1}^{T-h}[e_te_{t+h} - E(e_te_{t+h})]\right| = O_P(\sqrt{T-h}) = O_P(T^{1/2}). \tag{48}$$

By (47), (48), the fact that $h_T = O(T^\beta)$ in (11), and $\beta\in(0,1/2)$, we obtain

$$\sum_{h=1}^{h_T}|N_{h,1}| = O_P(T^{\beta-1/2}) = o_P(1). \tag{49}$$

Meanwhile, by $Ee_1^2 < \infty$, the Hölder inequality and $0 < \beta < 1/2$, we have

$$\sum_{h=1}^{h_T}|N_{h,2}| \le \frac1T\sum_{h=1}^{h_T}\sum_{t=T-h+1}^T\left(Ee_t^2\right)^{1/2}\left(Ee_{t+h}^2\right)^{1/2} \le \frac CT\sum_{h=1}^{h_T}h = O(T^{2\beta-1}) = o(1). \tag{50}$$

By (42),

$$\sum_{h=1}^{h_T}|N_{h,3}| \le \sum_{h=1}^{h_T}\left(\frac hT + 1\right)(\bar e)^2 = O_P(T^{\beta-1}) = o_P(1). \tag{51}$$

Similar to the proof of (44), we have

$$\left|\sum_{t=T-h+1}^T e_t\right| = O_P(\sqrt h). \tag{52}$$

Thus, it follows from (42), (47), (52), $h_T = O(T^\beta)$ and $\beta\in(0,1/2)$ that

$$\sum_{h=1}^{h_T}|N_{h,4}| \le \frac{|\bar e|}{T}\sum_{h=1}^{h_T}\left|\sum_{t=T-h+1}^T e_t\right| = O_P(T^{-3/2})\,O_P(h_T^{3/2}) = O_P(T^{3(\beta-1)/2}) = o_P(1). \tag{53}$$

Similarly,

$$\sum_{h=1}^{h_T}|N_{h,5}| = O_P(T^{3(\beta-1)/2}) = o_P(1). \tag{54}$$

Therefore, by (39), (47), (49)–(54) and $0 < \sigma_0^2 < \infty$, we establish

$$K_{T,2} \le 2\sigma_0^2\sum_{h=1}^{h_T}\sum_{i=1}^5|N_{h,i}| = o_P(1). \tag{55}$$

Finally, it follows from (39), (40), (46) and (55) that

$$|\hat s_{T,1}^2 - s_1^2| = o_P(1), \tag{56}$$

i.e., the first part of (17) is proved.
Next, we prove the second part of (17). Similar to (39), by (10) and (16), it follows that

$$|\hat s_{T,2}^2 - s_2^2| \le |\hat\gamma_2(0) - \gamma_2(0)| + 2\sum_{h=1}^{h_T}|\hat\gamma_2(h) - \gamma_2(h)| + 2\sum_{h=h_T+1}^\infty|\gamma_2(h)| := \sum_{i=1}^3 R_{T,i}, \tag{57}$$

where $h_T\to\infty$ as $T\to\infty$. Similar to the proof of (38), by Lemma 3 with $E|e_1|^{2(2+\delta)} < \infty$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$, it follows that

$$0 < s_2^2 = \gamma_2(0) + 2\sum_{h=1}^\infty\gamma_2(h) \le \mathrm{Var}\left((X_1-\mu_0)^2\right) + 2\sum_{h=1}^\infty\left|\mathrm{Cov}\left((X_1-\mu_0)^2, (X_{1+h}-\mu_0)^2\right)\right| \le \sigma_0^4\left(\mathrm{Var}(e_1^2) + 2\sum_{h=1}^\infty|\mathrm{Cov}(e_1^2, e_{1+h}^2)|\right) < \infty. \tag{58}$$

Thus, we have

$$R_{T,3} \to 0, \quad \text{as } T\to\infty,$$

since $h_T\to\infty$ as $T\to\infty$.

Now, we consider the term $R_{T,1}$ in (57). By (15) and $\gamma_2(0) = E(X_1-\mu_0)^4 - \left(E(X_1-\mu_0)^2\right)^2$, we obtain

$$\begin{aligned} \hat\gamma_2(0) - \gamma_2(0) ={}& \frac1T\sum_{t=1}^T(X_t-\mu_0)^4 + \frac{4(\mu_0-\bar X)}{T}\sum_{t=1}^T(X_t-\mu_0)^3 + \frac{8(\mu_0-\bar X)^2}{T}\sum_{t=1}^T(X_t-\mu_0)^2 + \frac{4(\mu_0-\bar X)^3}{T}\sum_{t=1}^T(X_t-\mu_0) \\ & - \left(\frac1T\sum_{t=1}^T(X_t-\mu_0)^2\right)^2 - E(X_1-\mu_0)^4 + \left(E(X_1-\mu_0)^2\right)^2 \\ ={}& \frac1T\sum_{t=1}^T\left[(X_t-\mu_0)^4 - E(X_t-\mu_0)^4\right] + \frac{4(\mu_0-\bar X)}{T}\sum_{t=1}^T\left[(X_t-\mu_0)^3 - E(X_t-\mu_0)^3\right] \\ & + \frac{8(\mu_0-\bar X)^2}{T}\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] + \frac{4(\mu_0-\bar X)^3}{T}\sum_{t=1}^T\left[(X_t-\mu_0) - E(X_t-\mu_0)\right] \\ & - \left(\frac1T\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right]\right)^2 - \frac{2E(X_1-\mu_0)^2}{T}\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] \\ & + 4(\mu_0-\bar X)E(X_1-\mu_0)^3 + 8(\mu_0-\bar X)^2E(X_1-\mu_0)^2 + 4(\mu_0-\bar X)^3E(X_1-\mu_0) \\ :={}& \sum_{i=1}^9 L_{T,i}. \end{aligned} \tag{59}$$

By the null hypothesis $H_0$ defined by (5), $X_t = \mu_0 + \sigma_0 e_t$, $t\ge1$. Then, $\{(X_t-\mu_0)^j - E(X_t-\mu_0)^j, 1\le t\le T, j = 1,2,3,4\}$ are also $\alpha$-mixing random variables with mixing coefficients of the same order. Similar to the proof of (43), by Lemma 4 with $E|e_1|^{4(2+\delta)} < \infty$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$, it can be obtained that

$$E\left(\sum_{t=1}^T\left[(X_t-\mu_0)^j - E(X_t-\mu_0)^j\right]\right)^2 = O(T), \quad j = 1,2,3,4, \tag{60}$$

which implies

$$\left|\sum_{t=1}^T\left[(X_t-\mu_0)^j - E(X_t-\mu_0)^j\right]\right| = O_P(T^{1/2}), \quad j = 1,2,3,4. \tag{61}$$

Thus, by (37) and (61), we can obtain that

$$|L_{T,1}| = O_P(T^{-1/2}), \quad |L_{T,2}| = O_P(T^{-1}), \quad |L_{T,3}| = O_P(T^{-3/2}), \tag{62}$$

$$|L_{T,4}| = O_P(T^{-2}), \quad |L_{T,5}| = O_P(T^{-1}), \quad |L_{T,6}| = O_P(T^{-1/2}), \tag{63}$$

$$|L_{T,7}| = O_P(T^{-1/2}), \quad |L_{T,8}| = O_P(T^{-1}), \quad |L_{T,9}| = O_P(T^{-3/2}). \tag{64}$$

By (57), (59) and (62)–(64), we have

$$R_{T,1} = |\hat\gamma_2(0) - \gamma_2(0)| \le \sum_{i=1}^9|L_{T,i}| = O_P(T^{-1/2}) = o_P(1). \tag{65}$$
It remains to consider the term $R_{T,2}$. We can check that

$$\hat\gamma_2(h) = \frac1T\sum_{t=1}^{T-h}Z_t^2Z_{t+h}^2 - \left(\frac{T+h}{T}\right)\left(\overline{Z^2}\right)^2 + \frac{\overline{Z^2}}{T}\sum_{t=T-h+1}^T Z_t^2 + \frac{\overline{Z^2}}{T}\sum_{t=1}^h Z_t^2 := \sum_{i=1}^4 H_{h,i}, \tag{66}$$

where $Z_t = X_t - \bar X$. Combining this with $\gamma_2(h) = E(X_1-\mu_0)^2(X_{1+h}-\mu_0)^2 - E(X_1-\mu_0)^2E(X_{1+h}-\mu_0)^2$, it can be checked that

$$\begin{aligned} \hat\gamma_2(h) - \gamma_2(h) ={}& \sum_{i=1}^4 H_{h,i} - E(X_1-\mu_0)^2(X_{1+h}-\mu_0)^2 + E(X_1-\mu_0)^2E(X_{1+h}-\mu_0)^2 \\ ={}& \frac1T\sum_{t=1}^{T-h}\left[(X_t-\mu_0)^2(X_{t+h}-\mu_0)^2 - E(X_t-\mu_0)^2(X_{t+h}-\mu_0)^2\right] \\ & + \frac{2(\mu_0-\bar X)}{T}\sum_{t=1}^{T-h}\left[(X_t-\mu_0)^2(X_{t+h}-\mu_0) - E(X_t-\mu_0)^2(X_{t+h}-\mu_0)\right] + \frac{(\mu_0-\bar X)^2}{T}\sum_{t=1}^{T-h}\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] \\ & + \frac{2(\mu_0-\bar X)}{T}\sum_{t=1}^{T-h}\left[(X_t-\mu_0)(X_{t+h}-\mu_0)^2 - E(X_t-\mu_0)(X_{t+h}-\mu_0)^2\right] + \frac{4(\mu_0-\bar X)^2}{T}\sum_{t=1}^{T-h}\left[(X_t-\mu_0)(X_{t+h}-\mu_0) - E(X_t-\mu_0)(X_{t+h}-\mu_0)\right] \\ & + \frac{2(\mu_0-\bar X)^3}{T}\sum_{t=1}^{T-h}(X_t-\mu_0) + \frac{(\mu_0-\bar X)^2}{T}\sum_{t=1}^{T-h}\left[(X_{t+h}-\mu_0)^2 - E(X_{t+h}-\mu_0)^2\right] + \frac{2(\mu_0-\bar X)^3}{T}\sum_{t=1}^{T-h}(X_{t+h}-\mu_0) \\ & - \frac{T+h}{T}\left(\frac1T\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right]\right)^2 + \frac{2\left[(\mu_0-\bar X)^2 - E(X_1-\mu_0)^2\right](T+h)}{T^2}\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] \\ & + \frac{1}{T^2}\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right]\sum_{t=T-h+1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] + \frac{2hE(X_1-\mu_0)^2}{T^2}\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] \\ & + \frac{E(X_1-\mu_0)^2}{T}\sum_{t=T-h+1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] + \frac{2(\mu_0-\bar X)}{T^2}\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right]\sum_{t=T-h+1}^T(X_t-\mu_0) \\ & + \frac{2(\mu_0-\bar X)E(X_1-\mu_0)^2}{T}\sum_{t=T-h+1}^T(X_t-\mu_0) + \frac{2(\mu_0-\bar X)^2h}{T^2}\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] \\ & - \frac{(\mu_0-\bar X)^2}{T}\sum_{t=T-h+1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] - \frac{2(\mu_0-\bar X)^3}{T}\sum_{t=T-h+1}^T(X_t-\mu_0) \\ & + \frac{1}{T^2}\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right]\sum_{t=1}^h\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] + \frac{E(X_1-\mu_0)^2}{T}\sum_{t=1}^h\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] \\ & + \frac{2(\mu_0-\bar X)}{T^2}\sum_{t=1}^T\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right]\sum_{t=1}^h(X_t-\mu_0) + \frac{2(\mu_0-\bar X)E(X_1-\mu_0)^2}{T}\sum_{t=1}^h(X_t-\mu_0) \\ & - \frac{(\mu_0-\bar X)^2}{T}\sum_{t=1}^h\left[(X_t-\mu_0)^2 - E(X_t-\mu_0)^2\right] - \frac{2(\mu_0-\bar X)^3}{T}\sum_{t=1}^h(X_t-\mu_0) \\ & - \frac hT E(X_1-\mu_0)^2(X_{1+h}-\mu_0)^2 + \frac{2(\mu_0-\bar X)(T-h)}{T}E(X_1-\mu_0)^2(X_{1+h}-\mu_0) + \frac{2(\mu_0-\bar X)(T-h)}{T}E(X_1-\mu_0)(X_{1+h}-\mu_0)^2 \\ & + \frac{4(\mu_0-\bar X)^2(T-h)}{T}E(X_1-\mu_0)(X_{1+h}-\mu_0) + 4(\mu_0-\bar X)^2E(X_1-\mu_0)^2 + \frac hT\left(E(X_1-\mu_0)^2\right)^2 - \frac{4h}{T}(\mu_0-\bar X)^4 \\ :={}& \sum_{i=1}^{31} G_{h,i}. \end{aligned} \tag{67}$$

According to Lemma 2, it is easy to see that $\{(X_t-\mu_0)^i(X_{t+h}-\mu_0)^j - E(X_t-\mu_0)^i(X_{t+h}-\mu_0)^j,\ i,j = 0,1,2,\ 1\le t\le T-h\}$ are $\alpha$-mixing random variables with mixing coefficients of the same order. Then, similar to the proof of (43), by Lemma 4 with $E|e_1|^{4(2+\delta)} < \infty$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$, we obtain

$$E\left|\sum_{t=1}^{T-h}\left[(X_t-\mu_0)^i(X_{t+h}-\mu_0)^j - E(X_t-\mu_0)^i(X_{t+h}-\mu_0)^j\right]\right|^2 = O(T), \quad i,j = 0,1,2,$$

which implies

$$\left|\sum_{t=1}^{T-h}\left[(X_t-\mu_0)^i(X_{t+h}-\mu_0)^j - E(X_t-\mu_0)^i(X_{t+h}-\mu_0)^j\right]\right| = O_P(T^{1/2}), \quad i,j = 0,1,2. \tag{68}$$

Thus, by (37), (67) and (68), one can obtain that

$$|G_{h,1}| = O_P(T^{-1/2}), \quad |G_{h,2}| = O_P(T^{-1}), \quad |G_{h,3}| = O_P(T^{-3/2}), \tag{69}$$

$$|G_{h,4}| = O_P(T^{-1}), \quad |G_{h,5}| = O_P(T^{-3/2}), \quad |G_{h,6}| = O_P(T^{-2}), \tag{70}$$

$$|G_{h,7}| = O_P(T^{-3/2}), \quad |G_{h,8}| = O_P(T^{-2}), \quad |G_{h,9}| = O_P(T^{-1}), \tag{71}$$

$$|G_{h,10}| = O_P(T^{-1/2}), \quad |G_{h,11}| = O_P(T^{-3/2}h^{1/2}), \quad |G_{h,12}| = O_P(T^{-3/2}h), \tag{72}$$

$$|G_{h,13}| = O_P(T^{-1}h^{1/2}), \quad |G_{h,14}| = O_P(T^{-2}h^{1/2}), \quad |G_{h,15}| = O_P(T^{-3/2}h^{1/2}), \tag{73}$$

$$|G_{h,16}| = O_P(T^{-5/2}h), \quad |G_{h,17}| = O_P(T^{-2}h^{1/2}), \quad |G_{h,18}| = O_P(T^{-5/2}h^{1/2}), \tag{74}$$

$$|G_{h,19}| = O_P(T^{-3/2}h^{1/2}), \quad |G_{h,20}| = O_P(T^{-1}h^{1/2}), \quad |G_{h,21}| = O_P(T^{-2}h^{1/2}), \tag{75}$$

$$|G_{h,22}| = O_P(T^{-3/2}h^{1/2}), \quad |G_{h,23}| = O_P(T^{-2}h^{1/2}), \quad |G_{h,24}| = O_P(T^{-5/2}h^{1/2}), \tag{76}$$

$$|G_{h,25}| = O(T^{-1}h), \quad |G_{h,26}| = O_P(T^{-1/2}), \quad |G_{h,27}| = O_P(T^{-1/2}), \tag{77}$$

$$|G_{h,28}| = |G_{h,29}| = O_P(T^{-1}), \quad |G_{h,30}| = O(T^{-1}h), \quad |G_{h,31}| = O_P(T^{-3}h). \tag{78}$$

Therefore, by (57), (67), (69)–(78), $h_T = O(T^\beta)$ and $\beta\in(0,1/2)$, we obtain

$$R_{T,2} \le 2\sum_{h=1}^{h_T}\sum_{i=1}^{31}|G_{h,i}| = O_P(T^{\beta-1/2}) = o_P(1). \tag{79}$$

Consequently, it follows from (57), (58), (65) and (79) that

$$|\hat s_{T,2}^2 - s_2^2| = o_P(1), \tag{80}$$

i.e., the second part of (17) is proved. □
Proof of Lemma 6.
By the Cramér–Wold device, it is sufficient to show that

$$a\xi_T + b\eta_T \xrightarrow{d} N(0, a^2 + b^2) \quad \text{for all } a, b\in\mathbb{R}. \tag{81}$$

We rewrite $a\xi_T + b\eta_T = \frac{1}{\sqrt T}\sum_{t=1}^T\zeta_t$, where

$$\zeta_t = \frac{a\sigma_0}{\sqrt{s_1^2}}e_t + \frac{b\sigma_0^2}{\sqrt{s_2^2}}\left(e_t^2 - Ee_t^2\right), \quad 1\le t\le T. \tag{82}$$

Obviously, $\{\zeta_t, t\ge1\}$ is also a mean zero sequence of $\alpha$-mixing random variables with mixing coefficients of the same order. By the null hypothesis $H_0$ defined by (5) and Assumptions 1 and 2, it is easy to check that

$$\begin{aligned} \lim_{T\to\infty}\frac1T\mathrm{Var}\left(\sum_{t=1}^T\zeta_t\right) &= \lim_{T\to\infty}\frac1T\mathrm{Var}\left(\sum_{t=1}^T\left(\frac{a\sigma_0}{\sqrt{s_1^2}}e_t + \frac{b\sigma_0^2}{\sqrt{s_2^2}}e_t^2\right)\right) \\ &= \lim_{T\to\infty}\frac1T\mathrm{Var}\left(\frac{a\sigma_0}{\sqrt{s_1^2}}\sum_{t=1}^T e_t\right) + \lim_{T\to\infty}\frac1T\mathrm{Var}\left(\frac{b\sigma_0^2}{\sqrt{s_2^2}}\sum_{t=1}^T e_t^2\right) + \lim_{T\to\infty}\frac{2ab\sigma_0^3}{T\sqrt{s_1^2s_2^2}}\sum_{i=1}^T\sum_{j=1}^T\mathrm{Cov}(e_i, e_j^2) \\ &= \lim_{T\to\infty}\frac{a^2}{Ts_1^2}\mathrm{Var}\left(\sum_{t=1}^T(X_t-\mu_0)\right) + \lim_{T\to\infty}\frac{b^2}{Ts_2^2}\mathrm{Var}\left(\sum_{t=1}^T(X_t-\mu_0)^2\right) + \lim_{T\to\infty}\frac{2ab}{T\sqrt{s_1^2s_2^2}}\sum_{i=1}^T\sum_{j=1}^T\mathrm{Cov}\left[(X_i-\mu_0),(X_j-\mu_0)^2\right] \\ &= a^2 + b^2. \end{aligned} \tag{83}$$

Thus, by (28) and (83), we apply Lemma 5 with $\sup_{t\ge1}E|e_t|^{4+2\delta} < \infty$ and $\sum_{t=1}^\infty\alpha^{\delta/(2+\delta)}(t) < \infty$ for some $\delta > 0$, and immediately obtain the result of (31).

Next, we prove (32). Denote $S_T^{(1)} = \sigma_0\sum_{t=1}^T e_t$, $S_T^{(2)} = \sigma_0^2\sum_{t=1}^T(e_t^2 - Ee_t^2)$, and

$$W_{T1}(x) = \frac{S_{\lfloor Tx\rfloor}^{(1)}}{\sqrt{Ts_1^2}} \quad \text{and} \quad W_{T2}(x) = \frac{S_{\lfloor Tx\rfloor}^{(2)}}{\sqrt{Ts_2^2}} \quad \text{for } x\in[0,1],$$

where $s_1^2$ and $s_2^2$ are defined by (10). Obviously, $\{e_t, t\ge1\}$ and $\{(e_t^2 - Ee_t^2), t\ge1\}$ are mean zero $\alpha$-mixing sequences with mixing coefficients of the same order. Then, by (29) and (31) and Lemma 5, we obtain (32). Combining this with (30), the proof of (33) is completed. □

Author Contributions

Supervision, W.Y.; software, M.G.; writing–original draft preparation, X.S., X.W. and W.Y. All authors have read and agreed to the published version of the manuscript.

Funding

Yang’s work was funded by NSF of Anhui Province (2008085MA14, 2108085MA06), Quality Engineering Project of Anhui University (2023xjzlgc232); Shi’s work was supported by the NSERC Discovery Grant RGPIN 2022-03264, the Interior Universities Research Coalition and the BC Ministry of Health, and the University of British Columbia Okanagan (UBC-O) Vice Principal Research in collaboration with the UBC-O Irving K. Barber Faculty of Science.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shewhart, W.A. The application of statistics as an aid in maintaining quality of a manufactured product. J. Amer. Statist. Assoc. 1925, 20, 546–548. [Google Scholar] [CrossRef]
  2. Page, E.S. Continuous inspection schemes. Biometrika 1954, 41, 100–115. [Google Scholar] [CrossRef]
  3. Antoch, J.; Hušková, M.; Veraverbeke, N. Change-point problem and bootstrap. J. Nonparametr. Stat. 1995, 5, 123–144. [Google Scholar] [CrossRef]
  4. Bai, J. Least squares estimation of a shift in linear processes. J. Time Series Anal. 1994, 15, 453–472. [Google Scholar] [CrossRef]
  5. Inclán, C.; Tiao, G. Use of cumulative sums of squares for retrospective detection of changes of variance. J. Amer. Statist. Assoc. 1994, 89, 913–923. [Google Scholar]
  6. Gombay, E.; Horváth, L.; Hušková, M. Estimators and tests for change in variances. Statist. Decis. 1996, 14, 145–159. [Google Scholar] [CrossRef]
  7. Csörgő, M.; Horváth, L. Limit Theorems in Change-Point Analysis; Wiley: Chichester, UK, 1997; pp. 170–180. [Google Scholar]
  8. Chen, J.; Gupta, A. Parametric Statistical Change Point Analysis, with Applications to Genetics, Medicine and Finance, 2nd ed.; Birkhäuser: Boston, MA, USA, 2012; pp. 1–30. [Google Scholar]
  9. Shiryaev, A. On stochastic models and optimal methods in the quickest detection problems. Theory Probab. Appl. 2009, 53, 385–401. [Google Scholar] [CrossRef]
  10. Shiryaev, A. Stochastic Disorder Problems; Springer: Berlin/Heidelberg, Germany, 2019; pp. 367–388. [Google Scholar]
  11. Rosenblatt, M. A central limit theorem and a strong mixing condition. Proc. Natl. Acad. Sci. USA 1956, 42, 43–47. [Google Scholar] [CrossRef] [PubMed]
  12. Killick, R.; Eckley, I.A. changepoint: An R Package for Changepoint Analysis. J. Stat. Softw. 2014, 58, 1–19. [Google Scholar] [CrossRef]
  13. Meier, A.; Kirch, C.; Cho, H. mosum: A Package for Moving Sums in Change-Point Analysis. J. Stat. Softw. 2021, 97, 1–42. [Google Scholar] [CrossRef]
  14. Kokoszka, P.; Leipus, R. Change-point in the mean of dependent observations. Statist. Probab. Lett. 1998, 40, 385–393. [Google Scholar] [CrossRef]
  15. Shi, X.P.; Wu, Y.H.; Miao, B.Q. Strong convergence rate of estimators of change-point and its application. Comput. Statist. Data Anal. 2009, 53, 990–998. [Google Scholar] [CrossRef]
  16. Ding, S.S.; Fang, H.Y.; Dong, X.; Yang, W.Z. The CUSUM statistics of change-point models based on dependent sequences. J. Appl. Stat. 2022, 49, 2593–2611. [Google Scholar] [CrossRef] [PubMed]
  17. Zhou, J.; Liu, S.Y. Inference for mean change-point in infinite variance AR(p) process. Stat. Probab. Lett. 2009, 79, 6–15. [Google Scholar] [CrossRef]
  18. Shao, X.; Zhang, X. Testing for change points in time series. J. Amer. Statist. Assoc. 2010, 105, 1228–1240. [Google Scholar] [CrossRef]
  19. Shao, X. Self-normalization for time series: A review of recent developments. J. Amer. Statist. Assoc. 2015, 110, 1797–1817. [Google Scholar] [CrossRef]
  20. Tsay, R. Outliers, level shifts and variance changes in time series. J. Forecast. 1988, 7, 1–20. [Google Scholar] [CrossRef]
  21. Yang, W.Z.; Liu, H.S.; Wang, Y.W.; Wang, X.J. Data-driven estimation of change-points with mean shift. J. Korean Statist. Soc. 2023, 52, 130–153. [Google Scholar] [CrossRef]
  22. Bai, J. Common breaks in means and variance for panel data. J. Econom. 2010, 157, 78–92. [Google Scholar] [CrossRef]
  23. Horváth, L.; Hušková, M. Change-point detection in panel data. J. Time Ser. Anal. 2012, 33, 631–648. [Google Scholar] [CrossRef]
  24. Cho, H. Change-point detection in panel data via double CUSUM statistic. Electron. J. Stat. 2016, 10, 2000–2038. [Google Scholar] [CrossRef]
  25. Chen, J.; Gupta, A. Testing and locating variance change points with application to stock prices. J. Amer. Statist. Assoc. 1997, 92, 739–747. [Google Scholar] [CrossRef]
  26. Lee, S.; Park, S. The cusum of squares test for scale changes in infinite order moving average processes. Scand. J. Stat. 2001, 28, 625–644. [Google Scholar] [CrossRef]
  27. Xu, M.; Wu, Y.; Jin, B. Detection of a change-point in variance by a weighted sum of powers of variances test. J. Appl. Stat. 2019, 46, 664–679. [Google Scholar] [CrossRef]
  28. Berkes, I.; Gombay, E.; Horváth, L. Testing for changes in the covariance structure of linear processes. J. Stat. Plann. Inference 2009, 139, 2044–2063. [Google Scholar] [CrossRef]
  29. Lee, S.; Ha, J.; Na, O. The cusum test for parameter change in time series models. Scand. J. Stat. 2003, 30, 781–796. [Google Scholar] [CrossRef]
  30. Vexler, A. Guaranteed testing for epidemic changes of a linear regression model. J. Stat. Plann. Inference 2006, 136, 3101–3120. [Google Scholar] [CrossRef]
  31. Jin, B.S.; Wu, Y.H.; Shi, X.P. Consistent two-stage multiple change-point detection in linear models. Canad. J. Statist. 2016, 44, 161–179. [Google Scholar] [CrossRef]
  32. Gurevich, G. Optimal properties of parametric Shiryaev-Roberts statistical control procedures. Comput. Model. New Technol. 2013, 17, 37–50. [Google Scholar]
  33. Aue, A.; Hörmann, S.; Horváth, L.; Reimherr, M. Break detection in the covariance structure of multivariate time series models. Ann. Statist. 2009, 37, 4046–4087. [Google Scholar] [CrossRef]
  34. Cho, H.; Kirch, C. Two-stage data segmentation permitting multiscale change points, heavy tails and dependence. Ann. Inst. Statist. Math. 2022, 74, 653–684. [Google Scholar] [CrossRef]
  35. Niu, Y.; Hao, N.; Zhang, H. Multiple change-point detection: A selective overview. Statist. Sci. 2016, 31, 611–623. [Google Scholar] [CrossRef]
  36. Korkas, K.; Fryzlewicz, P. Multiple change-point detection for non-stationary time series using wild binary segmentation. Statist. Sinica 2017, 27, 287–311. [Google Scholar] [CrossRef]
  37. Shi, X.P.; Wu, Y.H.; Rao, C.R. Consistent and powerful graph-based change-point test for high-dimensional data. Proc. Natl. Acad. Sci. USA 2017, 114, 3873–3878. [Google Scholar] [CrossRef] [PubMed]
  38. Shi, X.P.; Wang, X.-S.; Reid, N. A New Class of Weighted CUSUM Statistics. Entropy 2022, 24, 1652. [Google Scholar] [CrossRef]
  39. Chen, F.; Mamon, R.; Nkurunziza, S. Inference for a change-point problem under a generalised Ornstein-Uhlenbeck setting. Ann. Inst. Statist. Math. 2018, 70, 807–853. [Google Scholar] [CrossRef]
  40. Zamba, K.D.; Hawkins, D.M. A multivariate change-point model for change in mean vector and/or covariance structure. J. Qual. Technol. 2009, 41, 285–303. [Google Scholar] [CrossRef]
  41. Oh, H.; Lee, S. On score vector-and residual-based CUSUM tests in ARMA-GARCH models. Stat. Methods Appl. 2018, 27, 385–406. [Google Scholar] [CrossRef]
  42. Jäntschi, L. A test detecting the outliers for continuous distributions based on the cumulative distribution function of the data being tested. Symmetry 2019, 11, 835. [Google Scholar] [CrossRef]
  43. Kengne, W.; Ngongo, I.S. Inference for nonstationary time series of counts with application to change-point problems. Ann. Inst. Statist. Math. 2022, 74, 801–835. [Google Scholar]
  44. Arrouch, M.S.E.; Elharfaoui, E.; Ngatchou-Wandji, J. Change-Point Detection in the Volatility of Conditional Heteroscedastic Autoregressive Nonlinear Models. Mathematics 2023, 11, 4018. [Google Scholar] [CrossRef]
  45. Hall, P.; Heyde, C.C. Martingale Limit Theory and Its Application; Academic Press Inc.: New York, NY, USA, 1980. [Google Scholar]
  46. Lin, Z.Y.; Lu, C.R. Limit Theory for Mixing Dependent Random Variables; Science Press: Beijing, China, 1997. [Google Scholar]
  47. Withers, C.S. Central limit theorems for dependent variables. Z. Wahrsch. Verw. Gebiete. 1981, 57, 509–534. [Google Scholar] [CrossRef]
  48. Herrndorf, N. A Functional Central Limit Theorem for Strongly Mixing Sequences of Random Variables. Z. Wahrsch. Verw. Gebiete 1985, 69, 541–550. [Google Scholar] [CrossRef]
  49. White, H.; Domowitz, I. Nonlinear regression with dependent observations. Econometrica 1984, 52, 143–162. [Google Scholar] [CrossRef]
  50. Györfi, L.; Härdle, W.; Sarda, P.; Vieu, P. Nonparametric Curve Estimation from Time Series; Springer: Berlin/Heidelberg, Germany, 1989. [Google Scholar]
  51. Fan, J.Q.; Yao, Q.W. Nonlinear Time Series. Nonparametric and Parametric Methods; Springer: New York, NY, USA, 2003. [Google Scholar]
  52. Yang, W.Z.; Wang, Y.W.; Hu, S.H. Some probability inequalities of least-squares estimator in non linear regression model with strong mixing errors. Comm. Statist. Theory Methods 2017, 46, 165–175. [Google Scholar] [CrossRef]
  53. Billingsley, P. Convergence of Probability Measures; John Wiley & Sons, Inc.: New York, NY, USA, 1968. [Google Scholar]
  54. Kiefer, J. K-sample analogues of the Kolmogorov-Smirnov and Cramér-v. Mises tests. Ann. Math. Statist. 1959, 30, 420–447. [Google Scholar] [CrossRef]
  55. Bolboacă, S.D.; Jäntschi, L. Predictivity approach for quantitative structure-property models. Application for blood-brain barrier permeation of diverse drug-like compounds. Int. J. Mol. Sci. 2011, 12, 4348–4364. [Google Scholar] [CrossRef]
  56. Truong, C.; Oudre, L.; Vayatis, N. Selective review of offline change-point detection methods. Signal Process. 2020, 167, 107299. [Google Scholar] [CrossRef]
  57. Balke, N. Detecting level shifts in time series. J. Bus. Econom. Statist. 1993, 11, 81–92. [Google Scholar]
  58. Zeileis, A.; Kleiber, C.; Krämer, W.; Hornik, K. Testing and dating of structural changes in practice. Comput. Statist. Data Anal. 2003, 44, 109–123. [Google Scholar] [CrossRef]
  59. Garcia, R.; Perron, P. An analysis of the real interest rate under regime shifts. Rev. Econom. Statist. 1996, 78, 111–125. [Google Scholar] [CrossRef]
  60. Zeileis, A.; Leisch, F.; Hornik, K.; Kleiber, C. strucchange: An R Package for Testing for Structural Change in Linear Regression Models. J. Stat. Softw. 2002, 7, 1–38. [Google Scholar] [CrossRef]
Figure 1. The left side is the time series of the annual flow of the River Nile at Aswan from 1871 to 1970; the right side is the sample ACF for the River Nile.
Figure 2. The left side is the time series of returns of AMD.com stock from March 2008 to December 2008; the right side is the sample ACF for these returns.
Figure 3. The left side is the quarterly US ex-post real interest rate from 1961:Q1 to 1986:Q3; the right side is the sample ACF for these interest rates.
Table 1. Empirical sizes and powers of A_{T,0}, A_{T,1} and A_{T,2} based on N_T(0, Σ_T), with bandwidth h_T = T^{1/5} and significance level α = 0.05; p_0, p_1 and p_2 denote the empirical rejection rates of A_{T,0}, A_{T,1} and A_{T,2}, respectively.

Case 1: μ_1 = μ_2 = 1, σ_1^2 = σ_2^2 = 1, k_1 = T, k_2 = T

   ξ      T     p_0      p_1      p_2
  −0.3   300   0.0590   0.0230   0.0190
  −0.3   600   0.0430   0.0170   0.0120
  −0.3   900   0.0520   0.0220   0.0130
   0     300   0.0500   0.0230   0.0120
   0     600   0.0490   0.0170   0.0130
   0     900   0.0470   0.0140   0.0150
   0.3   300   0.0490   0.0120   0.0180
   0.3   600   0.0440   0.0190   0.0130
   0.3   900   0.0450   0.0170   0.0120

Case 2: μ_1 = 1, μ_2 = 1.5, σ_1^2 = σ_2^2 = 1, k_1 = T/4, k_2 = T

   ξ      T     p_0      p_1      p_2
  −0.3   300   0.9320   0.9180   0.0130
  −0.3   600   1.0000   1.0000   0.0230
  −0.3   900   1.0000   1.0000   0.0270
   0     300   0.6630   0.6090   0.0170
   0     600   0.9750   0.9730   0.0230
   0     900   1.0000   1.0000   0.0190
   0.3   300   0.3820   0.2970   0.0200
   0.3   600   0.8010   0.7680   0.0200
   0.3   900   0.9550   0.9490   0.0150

Case 3: μ_1 = μ_2 = 1, σ_1^2 = 1, σ_2^2 = 2, k_1 = T, k_2 = T/2

   ξ      T     p_0      p_1      p_2
  −0.3   300   0.7920   0.0320   0.7460
  −0.3   600   0.9910   0.0300   0.9880
  −0.3   900   1.0000   0.0250   1.0000
   0     300   0.8890   0.0230   0.8640
   0     600   0.9990   0.0200   0.9980
   0     900   1.0000   0.0170   1.0000
   0.3   300   0.8170   0.0260   0.7630
   0.3   600   0.9910   0.0230   0.9910
   0.3   900   1.0000   0.0220   1.0000

Case 4: μ_1 = 1, μ_2 = 1.5, σ_1^2 = 1, σ_2^2 = 2, k_1 = T/4, k_2 = T/2

   ξ      T     p_0      p_1      p_2
  −0.3   300   0.9740   0.7740   0.8000
  −0.3   600   1.0000   0.9960   0.9960
  −0.3   900   1.0000   1.0000   1.0000
   0     300   0.9360   0.3930   0.8380
   0     600   1.0000   0.8660   1.0000
   0     900   1.0000   0.9820   1.0000
   0.3   300   0.9790   0.7580   0.7910
   0.3   600   1.0000   0.9950   0.9950
   0.3   900   1.0000   1.0000   1.0000
Table 2. Empirical sizes and powers of A_{T,0}, A_{T,1} and A_{T,2} based on t(0, Σ_T, 5), with bandwidth h_T = T^{1/5} and significance level α = 0.05; p_0, p_1 and p_2 denote the empirical rejection rates of A_{T,0}, A_{T,1} and A_{T,2}, respectively.

Case 1: μ_1 = μ_2 = 1, σ_1^2 = σ_2^2 = 1, k_1 = T, k_2 = T

   ξ      T     p_0      p_1      p_2
  −0.3   300   0.0530   0.0240   0.0140
  −0.3   600   0.0470   0.0200   0.0110
  −0.3   900   0.0540   0.0220   0.0190
   0     300   0.0490   0.0160   0.0140
   0     600   0.0570   0.0190   0.0160
   0     900   0.0420   0.0120   0.0190
   0.3   300   0.0380   0.0150   0.0100
   0.3   600   0.0400   0.0140   0.0130
   0.3   900   0.0500   0.0140   0.0200

Case 2: μ_1 = 1, μ_2 = 1.5, σ_1^2 = σ_2^2 = 1, k_1 = T/4, k_2 = T

   ξ      T     p_0      p_1      p_2
  −0.3   300   0.8130   0.7870   0.0150
  −0.3   600   0.9550   0.9470   0.0150
  −0.3   900   0.9800   0.9770   0.0250
   0     300   0.5890   0.5370   0.0130
   0     600   0.8430   0.8200   0.0220
   0     900   0.9310   0.9210   0.0210
   0.3   300   0.3590   0.2930   0.0200
   0.3   600   0.6530   0.6200   0.0180
   0.3   900   0.7920   0.7750   0.0180

Case 3: μ_1 = μ_2 = 1, σ_1^2 = 1, σ_2^2 = 2, k_1 = T, k_2 = T/2

   ξ      T     p_0      p_1      p_2
  −0.3   300   0.8160   0.0280   0.7660
  −0.3   600   0.9960   0.0420   0.9950
  −0.3   900   1.0000   0.0350   1.0000
   0     300   0.8890   0.0270   0.8680
   0     600   0.9980   0.0290   0.9980
   0     900   1.0000   0.0260   1.0000
   0.3   300   0.8370   0.0170   0.8050
   0.3   600   0.9970   0.0160   0.9950
   0.3   900   1.0000   0.0280   1.0000

Case 4: μ_1 = 1, μ_2 = 1.5, σ_1^2 = 1, σ_2^2 = 2, k_1 = T/4, k_2 = T/2

   ξ      T     p_0      p_1      p_2
  −0.3   300   0.9370   0.6460   0.7750
  −0.3   600   1.0000   0.8900   0.9900
  −0.3   900   1.0000   0.9480   1.0000
   0     300   0.9290   0.3740   0.8390
   0     600   1.0000   0.6860   0.9980
   0     900   1.0000   0.8600   1.0000
   0.3   300   0.8170   0.1630   0.6990
   0.3   600   0.9970   0.4700   0.9900
   0.3   900   1.0000   0.6290   1.0000
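To make the simulation design in Tables 1 and 2 concrete, the following R sketch generates one replication of Case 4, with a mean shift at k_1 = T/4 and a variance change at k_2 = T/2. The AR(1) construction of the dependent errors with coefficient ξ is our illustrative reading of Σ_T, not necessarily the paper's exact specification, and the names Tn and x are ours.

```r
# One hypothetical replication of Case 4 with T = 300 and xi = -0.3.
# Assumption (ours): the strong-mixing errors follow an AR(1) scheme with
# coefficient xi, standardized to mean zero and variance one.
set.seed(1)
Tn <- 300; xi <- -0.3
e <- as.numeric(arima.sim(model = list(ar = xi), n = Tn))
e <- (e - mean(e)) / sd(e)                              # mean 0, variance 1
mu    <- c(rep(1, Tn / 4), rep(1.5, 3 * Tn / 4))        # mean shift at k1 = T/4
sigma <- c(rep(1, Tn / 2), rep(sqrt(2), Tn / 2))        # variance change at k2 = T/2
x <- mu + sigma * e
plot.ts(x, main = "Mean shift at T/4, variance change at T/2")
```

For the t(0, Σ_T, 5) setting of Table 2, the Gaussian innovations can be replaced by scaled t(5) innovations, for instance via the rand.gen argument of arima.sim().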
Table 3. Precision, Recall and F1-score of our algorithm and of the cpt.meanvar and mosum algorithms, based on N_T(0, Σ_T) with bandwidth h_T = T^{1/5}. Each cell reports Precision / Recall / F1-score.

Case 2: μ_1 = 1, μ_2 = 1.5, σ_1^2 = σ_2^2 = 1, k_1 = T/4, k_2 = T

   ξ      T    Our algorithm              cpt.meanvar                mosum
  −0.3   300   0.6563 / 0.6518 / 0.6533   0.1738 / 0.1738 / 0.1738   0.3117 / 0.3117 / 0.3117
  −0.3   600   0.8342 / 0.8242 / 0.8275   0.6823 / 0.6823 / 0.6823   0.6304 / 0.6274 / 0.6284
  −0.3   900   0.9141 / 0.9011 / 0.9054   0.9600 / 0.9595 / 0.9597   0.8272 / 0.8242 / 0.8252
   0     300   0.3656 / 0.3626 / 0.3636   0.2088 / 0.2083 / 0.2085   0.3467 / 0.3382 / 0.3410
   0     600   0.7223 / 0.7133 / 0.7163   0.6294 / 0.6284 / 0.6287   0.6364 / 0.6170 / 0.6234
   0     900   0.8102 / 0.8027 / 0.8052   0.8691 / 0.8681 / 0.8685   0.7662 / 0.7509 / 0.7559
   0.3   300   0.1469 / 0.1454 / 0.1459   0.2238 / 0.2188 / 0.2204   0.3526 / 0.3199 / 0.3306
   0.3   600   0.5015 / 0.4950 / 0.4972   0.5325 / 0.5253 / 0.5276   0.5984 / 0.5098 / 0.5380
   0.3   900   0.6783 / 0.6738 / 0.6753   0.7522 / 0.7451 / 0.7474   0.7253 / 0.6248 / 0.6562

Case 3: μ_1 = μ_2 = 1, σ_1^2 = 1, σ_2^2 = 2, k_1 = T, k_2 = T/2

  −0.3   300   0.5085 / 0.4980 / 0.5015   0.2987 / 0.2977 / 0.2980   0.0000 / 0.0000 / 0.0000
  −0.3   600   0.8122 / 0.8012 / 0.8049   0.7453 / 0.7443 / 0.7446   0.0000 / 0.0000 / 0.0000
  −0.3   900   0.9141 / 0.8966 / 0.9024   0.8981 / 0.8981 / 0.8981   0.0000 / 0.0000 / 0.0000
   0     300   0.6374 / 0.6284 / 0.6314   0.3177 / 0.3152 / 0.3160   0.0000 / 0.0000 / 0.0000
   0     600   0.8531 / 0.8442 / 0.8472   0.7642 / 0.7637 / 0.7639   0.0030 / 0.0030 / 0.0030
   0     900   0.9231 / 0.9156 / 0.9181   0.9141 / 0.9126 / 0.9131   0.0000 / 0.0000 / 0.0000
   0.3   300   0.5185 / 0.5125 / 0.5145   0.3327 / 0.3232 / 0.3263   0.0140 / 0.0123 / 0.0128
   0.3   600   0.8042 / 0.7957 / 0.7985   0.6993 / 0.6893 / 0.6926   0.0290 / 0.0230 / 0.0248
   0.3   900   0.8941 / 0.8836 / 0.8871   0.8901 / 0.8820 / 0.8846   0.0220 / 0.0154 / 0.0174

Case 4: μ_1 = 1, μ_2 = 1.5, σ_1^2 = 1, σ_2^2 = 2, k_1 = T/4, k_2 = T/2

  −0.3   300   0.5285 / 0.6369 / 0.5646   0.1828 / 0.3546 / 0.2401   0.0794 / 0.1588 / 0.1059
  −0.3   600   0.8217 / 0.8247 / 0.8227   0.4421 / 0.7794 / 0.5544   0.3152 / 0.6274 / 0.4192
  −0.3   900   0.8971 / 0.8971 / 0.8971   0.6528 / 0.8838 / 0.7297   0.4136 / 0.8242 / 0.5504
   0     300   0.3981 / 0.5879 / 0.4614   0.2043 / 0.3953 / 0.2679   0.1169 / 0.2293 / 0.1543
   0     600   0.7343 / 0.7927 / 0.7537   0.4411 / 0.7686 / 0.5500   0.3192 / 0.6180 / 0.4187
   0     900   0.8506 / 0.8591 / 0.8535   0.6389 / 0.8590 / 0.7121   0.3821 / 0.7504 / 0.5048
   0.3   300   0.5280 / 0.6449 / 0.5669   0.2223 / 0.3996 / 0.2813   0.1484 / 0.2650 / 0.1870
   0.3   600   0.8327 / 0.8367 / 0.8340   0.4291 / 0.6900 / 0.5157   0.3157 / 0.5330 / 0.3865
   0.3   900   0.9041 / 0.9041 / 0.9041   0.5934 / 0.7702 / 0.6517   0.0270 / 0.0380 / 0.0304
Table 4. Precision, Recall and F1-score of our algorithm and of the cpt.meanvar and mosum algorithms, based on t(0, Σ_T, 5) with bandwidth h_T = T^{1/5}. Each cell reports Precision / Recall / F1-score.

Case 2: μ_1 = 1, μ_2 = 1.5, σ_1^2 = σ_2^2 = 1, k_1 = T/4, k_2 = T

   ξ      T    Our algorithm              cpt.meanvar                mosum
  −0.3   300   0.5514 / 0.5470 / 0.5485   0.0649 / 0.0593 / 0.0611   0.1089 / 0.1079 / 0.1082
  −0.3   600   0.7622 / 0.7552 / 0.7576   0.2777 / 0.2626 / 0.2674   0.2577 / 0.2547 / 0.2557
  −0.3   900   0.8392 / 0.8292 / 0.8325   0.5544 / 0.5159 / 0.5275   0.3876 / 0.3856 / 0.3863
   0     300   0.3417 / 0.3402 / 0.3407   0.0839 / 0.0755 / 0.0782   0.1479 / 0.1449 / 0.1459
   0     600   0.5914 / 0.5844 / 0.5867   0.2977 / 0.2770 / 0.2834   0.3087 / 0.3022 / 0.3044
   0     900   0.7163 / 0.7098 / 0.7120   0.5355 / 0.5034 / 0.5128   0.4815 / 0.4720 / 0.4752
   0.3   300   0.1528 / 0.1503 / 0.1512   0.1269 / 0.1149 / 0.1185   0.1798 / 0.1635 / 0.1688
   0.3   600   0.4036 / 0.3981 / 0.3999   0.3137 / 0.2841 / 0.2929   0.3926 / 0.3464 / 0.3614
   0.3   900   0.5455 / 0.5380 / 0.5405   0.4865 / 0.4487 / 0.4596   0.5115 / 0.4668 / 0.4809

Case 3: μ_1 = μ_2 = 1, σ_1^2 = 1, σ_2^2 = 2, k_1 = T, k_2 = T/2

  −0.3   300   0.5185 / 0.5090 / 0.5122   0.2657 / 0.2488 / 0.2539   0.0000 / 0.0000 / 0.0000
  −0.3   600   0.8122 / 0.7937 / 0.7999   0.5345 / 0.5011 / 0.5111   0.0000 / 0.0000 / 0.0000
  −0.3   900   0.9041 / 0.8876 / 0.8931   0.6683 / 0.6324 / 0.6427   0.0000 / 0.0000 / 0.0000
   0     300   0.6104 / 0.6004 / 0.6037   0.2827 / 0.2645 / 0.2700   0.0010 / 0.0010 / 0.0010
   0     600   0.8501 / 0.8377 / 0.8418   0.5485 / 0.5225 / 0.5305   0.0020 / 0.0020 / 0.0020
   0     900   0.9281 / 0.9166 / 0.9204   0.7103 / 0.6640 / 0.6768   0.0000 / 0.0000 / 0.0000
   0.3   300   0.5375 / 0.5315 / 0.5335   0.2837 / 0.2566 / 0.2649   0.0080 / 0.0070 / 0.0073
   0.3   600   0.8482 / 0.8407 / 0.8432   0.5105 / 0.4664 / 0.4797   0.0230 / 0.0208 / 0.0215
   0.3   900   0.9041 / 0.8921 / 0.8961   0.6923 / 0.6437 / 0.6580   0.0180 / 0.0135 / 0.0148

Case 4: μ_1 = 1, μ_2 = 1.5, σ_1^2 = 1, σ_2^2 = 2, k_1 = T/4, k_2 = T/2

  −0.3   300   0.4770 / 0.6139 / 0.5226   0.1538 / 0.2837 / 0.1969   0.0260 / 0.0519 / 0.0346
  −0.3   600   0.7602 / 0.8072 / 0.7759   0.3192 / 0.5548 / 0.3961   0.1284 / 0.2532 / 0.1700
  −0.3   900   0.8546 / 0.8776 / 0.8623   0.4191 / 0.6603 / 0.4960   0.1938 / 0.3856 / 0.2577
   0     300   0.3921 / 0.5799 / 0.4547   0.1728 / 0.3042 / 0.2157   0.0400 / 0.0779 / 0.0526
   0     600   0.6753 / 0.8072 / 0.7193   0.3392 / 0.5856 / 0.4197   0.1543 / 0.3012 / 0.2033
   0     900   0.7912 / 0.8581 / 0.8135   0.4575 / 0.6846 / 0.5277   0.2408 / 0.4720 / 0.3178
   0.3   300   0.2842 / 0.4865 / 0.3516   0.1888 / 0.3281 / 0.2343   0.0699 / 0.1255 / 0.0884
   0.3   600   0.5375 / 0.7522 / 0.6091   0.3362 / 0.5177 / 0.3932   0.2078 / 0.3604 / 0.2582
   0.3   900   0.6573 / 0.8212 / 0.7120   0.4635 / 0.6581 / 0.5228   0.2647 / 0.4744 / 0.3335
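The two competing methods scored in Tables 3 and 4 are available on CRAN: cpt.meanvar() from the changepoint package [12] and mosum() from the mosum package [13]. The sketch below runs them on the simulated series x from the sketch following Table 2 and scores detections against the true change-points; the tolerance-based matching rule (tol) and the helper score() are our illustration and need not coincide with the rule used to produce these tables.

```r
# Run the baseline detectors on x (Tn = 300) and score the detections.
library(changepoint)
library(mosum)

true_cpts <- c(Tn / 4, Tn / 2)
est_cpt <- cpts(cpt.meanvar(x, method = "PELT"))  # mean-variance changes, PELT
est_mos <- mosum(x, G = 40)$cpts                  # moving-sum detector, bandwidth G

score <- function(est, truth, tol = 10) {
  # a true change-point counts as detected if some estimate lies within tol
  hits      <- sum(sapply(truth, function(k) any(abs(est - k) <= tol)))
  precision <- if (length(est) > 0) hits / length(est) else 0
  recall    <- hits / length(truth)
  f1        <- if (precision + recall > 0)
                 2 * precision * recall / (precision + recall) else 0
  c(precision = precision, recall = recall, F1 = f1)
}
score(est_cpt, true_cpts)
score(est_mos, true_cpts)
```

Note that mosum targets changes in the mean, which is consistent with its near-zero scores in the variance-only Case 3 of Tables 3 and 4.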