Monitoring Test for Stability of Dependence Structure in Multivariate Data Based on Copula

In this paper, we consider a sequential monitoring procedure for detecting changes in the copula function. We propose a cusum-type monitoring test based on the empirical copula function and apply it to the detection of distributional changes in the copula function. We investigate the asymptotic properties of the stopping time and show that, under regularity conditions, its limiting null distribution is that of the supremum of a Kiefer process. Moreover, we use a bootstrap method to approximate the limiting distribution. A simulation study and a real data analysis are conducted to evaluate our test.


Introduction
A copula is a multivariate joint distribution function for which the marginal distribution of each variable is uniform. In classical analysis, Pearson's correlation is the most frequently used measure of dependence. However, it only works well with elliptical distributions, while empirical data in finance and insurance mostly have skewed distributions, heavy tails and extreme values. Thus, correlation may not be suitable for modelling nonlinear association. The copula function can overcome this drawback because a copula can model various types of dependence structure beyond linear dependence, independently of the marginal distributions. For this reason, copulas have become a flexible methodology in financial risk assessment and actuarial analysis (see Cherubini et al. [1,2], McNeil et al. [3], Hougaard [4] and the papers cited therein). Recently, dependence modelling with copula functions has been widely applied in various areas such as civil engineering, reliability engineering, climatology, hydrology and biology.
Since all of the information about the dependence is contained in the copula function, estimating the copula function is a crucial task for providing a correct dependence structure. Conventionally, the copula function is assumed to remain constant over time. However, there is empirical evidence suggesting that the dependence structure is likely to change due to financial adjustments and critical social events (see, for example, Longin and Solnik [5], Patton [6] and Rodriguez [7]). To cope with this, Dias and Embrechts [8] and Guegan and Zhang [9] suggested a likelihood ratio test for copula parameter changes in specific copula families, Harvey [10] and Busetti and Harvey [11] developed a nonparametric stationarity test for a constant copula based on time-varying quantiles, and Na et al. [12] studied a cusum test for detecting a copula parameter change.
Recently, the problem of testing constancy of the copula has been studied by Quessy et al. [13], Bücher and Ruppert [14], and Bücher et al. [15]. All of these approaches are devoted to change point detection within data sets of fixed size. However, many researchers are also interested in

Monitoring Procedure
Let $\{X_t = (X_{1t}, \dots, X_{dt});\ t = 1, 2, \dots\}$ be a sequence of $d$-dimensional independent random vectors with joint distribution function $F$ and continuous marginal distribution functions $F_1, \dots, F_d$. If $C$ is the true copula function of $\{X_t\}$, then, owing to Sklar's theorem, we can express $F(x_1, \dots, x_d) = C(F_1(x_1), \dots, F_d(x_d))$. When the marginal distribution functions $F_1, \dots, F_d$ are continuous, it is well known that the function $C$ is uniquely determined by $C(u_1, \dots, u_d) = F(F_1^{-1}(u_1), \dots, F_d^{-1}(u_d))$, (1) where $F_i^{-1}(u) = \inf\{x : F_i(x) \ge u\}$ is the generalized inverse of $F_i$. According to Sklar's theorem, one can always model any multivariate distribution by modelling the marginal distributions and the copula function separately. For details, we refer to Joe [27] and Nelsen [28].
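The decomposition in Sklar's theorem can be illustrated with a small numerical sketch. The Clayton copula, its parameter value and the normal and exponential marginals below are illustrative assumptions, not choices made in this paper; the point is only that a joint distribution function built as $C(F_1(x_1), F_2(x_2))$ returns each marginal when the other argument is sent to infinity.

```python
import math


def clayton_copula(u, v, theta=2.0):
    """Clayton copula C(u, v); theta = 2.0 is an illustrative value."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def normal_cdf(x):
    """Standard normal distribution function, used here as F_1."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exponential_cdf(x, rate=1.0):
    """Exponential distribution function, used here as F_2 (rate assumed)."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0


def joint_cdf(x1, x2):
    """Sklar's theorem: F(x1, x2) = C(F_1(x1), F_2(x2))."""
    return clayton_copula(normal_cdf(x1), exponential_cdf(x2))


# Letting one argument grow recovers the other marginal, because C(u, 1) = u.
print(joint_cdf(0.3, 1e6), normal_cdf(0.3))
```

Swapping in a different copula while keeping the same marginals changes only the dependence structure, which is the separation that the monitoring procedure relies on.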
Suppose that we have observed $X_1, \dots, X_n$, which are called the historical data. For each new observation, we wish to test the following hypotheses: $H_0 : C_t = C$ for all $t > n$ versus $H_1 : C_t \neq C$ for some $t > n$, where $C_t$ is the copula function at time $t$ and $C$ is a time-invariant copula. The copula function is assumed to be stable over the historical period of length $n$, i.e., $C_1 = \dots = C_n = C$. The historical data are used as a reference for comparison with future observations. The aim of our monitoring procedure is to check for a change in the copula function each time a new observation arrives in the post-historical period.
In this study, we assume that the marginal distributions do not change under either the null or the alternative hypothesis, while the copula function changes under the alternative hypothesis. This assumption is in line with previous studies on testing for structural breaks in copulas (see, e.g., Harvey [10], Busetti and Harvey [11], Bouzebda [26], Bücher and Ruppert [14] and Na et al. [12,21]). Furthermore, this assumption is crucial in practice, since the monitoring test may signal a change in the copula function when the change has actually occurred in the marginal distributions. Empirical evidence of this situation can be found in Na et al. [29]. Therefore, a change point test for the marginal distributions must be conducted before implementing the monitoring test for a copula function change.
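The monitoring procedure in a general set-up can be described with a stopping time $\tau(n)$, defined in (2) as the first time $k > n$ at which a test statistic $D_{k,n}$, based on the observations up to time $k$, crosses a boundary function $b(\cdot)$. The procedure is calibrated so that, for a given $\alpha \in (0, 1)$, the asymptotic probability of stopping under $H_0$ equals $\alpha$, as required in (3). We consider boundary functions $b(\cdot)$ satisfying condition (4) (cf. Chu et al. [16] and Berkes et al. [18]). Here, we focus on $b(\cdot)$ of the specific form $b(s) = c s^a$ for some $c > 0$, $a > 0$, (5) proposed by Lee et al. [22]. In practice, the constants $c$ and $a$ must be chosen to satisfy (3) for a given $\alpha$.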
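A stopping rule of this kind can be sketched as follows. The sketch assumes that the boundary in (5) is evaluated at $s = k/n$ and that the rule stops at a strict exceedance; both are assumptions made for illustration, and the function names and the detector values are hypothetical.

```python
def boundary(s, c, a):
    """Boundary function b(s) = c * s**a with c > 0 and a > 0, as in (5)."""
    return c * s ** a


def stopping_time(detector_values, n, c, a=2.0):
    """Return the first k > n whose detector value exceeds b(k / n).

    detector_values[j] is assumed to hold D_{k,n} for k = n + 1 + j.
    Returns None if the boundary is never crossed within the horizon.
    """
    for j, d in enumerate(detector_values):
        k = n + 1 + j
        if d > boundary(k / n, c, a):
            return k
    return None


# Hypothetical detector path after n = 100 historical observations.
print(stopping_time([0.5, 0.9, 1.2, 5.0], n=100, c=1.0))  # prints 103
```

Returning None rather than infinity when the boundary is never crossed reflects the finite monitoring horizon used in practice.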
For the specific form of the test statistic $D_{k,n}$, Chu et al. [16] considered two types: one based on the fluctuations of sequential parameter estimates, and the other based on the cumulative sum of recursive residuals. Berkes et al. [18] proposed a test statistic based on the quasi-maximum likelihood estimator of the parameters of a GARCH process. Lee et al. [22] used a sequential empirical process of residuals for the detection of distributional changes in AR models.
In this paper, we use the copula function in the test statistic to detect a change in the dependence structure of multivariate random vectors. The main idea is that a change in the dependence structure shows up as a change in the copula function. In practice, the copula function $C$ is usually unknown. Thus, we employ the empirical copula estimator in place of the true copula function. The empirical copula function is obtained by replacing the unknown terms in (1) with the joint empirical distribution function and the marginal empirical distribution functions, which are defined, respectively, by $F_n(x_1, \dots, x_d) = \frac{1}{n}\sum_{t=1}^{n} I(X_{1t} \le x_1, \dots, X_{dt} \le x_d)$ and $F_{in}(x_i) = \frac{1}{n}\sum_{t=1}^{n} I(X_{it} \le x_i)$, $i = 1, \dots, d$. Then, the empirical copula estimator is $C_{n,n}(u_1, \dots, u_d) = F_n(F_{1n}^{-1}(u_1), \dots, F_{dn}^{-1}(u_d))$, where $F_{in}^{-1}(u) = \inf\{x : F_{in}(x) \ge u\}$. Writing $U_{it} = F_i(X_{it})$, the vectors $(U_{1t}, \dots, U_{dt})$ have joint distribution function $C$, and, using the representation (6), $C_{n,n}$ can be expressed through these transformed vectors alone. This implies that the law of $C_{n,n}$ is the same for all $F$ whose associated copula is $C$. We propose test statistics based on the empirical copula estimator in the next section.
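The empirical copula estimator described above can be computed directly from its definition. The following is a minimal sketch in plain Python; the function names are hypothetical, and allowing the sample used for the joint empirical distribution to differ from the sample used for the marginal quantiles is an assumption made so that the same function covers both the historical estimate and an estimate based on a longer sample.

```python
import math


def generalized_inverse(sorted_sample, u):
    """F_n^{-1}(u) = inf{x : F_n(x) >= u} for a sorted sample."""
    if u <= 0.0:
        return -math.inf
    n = len(sorted_sample)
    index = min(n, math.ceil(n * u)) - 1
    return sorted_sample[index]


def empirical_copula(joint_sample, marginal_sample, u):
    """Joint e.d.f. of joint_sample evaluated at the marginal quantiles
    of marginal_sample; passing the same data twice gives C_{n,n}(u)."""
    d = len(u)
    thresholds = [
        generalized_inverse(sorted(row[i] for row in marginal_sample), u[i])
        for i in range(d)
    ]
    hits = sum(
        all(row[i] <= thresholds[i] for i in range(d)) for row in joint_sample
    )
    return hits / len(joint_sample)


# Perfectly increasing toy data: the estimate at (0.5, 0.5) is min(0.5, 0.5).
toy = [(1, 1), (2, 2), (3, 3), (4, 4)]
print(empirical_copula(toy, toy, (0.5, 0.5)))  # prints 0.5
```

Because only the ordering of the observations within each margin enters the computation, any strictly increasing transformation of a margin leaves the estimate unchanged.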
Remark 1. Note that the alternative hypothesis $H_1$ actually means $H_1 : C_t \neq C$ for some $n < t < T(n)$, where $T(n)$ is a predetermined maximal number of observations. Since it is impossible to monitor over an unlimited time horizon, and a test for very large $t$ would be meaningless, a maximal number of observations $T(n)$ is imposed. Here, we consider both $\lim_{n\to\infty} T(n)/n = q < \infty$ and $\lim_{n\to\infty} T(n)/n = \infty$, and both cases are examined in our simulation study.

Main Result
For monitoring the copula function, we employ the test statistic $D_{k,n}$ in (7), which is based on the empirical copula functions $C_{k,n}$ and $C_{n,n}$. Suppose that $X_1, \dots, X_n$ are observed and represent the available historical data. By observing new data sequentially, we wish to detect whether a change occurs in the copula function $C$. The procedure compares the estimates of the copula function based on a growing number of observations with the estimate based on the historical observations. Note that $C_{k,n}$ is the estimator of $C$ based on the observations up to time $k$, while $C_{n,n}$ is the estimator based on the historical data. As mentioned earlier, since we assume that the marginal distribution functions are stable, the marginal empirical distribution functions $F_{in}$ are used in $C_{k,n}$ instead of $F_{ik}$.
To describe the asymptotic behavior of the test statistic, we need to introduce some Gaussian processes. The limiting process is called the Kiefer process associated with the copula function $C$. For details on the Kiefer process, we refer to Adler [30] and Piterbarg [31].
Here, we impose the following conditions for the main theorem: (A1) $C$ is twice continuously differentiable on $(0, 1)^d$; (A2) The second-order partial derivatives of $C$ exist and are continuous on
Under the above assumptions, we have the following:
Theorem 1. Suppose that $H_0$ is true and conditions (A1) and (A2) hold. In addition, suppose that the boundary function $b(\cdot)$ satisfies (4), (5) and condition (A3). Then, the stopping time $\tau(n)$ with the test statistic $D_{k,n}$ in (7) satisfies where
On the other hand, due to (8) and Lemmas 1 and 2 addressed below, we have By condition (A3), the last term converges to 0 a.s. as $n \to \infty$. Therefore, we can express This validates the theorem.
Lemma 1. If the assumptions in Theorem 1 hold, then we have Proof. It follows from Theorem B of Bouzebda [26] (see also Csörgő and Horváth [32]).
Lemma 2. If the assumptions in Theorem 1 hold, then we have Proof. We follow the lines of the proof of Proposition 4.2 in Segers [33] and the proof of Theorem 4.1 in Tsukahara [34]. For $k \ge n$ and $a_n \ge 0$, we put By the Smirnov-Chung law of the iterated logarithm for the empirical distribution functions, we obtain Moreover, we take for $K_2$ as in Proposition A.1 of Segers [33]. Using Proposition A.1 of Segers [33], there exists a constant $K_1$ such that By the Borel-Cantelli lemma, we obtain
In practice, given $0 < \alpha < 1$, we reject $H_0$ if the statistic $T_{k,n}$ in (9) exceeds $C_\alpha$, where $C_\alpha$ is the number defined in (10). However, the limiting distribution is complicated to compute and depends on the unknown copula $C$, so it is not directly applicable to the monitoring procedure. To overcome this computational difficulty, we recommend using a bootstrap method. Several previous studies have used the bootstrap to approximate the limiting distribution. Bücher and Dette [25] compared the finite sample properties of the various bootstrap methods proposed in the literature and concluded that the procedure proposed by Rémillard and Scaillet [24] yields the best results in most cases. In this study, we consider the multiplier bootstrap approach proposed by Rémillard and Scaillet [24].
Let $\xi_1, \dots, \xi_n$ be an i.i.d. sequence of random variables with mean zero and variance one, independent of $X_1, \dots, X_n$. Rémillard and Scaillet [24] defined the bootstrap process $\alpha_n(u)$ in (11), where $u = (u_1, \dots, u_d)$ and $\bar{\xi} = n^{-1}\sum_{t=1}^{n} \xi_t$, and showed that $\alpha_n(u)$ approximates a Brownian bridge process (cf. Lemma A.1 of Rémillard and Scaillet [24]). Using the fact that $s^{-1/2} K_C(u, s) = B_C(u)$, we can obtain an approximation of $K_C$ and calculate an approximate value for $C_\alpha$ in (10). The detailed procedure is as follows:
(Step 1) Based on the data $X_1, \dots, X_n$, obtain the marginal empirical distribution functions $F_{in}$ and the empirical copula function $C_{n,n}$.
(Step 2) Generate $\xi_1, \dots, \xi_n$, an i.i.d. sequence of random variables with mean zero and variance one.
(Step 3) Using $\alpha_n(u)$ obtained through (11) based on these random variables, compute the bootstrap version of the test statistic.
The approximate value for $C_\alpha$ in (10) can be obtained from the bootstrap sample, and the approximate quantile does not require specifying a copula model. The above bootstrap method is easy to implement and gives satisfactory results, as seen in the next section.
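The multiplier idea can be sketched in a few lines. The code below assumes the multiplier process takes the form $\alpha_n(u) = n^{-1/2}\sum_{t}(\xi_t - \bar{\xi})\, I(\hat{U}_t \le u)$ with rank-based pseudo-observations $\hat{U}_{it} = F_{in}(X_{it})$, uses standard normal multipliers, and takes the supremum of $|\alpha_n(u)|$ over a finite grid as a stand-in functional. The statistic actually bootstrapped in this paper, given in (9) and (10), is not reproduced here, and all function names are hypothetical.

```python
import itertools
import math
import random


def pseudo_observations(data):
    """Rank-transform each margin: U_it = F_in(X_it) = rank / n (ties ignored)."""
    n, d = len(data), len(data[0])
    pseudo = [[0.0] * d for _ in range(n)]
    for i in range(d):
        order = sorted(range(n), key=lambda t: data[t][i])
        for rank, t in enumerate(order, start=1):
            pseudo[t][i] = rank / n
    return pseudo


def multiplier_process(pseudo, xi, u):
    """alpha_n(u) = n^{-1/2} * sum_t (xi_t - xi_bar) * 1{U_t <= u}."""
    n = len(pseudo)
    xi_bar = sum(xi) / n
    total = sum(
        xi[t] - xi_bar
        for t in range(n)
        if all(p <= q for p, q in zip(pseudo[t], u))
    )
    return total / math.sqrt(n)


def bootstrap_quantile(data, alpha=0.05, B=200, grid_size=5, seed=0):
    """(1 - alpha) quantile of sup_u |alpha_n(u)| over B multiplier draws."""
    rng = random.Random(seed)
    pseudo = pseudo_observations(data)
    n, d = len(data), len(data[0])
    points = [j / grid_size for j in range(1, grid_size + 1)]
    grid = list(itertools.product(points, repeat=d))
    sup_stats = []
    for _ in range(B):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]  # mean 0, variance 1
        sup_stats.append(max(abs(multiplier_process(pseudo, xi, u)) for u in grid))
    sup_stats.sort()
    index = min(B - 1, max(0, math.ceil((1.0 - alpha) * B) - 1))
    return sup_stats[index]


# Illustrative historical sample of 50 independent bivariate normal pairs.
rng = random.Random(1)
sample = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(50)]
print(bootstrap_quantile(sample, alpha=0.05, B=100))
```

Centering the multipliers at their sample mean forces the process to vanish at $u = (1, \dots, 1)$, mirroring the behavior of a Brownian bridge at the upper corner of the unit cube.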
Remark 2. In this study, we focus on the boundary function of the form in (5). In this case, there is no rule for choosing an optimal $a$. The test with small $a$ produces larger powers than that with large $a$. Unsatisfactory results are obtained if $a$ is either too small or too large. Thus, the choice of $a$ can be an important issue in practice. From the simulation study in Lee et al. [22], it is found that no test with a specific $a$ completely outperforms the others in terms of the stability of the test. Here, we recommend using $a = 2$, the value used in our simulation study. Furthermore, one can also employ other boundary functions satisfying (4) and condition (A3).

Simulation
In this section, we evaluate the performance of the monitoring test proposed in Section 3 through a simulation study. For this task, we use the boundary function in (5) and employ the stopping rule based on (7). We consider the bivariate Gaussian copula with copula parameter $\theta_0$ as a true copula model. To see the effect of copula functions with different functional forms allowing for asymmetry and tail dependence, we also consider the Gumbel copula, which is asymmetric and has upper tail dependence, as a true model. The copula parameters of the Gumbel model are set to match the value of Kendall's tau $\tau_0$ in the Gaussian copula models. For each case, sets of $n = 100$, 200 and 300 observations are generated from the copula model with marginal distribution $N(0, 1)$. The empirical sizes and powers are calculated as the proportion of rejections of the null hypothesis "$H_0$: no changes occur in the copula model at $t = n + 1, \dots$", out of 1000 repetitions. Here, the predetermined maximal number of observations $T(n)$ is taken as $T(n) = n \log n$ for the empirical size and $T(n) = 2n, 3n, 4n, 5n, n \log n$ for the empirical power. To examine the power, we take two elliptical copulas, the Gaussian and the Student t, together with the Frank copula, as alternatives, and consider the following alternative hypotheses.
$H_1(1)$: A change occurs from the Gaussian copula with $\tau_0 = 0.13$ to the Gaussian copula with $\tau_0 = 0.35$ and 0.60 at $np$.
$H_1(2)$: A change occurs from the Gaussian copula with $\tau_0 = 0.13$ to the Student t copula with $\tau_0 = 0.35$ and 0.60 at $np$.
$H_1(3)$: A change occurs from the Gaussian copula with $\tau_0 = 0.13$ to the Frank copula with $\tau_0 = 0.35$ and 0.60 at $np$.
For the Gumbel copula, we consider the Archimedean copula family for the alternative hypotheses, namely the Gumbel, Clayton and Frank copulas, as follows.
$H_1(1)$: A change occurs from the Gumbel copula with $\tau_0 = 0.13$ to the Gumbel copula with $\tau_0 = 0.60$ at $np$.
$H_1(2)$: A change occurs from the Gumbel copula with $\tau_0 = 0.13$ to the Clayton copula with $\tau_0 = 0.60$ at $np$.
$H_1(3)$: A change occurs from the Gumbel copula with $\tau_0 = 0.13$ to the Frank copula with $\tau_0 = 0.60$ at $np$.
In each case, the copula parameters are set at the same level in terms of Kendall's tau across the different copula families. To examine the power, several kinds of change in the dependence structure are considered, namely changes of the copula parameter and/or changes of the copula family. Under $H_1(1)$, in both the Gaussian and the Gumbel settings, the copula parameter changes within a copula family. Under $H_1(2)$ and $H_1(3)$, in both settings, the copula parameter and the copula family change at the same time.
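Matching copula parameters through Kendall's tau relies on standard closed-form relations: $\rho = \sin(\pi\tau/2)$ for the Gaussian and Student t copulas, $\theta = 1/(1-\tau)$ for the Gumbel copula and $\theta = 2\tau/(1-\tau)$ for the Clayton copula (the Frank copula has no closed form and is omitted). The sketch below applies the Gaussian relation to generate data with a change in dependence. Placing the change immediately after observation round$(np)$, the seed and the function names are assumptions made for illustration.

```python
import math
import random


def gaussian_rho_from_tau(tau):
    """Gaussian (and Student t) copula correlation: rho = sin(pi * tau / 2)."""
    return math.sin(math.pi * tau / 2.0)


def gumbel_theta_from_tau(tau):
    """Gumbel copula parameter: theta = 1 / (1 - tau)."""
    return 1.0 / (1.0 - tau)


def clayton_theta_from_tau(tau):
    """Clayton copula parameter: theta = 2 * tau / (1 - tau)."""
    return 2.0 * tau / (1.0 - tau)


def simulate_gaussian_change(n, p, horizon, tau_before, tau_after, seed=0):
    """Bivariate N(0, 1) margins with a Gaussian copula whose tau changes.

    Observations 1..round(n * p) use tau_before; later ones use tau_after.
    Placing the change exactly after index round(n * p) is an assumption.
    """
    rng = random.Random(seed)
    change_point = round(n * p)
    data = []
    for t in range(1, horizon + 1):
        tau = tau_before if t <= change_point else tau_after
        rho = gaussian_rho_from_tau(tau)
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # With N(0, 1) margins, a Gaussian copula is a bivariate normal law.
        data.append((z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2))
    return data, change_point


data, change_point = simulate_gaussian_change(
    n=100, p=1.1, horizon=200, tau_before=0.13, tau_after=0.60
)
print(change_point, round(gaussian_rho_from_tau(0.13), 4))
```

Fixing Kendall's tau rather than the raw parameter keeps the overall strength of dependence comparable across families, so that differences in power can be attributed to the shape of the copula.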
Throughout our simulation study, we only consider changes in the copula function and assume that the marginal distributions experience no changes. The empirical sizes are calculated at the nominal levels 0.01, 0.05 and 0.10, and the powers are examined at the nominal level 0.10. The bootstrap method is used to calculate the critical value at each nominal level. We perform the bootstrap method discussed in Section 3 with $B = 500$ for $n = 100$, 200 and 300, and the constant $a$ of the boundary function in (5) is chosen to be 2.
In particular, our test is compared with the monitoring test proposed by Na et al. [21]. Recall that the test of Na et al. [21] can be applied to detect a copula parameter change when the copula family does not change. Na et al. [21] proposed the detector $D^E_{k,n}$ in (2) based on the difference between the estimates $\hat{\theta}_k$ and $\hat{\theta}_n$ of the copula parameter, where $\hat{\theta}_k$ is the estimator of the true copula parameter $\theta_0$ based on the observations up to time $k$, while $\hat{\theta}_n$ is the estimator obtained from the historical data.
Empirical sizes and powers are presented in Tables 1-4. The figures in Tables 1-4 are for $D_{k,n}$, while the figures in parentheses are for $D^E_{k,n}$. Table 1 shows that the test procedure has some size distortions when $n$ is small. However, as $n$ increases, the empirical size gets very close to the nominal level in most cases. Size distortions for small sample sizes can be reduced if a smaller $a$ is chosen. For $D^E_{k,n}$, the test also has some size distortions, but it is generally able to keep its nominal level, especially when $n = 300$. The results are similar for other copula models such as the t-copula, Frank copula and Clayton copula, although they are not reported here for brevity.
Tables 2-4 report the empirical power under $H_1(1)$ to $H_1(3)$, in both the Gaussian and the Gumbel settings, with $p = 1.1$ and 1.5. Tables 2-4 show that our monitoring procedure produces good powers in most cases. The powers increase remarkably as $n$ increases or as the change becomes more pronounced. Moreover, the powers increase remarkably when the change in the copula function occurs earlier. When $n$ is large and $p = 1.1$, the powers are very close to 1. As pointed out by Lee et al. [22], our monitoring procedure with a boundary function such as (5) detects early changes more effectively than late changes. Due to the curvature of the component $(k/n)^a$ in the boundary function, the boundary increases rapidly as the change point moves further away from the point where monitoring was initiated. This implies that small changes are more likely to be captured early in the sample. Consequently, our test has better power properties for early change points. Similar findings were reported in Na et al. [21] and Lee et al. [22]. This result indicates that it is desirable to renew the historical data appropriately to increase the power when the null hypothesis appears to be true for a certain period of time.
Note that for the alternative hypothesis $H_1(1)$, which involves a change of the copula parameter within a copula family, the monitoring procedure based on $D^E_{k,n}$ appears to have higher powers than our monitoring test. This can be explained by the fact that the monitoring test of Na et al. [21] is designed only to detect parameter changes of the copula function. However, even under the alternative hypotheses $H_1(2)$ and $H_1(3)$, which involve a change of copula family, the test of Na et al. [21] also shows good performance in terms of power. This means that the test of Na et al. [21] tends to signal a copula parameter change even when the change actually occurs in the copula function itself. This motivated us to develop a monitoring procedure for detecting a copula function change. Comparing Table 4 against Table 3, the power performance appears to be similar. Different functional forms of the copula seem to have no impact on the performance; hence, our monitoring test also performs well in copula models with asymmetry or tail dependence. All these results indicate that our test procedure performs adequately in monitoring the stability of the copula function.

Real Data Analysis
In this section, we illustrate the procedure with a real data analysis. We consider bivariate climate data consisting of temperature and precipitation over the contiguous United States. A large literature studies the association between temperature and precipitation over the United States and reports empirical evidence of a clear relationship between the two variables (see Zhao and Khalil [35] and Huang and van Den Dool [36] and the papers cited therein). Recently, several authors have used copula-based methodology to model the joint distribution of temperature and precipitation (see, e.g., Favre et al. [37], Shiau et al. [38], Dupuis [39] and Schölzel and Friederichs [40]). However, these studies only focus on the question of which copula model best fits the empirical data. Here, we use copula functions to model the dependence between temperature and precipitation and attempt to monitor the stability of that dependence.
Annual mean temperature and annual mean precipitation in summer months (June, July, and August) over the contiguous United States from 1895 to 2015 are used as empirical data. The data can be obtained from NOAA's National Centers for Environmental Information (NCEI). Figure 1 shows that precipitation and temperature tend to be negatively correlated. It is well known that warmer summers usually result in drier conditions and colder summers are likely to be wetter. The data from 1895 to 1975, comprising 81 observations, are used as historical data.
As discussed earlier, since the monitoring test for the copula function can be influenced by a change in the marginal distributions, change point tests for the marginal distributions are performed before implementing the monitoring test for a copula function change. To this end, we perform the test of Lee et al. [22], who sequentially monitored marginal distributional changes based on a test statistic built from $F_{ik}$ and $F_{in}$, where $F_{ik}$ is the empirical distribution function based on the observations up to time $k$, while $F_{in}$ is the empirical distribution function based on the historical data. By observing new data sequentially, we first conduct the monitoring test for marginal distributional changes. If there are no changes in the marginal distributions, we can perform the monitoring test for the copula function change. Since neither of the two series shows evidence of a change in its marginal distribution at the nominal level 0.05, we apply the monitoring procedure based on the test statistic in (7) to detect a change of dependence. For this task, we use the boundary function in (5) with $a = 2$ and perform the bootstrap method in Section 3 with $B = 500$. As a result, the test detects a change in dependence at the nominal levels 0.01, 0.05, and 0.10. The location of the stopping time is summarized in Table 5, and Figure 1 illustrates the stopping time in dependence: the solid line corresponds to the end of the historical data and the dotted lines identify the detected stopping time.

Conclusions
In this study, we designed a monitoring test for a change of the copula function on the basis of empirical copula functions. The test is shown to have the supremum of the Kiefer process as its limiting distribution under certain regularity conditions. The simulation results reported in Section 4 confirm that our test performs adequately. Our method for monitoring a change of the copula function has several advantages. The procedure is copula model free, and we use a bootstrap method to overcome the difficulty that the limiting distribution depends on the unknown copula function. For this reason, it is directly applicable in practice even when the true copula function is unknown. Furthermore, finite sample properties are expected to be well behaved since we use a bootstrap method.
Our monitoring test has been established under the assumption that the random vectors are independent and identically distributed. However, this assumption is often violated in practice, and one might consider a broader class of stochastic processes such as autoregressive moving average (ARMA), ARCH and GARCH processes. Recently, Doukhan et al. [41] and Bücher and Volgushev [42] considered the weak convergence of the empirical copula process under serial dependence. These studies form a basis for a new monitoring test for the copula function under weak dependence. We leave the task of extending our test to future study.

(Step 4) Repeat (Step 2) and (Step 3) $B$ times and calculate the $100(1 - \alpha)\%$ percentile of the $B$ values of $T$ obtained.
(Step 5) Starting from time $k = n + 1$ onward, reject $H_0$ if $T_{k,n}$ in (9) is larger than the $100(1 - \alpha)\%$ percentile obtained in (Step 4).

Figure 1. Annual mean temperature and annual mean precipitation in summer months over the contiguous United States from 1895 to 2015. (a) Annual mean temperature in summer; (b) Annual mean precipitation in summer.

Table 5. The stopping time.