Multi-Stage Change Point Detection with Copula Conditional Distribution with PCA and Functional PCA

: A global uncertainty environment, such as the COVID-19 pandemic, has affected the manufacturing industry severely in terms of supply and demand balancing. So, it is common that one stage statistical process control (SPC) chart affects the next-stage SPC chart. It is our research objective to consider a conditional case for the multi-stage multivariate change point detection (CPD) model for highly correlated multivariate data via copula conditional distributions with principal component analysis (PCA) and functional PCA (FPCA). First of all, we review the current available multivariate CPD models, which are the energy test-based control chart (ETCC) and the nonparametric multivariate change point model (NPMVCP). We extend the current available CPD models to the conditional multi-stage multivariate CPD model via copula conditional distributions with PCA for linear normal multivariate data and FPCA for nonlinear non-normal multivariate data.


Introduction
Since Hotelling (1949) proposed Hotelling T 2 statistics for the multivariate statistical process control (SPC), Crosier (1988), Lowry, Woodall, Champ and Rigdon (1992), and Zou and Tsung (2011) have proposed the multivariate versions of the cumulative sum (CUSUM) time-weighted control chart and the exponentially weighted moving average (EWMA) time-weighted SPCs. However, the manufacturing industry is still requiring a modern statistical technique dealing with non-normal high dimensional correlated multivariate data. In order to solve this difficult problem in lights of quality control, [1] proposed SPC charts as a tool for analyzing big data. Furthermore, Reference [2] discussed and compared the conventional SPCs with nonparametric SPCs in terms of the strengths and limitations.
The purpose of our paper is that, under a situation, such as high dimensional correlated variables over the several stage process in a manufacturing industry business, we consider modeling conditional multi-stage manufacturing processes for detecting faults in the several stages for the complex production system. Multi-stage CPD has emerged as a cutting-edge research area at the interface of the engineering and statistical sciences. Over the last two decades, Reference [3][4][5] developed the change point detection (CPD) models with needed pre-knowledge for in-control distribution and nonparametric CPD charts to detect mean, variance, and other distributional shifts. Reference [6] proposed online nonparametric multivariate CPD models. Recently, Reference [7] reviewed previous works focusing on energy divergence

Statistical Methods
We consider a situation, such as high dimensional correlated variables over conditional multi-stage processes. Among several statistical methods for the dimensional reduction of multivariate highly correlated variables, we employ the traditional linear PCA and a nonlinear FPCA in this paper. With the traditional PCA method, we consider normal multivariate data for the conditional multi-stage CPD. With a nonlinear FPCA method, we consider non-normal multivariate data for the conditional multi-stage CPD. For extending single-stage CPD to the conditional multi-stage CPD, we employ the conditional distribution by copula and the nonparametric multivariate control chart for the conditional multi-stage CPD.

Principal Component Analysis
The traditional linear PCA is one of the popular statistical methods to reduce the dimensionality of multivariate data into a smaller number of uncorrelated variables called principal components (PCs), while keeping variation in the original data.
The SPCs with primary principal components by PCA have been proposed to monitor a class of multivariate quality processes for handling multivariate data with multicollinearity between variables (see Reference [16] for details). We consider the PCA-based multivariate CPD method to demonstrate the model's flexibility and performance by both a simulation study and a real data illustration based on Reference [7]. If the data follows the normality assumption, then we can use our proposed conditional multi-stage CPD with a linear PCA method.

Functional Principal Component Analysis
For the dimension reduction of the multivariate highly correlated and non-normal data, we also employ nonlinear FPCA to determine the factors (i.e., principal components). By using non-liner eigenfunctions to explain the variation of the time series and examine the sample covariance structure, FPCA is a better statistical dimension reduction method than the PCA proposed by Reference [17]. In addition, FPCA is the more appropriate statistical method to know the clustering pattern of the time-course data rather than the clustering pattern of the whole data at a certain time. We divide density variations into a set of orthogonal principal component functions that maximize the variance along each component estimating density functions by employing a nonparametric method and extracting common structures from the estimated functions.
The functional form of y i (t) is given by the sum of the weighted basis functions, φ k (t), across the set of times T.
where K is a number of basis functions. In this study, a Fourier basis is used to represent smooth functions as a basis function due to its flexibility and computational advantages. Here, our goal is to obtain a smooth function which fits well into the observed time series, y i (t). To perform FPCA, we use the 'fdapace' R package (Reference [18]). This package is FPCA for sparsely or densely sampled random trajectories and time courses, via the principal analysis by conditional estimation (PACE) algorithm which produces covariance and mean functions, eigenfunctions, and principal component (scores), for both functional data and derivatives for both dense (functional) and sparse (longitudinal) sampling designs. For sparse designs, PACE gives fitted continuous trajectories with confidence bands, even for subjects with few longitudinal observations. Reference [19,20] developed the basic procedure behind the PACE approach for sparse functional data as follows: First, compute the cross-sectional meanμ. Second, compute the cross-sectional covariance surface which is guaranteed to be positive semi-definite. Third, do eigenanalysis on the covariance to estimate the eigenfunctionsφ and eigenvaluesλ. Fourth, employ numerical integration to estimate the corresponding scoresη, i.e.,η ik = T 0 [y(t) −μ(t)]φ i (t)dt.

Copula
A copula is defined as a multivariate distribution function defined on the unit [0, 1] p , with p the number of marginal distributions. Copula is a flexible function to construct the dependence structure of random variables. In this paper, we consider a bivariate (two-dimensional) copula, where p = 2. Reference [21] proposed copula function such that any bivariate distribution function, F XY (x, y), can be represented as a function of its marginal distribution of X and Y, F X (x) and F Y (y), as where we denote U = F X (x) and V = F Y (y), which are the continuous cumulative distribution functions of X and Y, and we denote as θ an association parameter of the copula function. Therefore, the copula function describes the dependent mechanism between two random variables by eliminating the influence of the marginal distributions or any monotone transformation of the marginal distributions.

Definition 2.
A Gaussian copula is a distribution over [0, 1] p . It is constructed from a multivariate normal distribution over R p by using the probability integral transform. For a given correlation matrix R ∈ [−1, 1] p×p , the Gaussian copula with parameter matrix R can be written as where θ is an association parameter of the Gaussian copula function, Φ −1 is the inverse cumulative distribution function of a standard normal, and Φ R is the joint cumulative distribution function of a multivariate normal distribution with mean vector zero and a covariance matrix equal to the correlation matrix R.
Definition 3. The p-dimensional random vector X = (X 1 , · · · , X p ) is said to have a (non-singular) multivariate Student-t distribution with ν degrees of freedom, mean vector µ and positive-definite dispersion or scatter matrix Σ, denoted X ∼ t p (ν, µ, Σ), if its density is given by Note that, in this standard parameterization, cov(X) = ν ν−2 Σ so that the covariance matrix is not equal to Σ and is in fact only defined if ν > 2. Useful reference for the multivariate t-copula is Reference [23].

Copula
Copula Function Because of the limited range of the association parameter, θ, in the Clayton copula, FGM copula, and Gumbel copula functions in Table 1, we have difficulty applying the Clayton copula, FGM copula, and Gumbel copula functions to SPC, except for the Frank copula. We employed the Gaussian copula in Definition 1, the t-copula in Definition 2, and one of the Archimedean copula, the Frank copula introduced in Definition 3, for our proposed conditional multi-stage CPD.

Energy Test-Based Control Chart (ETCC)
Reference [7] proposed a nonparametric CPD model which can simultaneously detect any change of mean, variance, or dependence structure all together in the multivariate distribution. Furthermore, Reference [7] used the maximum energy divergence-based permutation test to screen out the multiple change points for multivariate time series and employs the discrepancy of empirical characteristic functions of two random vectors. The empirical distribution of the test statistic can be obtained by permutation samples. Then, the sequential detection of change points can be performed under the algorithm introduced by the change point model (see Reference [24]) to form an online detection. For a change point detection problem, it is set that the change occurs at τ when the two random vectors {X i ∈ R p : X i ∼ F, i = 1, . . . , τ} and {Y j ∈ R p : Y j ∼ G, j = τ + 1, . . . , T } have a distribution shift. In a multiple change point case, τ i , i = 1, 2, . . . , the changes' detection can be formulated as Because the corresponding characteristic functions of X i and Y j , i.e., f x and f y , are uniquely determined, using the divergence between characteristic functions of the two random vectors to monitor the change is an applicable routine. Reference [25] employed an integrated weighted distance between two characteristic functions, and proved that the larger the distance, the more likely a change may occur between the two random vectors. Reference [7] named a nonparametric CPD model which is a nonparametric control chart as an energy test-based control chart (ETCC). Based on the ETCC, Reference [26] made an R package 'EnergyOnlineCPM' which centers on the Phase II nonparametric CPD model to online detect the multiple change points.

NPMVCP by Holland and Hawkins (2014)
Reference [6] proposed a nonparmatric SPC by employing multivariate rank-based test by Reference [27]. The multivariate CPD model by Reference [24] defines changes in a sequence, X 1 , . . . , X t , as follows: and H 0 : σ = 0, versus H 1 : σ = 0. The test statistics and their asymptotic distribution are given for k ∈ {1, . . . , t − 1} as: where∑ k,t is the pooled sample covariance matrix for the centered rank vectorr t computed by using a kernel function. Reference [6] developed the test statistic where∑ k,t = t−k tk∑ t is the unpooled estimator of the covariance matrix of the centered ranks.

Multi-Stage CPD with Copula Conditional Distribution
In this research, we consider the conditional multi-stage multivariate CPD by performing the conditional transformed data by copula functions (Gaussian, t, Frank).

Corollary 1.
For two random variables X 1 and X 2 , we can derive the conditional distribution of X 1 given X 2 , F 1|2 (X 1 |X 2 ), as follows: where θ 12 is an association parameter of the copula function, U 1 = F(X 1 ), and U 2 = F(X 2 ). Similarly, for two random variables X 2 and X 3 , we can derive the conditional distribution of X 3 given X 2 , F 3|2 (X 3 |X 2 ), as follows: where θ 23 is an association parameter of the copula function, U 2 = F(X 2 ), and U 3 = F(X 3 ).

Corollary 2.
Assume we have three random variables X 1 , X 2 , X 3 . We can derive the conditional cumulative distribution function as follows: where θ 3|12 is an association parameter of the copula function, The procedures for estimating the parameter of the copula for the conditional distribution function can be defined as follows. The first step is that we employ the empirical CDF approach to transform the observations to uniform distributed data in [0, 1]. (see Reference [28] for details). Because the empirical marginal distributions of U and V are uniform on [0, 1] such that they are parameter-free, the rank-based approach allows us to compute joint probabilities without knowing marginal distributions. In this paper, the association parameter estimation for bivariate copulas is computed by using a maximum likelihood estimation method which can be used in the 'BiCopEst' function from the 'CDVine' R package [29]. The second step is that after the parameters θ ij and θ jk in C(U i , U j , θ ij ) and C(U j , U k , θ jk ) are estimated, the conditional CDFs the estimatesθ ij andθ jk by partial derivatives of C(U i , U j ,θ ij ) and C(U j , U k ,θ jk ). The third step is that the association parameter is estimated by the maximum likelihood estimation method. The last step is that with the estimated parameterθ ik|j , the conditional By following these procedures, we can make the conditional transformed data with a copula function for conditional multi-stage CPD.
For the dimensional reduction to the smaller number of principal components compared to the number of variables in the whole dataset, we apply PCA or FPCA to each stage dataset (X i , i = 1, 2, 3), and, iteratively, we perform PCA or FPCA on each stage dataset(X i , i = 1, 2, 3) to do a dimensional reduction. Each stage size is n i for i = 1, 2, 3, and we set an equal sample size for each stage (n 1 = n 2 = n 3 ) for the computation convenience of the copula method such that n = n 1 + n 2 + n 3 for the simulated multivariate data X = (X 1 , X 2 , X 3 ). After performing PCA or FPCA with X = (X 1 , X 2 , X 3 ), we transform the PCA scores generated from X = (X 1 , by the empirical CDF approach, and then we apply the copula conditional distribution to these Y = (Y 1 , Y 2 , Y 3 ), so that the conditional transformed data are generated, such as the Y 1 Finally, we employ the energy test-based control chart (ETCC, Reference [7]) and nonparametric multivariate change point model (NPMVCP, Reference [6]) of each stage for detecting change points to the Y 1 vector, F(Y 2 |Y 1 ) vector, and F(Y 3 |Y 1 , Y 2 ) vector. We propose the conditional multi-stage CPD scheme based on the copula conditional distributions and PCA or FPCA for multivariate correlated data as follows:

1.
Apply PCA or FPCA to the simulated multivariate data X = (X 1 , X 2 , X 3 ) for dimensional reduction to several principal components.

2.
Transform the PCA or FPCA scores to the uniform distribution transformed data Y = (Y 1 , Y 2 , Y 3 ) by the empirical CDF approach.

3.
Apply the copula conditional distribution for transforming three-stage multivariate data Apply multivariate CPD methods (ETCC and NPMVCP) to both conditional stages F(Y 2 |Y 1 ) and Detect change points by ETCC and NPMVCP from each F(Y 2 |Y 1 ) and F(Y 3 |Y 1 , Y 2 ).

Illustrated Example
In order to illustrate our proposed conditional multi-stage multivariate CPD, we compare our method with recent multivariate CPD models by using simulated multivariate data and real data in Section 4.

Simulation Study
We want to generate multi-stage simulated multivariate dataset so that a current stage process is affected by the previous stage process and multivariate data have high correlations among variables, we employed the copula dependence method which can express the multi-stage dependence and can make a high correlation structure among variables in each stage. With this simulated dataset, we want to verify our conditional multi-stage CPD scheme by the copula conditional distribution. We generate three stage simulated datasets (X 1 , X 2 , X 3 ). We name X 1 as stage 1, X 2 as stage 2, and X 3 as stage 3.
For the dataset X 1 , we simulate the highly correlated multivariate data by using the 'copula' R package with the 'normalCopula' function for the five variables, and each variable has sample size 400 with the correlation parameters (0.9, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7, 0.6, 0.5, 0.4), specifying that the type of the symmetric positive definite matrix characterizing the elliptical copula is unstructured. For each marginal distribution for the five variables in X 1 , we use three gamma distributions (X 11 follows gamma distribution with shape parameter (set to 5) and scale parameter (set to 1), X 12 follows gamma distribution with shape parameter (set to 5) and scale parameter (set to 2), and X 13 follows gamma distribution with shape parameter (set to 5) and scale parameter (set to 3)) and two exponential distributions (X 14 follows exponential distribution with parameter (set to 5) and X 15 follows exponential distribution with parameter (set to 2)).
For the dataset X 2 , we simulate the highly correlated multivariate data by using 'copula' R package with the 'normalCopula' function for the five variables, and each variable has sample size 400 with the correlation parameters (0.4, 0.5, 0.6, 0.7, 0.7, 0.7, 0.7, 0.6, 0.5, 0.4), specifying that the type of the symmetric positive definite matrix characterizing the elliptical copula is unstructured. For each marginal distribution for the five variables in X 2 , we use three gamma distributions (X 21 follows gamma distribution with shape parameter (set to 2) and scale parameter (set to 1), X 22 follows gamma distribution with shape parameter (set to 2) and scale parameter (set to 2), and X 23 follows gamma distribution with shape parameter (set to 2) and scale parameter (set to 3)) and two exponential distributions (X 24 follows exponential distribution with parameter (set to 2), and X 25 follows exponential distribution with parameter (set to 5)).
For the dataset X 3 , we simulate the highly correlated multivariate data by using 'copula' R package with 'normalCopula' function for the 5 variables and each variable has sample size 400 with the correlation parameters (0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5) to specify the symmetric positive definite matrix characterizing that the elliptical copula is unstructured. For each marginal distribution for five variables in X 3 , we use three gamma distributions (X 31 follows gamma distribution with shape parameter (set to 3) and scale parameter (set to 1), X 32 follows gamma distribution with shape parameter (set to 3) and scale parameter (set to 2), and X 33 follows gamma distribution with shape parameter (set to 3) and scale parameter (set to 3)) and two exponential distributions (X 34 follows exponential distribution with parameter (set to 4), and X 35 follows exponential distribution with parameter (set to 3)). Figure 1 shows the data plots of the three stages (X 1 , X 2 , X 3 ). In Figure 1, stage 1 shows bigger spread than stage 2 and stage 3, stage 2 shows smaller spread than stage 1 and stage 3, and stage 3 has bigger spread than stage 2. The correlation matrix with the simulated multivariate data in Table 2 shows high correlations exist among the five variables in X 1 , X 2 , and X 3 .
To perform the change point detection for the conditional multi-stage multivariate highly correlated simulated dataset, we apply PCA or FPCA to the simulated data X = (X 1 , X 2 , X 3 ) and then generate PCA or FPCA scores to the uniform distribution transformed data Y = (Y 1 , Y 2 , Y 3 ) by the empirical CDF approach. For FPCA, we employ Fourier basis functions for constructing functional eigenfunctions with K = 7 and T = 433 introduced in Section 2.2, and three eigenfunctions by the FPCA are transformed to the uniform distributed data Y = (Y 1 , Y 2 , Y 3 ) by the empirical CDF approach. We apply the copula conditional distribution for transforming three-stage multivariate data Y = (Y 1 , Y 2 , Y 3 ) to two conditional datasets F(Y 2 |Y 1 ) and F(Y 3 |Y 1 , Y 2 ) and then apply multivariate CPD methods (ETCC and NPMVCP) to each Y 1 , F(Y 2 |Y 1 ) and F(Y 3 |Y 1 , Y 2 ) to detect change points for each stage Y 1 , F(Y 2 |Y 1 ) and F(Y 3 |Y 1 , Y 2 ). Table 3 shows the PCA variance proportions for X 1 , X 2 and X 3 . Table 4 shows the PCA variance proportions with copula conditional distributions of t-copula, Gaussian copula and Frank copula, F(Y 2 |Y 1 ) and F(Y 3 |Y 1 , Y 2 ). Table 5 shows the change points by both ETCC and NPMVCP with the whole simulated data case and PCA components of the simulated multivariate data. We compare our proposed method with recent methods on multi-stage change point detection of multivariate data. We chose a nonparametric multiple change point analysis of multivariate data developed by Reference [9,10] with 'ecp' R package. Table 6 shows change point detections with nonparametric multiple change point analysis of simulated multivariate data with using the command 'ks.cp3o_delta' for the change points estimation by pruned objective via the Kolmogorov-Smirnov statistic, and the window size between segments is 30 in the 'ecp' R package. Compared with the results of ETCC and NPMVCP in Table 5, James, Zhang, and Matteson (2019) detected more change points with simulated multivariate data for each stage (stage 1, stage 2, stage 3). Table 7 shows the change point detections by both ETCC and NPMVCP with FPCA components of simulated multivariate data. The ETCC and NPMVCP with FPCA components of simulated multivariate data in Table 7 detected more change points with simulated multivariate data for each stage (stage 1, stage 2, stage 3) than the James, Zhang, and Matteson (2019) nonparametric multiple change point method. For considering the conditional multivariate data, the change point detections by both ETCC and NPMVCP with a copula conditional distribution of the t-copula, Gaussian copula, and Frank copula with PCA components are proposed in this paper. We notice that the performance of our copula-based method depends on the choice of the copula function. But, as we mentioned in Section 2.3, it is difficult to apply many copula functions to SPC because the range of the association parameter, θ, in the Clayton copula, FGM copula, and Gumbel copula functions in Table 1 is restricted so that we had computation difficulty to apply the Clayton copula, FGM copula, and Gumbel copula functions to SPC. Since the range of the association parameter, θ, from the Gaussian copula and the t-copula is θ ∈ (−∞, ∞), and the Frank copula is θ ∈ (−∞, ∞) \ {0}, we can compare these copula functions to simulated multivariate data to choose the copula function properly, which is the critical issue about a copula-based CPD method. Table 8 shows the change point detections by both ETCC and NPMVCP with copula conditional distribution of the t-copula, Gaussian copula, and Frank copula with PCA components. From Table 8, we can notice that the change point detections by the three copula functions (Gaussian copula, t-copula, and Frank copula) are slightly different.
Through the empirical trial and error learning based on the certain manufacturing circumstance, we recommend that industry practitioners compare these copula-based CPD methods and choose a copula function properly. Figures 2-4 show the eigenvalues and eigenfunctions of the FPCA plots of X 1 , X 2 , and X 3 . Table 9 shows the change point detections by both ETCC and NPMVCP with a copula conditional distribution of the t-copula, Gaussian copula, and Frank copula with FPCA components. With the simulated multivariate data, we found that the FPCA-based conditional multi-stage multivariate CPD method detected more change points for each stage case rather than the PCA-based conditional multi-stage multivariate CPD method. From Table 9, the FPCA-based conditional multi-stage multivariate CPD method is a promising research area for detecting change points if we can implement the proper copula function empirically.

Real Data
To apply multi-stage multivariate real dataset to our proposed CPD method, we chose daily foreign exchange rates in each continental region which each continental region is financially and economically influenced by another continental region by the time zone difference. Our data set contains daily foreign exchange rates for the twenty four most traded currencies (8 countries in Asia, 8 countries in Europe, and 8 countries in America) against the euro from January 3, 2013 (1/3/2013) to October 6, 2014 (10/6/2014). The data set was retrieved from the currency database retrieval system provided by Professor Werner Antweiler's website at UBC (University of British Columbia)'s Sauder School of Business, http://fx.sauder. ubc.ca/data.html. We denote S t to be an observed daily foreign exchange rate process in discrete time, t = 1, . . . , n, and r t = log(S t /S t−1 ) to be the rates of return of the exchange rates at time t. In particular, we select highest Gross Domestic Product (GDP) to lowest GDP order in each continent so that, in Asia, we select Japan, South Korea, Taiwan, China, Philippines, Thailand, India, and Vietnam; in Europe, we select Norway, Switzerland, Denmark, Sweden, United Kingdom, Poland, Hungary, and Russia; and, in America, we select USA, Canada, Chile, Uruguay, Brazil, Mexico, Columbia, and Peru in Table 10. Figure 5 shows the time plots of the twenty-four currencies in America, Europe, and Asia. Table 11 shows the correlation matrix with the twenty-four currencies in the period (1/3/2013 to 10/6/2014). We can find that there are high correlations among currencies in Asia and America but not high correlations among currencies in Europe. Table 12 shows the results of PCA variance proportions with real exchange currency data in the period. The result in Table 12 shows that America and Asia have similar PCA component variance proportions but Europe is different from America and Asia in terms of PCA component variance proportions. Table 13 shows PCA variance proportions with copula conditional distributions with t-copula, Gaussian copula, and Frank copula for F(Europe | Asia) and F(America | (Europe, Asia)) of real data. Table 14 shows the change point detection by ETCC and NPMVCP with real exchange currency data and PCA components. Figures 6-8 show the eigenvalues and eigenfunctions of FPCA plots of Asia, Europe, and America. Table 15 shows the change points with real exchange currency data by the [9,10] nonparametric multiple change point analysis with the 'ecp' R package. Compared with the results of ETCC and NPMVCP in Table 14, Reference [10] detected more change points with real exchange currency data with America, Asia, and Europe (1/3/2013 to 10/6/2014). Table 16 shows change point detections of FPCA components of real exchange currency data with America, Asia, and Europe (1/3/2013 to 10/6/2014). The ETCC and NPMVCP with FPCA components of real data in Table 16 detected more change points with real exchange currency data with America, Asia, and Europe (1/3/2013 to 10/6/2014) than Reference [10] nonparametric multiple change point method. To consider the conditional multivariate real data, the change point detections by ETCC and NPMVCP with copula function of PCA components are shown in Table 17. Table 18 shows change point detection by ETCC and NPMVCP with copula function of FPCA components for real exchange currency data with America, Asia, and Europe (1/3/2013 to 10/6/2014).
For the second real data application, we consider real exchange currency data with America (1/3/2013 to 10/3/2014) and Asia (1/4/2013 to 10/6/2014) because of the time zone difference between America and Asia. Table 19 shows change point detection by ETCC and NPMVCP with real exchange currency data with America (1/3/2013 to 10/3/2014) and Asia (1/4/2013 to 10/6/2014) and the PCA components of the data. Table 20 shows the change point detection with real exchange currency data with America (1/3/2013 to 10/3/2014) and Asia (1/4/2013 to 10/6/2014) by Reference [9,10] nonparametric multiple change point analysis. Compared with the results of ETCC and NPMVCP in Table 19, Reference [10] detected more change points with real exchange currency data with America (1/3/2013 to 10/3/2014) and Asia (1/4/2013 to 10/6/2014). Table 21 shows change point detection by ETCC and NPMVCP with FPCA components of real exchange currency data with America (1/3/2013 to 10/3/2014) and Asia (1/4/2013 to 10/6/2014). The ETCC and NPMVCP with FPCA components of real data in Table 21 detected more change points with real exchange currency data with America (1/3/2013 to 10/3/2014) and Asia (1/4/2013 to 10/6/2014) than [10] nonparametric multiple change point method. We also considered the conditional real data, Asia given America, in terms of time zone difference with America (1/3/2013 to 10/3/2014) and Asia (1/4/2013 to 10/6/2014). Table 22 shows change point detection by ETCC and NPMVCP of the copula conditional distribution and PCA with real exchange currency data of America  Table 23 shows change point detection by ETCC and NPMVCP of the copula conditional distribution and FPCA with real exchange currency data of America (1/3/2013 to 10/3/2014) and Asia (1/4/2013 to 10/6/2014). From these two real data examples, we can conclude that the FPCA-based conditional multi-stage multivariate CPD method detected more change points for each stage case rather than the PCA-based conditional multi-stage multivariate CPD method because FPCA is nonlinear PCA, which can be flexible to the real data.

Conclusions
We proposed the conditional multi-stage multivariate CPD method by employing PCA or FPCA, copula conditional distribution, and the multivariate CPD models, which are energy test-based control chart (ETCC) and the nonparametric multivariate change point model (NPMVCP). With a simulation study and real data analysis, we showed that our proposed conditional multi-stage multivariate CPD method based PCA and FPCA is useful for detecting change points in the case of a multi-stage sequential process. Furthermore, we can conclude that the FPCA-based conditional multi-stage multivariate CPD method detects more change points compared to the PCA-based conditional multi-stage multivariate CPD method. Future study will employ FPCA with different types of bases to compare Fourier-based FPCA for multi-stage multivariate CPD and also develop a neural network-based multi-stage multivariate CPD method.
Author Contributions: J.-M.K. designed the model, analyzed the data and wrote the paper. N.W. proposed the idea of this paper, formulated the conceptual framework, designed the model, obtained inference and wrote the paper. Y.L. supervised this research, formulated the conceptual framework, designed the model, obtained inference and wrote the paper. All the authors cooperated to revise the paper. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.