Dynamics of Correlation Structure in Stock Market

Abstract: In this paper, a correction factor for Jennrich's statistic is introduced in order not only to test the stability of correlation structure, but also to identify the time windows where the instability occurs. Whereas Jennrich's statistic only tests the stability of correlation structure along predetermined non-overlapping time windows, the corrected statistic provides the history of correlation structure dynamics from time window to time window. A graphical representation is provided to visualize that history. This information is necessary for further analysis of, for example, the change of topological properties of the minimal spanning tree. An example using NYSE data illustrates its advantages.


Introduction
The correlation structure among stocks in a given portfolio is a complex structure represented numerically in the form of a symmetric matrix where all diagonal elements are equal to 1 and the off-diagonal elements are the correlations between two different stocks. That matrix is the so-called correlation matrix [1]. The larger the number of stocks, the higher the complexity of that structure and the harder it is to understand [2]. From recent literature such as, for example, [1][2][3], we learn that understanding correlation structure is one of the most important problems in econophysics. Theoretically, the correlation matrix among stocks is a random matrix [4]. The vital importance of random matrices in this field is well known. Their role can be found not only in stock market analysis but also in many other areas such as, for example, portfolio optimization [5,6], asset price [7] and ex-ante optimal portfolios [8]. It is also a major problem to understand which non-overlapping time windows, if any, provide the most stable correlation structure [8].
There are two mainstreams in analyzing the complex structure of a correlation matrix. The first is to filter the important information contained therein. This approach was pioneered by Mantegna [1], who introduced the application of: (i) the subdominant ultrametric to construct the economic classification of the stocks in the form of an indexed hierarchical tree, and (ii) the minimal spanning tree (MST) to filter the topological structure of the stocks. See also [9] for a recent development of robust filters. Nowadays, these two tools have become indispensable in econophysics as can be seen, for example, in [10][11][12][13]. The second is to model the dynamics of correlation structure from one time window to another [4,8,14,15]. Under the assumption that the time series data representing the stocks are governed by the geometric Brownian motion (GBM) law, the logarithmic returns are independent and normally distributed. Thus, in this case, the correlation between two different stocks is customarily quantified as the Pearson correlation coefficient (PCC) between the corresponding logarithmic returns [1,2].
In this paper our discussion will be focused on the second topic, especially on how to numerically represent the occurrence of correlation structure dynamics from time window to time window. More specifically, we consider how to identify the time windows where the instability of the correlation matrix occurs and to what extent it occurs. Since that problem is multivariate in nature, in the rest of the paper the study will be focused on statistical model building in a multivariate setting. In that setting, Larntz and Perlman [16] have remarked that the statistical model that has been advanced to test the stability of correlation structure is the one developed by Jennrich [17]. They further reported that this test has commendable properties in terms of computational and distributional behavior. These are among the reasons why Jennrich's test is considered the most appropriate to test correlation structure stability [5].
Nowadays, under the assumption mentioned above that the time series representing the stocks follow a GBM process, Jennrich's test has become the standard practice in finance and financial market analysis [6,8,18]. Its applications can also be found in many studies such as, for example, in global markets [15], business of property [19,20], equity analysis [21], real estate [22], and stock market analysis [23]. This test thus plays a vital role in testing the stability of correlation structures [5,18]. However, as we will show, if the hypothesis of stability is rejected, Jennrich's test cannot provide any information about the correlation structure dynamics from one time window to another. It only tells us whether the correlation structure is stable along all time windows. Thus, if it is unstable, how can we identify the time windows where the instability occurs? This is the main problem that will be discussed in this paper.
The rest of the paper is organized as follows. In the next section we begin our discussion by briefly recalling Jennrich's test and its limitation, which will be the background and motivation of this paper. In the third section, we construct a statistic, mathematically equivalent to Jennrich's, to overcome that limitation. Then, in the fourth section, a correction factor for each term in Jennrich's statistic is introduced in order to identify the time windows where the dynamics of correlation structure occurs. In the fifth section, an example using NYSE data will illustrate the advantages of the corrected statistic. To close this presentation, concluding remarks are highlighted in the last section.

Background and Motivation
Suppose n stocks are available in a portfolio under study and each stock is represented by a time series of its price. Let $p_i(t)$ and $r_i(t)$ be the price of stock i and the logarithm of the i-th stock's price return at time t, respectively. Thus:
$$r_i(t) = \ln p_i(t+1) - \ln p_i(t) \qquad (1)$$
for all i = 1, 2, …, n.
Under the assumption that $p_i(t)$ is governed by the GBM law, the interrelations or, equivalently, similarities among stocks are summarized in the form of a correlation matrix C of size $n \times n$ whose general element in the i-th row and j-th column is defined as the PCC, see [1,2,14]:
$$c(i,j) = \frac{\sum_t \left(r_i(t) - \bar{r}_i\right)\left(r_j(t) - \bar{r}_j\right)}{\sqrt{\sum_t \left(r_i(t) - \bar{r}_i\right)^2 \sum_t \left(r_j(t) - \bar{r}_j\right)^2}} \qquad (2)$$
where $\bar{r}_i$ is the average of $r_i(t)$ over all t. Thus, the matrix C is a numerical representation of the complex system of stocks' interrelationships. That matrix C plays an important role in econophysics as the main source of economic information. Analyzing the complex structure of C is not simple. The greater the number of stocks, the higher the complexity of that structure [2]. However, from the literature we learn that there are two parts in analyzing the complex structure of C, namely: (i) to filter the important information contained therein [1], and (ii) to model the dynamics of correlation structure instability from one time window to another, as discussed in [4,6,8].
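As a minimal sketch of the two definitions above, the following computes logarithmic returns and the PCC matrix with NumPy. The function names, the array layout (rows are times, columns are stocks) and the synthetic prices are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def log_returns(prices):
    """Log returns ln p_i(t+1) - ln p_i(t); rows are times, columns are stocks."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices), axis=0)

def correlation_matrix(returns):
    """Pearson correlation matrix between the columns (stocks) of a returns array."""
    return np.corrcoef(returns, rowvar=False)

# Hypothetical data: 250 days, 4 stocks, random walk in log price.
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal((250, 4)), axis=0))
returns = log_returns(prices)
C = correlation_matrix(returns)
print(C.shape)  # one row and one column per stock
```

The resulting matrix is symmetric with unit diagonal, which is the structure described in the Introduction.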
In what follows, our discussion will be focused on the second topic, especially on how to numerically represent the history of correlation structure instability. For that purpose we introduce a correction factor for each term in Jennrich's statistic. Whereas the original Jennrich's statistic can only be used to test whether the correlation matrix is stable along all time windows, the corrected statistic is able to identify the particular windows at which the instability, if any, occurs. This information is necessary for further analysis of correlation structure dynamics in terms of, for example, stock topological properties.
It is important to note that, under more general conditions on the time series, the use of PCC as a similarity measure between two different time series might not be apt. In this case, other similarity measures such as dynamic time warping [24], detrended correlation [25,26], and Hayashi-Yoshida correlation [27] are available. Whereas dynamic time warping measures the similarity of two time series which may vary in time frame, detrended correlation is introduced for the case where a non-stationary and/or non-GBM process is involved. On the other hand, Hayashi-Yoshida correlation is designed for the case where the two time series are observed in a non-synchronous manner. See [24][25][26][27] for the details.

Review of Jennrich's Statistic
Testing the stability of correlation structure had a long history before Jennrich introduced his test in [17], which has since become popular as the most appropriate test [5]. See, for example, [28] for early developments, and [29,30] for more recent works. Those works show that this research area is very active. In the next paragraph we briefly recall Jennrich's test and then highlight its limitations.
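Suppose m non-overlapping time windows of the stocks' price time series are of concern in studying the dynamics of correlation structure. Let $T_i$ be the length of the i-th window and $C_i$ the correlation matrix of stocks in that time window. To test the stability of correlation structure among stocks under those time windows, Jennrich [17] proposed this statistic:
$$J = \sum_{i=1}^{m} J_i, \qquad J_i = \frac{1}{2}\,\mathrm{tr}\left(Z_i^2\right) - \delta_i^t\, G^{-1} \delta_i \qquad (3)$$
where (i) $Z_i = \sqrt{T_i}\; C_{pooled}^{-1}\left(C_i - C_{pooled}\right)$; (ii) $C_{pooled} = \frac{1}{T}\sum_{i=1}^{m} T_i C_i$, with $T = \sum_{i=1}^{m} T_i$, is the pooled correlation matrix; (iii) $\delta_i$ is the column vector whose j-th component is equal to the j-th diagonal element of $Z_i$; (iv) the general element of G is $\delta_{ab} + c_{ab}\,c^{ab}$, where $\delta_{ab}$ is the Kronecker delta and $c_{ab}$ and $c^{ab}$ are the general elements of $C_{pooled}$ and of its inverse, respectively.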
He showed that J is asymptotically distributed according to a chi-square distribution with $(m-1)\,n(n-1)/2$ degrees of freedom. Therefore, for significance level $\alpha$, the correlation structure along all time windows is declared unstable if J exceeds the cut-off value given by the $(1-\alpha)$ quantile of that distribution.
Despite its popularity, Jennrich [17] remarked at the end of his paper that, although the asymptotic behavior of J in Equation (3) is that of a chi-square variable, the term $J_i$ need not be asymptotically a chi-square variable for all time windows i = 1, 2, …, m. This is the limitation of Jennrich's test that will be handled in the next two sections by introducing a correction factor. As a consequence of that limitation, if the correlation structure along all time windows is unstable, J cannot provide any information about the time windows, if any, at which the correlation structure has changed. This would not be the case if the distribution of $J_i$ were known. Therefore, we need to investigate the distributional behavior of $J_i$. In the remaining pages, in order to derive that distribution, a correction factor for $J_i$ will be introduced through the construction of an equivalent alternative formula for $J_i$ in the form of a Mahalanobis squared distance. We need the correction factor and that equivalent form because it is difficult to derive the distribution of $J_i$ directly from Equation (3). It is the distribution of the corrected $J_i$ that will allow us to investigate the dynamics of correlation structure stability. First, we discuss the distributional behavior of $C_i$.
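The computation of the per-window terms can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it follows the usual statement of Jennrich's statistic (pooled matrix as the length-weighted average of the window matrices, $Z_i = \sqrt{T_i}\,C_{pooled}^{-1}(C_i - C_{pooled})$, and G with general element $\delta_{ab} + c_{ab}c^{ab}$). The function name and the two small matrices are hypothetical.

```python
import numpy as np

def jennrich_terms(corrs, lengths):
    """Per-window terms J_i of Jennrich's statistic; their sum is J."""
    corrs = [np.asarray(c, dtype=float) for c in corrs]
    lengths = np.asarray(lengths, dtype=float)
    total = lengths.sum()
    # Pooled correlation matrix: window-length-weighted average of the C_i.
    pooled = sum(t * c for t, c in zip(lengths, corrs)) / total
    pooled_inv = np.linalg.inv(pooled)
    n = pooled.shape[0]
    # G has general element delta_ab + c_ab * c^ab (element-wise product).
    g = np.eye(n) + pooled * pooled_inv
    terms = []
    for t, c in zip(lengths, corrs):
        z = np.sqrt(t) * pooled_inv @ (c - pooled)
        delta = np.diag(z)  # vector of diagonal elements of Z_i
        terms.append(0.5 * np.trace(z @ z) - delta @ np.linalg.solve(g, delta))
    return np.array(terms)

# Hypothetical example: two windows of 100 observations, two stocks.
c1 = np.array([[1.0, 0.2], [0.2, 1.0]])
c2 = np.array([[1.0, 0.6], [0.6, 1.0]])
terms = jennrich_terms([c1, c2], [100, 100])
print(terms, terms.sum())
```

When all window matrices coincide, every $Z_i$ vanishes and so does each term, which gives a quick sanity check of an implementation.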

Asymptotic Behavior of Correlation Matrix among Stocks
Let $P_i$ be the theoretical correlation matrix among stocks in the i-th time window. The asymptotic distributional behavior of $C_i$ is given in the following theorem [31].
Theorem 1: Let K be the $n^2 \times n^2$ matrix built from the $n \times n$ matrices whose $(i,j)$-th element is equal to 1 and whose other elements are 0, and let $K_D$ be the diagonal matrix whose diagonal elements are those of K. Then $\sqrt{T_i}\,\mathrm{vec}(C_i - P_i)$ is asymptotically distributed as multivariate normal of dimension $n^2$ with mean vector 0 and covariance matrix $\Gamma$, whose expression in terms of $P_i$, K and $K_D$ is detailed in [31].
In that theorem, the matrix K is the so-called commutation matrix and vec(*) is the vectorization of the matrix * obtained by stacking each column underneath the other. See [31] and [32] for the details. It is very important to note that this theorem cannot directly be used to derive the distribution of $J_i$ because the covariance matrix $\Gamma$ of $C_i$ is singular. This motivates us, in the next section, to investigate the asymptotic distribution of the squareform of $C_i$, which will simplify our discussion. More specifically, working with this form is more advantageous than working with $C_i$ itself because (i) it contains the same information as $C_i$ in terms of correlation structure, and (ii) its covariance matrix is non-singular. These properties lead us to the construction of a statistic, equivalent to Jennrich's statistic, which allows us to investigate the dynamics of correlation structure instability along all time windows.

An Equivalent Form of Jennrich's Statistic
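Since $C_i$ is symmetric and its diagonal elements are not random variables, what we need in the study of correlation structure dynamics is only the information contained in the lower (or upper) off-diagonal part of $C_i$. To represent that part in a compact way, the notion of the squareform operator, used in [33], will be adopted. That operator stacks the upper off-diagonal elements of $C_i$ into a vector of length $k = n(n-1)/2$, denoted here by $c_i$. Measuring the distance between $C_i$ and its theoretical counterpart on these vectors, our discussion will be focused on the distributional behavior of that distance in the Mahalanobis sense. To derive that distribution, we need to know the covariance matrix of $c_i$. For this purpose, we define a linear transformation M from $\mathbb{R}^{n^2}$ to $\mathbb{R}^{k}$ such that:
$$c_i = M\,\mathrm{vec}(C_i) \qquad (4)$$
The transformation M can be represented in matrix form as a block matrix $M = (M_1 \mid M_2 \mid \dots \mid M_n)$, where $M_1$ is the zero matrix and, for r = 2, 3, …, n, $M_r$ is the $k \times n$ block that selects the off-diagonal elements of the r-th column of $C_i$. The positions of those elements in $c_i$ are indexed through $\binom{r}{2}$, the number of combinations of 2 out of r objects.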
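The relation between the squareform vector and the column-stacked vectorization can be checked numerically. The sketch below assumes that the upper off-diagonal elements are ordered column by column; that ordering, the function names and the random test matrix are illustrative assumptions rather than the paper's own construction of M.

```python
import numpy as np

def squareform_upper(c):
    """Upper off-diagonal elements of a square matrix, taken column by column."""
    c = np.asarray(c, dtype=float)
    n = c.shape[0]
    return np.array([c[i, j] for j in range(1, n) for i in range(j)])

def selection_matrix(n):
    """k x n^2 matrix M such that M @ vec(C) equals squareform_upper(C)."""
    k = n * (n - 1) // 2
    m = np.zeros((k, n * n))
    row = 0
    for j in range(1, n):
        for i in range(j):
            m[row, j * n + i] = 1.0  # index of C[i, j] in the column-stacked vec(C)
            row += 1
    return m

rng = np.random.default_rng(1)
a = rng.standard_normal((4, 4))
sym = (a + a.T) / 2.0                      # hypothetical symmetric matrix
M = selection_matrix(4)
vec = sym.flatten(order="F")               # vec(.) stacks columns
print(np.allclose(M @ vec, squareform_upper(sym)))
```

Because each row of M picks a distinct element, M has full row rank, which is what keeps the covariance matrix of the squareform vector non-singular when it is obtained from that of vec($C_i$).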
The transformation in Equation (4) and the asymptotic distributional behavior of vec($C_i$) in Theorem 1 lead us to the asymptotic distribution of the Mahalanobis squared distance between the squareform vector of $C_i$ and that of its theoretical counterpart. From Equation (4), we obtain $\Sigma = M \Gamma M^t$, where $\Gamma$ is the covariance matrix defined in Theorem 1. Since $\Sigma$ is non-singular, the distribution of that Mahalanobis squared distance is given in Property 1, which is a consequence of Theorem 2.2.2 in [31]. A special case of that distribution, under the hypothesis that the correlation structure is stable over time windows, is given in Property 2. Based on this property, an equivalent form of Jennrich's statistic J in Equation (3) will be developed and presented in Property 3. This leads us to the correction factor of $J_i$ in Property 4.
The sum over the m time windows of these Mahalanobis squared distances is asymptotically distributed as chi-square with degrees of freedom mk, where $k = n(n-1)/2$.
In practice, the common theoretical correlation matrix $P_0$ is unknown, and so is the corresponding covariance matrix $\Sigma_0$. In this case, as suggested by Jennrich [17], $P_0$ is estimated by $C_{pooled}$. Therefore, $\Sigma_0$ is estimated by $\hat{\Sigma}_0$, obtained from $\Sigma_0$ by replacing $P_0$ with $C_{pooled}$. Since $C_{pooled}$ is a consistent estimator of $P_0$, the following property, which presents an equivalent form of Jennrich's statistic J in (3), is straightforward.
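Property 3: Under the hypothesis that the correlation structure is stable, the sum D over the m time windows of the Mahalanobis squared distances between the squareform vectors of $C_i$ and of $C_{pooled}$, each computed with the estimated covariance matrix and weighted by $T_i$, is asymptotically distributed as chi-square with degrees of freedom $(m-1)k$.
Let us denote the i-th term in this property as $D_i$. By construction, see [17], $D_i$ is mathematically equivalent to $J_i$ in (3). Moreover, the statistic D in Property 3 is also mathematically equivalent to J. As we have mentioned in Subsection 2.1, the correlation structure is declared unstable along all time windows if D or, equivalently, J exceeds the cut-off value. Although J is preferable to D in terms of computational efficiency, as can be seen in the next section, the statistic D provides an opportunity to develop a correction factor for $J_i$, which will be useful to study the dynamics of correlation structure instability.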

Correction Factor
Although D is asymptotically distributed as a chi-square variable, as remarked in Jennrich [17], the distribution of the term $D_i$ is still unknown. This is the reason why D or, equivalently, J cannot be used to investigate the dynamics of correlation structure instability. To handle this problem, in the next paragraph a correction factor for each term $D_i$ is proposed. Since the time windows are non-overlapping, testing the stability of correlation structure ($H_0$: $P_1 = P_2 = \dots = P_m = P_0$) is equivalent to testing repeatedly $H_0$: $P_i = P_0$ for all i = 1, 2, …, m [34].
Based on this equivalence relation, we have the following property. The proof is given in the Appendix.

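Property 4: Let $D_i$ be the i-th term of the statistic D in Property 3 and $T = T_1 + \dots + T_m$. Under the hypothesis that the correlation structure is stable, $\frac{T}{T - T_i}\,D_i$ is asymptotically distributed as chi-square with degrees of freedom k for all i = 1, 2, …, m.
We conclude that the term $D_i$ in Property 3, corrected by the factor $T/(T - T_i)$, is asymptotically distributed as chi-square with degrees of freedom $k = n(n-1)/2$. For computational reasons, the corrected statistic is evaluated through the equivalent term $J_i$: instead of $k \times k$, matrix inversion in the latter is of size $n \times n$. As we will see in the next section, this corrected statistic provides us with a graphical representation of the history of correlation structure dynamics.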
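A sketch of why rescaling each term yields a chi-square variable, under the assumptions that the windows are independent, that they share a common theoretical structure under $H_0$, and that $\Sigma_0$ is treated as known. The symbols $c_i$, $c_{pooled}$ and $\rho_0$ (squareform vectors of $C_i$, of $C_{pooled}$ and of the common theoretical correlation matrix) and $T = \sum_j T_j$ are notation assumed here for illustration; this is not a substitute for the proof in the Appendix.

```latex
\begin{align*}
c_i - c_{pooled}
  &= \Bigl(1 - \tfrac{T_i}{T}\Bigr)(c_i - \rho_0)
     - \sum_{j \neq i} \tfrac{T_j}{T}\,(c_j - \rho_0), \\
\operatorname{Cov}\bigl(c_i - c_{pooled}\bigr)
  &\approx \Bigl(1 - \tfrac{T_i}{T}\Bigr)^{2} \frac{\Sigma_0}{T_i}
     + \sum_{j \neq i} \frac{T_j^{2}}{T^{2}} \frac{\Sigma_0}{T_j}
   = \frac{T - T_i}{T\,T_i}\,\Sigma_0, \\
T_i\,(c_i - c_{pooled})^{t}\,\Sigma_0^{-1}\,(c_i - c_{pooled})
  &\;\overset{\text{approx.}}{\sim}\; \Bigl(1 - \tfrac{T_i}{T}\Bigr)\chi^2_k, \\
\frac{T}{T - T_i}\; T_i\,(c_i - c_{pooled})^{t}\,\Sigma_0^{-1}\,(c_i - c_{pooled})
  &\;\overset{\text{approx.}}{\sim}\; \chi^2_k .
\end{align*}
```

As a consistency check, the expected values of the uncorrected terms sum to $k \sum_i (1 - T_i/T) = (m-1)k$, which agrees with the degrees of freedom of the overall statistic.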

Example
To illustrate how the corrected statistic introduced in Property 4 works, NYSE data from January 2007 until December 2009 for the 100 most capitalized stocks, classified in ten industry sectors, were used.
Those data were downloaded from [35] on 9 May 2013. The distribution of stocks in each sector, represented in different colors, is given in Table 1. However, four stocks are not included in this study due to data availability, which leaves n = 96 stocks. Jennrich's statistic in Equation (3), applied to half-yearly data, gives J = 28490.90. Since the number of degrees of freedom is large, for significance level $\alpha$ = 2.5% as suggested in [36], the normal approximation gives a cut-off value equal to 23218.53. We conclude that, since J exceeds the cut-off value, the correlation structure along all 6 half-yearly time windows is unstable.
That is all the information provided by Jennrich's statistic; it can only be used to test whether the correlation structure is stable along all 6 half-yearly time windows. In the next paragraph, by using the corrected statistic developed in Property 4, we investigate further the dynamics of that structure.
The details of the $J_i$ value and its corrected value are presented in Table 2. Based on the corrected statistic, given in the last column of this table, with significance level $\alpha$ = 2.5%, the half-yearly history of correlation structure instability is represented graphically in Figure 1. The dots represent the half-yearly value of the corrected statistic for the i-th time window, i = 1, 2, ..., 6, and the straight line is the cut-off value for the corrected $J_i$, i.e., the $(1 - \alpha)$ = 97.5% quantile of the chi-square distribution with degrees of freedom k = 4,560, which is equal to 4,747.17. What we learn from Figure 1 is not only the instability of the half-yearly correlation structure but also the history of its dynamics viewed from $C_{pooled}$ as reference. That figure also shows that the correlation structure is significantly different from the reference in the following time windows: January-June 2007, July-December 2007, January-June 2008, and July-December 2009.
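The two cut-off values quoted in this example can be reproduced with a short calculation. The sketch assumes 96 stocks (100 minus the four excluded), 6 windows, and the normal approximation of a chi-square variable with df degrees of freedom by a normal with mean df and variance 2·df; the function name is hypothetical. That the same approximation underlies both reported values is an inference from the numbers, not a statement in the text.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_cutoff(df, alpha):
    """Approximate (1 - alpha) chi-square quantile: df + z * sqrt(2 * df)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return df + z * sqrt(2.0 * df)

n_stocks, n_windows, alpha = 96, 6, 0.025
k = n_stocks * (n_stocks - 1) // 2                       # free correlations per window
cutoff_overall = normal_approx_cutoff((n_windows - 1) * k, alpha)
cutoff_window = normal_approx_cutoff(k, alpha)

print(k)                         # 4560
print(round(cutoff_overall, 2))  # 23218.53
print(round(cutoff_window, 2))   # 4747.17
```

With equal window lengths, J = 28490.90 averages about 4748.5 per window, so the uncorrected terms sit almost exactly on the per-window cut-off; this illustrates why the terms need rescaling before being compared with it.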

Tracking Correlation Structure Changes
The information in Figure 1 provided by the corrected statistic makes possible further investigation of the extent to which the correlation structure has changed. In this example, the correlation structure changes will be studied by comparing the pattern of the MST-based network topology obtained from each time window with that obtained from $C_{pooled}$. First, we compare them in terms of the power-law of degree distribution and, later on, in terms of the Jaccard index.
In Figure 2 we present the dynamics of correlation structure in terms of the MST-based network topology among stocks [1,2,[10][11][12]. Let us consider the pooled correlation matrix obtained from all the time windows as reference. Figure 2a shows the reference network topology, i.e., the MST-based network topology of $C_{pooled}$. Figure 2b-g present the network topology of the first until sixth time windows, respectively.
In that figure, the weight of the link between two stocks i and j represents the distance $d(i,j)$, related to $c(i,j)$ in Equation (2), defined in [1,2] as:
$$d(i,j) = \sqrt{2\,\bigl(1 - c(i,j)\bigr)}$$
From that figure we can investigate how the degree distributions differ from that of the reference correlation structure. This could lead us to investigate further the topological properties of the MST-based network, such as the dynamics of the most influential stocks, by observing centrality measures such as, for example, degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality as usually used in network analysis [11,12,[37][38][39][40]. In what follows we focus the discussion on degree distribution. We show that, in this example, the dynamics of correlation structure in Figure 1, as monitored by the corrected Jennrich's statistic, can be explained in terms of the power-law of degree distribution for each MST in Figure 2. Graphically, in log-log scale, the degree distribution of the reference network together with that of each time window is presented in Figure 3. Horizontal and vertical axes represent log(degree) and log(degree frequency), respectively.
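The construction of an MST from a correlation matrix can be sketched as follows, assuming the distance $d(i,j) = \sqrt{2(1 - c(i,j))}$ of [1,2] and using Prim's algorithm. The function names and the 3-stock matrix are hypothetical, and the paper does not state which MST algorithm was used.

```python
import numpy as np

def mst_edges(corr):
    """Edges (i, j), i < j, of the minimal spanning tree built by Prim's algorithm."""
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best_dist = dist[0].copy()           # cheapest known link from the tree to each node
    best_from = np.zeros(n, dtype=int)   # tree node providing that link
    edges = set()
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best_dist)
        j = int(np.argmin(candidates))
        i = int(best_from[j])
        edges.add((min(i, j), max(i, j)))
        in_tree[j] = True
        closer = dist[j] < best_dist
        best_dist = np.where(closer, dist[j], best_dist)
        best_from = np.where(closer, j, best_from)
    return edges

def node_degrees(edges, n):
    """Number of MST links attached to each of the n nodes."""
    deg = np.zeros(n, dtype=int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

corr = np.array([[1.0, 0.9, 0.1],
                 [0.9, 1.0, 0.8],
                 [0.1, 0.8, 1.0]])
edges = mst_edges(corr)
print(sorted(edges), node_degrees(edges, 3))
```

The degree vector returned by `node_degrees` is the input needed for the degree-distribution comparison discussed next.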
At a glance, this figure shows the dynamics of correlation structure in terms of the power-law of degree distribution. Specifically, let us write the power-law model $P(k) = c\,k^{-\gamma}$, where $P(k)$ is the probability that a particular stock has degree k, and c and $\gamma$ are constants. For each time window, the constant c and the exponent $\gamma$ are given in Table 3. From this table we learn that: (i) According to Lawrence and Lawrence [41], for all time windows, the power-law model $P(k) = c\,k^{-\gamma}$ reasonably fits the empirical pattern of degree distribution in Figure 3, since the mean absolute percentage error (MAPE) is between 20% and 50% for all time windows. (ii) Only the power laws of the fourth and fifth time windows are close to the reference power law related to $C_{pooled}$. These results are in line with the result in Figure 1.
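One way to obtain c, the exponent and the MAPE from a list of node degrees is an ordinary least-squares fit in log-log scale, sketched below. The estimation method, the function name and the toy degree list are assumptions made for illustration; the text does not say how the constants in Table 3 were estimated.

```python
import numpy as np

def fit_power_law(degrees):
    """Fit P(k) = c * k**(-gamma) by least squares on log-log axes; return c, gamma, MAPE (%)."""
    values, counts = np.unique(np.asarray(degrees), return_counts=True)
    values = values.astype(float)
    prob = counts / counts.sum()                      # empirical degree distribution
    slope, intercept = np.polyfit(np.log(values), np.log(prob), 1)
    c, gamma = float(np.exp(intercept)), float(-slope)
    fitted = c * values ** (-gamma)
    mape = float(100.0 * np.mean(np.abs((prob - fitted) / prob)))
    return c, gamma, mape

# Hypothetical degrees: 8 nodes of degree 1, 4 of degree 2, 2 of degree 4.
toy_degrees = [1] * 8 + [2] * 4 + [4] * 2
c, gamma, mape = fit_power_law(toy_degrees)
print(c, gamma, mape)
```

In this toy case the frequencies are exactly proportional to 1/k, so the fit recovers an exponent of 1 with a MAPE of essentially zero; empirical MST degree distributions will fit less cleanly.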

Jaccard Index
To track the changes of correlation structure, we can also use the Jaccard similarity coefficient, also known as the Jaccard index, between the reference structure $C_{pooled}$ and that of each time window. This index measures the similarity between the MST of a particular time window and the reference MST. For the i-th time window, the Jaccard index $I_i$, i = 1, 2, …, 6, is defined by:
$$I_i = \frac{\left| MST_i \cap MST_{Ref} \right|}{\left| MST_i \cup MST_{Ref} \right|}$$
where $MST_i$ and $MST_{Ref}$ represent the MST of the i-th time window and that of the reference, respectively, and $|A|$ is the number of elements in a set A.
As can be seen in Table 4, this index represents the similarity between $MST_i$ and $MST_{Ref}$ as well as the degree distribution does. The indices for the fourth and fifth time windows are higher than the others. This is also in line with the result given by the corrected statistic in Figure 1.
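A minimal sketch of the index, assuming that each MST is represented by its set of links and that links are undirected. The function name and the two small edge sets are hypothetical.

```python
def jaccard_index(edges_a, edges_b):
    """Jaccard index between two sets of undirected links."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    union = a | b
    # Two empty edge sets are treated as identical.
    return len(a & b) / len(union) if union else 1.0

window_mst = {(0, 1), (1, 2)}       # hypothetical MST links of one window
reference_mst = {(1, 0), (2, 3)}    # hypothetical reference MST links
print(jaccard_index(window_mst, reference_mst))
```

Here one link is shared out of three distinct links, so the index is 1/3; an index of 1 means that the two trees have exactly the same links.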

Concluding Remarks
Under the assumption that the time series representing stocks are governed by GBM law, Jennrich's statistic J can be used to test the stability of correlation structure among stocks in the sense of PCC.However, if the correlation structure is unstable, J is not able to provide any information about the time windows at which the instability occurs.Therefore, J cannot tell us the dynamics of correlation structure instability along all time windows.
In this paper a correction factor is introduced in order to improve the role of Jennrich's statistic in understanding the dynamics of correlation structure.More specifically, the corrected statistic can be used not only to test the stability of correlation structure but also to identify the particular time windows at which the correlation structure has significantly been changed.
By using the corrected statistic, a visual representation of the history of correlation structure instability along all time windows can be constructed. The information from this representation is necessary to investigate further, for example, to what extent the correlation structure in a particular time window has changed. We have demonstrated these advantages in analyzing the dynamics of correlation structure at NYSE. In that case, the dynamics of correlation structure is closely related to the power-law of degree distribution. Furthermore, the Jaccard index is able to quantify the similarity between two MST-based network topologies.

Appendix: Proof of Property 4
Let us write:

NYSE Correlation Structure Dynamics
As an illustration of the advantages of the corrected statistic, let us first test the stability of correlation structure on a half-yearly basis (January-June 2007, July-December 2007, January-June 2008, July-December 2008, January-June 2009, and July-December 2009) based on Jennrich's test in Equation (3).
the first term on the right hand side is simply i i T C T  , according to Theorem 2.2.2.in[31], the second term on the right hand side of Equation (A1), the distribution of , if i T   for all i = 1, 2, …, m, we have Property 4.
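The squareform operator transforms $C_i$ into a vector containing all elements of $C_i$ below or above the diagonal. In this paper we choose the upper off-diagonal part and we denote the resulting vector by $c_i$. Up to a constant factor, the distance between $C_i$ and its theoretical counterpart is equivalent to the Euclidean length of the difference between the corresponding squareform vectors.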

Table 1. Distribution of stocks in each sector.

Table 2. Corrected statistic for each time window.

Table 3. The constant c and the exponent $\gamma$ for each time window.

Table 4. Jaccard index for each time window.