In this section, we calculate the correlation and Transfer Entropy matrices using the 83 indices previously described plus their lagged values.

#### 3.1. Correlation

The Pearson correlation is given by

$$C_{ij}=\frac{\sum_{k}\left({x}_{ik}-\overline{{x}_{i}}\right)\left({x}_{jk}-\overline{{x}_{j}}\right)}{\sqrt{\sum_{k}{\left({x}_{ik}-\overline{{x}_{i}}\right)}^{2}}\sqrt{\sum_{k}{\left({x}_{jk}-\overline{{x}_{j}}\right)}^{2}}},$$

where ${x}_{ik}$ is element $k$ of the time series of variable ${x}_{i}$, ${x}_{jk}$ is element $k$ of the time series of variable ${x}_{j}$, and $\overline{{x}_{i}}$ and $\overline{{x}_{j}}$ are the averages of the two time series, respectively.

Pearson correlation measures the linear correlation between variables. Other correlation measures, such as the Spearman rank correlation and the Kendall tau rank correlation, capture nonlinear relations, but we apply the usual Pearson correlation because, for the financial data we use, its results are very similar to those of the Spearman rank correlation, suggesting a nearly linear relation between the indices. This is discussed in Appendix B, where the three correlation measures are compared on our data set.
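For concreteness, the enlarged matrix of original and lagged indices can be assembled as in the following minimal Python sketch. The function and variable names are ours and the data are synthetic; this is an illustration of the construction, not the code used in the paper.

```python
import numpy as np

def enlarged_correlation(returns, lag=1):
    """Pearson correlation matrix of original and lagged series.

    returns: (T, N) array, one column per index (83 in the text);
    the result is the (2N, 2N) enlarged correlation matrix.
    """
    original = returns[lag:]                    # values on day t
    lagged = returns[:-lag]                     # values on day t - lag
    combined = np.hstack([original, lagged])    # columns 1..N, then N+1..2N
    return np.corrcoef(combined, rowvar=False)

# toy example with 3 synthetic series instead of the 83 indices
rng = np.random.default_rng(0)
C = enlarged_correlation(rng.normal(size=(500, 3)))
print(C.shape)   # -> (6, 6), with 1s on the main diagonal
```

Sector 1 of the matrix is then `C[:N, :N]` (original × original) and Sector 2 is `C[:N, N:]` (original × lagged).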

Using both original and lagged indices, we build an enlarged correlation matrix, displayed in Figure 1, with the original indices arranged from 1 to 83 and the lagged indices from 84 to 166. Higher correlation values are represented in lighter shades, and lower correlations in darker shades.

**Figure 1.**
Heat map of the enlarged correlation matrix of both original and lagged indices, representing same-day correlation in Sector 1 and previous-day correlation in Sector 2. Correlation is symmetric, therefore, Sectors 3 and 4 are identical to Sectors 1 and 2.


Figure 2 shows the magnified correlation submatrices of Sector 1 (left), the original indices with themselves, and Sector 2 (right), the lagged indices with the original ones. In Sector 1, where correlations range from $-0.1143$ to 1, besides the bright main diagonal, representing the correlation of an index with itself, which is always 1, there are other clear regions of strong correlation: the North American, South American, and Western European indices all cluster regionally. There is a region of weaker correlation among the Asian and Oceanian countries, and darker areas correspond to countries of Central America and some islands of the Atlantic. In Western Europe, the index of Iceland has very low correlation with the others, and African indices, with the exception of the one from South Africa, also interact weakly in terms of correlation. Other, off-diagonal bright areas correspond to strong correlations between indices of the Americas and those of Europe, and to weaker correlations between Western indices and their same-day counterparts in the East.

In Sector 2 of the correlation matrix, we see the correlations between lagged and original indices, which range from $-0.3227$ to $0.5657$. Here one can see some correlation between American and European indices and the next-day indices from Asia and Oceania, as well as some correlation between American indices and the next-day values of European indices. This suggests a West-to-East influence in the behavior of the indices, which we will explore in more detail in later sections. There is little correlation between the lagged value of an index and its value on the next day.

**Figure 2.**
Heat maps of the correlation submatrices for original × original indices and for lagged × original indices, respectively.


#### 3.2. Transfer Entropy

To further explore the question of which markets influence others, we turn to information-based measures, in particular Transfer Entropy, which was created by Thomas Schreiber [6] as a measurement of the amount of information that a source sends to a destination. Such a measure must be asymmetric, since the amount of information transferred from the source to the destination need not, in general, be the same as the amount transferred from the destination to the source. It must also be dynamic, as opposed to mutual information, which encodes the information shared between the two states. Transfer Entropy is constructed from the Shannon entropy [32], given by

$$H=-\sum_{i}{p}_{i}\,{\mathrm{log}}_{2}\,{p}_{i},$$

where the sum is over all states for which ${p}_{i}\ne 0$. The base 2 for the logarithm is chosen so that the measure of information is given in bits. This definition resembles the Gibbs entropy but is more general, as it can be applied to any system that carries information.

Shannon entropy represents the average uncertainty about measurements $i$ of a variable $X$, and quantifies the average number of bits needed to encode the variable $X$. In our case, given the time series of a stock market index ranging over a certain interval of values, one may divide the possible values into $N$ different bins and then calculate the probability of each state $i$.
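A plug-in estimate of the Shannon entropy of a binned series can be sketched as follows. The code is our own illustration, with a fixed bin width that mirrors the binning discussed later in this section.

```python
import numpy as np

def shannon_entropy(series, bin_width=0.02):
    """Shannon entropy (in bits) of a series discretized into
    fixed-width bins; states with p_i = 0 never enter the sum."""
    states = np.floor((series - series.min()) / bin_width).astype(int)
    _, counts = np.unique(states, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# a variable uniform over 4 states needs log2(4) = 2 bits
x = np.repeat([0.0, 1.0, 2.0, 3.0], 100)
print(shannon_entropy(x, bin_width=1.0))   # -> 2.0
```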

For interacting variables, time series may influence one another at different times. We assume that the time series of $X$ is a Markov process of degree $k$, that is, a state ${i}_{n+1}$ of $X$ depends on the $k$ previous states of $X$:

$$p\left({i}_{n+1}|{i}_{n},\dots ,{i}_{n-k+1}\right)=p\left({i}_{n+1}|{i}_{n},\dots ,{i}_{n-k+1},{i}_{n-k},\dots \right),$$

where $p\left(A|B\right)$ is the conditional probability of $A$ given $B$, defined as

$$p\left(A|B\right)=\frac{p\left(A,B\right)}{p\left(B\right)}.$$

Modelling the interaction between nodes, we also assume that state ${i}_{n+1}$ of variable $X$ depends on the $\ell$ previous states of variable $Y$, as represented schematically in Figure 3.

**Figure 3.**
Schematic representation of the Transfer Entropy between a variable Y and a variable X.


We may now define the Transfer Entropy from a time series $Y$ to a time series $X$ as the average information contained in the source $Y$ about the next state of the destination $X$ which was not already contained in the destination's past. We assume that element ${i}_{n+1}$ of the time series of variable $X$ is influenced by the $k$ previous states of the same variable and by the $\ell$ previous states of variable $Y$. Writing ${i}_{n}^{(k)}=\left({i}_{n},\dots ,{i}_{n-k+1}\right)$ and ${j}_{n}^{(\ell )}=\left({j}_{n},\dots ,{j}_{n-\ell +1}\right)$, the Transfer Entropy from variable $Y$ to variable $X$ is defined as

$$T{E}_{Y\to X}=\sum p\left({i}_{n+1},{i}_{n}^{(k)},{j}_{n}^{(\ell )}\right){\mathrm{log}}_{2}\frac{p\left({i}_{n+1}|{i}_{n}^{(k)},{j}_{n}^{(\ell )}\right)}{p\left({i}_{n+1}|{i}_{n}^{(k)}\right)},$$

where ${i}_{n}$ is element $n$ of the time series of variable $X$, ${j}_{n}$ is element $n$ of the time series of variable $Y$, $p\left(A,B\right)$ is the joint probability of $A$ and $B$, and $p\left({i}_{n+1},{i}_{n}^{(k)},{j}_{n}^{(\ell )}\right)$ is the joint probability distribution of state ${i}_{n+1}$, of state ${i}_{n}$ and its $k$ predecessors, and of the $\ell$ predecessors of state ${j}_{n}$, as in Figure 3.

This definition of Transfer Entropy assumes that events on a certain day may be influenced by events of the $k$ and $\ell$ previous days. We shall assume, with some backing from empirical data for financial markets, that only the previous day is important (i.e., $k=\ell =1$). The Transfer Entropy Equation (6) then simplifies to

$$T{E}_{Y\to X}=\sum _{{i}_{n+1},{i}_{n},{j}_{n}}p\left({i}_{n+1},{i}_{n},{j}_{n}\right){\mathrm{log}}_{2}\frac{p\left({i}_{n+1}|{i}_{n},{j}_{n}\right)}{p\left({i}_{n+1}|{i}_{n}\right)}.$$

In order to calculate Transfer Entropy using Equation (8), we must first establish a series of bins into which the data may be fitted. The number of bins alters the resulting TE, and in order to gauge the effects of the binning choice, in Appendix C we calculated the TE for our data set using bins of three different widths: $0.02$, $0.1$, and $0.5$. The results did not change substantially from one binning to another, and since the heat maps obtained with bins of width $0.02$ were clearer, we adopted this binning in the remainder of our calculations.
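A direct plug-in estimator of Equation (8) with this kind of fixed-width binning can be sketched as follows. The estimator and its names are our own illustration, not the exact code used for the results in this section.

```python
import numpy as np
from collections import Counter

def transfer_entropy(y, x, bin_width=0.02):
    """Plug-in Transfer Entropy (bits) from series y to series x
    with k = l = 1, using fixed-width bins."""
    binned = lambda s: tuple(np.floor((s - s.min()) / bin_width).astype(int))
    xi, yj = binned(x), binned(y)
    n = len(xi) - 1
    triples = Counter(zip(xi[1:], xi[:-1], yj[:-1]))   # (i_{n+1}, i_n, j_n)
    pairs_xy = Counter(zip(xi[:-1], yj[:-1]))          # (i_n, j_n)
    pairs_xx = Counter(zip(xi[1:], xi[:-1]))           # (i_{n+1}, i_n)
    singles = Counter(xi[:-1])                         # i_n
    te = 0.0
    for (i1, i0, j0), c in triples.items():
        p_cond_full = c / pairs_xy[(i0, j0)]           # p(i_{n+1} | i_n, j_n)
        p_cond_dest = pairs_xx[(i1, i0)] / singles[i0] # p(i_{n+1} | i_n)
        te += (c / n) * np.log2(p_cond_full / p_cond_dest)
    return te

# sanity check: if x simply copies y with a one-day delay,
# a balanced binary source transfers about 1 bit per step
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 2000).astype(float)
x = np.concatenate([[0.0], y[:-1]])
print(transfer_entropy(y, x))   # -> close to 1.0
```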

Transfer Entropy, like other measures, is usually contaminated with noise due to the finite number of data points, residual non-stationarity of the data, etc. To reduce this contamination, we calculate the Transfer Entropy from randomized data, where the time series are randomly reordered so as to destroy any correlation or causality relation between variables while preserving their frequency distributions. The result is then subtracted from the original Transfer Entropy matrix, producing the Effective Transfer Entropy (ETE), first defined in [9] and used in the financial setting by [2] and [20]. In the present work, we calculated ten Transfer Entropy matrices based on randomized data and then subtracted their average from the original Transfer Entropy matrix, obtaining the Effective Transfer Entropy matrix presented in Figure 4. The heat map in Figure 4 is colored in such a way as to enhance visibility: the largest brightness was set to $0.3$ (every cell with an ETE value above $0.3$ is painted white), although the range of values goes from $-0.0203$ to $1.8893$.
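The shuffling procedure itself is simple to express; a sketch follows, written generically so that any TE estimator can be passed in. The function name and the constant estimator used in the example are ours, for illustration only.

```python
import numpy as np

def effective_te(y, x, te_func, n_shuffles=10, seed=0):
    """Effective Transfer Entropy: the raw TE minus the average TE over
    randomly reordered copies of the source, which destroys temporal
    relations while preserving the frequency distribution.
    te_func is any estimator taking (source, destination)."""
    rng = np.random.default_rng(seed)
    baseline = np.mean([te_func(rng.permutation(y), x)
                        for _ in range(n_shuffles)])
    return te_func(y, x) - baseline

# with a constant estimator the shuffle baseline cancels the raw value
print(effective_te(np.arange(10.0), np.arange(10.0), lambda s, d: 1.0))  # -> 0.0
```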

The resulting ETE matrix is strikingly different from the correlation matrix. Sector 1, the ETEs from original to original indices, shows some weak flow from Asian Pacific and Oceanian indices to American and European indices, and from European indices to the American ones. Sector 2, representing the ETEs from lagged to original indices, shows strong ETEs from the indices of one continent to the indices of the same continent on the next day, as can be seen from the brighter squares around the main diagonal of the quadrant. Off-diagonal bright regions also show a flow of information from lagged American to European indices, from lagged European to Asian Pacific indices, and from lagged Asian Pacific to both American and European indices. Sector 3 mimics the ETEs of Sector 1, and Sector 4 features ETEs compatible with noise, which is to be expected since causality relations should not go backwards in time.

Sector 2 of the ETE matrix is the result of Transfer Entropy lagged by one day. In [33], the authors used lagged Transfer Entropy in order to study neuronal interaction delays, and [34] implements the calculation of a diversity of information-based measures, including the possibility of using lagged variables in TE.

**Figure 4.**
Effective Transfer Entropy (ETE) matrix. Brighter areas correspond to large values of ETE, and darker areas correspond to low values of ETE.


Figure 5 shows close views of Sector 1 and Sector 2, respectively. In Sector 1, where ETE ranges from $-0.0162$ to $0.1691$, one can see an ETE from Asian and Oceanian indices to American and European ones on the same day, indicating a flow of information from Asian and Oceanian markets to the West.

Sector 2 depicts the ETEs from lagged to original variables, ranging from $0.0185$ to $1.8893$. There is a clear bright streak from lagged indices to themselves on the next day, which is to be expected given the definition of Transfer Entropy. We also see structures very similar to the ones obtained from Sector 1 of the correlation matrix, but now from lagged indices to original ones, leading to the belief that the flow of information from previous days anticipates correlation. There are clear clusters of North and South American indices, of Western European indices, and of Asian Pacific plus Oceanian indices. Although an ETE matrix need not be symmetric, the structure shown in Figure 5b is nearly symmetric, showing that there is a comparable flow of information in both directions. Figure 5b has been colored so as to enhance visibility, with all values above $0.3$ represented as white.

Figure 2, left (Sector 1 of the correlation matrix), and Figure 5, right (Sector 2 of the ETE matrix), display a very similar structure, which suggests that the transfer of information from one index to another corresponds to correlated behavior of the two indices on the following day. Figure 6 plots the values of the two submatrices against each other. There is a clear nonlinear relation between them: the Pearson (linear) correlation between the two sets of values is 0.73, the Spearman rank correlation is 0.82, and the Kendall tau rank correlation is 0.66.
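The three measures used in this comparison can be computed on the flattened submatrices. A self-contained sketch follows, using NumPy only, with a naive O(n²) Kendall tau and no tie handling; it is an illustration of the measures, not the code used for the figures.

```python
import numpy as np

def pearson(a, b):
    return np.corrcoef(a, b)[0, 1]

def spearman(a, b):
    # Spearman = Pearson correlation of the ranks (ties ignored here)
    rank = lambda v: np.argsort(np.argsort(v))
    return pearson(rank(a), rank(b))

def kendall(a, b):
    # naive Kendall tau: (concordant - discordant) / number of pairs
    n = len(a)
    s = sum(np.sign(a[i] - a[j]) * np.sign(b[i] - b[j])
            for i in range(n) for j in range(i))
    return 2.0 * s / (n * (n - 1))

# a monotone nonlinear relation: the rank measures saturate at 1,
# while the Pearson correlation stays below 1
a = np.array([1.0, 2.0, 3.0, 4.0])
b = a ** 2
print(pearson(a, b), spearman(a, b), kendall(a, b))
```

In practice one would flatten Sector 1 of the correlation matrix and Sector 2 of the ETE matrix with `np.ravel` and pass the two vectors to these functions.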

**Figure 5.**
Heat maps of the ETE submatrices from original to original indices and from lagged to original indices, respectively.


**Figure 6.**
Correlation values of Sector 1 of the correlation matrix and Effective Transfer Entropy (ETE) of Sector 2 of the ETE matrix.


#### 3.3. Evolution in Time

Returning to the issue of stationarity, we now analyze the Pearson correlation and the Lagged Transfer Entropy for each year of data separately.

Figure 7 shows the correlation matrices for each year, from 2003 to 2014. Although the average correlation rises in years of crisis, such as 2008 and 2011, the structure of the correlations between indices is conserved.

Figure 8 shows the logarithmic values of a frequency distribution for each of the correlation matrices. The structure does change for years of crisis, but remains relatively intact throughout the years.

**Figure 7.**
Correlation matrices for each of the years of data for Sector 1 (original × original variables). Brighter colors denote higher correlation, and darker colors denote lower correlations.


**Figure 8.**
Logarithmic values of frequency distributions of the correlation matrices for each of the years of data for Sector 1 (original × original variables).


The same is shown for the Lagged Transfer Entropy (Sector 2, from lagged to original indices).

Figure 9 shows the Lagged Transfer Entropy matrices for each year, from 2003 to 2014. Again, there is a conservation of the structure of the relations between indices.

Figure 10 shows the logarithmic values of a frequency distribution for each of the Lagged Transfer Entropy matrices, showing that the structure does change for years of crisis, but remains relatively intact throughout the years.

**Figure 9.**
Lagged Transfer Entropy matrices for each of the years of data for Sector 2 (lagged × original variables). Brighter colors denote higher values, and darker colors denote lower values.

**Figure 10.**
Logarithmic values of frequency distributions of the Lagged Transfer Entropy matrices for each of the years of data for Sector 2 (lagged × original variables).

So, although the data change in time, the change does not significantly affect the values obtained for correlation and Transfer Entropy, particularly in terms of the structures of the networks based on them. In Section 7, we discuss the correlation and Transfer Entropy dependencies further. Since the main aim of this article is to study the networks of indices based on measures that use correlation and Lagged Transfer Entropy, this similarity of structures in time leads us to believe that there is no significant difference between the network structure derived using the whole set of data and networks based on particular subsets of the data. Some confirmation of this claim may be found in [35].