# Financial Insights from the Last Few Components of a Stock Market PCA


## Abstract


## 1. Introduction

- The first principal component (PC1), which has the largest eigenvalue and which they asserted represents a market-wide effect that influences all stocks.
- A variable number of principal components (PCs) following the market component, which represent synchronized fluctuations affecting groups of stocks.
- The remaining PCs indicate randomness in the price fluctuations.

## 2. Data and Methods

#### 2.1. Data

- We created a new variable associated with each stock called the Dividend Factor. We started with a factor of 1, and every time a dividend was paid we multiplied the running factor by the Daily Dividend Factor for that day: $$\begin{aligned}\mathrm{Daily\ Dividend\ Factor}_{i}(t) &= \begin{cases}1 & \text{if no dividend}\\ 1+\dfrac{D_{i}(t)}{P_{i}(t)} & \text{if dividend}\end{cases}\\ \mathrm{Cumulative\ Dividend\ Factor}_{i}(t) &= \prod_{j=1}^{t}\mathrm{Daily\ Dividend\ Factor}_{i}(j)\end{aligned}$$
- We adjusted the price series with the Cumulative Dividend Factor; the adjusted price was calculated by $$\mathrm{PNEW}_{i}(t)=P_{i}(t)\times \mathrm{Cumulative\ Dividend\ Factor}_{i}(t).$$
- The return series for a given stock $i$ was calculated as $$\mathrm{R}_{i}(t)=\frac{\mathrm{PNEW}_{i}(t+1)-\mathrm{PNEW}_{i}(t)}{\mathrm{PNEW}_{i}(t)}.$$
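
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code (the paper's analysis was done in `R`): the function name `adjusted_returns`, the list-based inputs, and the toy prices and dividends are all hypothetical, and days are indexed from 0 rather than 1.

```python
from typing import List


def adjusted_returns(prices: List[float], dividends: List[float]) -> List[float]:
    """Dividend-adjusted simple returns for one stock.

    prices[t] is P_i(t); dividends[t] is D_i(t), with 0.0 on days
    without a dividend. Both inputs are hypothetical daily series.
    """
    if len(prices) != len(dividends):
        raise ValueError("prices and dividends must have the same length")

    cumulative = 1.0  # Cumulative Dividend Factor, starts at 1
    pnew = []         # adjusted price series PNEW_i(t)
    for p, d in zip(prices, dividends):
        daily = 1.0 + d / p if d > 0 else 1.0  # Daily Dividend Factor
        cumulative *= daily
        pnew.append(p * cumulative)

    # R_i(t) = (PNEW_i(t+1) - PNEW_i(t)) / PNEW_i(t)
    return [(pnew[t + 1] - pnew[t]) / pnew[t] for t in range(len(pnew) - 1)]


# Toy example (made-up numbers): the price drops by 2 on the day a dividend of 2 is paid.
prices = [100.0, 98.0, 98.0]
dividends = [0.0, 2.0, 0.0]
returns = adjusted_returns(prices, dividends)
```

In this toy series the raw price falls by exactly the dividend amount, so the adjusted returns are approximately zero, which is the behaviour the adjustment is meant to produce.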

#### 2.2. Principal Component Analysis

`R` (R Core Team 2014). The PCs were not rotated. We note that rotation of PCs is common but should not be carried out in this type of application.

`graphics` package in base `R` and examined them for near-constant relationships. In the results below, we present the scatter plots starting with PC151/PC152, the lowest-numbered PCs that showed evidence of stocks with a near-linear relationship. A more general method of plotting PCs against each other is the biplot; see Jolliffe (1986, sec. 5.3) for further details.
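
To make the procedure concrete, the following Python sketch (the paper itself used `R`) runs an unrotated PCA on the correlation matrix of simulated returns and inspects the last component. Everything here is an assumption made for illustration: the six synthetic stocks, the factor structure, the seed, the helper name `last_pc_loadings`, and the 0.2 labelling threshold borrowed from the figure captions. It is not the authors' data or code.

```python
import numpy as np


def last_pc_loadings(returns: np.ndarray) -> tuple:
    """Return (smallest eigenvalue, its eigenvector) of the correlation matrix.

    returns has shape (T, N): T observations of N stocks.
    """
    corr = np.corrcoef(returns, rowvar=False)  # N x N correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)    # eigenvalues in ascending order
    return eigvals[0], eigvecs[:, 0]           # last PC = smallest eigenvalue


rng = np.random.default_rng(0)
T = 500
market = rng.standard_normal(T)  # market-wide factor affecting every stock
common = rng.standard_normal(T)  # factor shared by stocks 4 and 5 only

# Stocks 0-3: market exposure plus independent noise.
columns = [0.5 * market + rng.standard_normal(T) for _ in range(4)]
# Stocks 4-5: nearly identical series, i.e. a near-linear relationship.
columns += [0.5 * market + common + 0.1 * rng.standard_normal(T) for _ in range(2)]
returns = np.column_stack(columns)

eigval, loadings = last_pc_loadings(returns)

# Indices of stocks that would be labelled under a 0.2 loading threshold.
labelled = [i for i, w in enumerate(loadings) if abs(w) >= 0.2]
```

With this construction, the two stocks that share the `common` factor are almost perfectly correlated, and they are the only ones with large loadings on the last PC, with opposite signs. This is the signature the scatter plots are used to detect.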

## 3. Results

## 4. Conclusions

## Author Contributions

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
ANZ | ANZ Bank |
ASX 200 | Australian Stock Exchange 200 Index |
BHP | BHP Billiton |
CBA | Commonwealth Bank |
CFX | CFS Retail Property Group |
MGR | Mirvac Group |
NAB | National Australia Bank |
NVN | Novion Property Group |
NYSE | New York Stock Exchange |
PCA | Principal Component Analysis |
PC | Principal Component |
RIO | Rio Tinto Ltd |
SGP | Stockland |
STO | Santos Limited |
WBC | Westpac Banking Corporation |
WPL | Woodside Petroleum Limited |

## References

- Aggarwal, Charu C. 2013. Outlier Analysis. New York: Springer.
- Barnett, Vic, and Toby Lewis. 1994. Outliers in Statistical Data, 3rd ed. New York: Wiley.
- Billio, Monica, Mila Getmansky, Andrew W. Lo, and Loriana Pelizzon. 2012. Econometric measures of connectedness and systemic risk in the finance and insurance sectors. Journal of Financial Economics 104: 535–59.
- Ding, Chris, and Xiaofeng He. 2004. K-means Clustering via Principal Component Analysis. Paper presented at ICML ’04: Proceedings of the Twenty-First International Conference on Machine Learning, Banff, AB, Canada, 4–8 July; p. 29.
- Driessen, Joost, Bertrand Melenberg, and Theo Nijman. 2003. Common factors in international bond returns. Journal of International Money and Finance 22: 629–56.
- Hawkins, Douglas M. 1980. Identification of Outliers. Dordrecht: Springer.
- Jolliffe, Ian T. 1986. Principal Component Analysis. New York: Springer.
- Kim, Dong-Hee, and Hawoong Jeong. 2005. Systematic analysis of group identification in stock markets. Physical Review E 72: 046133.
- Kritzman, Mark, Yuanzhen Li, Sebastien Page, and Roberto Rigobon. 2011. Principal Components as a measure of systemic risk. Journal of Portfolio Management 37: 112–26.
- Pérignon, Christophe, Daniel R. Smith, and Christophe Villa. 2007. Why common factors in international bond returns are not so common. Journal of International Money and Finance 26: 284–304.
- R Core Team. 2014. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.
- Yang, Libin, William Rea, and Alethea Rea. 2016. Stock Selection with Principal Component Analysis. Journal of Investment Strategies 5: 1–21.
- Zheng, Zeyu, Boris Podobnik, Ling Feng, and Baowen Li. 2012. Changes in cross-correlations as an indicator for systemic risk. Scientific Reports 2: 888.

**Figure 1.** Scatter plots of relative weights of each stock in components 151 and 152 arising from a principal components analysis (PCA) on a correlation matrix from the whole study period. The lines join the origin to the point representing the coefficient in the 151st eigenvector (x-axis) and 152nd eigenvector (y-axis). The stocks are colour-coded using the Industry Classification Benchmark (ICB) industry classification. Financials are blue (33 stocks), Health Care are red (9 stocks), Industrials are yellow (24 stocks), Consumer Services are brown (19 stocks), Basic Materials are green (31 stocks), Oil & Gas are purple (16 stocks), Utilities are orange (5 stocks), Consumer Goods are black (9 stocks), Telecommunications are orchid (4 stocks), Technology are grey (6 stocks). Stocks with a loading of at least 0.2 in one of the PCs are labelled with their ticker symbol. From this scatter plot, we can conclude that there are six highly correlated stocks structured as three pairs: namely, STO and WPL, SGP and MGR, and BHP and CFX.

**Figure 2.** Scatter plots of relative weights of each stock in components 153 and 154 arising from a PCA on a correlation matrix from the whole study period. The lines join the origin to the point representing the coefficient in the 153rd eigenvector (x-axis) and 154th eigenvector (y-axis). The stocks are colour-coded using the colour scheme described in Figure 1. Stocks with a loading of at least 0.25 in one of the PCs are labelled with their ticker symbol. From this scatter plot, we can conclude that there are four highly correlated stocks structured as two pairs: namely, CBA (Commonwealth Bank of Australia) and ANZ (ANZ Bank), and NAB (National Australia Bank) and WBC (Westpac).

**Figure 3.** Scatter plots of relative weights of each stock in components 155 and 156 arising from a PCA on a correlation matrix from the whole study period. The lines join the origin to the point representing the coefficient in the 155th eigenvector (x-axis) and 156th eigenvector (y-axis). The stocks are colour-coded using the colour scheme described in Figure 1. Stocks with a loading of at least 0.2 in one of the PCs are labelled with their ticker symbol. From this scatter plot, we can conclude that there are six highly correlated stocks structured as three pairs: namely, CBA and WBC, NAB and ANZ, and BHP and RIO.

**Figure 4.** Time series plot of two stocks identified in the scatter plot of components 151 and 152: BHP Billiton (BHP) in basic materials and CFS Retail Property Trust Group (CFX) in the real estate industry.

**Figure 5.** Time series plots of near-linearly correlated stocks identified in the scatter plot of components 151 and 152: Mirvac Group (MGR) and Stockland Corporation Limited (SGP), two stocks in the real estate sector.

**Figure 6.** Time series plot of two stocks in the Oil & Gas industry identified in the scatter plot of components 151 and 152: Santos (STO) and Woodside Petroleum (WPL).

**Figure 7.** Time series plots of near-linearly correlated stocks identified in the scatter plots of components 153 to 156: the four big banks in Australia, namely ANZ (ANZ), Commonwealth Bank of Australia (CBA), National Australia Bank (NAB) and Westpac (WBC).

**Figure 8.** Time series plot of two stocks in Basic Materials identified in component 155: BHP and Rio Tinto (RIO).

**Table 1.** Eigenvalues and variances explained by the last six principal components. The percent of variation explained was calculated using Equation (1).

PC | Eigenvalue | Variance Explained (%) |
---|---|---|
PC151 | 0.398 | 0.255 |
PC152 | 0.380 | 0.244 |
PC153 | 0.321 | 0.206 |
PC154 | 0.304 | 0.195 |
PC155 | 0.286 | 0.183 |
PC156 | 0.244 | 0.157 |
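
Equation (1) is not reproduced in this section. Assuming it is the usual ratio of an eigenvalue to the sum of all eigenvalues, and noting that for a correlation matrix this sum equals the number of variables (156 stocks, the total of the ICB counts in Figure 1), the percentages in Table 1 can be checked with a short Python sketch. The variable names are illustrative.

```python
# Eigenvalues as reported in Table 1 (rounded to three decimals).
eigenvalues = {
    "PC151": 0.398,
    "PC152": 0.380,
    "PC153": 0.321,
    "PC154": 0.304,
    "PC155": 0.286,
    "PC156": 0.244,
}

# Assumption: the eigenvalues of a 156 x 156 correlation matrix sum to 156.
N_STOCKS = 156

# Percentage of total variance explained by each component.
variance_explained = {pc: 100.0 * ev / N_STOCKS for pc, ev in eigenvalues.items()}

for pc, pct in variance_explained.items():
    print(f"{pc}: {pct:.3f}%")
```

Under this assumption the computed values agree with the tabulated percentages up to the rounding of the three-decimal eigenvalues; the last digit can differ by one.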

Pair | Correlation | PCs |
---|---|---|
MGR-SGP | 0.71 | 151, 152 |
STO-WPL | 0.95 | 151, 152 |
BHP-RIO | 0.77 | 155 |

 | ANZ | WBC | CBA | NAB |
---|---|---|---|---|
ANZ | 1 | 0.97 | 0.96 | 0.85 |
WBC | 0.97 | 1 | 0.98 | 0.76 |
CBA | 0.96 | 0.98 | 1 | 0.73 |
NAB | 0.85 | 0.76 | 0.73 | 1 |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Yang, L.; Rea, W.; Rea, A.
Financial Insights from the Last Few Components of a Stock Market PCA. *Int. J. Financial Stud.* **2017**, *5*, 15.
https://doi.org/10.3390/ijfs5030015
