How to Explain the Cross-Section of Equity Returns through Common Principal Components

In this paper, we propose a procedure to obtain and test multifactor models based on statistical and financial factors. A major issue in the factor literature is the selection of the factors included in the model, as well as the construction of the portfolios. We address this matter using Common Principal Components, a dimensionality reduction technique designed to work with several groups of data. A block-bootstrap methodology is developed to assess the validity of the model and the significance of the parameters involved. Data come from Reuters, correspond to nearly 1250 EU companies, and span from October 2009 to October 2019. We also compare our bootstrap-based inferential results with those obtained via classical testing proposals; the methods under assessment are time-series regression and cross-sectional regression. The main findings indicate that the proposed multifactor model improves on the Capital Asset Pricing Model in terms of the adjusted $R^2$ of the time-series regressions. Cross-sectional regression results reveal that the Market factor and a factor related to Momentum and the mean of stocks' returns have positive risk premia for the analyzed period. Finally, we observe that tests based on block-bootstrap statistics are more conservative with the null than classical procedures.


Introduction
Traditionally, finance theory has relied upon the risk-return relationship, i.e., the higher the risk (usually measured through the standard deviation of returns), the higher the return. This concept is at the core of the Capital Asset Pricing Model (CAPM) (see Sharpe [1], Lintner [2], Mossin [3]), where the expected profitability of the $i$-th stock, $E(R_i)$, is represented as follows:
$$E(R_i) = r_f + \beta_i \left( E(r_m) - r_f \right),$$
where $r_f$ is the risk-free rate, $E(r_m) - r_f$ is the Market Risk Premium, and $\beta_i$ is the sensitivity of the expected excess return of the $i$-th asset. However, several authors have reported some breaches of this theory. For example, less volatile stocks seem to have higher returns (see Frazzini and Pedersen [4]), while Lintner [2] and Miller and Scholes [5] obtained certain inconsistencies when testing the model with NYSE stocks. Academia has pointed to the existence of several other factors that, beyond volatility, affect the returns of assets (in essence, investors are rewarded for bearing risks other than volatility). Some of these factors, relying on financial measures, are already considered classical and have been tested in different markets; see Fama and French [6]. Other factors, incorporating macro- or industry-related measures, such as interest rate levels (Viale et al. [7]) or the oil price (Ramos et al. [8]), have been less studied or remain undiscovered. Some researchers, like Elyasiani et al. [9] and Lemperiere et al. [10], have focused on higher-order statistical moments of returns and, recently, more sophisticated models have been built by combining such measures with others involving psychological factors, like Momentum; see Carhart [11]. Momentum, specifically, has won a place by itself among the main factors to be considered in the asset pricing literature and has been tested even in Emerging Markets (see, for example, Misra and Mohapatra [12]).
What appears to be clear is that multifactor models should explain the behavior of assets' returns better than CAPM. However, the number of proposed factors has increased dramatically in recent years. For example, Harvey et al. [13] catalogue 316 factors and note that there are additional ones that do not make it to their final list. Typically, these models take the form of the following equation:
$$E(R_i) = r_f + \boldsymbol{\beta}_i^{\top} \boldsymbol{\lambda},$$
where the expected return of the $i$-th portfolio is linearly related to a set of factor risk premia $\lambda$. Usually, this set includes CAPM's Market factor together with additional factors as independent variables. New debates arise today regarding which factors should be included in asset pricing models (see Fama and French [14] and Barillas and Shanken [15] on the methodology to choose among different models and factors, and Fama and French [16] regarding the redundancy of the value factor), whether certain factors have stopped working, and what the main characteristics of extreme performers are, that is, equities that experienced extreme return levels during a specific period (for instance, see the work of Heerden and Rensburg [17], who first choose among different factors and then apply a logistic regression to select shares and build portfolios).
When studying these multifactor models, two main difficulties may appear. First, the procedures to test the significance of the factors rely on the construction of portfolios. We use portfolios instead of stocks since the former have more stable characteristics and are less prone to missing data than the latter, and because the errors of $\alpha$ and $\beta$ are higher for individual stocks, as their volatility is higher. However, the question of which factors to select in order to build the portfolios remains unanswered. One issue is dimensionality: on the one hand, as we increase the number of portfolios to account for the various factors, the number of companies per portfolio decreases, which could be relevant for the analysis of markets not as developed as the U.S. or the Euro-Zone; on the other hand, there might also be a loss of efficiency when using too few portfolios, as the model could fail to explain the cross-section of individual assets. Another issue, as Feng et al. [18] suggest, is that selecting a few portfolios based on some characteristics could bias the results in favor of these factors. In this paper, we propose to build the portfolios using Common Principal Components (CPC), a multivariate technique developed by Flury [19]. Unlike other dimensionality reduction techniques, specifically the classical principal components that it extends, CPC was designed to be applied when the available information is organized in more than one dataset. In our case, we have several factors measured along a time period for a large group of companies. The idea is to search for a common set of orthogonal axes that capture a high percentage of the variability of the factors observed in all the companies. Using CPC, we respect the individual behavior of each company, which constitutes a group on its own, while keeping a reasonably small number of factors that explain a large part of the variability of the stocks.
At the same time, we have made an effort to interpret each of the CPC factors in terms of the traditional ones.
Second, traditional inferential procedures for multifactor models, like Fama and French [20] and Fama and MacBeth [21], strongly rely on assumptions regarding the data: uncorrelated factors over time, i.i.d. normally distributed errors over time and independent of the factors, etc. When these hypotheses are not fulfilled, classical estimators may be biased. The Bootstrap methodology was developed by Efron [22] as a resampling technique to approximate the distribution of test statistics. In the Asset Pricing literature, Cueto et al. [23] proposed to test the validity of the model and the significance of the parameters involved through a block-bootstrap procedure that accounts for time dependency. Specifically, we use bootstrap techniques to assess our time-series and cross-sectional regression models by testing, in the first place, the hypothesis that the intercepts of the time-series models that explain each of the portfolios' returns are jointly zero; if so, the models are able to explain the excess returns. In a second and third stage, the significance of the factors for each portfolio is tested, as well as that of the risk premia in the cross-sectional model. The methodological developments in the manuscript end here, since the purpose of these multifactor models in finance is to explain past assets' behavior, rather than forecasting.
The resampling procedure described here builds, among others, on that of Chou and Zhou [24], who bootstrap a Wald test for the case where residuals and asset returns are jointly i.i.d. and use a block-bootstrap for a Wald-type GMM test in the non-i.i.d. case.
The block bootstrap was explored by Grané and Veiga [25] in the computation of returns' unconditional distribution, where substantial differences in the estimates of minimum capital risk requirements were reported with respect to conditional approaches (such as GARCH-type models and stochastic volatility models), particularly for long positions and larger investment horizons.
The objectives of this paper are: (1) to propose a multifactor model based on statistical and financial factors using CPC to reduce its number of dimensions, (2) to develop nonparametric resampling procedures that account for time dependency in order to test model validity and involved parameter significance, and (3) to compare the results obtained via bootstrap-based inferential procedures with those of the classical proposals.
In particular, the financial and statistical factors considered are: Market Capitalization and Total Assets (measures of size), Price to Book ratio (measure of cheapness), Return on Assets and Return on Equity (measures of profitability), Momentum, and four statistical measures (mean, standard deviation, kurtosis, and skewness). The multifactor model with four CPC-factors is able to explain 90% of the variability of the data. The first CPC-factor is a linear combination of mean and Momentum returns; the second and third CPC-factors are linear combinations of skewness and kurtosis returns and finally, the fourth CPC-factor is the standard deviation of the return. Interestingly, none of them include the financial ratios. A possible explanation is that these ratios do not add enough variability compared to statistical factors.
The main findings are that CAPM cannot by itself explain the returns of the portfolios: $\beta$ for Market is higher for portfolios with high standard deviation (CPC4), and $\alpha$ is higher (and positive) for portfolios with high Momentum and mean (CPC1). For these time-series models, $R^2$ shows values no greater than 0.55 and, despite the wide confirmation of the Market factor in the financial literature, it is not significant in our CAPM cross-sectional regression analysis, which leads us to conclude the need to control for other factors. When we incorporate additional factors, we notice that Momentum and mean (the CPC1-factor), despite being correlated with the Market factor, and standard deviation (the CPC4-factor) help explain the cross-section of European stocks during the period considered. Now, Market $\beta$s stabilize around 0.35, and CPC1 and CPC4 mainly capture the variability that the Market factor could not explain by itself in CAPM. For these models, we observed a substantial improvement in the adjusted $R^2$, with a median value of 0.671. Apart from the estimates of the $\beta$s and $\alpha$s, which seem to be quite robust despite the relaxation of the assumptions of the model, the GRS p-value is much higher for the bootstrap (which also occurs in CAPM). Finally, in the cross-sectional regression, two factors present risk premia different from zero: Market and the factor based on mean and Momentum (the CPC1-factor). These findings lead us to conclude that the multifactor model based on CPC-factors is a good model with regard to the adjusted $R^2$ and able to explain excess returns, although in the analyzed period only one of the CPC-factors presents a positive risk premium.
The remainder of this paper is organized as follows. Section 2 contains a description of the data and methodology; specifically, in Section 2.1, we present and describe our data, while, in Section 2.2, we explain the methodology used to construct the portfolios and to test the validity of the multifactor model, that is, the classical time-series and cross-sectional methodologies and the block-bootstrap procedure. Section 3 contains the results of the analysis and the comparison of the application of classical and bootstrap inferential procedures to the data, while the final conclusions are discussed in Section 4.

Data Description
We start with 2393 European companies selected from all EU countries. As is usual in the literature on factor models, we excluded Financials, as they usually have high leverage ratios, which affects several financial ratios. The database comprises monthly data from Oct-2009 to Oct-2019 and includes prices and several financial ratios, such as Market capitalization, Price to Book and Price to Earnings ratios, number of shares outstanding, Return on Assets and Return on Equity, number of shares traded, and Total Assets. We apply several filters to the data: first, we delete all companies with less than 30 months of complete data (which leaves 1305 companies); second, we apply a transaction filter, excluding companies with no transactions for a whole semester; third, companies with no Market Cap information for more than 2 consecutive years were excluded; finally, companies with non-positive Equity at the end of any year were excluded. After all these filters, we end up with a final set of 1230 companies. Next, monthly returns were computed for all the companies, and the Risk-Free Rate ($r_f$) and the Market return ($r_m$) were estimated, respectively, by means of the 2-year German Bond yield and the STOXX 60. Monthly returns were computed as the natural logarithm of the quotient between the market price and the market price in the previous month.
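The log-return computation above can be sketched as follows; the price series here is hypothetical:

```python
import numpy as np

# Hypothetical monthly closing prices for one stock
prices = np.array([10.0, 10.5, 10.2, 11.0])

# Monthly log return: the natural logarithm of the quotient between the
# price and the price in the previous month, i.e. ln(P_t / P_{t-1})
log_returns = np.diff(np.log(prices))
```

Working with differences of logarithms makes multi-period returns additive, which simplifies the 12-month Momentum computation used later.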
In Table 1, we show some summary statistics by year. In particular, we give information on the number of companies, the number of months, and summary statistics on prices by year. Figure 1 contains a graphical description of the data.  This behavior is related to the 2008 Global Financial Crisis and can be seen more clearly in Figure 1. In terms of standard deviation, we can observe a similar pattern, although the slope in the decrease is more abrupt than for the mean values.

Factors under Study
Cueto et al. [23] introduce three new factors based on statistical measurements on stock prices. Here, we consider such factors calculated on stocks' returns together with a Momentum variable, which is equal to the 12-month logarithmic return of prices, Market Capitalization and Total Assets (measures of size), Price to Book ratio (measure of cheapness), Return on Assets, and Return on Equity (measures of profitability). For all the financial ratios, we take the value appearing 6 months in advance except for Total Assets, which corresponds to the increment of this measure over the previous 12 months. This way, we incorporate all the factors included in the 5-factor model by Fama and French [16].
Regarding the statistical measures, for all the stocks in the sample we applied a rolling window over the previous 12 observations. These measures are:
• Mean of returns for each year and company:
$$\bar{r}_i = \frac{1}{12} \sum_{t=1}^{12} r_{i,t}.$$
• Standard Deviation (SD) of returns for each year and company:
$$s_i = \sqrt{\frac{1}{11} \sum_{t=1}^{12} \left( r_{i,t} - \bar{r}_i \right)^2}.$$
• Excess Kurtosis (Kurt) of returns for each year and company:
$$\mathrm{Kurt}_i = \frac{\frac{1}{12} \sum_{t=1}^{12} \left( r_{i,t} - \bar{r}_i \right)^4}{s_i^4} - 3,$$
which compares the fatness of the distribution tails with respect to those of a normally distributed random variable. Positive values indicate heavier tails than those of the gaussian distribution, whereas negative values indicate thinner tails than the gaussian one.
• Skewness (Skew) of returns for each year and company:
$$\mathrm{Skew}_i = \frac{\frac{1}{12} \sum_{t=1}^{12} \left( r_{i,t} - \bar{r}_i \right)^3}{s_i^3},$$
which measures the asymmetry of the distribution. It takes positive values for positively skewed distributions, that is, when the right tail is longer than the left one, and it takes negative values when the contrary happens. It is roughly zero when both tails are similar (symmetrical distribution).
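The four rolling statistical measures can be computed directly with pandas; the return series below is simulated (hypothetical), and the exact normalization constants may differ slightly from the formulas above since pandas uses sample (bias-adjusted) estimators:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical monthly returns for one company
returns = pd.Series(rng.normal(0.01, 0.05, size=60))

window = 12  # previous 12 observations, as in the paper
rolling_mean = returns.rolling(window).mean()
rolling_sd = returns.rolling(window).std()
rolling_skew = returns.rolling(window).skew()
rolling_kurt = returns.rolling(window).kurt()  # excess kurtosis (Fisher)
```

The first eleven entries of each rolling series are undefined (NaN), so a company needs at least twelve months of data before its statistical factors exist, consistent with the 30-month completeness filter applied to the sample.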
As can be seen in Table 2, Market is positively correlated with Momentum, mean, skewness, and Total Assets, while it is negatively correlated with the remaining factors. This could be useful for investors willing to invest in uncorrelated portfolios (uncorrelated in terms of factors, not assets). Notice further that correlations are mainly low, except for Momentum and mean, which are highly correlated. Such an issue might lead to multicollinearity in models considering both factors and, hence, to greater standard errors and larger confidence intervals for the model coefficients.

Before analyzing whether certain factors manage to explain the expected return of a set of portfolios, we must first form the portfolios. In order to do so, we use the Common Principal Components technique, introduced by Flury [19] as a generalization of principal components to the case of several groups. The basic assumption in the CPC-model is that the principal component transformation is identical in all the considered groups, while the variances associated with the components may vary between groups. This transformation can be viewed as a rotation yielding variables that are "as uncorrelated as possible" simultaneously in several groups. This distribution-free property justifies the application of the CPC-model to non-normal data. The CPC-model can also be justified from the principle of parsimony, since the number of parameters to be estimated is less than in other usual multivariate methods, which leads to more stable estimations (in the sense of low standard errors). Thus, the underlying idea of the CPC-model is to represent in the same common orthogonal axes several groups of individuals/objects (of possibly different sample sizes) for which the same number of variables/measurements have been observed. In our case, it seems reasonable to consider a model in which the same factors occur in different, but related companies.
Thus, we can take as variables the ten factors under consideration and groups correspond to the 1230 companies. That is, each group is formed by a particular company and its dataset is composed by the observations of the factors for this particular company along the time period under consideration.
As in classical principal component analysis, the goal is to determine a number of uncorrelated linear combinations of the variables that maximize their variance for each company. In this case, however, despite the fact that the linear combinations will be the same for all companies, the associated variances to each component may change among them, which results in a reduction of the number of parameters to estimate when maximizing the variance explained by the model.
In the setup of the problem, we have $f$ variables $x_1, \dots, x_f$ (factors) observed on $p$ companies along time periods of sizes $n_1, \dots, n_p$. Given $S_1, \dots, S_p$, the variance-covariance matrices of the $f$ variables for each company, we would like to find an orthogonal matrix $U$ and $p$ diagonal matrices $\Lambda_1, \dots, \Lambda_p$ such that:
$$U^{\top} S_i U = \Lambda_i, \quad i = 1, \dots, p,$$
where matrix $U$ is formed by the common eigenvectors of $S_1, \dots, S_p$, and each diagonal matrix $\Lambda_i$ contains the eigenvalues of $S_i$ in descending order. Note that this model may not exist: given two positive definite symmetric matrices, their eigenvectors coincide if and only if the matrices fulfill the commutative property. So, in general, matrices $S_i$ will not share the same eigenvectors, unless they commute.
Nevertheless, this problem is solved numerically: the idea is to find an orthogonal matrix $U$ and $p$ diagonal matrices $\Lambda_i$ such that each $U \Lambda_i U^{\top}$ is as similar as possible to $S_i$. Thus, the CPC-model can be viewed as a rotation yielding variables that are "as uncorrelated as possible" simultaneously in the $p$ groups.
To determine how similar they are, Flury and Gautschi [26] propose a numerical algorithm that minimizes the following discrepancy measure of "simultaneous diagonalizability":
$$F(U) = \prod_{i=1}^{p} \left( \frac{\det\left( \operatorname{diag}(U^{\top} S_i U) \right)}{\det\left( U^{\top} S_i U \right)} \right)^{n_i}, \tag{1}$$
which naturally arises in the context of maximum likelihood estimation in principal component analysis of several groups under the assumption of multivariate normality; see Flury [27].
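The discrepancy measure can be evaluated directly; the following sketch (with hypothetical covariance matrices and a helper name, `fg_discrepancy`, of our own choosing) illustrates that it equals 1 exactly when $U$ diagonalizes every group covariance simultaneously, and exceeds 1 otherwise:

```python
import numpy as np

def fg_discrepancy(U, covs, weights):
    """Discrepancy measure of simultaneous diagonalizability:
    product over groups of (det diag(U'SU) / det(U'SU))^n_i.
    By Hadamard's inequality each factor is >= 1, with equality
    iff U'SU is diagonal."""
    total = 1.0
    for S, n in zip(covs, weights):
        T = U.T @ S @ U
        total *= (np.prod(np.diag(T)) / np.linalg.det(T)) ** n
    return total

# Two diagonal (hence commuting) matrices share eigenvectors,
# so the identity rotation gives a discrepancy of exactly 1
S1 = np.diag([3.0, 1.0])
S2 = np.diag([2.0, 5.0])
U = np.eye(2)
val = fg_discrepancy(U, [S1, S2], [1, 1])
```

Any rotation away from the common eigenvectors strictly increases the measure, which is what the minimization exploits.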
The second part of Proposition 1 is straightforward, while the bound for $F(U)$ follows from Hadamard's inequality, which we present in Lemma 3 after two preliminary results.

Lemma 1. For any nonnegative numbers $\lambda_1, \dots, \lambda_f$, the geometric mean is bounded by the arithmetic mean: $\left( \prod_{k=1}^{f} \lambda_k \right)^{1/f} \le \frac{1}{f} \sum_{k=1}^{f} \lambda_k$.

Lemma 2. For any $f \times f$ matrix of correlations $R$, we have $\det(R) \le 1$.

Proof. Let $\lambda_1, \dots, \lambda_f$ be the eigenvalues of $R$. Applying the inequality between the geometric and arithmetic means given in Lemma 1, we have that
$$\det(R) = \prod_{k=1}^{f} \lambda_k \le \left( \frac{1}{f} \sum_{k=1}^{f} \lambda_k \right)^{f} = \left( \frac{\operatorname{tr}(R)}{f} \right)^{f} = 1,$$
since the trace of a correlation matrix equals $f$.

Lemma 3 (Hadamard's inequality). For any $f \times f$ positive definite symmetric matrix $S$ with diagonal entries $s_{11}, \dots, s_{ff}$, we have $\det(S) \le \prod_{k=1}^{f} s_{kk}$.

Proof. Let $S$ be an $f \times f$ positive definite symmetric matrix and $D = \operatorname{diag}(S)$; then $R = D^{-1/2} S D^{-1/2}$ is an $f \times f$ matrix of correlations, from which we can recover $S = D^{1/2} R D^{1/2}$ and compute its determinant as follows:
$$\det(S) = \det(D) \det(R) \le \det(D) = \prod_{k=1}^{f} s_{kk},$$
since, by Lemma 2, $\det(R) \le 1$.
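Hadamard's inequality of Lemma 3 is easy to check numerically on a randomly generated positive definite matrix:

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.normal(size=(5, 5))
S = A @ A.T + 5 * np.eye(5)  # positive definite symmetric matrix

det_S = np.linalg.det(S)
diag_prod = np.prod(np.diag(S))  # Hadamard bound: det(S) <= prod of diagonal
```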

Preliminary Setup of the Algorithm to Compute the CPC-Model
Flury and Gautschi [26] propose an algorithm to solve a system of equations that leads to the minimizer of the function $F(\cdot)$ in (1) among $f \times f$ orthogonal matrices, for any given $f \times f$ positive definite symmetric matrices $S_1, \dots, S_p$.
In the first place, observe that the denominator in (1) is constant, since $U$ is orthogonal and hence
$$\det\left( U^{\top} S_i U \right) = \det(S_i), \quad i = 1, \dots, p.$$
As a consequence, any minimizer of $F(U)$ is also a minimizer of its numerator, which we will denote by $G(U)$. Taking logarithms, we have that
$$\log G(U) = \sum_{i=1}^{p} n_i \sum_{k=1}^{f} \log\left( u_k^{\top} S_i u_k \right).$$
Next, the extreme values of $\log G(U)$ are computed, with the restriction that $U$ is an orthogonal matrix, via the Lagrangian
$$L(U) = \log G(U) - \sum_{k=1}^{f} \mu_k \left( u_k^{\top} u_k - 1 \right) - \sum_{k < j} \mu_{kj}\, u_k^{\top} u_j, \tag{2}$$
where $\mu_k$, $\mu_{kj}$, $1 \le k, j \le f$, $k \ne j$, are Lagrange multipliers. It can be assumed that $\mu_{kj} = \mu_{jk}$, since $u_k^{\top} u_j = u_j^{\top} u_k$. Differentiating (2) with respect to $u_k$, we have that
$$\sum_{i=1}^{p} n_i \frac{2 S_i u_k}{\lambda_{ik}} - 2 \mu_k u_k - \sum_{j \ne k} \mu_{kj} u_j = 0,$$
where $\lambda_{ik} = u_k^{\top} S_i u_k$ is the $k$-th diagonal element of $U^{\top} S_i U$. Pre-multiplying the previous expression by $u_j^{\top}$, $j \ne k$, we have that:
$$2 \sum_{i=1}^{p} n_i \frac{u_j^{\top} S_i u_k}{\lambda_{ik}} = \mu_{kj}.$$
Analogously, differentiating (2) with respect to $u_j$ and pre-multiplying the resulting expression by $u_k^{\top}$, $k \ne j$, one obtains that
$$2 \sum_{i=1}^{p} n_i \frac{u_k^{\top} S_i u_j}{\lambda_{ij}} = \mu_{jk}.$$
Subtracting both expressions leads to the following system of equations:
$$u_k^{\top} \left( \sum_{i=1}^{p} n_i \frac{\lambda_{ik} - \lambda_{ij}}{\lambda_{ik} \lambda_{ij}} S_i \right) u_j = 0, \quad k \ne j,$$
whose solutions are the columns of matrix $U$, from which it is possible to obtain the matrices $\Lambda_i$. Finally, the system of equations is efficiently solved by means of the aforementioned algorithm of Flury and Gautschi [26], which is a generalization of the well-known Jacobi method for computing eigenvectors and eigenvalues of a single symmetric matrix. The CPC algorithm is implemented in the R package multigroup, designed to study multigroup data, where the same set of variables is measured on different groups of individuals. Within this package, we specifically use the function FCPCA to perform the CPC calculation.
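The paper solves this system with the Flury-Gautschi algorithm via the R function FCPCA. As an illustrative alternative (a sketch, not the FG algorithm), the criterion $\log G(U)$ can be minimized directly over orthogonal matrices by parameterizing $U$ as the matrix exponential of a skew-symmetric matrix; the two group covariances below are constructed (hypothetically) to share eigenvectors, so an exact common diagonalizer exists:

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

def log_G(theta, covs, weights, f):
    # Parameterize U = expm(A) with A skew-symmetric, so U is orthogonal
    A = np.zeros((f, f))
    A[np.triu_indices(f, 1)] = theta
    A = A - A.T
    U = expm(A)
    # log G(U) = sum_i n_i * sum_k log(u_k' S_i u_k)
    val = 0.0
    for S, n in zip(covs, weights):
        val += n * np.sum(np.log(np.diag(U.T @ S @ U)))
    return val

# Two hypothetical group covariance matrices sharing eigenvectors Q
Q, _ = np.linalg.qr(np.random.default_rng(1).normal(size=(3, 3)))
S1 = Q @ np.diag([5.0, 2.0, 1.0]) @ Q.T
S2 = Q @ np.diag([1.0, 4.0, 2.0]) @ Q.T

f = 3
res = minimize(log_G, x0=np.zeros(f * (f - 1) // 2),
               args=([S1, S2], [1.0, 1.0], f), method="BFGS")
A = np.zeros((f, f)); A[np.triu_indices(f, 1)] = res.x; A = A - A.T
U_hat = expm(A)  # estimated common eigenvectors (up to sign and order)
```

General-purpose minimization is far less efficient than the FG pairwise-rotation sweeps, but it makes the objective and the orthogonality constraint explicit.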

Portfolio Construction
In order to build the portfolios in a rational manner, we first take the proposed financial and statistical measures and standardize them to zero mean and unit variance. We recall that these measures are: Market Capitalization and Total Assets (measures of size), Price to Book ratio (measure of cheapness), Return on Assets and Return on Equity (measures of profitability), Momentum, and the statistical measures already mentioned.
Then, we search for a common pattern across all companies, attending to all measures or factors. Thus, we compute the CPC-model with the aim of obtaining a few uncorrelated components that explain as much as possible of the variability of the ten measures included in the analysis for all companies. We selected the first four principal components, since the average percentage of variability explained by them was 90%. Additionally, to check the robustness of the CPC loadings, we estimated them by a bagging procedure [28]. Specifically, we selected groups of 100 companies (without replacement) for which we calculated the CPC-model. After repeating this procedure 5 times, we present the results and compare them with the CPC-model computed with all the data in Table 3. Regarding the bagging results of Table 3, the CPC-loadings changed signs in some iterations and exchanged CPC2 for CPC3 in others (the percentage of variability explained by these two components is very similar, around 22%). Such an exchange in some of the components is a well-known problem and is related to the Flury-Gautschi algorithm used to solve (1) [27]. We proceeded to change signs accordingly in order to present consistent results.
Observing the CPC-loadings, we see that the first CPC is a linear combination of mean and Momentum returns, the second and third CPCs are linear combinations of skewness and kurtosis of returns and, finally, the fourth CPC is the standard deviation of returns. In what follows, we call these components CPC1 to CPC4. Interestingly, none of these CPCs include the financial ratios.

Portfolio Setup with the CPC Model
Next, we obtain the representation of each company in the CPC model (multiplying the loadings by the standardized variables), compute percentiles for each component, and assign companies to portfolios accordingly, following [6]. In Table 4, we summarize the resulting portfolios, which are updated monthly based on the previous month's measurements. For each portfolio, we calculate the average monthly returns. Additionally, each factor is computed as the excess return of the higher portfolio in each category minus the return of the lower portfolio. All returns are calculated for equally-weighted portfolios at t + 1. Figure 2 contains the cumulative returns of the portfolios. As we commented before, it is interesting that none of the CPCs include financial measures, which might occur because these ratios do not add enough variability compared to statistical factors. However, in a way, statistical measures could also be capturing some of the financial characteristics of the stocks, as we see in Table 5, which includes average factors for each of the portfolios considered and the standard deviation, kurtosis, and skewness of the returns. We notice, for example, that portfolios with high CPC1 present, on average, higher PB ratio, higher ROA and ROE, lower kurtosis, and positive skew. Portfolios with high CPC4 show, on average, lower PB ratio, lower ROA and ROE, lower kurtosis, and higher skew.
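The percentile-based assignment can be sketched as follows. For brevity, this hypothetical example uses only two CPC scores with median splits, giving $2 \times 2 = 4$ portfolios; with four components and the same binary split, one obtains the $2^4 = 16$ portfolios analyzed later:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n_companies = 200

# Hypothetical standardized CPC scores (loadings times standardized factors)
scores = pd.DataFrame({
    "CPC1": rng.normal(size=n_companies),
    "CPC4": rng.normal(size=n_companies),
})

# Split each CPC score at its median into Low/High buckets
labels = {}
for col in scores:
    labels[col] = pd.qcut(scores[col], q=2, labels=["Low", "High"])

# Each company lands in one of the 2 x 2 = 4 intersection portfolios
portfolio = labels["CPC1"].astype(str) + "-" + labels["CPC4"].astype(str)
counts = portfolio.value_counts()
```

In the paper the assignment is refreshed every month from the previous month's measurements, so companies migrate between portfolios over time.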

Classical Methodologies
Once we have defined the $N$ portfolios that we are about to analyze, we perform a time-series (TS) regression for each of them:
$$R_{it} - r_{ft} = \alpha_i + \sum_{j=1}^{K} \beta_{ij} f_{jt} + \varepsilon_{it}, \quad i = 1, \dots, N.$$
We use the GRS test by Gibbons et al. [29] to assess the ability of a model to explain excess returns. The null hypothesis of this test is $H_0\colon \alpha_1 = \dots = \alpha_N = 0$ and, assuming that the errors $\varepsilon_{it}$ are normally distributed, the test statistic is given by:
$$\mathrm{GRS} = \frac{T - N - K}{N} \left( 1 + \bar{f}^{\top} \hat{\Sigma}_f^{-1} \bar{f} \right)^{-1} \hat{\alpha}^{\top} \hat{\Sigma}^{-1} \hat{\alpha} \sim F_{N,\, T - N - K},$$
where $K$ is the number of factors, $\bar{f}$ is the vector of factor sample means, $\hat{\Sigma}_f$ is the factors' covariance matrix, and $\hat{\Sigma}$ is the residual covariance matrix. The purpose of the test, whose statistic also takes into account the sampling error of the estimates of $\beta_i$, is to determine whether the $\alpha_i$ are jointly zero, assuming that the joint distribution of returns and factors is multivariate normal. As suggested by Fama [30], the GRS test is against an unspecified alternative, both for portfolios and factors. On the one hand, the model may pass the test for one set of portfolios but fail for another. On the other hand, we do not specify additional factors that could produce a violation of the model (however, we get some intuition on which factors should be included in or excluded from the model by observing the number of portfolios whose coefficients are significantly different from zero). Then, we run a cross-sectional regression to estimate the vector of risk premia $\lambda$, with the model obtained by taking expectations in the time-series regression equation:
$$E(R_i) - r_f = \boldsymbol{\beta}_i^{\top} \boldsymbol{\lambda} + \alpha_i, \quad i = 1, \dots, N,$$
where the estimates of $\beta_i$ are those obtained in the TS regression at the first step. In conclusion, $\hat{\lambda}$ contains the slope coefficients in the cross-sectional regression, which is run without intercept, while the $\hat{\alpha}_i$ are the residuals of the cross-sectional regression. If the estimated $\beta$s are important determinants of average returns, then the risk premia $\lambda_j$ should be statistically significant. We have used the covariance of the residuals of the TS regression to calculate the standard errors of the $\hat{\lambda}_j$, in order to take into account the correlation across assets.
Additionally, the standard errors of the $\hat{\lambda}_j$ must include the error of estimating $\beta$ [31], the so-called Shanken correction, although the difference may be very small in practice. Finally, we use t-statistics to test the significance of each $\lambda_j$.
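The GRS statistic above translates directly into code; the following sketch (the helper `grs_test` and the simulated inputs are hypothetical, and $\hat{\Sigma}$ is estimated with a $T - K - 1$ divisor, one of several common conventions) returns the statistic and its $F_{N,\,T-N-K}$ p-value:

```python
import numpy as np
from scipy import stats

def grs_test(alpha_hat, resid, factors):
    """GRS test that all time-series intercepts are jointly zero.
    alpha_hat: (N,) intercepts; resid: (T, N) demeaned residuals;
    factors: (T, K) factor realizations."""
    T, N = resid.shape
    K = factors.shape[1]
    f_bar = factors.mean(axis=0)
    Sigma_f = np.cov(factors, rowvar=False, ddof=1).reshape(K, K)
    Sigma = resid.T @ resid / (T - K - 1)  # residual covariance matrix
    shrink = 1.0 + f_bar @ np.linalg.solve(Sigma_f, f_bar)
    grs = ((T - N - K) / N / shrink
           * (alpha_hat @ np.linalg.solve(Sigma, alpha_hat)))
    p_value = 1.0 - stats.f.cdf(grs, N, T - N - K)
    return grs, p_value

# Hypothetical example simulated under the null (true alphas are zero)
rng = np.random.default_rng(0)
T, N, K = 120, 4, 1
factors = rng.normal(0.005, 0.04, size=(T, K))
resid = rng.normal(0.0, 0.02, size=(T, N))
alphas = resid.mean(axis=0)
grs, p = grs_test(alphas, resid - resid.mean(axis=0), factors)
```

Under the null and the normality assumption, the p-value should be roughly uniform across repeated simulations.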

Resampling Techniques: Bootstrap
As in Reference [23], we follow a block-bootstrapping pairs scheme to preserve the time correlation of the data, and take B bootstrap samples of block size b = 2 months. We describe briefly the procedure and refer to the original paper for further details.

• Step 1: Estimate the benchmark regression models, one for each portfolio. For each portfolio, save $\hat{\alpha}$, $\hat{\beta}$, their corresponding t-statistics, the residuals, the risk factors' estimates, and the GRS statistic.
• Step 2: Produce a set of simulated time indices, equal for each portfolio, in order to preserve the returns' cross-correlations.
• Step 3: Build a new series of $\alpha$-free portfolio returns by using the simulated time indices.
• Step 4: Run the time-series factor model regression on the artificially constructed returns. Calculate $\hat{\alpha}$, $\hat{\beta}$, and the corresponding confidence intervals. Next, generate samples of the GRS statistic, compute different percentiles from the bootstrapped distribution, and compare them to the original GRS statistic.
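The index generation of Step 2 can be sketched as follows; `block_bootstrap_indices` is a hypothetical helper, assuming overlapping blocks of length b drawn with replacement from the original time axis:

```python
import numpy as np

def block_bootstrap_indices(T, b, rng):
    """Draw a resampled sequence of T time indices in contiguous blocks of
    length b. Applying the SAME indices to every portfolio preserves the
    cross-sectional correlation of returns, while the blocks preserve
    short-range time dependence."""
    n_blocks = int(np.ceil(T / b))
    starts = rng.integers(0, T - b + 1, size=n_blocks)
    idx = np.concatenate([np.arange(s, s + b) for s in starts])
    return idx[:T]

rng = np.random.default_rng(123)
idx = block_bootstrap_indices(T=108, b=2, rng=rng)
```

With b = 2, consecutive index pairs always come from adjacent months of the original sample, which is the sense in which the scheme accounts for time dependency.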
In this paper, we improve the methodology proposed in our previous work. We have noticed that, given the construction of the GRS statistic, when we face portfolios with heteroskedasticity, the estimation of the variance through the residual covariance matrix may not be appropriate. Thus, we propose a new statistic, $Q(\hat{\alpha}) = \hat{\alpha}^{\top} \hat{\Sigma}_{\hat{\alpha}}^{-1} \hat{\alpha}$, for which we calculate the covariance matrix of the $\hat{\alpha}$s, $\hat{\Sigma}_{\hat{\alpha}}$, through a nested bootstrap. Then, a new step appears:
• Step 5: Run the time-series factor model regression on the artificially constructed returns. Calculate $\hat{\alpha}$, $\hat{\beta}$, and the corresponding confidence intervals. Finally, generate bootstrap samples of the $Q(\hat{\alpha})$ statistic, whose covariance matrix of the $\hat{\alpha}$s is approximated by means of a nested bootstrap, and compare the bootstrapped statistics with the original $Q(\hat{\alpha})$ statistic.
Concerning the cross-sectional regression, we use the $\hat{\beta}$s and average returns from each bootstrapped sample to determine the significance of the risk premia estimates ($\hat{\lambda}$s). Given that the $\beta$s are estimated, the estimates of the $\lambda$s might present substantial bias, and we use a reverse bootstrap percentile interval to determine the significance of the factors. To be consistent, we also present this type of bootstrap interval for all the estimated parameters.
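The reverse (also called basic) bootstrap percentile interval mentioned above reflects the bootstrap quantiles around the point estimate; a minimal sketch with hypothetical replicates of a risk premium estimate:

```python
import numpy as np

def reverse_percentile_ci(theta_hat, boot_thetas, alpha=0.05):
    """Reverse (basic) bootstrap interval: reflect the bootstrap quantiles
    around the original estimate, (2*theta_hat - q_{1-a/2}, 2*theta_hat - q_{a/2}),
    which corrects for first-order bias in the bootstrap distribution."""
    lo_q, hi_q = np.quantile(boot_thetas, [alpha / 2, 1 - alpha / 2])
    return 2 * theta_hat - hi_q, 2 * theta_hat - lo_q

# Hypothetical bootstrap replicates of a lambda estimate
rng = np.random.default_rng(5)
boot = rng.normal(0.004, 0.002, size=2000)
low, high = reverse_percentile_ci(0.004, boot)
```

A risk premium is then deemed significant at level alpha when the resulting interval excludes zero.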
As this is a computer-intensive method, we implemented the previous procedure in R using the packages doParallel and foreach, designed for multi-core computation. The main reason for using the foreach package is that it supports parallel execution; that is, repeated operations can be executed on multiple cores of a computer or on multiple nodes of a cluster, thus reducing the execution time.

Time-Series Regressions
In this section, we present the results for the time-series regression of two models. The first one is the CAPM model and the second one is a multifactor model including Market and four factors determined by CPC1 to CPC4. The dependent variables are the returns of the 16 portfolios.

Model 1: CAPM
When we analyze a 1-factor model, taking only the Market factor into consideration, we notice that the p-value of the GRS statistic is 1.93%; thus, we reject that all $\alpha$s are jointly zero at the 5% significance level. This could be an indication that other factors are missing, because the portfolios are sorted to have greater cross-sectional differences; but, given that this is a well-studied factor, we would like to understand whether this rejection may be due to an anomalous period of time and/or to a breach of certain assumptions of the model. First, we split the 108 months into two periods: from month 1 to 80 (first subperiod) and from month 30 to 108 (second subperiod), both with approximately the same number of months. We find that the GRS p-value for the first subperiod is 3.43%, while the p-value for the second subperiod is 0.24%. This could indicate that the CAPM may not have behaved correctly during the second period, while global Central Banks were injecting huge amounts of liquidity into the system.
Additionally, in order to check whether the OLS hypotheses are fulfilled (normality, homoscedasticity, and independence), we performed several analyses following Soumaré et al. [32]. The results are presented in Table 6, where we show the different statistics analyzed, with their p-values in parentheses, and the number of portfolios for which we reject the null at a 5% significance level for the different tests. A Shapiro-Wilk test was run on each set of residuals, suggesting that, in general, if we consider each subperiod individually, the residuals of the time-series regressions are normally distributed. This changes when we consider the whole period, where 6 portfolios have a p-value lower than 5% in the test. Generally, these are portfolios with high mean and Momentum (CPC1) and low standard deviation (CPC4).
Additionally, we performed a Breusch-Pagan test on the residuals of the regressions, showing that the homoscedasticity assumption is violated for portfolios 9, 10, 11 and 13 (all of them share the characteristic of having a high mean and Momentum (CPC1) and, generally, present low levels of the remaining CPCs). Interestingly, during the second subperiod, we reject homoscedasticity only for portfolio 16. In Figure 3, we show the autocorrelation charts for the errors of the CAPM time-series regression over the whole period. The charts indicate that residuals are especially correlated at lags equal to or higher than 9. The Durbin-Watson test also suggests that errors might be autocorrelated at lag 9 for 5 portfolios when considering the whole period (generally, those with low mean and Momentum (CPC1) and high standard deviation (CPC4)), for 4 in the first subperiod and for 8 in the second one (p-values lower than 5%). These three breaches (non-normality, heteroscedasticity, and autocorrelation) suggest that the bootstrap procedure developed in Reference [23] might be useful to approximate the distribution of the GRS statistic. Applying this procedure (see Table 6), we obtained a bootstrap p-value of 17.4% for the whole period, so we cannot reject that the αs are jointly zero. Something similar happens when we consider the first subperiod (GRS p-value of 8.2%) and the second subperiod (GRS p-value of 17.8%). However, we do not feel entirely comfortable with these results, as the bootstrap p-values are far from those of the traditional methodology; recall that the GRS test estimates Σ through the residual covariance matrix. We suspect that the existence of heteroscedasticity in some of the portfolios could make this estimation inaccurate. Thus, we propose a new statistic, Q(α) = α̂′ Σ̂⁻¹ α̂, where Σ̂, the covariance matrix of the α̂s, is calculated through a nested bootstrap.
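The three residual diagnostics applied in this subsection (Shapiro-Wilk for normality, Breusch-Pagan for homoscedasticity, Durbin-Watson for autocorrelation) can be run per portfolio with standard tools. A minimal sketch, with the Breusch-Pagan LM statistic computed by hand and all names illustrative rather than the authors' code:

```python
import numpy as np
from scipy import stats

def residual_diagnostics(resid, X):
    """Shapiro-Wilk, Breusch-Pagan and Durbin-Watson on one portfolio's residuals.

    resid : (T,) OLS residuals of one time-series regression
    X     : (T, p) regressor matrix used in that regression (first column = intercept)
    """
    # Normality: Shapiro-Wilk on the residuals
    sw_stat, sw_p = stats.shapiro(resid)

    # Homoscedasticity: Breusch-Pagan LM test, regressing squared
    # residuals on the regressors; LM = T * R^2 ~ chi2(p - 1) under the null
    T, p = X.shape
    e2 = resid ** 2
    b, *_ = np.linalg.lstsq(X, e2, rcond=None)
    fitted = X @ b
    r2 = 1 - ((e2 - fitted) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
    bp_p = stats.chi2.sf(T * r2, p - 1)

    # Autocorrelation: Durbin-Watson statistic (values near 2 indicate no
    # lag-1 autocorrelation; the paper also inspects higher lags)
    dw = (np.diff(resid) ** 2).sum() / (resid ** 2).sum()
    return {"shapiro_p": sw_p, "bp_p": bp_p, "durbin_watson": dw}
```

Looping this over the 16 sets of residuals, and counting p-values below 5%, reproduces the kind of summary reported in Table 6.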
The results obtained with the Q(α) statistic reinforce the conclusions reached with the GRS bootstrap, since the p-values of this new statistic are even higher than those of the former (between 1.6 and 3.3 times). Thus, we do not reject that the αs are jointly zero at a 5% significance level.
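The Q(α) statistic itself is simple to form once bootstrap replications of the pricing errors are available; the resampling scheme that produces them is the nested block-bootstrap of Reference [23], which we do not reproduce here. A sketch of the statistic only, with a simulated replication matrix standing in for real bootstrap output:

```python
import numpy as np

def q_alpha(alpha_hat, alpha_boot):
    """Q(alpha) = alpha' Sigma^{-1} alpha, with Sigma (the covariance matrix
    of the alphas) estimated from bootstrap replications.

    alpha_hat  : (N,) estimated alphas from the original sample
    alpha_boot : (B, N) alphas re-estimated on B (nested) bootstrap samples
    """
    Sigma = np.cov(alpha_boot, rowvar=False)    # (N, N) covariance of the alphas
    return alpha_hat @ np.linalg.solve(Sigma, alpha_hat)
```

The bootstrap p-value is then the share of (recentred) replications whose statistic exceeds the observed Q(α).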
In Table 7, we present the results for the classical and the bootstrap methodologies for Model 1. The "Estimates" columns contain the estimates of α and of β for Market. The classical t-statistics for these estimates are in columns 4 and 5, columns 6 and 7 correspond to the basic bootstrap confidence intervals, and R² is reported in the final column. A graphical comparison is given in Figure 4. Regarding the estimates of the model, we find that the coefficient for Market is always positive and statistically significant both for the classical methodology and for the resampling technique. The estimate varies between 0.45 and 0.95 and is, in general, higher for portfolios with high standard deviation (CPC4). This confirms the traditional relationship between expected return and β explained by the CAPM. However, as we will see later, and as different studies have suggested, this relationship does not hold once we include new factors and review the cross-sectional regression. The coefficients of determination indicate that this model explains between 19.0% and 52.6% of the variability of the portfolios' returns. Moreover, under the classical inferential techniques, we find that α, which is positive for portfolios with high mean and Momentum (CPC1) and negative otherwise, is not statistically significant (except for portfolio 15). The resampling technique finds more portfolios where α is statistically significant. As we have already discussed, the GRS test rejects, at a 5% significance level, that all pricing errors are equal to zero, which might indicate that other factors are missing.
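The basic bootstrap intervals reported in columns 6 and 7 invert the bootstrap distribution around the point estimate. A minimal sketch of the interval construction, assuming replications of a single coefficient are already available (names are illustrative):

```python
import numpy as np

def basic_bootstrap_ci(theta_hat, theta_boot, level=0.95):
    """Basic bootstrap interval [2*theta_hat - q_hi, 2*theta_hat - q_lo].

    theta_hat  : point estimate from the original sample
    theta_boot : (B,) bootstrap replications of the estimator
    """
    a = (1 - level) / 2
    q_lo, q_hi = np.quantile(theta_boot, [a, 1 - a])
    return 2 * theta_hat - q_hi, 2 * theta_hat - q_lo
```

A coefficient is declared significant at the corresponding level when its interval excludes zero, which is how significance under the resampling technique is read off in Tables 7 and 10.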

Model 2: Market and Factors CPC1 to CPC4
When we analyze the 5-factor model, we notice that the p-value of the GRS statistic is 0.90%, lower than in the previous model, again rejecting, at a 5% significance level, that all αs are jointly zero. This could indicate that the model does not correctly capture the variability of the portfolios' returns.
As before, we split the 108 months in two periods: from 1 to 80 (first subperiod) and from 30 to 108 (second subperiod). We find that the GRS p-value for the first subperiod is 5.19%, while the p-value for the second subperiod is 0.38%. This could indicate that this model has not behaved correctly during the second subperiod, as we discussed for the previous model. Results for the GRS p-values (both for the classical and the resampling methodology), the Q(α) p-value, and the normality, heteroscedasticity, and autocorrelation tests are presented in Table 8. For this second model, the Shapiro-Wilk test run on each set of residuals suggests that the number of non-normal portfolios is higher than in the previous model. In fact, when we consider the whole period, 10 portfolios have a p-value lower than 5% in the test. If we consider both subperiods individually, we can only reject the null for one portfolio in subperiod 1 (portfolio 5), while there are 8 such portfolios in subperiod 2.
We also ran a Breusch-Pagan test on the residuals of the regressions, showing that the homoscedasticity assumption is violated for portfolio 11 during subperiod 1 and for portfolio 14 during the second subperiod and the whole period. In this model, the heteroscedasticity problem seems to be, at least partially, solved.
In Figure 5, we show the autocorrelation charts for the errors of the 5-factor time-series regression over the whole period. According to the charts, residuals are correlated at lag 10 and beyond. The Durbin-Watson test also suggests that errors may be autocorrelated at lag 10 for 12 portfolios when considering the whole period, for 7 in the first subperiod, and for 7 in the second subperiod (p-values lower than 5%). In this case, incorporating the CPC-factors increases the number of portfolios that present autocorrelation.
Again, given that the three previous assumptions of the model are breached, resampling might be useful to approximate the distribution of the GRS statistic. When using the bootstrap, we cannot reject the hypothesis that all the αs are jointly zero, since 17.1% of the values of the bootstrap sample are higher than the GRS statistic. As in the previous case, the results obtained with the new statistic Q(α) reinforce our decision not to reject, at a 5% significance level, the hypothesis that all the αs are jointly zero.
In Table 9, we present the estimates and t-statistics obtained with the classical methodology for Model 2. When we observe the coefficients (α and β) generated for the 16 portfolios, we notice that: (1) we cannot reject that each of the αs is zero; (2) none of the Market βs can be considered statistically zero, and the coefficients have stabilized around 0.35 (as commented before, once we control for additional factors, and specifically for a factor linked to volatility, the Market β effect that appeared in the previous model disappears); (3) the βs for the CPC1-factor and the CPC4-factor seem to be significantly different from 0, as only 1 coefficient in each of them presents a t-statistic whose absolute value does not exceed 1.96; and (4) the average adjusted-R² increases to 63.5% (this model explains between 26.3% and 85.3% of the variation in the returns of the portfolios). CPC1-factor estimates are negative for portfolios with low mean and Momentum (CPC1), while the contrary happens for portfolios with high mean and Momentum (CPC1). Estimates for the CPC2-factor are negative except for portfolios 6 and 16 and, in general, they are only significant for portfolios with low skewness and kurtosis (CPC2). Estimates for the CPC3-factor are all positive except for portfolio 6. Finally, estimates for the CPC4-factor are always positive and significant for all portfolios except portfolio 3. Additionally, the βs for the CPC4-factor are higher for portfolios with high standard deviation (CPC4).
Table 10 contains the results for the inference based on resampling techniques. A graphical comparison is shown in Figure 6. The results of the basic bootstrap confidence intervals are consistent with what we have already commented. Despite having a lower GRS p-value than Model 1, we can conclude that this model is better because (1) resampling techniques show that we cannot reject that the αs are jointly zero, and (2) the average adjusted-R² is considerably higher. We also noticed that some of the intervals seem to be quite wide.
Specifically, the intervals for portfolio 3 are all wider than their respective averages, and the z-score of the width exceeds 2 for 3 of them. Those of portfolio 14 are likewise wider than their averages, with the exception of the interval for the Market β (in this case, the z-score of the width exceeds 2 in only 2 of the intervals). Additionally, in both cases, the adjusted R² is lower than average. One possible explanation for both portfolios is the existence of outliers that impact the width of the bootstrap intervals in the simulations; see Figure 7. Next, we analyze the significance of the different risk premia for both models.

Cross-Sectional Regression
In this section, we present the results for the cross-section regression of the two models under study: CAPM (Model 1) and the five-factor model, including Market and factors CPC1 to CPC4 (Model 2).

Model 1: CAPM
The GRS bootstrap test for Model 1 suggests that we cannot reject that all αs are jointly zero, and the model explains between 19.0% and 52.6% of the variability of the portfolios' returns. Market seems to be statistically significant in all the portfolios.
We use the βs estimated in Table 7 to examine whether the factor is priced in the cross-section of returns. We must take into account that, despite the fact that we are working with portfolios, risk premia will be estimated from estimated coefficients, which can introduce an important bias (an errors-in-variables problem), especially if the model is not well specified. Market's risk premium is 0.0043 per month, so returns depend positively on the Market. The reported t-statistic of 0.874, which in this case includes the Shanken correction [31], suggests that this factor is not statistically significant at a 5% significance level. The correction seems to be minor in this case, as the t-statistic for the simple regression (Fama-MacBeth approach) is 0.8776. The basic bootstrap confidence interval [−0.0078, 0.0015] confirms that this factor is not statistically significant during the considered period. The statistical non-significance of the Market factor, despite its wide confirmation in the financial literature, could be related to the fact that we are not controlling for other factors, as the results from Model 2 suggest.
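The two-pass estimation behind these numbers can be sketched as follows. This is a minimal Fama-MacBeth pass, not the authors' code: all names are illustrative, and the Shanken adjustment shown is the common simplified scalar multiplier 1 + λ′Ω⁻¹λ applied to the Fama-MacBeth variances under homoscedasticity.

```python
import numpy as np

def fama_macbeth(excess_returns, betas):
    """Second-pass Fama-MacBeth: one cross-sectional regression per month.

    excess_returns : (T, N) portfolio excess returns
    betas          : (N, K) first-pass factor loadings (as in Table 7 or 9)
    Returns mean risk premia and their Fama-MacBeth t-statistics
    (intercept first, then one premium per factor).
    """
    T, N = excess_returns.shape
    X = np.column_stack([np.ones(N), betas])        # intercept + loadings
    lambdas = np.empty((T, X.shape[1]))
    for t in range(T):                              # cross-section for month t
        lambdas[t], *_ = np.linalg.lstsq(X, excess_returns[t], rcond=None)
    lam = lambdas.mean(axis=0)
    se = lambdas.std(axis=0, ddof=1) / np.sqrt(T)   # Fama-MacBeth standard errors
    return lam, lam / se

def shanken_multiplier(lam_factors, factor_cov):
    """Scalar errors-in-variables adjustment 1 + lambda' Omega^{-1} lambda,
    applied to the Fama-MacBeth variances (simplified homoscedastic form)."""
    return 1 + lam_factors @ np.linalg.solve(factor_cov, lam_factors)
```

In this simplified form, the correction inflates the standard errors and thus slightly deflates the t-statistic, in line with the small move from 0.8776 to 0.874 reported above.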

Model 2: Market and Factors CPC1 to CPC4
Again, we use the βs estimated for Model 2 (see Table 9) to examine whether the different factors are priced in the cross-section of returns; the results are reported in Table 11. The highest risk premia are those of Market (0.0108) and of the factor based on mean and Momentum (CPC1-factor, 0.0071); in both cases, returns depend positively on them. The remaining risk premia are much lower and non-significant at the 5% significance level, both for traditional methods and for resampling techniques. The reported t-statistics, which in the cross-sectional case include the Shanken correction, suggest that the only statistically significant factor at conventional significance levels is the one based on mean and Momentum (CPC1-factor). However, the bootstrap intervals indicate that both the Market and CPC1-factor λs are significantly different from zero (at the 5% level) and contribute to the model. This is consistent with the 4-factor model of Carhart [11]. Additionally, once we control for a volatility factor (CPC4), we notice that expected returns, at least during the period considered, do not depend on standard deviation. See Reference [33] for potential explanations and a historical review of the volatility effect.

Comparison between Classical Methodologies and Bootstrap Methods
As we have seen, violations of the assumptions on the residuals of the regressions, such as non-normality, heteroscedasticity, and autocorrelation, as well as the presence of multicollinearity (that is, high correlation among the different factors used in the analysis), may affect the estimates of α, the β_i, and the GRS statistic computed through classical methodologies. Table 12 contains a comparison between the inferential results obtained with both methodologies. In particular, we show the number of portfolios for which β_i can be considered zero. Rows Model 1 and Model 2 contain classical inferential results, whereas rows Model 1b and Model 2b contain bootstrap inferential results. Looking at those results, we notice that the p-values of the bootstrap GRS tests are always greater than those of the classical procedures, showing that the former are more conservative (we fail to reject the null hypothesis that the αs are jointly 0 more often). We also observe that the number of portfolios for which β_i can be considered zero varies depending on the methodology used.


Conclusions
We propose a procedure to obtain and test multifactor models based on statistical and financial factors and illustrate it on a large dataset of nearly 1250 EU companies spanning October 2009 to October 2019. The procedure, however, is general enough to be extended to other factors, companies, or time periods.
The first methodological contribution relies on using Common Principal Components to build the portfolios and summarize factors' information by capturing a high percentage of the variability of the datasets. In this paper, we considered factors like Market Capitalization and Total Assets (measures of size), Price to Book ratio (measure of cheapness), Return on Assets and Return on Equity (measures of profitability), Momentum, and four statistical measures, such as mean, standard deviation, kurtosis, and skewness. The second methodological contribution is the development of a block-bootstrap procedure to assess the validity of the model and the significance of the parameters involved.
The main findings indicate that the proposed multifactor model improves on the Capital Asset Pricing Model with regard to the adjusted-R² in the time-series regressions. Cross-sectional regression results reveal that Market and a factor related to Momentum and the mean of stocks' returns have positive risk premia for the analyzed period. Finally, we also observe that tests based on block-bootstrap statistics are less prone to reject the validity of the model than classical procedures.
In this paper, we proposed Common Principal Components to obtain multifactor models for equity returns, mainly because the technique can deal with several datasets and can be applied to non-normal data. Direct extensions of this work are to explore the efficiency of these multifactor models in other equity markets, as well as in other time periods. A further research line is to explore and adapt other multivariate dimensionality reduction techniques, like MANOVA, although that technique requires additional hypotheses that are rarely fulfilled in real datasets. Exploring and adapting MANOVA for use in large datasets is beyond the scope of this paper, and we leave it for further research.