Abnormal Returns or Mismeasured Risk? Network Effects and Risk Spillover in Stock Returns

Recent event study literature has highlighted abnormal stock returns, particularly in short event windows. A common explanation is the cross-correlation of stock returns, which is often enhanced during periods of sharp market movements. This suggests misspecification of the underlying factor model, typically the Fama-French model. Drawing upon the recent panel data literature on cross-section dependence, we argue that the Fama-French factor model can be enriched by allowing explicitly for network effects between stock returns. We show that recent empirical work is consistent with the above interpretation, and we advance some hypotheses along which new structural models for stock returns may be developed. Applied to data on stock returns for the 30 Dow Jones Industrial Average (DJIA) stocks, a CAPM augmented with network effects has explanatory power competitive with a six-factor model, and improves upon it for some stocks.


Introduction
In finance theory and empirics, stock returns are typically described by a factor model along the lines of Fama and French (1988, 1993, 2015) and Carhart (1997). However, despite the popularity of the Fama-French (FF) and FF-type models, a substantial literature in the event study tradition, starting from Brown and Warner (1985) and Strong (1992), has pointed towards a failure of the FF model to adequately capture the relationship between risk and return; for recent literature, see Chiang and Li (2012) and Marks and Musumeci (2017), among others. Specifically, there are periods when stock returns are highly correlated (Kolari and Pynnönen 2010); this correlation leads to abnormal returns and mismeasured risk; see also Boehmer et al. (1991) and Kothari and Warner (2007).$^{1}$ In this paper, we contrast the FF-type factor models for stock returns against the standard panel data factor model in contemporary econometrics. Recent developments in the econometrics of panel factor models with cross-section dependence suggest reasons why the FF-type model may be misspecified. To address such misspecifications, we propose modeling cross-correlations using a suitable structural model. Motivated by the recent clustering model (Nagy and Ormos 2018) and recursive model (Basak et al. 2018), we propose a social network dependence structure. Applied to data on stock returns for the 30 current DJIA stocks, we find evidence of network effects, the careful modeling of which addresses misspecification of the underlying factor model. This brings returns more in line with risks, and provides a structural understanding of risk spillovers.
1 Abnormal return is defined in the event study finance literature as the difference between the actual return of a security (in our case, over a one week time horizon) and the expected return as calculated using a model; see, for example, Brown and Warner (1985). Thus, any misspecification in the underlying factor model implies mismeasurement of expected returns and the corresponding risk-return relationship and would be evident in substantial abnormal returns.
Any model is necessarily an abstraction of reality, and will entail a certain degree of misspecification; understandably this is true of the FF model as well. Researchers have continued to improve upon the FF model with a larger collection of factors (FF-type models), and this has undoubtedly improved model fit and interpretation. Our contribution here lies in proposing quite a different extension. We consider trading activity and its structural interpretation more explicitly than the existing literature, offering a structural interpretation of correlations that currently lies beyond the scope of the FF-type models. Then, together with the factors in the FF model, the proposed model provides a better explanation of returns. As benchmarks, we consider a CAPM model (including only the market return factor) and an FF-type model including 6 factors: the 5 Fama and French (1988, 1993) factors, plus the momentum factor of Carhart (1997). Our results show that the base CAPM, together with network effects, has competitive explanatory power, and for some stocks offers substantial improvements relative to the above 6 factor model. This provides an alternate structural factor model for asset pricing, and develops avenues for new research.
Section 2 contrasts factor models in finance and econometrics and draws some insights into misspecification. Section 3 develops a social network model and estimates this using the DJIA stock returns. Structural interpretation of the model is discussed, together with alternate structural models. Section 4 concludes, with an appeal for further research on structural dependence in stock returns.

Factor Models in Finance and Econometrics
The FF and similar factor models in finance are typically expressed as:
$$y_{it} = \alpha_i + \beta_i m_t + \gamma_i' x_t + \varepsilon_{it} \qquad (1)$$
and estimated using time series data ($t = 1, \ldots, T$) on the returns, $y_{it}$, on $n$ ($i = 1, \ldots, n$) stocks. Here $m_t$ denotes the excess return on the market portfolio and $\beta_i$ the corresponding beta-factor for stock $i$, $x_t$ is a vector of returns on a finite number of firm-specific factors (typically called the Fama-French factors) and $\gamma_i$ their corresponding factor exposures, $\alpha_i$ is a stock- (firm-) specific intercept that can be interpreted as a fixed effect, and $\varepsilon_{it}$ is an idiosyncratic error term. Typically, the excess return on the market portfolio ($m_t$) is easily computed from market data and the time-varying returns ($x_t$) are reported in market research publications (French 2017). The original Fama and French (1993) factors SMB (Small Minus Big) and HML (High Minus Low) were extended to include Mom (momentum) in Carhart (1997) and further to RMW (Robust Minus Weak) and CMA (Conservative Minus Aggressive) in Fama and French (2015). Returns on these factors constitute $x_t$; see French (2017) for further details on concepts and computation.
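The stock-by-stock time series regression described above, $y_{it} = \alpha_i + \beta_i m_t + \gamma_i' x_t + \varepsilon_{it}$, can be sketched as follows. This is a minimal illustration on simulated data, not the paper's code: the function name `estimate_factor_model`, the sample sizes and all parameter values are assumptions made for the example.

```python
import numpy as np

def estimate_factor_model(y, m, x):
    """Stock-by-stock time series OLS of y_it on a constant, m_t and x_t.

    y: (T, n) returns, m: (T,) market excess return, x: (T, p) factor returns.
    Returns alpha (n,), beta (n,), gamma (n, p) and residuals (T, n).
    """
    T = y.shape[0]
    X = np.column_stack([np.ones(T), m, x])        # regressors are common to all stocks
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # shape (2 + p, n), one column per stock
    resid = y - X @ coef
    return coef[0], coef[1], coef[2:].T, resid

# Simulated data (all values are illustrative assumptions)
rng = np.random.default_rng(0)
T, n, p = 180, 5, 2
m = rng.normal(0.5, 4.0, T)
x = rng.normal(0.0, 2.0, (T, p))
alpha_true = rng.normal(0.0, 0.3, n)
beta_true = rng.uniform(0.5, 1.5, n)
gamma_true = rng.normal(0.0, 0.5, (n, p))
y = alpha_true + np.outer(m, beta_true) + x @ gamma_true.T + rng.normal(0, 1.0, (T, n))

alpha_hat, beta_hat, gamma_hat, resid = estimate_factor_model(y, m, x)
print(np.round(beta_hat - beta_true, 3))           # estimation errors in the betas
```

Because the regressors have no cross-section variation, a single design matrix serves every stock, which is why one least squares call can fit all $n$ regressions at once.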

Network Effects and Bias
Traditionally, the above FF model (1) is estimated by least squares, where the factor exposures $\beta_i$ and $\gamma_i$ are viewed as parameters to be estimated from the data. This estimation strategy raises issues that are well recognized in the literature; see, for example, Strong (1992), Kothari and Warner (2007) and Kolari and Pynnönen (2010). One important issue is that risk is not consistently estimated if there is either time-varying volatility or cross-section correlation in the errors $\varepsilon_{it}$. This renders inference on abnormal returns particularly challenging. This is really an estimation efficiency issue that is not in itself likely to cause bias in estimation of the factor model. However, there would be a more serious problem of endogeneity if, for some reason, there were network interdependencies between returns on different stocks. This will also lead to mismeasured risk and very likely biased estimates.
To understand the nature of the problem, consider for simplicity a CAPM type restricted factor model of the form
$$y_{it} = \alpha_i + \beta_i m_t + \varepsilon_{it} \qquad (2)$$
where the effect of additional FF firm-specific factors is not included. The above CAPM model (2) can imply a specific form of network architecture, known in the spatial econometrics literature as a social interactions model (Lee et al. 2010; Hsieh and Lee 2016; Bhattacharjee et al. 2018; Cohen-Cole et al. 2018; Doğan et al. 2018) or a farmer-district model (Case 1992; Robinson 2003; Gupta and Robinson 2015), whereby the units (here, stocks) are classified into several groups or social networks. Stocks in the same social network are related to each other, but not to stocks in the other networks. Further, the within-network influences are symmetric across all directed pairs of stocks within the network and can be represented by an adjacency-based binary weights matrix. In turn, the membership of social networks is inferred either by cluster analysis (Bhattacharjee et al. 2016; Chakraborty et al. 2018; Nagy and Ormos 2018) or correlation analysis (Junior et al. 2015; Bailey et al. 2016) of the dependent variable, which in our case is stock returns. Now, consider the clustering pattern implied by the above CAPM, assuming for simplicity that the parameter vector can take only one of two values, $(\alpha_i, \beta_i) \in \{(-\tfrac{1}{2}, 1), (\tfrac{1}{2}, 0)\}$, and these correspond to the two network classes. Likewise, assume for simplicity only two time periods, $t = 1, 2$. Then, if a scatterplot of returns is obtained along the axes given by $t = 1$ and $t = 2$, it is clearly seen that the loci of data points in the two network classes will be $(m_1 - \tfrac{1}{2}, m_2 - \tfrac{1}{2})$ and $(\tfrac{1}{2}, \tfrac{1}{2})$, with the random distribution of points around the loci determined solely by the idiosyncratic errors $\varepsilon_{it}$.
More generally, if the parameter vector takes values in a finite set $(\alpha_i, \beta_i) \in \{(a_1, b_1), (a_2, b_2), \ldots, (a_k, b_k)\}$, and we have data for $t = 1, \ldots, T$ time periods, then this would generate data points clustered around a corresponding set of $k$ loci:
$$(a_j + b_j m_1, a_j + b_j m_2, \ldots, a_j + b_j m_T), \quad j = 1, \ldots, k.$$
Further, if there were no network interdependence between the stocks, the parameters can be recovered through time series least squares regressions based on (2) for each individual stock, since the observations are independent over time. Therefore, the loci of the clusters can also be precisely estimated as the number of time periods increases, that is, as $T \to \infty$. The same argument holds if we had additional FF firm-specific factors, in which case we would estimate a model of the Fama-French form (1). However, the endogenous network effects would lead to biased least squares estimation.
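The two-period, two-class example above can be checked numerically. The sketch below simulates returns from the CAPM type model $y_{it} = \alpha_i + \beta_i m_t + \varepsilon_{it}$ with $(\alpha_i, \beta_i) \in \{(-\tfrac{1}{2}, 1), (\tfrac{1}{2}, 0)\}$ and compares the sample centres of the two classes with the theoretical loci. The market returns `m`, the error standard deviation and the number of stocks per class are assumed values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m = np.array([2.0, -1.0])                 # assumed market excess returns at t = 1, 2
params = np.array([[-0.5, 1.0],           # class 0: (alpha, beta) = (-1/2, 1)
                   [0.5, 0.0]])           # class 1: (alpha, beta) = ( 1/2, 0)
n_per_class = 500
labels = np.repeat([0, 1], n_per_class)
alpha, beta = params[labels, 0], params[labels, 1]

# y_it = alpha_i + beta_i * m_t + eps_it, stored as one (y_i1, y_i2) point per stock
y = alpha[:, None] + beta[:, None] * m[None, :] + rng.normal(0, 0.2, (labels.size, 2))

# Theoretical loci: (a_j + b_j m_1, a_j + b_j m_2) for each class j
loci = params[:, [0]] + params[:, [1]] * m[None, :]
centres = np.vstack([y[labels == c].mean(axis=0) for c in (0, 1)])

print(loci)      # class 0: (m1 - 1/2, m2 - 1/2); class 1: (1/2, 1/2)
print(centres)   # sample centres of the simulated points
```

With these assumed market returns, the loci are $(1.5, -1.5)$ and $(0.5, 0.5)$, and the scatter around them reflects only the idiosyncratic errors.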
Let us now consider a simple extension of the CAPM type restricted factor model (2) to include network interdependence. Denote by $y_t = (y_{1t}, y_{2t}, \ldots, y_{nt})'$ the vector of returns at time point $t$, and consider a standard spatial autoregressive (lag) network model of the form
$$y_t = \rho W y_t + \alpha + \beta m_t + \varepsilon_t \qquad (3)$$
where $\alpha$ and $\beta$ are the corresponding vectors of CAPM parameters, $W$ ($n \times n$) is a square matrix of network membership with zero diagonal elements and off-diagonal elements that are unit if two firms belong to the same network and zero otherwise, and $\rho$ is the so-called spatial autoregressive or network dependence parameter ($|\rho| < 1$). Here, the network architecture follows exactly the social interactions or farmer-district model; see, for example, Lee et al. (2010). Further, we assume as before that the CAPM parameter vector takes values in a finite set, and further that two stocks with the same parameters belong to the same network. Now, without loss of generality, let the stocks in the first network have parameters $(a_1, b_1)$ and come first in the ordering, followed by the second network with parameter values $(a_2, b_2)$, and so on. We then have the following block-diagonal equicorrelation structure for $W$:
$$W = \mathrm{diag}(W_1, W_2, \ldots, W_k)$$
where the connection matrix for network $j$ ($j = 1, \ldots, k$) is of order $n_j \times n_j$ and takes the form
$$W_j = \frac{1}{n_j - 1}\left(\iota_{n_j}\iota_{n_j}' - I_{n_j}\right),$$
with zero diagonal and row-standardised unit values everywhere else, and $\iota_{n_j}$ denotes an $n_j \times 1$ vector of ones. The equicorrelation form of the social interactions model clearly highlights why it may be useful to identify the network structure based on cross-section correlations, as in Lee et al. (2010) and Bailey et al. (2016). Then, the reduced form of the network model (3) is:
$$y_t = (I_n - \rho W)^{-1}(\alpha + \beta m_t + \varepsilon_t) = \frac{\alpha}{1 - \rho} + \frac{\beta}{1 - \rho}\, m_t + (I_n - \rho W)^{-1}\varepsilon_t \qquad (4)$$
where the second equality holds because $W$ is row-standardised and $\alpha$ and $\beta$ are constant within each network, so that $W\alpha = \alpha$ and $W\beta = \beta$. This reduced form representation (4) clearly highlights how the network generates risk spillovers.
The structure of the reduced form in (4) has important implications for estimation and inference on the Fama-French model. First, additional FF-type firm-specific factors retain the same basic structure of the model, with slope parameters that are proportional to the underlying structural model at (3). Second, applied to data from the network interactions model (3), time series least squares regression based on individual stock returns will simply recover the reduced form intercept and slopes, rather than the underlying structural parameters ($a_j$ and $b_j$); clearly this leads to biased and inconsistent inferences. Third, the reduced form least squares parameter estimates correctly recover the underlying network structure because the nature of clustering does not change. Specifically, the data points are still clustered around a set of $k$ loci, which is a simple scale transformation of the original model without network dependence. Hence, the network structure can be accurately identified by cluster analysis of the underlying returns. In fact, within the context of the FF model (1) with network dependence as in (3), cluster analysis will typically recover the loci of the FF firm-specific factors as well. Fourth, simply accounting for the network structure does not help. True, the underlying network dependence can be identified by clustering; but under the network interactions model (3), if the CAPM (and FF) part of the model were ignored, this would provide biased inferences on the network dependence parameters. Hence, both parts of the model are important for accurate estimation and analyses of risk and return.
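The second and third points can be illustrated by simulation. The sketch below assumes the spatial lag model $y_t = \rho W y_t + \alpha + \beta m_t + \varepsilon_t$ with a block-diagonal, row-standardised $W$ (weights $1/(n_j - 1)$ within a network), generates data from its reduced form, and runs stock-by-stock least squares on $m_t$. The helper `social_network_W`, the network sizes, $\rho = 0.4$ and the values of $a_j$ and $b_j$ are all assumptions for illustration.

```python
import numpy as np

def social_network_W(labels):
    """Row-standardised block matrix: w_ij = 1/(n_j - 1) for i != j in the same network."""
    labels = np.asarray(labels)
    same = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(same, 0.0)
    return same / same.sum(axis=1, keepdims=True)   # assumes every network has >= 2 members

rng = np.random.default_rng(2)
labels = np.repeat([0, 1, 2], [4, 5, 3])   # three networks of sizes 4, 5 and 3
a = np.array([-0.5, 0.5, 0.2])[labels]     # structural intercepts a_j (assumed values)
b = np.array([1.0, 0.3, 1.4])[labels]      # structural slopes b_j (assumed values)
rho, T, n = 0.4, 2000, labels.size

W = social_network_W(labels)
A = np.linalg.inv(np.eye(n) - rho * W)     # reduced-form multiplier (I - rho W)^{-1}

m = rng.normal(0.5, 4.0, T)
eps = rng.normal(0, 1.0, (T, n))
y = (a + np.outer(m, b) + eps) @ A.T       # row t is y_t' = (A (alpha + beta m_t + eps_t))'

# Stock-by-stock least squares of y_it on a constant and m_t
X = np.column_stack([np.ones(T), m])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coef[1] / b, 2))            # ratios are close to 1 / (1 - rho), not to 1
```

Under these assumptions the least squares slopes estimate $b_j / (1 - \rho)$ rather than $b_j$, so the estimates are scaled versions of the structural parameters while the grouping of stocks into networks is unchanged.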

Comparison with Panel Data Factor Model
Cross-section dependence is well studied within the current literature in panel data econometrics. Here, the central factor model has the following form:
$$y_{it} = \delta_i' f_t + \theta_i' z_{it} + \varepsilon_{it} \qquad (5)$$
where $f_t$ denotes a vector of time-specific "factors" with corresponding stock-specific loadings $\delta_i$, $z_{it}$ contains a collection of stock- and time-varying covariates, and $\varepsilon_{it}$ are stationary but potentially cross-section dependent and autocorrelated regression errors; see, for example, Pesaran (2006). Some of the "factors" may be observed, and others latent. In particular, a factor taking unit value in each time period corresponds to fixed effects, denoted $\alpha_i$ in the FF model (1). The response variable (returns) $y_{it}$ has cross-section dependence arising from two sources. First, there is the influence of common "factors" $f_t$, but potentially with effects heterogeneous across different stocks. Second, there are the cross-section dependent errors $\varepsilon_{it}$. Pesaran (2006) points to an important distinction between cross-section strong and weak dependence (of returns on different stocks, in our case). The first arises from the effect of common factors, such as the market portfolio and the FF factor returns; the second is due to local interdependencies (spillovers) between firms and their stocks. Following from Pesaran (2006), an influential literature has emerged in this area; see, for example, Kapetanios and Pesaran (2007), Bai (2009), Pesaran and Tosetti (2009), and Bailey et al. (2016).
Pesaran (2006) developed two key results. First, least squares estimation of (5) with omitted latent factors provides inconsistent and biased estimates of $\theta_i$ in general. The only situation where credible inferences can be made is when the errors $\varepsilon_{it}$ are stationary over time and granular across the cross-section. Pesaran (2006) and Pesaran and Tosetti (2009) provide technical definitions of cross-sectional granularity. This is conceptually akin to stationarity, but across the cross-section dimension, and implies that the degree of cross-section dependence is limited. Pesaran (2006) terms this case weak cross-section dependence, and Pesaran (2015) provided a statistical test based on average cross-section correlation of the residuals; see Bhattacharjee and Holly (2013) for an alternate test.
Second, Pesaran (2006) offers a large sample method to address strong dependence when both dimensions are large, that is, $n \to \infty$ and $T \to \infty$. In such situations, one can enrich the model by including cross-section averages of the dependent and independent variables as:
$$y_{it} = \theta_i' z_{it} + \lambda_i \bar{y}_t + \mu_i' \bar{z}_t + e_{it} \qquad (6)$$
where $\lambda_i$ and $\mu_i$ denote stock-specific coefficients, and the cross-section averages $(\bar{y}_t, \bar{z}_t)$ eliminate strong dependence from the model, leaving only weak dependence in the residuals $e_{it}$. This model can then be consistently estimated by least squares. This methodology is called common correlated effects estimation because $(\bar{y}_t, \bar{z}_t)$ take these high cross-section correlations out of the data.
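The idea of augmenting each regression with cross-section averages can be illustrated with a small simulation. The sketch below assumes a panel with a single latent factor $f_t$ that drives both the returns and a stock- and time-varying covariate $z_{it}$, and compares the naive stock-by-stock least squares slope on $z_{it}$ with the slope from a regression that also includes an intercept and the period averages of $y$ and $z$. The data generating process, the homogeneous slope `theta`, and the helper `slope_on_z` are assumptions for illustration, not part of the original analysis.

```python
import numpy as np

rng = np.random.default_rng(3)
n, T, theta = 100, 300, 1.0
f = rng.normal(0, 1.0, T)                      # latent common factor (treated as unobserved)
delta = rng.uniform(0.5, 1.5, n)               # heterogeneous loadings of y on f_t
g = rng.uniform(0.5, 1.5, n)                   # loadings of the covariate on f_t
z = np.outer(f, g) + rng.normal(0, 1.0, (T, n))
y = theta * z + np.outer(f, delta) + rng.normal(0, 1.0, (T, n))

y_bar, z_bar = y.mean(axis=1), z.mean(axis=1)  # cross-section averages in each period

def slope_on_z(yi, zi, extra):
    """OLS slope on z_it from a regression with an intercept and optional extra regressors."""
    X = np.column_stack([np.ones(T), zi] + extra)
    coef, *_ = np.linalg.lstsq(X, yi, rcond=None)
    return coef[1]

naive = np.array([slope_on_z(y[:, i], z[:, i], []) for i in range(n)])
cce = np.array([slope_on_z(y[:, i], z[:, i], [y_bar, z_bar]) for i in range(n)])
print(round(naive.mean(), 3), round(cce.mean(), 3))   # naive is biased upwards; augmented is near theta
```

The augmentation works here only because $z_{it}$ varies across stocks; with regressors that are common to all stocks, as in the FF model, the cross-section average of a regressor is the regressor itself and carries no additional information.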
Let us now revert to a comparison with the FF model (1). The return on the market is akin to the average return in each period, and hence is very close to $\bar{y}_t$. Since temporal variation in the risk free rate is much lower than that in the market, the excess return on the market, $m_t$, is numerically almost the same as the cross-section average return less a constant.$^{2}$ Unfortunately, beyond $\bar{y}_t$, common correlated effects cannot be directly applied in (1), because there are no regressors with both cross-section and time variation, unlike $z_{it}$ in (5). This key observation has two implications. First, one should always test the residuals from least squares estimation of (1) for potential strong dependence. Second, strong cross-section dependence needs to be modelled based on structural considerations of pricing in financial markets. We focus on this second issue in the next section.
2 Over the period of our analysis, standard deviation of the risk free rate is only 0.15, as compared to 4.43 for the market return.

Structural Models of Asset Pricing Correlations
As discussed in Section 2, the lack of cross-section variation in the regressors precludes the opportunity to apply common correlated effects estimation in the FF model (1). This implies that any cross-section strong dependence (across the stocks) needs to be modelled explicitly using structural models of price formation in the market. We turn to this next. We first discuss a few structural models and the relevant literature, and propose an alternate model. Then, we illustrate our proposed structural model using data on monthly returns on the stocks currently included in the Dow Jones Industrial Average (DJIA).

Structural Models of Price Formation
Structural modelling of cross-section dependence is the domain of spatial and network econometrics. Within this literature, there is very little research on stock returns. However, one can draw some insights from the literature on other markets (for example, housing and labor markets) or on dependence across financial markets. A key result from this literature is that the underlying structural model is, in general, not fully identified from cross-section covariances and correlations (Bhattacharjee and Jensen-Butler 2013). Hence, one requires further structural assumptions, either from theory or from context specific research, to identify network effects.
One such admissible assumption is recursive structure under which information flows are sequential (but contemporaneous) through different segments of the market. Based on 25 portfolios formed on size and book-to-market (Fama and French 1996; French 2017), Basak et al. (2018) find substantial explanation for risk spillovers and abnormal returns, and the model outperforms reduced form VAR (vector autoregressive) factor models. Suppose risk neutral traders arrive sequentially and repeatedly at the market taking positions on preferred risk/return FF portfolios. Then, limit or market order mechanisms would generate such recursive ordering of the portfolios in terms of information flow, in turn leading to cross-portfolio correlations. This ties in with the recent market microstructure literature on limit orders; see, for example, Handa and Schwartz (1996), Parlour (1998), Foucault (1999), Mertens (2003) and Foucault et al. (2005).
Recursive ordering is often a feature of cross-market information flows, where the sequencing of opening and closing times of different markets produces the so-called 'meteor shower' phenomenon documented in volatility by Engle et al. (1990) and in returns by Hamao et al. (1990). Bhattacharjee (2017) finds that return correlations across 19 markets worldwide are explained by recursive ordering, in combination with global factors capturing the dominance of major markets. We propose a different model in this paper, but recursive structural models hold good promise for the future.
Beyond recursive ordering, some other structural assumptions are also admissible. Bhattacharjee and Jensen-Butler (2013) show that network dependence structure is identified under the assumption of symmetric interdependence. There are three important special cases of symmetry. First, there are social network models where individuals in the same network share information or interact with each other, but not with members in other networks; see, for example, Lee et al. (2010) and Cohen-Cole et al. (2018). This model is also closely related to the farmer-district model (Case 1992; Robinson 2003). Cross-section dependent stock returns can well be represented by social network models, but in applications, the membership and identity of networks are seldom known a priori, and one still needs appropriate theory or clustering/LASSO methods (discussed below) to motivate these networks.
Second, there are models where interconnections are binary and reciprocal, as in the social networks model, but the networks can be overlapping. In the context where the network is sparse, but negative interactions are possible, Bailey et al. (2016) propose estimation of network structure based on multiple testing of estimated cross-section correlations. We will briefly consider a model of this type later. In the context of stock returns factor models, the lack of any obvious structural interpretation of the network is an impediment. Besides, one would expect network interdependence in the stock market to be fairly dense, and hence the sparsity assumption may also be somewhat tenuous.
The third class of models uses either clustering or LASSO (least absolute shrinkage and selection operator; Tibshirani 1996) to identify social network or block structure from the data. However, an observed pattern of clustering does not necessarily imply clustering of the parameter vector, since the returns of different stocks may be similar because they share a network. This observation justifies current approaches of characterizing the underlying latent network structure using either spatial and spatio-temporal clustering (Bhattacharjee et al. 2016; Chakraborty et al. 2018) or an analysis of cross-section correlations (Junior et al. 2015; Bailey et al. 2016); Nagy and Ormos (2018) apply clustering across markets to study dependence. Lam and Souza (2016) provide a method to identify block structure using the LASSO and similar methods. Our structural model and empirical analysis are based on this third approach. As discussed earlier, since the network structure can only contribute to weak dependence, the factor structure needs to be removed from the data a priori. As our analytical discussion in Section 2.1 shows, one can first estimate a FF-type factor model, identify the common factors that are associated with strong dependence, and then cluster the stocks based on the remaining weak-dependence factors. However, since the factors in the FF model are essentially returns on different types of risk, clustering can then be based on exposure to the corresponding risks.
The above approach is consistent with the following trading strategy. In the context of factor models (1) and (5), absence of cross-section variation in regressors for the FF-type model implies that subtle variations, over time, in the factor exposures for each stock are not captured in the data; hence, traders need to address this issue through diversification. Then, our structural model posits that traders sort themselves on heterogeneous risk preferences. Given a specific preference type over multiple risk factors, they then choose their preferred exposure to the risks and create a diversified portfolio of stocks with this risk exposure. This behavior generates interdependence across stock returns within this portfolio, but not beyond. Since the exposures are estimated by an FF-type factor model (1), the network can be identified by clustering the stocks on this estimated exposure vector.$^{3}$ The above trading model is structural, but its assumptions need validation. To highlight the promise that this approach holds, we now provide an illustrative application on the DJIA stock returns.
Our central argument is that, since the structural network effects are only partially identified from reduced form regressions (Bhattacharjee and Jensen-Butler 2013), inference requires structural assumptions underpinned by appropriate theory. Above, we have discussed three such lines of theory, emphasizing in particular one new structural model. There, traders choose their diversified portfolio with a preferred risk exposure; this trading behavior generates interdependence across stock returns within this portfolio, but not beyond. Since the exposures are estimated by an FF-type factor model, the network can be identified by clustering the stocks on this estimated exposure vector. This justifies our approach (in Section 3.2) of assuming that within cluster correlations dominate network effects across the clusters, which are then assumed to be absent.
Obviously, there are competing structural models where stocks belonging to different groups would be correlated, and we also discussed two such models. First, we refer to Basak et al. (2018), who developed a model where limit or market order mechanisms generate recursive ordering of the portfolios in terms of information flow, in turn leading to cross-portfolio correlations. This may be viewed as a model diametrically opposed to the one developed in this paper. Second, unrestricted correlations can be modeled, and we also consider this approach. However, the unrestricted correlations model raises two further issues from a structural point of view. First, we need an assumption of sparsity (Bailey et al. 2016). Network interdependence in the stock market may be dense, and hence the sparsity assumption may be somewhat tenuous. Second, we do not currently have theory to justify sparse interactions, and the lack of any obvious structural interpretation of the network is an impediment. Nevertheless, we estimated such a model (Section 3.2) and highlight that more work is required for structural understanding of the underlying trading mechanisms. We suggest this as an avenue for future research.
3 Alternate structural restrictions with asymmetric dependence, for example, tree-based nested dependence (Bhattacharjee and Holly 2013), sparsity (Ahrens and Bhattacharjee 2015; Lam and Souza 2018) and copulas (Liu et al. 2018) can also hold promise, but we do not consider these here.

Data and Estimated Model
We collected daily stock returns data (adjusted for splits, dividends and distributions) from Yahoo Finance, on the stocks currently included in the Dow Jones Industrial Average (DJIA).$^{4}$ The period under analysis is January 2001 to December 2015.$^{5}$ Historical monthly factor returns on the Fama and French (1993, 2015) 5-factors and the Carhart (1997) momentum factor were collected from the web archive of French (2017). To make our stock returns data comparable with factor returns, stock returns are aggregated to the monthly level. These constitute our data under analysis.
First, we estimate by least squares a CAPM model (2) including only an intercept and excess return on the market. As discussed in Section 2.1, the network structure can be accurately identified by clustering on the vector $(\alpha_i, \beta_i)$. We also estimate a FF-type factor model including all the six factors: $m_t$, SMB, HML, RMW, CMA and Mom. The CAPM model exhibits spatial (network) strong dependence. Using the CD test of Pesaran (2015), the null hypothesis of weak dependence is strongly rejected. The test statistic evaluates to −4.026 with a p-value of $5.7 \times 10^{-5}$. However, the same is not true for the FF-type model with six factors; the CD test statistic is −1.775 with a p-value of 0.076.
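For readers who wish to reproduce this kind of diagnostic, the sketch below computes the CD statistic in its standard form for a balanced panel, $CD = \sqrt{2T / (n(n-1))} \sum_{i<j} \hat{\rho}_{ij}$, where $\hat{\rho}_{ij}$ are pairwise correlations of the residuals, together with a two-sided standard normal p-value. The function names `cd_statistic` and `cd_pvalue` and the simulated residuals are assumptions for illustration; the two-sided normal p-value is also an assumption, although it is numerically consistent with the p-values reported above.

```python
import math
import numpy as np

def cd_pvalue(cd):
    """Two-sided p-value of a CD statistic under a standard normal null."""
    return math.erfc(abs(cd) / math.sqrt(2.0))

def cd_statistic(resid):
    """CD statistic from a balanced (T, n) matrix of regression residuals."""
    T, n = resid.shape
    corr = np.corrcoef(resid, rowvar=False)        # n x n matrix of pairwise correlations
    upper = corr[np.triu_indices(n, k=1)]          # the n(n-1)/2 distinct pairs i < j
    return math.sqrt(2.0 * T / (n * (n - 1))) * upper.sum()

# Simulated residuals that share a common component (strong dependence)
rng = np.random.default_rng(6)
T, n = 200, 10
common = rng.normal(size=T)
resid_strong = common[:, None] + rng.normal(size=(T, n))
cd_strong = cd_statistic(resid_strong)
print(cd_strong, cd_pvalue(cd_strong))

# p-values implied by the statistics quoted in the text
for stat in (-4.026, -1.775):
    print(stat, cd_pvalue(stat))
```

A large absolute value of the statistic, as in the simulated case with a shared component, signals that the average pairwise residual correlation is too high to be attributed to weak dependence alone.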
Next, we apply cluster analysis to the estimated $(\hat{\alpha}_i, \hat{\beta}_i)$, and clearly identify 3 clusters: low alpha and low beta (13 stocks), low alpha and high beta (13 stocks) and high alpha (4 stocks). The membership of the clusters is reported in Table 1. Then, we construct a social network weighting matrix $W$ ($n \times n$) based on membership of the above three clusters.
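The two steps above, clustering on the estimated exposures and building a weighting matrix from cluster membership, can be sketched as follows. The text does not name the clustering algorithm, so the plain k-means routine here is an assumption, as are the simulated $(\hat{\alpha}_i, \hat{\beta}_i)$ values; only the cluster sizes (13, 13 and 4) mirror those reported above.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm with deterministic farthest-point initialisation."""
    centres = [points[0]]
    for _ in range(k - 1):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(points[np.argmax(d)])       # farthest point from current centres
    centres = np.array(centres, dtype=float)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):                # leave a centre unchanged if its cluster is empty
                centres[j] = points[labels == j].mean(axis=0)
    dist = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), centres

# Simulated (alpha_hat, beta_hat) pairs for 30 stocks (values are illustrative assumptions)
rng = np.random.default_rng(5)
truth = np.repeat([0, 1, 2], [13, 13, 4])
centres_true = np.array([[0.2, 0.6],               # low alpha, low beta
                         [0.2, 1.3],               # low alpha, high beta
                         [1.5, 1.0]])              # high alpha
est = centres_true[truth] + rng.normal(0, 0.05, (30, 2))

labels, centres = kmeans(est, k=3)

# Social network weights: equal, row-standardised weights within each cluster
same = (labels[:, None] == labels[None, :]).astype(float)
np.fill_diagonal(same, 0.0)
W = same / same.sum(axis=1, keepdims=True)
print(np.bincount(labels))                         # cluster sizes
```

In practice the two coordinates may be on different scales, in which case standardising them before clustering would be a reasonable additional step.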
Finally, we estimate two spatial autoregressive (lag) network models, including only $m_t$ but not the Fama and French (1993, 2015) or Carhart (1997) factors. In the first, a contemporaneous spatial lag $W y_t$ is included; this is exactly the model in (3). Inclusion of the spatial lag introduces endogeneity in the model, and we estimate using a variant of the popular two stage least squares (2SLS) method in Kelejian and Prucha (1998). Like the application of common correlated effects, the above 2SLS method also presents challenges because there is no cross-section variation in the regressors. We use as instruments the omitted five FF-type factors, together with lagged residuals from the estimated CAPM model (2). The first stage estimation works well (with F-statistics much greater than 10 in all cases) and weak instruments issues are not apparent. However, 2SLS is known to have finite sample bias and there is loss of efficiency; for this reason, model comparison is based on root mean squared errors (RMSE). The second network model includes network effects as a time-lag, that is, $W y_{t-1}$. Here, we do not have contemporaneous endogeneity and the model can be estimated using least squares:
$$y_t = \rho W y_{t-1} + \alpha + \beta m_t + \varepsilon_t \qquad (7)$$
The spatial lag (3) and space-time lag (7) models would in general have different structural implications. However, in our specific context, they are similar since the time lag is one week, which is very long in financial markets. By this time lag, all stock specific temporal information is expected to already have been factored into prices, and this information is therefore not relevant for trading strategy based on portfolio construction. Under our proposed structural model discussed in Section 3.1, trading behavior generates network effects through the choice of portfolios which get updated at a much lower frequency. Hence, the structural implications of the spatial autoregressive and time lag models are effectively the same in so far as network effects are concerned. In terms of econometric implications, estimation of the two models differs. The spatial lag model generates endogenous effects; hence, we use instrumental variables methods, while the space-time lag model has only lagged endogenous effects, and therefore, least squares estimation is employed. Between the two network models (3) and (7), we choose the one with lower RMSE; the model with better fit is indicated in bold in Table 1. Then, we apply the CD test to the correlation matrix of residuals. The test statistic is 1.458 with a p-value of 0.145. Hence, we are satisfied that weak dependence holds, and estimates of the network factor model are consistent. Finally, we report relative efficiency of the chosen network model, in terms of percent lower RMSE, relative to the CAPM model (2) and the FF-type model (1).
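Least squares estimation of the space-time lag model and the RMSE comparison with the CAPM can be sketched as below. The example simulates data from $y_t = \rho W y_{t-1} + \alpha + \beta m_t + \varepsilon_t$ with an assumed block network, then regresses each stock's return on a constant, $m_t$ and its own network lag $(W y_{t-1})_i$, allowing a stock-specific $\rho_i$. All numerical values, the network sizes and the helper `rmse` are assumptions for illustration; the 2SLS estimation of the contemporaneous model is not covered here.

```python
import numpy as np

rng = np.random.default_rng(4)
labels = np.repeat([0, 1, 2], [5, 4, 3])            # assumed network membership
n, T, rho = labels.size, 1000, 0.3
same = (labels[:, None] == labels[None, :]).astype(float)
np.fill_diagonal(same, 0.0)
W = same / same.sum(axis=1, keepdims=True)          # row-standardised social network weights

a = np.array([0.1, 0.4, -0.2])[labels]              # assumed structural intercepts
b = np.array([0.7, 1.0, 1.3])[labels]               # assumed structural slopes
m = rng.normal(0.5, 4.0, T)

# Simulate the space-time lag model forward in time
y = np.zeros((T, n))
y[0] = a + b * m[0] + rng.normal(0, 1.0, n)
for t in range(1, T):
    y[t] = rho * W @ y[t - 1] + a + b * m[t] + rng.normal(0, 1.0, n)

lag = y[:-1] @ W.T                                  # row s holds (W y_s)', paired with y[s + 1]

def rmse(resid):
    return float(np.sqrt(np.mean(resid ** 2)))

rho_hat, rmse_capm, rmse_net = np.empty(n), np.empty(n), np.empty(n)
for i in range(n):
    yi = y[1:, i]
    X0 = np.column_stack([np.ones(T - 1), m[1:]])   # CAPM regressors
    X1 = np.column_stack([X0, lag[:, i]])           # CAPM plus the network time lag
    c0, *_ = np.linalg.lstsq(X0, yi, rcond=None)
    c1, *_ = np.linalg.lstsq(X1, yi, rcond=None)
    rho_hat[i] = c1[2]
    rmse_capm[i], rmse_net[i] = rmse(yi - X0 @ c0), rmse(yi - X1 @ c1)

gain = 100.0 * (1.0 - rmse_net / rmse_capm)         # percent lower RMSE relative to the CAPM
print(np.round(rho_hat, 2))
print(np.round(gain, 1))
```

When both regressions are fitted by least squares on the same sample, the in-sample RMSE of the larger model cannot exceed that of the CAPM; differences in estimator or sample, as with 2SLS, can break this ordering.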
In terms of RMSE, the clustering network model improves upon the CAPM for all stocks except two (Nike and JPMorgan Chase). This is reassuring but not surprising because the network model includes one additional regressor, the spatial lag. However, it is remarkable that the network model improves upon the FF-type model with all six factors for 11 (out of the 30) stocks. This provides encouraging validation of the clustering structural model proposed in this paper. Understanding trading activity and pricing in financial markets is an important problem in finance. It is our belief that the work here takes an important step in this direction.
In Table 2, we report the estimated alphas ($\alpha$) and betas ($\beta$) for the CAPM and network models. The distinction between the three estimated clusters is clear from the CAPM estimates. Further, as predicted by theory, there is strong correlation between the estimates from the two models; 0.68 for alpha and 0.76 for beta. However, also as expected from our theory, there is substantial bias in the CAPM model estimates; on average, the positive bias is about 66% in alpha and 30% in beta. The FF-type model with 6 factors is qualitatively similar. Since the time-period under study is not too long, we assume that $W$ and $\rho$ are constant over time, but that $\rho$ varies by stock (that is, $\rho_i$). In addition to the clustering model, we also applied the multiple testing procedure of Bailey et al. (2016) to construct a weighting matrix; in this case, the p-value of the CD test is 0.650, which indicates promising performance of this method in removing strong dependence. However, more work is required for structural understanding of the underlying trading mechanisms; this is also an avenue for future research.
We verify the robustness of our findings across several dimensions. First, we evaluate robustness in clustering. We use different starting clusters and different algorithms, all of which provide consistent results. Further, we account for the uncertainty in estimated alpha and beta parameters by multiple imputation based on estimated confidence intervals, and the results are consistent as well. Second, we evaluate robustness in choice of factors by considering two traditional factor models. One is the CAPM with a single market return factor, and the other is a 6-factor model including the Fama and French (1988, 1993) and Carhart (1997) factors. We find that our model almost always provides an improvement over the CAPM (in terms of RMSE), and is also frequently better than the 6-factor model. Third, we consider several network models: one, a contemporaneous spatial lag model; two, a space-time lag model; and finally, a sparse network model with unrestricted interactions. The main implications of our results are consistent across all three specifications.
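Of the three network specifications listed above, the space-time lag model is the simplest to estimate, since it involves only lagged endogenous effects. The sketch below fits it stock by stock with least squares. It assumes, purely for illustration, a single market factor and the form r_it = α_i + β_i m_t + ρ_i Σ_j w_ij r_j,t-1 + e_it with a known weighting matrix W; the function name and this exact form are assumptions, not the specification estimated in the paper.

```python
import numpy as np


def fit_space_time_lag(returns, market, W):
    """Stock-by-stock OLS for an assumed space-time lag model.

    returns: T x N array of stock returns
    market:  length-T array of market factor returns
    W:       N x N weighting matrix (row i holds the weights w_ij)
    Returns an N x 3 array with columns (alpha_i, beta_i, rho_i).
    """
    R = np.asarray(returns, dtype=float)
    m = np.asarray(market, dtype=float)
    W = np.asarray(W, dtype=float)
    T, N = R.shape
    # Row t of `lagged` holds W r_t, which is the regressor for period t + 1
    lagged = R[:-1] @ W.T
    coefs = np.empty((N, 3))
    for i in range(N):
        X = np.column_stack([np.ones(T - 1), m[1:], lagged[:, i]])
        coefs[i] = np.linalg.lstsq(X, R[1:, i], rcond=None)[0]
    return coefs
```

The contemporaneous spatial lag model cannot be handled this way, because its network regressor is endogenous; that is why instrumental variables methods are used for it.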

Conclusions
The Fama and French (1993) and similar factor models are important and popular in finance, and they provide good structural understanding of risk, returns and price formation. Typically, the model is estimated as a time series regression separately for each stock (firm). Such estimation would provide consistent estimates if the data are independent across firms. However, if there are any network effects, such estimates can be inefficient, or even inconsistent if the network effects are endogenous.
Indeed, persistent evidence of abnormal returns and cross-section correlations in stock returns points towards potential misspecification of the FF-type models. In this paper, we show that endogenous network effects create cross-section dependence that renders least squares estimation of FF-type factor models inconsistent; hence, computed returns and risk may both be erroneous.
Further, we argue that current econometric methods to deal with cross-section dependence are not applicable to the above factor models. This leads us to develop structural models to understand network effects better. We propose a social network model based on clustering and show that it lends itself to interesting structural interpretations. Applied to data on the 30 DJIA stocks, our model provides improved estimation of factor models and new understanding of trading activity and price formation. How the information in improved relative efficiencies can be harnessed for trading is a matter of further research and practice, which we leave for future work.
While our current evidence is limited to the DJIA stocks, this work provides the basis for further empirical validation and development of theory, as well as alternative structural models of trading activity. A larger temporal dimension would be useful in highlighting the weaknesses of the FF model, which ignores the structural cross-sectional interactions highlighted by our findings. However, capturing such interactions requires a potentially strong assumption that the nature and strength of interactions are constant over time. The validity of this assumption would become more tenuous with a larger sample, but can equally be verified using more data. The advantages of larger sample data would also be apparent with a larger cross-section dimension. The current paper is best viewed as a proof of concept that further research on structural network effects may be fruitful. Hence, our work provides several promising avenues for further research in the direction of market microstructure models and their applications.

Table 1. Clusters, Model Diagnostics and Relative Efficiencies.