
Estimating Forward-Looking Stock Correlations from Risk Factors

Swiss Institute of Banking and Finance, University of St. Gallen, 9000 St. Gallen, Switzerland
Authors to whom correspondence should be addressed.
Mathematics 2022, 10(10), 1649;
Received: 13 April 2022 / Revised: 9 May 2022 / Accepted: 10 May 2022 / Published: 12 May 2022
(This article belongs to the Special Issue Modern Mathematical Models in Investment: Theory and Practice)


This study provides fully mathematically and economically feasible solutions to estimating implied correlation matrices in equity markets. Factor analysis is combined with option data to obtain ex ante beliefs about cross-sectional correlations. Necessary conditions for implied correlation matrices to be realistic, both in a mathematical and in an economic sense, are developed. An evaluation of existing models reveals that none can comply with the developed conditions consistently. This study overcomes this pitfall and provides two estimation models by exploiting the underlying factor structure of returns. The first solution reformulates the task into a constrained nearest correlation matrix problem. This method can be used either as a stand-alone instrument or as a repair tool to re-establish the feasibility of another model's estimate. One of the properties it can restore is matrix invertibility, which is especially valuable for portfolio optimization tasks. The second solution transforms common risk factors into an implied correlation matrix. The solutions are evaluated in empirical experiments on S&P 100 and S&P 500 data. They turn out to require modest computational power and comply with the developed constraints. Thus, they provide practitioners with a reliable method to estimate realistic implied correlation matrices.

1. Introduction

Stock markets are forward looking by nature, and properly assessing future developments can be a decisive competitive advantage for any market participant. The option implied correlation matrix is an important tool, as it embodies the ex ante aggregated expectation about forward-looking dependencies between securities [1,2]. Sufficient knowledge about this matrix is a key requirement for pricing basket/index options [2,3], but is also important for other purposes, such as factor-based asset pricing [4,5], forecasting [1,6,7,8], trading [8,9,10,11], managing risks [1,12,13], or understanding market behavior [14,15]. Using the implied correlation matrix can be advantageous over using historical data for the above tasks in practice. The implied correlation matrix contains forward-looking market expectations and investors' current perception of risk [16]. Hence, it can be used to make predictions about the future without additional assumptions. Using historical data for predictions, however, comes with the assumption that patterns in the past are also decisive for the future. Empirical analyses show that implied estimates can be a better proxy for future realized volatility [17,18] and future returns [19]. It is also documented that mean-variance efficient portfolios constructed from implied correlations outperform those constructed from past data [10,20]. In a broader context, research shows that the characteristics of implied data beyond the second moment also have rich information content (e.g., [21]) and that, besides options, other instruments and channels can also be used to extract forward-looking information (e.g., [22,23]). Given the timeliness and richer information content of implied data [24], there is a broad interest in solving the cross-sectional implied correlation puzzle. The goal of this paper is to provide a fully feasible solution to that puzzle.
Despite the broad interest, finding a realistic solution to the puzzle is a challenging task [25]. The main difficulty of solving for implied correlations is that there are far fewer options traded than there are unknown correlation pairs within the matrix. Hence, the problem is underdetermined, and multiple solutions exist in general. Current models solve the problem but, as is shown later, they either cannot guarantee the important mathematical properties or impose unrealistic economic assumptions. In other words, while they deliver numerically correct results, those results can be rather useless from a practical standpoint.
The purpose of this paper is to overcome this shortfall in a two-stage process. Its first contribution is to answer the question on what requirements the implied correlation estimates must fulfill to be mathematically and economically feasible. In doing so, it identifies the shortcomings of existing models and argues that none of these models can be used as a reliable and realistic tool by investors to estimate implied correlation matrices. The second contribution of this paper is that it validates the hypothesis that fully feasible solutions can be generated via using the factor structure of returns. It proposes two alternative estimation methods to the problem at hand. Both solutions are based on the idea that a well-conditioned correlation matrix is composed of a set of factors. The first solution is quantitatively motivated. It computes the nearest implied correlation matrix with respect to a pre-specified target that can be freely chosen. Hence, it can be used as a stand-alone method or to re-establish the feasibility and invertibility of any other model’s estimate. This feature makes the quantitative solution especially attractive for applications in portfolio optimization. The second solution is economically motivated. It discusses the link between common risk factors (such as CAPM [26] or Fama-French [27,28]) and the corresponding expected correlation matrix. This closes a gap between factor-based asset pricing and ex ante beliefs. An empirical experiment shows that both solutions are computationally efficient and can be used reliably in practice.
As already discussed, multiple plausible solutions of the correlation matrix puzzle work numerically but can be rather useless from a mathematical and economic standpoint. To filter the pool of potential solutions, feasibility constraints were developed and grouped into two categories. More precisely, for a market covering $n$ securities, a correlation matrix $C \in \mathbb{R}^{n \times n}$ satisfies mathematical feasibility if the following criteria are met:
(i) It is symmetric;
(ii) All its elements lie within the $[-1, 1]$ interval;
(iii) It has unit diagonal;
(iv) It is positive semi-definite (psd).
Economic feasibility is fulfilled if the following criteria are met:
(v) The matrix is ex ante free of arbitrage;
(vi) The values have a realistic structure.
While the mathematical conditions are straightforward, the economic ones need a little more clarification. Ex ante arbitrage freeness (v) is interpreted in the following sense. Within a financial market, there are typically options traded on individual firms, but also on the market index directly. Hence, a minimum criterion for $C$ to meet condition (v) is that it properly aggregates the prices of the constituents to match the observable price of the basket option (cp. [1,2]):
$$(v): \quad \sigma_m^2 = w^\top \sigma \, C \, \sigma \, w \tag{1}$$
with $\sigma_m^2 \in \mathbb{R}_+$ as the implied variance of the market portfolio, $w \in \mathbb{R}^n$ the corresponding weight vector and $\sigma \in \mathbb{R}_+^{n \times n}$ a diagonal matrix of firm implied volatilities. If this condition does not hold, an arbitrageur can earn riskless profits by simultaneously buying the index options and selling the constituent options, or vice versa. Such profit opportunities should not persist: as soon as one emerges, the traders who exploit it eliminate it. Condition (vi) is also taken from the existing literature (such as [29]). Its loose definition implies that, in a real economy, the pair-wise correlations $C_{ij}$ are heterogeneous across stock pairs $i, j \in \{1, \dots, n\}$. A narrower definition is linked to the thousands of publications in finance that decompose returns into factor loadings (see [30,31]). Given this large body of literature, a realistic structure of implied correlations can be expected not only to be heterogeneous, but also to originate from factor loadings.
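The four mathematical conditions and the aggregation in Equation (1) can be checked numerically. The following is a minimal sketch, not part of the original study: the function names `is_mathematically_feasible` and `index_implied_variance`, the tolerance, and all numbers in the toy example are assumptions chosen for illustration.

```python
import numpy as np

def is_mathematically_feasible(C, tol=1e-10):
    """Check conditions (i)-(iv) for a candidate correlation matrix C."""
    C = np.asarray(C, dtype=float)
    symmetric = np.allclose(C, C.T, atol=tol)              # (i)
    bounded = np.all(np.abs(C) <= 1.0 + tol)               # (ii)
    unit_diag = np.allclose(np.diag(C), 1.0, atol=tol)     # (iii)
    # (iv): smallest eigenvalue of the symmetrized matrix must be non-negative
    psd = np.linalg.eigvalsh((C + C.T) / 2.0).min() >= -tol
    return bool(symmetric and bounded and unit_diag and psd)

def index_implied_variance(w, vols, C):
    """Right-hand side of Equation (1): w' sigma C sigma w."""
    s = np.asarray(w, dtype=float) * np.asarray(vols, dtype=float)  # sigma w
    return float(s @ np.asarray(C, dtype=float) @ s)

# Toy market with three stocks (made-up numbers)
C = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
w = np.array([0.5, 0.3, 0.2])
vols = np.array([0.20, 0.25, 0.30])

var_m = index_implied_variance(w, vols, C)
print(is_mathematically_feasible(C), var_m)
```

Condition (v) then amounts to comparing `var_m` with the implied variance observed for the index option; conditions (v) in its wider sense and (vi) are economic judgments that such a check cannot automate.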
To come up with a solution that matches all six conditions (i)–(vi), $C$ is generated from an inherent $k$-factor structure [32,33]. In its final form, the factor structure is derived as
$$C(X) = J \circ X X^\top + I \tag{2}$$
having $X \in \mathbb{R}^{n \times k}$ as factor loadings, $\circ$ the Hadamard product, $I$ the identity matrix, and $J := 1_{n \times n} - I$ with $1_{n \times n}$ as the matrix of ones of the dimension indicated in the subscript. This structured representation of the implied correlation matrix has the advantage that controlling for the above requirements (i)–(vi) is easy, which allows this paper to provide fully realistic solutions.
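As a small illustration of Equation (2), the sketch below builds $C(X)$ from a loading matrix. The helper name `factor_corr` and the loadings are hypothetical; each row of `X` is chosen so that its squared entries sum to less than one.

```python
import numpy as np

def factor_corr(X):
    """C(X) = J o (X X') + I, with J = ones - I and 'o' the Hadamard product."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    I = np.eye(n)
    J = np.ones((n, n)) - I        # zero diagonal, ones elsewhere
    return J * (X @ X.T) + I       # '*' is element-wise, '@' is the matrix product

# Four stocks, two factors (made-up loadings)
X = np.array([[0.8,  0.1],
              [0.6, -0.3],
              [0.5,  0.4],
              [0.7,  0.2]])
C = factor_corr(X)
print(np.round(C, 3))
```

The off-diagonal entry for stocks 1 and 2 is the inner product of their loading rows, 0.8 · 0.6 + 0.1 · (−0.3) = 0.45, while the diagonal is set to one by construction.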
The paper is organized as follows. Section 2 provides a comparison of existing models and identifies their violations of mathematical/economic feasibility. Section 3 discusses the factor model and the proposed solutions to the implied correlation puzzle. In Section 4, the empirical implementation on S&P 100 and S&P 500 data is evaluated. Section 5 concludes. Throughout the paper, combined matrix and Hadamard products are written in the form $J \circ X X^\top$. The corresponding multiplication order should be read as $J \circ (X X^\top)$ and not as $(J \circ X) X^\top$ (i.e., 'matrix-before-Hadamard product'). The brackets are dropped to increase readability.

2. Discussion on Existing Models

From a literature review, four distinct concepts for solving the implied correlation matrix are identified. Each concept is presented briefly and evaluated against the feasibility constraints developed above. Before doing so, it must be highlighted that, in quantitative finance, it is often more convenient to work with implied volatilities directly instead of option prices (cp. [34]). This also applies to implied correlations. Given non-flat implied volatility surfaces, a common convention is to use options of the same strike and maturity [1,2,9]. The existing concepts for solving implied correlation matrices are given as follows.
  • Equi-Correlations
The literature on implied correlations was initiated by [1] in 2005. This model is not only cited in most of the subsequent research in that direction, but it is also used as the fundamental concept behind the CBOE implied correlation index. The idea of the method is as follows. Since, for a given point in time, the weights $w$ and the forward-looking measures $\sigma_m^2$ and $\sigma$ are available (or can be estimated), each off-diagonal element within $C$ can be set to the same scalar $\bar{c}$ to match the equilibrium of Equation (1). The equi-correlation $\bar{c}$ is then easily computed by
$$\bar{C} := \bar{c} J + I: \quad \sigma_m^2 = w^\top \sigma \bar{C} \sigma w \;\Longrightarrow\; \bar{c} = \frac{\sigma_m^2 - w^\top \sigma^2 w}{w^\top \sigma J \sigma w} \tag{3}$$
(cp. [5]). From the proof of [33] (Lem. 2.1), it follows that $\bar{C}$ is psd if $-1/(n-1) \le \bar{c} \le 1$, which is also sufficient for mathematical feasibility. Within an economy, securities are typically, on average, positively correlated (the CBOE implied correlation index is also consistently positive), so this should not be much of an issue empirically. However, it is unrealistic that each correlation pair inside $C$ takes on the same value, and hence $\bar{C}$ is seen as economically unfeasible and violates condition (vi) (cp. [9,29]). Additionally, $\bar{C}$ clearly rejects the existence of a (risk-)factor structure (also discussed in [5]), which contradicts the empirical asset pricing literature. In case options are also traded on sub-indices of the market index (e.g., S&P 500 sector indices), it is unlikely that the equi-correlation matrix will match them. Hence, this is a violation of condition (v). For those reasons, the equi-correlation approach is more frequently used as an index of average diversification possibilities rather than as a correlation matrix per se (e.g., [14]).
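The closed form in Equation (3) translates into a few lines of code. This is a minimal sketch under stated assumptions: the function name `equi_correlation` is hypothetical, and the weights, implied volatilities and index implied variance are made-up numbers.

```python
import numpy as np

def equi_correlation(w, vols, var_m):
    """Solve Equation (3) for the scalar equi-correlation c_bar."""
    s = np.asarray(w, dtype=float) * np.asarray(vols, dtype=float)  # sigma w
    own = float((s ** 2).sum())            # w' sigma^2 w
    cross = float(s.sum() ** 2) - own      # w' sigma J sigma w
    return (var_m - own) / cross

w = np.array([0.5, 0.3, 0.2])
vols = np.array([0.20, 0.25, 0.30])
var_m = 0.031825                           # hypothetical index implied variance

c_bar = equi_correlation(w, vols, var_m)
n = len(w)
C_bar = c_bar * (np.ones((n, n)) - np.eye(n)) + np.eye(n)

s = w * vols
repriced = float(s @ C_bar @ s)            # should reproduce var_m
is_psd = -1.0 / (n - 1) <= c_bar <= 1.0    # psd range quoted from [33] (Lem. 2.1)
print(c_bar, repriced, is_psd)
```

With these inputs the equi-correlation is about 0.35, and plugging `C_bar` back into Equation (1) reproduces the index variance, which is all that this model guarantees.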
  • Local Equi-Correlations
The violation of the equilibrium (Equation (1)) caused by existing sub-index options can be easily resolved by introducing local equi-correlations, where the correlations are averaged out within sub-squares of the matrix. This idea is mentioned in [35], who use options on the S&P 500 market index and 10 S&P 500 sector sub-indices. Here, the average correlation within each sub-index is computed first, before the global equi-correlation outside the sub-portfolios is computed. For a sub-portfolio $p$, Equation (1) must hold for its weights $w_p$ and volatility $\sigma_p$. Local equi-correlations of different $p$ are always centered along the diagonal of $C$. This raises the issue that large parts of $C$ cannot be covered and, with a handful of sub-indices and a very large number of unknown correlation pairs, the resulting correlation matrix will still give a very blurred and unrealistic picture. For example, if $n = 500$, there are 124,750 unknown correlation pairs. With 10 sub-indices, there are only 11 distinct estimates. For this definition of local equi-correlations, condition (iv) does not hold, as the resulting matrix is not necessarily psd. Additionally, as argued in [5], the known factor structure of security markets (condition (vi)) will still be ignored. The work of [5] further demonstrates how the factor structure can be recovered by also calibrating off-diagonal local equi-correlations. However, this refinement also yields a very blurred estimate of the implied correlation matrix. Hence, as the local equi-correlation method only slightly improves precision compared to the global equi-correlation, it still produces unrealistic estimates of $C$ and is thus economically not feasible.
  • Adjusted Ex Post
This method combines (local) equi-correlations with backward-looking estimates to approximate the forward-looking implied correlation matrix [9,35]. For the discussion of this model, it has to be kept in mind that, due to the stochastic nature of variances and correlations, option prices are documented to carry risk premia for variance (VRP) and for correlation (CRP; see [36,37,38,39]). Let $\mathbb{P}$ denote investor expectations under the physical and $\mathbb{Q}$ under the risk-neutral probability measure. Option implied volatilities are known to be $\mathbb{Q}$-measured. The physically expected correlation matrix can be denoted by $C_{\mathbb{P}}$. Then, the difference between implied market and aggregated individual volatilities can be used to express the volatility-quoted ex ante CRP as $CRP = \sigma_m^2 - w^\top \sigma C_{\mathbb{P}} \sigma w$ (cp. [6,37]). Building on this idea, $A$ can be defined as a backward-looking estimated correlation matrix; in the example of [9], it is simply the 1-year historically realized return correlations, but $A$ can also be chosen from a more sophisticated model (e.g., incorporating mean-reversion [35]). For the model of [9] (indicated by the superscript $AEP1$), two crucial assumptions are required. First, the backward-looking $A$ is assumed to be equivalent to the forward-looking $\mathbb{P}$ matrix, $A \equiv C_{\mathbb{P}}$. While investors use backward-looking information to form their beliefs, there is no fundamental reason why this should hold, as $A$ does not carry information on the market outlook (see, e.g., [1]). The second assumption is that the correlation risk premium enters into the matrix in the specific form of
$$C_{\mathbb{Q}}^{AEP1} := C_{\mathbb{P}} - \alpha \left( 1_{n \times n} - C_{\mathbb{P}} \right) \tag{4}$$
with $\alpha$ as a scalar calibrating for the correlation risk premium and $C_{\mathbb{Q}}^{AEP1}$ as their estimate of the implied correlation matrix. The two assumptions come with mathematical convenience. Let $\hat{\alpha} := -\alpha$; then, by rearranging terms, the equation can be brought into the form $C_{\mathbb{Q}}^{AEP1} = \hat{\alpha} 1_{n \times n} + (1 - \hat{\alpha}) C_{\mathbb{P}}$, known in the literature as the 'weighted average correlation matrix' [40]. Since both $1_{n \times n}$ and $C_{\mathbb{P}}$ are psd, and a convex combination of two psd matrices is also psd, the mathematical feasibility of $C_{\mathbb{Q}}^{AEP1}$ holds for $\alpha \in (-1, 0]$, which is fulfilled when $CRP > 0$. Empirically, however, $\alpha$ is likely to fall outside that range, and this method does not comply with condition (iv) in general. For example, on monthly S&P 100 data from 1996 to 2020 with $A$ as the one-year historically realized correlation matrix, it can be observed that $CRP > 0$ in 156 and $CRP < 0$ in 144 of the 300 monthly estimates; thus, the requirement did not hold 48% of the time. This rejection of mathematical feasibility is also the main critique stated in [29], who provide a workaround for the $CRP < 0$ cases. In this model, whenever $\alpha > 0$, the matrix $1_{n \times n}$ of Equation (4) is replaced by the 'equi-correlation lower-bound' $L$, defined as $\forall i \ne j: L_{ij} = -1/(n-1)$ with unit diagonal. Here, $L$ is simply the smallest possible psd equi-correlation matrix (see the discussion on equi-correlations above). Following the modification of [29] (indicated by $AEP2$), the implied correlation matrix is computed by
$$C_{\mathbb{Q}}^{AEP2} = \begin{cases} \hat{\alpha} 1_{n \times n} + (1 - \hat{\alpha}) C_{\mathbb{P}}, & \text{for } CRP \ge 0 \Leftrightarrow \hat{\alpha} \ge 0 \\ \hat{\alpha} L + (1 - \hat{\alpha}) C_{\mathbb{P}}, & \text{for } \hat{\alpha} < 0 \end{cases} \tag{5}$$
According to [9], consistent investor preferences require that all $\mathbb{P}$-correlation pairs are scaled in the same direction under general risk aversion; this means up if $CRP > 0$ and down if $CRP < 0$. This consistency clearly holds in [9,29] for the $\hat{\alpha} \ge 0$ cases. Analyzing the workaround of [29] in greater detail, one recognizes that the off-diagonal entries inside $L$ are slightly negative and close to zero (e.g., −0.01 for $n = 100$ and −0.002 for $n = 500$). As a result, for $\hat{\alpha} < 0$, almost every negative $\mathbb{P}$-correlation pair will be up-scaled, while positive ones will be down-scaled in the $\mathbb{P}$-$\mathbb{Q}$ transformation. Therefore, the workaround of [29] repairs the mathematical feasibility of the basic adjusted ex post model, but at the same time causes an inconsistent implementation of the correlation risk premium, thus violating condition (v). This leads to a rejection of economic feasibility and leaves an insufficient solution for the implied correlation matrix. While mathematical/economic flaws persist, the adjusted ex post method seems to be the most realistic of the concepts in place. The nearest implied correlation algorithm introduced in Section 3 can be applied to these models to repair feasibility.
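To make the weighted-average form concrete, the sketch below calibrates $\hat{\alpha}$ so that Equation (1) holds. The calibration formula is not quoted from [9]; it is an assumption that follows from inserting $\hat{\alpha} 1_{n \times n} + (1 - \hat{\alpha}) A$ into Equation (1) and solving the resulting linear equation. The function name `adjusted_ex_post` and all inputs are hypothetical.

```python
import numpy as np

def adjusted_ex_post(A, w, vols, var_m):
    """Weighted-average form C_Q = a * ones + (1 - a) * A, with a set so Equation (1) holds."""
    A = np.asarray(A, dtype=float)
    s = np.asarray(w, dtype=float) * np.asarray(vols, dtype=float)  # sigma w
    var_A = float(s @ A @ s)           # w' sigma A sigma w
    var_ones = float(s.sum() ** 2)     # w' sigma 1 sigma w
    crp = var_m - var_A                # correlation risk premium as defined in the text
    alpha_hat = crp / (var_ones - var_A)
    C_Q = alpha_hat * np.ones_like(A) + (1.0 - alpha_hat) * A
    return alpha_hat, C_Q

# Backward-looking estimate A and market data (made-up numbers)
A = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
w = np.array([0.5, 0.3, 0.2])
vols = np.array([0.20, 0.25, 0.30])

alpha_pos, C_pos = adjusted_ex_post(A, w, vols, var_m=0.040)  # CRP > 0
alpha_neg, C_neg = adjusted_ex_post(A, w, vols, var_m=0.025)  # CRP < 0
s = w * vols
print(alpha_pos, float(s @ C_pos @ s), alpha_neg)
```

In the first case the weight lies in $[0, 1)$ and the result is a convex combination of two psd matrices. In the second case the weight is negative, which is the situation where psd-ness is no longer guaranteed and the workaround of [29] is applied.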
  • Skewness Approach
Of note, Refs. [10,12] use a CAPM-like model, introduce economic assumptions to cancel out mathematical relationships, and estimate implied correlations between a stock and the market portfolio by combining option-implied volatilities with risk-neutral skewness. Since CAPM is a factor model, the estimates correspond to $X$ of Equation (2), and their approach can thus be used for a solution to the implied correlation matrix. However, this model does not conform to the market condition (Equation (1)), nor does it respect the bounds stated in condition (ii). Hence, neither mathematical nor economic feasibility is ensured. On the other hand, the psd condition is easily met when combined with the factor model of Equation (2).
Table 1 summarizes the assessment of existing models with respect to the constraints developed in Section 1. It can be observed that none of the presented models meet all feasibility constraints. It appears that while most models handle mathematical feasibility relatively well, they do so at the expense of being economically justifiable. From the results of Table 1, it can be hypothesized that all drawbacks can be overcome when an implied correlation matrix is estimated from factors. This hypothesis is tested in the following section.

3. Solutions from Factor Structures

This section develops the two proposed approaches to estimate a mathematically and economically feasible correlation matrix. The section starts with a discussion on how correlation matrices can be estimated from factors. Thereafter, quantitative and economic estimation approaches are introduced that both ensure meeting the mathematical and economical feasibility constraints.
Generally, implied volatilities are used as input parameters, which can be estimated in various ways (see [34] for a detailed discussion). Depending on the kind of implied volatilities used (e.g., centered vs. directly parameterized [41]), the methodology is not limited to Pearson-type correlation matrices and can also be used within more sophisticated option pricing models that adjust for non-normal distributions. An example for the case of a multivariate variance-gamma process can be found in Appendix A.
The correlation structure that is typical for financial markets can be expressed with the multi-factor copula model described by [32]. This model is used in a nearest correlation matrix context by [33]. The factor generating core of the model follows a simple but intuitive definition, that is
$$\xi = X \eta + F \epsilon \tag{6}$$
where $\xi \in \mathbb{R}^n$ describes a random vector, $F \in \mathbb{R}^{n \times n}$ a diagonal matrix and $\eta \in \mathbb{R}^k$ corresponds to the factor's magnitude. All three vectors, $\xi$, $\eta$ and $\epsilon \in \mathbb{R}^n$, are defined to have zero mean and unit variance; $\eta$ and $\epsilon$ are orthogonal,
$$E[\xi] = E[\eta] = E[\epsilon] = 0, \quad var(\xi) = var(\eta) = var(\epsilon) = 1, \quad cov(\eta, \epsilon) = 0 \tag{7}$$
From this, it follows that
$$cov(\xi) = E[\xi \xi^\top] = X X^\top + F^2 \tag{8}$$
Since $\xi$ has unit variance, $cov(\xi)$ turns out to be a correlation matrix with the boundaries of
$$\forall i = \{1, \dots, n\}: \quad \sum_{d=1}^{k} X_{i,d}^2 + F_{ii}^2 = 1 \;\Rightarrow\; \sum_{d=1}^{k} X_{i,d}^2 \le 1 \tag{9}$$
such that $X$ is necessarily limited to $[-1, 1]$. From the above equation, it follows that every $X_{i,d}^2$ corresponds to the goodness-of-fit of stock $i$ explained by the risk factor $d$, and $F_{ii}^2$ to the unsystematic correlation, which cannot be explained by the given set of factors. From the notation, it follows that $X_{i,d}$ itself can be interpreted as the correlation of a stock $i$ to a risk factor $d$ (e.g., $X_{i,m} = corr(i, m)$ in CAPM). $F^2$ itself is of less importance for further modeling, as it can be easily computed once a feasible solution to $X$ is found. Therefore, the factor-structured correlation matrix of Equation (8) can be equivalently rewritten in a form where $F^2$ is suppressed and $X$ remains the only unknown. This reformulation is used in the specific context as the factor-structured implied correlation matrix $C(X)$, which is denoted by Equation (2). Other works in the literature (e.g., [33]) write $C(X) = X X^\top - diag(X X^\top) + I$ with $diag(X X^\top)$ as the diagonal matrix of $X X^\top$. This alternative definition yields the same $C(X)$; however, given the cleaner notation of Equation (2) and its superiority in terms of computation, as $J$ can be pre-computed (thus not iterated) while $diag(X X^\top)$ cannot, Equation (2) is preferred in this paper. Below, it is shown how Equation (2) generates a well-conditioned correlation matrix.
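The equivalence of the three representations can be verified numerically. The sketch below uses hypothetical loadings; it recovers $F^2$ from Equation (9) and compares Equation (8), Equation (2) and the $diag$-based form used in [33].

```python
import numpy as np

# Three stocks, two factors (made-up loadings with row-wise squared sums below one)
X = np.array([[0.8,  0.1],
              [0.6, -0.3],
              [0.5,  0.4]])
n = X.shape[0]
G = X @ X.T                                  # X X'

explained = (X ** 2).sum(axis=1)             # sum_d X_{i,d}^2 for each stock
F2 = np.diag(1.0 - explained)                # idiosyncratic part F^2 from Equation (9)

C_cov = G + F2                                           # Equation (8)
C_hadamard = (np.ones((n, n)) - np.eye(n)) * G + np.eye(n)  # Equation (2)
C_diag = G - np.diag(np.diag(G)) + np.eye(n)             # diag-based form

print(np.allclose(C_cov, C_hadamard), np.allclose(C_cov, C_diag))
```

All three constructions only differ in how the diagonal of $X X^\top$ is replaced by ones, which is why $F^2$ never has to be carried along explicitly.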
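Theorem 1.
The matrix $C(X)$ is a mathematically feasible correlation matrix whenever
$$X \in \Omega := \Big\{ X \in \mathbb{R}^{n \times k} : \sum_{d=1}^{k} X_{i,d}^2 \le 1, \; \forall i = \{1, \dots, n\} \Big\} \tag{10}$$
Proof. Since $X X^\top$, $J$ and $I$ are symmetric, it follows that $C(X)$ is symmetric and condition (i) is guaranteed. From $X \in \Omega$ it follows that condition (ii) holds, because the inner product of two rows with Euclidean norm of at most one is bounded by one in absolute value (Cauchy–Schwarz inequality). As for condition (iii), $X X^\top \circ J$ has zero diagonal such that addition of $I$ guarantees unit diagonal. The off-diagonal elements of $C(X)$ are generated from $X X^\top$, which is positive semi-definite by construction. Since $diag(C(X)) = 1_{n \times 1} \ge diag(X X^\top)$, the difference $C(X) - X X^\top$ is a diagonal matrix with non-negative entries, and it follows that $C(X)$ is positive semi-definite given that $X \in \Omega$. Therefore, conditions (i)–(iv) are met.    □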
In empirical applications, it is unrealistic that asset returns are perfectly (anti-)correlated with risk factors. Therefore, excluding −1 and 1 from the factor loadings, X, is not only more appropriate, but it will also come with the practical convenience that it guarantees C ( X ) to be non-singular (invertible). This is practical for portfolio optimizations where the inverse of the correlation matrix can be used for closed-form solutions. An example of such an optimization task is the computation of the global minimum variance portfolio.
Lemma 1.
The correlation matrix $C(\bar{\bar{X}})$ is invertible if its factor loadings $\bar{\bar{X}}$ are not perfectly (anti-)correlated,
$$\bar{\bar{X}} \in \bar{\bar{\Omega}} := \Big\{ \bar{\bar{X}} \in \mathbb{R}^{n \times k} : \sum_{d=1}^{k} \bar{\bar{X}}_{i,d}^2 < 1, \; \forall i = \{1, \dots, n\} \Big\} \tag{11}$$
Proof. For the one-factor case of $k = 1$, Ref. [33] (Cor. 3.3) showed that non-singularity holds if at most one entry in $|X|$ is 1. Hence, $C(\bar{\bar{X}})$ subject to $k = 1$ is guaranteed to be non-singular. The behavior for $k > 1$ is similar in the sense that at most one entry per column of $|X|$ is allowed to take on 1. This condition becomes more obvious when looking at the two-factor case $k = 2$. In this case, $X$ is structured as
$$X = \begin{pmatrix} y & u \\ z & 0 \end{pmatrix} \;\Rightarrow\; X X^\top = \begin{pmatrix} y y^\top + u u^\top & y z^\top \\ z y^\top & z z^\top \end{pmatrix} \tag{12}$$
where $y, u$ represent vectors of dimension $m = n - q$, and $z$ is of dimension $q$. The vectors have the characteristics of $y, u \in (-1, 1)$ and $|z| = 1_{q \times 1}$. Hence, the first column of $|X|$ has $q$ entries equal to 1. When $X \in \Omega$, it follows that all $X$ entries in the same row as $z$ take on zero (if not $z$ itself). This means that the columns of $X$ can be arbitrarily exchanged without affecting $X X^\top$. Further, from the permutation argument of [33] (Cor. 3.3), it is known that the position of $z$ does not matter. The upper-left corner of the correlation matrix can be abbreviated by $C_1(X) = (y y^\top + u u^\top) \circ J_{m \times m} + I_{m \times m}$, which represents a feasible correlation matrix itself. Thus, $C(X)$ is alternatively represented by
$$C(X) = \begin{pmatrix} C_1(X) & y z^\top \\ z y^\top & z z^\top \end{pmatrix} \tag{13}$$
which [33] (Cor. 3.3) showed to be non-singular if $X \in \Omega$ and $q \le 1$. Therefore, when using $\bar{\bar{X}}$ instead, it follows that $q = 0$ and $C(\bar{\bar{X}})$ is guaranteed to be invertible. Since additional factors $k > 2$ enter into $C_1(\bar{\bar{X}})$ only, they do not change the conclusion.    □
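A small numerical illustration of Lemma 1, under stated assumptions: the helper `factor_corr`, the one-factor loadings and the implied volatilities are hypothetical. The first loading vector lies on the boundary of $\Omega$ with two entries of absolute value one, the second lies strictly inside. The last lines use the invertible matrix for the textbook global minimum variance weights $\Sigma^{-1} 1 / (1^\top \Sigma^{-1} 1)$ with $\Sigma = \sigma C \sigma$.

```python
import numpy as np

def factor_corr(X):
    """C(X) = J o (X X') + I as in Equation (2)."""
    n = X.shape[0]
    I = np.eye(n)
    return (np.ones((n, n)) - I) * (X @ X.T) + I

# One factor, two loadings with |X| = 1: X is in Omega, but C(X) is singular
X_boundary = np.array([[1.0], [1.0], [0.5]])
# Same structure with loadings strictly inside (-1, 1): C(X) is invertible
X_interior = np.array([[0.99], [0.99], [0.5]])

rank_boundary = np.linalg.matrix_rank(factor_corr(X_boundary))
rank_interior = np.linalg.matrix_rank(factor_corr(X_interior))

# Global minimum variance weights from the invertible matrix
vols = np.array([0.20, 0.25, 0.30])          # made-up implied volatilities
Sigma = np.diag(vols) @ factor_corr(X_interior) @ np.diag(vols)
x = np.linalg.solve(Sigma, np.ones(3))       # Sigma^{-1} 1 without forming the inverse
w_gmv = x / x.sum()
print(rank_boundary, rank_interior, w_gmv)
```

In the boundary case the first two rows of $C(X)$ coincide, so the matrix has rank two and the closed-form portfolio solution is not available.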
For economic feasibility, Equation (2) needs to be combined with the equilibrium conditions of Equation (1). Therefore, the options traded on the market index and its sub-indices restrict possible values of $X$. These restrictions are referred to as 'market constraints' from now on. Given $n_c$ market constraints, the general solution to the factor-structured implied correlation matrix now evolves as
$$C(X) \quad \text{subject to} \quad X \in \Omega, \quad \sigma_j^2 = w_j^\top \sigma \, C(X) \, \sigma \, w_j, \quad \forall j = \{1, \dots, n_c\} \tag{14}$$
From this definition, it follows that $\Omega$ is a closed convex set of real numbers. This set can be expressed as inequality constraints, and the market constraints are of the equality type. From the equality constraints, it follows that the feasible set is a subset (hypersurface if $n_c = 1$) within $\Omega$, denoted by $\dot{\Omega}$ in the following. For optimization purposes, it is convenient to formulate the constraints as follows.
Definition 1.
Let $h(X)$ describe the inequality constraints,
$$h(X) := 1_{n \times 1} - (X \circ X) 1_{k \times 1}, \qquad h(X): \mathbb{R}^{n \times k} \to \mathbb{R}^n \tag{15}$$
such that $h(X) \ge 0_{n \times 1}$ guarantees that $X \in \Omega$, which is necessary and sufficient for mathematical feasibility.
In practical terms, when one wants to explicitly ensure that the implied correlation matrix is invertible, the easiest way to do so is by introducing a very small tolerance $tol$ and requiring $h(X) \ge tol_{n \times 1}$.
Definition 2.
Let $g(X)$ define the vector of market constraints, which simply stacks Equation (1) for every (sub-)index that has option contracts traded upon,
$$\forall j = \{2, \dots, n_c\}: \quad g(X) = \begin{pmatrix} w^\top \sigma \, C(X) \, \sigma \, w \\ w_j^\top \sigma \, C(X) \, \sigma \, w_j \end{pmatrix} - \begin{pmatrix} \sigma_m^2 \\ \sigma_j^2 \end{pmatrix}, \qquad g(X): \mathbb{R}^{n \times k} \to \mathbb{R}^{n_c} \tag{16}$$
with $j = 1$ reserved for the market portfolio $m$. Hence, $g(X) = 0_{n_c \times 1}$ is necessary for economic feasibility.
The equality constraint is necessary, but not sufficient for economic feasibility. The best example is the (local) equi-correlation model, which meets this requirement but yields an unrealistic implied correlation matrix, such that economic feasibility is not met.
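Definitions 1 and 2 map directly onto two small functions. The sketch below is illustrative only: the names `h`, `g` and `factor_corr`, the array layout (one row of `W` per index or sub-index) and all numbers are assumptions, and the "observed" implied variances are generated from the toy loadings so that the example is consistent by construction.

```python
import numpy as np

def factor_corr(X):
    """C(X) = J o (X X') + I as in Equation (2)."""
    n = X.shape[0]
    I = np.eye(n)
    return (np.ones((n, n)) - I) * (X @ X.T) + I

def h(X):
    """Inequality constraints of Definition 1: one minus the row sums of squared loadings."""
    return 1.0 - (X * X).sum(axis=1)

def g(X, W, vols, implied_vars):
    """Market constraints of Definition 2; row j of W holds the weights of (sub-)index j."""
    S = W * vols                                     # row j is (sigma w_j)'
    C = factor_corr(X)
    model_vars = np.einsum("ji,ik,jk->j", S, C, S)   # w_j' sigma C(X) sigma w_j
    return model_vars - implied_vars

X = np.array([[0.8, 0.1], [0.6, -0.3], [0.5, 0.4], [0.7, 0.2]])
vols = np.array([0.20, 0.25, 0.30, 0.22])
W = np.array([[0.4, 0.3, 0.2, 0.1],     # market index (j = 1)
              [0.6, 0.4, 0.0, 0.0]])    # hypothetical sector sub-index

implied_vars = g(X, W, vols, np.zeros(2))   # toy targets generated from X itself
tol = 1e-8
print(np.all(h(X) >= tol), g(X, W, vols, implied_vars), g(0.9 * X, W, vols, implied_vars))
```

Shrinking the loadings by 10% keeps $X$ inside $\Omega$ but breaks the market constraints, which shows that the two sets of constraints restrict $X$ in different ways.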
In general, the number of unknown correlation pairs of a correlation matrix $C(X)$ as in Equation (14) is $n(n-1)/2$, and the number of observable implied volatilities is $n + n_c$. Empirically, it is given that $n + n_c \ll n(n-1)/2$ almost surely, and hence, many different numerical solutions to $X$ exist. To come up with a reasonable choice of $X$, this paper argues in favor of two approaches: first, a purely quantitative approach computing the feasible implied correlation matrix that is nearest to some pre-specified target; second, an economic approach using factor-based asset pricing models, assuming that $\mathbb{P}$-expected factor loadings can be estimated.

3.1. Quantitative Approach: Computing Nearest Implied

This solution is based on the case when there exists an educated guess about the forward-looking correlation matrix. For example, this educated guess could be derived from historically realized correlations, from a GARCH forecast, or from a not fully feasible estimate of the models discussed in Section 2. As before, this educated guess is denoted by $A$, but in contrast to [9,29], the assumption that $A \equiv C_{\mathbb{P}}$ is not required. Such an educated guess allows searching for a $C(X)$ that is as similar as possible to $A$, but satisfies all market constraints in order to be considered a feasible implied correlation matrix.

3.1.1. Formulating the Problem

Finding the matrix most similar to a target matrix is known in the mathematical literature as the nearest correlation matrix problem (e.g., [33]). Similarity between the generated $C(X)$ and the target matrix $A$ can be quantified by the squared Frobenius norm between them,
$$f(X) = \lVert C(X) - A \rVert_F^2, \qquad f(X): \mathbb{R}^{n \times k} \to \mathbb{R}_+ \tag{17}$$
which, when introducing $\hat{A} = A - I$, can also be written as
$$f(X) = \lVert J \circ X X^\top - \hat{A} \rVert_F^2 \tag{18}$$
Lemma 2.
The gradient of $f(X)$ is
$$\nabla_X f = 4 \, (J \circ X X^\top - \hat{A}) \, X, \qquad \nabla_X f \in \mathbb{R}^{n \times k} \tag{19}$$
Proof. For simplicity, let $M := J \circ X X^\top - \hat{A}$ and the Frobenius product (trace operator) be denoted by the colon symbol ':', such that $tr(M^\top M) = M : M$. The problem can then be written as $f(X) = \lVert M \rVert_F^2 = M : M$ and its differential is thus $df = 2 M : dM$. The differential of $M$ itself is $dM = J \circ d(X X^\top)$, and $d(X X^\top) = dX \, X^\top + X \, dX^\top$. The operators ':' and $\circ$ commute with each other, and hence $M : (J \circ d(X X^\top)) = (M \circ J) : d(X X^\top)$. Therefore, substituting back in yields
$$df = 2 \, (J \circ M) : (dX \, X^\top + X \, dX^\top) = 2 \, (J \circ M + J \circ M^\top) : dX \, X^\top, \tag{20}$$
since $J$ and $M$ are symmetric, this reduces to $df = 4 \, (J \circ M) X : dX$, and
$$\nabla_X f = \frac{\partial f}{\partial X} = 4 \, (J \circ M) \, X \tag{21}$$
follows. As $J \circ J = J$ and $J \circ \hat{A} = \hat{A}$, plugging back in $M$ gives Equation (19). This result is a reduced-form equivalent to the gradient as found in [33], who derived it as $\nabla_X f = 4 \, (X X^\top - diag(X X^\top) - \hat{A}) \, X$.    □
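The gradient of Lemma 2 can be cross-checked against central finite differences. The sketch below is an illustrative check, not part of the original derivation; the function names, the random seed and the randomly generated symmetric target with unit diagonal are assumptions.

```python
import numpy as np

def f(X, A):
    """Objective of Equation (18): squared Frobenius distance between C(X) and A."""
    n = A.shape[0]
    J = np.ones((n, n)) - np.eye(n)
    M = J * (X @ X.T) - (A - np.eye(n))
    return float((M * M).sum())

def grad_f(X, A):
    """Analytic gradient of Equation (19): 4 (J o X X' - A_hat) X."""
    n = A.shape[0]
    J = np.ones((n, n)) - np.eye(n)
    M = J * (X @ X.T) - (A - np.eye(n))
    return 4.0 * M @ X

def numerical_grad(func, X, eps=1e-6):
    """Central finite differences, one entry of X at a time."""
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        E = np.zeros_like(X)
        E[idx] = eps
        G[idx] = (func(X + E) - func(X - E)) / (2.0 * eps)
    return G

rng = np.random.default_rng(0)
n, k = 5, 2
X = rng.uniform(-0.6, 0.6, size=(n, k))     # rows stay inside Omega
B = rng.uniform(-0.5, 0.5, size=(n, n))
A = (B + B.T) / 2.0                         # symmetric target
np.fill_diagonal(A, 1.0)                    # with unit diagonal

max_err = np.abs(grad_f(X, A) - numerical_grad(lambda Z: f(Z, A), X)).max()
print(max_err)
```

The unit diagonal of the target matters here: it makes $\hat{A}$ zero on the diagonal, which is what the step $J \circ \hat{A} = \hat{A}$ in the proof relies on.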
Popular optimization methods such as sequential quadratic programming (SQP) build on the Lagrangian function, which for the problem at hand can be formulated as
$$\mathcal{L}(X, \lambda, \kappa) = f(X) + \lambda^\top g(X) + \kappa^\top h(X) \tag{22}$$
with $\lambda \in \mathbb{R}^{n_c}$ and $\kappa \in \mathbb{R}^n$ representing the Lagrangian multipliers of the market constraints and the inequality constraints introduced in Section 3, respectively. Working with the Lagrangian is probably the most common practice, and hence the respective gradients are also reported below.
Lemma 3.
The gradient of $\lambda^\top g(X)$ with respect to $X$ is
$$\nabla_X (\lambda^\top g(X)) = 2 \, \sigma \Big( J \circ \sum_{j=1}^{n_c} \lambda_j w_j w_j^\top \Big) \sigma X \tag{23}$$
Proof. $\lambda^\top g(X)$ can be written as $\sum_{j=1}^{n_c} \lambda_j g_j(X)$. One market constraint $g_j(X)$ can be rearranged into the form $g_j(X) = w_j^\top \sigma (J \circ X X^\top) \sigma w_j + w_j^\top \sigma^2 w_j - \sigma_j^2$. To abbreviate notation, let $\hat{M} := J \circ X X^\top$ and $B := \sigma w_j w_j^\top \sigma$, both being symmetric matrices. For the gradient $\nabla_X g_j$, all terms that do not depend on $X$ drop out, so the focus is on $w_j^\top \sigma \hat{M} \sigma w_j$, which can be expressed as the matrix trace $tr(B \hat{M})$. The differential of $g_j$ is thus reduced to $dg_j = tr(B \, d\hat{M}) = B : d\hat{M}$. From the proof of Lemma 2 it follows that $d\hat{M} = J \circ (dX \, X^\top + X \, dX^\top)$. Therefore, the differential is
$$dg_j = (B \circ J) : (dX \, X^\top + X \, dX^\top) = 2 \, (B \circ J) : dX \, X^\top = 2 \, (B \circ J) X : dX \tag{24}$$
hence the gradient of $g_j$ is given as $\nabla_X g_j = 2 \, (B \circ J) X$. Here, $B \circ J$ equals $\sigma (J \circ w_j w_j^\top) \sigma$, so substituting back in and multiplying by $\lambda_j$ gives
$$\lambda_j \nabla_X g_j = 2 \, \lambda_j \, \sigma (J \circ w_j w_j^\top) \sigma X \tag{25}$$
thus, the lemma follows from $\nabla_X (\lambda^\top g(X)) = \sum_{j=1}^{n_c} \lambda_j \nabla_X g_j$.    □
Lemma 4.
Let $D_\kappa$ denote the diagonal matrix of $\kappa$. The gradient of $\kappa^\top h(X)$ can then be written as
$$\nabla_X\big(\kappa^\top h(X)\big) = -2\,D_\kappa X \tag{26}$$
Proof. The term $\kappa^\top h(X)$ can alternatively be written as $\kappa^\top 1_{n \times 1} - \kappa^\top X^{\circ 2} 1_{k \times 1}$, where $X^{\circ 2}$ denotes that each element of $X$ is squared (Hadamard square). Thus, $\partial(\kappa^\top h(X)) / \partial x_{id} = -2\,\kappa_i x_{id}$, which for $i \in \{1, \dots, n\}$, $d \in \{1, \dots, k\}$ yields the result above.    □
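The two constraint gradients of Lemmas 3 and 4 can be verified in the same way. The sketch below assumes $h_i(X) = 1 - \sum_d x_{id}^2$ (the sign convention is an assumption here), stores the diagonal of $\sigma$ as a vector, and collects the weight vectors $w_j$ as columns of a hypothetical matrix `W`. All names and numbers are illustrative.

```python
import numpy as np

def constraint_term(X, lam, kappa, W, sigma, sigma_idx):
    """lambda^T g(X) + kappa^T h(X), with h_i(X) = 1 - sum_d X[i, d]^2 (assumed sign)."""
    n = X.shape[0]
    J = np.ones((n, n)) - np.eye(n)
    C = J * (X @ X.T) + np.eye(n)
    V = sigma[:, None] * W                              # column j holds sigma w_j
    g = np.einsum("ij,ik,kj->j", V, C, V) - sigma_idx**2
    h = 1.0 - np.sum(X**2, axis=1)
    return float(lam @ g + kappa @ h)

def constraint_gradient(X, lam, kappa, W, sigma):
    """2 sigma (J o sum_j lam_j w_j w_j^T) sigma X  -  2 D_kappa X."""
    n = X.shape[0]
    J = np.ones((n, n)) - np.eye(n)
    S = np.diag(sigma)
    weighted = (W * lam) @ W.T                          # sum_j lam_j w_j w_j^T
    return 2.0 * S @ (J * weighted) @ S @ X - 2.0 * kappa[:, None] * X

def numerical_gradient(fun, X, eps=1e-6):
    """Central finite differences, perturbing one entry of X at a time."""
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        E = np.zeros_like(X)
        E[idx] = eps
        G[idx] = (fun(X + E) - fun(X - E)) / (2 * eps)
    return G

rng = np.random.default_rng(2)
n, k, n_c = 4, 2, 2
X = rng.uniform(-0.5, 0.5, size=(n, k))
W = rng.uniform(0.0, 1.0, size=(n, n_c))
W = W / W.sum(axis=0)                                   # each column sums to one
sigma = rng.uniform(0.1, 0.4, size=n)                   # stock-level implied volatilities
sigma_idx = np.array([0.20, 0.25])                      # (sub-)index implied volatilities
lam = rng.uniform(-1.0, 1.0, size=n_c)
kappa = rng.uniform(0.0, 1.0, size=n)

analytic = constraint_gradient(X, lam, kappa, W, sigma)
numeric = numerical_gradient(lambda Y: constraint_term(Y, lam, kappa, W, sigma, sigma_idx), X)
constraint_grad_error = float(np.max(np.abs(analytic - numeric)))
```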

3.1.2. Numerical Method

The nearest implied correlation matrix can now be attained by the following optimization:
$$X = \arg\min_{X} L(X, \lambda, \kappa) \tag{27}$$
A performance comparison of numerical methods for a general nearest correlation matrix problem (i.e., without equality constraints $g(X)$) was presented by [33]. They recommend the spectral projected gradient method (SPGM) as the most efficient and reliable method for this task (compared algorithms include 'alternating directions', 'principal factors method', 'spectral projected gradient' and Newton-based methods; see [33] for details). Based on this finding, the SPGM is also used as the main algorithm within this study for solving the implied correlation matrix puzzle. A brief comparison to an SQP-based solver is included in Section 4.
This paragraph briefly describes the SPGM algorithm and the extensions that are relevant for the applications in this paper. SPGM was initially introduced by [42] and has since been used in many different studies. The study of [43] summarizes a long list of corresponding works, and applications to financial data include [25,33]. A detailed discussion of the algorithm can be found in [43,44]. Technically, the SPGM comes with three advantages when compared to other potential candidates. First, the method only requires the gradient, which is known for this task, but not the Hessian matrix. Second, the non-linear constraints are guaranteed to be satisfied. Third, the method is ensured to converge toward the optimum (cp. [42]). In a nutshell, the method minimizes a continuously differentiable function on a nonempty closed convex set by projecting candidate values of $X$ back onto the feasible set. Consequently, a key requirement is that projections onto the feasible set can be made at low computational effort. If so, the method provides a very efficient way of handling the constraints. For the application at hand, handling the equality constraint alongside $\Omega$ can become tricky. This issue is resolved by combining SPGM with a more general inexact restoration framework (see, e.g., [45] for a discussion). The IR-SPGM was introduced by [46], who also provided an in-depth explanation of the algorithm; thus, the details are not repeated here. Roughly speaking, the algorithm can be broken down into two phases: a restoration phase, projecting $X$ back onto the feasible set, and an optimization phase, computing the step size.
For the particular application at hand, greater attention has to be paid to the projection function. In this paper, $\dot{\Omega}$ defines the feasible region that satisfies both $\Omega$ and the market constraints. As mentioned by [46], for some $X$ outside $\dot{\Omega}$, the projection function $P_{\dot{\Omega}}(X)$ should be defined such that $P_{\dot{\Omega}}(X) \approx \arg\min_{\dot{X}} \|X - \dot{X}\|^2$, where $\dot{X} \in \dot{\Omega}$. This means that the projection function should find a point in $\dot{\Omega}$ that is approximately the closest to $X$ (inexact restoration). Obviously, $P_{\dot{\Omega}}(X)$ is an orthogonal projection if an $\dot{X} \perp X$ exists, which is not necessarily the case, given the upper boundaries of $\dot{X}$ (Equation (9)). Figure 1 visualizes the projection and feasible set on a simplified example.
For the inexact restoration framework, the projection is split up into two functions: a projection onto the inequality constraint, $P_\Omega(\cdot)$, and one onto the equality constraint, $P_E(\cdot)$. As for $P_\Omega(\cdot)$, [33] already discussed that this projection is easily carried out by replacing every row $i$ of $X$ for which $\sum_{d=1}^{k} X_{i,d}^2 > 1$ by $X_i / \|X_i\|$. Hence, the projection onto $\Omega$ comes at very low computational effort.
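The row-wise projection onto $\Omega$ described above can be sketched in a few lines. This is an illustrative Python/NumPy version with a hypothetical function name and made-up data, not the authors' R implementation.

```python
import numpy as np

def project_omega(X):
    """Rescale every row of X whose Euclidean norm exceeds 1 to unit length."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1.0)      # rows with norm <= 1 are divided by 1

X = np.array([[0.9, 0.8],      # norm > 1, gets rescaled
              [0.3, 0.4],      # norm 0.5, left unchanged
              [-1.2, 0.5]])    # norm 1.3, gets rescaled
X_proj = project_omega(X)
row_norms = np.linalg.norm(X_proj, axis=1)
```

Rows already inside the unit ball are untouched, so the operation costs one norm per row.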
For the $P_E(\cdot)$ projection, given the market constraint $\sigma_m^2$ and defining $X_E$ to be any point that fulfills $g(X_E) = 0$, the Lagrangian can be formulated as
$$L_E(X_E, \lambda_E) = \|X_E - X\|_F^2 - \lambda_E \big[ v^\top (X_E X_E^\top \circ J + I) v - \sigma_m^2 \big] \tag{28}$$
where, to simplify notation, $v = \sigma w$ is introduced. With the results from above, the gradient of $L_E$ can now be written as
$$\nabla_{X_E} L_E = 2\,(X_E - X) - 2\,\lambda_E (vv^\top \circ J) X_E \quad \text{and} \quad \nabla_{\lambda_E} L_E = \sigma_m^2 - v^\top (X_E X_E^\top \circ J + I) v \tag{29}$$
To minimize the Lagrangian function, both gradients are set to zero, $\nabla_{X_E} L_E = \nabla_{\lambda_E} L_E = 0$. Therefore, from rearranging terms of $\nabla_{X_E} L_E$, $X_E$ can be defined as
$$X_E = (I - \lambda_E\, vv^\top \circ J)^{-1} X \tag{30}$$
The inverse $(I - \lambda_E\, vv^\top \circ J)^{-1}$ can be expressed following the expansion procedure of [47]. Since in a real economy the elements of $vv^\top \circ J$ are generally smaller than 1 and close to 0, higher-order terms, such as $(vv^\top \circ J)^2$, converge toward zero very fast. If this were not the case, the implied volatilities could be time-scaled down, for example, from yearly to daily, such that $\forall i: v_i < 1$. Consequently, the inverse is efficiently approximated by
$$(I - \lambda_E\, vv^\top \circ J)^{-1} \approx I + \lambda_E\, vv^\top \circ J \quad \Rightarrow \quad P_E(X): \; X_E \approx X + \lambda_E (vv^\top \circ J) X \tag{31}$$
Plugging back into the equality constraint $\nabla_{\lambda_E} L_E = 0$ and rearranging terms, $\lambda_E$ can then be solved from a simple quadratic equation with two solutions,
$$\lambda_{E,\pm} = \frac{-\lambda_{E,1} \pm \sqrt{\lambda_{E,1}^2 - 4\,\lambda_{E,0}\lambda_{E,2}}}{2\,\lambda_{E,0}}, \quad \text{with} \quad \begin{aligned} \lambda_{E,0} &= v^\top \big[ (vv^\top \circ J)^2 \circ J \big] v \\ \lambda_{E,1} &= 2\, v^\top \big[ XX^\top (vv^\top \circ J) \circ J \big] v \\ \lambda_{E,2} &= v^\top \big[ XX^\top \circ J + I \big] v - \sigma_m^2 \end{aligned} \tag{32}$$
Having computed both solutions $\lambda_{E,\pm}$ and inserting the results into Equation (31), it is easy to determine whether $X_{E,+}$ or $X_{E,-}$ is closer to $X$. Hence, the projection $P_E(X)$ is also inexpensive to compute. In case $\lambda_E$ is complex, taking only its real part is considered to be sufficient, given the inexact restoration framework.
A point $\dot{X}$ that fulfills both $\dot{X} \in \Omega$ and $g(\dot{X}) = 0$ can now be found with a simple alternating projection algorithm that proceeds as follows. It starts with the initial projection onto the market constraint, $P_E(X)$. If $P_E(X) \in \Omega$, then $\dot{X}$ has already been found. Otherwise, the algorithm locks in whether $\lambda_{E,+}$ or $\lambda_{E,-}$ was used and subsequently takes only the upper or the lower solution. This is useful because the market constraint thereby becomes convex, and alternating the projections $P_\Omega(\cdot)$ and $P_E(\cdot)$ converges toward the optimal solution $\dot{X}$. The alternating procedure can be stopped once a certain tolerance level for $g(\cdot)$ has been reached. With the defined projection procedure, the optimization under IR-SPGM can now be carried out following the guidelines of [46].
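The alternating procedure can be sketched as follows for a single market constraint. This is a hedged illustration rather than the authors' implementation: the function names, the tolerance, the iteration cap and the toy data are assumptions, and the coefficients of the quadratic in $\lambda_E$ are obtained here by expanding $g(X + \lambda_E (vv^\top \circ J) X)$ directly instead of being taken from a closed form.

```python
import cmath
import numpy as np

def market_gap(X, v, var_m):
    """g(X) = v^T (J o X X^T + I) v - var_m."""
    n = len(v)
    J = np.ones((n, n)) - np.eye(n)
    return float(v @ (J * (X @ X.T) + np.eye(n)) @ v - var_m)

def project_equality(X, v, var_m, sign=None):
    """First-order step X_E = X + lam * B X with B = v v^T o J."""
    n = len(v)
    J = np.ones((n, n)) - np.eye(n)
    BX = (np.outer(v, v) * J) @ X
    # g(X + lam * B X) = a lam^2 + b lam + c (direct expansion, an assumption of this sketch)
    a = float(v @ (J * (BX @ BX.T)) @ v)
    b = float(2.0 * v @ (J * (X @ BX.T)) @ v)
    c = market_gap(X, v, var_m)
    root = cmath.sqrt(b * b - 4.0 * a * c)
    lam = {+1: ((-b + root) / (2.0 * a)).real,    # keep only the real part if complex
           -1: ((-b - root) / (2.0 * a)).real}
    if sign is None:                              # first call: take the solution closer to X
        sign = +1 if abs(lam[+1]) <= abs(lam[-1]) else -1
    return X + lam[sign] * BX, sign

def project_omega(X):
    """Rescale rows with Euclidean norm above 1 to unit length."""
    return X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1.0)

def restore(X, v, var_m, tol=1e-10, max_iter=200):
    """Alternate P_E and P_Omega with the root choice locked in after the first step."""
    X, sign = project_equality(X, v, var_m)
    for _ in range(max_iter):
        if np.all(np.linalg.norm(X, axis=1) <= 1.0):
            break                                 # P_E(X) already lies in Omega
        X = project_omega(X)
        if abs(market_gap(X, v, var_m)) <= tol:
            break
        X, _ = project_equality(X, v, var_m, sign)
    return X

v = np.full(4, 0.05)                              # sigma * w for four equally weighted stocks
X0 = np.array([[1.05], [0.5], [0.5], [0.5]])      # first row violates Omega
var_m = 0.0215                                    # implied index variance (toy value)
X_dot = restore(X0, v, var_m)
```

In this toy example the first row is clipped to the boundary of $\Omega$ in every pass, while the remaining rows absorb the variance gap, so the residual of $g$ shrinks geometrically.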

3.2. Economic Approach

As an alternative to the above quantitative approach, a set of expected risk factors can also be utilized to estimate the implied correlation matrix. The factor exposures can be computed either from a statistical routine (e.g., principal component analysis) or from an economic one (such as the CAPM [26] or the Fama–French models [27,28]). As for the latter, $X$ corresponds to the correlation between stocks and the risk factors. From the literature on factor analysis, it is known that if $X$ can be found such that $XX^\top$ has a unit diagonal, then a correlation matrix is fully explained by its reduced structure $X$. It is also known that if $k = n$, then a correlation matrix can be fully described by a set of factors (as is known, e.g., from eigenvalue decomposition). So, either way, a (psd) correlation matrix $C$ is an aggregate with $X$ as the underlying structure. In the models of [9,29], the correlation risk premium is incorporated pair-wise at the stock level. In contrast, this paper assumes that investors use factor-based asset pricing models and thereby quote the risk premium on the factor level.
Whenever the implied correlation matrix is described by a one-factor model, $k = 1$, the weighting idea of [9] can be applied to the factor loading level to achieve the $\mathbb{P} \to \mathbb{Q}$ transformation. Taking a one-factor model is no oversimplification, because in financial markets, it is documented (e.g., [48]) that the first principal component already explains a very large portion of the realized correlation matrix. It is also reported that the first principal component is typically interpreted as CAPM's market portfolio. In the sense of [9,29], it can be assumed that the $\mathbb{P}$-expected correlations to risk factors can be estimated on a reasonable basis. Next, following their concept, $X_{\mathbb{P}}$ is weighted against $1_{n \times k}$ if $CRP > 0$. In case $CRP < 0$, the weighting is made toward the lower boundary, which can now be chosen as $-1_{n \times k}$. Hence, investors are modeled to quote the risk premium on the factor exposure rather than on the single-stock level. Considering the diversification effects, the mark-up on risk factors seems more realistic from an economic point of view. Furthermore, the factor-level implementation will not run into an economically inconsistent scaling as in [29], and there is also no psd problem as in [9]. When introducing the sign of the correlation risk premium as $s := \operatorname{sign}(CRP)$, which indicates up- or down-scaling, the transformation can then, similar to Equation (4), be written as
$$X_{\mathbb{Q}} = X_{\mathbb{P}} + \tilde{\alpha} X_\Delta, \quad \text{with} \quad X_\Delta = s\,1_{n \times k} - X_{\mathbb{P}} \tag{33}$$
It generally holds that $\tilde{\alpha} \in [0, 1]$, and the implied correlation matrix is given by Equation (2) using $X_{\mathbb{Q}}$. To compute $\tilde{\alpha}$, one has to match the market constraint in Equation (1), from which it follows that $\tilde{\alpha}$ can be explicitly solved for by
$$\tilde{\alpha} = \frac{-\sigma_{\mathbb{P},\Delta}^2 + s \sqrt{\sigma_{\mathbb{P},\Delta}^4 - \sigma_\Delta^2 (\sigma_{\mathbb{P}}^2 - \sigma_m^2)}}{\sigma_\Delta^2}, \quad \text{with} \quad \begin{aligned} \sigma_{\mathbb{P}}^2 &= w^\top \sigma (X_{\mathbb{P}} X_{\mathbb{P}}^\top \circ J + I) \sigma w \\ \sigma_\Delta^2 &= w^\top \sigma (X_\Delta X_\Delta^\top \circ J) \sigma w \\ \sigma_{\mathbb{P},\Delta}^2 &= w^\top \sigma (X_{\mathbb{P}} X_\Delta^\top \circ J) \sigma w \end{aligned} \tag{34}$$
For multiple market constraints, one can either calibrate $X_{\mathbb{P}}$ and obtain a unique $\tilde{\alpha}$, or first solve for various $\tilde{\alpha}$ on sub-index levels, and then on the market index level. The above quadratic equation obviously gives two values of $\tilde{\alpha}$, given that '$+s$' can be replaced by '$\pm$'. Since $\tilde{\alpha}$ carries an economic interpretation, the upper value is taken in the case of $CRP > 0$ (up-scaling), and the lower value if $CRP < 0$ (down-scaling); hence, '$\pm$' is substituted by '$+s$' to implement this mechanism. In a multi-factor pricing model, the high multicollinearity among factors could potentially violate $X_{\mathbb{P}} \in \Omega$. This issue can be easily resolved by first computing the correlations between stocks and factors and then orthogonalizing the set of vectors (e.g., via the Gram–Schmidt process).
The procedure of computing the implied correlation matrix from pricing factors can now be summarized as follows. First, the physically expected correlations between stocks and risk factors have to be computed and collected in a set of vectors, $X_{\mathbb{P}}$. If necessary, $X_{\mathbb{P}}$ has to be orthogonalized. Thereafter, the weighting scalar $\tilde{\alpha}$ has to be calculated according to Equation (34). Lastly, $X_{\mathbb{Q}}$ has to be computed from Equation (33), and the implied correlation matrix then follows from Equation (2).
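The summarized steps can be sketched for a one-factor example as follows. The code is illustrative only: the function name and the numbers are made up, $v = \sigma w$ is passed in directly, and the sign $s$ is proxied by whether the implied index variance lies above or below the variance implied by the physical loadings, which is an assumption of this sketch rather than the paper's definition of the correlation risk premium.

```python
import numpy as np

def implied_factor_correlation(X_P, v, var_m):
    """Return (alpha, X_Q, C_Q) for X_Q = X_P + alpha * (s * 1 - X_P)."""
    n = X_P.shape[0]
    J = np.ones((n, n)) - np.eye(n)

    def quad(A, B):
        # v^T (A B^T o J) v
        return float(v @ (J * (A @ B.T)) @ v)

    var_P = quad(X_P, X_P) + float(v @ v)
    s = 1.0 if var_m >= var_P else -1.0             # assumed proxy for sign(CRP)
    X_D = s * np.ones_like(X_P) - X_P
    var_D = quad(X_D, X_D)
    var_PD = quad(X_P, X_D)
    alpha = (-var_PD + s * np.sqrt(var_PD**2 - var_D * (var_P - var_m))) / var_D
    X_Q = X_P + alpha * X_D
    C_Q = J * (X_Q @ X_Q.T) + np.eye(n)             # implied correlation matrix
    return alpha, X_Q, C_Q

X_P = np.array([[0.5], [0.6], [0.4], [0.7]])        # physical stock-factor correlations
sigma = np.array([0.20, 0.25, 0.30, 0.22])          # implied stock volatilities
w = np.full(4, 0.25)                                # index weights
v = sigma * w
var_m = 0.032                                       # implied index variance (toy value)

alpha, X_Q, C_Q = implied_factor_correlation(X_P, v, var_m)
fitted_variance = float(v @ C_Q @ v)
```

Because the computation is a single closed-form step, no iteration is needed; `fitted_variance` reproduces `var_m` up to floating-point error in this example.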

4. Empirical Experiment

A brief empirical experiment for the developed models was conducted to evaluate their implementation difficulty and computational efficiency, using data of the S&P 100 and S&P 500 indices. The experiment focuses on the monthly at-the-money (ATM) Call option implied volatilities with a target maturity of one month, which were obtained from OptionMetrics. It is important to know that OptionMetrics follows a three-dimensional kernel regression for interpolating the option surface (see [49]); a discussion on this topic can be found in [50]. Computations were carried out at the beginning of each month, from 1 January 1996 to 2 December 2020, thus giving 300 implied correlation matrices per time-series. The ATM level was chosen for three reasons: first, options are typically most liquid around the ATM level [51]; second, the ATM level is less sensitive to model misspecification (compared to out-of-money; [52]); third, ATM Call prices are close to ATM Put prices. Return (daily) and market value (monthly) data were obtained from the CRSP (Center for Research in Security Prices); the lists of index constituents are from Compustat. All computations were executed on an Intel® Core™ i5-8250U CPU with 1.60 GHz, using the statistical programming software R.
The experiment was executed using the following model specifications. As for the optimization algorithms, the variance tolerance for the market constraint was set at $1 \times 10^{-6}$. The stopping criterion was set to a marginal improvement in the objective function of $1 \times 10^{-4}$. A modest mark-up of $|X| \le 1 - 1 \times 10^{-8}$ was also introduced to ensure that estimates were invertible. The IR-SPGM was implemented as described by [46] (Algorithm 2.1), except that a monotone line search strategy was used. The non-monotone line search strategy runs additional sub-routines of projections to speed up convergence. However, since the projection function here was potentially an alternating procedure on its own, the monotone line search was found to be slightly faster than the non-monotone one. Two types of target matrices ($A$) were used: first, simple Pearson correlation matrices from 12-month historically realized returns; second, a mean-reverting matrix with the entries $\hat{\rho}(i,j) = \theta_{ij}\,\rho(i,j) + (1 - \theta_{ij})\,\bar{\rho}(i,j)$, using 9 months of historically realized returns for $\rho(i,j)$ and the mean correlation between $i$ and $j$ over the total time horizon for $\bar{\rho}(i,j)$. The reversion speed $\theta_{ij}$ was randomly drawn from a uniform distribution between 0 and 0.4 to bring some noise into the target matrix. Generally, the two S&P indices are re-balanced on a quarterly basis, and hence there is no guarantee that the target matrices per se are psd, as firms leave and enter the indices within the estimation horizon. For the non-psd cases, the quantitative approach also served as a repair tool.
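The construction of the mean-reverting target can be illustrated with simulated data. In the sketch below, random numbers stand in for the realized returns, the full-sample correlation stands in for the mean correlation $\bar{\rho}$, and the function names are hypothetical. The final check shows how a target could be screened for the psd property before it is passed to the optimizer.

```python
import numpy as np

def mean_reverting_target(rho_recent, rho_mean, theta):
    """Entry-wise rho_hat = theta * rho_recent + (1 - theta) * rho_mean."""
    return theta * rho_recent + (1.0 - theta) * rho_mean

def is_psd(C, tol=1e-10):
    """True if the smallest eigenvalue of the symmetric matrix C is not below -tol."""
    return bool(np.min(np.linalg.eigvalsh(C)) >= -tol)

rng = np.random.default_rng(1)
n = 5
returns_recent = rng.standard_normal((189, n))    # stand-in for about 9 months of daily returns
returns_all = rng.standard_normal((2000, n))      # stand-in for the total time horizon
rho_recent = np.corrcoef(returns_recent, rowvar=False)
rho_mean = np.corrcoef(returns_all, rowvar=False)

U = np.triu(rng.uniform(0.0, 0.4, size=(n, n)), 1)
theta = U + U.T                                   # symmetric reversion speeds, zero diagonal
A = mean_reverting_target(rho_recent, rho_mean, theta)
target_is_psd = is_psd(A)                         # may be False: entry-wise mixing need not keep psd
```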
As for the starting value $X^{(0)}$, a modified version of [33] was used. For the target matrix $A$, let $e$ be the set of eigenvectors and $\iota$ the corresponding eigenvalues. Then, for each column $d \in \{1, \dots, k\}$ of $X^{(0)}$, the starting values were computed as
$$\forall d \in \{1, \dots, k\}: \quad X_d^{(0)} = \varsigma_d\, e_d \quad \text{where} \quad \varsigma_d = \min\left\{ \sqrt{\frac{(\iota_d - 1)\,\|e_d\|_2^2}{k\,\|e_d\|_2^4 - k \sum_{i=1}^{n} e_{d,i}^4}},\; \frac{1}{\sqrt{k}\,\max_i |e_{d,i}|} \right\} \tag{35}$$
This starting value is identical to the one proposed by [33] in case $k = 1$, but differs when the number of factors is higher. Within the empirical experiments of the $k > 1$ cases, the reduction in the objective function was found to be larger under the modified starting value than under the $X^{(0)}$ of [33]. Hence, the modified starting value was preferred.
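A possible implementation of the starting value is sketched below. It assumes that both terms inside the minimum involve square roots, takes the eigenvectors of the $k$ largest eigenvalues, and floors $\iota_d - 1$ at zero to keep the root real. The floor and all names are assumptions of this sketch.

```python
import numpy as np

def starting_value(A, k):
    """Scaled leading eigenvectors of the target matrix A as columns of X0."""
    eigval, eigvec = np.linalg.eigh(A)                 # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:k]               # indices of the k largest eigenvalues
    X0 = np.zeros((A.shape[0], k))
    for col, d in enumerate(order):
        e, iota = eigvec[:, d], eigval[d]
        norm2 = float(e @ e)
        fit = max(iota - 1.0, 0.0) * norm2 / (k * norm2**2 - k * float(np.sum(e**4)))
        bound = 1.0 / (np.sqrt(k) * np.max(np.abs(e)))  # keeps every entry within 1/sqrt(k)
        X0[:, col] = min(np.sqrt(fit), bound) * e
    return X0

A = np.array([[1.0, 0.6, 0.5, 0.4],
              [0.6, 1.0, 0.3, 0.5],
              [0.5, 0.3, 1.0, 0.2],
              [0.4, 0.5, 0.2, 1.0]])
X0 = starting_value(A, k=2)
row_norms = np.linalg.norm(X0, axis=1)
```

Since every entry of a column is bounded by $1/\sqrt{k}$ in absolute value, each row of the result has a Euclidean norm of at most one, so the starting point lies in $\Omega$.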
The empirical experiment was split into three parts, which are summarized in Table 2, Panels A–C. The columns of Table 2 describe the number of risk factors (k), the target matrix (A), the mean and standard deviation of computation time (t), the optimized objective function (fn), the absolute realization of the variance tolerance (|v.tol|), the number of outer iterations (iter), and the index data used for the calculation (index). Panels A and B show the results of the quantitative approach, computing the nearest implied correlation matrices. Panel C displays the results of the economic approach.
In Panel A, the nearest implied correlation matrix was computed using IR-SPGM for the S&P 100 under different settings with respect to the number of risk factors and the two different target matrices. Overall, the patterns between the historical and the mean-reverting matrices look very similar. The average computation speed was very fast and is comparable to the non-equality-constrained results of [33]. The computational speed was probably caused by the fast convergence within few outer iterations (around 3 to 5 on average) and the simplicity of the projection. As it is unlikely that the target matrix perfectly equaled the hidden true implied correlation matrix, the objective function was not expected to reach zero. The results show that, on the one hand, an increasing number of risk factors indeed reduced the final objective function, and hence improved the estimation accuracy. On the other hand, unsurprisingly, a larger $k$ came at higher computational effort, as it multiplied the model's number of variables. Looking at the variance tolerance, the IR-SPGM algorithm had no difficulty staying inside the feasible region.
Panel B displays the results for the nearest implied correlation matrix method applied to three alternative settings. The first line shows the results on S&P 500 data. Using the S&P 500 increased the number of unknown correlation pairs from 4950 ($n = 100$) to 124,750. With the increase in the number of stocks, the computation time rose over-proportionally. This was probably caused by the larger number of iterations needed to converge toward the optimum (mean 3.037 vs. 5.003). The optimized objective functions can be compared when dividing by $n^2 - n$ (off-diagonal entries), which gives 0.016 for the S&P 100 and 0.011 for the S&P 500 index. Looking at the variance tolerance, the S&P 500 computations stuck more strictly to the first market constraint, with a maximum deviation of $1.3 \times 10^{-10}$. Therefore, while the computation time increased over-proportionally from the S&P 100 to the S&P 500 index, the mean fn per matrix entry and also the realized variance tolerance were remarkably smaller. Hence, stopping criteria and variance tolerances can potentially be relaxed the more constituents the index holds. The second line of Panel B reports results on the S&P 100, where the target matrix was chosen from mean-reverting correlations, converted into an implied correlation matrix following the adjusted ex post model of [9] (Equation (4)). Hence, the target matrix here already was an implied correlation matrix that fulfilled the market constraint, but did not satisfy mathematical feasibility. In this setting, only the monthly estimates of non-psd target matrices were taken, such that the quantitative factor approach was used to repair the estimates under the model of [9]. This included 64 of the 300 monthly matrices. The number of risk factors was set higher here to achieve a better fit, and looking at fn, it can be observed that the objective function was indeed substantially smaller in this application. 
Hence, the factor model qualifies as a repair tool for existing implied correlation models. Within the third line of Panel B, the IR-SPGM algorithm was replaced by an SQP routine for the S&P 100 data. The algorithm used was taken from the Rsolnp package (see [53,54]). The SQP solver served as a reference to cross-validate whether the IR-SPGM method was implemented correctly. The results can thus directly be compared to the first line of Panel A. It can be observed that the optimized objective functions were very similar between the SQP and IR-SPGM algorithms, which confirms the correct implementation of the IR-SPGM routine. Comparing computation times between them, SQP does not seem to be competitive, which was also found in the general case of [33]. This finding thus motivates the usage of IR-SPGM.
In Panel C of Table 2, the economic approach was implemented on the S&P 500 index with respect to implied market exposure [12], CAPM (market factor), the Fama–French three-factor model [27], the extension for the momentum factor (FF3+Mom.) and the Fama–French five-factor model [28]. For the implied market exposure, the risk-neutral skewness was calculated based on [55] using the public code library by Grigory Vilkov (, accessed on 23 February 2022). The return data for the remaining factor portfolios were obtained directly from K.R. French's data library (, accessed on 7 April 2021). Following [12], the purely forward-looking correlation between stock $i$ and the market $m$ was computed as $X_{\mathbb{P}} = (skew_i / skew_m)^{1/3}$. To approximate $X_{\mathbb{P}}$ for the remaining four methods, 12-month realized correlations between stocks and the respective risk factors were used. Note that $X_{\mathbb{P}}$ of the implied market exposure factor solely relied on option data and was thus purely forward looking. $X_{\mathbb{P}}$ of the other factors, however, had a hybrid form that took historical data to assess implied future co-movement. Next, to ensure that $X_{\mathbb{P}} \in \Omega$ for estimations with $k > 1$, the factors were orthogonalized via the Gram–Schmidt process. On the one hand, since the economic approach does not iterate and consists only of one projection, its computation was very inexpensive. On the other hand, its estimates deviated more strongly from the target matrix than those of the quantitative approach, irrespective of the number of factors employed. Comparing the results within the economic approach, it can be observed that for hybrid estimations, the larger the number of risk factors, the smaller the $fn_{\text{avg}}$. The factor solely relying on implied data resulted by far in the highest $fn_{\text{avg}}$. However, similarity to the target matrix plays a subordinate role here. Similar to that of $fn_{\text{avg}}$, a reduction pattern can also be seen for $\tilde{\alpha}$ among hybrid models. 
In Section 3.2, $\tilde{\alpha}$ was defined as the weight of the boundary and $(1 - \tilde{\alpha})$ as the weight of $X_{\mathbb{P}}$ inside the risk-neutral factor correlations $X_{\mathbb{Q}}$. This means that the larger the number of risk factors, the less modification was required to match the observed implied market variance in this empirical test. Thus, a larger $k$ is likely to explain the hidden (true) implied correlation matrix better. This time, comparing the purely forward-looking measure to the hybrid estimations shows that when implied data were used, less modification was required. Generally, as $\tilde{\alpha}_{\text{avg}}$ was close to zero across all five economic models, it can be concluded that correlation risk premia enter modestly in the factor-structured implied correlation matrix framework.

5. Concluding Remarks

Having an idea about future diversification possibilities requires knowledge about future correlations. Identifying these is a challenging task, as backward-looking information will never capture events that are expected to happen in the future. In contrast, option-implied volatilities are known to carry information about the market outlook. Hence, they are used by academics and practitioners alike to obtain forward-looking perspectives. Computing implied correlation matrices, however, is an intricate puzzle since it is a highly under-determined problem. While there already exists a strand of literature that provides estimation methods for such matrices, this paper shows that these approaches either fail to ensure important mathematical characteristics or discard economically rational grounds. In short, a fully feasible solution to the puzzle has not yet been found.
This paper provides two solutions to the problem by exploiting the commonly accepted assumption that returns stem from factor risk exposure. The first approach is quantitatively motivated and computes the nearest implied correlation matrix subject to a pre-specified target, which can be freely chosen. This method turns out to be a useful tool for repairing (implied) correlation matrix estimates, while also ensuring their invertibility. It can also be used as a stand-alone estimate, requiring a minimum of assumptions. With the second, economically motivated approach, the paper demonstrates how expected risk-factor loadings (or betas) can be translated into an ex ante correlation matrix. As long as one has an educated assumption about either the implied correlation matrix or efficient factor loadings, both approaches provide a fully rational solution to the implied correlation puzzle. Thus, the hypothesis that the shortfalls of existing models can be overcome with factor-based solutions can be accepted. Furthermore, an empirical application of the two approaches on monthly option data of the S&P 100 and S&P 500 (1996–2020) shows that the implementation of the two proposed solutions is easy and computationally efficient.
The findings of this paper have strong implications for practitioners and the literature alike. With the current models, investors either risk losing important mathematical properties or have to abandon economically rational grounds. The two provided methodologies are not only able to overcome these shortcomings, but also turn out to be handy regarding practical implementation. Thus, this paper makes an important contribution to the literature, as it solves a long-standing problem without compromise. With the broad interest in estimating future-oriented co-movements, the presented approaches can find multiple applications in all areas of finance. These include asset pricing, market forecasting, portfolio optimization, and risk management. While this paper focuses more on the theory of implied correlation estimation, future research should evaluate the performance of the presented solutions in an investment context.

Author Contributions

Conceptualization, W.S.; Data curation, J.T.; Formal analysis, W.S.; Methodology, W.S.; Project administration, W.S.; Software, W.S. and J.T.; Validation, J.T.; Writing—original draft, W.S.; Writing—review & editing, J.T. All authors have read and agreed to the published version of the manuscript.


Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Non-Gaussian Copula: Example of Variance-Gamma

Risk-neutral densities are typically of an asymmetric shape and have heavy tails, causing stocks to correlate higher for market down-turns and less for up-movements. The Pearson’s correlation matrix cannot capture such asymmetries. While for ATM options, there is almost no difference in implied correlations between the Gaussian or a more sophisticated copula (see e.g., [2]), it can indeed matter for out-of-money options. ATM option prices are typically very similar among most option pricing models; hence, the same holds for implied volatilities and thus also for implied correlations. An easy extension for non-normal shapes/tails can be made if the transformation between direct and centered multivariate parametrization is known (e.g., following [41]). To provide an example, the case of the variance gamma model [56,57] is discussed below, which became quite popular for pricing basket options [2].
Let $Z(t)$ follow a one-factor multivariate variance-gamma process, constructed from subordinating the multivariate Brownian motion $B$ with the gamma-distributed subordinator $V(t) \sim Ga(t/\nu, 1/\nu)$,
$$Z(t) = t\mu + \theta V(t) + B(V(t)) \quad \text{s.t.} \quad Z \sim VG(\xi, \omega, C_{dir}, \theta, \nu) \tag{A1}$$
with $\xi \in \mathbb{R}^n$ as the location, $\omega \in \mathbb{R}_+^n$ the diagonal matrix of scale, $\theta \in \mathbb{R}^n$ the shape, $\nu \in \mathbb{R}_+$ the variance rate and $C_{dir}$ the correlation matrix; all are direct parameters. The subscripts $dir$ and $cen$ are used for direct/centered matrix parametrization. The direct covariance matrix is thus given by $\Sigma_{dir} = \omega C_{dir} \omega$. All direct parameters, except $C_{dir}$, can be derived for the cross-section and the (sub-)index from option data, for example, following ref. [57]. Further, it is known that the first two centered moments evolve as
$$E[Z] = \xi + \theta \quad \text{and} \quad E[(Z - E[Z])^2] = \Sigma_{dir} + \nu\,\theta\theta^\top \tag{A2}$$
(cp. [56]). Hence, within the VG model, the change between direct and centered parametrization is easily obtained by $\Sigma_{cen} = \Sigma_{dir} + \nu\,\theta\theta^\top = \sigma C_{cen} \sigma$, with $\sigma$ still being the diagonal matrix of centered volatilities. The centered Pearson correlation matrix $C_{cen}$ is thus given by
$$C_{cen} = \sigma^{-1} \omega C_{dir} \omega \sigma^{-1} + \nu\, \sigma^{-1} \theta\theta^\top \sigma^{-1} \tag{A3}$$
As before, it still holds that the portfolio variance is computed from Equation (1) as $\sigma_m^2 = w^\top \Sigma_{cen} w$, which defines the market constraint. So substituting Equation (A3) back into the market constraint and introducing Equation (2) as $C_{dir}(X) = J \circ XX^\top + I$ allows one to estimate the direct correlation matrix (i.e., non-Pearson) via the nearest implied correlation matrix (NICM) method. This fact demonstrates that the NICM method is not limited to Pearson correlation matrices. Of note, $C_{dir}(X)$ is psd by construction, and the same holds for $C_{cen}$, as $\nu\,\theta\theta^\top$ is psd and $\sigma$ and $\omega$ are diagonal with positive entries. Hence, $C_{dir}(X) = J \circ XX^\top + I$ ensures mathematical feasibility.
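The change from direct to centered parametrization in Equation (A3) can be illustrated numerically. In the following sketch the diagonal matrices $\omega$ and $\sigma$ are stored as vectors, and all parameter values are invented for demonstration; they are not estimates from option data.

```python
import numpy as np

def centered_correlation(C_dir, omega, theta, nu):
    """C_cen = sigma^-1 (omega C_dir omega + nu theta theta^T) sigma^-1."""
    Sigma_cen = np.outer(omega, omega) * C_dir + nu * np.outer(theta, theta)
    sigma = np.sqrt(np.diag(Sigma_cen))          # centered volatilities
    return Sigma_cen / np.outer(sigma, sigma), sigma

n = 3
J = np.ones((n, n)) - np.eye(n)
X = np.array([[0.6], [0.5], [0.7]])              # one-factor loadings inside Omega
C_dir = J * (X @ X.T) + np.eye(n)                # direct correlation matrix from Equation (2)

omega = np.array([0.20, 0.25, 0.30])             # direct scale parameters (toy values)
theta = np.array([-0.05, -0.08, -0.03])          # shape parameters (toy values)
nu = 0.2                                         # variance rate (toy value)

C_cen, sigma = centered_correlation(C_dir, omega, theta, nu)
min_eigenvalue = float(np.min(np.linalg.eigvalsh(C_cen)))
```

In this example `C_cen` has a unit diagonal and a non-negative smallest eigenvalue, in line with the psd argument made above.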


  1. Skintzi, V.D.; Refenes, A.P.N. Implied correlation index: A new measure of diversification. J. Futur. Mark. 2005, 25, 171–197. [Google Scholar] [CrossRef]
  2. Linders, D.; Schoutens, W. Basket option pricing and implied correlation in a one-factor Lévy model. In Innovations in Derivatives Markets; Springer: Cham, Switzerland, 2016; pp. 335–367. [Google Scholar]
  3. Milevsky, M.A.; Posner, S.E. A Closed-Form Approximation for Valuing Basket Options. J. Deriv. 1998, 5, 54–61. [Google Scholar] [CrossRef]
  4. Harris, R.D.F.; Li, X.; Qiao, F. Option-implied betas and the cross section of stock returns. J. Futur. Mark. 2019, 39, 94–108. [Google Scholar] [CrossRef][Green Version]
  5. Schadner, W. Ex-Ante Risk Factors and Required Structures of the Implied Correlation Matrix. Financ. Res. Lett. 2021, 41, 101855. [Google Scholar] [CrossRef]
  6. Driessen, J.; Maenhout, P.J.; Vilkov, G. The Price of Correlation Risk: Evidence from Equity Options. J. Financ. 2009, 64, 1377–1406. [Google Scholar] [CrossRef][Green Version]
  7. Fink, H.; Geppert, S. Implied correlation indices and volatility forecasting. Appl. Econ. Lett. 2017, 24, 584–588. [Google Scholar] [CrossRef]
  8. Markopoulou, C.; Skintzi, V.; Refenes, A. On the predictability of model-free implied correlation. Int. J. Forecast. 2016, 32, 527–547. [Google Scholar] [CrossRef]
  9. Buss, A.; Vilkov, G. Measuring Equity Risk with Option-implied Correlations. Rev. Financ. Stud. 2012, 25, 3113–3140. [Google Scholar] [CrossRef]
  10. Kempf, A.; Korn, O.; Saßning, S. Portfolio Optimization Using Forward-Looking Information. Rev. Financ. 2015, 19, 467–490. [Google Scholar] [CrossRef][Green Version]
  11. Hardle, W.K.; Silyakova, E. Implied basket correlation dynamics. Stat. Risk Model. 2016, 33, 1–20. [Google Scholar] [CrossRef][Green Version]
  12. Chang, B.Y.; Christoffersen, P.; Jacobs, K.; Vainberg, G. Option-Implied Measures of Equity Risk. Rev. Financ. 2011, 16, 385–428. [Google Scholar] [CrossRef][Green Version]
  13. Echaust, K.; Małgorzata, J. Implied correlation index: An application to economic sectors of commodity futures and stock markets. Eng. Econ. 2020, 31, 4–17. [Google Scholar] [CrossRef][Green Version]
  14. Dhaene, J.; Dony, J.; Forys, M.B.; Linders, D.; Schoutens, W. FIX: The fear index—Measuring market fear. In Topics in Numerical Methods for Finance; Springer: Berlin/Heidelberg, Germany, 2012; pp. 37–55. [Google Scholar]
  15. Just, M.; Echaust, K. Stock market returns, volatility, correlation and liquidity during the COVID-19 crisis: Evidence from the Markov switching approach. Financ. Res. Lett. 2020, 37, 101775. [Google Scholar] [CrossRef]
  16. González, M.T.; Novales, A. Are volatility indices in international stock markets forward looking? Rev. R. Acad. Cien. Ser. A Mat. 2009, 103, 339–352. [Google Scholar] [CrossRef][Green Version]
  17. Kim, J.; Park, Y.J. Predictability of OTC option volatility for future stock volatility. Sustainability 2020, 12, 5200. [Google Scholar] [CrossRef]
  18. Hollstein, F.; Prokopczuk, M.; Tharann, B.; Wese Simen, C. Predicting the equity market with option-implied variables. Eur. J. Financ. 2019, 25, 937–965. [Google Scholar] [CrossRef]
  19. Buss, A.; Schoenleber, L.; Vilkov, G. Expected Correlation and Future Market Returns. SSRN Electron. J. 2019. [Google Scholar] [CrossRef]
  20. Cecchetti, S.; Sigalotti, L. Forward-Looking Robust Portfolio Selection. Bank Italy Temi Discuss. 2013, 913. [Google Scholar] [CrossRef]
  21. Dew-Becker, I. Real-Time Forward-Looking Skewness over the Business Cycle. Northwestern University Working Paper. 2021. Available online: (accessed on 12 April 2022).
  22. Chamizo, Á.; Fonollosa, A.; Novales, A. Forward-looking asset correlations in the estimation of economic capital. J. Int. Financ. Mark. Institutions Money 2019, 61, 264–288.
  23. Athanasakou, V.; Hussainey, K. The perceived credibility of forward-looking performance disclosures. Account. Bus. Res. 2014, 44, 227–259.
  24. Conrad, J.; Dittmar, R.F.; Ghysels, E. Ex Ante Skewness and Expected Stock Returns. J. Financ. 2013, 68, 85–124.
  25. Higham, N.J. Computing the nearest correlation matrix—a problem from finance. IMA J. Numer. Anal. 2002, 22, 329–343.
  26. Sharpe, W.F. Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk. J. Financ. 1964, 19, 425–442.
  27. Fama, E.F.; French, K.R. Common risk factors in the returns on stocks and bonds. J. Financ. Econ. 1993, 33, 3–56.
  28. Fama, E.F.; French, K.R. A five-factor asset pricing model. J. Financ. Econ. 2015, 116, 1–22.
  29. Numpacharoen, K.; Numpacharoen, N. Estimating Realistic Implied Correlation Matrix from Option Prices. J. Math. Financ. 2013, 3, 401–406.
  30. Harvey, C.R.; Liu, Y.; Zhu, H. …and the Cross-Section of Expected Returns. Rev. Financ. Stud. 2015, 29, 5–68.
  31. Jensen, T.I.; Kelly, B.T.; Pedersen, L.H. Is There a Replication Crisis in Finance? J. Financ. 2022. Forthcoming.
  32. Glasserman, P.; Suchintabandid, S. Correlation expansions for CDO pricing. J. Bank. Financ. 2007, 31, 1375–1398.
  33. Borsdorf, R.; Higham, N.J.; Raydan, M. Computing a nearest correlation matrix with factor structure. SIAM J. Matrix Anal. Appl. 2010, 31, 2603–2622.
  34. Guo, G.; Jacquier, A.; Martini, C.; Neufcourt, L. Generalized arbitrage-free SVI volatility surfaces. SIAM J. Financ. Math. 2016, 7, 619–641.
  35. Buss, A.; Schönleber, L.; Vilkov, G. Option-Implied Correlations, Factor Models, and Market Risk. INSEAD Working Paper No 2017/20/FIN. 2017. Available online: (accessed on 12 April 2022).
  36. Buraschi, A.; Porchia, P.; Trojani, F. Correlation Risk and Optimal Portfolio Choice. J. Financ. 2010, 65, 393–420.
  37. Buraschi, A.; Trojani, F.; Vedolin, A. When Uncertainty Blows in the Orchard: Comovement and Equilibrium Volatility Risk Premia. J. Financ. 2014, 69, 101–137.
  38. Driessen, J.; Maenhout, P.J.; Vilkov, G. Option-Implied Correlations and the Price of Correlation Risk. Advanced Risk & Portfolio Management, SSRN. 2013. Available online: (accessed on 12 April 2022).
  39. Faria, G.; Kosowski, R.; Wang, T. The correlation risk premium: International evidence. J. Bank. Financ. 2022, 136, 106399.
  40. Numpacharoen, K. Weighted Average Correlation Matrices Method for Correlation Stress Testing and Sensitivity Analysis. J. Deriv. 2013, 21, 67–74.
  41. Arellano-Valle, R.B.; Azzalini, A. The centred parametrization for the multivariate skew-normal distribution. J. Multivar. Anal. 2008, 99, 1362–1382.
  42. Barzilai, J.; Borwein, J.M. Two-point step size gradient methods. IMA J. Numer. Anal. 1988, 8, 141–148.
  43. Birgin, E.G.; Martinez, J.M.; Raydan, M. Spectral projected gradient methods: Review and perspectives. J. Stat. Softw. 2014, 60, 1–21.
  44. Birgin, E.G.; Martinez, J.M.; Raydan, M. Algorithm 813: SPG—Software for convex-constrained optimization. ACM Trans. Math. Softw. (TOMS) 2001, 27, 340–349.
  45. Martinez, J.M.; Pilotta, E.A. Inexact-Restoration Algorithm for Constrained Optimization. J. Optim. Theory Appl. 2000, 104, 135–163.
  46. Gomes-Ruggiero, M.A.; Martinez, J.M.; Santos, S.A. Spectral Projected Gradient Method with Inexact Restoration for Minimization with Nonconvex Constraints. SIAM J. Sci. Comput. 2009, 31, 1628–1652.
  47. Miller, K.S. On the Inverse of the Sum of Matrices. Math. Mag. 1981, 54, 67–72.
  48. Boyle, P.; Feng, S.; Melkuev, D.; Yang, S.; Zhang, J. Short Positions in the First Principal Component Portfolio. N. Am. Actuar. J. 2018, 22, 223–251.
  49. OptionMetrics. IvyDB US File and Data Reference Manual, 3rd ed.; OptionMetrics: New York, NY, USA, 2016.
  50. Ulrich, M.; Walther, S. Option-implied information: What’s the vol surface got to do with it? Rev. Deriv. Res. 2020, 23, 323–355.
  51. Etling, C.; Miller, T.W., Jr. The relationship between index option moneyness and relative liquidity. J. Futur. Mark. 2000, 20, 971–987.
  52. Carr, P.; Wu, L. The Finite Moment Log Stable Process and Option Pricing. J. Financ. 2003, 58, 753–777.
  53. Ghalanos, A.; Theussl, S. Rsolnp: General Non-Linear Optimization Using Augmented Lagrange Multiplier Method. CRAN. 2015. Available online: (accessed on 12 April 2022).
  54. Ye, Y. Interior Algorithms for Linear, Quadratic, and Linearly Constrained Non-Linear Programming. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 1987.
  55. Bakshi, G.; Kapadia, N.; Madan, D. Stock Return Characteristics, Skew Laws, and the Differential Pricing of Individual Equity Options. Rev. Financ. Stud. 2003, 16, 101–143.
  56. Madan, D.B.; Seneta, E. The variance gamma (VG) model for share market returns. J. Bus. 1990, 63, 511–524.
  57. Madan, D.B.; Carr, P.P.; Chang, E.C. The variance gamma process and option pricing. Rev. Financ. 1998, 2, 79–105.
Figure 1. Visualization of the constraints for the two-asset/one-factor/one-market-constraint case. The inequality constraint Ω spans the gray box of mathematically feasible solutions, and the blue line marks the solutions that satisfy the market constraint. The market constraint consists of two convex curves. Hence, two orthogonal projections of X onto g(X) = 0 exist, but only one of them (i.e., Ẋ) has minimum distance to X.
Table 1. Evaluation of existing models against the developed requirements for realistic implied correlation matrices. A check mark ✓ indicates that the model complies with the condition; a ✗ indicates that compliance is not guaranteed. A realistic solution has to fulfill all constraints (i)–(vi).
Symmetric | |Cij| ≤ 1 | Unit Diag. | Pos. Semi-Def. | Arbitrage-Free | Hetero. Struct.
Local Equi-Correlations
Adj. Ex Post of [9], AEP₁
Adj. Ex Post of [29], AEP₂
Skewness Approach
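The four purely mathematical columns of Table 1 (symmetry, bounded entries, unit diagonal, positive semi-definiteness) can be checked numerically for any candidate matrix. The following is a minimal sketch, assuming NumPy; the function name `check_correlation_matrix` and the tolerance `tol` are hypothetical choices for illustration and do not come from the study. The arbitrage-free and heterogeneous-structure conditions depend on market data and are not covered here.

```python
import numpy as np

def check_correlation_matrix(C, tol=1e-10):
    """Return pass/fail flags for four mathematical feasibility checks.

    `tol` is an assumed numerical tolerance, not a value from the study.
    """
    C = np.asarray(C, dtype=float)
    symmetric = bool(np.allclose(C, C.T, atol=tol))
    bounded = bool(np.all(np.abs(C) <= 1.0 + tol))
    unit_diag = bool(np.allclose(np.diag(C), 1.0, atol=tol))
    # eigvalsh expects a symmetric input, so symmetrize before calling it.
    eigenvalues = np.linalg.eigvalsh((C + C.T) / 2.0)
    psd = bool(eigenvalues.min() >= -tol)
    return {
        "symmetric": symmetric,
        "bounded": bounded,
        "unit_diag": unit_diag,
        "psd": psd,
    }

# Every pairwise correlation below is valid on its own,
# but the three together cannot come from a real joint distribution.
C_bad = np.array([
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
])
result = check_correlation_matrix(C_bad)
print(result)
```

In this illustrative input only the positive semi-definiteness check fails, which shows why entry-wise bounds alone do not make a matrix a valid correlation matrix.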
Table 2. Summary statistics of computing nearest (quantitative approach, Panels A–B) and risk-factor structured (economic approach, Panel C) implied correlation matrices on monthly option data. For the quantitative approach, computations took little time and converged towards the optimal solution within few iterations. The nearest factor-structured matrix can thus be used either as a stand-alone estimate or as a tool to repair positive semi-definiteness and invertibility. With the economic approach, implied as well as common risk factors from models such as CAPM or Fama–French can be used to estimate the implied correlation matrix.
Panel A: hist. matrix
SP100 | 1 | hist. | 0.051 | 0.024 | 159.9 | 172.8 | 5.1 × 10 11 | 1.5 × 10 8 | 3.037 | 1.491 | SP100
SP100 | 3 | hist. | 0.106 | 0.050 | 111.0 | 168.3 | 1.3 × 10 8 | 6.2 × 10 7 | 4.819 | 2.137 | SP100
SP100 | 5 | hist. | 0.140 | 0.077 | 102.7 | 166.9 | 1.9 × 10 8 | 9.9 × 10 7 | 4.990 | 1.977 | SP100
mean-reverting matrix
SP100 | 1 | m.r. | 0.055 | 0.027 | 158.0 | 158.2 | 2.0 × 10 11 | 2.5 × 10 9 | 3.064 | 1.438 | SP100
SP100 | 3 | m.r. | 0.107 | 0.047 | 113.3 | 173.2 | 3.9 × 10 8 | 9.6 × 10 7 | 4.708 | 1.790 | SP100
SP100 | 5 | m.r. | 0.149 | 0.082 | 104.6 | 172.5 | 2.4 × 10 8 | 8.9 × 10 7 | 5.054 | 1.989 | SP100
Panel B:
SP500 | 1 | hist. | 5.347 | 2.311 | 2847.9 | 4395.0 | 1.5 × 10 11 | 1.3 × 10 10 | 5.003 | 1.892 | SP500
Repaired C Q A E P 1 15 m . r . Q 0.2230.07016.416.7753.1 × 10 8 9.7 × 10 7 7.1842.351SP100
SQP | 1 | hist. | 0.337 | 0.098 | 161.4 | 166.8 | 3.5 × 10 7 | 9.6 × 10 7 | 3.517 | 0.721 | SP100
Panel C: α̃ avg | α̃ sd
CAPM | 1 | skew Q | 0.018 | 0.014 | 4925.6 | 4548.0 | 5.0 × 10 17 | 4.4 × 10 16 | 0.099 | 0.083 | SP500
CAPM | 1 | hist. | 0.017 | 0.018 | 5108.5 | 5858.9 | 3.0 × 10 17 | 4.7 × 10 16 | 0.138 | 0.133 | SP500
Fama-Fr.3 | 3 | hist. | 0.019 | 0.018 | 4376.6 | 4740.8 | 3.3 × 10 17 | 2.5 × 10 16 | 0.091 | 0.070 | SP500
FF3+Mom. | 4 | hist. | 0.019 | 0.019 | 4167.1 | 4381.3 | 3.3 × 10 17 | 3.1 × 10 16 | 0.082 | 0.060 | SP500
Fama-Fr.5 | 5 | hist. | 0.019 | 0.024 | 4125.3 | 4509.1 | 3.6 × 10 17 | 3.9 × 10 16 | 0.076 | 0.055 | SP500
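To illustrate what repairing positive semi-definiteness means in practice, the following minimal sketch implements the alternating-projections method of Higham (reference 25) for the plain nearest correlation matrix problem. It is not the factor-structured, market-constrained method evaluated in Table 2: it ignores both the factor structure and the market constraint, and its output may be singular. The function name, the iteration cap `max_iter`, and the tolerance `tol` are assumptions made for illustration.

```python
import numpy as np

def nearest_correlation_matrix(A, max_iter=200, tol=1e-8):
    """Alternating projections with Dykstra's correction (Higham, 2002).

    Projects alternately onto the positive semi-definite cone and onto
    the set of symmetric matrices with unit diagonal.
    """
    A = np.asarray(A, dtype=float)
    Y = (A + A.T) / 2.0           # start from the symmetrized input
    correction = np.zeros_like(Y)  # Dykstra's correction term
    for _ in range(max_iter):
        R = Y - correction
        # Projection onto the PSD cone: clip negative eigenvalues at zero.
        w, V = np.linalg.eigh(R)
        X = (V * np.maximum(w, 0.0)) @ V.T
        X = (X + X.T) / 2.0        # remove round-off asymmetry
        correction = X - R
        # Projection onto the unit-diagonal set.
        Y_new = X.copy()
        np.fill_diagonal(Y_new, 1.0)
        change = np.linalg.norm(Y_new - Y, "fro")
        Y = Y_new
        if change <= tol * max(1.0, np.linalg.norm(Y, "fro")):
            break
    return Y

# Illustrative input: valid pairwise correlations that are jointly infeasible.
C_bad = np.array([
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
])
C_fixed = nearest_correlation_matrix(C_bad)
print(np.round(C_fixed, 4))
print(np.linalg.eigvalsh(C_fixed).min())
```

Because the result can sit on the boundary of the positive semi-definite cone, its smallest eigenvalue may be zero up to the tolerance, so this plain variant does not by itself guarantee invertibility.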
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Schadner, W.; Traut, J. Estimating Forward-Looking Stock Correlations from Risk Factors. Mathematics 2022, 10, 1649.
