1. Introduction
Stock markets are forward-looking by nature, and properly assessing future developments can be a decisive competitive advantage for any market participant. The option implied correlation matrix is an important tool, as it embodies the ex ante aggregated expectation about forward-looking dependencies between securities [1,2]. Sufficient knowledge about this matrix is a key requirement for pricing basket/index options [2,3], but is also important for other purposes, such as factor-based asset pricing [4,5], forecasting [1,6,7,8], trading [8,9,10,11], managing risks [1,12,13], or understanding market behavior [14,15]. Using the implied correlation matrix can be advantageous over using historical data for the above tasks in practice. The implied correlation matrix contains forward-looking market expectations and investors' current perception of risk [16]. Hence, it can be used to make predictions about the future without additional assumptions. Using historical data for future predictions, however, rests on the assumption that patterns of the past remain decisive for the future. Empirical analyses show that implied estimates can be a better proxy for future realized volatility [17,18] and future returns [19]. It is also documented that mean-variance efficient portfolios constructed from implied correlations outperform those constructed from past data [10,20]. In a broader context, research shows that characteristics of implied data beyond the second moment also have rich information content (e.g., [21]) and that, besides options, other instruments and channels can also be used to extract forward-looking information (e.g., [22,23]). Given its timeliness and richer information content [24], there is broad interest in solving the cross-sectional implied correlation puzzle. The goal of this paper is to provide a fully feasible solution to that puzzle.
Despite the broad interest, finding a realistic solution to the puzzle remains a challenging task [25]. The main difficulty of solving for implied correlations is that far fewer options are traded than there are unknown correlation pairs within the matrix. Hence, the problem is underdetermined, and multiple solutions exist in general. Current models solve the problem numerically but, as shown later, they either cannot guarantee the important mathematical properties or impose unrealistic economic assumptions. In other words, while they deliver numerically correct results, those results can be of little use from a practical standpoint.
The purpose of this paper is to overcome this shortfall in a two-stage process. Its first contribution is to answer the question of which requirements implied correlation estimates must fulfill to be mathematically and economically feasible. In doing so, it identifies the shortcomings of existing models and argues that none of these models can be used as a reliable and realistic tool by investors to estimate implied correlation matrices. The second contribution of this paper is that it validates the hypothesis that fully feasible solutions can be generated by using the factor structure of returns. It proposes two alternative estimation methods for the problem at hand. Both solutions are based on the idea that a well-conditioned correlation matrix is composed of a set of factors. The first solution is quantitatively motivated. It computes the nearest implied correlation matrix with respect to a pre-specified target that can be freely chosen. Hence, it can be used as a stand-alone method or to re-establish the feasibility and invertibility of any other model's estimate. This feature makes the quantitative solution especially attractive for applications in portfolio optimization. The second solution is economically motivated. It discusses the link between common risk factors (such as CAPM [26] or Fama–French [27,28]) and the corresponding expected correlation matrix. This closes a gap between factor-based asset pricing and ex ante beliefs. An empirical experiment shows that both solutions are computationally efficient and can be used reliably in practice.
As already discussed, multiple solutions to the correlation matrix puzzle work numerically but can be useless from a mathematical and economic standpoint. To filter the pool of potential solutions, feasibility constraints are developed and grouped into two categories. More precisely, for a market covering n securities, a correlation matrix satisfies mathematical feasibility if the following criteria are met:
- (i)
It is symmetric;
- (ii)
All its elements lie within the interval $[-1, 1]$;
- (iii)
It has unit diagonal;
- (iv)
It is positive semi-definite (psd).
Economic feasibility is fulfilled if the following criteria are met:
- (v)
The matrix is ex ante free of arbitrage;
- (vi)
The values have a realistic structure.
While the mathematical conditions are straightforward, the economic ones need a little more clarification. Ex ante arbitrage freeness (v) is interpreted in the following sense. Within a financial market, there are typically options traded on individual firms, but also on the market index directly. Hence, a minimum criterion for C to meet condition (v) is that it properly aggregates the prices of the constituents to match the observable price of the basket option (cp. [1,2]):
$$\sigma_m^2 = w^\top S\, C\, S\, w, \qquad (1)$$
with $\sigma_m^2$ as the implied variance of the market portfolio, $w$ the corresponding weight vector and $S$ a diagonal matrix of firm implied volatilities. If this condition does not hold, an arbitrageur can earn riskless profits by simultaneously buying (selling) the index options and selling (buying) the constituent options. Following common economic sense, such profit opportunities should not exist, because the moment such a trading opportunity emerges, it is eliminated by the traders who exploit it. Condition (vi) is also taken from the existing literature (such as [29]). Its loose definition implies that, in a real economy, the pair-wise correlations $\rho_{ij}$ are heterogeneous across stock pairs $i \neq j$. A narrower definition is linked to the thousands of publications in finance that decompose returns into factor loadings (see [30,31]). Given this large body of literature, a realistic structure of implied correlations can not only be expected to be heterogeneous, but also to originate from factor loadings.
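To make the aggregation condition of Equation (1) concrete, the following sketch (written in R, the software used for the computations in Section 4) checks whether a candidate correlation matrix reproduces the observed index variance; all weights, volatilities, and the candidate matrix are made-up illustration values.

```r
# Minimal sketch of the no-arbitrage aggregation check in Equation (1):
# sigma_m^2 = w' S C S w, evaluated for made-up illustration inputs.
n   <- 4
w   <- c(0.40, 0.30, 0.20, 0.10)          # index weights (sum to 1)
vol <- c(0.25, 0.30, 0.20, 0.35)          # firm implied volatilities
S   <- diag(vol)                          # diagonal matrix of implied vols
C   <- matrix(0.45, n, n); diag(C) <- 1   # some candidate correlation matrix

agg_var <- as.numeric(t(w) %*% S %*% C %*% S %*% w)
sigma_m <- 0.22                           # observed index implied volatility (assumed)

# Condition (v): the aggregated variance should match the observed index variance.
abs(agg_var - sigma_m^2) < 1e-10
```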
To come up with a solution that matches all six conditions (i)–(vi), C is generated from an inherent k-factor structure [32,33]. In its final form, the factor structure is derived as
$$C = XX^\top \circ (J - I) + I, \qquad (2)$$
having $X$ as factor loadings, $\circ$ the Hadamard product, $I$ the identity matrix, and $J = \mathbf{1}_{n \times n}$ with $\mathbf{1}_{n \times n}$ as the unity matrix of dimension as indicated in the subscript. This structured representation of the implied correlation matrix has the advantage that controlling for the above requirements (i)–(vi) is easy, which allows this paper to provide fully realistic solutions.
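A minimal sketch of how Equation (2) generates a candidate matrix, and how conditions (i)–(iv) can be verified numerically, is given below; the loading matrix is a made-up illustration.

```r
# Sketch of the factor-structured correlation matrix of Equation (2),
# C(X) = XX' o (J - I) + I, with a made-up loading matrix X.
set.seed(1)
n <- 5; k <- 2
X <- matrix(runif(n * k, -0.6, 0.9), n, k)
X <- X / pmax(1, sqrt(rowSums(X^2)))      # keep row norms <= 1 (needed for psd)

J <- matrix(1, n, n); I <- diag(n)
C <- (X %*% t(X)) * (J - I) + I           # '*' is the Hadamard product in R

# Conditions (i)-(iv): symmetry, bounds, unit diagonal, positive semi-definiteness.
isSymmetric(C)
all(abs(C) <= 1 + 1e-12)
all(diag(C) == 1)
min(eigen(C, symmetric = TRUE, only.values = TRUE)$values) >= -1e-12
```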
The remainder of the paper is set up as follows. Section 2 provides a comparison of existing models and identifies their violations of mathematical/economic feasibility. Section 3 discusses the factor model and the proposed solutions to the implied correlation puzzle. In Section 4, the empirical implementation on data of the S&P 100 and S&P 500 is evaluated. Section 5 concludes. Throughout the paper, matrix and Hadamard products are displayed in a fashion like $XX^\top \circ J$. The corresponding multiplication order should be read as $(XX^\top) \circ J$ and not as $X(X^\top \circ J)$ (i.e., 'matrix-before-Hadamard product'). The brackets are dropped to increase readability.
2. Discussion on Existing Models
From a literature review, four distinct concepts for solving the implied correlation matrix are identified. Each concept is presented briefly and evaluated against the feasibility constraints developed above. Before doing so, it must be highlighted that, generally, in quantitative finance, it is often more convenient to work with implied volatilities directly instead of option prices (cp. [34]). This also applies to implied correlations. Given non-flat implied volatility surfaces, a common convention is to use options of the same strike and maturity [1,2,9]. The existing concepts for solving implied correlation matrices are given as follows.
The literature on implied correlations was initialized by [1] in 2005. This model is not only cited in most of the subsequent research in that direction, but it is also used as the fundamental concept behind the CBOE implied correlation index. The idea of the method is as follows. For a given point in time, the weights $w$ and the forward-looking measures $\sigma_m$ and $S$ are available (or can be estimated), and each off-diagonal element within C can be set to the same scalar $\bar\rho$ to match the equilibrium of Equation (1). The equi-correlation $\bar\rho$ is then easily computed by
$$\bar\rho = \frac{\sigma_m^2 - \sum_{i=1}^{n} w_i^2 \sigma_i^2}{\sum_{i=1}^{n}\sum_{j \neq i} w_i w_j \sigma_i \sigma_j}$$
(cp. [5]). From the proof of [33] (Lem. 2.1), it follows that the equi-correlation matrix will be psd if $\bar\rho \in [0, 1]$, which is also sufficient for mathematical feasibility. Within an economy, securities are typically, on average, positively correlated (the CBOE implied correlation index is also consistently positive), so this should not be much of an issue empirically. However, it is unrealistic that each correlation pair inside C takes on the same value, and hence the equi-correlation matrix is seen as economically infeasible and violates condition (vi) (cp. [9,29]). Additionally, it clearly rejects the existence of a (risk-)factor structure (also discussed in [5]), being in contradiction with the empirical asset pricing literature. In case the market index also has options traded upon sub-indices (e.g., S&P 500 sector indices), it is unlikely that the equi-correlation matrix will match them. Hence, this is a violation of condition (v). Due to those reasons, the equi-correlation approach is more frequently used as an index of average diversification possibilities rather than as a correlation matrix per se (e.g., [14]).
The violation of the equilibrium (Equation (1)) from existing sub-index options can be easily resolved by introducing local equi-correlations, where the correlations are averaged out within sub-squares of the matrix. This idea is mentioned in [35], who use options on the S&P 500 market index and 10 S&P 500 sector sub-indices. Herein, the average correlation within each sub-index is computed first, before computing the global equi-correlation outside the sub-portfolios. For a sub-portfolio p, Equation (1) must hold for its weights $w_p$ and volatility $\sigma_p$. Local equi-correlations of different sub-portfolios p are always centered alongside the diagonal of C. This raises the issue that large parts of C cannot be covered and, obviously, with a handful of sub-indices and a very large number of unknown correlation pairs, the resulting correlation matrix will still give a very blurred and unrealistic picture. For example, if $n = 500$, there are 124,750 unknown correlation pairs. With 10 sub-indices, there are only 11 distinct estimates. For this definition of local equi-correlations, condition (iv) does not hold, as the resulting matrix is not necessarily psd. Additionally, as argued in [5], the known factor structure of security markets (condition (vi)) will still be ignored. The work of [5] further demonstrates how the factor structure can be resolved by also calibrating off-diagonal local equi-correlations. However, this refinement also yields a very blurred estimate of the implied correlation matrix. Hence, as the local equi-correlation method only slightly improves precision compared to the global equi-correlation, it still causes unrealistic estimates of C and is thus economically infeasible.
This method combines (local) equi-correlations with backward-looking estimates to approximate the forward-looking implied correlation matrix [9,35]. For the discussion of this model, it has to be kept in mind that, due to the stochastic nature of variances and correlations, option prices are documented to carry risk premia for variance (VRP) and for correlation (CRP; see [36,37,38,39]). Let $\mathbb{P}$ denote investor expectations under the physical and $\mathbb{Q}$ under the risk-neutral probability measure. Option implied volatilities are known to be $\mathbb{Q}$-measured. The physically expected correlation matrix can be denoted by $C^{\mathbb{P}}$. Then, the difference between the implied market volatility and the volatility aggregated from individual implied volatilities under $C^{\mathbb{P}}$ can be used to express the volatility-quoted ex ante CRP (cp. [6,37]). Building on this idea, A can be defined as a backward-looking estimated correlation matrix; in the example of [9], it is simply the 1-year historically realized return correlations, but A can also be chosen from a more sophisticated model (e.g., incorporating mean-reversion [35]). For the model of [9] (indicated by the corresponding superscript), two crucial assumptions are obligatory. First, they introduce that the backward-looking A is equivalent to the forward-looking $C^{\mathbb{P}}$ matrix, $A = C^{\mathbb{P}}$. While investors use backward-looking information to form their beliefs, there is no fundamental reasoning why this should hold, as A does not carry information on the market outlook (see, e.g., [1]). The second assumption is that the correlation risk premium enters into the matrix in the specific form of
$$\rho^{\mathbb{Q}}_{ij} = \rho^{\mathbb{P}}_{ij} - \alpha\,(1 - \rho^{\mathbb{P}}_{ij}),$$
with $\alpha$ as a scalar calibrating for the correlation risk premium and $\rho^{\mathbb{Q}}_{ij}$ as the entries of their estimate of the implied correlation matrix. The two assumptions come at a mathematical convenience. Substituting A for $C^{\mathbb{P}}$ and rearranging terms, the equation can be brought into the form
$$C^{\mathbb{Q}} = (1 + \alpha)\,A - \alpha\,J, \qquad (4)$$
known in the literature as the 'weighted average correlation matrix' [40]. Since both A and J are psd, and the sum of two psd matrices is also psd, the mathematical feasibility of $C^{\mathbb{Q}}$ holds for $\alpha \in [-1, 0]$, which is fulfilled when the index implied variance lies between the variance aggregated under A and the variance aggregated under perfect correlation. Empirically, however, $\alpha$ is likely to fall outside that range, and this method does not comply with condition (iv) in general. For example, on monthly S&P 100 data from 1996 to 2020 with A as the one-year historically realized correlation matrix, it can be observed that $\alpha \le 0$ in 156 and $\alpha > 0$ in 144 of the 300 monthly estimates; thus, the requirement did not hold 48% of the time. This rejection of mathematical feasibility is also the main critique stated in [29], who provide a workaround for the $\alpha > 0$ cases. In this model, whenever $\alpha > 0$, the matrix J of Equation (4) is replaced by the 'equi-correlation lower-bound' L, defined as the matrix with all off-diagonal elements equal to $-1/(n-1)$ and unit diagonal. Here, L is simply the smallest possible psd equi-correlation matrix (see the discussion on equi-correlations above). Following the modification of [29] (indicated by the corresponding superscript), the implied correlation matrix is computed from the same weighting scheme with L in place of J. According to [9], consistent investor preferences require that all $\rho^{\mathbb{P}}$-correlation pairs are scaled in the same direction under general risk aversion; this means up if $\alpha \le 0$ and down when $\alpha > 0$. This consistency clearly holds in [9,29] for the $\alpha \le 0$ cases. Analyzing the workaround of [29] in greater detail, one recognizes that the off-diagonal entries inside L are slightly negative and close to zero (e.g., −0.01 for $n = 100$ and −0.002 for $n = 500$). As a result, for $\alpha > 0$, almost every negative $\rho^{\mathbb{P}}$-correlation pair will be up-scaled, while positive ones will be down-scaled at the $\mathbb{P}$-to-$\mathbb{Q}$ transformation. Therefore, the [29] workaround repairs the mathematical feasibility of the basic adjusted ex post model, but at the same time causes an inconsistent implementation of the correlation risk premium, thus violating condition (v). This results in a rejection of economic feasibility, leaving an insufficient solution to the implied correlation matrix behind. While mathematical/economic flaws still persist, compared to the other concepts in place, the adjusted ex post method seems to be the most realistic one. The nearest implied correlation algorithm introduced in Section 3 can be applied on top of these models to repair feasibility.
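To make the adjusted ex post idea tangible, the sketch below applies the weighting of Equation (4) and calibrates the scalar alpha to the market constraint of Equation (1); the closed form for alpha used here follows from the linearity of Equation (4) in alpha, and all inputs are made-up illustration values.

```r
# Sketch of the adjusted ex post ("weighted average") implied correlation matrix:
# rho_Q = rho_P - alpha * (1 - rho_P), with alpha calibrated to Equation (1).
n <- 4
w <- c(0.40, 0.30, 0.20, 0.10)
s <- c(0.25, 0.30, 0.20, 0.35)
S <- diag(s)
A <- matrix(0.35, n, n); diag(A) <- 1     # backward-looking proxy (illustration)
J <- matrix(1, n, n)
sigma_m <- 0.22                           # observed index implied volatility (assumed)

# alpha solves sigma_m^2 = w' S [(1 + alpha) A - alpha J] S w (linear in alpha).
vA <- as.numeric(t(w) %*% S %*% A %*% S %*% w)
vJ <- as.numeric(t(w) %*% S %*% J %*% S %*% w)
alpha <- (sigma_m^2 - vA) / (vA - vJ)

C_Q <- (1 + alpha) * A - alpha * J
# psd is only guaranteed for alpha in [-1, 0]; check the smallest eigenvalue.
c(alpha = alpha,
  min_eig = min(eigen(C_Q, symmetric = TRUE, only.values = TRUE)$values))
```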
Of note, Refs. [10,12] use a CAPM-like model, introduce economic assumptions to cancel out mathematical relationships, and estimate implied correlations between a stock and the market portfolio by combining option-implied volatilities with risk-neutral skewness. Since CAPM is a factor model, the estimates correspond to X of Equation (2), and their approach can thus be used for a solution to the implied correlation matrix. However, this model neither coheres with the market constraint (Equation (1)), nor does it respect the boundaries stated in condition (ii). So, mathematical and economic feasibility are ignored. On the other hand, the psd condition is easily met when combined with the factor model of Equation (2).
Table 1 summarizes the assessment of existing models with respect to the constraints developed in
Section 1. It can be observed that none of the presented models meet all feasibility constraints. It appears that while most models handle mathematical feasibility relatively well, they do so at the expense of being economically justifiable. From the results of
Table 1, it can be hypothesized that all drawbacks can be overcome when an implied correlation matrix is estimated from factors. This hypothesis is tested in the following section.
3. Solutions from Factor Structures
This section develops the two proposed approaches to estimate a mathematically and economically feasible correlation matrix. The section starts with a discussion on how correlation matrices can be estimated from factors. Thereafter, quantitative and economic estimation approaches are introduced, both of which ensure that the mathematical and economic feasibility constraints are met.
Generally, implied volatilities are used as input parameters, which can be estimated in various ways (see [34] for a detailed discussion). Depending on the kind of implied volatilities used (e.g., centered vs. directly parameterized [41]), the methodology is not limited to Pearson-type correlation matrices and can also be used within more sophisticated option pricing models, adjusting for non-normal distributions. An example for the case of a multivariate variance-gamma process can be found in Appendix A.
The correlation structure that is typical for financial markets can be expressed with the multi-factor copula model described by [32]. This model is used in a nearest correlation matrix context by [33]. The factor-generating core of the model follows a simple but intuitive definition, that is
$$\xi = X\eta + F\varepsilon,$$
where $\xi$ describes a random vector, $F$ a diagonal matrix and $X$ corresponds to the factors' magnitude. All three vectors, $\xi$, $\eta$ and $\varepsilon$, are defined to have zero mean and unit variance; $\eta$ and $\varepsilon$ are orthogonal, $\mathbb{E}[\eta \varepsilon^\top] = 0$. From this, it follows that
$$\mathbb{E}[\xi \xi^\top] = XX^\top + F^2. \qquad (8)$$
Since $\xi$ has unit variance, $XX^\top + F^2$ turns out to be a correlation matrix with the boundaries of $[-1, 1]$, such that X is necessarily limited to $X \in [-1, 1]^{n \times k}$. From the above equation, it follows that every $x_{id}$ corresponds to the goodness-of-fit of stock i explained by the risk factor d, and $F^2$ to the unsystematic part of the correlation, which cannot be explained by the given set of factors. From the notation, it follows that $x_{id}$ itself can be interpreted as the correlation of a stock i to a risk factor d (e.g., the correlation with the market factor in CAPM). $F$ itself is of less importance for further modeling, as it can be easily computed once a feasible solution to X is found. Therefore, the factor-structured correlation matrix of Equation (8) can be equivalently rewritten in a form where $F$ is suppressed and X remains the only unknown. This reformulation is used in the specific context as the factor-structured implied correlation matrix, which is now denoted by Equation (2). Other works in the literature (e.g., [33]) write $C = XX^\top + D$ with $D$ as the diagonal matrix of $I - XX^\top$. This alternative definition yields the same C; however, given the cleaner notation of Equation (2) and its superiority in terms of computation, as J can be pre-computed (thus not iterated) while $D$ cannot, Equation (2) is preferred in this paper. Below, it is shown how Equation (2) generates a well-conditioned correlation matrix.
Theorem 1. The matrix $C = XX^\top \circ (J - I) + I$ is a mathematically feasible correlation matrix whenever $\|x_i\|_2 \le 1$ holds for every row $x_i$ of $X$.

Proof. Since $XX^\top$, J and I are symmetric, it follows that C is symmetric and condition (i) is guaranteed. From $\|x_i\|_2 \le 1$ it follows by the Cauchy–Schwarz inequality that condition (ii) holds. As for condition (iii), $XX^\top \circ (J - I)$ has zero diagonal such that addition by I guarantees unit diagonal. C's off-diagonal elements are generated from $XX^\top$, which is positive semi-definite by construction. Since $C = XX^\top + \operatorname{diag}(I - XX^\top)$, it follows that C is positive semi-definite given that $\|x_i\|_2 \le 1$. Therefore, conditions (i)–(iv) are met. □
In empirical applications, it is unrealistic that asset returns are perfectly (anti-)correlated with risk factors. Therefore, excluding −1 and 1 from the factor loadings, X, is not only more appropriate, but it also comes with the practical convenience that the resulting matrix is guaranteed to be non-singular (invertible). This is practical for portfolio optimizations where the inverse of the correlation matrix can be used for closed-form solutions. An example of such an optimization task is the computation of the global minimum variance portfolio.
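As a small illustration of why invertibility matters in practice, the sketch below computes global minimum variance weights in closed form from a factor-structured implied covariance matrix; all inputs are illustration values.

```r
# Sketch: closed-form global minimum variance (GMV) weights,
# w_gmv = Sigma^{-1} 1 / (1' Sigma^{-1} 1), from an invertible implied matrix.
set.seed(2)
n <- 5; k <- 2
X <- matrix(runif(n * k, -0.5, 0.8), n, k)
X <- 0.95 * X / pmax(1, sqrt(rowSums(X^2)))   # strict row norms < 1 -> invertible
J <- matrix(1, n, n); I <- diag(n)
C <- (X %*% t(X)) * (J - I) + I

s     <- runif(n, 0.15, 0.40)                 # implied volatilities (illustration)
Sigma <- diag(s) %*% C %*% diag(s)            # implied covariance matrix
ones  <- rep(1, n)
w_gmv <- solve(Sigma, ones) / sum(solve(Sigma, ones))
round(w_gmv, 4)
```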
Lemma 1. The correlation matrix $C$ is invertible if its factor loadings are not perfectly (anti-)correlated, $|x_{id}| < 1$ for all $i, d$.

Proof. For the one-factor case of $k = 1$, ref. [33] (Cor. 3.3) showed that non-singularity holds if at most one entry in $X$ is 1. Hence, $C$ subject to $|x_{id}| < 1$ is guaranteed to be non-singular. The behavior for $k > 1$ is similar in the sense that at most one entry per column of $X$ is allowed to take on 1. This condition becomes more obvious when looking at the two-factor case, $k = 2$. In this case, X is structured as
$$X = \begin{pmatrix} z & 0 \\ x & y \end{pmatrix},$$
where $x$ and $y$ represent vectors of dimension $n - q$, and z is of dimension q. The vectors have the characteristics of $z = \mathbf{1}_q$ and $x, y \in (-1, 1)^{n-q}$. Hence, the first column of X has q many entries equal to 1. When $\|x_i\|_2 \le 1$, it follows that all X entries in the same row as z take on zero (if not z itself). This means that the columns of X can be arbitrarily exchanged, but it will not affect C. Further, from the permutation argument of [33] (Cor. 3.3), it is known that the position of z does not matter. The upper-left corner of the correlation matrix can be abbreviated by $C_q$, which represents a feasible correlation matrix itself. Thus, C is alternatively represented by a block matrix with $C_q$ in the upper-left corner and the remaining $(n - q)$-dimensional block in the lower-right corner, for which [33] (Cor. 3.3) showed non-singularity to hold if at most one entry of each loading column equals 1. Therefore, when using $|x_{id}| < 1$ instead, it follows that $C_q$ and C are guaranteed to be invertible. Since adding additional factors $k > 2$ will enter into the lower-right block only, it does not change the conclusion. □
For economic feasibility, Equation (2) needs to be combined with the equilibrium conditions of Equation (1). Therefore, the options traded on the market index and its sub-indices restrict the possible values of X. These restrictions are referred to as 'market constraints' from now on. Given $\gamma$ many market constraints, the general solution to the factor-structured implied correlation matrix now evolves as the set of loading matrices X that generate a mathematically feasible matrix via Equation (2) while satisfying all market constraints. From this definition, it follows that the set of mathematically feasible X is a closed convex set of real numbers. This set can be expressed as inequality constraints, and the market constraints are of the equality type. From the equality constraints, it follows that the feasible set is a subset (a hypersurface if only one market constraint is given) within the mathematically feasible set, denoted by $\Omega$ in the subsequent. For optimization purposes, it is convenient to formulate the constraints as follows.
Definition 1. Let $h(X)$ describe the inequality constraints,
$$h(X) = \mathbf{1}_n - (X \circ X)\,\mathbf{1}_k,$$
such that $h(X) \ge 0$ guarantees that $\|x_i\|_2 \le 1$ for every row of X, which is necessary and sufficient for mathematical feasibility. In practical terms, when one wants to explicitly ensure that the implied correlation matrix is invertible, the easiest way to do so is by introducing a very slim tolerance $\epsilon > 0$ and requiring $h(X) \ge \epsilon$.
Definition 2. Let $g(X)$ define the vector of market constraints, which simply stacks Equation (1) for every (sub-)index that has option contracts traded upon,
$$g_p(X) = w_p^\top S_p\, C\, S_p\, w_p - \sigma_p^2, \qquad p = 1, \dots, \gamma,$$
with $p = 1$ reserved for the market portfolio m. Hence, $g(X) = 0$ is necessary for economic feasibility. The equality constraint is necessary, but not sufficient, for economic feasibility. The best example is the (local) equi-correlation model, which meets this requirement but yields an unrealistic implied correlation matrix, such that economic feasibility is not met.
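A minimal sketch of how these two constraint sets can be evaluated for a candidate loading matrix is given below; the function names and the single market constraint are illustrative assumptions following Definitions 1 and 2.

```r
# Sketch: evaluating the inequality constraints (Definition 1) and one
# market constraint (Definition 2) for a candidate loading matrix X.
build_C <- function(X) {
  n <- nrow(X); J <- matrix(1, n, n)
  (X %*% t(X)) * (J - diag(n)) + diag(n)           # Equation (2)
}
h <- function(X) 1 - rowSums(X^2)                   # >= 0 keeps C psd (Definition 1)
g <- function(X, w, s, sigma_idx) {                 # = 0 enforces Equation (1)
  as.numeric(t(w) %*% diag(s) %*% build_C(X) %*% diag(s) %*% w) - sigma_idx^2
}

# Illustration values only:
set.seed(3)
X <- matrix(runif(4 * 2, -0.4, 0.7), 4, 2)
w <- c(0.4, 0.3, 0.2, 0.1); s <- c(0.25, 0.30, 0.20, 0.35)
h(X); g(X, w, s, sigma_idx = 0.22)
```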
In general, the number of unknown correlation pairs of a correlation matrix as in Equation (14) is $n(n-1)/2$, and the number of observable implied volatilities is only $n + \gamma$. Empirically, it is given that $n(n-1)/2 \gg n + \gamma$ almost surely, and hence, many different numerical solutions to X exist. To come up with a reasonable choice of X, this paper argues in favor of two approaches: first, a purely quantitative approach computing the feasible implied correlation matrix that is nearest to some pre-specified target; second, an economic approach using factor-based asset pricing models, assuming that physically expected factor loadings can be estimated.
3.1. Quantitative Approach: Computing the Nearest Implied Correlation Matrix
This solution addresses the case in which there exists an educated guess about the forward-looking correlation matrix. For example, this educated guess could be derived from historically realized correlations, from a GARCH forecast, or from a not fully feasible estimate of one of the models discussed in Section 2. As before, this educated guess is denoted by A, but different from [9,29], the assumption that A equals the physically expected correlation matrix is not required. Such an educated guess allows one to search for a $C^{\mathbb{Q}}$ that is as similar as possible to A, but satisfies all market constraints in order to be considered a feasible implied correlation matrix.
3.1.1. Formulating the Problem
Finding the feasible matrix most similar to a target matrix is known in the mathematical literature as the nearest correlation matrix problem (e.g., [33]). Similarity between the generated $C^{\mathbb{Q}}$ and the target matrix A can be quantified by the squared Frobenius norm between them,
$$f(X) = \big\| A - C^{\mathbb{Q}} \big\|_F^2,$$
which, when introducing Equation (2), can also be written as
$$f(X) = \big\| A - \big( XX^\top \circ (J - I) + I \big) \big\|_F^2.$$
Lemma 2. The gradient of $f(X)$ is
$$\nabla f(X) = 4\,\big( (XX^\top - A) \circ (J - I) \big)\, X. \qquad (19)$$

Proof. For simplicity, let $M = XX^\top \circ (J - I) + I - A$ and the Frobenius product (trace operator) be denoted by the colon symbol ':', such that $f = M : M$. The differential is thus $df = 2\, M : dM$. The differential of M itself is $dM = (dX\, X^\top + X\, dX^\top) \circ (J - I)$, and, since : and $\circ$ are mutually commutative operators, $M : (dX\,X^\top + X\,dX^\top) \circ (J - I) = (M \circ (J - I)) : (dX\,X^\top + X\,dX^\top)$. Therefore, substituting back in yields $df = 2\,(M \circ (J - I)) : (dX\,X^\top + X\,dX^\top)$; since J and M are symmetric, this reduces to $df = 4\,(M \circ (J - I))X : dX$, and $\nabla f = 4\,(M \circ (J - I))X$ follows. As $I \circ (J - I) = 0$ and hence $M \circ (J - I) = (XX^\top - A) \circ (J - I)$, plugging back in M gives Equation (19). This result is a reduced-form equivalent to the gradient as found in [33], who derived it for the parametrization $C = XX^\top + \operatorname{diag}(I - XX^\top)$. □
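A small numerical sketch of the objective and its gradient, checked against a finite-difference approximation, is given below; the target matrix and the loadings are illustration values.

```r
# Sketch: objective f(X) = ||A - C(X)||_F^2 and its gradient,
# grad f(X) = 4 * ((X X' - A) o (J - I)) X  (reduced form of Lemma 2),
# verified against a central finite-difference approximation.
set.seed(4)
n <- 4; k <- 2
A <- cor(matrix(rnorm(60 * n), 60, n))             # target matrix (illustration)
X <- matrix(runif(n * k, -0.4, 0.6), n, k)
J <- matrix(1, n, n); I <- diag(n)

f_obj  <- function(X) sum((A - ((X %*% t(X)) * (J - I) + I))^2)
f_grad <- function(X) 4 * (((X %*% t(X)) - A) * (J - I)) %*% X

num_grad <- matrix(0, n, k); eps <- 1e-6
for (i in seq_len(n)) for (d in seq_len(k)) {
  Xp <- X; Xm <- X
  Xp[i, d] <- Xp[i, d] + eps
  Xm[i, d] <- Xm[i, d] - eps
  num_grad[i, d] <- (f_obj(Xp) - f_obj(Xm)) / (2 * eps)
}
max(abs(num_grad - f_grad(X)))                     # should be close to zero
```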
Popular optimization methods such as sequential quadratic programming (SQP) build on the Lagrangian function, which for the problem at hand can be formulated as
$$\mathcal{L}(X, \lambda, \kappa) = f(X) + \lambda^\top g(X) + \kappa^\top h(X),$$
with $\lambda$ and $\kappa$ representing the Lagrangian multipliers of the market constraints and the inequality constraints introduced in Section 3. Working with the Lagrangian is probably the most common practice, and hence the respective gradients are also reported below.
Lemma 3. The gradient of $\lambda^\top g(X)$ with respect to X is
$$\nabla_X\, \lambda^\top g(X) = 2 \sum_{p=1}^{\gamma} \lambda_p\, \big( (S_p w_p w_p^\top S_p) \circ (J - I) \big)\, X.$$

Proof. $\lambda^\top g(X)$ can be written as $\sum_p \lambda_p\, g_p(X)$. One market constraint, $g_p(X)$, can be rearranged into the form $w_p^\top S_p \big( XX^\top \circ (J - I) + I \big) S_p w_p - \sigma_p^2$. To abbreviate notation, let $W_p = w_p w_p^\top$ and $B_p = S_p W_p S_p$, both being symmetric matrices. For the gradient with respect to X, all non-X terms cancel out, so the focus is on $B_p : \big( XX^\top \circ (J - I) \big)$, which can be expressed as the matrix trace $\operatorname{tr}\big( B_p (XX^\top \circ (J - I)) \big)$. The differential of this term is thus reduced to $B_p : (dX\,X^\top + X\,dX^\top) \circ (J - I)$. From the proof of Lemma 2, it follows that $B_p : (dX\,X^\top + X\,dX^\top) \circ (J - I) = (B_p \circ (J - I)) : (dX\,X^\top + X\,dX^\top)$. Therefore, the differential writes $dg_p = 2\,(B_p \circ (J - I))X : dX$; hence, the gradient of $g_p$ is given as $\nabla_X g_p = 2\,(B_p \circ (J - I))X$. Here, $B_p$ equals $S_p w_p w_p^\top S_p$, so substituting back in and multiplying by $\lambda_p$ gives $\nabla_X\, \lambda_p g_p(X) = 2\lambda_p \big( (S_p w_p w_p^\top S_p) \circ (J - I) \big) X$; thus, the lemma evolves from summing over all $p$ market constraints. □
Lemma 4. Let $K$ denote the diagonal matrix of $\kappa$; the gradient of $\kappa^\top h(X)$ can then be written as
$$\nabla_X\, \kappa^\top h(X) = -2\,K X.$$

Proof. The term $\kappa^\top h(X)$ can alternatively be written as $\kappa^\top\big(\mathbf{1}_n - X^{\circ 2}\,\mathbf{1}_k\big)$, where $X^{\circ 2}$ describes that each element in X is squared (Hadamard quadratic). Thus, $\nabla_X\, \kappa^\top X^{\circ 2}\mathbf{1}_k = 2\,KX$, which for $h(X) = \mathbf{1}_n - X^{\circ 2}\mathbf{1}_k$ yields the result above. □
3.1.2. Numerical Method
The nearest implied correlation matrix can now be attained by the following optimization:
$$\min_{X} \; f(X) \quad \text{s.t.} \quad g(X) = 0, \; h(X) \ge 0.$$
A performance comparison of numerical methods for the general nearest correlation matrix problem (i.e., without the equality constraints $g$) was presented by [33]. They recommend the spectral projected gradient method (SPGM) as the most efficient and reliable method for this task (compared algorithms include 'alternating directions', 'principal factors method', 'spectral projected gradient' and Newton-based methods; see [33] for details). Based on this finding, the SPGM is also used as the main algorithm within this study for solving the implied correlation matrix puzzle. A brief comparison to an SQP-based solver is included in Section 4.
First, a brief elaboration on the SPGM algorithm and the expansions that are relevant for the applications in this paper is presented in this paragraph. SPGM was initially introduced by [42] and is used in many different studies. The study of [43] summarizes a long list of corresponding works, and the applications to financial data include [25,33]. A detailed discussion of the algorithm can be found in [43,44]. Technically, the SPGM comes with three advantages when compared to other potential candidates. First, the method 'only' requires the gradient (known for this task) but not the Hessian matrix. Second, the non-linear constraints are guaranteed to be satisfied, and third, the method is ensured to converge toward the optimum (cp. [42]). In a nutshell, the method minimizes a continuously differentiable function on a nonempty closed convex set via projecting possible values of X back onto the feasible set. Consequently, a key requirement is that projections on the feasible set can be made at low computational effort. If so, then the method provides a very efficient way of handling the constraints. For the application at hand, handling the equality constraints alongside the inequality constraints can become tricky. This issue is resolved by combining SPGM with a more general inexact restoration framework (see, e.g., [45] for a discussion). The IR-SPGM was introduced by [46], who also provided an in-depth explanation of the algorithm; thus, the details are not repeated here. Roughly speaking, the algorithm can be broken down into two phases: a restoration phase, projecting X back onto the feasible set, and an optimization phase, computing the step size.
For the particular application at hand, greater attention has to be paid to the projection function. In the case of this paper, $\Omega$ defines the feasible region that satisfies both $h(X) \ge 0$ and the market constraints. As mentioned by [46], for some X outside $\Omega$, the projection function $P_\Omega(X)$ should be defined such that
$$\| P_\Omega(X) - X \| \le r\, \min_{Z \in \Omega} \| Z - X \|,$$
where $r \ge 1$. This means that the projection function should find a point in $\Omega$ that is approximately the closest to X (inexact restoration). Obviously, $P_\Omega$ is an orthogonal projection if an exact minimizer exists, which is not necessarily the case, given the upper boundaries of h (Equation (9)). Figure 1 visualizes the projection and feasible set on a simplified example.
For the inexact restoration framework, the projection is split up into two functions, a projection onto the inequality constraints, $P_h$, and one onto the equality constraints, $P_g$. As for $P_h$, [33] already discussed that this projection is easily carried out by replacing every row i of X which exceeds $\|x_i\|_2 > 1$ by $x_i / \|x_i\|_2$. Hence, the projection on h comes at very low computational effort.
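A short sketch of this row-wise projection is given below (illustration only).

```r
# Sketch: projection onto the inequality constraints h(X) >= 0 by rescaling
# every row of X whose Euclidean norm exceeds 1 back to unit length.
project_h <- function(X) {
  norms <- sqrt(rowSums(X^2))
  X / pmax(1, norms)                 # rows with norm <= 1 stay unchanged
}

X_far <- matrix(c(2, 0, 0.3, 0.4), 2, 2)   # first row has norm > 1
project_h(X_far)
rowSums(project_h(X_far)^2)                # all row norms are now <= 1
```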
For the $P_g$ projection, given the market constraint $g(Y) = 0$ and defining Y to be any point that fulfills it, the Lagrangian can be formulated as
$$\mathcal{L}(Y, \lambda) = \| Y - X \|_F^2 + \lambda\, g(Y).$$
To simplify notation, introduce $\tilde{B} = (S w w^\top S) \circ (J - I)$. With the results from above, the gradient of $\mathcal{L}$ with respect to Y can now be written as
$$\nabla_Y \mathcal{L} = 2\,(Y - X) + 2\lambda\, \tilde{B}\, Y.$$
To minimize the Lagrangian function, both gradients are set to zero, $\nabla_Y \mathcal{L} = 0$ and $\nabla_\lambda \mathcal{L} = g(Y) = 0$. Therefore, from rearranging the terms of $\nabla_Y \mathcal{L} = 0$, Y can be defined as
$$Y = \big( I + \lambda \tilde{B} \big)^{-1} X. \qquad (31)$$
The inverse of $(I + \lambda \tilde{B})$ can be expressed following the expansion procedure of [47]. Since in a real economy the elements of $\tilde{B}$ are generally smaller than 1 and close to 0, higher-order terms, such as $\tilde{B}^2$ and $\tilde{B}^3$, converge toward zero very fast. If this was not the case, the implied volatilities could be time-scaled down, for example, from yearly to daily, such that the elements become sufficiently small. Consequently, the inverse is efficiently approximated by
$$\big( I + \lambda \tilde{B} \big)^{-1} \approx I - \lambda \tilde{B}.$$
Plugging back into the equality constraint $g(Y) = 0$ and rearranging terms, $\lambda$ can then be solved from a simple quadratic equation with two solutions, $\lambda_+$ and $\lambda_-$. Having estimated both solutions of $\lambda$ and inserting the results into Equation (31), it is easy to attain whether $Y(\lambda_+)$ or $Y(\lambda_-)$ is closer to X. Hence, it evolves that also the projection $P_g$ is inexpensive to compute. In case $\lambda$ is complex, taking only its real part is considered to be sufficient, given the inexact restoration framework.
The projection $P_\Omega(X)$ that fulfills both $h \ge 0$ and $g = 0$ can now be found following a simple alternating projection algorithm that proceeds as follows. It starts with the initial projection on the market constraint, $P_g(X)$. If $h(P_g(X)) \ge 0$, then the projection has already been found. Otherwise, it locks in whether $\lambda_+$ or $\lambda_-$ was used and takes only the upper or lower solution subsequently. This is useful because the market constraint thereby becomes convex, and alternating the projections $P_g$ and $P_h$ converges toward the optimal solution $P_\Omega(X)$. The alternating procedure can be stopped once a certain tolerance level for the market constraint has been reached. With the defined projection procedure, the optimization under IR-SPGM can now be carried out following the guidelines of [46].
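The sketch below illustrates the alternating structure of the projection. Note that the market-constraint restoration step here simply rescales X by a scalar so that Equation (1) holds; this is a cheap stand-in for the Lagrangian-based $P_g$ projection described above and is used for illustration only, with made-up inputs.

```r
# Simplified sketch of the alternating projection P_Omega: alternate a
# market-constraint restoration (scalar rescaling stand-in for P_g) with the
# row-norm projection P_h until both constraint sets are met.
w <- c(0.40, 0.30, 0.20, 0.10); s <- c(0.25, 0.30, 0.20, 0.35); sigma_m <- 0.19
b <- w * s
off_var <- function(X) {                     # w' S (XX' o (J - I)) S w
  M <- X %*% t(X); diag(M) <- 0
  as.numeric(t(b) %*% M %*% b)
}
target <- sigma_m^2 - sum(b^2)               # off-diagonal variance required by Eq. (1)

restore_g <- function(X) X * sqrt(target / off_var(X))    # simplified restoration
project_h <- function(X) X / pmax(1, sqrt(rowSums(X^2)))  # row-norm projection

X <- matrix(c(0.5, 0.4, 0.3, 0.6, 0.2, 0.3, 0.4, 0.1), 4, 2)
for (it in 1:50) {
  X <- restore_g(X)                          # meet the market constraint
  if (all(rowSums(X^2) <= 1)) break          # h(X) >= 0 already satisfied
  X <- project_h(X)                          # otherwise re-project and repeat
  if (abs(off_var(X) - target) < 1e-10) break
}
c(residual_g = off_var(X) - target, max_row_norm2 = max(rowSums(X^2)))
```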
3.2. Economic Approach
As an alternative to the above quantitative approach, a set of expected risk factors can also be utilized to estimate the implied correlation matrix. The factor exposures can be computed either from a statistical routine (e.g., principal component analysis) or from an economic one (such as CAPM [26] or Fama–French [27,28]). As for the latter, X corresponds to the correlations between stocks and the risk factors. From the literature on factor analysis, it is known that if X can be found such that $XX^\top$ has a unit diagonal, then a correlation matrix is fully explained by its reduced structure X. It is also known that if $k = n$, then a correlation matrix can be fully described by a set of factors (something that is known, e.g., from eigenvalue decomposition). So, either way, a (psd) correlation matrix C is an aggregate with X as the underlying structure. In the models of [9,29], the correlation risk premium is incorporated stock-pair-wise. Different from that, this paper assumes that investors use factor-based asset pricing models and thereby quote the risk premium on the factor level.
Whenever the implied correlation matrix is described by a one-factor model, $k = 1$, the weighting idea of [9] can be applied on the factor-loading level to achieve the $\mathbb{P}$-to-$\mathbb{Q}$ transformation. Taking a one-factor model is no oversimplification, because in financial markets it is documented (e.g., [48]) that the first principal component already explains a very large portion of the realized correlation matrix. It is also reported that the first principal component is typically interpreted as CAPM's market portfolio. In the sense of [9,29], it can be assumed that the physically expected correlations to risk factors, $X^{\mathbb{P}}$, can be estimated on a reasonable basis. Next, following their concept, $X^{\mathbb{P}}$ is weighted against the upper boundary of 1 if the correlations need to be scaled up. In case of down-scaling, the weighting is made toward the lower boundary, which can now be chosen as 0. Hence, investors are modeled to quote the risk premium on the factor exposure rather than on the single-stock level. Considering the diversification effects, the mark-up on risk factors seems more realistic from an economic point of view. Furthermore, the factor-level implementation will not run into an economically inconsistent scaling as in [29], and there is also no psd problem as in [9]. When introducing the sign of the correlation risk premium as $s \in \{-1, +1\}$, which indicates up- or down-scaling, the transformation can then, similar to Equation (4), be written as
$$X^{\mathbb{Q}} = (1 - \omega)\, X^{\mathbb{P}} + \omega\, b_s, \qquad (33)$$
with $b_s$ as the upper boundary (1) for up-scaling and the lower boundary (0) for down-scaling. It generally holds that $\omega \in [0, 1]$, and the implied correlation matrix is given by Equation (2) using $X^{\mathbb{Q}}$. To compute $\omega$, one has to match the market constraint in Equation (1), from which it follows that $\omega$ can be explicitly solved for by a quadratic equation (Equation (34)). For multiple market constraints, one can either calibrate against the market index alone and obtain a unique $\omega$, or first solve for various $\omega$ on the sub-index levels, and then on the market index level. The above quadratic equation obviously gives two values of $\omega$, given that the square-root term can be taken with '±'. Since $\omega$ is associated with an economic interpretation, the upper value is taken in the case of up-scaling and the lower value in the case of down-scaling; hence, '±' is substituted by 's' to implement this mechanism. In a multi-factor pricing model, the high multicollinearity among factors could potentially violate the feasibility bounds on X. This issue can be easily resolved by first computing the correlations between stocks and factors and then orthogonalizing the set of vectors (e.g., via the Gram–Schmidt process).
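A small sketch of this orthogonalization step via the classical Gram–Schmidt process is given below; the input correlations are illustration values.

```r
# Sketch: Gram-Schmidt orthogonalization of the stock-to-factor correlation
# vectors (columns of X_P), removing multicollinearity among factors.
gram_schmidt <- function(V) {
  U <- V
  for (j in 2:ncol(V)) {
    for (i in 1:(j - 1)) {
      U[, j] <- U[, j] - sum(U[, i] * U[, j]) / sum(U[, i]^2) * U[, i]
    }
  }
  U
}

set.seed(6)
X_P <- matrix(runif(5 * 3, 0.1, 0.8), 5, 3)   # correlations to 3 collinear factors
U   <- gram_schmidt(X_P)
round(crossprod(U), 10)                       # off-diagonal entries are ~0
```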
The procedure of computing the implied correlation matrix from pricing factors can now be summarized as follows. First, the physically expected correlations between stocks and risk factors have to be computed and denoted in a set of vectors as $X^{\mathbb{P}}$. If necessary, $X^{\mathbb{P}}$ has to be orthogonalized. Thereafter, the weighting scalar $\omega$ has to be calculated according to Equation (34). Lastly, $X^{\mathbb{Q}}$ has to be computed from Equation (33), and the implied correlation matrix then evolves from Equation (2).
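The sketch below walks through these steps for a one-factor, up-scaling example; the weighting scalar is found by a numerical root search on the market constraint as a stand-in for the closed-form quadratic of Equation (34), and all inputs are illustration values.

```r
# Sketch of the economic approach (one factor, up-scaling case):
# X_Q = (1 - omega) * X_P + omega * 1, with omega matched to Equation (1)
# via a numerical root search (stand-in for the closed form of Equation (34)).
w <- c(0.40, 0.30, 0.20, 0.10); s <- c(0.25, 0.30, 0.20, 0.35); sigma_m <- 0.22
X_P <- matrix(c(0.55, 0.60, 0.45, 0.50), 4, 1)   # physical stock-factor correlations

C_of    <- function(X) { n <- nrow(X); (X %*% t(X)) * (1 - diag(n)) + diag(n) }
idx_var <- function(X) as.numeric(t(w) %*% diag(s) %*% C_of(X) %*% diag(s) %*% w)

omega <- uniroot(function(o) idx_var((1 - o) * X_P + o) - sigma_m^2, c(0, 1))$root
X_Q   <- (1 - omega) * X_P + omega
C_Q   <- C_of(X_Q)
c(omega = omega, constraint_residual = idx_var(X_Q) - sigma_m^2)
```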
4. Empirical Experiment
A brief empirical experiment for the developed models was conducted to evaluate their implementation difficulty and computational efficiency, using data of the S&P 100 and S&P 500 indices. The experiment focuses on the monthly at-the-money (ATM) Call option implied volatilities with a target maturity of one month, which were obtained from OptionMetrics. It is important to know that OptionMetrics follows a three-dimensional kernel regression for interpolating the option surface (see [49]); a discussion on this topic can be found in [50]. Computations were carried out at the beginning of each month, from 1 January 1996 to 2 December 2020, thus giving 300 implied correlation matrices per time series. The ATM level was chosen for three reasons: first, options are typically most liquid around the ATM level [51]; second, the ATM level is less sensitive to model misspecification (compared to out-of-the-money; [52]); third, ATM Call prices are close to ATM Puts. Return (daily) and market value (monthly) data were obtained from CRSP (Center for Research in Security Prices); the lists of index constituents are from Compustat. All computations were executed on an Intel® Core™ i5-8250U CPU at 1.60 GHz, using the statistical programming software R.
The experiment was executed using the following model specifications. As for the optimization algorithms, a tight variance tolerance was imposed on the market constraint, and the stopping criterion was set to a marginal improvement in the objective function. A modest mark-up of $\epsilon$ was also introduced to ensure that estimates were invertible. The IR-SPGM was implemented as described by [46] (Algorithm 2.1), except that a monotone line search strategy was used. The non-monotone line search strategy runs additional sub-routines of projections to speed up convergence. However, since the projection function here was a potentially alternating procedure on its own, the monotone line search was found to be slightly faster than the non-monotone one. Two types of target matrices (A) were used: first, simple Pearson correlation matrices from 12-month historically realized returns; second, a mean-reverting matrix with the entries $a_{ij} = (1 - \varphi)\,\hat\rho_{ij} + \varphi\,\bar\rho_{ij}$, using 9 months of historically realized returns for $\hat\rho_{ij}$ and the mean correlation between i and j over the total time horizon for $\bar\rho_{ij}$. The reversion speed $\varphi$ was randomly drawn from a uniform distribution between 0 and 0.4 to bring some noise into the target matrix. Generally, the two S&P indices are re-balanced on a quarterly basis, and hence there is no guarantee that the target matrices per se are psd as firms leave and enter the indices within the estimation horizon. For the non-psd cases, the quantitative approach also served as a repairing tool.
As for the starting value $X_0$, a modified version of [33] was used. For the target matrix A, let e be the set of eigenvectors and $\lambda$ the corresponding eigenvalues. Then, for each column d of $X_0$, the starting values were computed from the corresponding eigenvector $e_d$ and eigenvalue $\lambda_d$. This starting value is identical to the one proposed by [33] in the case of $k = 1$, but differs when the number of factors is higher. Within the empirical experiments of the $k > 1$ cases, the reduction in the objective function was found to be larger under the modified starting value than under that of [33]. Hence, the modified starting value was preferred.
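For completeness, the sketch below builds an eigenvalue-based starting value in the spirit of [33]; the paper's modified variant for k > 1 differs and is not reproduced here, so this construction is an illustrative assumption only.

```r
# Sketch: eigenvalue-based starting value X0 in the spirit of [33],
# x_d = e_d * sqrt(lambda_d) for the k leading pairs, rows clipped to norm <= 1.
set.seed(7)
k  <- 2
A  <- cor(matrix(rnorm(80 * 6), 80, 6))           # target matrix (illustration)
eg <- eigen(A, symmetric = TRUE)
X0 <- eg$vectors[, 1:k] %*% diag(sqrt(pmax(eg$values[1:k], 0)), k)
X0 <- X0 / pmax(1, sqrt(rowSums(X0^2)))           # keep the starting value feasible
X0
```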
The empirical experiment was split into three parts, which are summarized in
Table 2, Panels A–C. The columns of
Table 2 describe the number of risk factors (k), the target matrix (A), the mean and standard deviation of computation time (t), the optimized objective function (fn), the absolute realization of the variance tolerance (|v.tol|), the number of outer iterations (iter), and the index data used for the calculation (index). Panels A and B show the results of the quantitative approach, computing the nearest implied correlation matrices. Panel C displays the results of the economic approach.
In Panel A, the nearest implied correlation matrix was computed using IR-SPGM for the S&P 100 under different settings with respect to the number of risk factors and the two different target matrices. Overall, the patterns between the historical and the mean-reverting target matrices look very similar. The average computation speed was very fast and is comparable to the non-equality-constrained results of [33]. The high computational speed was probably driven by the fast convergence within few outer iterations (around 3 to 5 on average) and the simplicity of the projection. As it is unlikely that the target matrix perfectly equaled the hidden true implied correlation matrix, the objective function was not expected to reach zero. The results show that, on the one hand, an increasing number of risk factors indeed reduced the final objective function and hence improved the estimation accuracy. On the other hand, unsurprisingly, a larger k came at higher computational effort, as it multiplies the model's number of variables. With a look at the variance tolerance, the IR-SPGM algorithm had no difficulty staying inside the feasible region.
Panel B displays the results for the nearest implied correlation matrix method applied to three alternative settings. The first line shows the results on S&P 500 data. Using the S&P 500 increased the number of unknown correlation pairs from 4950 ($n = 100$) to 124,750. With the increase in the number of stocks, the computation time rose over-proportionally. This was probably caused by the larger number of iterations needed to converge toward the optimum (mean 3.037 vs. 5.003). The optimized objective functions can be compared when dividing by the number of off-diagonal entries, which gives 0.016 for the S&P 100 and 0.011 for the S&P 500 index. With a look at the variance tolerance, the S&P 500 computations stuck more strictly to the first market constraint, with a maximum deviation of 1.3. Therefore, while the computation time increased over-proportionally from the S&P 100 to the S&P 500 index, the mean fn per matrix entry and also the realized variance tolerance were remarkably smaller. Hence, stopping criteria and variance tolerances can potentially be relaxed the more constituents the index holds. The second line of Panel B reports results on the S&P 100, where the target matrix was chosen from mean-reverting correlations, converted into an implied correlation matrix following the adjusted ex post model of [9] (Equation (4)). Hence, the target matrix here was already an implied correlation matrix that fulfilled the market constraint but did not satisfy mathematical feasibility. In this setting, only the monthly estimates of non-psd target matrices were taken, such that the quantitative factor approach was used to repair the estimates under the [9]-model. This included 64 of the 300 monthly matrices. The number of risk factors was set higher here to achieve a better fit, and with a look at fn, it can be observed that the objective function was indeed substantially smaller in this application. Hence, the factor model qualifies as a repair tool for existing implied correlation models. Within the third line of Panel B, the IR-SPGM algorithm was replaced by an SQP routine for the S&P 100 data. The algorithm used was taken from the Rsolnp package (see [53,54]). The SQP solver served as a reference to cross-validate whether the IR-SPGM method was implemented correctly. The results can thus be directly compared to the first line of Panel A. It can be observed that the optimized objective functions were very similar between the SQP and IR-SPGM algorithms, which confirms the correct implementation of the IR-SPGM routine. Comparing computation times between them, SQP does not seem to be competitive, which was also found in the general case of [33]. This finding thus motivates the usage of IR-SPGM.
In Panel C of Table 2, the economic approach was implemented on the S&P 500 index with respect to the implied market exposure [12], CAPM (market factor), the Fama–French three-factor model [27], the extension for the momentum factor (FF3+Mom.), and the Fama–French five-factor model [28]. For the implied market exposure, the risk-neutral skewness was calculated based on [55] using the public code library by Grigory Vilkov (https://doi.org/10.17605/OSF.IO/Z2486, accessed on 23 February 2022). The return data for the remaining factor portfolios were obtained directly from K.R. French's data library (http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html, accessed on 7 April 2021). Following [12], the purely forward-looking correlation between stock i and the market m was computed from the option-implied volatilities and the risk-neutral skewness of the stock and the market. To approximate $X^{\mathbb{P}}$ for the remaining four methods, 12-month realized correlations between stocks and the respective risk factors were used. Note that $X^{\mathbb{P}}$ of the implied market exposure factor solely relied on option data and was thus purely forward-looking. $X^{\mathbb{P}}$ of the other factors, however, had a hybrid form that took historical data to assess implied future co-movement. Next, to ensure feasibility for estimations with $k > 1$, the factors were orthogonalized via the Gram–Schmidt process. On the one hand, since the economic approach does not iterate and consists only of one projection, its computation was very inexpensive. On the other hand, its estimates deviated more strongly from the target matrix than those of the quantitative approach, irrespective of the number of factors employed. Comparing the results within the economic approach, it can be observed that for the hybrid estimations, the larger the number of risk factors, the smaller the average fn. The factor solely relying on implied data resulted by far in the highest average fn. However, similarity to the target matrix plays a subordinate role here. Similar to the average fn, a reduction pattern can also be seen for $\omega$ among the hybrid models. In Section 3.2, $\omega$ was defined as the weight of the boundary and $1 - \omega$ as the weight of $X^{\mathbb{P}}$ inside the risk-neutral factor correlations $X^{\mathbb{Q}}$. This means that the larger the number of risk factors, the less modification was required to match the observed implied market variance in this empirical test. Thus, a larger k is likely to explain the hidden (true) implied correlation matrix better. This time, comparing the purely forward-looking measure to the hybrid estimations shows that, when implied data were used, less modification was required. Generally, as $\omega$ was close to zero across all five economic models, it can be concluded that correlation risk premia enter modestly into the factor-structured implied correlation matrix framework.
5. Concluding Remarks
Having an idea about future diversification possibilities requires knowledge about future correlations. Identifying these is a challenging task, as backward-looking information will never capture events that are expected to happen in the future. In contrast, option-implied volatilities are known to carry information about the market outlook. Hence, they are used by academics and practitioners alike to obtain forward-looking perspectives. Computing implied correlation matrices, however, is an intricate puzzle, since it is a highly under-determined problem. While there already exists a strand of literature that provides estimation methods for such matrices, this paper shows that these approaches either fail to ensure important mathematical characteristics or discard economic rationale. In short, a fully feasible solution to the puzzle has not yet been found.
This paper provides two solutions to the problem by exploiting the commonly accepted assumption that returns stem from factor risk exposure. The first approach is quantitatively motivated and computes the nearest implied correlation matrix subject to a pre-specified target that can be freely chosen. This method turns out to be a useful tool for repairing (implied) correlation matrix estimates, moreover ensuring their invertibility. It can also be used as a stand-alone estimate, requiring a minimum of assumptions. With the second, economically motivated approach, the paper demonstrates how expected risk-factor loadings (or betas) can be translated into an ex ante correlation matrix. As long as one has an educated assumption about either the implied correlation matrix or efficient factor loadings, both approaches provide a fully rational solution to the implied correlation puzzle. Thus, the hypothesis that the shortfalls of existing models can be overcome with factor-based solutions can be accepted. Furthermore, an empirical application of the two approaches on monthly option data of the S&P 100 and S&P 500 (1996–2020) shows that the implementation of the two proposed solutions is easy and computationally efficient.
The findings of this paper have strong implications for practitioners and the literature alike. With the current models, investors either risk losing important mathematical properties or have to abandon economic rationale. The two provided methodologies are not only able to overcome these shortcomings, but also turn out to be convenient in practical implementation. Thus, this paper makes an important contribution to the literature, as it solves a long-standing problem without compromise. Given the broad interest in estimating future-oriented co-movements, the presented approaches can find multiple applications in all areas of finance. These include asset pricing, market forecasting, portfolio optimization, and risk management. While this paper focuses more on the theory of implied correlation estimation, future research should evaluate the performance of the presented solutions in an investment context.