Portfolio Optimization for Extreme Risks with Maximum Diversification: An Empirical Analysis

Heavy tailedness and interconnectedness are widespread in stock returns and large insurance claims, and they contribute to huge losses for financial institutions. The diversification ratio (DR) measures the degree of diversification using the Value-at-Risk, which is known to capture extreme risks better than variance. The portfolio optimization strategy based on DR maximizes the effect of diversification for extreme risks. In this paper, we empirically examine the DR strategy using more than 350 S&P 500 stocks, under the assumption that the stock losses follow a flexible multivariate heavy-tailed model. This assumption is verified empirically. The performance of the DR strategy is compared with four benchmark strategies: the equally weighted portfolio, the minimum-variance portfolio, the extreme risk index portfolio, and the most diversified portfolio. The comparison covers annualized portfolio return, modified Sharpe ratio, maximum drawdown, portfolio concentration, portfolio turnover, and the degree of diversification. DR outperforms the other strategies. In particular, DR shows the highest return and maintains the highest level of diversification during the global financial crisis of 2007–2009.


Introduction
Many empirical studies have shown that equity returns (losses) and large insurance claims exhibit heavy tailedness, i.e., the tail of the return (or loss) is power-like, which can lead to huge losses; see e.g., Jansen and de Vries (1991), Loretan and Phillips (1994), McCulloch (1997), and Gabaix et al. (2003). Interconnectedness is also empirically observed among financial assets, e.g., in Billio et al. (2012) and Acharya et al. (2017). Thus, heavy tailedness and interconnectedness are two features of extreme risks, which can cause severe systemic risk such as the global financial crisis of 2007-2009. Much research has been devoted to the study of systemic risks and the lessons from the financial crisis; see e.g., Gorton (2008), Huang et al. (2009), Huang et al. (2012), Chen et al. (2014), and Rivera-Escobar et al. (2022). It is crucial for financial institutions to manage these extreme risks. Diversification is a common strategy in managing portfolios and it has been studied in different contexts involving financial risks or insurance risks, for example in Schnieper (2000), Choueifaty and Coignard (2008), Choo and de Jong (2010), Mainik and Rüschendorf (2010), and Cui et al. (2021). In this paper, we investigate the performance of an optimal strategy that aims to maximize the effect of diversification in order to mitigate extreme risks.
Quantile-based risk measures such as Value-at-Risk (VaR) can capture extreme risks better than variance, the traditional measure used in the Markowitz portfolio optimization strategy. In this paper, we measure the effect of diversification by the diversification ratio (DR), whose formal mathematical definition is presented in Section 2. Intuitively, DR is the ratio of the risk of the weighted portfolio to the sum of the weighted individual risks, where risk is measured by VaR. The DR is also called the risk concentration based on VaR in an actuarial setting; see, for example, Degen et al. (2010) and Embrechts et al. (2009). In general, 1 − DR can be regarded as a measure of the degree of diversification. Naturally, to maximize the level of diversification, the portfolio optimization strategy maximizes 1 − DR or, equivalently, minimizes DR.
Since the focus of this paper is on extreme risks, the underlying risks are modeled by multivariate regular variation (MRV). The MRV is a multivariate model for extreme risks, which allows heavy-tailed marginals and a flexible dependence structure among the risks. The technical details of MRV are introduced in Section 2. Cui et al. (2022) focused on the asymptotic properties of the DR strategy, and its performance was examined only by showing that it yields a lower portfolio risk than other strategies during the 2007-2009 global financial crisis period. In this paper, we conduct a more comprehensive empirical analysis to examine DR's performance with S&P 500 stocks. We first carry out a preliminary analysis of the dataset to show that MRV is a reasonable model for stock losses. The dataset contains 361 stocks, which have complete historical data from 1 January 2000 to 29 June 2020. We find that alternate-day log-losses have weak serial dependence and thus can be viewed as independent data, while daily log-losses show a stronger dependence. For the alternate-day log-losses, we also verify the heavy tailedness of each stock. Based on this analysis, we proceed to assume that the log-losses of the 361 stocks follow an MRV model. Although this is a rough assumption, we are still able to obtain interesting observations. In the second part of the empirical study, we refine the analysis by grouping stocks with similar heavy tailedness.
The DR portfolio is empirically tested using a 5-year moving window and alternate-day rebalancing. That is, the optimal weights on each rebalancing day are determined using the data in the past five years, and the portfolio is rebalanced every other day. The portfolio values are then calculated based on the optimal weights on all the rebalancing days in the backtest period, from 3 January 2005 to 29 June 2020. We examine the performance of the DR strategy from various aspects. Further, its performance is compared with four benchmark strategies together with the S&P 500 index. Two classic benchmarks are the equally weighted (EW) portfolio and the minimum variance (MV) portfolio. The other two benchmarks are the extreme risk index (ERI) strategy and the most diversified portfolio (MDP) strategy, whose formal definitions are presented in Section 2. The ERI was proposed in Mainik and Rüschendorf (2010) under the MRV structure. ERI uses VaR to measure risks and is essentially the "minimum-VaR" strategy, seen as the counterpart of MV. The MDP was proposed in Choueifaty and Coignard (2008). The MDP shares the same structure as DR while using variance as the risk measure. Thus, it can be considered the counterpart of DR.
Overall, DR shows very promising returns, with the highest annualized return, 15.05%, among the five strategies and the S&P 500 index. More importantly, DR maintains its ability to diversify risks during the financial crisis of 2007-2009, which means it performs better than the other strategies in crisis time. An interesting observation is that the strategies based on VaR (DR and ERI) are very selective in stocks and have high turnover, especially during the crisis time, while the strategies based on variance (MV and MDP) rely on more stocks and have moderate turnover over time. This can be explained by VaR being more sensitive to extreme risks, whereas variance takes both profits and losses into account. Since the turnover is high for the DR and ERI strategies, we further analyze the effect of transaction costs and find that when the transaction cost is relatively low, DR can still outperform the other strategies, because the DR portfolio consists of fewer stocks than the other strategies. Aside from fitting all 361 stocks into an MRV model, we study the performance of the five strategies within groups of stocks exhibiting similar heavy tailedness. Specifically, the stocks are split into three groups based on their tail index, which is a measure of heavy tailedness. DR outperforms the other strategies in the group with the most stocks, whose heavy tailedness lies in the middle. DR has the strongest diversification effect in all three groups of stocks. In summary, DR is a feasible strategy to diversify extreme risks and it performs exceptionally well during the crisis time, which serves as a proxy for extreme occurrences.
The paper is organized as follows. In Section 2, we formulate the DR strategy and introduce the model for extreme risks and the estimation method for DR. The preliminary analysis of the independence and heavy tailedness of log-losses of S&P 500 stocks is carried out in Section 3. Section 4 contains comprehensive empirical studies of DR with the S&P 500 stocks and the comparisons with the benchmark strategies: EW, MV, ERI, and MDP. Section 5 concludes the paper. One proof and some figures and tables are relegated to Appendix A.

The DR Portfolio Optimization Strategy
In this section, we first introduce the MRV model for extreme risks and then we define the DR portfolio optimization strategy. Lastly, under the MRV model, we present the estimation method for the DR strategy.

Model for Extreme Risks
Multivariate regular variation (MRV) is a general multivariate model for extreme risks. It includes a flexible tail dependence structure for the higher dimensional situation and regularly varying marginal distributions. Typical examples of MRV include elliptical distributions with a regularly varying radial component, multivariate Student's t distributions, multivariate α-stable distributions, Archimedean copulas with regularly varying generator, and marginals, among others.
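Let $X = (X_1, X_2, \ldots, X_d)^T$ be a non-negative random vector. If there exist a constant $\alpha > 0$ and a probability measure $\Psi$ on $\mathcal{B}(S^{d-1})$, the Borel sets of the unit sphere $S^{d-1}$ in $\mathbb{R}^d_+$ with respect to a norm $\|\cdot\|$, such that for any $x > 0$,
$$\frac{\Pr\left(\|X\| > tx,\; X/\|X\| \in \cdot\right)}{\Pr\left(\|X\| > t\right)} \xrightarrow{v} x^{-\alpha}\,\Psi(\cdot), \qquad t \to \infty,$$
where $\xrightarrow{v}$ means vague convergence, then $X$ is said to be multivariate regularly varying. The parameter $\alpha$ is called the tail index of $X$, and the probability measure $\Psi$ is called the spectral measure of $X$. Throughout the paper, we denote that $X$ is MRV with index $\alpha$ and spectral measure $\Psi$ by $X \in \mathrm{MRV}_\alpha(\Psi)$.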
Univariate regular variation is obtained by restricting the dimension $d$ to 1. Let $F$ be the distribution function of the random variable $X$. Then $F$ is said to be regularly varying (RV) or heavy tailed if there exists $\alpha > 0$ such that for any $x > 0$,
$$\lim_{t \to \infty} \frac{1 - F(tx)}{1 - F(t)} = x^{-\alpha}.$$
We denote it by $F \in \mathrm{RV}_{-\alpha}$, where $\alpha$ is called the tail index. Large price fluctuations and large insurance claims are often modeled with a heavy-tailed distribution.

DR Strategy
Let $X := (X_1, \ldots, X_d)^T$ be a non-negative random vector representing the losses of $d$ assets. The portfolio loss is given by $w^T X$, where the weights satisfy $w = (w_1, w_2, \ldots, w_d)^T \in \Sigma^d := \{x \in [0, 1]^d : x_1 + x_2 + \cdots + x_d = 1\}$. For this portfolio, the diversification ratio (DR) based on VaR at level $q \in (0, 1)$ is defined as
$$\mathrm{DR}_{w,q} = \frac{\mathrm{VaR}_q(w^T X)}{\sum_{i=1}^{d} w_i\, \mathrm{VaR}_q(X_i)}, \tag{2}$$
where $\mathrm{VaR}_q(X) = \inf\{x \in \mathbb{R} : \Pr(X \le x) \ge q\}$ is the VaR of a random variable $X$ at level $q$. From the above definition, $1 - \mathrm{DR}_{w,q}$ can be regarded as a measure of the degree of diversification. Thus, maximizing the level of diversification means maximizing $1 - \mathrm{DR}_{w,q}$ or, equivalently, minimizing $\mathrm{DR}_{w,q}$; the optimal portfolio is
$$w_q = \arg\min_{w \in \Sigma^d} \mathrm{DR}_{w,q}.$$
An analytical solution for the optimal portfolio $w_q$ is generally unavailable. The usual alternative is to estimate it by numerical methods; however, the computational burden increases exponentially with the dimension $d$, making such an estimate less feasible for higher $d$. In Cui et al. (2022), an approximation of $w_q$ is proposed and its properties are studied under the MRV model. Next we introduce this approximation method.
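As a concrete illustration of the definition of DR, the following minimal sketch evaluates the ratio of the portfolio VaR to the weighted sum of individual VaRs on a finite sample, using the empirical quantile. The function names `empirical_var` and `diversification_ratio` are hypothetical, and the direct empirical plug-in is an assumption made for illustration; it is not the estimation method used in this paper, which relies on the limit as q approaches 1.

```python
import math

def empirical_var(sample, q):
    """Empirical VaR_q: the smallest x whose empirical CDF is at least q."""
    xs = sorted(sample)
    # 1-based rank ceil(n*q); the tiny offset guards against float round-up
    rank = max(1, math.ceil(len(xs) * q - 1e-12))
    return xs[rank - 1]

def diversification_ratio(losses, weights, q):
    """VaR of the weighted portfolio over the weighted sum of individual VaRs.

    losses  : list of observations, each a list of d asset losses
    weights : list of d non-negative weights summing to one
    """
    d = len(weights)
    portfolio = [sum(w * x for w, x in zip(weights, row)) for row in losses]
    weighted_individual = sum(
        weights[i] * empirical_var([row[i] for row in losses], q)
        for i in range(d)
    )
    return empirical_var(portfolio, q) / weighted_individual

# Two identical assets offer no diversification; two offsetting assets do.
same = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
offsetting = [[1.0, 4.0], [2.0, 3.0], [3.0, 2.0], [4.0, 1.0]]
dr_same = diversification_ratio(same, [0.5, 0.5], 0.75)
dr_offsetting = diversification_ratio(offsetting, [0.5, 0.5], 0.75)
```

In this toy example the ratio equals one when the two assets move together and falls below one when their losses offset each other, which matches the reading of 1 − DR as the degree of diversification.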
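Instead of directly estimating $w_q$, we first consider the limit of DR:
$$\mathrm{DR}_{w,1} := \lim_{q \uparrow 1} \mathrm{DR}_{w,q}.$$
Cui et al. (2022) showed that
$$\mathrm{DR}_{w,1} = \frac{\eta_w^{1/\alpha}}{\sum_{i=1}^{d} w_i\, \eta_{e_i}^{1/\alpha}}, \tag{4}$$
with $e_i = (0, \ldots, 1, \ldots, 0)$ having only the $i$th component as 1 for $i = 1, \ldots, d$ and
$$\eta_w = \int_{\Sigma^d} (w^T s)^{\alpha}\, \Psi(ds).$$
Let $w^*$ denote the optimal solution
$$w^* = \arg\min_{w \in \Sigma^d} \mathrm{DR}_{w,1}. \tag{5}$$
The computation of $w^*$ is straightforward. In fact, Cui et al. (2022) showed that if $X \in \mathrm{MRV}_\alpha(\Psi)$ with $\alpha > 1$, then $\lim_{q \uparrow 1} w_q = w^*$.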
That is, when $q$ is close to 1, $w^*$ can be used as an approximation of $w_q$. In this paper, we call $w^*$ the DR portfolio optimization strategy.
The above DR strategy is based on VaR. Other quantile-based risk measures, such as Expected Shortfall (ES) and expectiles, can be applied as well, and they yield the same strategy as VaR. This is because, under the MRV model, these risk measures are asymptotically linearly proportional to VaR as the confidence level $q$ approaches 1; see e.g., Mao and Yang (2015). Since the DR strategy is computed by minimizing $\mathrm{DR}_{w,1}$, the expression of $\mathrm{DR}_{w,1}$ in (4) is unchanged when VaR in (2) is replaced with another risk measure that is asymptotically linearly proportional to VaR, such as ES or expectiles. Thus, all such risk measures, when applied in (2), lead to the same solution $w^*$ in (5). This means the DR strategy is "partially" independent of the choice of risk measure, and it does not suffer from the frequently discussed drawbacks of VaR, such as the lack of subadditivity and the inability to predict a large loss beyond VaR.
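The focus of this paper is to empirically study the performance of the DR strategy with the S&P 500 stock data. As mentioned in the introduction, we compare the DR strategy with four benchmark strategies together with the S&P 500 index: two classic benchmarks, EW and MV, and two other benchmarks, ERI and MDP. The ERI strategy, proposed in Mainik and Rüschendorf (2010) under the MRV structure, minimizes the extreme risk index $\eta_w$ of the portfolio:
$$w_{\mathrm{ERI}} = \arg\min_{w \in \Sigma^d} \eta_w.$$
ERI uses VaR to measure risks and is essentially the "minimum-VaR" strategy, seen as the counterpart of MV. The MDP, proposed in Choueifaty and Coignard (2008), minimizes the ratio of the portfolio volatility to the weighted sum of the individual volatilities:
$$w_{\mathrm{MDP}} = \arg\min_{w \in \Sigma^d} \frac{\sqrt{w^T C\, w}}{\sum_{i=1}^{d} w_i \sigma_i},$$
where $C$ is the covariance matrix of $X$ and $\sigma_i$ is the standard deviation of $X_i$. The MDP shares the same structure as DR while using variance as the risk measure. Thus, it can be considered the counterpart of DR.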

Estimation of DR Strategy
Now we are ready to present the estimation of the DR portfolio. Assume $X \in \mathrm{MRV}_\alpha(\Psi)$ with $\alpha > 1$. Let $X_1, \ldots, X_n$ be an i.i.d. sample of $X$. We follow the estimation method proposed in Cui et al. (2022) for the optimal portfolio $w^*$, which is consistent with that of Mainik and Rüschendorf (2010). That is, we first estimate $\mathrm{DR}_{w,1}$ by
$$\widehat{\mathrm{DR}}_{w,1} = \frac{\hat\eta_w^{1/\hat\alpha}}{\sum_{i=1}^{d} w_i\, \hat\eta_{e_i}^{1/\hat\alpha}},$$
where $e_i = (0, \ldots, 1, \ldots, 0)$ has only the $i$th component as 1 for $i = 1, \ldots, d$, and $\hat\eta_w = \int_{\Sigma^d} (w^T s)^{\hat\alpha}\, \hat\Psi(ds)$, with $\hat\alpha$ and $\hat\Psi$ the estimators of the tail index $\alpha$ and the spectral measure $\Psi$. Then we obtain an estimate $\hat w^*$ by minimizing $\widehat{\mathrm{DR}}_{w,1}$. To estimate $\alpha$ and $\Psi$, we rewrite $X_i$ in polar coordinates with respect to $\|\cdot\|_1$ as
$$(R_i, S_i) := \left(\|X_i\|_1,\; X_i / \|X_i\|_1\right), \qquad i = 1, \ldots, n. \tag{7}$$
Assume that the distribution function of $R$ is continuous. Let $R_{(1)} \ge R_{(2)} \ge \cdots \ge R_{(n)}$ be the order statistics. Let $\pi(1), \ldots, \pi(n)$ denote the indices corresponding to $R_{(1)}, \ldots, R_{(n)}$ in the original sequence $R_1, \ldots, R_n$. Then, we can identify $S_{\pi(j)}$ corresponding to $R_{(j)}$ for $j = 1, 2, \ldots, n$. By using the Hill estimator (see Hill 1975), $\alpha$ is estimated as
$$\hat\alpha = \left(\frac{1}{k} \sum_{j=1}^{k} \log \frac{R_{(j)}}{R_{(k+1)}}\right)^{-1}, \tag{8}$$
where $k$ is chosen such that $k \to \infty$ and $k/n \to 0$. The spectral measure $\Psi$ can be estimated by
$$\hat\Psi = \frac{1}{k} \sum_{j=1}^{k} \delta_{S_{\pi(j)}}, \tag{9}$$
where $\delta_{S_{\pi(j)}}(\cdot)$ is the Dirac measure at $S_{\pi(j)}$. Cui et al. (2022) showed that both $\hat w^*$ and $\widehat{\mathrm{DR}}_{w,1}$ are consistent estimators. The rest of this section is devoted to the asymptotic normality results for the estimator $\widehat{\mathrm{DR}}_{w,1}$. Let $U_n$ denote the $(k + 1)$-st upper-order statistic of $R_1, \ldots, R_n$ transformed by $F_R$. The proof of the following theorem is relegated to Appendix A.
Theorem 1. Let X 1 , . . . , X n be an i.i.d. copy of X ∈ MRV α (Ψ) with α > 1 and Ψ ∂Σ d = 0. Assume that the distribution function F R of R in (7) is continuous, the estimator α is asymptotically normal, and there exists a mapping b ∈ l ∞ Σ d such that for any w ∈ Σ d : Further, assume that ( α − α) and Ψ − Ψ U n are asymptotically independent. Then with ∇h being the gradient of h, (10) is independent of B(Ψ(·)), and
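The estimation steps of this section (polar decomposition with respect to the L1 norm, Hill estimation of the tail index, and the empirical spectral measure supported on the angles of the k largest radii) can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the helper names are hypothetical, a single k is used for both the tail index and the spectral measure, the Hill estimator is taken in its standard form with the (k+1)-th largest radius as threshold, and negative projections are truncated at zero so that only losses contribute.

```python
import math

def polar_l1(sample):
    """Split each observation x into (||x||_1, x / ||x||_1); zero vectors are skipped."""
    out = []
    for x in sample:
        r = sum(abs(v) for v in x)
        if r > 0.0:
            out.append((r, [v / r for v in x]))
    return out

def hill_alpha(radii, k):
    """Hill estimator of the tail index based on the k largest radii."""
    r = sorted(radii, reverse=True)            # r[0] >= r[1] >= ...
    mean_log = sum(math.log(r[j] / r[k]) for j in range(k)) / k
    return 1.0 / mean_log

def eta_hat(w, angles, alpha):
    """Average of (w^T s)_+^alpha over the retained angles s."""
    total = 0.0
    for s in angles:
        proj = max(0.0, sum(wi * si for wi, si in zip(w, s)))
        total += proj ** alpha
    return total / len(angles)

def dr_limit_hat(w, sample, k):
    """Plug-in estimate of the limiting diversification ratio for weights w."""
    polar = sorted(polar_l1(sample), key=lambda p: p[0], reverse=True)
    alpha = hill_alpha([r for r, _ in polar], k)
    angles = [s for _, s in polar[:k]]         # angles of the k largest radii
    d = len(w)
    numerator = eta_hat(w, angles, alpha) ** (1.0 / alpha)
    denominator = 0.0
    for i in range(d):
        e_i = [1.0 if j == i else 0.0 for j in range(d)]
        denominator += w[i] * eta_hat(e_i, angles, alpha) ** (1.0 / alpha)
    return numerator / denominator

# Toy check: two assets with identical losses cannot be diversified.
toy = [[t, t] for t in (1.0, 2.0, 4.0, 8.0, 16.0)]
dr_toy = dr_limit_hat([0.5, 0.5], toy, k=3)
```

In practice the estimated ratio would be minimized over the simplex with a constrained optimizer; the sketch only evaluates it for a given weight vector.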

Preliminary Data Analysis of Stock Losses
In this section, we carry out a preliminary data analysis for real market stock log-losses. We first describe the data and show that alternate-day stock losses do exhibit weak serial dependence and heavy tailedness.

Data
We obtain all the stocks listed in the S&P 500 index from 1 January 2000 to 29 June 2020 from S&P Capital IQ Platform and Financial Modelling Prep API. To have enough data points, we consider the stocks that survived the entire 20 year-long testing period. Further we remove stocks with negative or null prices or prices that did not match across retrieval sources, leaving 361 stocks.
Let the price of stock $i$ at time $t$ be denoted by $P_t(i)$. Since the focus of this paper is on extreme losses of stocks, we study the log-loss at time $t$ for stock $i$, denoted by $X_t(i)$:
$$X_t(i) = -\log \frac{P_t(i)}{P_{t-1}(i)}.$$
For simplicity, we call $X_t(i)$ the loss of stock $i$; $-X_t(i)$ is the log-return of stock $i$.
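The log-loss is the negative log-return between consecutive observations, and alternate-day losses follow from keeping every other price. The sketch below illustrates this on a toy price series; the function names are hypothetical and the choice of which trading days to keep (here the even-indexed ones) is an assumption.

```python
import math

def log_losses(prices):
    """Log-loss between consecutive prices: -log(P_t / P_{t-1})."""
    return [-math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def alternate_day_log_losses(daily_prices):
    """Keep every other daily price, then compute log-losses on the thinned series."""
    return log_losses(daily_prices[::2])

daily = [100.0, 105.0, 110.0, 90.0, 121.0]
alt = alternate_day_log_losses(daily)   # based on prices 100, 110, 121
```

Because log-losses add up over time, each alternate-day loss equals the sum of the two daily log-losses it spans.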

Independence
Daily stock losses are known to have serial dependence and are usually modeled by stationary time series; see e.g., Loretan and Phillips (1994) and Longin (1996). Clustering of extreme values is also observed for daily stock losses; see e.g., McNeil (1998) and Hamidieh et al. (2009). The extremal index $\theta$ is often used to determine the cluster size of extreme values in stationary time series. An informal interpretation of the extremal index is $\theta \approx (\text{mean size of cluster})^{-1}$; see Leadbetter et al. (2012). Formally, the extremal index is defined as follows. Let $\{X_n\}_{n \in \mathbb{Z}}$ be a strictly stationary time series and $M_n = \max\{X_1, X_2, \ldots, X_n\}$. Let $\{\tilde X_n\}_{n \in \mathbb{Z}}$ be an i.i.d. sequence with the same marginal distribution as $\{X_n\}_{n \in \mathbb{Z}}$ and $\tilde M_n = \max\{\tilde X_1, \tilde X_2, \ldots, \tilde X_n\}$. For some norming sequences $c_n > 0$ and $d_n$, if
$$\lim_{n \to \infty} \Pr\left(\frac{\tilde M_n - d_n}{c_n} \le x\right) = H(x) \quad \text{and} \quad \lim_{n \to \infty} \Pr\left(\frac{M_n - d_n}{c_n} \le x\right) = H^{\theta}(x),$$
where $H(x)$ is a non-degenerate distribution function, then $\theta$ is called the extremal index of $\{X_n\}_{n \in \mathbb{Z}}$. In this paper, we estimate the extremal index by the following estimator used in McNeil (1998):
$$\hat\theta = \frac{m}{n} \cdot \frac{\log\left(1 - K_u / m\right)}{\log\left(1 - N_u / n\right)},$$
where $n$ is the sample size divided into $m$ blocks, $u$ is the threshold, $N_u$ is the number of exceedances of $u$, and $K_u$ is the number of blocks in which $u$ is exceeded. The estimate $\hat\theta$ is calculated for all stock losses at different frequencies: (a) daily (5345 data points), (b) alternate-day (2672 data points), and (c) weekly losses (1069 data points), at different threshold levels. The number of blocks $m$ is set as 41 and 82 to account for semi-annual and quarterly blocks. In Figure 1, we show the boxplots of these results. The extremal indices of both the alternate-day and weekly losses are larger than those of the daily losses and close to 1. This means both alternate-day and weekly losses can be considered independent observations or as having very weak dependence. More importantly, it implies that when working with alternate-day or weekly losses, the estimation methods have a better convergence speed than with daily losses.
Since the weekly losses do not show a more significant improvement in the independence of data and its data size is about one third of the alternate-day losses, our analysis in the rest of this paper is based on alternate-day data.
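The block-based estimation of the extremal index described above can be sketched as follows. The sketch assumes the blocks form of the estimator, (m/n) · log(1 − K_u/m) / log(1 − N_u/n), which uses exactly the quantities n, m, N_u and K_u named in the text; the function name `extremal_index_blocks` and the handling of a trailing incomplete block are illustrative choices.

```python
import math

def extremal_index_blocks(series, m, u):
    """Blocks estimator of the extremal index.

    series : list of losses in time order
    m      : number of blocks
    u      : threshold
    """
    block = len(series) // m            # block length; a trailing remainder is dropped
    n = block * m
    data = series[:n]
    n_u = sum(1 for x in data if x > u)                      # exceedances of u
    k_u = sum(
        1 for b in range(m)
        if any(x > u for x in data[b * block:(b + 1) * block])
    )                                                        # blocks with an exceedance
    if n_u == 0 or k_u == m:
        raise ValueError("threshold too low or too high for this block layout")
    return (m / n) * math.log(1 - k_u / m) / math.log(1 - n_u / n)

# Two exceedances in the same block (clustered) versus in different blocks (spread).
clustered = [0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0]
spread = [0, 2, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0]
theta_clustered = extremal_index_blocks(clustered, m=3, u=1)
theta_spread = extremal_index_blocks(spread, m=3, u=1)
```

Clustered exceedances produce the smaller estimate, in line with the interpretation of the extremal index as the reciprocal of the mean cluster size. On very short series the estimate can exceed 1, so it is only meaningful for reasonably long samples.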
The Ljung-Box Q test for serial autocorrelation at the fifth lag, denoted by Q(5), is carried out for the alternate-day losses of each stock. The p-values of these tests are reported in Table 1. For about half of the stocks, the null hypothesis of no autocorrelation is not rejected at the 1% level. This agrees with the estimates of the extremal index, which indicate that the alternate-day data show weak dependence across the series.
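For reference, the Ljung-Box statistic Q(h) aggregates the first h squared sample autocorrelations and is compared with a chi-square distribution with h degrees of freedom under the null hypothesis of no autocorrelation. A minimal sketch of the statistic is given below; the function name is hypothetical and the p-value computation is omitted.

```python
def ljung_box_q(x, lags):
    """Ljung-Box statistic Q(lags) = n (n + 2) * sum_k rho_k^2 / (n - k)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(v * v for v in dev)
    total = 0.0
    for k in range(1, lags + 1):
        # lag-k sample autocorrelation
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

# A perfectly alternating series is strongly (negatively) autocorrelated at lag 1.
q1 = ljung_box_q([1.0, -1.0] * 5, lags=1)
```

For the alternating toy series the lag-1 autocorrelation is −0.9, so Q(1) = 10 · 12 · 0.81 / 9 = 10.8, far in the upper tail of a chi-square distribution with one degree of freedom.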

Heavy Tailedness
In this subsection, we examine the heavy tailedness of stock losses. First, the skewness and kurtosis are calculated for all 361 stocks using the estimators proposed in Richardson and Smith (1993), which are based on the generalized method of moments and are robust against heteroskedasticity. More specifically, we set Kansas City Southern (NYSE:KSU) as the first asset and denote its loss by $X_t(1)$. The cross-skewness of stock 1 with stock $i$, $i = 2, 3, \ldots, 361$, is denoted by $S_{1i}$. The cross-kurtosis of stock 1 and stock $i$ is denoted by $K_{1i}$. Let $\rho_{1i}$ be the correlation between stock 1 and stock $i$. Under the null hypothesis that the stock losses follow a multivariate normal distribution, the limiting distributions of $S_{1i}$ and $K_{1i}$ are given in Richardson and Smith (1993). The p-values of both the skewness and kurtosis statistics are reported in Table 1. The majority of stocks exhibit excess skewness and kurtosis, which suggests that the stock losses have heavy tails.
So far, we have shown that it is more reasonable to model the stock losses with a heavy-tailed distribution and that the alternate-day losses can be considered independent observations. Next, we explicitly test whether the stock losses satisfy the heavy-tailed assumption. More specifically, we test whether the distribution function of the stock loss is regularly varying with tail index $\alpha$. We apply the PE test; see e.g., Hüsler and Li (2006). The test statistic $PE_n$ for stock $i$ is computed from the order statistics $X_{1,n}(i) \le X_{2,n}(i) \le \cdots \le X_{n,n}(i)$; see Hüsler and Li (2006) for its explicit form. The tail index $\alpha$ can be estimated using the Hill estimator; see Hill (1975). Under some conditions, as $n \to \infty$, the test statistic $PE_n$ converges to a limit $PE$ given by an integral functional of $W(\cdot)$, where $W(\cdot)$ is Brownian motion. Here, we set $\eta = 0.5$ and $k = \lfloor 0.04n \rfloor$. The limiting distribution $PE$ is calculated as follows. The Brownian motion on $[0, 1]$ is simulated with an interval size of $1/10000$. The integral is approximated by a Riemann sum, and the calculation is repeated 50,000 times to approximate the distribution of $PE$. The p-values of the PE test for all stocks are reported in Table 1. For half of the stocks, the PE test does not reject at the 1% level. This supports modeling the stock losses as regularly varying.
Lastly, we estimate the tail index of each stock with the Hill estimator defined in (8), using a 5-year rolling window. That is, from 3 January 2005 to 29 June 2020, once every other day, the tail index of each stock is estimated using the previous 5 years of alternate-day data. Figure 2 shows the estimates for all the stocks. Almost all tail indices fall significantly both in 2008-2009 and in 2020, as tails become heavier in response to extreme market movements.

Empirical Study
In this section, we examine the performance of the portfolio that minimizes the DR. It is also compared with four other strategies, EW, MV, ERI, and MDP, together with the S&P 500 index. We first carry out the analysis by assuming that all 361 stocks follow an MRV model. Then we group the stocks with similar tail indices and analyze the performance of the five strategies for each group. Before the empirical study, we discuss the estimation method for the DR portfolio and the backtest method.
By fitting the stock losses with the MRV model, the DR portfolio is estimated using the method introduced in Section 2. To be more specific, we first calculate $\mathrm{DR}_{w,1}$ using (4), which is the limit of $\mathrm{DR}_{w,q}$ as $q$ goes to 1. The solution $w^*$ that minimizes $\mathrm{DR}_{w,1}$ is used to approximate the DR portfolio $w_q$. In the calculation of $\mathrm{DR}_{w,1}$, since only losses are considered, $\eta_w$ is estimated by
$$\hat\eta_w = \frac{1}{k} \sum_{j=1}^{k} \left( (w^T S_{\pi(j)})_+ \right)^{\hat\alpha},$$
where $(\cdot)_+$ denotes the positive part. The tail index $\alpha$ is estimated by $\hat\alpha$ using (8). The proportion $k$ is chosen to balance the bias and variance of the estimation. Here, we choose $k$ as 4% of the sample size when estimating $\alpha$. The spectral measure estimator $\hat\Psi$ is given by (9), where $k$ is 10% of the sample size.
We perform out-of-sample tests to examine the robustness of the DR strategy. The DR is minimized on a 5-year rolling window. The backtest period is from 3 January 2005 to 29 June 2020. The rolling window is moved on alternate days and the portfolio is rebalanced on alternate days as well. For example, on 3 January 2005, the optimal weights are calculated based on the previous 630 observations (5 years before 3 January 2005). There are a total of 2021 rebalancing dates in the backtest period. To calculate the portfolio weights, we use sequential least-squares programming with an equal-weight initialization $(1/d, 1/d, \ldots, 1/d)$. We do not allow short selling, and thus all weights lie between 0 and 1. The error margin is set as $1 \times 10^{-10}$.
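The rolling-window backtest described above can be organized as a simple index generator: for each rebalancing date, the weights are computed from the preceding 630 alternate-day observations only. The sketch below shows this bookkeeping; the function name and the half-open index convention are illustrative assumptions, and the optimization step itself is left out.

```python
def rolling_windows(n_obs, window=630):
    """Yield (start, stop) index pairs for an out-of-sample rolling backtest.

    The weights applied at observation index `stop` are estimated from the
    observations with indices start, ..., stop - 1, so no future data is used.
    """
    for stop in range(window, n_obs):
        yield stop - window, stop

# With 633 alternate-day observations there are three rebalancing dates.
windows = list(rolling_windows(633, window=630))
```

Each pair would be passed to the estimation routine, whose output weights are then held until the next rebalancing date.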

Analysis for All Stocks
We first carry out our analysis on all 361 stocks. When we apply the DR or ERI strategies, we fit the 361 stocks into an MRV model. This is a very rough fit, but some interesting results can be obtained. In the next subsection, a finer fit is carried out for subgroups of the 361 stocks. This approach was also used in Mainik et al. (2015).
The portfolio values under all five strategies (DR, EW, MV, ERI, and MDP) together with the S&P 500 index in the backtest period are plotted in Figure 3. Although not directly implied by our theoretical results, the DR strategy overall has the highest portfolio value, especially during the financial crisis of 2007-2009 and the 2020 stock market crash due to the COVID-19 pandemic. In Table 2, we further compare the performance of the five strategies from different aspects. Subsequently, we discuss them individually. The cumulative return (CR) is the return over the entire backtest period from 3 January 2005 to 29 June 2020 with alternate-day rebalancing. The annualized return (AR) annualizes CR by counting 126 alternate trading days in a year. As seen from Figure 3, and as expected, the DR portfolio has higher CR and AR than the rest.
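As an illustration of how CR and AR relate, the sketch below computes both from a series of portfolio values. It assumes geometric (compound) annualization with 126 alternate-day periods per year; the exact annualization formula used for Table 2 may differ, and the function names are hypothetical.

```python
def cumulative_return(values):
    """Return over the whole period, from the first to the last portfolio value."""
    return values[-1] / values[0] - 1.0

def annualized_return(values, periods_per_year=126):
    """Geometric annualization of the cumulative return (an assumed convention)."""
    n_periods = len(values) - 1
    growth = 1.0 + cumulative_return(values)
    return growth ** (periods_per_year / n_periods) - 1.0

# Two periods of 10% growth, treated as two "years" for the sake of the example.
values = [1.0, 1.1, 1.21]
cr = cumulative_return(values)
ar = annualized_return(values, periods_per_year=1)
```

In the example the cumulative return is 21% and the annualized return recovers the 10% per-period growth rate.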
The Sharpe ratio is estimated using the sample mean and sample variance of the alternate-day rebalanced portfolios. Since the risk-free return rate $r_f$ is both difficult to estimate and close to 0, we set $r_f = 0$ in the calculations. In Table 2, the annualized Sharpe ratio is reported, which is approximated by multiplying the alternate-day Sharpe ratio by $\sqrt{126}$. The MDP portfolio has the highest Sharpe ratio among the five strategies. This is expected, as MDP diversifies away risks measured by the variance, which is the same measure used in the Sharpe ratio. Since our focus here is on extreme risks, standard deviation is not a good risk measure in this case. A modification of the Sharpe ratio that replaces the standard deviation with the Expected Shortfall (ES) risk measure captures extreme risks better and is hence a good performance metric in this context. More specifically, this modified Sharpe ratio, called $\mathrm{STARR}_p(Z)$, is defined as
$$\mathrm{STARR}_p(Z) = \frac{E[Z] - r_f}{\mathrm{ES}_p(r_f - Z)},$$
where $Z$ is the portfolio return and $r_f$ is the risk-free rate; see e.g., Rachev et al. (2005). We let $r_f = 0$, and $\mathrm{ES}_p$ is the expected shortfall risk measure at confidence level $p \in (0, 1)$.
In the empirical studies, we take $p = 95\%$. The ES of a loss $Z$ is estimated empirically from the order statistics $Z_{(n)} \ge Z_{(n-1)} \ge \cdots \ge Z_{(1)}$, as the average of the largest losses beyond the empirical $p$-quantile. The annualized STARR ratio, reported in Table 2, is obtained by multiplying the alternate-day STARR ratio by $\sqrt{126}$. DR has the highest annualized STARR ratio.
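The two risk-adjusted measures can be sketched as follows, with the risk-free rate set to zero as in the text. The sketch assumes that the empirical ES is the average of the largest ceil(n(1 − p)) losses and that STARR is the mean return divided by the ES of the negated returns; both conventions and all function names are illustrative assumptions.

```python
import math

def expected_shortfall(losses, p):
    """Average of the largest ceil(n * (1 - p)) losses (empirical ES at level p)."""
    xs = sorted(losses, reverse=True)
    # the small offset guards against float round-up in n * (1 - p)
    k = max(1, math.ceil(len(xs) * (1 - p) - 1e-9))
    return sum(xs[:k]) / k

def annualized_sharpe(returns, periods_per_year=126):
    """Sample mean over sample standard deviation, scaled by sqrt(periods per year)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def annualized_starr(returns, p=0.95, periods_per_year=126):
    """Mean return over the ES of the losses, scaled by sqrt(periods per year)."""
    mean = sum(returns) / len(returns)
    es = expected_shortfall([-r for r in returns], p)
    return mean / es * math.sqrt(periods_per_year)

returns = [0.02, -0.04, 0.01, -0.02]
es_half = expected_shortfall([-r for r in returns], 0.5)   # mean of the two worst losses
starr = annualized_starr(returns, p=0.5, periods_per_year=1)
```

Replacing the standard deviation by ES makes the denominator depend only on the worst outcomes, which is why STARR is better suited to comparing strategies designed for extreme risks.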
Maximum drawdown assesses the relative riskiness of one strategy versus another. A low maximum drawdown indicates that losses from the investment were small. DR has a lower drawdown than EW, MDP, and the index. That ERI has the lowest maximum drawdown is somewhat expected as it, by definition, minimizes the aggregate risk of the portfolio.
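Maximum drawdown is the largest relative decline from a running peak of the portfolio value to a subsequent trough. A minimal sketch (the function name is hypothetical):

```python
def max_drawdown(values):
    """Largest relative drop from a running peak, as a fraction of that peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                    # highest value seen so far
        worst = max(worst, (peak - v) / peak)  # drop relative to that peak
    return worst

mdd = max_drawdown([100.0, 120.0, 90.0, 110.0])
```

In the example the value falls from a peak of 120 to 90, a drawdown of 25%.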
The concentration coefficient (CC) for a strategy at time $t$ is defined as
$$CC_t = \left( \sum_{i=1}^{d} w_{t,i}^2 \right)^{-1},$$
where $w_t = (w_{t,1}, \ldots, w_{t,d})^T$ is the vector of optimal portfolio weights. For equally weighted portfolios, CC is maximized. When a portfolio concentrates on fewer stocks, the value of CC decreases. While CC ignores correlations in exposures, it captures risk appetite and stress levels effectively with low computational complexity. In Figure 4, CC is plotted for DR, MV, ERI, and MDP in the backtest period. Both DR and ERI have very low CCs over time and are very stable. MV and MDP have high CCs during the financial crisis of 2007-2009 and around 2018. This shows that the selection of stocks is related to the choice of risk measure: especially during the crisis time, the strategies using variance as the risk measure (MV and MDP) need more stocks to diversify away the risk, while the strategies using VaR (DR and ERI) need fewer stocks. The average CC over the backtest period is calculated for each strategy and shown in Table 2. Again, DR and ERI have similar but small CCs, suggesting they are selective in stocks.
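A short sketch of the concentration coefficient follows, assuming CC is the inverse of the sum of squared weights (the inverse Herfindahl index), which is consistent with CC being maximal for the equally weighted portfolio. The function name is hypothetical.

```python
def concentration_coefficient(weights):
    """Inverse of the sum of squared weights; ranges from 1 to d for d assets."""
    return 1.0 / sum(w * w for w in weights)

cc_equal = concentration_coefficient([0.25, 0.25, 0.25, 0.25])
cc_single = concentration_coefficient([1.0, 0.0, 0.0, 0.0])
```

Under this definition CC can be read as the effective number of stocks held: four for the equally weighted example and one when the whole portfolio sits in a single stock.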
The portfolio turnover is defined as $\tau_t = \|w_t - w_{t-1}\|_1$, where $w_t$ is the vector of optimal portfolio weights at time $t$. Turnover captures the extent of rebalancing at each time point through the change in the weight vector, and serves as a proxy for the transaction costs of a strategy. In Figure 5, the turnovers are plotted for DR, MV, ERI, and MDP in the backtest period. The average turnover for each strategy is reported in Table 2. Both DR and ERI have higher turnover than the other strategies. From the above analysis, the VaR-based strategies (DR and ERI) have high turnover compared to the variance-based strategies (MV and MDP). This may be because VaR is more sensitive to extreme risks, whereas variance takes both profits and losses into account. At the same time, DR and ERI have very low CCs, which means these strategies consist of fewer stocks. Thus, we next add transaction costs to see how low CC and high turnover affect the performance of these strategies. According to Wrobel (2017), from April 2007 to August 2008 the transaction cost increased 37.6%, and from August to October in 2008 the transaction cost increased 26.9%, reaching the highest level in 5 years. That is, the transaction cost during the financial crisis of 2007-2009 was higher than in normal times. For simplicity, we assume a piecewise-constant transaction cost: the transaction cost per dollar traded is $\theta$ in normal times, and the cost during the crisis time is $\theta_{fc} = 1.5\theta$, i.e., one and a half times that in normal times. Let $\theta = 0.1\%, 0.5\%, 1\%$. The cumulative return with transaction cost is reported in Table 2 and the corresponding portfolio values are shown in Figure 6. When the transaction cost is as low as $\theta = 0.1\%$, DR still outperforms the other strategies due to its low CC. Once the transaction cost is higher ($\theta = 0.5\%, 1\%$), EW has the best performance and DR has moderate performance because of its high turnover.
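The turnover and the effect of a proportional transaction cost can be sketched as follows. The turnover is the L1 distance between consecutive weight vectors, as defined in the text. How the cost enters the portfolio return is an assumption made here for illustration: the cost rate times the turnover is subtracted from the gross return of the period, with the rate scaled by 1.5 in the crisis period. The function names are hypothetical.

```python
def turnover(w_new, w_old):
    """L1 distance between consecutive weight vectors."""
    return sum(abs(a - b) for a, b in zip(w_new, w_old))

def net_return(gross_return, w_new, w_old, theta, crisis=False):
    """Gross period return minus a proportional cost on the traded amount.

    theta  : cost per dollar traded in normal times
    crisis : if True, the cost rate is 1.5 * theta
    """
    rate = 1.5 * theta if crisis else theta
    return gross_return - rate * turnover(w_new, w_old)

tau = turnover([1.0, 0.0], [0.5, 0.5])
normal = net_return(0.01, [1.0, 0.0], [0.5, 0.5], theta=0.001)
stressed = net_return(0.01, [1.0, 0.0], [0.5, 0.5], theta=0.001, crisis=True)
```

With this convention a strategy with high turnover loses more of its gross return as the cost rate grows, which is the mechanism behind the ranking changes reported for the higher cost levels.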
We further plot the optimal weights for DR, MV, ERI, and MDP at each rebalancing date during the backtest period in Figure 7. The patterns of the optimal weights reinforce the observations from the study of CC and turnover that DR and ERI consist of fewer stocks and have more significant changes in composition. They have similar patterns, i.e., frequent large changes in stock composition during crises (the 2008 financial crisis and the 2020 stock market crash) and rather stable composition otherwise, while MV and MDP exhibit frequent but small changes in composition throughout. This difference can be attributed to the choice of risk measure. VaR captures extreme risks better than variance, so both the DR and ERI strategies are more sensitive in the crisis time, when extreme risks occur. To analyze the degree of diversification, we perform principal component analysis (PCA) for each strategy. The first principal component is calculated on the weighted returns, that is, for a $2021 \times 361$ matrix $M$ with
$$M_{ij} = -\,w^*_i(j)\, X_i(j),$$
where $w^*_i(j)$ is the optimal weight and $-X_i(j)$ is the log-return of the $j$th stock at time $i$. To account for temporal changes, we compute the percentage of the sample variance explained by the first principal component on the first $i$ rows of the matrix $M$ at each time point $i = 1, 2, \ldots, 2021$ in the backtest period. The estimated percentage of variance is plotted over time in Figure 8. We report the last percentage ($i = 2021$) in Table 2. The smaller this value, the greater the diversification. ERI has close to the smallest percentage of sample variance explained by the first principal component for the majority of the time, as shown in Figure 8, and the smallest average value among all strategies, even though it has a low concentration coefficient. DR shows the smallest percentage after the 2008 crisis started. A similar pattern can be expected for the COVID-19 pandemic period.
Although the complete performance of the strategies over the pandemic is not shown here due to the time range of the data, we can still conclude that DR is efficient in diversifying extreme risks and is thus a feasible strategy in a period of crisis. The skewness and kurtosis of portfolio returns are reported as well. DR has a right skew while all the benchmark strategies have left skews. DR also has the highest kurtosis. This reinforces the findings on cumulative and annualized returns that DR has the ability to generate higher risk-adjusted returns.
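The PCA-based diversification measure used above, the share of total sample variance explained by the first principal component of the weighted returns, can be sketched without external libraries by applying power iteration to the sample covariance matrix. The function name, the starting vector and the fixed iteration count are illustrative assumptions; for the full 2021 × 361 matrix a linear-algebra library would be the practical choice.

```python
import math

def first_pc_share(rows, iterations=500):
    """Share of total sample variance explained by the first principal component.

    rows : list of observations, each a list of d weighted returns
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    total = sum(cov[j][j] for j in range(d))   # trace = total variance
    if total == 0.0:
        raise ValueError("all columns are constant")
    v = [1.0 / (j + 1) for j in range(d)]      # non-symmetric starting vector
    for _ in range(iterations):
        cv = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in cv))
        if norm == 0.0:
            break
        v = [x / norm for x in cv]
    cv = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
    top = sum(v[a] * cv[a] for a in range(d))  # Rayleigh quotient, top eigenvalue
    return top / total

share_collinear = first_pc_share([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
share_balanced = first_pc_share([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
```

Perfectly collinear columns give a share of one (no diversification), while two uncorrelated columns with equal variance give one half; smaller values indicate that risk is spread over more independent directions.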
We assess DR's distinctness using the cosine similarity of DR weights and the Pearson correlation coefficient of DR returns against the other strategies in Table 3. We observe low cosine similarities for the DR weights coupled with high return correlations with the other strategies. This dichotomy reinforces DR's novelty in portfolio allocation: correlated returns hint that it capitalizes on the same temporal market movements as the other strategies, while dissimilar weights (and higher returns) suggest that it does so with an effective and distinct methodology.
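The two comparison measures can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the text does not say how cosine similarity is aggregated over rebalancing dates, so averaging the per-date similarities is an assumption, and the function names are hypothetical.

```python
import numpy as np

def mean_cosine_similarity(w_a, w_b):
    """Average over rebalancing dates of the cosine similarity of two weight vectors.

    Rows are rebalancing dates, columns are stocks. Averaging across dates is
    an assumption made for this sketch.
    """
    w_a = np.asarray(w_a, dtype=float)
    w_b = np.asarray(w_b, dtype=float)
    dot = (w_a * w_b).sum(axis=1)
    norms = np.linalg.norm(w_a, axis=1) * np.linalg.norm(w_b, axis=1)
    return float(np.mean(dot / norms))

def pearson_correlation(r_a, r_b):
    """Pearson correlation coefficient between two portfolio return series."""
    return float(np.corrcoef(r_a, r_b)[0, 1])
```

Two portfolios that hold disjoint sets of stocks have cosine similarity 0, yet their returns can still be highly correlated if the stocks share a common market factor, which is the dichotomy discussed above.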

Analysis of Grouped Stocks
In the previous section, all 361 stocks are fitted to an MRV model. In this subsection, we consider portfolios of stocks grouped by similar tail indices. Figure 9 shows the distribution of the tail indices calculated by the Hill estimator based on a 5-year window of alternate-day data from 1 January 2000 to 31 December 2004 for all 361 stocks. We divide these stocks into three buckets as shown in Table 4.
The group of stocks with tail indices α ≤ 2.5 contains the most heavy-tailed securities. The portfolio values for each strategy are plotted in Figure A1 in Appendix A. The portfolio values with transaction costs are plotted in Figure A2. The performance of each strategy for this group of stocks is reported in Table A1. DR has moderate performance in returns. When transaction costs are included, DR has low returns among the five strategies, but they are still higher than those of ERI. DR has the lowest average percentage of sample variance explained by the first principal component. This shows that DR has the most significant diversification effect for the most heavy-tailed stocks.
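For illustration, the grouping step can be sketched in Python with the standard Hill estimator. This is a sketch under assumptions rather than the procedure used in the paper: the number of upper order statistics `k` is not specified in this section and is left as a free parameter, and the function names and bucket labels are hypothetical.

```python
import numpy as np

def hill_tail_index(losses, k):
    """Hill estimate of the tail index alpha from the k largest losses."""
    x = np.sort(np.asarray(losses, dtype=float))[::-1]  # descending order
    if not 0 < k < x.size:
        raise ValueError("k must satisfy 0 < k < number of observations")
    if x[k] <= 0:
        raise ValueError("the (k+1)-th largest loss must be positive")
    # Mean log-excess of the k largest losses over the (k+1)-th largest loss
    # estimates 1 / alpha.
    gamma = np.mean(np.log(x[:k] / x[k]))
    return float(1.0 / gamma)

def tail_bucket(alpha):
    """Assign a tail index to one of the three groups used in the text."""
    if alpha <= 2.5:
        return "alpha <= 2.5"
    if alpha <= 3.5:
        return "2.5 < alpha <= 3.5"
    return "alpha > 3.5"
```

In use, `hill_tail_index` would be applied to each stock's loss series over the estimation window and the result passed to `tail_bucket`. The estimate depends on the choice of `k`, so a stock near a bucket boundary may change group under a different threshold.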

Tail Index Groupings    Number of Stocks
α ≤ 2.5                 84
2.5 < α ≤ 3.5           195
3.5 < α                 82

The group of stocks with tail indices 2.5 < α ≤ 3.5 is by far the largest, with 195 listed securities. The portfolio values for each strategy are plotted in Figure A3; the portfolio values with transaction costs are plotted in Figure A4. The performance is reported in Table A2. DR performs similarly to how it does on the entire set of 361 stocks: it has the highest cumulative return, annualized return, Sharpe ratio, and STARR, and the lowest average percentage of sample variance explained by the first principal component. When the transaction cost is low (θ = 0.1%), DR still has the highest return rate due to low CC; however, when the transaction cost is higher, DR has moderate performance due to high turnover. This shows that DR works well for a vast majority of our securities universe.
The group of stocks with tail indices α > 3.5 contains the lightest-tailed stocks. The portfolio values for each strategy are plotted in Figure A5 and the portfolio values with transaction costs are plotted in Figure A6. The performance is reported in Table A3. Again, DR has the lowest average percentage of sample variance explained by the first principal component, reflecting the most significant diversification effect. When transaction costs are present, DR has moderate performance.
To summarize, DR has the strongest diversification effect in all three groups of stocks. It performs best for the group with tail indices 2.5 < α ≤ 3.5, which contains most of the stocks.

Conclusions
In this paper, we empirically test the performance of a portfolio optimization strategy that minimizes DR on S&P 500 stocks. The performance is compared with four benchmark strategies: EW, MV, ERI, and MDP. The DR and MDP strategies share a similar structure for capturing the degree of diversification but use VaR and variance as the risk measure, respectively. ERI and MV are strategies that minimize risk, measured by VaR and variance, respectively. DR and ERI, both using VaR as the risk measure, are very selective in stocks and have high turnover. This implies that strategies based on VaR undergo major changes in stock composition during crisis times, highlighting their high sensitivity to extreme risks. When there is no or low transaction cost, DR shows promising cumulative return, annualized return, Sharpe ratio, and STARR at 95%. In addition, it maintains the highest level of diversification when the strategy is applied to the entire stock universe during crisis times and to grouped stocks with similar tail indices over the entire test period.
Our study shows the importance of analyzing the diversification benefit based on measures for extreme risks in portfolio optimization. In the data range analyzed, the DR strategy outperformed the other strategies by diversifying more risk away and maintaining a good level of return during crisis times, both the 2007–2009 global financial crisis and the COVID-19 pandemic. Moreover, the DR strategy is invariant to the choice of quantile-based risk measure in the definition of DR (2), such as ES and expectiles, which means it does not suffer from the drawbacks of VaR. Thus, the DR strategy can be applied when extreme risks are present in the market, that is, when stronger interconnectedness is shown among the stocks and some stocks show sharp declines in prices. The current analysis is built upon the assumption that the underlying risks can be modeled by the MRV. Because the MRV model is difficult to verify for high-dimensional data, it may not be appropriate to apply it to the entire stock market. For future research, flexible high-dimensional models that can capture extreme risks should be investigated to see whether the DR strategy still performs well during crisis times. Another drawback of the current DR strategy is that it only extracts the most diversification benefit; that is, only the downside risk is minimized. This may be the reason that the DR strategy performs well during crisis times but not as well as some other strategies in normal times. For future work, it is of great importance to study strategies that can achieve higher returns while maintaining the most diversification benefit, as these may perform equally well during crisis and normal times.
The gradient of h is The Hessian matrix of h is and for 2 ≤ i = j ≤ d + 1 By the Taylor expansion, for x, y ∈ (0, 1]^{d+1} and y_t → y, we have for t > 0, That is, the function h is Hadamard differentiable.
By the functional delta method (e.g., Theorem 20.8 in Van der Vaart (2000)), the desired result follows.
Figure A2. Portfolio values with transaction cost for grouped stocks with tail indices α ≤ 2.5 when θ = 0.1%, 0.5%, 1%, θ_fc = 1.5θ.