On the Diversity Constraints for Portfolio Optimization

In the literature, Markowitz's mean-variance model and its variants have been shown to yield portfolios that put excessive weights on only a few assets. Many diversity constraints were proposed and added to these models to avoid such overly concentrated portfolios. However, since these diversity constraints are formulated differently, it becomes difficult to compare them and study their relationships. This paper proposes a canonical form for the commonly used diversity constraints in the literature, and shows how to transform these diversity constraints into this canonical form. Furthermore, this paper compares these diversity constraints (in the canonical form with the same upper bound) on their ability to shrink the feasible region of the portfolio optimization problem. The results show a subset relation among their feasible regions.


Introduction
Portfolio optimization deals with the problem of allocating one's wealth across a number of assets to maximize return and control risk. This problem was first studied in the seminal paper of Markowitz [1], who proposed a mathematical formulation of the problem, and derived a mean-variance model to yield portfolios that either maximize the portfolio expected return for a given level of risk, or minimize the portfolio risk for a given expected return. Since then, this problem continues to receive much consideration and attention from both academics and practitioners in the financial industry, and many variants of the mean-variance model (e.g., minimum-variance model, mean-variance-skewness model, etc.) have been proposed. Please refer to [2][3][4] for the variants of the mean-variance model.

Markowitz's mean-variance model uses the expected return of each asset and the covariance between the returns of any two assets as input. In practice, these input values are hard to estimate with high accuracy, and the estimations based on historical data might not be appropriate for future data. Furthermore, the portfolios derived from the mean-variance model are very sensitive to these input values [5,6]. Consequently, mean-variance optimized portfolios often put excessive weights on assets with large expected returns [6,7], regardless of possible estimation errors in these input values. Such overly concentrated portfolios, which also occur in the variants of the mean-variance model [3], go against the idea of diversification. DeMiguel et al. compared several variants of the mean-variance model on several datasets, and showed that none of these models consistently beat the naïve equally-weighted portfolio, out of sample [4].
To reduce the impact of estimation error and avoid overly concentrated portfolios, several diversity constraints have been introduced in the literature. The weight upper/lower bound constraint [8] simply imposes an upper/lower bound on assets' weights in a portfolio. The $L_p$-norm constraint [9] and the entropy constraint [10] impose an upper bound on the $L_p$-norm and a lower bound on the entropy, of the assets' weights in a portfolio, respectively. Empirical studies [8,9,11] show that the weight upper-bound constraint or the $L_p$-norm constraint improves portfolio efficiency out of sample. Using the entropy of the assets' weights as one of the objective functions in a multi-objective setting also improves portfolio efficiency out of sample [2].
Even though these diversity constraints improve the out-of-sample performance of the portfolio optimization problem, their relation remains unclear. To the best of our knowledge, no work compares these diversity constraints on their abilities to shrink the feasible region of the portfolio optimization problem. One obstacle to such a comparison is the differences among these diversity constraints. Notably, some diversity constraints are in the form of an upper bound, but others are in the form of a lower bound. Furthermore, the ranges of their respective bounds are also different. Consequently, it is difficult to compare them systematically.
The objective of this study is to provide a systematic way to compare the performance of these diversity constraints. To achieve this goal, we first propose a canonical form for these diversity constraints, and show how to transform these diversity constraints into the canonical form. With the canonical form, all of these diversity constraints are in the form of an upper-bound constraint, and their respective bounds all fall into the same range. On the lowest end of the range of the bounds, these diversity constraints all restrict the feasible region of the portfolio optimization problem to exactly one point corresponding to the equally-weighted portfolio. On the highest end of the range of the bounds, these diversity constraints all become redundant. By using the same value for the upper bound of these diversity constraints in the canonical form, a systematic comparison among them can be achieved. Although these diversity constraints shrink the feasible region of the portfolio optimization problem differently, a subset relation among their feasible regions can be derived for these diversity constraints in the canonical form under the same upper-bound value.
The rest of this paper is organized as follows: Section 2 reviews the commonly used diversity constraints in the literature, including the weight upper/lower bound constraint, the $L_p$-norm constraint and the entropy constraint. Section 3 proposes the canonical form for these diversity constraints and shows how to transform them into the canonical form. Section 4 discusses the subset relation among these diversity constraints. Section 5 studies how the feasible region of the portfolio optimization problem shrinks as these diversity constraints are applied. Section 6 concludes this paper.

Review of Diversity Constraints
Diversifying a portfolio avoids putting too much weight on only a few assets. This has the potential of reducing the risk of the portfolio. Adding an upper-bound constraint on assets' weights is probably the most common and simplest way to control the diversity of a portfolio. In practice, institutional investors are often restricted by law to enforce such a weight upper-bound constraint [12]. Diversifying a portfolio can also be achieved by imposing a lower-bound constraint on assets' weights or by restricting the $L_p$-norm or the entropy of the assets' weights of a portfolio. This section reviews these commonly used diversity constraints in the literature.
For ease of exposition, this study considers the following base form of the portfolio optimization problem where short selling is prohibited.

Problem 1.
Given the expected returns of n risky assets $\mu = (\mu_1, \mu_2, \ldots, \mu_n)$ and their variance-covariance matrix $\Sigma$, find a portfolio $w = (w_1, w_2, \ldots, w_n)$ such that the objective function $f(w)$ is maximized, subject to the following two constraints:

$$w_i \ge 0, \quad i = 1, 2, \ldots, n \qquad (1)$$

$$\sum_{i=1}^{n} w_i = 1 \qquad (2)$$

The constraint in Equation (1) prohibits short selling by restricting $w$ to non-negative weights. Notably, a non-negative weight is required for calculating the entropy of $w$, to be defined shortly in Section 2.4. The constraint in Equation (2) enforces that all wealth is invested. The objective function $f(w)$ often involves maximizing the portfolio's return and/or minimizing the portfolio's risk. For example, Markowitz's mean-variance model uses $w^T\mu$ and $w^T\Sigma w$ to measure the expected return and the risk of a portfolio $w$, respectively [1], and therefore $f(w)$ can be defined in terms of these two quantities to find a compromise between return and risk.
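As a concrete illustration of Problem 1, the following minimal Python sketch checks the constraints in Equations (1) and (2) and evaluates the two mean-variance quantities for a three-asset portfolio. The function names and the numerical values of `mu`, `cov` and `w` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def is_feasible(w, tol=1e-9):
    """Check Equations (1) and (2): non-negative weights that sum to 1."""
    return all(x >= -tol for x in w) and abs(sum(w) - 1.0) <= tol


def expected_return(w, mu):
    """Portfolio expected return: the weighted sum of asset returns."""
    return sum(wi * mi for wi, mi in zip(w, mu))


def variance(w, cov):
    """Portfolio risk measured as the quadratic form of w with the covariance matrix."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


# Illustrative (made-up) inputs for three assets.
mu = [0.08, 0.12, 0.10]
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.05]]
w = [0.5, 0.3, 0.2]

print(is_feasible(w), expected_return(w, mu), variance(w, cov))
```

A portfolio with a negative weight, such as `[0.7, 0.5, -0.2]`, sums to 1 but fails the check because it violates Equation (1).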

Weight Upper-Bound Constraint
The weight upper-bound constraint directly avoids overly concentrated portfolios by adding an upper bound $U$ on the weight of each asset in a portfolio. It is expressed as follows:

$$\max(w) \le U \qquad (3)$$

Notably, $\max(w)$ returns the maximum component of $w$. With the weight upper-bound constraint, a smaller $U$ restricts the feasible region of Problem 1 to more diverse portfolios. On one extreme, if $U = 1/n$, then the feasible region of Problem 1 contains only one portfolio, i.e., the equally-weighted portfolio. On the other extreme, if $U = 1$, the constraint in Equation (3) becomes redundant, since Equations (1) and (2) already imply that no weight exceeds 1. Therefore, the range of $U$ is $[1/n, 1]$.

Weight Lower-Bound Constraint
Imposing a lower bound on the assets' weights can also avoid overly concentrated portfolios. The weight lower-bound constraint is expressed as follows:

$$\min(w) \ge L \qquad (4)$$

Notably, $\min(w)$ returns the minimum component of $w$. With the weight lower-bound constraint, a larger $L$ restricts the feasible region of Problem 1 to more diverse portfolios. On one extreme, if $L = 1/n$, then the feasible region of Problem 1 contains only one portfolio, i.e., the equally-weighted portfolio. On the other extreme, if $L = 0$, the constraint in Equation (4) becomes redundant due to the constraint in Equation (1). Therefore, the range of $L$ is $[0, 1/n]$. Lemma 1 shows that enforcing a weight lower-bound constraint also imposes a weight upper-bound constraint, which in turn avoids overly concentrated portfolios: since the other $n-1$ assets each hold at least $L$, $\min(w) \ge L$ implies $\max(w) \le 1 - (n-1)L$. It can be used to check for a redundant weight upper-bound constraint when both an upper bound and a lower bound on assets' weights are applied. For example, consider Problem 1 subject to both constraints in Equations (3) and (4) with $L = 0.01$, $U = 0.91$ and $n = 11$. By Lemma 1, $\min(w) \ge 0.01$ implies $\max(w) \le 0.9$, and thus $\max(w) \le 0.91$ is redundant.
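The redundancy check in the example above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that the bound implied by Lemma 1 is 1 − (n − 1)L, which reproduces the 0.9 of the example, and the function names are hypothetical.

```python
def implied_upper_bound(lower, n):
    """Largest weight any asset can hold when each of n assets holds at least `lower`."""
    return 1.0 - (n - 1) * lower


def upper_bound_is_redundant(lower, upper, n, tol=1e-12):
    """True if the weight upper bound `upper` can never bind given the lower bound."""
    return implied_upper_bound(lower, n) <= upper + tol


# The example from the text: L = 0.01, U = 0.91, n = 11.
print(implied_upper_bound(0.01, 11))
print(upper_bound_is_redundant(0.01, 0.91, 11))
```

With a tighter upper bound such as 0.5, the same check reports that the upper-bound constraint is not redundant and must be kept.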

$L_p$-norm Constraint
Given $p > 1$, the $L_p$-norm constraint adds an upper bound $U$ on the $L_p$-norm $N_p(w)$ of a portfolio $w$, as defined below:

$$N_p(w) = \sum_{i=1}^{n} |w_i|^p \le U \qquad (5)$$

In the least diverse scenario, wherein only one component of $w$ is 1 and the rest of the components of $w$ are 0, $N_p(w)$ reaches its maximum 1. In the most diverse scenario, wherein $w_i = 1/n$ for all i, $N_p(w)$ reaches its minimum $n^{1-p}$. Therefore, the range of $U$ is $[n^{1-p}, 1]$. With the $L_p$-norm constraint, a smaller $U$ restricts the feasible region of Problem 1 to more diverse portfolios. Lemma 2 shows that enforcing an $L_p$-norm constraint also imposes a weight upper-bound constraint. Therefore, similar to Lemma 1, Lemma 2 can be used to check for a redundant weight upper-bound constraint when both the weight upper-bound constraint and the $L_p$-norm constraint are applied.

A related constraint is the A-norm constraint [9]. Even though the A-norm constraint lets investors incorporate their preferences about the assets into the matrix $A$, the matrix can be hard to decide. Therefore, the identity matrix is often used for $A$, and the A-norm becomes the $L_p$-norm with $p = 2$.
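The two extremes of the $L_p$-norm measure can be checked numerically. The sketch below assumes that the measure is the power sum of the absolute weights without an outer root, which is the reading consistent with the "(p−1)-root" used later in Section 3; the function name `power_sum` is hypothetical.

```python
def power_sum(w, p):
    """Assumed L_p-norm measure of a portfolio: the sum of |w_i|**p, for p > 1."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    return sum(abs(x) ** p for x in w)


n = 4
concentrated = [1.0, 0.0, 0.0, 0.0]   # least diverse portfolio
equal = [1.0 / n] * n                 # most diverse portfolio

for p in (2, 3):
    # The least diverse portfolio attains 1; the equal-weight one attains n**(1 - p).
    print(p, power_sum(concentrated, p), power_sum(equal, p), n ** (1 - p))
```

Any other feasible portfolio falls strictly between these two values, which is why the bound in Equation (5) is meaningful only inside that interval.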

Entropy Constraint
The entropy constraint adds a lower bound $L$ on the entropy $E(w)$ of a portfolio $w$, as defined below [10]:

$$E(w) = -\sum_{i=1}^{n} w_i \ln w_i \ge L \qquad (6)$$

In the least diverse scenario, wherein only one component of $w$ is 1 and the rest of the components of $w$ are 0, $E(w)$ reaches its minimum $-1 \ln 1 = 0$. In the most diverse scenario, wherein $w_i = 1/n$ for all i, $E(w)$ reaches its maximum $-\ln(1/n) = \ln n$. Thus, the range of $L$ is $[0, \ln n]$. Since a larger $E(w)$ indicates better diversity, the entropy constraint uses a lower bound $L$ within the interval $[0, \ln n]$ to keep the diversity of $w$ from being too low.
Proof. The minimum of $E(w)$ occurs at the least diverse portfolio. Since $U \in [1/n, 1]$, in the least diverse portfolio, $\lfloor 1/U \rfloor$ assets each have the maximum weight $U$, one asset has the remaining weight $1 - \lfloor 1/U \rfloor U$, and the rest of the assets have weight zero.
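The entropy extremes and the least diverse portfolio described in the proof can be illustrated with a short Python sketch. It assumes the reading that floor(1/U) assets sit at the cap U and one asset takes the remainder; the function names and the example values n = 4 and U = 0.4 are hypothetical.

```python
import math


def entropy(w):
    """Entropy of a portfolio; zero weights contribute nothing (0 ln 0 is taken as 0)."""
    return -sum(x * math.log(x) for x in w if x > 0)


def least_diverse_portfolio(n, cap):
    """Most concentrated portfolio of n assets when no weight may exceed `cap` (cap >= 1/n)."""
    k = math.floor(1.0 / cap)            # number of assets held at the cap
    weights = [cap] * min(k, n)
    if k < n:
        weights.append(1.0 - k * cap)    # one asset takes the remaining weight
    weights += [0.0] * (n - len(weights))
    return weights


w_min = least_diverse_portfolio(4, 0.4)
print(w_min, entropy(w_min))
print(entropy([1.0, 0.0, 0.0, 0.0]), entropy([0.25] * 4), math.log(4))
```

Under the cap 0.4, a more evenly spread portfolio such as `[0.4, 0.3, 0.2, 0.1]` has a larger entropy than `w_min`, in line with the argument of the proof.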

Canonical Form of Diversity Constraints
Tightening the bound of any of the four diversity constraints described in Section 2 gradually shrinks the feasible region of Problem 1 to the equally-weighted portfolio. However, the four diversity constraints shrink the feasible region differently, and the relation among them remains unclear. Furthermore, determining the bound for the weight upper/lower bound constraint may be intuitive for investors, but this is not the case for the $L_p$-norm constraint and the entropy constraint. To facilitate the comparison among the four diversity constraints, this section proposes the canonical form of a diversity constraint as follows:

$$D(w) \le U \qquad (7)$$

where $D(w)$ is a scalar function of $w$ for the diversity constraint, and the upper bound $U$ is confined to the interval $[1/n, 1]$.
In the canonical form, all of these diversity constraints are in the form of an upper-bound constraint, and the ranges of their respective bounds are all $[1/n, 1]$. Consequently, these diversity constraints can be compared under the same upper bound. Obviously, the weight upper-bound constraint is already in the canonical form by simply letting $D(w) = \max(w)$ and keeping its upper bound $U$. The rest of this section shows how to transform the remaining three diversity constraints into this canonical form.

Transforming Weight Lower-Bound Constraint
The weight lower-bound constraint, defined in Equation (4), can be transformed into the following upper-bound constraint:

$$1 - (n-1)\min(w) \le U \qquad (8)$$

Lemma 4 shows that the constraint in Equation (8) with $U = 1 - (n-1)L$ is equivalent to the weight lower-bound constraint in Equation (4).
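The equivalence stated by Lemma 4 can be spot-checked numerically. The sketch below assumes that the left-hand side of Equation (8) is 1 − (n − 1)·min(w), matching the bound mapping U = 1 − (n − 1)L; the function names and test portfolios are hypothetical.

```python
def lower_bound_canonical(w):
    """Assumed canonical-form function of the weight lower-bound constraint."""
    return 1.0 - (len(w) - 1) * min(w)


def canonical_bound(lower, n):
    """Map a lower bound L in [0, 1/n] to the canonical upper bound in [1/n, 1]."""
    return 1.0 - (n - 1) * lower


n, L = 3, 0.1
U = canonical_bound(L, n)
for w in ([0.5, 0.3, 0.2], [0.9, 0.05, 0.05], [0.4, 0.35, 0.25]):
    original = min(w) >= L                      # Equation (4)
    canonical = lower_bound_canonical(w) <= U   # Equation (8)
    print(w, original, canonical)
```

The mapping also sends the two ends of the original range to the ends of the canonical range: L = 0 maps to 1 and L = 1/n maps to 1/n.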

Transforming $L_p$-norm Constraint
The range for the upper bound $U$ in the $L_p$-norm constraint in Equation (5) is the interval $[n^{1-p}, 1]$, which varies with the value of p. The $L_p$-norm constraint can be transformed into Equation (9) such that the range of the upper bound becomes independent of the value of p:

$$\left(\sum_{i=1}^{n} |w_i|^p\right)^{\frac{1}{p-1}} \le U' \qquad (9)$$

Since the range of $U$ in Equation (5) is the interval $[n^{1-p}, 1]$ and the left-hand side of Equation (9) is the (p−1)-root of $N_p(w)$, the range of $U'$ in Equation (9) is $[1/n, 1]$. Furthermore, Equation (9) with $U' = U^{1/(p-1)}$ is equivalent to Equation (5). Transforming the $L_p$-norm constraint in Equation (5)
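A quick numerical check shows why taking the (p−1)-root makes the range independent of p. The sketch below assumes that the canonical function is the power sum of the weights raised to the power 1/(p − 1); the function name `lp_canonical` is hypothetical.

```python
def lp_canonical(w, p):
    """Assumed canonical L_p-norm function: (sum of |w_i|**p) ** (1 / (p - 1)), p > 1."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    return sum(abs(x) ** p for x in w) ** (1.0 / (p - 1.0))


n = 5
equal = [1.0 / n] * n
concentrated = [1.0] + [0.0] * (n - 1)

for p in (1.1, 2, 3):
    # For every p the equal-weight portfolio gives 1/n and the concentrated one gives 1.
    print(p, lp_canonical(equal, p), lp_canonical(concentrated, p))
```

The equal-weight value follows from (n · n^(−p))^(1/(p−1)) = n^(−1), so the lower end of the range no longer depends on p.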

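Transforming Entropy Constraint
The entropy constraint, defined in Equation (6), can be transformed into an upper-bound constraint as follows:

$$e^{-E(w)} \le U \qquad (10)$$

According to Equation (6), $E(w)$ ranges over $[0, \ln n]$, so $e^{-E(w)}$ ranges over $[1/n, 1]$, and Equation (10) with $U = e^{-L}$ is equivalent to the entropy constraint $E(w) \ge L$ in Equation (6).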
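The relation between the entropy constraint and the $L_p$-norm constraint as p approaches 1 can be explored numerically. The sketch below rests on two assumptions that are not confirmed by the surviving text: that the canonical entropy function is exp(−E(w)), and that the canonical $L_p$-norm function is the power sum raised to 1/(p − 1). All function names and the example portfolio are hypothetical.

```python
import math


def entropy(w):
    """Entropy of a portfolio; zero weights contribute nothing."""
    return -sum(x * math.log(x) for x in w if x > 0)


def entropy_canonical(w):
    """Assumed canonical entropy function, ranging from 1/n (equal weights) to 1."""
    return math.exp(-entropy(w))


def lp_canonical(w, p):
    """Assumed canonical L_p-norm function for p > 1."""
    return sum(abs(x) ** p for x in w) ** (1.0 / (p - 1.0))


w = [0.5, 0.3, 0.2]
gaps = [lp_canonical(w, p) - entropy_canonical(w) for p in (1.1, 1.01, 1.001)]
print(entropy_canonical(w), gaps)
```

Under these assumptions the gap between the two functions shrinks as p decreases toward 1, which matches the observation that the curves for p = 1.01 and p = 1.001 nearly coincide with the entropy curve.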

Subset Relation among Feasible Regions
This section studies the subset relation among the feasible regions of Problem 1 subject to a diversity constraint in the canonical form. Specifically, the feasible regions of Problem 1 subject to each of the constraints in Equations (8), (3), (9) and (10) are studied, where Problem 4 denotes Problem 1 subject to the $L_p$-norm constraint in Equation (9). For the $L_p$-norm constraint, the cases of $p = 2$ and $p = 3$ are commonly used. Thus, we consider the following two special cases of Problem 4, and show the subset relation between their feasible regions using Corollary 3.
• Problem 4a: Problem 1 subject to $\left(\sum_{i=1}^{n} |w_i|^3\right)^{1/2} \le U$.
• Problem 4b: Problem 1 subject to $\sum_{i=1}^{n} |w_i|^2 \le U$.

To sum up, under the same upper bound, the weight lower-bound constraint in Equation (8) and the weight upper-bound constraint in Equation (3) are the strictest and the second strictest at shrinking the feasible region of Problem 1, respectively. The $L_p$-norm constraint in Equation (9) is stricter for larger p. Notably, as p approaches its lower limit 1, the $L_p$-norm constraint in Equation (9) is close to the entropy constraint in Equation (10), according to the experimental results in Section 5. The entropy constraint in Equation (10) is also the least strict at shrinking the feasible region.
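The subset relation summarized above can be restated pointwise: under the same upper bound, one constraint is at least as strict as another exactly when its canonical function is at least as large for every portfolio. The following Python sketch spot-checks this ordering on randomly drawn portfolios. The canonical functions used here (1 − (n − 1)·min(w), max(w), the power sum raised to 1/(p − 1), and exp(−E(w))) are assumptions reconstructed from the surrounding text, and all names are hypothetical.

```python
import math
import random


def canonical_values(w):
    """Assumed canonical functions, listed from the strictest to the loosest constraint."""
    n = len(w)

    def lp(p):
        return sum(x ** p for x in w) ** (1.0 / (p - 1.0))

    entropy_form = math.exp(sum(x * math.log(x) for x in w if x > 0))
    return [1.0 - (n - 1) * min(w), max(w), lp(3), lp(2), lp(1.1), entropy_form]


def random_portfolio(n, rng):
    """Draw non-negative weights that sum to 1 by normalizing exponential variates."""
    raw = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(raw)
    return [x / total for x in raw]


def is_non_increasing(values, tol=1e-9):
    """True if each value is at least as large as the next, up to rounding."""
    return all(a + tol >= b for a, b in zip(values, values[1:]))


rng = random.Random(0)
ordering_holds = all(
    is_non_increasing(canonical_values(random_portfolio(4, rng)))
    for _ in range(1000)
)
print(ordering_holds)
```

A random spot check of this kind does not replace the analytical results of this section; it only offers a quick sanity test of the claimed ordering.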

Measurement
This section studies how a diversity constraint (in the canonical form) shrinks the feasible region of Problem 1 as its upper bound changes. We use the feasible ratio instead of the size of the feasible region as our measurement to make the measurement unitless, as defined below.
Given a diversity constraint, the feasible ratio of the diversity constraint is defined as the size of the feasible region of Problem 1 subject to the diversity constraint divided by the size of the feasible region of Problem 1 without the diversity constraint. Thus, the range of feasible ratio is between 0 and 1, and a diversity constraint with a larger feasible ratio is less strict at shrinking the feasible region.

Estimation Approach
For Problem 1 involving only two assets (i.e., n = 2) subject to a diversity constraint, the size of the feasible region usually can be calculated directly. However, as n gets larger, the size of the feasible region becomes hard to derive. Here, we adopt a uniform sampling approach to generate a sample set of points in the feasible region of Problem 1, and then use the sample set to estimate the feasible ratio of a diversity constraint, as described below.
This approach first generates an equally spaced hyper-grid in the (n−1)-dimensional subspace $[0,1]^{n-1}$. Let the grid spacing be $1/m$ for some integer $m > 2$, and consequently there are $(m+1)^{n-1}$ crossover points in the hyper-grid. For each crossover point $(w_1, w_2, \ldots, w_{n-1})$ in the (n−1)-dimensional subspace, if $w_1 + w_2 + \cdots + w_{n-1} \le 1$, then the point $(w_1, w_2, \ldots, w_{n-1}, w_n)$ is included in the sample set, where $w_n = 1 - (w_1 + w_2 + \cdots + w_{n-1})$. The total number of points in the sample set is used as an estimate of the size of the feasible region of Problem 1 without any diversity constraint.
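The sampling procedure described above can be sketched as follows. This is a minimal illustration rather than the code used in the experiment: the function name `sample_simplex_grid` is hypothetical, and exact fractions are used so that the weights sum to 1 without rounding error.

```python
from fractions import Fraction
from itertools import product
from math import comb


def sample_simplex_grid(n, m):
    """Sample set for Problem 1: grid points with spacing 1/m, extended to n weights."""
    sample = []
    for idx in product(range(m + 1), repeat=n - 1):   # (m + 1)**(n - 1) crossover points
        used = sum(idx)
        if used <= m:                                  # first n - 1 weights sum to at most 1
            first = tuple(Fraction(k, m) for k in idx)
            sample.append(first + (Fraction(m - used, m),))
    return sample


for n in (2, 3):
    points = sample_simplex_grid(n, 100)
    # The count equals the number of ways to split m grid units among n assets.
    print(n, len(points), comb(100 + n - 1, n - 1))
```

The number of retained points equals the binomial coefficient C(m + n − 1, n − 1), which grows much more slowly than the (m + 1)^(n−1) crossover points of the full hyper-grid.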
In this experiment, m = 100 is used to generate the equally spaced hyper-grid in the (n−1)-dimensional subspace $[0,1]^{n-1}$ for n = 2 to 5. For each n, there are $101^{n-1}$ crossover points in the hyper-grid. Table 1 shows the number of the crossover points that can be extended to the n-dimensional space and included in the sample set. Then, the canonical-form function of each diversity constraint is calculated for every point in the sample set: $\max(w)$ for the weight upper-bound constraint, the left-hand side of Equation (8) for the weight lower-bound constraint, the left-hand side of Equation (9) with p = 3, 2, 1.1, 1.01 and 1.001 for the $L_p$-norm constraint, and the left-hand side of Equation (10) for the entropy constraint. Notably, for the $L_p$-norm constraint, we consider p = 3 and 2, since these two cases are commonly used. We also calculate the cases of p = 1.1, 1.01 and 1.001 to study the behavior of the $L_p$-norm constraint as p approaches its limit 1. The number of points satisfying $\max(w) \le U$ in the sample set is used as an estimate for the size of the feasible region of Problem 1 subject to the weight upper-bound constraint in Equation (3). Consequently, the feasible ratio of the diversity constraint $\max(w) \le U$ can be calculated by dividing this number by the total number of points in the sample set. Similarly, this can be done for the other functions and their respective diversity constraints. In this experiment, we vary the upper bound from 0 to 1 in steps of 0.01.

Figure 1 shows that the feasible ratio increases with the upper bound of a diversity constraint in the canonical form. For the weight lower-bound constraint in Equation (8), Figure 1a shows that, given the same upper bound, larger n could have a smaller feasible ratio. Also, as the upper bound gradually decreases from 1 to $1/n$, the feasible ratio reduces quickly at first but slowly afterward (with the exception of n = 2, where the feasible ratio reduces at a constant rate). In contrast, for the rest of the diversity constraints, Figure 1b-f shows that, given the same upper bound, the feasible ratio is always larger for larger n. Also, as the upper bound gradually decreases from 1 to $1/n$, the feasible ratio reduces slowly at first and quickly afterward (with the exception of the weight upper-bound constraint in Equation (3) at n = 2, where the feasible ratio reduces at a constant rate).

Figure 2 compares how these diversity constraints affect the feasible ratio, where the horizontal axis is the upper bound in these diversity constraints, and the vertical axis is the feasible ratio. Given the same upper bound, the ordering of the feasible ratio is: entropy constraint in Equation (10) ≥ $L_{1.1}$-norm constraint in Equation (9) ≥ $L_2$-norm constraint in Equation (9) ≥ $L_3$-norm constraint in Equation (9) ≥ weight upper-bound constraint in Equation (3) ≥ weight lower-bound constraint in Equation (8).
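The estimation of the feasible ratio can be sketched in Python as below. This is an illustrative sketch, not the code used in the experiment: the names are hypothetical, only the weight upper-bound and weight lower-bound constraints are shown, and the lower-bound function 1 − (n − 1)·min(w) is an assumed reading of Equation (8).

```python
from fractions import Fraction
from itertools import product


def simplex_grid(n, m):
    """Grid points of spacing 1/m on the feasible region of Problem 1, as exact fractions."""
    points = []
    for idx in product(range(m + 1), repeat=n - 1):
        rest = m - sum(idx)
        if rest >= 0:
            points.append(tuple(Fraction(k, m) for k in idx + (rest,)))
    return points


def feasible_ratio(points, measure, bound):
    """Share of sample points whose canonical function does not exceed the bound."""
    inside = sum(1 for w in points if measure(w) <= bound)
    return inside / len(points)


def upper_measure(w):
    return max(w)


def lower_measure(w):
    return 1 - (len(w) - 1) * min(w)


points2 = simplex_grid(2, 100)
points3 = simplex_grid(3, 20)
bound = Fraction(3, 4)
print(feasible_ratio(points2, upper_measure, bound),
      feasible_ratio(points2, lower_measure, bound))
print(feasible_ratio(points3, upper_measure, bound),
      feasible_ratio(points3, lower_measure, bound))
```

For n = 2 the two measures coincide, so both constraints give the same ratio, while for n = 3 the lower-bound constraint admits no more points than the upper-bound constraint at the same bound.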
When the upper bound is 1, the feasible ratio is 1 for all of these diversity constraints. When the upper bound is $1/n$, the feasible ratio approaches 0 for all of these diversity constraints since all of these diversity constraints shrink the feasible region to only one solution, i.e., the equally-weighted portfolio. Notably, when n = 2, the weight upper-bound constraint in Equation (3) is equivalent to the weight lower-bound constraint in Equation (8), as shown in Figure 2a. Also note that in Figure 2, the curve for the $L_p$-norm constraint gradually approaches the curve for the entropy constraint as p decreases from 3 to 1.1. The curves for the $L_{1.01}$-norm and $L_{1.001}$-norm constraints are omitted in Figure 2 because they closely overlap with the curve for the entropy constraint.

Figure 3a shows that the weight lower-bound constraint in Equation (8) results in a feasible region similar to the original feasible region, both in terms of shape and orientation. This property has the potential of gradually excluding the solutions along the border of the feasible region as the upper bound of the constraint decreases. Figure 3b shows that the weight upper-bound constraint in Equation (3) results in a feasible region similar to the original feasible region in shape but in the opposite orientation. This property has the potential of gradually excluding the solutions near the corners of the feasible region as the upper bound of the constraint decreases. Figure 3c-e, respectively, shows that the $L_3$-norm constraint, the $L_2$-norm constraint and the $L_{1.1}$-norm constraint result in a feasible region similar to that in Figure 3b, except that the shape of the feasible region becomes a squeezed circle for the $L_3$-norm constraint, a circle for the $L_2$-norm constraint, and a bloated circle for the $L_{1.1}$-norm constraint, instead of a triangle as in the weight upper-bound constraint.
Thus, they all have the potential of gradually excluding the solutions near the corners of the feasible region as the upper bound of the constraint decreases. Given the same upper bound, they all result in a smaller feasible region than the weight upper-bound constraint does. Furthermore, the feasible region of the $L_3$-norm constraint is smaller than that of the $L_2$-norm constraint, which in turn is smaller than that of the $L_{1.1}$-norm constraint. Figure 3f shows that the entropy constraint in Equation (10) results in a feasible region similar to the original feasible region in orientation but different in shape. Similar to Figure 3a, the constraint in Equation (10) has the potential of gradually excluding the solutions along the border of the feasible region as the upper bound of the constraint decreases. Given the same upper bound, the constraint in Equation (10) results in a feasible region larger than any of the previous five constraints does. Notably, the feasible region of the $L_{1.1}$-norm constraint is similar to that of the constraint in Equation (10).

Conclusions
Diversity constraints are often used in the portfolio optimization problem to avoid overly concentrated portfolios. Previous work has shown that diversity constraints improve the performance of the portfolio optimization problem. However, no comparison among the diversity constraints for the portfolio optimization problem has been conducted. In this paper, we review the commonly used diversity constraints in the literature, and show that their differences in both their forms (i.e., upper or lower bound) and the ranges of their bounds make it difficult to compare them systematically. A canonical form of the diversity constraints is proposed to resolve this problem. Using the canonical form, we can compare these diversity constraints at the same upper bound. Our analytical results show that the weight lower-bound constraint in Equation (8) is the strictest at enforcing diversity, and consequently results in the smallest feasible region. The weight upper-bound constraint in Equation (3) comes second. The $L_p$-norm constraint in Equation (9) offers different levels of diversity control by using different values for p. Although the entropy constraint in Equation (10) appears to be the loosest at enforcing diversity among the diversity constraints analyzed in Section 5, the $L_p$-norm constraint with p approaching 1 shows results similar to those of the entropy constraint in Equation (10). By transforming these diversity constraints into the canonical form, we show that these diversity constraints at the same upper bound exhibit a subset relation among their feasible regions.
In addition to allowing systematic comparison among these diversity constraints, the canonical form also eases the task of choosing the upper-bound value for a diversity constraint. In practice, the meaning of the bound for the weight upper/lower bound constraint is easy for investors to understand. However, without the canonical form, setting a threshold on either the entropy or the $L_p$-norm of the weight vector w of a portfolio, as done in Equations (5) and (6), is less intuitive than setting a bound on the weights. With the canonical form, the upper bound of a diversity constraint is always restricted to the interval $[1/n, 1]$, which is the same as the range of the upper bound for the weight upper-bound constraint in Equation (3). Therefore, investors can choose the value for the upper bound from the same range, regardless of which diversity constraint is used.