Entropic Data Envelopment Analysis: A Diversification Approach for Portfolio Optimization

Recently, different methods have been proposed for portfolio optimization and decision making on investment issues. This article presents a new method for portfolio formation based on Data Envelopment Analysis (DEA) and the Entropy function. The method applies DEA in association with a model resulting from the insertion of the Entropy function directly into the optimization procedure. First, the DEA model was applied to perform a pre-selection of the assets. Then, the assets given as efficient were submitted to the proposed model, which results from the insertion of the Entropy function into Sharpe's simplified portfolio optimization model. As a result, an improved asset participation was provided in the portfolio. In the DEA model, several variables were evaluated and a low value of beta was achieved, giving greater robustness to the portfolio. The Entropy function provided not only greater diversity but also a more feasible asset allocation. Additionally, the proposed method obtained a better portfolio performance, measured by the Sharpe Ratio, than the comparative methods.


Introduction
More than sixty years after the development of the Markowitz model, the mean-variance approach is still the main model used for asset allocation and portfolio management. Since then, several researchers have proposed new models for portfolio optimization based on the mean-variance Markowitz approach [1][2][3]. For both investors and academia, investment selection in risky assets remains a challenge for financial management [4][5][6].
A major issue for asset management is that the portfolios generated by the Markowitz model are often extremely concentrated in a limited number of assets [7]. Another is that estimation errors may bias the optimal portfolio weights and generate corner solutions that involve infeasible asset allocations [8].
An approach that has been widely applied to portfolio formation is the entropy concept [7,9-13]. The statistics generated by the entropy provide additional information in forming optimal portfolios, particularly to increase asset diversification by reducing errors in estimating associated parameters. Therefore, entropy increases portfolio diversity and makes asset allocation more feasible than models without entropy [7].
However, there are different approaches to implementing entropy. Zhou et al. [10] reviewed the concepts, principles, and effects of entropy applications, and compared them with other methods. Sheraz et al. [9] used the entropy concept to compare volatile markets. Bhattacharyya et al. [12] used the Cross-Entropy model to quantify the level of discrimination in returns to a given satisfactory return value. Rodder et al. [14] used a theoretical inference mechanism of information under maximum entropy and minimum relative entropy, in order to consider downside risk measures and to take into account the specific objectives of the investor. In addition, Post and Poti [15] and Post [16] developed approaches to ex-post portfolio analysis based on relative entropy and empirical likelihood. Post and Karabati [17] consider ex-ante portfolio construction based on these ideas.
According to Popkov [18], the diversification of a portfolio implies that the idiosyncratic risk can be reduced to zero as the number of assets included in the investment increases. However, according to Yu et al. [7], in practice, financial analysts need to deal with the over-diversification that occurs in the model with entropy.
To control the number of assets, this study proposes the use of the Data Envelopment Analysis (DEA) model. According to Yu et al. [7], this approach can be justified because the purpose of a diversified portfolio is to invest in as many mean-variance efficient assets as possible.
DEA aims to evaluate efficiency and has been widely used to assist managers in a wide range of areas, including finance [19,20]. More recently, DEA has been applied to form portfolios and evaluate their efficiency [21][22][23][24][25].
Regarding the use of DEA in portfolio optimization, Branda and Kopa [24] introduced DEA models equivalent to efficiency tests with respect to n-th order stochastic dominance (SD), which extend the SD efficiency tests by Post [26] and Post and Kopa [27]. Rotela Junior et al. [23] proposed the use of DEA for a pre-selection of assets associated with Sharpe's model [28]. Differently from the aforementioned works, this study uses a DEA approach to reduce the search space, defining efficient assets among the total number of assets. In the proposed method, Shannon entropy was included to ensure an efficient asset diversification while portfolio return and risk were maximized and minimized, respectively.

Data Envelopment Analysis (DEA)
Data Envelopment Analysis (DEA) is a non-parametric method used to evaluate the relative efficiency of decision-making units (DMUs). These DMUs use multiple inputs to produce multiple outputs [29]. One of the main advantages of DEA is that the inputs and outputs can have different units [30].
DEA was originally developed by Charnes et al. [31] and expanded by Banker et al. [32]. Such models are considered classic in the DEA literature.
Since its introduction, there has been a significant increase in the number of published articles. Up to 2016, about 9881 articles related to DEA were published in the literature, with 11,961 different authors. It is worth mentioning that between 2014 and 2016 about 1000 works were published per year. Such growth can be divided into three stages. The first occurred between 1978 and 1994; in that period the growth of articles related to DEA was slow. The second occurred between 1995 and 2003, when the average number of articles published was 134 per year. Finally, the third stage began in 2004; in this stage the average number of articles published reached 650 per year [33].
Among these publications indexed in Thomson Reuters Web of Science (WoS), about 58% of the articles were applied to real problems, while 42% aimed to contribute advances in methods [34].
The main applications are in banking, health, agriculture, transportation, education, and hospitals [35,36]. However, the current applications with the greatest growth occur in energy and environment, as well as in finance [37].
Regarding the classical models, the first DEA model, developed by Charnes et al. [31] to evaluate the efficiency of public education programs, is called the constant returns to scale (CRS) model. The model constructs a nonparametric piecewise linear surface enveloping the data. Its multiplier form is represented by (1)-(4):

$$\max \; Eff_o = \sum_{j} u_j y_{jo} \tag{1}$$

subject to:

$$\sum_{i} v_i x_{io} = 1 \tag{2}$$

$$\sum_{j} u_j y_{jk} - \sum_{i} v_i x_{ik} \le 0, \quad \forall k \tag{3}$$

$$u_j \ge 0, \quad v_i \ge 0, \quad \forall j, i \tag{4}$$

In which $k$ represents the DMU index; $j$ is the output index; and $i$ is the input index. It is worth mentioning that $y_{jk}$ is the value of the j-th output for the k-th DMU and $x_{ik}$ is the value of the i-th input for the k-th DMU. Moreover, $u_j$ is the weight associated with the j-th output; $v_i$ is the weight associated with the i-th input; $Eff_o$ is the relative efficiency of $DMU_o$, which is the DMU under evaluation; and $x_{io}$ and $y_{jo}$ are, respectively, the input and output data of $DMU_o$.
The model divides the DMUs into two groups: efficient and inefficient. A DMU is efficient if $Eff_o = 1$ and inefficient if $Eff_o < 1$ [38,39].
It is noteworthy that the objective of Model (1)-(4) is to maximize efficiency by minimizing the inputs while maintaining the outputs. That model is known as input-oriented. In addition, there is a model whose goal is to maximize the outputs while maintaining the inputs, termed output-oriented [40][41][42].
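As an illustration of how the input-oriented CRS multiplier model can be solved in practice, the following minimal sketch solves one linear program per DMU. The function name `ccr_efficiency`, the use of SciPy's `linprog`, and the toy data are assumptions made for this example; they are not taken from the study.

```python
import numpy as np
from scipy.optimize import linprog


def ccr_efficiency(X, Y):
    """Input-oriented CRS multiplier model, solved once per DMU.

    X: array (n_dmus, n_inputs); Y: array (n_dmus, n_outputs).
    Returns one efficiency score per DMU.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n_dmus, n_inputs = X.shape
    n_outputs = Y.shape[1]
    scores = np.empty(n_dmus)
    for o in range(n_dmus):
        # Decision vector: [u_1..u_s, v_1..v_m] (output weights, input weights).
        c = np.concatenate([-Y[o], np.zeros(n_inputs)])  # linprog minimizes
        A_ub = np.hstack([Y, -X])  # weighted outputs - weighted inputs <= 0
        b_ub = np.zeros(n_dmus)
        A_eq = np.concatenate([np.zeros(n_outputs), X[o]])[None, :]
        b_eq = [1.0]  # weighted inputs of the evaluated DMU equal 1
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, None)] * (n_outputs + n_inputs))
        scores[o] = -res.fun
    return scores


# Hypothetical toy data: three DMUs, one input and one output each.
X_toy = [[2.0], [4.0], [8.0]]
Y_toy = [[1.0], [4.0], [4.0]]
scores = ccr_efficiency(X_toy, Y_toy)
print(scores)
```

In this toy case with a single input and a single output, each score reduces to the DMU's output/input ratio divided by the best ratio in the sample, so only the second DMU is classified as efficient.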

Portfolio Selection
The Markowitz model [43] estimates the future performance of assets in order to establish an efficient set of portfolios and, finally, to select the portfolio according to the investor's preferences. This model is implemented by Quadratic Programming and its objective is to optimize portfolios, taking into consideration the mean vector and the variance-covariance matrix of asset returns. These parameters are estimated from time series and can also be included in the portfolio optimization [28,44].
For Sharpe [28], the amount of computational resources needed to perform a portfolio analysis using the Markowitz model [43] is related to the following factors: the number of assets (which affects the size of the computational model); the number of corner portfolios; and the complexity of the variance-covariance matrix.
Sharpe [28] extended the work of Markowitz [43] by presenting a simplified model of the relationships between assets, offering evidence on costs as well as on the advantage of using the model for practical applications [45]. In Sharpe's model, the returns of the assets are not correlated directly with each other, but with a single index representing the returns of the entire market. Furthermore, the model possesses two advantages: (1) it can be constructed without assuming away the relationships between the assets; and (2) it captures a good part of these relationships [28]. The model thus assumes that there is no direct correlation between the assets, according to (5):

$$R_i = \alpha_i + \beta_i R_m + \varepsilon_i \tag{5}$$

In which $R_m$ is the return level of any index, for example, the market rate of return; $\varepsilon_i$ is the random error of Sharpe's model, with mean equal to zero and variance $\sigma_i^2$; $\alpha_i$ is the return component of asset $i$ that is independent of market performance; and $\beta_i$ measures the expected change in the return of asset $i$ given a change in $R_m$. Another assumption made by Sharpe [28] is that the future value of $R_m$ is, to some extent, determined by random factors as (6):

$$R_m = \alpha_{n+1} + \varepsilon_{n+1} \tag{6}$$

where $\alpha_{n+1}$ is a parameter and $\varepsilon_{n+1}$ is an independent random variable with mean equal to zero and variance $\sigma_{n+1}^2$. In this case, the covariance between $\varepsilon_i$ and $\varepsilon_j$ was assumed zero whenever $i \ne j$. Then, as shown by Sharpe [28], the return obtained by a portfolio can be considered as the result of: (a) a series of investments in $n$ assets and (b) an investment in the index. Thus, from the weighting in (5), Equation (7) can be obtained as follows:

$$r_p = \sum_{i=1}^{n} \omega_i R_i = \sum_{i=1}^{n} \omega_i (\alpha_i + \varepsilon_i) + \left( \sum_{i=1}^{n} \omega_i \beta_i \right) R_m \tag{7}$$

Defining $\omega_{n+1}$ as the weighted average responsiveness of $r_p$ to $R_m$, Equation (8) can be calculated:

$$\omega_{n+1} = \sum_{i=1}^{n} \omega_i \beta_i \tag{8}$$

Substituting (6) and (8) into (7), we have (9):

$$r_p = \sum_{i=1}^{n} \omega_i (\alpha_i + \varepsilon_i) + \omega_{n+1} (\alpha_{n+1} + \varepsilon_{n+1}) = \sum_{i=1}^{n+1} \omega_i (\alpha_i + \varepsilon_i) \tag{9}$$

According to Frankfurter et al. [44], such assumptions constrain the range of returns; however, a simpler model for portfolio optimization can be obtained by (10)-(13):

$$\max \; Z = \lambda \sum_{i=1}^{n+1} \omega_i \alpha_i - \sum_{i=1}^{n+1} \omega_i^2 \sigma_i^2 \tag{10}$$

subject to:

$$\sum_{i=1}^{n} \omega_i = 1 \tag{11}$$

$$\sum_{i=1}^{n} \omega_i \beta_i = \omega_{n+1} \tag{12}$$

$$\omega_i \ge 0, \quad i = 1, \ldots, n \tag{13}$$

where $\lambda$ represents the investor's risk tolerance; $\omega_i$ is the total proportion invested in the i-th asset; $\alpha_i$ is the i-th asset return; $\sigma_i^2$ represents the i-th asset variance; and $\beta_i$ is the expected change in the i-th asset return.
This formulation shows why the parameters $\alpha_{n+1}$ and $\sigma_{n+1}^2$ represent the expected value and the variance of the future value of $R_m$, respectively. It also shows why this model is called diagonal: the variance-covariance matrix has non-zero values only along the main diagonal, including the (n + 1)-th asset, defined as indicated by Sharpe [28]. According to Rotela Junior et al. [23], portfolio complexity can be simplified by using this approach.
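The parameters of the single-index relation in (5) have to be estimated from return series before the diagonal model can be used. The sketch below is one way to do this, assuming ordinary least squares per asset; the estimation procedure, the function name `single_index_params`, and the synthetic data are assumptions for illustration, since the text does not state how the parameters were obtained.

```python
import numpy as np


def single_index_params(asset_returns, market_returns):
    """OLS estimates for R_i = alpha_i + beta_i * R_m + eps_i.

    asset_returns: array (T, n); market_returns: array (T,).
    Returns (alpha, beta, residual_variance), each of length n.
    """
    R = np.asarray(asset_returns, dtype=float)
    Rm = np.asarray(market_returns, dtype=float)
    rm_centered = Rm - Rm.mean()
    # Slope of each asset's return against the index return.
    beta = rm_centered @ (R - R.mean(axis=0)) / (rm_centered @ rm_centered)
    alpha = R.mean(axis=0) - beta * Rm.mean()
    residuals = R - (alpha + np.outer(Rm, beta))
    # Two degrees of freedom are used by alpha and beta.
    resid_var = residuals.var(axis=0, ddof=2)
    return alpha, beta, resid_var


# Synthetic, noise-free data with known parameters (hypothetical values).
market = np.array([0.01, -0.02, 0.03, 0.00])
true_alpha = np.array([0.001, 0.002])
true_beta = np.array([1.5, 0.5])
assets = true_alpha + np.outer(market, true_beta)

alpha_hat, beta_hat, var_hat = single_index_params(assets, market)
index_mean = market.mean()        # plays the role of alpha_{n+1}
index_var = market.var(ddof=1)    # plays the role of sigma^2_{n+1}
print(alpha_hat, beta_hat, var_hat, index_mean, index_var)
```

Only n alphas, n betas, n residual variances, and the two index parameters are needed, instead of a full variance-covariance matrix, which is the simplification the diagonal model relies on.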

Entropy
Entropy was first introduced in thermodynamics by Rudolf Clausius in 1865 to measure the heat transfer ratio through a reversible process in an isolated system [10]. It is noteworthy that the physical concept of entropy is related both to the state of disorganization of matter and to the tendency toward disorganization of all material. Thus, it is possible to state that in a closed system the entropy never decreases. Nevertheless, according to Ormos and Zibriczky [46], entropy is a mathematically defined quantity, usually used to characterize the probability of results in a system that is undergoing a process.
The word entropy belonged to the domain of physics until 1948, when Claude Shannon developed his theory, using the term to represent an information measure. The goal of his work was to measure the loss of information by transmitting a message from one end to another. Shannon sought a word to describe his new measure of uncertainty [47].
Since then, Shannon's concept of entropy has been used in different disciplines such as mechanics, statistics, transport, urban planning, queuing theory, information theory, and linear and nonlinear programming. In addition, the concepts and principles of entropy are applied to the field of finance. This is because entropy has its advantages in measuring risk and describing distributions [10].
Ormos and Zibriczky [46] affirm that entropy functions can be divided into two main types: discrete and continuous. Let $X$ be a discrete random variable whose possible outcomes are $x_1, x_2, \ldots, x_n$, with corresponding probabilities $p_i = P(X = x_i)$, $p_i \ge 0$ and $\sum_{i=1}^{n} p_i = 1$. The generalized discrete entropy function for the variable $X$ is defined as (14):

$$H_\alpha(X) = \frac{1}{1 - \alpha} \log_2 \left( \sum_{i=1}^{n} p_i^{\alpha} \right) \tag{14}$$

where $\alpha$ is the order of entropy, with $\alpha \ge 0$ and $\alpha \ne 1$, and the base of the logarithm is 2. The order of entropy expresses the weight considered in each result. The most commonly used orders are $\alpha = 1$ and $\alpha = 2$. When $\alpha = 1$, the substitution in (14) results in a division by zero. Thus, it is necessary to use L'Hôpital's rule for the limit $\alpha \to 1$, which yields the Shannon [48] entropy of a probability measure on a finite set $X$. Applied to a portfolio, it is given by (15):

$$S(\omega) = -\sum_{i=1}^{n} \omega_i \log_2 \omega_i \tag{15}$$

where $\omega_i$ are the weights assigned to the assets in the portfolio. According to Rocha et al. [49], among the many desirable properties of the Shannon entropy index, we can highlight: (a) the entropy of a probability distribution representing a completely certain result is 0, and the entropy of any probability distribution representing uncertain results is positive; and (b) its measure is concave. Property (a) is desirable since the entropy index then guarantees non-zero solutions. Property (b) is desirable because it is much easier to maximize a concave function than a nonconcave one [47]. Figure 1 shows these properties of the entropy, as calculated by Equation (15). It is possible to notice that higher values of entropy indicate more randomness, that is, less information is expressed.
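The behaviour described above can be checked numerically. The following minimal sketch assumes that the generalized entropy of order α is the Rényi form with base-2 logarithm, and that Shannon entropy is its limit as α approaches 1; the function names and test distributions are hypothetical choices made for this example.

```python
import numpy as np


def shannon_entropy(p, base=2.0):
    """Shannon entropy of a discrete distribution; 0 * log(0) is taken as 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p[p > 0]
    return float(-(nonzero * np.log(nonzero)).sum() / np.log(base))


def renyi_entropy(p, alpha, base=2.0):
    """Generalized (Renyi) entropy of order alpha; alpha = 1 falls back to Shannon."""
    if alpha == 1:
        return shannon_entropy(p, base)
    p = np.asarray(p, dtype=float)
    nonzero = p[p > 0]
    return float(np.log((nonzero ** alpha).sum()) / ((1.0 - alpha) * np.log(base)))


# Two-outcome case: entropy is zero for a certain result and largest at p = 0.5.
for prob in (0.0, 0.1, 0.3, 0.5):
    print(prob, shannon_entropy([prob, 1.0 - prob]))

# An order close to 1 approaches the Shannon value.
print(renyi_entropy([0.3, 0.7], 1.0001), shannon_entropy([0.3, 0.7]))
```

With two equally likely outcomes the entropy equals one bit, and it falls toward zero as one outcome becomes certain, which is the concave shape referred to in the text.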
According to Zhou et al. [10], when dealing with continuous probability distributions, a density function is evaluated at all values of the argument. Thus, given a continuous probability distribution with a density function $f(x)$, one can define its entropy as (16):

$$S(x) = -\int f(x) \log f(x) \, dx \tag{16}$$

An important difference between discrete and continuous entropy is that while discrete entropy only takes non-negative values, continuous entropy can also take negative values, since $S(x) \in \mathbb{R}$.

Proposed Method
For the application of the proposed method, preferred and common shares with participation greater than zero in the Bovespa index (Ibovespa) of the São Paulo Stock Exchange (BM&FBovespa) were considered.
The proposed method was conducted as follows. Initially, the Data Envelopment Analysis (DEA) model identifies efficient assets. Input and output variables in the DEA model were selected taking into account literature recommendations such as Rotela Junior et al. [23], Kim et al. [50], and Rotela Junior et al. [25]. Return and earnings per share (EPS) were used as model output variables. As input variables, beta, price-earnings ratio, and volatility were adopted. Considering these indicators, only 59 assets could be selected for portfolio formation. The Economática ® database was used to collect data for 36 months (from 2014 to 2016).
After this preliminary analysis, only the assets given as efficient were submitted to allocation by the proposed model. This model results from the insertion of the Shannon Entropy function directly into the portfolio optimization model proposed by Sharpe [28], with the aim of diversifying the portfolio.
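In order to find a diversified portfolio, the Entropy metric was combined with the conventional mean-variance model given by Equations (10)-(13). The proposed model can be presented by (17)-(20):

$$\max \; Z = \lambda \sum_{i=1}^{n+1} \omega_i \alpha_i - \sum_{i=1}^{n+1} \omega_i^2 \sigma_i^2 - \sum_{i=1}^{n} \omega_i \log_2 \omega_i \tag{17}$$

subject to:

$$\sum_{i=1}^{n} \omega_i = 1 \tag{18}$$

$$\sum_{i=1}^{n} \omega_i \beta_i = \omega_{n+1} \tag{19}$$

$$\omega_i \ge 0, \quad i = 1, \ldots, n \tag{20}$$

In this study, the $\lambda$ parameter was set equal to 1, and no weight was associated with the objective functions, so that the objective functions described in (17) have equal relative importance.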
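To make the effect of the entropy term concrete, the sketch below maximizes the diagonal-model objective with and without a Shannon entropy term. It is an illustration under stated assumptions, not the study's implementation: the solver (SciPy's SLSQP), the base-2 logarithm, the substitution of the beta constraint into the objective, the function names, and all parameter values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def allocate(alpha, beta, resid_var, mkt_mean, mkt_var, lam=1.0, use_entropy=True):
    """Maximize lam * E[r_p] - Var[r_p], optionally plus the entropy of the weights."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    resid_var = np.asarray(resid_var, dtype=float)
    n = alpha.size

    def neg_objective(w):
        w_mkt = w @ beta  # omega_{n+1}, substituted from the beta constraint
        expected = w @ alpha + w_mkt * mkt_mean
        variance = (w ** 2) @ resid_var + w_mkt ** 2 * mkt_var
        value = lam * expected - variance
        if use_entropy:
            w_safe = np.clip(w, 1e-12, None)  # treat 0 * log(0) as 0
            value += -np.sum(w_safe * np.log2(w_safe))
        return -value  # SciPy minimizes

    result = minimize(
        neg_objective,
        x0=np.full(n, 1.0 / n),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


def weight_entropy(w):
    """Shannon entropy (in bits) of a weight vector."""
    w = np.clip(np.asarray(w, dtype=float), 1e-12, None)
    return float(-np.sum(w * np.log2(w)))


# Hypothetical parameters for four assets and the index.
alpha = [0.20, 0.15, 0.10, 0.05]
beta = [1.2, 0.9, 0.7, 0.5]
resid_var = [0.04, 0.03, 0.02, 0.01]
w_plain = allocate(alpha, beta, resid_var, 0.10, 0.02, use_entropy=False)
w_entropy = allocate(alpha, beta, resid_var, 0.10, 0.02, use_entropy=True)
print(np.round(w_plain, 3), weight_entropy(w_plain))
print(np.round(w_entropy, 3), weight_entropy(w_entropy))
```

Because the entropy-augmented optimum maximizes the sum of both terms while the plain optimum maximizes only the first, the entropy of the augmented weights cannot be lower than that of the plain weights, which is the diversification effect the proposed model is designed to obtain.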
Finally, the portfolio resulting from the application of the proposed method, that is, from the association of DEA with the modified Sharpe model [28] plus an Entropy function, was compared to two other portfolios: (1) one formed by Sharpe's model [28]; and (2) another formed by the association of the DEA model with the classic Sharpe's model [28].
To support this comparison during the result analysis, the Capital Asset Pricing Model (CAPM) presented by Sharpe [51] was applied to identify abnormal returns. In addition, the Sharpe Ratio (SR) [52] was calculated to evaluate the portfolios' performance [25,53,54]. Moreover, data from six additional months were analyzed for method validation. Accumulated return was calculated for each portfolio in the validation period, according to the participations defined by the models used in the optimization.
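The comparison metrics named above can be sketched as follows. This is a minimal illustration using the textbook definitions of the Sharpe Ratio and of the CAPM beta and Jensen's alpha; the function names, the zero risk-free rate, and the return series are assumptions for the example and do not reproduce the study's data or exact procedure.

```python
import numpy as np


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = np.asarray(returns, dtype=float) - risk_free
    return float(excess.mean() / excess.std(ddof=1))


def capm_beta_alpha(returns, market_returns, risk_free=0.0):
    """CAPM beta and Jensen's alpha from excess portfolio and market returns."""
    rp = np.asarray(returns, dtype=float) - risk_free
    rm = np.asarray(market_returns, dtype=float) - risk_free
    beta = np.cov(rp, rm, ddof=1)[0, 1] / rm.var(ddof=1)
    alpha = rp.mean() - beta * rm.mean()
    return float(beta), float(alpha)


# Hypothetical series: the portfolio moves half as much as the market,
# plus a constant extra return of 0.002 per period.
market = np.array([0.01, -0.02, 0.03, 0.00])
portfolio = 0.002 + 0.5 * market

beta_p, alpha_p = capm_beta_alpha(portfolio, market)
sr_p = sharpe_ratio(portfolio)
print(beta_p, alpha_p, sr_p)
```

A positive alpha indicates a return above what the CAPM predicts for the portfolio's beta, and a higher Sharpe Ratio indicates more return per unit of total risk.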

Analysis and Results
This paper proposed a method for optimizing portfolios in which assets are first pre-selected by the DEA model, and the ideal allocation in the portfolio is then defined through the proposed model, which results from the insertion of the Entropy function into Sharpe's model [28], as presented by Equations (17)-(20).
Then, two other portfolios were chosen for comparison with the portfolio obtained through the proposed model and method. In the first one, the 59 assets were optimized with Sharpe's model [28], without using the DEA model for asset pre-selection. This portfolio was denominated Portfolio 1. The second portfolio was obtained by using DEA to pre-select the efficient assets. At the end of this evaluation, ten assets were reported as efficient. After that, the original model of Sharpe [28] was applied in order to obtain the ideal participation of the assets for portfolio composition. This portfolio was denominated Portfolio 2.
The portfolio resulting from the proposed method was called Portfolio 3. At the end of the efficiency evaluation, the same ten assets were given as efficient, and then submitted to the proposed Model (17)-(20).
In Portfolio 1, as previously mentioned, only Sharpe's model [28] was applied and the participation of the 59 assets was identified. Seven assets out of 59 were selected to compose the portfolio, since there was no restriction regarding the maximum participation of the assets. Thus, DMUs (assets) 2, 10, 12, 16, 26, 52, and 58 obtained a participation of 19.67, 10.21, 0.50, 47.68, 7.90, 0.89, and 13.16%, respectively. It is interesting to note that in this model the share of asset 16 dominates the portfolio.
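The concentration of Portfolio 1 can be quantified with the entropy of its reported weights. The short sketch below uses the seven participations quoted above; the base-2 logarithm and the "effective number of assets" reading (2 raised to the entropy) are choices made for this illustration and are not part of the original analysis.

```python
import math

# Participations of assets 2, 10, 12, 16, 26, 52, and 58 in Portfolio 1 (in %).
weights_pct = [19.67, 10.21, 0.50, 47.68, 7.90, 0.89, 13.16]

total_pct = sum(weights_pct)                  # rounded shares; close to 100
weights = [x / total_pct for x in weights_pct]

entropy_bits = -sum(w * math.log2(w) for w in weights)
max_entropy_bits = math.log2(len(weights))    # value for equal weights
effective_assets = 2 ** entropy_bits

print(total_pct, entropy_bits, max_entropy_bits, effective_assets)
```

The entropy of these weights is well below the equal-weight maximum for seven assets, which is consistent with the observation that asset 16 dominates the portfolio.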
In Portfolios 2 and 3, the DEA model was applied to the 59 assets (DMUs), and 10 assets were identified as efficient. Then, these assets were submitted to the allocation models: Portfolio 2 to Sharpe's model [28] and Portfolio 3 to the proposed model. Table 1 shows the ideal participation of the assets in Portfolios 1-3.
Among the assets presented in Table 1, not all were considered to compose all the portfolios, which explains the symbol "-". However, some may have been considered by the DEA model without entering the portfolio, which explains some values of 0%. It should be noted that the assets composing Portfolio 1 differ from the assets that compose Portfolios 2 and 3, since the former were selected simply by taking into account the return and risk variables.
For the selection of Portfolios 2 and 3, different variables were considered, as discussed in the Proposed Method section, which provides a more careful analysis.
It is interesting to note that some assets in Portfolio 2 obtained zero participation, as was the case for most of the assets in the Portfolio 1 allocation. In Portfolio 3, asset allocation takes place in a more balanced manner, providing a diversification different from that found in Portfolio 1 and Portfolio 2, as can be seen in Table 1. The allocation by the proposed model, resulting from the inclusion of Entropy in Sharpe's model [28], provides an increase in diversity, as stated by Yu et al. [7]. As shown in Table 2, the ten assets selected by the DEA model were used in the composition of Portfolio 3. For Portfolios 1 and 2, the diversification benefit is not used to the maximum, as they have only seven and four assets in their composition, respectively. Diversification reduces portfolio risk by investing in assets that are independent of each other, that is, taking advantage of assets with a low correlation.
Table 2 also shows the values of expected return, standard deviation, and return for Sharpe Ratio calculation, following the model of Homm and Pigorsch [53], Auer and Schuhmacher [54], and Rotela Junior et al. [25]. The results of 2.978, 2.289, and 3.002 were obtained for Portfolios 1-3, respectively, and the Sharpe Ratio of the proposed method is highlighted.
Regarding the alpha or Jensen Index, since the value is positive, the investor generated a higher return than expected considering the level of portfolio risk. It should be noted that the values obtained for Portfolios 1-3 are, respectively, 1.851, 1.861, and 1.083, as shown in Table 2. In addition, Table 2 shows that in the validation period, the accumulated returns obtained by Portfolios 1-3 are, respectively, 14.08, 13.36, and 16.97%, with emphasis on the value obtained by the portfolio generated by the proposed method.
Finally, the Beta values of the portfolios were calculated; the values obtained for Portfolios 1-3 were, respectively, 0.361, 0.278, and 0.283. The lower values of Portfolios 2 and 3 stand out, which was expected given that the beta variable was considered in the DEA model. This agrees with Kim et al. [50], who state that low beta assets reduce overall portfolio risk and offer better returns than assets with a higher beta. Portfolios with low beta perform well in any market state, including in downturns, just when investors need the diversification effect.

Conclusions
This paper has presented a new method for portfolio formation by using Data Envelopment Analysis (DEA) in association with a model resulting from the addition of the Entropy function directly into the asset allocation model.
In the first step, the DEA model allowed a pre-selection of the assets through different variables that are relevant and assist investors in deciding on portfolio composition.
The proposed new allocation model, in which Sharpe's [28] optimization model was altered by an Entropy function, provided an increase in diversity, which could be observed in the composition of the proposed portfolio, allowing a more feasible allocation of assets than models without the entropy function.
The good performance of the proposed method could be observed when comparing Portfolio 3 against the others. Notable results include (1) the Sharpe Ratio, the number of assets, and the balanced asset allocation in the portfolio; and (2) the low beta value, which not only yields a robust portfolio but also supports good behavior in any market state, resulting from the association of DEA with the proposed model.

Figure 1. Entropy in the case of two possibilities.

Table 2. Summary table of Portfolio results.