Entropy Cross-Efficiency Model for Decision Making Units with Interval Data

The cross-efficiency method, as a Data Envelopment Analysis (DEA) extension, calculates the cross efficiency of each decision making unit (DMU) using the weights of all decision making units (DMUs). The major advantage of the cross-efficiency method is that it can provide a complete ranking for all DMUs. In addition, the cross-efficiency method could eliminate unrealistic weight results. However, the existing cross-efficiency methods only evaluate the relative efficiencies of a set of DMUs with exact values of inputs and outputs. If the input or output data of DMUs are imprecise, such as the interval data, the existing methods fail to assess the efficiencies of these DMUs. To address this issue, we propose the introduction of Shannon entropy into the cross-efficiency method. In the proposed model, intervals of all cross-efficiency values are firstly obtained by the interval cross-efficiency method. Then, a distance entropy model is proposed to obtain the weights of interval efficiency. Finally, all alternatives are ranked by their relative Euclidean distance from the positive solution.


Introduction
When decision making units (DMUs) have multiple inputs and outputs, data envelopment analysis (DEA) is a well-known non-parametric programming technique for assessing the efficiency of these DMUs. The efficiency score of a DMU is defined as the maximum of the ratio of its weighted sum of outputs to its weighted sum of inputs. If the efficiency score of a DMU is equal to 1, the DMU is considered efficient; otherwise, it is inefficient. Usually, inefficient DMUs are considered to perform worse than efficient ones. Since DEA was proposed by Charnes et al. [1], it has been widely applied to various cases of performance evaluation [2][3][4][5][6][7]. DEA models (both the CCR (Charnes, Cooper and Rhodes) and BCC (Banker, Charnes, and Cooper) models) classify units into two groups: efficient and inefficient in the Pareto sense (see Sinuany-Stern et al. [8]). In addition, DEA is not able to rank the efficient DMUs, which all have an efficiency score of 1. In order to solve this problem, the cross-efficiency method was developed by Sexton et al. [9]. The cross-efficiency method, as a DEA extension, obtains the efficiency of each DMU by linking the weights of all DMUs. Its primary advantage is that all DMUs can be completely ranked [10]. In addition, the cross-efficiency method can eliminate unrealistic weight results [11].
With these advantages, cross-efficiency evaluation has been extensively applied in various performance evaluation problems [12][13][14][15]. In spite of its wide application, cross-efficiency evaluation still has some defects, such as the non-uniqueness of the DEA optimal weights [9]. Usually, the optimal weights obtained by traditional DEA models are non-unique. If a set of weights is selected arbitrarily, then the cross-efficiency scores will be arbitrarily generated [16]. To solve the problem of weight non-uniqueness, Sexton et al. [9] improved the cross-efficiency evaluation method by incorporating a secondary goal model. Following the idea of Sexton et al. [9], a number of scholars have proposed secondary goal models. For example, Liang et al. [17] proposed three secondary goal models, each corresponding to a practical scenario. Based on the models of Liang et al. [17], Wang and Chin [18] proposed improved models by replacing the target efficiency. Jahanshahloo et al. [19] improved traditional cross-efficiency evaluation by considering symmetric weights. Wu et al. [20] and Contreras [21] proposed improving the ranking position of the evaluated DMU when choosing weights. In the study of Lim [22], the secondary goal was to minimize (or maximize) the cross-efficiencies of the evaluated DMUs. Maddahi et al. [23] proposed a proportional weight assignment secondary goal, under which weights are assigned proportionally to the inputs or outputs of the evaluated DMUs. Most of these secondary goal models are either benevolent or aggressive. In the benevolent (aggressive) model, the weights selected for the evaluated DMU make the cross-efficiencies of the other DMUs as large (small) as possible. Departing from these ideas, scholars have also proposed new cross-efficiency models from different perspectives. For example, Cook and Zhu [24] proposed a units-invariant multiplicative DEA model that directly generates unique cross-efficiency scores. Based on Pareto optimality, Wu et al. [25] proposed the Pareto improvement cross-efficiency evaluation, which obtains Pareto-optimal cross-efficiencies for all DMUs.
Besides secondary goals in cross-efficiency evaluation, scholars have also examined how to aggregate the cross-efficiencies in the cross-efficiency matrix. For example, Wu et al. [26] introduced Shapley cooperative game theory into cross-efficiency evaluation, considering each DMU as a player, and proposed a Shapley DEA model to obtain the weights of all cross-efficiencies. Wang et al. [27] considered that the cross-efficiencies of DMUs should have different preference weights. Based on the preference deviation degree, they proposed three different models for aggregating cross-efficiencies. Angiz et al. [28] argued that, when weights are selected for its cross-efficiencies, a DMU may be more concerned about whether the assigned weights improve its ranking. Based on this idea, they proposed a ranking preference model. Yang et al. [29] regarded each cross-efficiency as independent evidence, and used the evidential reasoning method to aggregate cross-efficiencies.
The traditional DEA or cross-efficiency models assume that the data of DMUs are known exactly. However, because of uncertainty, the data may be given in a fuzzy form. Therefore, a number of studies have examined how to evaluate the efficiencies of DMUs with fuzzy data. For example, Cooper et al. [30] proposed an imprecise DEA (IDEA) model, which can be transformed into a linear programming model through a series of variable alternations and scale transformations. However, Lee et al. [31] argued that the IDEA model is complicated and may lead to a rapid increase in computational burden. To solve this problem, Despotis and Smirlis [32] proposed two improved models, through which the lower and upper efficiency of each DMU can be obtained. Wang et al. [33] pointed out that the models of Despotis and Smirlis [32] use two different production frontiers to measure the efficiencies of DMUs, which may make the resulting efficiencies incomparable. To deal with this issue, Wang et al. [33] proposed new DEA models based on a common frontier to obtain the interval efficiency of each DMU, and then used a minimax regret-based approach to rank the interval efficiencies of all DMUs. To determine the range of the interval efficiency of each DMU, Azizi and Jahed [34] introduced a virtual ideal DMU into the DEA model. The efficiency of the ideal DMU is the largest among all DMUs, so the worst and the best relative efficiencies of each DMU can be obtained. The worst and the best relative efficiencies then constitute an interval for the overall performance evaluation of each DMU. Wang and Chin [35] proposed fuzzy DEA models based on two pairs of expected value models to measure the optimistic and the pessimistic efficiencies of DMUs. They integrated the two extreme efficiencies through a geometric average to obtain the overall performance of the DMUs. Dotoli et al. [36] proposed a novel approach that integrates the DEA cross-efficiency technique with the fuzzy logic framework. This approach not only maintains the discriminative power of cross-efficiency DEA but also deals with uncertainty.
Although the existing cross-efficiency methods have examined in depth how to be aggressive or benevolent to DMUs when evaluating efficiencies, the maximum discrimination of DMUs has been largely ignored. In addition, there is a paucity of research on aggregating interval cross-efficiency matrices. To fill these gaps, the present study proposes a new cross-efficiency method based on entropy theory. In this new approach, the model of Wang et al. [33] is first extended to cross-efficiency evaluation to obtain the intervals of all cross-efficiency values. Then, the DEA entropy model is used to calculate the weights of all interval cross-efficiencies. Finally, all DMUs are evaluated and ranked according to their distance to the ideal positive cross-efficiency. This approach is illustrated and verified by a demonstrative case using data from China's primary schools. We conclude that the proposed approach is effective for evaluating DMUs with interval data and can provide complete and fair results for all DMUs.
The rest of the paper is organized as follows. Section 2 introduces the interval DEA models and Section 3 presents the cross-efficiency evaluation method with interval data. The cross-efficiency model based on Shannon entropy is discussed in Section 4, followed by a numerical demonstration using data from Chinese primary schools in Section 5. Conclusions are presented in Section 6.

Interval DEA Models
There are $n$ DMUs to be evaluated, and each DMU has $s$ different outputs and $m$ different inputs. Input $i$ and output $r$ of DMU$_j$ are denoted as $x_{ij}$ and $y_{rj}$, respectively. The input and output data may be imprecise because of uncertainty, so that only their bounded intervals $[x^l_{ij}, x^u_{ij}]$ and $[y^l_{rj}, y^u_{rj}]$, with $x^l_{ij} > 0$ and $y^l_{rj} > 0$, are provided. For measuring the efficiencies of DMUs with interval data, Despotis and Smirlis [32] proposed a linear programming model, Model (1), to generate the lower and upper bounds of the efficiency of each DMU. However, Wang et al. [33] pointed out that Despotis and Smirlis [32] used two different production frontiers to obtain the interval efficiency, and thus the DMUs cannot be compared on the basis of a common evaluation criterion. In order to calculate the lower and upper bounds of the efficiency of DMU$_d$, Wang et al. [33] proposed two linear formulations, Models (2) and (3), which generate the bounded interval $[E^l_{dd}, E^u_{dd}]$. In Models (2) and (3), DMU$_d$ is the DMU to be evaluated; $\omega_{id}$ and $\mu_{rd}$ are the weights of input $i$ and output $r$, respectively; $E^l_{dd}$ (or $E^u_{dd}$) is the lower (or upper) efficiency of DMU$_d$; and $\varepsilon$ is the non-Archimedean infinitesimal. From Models (2) and (3), it is clear that $E^l_{dd} \le E^u_{dd}$.
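For reference, the two common-frontier formulations attributed above to Wang et al. [33] are usually written as the pair of linear programs below. This is a sketch in the notation of this section, reconstructed from the general form of common-frontier interval DEA rather than taken from the displays of Models (2) and (3), so the exact layout is an assumption and should be checked against [33].

```latex
% Lower bound (assumed form of Model (2)): DMU_d at its worst data, frontier at best data
\begin{align*}
\max\ & E^{l}_{dd} = \sum_{r=1}^{s} \mu_{rd}\, y^{l}_{rd} \\
\text{s.t.}\ & \sum_{i=1}^{m} \omega_{id}\, x^{u}_{id} = 1, \\
& \sum_{r=1}^{s} \mu_{rd}\, y^{u}_{rj} - \sum_{i=1}^{m} \omega_{id}\, x^{l}_{ij} \le 0, \quad j = 1, \dots, n, \\
& \mu_{rd},\ \omega_{id} \ge \varepsilon, \quad r = 1, \dots, s;\ i = 1, \dots, m.
\end{align*}

% Upper bound (assumed form of Model (3)): DMU_d at its best data, same frontier
\begin{align*}
\max\ & E^{u}_{dd} = \sum_{r=1}^{s} \mu_{rd}\, y^{u}_{rd} \\
\text{s.t.}\ & \sum_{i=1}^{m} \omega_{id}\, x^{l}_{id} = 1, \\
& \sum_{r=1}^{s} \mu_{rd}\, y^{u}_{rj} - \sum_{i=1}^{m} \omega_{id}\, x^{l}_{ij} \le 0, \quad j = 1, \dots, n, \\
& \mu_{rd},\ \omega_{id} \ge \varepsilon, \quad r = 1, \dots, s;\ i = 1, \dots, m.
\end{align*}
```

Both programs share the same frontier constraints, built from the most favorable data of every DMU, which is what makes the two bounds comparable across DMUs.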

Cross-Efficiency Evaluation Method with Interval Data
Models (2) and (3) are self-assessment models. A self-evaluated DEA model enables each DMU to choose the most favorable weights for evaluating its own efficiency. This may lead to more than one DMU being assessed as efficient, and such DEA-efficient DMUs cannot be further distinguished (Wang and Chin [37]). To solve this problem, Sexton et al. [9] proposed a cross-efficiency DEA method by introducing the concept of peer evaluation. However, the method of Sexton et al. [9] suffers from multiple optimal weight solutions: a weight scheme obtained by the method may be favorable to one DMU but not to another. To address this ambiguity in weight selection, Doyle and Green [10] proposed the aggressive and benevolent formulations by introducing a secondary goal into the cross-efficiency method. In the aggressive (or benevolent) formulation, the secondary goal chooses the weights that make the efficiency of the target DMU as high as possible and the efficiencies of all other DMUs as low (or as high) as possible. However, Oukil and Amin [38] pointed out that the aggressive and benevolent models of Doyle and Green [10] use a common set of weights for all DMUs, which does not guarantee maximum discrimination among DMUs. To improve discrimination, Oukil and Amin [38] proposed using different weights for cross-efficiency scores. The purpose of the present study is to effectively discriminate among all DMUs with interval data. Therefore, our study adopts the viewpoint of Oukil and Amin [38], and the model of Wang et al. [33] is extended to obtain the lower and upper cross-efficiencies of each DMU. Model (4) calculates the lower cross-efficiency values for interval data and, similarly, Model (5) computes the upper cross-efficiency values. After all cross-efficiency values are computed, an interval efficiency matrix is obtained, as shown in Table 1. In each column, $[E^l_{dj}, E^u_{dj}]$ represents the lower and upper bounds of the cross-efficiency score of DMU$_j$ obtained by using the weights of DMU$_d$.
Table 1. A generalized interval cross-efficiency matrix (rows: rating DMU$_d$; columns: rated DMU$_j$).
Models (4) and (5) are built upon the classical cross-efficiency DEA framework, and we consider only the CRS (constant returns to scale) condition in the present study. Models (4) and (5) are not suitable for extension to the case of VRS (variable returns to scale), because integrating VRS into the cross-efficiency DEA framework may yield negative cross-efficiency scores (Cook and Zhu [39] and Lim and Zhu [40]).
The DEA method mainly has two orientation modes: input and output. In the multiplier form of the DEA model, the input orientation mode is expressed as maximizing the ratio of the DMU's sum of weighted outputs to its sum of weighted inputs. The output orientation mode is formulated as minimizing the ratio of the DMU's sum of weighted inputs to its sum of weighted outputs (Cooper et al. [41] and Cook and Bala [42]). Therefore, per these definitions, Models (4) and (5) are input-oriented.
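The interval cross-efficiency matrix of Table 1 can be illustrated with a minimal sketch. It assumes that the weights chosen by each rating DMU are already available (for instance from an LP solver) and that the same weights are used for both bounds; the function name, the data layout, and the toy numbers are hypothetical and are not taken from the paper. The lower bound pairs the rated DMU's smallest outputs with its largest inputs, and the upper bound does the opposite.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def interval_cross_efficiency(
    x: List[List[Interval]],   # x[j][i]: interval of input i for DMU j
    y: List[List[Interval]],   # y[j][r]: interval of output r for DMU j
    omega: List[List[float]],  # omega[d][i]: input weights chosen by rating DMU d
    mu: List[List[float]],     # mu[d][r]: output weights chosen by rating DMU d
) -> List[List[Interval]]:
    """Return E[d][j] = (lower, upper) cross-efficiency of DMU j under DMU d's weights."""
    n = len(x)
    matrix = []
    for d in range(n):
        row = []
        for j in range(n):
            out_low = sum(mu[d][r] * y[j][r][0] for r in range(len(y[j])))
            out_up = sum(mu[d][r] * y[j][r][1] for r in range(len(y[j])))
            in_low = sum(omega[d][i] * x[j][i][0] for i in range(len(x[j])))
            in_up = sum(omega[d][i] * x[j][i][1] for i in range(len(x[j])))
            # Worst case: least output over most input; best case: the reverse.
            row.append((out_low / in_up, out_up / in_low))
        matrix.append(row)
    return matrix


# Hypothetical toy data: two DMUs, one input, one output.
x = [[(2.0, 4.0)], [(1.0, 2.0)]]
y = [[(2.0, 3.0)], [(1.0, 1.0)]]
omega = [[1.5], [0.75]]  # assumed input weights of rating DMUs 0 and 1
mu = [[1.0], [0.5]]      # assumed output weights of rating DMUs 0 and 1
E = interval_cross_efficiency(x, y, omega, mu)
for d, row in enumerate(E):
    print(d, row)
```

Each row of `E` corresponds to one rating DMU and each column to one rated DMU, matching the orientation described for Table 1.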

The Cross-Efficiency Model Based on Shannon Entropy
Shannon entropy is a useful and effective mathematical concept for measuring uncertainty [43]. Incorporating Shannon entropy into DEA has attracted the interest of a number of scholars [44][45][46]. In this section, Shannon entropy is used to calculate the weights of the interval cross-efficiencies. The weights are obtained by making the distance between the self-evaluation entropy score and the peer-evaluation entropy score as small as possible. The cross-efficiency entropy model is constructed in the following steps:
• Step 1: Determining the entropy value of interval cross-efficiency.
As defined in Table 1, $E$ is the interval cross-efficiency matrix, and its elements $E^l_{dj}$ and $E^u_{dj}$ represent the interval efficiency that DMU$_d$ accords to DMU$_j$. The entropy value of interval cross-efficiency is defined on the normalized matrix $E$.
Definition 1. For DMU$_j$, the lower (or upper) entropy value is the entropy value of its normalized lower (or upper) cross-efficiency score.
Theorem 1. Entropy values can be added.
Each DMU (e.g., DMU$_j$) has $n$ lower (or upper) cross-efficiency scores. By Step 1, the lower (or upper) entropy of DMU$_j$ is equal to the sum of the entropy values of its $n$ lower (or upper) cross-efficiency scores.
• Step 2: Calculating the weights based on the proposed cross-efficiency entropy model.

Definition 2. For DMU$_j$, the distance entropy function between a cross-efficiency score and its self-evaluation efficiency score is defined in terms of $h^l_{ij}$ (or $h^u_{ij}$) and $h^{l*}_{jj}$ (or $h^{u*}_{jj}$), which are the entropy values of the cross-efficiency score $E^l_{ij}$ (or $E^u_{ij}$) and of the efficiency score $E^l_{jj}$ (or $E^u_{jj}$) from Model (2) (or (3)), respectively.
Entropy is a measure of uncertainty, and we assume that reasonable weights should make the distance entropy of all cross-efficiencies as small as possible, so that the uncertainty of the interval cross-efficiency is smallest. The cross-efficiency entropy model built on this assumption is Model (10). Model (10) is essentially a multi-attribute decision making method. According to the characteristics of multi-attribute decision making models (Wang and Parkan [47], Wang and Luo [48], and Tzeng and Huang [49]), the sum of the weights is equal to 1. Therefore, the weights $\lambda_i$ in Model (10) also need to satisfy this constraint. Then, according to the Lagrangian sufficiency theorem, the weight $\lambda_i$ can be determined in closed form.
• Step 3: Determining the weighted normalization decision matrix.
The weighted normalization value is calculated using $\lambda_i$, the weight of attribute $i$.
• Step 4: Determining the positive ideal solutions.
After Step 3, there are two weighted normalization matrices ($V^l_{ij}$ and $V^u_{ij}$). In this step, the maximum value of each row in each matrix is taken as the positive ideal solution.
• Step 5: Calculating the Euclidean distance from the positive ideal solutions.
The final distance from the positive ideal solutions is $d^*_j$.
• Step 6: Determining the rank of all alternatives on the basis of their relative Euclidean distance from the positive ideal solutions.
The smaller $d^*_j$ is, the better the alternative $A_j$ is. The best alternative is the one with the smallest relative Euclidean distance to the ideal solutions.
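Steps 3 to 6 can be sketched as follows, assuming that the normalized lower and upper cross-efficiency matrices and the weights $\lambda_i$ are already available. The function names and the toy numbers are hypothetical. Combining the lower and upper parts into a single distance through one square root of summed squares is also an assumption made for illustration, since the exact expression for $d^*_j$ is not reproduced here.

```python
import math
from typing import List


def weighted(matrix: List[List[float]], weights: List[float]) -> List[List[float]]:
    """Step 3: scale row i (rating DMU i, the attribute) by its weight lambda_i."""
    return [[weights[i] * v for v in row] for i, row in enumerate(matrix)]


def positive_ideal(matrix: List[List[float]]) -> List[float]:
    """Step 4: the positive ideal solution is the maximum of each row."""
    return [max(row) for row in matrix]


def distances(lower: List[List[float]], upper: List[List[float]],
              weights: List[float]) -> List[float]:
    """Step 5: Euclidean distance of each rated DMU (column) from the ideal.

    Assumption: lower and upper parts are combined under one square root.
    """
    v_low, v_up = weighted(lower, weights), weighted(upper, weights)
    ideal_low, ideal_up = positive_ideal(v_low), positive_ideal(v_up)
    n_rows, n_cols = len(lower), len(lower[0])
    result = []
    for j in range(n_cols):
        squared = sum(
            (v_low[i][j] - ideal_low[i]) ** 2 + (v_up[i][j] - ideal_up[i]) ** 2
            for i in range(n_rows)
        )
        result.append(math.sqrt(squared))
    return result


def rank(dist: List[float]) -> List[int]:
    """Step 6: rank 1 goes to the smallest distance."""
    order = sorted(range(len(dist)), key=lambda j: dist[j])
    ranks = [0] * len(dist)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


# Hypothetical, already normalized matrices for three DMUs (rows rate, columns are rated).
lower = [[0.8, 0.5, 0.3], [0.7, 0.6, 0.2], [0.9, 0.4, 0.3]]
upper = [[1.0, 0.7, 0.5], [0.9, 0.8, 0.4], [1.0, 0.6, 0.5]]
lam = [0.3, 0.3, 0.4]  # assumed weights that sum to 1
dist = distances(lower, upper, lam)
ranks = rank(dist)
print(dist, ranks)
```

In this toy data the first DMU attains the row maximum everywhere, so its distance is zero and it is ranked first, which mirrors the rule that the smallest distance identifies the best alternative.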
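Theorem 2. The feasible region defined by the constraints of Model (10) is a non-empty convex set.
Proof. Let $C$ be the set defined by the constraints of Model (10). It is evident that $C$ is non-empty. Now, take any two points of $C$. Since the constraints of Model (10) are linear (in particular, the weights sum to 1), every convex combination of the two points also satisfies them. Therefore, $C$ is convex.
Theorem 3. The objective function of Model (10) is a strongly convex function.
Proof. The objective function has continuous second partial derivatives with respect to $\lambda_i$, and its Hessian matrix is always positive definite. Therefore, the objective function is a strongly convex function.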

Theorem 4. $\lambda^*_i$ is the global optimal solution of Model (10).
Proof. $C$ is a non-empty convex set and the objective function is strongly convex. Therefore, Model (10) is a convex programming model, and the generated $\lambda^*_i$ is the global optimal solution of Model (10).
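The weight determination of Step 2 and the optimality claim of Theorem 4 can be checked numerically under explicit assumptions. The sketch below assumes that the entropy value of a normalized score $p$ is $-p \ln p$ and that Model (10) minimizes $\sum_i D_i \lambda_i^2$ subject to $\sum_i \lambda_i = 1$, where $D_i$ sums the squared gaps between the entropies assigned by rating DMU $i$ and the self-evaluation entropies. Both the objective form and the resulting closed form $\lambda_i = (1/D_i) / \sum_k (1/D_k)$ are assumptions, not the paper's verified equations, and all names and numbers are hypothetical.

```python
import math
from typing import List


def entropy_term(p: float) -> float:
    """Assumed Shannon entropy contribution -p ln p of a normalized score p."""
    return -p * math.log(p) if p > 0 else 0.0


def distance_entropy(norm_lower: List[List[float]],
                     norm_upper: List[List[float]]) -> List[float]:
    """D_i: squared entropy gaps between peer scores of rater i and self-evaluations."""
    n = len(norm_lower)
    h_low = [[entropy_term(v) for v in row] for row in norm_lower]
    h_up = [[entropy_term(v) for v in row] for row in norm_upper]
    d = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            # The diagonal entry [j][j] plays the role of the self-evaluation entropy.
            total += (h_low[i][j] - h_low[j][j]) ** 2 + (h_up[i][j] - h_up[j][j]) ** 2
        d.append(total)
    return d


def closed_form_weights(d: List[float]) -> List[float]:
    """Lagrange solution of min sum_i d_i*lam_i**2 s.t. sum_i lam_i = 1 (needs d_i > 0)."""
    inverse = [1.0 / v for v in d]
    total = sum(inverse)
    return [v / total for v in inverse]


def objective(d: List[float], lam: List[float]) -> float:
    """Assumed objective of Model (10)."""
    return sum(di * li * li for di, li in zip(d, lam))


# Hypothetical normalized cross-efficiency matrices for three DMUs.
norm_lower = [[0.30, 0.20, 0.10], [0.25, 0.35, 0.15], [0.20, 0.25, 0.30]]
norm_upper = [[0.35, 0.25, 0.15], [0.30, 0.40, 0.20], [0.25, 0.30, 0.35]]
D = distance_entropy(norm_lower, norm_upper)
lam_star = closed_form_weights(D)
print(lam_star, objective(D, lam_star))
```

Because the assumed objective has the diagonal Hessian $2\,\mathrm{diag}(D_i)$ with $D_i > 0$, any other feasible weight vector, such as equal weights, gives an objective value at least as large as that of `lam_star`.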

The Case of Primary Schools
The illustrative application uses a dataset of all primary schools in Jinhu County, Jiangsu, China. To be consistent with extant studies in the literature, we choose as input variables the number of staff, school building area (in square meters), copies of books, fixed assets (in $10^6$ RMB), and school budget (in $10^6$ RMB). The output variable is the number of students in each school. All the data were collected from the Education Bureau of Jinhu County. The school profile is shown in Table 2.

Table 2 indicates significant differences in the inputs and outputs of these primary schools, with the maximum value being 31 times greater than the minimum value. The differences among schools in school building area, fixed assets, and the number of students are substantial. Of these indicators, the number of staff and the number of students are given in interval form. Staff and students may leave a school or transfer from one school to another. Therefore, these data are not fixed, and we collected them at the beginning and the end of the year to form intervals.

Results and Comparison with Other Models
After solving Models (2) and (3), we can obtain the cross-efficiency matrix according to Models (4) and (5), as listed in Table 3.
Table 3 is the cross-efficiency matrix. The elements on the diagonal are the self-evaluation results obtained when DMU$_d$ evaluates itself according to Models (2) and (3). After normalizing the elements of the cross-efficiency matrix, we can calculate the final weights of all DMUs by using Model (12). The relative Euclidean distances of all DMUs to the ideal cross-efficiency can then be calculated. Based on these relative Euclidean distances, the ranking results of the DMUs are obtained, as shown in Table 4. In order to compare our approach with other models, the interval DEA model proposed by Sun et al. [50] is selected to evaluate these DMUs. Hadad and Hanani [51] provided a survey of the common weight DEA models (Ganley and Cubbin [52]; Friedman and Sinuany-Stern [53] and Adler et al. [54]). However, these models can only be used to evaluate the efficiencies of DMUs with precise data, and they are incapable of assessing DMUs with interval data. Based on the idea of common weights, Sun et al. [50] proposed an interval DEA model. Therefore, to make a fair comparison, we compare our results with those of Sun et al. [50]. Sun et al. [50] used a simple weighting method to aggregate the lower and upper efficiencies into the final efficiency. The aggregating weights of the lower and upper efficiencies are $h_1$ and $h_2$, with $h_1 + h_2 = 1$. The results obtained with the model of Sun et al. [50] are presented in Table 4.
By comparing the results, we have several findings. Firstly, Sun et al. [50] used a simple weighted method to aggregate the lower and upper efficiencies of each DMU. However, when different aggregation weights are assigned to the lower and upper bounds of the interval, the final ranking results differ. This may confuse decision makers regarding how to determine the aggregation weights. This problem does not appear in our model, whose results show that only one set of solutions is obtained. Secondly, Despotis [16] pointed out that using the simple weighted method to aggregate efficiencies is not good enough, since the result is not a Pareto solution. This problem does not appear in our model either. Our study proposes a Shannon entropy DEA model to obtain a set of aggregation weights, which are proved to be the global optimal solutions. Thirdly, the model of Sun et al. [50] is a self-evaluated model. The self-evaluated DEA model has a significant shortcoming: it allows each DMU to be evaluated using its most favorable weights. As a result, the weights obtained by DEA are usually inconsistent with real-world production processes (Wang et al. [55]). From Table 5, we find that each DMU has zero weights, which is inconsistent with the production process or prior knowledge (Ramón et al. [56]). However, this problem can be effectively solved by our approach. Our approach uses the peer-evaluated model, which ranks all DMUs using the weights of all DMUs and can eliminate unreasonable weight schemes without any a priori assumptions on weight restrictions. Fourthly, from Table 4, we find that the performance of DMU$_{20}$ is the best, while that of DMU$_8$ is the worst. In the column of DMU$_{20}$ in Table 3, most of the upper and lower interval efficiencies of DMU$_{20}$ are the largest among the cross-efficiencies of all DMUs, whereas most of the interval efficiencies of DMU$_8$ are the smallest. These findings are consistent with our ranking results, which indirectly confirms the reliability and practicability of the approach proposed in the present study.
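The first finding, that a simple weighted aggregation of the bounds can reverse the ranking when the aggregation weights change, can be illustrated with a small sketch. The two interval efficiencies below are hypothetical and are not taken from Table 4; `h1` is the weight on the lower bound and `1 - h1` the weight on the upper bound, following the $h_1 + h_2 = 1$ convention described for Sun et al. [50].

```python
from typing import Dict, List, Tuple


def aggregate(interval: Tuple[float, float], h1: float) -> float:
    """Simple weighted aggregation: h1 * lower + (1 - h1) * upper."""
    low, up = interval
    return h1 * low + (1.0 - h1) * up


def ranking(intervals: Dict[str, Tuple[float, float]], h1: float) -> List[str]:
    """Order DMUs from best to worst by their aggregated efficiency."""
    scores = {name: aggregate(iv, h1) for name, iv in intervals.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical interval efficiencies: A is wide, B is narrow.
intervals = {"A": (0.60, 0.90), "B": (0.70, 0.75)}
optimistic = ranking(intervals, h1=0.2)   # emphasis on the upper bound
pessimistic = ranking(intervals, h1=0.8)  # emphasis on the lower bound
print(optimistic, pessimistic)
```

With emphasis on the upper bound the wide interval A is ranked first, while emphasis on the lower bound puts B first, so the ranking depends on a choice the decision maker must justify.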

Validating the Model by Adding Simulated Schools
To further verify the effectiveness and practicability of our approach, we simulate 10 virtual schools, as shown in Table 6. The simulated data satisfy two requirements: (1) input data gradually increase from simulated school 26 to school 35; and (2) output data gradually decrease from simulated school 26 to school 35. All the input and output data of the simulated schools are randomly generated with Matlab (version 2010b). If a school serves more students (outputs) with fewer resources (inputs), it achieves higher performance.

Conclusions
Traditional cross-efficiency models assume that the data of all DMUs are precise. However, this assumption is not always correct in the real world. In many real circumstances, the outputs and inputs of DMUs are not perfectly precise and may only be known as a range in interval form. In these cases, traditional cross-efficiency models cannot evaluate the efficiencies of DMUs. To address this problem, the present study proposes a new approach. In this approach, we first extend traditional cross-efficiency models to obtain the interval efficiency of each DMU. Then, the distance entropy model is used to calculate the weights of the interval cross-efficiency scores. Finally, all DMUs are assessed and ranked by their distance to the positive ideal cross-efficiency. A demonstrative case using data from China's primary schools illustrates the newly proposed model. Through this real case, we conclude that the proposed method is convenient for solving problems with interval data of DMUs, and can provide complete and fair results for all DMUs.
The method proposed in this paper can be further expanded in future studies. The DEA Shannon entropy model is proposed based on the cross-efficiency method, and it can also be extended to other DEA models with interval data. In addition, our study collected the imprecise data in interval form. However, in some real cases, a proportion of the data are missing. Under this circumstance, it is difficult to extend our method to evaluate DMUs with missing data, but this direction is worth further investigation.


Table 2. The data of input and output of all schools.

Table 3. Interval cross-efficiencies of all DMUs.

Table 4. The distance and ranking results of all DMUs.

Comparing the data of Table 2 (real data) and Table 6 (simulated data) yields three observations. School 35 has the fewest students, but its educational resources are the largest. Thus, School 35 has the worst performance.

Table 6. The data of input and output of 10 simulated schools.

[...] should be ranked first among all DMUs; (2) DMU$_{35}$ has the largest Euclidean distance of 0.0822, so it is ranked last; and (3) the Euclidean distances gradually increase from DMU$_{26}$ to DMU$_{35}$. This indicates that the performance rankings gradually decline from DMU$_{26}$ to DMU$_{35}$. These ranking results agree with all the above observations, suggesting that the ranking obtained by our approach can represent the true ranking.