Article

A Novel Machine Learning Approach Combined with Optimization Models for Eco-efficiency Evaluation

by Mirpouya Mirmozaffari 1,*, Maziar Yazdani 2, Azam Boskabadi 3, Hamidreza Ahady Dolatsara 4, Kamyar Kabirifar 2 and Noorbakhsh Amiri Golilarz 5

1 Department of Industrial Manufacturing and Systems Engineering, The University of Texas at Arlington, Arlington, TX 76019, USA
2 Faculty of Built Environment, University of New South Wales, Sydney, NSW 2052, Australia
3 Department of Finance and Management Science, Carson College of Business, Washington State University, Pullman, WA 99163, USA
4 School of Management, Clark University, Worcester, MA 01610, USA
5 School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2020, 10(15), 5210; https://doi.org/10.3390/app10155210
Submission received: 2 July 2020 / Revised: 26 July 2020 / Accepted: 27 July 2020 / Published: 28 July 2020
(This article belongs to the Section Energy Science and Technology)

Abstract
Machine learning approaches have developed rapidly and now underpin many academic findings and discoveries. They are also widely applied in numerous industries, including cement companies. Cement companies in developing countries, despite advantages such as valuable mines, face many challenges. Optimization, as a key part of machine learning, has attracted growing attention. The main purpose of this paper is to combine a novel Data Envelopment Analysis (DEA) optimization approach, which in a first step identifies the best-performing Decision-Making Units (DMUs), with clustering algorithms in machine learning in a second step, in order to introduce the model and algorithm with the highest accuracy. In the optimization step, a two-stage model is converted into a simple standard single-stage model, and 24 cement companies from five developing countries are compared over 2014–2019. Window-DEA analysis is used because it strengthens the discrimination of the results, particularly for small samples, and allows year-by-year comparisons; applying window analysis can therefore help managers expand their comparison and evaluation. To find the most accurate model, the CCR (Charnes, Cooper and Rhodes), BCC (Banker, Charnes and Cooper) and Free Disposal Hull (FDH) DEA models are used to measure the efficiency of the decision processes; the FDH model relies only on free disposability to construct the production possibility set. In the machine learning step, a novel three-layer data mining filtering preprocess, designed by expert judgment for the clustering algorithms, is proposed to increase accuracy and eliminate unrelated attributes and data. Finally, the most efficient company, the best-performing model and the most accurate algorithm are identified. The results indicate that the 22nd company has the highest efficiency, with a score of 1 in all years. The FDH model yields the highest efficiency scores in all periods, with the BCC and CCR models in second and third place, respectively. The K-means algorithm achieves the highest accuracy in all three suggested filtering layers, with the hierarchical and density-based clustering algorithms in second and third place, respectively.

1. Introduction

Cement is a vital component of countries' economic development; however, cement production is associated with high energy consumption (e.g., electricity consumption) and environmental pollution (e.g., CO2 emissions) [1,2,3]. DEA, as one of the main optimization methods, has been widely used for calculating energy efficiency, environmental efficiency and eco-efficiency since it was first proposed by Charnes, et al. [4]. It is a non-parametric frontier technique in which the efficiency of a specific entity is calculated by its distance from the best-practice frontier formed by the best-performing entities in the group. DEA is a general method for assessing the efficiency of ecological systems [5].
Window analysis in DEA tracks inputs and outputs over time, describing the dynamic evolution of each DMU. At the same time, window analysis can compare the efficiency of DMUs under various windows in the same period and examine the stability of the efficiency scores. Window analysis has two noteworthy features:
(1)
It increases the number of DMUs in the reference set, which is an effective way to address an inadequate number of DMUs.
(2)
It not only measures the efficiency of each DMU in a cross-section but also measures the trend of each DMU's efficiency over the time series.
While the basic DEA models lead to linear programming problems, the FDH model leads to a mixed-integer programming problem. It has been implemented in different industries. FDH relaxes the convexity assumption of the basic DEA models. Free disposability means that if a specific pair of inputs and outputs is producible, then any pair with more input and less output is also producible.
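The free-disposability axiom amounts to a simple dominance test between units: a unit weakly dominates another if it uses no more of every input and produces no less of every output. A minimal sketch with hypothetical data:

```python
def dominates(inputs_a, outputs_a, inputs_b, outputs_b):
    """True if unit A weakly dominates unit B under free disposability:
    A uses no more of every input and yields no less of every output."""
    return (all(xa <= xb for xa, xb in zip(inputs_a, inputs_b)) and
            all(ya >= yb for ya, yb in zip(outputs_a, outputs_b)))

# Unit A uses less input and produces more output than B, so A dominates B.
print(dominates([2.0], [5.0], [3.0], [4.0]))  # True
# Here A uses more input, so it does not dominate.
print(dominates([4.0], [5.0], [3.0], [4.0]))  # False
```

Units dominated by some other observed unit in this sense are exactly those that end up below the FDH frontier.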
Data mining tools estimate future performance and constitute one of the foremost machine learning methods. Clustering is a statistical methodology whose aim is to form groups of similar units. The recommended clustering method aims to recognize performance patterns within raw data sets. Clustering methods can be divided into hierarchical, nonhierarchical, geometric, and other families [6].
The concept of eco-efficiency is based on the idea of consuming fewer resources to produce more goods and services while reducing waste and environmental pollution.
Searching for novel machine learning and optimization approaches to find the most appropriate model is still an open problem [7,8,9]. Thus, in response to this research gap, the combination of the above-mentioned machine learning and optimization methods in the current study is a new contribution to the literature.
In the current study, all available company datasets are applied to three exclusive models, and the DMUs' efficiencies are compared to uncover unfamiliar trends in cement companies. A dataset of 24 companies with one input, two intermediate products, and three outputs is used after converting the two-stage model to the proposed standard single-stage model. Window analysis in DEA with the CCR, BCC, and FDH models is applied to test and justify the differences between companies. DEA is widely used as a decision analysis tool because it does not seek a universal relationship for all units in the sample; it allows every unit in the data to have its own production function, and it then evaluates the efficiency of each unit by comparing it with the other units in the dataset. After running the three window analysis models in DEA SOLVER in the first step, the most appropriate filtering preprocess for the data mining clustering algorithms, given the nature and attributes of our data, is applied in the second step by expert judgment in WEKA. Note that WEKA only provides a list of algorithms and preprocessing approaches; the best scenario, given the nature and attributes of the data, must be chosen by experts. In this study, after considering all these points, we propose a three-layer filtering preprocess that yields the most accurate results.
The rest of the paper is organized as follows: Section 2 reviews the literature on the calculation of eco-efficiency and energy use efficiency, and specifies the prospective role of the current study. Section 3 presents the dataset description and data sources. Section 4 discusses the five parts of the research methodology (part 1: CCR, BCC and FDH; part 2: proposed model; part 3: window analysis; part 4: clustering; part 5: assessment process combining DEA and data mining). Section 5 presents the evaluation of the window analysis and clustering, followed by a discussion and conclusion of the experimental results in Section 6 and Section 7, respectively.

2. Background and Literature Review

In this section, first, the existing studies relevant to this paper are reviewed, then research gaps and the main contributions of this study are discussed.

2.1. Window Analysis Based on Energy, Environmental and Industrial Ecological Efficiency

Many studies have used the Malmquist Productivity Index (MPI) to examine panel data, but Oh and Heshmati [10] showed that MPI does not appropriately reflect the features of technical growth, so the resulting efficiency progress index may be biased. Additionally, desirable and undesirable outputs have different technical structures, including technical heterogeneity. Consequently, window analysis is considered a more appropriate approach. In the window analysis framework, the efficiency of a DMU in one period can be compared with its efficiency in another period, as well as with the efficiency of the other DMUs, reflecting the heterogeneity between DMUs over a sequence of overlapping windows [11,12]. Given China's serious resource consumption and ecological pollution, the expansion of its green economy will greatly influence the nation's future global economic development. An ecological DEA method was first used to analyze Green Economic Efficiency (GEE) at the regional level in China; then, based on panel data, the window analysis method was applied to examine the regional differences in China's GEE. The total GEE of China is gradually growing, the local variances are still noteworthy, and the development of GEE can help decrease regional disparities [13]. Halkos and Tzeremes [14] used the DEA window method to examine the environmental efficiency of 17 countries over 1980–2002. They evaluated the presence of a Kuznets-type relationship between countries' environmental efficiency and national income. Allowing for dynamic effects, they found that the adjustment to the target ratio is instantaneous. They also found that increased economic activity does not always guarantee environmental safety, so the path of development matters in addition to growth itself. Zhang, Cheng, Yuan and Gao [11] used a total-factor framework to examine energy efficiency in 23 developing countries over 1980–2005.
They explored total-factor energy efficiency and its trends using DEA window analysis. Seven countries showed little change in energy efficiency over time, while eleven countries had continuous reductions. Among the five countries showing continuous progress in total-factor energy efficiency, China had the highest growth.

2.2. Eco-Efficiency Evaluation with Desirable or Undesirable Inputs and Outputs

With increasing concerns about energy security and global warming, the problem of energy efficiency has gained significant attention from researchers. According to the International Energy Agency (IEA), energy efficiency is a way of managing and restraining the growth in energy consumption. A method is more energy-efficient than alternatives if it delivers more services for the same energy input, or the same services for less energy input.
The number of DEA applications involving polluting activities and undesired outputs is noteworthy. Korhonen and Luptacik [15] suggested an approach separating the ecological and technical efficiencies of power plants; they treat pollutants as inputs in order to increase desirable outputs while decreasing pollutants and inputs. From a technical perspective, Yang and Pollitt [16] treated several technically unavoidable features as undesirable outputs. In an experimental study, Zhang, et al. [17] used DEA to assess the eco-efficiency of gross domestic product in China. Liu, et al. [18] developed an approach to combine desirable and undesirable inputs and outputs. Chu, et al. [19] focused on the eco-efficiency of Chinese provincial-level regions, treating each region as a two-stage network structure: the first stage represents the production system and the second the pollution control system. Regarding pollution emissions as intermediate products, a two-stage DEA model is suggested to obtain the eco-efficiency of the entire two-stage structure. Khalili-Damghani and Shahmir [20] considered emissions as an undesirable output in the efficiency assessment of electric power production and distribution. Oggioni, et al. [21] used DEA to evaluate energy as an input producing both desirable outputs (goods) and undesirable outputs (CO2 emissions). Excluding undesirable outputs does not appear to capture the full scale of the production process; consequently, Zhou and Ang [22] assessed energy use efficiency in a joint production framework of both desirable and undesirable outputs.
To understand whether eco-efficiency is attributable to a sensible consumption of inputs or to an actual CO2 reduction resulting from environmental regulation, they evaluate settings in which CO2 emissions can be treated either as an input or as an undesirable output. Empirical results show that countries whose cement industries invest in technologically innovative kilns and adopt substitute fuels and raw materials in their manufacturing processes are eco-efficient. Environment, et al. [23] proposed two essential approaches that contribute to substantial additional reductions in CO2 emissions and increase the use of low-CO2 supplements, including the more efficient use of cement clinker. This efficient use gives a relative advantage to countries such as India and China, which are encouraged to renovate their production processes.

2.3. Machine Learning Clustering Algorithms in Energy Consumption

Innovative computational methods, particularly machine learning techniques, have the potential to tackle a wide range of challenging problems [24,25,26,27,28,29,30,31], and they have therefore been widely applied in different fields in recent years [32,33]. Yu, Wei and Wang [34] addressed the regional allocation of carbon emission reduction goals in China based on the particle swarm optimization algorithm, the fuzzy c-means clustering algorithm, and a Shapley decomposition approach. They clustered all regions into four classes according to the relevant carbon emission features and concluded that a larger carbon emission reduction quota should be assigned to regions with large total emissions and high emission intensity. Emrouznejad, et al. [35] considered the same problem as this article, although they applied an inverse DEA method and ignored the competitive and supportive relations between various sub-level trades and regions. Qing, et al. [36] extracted building energy consumption data with DBSCAN clustering and decision tree-based classification methods. Despite its ability to deeply characterize patterns of energy consumption in buildings, the algorithm is too intricate for rapid data processing in an energy consumption monitoring platform. Lim, et al. [37] eliminated abnormal energy consumption data through GESR, and then classified and predicted energy consumption data via canonical variate analysis (CVA). However, these methods only perform static analysis of energy consumption using historical data and consequently cannot accurately detect energy consumption anomalies. Thus, pattern recognition in companies should draw on diverse computational and combinatorial methods.

2.4. Two-Stage FDH Model in Production Technologies

To obtain FDH efficiency scores of DMUs with a two-stage network structure, solving linear/nonlinear mixed-integer programming problems is usually required. A recent study by Tavakoli and Mostafaee [38] shows that FDH two-stage models can instead be solved by examining only some simple ratios, without solving any mathematical programming problem. Their FDH models cover both cases of identical and different optimal peers in the two stages, and closed-form models are applied to calculate the overall and stage efficiency scores under different returns-to-scale (RTS) assumptions. Finally, several other recent FDH studies can be found in the literature [39,40,41,42,43].

2.5. Research Gap Analysis and Contributions

The DEA measure of energy use efficiency has two main benefits compared with the traditional definition, "the ratio of energy services to energy input." First, DEA accommodates several inputs (energy and non-energy) and several outputs (desirable and undesirable) in the production process. Second, DEA can also incorporate the objectives of the DMUs in evaluating energy use efficiency. However, far too little attention has been paid to developing an efficient solution method that couples DEA and data mining to find the best model, algorithm and DMU. In conclusion, the main contributions of this study can be summarized as follows:
(1)
This research provides a comprehensive comparison of several efficiency measures, offering insight into firm efficiency based on a novel machine learning approach combined with optimization models for eco-efficiency evaluation. This comparison is of considerable significance to cement company practitioners who wish to assess productivity and efficiency at the appropriate step of its progression.
(2)
An exclusive and easy-to-implement conversion of the two-stage model into a single, standard and simple stage model in window analysis is applied with DEA SOLVER, which ultimately allows comparing several efficient and inefficient DMUs. This model is proposed to fix the efficiency of a two-stage process and avoid the dependency on various weights. In fact, by converting the two-stage model to the proposed single stage, desirable and undesirable inputs and outputs can be evaluated with the simple suggested CCR, BCC and FDH models.
(3)
After the above-mentioned optimization part, the best filtering preprocess method is chosen by experts based on the nature of the data and attributes. The best results are extracted, and the best-fit preprocessing approach in the machine learning section is introduced. The DEA inputs and outputs are considered as potential attributes for the data mining clustering algorithms in WEKA. In addition, the data play the role of instances; efficient DMUs are assigned to class "yes" and inefficient DMUs to class "no".
(4)
After applying the aforementioned novel combined optimization and machine learning approaches, the most efficient model, company and algorithm are introduced. This can help managers conduct more effective processes.
(5)
Thus, in responding to this research gap, the following research questions are introduced and investigated:
  • RQ1: What problem solution approaches can be developed to find appropriate decisions?
  • RQ2: How can the robustness and accuracy of the designed approaches be demonstrated and evaluated, given a case study?
To address the first question, the best-fitting model in the optimization part should be selected by expert judgment according to the nature of the data, the external inputs and the final outputs. To address the second question, the data, the external inputs and final outputs, and the efficient and inefficient DMUs from the optimization step play the three roles of instances, attributes and class labels (yes or no), respectively, in the machine learning section. Therefore, to achieve the highest accuracy, the best-fit preprocessing, based on these three elements, should be implemented. In this study, the FDH model together with a three-layer filtering preprocess for the suggested clustering algorithms was the best scenario. In future studies with particular data and attributes in different industries, the best scenario may require filtering with fewer or more layers for the suggested algorithms, so practitioners and experts should select the most appropriate approaches based on their previous experience. In addition, other DEA models, such as the Slack-Based Measure (SBM), may yield more appropriate results.

3. Dataset Description

The standard dataset collected in this study covers six years, from 2014 to 2019, for 24 cement companies. The single input of the first stage, the two intermediate elements, and the three outputs of the second stage for the first company over 2014–2019 are presented in Table 1.
Table 2 shows the descriptive statistics of the data.
Energy consumption is the only input of the first stage. Cement production (the output of the first stage) and pollution control investment (the input of the second stage) are the two intermediate elements. Wastewater, waste gas and solid waste removed are the three outputs of the second stage. It should be noted that the waste material removed consists of wastewater, waste gas and solid waste.
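As an illustration of this layout, the records below mirror the variables just described; the figures are purely hypothetical placeholders, not the study's data:

```python
from collections import namedtuple

# Hypothetical record layout mirroring the Section 3 variables (values illustrative).
Record = namedtuple("Record", [
    "company", "year",
    "energy",               # sole input of the first stage
    "cement_prod",          # intermediate: output of the first stage
    "pollution_ctrl_inv",   # intermediate: input of the second stage
    "wastewater", "waste_gas", "solid_waste",  # outputs of the second stage
])

panel = [
    Record("DMU01", 2014, 120.5, 98.2, 14.1, 8.3, 6.7, 5.2),
    Record("DMU01", 2015, 118.0, 101.4, 15.0, 8.9, 7.1, 5.6),
]
print(len(panel), panel[0].energy)  # 2 120.5
```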

4. Research Methodology

The objective of this study, after converting the two-stage model to a single stage, is to compare companies' efficiency effectively. A comparative DEA window analysis methodology and clustering algorithms in data mining were established to determine the features of the cement companies in terms of DMUs and algorithms. The entire procedure can be divided into five steps, as follows:

4.1. FDH, CCR and BCC Models

The FDH model is a non-parametric method to measure the efficiency of production units or DMUs; it relaxes the convexity assumption of the basic DEA models. Computationally, solving the FDH program requires a mixed-integer programming problem, whereas the basic DEA models lead to linear programs [39].
If certain inputs can produce particular outputs, such input-output pairs are producible, and the set of all producible pairs is called the production possibility set (PPS).
In Figure 1 and Figure 2, if any activity such as 4 $(x_1, y_1)$ in Figure 1 belongs to the production possibility set (P), then the activity $(tx, ty)$ belongs to P for any positive scalar t. This property characterizes the CCR model, i.e., constant returns-to-scale (CRS). The hypothesis can be adapted to build production possibility sets under different assumptions. The BCC model is characterized by variable returns-to-scale (VRS), which comprises increasing returns-to-scale (IRS), decreasing returns-to-scale (DRS), and constant returns-to-scale (CRS). The production possibility set of the FDH model is obtained differently from the CCR and BCC models. Based on Figure 2, if 4 $(x_1, y_1)$ and 1 $(x_2, y_2)$ belong to the production possibility set of the CCR and BCC models, respectively, then any convex combination 5 $(a x_1 + b x_2,\ a y_1 + b y_2)$ with nonnegative scalars a, b such that a + b = 1 is also in the same production possibility set; this axiom is called convexity. Free disposability means that if a specific pair of inputs and outputs is producible, then any pair with more input and less output is also producible. The FDH model relies only on free disposability, rather than convexity, to construct its production possibility set. Accordingly, the frontier line of the FDH model is built directly from the observed inputs and outputs, allowing free disposal.
In Figure 3, the production possibility set of the FDH model is stepwise. The frontiers of the FDH model are presented considering two inputs and one output for six production units labeled A through F. In the BCC model, DMUs A, B, and C are efficient, but A, B, C, and F are efficient in the FDH model. The efficiency of observation E in the BCC model is defined as:
$$\theta_{E,\mathrm{BCC}}^{\text{input-oriented}} = \frac{OE_2}{OE}$$
But the efficiency of observation E in the FDH model is defined as:
$$\theta_{E,\mathrm{FDH}}^{\text{input-oriented}} = \frac{OE_1}{OE}$$
One version of the FDH model aims to minimize inputs while producing at least the given output levels; this is called the input-oriented model. The other, the output-oriented model, attempts to maximize outputs without requiring more inputs. The efficiency scores of the FDH model lie between 0 and 1, and under input orientation the FDH scores are always at least as high as those of the input-oriented VRS (BCC) model, which in turn are at least as high as those of the CCR model.
Therefore, the following relation holds:
$$\theta_{E,\mathrm{FDH}}^{\text{input-oriented}} \ge \theta_{E,\mathrm{BCC}}^{\text{input-oriented}} \ge \theta_{E,\mathrm{CCR}}^{\text{input-oriented}}$$
The $\mathrm{FDH}_{IO}$ model is represented as follows:
$$
\begin{aligned}
\min\quad & \theta_p \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta_p x_{ip}, && i = 1, \ldots, m \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{rp}, && r = 1, \ldots, s \\
& \sum_{j=1}^{n} \lambda_j = 1 \\
& \lambda_j \in \{0, 1\}, && j = 1, \ldots, n
\end{aligned}
$$
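Because each λ is binary and the λ's sum to one, exactly one peer is active in this program, so it collapses to simple ratio comparisons (in line with the closed-form results noted in Section 2.4). A minimal Python sketch with toy data:

```python
def fdh_input_efficiency(p, X, Y):
    """Input-oriented FDH score of unit p by peer enumeration.

    X[j][i]: input i of unit j; Y[j][r]: output r of unit j.
    With binary lambdas summing to one, exactly one peer is chosen, so the
    mixed-integer program reduces to: over all peers producing at least
    unit p's outputs, take the smallest worst-case input ratio.
    """
    best = 1.0  # unit p is always its own feasible peer with theta = 1
    for xj, yj in zip(X, Y):
        # peer must produce at least unit p's outputs
        if all(yr >= yp for yr, yp in zip(yj, Y[p])):
            # smallest theta that lets this peer's inputs fit under theta * x_p
            theta = max(xi / xpi for xi, xpi in zip(xj, X[p]))
            best = min(best, theta)
    return best

# Toy data: unit 1 matches unit 0's output with half the input.
X = [[4.0], [2.0], [5.0]]
Y = [[3.0], [3.0], [6.0]]
print(fdh_input_efficiency(0, X, Y))  # 0.5
```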
The efficiency of a given DMU is calculated based on the $\mathrm{CCR}_{IO}$ model as follows:
$$
\begin{aligned}
\min\quad & \theta_p \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta_p x_{ip}, && i = 1, \ldots, m \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{rp}, && r = 1, \ldots, s \\
& \lambda_j \ge 0, && j = 1, \ldots, n
\end{aligned}
$$
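For illustration, a CCR input-oriented envelopment program of this form can be solved directly as a linear program. The sketch below uses SciPy's `linprog` on toy data (an assumed setup, not the study's DEA SOLVER workflow):

```python
from scipy.optimize import linprog

def ccr_input_efficiency(p, X, Y):
    """Input-oriented CCR envelopment LP for unit p (a sketch using SciPy).

    Decision variables: [theta, lambda_1, ..., lambda_n].
    min theta  s.t.  sum_j lambda_j * x_ij <= theta * x_ip   (each input i)
                     sum_j lambda_j * y_rj >= y_rp           (each output r)
                     lambda_j >= 0, theta free
    """
    n, m, s = len(X), len(X[0]), len(Y[0])
    c = [1.0] + [0.0] * n                   # minimize theta only
    A_ub, b_ub = [], []
    for i in range(m):                      # input constraints
        A_ub.append([-X[p][i]] + [X[j][i] for j in range(n)])
        b_ub.append(0.0)
    for r in range(s):                      # output constraints, flipped to <=
        A_ub.append([0.0] + [-Y[j][r] for j in range(n)])
        b_ub.append(-Y[p][r])
    bounds = [(None, None)] + [(0.0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[0]

X = [[4.0], [2.0], [5.0]]
Y = [[3.0], [3.0], [6.0]]
print(round(ccr_input_efficiency(0, X, Y), 4))  # 0.5
```

Adding the constraint that the lambdas sum to one would turn this into the BCC (VRS) variant.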
The $\mathrm{BCC}_{IO}$ model is represented as follows:
$$
\begin{aligned}
\min\quad & \theta_p \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta_p x_{ip}, && i = 1, \ldots, m \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{rp}, && r = 1, \ldots, s \\
& \sum_{j=1}^{n} \lambda_j = 1 \\
& \lambda_j \ge 0, && j = 1, \ldots, n
\end{aligned}
$$

4.2. Proposed Model

Based on the nature of the problem, the adjusted relations between external inputs and final outputs determine how the efficiency measure is assessed. In the input-oriented model, the objective is to minimize external inputs and intermediate products while producing at least the given final output levels.

4.2.1. A New Approach in DEA Two-Stage Model

Converting the two-stage model to a simple standard model is proposed in this study with the following notation:
  • $X_{ij}$ (i = 1, …, m): energy consumption, the input of the first stage
  • $Y_{rj}$ (r = 1, …, s): wastewater removed, a desirable output of the second stage
  • $E_{tj}$ (t = 1, …, v): waste gas removed, a desirable output of the second stage
  • $F_{zj}$ (z = 1, …, q): solid waste removed, a desirable output of the second stage
  • $M_{hj}$ (h = 1, …, d): cement production, the desirable output of the first stage
  • $N_{cj}$ (c = 1, …, k): pollution control investment, the input of the second stage
  • $DMU_j$ (j = 1, …, n): Decision-Making Units.
Figure 4 shows the structure of the proposed model:
Based on Figure 4, the two-stage model is treated as a single stage, where each intermediate element, depending on whether it is desirable or undesirable, is considered part of the final desirable outputs or of the inputs in the proposed standard single-stage model. Descriptions of the dimensionless parameters used in the dual proposed model are provided in Table 3.
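The folding of the intermediates into the single-stage model can be sketched as follows; the figures are illustrative only, not the study's data:

```python
def to_single_stage(energy, cement, pollution_inv, wastewater, waste_gas, solid):
    """Fold the two-stage network into the proposed standard single-stage model:
    both the stage-1 external input (energy) and the stage-2 input
    (pollution-control investment) become inputs of the single stage, while
    the desirable intermediate (cement production) joins the three
    stage-2 outputs as final desirable outputs."""
    inputs = [energy, pollution_inv]                  # X, N
    outputs = [cement, wastewater, waste_gas, solid]  # M, Y, E, F
    return inputs, outputs

# One hypothetical DMU-year
ins, outs = to_single_stage(120.5, 98.2, 14.1, 8.3, 6.7, 5.2)
print(ins, outs)  # [120.5, 14.1] [98.2, 8.3, 6.7, 5.2]
```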
Finally, the $\mathrm{CCR}_{IO}$ (CCR input-oriented) and $\mathrm{BCC}_{IO}$ (BCC input-oriented) primal (multiplier) and dual (envelopment) proposed models are discussed in detail below:

4.2.2. Primal Proposed Model in $\mathrm{CCR}_{IO}$

$$
\begin{aligned}
\max\quad & \sum_{r=1}^{s} u_r Y_{rp} + \sum_{h=1}^{d} l_h M_{hp} + \sum_{t=1}^{v} b_t E_{tp} + \sum_{z=1}^{q} a_z F_{zp} \\
\text{s.t.}\quad & \sum_{i=1}^{m} v_i X_{ip} + \sum_{c=1}^{k} g_c N_{cp} = 1 \\
& \sum_{r=1}^{s} u_r Y_{rj} + \sum_{h=1}^{d} l_h M_{hj} + \sum_{t=1}^{v} b_t E_{tj} + \sum_{z=1}^{q} a_z F_{zj} - \sum_{i=1}^{m} v_i X_{ij} - \sum_{c=1}^{k} g_c N_{cj} \le 0, && j = 1, \ldots, n \\
& u_r, l_h, v_i, g_c, b_t, a_z \ge 0
\end{aligned}
$$

4.2.3. Primal Proposed Model in $\mathrm{BCC}_{IO}$

$$
\begin{aligned}
\max\quad & \sum_{r=1}^{s} u_r Y_{rp} + \sum_{h=1}^{d} l_h M_{hp} + \sum_{t=1}^{v} b_t E_{tp} + \sum_{z=1}^{q} a_z F_{zp} + w \\
\text{s.t.}\quad & \sum_{i=1}^{m} v_i X_{ip} + \sum_{c=1}^{k} g_c N_{cp} = 1 \\
& \sum_{r=1}^{s} u_r Y_{rj} + \sum_{h=1}^{d} l_h M_{hj} + \sum_{t=1}^{v} b_t E_{tj} + \sum_{z=1}^{q} a_z F_{zj} - \sum_{i=1}^{m} v_i X_{ij} - \sum_{c=1}^{k} g_c N_{cj} + w \le 0, && j = 1, \ldots, n \\
& u_r, l_h, v_i, g_c, b_t, a_z \ge 0, \qquad w \ \text{free}
\end{aligned}
$$

4.2.4. Dual Proposed Model in $\mathrm{CCR}_{IO}$

$$
\begin{aligned}
\min\quad & \theta_p \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j X_{ij} \le \theta_p X_{ip}, && i = 1, \ldots, m \\
& \sum_{j=1}^{n} \lambda_j N_{cj} \le \theta_p N_{cp}, && c = 1, \ldots, k \\
& \sum_{j=1}^{n} \lambda_j Y_{rj} \ge Y_{rp}, && r = 1, \ldots, s \\
& \sum_{j=1}^{n} \lambda_j M_{hj} \ge M_{hp}, && h = 1, \ldots, d \\
& \sum_{j=1}^{n} \lambda_j E_{tj} \ge E_{tp}, && t = 1, \ldots, v \\
& \sum_{j=1}^{n} \lambda_j F_{zj} \ge F_{zp}, && z = 1, \ldots, q \\
& \lambda_j \ge 0, \qquad \theta_p \ \text{free}
\end{aligned}
$$

4.2.5. Dual Proposed Model in $\mathrm{BCC}_{IO}$

$$
\begin{aligned}
\min\quad & \theta_p \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j X_{ij} \le \theta_p X_{ip}, && i = 1, \ldots, m \\
& \sum_{j=1}^{n} \lambda_j N_{cj} \le \theta_p N_{cp}, && c = 1, \ldots, k \\
& \sum_{j=1}^{n} \lambda_j Y_{rj} \ge Y_{rp}, && r = 1, \ldots, s \\
& \sum_{j=1}^{n} \lambda_j M_{hj} \ge M_{hp}, && h = 1, \ldots, d \\
& \sum_{j=1}^{n} \lambda_j E_{tj} \ge E_{tp}, && t = 1, \ldots, v \\
& \sum_{j=1}^{n} \lambda_j F_{zj} \ge F_{zp}, && z = 1, \ldots, q \\
& \sum_{j=1}^{n} \lambda_j = 1 \\
& \lambda_j \ge 0, \qquad \theta_p \ \text{free}
\end{aligned}
$$
Based on the above primal and dual models, the dual proposed model for $\mathrm{FDH}_{IO}$ (FDH input-oriented) is presented below:

4.2.6. Dual Proposed Model in $\mathrm{FDH}_{IO}$

$$
\begin{aligned}
\min\quad & \theta_p \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j X_{ij} \le \theta_p X_{ip}, && i = 1, \ldots, m \\
& \sum_{j=1}^{n} \lambda_j N_{cj} \le \theta_p N_{cp}, && c = 1, \ldots, k \\
& \sum_{j=1}^{n} \lambda_j Y_{rj} \ge Y_{rp}, && r = 1, \ldots, s \\
& \sum_{j=1}^{n} \lambda_j M_{hj} \ge M_{hp}, && h = 1, \ldots, d \\
& \sum_{j=1}^{n} \lambda_j E_{tj} \ge E_{tp}, && t = 1, \ldots, v \\
& \sum_{j=1}^{n} \lambda_j F_{zj} \ge F_{zp}, && z = 1, \ldots, q \\
& \sum_{j=1}^{n} \lambda_j = 1 \\
& \lambda_j \in \{0, 1\}, \qquad \theta_p \ \text{free}, \qquad j = 1, \ldots, n
\end{aligned}
$$

4.3. Window Analysis

Window analysis combines contemporaneous and intertemporal analysis: DEA is run consecutively on overlapping periods of constant width (called a window). Once the window width has been specified, all observations within it are analyzed in an intertemporal manner, referred to as locally intertemporal analysis. The technique was originally suggested by Charnes, et al. [44] to calculate efficiency in cross-sectional and time-varying data. Additionally, when window-DEA is used, the number of observations taken into account increases by a factor equal to the window width, which is valuable when dealing with small sample sizes, as it raises the discriminatory power of the method. Consequently, two features should be balanced when selecting the window width:
  • The window should be wide enough to include the minimum number of DMUs required for discrimination;
  • It should also be narrow enough to guarantee that technological change within the window is insignificant, so that it does not permit confusing or biased comparisons among DMUs belonging to periods far apart.
For modeling, consider n DMUs (n = 1, 2, …, N) over T periods (t = 1, 2, …, T), each using r inputs to generate s outputs. Therefore, the sample has N × T observations.
$DMU_n^t$ denotes DMU n in period t, with the r-dimensional input vector
$$X_n^t = (X_{n1}^t, X_{n2}^t, X_{n3}^t, X_{n4}^t, \ldots, X_{nr}^t)$$
and the s-dimensional output vector
$$Y_n^t = (Y_{n1}^t, Y_{n2}^t, Y_{n3}^t, Y_{n4}^t, \ldots, Y_{ns}^t).$$
A window $k_w$ starts at time k (1 ≤ k ≤ T) with width w (1 ≤ w ≤ T − k).
Therefore, the input matrix for the window analysis is
$$X_{k_w} = (X_1^k, \ldots, X_N^k,\ X_1^{k+1}, \ldots, X_N^{k+1},\ \ldots,\ X_1^{k+w}, \ldots, X_N^{k+w})$$
and the output matrix is
$$Y_{k_w} = (Y_1^k, \ldots, Y_N^k,\ Y_1^{k+1}, \ldots, Y_N^{k+1},\ \ldots,\ Y_1^{k+w}, \ldots, Y_N^{k+w}).$$
Since $\mathrm{CCR}_{IO}$ and $\mathrm{BCC}_{IO}$ are applied, the input-oriented window analysis model based on CRS is as follows [45]:
$$
\begin{aligned}
\min\quad & \theta \\
\text{s.t.}\quad & \theta X_n - \lambda X_{k_w} \ge 0 \\
& \lambda Y_{k_w} - Y_n \ge 0 \\
& \lambda_n \ge 0, && n = 1, 2, \ldots, N \times W
\end{aligned}
$$
In this model, θ is a scalar that determines the rate of decrease in inputs, and θ = 1 indicates an efficient unit. Meanwhile, $X_{k_w}$ is the input matrix of the window starting at period k with width w, and $Y_{k_w}$ is the corresponding output matrix. Finally, λ is an (N × W)-dimensional vector of constants, i.e., the reference-set weights.
For the window analysis, the number of windows K is given by:
K = T − W + 1
where N is the number of Decision-Making Units (DMUs), i.e., the cement companies in this paper; T is the number of periods; W is the window width; and K is the number of windows.
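The window construction above can be sketched in plain Python. This is an illustrative sketch, not the authors' code: the variable names (`inputs`, `outputs`, `build_windows`) and the toy data are ours.

```python
# Sketch of the window construction described above: for T periods and
# window width W there are K = T - W + 1 windows, and each window pools
# the N DMUs over W consecutive periods (N * W observations per window).
# All names and data here are illustrative, not from the paper's dataset.

def build_windows(inputs, outputs, W):
    """inputs[t][n] / outputs[t][n]: input/output vectors of DMU n in period t."""
    T = len(inputs)
    K = T - W + 1  # number of windows
    windows = []
    for k in range(K):
        # Pool the N DMUs over periods k .. k+W-1 into one window.
        X_kw = [x for t in range(k, k + W) for x in inputs[t]]
        Y_kw = [y for t in range(k, k + W) for y in outputs[t]]
        windows.append((X_kw, Y_kw))
    return windows

# Six periods, two DMUs, width three -> four windows of 2 * 3 = 6 observations.
inputs = [[[1.0], [2.0]] for _ in range(6)]
outputs = [[[3.0], [4.0]] for _ in range(6)]
wins = build_windows(inputs, outputs, 3)
```

Each window's pooled matrices would then be fed to the input-oriented LP above, with every DMU-period pair inside the window treated as a separate observation.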

4.4. Clustering

Many structures can be considered for pattern-recognition tasks, so the appropriate use of machine learning approaches in applied problems is important. Although many clustering approaches have been proposed, there is no consensus on which are more appropriate for a given dataset, and it is essential to compare approaches widely across many possible settings. A methodical assessment of three well-known clustering algorithms is therefore applied.
Recently, many papers have compared clustering algorithms, and most of them used the WEKA tool:
Khalfallah and Slama [46] compared six well-known algorithms, using the WEKA clustering tool (version 3.7.12), to find the most accurate: Canopy, Cobweb, EM, FarthestFirst, FilteredClusterer and MakeDensityBasedClusterer. The FarthestFirst algorithm showed the best performance in terms of accuracy and time.
DeFreitas and Bernard [47] compared three clustering algorithms, hierarchical, density-based and K-means, to determine the most appropriate; the density-based algorithm gave a better distribution among clusters.
Ratnapala, et al. [48] examined student access behaviour using K-means clustering in WEKA; the test results revealed that 40% of one student cluster were, in one way or another, passive online learners.
Clustering is an unsupervised learning method for grouping similar data points. A clustering algorithm assigns a large set of data points to a smaller number of groups, such that points in the same group share similar properties while points in different clusters are dissimilar. To validate the proposed model and to test it, the data were divided into two groups, training data and test data. Cross-validation (CV) is one method of testing the effectiveness of machine learning models; it is a resampling procedure used to evaluate a model when data are limited. With this method, the final outputs are reviewed and the validity of the research is verified. In this study, 70 percent of the data were chosen as the training set, and 30 percent as the test set; the test data were selected at random using Excel. Finally, to compare the models and find the best one, the three selected clustering algorithms in the WEKA software are discussed below [49]:
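The paper performs the 70/30 random split in Excel; the same split can be reproduced programmatically. The sketch below is ours (function name and seed are illustrative assumptions):

```python
# Illustrative 70/30 train/test split with a fixed seed, mirroring the
# random selection the paper performs in Excel. Names here are ours.
import random

def split_70_30(records, seed=42):
    shuffled = records[:]                      # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)      # seeded for reproducibility
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

train_set, test_set = split_70_30(list(range(100)))
```

A fixed seed makes the split reproducible, which matters when several algorithms are compared on the same partition.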

4.4.1. K-MEANS Algorithm

This algorithm is a cluster-analysis technique that partitions N observations into K clusters, each observation belonging to the cluster with the nearest mean. Initially, the k centroids need to be selected. These centers should be chosen carefully, placed as far from each other as possible, because different starting positions lead to different outcomes. The next step is to take each point of the data set and associate it with the nearest center. Once all points have been assigned, the centroids are recalculated from the resulting clusters. Given the k new centroids, a new assignment is made between the data points and the nearest new center. The process is repeated until no further changes occur, i.e., the centers no longer move. K-means is an appropriate algorithm for finding similarities among units based on distance measures with small datasets [50].
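The assign/recompute loop described above can be sketched in a few lines of plain Python (1-D points, naive initialisation; this is a minimal illustration, not WEKA's SimpleKMeans implementation):

```python
# Minimal K-means sketch (pure Python, 1-D points, Euclidean distance):
# assign each point to the nearest center, recompute centroids, repeat
# until the centroids stop moving. Initialisation here is naive (first k
# points), which is exactly why, as the text notes, different starting
# positions can lead to different outcomes.
def kmeans(points, k, iters=100):
    centers = points[:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:  # converged: centroids stopped moving
            break
        centers = new_centers
    return centers, clusters

centers, clusters = kmeans([1.0, 1.1, 0.9, 10.0, 10.2, 9.8], 2)
```

On this toy data the loop separates the two well-spaced groups after a couple of iterations.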

4.4.2. Hierarchical Cluster Algorithm

This algorithm is a cluster-analysis procedure used to build a hierarchy of clusters. Hierarchical clustering usually falls into two categories: agglomerative and divisive. Agglomerative clustering begins by treating each individual object as a single cluster; then, based on the similarities (distances) evaluated in successive iterations, it agglomerates (merges) the closest pair of clusters satisfying some similarity criterion, until all the data are in one cluster. Divisive clustering works analogously but in the opposite direction: initially all objects are assumed to belong to a single cluster, which is then successively split into smaller clusters until each cluster contains a single object. The technique is commonly used for the evaluation of numerical data and is a typical clustering method. The main goal of hierarchical clustering is to capture the underlying structure of the time series data; it produces a set of nested clusters organized as a hierarchical tree [51].
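The agglomerative direction described above can be sketched as follows (pure Python, 1-D points, single linkage; an illustrative sketch, not WEKA's implementation):

```python
# Minimal agglomerative (bottom-up) clustering sketch with single linkage:
# start from singleton clusters and repeatedly merge the closest pair of
# clusters until the requested number of clusters remains.
def agglomerative(points, n_clusters):
    clusters = [[p] for p in points]

    def linkage(a, b):
        # Single linkage: distance between the closest pair across clusters.
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > n_clusters:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

result = agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], 3)
```

Recording the merge order (instead of stopping at `n_clusters`) would yield the nested hierarchical tree mentioned in the text.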

4.4.3. Make Density Based Cluster

This algorithm finds clusters starting from the estimated density distribution of the nodes. It is one of the most common clustering algorithms in the scientific literature. Given a set of points in some space, it groups together points that are closely packed (points with many nearby neighbours), marking as outliers points that lie alone in low-density regions. Density-based clustering partitions the data set by density: high-density points are separated from low-density points according to a threshold. This idea is the basis of all density-based clustering algorithms. However, the basic algorithm does not handle varying densities; newer algorithms address this limitation [52].
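The core-point/outlier logic described above can be illustrated with a minimal DBSCAN-style sketch (pure Python, 1-D; `eps`, `min_pts` and all names are illustrative assumptions, not WEKA's MakeDensityBasedClusterer):

```python
# Minimal density-based clustering sketch: points with at least `min_pts`
# neighbours within distance `eps` seed clusters that grow through
# density-connected neighbours; sparse points are labelled outliers (-1).
def density_cluster(points, eps=0.5, min_pts=2):
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points)) if abs(points[i] - points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # outlier: too few neighbours (may be relabelled later)
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(nbrs)
        while queue:                      # grow the cluster through dense points
            j = queue.pop()
            if labels[j] is None or labels[j] == -1:
                labels[j] = cluster
                nj = neighbours(j)
                if len(nj) >= min_pts:    # only dense points expand the cluster
                    queue.extend(nj)
    return labels

labels = density_cluster([1.0, 1.2, 1.4, 8.0, 8.3, 20.0])
```

Note that `eps` is a single global density threshold, which is exactly the limitation mentioned in the text: one threshold cannot capture clusters of different densities.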

4.4.4. Proposed Three-Layer Filtering Pre-Process

Preprocessing is a significant and prerequisite step in data mining. Feature selection (FS) is a procedure for selecting the most useful features; some features may be redundant, and others may be irrelevant and noisy. When the data set contains irrelevant data, preprocessing of the dataset is compulsory. The preprocessing step includes:
(1) Data cleaning: handling missing values by ignoring the tuple or filling in the value with specific data, and handling noise using binning methods, clustering, combined human and machine review, and regression.
(2) Data integration: data often come from several sources in a data warehouse and may need to be merged for further analysis; schema integration and redundancy are the main problems in data integration.
(3) Data transformation: converting the data from its current format into the format required for data mining; normalization, smoothing, aggregation and generalization are examples of transformation.
(4) Data reduction: analysis of very large data sets takes a long time; reduction can be achieved using data cube aggregation, dimension reduction, data compression, numerosity reduction, discretization and concept hierarchy generation.
For the first three preprocessing tasks, WEKA provides a "filter" option. Filters fall into two categories, supervised and unsupervised, and in both categories there are separate filters for attributes and for instances. After data cleaning, integration and transformation, data reduction is performed to obtain the task-relevant data. For data reduction, an "Attribute Selection" option is available; it contains several kinds of feature-selection programs for the wrapper, filter and embedded methods. Using attribute and instance filters, attributes and instances can be added, removed and changed. In this study, a special pre-processing designed by experts is applied: a three-layer filtering pre-process applied to the dataset to make imbalanced data balanced. The procedure consists of three steps:
  • Step A: Discretization (unsupervised attribute filter): this step converts data from one form to another so that they can be arranged in an orderly way. Several terminologies describe the two data types, such as 'quantitative' vs. 'qualitative', 'continuous' vs. 'discrete', 'ordinal' vs. 'nominal', and 'numeric' vs. 'categorical'. Selecting the most appropriate discretization method is important; here, data are classified as quantitative or qualitative [53].
  • Step B: Stratified remove folds (supervised instance filter): for our specific data set, this filter plays a vital role in increasing the accuracy of all algorithms. It takes a data set and outputs a specified fold for cross-validation.
  • Step C: Attribute selection (supervised attribute filter): this step is applied to choose the best attributes for determining the best scenario.
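The three layers can be pictured with a plain-Python analogue of the WEKA filters. Everything below is a hypothetical sketch: the function names, the equal-width binning, the fold logic and the zero-variance attribute criterion are our illustrative assumptions, not the paper's (or WEKA's) exact procedures.

```python
# Hypothetical plain-Python analogue of the three filtering layers
# (cf. WEKA's Discretize, StratifiedRemoveFolds and AttributeSelection).
# All thresholds and helper names here are illustrative assumptions.

def discretize(values, bins=3):
    """Step A: equal-width binning of a numeric attribute into bin indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0          # guard against a constant attribute
    return [min(int((v - lo) / width), bins - 1) for v in values]

def stratified_fold(rows, labels, folds=3, keep=0):
    """Step B: keep one fold while preserving the class proportions."""
    kept = []
    for cls in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == cls]
        kept.extend(members[keep::folds])    # every `folds`-th member per class
    return kept

def select_attributes(rows, labels):
    """Step C: placeholder selector that drops attributes whose value never
    varies; a real supervised selector would score attributes against labels."""
    cols = list(zip(*rows))
    keep = [i for i, col in enumerate(cols) if len(set(col)) > 1]
    return [[row[i] for i in keep] for row in rows]
```

Chaining the three functions in order (A, then B, then C) mirrors the layered filtering applied to the dataset before clustering.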
The classes-to-clusters evaluation in WEKA is used as the primary assessor for the output of each step described above. It is applied because it is the only clustering evaluator that yields a numeric accuracy as a basis for comparison across algorithms. The accuracy outcomes comprise true-positive and true-negative values, and the clustering performance (accuracy) is computed by the following formula:
Accuracy = (True positive + True negative) / (True positive + True negative + False positive + False negative)
Here, a true positive is an outcome where the model correctly predicts the positive class, and a true negative one where it correctly predicts the negative class; a false positive is an outcome where the model incorrectly predicts the positive class, and a false negative one where it incorrectly predicts the negative class.
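The accuracy formula above reduces to a one-line computation from the four confusion counts (the counts below are illustrative, not the paper's results):

```python
# Accuracy from confusion counts, as defined in the formula above:
# correct predictions (TP + TN) over all predictions (TP + TN + FP + FN).
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

print(accuracy(40, 50, 5, 5))  # 90 correct out of 100 -> 0.9
```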

4.5. Assessment Procedure of the Machine Learning and Optimization

Figure 5 shows the combined optimization and machine learning procedure.

5. Evaluation in Window Analysis and Clustering Algorithms

The DEA window analysis in optimization is evaluated in the first part of this section.

5.1. Evaluation in the Window Analysis Method

Introducing the basic parameters of the window analysis for the case study plays an important role in the eco-efficiency evaluation.

5.1.1. Calculation for the Basic Parameters of Window Analysis

In this study, the width of the window is three. Therefore, according to the relation in Equation (66), given six years, there are four windows:
K = T − W + 1 = 6 − 3 + 1 = 4 (T = 6, W = 3)
Thus K is four. The first window consists of the first, second and third years. In the second window, the first year is dropped and the fourth year is added (second window: the second, third and fourth years), and so on until the last window.
The number of units considered in the window analysis of the present study is given in Table 4:

5.1.2. Discussion in Window Analysis-FDH, BCC and CCR Models

The efficiency values of the 24 cement companies, based on the abovementioned parameters with a three-year window width, are listed for the FDH model in Table 5:
Based on the FDH model for Figure 6:
  • The 22nd company has the highest efficiency score with an efficiency score of 1.
  • The 16th company has the second-highest efficiency score with an efficiency score of 0.998.
  • The 19th company has the third-highest efficiency score with an efficiency score of 0.995.
  • The 6th company has the lowest efficiency score, with an efficiency score of 0.913.
Based on the FDH model for Table 6:
  • The 22nd company has the highest efficiency score with an efficiency score of 1 for all windows.
  • 10th (0.957, 0.960, 0.992, 1), 19th (0.984, 0.997, 1, 1), and 23rd (0.960, 0.985, 1, 1) companies have an ascending trend of efficiency from the beginning (1st window) to the end (4th window).
  • Only 11th (0.967, 0.960, 0.950, 0.948) company has a descending trend of efficiency from the beginning (1st window) to the end (4th window).
  • The other cement companies show both ascending and descending movements between the first and the fourth window.
Based on the FDH model for Table 7:
  • The 22nd company has the highest efficiency score with an efficiency score of 1 for all years.
  • 19th (0.954, 0.997, 1, 1, 1, 1), and 23rd (0.924, 0.958, 1, 1, 1, 1) companies have an ascending trend of efficiency from the beginning (1st year or 2014) to the end (6th year or 2019).
  • There is no descending trend of efficiency from the beginning (1st year or 2014) to the end (6th year or 2019).
  • The other cement companies show both ascending and descending movements between the first year (2014) and the sixth year (2019).
The efficiency values of the 24 cement companies, based on the suggested parameters with a three-year window width, are listed for the BCC model in Table 8:
Based on the BCC model for Figure 7:
  • The 22nd company has the highest efficiency score, with an efficiency score of 1.
  • The 16th company has the second-highest efficiency score with an efficiency score of 0.996.
  • The 19th company has the third-highest efficiency score with an efficiency score of 0.993.
  • The 6th company has the lowest efficiency score, with an efficiency score of 0.911.
Based on the BCC model for Table 9:
  • The 22nd company has the highest efficiency score with an efficiency score of 1 for all windows.
  • 10th (0.955, 0.958, 0.991, 1), 19th (0.983, 0.996, 1, 1), and 23rd (0.958, 0.984, 1, 1) companies have an ascending trend of efficiency from the beginning (1st window) to the end (4th window).
  • Only 11th (0.966, 0.959, 0.949, 0.946) company has a descending trend of efficiency from the beginning (1st window) to the end (4th window).
  • The other cement companies show both ascending and descending movements between the first and the fourth window.
Based on the BCC model for Table 10:
  • The 22nd company has the highest efficiency score with an efficiency score of 1 for all years.
  • 19th (0.952, 0.996, 1, 1, 1, 1), and 23rd (0.923, 0.956, 1, 1, 1, 1) companies have an ascending trend of efficiency from the beginning (1st year or 2014) to the end (6th year or 2019).
  • There is no descending trend of efficiency from the beginning (1st year or 2014) to the end (6th year or 2019).
  • The other cement companies show both ascending and descending movements between the first year (2014) and the sixth year (2019).
The efficiency values of the 24 cement companies, based on the recommended parameters with a three-year window width, are listed for the CCR model in Table 11:
Based on the CCR model for Figure 8:
  • The 22nd company has the highest efficiency score with an efficiency score of 1.
  • The 16th company has the second-highest efficiency score with an efficiency score of 0.994.
  • The 19th company has the third-highest efficiency score with an efficiency score of 0.991.
  • The 6th company has the lowest efficiency score, with an efficiency score of 0.909.
Based on the CCR model for Table 12:
  • The 22nd company has the highest efficiency score with an efficiency score of 1 for all windows.
  • 10th (0.949, 0.955, 0.990, 1), 19th (0.982, 0.995, 1, 1), and 23rd (0.957, 0.983, 1, 1) companies have an ascending trend of efficiency from the beginning (1st window) to the end (4th window).
  • Only 11th (0.965, 0.958, 0.948, 0.945) company has a descending trend of efficiency from the beginning (1st window) to the end (4th window).
  • The other cement companies show both ascending and descending movements between the first and the fourth window.
Based on the CCR model for Table 13:
  • The 22nd company has the highest efficiency score with an efficiency score of 1 for all years.
  • 19th (0.950, 0.995, 1, 1, 1, 1), and 23rd (0.922, 0.955, 1, 1, 1, 1) companies have an ascending trend of efficiency from the beginning (1st year or 2014) to the end (6th year or 2019).
  • There is no descending trend of efficiency from the beginning (1st year or 2014) to the end (6th year or 2019).
  • The other cement companies show both ascending and descending movements between the first year (2014) and the sixth year (2019).
Therefore, based on the abovementioned results:
  • The 22nd company has the highest efficiency score in the FDH, BCC and CCR models;
  • The 6th company has the lowest efficiency score in the FDH, BCC and CCR models;
  • The FDH, BCC and CCR models produce the same ranking for all DMUs (the first, second and third ranks go to the 22nd, 16th and 19th companies, respectively, and the lowest rank to the 6th company);
  • The FDH model ranks first, with the highest total average efficiency score over all 24 DMUs;
  • The BCC model has the second-highest overall average efficiency score over all 24 DMUs;
  • The CCR model has the third-highest overall average efficiency score over all 24 DMUs.

5.2. Evaluation in the Clustering Algorithms

After applying the filtering steps (Step A, Step B and Step C), the accuracy and average accuracy at each stage are presented in Table 14, Table 15 and Table 16.
It can be concluded from Table 14, Table 15 and Table 16 that, as the number of filtering layers increases:
  • the maximum accuracy across the three assessment approaches improves;
  • the average accuracy across the three models, for each filtering step, increases;
  • the accuracy of every individual algorithm increases as well.
Finally, the following accuracy ordering holds for the three suggested clustering algorithms:
K-MEANS > HIERARCHICAL CLUSTERER > MAKE DENSITY BASED CLUSTERER
The FDH model has the highest accuracy at Steps A–C. In fact, for our particular data, attributes and instances, K-means based on the FDH model has the best performance in the proposed combined DEA and data mining methodology.

6. Discussion

The efficiency of the proposed method provides an opportunity to recognize patterns across the whole combined DEA and data mining analysis over the selected period (six years, 2014–2019). Since the cement industry is one of the foremost producers of environmentally harmful material with an undesirable by-product, specific attention is given to pollution-control investment, as an undesirable output, when evaluating energy-use efficiency. The study concentrates on answering four questions. First, does the conversion of the two-stage model into a simple, standard single-stage model have any positive impact on eco-efficiency evaluation? Second, does the proposed FDH model have any positive effect on eco-efficiency? Third, does combining DEA and data mining have any positive effect on eco-efficiency evaluation? Fourth, what are the advantages of the suggested three-layer filtering pre-processing for clustering?
To answer the first question, the model is proposed to fix the efficiency of the two-stage process and to avoid the dependency on various weights. Decreasing pollution-control investment and energy consumption (the inputs of the proposed single-stage model), and increasing waste material removed and cement production (its outputs), have a positive influence on the efficiency of cement companies.
For the second question, one interesting feature of the FDH model, owing to the nonconvexity of its efficiency frontier, is that targets are linked to observed units. This is better matched to real life because, in some conditions, comparing a unit with a real unit is more practical than comparing it with a simulated one. Moreover, unlike the CCR and BCC models, the FDH model does not rely on the convexity assumption. Consequently, the model has a discrete nature, meaning that the efficient target point for an inefficient DMU can only be assigned to one of the individually observed DMUs. The efficiency analysis is therefore carried out relative to the other observed DMUs instead of a hypothetical efficiency frontier. This has the benefit that the achievement target for an inefficient DMU, given by its efficient target point, is more reliable than in the CCR and BCC models.
To answer the third question, after the efficiency evaluation in the window analysis identifies the best-performing company, the data mining clustering algorithms play an important role in finding the superior model and algorithm.
For the last question, in order to obtain more suitable data, the expert-designed three-layer filtering pre-processing removes unrelated data and attributes, increases the accuracy of the whole system at each step, and plays an important role in the quality of the algorithms. It is therefore important to compare approaches extensively across many possible settings, which is why a methodical assessment of three recognized clustering algorithms was performed to find the best algorithm precisely. Comparing clustering algorithms in the current study thus provides a valuable comparison of various data mining techniques.

7. Conclusions

In this study, it was described how companies operate more efficiently in the presence of similar companies. Companies with a lower score can improve their efficiency by implementing the patterns executed by more efficient companies. The more information is taken into account, the more accurate and accessible the data become. Each company needs an efficiency measurement to know its current status, and efficient companies are the best reference for increasing the efficiency of inefficient ones. The proposed single-stage FDH model has a more positive impact on the efficiency score than the other suggested models, CCR and BCC. One advantage of the FDH model is that, unlike the CCR and BCC models, it does not rely on the convexity assumption; the model thus has a discrete nature, meaning that the efficient target point for an inefficient DMU can only be assigned to one of the individually observed DMUs. The proposed approach, the geometric averages, and the results and predictions derived from the periods and windows in the window analysis can help practitioners compare the efficiency of uncertain cases and act accordingly. In the future, applying the Malmquist Productivity Index (MPI) and comparing the final productivity results with the window analysis will be valuable; using fuzzy and random data in future window analyses will also be an interesting final comparison. Since the proposed window analysis method is based on a moving average, it is useful for detecting efficiency trends over time. The results and predictions can help the managers of these companies, and other managers who adopt this approach, to achieve a higher relative efficiency score. In addition, managers can compare the efficiency of the current year with that of other similar companies over past years.
Finally, before the best model and algorithm were introduced, the expert-proposed three-layer filtering pre-processing, applied to our particular data, eliminated unrelated data and attributes, increased the precision of the entire system at each step, and played an important role in the quality of the algorithms. In future work, expert judgment should be translated into probability in a proper way to see how it performs. In addition, the approach can be related to methods such as, but not limited to, those in [54,55,56,57,58,59,60,61,62].

Author Contributions

M.M., M.Y., A.B., H.A.D., K.K. and N.A.G. conceived, designed the research, provided the data and wrote the paper. M.M., M.Y. and N.A.G. revised the manuscript. All authors confirmed the final version of the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kabirifar, K.; Mojtahedi, M. The impact of Engineering, Procurement and Construction (EPC) phases on project performance: A case of large-scale residential construction project. Buildings 2019, 9, 15.
  2. Bengar, H.A.; Shahmansouri, A.A.; Sabet, N.A.Z.; Kabirifar, K.; Tam, V.W. Impact of elevated temperatures on the structural performance of recycled rubber concrete: Experimental and mathematical modeling. Constr. Build. Mater. 2020, 255, 119374.
  3. Kabirifar, K.; Mojtahedi, M.; Wang, C.; Tam, V.W. Construction and demolition waste management contributing factors coupled with reduce, reuse, and recycle strategies for effective waste management: A review. J. Clean. Prod. 2020, 263, 121265.
  4. Charnes, A.; Cooper, W.W.; Rhodes, E. Measuring the efficiency of decision making units. Eur. J. Oper. Res. 1978, 2, 429–444.
  5. Li, F.; Emrouznejad, A.; Yang, G.-l.; Li, Y. Carbon emission abatement quota allocation in Chinese manufacturing industries: An integrated cooperative game data envelopment analysis approach. J. Oper. Res. Soc. 2019, 1–30.
  6. Mingoti, S.A.; Lima, J.O. Comparing SOM neural network with Fuzzy c-means, K-means and traditional hierarchical clustering algorithms. Eur. J. Oper. Res. 2006, 174, 1742–1759.
  7. Yazdani, M.; Ghodsi, R. Invasive weed optimization algorithm for minimizing total weighted earliness and tardiness penalties on a single machine under aging effect. Int. Robot. Autom. J. 2017, 2, 1–5.
  8. Yazdani, M.; Jolai, F. Lion optimization algorithm (LOA): A nature-inspired metaheuristic algorithm. J. Comput. Des. Eng. 2016, 3, 24–36.
  9. Yazdani, M.; Aleti, A.; Khalili, S.M.; Jolai, F. Optimizing the sum of maximum earliness and tardiness of the job shop scheduling problem. Comput. Ind. Eng. 2017, 107, 12–24.
  10. Oh, D.-H.; Heshmati, A. A sequential Malmquist–Luenberger productivity index: Environmentally sensitive productivity growth considering the progressive nature of technology. Energy Econ. 2010, 32, 1345–1355.
  11. Zhang, X.-P.; Cheng, X.-M.; Yuan, J.-H.; Gao, X.-J. Total-factor energy efficiency in developing countries. Energy Policy 2011, 39, 644–650.
  12. Wang, K.; Yu, S.; Zhang, W. China’s regional energy and environmental efficiency: A DEA window analysis based dynamic evaluation. Math. Comput. Model. 2013, 58, 1117–1127.
  13. Wu, D.; Wang, Y.; Qian, W. Efficiency evaluation and dynamic evolution of China’s regional green economy: A method based on the Super-PEBM model and DEA window analysis. J. Clean. Prod. 2020, 264, 121630.
  14. Halkos, G.E.; Tzeremes, N.G. Exploring the existence of Kuznets curve in countries’ environmental efficiency using DEA window analysis. Ecol. Econ. 2009, 68, 2168–2176.
  15. Korhonen, P.J.; Luptacik, M. Eco-efficiency analysis of power plants: An extension of data envelopment analysis. Eur. J. Oper. Res. 2004, 154, 437–446.
  16. Yang, H.; Pollitt, M. The necessity of distinguishing weak and strong disposability among undesirable outputs in DEA: Environmental performance of Chinese coal-fired power plants. Energy Policy 2010, 38, 4440–4444.
  17. Zhang, B.; Bi, J.; Fan, Z.; Yuan, Z.; Ge, J. Eco-efficiency analysis of industrial system in China: A data envelopment analysis approach. Ecol. Econ. 2008, 68, 306–316.
  18. Liu, W.; Meng, W.; Li, X.; Zhang, D. DEA models with undesirable inputs and outputs. Ann. Oper. Res. 2010, 173, 177–194.
  19. Chu, J.; Wu, J.; Zhu, Q.; An, Q.; Xiong, B. Analysis of China’s regional eco-efficiency: A DEA two-stage network approach with equitable efficiency decomposition. Comput. Econ. 2019, 54, 1263–1285.
  20. Khalili-Damghani, K.; Shahmir, Z. Uncertain network data envelopment analysis with undesirable outputs to evaluate the efficiency of electricity power production and distribution processes. Comput. Ind. Eng. 2015, 88, 131–150.
  21. Oggioni, G.; Riccardi, R.; Toninelli, R. Eco-efficiency of the world cement industry: A data envelopment analysis. Energy Policy 2011, 39, 2842–2854.
  22. Zhou, P.; Ang, B.W. Linear programming models for measuring economy-wide energy efficiency performance. Energy Policy 2008, 36, 2911–2916.
  23. Environment, U.; Scrivener, K.L.; John, V.M.; Gartner, E.M. Eco-efficient cements: Potential economically viable solutions for a low-CO2 cement-based materials industry. Cem. Concr. Res. 2018, 114, 2–26.
  24. Yazdani, M.; Khalili, S.M.; Jolai, F. A parallel machine scheduling problem with two-agent and tool change activities: An efficient hybrid metaheuristic algorithm. Int. J. Comput. Integr. Manuf. 2016, 29, 1075–1088.
  25. Yazdani, M.; Khalili, S.M.; Babagolzadeh, M.; Jolai, F. A single-machine scheduling problem with multiple unavailability constraints: A mathematical model and an enhanced variable neighborhood search approach. J. Comput. Des. Eng. 2017, 4, 46–59.
  26. Yazdani, M.; Jolai, F.; Taleghani, M.; Yazdani, R. A modified imperialist competitive algorithm for a two-agent single-machine scheduling under periodic maintenance consideration. Int. J. Oper. Res. 2018, 32, 127–155.
  27. Yazdani, M.; Jolai, F. A genetic algorithm with modified crossover operator for a two-agent scheduling problem. J. Syst. Manag. 2013, 1, 1–13.
  28. Shahmansouri, A.A.; Akbarzadeh Bengar, H.; Jahani, E. Predicting compressive strength and electrical resistivity of eco-friendly concrete containing natural zeolite via GEP algorithm. Constr. Build. Mater. 2019, 229, 116883.
  29. Golilarz, N.A.; Gao, H.; Demirel, H. Satellite image de-noising with harris hawks meta heuristic optimization algorithm and improved adaptive generalized gaussian distribution threshold function. IEEE Access 2019, 7, 57459–57468.
  30. Golilarz, N.A.; Mirmozaffari, M.; Gashteroodkhani, T.A.; Ali, L.; Dolatsara, H.A.; Boskabadi, A.; Yazdi, M. Optimized wavelet-based satellite image de-noising with multi-population differential evolution-assisted harris hawks optimization algorithm. IEEE Access 2020, 1.
  31. Yazdani, M.; Mojtahedi, M.; Loosemore, M. Enhancing evacuation response to extreme weather disasters using public transportation systems: A novel simheuristic approach. J. Comput. Des. Eng. 2020, 7, 195–210.
  32. Yazdani, M.; Babagolzadeh, M.; Kazemitash, N.; Saberi, M. Reliability estimation using an integrated support vector regression–variable neighborhood search model. J. Ind. Inf. Integr. 2019, 15, 103–110.
  33. Azadeh, A.; Seif, J.; Sheikhalishahi, M.; Yazdani, M. An integrated support vector regression–imperialist competitive algorithm for reliability estimation of a shearing machine. Int. J. Comput. Integr. Manuf. 2016, 29, 16–24.
  34. Yu, S.; Wei, Y.-M.; Wang, K. Provincial allocation of carbon emission reduction targets in China: An approach based on improved fuzzy cluster and Shapley value decomposition. Energy Policy 2014, 66, 630–644.
  35. Emrouznejad, A.; Yang, G.-l.; Amin, G.R. A novel inverse DEA model with application to allocate the CO2 emissions quota to different regions in Chinese manufacturing industries. J. Oper. Res. Soc. 2019, 70, 1079–1090.
  36. Qing, X.-X.; Xiao, D.; Wang, B. A real-time monitoring method of energy consumption based on data mining. Chongqing Daxue Xuebao 2012, 35, 133–137.
  37. Lim, B.; Lee, K.; Lee, C. Free Disposal Hull (FDH) analysis for efficiency measurement: An update to DEA. Stata J. 2016, 10, 1–8.
  38. Tavakoli, I.M.; Mostafaee, A. Free disposal hull efficiency scores of units with network structures. Eur. J. Oper. Res. 2019, 277, 1027–1036.
  39. Krivonozhko, V.; Lychev, A. Frontier visualization for nonconvex models with the use of purposeful enumeration methods. Dokl. Math. 2017, 96, 650–653.
  40. Krivonozhko, V.; Lychev, A. Frontier visualization and estimation of returns to scale in free disposal hull models. Comput. Math. Math. Phys. 2019, 59, 501–511.
  41. Cesaroni, G.; Kerstens, K.; Van de Woestyne, I. Estimating scale economies in non-convex production models. J. Oper. Res. Soc. 2017, 68, 1442–1451.
  42. Cesaroni, G.; Kerstens, K.; Van de Woestyne, I. Global and local scale characteristics in convex and nonconvex nonparametric technologies: A first empirical exploration. Eur. J. Oper. Res. 2017, 259, 576–586.
  43. Cesaroni, G.; Giovannola, D. Average-cost efficiency and optimal scale sizes in non-parametric analysis. Eur. J. Oper. Res. 2015, 242, 121–133.
  44. Charnes, A.; Cooper, W.; Lewin, A.Y.; Seiford, L.M. Data envelopment analysis theory, methodology and applications. J. Oper. Res. Soc. 1997, 48, 332–333.
  45. Charnes, A.; Clark, C.T.; Cooper, W.W.; Golany, B. A developmental study of data envelopment analysis in measuring the efficiency of maintenance units in the US air forces. Ann. Oper. Res. 1984, 2, 95–112.
  46. Khalfallah, J.; Slama, J.B.H. A comparative study of the various clustering algorithms in e-learning systems using WEKA tools. In Proceedings of the 2018 JCCO Joint International Conference on ICT in Education and Training, International Conference on Computing in Arabic, and International Conference on Geocomputing (JCCO: TICET-ICCA-GECO), Hammamet, Tunisia, 9–11 November 2018; pp. 1–7. [Google Scholar]
  47. DeFreitas, K.; Bernard, M. Comparative performance analysis of clustering techniques in educational data mining. Iadis Int. J. Comput. Sci. Inf. Syst. 2015, 10, 65–78. [Google Scholar]
  48. Ratnapala, I.; Ragel, R.; Deegalla, S. Students behavioural analysis in an online learning environment using data mining. In Proceedings of the 7th International Conference on Information and Automation for Sustainability, Colombo, Sri Lanka, 22–24 December 2014; pp. 1–7. [Google Scholar]
  49. Sharma, N.; Bajpai, A.; Litoriya, M.R. Comparison the various clustering algorithms of weka tools. Facilities 2012, 4, 78–80. [Google Scholar]
  50. Sreedhar, C.; Kasiviswanath, N.; Reddy, P.C. Clustering large datasets using K-means modified inter and intra clustering (KM-I2C) in Hadoop. J. Big Data 2017, 4, 27. [Google Scholar] [CrossRef]
  51. Keogh, E.; Lin, J. Clustering of time-series subsequences is meaningless: Implications for previous and future research. Knowl. Inf. Syst. 2005, 8, 154–177. [Google Scholar] [CrossRef]
  52. Ester, M.; Kriegel, H.-P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. Kdd 1966, 96, 226–231. [Google Scholar]
  53. Yang, Y.; Webb, G.I.; Wu, X. Discretization methods. In DataMining and Knowledge Discovery Handbook; Maimon, O., Rokach, L., Eds.; Springer: Boston, MA, USA, 2010; pp. 101–116. [Google Scholar]
  54. Golilarz, N.A.; Addeh, A.; Gao, H.; Ali, L.; Roshandeh, A.M.; Munir, H.M.; Moradkhani, A.; Munir, H.M.; Khan, R.U. A new automatic method for control chart patterns recognition based on ConvNet and Harris Hawks meta heuristic optimization algorithm. IEEE Access 2019, 7, 149398–149405. [Google Scholar] [CrossRef]
  55. Addeh, A.; Khormali, A.; Golilarz, N.A. Control chart pattern recognition using RBF neural network with new training algorithm and practical features. ISA Trans. 2018, 79, 202–216. [Google Scholar] [CrossRef] [PubMed]
  56. Mirmozaffari, M.; Boskabadi, A.; Azeem, G.; Massah, R.; Boskabadi, E.; Dolatsara, H.A.; Liravian, A. Machine learning clustering algorithms based on the DEA optimization approach for banking system in developing countries. Eur. J. Eng. Res. Sci. 2020, 5, 651–658. [Google Scholar] [CrossRef]
  57. Mirmozaffari, M.; Azeem, G.; Boskabadi, A.; Aranizadeh, A.; Vaishnav, A.; John, J. A novel improved data envelopment analysis model based on SBM and FDH models. Eur. J. Electr. Eng. Comput. Sci. 2020, 4, 1–7. [Google Scholar] [CrossRef]
  58. Mirmozaffari, M.; Alinezhad, A. Ranking of Heart Hospitals Using cross-efficiency and two-stage DEA. In Proceedings of the 7th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 26–27 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 217–222. [Google Scholar]
  59. Mirmozaffari, M. Eco-Efficiency Evaluation in Two-Stage Network Structure: Case Study: Cement Companies. Iran. J. Optim. 2019, 11, 125–135. [Google Scholar]
  60. Mirmozaffari, M.; Alinezhad, A.; Gilanpour, A. Heart disease prediction with data mining clustering algorithms. Int. J. Comput. Commun. Instrum. Engg. 2017, 4, 16–19. [Google Scholar]
  61. Mirmozaffari, M.; Zandieh, M.; Hejazi, S.M. An Output Oriented Window Analysis Using Two-stage DEA in Heart Hospitals. In Proceedings of the 10th International Conference on Innovations in Science, Engineering, Computers and Technology (ISECT-2017), Dubai, UAE, 17–19 October 2017; pp. 44–51. [Google Scholar]
  62. Aranizadeh, A.; Niazazari, I.; Mirmozaffari, M. A novel optimal distributed generation planning in distribution network using cuckoo optimization algorithm. Eur. J. Electr. Eng. Comput. Sci. 2019, 3. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Production possibility sets.
Figure 2. Production possibility sets in increasing returns-to-scale (IRS), decreasing returns-to-scale (DRS), and constant returns to scale (CRS).
Figure 3. Free Disposal Hull (FDH) input-oriented efficiency measures.
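Because FDH drops the convexity assumption, the input-oriented measure illustrated in Figure 3 can be computed without solving a linear program: each DMU is benchmarked only against observed units that weakly dominate it on every output, and its score is the smallest proportional input contraction needed to reach such a unit. A minimal sketch (the function name and toy data are ours, not the paper's code):

```python
def fdh_input_efficiency(X, Y, p):
    """Input-oriented FDH score of DMU p.

    X[j] and Y[j] are the input and output vectors of observed DMU j.
    Under free disposability alone, the FDH mixed-integer model reduces
    to enumerating the DMUs whose outputs weakly dominate DMU p's and
    taking the smallest max-ratio input contraction among them.
    """
    best = 1.0  # DMU p always dominates itself, so the score is at most 1
    for j in range(len(X)):
        if all(yj >= yp for yj, yp in zip(Y[j], Y[p])):
            contraction = max(xj / xp for xj, xp in zip(X[j], X[p]))
            best = min(best, contraction)
    return best

# Two single-input, single-output DMUs: the second uses twice the input
# for the same output, so its FDH score is 0.5.
score = fdh_input_efficiency([[2.0], [4.0]], [[1.0], [1.0]], 1)
```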
Figure 4. Conversion of the two-stage model to the single-stage standard model.
Figure 5. Assessment procedure of the combination of DEA with clustering.
Figure 6. Average of total efficiency for the FDH model.
Figure 7. Average of total efficiency for the BCC model.
Figure 8. Average of total efficiency for the CCR model.
Table 1. The inputs, intermediate elements, and outputs for the DMUs.

| Type | Stage | Variable (Stat) | Unit |
|---|---|---|---|
| Input | I | Energy Consumption (X) | 10 TCE (ton of coal equivalent) |
| Intermediate element | I & II | Cement Production (M) | 10 ton |
| Intermediate element (desirable output) | I & II | Pollution Control Investment (N) | 1000 USD (United States dollar) |
| Desirable output | II | Wastewater Removed (Y) | 10 ton |
| Desirable output | II | Waste Gas Removed (E) | 10 ton |
| Desirable output | II | Solid Waste Removed (F) | 10 ton |
Table 2. Descriptive analysis of the data.

| Stat | Max | Min | Ave | SD |
|---|---|---|---|---|
| X | 35351.31 | 1723.96 | 14241.86 | 8865.93 |
| M | 36781.76 | 2359.98 | 14543 | 8809.97 |
| N | 891.91 | 27.24 | 291.92 | 193.29 |
| Y | 766248.45 | 6531.01 | 164131.92 | 153176.91 |
| E | 29.98 | 0.82 | 9.35 | 6.91 |
| F | 23431.70 | 7.98 | 2761.76 | 4723.97 |
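The four summary columns of Table 2 are the standard descriptive statistics of each variable. A short sketch of how one row is produced; the sample below is illustrative (bounded by Table 2's reported min and max for waste gas removed, E), not the study's raw data:

```python
import statistics as st

# Illustrative observations for one variable (waste gas removed, E).
E = [0.82, 5.10, 8.50, 9.40, 12.30, 29.98]

# One row of Table 2: maximum, minimum, average, and standard deviation.
summary = {
    "Max": max(E),
    "Min": min(E),
    "Ave": round(st.mean(E), 2),
    "SD": round(st.stdev(E), 2),  # sample (n-1) standard deviation
}
```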
Table 3. Explanation of dimensionless parameters in the nomenclature for the proposed primal and dual models.

| Symbol | Description |
|---|---|
| DMU_j | Decision-making units |
| λ_j | Non-negative scalar (dual variables that categorize the benchmarks for inefficient units) |
| X_ij | mth input (energy consumption) for the nth DMU |
| Y_rj | sth output (wastewater removed) for the nth DMU |
| M_hj | dth output (cement production) for the nth DMU |
| E_tj | vth output (waste gas removed) for the nth DMU |
| F_zj | qth output (solid waste removed) for the nth DMU |
| N_cj | kth input (pollution control investment) for the nth DMU |
| w | Free-in-sign scalar for the returns-to-scale variable |
| j | nth DMU |
| n | DMU observation |
| m | Input (energy consumption) observation |
| s | Output (wastewater removed) observation |
| d | Output (cement production) observation |
| v | Output (waste gas removed) observation |
| q | Output (solid waste removed) observation |
| k | Input (pollution control investment) observation |
| i | mth input (energy consumption) |
| r | sth output (wastewater removed) |
| h | dth output (cement production) |
| t | vth output (waste gas removed) |
| z | qth output (solid waste removed) |
| c | kth input (pollution control investment) |
| u_r | Weight assigned to output r (wastewater removed) |
| l_h | Weight assigned to output h (cement production) |
| b_t | Weight assigned to output t (waste gas removed) |
| a_z | Weight assigned to output z (solid waste removed) |
| v_i | Weight assigned to input i (energy consumption) |
| g_c | Weight assigned to input c (pollution control investment) |
| ɸ | Scalar, real primal variable representing the value of the efficiency score |
| θ | Scalar, real dual variable representing the value of the efficiency score |
| θ_p | Scalar, real dual variable representing the efficiency score of the pth DMU |
| X_ip | mth input (energy consumption) for the pth DMU |
| Y_rp | sth output (wastewater removed) for the pth DMU |
| M_hp | dth output (cement production) for the pth DMU |
| E_tp | vth output (waste gas removed) for the pth DMU |
| F_zp | qth output (solid waste removed) for the pth DMU |
| N_cp | kth input (pollution control investment) for the pth DMU |
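In the θ/λ_j notation of Table 3, the input-oriented CCR envelopment problem for DMU p is: minimize θ subject to Σ_j λ_j X_ij ≤ θ X_ip for every input i, Σ_j λ_j Y_rj ≥ Y_rp for every output r, and λ_j ≥ 0. A minimal single-stage sketch using SciPy's linear-programming solver (toy data; the paper's converted two-stage model also carries the M, N, E, and F variables, which this sketch omits):

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, p):
    """Input-oriented CCR envelopment score theta for DMU p.

    X is (n_dmus, n_inputs), Y is (n_dmus, n_outputs).
    Decision vector z = [theta, lambda_1, ..., lambda_n]; minimize theta
    subject to  sum_j lambda_j X[j] <= theta * X[p]  (inputs)
    and         sum_j lambda_j Y[j] >= Y[p]          (outputs).
    """
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    n, m = X.shape
    s = Y.shape[1]
    c = np.zeros(n + 1)
    c[0] = 1.0  # objective: minimize theta
    # Inputs:  X^T lambda - theta * X_p <= 0
    A_inputs = np.hstack([-X[p].reshape(m, 1), X.T])
    # Outputs: -Y^T lambda <= -Y_p  (i.e., Y^T lambda >= Y_p)
    A_outputs = np.hstack([np.zeros((s, 1)), -Y.T])
    res = linprog(c,
                  A_ub=np.vstack([A_inputs, A_outputs]),
                  b_ub=np.concatenate([np.zeros(m), -Y[p]]),
                  bounds=[(0, None)] * (n + 1),
                  method="highs")
    return res.x[0]

# Three DMUs, one input, one output: DMU 1 produces the same output as
# DMU 0 from twice the input, so its constant-returns score is 0.5.
theta = ccr_efficiency([[2], [4], [3]], [[2], [2], [3]], p=1)
```

Adding the convexity constraint Σ_j λ_j = 1 (the free-in-sign w of Table 3 in the multiplier form) turns the same program into the BCC model.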
Table 4. Values of model parameters.

| Parameter | Value |
|---|---|
| N | 24 |
| T | 6 |
| K | 3 |
| W | 4 |

The number of decision-making units (N), the number of periods (T), the window length (K), and the number of windows (W) are the parameters of this study.
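With T = 6 periods and a window length of K = 3, window analysis forms W = T − K + 1 = 4 overlapping windows, each treating every company-year inside it as a separate DMU. A small sketch of that bookkeeping (helper name is ours):

```python
# Enumerate the year spans of the DEA windows: with T periods and window
# length k there are T - k + 1 windows, each shifted by one year.
def window_spans(years, k):
    return [years[i:i + k] for i in range(len(years) - k + 1)]

spans = window_spans([2014, 2015, 2016, 2017, 2018, 2019], 3)
# spans[0] covers 2014-2016 and spans[-1] covers 2017-2019.
```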
Table 5. Window analysis for FDH model.
Windows 201420152016201720182019Ave.Windows 201420152016201720182019Ave.
1st window 1.000 1.000 0.962 0.987 1st window 1.000 1.000 1.000 1.000
2nd window 1.000 0.962 0.977 0.979 2nd window 1.000 1.000 1.000 1.000
3rd window 0.908 0.805 1.000 0.904 3rd window 1.000 0.882 1.000 0.960
4th window 0.805 1.000 1.000 0.934 4th window 0.987 1.000 1.000 0.995
Avg. for 1st 1.000 1.000 0.944 0.862 1.000 1.000 Avg. for 13th 1.000 1.000 1.000 0.956 1.000 1.000
1st window 1.000 0.936 1.000 0.978 1st window 1.000 0.859 0.941 0.933
2nd window 0.993 1.000 1.000 0.997 2nd window 0.998 1.000 1.000 1.000
3rd window 1.000 0.926 1.000 0.975 3rd window 1.000 0.917 1.000 0.972
4th window 0.986 1.000 0.981 0.989 4th window 0.932 1.000 1.000 0.977
Avg. for 2nd 1.000 0.964 1.000 0.970 1.000 0.981 Avg. for 14th 1.000 0.928 0.980 0.949 1.000 1.000
1st window 0.892 0.936 1.000 0.942 1st window 1.000 1.000 0.858 0.952
2nd window 0.993 1.000 1.000 0.998 2nd window 1.000 0.858 0.872 0.919
3rd window 1.000 0.926 1.000 0.944 3rd window 1.000 1.000 1.000 1.000
4th window 0.986 1.000 0.981 0.989 4th window 1.000 1.000 1.000 1.000
Avg. for 3rd 0.892 0.964 1.000 0.970 1.000 0.981 Avg. for 15th 1.000 1.000 0.914 0.957 1.000 1.000
1st window 1.000 0.892 0.865 0.918 1st window 1.000 0.979 1.000 0.992
2nd window 1.000 0.968 0.965 0.944 2nd window 1.000 1.000 1.000 1.000
3rd window 0.857 0.768 1.000 0.875 3rd window 1.000 1.000 1.000 1.000
4th window 0.950 1.000 1.000 0.983 4th window 1.000 1.000 1.000 1.000
Avg. for 4th 1.000 0.945 0.896 0.894 1.000 1.000 Avg. for 16th 1.000 0.989 1.000 1.000 1.000 1.000
1st window 1.000 0.835 0.847 0.894 1st window 1.000 1.000 0.880 0.960
2nd window 1.000 1.000 1.000 1.000 2nd window 1.000 0.890 1.000 0.963
3rd window 1.000 0.972 1.000 0.990 3rd window 0.839 1.000 0.920 0.919
4th window 0.998 1.000 0.858 0.951 4th window 1.000 0.920 0.963 0.960
Avg. for 5th 1.000 0.917 0.948 0.990 1.000 0.858 Avg. for 17th 1.000 1.000 0.869 1.000 0.920 0.963
1st window 1.000 0.823 0.839 0.887 1st window 1.000 0.929 1.000 0.976
2nd window 1.000 1.000 0.940 0.979 2nd window 0.933 1.000 1.000 0.977
3rd window 0.848 0.764 1.000 0.870 3rd window 0.914 0.884 1.000 0.932
4th window 0.850 1.000 0.932 0.932 4th window 0.884 1.000 0.850 0.911
Avg. for 6th 1.000 0.911 0.895 0.851 1.000 0.932 Avg. for 18th 1.000 0.931 0.971 0.922 1.000 0.850
1st window 0.921 1.000 1.000 0.973 1st window 0.954 1.000 1.000 0.984
2nd window 1.000 0.896 1.000 0.965 2nd window 0.994 1.000 1.000 0.997
3rd window 0.900 1.000 1.000 0.966 3rd window 1.000 1.000 1.000 1.000
4th window 1.000 1.000 1.000 1.000 4th window 1.000 1.000 1.000 1.000
Avg. for 7th 0.921 1.000 0.932 1.000 1.000 1.000 Avg. for 19th 0.954 0.997 1.000 1.000 1.000 1.000
1st window 1.000 1.000 1.000 1.000 1st window 1.000 0.850 1.000 0.950
2nd window 1.000 1.000 0.980 0.993 2nd window 0.881 1.000 0.962 0.947
3rd window 0.826 0.800 1.000 0.875 3rd window 1.000 0.915 1.000 0.971
4th window 0.778 1.000 1.000 0.926 4th window 0.915 1.000 1.000 0.971
Avg. for 8th 1.000 1.000 0.941 0.852 1.000 1.000 Avg. for 20th 1.000 0.865 1.000 0.931 1.000 1.000
1st window 1.000 1.000 1.000 1.000 1st window 1.000 0.861 0.877 0.912
2nd window 1.000 1.000 1.000 1.000 2nd window 1.000 1.000 0.949 0.982
3rd window 1.000 0.784 1.000 0.927 3rd window 0.750 0.982 1.000 0.910
4th window 0.837 1.000 1.000 0.945 4th window 0.629 1.000 1.000 0.876
Avg. for 9th 1.000 1.000 1.000 0.873 1.000 1.000 Avg. for 21st 1.000 0.930 0.875 0.853 1.000 1.000
1st window 0.919 1.000 0.953 0.957 1st window 1.000 1.000 1.000 1.000
2nd window 1.000 0.953 0.927 0.960 2nd window 1.000 1.000 1.000 1.000
3rd window 0.979 1.000 1.000 0.992 3rd window 1.000 1.000 1.000 1.000
4th window 1.000 1.000 1.000 1.000 4th window 1.000 1.000 1.000 1.000
Avg. for 10th 0.919 1.000 0.962 0.975 1.000 1.000 Avg. for 22nd 1.000 1.000 1.000 1.000 1.000 1.000
1st window 1.000 0.903 1.000 0.967 1st window 0.924 0.958 1.000 0.960
2nd window 1.000 1.000 0.880 0.960 2nd window 0.958 1.000 1.000 0.985
3rd window 1.000 0.850 1.000 0.950 3rd window 1.000 1.000 1.000 1.000
4th window 0.845 1.000 1.000 0.948 4th window 1.000 1.000 1.000 1.000
Avg. for 11th 1.000 0.951 1.000 0.858 1.000 1.000 Avg. for 23rd 0.924 0.958 1.000 1.000 1.000 1.000
1st window 1.000 1.000 0.962 0.987 1st window 1.000 0.861 0.877 0.912
2nd window 1.000 1.000 0.943 0.980 2nd window 1.000 1.000 0.949 0.982
3rd window 1.000 1.000 1.000 1.000 3rd window 0.690 0.925 1.000 0.871
4th window 1.000 1.000 1.000 1.000 4th window 0.880 1.000 1.000 0.960
Avg. for 12th 1.000 1.000 0.987 0.980 1.000 1.000 Avg. for 24th 1.000 0.930 0.855 0.918 1.000 1.000
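The yearly figures in the "Avg." rows of Table 5 (and in Table 7) are obtained by averaging, for each calendar year, the scores of every window that covers that year. A short sketch (function name is ours) that reproduces company 1's 2016 and 2017 averages from its four FDH windows:

```python
def yearly_averages(window_scores, years, k):
    """window_scores[w][t] is window w's score in its t-th covered year;
    window w spans years[w : w + k]. Each year's value is the mean of
    the scores of all windows covering it."""
    sums, counts = {}, {}
    for w, scores in enumerate(window_scores):
        for t, score in enumerate(scores):
            year = years[w + t]
            sums[year] = sums.get(year, 0.0) + score
            counts[year] = counts.get(year, 0) + 1
    return {y: round(sums[y] / counts[y], 3) for y in sums}

# Company 1's four FDH windows as reported in Table 5.
c1_windows = [[1.000, 1.000, 0.962],
              [1.000, 0.962, 0.977],
              [0.908, 0.805, 1.000],
              [0.805, 1.000, 1.000]]
avg = yearly_averages(c1_windows, [2014, 2015, 2016, 2017, 2018, 2019], 3)
# avg[2016] -> 0.944 and avg[2017] -> 0.862, matching the "Avg. for 1st" row.
```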
Table 6. Average of windows in the FDH model for each three years.

| Company | 2014–2016 | 2015–2017 | 2016–2018 | 2017–2019 |
|---|---|---|---|---|
| 1 | 98.7% | 97.9% | 90.4% | 93.4% |
| 2 | 97.8% | 99.7% | 97.5% | 98.9% |
| 3 | 94.2% | 99.8% | 94.4% | 98.9% |
| 4 | 91.8% | 94.4% | 87.5% | 98.3% |
| 5 | 89.4% | 100.0% | 99.0% | 95.1% |
| 6 | 88.7% | 97.9% | 87.0% | 90.0% |
| 7 | 97.3% | 96.5% | 96.6% | 100.0% |
| 8 | 100.0% | 99.3% | 87.5% | 92.6% |
| 9 | 100.0% | 100.0% | 92.7% | 94.5% |
| 10 | 95.7% | 96.0% | 99.2% | 100.0% |
| 11 | 96.7% | 96.0% | 95.0% | 94.8% |
| 12 | 98.7% | 98.0% | 100.0% | 100.0% |
| 13 | 100.0% | 100.0% | 96.0% | 99.5% |
| 14 | 93.3% | 100.0% | 97.2% | 97.7% |
| 15 | 95.2% | 91.9% | 100.0% | 100.0% |
| 16 | 99.2% | 100.0% | 100.0% | 100.0% |
| 17 | 96.0% | 96.3% | 91.9% | 96.0% |
| 18 | 97.6% | 97.7% | 93.2% | 91.1% |
| 19 | 98.4% | 99.7% | 100.0% | 100.0% |
| 20 | 95.0% | 94.7% | 97.1% | 97.1% |
| 21 | 91.2% | 98.2% | 91.0% | 87.6% |
| 22 | 100.0% | 100.0% | 100.0% | 100.0% |
| 23 | 96.0% | 98.5% | 100.0% | 100.0% |
| 24 | 91.2% | 98.2% | 87.1% | 96.0% |
Table 7. Average efficiency in the FDH model for each year.

| Company | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 |
|---|---|---|---|---|---|---|
| 1 | 1.00 | 1.00 | 0.94 | 0.86 | 1.00 | 1.00 |
| 2 | 1.00 | 0.96 | 1.00 | 0.97 | 1.00 | 0.98 |
| 3 | 0.89 | 0.96 | 1.00 | 0.97 | 1.00 | 0.98 |
| 4 | 1.00 | 0.95 | 0.90 | 0.89 | 1.00 | 1.00 |
| 5 | 1.00 | 0.92 | 0.95 | 0.99 | 1.00 | 0.86 |
| 6 | 1.00 | 0.91 | 0.90 | 0.85 | 1.00 | 0.93 |
| 7 | 0.92 | 1.00 | 0.93 | 1.00 | 1.00 | 1.00 |
| 8 | 1.00 | 1.00 | 0.94 | 0.85 | 1.00 | 1.00 |
| 9 | 1.00 | 1.00 | 1.00 | 0.87 | 1.00 | 1.00 |
| 10 | 0.92 | 1.00 | 0.96 | 0.98 | 1.00 | 1.00 |
| 11 | 1.00 | 0.95 | 1.00 | 0.86 | 1.00 | 1.00 |
| 12 | 1.00 | 1.00 | 0.99 | 0.98 | 1.00 | 1.00 |
| 13 | 1.00 | 1.00 | 1.00 | 0.96 | 1.00 | 1.00 |
| 14 | 1.00 | 0.93 | 0.98 | 0.95 | 1.00 | 1.00 |
| 15 | 1.00 | 1.00 | 0.91 | 0.96 | 1.00 | 1.00 |
| 16 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 |
| 17 | 1.00 | 1.00 | 0.87 | 1.00 | 0.92 | 0.96 |
| 18 | 1.00 | 0.93 | 0.97 | 0.92 | 1.00 | 0.85 |
| 19 | 0.95 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 20 | 1.00 | 0.87 | 1.00 | 0.93 | 1.00 | 1.00 |
| 21 | 1.00 | 0.93 | 0.88 | 0.85 | 1.00 | 1.00 |
| 22 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 23 | 0.92 | 0.96 | 1.00 | 1.00 | 1.00 | 1.00 |
| 24 | 1.00 | 0.93 | 0.86 | 0.92 | 1.00 | 1.00 |
Table 8. Window analysis for BCC model.
Windows201420152016201720182019Avg.Windows201420152016201720182019Avg.
1st window 1.0001.0000.960 0.986 1st window 1.0001.0001.000 1.000
2nd window 1.0000.9610.976 0.978 2nd window 1.0001.0001.000 1.000
3rd window 0.9070.8041.000 0.903 3rd window 1.0000.8801.000 0.959
4th window 0.8031.0001.0000.933 4th window 0.9861.0001.0000.994
Avg. for 1st 1.0001.0000.9420.8611.0001.000 Avg. for 13th 1.0001.0001.0000.9541.0001.000
1st window 1.0000.9341.000 0.977 1st window 1.0000.8580.939 0.931
2nd window 0.9911.0001.000 0.996 2nd window 0.9961.0001.000 0.999
3rd window 1.0000.9241.000 0.974 3rd window 1.0000.9151.000 0.971
4th window 0.9851.0000.9800.980 4th window 0.9301.0001.0000.976
Avg. for 2nd 1.0000.9621.0000.9601.0000.980 Avg. for 14th 1.0000.9260.9790.9471.0001.000
1st window 0.8910.9351.000 0.941 1st window 1.0001.0000.856 0.951
2nd window 0.9911.0001.000 0.998 2nd window 1.0000.8840.870 0.917
3rd window 1.0000.9241.000 0.943 3rd window 1.0001.0001.000 1.000
4th window 0.9851.0000.9800.980 4th window 1.0001.0001.0001.000
Avg. for 3rd 0.8910.9621.0000.9681.0000.980 Avg. for 15th 1.0001.0000.9120.9561.0001.000
1st window 1.0000.8900.864 0.917 1st window 1.0000.9771.000 0.991
2nd window 1.0000.9670.964 0.944 2nd window 1.0001.0001.000 1.000
3rd window 0.8560.7671.000 0.874 3rd window 1.0001.0001.000 1.000
4th window 0.9491.0001.0000.983 4th window 1.0001.0001.0001.000
Avg. for 4th 1.0000.9440.8940.8931.0001.000 Avg. for 16th 1.0000.9881.0001.0001.0001.000
1st window 1.0000.8330.845 0.893 1st window 1.0001.0000.878 0.959
2nd window 1.0001.0001.000 1.000 2nd window 1.0000.8881.000 0.962
3rd window 1.0000.9701.000 0.989 3rd window 0.8381.0000.918 0.917
4th window 0.9971.0000.8570.950 4th window 1.0000.9180.9610.958
Avg. for 5th 1.0000.9160.9470.9891.0000.857 Avg. for 17th 1.0001.0000.8671.0000.9180.961
1st window 1.0000.8200.838 0.885 1st window 1.0000.9271.000 0.975
2nd window 1.0001.0000.939 0.977 2nd window 0.9311.0001.000 0.976
3rd window 0.8470.7631.000 0.868 3rd window 0.9120.8821.000 0.930
4th window 0.8491.0000.9300.898 4th window 0.8831.0000.8490.909
Avg. for 6th 1.0000.9100.8930.8491.0000.930 Avg. for 18th 1.0000.9290.9700.9201.0000.849
1st window 0.9201.0001.000 0.972 1st window 0.9521.0001.000 0.983
2nd window 1.0000.8951.000 0.964 2nd window 0.9921.0001.000 0.996
3rd window 0.8991.0001.000 0.965 3rd window 1.0001.0001.000 1.000
4th window 1.0001.0001.0001.000 4th window 1.0001.0001.0001.000
Avg. for 7th 0.9201.0000.9311.0001.0001.000 Avg. for 19th 0.9520.9961.0001.0001.0001.000
1st window 1.0001.0001.000 1.000 1st window 1.0000.8481.000 0.949
2nd window 1.0001.0000.978 0.992 2nd window 0.8801.0000.960 0.945
3rd window 0.8240.7991.000 0.873 3rd window 1.0000.9131.000 0.970
4th window 0.7761.0001.0000.925 4th window 0.9131.0001.0000.970
Avg. for 8th 1.0001.0000.9400.8501.0001.000 Avg. for 20th 1.0000.8631.0000.9291.0001.000
1st window 1.0001.0001.000 1.000 1st window 1.0000.8590.876 0.910
2nd window 1.0001.0001.000 1.000 2nd window 1.0001.0000.947 0.981
3rd window 1.0000.7821.000 0.926 3rd window 0.7480.9801.000 0.908
4th window 0.8351.0001.0000.944 4th window 0.6271.0001.0000.875
Avg. for 9th 1.0001.0001.0000.8711.0001.000 Avg. for 21st 1.0000.9290.8730.8511.0001.000
1st window 0.9181.0000.951 0.955 1st window 1.0001.0001.000 1.000
2nd window 1.0000.9520.925 0.958 2nd window 1.0001.0001.000 1.000
3rd window 0.9781.0001.000 0.991 3rd window 1.0001.0001.000 1.000
4th window 1.0001.0001.0001.000 4th window 1.0001.0001.0001.000
Avg. for 10th 0.9181.0000.9600.9741.0001.000 Avg. for 22nd 1.0001.0001.0001.0001.0001.000
1st window 1.0000.9011.000 0.966 1st window 0.9230.9561.000 0.958
2nd window 1.0001.0000.878 0.959 2nd window 0.9561.0001.000 0.984
3rd window 1.0000.8481.000 0.949 3rd window 1.0001.0001.000 1.000
4th window 0.8401.0001.0000.946 4th window 1.0001.0001.0001.000
Avg. for 11th 1.0000.9501.0000.8551.0001.000 Avg. for 23rd 0.9230.9561.0001.0001.0001.000
1st window 1.0001.0000.960 0.986 1st window 1.0000.8590.875 0.910
2nd window 1.0001.0000.941 0.979 2nd window 1.0001.0000.947 0.981
3rd window 1.0001.0001.000 1.000 3rd window 0.6860.9231.000 0.869
4th window 1.0001.0001.0001.000 4th window 0.8781.0001.0000.959
Avg. for 12th 1.0001.0000.9860.9791.0001.000 Avg. for 24th 1.0000.9290.8520.9161.0001.000
Table 9. Average of windows in the BCC model for each three years.

| Company | 2014–2016 | 2015–2017 | 2016–2018 | 2017–2019 |
|---|---|---|---|---|
| 1 | 0.99 | 0.98 | 0.90 | 0.93 |
| 2 | 0.98 | 1.00 | 0.97 | 0.99 |
| 3 | 0.94 | 1.00 | 0.94 | 0.99 |
| 4 | 0.92 | 0.94 | 0.87 | 0.98 |
| 5 | 0.89 | 1.00 | 0.98 | 0.95 |
| 6 | 0.89 | 0.98 | 0.87 | 0.90 |
| 7 | 0.97 | 0.96 | 0.97 | 1.00 |
| 8 | 1.00 | 0.99 | 0.87 | 0.93 |
| 9 | 1.00 | 1.00 | 0.93 | 0.94 |
| 10 | 0.96 | 0.96 | 0.99 | 1.00 |
| 11 | 0.97 | 0.96 | 0.95 | 0.95 |
| 12 | 0.99 | 0.98 | 1.00 | 1.00 |
| 13 | 1.00 | 1.00 | 0.96 | 0.99 |
| 14 | 0.93 | 1.00 | 0.97 | 0.98 |
| 15 | 0.95 | 0.92 | 1.00 | 1.00 |
| 16 | 0.99 | 1.00 | 1.00 | 1.00 |
| 17 | 0.96 | 0.96 | 0.92 | 0.96 |
| 18 | 0.98 | 0.98 | 0.93 | 0.91 |
| 19 | 0.98 | 1.00 | 1.00 | 1.00 |
| 20 | 0.94 | 0.95 | 0.97 | 0.97 |
| 21 | 0.91 | 0.98 | 0.91 | 0.88 |
| 22 | 1.00 | 1.00 | 1.00 | 1.00 |
| 23 | 0.96 | 0.98 | 1.00 | 1.00 |
| 24 | 0.91 | 0.98 | 0.87 | 0.96 |
Table 10. Average efficiency in the BCC model for each year.

| Company | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 |
|---|---|---|---|---|---|---|
| 1 | 1.00 | 1.00 | 0.94 | 0.86 | 1.00 | 1.00 |
| 2 | 1.00 | 0.96 | 1.00 | 0.97 | 1.00 | 0.98 |
| 3 | 0.89 | 0.96 | 1.00 | 0.97 | 1.00 | 0.98 |
| 4 | 1.00 | 0.94 | 0.90 | 0.89 | 1.00 | 1.00 |
| 5 | 1.00 | 0.92 | 0.95 | 0.99 | 1.00 | 0.86 |
| 6 | 1.00 | 0.91 | 0.89 | 0.85 | 1.00 | 0.93 |
| 7 | 0.92 | 1.00 | 0.93 | 1.00 | 1.00 | 1.00 |
| 8 | 1.00 | 1.00 | 0.94 | 0.85 | 1.00 | 1.00 |
| 9 | 1.00 | 1.00 | 1.00 | 0.87 | 1.00 | 1.00 |
| 10 | 0.92 | 1.00 | 0.96 | 0.97 | 1.00 | 1.00 |
| 11 | 1.00 | 0.95 | 1.00 | 0.86 | 1.00 | 1.00 |
| 12 | 1.00 | 1.00 | 0.99 | 0.98 | 1.00 | 1.00 |
| 13 | 1.00 | 1.00 | 1.00 | 0.96 | 1.00 | 1.00 |
| 14 | 1.00 | 0.93 | 0.98 | 0.95 | 1.00 | 1.00 |
| 15 | 1.00 | 1.00 | 0.91 | 0.96 | 1.00 | 1.00 |
| 16 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 |
| 17 | 1.00 | 1.00 | 0.87 | 1.00 | 0.92 | 0.96 |
| 18 | 1.00 | 0.93 | 0.97 | 0.92 | 1.00 | 0.85 |
| 19 | 0.95 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 20 | 1.00 | 0.86 | 1.00 | 0.93 | 1.00 | 1.00 |
| 21 | 1.00 | 0.93 | 0.87 | 0.85 | 1.00 | 1.00 |
| 22 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 23 | 0.92 | 0.96 | 1.00 | 1.00 | 1.00 | 1.00 |
| 24 | 1.00 | 0.93 | 0.85 | 0.92 | 1.00 | 1.00 |
Table 11. Window analysis for CCR model.
Windows201420152016201720182019Avg.Windows201420152016201720182019Avg.
1st window 1.0001.0000.958 0.985 1st window 1.0001.0001.000 1.000
2nd window 1.0000.9600.975 0.977 2nd window 1.0001.0001.000 1.000
3rd window 0.9060.8031.000 0.902 3rd window 1.0000.8781.000 0.958
4th window 0.8011.0001.0000.932 4th window 0.9841.0001.0000.993
Avg. for 1st 1.0001.0000.9400.8591.0001.000 Avg. for 13th 1.0001.0001.0000.9521.0001.000
1st window 1.0000.9321.000 0.976 1st window 1.0000.8560.937 0.930
2nd window 0.9891.0001.000 0.995 2nd window 0.9941.0001.000 0.998
3rd window 1.0000.9221.000 0.973 3rd window 1.0000.9131.000 0.970
4th window 0.9831.0000.9780.978 4th window 0.9281.0001.0000.975
Avg. for 2nd 1.0000.9621.0000.9591.0000.978 Avg. for 14th 1.0000.9240.9780.9451.0001.000
1st window 0.8900.9331.000 0.939 1st window 1.0001.0000.854 0.950
2nd window 0.9891.0001.000 0.997 2nd window 1.0000.8820.868 0.915
3rd window 1.0000.9221.000 0.942 3rd window 1.0001.0001.000 1.000
4th window 0.9831.0000.9780.979 4th window 1.0001.0001.0001.000
Avg. for 3rd 0.8900.9601.0000.9661.0000.978 Avg. for 15th 1.0001.0000.9100.9551.0001.000
1st window 1.0000.8880.862 0.915 1st window 1.0000.9751.000 0.990
2nd window 1.0000.9650.962 0.942 2nd window 1.0001.0001.000 1.000
3rd window 0.8540.7651.000 0.872 3rd window 1.0001.0001.000 1.000
4th window 0.9471.0001.0000.982 4th window 1.0001.0001.0001.000
Avg. for 4th 1.0000.9430.8910.8911.0001.000 Avg. for 16th 1.0000.9871.0001.0001.0001.000
1st window 1.0000.8320.843 0.891 1st window 1.0001.0000.876 0.958
2nd window 1.0001.0001.000 1.000 2nd window 1.0000.8861.000 0.961
3rd window 1.0000.9681.000 0.988 3rd window 0.8361.0000.916 0.916
4th window 0.9951.0000.8550.948 4th window 1.0000.9160.9590.956
Avg. for 5th 1.0000.9150.9470.9871.0000.855 Avg. for 17th 1.0001.0000.8651.0000.9160.959
1st window 1.0000.8180.836 0.883 1st window 1.0000.9251.000 0.974
2nd window 1.0001.0000.937 0.976 2nd window 0.9291.0001.000 0.975
3rd window 0.8450.7611.000 0.866 3rd window 0.9100.8801.000 0.928
4th window 0.8471.0000.9280.897 4th window 0.8811.0000.8470.907
Avg. for 6th 1.0000.9090.8910.8471.0000.928 Avg. for 18th 1.0000.9270.9690.9181.0000.847
1st window 0.9181.0001.000 0.971 1st window 0.9501.0001.000 0.982
2nd window 1.0000.8931.000 0.963 2nd window 0.9901.0001.000 0.995
3rd window 0.8971.0001.000 0.963 3rd window 1.0001.0001.000 1.000
4th window 1.0001.0001.0001.000 4th window 1.0001.0001.0001.000
Avg. for 7th 0.9181.0000.9291.0001.0001.000 Avg. for 19th 0.9500.9951.0001.0001.0001.000
1st window 1.0001.0001.0001.000 1st window 1.0000.8461.000 0.948
2nd window 1.000 1.0000.9780.992 2nd window 0.8781.0000.958 0.943
3rd window 0.8220.7991.0000.873 3rd window 1.0000.9111.000 0.969
4th window 0.7761.0001.0000.925 4th window 0.9111.0001.0000.969
Avg. for 8th 1.0001.0000.9390.8501.0001.000 Avg. for 20th 1.0000.8611.0000.9271.0001.000
1st window 1.0001.0001.000 1.000 1st window 1.0000.8570.874 0.908
2nd window 1.0001.0001.000 1.000 2nd window 1.0001.0000.945 0.980
3rd window 1.0000.7801.000 0.925 3rd window 0.7460.9781.000 0.906
4th window 0.8331.0001.0000.943 4th window 0.6251.0001.0000.874
Avg. for 9th 1.0001.0001.0000.8691.0001.000 Avg. for 21st 1.0000.9280.8710.8491.0001.000
1st window 0.9151.0000.947 0.949 1st window 1.0001.0001.000 1.000
2nd window 1.0000.9480.921 0.955 2nd window 1.0001.0001.000 1.000
3rd window 0.9741.0001.000 0.990 3rd window 1.0001.0001.000 1.000
4th window 1.0001.0001.0001.000 4th window 1.0001.0001.0001.000
Avg. for 10th 0.9151.0000.9570.9731.0001.000 Avg. for 22nd 1.0001.0001.0001.0001.0001.000
1st window 1.0000.8991.000 0.965 1st window 0.9220.9551.000 0.957
2nd window 1.0001.0000.875 0.958 2nd window 0.9551.0001.000 0.983
3rd window 1.0000.8451.000 0.948 3rd window 1.0001.0001.000 1.000
4th window 0.8351.0001.0000.945 4th window 1.0001.0001.0001.000
Avg. for 11th 1.0000.9491.0000.8511.0001.000 Avg. for 23rd 0.9220.9551.0001.0001.0001.000
1st window 1.0001.0000.958 0.985 1st window 1.0000.8570.873 0.908
2nd window 1.0001.0000.939 0.979 2nd window 1.0001.0000.945 0.980
3rd window 1.0001.0001.000 1.000 3rd window 0.6840.9211.000 0.868
4th window 1.0001.0001.0001.000 4th window 0.8761.0001.0000.958
Avg. for 12th 1.0001.0000.9850.9781.0001.000 Avg. for 24th 1.0000.9280.8500.9141.0001.000
Table 12. Average of windows in the CCR model for each three years.

| Company | 2014–2016 | 2015–2017 | 2016–2018 | 2017–2019 |
|---|---|---|---|---|
| 1 | 0.99 | 0.98 | 0.90 | 0.93 |
| 2 | 0.98 | 1.00 | 0.97 | 0.99 |
| 3 | 0.94 | 1.00 | 0.94 | 0.99 |
| 4 | 0.92 | 0.94 | 0.87 | 0.98 |
| 5 | 0.89 | 1.00 | 0.97 | 0.95 |
| 6 | 0.89 | 0.98 | 0.87 | 0.88 |
| 7 | 0.97 | 0.96 | 0.96 | 1.00 |
| 8 | 1.00 | 0.99 | 0.87 | 0.92 |
| 9 | 1.00 | 1.00 | 0.93 | 0.94 |
| 10 | 0.96 | 0.96 | 0.99 | 1.00 |
| 11 | 0.97 | 0.96 | 0.95 | 0.95 |
| 12 | 0.99 | 0.98 | 1.00 | 1.00 |
| 13 | 1.00 | 1.00 | 0.96 | 0.99 |
| 14 | 0.93 | 1.00 | 0.97 | 0.98 |
| 15 | 0.95 | 0.92 | 1.00 | 1.00 |
| 16 | 0.99 | 1.00 | 1.00 | 1.00 |
| 17 | 0.94 | 0.96 | 0.92 | 0.96 |
| 18 | 0.97 | 0.98 | 0.93 | 0.91 |
| 19 | 0.98 | 1.00 | 1.00 | 1.00 |
| 20 | 0.95 | 0.95 | 0.97 | 0.97 |
| 21 | 0.91 | 0.98 | 0.91 | 0.87 |
| 22 | 1.00 | 1.00 | 1.00 | 1.00 |
| 23 | 0.96 | 0.98 | 1.00 | 1.00 |
| 24 | 0.91 | 0.98 | 0.87 | 0.96 |
Table 13. Average efficiency in the CCR model for each year.

| Company | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 |
|---|---|---|---|---|---|---|
| 1 | 1.00 | 1.00 | 0.94 | 0.86 | 1.00 | 1.00 |
| 2 | 1.00 | 0.96 | 1.00 | 0.97 | 1.00 | 0.98 |
| 3 | 0.89 | 0.96 | 1.00 | 0.97 | 1.00 | 0.98 |
| 4 | 1.00 | 0.94 | 0.89 | 0.89 | 1.00 | 1.00 |
| 5 | 1.00 | 0.92 | 0.95 | 0.97 | 1.00 | 0.86 |
| 6 | 1.00 | 0.91 | 0.89 | 0.85 | 1.00 | 0.93 |
| 7 | 0.92 | 1.00 | 0.93 | 1.00 | 1.00 | 1.00 |
| 8 | 1.00 | 1.00 | 0.94 | 0.85 | 1.00 | 1.00 |
| 9 | 1.00 | 1.00 | 1.00 | 0.87 | 1.00 | 1.00 |
| 10 | 0.92 | 1.00 | 0.96 | 0.97 | 1.00 | 1.00 |
| 11 | 1.00 | 0.95 | 1.00 | 0.85 | 1.00 | 1.00 |
| 12 | 1.00 | 1.00 | 0.99 | 0.98 | 1.00 | 1.00 |
| 13 | 1.00 | 1.00 | 1.00 | 0.95 | 1.00 | 1.00 |
| 14 | 1.00 | 0.93 | 0.98 | 0.95 | 1.00 | 1.00 |
| 15 | 1.00 | 1.00 | 0.91 | 0.96 | 1.00 | 1.00 |
| 16 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 |
| 17 | 1.00 | 1.00 | 0.87 | 1.00 | 0.92 | 0.96 |
| 18 | 1.00 | 0.93 | 0.97 | 0.92 | 1.00 | 0.85 |
| 19 | 0.95 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 |
| 20 | 1.00 | 0.86 | 1.00 | 0.93 | 1.00 | 1.00 |
| 21 | 1.00 | 0.93 | 0.87 | 0.85 | 1.00 | 1.00 |
| 22 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 23 | 0.92 | 0.96 | 1.00 | 1.00 | 1.00 | 1.00 |
| 24 | 1.00 | 0.93 | 0.85 | 0.92 | 1.00 | 1.00 |
Table 14. Accuracy comparison for the K-means algorithm (all numbers are in percent).

| Model (K-means) | Step A | Step B | Step C |
|---|---|---|---|
| FDH | 94.5321 | 96.5467 | 98.5445 |
| BCC | 88.3421 | 90.4592 | 92.1245 |
| CCR | 83.1243 | 85.2665 | 87.1232 |
| Avg. of the three models' accuracy | 88.6661 | 90.7574 | 92.5974 |
Table 15. Accuracy comparison for the Hierarchical Clusterer algorithm (all numbers are in percent).

| Model (Hierarchical Clusterer) | Step A | Step B | Step C |
|---|---|---|---|
| FDH | 92.1256 | 94.3487 | 96.6578 |
| BCC | 86.6554 | 88.2665 | 90.1227 |
| CCR | 81.1441 | 83.1573 | 85.7656 |
| Avg. of the three models' accuracy | 86.6417 | 88.5908 | 90.8487 |
Table 16. Accuracy comparison for the Make Density Based Clusterer algorithm (all numbers are in percent).

| Model (Make Density Based Clusterer) | Step A | Step B | Step C |
|---|---|---|---|
| FDH | 90.5122 | 92.1552 | 94.1121 |
| BCC | 84.1227 | 86.7431 | 88.5567 |
| CCR | 79.2311 | 81.7656 | 83.9087 |
| Avg. of the three models' accuracy | 84.6230 | 86.8879 | 88.8591 |
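Tables 14–16 compare three WEKA clusterers run on the filtered DEA scores. As a rough illustration of the K-means step alone, here is a from-scratch Lloyd's-algorithm sketch on toy yearly-efficiency profiles; the data, initial centroids, and resulting grouping are invented for the example and are not the study's:

```python
import numpy as np

def kmeans(points, centroids, iters=20):
    """Plain Lloyd's algorithm: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its members, repeating
    for a fixed number of iterations."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    for _ in range(iters):
        # Euclidean distance from every point to every centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centroids = np.array([points[labels == c].mean(axis=0)
                              for c in range(len(centroids))])
    return labels

# Toy yearly-efficiency profiles: two near-efficient companies and two
# persistently inefficient ones should fall into two separate clusters.
profiles = [[1.00, 1.00, 0.94, 0.86, 1.00, 1.00],
            [1.00, 1.00, 1.00, 1.00, 1.00, 1.00],
            [0.60, 0.55, 0.58, 0.52, 0.57, 0.61],
            [0.62, 0.50, 0.55, 0.58, 0.60, 0.59]]
labels = kmeans(profiles, centroids=[profiles[0], profiles[2]])
```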
