What Does Cost Structure Have to Say about Thermal Plant Energy Efficiency? The Case from Angola

This paper analyzes the efficiency of thermal power plants in Angola by means of a two-stage Data Envelopment Analysis (DEA) approach. In the first stage, a novel super-efficiency DEA model for undesirable outputs (CO2 emission levels and discharge of polluted water) is used to measure their efficiency levels. Then, in the second stage, relevant cost structure variables frequently used to describe a productive technology are employed as analytical thresholds for assessing energy production performance in terms of either capital or labor-intensity levels. Precisely, bootstrapped regression trees are used to discriminate super-efficiency scores, yielding a predictive model of energy production performance based on the technology type, as proxied by its cost structure and the respective thresholds, since Angolan thermal plants are heterogeneous. Findings suggest that Angolan power plants are old and labor intensive, as some of them date back to the colonial era, and that the lack of capital investment should be revised in favor of installing carbon capture devices. The approach developed here is valuable for identifying priorities when technologically updating a heterogeneous thermal industry to face pollutant concerns.


Introduction
This research focuses on a relatively understudied topic in thermal plants: the relationship between productive efficiency and cost structure in a developing country such as Angola, a former Portuguese colony and currently an important oil producer and exporter in Africa. Although the efficiency of thermal power plants at the machine or equipment level is a well-defined and parameterized research stream based on the laws of thermodynamics [1,2], here we are interested in assessing the efficiency of thermal energy production at the industry level, where different plants are the units of analysis and reflect how a set of individual equipment interacts with other factors such as labor, managerial style, and capital. Although several previous studies have focused on power plant efficiency in developed countries, only a handful have addressed this issue in developing nations, despite their growing relevance with respect to energy generation [3,4]. The scant previous research on thermal power plant energy efficiency has employed distinct methods, but the most common rely on non-parametric approaches such as Data Envelopment Analysis (DEA); see, for instance, [5] and [6] for comprehensive reviews on the subject. This research analyzes the technical efficiency of energy production in thermal plants located in Angola using a new super-efficiency DEA model to account for undesirable outputs. The major underlying idea is to assess whether cost structure variables are related to technical super-efficiency scores in thermal energy production, thus providing a means for describing how different technological patterns can be employed to control the emission of undesirable outputs (pollutants) such as CO2 emissions and discharged water.
While scores are computed using a super-efficiency model, the cost structure variables are tested as efficiency thresholds in bootstrapped regression trees, thus allowing for the discrimination of different technological patterns in thermal energy production.
It is worth mentioning why the set of Angolan thermal power plants represents a controlled environment for conducting this search. First, these thermal power plants are all state owned and operate in a monopolistic market; therefore, the impact or influence of different managerial practices on efficiency scores and cost structures is quite limited. Additionally, these 32 plants were built over the course of more than forty years after Angola's independence from Portugal and therefore different production technologies used in thermal energy generation and their evolution are reflected within this set of plants.
Another distinctive aspect of this research concerns the analysis of the cost structure as a means to identify technological patterns and propose improvement paths within the ambit of an industry or sector. In fact, there is an emerging research stream that can be observed in a few efficiency studies on energy and infrastructure areas that tries to correlate efficiency levels to the cost structure of the productive process. Wang et al. [5], Lin and Yang [6], and Barros and Wanke [7] analyzed possible improvement paths for energy production efficiency in the thermal power industry based on cost structure variables. Wanke and Barros [8] conducted a similar analysis within the ambit of airport operations. All of these studies focused on the emerging economies of Asia and Africa where issues regarding a better balance between capital intensity and labor expenditure are deemed relevant to achieve higher productivity levels.
This paper contributes to the body of literature on energy efficiency planning in three distinct ways. Firstly, and for the first time, this research unveils the relationships among capital, labor, pollutant emissions, and energy efficiency in thermal power generation plants. Although Angolan data are used, the results presented here can be generalized to other contexts since capital and labor cost structure variables are the key fundamentals for energy generation performance. Secondly, this paper adds to this emerging research stream by applying a novel super-efficiency DEA model for undesirable outputs together with statistical learning techniques such as bootstrapped regression trees, thus allowing the impact of different productive technologies to be explored by proxying them with the cost structure. In fact, this is the first time that DEA and machine learning are used in a complementary way to explore productive technologies by means of their underlying labor and capital cost structure. Thirdly, this paper contributes to a more general research stream beyond energy generation by digging into the cost structure variables and the respective outcomes of the productive process. While technological patterns are mostly discussed when different energy sources are compared (e.g., thermal vs. hydro), a deeper analysis within each type is deemed necessary. In this research, the term productive technology describes the technical means by which fuel, labor, and capital are combined to produce thermal energy. While the concept of productive technology transcends the modeling of a functional relationship between inputs and outputs, some of its intrinsic structural concepts, such as efficiency and cost structure, can be captured and related to each other for decision-making purposes.
This text is structured in six more sections. Section 2 presents the background of thermal power plants in Angola. Section 3 presents the literature review, focusing on previous efficiency studies on thermal power plants that use non-parametric methods such as DEA and its variants as cornerstones. Section 4 presents the new super-efficiency DEA model capable of handling undesirable or bad outputs developed for this research. Section 5 presents the dataset used and the bootstrapped regression trees method, which are followed in Section 6 by the results and discussion. Section 7 makes some concluding remarks.

Background on Angolan Thermal Power Plants
This search for different technological patterns in thermal plant energy efficiency and their underlying impact on pollutant emissions is conducted in 32 Angolan thermal plants during the period from 2010 to 2016. It is worth mentioning why this set of plants represents a controlled environment for conducting this search. First of all, these thermal power plants are all state-owned and operate in a monopolistic market, therefore the impact or influence of different managerial practices on efficiency scores and cost structures is quite limited. Besides, these 32 plants have been built over the course of more than forty years since the independence of this country from Portugal. Therefore, different productive technologies used in thermal energy generation and their evolution are reflected within this set of plants.
Precisely, Angolan thermal power plants are state-owned facilities that burn fuel oil, coal, and natural gas, assuring energy supply for domestic use. The task of generating electricity has been given to a state firm, ENE-EP (Empresa Nacional de Electricidade-Empresa Pública), which controls energy production in every province of Angola. As regards the planning of thermal energy production, the required fuel resources and other technical capacity issues of each plant are decided in a centralized fashion by ENE-EP. However, another state company, EDEL-EP (Empresa de Distribuição de Electricidade-Empresa Pública), controls electricity distribution in Angola. In the Luanda region, power distribution is controlled by a subsidiary of EDEL created specifically for this purpose. This arrangement harks back to colonial times, when the Portuguese government performed infrastructure planning, which was highly concentrated in the Luanda region.
In fact, the technological choices embedded in the construction of the thermal power plant industry in Angola over the course of the years yielded a unique cost structure that reflects the type of fuel burnt, the age of the plant along with its relatively small size, and the managerial style imposed by the state control. Older technologies, small-scaled operations, and state control are elements (as corroborated by the literature review in Section 3) that are often related to inefficient operations.
It is reasonable to expect that all these elements are not only reflected in the efficiency levels, but also in the different cost structure variables and ratios that characterize the productive technology in thermal plants such as capacity cost, labor and capital costs, besides cost-asset and capital-labor ratios. All these variables are analyzed in this study and are further discussed and operationalized in Section 4. Thus, this study seeks to identify the subset of cost structure variables and ratios that best explain efficiency levels and pollutant emissions in Angolan thermal plants.

Literature Review on Thermal Power Plant Efficiency
Thermal power plant efficiency is an emerging research area. More than 20 papers have been published over the course of 18 years with most being published from 2010 on. Not only are these papers characterized by a diversity of methodological approaches, although the non-parametric related DEA methods prevail, but also their research questions are very diverse. They range from efficiency comparison between different technologies or the evolution of efficiency levels over the course of time to the impact of regulatory issues and ownership, besides emerging trends as regards carbon dioxide emissions and other sustainability issues.
For example, seminal papers in the area encompass Park and Lesourd [9] who measured the performance of 64 fuel power plants in South Korea. Lam and Shiu [10] computed the efficiency of China's thermal power generation based on data for 1995 and 1996. Chien et al. [3] applied a DEA-Malmquist model to measure the efficiency of eight thermal power plants in Taiwan. Sarica and Or [11] analyzed and compared the performance of electricity generation plants in Turkey. Barros and Peypoch [12] analyzed the technical efficiency of Portuguese thermoelectric power generating plants with a two-stage procedure. Nakano and Managi [13] assessed efficiency in the Japanese steam-power sector and scrutinized the impact of reforms on the relative performance of its firms over the course of almost two decades. Sozen et al. [14] studied the performance of 11 lignite-fired, 1 coal-fired, and 3 natural gas-fired Turkish state companies using DEA. Liu et al. [15] assessed the efficiency levels of major thermal and combined cycle power plants in Taiwan for the period 2004-2006 using the DEA  approach. Rezaee et al. [16] and Rezaee [17] presented a novel model based on game theory and DEA to measure efficiency of Iranian thermal power plants in light of diverse objectives. Shrivastava et al. [18] focused on the relative productivity of 60 Indian coal-fired plants by means of classic DEA models. Sueyoshi and Goto [19] used DEA to conduct an efficiency assessment of coal-fired plants in the US. Du and Mao [20] measured the environmental efficiency and costs related to carbon dioxide emissions in Chinese coal plants by means of a novel DEA model.
Recent papers still maintain this focus. Ghosh and Kathuria [21] investigated the impact of institutional quality typified as regulatory governance on the performance of thermal power plants in India. They estimated a translog stochastic frontier model using an index of state-level independent regulation as one of the determinants of inefficiency. Their findings show that technical efficiency is sensitive to both unbundling of state utilities and regulatory experience. Barros and Wanke [7] utilized a two-stage approach for efficiency evaluation in thermal power plants. First, a DEA Slacks Based Model (SBM) was employed to assess the relative efficiency of thermal power plants. In the second stage, beta regression models were combined with DEA-SBM efficiency scores to produce a model for predicting energy production performance. Yan et al. [22] evaluated the carbon emission efficiency using the Undesirable-SBM model and data from China's power industry in 30 provinces from 2003 to 2014. They performed a spatial autocorrelation analysis that is based on Moran's index to confirm the non-equilibrium spatial distribution of the carbon emission efficiency for the power industry. Wu et al. [23] employed an improved two-stage analysis model to analyze eco-efficiency of 58 Chinese coal-fired power plants. Firstly, the principal component analysis was selected for pre-treatment of variables in order to reduce dimensionality and distinguish prioritized factors. Secondly, the super-efficiency DEA was chosen to assess eco-efficiency with overall discriminatory rankings. Table 1 presents a literature review synthesis of the previous studies on efficiency in thermal energy generation. 
These previous studies not only suggest that environmental issues have grown in importance over the years as more and more plants focus on reducing carbon dioxide emissions, but they also reveal that the production technology of each plant may be a relevant field of study for understanding efficiency in thermal energy generation. Precisely, plant age, fuel type, scale size, frontier shift, and catching-up effects appear to be the descriptors most used in previous papers to characterize the technological patterns that lie within thermal energy generation. Ownership and regulatory impacts are also relevant issues addressed by some of these studies, which were undertaken in different countries worldwide, especially in Asia and Europe, while research on African countries is still negligible. Moreover, although non-parametric methods such as DEA and SBM (Slacks Based Model) prevail, it is worth noting that the use of super-efficiency models is still scarce and focused on addressing environmental impacts (Wu et al., [23]). Therefore, this research fills a literature gap by not only addressing the issue of thermal energy production efficiency in an important African country, but also by proposing a novel super-efficiency model to better discriminate efficiency scores in a sector where productivity variations are subtle, especially within the ambit of the Angolan state-owned plants, as discussed in Section 2. Additionally, the review synthesized in Table 1 also sheds some light, by contrast, on the nature of the contributions of this paper, since most previous studies did not consider the impact of cost structure to diagnose technological patterns and their relationship with efficiency levels and pollutant emissions (Barros and Wanke, [7]). Some of them, however, use fuel prices to trace their impact on eco-efficiency levels (Wu et al., [23]).
In fact, the joint use of super-efficiency DEA models and bootstrapped regression trees is an additional innovative feature of this research when compared to previous research described in Table 1 since for the first time statistical learning methods are employed in the second stage of analysis when efficiency scores are usually correlated or regressed onto contextual variables.
Hence, the distinctive aspect of the current research in comparison to previous studies is the joint use of novel super-efficiency models and statistical learning techniques to unveil the relationship between cost structure (as a proxy of the productive technology) and efficiency levels in thermal power plant energy generation. As explored in Section 5.2, statistical learning methods constitute a useful approach for unveiling hidden relationships between efficiency scores and cost structure variables, since they do not rely on the traditional parametric assumptions found in the typical regression approaches (Tobit, truncated bootstrapped regression, beta, and Generalized Method of Moments (GMM), cf. Table 1), yielding higher analytical flexibility and explanatory power.

The Proposed Super-Efficiency DEA Model with Undesirable Outputs
A comprehensive review of previous studies shows that most DEA applications have considered primary cross sectional data and evaluated relative efficiencies in a single period, usually one year (Emrouznejad and Yang, [28]; Fernández et al., [29]). Exceptions are found in window analyses (Charnes et al., [30]) and other models based on the Malmquist productivity index (Färe and Grosskopf, [31]; Yao et al., [4]). Looking beyond the inherent differences between these models, we posit that their main objective is to address the changing patterns of efficiency scores along distinct time periods. While these approaches may be useful for decision making, they do not take into account an aggregated measure of efficiency that represents and provides a synthesis of multiperiod productive systems.
In this sense, a clear exception is found within the ambit of dynamic DEA models. Nemoto and Goto [32] proposed a dynamic DEA model to assess the overall efficiency of a multiperiod production system. This overall efficiency can be viewed as price or economic efficiency. However, even in a particular period, the assumption of exact costs of individual inputs is unrealistic (Thompson et al., [33]). Moreover, the true monetary value (or exact discount factor) of an input over the time horizon remains unknown in practice (for more details see Jahanshahloo et al., [34]; Silva and Stefanou, [35]; Soleimani-damaneh, [36]).
In this paper, we focus on Multiperiod Data Envelopment Analysis (MDEA), in which the aggregative efficiency in the context of time series data is measured. No previous information on prices or on input and output weights across multiple periods is required. Hence, a Multiperiod Aggregative Efficiency (MAE) that corresponds conceptually to a technical (but not price or economic) efficiency of multiperiod production units can be delivered. This multiperiod technical efficiency, free of price information, is deemed necessary to proxy in an unbiased way for the technological pattern of a thermal plant over the course of time in terms of its cost structure, which is treated here within the ambit of the contextual variable set.
Besides, a distinctive feature of the model developed here regards super-efficiency. The super-efficiency concept is traditionally used in DEA to break the ties between fully efficient units. In other words, the super-efficiency approach is an alternative that provides better discrimination among Decision Making Units (DMUs). Suppose there are $n$ DMUs and $L$ periods, $t = 1, 2, \ldots, L$, and in each period DMU$_j$, $j = 1, 2, \ldots, n$, consumes $m$ inputs, $x_{ij}^t$, $i = 1, 2, \ldots, m$, to produce $s$ outputs, $y_{rj}^t$, $r = 1, 2, \ldots, s$. In order to compute the multiperiod aggregative efficiency (MAE) in the context of time series data, Sam Park and Park [37] proposed a two-phase DEA model (PP-model). The phase-I of the PP-model related to DMU$_o$ ($o \in \{1, 2, \ldots, n\}$) is given by Equation (1).

Indeed, Equation (1) is an aggregated output-oriented CCR model that evaluates DMU$_o$ within all $L$ periods simultaneously [38]. Let $\psi_o^*$ be the optimal value of Equation (1); if $\psi_o^* = 1$, we say DMU$_o$ is weakly efficient. Equation (2) is then solved in phase-II of the PP-model.
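As a sketch of what the two phases look like, an aggregated output-oriented CCR model consistent with the description above can be written as follows. The intensity variables $\lambda_j$ (shared across all periods) and the slack variables $s_i^{t-}$ and $s_r^{t+}$ are assumptions introduced for illustration; they are not defined in the text, and the published Equations (1) and (2) may differ in detail.

```latex
% Phase I (assumed form): a single expansion factor psi for all L periods
\begin{aligned}
\psi_o^* = \max\ & \psi \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j x_{ij}^t \le x_{io}^t,
  && i = 1,\ldots,m,\quad t = 1,\ldots,L, \\
& \sum_{j=1}^{n} \lambda_j y_{rj}^t \ge \psi\, y_{ro}^t,
  && r = 1,\ldots,s,\quad t = 1,\ldots,L, \\
& \lambda_j \ge 0, && j = 1,\ldots,n.
\end{aligned}

% Phase II (assumed form): maximize total slack with psi fixed at psi_o^*
\begin{aligned}
\max\ & \sum_{t=1}^{L}\Bigl(\sum_{i=1}^{m} s_i^{t-} + \sum_{r=1}^{s} s_r^{t+}\Bigr) \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j x_{ij}^t + s_i^{t-} = x_{io}^t,
  && i = 1,\ldots,m,\quad t = 1,\ldots,L, \\
& \sum_{j=1}^{n} \lambda_j y_{rj}^t - s_r^{t+} = \psi_o^*\, y_{ro}^t,
  && r = 1,\ldots,s,\quad t = 1,\ldots,L, \\
& \lambda_j,\ s_i^{t-},\ s_r^{t+} \ge 0.
\end{aligned}
```

Because DMU$_o$ itself belongs to the reference set in this form, $\psi = 1$ is always feasible, which is why ties at $\psi_o^* = 1$ arise and a ranking device is needed.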
Generally, there is more than one DMU with $\psi_o^* = 1$ per Equation (1); therefore, it is necessary to propose a model to rank efficient DMUs. In this study, inspired by the AP-model (Andersen and Petersen, [39]), in which DMU$_o$ is removed from the production possibility set, Equations (1) and (2) are converted to their super-efficiency counterparts, Equations (3) and (4), respectively.
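The effect of removing DMU$_o$ from the reference set can be illustrated with a deliberately reduced case. The sketch below assumes a single period, one input, one output, and constant returns to scale, where the output-oriented linear program collapses to a ratio comparison; it is not the multiperiod model of this paper, and the plant data are hypothetical.

```python
def super_efficiency_1x1(inputs, outputs, o):
    """Output-oriented CRS super-efficiency of unit `o` (one input, one output).

    With unit `o` excluded from the reference set, the optimal expansion
    factor equals the best output/input ratio among the other units divided
    by the ratio of unit `o` itself.
    """
    own_ratio = outputs[o] / inputs[o]
    best_other = max(
        y / x for j, (x, y) in enumerate(zip(inputs, outputs)) if j != o
    )
    return best_other / own_ratio


# Hypothetical data: fuel cost (input) and energy produced (output), 4 plants
fuel = [10.0, 20.0, 30.0, 40.0]
energy = [20.0, 30.0, 30.0, 20.0]
scores = [super_efficiency_1x1(fuel, energy, o) for o in range(len(fuel))]
```

In this toy case only the first plant obtains a score below 1, since no other plant can match its output per unit of input; without the exclusion of DMU$_o$ its score would be capped at exactly 1.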
The drawback of Equations (3) and (4) is that they consider a single overall efficiency score for each DMU, whereas the behavior of a DMU may change from one period to another. Consequently, it is reasonable to consider a different efficiency score for each time period; therefore, Equations (5) and (6) are proposed.
Here $\psi_o^t$ represents the efficiency score of DMU$_o$ in the $t$-th period, $t = 1, 2, \ldots, L$. The objective function of Equation (5) is the average efficiency of DMU$_o$ within all $L$ periods. The advantage of Equation (5) is that it not only maximizes the average efficiency of DMU$_o$ but also computes each $\psi_o^t$. The objective of Equation (6) is to find a solution that maximizes the sum of input excesses and output shortfalls while keeping $\psi_o^t = \psi_o^{t*}$. Let $\xi_o^*$ be the optimal value of Equation (6); then we have the following definitions.
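For reference, a period-specific super-efficiency model consistent with the description of Equation (5) can be sketched as below. The intensity variables $\lambda_j$, and the choice to share them across periods, are assumptions made for illustration only; the published formulation may differ.

```latex
% Assumed form: average of period-specific scores, DMU_o excluded (j != o)
\begin{aligned}
\max\ & \frac{1}{L}\sum_{t=1}^{L} \psi_o^t \\
\text{s.t.}\ & \sum_{j \ne o} \lambda_j x_{ij}^t \le x_{io}^t,
  && i = 1,\ldots,m,\quad t = 1,\ldots,L, \\
& \sum_{j \ne o} \lambda_j y_{rj}^t \ge \psi_o^t\, y_{ro}^t,
  && r = 1,\ldots,s,\quad t = 1,\ldots,L, \\
& \lambda_j \ge 0, && j \ne o.
\end{aligned}
```

Replacing the single factor $\psi$ by one factor per period is what allows the score of a plant to vary from year to year while the objective still summarizes the whole horizon.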
Conventional DEA models rely on the assumption that outputs have to be maximized. However, it has already been noted in the literature that the production process might also produce undesirable outputs (Mariano et al., [40]; Ozkan and Ulutas, [41]; Scheel, [42]). In Equation (1), the outputs of the DMUs are all desirable outputs, and Equation (1) cannot be applied when some of the outputs are undesirable. Now, assume that each DMU$_j$ uses $m$ inputs to produce $s$ desirable outputs and $k$ undesirable outputs. The inputs, desirable outputs, and undesirable outputs of each DMU$_j$, $j = 1, 2, \ldots, n$, in period $t$ are defined as $x_{ij}^t$, $i = 1, 2, \ldots, m$, $y_{rj}^{gt}$, $r = 1, 2, \ldots, s$, and $y_{pj}^{bt}$, $p = 1, 2, \ldots, k$, respectively. Similar to Seiford and Zhu [43] and Hadi-Vencheh et al. [44], we assume strong disposability of the undesirable outputs. The data of the undesirable outputs are then transformed using Equation (7).
In Equation (7), $w^t$ is a positive vector for period $t$, which can be used to let all the negative undesirable outputs be positive. Considering the transformed data, Equations (5) and (6) are converted to Equations (8) and (9), respectively.
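A minimal sketch of this kind of translation is given below, assuming that Equation (7) takes the linear form "negate, then shift by $w$" used in the Seiford and Zhu approach. The rule for choosing $w$ (largest observed value plus a margin) and the emission figures are hypothetical.

```python
def translate_undesirable(values, margin=1.0):
    """Return (w, transformed), where transformed = -value + w for every unit.

    `w` is chosen here as the largest observed value plus `margin`, which is
    one simple way (an assumption) to guarantee strictly positive results.
    """
    w = max(values) + margin
    return w, [w - v for v in values]


# Hypothetical CO2 emissions of three plants in one period
co2 = [120.0, 95.0, 140.0]
w, co2_bar = translate_undesirable(co2)
```

After the translation, the plant with the largest emissions receives the smallest positive value, so the ordering of the plants is reversed while all data remain positive.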
As can be seen from (8) and (9), the undesirable outputs of the DMUs are considered as inputs when evaluating the performance of the DMUs.
As regards fitting the model to the Angolan thermal plant industry analyzed here, it is worth justifying some methodological choices with respect to the returns-to-scale assumption, the productive orientation of the model, and the assumption of strong disposability of pollutants. First, a constant returns-to-scale assumption was adopted because, by international standards, even the largest Angolan plant is small when compared to its US, EU, or even South African counterparts. Besides, as shown by the results analyzed and discussed in Section 5, these thermal plants are labor intensive, and in such cases the returns-to-scale effect tends to behave mostly linearly with respect to the number of employees. Second, an output orientation was chosen due to the socioeconomic characteristics of the country and, indeed, of the African region as a whole, since energy shortages are frequent and interruptions in energy supply cannot be offset through alternative transmission lines, given that transmission networks within Angola and between African countries barely exist. Third, strong disposability of undesirable pollutants was considered an adequate assumption due to the existence of technological devices for reducing CO2 emissions and water discharges without affecting energy production levels.

Data
Data for 2010-2016 encompassing the operation of 32 Angolan thermal plants were obtained from ENE-EP. The inputs of these thermal plants are those commonly used in previous papers and encompassed, besides the costs of investment, fuel, and labor, the productive capacity and the number of employees. It is worth mentioning that productive capacity was adopted here to proxy a relevant resource constraint, thus modeling that energy production cannot be indefinitely expanded in the short and medium terms. In fact, productive capacity is considered here as a fixed input required to produce energy despite severe limitations with respect to its short/medium term variations.
As regards the outputs, energy production is used as the sole desirable output, while carbon dioxide emissions and discharged polluted water are considered the undesirable outputs. It is worth mentioning that, due to a lack of completeness of the dataset, fuel costs had to be estimated as a proportion of the investment costs and the labor costs, observing accepted technical standards. For a conventional coal power plant, capital costs lie around 65%, whereas fuel costs are about 30% (EIA, [45,46]). For conventional gas power plants, capital costs are about 32%, whereas fuel costs are about 61% (EIA, [45,46]). For modern combined cycle gas power plants, capital costs are 23% and fuel costs increase to over 70%, with labor costs tending to be negligible (EIA, [45,46]). Table 2 presents their descriptive statistics, and the respective dataset used is given in Appendix A. In addition, a number of contextual variables are used to proxy the productive technology. They are also described in Table 2 and involve major cost structure elements. These variables are fivefold: the capacity cost per MW (calculated as the ratio of the logarithms between the total cost of the plant and its productive capacity), the labor cost per employee (calculated as the ratio of the logarithms between total salaries paid and number of employees), the capital cost (calculated as the ratio of the logarithms between amortizations and total assets), the cost-asset ratio (calculated as the ratio of the logarithms between total costs and total assets), and the capital-labor ratio (calculated as the ratio of the logarithms between total assets and total salaries paid to employees). It is also important to mention that principles of accrual accounting were adopted here to compute the capital cost as the ratio of amortizations to total assets.
In fact, amortization, depreciation, and depletion are methods that are used to prorate the cost of a specific type of asset over the asset's life. This prorated cost yields, therefore, an accounting proxy for the capital cost.
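The contextual variables described above can be sketched in a few lines, reading "ratio of the logarithms" literally as log(a) / log(b). The field names and the figures of the example plant are hypothetical, and the logarithm base is an assumption that does not affect the result, since a ratio of logarithms is the same in any base.

```python
import math


def log_ratio(numerator, denominator):
    """Ratio of logarithms, log(numerator) / log(denominator)."""
    return math.log(numerator) / math.log(denominator)


def cost_structure(plant):
    """Compute the five contextual variables for one plant-year."""
    return {
        "capacity_cost": log_ratio(plant["total_cost"], plant["capacity_mw"]),
        "labor_cost": log_ratio(plant["salaries"], plant["employees"]),
        "capital_cost": log_ratio(plant["amortizations"], plant["total_assets"]),
        "cost_asset_ratio": log_ratio(plant["total_cost"], plant["total_assets"]),
        "kl_ratio": log_ratio(plant["total_assets"], plant["salaries"]),
    }


# Hypothetical plant-year (monetary values in the same currency unit)
plant = {
    "total_cost": 1_000_000.0,
    "capacity_mw": 100.0,
    "salaries": 10_000.0,
    "employees": 100.0,
    "amortizations": 1_000.0,
    "total_assets": 1_000_000.0,
}
ratios = cost_structure(plant)
```

One practical caveat of this construction is that the denominator must differ from 1, because its logarithm would otherwise be zero.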

Bootstrapped Regression Trees
Tree methods were first used by researchers [2,47,48] and have gained popularity through the major theoretical and practical contributions of Breiman et al. [5]. They involve stratifying and segmenting the predictor space into a number of simple regions (James et al., [49]). These features are particularly useful since (a) efficiency scores reflect uncertainty derived from vagueness in input/output collection and (b) explanatory variables may be endogenous or exogenous. Readers should refer to Faraway [2], Opitz and Maclin [50], Polikar [51], and Torgo [52] on how to resample (ensemble) trees using bagging.
Bagging ("bootstrap aggregation") is a bootstrap ensemble method introduced by Breiman [53] that combines predictors fitted on different bootstrap subsets of the training data. The R functions used to perform the bootstrapped regression tree analysis and the respective default values used are presented in Table 3 (Ledolter, [54]).

Table 3. R packages and respective default values for bootstrapped regression trees.
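The mechanics of bagging can be illustrated with a small from-scratch sketch. It is written in plain Python rather than with the R functions of Table 3, and the tree depth, minimum node size, number of trees, and synthetic data are all assumptions chosen for brevity.

```python
import random
from statistics import mean


def sse(ys):
    """Sum of squared deviations from the mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def best_split(rows, ys):
    """Find the (feature, threshold) pair with the lowest total squared error."""
    best = None
    for f in range(len(rows[0])):
        for thr in sorted({r[f] for r in rows})[1:]:
            left = [y for r, y in zip(rows, ys) if r[f] < thr]
            right = [y for r, y in zip(rows, ys) if r[f] >= thr]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, thr)
    return best


def grow(rows, ys, depth=2, min_size=4):
    """Grow a small regression tree; each leaf stores the mean response."""
    split = best_split(rows, ys) if depth > 0 and len(ys) >= min_size else None
    if split is None:
        return mean(ys)
    _, f, thr = split
    left = [(r, y) for r, y in zip(rows, ys) if r[f] < thr]
    right = [(r, y) for r, y in zip(rows, ys) if r[f] >= thr]
    return (f, thr,
            grow([r for r, _ in left], [y for _, y in left], depth - 1, min_size),
            grow([r for r, _ in right], [y for _, y in right], depth - 1, min_size))


def predict(tree, row):
    """Follow the thresholds down to a leaf value."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if row[f] < thr else right
    return tree


def bagged_trees(rows, ys, n_trees=25, seed=0):
    """Fit one tree per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(ys)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(grow([rows[i] for i in idx], [ys[i] for i in idx]))
    return trees


def bagged_predict(trees, row):
    """Average the predictions of all trees in the ensemble."""
    return mean(predict(t, row) for t in trees)


# Synthetic data: (labor cost, capacity cost) -> super-efficiency score
rows = [(i / 20, (i * 7 % 20) / 20) for i in range(20)]
ys = [0.9 if r[0] < 0.5 else 1.2 for r in rows]
trees = bagged_trees(rows, ys, n_trees=10, seed=1)
```

Averaging over trees grown on resampled data is what stabilizes the thresholds: a single tree can change markedly when a few observations are added or removed, whereas the ensemble prediction varies much less.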

Results and Discussion
The levels of super-efficiency computed for the Angolan thermal power plant sample using the proposed super-efficiency model for undesirable outputs are presented in Figures 1-4. The full rank of DMU scores is given in Appendix B. Readers should recall that a DMU is considered efficient (inefficient) if its super-efficiency score is below (above) 1. If the super-efficiency score is exactly 1, the DMU is classified as weakly efficient. As displayed in Figure 1, pooled efficiency scores are strongly concentrated around 1 and inefficiency prevails in Angolan thermal power plants. Super-efficiency levels also appear to be stagnant over the course of the years, as suggested by Figure 2, even though a slight decrease in efficiency is seen from 2014 on. Although this stagnant behavior may be justified by the fact that Angolan thermal power generation plants are publicly owned and controlled by ENE-EP, a state company (cf. Section 2), the wide dispersion of super-efficiency scores suggests that inefficiency may be driven by different technological patterns in thermal energy production reflected in cost-structure variables, and that they may also impact the emission of pollutants. Based on the literature review, the evidence implies that the smaller, older, coal-fired thermal plants are less efficient and more polluting than the larger, newer plants that burn gas. The former would, therefore, be located above one in Figure 1, while the latter would be located below one. One is the super-efficiency threshold that divides efficient from non-efficient plants.

As expected, super-efficiency scores fluctuate around 1, although not symmetrically. Readers should recall that inefficient plants present scores higher than 1.0 and that a score of 0.6, for instance, indicates a less efficient plant than another one with a score of 0.5.
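The reading rule for the scores can be written down as a small helper. The numerical tolerance used to detect a score of exactly 1 and the example scores are assumptions for illustration.

```python
def classify(score, tol=1e-9):
    """Label a super-efficiency score using the threshold of 1."""
    if abs(score - 1.0) <= tol:
        return "weakly efficient"
    return "efficient" if score < 1.0 else "inefficient"


# Hypothetical scores; sorting in ascending order ranks the most efficient first
scores = {"plant_a": 0.6, "plant_b": 0.5, "plant_c": 1.0, "plant_d": 1.3}
ranking = sorted(scores, key=scores.get)
labels = {name: classify(value) for name, value in scores.items()}
```

Under this convention lower is better, so the plant with a score of 0.5 is ranked ahead of the one with 0.6, although both are efficient.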
Figure 3 reveals that inefficient thermal power plants tended to emit more CO2, although this direct relationship did not appear to be so strong in terms of polluted water. In fact, the following results could be observed when comparing the groups of efficient and inefficient plants. While the efficient plants presented a CO2 emission level that was 1.08% lower than that observed in the inefficient plants, the discharge of polluted water presented an even smaller variation between both groups (around 0.33% less). Similarly, Figure 4 also presents a scatterplot panel of efficiency levels, but now against cost structure variables. Figure 4 clearly shows that inefficient energy generation prevailed when Angolan thermal plants presented a lower KL ratio, meaning that they are labor intensive, and had a higher cost-asset ratio, which means that they have higher operating costs in comparison to the value of their assets. As regards the lower KL ratios, the state-controlled operation of Angolan thermal plants may explain the excessive number of highly paid employees in relation to the size/scale of the plant. On the other hand, higher cost-asset ratios may reflect the operation of older, smaller, poorly maintained plants. With this cost structure in mind, the adoption of costly technologies for controlling CO2 emissions may be enhanced by opportunities that emerge from rebalancing KL and cost-asset ratios.
This is necessary because adopting carbon capture devices would certainly increase the cost-asset ratio. Therefore, labor expenses should be rightsized first in order to make room in the cost structure before adopting such an antipollutant measure.
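The rebalancing argument can be illustrated with a small numerical sketch. The definitions used below are assumptions made for illustration only: the KL ratio is taken as asset value divided by labor expenses, and the cost-asset ratio as total operating costs (labor plus other operating expenses) divided by asset value. The class name `PlantCosts` and all figures are hypothetical and are not taken from the Angolan sample.

```python
from dataclasses import dataclass


@dataclass
class PlantCosts:
    """Hypothetical cost structure of a plant (arbitrary monetary units)."""
    capital: float          # value of plant assets
    labor: float            # annual labor expenses
    other_operating: float  # annual non-labor operating expenses

    @property
    def kl_ratio(self) -> float:
        # Assumed definition: capital relative to labor expenses.
        return self.capital / self.labor

    @property
    def cost_asset_ratio(self) -> float:
        # Assumed definition: total operating costs relative to asset value.
        return (self.labor + self.other_operating) / self.capital


# Same assets and non-labor costs, before and after rightsizing labor.
before = PlantCosts(capital=100.0, labor=80.0, other_operating=40.0)
after = PlantCosts(capital=100.0, labor=50.0, other_operating=40.0)

for name, plant in (("before", before), ("after", after)):
    print(f"{name}: KL={plant.kl_ratio:.2f}, "
          f"cost-asset={plant.cost_asset_ratio:.2f}")
```

In this toy example, rightsizing labor raises the KL ratio and lowers the cost-asset ratio, which is the headroom the text suggests creating before the additional costs of carbon capture are incurred.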
Firstly, an adequate balance between a rightsized labor force and capital intensity seems to be the cornerstone for improving efficiency levels while simultaneously controlling the discharge of undesirable outputs as a byproduct of the energy generation process. The drop in oil prices imposed strong budgetary restrictions upon the Angolan economy, intensifying the classical conflict between labor and capital over the allocation of scarce resources. Secondly, with respect to the capital investments required for controlling carbon dioxide emissions, it is necessary to understand how economically feasible these investments are for a thermal plant and how they affect relevant ratios such as the capital-labor and cost-asset ratios. As a matter of fact, thermal energy generation is one of the largest contributors to the greenhouse effect worldwide. Except for CO2, all other emissions from a thermal plant can be mitigated with available technology at a feasible cost (http://www.brighthubengineering.com/power-plants/57788-power-plant-emissions/). This is because carbon dioxide is an unavoidable part of the thermal generation process. Therefore, systems for capturing carbon dioxide emissions constitute a costly alternative for reducing pollutant emissions in the context of the investments required for building and/or renovating a thermal plant (U.S. Department of Energy (DOE) and U.S. National Energy Technology Laboratory (NETL). 2010. DOE/NETL Carbon Dioxide Capture and Storage RD&D Roadmap. http://www.netl.doe.gov/File%20Library/Research/Carbon%20Seq/Reference%20Shelf/CCSRoadmap.pdf), but they are often economically feasible in newer plants.
These results are confirmed by the bootstrapped regression trees presented in Figures 5 and 6. While Figure 5 depicts the bootstrapped regression tree structure and the thresholds at each tree node, Figure 6 shows the most influential variables in terms of the overall increase in the Mean Squared Error (MSE). A regression tree is straightforward to interpret because it allows the decision-maker to segment the results. For example, the first branch of the tree presented in Figure 5 states that "if labor cost is lower than 0.33 and capacity cost is lower than 0.80, and KL ratio is lower than 1.37, then the average plant efficiency is 0.962 (almost weak-efficient)". The other branches read similarly. Both figures suggest that capacity cost and labor cost are the variables with the greatest impact on Angolan thermal power plant efficiency.

Policy implications for Angolan thermal plants include the need for better training of the workforce, downsizing of personnel in order to keep labor costs under control, and investment in carbon capture equipment. Since such equipment is expensive and affects the plant's capacity cost, it can be acquired by rebalancing the KL equilibrium in Angolan plants so that total operating costs are kept under control. There are, however, technological limits to adopting such measures, which are revealed by the capital cost. Lower values of capacity cost, often related to older amortized plants, may present physical constraints to the deployment of newer carbon capture technologies. In such cases, efficiency improvements may be confined to the traditional conversion from diesel to combined cycle gas, which has already occurred in some Angolan thermal plants.
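The branch of Figure 5 quoted in the text can be read as a simple decision rule. The sketch below encodes only that one branch, using the thresholds and the average score reported for it. The function name `first_branch_efficiency` is a hypothetical label, and the function returns `None` for plants outside the branch because the thresholds of the remaining branches are not reproduced in the text.

```python
from typing import Optional


def first_branch_efficiency(labor_cost: float,
                            capacity_cost: float,
                            kl_ratio: float) -> Optional[float]:
    """Average super-efficiency score for the first branch of Figure 5.

    Returns None when the plant does not fall in that branch, since the
    other branches of the tree are not spelled out in the text.
    """
    if labor_cost < 0.33 and capacity_cost < 0.80 and kl_ratio < 1.37:
        return 0.962  # almost weakly efficient, as reported for this branch
    return None


# Illustrative inputs only; they are not observations from the sample.
print(first_branch_efficiency(0.20, 0.50, 1.00))  # falls in the branch
print(first_branch_efficiency(0.50, 0.50, 1.00))  # outside the branch
```

Writing each branch as an explicit rule of this kind is what allows a decision-maker to segment plants by cost structure and read off the expected efficiency level for each segment.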

Conclusions
This research assessed the thermal plants in Angola in terms of their energetic efficiency by jointly applying a novel super-efficiency DEA model that handles undesirable outputs and bootstrapped regression trees that discriminate productive technologies based on cost structure variables. This novel super-efficiency model makes it possible to capture subtle productivity variations in a state-owned industry where managerial practices tend to be quite similar over time. This paper offers insights into how the technological patterns or productive technologies of thermal energy generation are reflected in the cost structure variables of each plant by means of efficiency levels, which constitutes a relatively novel approach not only in the energy efficiency strand, but also in infrastructure efficiency.
Efficiency levels were computed based on three outputs (energy production, carbon dioxide emissions, and discharged polluted water) and on five inputs (fuel, investment, labor costs, plant capacity, and number of employees). The findings suggest that efficiency levels, which account for fuel consumption and undesirable emissions, are mostly affected by the capacity cost and the labor cost, which reflect the rightsizing and training of the workforce in parallel with the adoption of expensive carbon capture devices. Specifically with respect to the pattern of pollutant emissions/discharges, CO2 emissions appear to be more affected by the technological pattern of the power plant than the level of discharged polluted water, which may suggest that carbon capture technologies have evolved and can be deployed faster than technologies for recycling water in the energy generation process.
Limitations of this research are related to the very nature of case studies built on the evidence of single countries. Although these results cannot be generalized to other countries with different regulatory regimes, some useful lessons for conducting similar research in other countries have been learned. It may be advisable to focus on capital, labor, and operating expenses and their countervailing forces while seeking opportunities for adopting antipollutant technologies. Further research should confirm these results in other environments.

Acknowledgments:
The authors would like to thank two anonymous reviewers for their helpful comments and suggestions, which improved the first draft of this paper.

Conflicts of Interest:
The authors declare no conflict of interest.