In this section, the methodology used to evaluate the water companies’ efficiency is described. Afterwards, the input and output variables related to the companies analyzed are presented.
2.3.1. Data Envelopment Analysis (DEA)
DEA is a non-parametric approach for evaluating the performance of a set of decision-making units (DMUs). These DMUs must be homogeneous units (e.g., companies, organizations, or countries) that convert one or more inputs into one or more outputs. The DEA methodology was initially presented by Charnes, Cooper, and Rhodes [22] and is based on the research developed by Farrell [23]. This method defines efficiency as a simple ratio between the production (outputs) and the resources (inputs) of the kth DMU.
DEA uses linear programming models to estimate the inefficiency of DMUs, determining whether it is possible for an operative unit to obtain more outputs with the same inputs (output-oriented) or to obtain the same outputs using fewer inputs (input-oriented) [11]; this choice is referred to as the model orientation.
The first proposed model assumes constant returns to scale (CRS) and was named DEA-CCR after its authors, Charnes, Cooper, and Rhodes. This model seeks to establish which DMUs determine the efficient production frontier. Thus, the radial distance of a DMU to the frontier provides the measure of its efficiency. This model was extended by Banker, Charnes, and Cooper in 1984 [24] by assuming variable returns to scale (VRS), and the extension was called DEA-BCC. As DEA-BCC models measure efficiency net of scale effects, the efficiencies obtained are never lower, and usually higher, than those obtained with DEA-CCR models.
In our study, we initially used a DEA-CCR model to evaluate the efficiency of different Spanish UWUs. As all the companies analyzed operate in large Spanish cities, constant returns to scale were assumed at first, i.e., all the UWUs could achieve the efficiency of the most efficient ones. However, the results did not clearly support this assumption, so we decided to implement both technologies, CRS and VRS. Moreover, the two cases studied, which differ in the factors used as input and output variables, required the use of both existing orientations.
The DEA-CCR input-oriented and output-oriented models are presented below, together with the DEA-BCC input-oriented model. Given the total number of UWUs (our DMUs), the number of input variables, and the number of output variables, the following DEA-CCR input-oriented model (Equations (1)–(4)) is solved for each UWU:
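A standard multiplier-form statement of the input-oriented DEA-CCR model, consistent with the description in the text, can be sketched as follows. The notation is an assumption: n UWUs indexed by j, m inputs x_ij, s outputs y_rj, output weights u_r, input weights v_i, target unit o, and a small positive constant ε. The equation numbers are assigned to match the references made in the text (frontier constraint (2), positivity constraint (4)).

```latex
\begin{align}
\max_{u,\,v}\quad & \sum_{r=1}^{s} u_r\, y_{ro} \tag{1}\\
\text{s.t.}\quad & \sum_{r=1}^{s} u_r\, y_{rj} - \sum_{i=1}^{m} v_i\, x_{ij} \le 0, \qquad j = 1,\dots,n \tag{2}\\
& \sum_{i=1}^{m} v_i\, x_{io} = 1 \tag{3}\\
& u_r \ge \varepsilon,\quad v_i \ge \varepsilon, \qquad r = 1,\dots,s,\; i = 1,\dots,m \tag{4}
\end{align}
```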
The DEA-CCR output-oriented model (Equations (5)–(8)) has the following structure:
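In the same hedged spirit, a standard multiplier form of the output-oriented DEA-CCR model is sketched below. The notation is assumed rather than taken from the source: n UWUs indexed by j, m inputs x_ij, s outputs y_rj, output weights u_r, input weights v_i, target unit o, and a small positive constant ε. The numbering follows the references in the text (frontier constraint (6), positivity constraint (8)).

```latex
\begin{align}
\min_{u,\,v}\quad & \sum_{i=1}^{m} v_i\, x_{io} \tag{5}\\
\text{s.t.}\quad & \sum_{r=1}^{s} u_r\, y_{rj} - \sum_{i=1}^{m} v_i\, x_{ij} \le 0, \qquad j = 1,\dots,n \tag{6}\\
& \sum_{r=1}^{s} u_r\, y_{ro} = 1 \tag{7}\\
& u_r \ge \varepsilon,\quad v_i \ge \varepsilon, \qquad r = 1,\dots,s,\; i = 1,\dots,m \tag{8}
\end{align}
```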
In both cases, the variables to be estimated are the weights of the outputs and of the inputs that maximize the efficiency of the target UWUo, as calculated in Equation (9). Furthermore, the models force the weights to be positive, as defined by Equations (4) and (8), by bounding them from below with an infinitesimal number (positive and close to zero).
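The efficiency referred to as Equation (9) is presumably the usual ratio of weighted outputs to weighted inputs. In assumed notation (efficiency E_o of the target unit o, output weights u_r, input weights v_i, s outputs y_ro, m inputs x_io), it can be written as:

```latex
\begin{equation}
E_o = \frac{\sum_{r=1}^{s} u_r\, y_{ro}}{\sum_{i=1}^{m} v_i\, x_{io}} \tag{9}
\end{equation}
```

Under the standard normalizations, the denominator equals one in the input-oriented model, so E_o coincides with its objective value, whereas the numerator equals one in the output-oriented model, so E_o is the reciprocal of its objective value.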
In order to rank the efficient UWUs, super-efficiency is allowed by removing constraint (2) or (6) from the model for the target UWUo, which is represented by Equation (10):
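In the standard multiplier form, this modification amounts to keeping the frontier constraint for every unit except the target. With assumed notation (n UWUs indexed by j, output weights u_r, input weights v_i, target unit o), a sketch of the modified constraint is:

```latex
\begin{equation}
\sum_{r=1}^{s} u_r\, y_{rj} - \sum_{i=1}^{m} v_i\, x_{ij} \le 0, \qquad j = 1,\dots,n,\; j \ne o \tag{10}
\end{equation}
```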
With this adaptation of the model, the UWUs may obtain efficiencies higher than 100%. In this way, the efficient UWUs can be ranked and the most efficient one identified.
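The effect of the super-efficiency adaptation can be illustrated with a minimal sketch that solves the input-oriented DEA-CCR multiplier model with `scipy.optimize.linprog`. This is not the implementation used in the study: the function name `ccr_input_efficiency`, the value of `eps`, and the toy data are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def ccr_input_efficiency(X, Y, o, eps=1e-6, super_efficiency=False):
    """Efficiency of unit `o` under the input-oriented CCR multiplier model.

    X is an (n, m) array of inputs and Y an (n, s) array of outputs,
    with one row per unit. `eps` is the assumed lower bound on the weights.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    s = Y.shape[1]

    # Decision vector z = [u_1..u_s, v_1..v_m].
    # linprog minimizes, so the weighted output of unit o is negated.
    c = np.concatenate([-Y[o], np.zeros(m)])

    # Frontier constraints: sum_r u_r*y_rj - sum_i v_i*x_ij <= 0 for each j.
    A_ub = np.hstack([Y, -X])
    b_ub = np.zeros(n)
    if super_efficiency:
        # Drop the constraint of the target unit so it can exceed 1.
        keep = np.arange(n) != o
        A_ub, b_ub = A_ub[keep], b_ub[keep]

    # Normalization: the weighted input of unit o equals 1.
    A_eq = np.concatenate([np.zeros(s), X[o]]).reshape(1, -1)
    b_eq = [1.0]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(eps, None)] * (s + m), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


# Hypothetical toy data: three units, one input and one output each.
X_toy = [[2.0], [4.0], [5.0]]
Y_toy = [[2.0], [2.0], [4.0]]
for unit in range(3):
    standard = ccr_input_efficiency(X_toy, Y_toy, unit)
    ranked = ccr_input_efficiency(X_toy, Y_toy, unit, super_efficiency=True)
    print(unit, round(standard, 3), round(ranked, 3))
```

In this toy data set, the first unit has the best output-to-input ratio, so its standard efficiency is 1 and its super-efficiency rises to 1.25, while the scores of the two inefficient units are unchanged. The bound `eps` should be small relative to the scale of the data; otherwise it can distort the scores.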
Finally, the DEA-BCC models differ from the DEA-CCR models in that the target UWUo is projected onto the frontier formed by UWUs of a similar size. Consequently, each UWU is compared only with UWUs operating at a comparable scale.
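For reference, the standard multiplier form of the input-oriented DEA-BCC model adds a scale variable that is free in sign to the CCR formulation. The sketch below uses assumed notation (n UWUs indexed by j, m inputs x_ij, s outputs y_rj, weights u_r and v_i, free variable u_0, target unit o, small positive constant ε) and is left unnumbered, since it may not match the numbering used in the source.

```latex
\begin{align*}
\max_{u,\,v,\,u_0}\quad & \sum_{r=1}^{s} u_r\, y_{ro} + u_0\\
\text{s.t.}\quad & \sum_{r=1}^{s} u_r\, y_{rj} - \sum_{i=1}^{m} v_i\, x_{ij} + u_0 \le 0, \qquad j = 1,\dots,n\\
& \sum_{i=1}^{m} v_i\, x_{io} = 1\\
& u_r \ge \varepsilon,\quad v_i \ge \varepsilon,\quad u_0 \ \text{free in sign}
\end{align*}
```

Setting u_0 = 0 recovers the CCR model, which is why a BCC efficiency can never be lower than the corresponding CCR efficiency.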
2.3.2. Data Description
As previously mentioned, this study analyzes 18 water utilities that operate in the most important Spanish cities, among which are the water distribution networks of Madrid, Barcelona, and Sevilla.
Figure 2 shows the population served by each of the analyzed water distribution networks. Most of them serve between 200 and 600 thousand inhabitants. The city of Madrid, however, serves a far larger population than the rest.
Table 3 presents the acronyms and definitions of the variables used as inputs or outputs in the implemented models. Most variables have been taken from the companies’ websites, and a balance analysis system [25] has been used to obtain the economic variables. All these data are from 2019.
Table 4 shows the descriptive statistics of the recorded data. The network length analyzed corresponds to 17.2% of the total Spanish supply network. Additionally, the companies supply water to 33.0% of the Spanish population. The sample is therefore highly representative.
According to our data, the annual water consumption per capita is 70.3 m³ on average, which corresponds to 192.7 L per person per day (this quantity includes industrial and other uses of the water). For the variable volume of water delivered (xWDEL), the two missing values are estimated using the volume of water taken and the average percentage of water losses in Spain in 2018, i.e., 22%.
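The two calculations in this paragraph can be sketched as follows. The imputation rule (delivered volume equals volume taken multiplied by one minus the loss share) is an assumption about how the estimate was made, and the function names and the example volume are hypothetical.

```python
def litres_per_person_per_day(m3_per_person_per_year, days_per_year=365):
    """Convert annual per-capita consumption from cubic metres to litres per day."""
    return m3_per_person_per_year * 1000.0 / days_per_year


def estimate_water_delivered(volume_taken, loss_share=0.22):
    """Assumed imputation rule: delivered = taken * (1 - loss share)."""
    return volume_taken * (1.0 - loss_share)


# Conversion of the rounded average reported in the text.
print(round(litres_per_person_per_day(70.3), 1))

# Hypothetical volume of water taken; the units are whatever the source data uses.
volume_taken_example = 50.0
print(estimate_water_delivered(volume_taken_example))
```

With the rounded figure of 70.3 m³ the conversion gives about 192.6 L per day; the value of 192.7 reported in the text presumably comes from unrounded data.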
All variables reach their maximum value for the city of Madrid, the Spanish capital. Its water distribution network accounts for 41.4% of the analyzed kilometers and supplies water to 42.3% of the population included in the study. The capital expenditures range from −3,623,237 € to 58,036,000 €, revealing that not all companies are investing enough to increase their fixed assets. The cost of material also varies widely. The company operating in Cuenca spends the least; however, its network is also small, with 93 km of pipes.
Instead of using the percentage of water losses, we have decided to include the percentage of water delivered (last variable in the table) as an output variable that characterizes the sustainability of the companies. For instance, if a distribution network has water losses of 15%, the value of variable x% would be 85. The higher the value of this variable, the better the UWU performance.
Note that only the companies with reasonably good water loss percentages publish these data. In fact, in the sample this percentage varies from 9% (La Coruña) to 23% (Alicante), with an average of 15%, which is low in comparison with the national average presented in Figure 1.
2.3.3. DEA Input/Output Selection
In this section, the input and output variables related to the companies analyzed are presented. According to the available data, we have decided to implement two different models. The first one aims to evaluate the efficiency of the companies in using their resources; consequently, DEA input-oriented models (CRS and VRS technologies) are employed. In this case, some economic indicators are used as input variables, specifically the cost of material (xCOST) and the fixed asset investment (xCAPEX). Furthermore, the company’s labor (xLABOR) is also introduced to represent the size of a company and its labor costs. Finally, the water network length (xLEN) represents the assets that the companies use to perform their activities. Regarding the output variables, this model uses the water supplied (yWDEL) and the population served (yPOP), which are treated as constant since the companies must offer the service to all their customers.
The second model evaluates the sustainable efficiency of the companies by including the percentage of water delivered (the complement of the percentage of water losses) as the only output variable. In this case, the DEA model is output-oriented. This model is implemented with the 12 UWUs whose companies make this data available. The input variables, which are now established as constant, are the water network length (xLEN), the population served (xPOP), the volume of water delivered (xWDEL), and the fixed asset investment (xCAPEX).
Table 5 gathers the aforementioned information, i.e., the input and output variables used in each model. Although all variables have initially been defined using ‘x’, the ones that act as output variables are now represented by the letter ‘y’.