#### 2.1. Energy Intensity Framework

The deterministic model that predicts the energy intensity cannot be defined until the framework for its application has been specified. This framework simplifies the physical reality of the water transport and distribution network. It is based on a system, defined as the water network contained in a control volume.

The deterministic physical model that predicts the energy intensity is based on the integral energy equation derived from the Reynolds transport theorem applied to a system that is defined by a fixed control volume [35]. In the defined system, a continuous streamline can connect any point of the system with the water sources [23], and with all existing net energy sources located outside the system. To understand this section, it is crucial to have prior knowledge of all these concepts. For instance, any booster stations in the network would not be considered part of the system, since no energy inputs should be considered inside the control volume to avoid an energy imbalance. In contrast, downstream tanks may be considered part of the system, as they are not net energy sources (when they fill up they store energy that is later returned to the system in the emptying phase).

The following examples clarify these concepts:

Figure 1 shows a real pressurized network for the irrigation of agricultural plots (Cap de Terme, Spain). Each plot is represented as a node in the model. All nodes are included in the system and supplied from a pumping station (Figure 1a). Following an optimization study, it was determined that it was more energetically efficient to divide the irrigation area into three sectors based on node elevation [36]. As a result, three independent pumping stations were set up, resulting in three fully decoupled individual systems (Figure 1b).

Figure 2 shows the well-known sample network from the EPANET user’s manual [37]. This network, with two external water sources or reservoirs and three downstream tanks, can be considered a single system.

The third example is the C-Town network [38], which comprises five systems and as many booster pumping stations as systems (Figure 3). Downstream tanks do not influence this subdivision, but the booster stations do.

The expression for energy intensity, I_{e}, has to meet two conditions. The first one is to include all the system’s physical factors that influence the value of I_{e}. The second condition is to be defined within a physical framework. In practice, average energy intensity is calculated as the sum of all energy consumed by pumps divided by the total registered volume. This ratio fails to capture the actual energy required by pumps, as it disregards leaks and commercial losses which would make the denominator larger.
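The ratio used in practice can be sketched in a few lines. The figures below are hypothetical and only illustrate how the same pump energy yields a different intensity depending on whether it is divided by the registered volume or by the larger volume that actually enters the system (registered volume plus leaks and commercial losses).

```python
def energy_intensity(pump_energy_kwh, volume_m3):
    """Energy intensity in kWh/m3: total pump energy divided by a water volume."""
    return sum(pump_energy_kwh) / volume_m3


# Hypothetical annual figures for one system (illustrative only).
pump_energy_kwh = [120_000.0, 80_000.0]  # energy consumed by each pumping station
registered_m3 = 400_000.0                # volume metered at the users
injected_m3 = 500_000.0                  # volume entering the system, losses included

per_registered = energy_intensity(pump_energy_kwh, registered_m3)
per_injected = energy_intensity(pump_energy_kwh, injected_m3)

print(f"per registered m3: {per_registered:.3f} kWh/m3")
print(f"per injected m3:   {per_injected:.3f} kWh/m3")
```

The gap between the two figures is driven entirely by the unregistered volume, which is why the choice of denominator has to be stated whenever intensities are compared.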

However, in practice the water transport and distribution network of a utility comprises several systems. Therefore, its global efficiency should be calculated as the average of all individual system efficiencies, weighted by volume and considering the gravitational energy. In any case, a sectorial efficiency analysis (by system) holds more interest than the global average, as it allows improvement actions for each sector to be prioritized (similar to studying water efficiencies by sectors or for the whole network).

The existing reported calculations of the I_{e} indicator correspond to pumping in single pipes, with a reference value of 0.4 kWh/m^{3} for 100 m of elevation [39]. These systems comply with the framework conditions previously established, as they are physical systems without discontinuities or unbalanced energy inputs, regardless of the pumping station’s location. However, in order to expand the application of this indicator to more complex distribution networks, and as explained previously, networks have to be subdivided into systems and gravitational energy has to be considered.

#### 2.2. Factors Influencing a System’s Energy Needs

Once the framework for the application of the energy intensity indicator has been defined, the factors that influence energy use need to be identified. The analytical expression that estimates energy use is universal only when it takes all these factors into account. This is why the validity of the three regression models explained below is limited to their specific framework of application, as they ignore several factors that affect the energy requirements (e.g., in Figure 3, the model considers a single system instead of disaggregating it into five systems, as suggested by the framework presented in the previous section). All these models are based on data from groups of utilities with similar geographical environments to identify the key energy-related parameters. Afterwards, the coefficients of the model are used to adjust the importance of each factor by means of their weight.

The first predictive model for annual energy consumption reviewed here [2] is based on the statistical analysis of 86 utilities in the USA. The authors concluded that the model reaches a good fit, taking into account six variables: average daily total flow, purchased daily flow, total pump horsepower, raw/source pump horsepower, change in distribution elevation, and total water main length. Nevertheless, the subsequent analysis by the authors uncovered some weaknesses in the model. These weaknesses are related to two absent factors (water losses and friction) and the omission of natural energy. In their conclusions, the authors pointed out that these limitations could impact the validity of the proposed model.

The second statistical model [6] is based on data from 108 USA utilities. The authors identified the five variables with the greatest impact on energy consumption. Two of them coincide with the previous model: annual water use and imported water supply. The additional variables are gravity-fed water supply, average annual rainfall, and average annual temperature. These last two variables affect water demand and, indirectly, have an impact on energy.

Both these models include the potabilization phase, although (as the authors concluded), the energy associated with this process is small compared to the energy used in transport.

In the third model, proposed by Sanjuan-Delmás [40], the energy consumption is based on energy intensity. Using data from 50 medium-sized Spanish utilities, the model identifies the three variables with the greatest impact on energy consumption: metered water, network length, and number of inhabitants. However, some key variables are ignored, as indicated by the authors in the text. Once again, the homogeneity of the sample allows the validation of a clearly local model. This is the reason why the qualitative analyses of different countries only provide an order of magnitude [41].

Summing up, the omission of key driving factors for energy consumption leads to local models (valid only where the omitted factors are homogeneous). A model is universal and deterministic when it includes within the physical system boundaries all energy consumption factors, even though some of them may turn out to be irrelevant for some systems. Otherwise it can produce inconsistent results [42].

It is consequently fundamental to identify those driving factors. In this work, up to eight key factors for energy consumption were identified: three linked to physical system characteristics, two dependent on service conditions (pressure and volume), and three factors affected by operational inefficiencies.

Factors dependent on physical characteristics of the system include:

Topography, summarized in three elevation figures: the elevation of the network’s lowest node (which sets the reference level), the elevation of the most energy-demanding node (as explained later, in a few cases it is not the same as the highest node), and the source’s elevation;

The distances traveled by water;

The natural energy supplied to the system;

Factors linked to service conditions include:

The pressure values at the supply sources (usually zero) and at the delivery nodes (dependent on use), where any excess of pressure over the required value leads to energy loss; and

The water volume injected to the system V_{t}, the spatial distribution of water delivered to users (with the corresponding elevations), and the total metered volume V_{r};

Finally, the three factors affected by operational inefficiencies are:

Pumping efficiency;

Leakage; and

Friction losses.

These energy losses depend to a large extent on the operation of the system; pressure needs to be managed and pumps should operate at their best efficiency point (BEP). However, they also depend on system design (selection of pumps, diameters, and pipe materials) and assembly (leaks are very sensitive to pipe assembly). The only inefficiencies not included in these factors are the structural inefficiencies, represented by the topographic energy indicator (θ_{t}) [36].

The energy intensity predictive model proposed in the following section includes these eight factors. Following a top-down approach, it allows the energy consumed under estimated operating conditions to be predicted once the metered water volume is known. The comparison with the actual consumed energy, calculated in a bottom-up path (actual energy consumption vs. billed volume), yields the current efficiency of the system. This comparison should be carried out with care, since the predicted value includes all forms of energy (natural and pumped), whereas the real value (usually) only considers pumped energy, and not the gravitational one. Only energy intensities obtained with the same conceptual framework should be compared.

As a result of the previous considerations, we can define four values of energy intensity:

Analytically estimated values (top-down): estimated energy intensity (I_{ee}), including all the supplied energy (natural and pumped), or the estimated pump energy intensity (I_{ee,p}), only considering mechanical or shaft energy; and

Real values (bottom-up), calculated from real operating data: real energy intensity (I_{er}), including all supplied energy (E_{s}), and the real pump energy intensity (I_{er,p}), which only accounts for shaft energy (E_{p}).

These energy intensity values should be compared in pairs as follows: I_{er} with I_{ee}, or I_{er,p} with I_{ee,p}. In general, the second pair is used more often, as the required data are easier for utilities to obtain.
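The pairing rule can be made explicit with a short sketch. It assumes, as stated elsewhere in the text, that the supplied energy is the sum of natural and pumped energy and that real intensities are referred to the metered volume V_{r}. The function names and all numeric values, including the estimated intensities, are hypothetical.

```python
def real_intensities(e_n_kwh, e_p_kwh, v_r_m3):
    """Bottom-up values: I_er uses all supplied energy, I_er_p only shaft energy."""
    e_s_kwh = e_n_kwh + e_p_kwh  # supplied energy = natural + pumped
    return {"I_er": e_s_kwh / v_r_m3, "I_er_p": e_p_kwh / v_r_m3}


# Only values built on the same conceptual framework are paired.
PAIRS = (("I_er", "I_ee"), ("I_er_p", "I_ee_p"))


def paired_values(real, estimated):
    """Return (real, estimated) tuples for each admissible comparison."""
    return {f"{r} vs {e}": (real[r], estimated[e]) for r, e in PAIRS}


# Hypothetical data for one system and one year.
real = real_intensities(e_n_kwh=50_000.0, e_p_kwh=150_000.0, v_r_m3=400_000.0)
estimated = {"I_ee": 0.45, "I_ee_p": 0.33}  # placeholder top-down results

for label, (r, e) in paired_values(real, estimated).items():
    print(f"{label}: real={r:.3f} kWh/m3, estimated={e:.3f} kWh/m3")
```

Keeping the admissible pairs in one place avoids accidentally comparing a pump-only figure with one that also includes natural energy.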

#### 2.3. Deterministic Model for the Prediction of the Energy Demand of a System

In order to develop the model, the I_{ee} and I_{ee,p} values are obtained analytically (top-down approach). This section describes the deterministic predictive model for single-source systems. This model is later generalized to consider the case of multi-source systems.

Using physical and operational data from the system, Equation (1) allows the energy behavior of the network to be predicted. This equation was obtained in a previous work [39]. The equation includes all the factors driving energy demand detailed in the previous section and, as a consequence, properly reflects the energy requirements of the process. However, it is not the only indicator linked to the energy efficiency of pressurized water transport.

In order to enable a direct comparison, I_{ee} and I_{ee,p} need to be referred to the metered water volume (V_{r}), as this is the volume used by utilities to calculate the real energy intensity (I_{er} = E_{s}/V_{r} and I_{er,p} = E_{p}/V_{r}). V_{r} is, by definition, lower than the input volume (V_{t}), and they are both included in Equation (1) through the water efficiency ratio of both volumes (η_{le} = V_{r}/V_{t}). Energy terms are divided into gravitational energy availability (not impacted by the pumping efficiency, η_{pe}) and shaft energy requirements. In consequence, greater inefficiencies (η_{le} and η_{pe}) lead to worse values for I_{ee}.

The remaining terms of Equation (1) are:

0.002725, a unit conversion factor (pressure, expressed in m, to kWh/m^{3});

z_{s}, elevation of the supply source;

z_{l}, lowest elevation node;

z_{c}, elevation of the node with the highest energy demand, also known as the critical node (it should be noted that z_{c} is not necessarily the same as the highest node, z_{h}, as friction losses also play a role and difference in elevation could be compensated by additional distance from the source);

h_{fe}, friction losses from the source to the critical node = ${h}_{f\left(s\to p\right)}+{h}_{f\left(p\to c\right)}$;

${h}_{f\left(s\to p\right)},$ friction losses from the source to the pumping station;

${h}_{f\left(p\to c\right)},$ friction losses from the pumping station to the critical node;

p_{o}, service pressure; and

γ, specific weight of water (9810 N/m^{3}).
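The conversion factor in this list follows directly from the specific weight of water and the definition of the kilowatt-hour (1 kWh = 3.6 × 10^6 J). The sketch below reproduces it and applies it to 100 m of head. Reading the gap between the resulting ideal figure and the 0.4 kWh/m^{3} reference value cited earlier as a combined efficiency is an interpretation made here for illustration, not a statement from the source.

```python
GAMMA_N_PER_M3 = 9810.0  # specific weight of water, as given in the text
JOULES_PER_KWH = 3.6e6   # 1 kWh = 3.6 MJ


def head_to_kwh_per_m3(head_m):
    """Energy per unit volume (kWh/m3) needed to supply a head of head_m metres."""
    return GAMMA_N_PER_M3 * head_m / JOULES_PER_KWH


factor = head_to_kwh_per_m3(1.0)        # conversion factor per metre of head
ideal_100m = head_to_kwh_per_m3(100.0)  # loss-free energy for 100 m of head

# Illustrative reading only: combined efficiency implied by a 0.4 kWh/m3 reference.
implied_efficiency = ideal_100m / 0.4

print(f"factor = {factor:.6f} kWh/m3 per m")
print(f"ideal energy for 100 m = {ideal_100m:.4f} kWh/m3")
print(f"implied combined efficiency = {implied_efficiency:.3f}")
```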

Figure 4 illustrates the previously defined variables in a pumping booster system. It can be treated as a control volume because there are no nodes between the source and the pumping station.

On the other hand, gravitational systems were usually left out of previous studies because they lack pumping stations. However, they are reviewed in the next section to guarantee the universality of the indicator.

Synthesizing Equation (1), the estimated system’s maximum piezometric head, ${H}_{e}$ (Figure 4), can be defined as:

Thus simplifying the form of the estimated energy intensity:

Finally, the critical node is the one that fulfills the relationship:

where ${z}_{i}$ is the elevation of a generic node i, ${H}_{i}$ is its piezometric head, ${H}_{e}$ is the maximum piezometric head, and ${H}_{c}$ is the piezometric head of the critical node; ${h}_{f\left(p\to i\right)}$ represents friction losses between the pumping station and the generic node i.
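The idea of a critical node can be illustrated with a minimal sketch. It does not reproduce the relationship used in the source. It assumes instead that every node requires the same service pressure, so the most energy-demanding node is the one that maximizes elevation plus friction losses from the pumping station. Node names and values are hypothetical.

```python
def critical_node(nodes):
    """Return the node that maximizes elevation + friction loss from the pump.

    nodes maps a node name to (elevation_m, friction_loss_from_pump_m).
    Assumption: identical service pressure at every node, so it does not
    affect the ranking and is left out.
    """
    return max(nodes, key=lambda name: nodes[name][0] + nodes[name][1])


def highest_node(nodes):
    """Return the node with the greatest elevation."""
    return max(nodes, key=lambda name: nodes[name][0])


# Hypothetical network: B is lower than A but much farther from the pump.
nodes = {
    "A": (120.0, 3.0),   # requires 123 m before service pressure
    "B": (115.0, 12.0),  # requires 127 m before service pressure
    "C": (100.0, 20.0),  # requires 120 m before service pressure
}

print("critical node:", critical_node(nodes))
print("highest node: ", highest_node(nodes))
```

In this example the critical node is B while the highest node is A, which mirrors the remark that z_{c} need not coincide with z_{h} when additional distance from the source compensates for a difference in elevation.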

The result of Equation (1) is conditioned by the critical node. The equation allows for the estimation of the energy required by the system to supply water to the critical node with the pre-set service conditions. Since all nodes are interconnected (they are part of a system), they all receive identical energy from the source of supply, but part of that energy is lost in the path from the source to each node due to friction.

In addition, part of the supplied energy (${E}_{s}$) is not strictly necessary in most nodes, but it must be supplied due to the irregularities of the terrain. So strictly speaking, there is an excess energy supplied to nodes; the topographic energy (E_{t}) and its relative importance is provided by the topographic energy context indicator θ_{t} [43]:

where $\overline{z}$ is the weighted average elevation of the system. The critical node should not have any topographic energy, as it should receive just enough energy to meet the quality-of-service standards. This indicator is key to comparing the energy intensity between different utilities, as it considers the structural inefficiencies due to the irregularities of the terrain [36].

The second context indicator is the energy origin (C_{1}) indicator. This ratio specifies which percentage of the total supplied energy corresponds to natural or gravitational energy (E_{n}):

where E_{n} is the natural or gravitational energy, E_{p} is the pumped energy, and E_{s} is the total supplied energy, calculated as the sum of the previous two. Should all supplied energy be gravitational, C_{1} would be equal to 1, and if all the energy were provided by pumps, its value would be 0. Natural energy has traditionally not been taken into account in energy analyses, and for instance EPANET does not consider it in its energy calculations [44]. However, this energy must be considered when comparing the energy efficiency of different utilities.
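The description of C_{1} (share of natural energy in the total supplied energy, equal to 1 for a fully gravitational supply and 0 for a fully pumped one) corresponds to the ratio E_{n}/(E_{n} + E_{p}). A minimal sketch with hypothetical values follows; the function name is illustrative.

```python
def energy_origin(e_n_kwh, e_p_kwh):
    """Energy origin indicator C1: fraction of supplied energy that is natural."""
    e_s_kwh = e_n_kwh + e_p_kwh  # total supplied energy
    if e_s_kwh <= 0:
        raise ValueError("total supplied energy must be positive")
    return e_n_kwh / e_s_kwh


print(energy_origin(10.0, 0.0))   # fully gravitational supply
print(energy_origin(0.0, 10.0))   # fully pumped supply
print(energy_origin(25.0, 75.0))  # mixed supply
```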

The estimated pump energy intensity (I_{ee,p}) is a relevant indicator, as it is a benchmark for the real energy intensity (I_{er,p}) traditionally calculated by utilities. These two indicators do not include natural energy. Therefore, I_{ee,p} is identical to I_{ee} (Equation (1)) if the contribution of the natural energy is excluded:

Both estimated energy intensities are linked through the context indicator C_{1} (energy origin):

From the estimated energy intensities (I_{ee} and I_{ee,p}), both the total energy (${E}_{se}$) and shaft energy (${E}_{pe}$) needed for a registered water volume V_{r} can be anticipated, based on the estimated working conditions (values of ${\eta}_{pe}$ and ${\eta}_{le}$):

These two equations make up the deterministic model that, relating cause and effect, allows the energy consumption due to transport in a system to be estimated. This energy consumption depends, to a greater or lesser extent, on the eight factors listed previously.
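The chain from intensities to energies can be sketched under two assumptions that follow from the definitions in the text rather than from the source equations themselves: both estimated intensities are referred to the same metered volume V_{r}, and C_{1} is the natural share of the supplied energy. Under those assumptions the pump intensity is I_{ee}(1 − C_{1}), and each energy is the corresponding intensity multiplied by V_{r}. All input values are hypothetical.

```python
def estimated_energies(i_ee, c1, v_r_m3):
    """Derive pump intensity and energies from I_ee, C1 and the metered volume.

    Assumptions (not taken verbatim from the source equations):
      I_ee_p = I_ee * (1 - C1)   since C1 is the natural share of supplied energy
      E_se   = I_ee * V_r        total supplied energy, kWh
      E_pe   = I_ee_p * V_r      shaft energy, kWh
    """
    if not 0.0 <= c1 <= 1.0:
        raise ValueError("C1 must lie between 0 and 1")
    i_ee_p = i_ee * (1.0 - c1)
    return {"I_ee_p": i_ee_p, "E_se": i_ee * v_r_m3, "E_pe": i_ee_p * v_r_m3}


# Hypothetical system: 0.5 kWh/m3 overall, a quarter of it gravitational.
result = estimated_energies(i_ee=0.5, c1=0.25, v_r_m3=400_000.0)
for name, value in result.items():
    print(f"{name} = {value:g}")
```

The difference between E_se and E_pe in the output is the natural energy contribution, which vanishes when C_{1} is 0.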

Both equations involve the metered water volume V_{r}, to which the energy intensities have been referred. The intensities can easily be referred to the total input water volume (V_{t}) instead by deleting the term ${\eta}_{le}$ in the denominators of Equations (1), (2), (6) and (7).

Finally, the temporal variation of the top-down estimated indicators is worth consideration. Depending on the load of the network, the critical node may change, as friction losses are not constant. These changes impact the values of I_{ee} and I_{ee,p}. However, the real indicators to which they must be compared (I_{er} and I_{er,p}), are usually averaged values referring to extended periods of time (days, months, or even years). Therefore, it does not make sense to consider hourly variations of the estimated indicators, even though the proposed mathematical formulation allows for it. The weighted average should be calculated for the same period of time to allow for comparison with the real indicator. Consequently, average values of time-dependent variables (e.g., tank water levels) extended over the selected period must be considered.
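One way to aggregate time-varying estimates over a period is sketched below. The text does not name the weight used in the average; volume weighting is assumed here because it makes the period value equal to total energy divided by total volume, which is how the real indicators are computed. The hourly figures are hypothetical.

```python
def period_intensity(intensities_kwh_m3, volumes_m3):
    """Volume-weighted average intensity over a period (assumed weighting)."""
    if len(intensities_kwh_m3) != len(volumes_m3):
        raise ValueError("one volume is required per intensity value")
    total_energy = sum(i * v for i, v in zip(intensities_kwh_m3, volumes_m3))
    return total_energy / sum(volumes_m3)


# Hypothetical time steps: the peak step carries most of the volume.
intensities = [0.3, 0.5, 0.4]    # kWh/m3 estimated for each step
volumes = [100.0, 300.0, 100.0]  # m3 metered in each step

weighted = period_intensity(intensities, volumes)
simple = sum(intensities) / len(intensities)

print(f"volume-weighted: {weighted:.3f} kWh/m3")
print(f"simple mean:     {simple:.3f} kWh/m3")
```

The two averages differ whenever intensity and demand vary together, so an unweighted mean of hourly estimates would not be comparable with a real indicator computed from period totals.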

#### 2.4. Generalization of the Model to Other Systems

This section generalizes the previous equations to gravitational systems and those with more than one supply source (Figure 2). In gravitational systems, the source of supply is the highest node and the inequality z_{s} = z_{h} > z_{c} is met, with two possible cases:

The difference in elevation between the source and the critical node is lower than the friction losses plus the service pressure:

where ${h}_{f\left(s\to c\right)}$ represents the friction losses between the source and the critical node. In this case, additional pumping energy is needed to satisfy the energy demand of the gravitational system. This situation is similar to the one analyzed previously, modeled by Equation (1).

The difference in elevation between the source and the critical node is equal to or greater than the friction losses plus the service pressure:

In this case, the energy intensity is

The water efficiency (${\eta}_{le}$) included in the equation factors in the influence of leaks and penalizes the result accordingly, as the metered water volume does not consider them.
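The two cases can be told apart with a simple comparison between the available elevation difference and the head required at the critical node. The sketch below assumes the service pressure is given in pascals, so that dividing by γ yields metres. Function and parameter names, and the example values, are hypothetical.

```python
GAMMA_N_PER_M3 = 9810.0  # specific weight of water, as given in the text


def needs_pumping(z_s_m, z_c_m, h_f_sc_m, p_o_pa):
    """True if the elevation difference cannot cover friction plus service pressure.

    z_s_m:    elevation of the source (m)
    z_c_m:    elevation of the critical node (m)
    h_f_sc_m: friction losses from source to critical node (m)
    p_o_pa:   required service pressure (Pa, assumed unit)
    """
    required_head_m = h_f_sc_m + p_o_pa / GAMMA_N_PER_M3
    return (z_s_m - z_c_m) < required_head_m


# Case 1: 30 m available, 8 m friction + 25 m service pressure required.
print(needs_pumping(150.0, 120.0, 8.0, 245_250.0))

# Case 2: 30 m available, 3 m friction + 20 m service pressure required.
print(needs_pumping(150.0, 120.0, 3.0, 196_200.0))
```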

Obviously, all the preceding considerations for systems with z_{c} > z_{s}, considered in Equations (2) to (10) are also valid for gravitational systems.

All these concepts should also be extended to multi-source systems, where in general z_{c} is higher than any of the supply sources (there are as many z_{si} as supply sources). However, being part of a system, the elevations of z_{c} and z_{l} are unique. Since pumping efficiencies of the pumping stations can be different, an average pumping efficiency ($\overline{{\eta}_{pe}}$) weighted by volume can be used.
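The volume-weighted average pumping efficiency mentioned above can be computed as follows. The station data are hypothetical and the function name is illustrative.

```python
def weighted_pump_efficiency(stations):
    """Average pumping efficiency weighted by the volume each station pumps.

    stations is a list of (efficiency, pumped_volume_m3) tuples.
    """
    total_volume = sum(volume for _, volume in stations)
    if total_volume <= 0:
        raise ValueError("total pumped volume must be positive")
    return sum(eta * volume for eta, volume in stations) / total_volume


# Hypothetical system with two pumping stations.
stations = [(0.70, 300_000.0), (0.60, 100_000.0)]
eta_avg = weighted_pump_efficiency(stations)
print(f"average pumping efficiency = {eta_avg:.3f}")
```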

Despite the singularities of a multi-source system, the same framework can be applied. In order to calculate the energy intensity, the first step is to identify the source with the highest contribution in unit energy (the one with the highest piezometric head at the exit). Then the critical node of the system must be identified with Equation (4) adapted to a multi-source system:

In this case, ${H}_{s,max}$ is the piezometric head of the highest source of supply and ${h}_{f\left(s,max\to i\right)}$ represents the friction losses between this source and the generic node i.

Then the energy intensity injected by each supply source (which may have different piezometric heads) is weighted with the corresponding volume. The estimated energy intensity is

where ${z}_{s,max}$ is the elevation of the source with the highest piezometric head, ${V}_{i}$ is the input volume from source i, ${H}_{s,i}$ is the piezometric head of source i, ${V}_{T}$ is the total supplied volume, and k is the number of supply sources in the system. The new equivalent head, H_{e}, is

The energy origin (C_{1}) context indicator needs to consider that the natural energy contribution from each source is different, resulting in Equation (6) being expressed as

where $\overline{{z}_{s}}$ is the weighted average elevation of the supply sources.

The topographic energy indicator, Equation (5), is not affected by the number of sources. Once the multi-source system has been characterized, the final Equations (9) and (10) that synthesize the model are the same.

The validity of the multi-source equations was tested by analyzing the system in Figure 2. The real energy required by the model was calculated from the mathematical model that simulates the operation of the network. This result was compared to those obtained with Equation (9), particularized for a system with two sources of supply.

It has to be noted that in this example, the critical node changes with time, as does the energy intensity. In particular, when the downstream tanks are being filled up, the one with the highest level is the critical node. As the tanks are being emptied, the highest consumption node becomes the critical node. This network has greater friction losses than recommended (more than 10 m/km during peak periods [45]). As a result, the energy intensity changes significantly with time, making this network especially adequate for validating Equation (12).

In this example, the validity of the analytical expressions corresponding to a multi-source network was verified with a mathematical model that faithfully reproduces the behavior of the real system. In a real network, a mathematical model is not required, since all variables can be measured.

Finally, to cover all possible scenarios, a multi-source system may have some or all of its sources located higher than the critical node (multi-source gravitational systems). In this case, the energy intensity would still be calculated from the established guidelines, which requires a good understanding of the physics behind these equations. In any case, these systems are, in practice, often overlooked, as they include few relevant pumping stations, if any.