Hydrothermal Unit-Commitment Problem of a Large-Scale System with Representation of Forbidden Zones

As we move towards electrical networks with a growing presence of renewable generation, the representation of the electrical components becomes more important. In hydro-dominated power systems, modelling the forbidden zones of hydro plants becomes increasingly challenging as the number of plants increases. Such zones are ranges of generation that either should be avoided or are altogether unreachable. However, because representing the forbidden zones introduces a substantial computational burden, hydrothermal unit-commitment problems (HTUC) for large systems are usually formulated ignoring the forbidden zones. Nonetheless, this simplification may demand adjustments to the solution of the HTUC, because the generation of the hydro stations may fall in forbidden zones. In practice, the adjustments are usually performed based on the experience of system operators and, then, can be far from an optimal correction. In this paper, we study the impact of explicitly representing the hydro-generation forbidden zones in a large-scale system with more than 7000 buses, 10,000 lines, and 700 hydro units. Our findings show that the simplified model that is current used can deviate significantly from the model with forbidden zones, both in terms of the generation of hydro plants, as well as the generation of thermal plants and the system marginal costs.


Introduction
Recently, environmental and economic concerns have further stimulated the efficient use of energy resources. From a generation standpoint, improvements in efficiency are attained mainly by three actions: (1) development of new technologies; (2) upgrade and retrofit; (3) better management of resources. Technological developments include, but are not limited to, large-scale energy storage [1], and alternative energy sources such as hydrogen. Although essential, such developments are resource and time consuming. Upgrading and retrofitting existing power plants is a more immediate action that is especially appealing in countries with old power plants [2]. The third option involves the planning and operation phases of power systems. While the planning phase considers time horizons of years to determine the construction and decommissioning of power plants and transmission lines, the operation phase is focused on the short term and is better represented by the day-ahead scheduling problem known as unit commitment (UC) [3]. In the following, we discuss how the formulation and solution of the UC can affect the overall efficiency of a power system.
Despite being well studied and documented, the UC is still an open problem. In its most complete form, UC is a large-scale, uncertain, mixed-integer non-linear problem with, conservatively, millions of variables and constraints. This complete formulation is challenging because it is rarely possible to solve such a problem in a timely manner and, were it possible, it is not clear how energy prices could be derived from its solution (see [4] for discussions on pricing in markets with nonconvexities). Thus, system operators generation. In contrast, ref. [16] uses a similar plant-based aggregation to [9], but tackles the problem via Lagrangian relaxation. Consequently, all forbidden zones were neglected.
Different from the studies found in the literature, we tackle a cost-based, centralized HTUC of a real, large-scale system. Moreover, in contrast to [15], here, we focus on analysing the operational benefits of considering an explicit representation of forbidden zones in a large-scale system. Our analysis provides insights into the importance of forbidden-zone representation. Furthermore, due to the lack of studies with a real system, our work is the first, to the best of our knowledge, to quantify the differences between aggregation approaches with no consideration of forbidden zones, and the explicit consideration of forbidden zones.
This work is organized as follows. In Section 2, we discuss the challenges arising from explicitly representing the forbidden zones of hydro plants. We then present the aggregated model in Section 3, followed by the zone-aware model in Section 4. In Section 5, we present the test cases used in our evaluations and the computational setting. Then, in Section 6, we show and discuss the results. The final remarks are given in Section 7.

Challenges of Forbidden-Zone Representations
In hydro-dominated systems, unit aggregation is generally applied to reduce the number of hydro generating units. For instance, if a given hydro plant has five identical generating units all connected to the same bus, then an aggregation can reduce the number of units from five to one. Suffice to say that, for systems with a significant number of hydro plants, such aggregation provides a considerable reduction in problem size. Nonetheless, this aggregation can have an undesirable effect: the operation limits of the individual hydro generating units might not be satisfied in the aggregated unit. One such limit that is usually neglected is the minimum generation. Hydro turbines are built to operate at certain generation levels set by the designer in order to avoid cavitation and mechanical vibration [17]. Thus, if the minimum generation is not taken into account during the UC, then a redispatch phase after the UC might be necessary to reallocate generation to within the limits of the generation units. In Brazil, this reallocation is performed heuristically based on the expertise of the system operator. Consequently, there is no guarantee, from a mathematical viewpoint, that the new operation point after reallocation is anywhere near optimality. To illustrate how aggregation works, take, for instance, the Brazilian hydro plant Salto Caxias, which has four identical Francis turbines all connected to the same bus and each rated at 310 MW. For a net head of 65.5 m, the minimum generation for each turbine is 235 MW. These limits are summarized in Table 1. Table 1. Bounds on generation for the units of Salto Caxias. The data is taken from the system operator's publicly available archives [18].

1-4 235 310
Given the limits of Table 1, and considering that there is no spillage, and the minimum generation of the units is not a function of the net head (the forbidden zones are regular), the total generation of Salto Caxias as a function of the total turbine discharge, i.e., the sum of the discharge of the four units, and the reservoir volume is shown in Figure 1 for the aggregated representation of the plant and the representation with explicit operating zones.
Note that, under the aforementioned conditions, Salto Caxias has four forbidden zones: 0-235 MW, 310-470 MW, 620-705 MW, and 930-940 MW. The first and largest one is due to the minimum generation of a single unit: since all units have a minimum generation of 235 MW, in turn, the minimum generation of Salto Caxias itself is 235 MW.
The second zone appears when one unit is operating but there is not enough power to turn on a second unit. In the event of operating in this second forbidden zone, then at least one unit would violate its minimum generation. Furthermore, it is possible to see from Figure 1 that the generation limits also set limits on the turbine discharge. For the reservoir volume considered, the first operating zone has turbine discharges in the range 415-545 m 3 /s. For Salto Caxias, unit aggregation can then aggregate its four units into a single large unit with minimum generation of 235 MW and maximum generation of 1240 MW. If, in addition to unit aggregation, no binary decision is associated with Salto Caxias, then its generation range is further relaxed to 0-1240 MW. Unit aggregation has further implications beyond the operation of the plant on which aggregation has been performed. Since hydro plants are generally connected to other plants downriver and upriver of them, unit aggregation of a plant can also affect the operation of downriver and upriver plants. To illustrate this point, take the six plants of the Iguaçu basin in Figure 2.
Energies 2021, 1, 0 4 of 26 considered, the first operating zone has turbine discharges in the range 415-545 m 3 /s. For Salto Caxias, unit aggregation can then aggregate its four units into a single large unit with minimum generation of 235 MW and maximum generation of 1240 MW. If, in addition to unit aggregation, no binary decision is associated with Salto Caxias, then its generation range is further relaxed to 0-1240 MW. Unit aggregation has further implications beyond the operation of the plant on which aggregation has been performed. Since hydro plants are generally connected to other plants downriver and upriver of them, unit aggregation of a plant can also affect the operation of downriver and upriver plants. To illustrate this point, take the six plants of the Iguaçu basin in Figure 2.    Figure 2 shows part of the Iguaçu basin. In total, this basin encompasses nine reservoirs, the largest ones shown in the figure. Most hydro units installed along the Iguaçu river are Francis turbines. They generally have a relatively high minimum-generation level, which can lead to significant violations if an aggregation approach is used. Take, for example, the Salto Santiago plant, if at any given period one of its four units is to operate, then the turbine discharge of this plant must be such that the generation lies within 270 MW and 355 MW. In addition, consider, for the sake of argument, that the run-of-river plant Salto Osório has no reservoir regularization (its minimum and maximum reservoir volumes are the same), the only inflow to Salto Osório comes from Salto Santiago, and Salto Santiago cannot spill water (spillage is usually avoided since it essentially means a loss of opportunity for the plant spilling water). In this context, if Salto Osório is also   Figure 2 shows part of the Iguaçu basin. In total, this basin encompasses nine reservoirs, the largest ones shown in the figure. Most hydro units installed along the Iguaçu river are Francis turbines. They generally have a relatively high minimum-generation level, which can lead to significant violations if an aggregation approach is used. Take, for example, the Salto Santiago plant, if at any given period one of its four units is to operate, then the turbine discharge of this plant must be such that the generation lies within 270 MW and 355 MW. In addition, consider, for the sake of argument, that the run-of-river plant Salto Osório has no reservoir regularization (its minimum and maximum reservoir volumes are the same), the only inflow to Salto Osório comes from Salto Santiago, and Salto Santiago cannot spill water (spillage is usually avoided since it essentially means a loss of opportunity for the plant spilling water). In this context, if Salto Osório is also scheduled to operate, then the turbine discharge from Salto Santiago must not only comply with the constraints of Salto Santiago itself, but also lie in one of the operating zones of Salto Osório. This simple example illustrates the challenges arising from considering an individual representation of generating units in a hydrothermal system. Evidently, such reasoning can be extended to all plant in the Iguaçu basin, and to the nearly 160 reservoirs of the Brazilian power system. Figure 3 shows the minimum generation for each of the nearly 200 groups of identical units in the Brazilian system. As can be seen in Figure 3, the majority of the groups have a minimum generation below 50 MW. Thus, in an aggregated model, Figure 3 shows that, for most power plants, the violations in the operation zones will be no greater than 50 MW. From the system's viewpoint, while this threshold might seem negligible when seen individually, i.e., while a 50-MW violation might not warrant concern in a system where the mean load is above 50 GW, the occurrence of multiple violations can demand significant changes in the schedule given by the aggregated model.
To more quantitatively highlight the differences between the aggregated and individual hydro models, we again take Salto Caxias as an example. As previously mentioned, this plant has four identical units, and its installed capacity is 1240 MW. In the aggregated approach, Salto Caxias can generate at any level between 0 and its maximum. In (1)-(13), we show the aggregated model for a generic plant with a single group of identical units. For simplicity, we omit the indices for the hydro plant and its groups of units.
In (1)- (13), constants are given in boldface, while variables are in italic. In this model, constraints (1), (2), and (3) limit from below and above the turbine discharge, q, reservoir volume, v, and the plant's spillage, s, respectively. In (1), variable u is used to select the operating zone z of Salto Caxias, and Z is the set of operating zones. In (4), the inflow to the reservoir at a period t is represented by variable a t . In this simple example, this variable includes the incremental inflow and the discharge coming from upriver plants, and the function that determines its value is omitted for simplicity. Moreover, in (4), constant C H converts flows in m 3 /s to volume in hm 3 , i.e., it is 3.600 · 10 −3 s. Constraints (5)- (9) form the piecewise linear model of the hydropower function: (5) and (6) limit, respectively, from above and below the generation potential, p, given by the forebay level (in this case, represented indirectly by the reservoir volume); similarly, (7) and (8) bound the generation loss, p, due to the tailrace level. The difference between p and p gives the plant's generation, p H t , which is constrained in (11) to lay between 0 and the plant's installed capacity. In (5)-(9), constants PX i , PX i , LX i , and LX i , for X = 0, 1, 2, are the coefficients of variables q and v in the piecewise model whose linear components are the indices i in the respective sets P over , P under , L over and L under . Constraints (12) are introduced to ensure that at most one variable u assumes 1, while (13) are the bounds on u. Note that, since u is continuous in this model, it is in fact possible to write an equivalent model without it. However, for a better understanding of the differences between the aggregated model and the zone model, we chose to keep variables u. The following model, (14) and (15), is the equivalent forbidden-zone aware version of (1)- (13).
In (14) and (15), different from (1)-(13), the turbine discharge, and consequently the plant's power output, depends on the binary variables u. For this generic plant with a single group of identical units, there is one binary variable for each of its operating zones. In the particular case of Salto Caxias, there are four operating zones, i.e., Z = {0, 1, 2, 3}, and the power limits of each zone are P H 0 = 235, P H 1 = 470, P H 2 = 705, P H 3 = 940 MW, and P H 0 = 310, P H 1 = 620, P H 2 = 930, P H 3 = 1240 MW. The corresponding binary variables of the operating zones are u 0,t , u 1,t , u 2,t , and u 3,t for any given t. Inequality (12), now that u is binary, guarantees that, at any given period, at most one operating zone can be chosen. Note that in (14) and (15) and (1)-(13), the model is the same, apart form the discontinuities in the permitted values for the turbine discharge and the generation in (14) and (15) that are the consequence of the now binary nature of u. As one particular complicating factor, note that the turbine discharge chosen, and, consequently, the binary variable, must comply not only with its bounds, but also with the reservoir's balance constraint, (4). In terms of model size, explicitly representing the permitted operating zones of Salto Caxias introduces four binary variables, while the number of constraints remains the same. However, while in (1)-(13) these limits do not depend on any variable, in (14) and (15), the range of the turbine discharge and generation depend on the binary variables u associated with the operating zones. The differences between (1)- (13) and (14) and (15) lead to discontinuities in the hydropower function shown in Figure 1.
With the simple and generic example shown above, we now proceed to stating the HTUC with aggregated hydro generation used in this work.

Aggregated Hydrothermal Unit Commitment
We present in the following the optimization model used for representing the HTUC with the aggregated hydro model. For a better understanding, the model is separated into convenient parts. Firstly, we show the basics of the hydro model in (16)-(32).
In (16)-(32), h ∈ H is the index of a hydro plant, t ∈ T is a time period, and u ∈ U h is the index of a group of identical units at arbitrary plant h. Furthermore, variable u is used to identify the current zone in which the group of units is operating; set Z contains the indices that identify the operating zones. In (16)-(32), u is now subscripted with four indices: the hydro plant h, the group of identical units u, the operation zone z, and the time period t. Similarly, the set Z of interested is specified with subscripts h and u. In the definitions that follow, we omit the variables' units for brevity. In summary, all hydro flow variables are in m 3 /s, the volumes are in hm 3 , and the power generation is in p.u. with 100 MW as basis. The coefficients and constants are assumed to have the appropriate units according to the aforementioned rules.
In (16)-(32), (16) limits the total turbine discharge, q, of each group of units. In this constraint, the upper bound on q depends on variable u. However, since in this aggregated model u is continuous, q can assume any value between 0 and the maximum turbine discharge. Seldom, reservoirs have other controllable means of discharging water to other reservoirs. One such means is here called a water bypass, q bp . Different from discharging water through the turbines or spilling, bypassing does not affect the power generation directly-only indirectly due to the reduction of the reservoir's forebay level. The lower and upper bounds on water bypass are given in (17). Note that, although only a handful of reservoirs dispose of this feature, (17) is written for all reservoirs in the power system. This is done to avoid excessive notation. Evidently, reservoirs not disposing of a water bypass can simply have its upper limit, Q bp h , set to zero. Similar to a water bypass, some plants may have pumps installed along their reservoirs and, with them, downriver plants can send water to upriver plants (at the cost of actually consuming energy rather than generating it). The maximum rate of pumped water, q pump , is given in (18). Again, note that, for those plants with no pumping mechanisms, the upper limit Q pump h can be set to zero. The more usual spillage, s, has its limits given in (19). Additionally, the reservoir volume, v, is required to be at all times between its minimum, V, and maximum, V, as enforced by (20).
The changes in the reservoir volume itself depend on the net balance between the total inflow to the reservoir and the total outflow. In (16)-(32), the inflow is represented by a and defined in (21) as a function of water discharges from turbines and spillages from upriver reservoirs, the term C H · ∑ j∈D ts h ∑ u∈U j q j,u,t−d j,h + s j,t−d j,h , water bypass, and pumped water, and the constant incremental inflow, I. In (21), D h are sets of reservoirs capable of sending water to h. For instance, D ts h is the set of all plants upriver of h whose turbine discharge and spillage reach h after d periods of time. Similar to D ts h , D bp h is the set of upriver plants to h whose water bypass reaches h. Lastly, D pump h is the set of plants for which the pumped water arrives at h. In (21), the total inflow a is represented in the same unit as the reservoir volume, thus, all rates are converted to volumes through the constant C h . In contrast to (21), (22) gives the total outflow, o, of each plant in each period. Finally, (23) ensures that the mass balance in the reservoirs is satisfied at all times. In hydro systems, it is crucial to evaluate the trade-off between generating power now and saving water to generate power later. In this work, such evaluation is performed with a piecewise-linear cost-to-go function, (24). In (24), C is a set of indices defining the piecewise linear function that bounds from below the future cost α. This cost is given as a function of the reservoir volumes at the last period of the planning horizon, T.
An important modelling detail of any HTUC is the hydropower function. Here, we use the same hydropower-function model used in [15], adapted to the context of explicit operation zones. As in [15], (25) and (26) bounded from above and below, respectively, the potential power, p. On the other hand, (27) and (28) limit the power loss, p. Effectively, the hydro generation, p H , is then the difference between the potential power and the power loss, as in (29). Finally, the hydro generation of each group of units is bounded in (30). Since model (16)-(32) is aggregated, the generation of each group of units can take any value between 0 and its maximum, P H . Constraints (31) and (32) are included in the aggregated model to highlight the differences between this model and the zone model. In addition to constraints (16)-(32), the operation of hydropower systems is further constrained by other uses of water. For instances, at some points, the river levels must be kept above a certain minimum to ensure navigability. Therefore, the outflow of power plants along said river are usually tightly controlled to satisfy the river's minimum level. Likewise, environmental and agricultural concerns can also limit the operation of power plants. To reflect such limits, constraints (34)-(49) are introduced into the model. Note that, again, we choose to present the constraints as though they are presented in all power plants for the sake of conciseness. Evidently, those power plants which do not have such limits can have their respective bounds appropriately set to make their inequalities redundant.
Inequalities (34) With the hydro model defined as above, we now turn our attention to the thermal model. The first part of this model involves only the binary decisions associated with the thermal units, as shown in (50)-(56).
The first constraint in (50)-(56), (50), determines the current phase of the thermal unit: off, in start-up trajectory, in shut-down trajectory, or in dispatch. In all phases but off, the status variable x assumes 1. If the unit is in the start-up trajectory, then the sum on the start-up decisions, ∑ t i=t−N SU g +1 w g,i , gives 1 and all variables in the equation other than x g,t must assume 0. Similarly, in the shut-down trajectory, only the status variable and one of the shut-down decisions, y, assume 1; if the unit is in the dispatch phase, then k and x assume 1. To guarantee that once a unit is brought on it remains at least one time period in the dispatch phase, constraints (51) are introduced. Constraints (51) and (53) are the usual minimum up-and down-time constraints. The logical constraints (54) dictate the relation between the status of the unit and the start-up and shut-down decisions. Since in this particular work start-ups and shut-downs are not economically penalized, constraints (55) are used to ensure that they do not occur at the same time. Finally, the binary nature of these decisions is given in (56). Now, the continuous part of the thermal model is given in (57)-(61).
As previously stated, in this work, a thermal unit can be in one of four states. Equality (57) defines the unit's generation, in a per-unit basis, p G according to the current state. The second term in (57), ∑ N SU g i=1 SU i · w g,t+1−i , gives the generation in the steps of the start-up trajectory; likewise, ∑ N SD g i=1 SD i · y g,t+N SD g +1−i defines the generation in the shut-down trajectory; while in the dispatch phase, the generation is simply equal to the generation in the dispatch phase itself, p disp,G . Whereas in the start-up and shut-down trajectory the unit must strictly follow a pre-defined set of generation steps, in the dispatch phase, its generation is more flexible. Nonetheless, it must still comply with the physical limits of the unit. Constraints (58) are the minimum and maximum generation in the dispatch phase, and constraints (59) and (60) are, respectively, the ramp-down and ramp-up constraints. Similar to the hydro plants, the combined generation of sets of thermal units can be further limited by the system operator. In (57)-(61), constraints (61) give lower and upper bounds to the combined generation of the thermal units in set E i , for a given constraint i ∈ P G t of an arbitrary time t. The hydro and thermal models have been defined above. In the following, we show how the network constraints couple these models. In this work, we use a DC model, where the fluxes in the transmission lines is linearly dependent on the power injections at the buses.
The main objective of generation and transmission systems is to serve the demand for electrical energy. The equations (62) establish that the power injections and withdrawals at each bus b ∈ B must balance each other out. In (62), the terms ∑ g∈B G With all components defined above, the aggregated-hydro-model unit-commitment problem is defined in the following.
The objective function of (66) is composed of three terms: the first one is the total cost of the thermal generation, the second is the cost of network violations, and the last one is the estimate of the future cost. Together, the cost of thermal generation and network violation form what we refer to in the rest of the text as the present cost. The objective function is minimized over the feasible region defined by the aggregated hydro model,

Forbidden-Zone-Aware Hydrothermal Unit Commitment
As previously stated, the model shown in the last section allows for the hydro generation to assume any non-negative value up to the plant's maximum generation. Nonetheless, due to minimum generation requirements for the hydro units, the hydro generation curve is not continuous-as the example of Figure 1 illustrates. In order to introduce into the HTUC of the previous section the discontinuities of the hydro generation, we need to resort to binary variables, as shown in the example of (1)- (13) and (14) and (15). Therefore, to accommodate the forbidden zones, model (66) is further equipped with the integrality constraints shown in (67).
As can be seen, the HTUC with aggregated hydro representation is, in fact, a relaxation of the commitment problem with explicit representation of the hydro zones. To see that, simply note that the only difference between the two models lies in the integrality of u, which is missing in the aggregated model (66). The fact that the aggregated model is a relaxation of the zone model guarantees that any differences in the optimal values of the two models are due to the integrality of u which, in turn, is a consequence of the representation of the forbidden zones.
Additionally, note that the use of forbidden zones allows not only for a better representation of the limits on hydro generation, but could also be a means of improving the model of the hydropower function itself. In the aggregated model, the hydropower function is approximated through a single piecewise linear function for the entire range of the generation, from zero to the maximum, although some generation points might be forbidden. On the other hand, the inherently discontinuous nature of the generation in the forbidden-zone-aware model allows for the hydropower function to be approximated by a different model for each of the operating zones. Take again the example of Figure 1, this plant has four operating zones. With additional (continuous) variables in (67), the four operating zones of the plant in this example could be individually approximated by models of the hydropower function for each zone. Although a promising idea, to avoid potential disparities arising from different models of the hydropower function in the aggregated and the zone model, this paper focuses solely on the impacts of explicitly representing the operating zones of power plants in the HTUC.
Before presenting the computational experiments and the associated results, we first present the optimization model used to identify and quantify violations of the operating zones caused by the aggregated model (66). To that end, we first note that there are several ways to perform this task: one could think of the violations on a basin basis, region basis, or plant basis. The first one could take into account the coupling of the reservoirs and the fact that violations in one hydro plant can likely affect the operation of reservoirs upstream and downstream of it. The second one could be based on the premises that power stations, hydro or otherwise, that are electrically close to each other are more likely to be affected by violations in operating zones. Naturally, the first two strategies for quantifying and analysing the violations of operating zones of hydro plants depend on the third one: if there is no violation for any of the hydro plants in the system, then the operation of the reservoirs in all basins and the operation of plants across the entire system will not be affected by violations. Thus, in this paper, we use a plant-based analysis for the violation of the operating zones in the aggregated model (66). The model for this analysis for a arbitrary plant h and arbitrary time period t is shown in (68).
In (68) closest to p H h,u,t is more than p H h,u,t . On the other hand, a negative optimal d indicates that the closest operating zone to p H h,u,t is above p H h,u,t . Note that, although a simple model, (68) enables us to identify violations generated by the aggregated model (66) that ultimately need to be corrected by adjusting the operation of the plant violated itself, and, possibly, also of other reservoirs in the same cascade and of other power plants.

Computational Experiments
In this paper, we work with 10 cases (identified from 1 to 10) from the Brazilian power system: it has 7475 buses, 10,698 transmission lines, 161 hydro plants with a total of 731 hydro units, and 329 thermal units. Our interest in this paper is in the day-ahead operation, thus, we consider a planning horizon with 48 half-hour periods, representing the day-ahead operation. The initial conditions of the hydro and thermal plants immediately before the beginning of the planning horizon are taken from the real operation of the Brazilian system. Additionally, the network model in all cases is the classical DC, as shown in (62)-(65). All cases have essentially the same number of constraints and variables, including binary variables, the only difference in the number of constraints being due to the model of the hydropower function, which is case-dependent and slightly varies with the initial conditions of the system. Therefore, instead of showing model sizes for each case, we present in Table 2 the size of the optimization model for Case 1. The increase in the number of constraints in the zone model is due to valid inequalities derived from (33)-(49) that are included to tighten the model. For the sake of brevity, such constraints are omitted here. Despite having roughly the same size, the times taken to solve the cases can vary widely. In our experience, we have observed that cases with a high load tend to take more time to be solved. Then, to illustrate how the 10 cases cover a wide range of operational points with distinct computational demands, we give in Figure 4 the gross load over the 48 periods of each case. As Figure 4 shows, not only are the peak loads significantly different, but the load profile itself is also markedly distinct among the cases. For instance, Case 2, the one with the highest load, is taken from the summer of 2020, in a period when the system was considerably stressed due to high temperatures. On the other hand, Case 5 is a winter case, which, in Brazil, means that the load is considerably lower than in the summer. To solve the optimization problems, both the UC problem with zone representation and with the aggregated hydro model, we use the algorithm proposed in [15]. For the sake of completeness, we give a summary of the algorithm in Section 5. Apart from the unit-commitment model, the implementation is similar to [15] and, for the sake of conciseness, we omit further details of the algorithmic implementation. The tolerance of the relative gap is set to 0.1%, and the time limit is set to 30 min. All problems are run on a workstation with two Intel Xeon CPU E5-2660 2.60 GHz v3 processors and 128 GB of RAM. The operating system is Ubuntu 20.04.3; the algorithm is implemented and run with Python 3; and Gurobi [19] is used to solve all optimization problems.
An important characteristic of the algorithm of [15], as we will explain shortly, is that it is asynchronous: every run of the same problem, under the same conditions, might produce a different solution. Thus, to account for the non-deterministic behavior of the algorithm, we run each case for the two hydro models five times. Then, a specific run of a given case will be identified by a trial ID from 1 to 5. For example, Case 1 Trial 3 refers to the third trial (or run) of Case 1. A rather beneficial side of this non-deterministic characteristic to our analysis is that, for each of the cases, we have five different, but equally acceptable in terms of relative gap, solutions to the HTUC. This enables us to perform a more robust analysis that involves multiple solutions instead of an analysis based on a single solution for each of the cases.

Summary of the Cooperative Multi-Search Benders Decomposition
Benders decomposition [20] decomposes problems with complicating variables into a master problem, where the complicating variables are dealt with, and a sub problem, which is the original problem with the complicating variables fixed. In our case, we define the complicating variables as the binary ones. In particular to [15], the main idea of the algorithm is to partition the set of master problem variables into a set of free-to-change variables and a set of variables whose values are fixed according to a given reference solution-this reference must satisfy all constraints of the original problem, although it might not necessarily be cost-effective. With multiple identical master problems, it is possible to apply different partitions to each of them, and, then, the master problems are likely to provide different candidate solutions which are later evaluated by the sub problems. In addition to the reduced computational burden due to the fixing of variables, the diversity of candidate solutions drastically increases the chances of finding quality solutions earlier in the optimization processes.
To illustrate the dynamics of algorithm [15], we use in Figure 5, an example with four master problems, 1 to 4, and four sub problems, 1 to 4. Each master problem applies a different partition to the set of complicating variables, assume here that this set is V. Thus, we have for master problem 1 that V = R 1 ∪ F 1 , with R 1 ∩ F 1 = ∅, where R 1 is the set of variables that are free to change, and F 1 is the set of variables that take their respective values in the reference solution. Furthermore, to promote the diversity of solutions, we have that R 1 = R 2 , R 1 = R 3 , R 1 = R 4 , and so forth. Once a master problem finishes its inner optimization processes, it sends its candidate solution to the general coordinator, who then forwards it to be evaluated by one of the sub problems. The sub problem then evaluates the solution and returns to the general coordinator an optimality cut, which is later sent to all master problems. In addition to controlling the flow of information between master problems and sub problems, the general coordinator is responsible for checking the convergence of the algorithm. While master problems and sub problems work towards obtaining good upper bounds, a continuous relaxation of the original problem is solved independently to provide a lower bound. The asynchronous behavior of algorithm [15] is due to the absence of of synchronization points, both when solving the master problems and when solving the sub problems, and the inherent non-deterministic running times of these optimization problems.

Results
We start our analysis by showing the bounds for each case and model in Table 3. In this table, we give the best (min) and worst (max) upper bounds for each case over the five trials. Since the methodology in [15] gives an essentially deterministic lower bound, this bound is unchanged over the trials of each case. As we mentioned in the statement of the models, the aggregated model is a relaxation of the zone model. Therefore, the optimal value of the zone model will be not less than the optimal value of the aggregated model. Since we are not able to solve the optimization models to optimality, we need to rely on the bounds of the optimization models. The difference between the lower bound of the zone model and the upper bound of the aggregated model give an underestimate on the difference between the optimal costs, whereas the difference between the upper bound of the zone model and the lower bound of the aggregated model gives an overestimate. Take, for instance, Case 1, the best upper bound for the aggregated model for this case is $108,458,103.84 ×10 3 and the lower bound for this model is $108,366,249.0 ×10 3 . For the same case, the best upper bound for the zone model is $108,447,723.49 ×10 3 and the lower bound is $108,369,725.0 ×10 3 . Thus, for Case 1, the difference between the optimal costs of the aggregated and zone models lies in $[−88,378.84, 81,474.49] ×10 3 . However, since we know that the aggregated model is a relaxation of the zone model, then the underestimate on the difference between the optimal costs is, at least, 0. Then, the range of the difference for Case 1 is $[0, 81,474.49] ×10 3 . The worst case in terms of the overestimate of the increase in costs is Case 7, where the difference between the optimal costs can be as much as $86,982,200.36. A more practical measure of the differences in costs is the simple comparison of upper bounds. In this case, it is interesting to note that, for Cases 1 and 3, the upper bounds of the zone model are better than those of the aggregated model, whereas, for Case 8, the cost of the zone model's solution is $34,242,315.15 greater than that of the aggregated model. Evidently, these are loose estimates of the differences between using the aggregated and the zone model in the decision-making process of the day-ahead operation. A more accurate analysis would require recomputing the costs of the aggregated model after redispatching the generating units due to (possible) violations of forbidden zones, and then comparing the costs after redispatch with the upper bounds given by the zone model. Unfortunately, such redispatch is commonly performed based on the experience of experts, and thus any attempt at reproducing them here would result in an unfair, and possibly biased, analysis.
Next, in Table 4, we show the relative optimality gaps and the elapsed times for each case and model. Again, to account for the differences in the solutions obtained and the run times of each case over the five trials, we give the best (min) and worst (max) gaps and elapsed times of each case over the five trials. The worst gaps in Table 4 indicate that all cases have been solved to a 0.1% optimality gap for both models in less than 30 min. Moreover, in the elapsed time's column, we see that the computing times favor the aggregated model only slightly. In fact, the average run time of this model over all cases and trials is about 10 min, while that of the zone model reaches 10.5 min. Considering the magnitudes of the problems at hand, these computing times are well within the requirements for unit-commitment problems. The first step of the multi-search Benders algorithm is to find an initial reference solution that is feasible in the original problem, although not necessarily a good solution in terms of costs. This step consists of three phases: (1) getting an estimate of the future cost; (2) solving a zero-objective continuous version of the original problem; (3) and finally getting binary solutions from the possibly fractional ones of phase (2). Phases (1) and (2) are essentially the same for the aggregated and the zone models, while phase (3) only differs due to the fact that, in the zone model, we need binary decisions for the hydro operation. Thus, it is not unreasonable to expect that the initial solutions of the aggregated and zone model can be rather close to each other, especially when we look at the thermal operation. After this first step to get the initial reference solution, the original problem is decomposed into multiple identical master problems and multiple identical sub problems. For both models, the sub problems are essentially the same in terms of model size: the aggregated model only has the additional continuous hydro variables in [0, 1], which account to roughly 15,000 variables, a relatively small number. Therefore, the running times of the sub problems for both models can be expected to be close. On the other hand, the master problems in the case of the zone model have an additional 15,000 binary variables, and this number, although relatively small, now becomes more significant because it relates to binary variables. However, our algorithm partitions the master problem's variables into a set of free-to-change variables and a set of fixed variables. This dramatically reduces the computational complexity of the master problem because we are, in practice, getting rid of a substantial number of binary variables. Hence, the running times of the master problems of the zone model are greatly reduced due to this partitioning strategy. In conclusion, due to the similarity of the sub problems and the partitioning of the master problem, the overall running times of our algorithm for both models can be rather comparable, as seen in Table 4.
To illustrate the violations in the operating zones in the aggregated model, we again use Salto Caxias as an example. Figure 6 presents the generation of this plant in the aggregated model for each of the five trials of Case 4. As it can be seen in Figure 6, most of the violations in Salto Caxias happen during periods of low demand-this can be readily seen by analysing the demands in Figure 4. Moreover, the majority of the violations are within the first forbidden zones, i.e., between 0 and the minimum generation of 235 MW. The example of Figure 6 is also useful to demonstrate how the optimization model (68)   Using Trial 1 as an example, we can compute the average of the violations shown in Figure 7 over the periods when they are strictly greater than 0. For Trial 1, this metric is 117.4 MW. The important thing to remember about a violation of this magnitude is that it requires significant changes in the operating point given by the aggregated model. This redispatch, in turn, is likely to be far from optimal if the day-ahead operation is not reoptimized (which is usually what happens since the redispatch is performed based on the experience of experts).
To gain a bigger picture of the violations over all cases for all plants, we can compute the total violations over all cases and trials, taking into consideration only violations greater than 10 MW-violations smaller than 10 MW are here considered to be more easily corrected. For a better understanding, note that each of the 10 cases has 48 periods, and each of the is run five times (the five trials). Thus, the total number of generation points to be analysed is 10 · 5 · 48 = 2400. For an arbitrary plant h, assume that its violation in a case c, with c ∈ {1, ...10}, trial i, with i ∈ {1, ...5}, and time period t is v c,i,h,t , then the total violation is ∑ c∈{1,...10} ∑ i∈{1,...5} ∑ t∈T ,v c,i,h,t >10 v c,i,h,t . In addition to the total violation, we can also compute the occurrences of violations greater than 10 MW as ∑ c∈{1,...10} ∑ i∈{1,...5} ∑ t∈T ,v c,i,h,t >10 1. Nonetheless, the aforementioned metrics are hard to understand due to their magnitudes, it is more interesting to see, first, the average of the violations over all cases and trials, defined as . Figure 8 shows the average violation and occurrences (given as percentages) of the 20 plants with most total violations over all cases and trials. The first thing to note in Figure 8 is the percentage of time that the plant Machadinho has violations, over 78%. Moreover, the average violation for this plant is 106 MW. The Machadinho plant has three identical Francis turbines with a generation range of [260, 380] MW. For the majority of time, the violation occurs in the first forbidden zone, i.e., in (0, 260) MW. The next plant in terms of frequency of violations is Itá, which for about 75% has violations that average 55 MW. In addition to the violations themselves, it is important to note that Machadinho is upstream from Itá, which adds to the difficulty of adjusting the generations of both plants. In fact, from the 20 plants shown in Figure 8, four of them, Machadinho, Itá, Foz do Chapecó and Barra Grande, are in the Uruguai basin. Another basin which has a significant number of reservoirs in Figure 8 is the Iguaçu, with three representatives: Salto Caxias, Salto Osório, and Segredo. Another plant whose violations warrants attention is Belo Monte. Although the frequency of violation for this plant is 9%, significantly less than the frequencies for the other plants in Figure 8, its average violation is 163 MW, reaching a maximum of 218 MW. In contrast to the plants discussed here so far, Belo Monte has a great number of units, 18, all of them with a minimum generation of 450 MW. In addition, Belo Monte is a run-of-river plant built to exploit the high inflows in the wet period of Northern Brazil. Thus, when there is enough inflow, Belo Monte's units operate near full capacity. However, in dry periods and/or periods of low demand, the high minimum generation of its units makes Belo Monte a likely candidate for violations.
Hitherto, we have focused on the violations in the hydro operating zones caused by the aggregated model. Now, we include in our analysis the zone model. First, we use the generation of Belo Monte in Trial 1 of Case 9 as an example-in fact, the dispatches for Belo Monte in all trials of Case 9 for both models are similar. Figure 9 shows the plant's generation over the 48 periods of the planning horizon. Note that, in the aggregated model, the generation violates all three forbidden zones. Moreover, Belo Monte's generations in the two models are considerably different: while in the aggregated model, the total energy generation is 45,408 MWh, the total energy generation in the zone model is about 34,107 MWh. Furthermore, the peak generation in the aggregated model reaches 4668 MW, whereas in the zone model it is about 2927 MW. In addition to the differences in the generation of individual hydro plants, the aggregated and zone models can also differ in the total generation. Figure 10 illustrates this fact for the first trial of Case 1. In this figure, we see that the hydro generation in the zone model is considerably higher than that in the aggregated model. If we look at the examples previously shown of Salto Caxias and Belo Monte, the increase in generation in the zone model seems reasonable: since most violations occur in the first forbidden zone, the adjustment can be either to generate 0 (decrease), or generate at the minimum (increase). Evidently, the increase in hydro generation presented in Figure 10 reflects in a corresponding decrease in thermal generation, which is shown in Figure 11. The differences in thermal generation also reflect changes in the commitment of thermal units. To illustrate these differences, we again resort to Case 1 Trial 1. The total number of thermal units committed in each half-period of Case 1 Trial 1 is shown in Figure 12. We care to note that the differences seen in this figure for the first trial are also seen in the other four trials. Moreover, with the exception of Cases 5 and 8, all cases show significant differences between the aggregated and zone model. (By significant difference, we mean that, for at least 12 h of the 24 h of the day-ahead operation, the difference in the number of committed units is at least 20.) As seen in Figure 12, the difference in the number of committed thermal units can be as much as 70, which illustrates a significant difference in the operation of the system for a single day. As we mentioned above, in Cases 5 and 8 the commitment of thermal units is similar for models aggregated and zones. We use Case 8 Trial 1 to illustrate this similarity in Figure 13.
Although the commitment of thermal units is similar in this case, their total generation is still significantly distinct. In Figure 14, we can see that the thermal generation in aggregated model is 879 MW higher than the generation in the zone model in the 40th half-hour.
The most apparent consequence in the changes in thermal generation shown in Figure 11 are differences in the present costs of the system. To illustrate such changes, we show in Figure 15 the average thermal generation costs for each case over the five trials. From this figure, it can be seen that, for most cases, there is a noticeable increase in thermal costs in the aggregated model, suggesting that the decrease in thermal generation, from the aggregated to the zone model, presented in Figure 11 is also found in most of the other cases. Two cases that are exceptions to this trend are Cases 4 and 8, for which the costs in the zone model can be slightly higher than those in the aggregated model.   Another important metric in a hydro-dominated system is the stored energy left in the reservoirs at the end of the planning horizon. The more energy stored in the reservoirs, the less thermal generation will be required in the future-resulting in lower future generation costs. Here, we use the methodology presented in [21] to compute the stored energy in each of Brazil's four subsystems: Southeast, South, Northeast, and North. In summary, the stored energy is an estimate of how much energy could be continuously generated by hydro plants with just the water that is currently in the reservoirs. Take, for instance, the cascade of of Figure 2. The stored energy for this cascade is computed as the maximum energy that could be generated by first depleting Foz do Areia's reservoir, i.e., the rightmost plant in the figure (the one farthest from the sea). As water leaves Foz do Areia's reservoir, it eventually reaches Segredo's, where it can again be used to generate more energy, in addition to the water already in Segredo's reservoir. This reasoning is applied until the last plant in the cascade, the one closest to the sea, is reached. The cascade's stored energy is then simply the sum of the stored energy of all the cascade's plants, while, for a subsystem, the stored energy is the sum of its cascades' stored energy. Nonetheless, instead of presenting the stored energy itself, we first show only the difference in the relative stored energy for the four subsystems for models aggregated and zones in Figure 16. Another important metric in a hydro-dominated system is the stored energy left in the reservoirs at the end of the planning horizon. The more energy stored in the reservoirs, the less thermal generation will be required in the future-resulting in lower future generation costs. Here, we use the methodology presented in [21] to compute the stored energy in each of Brazil's four subsystems: Southeast, South, Northeast, and North. In summary, the stored energy is an estimate of how much energy could be continuously generated by hydro plants with just the water that is currently in the reservoirs. Take, for instance, the cascade of of Figure 2. The stored energy for this cascade is computed as the maximum energy that could be generated by first depleting Foz do Areia's reservoir, i.e., the rightmost plant in the figure (the one farthest from the sea). As water leaves Foz do Areia's reservoir, it eventually reaches Segredo's, where it can again be used to generate more energy, in addition to the water already in Segredo's reservoir. This reasoning is applied until the last plant in the cascade, the one closest to the sea, is reached. The cascade's stored energy is then simply the sum of the stored energy of all the cascade's plants, while, for a subsystem, the stored energy is the sum of its cascades' stored energy. Nonetheless, instead of presenting the stored energy itself, we first show only the difference in the relative stored energy for the four subsystems for models aggregated and zones in Figure 16. (c) Northeast.
(d) North. Figure 16. Differences in the stored energy in % of the maximum stored energy in each subsystem. The difference is computed as SE a,subs − SE z,subs , where SE is the energy stored in %, a is the aggregated model, z is the zone model, and subs is a particular subsystem. The squared marks give the average of the difference stored energy over the five trials of each case, while the error bars indicate the minimum and maximum over the five trials. Positive differences indicate that more energy is stored in the aggregated model. In (a), we show the differences for the Southeast system; (b-d) present the data, respectively for the South system, Northeast, and North. Figure 16 shows that, for subsystems Southeast and South, there is an evident trend of more water usage in the zone model. According to Figure 8, the majority of violated plants are located there: Machadinho, Itá, Salto Caxias, For do Chapecó, Salto Osório, Segredo, and Barra Grande are located in the South; Itumbiara, LC Barreto, Irapê, Cana Brava, Amador Aguiar I, São Salvador, Três Irmãos, São Simão, and Serra do Facão are in the Southeast subsystem. Thus, with the results from Figures 8 and 16, we can deduce that the explicit representation of the operating zones tends to lead to more water usage for subsystems Southeast and South. A important factor that contributes to this is the presence Figure 16. Differences in the stored energy in % of the maximum stored energy in each subsystem. The difference is computed as SE a,subs − SE z,subs , where SE is the energy stored in %, a is the aggregated model, z is the zone model, and subs is a particular subsystem. The squared marks give the average of the difference stored energy over the five trials of each case, while the error bars indicate the minimum and maximum over the five trials. Positive differences indicate that more energy is stored in the aggregated model. In (a), we show the differences for the Southeast system; (b-d) present the data, respectively for the South system, Northeast, and North. Figure 16 shows that, for subsystems Southeast and South, there is an evident trend of more water usage in the zone model. According to Figure 8, the majority of violated plants are located there: Machadinho, Itá, Salto Caxias, For do Chapecó, Salto Osório, Segredo, and Barra Grande are located in the South; Itumbiara, LC Barreto, Irapê, Cana Brava, Amador Aguiar I, São Salvador, Três Irmãos, São Simão, and Serra do Facão are in the Southeast subsystem. Thus, with the results from Figures 8 and 16, we can deduce that the explicit representation of the operating zones tends to lead to more water usage for subsystems Southeast and South. A important factor that contributes to this is the presence of large Francis turbines in these two subsystems whose minimum generation are, roughly, 60% of their rated power. On the other hand, we see in Figure 16c that there is no evident change in the stored energy for the Northeast subsystem, apart from Cases 4-6. For the five trials of Case 4, this subsystem has more stored energy in the aggregated model than in the zone one. For this particular case and subsystem, the zone model tends to use less hydro generation. The differences for the North subsystem are presented in Figure 16d. From the four subsystems, the North has the lowest capacity of energy storage, and the representation of operating zones has a diminished significance in terms of energy storage. Furthermore, we see from this figure the differences in stored energy for the two models in this subsystem are also close to zero for most cases.
The analysis of the energy storage can be extended to the Brazilian system as a whole. In Table 5 we show the average stored energy for the Brazilian system for each of the ten cases over the five trials-differences in the trials are negligible for this analysis. Consistent with the results shown so far, we see that the aggregated model leads to more stored energy for all cases. The relative difference between the total stored energy of the two models ranges from 0.1% to 0.56%-the zone model stores from 0.1% to 0.56% less energy than the aggregated model. Note that these differences are computed for a single day. Thus, over the course of a year, the differences in stored energy can amount to significant numbers. Additionally, the results for the system's stored energy concur with the differences seen in the total hydro generation for models aggregated and zone illustrated in Figure 10: more hydro generation leads to less stored energy. Furthermore, the inconsistencies between the aggregated model and the zone model can also be used as a strong argument for a better representation of the hydro operation in longer-term problems, similar to what has been investigated for network constraints in [22]. Table 5. Average total stored energy of the Brazilian system over the five trials of each case in MWmonth. One unit of MWmonth means that one MW can be generate continuously during one month. In this table, the relative difference in the stored energy is computed as 100 · SE a −SE z SE a , where SE a is the total stored energy for the aggregated model in MWmonth, and SE z is the same metric for the zone model. In addition to the differences in operating costs and stored energy presented above, another important metric that is directly affected by the optimization model chosen to represent the day-ahead operation, is the system's marginal costs. In Brazil, the power system is divided into four interconnected subsystems: Southeast, South, Northeast and North. The current practice for pricing electrical-energy consumption consists in assigning uniform prices for all buses within each subsystem. The prices are derived from the marginal costs obtained by solving the unit-commitment problem, with the binary decisions fixed at their respective values in the best solution found for the day-ahead problem, and by relaxing all inner subsystem transmission constraints-the latter guarantees uniform prices within the subsystems. The marginal costs, in turn, are the dual variables associated with the power-balance constraints. Here, we follow this methodology to obtain the marginal costs for the day-ahead problem with both hydro models. In our finding, we have observed that, at most times, the marginal costs are higher for the aggregated model, which can be associated with the usual higher levels of thermal generation in this model. Figure 17 gives the marginal costs for the four subsystems in Case 8 Trial 1 to demonstrate this trend. As Figure 17 shows, for all subsystems, the greatest differences in marginal costs for the two models happen in the last periods of the planning horizon-the difference in marginal costs reaches as much as $100/MW for subsystems Southeast, South and Northeast in the 46th period (10:30 pm of the day ahead). that, at most times, the marginal costs are higher for the aggregated model, which can be associated with the usual higher levels of thermal generation in this model. Figure 17 gives the marginal costs for the four subsystems in Case 8 Trial 1 to demonstrate this trend. As Figure 17 shows, for all subsystems, the greatest differences in marginal costs for the two models happen in the last periods of the planning horizon-the difference in marginal costs reaches as much as $100/MW for subsystems Southeast, South and Northeast in the 46th period (10:30 pm of the day ahead).

Conclusions
In this paper, we have investigated differences in the operation of hydrothermal systems that can arise due to the presence or absence of the explicit representation of hydro-generation forbidden zones. Our experiments are performed on a real-life system with over 161 hydro stations that, together, have over 700 hydro units. We have shown that there are significant differences in the total generation of hydro plants and thermal plants, as there is in the system's stored energy and the marginal costs. Moreover, as our results show, the differences are even more significant when we look at plants individually: in our experiments, hydro plants have their operating zones frequently violated in significant degrees, requiring post-optimization adjustments, which are likely to be suboptimal. In addition to providing evidence of necessary improvements in the day-ahead scheduling problem of the Brazilian system, we also believe that our results give interesting subsidies for improvements in longer-term problems where the representation of operating zones is even less investigated.

Conclusions
In this paper, we have investigated differences in the operation of hydrothermal systems that can arise due to the presence or absence of the explicit representation of hydro-generation forbidden zones. Our experiments are performed on a real-life system with over 161 hydro stations that, together, have over 700 hydro units. We have shown that there are significant differences in the total generation of hydro plants and thermal plants, as there is in the system's stored energy and the marginal costs. Moreover, as our results show, the differences are even more significant when we look at plants individually: in our experiments, hydro plants have their operating zones frequently violated in significant degrees, requiring post-optimization adjustments, which are likely to be suboptimal. In addition to providing evidence of necessary improvements in the day-ahead scheduling problem of the Brazilian system, we also believe that our results give interesting subsidies for improvements in longer-term problems where the representation of operating zones is even less investigated.