A Novel Lagrangian Multiplier Update Algorithm for Short-Term Hydro-Thermal Coordination †

: The backbone of a conventional electrical power generation system relies on hydro-thermal coordination. Due to its intrinsic complex, large-scale and constrained nature, the feasibility of a direct approach is reduced. With this limitation in mind, decomposition methods, particularly Lagrangian relaxation, constitutes a consolidated choice to “simplify” the problem. Thus, translating a relaxed problem approach indirectly leads to solutions of the primal problem. In turn, the dual problem is solved iteratively, and Lagrange multipliers are updated between each iteration using subgradient methods. However, this class of methods presents a set of sensitive aspects that often require time-consuming tuning tasks or to rely on the dispatchers’ own expertise and experience. Hence, to tackle these shortcomings, a novel Lagrangian multiplier update adaptative algorithm is proposed, with the aim of automatically adjust the step-size used to update Lagrange multipliers, therefore avoiding the need to pre-select a set of parameters. A results comparison is made against two traditionally employed step-size update heuristics, using a real hydrothermal scenario derived from the Portuguese power system. The proposed adaptive algorithm managed to obtain improved performances in terms of the dual problem, thereby reducing the duality gap with the optimal primal problem.


Introduction
The objective of short-term hydro-thermal coordination is to optimize electricity generation [1], meaning to find an optimal generation dispatch, or close to ideal, for all the thermal and hydro units available in a system. This ensures the total operation cost is minimized within horizons ranging from one day to one week (168 h), taking into account the entire system and its individual constraints [2][3][4][5] and with a planning period (discrete time-step), typically set from hour to hour [5]. In other words, this crucial process is responsible for scheduling the start-up and shutdown of thermal units (binary level decisions), in coordination with hydro plants, to ensure the continuity of electricity supply with appropriate levels of spinning reserve, while minimizing the operating costs [6]. This scheduling constitutes a unit commitment (UC) problem, where the dispatch policy of the thermal units is made in such a way that the total cost (operating cost, starting cost and shut down cost) is minimal over a pre-defined time-horizon. In addition, a series of operational constraints needs to be fulfilled, thus reducing the freedom of choice to turn a thermal unit on or off. In this regard, we are primarily speaking about the load balance constraint, i.e., ensuring that our electric energy demand 1.
Identify a weakness in the classical update mechanism of the step-size used in the subgradient method.

2.
Propose an adaptative Lagrangian multiplier update algorithm that, as its name suggests, dynamically updates the step-size value and subsequently the Lagrange multipliers, so that the dual function converges towards its optimum in a pre-arranged number of iterations.

3.
The LR technique is then used with the proposed adaptative update algorithm to solve a real large-scale, short-term hydro-thermal coordination problem, using data from the largest Portuguese electric utility company in two different scenarios.

4.
For validation purposes, the algorithm is tested against two traditional step-size update heuristics with different initial parameter values. 5.
Finally, the results comparison in both case studies revealed a sizeable advantage in terms of dual problem solution (error reduction) in favor of the proposed adaptative algorithm, therefore reducing the duality gap with the primal problem. Thus, a better allocation of the complex and vast hydro and thermal resources is obtained.
The rest of this paper is organized as follows. Section 2 provides a brief description of the primal problem. Section 3 explains the Lagrangian dual problem. Section 4 introduces subgradient methods and the motivation for the proposed algorithm, introduced in Section 5. Results and discussion for the two case studies are provided in Section 6, and finally, Section 7 presents the conclusion of this work.

The Primal Problem
The hydro-thermal coordination problem is a non-linear, large-scale, non-convex and combinatorial problem by nature. It can be understood as the task of establishing a map of feasible operations for each generation unit available in an electrical power system at the lowest cost for a predefined time horizon, in order to satisfy the expected load demand and a set of other system restraints. Typically, the time horizon considered is from one to seven days, and the discrete time-step (in which decisions are made) is a one-hour period. This problem is treated as deterministic, and whenever it is necessary to include stochastic quantities such as load diagram and reservoir inflows, their expected values are used.
In this manner, a primal problem (P) is non-convex and non-linear and can be mathematically formulated as shown in Equations (1)- (6). The total operating cost for all resources (units) and over the entire considered period, K, is defined in Equation (1) and is the problem's objective function, i.e., evaluates the performance of each admissible solution. The cost function, C ik x i,k−1 , p ik , u ik , is a measure that evaluates the decision made in each state, since there is an operating cost associated with the state transition (from x i,k−1 to x ik ), which delivers the power p ik , determined by the control decision u ik , for each unit i at time k. The following three Equations (2)-(4) represent the set of global constraints. Firstly, Equation (2) translates the (global) demand-supply balance restraint, where D k is the required load demand that needs to be served by the power output of each resource i in hour k, p ik . Moreover, for simplicity purposes transmission losses were not considered. In turn, Inequality (3) represents all the hourly system capacity requirement constraints, i.e., constraints like the spinning and operating reserve requirements. R mi translates the capacity contribution function associated with resource i for the system capacity requirement of type m, while R req mk is the mth-type system capacity requirement in hour k [5].
Additionally, Inequality (4) represents all the cumulative constraints, such as the limitation on emission by a group of units over the scheduling time horizon or the amount of consumed fuel, where H n stands for the set of thermal units on the nth cumulative constraint, H ni is the function which describes a contribution of thermal unit i-nth cumulative constraint, H req n is the upper bound on nth cumulative constraint and N is the set of cumulative constraints [37].
In turn, Equations (5) and (6) represent the set of local constraints, the state Equation (5) of each resource i at a time k. This equation allows us to obtain the state of each resource x ik and its contribution p ik to satisfy demand, whatever the decision u ik . Last of all, in (6) the domain of admissible values for the control variables, as well as for the initial and final state, are defined for each individual resource i.
and wherein (x ik , p ik ) = A ik x i,k−1 , u ik i = 1, . . . , I ∧ k = 1, . . . , K Although the objective function is a separable function in resources and hours, this problem, by its formulation and due to collective constraints, does not allow this separability, providing extreme complexity to the minimization problem. In other words, the optimum value cannot be found by the sum of the various suboptimal (separately) results from each resource. Thus, we are facing a problem of unrestrainable dimension, for which a direct approach is not viable.
The primal problem defined in this study approaches the short-term hydro-thermal coordination considering the generation resources available to the electric utilities company, to match the system-wide load demand over a weekly time-period, while fulfilling a set of other global and local constraints.

Lagrangian Dual Problem
As discussed, the primal problem is difficult to solve using conventional nonlinear optimization techniques. A preferable path is to decompose the problem constraints and transfer them to the objective function, i.e., to solve the dual problem, rather than solving the primal problem directly. We know beforehand that the optimal solution of the relaxed problem is a lower bound (good estimate) of the optimal solution of the initial problem [2,10,38]. This is achieved by relaxing the constraints, i.e., weakening of the problem (P), that in practical terms means open the possibility to breach the imposed constraints. However, relaxed restrictions are not completely ignored since its violations are linearly penalized in the Lagrange function (an added cost associated with violating each constraint).
We can write the Lagrange function (L) for problem (P) by relaxing its global constraint, as expressed in Equation (7), where λ, µ and γ are the Lagrange multiplier vectors associated with the load-balance constraint, capacity constraints and cumulative constraints, respectively. These Lagrange multipliers are expressed in units of cost per unit of the parameters of their associated constraint, which in the case of Equation (2) is given in $/GW.
That is, to now solve the unit commitment problem, L is minimized, where Min u L x i,k−1 , p ik , u ik , λ, µ, γ is subject to local system constraints, i.e., Equations (5) and (6).

Subgradient of the Dual Function
The Lagrangian dual problem is obtained by forming (L), and its solution provides the primal variables as functions of the Lagrange multipliers, which are labeled dual variables. Hence, the new problem is to maximize the objective function with respect to the multipliers under the derived constraints on the dual variables. This implies, by decomposition, that each resource becomes a single entity and is individually optimized, rather than a combined optimal resource allocation. Therefore, the dual function is defined in Equation (8), presenting concave and sub-differentiable traits (resulting in inferiorly limited variables).
Given that Lagrange's dual function is a concave function with simple bounds on the variables, a local optimum is also the function global optimum. Therefore, our task is to find the Lagrangian multipliers, λ, µ and γ, that maximize the dual function. Nonetheless, this does not mean that solving the dual function is a trivial task (far from it actually), since the function is not smooth and is not given by an easy-to-compute expression [5]. For this purpose, we resort to subgradient methods, which benefit from the fact the subgradients of q are easily derived system constraint deviations. Consequently, we can define the subgradient of the dual function g (9) for each hour k as follows: Energies 2020, 13, 6621 Moreover, by the weak duality theorem for a single set of multipliers, the optimal value of the Lagrange dual problem q(λ * ) and the optimal value of the primal minimization problem p(λ * ) are related by q(λ * ) ≤ p(λ * ), and the difference between the values is called a duality gap. This implies that the dual problem offers a good indirect root to solve the primal one, since the gap in most practical cases is relatively small [6,12,39].
For all the reasons above, this approach to the problem is advantageous since it lessens the computational burden of the primal problem.

Subgradient Methods
As we saw earlier, obtaining the Lagrange dual function optimal value goes hand-in-hand with the Lagrange multiplies choice/update method, i.e., at the outset, this choice determines how close we are to the solution of the dual problem and, ultimately, how close we are from reaching the primal problem best solution. To perform this task several methods are described in the literature [5]; however, regarding our problem in particular, subgradient methods prevail as the most fitting solution by achieving higher accuracies. Further benefits include their simplicity as well as the computational ease with which the Lagrange dual function subgradient (solution deviation from the imposed constraints) is calculated.
These methods update the multipliers according to the subgradient direction and in a manner proportional to the violation of the corresponding constraints. Besides, a distinctive trademark of these methods concerns the step-size update heuristic, where again several approaches have been followed [5]. However, the downside of these conventional updating heuristics is that a long-winded trial-and-error procedure as well as a highly specialized operator are frequently required. The simplest and most common subgradient method formulation is given by where g v is the subgradient g(p λ v ), s v is a positive scalar that defines the step-size at the current iteration v and, lastly, [.] + represents the projection in the set of feasible values Λ. Nonetheless, there is no guarantee that after iteration v + 1, independently from the chosen step-size, the dual function value will actually improve (walk towards the optimal dual function value), meaning that in some occasions we will have Though, if the step value is sufficiently small, the distance between the obtained point in the current iteration and the optimum value can always be reduced. The following proposition offers an estimate for the step-size domain (range): Proposition 1. If λ v does not lead to the optimum value of the dual function, then for λ * , which corresponds to the dual function optimum value, the inequality λ v+1 − λ * < λ v − λ * is valid for all step-sizes, [. Therefore, this suggests a step-size equal to the middle value of the inequality range, i.e., Since this requires knowledge of the dual function optimal value q(λ * ), which is exactly the unknown we want to find, this approach is unviable in our case, and we resort to heuristics that determine the step-size. In this regard, a popular choice is decreasing step-size rule-based approaches, mainly due to its simplicity and effectiveness.
Considering a decrease in step-size, s v , towards zero, meaning that lim v→∞ s v = 0 ∧ s v > 0, while at the same time satisfying ∞ v=1 s v = ∞, the method can "travel" as far as possible (up to infinity) in order to converge to the optimal dual function value. Thus, under these assumptions, we can infer a 2nd proposition, from which we can conclude that it is possible, by appropriately updating the step-size, to reach the dual function maximum value [21].

Proposition 2. For the sequence of all multiplier values {λ
However, this analysis does not lead to a finite procedure, pointing to an initial value of the step, as well as a mechanism for decrementing it to zero. As such, for comparison purposes against the proposed new heuristic, the two most widely employed expressions are introduced in Equations (12) and (13), to update the step-size at each iteration v.
where a 1 and a 2 are control parameters of the heuristic process. Moreover, the chosen initial step is a highly sensitive matter, since small initial steps can prevent the method from reaching the desired optimum value in a reasonable number of iterations. Whereas, using a large initial step may cause the method to oscillate erratically in the early phase, leading to poor convergence. As a result, although the obtained value is stabilized, it could still be improved by running further iterations. This fact is more pronounced in Equation (12) rather than Equation (13), given that a 2 > 1 implies a rapid decrease in the step-size. Consequently, selecting the values to assign to parameters a 1 and a 2 is a difficult task, with direct influence on the obtained results. This could be facilitated if a good initial estimate for the dual variable vector λ 0 is available, that is, if q(λ 0 ) is already close to the solution of the dual problem.
Therefore, we can conclude that it is an intrinsically lengthy (experimentation-based) heuristic process that is highly dependent on the user's experience. Precisely to mitigate this scenario, a new algorithm will be proposed next.

Proposed Adaptative Algorithm
When applying subgradient methods, the existence of good estimates for the multipliers and careful tune-up of the subgradient step-size are considered essential to improve the computational effectiveness of the method. Subsequently, motivated by the previously exposed shortcomings from the classical subgradient optimization approaches, an adaptative heuristic is proposed in order to automatically update the Lagrange multipliers, thereby removing the need to rely on a user's past experiences or time-consuming trial-and-error tasks. This means that the step-size, s v , is automatically determined (avoiding lengthy trial-and-error procedures) by the adaptative algorithm when solving the dual problem with a subgradient method. The different stages of the algorithm that lead to a dual problem solution are illustrated in Figure 1; subsequently, the rationale behind them is detailed below: (1) define the initial step-size, s 0 = 1, and choose an initial value for the dual variable vector, λ 0 ; then compute the initial dual function and subgradient values, q 0 λ 0 and g 0 (p λ 0 ), respectively; (2) update Lagrange multipliers according to Equation (10); (3) determine the new step-size as follows: and the step-size is given by (4) compute the current (iteration) dual function and subgradient values, q v (λ v ) and g v (p λ v ); (5) if the termination criterion is met: a. terminate the algorithm; else b. proceed to the next iteration, v = v + 1; c. return to (2); end Regarding the (above) adaptative algorithm, the following clarifications are made: In (1) the initial dual variable vector positioning only impacts the convergence speed of the subgradient method; thereby, it can be considered arbitrary. On the contrary, for the step-size update expressed by Equations (12) and (13), this initial positioning benefits heavily from a nearby optimum value (derived from past experiences or other heuristics) in order to guarantee the method's performance, thereby translating an important advantage of the proposed strategy.
In turn, stage (3) depicts the original step-size update mechanism, where the rationale behind it is to dynamically update the step based on the dual function value, i.e., if this value improves then the step should be augmented; in contrast, if this value does not improve then the step should be diminished. Moreover, to prevent a large step-size increasing the distance between the new point and the optimum value, this step-size should be increased smoothly; this fact is less sensitive when reducing step-size. Additionally, it was found that the ideal domains for variables α 1 and α 2 are Lastly, the stop criterion mentioned in (4) is traditionally run a specific number of iterations, which was also the case in this work. Regarding the (above) adaptative algorithm, the following clarifications are made: In (1) the initial dual variable vector positioning only impacts the convergence speed of the subgradient method; thereby, it can be considered arbitrary. On the contrary, for the step-size update expressed by Equations (12) and (13), this initial positioning benefits heavily from a nearby optimum value (derived from past experiences or other heuristics) in order to guarantee the method's performance, thereby translating an important advantage of the proposed strategy.
In turn, stage (3) depicts the original step-size update mechanism, where the rationale behind it is to dynamically update the step based on the dual function value, i.e., if this value improves then the step should be augmented; in contrast, if this value does not improve then the step should be diminished. Moreover, to prevent a large step-size increasing the distance between the new point and the optimum value, this step-size should be increased smoothly; this fact is less sensitive when

Numerical Results
The behavior of the subgradient method is analyzed in this section. The step value is updated according to the adaptive algorithm, proposed in Section 5, and then benchmarked against a classical approach. The step-size is updated using Equations (12) and (13) and, consequently, the Lagrange multipliers. The software used in this work was written in Fortran using the development environment (IDE) Microsoft Visual Studio ® .
As previously mentioned, the unit commitment (primal problem) corresponding to the solution of the Lagrange dual problem does not always lead to a feasible solution. As such, the average subgradient norm, g(p λ ) /K, is defined as a quantitative metric of how a solution is accurate in terms of the primal problem. This means the lower the value, the closer we will be to a good solution, and typically a value on the order of 0.5% of the peak load typically means that a good solution to the primal problem was found.
The data employed in this work concern the real short-term hydro-thermal coordination problem that the main Portuguese electric utility companies face. Data include all generation parameters and auxiliary variables, i.e., a large-scale study comprising six thermal power plants and 26 hydro power plants (amounting to over 80 individual generation units), which serve the majority of the Portuguese electric power demand. Two different case studies will be considered, diverging over the selected weekly periods, size and characteristics of the system, as well as the economic strategies behind the cost curves. Additionally, the specified parameters will be kept fixed for both case studies.

Case Study I
For this case study, unit costs are the sole result from the associated generation costs, with no other parallel costs. Consequently, both the evolution of the dual function q(λ) value as well the evolution of the average subgradient norm g(p λ ) /K will be evaluated over the course of iterations. Figures 2a,b and 3 show the evolution of the dual function value (left axis) and its step value (right axis). Figures 4a,b and 5 show the evolution of the average subgradient norm. Figures 2a and 4a illustrate the behavior of the subgradient method using a classical step-size update, given by Equation (12). In the same fashion, Figures 2b and 4b illustrate the behavior when using Equation (13). Lastly, the new adaptive algorithm results are shown in Figures 3 and 5. and typically a value on the order of 0.5% of the peak load typically means that a good solution to the primal problem was found.
The data employed in this work concern the real short-term hydro-thermal coordination problem that the main Portuguese electric utility companies face. Data include all generation parameters and auxiliary variables, i.e., a large-scale study comprising six thermal power plants and 26 hydro power plants (amounting to over 80 individual generation units), which serve the majority of the Portuguese electric power demand. Two different case studies will be considered, diverging over the selected weekly periods, size and characteristics of the system, as well as the economic strategies behind the cost curves. Additionally, the specified parameters will be kept fixed for both case studies.

Case Study I
For this case study, unit costs are the sole result from the associated generation costs, with no other parallel costs. Consequently, both the evolution of the dual function ( ) value as well the evolution of the average subgradient norm ‖ ( )‖/ will be evaluated over the course of iterations. Figure 2a,b and Figure 3 show the evolution of the dual function value (left axis) and its step value (right axis). Figure 4a,b and Figure 5 show the evolution of the average subgradient norm. Figures 2a  and 4a illustrate the behavior of the subgradient method using a classical step-size update, given by Equation (12). In the same fashion, Figures 2b and 4b illustrate the behavior when using Equation (13). Lastly, the new adaptive algorithm results are shown in Figures 3 and 5.  Figure 2. Evolution of the dual function value (blue plots), q(λ), and its step value (red plots), using the heuristics expressed by Equations (12) and (13)      Regarding Figure 2a,b we can observe the following: (i) achieving convergence in more or less iterations is heavily dependent on the choice of the different parameters; (ii) using a smaller initial step-size increased the number of iterations needed to achieve convergence, maxq(λ) (dashed line); (iii) the use of a slightly larger initial step leads to some oscillation, still, without compromising convergence, represented by the solid lines; (iv) the step-size evolution is strictly decreasing, and the rate of descent depends on the considered parameters.
With respect to the adaptive algorithm, the evolution of the dual function increases as the value of the step increases, as shown in Figure 3, until a value is reached in the vicinity of the maximum dual function value. From this point onwards, the step value decreases towards zero, but then again it slightly increases whenever the dual function value does not improve compared to the previous iteration. This dynamically adjusted (based on the dual function current value) step-size clearly contrasts with the monotone evolution that occurs with the traditional step update formulation.
We can verify that, in all scenarios, the dual function maximum value was reached, and differences reside in the number of iterations necessary to achieve convergence (which was relatively similar). Nevertheless, the proposed algorithm shows a more robust approach when applying the subgradient method since it did not require educated guesses to converge on an acceptable number of iterations.
Concerning the obtained minimum average subgradient norm, Figures 4 and 5 are presented, where the secondary axis (magnified) provides a greater resolution regarding the convergence of each error curve to its recorded minimum value. Additionally, to summarize the respective minimum values achieved by the different step-size update mechanisms with different initial parameters, Table 1 is also presented. Table 1. Minimum average subgradient norm, g(p λ ) /K, and its corresponding iteration, achieved by the different step-size update mechanisms for case study I. As we can see in Figure 4a,b, both processes led to similar final results, with a (best) value close to 23 MW by employing both classical step-size update equations. Nevertheless, the first combination in Table 1 ensured the fastest convergence (186 iterations). As for the proposed Lagrangian multiplier Energies 2020, 13, 6621 12 of 19 update algorithm, we noticed an improved (smaller) error of~21 MW, which in this relatively curtailed error scenario represents an improvement above 8%. These values represent approximately 0.53% of the peak load, which, as mentioned, usually leads to a good solution to the primal problem. Besides, note that once the dual function maximum value or its proximities are reached, the average subgradient norm has not yet reached its minimum value, and it continues to oscillate up and down over the next iterations. This can be explained since small variations in the multipliers can cause large variations in the solutions in terms of the primal problem. Thus, emphasizing the fact that even when reaching the maximum value of the dual function, we may not end up with the best solution in terms of the primal problem.
Moreover, this average subgradient norm oscillation is further accentuated in the adaptative algorithm since, in contrast to the previous two step-size update expressions, it does not have a strictly decreasing behavior. This behavior can lead to a lower convergence speed and, therefore, constitutes a limitation of the proposed adaptive algorithm. Notwithstanding, it is worth noting that a slow convergence is preferred over a premature convergence.
Lastly, regarding the solution in terms of the primal problem ( Figure 6), corresponding to the solution of the dual problem for the (achieved) lowest average subgradient norm value. The same was obtained using the adaptive algorithm, since is easy to understand from previous figures that all primary solutions would be similar, so their presentation is redundant. The algorithm used in solving the primal problem based on Lagrangian relaxation, as we saw earlier, does not lead to an optimal solution. The obtained primal solution reveals the existence of Lagrangian duality. That is, we can say that good results were obtained since the generation profile (solid green line) was almost coincident with the desired load demand profile (dashed red line), but it did not match it completely. After solving the dual problem, several methods have been used to look for feasibility [5]. However, if we succeed when solving the dual problem, then we can also get, in terms of the primal problem, a good solution. In fact, in some cases it is enough to carry out an economic dispatch of thermal units to obtain a (close) feasible strategy, which is exactly what happens in the presented case study, where the difference between the maximum generation capacity (dotted orange line) and the allocated thermal unit generation (dashed magenta line) is sufficient to compensate for the mismatch (deviation) between the load profile and the obtained generation profile. results were obtained since the generation profile (solid green line) was almost coincident with the desired load demand profile (dashed red line), but it did not match it completely. After solving the dual problem, several methods have been used to look for feasibility [5]. However, if we succeed when solving the dual problem, then we can also get, in terms of the primal problem, a good solution.
In fact, in some cases it is enough to carry out an economic dispatch of thermal units to obtain a (close) feasible strategy, which is exactly what happens in the presented case study, where the difference between the maximum generation capacity (dotted orange line) and the allocated thermal unit generation (dashed magenta line) is sufficient to compensate for the mismatch (deviation) between the load profile and the obtained generation profile.

Case Study II
A second case study is considered with the purpose of further validating the proposed adaptive algorithm. For this scenario, the units' cost curves result is not the sole result of the generation costs but is also from additional economic strategies. Furthermore, the peak demand power (delivered to the grid) was 45% higher than in case study I. Thus, it constitutes a case with greater dimensions, with extra hydro and thermal plants considered. However, the parameters required for Equations

Case Study II
A second case study is considered with the purpose of further validating the proposed adaptive algorithm. For this scenario, the units' cost curves result is not the sole result of the generation costs but is also from additional economic strategies. Furthermore, the peak demand power (delivered to the grid) was 45% higher than in case study I. Thus, it constitutes a case with greater dimensions, with extra hydro and thermal plants considered. However, the parameters required for Equations (12) and (13) and the adaptive algorithm are the same as those specified in case study I, so we can compare the performance between these different methods of updating the step-size value.
In the same manner as in case study I, we started by analyzing the evolution of the dual function value, q(λ), and the correspondent step-size values. From looking at Figure 7a,b we can see that the maximum value of the dual function, using both classical step-size update equations, had not been reached. Indeed, when compared to the value obtained using the adaptive algorithm (Figure 8), this value falls short by roughly 0.7% for the parameters illustrated by the solid line, whereas for the ones represented by dashed lines an evident lack of convergence can be spotted.  In turn, the adaptative algorithm updates the step-size value dynamically and adapts to the current dual function value. That is, as mentioned above, if the dual function value improves then the step should be increased; on the contrary, if this value does not improve then the step should be decreased. Thus, in Figure 8 we can see that the step value increased until the dual function value approached its maximum value; thereafter, the step value was modified accordingly, which enabled convergence with the max ( ) in both case studies, and ultimately led to good primal solutions.
These results anticipate a higher minimum value of the average subgradient norm for the classical update expressions, which inevitably compromises the solution in terms of the primal problem. This effect should be particularly pronounced for the parameters represented by the dashed  Figure 7. Evolution of the dual function value (blue plots), q(λ), and its step value (red plots), using heuristic expressed by Equations (12) and (13)   In turn, the adaptative algorithm updates the step-size value dynamically and adapts to the current dual function value. That is, as mentioned above, if the dual function value improves then the step should be increased; on the contrary, if this value does not improve then the step should be decreased. Thus, in Figure 8 we can see that the step value increased until the dual function value approached its maximum value; thereafter, the step value was modified accordingly, which enabled convergence with the max ( ) in both case studies, and ultimately led to good primal solutions.
These results anticipate a higher minimum value of the average subgradient norm for the classical update expressions, which inevitably compromises the solution in terms of the primal  Figure 8. Evolution of the dual function value (blue plot), q(λ), and its step value (red plot), using the adaptative algorithm, with the following parameter values: α 1 = 1.05, α 2 = 1.10.
In turn, the adaptative algorithm updates the step-size value dynamically and adapts to the current dual function value. That is, as mentioned above, if the dual function value improves then the step should be increased; on the contrary, if this value does not improve then the step should be decreased. Thus, in Figure 8 we can see that the step value increased until the dual function value approached its maximum value; thereafter, the step value was modified accordingly, which enabled convergence with the maxq(λ) in both case studies, and ultimately led to good primal solutions.
These results anticipate a higher minimum value of the average subgradient norm for the classical update expressions, which inevitably compromises the solution in terms of the primal problem. This effect should be particularly pronounced for the parameters represented by the dashed plots in Figure 7a,b. Therefore, a clear contrast to the results from the first case is established, where the maximum value of the dual function is relatively similar for the different mechanisms used to update the Lagrange multipliers (as shown in Table 1).
As expected, this preliminary assessment is fully backed by examining the evolution of the average subgradient norm, as shown below in Figure 9a,b, as well as in the summary Table 2, which presents the respective minimum values achieved by the different step-size update mechanisms with different initial parameters, represented by each individual error plot in Figures 9 and 10. The results reveal sizeable minimum average subgradient norm values, even for the scenarios where the dual function closed in on its maximum value (~99.3% of the maxq(λ)), presenting values of 318 MW and 317 MW, respectively. These results differ from the ones achieved using the adaptive algorithm (Figure 10), where a much improved g(p λ ) /K minimum value was recorded, 38 MW (~8.3 times smaller), i.e., roughly representing only 0.64% of the peak power demand versus >5% with the classic heuristics.

Min ‖ ( )‖ [MW] Iteration
Step-Size Update Mechanism     Furthermore, we can also notice that some of the parameters led to a fast but also premature convergence when using traditional heuristics, contrasting with the slower convergence exhibited by the proposed adaptative algorithm due to a more refined step-size update. Thereby, the inferences highlighted for case study I are confirmed. Despite the significant improvement by several orders of magnitude compared to the more modest improvement in case study I, this improved error value is Furthermore, we can also notice that some of the parameters led to a fast but also premature convergence when using traditional heuristics, contrasting with the slower convergence exhibited by the proposed adaptative algorithm due to a more refined step-size update. Thereby, the inferences highlighted for case study I are confirmed. Despite the significant improvement by several orders of magnitude compared to the more modest improvement in case study I, this improved error value is still slightly above the one obtained in the previous case study, which translates to added difficulty when considering additional dispatch strategies. These characteristics will ultimately impact the task of obtaining a feasible solution in terms of the primal problem. Nevertheless, the proposed adaptative algorithm was able to converge to a good solution, and, in comparison, it will result in a smaller duality gap. Besides, it reinforces the mentioned need to thoroughly adjust the control parameters or rely upon educated guesses in order to achieve a good result when using classic heuristics versus the proposed automated algorithm.
Moving on towards the primal problem solution, Figure 11 is presented, revealing a higher peak demand as well as a larger usage of the thermal capacity in direct comparison to its counterpart in case study I, Figure 6. Further, it can be noted that the hydro generation follows the load profile on a smaller scale, and the hydro-thermal generation profile obtained moves further away from the load profile. Thus, the economic dispatch of thermal units is not sufficient for the solution, in terms of the primal problem, to be feasible. In other words, contrary to the first case study, the difference between the maximum generation capacity (dotted orange plot) and the allocated thermal units' generation (dashed magenta plot), illustrated in Figure 11, is insufficient to address the mismatch between the load profile and the obtained generation profile. Additionally, we can see that around 1-3 h and 147-149 h the hydro generation registered a negative value, which is explained by a sporadic period of power consumption (pumping) during an off-peak occurrence.
For this reason, it makes sense to present, for this case study, the mismatch between the load profile and the obtained generation profile as illustrated in Figure 12, where we can observe the non-feasibility after the economic dispatch for a few hours. This behavior was more significant during the last considered day (between 144 and 168 h), which is a weekend day, reaching a maximum above 65 MW at time k = 149 h. This translates a mismatch around 1% of the peak demand, and it is justified by the limited availability of thermal generation (orange and magenta lines in Figure 11 almost overlapping during this period). Nonetheless, on average, the primal problem mismatch never exceeded 0.5% of the peak load demand.
primal problem, to be feasible. In other words, contrary to the first case study, the difference between the maximum generation capacity (dotted orange plot) and the allocated thermal units' generation (dashed magenta plot), illustrated in Figure 11, is insufficient to address the mismatch between the load profile and the obtained generation profile. Additionally, we can see that around 1-3 h and 147-149 h the hydro generation registered a negative value, which is explained by a sporadic period of power consumption (pumping) during an off-peak occurrence. For this reason, it makes sense to present, for this case study, the mismatch between the load profile and the obtained generation profile as illustrated in Figure 12, where we can observe the nonfeasibility after the economic dispatch for a few hours. This behavior was more significant during the last considered day (between 144 and 168 h), which is a weekend day, reaching a maximum above 65 MW at time = 149 h. This translates a mismatch around 1% of the peak demand, and it is justified by the limited availability of thermal generation (orange and magenta lines in Figure 11 almost overlapping during this period). Nonetheless, on average, the primal problem mismatch never exceeded 0.5% of the peak load demand.

Conclusions
In this paper, we explored the numerical performance of a novel Lagrangian multiplier update algorithm for short-term hydro-thermal coordination. The step-size update mechanism is a vital component for these methods, and classic approaches are heavily dependent upon a user's experience and fine-tuning procedures, i.e., selecting the appropriate parameters. The proposed algorithm had an important advantage of not requiring parameter choices based on experimentation, and it is subsequently compared against two classical update expressions. After a results assessment of both case studies, we could infer that the adaptive algorithm produced considerably improved dual problem solutions, seen through the percentage gains in terms of average subgradient norm and the respective ratio between the average subgradient norm and peak demand. This ultimately means an improved upper bound for the primal problem solution. Moreover, for most hours it led to feasible primal solutions, which in turn translates into more cost-effective dispatch decisions. The significance of the obtained results is magnified, especially when considering the differences in data prediction, such as the expected inflows.
These improvements proved the algorithm's ability to dynamically adapt the step value according to the dual function value. We can see that during the initial iterations, the step value is incremented until approaching the vicinities of the dual function maximum. From then onwards, the step evolves dynamically and adapts to the current dual function value, allowing convergence to the maximum dual function value and to the average subgradient norm, which translates to a feasible or a near-feasible primal solution. On the contrary, when using the classical update step-size update equations, it is necessary to rely on educated guesses or perform adjustments over the control parameters (as illustrated by case study II), through a trial-and-error process, in order to obtain a solution similar to the one achieved by the new adaptative algorithm.

Conclusions
In this paper, we explored the numerical performance of a novel Lagrangian multiplier update algorithm for short-term hydro-thermal coordination. The step-size update mechanism is a vital component for these methods, and classic approaches are heavily dependent upon a user's experience and fine-tuning procedures, i.e., selecting the appropriate parameters. The proposed algorithm had an important advantage of not requiring parameter choices based on experimentation, and it is subsequently compared against two classical update expressions. After a results assessment of both case studies, we could infer that the adaptive algorithm produced considerably improved dual problem solutions, seen through the percentage gains in terms of average subgradient norm and the respective ratio between the average subgradient norm and peak demand. This ultimately means an improved upper bound for the primal problem solution. Moreover, for most hours it led to feasible primal solutions, which in turn translates into more cost-effective dispatch decisions. The significance of the obtained results is magnified, especially when considering the differences in data prediction, such as the expected inflows.
These improvements proved the algorithm's ability to dynamically adapt the step value according to the dual function value. We can see that during the initial iterations, the step value is incremented until approaching the vicinities of the dual function maximum. From then onwards, the step evolves dynamically and adapts to the current dual function value, allowing convergence to the maximum dual function value and to the average subgradient norm, which translates to a feasible or a near-feasible primal solution. On the contrary, when using the classical update step-size update equations, it is necessary to rely on educated guesses or perform adjustments over the control parameters (as illustrated by case study II), through a trial-and-error process, in order to obtain a solution similar to the one achieved by the new adaptative algorithm.
Finally, we see that for a high-dimensional optimization problem, the computational burden is not exaggerated and not dependent upon initial guesses, especially considering that a weekly unit commitment is performed.