1. Introduction
As one of the most stable forms of renewable energy, hydropower can be commercially developed and utilized on a large scale [1,2,3,4]. With the rapid development of cascade reservoirs, cascade reservoirs operation optimization (CROO) is attracting the attention of a growing number of scholars worldwide [5,6]. The global optimal solution of CROO is of great practical significance for developing long-term optimal operation schemes and rules for cascade reservoirs [7,8], so it is necessary to solve the CROO problem correctly and reasonably. However, CROO is a multivariate, coupled, and complicated nonlinear programming problem [9], which must account not only for the hydraulic connection between upstream and downstream reservoirs, but also for a large number of constraints. It is characterized by high dimensionality, strong coupling, and uncertainty [10,11,12]; it is difficult to solve and has long been a focus of research [13].
A wide range of methods have been proposed for solving the CROO problem over the past decades, mainly conventional optimization algorithms and various heuristic random search algorithms [14,15]. The conventional methods include Linear Programming (LP) [16], Nonlinear Programming (NLP) [17], Lagrangian Relaxation (LR) [18], Quadratic Programming (QP) [19], and Multi-dimensional Dynamic Programming (MDP) [20,21]. They are all elitist algorithms, and have achieved varying degrees of success in solving CROO problems. The modern heuristic random search algorithms include the Genetic Algorithm (GA) [22], Particle Swarm Optimization (PSO) [23], Ant Colony Optimization (ACO) [24], Fuzzy Neural Network (FNN) [25], and the Differential Evolution algorithm (DE) [26,27,28]. These have also been used extensively to solve the CROO problem, with good results.
Among the above algorithms, MDP is a powerful optimization technique for CROO problems. Its most significant characteristic is that it can obtain a global optimal solution and has no requirement for initial trajectories. Moreover, MDP places no smoothness or convexity requirements on the CROO problem, which makes it highly popular among the conventional optimization techniques. Many evolutionary algorithms have been proved to possess global convergence, but because they are stochastic, they cannot guarantee a global optimum within a finite number of iterations [29]. However, although MDP can solve CROO problems with global convergence, the growth of its computational load with the problem dimension, known as the “curse of dimensionality”, limits its application to CROO problems, especially for large-scale hydropower systems [30].
On the whole, there are three ways to avoid or alleviate the “curse of dimensionality” while guaranteeing the global convergence of MDP. The first is to improve MDP itself, on the premise of guaranteeing global convergence, so as to shorten the run-time. The second is to implement parallel computing using multi-core processing technology. The third is to combine MDP with other algorithms that have a high computational efficiency, so that each makes up for the deficiencies of the other. Related research on each of these is described below.
In terms of algorithm improvements, a variety of improved Dynamic Programming (DP) algorithms have emerged and been used to avoid or alleviate the “curse of dimensionality”, such as the Progressive Optimality Algorithm (POA) [31], Dynamic Programming with Successive Approximation (DPSA) [32], Discrete Differential Dynamic Programming (DDDP) [33], and Incremental Dynamic Programming (IDP) [34]. These improved algorithms alleviate the “curse of dimensionality” to a certain degree, but defects remain. For example, POA is sensitive to the initial trajectories, and may converge to a local optimum in some situations. DPSA and DDDP are difficult to apply to non-convex problems, and may also lead to a local optimum. Other related research has also been done. For example, Mousavi et al. [35] reduced the run-time of an MDP model for a multi-reservoir system by diagnosing infeasible storage combinations and removing them from further computations; this works well for a limited number of hydropower stations, but the time consumption is still intolerable when the hydropower system becomes large. By taking advantage of the monotonic relationship between reservoir storage volume and the optimal release decision, Zhao et al. [36] proposed an improved DP model for reservoir operation optimization (ROO); however, the model can only be applied to reservoir operation with a concave objective function. Ji et al. [37] proposed a novel multi-layer nested multi-dimensional dynamic programming (MNDP) algorithm based on a multi-layer nested structure, but it mainly addressed the computer memory and computational complexity problems of MDP in CROO, and the computing time was not reduced.
In recent years, parallel computing has been widely applied in the field of water resources [38,39]. In the field of CROO and MDP applications in particular, there are several successful examples. Dias et al. [40] improved the performance of stochastic dynamic programming by using parallel processing techniques, and successfully applied it to the long-term operation planning of an electrical power system. Li et al. [41] developed a parallel MDP algorithm to optimize the joint operation of a multi-reservoir system based on a distributed memory architecture and the Message Passing Interface (MPI) protocol. Ji et al. [37] applied parallel computing to the proposed MNDP algorithm with good results. In order to evaluate the parallel performance of different parallel modes, Zhang et al. [11] proposed three kinds of parallel MDP algorithms. On the whole, these studies demonstrate that parallel computing with a proper parallelization strategy can substantially reduce run-time in optimization or simulation. However, parallel efficiency decreases gradually as the number of cores increases; when the number of cores is large, the parallel efficiency is generally very low.
In terms of hybrid applications, i.e., the combined use of MDP with other algorithms, much related research has been carried out. For example, Tospornsampan et al. [42] proposed a combination of the GA and DDDP approaches (GA-DDDP) to optimize the operation of a multiple reservoir system; its significant advantage is economizing on computational resources, as GA-DDDP does not require optimizing parameters or deriving feasible initial trial trajectories. Lantoine and Russell [43] presented a hybrid variant of the differential dynamic programming (HDDP) algorithm to solve constrained nonlinear optimal control problems; the hybrid method incorporates nonlinear mathematical programming techniques to increase its efficiency. Zhang et al. [23] combined parallel deterministic dynamic programming with a hierarchical adaptive genetic algorithm to solve an ROO problem. However, the existing hybrid applications have mainly focused on improved dynamic programming algorithms (such as DDDP and IDP) or intelligent optimization algorithms, and very few involve the baseline MDP, which can converge to the optimal solution without additional requirements on smoothness, convexity, or initial trajectories. Therefore, research on dimension reduction methods for MDP is of practical significance.
In order to effectively avoid the “curse of dimensionality” of MDP while guaranteeing global convergence, this paper proposes two new dimension reduction methods for MDP based on an analysis of its structure and characteristics: a hybrid algorithm of MDP and POA (named MDP-POA), and an improved MDP (named IMDP). A detailed case study of the Qingjiang cascade reservoirs in China is provided, and to evaluate the performance of the proposed methods, the results of MDP-POA, IMDP, and MDP are compared in terms of power generation and run-time. In addition, the varying characteristics of the optimal solution of MDP-POA and IMDP under different discretization levels are analyzed, and recommended discretization levels or computing schemes are provided for different conditions. The remainder of this paper is organized as follows.
Section 2 presents the formulation of CROO problems.
Section 3 introduces the principle of MDP and the POA, and presents the optimization principle of MDP-POA and IMDP.
Section 4 shows the application of MDP-POA and IMDP in the cascade reservoirs of Qingjiang River in China, and the results are analyzed and discussed in this section.
Section 5 presents the conclusions.
2. Formulation of CROO Problems
Power generation is a significant benefit derived from a cascade reservoirs system. CROO aims at maximizing the power generation by developing an optimal plan over the entire planning horizon, while satisfying all physical and operational constraints. For a given operation of the hydropower stations over $T$ stages, the objective can be written as follows.
$$E = \max \sum_{t=1}^{T} \sum_{i=1}^{n} N_i^t \, \Delta t \tag{1}$$
$$N_i^t = K_i \, q_i^t \, H_i^t \tag{2}$$
where $E$ is the total power generation over the entire planning horizon, unit: kWh; $T$ is the number of stages over the entire planning horizon; $N_i^t$ is the output of the $i$th hydropower station in the $t$th stage, unit: kW, and the reservoir indexes from upstream to downstream are 1, 2, ..., $n$ in this paper; $K_i$ is the efficiency coefficient of the $i$th hydropower station; $q_i^t$ is the outflow through the turbines, defined under the water volume balance constraint; $H_i^t$ is the average water head of the $i$th hydropower station in the $t$th stage, unit: m. For a single reservoir, it is determined by the beginning water level $h_t$ of the $t$th stage, the end water level $h_{t+1}$, and the downstream tail water level $h_{down}$, i.e., $H_t = (h_t + h_{t+1})/2 - h_{down}$, where $h_{down}$ is determined by the total outflow $Q_t$ of the reservoir in the $t$th stage, and $Q_t$ is in turn determined by $h_t$ and $h_{t+1}$. In addition, the reservoir water level $h$ has a one-to-one mapping relationship with the reservoir volume $V$, namely the water level–volume curve of the reservoir; $\Delta t$ is the duration of a stage, unit: h. CROO is subject to the following equality and inequality constraints.
(1) Water volume balance:
$$V_i^{t+1} = V_i^t + \left(I_i^t - Q_i^t - EV_i^t\right)\Delta t, \qquad Q_i^t = q_i^t + s_i^t \tag{3}$$
where $q_i^t$ is the outflow through the turbines of the $i$th reservoir in the $t$th stage, unit: m³/s; $s_i^t$ is the abandoned water outflow through the flood outflow gate of the $i$th reservoir in the $t$th stage, unit: m³/s; the total outflow $Q_i^t$ of the $i$th reservoir in the $t$th stage consists of $q_i^t$ and $s_i^t$, unit: m³/s; $I_i^t$ is the inflow of the $i$th reservoir in the $t$th stage, unit: m³/s; $EV_i^t$ is the evaporation capacity of the $i$th reservoir in the $t$th stage, unit: m³/s; and $V_i^t$ is the storage volume of the $i$th reservoir in the $t$th stage, unit: m³ (with $\Delta t$ expressed in seconds in this equation). Because we study the mid- and long-term operation of cascade reservoirs in this paper, the delay of water flow between two reservoirs is not considered.
(2) Reservoir volume limits:
$$V_{i,\min}^t \le V_i^t \le V_{i,\max}^t \tag{4}$$
where $V_{i,\min}^t$ is the lower limit of $V_i^t$, which usually corresponds to the dead level, unit: m³; $V_{i,\max}^t$ is the upper limit of $V_i^t$, which usually corresponds to the flood control level in flood season and the normal level in dry season, unit: m³.
(3) Outflow limits required for the comprehensive utilization of water resources downstream:
$$Q_{i,\min}^t \le Q_i^t \le Q_{i,\max}^t \tag{5}$$
where $Q_{i,\min}^t$ is the lower limit of $Q_i^t$, which is usually determined by the ecological flow of the downstream river, unit: m³/s; and $Q_{i,\max}^t$ is the upper limit of $Q_i^t$, which is usually determined by the channel capacity of the downstream river, unit: m³/s.
(4) Power generation limits:
$$N_{i,\min}^t \le N_i^t \le N_{i,\max}^t \tag{6}$$
where $N_{i,\min}^t$ is usually determined by the allowed minimum output, unit: kW; and $N_{i,\max}^t$ is usually determined by the installed capacity and expected output of the hydropower station, unit: kW.
(5) Boundary conditions limits:
$$V_i^1 = V_{i,begin}, \qquad V_i^{T+1} = V_{i,end} \tag{7}$$
where $V_i^1$ is the storage volume of the $i$th reservoir at the beginning of the first stage, unit: m³; $V_{i,begin}$ is the storage volume of the $i$th reservoir at the beginning of the entire planning horizon, unit: m³; $V_i^{T+1}$ is the storage volume of the $i$th reservoir at the end of the $T$th stage, unit: m³; and $V_{i,end}$ is the storage volume of the $i$th reservoir at the end of the entire planning horizon, unit: m³.
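As a rough illustration of how the objective and constraints above fit together for one reservoir in one stage, the following Python sketch evaluates the water balance, the average head, and the output, and then checks the bounds. The function names, the dictionary of limits, and the sample numbers are hypothetical and are not taken from the case study. The output relation (efficiency coefficient times turbine flow times head) is the usual hydropower formula and is assumed here, and the stage duration is converted from hours to seconds so that flows in m³/s yield volumes in m³.

```python
SECONDS_PER_HOUR = 3600.0


def end_storage(v_begin, inflow, turbine_flow, spill, evaporation, dt_hours):
    """Water balance: end-of-stage storage (m^3) from flows given in m^3/s."""
    total_outflow = turbine_flow + spill
    net_flow = inflow - total_outflow - evaporation
    return v_begin + net_flow * dt_hours * SECONDS_PER_HOUR


def average_head(h_begin, h_end, h_down):
    """Average head (m): mean of begin/end water levels minus tail water level."""
    return (h_begin + h_end) / 2.0 - h_down


def output_kw(k, turbine_flow, head):
    """Assumed output relation: efficiency coefficient * turbine flow * head."""
    return k * turbine_flow * head


def stage_is_feasible(v_end, total_outflow, output, limits):
    """Check the volume, outflow, and output bounds for one stage.

    `limits` is a hypothetical dict mapping a name to a (lower, upper) pair.
    """
    checks = (("volume", v_end), ("outflow", total_outflow), ("output", output))
    return all(limits[name][0] <= value <= limits[name][1]
               for name, value in checks)


# Illustrative numbers only.
v_next = end_storage(1.0e9, inflow=500.0, turbine_flow=400.0, spill=50.0,
                     evaporation=1.0, dt_hours=24.0)
head = average_head(h_begin=200.0, h_end=198.0, h_down=120.0)
power = output_kw(k=8.5, turbine_flow=400.0, head=head)
energy_kwh = power * 24.0  # output (kW) times stage duration (h)
```

In a full model the tail water level would itself be looked up from the total outflow, and the water levels from the storage volumes, through the reservoir curves; those lookups are omitted here.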
3. Methodologies
3.1. MDP and POA
DP can be used effectively to solve multi-stage decision-making problems recursively, and the ROO problem can be regarded as such a problem by dividing the reservoir operation into sub-operations on the basis of operation intervals [44,45]. Applying DP to ROO problems involves a reverse recursion procedure and a chronological order recursion procedure. In the reverse recursion procedure, the cumulative output or power generation is calculated starting from the last stage and moving back to the first stage; the optimal storage water level trajectory is then recovered by the chronological order recursion procedure. The recursive equation for the $t$th stage of computation is as follows [11].
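$$F_t^*(S_b) = \max_{D_t \in \Omega_t} \left\{ N_t(S_b, D_t) + F_{t+1}^*(S_e) \right\} \tag{8}$$
where $S_t$ is the state variable; $D_t$ is the decision variable, which is determined by the beginning state $S_b$ and the end state $S_e$; $\Omega_t$ is the set of decision variables in the $t$th stage; $N_t(S_b, D_t)$ is the output in the $t$th stage; $F_t^*(S_b)$ is the optimal cumulative output of the beginning state $S_b$ at the $t$th stage, unit: kW; and $F_{t+1}^*(S_e)$ is the optimal cumulative output of the beginning state $S_e$ at the ($t$ + 1)th stage, unit: kW. The optimal cumulative output mentioned above means the sum of the output from the present stage $t$ to the last stage $T$ in the optimal output process.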
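The reverse recursion and chronological order recursion described above can be sketched for a single reservoir as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `solve_dp`, the callback `stage_output`, and the toy output function are hypothetical names, and the toy output (release times a head that grows with storage) merely stands in for the real output calculation.

```python
def solve_dp(levels, num_stages, start_idx, end_idx, stage_output):
    """Backward recursion plus forward trace on a fixed storage grid.

    stage_output(t, s_begin, s_end) returns the stage output, or None when
    the transition violates a constraint (hypothetical callback).
    """
    neg_inf = float("-inf")
    m = len(levels)
    # value[t][b]: best cumulative output from stage t to the last stage,
    # when stage t begins in state b.
    value = [[neg_inf] * m for _ in range(num_stages + 1)]
    best_next = [[None] * m for _ in range(num_stages)]
    value[num_stages][end_idx] = 0.0  # boundary condition at the horizon end

    # Reverse recursion: from the last stage back to the first.
    for t in range(num_stages - 1, -1, -1):
        for b in range(m):
            for e in range(m):
                if value[t + 1][e] == neg_inf:
                    continue  # end state cannot reach the final boundary
                gain = stage_output(t, levels[b], levels[e])
                if gain is None:
                    continue  # infeasible transition
                total = gain + value[t + 1][e]
                if total > value[t][b]:
                    value[t][b] = total
                    best_next[t][b] = e

    if value[0][start_idx] == neg_inf:
        raise ValueError("no feasible trajectory from the given start state")

    # Chronological recursion: follow the stored decisions forward.
    path = [start_idx]
    for t in range(num_stages):
        path.append(best_next[t][path[-1]])
    return value[0][start_idx], [levels[i] for i in path]


def toy_output(t, s_begin, s_end, inflow=1.0):
    """Toy stage output: release times a head that grows with storage."""
    release = inflow + s_begin - s_end
    if release < 0:
        return None
    return release * ((s_begin + s_end) / 2.0 + 1.0)


best, trajectory = solve_dp([0, 1, 2], num_stages=2, start_idx=1, end_idx=1,
                            stage_output=toy_output)
```

With this toy setup the recursion prefers to fill the reservoir in the first stage and release at a higher head in the second, which a stage-by-stage greedy choice would miss.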
Many variables and constraints are integrated into the procedure of solving a CROO problem, because a CROO problem often involves two or more reservoirs. If the number of discrete points of storage volume for each reservoir is $M$ for a cascade system consisting of $n$ reservoirs, then $M^n$ combinations of these discrete points can be obtained. With reference to the reverse recursion procedure and chronological order recursion procedure of DP in an ROO problem, we can get the optimal combination of storage volumes for each stage. Taking a hydropower station system consisting of $n$ reservoirs as an example, the combination principle of discrete storage volumes is shown in Figure 1, and the recursive equation of MDP can be formulated as follows.
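$$F_t^*(\mathbf{S}_t) = \max_{\mathbf{D}_t \in \Omega_t} \left\{ \sum_{i=1}^{n} N_i^t(\mathbf{S}_t, \mathbf{D}_t) + F_{t+1}^*(\mathbf{S}_{t+1}) \right\} \tag{9}$$
where $\mathbf{D}_t = (D_1^t, D_2^t, \ldots, D_n^t)'$ is the decision variable vector; and $\mathbf{S}_t = (S_1^t, S_2^t, \ldots, S_n^t)'$ is the state variable vector. In formula (9), because of the discretization, each of the vectors $\mathbf{D}_t$, $\mathbf{S}_t$, and $\mathbf{S}_{t+1}$ is equivalent to a combination of one discrete value per reservoir; $F_t^*(\mathbf{S}_t)$ is the optimal cumulative output of a storage volume combination at the $t$th stage, unit: kW; and $F_{t+1}^*(\mathbf{S}_{t+1})$ is the optimal cumulative output of a storage volume combination at the ($t$ + 1)th stage, unit: kW.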
Detailed expositions of the calculation principles and steps of MDP are given by Ji et al. [37] and Zhang et al. [11].
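To make the growth in the number of combinations concrete, the following short Python sketch enumerates the storage combinations of a small system and counts, as a rough upper bound, the pairs of beginning and end combinations that a full traversal would examine in one stage. The function names are hypothetical, and the pair count ignores any feasibility screening, so it is an assumption about worst-case work rather than a measured cost.

```python
from itertools import product


def storage_combinations(levels_per_reservoir):
    """All combinations of one discrete storage level per reservoir."""
    return list(product(*levels_per_reservoir))


def pair_evaluations_per_stage(m, n):
    """Upper bound on (begin, end) combination pairs examined in one stage."""
    return (m ** n) ** 2


# Two reservoirs with three discrete levels each: 3^2 = 9 combinations.
combos = storage_combinations([[0, 1, 2], [0, 1, 2]])

# With 100 discrete points per reservoir, the per-stage pair count grows
# from 10^4 (one reservoir) to 10^12 (three reservoirs).
growth = {n: pair_evaluations_per_stage(100, n) for n in (1, 2, 3)}
```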
The POA is a powerful improved dynamic programming algorithm for cascade reservoirs operation optimization. It was first proposed in 1975 by the Canadian researchers N. G. F. Sancho and H. R. Howson for solving multi-stage dynamic programming problems. It transforms a complex multi-stage decision problem into a series of two-stage decision-making problems, which simplifies the solution of multi-dimensional problems and decreases their computational complexity. The calculation steps of the POA in solving a CROO model are as follows.
- Step 1
Obtain the initial operation trajectories of storage volume by other conventional methods; these can be represented as {$V_0^{(0)}$, $V_1^{(0)}$, …, $V_{Mum}^{(0)}$}, where $Mum$ is equal to the number of reservoirs multiplied by the number of operation stages in a year, i.e., $Mum = nT$.
- Step 2
Within the permitted scope, discretize the point $V_j^{(0)}$ into {$V_{j,1}$, $V_{j,2}$, …, $V_{j,M}$}, where $M$ is the number of discrete points for an operation point on the initial operation trajectories of storage volume, and the values of the other points on the initial trajectories are fixed.
- Step 3
For each of the discrete points of $V_j^{(0)}$, i.e., {$V_{j,1}$, $V_{j,2}$, …, $V_{j,M}$}, run the simulation calculation using long series runoff data. Find the point $V_{j,k}$ that maximizes the power generation over all operation stages of the cascade reservoirs, then update $V_j^{(0)}$ with $V_{j,k}$, and save it as $V_j^{(1)}$. The power generation in the simulation calculation can be obtained by formula (10), in which $k$ represents the serial number of the discrete points.
- Step 4
In the same way, apply Steps 2 and 3 to the other points on the initial operation trajectories, ending up with the new operation trajectories {$V_0^{(1)}$, $V_1^{(1)}$, …, $V_{Mum}^{(1)}$}.
- Step 5
Compare the power generation of the initial operation trajectories {$V_0^{(0)}$, $V_1^{(0)}$, …, $V_{Mum}^{(0)}$} and the new operation trajectories {$V_0^{(1)}$, $V_1^{(1)}$, …, $V_{Mum}^{(1)}$}; if the difference meets the accuracy requirement, stop the calculation; otherwise, take {$V_0^{(1)}$, $V_1^{(1)}$, …, $V_{Mum}^{(1)}$} as the new initial operation trajectories, and repeat Steps 2 to 4.
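The five steps above can be sketched in Python for a single trajectory as follows. This is a minimal sketch under stated assumptions: `poa` and its parameters are hypothetical names, the callback `objective` stands in for the simulation calculation over the runoff series, the end points are treated as fixed, and each point is discretized over the whole permitted range.

```python
def poa(trajectory, lower, upper, num_points, objective,
        tol=1e-9, max_sweeps=100):
    """Progressive optimality sweep over the interior points of a trajectory.

    objective(trajectory) returns the total generation of a full trajectory
    (hypothetical callback standing in for the simulation calculation).
    """
    current = list(trajectory)           # Step 1: initial trajectory
    best = objective(current)
    step = (upper - lower) / (num_points - 1)
    for _ in range(max_sweeps):
        before_sweep = best
        # Steps 2 to 4: visit each point in turn, all other points fixed.
        for j in range(1, len(current) - 1):
            for k in range(num_points):
                trial = list(current)
                trial[j] = lower + k * step
                value = objective(trial)
                if value > best:
                    best, current = value, trial
        # Step 5: stop when a whole sweep brings no meaningful improvement.
        if best - before_sweep <= tol:
            break
    return current, best


def toy_generation(trajectory, inflow=1.0):
    """Toy total generation: release times a head that grows with storage."""
    total = 0.0
    for v_begin, v_end in zip(trajectory, trajectory[1:]):
        release = inflow + v_begin - v_end
        if release < 0:
            return float("-inf")  # infeasible trajectory
        total += release * ((v_begin + v_end) / 2.0 + 1.0)
    return total


refined, generation = poa([1, 1, 1], lower=0.0, upper=2.0, num_points=5,
                          objective=toy_generation)
```

Because each update changes one point while its neighbours stay fixed, the result depends on the starting trajectory, which is the sensitivity to initial trajectories noted in the introduction.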
3.2. Hybrid Application of MDP and POA
For a large-scale cascade hydropower system, the computational load of MDP at a high discrete degree is enormous and leads to a very long run-time, while at a low discrete degree MDP cannot guarantee the accuracy of the final solution. As described in the introduction, the POA is sensitive to the initial trajectories, but it has a strong local search ability and a fast computing speed. Therefore, if we take the global optimal trajectories of MDP at a low discrete degree as the initial trajectories of the POA, and then optimize these trajectories further by the POA at a high discrete degree, we can avoid the defects of both algorithms while exploiting their advantages. That is to say, we first obtain relatively good initial trajectories for the POA by MDP in a short time, and then refine them by the POA, which makes full use of its local search ability and takes only a very short time. This is the basic principle of the hybrid algorithm proposed in this paper, i.e., MDP-POA.
Taking the optimization calculation of one reservoir as an example, the principle of MDP-POA is shown in Figure 2; for cascade reservoirs, the principle is basically the same. In Figure 2, $M_1$ is the number of discrete points for MDP optimization, and it usually takes a small value, such as 20 or 30. $M_2$ is the number of discrete points for POA optimization, and it usually takes a large value, such as 100 or 200.
The calculation steps of MDP-POA in solving a CROO model are similar to those of the POA. The difference lies in the first step: the POA obtains the initial operation trajectories of the storage volume by conventional methods, whereas MDP-POA obtains the initial optimal operation trajectories by MDP with a low discrete degree (such as 10, 20, or 30).
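The two-phase idea can be sketched in Python as below. Several simplifications are assumptions and should be read as such: the coarse global phase is an exhaustive search over the coarse grid, used here as a stand-in for MDP (on a small grid it returns the same coarse optimum that the recursion would), all function names are hypothetical, and the toy objective and the very small grid sizes only illustrate the mechanics.

```python
from itertools import product


def grid(lower, upper, num_points):
    """Evenly spaced storage values over [lower, upper]."""
    step = (upper - lower) / (num_points - 1)
    return [lower + k * step for k in range(num_points)]


def coarse_search(lower, upper, m1, v_start, v_end, num_interior, objective):
    """Global search on the coarse grid (exhaustive stand-in for MDP)."""
    best_traj, best_val = None, float("-inf")
    for interior in product(grid(lower, upper, m1), repeat=num_interior):
        traj = [v_start, *interior, v_end]
        val = objective(traj)
        if val > best_val:
            best_traj, best_val = traj, val
    return best_traj, best_val


def poa_refine(seed, lower, upper, m2, objective, max_sweeps=50):
    """POA sweeps on the fine grid, starting from the coarse optimum."""
    current, best = list(seed), objective(seed)
    for _ in range(max_sweeps):
        improved = False
        for j in range(1, len(current) - 1):  # end points stay fixed
            for candidate in grid(lower, upper, m2):
                trial = list(current)
                trial[j] = candidate
                val = objective(trial)
                if val > best:
                    current, best, improved = trial, val, True
        if not improved:
            break
    return current, best


def mdp_poa(lower, upper, m1, m2, v_start, v_end, num_interior, objective):
    """Coarse global phase followed by fine local refinement."""
    seed, _ = coarse_search(lower, upper, m1, v_start, v_end,
                            num_interior, objective)
    refined, best = poa_refine(seed, lower, upper, m2, objective)
    return seed, refined, best


# Toy objective: closeness to a hypothetical ideal trajectory.
TARGET = [1.0, 2.5, 1.25, 1.0]


def toy_objective(traj):
    return -sum((v - t) ** 2 for v, t in zip(traj, TARGET))


seed, refined, best = mdp_poa(0.0, 4.0, m1=3, m2=17, v_start=1.0, v_end=1.0,
                              num_interior=2, objective=toy_objective)
```

The coarse phase costs a number of evaluations that grows with the coarse grid size only, while the fine grid is visited one point at a time in the refinement phase.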
3.3. Improved MDP
MDP suffers from the “curse of dimensionality” mainly because it is a traversal optimization process, which means that all combinations of storage volume in the feasible region are calculated. The computational load of MDP is therefore usually enormous when the number of discrete points is large. However, what we ultimately need is just one optimal storage volume process line; not all of the calculations in the traversal optimization are needed in the end. Therefore, reducing the unnecessary calculations is the key to shortening the run-time of MDP, and the essence of doing so is to reduce the number of discrete combinations of storage volume in the calculation.
On the whole, there are two ways to decrease the number of discrete combinations. The first is to perform the calculation with a low discrete degree, which reduces the total number of discrete combinations but affects the precision of the final solution. The second is to perform the calculation with a high discrete degree, but to first remove the region that is unlikely to contain the optimal solution. However, the optimal result of MDP at a given discrete degree depends on all of the discrete combination calculations, so before the traversal calculation we cannot eliminate any discrete combinations merely because we think they are unlikely to be optimal; we can eliminate them only after this has been confirmed. Therefore, we first perform the calculation with a low discrete degree, which identifies in a short time the region that is generally unlikely to contain the optimal solution, remove that region, and then carry out further optimization of the remaining region with a relatively high discrete degree, so as to ensure the accuracy of the algorithm. The remaining feasible region is actually a corridor constructed around the optimal trajectories of MDP with a low discrete degree, and the further optimization is performed within this corridor by MDP with a relatively high discrete degree. This is the basic principle of the improved MDP proposed in this paper, i.e., IMDP.
Taking the optimization calculation of one reservoir as an example, the principle of IMDP is shown in Figure 3; for cascade reservoirs, the principle is basically the same. In Figure 3, the corridor size can be two or more discrete units around the initial optimal trajectories, and the corridor optimization can be performed once or several times.
There are two differences between IMDP and Incremental Dynamic Programming (IDP). The first is that IDP requires an initial operation trajectory, whereas IMDP directly uses the optimal solution of MDP at a low discrete degree as the initial operation trajectory for the further optimization at a high discrete degree, and so does not need to find an initial solution by other methods. The second is that IMDP generally requires only a single corridor optimization to achieve a good result, while IDP may require multiple iterations, and so its computation time is generally longer.
According to the above principle of IMDP, we can construct the corridor from the optimal result of MDP at a low discrete degree (such as 10, 20, or 30), and set up some IMDP schemes, for example, the schemes “IMDP: 10 × (20/2)”, “IMDP: 10 × (40/4)”, “IMDP: 20 × (10/2)”, and “IMDP: 20 × (20/4)” shown in Figure 4, each of which has a discrete degree equivalent to that of MDP with 100 discrete points. Taking the scheme “IMDP: 10 × (20/2)” as an example, it has 10 discrete points in the first-stage calculation and 20 discrete points in the second-stage calculation, and the 20 discrete points are distributed uniformly over two discrete units, as shown in the first panel of Figure 4. So, the discrete degrees of the schemes “MDP: 100” and “IMDP: 10 × (20/2)” are the same, or at least the discrete degree of the IMDP scheme is not lower than that of MDP. The other IMDP schemes in Figure 4 are interpreted in the same way.
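The scheme notation can be checked with a few lines of Python arithmetic. The functions below are hypothetical helpers: `equivalent_points` reads a scheme “IMDP: m1 × (m2/units)” as m1 coarse points refined by m2 points spread over the given number of coarse units, and the pair counts are only a rough, assumed measure of per-stage work (all pairs of beginning and end storage combinations, before any feasibility screening) for an illustrative system of n reservoirs.

```python
def equivalent_points(m1, m2, corridor_units):
    """Equivalent single-pass resolution of a scheme 'IMDP: m1 x (m2/units)'."""
    if (m1 * m2) % corridor_units:
        raise ValueError("scheme does not map to a whole number of points")
    return m1 * m2 // corridor_units


def pair_evaluations(m, n):
    """Rough per-stage work: pairs of (begin, end) storage combinations."""
    return (m ** n) ** 2


def imdp_pair_evaluations(m1, m2, n):
    """Coarse pass plus one corridor pass, per stage."""
    return pair_evaluations(m1, n) + pair_evaluations(m2, n)


schemes = {
    "10 x (20/2)": (10, 20, 2),
    "10 x (40/4)": (10, 40, 4),
    "20 x (10/2)": (20, 10, 2),
    "20 x (20/4)": (20, 20, 4),
}
resolution = {name: equivalent_points(*s) for name, s in schemes.items()}

# Illustrative comparison for n = 2 reservoirs.
mdp_work = pair_evaluations(100, 2)
imdp_work = imdp_pair_evaluations(10, 20, 2)
```

Under these assumptions all four schemes match the resolution of “MDP: 100”, while the counted work for the first scheme is several hundred times smaller than that of the single-pass calculation.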
The calculation steps of IMDP in solving a CROO model are similar to those of MDP, and can be briefly summarized as follows.
- Step 1
Obtain the initial optimal operation trajectories of the storage volume by MDP at a low discrete degree (such as 10, 20, or 30), which can be represented as {$V_0^{(0)}$, $V_1^{(0)}$, …, $V_{Mum}^{(0)}$}.
- Step 2
Construct a corridor around the initial optimal operation trajectories of MDP obtained in Step 1.
- Step 3
Within the constructed corridor, discretize every point in {$V_0^{(0)}$, $V_1^{(0)}$, …, $V_{Mum}^{(0)}$} at another discrete degree.
- Step 4
In the constructed corridor, obtain the optimal storage volume combination for each stage by MDP with a reverse recursion calculation and a chronological order recursion calculation.
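Steps 2 and 3 can be sketched in Python for a single reservoir as follows. The function name, its parameters, and the sample trajectory are hypothetical. The sketch assumes a corridor of one coarse unit on each side of the coarse trajectory (two units in total), evenly spaced candidates that include both corridor edges, clipping to the permitted storage range, and fixed first and last points.

```python
def build_corridor(trajectory, coarse_step, m2, lower, upper,
                   half_width_units=1):
    """Per-stage candidate storages in a corridor around a coarse trajectory.

    The first and last points are kept fixed (boundary conditions); every
    interior point gets m2 evenly spaced candidates within
    +/- half_width_units * coarse_step, clipped to [lower, upper].
    """
    last = len(trajectory) - 1
    corridor = []
    for t, v in enumerate(trajectory):
        if t in (0, last):
            corridor.append([v])
            continue
        lo = max(lower, v - half_width_units * coarse_step)
        hi = min(upper, v + half_width_units * coarse_step)
        step = (hi - lo) / (m2 - 1)
        corridor.append([lo + k * step for k in range(m2)])
    return corridor


coarse_trajectory = [10.0, 20.0, 30.0, 10.0]  # hypothetical coarse MDP result
candidates = build_corridor(coarse_trajectory, coarse_step=10.0, m2=5,
                            lower=0.0, upper=35.0)
```

The per-stage candidate lists then replace the full storage grid in the reverse and chronological recursions of Step 4, so only combinations inside the corridor are evaluated.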