Machine Learning-Based Small Hydropower Potential Prediction under Climate Change

Abstract: As the effects of climate change become more severe, countries need to substantially reduce carbon emissions. Small hydropower (SHP) can be a useful renewable energy source with a high energy density for reducing carbon emissions. Therefore, it is necessary to revitalize the development of SHP to expand the use of renewable energy. To plan and utilize this energy source efficiently, the future SHP potential needs to be assessed based on an accurate runoff prediction. In this study, the future SHP potential was predicted using a climate change scenario and an artificial neural network model. The runoff was simulated accurately, and the applicability of an artificial neural network to runoff prediction was confirmed. The results showed that the total SHP potential in the future will generally decrease compared to the past. This result can serve as base data for planning future energy supplies and carbon emission reductions.


Introduction
Hydropower is an important renewable energy source. It contributes to sustainability by producing electricity with almost zero greenhouse gas (GHG) emissions [1,2]. In particular, small hydropower (SHP) has the additional advantage of responding quickly to short-term changes in electricity demand in small-scale areas [3][4][5]. Many studies have examined its positive and negative impacts (e.g., ecological impacts on the fluvial ecosystem), as well as methods for mitigating the negative impacts (e.g., environmental flows) [1,2]. Because SHP is considered to play a key role in mitigating climate change, it has been actively promoted globally. Many studies have been conducted to estimate the SHP potential, and accurately assessing the future potential of SHP is a major concern [6][7][8][9][10][11][12][13][14][15][16][17][18][19].
SHP plants depend heavily on climatic conditions; thus, it is crucial to accurately predict the runoff under various climate change scenarios in order to assess the future SHP potential [9,20]. In previous studies, runoff was simulated using various hydrological models to evaluate the impacts of climate change on the hydropower potential [9,21,22,23,24,25]. Among conceptual and lumped models, Kim et al. (2012) predicted the future runoff using the Tank model [26], and Chilkoti et al. (2017) used a conceptual rainfall-runoff model [9]. The hydrological models that can be used vary depending on the scale of the target area and the information available for it [8]. By incorporating data such as the basin slope, curve number, soil database, and digital elevation model (DEM), a more detailed model can be applied. Liu et al. (2016) applied eight global hydrological models to project the impact of climate change on the hydropower potential in China [27]. Kim et al. (2018) and Wang et al. (2019) used a grid-based surface runoff model and a variable infiltration capacity model to study the impact of climate change on the future runoff and SHP potential [6,28]. Van Vliet et al. (2016) pointed out the limitations of previous studies that used only one hydrological model and instead used an ensemble of results from three global hydrological models [8]. However, the more input data a model requires, the greater its uncertainty [8].
An artificial neural network (ANN), which is a machine learning model, can be an alternative that overcomes these shortcomings of hydrological models. An ANN is modeled on the structure of neurons, accounts for nonlinearity, and yields highly accurate results in complex systems while being less affected by outliers. Because of these strengths, ANNs have been widely used in hydrological and environmental modeling to study complex nonlinear processes such as the rainfall-runoff process. In particular, ANNs have been applied to a wide variety of data, and in combination with other models, to calculate and predict flood discharge. Campolo et al. (2003) fed three types of input data (daily precipitation, hydrometric, and dam operation plan data) into an ANN model to calculate the flood discharge and obtained better results than with regression models [29]. Kerh and Lee (2006) developed an ANN model based on upstream station data and basin characteristics to predict the characteristics of downstream stations, and the developed model yielded more accurate results than the Muskingum method [30]. Nesliihan (2011) made predictions for 543 ungauged basins in Turkey using two types of ANNs, and Hidayat et al. (2014) applied an ANN to tidal rivers and showed that it could make accurate predictions with less water level data from the target station [31,32]. Jahangir et al. Both studies recognized that their ANN model performed better than existing models [33,34]. As such, research on predicting the future runoff using an ANN model has been conducted; however, no study has used the runoff predicted by an ANN model to estimate the SHP potential.
Therefore, this study applied the ANN model to predict the future runoff and then estimated the SHP potential based on the runoff prediction. The target SHP plant and data are described, and the climate change scenario, ANN, evaluation metrics, and SHP potential calculation are explained in Section 2. The results of the future SHP potential prediction are presented in Section 3. The conclusions are presented in Section 4.

Target SHP Plant and Data
In this study, the Hanseok power plant in the Han River basin of South Korea was selected as the target plant. As of 2015, there were 61 SHP plants in operation in South Korea. The target plant was selected based on the following criteria: (1) a rainfall station and a stage station with available weather and discharge data, (2) a power plant capacity of over 2000 kW, which guarantees stable power plant operation, and (3) power generation data covering 10 years or longer. In particular, for the application of the ANN, the training period should be at least twice as long as the test period, so the availability of observed data was an important criterion for this selection.
The Hanseok SHP plant is in the standard basin of the Saigokcheon junction. There are two rainfall stations (Yeongwol and Yeongju) and a stage station (Danyanggun) in operation near the plant (Figure 1). It has been generating electricity since its construction in 1989, with an installed capacity of 2214 kW and an effective head of 3.8 m. Weather data were collected from the two nearby rainfall stations, Yeongwol and Yeongju, which are under the control of the Korea Meteorological Administration (KMA). The daily observed data for the precipitation, average temperature, average wind speed, and average relative humidity for the period of 1995 to 2020 were collected. The daily average discharge data for the same period were collected from the Danyanggun stage station.

Climate Change Scenario
A climate change scenario can be defined as the future carbon dioxide concentration in the atmosphere that is used as the forcing condition of a climate change model. The Intergovernmental Panel on Climate Change (IPCC) has been developing future climate change scenarios based on GHG emission scenarios and evaluating climate change response strategies. In the IPCC Fifth Assessment Report (AR5) in 2014, GHG concentrations were determined based on the radiative forcing on the atmosphere caused by human activities. Representative Concentration Pathways (RCPs) were developed to indicate that socioeconomic scenarios may vary for one representative radiative forcing. The RCP scenarios consist of four different cases (2.6, 4.5, 6.0, and 8.5) according to climate change response policies (Table 1).
Table 1. Brief explanation of the RCP scenarios (2.6, 4.5, 6.0, and 8.5).
RCP | Description | CO2 concentration
2.6 | The Earth is able to recover from the effects of human activities (impossible scenario) | 420 ppm
4.5 | The greenhouse gas reduction policies are implemented significantly | 540 ppm
6.0 | The greenhouse gas reduction policies are realized to a lesser degree than in RCP 4.5 | 670 ppm
8.5 | The greenhouse gases are emitted at the current trend (without reduction) | 940 ppm
RCP scenarios are materialized through global climate models (GCMs) and are the most commonly used climate change forecast data. GCMs are global atmosphere-ocean circulation models based on complex interactions among various forcings, such as solar radiation, volcanic eruptions, and the greenhouse effect, and various other conditions including the atmosphere, oceans, and ground surface. However, it is difficult to use GCMs to analyze regional areas because of their coarse resolution (135 × 135 km). Therefore, spatial and temporal downscaling must be conducted to use a GCM at a regional scale.
To simulate the future climate of South Korea, the KMA prepares global climate change scenarios using the Coupled Model Intercomparison Project phase 5 (CMIP5). Among several CMIP5 models, the GCM HadGEM2-AO (Hadley Centre Global Environment Model version 2, Atmosphere-Ocean) from the Hadley Center (UK Met Office) is widely used to understand climate change and provide future climate projections [35,36]. As mentioned above, it is difficult to use a GCM directly for South Korea. Therefore, the KMA uses HadGEM3-RA, a regional climate model based on HadGEM2-AO, for East Asia and the Korean Peninsula. However, the resolution of the regional climate model is still too coarse for studying small areas such as watersheds. Thus, the KMA offers fine-scale (1 × 1 km) climate change data over South Korea. To prepare the fine-scale data, the KMA used the Parameter-elevation Regression on Independent Slopes Model (MK-PRISM) and the PRISM-based Downscaling Estimation Model (PRIDE) to downscale HadGEM3-RA. This study used the fine-scale data from the KMA as future climate data.
We collected daily RCP 4.5 scenario data containing the precipitation, average temperature, relative humidity, and average wind speed from 2021 to 2030. RCP 4.5 is mainly used to estimate the long-term runoff and assumes the substantial realization of GHG reduction policies [37]. Because precipitation is underestimated in the climate change scenario, outlier testing and bias correction through quantile mapping are required [38]. Therefore, this study used the box plot method to detect outliers and quantile mapping for the bias correction. After applying the two methods, we calculated the basin average value of each meteorological factor in the observation data and the climate change scenario data by weighting the station values with the Thiessen polygon area ratios.
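The preprocessing steps named above (box plot outlier detection, quantile mapping, and Thiessen-weighted basin averaging) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by the authors or the KMA: the function names, the 1.5 × IQR whisker rule, the empirical form of quantile mapping, the area ratios, and the synthetic gamma-distributed data are all hypothetical.

```python
import numpy as np

def boxplot_outlier_mask(x, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (box plot rule; k=1.5 is an assumption)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

def quantile_map(obs_hist, sim_hist, sim_future, n_quantiles=100):
    """Empirical quantile mapping: replace each simulated value with the
    observed value found at the same quantile of the historical data."""
    probs = np.linspace(0.0, 1.0, n_quantiles + 1)
    obs_q = np.quantile(obs_hist, probs)
    sim_q = np.quantile(sim_hist, probs)
    # Drop tied simulated quantiles (e.g. many dry days) so the x-grid is strictly increasing.
    sim_q, first = np.unique(sim_q, return_index=True)
    obs_q = obs_q[first]
    # Values outside the historical simulated range are clamped to the end points.
    return np.interp(sim_future, sim_q, obs_q)

def thiessen_average(station_values, area_ratios):
    """Basin average of station values (n_days x n_stations) weighted by polygon area ratios."""
    w = np.asarray(area_ratios, dtype=float)
    return np.asarray(station_values, dtype=float) @ (w / w.sum())

# Synthetic demonstration: the "scenario" underestimates observed precipitation by 30%.
rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 5.0, size=1000)
sim_hist = 0.7 * rng.gamma(2.0, 5.0, size=1000)
sim_future = 0.7 * rng.gamma(2.0, 5.0, size=1000)
corrected = quantile_map(obs, sim_hist, sim_future)

# Two stations with hypothetical Thiessen area ratios of 0.25 and 0.75.
basin_mean = thiessen_average([[10.0, 20.0]], [0.25, 0.75])
print(obs.mean(), sim_future.mean(), corrected.mean(), basin_mean)
```

In this sketch the correction is fitted on an overlapping historical period and then applied to the future series; values beyond the historical range are simply clamped, which is one of several possible choices for handling extremes.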

Artificial Neural Network
The ANN model, introduced by McCulloch and Pitts (1943), is a representative supervised machine-learning algorithm [39]. The ANN model is inspired by the human brain and is generally used for the classification and prediction of specific factors without requiring a predefined mathematical relationship. The model is known to be an effective method for analyzing nonlinear relationships between independent and dependent variables in given datasets. Figure 2 illustrates the conceptual diagram of the ANN model. The ANN model consists of three layers, namely the input, hidden, and output layers (Figure 2), each of which possesses a set of neurons that are fully connected with the neurons in the following layer, and each layer has different weight values (w). The ANN model aims to reduce errors, defined as the difference between the estimated and targeted values, by modifying the weights using the backward propagation process. The backward propagation process involves adjusting the parameters of the model (e.g., weights, w, and biases, b) based on the loss obtained in the previous iteration. Proper tuning minimizes the errors and makes the model reliable by improving its generalization. The ANN model was mathematically formulated using Equation (1):
$$Y = f\left(\sum_{j} w_{j}\, f\left(\sum_{i} w_{ij} X_{i} + b_{j}\right) + B\right) \tag{1}$$
where X is an input variable, f denotes an activation function for the layers, w represents the weight values between layers (with i indexing the inputs and j the hidden neurons), and b and B indicate the biases in the hidden and output layers, respectively. In the ANN algorithm, the input X is multiplied by the weight value, and the resulting value is then converted by the activation function. Subsequently, it is transmitted to the next layer as a signal. Through these processes, the final output Y is obtained. The representative activation functions generally used in the ANN model include the sigmoid, hyperbolic tangent (tanh), and rectified linear unit (ReLU) functions.
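The forward pass described above (weighted sum, bias, activation, then the output layer) can be illustrated with a few lines of NumPy. This is a minimal sketch rather than the authors' implementation: the function name `forward`, the array shapes, and all numeric weights are assumptions chosen so that the result can be checked by hand, with ReLU in the hidden layer and an identity output.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: max(z, 0) element-wise."""
    return np.maximum(z, 0.0)

def forward(X, W_hidden, b, w_out, B, f_hidden=relu, f_out=lambda z: z):
    """One hidden layer: Y = f_out(f_hidden(X @ W_hidden + b) @ w_out + B)."""
    hidden = f_hidden(X @ W_hidden + b)   # signals passed on by the hidden neurons
    return f_out(hidden @ w_out + B)      # final output Y

# Hand-checkable example with two inputs and two hidden neurons (all values hypothetical).
X = np.array([[1.0, 2.0]])
W_hidden = np.array([[1.0, -1.0],
                     [0.5,  1.0]])        # rows: inputs, columns: hidden neurons
b = np.array([0.0, -2.0])                 # hidden-layer biases
w_out = np.array([3.0, 4.0])              # hidden-to-output weights
B = 1.0                                   # output-layer bias

Y = forward(X, W_hidden, b, w_out, B)
print(Y)  # hidden pre-activations are [2, -1], ReLU gives [2, 0], so Y = 3*2 + 4*0 + 1 = 7
```

Training would adjust `W_hidden`, `b`, `w_out`, and `B` by backward propagation of the loss; that step is omitted here to keep the sketch short.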

Evaluation Metrics
In this study, the three metrics used for evaluating the performance of the ANN model for runoff forecasting were the Nash-Sutcliffe efficiency (NSE), correlation coefficient (CC), and percent bias (PBIAS):
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left(y_{o,i} - y_{e,i}\right)^{2}}{\sum_{i=1}^{n} \left(y_{o,i} - \bar{y}_{o}\right)^{2}}$$
$$\mathrm{CC} = \frac{\sum_{i=1}^{n} \left(y_{e,i} - \bar{y}_{e}\right)\left(y_{o,i} - \bar{y}_{o}\right)}{\sqrt{\sum_{i=1}^{n} \left(y_{e,i} - \bar{y}_{e}\right)^{2}}\,\sqrt{\sum_{i=1}^{n} \left(y_{o,i} - \bar{y}_{o}\right)^{2}}}$$
$$\mathrm{PBIAS} = \frac{\sum_{i=1}^{n} \left(y_{e,i} - y_{o,i}\right)}{\sum_{i=1}^{n} y_{o,i}} \times 100$$
where $y_e$ and $y_o$ denote the simulated and observed runoff, respectively, and $\bar{y}_e$ and $\bar{y}_o$ are the average values of the simulated and observed runoff, respectively. The CC ranges from −1 to 1 and measures how well the outputs are simulated by the model. A value of 0 indicates that there is no correlation between the two runoff datasets. The PBIAS indicates the ratio of the difference between the sums of the simulated and observed runoff to the sum of the observed runoff. It is generally used to evaluate the modeling performance for runoff volumes. The NSE denotes the predictive power of the model. It ranges from −∞ to 1; a value closer to 1 indicates a better performance of the model, while a value below zero indicates that the average of the observed values is a better predictor than the model.
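The three metrics can be computed directly from paired observed and simulated series. The sketch below follows the definitions given in the text (including the sign convention of PBIAS as simulated minus observed); the function names and the four-value sample series are hypothetical and serve only to make the arithmetic easy to verify.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def cc(obs, sim):
    """Pearson correlation coefficient between simulated and observed runoff."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    do, ds = obs - obs.mean(), sim - sim.mean()
    return np.sum(ds * do) / np.sqrt(np.sum(ds ** 2) * np.sum(do ** 2))

def pbias(obs, sim):
    """Percent bias: 100 * sum(sim - obs) / sum(obs); positive means overestimation here."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 100.0 * np.sum(sim - obs) / np.sum(obs)

# Hypothetical series: the simulation is offset from the observations by a constant 0.1.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 2.1, 3.1, 4.1]
print(nse(observed, simulated), cc(observed, simulated), pbias(observed, simulated))
```

A constant offset leaves the CC at 1 while lowering the NSE and producing a nonzero PBIAS, which is why the three metrics are usually reported together.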

SHP Potential Calculation
The theoretical potential of SHP is the energy that can be obtained without considering geographical and technical constraints. The theoretical potential was calculated as follows:
$$P_{\text{theoretical}} = \rho\, g\, Q\, H$$
where ρ is the water density (kg/m³), Q is the runoff (m³/s), H is the head (m), and g is the acceleration due to gravity (m/s²). The density of water was taken as 1000 kg/m³, and the acceleration due to gravity was approximately 9.8 m/s². The effective head, which is the height of the head that determines the energy, was used as H; it is obtained by subtracting the head loss from the gross head. The design discharge (m³/s) value was applied. In this study, the maximum power generation potential of the Hanseok SHP plant was taken as 53 MW, and the maximum runoff contributing to power generation was 158.3 m³/s. Therefore, we calculated the potential by applying 158.3 m³/s as the design discharge.
The technical potential can be calculated from the theoretical potential by considering the efficiency of the plant (η) and the operation rate. The technical potential was calculated as follows:
$$P_{\text{technical}} = P_{\text{theoretical}} \times \eta \times (\text{operation rate})$$
The total efficiency (η) varies according to the size and type of the plant. In this study, the efficiency and operation rate were assumed to be 0.8 and 0.4, respectively [40].
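As a worked illustration of the two potentials, the sketch below multiplies water density, gravity, discharge, and head, and then applies the efficiency and operation rate. The effective head (3.8 m), design discharge (158.3 m³/s), efficiency (0.8), and operation rate (0.4) are taken from the text; the function names and the choice to cap the discharge at the design discharge are assumptions made for this example, and the result is in watts because SI units are used throughout.

```python
import numpy as np

RHO = 1000.0   # water density, kg/m^3
G = 9.8        # acceleration due to gravity, m/s^2

def theoretical_potential_w(q, head, q_design=None, rho=RHO, g=G):
    """Theoretical potential rho * g * Q * H in watts.
    If q_design is given, discharge above it is assumed not to contribute."""
    q = np.asarray(q, dtype=float)
    if q_design is not None:
        q = np.minimum(q, q_design)
    return rho * g * q * head

def technical_potential_w(q, head, efficiency, operation_rate, q_design=None):
    """Technical potential: theoretical potential scaled by efficiency and operation rate."""
    return theoretical_potential_w(q, head, q_design) * efficiency * operation_rate

# Values quoted in the text for the target plant.
HEAD_M = 3.8
Q_DESIGN = 158.3

# Hypothetical daily runoff series (m^3/s); the last value exceeds the design discharge.
runoff = np.array([20.0, 80.0, 158.3, 300.0])
p_theory = theoretical_potential_w(runoff, HEAD_M, q_design=Q_DESIGN)
p_tech = technical_potential_w(runoff, HEAD_M, 0.8, 0.4, q_design=Q_DESIGN)
print(p_theory / 1e3)  # kW
print(p_tech / 1e3)    # kW
```

Multiplying the daily power by the number of hours in each day and summing by month would give a monthly energy potential; that aggregation is left out because the text does not specify it.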

ANN Model Development
The ANN model used in this study had a standard three-layer structure. It consisted of an input layer, hidden layer, and output layer, with a ReLU activation function in the hidden layer and a linear transfer function in the output layer. The runoff prediction performance was examined for various numbers of hidden layers, and the most reliable results were obtained with two hidden layers. The simulation function of the algorithm for predicting the runoff was as follows:
$$Q_{t+m} = f\left(P_{t-n}, H_{t-n}, T_{t-n}, W_{t-n}\right)$$
where $Q_{t+m}$ is the predicted runoff with a lead time of m, and $P_{t-n}$, $H_{t-n}$, $T_{t-n}$, and $W_{t-n}$ indicate the antecedent precipitation, humidity, temperature, and wind speed at a previous time step of n, respectively. In this study, the input variables were obtained from two weather stations, and values of n between one and four days were considered. A fully connected layer with 40 neurons was used as the input layer. For the runoff prediction using the ANN model, the training period was from January 1995 to December 2015, the validation period was from January 2016 to December 2020, and the test period was from January 2021 to December 2030.
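The antecedent-input structure described above can be made concrete by building a table of lagged weather values for each target day. The sketch below is only an illustration under assumptions: the function name `build_lagged_samples`, the decision to concatenate all lags from one to four days into a single input vector, and the lead time of one day are hypothetical, and the random series stand in for real station data.

```python
import numpy as np

def build_lagged_samples(features, runoff, lags=(1, 2, 3, 4), lead=1):
    """Pair weather values at t-n (for every n in lags) with the runoff at t+lead.

    features: array of shape (n_days, n_features)
    runoff:   array of shape (n_days,)
    Returns X of shape (n_samples, n_features * len(lags)) and y of shape (n_samples,).
    """
    features = np.asarray(features, dtype=float)
    runoff = np.asarray(runoff, dtype=float)
    n_days = features.shape[0]
    # Reference days t for which both the oldest lag and the target day exist.
    t_index = np.arange(max(lags), n_days - lead)
    X = np.hstack([features[t_index - n] for n in lags])
    y = runoff[t_index + lead]
    return X, y

# Small deterministic example: 10 days, 2 features per day.
small_features = np.arange(20).reshape(10, 2)
small_runoff = np.arange(10) * 10.0
X_small, y_small = build_lagged_samples(small_features, small_runoff)

# Synthetic stand-in for 4 variables at 2 stations (8 series) over 100 days.
rng = np.random.default_rng(1)
X_demo, y_demo = build_lagged_samples(rng.random((100, 8)), rng.random(100))
print(X_demo.shape)  # 8 series x 4 lags gives 32 inputs per sample under these assumptions
```

Splitting the resulting samples chronologically (rather than at random) keeps the training, validation, and test periods separate in time, matching the period-based split described in the text.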

Runoff Prediction under a Climate Change Scenario Using the ANN Model
The trained ANN model was used for the validation and testing of runoff predictions. Figure 3 illustrates the validation results for the runoff prediction using the ANN model from January 2016 to December 2020. As shown in the figure, the model captured the temporal variability of the runoff and the peak runoff. Statistical metrics were used to quantify the predictive performance of the model. The values of CC, PBIAS, and NSE were 0.77, 16.8%, and 0.6, respectively. According to Moriasi et al. (2007) and Xiang et al. (2020), an NSE value over 0.5 is considered acceptable in hydrologic modeling, indicating that the ANN model used in this study provided a sufficient predictive performance in forecasting the runoff [41,42].

SHP Potential Prediction
The future SHP potential (2021-2030) was predicted using the future runoff simulation results. The results of the monthly SHP potential are shown in Figure 4. In the prediction period of 2021-2030, the SHP potential decreased compared to that of the historic period. This result can be attributed to a decrease in precipitation in the future climate change scenario. The decreasing trend of precipitation in the 2020s has also been mentioned in previous studies [43][44][45]. Table 2 compares the statistics of the monthly SHP potential in the historic and future prediction periods. The statistics from the maximum to the mean of the SHP potential were lower in the future period than in the historic period, with changes ranging from −45.9% to −11.6%. On the other hand, the lower quartile (25%) and minimum values increased in the future period, with changes ranging from 16.9% to 734.6%. The monthly average potentials of each period are compared in Figure 5. Overall, the average monthly potential decreased in the future period compared to that of the historic period, particularly from July to September. However, the monthly average potential increased in some months, with the largest increase, 59%, occurring in June. These results indicate that the variation in the SHP potential will decrease in the future, similar to the characteristics of precipitation in the climate change scenarios.
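The percentage changes reported for the summary statistics follow from a simple relative difference between the two periods. The sketch below shows that calculation on made-up numbers; the function names and the two short series are hypothetical and do not reproduce the values in Table 2.

```python
import numpy as np

def summary_stats(x):
    """Summary statistics of a series of monthly potentials."""
    x = np.asarray(x, dtype=float)
    return {
        "max": x.max(),
        "q75": np.percentile(x, 75),
        "mean": x.mean(),
        "q25": np.percentile(x, 25),
        "min": x.min(),
    }

def percent_change(historic, future):
    """Relative change of the future value with respect to the historic value, in percent."""
    return 100.0 * (future - historic) / historic

def compare_periods(historic, future):
    """Percent change of each summary statistic between the two periods."""
    h, f = summary_stats(historic), summary_stats(future)
    return {name: percent_change(h[name], f[name]) for name in h}

# Hypothetical monthly potentials: the future series has a lower peak and a higher minimum.
historic = [1.0, 2.0, 3.0, 4.0, 5.0]
future = [2.0, 2.0, 3.0, 3.0, 4.0]
changes = compare_periods(historic, future)
print(changes)
```

Because the change is expressed relative to the historic value, a small historic minimum can produce a very large percentage increase even when the absolute difference is modest.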

Conclusions and Discussion
Efforts to achieve carbon neutrality are needed in order to mitigate the global climate crisis. To this end, it is necessary to develop eco-friendly sources of energy such as SHP. In this study, we attempted to accurately predict the future SHP potential using a climate change scenario and an ANN model. The Hanseok SHP plant was selected as the target plant. The future runoff was simulated using the ANN model while considering climate change, and the future SHP potential was subsequently predicted. The runoff simulation results of the ANN model showed a sufficient predictive performance, with a CC value of 0.77, PBIAS of 16.8%, and NSE of 0.6, and the model captured the variability and peak runoff. The future SHP potential was then predicted using the future runoff. The results indicate that the SHP potential will decrease from 2021 to 2030. In particular, the maximum value of the monthly SHP potential in the future period was predicted to decrease by 45.9% compared to that in the historic period. In addition, the monthly average potentials were expected to decrease from July to September, while the opposite trend was shown in June.
This study contributes to improving the prediction of the future SHP potential by using an ANN model. Although this study was conducted for a single target SHP plant, the approach can also be applied to other SHP plants. Because the accurate operation rate and design discharge of the plant were not available, we assumed the most conservative values for the analysis; in actual applications, more accurate predictions are possible using more accurate values. In addition, machine learning models, including ANNs, learn from observed data and make predictions based on them, so the model is highly dependent on the training data. Even if the training period is long enough and the selected training data represent the characteristics of the whole dataset, it is difficult to predict future extreme events that have not occurred in the past. More accurate predictions should be possible by taking these limitations into account and by complementing existing models with the results of the model used in this study. This study is then expected to be useful for planning future energy supplies and carbon emission reductions.