Article

An Innovative Approach for Forecasting Hydroelectricity Generation by Benchmarking Tree-Based Machine Learning Models

by
Bektaş Aykut Atalay
and
Kasım Zor
*
Department of Electrical and Electronic Engineering, Graduate School, Adana Alparslan Türkeş Science and Technology University, 01250 Adana, Türkiye
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(19), 10514; https://doi.org/10.3390/app151910514
Submission received: 28 August 2025 / Revised: 15 September 2025 / Accepted: 24 September 2025 / Published: 28 September 2025

Abstract

Hydroelectricity, one of the oldest and most potent forms of renewable energy, not only provides low-cost electricity for the grid but also preserves nature through flood control and irrigation support. Forecasting hydroelectricity generation is vital for utilizing water resources effectively, optimizing energy production, and ensuring sustainability. This paper provides an innovative approach to hydroelectricity generation forecasting (HGF) for a 138 MW hydroelectric power plant (HPP) in the Eastern Mediterranean by taking into account the electricity production of the remaining upstream HPPs on the Ceyhan River within the same basin, unlike prior research focusing on individual HPPs. In light of tuning hyperparameters such as the number of trees and the learning rate, this paper presents a thorough benchmark of state-of-the-art tree-based machine learning models, namely categorical boosting (CatBoost), extreme gradient boosting (XGBoost), and light gradient boosting machines (LightGBM). The comprehensive data set includes historical hydroelectricity generation, meteorological conditions, market pricing, and calendar variables acquired at hourly resolution from the transparency platform of the Energy Exchange Istanbul (EXIST) and NASA's MERRA-2 reanalysis. Although all three models performed well, LightGBM emerged as the most accurate and efficient model, outperforming the others with the highest coefficient of determination (R2) (97.07%), the lowest root mean squared scaled error (RMSSE) (0.1217), and the shortest computational time (1.24 s). Consequently, the proposed methodology demonstrates significant potential for advancing HGF and is expected to contribute to the operation of existing HPPs and the improvement of power dispatch planning.

1. Introduction

Hydroelectric power remains a key element of renewable energy with its high efficiency and low costs [1]. In 2023, hydroelectricity’s contribution of 14.17% to total electricity generation and 47.17% to renewable electricity production underscored its significant role in the global energy mix and its dominant position within the renewable sector [2]. Its capability to balance supply and demand sets hydroelectricity apart from other renewable technologies, making it a critical stabilizer in increasingly decentralized and uncontrollable electric grids [3]. Additionally, hydroelectricity enhances grid stability, offers long-duration energy storage, and plays a crucial role in integrating intermittent renewable sources like wind and solar into energy systems aiming for net-zero emissions [4,5]. Moreover, the environmental benefits of hydroelectricity are well documented in the literature. Studies have shown that hydroelectricity substantially lessens carbon emissions compared to fossil fuels [6,7]. As the electricity demand grows and the integration of variable renewable sources accelerates, the need for more accurate hydroelectricity forecasts becomes crucial [8]. Forecasting is vital for effective power dispatch planning, sustainable water resource management, and grid reliability [9,10,11].
Non-linear dynamics, including hydrological and meteorological variations and operational aspects, affect water availability and hydroelectricity generation. This variability poses noteworthy challenges for classic forecasting approaches relying on linear statistical models. Moreover, the ecological impacts of HPPs on river ecosystems highlight the need for equilibrium between energy production and ecosystem preservation [12].
Precise and trustworthy hydroelectricity forecasting is important for many reasons, such as enabling efficient energy management, reservoir operations, and grid integration of this variable renewable resource. Meticulous forecasts support the effective allocation of resources such as water, the minimization of energy shortages, and improvements to the overall stability of the grid. Inaccurate forecasts may result in grid imbalances, economic losses, and low operational efficiency. Machine learning (ML)-based algorithms can capture complex patterns and relationships from large data sets, making them well suited for dealing with the challenges encountered in HGF. Tree-based ML models such as random forest (RF), gradient boosted decision trees (GBDT), CatBoost, LightGBM, and XGBoost have shown particular promise in handling data with non-linear characteristics and have demonstrated elevated performance in various forecasting tasks. Despite their potential, the application of tree-based methods in hydroelectricity forecasting remains relatively unexplored, and a rigorous comparative analysis is lacking [13].
The Ceyhan River is one of the essential waterways in the Eastern Mediterranean as shown in Figure 1. Aslantaş HPP and other HPPs in the same basin harness the energy of the Ceyhan River to produce electricity. With an installed capacity of 138 MW, Aslantaş HPP aims to generate 569 GWh of electricity annually. Behind the HPP, the vast reservoir, holding 1150 million cubic meters of water on average, plays a paramount role in irrigating a substantial land area of about 149,849 hectares. Beyond generating electricity, Aslantaş HPP also plays a crucial role in managing floods along the Ceyhan River [14]. Consequently, the management of hydroelectricity generation requires accurate forecasting to make efficient use of water resources and maintain grid stability, empowering the caretakers responsible for electricity generation and water management to make informed decisions for a more sustainable future. Table 1 presents the HPPs installed on the Ceyhan River in detail.
This study proposes a novel method that benchmarks tree-based ML techniques by using basin hydroelectricity generation data at the Aslantaş HPP to bridge a critical deficiency in HGF. By utilizing historical power generation data of all HPPs on the Ceyhan River in the same basin, meteorological measures, market prices, and categorized date–time records as input parameters, this study aims to develop robust and efficient forecasting models. Therefore, this study addresses an obvious research gap by proposing an innovative approach to forecast hydroelectricity generation. Specifically, this paper employs CatBoost, LightGBM, and XGBoost in forecasting hourly hydroelectricity generation at the Aslantaş HPP in the Eastern Mediterranean.
Furthermore, this study investigates the comparative accuracy and reliability of these models using a variety of input parameters, including historical electricity generation, temperature, humidity, wind speed, shortwave flux, and calendar data.
The performance of the models is rigorously evaluated by calculating the R2 and RMSSE metrics.
Below are the original contributions of this research:
  • First and foremost, Python, an open-source programming language, is used in this paper on a publicly available data set to present reproducible work for other researchers studying the same field and to bring reproducibility to the fore in scientific writing.
  • One of the main contributions of this study is to propose an innovative approach for forecasting the hydroelectricity generation of an HPP by taking into account the electricity production of the other upstream HPPs on the same river (or within the same basin), alongside a variety of explanatory features comprising meteorological, market, calendar, and historical hydroelectricity generation data. The proposed methodology distinguishes this paper from other studies in the literature that focus on a single HPP and offers a more comprehensive perspective on basin-wide hydrological and operational dynamics for future studies. Furthermore, the HGF literature remains immature in terms of covering studies with real-time data in the short-term horizon, and this paper is expected to bridge the highlighted gap and reinforce the current literature.
  • For the first time in the literature, this paper carries out a thorough benchmark of state-of-the-art tree-based machine learning models, namely XGBoost, LightGBM, and CatBoost, by taking the tuning of hyperparameters such as the number of trees and the learning rate into consideration. To the best of the authors’ knowledge, no previous research has conducted a direct head-to-head comparison of these algorithms in forecasting hydroelectricity generation under identical constraints with the same performance and error metrics.
The rest of this study is organized as follows: Section 2 reviews the relevant literature on HGF, with a specific focus on machine learning applications. Section 3 details the data and methodological approaches used in this study. Section 4 thoroughly presents the results and analyses obtained and discussed. Finally, Section 5 summarizes the study’s findings and outlines future research directions.

2. Related Work

This section reviews the related works on HGF. While various methods and approaches have been explored in this domain, a focused and systematic review targeting HGF for various plant types and operating conditions is currently lacking. The existing body of work is presented chronologically so as to offer valuable insights and identify the crucial gaps that motivate the present study.
Ref. [13] explicitly identified this research gap, emphasizing the limited application of HGF compared to other renewable energy sources. This gap underscores the need for targeted reviews focusing on HGF to bridge this divide and advance the state of the field.
Ref. [16] systematically reviewed ML models in energy systems but mentioned HGF only briefly concerning renewable energy systems. Ref. [17] examined ANN applications in energy and reliability prediction across solar, wind, and hydraulic energy sources but provided limited coverage of hydroelectricity. The literature contains comprehensive reviews highlighting advancements in renewable energy forecasting, particularly leveraging ML and DL techniques. These reviews have extensively explored forecasting models for wind, solar, and other renewable energy sources, focusing on integrating data-driven methodologies, hybrid models, and optimization algorithms to enhance forecasting accuracy and reliability [16,17,18,19].
These studies collectively emphasize the role of ML techniques in advancing renewable energy forecasting systems and optimizing energy grid operations. However, many studies have centered on wind and solar energy forecasting, leaving hydroelectricity comparatively under-explored; HGF remains a relatively underrepresented area in the literature.
To provide a coherent overview of the field, the reviewed HGF studies are grouped according to their methodological foundations.

2.1. Statistical Models

Regression-based approaches have been widely applied for HGF, ranging from multiple linear regression with stepwise selection to climate-informed regression using large-scale predictors, while more recent studies benchmarked Gaussian processes and support vector regression against traditional formulations, with kernel-based methods showing improved accuracy [20,21,22].
ARIMA and its extensions (ARIMAX and SARIMA) have been widely applied in HGF, linking generation with precipitation, capturing seasonal fluctuations, and supporting medium-term planning across diverse regions. Comparative analyses highlighted Holt–Winters as effective for seasonal variability, while evaluations in Brazil showed that even simple seasonal naïve baselines can provide competitive references. Applications in Vietnam, Malaysia, Ghana, and Rwanda further underscored the suitability of ARIMA/SARIMA approaches in data-limited settings [23,24,25,26,27,28,29].
Grey models, seasonal and data-grouping extensions, and more recent fractional-order formulations with buffer operators and metaheuristic optimization have been applied to monthly and quarterly hydroelectricity generation. These approaches are particularly useful when historical data are limited, offering robust alternatives to conventional statistical methods [30,31,32]. Short-term generation has been modeled with precipitation, demand, and past production using statistical bias correction. Flow–duration curves and reference flows have also supported feasibility assessments of small hydroelectric power plants [33].

2.2. Neural Networks-Based Models

Neural networks have become central to HGF by providing non-linear mapping capabilities [34,35,36,37], and applications in Türkiye showed their value for estimating generation potential in irrigation dams [38].
Applications include the artificial bee colony (ABC) algorithm for Türkiye’s national generation [39], particle swarm optimization in BP-ANNs for small HPPs [40], the bat algorithm for Malaysian reservoirs [41], and firefly optimization for small HPP forecasting [42,43]. Brazilian case studies demonstrated the potential of deep neural networks [44], while LSTM and ELM improved temporal modeling of small hydroelectricity generation [45,46]. Additional architectures such as GMDH were tested against MLPs in the Amazon basin [47], and ANN models were adapted to water–energy interactions in Malaysia [48].
Recent studies introduced specialized neural architectures, including Transformer-enhanced LSTMs, temporal convolutional hybrids, and extensions to interconnected systems such as HPPs integrated with water distribution systems (WDS), which lowered forecasting errors and improved robustness in large-scale applications [49,50,51,52].

2.3. Tree-Based Models

Tree-based ensemble methods have recently gained traction for HGF. RF has been shown to provide stable baselines for reservoir generation prediction [53], while GBDT variants achieved superior accuracy in Turkish case studies [54]. Optimized GBDT and CatBoost models further improved predictive efficiency under diverse inflow conditions [55].
CatBoost excelled in handling categorical inputs [56] compared to XGBoost and LightGBM, while broader benchmarks confirmed its generalization performance [57]. XGBoost was also adapted with meteorological features to improve forecasts [58].
GBDT outperformed ANN variants for SHP forecasting in Poland [59]. XGBoost was validated for hydroelectricity generation [60]. XGBoost and CatBoost coupled with metaheuristics (SMA, AO, and GWO) pushed accuracy further [61,62].

2.4. Hybrid and Other Models

Hybrid and alternative models have been widely developed by combining statistical, neural, and heuristic techniques for HGF. Early studies explored fuzzy systems and evolutionary algorithms, showing that evolving fuzzy inference models could achieve daily inflow predictions comparable to hydrological baselines [63,64]. Genetic algorithms were also applied to reservoir operation planning and long-term inflow modeling, demonstrating their value in optimization-driven forecasting [65,66].
Neuro-fuzzy approaches extended this line of research. ANFIS models optimized with grey wolf algorithms or cascaded structures consistently outperformed classical machine learning alternatives [67,68]. Beyond this, hybrid architectures have emerged that integrate deep learning. Zhou et al. proposed the DeepHydro framework [69], a latent recurrent neural network model, while other studies benchmarked LSTM against XGBoost and SVR [70] and developed ensembles combining LSTM with Conv1D for Cameroon’s Songloulou HPP [71]. LSTM–SVR hybrids were also evaluated in Türkiye [72].
Aksoy [73] evaluated multiple machine learning techniques including kNN, SVR, RF, GA, DNN, RNN, and autoencoders for hourly forecasting.
EEMD–GRU and wavelet–LSTM–RF formulations improved the treatment of nonstationary inputs [74,75], and the HYPE–ANN framework yielded robust forecasts for run-of-river schemes [76].
A Developed Crow Search Algorithm was incorporated into ANN training under climate change conditions in China [77], and ABC was combined with ELM for Turkish SHPs [78]. Hybrid formulations also gained traction, with ANN–GA and PSO–ANN variants improving forecast accuracy in Laos [79,80]. LSTM was applied to Malawian plants [81], LSTM was benchmarked against ANFIS for Turkish run-of-river schemes [82], and LWNRBF networks were adapted for next-day capacity prediction [83].
After reviewing all references related to HGF, it is evident that a significant number of studies have employed tree-based methods for HGF. These studies, as summarized in Table 2, showcase the diversity of approaches and evaluation metrics used in the field. The methods include widely used algorithms such as RF, GBDT, and XGBoost among others.

3. Material and Methods

3.1. Material

The material of this study is a data set that includes a variety of categories classified for energy, weather, market, and calendar variables. The data set covers the period from 1 July 2020 to 31 October 2024. The modeling framework in this study is based on the input variables summarized in Table 3, which include historical hydroelectricity generation, reservoir levels, inflow data, precipitation, temperature, and other relevant meteorological and hydrological indicators. These variables were consistently used as predictors across all modeling experiments and evaluation stages, ensuring a standardized input configuration for all algorithms.
Energy variables including hydroelectricity generation of Aslantaş HPP (indicated in Figure 2) and other HPPs on the Ceyhan River are illustrated in Figure 3. The energy variables also contain the lagging values of Aslantaş HPP’s 1-hour, 1-day, and 1-week lags as well.
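The lagged energy variables described above can be reproduced with a few lines of pandas; the sketch below is illustrative, and the column name `aslantas_mwh` is a placeholder rather than a field from the original data set.

```python
import pandas as pd

def add_generation_lags(df, col="aslantas_mwh"):
    """Append 1-hour, 1-day, and 1-week lags of an hourly series.

    The column name is hypothetical; with hourly data, shifts of 1, 24,
    and 168 rows correspond exactly to the three lags used in the paper.
    """
    out = df.copy()
    for label, hours in [("lag_1h", 1), ("lag_1d", 24), ("lag_1w", 168)]:
        out[label] = out[col].shift(hours)
    return out

# Toy hourly series to demonstrate the shifts
idx = pd.date_range("2020-07-01", periods=200, freq="h")
df = pd.DataFrame({"aslantas_mwh": range(200)}, index=idx)
lagged = add_generation_lags(df)
```

Rows before the first available lag naturally become missing values and would be dropped or imputed before training.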
Calendar data were extracted from date and time data. Calendar variables consist of the year, month of year, week of year, day of month, hour of day, day of week, and type of day (0 for weekdays and 1 for weekends).
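As an illustration, the calendar variables listed above can be derived from a datetime index as follows (a sketch; the feature names are our own, since the paper does not specify them):

```python
import pandas as pd

def calendar_features(index: pd.DatetimeIndex) -> pd.DataFrame:
    """Derive the calendar variables described in the text from a datetime index."""
    return pd.DataFrame({
        "year": index.year,
        "month_of_year": index.month,
        "week_of_year": index.isocalendar().week.to_numpy(),
        "day_of_month": index.day,
        "hour_of_day": index.hour,
        "day_of_week": index.dayofweek,                      # Monday = 0
        "type_of_day": (index.dayofweek >= 5).astype(int),   # 1 for weekends
    }, index=index)

idx = pd.date_range("2020-07-01", periods=48, freq="h")
cal = calendar_features(idx)
```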
The meteorological data contain air temperature, humidity, wind speed, and shortwave flux, obtained from NASA’s MERRA-2 reanalysis. Three MERRA-2 collections were used: one associated with atmospheric measurements at various altitudes, one pertaining to flux measurements such as energy or heat flux, and one focusing on radiation data, including radiative flux or cosmic radiation [85,86,87]. The choice of these specific meteorological variables was driven by their direct impact on water resources and hydrological processes. Temperature influences evaporation rates, which in turn affect water availability; higher temperatures can increase evaporation, reduce reservoir water levels, and impact the water supply. Humidity influences the rate of evaporation and transpiration from water bodies and vegetation; higher humidity can reduce evaporation rates and support water conservation. Wind speed can also influence the evaporation process, as higher wind speeds increase the evaporation rate and can lead to a decrease in water levels. Shortwave flux is related to solar radiation and affects temperature and evaporation rates, since increased solar radiation causes higher temperatures and faster evaporation.
Market data regarding Turkish electricity spot markets were derived from day-ahead, intraday, and balancing power markets’ prices, namely market clearing price (MCP), weighted average price (WAP), and system marginal price (SMP), respectively. Both energy and market data were acquired from the EXIST transparency platform [88].
The raw data set, obtained from publicly accessible sources, exhibited a limited proportion of missing entries and noisy measurements. To ensure data quality, a systematic preprocessing pipeline was implemented: (i) detection and removal of outliers based on statistical thresholds, (ii) correction or exclusion of erroneous records, and (iii) imputation of missing values using linear interpolation. These measures were adopted to mitigate the potential adverse effects of data quality issues on model training and forecasting performance.
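A minimal sketch of this three-step pipeline is shown below. The IQR-based fence is an assumption for illustration, since the paper does not state the exact statistical thresholds used:

```python
import numpy as np
import pandas as pd

def preprocess(series: pd.Series, k: float = 3.0) -> pd.Series:
    """Sketch of the pipeline: (i) flag outliers outside an IQR fence,
    (ii) exclude them as erroneous records, and (iii) fill the gaps by
    linear interpolation. The fence factor k = 3.0 is illustrative."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    # steps (i)+(ii): values beyond the fence become NaN
    cleaned = series.mask((series < q1 - k * iqr) | (series > q3 + k * iqr))
    # step (iii): linear interpolation over the remaining gaps
    return cleaned.interpolate(method="linear")

raw = pd.Series([10.0, 11.0, 10.5, 500.0, 11.5, np.nan, 10.8])
clean = preprocess(raw)
```

Here the spurious reading of 500.0 is removed and both gaps are filled from their neighbors.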
In addition to those, Figure 4 shows the Winsorized Pearson’s correlation map of the exogenous variables of the data set. A Winsorized form of Pearson’s correlation is a robust measure of correlation to evaluate the linear relationship between two independent variables while reducing the impact of outliers at the same time [89]. Within this context, the Winsorized Pearson’s correlation analysis was employed to systematically assess and illustrate the linear associations among variables under the reduced influence of extreme values, thereby offering readers a more reliable understanding of the data set’s structure and the potential relevance of predictors.
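For readers unfamiliar with the measure, the following sketch computes a Winsorized Pearson's correlation with NumPy. The clip-based Winsorization and the 5% tail limits are simplifying assumptions of ours; the paper does not report the limits used:

```python
import numpy as np

def winsorize(a, limits=(0.05, 0.05)):
    """Clip the lower and upper tails to the corresponding quantiles
    (a simplified, clip-based stand-in for full Winsorization)."""
    lo, hi = np.quantile(a, [limits[0], 1.0 - limits[1]])
    return np.clip(a, lo, hi)

def winsorized_pearson(x, y, limits=(0.05, 0.05)):
    """Pearson's r computed on Winsorized copies of both variables,
    which reduces the leverage of extreme values."""
    xw = winsorize(np.asarray(x, dtype=float), limits)
    yw = winsorize(np.asarray(y, dtype=float), limits)
    return float(np.corrcoef(xw, yw)[0, 1])

rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = 0.8 * x + 0.3 * rng.normal(size=300)
y[0] += 40.0                 # one extreme outlier
r = winsorized_pearson(x, y)
```

Because the outlier is clipped before the correlation is computed, the Winsorized estimate recovers the underlying linear association that a plain Pearson's r would understate.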

3.2. Methods

This study employed several gradient boosting algorithms, namely XGBoost, LightGBM, and CatBoost, to model the relationships within the data set, as shown in Figure 5. Gradient boosting is an ensemble learning technique that iteratively builds a strong model from a combination of weaker learners, typically decision trees. These algorithms were selected for their superior predictive performance, efficient computation, and robust hyperparameter optimization capabilities, which make them well suited for benchmarking HGF tasks in the recent literature. This section details the core mechanisms of each algorithm.

3.2.1. XGBoost

XGBoost is a gradient boosting algorithm proposed by Ref. [91]. Each tree aims to correct the errors or residuals left by its predecessors, and the final prediction is obtained by combining the outputs of all trees [92]. One of the strengths of XGBoost is its ability to effectively process tabular data and its transparency in model interpretation. The XGBoost mechanism is illustrated in Figure 6.
Recognized for its effectiveness across various predictive modeling tasks, XGBoost is a highly scalable machine learning system for tree boosting. It is a prominent implementation of gradient boosting machines (GBM), known for its superior performance in supervised learning tasks. It is suitable for both regression and classification problems [91].
XGBoost offers an open-source implementation of gradient boosting, optimized for high performance, flexibility, and portability. This library implements machine learning algorithms within the Gradient Boosting framework. Leveraging parallel tree boosting (commonly referred to as GBDT or GBM), XGBoost can effectively and rapidly address numerous data science challenges [94].
XGBoost constructs an additive expansion of the objective function by minimizing a loss function. Since decision trees are the sole base learners in XGBoost, a modified loss function is employed to regulate tree complexity.
The predicted value $\hat{y}_i$ is the cumulative sum of the outputs of all decision trees, as expressed in Equation (1):

$$\hat{y}_i = \phi(x_i) = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in \mathcal{F} \quad (1)$$

where $\mathcal{F}$ denotes the collection of decision trees, $f_k(x_i)$ represents the output generated by the $k$-th tree for the instance $x_i$, and $\hat{y}_i$ is the predicted value for the $i$-th instance $x_i$.
The algorithm progressively minimizes the objective function presented in Equation (2):

$$\mathcal{L}(\phi) = \sum_{i=1}^{n} l(\hat{y}_i, y_i) + \sum_{k=1}^{K} \Omega(f_k) \quad (2)$$

where $l$ denotes a differentiable convex loss function measuring the difference between the predicted value $\hat{y}_i$ and the actual target $y_i$. The second term, $\Omega$, imposes a penalty on the model’s complexity [95].
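The additive structure of Equation (1) and the residual-fitting idea behind it can be illustrated with a deliberately reduced implementation: depth-1 trees (stumps), squared-error loss, and no $\Omega$ penalty. This is a didactic sketch of gradient boosting, not the full XGBoost algorithm:

```python
import numpy as np

def fit_stump(x, r):
    """Least-squares regression stump (depth-1 tree) on residuals r."""
    best = None
    for d in np.unique(x)[:-1]:                      # candidate thresholds
        left, right = r[x <= d], r[x > d]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, d, left.mean(), right.mean())
    _, d, vl, vr = best
    return lambda q: np.where(q <= d, vl, vr)

def boost(x, y, n_trees=200, eta=0.1):
    """Additive expansion in the spirit of Equation (1): each stump is fit
    to the residuals of its predecessors, and the prediction is the
    shrunken sum of all stump outputs (squared-error loss, no penalty)."""
    pred, trees = np.zeros_like(y, dtype=float), []
    for _ in range(n_trees):
        f = fit_stump(x, y - pred)                   # fit current residuals
        pred = pred + eta * f(x)                     # shrink by learning rate
        trees.append(f)
    return lambda q: eta * sum(f(q) for f in trees)

x = np.linspace(0.0, 10.0, 100)
y = np.sin(x) + 0.5 * x
model = boost(x, y)
```

Each boosting round strictly reduces the squared training error, which is why the sum of many weak stumps can approximate the smooth target closely.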

3.2.2. LightGBM

LightGBM, introduced by Ref. [96], was designed to address the challenges of reduced accuracy and efficiency in GBDT when handling large-scale data sets [97]. This approach integrates Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) techniques into the GBDT framework. GOSS retains samples with large gradient values while randomly selecting samples with small gradient values and assigning them constant weights. By doing so, GOSS prioritizes undertrained samples while preserving the original data distribution, as illustrated in Figure 7 [98,99].
The GOSS procedure proposed by Ref. [96] can be summarized as follows:
1. Rank all training instances by the absolute values of their gradients in descending order.
2. Retain the top $a \times 100\%$ of instances with the largest gradients to form subset $A$.
3. From the remaining $(1-a) \times 100\%$ of instances with smaller gradients, randomly sample $b \times |A^c|$ instances to create subset $B$, where $A^c$ is the complement of $A$.
4. Determine the optimal split by evaluating the variance gain $\tilde{V}_j(d)$ over the combined set $A \cup B$.
The variance gain $\tilde{V}_j(d)$ is defined as follows:

$$\tilde{V}_j(d) = \frac{1}{n} \left[ \frac{\left( \sum_{x_i \in A_l} g_i + \frac{1-a}{b} \sum_{x_i \in B_l} g_i \right)^2}{n_l^j(d)} + \frac{\left( \sum_{x_i \in A_r} g_i + \frac{1-a}{b} \sum_{x_i \in B_r} g_i \right)^2}{n_r^j(d)} \right]$$

where
  • $A_l = \{x_i \in A : x_{ij} \le d\}$ and $A_r = \{x_i \in A : x_{ij} > d\}$ are the subsets of $A$ split by threshold $d$;
  • $B_l = \{x_i \in B : x_{ij} \le d\}$ and $B_r = \{x_i \in B : x_{ij} > d\}$ are the subsets of $B$ split similarly.
The coefficient $\frac{1-a}{b}$ is introduced to normalize the sum of gradients over $B$ back to the original size of $A^c$, ensuring that the smaller gradients in $B$ are properly scaled when calculating $\tilde{V}_j(d)$.
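Steps 1–3 of the GOSS procedure, together with the $(1-a)/b$ scaling of the small-gradient subset, can be sketched as follows. This is an illustrative reduction, not LightGBM's internal implementation:

```python
import numpy as np

def goss_sample(gradients, a=0.2, b=0.1, rng=None):
    """Keep the top a*100% of instances by |gradient| (set A), randomly
    draw b*|A^c| of the rest (set B), and return both index sets with
    the (1-a)/b weight applied to B's gradients in the variance gain."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(gradients)
    order = np.argsort(-np.abs(gradients))          # step 1: rank by |g|
    top_k = int(a * n)
    A = order[:top_k]                               # step 2: large-gradient set
    rest = order[top_k:]                            # A^c, the remainder
    B = rng.choice(rest, size=int(b * len(rest)), replace=False)  # step 3
    weight = (1 - a) / b                            # scale factor for B
    return A, B, weight
```

With the default `a = 0.2` and `b = 0.1`, only 28% of the instances are visited per split while the weighted gradient sums remain unbiased estimates of the full-data sums.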
LightGBM is widely applied to energy demand forecasting, solar energy forecasting, the optimization of energy distribution, and the development of effective planning strategies [100,101,102,103].

3.2.3. CatBoost

CatBoost is a gradient boosting algorithm developed by Ref. [104]. It is specifically designed to handle categorical data effectively while delivering high predictive performance. Each tree in CatBoost is trained to reduce residual errors left by its predecessors and combine their outputs to generate the final prediction. A unique strength of CatBoost lies in its ability to natively process categorical features without requiring preprocessing such as one-hot encoding.
CatBoost has been broadly applied to various energy prediction tasks including estimating building energy consumption, forecasting solar electricity generation, and predicting wind power output through hybrid models [105,106].
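CatBoost's native categorical handling relies on ordered target statistics: each row's encoding uses only the target values of rows that appear earlier in a random permutation, which avoids the target leakage of plain mean encoding. The sketch below is a simplified illustration of that idea, not CatBoost's exact scheme:

```python
import numpy as np

def ordered_target_stats(cats, target, prior=0.5, rng=None):
    """Simplified ordered target statistics: visit rows in a random
    permutation and encode each row's category using only the targets
    seen so far. `prior` acts as a smoothing pseudo-observation."""
    if rng is None:
        rng = np.random.default_rng(42)
    n = len(cats)
    enc = np.empty(n)
    sums, counts = {}, {}
    for i in rng.permutation(n):
        c = cats[i]
        s, k = sums.get(c, 0.0), counts.get(c, 0)
        enc[i] = (s + prior) / (k + 1)   # statistics from earlier rows only
        sums[c] = s + target[i]
        counts[c] = k + 1
    return enc

# Hypothetical hydrological categories, for illustration only
cats = np.array(["dry", "wet", "dry", "wet", "dry", "wet"])
y = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
enc = ordered_target_stats(cats, y)
```

Categories associated with high targets receive encodings above the prior and vice versa, yet no row ever sees its own target.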

3.2.4. Model Implementation

All selected ML models were implemented in Python 3.12.0 using the XGBoost 4.1.0, CatBoost 1.2.7, and LightGBM 2.1.3 packages, through their XGBRegressor [94], LGBMRegressor [107], and CatBoostRegressor [108] interfaces.

4. Results and Discussions

All computations in this study were performed on a computer running Windows 11 (version 23H2). The system featured an Intel i7-10870H processor at 2.20 GHz, 64 GB of RAM, and an NVIDIA RTX 2070 GPU with 8 GB of GDDR6 memory (256-bit). Jupyter Notebook (Version 7.4.5) was employed as the integrated development environment for Python, a widely used language for statistical analysis, data processing, and producing high-quality visualizations [109].
R2 and RMSSE were employed as evaluation metrics in this study. The coefficient of determination R2 is given by
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

In this expression, $i$ is the index running over observations from 1 to $n$, $y_i$ is the actual value for the $i$-th observation, $\hat{y}_i$ is the predicted value for the $i$-th observation, $\bar{y}$ is the mean of all $y_i$, and $n$ is the total number of observations [110].
RMSSE is defined as

$$\mathrm{RMSSE} = \sqrt{ \frac{ \frac{1}{h} \sum_{i=n+1}^{n+h} (y_i - \hat{y}_i)^2 }{ \frac{1}{n-1} \sum_{i=2}^{n} (y_i - y_{i-1})^2 } }$$

Here, the index $i$ runs through time from 1 to $n+h$, where $n$ is the last time point in the training set and $h$ is the forecast horizon. The numerator $\frac{1}{h} \sum_{i=n+1}^{n+h} (y_i - \hat{y}_i)^2$ measures the average squared error of the predictions from $i = n+1$ to $i = n+h$, while the denominator $\frac{1}{n-1} \sum_{i=2}^{n} (y_i - y_{i-1})^2$ is the average squared difference between consecutive actual values in the training period [111].
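Both evaluation metrics translate directly into NumPy. The sketch below follows the definitions given above, with `y_train` supplying the one-step naive-error denominator of the RMSSE:

```python
import numpy as np

def r2(y, yhat):
    """Coefficient of determination, as defined above."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmsse(y_train, y_test, yhat):
    """Root mean squared scaled error: forecast MSE over the horizon h,
    scaled by the mean squared one-step naive error on the training set."""
    num = np.mean((y_test - yhat) ** 2)        # (1/h) * sum of squared errors
    den = np.mean(np.diff(y_train) ** 2)       # (1/(n-1)) * sum of squared diffs
    return float(np.sqrt(num / den))
```

An RMSSE of 1.0 means the forecast errs, on average, as much as a one-step naive forecast did on the training period; values below 1.0 indicate an improvement over that baseline.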
For model testing and evaluation, a random sampling method was applied to XGBoost, LightGBM, and CatBoost to select 80% of the data for training and 20% for testing [112].
This paper focused on the optimization of two crucial hyperparameters, the number of trees and the learning rate (η), to improve the performance and generalization capabilities of the three boosting methods. Because these two factors directly influence model complexity and learning dynamics, selecting the right combinations can lead to substantial performance gains while mitigating risks such as overfitting or underfitting. The number of trees was varied from 100 to 1000 in steps of 100 to capture various degrees of model expressiveness and computational cost.
Learning rates were likewise examined from 0.05 to 0.35 in steps of 0.05, recognizing that smaller values generally promote stable learning but require more iterations, whereas larger values can accelerate training yet increase the likelihood of overshooting during optimization. To comprehensively assess model performance, R2 and RMSSE were calculated during the prediction process to compare the obtained results.
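The resulting 10 × 7 grid (70 configurations per model) can be swept exhaustively. In the sketch below, `train_and_score` is a hypothetical callable standing in for fitting a boosting model and returning its RMSSE on the test split; the toy score surface is purely illustrative:

```python
import itertools

TREES = range(100, 1001, 100)                        # 100 to 1000 by 100
RATES = [round(0.05 * k, 2) for k in range(1, 8)]    # 0.05 to 0.35 by 0.05

def grid_search(train_and_score):
    """Exhaustive sweep over the hyperparameter grid used in the paper.
    `train_and_score(n_trees, lr)` must return a test-set RMSSE; model
    fitting itself is omitted here."""
    results = [(n, lr, train_and_score(n, lr))
               for n, lr in itertools.product(TREES, RATES)]
    return min(results, key=lambda t: t[2])          # best = lowest RMSSE

# Illustrative score surface favoring many trees and a learning rate near 0.10
best = grid_search(lambda n, lr: 0.12 + (1000 - n) * 1e-5 + abs(lr - 0.10))
```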
As seen in Figure 8, Figure 9 and Figure 10, the three tree-based ML models consistently showed that adding more trees (typically up to around 700–800) drove strong gains in R2 and lowered RMSSE, reflecting an improved balance between underfitting and overfitting. Beyond that range, improvements in predictive performance began to plateau, highlighting diminishing returns when trading training time for slight increases in model accuracy.
The learning rate ( η ) likewise played a pivotal role, with smaller values in the 0.05–0.10 range delivering more stable and ultimately higher R2 at the cost of longer training times. Higher rates, usually 0.20–0.25 or above, converged more quickly but risked overshooting the loss surface and ending up in suboptimal regimes, resulting in flatter or even declining gains in R2.
Among the three models tested, LightGBM emerged as the fastest, requiring only 1–2.5 s to reach 700–1000 trees, all while achieving the top R2 (around 97.0–97.1%) and the lowest RMSSE (about 0.120–0.121). CatBoost and XGBoost tended to take longer (up to 4–5 s), yet still attained competitive R2 scores near 96.8–96.9% and RMSSE around 0.122–0.127. Overall, these results underscore that choosing a moderate learning rate alongside 700–800 trees typically strikes the best compromise between model accuracy and computational cost, although the final decision should be guided by the specific time and performance needs of the application.
Table 4 presents the top five results sorted by RMSSE for each model: LightGBM, CatBoost, and XGBoost. Among the three, LightGBM achieves the best overall performance with an RMSSE of 0.1217 and an R2 of 97.07%, using 1000 trees and a learning rate of 0.10. The next four LightGBM configurations, ranging from 900 to 600 trees with the same learning rate, maintain low RMSSE values while exhibiting a gradual decline in performance.
CatBoost’s top configurations start in the sixth position overall, with its best result (1000 trees and a learning rate of 0.15) achieving an RMSSE of 0.1242 and an R2 of 96.94%. The remaining top CatBoost configurations show slightly higher RMSSE values but stay competitive with LightGBM, showcasing the importance of its hyperparameter tuning.
XGBoost’s top results rank lower, starting at the 11th position. Its best configuration (900 trees and a learning rate of 0.15) achieves an RMSSE of 0.1273 and an R2 of 96.79%. While its performance is respectable, XGBoost lags behind both LightGBM and CatBoost in this comparison. The remaining XGBoost configurations exhibit similar RMSSE values but fail to match the performance of the other two models.
These results emphasize the importance of hyperparameter optimization in determining model performance. LightGBM emerges as the most effective model in this evaluation, particularly for minimizing RMSSE.
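Since the model ranking here hinges on RMSSE, it may help to recall how the metric is computed. The sketch below uses the common definition, the forecast RMSE scaled by the RMSE of a one-step naive forecast on the training series (as popularized by the M5 forecasting competition); the paper's exact variant is assumed rather than quoted, so treat this as illustrative:

```python
import math


def rmsse(y_train, y_true, y_pred):
    """Root mean squared scaled error: forecast MSE divided by the mean
    squared one-step difference of the training series, then square-rooted."""
    mse = sum((a - f) ** 2 for a, f in zip(y_true, y_pred)) / len(y_true)
    scale = sum((y_train[i] - y_train[i - 1]) ** 2
                for i in range(1, len(y_train))) / (len(y_train) - 1)
    return math.sqrt(mse / scale)
```

A perfect forecast yields 0, and matching the naive one-step baseline yields 1, so values such as 0.1217 indicate errors well below the natural hour-to-hour variability of the series.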
From the results in Table 4, the configuration with a learning rate of 0.10 and 1000 trees was selected for each model (LightGBM, CatBoost, and XGBoost). The predictions generated by these models are plotted against the actual energy values for the period between 21 April 2022 and 25 April 2022, as shown in Figure 11.
In Figure 11, all three models closely follow the actual energy values, demonstrating their ability to capture the general trends in the data set. However, there are slight deviations in specific regions where the energy values exhibit sharp changes. LightGBM predictions align almost perfectly with the actual data across the time range. Its ability to accurately capture both stable and transitional periods makes it the most robust model in this comparison.
CatBoost predictions also closely follow the actual values but show small deviations in certain periods, particularly during sharp transitions. This could indicate slight sensitivity to rapid changes in the data.
XGBoost predictions demonstrate good performance but exhibit slightly larger deviations during sharp transitions compared to LightGBM and CatBoost. This is consistent with its ranking in terms of RMSSE in Table 4.
LightGBM emerges as the most accurate model based on the visual analysis of predictions and its superior RMSSE performance. CatBoost and XGBoost also perform well but are slightly less precise during transitions. This highlights the importance of hyperparameter optimization and model selection in improving prediction accuracy for time-series data.
Although these results are specific to the Ceyhan River basin, the proposed framework is not inherently site-dependent. Because it integrates hydrological, meteorological, and market variables, it has potential applicability to other river systems. However, transferring the framework to a new basin would require retraining the model with basin-specific data, as well as independent testing and validation to ensure robustness under different hydrological regimes. Performance may vary depending on the basin characteristics, and additional cross-basin experiments would be necessary to confirm its generalizability.
Another notable aspect of the proposed approach is its suitability for real-time forecasting. The boosting models employed offer fast inference, so forecasts can be updated rapidly as new process, meteorological, or market data becomes available. Retraining on a rolling window, combined with drift monitoring, would further help maintain forecast accuracy in an operational setting without significant computational overhead. Libraries such as XGBoost and LightGBM support incremental updates, enabling the model to incorporate recent data quickly, while periodic full retraining maintains long-term stability.
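As a deliberately model-agnostic illustration of the rolling-window and drift-monitoring idea, the sketch below keeps only the most recent observations and retrains whenever the latest absolute error far exceeds its recent average. The `fit` callable, window length, and drift threshold are illustrative assumptions, not part of the paper's pipeline:

```python
from collections import deque
import statistics


class RollingForecaster:
    """Rolling-window retraining with simple drift monitoring (sketch).

    `fit(xs, ys)` is any training routine returning a predict(x) callable,
    e.g. a boosted-tree fit; only the retraining logic is shown here.
    """

    def __init__(self, fit, window=720, drift_factor=2.0):
        self.fit = fit
        self.xs = deque(maxlen=window)      # most recent `window` samples
        self.ys = deque(maxlen=window)
        self.errors = deque(maxlen=window)  # recent absolute errors
        self.drift_factor = drift_factor
        self.model = None

    def update(self, x, y):
        """Record a new (features, generation) pair; retrain on drift."""
        if self.model is not None:
            self.errors.append(abs(self.model(x) - y))
        self.xs.append(x)
        self.ys.append(y)
        drifted = (
            len(self.errors) > 1
            and self.errors[-1] > self.drift_factor * statistics.mean(self.errors)
        )
        if self.model is None or drifted:
            self.model = self.fit(list(self.xs), list(self.ys))

    def predict(self, x):
        return self.model(x)
```

In practice, the retraining step could reuse an already fitted booster rather than start from scratch: LightGBM's `lgb.train` accepts an `init_model` and XGBoost's `xgb.train` an `xgb_model` argument for continued training, which is the incremental-update mechanism referred to above.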

5. Conclusions

Given the increasing role of hydroelectricity in achieving sustainable energy goals, this study benchmarks advanced tree-based machine learning models—XGBoost, LightGBM, and CatBoost—to enhance HGF accuracy. Unlike traditional methods, the proposed approach incorporates basin-wide hydrological and meteorological data, offering a comprehensive view of the factors influencing energy production.
The findings highlight the robustness of the selected models in managing the complexities of forecasting in a dynamic environment. Evaluated through R2 and RMSSE metrics, the results demonstrate the models’ ability to deliver accurate and reliable predictions. Among the three, LightGBM emerged as the most accurate model, achieving the lowest RMSSE (0.1217) and the highest R2 (97.07%), followed closely by CatBoost and XGBoost. The predictions plotted against actual energy values show all models effectively capturing overall trends, though minor deviations are observed during sharp transitions, with LightGBM consistently outperforming the other models in both accuracy and stability.
This study is among the first to integrate upstream hydrological data into hydroelectricity forecasting, representing a significant contribution to the field. The proposed framework provides a strong foundation for optimizing energy dispatch, improving water resource management, and maintaining grid stability. Additionally, the model framework can be applied to other basins with similar operational characteristics, advancing both theoretical and practical applications in sustainable energy management.
The insights gained pave the way for further exploration of hybrid methodologies, real-time implementation strategies, and the integration of additional data sources to enhance predictive accuracy and operational efficiency.
It is important to acknowledge the potential risks of over-reliance on ML-based forecasting in hydroelectric dispatching. Forecast errors during extreme inflow events may cause operational issues such as spillage, inefficient reservoir management, or ecological flow violations. To mitigate these risks, the forecasting models should be applied as a decision-support tool rather than a fully automated dispatch system, complemented by rule curves, operational constraints, and prediction intervals. Incorporating hybrid approaches that combine machine learning with hydrological knowledge represents a promising direction for future work.

Author Contributions

Conceptualization, B.A.A. and K.Z.; Data Curation: B.A.A.; Formal Analysis: B.A.A. and K.Z.; Investigation: B.A.A.; Methodology: B.A.A. and K.Z.; Project Administration: K.Z.; Resources: B.A.A. and K.Z.; Software: B.A.A. and K.Z.; Supervision: K.Z.; Validation: B.A.A.; Visualization: B.A.A. and K.Z.; Writing—Original Draft: B.A.A.; and Writing—Review and Editing: K.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

To foreground reproducibility, the analyses in this paper were carried out with the Python programming language on a publicly available data set. The data set can be accessed by sending an e-mail to the corresponding author.

Acknowledgments

The authors thank the anonymous reviewers for their valuable comments and suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ABC: Artificial Bee Colony
ABDT: Adaptive Boosting Decision Trees
ABLR: Adaptive Boosting Linear Regression
AE: Autoencoder
AI: Artificial Intelligence
ANFIS: Adaptive Neuro-Fuzzy Inference System
ANN: Artificial Neural Network
ARIMA: Auto-Regressive Integrated Moving Average
AWT: Adaptive Wavelet Transform
CatBoost: Categorical Boosting
DL: Deep Learning
DNN: Deep Neural Network
EEMD: Ensemble Empirical Mode Decomposition
ELM: Extreme Learning Machines
EXIST: The Energy Exchange Istanbul
GA: Genetic Algorithm
GBDT: Gradient Boosted Decision Trees
GBM: Gradient Boosting Machine
GOSS: Gradient-based One Side Sampling
GPR: Gaussian Process Regression
GWO: Grey Wolf Optimization
HGF: Hydroelectricity Generation Forecasting
HPP: Hydroelectric Power Plant
kNN: K-Nearest Neighbor
LightGBM: Light Gradient Boosting Machine
LSTM: Long Short-Term Memory
LWNRBF: Linear Weighted Normalized Radial Basis Function
MAE: Mean Absolute Error
MAPE: Mean Absolute Percentage Error
MERRA-2: Modern-Era Retrospective Analysis for Research and Applications, Version 2
ML: Machine Learning
MLP: Multilayer Perceptron
MLR: Multiple Linear Regression
MSE: Mean Squared Error
NSE: Nash–Sutcliffe Efficiency
R2: Coefficient of Determination
RBF: Radial Basis Function
RF: Random Forest
RMSE: Root Mean Squared Error
RMSPE: Root Mean Squared Percentage Error
RMSSE: Root Mean Squared Scaled Error
RNN: Recurrent Neural Networks
SARIMA: Seasonal ARIMA
SVM: Support Vector Machine
SVR: Support Vector Regression
WDS: Water Distribution Systems
XGBoost: Extreme Gradient Boosting

  109. What is Python Used For? 8 Real-Life Python Uses. 2024. Available online: https://www.datacamp.com/blog/what-is-python-used-for (accessed on 23 September 2025).
  110. Timur, O.; Zor, K.; Çelik, Ö.; Teke, A.; İbrikçi, T. Application of Statistical and Artificial Intelligence Techniques for Medium-Term Electrical Energy Forecasting: A Case Study for a Regional Hospital. J. Sustain. Dev. Energy Water Environ. Syst. 2020, 8, 520–536. [Google Scholar] [CrossRef]
  111. Makridakis, S.; Spiliotis, E.; Assimakopoulos, V. M5 accuracy competition: Results, findings, and conclusions. Int. J. Forecast. 2022, 38, 1346–1364. [Google Scholar] [CrossRef]
  112. Cebeci, C.; Zor, K. Electricity Demand Forecasting Using Deep Polynomial Neural Networks and Gene Expression Programming During COVID-19 Pandemic. Appl. Sci. 2025, 15, 2843. [Google Scholar] [CrossRef]
Figure 1. Geographical representation of the HPPs in Table 1 (source: Google Earth).
Figure 1. Geographical representation of the HPPs in Table 1 (source: Google Earth).
Applsci 15 10514 g001
Figure 2. Hourly generated energy of Aslantaş HPP.
Figure 2. Hourly generated energy of Aslantaş HPP.
Applsci 15 10514 g002
Figure 3. Weekly aggregated generation data of all HPPs within the Ceyhan basin.
Figure 3. Weekly aggregated generation data of all HPPs within the Ceyhan basin.
Applsci 15 10514 g003
Figure 4. Winsorized Pearson’s correlation map of input variables [90].
Figure 4. Winsorized Pearson’s correlation map of input variables [90].
Applsci 15 10514 g004
Figure 5. Simplified flowchart of the applied methodology.
Figure 5. Simplified flowchart of the applied methodology.
Applsci 15 10514 g005
Figure 6. Illustration of the XGBoost concept [93].
Figure 6. Illustration of the XGBoost concept [93].
Applsci 15 10514 g006
Figure 7. Demonstration of the LightGBM concept [98].
Figure 7. Demonstration of the LightGBM concept [98].
Applsci 15 10514 g007
Figure 8. Visualization of hyperparameter tuning for XGBoost.
Figure 8. Visualization of hyperparameter tuning for XGBoost.
Applsci 15 10514 g008
Figure 9. Demonstration of hyperparameter tuning for LightGBM.
Figure 9. Demonstration of hyperparameter tuning for LightGBM.
Applsci 15 10514 g009
Figure 10. Illustration of hyperparameter tuning for CatBoost.
Figure 10. Illustration of hyperparameter tuning for CatBoost.
Applsci 15 10514 g010
Figure 11. Prediction comparison of models.
Figure 11. Prediction comparison of models.
Applsci 15 10514 g011
Table 1. HPPs installed on the Ceyhan River [15].
Table 1. HPPs installed on the Ceyhan River [15].
OwnerAltitudeInstalled PowerCF *
NumberNameStatus(m)(MW)(%)
1Dağdelen HPPPrivate11118.0037.7
2Kandil HPPPrivate1087207.9226.7
3Sarıgüzel HPPPrivate870103.0030.9
4Hacınınoğlu HPPPrivate749140.0025.4
5Menzelet HPPPrivate560124.0044.1
6Kılavuzlu HPPPrivate48954.0038.6
7Sır HPPState420283.5020.4
8Berke HPPState340510.0027.6
9Aslantaş HPPState145138.0036.6
* Capacity factor is selected as the highest value obtained between 2021 and 2023.
Table 2. Summary of studies on HPP modeling using ML-based methods.
Table 2. Summary of studies on HPP modeling using ML-based methods.
YearRef.LocationCapacityMethodsOutputMetrics
2020[53]Tarbela HPP, Pakistan4.88 MWMLR, kNN, SVR, RF, LSTMDaily2.47 kWh (MAE), 3.98 kWh (RMSE)
2021[54]Almus HPP, Türkiye27 MWDT, GBDT, RF, GLMonthly0.717 GBDT (Corr.)
2021[73]Dinar 2 HPP, Türkiye3 MWkNN, SVR, RF, GA, DNN, RNN, AEHourly1.904 kWh (MAE), 2.841 kWh (RMSE)
2021[75]Mahabad HPP, Iran6 MWAWT, LSTM, RFDaily2.154 kWh (MAE), 5.261 kWh (RMSE), 98.7% (R2)
2022[84]Gorno-Badakhshan HPPs, TajikistanN/ALR, kNN, ABDT, ABLR, RF, XGBoost, MLPDaily5.23% (MAPE)
2023[58]Yunnan, ChinaN/AXGBoost, GMQuarter Hourly97.14% (Acc.)
2024[59]Skawa HPP, Poland760 kWRF, GBDT, MLP, RBFDaily10.96 kWh (MAE), 3.41% (MAPE)
Table 3. Features of the data set.
Table 3. Features of the data set.
CategoryFeatureDescriptionUnits
EnergyEnergyLag1hHourly generation lagged by 1 hMWh
EnergyEnergyLag1dHourly generation lagged by 1 dayMWh
EnergyEnergyLag1wHourly generation lagged by 1 weekMWh
EnergyDağdelen HPPHourly generationMWh
EnergyKandil HPPHourly generationMWh
EnergySarıgüzel HPPHourly generationMWh
EnergyHacınınoğlu HPPHourly generationMWh
EnergyMenzelet HPPHourly generationMWh
EnergyKılavuzlu HPPHourly generationMWh
EnergySır HPPHourly generationMWh
EnergyBerke HPPHourly generationMWh
WeatherQV2MSpecific humidity at 2 mkg/kg
WeatherU2MEast–west wind components at 2 mm/s
WeatherV2MNorth–south wind components at 2 mm/s
WeatherT2MTemperature at 2 mC
WeatherTQITotal column ice water contentkg/m2
WeatherTQLTotal column liquid water contentkg/m2
WeatherTQVTotal column vapor contentkg/m2
WeatherSWTDNTOA incoming shortwave fluxW/m2
WeatherSWGDNSurface incoming shortwave fluxW/m2
WeatherPRECTOTTotal precipitationmm
WeatherPREVTOTTotal column re-evap of precipitationmm
WeatherPRECSNOSnowfall precipitationmm
MarketMCPMarket clearing priceTRY
MarketWAPWeighted average priceTRY
MarketSMPSystem marginal priceTRY
Table 4. Five best results of the proposed models.
Table 4. Five best results of the proposed models.
TreeLearningR2 Computational
ModelSizeRate(%)RMSSETime (s)
1LightGBM10000.1097.070.12171.240
2LightGBM9000.1097.060.12191.192
3LightGBM8000.1097.050.12201.066
4LightGBM7000.1097.040.12210.894
5LightGBM6000.1097.030.12230.768
6CatBoost10000.1596.940.12424.832
7CatBoost9000.1596.930.12454.316
8CatBoost10000.2096.910.12494.971
9CatBoost8000.2096.900.12503.328
10CatBoost9000.2096.900.12504.049
11XGBoost9000.1596.790.12732.007
12XGBoost6000.1596.780.12741.349
13XGBoost7000.1596.780.12741.600
14XGBoost8000.1596.780.12741.797
15XGBoost10000.1596.780.12742.274
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Atalay, B.A.; Zor, K. An Innovative Approach for Forecasting Hydroelectricity Generation by Benchmarking Tree-Based Machine Learning Models. Appl. Sci. 2025, 15, 10514. https://doi.org/10.3390/app151910514

AMA Style

Atalay BA, Zor K. An Innovative Approach for Forecasting Hydroelectricity Generation by Benchmarking Tree-Based Machine Learning Models. Applied Sciences. 2025; 15(19):10514. https://doi.org/10.3390/app151910514

Chicago/Turabian Style

Atalay, Bektaş Aykut, and Kasım Zor. 2025. "An Innovative Approach for Forecasting Hydroelectricity Generation by Benchmarking Tree-Based Machine Learning Models" Applied Sciences 15, no. 19: 10514. https://doi.org/10.3390/app151910514

APA Style

Atalay, B. A., & Zor, K. (2025). An Innovative Approach for Forecasting Hydroelectricity Generation by Benchmarking Tree-Based Machine Learning Models. Applied Sciences, 15(19), 10514. https://doi.org/10.3390/app151910514

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop