1. Introduction
In recent decades, the global energy sector has undergone significant transformations, driven by growing energy demand, energy security concerns and the urgency of reducing greenhouse gas emissions [1,2]. In this context, monitoring and analyzing the global energy system is a strategic priority for policy makers, researchers and economic actors. At the same time, the digitization of energy infrastructures has generated a considerable amount of data, the analytical potential of which remains underutilized in the absence of advanced computational methods [3]. Moreover, previous research has shown that digital development and technology adoption differ significantly across countries, highlighting the need for data-driven analytical approaches when comparing socio-economic and technological indicators at the national level [4].
Machine learning (ML) offers a promising methodological framework for extracting knowledge from the complex, dynamic and multidimensional datasets characteristic of modern energy systems. In contrast to traditional statistical models, which assume linear relationships or predefined functional forms between variables, ML algorithms can learn nonlinear relationships directly from data, including relationships that are difficult to capture with conventional methods [5]. This flexibility is essential in analyzing energy systems, where the interactions between economic, technological, climatic and political factors are highly varied and interdependent.
ML applications in the energy domain have expanded considerably over the last decade due to the digitization of the energy infrastructure and the increased availability of high-resolution data. Some of the most common applications include forecasting energy demand on an hourly, daily, or seasonal basis in order to balance production and consumption in real-time [6,7,8]; detecting anomalies in power grids and identifying unusual consumer or equipment behaviors [9,10]; and optimizing the operation of smart energy grids by adjusting real-time power flows and integrating intermittent renewable sources [11,12].
Relevant to the present study, ML is increasingly being used to identify and model macroenergy trends on a global scale, providing analytical support for long-term forecasting and simulation of energy transition scenarios [13,14,15]. ML algorithms, such as neural networks, decision trees or ensemble methods (e.g., random forest, gradient boosting), can be trained on complex socio-economic, climate and technological datasets to extract recurrent patterns and estimate the evolution of consumption, energy mix or CO2 emissions according to various explanatory factors.
Although ML is increasingly being applied in energy studies, there are still research gaps that deserve attention from researchers. First, most existing studies focus on single-region datasets, individual renewable technologies, or specific forecasting tasks. There are relatively few studies in the literature that provide a comprehensive, multidimensional analysis of renewable electricity generation. Second, there is limited systematic comparison of different machine learning paradigms in both full-featured and reduced-featured scenarios. Finally, the interpretability of high-performance machine learning models remains a significant challenge, especially when these models are used to inform energy policy and strategic planning.
To address these gaps, this study introduces an integrated machine learning framework for analyzing renewable electricity generation using a global dataset spanning 176 countries from 2000 to 2020. The framework incorporates ten feature selection techniques, four predictive models (ANN, GBR, XGBoost, and Random Forest), and SHAP-based explainability. This approach allows for comprehensive assessment of predictive accuracy, dimensionality–efficiency tradeoffs, and interpretability of key factors influencing renewable electricity generation.
This study aims to investigate the applicability of machine learning techniques in data-driven analysis of electricity generation from renewable sources in the sustainable energy sector. It examines the relevant data sources, the models used, and their performance in extracting meaningful information to support strategic decisions. Through this approach, the study contributes to the development of an integrated analytical framework for interpreting developments in the global energy sector.
The remainder of this paper is structured as follows: the next section provides a review of the relevant literature concerning the research topic, followed by a comprehensive presentation of the methodology employed for the development and evaluation of the predictive models. The subsequent section presents and analyzes the obtained results. Finally, the paper concludes with a discussion of the main findings, their implications for global energy dynamics, the limitations of the study, and potential directions for future research.
Although energy analysis using machine learning techniques has been extensively studied, most existing research focuses on single-region datasets or specific energy subsectors (e.g., electricity or renewables). The main scientific contributions and novelties of the study are:
Integrated feature selection and interpretability framework: The study systematically compares ten feature selection techniques (filter, wrapper, and hybrid) and evaluates how dimensionality reduction affects model accuracy and efficiency—an aspect rarely investigated in sustainable energy research.
Comparative evaluation across four complementary ML paradigms: The study performs rigorous benchmarking of ANN, Gradient Boosting, XGBoost, and Random Forest under two experimental scenarios (full and reduced feature sets), highlighting the trade-offs between accuracy, complexity, and explainability.
Explainable AI for global energy insights: By applying SHAP (Shapley Additive Explanations), we quantify the contribution of each input feature to model outputs—providing interpretable, policy-relevant insights into global energy transition dynamics.
Evidence for dimensionality–efficiency balance: The results demonstrate that moderate dimensionality reduction improves computational efficiency without sacrificing accuracy, offering a generalizable methodological guideline for large-scale ML applications in energy analysis.
2. Literature Review
The specialized literature on the application of machine learning algorithms in the energy field has grown rapidly in recent years, with significant results in areas such as renewable generation forecasting, demand estimation, predictive maintenance and anomaly detection. However, a critical review highlights both important merits and persistent limitations within these contributions.
Recent advances in machine learning have led to significant improvements in the accuracy and reliability of predictive models used in renewable electricity generation. These advances have facilitated the implementation of data-driven decision-making processes essential for strategic planning and energy policy development [11,15]. Machine learning methods are increasingly integrated with geospatial, climatic, and socio-economic variables, which allows for increased spatial and temporal resolution of renewable energy resource forecasts [6,16].
One of the major challenges in this field is the interpretability of models, especially those based on deep neural networks, which are often perceived as “black boxes” [17]. Methods such as Grad-CAM or attention map visualization techniques have been proposed to increase the decision transparency of models, but they still do not provide complete interpretability, which limits widespread adoption in critical domains [18,19].
In terms of the types of models used, artificial neural networks (ANNs) are particularly effective in capturing complex nonlinear dependencies in energy data [5,20]. Applications include energy consumption forecasting, photovoltaic production estimation, and data analysis of smart energy management systems [6,9]. At the same time, ensemble techniques, such as Gradient Boosting Regressor (GBR), have also proven to perform extremely well in regression tasks on structured data by iteratively combining multiple weak models to reduce prediction error [21,22,23].
Although ML techniques have become increasingly relevant for energy-related analysis, their use requires both methodological discipline and interpretability. Artificial Neural Networks, for example, can learn complex nonlinear relationships from data through layered structures of interconnected processing units, making them suitable for multidimensional prediction problems [24]. However, as machine learning models become more complex, their interpretability becomes a central concern. Recent research has highlighted the importance of evaluating explainable, interpretable, and non-interpretable algorithms, especially when model outputs are intended to support decision-making [25]. In this context, the present study includes predictive modeling, feature selection, and SHAP-based explainability to provide simultaneously accurate and interpretable knowledge of renewable electricity generation.
Recent studies in the field emphasize the importance of combining machine learning with feature selection. Sim et al. proposed an XAI-based input variable selection method that led to the identification of relevant predictors and improved model efficiency [26]. Panagoulias et al. demonstrated that AI-based energy management systems should provide transparent and personalized explanations to support trust and decision-making [27]. Shin et al. concluded that improved data processing and machine learning techniques are essential to enhance the efficiency and reliability of forecasting models [28]. These studies reinforce the relevance of the current approach, which integrates feature selection, predictive modeling, and SHAP-based explainability for the analysis of renewable electricity generation.
Moreover, the integration of multidisciplinary data (geospatial, climatic, socio-economic) into ML models has improved both the accuracy of predictions and their adaptability to regional variability in renewable resources [29,30]. However, the models have difficulty generalizing to regions with insufficient data or with characteristics that differ from those in the training sets, highlighting the need for robust approaches capable of handling these limitations [31].
Another emerging field is physics-informed machine learning (PIML), which integrates physical laws and domain knowledge into the learning process and thereby increases the robustness and interpretability of models in energy applications [32,33].
Machine Learning (ML) has been established as an essential component in the modernization of the energy sector, providing advanced solutions for forecasting, optimization and automation. The implementation of these techniques helps to increase operational efficiency, integrate renewables and reduce environmental impacts [33,34]. The integration of Machine Learning within Industry 4.0 has a profound impact by enhancing process optimization and operational efficiency, while providing the strategic advantage of adaptability to market dynamics and sustainable technological integration. Moreover, ML-driven predictive and adaptive capabilities enable real-time decision-making, fostering innovation and creating resilient, data-driven industrial ecosystems [35,36].
Table 1 provides a structured overview of key use cases where machine learning (ML) techniques have been successfully implemented. It outlines five main application areas: energy demand forecasting, renewable energy production estimation, predictive maintenance and anomaly detection, energy consumption optimization in smart buildings, and energy market price forecasting. For each application, specific ML algorithms are listed (including Random Forest, Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNNs), and hybrid models) alongside the primary benefits, such as improved grid balancing, enhanced reliability of renewable sources, increased operational efficiency, and decision-making support for stakeholders. The information is synthesized from multiple studies [34,35,36,37,38,39,40,41,42,43], highlighting the growing relevance of ML in addressing complex challenges within the energy domain.
Overall, existing contributions provide a solid basis for the application of ML in energy, but there is a clear need to extend the research towards studies in real-world environments, explainable AI models, rigorous benchmarking and sustainable integration of algorithms into the decision-making infrastructure of energy systems. More attention should also be paid to ethical components, social impact and algorithmic transparency, especially as ML models begin to influence critical processes in energy networks.
The application of machine learning in the energy sector has expanded significantly, covering a wide range of applications that contribute to the modernization and digitalization of energy infrastructures. From forecasting demand and renewable generation, to optimizing consumption in smart buildings and modeling energy markets, ML algorithms provide efficient, scalable and adaptable solutions [45,46].
However, the success of these applications depends largely on the quality of the data available, the ability of the models to generalize to a variety of contexts, and the integration of physical or economic knowledge into the learning process [47,48]. The future evolution of these systems will therefore require sustained interdisciplinary efforts to develop more robust, transparent models that are better grounded in the operational reality of the energy sector.
Collectively, these studies highlight the importance of integrating machine learning into sustainable energy research while ensuring a balanced evaluation of accuracy, robustness, and interpretability. Building on these insights, the following section details the methodological framework adopted in this study. It describes the datasets used, the preprocessing procedures, and the design of the predictive models.
3. Methodology
Figure 1 illustrates the methodology adopted in this study, based on a dataset comprising 3649 instances and 21 columns reflecting global energy consumption patterns and indicators. The data were collected from 176 countries over the period 2000–2020. The first stage of the framework is data description and exploratory data analysis to understand the structure and underlying distributions. Data preprocessing follows, including splitting the dataset into training and testing subsets (80% training) as well as cleaning and formatting procedures. In the feature selection step, we designed several experiments to decide whether to keep all the features or to carry out the next stages with only the attributes considered relevant to our analysis. Four predictive models are employed in parallel: an Artificial Neural Network (ANN), a Gradient Boosting Regressor (GBR), eXtreme Gradient Boosting (XGBoost), and Random Forest (RF). Each undergoes hyperparameter selection, model training, and performance evaluation. The final step involves comparing the results of the models to assess their predictive capacity and derive insights relevant to the global energy sector.
Because ensemble models performed very well in predictions, additional steps were taken to reduce the risk of data leakage between training and testing. Specifically, the PanelSplit method was used to preserve the temporal order of the data, so that future data were not used during model training. Furthermore, the data processing steps were fitted only on the training data, separately for each validation step.
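The idea of temporally ordered validation can be illustrated with a minimal sketch, assuming an expanding window over calendar years. The function `expanding_year_splits` and the toy panel are hypothetical and written with the standard library only; the sketch does not reproduce the PanelSplit library's API or the exact configuration used in the study.

```python
from typing import Iterator, List, Tuple


def expanding_year_splits(
    years: List[int], n_splits: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) pairs in which every training row
    comes from a year strictly before the validation year."""
    unique_years = sorted(set(years))
    if n_splits >= len(unique_years):
        raise ValueError("n_splits must be smaller than the number of distinct years")
    # The last n_splits years each serve once as the validation year.
    for val_year in unique_years[-n_splits:]:
        train_idx = [i for i, y in enumerate(years) if y < val_year]
        val_idx = [i for i, y in enumerate(years) if y == val_year]
        yield train_idx, val_idx


# Toy panel (assumption): two countries observed yearly from 2000 to 2004.
years = [2000, 2001, 2002, 2003, 2004] * 2
splits = list(expanding_year_splits(years, n_splits=2))
for train_idx, val_idx in splits:
    train_years = sorted({years[i] for i in train_idx})
    val_years = sorted({years[i] for i in val_idx})
    print(train_years, "->", val_years)
```

Any imputation or scaling statistics would then be computed from `train_idx` rows only and applied to the `val_idx` rows, which is the per-fold processing described above.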
3.1. Modeling Approach
To construct prediction models, this study used four machine learning algorithms—ANN, GBR, XGBoost, and RF. The ANN model is particularly useful in capturing complex, nonlinear relationships between multiple energy indicators and consumption levels, making it suitable for high-dimensional data. In contrast, the GBR model offers strong interpretability and robustness against overfitting, effectively handling heterogeneous features and outliers. The XGBoost algorithm extends the gradient boosting framework with advanced regularization and optimized parallel processing, allowing for faster convergence and improved generalization. Meanwhile, the Random Forest (RF) model leverages ensemble averaging to reduce variance and enhance predictive stability, performing exceptionally well on diverse datasets with minimal parameter tuning. Together, these models provide complementary perspectives that enhance the reliability and depth of the predictive analysis.
3.1.1. Artificial Neural Network (ANN)
An Artificial Neural Network (ANN) is a computational model inspired by the structure and function of the human brain. It is designed to recognize patterns and model complex relationships within data. The ANN comprises layers of interconnected nodes (neurons), with each connection assigned a weight that is updated during training to minimize prediction error [44].
In this study, an ANN was used to identify nonlinear dependencies between global energy consumption and 21 explanatory variables. The final architecture consists of an input layer with one neuron for each attribute used in the experiments, two hidden layers with 41 neurons each, which include the Rectified Linear Unit (ReLU) activation function, and an output layer with a single neuron with linear activation, suitable for regression tasks.
The network was trained using the backpropagation algorithm with the Adam optimizer, and the mean square error (MSE) was used as the loss function. In this study, a learning rate of 0.01 was used and early stopping was applied to prevent overfitting. The model was implemented in TensorFlow (v2.20) and Keras (v3.11), to ease the design, training, and evaluation of the model. The analysis demonstrated that the artificial neural network (ANN) is suitable for studying renewable energy generation in sustainable energy systems due to its ability to model complex, high-dimensional data with nonlinear interactions.
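To make the described architecture concrete, the following is a minimal sketch of its layer structure and forward pass in plain Python. It assumes 20 input attributes (21 columns minus the target), which is an illustrative assumption because the input width depends on the feature set used in each experiment. The helper names and the random initialization are hypothetical; the study itself used TensorFlow and Keras, and no training is performed here.

```python
import random


def dense_param_count(layer_sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def init_layers(layer_sizes, rng):
    """Create (weights, biases) per layer with small random weights."""
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def dense(x, weights, biases):
    """Affine transform: one output per row of the weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def forward(x, layers):
    """ReLU on the hidden layers, linear activation on the single output."""
    h = x
    for weights, biases in layers[:-1]:
        h = [max(0.0, v) for v in dense(h, weights, biases)]
    weights, biases = layers[-1]
    return dense(h, weights, biases)[0]


N_INPUTS = 20  # assumption: one input neuron per explanatory attribute
layer_sizes = [N_INPUTS, 41, 41, 1]  # two hidden layers of 41 neurons, one output
n_params = dense_param_count(layer_sizes)

rng = random.Random(0)
layers = init_layers(layer_sizes, rng)
prediction = forward([0.5] * N_INPUTS, layers)
print(n_params)  # 2625 trainable parameters for 20 inputs
```

Under this assumption the network has 861 + 1722 + 42 = 2625 trainable parameters, which is small relative to the 3649 instances in the dataset.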
3.1.2. Gradient Boosting Regression (GBR)
Gradient boosting regression (GBR) is an ensemble learning method that builds a predictive model by sequentially combining the outputs of multiple weak learners, usually decision trees. At each stage, a new tree is trained to correct the residual errors of the previous ensemble, a strategy that gradually improves the overall performance of the model [24].
GBR was chosen for its robustness to outliers and its strong predictive accuracy on structured datasets. The final model used 500 estimators, a maximum tree depth of 5, and a learning rate of 0.1 to achieve a balance between bias and variance. The mean squared error (MSE) was used as the loss function, and early stopping based on validation performance was applied to mitigate overfitting.
The model was implemented using the Scikit-learn library. Unlike neural networks, GBR exposes feature importance scores, which make the model easier to interpret and help identify the main factors influencing energy consumption within the dataset.
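The residual-correction mechanism behind gradient boosting can be sketched in a few lines. The example below is a simplified illustration, assuming squared-error loss, a single input feature, and depth-1 trees (stumps); the function names and toy data are hypothetical, and it is not the Scikit-learn implementation used in the study.

```python
def fit_stump(x, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    # Excluding the largest value keeps the right-hand side non-empty.
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def fit_gbr(x, y, n_estimators=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [pi + learning_rate * stump(xi) for xi, pi in zip(x, preds)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy nonlinear data (assumption), not taken from the study's dataset.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]

model = fit_gbr(x, y)
mse_mean = mse(y, [sum(y) / len(y)] * len(y))
mse_boosted = mse(y, [model(xi) for xi in x])
print(mse_mean, mse_boosted)
```

The learning rate shrinks each stump's contribution, so more estimators are needed as it decreases; this is the bias-variance balance that the reported settings (500 estimators, learning rate 0.1) aim for.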
3.1.3. XGBoost
Extreme Gradient Boosting Regression (XGBoost) is an implementation designed for speed and performance, achieved through various system-level optimizations such as parallel tree construction, cache awareness, and regularization to reduce overfitting. Unlike conventional methods, XGBoost adds both L1 (Lasso) and L2 (Ridge) regularization terms to its objective function, improving the generalization and stability of the model [21].
In this study, XGBoost was used for scalability and handling high-dimensional and sparse data. The model was set up with 500 estimators, a maximum tree depth of 10, and a learning rate of 0.1. A subsampling ratio of 0.3 was also applied to help reduce overfitting. The optimization was guided by the Root Mean Squared Error (RMSE) metric, and early stopping was based on validation results.
The Python XGBoost library (version 3.0.5) was used through its Scikit-learn-compatible API for efficient training. Feature importance was also analyzed to gain a deeper understanding of the dataset and to identify the dominant predictors.
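For reference, the regularized objective mentioned above is commonly written as follows. The notation is an assumption borrowed from the standard presentation of XGBoost and does not appear in the original text: $l$ is the training loss, $f_k$ is the $k$-th tree, $T$ is its number of leaves, $w_j$ are its leaf weights, and $\gamma$, $\lambda$, $\alpha$ are the complexity, L2, and L1 penalty coefficients.

```latex
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2} + \alpha \sum_{j=1}^{T} \lvert w_j \rvert
```

Larger values of $\lambda$ and $\alpha$ shrink the leaf weights, which is how the regularization terms limit overfitting.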
3.1.4. Random Forest
Random Forest (RF) regression is an ensemble learning technique that builds multiple independent decision trees and aggregates their predictions to generate a more stable and accurate result [49]. Each tree is trained on a bootstrap sample of the data and, at each split, a random subset of features is considered. This approach ensures diversity among trees and reduces model variance.
The RF method was selected for its robust performance on structured data, noise resistance, and minimal tuning requirements. The model used 100 estimators, each with a maximum depth of 20, and used the mean square error criterion to evaluate the quality of the split. A fixed random seed was applied to ensure the reproducibility and stability of the model. In addition, RF provides interpretability through feature importance scores, which were used to identify the most influential variables in predicting energy consumption.
The combination of these methods, ANN, GBR, XGBoost and Random Forest, offers the advantages of different but complementary perspectives on the proposed model. ANN effectively identified complex nonlinear dependencies between variables, while GBR and XGBoost improved predictive accuracy through iterative error correction and regularization. Random Forest improved the stability and robustness of the model through ensemble averaging. Together, these methods strengthened the reliability, interpretability and overall credibility of the study’s predictive results.
For these experiments, the following statistical metrics were used to assess the algorithms: Mean Squared Error (MSE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the coefficient of determination (R2 score). The R2 score measures how well a model fits the data; the closer it is to 1.0, the better the fit [50,51]. The 80/20 train-test split and PanelSplit techniques were applied to determine these performance metrics [52]. The training dataset, which contained 80% of the instances, was first used in the feature selection stage. The PanelSplit technique was then applied to the same training set. Finally, the remaining 20% of the instances, set aside initially, were used to test the final model architectures.
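The four metrics can be computed directly from their definitions. The sketch below uses only the standard library; the function name `regression_metrics` and the toy values are hypothetical and are not results from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and R2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Toy values (assumption) chosen so the arithmetic is easy to follow.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Because RMSE is the square root of MSE, the two always rank models identically; MAE can rank them differently because it penalizes large errors less heavily.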
3.2. Dataset
The dataset used is Global Data on Sustainable Energy (2000–2020), which contains 21 columns reflecting global energy consumption patterns and indicators from 176 countries gathered from 2000 to 2020 [53]. There are 3649 instances recorded in this dataset. Except for the attribute entity (object) and the attributes year, Financial flows to developing countries (US $), Density (P/Km2), and Land Area (Km2), which are integers, the other columns are real numbers. For the experiments in this article, the column Electricity from renewables (TWh) was selected as the target attribute.
Environmental variables include CO2 emissions per country, primary energy consumption per capita (kWh/person), and the energy intensity of primary energy use. Economic indicators such as Gross Domestic Product (GDP), GDP per capita, and financial flows to developing countries are also integrated. Additionally, spatial and demographic factors, such as population density, land area, latitude, and longitude, are included to enhance the contextual accuracy of the predictions.
These variables collectively provide a multidimensional dataset for analyzing the relationships between energy systems, economic development, and environmental impact.
3.3. Data Processing
The techniques chosen in the preprocessing stage directly influence model performance by reducing bias, improving stability, and balancing the influence of features.
Table 2 presents a four-step process. To evaluate the models’ performance, the first step in the preprocessing stage was splitting the data with an 80:20 ratio. In the next step, the missing values were replaced with the median calculated for each attribute. The median is preferred over the mean when data are skewed or contain outliers. The median represents the middle value in an ordered dataset, which prevents extreme values from distorting the measure of central tendency.
Furthermore, data normalization is necessary before building the models because the features are heterogeneous. Normalization rescales each feature to a common scale (e.g., 0 to 1). This step also allows gradient-based models to converge faster by preventing features with large magnitudes from dominating those with small magnitudes. In terms of stability, if the data are not normalized, features with larger ranges can cause numerical instability (underflow/overflow) during deep learning model training [54].
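The imputation and normalization steps can be sketched as follows, assuming min-max scaling to the range 0 to 1 and statistics computed on the training split only. The function names and the toy column are hypothetical, and the exact scaler used in the study is not specified beyond the example range.

```python
from statistics import mean, median


def fit_preprocessor(train_column):
    """Learn the imputation value and scaling range from training data only."""
    observed = [v for v in train_column if v is not None]
    return median(observed), min(observed), max(observed)


def transform(column, fill, lo, hi):
    """Impute missing values with the training median, then min-max scale."""
    span = hi - lo
    scaled = []
    for v in column:
        v = fill if v is None else v
        scaled.append(0.0 if span == 0 else (v - lo) / span)
    return scaled


# Toy column (assumption) with one missing value and one outlier.
train = [2.0, None, 4.0, 100.0, 6.0]
test = [None, 8.0]

fill, lo, hi = fit_preprocessor(train)
train_scaled = transform(train, fill, lo, hi)
test_scaled = transform(test, fill, lo, hi)

observed = [v for v in train if v is not None]
print(mean(observed), fill)  # the outlier pulls the mean to 28.0; the median is 5.0
```

Reusing the training statistics for the test column keeps information from the held-out data out of the fitted transformation.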
To comprehensively assess the impact of input dimensionality on model performance, two main experimental scenarios were developed: one utilizing all available features and another relying on the optimized subset obtained after feature selection. This comparative design enabled a clearer understanding of how feature reduction influences predictive accuracy, model stability, and computational efficiency across different algorithms. Feature selection is the process of identifying and retaining the most relevant variables within a dataset to enhance model accuracy, efficiency, and interpretability while minimizing redundancy and overfitting. In this study, three categories of feature selection techniques (filter, wrapper, and hybrid methods) were applied prior to model training.
Following standardized data preprocessing procedures is crucial for ensuring the accuracy and robustness of machine learning models. Proper preprocessing directly impacts model performance by mitigating noise, handling missing or inconsistent data, and enabling effective feature extraction, which are all critical for obtaining robust and generalizable predictive outcomes in the energy sector.
4. Results and Discussions
All experiments were performed on a system equipped with an Intel(R) UHD Graphics GPU, an Intel Core i7-13620H CPU@2.40 GHz, and 16 GB of RAM. The case studies were developed using Python 3.13.7, employing several scientific libraries including NumPy 2.2.6, Scikit-learn 1.8.0, Scipy 1.16.2, itertools 3.13, Pandas 2.3.2, PanelSplit 2.1.0, Matplotlib 3.10.6, Seaborn 0.13.2, TensorFlow 2.20.0, Keras 3.11.3, feature_engine 1.9.3, mlxtend 0.23.4.
4.1. Feature Selection Experiments
To identify the most informative and non-redundant variables influencing energy-related indicators, a comprehensive feature selection process was conducted prior to model training. This step was essential to reduce data dimensionality, eliminate noise, and enhance both the interpretability and computational efficiency of the predictive models. By isolating only the most significant features, the analysis ensured that subsequent modeling phases focused on variables with genuine explanatory power, ultimately improving both the accuracy and the stability of the results.
We used 10 feature selection methods to determine the most relevant attributes for this dataset: five belonging to the filter category, four to the wrapper category, and one hybrid approach. The selected techniques cover complementary aspects of the problem, including data quality filtering (constant, quasi-constant, duplicate), univariate statistical relevance (Pearson correlation, Chi-square), and interaction-aware multivariate search (exhaustive, forward and backward selection). Model-based importance was approximated using linear coefficients and permutation-based feature shuffling. Embedded methods, such as LASSO and Boruta, were excluded because they are model-dependent and may bias feature selection towards certain assumptions, such as linearity or tree-based structures. This study emphasizes a modular framework, allowing for a more transparent comparison between feature selection strategies, while capturing similar effects through importance-based and wrapper methods [55].
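Two of the filter ideas above can be illustrated with a short sketch: removing quasi-constant columns and removing one column from each highly correlated pair. The thresholds (0.99 and 0.8), the function names, and the toy columns are assumptions made for illustration; the study's own thresholds and library calls are not stated in the text.

```python
import math
from collections import Counter


def quasi_constant_features(columns, tol=0.99):
    """Names of columns whose most frequent value covers at least `tol` of rows."""
    drop = []
    for name, values in columns.items():
        top_share = Counter(values).most_common(1)[0][1] / len(values)
        if top_share >= tol:
            drop.append(name)
    return drop


def pearson(a, b):
    """Pearson correlation coefficient of two equally long, non-constant lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


def correlated_features(columns, threshold=0.8):
    """Keep the first column of each highly correlated pair, drop the second."""
    names = list(columns)
    drop = []
    for i, first in enumerate(names):
        if first in drop:
            continue
        for second in names[i + 1:]:
            if second in drop:
                continue
            if abs(pearson(columns[first], columns[second])) > threshold:
                drop.append(second)
    return drop


# Hypothetical toy columns, not the study's data.
columns = {
    "gdp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gdp_rescaled": [2.0, 4.0, 6.0, 8.0, 10.5],
    "flag": [0, 0, 0, 0, 0],
    "latitude": [5.0, -3.0, 4.0, -1.0, 2.0],
}
constant_drop = quasi_constant_features(columns)
remaining = {k: v for k, v in columns.items() if k not in constant_drop}
correlated_drop = correlated_features(remaining)
print(constant_drop, correlated_drop)
```

The constant filter runs first because the correlation coefficient is undefined for a column with zero variance.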
The results obtained are summarized in Table 3. The first row presents the original dataset without any feature reduction, serving as a baseline for comparison. Three of the five filter-based techniques (Constant, Quasi-Constant, and Duplicated) retained all features, indicating that no redundant or constant attributes were detected. In contrast, the Pearson and Chi-square filters eliminated 2 and 10 features, respectively. The Pearson method removed Access to electricity (% of population) and Value_CO2_emissions_kt_by_country, while the Chi-square test excluded Entity, Year, Access to electricity (% of population), Renewable electricity generating capacity per capita, Primary energy consumption per capita (kWh/person), Energy intensity level of primary energy (MJ/$2017 PPP GDP), Density (P/km2), Land Area (km2), Latitude, and Longitude.
Table 3 also presents a detailed comparison of these feature selection methods and their influence on both predictive performance and computational efficiency. The evaluation encompasses classical error metrics, namely Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE), and the coefficient of determination (R2), complemented by the recorded time to transform and training time.
The baseline model, constructed using the full feature set (“Original data”), achieved an MSE of 1227.156 and an R2 score of 0.8248. Several filter-based approaches, including Constant, Quasi-Constant, and Duplicated feature removal, produced identical outcomes to the baseline. However, the Pearson and Chi-square filters slightly changed the results (R2 = 0.819 and 0.749, respectively), suggesting that the eliminated features, though weakly correlated, contributed synergistically to model performance.
An important observation highlights the diversity of the features dropped by the methods. While filter-based techniques typically removed a small, homogeneous subset of features, wrapper and hybrid approaches discarded larger and more diverse groups of attributes—ranging from 10 to 15 variables. This heterogeneity in feature elimination strongly influenced model behavior. In particular, the Exhaustive and Step Backward Selection methods demonstrated that selective removal of specific combinations of features can marginally reduce prediction error (MSE ≈ 1210–1213). Conversely, the Step Forward, Linear, and Feature Shuffling approaches dropped features essential to the model’s explanatory power, resulting in R2 values as low as 0.742.
From a computational perspective, the trade-off between accuracy and efficiency is evident. Filter methods required negligible transformation time (<0.01 s), while wrapper and hybrid techniques, especially Exhaustive Selection, imposed substantial computational demands (up to 573 s). Nevertheless, the diversity of dropped features reveals the complex interplay between feature relevance, redundancy, and model interpretability.
Figure 2 provides an overview of how frequently each feature was excluded across the applied selection techniques. The results reveal distinct patterns of feature elimination, reflecting differences in how each method interprets feature relevance and redundancy.
A notable observation is the heterogeneity of the discarded attributes. Certain features, such as Year, Value—CO2 emissions (% equivalent), and Longitude, were consistently removed, indicating limited or overlapping predictive value. In contrast, economic and energy-related indicators, including GDP growth and Renewables (% equivalent), were retained in most models, suggesting stronger and more consistent explanatory power.
The overall distribution illustrates that selection algorithms emphasize different structural aspects of the data. Rather than converging toward a uniform subset, they highlight complementary perspectives on feature importance—underlining the multidimensional nature of the dataset and the contextual dependencies among variables.
The results show that models trained on the original dataset and those employing filter-based methods achieved the lowest error values, as reflected by the Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). These techniques maintained consistent accuracy with minimal performance variability, indicating that filtering removed mostly redundant or irrelevant attributes without distorting the predictive structure of the data.
As illustrated in Figure 3, the comparative performance of the models across the feature selection strategies (Normal data, Filter, Wrapper, and Hybrid) shows notable variations in predictive accuracy. The Normal data configuration achieved the lowest error values for all three error-based metrics (MSE, MAE, and RMSE) and the highest R2 score, showing superior predictive capability and model fit.
Conversely, the Hybrid approach produced the highest MSE and RMSE values and the lowest R2 score. This suggests that the combined feature selection method may have retained redundant or irrelevant features that negatively affected model generalization. The Filter and Wrapper techniques yielded intermediate results, with slightly higher error metrics and marginally reduced R2 values compared to the Normal data configuration.
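For reference, the four metrics used in this comparison can be written out in a few lines. The sketch below uses the standard textbook definitions and invented toy values; the function name regression_metrics is hypothetical and is not taken from the study's code.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variation
    ss_res = sum(e * e for e in errors)                 # unexplained variation
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}

# Illustrative values only (not taken from the study's dataset).
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(y_true, y_pred)
```

Because RMSE is the square root of MSE, the two always rank models identically on a single evaluation; they can diverge only when each is averaged separately over several runs or folds.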
4.2. The Experiments Description
We defined two major experiment types. In the first scenario (S1), all the features of the dataset are used to train and test the ML models. In the second scenario (S2), the columns most frequently dropped by the feature selection methods (Access to electricity (% of population), Energy intensity level of primary energy (MJ/$2017 PPP GDP), and Longitude) are removed from the dataset before the models are trained. In all experiments, the dataset was partitioned into two subsets: 80% for training and 20% for testing.
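The construction of S2 can be sketched as a frequency count over the per-method drop lists, followed by column removal and an 80/20 partition. Everything in the example is illustrative: the helper names, the placeholder feature names, and the toy drop lists are invented, and since the text does not state how the 80/20 partition was drawn, a seeded shuffle is assumed here.

```python
import random
from collections import Counter

def most_frequently_dropped(dropped_by_method, k):
    """Return the k features that appear most often in the per-method drop lists."""
    counts = Counter(f for feats in dropped_by_method.values() for f in feats)
    return [name for name, _ in counts.most_common(k)]

def drop_columns(rows, columns):
    """Return copies of the row dicts without the given columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]

def split_80_20(rows, seed=0):
    """Shuffle with a fixed seed and return (train, test) in an 80/20 ratio."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Toy drop lists with placeholder feature names (invented for illustration).
dropped_by_method = {
    "method_1": ["f_a", "f_b"],
    "method_2": ["f_a", "f_c"],
    "method_3": ["f_a", "f_b", "f_d"],
}
to_drop = most_frequently_dropped(dropped_by_method, k=2)

rows = [{"f_a": i, "f_b": 2 * i, "f_e": 3 * i, "target": 4 * i} for i in range(10)]
reduced = drop_columns(rows, set(to_drop))
train, test = split_80_20(reduced)
```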
In both scenarios, four main experimental phases were carried out: the first aimed at identifying the optimal architecture of the Artificial Neural Network (ANN) model, the second focused on fine-tuning the hyperparameters of the Gradient Boosting Regressor (GBR) model, the third concentrated on calibrating the eXtreme Gradient Boosting (XGBoost) algorithm, and the fourth addressed the optimization and validation of the Random Forest (RF) model. The selection of these models was guided by their complementary learning characteristics. The ANN model facilitates the approximation of complex nonlinear relationships. In contrast, the ensemble-based methods (GBR, XGBoost, and RF) offer robust performance through aggregation mechanisms and are particularly effective for structured tabular data. For each model, hyperparameter ranges were established to balance model expressiveness and computational efficiency, based on commonly adopted configurations in the literature and preliminary test results. This approach allows for systematic exploration of the parameter space while reducing the risk of overfitting due to excessive adjustment.
To avoid data leakage in the panel dataset, the PanelSplit cross-validation method was used in these experiments to account for temporal dependencies. This approach preserves the temporal and group-wise structure of the dataset, ensuring that observations from the same entity or from future time steps are not included in both training and test sets. Unlike random splitting, this method offers a more realistic evaluation of model generalization in the presence of temporal dependencies [52]. Since the data form a time series, the PanelSplit method with 10 splits was used. This preserves the order of the years and helps to avoid mixing information between the training and testing data. Choosing 10 splits provides a good balance between reliable results and reasonable computational time, and allows the models to be compared more accurately on data from 176 countries over the period 2000–2020. For testing, a single time period was used to simulate the situation in which the model predicts the following year. No lag was inserted between the training and test periods, as the period under analysis is quite short and the observations are annual.
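The splitting scheme described above can be illustrated with a small hand-rolled stand-in. This is not the PanelSplit library's API; the function panel_time_splits and its arguments are hypothetical, and the expanding training window (all earlier years) is an assumption, since the text specifies only 10 splits, a one-period test set, and no lag.

```python
def panel_time_splits(years, n_splits=10, test_size=1, gap=0):
    """Return (train_years, test_years) pairs in chronological order.

    Each fold tests on `test_size` consecutive periods and trains on every
    earlier period, optionally leaving `gap` periods unused in between.
    """
    periods = sorted(set(years))
    folds = []
    for i in range(n_splits):
        test_end = len(periods) - i * test_size
        test_start = test_end - test_size
        train_end = test_start - gap
        if train_end <= 0:
            raise ValueError("not enough periods for the requested splits")
        folds.append((periods[:train_end], periods[test_start:test_end]))
    return list(reversed(folds))

def split_rows(rows, train_years, test_years):
    """Assign every entity's rows to train or test purely by year."""
    train = [r for r in rows if r["year"] in train_years]
    test = [r for r in rows if r["year"] in test_years]
    return train, test

# Toy panel: two placeholder countries observed annually over 2000-2020.
rows = [{"country": c, "year": y} for c in ("A", "B") for y in range(2000, 2021)]
folds = panel_time_splits([r["year"] for r in rows], n_splits=10)
first_train, first_test = folds[0]
last_train, last_test = folds[-1]
train_rows, test_rows = split_rows(rows, first_train, first_test)
```

Under these assumptions the ten test years are 2011 to 2020, and every fold trains only on years that precede its test year, which is the property that prevents future information from leaking into training.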
These experiments were designed to constrain the algorithm’s search space, thereby reducing unnecessary computational effort and minimizing the risk of exploring non-optimal parameter regions.
4.2.1. The Artificial Neural Network Experiments
Table 4 synthesizes the most relevant results of the experiments conducted on Artificial Neural Networks (ANNs), in which the architectures were varied with respect to the number of neurons across hidden layers, the activation functions, and the implementation of early stopping. Each configuration was run 10 times, and the metrics reported in Table 4 are the averages over these runs. The network architecture was systematically adjusted by varying the number of hidden layers from one to three and the number of neurons within these layers between 10 and 41 to evaluate the impact of network depth and capacity on model performance. Regarding the activation function, experiments were conducted with a linear function as well as with the ReLU (Rectified Linear Unit) activation function.
The number of training epochs was set to 1000, and the early stopping technique was applied to prevent overfitting. The patience parameter defines the number of consecutive epochs without improvement after which training is terminated. It was set to 10.
In Scenario 1, Experiment 7 achieved the best overall performance among the ANN configurations tested. It achieved the lowest errors, with MSE = 140.1938, MAE = 4.6403, and RMSE = 11.8403, as well as the highest R2 score = 0.9844. This configuration used a network with three hidden layers, consisting of 41, 41, and 30 neurons. The results suggest that a deeper architecture helped the model to more effectively capture the elaborate relationships between the analyzed indicators. The results also show how important the balance between model complexity and generalization is. More complex configurations, such as Exp6–Exp8, performed better than the simpler ones. At the same time, the use of early stopping helped to avoid overfitting; i.e., the situation where the model overlearns the details in the training data and performs poorly on new data. In the case of Exp7, training stopped after 74 epochs, which shows that the model reached a stable solution without having to go through all 1000 epochs.
A similar trend was observed in Scenario 2, where feature selection led to better results. In this case, Exp6 achieved the best performance, with R2 = 0.9885, MSE = 103.214, and MAE = 5.2935. This shows that removing less relevant variables helped the model to learn more efficiently. The dataset became simpler and more focused on important predictors.
Overall, the experiments show that the performance of the ANN model was influenced by both the network structure and the quality of the input variables. In Scenario 2, feature selection contributed to better generalization and reduced forecasting errors. Thus, the results confirm that excluding redundant variables can improve the performance of models applied to complex datasets in the field of sustainable energy.
4.2.2. The Gradient Boosting Regressor Experiments
The second set of experiments involved searching for the most appropriate values for the hyperparameters of the GBR model. Table 5 summarizes the outcomes of the Gradient Boosting Regressor (GBR) experiments, each configured with different hyperparameters, namely the subsample, number of trees, maximum depth, and learning rate. As in the ANN experiments, each configuration in Table 5 was run 10 times and the metrics were averaged. The number of trees was varied between 30 and 500, and the learning rate between 0.01 and 0.1. The maximum depth limits the number of levels in each tree (values from 3 to 20 were used), and the subsample parameter is the fraction of samples used to train each tree (with values between 0.3 and 0.8).
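A search of this kind can be organized as an explicit grid over the four hyperparameters. In the sketch below, only the ranges come from the text; the individual grid points and the helper names iter_configs and mean_over_runs are assumptions made for illustration.

```python
from itertools import product

# Grid points are assumptions chosen inside the ranges stated in the text.
param_grid = {
    "n_estimators": [30, 100, 500],
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [3, 5, 20],
    "subsample": [0.3, 0.5, 0.8],
}

def iter_configs(grid):
    """Yield one dict per combination of hyperparameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

def mean_over_runs(run_scores):
    """Average a metric over repeated runs of the same configuration."""
    return sum(run_scores) / len(run_scores)

configs = list(iter_configs(param_grid))
```

The dictionary keys follow the argument names of scikit-learn's GradientBoostingRegressor, so each configuration could be passed to that estimator as keyword arguments. Even this coarse grid yields 81 configurations, which illustrates why the ranges were kept moderate.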
In Scenario 1, Experiment 6 obtained the best results for the GBR model. It had the lowest error values: MSE = 67.922, MAE = 1.7284, and RMSE = 8.2414, along with the highest R2 score = 0.9924. These results show that the model was able to make very accurate predictions. The configuration used in Experiment 6 included 500 estimators, a maximum tree depth of 5, and a learning rate of 0.1. This combination allowed the model to effectively identify complex and nonlinear relationships in the data. At the same time, the model was able to maintain good generalization ability. Additionally, using a subsampling ratio of 0.8 probably helped to reduce excessive variance and increase the diversity of the model.
A similar situation was observed in Scenario 2, where feature selection improved the model performance. In this scenario, Experiment 6 also performed best, with R2 = 0.9939, MSE = 54.7124, and RMSE = 7.3967. This shows that removing less important variables helped the model to learn better from the data. By using a smaller and more relevant feature set, the model was able to focus more easily on the important predictors. Thus, redundant information was reduced and the training process became more efficient. Comparing the two scenarios shows that feature selection improved the GBR model. In Scenario 2, the model used fewer variables but retained important information for prediction. Thus, the model achieved more stable results, reduced errors, and better performance in the validation stage.
4.2.3. The eXtreme Gradient Boosting Experiments
Table 6 summarizes the performance of the eXtreme Gradient Boosting (XGBoost) model under two experimental scenarios: using all features and using only the selected subset. Overall, XGBoost demonstrated high predictive accuracy and remarkable consistency across configurations, with R2 scores generally exceeding 0.98 in both settings.
In Scenario 1, several configurations of the XGBoost model achieved very good results, with R2 values above 0.99. This shows that the model was stable and was able to capture the nonlinear relationships in the data well.
Of the configurations tested, Experiment 6 had one of the best performances, with MSE = 64.4514, MAE = 2.3360, and RMSE = 8.0281. This configuration used 500 trees, a maximum depth of 5, and a learning rate of 0.1, which provided a balance between model complexity and stability of the results. Increasing the tree depth, as in Experiment 8, did not change performance very much. This shows that the XGBoost model is robust and not very sensitive to these moderate changes in the hyperparameters.
In Scenario 2, feature selection led to even better results. The best configuration was Exp8, with MSE = 59.507, RMSE = 7.714, and R2 = 0.9933. These results show that removing less relevant variables helped the model to focus on the most important information for prediction.
Overall, the results show that feature selection improved the performance of the XGBoost model. The model worked more efficiently with a smaller set of variables and was able to obtain accurate predictions on complex and correlated energy data.
4.2.4. The Random Forest Experiments
Table 7 presents the performance outcomes of the Random Forest (RF) model under two experimental scenarios, one using the complete feature set and the other employing the selected subset of features. Overall, the results confirm the robustness and stability of the RF algorithm, which maintained high predictive accuracy across configurations, with most R2 scores exceeding 0.97.
In Scenario 1, several configurations of the Random Forest (RF) model achieved very good results. This shows that the model was stable and not very sensitive to moderate changes in the hyperparameters. Of the experiments tested, Exp5 performed best, with MSE = 41.8228 and R2 = 0.9953. This configuration used 100 trees, a maximum depth of 20, and the Log2 strategy for feature selection. The results suggest that sufficiently deep trees and balanced feature selection helped the model to achieve very accurate predictions.
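The effect of a Log2 strategy can be made concrete with a short calculation. The sketch assumes that the rule takes the base-2 logarithm of the number of input features and rounds down (with a minimum of one), and that the 21 explanatory variables of the dataset are all used as inputs in Scenario 1; both points are assumptions, and the function name is hypothetical.

```python
import math

def log2_max_features(n_features):
    """Number of candidate features per split under a log2 rule (at least 1)."""
    return max(1, int(math.log2(n_features)))

all_features = log2_max_features(21)      # Scenario 1: 21 explanatory variables (assumed)
reduced_features = log2_max_features(18)  # Scenario 2: three variables removed
```

Under these assumptions, each split considers the same number of candidate features in both scenarios, so removing three variables would not change the per-split randomness of the forest.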
Close results were also obtained by Exp4, Exp6, and Exp8, confirming the robustness of the RF model in several configurations. In Scenario 2, feature selection led to even better results. The best fit was Exp4, with R2 = 0.9955, MSE = 39.6791, and RMSE = 6.4670. Removing less relevant variables helped the model to focus on important predictors and reduce unnecessary information.
Overall, the results show that feature selection improved the stability and generalizability of the RF model. The model also proved to be suitable for complex datasets in the sustainable energy domain, where nonlinear relationships and correlated variables exist.
Thus, the way the experiments were organized allowed for a fair and clear comparison of the models. All models were tested under similar conditions, and the temporal order of the data was preserved. This reduced the risk that future information would influence the training process. With this approach, the predictions obtained are more reliable and easier to interpret. It is also possible to better assess how well each model performs and how well it predicts new data.
4.3. Discussions
This study explores the applicability of machine learning techniques in the data-driven analysis of electricity generation from renewable sources in the sustainable energy sector. Two experimental setups were designed: Scenario 1, which utilized all available features in the dataset, and Scenario 2, where three low-impact attributes identified through feature selection were removed. In both scenarios, four predictive models were developed: Artificial Neural Network (ANN), Gradient Boosting Regressor (GBR), eXtreme Gradient Boosting (XGBoost), and Random Forest (RF). A key challenge during implementation was the effective tuning of hyperparameters. The previous section outlines the experiments conducted to identify the optimal parameter values.
Once the optimal hyperparameters were determined, algorithms were used to train the respective models. A predefined stopping criterion was applied to ensure convergence and avoid overfitting. Specifically, training was halted if the loss function did not improve by more than 0.0001 over ten consecutive iterations, at which point the current loss value was considered optimal.
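The stopping criterion can be expressed as a short loop over the recorded loss values. The sketch below is one possible reading of the rule (stop after ten consecutive iterations in which the best loss so far is not improved by more than 0.0001); the function name and the synthetic loss curve are invented for illustration.

```python
def early_stopping_epoch(losses, patience=10, min_delta=1e-4):
    """Return the 1-based epoch at which training would stop.

    Training stops once `patience` consecutive epochs fail to improve the
    best loss seen so far by more than `min_delta`; if that never happens,
    the full length of `losses` is returned.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(losses, start=1):
        if best - loss > min_delta:
            best = loss   # meaningful improvement: reset the counter
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(losses)

# Synthetic loss curve: improves for 20 epochs, then plateaus.
losses = [1.0 / e for e in range(1, 21)] + [0.05] * 50
stop_epoch = early_stopping_epoch(losses)
```

On this synthetic curve the last improvement occurs at epoch 20, so training stops ten epochs later instead of running through all 70 recorded values.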
4.3.1. Comparative Predictive Performance Analysis
The trained models were then evaluated on the test dataset by comparing actual values with model predictions, allowing for an assessment of predictive accuracy. Figure 4 displays cross-plots of actual versus predicted values for all machine learning algorithms in Scenario 2. A high density of data points near the bisector line is evident for the models, indicating strong predictive accuracy. Furthermore, the fitted regression lines derived from the cross-plot points closely follow the ideal line (y = x), suggesting minimal deviation between actual and predicted values. These visual results confirm the high level of agreement between observed and estimated electricity generation from renewables (TWh), underscoring the effectiveness of all models in capturing underlying patterns and delivering reliable decision-making support.
Figure 5a–d show the comparison between the four models analyzed: ANN, GBR, XGBoost and RF. The comparison is made based on the MAE, MSE, RMSE and R2 metrics. The results show that the ensemble-based models (i.e., GBR, XGBoost and RF) performed better than the ANN model in both scenarios. This suggests that tree-based models are more suitable for the analyzed data, as they can better capture the nonlinear and complex relationships between variables.
The error metrics in Figure 5a–c show that RF, GBR and XGBoost achieved smaller errors than ANN. Therefore, these models made more accurate and stable predictions. Among them, RF had the smallest errors, indicating a very good predictive performance. The GBR and XGBoost models also performed well. They were able to identify the nonlinear relationships between energy indicators and maintained stable performance in both scenarios.
Figure 5d confirms the same results through the R2 score. The RF model achieved the highest score, followed by GBR and XGBoost. The ANN model performed less well, but its results remained competitive. In conclusion, ensemble-based models are more suitable for sustainable energy forecasting. They perform better when the data contains correlated variables and nonlinear relationships. Additionally, the results from Scenario 2 show that feature selection helped to reduce redundant information and improve predictions.
Although the Random Forest model achieved a very high R2 score (R2 = 0.9955), the adopted validation strategy reduced the risk that the model could use information from future observations during training. Using the PanelSplit strategy, the data were kept in their original order over time. Thus, information from subsequent years did not affect the model’s training. Moreover, the similar results obtained across several validation splits indicate that the model’s performance was stable. Therefore, the very good result seems to be due to the model’s ability to learn correctly from the data, not to an artificial increase in performance.
The evaluation of feature importance represents a critical step not only in understanding how input variables influence model predictions but also in optimizing computational cost and model complexity. In high-dimensional datasets, redundant or weakly correlated features tend to increase training time and memory consumption without necessarily improving predictive accuracy.
The comparative analysis presented in Figure 6 shows that Scenario 2 maintained almost identical predictive performance to Scenario 1, with negligible variations in RMSE and training time. This demonstrates that trimming the input space by eliminating only three non-essential variables can preserve model accuracy. Although the training time was not significantly reduced, working with three fewer attributes shrinks the input space and reduces the computational overhead.
In addition to the performance of the models, the organization of the experiments also took into account the computational time, the required resources, and the possibility of applying the method to larger datasets. The analyzed models have different computational requirements. The Artificial Neural Network (ANN) requires an iterative training process, and the complexity depends on the structure of the network and the number of training epochs. In contrast, ensemble methods, such as Gradient Boosted Regression (GBR) and eXtreme Gradient Boosting (XGBoost), depend mainly on the number of boosting steps and the depth of the trees used. The Random Forest (RF) model can be run in parallel, which makes it more efficient, but it can become computationally expensive when the number of trees or the number of features increases.
In order for the experiments to be carried out in a reasonable time, the parameter values were limited to moderate ranges. Thus, a balance was sought between the ability of the models to learn complex relationships and the cost required for training. Specifically, limits were set for the number of estimators (i.e., up to 500 trees), the maximum depth of the trees (i.e., up to 20), and the size of the neural network (i.e., a maximum of 3 hidden layers, each with between 10 and 41 neurons).
From the point of view of scalability, the proposed experimental framework is suitable for medium-sized panel datasets. At the same time, it can be extended to larger datasets, especially by using parallelization (for example, in the case of the RF and XGBoost models) or by applying more efficient hyperparameter search methods. Using the PanelSplit cross-validation method increases the computational cost compared to a simple random data split. However, this trade-off is justified because it yields more reliable results when the data have temporal dependencies.
4.3.2. Statistical Significance Analysis
The statistical significance of the differences between the models was analyzed using nonparametric tests. These tests were chosen because the number of cross-validation folds was small (i.e., n = 10) and because the data were not necessarily normally distributed. The first step was to apply the Friedman test to each evaluation metric: RMSE, MAE, MSE, and R2. This test was used to check whether there were overall differences between the models analyzed. After that, pairwise comparisons between the models were performed using the Wilcoxon signed-rank test with the Bonferroni correction. The Wilcoxon test was chosen because it is suitable for paired comparisons when the data are not normally distributed. The Bonferroni correction was used to reduce the risk of falsely declaring a difference significant, since multiple comparisons were made between the models [56,57].
Statistical significance was analyzed at a corrected threshold of alpha = 0.05/6 = 0.0083. This threshold corresponds to the six pairwise comparisons performed between the four models evaluated. The results presented in Table 8 show that there are statistically significant differences between the models analyzed for all metrics considered, both in Scenario 1 and Scenario 2. In Scenario 1, the Friedman statistic was 23.16 for RMSE, MSE, and R2, and 26.04 for MAE. The p-values obtained were much lower than the significance threshold of 0.05. Specifically, the p-values of 0.000037398 and 0.000009355 indicate that the differences observed between the models are not due to random variation across the cross-validation folds. The results obtained in Scenario 2 are similar.
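The arithmetic behind these figures can be checked with a few lines of code. The sketch below computes the Bonferroni threshold from the number of model pairs and evaluates the upper tail of a chi-square distribution with three degrees of freedom (four models minus one) at the reported Friedman statistics, using the closed form that exists for this case. The function names and the toy score table are invented for illustration, and the rank computation assumes no ties within a fold.

```python
import math
from itertools import combinations

def friedman_statistic(scores):
    """Friedman chi-square for a folds x models table (assumes no ties within a fold)."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

models = ["ANN", "GBR", "XGBoost", "RF"]
pairs = list(combinations(models, 2))  # all unordered model pairs
alpha_corrected = 0.05 / len(pairs)    # Bonferroni-adjusted threshold

p_rmse = chi2_sf_df3(23.16)  # statistic reported for RMSE, MSE and R2
p_mae = chi2_sf_df3(26.04)   # statistic reported for MAE

# Toy table (invented): 10 folds in which the models always rank in the same order.
toy_scores = [[4.0, 3.0, 2.0, 1.0] for _ in range(10)]
toy_stat = friedman_statistic(toy_scores)
```

The two tail probabilities should agree with the p-values quoted above to the reported precision. The toy table shows the largest value the statistic can take with ten folds and four models, which is reached when every fold ranks the models identically.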
These results show that the evaluated models do not have the same predictive performance. In other words, at least one of the models differs significantly from the others. Therefore, the null hypothesis that all models have equal performance was rejected for all evaluation metrics. Based on these results, post hoc pairwise comparisons were performed using the Wilcoxon signed-rank test with the Bonferroni correction. The purpose of these comparisons was to identify the exact model pairs for which there are statistically significant differences.
The results obtained in Scenario 1 and Scenario 2 show that the ANN model performs significantly differently from all ensemble-based models, as observed in Table 9. Specifically, significant differences were found between ANN and GBR, ANN and XGBoost, and ANN and RF. For all these comparisons, the p-value was 0.001953, which is lower than the adjusted significance threshold. This shows that the ANN model performed worse than the tree-based models in the experiments performed.
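With only ten paired observations, the exact Wilcoxon signed-rank test can take only a limited set of p-values, and the smallest two-sided value is 2/2^10 ≈ 0.001953. The sketch below enumerates the exact null distribution to show this; it assumes no zero differences and no tied absolute differences, and the function name and the per-fold differences are invented for illustration. Whether the study used the exact variant of the test is an assumption.

```python
def wilcoxon_exact_two_sided(diffs):
    """Exact two-sided p-value of the Wilcoxon signed-rank test.

    Assumes no zero differences and no ties among the absolute differences,
    so the ranks are simply 1..n.
    """
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    max_sum = n * (n + 1) // 2
    w = min(w_plus, max_sum - w_plus)
    # counts[s] = number of subsets of the ranks {1..n} whose sum equals s
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    tail = sum(counts[: w + 1])
    return min(1.0, 2.0 * tail / 2 ** n)

# Invented per-fold RMSE differences (model A minus model B) over 10 folds,
# with model A better in every fold.
always_better = [-(i + 1) * 0.1 for i in range(10)]
p_min = wilcoxon_exact_two_sided(always_better)
```

A p-value of 0.001953 is therefore what the exact test returns when one model has the lower error in all ten folds, which is consistent with the ANN being outperformed by each ensemble model in every fold.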
A significant difference was also observed between GBR and RF, with p-value = 0.001953. This result shows that the Random Forest (RF) model achieved better accuracy, especially by reducing the RMSE value. In contrast, no statistically significant differences were found between GBR and XGBoost or between XGBoost and RF. The p-values for these comparisons were higher than the Bonferroni adjusted threshold. For example, for GBR and XGBoost, the p-values were 0.27343 in Scenario 1 and 0.013672 in Scenario 2. For XGBoost and RF, the p-values were 0.275390 in Scenario 1 and 0.193359 in Scenario 2.
These results suggest that RF and XGBoost performed closely and consistently over the course of cross-validation. The GBR model performed in the middle, outperforming ANN but generally underperforming RF. Overall, the statistical analysis shows that ensemble learning methods outperformed the ANN model on the sustainable energy forecast dataset. Among the models analyzed, Random Forest (RF) emerged as the most robust model with consistently good results.
The validation framework used makes the study more rigorous and methodologically better organized. It combines several important steps: cross-validation that preserves the temporal order of the data, evaluation of the results for each fold, estimation of confidence intervals, and application of non-parametric statistical tests. In addition, using a validation method suitable for panel data allows for obtaining results that are closer to real-world situations, where the data are time-dependent. This approach reduces the risk of overestimating the performance of the models due to the accidental use of future information in the training process.
4.3.3. Correlation Analysis
To better understand why the ensemble models performed so well, correlation analysis was performed between variables in the dataset. The results show that there are strong linear relationships between several energy indicators. This suggests that some variables contain very useful information for prediction. These relationships between variables most likely helped the Random Forest (RF) and XGBoost models to learn more efficiently from the data. These models are very good at identifying relationships between features, including when the relationships are complex, correlated, or nonlinear.
At the same time, correlation analysis shows why feature selection was useful in Scenario 2. Some variables provide similar information, so they can be considered redundant. By removing them, the dataset becomes easier to use, and models can be trained faster and with less resource consumption. The heat map highlights a strong correlation between Value_co2_emissions_kt_by_country and Electricity from fossil fuels (TWh) (0.95), as well as between Access to clean fuels for cooking and Access to electricity (% of population) (0.86) (Figure 7).
These values indicate strong positive linear relationships between the variables. In other words, when one of these variables increases, there is a strong tendency for the other to increase as well.
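The correlation coefficient behind such a heat map, and its use for flagging potentially redundant variables, can be sketched as follows. The helper names, the toy columns, and the 0.85 cut-off are assumptions made for illustration and are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def redundant_pairs(columns, threshold=0.85):
    """Return (name_a, name_b, r) for column pairs whose |r| reaches the threshold."""
    names = list(columns)
    found = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) >= threshold:
                found.append((a, b, r))
    return found

# Invented toy columns; "x_shifted" equals x + 10.
columns = {
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x_shifted": [11.0, 12.0, 13.0, 14.0, 15.0],
    "noise": [2.0, -1.0, 0.5, -2.0, 1.0],
}
pairs = redundant_pairs(columns)
```

The shifted column is perfectly correlated with the original although the two are not proportional, which illustrates that a high coefficient signals a linear relationship rather than proportionality.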
4.3.4. SHAP-Based Model Interpretation
SHAP (SHapley Additive exPlanations) analysis allows for a deeper understanding of how each input parameter influences the target output, which in this case is electricity generated from renewable sources. Figure 8 presents the SHAP values for the input features and their relative impact, as determined by the ANN, GBR, XGBoost, and Random Forest models. The features are ranked according to their average absolute SHAP values, with higher values indicating stronger influence on the model output.
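The ranking rule itself is simple to state in code. The sketch below takes a table of per-sample SHAP values as given (computing them would require an explainer, which is outside this example) and orders the features by mean absolute value; the function name, feature names, and numbers are invented for illustration.

```python
def rank_by_mean_abs_shap(shap_values, feature_names):
    """Rank features by the mean absolute SHAP value over all samples.

    `shap_values` is a list of rows, one per sample, with one value per feature.
    """
    n = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)

# Invented SHAP values for three samples and three placeholder features.
feature_names = ["feat_a", "feat_b", "feat_c"]
shap_values = [
    [0.5, -2.0, 0.1],
    [-1.5, 1.0, 0.0],
    [1.0, -3.0, -0.2],
]
ranking = rank_by_mean_abs_shap(shap_values, feature_names)
```

Taking absolute values before averaging matters: a feature whose contributions are large but alternate in sign would otherwise appear unimportant.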
The SHAP analysis (Figure 8) reveals that several key variables, particularly Land Area, Electricity from Fossil Fuels (ElFossilFuels), and Renewables, consistently exhibit the highest contributions across all models. These parameters demonstrate a strong positive impact on the prediction of renewable electricity generation, indicating that both geographic scale and the existing energy mix are dominant drivers of model outputs. Additional factors such as CO2 emissions (ValueCO2Emissions), Electricity from Nuclear (ElNuclear), and Low-Carbon Electricity (LowCarbonEl) also show moderate influence, reflecting their indirect but relevant roles in the transition toward renewable energy systems.
Although all models identify similar top predictors, differences in their relative importance can be observed. The ANN model attributes greater weight to Primary Energy Consumption per Capita and Year, suggesting sensitivity to temporal and consumption dynamics. In contrast, ensemble models—particularly GBR and RF—display more stable and interpretable feature rankings, emphasizing structural and environmental variables such as Land Area and Energy from Fossil Fuels. The XGBoost model, while similar to GBR, assigns relatively higher importance to CO2 emissions and Access to Clean Fuels for Cooking, indicating its stronger responsiveness to environmental and socio-economic inputs.
Overall, the SHAP-based interpretation confirms that geographic and energy-related features dominate the prediction of renewable electricity generation across all models. The consistent ranking of key drivers across distinct algorithms strengthens the interpretability of the results and enhances confidence in the reliability of the machine learning framework for energy forecasting applications.
5. Conclusions
This study proposed and tested a machine learning framework for the analysis of renewable electricity generation, including four models: ANN, GBR, XGBoost and Random Forest. These models were compared according to their prediction accuracy, interpretability, computational efficiency and ability to adapt to new data. Two experimental scenarios were analyzed. In the first scenario, all available variables were used. In the second scenario, a reduced set of variables was selected using feature selection. Thus, it was possible to observe how eliminating less relevant variables affects model performance. For a more accurate assessment, the study also used PanelSplit cross-validation, statistical tests and SHAP analysis. Unlike many previous studies, which analyze a single country, region, technology or forecasting task, this research provides a global analysis. The dataset includes 176 countries, covering the period 2000–2020, 3649 observations, and 21 explanatory variables. The study compares four machine learning models in two experimental scenarios. It also uses SHAP analysis to identify the main factors influencing renewable electricity generation. Thus, the proposed framework combines global data, feature selection, model comparison, and interpretability.
The results show that the ensemble-based models performed better than the ANN model. In particular, Random Forest and XGBoost achieved the best predictive performance. Feature selection affected the efficiency of the models, as it reduced redundant information and retained important predictors. The Friedman and Wilcoxon statistical tests confirmed that the differences between the models are statistically significant. From a practical perspective, the proposed framework can support decision-making in sustainable energy planning. More accurate predictions can help with better resource allocation, renewable energy integration, long-term energy strategies, and sustainability policy formulation. In addition, SHAP analysis makes models easier to interpret, showing which variables most influence renewable electricity generation, such as land area, fossil fuel-based electricity, and the share of renewables.
However, the study has some limitations. The analysis is based on a dataset covering 2000 to 2020; therefore, the results may differ for other periods, countries, or regions. Also, some variables are highly correlated, which can influence model behavior. Another limitation is the computational cost, as several feature selection methods and machine learning models were tested.
Future research can extend this approach by using newer datasets. Hybrid deep learning models, transformer-based forecasting methods, and real-time prediction systems for smart grids can also be analyzed. Future studies may also analyze uncertainty, multimodal energy data, and more advanced explainability methods to include both prediction accuracy and model interpretation.
Author Contributions
Conceptualization, A.P. and E.G.D.; methodology, A.P., E.G.D., F.A.T. and A.G.B.; software, E.G.D. and F.A.T.; validation, A.P. and A.G.B.; formal analysis, A.P. and F.A.T.; investigation, E.G.D. and A.G.B.; resources, A.P. and F.A.T.; data curation, E.G.D.; writing—original draft preparation, E.G.D. and F.A.T.; writing—review and editing, A.P. and A.G.B.; visualization, E.G.D.; supervision, A.P.; project administration, A.P.; funding acquisition, A.P. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by Petroleum-Gas University of Ploiești, Fiscal Identification Code/CUI: 2844790, IBAN: RO52TREZ5210F330500XXXX, Address: Bulevardul București, nr. 39, cod 100680, Ploiești, România.
Data Availability Statement
The dataset used in this study is publicly available as Global Data on Sustainable Energy (2000–2020). The processed data and code are available from the corresponding author upon reasonable request.
Acknowledgments
All experiments were performed on a system equipped with an Intel(R) UHD Graphics GPU, an Intel Core i7-13620H CPU @ 2.40 GHz, and 16 GB of RAM. The case studies were developed using Python 3.13.7, employing several scientific libraries, including NumPy 2.2.6, scikit-learn 1.8.0, SciPy 1.15.2, itertools 3.13, Pandas 2.3.2, Matplotlib 3.10.6, Seaborn 0.13.2, TensorFlow 2.20.0, Keras 3.11.3, feature_engine 1.9.3, and mlxtend 0.23.4. During the preparation of this manuscript, the authors used ChatGPT (OpenAI, GPT-5.3) for language refinement. The authors have reviewed and edited the output and take full responsibility for the content of this publication.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| AI | Artificial Intelligence |
| ML | Machine Learning |
| ANN | Artificial Neural Network |
| GBR | Gradient Boosting Regressor |
| Grad-CAM | Gradient-weighted Class Activation Mapping |
| LSTM | Long Short-Term Memory |
| CNN | Convolutional Neural Network |
| XGBoost | eXtreme Gradient Boosting Model |
| SVM | Support Vector Machine |
| ReLU | Rectified Linear Unit |
| MSE | Mean Squared Error |
| RMSE | Root Mean Squared Error |
| MAE | Mean Absolute Error |
| MAPE | Mean Absolute Percentage Error |
| SHAP | SHapley Additive exPlanations |
References
- World Energy Outlook 2023. 2023. Available online: https://www.iea.org/reports/world-energy-outlook-2023 (accessed on 12 July 2025).
- Renewables 2022 Global Status Report. 2022. Available online: https://www.ren21.net/wp-content/uploads/2019/05/GSR2022_Full_Report.pdf (accessed on 12 July 2025).
- Ponnusamy, V.K.; Kasinathan, P.; Madurai Elavarasan, R.; Ramanathan, V.; Anandan, R.K.; Subramaniam, U.; Ghosh, A.; Hossain, E. A Comprehensive Review on Sustainable Aspects of Big Data Analytics for the Smart Grid. Sustainability 2021, 13, 13322.
- Bălăcescu, A.; Pătrașcu, A.; Tănăsescu, A. Analysis of Disparities in Internet Purchases by Individuals at the EU State Level. Electronics 2023, 12, 982.
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
- Hong, T.; Pinson, P.; Fan, S.; Zareipour, H.; Troccoli, A.; Hyndman, R.J. Probabilistic Energy Forecasting: Global Energy Forecasting Competition 2014 and Beyond. Int. J. Forecast. 2016, 32, 896–913.
- Mystakidis, A.; Koukaras, P.; Tsalikidis, N.; Ioannidis, D.; Tjortjis, C. Energy Forecasting: A Comprehensive Review on Techniques and Technologies. Energies 2024, 17, 1662.
- Pătrașcu, A.; Toader, F.A.; Bălăcescu, A. An Improved Multi-Objective Hybrid Algorithm for Solving Job Shop Scheduling Problem. Econ. Comput. Econ. Cybern. Stud. Res. 2024, 59, 177–192.
- Ahmed, M.; Naser Mahmood, A.; Hu, J. A survey of network anomaly detection techniques. J. Netw. Comput. Appl. 2016, 60, 19–31.
- Fernandes, G.; Rodrigues, J.J.P.C.; Carvalho, L.F.; Al-Muhtadi, J.F.; Proença, M.L. A comprehensive survey on network anomaly detection. Telecommun. Syst. 2019, 70, 447–489.
- Völker, B.; Reinhardt, A.; Faustine, A.; Pereira, L. Watt’s up at Home? Smart Meter Data Analytics from a Consumer-Centric Perspective. Energies 2021, 14, 719.
- Hussein, N.A.-H.K.; Abouessaouab, M. Optimizing Energy Efficiency in Smart Grids Using Machine Learning Algorithms: A Case Study in Electrical Engineering. Khwarizmia 2023, 2023, 113–120.
- Wang, H.; Lei, Z.; Zhang, X.; Zhou, B.; Peng, J. A Review of Deep Learning for Renewable Energy Forecasting. Energy Convers. Manag. 2019, 198, 111799.
- Schmitt, M. Automated machine learning: AI-driven decision making in business analytics. Intell. Syst. Appl. 2023, 18, 200188.
- Zhang, T.; Tang, Z.; Wu, J.; Du, X.; Chen, K. Short Term Electricity Price Forecasting Using a New Hybrid Model Based on Two-Layer Decomposition Technique and Ensemble Learning. Electr. Power Syst. Res. 2022, 205, 107762.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444.
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. Int. J. Comput. Vis. 2020, 128, 336–359.
- Zhang, Y.; Hong, D.; McClement, D.; Oladosu, O.; Pridham, G.; Slaney, G. Grad-CAM helps interpret the deep learning models trained to classify multiple sclerosis types using clinical brain magnetic resonance imaging. J. Neurosci. Methods 2021, 353, 109098.
- Mohamed, E.; Sirlantzis, K.; Howells, G. A review of visualisation-as-explanation techniques for convolutional neural networks and their evaluation. Displays 2022, 73, 102239.
- Niu, Z.; Zhong, G.; Yu, H. A Review on the Attention Mechanism of Deep Learning. Neurocomputing 2021, 452, 48–62.
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; ACM: San Francisco, CA, USA, 2016; pp. 785–794.
- Di Persio, L.; Fraccarolo, N. Energy Consumption Forecasts by Gradient Boosting Regression Trees. Mathematics 2023, 11, 1068.
- Zhang, C.; Liu, C.; Zhang, X.; Almpanidis, G. An up-to-date comparison of state-of-the-art classification algorithms. Expert Syst. Appl. 2017, 82, 128–150.
- Agatonovic-Kustrin, S.; Beresford, R. Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J. Pharm. Biomed. Anal. 2000, 22, 717–727.
- Trillo, J.R.; González-López, F.; Morente-Molinera, J.A.; Magán-Carrión, R.; García-Sánchez, P. Evaluation of Explainable, Interpretable and Non-Interpretable Algorithms for Cyber Threat Detection. Electronics 2025, 14, 3073.
- Sim, T.; Choi, S.; Kim, Y.; Youn, S.H.; Jang, D.-J.; Lee, S.; Chun, C.-J. eXplainable AI (XAI)-Based Input Variable Selection Methodology for Forecasting Energy Consumption. Electronics 2022, 11, 2947.
- Panagoulias, D.P.; Sarmas, E.; Marinakis, V.; Virvou, M.; Tsihrintzis, G.A.; Doukas, H. Intelligent Decision Support for Energy Management: A Methodology for Tailored Explainability of Artificial Intelligence Analytics. Electronics 2023, 12, 4430.
- Shin, J.; Moon, H.; Chun, C.-J.; Sim, T.; Kim, E.; Lee, S. Enhanced Data Processing and Machine Learning Techniques for Energy Consumption Forecasting. Electronics 2024, 13, 3885.
- Aslam, S.; Aung, P.P.; Rafsanjani, A.S.; Majeed, A.P.P.A. Machine Learning Applications in Energy Systems: Current Trends, Challenges, and Research Directions. Energy Inform. 2025, 8, 62.
- Rasheed, A. Improving prediction efficiency by revolutionary machine learning models. Mater. Today Proc. 2023, 81, 577–583.
- Karpatne, A.; Atluri, G.; Faghmous, J.H.; Steinbach, M.; Banerjee, A.; Ganguly, A.; Shekhar, S.; Samatova, N.; Kumar, V. Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data. IEEE Trans. Knowl. Data Eng. 2017, 29, 2318–2331.
- Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-Informed Machine Learning. Nat. Rev. Phys. 2021, 3, 422–440.
- Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-Informed Neural Networks: A Deep Learning Framework for Solving Forward and Inverse Problems Involving Nonlinear Partial Differential Equations. J. Comput. Phys. 2019, 378, 686–707.
- Wang, Q.; Li, Y.; Li, R. Integrating artificial intelligence in energy transition: A comprehensive review. Energy Strategy Rev. 2025, 57, 101600.
- Akkaya, B.; Apostu, S.A.; Hysa, E.; Panait, M. (Eds.) Digitalization, Sustainable Development, and Industry 5.0: An Organizational Model for Twin Transitions; Emerald Publishing Limited: Leeds, UK, 2023.
- Pătrașcu, A.; Bucur, C.; Tănăsescu, A.; Toader, F.A. Proposal of a Machine Learning Predictive Maintenance Solution Architecture. Int. J. Comput. Commun. Control 2024, 19, 6499.
- Semmelmann, L.; Henni, S.; Weinhardt, C. Load forecasting for energy communities: A novel LSTM-XGBoost hybrid model based on smart meter data. Energy Inform. 2022, 5, 24.
- Zhang, L.; Wen, J.; Li, Y.; Chen, J.; Ye, C.; Fu, Y.; Livingood, W. A review of machine learning in building load prediction. Appl. Energy 2021, 285, 116452.
- O’Connor, C.; Bahloul, M.; Prestwich, S.; Visentin, A. A Review of Electricity Price Forecasting Models in the Day-Ahead, Intra-Day, and Balancing Markets. Energies 2025, 18, 3097.
- Lu, R.; Hong, S.H.; Yu, M. Demand Response for Home Energy Management Using Reinforcement Learning and Artificial Neural Network. IEEE Trans. Smart Grid 2019, 10, 6629–6639.
- Laitsos, V.; Vontzos, G.; Bargiotas, D.; Daskalopulu, A.; Tsoukalas, L.H. Data-Driven Techniques for Short-Term Electricity Price Forecasting through Novel Deep Learning Approaches with Attention Mechanisms. Energies 2024, 17, 1625.
- Devaraj, J.; Madurai Elavarasan, R.; Shafiullah, G.; Jamal, T.; Khan, I. A Holistic Review on Energy Forecasting Using Big Data and Deep Learning Models. Int. J. Energy Res. 2021, 45, 13489–13530.
- Saxena, N.; Kumar, R.; Rao, Y.K.S.S.; Mondloe, D.S.; Dhapekar, N.K.; Sharma, A.; Yadav, A.S. Hybrid KNN-SVM Machine Learning Approach for Solar Power Forecasting. Environ. Chall. 2024, 14, 100838.
- Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Mohamed, N.A.; Arshad, H. State-of-the-art in artificial neural network applications: A survey. Heliyon 2018, 4, e00938.
- Zournatzidou, G. Advancing Sustainability Through Machine Learning: Modeling and Forecasting Renewable Energy Consumption. Sustainability 2025, 17, 1304.
- Benti, N.E.; Chaka, M.D.; Semie, A.G. Forecasting Renewable Energy Generation with Machine Learning and Deep Learning: Current Advances and Future Prospects. Sustainability 2023, 15, 7087.
- Mienye, I.D.; Swart, T.G. A Comprehensive Review of Deep Learning: Architectures, Recent Advances, and Applications. Information 2024, 15, 755.
- Ahmed, S.F.; Alam, M.S.B.; Hassan, M.; Rozbu, M.R.; Ishtiak, T.; Rafa, N.; Mofijur, M.; Ali, A.B.M.S.; Gandomi, A.H. Deep learning modelling techniques: Current progress, applications, advantages, and challenges. Artif. Intell. Rev. 2023, 56, 13521–13617.
- Pinsky, E.; Meunier, E.; Moreau, P.; Sharma, T. A Simple Computational Approach to Predict Long-Term Hourly Electric Consumption. Eng. Proc. 2024, 68, 59.
- Jierula, A.; Wang, S.; Oh, T.-M.; Wang, P. Study on Accuracy Metrics for Evaluating the Predictions of Damage Locations in Deep Piles Using Artificial Neural Networks with Acoustic Emission Data. Appl. Sci. 2021, 11, 2314.
- Hodson, T.O. Root-mean-square error (RMSE) or mean absolute error (MAE): When to use them or not. Geosci. Model Dev. 2022, 15, 5481–5487.
- Frey, E.; Seimon, B.; Peist, M. PanelSplit Software. 2024. Available online: https://github.com/4Freye/panelsplit (accessed on 12 February 2026).
- Ansh, T. Global Data on Sustainable Energy (2000–2020). Available online: https://www.kaggle.com/datasets/anshtanwar/global-data-on-sustainable-energy (accessed on 1 April 2026).
- Kim, Y.S.; Kim, M.K.; Fu, N.; Liu, J.; Wang, J.; Srebric, J. Investigating the impact of data normalization methods on predicting electricity consumption in a building using different artificial neural network models. Sustain. Cities Soc. 2025, 118, 105570.
- Biernacki, A. Evaluating Filter, Wrapper, and Embedded Feature Selection Approaches for Encrypted Video Traffic Classification. Electronics 2025, 14, 3587.
- Pereira, D.G.; Afonso, A.; Medeiros, F.M. Overview of Friedman’s Test and Post-hoc Analysis. Commun. Stat.-Simul. Comput. 2015, 44, 2636–2653.
- Welsch, D.; Neuhäuser, M. Wilcoxon-Signed-Rank Test. In International Encyclopedia of Statistical Science; Lovric, M., Ed.; Springer: Berlin/Heidelberg, Germany, 2025.