A Survey of Machine Learning Models in Renewable Energy Predictions

Abstract: The use of renewable energy to reduce the effects of climate change and global warming is a growing trend. To improve the accuracy of renewable-energy predictions, various prediction techniques have been developed. The aims of this review are as follows. First, this survey provides a review and analysis of machine-learning models in renewable-energy predictions. Secondly, this study describes the procedures used in machine-learning models for renewable-energy predictions, including data preprocessing techniques, parameter selection algorithms, and prediction performance measurements. Thirdly, analyses of the sources of renewable energy, values of the mean absolute percentage error, and values of the coefficient of determination were conducted. Finally, some potential opportunities for future work are provided at the end of this survey.


Machine Learning
In the past decades, machine-learning techniques have been widely applied to many fields associated with data-driven problems. Machine learning draws on many interdisciplinary areas, such as statistics, mathematics, artificial neural networks, data mining, optimization, and artificial intelligence. Machine-learning techniques seek the relations between input data and output data, with or without mathematical forms of the problems. After machine-learning models are well trained on the training dataset, decision makers can obtain satisfactory forecasts by feeding the forecasting input data into the trained models. The data pre-processing procedure plays an essential role in machine learning and can improve the performance of machine learning efficiently [1]. Machine-learning technology mainly uses three learning methods: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning takes advantage of labeled data in the training phase. Unsupervised learning automatically categorizes input data that have not been labeled in advance into clusters according to certain criteria. Thus, the number of clusters generally depends on the clustering criteria used. Reinforcement learning learns through interaction with the external environment, obtaining feedback in order to maximize the expected benefits. Based on these three basic learning principles, many theoretical mechanisms and applications have been proposed [2]. Due to the rapid development of information technology in hardware and software, deep learning, a sub-field of machine learning, has been booming recently. Deep learning is able to capture characteristic nonlinear attributes and high-level invariant data structures and, therefore, has been applied in many fields to obtain satisfactory performance [3].
Additionally, some studies have focused on forecasting renewable energy by using a single machine-learning model [4]. However, because datasets, time steps, prediction ranges, settings, and performance indicators are so diverse, it is difficult to improve the forecasting performance by using a single machine-learning model. Therefore, in order to improve the prediction performance, some studies developed hybrid machine-learning models or overall prediction methods in renewable-energy predictions. Recently, support-vector machines and deep-learning methods have become very popular in the field of machine learning [5].

Renewable Energy
With the rapid development of global industrialization, it has been recognized that excessive consumption of fossil fuels will not only accelerate the depletion of fossil fuel reserves but also have an adverse impact on the environment. These influences will result in increasing health risks and threats of global climate change. Alongside fossil fuels and nuclear energy, renewable energy is currently the fastest-growing energy source. Renewable energy refers to reusable energy that can be recovered in nature, such as solar energy, wind power, hydropower, biomass energy, waves, tides, and geothermal energy. Because of its sustainability and low environmental pollution, renewable energy has attracted attention, and plenty of related studies have been performed recently. One of the most important challenges of renewable energy in the near future is the energy supply. The renewable supply is the integration of renewable energy sources into the existing or future energy supply structures [5]. The development of renewable-energy systems will help to cope with essential issues of current energy problems, such as improving the reliability of energy supply and solving regional energy shortages. However, due to the huge volatility and the random nature of renewable energy, power generation from these sources is intermittent and chaotic. Therefore, accurately dealing with the randomness of renewable energy data is still a challenge to be overcome. High-precision energy monitoring can improve the efficiency of the energy system. Energy forecasting technology plays a vital role in the development, management, and policy making of energy systems. As the ways of providing electrical energy from renewable energy sources increase, it is very important to develop proper technologies to store renewable energy [6]. Many studies have revealed that various machine-learning models have been employed in renewable-energy predictions.
The data-driven models do provide practical ways of making renewable-energy predictions. In addition, hybrid machine-learning models were designed to increase the prediction accuracy of renewable energy. Various time intervals, such as minutes, hours, days, and weeks, were employed to predict renewable energy according to different purposes of predictions. Forecasting accuracy and efficiency were typically utilized to evaluate the performance of machine-learning models in renewable-energy predictions [7]. Table 1 lists some recent survey studies of applying machine-learning techniques to renewable-energy predictions. Mellit et al. [8] reviewed the literature on the topic of photovoltaic power prediction by using artificial intelligence technology, machine-learning techniques, deep learning, and hybrid methods. The authors pointed out that numerical weather predictions combined with feature extraction and deep-learning techniques can achieve long-term photovoltaic-power-generation prediction, and that long short-term memory networks, convolutional neural networks, and recurrent neural networks are able to capture the time dependence of data in photovoltaic power predictions. Wang et al. [5] reviewed renewable-energy prediction methods based on deep learning. The prediction methods were divided into four categories, namely deep belief networks, the stack autoencoder, deep recurrent neural networks, and others. In addition, some data pre-processing and post-processing techniques were employed to improve forecasting accuracy. Bermejo et al. [9] investigated artificial neural networks in forecasting energy and reliability. Energy sources in this study included solar, hydraulic, and wind. Many cases were provided to demonstrate the superiority of artificial neural networks in forecasting energy and reliability. Mosavi et al. [10] surveyed applications and the taxonomy of machine-learning techniques in energy systems.
The authors reported that hybrid models are superior to the traditional machine-learning models in applications of energy systems. Ahmed and Khalid [11] investigated forecasting models of renewable power systems from aspects of power dispatching systems, energy storage systems, the energy policy and markets, reliability, and optimal reserve capacity. This review was useful to the power sector by providing recent trends and forecasting improvements of the system design and operations of power systems. Zendehboudi et al. [7] reviewed applications of support-vector machine (SVM) models to forecasting solar and wind energy and indicated that support-vector machine models outperformed the other forecasting models in terms of prediction accuracy. In addition, the authors revealed that hybrid support-vector machine models can obtain better forecasting results than a single support-vector machine model. Das et al. [12] conducted a review and performance analysis of forecasting techniques in solar photovoltaic-power generation. This study indicated that the use of artificial neural networks and support-vector machine models has been popular in this field. The authors pointed out that variations in weather conditions resulted in high errors in forecasting solar power. Because solar radiation is a main source of solar energy, Voyant et al. [13] performed a review of using machine-learning models in forecasting solar radiation. The authors reported that artificial neural networks, support-vector regression, regression tree, random forest, and gradient boosting are promising tools in solar radiation prediction, and hybrid models or ensemble forecast approaches are feasible ways to improve forecasting accuracy. Pérez-Ortiz et al. [14] reviewed classification algorithms for renewable energy problems and provided helpful insights for researchers as well as practitioners in this area. Khare et al. [15] presented a comprehensive survey of hybrid renewable-energy systems.
This study included hybrid renewable-energy system issues of feasibility analysis, optimum sizing, modeling, control and reliability, applications of evolutionary techniques, and game theory. Table 1 shows that applications of machine-learning techniques to renewable energy have been growing and that this issue is worth further exploration.

Review Studies of Applying Machine-Learning Techniques to Renewable Energy
Aiming at a broad survey of machine-learning models in renewable-energy predictions in recent years, this investigation elucidates categories of renewable energy sources, machine-learning models, data pre-processing techniques, parameter selection of machine-learning models, and performance measurements of machine-learning models in renewable-energy predictions. Moreover, directions of possible future work are pointed out as well. The structure of this study is as follows. Section 2 depicts data preprocessing methods, machine-learning models, and parameter selection techniques; Section 3 presents measurements of model performances; Section 4 draws conclusions and provides directions of future work.

Machine-Learning Models in Renewable-Energy Predictions
Renewable energy is an environment-friendly energy source. Such carbon-free technology helps to combat climate change and has become an essential alternative to current petrochemical energy sources. However, renewable energy often has diverse characteristics, which lead to uncertainties in renewable-energy power systems. Therefore, the prediction of renewable energy is an important way to deal with this problem. Machine learning is a data-driven process used to establish an intelligent outcome. The basic steps of machine learning are data collection and preprocessing, feature selection and extraction, model selection, and model verification. Sharifzadeh et al. [16] categorized machine-learning models in renewable energy into statistical models, artificial intelligence techniques, and hybrid methods. This study collected 130 recent papers, and Figure 1 shows the pie chart in terms of seven renewable energy sources. It can be noticed that solar energy and wind energy each account for close to 40 percent, and each of the other five renewable energy fields accounts for less than five percent of the literature in this study. Table 2 lists papers on using machine-learning models in renewable-energy predictions since 2017. For wind-energy predictions, statistical methods were used in the early stage [17,18]. Recent studies employed artificial intelligence and machine-learning techniques in wind-energy predictions, such as support-vector machines [16][17][18][19], random forest classification algorithms [20,21], gradient boosting decision trees [22], adaptive neuro-fuzzy inference system (ANFIS) [23], artificial neural network [24][25][26][27][28][29][30][31][32][33][34][35][36][37][38], and long short-term memory networks [39][40][41][42][43][44][45]. The machine-learning techniques were able to capture data trends in forecasting wind energy.
In addition, hybrid algorithms have been developed to improve forecasting models effectively and efficiently by integrating data processing approaches and optimization algorithms into machine-learning models [46][47][48][49]. Extreme-learning machines (ELM) have been used in forecasting wind-power generation [50][51][52][53][54]. Wavelet decomposition (WD), wavelet packet decomposition (WPD), and ensemble empirical mode decomposition (EEMD) were employed to eliminate the influence of noise in the original data and can effectively improve the accuracy of wind-speed predictions [55,56]. [57,58] used the Bayesian model-based method to predict hybrid wind power. The numerical results indicated that the Bayesian model-based method can provide more accurate results than the other forecasting models in forecasting hybrid wind power. [59] developed a dynamic integration method to forecast wind speed. The phase space reconstruction (PSR) was used to dynamically select input vectors of prediction models. In addition, the kernel principal component analysis (KPCA) was used to extract the nonlinear features of the high-dimensional feature space reconstructed by PSR. Then, the core vector regression (CVR) model, with parameters determined by the competition over resource (COR) algorithm, was established. The empirical results revealed that the proposed model significantly increased the prediction accuracy and statistically outperformed other forecasting approaches. A hybrid model including WPD, convolutional neural networks (CNN), and long short-term memory (LSTM) was designed to predict wind speed and provided satisfactory prediction results [60]. A hybrid model containing empirical wavelet transform, long short-term memory network, and Elman neural networks (EWT-LSTM-Elman) was proposed and outperformed the other forecasting models in wind-speed predictions [61].
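The decomposition-based hybrid models described above share a common pattern: decompose the series, forecast each component separately, and sum the component forecasts. The following is a minimal sketch of that pattern only, not of any cited method. It assumes a trailing moving average as a stand-in for WD, WPD, or EEMD and a least-squares AR(1) model as a stand-in for the machine-learning forecaster; the function names and the synthetic wind-speed series are hypothetical.

```python
import math
import random


def moving_average(series, window):
    """Trailing moving average; early points use a shorter window."""
    out = []
    for t in range(len(series)):
        lo = max(0, t - window + 1)
        out.append(sum(series[lo:t + 1]) / (t + 1 - lo))
    return out


def decompose(series, window=5):
    """Split a series into a smooth component and a detail component."""
    smooth = moving_average(series, window)
    detail = [y - s for y, s in zip(series, smooth)]
    return smooth, detail


def fit_ar1(series):
    """Least-squares fit of x[t] = a + b * x[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def forecast_next(series):
    """One-step-ahead AR(1) forecast for a single component."""
    a, b = fit_ar1(series)
    return a + b * series[-1]


def hybrid_forecast(series, window=5):
    """Forecast each component separately, then sum the forecasts."""
    smooth, detail = decompose(series, window)
    return forecast_next(smooth) + forecast_next(detail)


# Synthetic hourly wind-speed-like data (illustrative only).
rng = random.Random(0)
wind = [8 + 2 * math.sin(2 * math.pi * t / 24) + rng.gauss(0, 0.5)
        for t in range(96)]
next_speed = hybrid_forecast(wind)
```

In a real hybrid model, each component would typically receive its own tuned forecaster, which is where the reported accuracy gains are expected to come from.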

The Status of Machine-Learning Technology Used in Renewable-Energy Forecasting
In solar-energy prediction, solar irradiance can be considered as a time series with different time scales. The most common time series forecasting method was the autoregressive moving average (ARMA) [62,63]. Machine-learning models and deep-learning techniques, including support-vector machines [64][65][66][67][68][69][70] and artificial neural networks (ANN) [71][72][73][74][75], have become popular data-driven prediction models. Deep learning includes CNN [76], deep neural network (DNN) [77][78][79], long short-term memory [64][80][81][82][83][84][85], and the other hybrid models used in multistep predictions of solar energy. A method for predicting solar radiation sequences was introduced by using multiscale decomposition techniques, such as empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), and wavelet decomposition, to investigate several clear-sky index datasets [86]; the method combined a linear autoregressive (AR) process with a nonlinear method. The results showed that the WD hybrid model (WD-NN) performed the best in predicting solar radiation. In terms of applications of SVM to solar-energy predictions, parameter selection is crucial to the forecasting accuracy of SVM. Thus, some studies have used optimization methods [87][88][89][90], such as grid search, firefly algorithm (FFA), genetic algorithm (GA), and particle swarm optimization (PSO), to determine parameters of SVM models.
For hydropower prediction, a hybrid method of ANFIS and gray wolf optimization (GWO) was designed to forecast hydropower generation [91]. In renewable energy, biomass and hydrogen can increase global energy sustainability and reduce greenhouse gas emissions [92]. It was reported that machine-learning algorithms, such as linear regression (LR), K nearest neighbor regression (KNN), support-vector machine regression (SVMR), and decision-tree regression (DTR), can be used to model gasification products without further revisions [93]. The gradient-boosted regression trees (GBRT) were employed to forecast the high heating value (HHV) of biomass with a dataset including 511 biomass samples [94]. Four models, namely polynomial regression, support-vector regression, decision-tree regression, and multilayer perceptron, were used to predict CO, CO2, CH4, and H2 during biomass gasification and HHV outputs [95]. The numerical results illustrated that multilayer perceptron and decision-tree regression can provide better forecasting results than the other models. A new hybrid algorithm based on the combination of SVM and simulated-annealing (SA) optimization technology was designed to study the baking process in order to obtain a prediction model of the high heating value (HHV) in biomass [96]. The results of this study indicated that the SVM-SA hybrid model can improve the prediction performance. In wave-power-generation predictions, the prediction algorithms were used to predict the height of coastal waves in a relatively short period of time. A hybrid improved complete ensemble EMD-ELM model was proposed to predict the wave heights [97]. The proposed model was an effective tool in forecasting wave heights. A Bayesian optimization with hybrid grouping genetic algorithms and an extreme-learning machine (BO-GGA-ELM) model was proposed to forecast the wave height and the wave energy flux at the target ocean [98]. Bayesian optimization was used to provide wave characteristics of the problem.
In predictions of tidal-power generation, a univariate prediction method based on wavelet transform and support-vector regression (SVR) was designed to forecast the tide velocity and directions with high accuracy [99]. A power flow prediction method based on clustering technology was presented to obtain harmonic power flows [100]. These hybrid models combined wavelet and artificial neural networks (WNN and ANN) with the Fourier series based on the least-squares method (FSLSM). The numerical results indicated that the model can obtain high forecasting accuracy. A method using probabilistic machine-learning techniques in the Bayesian framework was proposed to predict power flow [101]. The Gaussian process was used in this study to model tidal data. Based on ensemble empirical mode decomposition (EEMD) and least-squares support-vector machines (LSSVM), a forecasting model was developed to improve the accuracy of predicting tidal current speed (TCS) and tidal current direction (TCD) [102]. A two-stage method for modeling and predicting tidal currents was presented [103]. This method used a fuzzy feature selection technique to extract the dataset from the tide velocity and directions. Then, support-vector regression was employed to make the tidal forecast. For geothermal-energy predictions, [104] employed an LSTM encoder-decoder model to forecast geothermal energy. The LSTM encoder and the LSTM decoder were applied to learn the past geothermal-energy production and to predict the future geothermal-energy generation, respectively. The predictions of geothermal-energy generation were conducted according to the output of the encoder and the batch size of LSTM with optimized periods and sequence lengths. [105] investigated heat exchangers of geothermal installations with the goal of using previous data to obtain accurate predictions of the sensors placed along the heat exchangers in systems.
The results showed that the time-dependent neural networks (TDNN) model is superior to other standard regression indicators in all cases. Bayesian linear regression, neural network regression, boosted decision-tree regression, and decision forest regression were used to predict the water levels of a reservoir, which is critical to hydropower generation [106]. The study revealed that Bayesian linear regression can obtain better forecasting results than the other three forecasting models in terms of forecasting accuracy.
[107] designed a forecasting system (LMS-BSDP) that employs inflow predictions with various lead-time intervals to increase the performance and reliability of hydropower stations in hydropower predictions. Based on the proposed forecasting model with accurate forecasting performance, proper operation policies of hydropower stations can be provided. A hybrid model (PSO-SVM) based on support-vector machines and particle swarm optimization (PSO) was presented to forecast the higher heating value (HHV) in a roasting process of biomass-energy generation [108]. The particle swarm optimization was used to determine parameters of support-vector machines. Experimental results indicated that the proposed PSO-SVM model with cubic kernel functions can generate more accurate HHV forecasts than the other forecasting models. Fuzzy inference systems (FIS) and artificial neural networks (ANN) were proposed to predict wave energy in faster and cheaper ways [109]. This study reported that the proposed forecasting models are able to predict wave power effectively and efficiently at any place in deep oceanic waters. [110] used the Gaussian process (GP) to forecast short-term waves, which is essential to wave energy converters. Two other forecasting models, neural network (NN) and autoregressive modeling (AR), were employed to deal with the same dataset to demonstrate the performance of the GP models. The numerical results illustrated that the GP model outperformed the other two models. Furthermore, the GP model is capable of dealing with the uncertainty of short-term waves. An interval type-2 fuzzy inference system (IT2FIS) was designed to forecast waves [111]. The proposed IT2FIS used metacognitive learning algorithms to capture knowledge from wave data. This study revealed that the proposed IT2FIS is a promising alternative in wave forecasting.
Multiple regression and artificial neural networks were employed to predict the rate of penetration (ROP), a drilling parameter, in geothermal-energy generation [112]. The investigation showed that both models can provide high values of correlation and, therefore, improve the performance of ROP predictions. [113] used random-forest algorithms and data collected from geographic information systems to predict very shallow geothermal opportunity. The study indicated that the random forest is a feasible and promising way to forecast geothermal energy when data, such as topography, weather, and soil, are available. Table 2 illustrates that wind and solar are the two renewable energies most often investigated, and artificial intelligence and hybrid models are the two most frequently employed techniques.
Table 2. Summary of sources, models, and techniques in renewable-energy predictions.

Data Preprocessing Techniques
In energy forecasting, the modeling process can be divided into four stages, namely data preprocessing, determining proper hyper-parameters of models, training models, and testing and forecasting [49]. Data preprocessing includes removing data with missing values and data exceptions [57]. These data exceptions arise from gaps caused by abnormal data collection. For example, the solar radiation data at night are meaningless to solar-power predictions [86]. Thus, data exceptions have to be filtered out. Table 3 lists data pre-processing techniques used by machine-learning models for renewable-energy predictions and illustrates that the decomposition method is more commonly used than the other data pre-processing techniques. For data splitting, robust machine-learning models are generated by dividing original data into a training dataset, a validation dataset, and a testing dataset, or performing a cross-validation procedure during the modeling process [57,95,129]. The kernelized Mahalanobis distance (kWMD) method can effectively calculate the similarity of two unknown sample sets [71]. Data decomposition is used to preprocess the original signal and can increase the forecasting accuracy. Data decomposition decomposes a high-dimensional dataset into several low-dimensional sub-datasets and is often employed in signal processing problems [147]. Data discretization converts continuous data into discrete data and is especially useful in forecasting renewable energy by weather data [100,130,137,138]. The categorical-feature encoding method removes impurities and redundancy in the original data and designs more efficient features to depict the relationship between problems and prediction models [67]. Feature selection is a technique to seek proper independent variables and remove the irrelevant or redundant attributes [43].
Data imputation is a process that replaces missing values with substitute values when null data occur in the modeling process [62,139]. Data normalization refers to the adjustment of datasets expressed in different scales so that different datasets can be processed by machine-learning models. For example, weather attributes have to be normalized when conducting wind-speed predictions [16,32,38,39,46,57,66]. Data standardization is a process that converts data from different scales to the same scale, and Z-score values are used to measure data scales [24]. Data transformation is a way to convert the state of the data. The Markov chains method is a typical data transformation approach [140]. Tables 4 and 5 show data pre-processing methods utilized by machine-learning models in predicting renewable energy in terms of types of renewable energy sources and variables, respectively. It can be observed that decomposition approaches for wind-energy predictions are the most often employed. A possible reason is that decomposing wind data into components with different frequencies improves the forecasting performance of machine-learning models.
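Several of the preprocessing steps above can be illustrated in a few lines. The sketch below is an illustration under stated assumptions rather than the procedure of any cited study: it uses mean imputation, min-max normalization to [0, 1], Z-score standardization with the population standard deviation, and a chronological 60/20/20 split. The function names, the split fractions, and the sample wind-speed values are all hypothetical.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def chronological_split(values, train_frac=0.6, val_frac=0.2):
    """Split a time series into training, validation, and testing parts
    without shuffling, so that later data never leak into training."""
    n = len(values)
    i = round(n * train_frac)
    j = i + round(n * val_frac)
    return values[:i], values[i:j], values[j:]


# Hypothetical wind-speed readings (m/s) with two missing values.
wind_speed = [5.0, None, 7.0, 9.0, 11.0, None, 8.0, 6.0, 10.0, 8.0]
filled = impute_mean(wind_speed)
normalized = min_max_normalize(filled)
standardized = z_score_standardize(filled)
train, val, test = chronological_split(normalized)
```

In practice the scaling statistics (minimum, maximum, mean, standard deviation) would usually be estimated on the training part only and then applied to the validation and testing parts; the sketch scales the whole series for brevity.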

Parameter Selection of Machine-Learning Models in Renewable-Energy Predictions
Parameter selection greatly influences the performance of machine-learning models. Most machine-learning models have more than two parameters, so the trial-and-error method is not feasible. Thus, metaheuristics have been a popular way to seek proper parameters of machine-learning models. Basically, the forecasting error functions serve as objective functions of metaheuristics for optimization. In each iteration, a new set of parameters is generated and used by a machine-learning model to make a forecast. Then a new forecasting error is produced [47]. Sometimes a validation dataset is split from the training dataset to prevent overfitting in the training process [148]. Majid Dehghani et al. [91] used the gray wolf optimization (GWO) method to select conjunction parameters of the ANFIS model to predict hydropower generation. GWO can significantly improve the forecasting performance of ANFIS. Wu et al. [50] employed extreme-learning machines with multi-objective grey wolf optimization (MOGWO) to achieve effective wind predictions. Lin et al. [87] proposed an improved moth-flame optimization algorithm (IMFO) to optimize the parameters of the SVM model for predicting photovoltaic-power generation. Liu et al. [88] used the firefly algorithm (FFA) to determine parameters of SVM models in forecasting solar radiation. Numerical results indicated that the proposed SVM-FFA model outperformed the other forecasting models in terms of forecasting accuracy. Particle swarm optimization (PSO) is a popular method used to determine parameters in wind and solar forecasting models [23,48,89,142,144]. Fan et al. [89] used whale optimization, PSO, and bat algorithms (BAT) for parameter determination of SVM models in solar-radiation predictions. The numerical results showed that support-vector machines with bat algorithms (SVM-BAT) performed the best in terms of the prediction accuracy and the rate of convergence. Demircan et al.
[145] designed an empirical regression model with the artificial bee colony (ABC) algorithm to predict global solar radiation. The forecasting model depended on durations and angles of sunlight to make a forecast. The artificial bee colony algorithm was employed to select parameters of empirical regression models. Li et al. [46] conducted short-term wind-power predictions by using an improved dragonfly algorithm to select parameters of the SVM model. García Nieto et al. [96] presented a hybrid model of support-vector machine and simulated annealing (SA) to forecast the HHV of the biomass. The SA was used to decide SVM parameters, and the proposed hybrid model can provide promising forecasting results. Lin et al. [47] presented a deep belief network with genetic algorithms (DBNGA) model to predict wind speeds. The genetic algorithm was used for parameter selection of deep belief networks. Zhou et al. [51] proposed a new hybrid model, which contains variational mode decomposition (VMD), backtracking search algorithms (BSA), and regularized extreme-learning-machine (RELM) techniques, for wind-speed predictions. The BSA technique was employed to select parameters of RELM models. Li and Jin [49] used the multi-objective ant lion algorithm to determine parameters of extreme-learning machine (ELM) models in wind-speed predictions and obtained satisfactory forecasting results. Salcedo-Sanz et al. [52] used coral-reef algorithms and ELM for wind-speed predictions. The coral-reef algorithms were employed to select parameters of ELM models. Cornejo-Bueno et al. [98] used Bayesian optimization to obtain ELM parameters in the ocean-wave prediction system. Zameer et al. [38] developed a short-term wind-energy prediction method by artificial neural networks with genetic programming for parameter selection. Papari et al. [103] used modified harmony search (MHS) to determine parameters of SVR in power-flow predictions.
The results showed that the proposed SVR model with MHS can generate more accurate results than the other forecasting models. Figure 2 provides a collection of metaheuristics for machine-learning parameter selection in renewable-energy predictions. It shows that extreme-learning machines and support-vector machines are the two machine-learning models that most frequently apply metaheuristics for parameter selection.
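The loop described in this section, in which a metaheuristic proposes parameters, the model produces forecasts, and the forecasting error on held-out data is fed back as the objective, can be sketched as follows. This is a minimal illustration under stated assumptions and does not reproduce any cited study: Holt's linear exponential smoothing, with two parameters `alpha` and `beta`, stands in for the SVM, ELM, or ANFIS models; the optimizer is a basic global-best particle swarm with assumed inertia and acceleration coefficients; and the series is synthetic. All names are hypothetical.

```python
import math
import random


def holt_one_step_rmse(series, alpha, beta, start):
    """RMSE of one-step-ahead Holt forecasts for series[start:]."""
    level, trend = series[0], series[1] - series[0]
    sq_errors = []
    for t in range(1, len(series)):
        forecast = level + trend
        if t >= start:  # score only the validation segment
            sq_errors.append((series[t] - forecast) ** 2)
        new_level = alpha * series[t] + (1 - alpha) * forecast
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def pso(objective, bounds, n_particles=15, n_iters=40, seed=0):
    """Basic global-best particle swarm optimization (minimization)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration weights
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp each coordinate to its search interval.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


# Synthetic series; the last 40 of the first 160 points act as validation
# data, and the remaining points stay untouched for final testing.
data_rng = random.Random(1)
series = [10 + 0.05 * t + 3 * math.sin(2 * math.pi * t / 24)
          + data_rng.gauss(0, 0.3) for t in range(200)]
train_val = series[:160]


def validation_error(params):
    return holt_one_step_rmse(train_val, params[0], params[1], start=120)


best_params, best_rmse = pso(validation_error, [(0.01, 1.0), (0.0, 1.0)])
```

Swapping in another metaheuristic changes only the `pso` function; the objective, which maps a parameter vector to a validation error, stays the same.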

Measurements of Forecasting Performance
Measurements of forecasting accuracy are investigated in this section. In total, 41 types of measurements of forecasting accuracy were gathered in this study. Table 6 lists measurements of forecasting accuracy used by more than 10 studies. Mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) are the three most frequently used measurements. Renewable energy can be expressed in various units, and values of renewable energy vary considerably across studies. To avoid the influence of units and magnitudes, MAPE is used here to depict forecasting accuracy. In Table 7, a total of 39 collected studies used MAPE to measure forecasting performance. Generally, many models were introduced in each study. Thus, Table 7 shows the best performance result in each study. According to Lewis [149], MAPE values less than 10 percent indicate highly accurate predictions. Thus, most forecasting accuracies of the collected studies are high in terms of MAPE. In addition, the coefficient of determination (R²) is another measurement specified in this study for analysis. The coefficient of determination represents the proportion of the variance in the dependent variable that is explainable by the independent variables. Table 8 lists 23 collected studies employing R² as a measurement in forecasting renewable energy. It can be observed that most values of R² are larger than 0.8. In addition, eight articles used R² and MAPE simultaneously as forecasting performance measurements. Table 9 lists values of R² and MAPE, and a correlation coefficient of −0.7869 was calculated from the numerical data. Thus, this study revealed that machine-learning models with higher R² values result in more accurate renewable-energy prediction in terms of MAPE. When employing machine-learning models in forecasting renewable energy, the coefficient of determination could be calculated before further steps of modeling are performed.
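For reference, the four measurements discussed above can be computed as in the following sketch. It assumes the standard textbook definitions, with MAPE expressed in percent and the coefficient of determination taken as one minus the ratio of the residual sum of squares to the total sum of squares; individual studies may define these slightly differently. The function names and the sample values are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical power output (kW) and forecasts.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

scores = {
    "MAE": mae(actual, predicted),         # 10.0 kW
    "MAPE": mape(actual, predicted),       # about 5.21 percent
    "RMSE": rmse(actual, predicted),       # 10.0 kW
    "R2": r_squared(actual, predicted),    # 0.992
}
highly_accurate = scores["MAPE"] < 10.0   # threshold attributed to Lewis [149]
```

MAE and RMSE carry the units of the data, whereas MAPE and R² are unit-free, which is why the latter two are easier to compare across studies that report energy in different units.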

Conclusions
Due to recent concerns about climate change and global warming, interest in renewable energy is booming. Thus, accurate prediction of renewable energy power has become crucial, and plenty of related studies have been conducted. In addition, the complexity of various environmental conditions in renewable-energy-generation systems makes closed mathematical forms inappropriate for describing such systems [136]. Therefore, applications of machine-learning models have become increasingly popular in renewable-energy predictions. This study reviewed machine-learning models in energy predictions in recent years and analyzed this topic from aspects of machine-learning models, renewable energy sources, data pre-processing techniques, parameter selection algorithms, and prediction performance measurements.
Findings of this study can be summarized as follows. First, applications of machine-learning techniques to renewable energy have been increasing, and artificial intelligence techniques and hybrid models in solar-energy and wind-energy predictions account for the majority of them. Secondly, the decomposition method is more often employed than the other data pre-processing techniques for machine-learning models in renewable-energy predictions. Thirdly, extreme-learning machines and support-vector machines are the two machine-learning models that most frequently apply metaheuristics to parameter selection. Finally, machine-learning models with larger R² values lead to more accurate renewable-energy predictions in terms of MAPE.
Some possible future research directions for machine-learning models in renewable-energy prediction are as follows. First, it can be observed that most applications of machine-learning technology in renewable-energy predictions have focused on solar or wind energy forecasting. Thus, beyond solar and wind-energy predictions, the other types of renewable-energy predictions, such as tidal energy, biomass energy, wave energy, hydraulic power, and geothermal energy, could be potential fields for future work. In addition, artificial intelligence techniques and hybrid models could be promising ways of forecasting renewable energy. Secondly, data pre-processing methods do influence the prediction performance of machine-learning models in renewable-energy predictions. However, this issue has not attracted much attention yet. Thus, the analysis of data preprocessing techniques and machine-learning models in renewable-energy predictions may be another direction for future research. Finally, parameter selection greatly influences the performance of machine-learning models in renewable-energy predictions. Thus, to improve the performance of machine-learning models in renewable-energy predictions, the use of new metaheuristics, such as a coronavirus optimization algorithm [150] for machine-learning parameter selection, is an encouraging opportunity for future research.