Enhanced Random Forest Model for Robust Short-Term Photovoltaic Power Forecasting Using Weather Measurements

Abstract: Short-term Photovoltaic (PV) Power Forecasting (STPF) is considered a topic of utmost importance in smart grids. The deployment of STPF techniques provides fast dispatching in the case of sudden variations due to stochastic weather conditions. This paper presents an efficient data-driven method based on an enhanced Random Forest (RF) model. The proposed method employs an ensemble of attribute selection techniques to manage the bias/variance trade-off for the STPF application and enhance the forecasting quality. The overall architecture gathers the relevant information to constitute a voted feature-weighting vector of weather inputs. The main emphasis in this paper is laid on the knowledge obtained from weather measurements. The feature selection techniques are based on Local Interpretable Model-Agnostic Explanations, Extreme Gradient Boosting, and Elastic Net. A comparative performance investigation using an actual database, collected from weather sensors, demonstrates the superiority of the proposed technique over several data-driven machine learning models when applied to a typical distributed PV system.


Introduction
Over the years, the exponential increase in global energy demand has become the leading cause of the rapid depletion of fossil fuels and of increased Greenhouse Gas (GHG) emissions from conventional generators [1]. To satisfy the rapid growth in energy consumption, the world has taken serious initiatives to deploy Renewable Energy Sources (RES) on a larger scale [2]. Among all RES, Solar Energy (SE) holds the greatest promise, being free, clean, and abundantly available [3]. For these reasons, it keeps increasing its share in the energy mix in the face of diminishing conventional fossil fuel energy sources and rising environmental protection concerns [3]. However, the discontinuity of PV power flow brings into question the reliability of a high penetration of PV systems, which greatly affects the dispatch accuracy. Moreover, the negative effects of sudden weather changes on PV farms threaten the grid stability and raise the cost of allocating the spinning reserve [3]. Therefore, PV Power Forecasting (PPF) is a pivotal element for a reliable power supply, as it significantly reduces the sensitivity of energy systems to weather intermittency. PPF is mandatory for PV generators as it has a direct impact on the stability and reliability of the grid. Achieving accurate forecasting of PV power generation will facilitate the integration of SE into the power system.
In this context, the research community has been focusing on the development of effective forecasting techniques to handle pattern dependencies [4,5]. With computer hardware and software development, forecasting models take advantage of High-Performance Computing (HPC) to achieve higher effectiveness. The energy forecasting methods for PV generators can be generally classified into two categories: traditional methods and Artificial Neural Network (ANN)-based methods [6]. Traditional methods are mostly statistical and include regression techniques, Exponential Smoothing (ES) [7], Autoregressive (AR) and Moving Average (MA) models, and their generalizations such as Autoregressive Integrated Moving Average with exogenous inputs (ARIMAX), also known as Box-Jenkins models [8,9]. These models involve few parameters, which makes them simple and interpretable. In [10], a PPF method based on the Autoregressive Integrated Moving Average (ARIMA) model has been adopted for the design of an energy management system. Paper [7] exploits an ES State Space (ESSS) model for short-term solar irradiance forecasting. Nevertheless, direct PPF is not considered in that study [7]. In summary, the traditional approaches do not make use of the historical data generated by weather stations, leading to poor forecasting potential. Conversely, ANN has become one of the most commonly used approaches for PPF [11]. It has been reported in [12] that ANNs are easy to use for RES designs, especially for solar irradiance with related PV power [13].
Pioneering work is presented in [14], where it is shown that an ANN can generate deterministic and probabilistic PV power forecasts for three days ahead. An Analog Ensemble (AnEN) model has boosted the ANN accuracy using computed astronomical variables and past predictions of a deterministic Numerical Weather Prediction (NWP) model. However, the computational requirements for the model training are cumbersome. The finding is consistent with the results of the recent study in [15], which employs a Multi-Layer Perceptron (MLP)-based ANN. These contributions are completed in [16], where a comparison of different models is performed. Although the MLP method enhances the prediction performance, the uncertainty resulting from the assumptions of the pre-processed features could form a barrier to its practical implementation [16]. In [17], a Radial Basis Function (RBF)-based ANN has been proposed for online PPF. Such observations are confirmed in [18], where Feed-Forward Neural Networks (FFNNs) and RBF, two variants of ANN, are also used for solar PV power production predictions. The proposed model provides a Root Mean Square Error (RMSE) of 10.59% on a typical autumn day. Finally, in [19], an optimized ANN for one-day-ahead PPF has been presented. The proposed model makes use of dust and temperature to follow a PV plant, yielding a coefficient of determination $R^2 = 91.4\%$.
To sum up, the ANN-based models are well suited to PPF. The concern, however, has been raised that the above methods lack an end-to-end process for selecting essential features among the provided inputs, leading to a tedious manual preprocessing phase. Furthermore, these models sacrifice interpretability for high forecasting accuracy. In other words, the cited forecasting methods fail to provide insight into how they use the different inputs, which is a burdensome obstacle to their industrial acceptance.
However, it is evident that the input parameters do not contribute equally to the domain knowledge. Therefore, several methods were proposed to benefit from this inequality. In earlier work, the authors employed an enhanced RF for classification purposes [20]. Their approach integrates a slowness index with a feature ranking and selection process. According to the simulation results, a classification technique of static and dynamic nodes was adopted to mitigate the overlap between classes. Next, the Relative Mean Decrease Gini (RMDG) method is employed to determine the significance of the feature inputs to the domain knowledge and rank them accordingly. Although the proposed method outperforms a variety of prediction techniques, the computational effort is high. This burden is due to the large number of possibilities investigated, especially in a higher-dimensional space.
This paper deploys a state-of-the-art method to improve the performance of the RF model. The system employs three techniques to rank the weather and power inputs through a novel feature selection procedure. This ranking leads to a weighted Feature Vector Importance (FVI). The forecasting model relies on the FVI coefficients to generate multiple forecast outputs, one for each feature input. The final output is obtained by summing the single forecasts. The main contributions are outlined as follows:
1. An effective feature engineering technique is deployed based on six input parameters using three different approaches. The features are classified according to their weights. The FVI is calculated using the input parameters relevant to the PPF model.
2. A new multimodal prediction approach is comprehensively investigated.
3. The performance superiority of the proposed approach over Decision Trees (DT), K-Nearest Neighbors (KNN), and Random Forest (RF) is demonstrated using a real data set.
The remainder of the paper is organized as follows: Section 2 presents the related works on PV power forecasting and the common taxonomies and methodologies. Then, Section 3 defines the problem statement and the main contributions. Section 4 introduces the proposed methodology and investigates the FVI formulation. Afterward, Section 5 illustrates the implementation results and their interpretation, and provides a comprehensive comparison with state-of-the-art models. Finally, Section 6 discusses the presented results and concludes the study.

Literature Review
In [21,22], PPF was classified into four distinct classes according to the forecasting horizon, the forecasting method, and the forecasting output, as shown in Figure 1. Conceptually, the determination of a specific parameter variation relies on Physical Models (PMs), statistical techniques, or Artificial Intelligence (AI) techniques [23]. PMs use real-world conversion formulas to derive a deterministic closed-form solution for future behavior. PMs are commonly deployed with low-complexity systems and target short-term dependencies [23]. On the other hand, statistical forecasting is carried out through extensive analysis of numerical patterns based on statistical theory. Statistical algorithms require a dataset to build their domain knowledge since they neglect the investigated physical process [24]. Moreover, statistical and physical models have been found to yield unsatisfactory accuracy in numerous complex problems such as renewable energy forecasting and weather forecasting. AI techniques have been achieving worldwide acceptance for their accurate results and excellent generalization capabilities [25]. Although AI is very promising for power systems due to the abundance of computational resources and high-resolution databases, Machine Learning (ML) techniques have received little consideration compared to statistical and physical methods. For PV power forecasting, ML and statistical techniques are greatly influenced by the forecasting horizon [26].
According to the time domain, there are four distinct forecasting horizons: Ultra-Short-Term Forecasting (USTF) and Short-Term Forecasting (STF), required to be valid for seconds to one day, medium-term forecasting for one day to weeks, and Long-Term Forecasting (LTF), which may be valid for years [27]. For example, USTF was comprehensively investigated by the authors in [28]. Their work implements the Locality-Sensitive Hashing (LSH) algorithm. The taxonomy takes into account four weather conditions, specifically clear, cloudy, rainy, and snowy weather. LSH investigates the coupling between correlated weather features. The LSH-based system classifies the PV power segments and generates a PPF output. In [28], the authors proposed a hybrid method for accurate hourly PV power prediction based on gradient-descent Backpropagation (BP), the Shuffled Frog Leaping Algorithm (SFLA), and an Artificial Neural Network (ANN), named the BP-SFLA-ANN model. The model uses SFLA as a mediator between BP and the ANN: BP provides initial values of the primary hyperparameters of the ANN, and SFLA starts from this initial selection to search for more suitable parameters. The interaction between SFLA and BP led to superior ANN accuracy and a lower computational burden compared to an SFLA-ANN without the initial BP tuning.
So far, the forecasting methodologies can be classified as physical methods, statistical methods [29-31], AI methods, or a mix of them (hybrid models) [32,33]. The physical models use NWP models or satellite imagery alongside physical considerations such as meteorological or topological data. However, physical models are restricted to tedious mathematical approaches for specific PV plants, leading to poor generalization potential and complicated modeling [32,34]. Statistical models employ prediction models such as Moving Average (MA) and Autoregressive (AR) [35]. AI methods employ computational intelligence to predict the PV output accurately, taking advantage of advances in hardware and software [36]. The optimal models are often a combination of physical and statistical models [37]. According to the literature, the combination of different forecasting models can enhance the performance and efficiency of the overall prediction paradigm [38,39].
Moreover, PV generation forecasting methodologies can be categorized, according to the relationship between inputs and estimated outputs, as direct or indirect. The indirect approach estimates a key relevant element, such as the irradiance or the temperature, from which the PV power is then predicted [40], while the direct approach considers only the PV power as the output to be predicted from weather conditions. Some of the PV power forecasting methods are provided in Table 1. In particular, an ANN based on Statistical Feature Parameters (ANN-SFP) has been implemented for solar irradiance prediction [48]. The proposed model provides a 24-h weather forecast at the hourly level for all the daylight hours of the next day. However, it is incapable of following the PV generation on overcast and cloudy days. A three-stage prediction approach named optimized multi-layer backpropagation neural network has demonstrated better performance than the state of the art for ultra-short-term PPF [49]. This approach relies on the seasonal division of the weather data set to guarantee an adequate distribution of sample features across the different stages. Nevertheless, splitting the meteorological database can threaten the inherent consistency of the overall data set. Additionally, the next-day power generation of a PV plant located in the north of Italy has been forecast using a Physical Hybrid ANN (PHANN) [50]. By fusing the ANN with a physical clear-sky solar irradiance model, the proposed model improved prediction accuracy on some of the selected days but was unable to provide stable and improved forecast results in peculiar weather conditions. A Seasonal Autoregressive Integrated Moving Average (SARIMA)-Random Vector Functional Link (RVFL) model is employed for short-term PPF [51]. A Maximum Overlap Discrete Wavelet Transform technique is implemented to assist the hybrid model for better generalization potential. The produced power profile of a crystalline-silicon PV module yielded $R^2 = 92.4\%$ for a single-step-ahead prediction. However, the accuracy drops as the prediction time window becomes wider (three steps ahead). In [22], an ensemble of ANNs has been proposed to conduct short-term solar forecasting using day-ahead weather forecasts. Despite outperforming several benchmarks, the proposed model cannot capture fast variations of the weather conditions. A high-precision Convolutional Neural Network (CNN)-based PPF model named PVPNet has been proposed to predict the PV generation for one day ahead [52]. This deep learning model has been found highly sensitive to representation learning and to the quality of the data [25]. In [53], an LSTM-based attention mechanism has been proposed for STPF. Nevertheless, the prediction system is limited to a single-step forecasting strategy. A Random Forest solar power forecast based on classification optimization was presented and analyzed for PPF [54]. Despite the high system complexity, the proposed model makes it possible to forecast solar irradiance on a 24-h basis with high precision.

Problem Statement and Contributions
For ML techniques, the forecasting accuracy is essentially related to three factors, namely bias, variance, and noise. Inappropriate tuning of these factors leads to overfitting or underfitting. A comprehensive adjustment of these factors is a core solution for improving prediction results. The bias describes the mismatch between the measurements and the forecasts produced during the learning process, whereas the variance quantifies the squared deviation of a random feature from its mean [55]. Erroneous predictions are due to high variance or high bias. Mathematically, let $y$ be an output generated by a function $f$ of an input vector $x$, and let $\hat{f}(x)$ be a forecast of $f(x)$. Then, the error $\mathrm{Err}(x)$ is given by [56]:
$$\mathrm{Err}(x) = E\big[(y - \hat{f}(x))^2\big]$$
where the output $y$ can be calculated as follows [56]:
$$y = f(x) + \varepsilon$$
where $\varepsilon$ denotes the error term. Thus, the prediction error is given by:
$$\mathrm{Err}(x) = \big(E[\hat{f}(x)] - f(x)\big)^2 + E\big[(\hat{f}(x) - E[\hat{f}(x)])^2\big] + \sigma_\varepsilon^2$$
where the first term is the squared bias, the second term is the variance, and $\sigma_\varepsilon^2$ denotes the irreducible error. The goal is to minimize both the variance and the bias at the same time to reduce the errors. However, bias and variance trade off against each other, as presented in Figure 2: reducing one tends to increase the other. Therefore, an optimization task is required to obtain the desired outputs. To avoid high variance, the RF ensemble is employed in this paper. This variance reduction is achieved using randomized replication of the original dataset to construct sub-models. The prediction output is obtained by averaging these models' outputs. Although RF reduces the variance of a single predictor, the system remains biased: the bias of the original model before the subdivision stays unchanged [57]. In this study, the bias optimization is carried out using the interpretability of Feature Importance (FI).
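As a rough numerical illustration of why averaging reduces variance but leaves the bias untouched, the following sketch simulates hypothetical base predictors that share a fixed bias and carry independent noise. The constants `TRUE_VALUE`, `BIAS`, and `NOISE_STD` and the independence assumption are illustrative only and are not taken from the paper; the trees of a real RF are correlated, so the variance reduction is smaller in practice.

```python
import random
import statistics

# Hypothetical setup: every base predictor returns f(x) + BIAS + noise.
TRUE_VALUE = 10.0   # f(x) at one fixed query point (assumed)
BIAS = 0.5          # systematic offset shared by all base predictors (assumed)
NOISE_STD = 1.0     # spread of a single base predictor (assumed)


def single_prediction(rng):
    """One base predictor: biased and noisy."""
    return TRUE_VALUE + BIAS + rng.gauss(0.0, NOISE_STD)


def ensemble_prediction(rng, n_models=25):
    """Average of n_models independent base predictors (bagging-style)."""
    return statistics.fmean(single_prediction(rng) for _ in range(n_models))


def bias_and_variance(predict, rng, n_trials=2000):
    """Empirical bias and variance of a predictor over repeated draws."""
    preds = [predict(rng) for _ in range(n_trials)]
    bias = statistics.fmean(preds) - TRUE_VALUE
    variance = statistics.pvariance(preds)
    return bias, variance


rng = random.Random(0)
bias_single, var_single = bias_and_variance(single_prediction, rng)
bias_ens, var_ens = bias_and_variance(ensemble_prediction, rng)

print(f"single:   bias={bias_single:.3f}, variance={var_single:.3f}")
print(f"ensemble: bias={bias_ens:.3f}, variance={var_ens:.3f}")
```

In this toy setting the ensemble variance shrinks roughly in proportion to the number of averaged predictors, while both bias estimates stay close to `BIAS`, which is the residual error that the feature-importance weighting is meant to address.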
Recently, FI has been commonly deployed, especially for high-dimensional data. It consists of evaluating the sensitivity of the output to each input. The probability value (p-value) measures the strength of evidence by calculating the probability of an observation when the null hypothesis is correct. The p-value of the distributed variable importance allows the system to determine the features' contribution as indicators of future target behavior. The features with statistical significance are given higher importance from the p-value coefficient, and vice versa. Additional information on how p-values improve model accuracy is given in [58]. The Feature Attribute Coefficient (FAC) deploys this knowledge to optimize the variance/bias of the ANN. Feature Relative Importance (FRI) introduces metric weights to emphasize the significance of variables to the model. The feature weights are combined in a vector named the null importance. In the case of small datasets, instead of setting a threshold on the weight values and removing every feature with a lower weight, the proposed technique takes into account all the features that have a physical interaction with the output. In this paper, a novel Voted Feature Weighting (VFW) is introduced and investigated in depth. This procedure reduces system complexity and computational burden. The weights are fed into an ensemble learning system with the aim of achieving further accuracy. In the proposed model, FI is considered a crucial part of the decision-making in the PPF system, because FI ranking helps avoid the multicollinearity and low accuracy caused by arbitrary variable selection. To the best of the authors' knowledge, the architecture of the proposed machine learning model has not been reported in the literature. The implementation results validate the effectiveness of the proposed model for bias correction.

Proposed Methodology
The importance of an input depends on whether the forecasting performance varies dramatically when that input is replaced with random noise [59]. Thus, the selection of the best features, or the best combination of features, is of utmost importance to the performance of the prediction model. The proposed methodology introduces a preprocessing approach based on the p-value information associated with the RF model. For variable importance quantification, the p-value is designed to measure the feature relevance using the Gini index (impurity). For every input parameter, the p-value is measured according to three feature ranking techniques: Elastic Net, Local Interpretable Model-agnostic Explanations (LIME), and Extreme Gradient Boosting (XGBoost). The Feature Vector Importance (FVI) is unified for each method to fairly grasp the non-linear relationships among candidate attributes and assess interactions to showcase the most effective combination of features. The global FVI is obtained by averaging the FVI outputs of the three methods. Afterward, with every elimination of one feature, the RF model generates an output using the rest of the data.
Then, a voted ensemble technique makes a multiscale prediction for the input vectors and multiplies the probability distribution by the FVI. The forecasting system assumes that the RF acquires $n$ features, and the prediction system is divided into $n$ subsystems. For each subsystem, one feature $k$ is eliminated from the database. With the Bagging model, every subsystem $i$ gives a prediction output $\bar{y}_i$. Let $w \in [0, 1]^d$ be the vector of importance rates, with $w_i$ the importance rate of the $i$th feature. The final output $\hat{y}$ is obtained by summing the subsystem outputs weighted by their importance factors:
$$\hat{y} = \sum_{i=1}^{n} w_i \, \bar{y}_i$$
where the weight values $w_i$ are adjusted using three FRI methods. The benefit of using multiple techniques simultaneously lies in their different architectures. Since selecting the single most suitable FRI method is not straightforward, the three methods are considered together to give more integrity to the domain knowledge. Let $N$ be the number of feature weighting methods. Their outputs can be combined by averaging or by voting. In this study, averaging is the primary case deployed; a voting output is then comprehensively analyzed. Let $w$ be the reweighting feature coefficient. The weighted feature $\tilde{x}_{ij}$ of $x_i$ is computed as:
$$\tilde{x}_{ij} = x_{ij} \, s(w_j), \quad j = 1, \ldots, d$$
where $s$ denotes a positive coefficient, $w$ is the weight vector, and $x_{ij}$ is the $j$th feature of $x_i$. The proposed method allows RF to overcome overfitting through an additional correction vector. By using a distinct feature importance, this paper verifies the contribution of multiple FRI techniques to the model accuracy, as shown in Figure 3.
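The averaging and weighted-sum steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the example scores, the pairing of weight $w_i$ with the subsystem that omits feature $i$, and the choice of the identity for the positive coefficient $s$ are all hypothetical.

```python
def average_fvi(method_weights):
    """Average the per-feature weights of several ranking methods and
    rescale the result so that the weights sum to 1."""
    n_methods = len(method_weights)
    n_features = len(method_weights[0])
    mean_w = [sum(m[j] for m in method_weights) / n_methods
              for j in range(n_features)]
    total = sum(mean_w)
    return [w / total for w in mean_w]


def combine_subsystems(subsystem_preds, weights):
    """Weighted sum of the subsystem forecasts: y_hat = sum_i w_i * y_bar_i."""
    n_samples = len(subsystem_preds[0])
    return [sum(w * preds[t] for w, preds in zip(weights, subsystem_preds))
            for t in range(n_samples)]


def reweight_features(x_row, weights, s=lambda w: w):
    """Feature reweighting x_tilde_ij = x_ij * s(w_j). The identity is
    assumed for s because the text only states that s is positive."""
    return [x * s(w) for x, w in zip(x_row, weights)]


# Hypothetical importance scores for three features from three methods
# (stand-ins for LIME, XGBoost and Elastic Net outputs).
scores = [
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
    [0.4, 0.4, 0.2],
]
fvi = average_fvi(scores)

# Hypothetical forecasts (kW) of three subsystems at two time steps;
# subsystem i is assumed to be the RF trained with feature i removed.
y_bar = [
    [100.0, 120.0],
    [110.0, 130.0],
    [90.0, 100.0],
]
y_hat = combine_subsystems(y_bar, fvi)
print(fvi, y_hat)
```

Because the averaged weights are rescaled to sum to 1, the combined forecast stays within the range spanned by the subsystem forecasts at each time step.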

Case Study
Typically, forecasting techniques applied for smart grid operations are validated through a meteorological database and a real power system to verify the efficiency and feasibility of the proposed model.

PV System Description
In this study, the data used for the numerical validation of the proposed model are obtained from an open-source database of a large-scale PV plant at the Desert Knowledge Australia Solar Center (DKASC), Alice Springs (AS), Australia (latitude 23.76° S, longitude 133.87° E) [60]. AS has a desert climate with scarce rainfall and frequent clear skies during the dry days and, therefore, comparatively rare output volatility in the PV generation due to sky cover during that period. Rainy days are frequently registered between November and February during the wet season, leading to high PV uncertainty. Figure 4 shows the layout of the PV systems in DKASC. This PV plant relies on high-resolution sensors for PV systems of different technologies and configurations to record data every five minutes. The DKASC is a demonstration facility of 38 sites intended to build a high level of confidence in PV technologies from different manufacturers and stakeholders. The detailed characteristics of the PV plant used are summarized in Table 2. The explanatory labels consist of the time indicator, relative humidity, wind speed and direction, horizontal irradiation, relative horizontal irradiation, temperature, and PV power output. These parameters allow the PPF techniques to track every slight change that could affect the PV power generation. The measurements were taken from 1 April 2016 to 1 August 2019, which provides sufficient information for training and validation. A total of 248,503 samples were pre-processed and split into three tiers, namely training, validation, and testing. Regarding the validation process, 17,280 samples are devoted to the analysis of the prediction quality.

Feature Engineering
In real-world problems, several preprocessing steps are needed for better interpretability of the obtained data. The complete steps of the adopted data preprocessing strategy are given in Figure 5. Specifically, the acquired data have been cleaned of outliers, missing values, and redundant data, which requires a considerable effort. For instance, where only a few samples were missing, the values at the same times from the previous or following day were inserted. Larger missing blocks were given an output of zero; these samples were later excluded from the database since they do not contribute to the variability analysis. As a result of the data cleaning process, the wind speed has been removed from the system inputs since it includes many apparently wrong measurements (negative values) and missing values. The resulting data contain electrical features such as the PV power generation, meteorological features such as the wind direction, temperature, and horizontal radiation, and date features. Using the one-hot encoding method, the date features are transformed into numerical values so that all data are time-synchronized in the forecasting system. However, the date features are excluded from the feature inputs since the irradiation features already embody time and seasonal variation tendencies. Finally, the resulting samples are scaled by the Min-Max normalization method to the range [0, 1] to prevent model saturation during the learning process and promote the efficiency of the forecasting system [33]. The original PV power and its related features are shown in Figure 6.
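The gap-filling rule described above (borrow the value at the same time of day from the previous or following day, and discard what cannot be filled) can be sketched as follows. The function names and the use of `None` to mark a missing sample are assumptions made for illustration. The paper assigns zero to larger missing blocks before excluding them, whereas this sketch simply leaves them as `None` and drops them.

```python
def fill_short_gaps(series, samples_per_day=288):
    """Fill each missing sample (None) with the value recorded at the same
    time of day on the previous day or, failing that, the following day.
    Samples with no usable neighbour stay None."""
    filled = list(series)
    for t, value in enumerate(series):
        if value is not None:
            continue
        for neighbour in (t - samples_per_day, t + samples_per_day):
            if 0 <= neighbour < len(series) and series[neighbour] is not None:
                filled[t] = series[neighbour]
                break
    return filled


def drop_unfilled(series):
    """Exclude the samples that could not be filled."""
    return [v for v in series if v is not None]


# Three hypothetical "days" of four samples each (kW); None marks a gap.
raw = [0.0, 5.0, 6.0, 0.0,
       0.0, None, 6.5, 0.0,
       None, 5.5, 6.2, 0.0]
cleaned = fill_short_gaps(raw, samples_per_day=4)
print(cleaned)
```

With the real 5-min data the default `samples_per_day=288` applies; the value of 4 is used here only to keep the example short.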
As can be seen from Figure 6, the related indicators have a direct relation with the PV power output. However, these correlations differ from one input to another. For example, the previous PV power and the horizontal radiation are well suited to PPF, contrary to the wind direction, which varies less with the PV power. Regardless of the weather indicators, the accumulation of PV power records over the years may be considered a reliable basis for future PV power predictions. In the simulation part, one year (2017) of data is used for training prior to the beginning of the yearly test period. Subsequently, the measurements taken during 2018 were used for the evaluation process. Figure 7 illustrates the historical PV power curves for the 1st of August and the 1st of April of four consecutive years and the monthly PV power during three successive years. It can be noticed that the PV power actually generated in 2018 in the spring season (Figure 7a) and the winter season (Figure 7b) is remarkably close to that of the previous year. This behavior leads to the conclusion that the yearly lagged PV output is an important feature for the estimation of the PV power. It can be noticed from Figure 7c that the measured power values during 2017 and 2018 vary closely, while the 2016 values are less correlated with the following years. This difference is noticeable during January, April, May, and June. The historical PV power at the same instant of the previous year is associated with the weather parameters and the hourly time indicator. These inputs go through several processing stages. The first stage is composed of feature engineering and data cleaning, in which the missing and odd data are removed from the dataset. Next, the extracted input samples pass through a feature selection stage to evaluate their importance. The p-value of each input indicates its relevance to the PV power output. For a given temporal resolution of 5 min, a total of 288 samples are collected per day.
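The sample count per day and the construction of the yearly lagged PV power feature can be checked with a short sketch. The helper `add_yearly_lag` and the 365-day lag (which ignores leap years and any gaps removed during cleaning) are assumptions made for illustration.

```python
RESOLUTION_MIN = 5
SAMPLES_PER_DAY = 24 * 60 // RESOLUTION_MIN   # 288 samples per day
LAG_ONE_YEAR = 365 * SAMPLES_PER_DAY          # assumed non-leap year


def add_yearly_lag(power, lag=LAG_ONE_YEAR):
    """Pair each sample with the PV power recorded `lag` samples earlier.
    Returns (lagged_power, current_power) tuples; the first `lag` samples
    have no history and are skipped."""
    return [(power[t - lag], power[t]) for t in range(lag, len(power))]


# Tiny hypothetical series with a lag of 2 samples instead of one year.
pairs = add_yearly_lag([1.0, 2.0, 3.0, 4.0, 5.0], lag=2)
print(SAMPLES_PER_DAY, LAG_ONE_YEAR, pairs)
```

At this resolution, the 17,280 samples reserved for validation correspond to 60 days of measurements (17,280 / 288).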

Feature Vector Construction
The weather dataset consists of the sensor measurements, including the temperature (°C), relative humidity (%), wind direction (°), PV power (kW), and horizontal and vertical solar radiation (W/m$^2$). It may be intuitively understood that the chosen features are relevant to PV generation. The feature selection methods shape the FVI. Figure 8 presents the simulation results of the feature importance rankings, with LIME shown in yellow, Elastic Net in grey, and XGBoost in red. The importance is not uniformly distributed across the methods, as shown in Figure 8. For instance, XGBoost considers only the horizontal radiation as an informative feature, while the Elastic Net does not attribute a high p-value to the horizontal radiation. In contrast, LIME gives the most physically comprehensive results: the yearly lagged PV power and the horizontal radiation are the features most correlated with the current PV power. Each feature is given a relevant weight value $w_i$ with $\sum_{i=1}^{n} w_i = 1$. The weight calculation takes into consideration the permutation of the inputs and the percentage of the error caused by the exclusion of the corresponding feature. The higher the p-value, the closer the behavior of a feature input is to the predicted output.
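To make the idea of permutation-based weights that sum to 1 concrete, the sketch below uses a generic permutation importance as a stand-in. It is not the LIME, XGBoost, or Elastic Net ranking used in the paper; `toy_model`, the synthetic data, and the use of the MAE increase as the importance score are all assumptions made for illustration.

```python
import random


def mean_abs_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_weights(predict, X, y, seed=0):
    """Weight each feature by the MAE increase observed when its column is
    shuffled, then normalise the weights so that they sum to 1."""
    rng = random.Random(seed)
    base_error = mean_abs_error(y, [predict(row) for row in X])
    increases = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        error = mean_abs_error(y, [predict(row) for row in X_perm])
        increases.append(max(error - base_error, 0.0))
    total = sum(increases)
    if total == 0:
        # No feature matters: fall back to uniform weights.
        return [1.0 / len(increases)] * len(increases)
    return [inc / total for inc in increases]


# Hypothetical stand-in model: output driven mostly by column 0
# (e.g., irradiance), weakly by column 1, and not at all by column 2.
def toy_model(row):
    return 0.9 * row[0] + 0.1 * row[1]


rng = random.Random(1)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]
y = [toy_model(row) for row in X]
weights = permutation_weights(toy_model, X, y)
print(weights)
```

The feature whose shuffling degrades the forecast most receives the largest weight, and a feature the model ignores receives a weight of zero.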
This diversity contributes to the system accuracy through the FVI coefficients. The horizontal irradiation gains the most importance, followed by the previous PV power at the same instant of the neighboring year. Next, the wind direction comes third, followed by the diffuse horizontal irradiation. Finally, the relative humidity comes fifth, and the temperature takes the last position, carrying the least relevant information according to the accumulated p-values of the three methods.

Simulation Results and Comparison with Benchmark Models
The proposed paradigm passes through four stages, specifically data processing and feature engineering, object determination, model construction, and evaluation, as shown in Figure 9.
The data are normalized using Min-Max normalization in the data preprocessing stage. The Min-Max normalization is defined as follows:
$$x_n = \frac{x_r - x_{\min}}{x_{\max} - x_{\min}} \quad (7)$$
where $x_n$ denotes the normalized weather variable, $x_r$ is the real value, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values. The hybrid model employs a Randomized Search tool for hyperparameter optimization. This tool assigns to the modified RF a minimum of 20 samples per leaf, a maximum of 100 leaf nodes, and a maximum depth of 8.
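The Min-Max normalization of Equation (7) can be written as a small helper. The function name, the example readings, and the handling of a constant column (where the formula is undefined) are assumptions made for illustration.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] with x_n = (x_r - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Constant column: the formula is undefined, so map everything to 0.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


temperature_c = [12.0, 18.0, 24.0, 36.0]   # hypothetical readings
normalized = min_max_normalize(temperature_c)
print(normalized)   # [0.0, 0.25, 0.5, 1.0]
```

A common practice, not stated in the paper, is to compute `x_min` and `x_max` on the training data only and reuse them for the validation and testing data.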
For the reference models, Table 3 lists the hyperparameters of the benchmarks.
The trained model is verified on a testing dataset.All experimental models run in Python 3.6.7 programming environment.The hardware is a Lenovo personal computer (PC) with Intel Core i7 9th Generation and 16 GB of memory.The Windows 10 operates a graphic card of NVIDIA GeForce GTX 1650.Score metrics between the actual power y i and the forecast points ŷi were computed in terms of coefficient of determination (R 2 ), RMSE and Mean Absolute Error (MAE) as follows [61]: where n denotes the total number of samples.The simulation results are compared with those results of the KNN, RF, and DT models.The forecasting horizon is investigated for the short-term, specifically, for 5-min daily time interval during the year of 2018.The testing data are evaluated using RMSE and MAE, and R 2 score metrics.Figure 10 presents the PV power variations in four types of days arbitrary selected, including rainy, sunny to cloudy, sunny, foggy to cloudy days.Regarding Figure 10, it can be noticed that the proposed model exhibits satisfactory forecasting performance according to the forecasting curves between the ground truth and the forecasted PV power.From Figures 10b,c,e, the the actual PV power ramps smoothly during the sunny day.With no abrupt change, the proposed model efficiently provides precise estimations of the PV power output.Even with sudden changes, especially in the middle of the day, it can be noticed from Figure 10a,d,f that the proposed hybrid machine learning model was able to follow the curve shape of the PV power generation during the rainy, foggy to cloudy weather from the close distance between the forecasted and real points.The forecast ability of the proposed method seems to be very promising in all the seasons of the year with different climatic conditions.To showcase the prediction performance of the proposed model in a more intuitive way, Figure 11 presents the scatter plot and error distributions of the proposed model.Regarding Figure 11b, it is apparent that 
the forecasted points are consistent with the actual values. In Figure 11b, the majority of the instances are concentrated around zero error. It should be emphasized that the error of the proposed model ranges between −10 kW and 40 kW in the worst-case scenario.

To better examine the model performance, Figure 12 illustrates a 10-fold Cross-validation (10-CV) curve. According to Figure 12, the model achieves a coefficient of determination $R^2$ = 96%, which reflects the high potential of the proposed approach in improving the existing RF. The simulation results confirm that the bias correction contributes significantly to the prediction accuracy. The proposed technique is effective in diminishing the errors coming from misleading inputs. Therefore, generalizing the proposed architecture to improve other ML models is worth further investigation.

Assessing the model performance requires different methodologies and a comparative analysis against other ML models. Table 4 includes the MAE and RMSE scores of the ML models. According to Table 4, the proposed method is highly effective, as shown by the low error values registered. The proposed method outperforms the other listed models, achieving a mean RMSE = 8.36 kW and a mean MAE = 5.21 kW. The high accuracy achieved takes advantage of the bias correction of the RF model. In contrast, the original RF and DT produce larger MAE and RMSE values than the proposed model, resulting in poorer performance. In particular, RF generates an RMSE = 14.37 kW and an MAE = 9.24 kW. The superiority of the proposed model is provided by the p-value adjustment based on multiple feature selection methods. Thus, the model ensures the correct repartition of categorical feature inputs instead of selecting a particular threshold and extracting the features corresponding to higher p-values from a single assessment method. Despite the heavy computation that the p-value calculations entail, the proposed model performs well in PV power series forecasting.
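The three score metrics used in this section ($R^2$, RMSE, and MAE) can be sketched in a few lines of plain Python using their standard definitions. The sample values below are hypothetical and only illustrate the calculation; they are not taken from the DKASC data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between actual and forecasted values."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error between actual and forecasted values."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_true) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical PV power samples in kW (illustrative only).
actual = [0.0, 10.0, 20.0, 30.0]
forecast = [1.0, 9.0, 22.0, 28.0]

print(f"RMSE = {rmse(actual, forecast):.3f} kW")
print(f"MAE  = {mae(actual, forecast):.3f} kW")
print(f"R^2  = {r2(actual, forecast):.3f}")
```

RMSE penalizes large deviations more heavily than MAE, which is why the two are usually reported together when sudden weather changes can produce occasional large errors.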

Discussion and Analysis
As the weather indicators are not equally relevant to the PV power generation, the intuition of associating an importance vector that emphasizes the relevance of each input variable seems promising for the overall accuracy of the forecasting system. From the above-mentioned results (Section 5.4), it can be said that the proposed model shows excellent predictive performance under different meteorological conditions, which supports the generalization of the proposed model. More specifically, the strong competitive advantage of the proposed model is evident during the rainy and cloudy days, since the prediction results are very close to the real values. Therefore, the proposed site-specific hybrid model can be applied to similar PV power systems with different climatic conditions and different locations. Although the PV power output is sensitive to chaotic meteorological conditions, the proposed model has the potential to capture the trend of the PV power generation even under dramatic variability. Compared to the original RF model, the proposed method significantly improves the forecast accuracy by giving more importance to feature relevance. In fact, it generates forecasting results with the lowest RMSE and MAE on most types of day. The results further reveal the robustness of the proposed method. The superior accuracy of the proposed model is primarily due to attributing to each feature a coefficient that describes its importance to the PV power output, which provides an effective means to approximate inherent invariant features and structures.
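The idea of a voted importance vector can be illustrated with a minimal sketch. The paper's exact voting rule (described in terms of p-value adjustment) is not reproduced here; averaging the normalized absolute scores of the three methods is an assumption made for illustration, and the feature names, score values, and function names are hypothetical.

```python
def normalize(scores):
    """Scale absolute importance scores so that they sum to 1."""
    total = sum(abs(v) for v in scores.values())
    return {name: abs(v) / total for name, v in scores.items()}


def voted_weights(*score_sets):
    """Average the normalized importances over all selection methods."""
    normalized = [normalize(s) for s in score_sets]
    features = normalized[0].keys()
    return {f: sum(n[f] for n in normalized) / len(normalized) for f in features}


# Hypothetical importance scores (NOT values reported in the paper).
lime = {"radiation": 0.6, "temperature": 0.3, "wind_direction": 0.1}
elastic_net = {"radiation": 1.5, "temperature": -0.5, "wind_direction": 0.0}
xgboost = {"radiation": 0.7, "temperature": 0.2, "wind_direction": 0.1}

weights = voted_weights(lime, elastic_net, xgboost)
for feature, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {weight:.3f}")
```

Normalizing each method's scores before averaging keeps any single method from dominating the vote because of its scale, for example Elastic Net coefficients versus tree-based gain.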

Conclusions
For a reliable and secure operation of power systems, this paper explores the problem of predicting PV power generation in order to efficiently manage the capacity of intermittent asynchronous PV generators. To address this challenge, an Enhanced Random Forest (ERF) model was proposed to increase forecasting accuracy based on an adequate understanding of the unequal influence of the input indicators on the PV power. To distinguish the relevance of the variables, a feature-importance vector has been constructed based on three methods, specifically Elastic Net, Local Interpretable Model-agnostic Explanations (LIME), and Extreme Gradient Boosting (XGBoost). A multivariate dataset from the Desert Knowledge Australia Solar Center (DKASC) has been employed to validate the efficiency of the proposed method. The numerical performance investigation on sunny, rainy, and cloudy days demonstrates that the proposed model is effective, simple, explainable, and more accurate than the benchmark models, with an overall RMSE = 8.36 kW and MAE = 5.21 kW. The proposed model is well suited to short-term PV power forecasting needs. Although the proposed model seems to be suitable for PV power systems, there are many areas that can be improved and optimized, such as a deeper investigation of the weather patterns to substantially improve forecasting performance and strengthen the stability of the proposed approach under different climatic conditions.

Figure 1. General taxonomy of PV power forecasting.

Figure 3. Flow diagram for proposed model implementation.

Figure 5. Flowchart of the adopted data preprocessing strategy.

Figure 6. PV power time series with the related feature inputs: (a) diffuse horizontal radiation, (b) horizontal radiation, (c) relative humidity, (d) previous PV power, (e) temperature, (f) wind direction.

Figure 7. PV power variation (a) on April 1st in the past 4 years, (b) on August 1st in the past 4 years, (c) between 2016 and 2018.

Figure 8. Relative importance of the candidate variables, using LIME, Elastic Net, and XGBoost.

Figure 9. Structure of the proposed methodology with (a) data preprocessing and feature engineering, (b) object determination, (c) model construction, (d) evaluation.

Table 3. Hyperparameter settings for reference models.

| Base Models | Hyperparameter Settings |
| DT | maximum depth = 3; minimum samples leaf = 3; maximum leaf nodes = 5; minimum impurity decrease = 0.2 |
| KNN | algorithm = KDTree; number of nearest neighbors = 7; leaf size = 90; distance function = Minkowski distance |
| RF | maximum depth = 50; minimum samples split = 10; number of estimators = 140 |

Figure 10. (a) Forecasted PV power on a rainy day. (b) Forecasted PV power on a sunny day. (c) Forecasted PV power on a sunny day. (d) Forecasted PV power on a sunny to cloudy day. (e) Forecasted PV power on a sunny day. (f) Forecasted PV power on a foggy to cloudy day.

Figure 11. Scatter plots (a) and error distributions (b) of PV measured power and forecasted power.

Table 1. A summary of the literature studies.

Table 2. The related characteristics of the PV plant.

Table 4. Forecast error metrics of the simulated predictors for various weather conditions.

Table 4 compares the different predictors in terms of RMSE and MAE errors under different weather conditions.