Microgrid-Level Energy Management Approach Based on Short-Term Forecasting of Wind Speed and Solar Irradiance

Background: Distributed Energy Resources (DERs) help reduce the electricity bills of end customers in a smart community by enabling them to generate electricity for their own use. In the past, various studies have shown that owing to a lack of awareness and connectivity, end customers cannot fully exploit the benefits of DERs. However, with the tremendous progress in communication technologies, the Internet of Things (IoT), Big Data (BD), machine learning, and deep learning, the potential benefits of DERs can be fully achieved, although a significant issue in forecasting the generated renewable energy is the intermittent nature of these energy resources. Machine learning and deep learning models can be trained using BD gathered over a long period of time to solve this problem. The trained models can be used to predict the energy generated through green energy resources by accurately forecasting the wind speed and solar irradiance. Methods: We propose an efficient approach for microgrid-level energy management in a smart community based on the integration of DERs and the forecasting of wind speed and solar irradiance using a deep learning model. A smart community that consists of several smart homes and a microgrid is considered. In addition to the possibility of obtaining energy from the main grid, the microgrid is equipped with DERs in the form of wind turbines and photovoltaic (PV) cells. In this work, we consider several machine learning models as well as persistence and smart persistence models for short-term forecasting of the wind speed and solar irradiance. We then choose the best model as a baseline and compare its performance with our proposed multiheaded convolutional neural network model.
Results: Using the data of San Francisco, New York, and Las Vegas from the National Solar Radiation Database (NSRDB) of the National Renewable Energy Laboratory (NREL) as a case study, the results show that our proposed model performed significantly better than the baseline model in forecasting the wind speed and solar irradiance. For the wind speed prediction, we obtained 44.94%, 46.12%, and 2.25% error reductions in root mean square error (RMSE), mean absolute error (MAE), and symmetric mean absolute percentage error (sMAPE), respectively. In the case of solar irradiance prediction, we obtained 7.68%, 54.29%, and 0.14% error reductions in RMSE, mean bias error (MBE), and sMAPE, respectively. We evaluate the effectiveness of the proposed model on different time horizons and different climates. The results indicate that for the wind speed forecast, different climates do not have a significant impact on the performance of the proposed model. However, for the solar irradiance forecast, we obtained different error reductions for different climates. This discrepancy is certainly due to the cloud formation processes, which are very different for different sites with different climates. Moreover, a detailed analysis of the generation estimation and electricity bill reduction indicates that the proposed framework will help the smart community to achieve an annual reduction of up to 38% in electricity bills by integrating DERs into the microgrid. Conclusions: The simulation results indicate that our proposed framework is appropriate for approximating the energy generated through DERs and for reducing the electricity bills of a smart community. The proposed framework is suitable not only for different time horizons (up to 4 h ahead) but also for different climates. Energies 2019, 12, 1487; doi:10.3390/en12081487


Introduction
The ongoing depletion of fossil fuels, the changing weather, and ecological pollution are some reasons for incorporating DERs into existing power systems. Many advanced countries in the world have directives for energy-providing companies to escalate their energy production from renewable energy sources. In this regard, the government of California established its Renewable Portfolio Standard (RPS) program. In this program, the government signed a bill with utilities to increase the renewable energy production from 20% in 2010 to 33% in 2020 [1].
Table 1 below shows the list of abbreviations used in this paper. The energy generation from the DERs is intermittent in nature as it is dependent on naturally varying climate factors, such as wind speed, solar irradiance, and air temperature [2]. These atmospheric variations result in significant changes in the energy generated through DERs, which in turn leads to uncertainty. Consequently, precise and accurate prediction models are crucial for forecasting the generated energy through DERs. These models will be helpful in forecasting the generated energy through DERs, which will be available to the microgrids of smart communities. This will not only help in fulfilling energy requirements but also assist in decreasing energy costs and ensuring the adequate comfort of users in the smart community.
Owing to the intermittent nature of renewable energy resources, the development of a precise and accurate model has become an important factor for increasing the dissemination of DERs in existing power systems. Accurate forecasting of the energy generated through DERs not only helps in the incorporation of renewable energy into power systems but also guarantees good trading performance of renewable energy in the global market [3]. Nevertheless, the forecasting accuracy is heavily dependent on the atmospheric circumstances of the geographical location [4]. Thus, it becomes even more challenging.
There are two main classes of prediction models for forecasting the wind speed and solar irradiance: physical (numerical) models and machine learning (data-driven) models. The main purpose of these models is to forecast wind speed and solar irradiance for a specific location at a selected future time frame. The data-driven models are largely founded on time-series analyses [5]. Their computational complexity is lower than that of the physical models, and they are suitable for short-term prediction. On the other hand, the physical models are based on mathematical equations for relating the dynamics and the physics of the atmosphere, which influences radiation from the sun [1,3,6]. Physical models are usually used for long-term and medium-term forecasting. Consequently, in this work, we selected data-driven models for short-term forecasting owing to their lower complexity and good prediction accuracy.
In the past, traditional statistical approaches have been extensively explored for time-series analyses. Recently, machine learning and deep learning approaches have gained much attention from the research community. Artificial Neural Networks (ANNs) possess exceptional nonlinear mapping and robust generality abilities; thus, these networks can be applied to wind and solar energy forecasting [7]. However, ANN-based models easily fall into local minima and show poor generalization. Moreover, they are well known to over-fit and they have slow convergence rates [8]. In the literature, several other models have been applied, including the Extreme Learning Machine Neural Network (ELMNN) [9], Generalized Regression Neural Network (GRNN) [10], and Support Vector Machine (SVM) [11]. The performance of ELMNN heavily depends on the activation function. If the activation function is not selected appropriately, generalization degrades [12]. Moreover, ELMNN is not suitable for applications that require deep extraction of features as it cannot encode more than one layer of abstraction. The main disadvantages of GRNN models are their size and their long computation time [13]. The SVM algorithms have some limitations, such as the optimal choice of kernel, the computational complexity of the model, and a large memory requirement [14].
Recently, researchers who apply machine learning models as a core forecasting model have advanced their research with other methods, including weather categorizations [15], parameter or feature selection [16–18], and decorrelation [19]. Some other researchers used hybrid models to enhance the prediction precision. However, the long training time caused by the increased computational complexity of such models is an issue that needs consideration from the researchers. In a previous study [20], the authors explored an approach for predicting one-day-ahead PV power using neural networks and time-series analysis. The authors in another study [21] implemented and evaluated an optimized prediction model that was based on ANNs and a genetic algorithm (GA). The authors in yet another study [22] explored four architectures (Adaptive Neuro Fuzzy Inference System (ANFIS), Multilayer Perceptron (MLP), GRNN, and Nonlinear Autoregressive Recurrent Exogenous Neural Network (NARX)) for enhancing the prediction precision. They proposed hybrid wavelet-ANN models for solar forecasting at a specific site. However, the model was not tested at different geographical locations to assess the wider potential. Moreover, a very limited set of features was used to train the model.
Currently, the convolutional neural network (CNN)-based model is one of the most successful models in deep learning and has been broadly adopted in different applications, including image recognition and classification, object detection, and tracking. However, CNN models have not been extensively explored in time-series analysis. The rapid progress in the computational power of hardware in the last decade has enabled CNN-based models to deeply penetrate various fields. The authors in a previous study [23] proposed a CNN model for interpreting weather data by considering the temporal and spatial associations between the independent parameters for producing local forecasts. They compared the performances of various architectures and stated that the purpose of their exploration was to show that CNN-based models can learn certain patterns of meteorological parameters and relate them to rainfall events. The authors in a previous study [24] also applied a CNN-based model for precipitation prediction.
The authors in a previous study [25] developed a hybrid model by combining long short-term memory (LSTM) and CNN models for the prediction of extreme rainfall. The weather parameter was applied as an input to the CNN model, and the outputs of the CNN model were presented as inputs to the LSTM model. In this developed model, the researchers considered the LSTM and CNN models as independent steps. Atmospheric variables, including pressure and temperature, were used as input data. The authors in a previous study [26] developed a framework for the accurate forecasting of short-term wind speed. Their framework was based on hybrid nonlinear/linear models and empirical mode decomposition (EMD). They applied EMD to decompose the wind speed data into residuals and intrinsic mode functions (IMFs). They studied different linear and nonlinear models, including CNN, to analyze the residuals and the IMFs. Among all the hybrid models, EMD-ARIMA-RF performed well for 10-min-ahead forecasting. However, none of the hybrid models performed well for 1-h-ahead forecasting.
In the literature, an approach called benchmarking is mostly used for comparison with a newly developed algorithm [8,27]. The best existing machine learning techniques are evaluated to select a baseline model. The selected baseline model is then used to assess the performance of the newly developed technique. In a previous study [28], the authors compared their proposed model for short-term wind speed forecasting with commonly used machine learning algorithms, such as SVM, random forest (RF), and decision tree (DT). We have selected well-known machine learning models, including k-nearest neighbors (KNN), gradient boosting, extra tree regressor, and random forest regressor, for short-term forecasting of wind speed. The best model among them is selected as a baseline model for comparing and evaluating the performance of the proposed model.
Each of the selected machine learning models has its own limitations. For example, the KNN algorithm is very sensitive to outliers, as it chooses neighbors based on a distance criterion. Moreover, it is computationally expensive when the dataset is very large. Gradient boosting models are prone to overfitting if the data is noisy. Also, they are harder to tune than other models. One of the weaknesses of random forest and extra tree models when used for regression problems is that the model cannot predict beyond the range of the training data.
Various studies confirm that physical models, such as numerical weather prediction (NWP), are best suited for forecast horizons from more than 4 h to several days [28–35]. These techniques are weak at handling smaller-scale phenomena and are not suitable for short-term forecast horizons [29]. Machine learning methods give the best results for forecast horizons of up to 6 h [30]. The choice of model depends on the forecast horizon. The NWP models generally outperform machine learning models over longer horizons. However, for short-term horizons, time-series models perform better [34]. At the intra-hour forecast horizon, NWP is extremely expensive and not practical, especially for the renewable energy sector [36].
In a previous study [37], the authors used a smart persistence model as a baseline for the deterministic forecast. In fact, the California Independent System Operator (CAISO) uses the persistence method in its renewable energy forecasting and dispatching [38]. This method is highly effective in short-term prediction, i.e., 1 h ahead. It is often used as a reference for comparison with more advanced methods [39]. In the irradiance forecasting community, numerous works have been devoted recently to the development of models that generate deterministic or point forecasts [34,40–45]. In this work, as we are dealing with short-term forecasting of solar irradiance, we have considered the persistence model and smart persistence model for comparison purposes.
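The two reference models mentioned above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the paper: the function names are hypothetical, plain persistence copies the last observation forward, and smart persistence is written in one common formulation that holds the clear-sky index (measured irradiance divided by clear-sky irradiance) constant over the horizon. The clear-sky series is assumed to be supplied by an external clear-sky model.

```python
def persistence_forecast(series, horizon):
    """Plain persistence: the forecast for step t + horizon is the value at step t."""
    # The first `horizon` targets have no earlier observation to copy from.
    return [None] * horizon + list(series[:len(series) - horizon])


def smart_persistence_forecast(ghi, clear_sky_ghi, horizon, eps=1e-6):
    """Smart persistence (assumed formulation): persist the clear-sky index.

    ghi           -- measured irradiance per time step
    clear_sky_ghi -- clear-sky irradiance per time step (from any clear-sky model)
    """
    forecasts = [None] * horizon
    for t in range(len(ghi) - horizon):
        if clear_sky_ghi[t] > eps:
            k = ghi[t] / clear_sky_ghi[t]  # clear-sky index at time t
        else:
            k = 0.0  # night-time: no usable clear-sky index
        forecasts.append(k * clear_sky_ghi[t + horizon])
    return forecasts


wind = [3.0, 4.0, 5.0, 4.5]
wind_forecast = persistence_forecast(wind, horizon=1)

ghi = [100.0, 200.0, 300.0]
clear = [200.0, 400.0, 400.0]
ghi_forecast = smart_persistence_forecast(ghi, clear, horizon=1)
```

Smart persistence differs from plain persistence only in that the deterministic daily course of the sun is removed before the last observation is carried forward, which is why it is a stronger reference for irradiance than for wind speed.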
We studied the trends in the adoption of renewable energy resources in various states of the United States of America (USA). We found that California has made effective policies for the integration of renewable energy resources. In California in 2017, 32% of the electricity was acquired from renewable energy sources, due to which it seems to be well on track to meet its renewable energy targets of 33% and 50% for 2020 and 2030, respectively [46]. Based on the planned effective and concrete policies of the government of California, we have selected San Francisco from the NSRDB of the NREL as a case study in our analysis. The NSRDB uses a physics-based modeling approach, in which the solar radiation data for the entire United States is gridded into segments of 4 km × 4 km using geostationary satellites. The temporal resolution of the data is 30 min [47]. The NSRDB's physics-based, gridded data collection approach is called the Physical Solar Model (PSM). More details about the PSM can be found in a previous study [48].
This paper proposes a multiheaded convolutional neural network (MH-CNN) model for the short-term forecasting of solar irradiance and wind speed to approximate the energy generated through solar panels and wind turbines, respectively. We consider several machine learning models, as well as persistence and smart persistence models, for forecasting the short-term solar irradiance and wind speed. We then choose the best model as a baseline and compare its performance with our proposed MH-CNN model. The comparison is based on evaluation metrics, including the root mean square error (RMSE), mean absolute error (MAE), and symmetric mean absolute percentage error (sMAPE). Using San Francisco data from the NSRDB of the NREL as a case study, the results show that our proposed model outperforms all other models in forecasting the wind speed and solar irradiance. The obtained results indicate that our proposed framework for microgrid-level energy management is appropriate for approximating the renewable energy and for reducing the electricity bills of a smart community. The main contributions of our work are as follows:

• We formulated a solar irradiance and wind speed prediction problem for approximating the generated energy through solar panels and wind turbines.
• We evaluated the performance of various machine learning models, as well as persistence and smart persistence models, for selecting a baseline model.
• We proposed an MH-CNN model for the short-term forecasting of solar irradiance and wind speed.
• We evaluated the effectiveness of the proposed model on different time horizons (up to 4 h).
• We evaluated the effectiveness of the proposed model on different climates.
• We proposed a framework for microgrid-level energy management for reducing the electricity bills of a smart community.
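For reference, the three evaluation metrics named in this section (RMSE, MAE, and sMAPE) can be sketched as follows. The function names are illustrative, and the sMAPE variant shown, which divides by the mean of the absolute actual and predicted values and reports a percentage, is an assumption: several sMAPE definitions are in use, and this section does not state which one the study adopts.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (assumed variant)."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        # Treat a 0/0 term (both values zero, e.g. night-time irradiance) as zero error.
        terms.append(0.0 if denom == 0.0 else abs(p - a) / denom)
    return 100.0 * sum(terms) / len(terms)


actual = [2.0, 4.0]
predicted = [4.0, 4.0]
scores = (rmse(actual, predicted), mae(actual, predicted), smape(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, while sMAPE is scale-free, which makes it easier to compare wind speed and irradiance results.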
The remainder of the paper is organized as follows. The related work from the literature is reviewed and presented in Section 2. The proposed framework is elaborated in Section 3. A performance evaluation of the various considered models is provided in Section 4. Results and discussions are provided in Section 5. The contributions of the paper are discussed in Section 6.

Related Work
Researchers from academia and industry have explored methods and technologies for tackling the problems of global energy crises. In this part of the manuscript, we present recent research work on global energy crises and explored solutions. The future Smart Grid (SG) will be composed of the latest technologies and will significantly improve the existing power grids. The possibility of two-way information flow and interoperability between smart homes provides a chance to optimize the power consumption of the end users and simultaneously improve the operation of the SG [49–52]. The increasing diffusion of renewable energy in power systems has given rise to the concept of microgrids, which will probably play a substantial role in the development of SGs [53,54]. It is anticipated that the network of microgrids will result in the formation of an SG [54]. The microgrid is composed of DERs, power loads, and Energy Storage Systems (ESS) [55,56].
Typically, DERs, such as wind turbines and solar panels, are among the useful energy resources for solving energy shortfalls. These resources also help in decreasing the effects of carbon emissions in the modern world. By incorporating DERs in power systems, consumers will be able to meet their power requirements by generating green energy, which in turn will lead to electricity bill reductions. In the last decade, many researchers focused their efforts on solving the challenges of DERs, such as their integration into the SG, their intermittent nature, and the optimal power flow. One of the important issues with the energy generated through DERs is the intermittent nature of these power-generating sources. Many researchers have dedicated their efforts to mitigating these issues [57,58]. The development of an accurate prediction model for forecasting the wind energy and solar energy is desirable. However, the forecast of the energy generated from DERs is heavily reliant on the accuracy of the weather prediction model.
The accuracy of the weather prediction model is reliant on different atmospheric phenomena, such as pressure, temperature, wind speed, and humidity. The enormously random variations of weather conditions lead to difficulties in the accuracy of the prediction [58]. Fortunately, different parameters of the weather can be predicted with significant accuracy by developing any of the latest models, including the ANN, Deep Neural Network (DNN), and LSTM [59–61].
The authors investigated the integration of DERs in power systems in a previous study [62]. They suggested dealing with the uncertainty of DERs by virtualization, and validated their method by performing real-time experiments. The authors in another study [63] investigated a prediction model for approximating the quantity of solar energy generation. Their prediction model was composed of a wavelet transform and a neural network. They used RMSE and MAE to evaluate their developed model. A comparison of their obtained results with existing promising results proved that their developed model achieved good performance.
Recently, in a previous study [64], we proposed a short-term load prediction technique based on support vector quantile regression. In this study, we compared three kernel functions: Gaussian kernel, linear kernel, and polynomial kernel. The predicted precision of the power load was approximated using data sets from Singapore. We achieved better results compared to those from Support Vector Regression and the Firefly Algorithm. Power systems in today's world are being transformed into distributed energy resources. The integration of DERs in existing power systems leads to energy management problems because these energy resources produce power in nondeterministic manners. The well-known "duck curve" problem arises in the off-peak hours because of the overgeneration from DERs that causes generator units to be underloaded [65]. The underloading of a generator impacts the individual components of a power system and the overall system performance because of the mismatch between generation and demand.
Recently, in a previous study [66], we considered the radial structure of a distribution grid and applied commonly used configuration topology for the integration of DERs and ESSs in power systems. Furthermore, we addressed a multilevel Multi Agent System (MAS) optimization framework for the co-scheduling of demand and supply resources. The MAS structure permits Plug-and-Play (PnP) capabilities and flexible control of DERs for load balancing. During both off-peak and peak hours, the PnP algorithm deactivates or activates the ESS to rectify demand and supply mismatches. The ESS stocks the excess energy from DERs and uses it to meet the energy demand at a later time. Our main objective has been to reduce electricity bills without compromising user comfort during peak hours. Our simulation results proved that our developed MAS helped in balancing the load while maintaining adequate user comfort.
In the current work, our aim is to develop an MH-CNN model for the short-term forecasting of wind speed and solar irradiance. The forecasted wind speed and solar irradiance will be used for approximating the generated power through wind turbines and solar panels. We performed extensive simulations to prove the improved performance of the proposed strategy. Moreover, we proposed a framework for microgrid-level energy management for reducing the electricity bills of the smart community.

Proposed System Model
In this section, the proposed system model is explained. It is always beneficial to reduce the electricity bill of the users without affecting their comfort. Integrating DERs in the power system helps to reduce electricity bills, increase user comfort, and fulfill energy requirements. Sometimes the total electricity generated by DERs during off-peak hours exceeds the demand of the consumer, which results in a generation-demand imbalance. The consequence of a generation-demand imbalance is the basis of the "duck curve" problem [65]. Temporarily, excessive electricity generation from DERs lessens the power load on the grid generators. In this situation, the excess power generated by the DERs may be harmful to the generator and motors. Thus, there is a need to develop efficient machine and deep learning models to accurately predict short-term renewable energy generation. Based on these models, efficient energy management frameworks need to be explored.
The proposed microgrid-level energy management framework is presented in Figure 1, where a smart meter, ESS, and DERs are integrated. As shown in Figure 1, an ESS is integrated in the proposed system to mitigate the influence of the duck curve problem, which we recently targeted in a previous study [66]. The smart meters are used for two-way communication in addition to many other advanced features. The DERs in the form of wind turbines and solar panels are used to generate the renewable energy to ensure the required user comfort and to reduce electricity bills. The ESSs are used for storing the excess generated energy at any time. This excess energy can then be used at a later time. In addition to DERs, the microgrid has access to power from the main grid, as the nature of the DER is intermittent and may produce very low energy on certain days and at certain times.
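The energy flow described above (DER generation first, surplus into the ESS, shortfall covered by the ESS and then the main grid) can be illustrated with a simple per-interval balance. This is only a sketch under strong assumptions that are not taken from the paper: the storage is lossless, there are no charge or discharge rate limits, surplus that does not fit into the ESS is curtailed, and the function name `dispatch_step` and its parameters are hypothetical.

```python
def dispatch_step(generation_kwh, demand_kwh, stored_kwh, capacity_kwh):
    """Balance one time interval of a microgrid with DERs, an ESS, and a grid link.

    Returns (new_stored_kwh, grid_import_kwh, curtailed_kwh).
    """
    surplus = generation_kwh - demand_kwh
    if surplus >= 0:
        # Excess renewable energy charges the ESS; anything beyond capacity is curtailed.
        charge = min(surplus, capacity_kwh - stored_kwh)
        return stored_kwh + charge, 0.0, surplus - charge
    # Deficit: discharge the ESS first, then import the remainder from the main grid.
    deficit = -surplus
    discharge = min(deficit, stored_kwh)
    return stored_kwh - discharge, deficit - discharge, 0.0


# Off-peak interval with overgeneration, followed by a peak interval with a shortfall.
state_after_surplus = dispatch_step(5.0, 3.0, stored_kwh=1.0, capacity_kwh=2.0)
state_after_deficit = dispatch_step(1.0, 4.0, stored_kwh=2.0, capacity_kwh=5.0)
```

Under these assumptions, the electricity bill for an interval is simply the grid import multiplied by the applicable tariff, so any energy shifted through the ESS directly reduces the bill.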

The architecture of the proposed MH-CNN model is shown in Figure 2. We used the same model for the short-term forecasting of both wind speed and solar irradiance. Meteorological parameters, such as temperature, pressure, and wind speed, as well as cyclic parameters, such as season, month, day of the year, and hour over the past day, are passed to both the wind speed and solar irradiance forecasting models. Moreover, we incorporate the past day's lag of wind speed and solar irradiance as lag features in the wind speed and solar irradiance forecasting models, respectively. The data preparation steps are presented in Figure 3.
The same input is passed to three 1D CNNs. Each CNN has the same filter size but different kernel sizes. All three sub-CNN models extract features by looking at the input data from different aspects owing to the different kernel sizes. For our model, we used a Rectified Linear Unit (ReLU) as an activation function, as it does not encounter the vanishing gradient problem and performed best on the case study data. The CNN part consists of two 1D convolution layers. In the second convolution layer, we halved the filter size and doubled the kernel size to reduce the dimensionality and enhance the feature selection domain, respectively. The output of the second convolution layer (after applying the ReLU activation function) for each sub-CNN model is flattened and concatenated as a single feature vector. The feature vector then goes through the fully connected architecture and the ReLU activation function to produce the output, as shown in Figure 2.
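The data flow through the three heads can be traced with a small forward-pass sketch in plain Python. It is not the authors' implementation: the weights are random rather than trained, "filter size" is interpreted as the number of filters, the fully connected part is reduced to a single ReLU output unit, and the concrete sizes (8 filters, kernel sizes 2, 3, and 4, a window of 24 steps with 5 features) are placeholders chosen only for illustration.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def conv1d_relu(seq, weights, biases):
    """Valid 1D convolution over time followed by ReLU.

    seq[t][c]        -- value of channel c at time step t
    weights[f][j][c] -- weight of filter f at kernel offset j for channel c
    """
    kernel = len(weights[0])
    out = []
    for t in range(len(seq) - kernel + 1):
        step = []
        for w, b in zip(weights, biases):
            s = b
            for j in range(kernel):
                for c, x in enumerate(seq[t + j]):
                    s += w[j][c] * x
            step.append(relu(s))
        out.append(step)
    return out


def random_conv(rng, filters, kernel, channels):
    """Untrained placeholder weights and zero biases for one convolution layer."""
    w = [[[rng.uniform(-0.5, 0.5) for _ in range(channels)]
          for _ in range(kernel)] for _ in range(filters)]
    return w, [0.0] * filters


def build_heads(rng, channels, filters=8, kernel_sizes=(2, 3, 4)):
    """One head per kernel size; the second layer halves filters and doubles the kernel."""
    heads = []
    for k in kernel_sizes:
        layer1 = random_conv(rng, filters, k, channels)
        layer2 = random_conv(rng, filters // 2, 2 * k, filters)
        heads.append((layer1, layer2))
    return heads


def head_features(seq, layer1, layer2):
    """Two convolution layers, then flatten the (time, filter) map into a vector."""
    h = conv1d_relu(seq, *layer1)
    h = conv1d_relu(h, *layer2)
    return [v for step in h for v in step]


def concat_features(seq, heads):
    """Feed the same input to every head and concatenate the flattened outputs."""
    features = []
    for layer1, layer2 in heads:
        features.extend(head_features(seq, layer1, layer2))
    return features


def mh_cnn_forward(seq, heads, dense_w, dense_b):
    """Concatenated features -> single dense unit with ReLU (simplified output stage)."""
    features = concat_features(seq, heads)
    return relu(sum(w * x for w, x in zip(dense_w, features)) + dense_b)


rng = random.Random(0)
window = [[rng.random() for _ in range(5)] for _ in range(24)]  # 24 steps, 5 features
heads = build_heads(rng, channels=5)
n_features = len(concat_features(window, heads))
dense_w = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
forecast = mh_cnn_forward(window, heads, dense_w, dense_b=0.0)
```

Because each head uses a different kernel size, the heads produce feature maps of different lengths, which is why they are flattened before concatenation. In practice the same structure would be expressed in a deep learning framework and trained by backpropagation.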
The data preprocessing steps are shown in Figure 3. Initially, the missing values are identified and replaced with the values from the same time on the previous day. If the value from the same time on the previous day is also missing, then the missing value is imputed using the value at the same time on the most recent earlier day with available data. Then, further processing is performed on the clean data in three different ways. The sine and cosine transformations of cyclic parameters, such as hour of day, day of the year, month of the year, season of the year, and wind direction, are determined. We used binary encoding to encode the categorical feature named "cloud type". This feature was obtained by the NREL from the pathfinder atmospheres extended (PATMOS-X) model. Meteorological parameters, including temperature, pressure, and wind speed, are separated. The one-day time lags of the wind speed and solar irradiance are arranged as separate features for the wind speed and solar irradiance models, respectively. These features are merged, and then normalization is performed, i.e., the range of each input vector is restricted to (0, 1). The scaled data are then separated into train and test data sets for training and evaluating the proposed model, respectively.
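The individual preprocessing steps above can be sketched with small helper functions. The names are hypothetical, missing values are assumed to be marked as `None`, the normalization is assumed to be min-max scaling, and the number of bits used for the cloud type encoding is a placeholder; none of these details are specified in the text.

```python
import math


def impute_previous_day(values, steps_per_day):
    """Fill None with the value at the same time on the most recent earlier day with data."""
    filled = list(values)
    for i, v in enumerate(filled):
        if v is None:
            j = i - steps_per_day
            while j >= 0 and filled[j] is None:
                j -= steps_per_day
            if j >= 0:
                filled[i] = filled[j]  # stays None if no earlier day has data
    return filled


def cyclic_encode(value, period):
    """Map a cyclic quantity (hour, month, wind direction, ...) to (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def binary_encode(category_index, n_bits):
    """Binary encoding of a categorical index, most significant bit first."""
    return [(category_index >> b) & 1 for b in reversed(range(n_bits))]


def min_max_scale(column):
    """Scale a feature column to the unit interval (assumed min-max normalization)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Two samples per "day" keeps the example short; 30-min data would use 48.
wind_speed = impute_previous_day([1.0, 2.0, None, 4.0, None, None], steps_per_day=2)
hour_sin, hour_cos = cyclic_encode(6, period=24)
cloud_bits = binary_encode(5, n_bits=4)
scaled = min_max_scale([2.0, 4.0, 6.0])
```

The sine and cosine pair keeps neighbouring values close across the wrap-around (hour 23 and hour 0, or 359° and 0° for wind direction), which a single integer feature would not.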

Convolutional Neural Networks
In this section, we describe the layers associated with the implementation of our forecasting model, including 1-D convolution, ReLU, dropout, and fully-connected layers.

The 1-D convolution
The convolutional layer is the most important building block of any CNN. This layer is regarded as a set of learnable filters that consists of many convolution operations. The parameters of every convolution operation are optimized by the backpropagation algorithm. Each filter in a specific convolution layer has the same receptive field. An example of 1D convolution is shown in Figure 4. The weights associated with a kernel size of 3 are {w1, w2, w3}. These weights are shared across the input layer {i1, i2, i3, i4, i5}. The feature map is obtained by the convolution between the weights and the inputs. In this example, the feature f2 is obtained by f2 = w1 × i2 + w2 × i3 + w3 × i4.
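The worked example can be reproduced numerically with the short sketch below. The input and weight values are arbitrary numbers chosen for illustration, and the sliding-window form without padding (the form commonly used in deep learning libraries) is an assumption.

```python
import numpy as np

def conv1d_valid(inputs, weights):
    """Slide the shared weights over the input without padding."""
    inputs = np.asarray(inputs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    k = len(weights)
    return np.array([np.dot(weights, inputs[j:j + k])
                     for j in range(len(inputs) - k + 1)])

i = [1.0, 2.0, 3.0, 4.0, 5.0]   # i1..i5 (illustrative values)
w = [0.5, -1.0, 2.0]            # w1..w3 (illustrative values)
f = conv1d_valid(i, w)          # f[0] is f1, f[1] is f2, f[2] is f3

# f2 = w1 * i2 + w2 * i3 + w3 * i4
f2_by_hand = w[0] * i[1] + w[1] * i[2] + w[2] * i[3]
```

With five inputs and a kernel size of 3, the feature map has 5 - 3 + 1 = 3 entries.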

ReLU
Activation functions are used to enhance the ability of models to learn complex structures. ReLU has been widely adopted by various researchers to make the network more trainable. It works by thresholding values at 0, i.e., f(z) = max(0, z).

Dropout
The dropout technique provides an easy way to reduce overfitting when designing a deep learning model. It involves randomly selecting neurons and disabling them during training. The output values of these randomly disabled neurons are zero.
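Both operations are simple enough to sketch in a few lines. The rescaling of the surviving activations by 1 / (1 - rate), often called inverted dropout, is an assumption taken from common practice and is not stated in this work.

```python
import numpy as np

def relu(z):
    """ReLU activation: f(z) = max(0, z), applied element-wise."""
    return np.maximum(0.0, z)

def dropout(activations, rate, rng, training=True):
    """Zero a random subset of activations during training.

    Surviving activations are divided by (1 - rate) so that the expected
    activation is unchanged (assumed convention, see the lead-in).
    """
    activations = np.asarray(activations, dtype=float)
    if not training or rate == 0.0:
        return activations
    keep = rng.random(activations.shape) >= rate   # True for neurons that stay active
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)
z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
a = relu(z)                                              # negative values become 0
a_train = dropout(a, rate=0.5, rng=rng)                  # each entry is zeroed or doubled
a_test = dropout(a, rate=0.5, rng=rng, training=False)   # unchanged at inference time
```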

Fully-Connected Layer
The fully connected layer performs the nonlinear mapping from the input to the output by using a bias and an activation function. These layers are usually applied towards the end of the network. We use a flatten layer after the convolution layers, as the fully connected layer expects 1-D data.

Proposed Model
The details of our proposed MH-CNN model are shown in Figure 5. The numbers of features for wind speed and solar irradiance short-term forecasting are 62 and 47, respectively. For wind speed forecasting, there are 48 instances per day, as the data are recorded every 30 min. For solar irradiance forecasting, however, we considered the data from 5:30 to 19:00; hence, there are 28 instances per day. We used 64 filters in the first convolution layer, with kernel sizes of 3, 5, and 7 for the three heads of the MH-CNN. Similarly, we used 32 filters in the second convolution layer, with kernel sizes of 5, 7, and 9. We used a dropout value of 0.5 before applying the flattening layer. After concatenating all of the features into a single feature vector, a fully connected layer was applied with 16 neurons and the ReLU as a nonlinear activation function. The parameter settings of the proposed model are listed in Table 2. Both wind speed and solar irradiance are forecast half an hour ahead. The proposed model forecasts the value for the next half hour using the values of the previous day as inputs.
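To make the data flow through the three heads concrete, the following NumPy sketch runs one forward pass with random weights. It is an illustration under several assumptions that are not taken from this work: the input is laid out as 48 time steps by 62 features, the convolutions use no padding, the kernel sizes are paired per head as (3, 5), (5, 7), and (7, 9), and a single linear output neuron follows the 16-neuron layer. Dropout is omitted because it is inactive at inference time.

```python
import numpy as np

def conv1d(x, kernels, bias):
    """1D convolution without padding.
    x: (steps, channels); kernels: (k, channels, filters); bias: (filters,)."""
    k = kernels.shape[0]
    steps_out = x.shape[0] - k + 1
    out = np.empty((steps_out, kernels.shape[2]))
    for t in range(steps_out):
        # Contract over the kernel offsets and the input channels.
        out[t] = np.tensordot(x[t:t + k], kernels, axes=([0, 1], [0, 1])) + bias
    return out

def relu(z):
    return np.maximum(0.0, z)

def init_head(rng, channels, k1, k2):
    """Random weights for one head: 64 filters, then 32 filters."""
    return {"w1": rng.normal(0.0, 0.1, (k1, channels, 64)), "b1": np.zeros(64),
            "w2": rng.normal(0.0, 0.1, (k2, 64, 32)), "b2": np.zeros(32)}

def head_forward(x, p):
    h = relu(conv1d(x, p["w1"], p["b1"]))
    h = relu(conv1d(h, p["w2"], p["b2"]))
    return h.ravel()                      # flatten to a 1-D feature vector

rng = np.random.default_rng(0)
steps, n_features = 48, 62                # assumed input layout for wind speed
heads = [init_head(rng, n_features, k1, k2) for k1, k2 in [(3, 5), (5, 7), (7, 9)]]

x = rng.random((steps, n_features))       # one scaled input window (random stand-in)
merged = np.concatenate([head_forward(x, p) for p in heads])

w_fc, b_fc = rng.normal(0.0, 0.01, (merged.size, 16)), np.zeros(16)
hidden = relu(merged @ w_fc + b_fc)       # fully connected layer with 16 neurons
w_out = rng.normal(0.0, 0.1, (16, 1))     # assumed single-output layer
forecast = hidden @ w_out
```

Under these assumptions the three heads contribute 42 × 32, 38 × 32, and 34 × 32 values, so the concatenated feature vector has 3648 entries.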

The training flow of the proposed model is shown in Figure 6. The training data are split into 90% training data and 10% validation data. The validation loss is based on the Mean Square Error (MSE) value. If the validation loss does not decrease for two consecutive epochs, then the learning rate is reduced by a factor of 0.85. The minimum value of the learning rate is set to 1 × 10⁻⁶. If the validation loss is decreasing, then the model is saved with the updated weights. To avoid overfitting of the model during the training process, if the validation loss does not decrease for 10 consecutive epochs, then the early stopping callback is applied, and the last-saved best model is loaded for forecasting and performance evaluation. Otherwise, the training process continues until the maximal number of epochs is completed.
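The control logic of this training flow can be sketched independently of any deep learning library. In the hypothetical helper below, the initial learning rate of 1e-3 and the exact counting rule (reduce after every second consecutive epoch without improvement) are assumptions; the factor 0.85, the floor of 1e-6, and the patience of 10 epochs follow the description above.

```python
def training_schedule(val_losses, lr=1e-3, factor=0.85, lr_patience=2,
                      stop_patience=10, min_lr=1e-6):
    """Replay a sequence of validation losses and return the epoch of the
    best (checkpointed) model and the learning rate used after each epoch."""
    best = float("inf")
    best_epoch = None
    since_best = 0
    history = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # Validation loss decreased: the model would be saved here.
            best, best_epoch, since_best = loss, epoch, 0
        else:
            since_best += 1
            if since_best % lr_patience == 0:
                lr = max(lr * factor, min_lr)      # never drop below the floor
            if since_best >= stop_patience:
                history.append(lr)
                break                              # early stopping
        history.append(lr)
    return best_epoch, history

# Loss stalls for two epochs (one reduction), then improves again.
best_a, hist_a = training_schedule([1.0, 0.9, 0.95, 0.96, 0.8])

# Loss never improves after the first epoch: training stops after 10 stalled epochs.
best_b, hist_b = training_schedule([1.0] + [2.0] * 20)
```

In the second run the model saved at epoch 0 would be reloaded for evaluation, since no later epoch improved on it.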
In the error metrics, a_m is the actual value and p_m is the predicted value. The RMSE, MAE, and MBE represent the model prediction error in units of the target variable. The RMSE gives a relatively high weight to outliers compared to the MAE, as the residual is squared before averaging. The MAE is a linear score in which all the individual differences are weighted equally. The MBE indicates the degree to which the observations are over- or under-forecasted by the prediction model. Smaller magnitudes of RMSE, MAE, and MBE denote better performance of a forecasting model. MAPE is another standard metric for evaluating the performance of forecasting algorithms. Problems in its use can occur when a_m is zero or very small. As an alternative, we used sMAPE, as defined in Equation (4).
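As a hedged illustration, the four metrics can be computed as follows. The sign convention for the MBE (predicted minus actual) and the particular sMAPE variant (the mean of 2|p_m - a_m| / (|a_m| + |p_m|), in percent) are assumptions made for this sketch; other conventions exist.

```python
import numpy as np

def rmse(a, p):
    a, p = np.asarray(a, dtype=float), np.asarray(p, dtype=float)
    return float(np.sqrt(np.mean((p - a) ** 2)))

def mae(a, p):
    a, p = np.asarray(a, dtype=float), np.asarray(p, dtype=float)
    return float(np.mean(np.abs(p - a)))

def mbe(a, p):
    """Positive values indicate over-forecasting (assumed sign convention)."""
    a, p = np.asarray(a, dtype=float), np.asarray(p, dtype=float)
    return float(np.mean(p - a))

def smape(a, p):
    """Symmetric MAPE in percent; terms with a = p = 0 contribute zero."""
    a, p = np.asarray(a, dtype=float), np.asarray(p, dtype=float)
    denom = np.abs(a) + np.abs(p)
    terms = np.zeros_like(denom)
    mask = denom > 0
    terms[mask] = 2.0 * np.abs(p - a)[mask] / denom[mask]
    return float(100.0 * terms.mean())

actual = [2.0, 4.0, 6.0]
predicted = [3.0, 4.0, 5.0]
scores = {"RMSE": rmse(actual, predicted), "MAE": mae(actual, predicted),
          "MBE": mbe(actual, predicted), "sMAPE": smape(actual, predicted)}
```

Unlike MAPE, the sMAPE term stays bounded when the actual value is zero, which matters for irradiance values near sunrise and sunset.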

Performance Evaluation of Proposed Model
In this study, weather data from 1998 to 2007 for San Francisco, California, are used. The data were retrieved from the NSRDB of the NREL [67]. The first nine years of data are used to train the model, and the data for the last year (2007) are used to test the performance of the trained model.

Short-Term Forecasting Analysis of Wind Speed
We selected KNN, gradient boosting, extra tree regressor, and random forest regressor as our machine learning models. The input to all these models is the complete set of features mentioned in Figure 3. We used the default values of the parameters for our baseline model comparison. We used MSE as the loss function for all machine learning models in this study.
We also considered the persistence model for the short-term forecasting of wind speed. The persistence model assumes that the wind data at a certain future time will be the same as when the forecast was made. In this study, we therefore assumed that the wind data in the next half hour will be the same as that of the current time.
For our baseline model comparison, the parameters of various machine learning algorithms are taken from a previous study [8] and shown in Table 3.
Table 3. Parameters of various machine learning algorithms [8].
All trained machine learning models were evaluated on the same test data. We selected three standard evaluation metrics for comparing the performance of these models: RMSE, MAE, and sMAPE. Based on the test data set of one year, i.e., 2007, the seasonal average values (three-month average values of RMSE, MAE, and sMAPE, with the spring season defined as March, April, and May) are calculated. The seasonal variations of RMSE, MAE, and sMAPE are shown in Figures 7-9, respectively. It is clear from these figures that the random forest method outperforms the persistence model, and therefore serves as the baseline model. The RMSE, MAE, and sMAPE of our proposed model are the lowest among the evaluated models.
We selected a random day from the test data to demonstrate the comparison of various machine learning models with the proposed model. Detailed comparison results of the wind speed prediction for all of the evaluated models are presented in Figure 10. The bold blue line represents the actual wind speed, whereas the bold black line represents the forecast by the proposed model. A careful analysis of this figure reveals that the forecast results of the KNN and gradient boosting algorithms barely coincide with the actual wind speed. The wind speed predicted by our proposed model is quite close to the actual wind speed. The forecasting ability of the proposed model is also verified in this experiment. The independent axis in this figure shows 48 values because of the 30 min sampling interval of the measured data.
In Figure 11, a scatter plot of the predicted and actual wind speed values for the complete test data is presented. The coefficient of determination value is 0.9948, which confirms the strong, positive, linear association between the predicted and actual wind speeds. Furthermore, the coefficient of determination shows that the proposed model is able to explain 99.48% of the variation of the actual data. This indicates the very good forecasting ability of the proposed model.
The previous results (Figures 7-9) showed the evaluation performance of various models for seasonal trends. Furthermore, we evaluated the performance of the selected models on the complete test data. These results are listed in Tables 4 and 5. Table 4 shows a comparison of the various models on the basis of the evaluation metrics for the selection of a baseline model. The results indicate that the KNN and gradient boosting algorithms perform poorly on the complete test data. Usually, it is difficult to outperform the persistence model for short-term forecasting. We can see that the random forest method outperformed the persistence model. Therefore, we selected random forest as a baseline model for comparison with our proposed model.
The comparison of our proposed model with the baseline model is presented in Table 5. It is clear from this table that the proposed MH-CNN model resulted in much lower (better) evaluation metrics than the baseline model. The error reductions achieved by the proposed model for RMSE, MAE, and sMAPE are 44.94%, 46.12%, and 2.25%, respectively.

Short-Term Forecasting Analysis of Solar Irradiance
In solar irradiance forecasting, the persistence model usually serves as a baseline model for short-term forecasting. A simple persistence model [36] is given in Equation (5), where GHI(t) is the current global horizontal irradiance (GHI) at the surface. (The terms "solar irradiance" and "GHI" are used interchangeably throughout the manuscript.) We selected another model with which to compare our proposed model: a variant of the persistence model, called the smart persistence model [36]. It is defined in Equation (6), where k_t(t) is the clear-sky index correction factor, defined in Equation (7).
Based on the test data of solar irradiance for the year 2007, Figures 12 and 13 show the seasonal average variations for the persistence, smart persistence, and proposed models in terms of RMSE and sMAPE, respectively. It is clear from these figures that the RMSE and sMAPE of our proposed model are lower than those of the persistence and smart persistence models. This shows that our proposed model is suitable for the short-term forecasting of solar irradiance.
A comparison of the actual GHI and the predicted solar irradiance using the proposed model and the smart persistence model is shown in Figure 14a. We selected a random day from the test data to demonstrate the comparison of the smart persistence model with the proposed model. The prediction accuracies of the proposed model and the smart persistence model are shown in Figure 14b and Figure 14c, respectively. The coefficient of determination of the proposed model is reasonably high compared to that of the smart persistence model, which shows that our proposed model can be successfully applied to predict the solar irradiance. Later, we used the predicted solar irradiance to approximate the generated solar energy.
A comparison of the persistence and smart persistence models for the selection of the best model is presented in Table 6. As seen from the results, the smart persistence model is the better baseline model for comparison with the proposed model. A comparison of the proposed model with the smart persistence model is presented in Table 7. It is clear from the results of this table that the proposed model produced better results than the smart persistence model. We achieved 7.68%, 54.29%, and 0.14% error reductions in the RMSE, MBE, and sMAPE, respectively.
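The two baseline models can be sketched as below. The forms used here are the ones commonly found in the solar forecasting literature and are assumptions for this example: persistence carries the current GHI forward unchanged, and smart persistence scales the clear-sky GHI at the forecast time by the current clear-sky index k_t(t) = GHI(t) / GHI_clr(t). The clear-sky values and variable names are hypothetical inputs.

```python
import numpy as np

def persistence(ghi_now):
    """Forecast for the next step equals the current GHI."""
    return np.asarray(ghi_now, dtype=float)

def smart_persistence(ghi_now, clear_now, clear_next):
    """Scale the next-step clear-sky GHI by the current clear-sky index."""
    ghi_now = np.asarray(ghi_now, dtype=float)
    clear_now = np.asarray(clear_now, dtype=float)
    clear_next = np.asarray(clear_next, dtype=float)
    # Clear-sky index; set to zero where the clear-sky GHI is zero (night).
    k_t = np.divide(ghi_now, clear_now,
                    out=np.zeros_like(ghi_now), where=clear_now > 0)
    return k_t * clear_next

ghi_now = [400.0, 0.0]       # W/m^2, measured now (illustrative values)
clear_now = [500.0, 0.0]     # W/m^2, clear-sky GHI now
clear_next = [600.0, 50.0]   # W/m^2, clear-sky GHI half an hour ahead

p_forecast = persistence(ghi_now)
sp_forecast = smart_persistence(ghi_now, clear_now, clear_next)
```

Smart persistence accounts for the deterministic daily course of the sun, which is why it is usually the harder baseline to beat.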

Results and Discussions
In this section, we evaluate the effectiveness of the proposed model for different forecasting horizons and different climates. Then, we perform a comprehensive bill reduction analysis by estimating the generated power from renewable resources using the data of San Francisco as a case study.

Evaluation of Proposed Model for Different Time Horizons
In the previous section, we explored the effectiveness of the proposed model for a single-step forecast, i.e., predicting the observation at the next time stamp. To illustrate the effectiveness of the proposed model for multi-step forecasting, i.e., different time horizons, we considered the same data used in Section 4. The output of the proposed model was reshaped according to the forecasting time horizon. For example, when the time horizon was set to 4 h ahead, the model was evaluated such that, for each instance of test data, it predicts the next eight values in one shot.
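The reshaping of the targets for a multi-step forecast can be illustrated as follows. The helper name and the toy series are assumptions; with half-hourly data, a horizon of 4 h corresponds to eight target values per instance.

```python
import numpy as np

def make_multistep_targets(series, horizon):
    """For every time stamp t, collect the next `horizon` values as one target row."""
    series = np.asarray(series, dtype=float)
    n = len(series) - horizon            # number of instances with a full horizon
    return np.stack([series[t + 1:t + 1 + horizon] for t in range(n)])

series = np.arange(10.0)                 # toy half-hourly series 0, 1, ..., 9
targets = make_multistep_targets(series, horizon=8)   # 4 h ahead = 8 steps
```

Each row of `targets` is what the output layer would have to produce in one shot for the corresponding input instance.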
Table 8 shows the seasonal RMSE variation of the proposed model for wind speed and solar irradiance forecasting over different time horizons.
For both sites, the data were taken from the NSRDB [67], and we prepared the training and test data according to Figure 3. The first nine years of data (1998-2006) were used to train the model, and the last year of data (2007) was used to test the performance of the trained model.
To fairly compare the RMSE across different sites, the normalized root mean square error (nRMSE) is computed by dividing the RMSE by ȳ, where ȳ is the mean of the actual values. Table 9 shows the seasonal variation of the proposed model for short-term forecasting of wind speed and solar irradiance for different climates. As seen in the table, for each season, there is only a small discrepancy between the wind speed nRMSE values of the sites with different climates. This result indicates that our proposed model is capable of forecasting the short-term wind speed for different climates during various seasons with high accuracy.
For short-term forecasting of solar irradiance, there is a difference of almost 9% between San Francisco and New York for the best-predicted season (summer). This discrepancy is certainly due to the cloud formation processes, which are very different at these two sites. The two sites experience different sky conditions during the year. Sites such as San Francisco and Las Vegas exhibit stable sky conditions during the summer. However, New York witnesses occasional thunderstorms with heavy rain in summer, and tornadoes are not uncommon.
There is a difference of almost 5.6% between San Francisco and New York for the worst-predicted season (winter). In San Francisco and New York, the sky is mostly cloudy around 55% and 53% of the time in winter, respectively. The performance of the proposed model is worst in winter, since the sky coverage is highly variable. In Las Vegas, there is a significant seasonal variation in the cloud coverage over the course of the year. For a hot desert climate, such as that of Las Vegas, the seasonal performance of the model is reasonably suitable for solar irradiance forecasting.
The results of the solar irradiance forecast indicate that the proposed model is well-suited for a hot desert climate, as well as a Mediterranean climate. Moreover, it can also be used for a humid subtropical climate.

Generation Estimation and Bill Reduction Analysis: San Francisco as a Case Study
We considered a smart community consisting of 80 homes as the consumers of the electricity. For simulation purposes, it was assumed that the smart community has a microgrid that is equipped with wind turbines and solar panels, in addition to having access to the power from the main grid. At any time, the energy generated by the wind turbine and solar panels is provided to the users through the microgrid, and the excess generated energy is stored in the ESS for later use. In addition, the deficit energy at any time is purchased from the commercial grid to satisfy the energy demands of the users in the smart community.
In this work, a Time of Use (ToU) pricing model is applied to determine the price of the consumed electricity [68], which is shown in Figure 15. A 24-h time period is considered and is denoted by T. This is divided into 1-h subintervals indicated by t.
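The energy flow described above can be sketched as an hourly balance. Everything numeric in this example is hypothetical (demand, generation, prices, and storage capacity), and the storage is idealized: it has no losses, and surplus energy beyond its capacity is simply discarded. The sketch only illustrates the rule "use DER output first, then the ESS, then buy the deficit at the ToU price".

```python
def simulate_day(demand, generation, prices, ess_capacity, ess_initial=0.0):
    """Hourly energy balance (kWh per 1-h subinterval) and resulting bill."""
    stored = ess_initial
    bill = 0.0
    purchases = []
    for load, gen, price in zip(demand, generation, prices):
        surplus = gen - load
        if surplus >= 0:
            # Excess generation charges the ESS up to its capacity.
            stored = min(ess_capacity, stored + surplus)
            bought = 0.0
        else:
            # Deficit is covered by the ESS first, then by the main grid.
            from_ess = min(stored, -surplus)
            stored -= from_ess
            bought = -surplus - from_ess
        purchases.append(bought)
        bill += bought * price
    return bill, purchases

demand = [10.0, 10.0, 10.0]        # kWh, hypothetical community demand
generation = [15.0, 5.0, 0.0]      # kWh, hypothetical DER output
prices = [0.1, 0.2, 0.3]           # price per kWh, hypothetical ToU tariff

bill_with_ders, purchases = simulate_day(demand, generation, prices, ess_capacity=3.0)
bill_without_ders, _ = simulate_day(demand, [0.0] * 3, prices, ess_capacity=0.0)
reduction_pct = 100.0 * (bill_without_ders - bill_with_ders) / bill_without_ders
```

Running the same function with zero generation gives the reference bill against which the percentage reduction is computed.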
For a randomly selected day, the proposed model is applied to predict the wind speed for 24 h. The predicted wind speed is then used to approximate the generated wind energy based on Equation (8), which we also used in our recent work [69]. The predicted wind speed and generated wind power are presented in Figure 16. It is clear from the figure that with an increase in the wind speed, the power generated by the wind turbine also increases. The power generation of the wind turbine is approximated by implementing Equation (8) in MATLAB. For simulation purposes, we used a single wind turbine of 30 kW [70]. As shown in the figure, when the wind speed is equal to, or greater than, the rated wind speed of the selected wind turbine, the output power is the maximum attainable, which is the rated maximum power.
The predicted half-hour wind speed for three days and the associated generated wind power are shown in Figure 17. The fluctuations in the predicted wind speed at different times during the 72-h period are evident from the figure. As shown, when the wind speed is lower than the cut-in speed of the wind turbine, the generated power is zero.
In Equation (8), the power generated by the wind turbine is represented by P_t^wt, and C_p is the power coefficient. The generated power also depends on the air density ρ, the area swept by the rotor blades A, and the wind speed V_t^wt.
The wind turbine triggers are based on the cut-in and cut-out speeds. The association between the output power and the wind speed of the wind turbine is based on Equation (9) from a previous study [71]. In Equation (9), P_out is the output power, P_R is the rated power, V_t^wt is the wind speed at time t, V_ci is the cut-in wind speed, and V_co is the cut-out wind speed. The technical specifications of the selected wind turbine are shown in Table 10.
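The relationship between wind speed and turbine output can be sketched in Python as below. The sketch assumes the standard aerodynamic power expression P = 0.5 × C_p × ρ × A × V³ and a simple operating rule (zero output outside the cut-in and cut-out speeds, output capped at the rated power); the exact piecewise form of the cited study may differ. Only the 30 kW rating comes from this work; the power coefficient, air density, rotor diameter, and cut-in and cut-out speeds are hypothetical values.

```python
import math

def wind_power_kw(v, cp=0.4, rho=1.225, rotor_diameter=13.0):
    """Aerodynamic power in kW for wind speed v (m/s).
    cp, rho (kg/m^3), and rotor_diameter (m) are illustrative assumptions."""
    area = math.pi * (rotor_diameter / 2.0) ** 2     # area swept by the rotor blades
    return 0.5 * cp * rho * area * v ** 3 / 1000.0

def turbine_output_kw(v, rated_kw=30.0, v_ci=3.0, v_co=25.0):
    """Output power with cut-in/cut-out triggers and a cap at the rated power."""
    if v < v_ci or v > v_co:
        return 0.0                                   # turbine is not operating
    return min(rated_kw, wind_power_kw(v))

speeds = [2.0, 5.0, 8.0, 12.0, 30.0]                 # m/s, e.g. forecast values
outputs = [turbine_output_kw(v) for v in speeds]
```

Feeding a forecast wind speed series through `turbine_output_kw` yields the kind of generation estimate that is plotted against wind speed in the figures.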
Figure 18 shows the association between the power generated by the solar panel and the solar irradiance. We can observe that the power generated by the solar panel increases with the solar irradiance. The solar panel temperature data are taken from a previous study [72]. The solar irradiance is predicted using our proposed model. The power generation by the solar panel is approximated by implementing Equation (10) from our previous work [69] in MATLAB.
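For readers who prefer Python to MATLAB, the solar generation step can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: it assumes the widely used temperature-derated form, in which output equals efficiency times area times irradiance, reduced by 0.005 per degree Celsius above a 25 °C reference. The function name, the panel area, the efficiency, and the sample inputs are hypothetical.

```python
def pv_power(irradiance, panel_temp_c, area_m2, efficiency,
             temp_coeff=0.005, ref_temp_c=25.0):
    """Approximate PV output in watts.

    irradiance    : solar irradiance in W/m^2 (for example, a forecast value)
    panel_temp_c  : panel temperature in degrees Celsius
    area_m2       : panel area in m^2 (assumed value, not from the paper)
    efficiency    : panel efficiency as a fraction (assumed value)
    temp_coeff    : assumed derating per degree Celsius above ref_temp_c
    """
    derate = 1.0 - temp_coeff * (panel_temp_c - ref_temp_c)
    # Clamp at zero so extreme temperatures cannot give negative power.
    return max(0.0, efficiency * area_m2 * irradiance * derate)


# Hypothetical example: one panel at the reference temperature and at 45 C.
p_ref = pv_power(irradiance=1000.0, panel_temp_c=25.0, area_m2=1.6, efficiency=0.18)
p_hot = pv_power(irradiance=1000.0, panel_temp_c=45.0, area_m2=1.6, efficiency=0.18)

# Community total for 80 homes with two panels each, as in the simulation setup.
community_power = 80 * 2 * p_hot
```

The derating term is why a hotter panel produces less power at the same irradiance, which is the reason the panel temperature series is needed alongside the irradiance forecast.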
$$P_t^{PV} = \eta^{PV} \cdot A^{PV} \cdot Irr \cdot \left(1 - \frac{1}{200}\left(T_t - 25\right)\right) \qquad (10)$$

In Equation (10), the hourly generated power from the solar panel is represented by $P_t^{PV}$. The area and efficiency of the solar panel are represented by $A^{PV}$ and $\eta^{PV}$, respectively. The solar irradiance is represented by $Irr$, and the hourly temperature of the solar panel is represented by $T_t$. In our simulations, we considered two solar panels per house, each rated at 300 W.
The proportions of the average daily power generated by the wind turbine and solar panels in each month of the year 2007 are presented in Figure 19. It is evident from the figure that the power generated by the solar panels varies predictably across the seasons. June has the highest average daily power generation from the solar panels. The average daily power generated by the wind turbine is lowest in August and highest in February.
It can be observed from Figure 22 that the contribution of DERs during the winter season is not very significant. However, in the remaining seasons, the DERs provide significant relief in terms of bill reduction. Based on the numerical values of the bar graphs in Figure 22, it can be concluded that the proposed framework will help the smart community to achieve an annual reduction of up to 38% in their electricity bills by installing DERs.
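The bill reduction figures discussed above are relative savings, which can be computed as in the following sketch. The seasonal costs used here are made-up illustrative numbers in arbitrary currency units; they are not read from Figure 22, and the function name is hypothetical.

```python
def bill_reduction_percent(cost_without_ders, cost_with_ders):
    """Percentage of the bill saved by DERs, relative to the grid-only cost."""
    if cost_without_ders <= 0:
        raise ValueError("grid-only cost must be positive")
    return 100.0 * (cost_without_ders - cost_with_ders) / cost_without_ders


# Hypothetical (grid-only cost, cost with DERs) per season; NOT the paper's data.
seasonal_costs = {
    "winter": (1000.0, 900.0),
    "spring": (800.0, 450.0),
    "summer": (1200.0, 700.0),
    "autumn": (900.0, 550.0),
}

seasonal_savings = {
    season: bill_reduction_percent(without, with_ders)
    for season, (without, with_ders) in seasonal_costs.items()
}

# The annual figure is computed from the yearly totals, not by averaging
# the seasonal percentages, so expensive seasons carry more weight.
total_without = sum(without for without, _ in seasonal_costs.values())
total_with = sum(with_ders for _, with_ders in seasonal_costs.values())
annual_saving = bill_reduction_percent(total_without, total_with)
```

With these placeholder numbers the winter saving is the smallest of the four seasons, mirroring the qualitative pattern described for Figure 22.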

Conclusions
DERs are valuable in decreasing consumers' electricity bills by enabling them to generate their own green energy. However, the intermittent nature of DERs makes it difficult to accurately forecast the amount of energy generated by these renewable energy resources. In this work, we proposed and evaluated an efficient approach to energy management in a smart community based on the integration of DERs. Sometimes, the energy generated through DERs is greater than the energy demand of the consumers, which results in a demand and generation mismatch.
This demand and generation mismatch leads to the well-known "duck curve" problem. We applied a machine learning model to accurately predict the energy generated through DERs. We considered a smart community consisting of 80 smart homes. The smart community has access to electric power through a microgrid that is equipped with DERs in the form of wind turbines and photovoltaic systems, in addition to having access to power from the main grid.
The simulation results indicated that our proposed framework is appropriate for approximating the energy generated through DERs and for reducing the electricity bills of the smart community. We evaluated the performance of several machine learning models to select a baseline model. Then, we evaluated the performance of our proposed model and compared it with the baseline model.
For wind speed prediction, we obtained 44.94%, 46.12%, and 2.25% error reductions in RMSE, MAE, and sMAPE, respectively. For solar irradiance prediction, we obtained 7.68%, 54.29%, and 0.14% error reductions in RMSE, MBE, and sMAPE, respectively.
We further evaluated the effectiveness of the proposed model in different climates and for different time horizons. The results indicate that the proposed model is suitable for short-term forecasting of wind speed and solar irradiance not only for different time horizons (up to four hours ahead) but also for different climates.

Figure 3. Preprocessing steps for cleaning and separating data into train and test sets.

Figure 5. Structure of proposed MH-CNN model.

Figure 6. Training flow of proposed model.

The performance of the trained model is evaluated based on various metrics, including RMSE, MAE, and MBE. The calculation of these performance metrics is shown in Equations (1) to (3).
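As a hedged illustration of the three metrics named above, the sketch below computes them with their usual textbook definitions. It assumes that the error is taken as predicted minus actual, so a positive MBE indicates over-prediction; the paper's Equations (1) to (3) may use a different sign convention. The function names and sample values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: sqrt(mean((predicted - actual)^2))."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: mean(|predicted - actual|)."""
    n = len(actual)
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / n


def mbe(actual, predicted):
    """Mean bias error: mean(predicted - actual); the sign shows the bias direction."""
    n = len(actual)
    return sum(p - a for a, p in zip(actual, predicted)) / n


# Hypothetical wind speed samples in m/s (not data from the paper).
actual = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 5.0]

scores = {
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
    "MBE": mbe(actual, predicted),
}
```

RMSE penalizes large errors more heavily than MAE, while MBE can be close to zero even for a poor model if over- and under-predictions cancel, which is why the metrics are reported together.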

Figure 7. Seasonal variation of RMSE for forecasting wind speed.

Figure 8. Seasonal variation of MAE for forecasting wind speed.

Figure 9. Seasonal variation of sMAPE for forecasting wind speed.

We have selected a random day from the test data to demonstrate the comparison of various machine learning models with the proposed model. Detailed comparison results of the wind speed ...

Figure 10. Comparison of wind speed prediction for various models.

Figure 11. Actual vs. predicted wind speed for test data.

Figure 12. Seasonal variation of RMSE for forecasting solar irradiance.

Figure 14. (a) Comparison of actual GHI and predicted solar irradiance by smart persistence and proposed model, (b) prediction accuracy of proposed model, and (c) prediction accuracy of smart persistence.

Figure 16. Predicted wind speed and approximated wind power.

Figure 17. Predicted wind speed for three days and associated power generated by wind turbine.

Figure 18. Predicted solar irradiance and power generated by solar panel.

Figure 21. Average monthly cost of electricity for 80 homes and savings provided by DERs.

Figure 22. Seasonal cost savings provided by DERs.

Table 1. List of Abbreviations.

... as lag features in the wind speed and solar irradiance forecasting models, respectively. The data preparation steps are presented in Figure 3.

Table 2. Parameter settings of proposed MH-CNN model.

Table 4. Comparison of various models based on test data.

Table 5. Comparison of baseline model with proposed model.

Table 6. Comparison of persistence and smart persistence models based on test data.

Table 7. Comparison of proposed and smart persistence models based on test data.

Table 8. Effectiveness of proposed model in different time horizons.

Evaluation of Proposed Model for Different Climates
In Section 4, we tested the proposed model on San Francisco (Latitude: 37.77, Longitude: −122.42), which has a warm summer Mediterranean climate. In order to demonstrate the effectiveness of the proposed model in a different climate, we selected New York (Latitude: 40.73, Longitude: −74.02), which has a humid subtropical climate, and Las Vegas (Latitude: 33.61, Longitude: −114.58), which has a hot desert climate. In this experiment, data from 1998 to 2007 for New York and Las Vegas are used. The data were retrieved from the NSRDB of the NREL.

Table 9. Effectiveness of proposed model in different climates.