Article
Peer-Review Record

Performance Analysis of Statistical, Machine Learning and Deep Learning Models in Long-Term Forecasting of Solar Power Production

Forecasting 2023, 5(1), 256-284; https://doi.org/10.3390/forecast5010014
by Ashish Sedai 1,*, Rabin Dhakal 2, Shishir Gautam 3, Anibesh Dhamala 4, Argenis Bilbao 1, Qin Wang 2, Adam Wigington 2 and Suhas Pol 1,5
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 31 December 2022 / Revised: 12 February 2023 / Accepted: 15 February 2023 / Published: 22 February 2023
(This article belongs to the Special Issue Energy Forecasting Using Time-Series Analysis)

Round 1

Reviewer 1 Report

In this paper, ML/DL models have been used to provide long-term predictions for solar power production. The study has also suggested approaches to enhance the accuracy of these forecasts, and compares different forecasting models such as ARIMA (statistical model), SVR (ML model), LSTM & GRU (DL models) and RF & hybrid ensemble methods in terms of their ability to accurately predict solar power generation over a period ranging from 1 day up to 15 days. Finally, conclusions have been drawn on how the accuracy of the various forecasts changes with the prediction time frame and input variables.

This research can be used to inform decision-making processes related to solar power production, as well as provide guidance on how best to use different types of forecasting models depending on the time frame needed and input variables available. Additionally, by providing approaches which enhance prediction accuracy, stakeholders may have more confidence when planning their operations around renewable energy sources such as solar power plants.

 However, there are some concerns about the present work:

1.         There may not have been enough data points used in order to accurately assess the performance of each forecasting model over different time frames. Authors should explain this situation.

2.         More information should be provided on how these models perform under varying conditions such as weather or seasonality.

3.         Modelling error analysis results should be given in tabular form according to each presented machine learning model.

4.         Other factors which can affect solar power production, such as cloud cover or seasonal changes, have not been taken into account in this research. Also, it is possible that more data points and a larger sample size would be needed in order to accurately assess the performance of each forecasting model over different time frames.

Author Response

Response to the reviewer’s comment

Thank you for providing the valuable suggestions and motivation for our work.

  1. That’s a valid point; the more data available, the better the models' performance can be compared. However, large datasets are not always accessible, for various reasons (remote location, lack of capital for equipment setup, etc.). This study helps examine the performance of various statistical, ML, and DL models with limited access to data.
  2. It’s a valuable suggestion to perform a sensitivity analysis of the models based on weather and seasonality. However, the authors' intention in this study is to analyze different models on the same footing (same inputs) and see how each model's performance varies as the prediction horizon varies.
  3. The modeling error is presented as RMSE values (graphical representation) and MAPE (Mean Absolute Percentage Error) values for all models (tabular representation).
  4. Access to data such as cloud cover was unavailable during the study. However, the authors agree that variables like cloud cover can be very significant parameters for evaluating long-term solar power production prediction. The authors understand the significance of the large datasets that the reviewer highlights. Here the authors intend to analyze the performance of the models under limited data and input variables. The authors have published a related study comparing models for forecasting wind speed with the introduction of new variables through feature engineering. The authors plan to perform a similar study for solar power production and to incorporate cloud cover data, perform feature engineering, and utilize large-sample-size data as part of future studies.

https://www.mdpi.com/2076-3417/12/18/9038
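
For readers unfamiliar with the two error measures named in response 3, the following is a minimal sketch of how RMSE and MAPE are commonly computed. It is not the authors' code: the function names and the sample values are illustrative assumptions, and skipping zero-valued observations in MAPE (which would otherwise divide by zero, for example at night for solar output) is one possible convention rather than the one used in the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target variable."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: observations with actual == 0 are skipped, because the
    percentage error is undefined there.
    """
    if len(actual) != len(predicted):
        raise ValueError("inputs must be of equal length")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not ratios:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical values, for illustration only.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(rmse(observed, forecast), mape(observed, forecast))
```

RMSE penalizes large errors more heavily and depends on the scale of the data, while MAPE is scale-free, which is one reason the two are often reported together.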

Reviewer 2 Report

This study used ML/DL models to provide long-term predictions for solar power production using several existing forecasting models. It compares the statistical model (ARIMA), ML model (SVR), DL models (LSTM, GRU, etc.), and ensemble models (RF, hybrid) for long-term prediction. The topic is very interesting; however, the manuscript has many flaws and, thus, needs revision. My specific comments are as follows:

 

 

1.       There are many grammatical mistakes and typos throughout the manuscript. Professional proofreading is recommended. See the first line of the abstract, for example.

2.       At the end of the abstract, please give the numerical results of the study in no more than two lines.

3.       Avoid lump-sum references in the introduction section.

4.       For Section 1, the authors should comment on the cited papers after introducing each relevant work. What readers require, through a convincing literature review, is a clear understanding of why the proposed approach can reach more convincing results. This is the real contribution of the authors.

5.       Please highlight the novelty of the current work point-wise at the end of section 1. How is your work different from other published works? What contribution does it bring to the scholarly world?

6.       Please provide the section-wise breakup at the end of section 1.

7.       In the literature review, the authors should discuss other important time series forecasting models used for forecasting energy variables to highlight the importance of this topic for the readers of this journal. For example, econometric modeling (Economic Assessment of a concentrating solar power forecasting system for participation in the Spanish electricity market), functional modeling (functional data approach for short-term electricity demand forecasting, Forecasting next-day electricity demand and prices based on functional models), ensemble learning (Ensemble methods for wind and solar power forecasting a state-of-the-art review, electricity spot prices forecasting based on ensemble learning), nonlinear models (An Bayesian learning and nonlinear regression model for photovoltaic power output forecasting, modeling and forecasting electricity demand and prices: a comparison of alternative approaches), etc.

8.       Are the data used in the current study freely available? If yes, the authors should provide a link for retrieval.

9.       Please provide the MAE, MAPE, and adjusted R-squared values as well to evaluate the models' performance.

10.   Quantification of the difference is preferable, especially in comparison to other published work. The authors must compare their results with those already existing in the literature.

11.   Please also provide the ACF and PACF plots of the final residuals.

 

12.   The conclusion should be shortened and precise.
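
As an illustration of two of the measures requested in comment 9, the sketch below computes MAE and adjusted R-squared. It is a generic example under stated assumptions, not something taken from the manuscript: the function names, the sample data, and the choice of `n_predictors` are all hypothetical.

```python
def mae(actual, predicted):
    """Mean absolute error, in the same units as the target variable."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for a constant target")
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(actual, predicted, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(actual)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    r2 = r_squared(actual, predicted)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
forecast = [1.1, 1.9, 3.2, 3.8, 5.0]
print(mae(observed, forecast), adjusted_r_squared(observed, forecast, n_predictors=1))
```

The adjustment matters when comparing univariate and multivariate models, since plain R-squared never decreases as input variables are added.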

Author Response

Response to the reviewer’s comment

  1. The authors would like to apologize for the grammatical errors and typos. The revised manuscript has gone through professional proofreading.
  2. The numerical results have been added to the revised manuscript as follows.

The Random Forest model predicted long-term solar power generation with 50% better accuracy than the univariate statistical model and 10% better accuracy than the multivariate ML/DL models.

  3. The lump-sum references have been avoided in the introduction section of the revised edition.
  4. Overall comments on the cited papers have been added to the revised manuscript. The following has been added in response.

The literature review discussed above highlights the effectiveness of various forecasting models for short-term solar power generation forecasting (intra-day ahead prediction). However, the authors identify that the research related to long-term solar power generation forecasting is limited, due to factors such as limited input datasets, computational constraints, and a lack of consideration for long-term forecasting among stakeholders. As the penetration of renewable energy in the electrical grid increases, accurate long-term forecasting becomes increasingly important for improving economic efficiency in unit commitment and dispatch processes.

 

  5. The highlights have been added as follows.

Significance of the Study:

  • This study aims to provide a comprehensive comparison of popular forecasting models for long-term solar power generation forecasting, an area where there has been limited research.
  • The study seeks to understand the relationship between the forecasting model's input variables and forecasting accuracy.
  • The study investigates how the performance of different models changes as the prediction horizon changes.
  • The study compares the performance of hybrid and ensemble models to that of single models.
  • The study assesses the performance of statistical, ML, DL, and ensemble forecasting models when limited input variables and datasets are available.
  6. The section-wise breakdown has been added.
  7. The differences between other forecasting models and the time series ML/DL models have been added to the literature review as follows.

In addition to time series forecasting using statistical, ML, DL, ensemble, and hybrid models, like those in this study, there are several other approaches, such as functional modeling, Bayesian learning modeling, and non-linear regression modeling, that are used in predicting renewable energy generation. The method utilized in this study is different from the above-mentioned modeling approaches in a few key ways.

For instance, a time series ML/DL forecasting model is different from functional modeling in the following aspects:

  • Temporal aspect: A time series model considers the temporal aspect of the data, specifically the order in which the data points occur, while functional forecasting modeling is more suited to modeling physical systems, which do not have a specific temporal aspect.
  • Complexity: Time series models can be more complex than functional models because they need to consider patterns and trends in the data, which may not be captured by a simpler functional model.
  • Feature Engineering: Time series ML models often rely on feature engineering, which is the process of creating new features from the raw data, to improve the prediction performance. Functional forecasting modeling does not require feature engineering.

Likewise, time series ML/DL forecasting modeling is different from the Bayesian learning model in the following aspects:

  • Data Assumptions: Time series models assume that the data is generated by a stationary process and make assumptions about the underlying distribution of the data, while Bayesian learning models use Bayes' theorem to update the probability of a hypothesis as more evidence becomes available.
  • Predictive model: Time series models are specifically designed to predict future values of a time series based on historical data, whereas Bayesian learning models can be used for a wide range of applications, including time series forecasting, but they may not be as well suited to the task as a dedicated time series forecasting model.
  • Modeling techniques: Time series machine learning models use techniques such as ARIMA, LSTM, and Prophet to improve the prediction performance, whereas Bayesian learning models use techniques such as Markov Chain Monte Carlo (MCMC) and variational inference to estimate the parameters of the model.
  8. Yes, the data are available and will be uploaded with the revised manuscript.
  9. MAPE has been provided in tabular form in addition to the RMSE values for the different models.
  10. Although there are some past studies on the topic, the authors did not find literature that shares the same ground and would allow a fair comparison to quantify the difference from past studies.
  11. The ACF and PACF plots of the residuals are provided (Figure 5).
  12. The conclusions and discussion have been shortened.
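
To make the residual diagnostics mentioned in the response on ACF and PACF plots concrete, here is a minimal, dependency-free sketch of how the sample ACF and PACF of a residual series can be computed. It is not the authors' implementation: the function names and the toy series are assumptions, and the PACF uses the Durbin-Levinson recursion, which is one of several standard estimators.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelation via the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    values = [1.0]
    phi_prev = []  # coefficients phi_{k-1,1..k-1} from the previous step
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        values.append(phi_kk)
    return values


def white_noise_bound(n):
    """Approximate 95% band for the ACF of white noise: 1.96 / sqrt(n)."""
    return 1.96 / math.sqrt(n)


# Toy residual series, for illustration only.
residuals = [1.0, 2.0, 3.0, 4.0, 5.0]
print(acf(residuals, 2), pacf(residuals, 2), white_noise_bound(len(residuals)))
```

If a model has captured the temporal structure of the data, most residual autocorrelations at non-zero lags would be expected to fall inside the approximate band.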

Reviewer 3 Report

The authors performed an extensive performance analysis of statistical, machine learning, and deep learning-based methods for long-term solar power prediction. According to the literature on forecasting/prediction in the load/demand domain, prediction horizons are divided into long-, medium-, and short-term, where long-term ELF refers to predictions made for one or more years ahead, medium-term refers to predictions covering between one month and a year, and short-term normally refers to predictions of between seconds/minutes and a month ahead. However, the experiments are performed for 1, 3, 5, and 15 days; therefore, I suggest that the authors change the prediction horizon from long- to short-term. Additionally, in the current literature, hybrid models such as CNN-LSTM, LSTM-CNN, CNN-ESN, ESN-CNN, etc. achieved higher performance compared to solo methods for solar power prediction. I suggest that the authors report the results of the hybrid methods for a fair evaluation of each method in the given domain. Furthermore, the quality of the figures should be improved.

Author Response

Response to the reviewer’s comment

Thank you for your suggestions and time in helping improve the quality of the paper. The authors agree that some articles have considered long-term forecasting as forecasting covering one or more years. However, some recently published articles on solar power production have considered long-term forecasting as forecasting covering a prediction horizon of weeks to months. The reference for considering this study as long-term has been taken from the following Wikipedia page.

https://en.wikipedia.org/wiki/Solar_power_forecasting#:~:text=Solar%20power%20forecasting%20is%20the,grid%20and%20for%20power%20trading.

ARIMA-LSTM is utilized in the study as a hybrid model, and Random Forest as an ensemble model. The authors agree that models like CNN-LSTM, CNN-ESN, etc. perform better than a solo model. However, the authors have not incorporated many hybrid models in this study. The authors' aim is to see how the models' prediction accuracy varies as the input variables and prediction horizons vary. First, the comparison of three univariate models (ARIMA, SVR, LSTM) is presented in Section 4.1. Out of the three models, the best-performing model, LSTM, is considered for further analysis. Then, 6 different LSTM models (stacked, bidirectional, GRU, stacked GRU, Encoder-Decoder and multivariate LSTM) are tested using the multivariate input datasets. Here the stacked LSTM outperforms all other LSTM models for long-term forecasting (up to 15 days ahead). The authors' intention in incorporating the hybrid model is to see how the prediction accuracy of the best-performing model (LSTM) and the worst-performing model (ARIMA) changes when both models are merged. The reason for introducing the Random Forest model is that it is the most popular and widely used ensemble model for predicting wind speed and demand. The authors intend to see how well the model predicts intermittent solar power production. The authors have published a related study comparing models for forecasting wind speed with the introduction of novel variables through feature engineering. The authors plan to perform a similar study for solar power production and to incorporate cloud cover data, test different hybrid models, perform feature engineering, and utilize large-sample-size data as part of future studies.

https://www.mdpi.com/2076-3417/12/18/9038

The quality of the figures has been improved in the revised manuscript.

Round 2

Reviewer 1 Report

The authors have studied the reviewers' suggestions and critiques very carefully. The necessary corrections and additions have been made to the relevant sections and to the title of the research study. These corrections and additions seem sufficient. The study meets the conditions for acceptance.

Reviewer 2 Report

The authors addressed some of my concerns, and hence, I recommend it for publication in its present form.

Reviewer 3 Report

The authors addressed all my comments; therefore, I recommend the article for possible publication.
