Article
Peer-Review Record

Comparative Study of Feature Selection Techniques for Machine Learning-Based Solar Irradiation Forecasting to Facilitate the Sustainable Development of Photovoltaics: Application to Algerian Climatic Conditions

Sustainability 2025, 17(14), 6400; https://doi.org/10.3390/su17146400
by Said Benkaciali 1, Gilles Notton 2,* and Cyril Voyant 3
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 30 May 2025 / Revised: 24 June 2025 / Accepted: 10 July 2025 / Published: 12 July 2025
(This article belongs to the Section Energy Sustainability)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript studies solar irradiation forecasting using machine learning. For two sites in different climatic zones, Ghardaïa in an arid desert region and Algiers in a Mediterranean region, the authors evaluate eight feature selection methods, i.e., Pearson, Spearman, Mutual Information, LASSO, SHAP (GB and RF), and RFE (GB and RF), using a Gradient Boosting model over forecast horizons from one to six hours ahead. They calculate the normalized Mean Absolute Error (nMAE) and normalized Root Mean Square Error (nRMSE) of each method and find that most methods agree on the central role of GHIt (irradiance at forecast launch) for short-term horizons and on the growing importance of periodic variables such as VO1t and VO2t for longer horizons. The manuscript gives some interesting results, and I recommend that it be accepted for publication once the issues outlined below are addressed.
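
As an illustration of the two error metrics named in this report, a minimal Python sketch is given here. It assumes normalization by the mean of the observed values, which is one common convention; the report does not state which normalization the manuscript uses. The function names and the sample values are hypothetical.

```python
import math


def nmae(observed, predicted):
    """Mean absolute error divided by the mean observation (assumed convention)."""
    n = len(observed)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return mae / (sum(observed) / n)


def nrmse(observed, predicted):
    """Root mean square error divided by the mean observation (assumed convention)."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return math.sqrt(mse) / (sum(observed) / n)


# Made-up hourly irradiation values, for illustration only.
observed = [400.0, 500.0, 600.0, 500.0]
predicted = [380.0, 530.0, 590.0, 520.0]

nmae_value = nmae(observed, predicted)    # MAE = 20, mean observation = 500
nrmse_value = nrmse(observed, predicted)  # RMSE = sqrt(450), mean observation = 500

print(f"nMAE  = {nmae_value:.4f}")
print(f"nRMSE = {nrmse_value:.4f}")
```

Because squaring weights large errors more heavily, nRMSE is never smaller than nMAE on the same data when both use the same normalization.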

 

  1. In Figures 4-8, many of the values for the different variables are very small. Plotting the y-axis on a logarithmic scale would make these values visible.
  2. In Section 3.1, the Pearson and Spearman correlation coefficients are listed in Tables 3-4, and these coefficients are between 1 and 14. What does this mean? It would be better to show an example of an input x and an output y here, together with their Pearson and Spearman correlation coefficients. Also, from the manuscript I understand that the smaller the coefficient value, the better the method. Is that right? Please explain the reason. The same issue applies to Tables 5-10 for the other methods.
  3. Table 12 is very important for the manuscript, since the authors draw their conclusions from it. However, I could not find in the manuscript how these error values are calculated. Table 12 would be easier to understand if the authors showed in the text an example of how one of the error values is obtained. nMAE and nRMSE are generally used to evaluate a method, and the smaller they are, the more accurate the method. In Table 12, however, the nMAE and nRMSE values at each time are so small that it is hard to conclude which method is best. This should be discussed in the text. One conclusion that can be drawn from Table 12 is that the errors increase with time, indicating that the forecasting becomes less accurate.
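
Comment 2 above asks for an example of an input x and an output y with their Pearson and Spearman coefficients. A minimal sketch of such an example is given here, in plain Python with hypothetical function names and made-up data; it is not taken from the manuscript. Both coefficients lie in [-1, 1], and a larger absolute value indicates a stronger association.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up data: y grows monotonically with x, but not linearly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)

print(f"Pearson  = {r_pearson:.4f}")
print(f"Spearman = {r_spearman:.4f}")
```

In this example the Spearman coefficient reaches 1 because the relationship is perfectly monotonic, while the Pearson coefficient stays below 1 because the relationship is not linear.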

Author Response

Dear Reviewer,

Thank you very much for your review and for the time you spent on this work.

We have tried to answer all your remarks and questions.

You will be able to find the answers to your review in blue in the revised version of the paper.

The answers can also be found in the attached file under Reviewer 1.

We hope our answers satisfy you.

Thank you again.

With kind regards

Prof Gilles Notton and the co-authors

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors
  • Consider citing the articles below for insights on ML-based forecasting under uncertainty in complex systems and for relevant context on solar energy system design. Both works align with the study's themes of prediction and renewable energy:
    1. Jebbor, I., Benmamoun, Z., & Hachimi, H. (2024). Forecasting supply chain disruptions in the textile industry using machine learning: A case study. Ain Shams Engineering Journal, 15(12), 103116.
    2. Haqqi, M., Benmamoun, Z., Hachimi, H., Raouf, Y., Jebbor, I., & Akikiz, M. (2023, October). Renewable and Sustainable Energy: Solar Energy and Electrical System Design. In 2023 9th International Conference on Optimization and Applications (ICOA) (pp. 1-6). IEEE.

Author Response

Dear Reviewer,

Thank you very much for your review and for the time you spent on this work.

We have tried to answer all your remarks and questions.

You will be able to find the answers to your review in green in the revised version of the paper.

The answers can also be found in the attached file under Reviewer 2.

We hope our answers satisfy you.

Thank you again.

With kind regards

Prof Gilles Notton and the co-authors

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This study explores in depth the impact of feature selection on the accuracy of solar radiation prediction. The manuscript is relatively complete in content and methods, with a reasonable research design and well-analyzed results. However, there is still room for improvement in the dataset, methodology, experimental design, interpretation of results, and literature review. Future research can explore these issues in greater depth to improve the accuracy and generality of the results. I therefore hope the authors will revise the paper in light of the following review comments.

  1. Introduction: on the one hand, the literature review could be streamlined to highlight the unique contributions and innovations of this manuscript. On the other hand, although the literature review mentions some relevant recent research, it may not fully cover the latest developments in this field, which may leave the study disconnected from the latest trends in certain respects.
  2. The data cover only two sites in Algeria, so the generalization of the results needs to be verified.
  3. Deep learning methods (such as attention mechanisms) are not considered for feature selection.
  4. The economic benefits for actual energy dispatch based on the predicted results are not quantified.
  5. The statistical information on the dataset should be clearer. Provide more specific statistics so that readers can understand the distribution characteristics of the data.
  6. Limitations of geographical and climatic conditions: the paper uses data from only two cities in Algeria (Ghardaïa and Algiers), whose climatic conditions, although distinct, are not sufficient to represent global climate diversity. The generality of the research results may therefore be limited.
  7. Quantify the significance of the differences between the feature selection methods, through statistical testing or other means, to make the analysis more persuasive.
  8. In-depth exploration of the reasons: taking into account external factors such as climate characteristics, explore in depth why the feature selection methods perform differently.
  9. Optimize the explanation of terminology: provide more accessible explanations for some technical terms to improve the readability of the article.
  10. The authors used 4 years of data, which may be sufficient to capture seasonal and interannual variations but may not be sufficient to predict certain long-term trends or extreme climate events.
  11. The authors compared eight feature selection methods but did not cover all possible methods, such as deep-learning-based feature selection. This may lead to overlooking the potential advantages of certain advanced methods.
  12. Line 167: the word 'below' is used. I suggest replacing it with a specific location.
  13. The design of Table 2 is too simple. I recommend enriching its design and content.
  14. This manuscript mainly uses the Gradient Boosting model for prediction. Although this method has shown good performance in many fields, using a single model may limit the diversity and accuracy of the results. In the future, combining multiple models could be considered to improve robustness.
  15. During the feature selection process, certain variables (such as meteorological variables) are frequently excluded, which may reflect subjective judgments based on the experimental results. However, these variables may have a significant impact on the predictions in specific contexts, so excluding them may lead to information loss.
  16. Although the manuscript demonstrates the performance of different feature selection methods at different prediction time scales, it lacks an in-depth discussion of the reasons and mechanisms behind the results. For example, why do certain methods perform better on certain time scales? What are the advantages and limitations of these methods?
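
Comment 7 above asks for a statistical test of the differences between feature selection methods. One possible approach, sketched here as an assumption and not as the method used in the manuscript, is an exact paired sign-flip permutation test on the error differences between two methods. The function name and the error values are hypothetical.

```python
from itertools import product


def paired_permutation_pvalue(errors_a, errors_b):
    """Two-sided exact sign-flip permutation test on paired error differences.

    Under the null hypothesis that the two methods perform equally, each
    paired difference is as likely to be positive as negative, so every
    combination of sign flips is enumerated and compared with the observed sum.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    observed = abs(sum(diffs))
    at_least_as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        statistic = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point rounding.
        if statistic >= observed - 1e-12:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


# Made-up nMAE values for two hypothetical methods at six forecast horizons.
errors_method_a = [0.10, 0.15, 0.20, 0.24, 0.28, 0.31]
errors_method_b = [0.09, 0.13, 0.19, 0.22, 0.25, 0.30]

p_value = paired_permutation_pvalue(errors_method_a, errors_method_b)
print(f"p-value = {p_value:.5f}")
```

The enumeration grows as 2^n, so this exact form only suits a small number of pairs. The test also treats the pairs as exchangeable; forecast errors are often serially correlated, in which case a test designed for forecast comparison, such as the Diebold-Mariano test, may be more appropriate.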

 

Comments on the Quality of English Language

The English grammar, vocabulary, and sentence structure need to conform to the usage habits of European and American readers.

Author Response

Dear Reviewer,

Thank you very much for your review and for the time you spent on this work.

We have tried to answer all your remarks and questions.

You will be able to find the answers to your review in brown in the revised version of the paper.

The answers can also be found in the attached file under Reviewer 3.

We hope our answers satisfy you.

Thank you again.

With kind regards

Prof Gilles Notton and the co-authors

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

You have systematically addressed my review comments, and I believe the manuscript now meets the publication requirements of Sustainability. I therefore agree to the publication of the manuscript in Sustainability.
