Advanced Multivariate Models Incorporating Non-Climatic Exogenous Variables for Very Short-Term Photovoltaic Power Forecasting
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The core of this article is the study of very short-term (5 minutes) PV power prediction, focusing on how advanced multivariate models combined with non-climatic exogenous variables can improve the accuracy of PV power prediction. Although interesting, the manuscript shows several deficiencies that need to be addressed before publication. The details are as follows:
- It is mentioned that the data interpolation method is used to deal with missing data, but the description of the processing method of outliers is not detailed enough.
- The optimization process of model hyperparameters (such as learning rate, number of hidden layer units, dropout rate, etc.) is not described in detail in this paper.
- The study used data from only eight photovoltaic plants in the central region of Cuba. The geographic range of these data is relatively narrow and may not fully reflect the characteristics of PV power changes in the wider region.
- Figure 3 only illustrates the correlation between the prediction of model F and the actual value, but does not mention the specific characteristics of the data.
- The vertical axis of Figures 4 and 5 is not clearly marked with units.
- The article focuses on short-term prediction, but does not mention the predictive power of the model on longer time scales.
Author Response
23-03-2025
Title: Response to Reviewers, paper ADVANCED MULTIVARIATE MODELS INCORPORATING NON-CLIMATIC EXOGENOUS VARIABLES FOR VERY SHORT-TERM PHOTOVOLTAIC POWER FORECASTING
Dear Reviewer
We have reviewed your comments and recommendations to improve the paper's contribution. Thank you for your comments, which are addressed as follows:
Comment 1: It is mentioned that the data interpolation method is used to deal with missing data, but the description of the processing method of outliers is not detailed enough.
Previously, Lines 138-142. Now, Lines 172-189.
Original Text: "SCADA system malfunctions or information collection failures may cause data anomalies, anomalous data includes readings exceeding the maximum park capacity, or those repeated more than twice consecutively because of measurement freezes. A data imputation procedure replaces these values, treating them as missing data".
Modified text: SCADA system failures or information collection issues can lead to data anomalies. These anomalies include readings that exceed the maximum park capacity or values repeated more than twice consecutively because of measurement freezes. To address these issues, we implemented a two-step process to handle outliers and missing data. First, we identified outliers using the MATLAB Statistics Toolbox. The approach is based on the mean and standard deviation, with a threshold of 3 standard deviations from the mean, so that only extremely improbable values are flagged as outliers.
Second, to correct both identified outliers and actual missing data, we applied the Inverse Distance Weighting (IDW) method. This approach leverages the high spatial correlation between energy generation in neighboring parks within the same geographic region. Previous work validated the correlation between parks using a Pearson correlation coefficient analysis, yielding average values exceeding 0.85 in all cases. The IDW method assigns greater weight to data from parks closer to the point of interest and gradually decreases the weight as the distance increases. In our implementation, we considered a maximum radius of 150 km to select neighboring parks, ensuring that environmental and operational conditions are comparable.
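The anomaly rules and the three-standard-deviation threshold described above can be illustrated with a short sketch. The Python below is illustrative only and is not the authors' MATLAB implementation. The function name `flag_anomalies`, the use of the sample standard deviation, the choice to keep the first two readings of a frozen run, and the exclusion of zero-valued runs (assumed to be night hours) are all assumptions made for the example.

```python
from statistics import mean, stdev


def flag_anomalies(series, capacity, n_sigma=3.0, max_repeats=2):
    """Return one boolean per reading; True marks a value to be imputed."""
    mu = mean(series)
    sigma = stdev(series)  # sample standard deviation (assumption)
    flags = []
    for value in series:
        over_capacity = value > capacity
        improbable = sigma > 0 and abs(value - mu) > n_sigma * sigma
        flags.append(over_capacity or improbable)

    # Frozen measurements: the same non-zero value repeated more than
    # max_repeats times in a row. Zero runs are skipped on the assumption
    # that they are night hours rather than sensor freezes.
    run_start = 0
    for i in range(1, len(series) + 1):
        if i == len(series) or series[i] != series[run_start]:
            run_length = i - run_start
            if run_length > max_repeats and series[run_start] != 0:
                for j in range(run_start + max_repeats, i):
                    flags[j] = True
            run_start = i
    return flags


# Hypothetical 5-minute readings in kW for a park rated at 20 kW.
readings = [10, 12, 11, 11, 11, 11, 13, 50, 0, 0, 0, 0]
flags = flag_anomalies(readings, capacity=20)
flagged_positions = [i for i, f in enumerate(flags) if f]
```

In this toy series the third and fourth repeats of 11 kW and the 50 kW reading are flagged, while the run of zeros is left untouched. Flagged positions would then be treated as missing data and passed to the imputation step.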
Comment 2: The optimization process of model hyperparameters (such as learning rate, number of hidden layer units, dropout rate, etc.) is not described in detail in this paper.
Previously, Lines 286-291. Now, Lines 350-375.
Original Text: The researchers performed all analyses using Matlab® R2021b and developed regression models on an HP EliteBook 855 G8 Notebook PC laptop with an AMD Ryzen 7 PRO 5850U processor at 1,901 GHz and 16 GB of RAM, also developed several models using LSTM and BiLSTM networks to analyze how the incorporation of various inputs affects the forecast accuracy in the target park. These models include both univariate and multivariate versions.
Modified text: The researchers conducted all analyses using MATLAB® R2021b on an HP EliteBook 855 G8 laptop equipped with an AMD Ryzen 7 PRO 5850U processor (1.901 GHz) and 16 GB of RAM. To optimize the performance of the LSTM and BiLSTM models, a hyperparameter tuning process was carried out. This process involved evaluating multiple configurations of key hyperparameters, including the number of hidden layer units (3 to 153 with a step of 3), validation frequency (250 to 500 with a step of 50), mini-batch size (32, 64, 144, 288, and 432), and early stopping (3, 4, 5, 6, and 7 epochs without improvement). The configurations were selected based on their ability to minimize the prediction error metric (RMSE). The final architecture for the best-performing model (Model F) included 128 hidden units, a validation frequency of 500, a mini-batch size of 32, and early stopping after 6 epochs without improvement.
The neural network architecture processes sequences of data and generates a regression output. The configuration starts with a `sequenceInputLayer`, followed by an LSTM/BiLSTM layer configured to return only the output corresponding to the last time step (`'OutputMode', 'last'`). To regularize the weights and prevent overfitting, L2 regularization factors of 0.1 apply to the input weights (`InputWeightsL2Factor`), the recurrent weights (`RecurrentWeightsL2Factor`), and the biases (`BiasL2Factor`). The weight initializers follow the He heuristic (`'he'`), which helps speed up training in deep networks, while the bias is initialized using the `'unit-forget-gate'` strategy, commonly used in LSTM/BiLSTM networks to enhance training stability. A dropout layer with a rate of 0.25 follows the LSTM/BiLSTM layer to further reduce overfitting. The network continues with a fully connected layer (`fullyConnectedLayer`) that reduces the dimensionality to a single output, followed by a regression layer (`regressionLayer`) that enables the modeling of continuous prediction problems. This configuration combines advanced regularization and initialization techniques to ensure a balance between modeling capacity and generalization. In univariate models, a single input and output variable is used to predict a future instant within the time series. In contrast, the multivariate models incorporate the power generated in other parks in the region as exogenous variables, in addition to the series of the target park.
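To give a sense of the size of the search space implied by the ranges listed in the tuning paragraph, the sketch below enumerates them in Python. It is an illustration only: the response does not state whether the ranges were fully crossed or sampled, so the exhaustive grid is an assumption, and `select_best` with its `rmse` field is a hypothetical helper, not part of the authors' MATLAB code.

```python
from itertools import product

# Ranges as listed in the response.
hidden_units = list(range(3, 154, 3))            # 3, 6, ..., 153
validation_frequency = list(range(250, 501, 50))  # 250, 300, ..., 500
mini_batch_size = [32, 64, 144, 288, 432]
early_stopping_patience = [3, 4, 5, 6, 7]

# Assumption: every combination is a candidate configuration.
grid = list(product(hidden_units, validation_frequency,
                    mini_batch_size, early_stopping_patience))
n_configs = len(grid)


def select_best(results):
    """Pick the configuration with the lowest validation RMSE.

    `results` is a list of dicts with a 'config' tuple and an 'rmse' float
    (hypothetical structure for this sketch).
    """
    return min(results, key=lambda r: r["rmse"])
```

Under the full-grid assumption there are 51 × 6 × 5 × 5 candidate configurations, which is the kind of count that usually motivates early stopping or a coarser search.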
Comment 3: The study used data from only eight photovoltaic plants in the central region of Cuba. The geographic range of these data is relatively narrow and may not fully reflect the characteristics of PV power changes in the wider region.
Previously, Lines 260-267. Now, Lines 317-331.
Original Text: The aim of this study is to analyze the photovoltaic park in Yaguaramas, in the province of Cienfuegos, which will be called P1 Park. Besides the data from the P1 Park, there are a power series from seven other parks in the central region of Cuba, identified as Caguaguas (P2), Frigorífico (P3), Guasimal (P4), Marrero (P5), Mayajigüa-1 (P6), Mayajigüa-2 (P7) and Venegas (P8). Program adjustments used to take power measurements at 5-minute resolution and organize them annually. This study used data from 2021, comprising 105,120 power records per park, distributed in 288 daily records. The researchers used 75% of this data for training, 15% for validation, and the remaining 10% for testing
Modified text: The aim of this study is to analyze the Yaguaramas photovoltaic park, located in the province of Cienfuegos and referred to as Park P1. To enhance the robustness of the analysis, the research includes generation data from seven additional solar plants in the central region of Cuba: Caguaguas (P2), Frigorífico (P3), Guasimal (P4), Marrero (P5), Mayajigüa-1 (P6), Mayajigüa-2 (P7), and Venegas (P8). The selected plants span 8,040 km² (60 km × 134 km) within the same climatic region.
Researchers calculated Pearson’s correlation coefficient between the Yaguaramas park and each of the other parks. The results show a strong positive correlation between the power generated by the parks considered. In particular, the Yaguaramas park exhibits a correlation coefficient greater than 0.85 with each of the other parks.
Researchers collected data every 5 minutes throughout 2021, producing 105,120 energy records per plant, divided into 288 daily intervals. The analysis divided the dataset into training (75%), validation (15%), and test (10%) subsets to ensure appropriate model evaluation.
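For readers unfamiliar with the statistic, the Pearson coefficient between two power series can be computed as in the following sketch. The function name `pearson_r` and the sample values are hypothetical and serve only to illustrate the calculation; they are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical morning ramp (kW) at the target park and one neighbor.
p1 = [0.0, 120.0, 480.0, 900.0, 1400.0, 1750.0]
p2 = [0.0, 100.0, 510.0, 870.0, 1300.0, 1800.0]
r_p1_p2 = pearson_r(p1, p2)
```

A value close to 1 indicates that the two parks ramp up and down together, which is the property the multivariate models rely on.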
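The record counts quoted above follow directly from the sampling interval, as the short sketch below shows. The chronological ordering of the split is an assumption made for the example (the response does not say whether the subsets were taken in time order), and `chronological_split` is a hypothetical helper.

```python
RECORDS_PER_DAY = 24 * 60 // 5            # one record every 5 minutes
RECORDS_PER_YEAR = RECORDS_PER_DAY * 365  # 2021 is not a leap year


def split_sizes(n_records, train_pct=75, val_pct=15):
    """Sizes of the training, validation, and test subsets."""
    n_train = n_records * train_pct // 100
    n_val = n_records * val_pct // 100
    n_test = n_records - n_train - n_val
    return n_train, n_val, n_test


def chronological_split(series, train_pct=75, val_pct=15):
    """Split a series in time order (assumed; not stated in the response)."""
    n_train, n_val, _ = split_sizes(len(series), train_pct, val_pct)
    return (series[:n_train],
            series[n_train:n_train + n_val],
            series[n_train + n_val:])


n_train, n_val, n_test = split_sizes(RECORDS_PER_YEAR)
```

With 105,120 records per plant this gives 78,840 training, 15,768 validation, and 10,512 test records.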
Comment 4: Figure 3 only illustrates the correlation between the prediction of model F and the actual value, but does not mention the specific characteristics of the data.
Previously, Lines 357-358. Now, Lines 440-447.
Original Text: Figure 3 shows the correlation between the values measured and predicted by Model F.
Modified text: Figure 3 illustrates the correlation between the measured and predicted power values using Model F for the test dataset. The data represent a 5-minute time horizon, with predictions generated for both clear and cloudy days. Cloudless days exhibit more uniform power generation patterns because of constant solar irradiance, while cloudy days show abrupt fluctuations caused by intermittent cloud cover. The high correlation coefficient (r = 95.17%) shows that Model F effectively captures these variations, demonstrating its capability to adapt to different weather conditions.
Comment 5: The vertical axis of Figures 4 and 5 is not clearly marked with units.
The figures were revised to clarify the units of measurement. Originally, the horizontal axis showed the record index of the time series. This was replaced by the corresponding time.
Comment 6: The article focuses on the short-term prediction but does not mention the predictive power of the model on longer time scales.
Response
We agree that future studies should consider longer time scales. The authors consider that the model could be applied at longer time horizons, such as 30 minutes, 1 hour, or even longer. Models based on LSTM and BiLSTM can capture long-term dependencies in the time series, which makes them suitable for applications with extended prediction horizons. However, the accuracy of predictions may decrease as the time horizon increases because of the accumulation of errors and the increased uncertainty associated with changes in climatic and operational conditions. The following paragraphs were included in the introduction to show the importance of this type of forecast:
Experts increasingly recognize the importance of very-short-term forecasting in photovoltaic power generation, especially as integrating distributed energy resources (DERs) transforms power plant operations. Traditional power plants use consistent load data to control output. Modern systems, however, face irregular and variable generation because of environmental factors like fluctuating cloud cover. This volatility makes accurate forecasting crucial for effective grid management, particularly in implementing real-time controls and optimizing energy storage systems [15,16].
Countries like South Korea use 15-minute load data profiles, acknowledging that such short-term data is essential for grid operations and microgrid backup systems. Very short-term forecasts are vital in managing applications like PV-linked energy storage systems (ESS), where they inform the charging patterns in response to the unpredictable nature of PV generation. These forecasts support energy optimization initiatives like distributed conservation voltage reduction (CVR) that require precise generation data to maintain stability and efficiency within the power grid. Improving forecasts enhances the management and operation of both large-scale and microgrid EPSs [9,15].
Future research should evaluate the model’s performance over longer time horizons. This could include:
- Extending the prediction horizon: Test the model over longer time intervals, such as 30 minutes or 1 hour, to analyze its ability to maintain accuracy.
- Incorporating additional variables: Include climate variables (such as solar radiation, temperature, and cloud cover) in multivariate models to improve long-term predictions.
- Sensitivity analysis: Conduct a sensitivity analysis to identify how different factors, such as climate variability and spatial patterns, affect model accuracy over longer horizons.
- Comparison with other methods: Compare the performance of LSTM and BiLSTM models with other approaches, such as convolutional neural networks (CNN) or hybrid models, over longer time horizons.
Sincerely,
JORGE IVÁN SILVA ORTEGA
HERNAN HERNANDEZ HERRERA UNIVERSIDAD SIMÓN BOLIVAR
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Review of the Paper Titled: "Advanced multivariate models incorporating non-climatic exogenous variables for very short-term photovoltaic power forecasting"
This paper addresses a critical aspect of PV power forecasting. The paper is well written and I enjoyed reading it. Please allow me to offer the comments below.
Comments from the manuscript include:
- Within the abstract and the entire document, refrain from using “We”.
- In line 53 of the introduction, the paper states that “One of the most critical issues in EPS is frequency stability, which becomes more complex as renewable energy penetrates the grid. This challenge…” There is no related information within the abstract regarding this issue.
- Within the introduction, the paper compares the performance of models based on LSTM and BiLSTM network for 5 minutes. It is recommended in the explanation starting from line 88 to line 119, to include additional insight on the frequency stability which addresses the challenge mentioned within the introduction
- In Figure 1, the equation included in the Model Quality Verification is not clear
- In line 149, ensure you cite the equation number within the text, for example, the author can write: “… values calculated using equation 1 and 2”.
- In line 145, the author must explain the IDW to allow readers with limited experience to easily follow the paper analysis.
- Ensure all equations are cited with their numbers within the text prior to the equation insertion
- Define all equations variables
- It is recommended to add a section for discussion where the author highlights their novel works and includes the main comparison approach with and without the method deployed by the authors
Please try to use short and clear sentences
Author Response
23-03-2025
Title: Response to Reviewers, paper ADVANCED MULTIVARIATE MODELS INCORPORATING NON-CLIMATIC EXOGENOUS VARIABLES FOR VERY SHORT-TERM PHOTOVOLTAIC POWER FORECASTING
Dear Reviewer
We have reviewed your comments and recommendations to improve the paper's contribution. Thank you for your comments, which are addressed as follows:
Comment 1: Within the abstract and the entire document, refrain from using “We”.
The use of “We” was corrected throughout the document.
Comment 2: In line 53 of the introduction, the paper states that “One of the most critical issues in EPS is frequency stability, which becomes more complex as renewable energy penetrates the grid. This challenge…” There is no related information within the abstract regarding this issue.
The abstract was revised: This study explores advanced multivariate models that incorporate non-climatic exogenous variables for very short-term photovoltaic energy forecasting. By integrating historical energy data from multiple photovoltaic plants, the research aims to improve the prediction accuracy of a target plant while addressing critical challenges in electric power systems (EPS), such as frequency stability. Frequency stability becomes increasingly complex as renewable energy sources penetrate the grid because of their intermittent nature. To mitigate this challenge, precise forecasting of photovoltaic energy generation is essential for balancing supply and demand in real time. The performance of long short-term memory (LSTM) networks and bidirectional LSTM (BiLSTM) networks was compared over a 5-minute horizon. Including energy generation data from neighboring plants significantly improved prediction accuracy compared to univariate models. Among the models, multivariate BiLSTM showed superior performance, achieving a lower root-mean-square error (RMSE) and higher correlation coefficients. Quantile regression was applied to manage prediction uncertainty, providing robust confidence intervals. The results suggest that incorporating an exogenous power series effectively captures spatial correlations and enhances prediction accuracy. This approach offers practical benefits for optimizing grid management, reducing operational costs, improving the integration of renewable energy sources, and supporting frequency stability in power generation systems.
Comment 3: Within the introduction, the paper compares the performance of models based on LSTM and BiLSTM network for 5 minutes. It is recommended in the explanation starting from line 88 to line 119, to include additional insight on the frequency stability which addresses the challenge mentioned within the introduction.
Previously, Lines 88-119. Now, Lines 142-153.
Point 7 was added.
Modified text: 7. Improvement of Electric System Frequency Stability: "The ability to precisely predict photovoltaic generation at very short time horizons (5 minutes) allows for better management of electric system frequency stability. This aspect is particularly relevant given the growing challenge faced by power systems due to the increasing penetration of intermittent renewable energy sources. Accurate predictions enable the implementation of more effective control schemes to dynamically adjust conventional generation and complement solar generation fluctuations; optimize the operation of storage systems and other flexibility resources to maintain system stability; anticipate rapid variability events in photovoltaic generation, allowing preventive actions to avoid significant frequency deviations; and improve coordination among different generation sources to maintain real-time load-generation balance. This is especially important in systems with low inertia, where rapid fluctuations can have a more pronounced impact on system stability."
Comment 4: In Figure 1, the equation included in the Model Quality Verification is not clear.
Previously, Line 142. Now, Line 190.
A new figure was inserted.
Comment 5: In line 149, ensure you cite the equation number within the text, for example, the author can write: “… values calculated using equation 1 and 2”.
Previously, Lines 147-149. Now, Line 190.
Original Text: This method replaces the missing values in the power series of a park with values calculated using the following expression:
Modified text: The answer is given together with the answer to comment 6.
Comment 6: In line 145, the author must explain the IDW to allow readers with limited experience to easily follow the paper analysis.
Previously, Lines 145-147. Now, Lines 193-202.
Original Text: The research used the Inverse Distance Weighting (IDW) to impute the missing data in the time series, based on the expected high correlation between power generation in a target PV park and that of neighboring parks in the same region [22]. This method replaces the missing values in the power series of a park with values calculated using the following expression.
Modified text: The research used the Inverse Distance Weighting (IDW) method to impute missing data in the time series. IDW is a widely used spatial interpolation technique that estimates missing values based on the weighted average of known values from nearby locations. The weights are inversely proportional to the distance between the target location and the neighboring locations, meaning that closer points have a greater influence on the estimated value than farther ones. This approach is suitable for this study because it leverages the high correlation between energy generation in a target photovoltaic park and that of neighboring parks in the same region [32]. This method replaces the missing values in the power series of a park with values calculated using Equations (1) and (2).
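The weighted average described above can be sketched as follows. This is a generic IDW illustration in Python, not a reproduction of the manuscript's Equations (1) and (2), which are not quoted in this response. The distance exponent of 2, the function name `idw_estimate`, and the assumption that neighbor values are already expressed on a scale comparable to the target park are choices made for the example. The 150 km radius is the one mentioned in the response to Reviewer 1.

```python
def idw_estimate(neighbors, power=2.0, max_radius_km=150.0):
    """Estimate a missing reading from neighboring parks.

    neighbors: list of (distance_km, value) pairs for the same timestamp;
               value is None when the neighbor is also missing.
    Returns the inverse-distance-weighted average, or None if no neighbor
    inside the radius has a valid reading.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for distance_km, value in neighbors:
        if value is None or distance_km <= 0 or distance_km > max_radius_km:
            continue
        weight = 1.0 / distance_km ** power  # closer parks weigh more
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Hypothetical neighbors: (distance in km, normalized power at that time).
estimate = idw_estimate([(10.0, 2.0), (20.0, 4.0), (200.0, 9.0)])
```

In the example, the park at 200 km lies outside the radius and is ignored, and the park at 10 km carries four times the weight of the one at 20 km.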
Comment 7: Ensure all equations are cited with their numbers within the text prior to the equation insertion.
Corrections are made and the equations are cited with their numbers in the text before inserting them.
Comment 8: Define all equations variables.
Corrected. The article now includes the definitions of the variables in the equations.
Comment 9: It is recommended to add a section for discussion where the author highlights their novel works and includes the main comparison approach with and without the method deployed by the authors.
Section 4 is added.
4 Comparison with Previous Models and Contributions
The authors investigated the development of several models for the case study park [9,12], combining the Discrete Wavelet Transform (DWT) with artificial neural networks (feed-forward backpropagation and generalized regression neural networks) to decompose and reconstruct the time series of power and meteorological variables. Although these models achieved excellent results, their dependence on meteorological data and a more complex structure limited their precision. For example, the Hybrid Model in [9] presented a MAPE of 33.89%, an RMSE of 1.5 MW, and an error variance (σ²) of 0.32. In contrast, the models proposed in this article introduce significant simplification and improvement by employing Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM) neural networks, which integrate power data from other plants in the region as predictive variables. This innovation allowed for greater accuracy, with Models D, E, and F exhibiting RMSE values between 400.63 kW and 404.29 kW, MAPE values between 94.12% and 102.03%, and a correlation coefficient (r) between 95.08% and 95.17%. Specifically, Model F stood out for its reduced complexity (only two inputs) and superior accuracy metrics, with an RMSE of 400.63 kW, a MAPE of 102.03%, and a correlation coefficient of 95.17%, demonstrating a notable improvement compared to previous models.
To better understand the advantages of the proposed method, we conducted a comparative analysis between univariate and multivariate models. Univariate models, such as Models A, B, and C, are based only on historical energy data from the target plant (P1). While these models provide reasonable accuracy, they cannot capture broader spatial patterns and correlations present in the regional energy generation dynamics. For instance, Model C, the best-performing univariate model, achieved an RMSE of 407.24 kW and a correlation coefficient of 95.08%. However, the lack of external inputs explaining variability beyond the target plant’s immediate environment limited its performance.
In contrast, multivariate models, such as Models D, E, and F, incorporate power series from highly correlated neighboring PV plants. These models show superior predictive capabilities, particularly Model F, which achieved the lowest RMSE (400.63 kW) and the highest correlation coefficient (95.17%). Including exogenous variables allows Model F to more effectively capture spatial dependencies and temporal patterns, leading to more accurate and reliable predictions.
A key advantage of the proposed method is its robustness in handling scenarios with limited meteorological data. Traditional forecasting methods often rely heavily on weather forecasts or numerical weather prediction (NWP) models, which may not always be available or accurate. By focusing on power series data from neighboring plants, our method provides a practical solution for regions where meteorological information is scarce or unreliable.
The study highlights the importance of model selection and architecture design. Using BiLSTM networks in multivariate models (e.g., Model F) demonstrates their ability to capture bidirectional temporal dependencies, which are crucial for understanding past and future trends in PV energy generation. This contrasts with LSTM-based models, which process information only in the forward direction and may overlook important contextual relationships.
To underscore the impact of the proposed method, Table 3 compares the performance metrics of univariate and multivariate models. The results clearly show that incorporating exogenous variables significantly improves prediction accuracy. For example, Model F reduces the RMSE by approximately 6.61 kW compared to Model C while maintaining a comparable MAE and achieving a slightly higher correlation coefficient. These improvements highlight the value of integrating spatial correlations into forecasting models.
Last, the application of quantile regression to calculate confidence intervals further enhances the practical utility of the proposed method. By providing solid estimates of uncertainty, this technique enables grid operators to make informed decisions and prepare for various scenarios, especially during periods of high variability, such as cloudy days.
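The error metrics named in this comparison can be written out explicitly, and the quoted RMSE difference can be checked from the two reported values. The sketch below is a generic Python illustration with hypothetical function names; it is not the authors' evaluation code. The choice to skip zero-valued actual readings in the MAPE is an assumption made here, since the percentage error is undefined at zero and grows very large for near-zero power, which commonly happens at the edges of the day.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root-mean-square error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error over non-zero actual values (assumption)."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(terms) / len(terms)


# RMSE values reported in the response for Models C and F, in kW.
rmse_model_c = 407.24
rmse_model_f = 400.63
absolute_gain_kw = rmse_model_c - rmse_model_f          # about 6.61 kW
relative_gain_pct = 100.0 * absolute_gain_kw / rmse_model_c  # about 1.6 %
```

Expressed relative to Model C, the 6.61 kW reduction corresponds to roughly 1.6% of the univariate RMSE.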
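As background for the uncertainty estimates mentioned above, quantile regression is usually fitted by minimizing the pinball (quantile) loss, and the resulting interval can be checked by its empirical coverage. The following Python sketch shows both calculations. It is a generic illustration rather than the authors' procedure, and the function names and sample values are hypothetical.

```python
def pinball_loss(actual, predicted_quantile, tau):
    """Average pinball loss for quantile level tau (0 < tau < 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(actual, predicted_quantile):
        diff = y - q
        # Under-prediction is penalized by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(actual)


def interval_coverage(actual, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return inside / len(actual)


# Hypothetical measured power (kW) with lower and upper quantile forecasts.
measured = [1.0, 2.0, 3.0, 4.0]
lower_q = [0.0, 0.0, 0.0, 5.0]
upper_q = [2.0, 3.0, 2.0, 6.0]
coverage = interval_coverage(measured, lower_q, upper_q)
```

A well-calibrated interval built from, for example, the 5% and 95% quantiles would be expected to cover about 90% of the observations; wider intervals on cloudy days reflect the higher variability the response describes.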
In summary, the novel contributions of this study include:
- Introduction of non-climatic exogenous variables: Using power series from neighboring PV plants as inputs represents a significant advancement in very short-term forecasting.
- Greater precision through spatial correlations: Multivariate models outperform univariate models by capturing spatial dependencies and reducing prediction errors.
- Practical applicability: The proposed method is particularly valuable for regions with limited access to meteorological data, offering a reliable alternative to traditional NWP-based approaches.
- Improved uncertainty management: The integration of quantile regression provides an integrated framework for evaluating and managing prediction uncertainty.
Sincerely,
JORGE IVÁN SILVA ORTEGA
HERNAN HERNANDEZ HERRERA UNIVERSIDAD SIMÓN BOLIVAR
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
- The future work should be included in the conclusion rather than the abstract.
- The first paragraph of the introduction appears to be a reviewer’s comment or recommendation and should be removed. The authors need to thoroughly revise their manuscript before submission. The same issue is present in lines 314-315 on page 9.
- The literature review is insufficient and lacks discussion on previous work related to the paper's subject. A comprehensive and detailed literature review is necessary.
- Focus should be on recent research rather than older studies.
- The proposed method lacks clarity; diagrams and additional details should be included.
- The results presented are inadequate to demonstrate the accuracy of the proposed model.
- There is no comparison with previous models.
- There is an error in the axis label of Figure 4.
- The references section needs to be updated.
The English could be improved to more clearly express the research.
Author Response
23-03-2025
Title: Response to Reviewers, paper ADVANCED MULTIVARIATE MODELS INCORPORATING NON-CLIMATIC EXOGENOUS VARIABLES FOR VERY SHORT-TERM PHOTOVOLTAIC POWER FORECASTING
Dear Reviewer
We have reviewed your comments and recommendations to improve the paper's contribution. Thank you for your comments, which are addressed as follows:
Comment 1: The future work should be included in the conclusion rather than the abstract.
The recommendation was accepted. Future work was removed from the abstract and added to the conclusions.
Comment 2: The first paragraph of the introduction appears to be a reviewer’s comment or recommendation and should be removed. The authors need to thoroughly revise their manuscript before submission. The same issue is present in lines 314-315 on page 9.
The first paragraph of the introduction and lines 314-315 were removed.
Comment 3: The literature review is insufficient and lacks discussion on previous work related to the paper's subject. A comprehensive and detailed literature review is necessary.
The introduction was modified by incorporating new references and expanding the discussion in more detail to allow for a better understanding.
Comment 4: Focus should be on recent research rather than older studies.
The introduction was modified by incorporating recent references related to the research topic (added text is shown in red).
Modified text: The worldwide growth of renewable energy technologies has been driven by key factors, such as technological advances, supportive policies, and growing awareness of carbon emission reduction. These factors contribute to mitigating climate change impacts. Efforts around the world have resulted in the installation of over 600 GW of capacity, reflecting substantial advancements in renewable energy implementation [1].
Photovoltaic (PV) systems have become increasingly relevant for generating clean and sustainable electricity from sunlight. This rising adoption has notably reduced PV technology costs, making it more attractive for both residential and commercial uses. However, renewable energy sources like PV present significant challenges, especially because of their intermittent nature. One of the most critical issues in Electric Power Systems (EPS) is frequency stability, which becomes more complex as renewable energy penetrates the grid. This challenge requires advanced solutions to ensure a stable and reliable energy supply [2,3]. The inherent variability and reduction in EPS inertia caused by technologies such as solar and wind complicate maintaining stable grid operation [4,5]. When system inertia falls below a certain level, frequency becomes highly variable during transient periods caused by power imbalances [6–8], potentially leading to undesirable cascading failures of system components or even blackouts in large power systems [9–12].
Concerns about system stability are amplified as the penetration of these technologies increases, being a limiting factor for their effective EPS integration [13]. This challenge is even more relevant in island EPSs, which are often composed of low inertia generation units [2,3,14]. Recent research highlighted the importance of accurately forecasting short-term renewable energy generation.
Experts increasingly recognize the importance of very-short-term forecasting in photovoltaic power generation, especially as integrating distributed energy resources (DERs) transforms power plant operations. Traditional power plants use consistent load data to control output. Modern systems, however, face irregular and variable generation because of environmental factors like fluctuating cloud cover. This volatility makes accurate forecasting crucial for effective grid management, particularly in implementing real-time controls and optimizing energy storage systems [15,16].
Countries like South Korea use 15-minute load data profiles, acknowledging that such short-term data is essential for grid operations and microgrid backup systems. Very short-term forecasts are vital in managing applications like PV-linked energy storage systems (ESS), where they inform the charging patterns in response to the unpredictable nature of PV generation. These forecasts support energy optimization initiatives like distributed conservation voltage reduction (CVR) that require precise generation data to maintain stability and efficiency within the power grid. Improving forecasts enhances the management and operation of both large-scale and microgrid EPSs [9,15].
There is an increasing volume of publications focused on ultra-short-term and short-term forecasting methodologies. This trend appears to be driven by the rapid advancements and influence of emerging technologies, such as machine learning, big data analytics, and real-time data processing, which facilitate these shorter-term predictions [16,17].
Accurate forecasting of PV generation is essential for integrating renewable energy sources into the grid, and it supports the planning, scheduling, and operation of the EPS for a stable and efficient power supply [12,15]. This has led several countries to implement policies requiring mandatory forecasting of solar PV generation to enable effective energy management and cost minimization [12].
Previous research has seen a remarkable rise in the development of PV prediction models. These models use advanced techniques like Machine Learning and Deep Learning to enhance the accuracy of predictions [16–18]. The strategies used to predict other quantities in electrical systems are similar to those used for these models [19–21], and these techniques have demonstrated the most promising potential in terms of predictive accuracy [22]. Depending on the forecast horizon, models based on historical meteorological data or on numerical weather prediction (NWP) are used [23]. Also, integrating additional meteorological variables into multivariate models has been extensively explored to improve prediction accuracy [24–26].
Although the use of climate parameters as input variables plays an important role in supporting historical data for better accuracy, [16] shows that, in recently published literature, historical data emerge as the most significant and effective source for forecasting and prediction.
Despite these advances, previous works show gaps in integrating data acquisition, model development [27], uncertainty quantification [28], and adaptive learning mechanisms [29]. There is also a gap in the literature on using power series from other PV plants as variables to predict the output of a specific unit without relying solely on meteorological data [30]. This research aims to fill this gap by evaluating the use of time series data from multiple plants located in the same climatic region but far apart from each other, with the goal of predicting the output of a specific PV plant. This paper compares the performance of models based on long short-term memory (LSTM) and bidirectional LSTM (BiLSTM) networks for a 5-minute time horizon. These models offer significant advantages that justify their implementation and use in the planning and operation of PV systems. Some of these advantages include:
EPS Management optimization: PV generation can fluctuate rapidly because of changes in weather, such as cloudy conditions or variations in the intensity of solar radiation. Forecasting models with 5-minute resolution allow these fluctuations to be captured more accurately, helping utilities to match supply and demand more efficiently. This is especially important for integrating solar energy into power grids that must constantly balance production and consumption.
Improved equipment operation planning: Very short-term generation forecasts allow more accurate planning of the operation and maintenance of photovoltaic equipment. Power plant managers can anticipate and prepare for changes in production, optimizing the use of resources and minimizing unnecessary downtime.
Operating Costs reduction: With more accurate and frequent predictions, more effective control strategies can be implemented, such as dynamic adjustment of the power generated or proactive management of batteries and storage systems. This not only improves operational efficiency, but also reduces the costs associated with managing uncertainty in electricity generation.
Improved Reliability of Supply: The ability to anticipate changes in PV generation with high frequency allows utilities and power system operators to take preventive measures to ensure grid stability. This contributes to greater reliability in the power supply, reducing the risk of interruptions or failures in the network.
Optimized Renewable Energy Integration: Integrating photovoltaics into the EPS requires precise coordination between renewable generation and other energy sources. Very short-term prediction models provide crucial information for the efficient integration of solar energy, facilitating coordination with complementary generation sources and the implementation of demand management strategies.
Financial and Contractual Adjustments: For power sales contracts or power purchase agreements (PPAs), the ability to predict generation with high accuracy at short intervals allows for better financial planning and optimization in commercial agreements. This is essential for risk management and revenue maximization in competitive electricity markets.
Improvement of Electric System Frequency Stability: The ability to precisely predict photovoltaic generation at very short time horizons (5 minutes) allows for better management of electric system frequency stability. This aspect is relevant given the growing challenge faced by power systems because of the increasing penetration of intermittent renewable energy sources. Accurate predictions enable the implementation of more effective control schemes to dynamically adjust conventional generation and complement solar generation fluctuations; optimize the operation of storage systems and other flexibility resources to maintain system stability; anticipate rapid variability events in photovoltaic generation, allowing preventive actions to avoid significant frequency deviations; and improve coordination among different generation sources to maintain real-time load-generation balance. This is especially important in systems with low inertia, where rapid fluctuations can have a more pronounced impact on system stability.
Various lags of the power series were considered for model development and future value estimation. The main contributions of this study include the introduction of time series from other power plants as inputs in a multivariable model, the use of spatial interpolation to fill in missing data, and the application of inter-time series causality tests for the selection of predictor variables. In addition, the research analyzes the uncertainty associated with the predictions using quantile regression techniques [31], addressing different percentiles of the error distribution. This study intends to advance the predictive capacity of photovoltaic generation through a novel approach that can be crucial for the efficient planning and operation of electricity systems under a growing penetration of renewable energy when meteorological information is not available.
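As an illustration only, the two ideas named above (lagged power series from several plants as model inputs, and quantile-based uncertainty analysis) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the implementation used in the manuscript: the function names `make_lagged_windows` and `pinball_loss`, the (time, plants) array layout, and the number of lags are hypothetical choices made for the example.

```python
import numpy as np


def make_lagged_windows(power, target_col, n_lags):
    """Build supervised samples from a (time, plants) power matrix.

    X[i] holds the n_lags most recent 5-minute readings of every plant,
    and y[i] is the target plant's reading one step (5 minutes) later.
    The array layout and argument names are assumptions for this sketch.
    """
    power = np.asarray(power, dtype=float)
    n_steps = power.shape[0]
    if n_steps <= n_lags:
        raise ValueError("need more time steps than lags")
    # Each window covers rows t - n_lags .. t - 1; the label is row t.
    X = np.stack([power[t - n_lags:t] for t in range(n_lags, n_steps)])
    y = power[n_lags:, target_col]
    return X, y


def pinball_loss(y_true, y_pred, q):
    """Mean quantile (pinball) loss for a quantile level q in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))
```

The array `X` has shape (samples, lags, plants), which is the usual input layout for recurrent layers such as LSTM or BiLSTM. Evaluating `pinball_loss` at several quantile levels gives one loss value per percentile of the predictive distribution.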
Comment 5: The proposed method lacks clarity; diagrams and additional details should be included
Figure 1 was changed to give more details of the information flow and the steps of the methodology.
Comment 6: The results presented are inadequate to demonstrate the accuracy of the proposed model.
Comment 7: There is no comparison with previous models.
Section 4 was added to compare the current research with previous models developed by the authors in the same photovoltaic park under study.
Comment 8: There is an error in the axis label of Figure 4.
The error was fixed
Original Text: Photovoltaic gwneration power (MW)
Modified text: Photovoltaic generation power (MW)
Comment 9: The references section needs to be updated
The bibliography was updated with 10 new references from the last two years.
Original bibliography
- Haas, R.; Duic, N.; Auer, H.; Ajanovic, A.; Ramsebner, J.; Knapek, J.; Zwickl-Bernhard, S. The Photovoltaic Revolution Is on: How It Will Change the Electricity System in a Lasting Way. Energy 2023, 265, doi:10.1016/j.energy.2022.126351.
- Garcia, M.G.; Sanchez, Z.G.; Hernandez Herrera, H.; Cueto Cruz, J.A.G.; Silva Ortega, J.I.; Sánchez, G.C. Frequency Response Analysis under Faults in Weak Power Systems. International Journal of Electrical and Computer Engineering (IJECE) 2022, 12, 1077, doi:10.11591/ijece.v12i2.pp1077-1088.
- Gallego-Landera, Y.; Garcia-Sanchez, Z.; Casas-Fernandez, L.; Rivas-Arocha, Y. Impacto de La Implementación de Paneles Fotovoltaicos En El Sistema Eléctrico Cayo Santa María Impact of the Implementation of Photovoltaic Panels at Cayo Santa Maria Electric System. mayo/agosto 2017, 76–87.
- Saleh, S.A.; Ozkop, E.; Meng, R.J.; Sanchez, Z.G.; Betancourt, O.A.A. Selecting Locations and Sizes of Battery Storage Systems Based on the Frequency of the Center of Inertia and Principle Component Analysis. IEEE Trans Ind Appl 2020, 56, 1040–1051, doi:10.1109/TIA.2019.2960003.
- Betancourt, O.A.; Sanchez, Z.G.; Saleh, S.A.; Hill, E.F.; Zhao, X.; Sanchez, F.P. Battery Energy Storage Systems for Primary Frequency Regulation in Island Power Systems. In Proceedings of the Conference Record - Industrial and Commercial Power Systems Technical Conference; 2020; Vol. 2020-June.
- Yang, X.; Xu, M.; Xu, S.; Han, X. Day-Ahead Forecasting of Photovoltaic Output Power with Similar Cloud Space Fusion Based on Incomplete Historical Data Mining. Appl Energy 2017, 206, 683–696, doi:10.1016/j.apenergy.2017.08.222.
- Pierro, M.; Bucci, F.; De Felice, M.; Maggioni, E.; Moser, D.; Perotto, A.; Spada, F.; Cornaro, C. Multi-Model Ensemble for Day Ahead Prediction of Photovoltaic Power Generation. Solar Energy 2016, 134, 132–146, doi:10.1016/j.solener.2016.04.040.
- Ramsami, P.; Oree, V. A Hybrid Method for Forecasting the Energy Output of Photovoltaic Systems. Energy Convers Manag 2015, 95, 406–413, doi:10.1016/j.enconman.2015.02.052.
- Gómez Rodríguez, M.A.; Gómez Sarduy, J.R.; Lorenzo Ginori, J.V.; Fonte González, R.; García Sánchez, Z. Electrical Generation Forecast of Photovoltaic Systems. First Steps by Cuban Universities | Pronóstico de La Generación Eléctrica de Sistemas Fotovoltaicos. Un Inicio En Cuba Desde La Universidad. Universidad y Sociedad 2021, 13, 253–265.
- Li, Y.; He, Y.; Su, Y.; Shu, L. Forecasting the Daily Power Output of a Grid-Connected Photovoltaic System Based on Multivariate Adaptive Regression Splines. Appl Energy 2016, 180, 392–401, doi:10.1016/j.apenergy.2016.07.052.
- Eseye, A.T.; Zhang, J.; Zheng, D. Short-Term Photovoltaic Solar Power Forecasting Using a Hybrid Wavelet-PSO-SVM Model Based on SCADA and Meteorological Information. Renew Energy 2018, 118, 357–367, doi:10.1016/j.renene.2017.11.011.
- Fraga-Hurtado, I.; Gómez-Rodríguez, M.; Gómez-Sarduy, J.R.; García-Sánchez, Z. PREDICTION OF PHOTOVOLTAIC GENERATION USING DEEP LEARNING. Universidad y Sociedad 2023, 15, 266–275.
- Kouhi, S.; Khavaninzadeh, M.; Keynia, F. Applying Wavelet to Ann Based Short-Term Load Forecasting: A Case Study of Zanjan Power System. International Journal of Electrical And Electronics Engineering Research 2013, 3, 209–215.
- Sarduy, J.R.G.; Di Santo, K.G.; Saidel, M.A. Linear and Non-Linear Methods for Prediction of Peak Load at University of São Paulo. Measurement (Lond) 2016, 78, 187–201, doi:10.1016/j.measurement.2015.09.053.
- Peña-Acción, J.A.; Viego-Felipe, P.R.; Gómez-Sarduy, J.R.; Padrón-Padrón, A.E. Peak Load Forecasting for Energy Management at Cienfuegos University. Universidad y Sociedad 2019, 11, 220–228.
- Muñoz-Jiménez, A. Modelos de Predicción a Corto Plazo de La Generación Eléctrica En Instalaciones Fotovoltaicas. Doctoral, Universidad de la Rioja, 2015.
- Succetti, F.; Rosato, A.; Araneo, R.; Panella, M. Deep Neural Networks for Multivariate Prediction of Photovoltaic Power Time Series. IEEE Access 2020, 8, 211490–211505, doi:10.1109/ACCESS.2020.3039733.
- Rahman, N.H.A.; Hussin, M.Z.; Sulaiman, S.I.; Hairuddin, M.A.; Saat, E.H.M. Univariate and Multivariate Short-Term Solar Power Forecasting of 25MWac Pasir Gudang Utility-Scale Photovoltaic System Using LSTM Approach. Energy Reports 2023, 9, 387–393, doi:10.1016/j.egyr.2023.09.018.
- Hossain, M.S.; Mahmood, H. Short-Term Photovoltaic Power Forecasting Using an LSTM Neural Network and Synthetic Weather Forecast. IEEE Access 2020, 8, 172524–172533, doi:10.1109/ACCESS.2020.3024901.
- Zhen, H.; Niu, D.; Wang, K.; Shi, Y.; Ji, Z.; Xu, X. Photovoltaic Power Forecasting Based on GA Improved Bi-LSTM in Microgrid without Meteorological Information. Energy 2021, 231, doi:10.1016/j.energy.2021.120908.
- Massaoudi, M.; Chihi, I.; Sidhom, L.; Trabelsi, M.; Refaat, S.S.; Abu-Rub, H.; Oueslati, F.S. An Effective Hybrid NARX-LSTM Model for Point and Interval PV Power Forecasting. IEEE Access 2021, 9, 36571–36588, doi:10.1109/ACCESS.2021.3062776.
- Crespo-Turrado, M.C. imputación de Datos Faltantes En Redes de Distribución de Baja tensión Aplicación a Edificios de Pública Concurrencia. Tesis Doctoral, Universidad de Oviedo: Oviedo, 2018.
Updated bibliography
- Haas, R.; Duic, N.; Auer, H.; Ajanovic, A.; Ramsebner, J.; Knapek, J.; Zwickl-Bernhard, S. The Photovoltaic Revolution Is on: How It Will Change the Electricity System in a Lasting Way. Energy 2023, 265, doi:10.1016/j.energy.2022.126351.
- Garcia, M.G.; Sanchez, Z.G.; Hernandez Herrera, H.; Cueto Cruz, J.A.G.; Silva Ortega, J.I.; Sánchez, G.C. Frequency Response Analysis under Faults in Weak Power Systems. International Journal of Electrical and Computer Engineering (IJECE) 2022, 12, 1077, doi:10.11591/ijece.v12i2.pp1077-1088.
- Gallego-Landera, Y.; Garcia-Sanchez, Z.; Casas-Fernandez, L.; Rivas-Arocha, Y. Impacto de La Implementación de Paneles Fotovoltaicos En El Sistema Eléctrico Cayo Santa María Impact of the Implementation of Photovoltaic Panels at Cayo Santa Maria Electric System. mayo/agosto 2017, 76–87.
- Saleh, S.A.; Ozkop, E.; Meng, R.J.; Sanchez, Z.G.; Betancourt, O.A.A. Selecting Locations and Sizes of Battery Storage Systems Based on the Frequency of the Center of Inertia and Principle Component Analysis. IEEE Trans Ind Appl 2020, 56, 1040–1051, doi:10.1109/TIA.2019.2960003.
- Betancourt, O.A.; Sanchez, Z.G.; Saleh, S.A.; Hill, E.F.; Zhao, X.; Sanchez, F.P. Battery Energy Storage Systems for Primary Frequency Regulation in Island Power Systems. In Proceedings of the Conference Record - Industrial and Commercial Power Systems Technical Conference; 2020; Vol. 2020-June.
- Hasan, A.K.M.K.; Haque, M.H.; Mahfuzul Aziz, S. Enhancing Frequency Response Characteristics of Low Inertia Power Systems Using Battery Energy Storage. IEEE Access 2024, 12, 116861–116874, doi:10.1109/ACCESS.2024.3444330.
- Criollo, A.; Minchala-Avila, L.I.; Benavides, D.; Arévalo, P.; Tostado-Véliz, M.; Sánchez-Lozano, D.; Jurado, F. Enhancing Virtual Inertia Control in Microgrids: A Novel Frequency Response Model Based on Storage Systems. Batteries 2024, 10, doi:10.3390/batteries10010018.
- Babu, V.V.; Roselyn, J.P.; Nithya, C.; Sundaravadivel, P. Development of Grid-Forming and Grid-Following Inverter Control in Microgrid Network Ensuring Grid Stability and Frequency Response. Electronics (Switzerland) 2024, 13, doi:10.3390/electronics13101958.
- Yang, X.; Xu, M.; Xu, S.; Han, X. Day-Ahead Forecasting of Photovoltaic Output Power with Similar Cloud Space Fusion Based on Incomplete Historical Data Mining. Appl Energy 2017, 206, 683–696, doi:10.1016/j.apenergy.2017.08.222.
- Lee, J.; Kang, J.; Lee, S.; Oh, H.-M. Ultra-Short Term Photovoltaic Generation Forecasting Based on Data Decomposition and Customized Hybrid Model Architecture. IEEE Access 2024, 12, 20840–20853, doi:10.1109/ACCESS.2024.3362234.
- Gupta, M.; Arya, A.; Varshney, U.; Mittal, J.; Tomar, A. A Review of PV Power Forecasting Using Machine Learning Techniques. Progress in Engineering Science 2025, 2, 100058, doi:10.1016/j.pes.2025.100058.
- Pierro, M.; Bucci, F.; De Felice, M.; Maggioni, E.; Moser, D.; Perotto, A.; Spada, F.; Cornaro, C. Multi-Model Ensemble for Day Ahead Prediction of Photovoltaic Power Generation. Solar Energy 2016, 134, 132–146, doi:10.1016/j.solener.2016.04.040.
- Ramsami, P.; Oree, V. A Hybrid Method for Forecasting the Energy Output of Photovoltaic Systems. Energy Convers Manag 2015, 95, 406–413, doi:10.1016/j.enconman.2015.02.052.
- Gomez-Sarduy, J.; Lorenzo-Ginori, J.; Garcia-Sanchez, Z. Electrical Generation Forecast of Photovoltaics Systems. First Steps by Cuban Universities. Universidad y Sociedad 2021, 13, 253–265.
- Li, Y.; He, Y.; Su, Y.; Shu, L. Forecasting the Daily Power Output of a Grid-Connected Photovoltaic System Based on Multivariate Adaptive Regression Splines. Appl Energy 2016, 180, 392–401, doi:10.1016/j.apenergy.2016.07.052.
- Eseye, A.T.; Zhang, J.; Zheng, D. Short-Term Photovoltaic Solar Power Forecasting Using a Hybrid Wavelet-PSO-SVM Model Based on SCADA and Meteorological Information. Renew Energy 2018, 118, 357–367, doi:10.1016/j.renene.2017.11.011.
- Fraga-Hurtado, I.; Gómez-Rodríguez, M.; Gómez-Sarduy, J.R.; García-Sánchez, Z. Prediction of Photovoltaic Generation Using Deep Learning. Universidad y Sociedad 2023, 15, 266–275.
- Vargas Cordero, Z.R. La Investigación Aplicada: Una Forma de Conocer Las Realidades Con Evidencia Científica. Revista Educación 2009, 33, 155, doi:10.15517/revedu.v33i1.538.
- Kouhi, S.; Khavaninzadeh, M.; Keynia, F. Applying Wavelet to Ann Based Short-Term Load Forecasting: A Case Study of Zanjan Power System. International Journal of Electrical And Electronics Engineering Research 2013, 3, 209–215.
- Sarduy, J.R.G.; Di Santo, K.G.; Saidel, M.A. Linear and Non-Linear Methods for Prediction of Peak Load at University of São Paulo. Measurement (Lond) 2016, 78, 187–201, doi:10.1016/j.measurement.2015.09.053.
- Peña-Acción, J.A.; Viego-Felipe, P.R.; Gómez-Sarduy, J.R.; Padrón-Padrón, A.E. Peak Load Forecasting for Energy Management at Cienfuegos University. Universidad y Sociedad 2019, 11, 220–228.
- Husein, S.M.; Gago, E.J.; Hasan, B.; Pegalajar, M.C. Towards Energy Efficiency: A Comprehensive Review of Deep Learning-Based Photovoltaic Power Forecasting Strategies. Heliyon 2024, 10, doi:10.1016/j.heliyon.2024.e33419.
- Muñoz-Jiménez, A. Modelos de Predicción a Corto Plazo de La Generación Eléctrica En Instalaciones Fotovoltaicas. Doctoral, Universidad de la Rioja, 2015.
- Succetti, F.; Rosato, A.; Araneo, R.; Panella, M. Deep Neural Networks for Multivariate Prediction of Photovoltaic Power Time Series. IEEE Access 2020, 8, 211490–211505, doi:10.1109/ACCESS.2020.3039733.
- Rahman, N.H.A.; Hussin, M.Z.; Sulaiman, S.I.; Hairuddin, M.A.; Saat, E.H.M. Univariate and Multivariate Short-Term Solar Power Forecasting of 25MWac Pasir Gudang Utility-Scale Photovoltaic System Using LSTM Approach. Energy Reports 2023, 9, 387–393, doi:10.1016/j.egyr.2023.09.018.
- Hossain, M.S.; Mahmood, H. Short-Term Photovoltaic Power Forecasting Using an LSTM Neural Network and Synthetic Weather Forecast. IEEE Access 2020, 8, 172524–172533, doi:10.1109/ACCESS.2020.3024901.
- Rosso, A.P.; Rampinelli, G.A.; Schaeffer, L. Experimental Development of a Method of Short and Medium-Term Photovoltaic Generation Forecasting Using Multivariate Statistics and Mathematical Modeling. Energy Reports 2024, 12, 1710–1722, doi:10.1016/j.egyr.2024.07.058.
- Ren, X.; Zhang, F.; Yan, J.; Liu, Y. A Novel Convolutional Neural Net Architecture Based on Incorporating Meteorological Variable Inputs into Ultra-Short-Term Photovoltaic Power Forecasting. Sustainability (Switzerland) 2024, 16, doi:10.3390/su16072786.
- Al-Dahidi, S.; Madhiarasan, M.; Al-Ghussain, L.; Abubaker, A.M.; Ahmad, A.D.; Alrbai, M.; Aghaei, M.; Alahmer, H.; Alahmer, A.; Baraldi, P.; et al. Forecasting Solar Photovoltaic Power Production: A Comprehensive Review and Innovative Data-Driven Modeling Framework. Energies (Basel) 2024, 17, doi:10.3390/en17164145.
- Zhen, H.; Niu, D.; Wang, K.; Shi, Y.; Ji, Z.; Xu, X. Photovoltaic Power Forecasting Based on GA Improved Bi-LSTM in Microgrid without Meteorological Information. Energy 2021, 231, doi:10.1016/j.energy.2021.120908.
- Massaoudi, M.; Chihi, I.; Sidhom, L.; Trabelsi, M.; Refaat, S.S.; Abu-Rub, H.; Oueslati, F.S. An Effective Hybrid NARX-LSTM Model for Point and Interval PV Power Forecasting. IEEE Access 2021, 9, 36571–36588, doi:10.1109/ACCESS.2021.3062776.
- Crespo-Turrado, M.C. Imputacion de Datos Faltantes En Redes de Distribucion de Baja Tension Aplicacion a Edificios de Publica Concurrencia. Tesis Doctoral, Universidad de Oviedo: Oviedo, 2018.
Sincerely,
JORGE IVÁN SILVA ORTEGA
HERNAN HERNANDEZ HERRERA UNIVERSIDAD SIMÓN BOLIVAR
Author Response File: Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
- In section 4, "Comparison with Previous Models and Contributions," Table 3, which provides the comparative results, is missing and should be incorporated into the revised manuscript.
- The resolution of the figures is low and should be improved.
Author Response
29-03-2025
Title: Response to Reviewers for the paper ADVANCED MULTIVARIATE MODELS INCORPORATING NON-CLIMATIC EXOGENOUS VARIABLES FOR VERY SHORT-TERM PHOTOVOLTAIC POWER FORECASTING
Dear Reviewer
We have reviewed your comments and recommendations to enhance the paper's contribution. Thank you for your comments, which are addressed as follows:
- Comment 01 (figures): All figures have been updated to improve their quality.
- Comment 02 (error in table 3): The erroneous reference to Table 3 has been corrected. The table that should be referenced is Table 2.
Sincerely,
JORGE IVÁN SILVA ORTEGA
HERNAN HERNANDEZ HERRERA UNIVERSIDAD SIMÓN BOLIVAR
Author Response File: Author Response.pdf