1. Introduction
The worldwide growth of renewable energy technologies has been driven by key factors such as technological advances, supportive policies, and growing awareness of the need to reduce carbon emissions. These factors contribute to mitigating climate change impacts. Efforts around the world have resulted in the installation of more than 600 GW of capacity, reflecting substantial advancements in renewable energy implementation [1].
Photovoltaic (PV) systems have become increasingly relevant for generating clean and sustainable electricity from sunlight. The growing use of photovoltaic technology has led to significant cost reductions, making it attractive and increasing demand for both residential and commercial applications. However, renewable energy sources like PV present significant challenges, especially because of their intermittent nature. One of the most critical issues in electric power systems (EPSs) is frequency stability, which becomes more complex as renewable energy penetrates the grid. This challenge requires advanced solutions to ensure a stable and reliable energy supply [2,3]. The inherent variability and the reduction in EPS inertia caused by technologies such as solar and wind complicate maintaining stable grid operation [4,5]. When system inertia falls below a certain level, maintaining frequency during transient periods caused by power imbalances becomes highly variable [6,7,8], potentially leading to undesirable cascading failures of system components or even blackouts in large power systems [9,10,11,12].
Concerns about system stability are amplified as the penetration of these technologies increases, becoming a limiting factor for their effective integration into the EPS [13]. This challenge is even more relevant in island EPSs, which are often composed of low-inertia generation units [2,3,14]. Recent research highlights the importance of accurately forecasting short-term renewable energy generation.
Experts increasingly recognize the importance of very short-term forecasting in photovoltaic power generation, especially as the integration of distributed energy resources (DERs) transforms power plant operations. Traditional power plants use consistent load data to control output. Modern systems, however, face irregular and variable generation because of environmental factors like fluctuating cloud cover. This volatility makes accurate forecasting crucial for effective grid management, particularly for implementing real-time controls and optimizing energy storage systems [15,16].
Countries including South Korea use 15 min load data profiles, acknowledging that such short-term data are essential for grid operations and microgrid backup systems. Very short-term forecasts are vital in managing applications like PV-linked energy storage systems (ESSs), where they inform the charging patterns in response to the unpredictable nature of PV generation. These forecasts support energy optimization initiatives like distributed conservation voltage reduction (CVR) that require precise generation data to maintain stability and efficiency within the power grid. Improving forecasts enhances the management and operation of both large-scale and microgrid EPSs [9,15].
There is an increasing volume of publications focused on ultra-short-term and short-term forecasting methodologies. This trend appears to be driven by the rapid advancements and influence of emerging technologies, such as machine learning, big data analytics, and real-time data processing, which facilitate these shorter-term predictions [16,17].
Accurate forecasting of PV generation is essential for integrating renewable energy sources into the grid. It enhances the integration of these sources while optimizing the planning, scheduling, and operation of the EPS for a stable and efficient power supply [12,15]. This has led several countries to implement policies requiring mandatory forecasting of solar PV generation to enable effective energy management and cost minimization [12].
Previous research has seen a remarkable rise in the creation of PV prediction models. These models use advanced techniques such as machine learning and deep learning to enhance the accuracy of predictions [16,17,18]. The strategies used to predict quantities in electrical systems are similar to those used in the models of [19,20,21], and these techniques have demonstrated the most promising potential in terms of predictive accuracy [22]. Depending on the forecast horizon, models are based either on historical meteorological data or on numerical weather prediction (NWP) [23]. In addition, integrating additional meteorological variables into multivariate models has been extensively explored to improve prediction accuracy [24,25,26].
Although climate parameters as input variables play an important role in complementing historical data for better accuracy, it is demonstrated in [16] that, in the recently published literature, historical data emerge as the most significant and effective source for forecasting and prediction.
Despite these advances, previous works evidenced gaps in integrating data acquisition, model development [27], uncertainty quantification [28], and adaptive learning mechanisms [29], and there is a gap in the literature on using power series from other PV plants as variables to predict the output of a specific unit without relying solely on meteorological data [30]. This research aims to fill this gap by evaluating the use of time series data from multiple plants located in the same climatic region but far apart from each other. The goal is to predict the output of a specific PV plant using this innovative approach. This paper compares the performance of models based on long short-term memory (LSTM) and bidirectional LSTM (BiLSTM) networks for a 5 min time horizon. These models offer significant advantages that justify their implementation and use in the planning and operation of PV systems, including the following.
EPS management optimization: PV generation can fluctuate rapidly because of changes in weather, such as cloudy conditions or variations in the intensity of solar radiation. Forecasting models with 5 min resolution allow these fluctuations to be captured more accurately, helping utilities to match supply and demand more efficiently. This is especially important for integrating solar energy into power grids that must constantly balance production and consumption.
Equipment operation planning improvement: Very short-term generation forecasts allow a more accurate planning of the operation and maintenance of photovoltaic equipment. The power plant managers can anticipate and prepare for changes in production, optimizing the use of resources and minimizing unnecessary downtime.
Operating costs reduction: With more accurate and frequent predictions, more effective control strategies can be implemented, such as dynamic adjustment of the power generated or proactive management of batteries and storage systems. This not only improves operational efficiency, but also reduces the costs associated with managing uncertainty in electricity generation.
Improvements based on reliability of supply: The ability to anticipate changes in PV generation with high frequency allows utilities and power system operators to take preventative measures to ensure grid stability. This contributes to greater reliability in the power supply, reducing the risk of interruptions or failures in the network.
Optimized renewable energy integration: Photovoltaics’ integration into the EPS requires precise coordination between renewable generation and other energy sources. Very short-term prediction models provide crucial information for the efficient integration of solar energy, facilitating coordination with complementary generation sources and the implementation of demand management strategies.
Financial and contractual adjustments: For power sales contracts or power purchase agreements (PPAs), the ability to predict generation with high accuracy at short intervals allows for better financial planning and optimization in commercial agreements. This is essential for risk management and revenue maximization in competitive electricity markets.
Improvement of electric system frequency stability: The ability to precisely predict photovoltaic generation at very short time horizons (5 min) allows for better management of electric system frequency stability. This aspect is relevant given the growing challenge faced by power systems because of the increasing penetration of intermittent renewable energy sources. Accurate predictions enable the implementation of more effective control schemes to dynamically adjust conventional generation and complement solar generation fluctuations; optimize the operation of storage systems and other flexibility resources to maintain system stability; anticipate rapid variability events in photovoltaic generation, allowing preventive actions to avoid significant frequency deviations; and improve coordination among different generation sources to maintain real-time load-generation balance. This is especially important in systems with low inertia, where rapid fluctuations can have a more pronounced impact on system stability.
The researchers considered various delays in the power series for model development and future value estimation. The main contributions of this study include the introduction of time series from other power plants as inputs in a multivariable model, the use of spatial interpolation to fill in missing data, and the application of inter-series causality tests for the selection of predictor variables. In addition, the research analyzes the uncertainty associated with the predictions using quantile regression techniques [31], addressing different percentiles of the error distribution. This study intends to advance the predictive capacity of photovoltaic generation through a novel approach that can be crucial for the efficient planning and operation of electricity systems under growing penetration of renewable energies when meteorological information is not available.
2. Materials and Methods
Figure 1 describes the general methodology followed to develop the forecasting models. The first step is to obtain and process information to build a database containing the power measurements generated by the selected PV parks. This information was provided by the supervisory control and data acquisition (SCADA) systems of the photovoltaic parks in a specific region.
For data processing, the researchers designed software to automate the cleaning and completion of the database imported from Excel spreadsheets, identifying and removing duplicate records and completing missing ones. SCADA system failures or information collection issues can lead to data anomalies, including readings that exceed the maximum park capacity or values repeated more than twice consecutively because of measurement freezes. To address these issues, we implemented a two-step process to handle outliers and missing data. First, we identified outliers with the MATLAB Statistics Toolbox, using an approach based on the mean and standard deviation and setting a threshold of three times the standard deviation so that only extremely improbable values were flagged as outliers.
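For illustration, the following MATLAB sketch reproduces this flagging step under stated assumptions; the matrix name P and the freeze-detection shortcut are ours, not the paper's:

```matlab
% Minimal sketch of the 3-sigma outlier flagging described above.
% Assumption: P is a (time x parks) matrix of 5 min power readings.
mu    = mean(P, 1, 'omitnan');
sigma = std(P, 0, 1, 'omitnan');
P(abs(P - mu) > 3 * sigma) = NaN;      % flag improbable readings as missing

% Flag measurement freezes: the third and later elements of any run of
% three or more identical consecutive readings.
rep = (P == circshift(P, 1)) & (P == circshift(P, 2));
rep(1:2, :) = false;                   % circshift wraps around; ignore the first rows
P(rep) = NaN;                          % freezes are imputed later by IDW
```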
Second, to correct both identified outliers and actual missing data, we applied the inverse distance weighting (IDW) method. This approach leverages the high spatial correlation between energy generation in neighboring parks within the same geographic region. A previous work validated the correlation between parks using a Pearson correlation coefficient analysis, yielding average values exceeding 0.85 in all cases. The IDW method assigns greater weight to data from parks closer to the point of interest, gradually decreasing the weight as the distance increases. In our implementation, we considered a maximum radius of 150 km to select neighboring parks, ensuring that environmental and operational conditions were comparable.
The research used the inverse distance weighting (IDW) method to impute missing data in the time series. IDW is a widely used spatial interpolation technique that estimates missing values from the weighted average of known values at nearby locations. The weights are inversely proportional to the distance between the target location and the neighboring locations, so closer points have a greater influence on the estimated value than farther ones. This approach is suitable for this study because it leverages the high correlation between energy generation in a target photovoltaic park and that of neighboring parks in the same region [32]. The method replaces the missing values in the power series of a park with values calculated using Equations (1) and (2):

$$P_E = \frac{\sum_{i=1}^{n} W(r_{i,E})\, P_i}{\sum_{i=1}^{n} W(r_{i,E})} \quad (1)$$

$$W(r_{i,E}) = \frac{1}{r_{i,E}^{\,p}} \quad (2)$$

where
$P_i$ is the power measurement in park $i$;
$W(r_{i,E})$ is the weighting function for park $i$ with respect to the estimation park;
$r_{i,E}$ is the distance in kilometers between park $i$ and the estimation park;
$n$ is the number of neighboring parks considered in the interpolation;
$p$ is an interpolation exponent, generally taken equal to two.
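A compact MATLAB sketch of this imputation, assuming a (time × parks) matrix P, a vector rkm of distances to the target park, and the 150 km radius mentioned above; the function name and signature are illustrative:

```matlab
function P = idwImpute(P, target, rkm, p)
% Fill NaNs in column `target` of P using Equations (1) and (2).
if nargin < 4, p = 2; end
neigh = setdiff(find(rkm <= 150), target);   % neighboring parks within 150 km
w = 1 ./ (rkm(neigh) .^ p);                  % Equation (2): inverse-distance weights
w = w(:).';                                  % row vector, aligned with vals below
for t = find(isnan(P(:, target))).'
    vals = P(t, neigh);
    ok = ~isnan(vals);                       % use only parks with valid readings
    P(t, target) = sum(w(ok) .* vals(ok)) / sum(w(ok));   % Equation (1)
end
end
```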
After completing and normalizing the database for each PV park, the researchers divided it into a training set (75% of the records), a validation set (15%), and a test set (10%). Before training the forecast models, they standardized all these subsets for each park using the characteristic values of the training data. The structure of the predictor variables and the model response depends on the prediction horizon and on the number of variables and delays considered. It is crucial to determine the input variables used to predict the PV power generated by a park, considering not only their availability but also their correlation with the responses [30].
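A minimal sketch of the chronological split and training-based standardization, under the assumption that P holds one column per park:

```matlab
% Chronological 75/15/10 split; statistics come from the training set only,
% so no information leaks from the validation or test data.
K      = size(P, 1);
idxTr  = 1 : round(0.75 * K);
idxVal = idxTr(end) + 1 : idxTr(end) + round(0.15 * K);
idxTe  = idxVal(end) + 1 : K;
mu = mean(P(idxTr, :));
sg = std(P(idxTr, :));
Pz = (P - mu) ./ sg;                   % standardized series used by all subsets
```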
This research examined two types of models for prediction: univariate and multivariate. Univariate models rely on the historical time series of the park itself to generate forecasts. In contrast, multivariate models incorporate additional variables, such as the power generation data from other parks, along with the time series of the target park. These models include multiple delays (D) in the input sequence to predict future values, with the delays determined through autocorrelation tests. A sliding window method organized the data structure, based on [25]. A fixed-size window of length D slides along the power time series P1, P2, …, Pn. The input vector for training comprised the values within the window; the target was the power at the next instant of time.
This method used MATLAB to enter the data as two-dimensional numerical arrays, where each column represents an input vector. For a univariate, single-step-ahead model, for example, the research converted the prediction problem into a regression problem using samples of past values from the single time series under analysis. Thus, the prediction used a univariate regression model of the form $\hat{P}(t) = f\big(P(t-1), P(t-2), \ldots, P(t-D)\big)$, with parameters determined through a training process based on historical data up to time $t-1$.
For multivariate models, which consider n power time series from photovoltaic parks with K records each, and for a prediction horizon h and D delays, the input data and the model response (output) for the P1 series have the structure described by Equations (3) and (4):

$$X(t) = \big[\,P_1(t-D+1), \ldots, P_1(t),\; P_2(t-D+1), \ldots, P_2(t),\; \ldots,\; P_n(t-D+1), \ldots, P_n(t)\,\big]^{\top} \quad (3)$$

$$Y(t) = P_1(t+h) \quad (4)$$

where
$P_x$ is the power of park $x$;
$D$ is the number of delays used for the prediction;
$h$ is the prediction horizon.
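The following MATLAB sketch builds these arrays with a sliding window; the variable names are ours, and P is assumed to be a K × n matrix with the target park in column 1:

```matlab
D = 6; h = 1;                          % six 5 min delays, one-step-ahead horizon
[K, n] = size(P);
numSamples = K - D - h + 1;
X = zeros(n * D, numSamples);          % one input vector per column, Equation (3)
Y = zeros(1, numSamples);              % target park response, Equation (4)
for s = 1:numSamples
    win = P(s : s + D - 1, :);         % D consecutive records from all n series
    X(:, s) = win(:);                  % flatten the window into a column vector
    Y(s) = P(s + D + h - 1, 1);        % power of park P1 at horizon h
end
```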
2.1. Model Quality Statistics
A meticulous evaluation, based on various criteria analyzing the accuracy and adequacy of the results, determined the final prediction model. Among the statistical indicators commonly used to evaluate the performance of prediction models are the mean absolute percentage error (MAPE), mean absolute error (MAE), root-mean-square error (RMSE), and correlation coefficient (r). These indicators are calculated as follows:

$$\mathrm{MAPE} = \frac{100}{N}\sum_{t=1}^{N}\left|\frac{P(t)-\hat{P}(t)}{P(t)}\right|$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N}\left|P(t)-\hat{P}(t)\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\big(P(t)-\hat{P}(t)\big)^{2}}$$

$$r = \frac{\sum_{t=1}^{N}\big(P(t)-\bar{P}\big)\big(\hat{P}(t)-\bar{\hat{P}}\big)}{\sqrt{\sum_{t=1}^{N}\big(P(t)-\bar{P}\big)^{2}\,\sum_{t=1}^{N}\big(\hat{P}(t)-\bar{\hat{P}}\big)^{2}}}$$

where $N$ is the number of observations, $P(t)$ is the measured power at time $t$, $\hat{P}(t)$ is the predicted power at time $t$, $\bar{P}$ is the average measured power, and $\bar{\hat{P}}$ is the average forecasted power.
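In MATLAB, these indicators reduce to a few lines; Pmeas and Ppred are assumed test-set vectors:

```matlab
e    = Pmeas - Ppred;                   % forecast errors
MAE  = mean(abs(e));
RMSE = sqrt(mean(e.^2));
MAPE = 100 * mean(abs(e ./ Pmeas));     % inflated when Pmeas is near zero (e.g., at night)
r    = 100 * corr(Pmeas(:), Ppred(:));  % correlation coefficient, in percent
```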
2.2. Architecture and Operation of Neural Networks Used in Modeling
The LSTM (long short-term memory) and BiLSTM (bidirectional LSTM) neural networks, specialized architectures within recurrent neural networks (RNNs), handle and learn patterns in data sequences, such as time series, text, and signals.
2.3. LSTM (Long Short-Term Memory)
Hochreiter and Schmidhuber proposed LSTMs in 1997 to overcome the limitations of traditional RNNs, which have difficulty handling gradients and remembering information over the long term because of the vanishing or exploding gradient problem. LSTMs address this problem by using memory units called memory cells, whose internal structure allows information to be maintained and updated over long periods of time.
2.4. Architecture and Operation of LSTM Networks
A long short-term memory (LSTM) network is a recurrent neural network architecture designed to capture long-term dependencies in data sequences. The architecture of an LSTM comprises several key components that work together to manage information over time sequences.
Memory cells: Each LSTM unit contains a memory cell $c_t$ that maintains the memory state at time t. Specific operations control the flow of information, updating, forgetting, or reading this cell.
LSTM gates: LSTMs use three types of gates to regulate the flow of information within the memory cell:
Forget gate: The forget gate decides what information stored in the memory cell should be forgotten or kept. It is calculated as $f(t) = \sigma\big(W_f\,[h(t-1), x(t)] + b_f\big)$, where $\sigma$ is the sigmoid function, $W_f$ is the weight matrix of the forget gate, $h(t-1)$ is the hidden state at the previous time step, $x(t)$ is the current input, and $b_f$ is the bias of the forget gate. The result $f(t)$ is a vector of values between 0 and 1 that indicates how strongly the previous information is forgotten.
Input gate: The input gate determines what new information should be added to the memory cell. It is calculated in two steps: first, a sigmoid function decides which values will be updated, $i(t) = \sigma\big(W_i\,[h(t-1), x(t)] + b_i\big)$. A tanh activation function then generates a candidate update vector, $\tilde{c}(t) = \tanh\big(W_c\,[h(t-1), x(t)] + b_c\big)$. The new memory cell $c(t)$ is obtained by combining the old value and the new content: $c(t) = f(t) \odot c(t-1) + i(t) \odot \tilde{c}(t)$, where $\odot$ represents element-by-element multiplication.
Output gate: The output gate decides how much of the information in the memory cell should be used as output. It is calculated as $o(t) = \sigma\big(W_o\,[h(t-1), x(t)] + b_o\big)$, where $W_o$ and $b_o$ are the weights and bias of the output gate, respectively. The hidden state $h(t)$ is obtained as $h(t) = o(t) \odot \tanh\big(c(t)\big)$. This allows only a portion of the current memory cell, after being modified by the tanh function, to be exposed as output.
Feedback structure: LSTMs handle long sequences through a feedback structure that allows for the propagation of information through multiple temporal steps. This prevents information degradation as the sequence spreads by connecting the memory cells.
2.5. BiLSTM (Bidirectional LSTM)
BiLSTMs extend unidirectional LSTMs by allowing information to flow both forward and backward through the time sequence. This makes them particularly effective at capturing context on both sides of each point in the sequence, which is useful in tasks where the prediction depends on both the past and the future history of the data.
2.6. Architecture and Operation of BiLSTM Networks
Two LSTM layers: The BiLSTM comprises two independent LSTM layers for each temporal direction: one that processes the sequence in the temporal order and one that processes it in reverse order.
At each time step, the analysis concatenates the outputs from both LSTMs to form a combined representation of the past and future context.
BiLSTM networks are useful in tasks where both past and future context need to be considered, such as sentiment analysis, sequence tagging, machine translation, and, more recently, the processing of time sequences for prediction.
3. Results and Discussion
The aim of this study was to analyze the Yaguaramas photovoltaic park in the province of Cienfuegos, referred to as Parque P1. To enhance the robustness of the analysis, the research included generation data from seven additional solar plants in the central region of Cuba: Caguaguas (P2), Frigorífico (P3), Guasimal (P4), Marrero (P5), Mayajigüa-1 (P6), Mayajigüa-2 (P7), and Venegas (P8). The selected plants spanned 8040 km² (60 km × 134 km) within the same climatic region.
Researchers calculated Pearson’s correlation coefficient between the Yaguaramas park and the other parks. The results showed a strong positive correlation between the power generated by the parks considered. In particular, the Yaguaramas park exhibited a correlation coefficient greater than 0.85 compared to other parks.
Researchers collected data every 5 min throughout 2021, producing 105,120 energy records per plant, divided into 288 daily intervals. The analysis divided the dataset into training subsets (75%), validation subsets (15%), and test subsets (10%) to ensure appropriate model evaluation.
The first stage comprised verifying the relationship between the power series of the parks to assess whether it was feasible to include data from other parks as predictor variables in the models. Pearson's correlation coefficient showed a strong positive correlation between the powers generated by the parks; the Yaguaramas series exhibited correlations higher than 0.72 with all the other parks.
Researchers conducted a Granger causality test to assess the inclusion of information from other parks in the forecast models; the test rejected the null hypothesis of no causality, supporting their inclusion. In addition, the augmented Dickey–Fuller test verified that all series were stationary. The focus of the study was to analyze how the incorporation of data from other parks affected the forecasting models for the Yaguaramas park.
In the study, the autocorrelation function (ACF) was calculated to introduce temporal characteristics into the models.
Figure 2 presents the ACF for the power generated in the Yaguaramas park, together with the 95% confidence intervals (CI). In Figure 2, the horizontal blue lines are the approximate 95% CI, and the vertical red lines mark the value of the autocorrelation function of the series at each lag. The ACF at lag zero is equal to 1, showing the correlation of the time series with itself. The data show an autocorrelation exceeding 0.8 for the first six lags; each lag corresponds to a 5 min interval in the recorded data. All the time series showed an autocorrelation greater than 0.7 up to 10 lags.
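A plot equivalent to Figure 2 can be produced with the Econometrics Toolbox; P1 is the assumed name of the Yaguaramas series:

```matlab
[acf, lags, bounds] = autocorr(P1, 'NumLags', 20);   % sample ACF with 95% bounds
stem(lags, acf, 'r');  hold on
yline(bounds(1), 'b'); yline(bounds(2), 'b');        % approximate confidence limits
xlabel('Lag (5 min steps)'); ylabel('ACF'); hold off
```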
The researchers conducted all analyses using MATLAB® R2021b on a portable HP EliteBook 855 G8 laptop equipped with an AMD Ryzen 7 PRO 5850U processor (1.90 GHz) and 16 GB of RAM. To optimize the performance of the LSTM and BiLSTM models, a hyperparameter tuning process was carried out. This process involved evaluating multiple configurations of key hyperparameters, including the number of hidden layer units (3 to 153 with a step of 3), validation frequency (250 to 500 with a step of 50), mini-batch size (32, 64, 144, 288, and 432), and early stopping (3, 4, 5, 6, and 7 epochs without improvement). The selection of these configurations was based on their ability to minimize the prediction error metric (RMSE). The final architecture for the best-performing model (Model F) included 128 hidden units, a validation frequency of 500, a mini-batch size of 32, and early stopping after 6 epochs without improvement.
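The explored grid can be written down directly from these values; the enumeration below is our sketch of the search space, not the authors' code:

```matlab
% Hyperparameter grid from the text; each combination was trained and the
% configuration minimizing validation RMSE was retained (Model F: 128 hidden
% units, validation frequency 500, mini-batch 32, patience 6).
hiddenUnits = 3:3:153;
valFreq     = 250:50:500;
miniBatch   = [32 64 144 288 432];
patience    = 3:7;
[HU, VF, MB, PA] = ndgrid(hiddenUnits, valFreq, miniBatch, patience);
configs = [HU(:) VF(:) MB(:) PA(:)];   % 51 x 6 x 5 x 5 = 7650 candidate configurations
```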
The neural network architecture described processed sequences of data and generated a regression output. The configuration started with a sequenceInputLayer, followed by an LSTM/BiLSTM layer configured to return only the output corresponding to the last time step (OutputMode, last). To regularize the weights and prevent overfitting, L2 regularization factors of 0.1 were applied to the input weights (InputWeightsL2Factor), the recurrent weights (RecurrentWeightsL2Factor), and the biases (BiasL2Factor). The weight initializers followed the He heuristic (he), which helps speed up training in deep networks, while the bias was initialized using the unit forget gate strategy, commonly used in LSTM/BiLSTM networks to enhance training stability. A dropout layer with a rate of 0.25 followed the LSTM/BiLSTM layer to further reduce overfitting. The network continued with a fully connected layer (fullyConnectedLayer) that reduced the dimensionality to a single output, followed by a regression layer (regressionLayer) that enabled the modeling of continuous prediction problems. This configuration combined advanced regularization and initialization techniques to ensure a balance between modeling capacity and generalization. In univariate models, a single input and output variable is used to predict a future instant within the time series. In contrast, the multivariate models incorporate the powers generated in other parks in the region as exogenous variables, besides the series of the target park.
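Under the stated configuration, a minimal MATLAB (R2021b) sketch of the Model F network and training options could read as follows; numFeatures and the solver choice are our assumptions, since the paper does not state the optimizer:

```matlab
numFeatures = 2;                        % assumed: two input power series (Model F)
layers = [
    sequenceInputLayer(numFeatures)
    bilstmLayer(128, 'OutputMode', 'last', ...        % output at the last time step only
        'InputWeightsL2Factor', 0.1, ...
        'RecurrentWeightsL2Factor', 0.1, ...
        'BiasL2Factor', 0.1, ...
        'InputWeightsInitializer', 'he', ...
        'BiasInitializer', 'unit-forget-gate')
    dropoutLayer(0.25)
    fullyConnectedLayer(1)
    regressionLayer];
options = trainingOptions('adam', ...   % assumed solver
    'MiniBatchSize', 32, ...
    'ValidationFrequency', 500, ...
    'ValidationPatience', 6, ...        % takes effect once ValidationData is supplied
    'Shuffle', 'every-epoch');
% net = trainNetwork(XTrain, YTrain, layers, options);  % XTrain: cell array of
%                                                       % numFeatures x D sequences
```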
Univariate one-step models use a single variable as input to predict the value of that same variable at a specific future point; this represents the simplest type of prediction. Multivariate one-step models introduce multiple variables to the network to predict a single variable at a single future instant. While the common practice is to use meteorological variables as additional inputs, this research explored the powers generated in other parks in the region, as previously detailed.
The models processed historical records of power generated every 5 min and learned to predict its future behavior. All models developed were for 5 min time horizons. Because autocorrelation between past and current series values existed, the authors investigated models with various time delays to assess their effects on prediction quality. Because of the large number of configurations, this article presents only the models with the best results or that provide meaningful information.
Table 1 details the different cases studied, including their input and architecture.
Models A, B, and C were univariate, using only the power series of the Yaguaramas target park (P1) as input. The remaining models were multivariate. Model D included all the park series, including the target park. Models E and F considered only the parks with the highest correlation with the target park, i.e., P1, P2, P4, and P5.
Table 2 shows the statistical indicators to evaluate the quality of the models. These error statistics were the basis for comparison between the different models. The error statistics remained relatively stable and the correlation coefficient r varied between 94.09% and 95.17%, indicating high accuracy in short-term prediction.
3.1. Comparison of the Models
To compare models operating on the same time horizon, it is necessary to first identify the significant differences between them. To do this, the authors used the Tukey test, also known as the Tukey multiple comparisons procedure. This statistical technique is suitable for comparing the means of several groups and determining whether significant differences exist between them; it is useful when comparing three or more groups to identify which of them differ with respect to a variable of interest. In this study, the Tukey test allowed us to compare the errors, or residuals, between the forecasts generated by the models and the actual power measurements. The test was implemented with MATLAB's multcompare function. Interpretation of the results was based on the confidence interval for the difference between two means: if this interval did not include zero and the associated p-value was less than the significance level (α = 0.05), we concluded that a significant difference existed between the models in terms of prediction error.
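A sketch of this comparison follows; the residual layout and model labels are assumed, and multcompare defaults to the Tukey–Kramer critical value:

```matlab
% resid: N x m matrix, one column of test-set residuals per model A..F.
labels = {'A', 'B', 'C', 'D', 'E', 'F'};
[~, ~, stats] = anova1(abs(resid), labels, 'off');   % one-way ANOVA on absolute errors
[c, means] = multcompare(stats, 'Alpha', 0.05, 'CType', 'tukey-kramer');
% Each row of c lists the pair compared, the 95% CI of the mean difference,
% and the p-value; intervals excluding zero indicate a significant difference.
```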
Models A, B, and C were univariate and showed significant differences from each other, as shown by Tukey's test. When evaluating the performance of these models using error metrics (RMSE, MAE, and MAPE) and the correlation coefficient (r), Model C stood out as the best univariate model for the 5 min horizon. Although the MAPE of Model C (107.56%) was higher than that of the other models, its RMSE (407.24 kW) and MAE (138.80 kW) values were lower, and its correlation coefficient (95.08%) was slightly higher, showing a better fit.
Among the multivariate models with a 5 min horizon, comparison of Models D, E, and F showed that Model F had the lowest RMSE (400.63 kW). Although its MAE (147.66 kW) was slightly higher than that of Model E (141.96 kW), it was lower than that of Model D (162.44 kW), and its correlation coefficient was comparable. The authors preferred Model F because of its lower complexity; it used only two input variables, thus reducing data preparation, computational cost, and training time.
For 5 min time horizons, Models C (univariate) and F (multivariate) were the standouts, both based on BiLSTM networks and incorporating six delays from each time series into the input features. According to Table 2, Model F showed a slightly lower RMSE (400.63 kW) and MAPE (102.03%) than Model C (407.24 kW and 107.56%, respectively), indicating higher accuracy. In addition, the correlation coefficient of Model F (95.17%) was slightly higher than that of Model C (95.08%), which reinforced its better fit.
Figure 3 illustrates the correlation between the measured and predicted power values using Model F for the test dataset. The data represent a 5 min time horizon, with predictions generated for both clear and cloudy days. Cloudless days exhibited more uniform power generation patterns because of constant solar irradiance, while cloudy days showed abrupt fluctuations caused by intermittent cloud cover. The high correlation coefficient (r = 95.17%) showed that Model F effectively captured these variations, demonstrating its capability to adapt to different weather conditions.
Figure 4 and Figure 5 show the forecast results and the actual values for a cloudy day and a clear day, respectively. In these graphs, the prediction fit reality more closely on clear days, while on the cloudy day, with sudden variations in generation, the result was slightly worse.
3.2. Forecast Interval
Predicting with a confidence interval rather than simply providing a point value is critical for more informed and effective decision-making. A confidence interval not only provides an estimate of the future value but also reflects the uncertainty and variability inherent in predictions. This is especially important in contexts such as PV generation, where factors such as weather can cause significant fluctuations. A confidence interval allows managers to plan and prepare for different scenarios, rather than relying solely on an estimate that might prove inaccurate. In addition, it provides a more robust measure of the risk associated with decisions, helping to optimize resources, reduce the impact of errors, and improve the reliability of operational and control strategies.
Quantile regression is a useful technique for managing uncertainty in PV power forecasts. Rather than estimating just an average value, this method yields forecast intervals that show how the prediction varies across different parts of the error distribution. To apply it, the first step is to calculate the differences between the actual and predicted values, creating a set of errors. Then, selected percentiles of these errors are estimated using MATLAB's quantile function, for example, the 5th (0.05 quantile) and the 95th (0.95 quantile).
Researchers added or subtracted these percentiles from the original predictions to obtain a range of potential future values. This range, or forecast interval, showed where future observations were likely to fall, considering variability in errors. For example, when applying this method to Model F to predict power over a 5 min horizon, it was observed that, for a cloudy day with significant fluctuations in energy generation, most of the values fell within the prediction range. For sunny days, all values were adjusted to this range. Thus, quantile regression offered a more comprehensive way to assess and manage uncertainty in our forecasts, helping to make more informed decisions and assess risks more effectively.
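A minimal sketch of this interval construction, with assumed names for the held-out and test vectors:

```matlab
errs  = yVal - yhatVal;                % forecast errors from a held-out set
qLo   = quantile(errs, 0.05);          % 5th percentile of the error distribution
qHi   = quantile(errs, 0.95);          % 95th percentile
lower = yhatTest + qLo;                % empirical 90% forecast interval
upper = yhatTest + qHi;                % around each point prediction
```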
Figure 6 shows the confidence interval.
3.3. Residual Analysis of the Model
Figure 7 shows the analysis of the Model F residuals, with positive results regarding model fit. The absence of residual autocorrelation shows the model was not omitting important temporal patterns in the data, suggesting that Model F was well specified and effective for prediction. Although the histogram of the residuals was not perfectly normal, it resembled a normal distribution, which is a good, although not ideal, sign.
This suggests that Model F provided accurate predictions overall, but there are areas where the model could improve. To improve future results, authors suggest data transformations or alternative models to improve residual normality. They also recommend investigating and handling outliers and adjusting hyperparameters. These actions could help optimize the accuracy and robustness of the model in future analyses.
4. Comparison with Previous Models and Contributions
This study introduces an innovative approach for very short-term forecasting of photovoltaic (PV) energy, using multivariate models that incorporate non-climatic exogenous variables. The major innovation lies in using time-series data from neighboring PV plants as inputs, rather than relying solely on meteorological data or univariate models. This strategy addresses an important gap in the literature, as highlighted in [2], where the integration of spatial correlations between plants has rarely been explored.
The authors previously investigated the development of several models for the case study park [9,12], combining the discrete wavelet transform (DWT) with artificial neural networks (feed-forward backpropagation and generalized regression neural networks) to decompose and reconstruct the time series of power and meteorological variables. Although these models achieved good results, their dependence on meteorological data and their more complex structure limited their precision. For example, the hybrid model in [9] presented a MAPE of 33.89%, an RMSE of 1.5 MW, and an error variance (σ²) of 0.32. In contrast, the models proposed in this article introduce significant simplification and improvement by employing long short-term memory (LSTM) and bidirectional LSTM (BiLSTM) neural networks, which integrate power data from other plants in the region as predictive variables. This innovation allowed for greater accuracy, with Models D, E, and F exhibiting RMSE values between 400.63 kW and 404.29 kW, MAPE values between 94.12% and 102.03%, and correlation coefficients (r) between 95.08% and 95.17%. Specifically, Model F stood out for its reduced complexity (only two inputs) and superior accuracy metrics, with an RMSE of 400.63 kW, a MAPE of 102.03%, and a correlation coefficient of 95.17%, a notable improvement compared to previous models.
To better understand the advantages of the proposed method, we conducted a comparative analysis between univariate and multivariate models. Univariate models, such as Models A, B, and C, are based only on historical energy data from the target plant (P1). While these models provide reasonable accuracy, they cannot capture broader spatial patterns and correlations present in the regional energy generation dynamics. For instance, Model C, the best-performing univariate model, achieved an RMSE of 407.24 kW and a correlation coefficient of 95.08%. However, the lack of external inputs explaining variability beyond the target plant’s immediate environment limited its performance.
In contrast, multivariate models, such as Models D, E, and F, incorporate power series from highly correlated neighboring PV plants. These models showed superior predictive capabilities, particularly Model F, which achieved the lowest RMSE (400.63 kW) and the highest correlation coefficient (95.17%). Including exogenous variables allowed Model F to more effectively capture spatial dependencies and temporal patterns, leading to more accurate and reliable predictions.
A key advantage of the proposed method is its robustness in handling scenarios with limited meteorological data. Traditional forecasting methods often heavily rely on weather forecasts or numerical weather prediction (NWP) models, which may not always be available or accurate. By focusing on power series data from neighboring plants, our method provides a practical solution for regions where meteorological information is scarce or unreliable.
The study highlights the importance of model selection and architecture design. Using BiLSTM networks in multivariate models (e.g., Model F) demonstrates their ability to capture bidirectional temporal dependencies, which are crucial for understanding past and future trends in PV energy generation. This contrasts with LSTM-based models, which process information only in one forward direction and may overlook important contextual relationships.
To underscore the impact of the proposed method, Table 2 compares the performance metrics of the univariate and multivariate models. The results clearly show that incorporating exogenous variables significantly improved prediction accuracy. For example, Model F reduced the RMSE by approximately 6.61 kW compared to Model C while maintaining a comparable MAE and achieving a slightly higher correlation coefficient. These improvements highlight the value of integrating spatial correlations into forecasting models.
Lastly, the application of quantile regression to calculate confidence intervals further enhanced the practical utility of the proposed method. By providing solid estimates of uncertainty, this technique enables grid operators to make informed decisions and prepare for various scenarios, especially during periods of high variability, such as cloudy days.
5. Conclusions
This study evaluated the efficacy of advanced multivariate models for the very short-term prediction of PV power generation, integrating data from power series from multiple plants as exogenous variables.
Incorporating power series data from nearby PV plants as exogenous variables in the prediction models proved effective in improving the accuracy of 5 min estimates. Multivariate models that include data from multiple PV parks showed superior performance compared to univariate models based solely on data from the target park.
Among the models evaluated, Model F, which used BiLSTM networks and power generation data from the parks most correlated with the target park, showed the best overall performance. This model had the lowest root-mean-square error (RMSE) and a higher correlation coefficient, showing a greater ability to fit the predictions to the real values.
Including power data from other PV parks not only improves the accuracy of predictions but also allows the models to capture spatial patterns and correlations that are not reflected in univariate models. This approach allows for better management of variability in power generation, facilitating a more effective integration of solar energy into the electricity grid.
The application of quantile regression to calculate confidence intervals has proven to be useful in managing uncertainty in predictions. Confidence intervals provide a more robust measure of variability and help plan and prepare for different scenarios in power generation.
The results of this study have significant implications for the planning and operation of photovoltaic systems. The ability to forecast generation with high accuracy and consider the associated uncertainty facilitates better management of the power grid, optimization of resources, and reduction of operating costs. Implementing advanced multivariate models can significantly improve the integration of renewable energies and the stability of electricity systems.
Future research could focus on optimizing hyperparameters, handling potential outliers, and applying transformations to the data to further improve the accuracy and robustness of the model.