1. Introduction
Reducing CO2 emissions in construction is essential to achieve climate neutrality by 2050 and reach the EU target of a 55% reduction in emissions by 2030 [
1]. Within this framework, buildings contribute about 36% of total greenhouse gas emissions and consume about 40% of energy in the EU, positioning the sector as a key player in energy efficiency and decarbonization strategies [
2]. Moreover, the most recent European directive on energy efficiency stipulates that, from 2030, all new buildings must comply with the zero-emission standard, and that existing buildings must be gradually transformed to meet this requirement by 2050 [
3].
The mass integration of photovoltaic (PV) systems for self-consumption in smart buildings can be key to moving towards sustainable and resilient energy models. These systems allow buildings to generate their own clean energy, as well as optimize consumption and reduce costs, while contributing to decarbonization and urban energy efficiency [
4]. However, the inherent variability in solar radiation and local weather conditions poses significant challenges for the accurate prediction of PV production, a critical aspect for efficient demand management, grid stability, and optimization of distributed energy resources [
5]. In this context, accurate forecasting of the power generated by PV systems for self-consumption is essential to anticipate the availability of renewable energy, facilitate operational planning, and improve the integration of these systems into the grid, thereby maximizing their benefits and minimizing the risks associated with the intermittency of solar generation [
6].
Faced with these challenges, digital twins (DTs) applied to photovoltaic installations for self-consumption emerge as transformative solutions by creating dynamic virtual replicas that simulate physical installations in real time, allowing their continuous monitoring, operational optimization, and predictive maintenance [
7]. However, the effectiveness of these twins depends critically on the selection of the underlying predictive model, which must balance accuracy, robustness, and computational efficiency under real operating conditions [
8].
Digital twins are revolutionizing the management of intelligent buildings by enabling the continuous, real-time monitoring, simulation, and optimization of their systems and energy consumption [
9]. This technology virtually replicates the operation of a building, integrating data from renewable sources, energy storage, and load management systems. This opens up new possibilities for achieving greater efficiency and sustainability [
10]. DTs enable a proactive response to changes in demand, improving operational efficiency and reducing costs [
11]. Regarding the use of predictive maintenance, implementing DTs enables the early detection of failures and anomalies in critical equipment, facilitating predictive maintenance strategies that reduce breakdowns and prolong the useful life of the systems [
12]. DTs also allow for the advanced monitoring of a building’s services and interior environment. They adjust the air conditioning, lighting, and ventilation in real time to maximize comfort and energy efficiency [
13]. DTs help determine when to store or release energy by considering electricity prices, demand curves, and renewable production. Thus, smart strategies can be designed to minimize consumption peaks and take advantage of both dynamic tariffs and distributed storage, contributing to more resilient and sustainable grids [
14].
A traditional approach to the development of predictive tools in DTs has been the formulation of mathematical models capable of accurately describing the dynamics of the system. This approach is particularly suitable when a solid knowledge of the physical behavior of the components is available, allowing their fundamental relationships to be expressed by well-established equations and laws [
15]. However, this method has certain limitations when applied to complex systems, especially those that are highly nonlinear or involve many interrelated factors. In such cases, achieving an accurate representation of reality is difficult [
8]. In addition, the lack of sufficient information or reliable data can lead to inaccurate or even erroneous results, compromising the usefulness of these models [
16]. Traditional mathematical approaches are generally designed for very specific cases, making them difficult to scale or adapt to larger systems or those requiring a high degree of customization [
17]. To simplify handling, approximations are often incorporated that sacrifice important details of the actual behavior of the system, which can negatively affect the fidelity of the predictions obtained [
18].
In response to these limitations, the advance of artificial intelligence, and particularly artificial neural networks, has opened up new modeling possibilities for digital twins. These networks have demonstrated an outstanding ability to model multifactorial dynamics, such as those present in energy, biomedical, and electronic systems, extending the range of technical and industrial applications beyond the typical uses of artificial intelligence [
19]. Their main strength lies in the integration and processing of data in real time, which enables the digital twin to remain up to date and to adapt continuously to new operating conditions, thereby increasing both the accuracy and robustness of predictions and diagnostics [
20]. In addition, neural networks are used for predictive control and process optimization, overcoming the limitations of classical approaches through their ability to capture complex relationships between variables and learn from large volumes of historical and operational data [
21]. The use of convolutional networks brings added value in image and signal analysis tasks, which favors pattern recognition, human–machine interaction, and advanced monitoring in fields such as manufacturing and robotics, thus consolidating neural networks as versatile and essential tools for the development of intelligent digital twins in complex industrial environments [
22]. This versatility and explanatory power make neural networks an ideal tool for the development of DTs in environments where uncertainty, variability, or lack of information prevent the use of classical mathematical models alone [
23].
Recently, hybrid frameworks combining physical models with neural networks have been developed to take advantage of the accuracy of physical laws and the flexibility of data learning. This improves the generalization capability and adaptability of DTs in complex systems [
24]. For instance, in designing flat-plate solar collectors, where simplified, trial-and-error methods are traditionally used, physics-based neural networks have been suggested to predict the optimal design conditions in regions with unique environmental conditions, such as the highlands of Ecuador [
25]. Evolutionary algorithms and language models have been employed to design and optimize hybrid DT architectures, improving their efficiency and applicability in scenarios with limited data [
26].
Therefore, neural networks can be assumed as a suitable option that can model nonlinear relationships and complex temporal patterns in the data, surpassing traditional methods such as linear regression or physical models [
27]. However, their optimal implementation requires a critical evaluation of the operational context, the granularity of the available data, and the specific computational constraints [
28].
The use of neural networks as a basis for the development of digital twins in PV installations has enabled the accurate simulation and prediction of system behavior under a wide variety of operating conditions. For example, digital twins incorporating hybrid neural network architectures are able to simulate with high fidelity the characteristics of PV panels in changing contexts, providing versatile tools for energy monitoring and optimization [
29]. Furthermore, the integration of recurrent neural networks has facilitated the real-time estimation of PV power generation, even when weather conditions are variable, which is essential for dynamic energy resource management [
30]. Such forecasts should have a short time horizon, covering intervals of minutes or hours, and no more than a few days, in order to adapt production precisely to consumption needs at any given time [
31]. Other variants, such as models based on Multilayer Perceptron (MLP) and Elman networks, allow realistic and reliable estimates of the power generated, adapting to the complex nature of the data collected [
32]. The ability of neural networks to process large volumes of information in real time is especially leveraged in short-term forecasting applications, where the digital twin uses IoT (Internet of Things) data to anticipate power generation and adjust operational strategy almost instantaneously [
33]. Additionally, advanced architectures such as the FFNN-LSTM have outperformed traditional physical models and other artificial intelligence techniques in accurately estimating PV power [
34]. In the field of predictive maintenance, multi-twin digital twins supported by deep networks excel at diagnosing faults in strings of photovoltaic modules, achieving accuracy levels of over 98%, which translates into greater reliability and operational safety [
35]. Finally, comparisons between digital twins of a physical nature and those driven by neural networks in the context of combined photovoltaic and battery systems show the advantages of the artificial intelligence-based approach, particularly in terms of adaptability, accuracy, and scalability in the face of the complexity of modern systems [
36].
The scientific literature has explored a wide range of neural network models to improve the prediction accuracy of PV generation, addressing both the nonlinear nature and time dependence of the data, distinguishing between long-, medium-, short-, and very short-term predictions [
37]. In [
38], neural network models, in particular the Multilayer Perceptron (MLP), demonstrated a high predictive ability to estimate the power generated by a PV system under real conditions, reaching R2 values higher than 0.93 and mean absolute errors (MAEs) lower than 0.08 in the experimental validation. The results presented in [
39] showed that the MLP model achieved a very satisfactory performance in the very short-term (5 min) prediction of PV production, achieving accuracy comparable to recurrent architectures, but with significantly shorter training and inference times. This indicates that, in contexts where computational resources are limited or frequent model updating is required, the MLP represents a more efficient and practical alternative to more complex models such as LSTM (Long Short-Term Memory). These results show that simple structures, such as the MLP, can provide more accurate fits than other modern models and also reduce the prediction time in different fields, as some works have pointed out [
40]. However, the results obtained in [
41] show that the combination of LSTM with self-attention mechanisms and the integration of historical and forecast meteorological data increased the coefficient of determination (R2) by 26.4% with respect to the basic LSTM, thus achieving superior accuracy and adaptability in both short- and long-term forecast horizons. Therefore, it can be said that the LSTM model is designed to remember relevant information over long periods, which allows it to model the temporal evolution of PV power and anticipate changes due to variable meteorological conditions. The GRU (Gated Recurrent Unit) model, like the LSTM, is designed to learn patterns and relationships over time, which is essential to anticipate the evolution of PV power under changing weather conditions. In addition, it is simpler and faster to train than other recurrent models such as the LSTM; in fact, it is a simplification of the latter, which allows its use in real-time applications and with large volumes of data [
42].
Neural networks, particularly the MLP, LSTM, and GRU models, are arguably driving a transformation in DTs for buildings. These models enable predictive monitoring and more efficient energy management. However, the full adoption of these technologies still faces technical hurdles, primarily regarding interoperability between Building Information Modeling (BIM) systems, the Internet of Things (IoT), and artificial intelligence (AI) algorithms [
43]. MLPs have been used for prediction and classification tasks, such as estimating CO2 emissions or analyzing energy consumption patterns in DTs [
44]. LSTMs tend to be more accurate for energy consumption prediction, though GRUs can match or even outperform LSTM models in certain cases, especially when greater computational efficiency is required [
45]. In addition, GRUs have been proven to offer better results in occupancy and trajectory estimation within buildings. They achieve lower errors and have fewer parameters to adjust, which facilitates their training and scalability [
46]. Integrating neural networks into DTs offers clear benefits, such as improving the visualization and understanding of critical information, automating monitoring, and enabling the implementation of data-driven management strategies, thereby increasing operational efficiency [
47].
The main contribution and originality of this work lies in the realization of an exhaustive and homogeneous experimental comparison between a validated mathematical model and three neural network architectures (MLP, LSTM, and GRU) for the prediction of PV production in a real environment, using high-temporal-resolution data and complete annual coverage of the installation of the School of Industrial Engineering of the University of Extremadura. Unlike previous studies, which tend to focus on specific prediction horizons, limited datasets, or partial comparisons between models, this work objectively evaluates the accuracy, robustness, and seasonal adaptability of each approach under standardized metrics (MSE, RMSE, MAE, and R2), following a transparent and reproducible methodological protocol. This approach allows not only for the identification of the most suitable model for its integration in digital twins of PV installations but also for the provision of practical recommendations for its deployment in real scenarios, considering the seasonal variability and the usual operational constraints in the sector. Thus, this study contributes to closing the existing gap in the literature on the comprehensive and contextualized comparison of predictive models for advanced energy management applications in smart buildings.
The rest of the article is organized as follows:
Section 2 describes the actual plant monitored, the data collected, the prediction methods used, and the metrics applied to evaluate the performances of the models. In
Section 3, the results achieved are presented and analyzed. Finally, in
Section 4, the conclusions are presented.
2. Materials and Methods
2.1. Actual Plant Description
This study was carried out on the photovoltaic installation located on the roof of the School of Industrial Engineering of the University of Extremadura, in Badajoz. The installation has a nominal power of 2.79 kWp, consisting of six JA SOLAR JAM72S20 monocrystalline photovoltaic modules (from JA SOLAR GmbH, München, Germany) connected in series, and a Huawei SUN2000-5KTL-M1 inverter (from Huawei Technologies Co. Ltd., Shenzhen, China). The orientation of the panels is south, with an inclination of 30° (see
Figure 1).
The dataset used in this study consisted of 36,823 records of solar irradiance, ambient temperature, wind speed, and actual DC power generated by the facility under study, collected every 5 min. Meteorological data acquisition was performed using a Davis Vantage Pro2™ Wireless weather station, complemented with a WeatherLink Live system and a DAVIS 6450 pyranometer installed in the same plane as the panels, next to the facility under study. The power generated was obtained through Excel file downloads from the Huawei manufacturer’s application.
The electrical parameters required for the mathematical model were obtained from the JA SOLAR JAM72S20 panel manufacturer’s datasheet.
2.2. Data Collection
The inputs to the DT should be those outdoor variables that are considered to affect the actual PV installation [
48]. Solar radiation and cell temperature are, among many other factors, some of the variables on which the energy generated by the PV installation depends [
49]. When studying meteorological variables, it is fundamental to analyze them by season, because the relationships, trends, and effects of these variables change significantly throughout the year. Meteorological conditions such as temperature, solar radiation, wind, and precipitation behave differently in each season [
50]. In the case of photovoltaic technology, an increase in ambient temperature causes an increase in PV cell temperature. This leads to a decrease in installation performance [
51]. In the comparison of different mathematical models for estimating the PV cell temperature used in [
52], it was shown that wind speed reduces the negative influence of temperature on power generation by up to 10%. Ultimately, solar irradiance, ambient temperature, and wind speed are critical factors for accurate prediction, and neural network models can effectively integrate them [
53]. Taking into account these considerations, the variables selected for this study were solar irradiance (W/m2), ambient temperature (°C), and wind speed (m/s). They were selected for their physical relevance and direct impact on photovoltaic production, allowing for a homogeneous and robust comparison between the different models evaluated.
In order to evaluate the robustness and generalization capability of the predictive models under different meteorological conditions, data from a complete annual cycle were required. The dataset, sampled at five-minute intervals, therefore covered all seasons of the year and the climatic conditions representative of the geographical location of the study. The monitoring period spanned from May 2024 to June 2025.
As an example, the data corresponding to a week (14–21 June 2024) are shown in
Figure 2. Clear-sky conditions on 14 June, for instance, result in high irradiance peaks (>900 W/m2), while cloudy days like 18 June show significantly reduced values. Wind speed fluctuates between 0 and 5 m/s, with isolated gusts corresponding to changes in weather conditions. Maximum temperatures reach approximately 35 °C on 14 June, coinciding with clear skies and high irradiance, while minimum peak temperatures are observed during the cloudiest periods (about 22 °C on 19 June). The power generated reflects these patterns, reaching up to 3 kW on days with the highest solar irradiance and decreasing during periods of lower irradiance. These data highlight the strong influence of weather variability, particularly solar irradiance and temperature, on the PV system’s generation. Statistical information on the four recorded variables is provided in
Table 1.
The collected data underwent preprocessing to ensure their quality. During this stage, erroneous readings and records with missing values were systematically eliminated. These missing values were primarily caused by connectivity interruptions or temporary failures in data acquisition. Next, the data corresponding to each variable were normalized using the StandardScaler technique, which adjusts the data to have a mean of zero and a standard deviation of one. This normalization ensures that all variables are on a homogeneous scale. This prevents differences in magnitude between parameters from negatively affecting the neural networks’ learning process and contributes to more stable and efficient model convergence during training.
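As a minimal sketch, the cleaning and normalization steps described above can be expressed with scikit-learn’s StandardScaler; the small array and its column layout (irradiance, temperature, wind speed, power) are illustrative dummy data, not the actual dataset:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative preprocessing sketch: remove records with missing values,
# then standardize each variable to zero mean and unit standard deviation.
# The array contents and column order are dummy placeholders.
raw = np.array([
    [850.0, 31.2, 2.1, 2.35],
    [np.nan, 30.8, 1.9, 2.10],  # incomplete record -> dropped
    [420.0, 24.5, 3.4, 1.05],
    [15.0, 18.1, 0.7, 0.02],
])
clean = raw[~np.isnan(raw).any(axis=1)]  # keep only complete rows

scaler = StandardScaler()
scaled = scaler.fit_transform(clean)  # each column: mean 0, std 1
```

Standardizing per column keeps the very different magnitudes of irradiance (hundreds of W/m2) and wind speed (a few m/s) from dominating the gradient updates during training.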
The simulations were performed using the Python 3.13.2 programming language together with the PyTorch 2.6.0 framework, widely used for the development and training of deep learning models. The modeling and training process was performed on a 64-bit personal computer equipped with an Intel Core i7-10750H processor (Intel Corporation, Santa Clara, CA, USA; 2.6 GHz, 6 cores, and 12 threads) and a dedicated NVIDIA GeForce RTX 2060 graphics card (NVIDIA Corporation, Santa Clara, CA, USA).
2.3. Forecasting Models
When developing a digital twin, the first option is to model the system using mathematical equations that describe its dynamics. Even when such a mathematical model performs accurately, other models should be tested to determine whether they can improve accuracy and, if so, to identify the one with the best results. In this work, a mathematical model of the PV plant is compared with three neural networks to determine which performs best. The selected neural models are the Multilayer Perceptron (MLP), one of the most widely used models and a proven, accurate, and reliable tool for regression and classification problems, and two deep learning models: the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). LSTMs and GRUs have been used for translation and language-processing problems, as well as time-series forecasting, providing very good performance.
Neural networks have shown great potential as predictive models in the context of DTs [
19]. However, their implementation in real-time monitoring systems presents practical challenges. The most relevant challenges include the limited availability of training data in certain locations, the need for sufficient computational resources to train and run the models, and the existence of intermittent or unreliable connectivity in some application environments [
20]. These limitations can affect the frequency with which models are updated and their ability to adapt to changing conditions. Therefore, it is crucial to select efficient architectures and develop robust preprocessing and data transfer strategies to ensure system reliability [
23].
Both the mathematical model and the neural networks take solar irradiance, ambient temperature, and wind speed as inputs and provide a prediction of the generated DC power as output. Thus, each model receives an input vector made up of the values of these three variables at a given time point and provides one output: a prediction of the DC power that the PV system should deliver.
A neural network must be trained before it can perform any task. Therefore, the entire dataset must be divided into two subsets: one for training and one for validation. To properly perform the training and validation processes, this work randomly divided the entire dataset into 75% for training and 25% for validation using the hold-out method. This strategy is widely used in the literature and allows us to evaluate the models’ ability to process unseen data. This approach is particularly useful when the goal is to compare the performances of different architectures under consistent conditions. After training each model, performance evaluations were carried out on the same 25% of the data reserved for validation, both for the global metrics and for the seasonal analysis. This ensures that the results of each model are assessed comparably and objectively under the same conditions. It is worth noting that other training–validation partitions were tested (40–60% and 20–80%, as mentioned below when explaining the MLP training process), but the 75–25% partition yielded slightly better results.
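A minimal sketch of the hold-out partition described above, assuming scikit-learn’s train_test_split (the arrays are dummy placeholders for the real dataset):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hold-out split sketch: a random 75/25 train-validation partition,
# as in the protocol above. The arrays are dummy placeholders.
rng = np.random.default_rng(0)
X = rng.random((1000, 3))  # irradiance, ambient temperature, wind speed
y = rng.random(1000)       # measured DC power

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=42  # fixed seed for reproducibility
)
```

Fixing the random seed makes the partition reproducible, so every model is trained and validated on exactly the same records.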
The mathematical model only requires its parameters to be adjusted using information provided by the system’s manufacturers. Therefore, it does not require training; only the validation set must be simulated. This allows us to evaluate the model’s performance under the same conditions as the neural networks.
2.3.1. Mathematical Model
The mathematical model chosen to simulate the photovoltaic system in this work was selected based on its ability to incorporate module efficiency, enabling the simulations to accurately reflect the system’s real behavior [
54].
The first step in defining the mathematical model is to adapt the available information—solar irradiance, ambient temperature, and wind speed—to the variables of that model. As the selected model uses the irradiance and the PV cell temperature as internal variables, the irradiance is taken directly from the available data; however, the PV cell temperature must be estimated from the ambient temperature and wind speed (Equation (1)). In this equation, T_c represents the cell temperature, T is the ambient temperature, T_ref is a reference temperature, G is the solar irradiance, w is the wind speed, and ω is a coefficient that depends on the panel technology, which in our case is monocrystalline [55].
Once the cell temperature is obtained, the equations providing the cell’s current and voltage can be defined (Equations (2) and (3)) [54]. In these equations, I and V are the cell’s current and voltage, which depend on the solar irradiance and the cell temperature; I_ref and V_ref represent the cell’s current and voltage, respectively, obtained from the manufacturer’s datasheet at the maximum power point under reference conditions (G_ref = 1000 W/m2 and T_ref = 25 °C); their values for the panel used in this work were taken from the manufacturer’s datasheet. Finally, α and γ are the coefficients of variation of the current and the power with temperature, respectively. According to the manufacturer’s datasheet for the analyzed panel, their values are 0.044%/°C and −0.35%/°C, respectively [56].
As Equations (2) and (3) provide the current and voltage of a single cell, the corresponding current and voltage of the PV system are obtained by multiplying them by the number of cells in parallel (for the current) and in series (for the voltage): I_sys = N_p · I and V_sys = N_s · V. Therefore, the power provided by the system is P = V_sys · I_sys.
In this mathematical model, the voltage drop in the PV cable from the PV panel to the DC inverter (which translates into a power loss due to the Joule effect) is neglected. It is assumed that the installation company correctly sized the PV cable cross-section so that the voltage drop is less than 1.5%, in accordance with the Spanish instruction ITC-BT 40 [
57].
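As an illustration only, a model of this general shape can be sketched in Python. The temperature correlation and the Osterwald-type scaling below are assumed stand-ins for the exact equations of the cited references, and i_mp and v_mp are placeholders for the module’s datasheet values; only the temperature coefficients and reference conditions quoted above are carried over:

```python
# Sketch of a datasheet-based PV model with this general structure.
# The temperature correlation and the Osterwald-type scaling are
# illustrative assumptions, not the exact equations of the cited model;
# i_mp and v_mp are placeholders for the module's datasheet values.
ALPHA_I = 0.00044    # current temperature coefficient, 0.044 %/degC
GAMMA_P = -0.0035    # power temperature coefficient, -0.35 %/degC
G_REF, T_REF = 1000.0, 25.0  # STC reference conditions

def cell_temperature(t_amb, g, wind, omega=1.0):
    """Cell temperature from ambient temperature, irradiance, and wind
    (Skoplaki-style correlation, used here only as an assumption)."""
    return t_amb + omega * (0.32 / (8.91 + 2.0 * wind)) * g

def pv_power(g, t_amb, wind, i_mp, v_mp, n_series=6, n_parallel=1):
    """DC power of the array (6 modules in series in this installation)."""
    t_c = cell_temperature(t_amb, g, wind)
    i = i_mp * (g / G_REF) * (1.0 + ALPHA_I * (t_c - T_REF))
    v = v_mp * (1.0 + (GAMMA_P - ALPHA_I) * (t_c - T_REF))
    return (n_parallel * i) * (n_series * v)
```

The structure mirrors the text: irradiance is used directly, cell temperature is derived from ambient temperature and wind, and the system-level power is the product of the series and parallel scalings of the single-module current and voltage.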
2.3.2. Multilayer Perceptron (MLP)
The Multilayer Perceptron (MLP) [
58] is probably the most basic neural model used today. It has become a classic because it was one of the first models that could provide accurate and reliable results for classification and regression tasks. Despite its simplicity, it is still one of the most commonly used models for applications that are not closely related to human intelligence. In fact, it has been proven that an MLP with one hidden layer is a universal approximator, provided that it has enough neurons [
59]. It is a multilayer, feedforward structure, meaning its processing elements (neurons) are arranged into layers, with information flowing from the input layer to the output layer for processing (see
Figure 3). There is no feedback between neurons. In this multilayer structure, the input layer is not an actual processing layer but simply the vector of input data to the network. Similarly, the output layer is not intended to perform complex processing but rather to map the network outputs to the range of the data used.
The information the network receives is sequentially processed by all layers. Each neuron in one layer receives the outputs of all the neurons in the preceding layer (this is why these layers are usually known as fully connected layers) and processes all this information by means of a transfer function:

y_j^l = f( Σ_i w_ij^l · y_i^(l−1) + b_j^l )

In this expression, y_j^l represents the output of neuron j in layer l; y_i^(l−1) is the output of neuron i in layer l − 1; w_ij^l is the weight that defines the strength of the connection between the two neurons; and b_j^l is a bias term. f(·) is usually a sigmoid function (outputs between 0 and 1) or a hyperbolic tangent function (outputs between −1 and 1) for the neurons in the hidden layers, although other functions such as the ReLU (which is 0 for inputs lower than 0 and increases linearly for inputs greater than 0) can also be used. The transfer function for the neurons in the output layer is usually linear.
Neural networks can accurately reproduce the behavior of many complex systems because they can learn this behavior from data. To acquire this capability, the networks must be trained before being used for their intended task. Therefore, the available dataset must be divided into two subsets: one for training and one for validation. Usually, divisions ranging from 40–60% (training–validation) to 20–80% are used. The MLP training process is carried out using the well-known backpropagation algorithm [
44]. To apply the algorithm, the training dataset is organized as pairs of inputs (patterns) and their corresponding desired outputs. The patterns are then sequentially presented to the network, which processes them and provides the corresponding outputs. These outputs are compared with the desired outputs to measure the prediction error. The sum of these errors is then backpropagated to allow the algorithm to minimize its value by adjusting the weights of all neurons. This process is repeated until the desired level of accuracy is achieved.
In this work, an MLP with a single hidden layer with 128 neurons was used; the activation function of the neurons was the ReLU. The input layer has three components that correspond to the selected meteorological variables: solar irradiance, ambient temperature, and wind speed. The output layer has a single neuron that provides the prediction of the generated power. We tested different numbers of neurons in the hidden layer, but the configuration with 128 neurons performed better.
The MLP was trained using the Adam algorithm, a procedure that optimizes the backpropagation algorithm, with an initial learning rate of 0.001 and a batch size of 32 samples. The mean-squared error (MSE), which is described in
Section 2.4 below, was used as the loss function. To prevent overfitting and optimize the number of epochs, an early stopping strategy was implemented and executed after ten consecutive epochs without improvement in the loss function. The algorithm stops after a maximum of 200 epochs.
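Under the stated configuration, a PyTorch sketch of the MLP and its training loop might look as follows; the names and the data loader are illustrative, not the authors’ exact code:

```python
import torch
import torch.nn as nn

# Architecture as described: 3 inputs -> 128 ReLU hidden units -> 1 output.
class MLP(nn.Module):
    def __init__(self, n_inputs=3, n_hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_inputs, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

# Minimal training-loop sketch with Adam, MSE loss, and early stopping
# after 10 epochs without improvement (data loading is omitted).
def train(model, loader, max_epochs=200, patience=10):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    best, wait = float("inf"), 0
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for xb, yb in loader:        # batches of 32 samples
            opt.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            opt.step()
            epoch_loss += loss.item()
        if epoch_loss < best - 1e-6:  # improvement check
            best, wait = epoch_loss, 0
        else:
            wait += 1
            if wait >= patience:      # early stopping
                break
    return model
```

A sketch like this trains on (batch, 3) tensors of standardized irradiance, temperature, and wind speed, with the measured DC power as the regression target.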
2.3.3. Long Short-Term Memories
The LSTM network [
60] is a complex neural model with a multilayer structure and feedback connections. The neurons in each layer can be organized into blocks containing multiple elements. Each neuron receives inputs from the preceding layer and feedback from the other neurons in the same layer (
Figure 4). LSTM neurons also store a type of “memory” of their past states, which is processed with inputs and feedback [
46]. Two control gates decide which portion of the neuron’s inputs (the new inputs and the feedback), controlled by the input gate, and which portion of the inner state (“memory”), controlled by the forget gate, will be processed to create a new inner state:

i_t = σ(W_i · z_t + b_i)
f_t = σ(W_f · z_t + b_f)

In these equations, x_t represents the new input data; h_(t−1) is the feedback from the neurons in the same layer; W_i and W_f are weight matrices; b_i and b_f represent biases; and σ(·) is a sigmoid function. To provide clearer and more compact expressions, both the new inputs, x_t, and the feedback, h_(t−1), are arranged into a single input vector, z_t.
The neuron calculates a temporary new inner state, c̃_t, from its inputs by means of the following:

c̃_t = tanh(W_c · z_t + b_c)

In this expression, W_c and b_c are the corresponding weights and bias. A fraction of this temporary inner state is then combined with a fraction of the stored one, both controlled by gates i_t and f_t, to obtain the new inner state:

c_t = f_t ⊙ c_(t−1) + i_t ⊙ c̃_t

where ⊙ denotes the element-wise product.
Finally, the neuron’s output is a fraction of this inner state after being processed by a hyperbolic tangent to limit its value between −1 and +1. The fraction of this value to be provided as the output is decided by a third gate (o_t), the output gate:

\[ o_t = \sigma(W_o z_t + b_o) \quad (12) \]
\[ y_t = o_t \odot \tanh(c_t) \quad (13) \]

The variables, parameters, and function (σ( )) in Equation (12) have the same meanings as those in Equations (8) and (9).
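The gate structure described above can be collected into a single forward step. The following NumPy sketch is illustrative: the dictionary-based weight layout and variable names are assumptions, not the paper’s implementation.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_cell_step(x_t, y_prev, c_prev, W, b):
    """One LSTM layer step following the gate equations above.
    W and b hold weights/biases for the input (i), forget (f),
    candidate (c), and output (o) transformations."""
    z_t = np.concatenate([x_t, y_prev])       # combined input vector z_t
    i_t = sigmoid(W["i"] @ z_t + b["i"])      # input gate
    f_t = sigmoid(W["f"] @ z_t + b["f"])      # forget gate
    c_tilde = np.tanh(W["c"] @ z_t + b["c"])  # temporal inner state
    c_t = f_t * c_prev + i_t * c_tilde        # new inner state ("memory")
    o_t = sigmoid(W["o"] @ z_t + b["o"])      # output gate
    y_t = o_t * np.tanh(c_t)                  # output, bounded in (-1, 1)
    return y_t, c_t
```

Because the output passes through tanh and is scaled by a sigmoid gate, its magnitude is always strictly below 1.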
The LSTM is trained with a modified version of backpropagation, adapted to take into account both the feedback and the “memory” present in its structure [60]. Two variants are employed according to the nature and role of the different weights and biases: truncated backpropagation through time (BPTT) for the output units and output gates, and real-time recurrent learning (RTRL) for the neuron’s inputs, input gates, and forget gates.
To explore possible temporal correlations in each input variable’s data, the LSTM’s input vector consisted of the six historical data points of each input variable preceding the PV power value to be predicted. In other words, sequences of six historical data points for each meteorological variable were provided in 30 min windows (six 5 min intervals). Thus, the input vector comprises 18 components. The model’s structure consists of a hidden LSTM layer with 50 neurons and a dropout layer, which randomly switches off neurons to create a more efficient network and prevent overfitting. The dropout rate was fixed at 0.2. This is followed by an output layer with a single neuron that provides the power prediction. Since the dropout layer can disable neurons, only the configuration with 50 neurons was tested: the model itself can effectively reduce the number of active neurons if doing so improves accuracy.
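Assuming the raw series are aligned 5 min samples, the 30 min input windows (six lags per variable, 18 components in total) can be built as follows; the function and array names are illustrative.

```python
import numpy as np

def build_windows(features, power, n_lags=6):
    """Build recurrent-model inputs: for each target power value, stack the
    n_lags preceding samples of every feature column into one flat vector.

    features : array of shape (T, n_vars) -- meteorological series
    power    : array of shape (T,)        -- PV power series
    Returns X of shape (T - n_lags, n_lags * n_vars) and y of shape (T - n_lags,).
    """
    T, n_vars = features.shape
    # order="F" groups the flat vector by variable: six lags of variable 1,
    # then six lags of variable 2, and so on.
    X = np.stack([features[t - n_lags:t].ravel(order="F")
                  for t in range(n_lags, T)])
    y = power[n_lags:]
    return X, y
```

With three meteorological variables and six lags, each row of X has the 18 components described above, and each y value is the power sample immediately following its window.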
As with the MLP, the training algorithms employed the Adam optimizer with a learning rate of 0.001 and the mean-squared error (MSE) as the loss function. The early stopping criterion waits 10 epochs before ending the training process, which concludes after a maximum of 200 epochs in any case.
2.3.4. Gated Recurrent Unit
With the aim of defining a simpler structure than the LSTM model while retaining its computational capability, a simplification has been proposed: the Gated Recurrent Unit (GRU) [47]. A first simplification assumes that a single gate controls the combination of the new inputs and the stored “memory”, providing a balanced combination of both:

\[ u_t = \sigma(W_u z_t + b_u) \]
\[ c_t = u_t \odot c_{t-1} + (1 - u_t) \odot \tanh(W_c z_t + b_c) \]

A second simplification consists of defining the neuron’s output as only a fraction of its new inner state:

\[ y_t = o_t \odot c_t \]
This model was proposed for speech recognition [61], although it has also been applied to time series forecasting [62].
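Assuming the same combined input vector and notation as in the LSTM section, the two simplifications above can be sketched as a single update step. The gate names and dictionary-based weight layout are illustrative assumptions, not a particular library’s implementation.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_cell_step(x_t, y_prev, c_prev, W, b):
    """One step of the simplified recurrent cell described above: a single
    update gate u_t blends the stored "memory" with the candidate state,
    and an output gate o_t provides a fraction of the result."""
    z_t = np.concatenate([x_t, y_prev])         # combined input vector
    u_t = sigmoid(W["u"] @ z_t + b["u"])        # single balancing gate
    c_tilde = np.tanh(W["c"] @ z_t + b["c"])    # candidate inner state
    c_t = u_t * c_prev + (1.0 - u_t) * c_tilde  # balanced combination
    o_t = sigmoid(W["o"] @ z_t + b["o"])        # output gate
    y_t = o_t * c_t                             # fraction of the new state
    return y_t, c_t
```

Compared with the LSTM step, one gate and one tanh evaluation disappear, which is where the GRU’s runtime advantage reported later comes from.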
The GRU model used in this work has a structure similar to that of the LSTM model described in the previous section; the only difference is that a GRU layer is used instead of an LSTM layer. The model consists of one hidden GRU layer with 50 neurons, a dropout layer with a 0.2 dropout rate, and a linear output layer with one neuron. This configuration prioritizes efficiency in learning sequential patterns without compromising predictive capability.
As with the LSTM, the input data have 18 components: three blocks of six historical data points, one for each meteorological variable considered. The training algorithm employs the Adam optimizer with a learning rate of 0.001 and the mean-squared error (MSE) as the loss function. The training algorithm runs a maximum of 200 epochs, stopping early if the MSE does not decrease after 10 epochs.
2.4. Model Assessment
To rigorously assess the accuracy and explanatory power of the four predictive models proposed, four widely recognized statistical metrics were calculated: the mean-squared error (MSE), which quantifies the average magnitude of the squared errors; the root-mean-squared error (RMSE), the square root of the MSE, which gives a value in the original units of the data and facilitates direct interpretation as the standard deviation of the prediction errors; the mean absolute error (MAE), which measures the average of the absolute differences between actual and estimated values; and the coefficient of determination (R2), which expresses the proportion of the variance of the dependent variable explained by the model. Together, these metrics provide a comprehensive view of performance by considering both the magnitude of the errors and the model’s ability to capture the variability of the data. Their mathematical expressions are as follows:

\[ \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2 \]
\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2} \]
\[ \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left| y_i - \hat{y}_i \right| \]
\[ R^2 = 1 - \frac{\sum_{i=1}^{N}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{N}(y_i - \bar{y})^2} \]

In these expressions y_i represents an actual data point, ŷ_i is its predicted value, and ȳ is the mean of all the data; N represents the total number of observations. For the MSE, RMSE, and MAE, lower values indicate better model accuracy. For the R2, values closer to 1 indicate better model performance.
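The four metrics can be computed directly from the validation data; the following NumPy sketch (the function name is illustrative) implements the definitions above.

```python
import numpy as np

def assessment_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, and R^2 for actual values y_true and predictions y_pred."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    mse = np.mean(errors ** 2)
    rmse = np.sqrt(mse)                       # same units as the data
    mae = np.mean(np.abs(errors))
    ss_res = np.sum(errors ** 2)              # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                # fraction of variance explained
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}
```

For example, predicting [1, 2, 3, 5] against actual values [1, 2, 3, 4] gives MSE = 0.25, RMSE = 0.5, MAE = 0.25, and R2 = 0.8.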
3. Results
Table 2 shows the results of comparing the performances of the four models using the MSE, RMSE, MAE, and R2 metrics on the validation set. The single-layer MLP performed best overall, achieving the lowest mean-squared error (MSE = 0.0389) and the highest coefficient of determination (R2 = 0.931). This confirms its ability to accurately capture the relationships between the meteorological variables and the generated PV power. This performance surpasses that of the traditional mathematical model, which, despite also exhibiting good values (R2 = 0.914; MAE = 0.0752), is less precise in contexts with greater climatic variability, such as overall yearly predictions.
In contrast, the LSTM and GRU models, which are designed to process complex temporal dependencies, performed worse. They produced higher error metrics (MSE = 0.0588 for the LSTM and MSE = 0.0593 for the GRU) and less satisfactory fits (R2 = 0.896 for the LSTM and R2 = 0.895 for the GRU). The performances of these models can be interpreted by analyzing the dynamics of the data used. In this context, the relationship between the weather variables and PV power production varies smoothly over short time intervals, so much of the system’s behavior can be captured by models that consider only the most recent values of these variables. Incorporating recurrent mechanisms, such as those in the LSTM and GRU architectures, which were designed to capture strong temporal dependencies like those in language or some time series, therefore does not appreciably improve performance over simpler models for problems like the one studied here. In fact, these complex models increase the computational cost and may tend to induce overfitting.
Consequently, for real-time PV power forecasting scenarios, where data are updated with high frequency but there are no significant time dependencies, the Multilayer Perceptron (MLP) is the most efficient, robust, and accurate option.
To provide a graphical representation of the models’ performances that can aid the analysis, scatter plots of predictions versus actual data were obtained for each model. These plots, computed using only the data reserved for validation, are shown in Figure 5. The proximity and concentration of the points along the diagonal indicate the degree of predictive accuracy; closer alignment implies a lower prediction error and a higher coefficient of determination (R2).
The MLP model exhibits the highest density of points tightly clustered around the diagonal (R2 = 0.931), demonstrating its superior goodness of fit and predictive capability across the entire data range. The mathematical model, while slightly less precise than the MLP (R2 = 0.914), achieves a robust performance and maintains notable accuracy.
In contrast, both recurrent neural network architectures (LSTM and GRU) show a clearly wider dispersion of points, especially at intermediate and higher power values, indicating reduced predictive fidelity under these conditions (R2 = 0.896 for the LSTM and R2 = 0.895 for the GRU). This is consistent with the error metrics in Table 2 and may be due to the nature of the dataset, which is dominated by relatively static and direct relationships between the meteorological variables and the PV power. Unlike in scenarios with complex temporal relationships, the recurrent networks did not provide additional benefits here. Their time-dependent structure may even introduce a tendency toward overfitting or increase errors in cases with limited or no time dependencies.
Seasonal Analysis
Seasonal analysis is essential for assessing the robustness and adaptability of predictive photovoltaic (PV) generation models because solar energy production and algorithm performance vary significantly throughout the year due to changes in weather conditions and incident radiation [50].
To accomplish this, we evaluated the performances of the four models on a seasonal basis by dividing the entire dataset into four seasons: spring, summer, autumn, and winter. Each season was used to independently train and validate each model, just as was performed with the entire dataset. This allows for a homogeneous comparison of the accuracy and explanatory power of the models by analyzing their performances within a set of data with similar weather behavior.
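One straightforward way to perform this seasonal split, assuming timestamped records, is sketched below. The month-based season boundaries are an assumption for illustration; the paper does not state the exact convention used.

```python
from datetime import datetime

# Month-based season assignment (Northern Hemisphere, meteorological
# seasons) -- an illustrative convention, not necessarily the paper's.
SEASON_BY_MONTH = {12: "winter", 1: "winter", 2: "winter",
                   3: "spring", 4: "spring", 5: "spring",
                   6: "summer", 7: "summer", 8: "summer",
                   9: "autumn", 10: "autumn", 11: "autumn"}

def split_by_season(records):
    """Group (timestamp, sample) pairs into four seasonal subsets, each of
    which can then be independently split into training/validation sets."""
    seasons = {"winter": [], "spring": [], "summer": [], "autumn": []}
    for ts, sample in records:
        seasons[SEASON_BY_MONTH[ts.month]].append(sample)
    return seasons
```

Each seasonal subset is then treated exactly like the full dataset: its own training run and its own validation split.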
The results of the seasonal analysis are presented numerically in Table 3 and graphically in Figure 6 for easier interpretation. Overall, the single-layer MLP model shows the highest consistency and accuracy across all seasons. It achieves the lowest values of the MSE, RMSE, and MAE, as well as the highest coefficients of determination (R2), in three of the four seasons (spring, summer, and winter). It clearly outperforms the other models in spring (MSE = 0.0631; R2 = 0.9204), the season with the most fluctuating weather conditions. Nevertheless, the mathematical model performs slightly better than the MLP in autumn, when weather conditions are more stable. These results demonstrate the MLP’s ability to effectively capture the relationships between the meteorological variables and the generated power despite seasonal variability: it provides more balanced and accurate results across all weather conditions, whereas the mathematical model can only provide valuable predictions for seasons with relatively stable weather.
It is worth noting that in the season with the most stable weather conditions, summer, the four models were able to notably increase their accuracies: they all achieved values of the R2 very close to 1, with the MLP providing slightly better metrics (MSE = 0.0032; R2 = 0.9957).
In winter and spring, the mathematical model, LSTM, and GRU achieved clearly worse results than the MLP. The MLP has significantly lower MSE values (MSE = 0.0370 in winter and MSE = 0.0631 in spring) and higher R2 values (R2 = 0.9251 in winter and R2 = 0.9204 in spring) than those of the other three models. Notably, while the other three models experience a notable drop in forecasting accuracy due to the changing weather conditions typical of these seasons, the MLP is able to provide relatively accurate predictions. However, while the mathematical model slightly outperforms the LSTM and GRU in spring, it underperforms these two models in winter, demonstrating its limitations in dealing with changing weather conditions.
When comparing the forecasting performances of several models, especially those that will be implemented in real-time systems, it is important to consider their computational performances, i.e., how long it takes the model to provide a prediction.
Table 4 presents a comparative summary of the training and validation times measured for each model. For simplicity, only the values obtained with the entire annual dataset are used. Three times are provided: the training time, the validation time, and the sum of both. Since the mathematical model does not require training, only the validation time is provided. The validation time represents the time needed to predict the entire validation dataset.
Of the three neural network models, the MLP is the most time-efficient, requiring 41.84 s for training and practically instantaneous validation (0.09 s). The total time consumption is 41.93 s. This performance makes the MLP well-suited for implementations with moderate computational resources that require frequent updates. It is worth noting that, while training can be carried out offline, validation must be carried out in real time. Therefore, the validation time is the most significant factor in evaluating a model’s performance in real-time applications.
In contrast, recurrent models perform worse in terms of runtime. The GRU requires less time than the LSTM: 91.99 s for training, 1.04 s for validation, and 93.03 s in total versus 126.88 s for training, 1.10 s for validation, and 127.98 s in total. These results align with expectations, as the GRU has a simpler structure and fewer parameters than the LSTM. However, both models require substantially more prediction time than the MLP, which may limit their use in scenarios where speed and computational efficiency are priorities.
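In a real-time deployment, only the single-sample latency matters, while the times reported here cover the entire validation set. Both can be measured with the standard library as sketched below; `predict_all` is a placeholder for a real model’s prediction call, not the paper’s code.

```python
import time

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

def predict_all(samples):
    """Placeholder for a trained model's prediction over a batch."""
    return [2 * s for s in samples]

# Illustrative use: time a whole-validation-set pass versus a single sample.
_, t_full = timed(predict_all, list(range(10_000)))
_, t_one = timed(predict_all, [1])
```

`time.perf_counter` is preferred over `time.time` for interval measurement because it is monotonic and has the highest available resolution.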
Notably, all three neural models outperformed the mathematical model, whose prediction time was 1.86 s, by a significant margin, demonstrating that, once trained, neural models are computationally simpler structures.
4. Discussion
The results described in the preceding section show that the MLP yields more accurate and reliable predictions than the recurrent neural architectures (LSTM and GRU) in all seasons. The MLP performs similarly to the mathematical model under stable weather conditions (summer and autumn) but significantly outperforms it in seasons with changing conditions; moreover, in some of these cases (winter), the mathematical model performs even worse than the LSTM and GRU. The MLP stands out as the best forecasting option due to its consistent and balanced accuracy, providing the best metrics under the worst weather conditions, although the mathematical model remains competitive in scenarios with low variability, such as autumn.
It is worth noting that the four models yielded better accuracies when trained and validated using seasonal data than when the entire dataset was used. This is not surprising, since, when using the entire dataset, models must deal with data representing different weather behaviors, which complicates training and prediction. However, when seasonal datasets are used, the weather conditions are more stable, and the models can process the information more easily.
Including historical data as inputs to the LSTM and GRU models used in this study did not improve accuracy; in fact, it increased the prediction errors compared to the simpler MLP structure defined in this work. When the data carry no significant time dependencies, feeding more inputs through a more complex recurrent structure not only fails to improve accuracy but tends to degrade it, as the results obtained show.
Finally, it should be noted that the prediction times are not critical for any of the four models tested when used in the actual system analyzed in this work, because the monitoring system captures data every five minutes and the models require only a very short time to provide a prediction. Note that the times shown in Table 4 correspond to forecasting the entire validation dataset; in operation, the model provides a single prediction when the corresponding meteorological values are entered, a process that takes significantly less time.
Therefore, it can be concluded that neural networks are valuable tools for implementing DTs. As demonstrated in this work, they can outperform accurate mathematical models. Nevertheless, no single model provides good results in every case; several models must be tested to determine the best one for each particular application. New neural models [42], or combinations of neural models with other tools, i.e., hybrid models [41], are therefore valuable options that deserve to be tested. It is worth noting that more complex models do not always perform better, as some studies have pointed out [39,40] and this study has shown.
5. Conclusions
This study demonstrates that predicting PV power in real environments using high-resolution data and annual coverage benefits significantly from a rigorous comparative evaluation between traditional mathematical models and several neural network architectures. The results obtained show that the single-layer MLP model provides the best overall performance in terms of accuracy, with a lower MSE and higher R2. It outperforms both the mathematical model and the recurrent LSTM and GRU architectures.
Seasonal analysis reveals that the MLP maintains high consistency and accuracy across all seasons, achieving its highest accuracy in summer and autumn. The traditional mathematical model closely follows the MLP under conditions of low variability, such as in summer and autumn, but it does not perform as well in spring and winter. Conversely, the recurrent LSTM and GRU architectures underperform in most seasons, suggesting that their greater ability to capture temporal dependencies does not provide an advantage in contexts where PV generation patterns have low temporal complexity.
These findings highlight the importance of tailoring the selection and implementation of predictive models to the specific operational and seasonal characteristics of each PV installation. In particular, integrating the MLP into digital twins of PV installations is presented as an efficient, accurate, and cost-effective solution that facilitates real-time monitoring, optimization, and predictive maintenance, key aspects of smart energy management in sustainable buildings.
The methodological approach adopted, which includes a homogeneous comparison of models under standardized metrics and detailed seasonal analysis, contributes to closing the existing gap in the literature on comprehensive model evaluation for advanced energy management applications.
These findings underscore the importance of conducting thorough seasonal analyses when validating the DTs of photovoltaic (PV) installations. Such analyses identify the strengths and limitations of each approach under real operating conditions, and seasonal comparisons provide essential information for selecting and adjusting models according to the installation’s climatic and technological context, contributing to more efficient and resilient energy management throughout the year. This information is crucial for designing advanced maintenance, optimization, and early warning strategies for smart PV systems. Overcoming the challenges of seasonality and climate variability is essential for predicting photovoltaic energy with neural networks: models that explicitly account for these factors significantly improve accuracy and robustness, two essential aspects for application in distributed generation and energy management systems.
These predictive models enable a DT that allows for the proactive and optimized management of smart buildings by anticipating photovoltaic generation and adapting energy consumption in real time. This capability translates into reduced costs and emissions, as well as improved urban environment comfort and sustainability. Additionally, integrating DTs with energy storage systems and load management strategies creates new opportunities for balancing supply and demand, increasing system resilience, and maximizing renewable energy use. These synergies contribute together to a smart ecosystem geared toward increasingly autonomous and efficient buildings that can respond dynamically to environmental conditions and consumption needs.