1. Introduction
Renewable energy is an alternative to traditional power plants [1,2,3,4,5]. In recent times, more specifically in the last century, renewable power has earned recognition over conventional power production [6]. One of the main advantages of solar photovoltaic power plants is their distributed generation [7]. In addition to solar photovoltaic energy generation [2,3,8], there are some other ways of harvesting solar energy, such as solar thermal energy generation, solar thermal water collection, and photocatalysis activated by sunlight, among others [7,9].
The I-V characteristic is required in experiments for the characterization of solar cells [10,11]. This curve gives easy access to various performance parameters, such as the fill factor and the maximum power point, among others [10,11]. The I-V characteristics must be measured carefully and accurately, since it is crucial to be able to distinguish cells correctly in terms of their efficiency [10]. To assess the I-V parameters precisely, measurements under different conditions are necessary; the most relevant conditions are the associated irradiance and temperature [12].
The I-V characteristic of a solar cell represents the electrical response of the device under illumination. It illustrates how the output current varies with the applied voltage and is essential for understanding the cell’s performance. At one end of the curve (short-circuit condition), the voltage is zero and the current reaches its maximum value, defined as the short-circuit current ($I_{sc}$). At the other end (open-circuit condition), the current is zero and the voltage reaches its maximum value, known as the open-circuit voltage ($V_{oc}$) [13].
The shape of the I-V curve is governed by fundamental physical mechanisms, including the photogenerated current, diode ideality, and the internal resistances of the device (series and shunt) [10,11]. From this curve, key performance parameters can be extracted, such as the fill factor ($FF$), the maximum power point ($P_{max}$), and the overall energy conversion efficiency [14]. These parameters are highly sensitive to environmental conditions such as temperature and irradiance, making the I-V curve an indispensable tool for characterizing the real-world behavior of photovoltaic devices [15].
The $I_{sc}$ is the current delivered when the voltage across the terminals is zero, and it is directly influenced by the incident irradiance. The $V_{oc}$ is the voltage when no current flows through the external circuit ($I = 0$), and it depends primarily on the material properties and temperature of the cell.
The $P_{max}$ corresponds to the point on the I-V curve where the product of current and voltage is maximized, representing the operating condition at which the solar cell delivers its highest power output [2,14].
The $FF$ is a dimensionless figure of merit that indicates the “squareness” of the I-V curve. It is defined as the ratio between the actual maximum power and the theoretical power that would be obtained if the device operated simultaneously at $V_{oc}$ and $I_{sc}$ [13]. The $FF$ is calculated as:
$$FF = \frac{P_{max}}{V_{oc}\, I_{sc}} \qquad (1)$$
A higher $FF$ indicates lower internal losses and better overall device quality. Together, $I_{sc}$, $V_{oc}$, $P_{max}$, and $FF$ are fundamental parameters for evaluating the efficiency and performance of photovoltaic technologies.
In general, the solar panel manufacturing industry uses specific test conditions, namely standard test conditions (STCs), which correspond to an irradiance of 1000 W/m² and a cell temperature of 25 °C [15]. However, performance at these optimal ratings does not allow us to infer how a cell will perform under varying environmental conditions, as many factors may interfere with the irradiance and temperature conditions [16]. For instance, Figure 1 illustrates the dependence of power on temperature at constant illumination.
The power–voltage (P-V) curve is calculated from the I-V curve: each point of the I-V curve (a pair of voltage and current values) is used in Equation (2) to find the power, where the variable V is the voltage and I is the current:
$$P = V \cdot I \qquad (2)$$
The result is shown in Figure 1.
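As a minimal illustration of Equations (1) and (2), the sketch below computes the P-V curve and the main figures of merit from arrays of measured voltage and current; the array names and sample values are hypothetical stand-ins for one measured I-V curve.

```python
import numpy as np

# Hypothetical I-V pairs for one measured curve (V in volts, I in amperes)
voltage = np.array([0.00, 0.10, 0.20, 0.30, 0.40, 0.50, 0.55, 0.60])
current = np.array([2.05, 2.04, 2.03, 2.01, 1.95, 1.60, 1.00, 0.00])

power = voltage * current                   # Equation (2): P = V * I
p_max = power.max()                         # maximum power point
i_sc = current[np.argmin(np.abs(voltage))]  # current at V = 0
v_oc = voltage[np.argmin(np.abs(current))]  # voltage at I = 0
ff = p_max / (v_oc * i_sc)                  # Equation (1): fill factor

print(f"Pmax = {p_max:.3f} W, Isc = {i_sc:.2f} A, Voc = {v_oc:.2f} V, FF = {ff:.3f}")
```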
To infer the impact on solar cell performance and to obtain reliable parameters that define its I-V model, it is necessary to apply a method for generating this curve in different scenarios of temperature and irradiance [16,17]. There are several mathematical models available in the literature that can describe this curve [18,19]. The main challenge of these models is the existence of transcendental variables, such as in the single diode model represented by Equation (3), which makes it difficult to determine their parameters. The parameters $I_{ph}$, $I_0$, $R_s$, $R_{sh}$, and n, associated with the cell construction or operating condition, demand intense computational resources because the single diode equation is transcendental and nonlinear [17,18,20,21].
$$I = I_{ph} - I_0\left[\exp\!\left(\frac{q\,(V + I R_s)}{n k T}\right) - 1\right] - \frac{V + I R_s}{R_{sh}} \qquad (3)$$
In this equation, I stands for the current (A) between the solar cell terminals, $I_{ph}$ for the photocurrent (A), $I_0$ for the diode reverse saturation current (A), q is a constant, the elementary charge, V stands for the voltage across the solar cell terminals, $R_s$ for the series resistance, $R_{sh}$ for the shunt resistance, n for the diode ideality factor, k is the Boltzmann constant, and T is the absolute temperature.
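Because Equation (3) is implicit in I, each point of the I-V curve has to be found numerically. The sketch below does this with a simple one-dimensional root search; the parameter values are hypothetical placeholders, not the values extracted in this work.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.constants import e as q, k  # elementary charge, Boltzmann constant

# Hypothetical single-diode parameters (illustrative only)
I_ph, I_0, R_s, R_sh, n, T = 2.0, 1e-9, 0.04, 500.0, 1.5, 298.15

def diode_residual(I, V):
    """Equation (3) rewritten as f(I) = 0 for a fixed terminal voltage V."""
    Vt = n * k * T / q
    return I_ph - I_0 * (np.exp((V + I * R_s) / Vt) - 1.0) - (V + I * R_s) / R_sh - I

# The equation is transcendental in I, so each point is obtained by a root search.
voltages = np.linspace(0.0, 0.6, 25)
currents = [brentq(diode_residual, -1.0, I_ph + 1.0, args=(V,)) for V in voltages]
```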
There are three main steps for obtaining these parameters: selection of appropriate equivalent models, mathematical model formulation, and accurate extraction of parameter values from the models [9].
The main environmental conditions that affect solar cell behavior are temperature and illumination. For this reason, there is a need to estimate the solar cell parameters of the single diode model for each temperature and illumination value associated with the respective I-V characteristic curve [22,23]. Due to issues related to accuracy and computational cost, there are many methods for modeling the I-V characteristic curve, and numerous studies have been conducted with the objective of developing new methods for modeling this curve [23,24]. For instance, most approaches rely on artificial neural networks, genetic algorithms, neuro-fuzzy inference systems, and particle swarm optimization [24,25].
Table 1 presents the typical trends of single-diode parameters with temperature and irradiance.
These trends are consistent with both theory and measurement. The photocurrent $I_{ph}$ increases nearly linearly with irradiance and slightly with temperature, due to temperature-induced increases in thermal carrier generation [26]. The saturation current $I_0$ has an exponential dependence on temperature, stemming from the energy-gap relation in the diode equation [27]. Parasitic resistances also vary: $R_s$ increases with temperature (higher resistivity) and marginally with irradiance, while $R_{sh}$ decreases under higher thermal loading, indicative of leakage pathways [28]. The ideality factor n shows minimal irradiance dependence but may increase slightly with temperature, reflecting changes in recombination mechanisms [26].
The field of optimization took shape in the 1940s, when the Simplex method was developed. This method was first created with the objective of solving linear optimization problems [29]. In the late 1980s, neural networks became the main method used in machine learning and artificial intelligence [30]. The first generation of neural networks was based on two types, called the multilayer perceptron and self-organizing maps. The multilayer perceptron is built from several “nodes” or “artificial neurons”. An illustration of an artificial neural network used in the multilayer perceptron is shown in Figure 2. The multilayer perceptron combines several of these neurons to solve problems (a combination of several neurons is called a network) [30]. These networks learn through a process called backpropagation, in which they adjust their own parameters based on the mistakes they make, improving with each attempt. The second type, self-organizing maps, are networks capable of learning on their own to identify patterns and organize information without receiving direct instructions about what is right or wrong.
Artificial neural networks (ANNs) are computational models inspired by the structure of the human brain, capable of solving complex problems involving nonlinear relationships [31]. In the context of solar cells, ANNs have been successfully applied to predict electrical behavior based on environmental variables.
A particular class of these models, known as recurrent neural networks (RNNs), is especially suited for sequential data, such as time series or characteristic curves that vary with boundary conditions like voltage, temperature, and irradiance. RNNs differ from traditional ANNs by incorporating feedback loops, allowing the network to retain information from previous steps. These networks are used in difficult tasks such as recognition, machine translation, image captioning, and modeling I-V curves under varying environmental conditions [31]. In RNNs, the information flow is slightly different from that of ANNs, as can be seen in Figure 3. In this case, there are two sets of weights: an input weight and a recurrent weight. The recurrent weight multiplies the output from the previous step, so each RNN neuron is connected to the next one [30,32]. As with ANNs, there is an activation function to add non-linearity, as can be seen in Figure 3. In training, the weights are adjusted using an algorithm called Backpropagation Through Time (BPTT), which is responsible for calculating the output error and adjusting the weights, taking into account all previous steps in the sequence.
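A compact way to see the role of the recurrent weight is a single forward step of a simple RNN cell written in NumPy; the dimensions and weight names below are illustrative and are not those of the trained models.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 8                        # e.g., inputs: voltage, temperature, irradiance

W_x = rng.normal(size=(n_hidden, n_in))      # input-to-hidden weights
W_h = rng.normal(size=(n_hidden, n_hidden))  # recurrent (hidden-to-hidden) weights
b = np.zeros(n_hidden)

def rnn_step(x_t, h_prev):
    """One simple-RNN step: the previous hidden state is fed back through W_h."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):       # a short illustrative input sequence
    h = rnn_step(x_t, h)                     # BPTT differentiates back through this loop
```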
There is a problem in simple RNNs known as the vanishing gradient problem, which is a challenge when they try to learn from very long sequences of data. During training, RNNs adjust the error using BPTT. However, when the data sequence is very long, the adjustments propagated back to the earliest steps become smaller and smaller, until they practically vanish. As a result, the network can no longer learn or remember information that occurred early in the sequence, which impairs its ability to capture long-term relationships [32,33,34]. The long short-term memory (LSTM) is an alternative type of RNN that is more popular due to its architecture, which solves this problem of the simple RNN [32,33,34].
The LSTM has a structure built around ‘gates’; these gates are responsible for the memory of the neuron [34]. The neurons of an LSTM are laid out with three types of self-connected gates, which allow the reading, writing, and removal of information in memory [32]. The architecture of the LSTM is illustrated in Figure 4, which shows the calculations carried out within an LSTM cell, with the three gates represented. One way to understand these three gates is that they work as filters. The forget gate decides which data should be erased from memory (a behavior produced by the activation function and the associated calculation), the input gate decides what should be stored in memory, and the output gate decides what from memory should be used. Like the simple RNN, this cell receives data from the previous step, but the LSTM has two states, a cell state and a hidden state (each with its own weights, as indicated in Figure 4), in addition to the input of the current period with its own weight. The LSTM also has hidden weights associated with the gates, so there are multiple weights to adjust in each LSTM cell. LSTMs likewise use BPTT to adjust the weights during training.
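For reference, a standard formulation of the gating described above (the notation is the conventional one and may differ from the symbols used in Figure 4) is:
$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c)\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
$$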
LSTM is very accurate, but it has a high computational cost, since there are several weights to be adjusted. One way to improve on the LSTM was the development of the Gated Recurrent Unit (GRU), which can be considered a variant of the LSTM proposed by Chung in 2014 [35]. The difference between the LSTM and the GRU lies in how their gates control the information flow; the GRU has two gates, called reset and update [35,36,37]. The update gate decides how much of the memory should be maintained and how much of the new information should be used; it is like saying “this is still important, let us keep remembering it”. The second gate is the reset gate; it decides how much of the old memory should be kept and how much new information should be used; it is like saying “that does not matter anymore, let us start from scratch here” [36].
Figure 5 illustrates the GRU, where another difference between the LSTM and the GRU can be seen: the GRU keeps only a single state, which is combined with the data of the current period, with Tanh playing the role of the activation function [36].
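In the same conventional notation, the GRU update can be written as (again, the symbols may differ from those in Figure 5):
$$
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) && \text{(update gate)}\\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) && \text{(reset gate)}\\
\tilde{h}_t &= \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
$$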
There are numerous options for activation functions, such as Sigmoid, Tanh, ReLU, Linear, and others. In general, these functions are responsible for introducing nonlinearity into the neural network. They work by transforming the input values: for example, ReLU sets the output to zero if the input value is negative, the Sigmoid maps the input to the range zero to one, the Tanh maps the input to the range negative one to one, and so on.
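The standard definitions of the three most common of these functions are:
$$
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad \mathrm{ReLU}(x) = \max(0, x).
$$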
In this work, LSTM and GRU architectures with a bidirectional method were employed to predict the output current of a polycrystalline silicon solar cell based on input voltage, temperature, and irradiance, ultimately modeling the full I-V characteristic. These architectures were selected due to their proven robustness in handling nonlinear experimental data and multi-variable dependencies.
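A minimal sketch of such a bidirectional recurrent model in TensorFlow/Keras is shown below; the layer sizes, dropout rate, optimizer, and loss are illustrative assumptions and are not necessarily the hyperparameters used in this work.

```python
import tensorflow as tf

def build_model(cell="GRU", units=64):
    """Bidirectional recurrent model mapping sequences of (voltage, temperature,
    irradiance) triples to the predicted output current (hyperparameters illustrative)."""
    rnn_layer = tf.keras.layers.GRU if cell == "GRU" else tf.keras.layers.LSTM
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(None, 3)),          # sequence of (V, T, G) points
        tf.keras.layers.Bidirectional(rnn_layer(units, return_sequences=True)),
        tf.keras.layers.Dropout(0.2),             # regularization against overfitting
        tf.keras.layers.Dense(1),                 # predicted current at each point
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

gru_model = build_model("GRU")
lstm_model = build_model("LSTM")
```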
3. Results
With the architecture assembled and the data corrected, it was possible to train the neural networks. Before training, the GRU and LSTM models with the bidirectional method were evaluated using cross-validation with the GroupKFold method. This method is used when data are grouped by some common feature, for example, multiple measurements taken under the same condition (which is the case in this study). GroupKFold divides the dataset into subsets (folds); in each iteration, one fold is reserved for testing and the remaining folds are used for training. The process is repeated for all folds, ensuring that each fold is used exactly once as a test set.
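A sketch of this validation scheme with scikit-learn's GroupKFold is given below; the arrays and the grouping variable are hypothetical stand-ins for the measured dataset.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical arrays: X holds (voltage, temperature, irradiance) samples, y the
# measured current, and groups identifies which measured I-V curve each sample
# belongs to, so points of the same curve never appear in both train and test.
X = np.random.rand(16800, 3)
y = np.random.rand(16800)
groups = np.repeat(np.arange(5), 3360)      # illustrative grouping only

gkf = GroupKFold(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(gkf.split(X, y, groups)):
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    # ...train the GRU/LSTM model on the training folds and evaluate on the held-out fold
```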
This study used five folds, because the choice of five folds in GroupKFold is a strategic decision that balances validation quality, computational efficiency, and respect for the data structure. With a training set containing 16,800 samples, this division allows each fold to contain, on average, about 3360 samples, which is enough to ensure statistical diversity in each split.
During cross-validation, the models’ predicted data were compared with real data using the MAE, RMSE, and R² metrics. The average results obtained with GroupKFold are presented in Table 4.
Analysis of the error metrics revealed that both approaches were able to describe the real data, as the MAE and RMSE values were low. In addition, the R² was very close to one, indicating that almost all the variability in the data was captured by the models. These results supported proceeding with model training.
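For reference, the three metrics have their standard definitions, with $I_i$ the measured current, $\hat{I}_i$ the predicted current, $\bar{I}$ the mean of the measurements, and N the number of points:
$$
\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \lvert I_i - \hat{I}_i \rvert, \qquad
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \bigl(I_i - \hat{I}_i\bigr)^2}, \qquad
R^2 = 1 - \frac{\sum_{i=1}^{N} \bigl(I_i - \hat{I}_i\bigr)^2}{\sum_{i=1}^{N} \bigl(I_i - \bar{I}\bigr)^2}.
$$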
Another point is the computational cost: the GRU model had a training time of 7 min and 22 s, and the LSTM model had a training time of 8 min and 15 s. A callback strategy was used with the optimizer during training (two callbacks were used, both native to TensorFlow: ReduceLROnPlateau reduces the learning rate when the model is stuck at a local minimum, and EarlyStopping stops training if the model’s error metric stops improving).
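The two callbacks are standard Keras utilities; a configuration along the lines below could be used, although the monitored quantity, reduction factor, and patience values shown here are assumptions for illustration.

```python
import tensorflow as tf

callbacks = [
    # Reduce the learning rate when the validation loss stops improving (local minimum)
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5),
    # Stop training when the validation loss no longer improves
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                     restore_best_weights=True),
]
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=200, callbacks=callbacks)
```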
The time required to calculate the parameters with the evolutionary algorithm for all analyzed conditions (considering that each condition was measured ten times) was 97 min and 55 s.
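For illustration, this evolutionary parameter extraction can be sketched with SciPy's differential evolution minimizing the RMSE between Equation (3) and a measured curve; the bounds, the fixed temperature, and the data arrays (V_meas, I_meas) are hypothetical, and the actual algorithm used in this work may differ.

```python
import numpy as np
from scipy.optimize import differential_evolution, brentq
from scipy.constants import e as q, k

T = 298.15  # cell temperature (K) of the curve being fitted (assumed here)

def model_current(V, I_ph, I_0, R_s, R_sh, n):
    """Solve Equation (3) for I at a single voltage point."""
    Vt = n * k * T / q
    f = lambda I: I_ph - I_0 * (np.exp((V + I * R_s) / Vt) - 1) - (V + I * R_s) / R_sh - I
    return brentq(f, -2.0, I_ph + 2.0)

def curve_rmse(params, V_meas, I_meas):
    I_fit = np.array([model_current(V, *params) for V in V_meas])
    return np.sqrt(np.mean((I_fit - I_meas) ** 2))

# Hypothetical search bounds for (I_ph, I_0, R_s, R_sh, n)
bounds = [(0.5, 3.0), (1e-12, 1e-6), (1e-3, 1.0), (10.0, 2000.0), (1.0, 2.5)]
# result = differential_evolution(curve_rmse, bounds, args=(V_meas, I_meas), seed=0)
# I_ph, I_0, R_s, R_sh, n = result.x
```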
Table 5 shows the averages of the error metrics on the validation set, and as shown in Figure 9, the models lead to a set of parameters with a reasonable prediction for a scenario of 900 W/m² in the range of 25 to 55 °C. For Figure 9, since the I-V curve was measured ten times, the plotted curve is the average of these measurements, used for model validation; two temperatures are plotted, 25 and 45 °C, and the other temperatures can be evaluated using Table 6, which shows the RMSE for all temperatures.
Using the measurement data, the RMSE, MAE, and R² metrics were calculated. The values are presented in Table 7. The RMSE and MAE values were low, and R² was close to one.
The results for the GRU model can be observed in the graph of Figure 9. The curve obtained with the model is shown in green. The prediction is very close to the average experimental data (in blue) and to the I-V curve predicted by the evolutionary algorithm (in orange).
Table 6 shows metrics for the validation data; the MAE, RMSE, and R² values confirm that the model came very close to the evolutionary algorithm. It can be observed that all models presented greater errors at temperatures of 45 and 50 °C. Consequently, the graph for the 45 °C temperature is presented in Figure 9b, where it is possible to observe that the error is concentrated in the short-circuit region (negative current).
The LSTM model behaved very closely to the real data and to the data estimated by the evolutionary algorithm, and its performance was similar to that of the GRU model. The MAE, RMSE, and R² values of the two models shown in Table 5 are very close; for example, their average MAE values on the validation data differ only by a small amount. R² was very close to one for both models (Table 5). Comparing each temperature in Table 6, in some scenarios the GRU model was better, and in others the LSTM was better. In the forecast graphs at temperatures of 25 and 45 °C (Figure 9), these small differences become clearer.
The validation leads to low error metric values, indicating that the proposed architecture is capable of fitting and predicting experimental data.
Table 5 and Table 7 show that the GRU model gives the best results on the validation data. For the measurements with fewer data points (the test data) and the metrics reported in Table 8, the best model was LSTM_GRU.
Another simulation used only the irradiance data equal to 600 and 1000 W/m² with temperatures between 25 and 55 °C. This was carried out to verify whether the architecture would be able to describe the experimental data with a smaller amount of training data. There is a total of 8400 data points, consisting of voltage, temperature, and irradiance; recall that ten measurements were taken for each temperature, and these data were used for the new training. The first training used about 15,000 data points, and neural networks tend to perform better with larger data sets, so this procedure tests how reducing the data size influences the accuracy of the models. The remaining data (irradiances of 700, 800, and 900 W/m²) were used for testing, and the error metrics (MAE, RMSE, and R²) are shown in Table 8.
The training time was longer in the second training; the GRU model took 7 min and 51 s, and the LSTM took 8 min and 50 s.
As expected, the model metrics worsened, although the R² values remained close to one, indicating that the models are still able to describe the observed data. In Figure 10, some results for the test data are presented. The GRU model tends to improve its accuracy with increasing temperature and irradiance.
The results demonstrate that both models—GRU and LSTM—were capable of accurately predicting the I-V characteristics of the solar cell under varying temperature and irradiance conditions. The GRU model exhibited the best overall performance, achieving the lowest error metrics while maintaining the shortest training time. Although the LSTM models produced similar results, their slightly higher computational cost and marginal improvements in accuracy suggest that GRU is the most efficient choice for this application. Additionally, when trained with a reduced dataset, all models experienced a small increase in error.
Comparing the results with the complete data and those with the reduced data, it can be seen that with the reduced data, the model had greater difficulty learning the behavior of the real data. This difficulty is associated with the complexity of the behavior of the I-V curve. Therefore, it is possible to conclude that the diversity and density of the input data—especially in relation to irradiance and temperature—are essential for the neural network to be able to accurately capture the variations in the current.
The physical consistency of the models’ I-V curves was assessed by extracting the five key parameters of the single diode model (Equation (3)) for three scenarios: (a) a representative experimental measurement, (b) the reference condition at 900 W/m² and 25 °C, and (c) the neural-network-predicted I-V curves under the same conditions. These parameters were obtained by fitting Equation (3) to both the experimental and the predicted curves using evolutionary algorithms. The extracted values are shown in Table 9 and compared across the three scenarios. Comparing the five parameters, the curve predicted by the GRU model gave results very close to the averaged data. Additionally, the parameters FF, $V_{oc}$, and $P_{max}$ were extracted from the I-V curve predicted by the GRU model, and the results are presented in Figure 11. A decreasing trend of the FF can be observed in Figure 11a as temperature increases; this behavior is fully consistent with the behavior of photovoltaic devices. $V_{oc}$ exhibits a linear behavior (Figure 11b), decreasing with increasing temperature, a well-established phenomenon in semiconductor physics and essential for solar cell performance. As expected, $P_{max}$ also demonstrates a clear decrease with increasing temperature.
The values obtained for n and the series resistance $R_s$ differ from those commonly reported in the literature for crystalline silicon solar cells, which typically present n in the range of 1.2–1.5 and $R_s$ values starting at about 0.1 [40,41]. In the present study, the n values were slightly higher (up to 2.03), while $R_s$ was lower (on the order of 0.04). This discrepancy can be attributed to factors such as the specific test conditions adopted (900 W/m² and 25 °C), the possible aging of the module used, and the particularities of the parameter extraction method, which combines measured data with a GRU-type recurrent neural network-based model. Furthermore, parameter extraction from I-V curves at irradiances below the standard condition (STC) can lead to adjustments in n and $R_s$ to compensate for variations in the photoelectric behavior of the cell. Similar results, with n greater than 2 and low $R_s$ values, were also reported in studies that apply optimization methodologies or neural networks [42,43,44], reinforcing the plausibility of the values obtained in this work.
The GRU model (trained with the complete data, as this presented the best results) was chosen to predict other scenarios and to evaluate the behavior of the solar cell. After predicting the I-V curve, the maximum power at a given temperature and irradiance was calculated using Equation (2); the results are shown in Figure 12. At each irradiance value, the temperature was varied and the I-V curve was estimated; from the I-V curve, the maximum power value was determined, and the values are plotted on the graph.
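The procedure just described can be sketched as follows, assuming a trained network such as the gru_model from the earlier sketch (the variable names, voltage range, and scenario grid are hypothetical):

```python
import numpy as np

voltages = np.linspace(0.0, 0.65, 100)           # assumed voltage sweep per scenario

def predicted_pmax(model, temperature, irradiance):
    """Predict the I-V curve for one (T, G) scenario and return its maximum power."""
    features = np.column_stack([
        voltages,
        np.full_like(voltages, temperature),
        np.full_like(voltages, irradiance),
    ])[np.newaxis, ...]                          # shape (1, n_points, 3)
    current = model.predict(features, verbose=0).ravel()
    return np.max(voltages * current)            # Equation (2): P = V * I

# Example sweep over a hypothetical grid of scenarios:
# pmax_grid = {(T, G): predicted_pmax(gru_model, T, G)
#              for T in range(25, 60, 5) for G in (600, 700, 800, 900, 1000)}
```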
The model learned the behavior of the I-V curve, allowing the simulation of unmeasured scenarios and the estimation of the maximum power delivered by the solar cell in these scenarios. Another characteristic that can be evaluated is the behavior of the solar cell with temperature variation. According to the GRU solar cell model, increasing the irradiance will only be beneficial if the temperature of the solar cell is controlled, since the power is closely related to both temperature and irradiance: increasing the irradiance increases the power, but increasing the temperature reduces it. This behavior is in agreement with that observed in the literature [45,46].
5. Conclusions
The I-V characteristic curve of silicon solar cells provides access to various performance parameters, some of which are sensitive to temperature and illumination changes. While mathematical models may describe this curve well, the presence of transcendental variables makes determining their parameters challenging. Neural networks offer an alternative approach to address this issue, which was the focus of this study.
To train these models, the I-V characteristic curves of polycrystalline silicon solar cells under different temperatures and lighting conditions were measured. Among the available neural network architectures, LSTM and GRU were selected. Model performance was evaluated using error metrics, with the GRU model achieving the best results, i.e., the lowest MAE and RMSE and an R² closest to one. The other models had similar performance, with only minor variations in the error metrics.
In terms of computational cost, the GRU model had the shortest training time with the full dataset, completing in 7 min and 22 s. When training with a reduced dataset, the error metrics increased slightly, but the models still described the measured data well, as shown in Figure 10. In this case, the GRU model again had the lowest computational cost, with a training time of 7 min and 51 s. The use of bidirectional recurrent neural networks for modeling the I-V characteristic curve proved to be a robust approach, leading to low error metrics. While differences in accuracy were observed between models, they were minimal. Training with a reduced dataset led to higher errors but maintained good fitting quality. Overall, LSTM and GRU models are reliable approaches for modeling I-V curves. However, the GRU model demonstrated the best balance between accuracy and computational efficiency.
The GRU model was used to predict other scenarios, and the results were as expected for solar cells, another indication of the model’s good performance. With these predictions, it was possible to evaluate the performance of the solar cell; the power of the solar cell is closely related to the temperature, since an increase in the temperature of the solar cell surface reduces the power.
Evaluating the performance of the GRU- and LSTM-based models, the accuracy was comparable to that of regression using the evolutionary algorithm with Equation (3). The I-V curves predicted by the artificial neural network models showed a notable overlap with those obtained by regression, as illustrated in Figure 9 and Figure 10. However, the gain in computational time is a notable point: training the GRU/LSTM models was completed in approximately 8 min, while determining the parameters of Equation (3) and estimating the curves for the scenarios studied took approximately 97 min. This time difference represents a substantial improvement in process efficiency.
In neural network training, one of the biggest challenges is finding the balance between learning enough from the data and not learning too much. This balance directly influences the model’s ability to generalize to new data. A recurrent neural network model that has overfitted during training over-adjusts the network’s weights to the training data, capturing not only the real patterns but also the noise and outliers. This causes the model to perform excellently on the data it has already seen, but to fail when dealing with new data. During this study, dropout (randomly turning some neurons off during training) and early stopping (halting training when no improvement occurs) were used to mitigate the risk of overfitting. The presented results demonstrate high robustness. However, since the database is limited, a risk of poor generalization still persists.
The study was developed using experimental data from polycrystalline solar cells, and for this type of material, the model’s accuracy can be considered good. However, as I-V characteristic curves of other photovoltaic materials (thin-film and perovskite) were not included, the model’s behavior with these materials cannot be determined. It is believed that, with a dataset containing several types of solar cells, it would be possible to develop a model capable of predicting the current of the I-V curve from temperature, irradiance, and voltage data. If this generalizability is confirmed, application to photovoltaic equipment would be possible.
Based on the study’s findings and limitations, a natural next step is to expand the model’s capabilities to be more broadly applicable. A promising avenue for future research involves training the model on a more diverse dataset that includes thin-film and perovskite solar cells. This would allow for a comprehensive evaluation of the model’s performance across different photovoltaic technologies, moving beyond the current focus on polycrystalline cells.