
Application of Long Short-Term Memory Networks and SHAP Evaluation in the Solar Radiation Forecast

Department of Electrical Engineering, Cheng-Shiu University, Kaohsiung 833, Taiwan
* Author to whom correspondence should be addressed.
Energies 2025, 18(23), 6099; https://doi.org/10.3390/en18236099
Submission received: 27 September 2025 / Revised: 18 November 2025 / Accepted: 19 November 2025 / Published: 21 November 2025
(This article belongs to the Special Issue Solar Energy Utilization Toward Sustainable Urban Futures)

Abstract

This paper proposes a hybrid forecasting framework that combines Long Short-Term Memory (LSTM) networks with Shapley Additive Explanations (SHAPs) to quickly and accurately predict solar radiation. Historical meteorological data from the Central Weather Administration (CWA) in Taiwan, spanning 2018–2023, are processed to construct multivariate input features, including temperature, humidity, pressure, wind conditions, global radiation, and temporal encodings. The LSTM network is employed to capture nonlinear dependencies and temporal dynamics in the multivariate meteorological data. SHAP-guided feature selection reduces the number of input variables, thereby lowering computational cost and accelerating convergence without sacrificing accuracy. A case study in the Penghu region—characterized by abundant solar irradiance and active photovoltaic deployment—was conducted to evaluate the model under three scenarios. Results demonstrated that when the number of features decreases from fifteen to five, the number of model parameters is reduced from 53,569 to 51,521 and the computation time is reduced from 6 ms to 4 ms. The MSE and MAE remain within the ranges of 0.07–0.11 and 0.13–0.18, respectively, with almost no change. The LSTM–SHAP framework not only achieves high forecasting precision but also provides transparent explanations of key meteorological drivers, with temperature, humidity, and temporal variables identified as the most influential factors. Overall, this research contributes a scalable and interpretable methodology for solar radiation prediction, offering practical implications for photovoltaic power dispatch, grid stability, and renewable energy planning.

1. Introduction

The global energy landscape is undergoing a profound transformation driven by climate change, environmental concerns, and the depletion of fossil fuel resources [1]. Traditional reliance on coal, oil, and natural gas has contributed significantly to greenhouse gas emissions, thereby accelerating global warming, intensifying extreme weather events, and disrupting ecosystems. In this context, renewable energy (RE) has emerged as one of the most promising alternatives, offering clean, inexhaustible, and widely available energy to support long-term decarbonization strategies. Due to advances in solar energy technologies, solar power is currently considered one of the most rapidly increasing resources [2,3]. Solar energy, harvested through photovoltaic (PV) systems, converts sunlight directly into electricity without generating noise pollution. Taiwan, located in a subtropical region with abundant annual sunlight, possesses particularly favorable conditions for PV deployment [4]. However, despite its environmental benefits, solar energy production faces inherent challenges due to its variability and intermittency. Solar irradiance is highly sensitive to meteorological conditions, seasonal cycles, and diurnal variations, resulting in significant fluctuations in electricity output [5,6].
Solar energy is expected to play a crucial role in the future energy supply of Taiwan. It has become a strategic priority in Taiwan’s energy transition, with government policies promoting its integration into the national grid [7,8]. However, the variability of solar generation poses risks to power system stability and complicates grid operation and scheduling. Inaccurate forecasts of solar irradiance can lead to inefficient energy dispatch, increased reliance on reserve capacity, and higher operational costs. To mitigate these challenges, accurate and reliable forecasting models are essential for ensuring grid reliability, optimizing resource allocation, and supporting decision-making in energy markets. In particular, anticipating fluctuations in solar radiation can improve the efficiency of storage systems, reduce curtailment, and facilitate the integration of PV energy into smart grids. Owing to the instability of natural climate conditions, solar power output is inherently discontinuous and intermittent. It is strongly influenced by variations in sunshine, cloud cover, and rainfall, which further increase the instability of power supply within the grid. Therefore, developing accurate forecasting models under uncertain environmental conditions is a critical issue for power management, demand planning, and energy security [9,10].
In recent years, advances in machine learning and deep learning have offered new opportunities for solar energy forecasting [11,12,13,14]. For example, ref. [15] evaluated the forecasting skill of various shortwave and longwave radiation scheme configurations for solar power prediction using WRF-Solar under different weather conditions. A hybrid deep learning framework was proposed in [16] that integrates a Convolutional Neural Network (CNN) for pattern recognition with a Long Short-Term Memory (LSTM) network for half-hourly global solar radiation forecasting. Similarly, ref. [17] demonstrated the application of Artificial Neural Networks (ANNs) for daily solar radiation prediction. To broaden accessibility, ref. [18] introduced a framework that leverages a diverse range of predictive models, including both traditional regression techniques and neural networks. In [19], a probabilistic solar power forecasting method was developed based on weather scenario generation, accounting for inherent correlations among different meteorological variables. Likewise, ref. [20] focused on temperature-based global solar radiation models, employing ANN techniques to predict radiation using only temperature data. In [21], five forecasting approaches—autoregressive integrated moving average (ARIMA), autoregressive moving average (ARMA), feed-forward backpropagation neural networks (FFBP), hybrid ARIMA–FFBP, and hybrid ARMA–FFBP—were compared for daily global solar radiation prediction with different combinations of meteorological parameters. A statistical method was proposed in [22] to address the non-stationarity of photovoltaic (PV) production data and to predict solar power output. In [23], a conversion model was constructed by incorporating photovoltaic variables, ambient temperature, and wind speed to estimate grid-connected power generation. Furthermore, ref. [24] examined the influence of meteorological parameters on solar radiation and electricity production in the Salt–Jordan region (Middle East) using LSTM and Adaptive Network-based Fuzzy Inference System (ANFIS) models. Ref. [25] developed an LSTM model to predict solar irradiance and wind power over a 24 h horizon using a 240 h (10-day) dataset. Results showed that the LSTM model can effectively predict renewable energy output by predicting the solar irradiance.
Although significant progress has been made in solar irradiance forecasting, existing models often lack the flexibility to account for the high variability of solar radiation in meteorological data. While deep learning models have improved forecasting accuracy, their reliance on a large number of input features increases computational complexity and reduces generalizability. Feature selection is therefore critical for achieving efficient and robust model performance. Identifying key meteorological variables—such as temperature, humidity, and diurnal cycles—is essential for building trust, validating model behavior, and informing physical and operational strategies. However, developing forecasting models that are accurate, interpretable, and computationally efficient remains an important research challenge.
This paper proposes an integrated framework that combines LSTM [26,27] networks with Shapley Additive Explanations (SHAPs) [28,29] for feature evaluation. Solar radiation exhibits strong temporal correlation and nonlinearity, and its variations are influenced by the interaction of multiple meteorological factors, such as the daily sunrise and sunset patterns, annual variations in day length, sudden weather events, and cloud cover leading to a sharp drop in radiation. LSTM networks have gained prominence due to their ability to capture long-term dependencies and to overcome the vanishing gradient problem inherent in traditional Recurrent Neural Networks (RNNs) [30,31]. They have been widely applied across diverse domains, including natural language processing, financial forecasting, and climate modeling, and their application in solar irradiance forecasting has shown promising results. The SHAP framework has been increasingly applied in the solar energy domain to enhance the interpretability of machine learning models. For solar irradiance forecasting, SHAP has been used to identify the most influential meteorological features and to visualize their nonlinear and threshold effects on model outputs [32]. Several studies have reported that cloud cover, humidity, and air temperature are consistently dominant predictors [33,34,35], and their SHAP dependence plots reveal saturation behavior and regime shifts under different weather conditions. Moreover, SHAP interaction analyses have uncovered synergistic effects, such as the combined influence of cloud cover and humidity on solar radiation attenuation, providing physically meaningful explanations that align with atmospheric dynamics. SHAP has emerged as a robust method for attributing feature importance and quantifying the contribution of each input variable to model predictions. 
Integrating SHAPs with LSTM models provides the dual benefits of predictive accuracy and interpretability, thereby enhancing both the technical and practical value of solar radiation forecasting. The proposed framework is validated through a case study in the Penghu region of Taiwan, a location characterized by high solar irradiance potential and active PV deployment. The results provide actionable insights for PV operators, grid managers, and policymakers seeking to improve the reliability and transparency of solar energy forecasting systems.

2. Data Selection

2.1. Meteorological Data Sources

Solar irradiance forecasting relies heavily on high-quality meteorological data. The primary factor affecting solar power output is the intensity of solar irradiance. The output of a photovoltaic (PV) module can be calculated using Equation (1) [36]:
P_s(t) = K_PV × A_PV × P_G(t) / 3600   (1)
P_s(t) is the PV output power at time t (W), P_G(t) is the solar irradiance at time t (J/m²), A_PV is the area of the PV array (m²), and K_PV is the conversion efficiency of the PV module.
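Equation (1) maps directly to a few lines of code. The sketch below is illustrative only (the function name and the numbers in the example are assumptions, not values from the paper):

```python
def pv_output_w(irradiance_j_per_m2: float, area_m2: float, efficiency: float) -> float:
    """PV output power in watts from hourly irradiance (J/m^2), per Equation (1).

    Dividing by 3600 converts the hourly energy density (J/m^2)
    into an average power density (W/m^2) over that hour.
    """
    return efficiency * area_m2 * irradiance_j_per_m2 / 3600.0

# Example: a 10 m^2 array at 18% efficiency receiving 2.5 MJ/m^2 in one hour
print(pv_output_w(2.5e6, 10.0, 0.18))  # 1250.0 W
```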
In this study, we employed historical meteorological records provided by the Central Weather Administration (CWA) of Taiwan through its Observation Data Inquire System (ODIS), also known as the CODiS platform [37]. The Penghu archipelago was selected as the focal region for analysis. The dataset spans six consecutive years (2018–2023), comprising approximately 52,512 hourly observations. The first five years (2018–2022) were used for model training, while the data from 2023 was reserved exclusively for testing and validation. Figure 1 presents the raw irradiance time series.
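A chronological split of the kind described (2018–2022 for training, 2023 held out for testing) can be sketched with pandas; the column name and the synthetic index below are assumptions, since the real CODiS export has many more columns and some missing hours:

```python
import pandas as pd

# Hypothetical hourly frame indexed by observation timestamp
idx = pd.date_range("2018-01-01", "2023-12-31 23:00", freq="h")
df = pd.DataFrame({"global_irradiance": 0.0}, index=idx)

train = df[df.index.year <= 2022]  # 2018-2022: model training
test = df[df.index.year == 2023]   # 2023: reserved for testing

print(len(train), len(test))  # 43824 8760
```

Splitting by calendar year rather than at random preserves the temporal ordering the LSTM depends on and prevents leakage of future observations into training.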
The relatively long time span and fine granularity of the dataset ensure sufficient variability across seasons, weather events, and irradiance cycles, thereby enabling robust model training and evaluation. Following an initial screening and domain-driven analysis, 15 key features were identified as potentially influential for solar irradiance forecasting. These include the following:
  • Atmospheric pressure (atm)
  • Sea level pressure (atm)
  • Air temperature (°C)
  • Dew point temperature (°C)
  • Relative humidity (%)
  • Wind speed (m/s)
  • Wind direction (degrees)
  • Maximum gust speed (m/s)
  • Precipitation (mm)
  • Precipitation duration (hour)
  • Sunshine duration (hour)
  • Global solar irradiance (J/m²)
  • Visibility (km)
  • Day of year (DOY) (cyclic encoding)
  • Hour of day (time index) (cyclic encoding)

2.2. Feature Engineering Process

Figure 1 shows that solar radiation varies with the seasons but is not strictly proportional to monthly values, instead exhibiting a sine wave pattern. Capturing these multi-scale dynamics—ranging from hourly to seasonal—is essential for accurate forecasting. To address this, time is decomposed into three components: year, Day Of Year (DOY), and hour (H). The DOY, representing the calendar day within a year, is modeled using a 365-day period and expressed through sine and cosine transformations, as shown in Equations (2) and (3). Likewise, the hour of the day is encoded using sine and cosine functions with a 24 h period, as shown in Equations (4) and (5).
sin_DOY = sin(2π · DOY / 365)   (2)
cos_DOY = cos(2π · DOY / 365)   (3)
sin_day = sin(2π · H / 24)   (4)
cos_day = cos(2π · H / 24)   (5)
Converting temporal features into sine and cosine representations offers several advantages. First, because time is inherently cyclical, sine and cosine functions can effectively capture periodic variations, enabling the model to better learn temporal patterns. Second, solar radiation is not monotonically correlated with time; for example, the radiation intensity at 11:00 AM is typically lower than at noon. Encoding time features with sine and cosine functions avoids a linear interpretation of temporal sequences, allowing the model to treat all time points with equal importance. This, in turn, improves generalization and reduces the risk of misinterpreting time intervals.
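Equations (2)–(5) reduce to a single helper in code; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def cyclic_encode(value: float, period: float) -> tuple[float, float]:
    """Map a cyclic quantity (hour of day, day of year) onto the unit circle."""
    angle = 2.0 * np.pi * value / period
    return np.sin(angle), np.cos(angle)

# Hour 23 and hour 0 land close together on the circle,
# unlike the raw values 23 and 0:
s23, c23 = cyclic_encode(23, 24)
s0, c0 = cyclic_encode(0, 24)
print(round(float(c23), 3), round(float(c0), 3))  # 0.966 1.0

# Day-of-year encoding with a 365-day period, as in Equations (2) and (3)
sin_doy, cos_doy = cyclic_encode(180, 365)
```

The pair (sin, cos) is needed because either component alone is ambiguous: sin maps two different times in the cycle to the same value, while the pair identifies the time uniquely.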
To ensure comparability across features and improve convergence during training, the max-min normalization method [38] was used to normalize all variables to the range [0, 1]. While the inclusion of 15 candidate features initially maximized the information available to the model, not all variables contributed equally to predictive performance. Excessive or irrelevant features can introduce noise, increase computational cost, and reduce generalizability. To address this, this study employs SHAP values to quantify and rank feature importance.
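Max–min normalization to [0, 1] is a per-column rescaling; a sketch (function names are illustrative, and as a general precaution the min/max statistics should come from the training split so the test years do not leak into preprocessing):

```python
import numpy as np

def minmax_fit(x_train: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Column-wise minimum and maximum taken from the training data."""
    return x_train.min(axis=0), x_train.max(axis=0)

def minmax_transform(x: np.ndarray, lo: np.ndarray, hi: np.ndarray) -> np.ndarray:
    """Scale each feature to [0, 1]; constant columns map to 0."""
    span = np.where(hi > lo, hi - lo, 1.0)
    return (x - lo) / span

# Toy data: two features (e.g., temperature, pressure) over three hours
x = np.array([[10.0, 1000.0], [20.0, 1010.0], [30.0, 1005.0]])
lo, hi = minmax_fit(x)
scaled = minmax_transform(x, lo, hi)
print(scaled)
```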

3. Solution Methodology

The proposed forecasting framework integrates LSTM with SHAPs to achieve both predictive accuracy and interpretability in solar irradiance forecasting. This section outlines the theoretical foundations of LSTM, introduces SHAPs as a method for feature attribution, and details the overall solution procedure adopted in this study.

3.1. LSTM

An LSTM unit consists of three primary gates—the forget gate, input gate, and output gate—which operate in conjunction with the cell state C t and hidden state h t , as illustrated in Figure 2. The LSTM computation process enables the cell to determine what information to retain from the input data, what outdated information to discard, and when to transfer information from the cell state to the output. The main computational steps are summarized as follows [39].
A.
Forget Gate
The forget gate determines which information from the previous cell state should be discarded. Its output f t is computed using the current input x t and the previous hidden state h t 1 , as expressed in Equation (6).
f_t = σ(W_f · [h_{t−1}, x_t] + b_f)   (6)
W_f is the weight matrix of the forget gate, b_f is its bias term, and σ is the sigmoid function of Equation (7), whose output lies between 0 and 1 and determines the proportion of information to forget.
σ(z) = 1 / (1 + e^{−z})   (7)
B.
Input Gate
The input gate determines how much new information should be written to the cell state, as shown in Equation (8). The candidate vector C̃_t, with values in the range [−1, +1], is created by a tanh layer, as expressed in Equation (9).
i_t = σ(W_i · [h_{t−1}, x_t] + b_i)   (8)
C̃_t = tanh(W_C · [h_{t−1}, x_t] + b_C)   (9)
W_i and W_C are the weight matrices of the input gate and the candidate layer, and b_i and b_C are the corresponding bias terms.
The cell state C_t combines the information from the forget gate and the input gate into the updated state, expressed as Equation (10):
C_t = f_t × C_{t−1} + i_t × C̃_t   (10)
f_t determines how much of the previous cell state to retain, while i_t × C̃_t adds the new candidate values.
C.
Output Gate
The output gate determines which information from the cell state will be passed to the output, as defined in Equation (11). The resulting value h t serves both as the output at the current time step and as the hidden state for the next time step, as shown in Equation (12).
o_t = σ(W_o · [h_{t−1}, x_t] + b_o)   (11)
h_t = o_t × tanh(C_t)   (12)
o_t is the activation of the output gate, which determines which parts of the cell state to emit. The final output h_t is the product of this activation value and the tanh of the cell state, which ensures that the output lies between −1 and 1. W_o is the weight matrix of the output gate, and b_o is its bias term.
Through these mechanisms, the LSTM learns to retain relevant temporal dependencies while alleviating the vanishing gradient problem. In the context of solar irradiance forecasting, this enables the network to simultaneously capture:
  • Diurnal cycles (daily patterns of sunlight).
  • Seasonal variations (summer peaks, winter troughs).
  • Irregular fluctuations due to clouds, rainfall, or typhoons.
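The gate computations in Equations (6)–(12) can be exercised end-to-end with a single NumPy time step; the weights below are random placeholders, not trained values, and the gate stacking order is a convention chosen for this sketch:

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Equations (6)-(12).

    W maps the concatenated [h_{t-1}, x_t] onto the four gate
    pre-activations (forget, input, candidate, output), stacked row-wise.
    """
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    hx = np.concatenate([h_prev, x_t])
    zf, zi, zc, zo = np.split(W @ hx + b, 4)
    f_t = sigmoid(zf)                    # forget gate, Eq. (6)
    i_t = sigmoid(zi)                    # input gate, Eq. (8)
    c_tilde = np.tanh(zc)                # candidate state, Eq. (9)
    c_t = f_t * c_prev + i_t * c_tilde   # cell update, Eq. (10)
    o_t = sigmoid(zo)                    # output gate, Eq. (11)
    h_t = o_t * np.tanh(c_t)             # hidden state, Eq. (12)
    return h_t, c_t

rng = np.random.default_rng(0)
n_hidden, n_in = 4, 3
W = rng.normal(size=(4 * n_hidden, n_hidden + n_in))
b = np.zeros(4 * n_hidden)
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hidden), np.zeros(n_hidden), W, b)
print(h.shape, bool((np.abs(h) < 1.0).all()))  # hidden state bounded by tanh
```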

3.2. Shapley Additive Explanations (SHAPs)

While LSTM models deliver strong predictive performance, their complex architecture limits interpretability. For operational decision-making in energy systems, however, transparency is essential. Operators must understand not only the prediction outcomes but also the relative importance of meteorological drivers such as temperature, humidity, and cloud cover. SHAP addresses this need by assigning each feature a contribution score (SHAP value) that reflects its marginal impact on the prediction across all possible feature coalitions. When applied to machine learning models, SHAP values can be used to assess the relative contribution of each feature to the model’s predictions.
Consider a model with n features and define a feature subset S , which is a set of selected features. SHAP uses a value function v S to represent the total benefit of this subset. A SHAP value measures the average marginal contribution of a single feature to the entire system. Its calculation formula is shown in Equation (13).
ϕ_i = Σ_{S ⊆ N\{i}} [ |S|! (n − |S| − 1)! / n! ] · [ v(S ∪ {i}) − v(S) ]   (13)
ϕ_i represents the average marginal contribution of feature i across all subsets, ensuring the properties of efficiency, symmetry, and fairness. N is the set of all features, and n is the number of elements in the feature set. S ⊆ N\{i} is any subset that does not contain feature i. v(S) is the prediction produced by the model when subset S is used as the feature input, and v(S ∪ {i}) − v(S) is the marginal contribution of feature i to the set S. Each marginal contribution is multiplied by the combination weight |S|!(n − |S| − 1)!/n!: |S|! counts the orderings of the features already in S, (n − |S| − 1)! counts the orderings of the remaining features, and dividing by n!, the total number of permutations of the feature set N, normalizes the weights so that the contributions of all orderings remain fair. Finally, the contribution of feature i is obtained by summing the weighted marginal contributions over all subsets of N that do not contain feature i.
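Equation (13) can be checked by brute-force enumeration on a toy value function. For an additive game, where the value of a subset is simply the sum of fixed per-feature contributions, each feature's Shapley value recovers its own contribution exactly (the toy numbers below are assumptions for illustration):

```python
from itertools import combinations
from math import factorial

def shapley(n: int, v) -> list[float]:
    """Exact Shapley values for n players and a value function v(frozenset)."""
    phi = [0.0] * n
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                # Combination weight |S|! (n - |S| - 1)! / n! from Equation (13)
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (v(frozenset(S) | {i}) - v(frozenset(S)))
    return phi

# Toy additive value function: v(S) = sum of per-feature contributions
contrib = {0: 2.0, 1: -1.0, 2: 0.5}
v = lambda S: sum(contrib[j] for j in S)
phi3 = shapley(3, v)
print(phi3)  # recovers [2.0, -1.0, 0.5]; values sum to v(N) (efficiency)
```

The enumeration costs 2^(n−1) model evaluations per feature, which is why practical SHAP implementations rely on sampling or model-specific approximations rather than this exact formula.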
In this paper, SHAP analysis was applied after model training to:
  • Rank feature importance across the dataset.
  • Identify redundant or less influential variables.
  • Guide feature selection for retraining the LSTM under reduced input dimensions.
SHAP identifies temperature, humidity, and the cyclic time features as the dominant contributors, which aligns with meteorological principles. SHAP was used to perform feature selection and demonstrated that removing low-impact features improves model generalization and computational efficiency. SHAP is thus used not only for visualization but also as a quantitative basis for feature reduction. Varying the number of retained features also revealed the structural limitations of excessive feature removal, demonstrating how SHAP guides the balance between model simplicity and predictive power. By quantifying feature influence, SHAP not only enhances interpretability but also improves computational efficiency, since models trained on fewer yet more informative features tend to converge more quickly. Overall, this reinforces that SHAP provides actionable insights for model optimization.
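The selection rule itself, ranking features by mean absolute SHAP value over samples and keeping the top k, is straightforward; a sketch with a hypothetical SHAP matrix (the scores and feature names below are invented for illustration, not taken from the paper's results):

```python
import numpy as np

def top_k_features(shap_values: np.ndarray, names: list[str], k: int) -> list[str]:
    """Rank features by mean |SHAP| across samples and keep the k strongest."""
    importance = np.abs(shap_values).mean(axis=0)  # shape: (n_features,)
    order = np.argsort(importance)[::-1]           # descending importance
    return [names[i] for i in order[:k]]

# Hypothetical SHAP matrix: 4 samples x 5 features
sv = np.array([[0.9, 0.1, -0.05, 0.40, 0.02],
               [1.1, -0.2, 0.03, 0.50, 0.01],
               [0.8, 0.15, -0.04, 0.45, 0.03],
               [1.0, -0.1, 0.02, 0.35, 0.02]])
names = ["temperature", "humidity", "visibility", "sin_day", "precipitation"]
selected = top_k_features(sv, names, 3)
print(selected)  # ['temperature', 'sin_day', 'humidity']
```

Taking the absolute value before averaging matters: a feature that pushes predictions strongly up in some hours and strongly down in others would otherwise average out to near zero.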

3.3. Solution Procedure

The overall solution methodology comprises the following steps:
Step 1: Data Collection and Preprocessing
  • Acquire six years (2018–2023) of hourly meteorological data from CODiS.
  • Apply preprocessing: missing value handling, normalization, and sinusoidal encoding for temporal features.
Step 2: Model Design
  • Construct an LSTM network with a 24 h sliding input window.
  • Configure hyperparameters: a batch size of 128, 100 training epochs, and two LSTM layers; training uses the Adam optimizer [40].
  • Use MSE and MAE for evaluation.
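The 24 h sliding-window, sequence-to-one setup in Step 2 can be sketched as a framework-agnostic window builder (names are illustrative; the random arrays stand in for the real feature matrix and irradiance series):

```python
import numpy as np

def make_windows(features: np.ndarray, target: np.ndarray, window: int = 24):
    """Build (X, y) pairs: each sample is `window` hours of features
    used to predict the target at the following hour (sequence-to-one)."""
    X, y = [], []
    for t in range(window, len(target)):
        X.append(features[t - window:t])
        y.append(target[t])
    return np.stack(X), np.array(y)

feats = np.random.rand(100, 15)  # 100 hours x 15 candidate features
irr = np.random.rand(100)        # target: global irradiance
X, y = make_windows(feats, irr)
print(X.shape, y.shape)  # (76, 24, 15) (76,)
```

The resulting (samples, time steps, features) tensor is the standard input shape for stacked LSTM layers.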
Step 3: Baseline Training
  • Train the model using the full set of 15 meteorological features.
  • Evaluate performance across training and validation sets.
  • Identify risks of overfitting and analyze seasonal variations.
Step 4: Feature Attribution with SHAP
  • Apply SHAP to compute feature contributions.
  • Generate global feature importance rankings and local explanations.
  • Identify key predictors (temperature, humidity, time encodings).
Step 5: Feature Reduction (Scenarios 2 and 3)
  • Retrain the model using only the top 10 features (scenario 2) and top 5 features (scenario 3) ranked by SHAP.
  • Compare predictive accuracy, convergence time, and generalization performance.
Step 6: Evaluation and Visualization
  • Evaluate models using MAE and MSE.
  • Visualize training curves, predicted vs. actual irradiance, and SHAP importance plots.
  • Conduct seasonal analysis (spring, summer, autumn, winter) to assess robustness.
Step 7: Interpretation and Discussion
  • Compare results across test scenarios to assess trade-offs.
  • Highlight how SHAP-guided feature reduction improves efficiency.
  • Discuss physical plausibility of feature rankings.
Figure 3 illustrates the workflow of the LSTM-SHAP, which integrates data preprocessing, min–max normalization, a sequence-to-one LSTM architecture, and SHAP-guided optimization. The model employs two LSTM layers with dropout regularization and an Adam optimizer, using MSE and MAE as performance metrics.
To evaluate the accuracy of LSTM, the mean absolute error (MAE) and mean square error (MSE) are both used in this paper. The MAE and MSE are defined as
MAE = (1/T) Σ_{t=1}^{T} |Rad_t^{true} − Rad_t^{predict}|   (14)
MSE = (1/T) Σ_{t=1}^{T} (Rad_t^{true} − Rad_t^{predict})²   (15)
Rad_t^{true} is the actual value of global radiation at hour t, Rad_t^{predict} is the forecasted value of solar radiation at hour t, and T is the number of testing samples.
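The two error metrics are a few lines of NumPy; a minimal sketch with toy values:

```python
import numpy as np

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error: average magnitude of the forecast error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean squared error: penalizes large errors more heavily than MAE."""
    return float(np.mean((y_true - y_pred) ** 2))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])
print(mae(y_true, y_pred), mse(y_true, y_pred))  # 0.5 and ~0.4167
```

Reporting both is informative: MSE reacts strongly to the occasional large miss (e.g., a sudden cloud front), while MAE reflects typical hour-to-hour accuracy.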

4. Results

To evaluate the effectiveness of the proposed LSTM–SHAP forecasting framework, a case study was conducted using real-world meteorological data from the Penghu region of Taiwan. Model performance was assessed using error metrics, training convergence, and seasonal analysis. For comparison, standalone LSTM and RNN models were also developed and tested. All experiments were carried out in Python 3.12 with the pandas library on an Intel(R) Core(TM) i5-11300H processor with 16 GB RAM.

4.1. Dataset Information

The experimental dataset comprises hourly meteorological records from 2018 to 2023, sourced from the Central Weather Administration (CWA) Observation Data Inquire System (CODiS). The meteorological dataset characteristics are described in Table 1. The dataset includes 15 candidate features and a target variable (global solar irradiance), as described in Section 2. The time step is set to 24 h, and the prediction method uses a sequence-to-one approach.
This paper describes the dataset characteristics, experimental setup, and comparative results from three test scenarios:
(i) Scenario 1: full feature set (baseline).
(ii) Scenario 2: SHAP-ranked top 10 features.
(iii) Scenario 3: SHAP-ranked top 5 features.

4.2. SHAP Feature Analysis

The SHAP values of 100 randomly selected samples from the dataset are shown in Figure 4, and the SHAP value trends of each feature are shown in Figure 5. Figure 6 shows the importance ranking of each feature in the LSTM prediction. Results show that temperature has a significantly greater influence than the other features, indicating that this variable contributes the most to the model's predictions. Next in line are the cyclical time features sin_day and cos_DOY, represented through sine and cosine transformations. This encoding ensures that temporally adjacent values (e.g., 31 December and 1 January, or 23:00 and 00:00) are mapped to nearby points in the feature space, and the high ranking of these features indicates that the model relies heavily on cyclical temporal patterns for prediction. Other meteorological factors, such as humidity and dew point, also contribute to the model's predictions.

4.3. Scenario 1: Full Feature Set (15 Features)

This paper evaluates the model’s seasonal forecasting performance for spring, summer, autumn, and winter, as illustrated in Figure 7, Figure 8, Figure 9 and Figure 10. Overall, the model demonstrates strong trend alignment in summer (Figure 8) and autumn (Figure 9), where the predicted values closely track the observed data with fluctuations around the mean. In contrast, spring (Figure 7) shows larger forecast errors and irregular fluctuations, likely due to frequent weather changes and nonlinear seasonal dynamics. Despite higher overall variability in winter (Figure 10), the model maintains a good trend fit, highlighting its robustness in handling low-insolation and highly variable conditions.
Table 2 compares the performance of LSTM and RNN models for Scenario 1. In spring and summer, Scenario 1 exhibits similar performance for high-dimensional inputs. The RNN achieves a lower MSE in spring (0.05 vs. 0.07), whereas the LSTM demonstrates greater stability in autumn and winter, highlighting its superior generalization ability under conditions of higher seasonal variability. Moreover, the LSTM’s average inference time is 6 ms, which is significantly faster than the RNN’s 10 ms.

4.4. Scenario 2: SHAP-Selected Top 10 Features

Based on SHAP rankings from Scenario 1, the 10 most impactful features were retained, removing low-contribution variables such as wind direction, wind speed, rainfall, maximum gust speed, and visibility. This reduction aimed to improve computational efficiency and reduce overfitting risk. The prediction results for spring, as shown in Figure 11, generally align with the actual values but show slight underestimation in the high-radiance range (radiance values greater than 2.5). This is likely due to the more variable weather patterns and complex climatic conditions in spring, resulting in more dramatic data fluctuations and potential bias in the model's predictions at extreme values. Summer, as shown in Figure 12, is the season with the most stable sunshine and the highest radiation levels of the year, and the model performs best during this period. The high overlap between the predicted and actual values indicates that the model has successfully learned and simulated the distinct and cyclical summer radiation patterns. The forecast accuracy for autumn, as shown in Figure 13, is between that for spring and summer. The overall model fit is good, but there is still a slight underestimation or delay in the forecast in some sections with drastic changes. This may be related to the frequent changes in meteorological conditions during the transition between seasons. The forecast performance for winter, as shown in Figure 14, is relatively weak. The model has obvious overestimations at some time points, and the overall forecast line fluctuates greatly. This may be due to the short sunshine hours, low total radiation, and large variability in winter, which increase the difficulty of forecasting. Frequent rainfall or cloudy weather may also cause the input features to be unable to effectively reflect actual radiation changes.
Table 3 compares the performance of LSTM and RNN models for Scenario 2. The LSTM achieves smaller errors in autumn and winter, with MSEs of 0.11 and 0.10, respectively, while the RNN records slightly higher values. Overall, the LSTM demonstrates more stable error performance and lower inference cost under this setting, with a batch computation time of 5 ms compared to 9 ms for the RNN.

4.5. Scenario 3: SHAP-Selected Top Five Features

Scenario 3 further reduces the input dimension to only the top five most influential features:
  • Air temperature
  • Relative humidity
  • Dew point
  • Hour encoding (sin/cos)
  • Day-of-year encoding (sin/cos)
These features were selected due to their consistent dominance in the SHAP analysis. The selected features include temperature, relative humidity, dew point temperature, sin_day, cos_day, sin_DOY, and cos_DOY. Because each sine–cosine pair encodes a single concept, this paper treats sin_day and cos_day as one feature, and likewise sin_DOY and cos_DOY. The model is therefore trained on only five semantically independent features. This experiment is designated as Scenario 3 to explore how the model's prediction performance and stability change as the input features are further refined.
The overall forecast results show that all seasons exhibit an underestimation of radiance, indicating a decline in model predictive ability after feature simplification. In spring, as shown in Figure 15, the forecast curve is significantly lower than the observed values overall, and the underestimation is more severe than in Scenario 2. This may reflect that the existing features are insufficient to accurately model the frequent spring climate transitions. Summer, as shown in Figure 16, remains the best forecasted season, but the forecast curve also exhibits discrepancies from the observed values, particularly during peak radiance periods, with significant underestimation, indicating room for improvement at extreme values. In autumn, as shown in Figure 17, the model exhibits significant underestimation and forecast delays during periods of significant data variability. This indicates that when data volatility increases, the model struggles to capture trend changes in real time, further demonstrating its limited adaptability under limited model complexity. In winter, as shown in Figure 18, a slight overestimation in Scenario 2 shifts to an overall underestimation in Scenario 3. This result may be related to factors such as the highly variable winter weather patterns and complex climatic conditions. This makes it difficult for models trained solely on simplified features to accurately simulate actual radiation variations, leading to a significant increase in forecast bias under extreme conditions.
Table 4 shows that with low-dimensional feature input, the LSTM performs more consistently, outperforming or matching the RNN in spring, summer, and winter. In summer in particular, the LSTM achieves an MSE of 0.05, compared with 0.07 for the RNN. Model simplification also improves inference efficiency: the LSTM runs in just 4 ms versus 8 ms for the RNN. These results demonstrate that the LSTM maintains strong performance and efficiency even under low-dimensional input settings.
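For reference, the MSE and MAE values reported in Tables 2–4 follow the standard definitions. The short sketch below illustrates them on made-up normalized irradiance values (not the paper's data):

```python
def mse(y_true, y_pred):
    """Mean squared error: penalizes large misses, e.g., peak-irradiance errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: the average miss, in the target's own (normalized) units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy normalized-irradiance series with a systematic underestimation,
# qualitatively similar to the Scenario 3 behavior described above.
observed = [0.0, 0.4, 0.9, 0.6, 0.1]
forecast = [0.0, 0.3, 0.7, 0.5, 0.1]
print(mse(observed, forecast))  # ~0.012
print(mae(observed, forecast))  # ~0.08
```

Because MSE squares the residuals, the 0.2 miss at the irradiance peak dominates it, while MAE weights all misses equally; this is why the tables report both.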
As the number of features decreases from 15 to 5, the number of model parameters is reduced from 53,569 to 51,521, showing that dimensionality reduction lowers model complexity. In terms of prediction error, both the mean squared error (MSE) and mean absolute error (MAE) vary only slightly; Scenario 3 in fact achieves slightly better MSE and MAE, indicating that appropriate feature reduction does not degrade the quality of the results. It also improves computational efficiency, with single-step computation time dropping from 6 ms to 4 ms, a significant benefit for real-time prediction systems. Moreover, the Scenario 3 results show that SHAP value analysis helps eliminate non-contributing variables, thereby reducing wasted computation, maintaining a stable model structure, and improving both generalization and predictive performance. Because the feature importance derived from the SHAP mechanism is interpretable, the feature-selection process is not only effective but also theoretically sound and practically valuable.
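The reported parameter counts are consistent with a network of two stacked 64-unit LSTM layers followed by a single-neuron dense output, with the five semantic features entering as seven numeric input columns (each sin/cos pair contributing two). This architecture reading is our inference from the counts, not a statement quoted from the text; under that assumption the arithmetic works out as follows:

```python
def lstm_layer_params(input_dim, units):
    """Trainable parameters of one LSTM layer: four gates, each with an
    input kernel (input_dim x units), a recurrent kernel (units x units),
    and a bias vector (units)."""
    return 4 * (input_dim * units + units * units + units)

def model_params(n_inputs, units=64):
    """Two stacked LSTM layers plus a one-neuron dense head. NOTE: this
    configuration is inferred from the reported counts, not stated in
    the source text; it reproduces those counts exactly."""
    return (lstm_layer_params(n_inputs, units)
            + lstm_layer_params(units, units)
            + units + 1)  # dense head: 64 weights + 1 bias

print(model_params(15))  # 53569 -- all fifteen input columns
print(model_params(7))   # 51521 -- seven columns in Scenario 3
```

The 2048-parameter difference comes entirely from the first layer's input kernel (4 gates x 64 units x 8 removed columns), which is why shrinking the input barely changes the total: the second LSTM layer and the output head are unaffected.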

5. Conclusions

This research demonstrates that integrating LSTM with SHAP provides a scalable, interpretable, and practical solution for solar irradiance forecasting. LSTM networks effectively capture both short-term fluctuations (hourly and daily patterns) and long-term dependencies (seasonal cycles), outperforming traditional models that are prone to vanishing gradients. By applying SHAP, we quantified the relative contributions of the input features and identified temperature, relative humidity, and the cyclic time encodings as the most influential predictors. The case study in the Penghu region of Taiwan validated the framework's effectiveness, underscoring the dual importance of predictive accuracy and transparency in energy system applications. The results show that SHAP effectively removes non-contributory features, enhances interpretability, and reduces unnecessary computation: when the low-impact features were removed, reducing the input from fifteen features to five, the execution time improved from 6 ms to 4 ms, an efficiency gain of approximately 33%. The proposed framework offers actionable insights for grid operators and policymakers by enabling more accurate scheduling of photovoltaic (PV) generation, optimization of storage systems, and stabilization of smart grids. The methodology thus not only advances academic understanding but also delivers tangible benefits for the rapidly growing field of renewable energy forecasting.

Author Contributions

M.-T.T. is the first and corresponding author; he developed the algorithms, designed the system planning projects, and prepared the manuscript. I.-C.L. conducted the experiments, used the simulation tools, and performed the formal analysis and data investigation. Both authors were involved in system validation and in interpreting the results. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

We would like to thank Cheng-Shiu University, Taiwan, for financial support (Grant No. D-2-2-1, Promotion of Industry–Academia New Technology Research Projects).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANN: Artificial Neural Network
ARIMA: Autoregressive Integrated Moving Average
ARMA: Autoregressive Moving Average
ANFIS: Adaptive Network-based Fuzzy Inference System
CWA: Central Weather Administration
DOY: Day of year
LSTM: Long Short-Term Memory
MAE: Mean Absolute Error
MSE: Mean Square Error
ODIS: Observation Data Inquire System
PV: Photovoltaic
RE: Renewable Energy
SHAP: Shapley Additive Explanations

Figure 1. The raw irradiance time series.
Figure 2. The network structure of LSTM.
Figure 3. Architecture of the LSTM–SHAP for solar radiation prediction.
Figure 4. SHAP values of all features of 100 randomly selected samples.
Figure 5. The SHAP value trends of each feature.
Figure 6. The importance ranking of each feature in the LSTM prediction.
Figure 7. Forecasting performance in spring under Scenario 1.
Figure 8. Forecasting performance in summer under Scenario 1.
Figure 9. Forecasting performance in autumn under Scenario 1.
Figure 10. Forecasting performance in winter under Scenario 1.
Figure 11. Forecasting performance in spring under Scenario 2.
Figure 12. Forecasting performance in summer under Scenario 2.
Figure 13. Forecasting performance in autumn under Scenario 2.
Figure 14. Forecasting performance in winter under Scenario 2.
Figure 15. Forecasting performance in spring under Scenario 3.
Figure 16. Forecasting performance in summer under Scenario 3.
Figure 17. Forecasting performance in autumn under Scenario 3.
Figure 18. Forecasting performance in winter under Scenario 3.
Table 1. Meteorological dataset characteristics.

| Attribute | Value |
| --- | --- |
| Total duration | 2018–2023 (6 years) |
| Time resolution | 1 h |
| Total samples | 52,512 h |
| Training period | 2018–2022 (42,096 h) |
| Testing period | 2023 (10,416 h) |
| Number of features | 15 |
| Primary target | Global solar irradiance (MJ/m² per hour) |
Table 2. Comparison of the performance of LSTM and RNN models for Scenario 1.

| Season | LSTM MSE | LSTM MAE | LSTM Execution Time | RNN MSE | RNN MAE | RNN Execution Time |
| --- | --- | --- | --- | --- | --- | --- |
| Spring | 0.07 | 0.13 | 6 ms | 0.05 | 0.12 | 10 ms |
| Summer | 0.05 | 0.11 | 6 ms | 0.05 | 0.12 | 10 ms |
| Autumn | 0.12 | 0.18 | 6 ms | 0.15 | 0.21 | 10 ms |
| Winter | 0.11 | 0.17 | 6 ms | 0.10 | 0.18 | 10 ms |
Table 3. Comparison of the performance of LSTM and RNN models for Scenario 2.

| Season | LSTM MSE | LSTM MAE | LSTM Execution Time | RNN MSE | RNN MAE | RNN Execution Time |
| --- | --- | --- | --- | --- | --- | --- |
| Spring | 0.07 | 0.13 | 5 ms | 0.05 | 0.12 | 9 ms |
| Summer | 0.06 | 0.12 | 5 ms | 0.06 | 0.13 | 9 ms |
| Autumn | 0.11 | 0.18 | 5 ms | 0.13 | 0.20 | 9 ms |
| Winter | 0.10 | 0.17 | 5 ms | 0.10 | 0.18 | 9 ms |
Table 4. Comparison of the performance of LSTM and RNN models for Scenario 3.

| Season | LSTM MSE | LSTM MAE | LSTM Execution Time | RNN MSE | RNN MAE | RNN Execution Time |
| --- | --- | --- | --- | --- | --- | --- |
| Spring | 0.07 | 0.13 | 4 ms | 0.08 | 0.15 | 8 ms |
| Summer | 0.05 | 0.11 | 4 ms | 0.07 | 0.15 | 8 ms |
| Autumn | 0.11 | 0.18 | 4 ms | 0.11 | 0.19 | 8 ms |
| Winter | 0.09 | 0.16 | 4 ms | 0.11 | 0.19 | 8 ms |

Share and Cite

MDPI and ACS Style

Tsai, M.-T.; Lo, I.-C. Application of Long Short-Term Memory Networks and SHAP Evaluation in the Solar Radiation Forecast. Energies 2025, 18, 6099. https://doi.org/10.3390/en18236099
