Enhancing Smart and Zero-Carbon Cities Through a Hybrid CNN-LSTM Algorithm for Sustainable AI-Driven Solar Power Forecasting (SAI-SPF)

Elmousalami, Haytham; Peng Hui, Felix Kin; Alnaser, Aljawharah A.

doi:10.3390/buildings15152785


by Haytham Elmousalami ¹,*, Felix Kin Peng Hui ¹ and Aljawharah A. Alnaser ²,*

¹ Department of Infrastructure Engineering, Faculty of Engineering and Information Technology, The University of Melbourne, Melbourne, VIC 3010, Australia

² Department of Architecture and Building Science, College of Architecture and Planning, King Saud University, Riyadh 11421, Saudi Arabia

* Authors to whom correspondence should be addressed.

Buildings 2025, 15(15), 2785; https://doi.org/10.3390/buildings15152785

Submission received: 4 July 2025 / Revised: 25 July 2025 / Accepted: 4 August 2025 / Published: 6 August 2025

(This article belongs to the Special Issue Intelligent Automation in Construction Management)


Abstract

The transition to smart, zero-carbon cities relies on advanced, sustainable energy solutions, with artificial intelligence (AI) playing a crucial role in optimizing renewable energy management. This study evaluates state-of-the-art AI models for solar power forecasting, emphasizing accuracy, reliability, and environmental sustainability. Using operational data from Benban Solar Park in Egypt and Sakaka Solar Power Plant in Saudi Arabia, two of the world’s largest solar installations, the research highlights the effectiveness of hybrid AI techniques. The hybrid Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM) model outperformed other models, achieving a Mean Absolute Percentage Error (MAPE) of 2.04%, Root Mean Square Error (RMSE) of 184, Mean Absolute Error (MAE) of 252, and R² of 0.99 for Benban, and an MAPE of 2.00%, RMSE of 190, MAE of 255, and R² of 0.98 for Sakaka. This model excels at capturing complex spatiotemporal patterns in solar data while maintaining low computational CO₂ emissions, supporting sustainable AI practices. The findings demonstrate the potential of hybrid AI models to enhance the accuracy and sustainability of solar power forecasting, thereby contributing to efficient, resilient, and zero-carbon urban environments. This research provides valuable insights for policymakers and stakeholders aiming to advance smart energy infrastructure.

Keywords:

sustainable artificial intelligence; solar power forecasting; energy supply optimization; smart cities energy management; renewable energy prediction; Benban Solar Park; Sakaka Solar Power Plant

1. Introduction

1.1. Solar Energy Supply for Zero-Carbon Cities

The journey toward zero-carbon cities is a global imperative in combating climate change and achieving sustainable urban development. Solar energy, one of the most abundant and renewable energy sources, has emerged as a cornerstone of this transition [1,2]. By harnessing solar power, cities can significantly reduce their reliance on fossil fuels, decrease greenhouse gas emissions, and promote energy independence. Photovoltaic (PV) systems, solar thermal technologies, and solar-powered microgrids are increasingly being integrated into urban energy systems, providing clean and cost-effective solutions for residential, commercial, and industrial energy needs. Advances in solar panel efficiency, energy storage technologies, and grid integration strategies have made solar energy a viable and scalable option for powering smart cities while supporting their carbon neutrality goals [3,4].

Despite its immense potential, the widespread adoption of solar energy in urban environments presents several challenges. These include variability in solar irradiance due to weather conditions, space constraints for large-scale PV installations, and energy storage and distribution inefficiencies. Overcoming these hurdles requires innovative approaches, such as AI-driven solar forecasting, energy-efficient grid systems, and policies that incentivize rooftop solar adoption and urban solar farms [5,6]. By combining technological advancements with supportive regulatory frameworks, cities can optimize their solar energy supply, enhance grid reliability, and create resilient, zero-carbon urban energy ecosystems. Solar energy’s integration into smart cities contributes to environmental sustainability and fosters economic development, energy equity, and improved quality of life for urban residents [7,8].

The global solar belt, spanning latitudes 35° N to 35° S, represents regions with the highest solar irradiance, making them ideal for solar energy generation as shown in Figure 1. Within this belt, the Middle East and North Africa (MENA) region stands out as one of the most promising areas for solar energy development due to its abundant sunlight, low cloud cover, and vast desert landscapes suitable for large-scale photovoltaic (PV) and concentrated solar power (CSP) installations [9,10]. Countries such as Saudi Arabia, the United Arab Emirates, and Morocco have capitalized on this potential by investing in mega-scale solar projects such as the Sudair Solar PV Project in Saudi Arabia and the Mohammed bin Rashid Al Maktoum Solar Park in the UAE, which are among the largest in the world. In addition, the MENA region’s strategic location positions it as a key player in exporting clean energy to Europe and neighboring regions, fostering cross-border energy cooperation and contributing significantly to global carbon reduction efforts [11].

Solar power forecasting operates across various timescales, each tailored to specific applications in energy management and grid operations, as shown in Figure 2. Short-term forecasting, ranging from minutes to a few hours ahead, is critical for real-time energy dispatch and grid stability, addressing fluctuations in solar irradiance due to weather changes [12,13]. Medium-term forecasts, spanning several hours to days, support day-ahead energy market operations, resource scheduling, and maintenance planning. Long-term forecasts, covering weeks to years, are essential for strategic planning, policy formulation, and investment decisions in renewable energy infrastructure. Each timescale utilizes distinct data sources and modeling techniques, such as numerical weather prediction for longer horizons and machine-learning (ML) models for near-term predictions, ensuring optimized integration of solar power into smart energy systems. These forecasting capabilities are indispensable for achieving efficient, reliable, and sustainable solar energy supply in the transition toward zero-carbon cities [12,14].

1.2. Sustainable Artificial Intelligence (SAI)

Sustainable artificial intelligence (SAI) represents a transformative approach to developing and deploying AI technologies, prioritizing environmental, economic, and social sustainability. With the rapid advancement of AI systems, their growing energy consumption and associated carbon emissions have raised significant concerns. For instance, training large AI models like GPT-3 is estimated to produce hundreds of tons of CO₂ emissions, prompting researchers to seek more eco-friendly alternatives [15,16]. SAI aims to address these challenges by optimizing computational efficiency, leveraging renewable energy sources, and adopting green AI methodologies that balance performance and environmental impact [17,18]. This approach reduces the carbon footprint of AI applications and aligns technological progress with global sustainability goals, such as the United Nations’ Sustainable Development Goals (SDGs) [19,20].

Moreover, SAI extends beyond energy considerations to incorporate fairness, inclusivity, and ethical principles in AI design and implementation. By fostering transparent, unbiased, and equitable algorithms, SAI addresses critical societal issues while ensuring long-term benefits for diverse stakeholders [21]. Applications of SAI in renewable energy systems, such as wind power forecasting and grid optimization, exemplify its potential to contribute to zero-carbon cities and combat climate change [22]. As AI becomes increasingly integrated into critical sectors like healthcare, transportation, and urban planning, adopting sustainable practices is imperative for mitigating unintended consequences while maximizing the societal value of AI innovations [23].

1.3. Research Gaps and Problems

Despite the advancements in AI models for solar power forecasting, significant research gaps and challenges remain. One notable issue is the trade-off between predictive accuracy and environmental sustainability. While deep learning (DL) models demonstrate exceptional accuracy, their computational intensity results in higher CO₂ emissions, raising concerns about the environmental impact of their widespread adoption. The lack of standardized frameworks for measuring and mitigating the carbon footprint of AI models complicates efforts to align solar forecasting technologies with global sustainability goals. Furthermore, most studies emphasize single-algorithm forecasting, leaving a gap in the comparative analysis of AI models. Comparing various models provides deeper insights into their predictive accuracy, computational efficiency, and environmental impact, and thereby supports the selection of the most suitable model for predicting the solar power supply of smart cities [24,25].

Another critical challenge lies in the availability and quality of data used for training AI models. Solar irradiance and power datasets are often location-specific, limiting the generalizability of models trained on them. Additionally, integrating heterogeneous data sources, such as weather patterns, satellite imagery, and historical irradiance records, remains a complex task requiring advanced data preprocessing techniques. While hybrid models such as CNN-LSTM excel at capturing spatial and temporal patterns, their reliance on extensive computational resources hinders their practical implementation in low-resource settings. Addressing these gaps necessitates further exploration of ensemble methods, transfer learning, and federated learning approaches to create scalable, accurate, and sustainable forecasting systems. These efforts are essential for enabling the broader adoption of renewable energy technologies and achieving the goal of zero-carbon energy systems [26].

1.4. Research Objectives

The primary objective of this study is to evaluate the performance of artificial intelligence (AI) models for solar power forecasting, focusing on balancing predictive accuracy, computational efficiency, and environmental sustainability. By benchmarking various ML and DL models, including hybrid CNN-LSTM, Random Forest (RF), Support Vector Machine (SVM), and standalone LSTM and CNN models, this research aims to identify algorithms that deliver high forecasting accuracy while minimizing CO₂ emissions and computational resource demands. Key performance metrics such as MAPE, RMSE, MAE, and the adjusted coefficient of determination (R*²) are used to rigorously assess the performance of each model across different forecasting horizons. This objective aligns with the broader goal of optimizing renewable energy systems and advancing sustainable AI practices for the energy supply of smart cities. In addition, the research seeks to develop energy-efficient AI models by applying sustainable AI techniques while maintaining high prediction accuracy. These objectives are expected to contribute to developing scalable, eco-friendly solar forecasting systems, supporting the global transition toward zero-carbon energy infrastructures.

The remainder of this paper is organized as follows. Section 2 presents a comprehensive literature review on AI-based solar forecasting models and sustainable artificial intelligence practices. Section 3 outlines the research methodology, including data collection, preprocessing, and model development procedures. Section 4 details the operational characteristics of the Benban and Sakaka solar power plants and the input parameters used for forecasting. Section 5 discusses the development and tuning of machine-learning and deep learning models, including the proposed CNN-LSTM hybrid. Section 6 presents and analyzes the results, including forecasting accuracy, CO₂ emissions, and contributions to smart and zero-carbon cities. Section 7 addresses the limitations of the current study and proposes directions for future research. Finally, Section 8 concludes the paper by summarizing key findings and their implications for sustainable energy forecasting.

2. Literature Review

2.1. AI-Driven Solar Forecasting: Advances and Challenges

Integrating artificial intelligence (AI) into solar power forecasting has revolutionized energy management systems, particularly for smart cities striving for sustainability. Advanced ML and DL algorithms are increasingly employed to accurately predict solar irradiance, leveraging diverse data sources such as meteorological records and historical power generation data. Models such as Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNNs) excel in capturing temporal and spatial patterns, respectively, in solar datasets. Recent studies highlight the superior performance of hybrid models, which combine these strengths [14,27,28]. These models are instrumental in optimizing the integration of renewable energy sources into smart grids, enhancing reliability and efficiency in energy supply for urban areas.

While accuracy has been a primary focus, the environmental impact of AI-driven solar forecasting systems has garnered increasing attention. The computational demands of DL models can lead to significant carbon emissions, contradicting the sustainability goals they aim to support. Recent research has emphasized the importance of developing energy-efficient algorithms to minimize the environmental footprint of solar power forecasting [29,30]. Techniques such as pruning, quantization, and leveraging renewable energy for computation have emerged as promising solutions. These approaches aim to balance high-performance forecasting with environmental responsibility, making AI a more viable tool for achieving zero-carbon energy goals in smart cities [31,32].

Another development area involves improving the scalability and generalizability of solar power forecasting models. Many existing models are limited by their reliance on location-specific data, which hinders their application in diverse geographic regions. Transfer learning and federated learning methods have shown promise in addressing this challenge by enabling models to adapt to new datasets with minimal retraining [33,34,35]. These techniques allow the integration of heterogeneous data sources, including global solar radiation maps and regional weather patterns, enhancing the robustness and adaptability of forecasting systems. Such innovations are crucial for scaling AI-driven solutions across smart cities worldwide.

The convergence of AI and Internet of Things (IoT) technologies offers further potential for advancing solar power forecasting in smart cities. IoT-enabled sensors and devices can provide real-time data on solar irradiance, temperature, and atmospheric conditions, which can be seamlessly integrated into AI models for dynamic forecasting [36,37]. Additionally, AI-driven digital twins have emerged as a novel approach, allowing virtual replicas of solar energy systems to simulate and optimize energy flows in real-time. These advancements improve the accuracy and responsiveness of forecasting systems and enable proactive energy management, paving the way for resilient and sustainable energy infrastructures in urban environments [38].

2.2. AI Algorithms

Solar forecasting for one hour ahead relies on various machine-learning and statistical models, each with unique parameters influencing their performance. RF is an ensemble model that uses multiple decision trees, where each tree is trained on random subsets of data and features, as shown in Figure 3. Its parameters, such as the number of trees and minimum samples required for splits, determine the model’s ability to capture non-linear patterns in solar irradiance [29,39,40]. By aggregating predictions from individual trees, RF reduces overfitting and improves generalization. This model excels in handling large datasets with diverse input features, such as solar irradiance, temperature, and wind speed, making it effective for short- and long-term solar forecasting [41,42].

SVM and Gradient Boosting Machine (GBM) represent powerful approaches for solar energy prediction. SVM works by mapping input data to a higher-dimensional space using kernels (e.g., linear, radial basis function) to find a hyperplane that minimizes prediction errors. Key parameters include the kernel type, the regularization parameter (C), and gamma, which controls the influence of individual data points in RBF kernels. GBM, on the other hand, builds predictive models iteratively by minimizing a loss function at each step. Its parameters, such as the learning rate, number of boosting iterations, and maximum tree depth, allow for fine-grained control over model complexity and performance. GBM is particularly suited for capturing non-linear trends in solar data over different timescales, especially when combined with weather and temporal variables [43,44,45].
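To make the role of gamma concrete, the following minimal sketch evaluates the standard RBF kernel, exp(-gamma * ||x - z||²), for two feature vectors. The function name and the sample vectors are hypothetical and are not taken from the study's data; the sketch only illustrates how a larger gamma makes similarity decay faster with distance.

```python
import math


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# Two hypothetical, already-normalized feature vectors
# (irradiance, temperature, wind speed); illustrative values only.
sample_a = [0.80, 0.55, 0.20]
sample_b = [0.60, 0.50, 0.30]

for gamma in (0.1, 1.0, 10.0):
    similarity = rbf_kernel(sample_a, sample_b, gamma)
    print(f"gamma={gamma:>4}: similarity={similarity:.4f}")
```

With a small gamma, distant samples still look similar and the fitted function is smoother; with a large gamma, only very close samples influence each other.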

Deep learning models, such as LSTM and CNNs, are widely used for solar forecasting due to their ability to model temporal and spatial data, respectively, as shown in Figure 4. LSTM, a recurrent neural network variant, captures long-term dependencies in sequential data like time-series solar irradiance. Its key parameters include the number of LSTM layers, the number of hidden units, and dropout rates for regularization. CNNs are specialized for spatial data analysis, such as weather maps, using parameters like kernel size, number of filters, and pooling layers [44,46].

As shown in Figure 5, hybrid CNN-LSTM models combine the spatial feature extraction of CNNs with the temporal analysis of LSTMs, making them highly effective for spatiotemporal solar forecasting. ARIMA, a traditional statistical model, models time-series data based on autoregressive (p), differencing (d), and moving average (q) components. This simplicity makes ARIMA effective for linear short-term forecasting; however, it requires precise parameter selection to ensure accuracy in dynamic solar energy systems [47,48].

3. Research Methodology

The methodology for developing the proposed machine-learning framework to support a “smart and zero-carbon city” is grounded in a comprehensive, data-driven approach that integrates both predictive performance and environmental sustainability, as shown in Figure 6. The process begins with data collection from two major utility-scale solar installations: Benban Solar Park in Egypt and Sakaka Solar Power Plant in Saudi Arabia. These facilities provide high-resolution meteorological and operational data, including variables such as solar irradiance, ambient temperature, wind speed, humidity, solar zenith angle, and parameters specific to photovoltaic systems. Before model development, the raw datasets undergo rigorous preprocessing steps, namely data cleaning to remove inconsistencies, normalization using min–max scaling to ensure uniform feature scales, and structuring to align inputs for time-series modeling. The refined dataset is then partitioned into training (70%), validation (15%), and testing (15%) sets to support systematic learning and evaluation. Additional explanation of this data processing pipeline, including outlier removal using the interquartile range (IQR) method and feature encoding strategies, is provided in Section 4.
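The 70/15/15 partition described above can be sketched as follows. The sketch assumes a chronological split without shuffling, which is a common choice for time series but is not stated in the text; the function name and the integer-percentage arguments are illustrative.

```python
def chronological_split(records, train_pct=70, val_pct=15):
    """Split an ordered sequence into train/validation/test parts without shuffling.

    Integer percentages avoid floating-point rounding in the split boundaries.
    The remainder after the training and validation parts becomes the test set.
    """
    n = len(records)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return records[:train_end], records[train_end:val_end], records[val_end:]


# One station's nominal hourly record count (365 days x 24 h x 2 years).
hours = list(range(17520))
train, val, test = chronological_split(hours)
print(len(train), len(val), len(test))
```

Keeping the test block strictly later than the training block prevents information from future hours leaking into training.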

Following preprocessing, the model training stage involves fitting several machine-learning and deep learning algorithms, including Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting Machine (GBM), Long Short-Term Memory (LSTM) networks, Convolutional Neural Networks (CNNs), a hybrid CNN-LSTM model, and the ARIMA statistical method. Each model is trained using the designated training set and optimized using Bayesian hyperparameter tuning, implemented via the Tree-structured Parzen Estimator (TPE). This optimization minimizes the Mean Absolute Percentage Error (MAPE) on the validation set while iteratively selecting the best parameter configurations, such as tree depth (RF, GBM), kernel functions (SVM), and architectural parameters like the number of layers, neurons per layer, dropout rates, and batch sizes (LSTM, CNN, CNN-LSTM). The CNN-LSTM architecture is designed to extract spatial features using convolutional layers, which are then passed into LSTM layers to capture temporal dependencies in the sequence data. Detailed architecture specifications and model flow are described in Section 5 to illustrate the sequential learning structure and feature transformation stages employed in this hybrid model.
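Sequence models such as LSTM and CNN-LSTM are typically fed fixed-length windows of past observations paired with the value to be predicted. The sketch below shows one way to build such windows for a one-hour-ahead target. The window length of three hours, the function name, and the toy data are assumptions made for illustration; the text does not specify the window length used in the study.

```python
def make_windows(features, target, window):
    """Pair each block of `window` consecutive hourly rows with the target
    value for the hour immediately after the block (one hour ahead)."""
    inputs, labels = [], []
    for end in range(window, len(features)):
        inputs.append(features[end - window:end])  # rows end-window .. end-1
        labels.append(target[end])                 # the following hour
    return inputs, labels


# Toy data: six hourly rows of (irradiance, temperature) and power in kW.
hourly_features = [[200, 21], [450, 24], [700, 27], [850, 30], [780, 31], [520, 29]]
hourly_power_kw = [310, 720, 1130, 1380, 1260, 830]

window_inputs, window_labels = make_windows(hourly_features, hourly_power_kw, window=3)
print(len(window_inputs), "samples; first label:", window_labels[0])
```

Each sample has shape (window, number of features), which is the layout that convolutional and recurrent layers usually expect after conversion to an array.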

After the training and validation phases, model performance is evaluated on the testing dataset using a comprehensive set of performance metrics: MAPE (Mean Absolute Percentage Error), RMSE (Root Mean Square Error), MAE (Mean Absolute Error), and the adjusted coefficient of determination (R*²). These metrics capture both average error magnitude and variance explanation, offering a balanced view of prediction quality. To ensure robustness and generalizability, K-fold cross-validation is applied during model development, averaging results across multiple partitions of the training set. Crucially, the framework also incorporates carbon footprint estimation, calculating the CO₂ emissions per computational hour for each model during training and inference. This step, aligned with sustainable AI principles, allows the selection of models that balance high forecasting accuracy with low environmental impact. The integration of training–validation–testing cycles with both performance and sustainability metrics establishes a robust foundation for real-world deployment in smart grid infrastructure. Further architectural and metric-specific details are elaborated in Section 5 to provide transparency in model evaluation and reproducibility of the results.
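The K-fold averaging mentioned above can be sketched in a few lines. The number of folds (five here), the contiguous fold layout, and the toy "model" are assumptions for illustration only; the text does not state the value of K or how the folds were formed.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        val_idx = list(range(start, stop))
        train_idx = list(range(start)) + list(range(stop, n_samples))
        yield train_idx, val_idx
        start = stop


def cross_validate(values, k, fit, score):
    """Average a validation score over k folds.

    `fit(train_values)` returns a model; `score(model, val_values)` returns an error.
    """
    scores = []
    for train_idx, val_idx in k_fold_indices(len(values), k):
        model = fit([values[i] for i in train_idx])
        scores.append(score(model, [values[i] for i in val_idx]))
    return sum(scores) / len(scores)


# Toy usage: the "model" is just the training mean, scored by mean absolute error.
power = [410.0, 430.0, 445.0, 420.0, 400.0, 415.0, 435.0, 450.0, 425.0, 405.0]
mean_fit = lambda train_values: sum(train_values) / len(train_values)
mae_score = lambda model, val_values: sum(abs(v - model) for v in val_values) / len(val_values)
cv_error = cross_validate(power, k=5, fit=mean_fit, score=mae_score)
print(f"5-fold mean absolute error: {cv_error:.2f}")
```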

4. Data Collection

Benban Solar Park in Aswan, Egypt, is one of the largest solar energy projects globally and a cornerstone of Egypt’s commitment to renewable energy, as shown in Figure 7. Covering an area of approximately 37.2 square kilometers in the Sahara Desert, the park consists of 41 solar power plants with a total capacity of 1.8 GW, supplying electricity to over one million homes. This ambitious project is part of Egypt’s Sustainable Energy Strategy 2035, which aims to increase the contribution of renewable energy to the national grid while reducing greenhouse gas emissions [49,50]. Financed through partnerships with private developers and global organizations, such as the International Finance Corporation (IFC), Benban Solar Park exemplifies international collaboration in advancing sustainable energy infrastructure. The park meets local energy demands and positions Egypt as a renewable energy hub in the Middle East and North Africa (MENA) region [51].

Table 1 provides a comparative overview of Benban Solar Park in Egypt and Sakaka Solar Power Plant in Saudi Arabia, highlighting key operational parameters. Benban Solar Park, located in Aswan Governorate, Egypt, was commissioned in 2019 with a total capacity of 1650 MW, significantly larger than the Sakaka plant, which was commissioned in 2020 with a capacity of 405 MW. Both locations receive high solar irradiance, with Benban at approximately 2300 kWh/m²/year and Sakaka at around 2200 kWh/m²/year. Benban covers a vast area of approximately 37.2 km² and utilizes around 7.2 million solar panels, whereas Sakaka spans 6 km² with over 1.2 million panels. In terms of energy generation, Benban produces around 3.8 TWh annually, generating an average of 10.4 GWh daily and 433 MW hourly. In comparison, Sakaka generates approximately 0.94 TWh annually, with an average daily generation of 2.6 GWh and an hourly output of 107 MW, reflecting its smaller scale but efficient energy production within the available area.
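The daily and hourly averages quoted above follow directly from the annual generation figures. The short sketch below reproduces that arithmetic, assuming a 365-day year and uniform averaging over all 24 hours; the function name is illustrative.

```python
def average_outputs(annual_twh, days_per_year=365):
    """Convert annual generation (TWh) to mean daily energy (GWh) and mean power (MW)."""
    daily_gwh = annual_twh * 1000 / days_per_year  # 1 TWh = 1000 GWh
    mean_power_mw = daily_gwh * 1000 / 24          # 1 GWh over 1 h = 1000 MW
    return daily_gwh, mean_power_mw


for plant, annual_twh in (("Benban", 3.8), ("Sakaka", 0.94)):
    daily_gwh, mean_power_mw = average_outputs(annual_twh)
    print(f"{plant}: {daily_gwh:.2f} GWh/day, {mean_power_mw:.1f} MW average")
```

Note that the hourly figure is a round-the-clock average; daytime output is considerably higher because generation is zero at night.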

Table 2 highlights the key input parameters utilized in solar forecasting models, which are essential for predicting solar energy generation. These parameters include solar irradiance (0–1100 W/m²), the primary driver of photovoltaic (PV) output, as it measures the solar power incident per unit area. Ambient temperature (5–45 °C) is critical because it affects the efficiency of solar panels; higher temperatures can reduce panel performance. Wind speed (0–15 m/s) influences panel cooling, helping to counteract temperature effects, while humidity (10–50%) impacts air clarity and can reduce solar irradiance due to water vapor absorption. Temporal variables such as the time of day (0–24 h) and month of the year (January–December) are included to account for diurnal and seasonal variations in solar intensity.

Furthermore, geometric and panel-specific factors play a crucial role. The solar zenith angle (0–75°), which measures the sun’s position relative to the vertical, directly influences the amount of sunlight hitting the panels. Similarly, the panel tilt angle (0–80°) and panel orientation (0–300°) are adjusted to optimize exposure to sunlight based on geographic location and time of year. Panel efficiency (15–22%) reflects the ability of PV cells to convert solar energy into electricity, a parameter influenced by both technology and environmental conditions. Collectively, these input parameters enable solar forecasting models to deliver accurate predictions by considering the complex interplay of environmental, temporal, and technological factors. Such forecasts are vital for optimizing energy management in solar parks and effectively integrating renewable energy into the grid.

The dataset used in this study was collected from two utility-scale solar facilities: Benban Solar Park in Egypt and Sakaka Solar Power Plant in Saudi Arabia. Data acquisition was conducted hourly over a continuous 24-month period (January 2023 to December 2024) using calibrated on-site sensors and local weather stations at each site. This high-frequency monitoring yielded approximately 17,520 observations per station (365 days × 24 h × 2 years), resulting in a combined total of 35,040 hourly data points. The dataset includes a comprehensive set of meteorological and technical input variables essential for solar power forecasting. These input features consist of solar irradiance (W/m²), ambient temperature (°C), wind speed (m/s), relative humidity (%), and solar zenith angle (°), alongside photovoltaic system parameters such as panel tilt angle (°), azimuth orientation (°), and conversion efficiency (%). In addition, temporal attributes such as the hour of the day, day of the year, and month are included to account for diurnal and seasonal variations.
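The observation counts above can be checked with a few lines of arithmetic. The sketch compares the nominal count based on 365-day years with a calendar-exact count for January 2023 through December 2024; the variable and function names are illustrative.

```python
from datetime import datetime, timedelta


def hourly_count(start, end_exclusive):
    """Number of whole hours in the half-open interval [start, end_exclusive)."""
    return (end_exclusive - start) // timedelta(hours=1)


nominal_per_station = 365 * 24 * 2
calendar_per_station = hourly_count(datetime(2023, 1, 1), datetime(2025, 1, 1))

print("nominal per station: ", nominal_per_station)
print("calendar per station:", calendar_per_station)
print("nominal combined:    ", 2 * nominal_per_station)
```

The calendar count is 24 hours higher than the nominal one because 2024 is a leap year, which is consistent with the count being described as approximate.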

Before model training, the dataset underwent several preprocessing steps. Missing values were imputed using linear interpolation to maintain temporal continuity. Outliers were detected using the interquartile range (IQR) method and removed or capped as appropriate. Finally, all numerical features were normalized to a [0, 1] scale using min–max scaling to ensure consistent model convergence and to prevent dominance by high-magnitude variables. The output variable is the forecasted solar power output (in kW) one hour ahead, enabling the model to learn temporal dependencies in generation behavior. This robust dataset structure ensures high-resolution, reproducible model training and evaluation across varied climatic and operational contexts.
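The three preprocessing steps described above can be sketched with the standard library alone. The sketch makes several assumptions that are not stated in the text: missing values are represented as None, outliers are capped rather than removed, and the conventional 1.5 × IQR fences are used. The sample irradiance values are invented for illustration.

```python
import statistics


def interpolate_missing(values):
    """Fill interior None gaps by linear interpolation between known neighbors.

    Leading or trailing gaps are left as None in this sketch.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def cap_outliers_iqr(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is an assumed convention."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical hourly irradiance readings (W/m^2) with gaps and one sensor spike.
irradiance = [0, 120, None, 480, 650, None, None, 900, 5000, 300]

filled = interpolate_missing(irradiance)
capped = cap_outliers_iqr(filled)
scaled = min_max_scale(capped)
print([round(v, 3) for v in scaled])
```

In practice the scaling bounds would be computed on the training set only and then reused for the validation and test sets, so that no information from later data leaks into training.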

Figure 8 illustrates the typical daily power output of Benban Solar Park, highlighting the inherent variability of solar energy. Power generation begins at sunrise, gradually increases to a peak around midday when solar radiation is strongest, and then steadily declines as the sun sets. This bell-shaped curve demonstrates how solar power production is directly influenced by the availability of sunlight throughout the day, with maximum output (around 1500 MW) occurring between 10:00 and 14:00 h and minimal output during nighttime hours.

5. AI Model Development

Bayesian optimization is a powerful approach for hyperparameter tuning, leveraging probabilistic models to efficiently search the parameter space and identify the optimal combination for a given ML model [52,53]. Unlike grid or random search, Bayesian optimization constructs a surrogate model, such as a Gaussian Process, to approximate the objective function and iteratively refines the search. For solar forecasting, hyperparameters such as the number of trees in Random Forest, the learning rate and number of boosting iterations in GBMs, or the kernel size and number of hidden layers in DL models can significantly affect model performance. Metrics such as MAPE, Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) are often used as objective functions in Bayesian optimization, ensuring that models are fine-tuned to minimize prediction errors.

During model validation, these metrics provide critical insights into model performance. MAPE evaluates the average percentage deviation between predicted and actual values, making it suitable for understanding relative forecasting errors in solar irradiance prediction. MSE and RMSE square the errors before averaging, so they weight large errors more heavily and are highly sensitive to outliers, helping to identify models that struggle with extreme prediction errors. MAE complements these by providing a straightforward average of absolute errors, offering a balanced view of overall accuracy. Validation typically involves splitting data into training and testing sets or employing cross-validation to ensure the model generalizes well. The Bayesian optimization process uses these metrics to propose better hyperparameters iteratively.

For comprehensive model assessment, metrics like R² (coefficient of determination) and its adjusted form (R*²) are crucial for validating the goodness-of-fit. R² quantifies the proportion of variance explained by the model, ranging from 0 to 1, where higher values indicate better explanatory power. Adjusted R² accounts for model complexity by penalizing excessive parameters, making it a robust choice for evaluating models with different numbers of predictors. Together, these metrics help Bayesian optimization to guide models toward configurations that balance accuracy and complexity, ensuring high predictive performance while avoiding overfitting. Validation confirms that the optimized model is both accurate and reliable for solar forecasting applications, using the metrics defined in the following equations:

\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|y_{i} - \hat{y}_{i}|}{|y_{i}|} \times 100

(1)

\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}

(2)

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}

(3)

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_{i} - \hat{y}_{i}|

(4)

R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}, \quad 0 \leq R^{2} \leq 1

(5)

R^{*2} = R^{2} - \frac{(1 - R^{2}) K}{n - (K + 1)}

(6)

where y_{i} is the actual value, \hat{y}_{i} is the predicted value, \bar{y} is the mean of the actual values, n is the number of observations, and K is the number of predictors.
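As an illustration, the six metrics can be implemented directly in Python. This is a minimal sketch rather than the study's evaluation code: MAPE is computed with the actual value in the denominator (the standard convention), actual values are assumed to be non-zero (for example, daylight hours only), and the sample numbers are invented.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100 / len(actual) * sum(abs(y - p) / abs(y) for y, p in zip(actual, predicted))


def mse(actual, predicted):
    """Mean squared error."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k predictors."""
    return r2 - (1 - r2) * k / (n - (k + 1))


# Invented example: four hourly power values (kW) and their forecasts.
actual_kw = [100.0, 200.0, 300.0, 400.0]
forecast_kw = [110.0, 190.0, 310.0, 390.0]

r2 = r_squared(actual_kw, forecast_kw)
print(f"MAPE={mape(actual_kw, forecast_kw):.2f}%  RMSE={rmse(actual_kw, forecast_kw):.1f}  "
      f"MAE={mae(actual_kw, forecast_kw):.1f}  R2={r2:.3f}  "
      f"adjusted R2={adjusted_r_squared(r2, n=4, k=1):.3f}")
```

Because squaring weights large errors more heavily, RMSE is never smaller than MAE on the same data, which provides a quick sanity check when reading a results table.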

To enhance model performance and ensure fair comparison, Bayesian optimization was employed for hyperparameter tuning across all forecasting models evaluated in this study. The optimization process utilized the Tree-structured Parzen Estimator (TPE) algorithm and was conducted over 50 iterations for each model configuration. The objective function guiding this process was the minimization of the Mean Absolute Percentage Error (MAPE) on the validation dataset, thereby aligning model selection with forecasting accuracy. A tailored set of hyperparameters was selected for each model class. For ensemble learning algorithms (Random Forest and Gradient Boosting Machines), key parameters included the number of estimators, tree depth, and feature sampling strategies. For Support Vector Machines, kernel type, regularization coefficient (C), and kernel coefficient (gamma) were optimized. Deep learning models such as LSTM, CNN, and the hybrid CNN-LSTM were tuned with respect to network architecture (number of layers, units per layer), dropout rates, batch size, and the number of training epochs. Table 3 and Table 4 present the final optimized hyperparameter values for all models trained on the Benban Solar Park and Sakaka Solar Power Plant datasets, respectively.

6. Results and Discussion

6.1. Solar Power Forecasting for Benban Solar Park, Egypt

Table 5 summarizes the performance of various AI models for solar forecasting, evaluated using key metrics: MAPE, RMSE, MAE, and the adjusted coefficient of determination (R*²). These metrics provide insights into each model’s accuracy and reliability in predicting solar power. Among the models, the hybrid CNN-LSTM model achieves the best performance, with the lowest MAPE (2.04), RMSE (184), and MAE (252), and the highest R*² (0.99). This demonstrates its ability to effectively capture both spatial and temporal patterns, making it ideal for highly accurate solar forecasting.

Traditional machine-learning models like RF and SVM perform moderately well. RF achieves a high R*² of 0.95 with a relatively low MAPE (4.47), making it a robust choice for general forecasting. However, SVM shows higher prediction errors (MAPE: 8.42, RMSE: 416.4), indicating that it handles complex non-linearities less effectively in this context. The GBM model achieves an MAPE of 6.88 and an R*² of 0.87, suggesting it captures non-linear relationships better than SVM but falls behind RF in overall accuracy.

Deep learning models, including LSTM and CNNs, highlight the strength of neural networks in solar power forecasting. While LSTM has a low MAPE (3.44), its RMSE (840) and MAE (936) suggest challenges with larger prediction errors. CNNs achieve lower RMSE (336) and MAE (528), though their R*² (0.85) is slightly lower. As a statistical baseline, the ARIMA model delivers the highest MAPE (11.77) with an R*² of 0.91, emphasizing its limitations for non-linear and long-term solar forecasting. These results underscore the importance of selecting advanced models such as hybrid CNN-LSTM for highly accurate and reliable predictions in solar energy applications.

CodeCarbon 3.0.4 is an open-source Python toolkit that estimates the carbon footprint of code execution by monitoring energy consumption across computational resources such as CPU, GPU, and memory. It factors in hardware usage, runtime, and geographic location to compute CO₂ emissions using regional electricity carbon intensity data from sources like the International Energy Agency (IEA). In this study, such an approach supports the evaluation of environmental impacts associated with training and deploying AI models. By enabling transparent measurement of emissions per computational hour, CodeCarbon’s methodology aligns with the paper’s goal of promoting sustainable AI practices in solar power forecasting, particularly by assessing the trade-offs between model accuracy and carbon efficiency.
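The general accounting behind this kind of estimate can be sketched as energy consumed multiplied by the carbon intensity of the local grid. The snippet below is a simplified illustration of that idea only; it does not call CodeCarbon or reproduce its internal implementation. The class and function names, the power draws, and the grid intensity of 400 g CO₂/kWh are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class HardwareUsage:
    """Average power draw per component, in watts (hypothetical values)."""

    cpu_watts: float
    gpu_watts: float
    ram_watts: float

    def total_watts(self) -> float:
        return self.cpu_watts + self.gpu_watts + self.ram_watts


def energy_kwh(usage: HardwareUsage, runtime_hours: float) -> float:
    """Energy consumed over the run: watts x hours, converted to kWh."""
    return usage.total_watts() * runtime_hours / 1000.0


def emissions_g_co2(
    usage: HardwareUsage, runtime_hours: float, grid_intensity_g_per_kwh: float
) -> float:
    """Estimated emissions: energy (kWh) x regional carbon intensity (g CO2/kWh)."""
    return energy_kwh(usage, runtime_hours) * grid_intensity_g_per_kwh


# Hypothetical training run: 250 W average draw for one computational hour
# on a grid assumed to emit 400 g CO2 per kWh.
usage = HardwareUsage(cpu_watts=60.0, gpu_watts=180.0, ram_watts=10.0)
print("Energy (kWh):", energy_kwh(usage, runtime_hours=1.0))
print("Emissions (g CO2):", emissions_g_co2(usage, 1.0, 400.0))
```

Because the result scales linearly with both runtime and grid intensity, the same model can have a different footprint depending on where and for how long it is trained.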

Figure 9 illustrates the CO₂ emissions per computational hour for various AI forecasting models used in solar prediction. The bar chart compares the environmental impact of these models based on their energy consumption during training and inference. ARIMA, with the lowest emissions at 75 g CO₂/hour, is the most energy-efficient, followed by RF and SVM, emitting 102 g and 111 g CO₂/hour, respectively. These models are highlighted as “green algorithms” due to their minimal environmental impact, making them ideal for sustainability-conscious applications where computational efficiency is critical.

Conversely, deep learning (DL) models such as CNN-LSTM, CNNs, and LSTM demonstrate significantly higher CO₂ emissions, ranging from 215 g to 287 g CO₂/hour. While these models deliver higher forecasting accuracy, their energy-intensive nature raises concerns about sustainability. Between these two groups, the ensemble-based GBM emits a comparatively moderate 130 g CO₂/hour, striking a balance between performance and energy consumption. The figure emphasizes the trade-off between prediction accuracy and environmental impact, underscoring the importance of optimizing AI models for both accuracy and energy efficiency to support green energy initiatives.

Figure 10 illustrates the performance of the CNN-LSTM model in forecasting solar power output. The graph compares the actual power output (blue line) with the predicted power output (orange line) over a period represented in hours. The two lines closely align, indicating that the model effectively captures the temporal patterns of solar power generation. The cyclic nature of the graph reflects the daily solar power generation pattern, with peaks during midday when sunlight is abundant and troughs during nighttime. The strong correlation between the actual and forecasted outputs demonstrates the accuracy and reliability of the CNN-LSTM model for solar power prediction.

Figure 11 depicts the performance of the CNN-LSTM model in terms of its predictive accuracy, represented by a scatter plot of predicted versus actual solar power outputs. Each data point corresponds to a specific prediction, with the proximity of points to the diagonal line indicating the alignment between the predicted and actual values. The high coefficient of determination (R² = 0.9969) signifies a strong correlation, demonstrating that the model explains approximately 99.69% of the variance in the actual power output. This indicates that the CNN-LSTM model achieves highly accurate predictions with minimal error.

6.2. Solar Power Forecasting for Sakaka Solar Power Plant, Saudi Arabia

Table 6 presents the accuracy performance of various artificial intelligence (AI) models developed for one-hour-ahead solar power forecasting at Sakaka Solar Power Plant in Saudi Arabia. The models were evaluated using key performance metrics, including Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²). Among the models, the hybrid CNN-LSTM model demonstrated superior performance with the lowest MAPE of 2%, RMSE of 190, MAE of 255, and the highest R² value of 0.98, indicating excellent prediction accuracy. The Long Short-Term Memory (LSTM) model also showed relatively good accuracy with an MAPE of 3.4%, while Random Forest (RF) and Convolutional Neural Networks (CNNs) recorded MAPE values of 4.5% and 4.3%, respectively. On the other hand, the ARIMA model exhibited the highest MAPE of 11.8%, reflecting lower forecasting accuracy compared to AI-based models. Overall, the results highlight that deep learning models, particularly the hybrid CNN-LSTM, significantly outperformed traditional statistical and machine-learning models in forecasting solar power for short-term prediction horizons.

The CNN-LSTM hybrid model stands out as the most accurate forecasting model, achieving the lowest MAPE (2.00%) and RMSE (190) among all models, indicating its exceptional predictive performance. By combining the spatial feature extraction capabilities of CNNs with the temporal sequence modeling strength of LSTM networks, the CNN-LSTM effectively captures both spatial and temporal patterns in solar data. Despite its high computational complexity, its CO₂ emissions of 287 g per computational hour, shown in Figure 12, are acceptable when weighed against its accuracy. This makes the CNN-LSTM model a compelling choice for applications prioritizing precision, particularly in high-stakes solar energy forecasting scenarios.

Figure 12 illustrates the CO₂ emissions associated with one hour of computation for each forecasting model used at Sakaka Solar Power Plant. Among all models, the CNN-LSTM hybrid exhibited the highest carbon footprint, producing 287 g of CO₂ per computational hour, followed by CNNs (237 g) and LSTM (215 g). These deep learning models, particularly those with sequential or layered convolutional architectures, demand significantly more computational resources, resulting in higher energy usage and carbon emissions. In contrast, classical machine-learning models such as GBM (130 g), SVM (111 g), and RF (102 g) demonstrate moderate emissions, reflecting their balance between model complexity and computational efficiency. The ARIMA model registered the lowest emissions at just 75 g, highlighting its minimal computational requirements.

The data emphasize a clear trade-off between forecasting accuracy and environmental sustainability. While complex architectures like CNN-LSTM may offer superior predictive accuracy, they incur significantly higher environmental costs in terms of CO₂ output. This has important implications for model deployment, particularly in smart city or edge-computing contexts where energy efficiency and sustainability are key priorities. For applications requiring frequent retraining or continuous deployment, lightweight models such as RF or ARIMA may be preferred despite slightly lower predictive performance, especially if emissions reduction is a strategic goal. Overall, this analysis underscores the necessity of evaluating AI models not only by their technical precision but also by their carbon efficiency, supporting the broader agenda of sustainable AI in renewable energy forecasting.
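One way to make this trade-off concrete is to identify the models that are not dominated on both criteria, that is, models for which no alternative is at least as accurate and at least as clean. The sketch below applies this idea to the Sakaka MAPE and emission values quoted in the text; GBM and SVM are left out because their Sakaka MAPE values are not quoted in the text. The function names are hypothetical, and the analysis is an illustration rather than part of the original evaluation.

```python
from typing import Dict, List, Tuple

# (MAPE in %, g CO2 per computational hour) for Sakaka, as quoted in the text.
# GBM and SVM are omitted because their Sakaka MAPE values are not quoted.
MODELS: Dict[str, Tuple[float, float]] = {
    "CNN-LSTM": (2.0, 287.0),
    "LSTM": (3.4, 215.0),
    "CNN": (4.3, 237.0),
    "RF": (4.5, 102.0),
    "ARIMA": (11.8, 75.0),
}


def pareto_front(models: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the models not dominated on error and emissions (lower is better)."""
    front = []
    for name, (err, co2) in models.items():
        dominated = any(
            o_err <= err and o_co2 <= co2 and (o_err < err or o_co2 < co2)
            for other, (o_err, o_co2) in models.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    # Order the front from lowest to highest emissions.
    return sorted(front, key=lambda m: models[m][1])


def emission_saving(
    models: Dict[str, Tuple[float, float]], baseline: str, candidate: str
) -> float:
    """Fractional CO2 reduction obtained by choosing candidate over baseline."""
    return 1.0 - models[candidate][1] / models[baseline][1]


print("Non-dominated models:", pareto_front(MODELS))
print("CO2 saving, RF vs. CNN-LSTM:", round(emission_saving(MODELS, "CNN-LSTM", "RF"), 3))
```

With these values, the CNN is the only dominated model, since the LSTM reports both a lower MAPE and lower emissions; the remaining four models each represent a different balance between accuracy and carbon cost.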

Figure 13 illustrates the performance of the hybrid CNN-LSTM model in forecasting the solar power output for Sakaka Solar Power Plant in Saudi Arabia over a period of 100 h. The hybrid CNN-LSTM model achieved the highest forecasting accuracy for solar power prediction at Sakaka Solar Power Plant, with an MAPE of 2%, RMSE of 190, MAE of 255, and R² of 0.98. The figure compares the actual power output (blue line) with the forecasted power output (orange line). The results demonstrate that the CNN-LSTM model effectively captures the power generation pattern, accurately predicting the rise and fall of solar power output during daylight hours, while correctly predicting zero output during nighttime hours. The close alignment between the actual and forecasted curves indicates the high accuracy and robustness of the model, with only minor deviations observed at peak generation periods. Overall, the CNN-LSTM model shows excellent potential for short-term solar power forecasting, ensuring reliable and precise energy prediction for grid management and planning.

Figure 14 presents the scatter plot illustrating the performance of the CNN-LSTM model for one-hour-ahead solar power forecasting at Sakaka Solar Power Plant, Saudi Arabia. The plot compares the actual power output with the forecasted power output, where the data points are closely clustered around the diagonal line, indicating a strong correlation between the predicted and observed values. The coefficient of determination (R²) is recorded as 0.9828, highlighting the developed model’s excellent predictive capability and accuracy. This high R² value demonstrates that the CNN-LSTM model can effectively capture the complex patterns of solar power generation, making it a reliable tool for accurate short-term solar power forecasting in real-world applications.

A deeper examination of the forecasting results for Benban Solar Park reveals that the hybrid CNN-LSTM model significantly outperformed all other models in terms of MAPE, RMSE, and R² scores. This superior performance can be attributed to the model’s ability to capture both spatial dependencies (via convolutional layers) and temporal dynamics (via LSTM layers), which is particularly beneficial in handling the complex and non-linear interactions present in the desert climate of Upper Egypt. Notably, performance improvements were especially pronounced during the peak irradiance hours (10:00 a.m.–2:00 p.m.), suggesting that the CNN-LSTM model effectively leveraged strong irradiance gradients and thermal profiles for accurate short-term forecasting. Ensemble models such as Random Forest and GBM demonstrated robust performance during non-peak hours but lacked the sequential learning capabilities required to model abrupt weather transitions. Additionally, the ARIMA model, although useful for capturing linear trends, showed high residual variance, indicating its limited suitability for non-stationary, high-frequency PV output data. These findings underscore the advantage of deep hybrid architectures in data-rich, high-resolution forecasting environments like Benban.

The performance evaluation at Sakaka Solar Power Plant presented a contrasting pattern, highlighting the sensitivity of forecasting models to regional climatic variability. While the CNN-LSTM model again achieved the lowest MAPE (2.00%), its margin over traditional machine-learning models was narrower compared to Benban. This is potentially due to the higher relative humidity and more moderate diurnal irradiance fluctuations in the Al-Jouf region, which reduce the volatility in PV output and make the dataset more amenable to simpler learning models like SVM and GBM. Interestingly, the GBM model yielded consistent results across all hours, indicating its strength in capturing the mild seasonal and atmospheric changes observed at the site. The LSTM-only model struggled to match CNN-LSTM’s accuracy, reaffirming the value of hybridization in contexts with stable yet non-linear features. Moreover, the comparative analysis revealed that computationally lightweight models (e.g., SVM) offered competitive performance with substantially lower training overhead, making them viable for deployment in edge-computing scenarios. These results highlight the importance of aligning model complexity with regional data characteristics and operational goals in sustainable AI deployments.

6.3. LIME Values and Feature Contributions

The LIME (Local Interpretable Model-Agnostic Explanations) importance chart for Benban Solar Park (Figure 15) highlights the relative contribution of input features to the CNN-LSTM model’s solar power forecasting accuracy. Solar irradiance emerges as the most influential feature, contributing approximately 33% to the model’s predictions, followed by ambient temperature at 27%, both of which directly affect photovoltaic output and thermal efficiency. Wind speed and humidity contribute 11% and 9%, respectively, likely due to their effects on convective cooling and atmospheric transparency. Panel efficiency and time of day account for 6% and 5%, while panel tilt angle and month of the year each contribute around 3%. The solar zenith angle and panel orientation are the least influential, each contributing only about 1% to the model’s outputs. These findings confirm that real-time meteorological inputs dominate model performance, with static or seasonal parameters playing a secondary role in short-term forecasting for high-irradiance environments like Benban.

The LIME importance chart for Sakaka Solar Power Plant (Figure 16) reveals the contribution of each input feature to the CNN-LSTM model’s forecasting performance. Solar irradiance remains the most critical input, contributing approximately 32.5% to the model’s predictive accuracy, followed closely by ambient temperature at 30.5%, underscoring their combined influence on photovoltaic generation. Wind speed and humidity show moderate importance, with contributions of around 14% and 11%, respectively, likely reflecting their role in affecting panel cooling and atmospheric clarity in Sakaka’s semi-arid climate. Panel efficiency and time of day account for approximately 5.5% and 4.5%, while solar zenith angle, month of the year, panel orientation, and panel tilt angle each contribute less than 3%, indicating a minor effect on the model’s short-term predictive ability. These findings suggest that, similar to Benban, dynamic meteorological factors are far more influential than static or seasonal attributes in accurate, real-time solar power forecasting for the Sakaka site.

The LIME analyses for both Benban Solar Park and Sakaka Solar Power Plant demonstrate that real-time meteorological variables are the most influential features driving the CNN-LSTM model’s forecasting accuracy. In both locations, solar irradiance and ambient temperature collectively contribute about 60% or more of the model’s predictive power (Benban: 33% and 27%; Sakaka: 32.5% and 30.5%), highlighting their critical impact on photovoltaic performance. Secondary meteorological features such as wind speed and humidity play moderate roles (Benban: 11% and 9%; Sakaka: 14% and 11%), likely due to their influence on panel cooling and atmospheric clarity. In contrast, static or seasonal parameters, including panel tilt angle, orientation, solar zenith angle, and month of the year, collectively contribute less than 10% at both sites, indicating limited relevance for short-term prediction. These results affirm that CNN-LSTM model performance is predominantly driven by dynamic, real-time environmental factors, reinforcing the importance of high-resolution meteorological data for accurate solar forecasting in arid and semi-arid climates.
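The group totals discussed above can be checked with simple arithmetic on the approximate Benban contributions quoted in the text. The sketch below is only a bookkeeping aid: it sums the stated percentages by feature group and does not run LIME itself. The dictionary keys and group names are hypothetical labels; Sakaka is omitted because several of its static-feature contributions are given only as upper bounds.

```python
from typing import Dict, List

# Approximate LIME contributions (%) for Benban, as quoted in the text.
BENBAN_LIME: Dict[str, float] = {
    "solar_irradiance": 33.0,
    "ambient_temperature": 27.0,
    "wind_speed": 11.0,
    "humidity": 9.0,
    "panel_efficiency": 6.0,
    "time_of_day": 5.0,
    "panel_tilt_angle": 3.0,
    "month_of_year": 3.0,
    "solar_zenith_angle": 1.0,
    "panel_orientation": 1.0,
}

# Grouping follows the distinction drawn in the text; the labels are illustrative.
GROUPS: Dict[str, List[str]] = {
    "real_time_meteorological": [
        "solar_irradiance",
        "ambient_temperature",
        "wind_speed",
        "humidity",
    ],
    "static_or_seasonal": [
        "panel_tilt_angle",
        "month_of_year",
        "solar_zenith_angle",
        "panel_orientation",
    ],
}


def group_shares(
    contributions: Dict[str, float], groups: Dict[str, List[str]]
) -> Dict[str, float]:
    """Sum the per-feature contributions within each named group."""
    return {
        group: sum(contributions[feature] for feature in features)
        for group, features in groups.items()
    }


shares = group_shares(BENBAN_LIME, GROUPS)
print("Group shares (%):", shares)
print("Total of quoted contributions (%):", sum(BENBAN_LIME.values()))
```

The quoted contributions total 99% rather than 100% because each value in the text is rounded.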

To enhance decision-making, LIME visualizations can be seamlessly integrated into interactive dashboards that provide clear, real-time explanations of the model’s predictions. These dashboards can visually highlight the most influential input features for each forecast, such as solar irradiance, temperature, or wind speed, enabling operators to understand the rationale behind predicted outputs. By translating complex model behavior into intuitive graphical summaries, these tools make AI-driven forecasts more accessible to non-technical users. This fosters greater operational transparency and situational awareness, which are essential for informed energy planning. Moreover, understanding feature-level impacts allows grid managers to anticipate and respond to sudden environmental changes more effectively. It also enhances trust in AI systems by making them more explainable and accountable. Ultimately, such interpretability supports real-time adjustments in grid operations, battery dispatch, and demand-response strategies, aligning forecasts with practical energy management objectives.

6.4. Contribution to Smart and Zero-Carbon Cities

Figure 17 illustrates the conceptual framework for Sustainable AI-Driven Solar Power Forecasting (SAI-SPF) and its role in enabling energy decarbonization for smart cities. The process begins with solar energy production, which is integrated into the SAI-SPF system for enhanced forecasting and energy management. Real-time computing (RTC) and real-time data (RTD) form the system’s backbone, ensuring accurate and timely solar power predictions. These predictions optimize energy supply management for smart and zero-energy cities, enabling efficient use of renewable energy resources.

The SAI-SPF system leverages IoT devices and data collection to provide a steady flow of RTD. This data is used for forecasting and monitoring greenhouse gas (GHG) emissions, allowing the system to align energy supply with urban decarbonization strategies. The feedback loop ensures continuous system improvement by integrating GHG data into carbon emission measuring processes, which informs the strategy for achieving net-zero carbon cities. This holistic approach enables urban areas to balance energy demand and supply efficiently while reducing their carbon footprint.

Therefore, the framework incorporates urban deep decarbonization strategies as a starting point, guiding cities towards sustainable development goals. By creating a green energy supply pathway that prioritizes solar energy, the system connects renewable energy production with carbon neutrality objectives. The feedback loop ensures adaptive measures are taken to improve performance over time, driving the transition towards a more sustainable and decarbonized future for urban environments.

One-hour-ahead solar power forecasting plays a crucial role in achieving zero-carbon cities by enhancing the integration and reliability of renewable energy systems [54]. This forecasting capability supports real-time decision-making for energy storage utilization, load balancing, and distributed energy resource management, which are essential for maintaining grid stability. Additionally, precise forecasting reduces economic losses from energy overproduction or shortages, fosters the adoption of smart grid technologies, and accelerates the transition toward sustainable urban energy systems, ultimately contributing to the decarbonization of cities [55].

The predicted solar power output serves as a core input for dynamic energy scheduling, enabling grid operators to optimize supply allocation, minimize reliance on fossil-based backup systems, and ensure load stability in smart city infrastructures. Simultaneously, the CO₂ emission estimates, computed per computational hour for each forecasting model, enable stakeholders to evaluate the environmental cost of AI model deployment. This dual-layered prediction informs technical performance (e.g., accuracy, latency) and quantifies sustainability outcomes, such as emissions avoided by early demand-response actions. For example, selecting a model with lower accuracy but roughly 55% lower CO₂ emissions (e.g., GBM at 130 g vs. CNN-LSTM at 287 g per computational hour) can lead to tangible environmental gains in large-scale or real-time applications. Together, these outputs enhance the interpretability and ethical accountability of AI-assisted solar energy systems, aligning the SAI-SPF framework with smart, zero-carbon city objectives.

The generalizability of the proposed CNN-LSTM model across different climatic zones and seasonal variations was a key consideration in the development of the SAI-SPF framework. Although the model was trained and validated only on datasets from Benban Solar Park (desert-arid climate) and Sakaka Solar Power Plant (semi-arid to arid climate), these sites were strategically selected for their position within the global solar belt, a high-irradiance region spanning subtropical latitudes. The model demonstrated consistently low forecasting errors (MAPE < 2.1%) across both sites, suggesting strong adaptability to environmental conditions characterized by high solar exposure, diurnal extremes, and moderate seasonal variation. The hybrid CNN-LSTM architecture, which combines convolutional feature extraction with sequential learning, is particularly effective at capturing local irradiance patterns and temporal dependencies, thereby supporting reliable performance in similar geographic contexts. Consequently, the model is considered well-suited for deployment across other locations within the solar belt, including regions in North Africa, the Middle East, and parts of South Asia. For broader validation, future work will extend the framework to temperate and tropical zones using external datasets, and may incorporate transfer learning or domain adaptation to enhance cross-climate generalization.

The trained CNN-LSTM model was benchmarked using TensorFlow Lite. On an NVIDIA Jetson Nano (Silicon Valley, CA, USA), the model achieved an average inference latency of approximately 210 milliseconds per input batch, which is well within the real-time operational threshold for one-hour-ahead solar forecasting. These findings support the model’s suitability for deployment in edge-computing environments, such as microgrid controllers or localized energy management systems, where low latency and decentralized intelligence are critical for robust solar integration into smart grids. Reducing MAPE from the 4–8% observed in traditional machine-learning models to approximately 2% with the CNN-LSTM model can yield significant operational and economic advantages. Improved forecast precision enables more effective load balancing, battery energy storage system (BESS) cycling, and minimization of curtailment losses, all of which directly contribute to lower operational costs and improved grid reliability.

Moreover, this paper makes three policy recommendations. First, it advocates for the integration of AI-based solar power forecasting systems into national renewable energy mandates, particularly in developing countries seeking to accelerate their transition to zero-carbon energy infrastructures. The adoption of such intelligent forecasting tools can significantly improve grid stability and renewable energy integration, which are critical components of sustainable energy policy. Second, we recommend that regulatory bodies establish incentives for sustainable AI development, including mandatory disclosure of model energy consumption and associated carbon emissions. This would promote accountability and encourage the use of energy-efficient algorithms in line with Green AI principles. Third, the study supports the inclusion of real-time forecasting requirements in Power Purchase Agreements (PPAs) and grid interconnection codes, ensuring that solar energy producers adhere to predictive standards that facilitate efficient energy dispatch and minimize curtailment. Together, these policy recommendations align technical advancements in AI forecasting with broader regulatory and environmental objectives, fostering a more resilient and sustainable energy ecosystem.

6.5. SAI-SPF Contribution to Sustainability

The proposed Sustainable AI-Driven Solar Power Forecasting (SAI-SPF) system significantly contributes to sustainability by advancing the integration of renewable energy into urban energy systems. By leveraging real-time data (RTD) and real-time computing (RTC), the SAI-SPF model enhances the accuracy of solar power forecasts, enabling smarter and more efficient energy supply management. This reduces dependency on fossil fuels and minimizes energy wastage, fostering the transition to green energy in smart cities. In addition, the system’s ability to integrate IoT devices for data collection supports proactive energy planning, ensuring a reliable and sustainable energy supply for growing urban populations [38].

Furthermore, the SAI-SPF framework aligns with global carbon neutrality goals by facilitating urban deep decarbonization strategies. By measuring and monitoring greenhouse gas (GHG) emissions in real-time, cities can adopt adaptive emission reduction measures while promoting net-zero carbon cities. The closed-loop feedback system ensures continuous improvement, allowing the model to refine its predictions for urban ecosystems. Through its innovative approach, the SAI-SPF system advances renewable energy adoption and serves as a critical enabler for achieving environmental, economic, and social sustainability objectives in modern cities. Table 7 expands on how the SAI-SPF system aligns with key SDGs, promoting cleaner energy, smarter infrastructure, sustainable cities, climate action, and global partnerships [56,57].

Therefore, this paper establishes a comprehensive methodological framework for evaluating sustainable artificial intelligence (SAI) practices by integrating environmental, computational, and ethical performance metrics. The framework is grounded in Green AI principles and aligned with the United Nations Sustainable Development Goals (SDGs), particularly SDGs 7, 9, 11, 13, and 17. Environmental sustainability is assessed by quantifying the CO₂ emissions per computational hour for each AI model using power consumption data and emission factors, emphasizing the trade-off between predictive performance and ecological impact. Computational efficiency is measured using model training time, inference speed, and energy consumption across different hardware platforms, including edge devices. Furthermore, the framework incorporates real-time data integration, interpretability, and scalability as essential dimensions for deployment in smart cities. The hybrid CNN-LSTM model, while achieving superior forecasting accuracy (MAPE: 2.0–2.04%), is critically evaluated alongside less energy-intensive models such as Random Forest and SVM to determine the optimal balance between accuracy and sustainability. This approach ensures that AI models are not only technically proficient but also environmentally responsible and socially aligned with long-term urban decarbonization goals.

7. Research Limitations and Future Research

While the SAI-SPF framework demonstrates significant potential, several research limitations need to be addressed for broader applicability. One primary limitation is the dependency on high-quality, real-time data (RTD) and advanced computational resources for accurate solar power forecasting. Many regions, particularly in developing countries, lack the necessary infrastructure, such as IoT devices and reliable data acquisition systems, which could limit the implementation and scalability of the model. Additionally, the system’s reliance on AI and ML models introduces challenges related to computational complexity, model training, and optimization, which could pose barriers for resource-constrained settings.

Another limitation lies in the dynamic and unpredictable nature of weather conditions, which can introduce uncertainties in solar power predictions. Although the SAI-SPF framework is designed to handle variability, extreme or sudden weather changes still impact the model’s accuracy. Furthermore, the integration of this system into existing energy grids requires substantial investments, technical expertise, and policy alignment, which could delay widespread adoption. Addressing these limitations through enhanced data-sharing frameworks, robust policy support, and technological advancements will be crucial to ensuring the long-term success and accessibility of the SAI-SPF model, as shown in Figure 18.

Future research on the SAI-SPF framework could focus on enhancing its adaptability and scalability to accommodate diverse geographical and socio-economic conditions. Developing models that can work effectively in regions with limited real-time data availability or intermittent internet connectivity will be critical. This could involve integrating satellite-based solar irradiation data or developing hybrid approaches that combine traditional statistical methods with AI-driven forecasting. Additionally, further exploration of lightweight and energy-efficient algorithms would enable implementation in resource-constrained settings, broadening the global applicability of the framework as shown in Figure 19.

Another promising avenue for future research is the integration of SAI-SPF into decentralized energy systems, such as microgrids or peer-to-peer energy trading networks. The system could facilitate transparent energy transactions and localized energy management by leveraging blockchain technology and advanced IoT solutions. Furthermore, incorporating advanced climate models and expanding the framework’s scope to include multi-source renewable energy forecasting (e.g., wind and hydropower) would enhance its robustness and effectiveness. These efforts would improve the accuracy and reliability of energy predictions and support the transition to sustainable, resilient energy ecosystems in urban and rural areas.

While the proposed CNN-LSTM-based Sustainable AI-Driven Solar Power Forecasting (SAI-SPF) framework has demonstrated strong performance using real-world data from the Benban and Sakaka solar power plants, several further limitations must be acknowledged. These limitations primarily relate to data availability, generalizability, and computational considerations. Addressing these gaps will be critical for scaling the model to diverse geographic regions, integrating additional environmental variables, and enhancing its suitability for real-time deployment. The key limitations and proposed areas for future research are outlined below:

Lack of Transfer Learning and Domain Adaptation:

The current model was trained and validated exclusively on data from two sites within the solar belt, due to the absence of publicly available, high-resolution datasets from other geographic regions. Future research will explore transfer learning and domain adaptation methods to adapt pre-trained models to new climates with minimal data requirements.

No Inclusion of Transformer-Based Models:

Although the CNN-LSTM model was benchmarked against several traditional and deep learning algorithms (e.g., RF, SVM, GBM, LSTM, CNN, ARIMA), state-of-the-art transformer architectures such as the Temporal Fusion Transformer (TFT) and Informer were not implemented. Their potential for multivariate, long-horizon forecasting will be explored in future comparative evaluations.

Absence of Physics-Informed and Graph-Based AI Models:

The current study focuses on data-driven approaches and does not incorporate Physics-Informed Neural Networks (PINNs) or Spatiotemporal Graph Neural Networks (ST-GNNs). These advanced architectures could embed solar radiation physics and sensor topology to improve performance under complex environmental conditions.
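One lightweight way to bring physical knowledge into a data-driven loss is a soft penalty on physically impossible predictions. The sketch below adds such a penalty to the mean squared error; it is not a full PINN, which would typically include residuals of governing equations, and the bounds, the weighting factor `lam`, and the sample values are assumptions made for illustration.

```python
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def physics_penalty(predicted: Sequence[float], irradiance: Sequence[float],
                    capacity_mw: float) -> float:
    """Mean squared violation of simple physical bounds.

    With zero irradiance the output must be zero; otherwise it must lie
    between zero and the plant's nameplate capacity.
    """
    total = 0.0
    for p, g in zip(predicted, irradiance):
        if g <= 0.0:
            total += p ** 2                        # generation in darkness
        else:
            total += min(p, 0.0) ** 2              # negative output
            total += max(p - capacity_mw, 0.0) ** 2  # above capacity
    return total / len(predicted)


def constrained_loss(actual, predicted, irradiance, capacity_mw, lam=1.0):
    """Data-fit term plus weighted physical-constraint penalty."""
    return mse(actual, predicted) + lam * physics_penalty(predicted, irradiance, capacity_mw)


# Hypothetical hourly samples: irradiance (W/m^2), measured and predicted power (MW).
irradiance = [0.0, 400.0, 900.0]
actual = [0.0, 120.0, 280.0]
feasible = [0.0, 118.0, 285.0]
infeasible = [5.0, 118.0, 310.0]  # night-time output and above a 300 MW capacity

loss_feasible = constrained_loss(actual, feasible, irradiance, 300.0)
loss_infeasible = constrained_loss(actual, infeasible, irradiance, 300.0)
```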

Limited Contextual Environmental Data:

Contextual features such as dust storms, satellite cloud cover, and aerosol indices were not included due to unavailability during the study period. Future work will integrate such remote-sensing data to enhance forecasting accuracy in regions with volatile atmospheric behavior.

No Explicit Uncertainty Modeling in Extreme Conditions:

While outliers were removed during preprocessing to improve stability, no explicit uncertainty modeling was conducted for extreme weather events or irradiance fluctuations. Future efforts will incorporate real-time anomaly detection and probabilistic forecasting techniques to address this gap.
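Two common building blocks of probabilistic forecasting are the pinball (quantile) loss, used to train or score quantile forecasts, and prediction intervals derived from the empirical distribution of past residuals. The following sketch shows both under simplifying assumptions; the residual sample and the 80% coverage level are hypothetical.

```python
from typing import Sequence, Tuple


def pinball_loss(actual: Sequence[float], predicted_quantile: Sequence[float],
                 q: float) -> float:
    """Average pinball loss for quantile level q in (0, 1)."""
    total = 0.0
    for y, yq in zip(actual, predicted_quantile):
        diff = y - yq
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(actual)


def empirical_quantile(values: Sequence[float], q: float) -> float:
    """Quantile of a sample using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def residual_interval(point_forecast: float, residuals: Sequence[float],
                      coverage: float = 0.8) -> Tuple[float, float]:
    """Prediction interval from past residuals (actual minus forecast)."""
    alpha = (1.0 - coverage) / 2.0
    low = empirical_quantile(residuals, alpha)
    high = empirical_quantile(residuals, 1.0 - alpha)
    return point_forecast + low, point_forecast + high


# Hypothetical validation residuals in MW.
residuals = [-25.0, -20.0, -15.0, -10.0, -5.0, 0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
interval = residual_interval(200.0, residuals, coverage=0.8)
```

An interval built this way assumes that future errors resemble past ones, which is least reliable during the extreme events this limitation refers to; conditioning the residuals on weather regime would be a natural refinement.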

Real-Time Deployment Not Yet Implemented:

Although the model was trained on real operational data, it has not yet been deployed in an actual real-time forecasting system. Such deployment depends on formal approval from the administrative authorities at the solar facilities following publication acceptance.

No Error Analysis for Specific Temporal or Environmental Conditions:

A detailed error analysis to isolate underperforming time windows (e.g., dawn, dusk, cloudy conditions) was not conducted due to a lack of labeled event metadata. This analysis will be prioritized in future studies using enriched contextual datasets.
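A condition-specific error analysis of this kind can be sketched by grouping forecast errors by time window. The example below buckets hours into coarse labels and reports the MAE per bucket; the window boundaries and the data are assumptions, and MAE is used instead of MAPE because percentage errors become unstable when output approaches zero around dawn and dusk.

```python
from collections import defaultdict
from typing import Dict, Sequence


def window_of(hour: int) -> str:
    """Map an hour of day (0-23) to a coarse label; boundaries are illustrative."""
    if 5 <= hour < 8:
        return "dawn"
    if 8 <= hour < 16:
        return "midday"
    if 16 <= hour < 19:
        return "dusk"
    return "night"


def mae_by_window(hours: Sequence[int], actual: Sequence[float],
                  predicted: Sequence[float]) -> Dict[str, float]:
    """Mean absolute error per time window."""
    errors = defaultdict(list)
    for h, a, p in zip(hours, actual, predicted):
        errors[window_of(h)].append(abs(a - p))
    return {window: sum(e) / len(e) for window, e in errors.items()}


# Hypothetical hourly samples (MW).
hours = [6, 7, 12, 13, 17, 18]
actual = [20.0, 60.0, 300.0, 310.0, 90.0, 30.0]
predicted = [30.0, 50.0, 302.0, 307.0, 80.0, 45.0]

report = mae_by_window(hours, actual, predicted)
```

With labeled event metadata, the same grouping could be keyed on cloud cover or dust events rather than on the hour alone.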

Model Compression and Pruning Not Applied:

The study did not employ compression techniques such as pruning, quantization, or knowledge distillation. These strategies will be considered in future work to reduce computational load and enable efficient deployment on edge devices or low-resource platforms.
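To make two of these techniques concrete, the sketch below applies magnitude pruning and symmetric 8-bit quantization to a plain list of weights. It is a toy illustration, not a deployment recipe: real frameworks prune and quantize per layer and usually fine-tune afterwards, and the weight values here are invented.

```python
from typing import List, Sequence, Tuple


def magnitude_prune(weights: Sequence[float], sparsity: float) -> List[float]:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, removed = [], 0
    for w in weights:
        if abs(w) <= threshold and removed < k:
            pruned.append(0.0)
            removed += 1
        else:
            pruned.append(w)
    return pruned


def quantize_int8(weights: Sequence[float]) -> Tuple[List[int], float]:
    """Symmetric 8-bit quantization: integer codes in [-127, 127] and a scale."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    codes = [round(w / scale) for w in weights]
    return codes, scale


def dequantize(codes: Sequence[int], scale: float) -> List[float]:
    """Recover approximate floating-point weights from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical layer weights.
weights = [0.5, -0.02, 0.9, 0.01, -0.4, 0.03, -1.27, 0.2]
pruned = magnitude_prune(weights, sparsity=0.5)
codes, scale = quantize_int8(pruned)
restored = dequantize(codes, scale)
```

Half of the weights become exact zeros and the rest are stored as single bytes, at the cost of a rounding error of at most half the scale per weight.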

8. Conclusions

The evaluation of various AI models for solar power forecasting highlights significant performance differences between standalone models and the hybrid approach. The hybrid CNN-LSTM model stands out as the most accurate: for Benban Solar Park it achieved the lowest MAPE (2.04%) and RMSE (184), along with the highest R² (0.99), indicating its superior ability to capture both spatial and temporal patterns in solar data. This model effectively addresses the complexities of solar energy prediction, making it well suited to high-precision applications. While deep learning models such as CNN-LSTM demonstrate remarkable accuracy, their higher computational cost and CO₂ emissions underscore the trade-off between model performance and energy consumption.

In contrast, traditional machine learning models such as RF and SVM offer moderate performance, with RF achieving a high R² of 0.95 and a relatively low MAPE (4.47%), making it a reliable option for general forecasting tasks. These models are also more energy-efficient, emitting less CO₂ per computational hour than deep learning models. ARIMA, the statistical baseline, records the highest MAPE (11.77%) with an R² of 0.91, highlighting its limitations for solar forecasting, especially when dealing with non-linearities and long-term predictions.

The results also emphasize the importance of considering forecasting accuracy and environmental impact when selecting AI models for solar energy applications. While the hybrid CNN-LSTM model excels in accuracy, its higher energy consumption and CO₂ emissions may pose challenges in sustainability-conscious environments. Conversely, though less accurate, models such as RF and SVM provide a more energy-efficient solution, making them suitable for applications where computational efficiency is a priority. This analysis underscores the need for optimizing AI models to balance performance and environmental impact, aligning with the growing demand for green energy solutions and sustainable practices in the renewable energy sector.
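The environmental side of this trade-off can be approximated with a simple energy-based estimate: average hardware power multiplied by runtime, a data-center overhead factor (PUE), and the carbon intensity of the electricity supply. The sketch below illustrates that calculation only; it does not reproduce the estimation procedure used in this study, and every numerical input is a hypothetical placeholder.

```python
def compute_emissions_kg(avg_power_watts: float, hours: float,
                         pue: float, grid_kg_per_kwh: float) -> float:
    """Estimated CO2 (kg) = energy drawn (kWh) x PUE x grid intensity (kg/kWh)."""
    energy_kwh = avg_power_watts / 1000.0 * hours
    return energy_kwh * pue * grid_kg_per_kwh


# Hypothetical inputs: a 250 W accelerator and a 60 W CPU job, each running
# for one hour, with an assumed PUE of 1.5 and 0.4 kg CO2 per kWh.
deep_model_kg = compute_emissions_kg(250.0, 1.0, pue=1.5, grid_kg_per_kwh=0.4)
tree_model_kg = compute_emissions_kg(60.0, 1.0, pue=1.5, grid_kg_per_kwh=0.4)
```

Because the result scales linearly with each factor, the same model can have a very different footprint depending on where and on what hardware it is trained.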

Author Contributions

Conceptualization, H.E., A.A.A.; Data curation, H.E., A.A.A.; Formal analysis, H.E.; Funding acquisition, A.A.A.; Investigation, H.E., F.K.P.H., A.A.A.; Methodology, H.E.; Project administration, H.E., A.A.A.; Resources, H.E., A.A.A.; Software, H.E.; Supervision, H.E., A.A.A.; Validation, H.E., A.A.A.; Visualization, H.E., F.K.P.H., A.A.A.; Writing—original draft, H.E., F.K.P.H., A.A.A.; Writing—review and editing, H.E., F.K.P.H., A.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

The authors appreciate the Ongoing Research Funding Program (ORF-2025-590), King Saud University, Riyadh, Saudi Arabia.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Ionescu, L. Urban Greenhouse Gas Accounting for Net-Zero Carbon Cities: Sustainable Development, Renewable Energy, and Climate Change. Geopolit. Hist. Int. Relat. 2022, 14, 155–171.
2. Komninos, N. Net Zero Energy Districts: Connected Intelligence for Carbon-Neutral Cities. Land 2022, 11, 210.
3. Abdous, M.; Aslani, A.; Noorollahi, Y.; Zahedi, R. Design and Analysis of Zero-Energy and Carbon Buildings with Renewable Energy Supply and Recycled Materials. Energy Build. 2024, 324, 114922.
4. Chenic, A.Ș.; Cretu, A.I.; Burlacu, A.; Moroianu, N.; Vîrjan, D.; Huru, D.; Stanef-Puica, M.R.; Enachescu, V. Logical Analysis on the Strategy for a Sustainable Transition of the World to Green Energy—2050. Smart Cities and Villages Coupled to Renewable Energy Sources with Low Carbon Footprint. Sustainability 2022, 14, 8622.
5. Pan, W.; Pan, M. Drivers, Barriers and Strategies for Zero Carbon Buildings in High-Rise High-Density Cities. Energy Build. 2021, 242, 110970.
6. Duan, Z.; Kim, S. Progress in Research on Net-Zero-Carbon Cities: A Literature Review and Knowledge Framework. Energies 2023, 16, 6279.
7. Xu, X.; Wang, Y.; Ruan, Y.; Wang, J.; Ge, K.; Zhang, Y.; Jin, H. Integrated Energy Planning for Near-Zero Carbon Emission Demonstration District in Urban Areas: A Case Study of Meishan District in Ningbo, China. Energies 2022, 15, 874.
8. Elmousalami, H.H.; Hassanien, A.E. Day Level Forecasting for Coronavirus Disease (COVID-19) Spread: Analysis, Modeling and Recommendations. arXiv 2020, arXiv:2003.07778.
9. Szabó, S.; Pinedo Pascua, I.; Puig, D.; Moner-Girona, M.; Negre, M.; Huld, T.; Mulugetta, Y.; Kougias, I.; Szabó, L.; Kammen, D. Mapping of Affordability Levels for Photovoltaic-Based Electricity Generation in the Solar Belt of Sub-Saharan Africa, East Asia and South Asia. Sci. Rep. 2021, 11, 3226.
10. Hu, L.; Hu, J.; Huang, W. Evolutionary Analysis of the Solar Photovoltaic Products Trade Network in Belt and Road Initiative Countries from an Economic Perspective. Energies 2023, 16, 6371.
11. Nagy, M.; Lemerle, A.; Charbonneau, P. Impact of Nonlinear Surface Inflows into Activity Belts on the Solar Dynamo. J. Space Weather Space Clim. 2020, 10, 62.
12. Sharadga, H.; Hajimirza, S.; Balog, R.S. Time Series Forecasting of Solar Power Generation for Large-Scale Photovoltaic Plants. Renew. Energy 2020, 150, 797–807.
13. Mellit, A.; Kalogirou, S. Artificial Intelligence and Internet of Things to Improve Efficacy of Diagnosis and Remote Sensing of Solar Photovoltaic Systems: Challenges, Recommendations and Future Directions. Renew. Sustain. Energy Rev. 2021, 143, 110889.
14. Jannah, N.; Gunawan, T.S.; Yusoff, S.H.; Hanifah, M.S.A.; Sapihie, S.N.M. Recent Advances and Future Challenges of Solar Power Generation Forecasting. IEEE Access 2024, 12, 168904–168924.
15. Strubell, E.; Ganesh, A.; McCallum, A. Energy and Policy Considerations for Modern Deep Learning Research. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 13693–13696.
16. Elmousalami, H.H.; Darwish, A.; Hassanien, A.E. The Truth About 5G and COVID-19: Basics, Analysis, and Opportunities. In Digital Transformation and Emerging Technologies for Fighting COVID-19 Pandemic: Innovative Approaches; Hassanien, A.E., Darwish, A., Eds.; Studies in Systems, Decision and Control; Springer International Publishing: Cham, Switzerland, 2021; Volume 322, pp. 249–259.
17. Harrou, F.; Zeroual, A.; Hittawe, M.M.; Sun, Y. (Eds.) Chapter 2—Road traffic modeling. In Road Traffic Modeling and Management; Elsevier: Amsterdam, The Netherlands, 2022; pp. 15–63.
18. Tahri, O.; Usman, M.; Demonceaux, C.; Fofi, D.; Hittawe, M. Fast Earth Mover’s Distance Computation for Catadioptric Image Sequences. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 2485–2489.
19. Elmousalami, H.; Sakr, I. Artificial Intelligence for Drilling Lost Circulation: A Systematic Literature Review. Geoenergy Sci. Eng. 2024, 239, 212837.
20. Elmousalami, H.; Maxy, M.; Hui, F.K.P.; Aye, L. AI in Automated Sustainable Construction Engineering Management. Autom. Constr. 2025, 175, 106202.
21. Binns, R. Fairness in Machine Learning: Lessons from Political Philosophy. In Proceedings of the Conference on Fairness, Accountability and Transparency, New York, NY, USA, 23–24 February 2018; pp. 149–159.
22. Elmousalami, H.; Elshaboury, N.; Elyamany, A.H. Green Artificial Intelligence for Cost-Duration Variance Prediction (CDVP) for Irrigation Canals Rehabilitation Projects. Expert Syst. Appl. 2024, 249, 123789.
23. Elmousalami, H.H. Closure to “Artificial Intelligence and Parametric Construction Cost Estimate Modeling: State-of-the-Art Review” by Haytham H. Elmousalami. J. Constr. Eng. Manag. 2021, 147, 07021002.
24. Zhang, Y.; Huang, T.; Zhao, Y.; Jia, Z.; Li, Y.; Xu, G. Solar Array Power Prediction of Long Endurance Stratospheric Aerostat Using a Hybrid Model Based on Blur Informer. Sol. Energy 2025, 287, 113121.
25. Kannan, N.; Vakeesan, D. Solar Energy for Future World: A Review. Renew. Sustain. Energy Rev. 2016, 62, 1092–1105.
26. Kabir, E.; Kumar, P.; Kumar, S.; Adelodun, A.A.; Kim, K.-H. Solar Energy: Potential and Future Prospects. Renew. Sustain. Energy Rev. 2018, 82, 894–900.
27. Yüzer, E.Ö.; Bozkurt, A. Solar Irradiance Prediction and Methods Used in Prediction Studies. In Interdisciplinary Studies on Contemporary Research Pratices in Engineering in the 21st Century-III; Özgür Publications: Istanbul, Turkey, 2023; p. 215.
28. Wen, X.; Shen, Q.; Zheng, W.; Zhang, H. AI-Driven Solar Energy Generation and Smart Grid Integration a Holistic Approach to Enhancing Renewable Energy Efficiency. Int. J. Innov. Res. Eng. Manag. 2024, 11, 55–66.
29. Benti, N.E.; Chaka, M.D.; Semie, A.G. Forecasting Renewable Energy Generation with Machine Learning and Deep Learning: Current Advances and Future Prospects. Sustainability 2023, 15, 7087.
30. Yin, Q.; Han, C.; Li, A.; Liu, X.; Liu, Y. A Review of Research on Building Energy Consumption Prediction Models Based on Artificial Neural Networks. Sustainability 2024, 16, 7805.
31. Velasquez, J.D.; Cadavid, L.; Franco, C.J. Intelligence Techniques in Sustainable Energy: Analysis of a Decade of Advances. Energies 2023, 16, 6974.
32. Alnaser, A.A.; Elmousalami, H. Benefits and Challenges of AI-Based Digital Twin Integration in the Saudi Arabian Construction Industry: A Correspondence Analysis (CA) Approach. Appl. Sci. 2025, 15, 4675.
33. Rajasundrapandiyanleebanon, T.; Kumaresan, K.; Murugan, S.; Subathra, M.S.P.; Sivakumar, M. Solar Energy Forecasting Using Machine Learning and Deep Learning Techniques. Arch. Comput. Methods Eng. 2023, 30, 3059–3079.
34. Ma’arif, A.; Firdaus, A.A.; Suwarno, I. Capability of Hybrid Long Short-Term Memory in Stock Price Prediction: A Comprehensive Literature Review. Int. J. Robot. Control. Syst. 2024, 4, 1382–1402.
35. Zhu, X.; Zou, F.; Li, S. Enhancing Air Quality Prediction with an Adaptive PSO-Optimized CNN-Bi-LSTM Model. Appl. Sci. 2024, 14, 5787.
36. Khan, M.M.; Hasan, M.M.; Abrar, N.; Habib, M.A. A Data-Driven Approach for Multivariate Solar Radiation Forecasting Using Frequency and Temporal Abstractions for Bangladesh. Available at SSRN 4843616. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4843616 (accessed on 3 August 2025).
37. Puri, V.; Jha, S.; Kumar, R.; Priyadarshini, I.; Abdel-Basset, M.; Elhoseny, M.; Long, H.V. A Hybrid Artificial Intelligence and Internet of Things Model for Generation of Renewable Resource of Energy. IEEE Access 2019, 7, 111181–111191.
38. Nath, D.C.; Kundu, I.; Sharma, A.; Shivhare, P.; Afzal, A.; Soudagar, M.E.M.; Park, S.G. Internet of Things Integrated with Solar Energy Applications: A State-of-the-Art Review. Environ. Dev. Sustain. 2024, 26, 24597–24652.
39. Elmousalami, H.H. Comparison of Artificial Intelligence Techniques for Project Conceptual Cost Prediction: A Case Study and Comparative Analysis. IEEE Trans. Eng. Manag. 2020, 68, 183–196.
40. Elmousalami, H.H. Artificial Intelligence and Parametric Construction Cost Estimate Modeling: State-of-the-Art Review. J. Constr. Eng. Manag. 2020, 146, 03119008.
41. Elmousalami, H.; Hui, F.K.P.; Aye, L. Electroencephalography (EEG) for Psychological Hazards and Mental Health in Construction Safety Automation: Algorithmic Systematic Review (ASR). Autom. Constr. 2025, 177, 106346.
42. Elmousalami, H.; Alnaser, A.A.; Hui, F.K.P. Sustainable AI-Driven Wind Energy Forecasting: Advancing Zero-Carbon Cities and Environmental Computation. Artif. Intell. Rev. 2025, 58, 191.
43. Elmousalami, H.H. Intelligent Methodology for Project Conceptual Cost Prediction. Heliyon 2019, 5, e01625.
44. Sherstinsky, A. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
45. Malhotra, P.; Vig, L.; Shroff, G.; Agarwal, P. Long Short Term Memory Networks for Anomaly Detection in Time Series. In Proceedings of the ESANN 2015, Bruges, Belgium, 22–24 April 2015; Volume 2015, p. 89.
46. Elmousalami, H.; Elmesalami, H.H.; Maxi, M.; Farid, A.A.M.; Elshaboury, N.A.T. A Comprehensive Evaluation of Machine Learning and Deep Learning Algorithms for Wind Speed and Power Prediction. Decis. Anal. J. 2024, 13, 100527.
47. Zha, W.; Liu, Y.; Wan, Y.; Luo, R.; Li, D.; Yang, S.; Xu, Y. Forecasting Monthly Gas Field Production Based on the CNN-LSTM Model. Energy 2022, 260, 124889.
48. Wang, J.; Yu, L.-C.; Lai, K.R.; Zhang, X. Dimensional Sentiment Analysis Using a Regional CNN-LSTM Model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Berlin, Germany, 7–12 August 2016; pp. 225–230.
49. AlMallahi, M.N.; Al Swailmeen, Y.; Abdelkareem, M.A.; Olabi, A.G.; Elgendi, M. A Path to Sustainable Development Goals: A Case Study on the Thirteen Largest Photovoltaic Power Plants. Energy Convers. Manag. X 2024, 22, 100553.
50. Nabil Abdel Sadek El Sebai, M. Impact of Solar Energy on Urban Design (Case Study, Benban’s PV Solar Park). Eng. Res. J. 2023, 180, 161–188.
51. Elfeky, K.E.; Wang, Q. Techno-Economic Assessment and Optimization of the Performance of Solar Power Tower Plant in Egypt’s Climate Conditions. Energy Convers. Manag. 2023, 280, 116829.
52. Shahriari, B.; Swersky, K.; Wang, Z.; Adams, R.P.; De Freitas, N. Taking the Human out of the Loop: A Review of Bayesian Optimization. Proc. IEEE 2015, 104, 148–175.
53. Frazier, P.I. Bayesian Optimization. In Recent Advances in Optimization and Modeling of Contemporary Problems; Gel, E., Ntaimo, L., Shier, D., Greenberg, H.J., Eds.; INFORMS: Catonsville, MD, USA, 2018; pp. 255–278.
54. Akhter, M.N.; Mekhilef, S.; Mokhlis, H.; Almohaimeed, Z.M.; Muhammad, M.A.; Khairuddin, A.S.M.; Akram, R.; Hussain, M.M. An Hour-Ahead PV Power Forecasting Method Based on an RNN-LSTM Model for Three Different PV Plants. Energies 2022, 15, 2243.
55. Sarmas, E.; Dimitropoulos, N.; Marinakis, V.; Mylona, Z.; Doukas, H. Transfer Learning Strategies for Solar Power Forecasting Under Data Scarcity. Sci. Rep. 2022, 12, 14643.
56. Obaideen, K.; AlMallahi, M.N.; Alami, A.H.; Ramadan, M.; Abdelkareem, M.A.; Shehata, N.; Olabi, A.G. On the contribution of solar energy to sustainable developments goals: Case study on Mohammed bin Rashid Al Maktoum Solar Park. Int. J. Thermofluids 2021, 12, 100123.
57. Ali, A.H.; El Rifaee, M.; Abdulai, S.F.; Elmousalami, H.H. A Holistic Model for Assessing Key Success Factors in Mitigating Challenges to Modular Integrated Construction. Int. J. Constr. Manag. 2025, 1–21.

Figure 1. Global solar belt highlighting regions with high solar irradiance, including the Middle East and North Africa (MENA) region, which offers optimal conditions for large-scale solar energy development [9,11].

Figure 2. Timescales and corresponding applications of solar power forecasting, illustrating the role of short-, medium-, and long-term predictions in supporting real-time grid operations, energy market planning, and strategic energy infrastructure development.

Figure 3. (a) Bagging ensemble method. (b) Random Forest (RF) architecture as an application of bagging. (c) Boosting technique, highlighting differences in data sampling and model training strategies for ensemble learning in solar forecasting applications.

Figure 4. Architectural overview of recurrent neural network (RNN) and Long Short-Term Memory (LSTM) units, illustrating the internal gating mechanisms used to capture sequential dependencies in time-series data for solar power forecasting [46].

Figure 5. Architecture of the hybrid Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model, illustrating the integration of spatial feature extraction and temporal sequence learning for enhanced solar power forecasting accuracy.

Figure 6. Flowchart of the research methodology, detailing the data acquisition, preprocessing, model development, and evaluation stages, along with the combined forecast algorithm used to enhance predictive performance in solar power forecasting.

Figure 7. Geographical locations of Benban Solar Park in Egypt and Sakaka Solar Power Plant in the Kingdom of Saudi Arabia (KSA), two major utility-scale solar energy facilities situated within the global solar belt.

Figure 8. Actual 24-h solar power generation curves for Benban Solar Park (Egypt) and Sakaka Solar Power Plant (Saudi Arabia), illustrating typical daily output patterns influenced by diurnal solar irradiance cycles.

Figure 9. Estimated CO₂ emissions per computational hour for each forecasting model applied to Benban Solar Park, Egypt, highlighting the trade-off between model accuracy and environmental impact in the context of sustainable AI.

Figure 10. Forecasting performance of the CNN-LSTM model for Benban Solar Park, Egypt, showing a close alignment between actual and predicted solar power outputs over time, demonstrating the model’s accuracy and temporal learning capability.

Figure 11. Scatter plot showing the performance of the CNN-LSTM model for Benban Solar Park, Egypt, with predicted versus actual solar power outputs closely aligned along the diagonal, indicating high forecasting accuracy (R² = 0.9969).

Figure 12. Estimated CO₂ emissions per computational hour for each forecasting model used at Sakaka Solar Power Plant, Saudi Arabia, emphasizing the balance between predictive performance and computational sustainability.

Figure 13. Forecasting performance of the CNN-LSTM model for Sakaka Solar Power Plant, Saudi Arabia, illustrating the alignment between actual and predicted solar power outputs over 100 h, confirming the model’s robustness and predictive accuracy.

Figure 14. Scatter plot illustrating the performance of the CNN-LSTM model for Sakaka Solar Power Plant, Saudi Arabia, showing strong alignment between predicted and actual solar power outputs, with a high coefficient of determination (R² = 0.9828), indicating excellent predictive accuracy.

Figure 15. LIME-based feature importance analysis for the CNN-LSTM model at Benban Solar Park, Egypt, indicating the dominant role of meteorological inputs in solar power forecasting accuracy.

Figure 16. LIME-derived feature importance values for the CNN-LSTM forecasting model at Sakaka Solar Power Plant, Saudi Arabia, illustrating the dominance of real-time weather parameters in shaping prediction outcomes.

Figure 17. Urban solar energy supply decarbonization framework illustrating the role of the Sustainable AI-Driven Solar Power Forecasting (SAI-SPF) system in integrating real-time data, forecasting, and greenhouse gas monitoring to support smart city energy management and carbon neutrality goals.

Figure 18. Identified research limitations of the SAI-SPF framework, highlighting challenges related to data availability, computational requirements, integration with existing grid infrastructure, and sensitivity to extreme weather conditions.

Figure 19. Future research directions for the SAI-SPF framework, outlining pathways for enhancing model scalability, integrating additional data sources, incorporating advanced AI techniques, and expanding applicability to diverse geographic and socio-economic contexts.

Table 1. Benban Solar Park and Sakaka Solar Power Plant.

Parameter	Benban Solar Park, Egypt	Sakaka Solar Power Plant, Saudi Arabia
Location	Benban, Aswan Governorate, Egypt	Al Jouf, Saudi Arabia
Commissioning Year	2019	2021
Total Capacity	1650 megawatts (MW)	300 MW
Solar Irradiance	Approximately 2300 kWh/m²/year	Approximately 2200 kWh/m²/year
Area Covered	Approximately 37.2 square kilometers (km²)	6 km²
Number of Solar Panels	Approximately 7.2 million panels (assuming an average of 330 watts per panel)	Over 1.2 million panels
Annual Energy Generation	Approximately 3.8 terawatt-hours (TWh)	Approximately 0.94 TWh
Average Daily Energy Generation	Approximately 10.4 gigawatt-hours (GWh) (3.8 TWh/365 days)	Approximately 2.6 GWh (0.94 TWh/365 days)
Average Power Output	Approximately 433 megawatts (MW) (capacity factor of 26%)	Approximately 107 MW (capacity factor of 26.4%)
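The derived rows of Table 1 follow from two ratios: average power is annual energy divided by the hours in a year, and capacity factor is average power divided by installed capacity. The short sketch below reproduces that arithmetic from the tabulated energy and capacity values; the function names are illustrative. With these inputs the Benban figures are recovered (about 434 MW and 26%), whereas the same ratio for Sakaka comes out near 36%, so the 26.4% entry presumably rests on inputs that are not listed in the table.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def average_power_mw(annual_energy_twh: float) -> float:
    """Average power in MW: convert TWh to MWh, then divide by hours per year."""
    return annual_energy_twh * 1_000_000 / HOURS_PER_YEAR


def capacity_factor(annual_energy_twh: float, capacity_mw: float) -> float:
    """Ratio of average power to installed (nameplate) capacity."""
    return average_power_mw(annual_energy_twh) / capacity_mw


# Annual energy (TWh) and capacity (MW) as listed in Table 1.
benban_avg = average_power_mw(3.8)
benban_cf = capacity_factor(3.8, 1650)
sakaka_avg = average_power_mw(0.94)
sakaka_cf = capacity_factor(0.94, 300)
```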

Table 2. The input parameters of solar power forecasting models.

Input Parameter	Unit	Benban Minimum	Benban Maximum	Sakaka Minimum	Sakaka Maximum
Solar Irradiance	W/m²	0	1100	0	1300
Ambient Temperature	°C	5	45	5	51
Wind Speed	m/s	0	15	0	18
Humidity	%	10	50	9	57
Time of Day	hours	0	24	0	24
Month of the Year	month	January	December	January	December
Solar Zenith Angle	degrees	0	75	0	70
Panel Tilt Angle	degrees	0	80	0	83
Panel Orientation	degrees	0	300	0	315
Panel Efficiency	%	15	22	18	24

Table 3. Optimized hyperparameters—Benban Solar Park.

Model Notation	AI Model	Optimized Hyperparameters
RF	Random Forest	n_estimators = 150, max_depth = 12, min_samples_split = 4, max_features = 'sqrt'
SVM	Support Vector Machines	kernel = 'rbf', C = 10, gamma = 0.01
GBM	Gradient Boosting Machines	n_estimators = 120, learning_rate = 0.05, max_depth = 5, subsample = 0.8
LSTM	Long Short-Term Memory	layers = 2, units = 64, dropout = 0.2, optimizer = 'adam', epochs = 100, batch_size = 32
CNNs	Convolutional Neural Networks	filters = 64, kernel_size = 3, activation = 'relu', pooling = 'max', epochs = 80, batch_size = 64
CNN-LSTM	Hybrid CNN-LSTM Model	CNN_filters = 32, LSTM_units = 50, dropout = 0.3, optimizer = 'adam', epochs = 120, batch_size = 32
ARIMA	Autoregressive Integrated Moving Average	p = 2, d = 1, q = 2

Table 4. Optimized hyperparameters—Sakaka Solar Power Plant.

Model Notation	AI Model	Optimized Hyperparameters
RF	Random Forest	n_estimators = 140, max_depth = 10, min_samples_split = 3, max_features = 'log2'
SVM	Support Vector Machines	kernel = 'poly', C = 5, gamma = 0.001
GBM	Gradient Boosting Machines	n_estimators = 130, learning_rate = 0.07, max_depth = 4, subsample = 0.9
LSTM	Long Short-Term Memory	layers = 2, units = 50, dropout = 0.25, optimizer = 'adam', epochs = 120, batch_size = 16
CNNs	Convolutional Neural Networks	filters = 32, kernel_size = 5, activation = 'relu', pooling = 'average', epochs = 90, batch_size = 64
CNN-LSTM	Hybrid CNN-LSTM Model	CNN_filters = 64, LSTM_units = 64, dropout = 0.2, optimizer = 'adam', epochs = 150, batch_size = 32
ARIMA	Autoregressive Integrated Moving Average	p = 1, d = 1, q = 1

Table 5. Accuracy of the developed models for one-hour-ahead solar power forecasting at Benban Solar Park, Egypt.

Model Notation	AI Model	MAPE (%)	RMSE	MAE	R²
RF	Random Forest	4.47	332.8	516	0.95
SVM	Support Vector Machines	8.42	416.4	789.6	0.88
GBM	Gradient Boosting Machines	6.88	724	502.4	0.87
LSTM	Long Short-Term Memory Networks	3.44	840	936	0.84
CNNs	Convolutional Neural Networks	4.26	336	528	0.85
CNN-LSTM	Hybrid CNN-LSTM Model	2.04	184	252	0.99
ARIMA	Autoregressive Integrated Moving Average	11.77	300	464	0.91

Table 6. Accuracy of the developed models for one-hour-ahead solar power forecasting at Sakaka Solar Power Plant, Saudi Arabia.

Model Notation	AI Model	MAPE (%)	RMSE	MAE	R²
RF	Random Forest	4.5	335	520	0.94
SVM	Support Vector Machines	8.4	420	790	0.87
GBM	Gradient Boosting Machines	6.9	730	505	0.86
LSTM	Long Short-Term Memory Networks	3.4	850	940	0.83
CNNs	Convolutional Neural Networks	4.3	340	530	0.84
CNN-LSTM	Hybrid CNN-LSTM Model	2.0	190	255	0.98
ARIMA	Autoregressive Integrated Moving Average	11.8	310	470	0.90
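For readers who wish to recompute the metrics reported in Tables 5 and 6, the sketch below gives their standard textbook definitions in plain Python. It is not the authors' evaluation code: the sample values are hypothetical, and the handling of zero actual values in MAPE (skipped here) is an assumption, since night-time output would otherwise cause division by zero.

```python
import math
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in %, skipping zero actual values."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical one-hour-ahead samples.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 380.0]

scores = {
    "MAPE": mape(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
    "R2": r_squared(actual, predicted),
}
```

When both are computed on the same set of errors, MAE never exceeds RMSE, which offers a quick consistency check when comparing the two columns.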

Table 7. SDGs and sustainable AI for solar power forecasting.

SDGs	Contributions
SDG 7: Affordable and Clean Energy	The SAI-SPF system plays a vital role in enhancing the integration of solar energy, providing an efficient and reliable source of clean energy. Utilizing advanced forecasting tools and energy management techniques ensures that solar energy is harnessed optimally, making it accessible even in regions with fluctuating sunlight. This system helps to reduce dependence on fossil fuels, lowering energy costs while supporting the transition to sustainable energy. The system facilitates a decentralized energy grid in urban and smart city applications, promoting local energy resilience. Ultimately, it contributes to achieving universal access to affordable and clean energy, a fundamental goal of SDG 7.
SDG 9: Industry, Innovation and Infrastructure	Leveraging cutting-edge AI and IoT technologies, the SAI-SPF system promotes the growth of sustainable industrial infrastructure and innovation. Intelligent energy management optimizes the operation of energy-intensive industries, reducing energy waste and improving overall efficiency. The framework encourages the development of smart energy systems that can dynamically adjust to demand and supply fluctuations, driving the smart grid revolution forward. By improving energy infrastructure, the SAI-SPF system fosters an ecosystem of innovation that supports sustainable industrial practices. This innovation enhances industrial competitiveness and contributes to achieving sustainable growth in line with SDG 9.
SDG 11: Sustainable Cities and Communities	The SAI-SPF system is a catalyst for sustainable urban development by enabling smart energy management and promoting decarbonization efforts. Optimizing energy use in urban environments, it helps to reduce carbon emissions and facilitates the creation of low-carbon, energy-efficient buildings and infrastructures. The system supports urban resilience by ensuring that energy systems can adapt to changing demand and climate conditions, promoting a sustainable and livable urban future. Additionally, by reducing energy consumption, the framework enhances overall sustainability, making cities more inclusive and sustainable. This contributes directly to SDG 11, which focuses on making cities resilient, safe, and sustainable.
SDG 13: Climate Action	The SAI-SPF system’s real-time monitoring and forecasting capabilities align directly with global climate action goals by enabling proactive measures to reduce greenhouse gas emissions by optimizing energy production from renewable sources. Additionally, the system’s ability to forecast energy demand and supply allows for more efficient energy use, preventing unnecessary waste and reducing the environmental footprint. The system contributes to achieving climate action goals by offering a scalable solution for climate change mitigation.
SDG 17: Partnerships for the Goals	The SAI-SPF system fosters collaboration among various stakeholders, including energy providers, technology developers, urban planners, and policymakers, to accelerate sustainable development goals. It creates a platform for multi-sector partnerships that leverage shared expertise and resources to address complex urban energy challenges. By aligning the interests of diverse stakeholders, it strengthens the capacity of cities and industries to implement innovative energy solutions. This collaboration is essential for scaling up the adoption of sustainable energy systems and ensuring long-term success. Through such partnerships, the framework plays a crucial role in advancing SDG 17, emphasizing the importance of partnerships in achieving global goals.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

