Time Series Analysis and Forecasting of Solar Generation in Spain Using eXtreme Gradient Boosting: A Machine Learning Approach

Abstract: The rapid expansion of solar photovoltaic (PV) generation has established its pivotal role in the shift toward sustainable energy systems. This study conducts an in-depth analysis of solar generation data from 2015 to 2018 in Spain, with a specific emphasis on temporal patterns, excluding weather data. Employing the eXtreme gradient boosting (XGBoost) algorithm for modeling and forecasting, our research underscores its efficacy in capturing solar generation trends, as evidenced by a root mean squared error (RMSE) of 11.042, a mean absolute error (MAE) of 5.621, an R-squared (R²) of 0.999, and a mean absolute percentage error (MAPE) of 0.046. These insights hold substantial implications for grid management, energy planning, and policy development, reaffirming solar energy's promise as a dependable and sustainable contributor to the electrical power system's evolution. This research contributes to the growing body of knowledge aimed at optimizing renewable energy integration and enhancing energy sustainability for future generations.


Introduction
The rapid decline in the cost of renewable energies, as highlighted by IRENA [1], is driving a global transition towards more sustainable options, with photovoltaic energy expected to contribute up to 40% of the world's energy supply by 2040 [2]. However, this shift presents a significant challenge due to the inherent volatility of renewable energy sources (RES) caused by climate fluctuations, creating substantial barriers for electricity companies. As the proportion of RES in energy production increases, the risk of temporary blackouts [3] and a reduction in energy quality also rises [4]. This volatility is primarily attributed to RES, such as solar photovoltaic (PV) generation, which are intermittent and susceptible to weather conditions. Effectively managing this variability is crucial for maintaining a reliable energy supply.
The expansion of RES in electricity production introduces a new set of challenges [5]. Accurate short-term predictions are crucial for optimal energy management, encompassing storage, sale, and distribution, while forecasting errors can lead to significant profit losses [6]. The need to adapt energy production to current demand, already in practice, becomes more complex as wind and solar power gain prominence [7,8]. Developing methods for forecasting electricity production by these sources, contingent on weather conditions, and analyzing production capacity at different intervals are essential [9]. Continuous advancements in technology aim to enhance predictive accuracy, ensuring sustainability and reliability in grid operations amid the evolving landscape of renewable energy.
Energies 2023, 16, 7618
Renewable energy, particularly solar PV, will become a significant source of energy in the future. To ensure safety, reliability, and profitability as its proportion in the electrical energy supply grows, the accurate prediction of photovoltaic panel power generation is crucial. Solar energy's unpredictable nature poses challenges such as voltage fluctuations, power factor issues, and stability. Solar energy's ascendancy underscores the importance of a thorough understanding of its temporal and spatial behavior, particularly in the absence of weather data. While previous studies have often emphasized the interplay between solar generation and weather conditions, this research uniquely focuses on analyzing solar generation patterns independently of weather variables. It delves into the inherent capabilities and challenges of solar energy as a standalone contributor to the electricity power grid.
The purpose of this research is to unveil the temporal patterns of solar generation in Spain from 2015 to 2018. Through analysis and predictive modeling employing the XGBoost algorithm, we scrutinize diurnal variations, seasonal trends, and geographic disparities in solar generation. The following metrics are used to assess the outcome: mean absolute error (MAE), mean absolute percentage error (MAPE), root mean squared error (RMSE), and coefficient of determination (R²). The research's primary objective is to provide insights into the self-reliance and potential reliability of solar energy within the electrical power system. Furthermore, it explores the implications of these patterns for grid management, energy planning, and policy formulation in Spain and beyond.
As solar energy continues to gain prominence on the global energy landscape, understanding its nuances and capabilities, especially without the crutch of weather data, holds profound implications for harnessing its full potential. This research endeavors to contribute to this essential discourse, offering a comprehensive analysis that illuminates the contribution of solar energy in defining the prospects of sustainable energy systems. The remainder of this paper is structured as follows: Section 2 offers a review of previous research on this topic; Section 3 details the research methodology utilizing XGBoost; Section 4 presents and analyzes the results of the research; and Section 5 summarizes the paper by discussing future recommendations.

Related Work
This section reviews the most recent research literature within the area of study, highlighting the extensive range of solar power models and techniques that have been proposed. These models encompass various mathematical functions, both linear and nonlinear, applied across diverse contexts, including projects in Saudi Arabia [9], Malaysia [10], Brazil [11], Israel [12], Australia [13,14], Turkey [15], India [16], the United States [17], Scotland [18], South Korea [19], Nigeria [20], Italy [21], and Algeria [22]. Moreover, non-linear functions have been employed for daily diffuse solar energy radiation calculations [23], irradiation simulations [24], and unrestricted methods [25]. In [26,27], the focus was on forecasting the solar power radiation output, while our method focuses on predicting solar generation output, providing a more accessible perspective for end-users and emphasizing the return on investment of solar installations. Fuzzy logic techniques have found utility in short-term energy forecasting [28]. Additionally, genetic algorithms have played a role in achieving pump self-sustainability [29], while artificial neural networks (ANNs) have become increasingly prevalent. Table 1 presents the related work concerning forecasting solar PV power generation using machine learning algorithms, including the parameters employed and the output metrics.
The significance of these advancements in solar energy modeling becomes evident when assessing their performance against traditional statistical methods [38]. For instance, the LSTM model has consistently demonstrated superior solar forecast accuracy compared to ARIMA when the data are free of noise [39]. The application of deep learning techniques, such as ANNs, has yielded impressive outcomes in regression and classification challenges thanks to their automatic parameter tuning through supervised learning algorithms [40]. Research has closely examined artificial intelligence methods in renewable energy forecasting and optimization [41], particularly for modeling and simulating solar energy systems [42]. Furthermore, artificial neural networks have been pivotal in predicting environmental variables and estimating unconventional energy systems [43]. In the realm of solar radiation applications, MLPs, empowered decision trees combined with linear regression, and other approaches have been evaluated [44], while LSTM models process large datasets and exhibit adaptability to unknown data, making them preferable to SVM-based models due to their superior results [45].
Numerous prior studies [11,46–48] have conventionally focused on a multitude of meteorological parameters. These encompass, but are not limited to, irradiance, temperature, humidity, air pressure, wind speed, wind direction, precipitation, dust deposition, and cloud cover. In contrast, the distinctiveness of our research lies in its deliberate omission of meteorological data, relying solely on historical generation data spanning the years 2015 to 2018, sourced exclusively from Spain. By using only historical data, our study avoids the inherent uncertainties associated with weather forecasting.

Dataset and Preprocessing
The foundation of this research rests upon a meticulously collected dataset of hourly solar generation spanning the years 2015 to 2018 in Spain. The data were sourced from reliable repositories and encompass records of solar energy production across various regions of the country. To ensure data integrity and reliability, a rigorous preprocessing phase was undertaken. This entailed data cleaning to address missing values and outliers, data transformation to handle datetime formats, and data consistency checks. The Python programming language was instrumental in this phase, facilitating data manipulation and validation.
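The cleaning steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses only the Python standard library, the function name `clean_hourly_series` and the sample records are hypothetical, linear interpolation is an assumed gap-filling strategy, and outlier handling is left out.

```python
from datetime import datetime, timedelta

def clean_hourly_series(records):
    """Parse, de-duplicate, and gap-fill hourly (timestamp, value) records.

    `records` is a list of (ISO-8601 string, float or None) pairs.
    Missing hours and None values between two known readings are filled
    by linear interpolation; leading or trailing None values are left as-is.
    """
    # Parse timestamps; a later record for the same hour overwrites an earlier one.
    parsed = {}
    for stamp, value in records:
        parsed[datetime.fromisoformat(stamp)] = value

    # Build the complete hourly index between the first and last timestamp.
    start, end = min(parsed), max(parsed)
    hours = []
    t = start
    while t <= end:
        hours.append(t)
        t += timedelta(hours=1)
    values = [parsed.get(t) for t in hours]

    # Interpolate linearly between each pair of neighbouring known values.
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            values[i] = values[left] + frac * (values[right] - values[left])
    return list(zip(hours, values))

# Made-up sample: 11:00 is absent and 13:00 has no reading.
records = [
    ("2015-01-01 10:00:00", 100.0),
    ("2015-01-01 12:00:00", 300.0),
    ("2015-01-01 13:00:00", None),
    ("2015-01-01 14:00:00", 500.0),
]
cleaned = clean_hourly_series(records)
```

In practice the same steps would more likely be done with Pandas; the dependency-free form is used here only to keep the logic visible.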

Training and Testing Data
To ensure robust model training and evaluation, we implemented a chronological split of the data to simulate real-world forecasting scenarios. The dataset was partitioned, allocating 80% for the training stage, where the machine learning model learned patterns from historical data. The remaining 20% was reserved for the testing stage. The training set encompasses data from the years 2015 to 2017, allowing the model to learn from historical patterns and trends. Subsequently, the testing set comprises data from the year 2018, representing unseen future data for the model to predict. This approach ensures a rigorous evaluation of the model's generalization capabilities to predict solar energy generation beyond the training period.
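A chronological split of this kind can be sketched as follows. The helper name `chronological_split` and the toy readings are hypothetical; the only point illustrated is that every test observation is later than every training observation, with the cutoff set at the start of 2018 as described above.

```python
from datetime import datetime

def chronological_split(rows, test_start):
    """Split (timestamp, value) rows so all test rows follow all training rows."""
    rows = sorted(rows, key=lambda r: r[0])
    train = [r for r in rows if r[0] < test_start]
    test = [r for r in rows if r[0] >= test_start]
    return train, test

# Toy data: one made-up reading per year.
rows = [(datetime(y, 6, 1, 12), 1000.0 + y) for y in (2015, 2016, 2017, 2018)]
train, test = chronological_split(rows, test_start=datetime(2018, 1, 1))
```

Unlike a random shuffle, this keeps future information out of the training set, which is what makes the test error a fair proxy for forecasting performance.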

Exploratory Data Analysis (EDA)
The initial insights into the temporal and spatial dynamics of solar generation were unearthed through an extensive EDA. Descriptive statistics, time series decomposition, and visualization techniques such as line plots, box plots, and heatmaps were employed. The EDA phase provided valuable context by revealing diurnal patterns, seasonal variations, and geographic disparities in solar generation. Python's data visualization libraries, including Matplotlib and Seaborn, played a pivotal role in visualizing and interpreting these patterns.
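The aggregations behind diurnal and seasonal views can be sketched in a few lines. This is an illustrative stand-in for the Pandas group-by operations such an analysis would normally use; the helper `mean_by` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def mean_by(rows, key):
    """Average the value of (timestamp, value) rows grouped by key(timestamp)."""
    groups = defaultdict(list)
    for stamp, value in rows:
        groups[key(stamp)].append(value)
    return {k: mean(v) for k, v in sorted(groups.items())}

# Made-up readings: midnight and noon on one winter day and one summer day.
rows = [
    (datetime(2017, 1, 10, 0), 0.0), (datetime(2017, 1, 10, 12), 2000.0),
    (datetime(2017, 7, 10, 0), 0.0), (datetime(2017, 7, 10, 12), 5000.0),
]
hourly_profile = mean_by(rows, key=lambda t: t.hour)    # diurnal pattern
monthly_profile = mean_by(rows, key=lambda t: t.month)  # seasonal pattern
```

Plotting `hourly_profile` as a line and `monthly_profile` as bars gives the kind of diurnal and seasonal summaries the EDA describes.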

Time Series Modeling with XGBoost
Central to our research is the application of the XGBoost algorithm, a state-of-the-art gradient boosting framework renowned for its predictive power. Leveraging Python's XGBoost library, we capitalize on the algorithm's capabilities to unveil the dynamics of solar generation. Employing a supervised learning paradigm, we treat solar generation as our target variable, while the engineered features serve as predictors.
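The paper does not list the engineered features, so the following is only a sketch of what weather-free predictors for an hourly series might look like. The calendar fields and the 1-hour and 24-hour lags are assumptions chosen for illustration, and `make_features` is a hypothetical helper.

```python
from datetime import datetime, timedelta

def make_features(series, lags=(1, 24)):
    """Turn an hourly {timestamp: value} mapping into (features, target) rows.

    Rows whose lagged values are unavailable are skipped.
    """
    rows = []
    for stamp, target in sorted(series.items()):
        lagged = [series.get(stamp - timedelta(hours=h)) for h in lags]
        if any(v is None for v in lagged):
            continue
        features = {
            "hour": stamp.hour,
            "dayofweek": stamp.weekday(),
            "month": stamp.month,
            "dayofyear": stamp.timetuple().tm_yday,
        }
        for h, v in zip(lags, lagged):
            features[f"lag_{h}h"] = v
        rows.append((features, target))
    return rows

# Synthetic two-day series whose value is simply the hour index.
start = datetime(2017, 6, 1)
series = {start + timedelta(hours=i): float(i) for i in range(48)}
rows = make_features(series)
```

Each resulting row pairs a feature dictionary with the generation value to be predicted, which is the supervised-learning framing described above.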
XGBoost is an open-source library that efficiently implements the gradient boosting approach, which is based on a greedy function approximation of the gradient [49]. This technique fits several weak forecasting models sequentially, where each model builds on the findings obtained by the prior model, ultimately generating a better model in the end.
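The sequential idea can be made concrete with a toy implementation. This is plain gradient boosting with squared loss and one-split trees (stumps), written for illustration only; it is not XGBoost, which adds regularization, second-order gradients, and many optimizations. All names and the toy diurnal values are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on x that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # last value excluded so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        l_mean, r_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - l_mean) ** 2 for r in left)
               + sum((r - r_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    return best[1:]

def boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Fit an additive model: a constant plus n_rounds shrunken stumps."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, l_mean, r_mean = fit_stump(xs, residuals)
        stumps.append((threshold, l_mean, r_mean))
        preds = [p + learning_rate * (l_mean if x <= threshold else r_mean)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate

def predict(model, x):
    """Sum the constant and every stump's shrunken contribution."""
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (l if x <= t else r) for t, l, r in stumps)

# Toy diurnal curve: made-up generation by hour of day.
hours = list(range(24))
generation = [0, 0, 0, 0, 0, 0, 5, 20, 45, 70, 90, 100,
              100, 90, 70, 45, 20, 5, 0, 0, 0, 0, 0, 0]
model = boost(hours, generation)
```

Each round fits a stump to the errors left by the rounds before it, so the training error of the combined model falls below that of the constant baseline.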
The XGBoost algorithm, a powerful ensemble learning technique, plays a pivotal role in predicting solar energy production in this research. The algorithm's effectiveness lies in its ability to deal with complex, non-linear connections in the data, making it well suited to capturing the dynamic patterns of solar generation. Specifically, XGBoost leverages a collection of decision trees that work collaboratively to refine predictions. Each tree examines different aspects of the data, enabling the model to capture nuanced patterns and dependencies. The algorithm also incorporates regularization techniques, which prevent overfitting by penalizing overly complex models. This balance between model complexity and predictive accuracy is essential in solar generation forecasting, as it ensures that the model generalizes well to unseen data. Furthermore, XGBoost provides feature importance rankings, indicating the significance of different input features and helping to identify the key drivers of solar output. Its ability to handle missing data and its optimization for parallel processing further enhance its utility in the context of this research. In summary, XGBoost's comprehensive capabilities make it a robust tool for accurately predicting solar energy production and understanding the factors influencing it.
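For reference, the regularized objective described in the XGBoost literature can be written as below. Here $l$ is a differentiable loss between the actual value $y_i$ and the prediction $y_{p,i}$, $f_k$ is the $k$-th tree, $T$ is its number of leaves, $w$ is its vector of leaf weights, and $\gamma$ and $\lambda$ are regularization hyperparameters. The values used in this study are not reported, so none are assumed here.

```latex
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, y_{p,i}\right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \lVert w \rVert^{2}
```

The first sum rewards accuracy and the second penalizes trees with many leaves or large leaf weights, which is the balance between complexity and accuracy described above.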
Model development begins with training our XGBoost model on the designated training dataset. Here, the Scikit-Learn interface to XGBoost enables seamless integration into our workflow. To maximize the model's predictive performance, we carry out hyperparameter tuning, a process of adjusting the model's settings so that it extracts as much signal as possible from the data.
As we delve into the inner workings of our model, we place a premium on interpretability. This interpretability not only enriches our understanding but also informs decision-makers and stakeholders about the key drivers of solar energy production.
Our choice of the XGBoost algorithm stems from its robustness, adaptability to time series data, and proven track record in predictive modeling. It equips us with the means to distil complex temporal patterns and seasonality in solar generation, culminating in a model that not only forecasts but also elucidates the interplay of variables within the renewable energy landscape.

Model Evaluation and Validation
The effectiveness of our XGBoost model is assessed through a comprehensive evaluation process, ensuring the reliability of our findings. Recognizing the unique characteristics of time series data, we employ four evaluation metrics that are widely used in time series forecasting: MAPE, RMSE, MAE, and R² [50–52].

1. Root mean squared error (RMSE) gauges the extent of discrepancies between predicted and observed values. A low RMSE value signifies a model that closely tracks the actual solar generation, while higher values reveal areas for improvement. The formula for RMSE is as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_i^{2}}$$
2. Mean absolute error (MAE) provides insights into the average magnitude of errors between predictions and actual data points. It complements RMSE by offering a more intuitive understanding of forecasting accuracy. The formula for MAE is as follows:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert e_i \rvert$$
3. R-squared (R²), often regarded as the coefficient of determination, gives the proportion of variance in the target variable captured by our model. A value of 1.00 signifies a perfect fit, while values closer to 0 indicate diminishing predictive power. The formula for the R² score is as follows:
$$R^{2} = 1 - \frac{\sum_{i=1}^{n} e_i^{2}}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^{2}}$$
4. Mean absolute percentage error (MAPE) allows us to assess the relative magnitude of errors as a proportion of the actual solar generation values. This metric is particularly valuable in understanding the proportional accuracy of our predictions. The formula for MAPE is as follows:
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n} \left\lvert \frac{e_i}{y_i} \right\rvert$$
where $n$ is the total number of measurements, $y_i$ is the actual value for data point $i$, $y_{p,i}$ is the projection made by the model for that point, $e_i = y_i - y_{p,i}$ is the residual, and $\bar{y}$ is the mean of the actual values.
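The four metrics can be computed directly from their standard definitions, as in the sketch below. The function name and sample values are hypothetical, and MAPE is returned as a fraction rather than multiplied by 100, which is an assumption about the convention used.

```python
from math import sqrt

def regression_metrics(actual, predicted):
    """Return RMSE, MAE, R2, and MAPE (as a fraction) for paired sequences."""
    n = len(actual)
    residuals = [y - p for y, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in residuals)                 # residual sum of squares
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)   # total sum of squares
    return {
        "RMSE": sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in residuals) / n,
        "R2": 1 - ss_res / ss_tot,
        # MAPE is undefined when any actual value is zero.
        "MAPE": sum(abs(e / y) for e, y in zip(residuals, actual)) / n,
    }

# Made-up values for illustration.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
metrics = regression_metrics(actual, predicted)
```

Because MAPE divides by the actual value, hours with zero generation would need to be excluded or handled separately before it is applied to a solar series.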
To fortify the robustness of our model and ensure its adaptability across varying temporal segments, we conduct rigorous cross-validation. This iterative process assesses our model's performance on distinct subsets of the data, enhancing its ability to generalize beyond the training dataset. Cross-validation provides an essential layer of validation, bolstering the reliability of our forecasts.
By leveraging this comprehensive suite of evaluation metrics and cross-validation techniques, we construct a well-vetted model capable of not only capturing temporal patterns but also providing a clear and quantifiable assessment of its predictive accuracy. These validation measures fortify the foundations of our research, instilling confidence in our results and conclusions.
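The cross-validation scheme is not specified in detail. One common choice for time series is an expanding-window (forward-chaining) split, similar in spirit to scikit-learn's `TimeSeriesSplit`; the sketch below assumes that scheme, and `expanding_window_splits` is a hypothetical helper.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) with each test fold after its training data."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last fold absorbs any remainder so that no sample is dropped.
        test_end = fold * (k + 1) if k < n_splits else n_samples
        yield list(range(train_end)), list(range(train_end, test_end))

splits = list(expanding_window_splits(n_samples=10, n_splits=4))
```

Each successive fold trains on a longer history and tests on the block that follows it, so no fold is evaluated on data that precede its training window.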

Temporal Analysis
The research further encompasses temporal and spatial analyses of solar generation patterns. Temporal analysis focuses on diurnal, weekly, and seasonal patterns, including daily fluctuations and seasonality effects. Python's libraries for data analysis, such as Pandas and NumPy, were instrumental in performing these analyses.

Temporal Patterns of Solar Generation
The analysis of hourly solar generation data from 2015 to 2018 in Spain has unveiled distinct temporal patterns that shed light on the dynamics of solar energy production. As expected, diurnal variations in solar generation are prominent, with peak generation consistently occurring during daylight hours and a notable decline at night. Figure 1 displays the hourly solar generation patterns for four selected days in the year 2017. Each day is represented by a distinct line with a different color, and the x-axis shows the hours of the day in a 24-h format (e.g., "00:00" to "23:00"). Furthermore, the examination of the data reveals pronounced seasonality, with solar generation consistently higher during sunnier months and experiencing a dip during the winter period. These patterns underscore the strong influence of solar irradiance on generation, offering valuable insights into the inherent predictability of solar energy production.
Within the week of 1 to 7 August 2018 shown in Figure 2, the lowest solar generation was recorded on 3 August 2018. This information gives important context for understanding the daily variations in solar energy output.

XGBoost Modeling and Forecasting
The research employed the XGBoost algorithm, a robust gradient boosting machine learning technique, to model and predict solar generation patterns. Figure 6 provides a visual comparison between the actual and projected values of a machine learning model for solar generation on a weekly basis. The x-axis represents the weeks over the data's time period, and the y-axis indicates the values of the solar generation. Two lines are depicted in the plot, one representing the actual solar generation (marked with circles) and the other representing the predicted values (marked with crosses). This figure enables a direct assessment of how well the model aligns with the actual solar generation trends on a weekly scale, offering insights into the model's accuracy in capturing weekly variations.

The model fitting process demonstrated the model's capacity to capture the temporal dependencies inherent in the dataset. Notably, the model predicted solar generation trends with a high degree of accuracy, as evidenced by its evaluation metrics. Specifically, the model achieved a root mean squared error (RMSE) of 11.042, a mean absolute error (MAE) of 5.86, a near-perfect R-squared (R²) value of 0.999, and a low mean absolute percentage error (MAPE) of 0.0463. These metrics collectively underscore the model's ability to closely align with observed historical data, making it a powerful tool for solar generation forecasting.
Figure 7 presents a graphical representation of the machine learning model's performance metrics, serving as a visual aid for evaluating its effectiveness. The bar chart portrays four metrics: root mean squared error (RMSE), mean absolute error (MAE), R-squared (R²), and mean absolute percentage error (MAPE). The inclusion of numerical values alongside the graphical elements offers precise measurements of the model's accuracy and its efficiency in forecasting solar generation patterns. This figure provides a means of assessing and comparing the model's performance across various evaluation criteria, enhancing the comprehensibility and interpretability of the research results.

Learning Curves
The model's performance was further examined through learning curve analysis, providing a valuable depiction of its behavior. Learning curves serve as a visual representation of the model's training and validation performance relative to the number of data points used for training, aiding in the identification of potential overfitting or underfitting tendencies.
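How such a curve is produced can be sketched with a deliberately simple stand-in model. The sketch below is not the XGBoost setup of this study: it fits a per-hour mean to synthetic data, and all function names and values are hypothetical. It only illustrates the mechanics of training on growing subsets and scoring on both training and validation data.

```python
from math import sqrt, sin, pi

def fit_hourly_mean(rows):
    """Stand-in model: the mean value observed for each hour of day."""
    sums, counts = {}, {}
    for hour, value in rows:
        sums[hour] = sums.get(hour, 0.0) + value
        counts[hour] = counts.get(hour, 0) + 1
    return {h: sums[h] / counts[h] for h in sums}

def rmse(model, rows):
    """RMSE of the hourly-mean model; unseen hours fall back to the overall mean."""
    fallback = sum(model.values()) / len(model)
    return sqrt(sum((v - model.get(h, fallback)) ** 2 for h, v in rows) / len(rows))

def learning_curve(train_rows, val_rows, sizes):
    """Return (size, training RMSE, validation RMSE) for each training size."""
    curve = []
    for size in sizes:
        subset = train_rows[:size]  # earliest `size` observations
        model = fit_hourly_mean(subset)
        curve.append((size, rmse(model, subset), rmse(model, val_rows)))
    return curve

def synthetic(day, hour):
    """Synthetic diurnal curve with a deterministic day-to-day wobble."""
    daylight = max(0.0, sin(pi * (hour - 6) / 12))
    return 100.0 * daylight * (1 + 0.1 * sin(day))

train_rows = [(h, synthetic(d, h)) for d in range(30) for h in range(24)]
val_rows = [(h, synthetic(d, h)) for d in range(30, 40) for h in range(24)]
curve = learning_curve(train_rows, val_rows, sizes=[24, 120, 720])
```

With a single day of training data the model reproduces that day exactly, so the training error is zero while the validation error is not; as more days are added the two errors move toward each other, which is the convergence a learning curve is meant to reveal.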

The learning curves in Figure 8 show the convergence of training and validation scores: as the model encounters more data, its performance on both the training and validation sets reaches a state of stability. The limited divergence between the two curves indicates that the model avoids overfitting the training data and makes accurate predictions on unseen data. These insights align with the earlier discussion of evaluation metrics, further underlining the model's ability to capture solar generation patterns.

Conclusions
In the ever-evolving landscape of renewable energy, our research has examined the temporal patterns of solar generation in Spain from 2015 to 2018. By focusing on solar generation patterns independently of weather data, we have highlighted the inherent predictability and self-reliance of solar energy within the electrical power system. The study revealed diurnal variations with peak generation during daylight hours and seasonal trends characterized by higher output in sunnier months, offering opportunities for efficient grid management.
At the core of our investigation, the XGBoost algorithm played a pivotal role, enabling us to capture and forecast solar generation trends with high precision. The work involved careful model development, enriching our understanding of the factors influencing solar generation and providing a robust predictive tool. Our research's implications extend to the incorporation of solar energy into the electrical power system, facilitating optimized grid operations, efficient energy storage, and informed demand management strategies. By aligning peak solar generation with high-demand periods, we can enhance grid reliability and reduce reliance on fossil-fuel-based peaking plants. As we conclude this exploration of solar generation, it is vital to acknowledge that our research primarily focuses on historical patterns.
The future of solar energy holds considerable potential, shaped by technological advancements, evolving policies, and unforeseen events. Our findings serve as a solid foundation, guiding stakeholders, policymakers, and grid operators in navigating the evolving energy landscape. In summary, this research underscores the enduring promise of solar energy as a reliable and sustainable contributor to the electrical power system. By looking into the temporal dimensions of solar generation, we discovered patterns that indicate the path toward a cleaner and more environmentally friendly energy future. As renewable energy continues to expand, these insights support informed decisions that help harness solar energy's full potential.

Figure 1. Hourly solar generation comparison for selected days in 2015.

Figure 2 is a heatmap that illustrates the hourly solar generation patterns for a specific week in 2018, from 1 August to 7 August. Each row in the heatmap represents a date, and each column represents an hour of the day. The color intensity in the heatmap cells indicates the level of solar generation during that hour, with brighter colors representing higher solar generation. The figure thus describes the fluctuation in solar energy generation across the selected week. Figure 2 shows that the solar energy generation peaks on 7 August 2018, while its lowest points were recorded on 3 August 2018. This information gives important context for understanding the daily variations in solar energy output.

Figure 2. Hourly solar output variations for a week in August 2018.

The graph in Figure 3 illustrates the amount of solar energy generation in 2015-2017 for each of the four seasons: spring, summer, fall, and winter. Each season is presented as a different bar, and each year is distinguished by a distinct color. The figure provides a clear visual comparison of how solar generation varies across seasons in the specified years.

Figure 3. Solar generation patterns across seasons in three consecutive years.


Figure 5 displays the total solar power generation over different time intervals. The first subplot presents the daily total solar generation, the second shows the weekly total, and the third shows the monthly total. The x-axis in each line graph represents time (date), while the y-axis represents the corresponding total solar power generation. The figure gives a concise view of how solar generation fluctuates on a daily, weekly, and monthly basis.
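
The daily, weekly, and monthly totals described above can be sketched as a simple aggregation over an hourly series. This is a minimal illustration under stated assumptions: the helper `total_by`, the key functions, the use of ISO calendar weeks, and the synthetic hourly readings are all introduced here and are not taken from the study.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def total_by(records, key):
    """Sum values of (timestamp, value) pairs grouped by key(timestamp)."""
    totals = defaultdict(float)
    for timestamp, value in records:
        totals[key(timestamp)] += value
    return dict(sorted(totals.items()))


def day_key(timestamp):
    return timestamp.date()


def week_key(timestamp):
    # ISO year and ISO week number; weeks start on Monday (an assumption).
    iso = timestamp.isocalendar()
    return (iso[0], iso[1])


def month_key(timestamp):
    return (timestamp.year, timestamp.month)


# Hypothetical hourly series: 14 days starting Monday 2018-08-06,
# with a constant 1.0 unit of generation per hour.
start = datetime(2018, 8, 6)
hourly = [(start + timedelta(hours=h), 1.0) for h in range(14 * 24)]

daily = total_by(hourly, day_key)
weekly = total_by(hourly, week_key)
monthly = total_by(hourly, month_key)
```

Each resulting dictionary maps a time bucket to its total, which is the quantity plotted against time in the corresponding subplot.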


Figure 5. Solar generation trends over different time intervals.

The weekly plot of actual and predicted solar generation provides a visual comparison between the observed values and the machine learning model's predictions. The x-axis represents the weeks over the data's time period, and the y-axis indicates the solar generation values. Two lines are depicted in the plot, one representing the actual solar generation (marked with circles) and the other representing the predicted values (marked with crosses). This figure enables a direct assessment of how well the model aligns with the actual solar generation trends on a weekly scale, offering insight into the model's accuracy in capturing weekly variations.
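
The agreement between actual and predicted values can also be summarized numerically with the four metrics named in the abstract (RMSE, MAE, R², and MAPE). The sketch below uses the standard textbook definitions; the function name `evaluate`, the sample values, and the choice to report MAPE as a fraction rather than a percentage are assumptions, since the source does not give its exact formulas.

```python
import math


def evaluate(actual, predicted):
    """Return RMSE, MAE, MAPE (as a fraction), and R^2 for two equal-length lists.

    MAPE divides by the actual values, so this sketch assumes none of them
    is zero; night-time hours with zero generation would need to be
    filtered out or handled separately.
    """
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical weekly values, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = evaluate(actual, predicted)
print(metrics)
```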

Figure 7. Comparative model evaluation metrics for solar generation prediction.

The learning curve for the machine learning model employed in solar generation prediction shows how the model's root mean squared error (RMSE) on both the training and validation datasets changes as a function of the number of training examples. The x-axis corresponds to the number of training examples, while the y-axis shows the corresponding RMSE values. The red curve represents the training RMSE, and the green curve represents the validation RMSE. As the number of training examples increases, the training RMSE diminishes, indicating a progressively improved fit to the training data. At the same time, the validation RMSE initially decreases but eventually stabilizes, marking the point at which additional training examples provide only marginal improvements. This figure shows how the model's accuracy evolves with training set size, making it a useful tool for model assessment and optimization. The learning curve analysis further indicates that the model exhibits robust generalization performance.
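
The procedure behind such a learning curve can be sketched as follows: fit the model on progressively larger prefixes of the training data and record the training and validation RMSE at each size. To keep the example self-contained, an ordinary least-squares line stands in for the XGBoost model used in the study; the synthetic data, the training sizes, and all function names are assumptions made for illustration.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def rmse(model, xs, ys):
    """Root mean squared error of the fitted line on the given points."""
    slope, intercept = model
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


def learning_curve(train, valid, sizes):
    """Return (size, train_rmse, valid_rmse) for each training-set size."""
    valid_x, valid_y = zip(*valid)
    curve = []
    for size in sizes:
        train_x, train_y = zip(*train[:size])  # first `size` examples only
        model = fit_line(train_x, train_y)
        curve.append((size,
                      rmse(model, train_x, train_y),
                      rmse(model, valid_x, valid_y)))
    return curve


# Synthetic data (an assumption): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)))

train, valid = data[:150], data[150:]
sizes = [10, 25, 50, 100, 150]
curve = learning_curve(train, valid, sizes)

for size, train_rmse, valid_rmse in curve:
    print(size, round(train_rmse, 3), round(valid_rmse, 3))
```

Plotting the second and third columns against the first would give the training and validation curves. For time-ordered data, the prefixes would normally follow chronological order rather than a random shuffle.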

Table 1. Parameters and output metrics in machine learning-based solar power generation forecasting models.