A Review of Solar Forecasting Techniques and the Role of Artificial Intelligence

Abstract: Solar energy forecasting is essential for the effective integration of solar power into electricity grids and the optimal management of renewable energy resources. Distinguishing itself from the existing literature, this review study provides a nuanced contribution by centering on advancements in forecasting techniques. While preceding reviews have examined factors such as meteorological input parameters, time horizons, the preprocessing methodology, optimization, and sample size, our study uniquely delves into a diverse spectrum of time horizons, spanning ultrashort intervals (1 min to 1 h) to more extended durations (up to 24 h). This temporal diversity equips decision makers in the renewable energy sector with tools for enhanced resource allocation and refined operational planning. Our investigation highlights the prominence of Artificial Intelligence (AI) techniques, specifically focusing on Neural Networks in solar energy forecasting, and we review supervised learning, regression, ensembles, and physics-based methods. This showcases a multifaceted approach to address the intricate challenges associated with solar energy predictions. The integration of Satellite Imagery, weather predictions, and historical data further augments precision in forecasting. In assessing forecasting models, our study describes various error metrics. While the existing literature discusses the importance of metrics, our emphasis lies on the significance of standardized datasets and benchmark methods to ensure accurate evaluations and facilitate meaningful comparisons with naive forecasts. This study stands as a significant advancement in the field, fostering the development of accurate models crucial for effective renewable energy planning and emphasizing the imperative for standardization, thus addressing key gaps in the existing research landscape.


Introduction
The pressing need for reducing greenhouse gas emissions has led to the worldwide adoption of renewable energy sources (RESs) [1,2,3]. However, RESs tend to be volatile in nature, especially solar and wind energy, making it hard to predict their power output and making them less reliable. This volatile nature can lead to voltage fluctuations, frequency fluctuations, and system outages [1,2,4,5].
The large-scale integration of RESs into the energy supply network requires the development of new technologies and methods to balance supply and demand. As the share of RESs in the energy mix increases, the load on the energy grid increases with corresponding consequences. The intermittent nature of solar energy has proven to be an obstacle to the large-scale integration of solar energy. For example, a massive increase in grid-connected PV energy can result in overvoltage or congestion problems [6].
A rethinking of the traditional electricity grid is taking place in order to handle the perceived unpredictable nature of RESs. Traditional grids are continuously evolving and changing, becoming so-called smart grids. A smart grid can be seen as the result of fusing the electricity grid with Information and Communication Technologies (ICT). This allows for a two-way flow of information between the supply side and demand side on the energy grid [1,4], which in turn allows for improved control and management of all the different domains that are part of energy production and distribution [4]. Through the creation of decision-support tools that exploit these flows of information, the distribution and management of the grid can be optimized. Decision-support tools often deal with a variety of tasks, such as energy distribution, energy curtailment, and energy storage system activation. These tools are now being developed and often include the forecasting and recognition of energy demand and production. Artificial Intelligence (AI) is deemed to be very promising for dealing with these complex tasks [1,2,4].
One of the proposed solutions is to forecast solar irradiance and, in turn, to forecast solar energy production to help balance supply and demand through combination with electricity storage [7,8]. As a result, solar forecasting has seen an increase in interest from researchers, grid operators, and other parties involved in the electricity market [9].
A summary of the various parameters covered in the existing investigations is presented in Tables 1-7 in Sections 4 and 5. However, these existing studies fall short in addressing the latest AI techniques and recent research in different solar forecasting tools, along with their evaluations. This review paper therefore aims to provide an overview and critical evaluation of the current and emerging solar forecasting techniques, with a specific emphasis on methods based on All-Sky Imagers (ASIs), satellite data, Sensor Networks, the different data used, the time horizon, the evaluation metrics, and the different applications of solar forecasting, highlighting the integration of AI. The aim is to contribute to the advancement of solar forecasting research by identifying trends and obstacles and offering suggestions for further exploration.
The present article comprehensively explores solar radiation forecasting techniques based on AI and highlights the growing interest in recently developed procedures, encompassing the evaluation of forecasting methods, the analysis of ramp events and timing, insights into AI techniques for specific time horizons and resolutions, considerations of the spatial-temporal resolution, the examination of input variables and their accessibility, location-based accuracy assessments, and the use of evaluation metrics for intended applications. Future recommendations focus on establishing a benchmarking framework and creating publicly available standardized datasets. Additionally, this article elucidates the working principles of each model, providing in-depth insights into recent research articles and their numerical results; a block diagram summarizing the contents of the different chapters is presented in Figure 1. The main objectives of this study are structured as follows:
I. Solar forecasting, AI methods, and performance.
II. Assessment of forecasting methods.
III. Current research-an overview.
IV. Future recommendations and consistency of the training data.

Solar Forecasting, Methods, and Performance
The investigation into solar energy is an interdisciplinary pursuit that merges insights from various domains of research, including atmospheric science, climatology, statistics, data science, and Artificial Intelligence [10]. In general, the problem of solar forecasting starts by determining the current state of the atmosphere in order to predict its future state [11]. From this viewpoint, the process of solar forecasting can be reduced to three important parts: (1) the collection of input data, (2) the processing of input data through various methods, including preprocessing and postprocessing, and (3) the generation of output data or the forecast, as presented in the flowchart in Figure 2. Due to the multitude of different forecasting methods available, this abstraction will serve as an overarching framework to connect and compare the different solar forecasting techniques to each other. We note that this is just a simplification, as the solar forecasting methods can often be very complex and detailed. Often, the methods applied to process the input data depend on the methods used to collect the input data. Therefore, we will describe some general approaches to solar forecasting, differentiated by the method of data acquisition, as seen from different investigations in Figure 3a in Section 2. The most common ways to gather input data for determining atmospheric conditions are described in the following subsections: Satellite Images, All-Sky Imagers, Sensor Networks, and Numerical Weather Predictions.
[Figure caption fragment: Additional acronyms are added and can be found in the abbreviations. Created by using VOSviewer (Version 1.6.18) [12].]

Satellite Images
High-altitude geostationary satellites are equipped with a wide range of sensors, including sensors for visible and infrared light. Images are taken every 15 to 30 min by these satellites, and they are then combined with physical modeling to determine solar irradiance [2,11]. The general approach is to first determine the clear-sky irradiance at a particular point through physical modeling, taking into account various parameters such as the aerosol content, water vapor, elevation, and ozone. Then, the cloud pictures are analyzed to estimate the location and transmittance of the clouds. Finally, these data are combined to make an estimation of the actual solar irradiance [13]. In order to make forecasts of solar irradiance, consecutive images are used to determine Cloud Motion Vectors. Under the assumption that the cloud structures stay the same, the future position of the clouds can be determined [9,13]. Forecasting methods based on this approach generally have a good forecasting ability for a horizon of up to 6 h, but their performance tends to degrade in situations where clouds rapidly form and disperse [9,13].
In addition to these established methods, recent advancements have been made in short-term solar irradiance forecasting. For instance, Miller et al. [14] proposed a method that combines geostationary satellite observations, cloud masking and retrieval algorithms, wind field data, and radiative transfer calculations to generate accurate short-term forecasts of solar insolation for solar power generation. This approach considers factors such as cloud advection, shadow displacement, solar geometry, and terrain height to predict the transient properties of down-welling solar irradiance. The algorithm outperforms persistence-based forecasting methods [14]. Furthermore, Lago et al. [15] developed a generalized model for short-term solar irradiance forecasting that does not rely on local ground measurements. Instead, the model utilizes satellite-based measurements and weather forecasts, employing a Deep Neural Network (DNN) structure that can generalize across locations. The model shows comparable or better performance compared to local models trained with ground measurements, making it a cost-effective alternative for solar irradiance forecasting without the need for costly installation and maintenance of ground sensors [15].
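The satellite-based procedure described above (clear-sky irradiance, cloud transmittance, and advection with a Cloud Motion Vector under a frozen-cloud assumption) can be illustrated with a minimal sketch. It is not the method of any cited study: the grid, the function names `advect` and `forecast_irradiance`, the constant clear-sky value, and the rule "irradiance = clear-sky irradiance times transmittance" are simplifying assumptions chosen for illustration only.

```python
def advect(field, dx, dy, fill=1.0):
    """Shift a 2-D grid by (dx, dy) pixels; cells with no source pixel get `fill`."""
    rows, cols = len(field), len(field[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dy, c + dx
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = field[r][c]
    return out


def forecast_irradiance(clear_sky, transmittance, cmv, steps):
    """Move the cloud transmittance map `steps` image intervals along the
    cloud motion vector `cmv` (pixels per interval), assuming the cloud
    structure itself does not change, then scale the clear-sky irradiance."""
    dx, dy = cmv
    moved = advect(transmittance, dx * steps, dy * steps)
    return [[clear_sky * t for t in row] for row in moved]


# Hypothetical example: one cloudy pixel (transmittance 0.3) moving right.
transmittance = [[1.0, 1.0, 1.0, 1.0],
                 [0.3, 1.0, 1.0, 1.0],
                 [1.0, 1.0, 1.0, 1.0]]
forecast = forecast_irradiance(800.0, transmittance, cmv=(1, 0), steps=2)
```

In this toy example the shaded pixel ends up two columns to the right after two image intervals. Because the sketch keeps cloud shapes fixed, it inherits the weakness noted above for rapidly forming or dispersing clouds.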

All-Sky Imagers
All-Sky Imagers (ASIs) are cameras capable of capturing images with a 180-degree field of view, enabling them to take pictures of the entire sky from one horizon to the other. ASIs are used for cloud detection, the determination of Cloud Motion, and the determination of cloud height [13]. ASIs can also be linked to other equipment, such as ceilometers or pyranometers [9]. Solar irradiance forecasts that are made by using ASIs generally use the following approach: (1) take sky pictures near or at the forecast site, (2) use image-processing techniques to detect clouds in the picture, (3) determine Cloud Motion Vectors through linking clouds in consecutive images, and (4) use Cloud Motion Vectors to determine future cloud positions and estimate the future irradiance accordingly [2,9]. ASI-based methods are able to provide forecasts with a very high spatial and temporal resolution, compared to satellite-based methods. Therefore, they are very valuable for predicting high-frequency fluctuations or ramps in solar irradiance [9,11]. Methods based on ASIs generally outperform other methods on very short-term forecasts, i.e., up to 30 min [9]. They have proven to be useful for the management of solar thermal energy plants, the management of microgrids, scheduling storage-integrated PV systems, and the participation of PV and solar thermal electric (STE) plants in power grid operation [16].
Recent investigations have contributed to the advancement of ASI-based solar forecasting techniques. For instance, a study by Zhang et al. proposed a machine learning approach for Cloud Motion Vector estimation using ASI images, which showed promising results for improving short-term solar irradiance forecasts [17]. Furthermore, a study by Li et al. explored the integration of ASI data with deep learning models for accurate solar power forecasting in complex weather conditions [18].
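Step (3) of the approach above, linking clouds in consecutive images, can be sketched with a simple exhaustive block-matching search. This is only an illustration under stated assumptions: the images are taken to be already converted into binary cloud masks (step 2), a single motion vector is assumed for the whole image, and the function name `estimate_cmv` and the search range `max_shift` are hypothetical. Operational ASI methods typically use more elaborate techniques.

```python
def estimate_cmv(prev_mask, curr_mask, max_shift=2):
    """Return the pixel shift (dx, dy) that best maps prev_mask onto curr_mask.

    Every candidate shift is scored by the mean squared difference over the
    overlapping pixels; the shift with the lowest score is returned.
    Assumes max_shift is smaller than both image dimensions.
    """
    rows, cols = len(prev_mask), len(prev_mask[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost, n = 0.0, 0
            for r in range(rows):
                for c in range(cols):
                    nr, nc = r + dy, c + dx
                    if 0 <= nr < rows and 0 <= nc < cols:
                        cost += (curr_mask[nr][nc] - prev_mask[r][c]) ** 2
                        n += 1
            cost /= n
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best


# Hypothetical 5 x 5 cloud masks: a two-pixel cloud moves one pixel to the right.
prev_mask = [[0, 0, 0, 0, 0],
             [0, 1, 0, 0, 0],
             [0, 1, 0, 0, 0],
             [0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0]]
curr_mask = [[0, 0, 0, 0, 0],
             [0, 0, 1, 0, 0],
             [0, 0, 1, 0, 0],
             [0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0]]
cmv = estimate_cmv(prev_mask, curr_mask)
```

The resulting vector can then be used in step (4) to shift the current cloud mask forward in time and estimate when a cloud will shade the sensor or plant.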

Sensor Networks
While Sensor Networks often span several hundred meters [19] or even kilometers [20], research on the use of low-cost illuminance meters showed that small networks can also detect cloud shadow movement when the sampling rate is high enough. Espinosa-Gavira et al. [21] used a network of 16 lux meters, spanning only 15 m by 15 m, that was able to detect cloud shadow movement. As a continuation of this research, Espinosa-Gavira et al. [22] also developed a method for deriving Cloud Motion Vectors from small Sensor Networks with promising results. While, to the best of the authors' knowledge, no forecasting techniques have currently incorporated these methods, the findings underscore the significant utility of small-scale networks in supporting solar forecasting efforts. The research conducted by Elsinga and Van Sark [23] demonstrated how solar PV systems themselves can be used as sensors. A total of 202 rooftop PV systems, spread over the province of Utrecht, the Netherlands, and spanning an area of roughly 1400 km², were used to create a PV sensor field. By combining real-time power output measurements with the cross-correlation time lag between pairs of PV systems, a peer-to-peer forecast method for very short-term forecasts of up to 30 min was successfully developed.
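The idea of a cross-correlation time lag between two sensors can be illustrated as follows. The sketch assumes two equally sampled power series, where the "upstream" system sees a passing cloud first; the names `pearson`, `best_lag`, and `peer_forecast` are hypothetical, and the simple "copy the upstream values" forecast is an illustrative assumption rather than the procedure of [23].

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def best_lag(upstream, downstream, max_lag):
    """Lag (in samples) at which the downstream series best matches the upstream one."""
    def score(lag):
        return pearson(upstream[:len(upstream) - lag], downstream[lag:])
    return max(range(max_lag + 1), key=score)


def peer_forecast(upstream, lag):
    """Values expected downstream over the next `lag` samples: the most recent
    upstream values, assuming the cloud shadow travels unchanged."""
    return upstream[len(upstream) - lag:]


# Hypothetical power series: the downstream system sees the same dips 2 samples later.
upstream = [5, 5, 2, 5, 5, 5, 3, 5, 5, 5, 5, 5]
downstream = [5, 5, 5, 5, 2, 5, 5, 5, 3, 5, 5, 5]
lag = best_lag(upstream, downstream, max_lag=4)
```

With many PV systems, such pairwise lags would be estimated for each pair, so that upstream systems can act as sensors for downstream ones.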

Numerical Weather Predictions
The data obtained by Sensor Networks are limited to their location, and the resolution is limited by their dispersion. Numerical Weather Prediction (NWP) models have been developed to model conditions over large areas and can be used as an alternative source of input data for solar forecasting. NWP models are based on physical modeling and generally are built on a set of differential equations describing physical and thermodynamical processes. These equations are numerically solved and have often been optimized to predict variables such as temperature, humidity, wind, and the probability of precipitation [2,11,13]. The input data for NWP models are obtained through direct atmospheric measurements or satellite data [8].
NWP models can be divided into two categories: global and local. Global NWP models describe global weather patterns, whereas local NWP models describe weather restricted to a certain area, be it a country, a continent, or another bounded area [2,11]. Often, NWP models are combined with Model Output Statistics (MOS), which can help improve the forecast accuracy by about 10-15% [8,13]. Depending on the model used, NWP models have a spatial resolution typically ranging from 16 to 50 km and a temporal resolution of about 15 min [2,11]. NWP models are known to perform better than other methodologies for a time horizon from 6 h to about two weeks regarding the forecasting of atmospheric conditions and are therefore a valuable resource for generating input data [11,13]. However, due to the resolution of NWP models, it is not possible to resolve the small-scale physical processes that are related to cloud formation. Consequently, the prediction of cloud formation involves large uncertainties and errors. NWP models can, at best, give information on the probability of cloud formation through, for example, the determination of atmospheric saturation. As a result, stand-alone NWP models are often not sufficient for accurate irradiation prediction, but they provide a valuable resource for determining atmospheric conditions [11].
One of the recent developments is an emphasis on improving solar irradiation prediction by NWP models [13]. For example, Zhang et al. [24] researched the effect of postprocessing NWP model outputs for solar forecasting. However, when comparing their method to time series modeling and extrapolation techniques, they found only marginal improvements. Nevertheless, when averaging the postprocessed NWP model output with the time series techniques, the performance proved to be superior to that of stand-alone techniques, highlighting the importance of using an ensemble of techniques for forecasting. Another recent example is the work of Sabzehgar et al. [25]. They developed a method combining NWP models with Neural Networks to improve the forecasting of irradiance and power output. The postprocessing of the NWP model output has also been proven to improve NWP forecasts significantly. Verbois et al. [26] demonstrated that by applying postprocessing, the RMSE of NWP models can be reduced by up to 30%. Verzijlbergh et al. [27] were able to obtain similar results by reducing the rRMSE (relative RMSE, see Section 4) by applying MOS to correct biases in the model.
Another recent development, sparked by the increase in computing power, is an increased interest in high-resolution atmospheric modeling. More specifically, Large Eddy Simulations (LES) are considered to be one of the most promising methods to increase the resolution of atmospheric modeling [28]. High-resolution NWP models based on LES have already shown promise in the development of large offshore wind farms [29], and the potential of LES for the benefit of solar forecasting is being investigated [30]. A first attempt at LES for solar forecasting, based on MicroHH (a computational fluid dynamics code for the simulation of turbulent flows in the atmosphere) [31], has already been tested against NWP model results that are postprocessed by using a machine learning algorithm, hinting at the possibilities of LES [32].
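Two ideas from this paragraph, bias correction of NWP output with MOS and averaging several forecasts, can be sketched in their simplest form. The sketch assumes a purely linear MOS correction fitted by ordinary least squares and an unweighted average; the names `fit_mos`, `apply_mos`, and `blend`, as well as the numbers, are hypothetical and do not reproduce the methods of [24,26,27].

```python
def fit_mos(nwp, observed):
    """Fit observed = slope * nwp + intercept by ordinary least squares."""
    n = len(nwp)
    mx, my = sum(nwp) / n, sum(observed) / n
    sxx = sum((x - mx) ** 2 for x in nwp)
    sxy = sum((x - mx) * (y - my) for x, y in zip(nwp, observed))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def apply_mos(value, slope, intercept):
    """Correct a raw NWP irradiance value with the fitted linear relation."""
    return slope * value + intercept


def blend(forecasts):
    """Unweighted average of several forecasts for the same time step."""
    return sum(forecasts) / len(forecasts)


# Hypothetical history in which the NWP model overestimates irradiance by 10%.
nwp_history = [100.0, 200.0, 300.0, 400.0]
obs_history = [90.0, 180.0, 270.0, 360.0]
slope, intercept = fit_mos(nwp_history, obs_history)

corrected = apply_mos(500.0, slope, intercept)   # postprocessed NWP forecast
time_series_forecast = 430.0                     # e.g. from a statistical model
combined = blend([corrected, time_series_forecast])
```

Real MOS schemes usually condition the correction on further predictors such as solar zenith angle or clear-sky index, and ensemble weights are often tuned rather than equal.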

Hybrid Approaches
The above-mentioned approaches are often combined in one way or another and then can be considered as hybrid approaches. They can be combined to tackle the weaknesses of one method or to enhance each other to increase accuracy and strength [2]. One recent example is the work conducted by Paletta et al. [33]. They combined All-Sky Imagery with satellite observations to investigate how these techniques can complement each other. They found that by combining these two data sources, clear-sky forecasts and long-term forecasts can be improved. Si et al. [34] also combined different sources of input. Satellite Images were used as input for a Deep Convolutional Neural Network. The output of the Neural Network was then combined with so-called "cloud factors", which were derived from meteorological data and NWP data. The combined data were used as input for a multilayer perceptron to produce a forecast of solar irradiance, with good results.

Artificial Intelligence and Solar Forecasting
It would be impossible to discuss solar forecasting methods without mentioning Artificial Intelligence. A lot of research is already focusing on the use of AI for solar forecasting [2,8,13,35] and also for decision-support tools in different domains of the electricity grid [3]. The wide range of different applications that have been applied successfully by researchers highlights the versatility of AI techniques. For example, AI has been used to develop energy bidding tools [36], perform day-ahead solar forecasting [37], forecast wind speed [35], estimate solar radiation [2,8,13,35], monitor fields of PV systems [38], detect and diagnose faults in wind energy systems [5], and predict demand load [3].
It should be noted that there is no official definition of what AI is [11] and what techniques and methods are considered Artificial Intelligence. Instead, AI is often used as an umbrella term to describe a wide variety of techniques, including but not limited to machine learning, supervised learning, optimization algorithms, pattern recognition techniques, and regression methods. One of the main strengths of what are generally considered AI techniques is that they are able to solve complex problems for which it is impossible to find explicit algorithms or mathematical solutions [7]. Often, this includes pattern recognition in large datasets in which the underlying principles or dependencies are very complex or unknown. The recent increase in the usage of AI techniques has been facilitated by a rapid increase in computational power over the last decades [39].
One of the most frequently used AI techniques in solar forecasting, as seen in Figure 3b, and in many other fields of research is the Artificial Neural Network (ANN) [2,3,11,13,40]. The strength of ANNs is that they require a very low level of programming to solve a wide variety of complex problems. Specifically, nonlinear, stochastic, or mathematically ill-defined problems (e.g., pattern recognition or classification) are very well suited for ANNs [2,11,40,41]. Other popular techniques include the support vector machine [2,8,13], k-Nearest Neighbor algorithms [8,11], intelligent optimization algorithms [3], and Markov Chains [2,8,13]. Fuzzy Logic Control (FLC) has also been widely applied to control solar PV systems and smart grids [3]. It makes it easy to use many input variables and to make use of the expert knowledge of human decision makers without the use of complex mathematical expressions [3,42].
A recent example of using AI methods for solar forecasting is the work conducted by Eseye et al. [43]. A data-driven approach was developed that employs a wavelet transform method, a support vector machine, and particle swarm optimization to make predictions of the PV power output. The results were compared to seven other AI-based methods and proved to be competitive. This research also highlights the numerous methods already developed by using AI. Mishra and Palanisamy [44] developed a solar forecasting method built on Recurrent Neural Networks that was able to produce forecasts over a wide range of time horizons, from intrahour and hourly to day-ahead scales, using real-time inputs. Another example that showcases the possibilities of AI within solar forecasting is the method developed by Ge et al. [45]. They developed a method that only uses empirical data and AI, thus excluding the use of any physical model or empirical relationship while still being able to achieve similar results to more classical methods. Another method that has been gaining interest for nowcasting is the Generative Adversarial Network (GAN), which has already proven to be able to forecast precipitation with high precision [46], improve time series Satellite Image prediction [47], and perform sky image forecasting [48]. For further reading on AI techniques, refs. [8,11] are recommended.

Assessment of Forecasting Methods
Depending on the equipment and input data used, the methods that are applied can vary to a great extent. For the further development and validation of solar forecasting techniques, it is important to consider how to assess and compare the performance of these different methods. Most often, conventional metrics are used since they give a general overview of the global performance. However, it is very difficult to directly compare methods based on the results of these metrics alone. These metrics need to be interpreted correctly since the performance of a forecasting method is dependent on various factors such as the spatial and temporal resolution of the input data, time of year, percentage of clear-sky days in the dataset, location, forecast horizon, etc. [2,7]. At the same time, the end-user of the forecast also plays an important role in determining the requirements for a useful forecasting method [9].

Common Performance Metrics for Solar Forecasting
The predicted values of solar forecasting methods and their accuracy are generally expressed as either irradiance (W/m²) or solar power output (kW) [2]. The most commonly used statistical metrics to assess the accuracy of the forecast are described below very briefly [2,7,9]. In the equations, $\hat{y}_i$ denotes the predicted value, $y_i$ the observed value, and $N$ the number of samples.
• Mean Absolute Error (MAE): This calculates the average absolute difference between the predicted and observed values. It is an easy-to-understand metric that gives an idea of the accuracy of predictions. Unlike the RMSE (Root Mean Squared Error), which amplifies larger errors due to the squaring process, the MAE gives equal importance to all errors, regardless of their size. As a result, each error has an equal impact on the MAE:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right|$$
• Relative RMSE (rRMSE): This is a normalized version of the RMSE that takes into account the magnitude of the actual values when assessing the predictive accuracy of a model. It helps to evaluate a model's performance in relation to the data's variability, which is especially helpful for data with different scales or units. Normalizing by the mean of the observed values $\bar{y}$, as is most common, gives:
$$\mathrm{rRMSE} = \frac{\mathrm{RMSE}}{\bar{y}} \times 100\%, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}$$
Next to these statistical metrics, the Forecast Skill (FS), also known as the Performance Skill (PS) or Forecast Skill Score (SS), is often used to evaluate the forecast. It allows the user to compare the developed forecasting method against a reference method [7,9]. Another metric that is being applied to solar forecasting is the Continuous Ranked Probability Score (CRPS). The CRPS measures the sharpness and reliability of the forecast and rewards a high concentration of the forecasted probability around the target value [2,9]. The CRPS is given by the following equation, where $F^{\mathrm{fc}}(x)$ and $F^{\mathrm{obs}}(x)$ are defined to be the Cumulative Distribution Function (CDF) of the probabilistic forecast and the actual measurement, respectively:
$$\mathrm{CRPS} = \int_{-\infty}^{\infty}\left[F^{\mathrm{fc}}(x) - F^{\mathrm{obs}}(x)\right]^2 \mathrm{d}x$$
The above-mentioned metrics are only the most commonly used metrics. For further reading on the different metrics that are used, we refer to [2,7,9].
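As a compact illustration of the metrics named in this subsection, the following sketch computes the MAE, RMSE, rRMSE, a forecast skill, and the CRPS of an ensemble forecast. Several choices are assumptions rather than statements from the cited literature: the rRMSE is normalized by the mean observation, the forecast skill is taken in the common form 1 - RMSE(model)/RMSE(reference), and the CRPS is evaluated for the empirical CDF of an ensemble via the identity E|X - y| - 0.5 E|X - X'|. All function names are hypothetical.

```python
import math


def mae(pred, obs):
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def rrmse(pred, obs):
    """RMSE as a percentage of the mean observed value (assumed normalization)."""
    return 100.0 * rmse(pred, obs) / (sum(obs) / len(obs))


def forecast_skill(pred, reference, obs):
    """1 - RMSE(model) / RMSE(reference); positive means better than the reference."""
    return 1.0 - rmse(pred, obs) / rmse(reference, obs)


def crps_ensemble(members, observation):
    """CRPS of the empirical CDF of an ensemble against a single observation."""
    n = len(members)
    spread_to_obs = sum(abs(m - observation) for m in members) / n
    internal_spread = sum(abs(a - b) for a in members for b in members) / (n * n)
    return spread_to_obs - 0.5 * internal_spread


# Hypothetical irradiance values in W/m².
obs = [100.0, 200.0, 300.0]
model = [110.0, 190.0, 320.0]
persistence = [130.0, 160.0, 340.0]   # stand-in for a naive reference forecast

print(mae(model, obs), rmse(model, obs), rrmse(model, obs))
print(forecast_skill(model, persistence, obs))
print(crps_ensemble([1.0, 2.0, 3.0], 2.0))
```

For a single deterministic forecast (an ensemble with one member), the ensemble CRPS reduces to the absolute error, which is why the CRPS is often described as a probabilistic generalization of the MAE.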

Assessment of Ramp Events and Timing Errors
When assessing ramp events and timing errors in solar energy forecasting, the focus is particularly on the Temporal Distortion Index (TDI) and its extension, the Temporal Distortion Mix (TDM). This approach encompasses using the time derivative of the normalized global horizontal irradiance (GHI) for ramp event detection and classification.
Depending on the application of the forecast, the most appropriate performance metrics should be used. The metrics mentioned before are well suited for determining average errors. However, they are not well suited for measuring the accuracy of predicting ramp events [7,9,33]. There is no strict definition of a ramp event, as it often depends on the end-use for which the events are critical [49]. Ramp events can be described as sudden and significant changes in the irradiance or power output. An example of a ramp event could be a sudden drop in the irradiance due to a passing cloud. These events can be major concerns for certain end-users, such as network operators or large PV plant operators, and can be of great interest for short-term to medium-term forecasts [7,9]. Even though the forecasting of ramp events is of great importance for many applications, there is relatively little research focused on this area [49].
A couple of methods have already been proposed to measure the ability to forecast ramp events. The first to discuss is the swinging door algorithm. Formally, this algorithm is not a metric, but it can be used to detect ramp events. It was proposed back in 1990 for data compression [50]. The swinging door algorithm splits a signal up into segments. The starting point of a segment can be seen as the pivot point of a swinging door, and a threshold is chosen to determine how far the door can swing. If the starting point of the next segment is within the swinging doors' range, there is no ramp event. If it is outside of this range, a ramp event is detected [51]. It has been suggested and applied to detect energy ramps in historical wind and solar-energy-production data [52], and further research has already been performed to optimize the algorithm for the detection of wind energy ramps [51,53,54].
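To make the swinging door idea more concrete, the sketch below implements the classic swinging-door compression on an equally sampled signal and then labels a segment as a ramp when its total change exceeds a chosen magnitude. The door width `eps`, the ramp criterion `min_change`, and the function names are illustrative assumptions; the optimized variants in [51,53,54] differ in their details.

```python
def swinging_door(values, eps):
    """Return the indices of segment boundaries (first and last index included).

    Two "doors" are hinged at (start, value + eps) and (start, value - eps).
    Each new sample can only open the doors further; when they open past
    parallel, the segment is closed at the previous sample.
    """
    n = len(values)
    if n < 3:
        return list(range(n))
    pivots = [0]
    start = 0
    upper, lower = float("-inf"), float("inf")
    i = 1
    while i < n:
        dt = i - start
        upper = max(upper, (values[i] - (values[start] + eps)) / dt)
        lower = min(lower, (values[i] - (values[start] - eps)) / dt)
        if upper > lower:
            # Doors opened past parallel: close the segment at the previous
            # sample and re-test the current sample against the new pivot.
            start = i - 1
            pivots.append(start)
            upper, lower = float("-inf"), float("inf")
            continue
        i += 1
    pivots.append(n - 1)
    return pivots


def ramp_segments(values, eps, min_change):
    """Segments (start, end) whose absolute change is at least `min_change`."""
    pivots = swinging_door(values, eps)
    return [(a, b) for a, b in zip(pivots, pivots[1:])
            if abs(values[b] - values[a]) >= min_change]


# Hypothetical power signal: flat, a steep rise, then flat again.
signal = [0, 0, 0, 0, 10, 20, 30, 30, 30]
print(swinging_door(signal, eps=1))                  # [0, 3, 6, 8]
print(ramp_segments(signal, eps=1, min_change=15))   # [(3, 6)]
```

A larger `eps` merges small fluctuations into longer segments, so the choice of door width directly controls how many ramps are reported.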
Another metric that has been introduced to measure timing errors, which are critical for the detection of ramp events, is the Temporal Distortion Index (TDI), which is based on Dynamic Time Warping (DTW) [55]. DTW was first introduced by Sakoe and used as a time-normalization algorithm for spoken word recognition during the 1970s [56,57]. A straightforward method to explore the time alignment of two signals involves matching each data point in signal 1 with its corresponding data point in signal 2 and then evaluating their similarity. However, by warping the time index of signal 2, the data points in signal 2 can be optimally aligned to the data points in signal 1. This concept is visualized in Figure 4. The TDI gives a measure of the amount of time warping needed to align the two signals, thus giving a measure of the temporal error of the second signal. A complete description of this metric is outside the scope of this review. The reader is referred to [55], where a complete methodology for using the TDI is described. Continuing the TDI method introduced by the authors of [55], a method was developed to split up the TDI into two parts, giving a score of how much of the forecasted signal is advanced (or early) and how much of the signal is late [7]. This metric is called the Temporal Distortion Mix (TDM), expressed as a percentage ranging from −100% to +100%, which indicates that the signal is early or late, respectively [7]. This new metric was combined with a ramp score based on the swinging door algorithm and complemented with a complete procedure to determine the quality of a forecast [7].
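The warping idea behind the TDI can be sketched with a textbook DTW implementation. The code below computes the optimal warping path between two signals and then summarizes how far that path departs from the diagonal. This summary, called `temporal_distortion_proxy` here, is a simplified stand-in chosen for illustration; it is not the TDI definition of [55], which should be consulted for the full methodology.

```python
def dtw_path(a, b):
    """Return (total cost, warping path) of classic DTW with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    D = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = abs(a[i] - b[j])
            if i == 0 and j == 0:
                D[i][j] = cost
            else:
                D[i][j] = cost + min(
                    D[i - 1][j - 1] if i > 0 and j > 0 else inf,
                    D[i - 1][j] if i > 0 else inf,
                    D[i][j - 1] if j > 0 else inf,
                )
    # Backtrack from the end, preferring the diagonal step on ties.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        steps = []
        if i > 0 and j > 0:
            steps.append((D[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            steps.append((D[i - 1][j], i - 1, j))
        if j > 0:
            steps.append((D[i][j - 1], i, j - 1))
        _, i, j = min(steps, key=lambda s: s[0])
        path.append((i, j))
    path.reverse()
    return D[n - 1][m - 1], path


def temporal_distortion_proxy(path, n):
    """Mean index offset |i - j| along the path, scaled by n - 1 (0 = no warping)."""
    return sum(abs(i - j) for i, j in path) / len(path) / (n - 1)


# Hypothetical example: the forecast peak arrives one step later than observed.
observed = [0, 0, 1, 0, 0, 0]
forecast = [0, 0, 0, 1, 0, 0]
cost, path = dtw_path(observed, forecast)
distortion = temporal_distortion_proxy(path, len(observed))
```

In this example the amplitude error after warping is zero, while the proxy is positive, which separates the timing error from the magnitude error in the way the TDI is intended to. Keeping the sign of i - j along the path would distinguish early from late forecasts, which is the idea behind the TDM.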
Another method, focusing on the detection of ramp events, was developed by [49]. The time derivative, over a specified time horizon D, of the normalized GHI is used to determine the Ramp Rate (RR). In this case, ϵ is the maximum, which is the clear-sky GHI at the top of the atmosphere for the day under consideration:
$$\mathrm{RR}(t) = \left.\frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{\mathrm{GHI}(t)}{\epsilon}\right)\right|_{D}$$
Based on measurements from a clear day, a threshold for detecting which events should be counted as ramp events is derived. This threshold states that on a clear day, 99% of the derivatives of the measurements should lie below this threshold. The benefit of this method is its relative simplicity. Current research focuses on the relation between D and the threshold limit and the recall and precision of ramp event forecasting [58]. Based on a confusion matrix, recall and precision are defined as in Table 1.
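The ramp-rate procedure can be sketched as follows. Several details are assumptions made for illustration: the derivative over the horizon D is approximated by a forward difference, the 99% threshold is taken with a nearest-rank percentile of the absolute clear-day ramp rates, and precision and recall use their standard confusion-matrix definitions, TP/(TP+FP) and TP/(TP+FN). The function names are hypothetical.

```python
import math


def ramp_rates(ghi, eps, horizon, dt=1.0):
    """Forward-difference ramp rate of GHI normalized by eps, over `horizon` samples."""
    return [(ghi[i + horizon] - ghi[i]) / (eps * horizon * dt)
            for i in range(len(ghi) - horizon)]


def percentile_threshold(clear_day_rates, q=0.99):
    """Nearest-rank percentile of the absolute ramp rates measured on a clear day."""
    ordered = sorted(abs(r) for r in clear_day_rates)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def detect_ramps(rates, threshold):
    """Flag every time step whose absolute ramp rate exceeds the threshold."""
    return [abs(r) > threshold for r in rates]


def precision_recall(observed_events, forecast_events):
    """Precision and recall of forecast ramp flags against observed ramp flags."""
    tp = sum(1 for o, f in zip(observed_events, forecast_events) if o and f)
    fp = sum(1 for o, f in zip(observed_events, forecast_events) if f and not o)
    fn = sum(1 for o, f in zip(observed_events, forecast_events) if o and not f)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical GHI series (W/m²) with a sudden drop, normalized by eps = 1000 W/m².
rates = ramp_rates([500.0, 500.0, 200.0, 200.0], eps=1000.0, horizon=1)
events = detect_ramps(rates, threshold=0.1)          # [False, True, False]
precision, recall = precision_recall(
    [True, True, False, False, True],
    [True, False, False, True, True],
)
```

Applying `detect_ramps` to both the measured and the forecast series yields the two event sequences that `precision_recall` compares.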

Confidence Intervals and Ranges in Solar Forecasting Studies
Based on the compiled information from various solar forecasting studies presented in Table 2 (see Section 4), we can draw some overall conclusions regarding the confidence intervals and ranges of different error metrics. It is important to note that these conclusions are based on the provided information and should be interpreted with caution as the actual confidence intervals may vary across individual studies.
The RMSE values vary depending on the forecast horizon, ranging from 115.6 W/m² for a 10 min horizon to ranges between 8.64 and 49.1 for longer forecast horizons. The exact ranges are not specified for certain forecast horizons. Additionally, specific values are given for selected time intervals. The RMSE for a 1 h ahead forecast ranges from 79 to 100 W/m².
The MAE ranges between 6% and 7.5%, indicating the average magnitude of errors. For specific forecast horizons, the MAE is reported as 70 W/m² for a 5 min ahead forecast. The MAPE for the NN irradiance forecast is 0.95%, while the MAPE for the NN power production forecast is 45.3%.
The relative RMSE ranges from 6.7 to 39.8%, providing a measure of error relative to the predicted value. Clear-sky days (Class 1) exhibit lower rRMSE values, with partially cloudy days reaching as low as 4.7% and 7.6%, while cloudy days (Class 3) show higher rRMSE values in the range of 30-50%.
The FS varies across studies, with reported values ranging from 2% to as high as 66%. The CRPSS improvement is above 2.7% in some cases, indicating the enhanced performance of forecasting models. Furthermore, an optimized model shows a skill score improvement of 21% compared to the previous Cloud Motion Vector (CMV) model [59].
These conclusions provide valuable insights into the wide array of error metrics utilized in solar forecasting studies, allowing researchers and practitioners to gain a comprehensive understanding of the accuracy and performance of forecasting models. By delving into these studies, stakeholders can deepen their knowledge of the advancements made in solar forecasting, leading to enhanced reliability in predicting solar energy outcomes.

Current Research-An Overview
This review focuses on research published in the last five years, i.e., from 2018 to the time of writing (2023). Google Scholar was used to find the research papers. Several search terms were used separately from each other to obtain an indicative view of trends in solar forecasting publications. These search terms were based on the methods that are generally used to obtain input data; an extra search term, based on the presence of AI within the research field, was added to include these techniques as well. The search terms used were (1) "Solar Forecasting", (2) "Solar Forecasting All Sky Camera", (3) "Solar Forecasting Satellite", (4) "Solar Forecasting Numerical Weather Prediction", (5) "Solar Forecasting Sensor Networks", and (6) "Solar Forecasting Artificial Intelligence". For each search term, the first 50 results were considered. The selected publications are summarized in Table 2, the relative frequency of the different data resources is given in Table 3, and the various applications of AI techniques are presented in Tables 4-7.
In addition to these publications, some further publications that were considered valuable based on expert knowledge were included.
To illustrate some of the research trends in the field of solar forecasting, Figures 5 and 6 show the number of publications associated with each search term for the period 2012 to 2022. The year 2023 was excluded since it was still ongoing at the time of writing. The approach was to insert one of the search terms into Google Scholar, filter on one specific year, and then iterate over the years of the period. The counts per year were summed to determine the total number of publications for that period. Interestingly, when inserting one of the search terms with the filter set to the whole period to determine the total number of publications in one go, the returned number of publications differed significantly from the number obtained by summing the separate years. This indicates that Google Scholar probably uses some algorithm to present the most relevant results depending on the filters and search terms applied. This serves as an extra reminder that the trends found are only indicative.
The overarching search term (1) "Solar Forecasting" returns 270,100 results when all the years are summed. Naturally, the other, more specific search terms return fewer results. The distribution of publications per search term can be found in Figure 5. Here, we can see that search term (4) "Solar Forecasting Numerical Weather Prediction" returns significantly more results than the other specific search terms. While All Sky Cameras receive specific attention from the IEA-PVPS through their Task 16 project (subtask 3) [69], showcasing their importance for solar forecasting, there are relatively few publications compared to the other methods. Figure 6 shows the number of publications per year per search term. For all search terms, we can see an increase in the number of publications. More specifically, there was a strong increase in publications found by using the search term (6) "Solar Forecasting Artificial Intelligence" over the last few years, indicating the importance of Artificial Intelligence techniques for the field of solar forecasting. Figure 7 shows the number of publications returned for search terms (2)-(6), normalized to the number of results returned for the overarching search term (1) "Solar Forecasting". Showing the number of publications relative to the total makes it possible to see how the interest in specific subjects is distributed within the field of research. Figure 7 shows that research on solar forecasting predominantly used Numerical Weather Prediction at the start of the considered period in 2012. Over time, the reliance on Numerical Weather Prediction seems to decrease, whereas the use of Artificial Intelligence seems to increase.
Figure 5. The graph illustrates the number of returns given by Google Scholar per search term for solar energy forecasting, including "Solar Forecasting", "Satellite", "Sensor Networks", "All Sky Camera", "Numerical Weather Prediction", and "Artificial Intelligence".
Figure 6. The graph illustrates the number of publications that are returned per year per search term for solar energy forecasting, including "Solar Forecasting", "Satellite", "Sensor Networks", "All Sky Camera", "Numerical Weather Prediction", and "Artificial Intelligence".
The above-mentioned search method resulted in a comprehensive set of publications revolving around the development of solar forecasting techniques. To summarize the ongoing research, Tables 2-7 were created. These tables summarize, per publication, the most important methods that were used to create the forecasts and the relative frequency of the data sources. The tables also include, where applicable, details on the forecasting accuracy and error of the developed method.
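The counting procedure described above (summing yearly result counts and normalizing the specific search terms to the overarching term, as in Figure 7) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and the counts are placeholders chosen for readability, not actual Google Scholar returns.

```python
def total_over_period(per_year):
    """Sum the yearly result counts to obtain the total for the period."""
    return sum(per_year.values())


def normalized_share(term_counts, overall_counts):
    """Yearly counts of each specific term divided by the overarching term's count."""
    return {
        term: {year: n / overall_counts[year] for year, n in per_year.items()}
        for term, per_year in term_counts.items()
    }


# Placeholder counts for illustration only, not actual Google Scholar returns.
overall = {2012: 1000, 2013: 2000}
specific = {
    "Numerical Weather Prediction": {2012: 400, 2013: 600},
    "Artificial Intelligence": {2012: 100, 2013: 500},
}
shares = normalized_share(specific, overall)
print(total_over_period(overall))         # 3000
print(shares["Artificial Intelligence"])  # {2012: 0.1, 2013: 0.25}
```

Because the yearly totals returned by the search engine are themselves only indicative, the resulting shares should be read as trends rather than exact proportions.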

Analysis of Current Research
Based on Tables 2 and 4-7, the current trends and practices in the field of solar forecasting were identified and are discussed below. This discussion includes an analysis of data resources, an analysis based on the time horizon, the prevalence and usage of AI techniques, the effect of weather conditions based on forecasting location, and a discussion of which error metrics are used to assess solar forecasting techniques.

Data Resources
This subsection covers the diverse data sources and methods employed in solar forecasting, ranging from Satellite Imagery and Numerical Weather Predictions (NWPs) to ground-based measurements and open-access data, as presented in Table 5. These varied inputs are crucial to accurately capture meteorological parameters and solar irradiance, thereby enhancing the precision of solar forecasting models.
To gather cloud cover information, sky conditions, and other meteorological parameters, Satellite Images from various satellites, including EUMETSAT's Meteosat, FengYun-4A, and MSG SEVIRI, are utilized [16,33,37]. NWPs from organizations such as ECMWF and NOAA play a significant role by providing data on atmospheric conditions, including cloud cover and temperature [33,37,44]. Ground-based measurements using pyranometers, pyrheliometers, and irradiance sensors directly measure the solar irradiance or power output, serving as crucial reference data for model validation and evaluation [44]. Some studies validate and verify solar forecasting models by utilizing data from PV systems or solar power plants, which provide actual power generation values based on solar irradiance [16]. Additionally, sky cameras and ceilometers capture the localized cloud cover, cloud base height, and sky conditions, offering valuable supplementary data for solar forecasting models [33]. The relative frequency of the data sources that are used is summarized in Table 3.
Certain studies employ numerical models such as the Weather Research and Forecasting (WRF) model to simulate atmospheric conditions and integrate them with other data sources to generate solar irradiance predictions [19]. Several articles highlight the use of open-access data sources, such as open-source datasets, publicly available Satellite Images, and meteorological data from sources such as NREL and the NSRDB [19,70]. It is important to note that the analysis presented is based on the provided information, and the actual techniques used in each study may vary. Furthermore, some articles may employ a combination of techniques or integrate multiple data sources to achieve more accurate solar forecasting.
On the other end of the spectrum, the shortest-term forecasting is conducted within a time horizon of 1 min to about 1 h. Various techniques have been employed for such short-term forecasting, showcasing the diverse approaches in the field. These techniques include localized forecasting using sky images and Cloud Motion tracking [76], nowcasting techniques utilizing analog methods and geostationary Satellite Images [71], probabilistic forecasts encompassing intraday variability and satellite information [81], direct learning from Satellite Images with regions of interest [74], spatiotemporal optimization based on Satellite Imagery [87], and minutely forecasting through real-time sky image-irradiance mapping [68]. Other methods that have been explored include predictions based on near-real-time satellite data [82], ensemble forecasting with dropout Neural Networks and neighboring satellite information [63], and generalized models using satellite data without local telemetry [88]. Deep learning models have also shown promise, including a model for intraday forecasting using satellite-based estimations [89], a proposed model combining deep learning and machine learning for hourly solar irradiation forecasting [65], and a spatiotemporal deep learning model for satellite-derived short-term forecasting [90]. Further studies have focused on cloudiness forecasting for solar energy purposes [79], comparative studies of LSTM Neural Networks in day-ahead global horizontal irradiance forecasting [77], ultra-short-term PV power forecasting considering neighboring plant data [72], and hybrid approaches combining satellite remote sensing and time series models for mesoscale surface solar irradiation distribution [67]. Finally, a hybrid forecasting method utilizing satellite visible images and modified Convolutional Neural Networks (CNNs) has been proposed [34].

Prevalence of Artificial Intelligence
Based on a comprehensive review of the literature provided in Tables 4, 6 and 7, it is evident that a wide range of AI techniques has been employed for solar energy forecasting, drawing on diverse methodologies and data sources. Table 5 highlights the multifaceted nature of the problem. Notably, deep learning architectures based on ECLIPSE (Envisioning Cloud-Induced Perturbations in Solar Energy) have been investigated [33], showcasing the potential of advanced Neural Network structures. Supervised learning models that leverage historical data and relevant features have been explored to forecast solar energy production [36]. Regression, support vector regression, ensemble learning, deep learning, and physical-based techniques have been investigated to model solar energy patterns [44]. Recurrent Neural Networks have been proposed for solar radiation nowcasting, enabling short-term forecasting [16]. Cloud Motion Vectors have been incorporated by using the Deep Flow algorithm to enhance the accuracy of solar radiation predictions [70]. The European Solar Radiation Atlas (ESRA) clear-sky irradiation model has been utilized in combination with other algorithms to improve solar power nowcasting [19] and has also been employed for PV power forecasts [20]. Satellite-based approaches, such as those using FengYun-4 geostationary satellite data, have been utilized to develop solar radiation nowcasting systems [73]. Numerical Weather Prediction has been employed to enhance the solar power forecasting accuracy by considering weather conditions [24]. Machine learning algorithms like the Long Short-Term Memory (LSTM) Neural Network have been utilized for short-term solar power forecasting [91]. The application of Empirical Mode Decomposition (EMD) has been explored for day-ahead solar power forecasts [25]. Techniques such as Elastic Net Regularization have been investigated to improve the precision of solar power forecasts [26]. Local vector autoregressive ridge modeling has been employed to predict solar power generation [20]. Lastly, the combination of the ESRA model with Heliosat-2 has been proposed as a hybrid satellite-based solar forecasting approach [92].
Clearly, the field of solar energy forecasting encompasses a wide range of AI techniques, reflecting the multifaceted nature of the problem. However, even though these techniques are a substantial part of the ongoing research, the process of selecting and applying them sometimes seems to be based on trial and error. This also applies to the selection of the variables that are used as inputs for AI models. In a sense, this is inherent to the black box that is introduced by a number of AI techniques. One cannot look inside a Neural Network and see why certain patterns are detected; one can only see that it is able to detect these patterns by analyzing the outcome. Thus, only by trying different techniques can one see which technique is best at detecting the right patterns. When subsequently trying to improve the forecasting technique, it can be hard to do so since the inner workings are not well known. This is a general issue in using AI, prompting new developments such as explainable AI (XAI) [93].

Local Weather Conditions
The accuracy of solar energy forecasting varies between countries and is influenced by the models and methodologies used and by the prevailing weather patterns of each country, as illustrated in Figure 8. This variability can be attributed to the unique weather characteristics experienced in specific regions. In the United Kingdom (UK), known for its cloud-dominant weather, a 10 min solar energy forecast achieves an RMSE of 115.6 W/m² [33]. The Netherlands, characterized by its cloudy weather with an average of 160 sunny days throughout the year, demonstrates a forecasting skill of 19.9% [36]. For the United States of America (USA), solar energy forecasts at 30 min intervals are reported. Spain, known for its abundant sunshine, has an RMSE of 126.7 W/m² and a forecasting skill of 23.3% [44]. China, with its diverse weather patterns, has forecasting intervals of 60 min and an RMSE of 134.9 W/m² [16]. Greece achieves the highest Economic Revenue for PV models, and Japan shows varying RMSE values for different forecast horizons [70]. Singapore, France, Germany, Uruguay, Australia, Denmark, Morocco, Mauritius Island, and Korea also have their own specific accuracy metrics, influenced by their respective dominant weather patterns [19,20,24-26,91,92].

Widely Employed Performance Metrics
Concerning performance metrics, Table 2 provides an overview of the frequently utilized metrics in solar-energy-forecasting research, which include RMSE, MAE, FS, MBE, nRMSE, MAPE, rRMSE, and CRPS, as illustrated in Figure 9. Notably, among these metrics, RMSE stands out as the most prevalent, with over 110 studies utilizing it to assess and verify the performance of various forecasting methods. The distribution of these metrics is further elaborated in Table 9. In addition, many other metrics are used as well, such as the Prediction Interval Normalized Averaged Width (PINAW) [81], Brier Skill Score (BSS) [71], Correlation Coefficient (R) [94], Normalized Peak Mean Absolute Error (nPMAE) [19], and Economic Revenue (ER) [37].
Table 9. The relative frequency of the most common performance metrics used in solar forecasting.

Comparison to Naive Forecasters
A common approach to assess forecasting methods is to compare the results to a naive forecaster, such as the persistence approach, by using the Forecast Skill and other metrics, e.g., the RMSE [16,33,71,86,95]. If there is an improvement compared to the naive approach, it is then claimed that the method in question shows promise, as there is a quantitative improvement. However, when everybody is compared to the worst performer in the class, everybody can claim to obtain good results. This raises the question of how promising the obtained results actually are, and it reflects the need for identifying state-of-the-art methods that can be used as high-standard reference methods.
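The comparison against a naive forecaster can be illustrated with a short Python sketch. It assumes the commonly used RMSE-based definition of the Forecast Skill, FS = 1 - RMSE_model / RMSE_reference, and a persistence forecast that simply carries the last observation forward; the function names and the sample values are hypothetical placeholders. Because the reference forecast is passed as an argument, the same function could equally be used with a stronger reference method than persistence.

```python
import math


def rmse(obs, fc):
    """Root mean square error between observations and forecasts."""
    return math.sqrt(sum((f - o) ** 2 for o, f in zip(obs, fc)) / len(obs))


def persistence_forecast(series, horizon):
    """Naive forecast: the value at t + horizon is assumed equal to the value at t."""
    return series[:-horizon]


def forecast_skill(obs, model_fc, reference_fc):
    """Skill relative to a reference: 0 means no improvement, 1 a perfect forecast."""
    return 1.0 - rmse(obs, model_fc) / rmse(obs, reference_fc)


# Placeholder irradiance series in W/m^2, for illustration only.
series = [300.0, 320.0, 310.0, 350.0, 400.0]
horizon = 1
observed = series[horizon:]                    # values to be forecast
naive = persistence_forecast(series, horizon)  # previous value carried forward
model = [315.0, 315.0, 340.0, 390.0]           # hypothetical model output
print(forecast_skill(observed, model, naive))
```

A skill above zero only shows that the model beats the chosen reference, which is why the choice of reference matters for how the result should be interpreted.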

Future Research-Recommendations
Reviewing the current state of the art and analyzing the most current research revealed some of the current shortcomings within the field of solar forecasting. Based on these shortcomings, recommendations for future research will be discussed in the following sections.

Creation of a Benchmarking Framework
As mentioned in the previous section, it is hard to accurately compare different forecasting methods since they depend on many different variables. This reflects the need for a more standardized approach to developing and assessing forecasting methods, which could be achieved through the development of a benchmark for solar forecasting. The International Energy Agency (IEA) has already included the creation of a benchmarking framework for solar forecasting in its PVPS Task 16 project as part of subtask 1 [69], which highlights the importance of this subject. Benchmarking is known to stimulate innovation, technical development, and the building of a stronger research community. Often, it has a strong, positive effect on maturing a scientific discipline. This can be attributed to the fact that the development of a benchmark requires a thorough evaluation of the field of research to identify key problems, which in turn requires good communication and collaboration [97,98]. Through the development of evaluation methods and procedures, benchmarking also helps to increase the reproducibility of research [99]. The development of a benchmark also helps to identify state-of-the-art methods. These state-of-the-art methods can then be used as high-standard references, instead of naive approaches, to better identify which methods are really pushing solar forecasting forward.
A good benchmark should provide a basic framework for action. It needs to be clear and straightforward, and it should describe the steps needed to perform a benchmark [97].
According to [98], a benchmark should have three components: (1) a motivating comparison, (2) a task sample, and (3) a performance measure. The motivating comparison reflects the purpose of a benchmark, as it allows for a comparison of methods to identify the best practices. At the same time, it should also describe the motivation for pushing the field of research forward, in this case, to allow for the better integration of solar energy. Thus, the motivating comparison illustrates the context in which the benchmark is operating. The task sample should showcase an example of the task that is at hand, i.e., using input data on atmospheric conditions to determine future solar irradiance for the better integration of solar energy. The performance measure used for a successful benchmark should not only be seen as a way to describe the characteristics of the method that is benchmarked, but it should also reflect the fitness of the method for the task at hand. Thus, to choose a good performance measure, the end-user should also be considered.

Creation of Publicly Available, Standardized Datasets
As mentioned before, one of the difficulties in comparing methods results from the fact that the performances are affected by many variables, and thus different input datasets can yield different performances. A starting point for assessing and benchmarking different methods would be standardized datasets [100]. However, ref. [100] also pointed out that there are only a few standardized datasets available, hindering researchers who do not have the resources to make their own meteorological measurements. To demonstrate the possibilities of solar forecasting, the use of a suitable dataset is essential [10]. Fortunately, some scholarly journals actively promote the publication of code and datasets alongside papers to enhance the reproducibility of the research [99]. The development of high-quality datasets for solar forecasting is also part of IEA-PVPS Task 16, subtask 1, indicating the importance of the matter.
A complete, standardized dataset should contain the following elements: (1) quality-controlled weather and irradiance data with, preferably, a 1 min time interval; (2) high-resolution sky images for the same location and time period; (3) Satellite Images for the same location and time period, with the same time interval; and (4) Numerical Weather Prediction data for the same location and time period [100]. Prompted by the above recommendations, Pedro et al. [100] released a standardized dataset of three years containing quality-controlled data with a 1 min resolution of GHI and DNI ground measurements in California, USA. These measurements are complemented by sky images, Satellite Imagery, and NWP forecasts for the same location. Another effort to create a standardized dataset, to accelerate solar forecasting research, was made by Nie et al. [101]. This dataset, however, was specifically aimed at short-term solar forecasting using ASI cameras. The dataset contains three years (2017-2019) of quality-controlled, down-sampled sky images and PV power generation data, which are prepared for deep learning methods. In addition, the high-resolution images are available for those who need them, together with sky video footage and a code base containing scripts for data processing and baseline modeling.
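The requirement that all four elements cover the same location and time period implies that the individual data streams have to be matched in time. The following Python sketch shows one simple way to do this, by keeping only the timestamps that are present in every source. It is an assumption-laden illustration and not the procedure used in [100] or [101]: the function names, the record fields, and the file names are hypothetical, and sources with coarser time steps would in practice need resampling before such a join.

```python
from datetime import datetime, timedelta


def common_timestamps(*sources):
    """Return the sorted timestamps that occur in every source mapping."""
    shared = set(sources[0])
    for source in sources[1:]:
        shared &= set(source)
    return sorted(shared)


def align(ground, sky_images, satellite, nwp):
    """Build one record per timestamp for which all four elements are available."""
    return [
        {
            "time": t,
            "ground": ground[t],
            "sky_image": sky_images[t],
            "satellite": satellite[t],
            "nwp": nwp[t],
        }
        for t in common_timestamps(ground, sky_images, satellite, nwp)
    ]


# Placeholder records for illustration only.
start = datetime(2020, 1, 1, 12, 0)
minutes = [start + timedelta(minutes=i) for i in range(3)]
ground = {t: {"ghi": 500.0 + 10.0 * i} for i, t in enumerate(minutes)}
sky_images = {t: f"sky_{i}.jpg" for i, t in enumerate(minutes[:2])}  # one image missing
satellite = {t: f"sat_{i}.png" for i, t in enumerate(minutes)}
nwp = {t: {"cloud_cover": 0.3} for t in minutes}
print(len(align(ground, sky_images, satellite, nwp)))  # 2
```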
Since these complete datasets are, in general, not readily available, data for solar research are often collected from multiple sources. However, many solar forecasting methods rely on data science and machine learning techniques and thus require large datasets. As a result, collecting and aligning datasets are often time-consuming processes [10]. When looking to put one's own dataset together, the first place to look would be datasets that are published alongside papers. These datasets typically have already been quality-controlled and matched, thus saving a lot of work. High-quality weather and irradiance data are obtained from in situ measurements performed by weather stations, buoys, and radiosondes. These data are often freely available for research purposes. Some well-known examples are the Baseline Surface Radiation Network (BSRN) and the Surface Radiation Budget Network (SURFRAD). However, these measurements are limited to the location of the measurement equipment. Ground data for other locations are often derived by using interpolation and Numerical Weather Prediction, thus losing some accuracy. To complement ground measurements, remotely sensed data from geostationary satellites could be used. While they are, in general, of lower accuracy than ground-based data, satellite data are often freely available. Some well-known sources for satellite data are EUMETSAT, NOAA, and NSMC. Stand-alone NWP models can also be a valuable source for data on atmospheric conditions [99]. Well-known examples of available NWP data sources are ECMWF and NOAA. There are also more locally focused weather predictions available, such as those of the Royal Netherlands Meteorological Institute (KNMI) for the Netherlands. For ASI images, an extensive list of 72 open-source sky image datasets was put together by Nie et al. [102]. This list also details whether or not the images are complemented with additional data, such as solar irradiance or the PV power output, and whether they are selected to be suitable for AI-driven methods.

Classification of Forecasting Sites
Part of creating a benchmark, or a solution for better comparing different forecasting methods, could be to work with a classification scheme for forecasting sites. This classification scheme should be based on the different factors that affect solar irradiance. The starting factor for differentiating forecasting sites is the local climate, which can be differentiated by using the Köppen-Geiger system. This system is already over 100 years old but is still widely used in a wide range of research disciplines that require climate classification [103]. Seasons should also be taken into account, as they strongly affect weather patterns. The choice of seasons should be linked to the Köppen-Geiger system, because some climate regions experience winter, spring, summer, and fall, while other regions only experience a dry and a wet season.
A classification scheme, as mentioned above, could also be the starting point for the development of a top-down approach for developing solar forecasting solutions, as this is currently lacking [3]. A classification scheme can help to highlight which techniques work best for which class of forecasting sites. This creates an opportunity to start with a focus on the techniques best suited to the forecasting site under consideration, since it is known that no system of forecasting (e.g., deep learning or physical modeling) performs best in all situations [104].
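As an illustration of how such a classification scheme could be used, the Python sketch below labels a forecasting site by its Köppen-Geiger class and season and then selects, per site class, the method with the lowest reported error. Everything here is hypothetical: the names are invented for this example, the rule that only the tropical main group "A" uses a dry and a wet season is a simplifying assumption, and the method names and RMSE values are placeholders rather than results from the reviewed studies.

```python
from dataclasses import dataclass

FOUR_SEASONS = ("winter", "spring", "summer", "fall")
WET_DRY = ("dry", "wet")


@dataclass(frozen=True)
class SiteClass:
    koppen_geiger: str  # e.g. "Cfb" (temperate oceanic) or "Aw" (tropical savanna)
    season: str


def seasons_for(koppen_geiger):
    """Assumption: tropical climates (main group 'A') use dry/wet, others four seasons."""
    return WET_DRY if koppen_geiger.startswith("A") else FOUR_SEASONS


def classify(koppen_geiger, season):
    """Create a site class, rejecting seasons that do not apply to the climate."""
    if season not in seasons_for(koppen_geiger):
        raise ValueError(f"season {season!r} not defined for climate {koppen_geiger!r}")
    return SiteClass(koppen_geiger, season)


def best_method_per_class(results):
    """results: iterable of (SiteClass, method name, RMSE); the lowest RMSE wins."""
    best = {}
    for site_class, method, error in results:
        if site_class not in best or error < best[site_class][1]:
            best[site_class] = (method, error)
    return best


# Placeholder results for illustration only.
results = [
    (classify("Cfb", "winter"), "method_a", 120.0),
    (classify("Cfb", "winter"), "method_b", 95.0),
    (classify("Aw", "wet"), "method_a", 80.0),
]
print(best_method_per_class(results)[SiteClass("Cfb", "winter")])  # ('method_b', 95.0)
```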

Value of Expert Variables, Artificial Intelligence, Preprocessing, and Postprocessing
In solar forecasting, using expert variables, AI techniques, preprocessing, and postprocessing approaches has proven to be significant for enhancing forecast reliability and accuracy. Expert variables derived from domain knowledge and expertise can give significant insights and improve the modeling and forecasting processes. These variables may include atmospheric conditions, cloud properties, solar geometry, and historical data patterns [105,106].
AI approaches, such as machine learning and deep learning, have shown considerable promise in identifying complex correlations and patterns in input data, resulting in more accurate predictions. These approaches may be used at several phases of the forecasting process, such as data preprocessing, feature extraction, model training, and prediction. Forecast models can adapt to changing conditions and improve their effectiveness over time through the use of AI algorithms [107,108].
Preprocessing is crucial for cleaning, filtering, and transforming raw data from various sources, including Sensor Networks, satellite imaging, NWP models, and sky cameras. Preprocessing procedures such as data fusion, quality control, and outlier detection ensure that the input data are correct, dependable, and appropriate for forecasting. This stage aids in minimizing noise, correcting biases, and dealing with missing data, ultimately enhancing forecast quality [109,110].
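Two of the preprocessing steps mentioned above, quality control and the handling of missing data, can be sketched as follows. The example is deliberately simple and rests on assumptions: readings outside a plausible range are discarded, with the upper limit of 1500 W/m² chosen purely for illustration, and interior gaps are filled by linear interpolation. The function names are hypothetical, and operational quality control schemes are considerably more elaborate.

```python
def quality_control(values, lower=0.0, upper=1500.0):
    """Replace missing or physically implausible readings with None.

    The limits are illustrative assumptions, not a recommended standard.
    """
    return [v if v is not None and lower <= v <= upper else None for v in values]


def fill_gaps(values):
    """Linearly interpolate interior gaps; leading and trailing gaps stay None."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Placeholder irradiance readings in W/m^2 with one negative value,
# one implausible spike, and one missing sample.
raw = [500.0, 520.0, -5.0, 560.0, 9999.0, None, 620.0]
clean = fill_gaps(quality_control(raw))
print(clean)  # [500.0, 520.0, 540.0, 560.0, 580.0, 600.0, 620.0]
```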
Postprocessing methods are employed to refine and improve the forecast outputs. These techniques can include statistical analysis, error correction, bias adjustment, and ensemble modeling. Postprocessing helps in calibrating the forecasts, reducing systematic errors, and providing more reliable and consistent predictions. By combining different forecasting models or utilizing ensemble techniques, the uncertainty in the forecasts can be quantified, and better decision making can be facilitated [10,111].
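A minimal Python sketch of two of these postprocessing ideas is given below: a bias adjustment that subtracts the mean error observed in past forecasts, and an ensemble summary that reports the mean and spread across several member forecasts as a simple indication of uncertainty. The function names and numbers are hypothetical placeholders, and the constant bias correction is only one of many possible calibration approaches.

```python
from statistics import mean, pstdev


def mean_bias(past_obs, past_fc):
    """Average forecast error (forecast minus observation) over a past period."""
    return mean(f - o for o, f in zip(past_obs, past_fc))


def bias_correct(forecast, bias):
    """Remove a constant systematic error from a new forecast."""
    return [f - bias for f in forecast]


def ensemble_summary(member_forecasts):
    """Per time step, return the mean and the spread (population standard deviation)."""
    return [(mean(step), pstdev(step)) for step in zip(*member_forecasts)]


# Placeholder values in W/m^2, for illustration only.
bias = mean_bias([100.0, 200.0, 300.0], [110.0, 215.0, 305.0])  # 10.0
member_a = bias_correct([400.0, 500.0], bias)                    # [390.0, 490.0]
member_b = [410.0, 510.0]                                        # a second, hypothetical model
print(ensemble_summary([member_a, member_b]))  # [(400.0, 10.0), (500.0, 10.0)]
```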
Open-source data and benchmarking are crucial in the field of short-term solar forecasting. Open datasets provide a standardized platform for comparing different forecasting methods and evaluating their performance. Applying benchmarking exercises allows researchers and practitioners to assess the strengths and weaknesses of different techniques, identify best practices, and drive innovation in the field [10,112]. It is important, however, to exercise caution and not blindly trust AI models. While AI techniques can significantly enhance forecasting capabilities, they should be used as tools that assist human experts rather than replacing them entirely. Human expertise and judgment remain critical in interpreting results, validating forecasts, and making informed decisions in complex situations [10,112].
Explainable AI (XAI) techniques play a vital role in ensuring the transparency and interpretability of AI models in solar forecasting. XAI techniques aim to provide insights into the inner workings of AI models and explain the reasons behind their predictions. By making the decision-making process of AI algorithms more transparent, XAI enables users to understand how and why a certain forecast or outcome was produced. This fosters trust, facilitates model validation, and helps identify potential biases or limitations in the AI system. Incorporating XAI techniques into solar forecasting models can enable stakeholders to have a better understanding of the underlying factors influencing solar energy predictions and make more informed decisions based on the forecasted results [113,114].
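One widely used model-agnostic XAI technique is permutation importance, which is named here as an example and is not taken from the cited studies: each input variable is shuffled in turn, and the resulting increase in forecast error indicates how strongly the model relies on that variable. The Python sketch below applies the idea to a toy model, assuming a hypothetical function in which irradiance depends only on the cloud fraction; the names and values are placeholders, and a trained Neural Network could be passed in place of the toy model.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of a model (a callable) on the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, seed=0):
    """Increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mse(model, shuffled, targets) - baseline)
    return importances


def toy_model(row):
    """Hypothetical model: irradiance depends on cloud fraction only."""
    cloud_fraction, _unrelated = row
    return 800.0 * (1.0 - cloud_fraction)


# Placeholder inputs: [cloud fraction, unrelated variable].
rows = [[i / 10, float(i % 3)] for i in range(8)]
targets = [toy_model(r) for r in rows]
print(permutation_importance(toy_model, rows, targets))
```

In this toy case, shuffling the unrelated variable leaves the error unchanged, so its importance is zero, whereas shuffling the cloud fraction increases the error.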
Looking to the future, advancements in Sensor Networks, Satellite Imagery, NWP models, and sky cameras, coupled with ongoing research in AI and data-driven approaches, hold promising prospects for short-term solar forecasting. Continued improvements in data quality, model algorithms, and integration techniques are expected to further enhance the accuracy, reliability, and usability of solar energy forecasts, ultimately contributing to the more efficient and effective utilization of solar resources [95,115].

Extreme Weather, Outliers, and AI
The latest IPCC report [116] states that human-induced climate change has influenced extreme weather events such as heat waves, heavy precipitation, droughts, and tropical storms and has also increased the chance of compound events, including consecutive heat waves and droughts. Renewable energy sources are directly affected by the availability of sufficient water, wind, or solar radiation. This makes the renewable energy supply susceptible to extreme weather events. Recent events such as long-lasting high temperatures, droughts, and low wind speeds have already proven to be able to impact energy prices and the security of energy services [117]. The increasing likelihood of these weather events stresses the need to consider extreme weather events in future energy-forecasting research, although it is hard to predict how these events will affect energy forecasting and solar forecasting in particular.
While AI techniques have shown promise in improving solar forecasting, they may not always be suitable for predicting extreme weather conditions such as lightning. Extreme weather events introduce outlier events that are challenging for AI methods to detect, as they deviate from normal trends. Lightning occurrences, for instance, are highly dynamic and unpredictable, making them difficult to capture accurately by using traditional AI approaches. The complex nature of lightning and its relationship with solar energy production requires a deep understanding of atmospheric physics and the intricate interplay between various meteorological variables. Research studies have highlighted the limitations of AI techniques in capturing and predicting lightning events for solar forecasting. For example, a recent study [118] demonstrated that while AI models showed promising results in the overall solar forecasting accuracy, they struggled to accurately predict lightning-related disruptions in solar power generation. Similarly, ref. [119] found that AI models trained on historical weather data had limited capability in capturing the spatiotemporal patterns associated with lightning occurrences. These findings indicate that specialized models and techniques, such as physics-based models or data-assimilation methods, may be required to incorporate lightning-related information into solar forecasting systems effectively. Therefore, while AI techniques have demonstrated advancements in solar forecasting, they may face limitations in accurately predicting and accounting for extreme weather conditions like lightning, highlighting the need for further research and development in this area.

Conclusions
Solar energy has become an essential element in the global shift toward renewable energy sources. Accurate solar forecasting is crucial to effectively integrate solar power into the energy grid, optimizing energy management and reducing operational costs. However, understanding the various techniques employed in this field remains challenging due to the diversity of input data sources and limited explicit mentions in the available literature.
In this comprehensive review, we conducted an in-depth analysis of commonly used techniques and data sources in solar forecasting. Through an extensive examination of relevant studies, we identified key methodologies for predicting solar irradiance and the PV power output. We explored various aspects, including time horizons, time resolution, methodologies, data sources, and evaluation metrics employed in this field of research, emphasizing their significance in decision making and resource optimization within the renewable energy sector.
Our findings revealed that solar-energy-forecasting research utilizes a wide range of techniques and metrics to assess the accuracy and performance of models. AI techniques, including deep learning architectures, supervised learning models, regression models, support vector regression, and ensemble learning, as well as physics-based techniques, have been extensively employed in this field [33,44,70]. These approaches have been implemented in numerous countries, such as the United Kingdom, the Netherlands, the United States of America, Spain, China, and others [16,19,36].
Specifically, deep learning architectures like Recurrent Neural Networks and Long Short-Term Memory (LSTM) Neural Networks have shown promising results in short-term solar energy forecasting [33]. The incorporation of satellite data, sky-camera imagery, and Numerical Weather Prediction has been explored to enhance the accuracy of solar radiation predictions [16,20]. Clear-sky irradiation models, such as the European Solar Radiation Atlas (ESRA), have been combined with other algorithms to improve solar power nowcasting [26]. Additionally, hybrid approaches that combine satellite data with Artificial Intelligence methods or time series models have been proposed to achieve more accurate solar power forecasts [24,67].
However, it is important to note that the accuracy of solar energy forecasting varies across countries due to the influence of models, methods, and the dominant weather patterns specific to each region. The unique weather characteristics of each country result in varying accuracy for different forecast horizons [36]. For example, in the UK, with its cloud-dominant weather, a 10 min solar energy forecast achieves an RMSE of 115.6 W/m² [33]. The Netherlands, characterized by its cloudy weather and an average of 160 sunny days throughout the year, demonstrates a forecasting skill of 19.9% [36]. Spain, known for its abundant sunshine, has an RMSE of 126.7 W/m² and a forecasting skill of 23.3% [44]. China, with its diverse weather patterns, has forecasting intervals of 60 min and an RMSE of 134.9 W/m² [16]. These variations underscore the need for continuous research and development efforts to improve forecasting accuracy further and tailor approaches to specific regional conditions.
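To make the two metrics quoted above concrete, the following minimal sketch computes an RMSE and a forecasting skill score. It assumes the commonly used definition of skill as one minus the ratio of the model RMSE to the RMSE of a naive persistence forecast; the reviewed studies may use a different reference (for example, a clear-sky-corrected "smart" persistence). The function names and the irradiance values are hypothetical and purely illustrative.

```python
import math


def rmse(forecast, observed):
    """Root mean square error between two equal-length sequences."""
    if len(forecast) != len(observed) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def forecast_skill(forecast, observed, horizon=1):
    """Skill relative to naive persistence (assumed definition).

    Persistence predicts that the value at step t + horizon equals the
    observation at step t. Only samples for which that reference exists
    are scored. Skill = 1 - RMSE_model / RMSE_persistence.
    """
    target = observed[horizon:]          # values to be predicted
    model = forecast[horizon:]           # model forecasts for those values
    reference = observed[:-horizon]      # persistence forecasts
    reference_rmse = rmse(reference, target)
    if reference_rmse == 0:
        raise ValueError("persistence error is zero; skill is undefined")
    return 1.0 - rmse(model, target) / reference_rmse


# Hypothetical GHI series in W/m² (illustrative values, not from any study).
observed = [400.0, 450.0, 300.0, 350.0, 500.0]
forecast = [410.0, 440.0, 320.0, 340.0, 480.0]

model_rmse = rmse(forecast, observed)
skill = forecast_skill(forecast, observed, horizon=1)
print(f"RMSE: {model_rmse:.1f} W/m², skill vs. persistence: {skill:.1%}")
```

A positive skill means the model outperforms the naive reference, while a value at or below zero means it adds nothing over persistence. Because RMSE is expressed in W/m² and depends on local irradiance levels, a skill score against a shared reference is usually the more meaningful quantity when comparing sites with different climates.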
Furthermore, the reviewed studies on solar energy forecasting encompass a wide range of time horizons. The longest time horizons range from 21 to 24 h, enabling predictions of the mesoscale surface solar irradiation distribution and the impact of large-scale clouds [62,84]. Short-term forecasting, on the other hand, covers time horizons of 1 min to about 1 h and utilizes diverse techniques such as localized forecasting, nowcasting techniques, and probabilistic forecasts [71,76,81]. Other methods, including prediction based on near-real-time satellite data, ensemble forecasting, and deep learning models, have also been explored [65,82,88,89]. Additionally, studies have focused on cloudiness forecasting, LSTM Neural Networks, ultra-short-term PV power forecasting, and hybrid approaches [67,72,79].
In identifying current research trends within the field of solar forecasting, we also identified a number of shortcomings and problems. The main problem is the difficulty of comparing different methods, owing to the huge variety of techniques and data used for solar forecasting. Therefore, the development of a benchmarking framework or top-down approach is recommended. There is also a lack of standardized datasets; such datasets are often not readily available, which hinders research. A multitude of data sources is available, but they need to be combined and aligned, which is often a time-consuming process. To enable a better comparison of methods, and to better specify datasets and a benchmarking framework, solar forecasting sites should be classified based on local climate and weather patterns.
While AI, especially when combined with preprocessing and postprocessing, has proven to significantly improve forecasting, it is important not to trust it blindly. The use of Explainable AI can shed light on the inner workings of these techniques, allowing us to better identify biases and limitations, such as the handling of outlier weather events. AI should be used as a tool that assists experts, not one that replaces them. Human expertise remains vital for interpreting results and for creating expert variables that can be used for forecasting.
In conclusion, this review provides valuable insights into the landscape of solar forecasting research, shedding light on the diverse techniques and data sources applied in this dynamic field. By understanding both the strengths and limitations of existing methodologies, researchers and practitioners are empowered to make informed decisions related to solar energy integration, management, and cost optimization. These decisions play a pivotal role in driving the successful transition toward a renewable energy future.
However, in light of the presented findings, a crucial question emerges: how can we further refine the accuracy and reliability of solar energy forecasting to align with the evolving requirements of renewable energy integration and management? This inquiry encourages exploration and dialogue on potential pathways for enhancing forecasting techniques. This includes the incorporation of emerging data sources, the refinement of evaluation metrics, and the tailored addressing of challenges unique to various regions and weather conditions.
Effectively addressing this question calls for sustained efforts in research and development, the establishment of benchmarks that set high standards for comparison, and the availability of comprehensive datasets. This concerted approach is essential for unlocking the full potential of solar energy forecasting [16,33,86].
Author Contributions: C.H. and K.B.: conceptualization, methodology, data collection, formal analysis, investigation, writing-original draft, and visualization. S.G.: conceptualization, methodology, resources, writing-review and editing, supervision, and project administration. W.v.S.: conceptualization, methodology, resources, writing-review and editing, supervision, project administration, and funding acquisition. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the Ministry of Economic Affairs and Climate within the framework of the Topsector Energy via the Netherlands Enterprise Agency (RVO), grant reference TSE1220051, and grant reference TEUE1821406 (project Solar Forecasting with All-Sky Imagers, SolFaSi).

Figure 1. A block diagram of AI-based solar forecasting techniques.

Figure 2. Flowchart illustrating the steps involved in short-term solar forecasting. Abbreviations are explained further in the text: AI: Artificial Intelligence, LP: Linear Programming, OA: Optical Analysis, GHI: global horizontal irradiance, PV: photovoltaics, DNI: direct normal irradiance, CCI: Cloud Clearness Index.

Figure 3. Word webs of (a) AI techniques and (b) different data sources for solar energy forecasting, created by using VOSviewer (Version 1.6.18) [12]. Additional acronyms can be found in the abbreviations.

Figure 4. A visualization of Dynamic Time Warping, adapted from [55]. (A) Normal alignment of two signals, where a measurement at t = i in signal 1 corresponds to the measurement of signal 2 at t = i. (B) Illustration of how the time index of the second signal can be warped to better match the shape of the first signal, thus matching measurements not only by their timestamp but also by the signal shape.

Figure 5. The number of results returned by Google Scholar per search term for solar energy forecasting, including "Solar Forecasting", "Satellite", "Sensor Networks", "All Sky Camera", "Numerical Weather Prediction", and "Artificial Intelligence".

Figure 7. The relative number of publications per search term, compared to the total number of publications per year found by using the search term "Solar Forecasting". This gives an indicative view of the distribution of methods used within the field of solar forecasting.

Figure 8. Geographical distribution of solar forecasting investigations utilizing AI techniques.

Table 1. Confusion matrix for further investigating how Ramp Rate is affected by changing D and t.

Table 3. The relative frequency of data sources used for solar forecasting.