Reliability Predictors for Solar Irradiance Satellite-Based Forecast

: The worldwide growing development of PV capacity requires an accurate forecast for a safer and cheaper PV grid penetration. Solar energy variability mainly depends on cloud cover evolution. Thus, relationships between weather variables and forecast uncertainties may be quantiﬁed to optimize forecast use. An intraday solar energy forecast algorithm using satellite images is fully described and validated over three years in the Paris (France) area. For all tested horizons (up to 6 h), the method shows a positive forecast skill score compared to persistence (up to 15%) and numerical weather predictions (between 20% and 40%). Di ﬀ erent variables, such as the clear-sky index ( K c ), solar zenith angle (SZA), surrounding cloud pattern observed by satellites and northern Atlantic weather regimes have been tested as predictors for this forecast method. Results highlighted an increasing absolute error with a decreasing SZA and K c . Root mean square error (RMSE) is signiﬁcantly a ﬀ ected by the mean and the standard deviation of the observed K c in a 10 km surrounding area. The highest (respectively, lowest) errors occur at the Atlantic Ridge (respectively, Scandinavian Blocking ) regime. The di ﬀ erences of relative RMSE between these two regimes are from 8% to 10% in summer and from 18% to 30% depending on the time horizon. These results can help solar energy users to anticipate—at the forecast start time and up to several days in advance—the uncertainties of the intraday forecast. The results can be used as inputs for other solar energy forecast methods.


Introduction
Since the year 2000, the cost of PV modules generation has significantly decreased, increasing the competitiveness of PV power against electricity generated from fossil fuels [1]. Concomitantly, financial incentives for the use of low-carbon energy have been established all over the world by various institutions. Consequently, the total PV capacity installed in the world has been multiplied by 14.4, from 40.2 GW in 2010 to 580.2 GW in 2019. During the same period, the total capacity of all renewable energies doubled from 1226.6 to 2536 GW [2]. The dramatic increase in PV production raises the issue of its massive penetration into the grid. The injection of any power source in the grid must be balanced at each instant by an equal consumption to ensure the safe and stable operation of the grid at a constant frequency.
However, the production of solar energy is essentially driven by the solar irradiance reaching the Earth's surface, quantified as global horizontal irradiance (GHI). This physical quantity is highly variable for two main causes. The first one is astronomic: the irradiance depends deterministically on the diurnal and seasonal variations of solar elevation above the horizon. The second one is meteorological: atmospheric components (water vapor, aerosols and mainly clouds) significantly attenuate the solar radiation passing through the atmosphere and reaching a solar energy production system. In particular, • helping dispatch operators to propose the cheapest and safest electric mix according to user's consumption and energy availability. • limiting financial risks for electricity distributor buying (respectively, producer selling) PV power on the electricity market. • designing optimal rules for the management of micro-grids including at least a PV system and a battery storage.
In particular, intraday horizon forecasts (typically up to 6 h) are required more and more for the grid power reserve management, the increase in intra-day auctions in the electricity market [6] and the current worldwide development of micro-grids.
Forecasting the surface solar irradiance enables us to forecast PV production using proven models converting GHI into PV power production [7,8]. It does not require any PV power historical dataset. It has the advantage of providing an operational forecast even if the PV output measurements cannot be provided in real-time. GHI forecasts also permit the assessment of the predictability of a future PV farm not yet built. During the last decade, numerous studies proposed PV production forecast methods. A large proportion of them have been surveyed by different authors [9][10][11][12][13]. These publications agree that each forecasting technique is suitable for a particular time horizon range.
At the intraday scale, time series modeling approaches are often used. A wide range of methods modeling time-series using machine learning techniques have been developed [14]. Such methods are trained on historical data and offer direct PV power forecasts if long-term historical datasets of this quantity are available. More advanced methods use hybrid approaches [15] or include a variety of environmental data to assimilate more information driving PV power variability [16]. Nevertheless, these approaches depend on the behavior of the training datasets. Applications to other regions or periods requires retraining based on available historical data [17].
Other appropriate methods for this time scale are based on cloud cover forecasting using images from the geostationary meteorological satellite. These techniques can be applied at any geographical site covered by the satellite without using training datasets. Moreover, it enables us to overcome the underestimation of cloud cover by classical numerical weather prediction models (NWP) [18]. Some studies demonstrated the superior performance of such techniques compared to NWP, over the USA [19], Germany [20] or Netherlands [21]. Satellite-based forecast is today operational and currently distributed as a commercial product [22,23].
The performance of such methods is very sensitive to the cloud cover state and evolution. For instance, methods based on the temporal extrapolation of cloud motion present better results for passing cloud events than for sudden cloud appearance or disappearance [24]. Then, reliability predictors for such algorithms could be identified among various observable and predictable weather data. Synoptic situation, defined by a weather regime and the surrounding cloud cover distribution, may influence the GHI forecast accuracy. So far, most of the published studies focus on the benefits for end-users to prefer satellite-based methods rather than NWP forecasts at the intra-day scale. Nevertheless, relationships between atmospheric state and forecast performance have been identified for day-ahead NWP-based forecasts. This showed, for instance, that the highest forecast errors occur for the highest solar elevation angle and intermediate cloud coverage states [25]. However, such studies have never been undertaken for algorithms forecasting solar irradiance using satellite-based cloud observations as the main input. Knowledge of the relationship between satellite-based forecast accuracy and weather predictors could be used to announce a magnitude of order of intraday PV forecast errors for day or week-ahead weather forecasts. This information would help grid managers to prepare the sizing of storage capacity or ancillary electricity resource, considering the announced error range.
Firstly, this paper presents and assesses the performance of a GHI intraday forecast method based on the extrapolation of cloud cover observed by the Spinning Enhanced Visible and Infrared Imager (SEVIRI) on board of the geostationary satellite Meteosat Second Generation (MSG). A cloud motion vector field (CMV) originally designed to determine wind field from MSG data has been applied. The algorithm has been tested for over 3 years (from July 2017 to June 2020), a period length exceeding the length of most validation processes of similar state-of-the-art methods. Performance has been assessed using ground measurements at the Baseline Surface Radiation Network (BSRN) site of Palaiseau (France) [26] on the Site Instrumental de Recherche par Télédétection Atmosphérique (SIRTA) atmospheric observatory [27] and compared with a persistence algorithm as well as with GHI outputs of the Météo-France Action de Recherche Petite Echelle Grande Echelle (ARPEGE) NWP model. Secondly, score results have been analyzed according to the different weather regimes occurring in the northern Atlantic and known to impact the meteorological situation over the Paris area [28], as well as surrounding cloud patterns observed from SEVIRI images. In this paper, Section 2 describes our forecast algorithm, the data used and the assessment methodology. Section 3 presents the selected weather predictors and their known influence on actual surface solar irradiance. Section 4 shows the forecast accuracy results and discusses them according to the selected predictors. Section 5 concludes this work.

Preliminaries: Using Satellite Data to Assess GHI
Since the late 1970s [29], images from the geostationary meteorological satellite have been used to map the solar energy yield by assessing the GHI at ground level. Even today, such imagery ( Figure 1) offers the best compromise of cloudiness distribution mapping and monitoring in terms of wide coverage (almost one third of the Earth's surface), spatial resolution (<4 km) and time sampling (5-15 min).
Numerous works continued this activity, following technical enhancement of space borne sensors. The so-called Heliosat method [30][31][32][33][34] proposed common principles to numerous algorithms estimating surface solar radiation from satellite images.
The fundamental principle of Heliosat consists of quantifying the cloudiness of a given pixel by computing a cloud index n (also called "cloud albedo" [34]), which is based on the normalized difference between the reflectance observed by the broadband visible channel of the satellite with the one simulated if the atmosphere were free of cloud at the same time for the same pixel. n has a value ranging theoretically between zero (clear sky) and one (overcast sky). A clear-sky index K c is deduced from n through a quasi-linear relationship, varying according to Heliosat method version (1). K c can be considered as an attenuation factor of surface solar radiation only due to cloud presence.
Using a clear-sky model assessing the GHI under clear-sky (GHI cs ), the GHI at ground level is computed as: The use of satellite images to map the GHI is advantageous because it enables to obtain a solar energy yield estimation with a constant and satisfactory accuracy for various parts of the globe. It avoids the costly installation and maintenance of meteorological instruments in every site of interest.

Overview of Satellite-Based Intraday Forecast Methods
The cloud index derived from a satellite image has been initially designed to assess the GHI from a satellite image. However, it has been considered as a fundamental input for a forecast algorithm. Indeed, cloud index temporal variation contains almost the totality of the stochastic component of GHI temporal evolution. Hammer et al. [35] proposed a GHI forecast scheme by predicting statistically, the evolution of the cloud index map. Lorenz et al. [36] set up a forecasting algorithm by deriving cloud motion vectors (CMV) from two subsequent cloud index maps; the obtained CMV field is then used to extrapolate the cloud index map at the time when the forecast is initiated. The extrapolated value of cloud index is then converted into the forecast GHI using (1) and (2). The GHI cs is assumed to be easily predictable as it depends on the solar position-precisely calculated with astronomical equations-and atmospheric optical depth, assumed to be constant at the intraday scale. Cloud motion vector derivation from geostationary meteorological satellite images is used, and has been applied for decades, to derive large-scale wind fields for operational weather forecasts [37]. So far, [36] have been widely cited as the reference work using CMV to forecast solar radiation from satellite data. In this field, many studies were published using this generic principle (described in Figure 2) through different versions.
Like [36], the algorithms developed by [21,[39][40][41][42][43] include a CMV field computed from the satellite data, whereas those from [44,45] import a wind vector field from a NWP model. Both approaches extrapolate a cloud index image from their CMV fields. More recent works based on machine-learning techniques use the CMV field as an input among others for bi-dimensional time-series modeling of cloud index maps [46,47]. All of these studies could demonstrate the pertinence of their algorithm by comparing their accuracy against NWP or persistence algorithms. However, these three different approaches have not been yet compared over the same measurement points, during the same period and using identical error metrics [43]. In this present work, we implement a forecasting method based on the principle of [36] using a specific method CMV computation method built from Meteosat images and originally dedicated to derive synoptic wind for operational weather forecast. The process follows the diagram presented in Figure 2 with several differences detailed in Section 2.3.

Validation Data
Forecast results have been assessed using surface solar radiation measurements from the BSRN station of Palaiseau located at the SIRTA observatory (France, '48.713 • N; 2.208 • E', 157 m above average sea level). The GHI measured at 1 min temporal resolution by a pyranometer (Kipp and Zonen CM22) is used in this present study.

Satellite Data
SEVIRI is a 12-channel imager on the MSG geostationary satellites operated by EUMETSAT. In this work, we used the observations acquired from 01 July 2017 to 30 June 2020 mainly from MSG-3 and MSG-4 and punctually from MSG-1 when MSG-4 has been in maintenance. These data are delivered in real-time by the SATMOS service from Météo-France under the form of NetCDF files containing calibrated radiance. Images are continuously archived at Institut Pierre-Simon Laplace (IPSL) in Palaiseau.
The computation of CMV and the forecast algorithm uses the high-resolution visible channel (HRV) observing the Earth through a broadband from 0.4 to 1.1 µm. The spatial resolution of HRV data is 1 km at nadir, corresponding approximately to a pixel of 2 × 1.2 km 2 over Palaiseau (respectively, 3 km at nadir and 4 × 6 km 2 for infrared channels). The temporal resolution is 15 min for both channels.
MSG images are time stamped every 00, 15, 30 and 45 min of the hour. SEVIRI starts to scan the whole Earth disk at each time stamped instant. The scan duration is approximately of 12.5 min. Observation time of the pixel surrounding SIRTA observatory takes place 11 min after time stamped instant. Thus, cloud index can be derived at each 11, 26, 41 and 56 min of an hour.

ARPEGE Outputs
The NWP model ARPEGE is one of the operational forecast model used at Météo-France [48]. We use the output "surface solar radiation downwards" of ARPEGE, as a reference model to assess the satellite-based forecast. This model is operational as a global forecast system with a horizontal resolution of about 7.5 km over mainland France. Its initial conditions are determined by a 4D-Var data assimilation of observations. ARPEGE is a spectral hydrostatic model with a presentation in finite elements on the vertical direction. It presents 105 vertical levels from 10 m. to 75 km above sea level.
For each day, we selected the run initialized at 12:00 UTC for 0-24 h ahead with an hourly time resolution. Outputs have been linearly interpolated to coincide with MSG timestamps. Spatial resolution of this global model is 5 km over Europe.

Cloud Index Computation
The conversion of HRV radiance into cloud index is made with an in-house version of the Heliosat method. This version is defined by the way to compute the different terms of the cloud index n defined in [33] and [31] by (3): ρ is the reflectance derived from the calibrated radiance L observed by the sensor through the following equation: where L is the calibrated radiance observed by the sensor, d is the Earth-Sun distance in AU temporally corrected by the Earth orbit eccentricity, I 0 is the total solar irradiance for the HRV channel and θ is the solar zenith angle (SZA) above the given pixel. ρ c is the maximum observable reflectance assuming that it is due to the brightest cloud. According to [34], we determine ρ c by the 95th percentile of all reflection values at local noon in the target region. We set this value to 78% for the whole studied period. Sensor change between MSG-1, -3 and -4 did not impact significantly this 95th percentile value. ρ g is the ground albedo, typically the albedo of a pixel if the sky is cloud free. This is usually the relative minimum of the reflections during a certain time span (e.g., a month) derived for each pixel of the satellite image. In order to prepare our algorithm for operational application, we undertook an internal study, using almost 5 years of data, leading to the best GHI retrieval results with minimal operational constraints. Then, ρ g was computed as the 4th percentile of all reflection values at a given time slot for a given month during the two years preceding a given month. The 4th percentile value has been selected as a good compromise to obtain ρ g minimum values, avoiding secondary minimums and outliers. For example, for a GHI estimation from an MSG image acquired on any day of September 2020 at 12:00 UTC, ρ g is the 4th percentile for all ρ values observed at 12:00 UTC during September 2018 and 2019. The retrievals are limited to θ less than 78 • .

Cloud Motion Vector Field Computation
The algorithm computing cloud motion vector is detailed in [49]. It has been originally conceived to extract atmospheric motion vectors used as inputs of operational weather forecast models. Instead of using cloud index images as shown in Figure 2, this CMV computation method uses radiances. The process includes two main steps:

•
In the first step, a CMV field is produced with a block-matching method based on the minimization of the sum of squared differences (SSD, or Euclidean distance) of pixel values (radiances). For each vector, if the forecast is initiated at time T0, a square of 36 × 36 pixels-called target window-is selected on the image acquired at T0-15 min (image 1). A square of 96 × 96 pixels-the search window-is then selected in the image acquired at T0 (image 2) with the same center as the target window of image 1. The target window of image 1 is displaced pixel-by-pixel in all directions, over all possible positions allowed inside the search window of image 2. The relative position of the two windows presenting the minimal SSD is marked as the motion vector tip, and thus defines the CMV. In practice this process is applied to all the predefined positions of a regular grid and produces a CMV vector field between T0-15 min and T0. In this study we do not take into account the height (or level) of the tracked clouds. The resulting CMV fields may be composed of vectors at different levels. When clouds are present in different, overlapping layers, the satellite generally "sees" the clouds of the uppermost layer in the visible channel. • A second CMV field has to be calculated over the same grid with the previously described procedure between the image acquired at T0-30 min (image 0) and the image acquired at T0-15 min (image 1), in order to detect unrealistic temporal evolution of the tracked cloud or structure during the half-hour period before T0, the forecast time, by comparing both vector fields. In practice, this second CMV field is extracted before the one derived between T0-15 min and T0).
Particularly in a heterogeneous cloud field, this process can lead to spatial incoherence between neighboring vectors or unrealistic directions or speed variations for some vectors. Thus, a series of quality tests is applied as a second step with the following criteria: • Suppression of null vectors (corresponding often to cloud free areas).

•
Suppression of vectors with a norm exceeding a realistic wind speed. • Temporal consistency test: each vector calculated between T0-15 min and T0 is compared to its predecessor computed at the same location in the images acquired at T0-30 min and T0-15 min. If the difference in vector direction or the ratio of magnitude (proportional to the wind speed) of both vectors, exceed fixed thresholds, the vector is suppressed. • Spatial consistency test: each vector is compared with its nearest neighbors. Similarly, if the difference in direction or the magnitude ratio exceeds fixed thresholds, the vector is suppressed.
In practice, the trackable clouds almost always stay at the same level over 1/4 h, the period between two consecutive satellite images used for CMV calculation. This may not be the case in the following situations:

•
When clouds appear or dissipate (inside the area covered by the target window), they are only present on one image of a pair.

•
Sometimes two clouds or cloud groups (or more) can be present at different levels inside a target window and can move at a different speeds or in different directions. The derived CMV may correspond to the motion of one of the clouds (generally the largest one, or in some cases the thickest cloud when semi-transparent clouds (cirrus) are also present. Or the CMV may correspond to some "mixture" of the motion of both clouds.
In both situations the resulting vector generally shows important differences in speed and/or direction with its neighbors, and/or with its collocated predecessor or successor, and therefore will be suppressed by quality tests, mainly by temporal and spatial consistency tests.
In the current implementation, a CMV is computed over an area of 513 × 513 pixels centered on Palaiseau. A vector is computed every 36 pixels. Figure 3 shows an example of computed CMV before and after the consistency tests. It can be seen that the consistency tests suppressed two very long vectors near the south-west corner of the image.

Cloud Index Image Extrapolation and Post-Processing
At a forecast initiation time T0, once the cloud index and the CMV have been computed, the latter is applied on the cloud index derived from the satellite image acquired at T0 for its extrapolation. Every pixel included in the 36 × 36 pixels block containing a vector is shifted according to its norm and direction. Thus, the cloud index values stay the same, while their position can change.
The extrapolation is made for every time horizon h from 15 to 360 min with a 15 min time step. The extrapolation is made "step-by-step": the cloud index map forecast at time horizon h is the result of the extrapolation applied on the forecast result of h-15 min. This allows taking into account a smoother cloud motion, in particular for a rotating motion. Indeed, a "direct" extrapolation of the image T0 to every time horizon steps lead to a homothetic shape of the original image, ignoring the curved motions. This process enables us to obtain, at time T0, a time-series of 24 extrapolated images up to h = 6 h. Numerous authors, including [36], apply a smoothing filter on extrapolated images. This permits a reduction in forecast errors by eliminating small randomly varying structures that can become unrealistic after the extrapolation process. However, if it reduces the errors in term of hourly energy sums, the results lead to a reduction in GHI variability [39]. If a low error rate in the hourly energy sum is better for applications such as electricity trading, forecasting the actual variability can be useful for micro-grid applications, where forecasting the ramp-rates accurately is more fruitful than forecasting the PV power magnitude. In the present study, we applied a rectangular low-pass filter with size increasing with time-horizon.

Forecast GHI Computation
The last step of the forecast algorithm is the conversion of the target pixels into GHI values. We extracted the pixel including the site of Palaiseau and computed the clear-sky index Kc from the cloud index value, following the relationship given by [34]. Thus, the generic simplified Equation (1) became: For computing GHI cs , we selected the clear-sky model called SOLIS [50,51] available through the pvlib library [52]. To simplify the implementation of our forecast algorithm, we use the monthly climatological means of aerosol optical depth proposed in this library and assigned a constant value for water vapor concentration. SOLIS has been widely used and tested as a robust model with a stable accuracy even with climatological data such as aerosols inputs [53]. The forecast values of K c = K c (T 0 + h) are finally converted into GHI (T 0 + h) using (2).

Forecast Predictors
Solar irradiance variability is driven by its two mains components:

•
The deterministic variability only dependent on the course of the Sun, directly and entirely quantified by the solar zenith angle θ.

•
The stochastic variability of cloud cover, depending on the weather situation at multiple spatio-temporal scales. Thus, its quantification can be made through various parameters influencing cloudiness at the intra-day scale.
For cloud cover variability, we limited the study to three potential predictors: the observed clear sky index on the punctual site, the clear sky index map surrounding the SIRTA site and the weather regimes occurring in the northern Atlantic and western Europe. These parameters are likely to present links with the intraday irradiance variability at the kilometric scale. Moreover, their observation and forecast are available even for operational conditions. Thus, they can be used to predict the expected uncertainty of operational forecasts.

Predictor Solar Zenith Angle
The SZA variation describes the daily and seasonal course of the Sun. It determines the maximum possible irradiance for a given instant and also influences the magnitudes of forecast errors. Moreover, GHI retrieval from satellite images at high SZA values (when the Sun appears close to the horizon) induces higher error rates, mainly due to possible misalignment of cloud and its shadow viewed from the satellite, as well as an overestimation of reflectance computed in (4) leading to false cloud detection (overestimation of cloud index). SZA values are computed through the Solar Position Algorithm (SPA) [54] also available in pvlib [52].

"Observed" Clear-Sky Index
What is denoted here by "observed" clear sky index is the value of K c_obs , computed with the measured GHI obs from pyranometer: GHI obs and GHI cs are both available at forecast initialized time if the site is equipped with a pyranometer. Values of K c_obs at this instant assume to represent the possible state of the sky. For extreme values, the sky is overcast or cloud free meaning a relative stability at least in the first tens of minutes. For intermediate values, the sky is often covered by a heterogeneous cloud field that can change quickly.

Spatial Pattern of Surrounding Clear-Sky Indices Assessed from Satellite
K c_obs is observable only from the punctual target site, delivering information on surrounding situation only through assumptions. As computation of cloud indices of the whole studied area of the satellite image is a necessary step for the extrapolation process (see Section 2.3.5), the clear-sky indices over the surrounding area of the target site are easy to obtain.
We consider the clear indices map of 7 × 7 MSG-HRV pixels, representing approximately a rectangle of 8 × 14 km 2 surrounding Palaiseau. From this map at each forecast initialization instant, we computed the arithmetic mean of the 49 clear-sky indices assessed from satellite image and their standard deviation. These two parameters enable us to quantify the magnitude of the cloudiness and its spatial distribution inside this surrounding area.

Weather Regimes of North Atlantic-Europe Domain
Cloudiness patterns and their spatiotemporal evolution originate from weather characteristics at larger synoptic scales (pressure and temperature fields). For the northern Atlantic area, four situations per season have been identified as distinct weather regimes [55,56]. The impacts of each weather regime on the meteorological situation in Paris area have been studied by [28,57], who investigated the influence of these same weather regimes on solar and wind energy production.
Weather regimes are distinguished by clustering the daily maps of anomalous 500 hPa geopotential height in the north Atlantic-European region. The classification of Cassou [58] presents four different cases: • An anticyclone over northern Canada and Greenland and a zonal circulation towards Europe: this regime corresponds to the negative value of the Northern Atlantic Oscillation (NAO) and it is therefore called NAO−. Van Der Wiel et al. [57] focused on winter situations, whereas [28] studied summer situations only. From these two works we deduced the expected consequences on the cloud cover evolution over the Paris area. For both summer and winter, cold and wet circulation are more often brought by NAO− and the Atlantic Ridge, whereas dry and warm fluxes are rather caused by Scandinavian Blocking and NAO+ regimes.
To verify these assumptions, we computed average hourly GHI measurements, observed by pyranometers for various regimes. We obtained the regime daily classification from [59], which does not guarantee the results for spring and autumn. Therefore, in the rest of this present paper, all presented results according to weather regimes will concern only summer defined by June, July and August and winter (December, January, and February). Results are summarized in Table 1. We remark that this assumption is verified in summer but with small differences. Scandinavian Blocking and zonal regimes show higher GHI values. In winter, significantly low occurrences of the NAO− regime affect the statistical quality of the results. However, Scandinavian Blocking presents a net superiority in averaged irradiance in winter compared to the other regimes. For a deeper vision of weather regimes' impact on cloud cover observed by satellites, we manually visualized the MSG images of the same periods classified in these four regimes. We deduced some trends showing that NAO+ and NAO− present, most of the time, a uniform advection of the cloud field in the considered area. Atlantic Ridge presented a majority of convection cases and Scandinavian Blocking showed large cloud free areas. Figure 4 shows representative images of Atlantic Ridge and Scandinavian Blocking corresponding to our first assumptions. We assumed that these trends and the image representativeness are based on our subjective considerations.

Evaluation Protocol
The GHI forecasts for the single pixel colocated at Palaiseau are compared with the GHI measured by the BSRN station. In order to compare instant pixelwise value from satellites with local ground measurements, we converted both satellite and ground data irradiation sums to 15 min time steps irradiation sums. Then, for a given satellite acquisition time (e.g., 12.11 UTC), we compared both irradiations summed over the last 15 min from 11.57 UTC to 12.11 UTC. Minute resolution ground data have been simply summed over these 15 min. For satellite data, we first computed a GHI forecast value at each minute between the two acquisitions using a temporal linear interpolation between 11.57 and 12.11 UTC of predicted clear-sky index. Then we computed the GHI every minute, given that the clear sky model is able to provide GHI cs value at 1 min steps. Then forecast data have been summed over these 15 min. This protocol follows the recommendations of [60] who advise to take into account the Sun position at each minute to synthesize the 15 min information gap due to satellite sensor resolution.
We performed the forecast using all HRV images available at IPSL from 1 July 2017 to 30 June 2020. We only used images sufficiently illuminated by the Sun to compute CMV and cloud index maps. We rejected all data computed with an SZA above 78 • . In practice, the first and last daily observations used in July were time stamped at 5.26 and 18.26 UTC. In January, forecasts started at 9.41 and ended at 14.11 UTC.

General Results
We computed errors between all forecast (GHI for ) and observation (GHI for ) values by classifying them in time horizon from 0 to 360 min with 15 min time step. We use the mean bias error (MBE), mean absolute error, RMSE and correlation coefficient or Pearson coefficient (R), respectively, defined by (7), (8), (9) and (10).
where Cov (GHI for ,GHI obs ) is the covariance between forecast and observation values ; σ GHIfor and σ GHIobs are standard deviations of, respectively, forecast and observation values. Table 2 shows the results of several metrics for selected typical time horizons. Relative value corresponds to the absolute value divided by mean of GHI ground measurements. Results at time horizon of 0 min evaluate our algorithm of GHI estimation from satellite. Results are similar to the state-of-the-art for Palaiseau station. In particular, [60] published very close results in terms of relative RMSE (25.5% and 24.8% for the GHI database HelioClim-3 version 4 and version 5, respectively). The high value of the correlation coefficient ensures consistent results. Error metrics increase with time horizon except for bias error. However, results are satisfactory in the magnitude of order from similar works using ground truth data in the Netherlands [21] and Germany [43] at different periods. Figure 5a shows the RMSE results compared with those of ARPEGE model and persistence algorithm as reference algorithms. The persistence algorithm generally used in the intraday algorithm consists in considering that clear-sky index K c value is constant from forecast initialization time T0 to time horizon h (9). In other words, this naive algorithm takes into account the course of the Sun but assumes that cloud cover did not change at all during the forecast.   (12). A positive value of FS means that RMSE for is lower than RMSE ref .
It highlights the evidence that satellite-based forecast present smaller errors than an NWP at these time horizons. Moreover, the persistence algorithm presents larger errors. The difference of errors between forecast and persistence increases with the time horizon. Figure 6 shows the forecast skills according to the four seasons defined as trimesters with complete months. The skill scores have similar layouts except for wintertime mainly due to shorter daytime and thus, are less statistically representative. For all seasons, the satellite forecast gives better results than the NWP forecast and persistence. The overall results show a satisfactory consistence regarding the state-of-the-art and enable to investigate forecast uncertainty sensitivity to the selected predictors.

Results against Punctual Predictors
Local predictors, namely the SZA and the observed clear-sky, have their values defined at forecast target point and at forecast initialization time T0. For commodity, we present the RMSE results as a function of both predictors. Figure 7 shows so-called "heatmaps" of the absolute (a) and relative RMSE (b) of the forecast for all seasons at a time horizon of 60 min.
Such figures show that the Sun at its highest position with an intermediate state of cloud cover clearly affects the absolute RMSE. When the magnitude of GHI is higher with a lower SZA value, the absolute error increases. As mentioned previously in Section 3.2, intermediate values of K c obs occur when the cloud cover is not homogeneous and when it can evolve rapidly. The relative RMSE is more regularly sensitive to the K c obs and is impacted by the lowest Sun elevations. As this metric is weighted with the mean of the observation values, the Sun position effect is strongly attenuated and the errors appear larger for low GHI values. Heatmaps from other time horizons (not shown) present similar patterns. Similar results have been highlighted for the NWP-based forecast in [25]. Such information is relevant at least for network managers dealing with large PV farms. A variation of absolute errors can induce a large difference in terms of PV electricity resource for grid dispatching.   Figure 8 shows the relative RMSE as a function of the mean and the standard deviation of surrounding K c as described in Section 4.3. It shows a clear evolution of error pattern according to the time horizon. At T0 + 0 min, we can notice that for any classes of K c , the error increases with the standard deviation. This is more pronounced for extreme values of the means of K c . Indeed, cases of overcast or clear sky induce large errors if spatial heterogeneity increases. For later time horizons, the relative RMSE becomes larger for lowest values of K c but the increase with standard deviation is less pronounced. The errors for the clear-sky situation are more stable and quite low. Whatever the time horizon, the maps show an increasing variation of the errors with the spatial heterogeneity of the area.

Results against Neighboring Clear-Sky Index Pattern Observed by Satellite
These distinct signals, which are a function of time horizon, can be useful to predict the forecast error magnitude as a function of the current cloud cover neighboring pattern. Further studies could be done by choosing different sizes of the neighboring area in order to refine this error prediction.

Results against Synoptic Weather Regimes
We plotted the relative RMSE of the satellite-based forecast according to the four weather regimes presented in Section 3.4. Figure 9 shows the curve as a function of the time horizon. The blue curve labeled "All regimes" gathers all the forecast whatever the weather regime and is plotted here as a reference. The other curves show the RMSE of forecast occurring during the considered weather regimes. A weather regime occurrence is attributed on a daily basis. Therefore, any forecast starts and ends within the same weather regime. The results show a clear distinction of the error according to each weather regime, even between the two considered seasons. In summer, the relative RMSE according to regime presents a constant ranking for all time horizons. The Atlantic Ridge presents the largest errors. Relative RMSE increases from 4% to 6% compared to all situations. Scandinavian Blocking shows a significantly lower error (−5% to −4%) than the other regimes. NAO+ and Zonal (NAO−) remain close to the general score of the forecast. However, NAO−shows a relative RMSE up to 5% larger than NAO+.
In winter, relative RMSE is significantly higher than in summer. Colder temperatures and humidity lead to more frequent cloudy situations where this type of forecast is less accurate as mentioned, for example, by [21]. Moreover, shorter daytime duration decreases the observation number compared to summer, which increases the relative RMSE. Winter results show an inverted ranking concerning weather regimes defined by the NAO. NAO+ appears much less accurate than all regimes, whereas Atlantic Low (NAO−) shows a smaller error. However, the divergent results of NAO− in winter is also explained by the weak number of occurrences of this regime as noticed in Table 1. Scandinavian Blocking decreases the relative RMSE from 11% to 14% when the Atlantic Ridge increases it from 7% to 16%.
For both seasons, the performances under the Atlantic Ridge and Scandinavian Blocking regimes follow the magnitude of the GHI estimation presented in Table 1. It provides a loss of 4% to 5% in relative RMSE. The Atlantic Ridge increases the RMSE by 4%-6%. In winter, these respective variations are 11%-14% and 7%-16%.
Thus, this forecast error ranking can be explained by the relative warmer and drier situation in Europe due to the proximity of anticyclonic situations during the Scandinavian Blocking regime. The cold and humid conditions of the Atlantic Ridge induced more depressions and unstable cloudy situations, which affected the forecast accuracy. These results for these two regimes are consistent with the satellite image example presented in Figure 4.
Such a clear ranking is a useful result because weather regime occurrences can be forecast accurately several days in advance through NWP models and therefore predict the magnitude of intraday error forecast. Figure 10 shows the forecast skill of satellite-based forecast against those of ARPEGE. We can remark that the performances according to weather regime do not show the same ranking as for relative RMSE. In summer, the Atlantic Ridge and Scandinavian Blocking regimes have the highest skill scores up 250 min of time horizon. We can deduce that during the Atlantic Ridge regime, the forecast is less accurate even for ARPEGE. The NAO+ and NAO−regimes present the worst forecast skill score meaning that even if satellite forecasts present better results than NWP at these time horizons, forecasting GHI during the two NAO phases seems to give better results for ARPEGE compared to other weather regimes. In winter, unsurprisingly, the Scandinavian Blocking regime shows the best score. NAO+ presents the worst score, confirming the better skill for ARPEGE to forecast GHI during such regime. However, this result does not concern NAO−, certainly due to its lack of occurrence.

Conclusions
This work is aimed at identifying relationships between the accuracy of an intraday surface solar irradiance forecast method and meteorological variables that can be easily observed or forecast. Firstly, we proposed an irradiance forecast method using satellite images, based on general principles already known from the recent state-of-the-art. Similar methods have been used on operational and commercial bases for several years. We tested herein the implementation of a proven cloud motion vector computation method, routinely used to extract wind vectors in operational conditions. Secondly, we proposed a long-term validation over three complete years using ground measurements and a comparison with an NWP model (ARPEGE). The results showed forecast errors varying with time horizon from 24% at 15 min to 39% at 6 h in relative RMSE (corresponding, respectively, to 86 and 163 Wm −2 in RMSE). Whatever the forecast horizon, the method shows a positive forecast skill score compared to persistence (up to 15%) and ARPEGE model (between 20% and 40%). The bias is almost constant about-17 Wm −2 (5% in relative value) whatever the time horizon. Even if this validation is restricted to one location, the good results regarding the state-of-the-art methods enable further studies on the uncertainties.
Finally, we evaluate the accuracy sensitivity with respect to meteorological conditions quantified by the Sun's elevation, the cloud cover pattern surrounding the target site and the synoptic weather regimes. We showed, for example that, according to clear-sky index spatial distribution at forecast initialization time, the relative RMSE at 90 min may vary from less than 15% for a clear-sky condition to more than 50% for heterogeneous cloudy skies. A weather regime occurrence can be known several days in advance. It is found that the summer Scandinavian Blocking regime is related to the lowest relative RMSE, whereas the Atlantic Ridge regime shows the highest errors. The difference of relative RMSE between these two regimes is from 8% to 10% in summer according to time horizon. In winter, this difference is much higher, from 18% at the earliest time horizon steps to 30% after 5 h. These results confirm the lower accuracy of this forecast method in the case of heterogeneous cloud cover as observed using clear-sky index spatial distribution heatmaps. Such a situation often occurs often during the Atlantic Ridge regime. The extrapolation of a cloud cover using only a CMV constant in time during the forecast induces larger errors when the cloud cover strongly varies in space and time.
Our results permit to propose these predictors to solar energy forecast users. These predictors can be either known when the forecast starts or several days in advance (which is the case of weather regimes). This enables us to anticipate the reliability of an intraday forecast using a CMV-type technique.
It can be particularly useful for micro-grid applications where the energy management methods to optimize PV electricity consumption with forecast are part of a growing experimental field over the world [61].
These predictors have been selected for their known influence of the uncertainty of solar irradiance estimation using satellite data. However, this predictor list is certainly not exhaustive and further parameters can be tested, in particular, if the latter are easy and cheap to obtain in operational conditions. A key result of our work is the highlighted dependence between forecast accuracy and weather regimes occurring in the Paris area. Such work can be of course tested on different areas influenced by identified weather regimes. Moreover, deeper studies on synoptic barometric fields may provide additional predictors derivable from available meteorological observations. Finally, this methodology can also be undertaken for other types of solar irradiance forecast algorithms because weather regime occurrence has direct influence on cloud cover behavior. More and more developing forecast methods include a machine-learning approach. Identified predictors could be used as pertinent inputs to improve their results. Funding: This work is part of the project Grid Power Sustainability, cofinanced by the European Union and Regional Council Ile-de-France. Europe is involved in Ile-de-France through the European Regional Development Fund (ERDF). This action benefited from the support of the Chair of "Challenging Technology for Responsible Energy" led by l'X -Ecole polytechnique and the Fondation de l'Ecole polytechnique, sponsored by TOTAL. This research was supported by 3rd Programme d'Investissements d'Avenir [ANR-18-EUR-0006-02].