As costs for photovoltaic (PV) modules and systems continue to drop, PV is becoming increasingly mainstream and widespread. The technology is expanding into geographical regions and climates with lower annual irradiation and significant snowfall, such as Scandinavia and Canada [1
With the expansion of new PV markets, interest in climate-specific impacts, including the impact of snowfall on PV production, also appears to be increasing. In a number of studies from snow-rich locations with cold winters, PV systems were found to suffer annual energy output losses of up to 34% for certain years and systems, with monthly losses of up to 100% [3
]. Moreover, in moderate climates, typical losses of around 5–6%, and up to 9.3%, have been reported, whereas plants in mild climates show typical annual losses below 2–3% [3
]. These studies and their reported levels indicate that snow losses can have a substantial impact on the energy yield, in addition to impacting the financial conditions for the site investor and owner. Therefore, snow impacts should reasonably be taken into account during the planning of PV sites and be included in quotations and financial return on investment (ROI) or lifecycle cost of energy (LCoE) calculations.
The extent to which a PV array is covered with snow during winter clearly depends on climate and, at a more detailed level, on the weather conditions before, during, and after snowfall. The surface temperature of the modules, whether the module surface is wet or dry, and the type of snow (wet or dry) affect how much falling snow accumulates on the modules [11
]. Once a module array is covered by snow, air temperature, irradiation level, and snow type play a crucial role in the speed of natural clearing of the modules by sliding or melting [9
]. Furthermore, design parameters and technical specifics of the plant impact the speed of snow clearance. In general, higher tilt angles lead to shorter periods of snow coverage, but not in a linear relation. Townsend and Powers [6
] suggested a linear relationship with the squared cosine of the tilt angle for annual losses, while Marion et al. [10
] proposed a linear relationship to the sine of the tilt angle for the speed of snow sliding. Moreover, the smoothness of the array surface affected by module framing or mounting system parts [3
] and interference with ground or roof constructions below the array [3
] were reported to have a significant impact on snow clearance rates. Townsend and Powers [6
] defined a basic approach to model ground interference, but for rooftop systems, many more parameters are likely to impact the interference, such as roof tilt angle, the possible presence of snow racks, module mounting height, type of roof cladding, and distance between the bottom of the array and the roof eaves.
In addition to the fraction of a PV module or array that is covered by snow, the spatial distribution in relation to string and module substring configurations also defines the resulting electrical power loss. The impact of snow coverage is similar to the impact of shading, or soiling in general, and includes the thickness and light transmittance of the snow cover [3
]. Because snow is cleared by a combination of melting and sliding, it tends to cover only a part of the module and array, mostly reported to be the bottom of modules [9
], but not limited to this area. In the latter case, snow from the bottom part of one module row is observed to slide downwards onto the top part of the underlying module row in the same array [13
]. Based on these observations, landscape orientation of modules seems beneficial because bypass diodes typically divide modules into substrings parallel to their long edge. However, early observations in an ongoing study indicate that snow sliding might be negatively affected because module frames and inter-module gaps occur at smaller intervals [14
]. No conclusions on the final, combined effect of the two factors—bypass diodes and clearance speeds—can be drawn from current literature.
Several approaches to model snow coverage and/or snow losses have been developed, most of them being empirical and a few based on modeling physical principles [3
]. Pawluk [3
] presented an overview of models that are included in Table 1
—with a single addition—to summarize the extent of the underlying data. Most approaches directly focus on predicting the impacts on PV energy generation, while other models predict snow coverage and from there estimate the expected energy yield. All listed empirical models show improvements in prediction accuracy, compared to prediction methods not considering snow losses at all, for the sites studied or used in validation. Prediction improvements are, however, not always consistent for all time resolutions, showing decent correspondence for annual estimates and large errors for monthly estimates or higher resolution estimates [6
]. For example [15
] presented annual absolute errors within 2% and 4% (of annual energy yield), respectively, and monthly absolute errors within 85% and 34% (of monthly observed energy yield), respectively.
Table 1 shows that most studies have a clearly limited statistical base, with limitations in one or more of the following three parameters: number of PV plants, geographical distribution (i.e., (micro-)climates), and number of winter seasons. This statistical limitation creates uncertainty regarding general applicability and a need for broad validation efforts that extend all three parameters. Three studies include larger numbers of PV plants [15
]. The study by Hong et al. [15
] also had a relatively high number of winter seasons but was geographically limited to one city area (Seoul, Korea). Zamo et al. [17
] geographically covered two counties in France—one a hilly region near a mountainous area and the other flatlands near the sea. The models tested in [17
] did not take any PV system specifics into account and were all regression-based, focusing mostly on advanced regression methods. The best model in general showed root-mean-square errors (RMSE) of 9–12% for hourly values three days ahead, but no specific results for the snowy seasons were presented. Lorenz et al. [19
] covered northeast Germany (basically, the area of former East Germany) and showed a decrease of RMSE from 11% of installed power to around 7.5% for intra-day hourly prediction values at a single site level when implementing an assumed 100% snow cover for hours with air temperature below zero (0 °C).
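The threshold rule described above can be sketched in a few lines. This is an illustrative simplification, not the implementation used in the cited study: the function name and the sample values are assumptions, and the only rule encoded is that hours with air temperature below 0 °C are treated as 100% snow-covered.

```python
from typing import List, Sequence


def apply_snow_threshold(
    predicted_power: Sequence[float],
    air_temp_c: Sequence[float],
    threshold_c: float = 0.0,
) -> List[float]:
    """Set the predicted power to zero for hours assumed fully snow-covered.

    An hour counts as snow-covered when its air temperature is strictly
    below `threshold_c` (hypothetical helper, assumed 0 degC default).
    """
    if len(predicted_power) != len(air_temp_c):
        raise ValueError("inputs must have the same length")
    return [
        0.0 if temp < threshold_c else power
        for power, temp in zip(predicted_power, air_temp_c)
    ]


if __name__ == "__main__":
    # Made-up hourly values, for illustration only.
    print(apply_snow_threshold([1.0, 2.0, 3.0], [-1.0, 0.0, 5.0]))
```

Because the rule ignores whether snow is actually present, it is best read as a coarse correction rather than a snow cover model.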
The usefulness of a temperature threshold of 0 °C for snow coverage was confirmed by [3
], though [3
] noted that such a threshold should be combined with the use of snow cover conditions in order to prevent underestimation of snow losses during cold periods.
Out of all discussed prediction models, only the model by Marion et al. [10
], with some minor adjustments [16
], has been implemented in a publicly available PV simulation tool, namely in the National Renewable Energy Laboratory (NREL) System Advisor Model (SAM) [22
The work presented in this article aims to extend the size of the data collection (number of sites, number of years) by using generally available data to arrive at snow loss estimates that can be used to verify existing snow loss models and to investigate new or improved modeling approaches. To achieve this aim, it is critical to find reliable methods to infer snow-free performance, and it is preferable to identify inference methods for many of the main site-specific parameters.
We have found that satisfactory snow-free models can be inferred for most PV systems. Inference of tilt and azimuth angles is possible, with room for improvement. We can also show that estimated annual snow loss levels are in line with earlier publications and correlate with the model by Marion et al. with adjustments by Ryberg and Freeman [10
3.1. Comparison of Snow Data
The comparison of snow cover data readily showed that the satellite products suffered from inconsistent data coverage with several gaps, which made them difficult to use. The best data coverage was obtained with the snow water equivalent (SWE) product from the Cryoland service. However, we found quite a few cases in which this source classified the reference sites as snow-free even though the PV array observations indicated snow cover. Because our main goal was to find days when production was not influenced by snow cover, this misclassification was a problem. Snow depth from the UERRA reanalysis product, on the other hand, tended to overestimate the number of snow-covered days for the PV array. This overestimation caused a loss of production data points but did not induce errors. That robustness, together with the fact that UERRA had full data coverage, made it the best choice for our application.
3.2. Inference of PV Array Tilt and Azimuth
The tilt and azimuth were estimated for the reference sites and for those quality-controlled survey sites for which the survey provided information about the geometry (n = 24). Scatterplots of the reported and estimated tilts and azimuths are shown in Figure 4a,b for all three methods tested.
Summarizing the error statistics, we found the new method, which optimizes the absolute error (l1-error), to be the most satisfactory. The plot includes sites with multiple orientations. If only sites with a single orientation are considered, the bias, standard deviation, and correlation coefficient are −4°, 9°, and 0.3 for the tilt, and −6°, 10°, and 0.9 for the azimuth, respectively. The cause of the two outliers in the tilt estimation has not been investigated. Although an error of about 10° may seem high, it did not have any major effect on the estimation of the clear sky model, as shown below.
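The three error statistics quoted above (bias, standard deviation, and correlation coefficient) can be computed as in the following minimal sketch. The sign convention (estimated minus reported), the use of the sample standard deviation, and the function name are assumptions, and azimuth wrap-around at ±180° is deliberately not handled.

```python
import math
import statistics
from typing import Sequence, Tuple


def angle_error_stats(
    reported_deg: Sequence[float], estimated_deg: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (bias, standard deviation of the error, Pearson correlation).

    The error is taken as estimated minus reported (assumed convention),
    so a negative bias means the angle is underestimated on average.
    """
    if len(reported_deg) != len(estimated_deg) or len(reported_deg) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    errors = [e - r for r, e in zip(reported_deg, estimated_deg)]
    bias = statistics.mean(errors)
    spread = statistics.stdev(errors)  # sample standard deviation
    mean_r = statistics.mean(reported_deg)
    mean_e = statistics.mean(estimated_deg)
    cov = sum((r - mean_r) * (e - mean_e)
              for r, e in zip(reported_deg, estimated_deg))
    var_r = sum((r - mean_r) ** 2 for r in reported_deg)
    var_e = sum((e - mean_e) ** 2 for e in estimated_deg)
    corr = cov / math.sqrt(var_r * var_e)
    return bias, spread, corr


if __name__ == "__main__":
    # Made-up tilt angles, for illustration only.
    print(angle_error_stats([10, 20, 30, 40], [8, 15, 29, 33]))
```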
3.3. Inference of Snow-Free Models
Using tilt and azimuth angles from the previous step, the shading coefficients were then estimated for all the quality-controlled basic and survey sites (n = 258), using the method described earlier. Error statistics for production during the same hours used for estimation (no snow in UERRA data and recorded production above its 5th percentile) are presented in Table 3. In the same table, we also give the error statistics separately for the sites where the panel geometry was known and used as input for the estimation of the snow-free models.
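The selection of hours used for estimation (no snow in the reanalysis data and production above its 5th percentile) could look roughly like the sketch below. The nearest-rank percentile definition, the choice to compute the percentile over all supplied hours, and the function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def nearest_rank_percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile (assumed definition; others exist)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def select_estimation_hours(
    production: Sequence[float],
    snow_depth: Sequence[float],
    pct: float = 5.0,
) -> List[int]:
    """Indices of hours with zero snow depth and production above the percentile."""
    if len(production) != len(snow_depth):
        raise ValueError("inputs must have the same length")
    floor = nearest_rank_percentile(production, pct)
    return [
        i
        for i, (power, depth) in enumerate(zip(production, snow_depth))
        if depth == 0 and power > floor
    ]


if __name__ == "__main__":
    # Made-up hourly values; the last hour has snow in the reanalysis data.
    hours = select_estimation_hours(list(range(1, 11)), [0.0] * 9 + [0.2])
    print(hours)
```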
Note that the performance of the snow-free models does not differ depending on which information about panel tilt and azimuth is used. Hence, we conclude that the quality of the estimated angles is good enough for use as input to the subsequent estimation of the snow-free model parameters. A density plot for the hourly snow-free production at all 258 basic and survey sites (n = 2,004,830) is presented in Figure 5a.
3.4. Snow Loss Estimation
The snow loss was calculated as the difference between modeled snow-free production and measured production. Missing data were handled by requiring at least 90% data coverage during the daytime in order to obtain representative monthly values.
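As a minimal sketch of this calculation, the function below sums the hourly difference between modeled snow-free production and measured production for one month and rejects the month when fewer than 90% of the daytime hours have a measurement. How missing hours enter the monthly sum is not specified above; summing over valid hours only, without gap filling, is an assumption, as is the function name.

```python
from typing import Optional, Sequence


def monthly_snow_loss(
    modeled_snow_free: Sequence[float],
    measured: Sequence[Optional[float]],
    min_coverage: float = 0.9,
) -> Optional[float]:
    """Snow loss for one month of daytime hours, or None if coverage is too low.

    `measured` uses None for missing hours. The loss is the sum of
    (modeled - measured) over hours with a valid measurement (assumption).
    """
    if len(modeled_snow_free) != len(measured):
        raise ValueError("inputs must have the same length")
    if not modeled_snow_free:
        return None
    valid = [(m, y) for m, y in zip(modeled_snow_free, measured) if y is not None]
    if len(valid) / len(modeled_snow_free) < min_coverage:
        return None
    return sum(m - y for m, y in valid)


if __name__ == "__main__":
    # Made-up values: ten daytime hours, one of them missing.
    print(monthly_snow_loss([1.0] * 10, [0.5] * 9 + [None]))
```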
An example showing the estimated snow-free and actual monthly production at one of the basic sites is shown in Figure 5b. Note that there are non-negligible snow losses during the winter of 2017–2018 and that the 90% data-availability criterion caused a gap of two months in the graph at the start of 2019.
Moreover, note that there are some discrepancies between the model and the measurements, especially during the summer. The satellite data used in SARAH-2 come from the geostationary Meteosat satellites. Since these are located above the equator, the view angle becomes large at northern latitudes, resulting in distorted cloud geometries. Moreover, we have so far only used an estimate of the direct component, derived from the global irradiance in SARAH-2 using the pvlib python function pvlib.irradiance.disc.
The choice to require valid measured PV yield data for at least February, March, and April in order to accept annual estimates is in line with the statistics for monthly contributions illustrated in the boxplot in Figure 6. The plot includes site-wise monthly means based on all years with at least 90% data availability per month. It shows that March is typically the month with the highest snow losses, responsible for 43% of all sites’ accumulated snow losses. February is clearly in second place, while the average contributions of January and April are similar. April’s median value is lower than January’s, while its average value is higher. Since the higher values in the distribution for April are about twice as high as for January, it is reasonable to require data availability for this month: in most cases April does not contribute much, but in some cases its contribution is too large to be ignored. The three months (February to April) together account for 79% overall (all sites and winters). Note that May contributions can be as high as January’s and larger than those of November or December. Snowfall in May is not common but does occur. In early May, there is also the possibility that snow from precipitation in April still covers the PV array. In May, the irradiation typically expected per hour of snow coverage is considerably higher than in November–January. At the site with the highest latitude included in our study, the single hour with the highest global horizontal irradiation in May 2020 (819 Wh/m²) equaled almost half the irradiation during December 2018, the month with the lowest irradiation (1726 Wh/m²). Even though this example is somewhat extreme, it clearly indicates that very few hours of snow loss in May are enough to contribute more than entire months such as November or December.
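The acceptance rule for annual estimates described above can be expressed as a short check. The sketch below is a simplification under stated assumptions: months are keyed by calendar number rather than by winter season, months without a valid value other than the required three are treated as contributing nothing, and the names are hypothetical.

```python
from typing import Dict, Optional

# Months that must have a valid monthly estimate (February, March, April).
REQUIRED_MONTHS = (2, 3, 4)


def annual_snow_loss(monthly_losses: Dict[int, Optional[float]]) -> Optional[float]:
    """Sum the monthly losses, or return None if a required month is missing.

    `monthly_losses` maps month number (1-12) to a loss value, with None
    (or an absent key) for months that failed the data-availability check.
    """
    if any(monthly_losses.get(month) is None for month in REQUIRED_MONTHS):
        return None
    return sum(value for value in monthly_losses.values() if value is not None)


if __name__ == "__main__":
    # Made-up monthly losses in kWh/kWp; December is missing but not required.
    print(annual_snow_loss({1: 5.0, 2: 10.0, 3: 20.0, 4: 8.0, 12: None}))
```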
Figure 7a shows the estimated snow losses for all sites that passed the quality checks, ordered by latitude to provide a sense of the geographic locations. Over time, new sites have been commissioned and added to the service providing the data, and therefore there are more blanks, i.e., missing data, in the earlier years.
To put the snow loss data into perspective, they can be compared to the mean specific yields in Figure 7b. The snow losses vary from 0 to 198 kWh/kWp. Yield for all sites and years varies between 356 kWh/kWp and 1262 kWh/kWp, with both the highest and lowest values found near the same latitude. The reasons behind these extremes have not been investigated, but apart from physical causes, they might well be due to misreporting by the PV system owners. Either way, they are outliers compared to the median of 872 kWh/kWp for all sites’ mean specific yields, considering that 50% of all sites have a mean specific yield between 800 kWh/kWp and 956 kWh/kWp.
If we compare the three winter seasons with the most data points (2017/2018, 2018/2019, and 2019/2020), we notice that for basically all sites the first winter shows clearly higher snow losses than the second, which in turn has higher snow losses than the third. For the winter of 2017/2018, half of the sites have snow loss estimates between 28 and 71 kWh/kWp. The lower and upper quartiles of the specific yield for that same year were 734 kWh/kWp and 902 kWh/kWp, with a median of 819 kWh/kWp.
Relative annual snow losses are illustrated in the boxplot in Figure 8, showing values between 0% and 20%. For the snow-rich winter season of 2017/2018, the mean relative snow loss was 6.3%. It should be noted that snow conditions may vary considerably between sites, which might explain the large intervals. The lowest values could be affected by site owners clearing the module array from time to time. We have not investigated the possibility of estimating this behavior but know from the survey sites that it is not very common (11% and 4% of 85 respondents apply total clearance and clearing of just the top layer of snow, respectively).
3.5. Comparison with Existing Snow Loss Models
Figure 9 below shows scatterplots comparing the snow loss estimates from our inference method with those from the models by Marion et al., Ryberg and Freeman [10
], and van Noord et al. [7
]. The Marion model was applied twice, assuming either portrait or landscape module orientation. Since the results were similar, only the “portrait” results, which were slightly better, are presented in Figure 9.
The graph for the Marion model in Figure 9a clearly shows a correlation with the inferred model for annual snow losses. The Pearson correlation coefficient between these models is 0.73. The bias and standard deviation for the Marion model estimates are −0.13 kWh/kWp (0.5% of mean annual snow losses) and 18 kWh/kWp (8.9% of mean), respectively. As discussed in the introduction section, monthly aggregate estimates of the Marion model typically show poorer correspondence, with large relative errors, and this is also true for our results.
The second model compared here, the lin–temp model, which assumes a linear relation between monthly average temperature and relative monthly snow loss, tends to overestimate the annual snow losses compared to the snow-free model. Interestingly, underestimation is very limited and stays at an almost constant distance from the 1:1 line. For this comparison, the Pearson correlation is 0.66, somewhat lower than for the Marion comparison. Bias and standard deviation are 17 kWh/kWp and 30 kWh/kWp, respectively, considerably higher than for the Marion model. Results for monthly estimates are in line with the annual results, i.e., higher bias and standard deviation than the Marion model estimates and relatively few underestimations (compared to our estimates).
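The structure of a model of the lin–temp type can be illustrated as follows. This is only a sketch of the general form, a linear relation between monthly mean temperature and relative monthly loss. The coefficients are not given in this section, so `slope` and `intercept` are caller-supplied placeholders, and clipping the result to the range 0 to 1 is an added assumption rather than a documented part of the model.

```python
def lin_temp_relative_loss(
    mean_temp_c: float, slope: float, intercept: float
) -> float:
    """Relative monthly snow loss as a linear function of temperature.

    The result is clipped to [0, 1] (assumption) so that it can be read
    as a fraction of the snow-free monthly yield.
    """
    return min(1.0, max(0.0, intercept + slope * mean_temp_c))


def lin_temp_monthly_loss(
    mean_temp_c: float, snow_free_yield: float, slope: float, intercept: float
) -> float:
    """Absolute monthly loss: relative loss times the snow-free yield."""
    return lin_temp_relative_loss(mean_temp_c, slope, intercept) * snow_free_yield


if __name__ == "__main__":
    # Placeholder coefficients for illustration only, not fitted values.
    for temp in (-10.0, -4.0, 2.0):
        print(temp, lin_temp_monthly_loss(temp, 50.0, slope=-0.05, intercept=0.0))
```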
3.6. Snow Loss and Panel Tilt Angle
Figure 10 shows the fractional snow loss (relative to the snow-free model) for March 2018, plotted as a function of panel tilt angle. For this month, snow covered almost all of Sweden, resulting in data from 161 sites. The data points are quite scattered, but after calculating the median loss for each 10-degree interval, one can see a trend in which the loss decreases with increasing tilt angle.
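The binning step mentioned above (median loss per 10-degree tilt interval) can be sketched as below. The function name and the choice of half-open bins starting at multiples of the bin width are assumptions, and the sample values are made up.

```python
import statistics
from collections import defaultdict
from typing import Dict, Sequence


def median_loss_by_tilt_bin(
    tilts_deg: Sequence[float],
    fractional_losses: Sequence[float],
    bin_width: int = 10,
) -> Dict[int, float]:
    """Median fractional loss per tilt bin, keyed by the lower bin edge.

    A tilt t falls in the bin [k * bin_width, (k + 1) * bin_width).
    """
    if len(tilts_deg) != len(fractional_losses):
        raise ValueError("inputs must have the same length")
    bins = defaultdict(list)
    for tilt, loss in zip(tilts_deg, fractional_losses):
        lower_edge = int(tilt // bin_width) * bin_width
        bins[lower_edge].append(loss)
    return {edge: statistics.median(values) for edge, values in sorted(bins.items())}


if __name__ == "__main__":
    tilts = [5, 8, 12, 18, 19, 35]
    losses = [0.5, 0.7, 0.4, 0.2, 0.3, 0.1]
    print(median_loss_by_tilt_bin(tilts, losses))
```

Using the median rather than the mean keeps single outlying sites from dominating a sparsely populated bin.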
4. Discussion and Conclusions
The results show that the new approach presented in this article can be used to estimate snow losses with satisfactory accuracy from just PV yield data and publicly available weather datasets. Snow losses for the included sites and years were estimated to vary from 0% to 20% of annual snow-free PV yield. Expressed as specific energy losses (energy normalized by installed DC peak power), the estimates vary from 0 kWh/kWp to 198 kWh/kWp.
There is, on average, good agreement for the inferred PV yield model during snow-free hours, judging from the high correlation (≥95%), low bias (<0.12%), and standard deviations (RMSE; 6–7%), expressed as a percentage of installed power. The latter is quite close to the 5% RMSE mentioned by Zamo et al. [17
] as the best performance in a benchmark of regression modeling methods, even considering that they normalized by maximum measured production. In that benchmark, Zamo et al. used ground-measured weather data from the nearest meteorological station rather than numerical weather models.
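The effect of the two normalizations mentioned above can be made concrete with a small sketch. It computes the RMSE between modeled and measured production and expresses it either as a percentage of installed power or, when no installed power is given, as a percentage of the maximum measured production. The function names and sample values are assumptions for illustration.

```python
import math
from typing import Optional, Sequence


def rmse(modeled: Sequence[float], measured: Sequence[float]) -> float:
    """Root-mean-square error between two equally long series."""
    if len(modeled) != len(measured) or not measured:
        raise ValueError("need two non-empty series of equal length")
    squared = sum((m - y) ** 2 for m, y in zip(modeled, measured))
    return math.sqrt(squared / len(measured))


def normalized_rmse_percent(
    modeled: Sequence[float],
    measured: Sequence[float],
    installed_power: Optional[float] = None,
) -> float:
    """RMSE in percent of installed power, or of max measured production."""
    reference = installed_power if installed_power is not None else max(measured)
    return 100.0 * rmse(modeled, measured) / reference


if __name__ == "__main__":
    modeled = [1.0, 2.0, 3.0, 4.0]
    measured = [1.0, 2.0, 3.0, 2.0]
    print(normalized_rmse_percent(modeled, measured, installed_power=5.0))
    print(normalized_rmse_percent(modeled, measured))
```

When the maximum measured production is lower than the installed power, the same absolute RMSE gives a larger percentage under the second normalization, so figures normalized in the two ways are not directly comparable.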
The inferred model’s performance for snow-free hours can be taken as an indication that the “snow-free” estimates for hours with snow cover are also, on average, in good agreement. The model performance during hours with snow cover is, however, difficult to assess since we do not have measurements for snow-free PV systems to compare with. We did compare our snow loss estimates with those of the existing Marion model, and there is a fairly good correlation for annual values. Still, the Marion model’s estimates on a monthly basis show large deviations, similar to other studies [10
]. There could, however, be several other explanations for these deviations. It can be an indication that the parameter values obtained for sites in the United States are not representative of the northern Swedish climate or the types of PV sites. It is also possible that uncertainties or errors in weather data (snow depth, temperature) are causing the deviations between the Marion model and our inferred model. Tilt estimate errors are less likely to explain the difference because the estimation method used tends to underestimate tilt angle, which would lead to higher estimates for the Marion model.
There are several parts of the approach presented in this article that allow for further improvement. Some main suggestions are discussed below.
The mean value for the estimated tilt angle in this study was about 25°. For this tilt angle, a 10° error will affect the snow-free POA irradiance during the snow season by about 5% (simulations with pvlib python using solar irradiance data from the SMHI radiation network). Improving the tilt angle estimate is, therefore, not critical for calculating the snow-free production, but it could be important for the estimation of snow losses and validation of existing prediction models. The Marion model uses the sine of the tilt angle in the estimation of the amount of snow slide. Here, the same 10-degree error as above translates into a difference of about 25% in how fast the panel becomes snow-free. That 25% difference implies a significant impact on the expected loss in production. The fact that there is a relation between estimated loss and the tilt angle is illustrated in Figure 10. Thus, an improved tilt angle estimate could improve correlation with the Marion model. The same correlation might also benefit from a parameter fit for the Marion model to the inferred model’s results.
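The sensitivity of a sine-of-tilt sliding term to a tilt error can be checked with a few lines of arithmetic. The sketch below considers only the sin(tilt) factor and ignores every other term of the Marion model; the function name and the example angles are assumptions.

```python
import math


def relative_sliding_change(true_tilt_deg: float, assumed_tilt_deg: float) -> float:
    """Relative change of sin(tilt) when an assumed tilt replaces the true tilt.

    A negative value means the assumed tilt yields slower modeled sliding.
    """
    true_factor = math.sin(math.radians(true_tilt_deg))
    if true_factor == 0.0:
        raise ValueError("true tilt must give a non-zero sine")
    assumed_factor = math.sin(math.radians(assumed_tilt_deg))
    return (assumed_factor - true_factor) / true_factor


if __name__ == "__main__":
    # Example: a 35 degree array whose tilt is underestimated as 25 degrees.
    change = relative_sliding_change(35.0, 25.0)
    print(f"relative change in sin(tilt): {change:.1%}")
```

For instance, if a 35° array is assumed to be at 25°, the sine factor comes out roughly 26% lower, which is of the same order as the difference quoted above. The exact figure depends on which angle is taken as the reference and on the direction of the error.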
Moreover, there is room for improvement by extending the model to also detect cases with panels in multiple directions and infer multiple tilt angles.
It is possible that an improved tilt angle estimate can be obtained using our approach in combination with the direct irradiation from the SARAH-2 data instead of inferring it from the global irradiation using empirical methods as we did. Another way forward would be to look for alternative methods with a focus on obtaining good tilt estimates.
A better description of the direct irradiance could also help improve the snow-free model estimation, especially toward the summer when convection and patchy clouds affect the production to a greater extent. In Figure 5a, the density of the scatter of modeled and measured production during snow-free conditions looks convincing, but there are some caveats, and several sites were omitted during quality control. Improving the snow-free model should decrease the number of omitted sites and further increase the statistical base for snow loss studies using big data.
Shading can be a real issue at some sites and should be investigated in more detail. Provided that enough data are available for analysis, a good compromise for the time resolution of the shading factor for the direct component may lie somewhere between an hourly and a monthly variation.
Moreover, snow detection has room for improvement. Although snow detection by UERRA is conservative, it still misses out on a number of cases in which there is snow on the panels. Our simple comparison of satellite-based datasets of snow cover could possibly be extended with additional satellite products, reanalysis datasets, and other ways of processing the data. Furthermore, a combination of several data sources or processing methods could perhaps make the detection of snow-free cases more precise.
Several factors that are known or expected to impact snow clearance and/or snow losses still need more studying. We can mention ground or roof interference and the overall impact of module orientation—portrait versus landscape—on snow clearance and power losses. The ability to infer module orientation or key interference parameters from PV production data would create new possibilities to study these topics with a big-data approach.
One more aspect where inference methods could be developed is for the detection of non-natural snow clearance, such as manual sweeping or melting. The ability to detect these cases helps to study their impact and further improve snow loss prediction models.
Finally, it would be interesting to evaluate the proposed method at sites with ground truth regarding the production loss, for instance, at locations with twin installations, where one is cleared from snow and the other is not.
Our planned next step is to estimate snow losses at any location without knowing the production, using only reanalysis data and information about the tilt and azimuth of the installation. Given the results obtained, we now have the data needed to perform a regression at a general level. This regression would provide us with maps indicating potential snow losses given the site location and geometry. Replacing the reanalysis data with climate projections would make it possible to study how snow losses may change in the future.