Precipitation Extremes and Their Synoptic Models in the Northwest European Sector of the Arctic during the Cold Season

Abstract: Precipitation extremes over the Barents Sea and neighbouring areas of Europe were analysed using station observations and the high-resolution ERA5 re-analysis. The two datasets did not always coincide (on average, coincidence was ~50%). Daily precipitation amounts were typically higher in the observational data, although the reverse also occurred. The analysis revealed that at several stations and in many ERA5 grid cells, the set of precipitation extremes is a mixture of two different subsets. The cumulative distribution function (CDF) of the larger subset, in both the re-analysis and the observational data, is well described by the Pareto law. However, in very rare cases the values deviate from and exceed this base distribution in the region of large values. These super-large anomalies do not obey the statistical law common to all other extremes. This does not mean, however, that the extremes can be arbitrarily large: they do not exceed the marginal values that are typical for this type of climate and season. The analysis confirms that extreme precipitation in the western sector of the Arctic is caused by the penetration of moist air masses from the Atlantic within the circulation systems of intense cyclones. At times, mesoscale convective systems embedded in atmospheric fronts contribute significantly to the formation of precipitation. Intensification of such cyclones under global warming should transform the typical CDF, as present-day outliers become regular members of the Pareto law. This change in the statistics of extreme events reflects the nonstationarity of the climate state. Polar lows have no appreciable influence on the formation of large daily precipitation amounts.


Introduction
Interest in the meteorological conditions of the Arctic has recently increased owing to the economic development of the region, which is an area of intensive shipping, fishery, and natural gas production. From a climate perspective, the region is of great interest because it has warmed more quickly than any other region on Earth. This so-called Arctic amplification [1] results from dynamic processes in the upper ocean layers due to increasing upward heat flux (the so-called "atlantisation" process [2]) and in the atmosphere due to intensification of poleward heat advection [3,4]. These effects are particularly significant in the Barents Sea. Owing to its geographical position, the Barents Sea is a transitional zone where the climate is primarily influenced by the North Atlantic Ocean. Notably, relatively little research has been published regarding the climate of this region.
The purpose of this article is to study extreme precipitation that occurs during the cold season over the sea surface and the neighbouring land of the Scandinavian and Kola peninsulas. We focus on a small portion of the Pan-Arctic domain to delineate its regional characteristics more clearly. We have taken into account that in the sub-Arctic region, the winter season lasts throughout the entire interval from November to March. At this time of year, particularly large amounts of precipitation are associated with cyclones that arrive from the west and southwest and sometimes originate over this region. The extreme values of precipitation during the warm season were studied in our recent work [5].
A large portion of the Barents Sea is covered with ice during the cold season. In terms of temperature, roughness, and other characteristics, the ice-covered sea surface resembles land. The last two decades have been characterised by significant climate warming, a reduction in the area of old and first-year sea ice in the Arctic, and the appearance of a significantly larger ice-free sea surface than existed earlier.
The Arctic region is characterised by a sparse in situ observational network. Re-analysis data provide an important means of filling this gap, as they offer global coverage and combine weather forecast models with the assimilation of observations from a wide variety of sources. For this purpose, the horizontally detailed ERA5 re-analysis was used. This product (see below) was developed by the European Centre for Medium-Range Weather Forecasts (ECMWF).
Extreme precipitation events can be characterised using various indicators. Here, we used daily amounts. For the statistical description of their extrema, the Weibull, Fréchet, and Gumbel distributions (and their relatives, such as beta and gamma distributions and their mixtures) and also power-law distributions (such as the Pareto distribution) have traditionally been applied (see [6][7][8][9][10][11]). We continue to work in this tradition. Extreme value theory assumes that the data selected for analysis are independent and identically distributed. However, the theory can be extended to dependent time series [12,13] or applied to data that have previously been transformed into independent series [14,15]. Regarding identical distribution, our analyses of different extreme values (wind speed [15], precipitation [5], and wave climate [16]) revealed that the set of extremes obtained from observations or from modelling is a mixture of two different subsets. The probability distribution function of the larger subset is aptly described by one of the abovementioned laws. This approximation was used as the base distribution. The largest events belonging to this base distribution are sometimes referred to as "black swans", following the terminology introduced by N. Taleb [17]. However, there are very rare cases in which the values deviate from and exceed the base distribution in the large-value range. These unique events are termed "dragons", following the terminology introduced by D. Sornette [18]. Statistical analysis that can differentiate such extreme values is important not only for practical purposes but also because it permits, in some cases, the detection of the origins of extrema, as different statistical distributions sometimes suggest different originating mechanisms. This hypothesis motivated our interest in detecting, analysing, and understanding such different extrema and their nature.
It is important to understand the extent to which atmospheric re-analysis is able to reproduce extrema. Hence, we investigate the utility of the ERA5 data (meteorological information with a fine spatial resolution) for simulating the peculiarities of precipitation extremes.
The next step of our analysis is to detect the synoptic processes responsible for such events. Precipitation extrema in high latitudes are known to be connected with extratropical cyclones (see, for example, [19]); however, this mechanism has not been fully explored. Therefore, we focus on case studies that help us to gain a detailed understanding of the processes involved in extreme precipitation occurring along frontal structures, in extreme cyclones, and in polar lows [20].
In the following section, we describe the study area and briefly summarise the information regarding the datasets. The next sections provide evidence of a Pareto distribution in the station observation data. Subsequently, we describe the re-analysis data and compare the fit to the observations. We then describe the synoptic conditions that lead to precipitation extremes. Section 5 concludes the paper.

Data
The study was performed in the sub-Arctic realm of the Barents Sea and included both the Northeast Atlantic and inland European territory. A dataset of daily precipitation amounts (covering the period 1966-2018) from stations located within the domain (Figure 1) was used. The ERA5 re-analysis developed by the ECMWF was used as the source of re-analysis data. Compared with the earlier, successful ERA-Interim re-analysis, its horizontal resolution was increased to 0.25° × 0.25°, the number of vertical levels was increased to 137 pressure levels from 1 to 1000 hPa, the temporal resolution was changed to hourly, and the list of output parameters was extended. Furthermore, the number of assimilated observations was increased (approximately five-fold relative to ERA-Interim) [21,22].

Methodology
To apply statistical approaches, we composed our data samples so as to satisfy the independence condition; in practice, this means that a sample must include only independent extreme values. We demonstrated (both for the stations and for the ERA5 grid cells) that independence is achieved if the time series consists of the daily precipitation amount for each day. This dead-time interval (1 day) was obtained via autocorrelation function analysis as the period over which the correlation between consecutive fluctuations disappears.
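The dead-time estimate described above can be illustrated with a minimal sketch of a sample autocorrelation function. The function names, the `max_lag` parameter, and the 95% white-noise bound used as the "disappearance" criterion are assumptions made for illustration; the text does not state which criterion was applied.

```python
from statistics import fmean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = fmean(series)
    variance_sum = sum((x - mean) ** 2 for x in series)
    if variance_sum == 0 or lag >= n:
        return 0.0
    covariance_sum = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


def decorrelation_lag(series, max_lag=10):
    """Smallest lag whose |r| falls below the approximate 95% white-noise bound.

    The bound 1.96 / sqrt(n) is an assumed criterion, not one taken from the text.
    Returns None if no lag up to `max_lag` satisfies it.
    """
    bound = 1.96 / len(series) ** 0.5
    for lag in range(1, max_lag + 1):
        if abs(autocorrelation(series, lag)) < bound:
            return lag
    return None
```

Applied to a daily precipitation series, a returned lag of 1 would correspond to the 1-day dead-time interval reported here.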
During the winter, the Barents Sea can be either open water or covered with ice at various concentrations. In our analysis, we only used open water samples from the western portion of the sea. Over the ice-free sea, precipitation can occur in the form of either snow or rain. During the cold season, precipitation over the cold land surface falls mainly in the form of snow.
Our analysis revealed that the daily precipitation time series cannot be satisfactorily described by a low-parameter formula from the known probability distribution laws. However, an approximation can be achieved by utilizing a mixture of distributions. For example, a mixture of several (at least five) beta distributions reproduced all details of the cumulative distribution function (CDF). However, this CDF is described by 30 parameters that must be calculated from the samples. It is difficult to recognise this solution as satisfactory, as such a parameterisation is unstable and spatially non-universal (because at each grid point or station, the CDF exhibits a unique number of parameters).
Therefore, an alternative approach was utilised in which not the total sample was analysed but only its largest values, those exceeding a certain threshold. Following the peaks-over-threshold method, the statistics of extreme precipitation are described by the Pareto distribution. The CDF for this approach is:

$$F(p) = 1 - \left(\frac{P_{th}}{p}\right)^{\gamma}, \quad p \ge P_{th}, \qquad (1)$$

where p is the daily precipitation amount, P_th is the threshold value, and γ is the exponent of the distribution. We stress that the advantage of a multiparameter formula (based on a mixture of beta distributions) over the simpler Pareto formula is small. Taking this into account, and given that we study the statistical regime primarily to identify the physical patterns of extreme formation, the Pareto formula was used.
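As an illustration of the peaks-over-threshold step, the following sketch selects the exceedances and evaluates a Pareto CDF. It assumes the standard Pareto form F(p) = 1 - (P_th/p)^γ; the function names and the default threshold argument (9 mm, the regional value quoted in the text) are illustrative choices.

```python
def peaks_over_threshold(daily_mm, p_th=9.0):
    """Keep only daily totals (mm) that exceed the threshold, sorted ascending."""
    return sorted(x for x in daily_mm if x > p_th)


def pareto_cdf(p, gamma, p_th=9.0):
    """Pareto CDF F(p) = 1 - (p_th / p)**gamma for p >= p_th, and 0 below it."""
    if p < p_th:
        return 0.0
    return 1.0 - (p_th / p) ** gamma
```

For example, with γ = 3 a daily amount of twice the threshold has F = 1 - (1/2)^3 = 0.875, so 12.5% of the exceedances would be expected to be larger still.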
Moreover, our analysis allowed us to obtain a single threshold value (9 mm) for the entire region. This provides an important advantage, as it becomes possible to concentrate on the analysis of γ as a single parameter. This threshold value removed 98% of the sample, and approximately 150-200 values remained for analysis. The Pareto parameters were estimated using the maximum likelihood method. To this end, expression (1) can be replaced by its logarithmic form:

$$\ln\left[1 - F(p)\right] = -\gamma \ln p + \gamma \ln P_{th}.$$

In Figure 2, several Pareto plots calculated from the station data are presented; a straight line is recovered if the sample follows a Pareto distribution. We can assess the quality of the description visually and also quantify it using the coefficient of determination (R²), which provides a measure of approximation success. In a mathematical sense, the use of R² is related to the application of the Cramer-Mises-Smirnov statistical criterion. At all stations, we observed that the majority of the points of the empirical CDF closely follow a Pareto distribution. This is the base distribution mentioned above, which denotes the community of black swans. The application of the Kolmogorov-Smirnov test also gave no reason to reject the Pareto distribution. However, not all peculiarities are described in this way. Several points of the CDF, predominantly those with high values, do not belong to the base distribution. We did not try to find a common formula for them; as mentioned previously, representatives of this population are termed "dragons" (in contrast to "black swans", which represent the largest events belonging to the base distribution).
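The estimation and the goodness-of-fit measure described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standard maximum-likelihood estimator for a Pareto exponent with a fixed threshold, γ = n / Σ ln(p_i / P_th), and an empirical plotting position (i - 0.5)/n for the CDF, neither of which is specified in the text.

```python
import math


def pareto_gamma_mle(exceedances, p_th=9.0):
    """Maximum-likelihood Pareto exponent for a fixed threshold p_th."""
    logs = [math.log(p / p_th) for p in exceedances]
    return len(logs) / sum(logs)


def pareto_plot_r2(exceedances):
    """R^2 of a least-squares line through the points (ln p, ln(1 - F_emp)).

    F_emp uses the plotting position (i - 0.5) / n (an assumed convention),
    so the empirical survival probability is never zero.
    """
    data = sorted(exceedances)
    n = len(data)
    xs = [math.log(p) for p in data]
    ys = [math.log((n - i - 0.5) / n) for i in range(n)]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)
```

A sample that follows the Pareto law exactly gives R² close to 1 on such a plot, while a cluster of points departing from the line in the upper tail lowers it.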
We will continue to follow this terminology by designating the appearance of these anomalies as the "dragon effect" to indicate the differences in samples belonging to various groups.

Statistical Features of the Observed Daily Precipitation Sums
Shared statistical properties among extremes can be regarded as the result of the same organising principle, which suggests a common originating mechanism for every member of the population. This idea implies that a large extreme does not differ from its smaller siblings except in magnitude.
The outliers from the base distribution can be detected based on obvious breaks in the tails of the precipitation distributions (Figure 2a). Here, there is a group of several points arranged along a line in the coordinates of the Pareto plot. This result falls under the classification discussed in Section 1, where sample data for the same quantity belong to different distribution functions. In other cases, there were only a small number of extreme values that did not fall into the base distribution (Figure 2c), and finally, at some stations there were no such specific events (Figure 2b,d). A very important observation is that we can easily diagnose events adhering to the base distribution (there are several examples demonstrating that such a diagnosis is not simple and requires different methods adapted to the specific problem [27]).
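The text identifies such outliers from visible breaks in the tail of the Pareto plot. Purely as an illustration, one automated stand-in could fit the exponent on the lower part of the exceedances and flag upper-tail values that exceed the fitted quantile by some margin. The function name, the `base_fraction` and `factor` parameters, and the through-origin regression are all assumptions and are not taken from the paper.

```python
import math


def flag_dragons(exceedances, p_th=9.0, base_fraction=0.9, factor=1.3):
    """Return (gamma, dragons): the base exponent and flagged upper-tail values.

    gamma is fitted on the lowest `base_fraction` of the sorted sample by a
    regression through the origin of ln(1 - F_emp) on ln(p / p_th). A value in
    the remaining tail is flagged when it exceeds `factor` times the quantile
    that the base fit predicts for its empirical probability.
    """
    data = sorted(exceedances)
    n = len(data)
    # Empirical survival probability with plotting position (i + 0.5) / n.
    survival = [(n - i - 0.5) / n for i in range(n)]
    n_base = max(2, int(base_fraction * n))
    x = [math.log(p / p_th) for p in data[:n_base]]
    y = [math.log(s) for s in survival[:n_base]]
    gamma = -sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    dragons = []
    for p, s in zip(data[n_base:], survival[n_base:]):
        predicted = p_th * s ** (-1.0 / gamma)
        if p > factor * predicted:
            dragons.append(p)
    return gamma, dragons
```

The thresholds here are arbitrary; in practice they would need tuning against the visual diagnosis, and the sketch makes no claim to separate the two populations in a statistically rigorous way.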
Note that in statistics, there is the test of a null hypothesis, which states that two samples are derived from the same population, against an alternative hypothesis, which states that a specific population tends to possess larger values than does the other. This is the Mann-Whitney U test (the Kruskal-Wallis test extends the Mann-Whitney U test when there are more than two sets). However, to compare two (or more) sets of observations, they must be selected a priori. In the case of precipitation extremes, such a method is not necessary, as all data refer to the same nomenclature.
Consider the properties of the base distribution in more detail. The parameter γ of the Pareto distribution calculated for all the stations is presented in Table 1. Most of the values are in the range of 3 to 5, but sometimes γ is much higher. The higher values refer to stations located in the interior of the continent (such as Sodankila) or sheltered by mountains from the influence of the sea (such as Alta or Kvikkjokku). For the Kolguyev Island Northern station, the number of anomalies exceeding the threshold was very small, and this may have distorted the result. It should be noted that when the threshold value is approximately the same, a significantly larger γ indicates that large anomalies occur less frequently.
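The link between γ and the frequency of large anomalies can be made concrete with a short sketch. It assumes the Pareto survival function (P_th/p)^γ for the exceedances; the function names are hypothetical, and the return period is expressed in numbers of threshold exceedances rather than in years.

```python
def exceedance_probability(p, gamma, p_th=9.0):
    """Probability that an exceedance of p_th is also larger than p."""
    if p <= p_th:
        return 1.0
    return (p_th / p) ** gamma


def return_period_in_events(p, gamma, p_th=9.0):
    """Average number of threshold exceedances between events larger than p."""
    return 1.0 / exceedance_probability(p, gamma, p_th)
```

With the 9 mm threshold, an 18 mm day is exceeded once in 8 exceedances for γ = 3 but only once in 32 for γ = 5, which is the sense in which a larger γ means rarer large anomalies.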

The Pareto Distributions in Data of the ERA5
The next step of the analysis was to investigate the extent to which the abovementioned peculiarities of precipitation extremes are reproduced by ERA5 re-analysis. The establishment of the correspondence between simulation products and near-surface observations could help us to assess the quality of the modelling products and their ability to reproduce the precipitation extremes. Additionally, it was important to extend our analysis to the marine area, as this cannot be implemented based on station data only.
In Figure 3, we plotted several CDFs on the basis of the ERA5 data. Again, as observed in the station data, practically all points of the CDFs from ERA5 closely follow a Pareto distribution. Only a small number of points do not belong to this base distribution. In some cases, these points depart from the general dependence and are arranged along a line. They again follow a Pareto distribution, but with much larger values of γ. In other cases, there are several or only one such extreme value, and finally, there are cases with no such specific events. These results fall under the situation discussed above, where sample data for the same quantity belong to different distribution functions. It is important to complement these similar properties of the CDFs by comparing information regarding rare events (black swans and dragons) obtained from observations and from ERA5 for the same day at the same grid point. For such a comparison, the 10 dates on which the largest anomalies were observed were selected (based on the CDFs). Next, we considered the extrema at the grid points close to the station and examined whether the ERA5 anomalies exhibit similar sizes and origins on the same dates.
First, it was observed that the dates did not always coincide. Coincidence occurred in 10-80% of cases, and on average, coincidence was ≈50% (Figure 4). The absolute values of the daily amounts of precipitation were typically higher in the observational data, although the reverse also occurred (Figure 4).
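The date comparison described above can be sketched as a set intersection of the top-ranked dates in the two series. The data layout (a dictionary mapping a date label to a daily amount in mm) and the function names are assumptions made for illustration.

```python
def top_dates(daily, k=10):
    """Dates of the k largest daily amounts in a {date: amount_mm} mapping."""
    return set(sorted(daily, key=daily.get, reverse=True)[:k])


def coincidence_fraction(station_daily, era5_daily, k=10):
    """Share of the k largest station dates that are also among ERA5's k largest."""
    shared = top_dates(station_daily, k) & top_dates(era5_daily, k)
    return len(shared) / k


def observed_to_era5_ratio(station_daily, era5_daily, k=10):
    """Ratio P_obs / P_era5 on each date that is extreme in both series."""
    shared = top_dates(station_daily, k) & top_dates(era5_daily, k)
    return {d: station_daily[d] / era5_daily[d] for d in shared}
```

A ratio above 1 on a shared date corresponds to the typical case in which the station total exceeds the ERA5 total at the nearby grid point.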
For the selected extrema, we analysed whether they belonged to the community of black swans (B) or dragons (D). Class coincidence (D-D or B-B) was noted in approximately 60% of cases (Figure 5). In 35% of cases, an event classified as D according to observations was classified as B according to the ERA5 data. The opposite was true in only 5% of cases. This result reflects the observation that the absolute values of the extrema according to observations typically exceed those yielded by the re-analysis.

[Figure caption fragment: ... (see Figure 1) largest precipitation extremes with similar extrema according to ERA5 data at these grid points (abscissa axis) and the ratio of observed precipitation (Pobs) to the total of ERA5 data (Pera5) (ordinate axis).]

Thus, the extremes of daily precipitation are adequately reflected in the re-analysis data (in the sense of belonging to the same family) in only approximately 25-30% of cases. In another 15-20% of cases, D in the observational data is converted to B in ERA5.
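One way to read these percentages, assuming that the class-agreement shares (about 60% and 35%) apply to the roughly 50% of extremes whose dates coincide, is the following product of fractions. This is an interpretation of the arithmetic and not a formula given in the text.

```latex
P(\text{same date and same class}) \approx 0.5 \times 0.6 = 0.30,
\qquad
P(\text{same date, D} \rightarrow \text{B}) \approx 0.5 \times 0.35 \approx 0.18
```

Both products fall within the quoted ranges of 25-30% and 15-20%.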
It should be noted that despite the observation that the extrema in the series of observational data and in the re-analysis data do not always appear synchronously, the exponents γ (see expression (1)) extracted from both the observation and ERA5 data are approximately the same. All ERA5 data over the land territory of the selected region lie between the mean values of the observed γ and its standard deviations. Moreover, ERA5 data were approximately the same over land and sea areas.
Another important effect detected among the samples belonging to the base distribution is the gradual increase in the parameter γ from west to east (Figure 6), which indicates that the frequency of large anomalies decreases in this direction. As mentioned, in conditions where the threshold value is approximately the same, a significantly larger γ indicates that large anomalies occur less frequently. Considering that large anomalies in precipitation are maximally frequent with a cyclonic type of circulation (e.g., [19]; see the next section), we can conclude that this effect reflects a decrease in cyclonic activity in the west-east direction, and cyclones are typically destroyed in the Yamal-Taimyr region. Therefore, the ERA5 generates extrema that exhibit statistical properties that are similar to observations; however, in both cases, these are random processes, and their extrema do not always coincide.

Precipitation Extrema and Extratropical Cyclones
As mentioned previously, one of the important features of the CDFs of precipitation extrema is that the most powerful cases did not fall into the base distribution (Figures 2 and 3). Such "super-large" anomalies are unique, and therefore, they are present in the sample as extremely rare events. Additionally, as mentioned, their return periods exceed the return periods of the same anomalies prescribed by the base distribution. It is typically difficult to generalise the features of the dragon effect in space from either station data or ERA5 data, as the parameters of "super-large" anomalies vary between neighbouring stations or neighbouring grid points. We used synoptic analysis to determine the circulation conditions responsible for the "super-large" precipitation anomalies, concentrating on the dragon events identified from the CDFs (Figures 2 and 3). Priority was given to events that were simultaneously recorded as anomalous at several stations (or grid nodes).
Thus, in many locations in the Barents Sea region, anomalies of the highest ranks were noted on 10 January 2019 (see Table 2). At that time, the region was under the influence of a cyclone in the mature stage, the centre of which was located in the Svalbard region (Figure 7). Air from the temperate latitudes of the Atlantic Ocean was transferred by the cyclonic circulation to the Barents Sea region. As a result of occlusion, warm and moist air spread into the middle and upper troposphere and not only covered the warm sector of the cyclone but also filled the central and rear areas of the vortex. The analysis of the thermodynamic diagrams (Atmospheric Soundings (uwyo.edu)) shows that layered rain/snow clouds (a complex of Ns-As-Cs) occupied the entire troposphere. Therefore, significant precipitation fell over a large area, and the anomaly was the greatest on record at many points within the region. The largest daily amount was almost 30 mm; for the Arctic (particularly in winter), this is a significant anomaly.
Consider the following episode (Figure 8). The situation appears similar to that discussed earlier (Figure 7). However, atmospheric sounding data show that in the lower and middle troposphere (between 800 and 500 hPa), the temperature lapse rate is close to the moist adiabatic one, indicating the development of convective clouds. The wind shear in the lower troposphere confirms the ascending air movements generated by the frontal system. Moist convection conditions and large wind shear, together with a large moisture content, favoured the development of a pseudo-adiabatic process that produced heavy precipitation.

Precipitation Extrema and Role of the Polar Lows
It is interesting to compare extreme precipitation with the occurrence of polar lows. For example, an interesting feature of the situation on 6-7 January was the presence of a polar low located near the pixel (72° N, 28° E) [26]. It clearly could not have been responsible for the observed wide zone of precipitation anomalies, which were particularly pronounced along 73° N (see Table 2).
Consider the next situation on 27 November 2019, when the entire region was again under the influence of a polar cyclone. Along its western periphery, the polar low moved outside the baroclinic zone to the south. The subsequent precipitation maximum, which appeared a few days later (29-30 November 2019) over a large area (Table 2), was not related to this polar low. The extreme precipitation was generated by a synoptic-scale vortex.
In general, our analysis of several hundred cases of polar lows collected in different archives [23][24][25][26] revealed that their role in the formation of extremes of daily precipitation sums is insignificant. Only in rare episodes (for example, on 19 January 2000) can we note a weak local precipitation extreme occupying one or two pixels near the location of the polar low (Table 2); however, the anomaly was far from the highest ranks. The small size of polar lows, their fast movement, and their short lifetimes do not favour the formation of extended, long-lasting precipitation anomalies. Additionally, it is established that at least some polar lows form within outbreaks of cold air from ice fields over the warm sea surface. The moisture content of cold air is small, and therefore there is no reason to expect a large amount of precipitation.

Conclusions
• It was demonstrated that the Pareto distributions are close for the extremes that were extracted from both the re-analysis data and observational data. However, the strongest episodes did not fall into this distribution. For them, the correspondence between the absolute value of the anomaly and its probability is completely lost, thereby signifying that any anomaly can occur. This means that super-large anomalies do not obey the statistical law common to all other extremes. It is no wonder, according