Population Bias on Tornado Reports in Europe

Abstract: Tornadoes are associated with damages, injuries, and even fatalities in Europe. Knowing the spatial distribution of tornadoes is essential for developing disaster risk reduction strategies. Unfortunately, there is a population bias on tornado reporting in Europe. To account for this bias, a Bayesian modeling approach was used based on tornado observations and population density for relatively small regions of Europe. The results indicated that the number of tornadoes could be 53% higher than currently reported. The largest adjustments produced by the model are for Northern Europe and parts of the Mediterranean region.


Introduction
Tornadoes in Europe can be associated with severe damage and can result in injuries and fatalities. Despite this, until recently their threat has been underestimated. Antonescu et al. [1] showed that European tornadoes reported between 1995 and 2015 resulted in 4462 injuries, 316 fatalities, and damages estimated at more than €1 billion. They also indicated that the density of tornado reports was the highest over Belgium, Germany, the Netherlands, and the southeastern United Kingdom. For these countries, there is a reporting bias, as central and western Europe have a high population density compared with other regions in Europe (e.g., Eastern Europe). Coastal areas were also hotspots for tornado reports (e.g., western Italy, eastern Spain), as these areas tend to have a higher population density than inland areas and also because of the waterspouts that move inland from the Mediterranean Sea [2].
This difference in population density (i.e., between different regions of Europe, and between coastal and inland areas) introduces a bias in the reporting of tornadoes. This is because tornadoes (and also other types of severe weather events like hail or extreme winds) are "targets of opportunity" [3]. Thus, an observer needs to witness the event and then report it, and systems need to exist for collecting and verifying the reports (i.e., a tornado database). Very few countries in Europe have developed and maintained such databases, which has resulted in a lack of information about tornadoes [2]. One reason for not developing tornado databases is that they do not seem justified. Compared with the United States, the impact of tornadoes in individual European countries (given their relatively small area) is relatively low, and thus there appears to be no need to develop tornado databases for individual countries [4]. Only when considered from a pan-European perspective does the impact of tornadoes in Europe start to emerge. The collection of tornado reports and other types of severe weather reports at the pan-European level started in 2006 with the development of the European Severe Weather Database (ESWD) by the European Severe Storms Laboratory [5]. Currently, the ESWD contains more than 16,500 tornado reports collected between 1800 and 2020.
Given the differences in population density across Europe, the real number of tornadoes is an unknown quantity. Even in relatively highly populated areas, tornadoes might not be reported because of their small spatial extent and short lifetime, because the view of the observer is obstructed (e.g., by forest, hills, or buildings), or because the tornado occurred during the nighttime. As indicated in previous studies for the United States [6] and Canada [7], population density is the key factor, besides the meteorological factors, in determining the bias in tornado reports.
Several studies have addressed the issue of population bias on tornado reports. Anderson et al. [6] used a hierarchical Bayesian model to account for the population bias on tornado occurrence using historical tornado reports for the United States from the Storm Prediction Center between 1953 and 2001. Their results for the central and eastern United States indicated that F0-F1 tornado reports vary less with population density compared with F2-F5 tornadoes. Starting from the hypothesis that the number of tornado reports in Canada is significantly lower than the actual number of tornadoes, Cheng et al. [7] also used a Bayesian modeling approach that considered the population bias on tornado reports. Their model also included the occurrence of cloud-to-ground lightning, as cloud-to-ground lightning can be used to quantify convective storm activity and thus can be used as a precursor for tornadoes. Their results showed that in areas with low population density, the probability of tornado occurrence is significantly higher compared with the observed tornado climatology for Canada. More recently, Potvin et al. [8] developed a Bayesian hierarchical modeling framework for correcting the reporting bias in the United States tornado database. Compared with other covariates (e.g., distance from the nearest city, terrain ruggedness index, road density), population density explained more of the variance in the number of reported tornadoes. Their model indicated that approximately 45% of the tornadoes that occurred in the study domain were reported.
The aim of this article is to analyze the effects of population bias on tornado reports in Europe using a Bayesian modeling approach. The expected tornado counts over Europe can be used to better understand the societal and economic impact of tornadoes and can be included in national disaster risk reduction strategies. This article is structured as follows. Section 2 details the tornado and the population density datasets. Section 3 describes the Bayesian modeling approach. The results and discussion are presented in Sections 4 and 5, respectively. Finally, Section 6 summarizes the results.

Tornado and Population Datasets
Tornado reports were obtained from the European Severe Weather Database. The ESWD collects information on severe storms over Europe (i.e., tornadoes, severe wind, large hail, heavy rain, heavy snowfall, damaging lightning) using a citizen-science approach [9] and through collaborations with national weather services and volunteer severe weather spotter networks. Before inclusion in the ESWD, each report is verified and receives a quality control level. In this article, tornado reports with a quality control level Q0+ (i.e., validated with meteorological data such as radar and/or satellite imagery) have been used. Unlike the United States tornado database, which contains only reports for tornadoes, the ESWD also contains reports for waterspouts [2]. The ESWD includes information about the surface type (e.g., land, forest, sea, lake) over which tornadoes have been observed and the surface types crossed during the event. Thus, all the waterspouts that moved inland were included in the analyses presented in this article.
Data on population density in Europe were obtained from Eurostat [10]. The data were extracted for NUTS3 regions. The NUTS (Nomenclature of Territorial Units for Statistics) classification is a system for dividing up the economic territory of the European Union and the United Kingdom. The NUTS 2021 [11] classification, valid from 1 January 2021, contains 1166 regions at the NUTS3 level. The NUTS3 level represents small regions for specific diagnoses. For example, for Romania there are 42 NUTS3 regions (41 counties and Bucharest, the capital city). In previous studies for the United States, it was argued that the rural population density at the county level is a more appropriate measure for tornado reporting compared with the total population density [12]. Here, we follow [6] and use the total population density, as the population tends to be distributed over much of the counties and not concentrated in isolated towns. Based on the data availability and overlap with the tornado dataset, for each NUTS3 region the population density was averaged over the period 2006-2019 and the area over the period 2006-2015. These data, together with the number of tornadoes reported between 2006 and 2020 at the NUTS3 level, were included in the Bayesian model.
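The averaging step described above can be sketched as follows. This is a minimal illustration and not the processing code used for this article: the region codes, years, and density values are hypothetical placeholders, and the function name mean_density is introduced here only for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (NUTS3 code, year, inhabitants per km^2).
# Codes and values are placeholders, not Eurostat data.
records = [
    ("XX001", 2005, 98.0),   # outside the averaging window, ignored
    ("XX001", 2006, 100.0),
    ("XX001", 2007, 104.0),
    ("XX002", 2006, 20.0),
    ("XX002", 2019, 22.0),
    ("XX002", 2020, 30.0),   # outside the averaging window, ignored
]

def mean_density(records, first_year=2006, last_year=2019):
    """Average the yearly population density of each region over a window of years."""
    by_region = defaultdict(list)
    for code, year, density in records:
        if first_year <= year <= last_year:
            by_region[code].append(density)
    return {code: mean(values) for code, values in by_region.items()}

averaged = mean_density(records)
print(averaged)
```

Years with no value for a region are simply skipped in this sketch, so each region is averaged over the years that are available inside the window.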

Model
The Bayesian model used in this article starts from the hypothesis that population density is the main influence on tornado reporting and thus that tornadoes are underreported in Europe. As indicated by [6,7], the occurrence of tornadoes can be described as a series of conditional models linked using Bayes' rule. Taking $t_n$ as the number of observed tornadoes and $T_n$ as the true number of tornadoes ($T_n \geq t_n$), a binomial model can be specified in which
$$t_n \mid T_n, \beta \sim \mathrm{Binomial}\left(T_n, p_n(\beta)\right),$$
where $p_n(\beta)$ represents the probability of observing a tornado and $n$ indexes the NUTS3 region. The probability of observing a tornado is a function of $\beta$, which is related to population density ($x_n$). Anderson et al. [6] used an exponential model for $p_n$, assuming that the probability of detection increases with population density. The true number of tornadoes ($T_n$) in the $n$th NUTS3 region is modeled as a Poisson process conditioned on the climatological frequency $\lambda$,
$$T_n \mid \lambda \sim \mathrm{Poisson}\left(a_n \lambda\right),$$
where $a_n$ is the area of the NUTS3 region and $\lambda$ (i.e., the Poisson intensity) is a measure of the tornado frequency per unit area [6].
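A useful consequence of combining the binomial reporting model with the Poisson model for the true counts is that the true number of tornadoes can be summed out analytically. The short derivation below is the standard Poisson thinning argument rather than a step taken from the cited studies. It writes $p_n$ for $p_n(\beta)$, uses $k$ for a possible value of the observed count, and holds for any functional form of $p_n$.

```latex
\begin{align*}
\Pr(t_n = k \mid \beta, \lambda)
  &= \sum_{T=k}^{\infty} \binom{T}{k}\, p_n^{k} (1 - p_n)^{T-k}\,
     \frac{(a_n \lambda)^{T} e^{-a_n \lambda}}{T!} \\
  &= \frac{(a_n \lambda\, p_n)^{k}\, e^{-a_n \lambda}}{k!}
     \sum_{m=0}^{\infty} \frac{\bigl(a_n \lambda (1 - p_n)\bigr)^{m}}{m!} \\
  &= \frac{(a_n \lambda\, p_n)^{k}}{k!}\, e^{-a_n \lambda\, p_n}.
\end{align*}
```

The reported counts are therefore themselves Poisson distributed with mean $a_n \lambda\, p_n(\beta)$, and by the same argument the unreported counts $T_n - t_n$ are Poisson distributed with mean $a_n \lambda\, (1 - p_n(\beta))$.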

Estimation
For the Bayesian approach, the prior distributions for $\beta$ and $\lambda$ need to be specified. These prior distributions are non-informative (i.e., large variance) because there is no prior knowledge that can inform the distributions. Thus, the population parameter $\beta$ is specified such that the distribution of $\exp(\beta)$ is a normal distribution characterized by the mean $\mu_\beta$ and variance $\sigma^2_\beta$. In the model developed in [6], $\mu_\beta$ was set to 0.5 and $\sigma^2_\beta$ to 10,000. For the climatological frequency parameter $\lambda$, a gamma prior distribution was used,
$$\lambda \sim \mathrm{gamma}(q, r).$$
Following [6], the shape parameter $q$ was set to 0.001 and the rate (inverse scale) parameter $r$ to 0.001, corresponding to a prior mean of $q/r = 1$ and a prior variance of $q/r^2 = 1000$ (non-informative). Using Bayes' rule, the joint posterior distribution is proportional to the product of the conditional models and the priors,
$$p(T, \beta, \lambda \mid t) \propto \left[\prod_n p(t_n \mid T_n, \beta)\, p(T_n \mid \lambda)\right] p(\beta)\, p(\lambda).$$
A Markov chain Monte Carlo analysis of the Bayesian model was applied to obtain a sequence of realizations from the posterior distribution [6,7] using the WinBUGS software [13] (available online at https://www.mrc-bsu.cam.ac.uk/software/bugs/the-bugs-projectwinbugs/, accessed on 23 October 2021).
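To make the estimation step more concrete, the sketch below draws posterior samples of $\beta$ and $\lambda$ with a hand-written random-walk Metropolis sampler. It is a stand-in for illustration only and does not reproduce the WinBUGS analysis. Several choices are assumptions of the sketch and not statements from this article: the detection probability is taken as 1 - exp(-beta * x), the prior is placed on log(beta) as a normal distribution with mean 0.5 and variance 10,000, the gamma prior on lambda is read as shape 0.001 and rate 0.001 (which gives mean 1 and variance 1000), and the true counts are summed out so that the reported counts are Poisson with mean a * lambda * p. The region data are hypothetical toy values.

```python
import math
import random

# Hypothetical toy regions: (area in units of 10,000 km^2,
# population density in inhabitants per km^2, reported tornadoes).
REGIONS = [(0.5, 20.0, 1), (1.2, 60.0, 4), (0.8, 150.0, 5),
           (0.3, 900.0, 2), (2.0, 35.0, 5)]

MU_BETA, VAR_BETA = 0.5, 10_000.0  # assumed normal prior on log(beta)
Q, R = 0.001, 0.001                # gamma prior on lambda: shape Q, rate R

def detection_probability(beta, density):
    """Assumed exponential form p = 1 - exp(-beta * density)."""
    return -math.expm1(-beta * density)

def log_likelihood(beta, lam, regions=REGIONS):
    """Poisson log-likelihood of the reported counts, constants dropped."""
    total = 0.0
    for area, density, reported in regions:
        mean = area * lam * detection_probability(beta, density)
        if mean <= 0.0:
            # A zero mean cannot produce the positive toy counts.
            return float("-inf")
        total += reported * math.log(mean) - mean
    return total

def log_posterior(u, v, regions=REGIONS):
    """Unnormalised log-posterior in u = log(beta) and v = log(lambda)."""
    beta, lam = math.exp(u), math.exp(v)
    log_prior_u = -((u - MU_BETA) ** 2) / (2.0 * VAR_BETA)
    # Gamma(shape Q, rate R) log-density in lambda plus the log-Jacobian v.
    log_prior_v = Q * v - R * lam
    return log_likelihood(beta, lam, regions) + log_prior_u + log_prior_v

def metropolis(n_samples=2000, burn_in=500, step=0.3, seed=1):
    """Random-walk Metropolis on (log beta, log lambda); returns (beta, lambda) draws."""
    rng = random.Random(seed)
    u, v = math.log(0.01), 0.0
    current = log_posterior(u, v)
    samples = []
    for i in range(n_samples + burn_in):
        u_new = u + rng.gauss(0.0, step)
        v_new = v + rng.gauss(0.0, step)
        proposed = log_posterior(u_new, v_new)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            u, v, current = u_new, v_new, proposed
        if i >= burn_in:
            samples.append((math.exp(u), math.exp(v)))
    return samples

samples = metropolis()
```

With draws of $\beta$ and $\lambda$ in hand, the expected true count for a region would be its reported count plus the expected unreported count, a * lambda * (1 - p), averaged over the draws. A real analysis would also need convergence diagnostics and a tuned step size, which this sketch omits.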

Results
Based on the data from the ESWD, a total of 2319 tornadoes were observed in Europe between 2006 and 2020 over the NUTS3 regions considered in this article. The number of tornadoes predicted by the model when accounting for the population bias is 3563 (Table 1). Thus, 65% of the tornadoes predicted by the model were reported. The difference between the predicted and observed number of tornadoes is low (<0.027 tornadoes (10,000 km²)⁻¹ yr⁻¹) over parts of the United Kingdom, the Netherlands, Belgium, Germany, and Italy (Figure 1). As indicated previously [1,14], these are regions characterized by both a high number of tornado reports and a high population density. For some NUTS3 regions there were no tornadoes reported during the study period, for example, in parts of northwestern and southeastern France, central Romania, central Italy, Albania, North Macedonia, southern Bulgaria, and Finland. For these regions the predicted tornado counts are fewer than 2 tornadoes over the 15-year study period. The highest difference between predicted and observed tornadoes (between 0.33-0.41 tornadoes (10,000 km²)⁻¹ yr⁻¹) is over Iceland and northern parts of the United Kingdom, Norway, Sweden, and Finland (Figure 1). This is not surprising given that the population density of these regions is low compared with other regions of Europe and also given that the number of observed tornadoes per NUTS3 region in this area is less than 10 over the entire study period. Low values for the standard deviation (<0.13 tornadoes (10,000 km²)⁻¹ yr⁻¹) of the tornado occurrence from the posterior distribution are found over most of Europe (Figure 2), with the exception of an area stretching from eastern Germany over Austria, Croatia, Bulgaria, and Greece characterized by values between 0.13-0.30 tornadoes (10,000 km²)⁻¹ yr⁻¹. In these areas the NUTS3 regions are characterized by a relatively small area, a low population density, and a low number of observed tornadoes during the study period.
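The headline percentages and annual averages follow directly from the two totals given above. The short calculation below only restates that arithmetic; the variable names are introduced for the example, and the study period 2006-2020 is counted as 15 years.

```python
observed_total = 2319   # tornadoes reported in the ESWD, 2006-2020
predicted_total = 3563  # tornadoes predicted by the model, 2006-2020
years = 15              # 2006 to 2020 inclusive

reported_fraction = observed_total / predicted_total
relative_increase = predicted_total / observed_total - 1.0
observed_per_year = observed_total / years
predicted_per_year = predicted_total / years

print(f"share of predicted tornadoes that were reported: {reported_fraction:.1%}")
print(f"predicted total relative to reported total: +{relative_increase:.1%}")
print(f"observed per year: {observed_per_year:.1f}")
print(f"predicted per year: {predicted_per_year:.1f}")
```

The reported share is about 65%, the predicted total is a little over 53% higher than the reported total, and the annual averages are about 154.6 observed and 237.5 predicted tornadoes.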

Discussion
The Bayesian model used in this article only included the population effects, but previous studies using a similar approach have also included meteorological data [7]. Some NUTS3 regions are large areas where the data collection is unreliable, which can reduce the predictive capacity of the model. In their study of the probability of tornado occurrence across Canada, Cheng et al. [7] included the cloud-to-ground lightning climatology to account for the spatial variability of tornadoes. For Europe, an indication of the predictive capacity of the model can be obtained by comparing the results from Figure 1 with the lightning density over Europe between 2008 and 2012 developed by Anderson and Klugmann [15] using data from the Arrival Time Difference NETwork. Over northern Norway, Sweden, and Finland, where the NUTS3 regions have a large area and the collection of data is not as reliable as for other regions of Europe, the Bayesian model introduces large adjustments. For these regions, the lightning density (Figure 4 from [15]) is lower (<0.4 flashes km⁻² yr⁻¹) compared with almost any other region in Europe. Thus, the adjustments in the number of tornadoes are less realistic for these regions from a meteorological point of view. For the Baltic states (i.e., Estonia, Latvia, Lithuania) the corrections from the model are realistic due to the relatively large lightning density (i.e., 0.4-2.5 flashes km⁻² yr⁻¹) in this region. The predictive capacity of the model in these areas can be improved by improving data collection (e.g., ref. [16] used satellite data to obtain information on unreported tornadoes that occurred in forested regions) or by considering meteorological factors related to tornado occurrence as covariates [17].
Future research will develop the model by considering meteorological covariates such as lightning density (e.g., from the Arrival Time Difference long-range lightning detection network [18]) and tornadic environments (e.g., from ERA5 reanalysis data [19]).

Conclusions
In this article, a Bayesian model was applied to adjust the number of observed tornadoes over Europe. The hypothesis was that the number of observed tornadoes in Europe is lower than the real number due to the differences in population density. The model indicated that the average annual number of tornadoes during the study period (2006-2020) was 237.5 compared with an annual average of 154.6 observed tornadoes. The largest adjustments occur over northern Europe (e.g., Iceland, Norway, Sweden, Finland), but also over parts of the Mediterranean region (e.g., Spain, Greece).
The corrected distribution of tornadoes in Europe can be used to better understand the risk posed by European tornadoes. Compared with the previous distribution of tornadoes in Europe, the distribution obtained in this article is more relevant for risk reduction strategies, as it includes a correction for the population bias on tornado reporting. Furthermore, the current results can be used by decision-makers and emergency managers to develop disaster risk reduction strategies for tornadoes. Very few countries in Europe have developed tornado preparedness and response programs.