Probabilistic Quantification in the Analysis of Flood Risks in Cross-Border Areas of Poland and Germany

Abstract: Measuring the probability of flood risk is a key issue in the economics of natural disasters. This discipline studies the actual and potential effects of natural disasters on the functioning of economic systems. Traditional economics assumes that both decision-making processes and market processes operate with a certain level of access to information, and that the effects of certain phenomena are predictable. A natural disaster, however, is difficult to predict: the time of its occurrence, its impact, the direct exposure to its effects and, finally, its social and economic results are all hard to foresee. Exposure to a random hazard, combined with the amount of damage resulting from its potential materialization, is called risk. In this study, the authors focus on presenting a method for quantifying the random element of flood risk. We use measurement data for cross-border areas between Poland and Germany, which witnessed a flood of the century in the 1990s. The empirical data illustrate the usefulness and universality of probabilistic quantification methods for flood risk analysis. The analysis of water levels is interesting in a much broader context than the hydrological-economic one. In Central Europe, river water level is directly connected with two other disaster-like phenomena: drought and heavy rainfall. Moreover, the course of the Oder river is typical of the North European Plain. The conclusions presented by the authors are therefore universal in nature and describe broader phenomena. Applying probabilistic quantification based on extreme values yields a notable result: flood risk changes dynamically. Measurements over five-year periods indicate that there are periods of relatively low exposure of the areas to the disaster (with a negligible probability of 0.02) and periods of disproportionately high risk. The risk of exceeding alarm levels and warning levels changes rapidly, reaching as much as 30% in some locations. A decrease in the flood risk in the middle section of the Oder has been observed.


Introduction
What is a natural disaster in economic terms? It is a natural phenomenon that results in significant dysfunction of the economic system or of human economic behavior. The effects of such events are observed in the level of assets, production factors, employment, and consumption. A natural disaster is usually associated with negative results, but there are also positive effects that should not be ignored. The relation between a natural event and its effects is studied by the economy of extreme phenomena. Three key questions are asked in this field: where the event will occur, what its intensity will be, and what its effect on the economic system will be. The effect can be assessed in non-material categories (personal injury, civilizational regress, environmental damage) as well as in a strictly financial sense, expressed in the funding necessary to rebuild lost elements of [...] to influence the probability, it is possible to limit the risk in its deterministic aspect, for instance through appropriate urbanization policy [32]. Wherever the deterministic aspect is difficult to manipulate, the risk may be limited by influencing its random component [33]. In every case, however, both risk components must be measured and assessed. Accurate assessment of the probability and of the size of damage is therefore a basic tool in the economy of extreme phenomena. A natural disaster is by nature rare and difficult to predict, yet it also seems to be inevitable. That is why risk management should not be a temporary solution but a long-term activity connected with the exposure dynamics and the vulnerability of areas to the risk.
The authors place their study in the ex ante trend of disaster economy and focus on predicting extreme phenomena. The study presents methods of measuring the probability of flood. The analysis covers a cross-border area between Poland and Germany, located in the middle section of the Oder River. In the 1990s this area witnessed flooding with catastrophic results. The area of the central Oder River is interesting for two reasons:
• The course of the river is typical for this part of Europe: it is a flat area, highly urbanized in some parts, with high urbanizing pressure; the area is considered highly vulnerable and exposed to flood risk;
• There is a clear climatic trend observable in the area: occurrence of periodical drought with an increase in the intensity and frequency of heavy rainfall.
The phenomena mentioned above translate directly into water levels, the nature of high water, and the time of water flow. The authors therefore regard observed water levels as a good touchstone for broader climate change. The main aim of the study is to present a method for quantifying uncertain events such as high water. Following the probabilistic theory of extreme values, the authors will verify the following hypotheses:

Hypothesis 1 (H1).
The probability of extreme rainfall is described with a mathematical model with high data compatibility parameters.

Hypothesis 2 (H2).
Both the scale and nature of flood risk significantly change in the studied period.

Hypothesis 3 (H3).
Over the past 15 years, a decrease in the flood risk in the middle section of the Oder has been observed.

Theoretical Background
In this study the authors focus on presenting tools to assess the probability of extreme water levels of the Oder River at three measurement points in its middle cross-border course. Mathematically, levels exceeding certain critical values correspond to the description, proposed by Nicolas Bernoulli [34], of the highest mean among n points distributed randomly on an interval of a straight line of fixed length l. In the analysis of water levels, it is the highest deviation from the reference value, averaged over a certain time interval. Such observations will be called extreme values. Under certain assumptions about the statistical distribution of the analyzed phenomenon, it is possible to specify the parameter values of the extreme value distribution [35,36]. Rapidly developing fields such as science, economics, medicine, insurance, banking, and engineering quickly took advantage of probabilistic models for describing the frequency of extreme phenomena [34,37]. Using the distribution of extrema to analyze water levels in rivers was advocated in [38,39]. The rapid development of this branch of probability theory coincides with growing interest in climate change and in the assessment of its possible effects in the form of floods [17,40-47] and their connection with heavy rainfall [48,49].

Method
Extreme value theory studies the distances of certain groups of elements in a number set from set threshold values. This structure is not complicated, yet it works well for observing random events in the environment. Such events are measurable (for example, water level) and occur with a certain regularity, a frequency. In the theory of stochastic processes such a structure is called a random variable. According to Bernoulli's concept, the number set is an n-element realization of the variable: a set of observed water levels. The set contains certain sub-sets, and each sub-set is represented by a single value, its maximum. There are many ways to create sub-sets; in analyses of water levels they are observations grouped in equal time intervals. We are therefore dealing with a reduction of the entire set to the maximum values of its sub-sets.

Definition 1. Extreme values are values that occur with relatively low probability and have high influence on the behavior of other values in the series [59].
Assume that the water levels $X_1, \ldots, X_n$ form a series of independent, identically distributed (i.i.d.) random variables with distribution function $F(x)$; the set of observations $x_1, x_2, \ldots, x_n$ is a realization of these random variables. The measurements of water levels in the studied river are recorded systematically and regularly over a certain time horizon. The maximum over this time horizon is denoted by a new variable:

$$M_n = \max(X_1, \ldots, X_n). \tag{1}$$

Since the measurements are taken in three different locations, we use an auxiliary symbol $y_j$ for the observed maximum water level in the $j$-th of $m$ blocks at a given location; it is the realization of the block maximum

$$M_{n;j} = \max\left(X_{(j-1)n+1}, \ldots, X_{jn}\right), \quad j = 1, \ldots, m. \tag{2}$$

The basic issue addressed in the study is to determine the statistical distribution of the maxima of water levels and to estimate its parameters. Only such information is a useful tool for modeling flood risk in terms of disaster economy. Since studying water levels and relating them to a set level (threshold and warning in this case) corresponds to Bernoulli's problem, we can use the Fisher and Tippett [35] theorem. If the suitably normalized random variable $M_n$ has a non-degenerate limiting probability distribution, that distribution is described by one of the three distribution functions given by the formula:

$$\begin{aligned}
\text{Gumbel:}\quad & \Lambda(x) = \exp\left(-e^{-x}\right), \quad x \in \mathbb{R}, \\
\text{Fréchet:}\quad & \Phi_\alpha(x) = \begin{cases} 0, & x \le 0 \\ \exp\left(-x^{-\alpha}\right), & x > 0 \end{cases} \quad \alpha > 0, \\
\text{Weibull:}\quad & \Psi_\alpha(x) = \begin{cases} \exp\left(-(-x)^{\alpha}\right), & x \le 0 \\ 1, & x > 0 \end{cases} \quad \alpha > 0.
\end{aligned} \tag{3}$$

The proof of the Fisher and Tippett theorem can be found in works by Embrechts et al. [50] and Mikosch [60]. These distribution functions are called boundary (limiting) distribution functions for the distribution of extreme values. The three distribution functions were presented by von Mises [61] as a single family of distributions referred to as the generalized extreme value distribution (GEV), described by the following formula [50]:

$$G_\gamma(x) = \begin{cases} \exp\left(-(1 + \gamma x)^{-1/\gamma}\right), & \gamma \neq 0,\ 1 + \gamma x > 0, \\ \exp\left(-e^{-x}\right), & \gamma = 0,\ x \in \mathbb{R}. \end{cases} \tag{4}$$

For practical applications, the generalized extreme value distribution $G_\gamma$ is extended by a location parameter $\mu$ and a scale parameter $\sigma$; this parameterization is based on Theorem 1. The operation allows for comparative analysis and interpretation of the observed extreme events.
Theorem 1. If random variable $X$ has distribution function $F$, then random variable $\mu + \sigma X$ (with $\sigma > 0$) has distribution function $F_{\mu,\sigma}(x) = F\left(\frac{x - \mu}{\sigma}\right)$ [62].
According to the above theorem, after parameterization the distribution function $G_\gamma(x)$ is transformed into the distribution function $G_{\gamma,\mu,\sigma}(x) = G_\gamma\left(\frac{x-\mu}{\sigma}\right)$, described with the following formula [35,50]:

$$G_{\gamma,\mu,\sigma}(x) = \begin{cases} \exp\left(-\left(1 + \gamma\,\frac{x-\mu}{\sigma}\right)^{-1/\gamma}\right), & \gamma \neq 0,\ 1 + \gamma\,\frac{x-\mu}{\sigma} > 0, \\ \exp\left(-\exp\left(-\frac{x-\mu}{\sigma}\right)\right), & \gamma = 0,\ x \in \mathbb{R}. \end{cases} \tag{5}$$

In statistical terms, $\gamma$ is called the shape parameter (extreme value index, EVI). It specifies the asymptotic behavior of the tail of the distribution: the higher its value, the thicker the right tail. This characteristic directly determines the distribution of extreme phenomena. The other parameters can be translated into standard characteristics of the probability distribution; in the Gumbel case ($\gamma = 0$), $\mu + 0.5772\sigma$ is the mean, $\mu - \sigma\ln(\ln 2)$ is the median, and $\frac{1}{6}\pi^2\sigma^2$ is the variance.
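The parameterized distribution function described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `gev_cdf` and the numerical values of `mu` and `sigma` are hypothetical and are not taken from the study. The final line checks the Gumbel-case median formula numerically.

```python
import math


def gev_cdf(x, gamma, mu=0.0, sigma=1.0):
    """GEV distribution function G_{gamma,mu,sigma}(x) as a plain function."""
    if sigma <= 0.0:
        raise ValueError("scale sigma must be positive")
    z = (x - mu) / sigma  # standardised level
    if abs(gamma) < 1e-12:
        # gamma = 0: Gumbel case, exp(-exp(-z))
        return math.exp(-math.exp(-z))
    t = 1.0 + gamma * z
    if t <= 0.0:
        # Outside the support: below the lower endpoint (gamma > 0) the
        # function equals 0, above the upper endpoint (gamma < 0) it equals 1.
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-(t ** (-1.0 / gamma)))


# Hypothetical parameters, chosen only for illustration (not from the study).
mu, sigma = 300.0, 50.0
gumbel_median = mu - sigma * math.log(math.log(2.0))
# The distribution function evaluated at the median should be close to 0.5.
print(gev_cdf(gumbel_median, 0.0, mu, sigma))
```

Handling the region outside the support explicitly matters for negative $\gamma$, where the distribution has a finite upper endpoint at $\mu - \sigma/\gamma$.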
The subject literature proposes several methods for estimating GEV distribution parameters. This study uses the commonly applied maximum likelihood method [53,63,64], which, along with probability weighted moments [65], is one of the best and most widely used methods of estimating GEV parameters. Estimating the parameters requires defining how sub-sets are assigned within a given set of water level observations. A great deal of research on natural phenomena indicates clear, observable climate change. From the stochastic point of view, this suggests that the assumption of identically distributed observations might not hold. This opens up an interesting perspective of study: assessing how distribution parameters change and, consequently, how probabilities change over individual time horizons. For this study, the authors initially reviewed the data, analyzing time series of various lengths (that is, various sizes of sub-sets in Bernoulli's concept). The analysis included positional statistics of the distributions and the frequency of occurrence of maxima. Two events were defined for each location and each sub-set: exceeding the warning level and exceeding the alarm level in a given location. The alarm level is an accepted water level at which even a minuscule further rise poses a flood threat. It indicates a highly probable risk of flooding infrastructure and developed land, and an actual danger to human health and life. Exceeding the alarm level triggers specific action to increase protection or counteract the impact; on many occasions it is a reason to evacuate the population of the affected areas. Observing the alarm level is the basis for sounding the flood alarm.
The warning level is a locally defined water level which usually results in careful monitoring of water levels, preparation of possible preventive action and an information campaign for the possibly affected area. It can also be the grounds for declaring a state of hydrological threat. Every hydrometeorological agency in Europe sets two or three warning levels at its measurement points. Because the shape of the river bed varies between locations, the threshold values differ between them. The sub-sets from which maximum values were determined are called blocks, since they were determined using Gumbel's block maxima method, a method of selecting extreme values widely used in extreme value theory. The method can be presented in the following steps:
(1) An observation sample of random variable X of fixed size N is selected from the studied population. In this study, the random variables were daily water levels on the Oder River at three measurement points located at the border between Poland and Germany.
(2) The selected sample is divided into m blocks (series) of n observations each, i.e., nm = N. The division must be done so that block j contains consecutive observations, for j = 1, ..., m. For each of the three locations, the authors selected three five-year periods. The maximum daily water levels were selected from 30-day periods. Each five-year period was therefore composed of m = 60 blocks with n = 30 observations in each block (for 31-day months, one non-maximum observation was rejected).
(3) The maximum observation is determined from each block as a realization of random variable $M_{n;j}$, which, according to Formula (2), has the following form:

$$M_{n;j} = \max\left(X_{(j-1)n+1}, \ldots, X_{jn}\right), \quad j = 1, \ldots, m.$$

Using the block maxima method, three series of observed maximum water levels $y_1, \ldots, y_m$ were obtained for each location, one per five-year period, each being a realization of the random variable series $M_{n;1}, \ldots, M_{n;m}$.
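The block maxima steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `block_maxima` and the sample water levels are hypothetical, and a trailing incomplete block is simply dropped, which is an assumption that differs from the authors' handling of 31-day months.

```python
def block_maxima(observations, block_size):
    """Split consecutive observations into blocks and return each block's maximum.

    Assumption for this sketch: observations that do not fill a complete
    final block are ignored.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    n_blocks = len(observations) // block_size
    return [
        max(observations[j * block_size:(j + 1) * block_size])
        for j in range(n_blocks)
    ]


# Hypothetical daily water levels in cm (illustrative values only).
levels = [312, 318, 305, 330, 341, 329, 300, 298, 310, 400]
maxima = block_maxima(levels, block_size=3)
print(maxima)
```

With daily data, n = 30 and a five-year series, the same call would return the m = 60 block maxima used for estimation.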
The obtained data was used to estimate the parameters of nine distribution functions $G_{\gamma,\mu,\sigma}$ of maximum water levels: one for each of the three five-year periods in each studied location. Estimation yielded the estimator vector $(\hat{\gamma}, \hat{\mu}, \hat{\sigma})$ [18]. The economy of extreme phenomena is based on an ex ante assessment of flood risk. The random component of flood risk is the possibility that a certain hazard materializes. Mathematically, it is expressed as the probability that the random variable exceeds a level set a priori.
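The maximum likelihood estimation mentioned above amounts to minimizing the negative log-likelihood of the GEV distribution over $(\gamma, \mu, \sigma)$. The sketch below only evaluates that objective; the minimization itself would be delegated to a numerical optimizer. The function names, the simulated sample and all parameter values are hypothetical and serve only to show that the objective is smaller at the generating parameters than at a clearly wrong location.

```python
import math
import random


def gev_neg_log_likelihood(params, maxima):
    """Negative log-likelihood of block maxima under GEV(gamma, mu, sigma)."""
    gamma, mu, sigma = params
    if sigma <= 0.0:
        return math.inf
    total = 0.0
    for y in maxima:
        z = (y - mu) / sigma
        if abs(gamma) < 1e-12:
            # Gumbel case: -log g(y) = log(sigma) + z + exp(-z)
            total += math.log(sigma) + z + math.exp(-z)
        else:
            t = 1.0 + gamma * z
            if t <= 0.0:
                return math.inf  # observation outside the support
            total += (math.log(sigma)
                      + (1.0 + 1.0 / gamma) * math.log(t)
                      + t ** (-1.0 / gamma))
    return total


def simulate_gev(gamma, mu, sigma, size, rng):
    """Inverse-transform sampling from the GEV; gamma must be non-zero here."""
    sample = []
    for _ in range(size):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        sample.append(mu + sigma / gamma * ((-math.log(u)) ** (-gamma) - 1.0))
    return sample


# Hypothetical "true" parameters used only to generate illustrative data.
rng = random.Random(2024)
true_params = (-0.1, 300.0, 40.0)
sample = simulate_gev(*true_params, size=500, rng=rng)

nll_true = gev_neg_log_likelihood(true_params, sample)
nll_shifted = gev_neg_log_likelihood((-0.1, 380.0, 40.0), sample)
print(nll_true < nll_shifted)
```

Returning infinity for parameter vectors that place an observation outside the support keeps an optimizer away from inadmissible regions without raising exceptions.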

Definition 2.
Flood risk is the potential damage resulting from random variable $M_n$ exceeding the set critical water level $u_{n;p}$:

$$P\left(M_n > u_{n;p}\right) = 1 - p, \tag{6}$$

where $p \in (0, 1)$.
In theoretical considerations, the conventional level $u_{n;p}$ is formally called the p-quantile of the generalized extreme value distribution. The value of the quantile is a function of the parameters of the distribution function of maximum values $G_{\gamma,\mu,\sigma}$ [66], since it depends on the value of the theoretical distribution function:

$$u_{n;p} = G^{-1}_{\gamma,\mu,\sigma}(p). \tag{7}$$

Therefore, the random component of flood risk depends on three components: the block length $n$, the distribution function $G_{\gamma,\mu,\sigma}$, and the critical level $h_{cr}$. Thus, the measure of probability is a function that can be expressed as follows:

Definition 3. The probabilistic measure of flood risk level $D\left(n, G_{\gamma,\mu,\sigma}, h_{cr}\right)$ is defined as the probability of random variable $M_n$ exceeding the critical level $h_{cr}$:

$$D\left(n, G_{\gamma,\mu,\sigma}, h_{cr}\right) = P\left(M_n > h_{cr}\right) = p_{cr}, \tag{8}$$

where $n$ is the number of days from which realizations of random variable $M_n$ are selected, and $G_{\gamma,\mu,\sigma}$ is the distribution function of random variable $M_n$.
The measure $D\left(n, G_{\gamma,\mu,\sigma}, h_{cr}\right)$ may also be described with the formula:

$$D\left(n, G_{\gamma,\mu,\sigma}, h_{cr}\right) = 1 - G_{\gamma,\mu,\sigma}\left(h_{cr}\right). \tag{9}$$

Considering Equations (6) and (7), critical level $h_{cr}$ may be regarded as the $1 - p_{cr}$ quantile of the distribution describing maximum daily water levels over n days, expressed as $h_{cr} = u_{n;(1-p_{cr})}$. It is also possible to define a measure that gives the value of critical level $h_{cr}$ for a critical probability level $p_{cr}$ set a priori. Since $h_{cr} = u_{n;(1-p_{cr})}$ is a quantile measure of risk level, it is defined as:

Definition 4. The quantile measure of flood risk level is defined by the $1 - p_{cr}$ quantile of the distribution of maximum daily water levels from n-day periods, being a critical level which will be exceeded with flood risk at the $p_{cr}$ level, and it is calculated from the formula:

$$D_Q\left(n, G^{-1}_{\gamma,\mu,\sigma}, p_{cr}\right) = G^{-1}_{\gamma,\mu,\sigma}\left(1 - p_{cr}\right) = u_{n;(1-p_{cr})}, \tag{10}$$

where $G^{-1}_{\gamma,\mu,\sigma}$ is the inverse distribution function of the generalized extreme value distribution of water levels, $p_{cr}$ is the probabilistic measure of flood risk described by Equation (9), and $u_{n;(1-p_{cr})}$ is determined in Equation (7).
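The two measures defined above are inverses of each other, which a short sketch makes concrete. The function names and all parameter values below are hypothetical and are not the estimates obtained in the study; the closed-form quantile follows from solving $G_{\gamma,\mu,\sigma}(x) = p$ for $x$.

```python
import math


def gev_cdf(x, gamma, mu, sigma):
    """GEV distribution function, including the Gumbel case gamma = 0."""
    z = (x - mu) / sigma
    if abs(gamma) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + gamma * z
    if t <= 0.0:
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-(t ** (-1.0 / gamma)))


def exceedance_probability(h_cr, gamma, mu, sigma):
    """Probabilistic measure: P(M_n > h_cr) = 1 - G(h_cr)."""
    return 1.0 - gev_cdf(h_cr, gamma, mu, sigma)


def critical_level(p_cr, gamma, mu, sigma):
    """Quantile measure: the level exceeded with probability p_cr."""
    if not 0.0 < p_cr < 1.0:
        raise ValueError("p_cr must lie in (0, 1)")
    p = 1.0 - p_cr
    if abs(gamma) < 1e-12:
        return mu - sigma * math.log(-math.log(p))
    return mu + sigma / gamma * ((-math.log(p)) ** (-gamma) - 1.0)


# Hypothetical parameter estimates in cm (illustrative values only).
gamma_hat, mu_hat, sigma_hat = -0.15, 250.0, 60.0

p_warning = exceedance_probability(360.0, gamma_hat, mu_hat, sigma_hat)
p_alarm = exceedance_probability(410.0, gamma_hat, mu_hat, sigma_hat)
h_10 = critical_level(0.10, gamma_hat, mu_hat, sigma_hat)
print(p_warning, p_alarm, h_10)
```

Because the warning level lies below the alarm level, its exceedance probability is always at least as large, whatever the parameter values.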
This way, an original probabilistic model of flood risk is obtained, which the authors intend to be used to measure and assess the risk of flood-related damage.
Definition 5. The probabilistic model of flood risk based on distribution function $G_{\gamma,\mu,\sigma}$ of the generalized maxima distribution of the random variable depicting daily water levels is expressed with the formulas:

$$\begin{aligned}
R_F\left(n, G_{\gamma,\mu,\sigma}, h_{cr}\right) &= D\left(n, G_{\gamma,\mu,\sigma}, h_{cr}\right), \\
R_{FQ}\left(n, G^{-1}_{\gamma,\mu,\sigma}, p_{cr}\right) &= D_Q\left(n, G^{-1}_{\gamma,\mu,\sigma}, p_{cr}\right),
\end{aligned} \tag{11}$$

where $R_F$ is the measure of the risk of the maximum water level exceeding level $h_{cr}$ (with a set block length $n$ and a set critical level $h_{cr}$), whereas $R_{FQ}$ is the measure of the maximum water level that will be exceeded with a set critical probability $p_{cr}$. The method can be presented as the following steps of the flood risk assessment algorithm (Table 1):

Table 1. Methodological path of probabilistic quantification in analysis of flood risk.
Step  Description
1  Acquiring hydrometrical data from hydrometeorological agencies.
2  Determining maximum values for the selected n-element data sub-sets.
3  Selecting distribution and estimating its parameters.
4  Model quality assessment (compatibility test between theoretical and empirical distribution of hydrometrical data).
5  Determining the critical level.
6  Estimating the risk level, using probabilistic risk measure.
7  Descriptive assessment of risk based on the developed measurement scale.
Source: Own study.

Results
In this study, the authors apply the estimation of maxima distribution parameters to the analysis of flood risk. Hydrological data from three measurement stations located in the middle part of the Oder river were analyzed: Słubice/Frankfurt (Oder), Gozdowice, and Widuchowa. All three locations lie at the border between Poland and Germany, so the analysis applies to flood risk in cross-border areas. The data consists of daily water levels in the listed locations for the period between 1 January 2004 and 31 December 2018. To allow a detailed analysis of flood risk and the observation of trends, the data was divided into three five-year periods. For each five-year period, the block size in Gumbel's block maxima method was 30 days. Therefore, 60-element sets of realizations of random variable $M_{30;j}$ were obtained for each location. In the next step, the data was presented in the form of an empirical distribution function, and the parameters of the theoretical distribution functions of maxima were determined. To verify the compatibility (goodness of fit) of the theoretical and empirical distribution functions, two tests were used: the Anderson-Darling test and the Kolmogorov-Smirnov test. The first test stands out in that it is more sensitive to how well distributions agree in their tails. Its use for assessing model fit in this study is essential, since it is the distribution tails that define extreme phenomena. The second test is a standard, non-parametric reference method for this type of analysis; it is used here as an additional method that allows the results to be compared with those of other studies. The p-value of the Anderson-Darling test was accepted as the measure of fit between theoretical distributions and empirical data.
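The two test statistics named above can be computed directly from a sorted sample and a fitted distribution function. The sketch below is only an illustration under stated assumptions: the function names and the toy sample are hypothetical, the statistics follow the standard textbook definitions, and no p-values are computed, since those require reference distributions that depend on how the parameters were estimated.

```python
import math


def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov statistic: largest gap between the empirical
    and the theoretical distribution function."""
    ys = sorted(sample)
    n = len(ys)
    d = 0.0
    for i, y in enumerate(ys, start=1):
        f = cdf(y)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def anderson_darling_statistic(sample, cdf):
    """Anderson-Darling statistic A^2, which weights the tails more heavily."""
    ys = sorted(sample)
    n = len(ys)
    eps = 1e-12  # numerical guard against log(0); an assumption of this sketch
    f = [min(max(cdf(y), eps), 1.0 - eps) for y in ys]
    s = 0.0
    for i in range(1, n + 1):
        s += (2 * i - 1) * (math.log(f[i - 1]) + math.log(1.0 - f[n - i]))
    return -n - s / n


# Toy check against the uniform distribution on (0, 1) (illustrative only).
toy_sample = [0.1, 0.4, 0.7]
uniform_cdf = lambda u: u
print(ks_statistic(toy_sample, uniform_cdf))
print(anderson_darling_statistic(toy_sample, uniform_cdf))
```

In the setting of the study, `cdf` would be the fitted GEV distribution function and `sample` the 60 block maxima of a given location and period.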
Using the probabilistic model of flood risk (11), based on a critical water level $h_{cr}$ specific to each location and a set block length n, the probabilities of damage were estimated. Two reference levels $h_{cr}$ were assumed: alarm levels (SA) and warning levels (SO) applicable at the three hydrological stations. Thus, for the measurement point in Słubice, SA = $h_{cr}$ = 410 cm and SO = $h_{cr}$ = 360 cm; in Gozdowice, SA = $h_{cr}$ = 500 cm and SO = $h_{cr}$ = 440 cm; and in Widuchowa, SA = $h_{cr}$ = 650 cm and SO = $h_{cr}$ = 630 cm. The graphs of the empirical distribution function of the maxima of random variable $M_{30}$ were drawn according to the procedure proposed in [67]. Table 2 presents the estimation results for the parameters of distribution function $G_{\gamma,\mu,\sigma}$ of maximum water levels in the studied locations for three periods (Period I: [...]).

When analyzing the p-value of the Anderson-Darling test for all distributions, it is clear that, for each location and period, the estimated theoretical distribution corresponds to the empirical distribution at any significance level α < 0.73. Such a fit can be considered good or very good. In the case of the Kolmogorov-Smirnov test, the value $p_v$ = 0.492 does not undermine the usefulness of the model. Very high p-values (close to 1) in most cases show that the theoretical distribution proposed by the authors fits the empirical distribution of maxima very well, which successfully verifies study hypothesis H1 (Figures 1-3).

Table 2 presents the estimators of the distribution functions of maxima and the calculated statistics of the maxima distributions. Similar trends occur in each location, concerning both water levels and the nature of the observed phenomena. In the analysis of extremes, the key parameter is the gamma parameter. Statistically [62], its values describe the properties of the tails of the initial variable distributions. Values above zero yield thick tails of the reference variable distributions; in that case the share of extreme phenomena in the maxima group is relatively high. A zero value yields a thin tail, and values below zero yield truncated distributions in which certain extreme values are not observed while the lowest values have a high share (left tail). In most cases (excluding the point in Widuchowa) the gamma parameter is negative, which means the right tail is cut off. No extreme observations were found in the empirical data; in fact, there was no flooding or high water throughout the entire period. The parameter also increases in absolute value in subsequent periods, so the weight of the left tail in the distribution grows. This may mean that the risk level has decreased (when positional statistics stay the same) or that the center of the distribution (µ) is moving towards higher extremes; the latter would significantly increase the flood risk. The measurement point located in Słubice recorded clear changes in expected values over the course of 15 years. The highest value was reached between 2009 and 2014, along with an additional increase in the share of the right tail. This clearly indicates a rise in flood risk to its highest value in the entire studied period. Very similar phenomena are observed in the other two locations. Period II is characterized by high risk, and during period III the risk falls, along with a stronger representation of the lowest values in the distribution of maxima. The situation therefore concerns the entire section of the central Oder River.
The values of the estimated parameters give a specific picture of the flooding situation in each of the three locations:
• Słubice and Gozdowice: the frequency of flood-related socio-economic damage is decreasing over the studied period of 15 years. There are, however, periods of increased risk. The probability of extreme phenomena is getting lower.
• Widuchowa: measurements and calculations from this measurement point indicate a clear increase in the extreme value index in period two and a decrease in period three to the level recorded in period one. Based on the extreme value index, it is concluded that the frequency of adverse flooding phenomena in this location decreased over the fifteen years compared to the beginning of the study period.
The above estimates of the frequency, scale and variability of events reveal certain regularities and unique characteristics of the analyzed phenomena. Together, the three factors determine specific values of the probability of extreme phenomena, which makes it possible to measure the probability of flood-related damage. The probability is described by formula (11). Again, two critical values were accepted: the alarm level ($h_{cr}^{SA}$) and the warning level ($h_{cr}^{SO}$). Calculations of the risk level are presented in Table 3. The color scale of levels allows the risk level to be assessed visually [68]. The estimated probabilities formally confirm the description of the model parameters. The probabilities of exceeding alarm levels are decreasing; however, between 2009 and 2014 the risk of exceedance was high. The values of the risk measure in the three locations indicate that Słubice/Frankfurt (Oder) was the location most exposed to flooding. Note that in recent years the flood risk has been insignificant in two locations. The warning levels are lower, so the corresponding probabilities are higher; they are strongly correlated with the alarm-level probabilities (97%). In all three locations, period II was characterized by a high probability of exceeding warning levels, while the years 2014-2018 witnessed a moderate risk level. Figure 4 presents a graph of the estimation results.

Discussion
In this paper, the authors have proposed a universal mathematical tool for measuring flood risk. The presented measures are based on simulating the risk using a model developed with empirical data. The method described above provides statistical instrumentation that supports action taken to prevent future flooding incidents. Owing to their predictive function, the authors' methods of measuring and assessing flood risk belong to the ex ante sub-area of disaster economy. The authors present the application of the models using the example of three selected areas of the Oder river basin, located at the border between two Central European countries. Nonetheless, the developed tools are universal in nature: the method can be used to assess and measure risks in other regions of Europe and the world. This universality results from the multitude of parameters as well as from the typical similarities between instances of rising water levels, regardless of location. The ability to set three parameters in the model makes it highly flexible and broadens its range of applications. The authors also point out the utilitarian nature of the risk models. They can be used by flood protection agencies or urban policy planners. From the perspective of economic applicability, the presented probabilistic tools can support the profit and loss analysis often used to assess the economic efficiency and usefulness of protective investments or flood-related protective projects. The models also complement and support the methodology of estimating potential flood-related damage. This feature may be used by insurance companies and other financial institutions that are part of the environmental risk transfer system.

The authors also see some limitations in the presented models. The basic one is the need for constant access to hydrometrical and meteorological data from the areas where risk is to be measured. Additionally, raw data only describe the behavior of water in the riverbed and yield no knowledge of the scale of the threat. Without additional qualitative information, it is difficult to relate the estimated risk to a time-related or space-related threat. The models only provide information on the possibility of disaster and give no data on its scale or duration. Consequently, at this stage of the study it is difficult to relate them to the financial aspect of the expected damage. Another limitation, directly connected with the previous one, is the limited or inconsistent measurement network in some areas. This applies specifically to poorly developed areas or areas under-equipped with protective and measuring infrastructure. In the authors' view, a further limitation may result from the statistically advanced calculation algorithm and a certain arbitrariness in selecting models and data blocks for analysis. To some extent this hinders the automation of the measurement process, although at the current stage of development of computer algorithms this factor seems less significant.

Conclusions
Theoretical study and discussion of disaster economy have extended the knowledge of both hazards and their results. The development of the discipline has also provided analytical tools to measure phenomena and assess their potential effects. What is unique about a natural disaster is that it occurs relatively rarely but is accompanied by large-scale damage. Additionally, it is difficult to assess the risk of disaster, which includes a random and a deterministic component. Areas with significant exposure to risk (for instance, valleys with flooding rivers) increase their vulnerability and exposure to potential damage through economic and urban development. This changes the deterministic component of the risk. On the other hand, we are witnessing climate changes that increase the probability of disasters or make them unpredictable. This changes the random component of the risk.
While it is difficult to expect the development of regions to stop, it can take place with potential disasters taken into consideration. Preventive action can therefore be taken in this field. The action will be based on experience from natural disasters that have already happened (ex post approach) or on an attempt to predict their occurrence in the future and their influence on the economic system (ex ante approach). This need makes empirical methods of quantifying extreme phenomena very useful. Mathematical tools make it possible to estimate the likelihood of certain events, allowing a certain degree of protection against the effects of disasters and optimizing the economic growth of regions in terms of vulnerability and exposure to hazards. The main role in measuring the risk of disaster is played by the theory of the distribution of extreme values. It provides a mathematical model of phenomena that deviate from the norm, phenomena that are extreme in nature and unique in scale. As the results in this article indicate, the models describe phenomena connected with flood hazard very well. The models also make it possible to observe the variability of risk over time. The analysis covered a cross-border area between Poland and Germany, in the central Oder river valley. It may be assumed that this is a model example of phenomena in this part of Europe. The area was struck by a disastrous flood in the 1990s. An interesting question therefore arises: are there grounds to think that such a phenomenon may occur again in the foreseeable future? The application of GEV models led to interesting conclusions: flood risk within the last 5 years has been minimal, close to zero. The observed trend also shows that within the last 15 years there has been a clear change in the distribution of water levels: the share of extreme phenomena is decreasing, and there is also a certain decreasing trend in average water levels in general. Additionally, there is a decrease in the uncertainty of measurement expressed with standard deviation. Does this mean that the flood risk in the Oder valley is declining? No, it does not. There are also periods of high probability. Therefore, new perspectives for study emerge: the assessment of water levels over a long time horizon and the assessment of long-term trends. Meteorological studies in those areas indicate a relation between water levels and ever more frequent drought periods combined with a change in the nature of rainfall, which is also regarded as a disaster. It would therefore be possible to search for a model that connects all these phenomena into one concept of disaster risk. The third perspective of study noted by the authors is the assessment of flood risk in its deterministic component: potential damage. Such a study would make it possible to describe the risk model in these areas, based on the events of 1997.