An Amplification Model for the Regional Estimation of Extreme Rainfall within Orographic Areas in Campania Region (italy)

Orography strongly interacts with the atmospheric circulation, especially during frontal events, generating an enhanced spatial variability of the rainfall field. Regional models of extreme rainfall have to deal with these circumstances in order to provide good spatial estimation of the regionalized variable. We present a model for the regional estimation of the mean of the probability distribution of the annual daily rainfall maxima in a region (Campania, Southern Italy) with complex orography. In a recent work, we found that areas with enhanced variability of extreme rainfall, in the same region, correspond to a particular set of orographic objects, which had been classified through an automatic, GIS-based geomorphological procedure. Here, we propose an approach that considers the same orographic objects as building blocks for a regional model that is able to capture the amplification of extreme rainfall caused by orography. The regional model is then the product of a basic stationary random spatial process and an amplification factor, whose values are related to the topographic features of the orographic objects. This approach represents a step towards the improvement of the predictive ability of regional models of extreme rainfall within orographically complex areas.


Introduction
The interaction between atmospheric circulation and topography, especially during frontal events, produces a significant increase of the precipitation amount in orographic areas [1,2].This complex interaction also leads to an enhanced spatial rainfall variability [3][4][5][6]: this, associated with an inadequate and uneven rain-gauge network in mountainous areas, makes the observation and modeling of orographic effects a challenging task.
Here, we focus on the effect of orography on long-term probabilistic prediction of extreme rainfall at regional scale; hence, the main issue is to identify and model the effects induced by the orography over some statistical parameters of the rainfall extremes (in this work, annual daily rainfall maxima).Higher order statistical parameters are often considered either constant or smoothly varying over relatively large areas: this is a result of the sample estimation errors that usually mask spatial variability.On the contrary, estimates of low order statistical parameters, as the mean of the distribution, are less affected by sample errors and require a more accurate spatial model for regional estimation.Such statistical parameters of the annual maxima are affected by a multitude of events, with different characteristics.Physically based, meteorological models, although continuously improving, are storm-based models, therefore they are not suitable for the scope: the influence of orography needs to be embedded into the regionalization model of the parameters of the probability distribution of the rainfall extremes [7].
A common solution to the problem is to split the region into homogeneous sub-regions in order to apply regression relationships with at site descriptors of topography within the homogeneous regions [3,[8][9][10].A more sophisticated approach consists in coupling a regression model with a spatial, nonparametric model of the residuals [4,6,11].
Following the latter approach, topographic variables need to be included into the spatial interpolation methods used to map the statistical parameters of the rainfall maxima distribution.Interpolation methods, even as sophisticated as geostatistical techniques, can indeed still suffer from lack of rainfall data in mountainous areas, providing inaccurate and sometimes biased estimation of the regionalized variables in those areas.
In scientific literature, many regional studies deal with the estimation of rainfall in orographic areas using advanced kriging techniques that incorporate auxiliary topographic variables into the spatial interpolation model [3,4,6,[11][12][13].
These methods may be categorized into two groups.One includes the methods that consider the external information directly in the kriging system, such as the kriging with an external drift or cokriging.The other consists of methods that first deal with the external information, for example by fitting a regression among the field and the external variables, and then apply an interpolation procedure, typically based on kriging, to the residuals.These methods are often referred to as "detrended kriging" [4].
This approach represent a significant improvement over simple regression analysis, and gives results whose quality depends on the physical meaning and complexity of the topographic descriptors and of the regression model.However, all these methods introduce local or locally averaged descriptors of the topographic surface as external variables for the orographic effects modeling.These descriptors are generally obtained through more or less complex elaboration of Digital Elevation Models (DEM): thus, they are not based on the identification of "orographic object" but only on local properties of the topography.
This approach to regional modeling of extreme rainfall can be improved by introducing an objective description and identification of orographic features [7,14].This allows dividing the region into "topographically" homogeneous areas, for which different spatial rainfall variability is considered.This is what we basically want to show.
In Section 2, we present the study area, the available rainfall data and the current approach for the regional modeling of extreme rainfall as well as findings from previous works on the study area related to the influence of orography over the parameters of the probability distribution of the rainfall extremes.In Section 3, we define all the methods and models adopted.Finally, in Section 4 the results of the application of the proposed model for the regional estimation of the mean of the probability distribution of the annual daily rainfall maxima in the study area.

Description of the Study Area
The study area is the Campania region and some surroundings, in Southern Italy (Figure 1), located between the Apennines and the Tyrrhenian Sea.
The region is characterized by Mediterranean climate, with dry summers and wet winters.The mean annual precipitation ranges from 800 to 1100 mm/year.
The precipitations are largely generated by wet air masses that come predominantly from the Tyrrhenian Sea (i.e., southwest-northeast direction) and they are strongly influenced by the interaction with orography [15,16].
Figure 1 shows the location of the region and its terrain elevations by DEM.The area is characterized by a complex topography (Figure 1), which Cuomo et al. [14] schematized through an automatic, objective, Geographical Information System (GIS)-based geomorphological procedure.This procedure identifies orographic entities by means of topographic concepts of key contour, key saddle, summit point and prominence (Figure 2).The study in [14] provides orographic entities, hierarchically organized in nested orographic classes.Based on the results of a recent study [7], we have considered groups (4th order) and systems (3rd order) as meaningful objects for the regionalization study here presented (Figure 3).
Table 1 provides the extent of the selected orographic objects within the study area compared to its total extension.

Rainfall Data
The rainfall database comes from the VAPI (VAlutazione delle PIene, Flood Estimation) project [9,10], a national project for the regional analysis of the frequency of extreme rainfall and river floods, redacted within the National Research Council (CNR, Consiglio Nazionale delle Ricerche).
In Figure 3, the map of the region is presented along with the location of the 605 rain-gauges used for the analysis, among which 245 are placed in the region of interest.Besides the rain gauges located in the study area, we considered a convenient number of gauges in the surrounding area in order to avoid the presence of border errors in the spatial analysis.
In this work, we use the annual maxima of daily rainfall because they represent the most reliable and largely available data in the region.The length of the time series varies from 5 to 77 years, collected from the beginning of the last century until the year 2000, and it is on average equal to 45 years.The maximum value of the mean of the annual maxima of daily rainfall is 134.7 mm, the minimum is 39.3 mm, the average is 69.8 mm.The average coefficient of variation, Cv, is 0.353 and the average coefficient of skewness is 1.254.Location of the study area with the rain gauges (triangles), the "anomalous" rain gauges (squares) (Section 2.3.2 and [7]) and the orographic barriers (solid grey) [14].

The VAPI Model for Extreme Rainfall in the Region
The most recent and complete analysis of extreme rainfall at a regional scale for the study area was conducted within the VAPI national project [10] and described in the VAPI Campania regional report [9].
The VAPI procedure is based on the use of TCEV (Two Components Extreme Value, [17]) as a parent probabilistic model for the annual maxima of daily and sub-daily rainfall.The probabilistic model is coupled with a three level hierarchical procedure for the regional estimation of the parameters.The regional model is a simple model of homogeneous region at each level [8].
The first and second levels of regionalization deal with the regional estimation of the shape and scale parameters of the distribution.At both levels, the whole region can be considered homogeneous, with the shape and scale parameters that assume unique values everywhere.
At the third level, the regionalized variable is the mean of the annual maxima of daily (and hourly) rainfall.Here we report only the results for daily rainfall, which are the data analyzed in our work.
The regional analysis conducted within the VAPI project led to the identification of six homogeneous sub-regions at the third level.In each sub-region, a linear relationship between elevation and logarithm of mean annual daily rainfall maxima was estimated through a regression analysis.
The boundary of the homogeneous sub-regions are represented in Figure 4 along with the lines representing the six linear regressions that were fitted.

Identification of Spatial Outliers in Extreme Rainfall
In a recent work, we identified areas with enhanced spatial variability of the mean of the annual daily rainfall maxima through an innovative statistical procedure [7].The procedure demonstrates how linear geostatistics is not able to capture the spatial variability of the regionalized variable in areas with a strong orographic forcing.It is an iterative statistical procedure applied to the logarithmic transformation of the original series of the annual daily rainfall maxima.The logarithmic transform was applied in order to obtain an approximately Gaussian field, for which linear geostatistics works the best.
Let Y(x) be the mean of the annual maxima of daily rainfall, Z(x) denote its logarithmic transformation as follows: The detection of the spatial non linearity, where the term "non-linearity" is used to characterize areas where linear geostatistics fails, due to an enhanced small scale variability of the field Z(x) was performed by analyzing the residuals of a cross-validation procedure in an innovative way.
The Kriging with Uncertain data (KUD) [18] estimator was run for the jack-knife estimation of the field in all the sampling points and then the standardized cross-validation residuals were calculated as follows: where ( ) is the observed value of the field at sampling sites xi, is the jackknife KUD estimation of the field at sampling sites xi and σ denotes the estimated predictor variance (for theoretical details, refer to [7]).The cross-validation residual at a given sampling site, as defined in Equation ( 2), represents the sum of the interpolation and the sampling errors at that site.Under the hypotheses of the kriging predictor, the standardized cross-validation residuals should follow a standardized Gaussian distribution.Because of the orographic effects, instead, there are "anomalous" errors in some gauges.The "anomalous" errors can be identified as outliers on the upper tail of the error probability distribution in the Normal Probability Chart, which facilitates the comparison between the empirical and the expected theoretical distribution of the residuals.We considered as outliers the residuals whose 95% confidence interval (in blue in Figure 5) deviates from the straight line (in red in Figure 5), which represents the theoretical distribution (i.e., standardized Gaussian distribution) of the residuals on the Normal Probability Chart (Figure 5).Hereinafter, we refer to these sampling sites/rain gauges where the outliers occur as "anomalous" sites/rain gauges or as spatial outliers.In these sites, the kriging predictor fails with an unacceptable underestimation of the field.Figure 5 displays the Normal Probability Chart of the standardized residuals.In the upper tail of the distribution, within the black square, there are the anomalous errors, which significantly deviate from the theoretical distribution.These errors correspond to rain gauges marked as anomalous in Figure 3.
The statistical procedure detected the presence of 19 spatial outliers in the study area.The following step was to interpret the nature of these spatial outliers and highlight the physical factors that influence the rainfall field in such a way to produce a rainfall amplification that is so evident in the mean of the annual maxima.
In Table 2, we show some basic statistics of the data (i.e., annual daily rainfall maxima) in the study area, discerning between the anomalous subset and the remaining rain gauges.These statistics reveal that the average value of the field over the region (i.e., the average mean) is much higher in the anomalous gauges than in the other gauges of the region while the other statistics (i.e., average coefficient of variation, Cv, and average skewness) do not change as much.In particular, the circumstance that the skewness does not increase for the anomalous subset suggests that the mean is not affected by the presence of a few very intense events in the series but there is a more general amplification of the events producing the annual maximum.
Table 2. Statistics of the data in the study area, with distinction among the whole dataset, the anomalous subset and the remaining rain gauges [7].In a climatologically homogeneous region like the one analyzed, orography may be identified as the main characteristic that may cause this amplification of the events producing the annual maxima.This is confirmed by a first examination of the results by comparing the anomalous rain gauges placed in the study area with the location of the orographic areas defined in Section 2.

Set of Rain
The results of the comparison are interesting (see Table 3 for the details):


The anomalous sites are located very close to orographic barriers: about 70% of the anomalous rain gauges are placed in orographic areas whereas only less than 20% of the gauges are located in those areas. About 30% of the rain gauges placed in orographic areas are anomalous, against 3% of anomalous gauges in the other areas.

Formulation of the Regional Model for the Mean of the Annual Daily Rainfall Maxima
The outcomes reported in the previous Section confirm the potential value of an approach based on the identification of orographic objects for the improvement of regional models of extreme rainfall.Here, we introduce a preliminary model for the regional estimation of the mean of the probability distribution of the annual daily rainfall maxima that gives a better interpretation of variability of the rainfall field in orographic areas.Despite its simplicity, this model is already very effective in increasing the accuracy of the regional estimates of the selected parameter.
We suppose that the mean of the annual maxima of daily rainfall, Y(x), can be seen as the product of a basic, weakly stationary process, Y'(x), and of an amplification factor, AF, that arises only in orographic areas: Then, by taking the logarithm: The basic, weakly stationary process, or, better its logarithm, Z'(x), can be computed by using classic linear spatial interpolation techniques, such as KUD from the rainfall measurements in the valleys and in the plains.
The Amplification Factor, AF, corrects the linear estimations where it is necessary, i.e., in orographic areas as highlighted in Section 2. The Amplification Factor is computed accordingly an Amplification model (Section 3.2).

The Amplification Model
Here, we propose a model for the assessment of AF by relating its values to some global features of the orographic barriers.In the following, a set of topographic indices are evaluated for the estimation of AF in orographic areas.

Computation of Topographic Indices
Each isolated orographic object in the study area (Figure 3) has been treated as a unique homogeneous shape, for which a set of topographic indices has been computed, by means of a DEM having a spatial resolution of 20 m (Figure 1).
We considered five topographic indices that are representative of the process of orographic interaction and amplification of extreme rainfall [1,4]: (i) Mean slope, denoted as Sl (%), which is strictly related to the rate of climb of the wet air masses in proximity to the orographic object.(ii) Average elevation, zmean (m a.s.l.), which is representative of the force, which leads up masses over orographic obstacles.(iii) Maximum elevation, zmax (m a.s.l.), which could state the intensity of the force, which leads up masses over orographic obstacles.(iv) Prominence, P (m), defined as the difference between the maximum elevation within the shape, zmax, and the average elevation of its perimeter line.It could express the presence of orographic obstacles in relation to the neighborhood.(v) Exposition, cosϕ, defined as the cosine of the angle between the dominant direction of the wet air masses and the principal direction of inertia of each orographic object, related to the minimum moment of inertia (Figure 6).
We believe that the principal direction of inertia related to the minimum moment of inertia can conveniently represent the orientation of the orographic object itself.
In the Tyrrhenian area, the dominant direction of the wet air masses is known to be the direction southwest-northeast, which is inclined about 45° above the horizontal (west-east) line.In this case, a further statistical assessment have been done: We defined the dominant direction as the reference direction for measuring the angle ϕ, such that the linear correlation between ln(AF) and cosϕ is maximum.The result was a dominant direction with inclination 41° above the horizontal (Figure 6), which is in good agreement with the fact that rainfall events in this area prevalently come from the southwest quadrant.
Exposition is important because it affects the dynamic of the interaction between the orographic obstacles and the wet air masses.
Figure 6 shows the orographic objects and their orientation by means of the principal directions of inertia related to the minimum moment.

Computation of AF
In our model, the Amplification Factor arises only in orographic areas, so that outside orographic areas its value is equal to the unity.Inside orographic areas, Equations ( 3) and ( 4) are useful for its evaluation, so that the logarithm of AF can be computed as the difference between the observed value Z(xi) and the value Z′(xi) estimated by the Kriging with Uncertain Data predictor using only the rainfall measurements in the valleys and in the plains, as follows: 0 outside orographic areas ln( ) ( ) '( ) inside orographic areas By means of Equation ( 5), we evaluated ln(AF) in each one of the 47 rain gauges, placed in orographic areas (Figure 7) and, then, we assigned to each orographic object a representative value of ln(AF), evaluated through the arithmetic mean of the values of ln(AF) related to the rain gauges within the object.

Regression Analysis
In our model, the values of the Amplification Factor (Section 3.3) are simply related to some global features of the orographic barriers, expressed by means of some more significant topographical indices (Section 3.2).This choice is mainly driven by the limited availability of rainfall data in these areas.
Univariate regressions between the Amplification Factor and the various topographical indices were first carried out.Then multiple regression analysis was performed.
Criteria to be respected are that the explanatory variables should not be highly correlated and they should represent physical explanations for AF variations.
The regression analysis has been performed using the data recorded in all 47 rain gauges (not just the anomalous ones) placed in orographic areas (Figure 7).In particular, to proper select a combination of topographic indices and build the Amplification model by means of an eventually multivariate regression, we evaluated all the possible combinations of topographic indices, Xi, considering both a two-variate model and a three-variate model, as follows: The best combinations were found considering both the coefficient of determination, R 2 , and the adjusted R 2 , R 2 adj, which allows to take account of the circumstance that R 2 automatically and spuriously increases when extra explanatory variables are added to the model.R 2 adj was computed as follows: ( ) where p is the total number of explanatory variables (i.e., topographic indices) in the model (not including the constant term), and n is the sample size.
Unlike R 2 , the adjusted R 2 increases only when the increase in R 2 , due to the inclusion of a new explanatory variable, is more than one would expect to see by chance.

Results and Discussion
In this Section, we firstly report the results of the regression analysis for the definition of the Amplification model.Then, we illustrate the results related to the application of the complete regional model for the estimation of the mean of the annual daily rainfall maxima to the study area by comparing its final performances against the actual regional model (i.e., VAPI) and a pure geostatistical interpolation.

The Amplification Model
Here, we considered each orographic object in the study area as a unique homogeneous shape for which we computed some topographic indices and the average value of ln(AF) (as stated in Section 3).Table 4 shows these values for the 14 orographic objects in the study area (numbered as in Figure 7).The former step toward the regression analysis was to look at the correlation matrix to investigate at the same time the dependence between the multiple variables involved in the analysis.As we can see, from Table 5, the variables are all correlated to each other: the correlation between some topographic variables, such as exposition and slope or exposition and prominence is strictly related to the geological process that led to the formation of the orography itself.The correlation between prominence and slope, prominence and zmean, prominence and zmax is instead mainly related to the definition itself of the prominence (Section 3.2) so that, as expected, the highest correlations are between P and these variables.From Table 5, it is possible extract the values of R 2 related to the univariate regressions between ln(AF) and each topographic index by squaring the values in the first row, as reported in parenthesis.The higher correlation is between ln(AF) and prominence, followed by slope and exposition.
Then, Table 6 shows the top five models (having the best fits among all the possible combinations in term of R 2 and R 2 adj) in the case of multiple regressions, as in Equations (6a) and (6b).
It is clear that the exposition is a crucial explanatory variable in the bivariate regression, followed by slope.The introduction of a third variable into the model should be chosen among prominence, zmean and zmax.All the multivariate models, except M22, provide very similar results.Following a pure statistical criterion, the bivariate regression M21 is to be considered the most meaningful, because it provides the best result in terms of adjusted R 2 .Nonetheless, the tri-variate model M31 is also interesting, despite its poorer statistical performances, because it includes a physically meaningful variable, prominence (P), which can improve the accuracy of the model in certain circumstances.
Then, the chosen amplification model, M31, states: In Figure 8, we show the comparison between the observed ln(AF) and the model M31 results, as shown in Equation ( 7) (the differences with M21 are very small, though).The perfect agreement is obtained in correspondence of the solid line, with a slope of 45°.The points represent the fourteen orographic shapes of the study area.The alignment of the points in the graph shows a significant residual dispersion, which suggests that the amplification model can be furtherly improved.In this study, we had to use a relatively simple model because of the poor density of the rain gauge network in orographic areas: This is a consequence of the historical inaccessibility of these places due to morphology, vegetation and lack of infrastructure.However, interesting considerations can be made once model performances in terms of estimated mean, estimated variance and MSE are computed, as follows in the next Section.

Regional Model Performance
Finally, the performances of the proposed regional model ( 4) + (8) are compared with three spatial interpolation techniques: (i) linear regression with elevation, using the formulas given by VAPI report [9]; (ii) linear regression with elevation, following the same approach of the VAPI report but updating the formulas with the new and extended database; and (iii) Kriging with Uncertain Data.The comparison was performed in terms of estimated mean, estimated variance, MSE, net MSE, dimensionless RMSE (ρ) and maximum error, distinguishing the whole region and orographic areas.. To the net MSE, we subtracted the average variance of the sampling errors in the study region (equal to 0.0044) from the MSE.The contribution to the MSE given by the variance of the sampling errors, in fact, cannot be eliminated.For this reason, the comparison between different models is more significant when done in terms of net MSE rather than MSE.
The numbers in Table 7 refer to the (natural) logarithm of the mean annual daily rainfall maxima.The improvements in all the metrics are encouraging, especially in orographic areas (Table 7), where we have a reduction of the net MSE almost equal to 65% compared to the current regional model for the mean of the annual daily rainfall maxima (i.e., VAPI).In the whole region, compared to the current VAPI model for daily rainfall maxima, we have an improvement of the net MSE close to 50%.

Conclusions
We presented a preliminary model for the regional estimate of the mean of the annual daily rainfall maxima in a region with complex orography.This model can be included into regional models for the long-term probabilistic prediction of extreme rainfall.The reasons that led us to approach the problem in this way are presented as background in Section 2, which mainly reports the work in [7] along with the most recent regional studies in the area of interest [9].
The identification of spatial anomalies in orographic areas highlights how linear geostatistics is unable to describe the spatial variability of the rain field in areas with a strong orographic forcing.A preliminary model to deal with this circumstance is then suggested.
The comparison between the proposed model and other regionalization methods shows that the former produces better regional estimates of the mean of the annual maxima of daily rainfall.This is particularly evident in regard to the linear regressions with elevation within homogeneous sub-regions, which is the current model used for extreme rainfall estimation in the study area.The improvements allow a better regional estimation in orographic areas and provide encouraging results for future developments of this new approach to regional modeling of extreme rainfall in orographic areas.The improved interpretation of spatial variability of the extreme rainfall field can also provide the support for the development of better criteria for rain gauge and meteorological radar network design.
With reference to the region of this study, it is evident that any further improvement of the regional model is deeply linked to a more accurate observation of the phenomenon, by updating the rain gauge network and, even better, by integrating it with meteorological radars.Depending on the quantity and quality of data available, a first step could consist in introducing separate models of upwind and downwind hillslope.More sophisticated models could consider the eventual interaction between different orographic objects as well.

Figure 1 .
Figure 1.Location of Campania region with DEM.

Figure 4 .
Figure 4. Map of the six homogeneous sub-regions for the mean annual daily rainfall maxima and linear relationships with elevation from VAPI Campania report.Reproduced with permission from Rossi, F. et al.Valutazione Delle Piene in Campania; Published by CNR-GNDCI, 1995 [9].

Figure 5 .
Figure 5. Normal Probability Plot of the standardized residuals from cross-validation.Reproduced with permission from Furcolo, P.; Pelosi, A.; Rossi, F. Statistical identification of orographic effects in the regional analysis of extreme rainfall.Hydrol.Process.2015 [7].

Figure 6 .
Figure 6.Principal directions of inertia related to the minimum moment of the orographic object.

Figure 7 .
Figure 7. Rain gauges in orographic areas.The orographic shapes are numbered with two digits: the first one denotes the order (groups, 4th order, and systems, 3rd order), the second one identifies progressively the object.

Figure 8 .
Figure 8.Comparison between the observed ln(AF) and its estimation.

Table 1 .
Extent of the study area and of the orographic areas (4th and 3rd order).

Table 3 .
Main statistics referred to the location of the "anomalous" rain gauges.

Table 4 .
Values of ln(AF) and topographic indices for each orographic object.

Table 6 .
Regression coefficients, coefficient of determination, R 2 , and adjusted coefficient of determination, R 2 adj for five selected multiple regression models.

Table 7 .
Comparison with the main statistics related to different methods of spatial estimation.The proposed model is in the last column, indicated as KUD+AF.