Characteristics of the Underestimation Error of Annual Maximum Rainfall Depth Due to Coarse Temporal Aggregation

This study analyzed all characteristics of the error committed in evaluating annual maximum rainfall depth, Hd, associated with a given duration, d, when data with coarse temporal aggregation, ta, were used. It is well known that when ta = 1 min, this error is practically negligible while coarser temporal aggregations can determine underestimation for a single Hd up to 50% and for the average value of sufficiently numerous series of Hd up to 16.67%. By using a mathematical relation between average underestimation error and the ratio ta/d, each Hd value belonging to a specific series could be corrected through deterministic or stochastic approaches. With a deterministic approach, an average correction was identically applied to all Hd values with the same ta and d while, for a stochastic correction, a thorough knowledge of the statistical characteristics of the underestimation error was required. Accordingly, in this work, rainfall data derived from many stations in central Italy were analyzed and it was assessed that single and average errors, which were both assumed as random variables, followed exponential and normal distributions, respectively. Furthermore, the single underestimation error was also found inversely correlated to the corresponding annual maximum rainfall depth.

Rainfall data may be available with temporal aggregation (or time resolution), ta, that varies depending on the technologies used (for example, a recording system with paper rolls, which was largely adopted in the past, allowed hourly or half-hourly aggregations) and the procedures followed by the rain gauge station manager. Nowadays, through the use of tipping bucket sensors, all tipping times are recorded in a digital data logger, with each tip equal to a known depth (typically 0.1 mm or 0.2 mm). Rainfall characteristics such as rain rates are then obtained by aggregating the number of tips over a selected ta, which varies between 1 min and 24 h.
When rainfall data are aggregated, their analysis at a time scale smaller than the adopted ta is not possible. In addition, the specific choice of ta can also influence the results of analyses involving durations greater than or equal to ta itself [20].
It is well known that rainfall data recorded at fixed temporal aggregations may underestimate true maximum accumulations for durations, d, equal to or comparable with ta [21][22][23][24][25]. Reference [21] observed that multiplying the results of a frequency analysis of Hd by 1.13 yields values very close to those derived from analyses based on true maxima. Reference [22], by assuming a uniform rainfall occurring over the duration of interest, developed a theoretical relationship between the sampling adjustment factor (SAF) and ta/d, where SAF is the average ratio of the true maximum accumulation to the maximum given by a fixed-interval gauge record. Reference [23], by using high temporal resolution data, found a relationship between SAF and ta/d more consistent with other empirical studies [26][27][28]. However, the length of the analyzed series, in the range of 5.3 to 14.9 years, was too limited to yield results of general validity. Reference [24], following the theoretical approach of [21], found that a correction factor (CF), a parameter with the same practical meaning as the SAF adopted by Reference [23], depends on the rainfall temporal distribution. More recently, Reference [25] defined a procedure to derive quasi-homogeneous Hd series starting from inhomogeneous series in which a percentage of values is obtained from coarse rainfall data (typically old data) and another percentage is derived from continuous data (recorded in the last 20 to 30 years). In detail, for each d value, a mathematical relation between the average underestimation error, Ea%, and the ratio ta/d can be used to correct the Hd values. This correction can be carried out with either a deterministic or a stochastic approach.
With a deterministic approach, an average correction is identically applied to all Hd values characterized by the same ta and d. A more realistic correction calls for a stochastic approach, which requires a thorough knowledge of the statistical characteristics of the underestimation error. In the latter case, each underestimated Hd value is corrected by using a specific value of the single underestimation error, E%, treated as a random variable.
Among the most important products derived from precipitation data are the rainfall depth-duration-frequency relationships [29][30][31][32], which can be determined starting from the annual maximum rainfall depths, Hd, cumulated over different durations, d [33]. The use of uncorrected Hd series yields depth-duration-frequency curves that are underestimated by 5% to 10%, depending on the return period and duration [25]. Furthermore, the underestimation of Hd values due to coarse ta plays a significant role in the analysis of the effect of climatic change on extreme rainfall. In fact, the correction of the values can change the sign of the trend from positive to negative, especially for series where the probability of the presence of values with ta/d = 1 is particularly high [34].
On this basis, the main objective of this paper is to analyze the behavior of the probability distribution function for both the single and the average underestimation error. The second task of this study is to define, if it exists, the correlation law between the single underestimation error and the corresponding annual maximum rainfall depth.

Case Study
The rainfall data used in this study were observed in a geographical area of central Italy known as the Umbria region (8456 km²). This area has a very complex orography along the eastern boundary, where the Apennine Mountains exceed 2000 m a.s.l., while it is mainly hilly in the central and western areas, with elevation in the interval 100 m a.s.l. to 800 m a.s.l.
The mean annual rainfall is about 900 mm and ranges over the region from 650 mm to 1450 mm (on the basis of the observation period 1921-2015 and a network of 93 rain gauges). Higher monthly rainfall values generally occur during the autumn-winter period, with floods caused by widespread rainfalls. Mean annual air temperature ranges on a monthly basis from 3.3 °C to 14.2 °C, with maximum values during July and minimum values during January.
A large percentage of the study area is included in the Tiber River basin. In fact, the Tiber River crosses the region from the North to the South-West, receiving water from many tributaries, which are mainly located on the hydrographic left side.
The Umbria region is presently monitored through a network characterized by one rain gauge for each 90 km², with all devices continuously connected to a central unit through a radio link, while, before 1992, only 18 rain gauges with measurements made every 30 min were installed.
In this study, 16 rain gauge stations characterized by the best quality of continuous rainfall data recorded for at least 20 years are considered. Their geographic position and main characteristics are reported in Figure 1 and Table

Methods
The accumulated rainfall recorded over a time interval d, xd, can be obtained through the following relation [35,36].
$$x_d(t) = \int_{t}^{t+d} x(\xi)\, d\xi \qquad (1)$$
where x(t) is the rainfall rate at time t.
Then the annual maximum rainfall depth over a duration d, Hd, may be easily derived through the following equation.
$$H_d = \max\left\{ x_d(t) : t_0 < t < t_0 + 1\ \text{year} - d \right\} \qquad (2)$$
with t0 equal to the initial instant of the year.
To obtain Hd for each year, rainfall data with any ta ≤ d are required. As can be seen in Figure 2, the Hd value is sometimes correctly estimated (Figure 2a), while sometimes it is underestimated (Figure 2b,c), with single errors that may reach 50% (Figure 2c).
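A discrete counterpart of the accumulation and annual-maximum definitions above can be sketched as follows. This is only an illustration, assuming a one-year record of rainfall depths at a regular time step; the function name `annual_max_depth` and the sample values are hypothetical and not taken from the study.

```python
import numpy as np

def annual_max_depth(depths_mm, d_steps):
    """Largest depth accumulated over any window of d_steps consecutive records."""
    depths = np.asarray(depths_mm, dtype=float)
    if d_steps < 1 or d_steps > depths.size:
        raise ValueError("d_steps must lie between 1 and the record length")
    # Prefix sums give every moving-window total with one subtraction.
    csum = np.concatenate(([0.0], np.cumsum(depths)))
    window_sums = csum[d_steps:] - csum[:-d_steps]
    return float(window_sums.max())

# Hypothetical record: a short shower inside an otherwise dry hour.
one_hour = np.zeros(60)
one_hour[20:25] = [0.2, 0.4, 1.0, 0.6, 0.2]
print(annual_max_depth(one_hour, 3))  # maximum 3-step accumulation
```

The moving window is shifted by one record at a time, which mimics the continuous search for the maximum only when the record step is much smaller than d.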
For every duration d, a very long Hd series is affected by underestimations that depend on both ta and the shape of the rainfall pulses [25]. If we consider pulses with a rectangular shape, when ta = d the average error, Ea%, is equal to 25% because all error values between 0% and 50% have an equal probability of occurrence, while in the case of triangular pulses the average underestimation becomes 16.67%. However, an analysis of a large number of rainfall events observed at different stations and for different d values highlights that, before and after the peak, the rainfall depth exhibits a steeper trend. For instance, Figure 3 shows three rainfall events associated with the Hd for d = 60 min that were observed at a station in central Italy. Therefore, the actual value of Ea% should be less than the theoretical value of 16.67% [24].
We note that, in principle, underestimation errors in determining the Hd values cannot be eliminated, regardless of the adopted ta. However, if d = ta = 1 min for an extreme rainfall event of intensity equal to 300 mm/h, the underestimation error becomes negligible. Furthermore, considering that the durations of interest for Hd are typically ≥5 min, observed rainfall characterized by ta = 1 min may be assumed as continuous data characterized by a negligible error. In any case, the largest average error occurs for ta = d and decreases when the ratio ta/d decreases.
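The theoretical averages quoted above for ta = d (25% for rectangular pulses and 16.67% for triangular pulses) can be checked numerically. The sketch below rests on stated assumptions that are not spelled out in the text: the pulse lasts exactly d, the position of the aggregation boundary inside the pulse is uniformly distributed, the triangular pulse is symmetric, and the recorded maximum is the larger of the two portions into which the boundary splits the pulse. All names are hypothetical.

```python
import numpy as np

def mean_error_percent(cumulative_fraction, n=200_001):
    """Average underestimation (%) for a pulse of duration d = ta.

    cumulative_fraction(b) returns the share of the pulse depth fallen
    before the aggregation boundary located at relative position b in [0, 1].
    """
    b = np.linspace(0.0, 1.0, n)          # uniformly spread boundary positions
    before = cumulative_fraction(b)
    missed = np.minimum(before, 1.0 - before)  # smaller portion is lost
    return 100.0 * float(missed.mean())

def rectangular(b):
    return b                               # constant intensity

def triangular(b):
    # Symmetric triangle with its peak at mid-duration.
    return np.where(b <= 0.5, 2.0 * b**2, 1.0 - 2.0 * (1.0 - b) ** 2)

print(round(mean_error_percent(rectangular), 2))
print(round(mean_error_percent(triangular), 2))
```

Under these assumptions the largest single error is 50%, obtained when the boundary falls exactly at mid-pulse.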
Considering rainfall data with temporal aggregations in the interval of 1 to 1440 min, the methodology adopted in this study mainly consists of the analysis of a large number of underestimation errors obtained during the evaluation of annual maximum rainfall depths with durations between 10 min and 2880 min.
Starting from the rainfall measurements (tipping times) for each selected station, aggregated data with the following ta were obtained: 1 min, henceforth denoted as "Observed"; 10 min, 15 min, 30 min, 60 min, 180 min, 360 min, 720 min, and 1440 min, henceforth all denoted as "Generated". This procedure is illustrated in Table 2 for a reduced set of rainfall data observed at the Gubbio rain gauge. As can be seen, the first column on the left represents measured rainfall depths with temporal aggregation equal to 1 min. The other columns ("Generated" rainfall depths) can be derived from the first through simple additions. For example, the top rainfall depth with ta = 10 min, equal to 1.6 mm, is the sum of the top 10 observed values (0.2, 0.2, 0.0, 0.2, 0.2, 0.4, 0.0, 0.2, 0.2, 0.0).
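The derivation of "Generated" depths from "Observed" 1-min depths can be sketched as below, using the ten values quoted above. The function name `aggregate` is hypothetical, and discarding a trailing incomplete interval is an assumption made here for simplicity.

```python
import numpy as np

def aggregate(depths_1min, ta_min):
    """Sum 1-min depths over consecutive, non-overlapping intervals of ta_min minutes."""
    depths = np.asarray(depths_1min, dtype=float)
    n_blocks = depths.size // ta_min       # trailing incomplete interval is dropped (assumption)
    return depths[: n_blocks * ta_min].reshape(n_blocks, ta_min).sum(axis=1)

first_ten = [0.2, 0.2, 0.0, 0.2, 0.2, 0.4, 0.0, 0.2, 0.2, 0.0]
print(aggregate(first_ten, 10))            # one 10-min depth, about 1.6 mm
```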
For all selected stations and considering some typical durations (≤1440 min), all annual maximum rainfall depths may be determined by using both the "Observed" and "Generated" data. Obviously, for each set of rainfall data, Hd can be evaluated only for d ≥ ta. Considering all Hd values derived from the "Observed" rainfall data as a benchmark, the annual maximum rainfall depth errors due to the use of rainfall data with a coarse ta ("Generated") can be determined. For example, Tables 3 and 4 report the single underestimation errors for the Nocera Umbra rain gauge considering ta values equal to 30 and 15 min, respectively. As can be seen, for fixed ta and d, underestimations vary randomly from year to year. In Table 3, for ta = d = 30 min, the minimum underestimation error is practically negligible (0.01% in 2009), while the maximum reaches 48.74% in 2010. It may be deduced that significant underestimations occur when ta is equal to d, whereas they become negligible when ta/d ≤ 0.1.
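The comparison between "Observed" and "Generated" maxima can be sketched as follows. The error is assumed here to be the percentage difference relative to the benchmark value, which is consistent with the 0-50% range but is not stated as a formula in the text; the sketch also assumes that d is a multiple of ta. Function names and the synthetic record are hypothetical.

```python
import numpy as np

def sliding_max(depths, steps):
    """Maximum total over any window of `steps` consecutive values."""
    csum = np.concatenate(([0.0], np.cumsum(np.asarray(depths, dtype=float))))
    return float((csum[steps:] - csum[:-steps]).max())

def underestimation_error(depths_1min, ta_min, d_min):
    """Single error E% (assumed definition: 100 * (H_obs - H_gen) / H_obs)."""
    if d_min % ta_min != 0:
        raise ValueError("this sketch assumes d is a multiple of ta")
    depths = np.asarray(depths_1min, dtype=float)
    h_obs = sliding_max(depths, d_min)                 # benchmark from 1-min data
    if h_obs <= 0.0:
        raise ValueError("record contains no rainfall")
    n_blocks = depths.size // ta_min
    generated = depths[: n_blocks * ta_min].reshape(n_blocks, ta_min).sum(axis=1)
    h_gen = sliding_max(generated, d_min // ta_min)    # maximum from coarse data
    return 100.0 * (h_obs - h_gen) / h_obs

# Synthetic 30-min rectangular pulse split evenly by a 30-min boundary.
record = np.zeros(120)
record[15:45] = 1.0
print(underestimation_error(record, ta_min=30, d_min=30))
```

With the pulse centred on the aggregation boundary, each 30-min interval records half of the true depth, which reproduces the limiting case of Figure 2c.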

Single Error Analysis
With the main purpose of determining the best probability function representing the single underestimation error, assumed as a random variable, all errors on the annual maximum rainfall depth due to the use of "Generated" data characterized by temporal aggregation in the range of 10 to 1440 min were first grouped considering different ta/d ratios. Then, dividing the well-known range of possible individual errors (0-50%) into a reasonable number of classes, the relative frequency of underestimation errors was determined for each ta/d. As an example, Figure 4 shows the histograms of the relative frequency for three representative cases. Figure 4c points out that, for low values of the ta/d ratio, an adequate representation of the error frequency is possible only if a reduced class amplitude is considered (see Figure 5).
Since a specific ta/d ratio may result from different ta values, it has been verified whether this dependence produces any significant effect on the relative frequency histograms. Figure 6 shows the behavior of three histograms for cases with the same ta/d (equal to 1) but with different ta (equal to 30 min, 60 min, and 1440 min).
Based on the analysis of Figures 4-6, the shape of the histograms appears to be the same independently of ta and d. According to the most common statistical tests (Pearson, Kolmogorov-Smirnov, and Anderson-Darling), these trends are best represented by the exponential distribution, which is characterized by the following probability density and cumulative distribution functions, respectively.
$$f(E_{\%}) = \lambda e^{-\lambda E_{\%}} \qquad (3)$$
$$F(E_{\%}) = 1 - e^{-\lambda E_{\%}} \qquad (4)$$
where the parameter λ (>0) is the inverse of the sample expected value.
On this basis, the characterization of the single underestimation error only depends on the parameter λ. Considering all rainfall data observed in the selected stations and all possible combinations of ta and d, Figure 7 displays λ values as a function of the ta/d ratio. The best interpolation function is given by Equation (5).
However, from our results, we observed that the assessment of λ could be further improved by splitting Equation (5) on the basis of the d value, because of its link with the shape of the rainfall hyetograph that influences the error magnitude [24]. Rectangular hyetographs were typically observed for d up to 30 min, triangular hyetographs for greater values of d up to 180 min, and pulses represented by quadratic functions [24] for larger values of d. As a consequence, three relations, Equations (6)-(8), which are plotted in Figure 8, were derived.
For each ratio ta/d and for each d value, Equations (6)-(8) can be used to obtain the parameter λ to be applied when the generation of E% values is necessary.
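The role of λ in the exponential model can be illustrated with a short sketch. It estimates λ as the inverse of the sample mean, as stated above, and draws synthetic E% values by inverting the cumulative distribution function. Truncating the draws to the 0-50% range is an assumption introduced here so that generated errors stay within the physically possible interval; the function names and the three sample errors are hypothetical.

```python
import numpy as np

def fit_lambda(errors_percent):
    """Estimate lambda as the inverse of the sample mean of E%."""
    return 1.0 / float(np.mean(errors_percent))

def exp_pdf(e, lam):
    return lam * np.exp(-lam * e)

def exp_cdf(e, lam):
    return 1.0 - np.exp(-lam * e)

def sample_errors(lam, n, rng, upper=50.0):
    """Inverse-CDF sampling of E%, truncated to [0, upper) (assumption)."""
    u = rng.uniform(0.0, 1.0, size=n)
    return -np.log(1.0 - u * exp_cdf(upper, lam)) / lam

rng = np.random.default_rng(0)
lam = fit_lambda([5.0, 10.0, 15.0])        # hypothetical sample of single errors
draws = sample_errors(lam, 1000, rng)
print(lam, draws.min(), draws.max())
```

In practice λ would come from the fitted relations between λ, ta/d, and d rather than from a three-value sample.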
Atmosphere 2018, 9

Average Error Analysis
The main characteristics of the average underestimation error of annual maximum rainfall depth, Ea%, were analyzed by Reference [25], which concluded that, in the worst conditions that occur for ta = d, the Ea% for a series of appropriate length is less than or equal to 16.67% and developed reliable relationships between Ea% and values of ta and d. In this work, as an element of novelty, the analysis of the statistical distribution of the average error was carried out.
For the average underestimation error too, the relative frequency was evaluated by considering different ta/d ratios and adequate class amplitudes. From the visual analysis of Figure 9 and from the application of the main statistical tests (Pearson, Kolmogorov-Smirnov, Anderson-Darling), it may be deduced that the following normal distribution function correctly represents the behavior of the random variable.
$$f(E_{a\%}) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(E_{a\%}-\mu)^2}{2\sigma^2}\right] \qquad (9)$$
where µ and σ are the sample expected value and standard deviation, respectively. For the µ parameter, results consistent with Morbidelli et al. (2017) were obtained. In fact, the results showed that the µ parameter depends on both the ta/d ratio and d. Specifically, for ta/d = 1, values of µ = 12.40%, 11.89%, and 10.41% were found for d = 30 min, 60 min, and 1440 min, respectively. The characteristic behavior of the σ parameter is shown in Table 5. At increasing values of µ, the corresponding values of σ and cv = σ/µ increase and decrease, respectively. This last trend shows that the dispersion of the values of the random variable increases in conditions of reduced values of µ and decreases in the cases of greater practical interest.
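The parameters of the normal model and the coefficient of variation discussed above can be computed as in the following sketch. The three average-error values are purely hypothetical and serve only to show the arithmetic; the function names are not taken from the study.

```python
import numpy as np

def summarize(avg_errors_percent):
    """Return (mu, sigma, cv) for a sample of average errors Ea%."""
    values = np.asarray(avg_errors_percent, dtype=float)
    mu = float(values.mean())
    sigma = float(values.std(ddof=1))      # sample standard deviation
    return mu, sigma, sigma / mu

def normal_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    z = (np.asarray(x, dtype=float) - mu) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))

mu, sigma, cv = summarize([10.0, 12.0, 14.0])   # hypothetical Ea% values
print(mu, sigma, cv)
```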

Correlation Hd-Error
The analysis of the underestimation error of the annual maximum rainfall depths is very important when systematic corrections of Hd series affected by this problem have to be carried out. In this context, it is necessary to analyze the possible link between each single underestimation error and the corresponding Hd. To this aim, all possible combinations of ta and d were analyzed. In 78% of cases, a negative correlation between the considered quantities was found. In Figure 10, an example of this type of result for three different cases is shown. Note that, while in most cases an inverse link between the underestimation error and the annual maximum rainfall depth was highlighted, the probability of obtaining the opposite result is not negligible (22%). However, in almost all cases with a direct correlation between the single error and Hd, the magnitude of the link was very limited, as shown in Figure 11.
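The sign and strength of the link between E% and Hd for one combination of station, ta, and d can be examined with a correlation coefficient and a least-squares line, as in the sketch below. The use of the Pearson coefficient is an assumption, since the text does not name the correlation measure, and the five data pairs are hypothetical.

```python
import numpy as np

def error_depth_link(h_d, errors_percent):
    """Return (Pearson r, slope, intercept) of E% regressed on Hd."""
    h = np.asarray(h_d, dtype=float)
    e = np.asarray(errors_percent, dtype=float)
    r = float(np.corrcoef(h, e)[0, 1])
    slope, intercept = np.polyfit(h, e, 1)   # best interpolating linear function
    return r, float(slope), float(intercept)

# Hypothetical series: larger depths paired with smaller errors.
h_d = [20.0, 25.0, 30.0, 40.0, 50.0]
errors = [30.0, 22.0, 18.0, 9.0, 4.0]
r, slope, intercept = error_depth_link(h_d, errors)
print(r, slope, intercept)
```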

Correction Procedure
When a series of n annual maximum rainfall depths, Hd,i (i = 1, ..., n), of assigned duration d obtained from data characterized by coarse ta has to be corrected, the steps below are followed.
1. The probability density function of the single underestimation error, Equation (3), has to be identified by deriving the parameter λ from Equations (6)-(8) for the given ta/d and d.
2. A set of underestimation errors E%,i (i = 1, ..., n), respecting the probability density function of point 1, has to be generated.
3. Each generated E%,i value has to be combined with a specific uncorrected Hd,i on the basis of the inverse correlation between these quantities.
4. Each Hd,i has to be corrected through the correction equation, which relates the corrected and uncorrected Hd,i values.
5. In the case of the Monte Carlo procedure, steps 2, 3, and 4 have to be repeated.
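The generation, pairing, and correction steps listed above can be sketched as follows. Several choices here are assumptions rather than the procedure of the study: the errors are drawn from an exponential distribution truncated at 50%, the inverse correlation is imposed in its strictest form (largest error assigned to the smallest depth), and the correction is taken as Hcorr = Hd / (1 - E%/100), which follows if E% is measured relative to the true depth. All names and values are hypothetical.

```python
import numpy as np

def correct_series(h_d, lam, rng, upper=50.0):
    """One stochastic correction of an uncorrected Hd series."""
    h = np.asarray(h_d, dtype=float)
    # Generation: exponential errors truncated to [0, upper) via inverse CDF.
    u = rng.uniform(0.0, 1.0, size=h.size)
    errors = -np.log(1.0 - u * (1.0 - np.exp(-lam * upper))) / lam
    # Pairing: largest error goes to the smallest depth (assumed rank pairing).
    order = np.argsort(h)
    paired = np.empty_like(errors)
    paired[order] = np.sort(errors)[::-1]
    # Correction: assumed form Hcorr = Hd / (1 - E%/100).
    return h / (1.0 - paired / 100.0)

def monte_carlo(h_d, lam, n_runs=1000, seed=0):
    """Repeat generation, pairing, and correction; return the mean corrected series."""
    rng = np.random.default_rng(seed)
    runs = np.array([correct_series(h_d, lam, rng) for _ in range(n_runs)])
    return runs.mean(axis=0)

uncorrected = np.array([20.0, 35.0, 50.0])   # hypothetical Hd values in mm
print(monte_carlo(uncorrected, lam=0.1, n_runs=200))
```

A weaker pairing rule that only reproduces the observed degree of correlation would be closer to the data, since a direct correlation was found in a minority of cases.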

Figure 1. Morphology and the rain gauge network of the Umbria region. The position of the rain gauges used in the analysis is also shown.

Figure 2. Representation of a generic rainfall pulse characterized by duration, d, equal to the measurement temporal resolution, ta. (a) Case where a correct determination of the annual maximum rainfall depth of duration d, Hd, is possible. (b) Case with a generic underestimation of Hd. (c) Case with the maximum underestimation of Hd (50%).

Figure 3. Sample hyetographs recorded at the Bastardo station (Umbria Region, central Italy) involving annual maximum rainfall depths for d = 60 min. From top to bottom, moving windows selected for the years 1993, 1998, and 2011.

Figure 4. Relative frequency of the underestimation error, E%, of the maximum rainfall depths considering a class amplitude equal to 5%. (a) ta/d = 1. (b) ta/d = 0.5. (c) ta/d = 0.333. All selected rain gauge stations and all possible combinations of ta and d were considered.

Figure 5. Similar to Figure 4c but with a class amplitude equal to 1%.

Figure 7. Behavior of the λ parameter as a function of the ratio between temporal aggregation, ta, and duration, d, for all selected rainfall stations and for ta/d ≥ 0.1666. The best interpolating function is also plotted.

Figure 8. Behavior of the λ parameter obtained by Equations (6)-(8) depending on the ratio ta/d, where ta is the time resolution of rainfall data and d is the duration.

Figure 9. Relative frequency of the average underestimation error, Ea%, of the maximum rainfall depth, considering the class amplitudes shown on the x-axis. (a) ta/d = 1. (b) ta/d = 0.5. (c) ta/d = 0.333. All selected rain gauge stations and all possible combinations of ta and d were considered.

Figure 10. Relation between the annual maximum rainfall depth, Hd, and the underestimation error, E%. (a) Cerbara station, ta = d = 30 min. (b) Montecucco station, ta = d = 60 min. (c) Casa Castalda station, ta = d = 60 min. The best interpolating linear function is also plotted.

Figure 11. Relation between annual maximum rainfall depth, Hd, and underestimation error, E%. (a) San Biagio della Valle station, ta = d = 30 min. (b) Compignano station, ta = 15 min and d = 30 min. The best interpolating linear function is also plotted.

Table 2. "Observed" and "Generated" rainfall data characterized by different time resolutions, ta, starting from 3 January 2017 at 6:00 a.m. at the Gubbio rain gauge, located in the Umbria Region.

Table 3. Single underestimation errors, E% (in %), in the determination of the annual maximum rainfall depth considering rainfall data with temporal resolution equal to 30 min and various durations, d, at the Nocera Umbra rain gauge, located in the Umbria Region. For each d, the average underestimation error, Ea%, is also reported.

Table 4. Single underestimation errors, E% (in %), in the determination of the annual maximum rainfall depth considering rainfall data with temporal resolution equal to 15 min and various durations, d, at the Nocera Umbra rain gauge, located in the Umbria Region. For each d, the average underestimation error, Ea%, is also reported.

Table 5. Observed behavior of the parameters µ, σ, and cv (= σ/µ) of Equation (9) as a function of the ratio ta/d, where ta is the time resolution of rainfall data and d is the duration.