Magnitude Frequency Analysis of Small Floods Using the Annual and Partial Series

Flood frequency analysis using partial series data has been shown to provide better estimates of small to medium magnitude flood events than the annual series, but the annual series is more often employed due to its simplicity. Where partial series average recurrence intervals are required, annual series values are often "converted" to partial series values using the Langbein equation, regardless of whether the statistical assumptions behind the equation are fulfilled. This study uses data from Northern Tasmanian stream-gauging stations to make empirical comparisons between annual series flood frequency estimates, partial series flood frequency estimates and values provided by the Langbein equation. At T = 1.1 years annual series estimates were found to be one third the magnitude of partial series estimates, while Langbein adjusted estimates were three quarters the magnitude of partial series estimates. The three methods converged as average recurrence interval increased until there was no significant difference between the different methods at T = 5 years. These results suggest that while the Langbein equation reduces the differences between quantile estimates derived from the annual maxima series and those derived from the partial duration series, it does not provide a suitable alternative to using partial series data. These results have significance for the practical estimation of the magnitude-frequency of small floods.


Introduction
Estimates of the size and frequency of floods are important for infrastructure planning and design and in the management of water resources and riparian areas [1]. Research on flood frequency has focused on the estimation of extreme flood events, rather than the more frequent small to moderate magnitude flood events which dictate alluvial channel morphology [2,3] and consequently are of particular interest to geomorphologists. The Institution of Engineers Australia (IEA) recommends the use of the partial series for estimating the magnitude-frequency of frequent floods [4], as it has been shown to provide more accurate estimates of frequent flood events than the annual series [3,4]. However, the partial series is seldom used due to uncertainty in its application [5,6,7]. Instead, the magnitude of frequent flood events is commonly determined by transforming annual series estimates using a formula known as the Langbein equation. Recognising the theoretical statistical relationship that exists between the annual and partial series under certain criteria, Langbein [8] demonstrated a method for converting annual series average recurrence intervals to partial series intervals. Originally developed for use in specific statistical situations, the Langbein equation has subsequently been commonly used as a practical method to convert annual series intervals to partial series intervals, even when the statistical assumptions behind the equation are not met [5,9,10,11,12,13].

Flood Frequency Analysis
Flood frequency analysis is used for making probabilistic estimates of a future flood event based on the historical stream-flow record, with probability often expressed as the average length of time between floods and called the return period or average recurrence interval (T). The two main methods of flood frequency analysis are analytical and graphical, with the IEA [4] recommending that both procedures are used in a complementary manner. The analytical method of flood frequency analysis usually involves fitting a probability distribution function to the observed peak flow data, from which the probability of exceedance of a flood of a particular discharge may then be calculated. Although this method is widely used, there is little theoretical basis for the choice of distribution [14,15], and despite extensive research, no particular distribution has emerged as consistently the best fitting across different sites [16]. The parameters of the probability distribution are generally estimated through analysis of the selected data sample, which is assumed to be representative of its parent population. Methods such as L-moment diagrams and associated goodness-of-fit procedures have been advocated for evaluating the suitability of various distributional alternatives for modeling flood flows in a region [17]. However, the true distribution and its parameters may still differ significantly from the empirically fitted distribution, particularly when samples are small [18].
Of the two main choices of data series in flood frequency analysis, the most frequently used is the annual series, which is composed of the single maximum discharge for each year of the record. IEA [4] identified three advantages to using the annual series: there is a high probability that flood events are independent; the series is easily and unambiguously extracted; and the form of the frequency distribution of annual floods generally conforms to theoretical distributions. The major disadvantage of the annual series is that, because only one flood is included from each year of the stream-flow record, it may exclude significantly large floods if several occur in a single year and may include small annual maxima for some years. This may result in small floods occurring more frequently than indicated by the annual series [3].
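The construction of the annual series, and the way it can discard a large flood, can be illustrated with a minimal Python sketch. The function name `annual_series`, the `start_month` parameter and the three sample discharges are hypothetical and chosen only for illustration; they are not taken from the study data.

```python
from datetime import date

def annual_series(daily, start_month=1):
    """Return {year: maximum discharge} from (date, discharge) pairs.

    start_month is the first month of the hydrological year (an assumed
    convention); records before that month are assigned to the previous year.
    """
    maxima = {}
    for day, q in daily:
        year = day.year if day.month >= start_month else day.year - 1
        if year not in maxima or q > maxima[year]:
            maxima[year] = q
    return maxima

# Hypothetical record: two floods in 2000, one small annual maximum in 2001.
daily = [
    (date(2000, 3, 1), 120.0),
    (date(2000, 8, 15), 95.0),
    (date(2001, 6, 1), 30.0),
]

print(annual_series(daily))
```

In this made-up record the second flood of 2000 is dropped from the annual series even though it is roughly three times the size of the 2001 annual maximum, which is the disadvantage described above.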
The partial series, also known as Peaks Over Threshold (POT), is composed of all discharges over a chosen threshold for the entire stream gauge record; some years may contribute several floods and other years none. Advantages of the partial series are that insignificant floods are excluded, which can improve magnitude estimates of high frequency floods [3], and that the partial series can produce more data points than the annual series, which can be particularly useful when the period of stream-flow record is short [4]. However, historically the partial series has been less commonly used than the annual series, mainly due to the complexity of choosing the threshold discharge level and ensuring the independence of each flood event [6,7,19,20,21]. As there is no unique threshold value which best defines the partial series [22], an appropriate level must be determined, generally through the trial of several different threshold levels [7,15,20]. Lowering the threshold, up to a certain level, increases the number of data points, which may improve flood frequency estimates. However, as the number of flood events in the series increases, the possibility that they will not be independent also increases, as conditions created by one flood (e.g., soil moisture) may also affect following floods. No general guidelines for ensuring independence have been developed; the criterion for independence instead requires subjective judgment, with consideration of the circumstances and objectives of the study and the characteristics of the catchment and flood data [4].

Low Magnitude Frequent Floods
Due to the difficulties in defining the partial series, the estimation of the magnitude-frequency of frequent floods is often made using the easier to define annual series, despite evidence that it underestimates their magnitude [3]. The average recurrence interval provided by the annual series is also a different measure of flood probability from that provided by the partial series. As the annual series only considers one flood for each year, the average recurrence interval in this series is the average interval of time in which a flood of the selected magnitude occurs as an annual maximum, whereas the average recurrence interval for the partial series is the average time interval between two successive floods of at least the selected magnitude [10].
Assuming that the floods in the partial series are independent and distributed according to a Poisson process, Langbein [8] demonstrated the existence of a statistical relationship between the recurrence intervals generated by the two series. This relationship between the two series was further defined by Chow [23] to produce the equation:

$$T_P = \frac{1}{\ln T_A - \ln (T_A - 1)} \qquad (1)$$

where T_P is the average recurrence interval determined for the partial series and T_A is the corresponding average recurrence interval using the annual series. While other empirically derived relationships between the annual and partial series have been produced for particular datasets ranging from 20 to 46 years [20,24], a more common approach has been to use the Langbein equation, or a table of equivalent annual and partial series values based on the Langbein equation (e.g., [11,25,26,27,28,29,30]), to "convert" annual series flood frequency values to partial series values regardless of the theoretical validity of its application. Several studies have shown significant deviations from the values predicted by the equation when using empirical data [5,8,31], with differences between actual recurrence intervals and those predicted by the Langbein equation of up to 40 percent for floods of relatively high frequency at some locations [5]. The objective of this paper is to compare magnitude-frequency estimates of frequent floods determined using the annual series, the Langbein adjusted annual series and the partial series, and to determine whether Langbein's equation provides a suitable empirical method to convert annual series average recurrence intervals to partial series intervals. The purpose of this study is to improve understanding of practical methods available to fluvial geomorphologists and catchment managers for the estimation of high-frequency low-magnitude flood events. This would allow estimation of the frequency or magnitude of geomorphically important flood events such as bankfull discharge.
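The conversion between the two recurrence intervals can be sketched in a few lines of Python. The sketch assumes the commonly quoted form of the Langbein relationship, T_P = 1 / (ln T_A - ln(T_A - 1)), and its algebraic inverse; the function names are hypothetical.

```python
import math

def annual_to_partial(t_a: float) -> float:
    """Partial series ARI (years) equivalent to annual series ARI t_a.

    Assumes the Langbein form T_P = 1 / (ln T_A - ln(T_A - 1)),
    which is only defined for t_a > 1 year.
    """
    if t_a <= 1.0:
        raise ValueError("annual series ARI must exceed 1 year")
    return 1.0 / (math.log(t_a) - math.log(t_a - 1.0))

def partial_to_annual(t_p: float) -> float:
    """Annual series ARI (years) equivalent to partial series ARI t_p.

    Algebraic inverse of the relationship above: T_A = 1 / (1 - exp(-1 / T_P)).
    """
    if t_p <= 0.0:
        raise ValueError("partial series ARI must be positive")
    return 1.0 / (1.0 - math.exp(-1.0 / t_p))

# Recurrence intervals considered in this study.
for t in (1.1, 1.5, 2, 3, 5, 10):
    print(f"T_P = {t:>4} years -> T_A = {partial_to_annual(t):.2f} years")
```

Under this form the two intervals differ most for frequent floods and become nearly equal as T increases, which is why the choice of series matters mainly at low average recurrence intervals.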

Study Area
Tasmania is the southernmost state in Australia, with the main island extending across a latitudinal range of 39°40′-43°20′ S. The North-Eastern Region covers almost one-third of Tasmania's landmass (Figure 1), and is delineated by the Tamar Estuary in the West and the Fingal Valley in the South. The Region's temperate marine climate includes a winter dominated rainfall that is largely controlled by topography and ranges from an annual average of less than 700 mm in low lying and coastal areas up to more than 1200 mm in the highlands [32]. Steep precipitation gradients exist in some areas, and occasional very heavy rainfall events associated with the passage of intense low pressure systems occur in some areas of the Region, causing localised flooding [33].

Data
Stream-flow data for the thirteen gauging stations shown in Figure 1 and listed in Table 1 were obtained from the Tasmanian Department of Primary Industries, Parks, Water and the Environment (DPIPWE). These DPIPWE stations represent all those in North-Eastern Tasmania that possess records of adequate length and quality. Obvious errors were removed from the stream-flow data and each dataset was trimmed to full hydrological years. Regardless of catchment size, topography and land cover, the original 15-minute sampling period data were transformed to a daily time step by calculating the mean discharge for each 24-hour period of record. The gauging stations are distributed throughout North-Eastern Tasmania (Figure 1) on a variety of stream types, and have accumulated catchment areas ranging from 26.4 km² to 3306.4 km², with a mean of 509.5 km². The number of years of stream-flow record (n) varied significantly around the mean of 33 years, with the maximum length being 85 years (North Esk River at Ballroom) and three sites having the minimum length of 10 years (Ansons River downstream of Big Boggy Creek, Nile River at Deddington and Scamander River upstream of Scamander water intake). Note: years of record relates to the period immediately prior to 1 January 2012.

Annual Series
Daily stream-flow data from each site were reduced to annual maxima, with checks made to ensure peak events from one year were not included as peak events for the following year. While there are various a priori theories for choosing particular probability distributions for flood frequency data, in practical applications empirical suitability plays a much larger role in distribution choice [14,15]. In a study of the suitability of a range of distributions using a large set of Australian annual series data, Rahman et al. [21] recommended that the Log-Pearson 3, Generalized Extreme Value, and Generalized Pareto Distributions should be compared before the final choice of a distribution. In this study a single distribution was used, with the choice based on previously demonstrated empirical suitability as well as practicality. A two-parameter Log-Normal distribution with Bayesian Markov Chain Monte Carlo (BMCMC) parameter estimation has previously been found to be the best performing flood frequency distribution and associated parameter estimation procedure for Tasmanian annual series flood data [34]; hence this was used in this study, using supporting software [35], with each algorithm iterated 5000 times. The fit of each Log-Normal distribution was checked visually using histograms and quantile-quantile (QQ) plots, and the fitted distributions were also verified against the original data on log-log plots. Plotting positions for the observed peak discharges were determined following the general recommendations of Cunnane [36] and IEA [4] using the equation:

$$PP(m) = \frac{m - \alpha}{n + 1 - 2\alpha} \qquad (2)$$

where m is the rank of each event (m = 1 for the largest), n is the number of years of record and α is a bias constant. The bias constant adjusts plotting positions to account for the dataset being a sample of the real population. The bias constant was set at 0.4 in this study following the example of previous flood frequency analysis studies in Eastern Australia [37].
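The plotting position calculation can be sketched as follows. This is a minimal illustration that assumes the general Cunnane form (m - α) / (n + 1 - 2α) with α = 0.4, takes the reciprocal of the plotting position as the average recurrence interval in years, and uses the number of years of record as n. The function name and the three sample peaks are hypothetical.

```python
def plotting_positions(peaks, n_years, alpha=0.4):
    """Return (discharge, plotting position, ARI in years) per ranked peak.

    Peaks are ranked from largest (m = 1) to smallest. The plotting
    position is assumed to be (m - alpha) / (n_years + 1 - 2 * alpha).
    """
    ranked = sorted(peaks, reverse=True)
    rows = []
    for m, q in enumerate(ranked, start=1):
        pp = (m - alpha) / (n_years + 1 - 2 * alpha)
        rows.append((q, pp, 1.0 / pp))
    return rows

# Hypothetical annual maxima (m^3/s) from a three-year record.
for q, pp, ari in plotting_positions([10.0, 30.0, 20.0], n_years=3):
    print(f"Q = {q:5.1f}  PP = {pp:.4f}  T = {ari:.2f} years")
```

With α = 0.4 the denominator reduces to n + 0.2, so the largest flood in an n-year record plots at a recurrence interval somewhat longer than the record itself.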

Partial Series
A peaks-over-threshold (POT) analysis was undertaken on daily stream-flow data from each site [34]. Ensuring the independence of successive flood peaks in the partial series is a complex and possibly subjective problem for which no definitive guidelines exist [7]. Malumad [38] found relatively robust flood-frequency estimations using time intervals from 7 to 60 days between successive peaks, and Svensson et al. [39] used separation intervals depending on catchment size: 5 days for catchments <45,000 km², 10 days for catchments 45,000-100,000 km², and 20 days for catchments >100,000 km². As the largest catchment in this study was 3306 km², a minimum of 14 days between flood events was used as the criterion to ensure independence. In consideration of the range of values suggested by the literature, four different partial series were defined for each site. Thresholds were adjusted to provide partial series data sets where the number of events (k) equals 1n, 1.5n, 2n and 2.5n (named PS1, PS1.5, PS2 and PS2.5 respectively). The IEA [4] suggest that graphical interpolation is sufficiently accurate when using the partial series where T < 10 years, but that a probability distribution should be fitted for making inferences beyond this. Both analysis methods were used in this study. The Generalized Pareto Distribution (GPD) has been widely used for flood frequency analysis with partial series data (e.g., [6,16,20]) and was fitted to each of the four partial series data sets for each of the thirteen sites in this study. The parameters of the GPD were estimated [34] using a maximum likelihood approach [40]. Distributions were checked against the plotted stream-flow data following the procedures outlined for the annual series above. The coefficient of variation (CV) of the grouped partial series estimates was also determined as a measure of their dispersion.
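One way to build a partial series with a fixed number of events per year of record and a minimum separation between peaks is sketched below. This is an illustrative declustering scheme, not necessarily the procedure used in the study: it treats local maxima of the daily series as candidate peaks, gives priority to the largest, and rejects any candidate within `min_gap_days` of a peak already kept. All function and parameter names, and the synthetic flows, are hypothetical.

```python
def independent_peaks(flows, min_gap_days=14):
    """Indices of local maxima at least min_gap_days apart, largest first."""
    n = len(flows)
    # A candidate is a day that is not lower than the previous day
    # and is strictly higher than the following day.
    candidates = [
        i for i in range(n)
        if (i == 0 or flows[i] >= flows[i - 1])
        and (i == n - 1 or flows[i] > flows[i + 1])
    ]
    candidates.sort(key=lambda i: flows[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_gap_days for j in kept):
            kept.append(i)
    return kept

def partial_series(flows, n_years, events_per_year=1.0, min_gap_days=14):
    """Return (peak discharges, implied threshold) for k = events_per_year * n_years."""
    k = round(events_per_year * n_years)
    peaks = [flows[i] for i in independent_peaks(flows, min_gap_days)[:k]]
    threshold = peaks[-1] if peaks else None
    return peaks, threshold

# Synthetic daily flows: four spikes on a constant base flow.
flows = [1.0] * 60
flows[5], flows[10], flows[30], flows[50] = 50.0, 40.0, 30.0, 20.0

print(independent_peaks(flows))
print(partial_series(flows, n_years=2, events_per_year=1.0))
```

In the synthetic record the spike on day 10 is rejected because it falls within 14 days of the larger spike on day 5. The implied threshold is simply the smallest retained peak, so raising `events_per_year` lowers the threshold in the way described above.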

Comparison of Different Methods
At-a-site flood frequency estimates for T = 1.1, 1.5, 2, 3, 5 and 10 years were made for each of the 13 stations for both the annual series (AS) and for each of the four partial series (PS1, PS1.5, PS2, PS2.5) using the procedures detailed above. Langbein adjusted flood frequency estimates (LC) were determined from the annual series estimates using Equation (1). It should be noted that Langbein's equation is used in this study to determine if it provides an empirical method to convert annual series average recurrence intervals to partial series average recurrence intervals, and that consequently the theoretical assumptions behind the equation were not considered in the choice of flood frequency analysis method. The PS1 values (also referred to as PS) were chosen for comparison with annual series estimates (AS) and Langbein adjusted annual series estimates (LC), as the data set on which the PS1 estimates were based contained the same number of flood events as the annual series. In addition, all partial series magnitude estimates were generally closely clustered, irrespective of the number of flood events included. The three estimates (AS, PS and LC) were then compared, and the ratios of AS to PS and of LC to PS were calculated for each station. Mean ratios averaged across all thirteen stations were also compared.

Results
Data generally conformed well to statistical models, with both annual series and partial series flood frequency curves created from the probability distributions providing a fairly good fit to the observed data from each of the thirteen stream gauging stations. This is demonstrated by results from the South Esk River above Macquarie River (Site 181), with graphs of the fitted distributions against the observed data for the annual and partial series for that site shown in Figure 2a and Figure 2b respectively. The final partial and annual series flood frequency estimates are presented in Table 2, along with the coefficient of variation for the grouped partial series estimates (PS1, PS1.5, PS2 and PS2.5). Partial series CV was generally larger for sites with shorter stream-flow records, and displayed a general increase at and above T = 5 years.

Table 2. Estimated discharge (m³ s⁻¹) for 1.1, 1.5, 2, 3, 4, 5 and 10 year average recurrence interval floods using partial (PS) and annual series (AS) data for Northern Tasmanian stream gauging stations (coefficient of variation for partial series estimates PS1, PS1.5, PS2 and PS2.5 shown in parentheses).

The percentage differences between partial series (PS) and annual series estimates (AS) and between the partial series and Langbein adjusted annual series estimates (LC) averaged across all 13 sites are listed in Table 3.
Differences were smallest at T = 5 years, with Table 3 showing the T = 5 results as closest to a ratio of 1, and increased as average recurrence interval decreased. AS estimates were 95 percent of PS estimates at T = 5 years but decreased to just 33 percent of PS estimates at T = 1.1 years. LC estimates were generally closer, ranging from 101 percent of PS estimates at T = 5 years to 75 percent of PS estimates at T = 1.1 years. Both AS and LC estimates were significantly larger than PS estimates at T = 10 years (119 and 122 percent respectively). Figure 3 and Table 4 show mean annual series average recurrence intervals against the mean partial series recurrence interval at an equivalent discharge across the 13 North-Eastern Tasmanian stream-flow stations. Mean partial series discharge estimates at the lowest average recurrence interval (T = 1.1 years) were equivalent to a mean discharge on the annual series of 2.17 years average recurrence interval (SD = 0.33). Mean partial series estimates at T = 5 years were equivalent to T = 5.71 years on the annual series (SD = 1.00), and at T = 10 years partial series estimates were smaller than annual series estimates, with the equivalent annual series at T = 8.63 years (SD = 2.30). Differences between mean Langbein adjusted annual series average recurrence intervals and partial series intervals were smaller than differences between annual series and partial series intervals, but remained significant at low average recurrence intervals. Mean partial series estimated discharges at T = 1.1 years were equivalent to mean Langbein adjusted values at T = 1.67 years, and at T = 5 years were equivalent to Langbein adjusted values at T = 5.52 years.

Discussion
Annual series flood frequency estimates made using data from Northern Tasmanian stream gauging stations differed from partial series estimates at most average recurrence intervals. Differences were largest for the most frequent floods, with annual series estimates only 33 percent of partial series estimates at T = 1.1 years (Table 3). The difference between the two series progressively decreased as T increased until there was negligible difference at around T = 5 years (AS = 95% PS), and by T = 10 years annual series estimates were larger than partial series estimates (AS = 119% PS). These results agree with those from other studies comparing annual and partial series estimates at low average recurrence intervals (T < 10 years). Langbein [8] found that for equivalent floods, the recurrence intervals in the partial-duration series are smaller than in the annual series and, in results very similar to this study, also found that the difference between the two series is inconsequential for floods of greater than about five year recurrence interval. Adamowski et al. [16] also found annual series quantiles significantly less than partial series quantiles for frequent floods.
Differences between the two series reflect the different data sets used. The partial series, which uses all floods above a threshold, is likely to include more medium sized flood events than the annual series, which only uses the largest flood event of each year. As more flood events are included, the average recurrence interval between peaks of a given magnitude automatically declines [5], and as a result small floods occur more frequently than indicated by the annual series [3]. Differences between annual series and partial series estimates decrease for larger, more infrequent floods because the majority of extreme flood events are likely to be included in both series.
The differences between Langbein adjusted values and partial series values reflect the differences between the annual and partial series. The largest differences between the two values occurred at the smallest average recurrence intervals, and decreased as T increased, until there was no significant difference between the two values at around T = 5 years (Table 4 and Figure 3). Langbein adjusted values (LC) averaged almost 40% lower magnitude discharge than partial series estimates at T = 1.1 years. The tendency for empirical results to deviate from the theoretical Langbein relationship has been previously demonstrated [5,8,24,31]. The results from this study are very similar to results from Page and McElroy [5], who found differences between actual recurrence intervals and those predicted by the Langbein equation to be as much as 40 percent for floods of relatively high frequency up to the level of the mean annual flood (assumed to be a flood with T = 2.33 years).
Previous studies [5,8,31] have attributed this difference to the difficulty in determining independent flood peaks for partial series. The Langbein function assumes that floods occur as statistically independent events, and because flood selection cannot always guarantee adherence to this, it is unlikely that actual data will conform precisely with the mathematically derived function [5]. This explanation is not supported by the small differences in magnitude between partial series estimates made with varying thresholds in this study.
The variation between the different partial series estimates at a site was generally small at low average recurrence intervals (T < 5 years) in comparison to the difference between partial and annual series estimates, as illustrated by the small coefficient of variation values in Table 2. This suggests that the choice of threshold level has relatively little effect on partial series estimates at low recurrence intervals. Other authors [3,4] have recommended that the partial series be used to estimate floods with average recurrence intervals of around ten years or less, and the results from this study support those views, particularly for floods with an average recurrence interval of five years or less.

Conclusions
This study found large differences between annual and partial series flood frequency estimates made using Northern Tasmanian stream-flow data for average recurrence intervals of less than five years, similar to other studies finding such significant deviations [5,8,31]. Annual series estimates were one third the magnitude of partial series estimates at T = 1.1 years, but the two series converged as average recurrence interval increased until there was no significant difference between the two series at T = 5 years. This study also found that at low recurrence intervals there were relatively small differences between the various partial series estimates for a site made using different discharge thresholds, especially in comparison to the differences between partial series and annual series flood frequency estimates. This suggests that the definition of the partial series data set may not be of critical importance at low average recurrence intervals, although more research is required to confirm this.
In addition, this study found that Langbein's equation did not provide a suitable empirical method to convert annual series flood frequency estimates to partial series estimates at average recurrence intervals of less than five years. Langbein adjusted annual series estimates were three quarters the magnitude of partial series estimates at T = 1.1 years.
These results suggest that both the annual series and the Langbein adjusted annual series significantly underestimate the magnitude of frequent floods and should not be used at average recurrence intervals of less than five years. Rather, the partial series should be used for estimates of high frequency-low magnitude floods (T < 5 years). While the high sampling variability associated with the small sample size in this study would be reduced by a larger survey, the results of this analysis are supported by those from other studies. As floods of this frequency are of interest to geomorphologists and ecologists, these results have particular significance for relevant research in these fields.

Figure 1. Location of major rivers and stream-flow stations in North-Eastern Tasmania used in this study. State Government stream-gauge codes are used to identify sites.

Figure 2. Original stream-flow data compared to fitted probability distributions (solid line) for the South Esk River above Macquarie River (Site 181); (a) annual series data against the Log-Normal distribution; and (b) partial series data against the Generalised Pareto Distribution.

Figure 3. Comparison of annual and partial series average recurrence intervals (T). Conversion of annual series to partial series according to Langbein's function is represented by the solid line, while points represent mean (plus and minus standard deviation) of 13 North-Eastern Tasmanian stream-flow stations.

Table 1. Stream gauging sites and flow records used in the flood frequency analyses.

Table 3. Ratio of annual series estimates (AS) and Langbein adjusted annual series estimates (LC) to partial series (PS) estimates averaged across 13 North-Eastern Tasmanian stream gauging stations.

Table 4. Annual series average recurrence intervals (in years) equivalent to partial series as predicted by Langbein's function and estimated from mean values across 13 North-Eastern Tasmanian stream-flow stations.